Sail

by lakehq

Sail unifies stream processing, batch processing, and compute-intensive (AI) workloads. It currently features a drop-in replacement for Spark SQL and the Spark DataFrame API in both single-host and distributed settings.

What is Sail?

Sail is a platform designed to unify stream processing, batch processing, and compute-intensive (AI) workloads. It provides a drop-in replacement for Spark SQL and the Spark DataFrame API.

How to use Sail?

Sail can be installed via pip. You can start a Sail server from the command line interface or the Python API, or deploy it on Kubernetes. Once the server is running, connect to it from PySpark by passing the server's remote connection string to the session builder.

Key features of Sail

  • Drop-in replacement for Spark SQL and DataFrame API

  • Support for single-host and distributed settings

  • Unified platform for stream, batch, and AI workloads

  • Command-line interface and Python API for server management

  • Kubernetes deployment support

Use cases of Sail

  • Data analytics for LLM agents

  • Accelerating Spark workloads

  • Unifying data processing pipelines

  • Running Spark applications in distributed environments

  • Replacing Spark SQL with a faster alternative

FAQ from Sail

How do I install Sail?

You can install Sail using pip: pip install "pysail[spark]".

How do I start the Sail server?

You can start the server using the command line interface (sail spark server), the Python API, or by deploying it on Kubernetes.
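
As a rough illustration of the Python API route, the sketch below wraps server startup in a small helper. The import path `pysail.spark`, the class name `SparkConnectServer`, and its `port` and `background` parameters are assumptions based on the project's published examples and may differ between releases; `start_sail_server` is a hypothetical helper name used only here. Check the documentation for the version you have installed.

```python
def start_sail_server(port: int = 50051, background: bool = True):
    """Start a Sail Spark Connect server inside the current Python process.

    Assumption: pysail exposes `SparkConnectServer` in `pysail.spark`,
    with a `port` constructor argument and a `start(background=...)` method.
    """
    # Imported lazily so this module still loads when pysail is not installed.
    from pysail.spark import SparkConnectServer

    server = SparkConnectServer(port=port)
    # With background=True the call should return immediately, so the same
    # process can go on to open a PySpark session against the server.
    server.start(background=background)
    return server
```

Calling `start_sail_server()` from a script or notebook would then take the place of running `sail spark server` in a separate terminal.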

How do I connect to the Sail server from PySpark?

Use SparkSession.builder.remote("sc://localhost:50051").getOrCreate() to connect to a local Sail server.
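
The connection step can be sketched as follows, assuming a Sail server is already listening and that PySpark with Spark Connect support is installed. The helper names `sail_remote_url` and `connect` are hypothetical and exist only for this example; the host and port defaults mirror the `sc://localhost:50051` address given above.

```python
def sail_remote_url(host: str = "localhost", port: int = 50051) -> str:
    """Build the Spark Connect URL for a Sail server."""
    return f"sc://{host}:{port}"


def connect(host: str = "localhost", port: int = 50051):
    """Open a PySpark session backed by a running Sail server."""
    # Imported lazily so the URL helper works without PySpark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.remote(sail_remote_url(host, port)).getOrCreate()
```

With a server running, `spark = connect()` followed by `spark.sql("SELECT 1 + 1").show()` should execute the query on Sail instead of a JVM-based Spark cluster, with no other changes to the PySpark code.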

Where can I find the documentation?

The documentation is available at https://docs.lakesail.com/sail/latest/.

Does Sail offer enterprise support?

Yes, LakeSail offers flexible enterprise support options. Contact them for more information.