Ballista: A modern distributed compute platform

Ballista 0.3.0-alpha-1

Ballista 0.3.0-alpha-1 is the first release with true distributed compute capabilities. It is now possible to deploy Ballista to Kubernetes, or as a standalone cluster using etcd for discovery, and use a DataFrame API to build and submit queries to the cluster for execution.
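To illustrate the general idea of building a query through a DataFrame API before handing it to a cluster, here is a minimal, self-contained sketch. It is not Ballista's API: the `Plan` and `DataFrame` classes and their methods are hypothetical names invented for this example, and the sketch stops at building a logical plan rather than submitting one. It only shows how chained DataFrame calls can accumulate a plan lazily, which a client could then send to a scheduler.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Plan:
    """One node of a logical query plan (hypothetical, not a Ballista type)."""
    op: str
    args: Tuple[str, ...]
    child: Optional["Plan"] = None

    def describe(self) -> str:
        """Render the plan from the root operator down to the table scan."""
        line = f"{self.op}({', '.join(self.args)})"
        if self.child is None:
            return line
        return f"{line} <- {self.child.describe()}"


class DataFrame:
    """Lazy builder: each call wraps the current plan in a new node.

    Nothing is executed here; the object only records what to do.
    """

    def __init__(self, plan: Plan):
        self.plan = plan

    @staticmethod
    def read_table(name: str) -> "DataFrame":
        return DataFrame(Plan("Scan", (name,)))

    def filter(self, predicate: str) -> "DataFrame":
        return DataFrame(Plan("Filter", (predicate,), self.plan))

    def aggregate(self, group_by: str, *aggs: str) -> "DataFrame":
        return DataFrame(Plan("Aggregate", (group_by, *aggs), self.plan))


# A heavily simplified query, loosely shaped like TPC-H query 1.
df = (
    DataFrame.read_table("lineitem")
    .filter("l_shipdate <= '1998-09-02'")
    .aggregate("l_returnflag", "sum(l_quantity)")
)

# The plan is plain data, so a client could serialize it for a scheduler.
print(df.plan.describe())
```

Keeping the plan as immutable data, separate from execution, is what lets the same query description be built on a client and run elsewhere. How Ballista itself represents and submits plans is not covered by this sketch.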

Here is a brief demo of Ballista executing TPC-H query 1 in Kubernetes.

[asciicast recording: Ballista executing TPC-H query 1 in Kubernetes]

The goal of the alpha release is to make it easier for contributors to start testing the new distributed scheduler and provide feedback. Performance is not yet optimized, and this release supports only a limited number of operators and expressions. A more capable 0.3.0 release will be available in August 2020.