Ballista: A modern distributed compute platform

Ballista 0.4.0 is now available

For this release, Ballista was re-implemented from scratch to take advantage of the many changes in Apache Arrow 3.0.0, in particular some major refactoring in the DataFusion query engine that made it easier for projects such as Ballista to extend DataFusion’s functionality.

It is now possible to run TPC-H queries 1, 3, 5, 6, 10, and 12 against a distributed cluster.

Please refer to the user guide for installation instructions. Release notes are available here.

Ballista 0.3.0 is now available

The goal of the 0.3.0 release is to provide a minimum viable product for distributed compute in Rust. It is now possible to run a query that is very close to TPC-H query 1 on a distributed cluster with reasonable performance. Performance and scalability are comparable to Apache Spark (between 2x slower and 2x faster, based on initial benchmarks).

Performance tuning will be one of the main areas of focus for the 0.4.0 release.

Please refer to the user guide for installation instructions. Release notes are available here.

Welcome to the new Ballista website!

Ballista is an attempt at building a distributed compute platform based on Apache Arrow. This site has been created to host the user guide and to provide a blog for announcing project news and releases.