20 Feb 2021
With this release, Ballista was re-implemented from scratch to take advantage of the many changes in Apache Arrow
3.0.0, especially some major refactoring in the DataFusion query engine that made it easier for projects such as
Ballista to extend DataFusion’s functionality.
It is now possible to run TPC-H queries 1, 3, 5, 6, 10, and 12 against a distributed cluster.
Please refer to the user guide for installation instructions. Release notes are available here.
10 Aug 2020
The goal of the 0.3.0 release is to provide a minimum viable product of distributed compute in Rust. It is now possible
to run a query that is very close to TPC-H query 1 on a distributed cluster with reasonable performance. Performance
and scalability are comparable to Apache Spark (within the range of 2x slower to 2x faster based on initial benchmarks).
Performance tuning will be one of the main areas of focus for the 0.4.0 release.
Please refer to the user guide for installation instructions. Release notes are available here.
26 Apr 2020
Ballista is an attempt at building a distributed compute platform based on Apache Arrow. This site has been created to host the user guide and to provide a blog for announcing project news and releases.