Zoe - Analytics on demand

Zoe provides a simple way to provision data analytics clusters and workflows.

Contact us

Easy to use. A few clicks to start.

Use the web interface to launch complex data exploration frameworks in a few clicks, or use the APIs to call Zoe from your own scripts.

Zoe is independent of the applications it runs. A generic application description language is used to build compositions of analytics services and to define resource constraints and configuration options.
For example, a user can run Spark or MPI jobs on Zoe by providing the appropriate descriptions and Docker images.
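
As an illustration of what such a description might contain, the following Python sketch models an application as a composition of services with resource constraints and configuration options. The class names, field names (`image`, `memory_limit_mb`, `instances`) and image names are hypothetical and are not Zoe's actual description format.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """One containerized service in an application (hypothetical schema)."""
    name: str
    image: str              # Docker image to run; names below are placeholders
    memory_limit_mb: int    # resource constraint per instance
    instances: int = 1
    environment: dict = field(default_factory=dict)  # configuration options


@dataclass
class Application:
    """A composition of services, described independently of any framework."""
    name: str
    services: list

    def total_memory_mb(self) -> int:
        # Memory the whole application needs if every instance is started.
        return sum(s.memory_limit_mb * s.instances for s in self.services)


spark_app = Application(
    name="spark-example",
    services=[
        Service("spark-master", "example/spark-master", 2048),
        Service("spark-worker", "example/spark-worker", 4096, instances=3,
                environment={"SPARK_WORKER_CORES": "2"}),
    ],
)

print(spark_app.total_memory_mb())
```

Keeping the description as plain data is what lets the same platform treat a Spark cluster and an MPI job alike: only the images and the constraints change.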

Check this repository for some example applications we use to test Zoe.

Web interface screenshot

Fast. New clusters in seconds.

Zoe can create a fully configured Spark cluster, with 20 compute nodes and an IPython notebook, in a few seconds.

In this video we show the user-facing command line interface to Zoe and demonstrate how fast and easy it is to define and start a new application.

Smart. Advanced scheduling.

Zoe is built from the start to make full use of the available capacity in your Docker Swarm cluster. Not only is Zoe smart in placing containers, but when resources are exhausted it also queues new requests using state-of-the-art scheduling algorithms.
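
The queueing behaviour can be pictured with a deliberately simple sketch. It uses a plain first-in, first-out policy over a single memory budget, which is an assumption made for illustration and not the scheduling algorithm Zoe implements; the class and method names are hypothetical.

```python
from collections import deque


class FifoScheduler:
    """Start applications while memory is free; queue them otherwise."""

    def __init__(self, capacity_mb: int):
        self.free_mb = capacity_mb
        self.running = {}       # application name -> reserved memory
        self.pending = deque()  # (name, memory) requests waiting for capacity

    def submit(self, name: str, memory_mb: int) -> None:
        self.pending.append((name, memory_mb))
        self._start_pending()

    def terminate(self, name: str) -> None:
        # Release the resources and give queued requests a chance to start.
        self.free_mb += self.running.pop(name)
        self._start_pending()

    def _start_pending(self) -> None:
        # Strict FIFO: stop at the first request that does not fit.
        while self.pending and self.pending[0][1] <= self.free_mb:
            name, memory_mb = self.pending.popleft()
            self.free_mb -= memory_mb
            self.running[name] = memory_mb


scheduler = FifoScheduler(capacity_mb=8192)
scheduler.submit("notebook", 4096)
scheduler.submit("spark-job", 6144)   # does not fit yet, so it is queued
scheduler.terminate("notebook")       # frees memory; the queued job starts
```

A strict FIFO queue is easy to reason about but can leave capacity idle behind a large request, which is one reason more advanced policies exist.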

Zoe is streamlined for maximum efficiency. As soon as an application is ready for execution, it is translated into Docker commands and sent to Swarm. An optional private image registry can further improve the startup performance of new applications.

Zoe can provision compute-only applications, but also data layers such as Hadoop HDFS. We keep testing the Zoe concept by composing new services and frameworks and by extending Zoe's capabilities. Stay tuned for more!

Zoe architecture

News and updates

The 2017.03 release is now live on GitHub!

This release marks an important turning point for Zoe. More contributors are joining the project and merging interesting new features. Moreover, all releases are now tested through a fully automated continuous integration pipeline.

Donating idle CPU cycles (24/08/2016)

The Zoe team is proud to announce that spare CPU cycles on our internal platform are being donated to the World Community Grid to help support research into important humanitarian causes. At the bottom of this page you can see the projects we are participating in and some statistics.
To run WCG software with Zoe, we created a simple BOINC ZApp. A script monitors the platform load and starts new BOINC ZApps via Zoe when the load is low.
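
The decision such a monitoring script has to make can be sketched as follows. The threshold, the cap on concurrent ZApps and the function names are assumptions chosen for illustration; the call that actually submits a ZApp to Zoe is left out because it is not described here.

```python
import os


def zapps_to_start(load_per_cpu: float, running: int,
                   low_load: float = 0.3, max_zapps: int = 4) -> int:
    """Return how many extra BOINC ZApps to start for the current load."""
    if load_per_cpu >= low_load:
        return 0  # platform is busy: leave the capacity to regular users
    return max(0, max_zapps - running)


def current_load_per_cpu() -> float:
    # 1-minute load average normalized by CPU count (Unix only).
    return os.getloadavg()[0] / (os.cpu_count() or 1)
```

A periodic job could call `zapps_to_start(current_load_per_cpu(), running)` and submit that many ZApps, so donated work only fills capacity that would otherwise sit idle.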

Zoe supports Cassandra (13/07/2016)

We just pushed a Cassandra ZApp to the zoe-applications repository. It uses the official Cassandra image from Docker Hub, without any modification.

Docker images for Spark 2.0.0-preview (27/05/2016)

Want to try Spark 2? Now that Spark 2.0.0-preview is available, we have built Docker images and used Zoe to launch a few apps (e.g. a Jupyter notebook) with Spark 2. The Dockerfiles are available in the spark2 framework in the zoe-applications repository.

Zoe in production at Eurecom

We are using Zoe to drive the laboratory sessions of the new Algorithmic Machine Learning course at Eurecom. About 50 students interact with Jupyter notebooks and Spark clusters, learning the basics of data science with real-world use cases. All the course material is available on GitHub; feel free to have a look and contribute!

Zoe at DockerCon EU 2015

Zoe was presented at DockerCon EU 2015 in Barcelona! Here are the slides used to present the project to the Docker community:

Zoe - Swarming Spark applications from Daniele Venzano