Users, roles and quotas

May 14, 2018 • Daniele Venzano

Up to the 2017.12 version, Zoe did not have a real user management system. Authentication was delegated to a third-party service (LDAP) and Zoe retained only a user identifier to assign an owner to executions. Three groups were defined statically (guest, user and admin) and everyone had to fit into one of them.

This simple implementation soon started to show its limits:

  • Workspace permissions: Zoe manages workspaces, but has no idea of the UID of each user, so it has no way of telling Zoe applications under which user they should access the workspace. When users access their workspace, they find a mixture of seemingly random UIDs as owners of their files and, consequently, cannot open or delete some of them.
  • Limited groups: administrators had to fit everyone into only three categories, each with fixed permissions and quotas.
  • Inflexible quotas: they were defined in the source code, and only for guest users.

To move onward and fix all these problems, we decided to invest the time and properly develop a full user management system, with roles and quotas. This work is mostly finished in the current development master branch.

The Zoe database now contains three more tables, called user, quota and role.

Users

Each user who wants to access a Zoe deployment needs an entry in the user table. This table contains:

  • the username
  • the authentication backend : each user can be authenticated in a different way. For example you can have staff members authenticated via LDAP, visiting researchers via the internal Zoe authentication (fast and easy to manage) and students authenticated via a text file generated by a script.
  • the filesystem UID to use when accessing the workspace : this UID needs to be communicated to the containers to be useful. See below for how we are trying to solve this additional problem.
  • a password : only in case the authentication back-end is internal. The password is hashed with a salt, via the Passlib library.
  • an email for sending out notifications
  • … and other fields

Users can also be enabled and disabled, and each one has an associated quota and role. On the authentication side, we also added a PAM backend, which uses the PAM subsystem of the host running the zoe_api process.
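To make the shape of a user entry more concrete, here is a minimal sketch in Python. The field names, the backend identifiers and the validation rules are assumptions made for illustration; the actual Zoe schema may well differ.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical backend identifiers, one per method mentioned in the text.
AUTH_BACKENDS = {"internal", "ldap", "textfile", "pam"}


@dataclass
class User:
    """Illustrative user entry; not the real Zoe table definition."""
    username: str
    auth_source: str                      # backend that authenticates this user
    fs_uid: int                           # filesystem UID used in the workspace
    email: Optional[str] = None           # address for notifications
    password_hash: Optional[str] = None   # salted hash, internal backend only
    enabled: bool = True
    quota_id: Optional[int] = None
    role_id: Optional[int] = None

    def __post_init__(self):
        if self.auth_source not in AUTH_BACKENDS:
            raise ValueError(f"unknown authentication backend: {self.auth_source}")
        # Assumed rule: only internal users carry a password hash.
        if self.auth_source != "internal" and self.password_hash is not None:
            raise ValueError("password hashes are stored only for internal users")


def can_log_in(user: User) -> bool:
    """A disabled user is rejected before any backend is consulted."""
    return user.enabled
```

With this sketch, a staff member authenticated via LDAP would be created as User(username="alice", auth_source="ldap", fs_uid=1001), with no password stored in Zoe.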

Quotas

A quota, once defined, can be assigned to multiple users. Quotas limit how many resources a user can reserve at once. For now the only limit is the number of concurrently running executions. Next we will implement limits on the maximum amount of cores and memory that can be reserved.
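The check behind the concurrent-executions limit could look roughly like the sketch below. The Quota class, its field name and the can_start_execution helper are hypothetical names chosen for this example, not taken from the Zoe code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    """Illustrative quota; the same object can be shared by many users."""
    name: str
    concurrent_executions: int  # maximum executions running at the same time


def can_start_execution(quota: Quota, running_now: int) -> bool:
    """Allow a new execution only while the user is below the limit."""
    return running_now < quota.concurrent_executions


students = Quota(name="students", concurrent_executions=2)
print(can_start_execution(students, running_now=1))  # below the limit
print(can_start_execution(students, running_now=2))  # limit reached
```

Limits on cores and memory, once available, would presumably become additional fields checked in the same place.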

Roles

A role, like a quota, once defined can be assigned to multiple users. Roles define the capabilities of a user in Zoe. Currently we have the following capabilities:

  • can_see_status : can access the status page on the web interface
  • can_change_config : can make changes to the configuration (add/delete/modify users, quotas and roles)
  • can_operate_others : can operate on others’ work (see and terminate other users’ executions)
  • can_delete_executions : can permanently delete executions and all the associated logs
  • can_access_api : can access the REST API
  • can_customize_resources : can use the web interface to modify resource reservations when starting ZApps from the shop
  • can_access_full_zapp_shop : has access to all ZApps in the shop
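As a rough illustration, a role can be thought of as a named set of boolean flags, one per capability listed above. The Role class and the can_terminate helper below are assumptions made for this sketch; only the capability names come from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Illustrative role: every capability is off unless granted."""
    name: str
    can_see_status: bool = False
    can_change_config: bool = False
    can_operate_others: bool = False
    can_delete_executions: bool = False
    can_access_api: bool = False
    can_customize_resources: bool = False
    can_access_full_zapp_shop: bool = False


def can_terminate(role: Role, requester: str, owner: str) -> bool:
    """Users may always stop their own executions; stopping someone
    else's requires the can_operate_others capability."""
    return requester == owner or role.can_operate_others


# Two hypothetical roles, for illustration only.
student = Role(name="student", can_access_api=True)
admin = Role(name="admin", can_see_status=True, can_change_config=True,
             can_operate_others=True, can_delete_executions=True,
             can_access_api=True, can_customize_resources=True,
             can_access_full_zapp_shop=True)
```

Here can_terminate(student, "alice", "bob") is refused, while the same request made with the admin role is allowed.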

Permissions in containers

Thanks to the new user system, Zoe knows which user/UID/GID should be used to run the processes in each container of a ZApp execution. Unfortunately most of the images available from the Docker Hub do not care at all and run the processes as root. Jupyter made an effort and runs processes with UID 1000 by default.

One of the principles that Zoe has always tried to adhere to is letting users run whatever image they like, without any change. Currently we are running standard PyTorch, TensorFlow and Jupyter images, but users complain about undeletable files in their workspace and about overly broad permissions that let other users peek into it.

One solution is to make sure all processes run as root (UID 0) in containers and allow access to workspaces via per-user containers, preventing access to other workspaces via container isolation. This is suboptimal for a number of reasons:

  • “users running stuff as root” makes any good sysadmin cringe for very good reasons
  • the files in the workspace will be owned by root, also on the host systems
  • users will need to start a special execution for the gateway container each time they want to copy some files from/to the workspace via SFTP and they have to remember to terminate it when finished

Another solution is to create all container images from a base image that does UID management when the container is started, taking the UID from the environment variables generated by Zoe. This has its own set of problems:

  • users are no longer able to run apt-get install to install some random library that is not available in the image
  • sysadmins need to create and manage images, even when perfectly good ones are already available on the Internet
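The UID management done by such a base image could be sketched as a small Python entrypoint like the one below. The variable names ZOE_UID and ZOE_GID are hypothetical placeholders for whatever Zoe actually exports, and refusing UID 0 is a design choice of this sketch, not documented Zoe behaviour.

```python
import os


def read_identity(environ):
    """Extract the target UID and GID from the environment.

    ZOE_UID and ZOE_GID are assumed names, used only in this example.
    """
    try:
        uid = int(environ["ZOE_UID"])
        gid = int(environ["ZOE_GID"])
    except KeyError as missing:
        raise ValueError(f"missing environment variable: {missing}") from None
    if uid <= 0 or gid < 0:
        # Running as root is exactly what this approach tries to avoid.
        raise ValueError("refusing to run with UID 0 or negative IDs")
    return uid, gid


def drop_privileges(uid, gid):
    """Switch from root to the target user. Groups must be changed
    before the UID, because afterwards the process may no longer
    have the right to change them."""
    os.setgroups([])
    os.setgid(gid)
    os.setuid(uid)


def run_as_user(command, environ):
    """Drop privileges, then replace this process with the real command."""
    uid, gid = read_identity(environ)
    drop_privileges(uid, gid)
    os.execvp(command[0], command)
```

An image entrypoint would call something like run_as_user(sys.argv[1:], os.environ) while still running as root. After that call the application no longer has root privileges, which is also why apt-get install stops working from inside the container.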

There is no universally good solution; it really depends on the use case. At Eurecom we are moving toward the second solution. Users access Zoe via a unified SSH gateway and we cannot have that coexist with root-owned files.

Next steps

ZApp shop

We are working on a good solution for managing access to Zoe applications in the ZApp shop. Ideally an administrator would want to restrict access to some ZApps only to some users.

Administrator web interface

Users, roles and quotas can be managed from the command line. A nice-to-have for the future is the same capability built into the web interface.

Zoe images for data science frameworks

We are slowly building a library of Docker images that understand environment variables about users and UIDs.