DIT4C (dit-for-see) is a platform for hosting data analysis tools "in the cloud" using containers. Container images provide the tools, and DIT4C provides a secure hosting platform for them.
As long as users have Internet access and a modern browser, they can reach their tools from wherever they are, and resource administrators can provision the compute wherever makes the most sense (e.g. self-hosted or IaaS).
DIT4C is focused on meeting two needs:
- Training sessions - having a working install right from the beginning means training participants start programming sooner, and do so in a consistent environment.
- Reproducible research - container sharing and export allows complete working environments to be exchanged and archived.
Pick an image, or bring your own
DIT4C has plenty of images to choose from:
If they're not an instant fit, that's OK. DIT4C can use any container image that exposes its functionality via HTTP (80/tcp or 8080/tcp).
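As a rough illustration of what "exposes its functionality via HTTP" can mean, here is a minimal sketch of a tool that a custom image might run, listening on 8080/tcp. The handler, the response text, and the `make_server` helper are all hypothetical; nothing here is part of DIT4C itself, and a real tool would serve its own interface.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class ToolHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a tool's web interface."""

    def do_GET(self):
        body = b"Hello from a custom tool\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080, host="0.0.0.0"):
    # Bind to all interfaces so the port is reachable from outside the container.
    # 8080 is one of the two ports mentioned above (80/tcp or 8080/tcp).
    return HTTPServer((host, port), ToolHandler)
```

An image built around this sketch would start it from its entry point with `make_server().serve_forever()`.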
Want a package installed for all new containers?
Extend one of the existing tool images.
    # DOCKERFILE 1.0
    FROM dit4c/dit4c-container-ipython
    # Install the extra package at build time
    RUN apt-get update && apt-get install -y awesome-package-we-forgot
Need to run your own tool?
Extend one of the base images:
DIT4C container instances provide access to research tools, and can be stopped and saved to an image to persist their state. This allows work to continue on different compute nodes or clusters. Persisted instances can also be shared with other users.
In this example, Alice saves a1 and shares it with Bob. Bob then creates b1 to continue working. Bob saves b1 and continues work in b2, but doesn't like how things have turned out. He discards b2 and begins work from b1 again with b3.
Both Alice and Bob finish their work and export their environments so that others can reproduce their analyses in future.
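The save, share, and branch steps in this example can be modelled as a small tree of instances. The sketch below is a toy model only: the `Instance` class and its methods are hypothetical names chosen for illustration and do not correspond to DIT4C's actual API or storage format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    """Toy model of a container instance and the saved instance it came from."""
    name: str
    owner: str
    parent: Optional["Instance"] = None
    saved: bool = False
    discarded: bool = False

    def save(self) -> "Instance":
        # Persist this instance's state so it can be shared or resumed.
        self.saved = True
        return self

    def start_from(self, name: str, owner: str) -> "Instance":
        # Only a saved instance can be used as a starting point.
        if not self.saved:
            raise ValueError(f"{self.name} must be saved before it can be reused")
        return Instance(name=name, owner=owner, parent=self)

    def discard(self) -> None:
        self.discarded = True

    def lineage(self) -> List[str]:
        # Walk back through the parents to list the chain of saved states.
        chain = []
        node: Optional[Instance] = self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return list(reversed(chain))


# Alice saves a1 and shares it; Bob continues from it as b1.
a1 = Instance("a1", "alice").save()
b1 = a1.start_from("b1", "bob").save()
# Bob tries b2, discards it, and restarts from b1 as b3.
b2 = b1.start_from("b2", "bob")
b2.discard()
b3 = b1.start_from("b3", "bob")

print(b3.lineage())  # ['a1', 'b1', 'b3']
```

Because b3 starts from the saved b1 rather than from b2, the discarded work never appears in its history.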
To improve scalability and security, DIT4C runs all containers on compute nodes separate from the portal.
A portal can have multiple schedulers, each able to start containers on compute nodes. By using routing servers, DIT4C allows compute nodes and schedulers to sit on private networks with no open inbound ports, which provides better security for compute and data.
Read about DIT4C's architecture in more detail
Let's get started...
Release Signing Key