ML Prototyping Environment on Cloud

Teams that collaborate on data-science tasks on cloud platforms often share a preconfigured ML environment, such as the Kaggle Docker Python image. This addresses reproducibility and dependency issues, while individual team members can still add custom packages on top using local virtual environments, for example less common packages for computer vision.

This setup requires creating the local virtual environment with the --system-site-packages option, so that it can also see the packages of the base environment. Below is an example of a local environment that adds the DeepForest package (not present in the Kaggle image).

root@cf1b6f63d729:/home/jupyter/src/tree_counting# python -m venv .deepforest --system-site-packages
root@cf1b6f63d729:/home/jupyter/src/tree_counting# .deepforest/bin/python -m pip install --upgrade pip --quiet
root@cf1b6f63d729:/home/jupyter/src/tree_counting# .deepforest/bin/python -m pip install deepforest --quiet
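
The same layered environment can also be created from Python with the standard-library venv module. The sketch below is only an illustration: the helper names create_layered_venv and inherits_system_packages are made up for this example, and the check relies on the include-system-site-packages key that venv writes to pyvenv.cfg.

```python
import tempfile
import venv
from pathlib import Path


def create_layered_venv(env_dir, with_pip=True):
    """Create a venv that can also import packages from the base interpreter.

    Hypothetical helper; equivalent in spirit to
    `python -m venv <env_dir> --system-site-packages`.
    """
    builder = venv.EnvBuilder(system_site_packages=True, with_pip=with_pip)
    builder.create(env_dir)
    return Path(env_dir) / "pyvenv.cfg"


def inherits_system_packages(cfg_path):
    """Return True if pyvenv.cfg enables the system site-packages."""
    for line in Path(cfg_path).read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


if __name__ == "__main__":
    # Use a throwaway directory; skip pip bootstrapping to keep the demo fast.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_layered_venv(Path(tmp) / ".demo", with_pip=False)
        print(inherits_system_packages(cfg))
```

Inspecting pyvenv.cfg this way is a quick sanity check when a local environment unexpectedly fails to see the packages of the base image.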

The local environment can then be exposed to Jupyter as a custom kernel.

root@cf1b6f63d729:/home/jupyter/src/tree_counting# source .deepforest/bin/activate
(.deepforest) root@cf1b6f63d729:/home/jupyter/src/tree_counting# python -m ipykernel install --user --name .deepforest --display-name "Kaggle+DeepForest"
Installed kernelspec .deepforest in /root/.local/share/jupyter/kernels/.deepforest
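
For orientation, the installed kernelspec is essentially a small kernel.json file that tells Jupyter which interpreter to launch. The sketch below builds roughly what such a file contains; build_kernel_spec is a hypothetical helper, the interpreter path is assumed from the session above, and the exact fields written by ipykernel may differ between versions.

```python
import json


def build_kernel_spec(venv_python, display_name):
    """Return a dict resembling the kernel.json of a Python kernel.

    Hypothetical helper for illustration; it does not install anything.
    """
    return {
        # Jupyter substitutes {connection_file} when it starts the kernel.
        "argv": [venv_python, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


# Assumed interpreter location inside the local environment.
spec = build_kernel_spec(
    "/home/jupyter/src/tree_counting/.deepforest/bin/python",
    "Kaggle+DeepForest",
)
print(json.dumps(spec, indent=1))
```

The important part is that argv points at the interpreter of the local environment, so notebooks using this kernel see both the base packages and the locally installed ones.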

The architecture is shown below.

Dev Environment Architecture, generated with PlantUML.

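The following interactive session shows the difference between system-level and local packages: deepforest is loaded from the local environment, while tensorflow comes from the base image.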

(.deepforest) root@cf1b6f63d729:/home/jupyter/src/tree_counting# python
Python 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import tensorflow
>>> import deepforest
>>> deepforest.__file__
'/home/jupyter/src/tree_counting/.deepforest/lib/python3.7/site-packages/deepforest/__init__.py'
>>> tensorflow.__file__
'/opt/conda/lib/python3.7/site-packages/tensorflow/__init__.py'
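
The same check can be automated by testing whether a module file lives under the prefix of the local environment. This is a minimal sketch: classify_location is a hypothetical helper, and the paths are the ones printed in the session above.

```python
from pathlib import PurePosixPath


def classify_location(module_file, venv_prefix):
    """Return "local" if module_file lies under venv_prefix, else "system"."""
    try:
        # relative_to raises ValueError when the file is outside the prefix.
        PurePosixPath(module_file).relative_to(PurePosixPath(venv_prefix))
    except ValueError:
        return "system"
    return "local"


venv_prefix = "/home/jupyter/src/tree_counting/.deepforest"
paths = {
    "deepforest": "/home/jupyter/src/tree_counting/.deepforest/lib/python3.7/site-packages/deepforest/__init__.py",
    "tensorflow": "/opt/conda/lib/python3.7/site-packages/tensorflow/__init__.py",
}
for name, module_file in paths.items():
    print(name, classify_location(module_file, venv_prefix))
# prints:
# deepforest local
# tensorflow system
```

Inside an activated environment, sys.prefix could be passed as venv_prefix and a module's __file__ attribute as module_file.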

Finally, it is worth mentioning the Dev Containers extension, which connects the IDE to a running container, so that we can enjoy all the VS Code features 🙂

IDE connected to a container.

Published by mskorski

Scientist, Consultant, Learning Enthusiast
