A Docker image for the Cellgeni JupyterHub installation, with template notebooks for scientific analysis. It is based on docker-stacks and used with Zero to JupyterHub with Kubernetes.
This Docker image is the default for every user of the Cellgeni JupyterHub installation. It contains the R packages from jupyter/r-notebook and the Python packages from jupyter/scipy-notebook.
The /files folder contains files that will be copied into each user's home directory by default.
- Clone the private repository with Cellgeni JupyterHub settings:
git clone https://gitlab.internal.sanger.ac.uk/cellgeni/kubespray/
cd kubespray/sanger/sites
- Add the new user's GitHub username to auth.whitelist.users, or change the Docker image at singleuser.image.tag, in jupyter-github-auth.yaml (jupyter-large-config.yaml for jupyter-large).
- Commit and push your changes so that your colleagues do not overwrite them in later upgrades:
git add jupyter-github-auth.yaml jupyter-large-config.yaml && git commit -m "Add new users" && git push
- Upgrade Jupyter with
helm upgrade jpt jupyterhub/jupyterhub --namespace jpt --version 0.8.0 --values jupyter-github-auth.yaml
or jupyter-large with
helm upgrade jptl jupyterhub/jupyterhub --namespace jptl --version 0.8.0 --values jupyter-large-config.yaml
- Wait until the hub's state switches to Running. Monitor it with
kubectl get pods -n jpt
or, for jupyter-large, with
kubectl get pods -n jptl
- In your browser go to https://jupyter.cellgeni.sanger.ac.uk.
- Use your GitHub credentials for authentication. It may take some time to load the first time.
- Now you are ready to run your notebooks!
At the moment, every user is given by default 50 GB (guaranteed) to 200 GB (maximum, if available) of RAM and 1 (guaranteed) to 16 (maximum, if available) CPUs. The default storage volume is 100 GB.
For special cases, we have https://jupyter-xl.cellgeni.sanger.ac.uk with 150 GB of RAM, 150 GB of storage and 4 to 16 CPUs. It is available upon request.
- JupyterHub environment and storage are not backed up!!! Please only use for computations and download your results (and notebooks) afterwards. If you store your data there you can easily lose it. You've been warned!
- Do not modify files in the data and notebooks folders directly: make a copy, put it in a separate folder and work with the copy. Changes to the original files in the data and notebooks folders will not survive server updates.
- Please read the instructions on package installations below.
- The JupyterHub website is public, so you don't need to turn on VPN to use it. However, it is only available to users who have messaged us their GitHub usernames and have been whitelisted.
- You can switch to the classic Jupyter interface by changing the word lab in your address bar to the word tree:
https://jupyter.cellgeni.sanger.ac.uk/user/<your-username>/tree
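The same substitution can be scripted if you want to build the link in a terminal. This is only a sketch: the function name switch_interface and the username jdoe are hypothetical, and it assumes the URL ends in /lab.

```shell
# switch_interface URL TARGET
# Replaces a trailing "/lab" in a JupyterHub user URL with "/TARGET".
# Hypothetical helper for illustration; it is not shipped with the image.
switch_interface() {
    url=$1
    target=$2
    case "$url" in
        */lab) printf '%s/%s\n' "${url%/lab}" "$target" ;;
        *)     printf '%s\n' "$url" ;;   # no trailing /lab: leave the URL unchanged
    esac
}

switch_interface "https://jupyter.cellgeni.sanger.ac.uk/user/jdoe/lab" tree
# prints https://jupyter.cellgeni.sanger.ac.uk/user/jdoe/tree
```

Passing rstudio as the second argument gives the RStudio address in the same way.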
We provide some notebook templates with the pre-installed software. These are located in the notebooks folder. Corresponding example data is located in the data folder. Before running your analysis, please make a copy of a notebook template and work with the copy.
- You can copy files to and from Jupyter directly in the web interface (menu and a button in the interface).
- You can also copy data to and from the farm using a terminal (click on the Terminal icon in the Launcher). To copy from the farm (e.g. for the user ak27):
mkdir farm
rsync -avzh ak27@farm4-login:/nfs/users/nfs_a/ak27/<some-file-name> farm/
To copy from the local environment to the farm:
rsync -avzh <some-file-name> ak27@farm4-login:/nfs/users/nfs_a/ak27/
By default, JupyterHub does not provide a way to download folders, but you can create a compressed archive
tar cvfz <some-archive-name>.tar.gz <target-directory>/
and download the resulting file with the right-click "Download" option.
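A minimal sketch of the archiving step, run against a throwaway folder so it can be tried safely. The folder and file names are made up for illustration. Because the z flag gzip-compresses the archive, the sketch names it with a .tar.gz extension.

```shell
# Build a small example folder in a temporary location (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/results"
echo "example" > "$workdir/results/summary.txt"

# -c create, -z gzip-compress, -f write to the named file;
# -C changes into $workdir first so the archive holds relative paths.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents to check it before downloading.
tar -tzf "$workdir/results.tar.gz"
```

Listing the archive before downloading is a cheap way to confirm that the folder you meant to export is actually inside it.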
To export a notebook as PDF, install the following pre-requisite software:
sudo apt update && sudo apt-get install -y texlive-generic-recommended
Now you can export a notebook through the "File > Export notebook as..." menu.
To export an R notebook as PDF, install the following pre-requisite software:
wget -qO- "https://yihui.org/gh/tinytex/tools/install-unx.sh" | sh
If that is not enough, the easiest way is to install the whole texlive package. The downside is that it is 4.5 GB:
sudo apt update && sudo apt-get install -y texlive-full
- Go to your API Tokens page or go to hub/home and then click "Token" on the top menu.
- Type in a note like "Shared with collaborator X"
- Click the orange button "Request new API token"
- Copy the token that shows up under "Your new API Token" (e.g. ba5eba11b01dfaceca55e77ecacaca11).
- Go to your Jupyter instance, but using the "tree" view instead of the "lab" view:
https://jupyter.cellgeni.sanger.ac.uk/user/<your username>/tree
- Find your notebook and open it. You should be on a link that looks like:
https://jupyter.cellgeni.sanger.ac.uk/user/<your username>/notebooks/some_notebook.ipynb
- Add this to the end of the link:
?token=<your API token>
and copy that link (e.g. ?token=ba5eba11b01dfaceca55e77ecacaca11).
- Share what you have copied. It should be something like:
https://jupyter.cellgeni.sanger.ac.uk/user/<your username>/notebooks/some_notebook.ipynb?token=<your API token>
- Once you have finished the collaboration, go to your API Tokens page and click "Revoke" to delete that access token.
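The link assembled in the steps above can also be built in a terminal. This is a sketch under stated assumptions: build_share_url is a hypothetical helper, jdoe is a made-up username, and the token is the placeholder value from the steps above, not a real one.

```shell
# build_share_url USER NOTEBOOK_PATH TOKEN
# Hypothetical helper: joins the pieces described in the steps above
# into one shareable link.
build_share_url() {
    base="https://jupyter.cellgeni.sanger.ac.uk"
    printf '%s/user/%s/notebooks/%s?token=%s\n' "$base" "$1" "$2" "$3"
}

# Example with a made-up username and the placeholder token.
build_share_url jdoe some_notebook.ipynb ba5eba11b01dfaceca55e77ecacaca11
# prints https://jupyter.cellgeni.sanger.ac.uk/user/jdoe/notebooks/some_notebook.ipynb?token=ba5eba11b01dfaceca55e77ecacaca11
```

The token grants access for as long as it exists, so share the resulting link only with the intended collaborator and revoke the token afterwards.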
Default conda environments are not persistent across Jupyter sessions: you can install an additional package, but it will not be there the next time you start Jupyter. To have a persistent conda environment, you can create one inside your /home folder:
- Open a new terminal (click on the Terminal icon in the Launcher).
- Run the following commands (replace myenv with your environment name):
conda create --name myenv
source activate myenv
# you must install nb_conda package if you want to use this environment as a Kernel inside your notebook
conda install nb_conda
# conda install all packages you need
# ...
- Instead of creating a new environment, you can also clone an existing one, e.g.:
conda create --clone old_name --name new_name
This avoids having to reinstall the same packages.
- Reload the main page. Now you will see your new environment in the Launcher.
By default, pip installs Python packages to a system directory. To make sure that your packages persist, they need to be installed in your home directory. Use the --user option to do this:
pip install --user package_name
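To see where --user installs end up, you can ask Python for its per-user site directory. This is a small sketch, assuming python3 is on the PATH. On Linux the path is usually under ~/.local, unless PYTHONUSERBASE has been set.

```shell
# Print the directory that `pip install --user` installs into.
# Guarded so the snippet does nothing if python3 is not on the PATH.
if command -v python3 >/dev/null 2>&1; then
    # `python3 -m site --user-site` prints the per-user site-packages path;
    # `|| true` ignores its non-zero exit status when the user site is disabled.
    user_site=$(python3 -m site --user-site) || true
    echo "User packages are installed under: $user_site"
fi
```

If the printed path is inside your home directory, packages installed with --user are stored alongside your other files there.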
Open a new terminal on your Jupyter and follow these steps:
- Install icommands:
wget https://files.renci.org/pub/irods/releases/4.1.10/ubuntu14/irods-icommands-4.1.10-ubuntu14-x86_64.deb
sudo apt-get install ./irods-icommands-4.1.10-ubuntu14-x86_64.deb
- Create the .irods folder in your home directory:
mkdir -p ~/.irods
- Copy your irods_environment.json from your home directory on the farm to your Jupyter instance:
scp ak27@farm4-login:/nfs/users/nfs_a/ak27/.irods/irods_environment.json ~/.irods/
- Run iinit. If asked for a password, enter your iRODS password. Don't know your iRODS password? Go to the farm and type:
head -1 ~/.irods/irods_password
The output, something like "xUEJAslQ", is your password.
To mount the farm's base paths (/nfs, /lustre and /warehouse) on your Jupyter instance:
- Open a new terminal on your Jupyter.
- Type mount-farm, then press Enter.
- When prompted for your username and password, enter them.
The three folders will be mounted on the root folder of your instance.
Try opening a new terminal and changing directory to your farm home
cd /nfs/users/nfs_u/usr99
or your team's lustre
cd /lustre/scratch11X/team999
and then type ls to see the files. You can use the same paths in your notebooks.
You will not see these folders in Jupyter's File Browser because it only shows /home/jovyan. If you really want to see them in your File Browser, you need to create symlinks in your home folder that point to the mounted folders.
For example:
ln -s /nfs /home/jovyan/nfs
ln -s /warehouse /home/jovyan/warehouse
ln -s /lustre /home/jovyan/lustre
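The three commands above can be wrapped in a loop that skips mounts that are missing and links that already exist. This is a sketch only: link_mounts is a hypothetical helper, and the demonstration runs in a temporary directory rather than against the real /nfs, /lustre and /warehouse.

```shell
# link_mounts ROOT HOME_DIR
# For each farm base path under ROOT, create a symlink in HOME_DIR
# unless the mount is missing or something with that name already exists.
# Hypothetical helper; on the real instance ROOT would be / and
# HOME_DIR would be /home/jovyan.
link_mounts() {
    root=$1
    home_dir=$2
    for name in nfs lustre warehouse; do
        target="${root%/}/$name"
        link="$home_dir/$name"
        if [ ! -d "$target" ]; then
            echo "skip $name: $target is not mounted"
        elif [ -e "$link" ] || [ -L "$link" ]; then
            echo "skip $name: $link already exists"
        else
            ln -s "$target" "$link"
            echo "linked $link -> $target"
        fi
    done
}

# Demonstration in a throwaway directory, so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/root/nfs" "$demo/root/lustre" "$demo/home"
link_mounts "$demo/root" "$demo/home"
```

Checking for an existing link first makes the helper safe to rerun, for example after a server restart when the farm has to be mounted again.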
R and RStudio are also available on JupyterHub:
- A new R session can be started from the Launcher
- To switch to the RStudio interface, change the word lab in your address bar to the word rstudio:
https://jupyter.cellgeni.sanger.ac.uk/user/<your-username>/rstudio
Sometimes, a server restart might solve an issue. For that:
- Go to the menu "File" > "Hub Control Panel" or browse to
https://jupyter.cellgeni.sanger.ac.uk/hub/home
- Hit "Stop my server"
- Reload the page.
- If RStudio displays "[Errno 111] Connection refused", try restarting the server.
- If RStudio displays an error "Rsession did not start in time", go to the lab interface, start a terminal, and delete the last R session:
ls -a .rstudio/sessions/active # see all active sessions
rm -r .rstudio/sessions/active/<session-name> # note the name of the last active session and delete it
and reload RStudio.
- If RStudio displays an error "Could not start RStudio in time", it might be because you ran out of disk space. Check your disk usage with
df -h /home/jovyan/
or
du -ha -d 1 ~
If the home directory size is close to the limit, you need to delete some files or move to (or request) a JupyterHub with more space.
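The disk space check can be turned into a quick warning script. This is a sketch, not part of the image: usage_percent is a hypothetical helper, and the 90% threshold is an arbitrary value chosen for illustration.

```shell
# usage_percent DIR
# Prints the used percentage (digits only) of the filesystem holding DIR.
# `df -P` uses the POSIX output format, whose 5th column is the capacity in percent.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

dir=${HOME:-/}              # on the JupyterHub instance this is /home/jovyan
[ -d "$dir" ] || dir=/      # fall back to / if the directory does not exist
threshold=90                # hypothetical warning level, adjust as needed

used=$(usage_percent "$dir")
if [ "$used" -ge "$threshold" ]; then
    echo "Warning: $dir is ${used}% full"
else
    echo "$dir is ${used}% full"
fi
```

If the warning appears, du -ha -d 1 ~ shows which top-level folders take the most space.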