SCGC Bioinformatics Workshop

A workshop hosted by the Single Cell Genomics Center and Bigelow Laboratory for Ocean Sciences

Jupyterhub Setup

Getting started on the Jupyterhub

You should all have received a username and password for the course Hub.

To sign in, navigate to jupyterhub.bigelow.org in your browser and enter your username and password.

You should now see what looks like a file system on the left, and a working environment on the right. This is your own personal file system within the larger jupyterhub set up for this course.


Setting up your home directory

There are a couple of things we need to do to get your system set up. First, we want to link the storage directory to your home directory. To do this, open a terminal as follows:

  1. Click the plus in the upper-left corner of the hub.
  2. Under ‘Other’, select Terminal to open a terminal tab.
  3. Then type:
$ ln -s /mnt/storage/ storage
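
If you are not sure whether the link already exists, the same command can be wrapped in a small check. This is only a minimal sketch: `ensure_link` is a hypothetical helper name, not something provided by the hub, and it assumes a terminal that starts in your home directory.

```shell
# Sketch: create a symbolic link only if nothing is in the way, then report it.
# ensure_link is a hypothetical helper name, not a command provided by the hub.
ensure_link() {
    target=$1       # what the link should point to, e.g. /mnt/storage/
    link_name=$2    # name of the link to create, e.g. storage
    if [ -L "$link_name" ]; then
        echo "$link_name already links to $(readlink "$link_name")"
    elif [ -e "$link_name" ]; then
        echo "$link_name exists but is not a link; leaving it untouched"
    else
        ln -s "$target" "$link_name"
        echo "created $link_name -> $(readlink "$link_name")"
    fi
}
```

From your home directory this could be called as `ensure_link /mnt/storage/ storage`, which should have the same effect as the `ln -s` command above the first time it is run.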

This storage directory contains all of the data for the course as well as space for you to work. Shared course data can be found at:

storage/data/

Everyone can have their own working directory in the folder:

storage/userlab

Let’s navigate there now and create working directories for ourselves, replacing {your_username} with your own username:

$ cd storage/userlab
$ mkdir {your_username}

Consider this your workspace for the week. Feel free to copy data from the data directory into your own directory to look at it.
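
The directory-creation step above can also be written so that it is safe to repeat. The following is a minimal sketch rather than a required step; `make_workspace` is a hypothetical helper name introduced only for illustration.

```shell
# Sketch: create a personal working directory and report where it is.
# make_workspace is a hypothetical helper, not something installed on the hub.
make_workspace() {
    userlab=$1      # shared parent folder, e.g. storage/userlab
    name=$2         # your hub username
    mkdir -p "$userlab/$name"        # -p: no error if the directory already exists
    ( cd "$userlab/$name" && pwd )   # print the full path of the workspace
}
```

From your home directory this could be called as `make_workspace storage/userlab your_username`, with `your_username` replaced by the username you were given.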


Conda Environments

The jupyterhub is installed and managed with conda, so all rules for managing and finding conda environments apply within this hub. We have pre-installed several different environments.

Pre-installed environments can be found here:

storage/envs/

They can be loaded like this:

conda activate storage/envs/biopy

The currently available environments are:
DRAM: source activate storage/envs/dram-1.4.6
DeepVirFinder: source activate storage/envs/dvf
VirSorter2: source activate storage/envs/vs2
Anvi’o: source activate storage/envs/anvio-8
CheckV: source activate storage/envs/checkv
CoverM: source activate storage/envs/coverm
biopy: source activate storage/envs/biopy
R: source activate storage/envs/renv
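
To see at a glance which of these environment folders are actually present, a short loop can test each path. This is a minimal sketch, assuming it is run from your home directory where the storage link lives; `check_envs` is a hypothetical helper name, and the folder names are simply those from the list above.

```shell
# Sketch: report which of the listed environment folders are present.
# check_envs is a hypothetical helper name used only for this example.
check_envs() {
    env_root=$1     # folder that holds the environments, e.g. storage/envs
    for env in dram-1.4.6 dvf vs2 anvio-8 checkv coverm biopy renv; do
        if [ -d "$env_root/$env" ]; then
            echo "found    $env_root/$env"
        else
            echo "missing  $env_root/$env"
        fi
    done
}

# Check the course environments relative to the current directory
check_envs storage/envs
```

Any folder reported as missing usually means the command was not run from your home directory, or that the storage link has not been created yet.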


Jupyter notebooks

Two of these environments may be used within Jupyter notebooks: R (an R kernel) and biopy (a Python kernel). The R notebook kernel (just called ‘R’ within the lab interface) has tidyverse packages pre-installed. The biopy kernel has data science packages such as pandas, numpy and matplotlib installed. You can select which kernel you’d prefer when starting a new Jupyter notebook.

Feel free to install your own environments and software if you want to run your own analyses during the workshop.


Getting data into and out of the Jupyterhub

To upload files, you can either drag and drop them from your desktop into your hub file system, or use the upload arrow in the upper-left portion of the hub.

To download files from the hub to your computer, navigate to the file within the file tree on the left side of the hub, right-click the file and select ‘Download’.
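
If you want to download a whole folder of results, one option is to pack it into a single archive in the terminal first and then download that one file as described above. The following is a minimal sketch using the standard `tar` command; `bundle_for_download` is a hypothetical helper name, and the folder and archive names in the usage note are placeholders.

```shell
# Sketch: pack a results folder into one compressed archive for download.
# bundle_for_download is a hypothetical helper name used only for this example.
bundle_for_download() {
    src_dir=$1      # folder to pack
    archive=$2      # archive file to create, e.g. results.tar.gz
    # -C switches to the parent folder first so the archive holds relative paths
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
    tar -tzf "$archive"     # list the archive contents as a quick check
}
```

For example, `bundle_for_download my_results my_results.tar.gz` would create `my_results.tar.gz` next to a folder called `my_results`; the archive then appears in the file tree and can be downloaded with a right-click.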


Jupyter Notebook Demo

https://github.com/Bigelow-eSCG-tutorials/Day1AM_intro_to_jupyterhub/

Last updated on 1 Apr 2024
Published on 1 Apr 2024