Setting up the environment¶
The first step is to set up the environment so that the analysis is as reproducible as possible. In this tutorial we use the conda environment manager to install Python, pyrpipe, and the required tools and dependencies into a single environment. Note: Conda must be installed on the system. For help with setting up conda, please see the Miniconda documentation.
Create a new conda environment¶
To create a new conda environment with Python 3.8, execute the following command. We recommend sharing conda environment files along with pipeline scripts so that others can reproduce the analysis.
conda create -n pyrpipe python=3.8
Activate the newly created conda environment and install the required tools:
conda activate pyrpipe
conda install -c bioconda pyrpipe star=2.7.7a sra-tools=2.10.9 stringtie=2.1.4 trim-galore=0.6.6 orfipy=0.0.3 salmon=1.4.0
If the above command fails, add the conda channels in the order shown below and then try again.
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
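The order matters because `conda config --add channels` places each channel at the top of the list, so the channel added last (conda-forge) gets the highest priority, followed by bioconda and then defaults. The following is a minimal sketch for inspecting the resulting order. The function name `show_channels` is hypothetical and is not part of conda or pyrpipe.

```shell
# Hypothetical helper: print the configured channels in priority order.
show_channels() {
    # Fail early with a clear message if conda is not on PATH.
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    # Channels are listed from highest to lowest priority.
    conda config --show channels
}
```

After running the three commands above, calling `show_channels` should list conda-forge first, then bioconda, then defaults.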
Using conda environment in yaml files¶
We provide a YAML file listing the conda packages required to reproduce the pyrpipe environment. To create a conda environment from this file, run:
conda env create -f pyrpipe_environment.yml
Users can also export and share their own conda environments as YAML files. To export the currently active conda environment as YAML, run the following command:
conda env export | grep -v "^prefix: " > environment.yml
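The `grep -v "^prefix: "` filter in the command above drops the line that records the absolute path of the environment on the exporting machine, which is not meaningful on another system. The sketch below shows the effect of the filter on a made-up export. The helper name `strip_prefix` and the sample content (package list and path) are illustrative assumptions, not real output.

```shell
# Hypothetical helper: remove the machine-specific "prefix:" line from
# `conda env export` output read on stdin.
strip_prefix() {
    grep -v "^prefix: "
}

# Illustrative input only; a real export lists many more packages.
sample_export='name: pyrpipe
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.8
prefix: /home/user/miniconda3/envs/pyrpipe'

# Prints every line of the sample except the one starting with "prefix: ".
printf '%s\n' "$sample_export" | strip_prefix
```

If the exported file still fails to resolve on another platform, `conda env export --no-builds` (omits build strings) or `conda env export --from-history` (lists only explicitly requested packages) may give a more portable file, at the cost of less exact pinning.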
To recreate the conda environment described in environment.yml, run:
conda env create -f environment.yml
Automated installation of required tools¶
We have also provided a utility to install required RNA-Seq tools via a single command:
pyrpipe_diagnostic build-tools
Note: Users must verify the versions of the tools installed in the conda environment.
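One way to carry out this check is to compare the installed versions against the versions pinned earlier in this tutorial. The sketch below assumes the `name=version=build` line format printed by `conda list --export`. The function name `check_pins` is hypothetical and is not part of pyrpipe or conda.

```shell
# Hypothetical helper: read "name=version=build" lines on stdin (the format
# printed by `conda list --export`) and compare them with the pins given as
# arguments in "name=version" form. Returns non-zero if any pin is unmet.
check_pins() {
    listing=$(cat)
    status=0
    for pin in "$@"; do
        name=${pin%%=*}   # text before the first "="
        want=${pin#*=}    # text after the first "="
        # Take the version field of the line whose package name matches exactly.
        have=$(printf '%s\n' "$listing" |
            awk -F= -v n="$name" '$1 == n { v = $2 } END { print v }')
        if [ "$have" = "$want" ]; then
            echo "ok       $name $have"
        else
            echo "MISMATCH $name wanted=$want found=${have:-none}"
            status=1
        fi
    done
    return "$status"
}
```

For example, `conda list -n pyrpipe --export | check_pins star=2.7.7a sra-tools=2.10.9 stringtie=2.1.4 trim-galore=0.6.6 orfipy=0.0.3 salmon=1.4.0` reports one line per tool and exits with a non-zero status if any installed version differs from the pin.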
Setting up NCBI SRA-Tools¶
After installing sra-tools, please configure prefetch to save downloads to the public user-repository. This ensures that the prefetch command downloads data to the user-defined directory. To do this:
- Run vdb-config -i in a terminal to open the NCBI SRA-Tools configuration editor.
- Under the TOOLS tab, set the prefetch downloads option to public user-repository.
Users can test whether SRA-Tools has been set up properly by running the following command:
pyrpipe_diagnostic test