Installation
Prerequisites
Confirm conda is installed:
conda --version # should print conda 24.x or higher
Update conda to the latest version before continuing; newer releases solve dependencies faster:
conda update --name base conda
1. Clone the repository
git clone https://github.com/mschecht/sc-preprocess.git
cd sc-preprocess
2. Create the conda environment
A single command installs all dependencies and the sc-preprocess CLI:
conda env create -f environment.yaml
Tip: If you have Mamba installed, use mamba env create -f environment.yaml for faster dependency resolution.
Starting fresh? Remove an existing environment first:
conda env remove --name snakemake8
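If you are unsure whether an older environment exists, the following is a minimal sketch of a check you could run before removing anything. The helper name env_exists is hypothetical (it is not part of sc-preprocess), and it assumes the usual `conda env list` layout in which the environment name is the first column.

```shell
# env_exists NAME: read `conda env list` output on stdin and succeed
# if an environment called NAME is listed in the first column.
# (Hypothetical helper for illustration, not part of sc-preprocess.)
env_exists() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example usage (requires conda on PATH):
#   if conda env list | env_exists snakemake8; then
#       conda env remove --name snakemake8
#   fi
```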
3. Verify the installation
conda activate snakemake8
Check that the CLI and key tools are working:
# Pipeline CLI
sc-preprocess --help
# Core tools
snakemake --version # should print 9.x
bcftools --version # should print 1.22+
samtools --version # should print 1.22+
cellsnp-lite --version # should print 1.2+
vireo --help # should print usage
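The minimum versions in the comments above can also be checked by a script rather than by eye. This is a sketch only: version_ge is a hypothetical helper, it compares plain dotted numeric versions, and the usage comment assumes the first line of the tool's output looks like `samtools 1.22`.

```shell
# version_ge ACTUAL MINIMUM: succeed if ACTUAL >= MINIMUM.
# Compares dotted numeric versions field by field; missing fields count as 0.
# (Hypothetical helper for illustration, not part of sc-preprocess.)
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, ".")
        m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Example usage (assumes output such as "samtools 1.22" on the first line):
#   actual=$(samtools --version | head -n 1 | awk '{print $2}')
#   version_ge "$actual" 1.22 || echo "samtools $actual is older than 1.22"
```

A field-by-field comparison is used because a plain string comparison would rank 1.9 above 1.22.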
Check that Python packages import correctly:
python -c "import scanpy; import anndata; import muon; import snapatac2; import scrublet; print('All imports OK')"
4. Cell Ranger installation
This package wraps 10x Genomics Cell Ranger but does not include the Cell Ranger software itself.
Check if Cell Ranger is already available:
cellranger --version # GEX: 9.0.0+
cellranger-atac --version # ATAC: 2.0.0+
cellranger-arc --version # ARC: 2.0.0+
If not installed, download from the 10x Genomics download page.
Linking Cell Ranger to the conda environment
After installing Cell Ranger, symlink the executables into the conda environment so Snakemake can find them:
ENV_NAME="snakemake8"
CONDA_BIN="$(conda info --base)/envs/${ENV_NAME}/bin"
# Replace /path/to/ with your actual Cell Ranger installation paths
ln -s /path/to/cellranger "$CONDA_BIN/cellranger"
ln -s /path/to/cellranger-atac "$CONDA_BIN/cellranger-atac"
ln -s /path/to/cellranger-arc "$CONDA_BIN/cellranger-arc"
Note: You only need to link the Cell Ranger tools for the modalities you plan to use (e.g., GEX only needs cellranger).
Note: The actual executable may be inside a bin/ subdirectory of the installation, not at the top level. For example, cellranger-arc is typically at cellranger-arc-2.0.2/bin/cellranger-arc, not cellranger-arc-2.0.2/cellranger-arc. Check the installation directory with ls /path/to/cellranger-arc-2.0.2/ and ls /path/to/cellranger-arc-2.0.2/bin/ before symlinking.
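The lookup described in this note can be scripted. The sketch below assumes only the two layouts mentioned above (executable under bin/ or at the top level); find_tool is a hypothetical helper name, not something shipped with Cell Ranger or sc-preprocess.

```shell
# find_tool INSTALL_DIR TOOL: print the path of the TOOL executable inside
# INSTALL_DIR, checking bin/ first and then the top level. A directory that
# happens to share the tool's name is rejected.
# (Hypothetical helper for illustration.)
find_tool() {
    install_dir=$1
    tool=$2
    for candidate in "$install_dir/bin/$tool" "$install_dir/$tool"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no executable named $tool under $install_dir" >&2
    return 1
}

# Example usage (paths are placeholders, as in the commands above):
#   target=$(find_tool /path/to/cellranger-arc-2.0.2 cellranger-arc) &&
#       ln -s "$target" "$CONDA_BIN/cellranger-arc"
```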
After symlinking, verify that each link resolves to a file (not a directory or itself):
ls -la "$CONDA_BIN/cellranger"
ls -la "$CONDA_BIN/cellranger-atac"
ls -la "$CONDA_BIN/cellranger-arc"
The output should show an arrow (->) pointing to an executable file, not a directory. A symlink pointing to a directory will cause a Permission denied error at runtime.
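The same inspection can be automated. The following is a minimal sketch, assuming a POSIX shell; check_link is a hypothetical helper that reports the failure modes mentioned above (a link to a directory, or a link that does not resolve) plus a target that is not executable.

```shell
# check_link PATH: succeed only if PATH is a symlink that resolves to an
# executable regular file; otherwise print the reason and fail.
# (Hypothetical helper for illustration, not part of sc-preprocess.)
check_link() {
    link=$1
    if [ ! -L "$link" ]; then
        echo "$link: not a symlink"
        return 1
    elif [ -d "$link" ]; then
        echo "$link: points to a directory"
        return 1
    elif [ ! -e "$link" ]; then
        echo "$link: broken link (missing target or self-reference)"
        return 1
    elif [ ! -x "$link" ]; then
        echo "$link: target is not executable"
        return 1
    fi
    echo "$link: OK"
}

# Example usage, with CONDA_BIN set as in the linking step:
#   check_link "$CONDA_BIN/cellranger"
```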
Verify with the built-in version checker:
sc-preprocess check-versions
sc-preprocess check-versions --workflow GEX # check specific workflow
Congrats! If you made it this far, you are ready to preprocess single-cell data from 10x Genomics.
Troubleshooting
Conda is slow
Use Mamba as a drop-in replacement for faster solves:
conda install -n base -c conda-forge mamba
mamba env create -f environment.yaml
HPC file quota exceeded
Conda creates many files in your home directory by default, which can exceed file quota limits. This is a recurring problem for HPC users. If your home directory restricts the number of files, redirect conda's package cache and environment directories to a location without quotas:
conda config --add pkgs_dirs /path/to/project_dir/conda_pkgs
conda config --add envs_dirs /path/to/project_dir/conda_envs
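To see how many entries a conda directory contributes toward a file-count quota, a rough count like the one sketched below can help. count_entries is a hypothetical helper. It counts every file and directory name under a path, so hard-linked files are counted once per name, and the result may overstate what your cluster's quota tool reports.

```shell
# count_entries DIR: print the number of files and directories under DIR,
# including DIR itself, without crossing into other filesystems.
# (Hypothetical helper for illustration; a rough estimate only.)
count_entries() {
    find "$1" -xdev 2>/dev/null | wc -l | tr -d ' '
}

# Example usage (the path is an assumption; adjust to your conda location):
#   count_entries "$HOME/.conda"
```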
Git pulling the development version (advanced)
After pulling new changes from the development version of the workflow, update your environment like this:
conda activate snakemake8
pip install -e .
If environment.yaml has changed, recreate the environment:
conda env remove --name snakemake8
conda env create -f environment.yaml
If per-rule conda environment files (workflows/envs/*.yaml) have changed, delete the Snakemake conda cache so they get rebuilt on next run:
rm -rf .snakemake/conda
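Which of the three cases above applies depends on the files a pull touched. The sketch below reads a list of changed paths and prints the matching commands from this section without running them. suggest_update_steps is a hypothetical helper, and the usage comment assumes ORIG_HEAD still points at the commit you were on before pulling.

```shell
# suggest_update_steps: read changed file paths on stdin (one per line) and
# print the update commands from this section that apply. Nothing is executed.
# (Hypothetical helper for illustration, not part of sc-preprocess.)
suggest_update_steps() {
    env_changed=no
    rule_envs_changed=no
    while IFS= read -r path; do
        case "$path" in
            environment.yaml) env_changed=yes ;;
            workflows/envs/*.yaml) rule_envs_changed=yes ;;
        esac
    done
    if [ "$env_changed" = yes ]; then
        echo "conda env remove --name snakemake8"
        echo "conda env create -f environment.yaml"
    else
        echo "pip install -e ."
    fi
    if [ "$rule_envs_changed" = yes ]; then
        echo "rm -rf .snakemake/conda"
    fi
}

# Example usage, run from the repository root right after `git pull`:
#   git diff --name-only ORIG_HEAD HEAD | suggest_update_steps
```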
Next: Quick Start