Genomics / Bioinformatics / Computational Biology / Deep Learning

Hi All,
I’m starting this thread so we have a place, not tied to a specific course, where we can help each other solve problems that come up when applying bioinformatics, computational biology, or deep learning to genomic data.

Hopefully we can build a helpful community around this thread =)


No expertise, but I’m interested too.

Here’s an open dataset of FASTQ read files from ten soil samples I helped collect in Somerville, MA, if anyone can use it.

Hi All,

I’m working on a project that requires R, RStudio, the single-cell sequencing analysis package Seurat, pheatmap, ggplot2, and dplyr. I’m running into tons of issues, so I’ll note my steps below, with the failures and workarounds.

Set up a virtual environment to analyze single cell sequencing data

conda create --name eplt

conda activate eplt

conda install r -y && conda install rstudio -y && conda install -c conda-forge -c bioconda r-pheatmap -y && conda install -c conda-forge -c bioconda r-seurat -y && rstudio

Create a virtual environment and install R and RStudio

I installed Anaconda3 on macOS.

conda update conda -y

conda create --name eplt # ep stands for epithelial, lt stands for lineage tracing

conda activate eplt

conda search r


Name    Version    Build     Channel
r       3.5.1      r351_0    pkgs/r

conda install r -y

Should this worry me? (The install works, despite the size mismatch.)

SafetyError: The package for r-base located at /Users/drb/anaconda3/pkgs/r-base-3.5.1-h539fb6c_1
appears to be corrupted. The path 'lib/R/doc/html/packages.html'
has an incorrect size.
  reported size: 2946 bytes
  actual size: 32099 bytes
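If you want to see the mismatch for yourself, you can compare the file's size on disk with the number conda reports. This is only a minimal sketch: the helper name `size_matches` is made up for illustration, and it just checks the byte count, it does not tell you why the file changed.

```shell
# Hypothetical helper (not part of conda): succeed if FILE's size on disk
# equals EXPECTED bytes, and print both numbers either way.
size_matches() {
  file=$1
  expected=$2
  # wc -c counts bytes; tr strips the padding some wc builds add
  actual=$(wc -c < "$file" | tr -d ' ')
  echo "reported: $expected bytes, actual: $actual bytes"
  [ "$actual" -eq "$expected" ]
}

# Example with the path and reported size from the error above:
# size_matches /Users/drb/anaconda3/pkgs/r-base-3.5.1-h539fb6c_1/lib/R/doc/html/packages.html 2946
```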

conda search rstudio

  • latest version is 1.1.456

conda install rstudio -y

Run RStudio from the command line:

rstudio

So far, everything works :slight_smile:

Install Seurat into my virtual environment

conda search r-seurat

conda install -c bioconda r-seurat

PackagesNotFoundError: The following packages are not available from current channels:

  - r-seurat -> r-cowplot
  - r-seurat -> r-future.apply
  - r-seurat -> r-ggridges
  - r-seurat -> r-ica
  - r-seurat -> r-metap
  - r-seurat -> r-pbapply
  - r-seurat -> r-rcppprogress
  - r-seurat -> r-rsvd
  - r-seurat -> r-sctransform[version='>=0.2.0']
  - r-seurat -> r-sdmtools
  - r-seurat -> r-dosnow
  - r-seurat -> r-dtw
  - r-seurat -> r-hdf5r
  - r-seurat -> r-diffusionmap
  - r-seurat -> r-tclust

Current channels:


Debug (thanks @cjwinslow!)
conda install -c bioconda r-seurat --debug

conda install -c conda-forge -c bioconda r-seurat
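The missing packages in the error are dependencies that bioconda does not host itself, which is why adding conda-forge to the command fixes it. If you would rather not type both `-c` flags every time, one option is to record the channels in your conda config once. A sketch, assuming conda is on your PATH; `setup_channels` and the `CONDA_BIN` override are names I made up so the function can be dry-run with `echo`.

```shell
# Sketch: store the channel order in the conda config.
# CONDA_BIN is a hypothetical override; set CONDA_BIN=echo to print the
# commands instead of running them.
setup_channels() {
  # --add prepends to the channel list, so the channel added last
  # ends up with the highest priority (conda-forge above bioconda).
  "${CONDA_BIN:-conda}" config --add channels bioconda
  "${CONDA_BIN:-conda}" config --add channels conda-forge
}

# Dry run:  CONDA_BIN=echo setup_channels
# For real: setup_channels && conda config --show channels
```

After that, a plain `conda install r-seurat` should search both channels, though I would still check the output of `conda config --show channels` to confirm the order.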

Check RStudio works

  • yep, all good :slight_smile:
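RStudio opening does not by itself prove that Seurat is usable, so it can be worth asking R directly whether the package loads. A minimal sketch, assuming `Rscript` from the active environment is on your PATH; `check_r_packages` is a made-up helper name.

```shell
# Hypothetical helper: report whether each named R package can be loaded.
check_r_packages() {
  for pkg in "$@"; do
    # requireNamespace() returns FALSE for a missing package; stopifnot()
    # turns that into an error, which makes Rscript exit non-zero.
    # A missing Rscript binary also lands in the "missing" branch.
    if Rscript -e "stopifnot(requireNamespace('$pkg', quietly = TRUE))" >/dev/null 2>&1; then
      echo "ok: $pkg"
    else
      echo "missing: $pkg"
    fi
  done
}

# Example: check_r_packages Seurat pheatmap ggplot2 dplyr
```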

Install pheatmap

conda search pheatmap


Name          Version    Build       Channel
r-pheatmap    1.0.8      r3.2.2_0    bioconda
  • conda install -c bioconda r-pheatmap
    • a large number of packages are installed, removed, downgraded, or superseded by a higher-priority channel
  • rstudio
    • Error: a pop-up window with a triangle & “!” opens (instead of RStudio)

conda install -c conda-forge -c bioconda r-pheatmap

Yay, rstudio runs :slight_smile:
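Since both Seurat and pheatmap ended up needing the same pair of channels, another option is to request them in a single command, so conda solves for everything at once instead of reshuffling packages on each install. This is a sketch under assumptions, not something I have verified on every platform: `install_r_stack` and the `CONDA_BIN` override are made-up names, and the package list is just the one from this thread.

```shell
# Hypothetical helper: install several packages into one environment
# in a single solve, using the conda-forge + bioconda channel pair.
# Set CONDA_BIN=echo to print the command instead of running it.
install_r_stack() {
  env_name=$1
  shift
  "${CONDA_BIN:-conda}" install --name "$env_name" -c conda-forge -c bioconda -y "$@"
}

# Dry run:  CONDA_BIN=echo install_r_stack eplt r-seurat r-pheatmap
# For real: install_r_stack eplt r-seurat r-pheatmap
```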


Hi All, you might find this interesting:


Just saw the last few topics you created, @deena-b. I’m most interested in applying ML in this field, so thank you for creating them. I’m currently working with the fastai library, but I’m always interested in anything around this field. I’ll make sure to try to follow the discussions.


Can you help me?
I’d like to ask you about applying machine learning to bioinformatics.

Hey @deena-b. My project with a startup involves a component of what you’re doing. I’d love to hop on a short chat and see where our work intersects. Send me your email.