Genomics / Bioinformatics / Computational Biology / Deep Learning

Hi All,
I’m starting this thread so that we have somewhere, not tied to a specific course, to help each other solve problems that come up when doing bioinformatics, computational biology, or deep learning on genomic data.

Hopefully we can build a helpful community around this thread =)

No expertise, but I’m interested too.

Here’s an open dataset of FASTQ read files from ten soil samples I helped collect in Somerville, MA, if anyone can use it: https://www.kaggle.com/stillsut/boslab-initial-backyard-analysis/data

Hi All,

I’m working on a project that requires R, RStudio, the single-cell sequencing analysis package Seurat, pheatmap, ggplot2, and dplyr. I’m running into tons of issues, so I’ll note my steps below, with the failures and work-arounds.

Set up a virtual environment to analyze single cell sequencing data

TLDR
conda create --name eplt

conda activate eplt

conda install r -y && conda install rstudio -y && conda install -c conda-forge -c bioconda r-pheatmap -y && conda install -c conda-forge -c bioconda r-seurat -y && rstudio

Create a virtual environment and install r and rstudio

I installed Anaconda3 on macOS.

conda update conda -y

conda create --name eplt # ep stands for epithelial, lt stands for lineage tracing

conda activate eplt

conda search r

output:

Name  Version  Build   Channel
r     3.5.1    r351_0  pkgs/r

conda install r -y

Should this worry me? (The install works, despite the size mismatch.)

SafetyError: The package for r-base located at /Users/drb/anaconda3/pkgs/r-base-3.5.1-h539fb6c_1
appears to be corrupted. The path 'lib/R/doc/html/packages.html'
has an incorrect size.
  reported size: 2946 bytes
  actual size: 32099 bytes

conda search rstudio

  • latest version is 1.1.456

conda install rstudio -y

Run RStudio from the command line:
rstudio

So far, everything works :)

Install Seurat into my virtual environment

conda search r-seurat

conda install -c bioconda r-seurat

PackagesNotFoundError: The following packages are not available from current channels:

  - r-seurat -> r-cowplot
  - r-seurat -> r-future.apply
  - r-seurat -> r-ggridges
  - r-seurat -> r-ica
  - r-seurat -> r-metap
  - r-seurat -> r-pbapply
  - r-seurat -> r-rcppprogress
  - r-seurat -> r-rsvd
  - r-seurat -> r-sctransform[version='>=0.2.0']
  - r-seurat -> r-sdmtools
  - r-seurat -> r-dosnow
  - r-seurat -> r-dtw
  - r-seurat -> r-hdf5r
  - r-seurat -> r-diffusionmap
  - r-seurat -> r-tclust

Current channels:

  - https://conda.anaconda.org/bioconda/osx-64
  - https://conda.anaconda.org/bioconda/noarch
  - https://repo.anaconda.com/pkgs/main/osx-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/free/osx-64
  - https://repo.anaconda.com/pkgs/free/noarch
  - https://repo.anaconda.com/pkgs/r/osx-64
  - https://repo.anaconda.com/pkgs/r/noarch
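One way to narrow down an error like this is to ask each channel directly whether it carries the missing dependencies. Note that conda-forge is not in the channel list above. The snippet below is only a sketch: `check_channel` is a made-up helper name, not a conda command, and it assumes `conda search` exits with a non-zero status when nothing matches.

```shell
# Sketch: report which of the given packages a single conda channel provides.
# check_channel is a hypothetical helper name, not part of conda.
check_channel() {
  local channel="$1"
  shift
  local pkg
  for pkg in "$@"; do
    # --override-channels makes conda search only the channel given with -c.
    if conda search --override-channels -c "$channel" "$pkg" >/dev/null 2>&1; then
      echo "$pkg: found on $channel"
    else
      echo "$pkg: not found on $channel"
    fi
  done
}

# Example call, using a few of the dependencies named in the error:
# check_channel conda-forge r-cowplot r-ica r-rsvd
```

If the dependencies show up on conda-forge but not on bioconda, that suggests conda-forge needs to be added to the install command.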

Debug (thanks @cjwinslow!)
conda install -c bioconda r-seurat --debug

Solution
conda install -c conda-forge -c bioconda r-seurat
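If typing both `-c` flags on every install gets tedious, the channels can be stored in conda's configuration instead. This is a minimal sketch, assuming the channel order from Bioconda's setup notes (conda-forge highest, then bioconda, then defaults). `setup_channels` is a made-up helper name, and the change is written to the user-level conda config, so it affects every environment, not just eplt.

```shell
# Sketch: store the channels once instead of passing -c on every install.
# setup_channels is a hypothetical helper name, not part of conda.
setup_channels() {
  # --add puts a channel at the top of the list, so the channel added
  # last (conda-forge) ends up with the highest priority.
  conda config --add channels defaults &&
  conda config --add channels bioconda &&
  conda config --add channels conda-forge
}

# After running setup_channels once, a plain install should search all three:
# conda install r-seurat
```

`conda config --show channels` prints the resulting order if you want to check it.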

Check RStudio works
rstudio

  • yep, all good :)

Install pheatmap

conda search pheatmap

Output:

Name        Version  Build     Channel
r-pheatmap  1.0.8    r3.2.2_0  bioconda
  • conda install -c bioconda r-pheatmap
    • a large number of packages are installed, removed, superseded by a higher-priority channel, or downgraded (see code block below)
  • rstudio
    • Error: a pop-up window with a triangle and “!” opens (instead of RStudio)

Solution
conda install -c conda-forge -c bioconda r-pheatmap

Yay, RStudio runs :)
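RStudio opening shows that the environment still starts, but not that each package actually loads. A quick check from the terminal could look like the sketch below. `check_r_packages` is a made-up helper name, and it assumes `Rscript` from the active conda environment is on the PATH.

```shell
# Sketch: try to load each R package and report the result.
# check_r_packages is a hypothetical helper name.
check_r_packages() {
  local status=0
  local pkg
  for pkg in "$@"; do
    # library() raises an error for a missing package, which makes
    # Rscript exit with a non-zero status.
    if Rscript -e "library($pkg)" >/dev/null 2>&1; then
      echo "$pkg: ok"
    else
      echo "$pkg: failed to load"
      status=1
    fi
  done
  return "$status"
}

# Example call with the packages mentioned at the top of this post:
# check_r_packages Seurat pheatmap ggplot2 dplyr
```

The function returns non-zero if any package fails, so it can also be used as a guard in a longer setup script.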

Hi All, you might find this interesting:

I just saw the last few topics you created, @deena-b. Applying ML in this field is what interests me most, so thank you for starting them. I’m currently working on the Data Science Bowl 2018 (https://www.kaggle.com/c/data-science-bowl-2018/overview) with the fastai library, but I’m always interested in anything around this field. I’ll make sure to try to follow the discussions.

Hi

Can you help me?
I’d like to ask you about machine learning in bioinformatics.

Hey @deena-b. My project with a startup involves a component of what you’re doing. I’d love to hop on a short chat and see where they intersect. Send me your email.