Collection of Covid-19 datasets and analyses

Over the past couple of weeks I’ve been collecting various Covid-19 related datasets. It would be great if there were a single repository that collected all of these in one place, but in the meantime I thought I’d share my findings here, with a brief comment or description for each. These are from my personal notes, so please excuse the rough form. It could be a starting point to which others can add.


Some external resources that might be useful:

Crowdsourcing data:

COVID-19 Global Impact Survey: Awareness

COVID-19 Global Impact Survey: Economics

COVID-19 Open Research Dataset (CORD-19): Access this dataset to help with the fight against COVID-19. In response to the COVID-19 pandemic, the Allen Institute for AI has partnered with leading research groups to prepare and distribute the COVID-19 Open Research Dataset (CORD-19), a free resource of over 29,000 scholarly articles, including over 13,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community. This dataset is intended to mobilize researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease. The corpus will be updated weekly as new research is published in peer-reviewed publications and archival services like bioRxiv, medRxiv, and others.

LitCovid: LitCovid is a curated literature hub for tracking up-to-date scientific information about the 2019 novel Coronavirus. It is the most comprehensive resource on the subject, providing central access to 1,263 (and growing) relevant articles in PubMed. The articles are updated daily and are further categorized by research topic and geographic location for improved access. You can read more at Chen et al. Nature (2020) and download our data here.

Just saw a Korea dataset, which could be good given the extensive testing they’re doing there:

Terra, an app for biomedical research, has released COVID-19 best practice workflows:

  • I ran a few workflows; technically they are quite easy, but you need biology knowledge to understand the semantics :frowning:
  • it runs on top of GCP and Jupyter notebooks, and uses Python, R and WDL for workflow definitions
  • free datasets with anonymized biosamples, sequencing data, sequencing centers and platforms…