# AmericanGutProjectImport

The Python script `american_gut_project_import.py` collects and integrates data about participants of the American Gut Project [1]. Participant-related data and data about their samples are hosted at BioSamples [2]; metagenomic data are hosted at MGnify [3]. The script creates several files for a tranSMART batch input, namely `annotations.tsv`, `subjects_data.tsv`, `excluded_subjects_data.tsv`, `metagenomic_data.tsv`, and `mappings.tsv`.

The program was used successfully with Python 3.5.2 on a server running Ubuntu 16.04.3. The script is expected to be run in the directory containing the tab-separated file with the American Gut Project analysis results from MGnify (file: `ERP012803_taxonomy_abundances_v2.0.tsv`). To test the program on a subset of the data, restrict the range of sequencing run identifiers taken from this file by passing the options `--start` (or `-s`) and/or `--end` (or `-e`) with integer values on the command line.
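
As an illustration of how such a range restriction can work, the following is a minimal sketch and not the code of `american_gut_project_import.py` itself. It assumes that the first line of the abundance file is a tab-separated header whose first column is a label and whose remaining columns are sequencing run identifiers, and that `--start` and `--end` behave like a 0-based, end-exclusive slice. The function names `parse_range` and `select_runs` are hypothetical.

```python
import argparse


def parse_range(argv=None):
    """Parse --start/--end options like those described in this README.

    The defaults and the slice semantics are assumptions of this sketch.
    """
    parser = argparse.ArgumentParser(
        description="Select a range of sequencing runs."
    )
    parser.add_argument("-s", "--start", type=int, default=None,
                        help="position of the first run (assumed 0-based)")
    parser.add_argument("-e", "--end", type=int, default=None,
                        help="position after the last run (assumed exclusive)")
    return parser.parse_args(argv)


def select_runs(header_line, start=None, end=None):
    """Return the run identifiers found in a tab-separated header line.

    Assumes the first column is a label and every further column
    is one sequencing run identifier.
    """
    run_ids = header_line.rstrip("\n").split("\t")[1:]
    # Slicing with None keeps the corresponding side unbounded.
    return run_ids[start:end]


if __name__ == "__main__":
    args = parse_range()
    with open("ERP012803_taxonomy_abundances_v2.0.tsv") as handle:
        header = handle.readline()
    for run_id in select_runs(header, args.start, args.end):
        print(run_id)
```

With these assumptions, omitting both options selects every run, while giving only one of them leaves the other end of the range open.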

The attributes describing the participants of the American Gut Project (and their samples) are gathered from both a questionnaire [4] and a data dictionary [5]. Both files were originally published with the article "American Gut: an Open Platform for Citizen Science Microbiome Research" by McDonald et al. (2018) (DOI: [10.1128/mSystems.00031-18](https://doi.org/10.1128/mSystems.00031-18)).

This work was published here: https://dx.doi.org/10.3205/19gmds040

## References

[1] URL: http://americangut.org/ (last access on 27th March 2019)  
[2] URL: https://www.ebi.ac.uk/biosamples/ (last access on 27th March 2019)  
[3] URL: https://www.ebi.ac.uk/metagenomics/ (last access on 27th March 2019)  
[4] File: `mcdonald_et_al_2018_supplement_with_questionnaire.docx`  
[5] File: `mcdonald_et_al_2018_supplement_with_data_dictionary.xlsx`