The rapid growth of microbiome research has led to the development of numerous bioinformatics tools and databases, but information about them remains fragmented across disparate, often outdated cataloging efforts, hindering resource discovery and utilization.
To address this critical gap, the ELIXIR Microbiome Community collaborates with the Research Software Ecosystem to create MiCoReCa (Microbiome Community Resource Catalogue), a comprehensive, dynamic, open-access catalogue of microbiome-related bioinformatics resources:
- Tools from Research Software Ecosystem Atlas
- Workflows from WorkflowHub
- Training
- Standards
- Databases
The extraction, filtering and curation follow the workflow below and use the keywords defined in the keywords.yml file:
- Install virtualenv (if not already there)

      $ python3 -m pip install --user virtualenv

- Create virtual environment

      $ python3 -m venv env

- Activate virtual environment

      $ source env/bin/activate

- Install requirements

      $ python3 -m pip install -r requirements.txt

- Extract the metadata of all workflows from WorkflowHub as a JSON file

      $ python bin/extract_workflowhub.py \
          extract \
          --all content/workflowhub/workflows_full.json

- Filter workflows based on keywords and EDAM terms

      $ python bin/extract_workflowhub.py \
          filter \
          --all content/workflowhub/workflows_full.json \
          --filtered content/workflowhub/workflows_filtered.json \
          --tsv-filtered content/workflowhub/workflows_filtered.tsv \
          --tags keywords.yml \
          --status content/workflowhub/workflows_status.tsv

  As explained in the decision tree above, workflows are filtered first on EDAM terms (topics and operations), then on tags, then on workflow name, and finally on description, using the keywords provided in the keywords.yml file.

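The filtering order described above can be illustrated with a minimal sketch. It is not the code in bin/extract_workflowhub.py: the function name `match_stage`, the dictionary keys (`edam_topics`, `edam_operations`, `tags`, `name`, `description`) and the matching rules (exact match for EDAM terms and tags, substring match for name and description) are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def match_stage(workflow: dict, edam_terms: Iterable[str], keywords: Iterable[str]) -> Optional[str]:
    """Return the first stage at which a workflow matches, or None.

    Hypothetical helper: field names and matching rules are assumptions,
    not the actual WorkflowHub schema or the repository's implementation.
    """
    edam = {term.lower() for term in edam_terms}
    kws = [kw.lower() for kw in keywords]

    # Stage 1: EDAM topics and operations (case-insensitive exact match).
    annotated = (workflow.get("edam_topics") or []) + (workflow.get("edam_operations") or [])
    if any(term.lower() in edam for term in annotated):
        return "edam"

    # Stage 2: tags (case-insensitive exact match against the keywords).
    if any(tag.lower() in kws for tag in (workflow.get("tags") or [])):
        return "tags"

    # Stage 3: workflow name (keyword appears as a substring).
    name = (workflow.get("name") or "").lower()
    if any(kw in name for kw in kws):
        return "name"

    # Stage 4: description (keyword appears as a substring).
    description = (workflow.get("description") or "").lower()
    if any(kw in description for kw in kws):
        return "description"

    return None
```

Returning the name of the matching stage rather than a plain boolean makes it possible to record why each workflow was kept, which is the kind of information a status file could hold.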
- Extract all bioconda metadata as a JSON file

      mkdir -p ./tmp
      # download the ZIP file into tmp/
      wget -O ./tmp/bioconda-recipes.zip https://codeload.github.com/bioconda/bioconda-recipes/zip/master
      # unzip from tmp/ into tmp/
      unzip ./tmp/bioconda-recipes.zip -d ./tmp/
      # remove the ZIP after extraction
      rm ./tmp/bioconda-recipes.zip
      # run the Python script
      python bin/collect_bioconda_recipes.py \
          --bioconda-path ./tmp/bioconda-recipes-master/recipes \
          --keywords-file ./keywords.yml \
          --output-file ./content/bioconda_filtered.json
      # cleanup
      rm -r ./tmp

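To give a rough idea of what keyword-based selection over a recipes directory can look like, here is a small sketch. It does not reproduce bin/collect_bioconda_recipes.py: the function name `find_matching_recipes` is hypothetical, and the approach (a case-insensitive plain-text search in each top-level `meta.yaml`) is an assumption chosen because recipe files often contain templating that a strict YAML parser would reject.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def find_matching_recipes(recipes_dir, keywords: Iterable[str]) -> Dict[str, List[str]]:
    """Map recipe names to the keywords found in their meta.yaml.

    Hypothetical helper for illustration only. It looks at
    <recipes_dir>/<recipe>/meta.yaml and ignores nested version folders.
    """
    lowered = [kw.lower() for kw in keywords]
    matches: Dict[str, List[str]] = {}
    for meta in sorted(Path(recipes_dir).glob("*/meta.yaml")):
        # Read as plain text so templated files do not cause parse errors.
        text = meta.read_text(encoding="utf-8", errors="replace").lower()
        hits = [kw for kw in lowered if kw in text]
        if hits:
            matches[meta.parent.name] = hits
    return matches
```

A call such as `find_matching_recipes("./tmp/bioconda-recipes-master/recipes", ["microbiome"])` would return a dictionary keyed by recipe folder name. Substring matching is deliberately permissive and may need a curation pass afterwards.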
- Run the unit tests

      PYTHONPATH=bin python -m unittest discover -s bin/tests

To contribute to the MiCoReCa source code:
- Fork the repository.
- Create a branch and add your changes.
- Add a unit test for your changes (see unittests for examples).
  Warning: new functions now require a unit test to be merged!
- Make a pull request.

The unittest framework will run on your PR. Please fix any failing tests.
Upon review, the maintainer will merge your pull request. Automatic tests will run on the dev branch.
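
For contributors new to the `unittest` module, a test for a new function might look like the sketch below. The helper `normalize_keyword` is invented for this example and is not part of the repository. In a real contribution the function would live in a module under bin/ and be imported by a test file under bin/tests.

```python
import unittest


def normalize_keyword(keyword: str) -> str:
    """Hypothetical helper: lowercase a keyword and collapse whitespace."""
    return " ".join(keyword.lower().split())


class TestNormalizeKeyword(unittest.TestCase):
    def test_lowercases(self):
        self.assertEqual(normalize_keyword("Microbiome"), "microbiome")

    def test_collapses_whitespace(self):
        self.assertEqual(normalize_keyword("  gut   microbiota "), "gut microbiota")
```

With the default discovery pattern, `python -m unittest discover` picks up files named `test*.py`, so a file saved as, for instance, `bin/tests/test_keywords.py` would be found by the command `PYTHONPATH=bin python -m unittest discover -s bin/tests`.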
