Hello GenSpectrum folks, I've been looking into using INSDC data for Nextstrain's seasonal-flu analyses by way of your influenza data. It's super helpful to pull directly from your API and get the records with linked segments!
Nextstrain analyses will need a little more metadata than what's currently available in GenSpectrum, so I'm working on merging in the data from Entrez. I'm hoping this data can be added upstream during GenSpectrum's ingest:
- Include the incomplete collection date instead of the current default to the first of the month/first of the year. See "Influenza: incomplete collection dates default to the first day of the month/year" #930.
- Include more strain names. We are pulling the `strain` field from Entrez to supplement this since it's not available via NCBI Datasets.
- Include passage history. This is also not standardized in NCBI Datasets. I've seen it scattered across different fields from Entrez (`isolation_source`, `lab_host`, `note`). Passage history would also be helpful for ensuring that segments with different passage histories are not linked to the same record.
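To make the last two requests a bit more concrete, here is a rough sketch of the kind of supplementing I have in mind. It is not our actual pipeline code: it assumes the Entrez source qualifiers for each segment have already been parsed into a dict mapping qualifier names to lists of strings, and the function names `extract_supplement` and `passage_conflict` are hypothetical.

```python
# Entrez source qualifiers named in this issue as places where passage
# history has been seen. This list is illustrative, not exhaustive.
PASSAGE_FIELDS = ("isolation_source", "lab_host", "note")


def extract_supplement(qualifiers):
    """Pull the strain name and candidate passage text from one record.

    `qualifiers` is assumed to map qualifier names to lists of strings.
    """
    # Take the first strain value if present, otherwise None.
    strain = (qualifiers.get("strain") or [None])[0]
    # Keep only the passage-related fields that are actually populated.
    passage = {
        field: "; ".join(qualifiers[field])
        for field in PASSAGE_FIELDS
        if qualifiers.get(field)
    }
    return {"strain": strain, "passage_candidates": passage}


def passage_conflict(segment_qualifiers):
    """Return True if the segments carry differing passage candidates."""
    seen = {
        tuple(sorted(extract_supplement(q)["passage_candidates"].items()))
        for q in segment_qualifiers
    }
    return len(seen) > 1
```

The comparison here is an exact string match, so free-text fields like `note` would likely need normalizing before a check like this could decide whether segments belong to the same record.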