minecode-pipelines: update pipelines to save api calls as well as purls #845

@JonoYang

issues with miners and saving metadata

crates.io

  • the index is organized into directories keyed by a prefix of the crate name (a naming scheme rather than a hash), and each crate's package data is stored as JSON lines, one line per version
    • how should we store the upstream metadata in the aboutcode-data repos? should we store the JSON-lines entries as-is, or recreate the directory structure of the index?
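
To make the storage question concrete, here is a minimal sketch of how a crate name maps to its file in the crates.io index and how that file's JSON lines can be read. The function names `crate_index_path` and `parse_index_file` are hypothetical, and the sample entry is trimmed to two fields for illustration; real index entries carry more fields.

```python
import json


def crate_index_path(name: str) -> str:
    """Return the index-relative path of a crate's file (Cargo registry index layout)."""
    name = name.lower()
    if len(name) == 1:
        return f"1/{name}"
    if len(name) == 2:
        return f"2/{name}"
    if len(name) == 3:
        return f"3/{name[0]}/{name}"
    # Four or more characters: first two chars, then next two chars, then the name.
    return f"{name[:2]}/{name[2:4]}/{name}"


def parse_index_file(text: str) -> list:
    """Parse one index file: one JSON object per non-empty line, one line per version."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Trimmed, illustrative sample of an index file's content (assumption, not real data).
sample = '{"name": "serde", "vers": "1.0.0"}\n{"name": "serde", "vers": "1.0.1"}\n'
entries = parse_index_file(sample)
print(crate_index_path("serde"), [entry["vers"] for entry in entries])
```

Either storage option could be built on this: keep the file as-is under the same relative path, or write each parsed entry out separately.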

maven

  • not certain whether the parser we use for the maven index gives us all the fields of the pom data
  • updating the web crawler should be easy enough as it gets the pom files directly

npm

  • the code only looks up the package names and versions in the index and reports those; it does not fetch the package data itself. the code would need an option to get and store the package data
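
One possible shape for such an option, as a sketch only: the helper names `npm_purl` and `collect_npm_records` and the `include_metadata` flag are hypothetical and not taken from the existing miner. It assumes the registry document for a package is already loaded as a dict with a `name` and a `versions` mapping.

```python
def npm_purl(name: str, version: str) -> str:
    """Build a purl string; the leading "@" of a scoped name is percent-encoded."""
    encoded_name = name.replace("@", "%40", 1)
    return f"pkg:npm/{encoded_name}@{version}"


def collect_npm_records(packument: dict, include_metadata: bool = False) -> list:
    """Return (purl, metadata) pairs; metadata is None unless include_metadata is set."""
    records = []
    for version, manifest in packument.get("versions", {}).items():
        purl = npm_purl(packument["name"], version)
        records.append((purl, manifest if include_metadata else None))
    return records


# Illustrative input only (assumption): a heavily trimmed registry document.
packument = {
    "name": "@babel/core",
    "versions": {"7.0.0": {"name": "@babel/core", "version": "7.0.0", "license": "MIT"}},
}
print(collect_npm_records(packument))
print(collect_npm_records(packument, include_metadata=True))
```

Keeping the flag off by default would preserve the current purl-only behavior.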

nuget

  • all package data is stored in a single index file. do we save individual index files for each package version we get?

pypi

  • update the code to dump a JSON file for each individual package
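
A minimal sketch of what dumping per-package JSON could look like. The function name `dump_package_json` and the `pypi/<name>/<version>.json` layout are assumptions for illustration, not the layout of the aboutcode-data repos. It assumes the package data has the `info.name` and `info.version` fields found in PyPI JSON API responses.

```python
import json
import re
from pathlib import Path


def dump_package_json(package_data: dict, base_dir) -> Path:
    """Write one package's JSON under base_dir and return the path that was written."""
    info = package_data["info"]
    # Normalize the project name (lowercase, runs of "-", "_", "." become "-").
    name = re.sub(r"[-_.]+", "-", info["name"]).lower()
    # Hypothetical layout: <base_dir>/pypi/<name>/<version>.json
    target = Path(base_dir) / "pypi" / name / f"{info['version']}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(package_data, indent=2, sort_keys=True), encoding="utf-8")
    return target
```

Sorting keys and using a fixed indent keeps the stored files stable across runs, which helps when they are committed to a git repo.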

alpine

  • should we save each package's info individually, or just save the APKINDEX covering all packages?
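
If per-package storage is chosen, splitting an APKINDEX into individual records is straightforward, since records are separated by blank lines and each line is a one-letter key, a colon, and a value. The sketch below is illustrative: `parse_apkindex` is a hypothetical name and the sample values are made up.

```python
def parse_apkindex(text: str) -> list:
    """Split APKINDEX text into one dict per package, keyed by the one-letter field codes."""
    packages = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that have no "key:value" shape
                record[key] = value
        if record:
            packages.append(record)
    return packages


# Made-up sample (assumption): P is the package name, V the version, L the license.
sample = "P:busybox\nV:1.36.1-r0\nL:GPL-2.0-only\n\nP:musl\nV:1.2.4-r2\nL:MIT\n"
packages = parse_apkindex(sample)
by_name = {package["P"]: package for package in packages}
print(sorted(by_name))
```

Saving the whole APKINDEX instead would skip this step but leave the splitting to whoever consumes the data.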

composer

  • update get_composer_purl to save the package data alongside the purl
