Remove Dependency on GHTorrent #20

@nuthanmunaiah

Description

reaper requires the GHTorrent database to be restored to a MySQL/MariaDB instance. Restoring the full GHTorrent database before running reaper is prohibitively time-intensive (the GHTorrent database dump from 2019-06-01 is over 100 GB in size). Removing the dependency on GHTorrent will require reaper to mine GitHub directly for the repository data and metadata that the GHTorrent project has already mined. On the other hand, users will no longer need to restore data and metadata for several million repositories when all they want to do is analyze a few.
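
As a rough illustration of what mining GitHub directly could look like, here is a minimal sketch that fetches metadata for a single repository from the GitHub REST API (`GET /repos/{owner}/{repo}`) using only the Python standard library. The function names, the `FIELDS` selection, and the `User-Agent` string are assumptions made for this example; they are not part of reaper, and the set of fields reaper actually needs is not specified in this issue.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

API_ROOT = "https://api.github.com"

# Hypothetical subset of the repository fields returned by the API;
# the fields reaper really needs would have to be worked out separately.
FIELDS = (
    "full_name",
    "language",
    "stargazers_count",
    "forks_count",
    "created_at",
    "pushed_at",
    "default_branch",
)


def build_request(owner: str, repo: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build the HTTP request for one repository's metadata."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "reaper-sketch",  # placeholder client name (assumption)
    }
    if token:
        # A personal access token raises the API rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(f"{API_ROOT}/repos/{owner}/{repo}", headers=headers)


def extract_metadata(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields of interest; absent fields become None."""
    return {field: payload.get(field) for field in FIELDS}


def fetch_metadata(owner: str, repo: str, token: Optional[str] = None) -> Dict[str, Any]:
    """Fetch and trim the metadata for a single repository (needs network access)."""
    request = build_request(owner, repo, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_metadata(json.load(response))
```

Calling `fetch_metadata("some-owner", "some-repo")` once per repository under analysis would replace a lookup in the restored database. Unauthenticated requests to the GitHub API are rate-limited more tightly than authenticated ones, so analyzing more than a handful of repositories would likely require a token.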
