This repository was archived by the owner on Mar 9, 2026. It is now read-only.

Keep external repositories' statistics (i.e. clients, tools and modules) up to date #1030

Closed
itamarhaber wants to merge 1 commit into redis:master from itamarhaber:ghstats


@itamarhaber

Member

@antirez plzlemmeknow your thoughts before I implement the changes to the ui in redis-io (and probably port the script to ruby). Possible triggers: manual, every push, daily...

Generated using this script:

```python
import datetime
import requests
import json
import re
import os

GITHUB_TOKEN = os.environ['GITHUB_TOKEN']
REDIS_DOC_PATH = os.environ['REDIS_DOC_PATH']

headers = {'Authorization': 'Bearer {}'.format(GITHUB_TOKEN)}

def run_query(query, variables):
    request = requests.post('https://api.github.com/graphql', json={'query': query, 'variables': variables}, headers=headers)
    if request.status_code == 200:
        return request.json()
    else:
        raise Exception("GitHub GraphQL query failed to run - status code: {}. {}".format(request.status_code, query))

def get_repo(owner, name, days=6*30):
    query = """
query ($owner: String!, $name: String!, $since: GitTimestamp!) {
  repository(owner: $owner, name: $name) {
    nameWithOwner
    description
    isArchived
    createdAt
    forkCount
    watchers {
      totalCount
    }
    stargazers {
      totalCount
    }
    openIssues: issues(states: OPEN) {
      totalCount
    }
    closedIssues: issues(states: CLOSED) {
      totalCount
    }
    openPullRequests: pullRequests(states: OPEN) {
      totalCount
    }
    closedPullRequests: pullRequests(states: CLOSED) {
      totalCount
    }
    mergedPullRequests: pullRequests(states: MERGED) {
      totalCount
    }
    licenseInfo {
      name
    }
    languages(first: 1, orderBy: {field: SIZE, direction: DESC}) {
      nodes {
        name
      }
    }
    periodCommits: defaultBranchRef {
      target {
        ... on Commit {
          history(since: $since) {
            totalCount
          }
        }
      }
    }
    lastCommit: defaultBranchRef {
      target {
        ... on Commit {
          history(first: 1) {
            edges {
              node {
                author {
                  date
                }
              }
            }
          }
        }
      }
    }
  }
  rateLimit {
    limit
    cost
    remaining
    resetAt
  }
}
    """
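    # Count commits on the default branch over the last `days` days.
    since = datetime.datetime.today() - datetime.timedelta(days=days)
    # Pass variables as a dict so requests serializes (and escapes) them as JSON.
    variables = {
        'owner': owner,
        'name': name,
        'since': since.isoformat(),
    }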

    reply = run_query(query, variables)
    # GraphQL may answer HTTP 200 with an "errors" list and null data.
    repository = (reply.get('data') or {}).get('repository')
    if repository is None:
        return None
    # defaultBranchRef is null for an empty repository, so there is nothing to count.
    if repository['periodCommits'] is None:
        return None
    edges = repository['lastCommit']['target']['history']['edges']
    return {
        'isArchived': repository['isArchived'],
        'createdAt': repository['createdAt'],
        'periodCommits': repository['periodCommits']['target']['history']['totalCount'],
        'committedAt': edges[0]['node']['author']['date'] if edges else None,
        'fetchedAt': datetime.datetime.utcnow().replace(microsecond=0).isoformat(),
        'forks': repository['forkCount'],
        'watchers': repository['watchers']['totalCount'],
        'stargazers': repository['stargazers']['totalCount'],
        'openPullRequests': repository['openPullRequests']['totalCount'],
        'openIssues': repository['openIssues']['totalCount'],
    }

if __name__ == '__main__':
    jsons = [
      'clients.json',
      'tools.json',
      'modules.json'
    ]

    for jfile in jsons:
        # Skip files that were already generated in the working directory.
        if os.path.exists(jfile):
            continue

        with open('{}/{}'.format(REDIS_DOC_PATH, jfile)) as f:
            elems = json.load(f)

        # Matches HTTPS and SSH GitHub URLs; tolerates a trailing ".git" or "/".
        ghpat = re.compile(r'^(https://github\.com/|git@github\.com:)([^/]+)/([^/]+?)(?:\.git)?/?$')
        for el in elems:
            if 'repository' in el:
                repository = str(el['repository'])
                mat = ghpat.search(repository)
                if mat:
                    stats = get_repo(mat.group(2), mat.group(3))
                    if stats is not None:
                        el['stats'] = stats
                        if 'active' in el:
                            el['active'] = stats['periodCommits'] > 0
                        if 'stars' in el:
                            el['stars'] = stats['stargazers']
                        print('touched {} {}'.format(jfile, repository))

        with open(jfile, 'w') as f:
            json.dump(elems, f, indent=4)

```

Signed-off-by: Itamar Haber <itamar@redislabs.com>
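For readers skimming the script, here is a minimal sketch of its two pure steps: splitting a `repository` URL into owner and name, and copying fetched stats onto an entry. The helper names `parse_github_url` and `apply_stats` are hypothetical and do not appear in the PR, and the pattern is a slightly stricter variant that also accepts a trailing `.git` or `/`.

```python
import re

# Hypothetical pattern: HTTPS or SSH GitHub URLs, optional ".git" or "/" suffix.
GITHUB_URL = re.compile(
    r'^(?:https://github\.com/|git@github\.com:)([^/]+)/([^/]+?)(?:\.git)?/?$'
)


def parse_github_url(url):
    """Return (owner, name) for a GitHub repository URL, or None if it is not one."""
    match = GITHUB_URL.match(url)
    return (match.group(1), match.group(2)) if match else None


def apply_stats(entry, stats):
    """Attach stats to an entry; refresh 'active' and 'stars' only if already present."""
    entry['stats'] = stats
    if 'active' in entry:
        # An entry counts as active when the sampled period has at least one commit.
        entry['active'] = stats['periodCommits'] > 0
    if 'stars' in entry:
        entry['stars'] = stats['stargazers']
    return entry
```

Keeping these steps free of network calls makes them easy to check without a GitHub token.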
@f0rmiga

f0rmiga commented Oct 17, 2019

Contributor

@itamarhaber A suggestion: run it daily using GitHub Actions. With GitHub Actions, we can open a PR with the changes, or directly push to master if we trust in ourselves with the generated code. :)

@itamarhaber

Member Author

That's a great idea for the daily update - thanks!

@zuiderkwast

Contributor

> With GitHub Actions, we can open a PR with the changes, or directly push to master if we trust in ourselves with the generated code.

The commit history would grow very fast in this way.

The stars don't need to be committed to the repo. Why not let the website itself fetch the data and cache it in Redis with a TTL of 24 hours or so? Alternatively, let them be updated when redis.io is deployed.
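The caching idea in this comment could look roughly like the sketch below. It is written in Python only to match the script in this PR, whereas the website itself is a Ruby app. The function name, the `ghstats:` key scheme, and the `FakeCache` stand-in are all hypothetical; `cache` is assumed to expose `get(key)` and `setex(key, seconds, value)` the way a redis-py client does, and `fetch` is whatever performs the GitHub query.

```python
import json

CACHE_TTL_SECONDS = 24 * 60 * 60  # "24 hours or so", as suggested in the comment


def cached_repo_stats(cache, owner, name, fetch):
    """Return stats for owner/name, calling fetch(owner, name) only on a cache miss."""
    key = 'ghstats:{}/{}'.format(owner, name)  # hypothetical key scheme
    raw = cache.get(key)
    if raw is not None:
        return json.loads(raw)
    stats = fetch(owner, name)
    if stats is not None:
        # Let the entry expire on its own so the next request refreshes it.
        cache.setex(key, CACHE_TTL_SECONDS, json.dumps(stats))
    return stats


class FakeCache:
    """In-memory stand-in so the sketch runs without a Redis server (ignores expiry)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def setex(self, key, seconds, value):
        self.data[key] = value
```

With this shape nothing is committed to the repository: the stats live only in the cache and are refetched after the TTL lapses.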

@huangzhw

Contributor

> > With GitHub Actions, we can open a PR with the changes, or directly push to master if we trust in ourselves with the generated code.
>
> The commit history would grow very fast in this way.
>
> The stars don't need to be committed to the repo. Why not let the website itself fetch the data and cache it in Redis with a TTL of 24 hours or so? Alternatively, let them be updated when redis.io is deployed.

Agree, fetching data is better.

@zuiderkwast

Contributor

> Agree, fetching data is better.

Great @huangz1990. I think the fetching from GitHub API and caching could go into app.rb in the redis-io repo. Here's a related PR: redis/redis-io#230.

@madolson madolson added the to-be-closed should probably be dismissed sooner or later label Jun 21, 2021
@madolson

Contributor

100% agree with @zuiderkwast, let's focus on the other PR.

@nermiller

Contributor

@zuiderkwast, ok to close this PR?

@zuiderkwast

Contributor

Yeah this is built into new website, right? I don't know enough about the new website since I was not involved much.

@madolson

Contributor

@itamarhaber When are we open sourcing the new website?
