
RoleSkills


Match developers to roles by extracting evidence from GitHub repositories and scoring it against job description requirements.

Status: On hold
Development is paused.
The repository is kept for reference; issues and PRs are not being actively monitored.

What It Does

RoleSkills analyzes job descriptions and GitHub repositories to provide evidence-based role matching:

  1. Parse Job Descriptions - Extract structured requirements from JD text (deterministic + LLM-enhanced)
  2. Index GitHub Evidence - Build searchable index of code contributions, commits, and file changes
  3. Retrieve & Match - Hybrid search (lexical + semantic) to find relevant evidence for each requirement
  4. Score & Report - Calculate requirement-level scores with supporting evidence links
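The retrieve-and-match step (3) can be sketched in a few lines. Everything below is illustrative, not the project's actual API: the function names, the bag-of-words stand-ins for BM25 and embeddings, and the 50/50 blend weight are all assumptions.

```python
import math

def lexical_score(query: str, doc: str) -> float:
    # Fraction of query terms that appear in the document (toy BM25 stand-in).
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def semantic_score(query: str, doc: str) -> float:
    # Cosine similarity over bag-of-words vectors (toy embedding stand-in).
    def bow(text: str) -> dict:
        counts: dict = {}
        for term in text.lower().split():
            counts[term] = counts.get(term, 0) + 1
        return counts
    q, d = bow(query), bow(doc)
    dot = sum(q[t] * d.get(t, 0) for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def hybrid_score(query: str, doc: str, alpha: float = 0.5) -> float:
    # Blend lexical and semantic relevance; alpha = 0.5 is an assumed weight.
    return alpha * lexical_score(query, doc) + (1 - alpha) * semantic_score(query, doc)

requirement = "python testing pytest"
evidence = ["added pytest fixtures for python tests", "updated css styles"]
ranked = sorted(evidence, key=lambda e: hybrid_score(requirement, e), reverse=True)
```

In a real pipeline the lexical side would come from the FTS5 index and the semantic side from embeddings; the point is only that each requirement is matched against evidence by a blended relevance score.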

Quick Start

# 1. Install dependencies
make dev

# 2. Configure
export OPENAI_API_KEY=your_key_here

# 3. Run tests
make test

# 4. Parse a job description
python -m roleskills.cli jd-parse-llm path/to/job_description.txt

# 5. Build evidence index from your GitHub repos
python -m roleskills.cli evidence-build \
  --github-user your-username \
  --repo /path/to/repo1 \
  --repo /path/to/repo2 \
  --db-path evidence.sqlite

# 6. Score yourself against a JD
python -m roleskills.cli score \
  --jd path/to/parsed_jd.json \
  --db-path evidence.sqlite

Technology Stack

  • Python 3.11+ - Core language
  • OpenAI GPT-4 - LLM-enhanced JD parsing
  • SQLite + FTS5 - Local evidence indexing with full-text search
  • pytest - Testing framework (119 tests)
  • Langfuse - Observability and tracing
  • GitHub Actions - CI/CD
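To show what FTS5-backed evidence indexing looks like, here is a minimal sketch using Python's bundled sqlite3. The table and column names are illustrative assumptions, not the project's actual schema; it also assumes your SQLite build includes FTS5 (true for standard CPython builds).

```python
import sqlite3

# In-memory database; swap ":memory:" for a file path like evidence.sqlite.
conn = sqlite3.connect(":memory:")
# FTS5 virtual table: every column is full-text indexed.
conn.execute("CREATE VIRTUAL TABLE evidence USING fts5(path, summary)")
conn.executemany(
    "INSERT INTO evidence VALUES (?, ?)",
    [
        ("src/analyzer/core.py", "implement caching decorator for expensive calls"),
        ("src/cli.py", "add argparse-based command line interface"),
    ],
)
# Full-text MATCH query; ORDER BY rank sorts best match first (BM25).
rows = conn.execute(
    "SELECT path FROM evidence WHERE evidence MATCH ? ORDER BY rank",
    ("caching",),
).fetchall()
```

FTS5 gives BM25-ranked lexical search out of the box, which is one half of the hybrid retrieval described above.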

Project Structure

src/roleskills/
  ├── jd/                    # Job description parsing (M1)
  │   ├── parser.py          # Deterministic MD parser
  │   ├── llm_parser.py      # LLM-enhanced parser
  │   └── schema.py          # JD data models
  ├── evidence/              # GitHub evidence indexing (M2)
  │   ├── builder.py         # Index builder
  │   ├── store.py           # SQLite storage
  │   └── github.py          # Git integration
  ├── retrieve/              # Evidence retrieval (M3)
  │   ├── query.py           # Hybrid search
  │   └── rank.py            # Relevance ranking
  ├── score/                 # Scoring system (M3)
  │   ├── rubric.py          # Scoring logic
  │   └── models.py          # Score schemas
  └── cli.py                 # Command-line interface

tests/                       # 119 tests across all modules

Roadmap

✅ Completed Milestones

  • M0: Foundation - Observable framework, CLI, testing infrastructure
  • M1: Job Description Parser - Deterministic + LLM-enhanced JD parsing with schema validation
  • M2: Evidence Indexing - GitHub-based evidence collection and storage
  • M3: Retrieval & Scoring - Hybrid matching and requirement-level scoring

🚧 In Progress

  • Documentation - Usage guides, API docs, architecture overview
  • Public Release Prep - Clean up for open source publication

📋 Planned Milestones

  • M4: Portfolio Generation - Auto-generate evidence-backed developer portfolios
  • M5: Web Interface - Streamlit or FastAPI-based UI
  • M6: GitHub Integration - Direct GitHub API integration (currently uses local repos)
  • M7: Production Hardening - Performance optimization, error handling, rate limiting

Example Output

{
  "overall_score": 0.89,
  "requirements_met": 7,
  "requirements_total": 8,
  "requirements": [
    {
      "id": "req_1",
      "title": "Python expertise",
      "score": 0.95,
      "evidence_count": 5,
      "top_evidence": [
        {
          "path": "src/analyzer/core.py",
          "anchor": "implement_caching_decorator",
          "relevance": 0.92,
          "url": "https://github.com/user/repo/blob/abc123/src/analyzer/core.py#L45-L67"
        }
      ]
    }
  ]
}
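A report in this shape can be consumed programmatically. This sketch flags weak requirements from a trimmed-down copy of the JSON above; the 0.6 pass threshold is an assumption for illustration, not a value from the project.

```python
import json

report = json.loads("""{
  "overall_score": 0.89,
  "requirements_met": 7,
  "requirements_total": 8,
  "requirements": [
    {"id": "req_1", "title": "Python expertise", "score": 0.95, "evidence_count": 5}
  ]
}""")

PASS_THRESHOLD = 0.6  # assumed cutoff, not defined by RoleSkills

# Requirements scoring below the threshold are treated as gaps.
gaps = [r["title"] for r in report["requirements"] if r["score"] < PASS_THRESHOLD]
coverage = report["requirements_met"] / report["requirements_total"]
print(f"coverage {coverage:.0%}, gaps: {gaps or 'none'}")
```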

Development

# Run tests
make test

# Run with coverage
make test-coverage

# Lint and format
make lint

# Type checking
make type-check

Limitations

  • Local-only: Currently requires local git repositories (no direct GitHub API integration yet)
  • OpenAI dependency: LLM parsing requires OpenAI API key
  • Single-user: Designed for individual use, not multi-tenant
  • English only: JD parsing optimized for English job descriptions

Contributing

Development is currently on hold (see the status note above), so responses to issues and PRs may be slow. Contributions are still welcome in principle; please open an issue to discuss before submitting a PR.

License

MIT


Ammons Data Labs builds observable, measurable AI agents and data systems.
