Commit aa424b5

feat: make all 10 language grammars individually feature-gated
All tree-sitter language grammars (Rust, TypeScript, JavaScript, Python, Go, Java, C, C++, Ruby, C#) are now individually toggleable via Cargo feature flags (lang-rust, lang-typescript, etc.) and are all enabled by default. Users can build with --no-default-features and selectively enable only the languages they need.

- Gate all query constants and match arms with #[cfg(feature = ...)]
- Gate test imports and helpers behind #[cfg(any(all 10 features))]
- Update README, DOCS, and CLAUDE.md with language feature docs
- Update test count to 295
1 parent 857617c commit aa424b5
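
The gating pattern named in the commit message can be sketched in isolation. This is a minimal, hypothetical sketch rather than the project's code: `query_for_ext` and the inline query strings are assumptions made for illustration (the real constants are loaded with `include_str!`), and only two of the ten features are shown.

```rust
// Placeholder query sources; assumed for illustration only.
#[cfg(feature = "lang-rust")]
const RUST_QUERY: &str = "(function_item) @function";
#[cfg(feature = "lang-go")]
const GO_QUERY: &str = "(function_declaration) @function";

/// Returns the query source for a file extension, or `None` when the
/// language is unknown or its feature was not compiled in.
pub fn query_for_ext(ext: &str) -> Option<&'static str> {
    match ext {
        // Each arm exists only when its feature is enabled, so the gated
        // constant it refers to is guaranteed to exist as well.
        #[cfg(feature = "lang-rust")]
        "rs" => Some(RUST_QUERY),
        #[cfg(feature = "lang-go")]
        "go" => Some(GO_QUERY),
        // The wildcard arm is never gated, which keeps the match
        // exhaustive for every combination of features.
        _ => None,
    }
}
```

Built without any features, both gated arms are compiled out and every extension falls through to `None`. That lines up with the documented behaviour that files in disabled languages still work but get no symbol extraction.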

6 files changed

Lines changed: 207 additions & 37 deletions

CLAUDE.md

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ lowercase_subject = true
 
 Rust, TypeScript, JavaScript, Python, Go, Java, C, C++, Ruby, C#
 
-All 10 languages enabled by default. To build with only the core 5, use `--no-default-features`. Individual languages can be toggled via feature flags: `lang-java`, `lang-c`, `lang-cpp`, `lang-ruby`, `lang-csharp`.
+All 10 languages are individually feature-gated (`lang-rust`, `lang-typescript`, `lang-javascript`, `lang-python`, `lang-go`, `lang-java`, `lang-c`, `lang-cpp`, `lang-ruby`, `lang-csharp`) and enabled by default. Build with `--no-default-features --features lang-rust,lang-go` to include only specific languages.
 
 ## File Structure

Cargo.toml

Lines changed: 19 additions & 8 deletions
@@ -57,13 +57,13 @@ gix = { version = "0.80", default-features = false, features = ["revision"] }
 
 # Code analysis
 tree-sitter = "0.26"
-tree-sitter-rust = "0.24"
-tree-sitter-typescript = "0.23"
-tree-sitter-python = "0.25"
-tree-sitter-go = "0.25"
-tree-sitter-javascript = "0.25"
 
-# Additional language grammars (optional, enabled by default)
+# Language grammars (all optional, all enabled by default)
+tree-sitter-rust = { version = "0.24", optional = true }
+tree-sitter-typescript = { version = "0.23", optional = true }
+tree-sitter-python = { version = "0.25", optional = true }
+tree-sitter-go = { version = "0.25", optional = true }
+tree-sitter-javascript = { version = "0.25", optional = true }
 tree-sitter-java = { version = "0.23", optional = true }
 tree-sitter-c = { version = "0.23", optional = true }
 tree-sitter-cpp = { version = "0.23", optional = true }
@@ -90,15 +90,26 @@ keyring = { version = "3", optional = true }
 regex = "1.12"
 
 [features]
-default = ["lang-java", "lang-c", "lang-cpp", "lang-ruby", "lang-csharp"]
+default = [
+    "lang-rust", "lang-typescript", "lang-javascript", "lang-python", "lang-go",
+    "lang-java", "lang-c", "lang-cpp", "lang-ruby", "lang-csharp",
+]
 secure-storage = ["keyring"]
 eval = []
+lang-rust = ["tree-sitter-rust"]
+lang-typescript = ["tree-sitter-typescript"]
+lang-javascript = ["tree-sitter-javascript"]
+lang-python = ["tree-sitter-python"]
+lang-go = ["tree-sitter-go"]
 lang-java = ["tree-sitter-java"]
 lang-c = ["tree-sitter-c"]
 lang-cpp = ["tree-sitter-cpp"]
 lang-ruby = ["tree-sitter-ruby"]
 lang-csharp = ["tree-sitter-c-sharp"]
-all-languages = ["lang-java", "lang-c", "lang-cpp", "lang-ruby", "lang-csharp"]
+all-languages = [
+    "lang-rust", "lang-typescript", "lang-javascript", "lang-python", "lang-go",
+    "lang-java", "lang-c", "lang-cpp", "lang-ruby", "lang-csharp",
+]
 
 [profile.release]
 lto = true

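Each `lang-*` entry in the `[features]` table above becomes a compile-time flag that Rust code can query. The sketch below is a hypothetical helper, not something this commit adds: `enabled_languages` is an assumed name, and it only illustrates how the ten flags could be inspected with the standard `cfg!` macro.

```rust
/// Hypothetical helper (not part of this commit): lists the language
/// features that were compiled into the current build.
pub fn enabled_languages() -> Vec<&'static str> {
    // cfg!(...) expands to a plain `true` or `false` at compile time,
    // so this table costs nothing to evaluate at runtime.
    let all: [(&'static str, bool); 10] = [
        ("lang-rust", cfg!(feature = "lang-rust")),
        ("lang-typescript", cfg!(feature = "lang-typescript")),
        ("lang-javascript", cfg!(feature = "lang-javascript")),
        ("lang-python", cfg!(feature = "lang-python")),
        ("lang-go", cfg!(feature = "lang-go")),
        ("lang-java", cfg!(feature = "lang-java")),
        ("lang-c", cfg!(feature = "lang-c")),
        ("lang-cpp", cfg!(feature = "lang-cpp")),
        ("lang-ruby", cfg!(feature = "lang-ruby")),
        ("lang-csharp", cfg!(feature = "lang-csharp")),
    ];

    // Keep only the names whose feature flag is switched on.
    all.iter()
        .filter(|(_, enabled)| *enabled)
        .map(|(name, _)| *name)
        .collect()
}
```

With the default feature set this would return all ten names; with `--no-default-features --features lang-rust,lang-go` it would return only those two.
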
DOCS.md

Lines changed: 28 additions & 11 deletions
@@ -518,17 +518,34 @@ CommitBee detects whether it's running in an interactive terminal. In non-intera
 
 ## 🌳 Supported Languages
 
-CommitBee uses tree-sitter to parse source files and extract semantic symbols. Currently supported:
+CommitBee uses tree-sitter to parse source files and extract semantic symbols. All 10 languages are enabled by default and individually toggleable via Cargo feature flags.
 
-| Language | Parser | What It Extracts |
-| --- | --- | --- |
-| Rust | `tree-sitter-rust` | Functions, structs, enums, impls, traits, methods |
-| TypeScript | `tree-sitter-typescript` | Functions, classes, interfaces, methods, types |
-| JavaScript | `tree-sitter-javascript` | Functions, classes, methods, arrow functions |
-| Python | `tree-sitter-python` | Functions, classes, methods, decorators |
-| Go | `tree-sitter-go` | Functions, types, methods, interfaces |
+| Language | Feature Flag | Parser | What It Extracts |
+| --- | --- | --- | --- |
+| Rust | `lang-rust` | `tree-sitter-rust` | Functions, structs, enums, impls, traits, methods |
+| TypeScript | `lang-typescript` | `tree-sitter-typescript` | Functions, classes, interfaces, methods, types |
+| JavaScript | `lang-javascript` | `tree-sitter-javascript` | Functions, classes, methods, arrow functions |
+| Python | `lang-python` | `tree-sitter-python` | Functions, classes, methods, decorators |
+| Go | `lang-go` | `tree-sitter-go` | Functions, types, methods, interfaces |
+| Java | `lang-java` | `tree-sitter-java` | Classes, methods, constructors, interfaces, enums |
+| C | `lang-c` | `tree-sitter-c` | Functions, structs, enums, typedefs |
+| C++ | `lang-cpp` | `tree-sitter-cpp` | Functions, classes, structs, enums, methods |
+| Ruby | `lang-ruby` | `tree-sitter-ruby` | Classes, modules, methods, singleton methods |
+| C# | `lang-csharp` | `tree-sitter-c-sharp` | Classes, methods, constructors, interfaces, enums |
+
+### Custom Language Builds
+
+To build with only the languages you need (reduces binary size):
+
+```bash
+# Only Rust and TypeScript support
+cargo install commitbee --no-default-features --features lang-rust,lang-typescript
+
+# All languages except C++ and C#
+cargo install commitbee --no-default-features --features lang-rust,lang-typescript,lang-javascript,lang-python,lang-go,lang-java,lang-c,lang-ruby
+```
 
-**Files in unsupported languages still work** — they're included in the diff context, they just don't get semantic symbol extraction. The commit message will still be based on the actual diff content; it just won't know which specific functions or types changed.
+**Files in unsupported or disabled languages still work** — they're included in the diff context, they just don't get semantic symbol extraction. The commit message will still be based on the actual diff content; it just won't know which specific functions or types changed.
 
 ### Symbol Tracking
 
@@ -662,7 +679,7 @@ No panics in user-facing code paths. The sanitizer and validator are tested with
 
 ### Testing Strategy
 
-CommitBee has 202 tests across multiple strategies:
+CommitBee has 295 tests across multiple strategies:
 
 | Strategy | What It Covers |
 | --- | --- |
@@ -675,7 +692,7 @@ CommitBee has 202 tests across multiple strategies:
 Run them:
 
 ```bash
-cargo test # All 202 tests
+cargo test # All 295 tests
 cargo test --test sanitizer # Just sanitizer tests
 cargo test --test integration # LLM provider mocks
 COMMITBEE_LOG=debug cargo test -- --nocapture # With logging

README.md

Lines changed: 9 additions & 3 deletions
@@ -25,7 +25,7 @@ Most tools in this space pipe raw `git diff` to an LLM and hope for the best. Co
 
 CommitBee uses tree-sitter to parse both the staged and HEAD versions of every changed file — in parallel across CPU cores. It extracts 10 symbol types (functions, methods, structs, enums, traits, impls, classes, interfaces, constants, type aliases) and maps diff hunks to their spans. The LLM doesn't see "lines 42-58 changed" — it sees "the `validate()` function in `sanitizer.rs` was modified, and a new `retry()` method was added." Symbols are tracked in three states: **added**, **removed**, and **modified-signature**.
 
-Supported languages: **Rust, TypeScript, JavaScript, Python, Go**. Files in other languages still get full diff context — just without symbol extraction.
+Supported languages: **Rust, TypeScript, JavaScript, Python, Go, Java, C, C++, Ruby, C#** — all enabled by default, individually toggleable via Cargo feature flags. Files in other languages still get full diff context — just without symbol extraction.
 
 ### 🧠 It reasons about what changed
 
@@ -96,7 +96,7 @@ When your staged changes mix independent work (a bugfix in one module + a refact
 - **🐚 Shell completions** — bash, zsh, fish, powershell via `commitbee completions`.
 - **⚙️ 5-level config** — Defaults → project `.commitbee.toml` → user config → env vars → CLI flags.
 - **🦀 Single binary** — ~18K lines of Rust. Compiles to one static binary with LTO. No runtime dependencies.
-- **🧪 202 tests** — Unit, snapshot, property (proptest for never-panic guarantees), and integration (wiremock).
+- **🧪 295 tests** — Unit, snapshot, property (proptest for never-panic guarantees), and integration (wiremock).
 
 ## 📦 Installation
 
@@ -217,7 +217,7 @@ The default provider (Ollama) runs entirely on your machine. No data leaves your
 ## 🧪 Testing
 
 ```bash
-cargo test # 202 tests — unit, snapshot (insta), property (proptest), integration (wiremock)
+cargo test # 295 tests — unit, snapshot (insta), property (proptest), integration (wiremock)
 ```
 
 See [Testing Strategy](DOCS.md#testing-strategy) for the full breakdown.
@@ -226,9 +226,15 @@ See [Testing Strategy](DOCS.md#testing-strategy) for the full breakdown.
 
 ### 🔎 `v0.4.0` — See Everything (current)
 
+- **10-language tree-sitter support** — Added Java, C, C++, Ruby, and C# to the existing Rust, TypeScript, JavaScript, Python, and Go. All languages are individually feature-gated and enabled by default. Disable any with `--no-default-features` + selective `--features lang-rust,lang-go,...`.
+- **Custom prompt templates** — User-defined templates with `{{type}}`, `{{scope}}`, `{{subject}}`, `{{body}}` variables via `template_path` config.
+- **Multi-language commit messages** — Generate messages in any language with `--locale` flag or `locale` config (e.g., `--locale de` for German).
+- **Commit history style learning** — Learns from recent commit history to match your project's style (`learn_from_history`, `history_sample_size` config).
 - **Rename detection** — Detects file renames with similarity percentage via `git diff --find-renames`, displayed as `old → new (N% similar)` in prompts and split suggestions. Configurable threshold (default 70%, set to 0 to disable).
 - **Expanded secret scanning** — 25 built-in patterns across 13 categories (cloud providers, AI/ML, source control, communication, payment, database, cryptographic, generic). Pluggable engine: add custom regex patterns or disable built-ins by name via config.
 - **Progress indicators** — Contextual `indicatif` spinners during pipeline phases (analyzing, scanning, generating). Auto-suppressed in non-TTY environments (git hooks, pipes).
+- **Evaluation harness** — `cargo test --features eval` for structured LLM output quality benchmarking.
+- **Fuzz testing** — `cargo-fuzz` targets for sanitizer and diff parser robustness.
 
 ### 🔬 `v0.3.1` — Trust, but Verify

src/services/analyzer.rs

Lines changed: 12 additions & 12 deletions
@@ -15,10 +15,15 @@ use crate::error::Result;
 
 // ─── Embedded query patterns ────────────────────────────────────────────────
 
+#[cfg(feature = "lang-rust")]
 const RUST_QUERY: &str = include_str!("../queries/rust.scm");
+#[cfg(feature = "lang-typescript")]
 const TYPESCRIPT_QUERY: &str = include_str!("../queries/typescript.scm");
+#[cfg(feature = "lang-javascript")]
 const JAVASCRIPT_QUERY: &str = include_str!("../queries/javascript.scm");
+#[cfg(feature = "lang-python")]
 const PYTHON_QUERY: &str = include_str!("../queries/python.scm");
+#[cfg(feature = "lang-go")]
 const GO_QUERY: &str = include_str!("../queries/go.scm");
 
 #[cfg(feature = "lang-java")]
@@ -131,41 +136,36 @@ impl AnalyzerService {
             .unwrap_or("");
 
         let config = match ext {
+            #[cfg(feature = "lang-rust")]
             "rs" => Some(LanguageConfig {
                 language: tree_sitter_rust::LANGUAGE.into(),
                 query_source: RUST_QUERY,
                 file_ext: "rs",
             }),
-            "ts" => Some(LanguageConfig {
+            #[cfg(feature = "lang-typescript")]
+            "ts" | "tsx" => Some(LanguageConfig {
                 language: tree_sitter_typescript::LANGUAGE_TYPESCRIPT.into(),
                 query_source: TYPESCRIPT_QUERY,
                 file_ext: "ts",
             }),
-            "tsx" => Some(LanguageConfig {
-                language: tree_sitter_typescript::LANGUAGE_TYPESCRIPT.into(),
-                query_source: TYPESCRIPT_QUERY,
-                file_ext: "tsx",
-            }),
+            #[cfg(feature = "lang-python")]
             "py" => Some(LanguageConfig {
                 language: tree_sitter_python::LANGUAGE.into(),
                 query_source: PYTHON_QUERY,
                 file_ext: "py",
             }),
+            #[cfg(feature = "lang-go")]
             "go" => Some(LanguageConfig {
                 language: tree_sitter_go::LANGUAGE.into(),
                 query_source: GO_QUERY,
                 file_ext: "go",
             }),
-            "js" => Some(LanguageConfig {
+            #[cfg(feature = "lang-javascript")]
+            "js" | "jsx" => Some(LanguageConfig {
                 language: tree_sitter_javascript::LANGUAGE.into(),
                 query_source: JAVASCRIPT_QUERY,
                 file_ext: "js",
             }),
-            "jsx" => Some(LanguageConfig {
-                language: tree_sitter_javascript::LANGUAGE.into(),
-                query_source: JAVASCRIPT_QUERY,
-                file_ext: "jsx",
-            }),
             #[cfg(feature = "lang-java")]
             "java" => Some(LanguageConfig {
                 language: tree_sitter_java::LANGUAGE.into(),

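One detail in the `analyzer.rs` hunk: the merged `"ts" | "tsx"` and `"js" | "jsx"` arms now set `file_ext` to `"ts"` and `"js"` for both extensions, where the removed arms reported `"tsx"` and `"jsx"`. The diff does not show whether anything reads that field. If the distinction matters, an `@` binding keeps the or-pattern while preserving the original extension. The sketch below is hypothetical: this `LanguageConfig` is a simplified stand-in with an assumed `grammar` field and a borrowed `file_ext`, not the struct from the commit, and the `#[cfg]` gates are omitted for brevity.

```rust
/// Simplified, hypothetical stand-in for the struct used in the diff.
#[derive(Debug, PartialEq)]
pub struct LanguageConfig<'a> {
    pub grammar: &'static str,
    pub file_ext: &'a str,
}

/// Maps an extension to a config while keeping the exact extension matched.
pub fn config_for_ext(ext: &str) -> Option<LanguageConfig<'_>> {
    match ext {
        // `matched @ (...)` binds whichever alternative matched, so one arm
        // still covers both extensions without collapsing them to "ts".
        matched @ ("ts" | "tsx") => Some(LanguageConfig {
            grammar: "typescript",
            file_ext: matched,
        }),
        matched @ ("js" | "jsx") => Some(LanguageConfig {
            grammar: "javascript",
            file_ext: matched,
        }),
        _ => None,
    }
}
```

If the real `file_ext` field is `&'static str`, the borrowed binding would not fit as-is, and a nested match returning string literals would be one alternative.
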