Skip to content

Releases: KryptSec/oasis

v0.1.5

04 Mar 17:19
9114a56

Choose a tag to compare

Added

  • KSM now includes token efficiency as a third scoring factor — models that burn excessive tokens get penalized up to 30% (#47, #50)
  • Interactive export prompt after benchmark runs — copy share card or save HTML report (#48, #51)
  • Share / export option in results browser detail menu

Fixed

  • Anthropic token undercount: input_tokens excludes cached tokens, now sums all three fields (#44, #45)
  • Score label disambiguation: "Overall Score" → "Strategy Score" for LLM assessment, "Score" → "KSM" in table headers (#46, #49)
  • Remaining label inconsistencies in markdown, text, and terminal analysis output (#54)
  • Export prompt: writeFileSync crash on permission errors, unreachable no-analysis path, Ctrl+C mishandled (#55)
  • curl stderr leaking to terminal during benchmark runs (#52, #53)
  • Formula explainer now accurately describes KSM calculation

Changed

  • 363 tests passing (was 346)

v0.1.4

04 Mar 17:18

Choose a tag to compare

Security

  • Fixed TOCTOU vulnerability in credential file writes — now uses atomic mode setting (0o600)
  • Fixed world-readable result files — benchmark transcripts now written with 0o600 permissions
  • Added path validation for report --output flag — prevents path traversal, warns on symlinks
  • Added input validation for --max-iterations — rejects NaN and negative values

Fixed

  • Ollama analyzer now defaults to benchmark model instead of hardcoded llama3.3 (#33)
  • API calls now timeout after 120s instead of hanging indefinitely on network issues
  • Fixed analysis: any type annotations — now uses proper AnalysisResult interface
  • gradient-string now has graceful fallback for terminals without truecolor support (Windows)

Changed

  • Deduplicated score bar rendering — now uses shared renderScoreBar() helper
  • Added 25 new tests (346 total)

v0.1.3

04 Mar 17:18
764eabd

Choose a tag to compare

Added

  • Polished CLI output: gradient banners, boxed layouts, cli-table3 tables throughout
  • Live model fetching from provider APIs (config flow + run wizard)
  • Share card report format (oasis report <id> -f share) — compact markdown for Discord/GitHub
  • Standalone HTML report format (oasis report <id> -f html) — dark-themed, no external deps
  • Clipboard support (oasis report <id> -f share --clipboard)

Fixed

  • ATT&CK technique classification now runs on every command during benchmarks (was always null)
  • Analyzer backfills step-level techniques from LLM stepsUsed mapping
  • Updated provider model lists to current (Claude Opus 4.6, o3, Grok 4, Gemini 2.5 Pro)

Changed

  • Bumped @anthropic-ai/sdk ^0.71.2 → ^0.78.0
  • Bumped openai ^4.0.0 → ^6.25.0

v0.1.2

04 Mar 17:18

Choose a tag to compare

Fixed

  • CLI --version now reads from package.json instead of hardcoded value
  • Docker auto-start on macOS when daemon isn't running
  • Per-image ARM64 fallback (only emulates containers that need it)

v0.1.1

04 Mar 17:18
9d7ca20

Choose a tag to compare

Fixed

  • KSM score could exceed 100 when rubric total exceeded 100 points (#29)
  • Ollama benchmarks failed with missing OPENAI_API_KEY error (#28)
  • Updated provider model lists: added Gemini 3 Flash, Grok 3/4

v0.1.0 — Initial Release

04 Mar 17:18

Choose a tag to compare

CLI tool with commands: run, analyze, results, report, challenges, config, validate, providers

Multi-provider support: Anthropic, OpenAI, xAI, Google, Ollama, custom endpoints. LLM-powered post-run analysis with MITRE ATT&CK mapping. Kryptsec Scoring Model (KSM) with objective + qualitative rubric scoring. Multiple report formats: terminal, text, JSON, markdown.

Challenge validation against JSON schema. Rate-limit retry with exponential backoff. Results summary with OWASP category grouping. XDG-compliant configuration. 153 automated tests.