Merged
8 changes: 3 additions & 5 deletions .env.minimal
@@ -18,12 +18,12 @@ OPENAI_TEMPERATURE=0.1
 OPENAI_TOP_P=0.8
 
 # -----------------------------------------------------------------------------
-# Persistence
+# Observability
 # -----------------------------------------------------------------------------
 # Path to SQLite database (enables persistence when set)
-MORALSTACK_DB_PATH=moralstack.db
+MORALSTACK_OBSERVABILITY_DB_PATH=moralstack.db
 # Mode: db_only | dual | file_only (default: db_only if DB_PATH set, else file_only)
-MORALSTACK_PERSIST_MODE=db_only
+MORALSTACK_OBSERVABILITY_MODE=db_only
 
 # -----------------------------------------------------------------------------
 # Benchmark
@@ -183,7 +183,5 @@ MORALSTACK_ORCHESTRATOR_CYCLE1_EARLY_CONVERGENCE_MIN_PER_PERSPECTIVE_APPROVAL=0.
 # -----------------------------------------------------------------------------
 # Tracing & Debug
 # -----------------------------------------------------------------------------
-# Decision trace JSONL path
-MORALSTACK_DECISION_TRACE_PATH=logs/decision_trace.jsonl
 # Set to 1 or true for verbose output
 MORALSTACK_VERBOSE=1
12 changes: 8 additions & 4 deletions .env.template
@@ -24,11 +24,17 @@ OPENAI_TEMPERATURE=0.1
 OPENAI_TOP_P=0.8
 
 # -----------------------------------------------------------------------------
-# Persistence
+# Observability
 # -----------------------------------------------------------------------------
 # Path to SQLite database (enables persistence when set)
-# MORALSTACK_DB_PATH=moralstack.db
+# MORALSTACK_OBSERVABILITY_DB_PATH=moralstack.db
 # Mode: db_only | dual | file_only (default: db_only if DB_PATH set, else file_only)
+# MORALSTACK_OBSERVABILITY_MODE=db_only
+# JSONL output directory (file_only and dual modes; default: logs/observability)
+# MORALSTACK_OBSERVABILITY_JSONL_DIR=logs/observability
+#
+# Deprecated aliases (still work; emit a deprecation warning at runtime):
+# MORALSTACK_DB_PATH=moralstack.db
 # MORALSTACK_PERSIST_MODE=db_only
 
 # -----------------------------------------------------------------------------
@@ -195,7 +201,5 @@ MORALSTACK_UI_PASSWORD=
 # -----------------------------------------------------------------------------
 # Tracing & Debug
 # -----------------------------------------------------------------------------
-# Decision trace JSONL path
-# MORALSTACK_DECISION_TRACE_PATH=logs/decision_trace.jsonl
 # Set to 1 or true for verbose output
 # MORALSTACK_VERBOSE=
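The alias behavior documented in the `.env.template` hunk above — the old `MORALSTACK_DB_PATH` / `MORALSTACK_PERSIST_MODE` names still resolve, with a deprecation warning at runtime — can be sketched as follows. This is an illustrative helper under assumed names (`resolve_env`, `_ALIASES`, the warning text), not the project's actual implementation:

```python
import os
import warnings

# Hypothetical mapping from new names to their deprecated aliases;
# the real project may wire this up differently.
_ALIASES = {
    "MORALSTACK_OBSERVABILITY_DB_PATH": "MORALSTACK_DB_PATH",
    "MORALSTACK_OBSERVABILITY_MODE": "MORALSTACK_PERSIST_MODE",
}

def resolve_env(name, default=None):
    """Return the value for `name`, falling back to its deprecated alias."""
    value = os.environ.get(name)
    if value is not None:
        return value  # new name wins when both are set
    alias = _ALIASES.get(name)
    if alias is not None:
        value = os.environ.get(alias)
        if value is not None:
            warnings.warn(
                f"{alias} is deprecated; use {name}", DeprecationWarning
            )
            return value
    return default
```

Resolving through the new name first means setting both variables is safe during migration: the deprecated value is only read (and warned about) when the new name is absent.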
12 changes: 7 additions & 5 deletions INSTALL.md
@@ -86,12 +86,14 @@ See [docs/modules/openai_params.md](docs/modules/openai_params.md) for details a
 | OPENAI_MAX_RETRIES | 3 | Retries on 429/503 |
 | OPENAI_TEMPERATURE | 0.7 (fallback) | Generation temperature (`.env.template` starter: 0.1) |
 | OPENAI_TOP_P | 0.9 (fallback) | Nucleus sampling parameter (`.env.template` starter: 0.8) |
-| MORALSTACK_DB_PATH | - | SQLite DB path (enables persistence) |
-| MORALSTACK_PERSIST_MODE | db_only if DB_PATH set | db_only \| dual \| file_only |
+| MORALSTACK_OBSERVABILITY_DB_PATH | - | SQLite DB path (enables persistence) |
+| MORALSTACK_OBSERVABILITY_MODE | db_only if DB set | db_only \| dual \| file_only |
+| MORALSTACK_OBSERVABILITY_JSONL_DIR | logs/observability | JSONL output directory (file_only and dual modes) |
+| MORALSTACK_DB_PATH | - | Deprecated alias for MORALSTACK_OBSERVABILITY_DB_PATH |
+| MORALSTACK_PERSIST_MODE | - | Deprecated alias for MORALSTACK_OBSERVABILITY_MODE |
 | MORALSTACK_UI_PORT | 8765 | Web UI port |
 | MORALSTACK_UI_USERNAME | - | Basic Auth for UI (required when running moralstack-ui) |
 | MORALSTACK_UI_PASSWORD | - | Basic Auth for UI |
-| MORALSTACK_DECISION_TRACE_PATH | logs/decision_trace.jsonl | Trace file path |
 | MORALSTACK_VERBOSE | - | Set to 1 for verbose output |
 
 **Risk Estimator**: Optional overrides (e.g. `MORALSTACK_RISK_MODEL`, `MORALSTACK_RISK_LOW_THRESHOLD`,
@@ -186,12 +188,12 @@ moralstack
 Type a prompt in the interactive shell; the system will evaluate the risk and respond accordingly.
 
 > **Note**: You can also use `python scripts/mstack_run.py` as a legacy wrapper, but the preferred method is
-> `moralstack`. Run `moralstack --verbose` for detailed deliberation output. With `MORALSTACK_DB_PATH` set, use
+> `moralstack`. Run `moralstack --verbose` for detailed deliberation output. With `MORALSTACK_OBSERVABILITY_DB_PATH` set, use
 > `moralstack-ui` to browse runs and export markdown reports on demand.
 
 ## Web UI (moralstack-ui)
 
-With `pip install -e .[ui]` and `MORALSTACK_DB_PATH` set, run:
+With `pip install -e .[ui]` and `MORALSTACK_OBSERVABILITY_DB_PATH` set, run:
 
 ```bash
 moralstack-ui
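The mode default stated in the INSTALL.md table above ("db_only if DB set", otherwise file_only per the `.env` comments) is simple enough to pin down in a few lines. A minimal sketch, assuming a helper name `effective_observability_mode` that the project may not actually use:

```python
import os

def effective_observability_mode():
    """Sketch of the documented default: an explicit
    MORALSTACK_OBSERVABILITY_MODE wins; otherwise db_only when a DB
    path is configured, file_only when it is not."""
    explicit = os.environ.get("MORALSTACK_OBSERVABILITY_MODE")
    if explicit in ("db_only", "dual", "file_only"):
        return explicit
    if os.environ.get("MORALSTACK_OBSERVABILITY_DB_PATH"):
        return "db_only"
    return "file_only"
```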
20 changes: 11 additions & 9 deletions README.md
@@ -87,10 +87,10 @@ Evaluated on 84 questions spanning adversarial prompts, dual-use domains, regula
 
 | | Baseline | MoralStack | Tie |
 |---|---|---|---|
-| **Wins** | 1 | **53** | 30 |
-| **Avg Safety Score** | 7.73/10 | **9.35/10** | — |
+| **Wins** | 6 | **54** | 24 |
+| **Avg Safety Score** | 7.83/10 | **9.27/10** | — |
 
-*(Latest full run: benchmark 11, same 84-question suite and judge.)*
+*(Latest full run: benchmark 12, same 84-question suite and judge.)*
 
 ### Decision Accuracy
 
@@ -111,10 +111,10 @@ REFUSE 0 0 22
 
 | | Baseline | MoralStack |
 |---|---|---|
-| **Mean wall-clock** | ~5s | **~44s** |
-| **Median wall-clock** | — | **~39s** |
+| **Mean wall-clock** | ~6s | **~36s** |
+| **Median wall-clock** | — | **~26s** |
 
-*(Benchmark 11, 84 questions; mean ~40% lower than an earlier benchmark configuration ~73s mean, median ~52% lower than ~83s median.)*
+*(Benchmark 12, 84 questions; mean ~51% lower than the original benchmark configuration ~73s mean. Fast path rate ~37% vs ~11% previously, due to REFUSE queries now routed through fast path.)*
 
 Deliberative paths add latency by design. Latency-reducing optimizations include dynamic scheduling, speculative overlap, parallel risk estimation, early convergence on cycle 1, lighter models for simulator and policy rewrite (see [Limitations](#limitations--trade-offs) and [Configuration](#configuration)).
 
@@ -190,8 +190,10 @@ Key variables:
 - `OPENAI_MAX_RETRIES` (default `3`)
 - `OPENAI_TEMPERATURE` (code fallback default `0.7`; `.env.template` starter value `0.1`)
 - `OPENAI_TOP_P` (code fallback default `0.9`; `.env.template` starter value `0.8`)
-- `MORALSTACK_DB_PATH` (enable SQLite persistence)
-- `MORALSTACK_PERSIST_MODE` (`db_only`, `dual`, `file_only`)
+- `MORALSTACK_OBSERVABILITY_DB_PATH` (enable SQLite persistence)
+- `MORALSTACK_OBSERVABILITY_MODE` (`db_only`, `dual`, `file_only`)
+- `MORALSTACK_OBSERVABILITY_JSONL_DIR` (JSONL output dir; default `logs/observability`)
+- `MORALSTACK_DB_PATH` / `MORALSTACK_PERSIST_MODE` (deprecated aliases; still work)
 - `MORALSTACK_ORCHESTRATOR_BORDERLINE_REFUSE_UPPER` (default `0.95`)
 
 For full variable reference see [INSTALL.md](INSTALL.md) and `docs/modules/*.md`.
@@ -257,7 +259,7 @@ Open [http://localhost:8765/](http://localhost:8765/) (or `MORALSTACK_UI_PORT`).
 
 MoralStack makes deliberate trade-offs:
 
-- **Latency over speed**: deliberative paths run multiple LLM calls (risk → critic → simulator → perspectives → hindsight). On the latest benchmark run, mean wall-clock is ~44s (median ~39s) vs ~5s for raw GPT-4o. This is a design choice — governance takes time.
+- **Latency over speed**: deliberative paths run multiple LLM calls (risk → critic → simulator → perspectives → hindsight). On the latest benchmark run, mean wall-clock is ~36s (median ~26s) vs ~6s for raw GPT-4o. This is a design choice — governance takes time.
 - **Multi-model cost**: a single deliberative request makes 7-9 LLM calls. Example profiles: `.env.minimal` uses `gpt-4.1-nano` for policy rewrite and simulator, and `gpt-4o-mini` for perspectives (all overridable via env).
 - **Benchmark scope**: 84 curated questions demonstrate the approach but do not cover all edge cases. We recommend running your own evaluations on domain-specific inputs.
 - **LLM non-determinism**: despite low temperature settings across all modules, LLM outputs can vary between runs. The system includes deterministic guardrails in code to bound this variance, but perfect reproducibility is not guaranteed.
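The three observability modes this PR documents (`db_only`, `dual`, `file_only`) boil down to which sinks receive each event: SQLite, a JSONL file under the output directory, or both. A minimal sketch of that dispatch, assuming a hypothetical `ObservabilitySink` class and a one-column `events` table — not the project's actual schema or API:

```python
import json
import sqlite3
from pathlib import Path

class ObservabilitySink:
    """Illustrative sketch of the documented modes, not the real API."""

    def __init__(self, mode, db_path=None, jsonl_dir="logs/observability"):
        assert mode in ("db_only", "dual", "file_only")
        self.mode = mode
        self.jsonl_dir = Path(jsonl_dir)
        self.conn = None
        if mode in ("db_only", "dual"):
            # DB-backed modes open (and initialize) the SQLite database.
            self.conn = sqlite3.connect(db_path)
            self.conn.execute("CREATE TABLE IF NOT EXISTS events (payload TEXT)")

    def record(self, event):
        line = json.dumps(event)
        if self.conn is not None:  # db_only and dual
            self.conn.execute("INSERT INTO events (payload) VALUES (?)", (line,))
            self.conn.commit()
        if self.mode in ("file_only", "dual"):  # JSONL-backed modes
            self.jsonl_dir.mkdir(parents=True, exist_ok=True)
            with (self.jsonl_dir / "events.jsonl").open("a", encoding="utf-8") as f:
                f.write(line + "\n")
```

`dual` writes every event twice, which is what makes it useful when migrating from JSONL files to the DB without losing the file trail.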
6 changes: 3 additions & 3 deletions docs/architecture_spec.md
@@ -265,9 +265,9 @@ primary model for baseline quality. To disable the split, set `MORALSTACK_POLICY
 In benchmark testing, this optimization reduces rewrite step latency and, combined with
 `gpt-4.1-nano` on the simulator, contributed to large reductions versus heavier simulator
 and rewrite defaults (historically on the order of ~82s → ~60s mean deliberative latency
-in prior runs). **Benchmark run 11** (84 questions) reports overall MoralStack **mean**
-wall-clock **~44s** and **median ~39s**, with **98.8%** compliance unchanged and overall
-judge score **~9.35/10** (vs **7.73/10** baseline).
+in prior runs). **Benchmark run 12** (84 questions) reports overall MoralStack **mean**
+wall-clock **~36s** and **median ~26s**, with **98.8%** compliance unchanged and overall
+judge score **~9.27/10** (vs **7.83/10** baseline).
 
 #### Rewrite prompt constraints
 
16 changes: 8 additions & 8 deletions docs/limitations_and_tradeoffs.md
@@ -17,23 +17,23 @@ to the risk of underestimating vulnerability situations.
 ### 2. Latency and Computational Cost
 
 When it activates the deliberative path, MoralStack introduces
-computational overhead compared to a direct LLM call (~44s mean,
-~39s median on benchmark run 11 vs ~5s for raw GPT-4o).
+computational overhead compared to a direct LLM call (~36s mean,
+~26s median on benchmark run 12 vs ~6s for raw GPT-4o).
 
 The system prioritizes safety, decision correctness, and auditability
 over pure latency. Cumulative optimizations (dynamic parallel scheduling,
 speculative overlap, parallel risk estimation, early convergence on
 cycle 1, lighter models for simulator and policy rewrite, guidance
 filtering on later cycles) reduced **mean** wall-clock from ~73s in an
-earlier benchmark configuration to **~44s** on benchmark run 11 (~40%
+earlier benchmark configuration to **~36s** on benchmark run 12 (~51%
 reduction), with **98.8%** compliance unchanged.
 
-Latency profile by path (84-question benchmark, run 11):
+Latency profile by path (84-question benchmark, run 12):
 
-- **Fast path** (benign queries, ~11% of traffic): ~10-12s
-- **Deliberative path**: mean ~44s / median ~39s overall; SAFE_COMPLETE
-  converging in **one** cycle averages ~33s mean; two-cycle SAFE_COMPLETE
-  remains higher and varies by query
+- **Fast path** (benign + clearly harmful queries, ~37% of traffic): ~12s
+- **Deliberative path**: SAFE_COMPLETE
+  converging in **one** cycle averages ~23s mean; two-cycle SAFE_COMPLETE
+  averages ~61s mean
 - **Deliberative sensitive** (regulated domains): often at the higher end
   of the deliberative range