Splunk App Lifecycle Copilot is a hackathon MVP for carrying a Splunk app from raw logs to a CIM-clean, AppInspect-green package with self-healing loops and an auditable provenance trail.
Positioning: Splunk's own AI can explain failures. This project resolves, validates, and remembers them.
Fastest look — no Splunk, no MCP, no install (needs only Bun):
make dashboard # cd ui/dashboard && bun install && bun run dev
Open the printed URL. The dashboard lands on a Lifecycle overview of the self-heal loops; from there, drill into the Onboarding and AppInspect stages. The Provenance Ledger panel makes the "resolve, validate, and remember" thesis literal: it shows every diagnosis, patch, rationale, and validation result from a verified run, replayed from committed demo events.
Run the real software end-to-end:
make setup # Python 3.13 venv + install (see Requirements re: 3.13)
make demo # runs the loops, prints where every artifact landed
make demo always runs the dependency-free AppInspect loop. It also runs the
live onboarding loop (HEC ingest → MCP splunk_run_query validation) when
.env carries the Splunk + MCP credentials; otherwise it points back to the
zero-deps dashboard replay. make help lists every target.
- Event: Splunk Agentic Ops Hackathon
- Track: Platform & Developer Experience
- Bonus target: Best Use of Splunk MCP Server
- Submission deadline: June 15, 2026, 9:00 AM PDT
The MVP builds three loops on one shared self-heal engine:
- Stage 1, onboarding: raw UPI/GST-style logs -> inline rex/eval extraction candidates -> validation against real Splunk events through splunk_run_query -> final props.conf/transforms.conf only after convergence.
- Stage 2, AppInspect: deliberately broken app -> AppInspect JSON failures -> deterministic patch functions -> re-run until green.
- Stage 4, cost-aware SPL lint: a deliberately costly search -> deterministic cost findings (index=*, no time bound, unbounded | sort) -> deterministic rewrites -> re-lint until clean. Static analysis, no live Splunk.
Stages 3 (scaffold + test data) and 5 (dashboard migration) remain architecture-only future extensions.
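The shared engine behind these loops can be pictured as a validate, patch, re-validate cycle. The sketch below is illustrative only: `self_heal`, `HealResult`, and the toy validator and patchers are hypothetical names, not the project's actual API. It shows how one loop can serve any stage that supplies a validator and a table of deterministic patch functions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class HealResult:
    converged: bool
    iterations: int
    provenance: list[dict] = field(default_factory=list)


def self_heal(
    state: dict,
    validate: Callable[[dict], list[str]],
    patchers: dict[str, Callable[[dict], str]],
    max_iterations: int = 5,
) -> HealResult:
    """Validate, patch the first known finding, and re-validate until clean."""
    provenance: list[dict] = []
    for iteration in range(1, max_iterations + 1):
        findings = validate(state)
        if not findings:
            return HealResult(True, iteration, provenance)
        finding = findings[0]
        patcher = patchers.get(finding)
        if patcher is None:
            # No deterministic fix is registered; stop instead of guessing.
            return HealResult(False, iteration, provenance)
        rationale = patcher(state)  # the patcher mutates state in place
        provenance.append(
            {"iteration": iteration, "finding": finding, "rationale": rationale}
        )
    return HealResult(not validate(state), max_iterations, provenance)


# Toy stage (hypothetical): each flag set to True counts as one finding.
def toy_validate(state: dict) -> list[str]:
    return [name for name in ("has_local_dir", "has_user_seed") if state.get(name)]


def clear(flag: str) -> Callable[[dict], str]:
    def patch(state: dict) -> str:
        state[flag] = False
        return f"cleared {flag}"
    return patch


state = {"has_local_dir": True, "has_user_seed": True}
result = self_heal(
    state,
    toy_validate,
    {"has_local_dir": clear("has_local_dir"), "has_user_seed": clear("has_user_seed")},
)
print(result.converged, result.iterations, len(result.provenance))
```

Healing one finding per iteration keeps each provenance entry tied to exactly one patch, which is the property the ledger relies on.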
- SCOPE.md: source-of-truth scope, build sequencing, risks, and submission checklist.
- UX_DEMO_PLAN.md: dashboard/CLI/IDE surfaces and the agent-to-UI event contract.
- architecture_diagram.md: required architecture diagram artifact.
- demo/architecture_demo.md: technical, presentation-ready architecture walkthrough for the demo (system overview, the self-heal engine, live-MCP onboarding, and live-mode SSE; five Mermaid diagrams).
- docker-compose.yml: local Splunk Enterprise container with HEC enabled.
- smoke_test.py: Day-1 Splunk SDK, HEC, search, and AppInspect smoke test.
- fixtures/onboarding/sample_upi.log: 150-line synthetic UPI transaction fixture.
- fixtures/appinspect/broken_app: AppInspect failure fixture for the first build milestone.
- Python 3.13 (use 3.13 specifically, not 3.14 — see note below)
- Docker Desktop
- Homebrew libmagic on macOS for splunk-appinspect
- Splunk Enterprise Docker image (splunk/splunk:latest)
- Splunk MCP Server app for the onboarding loop
- Optional: Splunk AI Assistant for saia_generate_spl, saia_explain_spl, and saia_optimize_spl
Python dependency notes:
- PyPI package is splunk-sdk; import name is splunklib.
- splunklib.ai requires splunk-sdk>=3.0.0, which requires Python 3.13+.
- Pin the interpreter to Python 3.13, not 3.14. On 3.14, AppInspect's bundled Python static analyzer fails to initialize and reports every Python check as an error ("Python analyzer is failed in initialization"). That leaves the AppInspect self-heal loop unable to reach a clean result even after the three real failures are patched. On macOS: brew install python@3.13 and build the venv with /opt/homebrew/opt/python@3.13/bin/python3.13 -m venv .venv.
- splunk_run_query is the required MCP validation tool.
- saia_* tools are optional graceful enhancement only.
python3.13 -m venv .venv
source .venv/bin/activate
pip install -e .
cp .env.example .env
Edit .env with your local values. The MCP token must be the encrypted token generated by the Splunk MCP Server app. It is not a plain Splunk REST bearer token. Do not build the demo around OAuth; OAuth for Splunk MCP is Controlled Access / closed preview.
On macOS, install AppInspect's system dependency if needed:
brew install libmagic
Start Splunk:
docker compose up -d
Ports are driven by .env (SPLUNK_WEB_PORT, SPLUNK_HEC_PORT, SPLUNK_MGMT_PORT).
If host port 8088 is already in use, set a different SPLUNK_HEC_PORT (the local dev
box uses 18088); the onboarding loop reads the same value, so HEC ingest stays in sync.
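As a rough illustration of how a single SPLUNK_HEC_PORT value can keep ingest and Docker in sync, the sketch below builds a HEC URL from an environment mapping. The helper name `hec_url` and the 8088 fallback are assumptions for this example; the project's real code may resolve the port differently. `/services/collector/event` is Splunk's standard HEC event endpoint.

```python
def hec_url(env: dict[str, str], host: str = "localhost") -> str:
    """Build the HEC event endpoint from the same port variable Docker uses."""
    # Assumption: fall back to 8088 when SPLUNK_HEC_PORT is not set.
    port = int(env.get("SPLUNK_HEC_PORT", "8088"))
    return f"https://{host}:{port}/services/collector/event"


print(hec_url({}))
print(hec_url({"SPLUNK_HEC_PORT": "18088"}))
```

In practice the mapping would be `os.environ` after .env has been loaded.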
Wait for the container to become healthy, then run:
python smoke_test.py
The current AppInspect fixture is designed to fire three deterministic failures:
- check_that_local_does_not_exist: forbidden local/ directory.
- check_user_seed_conf_deny_list: forbidden default/user-seed.conf.
- check_if_outputs_conf_exists: forwarding enabled in default/outputs.conf.
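To make "deterministic patch functions" concrete, here is a minimal sketch that maps each of the three check names to a file-level fix. The function names, the `PATCHERS` table, and the choice to delete outputs.conf outright are assumptions made for illustration; the project's actual patchers may behave differently (for example, by disabling forwarding rather than removing the file). The sketch runs against a throwaway app in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def patch_local_dir(app: Path) -> list[str]:
    """Remove the forbidden local/ directory if it exists."""
    target = app / "local"
    if target.is_dir():
        shutil.rmtree(target)
        return ["local/"]
    return []


def patch_user_seed(app: Path) -> list[str]:
    """Remove the forbidden default/user-seed.conf if it exists."""
    target = app / "default" / "user-seed.conf"
    if target.is_file():
        target.unlink()
        return ["default/user-seed.conf"]
    return []


def patch_outputs_conf(app: Path) -> list[str]:
    """Assumed fix: delete default/outputs.conf to stop forwarding."""
    target = app / "default" / "outputs.conf"
    if target.is_file():
        target.unlink()
        return ["default/outputs.conf"]
    return []


# Hypothetical lookup table keyed by AppInspect check name.
PATCHERS = {
    "check_that_local_does_not_exist": patch_local_dir,
    "check_user_seed_conf_deny_list": patch_user_seed,
    "check_if_outputs_conf_exists": patch_outputs_conf,
}


def build_broken_app(root: Path) -> Path:
    """Create a tiny stand-in for the broken fixture."""
    app = root / "broken_app"
    (app / "local").mkdir(parents=True)
    (app / "default").mkdir()
    (app / "local" / "app.conf").write_text("[ui]\n")
    (app / "default" / "user-seed.conf").write_text("[user_info]\n")
    (app / "default" / "outputs.conf").write_text("[tcpout]\n")
    return app


with tempfile.TemporaryDirectory() as tmp:
    app = build_broken_app(Path(tmp))
    changed = [path for name in PATCHERS for path in PATCHERS[name](app)]
    remaining = sorted(p.relative_to(app).as_posix() for p in app.rglob("*"))

print(changed)
print(remaining)
```

Each patcher returns the paths it changed, so the same return value can feed a provenance entry without a second filesystem scan.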
Run the self-heal loop:
copilot appinspect fixtures/appinspect/broken_app --out runs/appinspect-demo
The command copies the fixture into runs/appinspect-demo/work/broken_app,
patches only that working copy, and writes:
- runs/appinspect-demo/appinspect/iteration-XX.json
- runs/appinspect-demo/events.jsonl
- runs/appinspect-demo/events.json
- runs/appinspect-demo/provenance.jsonl
- runs/appinspect-demo/summary.json
Validate it directly:
splunk-appinspect inspect fixtures/appinspect/broken_app \
--mode test \
--data-format json \
--output-file /tmp/broken_app_result.json
Expected summary:
failure: 3
error: 0
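A loop needs a machine-readable "green" test rather than an eyeballed summary. The sketch below assumes the JSON report carries a top-level `summary` object with `failure` and `error` counts, as the expected output above suggests; verify the key layout against your own report file before relying on it. `is_green` is a hypothetical helper name.

```python
import json


def is_green(report: dict) -> bool:
    """Treat a report as green when it has zero failures and zero errors."""
    summary = report.get("summary", {})
    return summary.get("failure", 0) == 0 and summary.get("error", 0) == 0


# Inline stand-in for /tmp/broken_app_result.json (assumed shape).
broken = json.loads('{"summary": {"failure": 3, "error": 0}}')
healed = json.loads('{"summary": {"failure": 0, "error": 0}}')

print(is_green(broken), is_green(healed))
```

Counting `error` alongside `failure` matters here because, per the dependency notes, an analyzer that fails to initialize surfaces as errors rather than failures.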
The onboarding loop is the only path that needs the Splunk MCP Server app and an encrypted token. The static loops (AppInspect, SPL lint), the dashboard, and Live mode need none of this — skip this section if you are not running onboarding.
Verified against Splunk MCP Server v1.2.0 on splunk/splunk:latest. The
commands below hardcode the default management port 8089 (substitute your
$SPLUNK_MGMT_PORT if you changed it) and use the admin password from .env.
1. Install the app. Download "Splunk MCP Server" from Splunkbase, then in
Splunk Web go to Apps -> Manage Apps -> Install app from file and upload the
.tar.gz (or docker cp it into $SPLUNK_HOME/etc/apps/ and restart the
container). Confirm it is installed and enabled:
curl -sk -u "admin:$SPLUNK_PASSWORD" \
"https://localhost:8089/services/apps/local/Splunk_MCP_Server?output_mode=json" \
| python3 -c "import sys,json;e=json.load(sys.stdin)['entry'][0]['content'];print('enabled' if not e.get('disabled') else 'DISABLED','v'+str(e.get('version')))"
2. Enable token authentication (once; idempotent):
curl -sk -u "admin:$SPLUNK_PASSWORD" -X POST \
"https://localhost:8089/services/admin/token-auth/tokens_auth" -d disabled=0
3. Mint the encrypted token. The app RSA-encrypts a JWT (audience mcp);
mcp.conf sets require_encrypted_token = true, so this is not a plain REST
bearer token. The + in the relative expiry must be URL-encoded as %2B or
the app rejects the request:
curl -sk -u "admin:$SPLUNK_PASSWORD" \
"https://localhost:8089/services/mcp_token?username=admin&expires_on=%2B30d"
# -> {"token": "<encrypted-token>"} (valid 30 days)
4. Put the values in .env:
SPLUNK_MCP_ENDPOINT=https://localhost:8089/services/mcp
SPLUNK_MCP_ENCRYPTED_TOKEN=<the token value from step 3>
SPLUNK_MCP_TLS_VERIFY=false # local self-signed Docker dev
The token expires after 30 days; re-mint by repeating steps 2-3. The saia_*
tools are optional and only appear when Splunk AI Assistant is installed; the
onboarding loop succeeds with splunk_run_query alone.
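The %2B requirement from step 3 is easy to get wrong when the URL is built in code rather than typed by hand. The short sketch below uses only the Python standard library to show the encoded form and what a form-style decoder makes of a raw `+`. That the app rejects the request for this reason is an inference, not something confirmed by its documentation.

```python
from urllib.parse import parse_qs, quote, urlencode

relative_expiry = "+30d"

# Percent-encode the plus sign explicitly.
encoded = quote(relative_expiry, safe="")

# urlencode applies the same escaping when it builds a full query string.
query = urlencode({"username": "admin", "expires_on": relative_expiry})

# A raw "+" is decoded as a space by form-style query parsers.
misread = parse_qs("expires_on=+30d")["expires_on"][0]

print(encoded)
print(query)
print(repr(misread))
```

Building the query with `urlencode` instead of string concatenation avoids the problem without having to remember the escape.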
The onboarding slice is live-only: it ingests fixtures/onboarding/sample_upi.log
through HEC, validates inline SPL candidates with MCP splunk_run_query, and
hard-fails if the Splunk MCP Server app or splunk_run_query tool is unavailable.
It does not use Splunk AI Assistant, Splunk SDK search fallback, or generated
props.conf / transforms.conf yet.
Required .env values:
- SPLUNK_HEC_TOKEN
- SPLUNK_ONBOARDING_INDEX=main
- SPLUNK_ONBOARDING_SOURCETYPE=upi_gateway_raw
- SPLUNK_MCP_ENDPOINT
- SPLUNK_MCP_ENCRYPTED_TOKEN
- SPLUNK_MCP_TLS_VERIFY=false for local self-signed Docker dev
Run it:
copilot onboard fixtures/onboarding/sample_upi.log --out runs/onboarding-demo
Expected flow: candidate-00 fails coverage checks, the deterministic patcher
switches to candidate-01, MCP revalidation passes, and six CIM mapping events
and two PII flag events are written for dashboard replay.
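The candidate-00 to candidate-01 switch can be illustrated offline with plain regular expressions. This is only an analogy: the real loop validates inline SPL against indexed events through MCP splunk_run_query, and the log lines, patterns, and helper names below are invented for the example rather than taken from the fixture.

```python
import re

# Hypothetical log lines; the real fixture format may differ.
SAMPLE_LINES = [
    "2026-01-05T10:00:01Z txn_id=T1001 amount=250.00 status=SUCCESS",
    "2026-01-05T10:00:02Z txn_id=T1002 amount=99.50 status=FAILED",
    "2026-01-05T10:00:03Z txn_id=T1003 amount=1200 status=SUCCESS",
]

CANDIDATES = [
    r"amount=(?P<amount>\d+\.\d+)",       # candidate-00: requires a decimal part
    r"amount=(?P<amount>\d+(?:\.\d+)?)",  # candidate-01: decimal part optional
]


def coverage(pattern: str, lines: list[str]) -> float:
    """Fraction of lines from which the pattern extracts a field."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line)) / len(lines)


def pick_candidate(candidates: list[str], lines: list[str], threshold: float = 1.0):
    """Return the index of the first candidate that meets the coverage threshold."""
    for index, pattern in enumerate(candidates):
        if coverage(pattern, lines) >= threshold:
            return index
    return None


for index, pattern in enumerate(CANDIDATES):
    print(f"candidate-{index:02d} coverage: {coverage(pattern, SAMPLE_LINES):.2f}")
print("selected:", pick_candidate(CANDIDATES, SAMPLE_LINES))
```

The first pattern misses the integer-valued amount, so it falls short of full coverage and the second candidate is selected.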
The SPL lint loop is static like AppInspect — no live Splunk required. It lints a search for cost anti-patterns, heals each with a deterministic rewrite, and re-lints until clean.
copilot lint fixtures/spl_lint/costly_search.spl --out runs/spl-lint-demo
The fixture fires three findings, each healed in its own iteration:
- spl_wildcard_index: index=* scans every index -> rewritten to index=main.
- spl_all_time: no earliest/latest bound -> prepended earliest=-24h.
- spl_unbounded_sort: | sort with no limit -> capped at | sort 1000.
It writes the same artifact set as the other loops
(spl/iteration-XX.json, events.json, provenance.jsonl, summary.json),
so the run drops straight into the dashboard's SPL Lint stage.
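A regex-based sketch of the three findings and their rewrites is shown below. The finding IDs and rewrite targets come from the list above; the regular expressions, the `lint` and `heal` helpers, and the sample search are assumptions for illustration, and a real linter would likely parse SPL more carefully than this.

```python
import re


def lint(spl: str) -> list[str]:
    """Return the cost findings that still apply to the search."""
    findings = []
    if re.search(r"\bindex\s*=\s*\*", spl):
        findings.append("spl_wildcard_index")
    if not re.search(r"\b(earliest|latest)\s*=", spl):
        findings.append("spl_all_time")
    if re.search(r"\|\s*sort\s+(?!\s*\d)", spl):
        findings.append("spl_unbounded_sort")
    return findings


# One deterministic rewrite per finding ID.
REWRITES = {
    "spl_wildcard_index": lambda spl: re.sub(r"\bindex\s*=\s*\*", "index=main", spl),
    "spl_all_time": lambda spl: "earliest=-24h " + spl,
    "spl_unbounded_sort": lambda spl: re.sub(
        r"(\|\s*sort)\s+(?!\s*\d)", r"\1 1000 ", spl
    ),
}


def heal(spl: str, max_iterations: int = 5) -> tuple[str, list[str]]:
    """Fix one finding per iteration and re-lint until the search is clean."""
    history = []
    for _ in range(max_iterations):
        findings = lint(spl)
        if not findings:
            break
        spl = REWRITES[findings[0]](spl)
        history.append(findings[0])
    return spl, history


# Hypothetical costly search; the real fixture content may differ.
costly = "index=* status=FAILED | sort -amount"
clean, history = heal(costly)
print(history)
print(clean)
```

Because each rewrite removes exactly the pattern its lint rule detects, the loop is guaranteed to make progress on every iteration for these three rules.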
- Terminal starts the agent and proves this is real software.
- Dashboard shows onboarding: fields appear as splunk_run_query validates inline extraction candidates against the 150-line fixture, PII is flagged, and CIM mapping converges.
- Dashboard shows AppInspect: three red failures are diagnosed, patched by deterministic functions, and revalidated to green.
- VS Code cutaway shows the same agent entry point from the IDE.
- Provenance ledger shows every diagnosis, patch, rationale, and validation result.
The dashboard replays all three self-heal loops. It opens on a Lifecycle
overview that summarizes every loop side by side (status, failures healed,
iterations, MCP calls) to make the one-engine/many-loops thesis legible at a
glance. From there, use the sidebar to open the Onboarding, AppInspect,
or SPL Lint stage; each renders committed demo events
(demo/onboarding_events.json, demo/appinspect_events.json,
demo/spl_lint_events.json) and requires no Splunk, MCP, or live WebSocket. The
onboarding stage adds an MCP tool-call count and a CIM-mapping / PII panel
sourced from a verified live run.
Every stage also renders a Provenance Ledger panel: the complete, durable
audit trail read from the persisted *_provenance.jsonl — each entry's
diagnosis, patch, rationale, validation result, changed paths, and timestamp.
That panel is the "and remember" half of the thesis, and it is exactly what a
reviewer would inspect to trust an automated fix.
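For readers unfamiliar with JSONL ledgers, the sketch below writes and reads one entry carrying the fields listed above. The key names and helper functions are assumptions chosen to mirror that list; the project's actual provenance schema may use different keys. An in-memory buffer stands in for the *_provenance.jsonl file.

```python
import io
import json
from datetime import datetime, timezone


def make_entry(diagnosis, patch, rationale, validation, changed_paths):
    """Build one ledger entry (hypothetical field names)."""
    return {
        "diagnosis": diagnosis,
        "patch": patch,
        "rationale": rationale,
        "validation": validation,
        "changed_paths": changed_paths,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def append_entry(stream, entry: dict) -> None:
    """Append one JSON object per line so the ledger stays append-only."""
    stream.write(json.dumps(entry, sort_keys=True) + "\n")


def read_ledger(stream) -> list[dict]:
    """Parse every non-empty line back into a dict."""
    return [json.loads(line) for line in stream if line.strip()]


ledger = io.StringIO()  # stand-in for an open provenance.jsonl file
append_entry(
    ledger,
    make_entry(
        diagnosis="check_that_local_does_not_exist failed",
        patch="patch_local_dir",
        rationale="local/ must not ship in a packaged app",
        validation="passed",
        changed_paths=["local/"],
    ),
)
ledger.seek(0)
entries = read_ledger(ledger)
print(len(entries), entries[0]["changed_paths"])
```

One object per line means a crashed run still leaves every completed entry readable, which suits an audit trail.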
cd ui/dashboard
bun install
bun run dev
Verification:
bun run test
bun run build
The dashboard can also stream a self-heal loop as it runs, instead of replaying committed events. Start the SSE server, then click Go Live on the AppInspect or SPL Lint stage:
copilot serve # or: make serve (defaults to 127.0.0.1:8765)
The server (lifecycle_copilot.server) runs the chosen static loop (AppInspect
or SPL lint, neither of which needs a live Splunk) in a background thread and streams
each event over Server-Sent Events. The browser feeds those events through the
same reducer used for replay, so live and replay render identically; only the
event source differs. Point the dashboard at a non-default server with
VITE_LIVE_URL. Onboarding is replay-only here because it requires Splunk + MCP.
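The "same reducer, different event source" idea can be demonstrated in a few lines. The dashboard's real reducer lives in the browser; the Python below is only an illustration, and the event types (`failure_found`, `patch_applied`, `run_complete`) and state shape are hypothetical. It frames events as Server-Sent Events, parses them back, and checks that the live path and the replay path reduce to the same state.

```python
import json
from functools import reduce


def to_sse(event: dict) -> str:
    """Frame one event as a Server-Sent Events message."""
    return f"event: {event['type']}\ndata: {json.dumps(event)}\n\n"


def parse_sse(stream: str) -> list[dict]:
    """Recover the JSON payloads from a block of SSE frames."""
    events = []
    for frame in stream.split("\n\n"):
        for line in frame.splitlines():
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
    return events


def initial_state() -> dict:
    return {"open_failures": set(), "status": "running"}


def reduce_event(state: dict, event: dict) -> dict:
    """Pure reducer: returns a new state and never mutates the old one."""
    failures = set(state["open_failures"])
    status = state["status"]
    if event["type"] == "failure_found":
        failures.add(event["check"])
    elif event["type"] == "patch_applied":
        failures.discard(event["check"])
    elif event["type"] == "run_complete":
        status = "green" if not failures else "failed"
    return {"open_failures": failures, "status": status}


committed_events = [
    {"type": "failure_found", "check": "check_that_local_does_not_exist"},
    {"type": "patch_applied", "check": "check_that_local_does_not_exist"},
    {"type": "run_complete"},
]

replayed = reduce(reduce_event, committed_events, initial_state())
wire = "".join(to_sse(event) for event in committed_events)
live = reduce(reduce_event, parse_sse(wire), initial_state())
print(replayed == live, live["status"])
```

Keeping the reducer pure is what makes the equivalence hold: state depends only on the event sequence, not on how the events arrived.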
The repo ships a .vscode/ workspace so the agent runs from the IDE — the same
copilot entry point, surfaced as Run Task and Run and Debug entries:
- Terminal → Run Task lists each loop: AppInspect self-heal, SPL lint self-heal, Onboard (live MCP), plus Live stream server, Dashboard, and Run Python tests. AppInspect and SPL lint need no Splunk.
- Run and Debug (launch.json) starts any loop under debugpy with the venv interpreter, so you can set breakpoints in the self-heal engine.
- extensions.json recommends the Python, debugpy, and Bun extensions.
Run make setup first so .venv exists; the tasks call .venv/bin/copilot.
The self-heal engine is intentionally constrained. The LLM produces diagnosis and rationale text; deterministic patch functions make file changes. That gives the demo repeatability, keeps patch provenance reviewable, and makes the platform thesis credible.