StoryCheck enables end-to-end natural language user story verification for Web3 apps.
StoryCheck promotes smooth collaboration between dApp developers, testers, document writers, product managers and support teams by enabling use of natural language for executable and verifiable user stories. StoryCheck reduces the need for expertise in lower-level e2e test frameworks such as Synpress, Cypress, Playwright and Selenium, and for the brittle test code written against them. Brittle UI testing code often leads to poor return on invested effort, which leaves front ends vulnerable to exploits. StoryCheck changes that by focusing on UX intention and stable UI flows rather than detailed CSS and HTML inspection.
Typical Workflow:
- dApp Builders collaborate with frontier AI models to generate user stories in markdown format. The format includes pre-requisites such as chain ID, UI steps and expected results for blockchain transactions.
- Use this AI Story Ideation Prompt Template for sessions with Grok or similar models.
- StoryCheck parses user stories and executes the steps in a virtual web browser (via Playwright), closely emulating the actions of a real user. It uses a state-of-the-art VLM (Jedi-3B) to understand and execute UI instructions.
- StoryCheck injects a mock wallet into the browser, which intercepts UI transaction requests and redirects them to a local EVM fork (via anvil).
- As long as the user story intention remains stable, StoryCheck will remain robust to minor stylistic UI component and layout changes.
- Finally, StoryCheck verifies blockchain transactions against expected results. A frontier model generates the expected results description and verifier code at ideation time. Since blockchain code is strictly deterministic, verifier code remains stable as long as the dApp user story intention remains stable.
StoryCheck's north star 🌟: becoming the default tool for Ethereum dApp e2e testing and documentation by blending frontier AI for creative/ideation phases with efficient local models for repetitive execution. This ensures developers save time on story generation (via collaborative Grok/Gemini/ChatGPT/Claude sessions) while enabling low-cost, automated CI/CD checks using efficient small AI models to boost ecosystem security and adoption.
When the project was founded in 2023, dApp developers were mainly focused on smart contract security. Since then, front end hacks have become more prominent, culminating in a $1.5B hack of Bybit, in which malicious JavaScript was injected into the Safe multisig front end. Almost all projects use that front end to manage their treasuries and were therefore vulnerable to the same hack. StoryCheck is now more relevant than ever, as serious institutional funds are pouring into the Ethereum ecosystem, even surpassing Bitcoin ETF inflows in July 2025.
To get a sense of the capability of modern local VLMs, try the following UI referencing playground
# Register a new ENS domain
This user story walks through an initial search for the domain name "storychecktest" and its successful registration on a mobile device.
## Prerequisites
1. Chain
- Id 1
- RPC https://lb.drpc.org/ogrpc?network=ethereum&dkey=****
- Block 23086523
2. Browser
- Pixel 7
## User Steps
1. Browse to https://app.ens.domains/
1. Click Accept
1. Click on search box
1. Type storychecktest
1. Press Enter
1. Click Connect
1. Click Browser Wallet
1. Scroll down
1. Select Ethereum payment method
1. Click Next
1. Click Skip Profile
1. Scroll down
1. Click Begin
1. Click "Open Wallet"
1. Wait 5 seconds
## Expected Results
- Verify commitment transaction succeeded [verifier](verifiers/tx_success.py)
- Verify commitment timestamp set [verifier](verifiers/commitment_timestamp.py)
- App should display 'Transaction Successful' [verifier](verifiers/ui_start_timer_ok.py)
The Prerequisites section sets conditions that let the test execute from a deterministic blockchain state, which in turn allows for predictable results. The currently supported top-level prerequisite is Chain, with Id as a required parameter and Block and RPC as optional ones. These parameters are passed to anvil to create a local EVM fork for the test run.
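As a rough illustration of how Chain parameters could map onto an anvil invocation, here is a minimal sketch. The helper name `anvil_fork_args` is hypothetical and is not taken from the StoryCheck code base; `--chain-id`, `--fork-url` and `--fork-block-number` are standard anvil flags. The RPC URL in the usage line is a placeholder.

```python
from typing import Optional


def anvil_fork_args(chain_id: int,
                    rpc_url: Optional[str] = None,
                    block: Optional[int] = None) -> list[str]:
    """Build an anvil command line from a Chain prerequisite (hypothetical helper)."""
    args = ["anvil", "--chain-id", str(chain_id)]
    if rpc_url is not None:
        # Fork state from the given RPC endpoint instead of starting empty.
        args += ["--fork-url", rpc_url]
        if block is not None:
            # Pin the fork to one block so every run starts from the same state.
            # A block number only makes sense together with a fork URL.
            args += ["--fork-block-number", str(block)]
    return args


print(" ".join(anvil_fork_args(1, "https://rpc.example.org", 23086523)))
```

Pinning the block is what makes the starting state reproducible; without it the fork would track whatever the latest block is at run time.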
By default each test starts with 10,000 ETH in the mock user wallet (same as anvil default test accounts).
In order to fund the mock wallet with other tokens (e.g. USDC, DAI, NFTs), the User Steps section of the story file should begin with prompts that initiate the funding via front end interactions (e.g. Uniswap flow for ETH/USDC swap).
Often Web3 Apps use front end libraries such as wagmi.sh to access current chain state. When that is the case, the user story should include the exact RPC URL used by the front end as a prerequisite. That allows StoryCheck to intercept all calls directed to the RPC and reroute towards the local chain fork. This is important to ensure that the app reads and writes from/to the local chain fork.
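The rerouting decision can be pictured with a small sketch. This is not StoryCheck's actual interception code, which is not shown here; the function name `reroute_rpc_url` is hypothetical, and the local address assumes anvil is listening on its default `http://127.0.0.1:8545`.

```python
def reroute_rpc_url(request_url: str, story_rpc_url: str,
                    local_rpc_url: str = "http://127.0.0.1:8545") -> str:
    """Return the URL a browser request should actually be sent to."""
    # Compare without a trailing slash so ".../rpc" and ".../rpc/" match.
    if request_url.rstrip("/") == story_rpc_url.rstrip("/"):
        # The app is talking to the RPC named in the story: use the local fork.
        return local_rpc_url
    # Anything else (assets, APIs, other RPCs) is left untouched.
    return request_url


print(reroute_rpc_url("https://rpc.example.org/", "https://rpc.example.org"))
```

The comparison is deliberately strict, which is why the story needs the exact RPC URL the front end uses: a request to any other endpoint would bypass the fork.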
The format of user steps in this section resembles the HOWTO documentation of a web3 app. Teams may use the same markdown in their documentation (e.g. GitBook, Notion, Docusaurus) and execute it with StoryCheck to make sure that the latest web app behavior is in sync with docs.
Each step in a user story is classified as an action prompt from the following set:
- `Browse` - prompts that start with `browse` and include a URL link to a web page are interpreted as browser navigation actions. For example `browse to https://app.uniswap.org`. For implementation details, see Playwright goto.
- `Click` - prompts that start with `click`, `tap`, or `select` followed by a natural language referring expression of a UI element are interpreted as click actions with the corresponding UI element target. For example `click on Submit button at the bottom` or `select logo next to ETH option`. For implementation details, see Playwright mouse click and RefExp GPT.
- `Type` - prompts that start with the keyword `type`, `input` or `enter` (case insensitive) followed by a string are interpreted as a keyboard input action. For example `Type 1000` or `Type MyNewDAO`. For implementation details, see Playwright type.
- `Scroll` - prompts that start with `scroll` followed by `up` or `down` are interpreted respectively as `Press PageUp` and `Press PageDown`.
- `Press` - prompts that start with `press` followed by a keyboard key code (`F1`-`F12`, `Digit0`-`Digit9`, `KeyA`-`KeyZ`, `Backquote`, `Minus`, `Equal`, `Backslash`, `Backspace`, `Tab`, `Delete`, `Escape`, `ArrowDown`, `End`, `Enter`, `Home`, `Insert`, `PageDown`, `PageUp`, `ArrowRight`, `ArrowUp`) are interpreted as a single key press action. For further details, see Playwright press.
- `Wait` - prompts that start with `wait` followed by a number and `seconds` or `minutes` result in a pause that is useful when the web3 app UI is awaiting blockchain confirmation.
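The keyword-based classification above can be sketched in a few lines of Python. This is an illustrative approximation, not StoryCheck's parser: the `Action` type and `classify` function are hypothetical names, and the sketch assumes that `scroll up` maps to `PageUp` and `scroll down` to `PageDown`.

```python
import re
from typing import NamedTuple, Optional


class Action(NamedTuple):
    kind: str      # browse, click, type, press or wait
    argument: str  # URL, referring expression, text, key name or seconds


# First keyword of a prompt -> action kind, following the list above.
KEYWORDS = {
    "browse": "browse",
    "click": "click", "tap": "click", "select": "click",
    "type": "type", "input": "type", "enter": "type",
    "scroll": "scroll",
    "press": "press",
    "wait": "wait",
}


def classify(prompt: str) -> Optional[Action]:
    """Map one user step to an action; returns None if it is not recognized."""
    words = prompt.strip().split(maxsplit=1)
    if not words:
        return None
    kind = KEYWORDS.get(words[0].lower())  # keywords are case insensitive
    rest = words[1].strip() if len(words) > 1 else ""
    if kind is None:
        return None
    if kind == "browse":
        match = re.search(r"https?://\S+", rest)
        return Action(kind, match.group(0)) if match else None
    if kind == "scroll":
        direction = rest.lower()
        if direction not in ("up", "down"):
            return None
        # Scrolling is expressed as a key press.
        return Action("press", "PageUp" if direction == "up" else "PageDown")
    if kind == "wait":
        match = re.fullmatch(r"(\d+)\s+(seconds?|minutes?)", rest.lower())
        if not match:
            return None
        factor = 60 if match.group(2).startswith("minute") else 1
        return Action(kind, str(int(match.group(1)) * factor))
    return Action(kind, rest)


for step in ["Browse to https://app.ens.domains/", "Click Accept", "Scroll down", "Wait 5 seconds"]:
    print(step, "->", classify(step))
```

Only the first word decides the action kind, so `Press Enter` is a key press while `Enter rETH in search bar` is keyboard input. For click actions the remaining text is passed on unchanged, since resolving it to a UI element is the job of the AI model rather than the parser.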
The Expected Results section provides several ways to check the results of running a user story.
StoryCheck performs a transaction snapshot check similar to Jest snapshot matching.
The first time a test is run, all write transactions going through window.ethereum are recorded and saved. Subsequent runs must match these write transactions.
If there is a mismatch, then one of three changes took place in the UI under test:
- Developers changed the frontend code in a significant way. This warrants a careful code review and update of the user stories.
- There is malicious injected code that changes the behavior of the app. A big red alert is in order! App infrastructure is compromised: hosting providers, third party libraries, or build tools.
- There is a bug in some of the third party dependencies that affects UI behavior. Developer attention required to track down and fix the root cause.
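The record-then-compare behavior can be sketched as follows. This is a simplified model under stated assumptions: the function name `check_tx_snapshot` and the transaction fields (`to`, `value`, `data`) are illustrative, and the real `tx_snapshot.json` layout may differ.

```python
import json
import tempfile
from pathlib import Path


def check_tx_snapshot(snapshot_file: Path, write_txs: list[dict]) -> bool:
    """Save the snapshot on the first run, compare against it afterwards."""
    if not snapshot_file.exists():
        # First run: record the transactions as the reference snapshot.
        snapshot_file.write_text(json.dumps(write_txs, indent=2, sort_keys=True))
        return True
    saved = json.loads(snapshot_file.read_text())
    # Later runs: the same writes must appear in the same order.
    return saved == write_txs


with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "tx_snapshot.json"
    tx = {"to": "0x0000000000000000000000000000000000000001", "value": "0x0", "data": "0x"}
    print(check_tx_snapshot(snapshot, [tx]))                      # first run records
    print(check_tx_snapshot(snapshot, [tx]))                      # identical run
    print(check_tx_snapshot(snapshot, [{**tx, "value": "0x1"}]))  # changed write
```

A strict equality check like this is what makes a mismatch meaningful: any difference in target, value or calldata has to be explained by one of the three causes listed above.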
Sometimes web3 apps use randomization (e.g. ENS registration commit step), which makes snapshots different on each run. In this case, and generally when more advanced verification is needed, StoryCheck allows custom verifiers. Frontier models are usually very good at understanding the intention of user stories and at suggesting and implementing custom verifiers. As long as the intended behavior of a user story remains stable in relation to onchain transactions, the verifier code also remains robust. Blockchain code is inherently a lot more robust and stable across app iterations than UI code.
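To give a feel for what a custom verifier might check, here is a minimal sketch. The interface StoryCheck passes to verifiers is not described here, so the function names and the idea of receiving a list of receipt dictionaries are assumptions. The `status` field itself is standard: Ethereum transaction receipts report `0x1` for success and `0x0` for a revert.

```python
def tx_succeeded(receipt: dict) -> bool:
    """True if a transaction receipt reports success."""
    status = receipt.get("status")
    # JSON-RPC encodes status as a hex string ("0x1"); some clients decode it to int.
    if isinstance(status, str):
        return int(status, 16) == 1
    return status == 1


def verify_all_succeeded(receipts: list[dict]) -> bool:
    """Pass only if there is at least one receipt and none of them reverted."""
    return bool(receipts) and all(tx_succeeded(r) for r in receipts)


print(verify_all_succeeded([{"status": "0x1"}, {"status": 1}]))
print(verify_all_succeeded([{"status": "0x1"}, {"status": "0x0"}]))
```

A check like this ignores randomized calldata entirely, which is why it stays valid for flows such as the ENS commit step where a byte-for-byte snapshot would differ on every run.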
Verifiers are referenced in the Expected Results section of a story.md file and saved under the /verifiers subdirectory.
├─ astory/
│
├─story.md
│
├─/verifiers
StoryCheck has the following high level workflow:
- StoryCheck users leverage their favorite IDE and frontier model (Grok, Gemini, ChatGPT, Claude) to prepare verifiable user stories in sync with project docs in a markdown format that is easy to parse and execute via steps 2-3 below.
- Ideation prompt template available at ai_story_ideation_template.md
- Stories are executed and verified in three stages:
  a. Prerequisites: prepare local context with EVM fork, virtual web browser and mock crypto wallet.
  a. User steps: UI commands are parsed with a local AI model and run through the virtual browser ("Click on the Connect button", "Type 20 in the Sell text field", "Enter rETH in search bar").
  a. Expected Results: finally, verifiers inspect app and chain state.
flowchart TD
A[User Story] -->|check| B(StoryCheck)
B --> |parse| C[Markdown Parser]
B -->|play| D[Browser Driver / playwright]
D -->|locate UI element| E[AI Model]
D -->|sign tx| F[Mock Wallet / EIP1193Bridge]
F -->|blockchain tx| G[Local EVM Fork / anvil]
├─ .\ — "Main StoryCheck python app."
│ │
│ ├─ markdown — "Markdown parser. Outputs abstract syntax tree (AST) to interpreter."
│ │
│ ├──┬─ interpreter — "Runtime engine which takes AST as input and executes it."
│ │ │
│ ├──┼──┬─ browser — "Playwright browser driver."
│ │ │ │
│ │ │ └─ mock_wallet — "JavaScript mock wallet provider injected in playwright page context as Metamask."
│ │ │
│ │ ├─ ai — "RefExp GPT AI model that predicts UI element location based on natural language referring expressions."
│ │ │
│ │ └─ blockchain — "Local EVM fork runtime via Foundry Anvil."
│ │
│ └─ examples — "Example user stories."
To set up the environment locally using uv (a fast Python package manager), run the following script. This creates a virtual environment with Python 3.10, installs dependencies from requirements.txt, sets up Playwright, and builds the mock wallet.
chmod +x setup_env.sh
./setup_env.sh
source .venv/bin/activate
StoryCheck can be run as a shell command or as a web service.
$>./storycheck.sh --help
usage: StoryCheck by GuardianUI [-h] [-o OUTPUT_DIR] [--serve] storypath
Parses and executes user stories written in markdown format.
positional arguments:
storypath Path to the user story dir (e.g. mystory/).
options:
-h, --help show this help message and exit
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
Directory where all results from the storycheck run will be stored. Defaults to "results"
--serve Run as a web service. Defaults to "False".
Copyright(c) guardianui.com 2023
For example, to run a check of mystory/, use:
./storycheck.sh mystory/
If all story checks / tests pass, the command will return with exit code 0. Otherwise, if any test fails or other errors occur, the exit code will be non-zero.
This makes it possible to use storycheck in shell scripts or CI scripts.
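For scripts written in Python rather than shell, the exit code can be consumed the same way. The sketch below only assumes what the text states, namely that exit code 0 means every check passed; the helper names `run_check` and `gate` are hypothetical.

```python
import subprocess
import sys


def run_check(command: list[str]) -> bool:
    """Run a command and report whether it exited with code 0."""
    completed = subprocess.run(command)
    return completed.returncode == 0


def gate(command: list[str]) -> None:
    """Stop the surrounding script with a non-zero code if the check fails."""
    if not run_check(command):
        sys.exit(1)


# Example use in a larger script:
# gate(["./storycheck.sh", "mystory/"])
```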
The output directory of a test run is either specified via the --output-dir command line argument or defaults to ./results. It contains a number of helpful artifacts for debugging:
├─ ./results — "Main output directory for an input story file."
│ │
│ ├─ storycheck.log — "Consolidated log file between test runner, browser and EVM."
│ │
│ ├─ tx_snapshot.json — "Snapshot of all blockchain write transactions."
│ │
│ ├─ videos/ — "Video recordings of browser interactions."
│ │
│ ├─ screenshots/ — "Browser screenshot for every prompt in the User Steps section."
│ │
│ ├─ anvil-out.json — "Configuration for the anvil EVM fork."
│ │
│ ├─ trace.zip — "Session trace for Playwright Trace Viewer."
│ │
You can integrate StoryCheck into your project's CI/CD workflow using the GitHub Action. This runs story checks automatically on pushes/pull requests, gating merges on pass/fail and uploading artifacts for review.
Example workflow (add to .github/workflows/storycheck.yml in your repo):
name: StoryCheck CI
on:
  push:
    branches: [ main, dev ]
  pull_request:
    branches: [ main, dev ]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run StoryCheck
        id: storycheck # Needed so that steps.storycheck.outputs resolves below
        uses: GuardianUI/storycheck@v0.1.0 # Replace with your tagged version
        with:
          storypath: 'examples/sporosdao' # Path to your story dir/file (relative to your repo)
          output-dir: 'storycheck-results' # Optional: Custom output dir
      - name: Check if passed
        if: ${{ steps.storycheck.outputs.passed != 'true' }}
        run: echo "StoryCheck failed!" && exit 1
If your web3 app needs to run locally during the check (e.g., for testing against a dev server), use these optional inputs:
- `start-command`: Command to start the app (e.g., `yarn workspace @my-app start`).
- `wait-on`: URL to poll until ready (e.g., `http://localhost:3000`).
- `wait-on-timeout`: Timeout in seconds (default: 60).
- `app-working-directory`: Directory for the start command (relative to your repo, default: `.`).
The action starts the app in the background, waits for the URL to respond with 200 OK, runs StoryCheck (ensure your story.md uses local URLs like Browse to http://localhost:3000/), then stops the app. App logs are captured in app.log and uploaded as part of artifacts.
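The wait step described above amounts to a polling loop. The sketch below shows the general idea only; it is not the action's implementation, and the names `http_status` and `wait_on` as well as the injectable `probe` parameter are illustrative choices.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str) -> Optional[int]:
    """Return the HTTP status of a GET request, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but not with a success status
    except (urllib.error.URLError, OSError):
        return None  # nothing is listening yet


def wait_on(url: str, timeout_s: float = 60.0, interval_s: float = 1.0,
            probe: Callable[[str], Optional[int]] = http_status) -> bool:
    """Poll url until it answers 200 OK or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url) == 200:
            return True
        if time.monotonic() + interval_s > deadline:
            return False  # the next attempt would land past the deadline
        time.sleep(interval_s)
```

Passing the probe as a parameter keeps the loop testable without a running server, for example `wait_on("http://localhost:5173", probe=lambda url: 200)`.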
Example workflow to test a simple web3 app that sends ETH to a user-entered ENS name or address:
name: StoryCheck Simple Web3 App
on:
  push:
    branches: [main, dev]
  pull_request:
    branches: [main, dev]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install pnpm dependencies
        run: pnpm install
        working-directory: examples/simple-web3-app
      - name: Run StoryCheck
        id: storycheck # Needed so that steps.storycheck.outputs resolves below
        uses: GuardianUI/storycheck@v0.1.2 # Replace with the latest stable tagged version
        with:
          storypath: 'examples/simple-web3-app'
          start-command: 'pnpm dev'
          wait-on: 'http://localhost:5173'
          wait-on-timeout: '60'
          app-working-directory: 'examples/simple-web3-app'
      - name: Check if passed
        if: ${{ steps.storycheck.outputs.passed != 'true' }}
        run: echo "StoryCheck failed!" && exit 1
This example runs StoryCheck on the simple-web3-app example, which tests a web3 app that connects a wallet, allows entering an ENS name or address (e.g., vitalik.eth) and an ETH amount (e.g., 0.01), and sends the transaction. The user story (examples/simple-web3-app/story.md) includes steps like:
- Browse to http://localhost:5173/
- Click Connect Wallet
- Type vitalik.eth in the ENS name or address input
- Type 0.01 in the ETH amount input
- Click Send ETH
- Wait 5 seconds
- Transaction succeeded with 0.01 ETH sent to vitalik.eth resolving to 0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045 - verifier
The action starts the Vite dev server (pnpm dev), waits for http://localhost:5173, runs the story, and automatically uploads artifacts (including app.log and results/). The job fails if StoryCheck doesn’t pass.
Thanks for your interest in contributing!
Please start with a new discussion before opening an Issue or Pull Request.