
Serve app as a web app and provide a max tokens option within sampling #8

Draft
ichim-david wants to merge 4 commits into stevibe:main from ichim-david:david-main


@ichim-david

This PR isn't really meant to be merged: I'm sure it would fail your quality bar, and I still need to do some cleanup before it would be merge-ready. I wanted to open it just to see whether any of these features make sense for your repo.

Currently it does two things:

  1. Introduces a max_tokens option. With verbose models such as qwen 27, a few of my benches failed
    because the response got cut off (Mac mini M4 Pro 64GB, where qwen is quite slow at 6-11 tps).
    When I bumped the limit to 2048 tokens or more for the hermes benchmark, some benches turned green.
  2. Serves the app logic as a web app as well. For users who run inference on a desktop while working on a laptop, a web app is easier to consume than a desktop app.
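
To illustrate the first point, here is a minimal sketch of how a max_tokens setting could be threaded through sampling and how a cut-off response could be detected. It assumes an OpenAI-compatible chat completion API, where a response that hits the limit reports `finish_reason: "length"`. The names `SamplingOptions`, `buildChatRequest`, `wasTruncated` and the 2048 default are hypothetical and are not taken from this PR's code.

```typescript
// Hypothetical types for an OpenAI-compatible chat completion API.
// These names are assumptions for illustration, not the PR's actual code.
interface SamplingOptions {
  temperature?: number;
  max_tokens?: number;
}

interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  temperature?: number;
  max_tokens: number;
}

interface ChatChoice {
  message: { content: string };
  finish_reason: string | null;
}

// Assumed default, based on the 2048 figure mentioned in the description.
const DEFAULT_MAX_TOKENS = 2048;

// Build a request body, falling back to the default output budget
// when the caller does not set max_tokens.
function buildChatRequest(
  model: string,
  prompt: string,
  sampling: SamplingOptions = {}
): ChatRequest {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    ...(sampling.temperature !== undefined
      ? { temperature: sampling.temperature }
      : {}),
    max_tokens: sampling.max_tokens ?? DEFAULT_MAX_TOKENS,
  };
}

// A choice that stopped because it hit the token limit was cut off,
// so a failed bench may be a budget problem rather than a model problem.
function wasTruncated(choice: ChatChoice): boolean {
  return choice.finish_reason === "length";
}
```

Surfacing `wasTruncated` in the results would let a bench distinguish "wrong answer" from "ran out of tokens", which is the failure mode described above.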

ichim-david and others added 4 commits May 2, 2026 11:19
…es which broke some good results for models that are verbose
- Add Fastify server (app/src/server/) with REST API at /api/*
- Add SSE event streaming at /api/events/sse for run events, mutation progress, verifier progress
- Add in-process SSE bus and active run manager
- Extract Electron-free helpers (themes, app-metadata, models) for server use
- Add HTTP + SSE API client (app/src/renderer/src/api/client.ts) replacing IPC bridge
- Adapt App.tsx: window.benchlocal.* -> bl.*, IPC listeners -> SSE, remove update/detached-logs UI
- Add Vite web config (vite.config.web.ts) for renderer-only builds
- Add npm scripts: web:dev, web:build, web:start
- Add fastify, @fastify/static, tsx, esbuild, concurrently dependencies
- Stub removed Electron features (updates, logs, onOpenAbout/onOpenSettings) for backward compatibility
- Web server runs on port 4300 (configurable via BENCHLOCAL_PORT)
- Moved path resolution logic to a new module for better separation of concerns.
- Updated app-metadata.ts to utilize new path resolution functions.
- Enhanced error handling for license and package.json loading.
- Improved server initialization in index.ts to check for renderer output directory.
- Refactored SSE route handling for better response management.
- Simplified theme loading logic in themes.ts by using path resolution functions.
- Updated Vite configuration to use environment variables for ports and output directories.
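
The in-process SSE bus and the configurable port listed above can be sketched roughly as follows. This is only an illustration under assumptions: `SseBus`, `formatSseFrame` and `resolvePort` are hypothetical names, and the real code in app/src/server/ may be structured differently. The frame layout (`event:` and `data:` lines terminated by a blank line) follows the Server-Sent Events format.

```typescript
type Listener = (frame: string) => void;

// Format one Server-Sent Events frame: an event name plus a JSON payload.
// JSON.stringify without indentation emits no newlines, so one data line suffices.
function formatSseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical in-process bus: route handlers subscribe a writer per client,
// and run/mutation/verifier code publishes events to all of them.
class SseBus {
  private listeners = new Set<Listener>();

  // Register a listener and return a function that unregisters it,
  // intended to be called when the client connection closes.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Send one formatted frame to every current listener.
  publish(event: string, data: unknown): void {
    const frame = formatSseFrame(event, data);
    this.listeners.forEach((listener) => listener(frame));
  }

  get size(): number {
    return this.listeners.size;
  }
}

// Read the port from BENCHLOCAL_PORT, falling back to 4300 when the
// variable is unset or not a valid TCP port number.
function resolvePort(
  env: Record<string, string | undefined>,
  fallback = 4300
): number {
  const parsed = Number(env["BENCHLOCAL_PORT"]);
  return Number.isInteger(parsed) && parsed > 0 && parsed < 65536
    ? parsed
    : fallback;
}
```

In a server, a handler for /api/events/sse would typically call `bus.subscribe((frame) => reply.raw.write(frame))` and invoke the returned function on disconnect, while `resolvePort(process.env)` would feed the listen call.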
@ichim-david
Author

[Image: web-app]

Works surprisingly well given the macOS look :)

Feel free to close this PR anytime and implement anything, or nothing, from this work if it doesn't make sense for you.
