mortar

minimal fill-in-the-middle autocomplete for vs code, backed by a remote llama.cpp or openai-compatible server.

the middle that holds your code together.

prerequisites

a llama.cpp server (for infill) or an openai-compatible endpoint, either self-hosted or from a cloud provider.

for best results, use a FIM-compatible model like qwen3-coder-30b-a3b-instruct.

minimal config quickstart: llama-server --fim-qwen-30b-default --port 7860

otherwise, set up the server manually, e.g. with unsloth quants.

configure

easiest path: click the $(sparkle) mortar status bar item and pick Run setup. this prompts for the endpoint and then fetches the model list from the server.

or set two things in vs code settings directly:

mortar.endpoint — base url, e.g. http://localhost:7860
mortar.model — model name, e.g. qwen3-coder-30b-a3b-instruct:Q8_0

optional:

mortar.apiKey — bearer token.
mortar.mode — auto (default, probes /infill then falls back), infill, openai.
mortar.openaiPromptTemplate — FIM template used only in openai mode. default is qwen-style <|fim_prefix|>{prefix}{prompt}<|fim_suffix|>{suffix}<|fim_middle|>.
mortar.statusBar — show the status bar indicator (default true).
mortar.ringNChunks — max chunks from recently-edited buffers to pass as extra context (default 16, 0 disables).
mortar.prefetch — speculatively fetch the next completion after each accept (default false).
mortar.statusBarTimings — show completion timings inline in the status bar text (default false; timings always available in the tooltip).
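
to make the mortar.openaiPromptTemplate placeholders concrete, here is a minimal sketch of how such a template could be filled. renderFimTemplate and FimParts are hypothetical names, not the extension's actual code, and the split between prefix (text before the cursor line) and prompt (the cursor line up to the cursor) is an assumption.

```typescript
// Default template quoted from the mortar.openaiPromptTemplate setting above.
const DEFAULT_TEMPLATE =
  "<|fim_prefix|>{prefix}{prompt}<|fim_suffix|>{suffix}<|fim_middle|>";

// Hypothetical shape; how mortar really splits the buffer is an assumption.
interface FimParts {
  prefix: string; // text before the cursor line (assumed)
  prompt: string; // cursor line up to the cursor (assumed)
  suffix: string; // text after the cursor
}

// Fill all placeholders in a single pass. A function replacer is used so that
// source code containing "{suffix}" or "$&" is inserted literally and never
// re-expanded.
function renderFimTemplate(template: string, parts: FimParts): string {
  return template.replace(
    /\{(prefix|prompt|suffix)\}/g,
    (_match: string, key: string) => parts[key as keyof FimParts]
  );
}

const rendered = renderFimTemplate(DEFAULT_TEMPLATE, {
  prefix: "function add(a, b) {\n",
  prompt: "  return ",
  suffix: "\n}\n",
});
```

the single-pass replace matters because the text being inserted is arbitrary user code and may itself contain placeholder-like strings.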

build

make deps
make compile

press F5 in vs code to run the extension (launches the extension development host).

commands

Mortar: Trigger Completion — force a fresh suggestion at the cursor.
Mortar: Toggle Automatic Completion — enable/disable automatic triggers.
Mortar: Run Setup — endpoint + model selection.
Mortar: Configure Endpoint — just the endpoint.
Mortar: Select Model — fetch /v1/models and pick one.
Mortar: Show Menu — quick-pick of the above (also bound to the status bar click).
Mortar: Open Settings — jump to the extension's settings pane.
Mortar: Accept Next Line of Completion / Mortar: Accept Next Word of Completion — partial-accept wrappers for keybindings.
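
as an illustration of what Mortar: Select Model has to do, here is a rough sketch of reading model ids from a /v1/models response. it assumes the usual openai list shape ({ data: [{ id: ... }] }). modelsUrl, parseModelIds, listModels and the injected FetchJson type are all hypothetical names, not the extension's real api.

```typescript
// Hypothetical transport hook, injected so the sketch has no runtime dependencies.
type FetchJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

// Build the model-list URL from a base endpoint, tolerating trailing slashes.
function modelsUrl(endpoint: string): string {
  return endpoint.replace(/\/+$/, "") + "/v1/models";
}

// Extract string ids from an OpenAI-style list response; anything malformed is skipped.
function parseModelIds(body: unknown): string[] {
  if (typeof body !== "object" || body === null) return [];
  const data = (body as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  const ids: string[] = [];
  for (const entry of data) {
    const id = (entry as { id?: unknown } | null)?.id;
    if (typeof id === "string") ids.push(id);
  }
  return ids;
}

// Fetch and parse the list; the bearer header is only sent when a key is set.
async function listModels(
  endpoint: string,
  fetchJson: FetchJson,
  apiKey?: string
): Promise<string[]> {
  const headers: Record<string, string> = {};
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return parseModelIds(await fetchJson(modelsUrl(endpoint), headers));
}
```

in an extension the resulting ids would typically feed a quick-pick. that wiring is left out here.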

tests

make test           # unit tests only

integration tests are skipped unless you point them at a live server:

MORTAR_TEST_ENDPOINT — base url, e.g. http://localhost:7860 (required).
MORTAR_TEST_MODEL — model name the server should load for the run (optional if the server only has one model).

MORTAR_TEST_ENDPOINT=http://localhost:7860 \
MORTAR_TEST_MODEL=qwen3-coder-30b-a3b-instruct:Q8_0 \
  make test
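
the skip-unless-configured behaviour could be gated roughly like this. it is a sketch only: integrationConfig is a hypothetical helper, and the real test suite may decide differently.

```typescript
interface IntegrationConfig {
  endpoint: string;
  model?: string;
}

// Returns undefined when no live server is configured, which a test runner
// could treat as "skip the integration suite". The env map is passed in
// explicitly (e.g. process.env in Node) to keep the helper pure.
function integrationConfig(
  env: Record<string, string | undefined>
): IntegrationConfig | undefined {
  const endpoint = env["MORTAR_TEST_ENDPOINT"]?.trim();
  if (!endpoint) return undefined; // required variable missing or blank
  const model = env["MORTAR_TEST_MODEL"]?.trim();
  return model ? { endpoint, model } : { endpoint }; // model is optional
}
```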

protocols

mortar speaks two wire protocols:

native llama.cpp /infill — preferred. the server applies the right FIM tokens for the loaded model.
openai /v1/completions — fallback. the extension wraps the prompt in a FIM template client-side and uses stop strings.
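
for orientation, a hedged sketch of how the two protocols differ on the wire. the /infill fields (input_prefix, input_suffix, prompt, n_predict) and the /v1/completions fields (model, prompt, max_tokens, stop) are standard parameters of those endpoints, but which of them mortar actually sends is an assumption here. the helper names and the example stop strings are hypothetical.

```typescript
type ConfiguredMode = "auto" | "infill" | "openai";
type WireMode = "infill" | "openai";

interface FimContext {
  prefix: string; // text before the cursor line (assumed split)
  prompt: string; // cursor line up to the cursor (assumed split)
  suffix: string; // text after the cursor
}

// "auto" prefers native /infill and falls back to the openai protocol
// when the probe fails; explicit modes are used as-is.
function pickMode(configured: ConfiguredMode, infillProbeOk: boolean): WireMode {
  if (configured !== "auto") return configured;
  return infillProbeOk ? "infill" : "openai";
}

// Native llama.cpp /infill: raw pieces go to the server, which applies the
// FIM tokens of whatever model is loaded.
function infillBody(ctx: FimContext, nPredict: number) {
  return {
    input_prefix: ctx.prefix,
    input_suffix: ctx.suffix,
    prompt: ctx.prompt,
    n_predict: nPredict,
  };
}

// OpenAI /v1/completions: the client has already wrapped the context in a
// FIM template, and stop strings cut the output off.
function completionsBody(
  model: string,
  renderedPrompt: string,
  maxTokens: number,
  stop: string[]
) {
  return { model, prompt: renderedPrompt, max_tokens: maxTokens, stop };
}

// Example stop strings for a qwen-style template (an assumption, not mortar's list).
const exampleStops = ["<|fim_prefix|>", "<|fim_suffix|>", "<|fim_middle|>"];
```

the practical difference is where model-specific knowledge lives: with /infill the server owns the FIM tokens, while with /v1/completions the client's template has to match the model.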

see DESIGN.md for the exact request shapes.

reference implementations

llama.vscode
llama.vim
