Built-in llama-cpp provider via inline ExtensionFactory #4823
Open
julien-c wants to merge 6 commits into
Co-Authored-By: julien-agent <Agents+cyolo@huggingface.co>
Co-Authored-By: julien-agent <Agents+cyolo@huggingface.co>
Add ExtensionFactoryEntry union so inline factories can carry a path, thread "<built-in:llama-cpp>" through the resource loader, and strip the <built-in:NAME> wrapping in interactive-mode label helpers. Co-Authored-By: julien-agent <Agents+cyolo@huggingface.co>
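The commit above could be pictured roughly as follows. This is only a sketch: the PR excerpt names `ExtensionFactoryEntry` and the `<built-in:NAME>` wrapping but does not show their definitions, so the union's shape and the helper names `entryPath` and `displayLabel` are assumptions made for illustration.

```typescript
// Hypothetical shapes: the real ExtensionFactory and ExtensionFactoryEntry
// types live in pi and are not shown in this excerpt.
type ExtensionFactory = (api: unknown) => void | Promise<void>;

// An entry is either a bare factory or a factory paired with a path,
// so an inline factory can report where it "came from".
type ExtensionFactoryEntry =
  | ExtensionFactory
  | { factory: ExtensionFactory; path: string };

// Returns the path carried by an entry, if any (hypothetical helper).
function entryPath(entry: ExtensionFactoryEntry): string | undefined {
  return typeof entry === "function" ? undefined : entry.path;
}

// Matches a whole string of the form <built-in:NAME>.
const BUILT_IN_WRAPPER = /^<built-in:([^>]+)>$/;

// Strips the <built-in:NAME> wrapping for display; any other path is
// returned unchanged (hypothetical label helper).
function displayLabel(path: string): string {
  const match = BUILT_IN_WRAPPER.exec(path);
  return match?.[1] ?? path;
}

const llamaEntry: ExtensionFactoryEntry = {
  factory: () => {},
  path: "<built-in:llama-cpp>",
};
console.log(displayLabel(entryPath(llamaEntry) ?? "(inline)"));
```

With this sketch, an ordinary file path passes through `displayLabel` untouched, so only built-in entries get the shortened label.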
…p list" This reverts commit 8d92537.
…veat
Built-in slash commands like /model are intercepted by the interactive
editor before emitInput runs, so the pi.on("input") handler never sees
them. Surface failures via ctx.ui.notify when ctx is available, and
leave a comment pointing at a future "model_selector_open" event as
the proper trigger to refresh the model list.
Co-Authored-By: julien-agent <Agents+cyolo@huggingface.co>
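A minimal sketch of the "notify when ctx is available" behaviour described in this commit message. The `NotifyContext` interface, the `notify` signature, and the `reportRefreshFailure` helper are hypothetical; only the `ctx.ui.notify` call path is taken from the commit text.

```typescript
// Hypothetical minimal shape of the context object; the real type is
// defined by pi and is not shown in this excerpt.
interface NotifyContext {
  ui: { notify(message: string, level?: "info" | "warning" | "error"): void };
}

// Surfaces a model-refresh failure through the UI when a context is
// available. Returns true if the user was notified, false otherwise.
function reportRefreshFailure(err: unknown, ctx?: NotifyContext): boolean {
  if (!ctx) return false; // no UI to notify, e.g. outside interactive mode
  const message = err instanceof Error ? err.message : String(err);
  ctx.ui.notify(`llama-cpp: model refresh failed: ${message}`, "error");
  return true;
}
```

Returning a boolean lets the caller decide whether to fall back to logging when no UI context exists; that fallback is a design assumption, not something the commit states.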
julien-c (Contributor, Author):
Hi @badlogic, does this approach of hooking in a built-in provider for llama-cpp (to make local models as seamless as remote ones 🙏), based on a few possible env vars, look reasonable to you, or not really? (cc @hanouticelina, with whom we've worked on this)
Collaborator:
i'll need a few days to find time to think about and review this. directionally, i think it is right. not sure about env var detection yet. could be a setting instead.
Built-in llama-cpp provider, activated when any `LLAMA_*` env var is set (`LLAMA_BASE_URL`, `LLAMA_CACHE`, or `LLAMA_ARG_*`). Shipped as an inline `ExtensionFactory` so it can use the existing extension event hooks.

Discovers models from `${baseUrl}/models` at startup and on `/model`, then refines per-model `contextWindow` via `${server}/props` on first use.

Test plan
- No `LLAMA_*` set: factory not instantiated, pi unchanged.
- `LLAMA_BASE_URL` + running `llama-server`: provider appears, models list, streams.
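The activation check and the two discovery steps in the description can be sketched as pure functions. This is an illustration under stated assumptions, not the PR's code: all function names are hypothetical, empty values are treated as unset, the `/models` response is assumed to follow the OpenAI-style `{ data: [{ id }] }` shape, `${server}` is assumed to be the base URL without a trailing `/v1`, and the context size is assumed to sit at `default_generation_settings.n_ctx` in the `/props` response. The HTTP calls themselves are left out.

```typescript
type Env = Record<string, string | undefined>;

// True when one of the variables named in the description is set.
// Assumption: an empty string counts as "not set".
function isLlamaEnvSet(env: Env): boolean {
  return Object.entries(env).some(([key, value]) => {
    if (value === undefined || value === "") return false;
    return (
      key === "LLAMA_BASE_URL" ||
      key === "LLAMA_CACHE" ||
      key.startsWith("LLAMA_ARG_")
    );
  });
}

interface DiscoveredModel {
  id: string;
  contextWindow?: number; // filled in later from /props
}

// Extracts model ids from a `${baseUrl}/models` response body.
// Assumption: the body looks like { data: [{ id: string }, ...] }.
function parseModelList(body: unknown): DiscoveredModel[] {
  const data = (body as { data?: unknown } | null)?.data;
  if (!Array.isArray(data)) return [];
  return data
    .filter(
      (m): m is { id: string } =>
        typeof m === "object" && m !== null && typeof m.id === "string",
    )
    .map((m) => ({ id: m.id }));
}

// Derives the server root used for `${server}/props`.
// Assumption: the base URL may end in /v1, which the server root lacks.
function serverRootFromBaseUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "").replace(/\/v1$/, "");
}

// Reads the context size from a `${server}/props` response body.
// Assumption: it is exposed as default_generation_settings.n_ctx.
function parseContextWindow(props: unknown): number | undefined {
  const nCtx = (
    props as { default_generation_settings?: { n_ctx?: unknown } } | null
  )?.default_generation_settings?.n_ctx;
  return typeof nCtx === "number" && nCtx > 0 ? nCtx : undefined;
}
```

Keeping the parsing separate from the network calls makes the first test-plan item easy to check: when `isLlamaEnvSet` returns false, none of the other functions need to run.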