streaming-llm

Here are 3 public repositories matching this topic...

FonaTech / Project_Chronos

⚡ Zero-Stall MoE Inference via Lookahead Prediction & Async DMA Prefetching. Optimized for SSD I/O with Hybrid MLA+Sliding Window Attention.

open-source artificial-intelligence lora high-throughput open-models mixture-of-experts llm generative-ai large-language-model streaming-llm predictive-inference sliding-window-attention io-latency-hiding async-dma ssd-offloading lookahead-routing mla-attention dual-layer-moe

Updated Apr 26, 2026
Python

Mattral / SimulatedSelf

A real-time, browser-native digital twin: your webcam drives a 3D humanoid that mirrors your pose, reads your emotion, and chats back through a streaming Llama 3.1 voice pipeline.

threejs human-robot-interaction emotion-recognition digital-twin voice-interface ai-avatar streaming-llm browser-ai real-time-3d-pose-estimation

Updated Jun 5, 2026
TypeScript

SandyCompetent / exeter_academic_agent

Agentic AI assistant powered by Google Gemini 2.5, with streaming LLM output, multi-tool data routing, and cross-platform Flutter deployment.

android-application webapp gemini-api flutter-app prompt-engineering generative-ai streaming-llm llm-agents google-gemini agentic-ai llm-integration model-routing

Updated Mar 19, 2026
Dart
