The Mobilint NPU LLM Inference Demo Container provides a fully integrated, ready-to-run environment for executing various large language models (LLMs) locally on Advantech’s edge AI devices equipped with Mobilint’s ARIES-powered MLA100 MXM AI accelerator module.
This edge LLM demo container features a user-friendly web-based GUI that allows users to select from a list of pre-compiled LLMs without any command-line configuration. It is designed for quick evaluation and demonstration of ARIES’s NPU acceleration in real-world LLM workloads.
All required runtime components and model binaries are preloaded to ensure a smooth out-of-the-box experience. Users can test different models and parameters from the GUI without editing configuration files or entering commands.
- Browser-based GUI – Model selection and inference execution from a single dashboard
- Pre-compiled model set – Includes INT8-quantized LLMs
- Optimized runtime library – Hardware-accelerated inference for ARIES NPUs, with Python and C++ backend integration for extended development
All LLM metrics were measured using NVIDIA’s GenAI-Perf, with 240 input tokens and 10 output tokens per request.
| Model | Time To First Token (ms) | Output Token Throughput Per User (tokens/sec/user) |
|---|---|---|
| c4ai-command-r7b-12-2024 | 4,667.31 | 4.58 |
| EXAONE-3.5-2.4B-Instruct | 963.86 | 14.23 |
| EXAONE-4.0-1.2B | 329.37 | 31.62 |
| EXAONE-Deep-2.4B | 886.35 | 13.03 |
| HyperCLOVAX-SEED-Text-Instruct-1.5B | 435.50 | 22.46 |
| Llama-3.1-8B-Instruct | 4,430.71 | 5.81 |
| Llama-3.2-1B-Instruct | 430.56 | 30.73 |
| Llama-3.2-3B-Instruct | 1,218.22 | 12.16 |
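For reference, a run like the ones above could be reproduced with a GenAI-Perf invocation roughly like the sketch below. The model name and endpoint URL are placeholders, and exact flags vary across GenAI-Perf versions:

```bash
# Hedged sketch, not the exact command used for the table above.
# Model name and URL are placeholders; token counts match the 240/10 setup.
genai-perf profile \
  -m Llama-3.2-1B-Instruct \
  --url localhost:8000 \
  --endpoint-type chat \
  --synthetic-input-tokens-mean 240 \
  --output-tokens-mean 10 \
  --streaming
```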
The container is designed to demonstrate the Mobilint NPU’s local LLM capabilities as embedded in the AIR-310, Advantech’s edge AI hardware. Other compatible hosts include:
- Mobilint MLA100 Low Profile PCIe Card
- Docker Engine ≥ 28.2.2
- Mobilint SDK modules
- Pre-compiled Mobilint-compatible LLM binaries (.mxq)
- Mobilint ARIES NPU Driver
- NOTE: To access the files and modules, please contact tech-support@mobilint.com.
- To verify device recognition, run the following command in the terminal:

  ```bash
  ls /dev | grep aries0
  ```

  If the output includes `aries0`, the device is recognized by the system.

- For Debian-based operating systems, verify driver installation by running:

  ```bash
  dpkg -l | grep aries-driver
  ```

  If the output contains information about `aries-driver`, the device driver is installed.
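If you prefer a single check, both commands can be combined into a small script (a minimal sketch based on the two checks above):

```bash
#!/usr/bin/env bash
# Verify the ARIES device node and driver package (Debian-based hosts).
if ls /dev | grep -q aries0; then
  echo "OK: aries0 device node found"
else
  echo "MISSING: aries0 device node not found"
fi

if dpkg -l | grep -q aries-driver; then
  echo "OK: aries-driver package installed"
else
  echo "MISSING: aries-driver package not installed"
fi
```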
```
├── backend
│   └── src
└── frontend
    ├── app
    │   └── components
    └── public
        └── fonts
```
- Mobilint Runtime Library (latest stable release)
- Web-based GUI frontend (Next.js based)
- Python LLM server backend (socket.io based)
Follow the official Docker installation guide.
After installation, add your user to the docker group by following the Linux post-installation steps.
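For reference, the standard post-installation commands from the Docker documentation are:

```bash
# Create the docker group (it may already exist) and add your user to it
sudo groupadd docker
sudo usermod -aG docker $USER
# Activate the new group membership without logging out
newgrp docker
```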
Create the Docker network used by the demo:

```bash
docker network create mblt_int
```

Then build and start the containers:

```bash
docker compose build
docker compose up
```

This demo was originally designed for single-user demonstration purposes.
However, you can enable multi-user operation by setting the production environment variable. To do this, copy `backend/src/.env.example` to `backend/src/.env` and set `PRODUCTION="True"`.
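A minimal sketch, assuming `.env.example` already defines a `PRODUCTION` entry:

```bash
cp backend/src/.env.example backend/src/.env
# Flip the production flag; adjust the pattern if the variable is formatted differently
sed -i 's/^PRODUCTION=.*/PRODUCTION="True"/' backend/src/.env
```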
In production mode, a model change does not take effect immediately. Instead, the server automatically loads the requested model for each LLM request as needed.
You can change the list of available LLMs by editing `backend/src/models.txt`. These changes are applied when the server is restarted, as shown below.
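For example (the exact file format is not documented here; check the shipped `models.txt` before editing):

```bash
# Edit the model list, then restart the stack so the new list is loaded
nano backend/src/models.txt
docker compose down
docker compose up -d
```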
You can change the system prompts without any Docker rebuild by editing `backend/src/system.txt` and `backend/src/inter-prompt.txt`. The changes are applied when the conversation is reset.
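For example (the prompt text below is purely illustrative):

```bash
# Overwrite the system prompt; it takes effect once the conversation is reset in the GUI
echo "You are a concise assistant running locally on an ARIES NPU." > backend/src/system.txt
```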
To run the demo in the background:

```bash
docker compose up -d
```

To stop it:

```bash
docker compose down
```

- From the GUI, select a model from the list.
- Interact with the loaded LLM as needed.
- To troubleshoot unexpected errors, please contact tech-support@mobilint.com.
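When reporting an issue, attaching recent container logs can speed up diagnosis:

```bash
# Capture the last 200 log lines from all demo containers
docker compose logs --tail 200 > demo-logs.txt
```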
Advantech and Mobilint have partnered to bring advanced deep learning applications, including large language models (LLMs), multimodal AI, and advanced vision workloads, fully to the edge.
Advantech’s industrial edge hardware, integrating Mobilint’s NPU AI accelerators, provides high-throughput, low-latency inference without cloud dependency.
Preloaded and validated on Advantech systems, Mobilint’s NPU enables immediate deployment of optimized AI applications across industries, including manufacturing, smart infrastructure, robotics, healthcare, and autonomous systems.
Copyright © 2025 Mobilint, Inc. All rights reserved.
Provided “as is” without warranties.

