SmartResume can run without external APIs in two ways: by pointing its extraction channels at a local vLLM server that exposes an OpenAI-compatible endpoint, or by loading the Transformers model directly from disk.
- Install vLLM and download the resume model:

  ```bash
  pip install vllm
  python scripts/download_models.py
  ```
- Launch the server (port 8001 is used by default in the config):

  ```bash
  python -m vllm.entrypoints.openai.api_server \
      --model ./models/Qwen3-0.6B \
      --port 8001 \
      --host 0.0.0.0 \
      --tensor-parallel-size 1
  ```
- Update `configs/config.yaml` so that the extraction channels point to the local endpoint (a verification sketch follows this list):

  ```yaml
  channels:
    local_qwen:
      name: "models/Qwen3-0.6B"
      api_url: "http://localhost:8001/v1"
      api_key: "local"

  extract_channels:
    basic_info: "local_qwen"
    work_experience: "local_qwen"
    education: "local_qwen"
  ```
- Run the parser as usual:

  ```bash
  python scripts/start.py --file resume.pdf
  ```
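Before the first real run, it can be worth checking that the server and the channel settings agree. The sketch below is not part of SmartResume; it uses only the Python standard library and assumes the vLLM server from the launch step is listening on port 8001. Keep in mind that vLLM serves the model under the exact `--model` path it was started with, so the id the server reports is what the config's `name` field presumably has to match.

```python
# Minimal sanity check for the local vLLM endpoint (not part of SmartResume).
# Assumes the server from the launch step is running on http://localhost:8001.
import json
import urllib.request

BASE = "http://localhost:8001/v1"

# 1. List the served models; the id should line up with `name` in config.yaml.
with urllib.request.urlopen(f"{BASE}/models") as resp:
    served = [m["id"] for m in json.load(resp)["data"]]
print("served models:", served)

# 2. Send a tiny chat completion using the same settings as the `local_qwen`
#    channel. vLLM accepts any bearer token unless one was configured at
#    startup, so the "local" key from config.yaml works here.
payload = {
    "model": served[0],  # use the id the server actually reports
    "messages": [{"role": "user", "content": "Reply with OK."}],
    "max_tokens": 8,
}
req = urllib.request.Request(
    f"{BASE}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json", "Authorization": "Bearer local"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```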
If you prefer to load the Transformers model directly, enable direct mode in the same config file:
```yaml
use_direct_models: true
direct_model_name: "models/Qwen3-0.6B"
```

When `use_direct_models` is true, SmartResume first attempts to load the model from disk and falls back to the configured channels or remote API if necessary.
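Before flipping the flag, a quick smoke test can confirm that the downloaded checkpoint actually loads. This sketch uses the plain Hugging Face Transformers API rather than SmartResume's own loader, and `model_dir` is just the `direct_model_name` value from the config:

```python
# Smoke test: load the local checkpoint with plain Transformers.
# This only checks that the files are intact; SmartResume performs its
# own loading internally when use_direct_models is true.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "models/Qwen3-0.6B"  # the direct_model_name from config.yaml

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
print(type(model).__name__, "loaded with", model.num_parameters(), "parameters")
```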
The Python API is the same in either mode:

```python
from smartresume import ResumeAnalyzer

analyzer = ResumeAnalyzer(init_ocr=True, init_llm=True, config_path="configs/config.yaml")

result = analyzer.pipeline(
    cv_path="resume.pdf",
    resume_id="resume_001",
    extract_types=["basic_info", "work_experience", "education"],
)
```

No extra arguments are required; the behavior is entirely driven by the YAML configuration.
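To keep the output around for inspection, the result can be dumped to disk. This assumes `pipeline()` returns a JSON-serializable structure such as a dict of extracted fields, which is an assumption here rather than a documented guarantee:

```python
# Persist the extraction output for later inspection. Assumes `result`
# (from the snippet above) is JSON-serializable; adjust if pipeline()
# returns a custom object instead.
import json

with open("resume_001.json", "w", encoding="utf-8") as f:
    json.dump(result, f, ensure_ascii=False, indent=2)
```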