This guide complements README.md and the other reference pages by pointing you to first steps when configure/build/install/test flows misbehave.
- Check the toolchain. Run `cmake --version` and `clang++ --version` / `g++ --version` so CMake can emit a clear error about C++20 support. Upgrade to at least CMake 3.22 if you see `CMAKE_CXX_STANDARD` complaints.
- Clean stale state. Remove the stale `build/` directory (or `build-full/` if you were using CUDA/ROCm) before rerunning `cmake -S . -B build -DT81LIB_BUILD_TESTS=ON`. Problems that look like “header not found” often come from mixing generator outputs.
- Missing dependency hints. When configure mentions `pybind11` or other dependencies, confirm you ran `pip install -U pip setuptools wheel` (needed before `pip install ".[torch]"`) and avoid mixing system and virtualenv interpreters.
- Enable diagnostics. Build with `cmake --build build --target t81lib --config RelWithDebInfo -j` and capture the verbose log; redirect it into `build/verb.log` if the console output scrolls too fast.
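As a minimal sketch of the toolchain check above (assuming only that `cmake` prints its version in the usual `cmake version X.Y.Z` banner format), the CMake 3.22 floor can be verified automatically:

```python
import re
import shutil
import subprocess

MIN_CMAKE = (3, 22)  # minimum this guide recommends for clean C++20 diagnostics

def parse_version(text: str):
    """Pull the first major.minor pair out of a --version banner."""
    m = re.search(r"(\d+)\.(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None

def cmake_ok() -> bool:
    """True when cmake is on PATH and at least MIN_CMAKE."""
    if shutil.which("cmake") is None:
        return False
    banner = subprocess.run(["cmake", "--version"],
                            capture_output=True, text=True).stdout
    version = parse_version(banner)
    return version is not None and version >= MIN_CMAKE

if __name__ == "__main__":
    print("cmake ok" if cmake_ok() else "install or upgrade to CMake >= 3.22")
```

The same pattern works for `clang++`/`g++` banners if you want to script the whole preflight.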
- Virtual environment isolation. Activate the same interpreter you used for `cmake` (check `python -m site`) before running `pip install ".[torch]"`. The binding build looks at the interpreter’s include/lib paths.
- Make sure pybind11 is fresh. The repo ships `python/bindings.cpp` and uses the `pybind11` submodule; a stale pip package can cause symbol mismatches. Run `pip install -U pybind11` inside the venv if `ImportError: undefined symbol` occurs.
- pipx helpers. For CLI helpers installed via `pipx`, re-run `pipx inject t81lib torch transformers accelerate datasets safetensors` after `git pull` so the optional dependencies stay aligned with your working tree.
- Confirm the binding works. Run:

  ```shell
  python - <<'PY'
  import t81lib
  print("t81lib", t81lib.__version__)
  PY
  ```

  Any errors here usually mean the `.so`/`.dylib` was built against a different Python than the one in `PATH`.
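To see which interpreter and which extension file are actually in play, a small diagnostic like the following can help (the `json` stand-in is only there to keep the sketch runnable everywhere; substitute `t81lib` in a real session):

```python
import importlib.util
import sys
import sysconfig

def module_origin(name: str):
    """Return the file `import name` would load from, or None if not importable."""
    spec = importlib.util.find_spec(name)
    return getattr(spec, "origin", None) if spec else None

# Swap "json" for "t81lib" in a real session; if the printed module path does
# not live under sys.prefix, pip and cmake likely used different interpreters.
print("interpreter :", sys.executable)
print("include dir :", sysconfig.get_paths()["include"])
print("module from :", module_origin("json"))
```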
- Not found / wrong version. If `command not found` appears or the CLI reports a stale version, ensure your `PATH` includes the venv’s `bin/` directory or the pipx shim (`~/.local/bin`). Reinstall via `pip install ".[torch]"` or `pipx reinstall t81lib`.
- Validation failures. The `--validate` flag reruns `gguf.read_gguf`. If it fails, rerun with `--validate --verbose` to surface metadata problems, and verify the GGUF file with `llama.cpp` tooling (`gguf_validate`, `gguf_to_gguf`).
- Progress bar missing? The progress reporting relies on `tqdm`; install it (`pip install tqdm`) if the CLI skips bars or prints raw percentages.
- Meta device / accelerate offload errors. When converting large Hugging Face checkpoints with the default `device_map=auto`, Accelerate may place many layers onto disk/meta. If `t81 convert`/`t81 gguf` (or the legacy `t81-convert`/`t81-gguf` scripts) later tries to call `.to("cpu")` you’ll hit `NotImplementedError: Cannot copy out of meta tensor` or `RuntimeError: You can't move a model that has some modules offloaded to cpu or disk.` Always rerun with `--force-cpu-device-map` or `--device-map none/cpu` so the checkpoints stay in host RAM, and set `ACCELERATE_DISABLE=1` or `HF_ACCELERATE_DISABLE=1` before launching the CLI so no accelerate hooks re-enable offloading. This makes every `nn.Linear` serializable and avoids the meta-device save failure that occurs after the “Some parameters are on the meta device” log.
- Large GGUF conversions. Extremely large ternary bundles (Gemma 3.x / Llama 3.x) may exhaust RAM with older readers because the whole file was loaded before parsing. The new `t81.gguf.read_gguf` implementation parses metadata, tensor infos, and tensor payloads directly from the file handle, seeks to each sorted tensor offset, and never slices the entire bundle into memory. If you still hit memory pressure or Matplotlib font-cache warnings, define `MPLCONFIGDIR=$PWD/data/cache/matplotlib` and `FONTCONFIG_PATH=$PWD/data/cache/fontconfig`, prefer `--force-cpu-device-map`, and keep `ACCELERATE_DISABLE=1`/`HF_ACCELERATE_DISABLE=1` set before rerunning the CLI so every tensor stays on the CPU.
- GPU PTQ fallback. `t81.torch.TernaryTensor.from_float` currently quantizes on CPU; when your model lives on GPU it will warn, move tensors to CPU for PTQ, then return outputs to the original device. Keep enough host RAM available and avoid meta/offload tensors (`device_map=auto`) if you plan to run PTQ PPL or short QAT loops.
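The streaming behaviour described for `t81.gguf.read_gguf` can be illustrated with a toy container format. This is NOT the real GGUF layout, only the access pattern: record `(offset, size)` entries up front, then seek to each payload instead of reading the whole file into memory.

```python
import io
import struct

def write_toy(fh, payloads):
    """Write [count][offset,size]*count followed by the raw payloads."""
    fh.write(struct.pack("<I", len(payloads)))
    offset = 4 + 8 * len(payloads)          # payloads start right after the table
    for p in payloads:
        fh.write(struct.pack("<II", offset, len(p)))
        offset += len(p)
    for p in payloads:
        fh.write(p)

def read_toy(fh):
    """Read payloads one at a time by seeking to each recorded offset."""
    (count,) = struct.unpack("<I", fh.read(4))
    infos = [struct.unpack("<II", fh.read(8)) for _ in range(count)]
    out = []
    for off, size in sorted(infos):         # sorted offsets -> forward-only seeks
        fh.seek(off)
        out.append(fh.read(size))           # only one payload in memory at a time
    return out

buf = io.BytesIO()
write_toy(buf, [b"alpha", b"beta"])
buf.seek(0)
print(read_toy(buf))  # → [b'alpha', b'beta']
```

With a real on-disk file handle the same pattern keeps peak memory at roughly one tensor payload rather than the whole bundle.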
- `ctest` fails. Rebuild with `cmake --build build --target t81-tests` and rerun `ctest --test-dir build --output-on-failure`. Capture `tests/unit/test_output.txt` (if created) as part of the diagnostics.
- Python tests fail. From the repo root:

  ```shell
  python -m pytest tests/python/test_bindings.py
  ```

  Pass `-k <pattern>` to narrow things down or `-vv` for the full stack trace. A virtualenv mismatch shows up as `ModuleNotFoundError: No module named 't81lib'`.
- Benchmark hangs. The Fashion-MNIST benchmark (`scripts/ternary_quantization_benchmark.py`) logs latency/compression; if it stalls, check for GPU dispatch issues by setting `USE_CUDA=OFF`/`USE_ROCM=OFF` and rerunning with `T81LIB_DISABLE_NEON=1`.
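A hypothetical triage helper (not part of t81lib) can separate the two import failure modes above — an interpreter that simply lacks the module versus a binding that is found but fails to load against the wrong ABI:

```python
import importlib

def diagnose_import(name: str) -> str:
    """Rough triage for binding import failures."""
    try:
        importlib.import_module(name)
        return "ok"
    except ModuleNotFoundError:
        return "not installed in this interpreter (virtualenv mismatch?)"
    except ImportError as exc:
        return f"found but failed to load (ABI/symbol mismatch?): {exc}"

# In a real session, call diagnose_import("t81lib"); "json" keeps this runnable.
print(diagnose_import("json"))
```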
- CUDA/ROCm kernels not used. Pass `-DUSE_CUDA=ON` or `-DUSE_ROCM=ON` at configure time and make sure the CUDA/ROCm toolkits are visible in `CUDA_HOME`/`ROCM_PATH`. Use `cmake -S . -B build -DUSE_CUDA=ON -DT81LIB_BUILD_TESTS=ON` to rebuild the Python extension with the dispatch layer.
- NEON overridden. If you want to avoid runtime SIMD dispatch (e.g., cross-compiling for unknown hardware), define `T81_DISABLE_NEON=1` in `CXXFLAGS` before configuring.
- GPUs missing metadata. The helpers rely on `t81::TensorMetadata`. If you see `unsupported dtype` while operating on NumPy/Torch tensors, confirm you installed the versions documented in `pyproject.toml` (e.g., `pip install ".[torch]"`) so the binding and Torch/NumPy builds stay aligned.
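Before reconfiguring with `-DUSE_CUDA=ON`/`-DUSE_ROCM=ON`, a quick, purely informational printout (the variable names are the ones this guide mentions; the compiler names are standard toolkit binaries) shows whether the GPU toolchains are discoverable at all:

```python
import os
import shutil

def gpu_toolchain_report() -> dict:
    """Collect the toolkit env vars and compiler locations, None when absent."""
    report = {var: os.environ.get(var) for var in ("CUDA_HOME", "ROCM_PATH")}
    report["nvcc"] = shutil.which("nvcc")    # CUDA compiler driver
    report["hipcc"] = shutil.which("hipcc")  # ROCm compiler driver
    return report

for key, value in gpu_toolchain_report().items():
    print(f"{key}: {value or '<unset / not found>'}")
```

If everything prints `<unset / not found>`, configure will not pick up the dispatch layer no matter which `-DUSE_*` flags you pass.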
- Docs/site builds broken. The doc site uses `mkdocs`. Run `pip install mkdocs mkdocstrings` and verify `mkdocs serve` works from the repo root before pushing updates that reference new pages.
- New files not picked up. After adding source/header files, rerun `cmake --build build` (don’t forget `cmake -S . -B build --refresh-cache` if you moved files between directories) and rerun tests.
- Need to surface diagnostics. Capture logs via `script -q -c "cmake --build build && ctest --test-dir build --output-on-failure" test-output/ci.log` so you can share the raw output when reporting issues.
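When `script` is unavailable (for example on Windows), a small Python stand-in — hypothetical, not shipped with t81lib — can capture a command’s combined output to the same kind of log file:

```python
import pathlib
import subprocess
import sys

def run_logged(cmd, log_path) -> int:
    """Run cmd, write its stdout+stderr to log_path, return the exit code."""
    log = pathlib.Path(log_path)
    log.parent.mkdir(parents=True, exist_ok=True)
    proc = subprocess.run(cmd, capture_output=True, text=True)
    log.write_text(proc.stdout + proc.stderr)
    return proc.returncode

# Trivial example command; substitute the cmake/ctest invocation above.
code = run_logged([sys.executable, "-c", "print('build log line')"],
                  "test-output/ci.log")
print("exit code:", code)
```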
- Review the `index.md` portal for deep dives on Python, Torch, CLI, and hardware flows.
- If you hit a reproducible failure, open an issue referencing the log from `build/test-output` and the exact `cmake` command you used.