
NIC-MLA


Matroshka Logging Archive — a universal single-file container for logging data from measurement stations. Both the data and the log live in one portable file, readable across platforms from an 8-bit microcontroller to a PC.

One file, one format, one way to read it — pull the card out of the device, plug it into a computer, and you have everything. No zoo of formats.

Full format specification: DESIGN-MLA.md

Logging several station types into one file (datalogger / repeater): DESIGN-MLA-datalogger.md

Key features

  • One file = data + log. Two streams grow toward each other: data forward from the start of the file (right after the prefix), the log backward from the end.
  • Dumb container. MLA only stores bytes. All the brains (compression, encryption, station-number translation, LoRa/Wi-Fi) live in a separate glue layer — MLA stays small and never gets in the way.
  • Tiny 16 B log record, fully CRC-protected. No "flags outside the CRC" trick: abandon a record by overwriting it with zeros — its CRC then fails and readers skip it.
  • Crash-safe. "LOCK first, DATA second" commit protocol + CRC16 (CCITT-FALSE). After a reset the last record either verifies (carry on) or is zeroed and the space reclaimed. No on-disk search tree to corrupt.
  • Self-describing. The prefix carries a SCHEMA table (8-char field names + units → ready for CSV/SQL export with no prior knowledge) and a STATION table (the 1-byte station index in each log record → the real station number).
  • Small for a microcontroller. The ATmega328 (2 KB RAM) only writes; no dynamic allocation, largest buffer 32 B. Searching and reading happen on the host.
  • File rotation. When one file fills up, the next is started; large volumes = many smaller files, the host reads them as a whole.
  • 32-bit addressing → a single file up to 4 GB (beyond that, rotation).
  • Optional compression. The container only flags compressed data (one compressed bit in the record) and tracks kf_back (the distance back to the owning keyframe). It does not define the compression method: the codec and keyframe details live in the data block's own header (e.g. NIC-DMD).
  • Filesystem-independent. Access through a thin HAL (4 functions); FAT16 / FAT32 / exFAT / NTFS / ext4 are handled by the layer beneath it (the OS, SdFat or FatFs).
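
The CRC and the "overwrite with zeros" rule from the list above can be illustrated with a minimal sketch. The CRC parameters are the standard CRC-16/CCITT-FALSE ones (polynomial 0x1021, initial value 0xFFFF). The helper name `record_is_valid`, the position of the CRC in the last 2 bytes, and its little-endian byte order are assumptions made for illustration; DESIGN-MLA.md is the authority on the real layout.

```python
def crc16_ccitt_false(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def record_is_valid(record: bytes) -> bool:
    """Assumed layout: 14 covered bytes followed by a 2-byte little-endian CRC."""
    if len(record) != 16:
        return False
    stored = int.from_bytes(record[14:], "little")
    return crc16_ccitt_false(record[:14]) == stored


# A record with a matching CRC verifies.
body = bytes(range(14))
good = body + crc16_ccitt_false(body).to_bytes(2, "little")
print(record_is_valid(good))

# An abandoned record (all 16 bytes zeroed) does not verify: with a non-zero
# initial value, the CRC of 14 zero bytes is never 0x0000.
print(record_is_valid(bytes(16)))
```

Because the initial value is non-zero, an all-zero record can never carry a matching CRC, which is why zeroing works as an "abandon" marker without a separate flag.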

File layout

offset 0                                                              EOF
┌──────────────────┬───────────────────┬────────────────┬───────────────┐
│ PREFIX           │ DATA  stream  →   │   free  0xFF   │   ← LOG stream│
│ 1–255 sectors    │ (grows up)        │                │ (grows down)  │
│ (512 B each)     │                   │                │               │
└──────────────────┴───────────────────┴────────────────┴───────────────┘
  • Prefix: a 34 B header + the SCHEMA and STATION tables, covered by a CRC16 in its last 2 bytes. Normally one 512 B sector; it grows in whole sectors (up to 255 ≈ 127 KB) only if the tables need it.
  • Data block: MAGIC(2) + payload(1..65535) + CRC16(2)
  • Log record (16 B), all CRC-covered: offset, timestamp, subsec (two opaque bytes, meaning owned by the glue), length, flags (bit 7 = compressed, bits 0-6 = kf_back, where kf_back = 0 marks a keyframe), station (1-byte index), CRC16.
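
As a rough illustration of the log record fields and the flags byte, here is a sketch using Python's `struct` module. The field order follows the list above, but the field widths (4-byte offset and timestamp, 2-byte length) and the little-endian byte order are assumptions chosen so that the fields plus the CRC16 add up to 16 bytes. The helper names are hypothetical and not part of nic_mla.

```python
import struct

# Assumed widths: offset u32, timestamp u32, subsec 2 raw bytes, length u16,
# flags u8, station u8 = 14 bytes; the trailing CRC16 brings the record to 16.
RECORD_BODY = struct.Struct("<II2sHBB")
CRC_SIZE = 2

COMPRESSED_BIT = 0x80   # bit 7
KF_BACK_MASK = 0x7F     # bits 0-6


def make_flags(compressed: bool, kf_back: int) -> int:
    """Build the flags byte; kf_back = 0 means the record is a keyframe."""
    if not 0 <= kf_back <= KF_BACK_MASK:
        raise ValueError("kf_back must fit in 7 bits")
    return (COMPRESSED_BIT if compressed else 0) | kf_back


def split_flags(flags: int) -> tuple:
    """Return (compressed, kf_back) from a flags byte."""
    return bool(flags & COMPRESSED_BIT), flags & KF_BACK_MASK


# Example values only: a compressed record three records after its keyframe.
body = RECORD_BODY.pack(512, 1_700_000_000, b"\x00\x00", 3, make_flags(True, 3), 1)
print(len(body) + CRC_SIZE)                    # total record size
print(split_flags(RECORD_BODY.unpack(body)[4]))
```

Keeping kf_back in the low 7 bits means a reader can walk back at most 127 records to find the owning keyframe.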

Repository structure

| Path | Contents |
|---|---|
| nic_mla.py | Python reference core (format / mount / append / read / scan / recover) |
| nic_mla_archive.py | Python: file rotation (MlaArchive) + host-side query (mla_query) |
| tools/mla_schema.py | Build/read the SCHEMA + STATION tables; decode payloads for CSV/SQL |
| nic_mla_test.py | Test suite (Python) |
| c/ | C libraries: write-only (MCU) + complete (ARM/PC) + HAL adapters |
| DESIGN-MLA.md | Format design specification |

Quick start — Python

import time

from nic_mla import MlaCore, MlaPosixHAL

timestamp = int(time.time())   # assumption: timestamps are Unix seconds

# First run (creates a 1 MB file pre-filled with 0xFF)
hal = MlaPosixHAL.create("log.mla")
with hal:
    mla = MlaCore(hal)
    mla.format()
    mla.append(timestamp, station=1, data=b"\x01\x02\x03")   # station = table index

# Later runs: mount() restores the state; iteration reads records
with MlaPosixHAL("log.mla") as hal:
    mla = MlaCore(hal)
    mla.mount()
    for rec, payload in mla:
        ...

Rotation across multiple files and filtering:

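from nic_mla_archive import MlaArchive, mla_query

# ts, payload, t0 and t1 are supplied by the caller.
with MlaArchive("/data") as arch:          # MLA00000.MLA, MLA00001.MLA, …
    arch.append(ts, station=1, data=payload)

# Query inside a with-block so the archive is closed afterwards.
with MlaArchive("/data") as arch:
    for rec, data in mla_query(arch, station=1, time_from=t0, time_to=t1):
        ...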
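
The file names in the comment above suggest a five-digit running number. The sketch below shows how the next name in such a sequence could be chosen. The pattern is inferred only from the two names shown, and `next_archive_name` is a hypothetical helper; how MlaArchive actually picks names is not described here.

```python
import re

# Pattern inferred from "MLA00000.MLA, MLA00001.MLA" (an assumption).
NAME_RE = re.compile(r"^MLA(\d{5})\.MLA$")


def next_archive_name(existing_names) -> str:
    """Return the name following the highest-numbered existing archive file."""
    numbers = []
    for name in existing_names:
        match = NAME_RE.match(name)
        if match:
            numbers.append(int(match.group(1)))
    # An empty directory starts the sequence at MLA00000.MLA.
    return f"MLA{max(numbers, default=-1) + 1:05d}.MLA"


print(next_archive_name([]))
print(next_archive_name(["MLA00000.MLA", "MLA00001.MLA", "notes.txt"]))
```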

Self-describing file (schema + station tables → ready for CSV/SQL export):

from nic_mla import MlaCore, MlaPosixHAL
from mla_schema import MlaSchemaBuilder, MlaStationTable, mla_read_schema, \
                       mla_read_stations, mla_decode_payload, mla_split_station

# ts, temp and hum are supplied by the caller; temp and hum are raw integers
# (with exp10=-1, presumably tenths of a unit).
sb = MlaSchemaBuilder()
sb.data("temp", unit="degC", width=2, exp10=-1, signed=True)
sb.data("hum",  unit="pct",  width=2, exp10=-1)
st = MlaStationTable()
st.station(region=55, number=25000)          # log index 1 → this station

hal = MlaPosixHAL.create("log.mla")
with hal:
    mla = MlaCore(hal)
    mla.format(schema_table=sb.table(), station_table=st.table())
    payload = temp.to_bytes(2, "little", signed=True) + hum.to_bytes(2, "little")
    mla.append(ts, station=1, data=payload)

# Any reader recovers names, units and the real station number — no prior knowledge:
with MlaPosixHAL("log.mla") as hal:
    mla = MlaCore(hal)
    mla.mount()
    pfx = mla._prefix.to_bytes()
    _, fields = mla_read_schema(pfx)
    stations = mla_read_stations(pfx)
    for rec, data in mla:
        region, number, _ = mla_split_station(stations[rec.station - 1])
        cols = mla_decode_payload(fields, data)   # [(name, unit, value), …]
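
To make the width / exp10 / signed parameters in the schema above more concrete, here is a standalone sketch of fixed-point encoding. It assumes that a field's physical value equals the stored integer times 10^exp10 and that fields are little-endian, as the `to_bytes` calls in the example suggest. `encode_field` and `decode_field` are hypothetical helpers, not part of mla_schema.

```python
from decimal import Decimal


def encode_field(value: str, width: int, exp10: int, signed: bool = False) -> bytes:
    """Scale a decimal value by 10**-exp10 and store it as a little-endian integer."""
    raw = int(Decimal(value).scaleb(-exp10).to_integral_value())
    return raw.to_bytes(width, "little", signed=signed)


def decode_field(raw_bytes: bytes, exp10: int, signed: bool = False) -> Decimal:
    """Inverse of encode_field: stored integer times 10**exp10."""
    raw = int.from_bytes(raw_bytes, "little", signed=signed)
    return Decimal(raw).scaleb(exp10)


# -5.2 degC and 61.5 pct with exp10=-1 are stored as the integers -52 and 615.
payload = encode_field("-5.2", 2, -1, signed=True) + encode_field("61.5", 2, -1)
print(payload.hex())

temp = decode_field(payload[0:2], -1, signed=True)
hum = decode_field(payload[2:4], -1)
print(temp, hum)
```

Using `Decimal` rather than `float` keeps the round trip exact, which matters when the decoded values are written to CSV or SQL.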

Tests:

python3 nic_mla_test.py

Quick start — C

Two libraries share one format definition (c/nic_mla_format.h):

  • write-only (c/nic_mla_write.{h,c}) — for the ATmega / small Arduinos,
  • complete (c/nic_mla.{h,c}) — for ARM Arduino / PC (+ read, query, recover).

You wire the HAL (4 functions) to your filesystem. Ready-made adapters in c/hal/:

| Platform | "Beneath the HAL" | Adapter |
|---|---|---|
| Raspberry Pi / PC (SSD, SD, USB) | OS: ext4 / exFAT / NTFS / FAT32 / FAT16 | hal/nic_mla_hal_posix.{h,c} |
| Arduino AVR / ESP / STM32duino | SdFat | examples/atmega_sd_writeonly.ino |
| STM32 bare-metal (CubeIDE/HAL) | FatFs (ChaN) | hal/nic_mla_hal_fatfs.{h,c} |

Build and test on a PC:

cd c
cc -std=c99 -Wall -Wextra -O2 nic_mla_test.c nic_mla.c nic_mla_write.c \
   hal/nic_mla_hal_posix.c -o mlatest
./mlatest

See c/README.md.

Notes for integrators

  • Station names are not in the file. The STATION table stores only 6 raw bytes per station; what they mean (region / number / city / …) is decided by your glue layer, which keeps its own mapping "6 bytes → meaning". The log carries just a 1-byte index — translating it to a real station number is the glue's job, not the container's.
  • The subsec field is two opaque bytes the glue owns. MLA assigns them no meaning and never touches them. The name reads both ways on purpose: sub-second time and sub-section (e.g. a rotation / section index). Your glue may use the 2 bytes (0..65535) as one 16-bit value, as two independent bytes, or for several things at once — e.g. high byte = section/rotation, low byte = a sub-second tick so MLA handles sampling well above 1 Hz (e.g. seismic). Set it to 0 when unused.
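
One possible glue-side convention for subsec, following the "high byte = section, low byte = sub-second tick" example above, might look like the sketch below. It is only an illustration: the function names are hypothetical, the 1/256 s tick resolution is an assumption, and MLA itself imposes none of this.

```python
def pack_subsec(section: int, tick: int) -> int:
    """Combine a section index (high byte) and a sub-second tick (low byte)."""
    if not 0 <= section <= 0xFF:
        raise ValueError("section must fit in one byte")
    if not 0 <= tick <= 0xFF:
        raise ValueError("tick must fit in one byte")
    return (section << 8) | tick


def unpack_subsec(value: int) -> tuple:
    """Split a 16-bit subsec value back into (section, tick)."""
    return value >> 8, value & 0xFF


def tick_from_fraction(fraction: float) -> int:
    """Map a fraction of a second in [0, 1) to a 1/256 s tick (assumed resolution)."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return int(fraction * 256)


# Section 3, sample taken half a second past the timestamp.
value = pack_subsec(3, tick_from_fraction(0.5))
print(hex(value), unpack_subsec(value))
```

With one byte for the tick, samples within the same second can be ordered at up to 256 Hz; a glue layer that needs finer resolution could instead use all 16 bits for the tick.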

Data transport (LoRa / network)

Out of scope — the container is storage, not transport. Each record is self-contained (offset + length + flags + CRC), so sending it over LoRa/network means "take the record's bytes and send them". The project leaves the transport choice to the user.

Status

Both the Python and C reference implementations are complete and tested, and they produce byte-for-byte identical files (a file written by the C library is read by Python and vice versa).

License

MIT License — Copyright (c) 2026 NIC — Native Intellect Community


Acknowledgements

Thanks to my brother for advice during the development of this project, and to the AI assistants Claude (Anthropic) and Gemini (Google) for technical assistance with code optimisation.

★ Viva La Resistánce ★