This repository was archived by the owner on May 1, 2026. It is now read-only.

Configuration

benzsevern edited this page Mar 29, 2026 · 1 revision

infermap.yaml

Override scorer weights and extend aliases:

scorers:
  FuzzyNameScorer:
    weight: 0.2
  LLMScorer:
    enabled: false

aliases:
  mrn: [medical_record_number, patient_id, chart_number]
  npi: [provider_id, national_provider_identifier]
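To illustrate what extending aliases means for name matching, here is a minimal sketch in plain Python. It is not infermap's implementation: the `build_alias_index` and `canonical_name` helpers are hypothetical, and the normalization rule (trim whitespace, lowercase) is an assumption made for the example.

```python
def build_alias_index(aliases):
    """Map every alias, and the canonical name itself, to the canonical name."""
    index = {}
    for canonical, names in aliases.items():
        index[canonical.strip().lower()] = canonical
        for name in names:
            # Assumed normalization: trim whitespace and lowercase.
            index[name.strip().lower()] = canonical
    return index


def canonical_name(column, index):
    """Return the canonical field for a column name, or None if unknown."""
    return index.get(column.strip().lower())


# Same alias table as the YAML above, written as a Python dict.
ALIASES = {
    "mrn": ["medical_record_number", "patient_id", "chart_number"],
    "npi": ["provider_id", "national_provider_identifier"],
}

ALIAS_INDEX = build_alias_index(ALIASES)
print(canonical_name("Patient_ID", ALIAS_INDEX))  # mrn
```

A reverse index keeps each lookup constant-time regardless of how many aliases are configured.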

Pass the file to the engine:

import infermap

engine = infermap.MapEngine(config_path="infermap.yaml")

Saved Mapping Configs

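# Save the inferred mapping (result comes from an earlier mapping run)
result.to_config("mapping.yaml")

# Reload later without re-running inference, then apply it to new data
result = infermap.from_config("mapping.yaml")
remapped = result.apply(new_df)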

Generated YAML format:

version: "1"
mappings:
  - source: fname
    target: first_name
    confidence: 0.95
  - source: email_addr
    target: email
    confidence: 0.97
unmapped_source:
  - internal_ref
unmapped_target: []
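As a rough picture of what applying a saved mapping involves, the sketch below renames the keys of plain Python dicts using the structure shown above. It is an illustration only, not how `result.apply` is implemented: `rename_map`, `apply_mapping`, and the `keep_unmapped` flag are hypothetical, and dropping unmapped source columns by default is an assumption.

```python
# The generated YAML above, written as the dict a YAML loader would produce.
SAVED_CONFIG = {
    "version": "1",
    "mappings": [
        {"source": "fname", "target": "first_name", "confidence": 0.95},
        {"source": "email_addr", "target": "email", "confidence": 0.97},
    ],
    "unmapped_source": ["internal_ref"],
    "unmapped_target": [],
}


def rename_map(config):
    """Build a {source_column: target_column} dict from the saved mappings."""
    return {m["source"]: m["target"] for m in config["mappings"]}


def apply_mapping(rows, config, keep_unmapped=False):
    """Rename mapped keys in each row.

    Assumption: unmapped source columns are dropped unless keep_unmapped is True.
    """
    renames = rename_map(config)
    out = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if key in renames:
                new_row[renames[key]] = value
            elif keep_unmapped:
                new_row[key] = value
        out.append(new_row)
    return out


rows = [{"fname": "Ada", "email_addr": "ada@example.com", "internal_ref": "x1"}]
print(apply_mapping(rows, SAVED_CONFIG))
# [{'first_name': 'Ada', 'email': 'ada@example.com'}]
```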

Schema Definition Files

fields:
  - name: email
    type: string
    aliases: [email_address, e_mail, contact_email]
    required: true
  - name: phone
    type: string
    aliases: [telephone, tel, mobile]

Use the schema file as an overlay:

infermap.map("src.csv", "tgt.csv", schema_file="schema.yaml")

Required Fields

Required fields come from three sources, which are unioned:

  1. required=["email"] on map()
  2. --required email on CLI
  3. required: true in schema files

With --strict, the CLI exits with code 1 if any required field is unmapped.
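The union and the strict check can be sketched in a few lines of plain Python. This is a sketch of the rule as described here, not infermap's code: `collect_required`, `missing_required`, and `exit_code` are hypothetical names, and returning 0 when `--strict` is not set is an assumption.

```python
def collect_required(api_required=(), cli_required=(), schema_fields=()):
    """Union the three sources of required fields into one set."""
    required = set(api_required) | set(cli_required)
    required |= {f["name"] for f in schema_fields if f.get("required")}
    return required


def missing_required(required, mapped_targets):
    """Return required fields that no source column was mapped to."""
    return sorted(set(required) - set(mapped_targets))


def exit_code(missing, strict):
    """Exit with 1 only when strict mode is on and something is missing."""
    return 1 if strict and missing else 0


schema_fields = [{"name": "email", "required": True}, {"name": "phone"}]
required = collect_required(
    api_required=["email"],      # required=["email"] on map()
    cli_required=["mrn"],        # --required mrn on the CLI
    schema_fields=schema_fields,  # required: true in a schema file
)
missing = missing_required(required, mapped_targets=["email", "first_name"])
print(missing, exit_code(missing, strict=True))  # ['mrn'] 1
```

Because the sources are unioned, naming the same field in more than one place is harmless.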

Engine Parameters

| Parameter | Default | Description |
|---|---|---|
| min_confidence | 0.3 | Minimum score to keep a mapping |
| sample_size | 500 | Rows to sample for profiling |
| scorers | default_scorers() | Scorer instances |
| config_path | None | Path to infermap.yaml |
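To make the effect of `min_confidence` concrete, the sketch below filters candidate mappings against the default threshold. It assumes an inclusive comparison (a score exactly at the threshold is kept), which the table does not state; `keep_confident` is a hypothetical helper, and the candidate scores are made up for the example.

```python
MIN_CONFIDENCE = 0.3  # default from the table above


def keep_confident(candidates, min_confidence=MIN_CONFIDENCE):
    """Keep candidates scoring at or above the threshold (inclusive is assumed)."""
    return [c for c in candidates if c["confidence"] >= min_confidence]


# Illustrative candidates only; these are not real infermap output.
candidates = [
    {"source": "fname", "target": "first_name", "confidence": 0.95},
    {"source": "internal_ref", "target": "email", "confidence": 0.12},
]

kept = keep_confident(candidates)
print([c["source"] for c in kept])  # ['fname']
```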
