This repository was archived by the owner on May 1, 2026. It is now read-only.
Configuration
benzsevern edited this page Mar 29, 2026
Override scorer weights and extend aliases:

    scorers:
      FuzzyNameScorer:
        weight: 0.2
      LLMScorer:
        enabled: false
    aliases:
      mrn: [medical_record_number, patient_id, chart_number]
      npi: [provider_id, national_provider_identifier]

Pass to the engine:
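
    import infermap

    engine = infermap.MapEngine(config_path="infermap.yaml")

Save a mapping result and reload it later without re-running inference:

    # Save (`result` is a mapping result from an earlier run)
    result.to_config("mapping.yaml")

    # Reload (no inference)
    result = infermap.from_config("mapping.yaml")
    # `new_df` is new data with the same source columns
    remapped = result.apply(new_df)

Generated YAML format: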

    version: "1"
    mappings:
      - source: fname
        target: first_name
        confidence: 0.95
      - source: email_addr
        target: email
        confidence: 0.97
    unmapped_source:
      - internal_ref
    unmapped_target: []

A schema file declares fields with their types, aliases, and required flags:

    fields:
      - name: email
        type: string
        aliases: [email_address, e_mail, contact_email]
        required: true
      - name: phone
        type: string
        aliases: [telephone, tel, mobile]

Use as overlay:

    infermap.map("src.csv", "tgt.csv", schema_file="schema.yaml")
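
To illustrate what the alias lists in a schema file express, here is a minimal sketch of alias-based lookup in plain Python. It is not infermap's implementation: the `SCHEMA` dict only mirrors the schema example as already-parsed data, and `normalize` and `resolve_alias` are hypothetical helpers introduced for this example.

```python
# Illustrative only: a plain-Python sketch of alias lookup, not infermap's code.
from typing import Optional

# Mirrors the schema file example as parsed data.
SCHEMA = {
    "fields": [
        {
            "name": "email",
            "type": "string",
            "aliases": ["email_address", "e_mail", "contact_email"],
            "required": True,
        },
        {
            "name": "phone",
            "type": "string",
            "aliases": ["telephone", "tel", "mobile"],
        },
    ]
}


def normalize(name: str) -> str:
    """Lowercase a column name and unify separators (hypothetical helper)."""
    return name.strip().lower().replace("-", "_").replace(" ", "_")


def resolve_alias(column: str, schema: dict) -> Optional[str]:
    """Return the field whose name or alias matches `column`, else None."""
    key = normalize(column)
    for field in schema["fields"]:
        candidates = [field["name"], *field.get("aliases", [])]
        if key in (normalize(c) for c in candidates):
            return field["name"]
    return None


print(resolve_alias("E-Mail", SCHEMA))        # email
print(resolve_alias("Mobile", SCHEMA))        # phone
print(resolve_alias("internal_ref", SCHEMA))  # None
```

A column that matches no name or alias returns `None` here; in infermap such a column would presumably still be handled by the scorers, but that behavior is not shown in this sketch.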
Required fields come from three sources, which are unioned:

- `required=["email"]` on `map()`
- `--required email` on the CLI
- `required: true` in schema files

With `--strict`, the CLI exits with code 1 if any required field is unmapped.
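
The union and the strict check can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not infermap's code: `collect_required` and `strict_exit_code` are hypothetical names, and `first_name` is used as the CLI value only to show that the sources are combined.

```python
# Illustrative only: unioning required fields and applying a strict check.
# Function and parameter names are hypothetical, not infermap's API.

def collect_required(api_required, cli_required, schema_fields):
    """Union the three sources of required fields into one set."""
    from_schema = {f["name"] for f in schema_fields if f.get("required")}
    return set(api_required) | set(cli_required) | from_schema


def strict_exit_code(required, mapped_targets, strict):
    """Return 1 when strict mode is on and a required field has no mapping."""
    missing = sorted(required - set(mapped_targets))
    if missing:
        print("unmapped required fields:", ", ".join(missing))
    return 1 if (strict and missing) else 0


schema_fields = [
    {"name": "email", "required": True},
    {"name": "phone"},
]

# "email" arrives from both the API argument and the schema; the set removes
# the duplicate. "first_name" stands in for a value passed via --required.
required = collect_required(["email"], ["first_name"], schema_fields)
print(sorted(required))

print(strict_exit_code(required, ["first_name"], strict=True))
print(strict_exit_code(required, ["first_name", "email"], strict=True))
```

Using a set makes the union idempotent, so declaring the same field in more than one place is harmless.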
| Parameter | Default | Description |
|---|---|---|
| `min_confidence` | `0.3` | Minimum score to keep a mapping |
| `sample_size` | `500` | Rows to sample for profiling |
| `scorers` | `default_scorers()` | Scorer instances |
| `config_path` | `None` | Path to `infermap.yaml` |
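
As a rough illustration of what `min_confidence` controls, the sketch below thresholds a list of scored candidates in plain Python. It is not infermap's code: `filter_mappings` is a hypothetical helper, the score for `internal_ref` is made up, and treating the threshold as inclusive (`>=`) is an assumption.

```python
# Illustrative only: thresholding candidate mappings by confidence.
# The helper name and the candidate list are hypothetical; 0.3 is the
# documented default for min_confidence.

def filter_mappings(candidates, min_confidence=0.3):
    """Split scored candidates into kept mappings and unmapped source columns."""
    mappings, unmapped_source = [], []
    for cand in candidates:
        # Assumption: a score equal to the threshold is kept.
        if cand["confidence"] >= min_confidence:
            mappings.append(cand)
        else:
            unmapped_source.append(cand["source"])
    return {"mappings": mappings, "unmapped_source": unmapped_source}


candidates = [
    {"source": "fname", "target": "first_name", "confidence": 0.95},
    {"source": "email_addr", "target": "email", "confidence": 0.97},
    # Made-up low score, for illustration only.
    {"source": "internal_ref", "target": "email", "confidence": 0.12},
]

result = filter_mappings(candidates)
print([m["source"] for m in result["mappings"]])  # ['fname', 'email_addr']
print(result["unmapped_source"])                  # ['internal_ref']
```

Raising the threshold trades coverage for precision: fewer columns are mapped, but the ones that remain have higher scores.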