Motivation
Writing config.yaml files requires knowing the full set of available options, their types, defaults, and valid choices for each estimator/prior. Today this information lives only in Python dataclasses (SNPEConfig, GaussianConfig, TrainingLoopConfig, etc.) and isn't surfaced to users at config-writing time.
The goal: a user should be able to discover all config options without reading source code.
Proposal
1. JSON Schema generation → IDE autocompletion
Auto-generate a JSON Schema from the existing config dataclasses. With VS Code + the Red Hat YAML extension, this gives autocomplete, validation, and hover docs for free.
Implementation:
- Walk config dataclasses, emit JSON Schema properties with types, defaults, and descriptions
- _target_ fields get enum of known estimator/prior paths
- choices metadata (e.g., net_type) becomes enum in schema
- Generated once at release time (or via falcon schema --json-schema)
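The walk described above might look roughly like the following. This is a minimal sketch, not the proposed implementation: FlowConfig is a hypothetical wrapper used only to show nesting, the choices list is abbreviated, and _target_ enumeration and container types (lists, optionals) are left out.

```python
import json
from dataclasses import MISSING, dataclass, field, fields, is_dataclass
from typing import get_type_hints

# Python annotation -> JSON Schema type name (scalars only in this sketch).
_JSON_TYPES = {bool: "boolean", int: "integer", float: "number", str: "string"}


def schema_from_dataclass(cls):
    """Recursively convert a config dataclass into a JSON Schema dict."""
    hints = get_type_hints(cls)
    properties = {}
    for f in fields(cls):
        hint = hints[f.name]
        if is_dataclass(hint):
            # Nested config section: recurse into the nested dataclass.
            prop = schema_from_dataclass(hint)
        else:
            prop = {}
            if hint in _JSON_TYPES:
                prop["type"] = _JSON_TYPES[hint]
            if f.default is not MISSING:
                prop["default"] = f.default
        # "help" and "choices" are the metadata keys used in this proposal.
        if "help" in f.metadata:
            prop["description"] = f.metadata["help"]
        if "choices" in f.metadata:
            prop["enum"] = list(f.metadata["choices"])
        properties[f.name] = prop
    return {"type": "object", "properties": properties}


@dataclass
class NetworkConfig:
    net_type: str = field(
        default="zuko_nice",
        # Abbreviated choices list, for illustration only.
        metadata={"help": "Flow architecture", "choices": ["zuko_nice", "nsf", "maf"]},
    )
    theta_norm: bool = field(
        default=True, metadata={"help": "Normalize parameter space"}
    )


@dataclass
class FlowConfig:  # hypothetical top-level config, not an actual falcon class
    network: NetworkConfig = field(
        default_factory=NetworkConfig, metadata={"help": "Network settings"}
    )


schema = schema_from_dataclass(FlowConfig)
print(json.dumps(schema, indent=2))
```

Using typing.get_type_hints rather than Field.type keeps the sketch working when annotations are stored as strings, as long as the names can be resolved at module level.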
2. falcon schema CLI command
Interactive introspection from the terminal:
    # Show full config tree with defaults and descriptions
    $ falcon schema falcon.estimators.Flow
    falcon.estimators.Flow:
      loop:
        num_epochs: 100 # Max training epochs
        batch_size: 128 # Samples per training step
        early_stop_patience: 16 # Epochs without improvement before stopping
      network:
        net_type: zuko_nice # Flow architecture [nsf, maf, zuko_gf, naf, ...]
        theta_norm: true # Normalize parameter space
      embedding: {} # _target_ + _input_ for observation embedding
      optimizer:
        lr: 0.01 # Learning rate
        lr_decay_factor: 0.1 # LR multiplier on plateau
      inference:
        gamma: 0.5 # Amortization mixing (0=sequential, 1=amortized)

    # Dump as YAML template
    $ falcon schema falcon.estimators.Flow --yaml > config_template.yaml
Prerequisite: dataclass field metadata
Both features are driven by the same source — metadata on dataclass fields:
    from dataclasses import dataclass, field

    @dataclass
    class NetworkConfig:
        net_type: str = field(
            default="zuko_nice",
            # The default must itself be one of the listed choices.
            metadata={"help": "Flow architecture", "choices": ["zuko_nice", "nsf", "maf", "zuko_gf", "naf"]},
        )
        theta_norm: bool = field(
            default=True,
            metadata={"help": "Normalize parameter space"},
        )
Adding metadata={"help": ...} to existing config dataclasses is the single investment that pays off across both the schema and the CLI.
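Since both outputs depend on this metadata being complete, a small check could flag fields that still lack it. The sketch below is an assumption about how such a check might look, not part of the proposal: audit_fields and DraftConfig are hypothetical names, and the two rules (help text present, default contained in choices) are illustrative.

```python
from dataclasses import MISSING, dataclass, field, fields


def audit_fields(cls):
    """List metadata problems for the fields of a single config dataclass."""
    problems = []
    for f in fields(cls):
        if not f.metadata.get("help"):
            problems.append(f"{f.name}: missing help text")
        choices = f.metadata.get("choices")
        if choices is not None and f.default is not MISSING and f.default not in choices:
            problems.append(f"{f.name}: default {f.default!r} not in choices")
    return problems


@dataclass
class DraftConfig:  # hypothetical, deliberately incomplete config
    net_type: str = field(
        default="zuko_nice",
        metadata={"help": "Flow architecture", "choices": ["nsf", "maf"]},
    )
    lr: float = 0.01  # no metadata yet


problems = audit_fields(DraftConfig)
for problem in problems:
    print(problem)
```

Run over every config dataclass in a unit test, a check like this would keep the schema and the CLI output from silently losing descriptions as new fields are added.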
Scope
- Add metadata={"help": ...} to all config dataclass fields (TrainingLoopConfig, OptimizerConfig, InferenceConfig, NetworkConfig, SNPEConfig, GaussianConfig, GaussianPosteriorConfig)
- schema_from_dataclass() utility that walks dataclasses → JSON Schema
- falcon schema <target> CLI subcommand (pretty-printed YAML with comments)
- falcon schema <target> --json-schema output mode
- Ship the generated schema in schemas/ and add a .vscode/settings.json example
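For the last item, the Red Hat YAML extension reads schema associations from the yaml.schemas setting, which maps a schema location to file globs. The sketch below builds such a settings file in Python. The schema file name falcon.schema.json and the glob are assumptions for illustration; only the schemas/ directory is named in this proposal.

```python
import json


def vscode_yaml_settings(schema_path, globs):
    """Build a settings dict that maps a JSON Schema file to YAML file globs."""
    return {"yaml.schemas": {schema_path: list(globs)}}


# Hypothetical schema file name and glob; adjust to the real layout.
settings = vscode_yaml_settings("./schemas/falcon.schema.json", ["**/config.yaml"])

# The printed JSON is what would go into .vscode/settings.json.
print(json.dumps(settings, indent=2))
```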