CLI startup optimization: lazy import heavy modules in exstruct.cli.main #108

@harumiWeb

Description

Summary

The CLI entrypoint currently imports extraction-related code (such as process_excel) and the edit CLI modules at module import time.
As a result, even lightweight commands like exstruct --help or exstruct ops list pay the import cost of the heavier extraction dependencies.

To reduce startup latency, we should minimize top-level imports in the CLI entrypoint and import implementation modules only after command routing is known.

Problem

The current structure front-loads import cost for:

  • extraction APIs
  • edit CLI implementations
  • their transitive heavy dependencies

This creates avoidable inefficiencies:

  • --help is slower than it should be
  • edit commands pay extraction import cost
  • extraction commands may also import edit-related code unnecessarily

Proposal

  • Keep src/exstruct/cli/main.py top-level imports as light as possible
  • Route commands based on argv before importing heavy modules
  • Import exstruct.cli.edit only when edit commands are invoked
  • Import process_excel only immediately before extraction execution

The guiding principle is:

  • CLI startup should only do lightweight routing
  • Implementation code should be imported only when needed
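
The routing idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual exstruct code: `resolve_target` and `load` are hypothetical helpers, the `main` attribute on `exstruct.cli.edit` and the `exstruct` import location of `process_excel` are guesses, and `ops` is the only edit sub-command this issue names.

```python
import importlib
from typing import Callable, Optional, Sequence

# Assumption: "ops" is the only edit sub-command named in this issue;
# the real set of edit commands is presumably larger.
EDIT_COMMANDS = frozenset({"ops"})


def resolve_target(argv: Sequence[str]) -> Optional[tuple[str, str]]:
    """Decide which (module, attribute) is needed, without importing anything."""
    # Assumption: an empty argv or a help flag is served by the light parser alone.
    if not argv or argv[0] in ("-h", "--help"):
        return None
    if argv[0] in EDIT_COMMANDS:
        # Hypothetical entry-point name inside the edit CLI module.
        return ("exstruct.cli.edit", "main")
    # Assumed import location for the extraction API.
    return ("exstruct", "process_excel")


def load(target: tuple[str, str]) -> Callable[..., object]:
    """Import the implementation module only now, after routing is known."""
    module_name, attr = target
    return getattr(importlib.import_module(module_name), attr)
```

With this shape, `resolve_target` stays cheap because it only inspects strings, and the import cost is paid inside `load`, which a real `main()` would call only on the branch that needs it.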

Expected benefits

  • Faster startup for exstruct --help
  • Faster startup for lightweight edit commands such as exstruct ops list
  • Cleaner separation of CLI responsibilities
  • Easier measurement and reasoning using -X importtime
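
For the measurement point, a small helper along these lines might be useful. It is a sketch, not existing project tooling: `parse_importtime` and `measure_import` are hypothetical names, and the parser assumes the `import time: self [us] | cumulative | imported package` report that CPython writes to stderr under `-X importtime`.

```python
import subprocess
import sys
from typing import Optional


def parse_importtime(report: str, module: str) -> Optional[int]:
    """Return the cumulative import time (microseconds) reported for `module`."""
    cumulative = None
    for line in report.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        # Nested imports are indented in the last column, so strip before comparing.
        # The header row never matches a module name, so it is skipped here too.
        if len(parts) == 3 and parts[2].strip() == module:
            cumulative = int(parts[1].strip())
    return cumulative


def measure_import(module: str) -> Optional[int]:
    """Import `module` in a fresh interpreter and read its -X importtime entry."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Returns None if the module was already loaded during interpreter startup.
    return parse_importtime(result.stderr, module)
```

Comparing `measure_import("exstruct.cli.main")` before and after the change would give a rough number for the improvement, assuming the package is installed in the interpreter being used. Single runs are noisy, so repeating the measurement a few times is advisable.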

Scope

Potential files:

  • src/exstruct/cli/main.py
  • possibly src/exstruct/cli/edit.py

Acceptance criteria

  • exstruct --help does not require extraction-related heavy imports
  • exstruct ops list does not require importing process_excel
  • process_excel is imported only when extraction execution begins
  • Existing CLI behavior remains backward-compatible
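
One possible way to check the import-related criteria automatically is to inspect `sys.modules` in a fresh interpreter. The helper below is a hypothetical sketch (`loaded_after` is not an existing exstruct utility), and which module names count as "heavy" is left to the caller because this issue does not list them.

```python
import subprocess
import sys


def loaded_after(statement: str, candidates: list[str]) -> list[str]:
    """Run `statement` in a fresh interpreter; return the candidates it imported."""
    probe = (
        "import sys\n"
        f"{statement}\n"
        f"print(','.join(m for m in {candidates!r} if m in sys.modules))"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # The probe's own print is always the last line, even if `statement` prints too.
    lines = result.stdout.splitlines()
    last = lines[-1] if lines else ""
    return last.split(",") if last else []
```

A regression test could then assert that `loaded_after("import exstruct.cli.main", ["exstruct.cli.edit"])` returns an empty list once the change lands. One caveat: importing `exstruct.cli.main` also runs the package's `__init__` modules, so if those import extraction code themselves, the check would still report it and the top-level package imports may need attention as well.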

Notes

This is likely a relatively low-risk change and a strong first step for improving startup latency.
