Technical review: person_follow mode lifecycle, polling and enrollment logic #1594

@bibi1977

Hi, thanks for the great work on OM1.

After reviewing the person_follow implementation (config, hooks and input plugins),
I’d like to share some technical observations and potential improvement points.
These are not blocking bugs, but architectural and robustness-related findings.

  1. Enrollment duplication risk
  • Enrollment is triggered in two different places:
    • lifecycle hook: start_person_follow_hook (on_startup / on_entry)
    • PersonFollowingStatus input (_poll → _try_enroll when status == INACTIVE)
  • When the status response is slow or tracking starts late, both paths can fire
    before either one has taken effect, producing multiple overlapping /enroll calls.
  • Consider centralizing enrollment responsibility in a single component
    or introducing a shared enrollment state / lock.
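
A minimal sketch of the shared enrollment state suggested above, assuming the hook and the input plugin can both reach one guard object. `EnrollmentGuard` and `send_enroll` are hypothetical names, not part of OM1; `send_enroll` stands in for whatever performs the /enroll request. It uses a thread lock; if both callers run on the same asyncio loop, an `asyncio.Lock` would be the closer fit.

```python
import threading
from typing import Callable


class EnrollmentGuard:
    """Hypothetical helper: allows at most one enrollment in flight at a time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._in_flight = False

    def try_enroll(self, send_enroll: Callable[[], bool]) -> bool:
        """Run send_enroll unless another enrollment is already running.

        Returns True only if this call ran the request and it succeeded.
        Returns False if the call was skipped or the request failed.
        """
        with self._lock:
            if self._in_flight:
                return False  # another caller is already enrolling
            self._in_flight = True
        try:
            # The lock is released here so a slow request does not block
            # other callers; they simply see _in_flight and skip.
            return send_enroll()
        finally:
            with self._lock:
                self._in_flight = False
```

Both start_person_follow_hook and _try_enroll would call `guard.try_enroll(...)` on the same instance, so whichever arrives second during a slow request is skipped instead of issuing a second /enroll.
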
  2. Polling model and LLM noise
  • PersonFollowingStatus polls /status every poll_interval (default 0.5s).
  • When tracking is active, formatted status messages are generated continuously:
    "TRACKING: Following person at X m ahead..."
  • This may introduce unnecessary LLM context noise, especially since movement
    is autonomous and the LLM does not act on distance changes.
  • Possible improvement:
    • Rate-limit status messages
    • Only emit messages on significant delta or state change
    • Separate "LLM-visible" status from internal telemetry
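
As an illustration of the first two options, here is a rough sketch of a filter that forwards a status to the LLM only on a state change, or on a distance change that is both large enough and not too recent. `StatusFilter`, its parameters and the default thresholds are assumptions for this example, not existing OM1 code.

```python
from typing import Optional


class StatusFilter:
    """Hypothetical filter deciding which polled statuses reach the LLM."""

    def __init__(self, min_delta_m: float = 0.5, min_interval_s: float = 5.0) -> None:
        self.min_delta_m = min_delta_m          # smallest distance change worth reporting
        self.min_interval_s = min_interval_s    # minimum spacing between distance updates
        self._last_state: Optional[str] = None
        self._last_distance: Optional[float] = None
        self._last_emit_time: Optional[float] = None

    def should_emit(self, state: str, distance_m: Optional[float], now: float) -> bool:
        state_changed = state != self._last_state
        moved = (
            distance_m is not None
            and self._last_distance is not None
            and abs(distance_m - self._last_distance) >= self.min_delta_m
        )
        rate_ok = (
            self._last_emit_time is None
            or now - self._last_emit_time >= self.min_interval_s
        )
        # State changes always pass; distance updates must also respect the rate limit.
        if state_changed or (moved and rate_ok):
            self._last_state = state
            self._last_distance = distance_m
            self._last_emit_time = now
            return True
        return False
```

The poll loop could keep running at poll_interval for internal telemetry and call `should_emit(...)` with `time.monotonic()` before formatting an LLM-visible message.
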
  3. State machine split across layers
  • Tracking state is partially handled by:
    • HTTP service (INACTIVE / SEARCHING / TRACKING_ACTIVE)
    • PersonFollowingStatus internal flags (_previous_is_tracked, _has_ever_tracked)
    • Lifecycle hooks triggering TTS feedback
  • Because no single layer owns the state, edge cases are hard to reason about
    (e.g. SEARCHING → INACTIVE → TRACKING_ACTIVE transitions).
  • A clearer state ownership or explicit state diagram could improve maintainability.
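
One possible shape for explicit state ownership is a single object that consumes the reported status and turns each transition into a named event. The three states come from the service; the transition table and event names below are assumptions made for this sketch and would need to be checked against the real behavior.

```python
from enum import Enum
from typing import Optional


class FollowState(Enum):
    INACTIVE = "INACTIVE"
    SEARCHING = "SEARCHING"
    TRACKING_ACTIVE = "TRACKING_ACTIVE"


# Assumed mapping from (old, new) state to a hypothetical event name.
TRANSITION_EVENTS = {
    (FollowState.INACTIVE, FollowState.SEARCHING): "enrollment_started",
    (FollowState.INACTIVE, FollowState.TRACKING_ACTIVE): "person_acquired",
    (FollowState.SEARCHING, FollowState.TRACKING_ACTIVE): "person_acquired",
    (FollowState.SEARCHING, FollowState.INACTIVE): "search_abandoned",
    (FollowState.TRACKING_ACTIVE, FollowState.SEARCHING): "person_lost",
    (FollowState.TRACKING_ACTIVE, FollowState.INACTIVE): "tracking_stopped",
}


class FollowStateMachine:
    """Hypothetical single owner of the person-follow state."""

    def __init__(self) -> None:
        self.state = FollowState.INACTIVE
        self.has_ever_tracked = False

    def update(self, reported: str) -> Optional[str]:
        """Apply a status string from /status; return an event name on change."""
        new_state = FollowState(reported)  # raises ValueError on unknown status
        if new_state is self.state:
            return None
        event = TRANSITION_EVENTS[(self.state, new_state)]
        self.state = new_state
        if new_state is FollowState.TRACKING_ACTIVE:
            self.has_ever_tracked = True
        return event
```

With this layout, flags such as _previous_is_tracked and _has_ever_tracked live in one place, and hooks or TTS feedback can subscribe to events rather than re-deriving the state.
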
  4. TTS feedback coupling
  • start_person_follow_hook pushes TTS messages directly via ElevenLabsTTSProvider.
  • At the same time, the LLM is instructed to react verbally to PersonFollowingStatus input.
  • This can result in duplicated or out-of-order voice feedback.
  • Suggestion:
    • Route all user-facing speech through a single layer (either LLM or hooks),
      or introduce a priority/queue mechanism.
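
To make the priority/queue idea concrete, here is a small sketch of a speech queue that both the hooks and the LLM output path could submit to. `SpeechQueue` and its methods are hypothetical; the sketch only orders and de-duplicates pending utterances and leaves the actual TTS call to the consumer.

```python
import heapq
import itertools
from typing import List, Optional, Set, Tuple


class SpeechQueue:
    """Hypothetical single entry point for user-facing speech."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._counter = itertools.count()   # tie-breaker keeps FIFO order per priority
        self._pending: Set[str] = set()     # texts currently waiting to be spoken

    def submit(self, text: str, priority: int = 10) -> bool:
        """Queue an utterance; lower priority numbers are spoken first.

        Returns False if the same text is already waiting, so a hook and the
        LLM cannot both queue an identical sentence.
        """
        if text in self._pending:
            return False
        self._pending.add(text)
        heapq.heappush(self._heap, (priority, next(self._counter), text))
        return True

    def next_utterance(self) -> Optional[str]:
        """Pop the next utterance to speak, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, text = heapq.heappop(self._heap)
        self._pending.discard(text)
        return text
```

Exact-text matching only catches verbatim duplicates; paraphrased LLM output would still need the single-layer routing suggested above.
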
  5. Error handling consistency
  • Network errors are handled differently across:
    • start_person_follow_hook
    • stop_person_follow_hook
    • PersonFollowingStatus polling
  • Some errors produce user-facing TTS, others are silent.
  • Aligning the error-handling strategy across these three paths would improve predictability and UX.
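
One way to align the three call sites is a shared wrapper that classifies network failures once and lets each caller state whether the user should hear about them. This is a sketch under assumptions: `call_service`, `CallResult` and the message wording are invented here, and the built-in `TimeoutError` / `ConnectionError` stand in for whatever exceptions the actual HTTP client raises.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CallResult:
    ok: bool
    user_message: Optional[str] = None  # text to speak, if the caller is user-facing
    log_message: Optional[str] = None   # always set on failure


def call_service(action: str, request: Callable[[], None], user_facing: bool) -> CallResult:
    """Run one service request and report failures in a uniform shape."""
    try:
        request()
    except TimeoutError as exc:
        reason, detail = "timed out", str(exc)
    except ConnectionError as exc:
        reason, detail = "is unreachable", str(exc)
    else:
        return CallResult(ok=True)

    log_message = f"{action} failed: service {reason} ({detail})"
    # Only user-initiated actions (e.g. start/stop) produce speech; polling stays silent.
    user_message = f"Person following service {reason}." if user_facing else None
    return CallResult(ok=False, user_message=user_message, log_message=log_message)
```

The start and stop hooks would pass `user_facing=True`, while the polling path would pass `False` and only log, which makes the "spoken versus silent" decision explicit rather than incidental.
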

Overall, the feature is well-structured and already usable.
These notes are meant as technical feedback to help long-term robustness
and maintainability.

Happy to clarify or help with a PR if needed.
