Distinguish individual characters from groups/factions #37

@renaudcepre

Problem

The analyzer currently extracts collectives as characters: "the orcs", "the dwarves", "the drones", "the Nazgûl". These pollute the character graph with non-character nodes that have no meaningful profile.

Observed in:

  • LotR corpus: The Orcs, The Dwarfs, Black Riders extracted as characters
  • Felix corpus: the drones, La Canopée confused with Andrew Milton

Short-term fix

Update CHARACTER_PROMPT in analyzer.py to explicitly exclude races, armies, factions, and unnamed collectives. Only individually named characters should be extracted.
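
As a rough illustration of the prompt change, here is a minimal sketch. The actual contents of `CHARACTER_PROMPT` are not shown in this issue, so `EXCLUSION_CLAUSE` and `with_exclusions` are hypothetical names, and the wording of the clause is only a suggestion.

```python
# Hypothetical clause: the real CHARACTER_PROMPT in analyzer.py is not
# reproduced in this issue, so this text is an assumption, not the actual prompt.
EXCLUSION_CLAUSE = (
    "Extract only individually named characters. "
    "Do not extract races, armies, factions, or unnamed collectives "
    "(for example: 'the orcs', 'the dwarves', 'the drones')."
)


def with_exclusions(character_prompt: str) -> str:
    """Return the given prompt with the exclusion clause appended."""
    return character_prompt.rstrip() + "\n\n" + EXCLUSION_CLAUSE


if __name__ == "__main__":
    # Placeholder base prompt, used only to show the result.
    print(with_exclusions("List the characters that appear in the scene."))
```

Appending a clause rather than rewriting the whole prompt keeps the change small and easy to revert if it turns out to suppress legitimate characters.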

Long-term vision

Introduce a Group node type distinct from Character:

```
(Gimli) -[:MEMBER_OF]-> (:Group {name: "Dwarves"})
(Aragorn) -[:MEMBER_OF]-> (:Group {name: "Fellowship of the Ring"})
```

This would enable:

  • Filtering characters by faction/race
  • Querying "which characters are Elves"
  • Group-level arc tracking
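
To make the queries above concrete, here is a small in-memory sketch of the proposed model. It does not reflect the project's actual graph storage; `Character`, `Group`, and `MembershipGraph` are hypothetical names used only to show how a `MEMBER_OF` edge would support filtering by group.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Character:
    name: str


@dataclass(frozen=True)
class Group:
    name: str


@dataclass
class MembershipGraph:
    # Each MEMBER_OF edge is stored as a (character, group) pair.
    member_of: set[tuple[Character, Group]] = field(default_factory=set)

    def add_member(self, character: Character, group: Group) -> None:
        """Record a MEMBER_OF edge from character to group."""
        self.member_of.add((character, group))

    def members(self, group_name: str) -> list[str]:
        """Names of all characters belonging to the named group, sorted."""
        return sorted(c.name for c, g in self.member_of if g.name == group_name)

    def groups_of(self, character_name: str) -> list[str]:
        """Names of all groups the named character belongs to, sorted."""
        return sorted(g.name for c, g in self.member_of if c.name == character_name)


graph = MembershipGraph()
graph.add_member(Character("Gimli"), Group("Dwarves"))
graph.add_member(Character("Gimli"), Group("Fellowship of the Ring"))
graph.add_member(Character("Aragorn"), Group("Fellowship of the Ring"))

print(graph.members("Fellowship of the Ring"))  # ['Aragorn', 'Gimli']
print(graph.groups_of("Gimli"))                 # ['Dwarves', 'Fellowship of the Ring']
```

Keeping groups as their own type means "the Dwarves" never needs a character profile, while a question such as "which characters are Elves" becomes a lookup on `MEMBER_OF` edges.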

Open questions

  • Where is the line? "The Nazgûl" as a force vs a named Nazgûl individual
  • Should the LLM infer MEMBER_OF relations, or should it be a separate extraction step?
  • Should groups appear in the beat graph (subject/object)?
