feat: Prose Format Workflow #55

@hpnyaggerman

Currently, the assistant is only structurally encouraged to comply with the format of the prose, and that only through whatever pressure the preceding context happens to exert on it. Hoping that the message being processed will return in perfect compliance with that format is therefore naïve. My own experience with prompting tells me that the obvious next move, namely instructing the writer to comply with the existing format, is equally naïve, even when the format in question is fully specified. Thus a workflow is needed to handle precisely this problem.

Building such a solution, however, requires a great many architectural considerations. I will begin by laying out my current vision and what justifies its shape.

The workflow consists of three agentic paths: the analyzer, the judge, and the enforcer. The analyzer runs on the on-demand and pre hooks. Its purpose is to infer the format of the prose within a conversation and record it to the workflow conversation state. The enforcer's purpose, by contrast, is to edit the current draft minimally so as to bring it into compliance with the format recorded by the analyzer. It would make the most sense to structure it as a ReAct loop, in the same way the editor currently works.

However, for the enforcer to know what it must fix, something must quantify and concretize the violations present in the draft, and for that we need the LLM judge. I am well aware that the principle of the project is to avoid such things, but here we are facing a real problem: the space of violations is large, and its elements are often unpredictable. Intelligence of some form is required to handle the work of concretization and quantification, for otherwise the ReAct loop will simply guess at the size of the task and at the number of iterations needed to finish it.

It would, however, make sense to optionally gate the activation of the judge, and through it the activation of the entire pipeline, behind a static detector with regex-based detection, in the same way the editor's phrase bank has recently gained such capacity.
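To make the control flow concrete, here is a minimal sketch of how the gate, the judge, and the enforcer could fit together. Every name in it (`Violation`, `static_gate`, `run_workflow`, `toy_judge`, `toy_fix`) is hypothetical and not taken from the project, and the judge and the fixer are plain regex stand-ins so the example runs without a model. In the real workflow the judge would be an LLM call and the fix step would be one iteration of the ReAct loop.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Violation:
    span: str    # offending text exactly as it appears in the draft
    reason: str  # which rule of the recorded format it breaks


# Hypothetical signatures: the judge lists violations, the fixer repairs one.
Judge = Callable[[str, dict], List[Violation]]
Fixer = Callable[[str, Violation, dict], str]


def static_gate(draft: str, patterns: List[str]) -> bool:
    """Cheap regex pre-check: True if any suspicious pattern occurs."""
    return any(re.search(p, draft) for p in patterns)


def run_workflow(draft: str, fmt: dict, patterns: List[str],
                 judge: Judge, fix_one: Fixer, max_rounds: int = 3) -> str:
    # The gate is optional: with no patterns configured, the judge always runs.
    if patterns and not static_gate(draft, patterns):
        return draft  # gate closed, so neither judge nor enforcer is invoked
    for _ in range(max_rounds):
        violations = judge(draft, fmt)
        if not violations:
            break  # the judge reports compliance, so the enforcer stops
        for v in violations:  # the amount of work is known, not guessed
            draft = fix_one(draft, v, fmt)
    return draft


# Toy stand-ins (assumption: the recorded format wants straight double quotes).
def toy_judge(draft: str, fmt: dict) -> List[Violation]:
    return [Violation(m.group(0), "dialogue must use straight double quotes")
            for m in re.finditer(r"“[^”]*”", draft)]


def toy_fix(draft: str, v: Violation, fmt: dict) -> str:
    return draft.replace(v.span, '"' + v.span[1:-1] + '"', 1)


fixed = run_workflow("She said “hello” and left.",
                     {"dialogue": "straight double quotes"},
                     [r"[“”]"], toy_judge, toy_fix)
print(fixed)
```

The point of the shape is that the enforcer iterates over an explicit list of violations rather than deciding for itself when it is done, and that a compliant draft never reaches the judge when the gate is configured.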

Now to the hard architectural question. Its substance is this: how concretely do we want to enumerate the taxonomy of prose, and how strong do we want our design decisions to be regarding what counts as valid for each category? The further we go in the direction of strong and well-defined decisions, the easier static inference becomes, but the less freedom users will have in how they structure prose in their character cards and in their conversations. Moreover, hard enumeration demands that the taxonomical and design decisions made as part of this solution accommodate a very large space of possibilities, many of which are not obvious at all.
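As a rough illustration of why strong enumeration makes static inference easy, the sketch below infers a dialogue style purely with regexes. The `DialogueStyle` enum, its three members, and `infer_dialogue_style` are all assumptions made up for this example, not a proposed taxonomy.

```python
import re
from collections import Counter
from enum import Enum
from typing import List, Optional


class DialogueStyle(Enum):  # hypothetical closed taxonomy
    STRAIGHT_QUOTES = "straight_quotes"  # "Hello."
    CURLY_QUOTES = "curly_quotes"        # “Hello.”
    GUILLEMETS = "guillemets"            # «Hello.»


# One detector per enum member; this table is only possible because the
# set of valid styles is fixed in advance.
DETECTORS = {
    DialogueStyle.STRAIGHT_QUOTES: re.compile(r'"[^"\n]+"'),
    DialogueStyle.CURLY_QUOTES: re.compile(r"“[^”\n]+”"),
    DialogueStyle.GUILLEMETS: re.compile(r"«[^»\n]+»"),
}


def infer_dialogue_style(messages: List[str]) -> Optional[DialogueStyle]:
    """Return the most frequent known style, or None if none is seen."""
    counts: Counter = Counter()
    for text in messages:
        for style, pattern in DETECTORS.items():
            counts[style] += len(pattern.findall(text))
    if not counts or max(counts.values()) == 0:
        return None
    return counts.most_common(1)[0][0]


style = infer_dialogue_style(['She said "hi".', '"Fine," he replied.'])
print(style)
```

The cost is visible in the `None` branch: any convention outside the enum, for instance a card that marks dialogue in some way the taxonomy never anticipated, is simply not representable, which is the loss of user freedom described above.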

Put concretely, the question is this: how is the inferred prose format to be structured? A static dict whose fields accept any text values? A dynamic dict whose fields are created and populated during inference? A dict that accepts enums corresponding to known concrete prose formats? Almost every other question and consideration is downstream from this single decision.
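For comparison, the three candidate shapes might look roughly like this in Python. The field names (`dialogue`, `action`, `thought`), the enum members, and the sample values are placeholders invented for illustration; none of them is a decision about the actual taxonomy.

```python
from enum import Enum
from typing import Dict, TypedDict


# Option 1: static dict. The keys are fixed in advance; values are free text
# that only a model (or a human) can interpret.
class StaticFormat(TypedDict):
    dialogue: str
    action: str
    thought: str


# Option 2: dynamic dict. The analyzer invents the keys during inference,
# so nothing downstream can rely on any particular field existing.
DynamicFormat = Dict[str, str]


# Option 3: enum-valued dict. Every value is one of a closed set of known
# formats, which is what makes static checks possible.
class DialogueStyle(Enum):
    STRAIGHT_QUOTES = "straight_quotes"
    CURLY_QUOTES = "curly_quotes"


class ActionStyle(Enum):
    ASTERISKS = "asterisks"
    PLAIN = "plain"


class EnumFormat(TypedDict):
    dialogue: DialogueStyle
    action: ActionStyle


static_fmt: StaticFormat = {
    "dialogue": "straight double quotes",
    "action": "wrapped in asterisks, present tense",
    "thought": "italics",
}
dynamic_fmt: DynamicFormat = {
    "dialogue": "straight double quotes",
    "ooc_notes": "double parentheses at the end of the message",
}
enum_fmt: EnumFormat = {
    "dialogue": DialogueStyle.STRAIGHT_QUOTES,
    "action": ActionStyle.ASTERISKS,
}
```

Under these assumptions, options 1 and 2 push all interpretation onto the judge, while option 3 lets a static detector key directly off the enum value at the price of having to enumerate every supported format up front.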
