Define a core data model for ABIF #15

@robla

Many text formats aspire to simplicity, in the belief that the data model is an "implementation detail". My inclination is to err in that direction, because I fear that starting the discussion by agreeing on a serialized data model leads to this unfortunate chain of reasoning:

  1. Let's agree on a data model before we agree on syntax
  2. Great, we have a data model, how do we serialize it?
  3. Why invent another data serialization format; why don't we use something like JSON or XML?
  4. Result: a large, complicated data hierarchy that is difficult or impossible to author with a text editor, and in which errors are difficult to spot by human inspection.

Having seen the development of many "Document Object Models (DOMs)" over the years (including working closely with the folks defining a document object model for MediaWiki markup), I've been hesitant to tackle such a complicated issue so early in the development of a new format that seems so clear in my mind. However, I've come to realize that data I consider "unimportant" (or "uninteresting") may be very important to others, and I want to build consensus around my idea of what ABIF can be. After mulling over the discussions in several issues here (particularly issues #6 and #14 regarding the metadata format), it occurs to me that a core data model may be helpful.

Here's my take on a core data structure that ABIF files should resolve to, expressed as a partial JSON file (NOTE: this comment is subject to revision):

{
    "metadata":
    [
        {
            <key-1>: <value-1>,
            <key-2>: <value-2>,
            <key-3>: <value-3>,
            ...
            <key-n>: <value-n>
        }
    ],
    "candidates":
    [
        {
            <candidate-id-1>: <candidate-information-1>,
            <candidate-id-2>: <candidate-information-2>,
            <candidate-id-3>: <candidate-information-3>,
            ...
            <candidate-id-n>: <candidate-information-n>
        }
    ],
    "ballot_bundles":
    [
        {
            <ballot-bundle-id-1>: <ballot-bundle-1>,
            <ballot-bundle-id-2>: <ballot-bundle-2>,
            <ballot-bundle-id-3>: <ballot-bundle-3>,
            ...
            <ballot-bundle-id-n>: <ballot-bundle-n>
        }
    ]
}

Expressing this as JSON is tricky, because JSON objects are unordered collections of key-value pairs, and there's no great way to stipulate "order matters!". Moreover, I would like to make sure it's possible to build the data structure above using a single-pass parser. That's going to have all sorts of tricky implications. I think we can pull it off if we keep a shared data model in mind, but we're going to have to do things that make people who love beautiful context-free grammars (CFGs) cringe.
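
To make the ordering and single-pass concerns a bit more concrete, here is a minimal Python sketch of one possible approach, not a proposal for how ABIF must work. It stores each section as a list of (key, value) pairs rather than a dictionary, so order is explicit and survives a JSON round trip (JSON arrays are ordered). It fills the model by walking a stream of records exactly once. The `CoreModel` class, the `build_single_pass` function, the record shape `(section, key, value)`, and all sample values are assumptions made up for illustration. Nothing here reflects actual ABIF syntax.

```python
import json
from dataclasses import dataclass, field

# The three top-level sections named in the structure above.
SECTIONS = ("metadata", "candidates", "ballot_bundles")


@dataclass
class CoreModel:
    """Hypothetical container: each section is an ordered list of (key, value) pairs."""
    metadata: list = field(default_factory=list)
    candidates: list = field(default_factory=list)
    ballot_bundles: list = field(default_factory=list)

    def add(self, section, key, value):
        # Append in arrival order; nothing already stored is revisited.
        if section not in SECTIONS:
            raise ValueError(f"unknown section: {section!r}")
        getattr(self, section).append((key, value))

    def to_json(self):
        # Pairs become two-element JSON arrays, which keep their order.
        return json.dumps(
            {name: [list(pair) for pair in getattr(self, name)] for name in SECTIONS},
            indent=4,
        )


def build_single_pass(records):
    """Consume an iterable of (section, key, value) records exactly once."""
    model = CoreModel()
    for section, key, value in records:
        model.add(section, key, value)
    return model


# Made-up records standing in for whatever a real tokenizer would emit.
records = [
    ("metadata", "title", "Example election"),
    ("candidates", "B", "Candidate B"),
    ("candidates", "A", "Candidate A"),
    ("ballot_bundles", "bundle-1", {"count": 3}),
]

model = build_single_pass(iter(records))
decoded = json.loads(model.to_json())
```

The trade-off in this sketch is that lookups by key become linear scans, and the JSON no longer matches the dictionary-style layout shown above. Whether that is acceptable depends on how much the "order matters" requirement is worth.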