Skip to content

ameros/genyaml

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

131 Commits
 
 
 
 
 
 

Repository files navigation

genyaml

GEDCOM to YAML

Why

GEDCOM is now the de facto standard for genealogical data. However, beyond its limitations, it is full of shortcuts. Well, its purpose was to exchange genealogical data between software.

How about making it simple and human-readable so that anyone (also software) can write and read it.

What

Genealogical data, as described by GEDCOM, contains between the others:

  • individual records (INDI)
  • family records (FAM)

linked together by references. Relations are well structured so let's try to keep them the most similar.

How

Let's start with replacing abbreviations of Tags with real words e.g.:

GEDCOM TAG: replaced with: YAML type: notes:
BIRT birth dictionary
DEAT death dictionary
BURI burial dictionary
MARR marriage dictionary
PLAC place string

...etc

Relations

What if the family tree was presented in the YAML file. The simplest could look like this:

# Yes, YAML allows us to comment
individuals: # GEDCOM INDI records
  - id: I1
    familyIds: 
      - F1
  - id: I2
    familyIds: 
      - F1
  - id: I3
    parentsId: F1
families: # GEDCOM FAM records
  - id: F1
    partnerIds: 
      - I1
      - I2
    childIds: 
      - I3
GEDCOM TAG: replaced with: YAML type: notes:
FAMS familyIds list of strings
FAMC {1} parentsId string
FAMC {n>1} parents list of dictionaries for different types like biological and adoption
HUSB, WIFE partnerIds list of strings
CHIL childIds list of strings

Plurals for collections

In the GEDCOM, there is no easy way to recognize if a record is part of a collection or just a single item. There are of course some specifications out there for different versions. But you can't tell it from just looking at the file.

I think the good old convention of naming collections with a plural is a way to go (of course a collection in YAML itself is also visible by - or []) e.g.:

GEDCOM TAG: replaced with: YAML type: notes:
OBJE objects list of dictionaries
TITL titles list of strings
OCCU occupations list of strings
EDU educations list of strings

Personal Name

Personal Name is hard thing to model. It's the set of names that the individual person is known. However, it highly depends on cultural context - synonyms, order of parts, their meaning. That's also why there is so many ideas for personal name in GEDCOM and why it still evolves there.

In GEDCOM specifications NAME can be a string / text value (with surname between slashes) or a list of such values or a list of objects. When in 5.5.1 version they can be distinguished by TYPE (one of aka | birth | immigrant | maiden | married | user defined), it is no more possible to do it in 5.5.5 version. However still both define NAME with number of possibilities (pieces) like NPFX, GIVN, NICK, SPFX, SURN and NSFX.

That's why my proposal is simply to use:

legalFullName: Prince Rogers Nelson
normalShortName: Prince
otherKnownNames: [The Artist, Joey Coco, Jamie Starr]
GEDCOM TAG: replaced with: YAML type: notes:
NAME legalFullName string
normalShortName string
otherKnownNames list of strings

By the way, I really think that a maiden name could be written as legalFullName but under birth like:

birth:
  date: 28 JUL 1929
  place: Southampton, New York, U.S.
  legalFullName: Jacqueline Lee Bouvier

Store in folders

Working comfortably on a family tree is not only a human-readable file, but also a well-organized storage of data. We could use a specific folder structure for this. In each such folder, we could store a piece of data about a specific person or family. Simple tooling would make it possible to produce one file from all folders. The proposal is to reflect the relations in such folders per family tree:

  • individuals/{person-name-unique}/
  • families/{family-id}/

and place there all particular files with kind of symlinks to be able to traverse the tree (folders) accordingly to relation structure.

YAML to MARKDOWN

While we're here, why not use GitHub with Markdown files to conveniently navigate the tree 🤔 From the About READMEs

GitHub will recognize and automatically surface your README to repository visitors.

So, if we could make simple transformation of YAML file to the README.md in each folder then we could also make use of Markdown links to refer relative folder READMEs.

Example YAML file:

# John Fitzgerald Kennedy
id: I1
legalFullName: John Fitzgerald Kennedy
normalShortName: John F. Kennedy
otherKnownNames: [JFK, John Kennedy]
titles: [35th President of the United States, Senator, Congressman]
occupations: [politician]
educations: [Harvard University]
birth:
  date: 27 MAY 1917
  place: Brookline, Massachusetts, U.S.
death:
  date: 22 NOV 1963
  place: Dallas, Texas, U.S.
  cause: assassination
burial:
  place: Arlington National Cemetery
objects:
  - file: https://upload.wikimedia.org/wikipedia/commons/thumb/c/c3/John_F._Kennedy%2C_White_House_color_photo_portrait.jpg/370px-John_F._Kennedy%2C_White_House_color_photo_portrait.jpg
    title: John F. Kennedy, photograph in the Oval Office by Cecil Stoughton, White House; Public Domain
    format: jpg
familyIds:
  # with Jacqueline Lee Bouvier
  - F1

could be easily transformed to such simple Markdown:

# John Fitzgerald Kennedy
- id: I1
- legalFullName: John Fitzgerald Kennedy
- normalShortName: John F. Kennedy
- otherKnownNames: JFK, John Kennedy
- titles: 35th President of the United States, Senator, Congressman
- occupations: politician
- educations: Harvard University
- birth:
  - date: 27 MAY 1917
  - place: Brookline, Massachusetts, U.S.
- death:
  - date: 22 NOV 1963
  - place: Dallas, Texas, U.S.
  - cause: assassination
- burial:
  - place: Arlington National Cemetery
- objects:
  - file: ![](https://upload.wikimedia.org/wikipedia/commons/thumb/c/c3/John_F._Kennedy%2C_White_House_color_photo_portrait.jpg/370px-John_F._Kennedy%2C_White_House_color_photo_portrait.jpg)
    - title: John F. Kennedy, photograph in the Oval Office by Cecil Stoughton, White House; Public Domain
    - format: jpg
- familyIds:
  - F1 ([with Jacqueline Lee Bouvier](../../families/F1))

where the rules of transformation are easily visible and clear...

And voilà, there we got it - human-readable and well-organized files representing family tree 🎄

Please check the example starting from John Fitzgerald Kennedy.

There are also some templates already prepared for:


Now, all we need is just simple toolset

  • to convert from YAML to Markdown
  • from all YAML files to single one
  • and, maybe still if someone interested, from that one YAML back to GEDCOM 🤔

TBD

Of course, there are many more in the GEDCOM standard (TAGS, type of RECORDS) to which the above proposal requires adjustment. Still, the base seems to be solid, and this adjustment should be easy.

Google Sheets Template

Here comes something additional. I've just created Google Family Tree Template that allows to collect GEDCOM-style data as linked sheets. You can walk it through by clicking links to parents, marriages and children. Such complete view on family tree members might be sometimes helpful and convenient.

Feedback

Any feedback is appreciated https://github.com/ameros/genyaml/discussions


This proposal is listed @ https://www.cyndislist.com/gedcom/gedcom-software/

http://www.cyndislist.com/create-a-link-to-cyndis-list/

About

proposal for human-readable genealogy files stored in well-organized traversable way

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors