@ReallyNiceGuy

This change allows the creation of local tables named by entity in the tables directory.

The table can be selected with the switch -l in the pybufrkit tool or by using get_local_table_name to request the local table directory to pass to Encoder or Decoder classes.

Marco Aurelio da Costa added 3 commits March 14, 2025 12:48
@ywangd
Owner

ywangd commented Mar 14, 2025

IIUC, the intention is to use local tables alongside WMO tables. This is already supported, since the code loads local tables as defined in section 1 and the tables directory is structured to accommodate extra local tables. The current tables directory is as follows:

tables/
└── 0  <-- master table number as defined in section 1
    ├── 0_0  <-- originating centre and sub-centre as defined in section 1, 0_0 for WMO
    └── 98_0  <--- originating centre and sub-centre for ECMWF

So it should work by placing the new local tables alongside the existing ones, e.g.:

tables/
└── 0
    ├── 0_0
    ├── 98_0
    ├── 85_0
    └── 255_255

Let me know if the above works for you.
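To make the layout above concrete, here is a minimal sketch of how a directory for a given centre could be derived from the section 1 values. The helper names `centre_tables_dir` and `available_centres` are hypothetical and are not part of pybufrkit; the sketch only mirrors the `<master table>/<centre>_<sub-centre>` naming shown in the tree.

```python
from pathlib import Path


def centre_tables_dir(tables_root, master_table_number, centre, sub_centre):
    """Hypothetical helper: directory holding tables for one centre/sub-centre.

    Mirrors the layout tables/<master>/<centre>_<sub-centre> shown above.
    """
    return Path(tables_root) / str(master_table_number) / f"{centre}_{sub_centre}"


def available_centres(tables_root, master_table_number=0):
    """Return the sorted <centre>_<sub-centre> directory names under a master table."""
    master_dir = Path(tables_root) / str(master_table_number)
    if not master_dir.is_dir():
        return []
    return sorted(p.name for p in master_dir.iterdir() if p.is_dir())
```

With this naming, adding local tables for centre 85 amounts to creating `tables/0/85_0` next to `0_0` and `98_0`.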

@ReallyNiceGuy
Author

ReallyNiceGuy commented Mar 15, 2025 via email

@ywangd
Owner

ywangd commented Mar 15, 2025

Thanks for explaining. Do you mean two entities use exactly the same codes for originating centre and originating sub-centre? Is that valid in WMO specs? I was assuming that at least the code for originating centre should be uniquely assigned to different entities. Maybe that is not true?

If that is the case, is there a reason why this cannot be solved with the existing -t (--tables-root-directory) option to select a different root directory for different entities? That is, given the following tables directory structure

tables
├── imd
│   └── 0
│       ├── 0_0
│       └── 42_0
├── meteofrance
│   └── 0
│       ├── 0_0
│       └── 42_0
└── wmo
    └── 0
        └── 0_0

where both imd and meteofrance define their different local tables using the same codes 42_0. The set of tables to use can be selected with something like the following:

pybufrkit -t /PATH-TO/tables/imd decode BUFR_FILE

The difference between the above and what you are proposing is the need to specify the full tables path instead of a shorter name. I am not sure how significant this difference is, since there are multiple ways to make the full path shorter, e.g. define an environment variable for it (export IMD=/PATH-TO/tables/imd) and use it like pybufrkit -t $IMD decode .... Does this make sense or am I missing something?
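The same idea, mapping a short entity name to a full tables root, can also be sketched in a small Python wrapper. This only builds the command line shown above and does not run it. The function name `decode_command` and the environment variable `PYBUFRKIT_TABLES_BASE` are assumptions made for illustration, not something pybufrkit defines.

```python
import os
from pathlib import Path


def decode_command(bufr_file, entity, tables_base=None):
    """Build the argv for `pybufrkit -t <root> decode <file>` without running it.

    tables_base falls back to the (hypothetical) PYBUFRKIT_TABLES_BASE
    environment variable, then to a relative "tables" directory.
    """
    base = tables_base or os.environ.get("PYBUFRKIT_TABLES_BASE", "tables")
    root = Path(base) / entity  # e.g. /PATH-TO/tables/imd
    return ["pybufrkit", "-t", str(root), "decode", str(bufr_file)]
```

The returned list could be handed to `subprocess.run` once the tables directories actually exist.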

@ReallyNiceGuy
Author

Yes, you understood it correctly. It is not valid, but this will not prevent the end user from making this mistake, unfortunately.

Your example was exactly my first solution.
Later I added the helper function and command line switch to simplify the usage, but I condensed it all into one commit.

If you are fine with changing the structure to be:

tables
 ├── imd
 │   └── 0
 │       ├── 0_0  -> link to ../../wmo/0_0
 │       ├── 98_0 -> link to ../../wmo/98_0
 │       └── 42_0
 ├── meteofrance
 │   └── 0
 │       ├── 0_0  -> link to ../../wmo/0_0
 │       ├── 98_0 -> link to ../../wmo/98_0
 │       └── 42_0
 └── wmo
     └── 0
         ├── 0_0
         └── 98_0

I can definitely live with that. The helper function and switch are just for ease of use anyway.

I saw that you just committed some changes to download the WMO tables. Such a change would also have to account for that, and my commit doesn't.

The reason for the links is another crazy thing I saw in the wild: overriding codes from WMO. My approach to that would be to just create a modified WMO table with the override and put it directly inside the local table directory for the entity that dares to do that.

Again, thank you for taking the time to work this out with me.
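For illustration, the linked layout above could be set up with a short script along these lines. This is only a sketch under assumptions: the function name `link_shared_tables` is hypothetical, the shared tables are assumed to live under `tables/wmo/<master>/`, and symlink creation is assumed to be permitted on the platform. An entity that overrides a WMO table simply keeps its own directory, which the script leaves untouched.

```python
import os
from pathlib import Path


def link_shared_tables(tables_dir, entity, shared="wmo", master="0"):
    """Symlink shared centre directories into an entity's tables root.

    Directories the entity already provides (its own local tables, or a
    modified copy that overrides a WMO table) are left as they are.
    Returns the names of the links that were created.
    """
    shared_master = Path(tables_dir) / shared / master
    entity_master = Path(tables_dir) / entity / master
    entity_master.mkdir(parents=True, exist_ok=True)
    created = []
    for src in sorted(shared_master.iterdir()):
        dst = entity_master / src.name
        if dst.exists() or dst.is_symlink():
            continue  # keep the entity's own tables or override
        # Relative target so the whole tables tree stays relocatable.
        target = os.path.relpath(src, start=entity_master)
        os.symlink(target, dst, target_is_directory=True)
        created.append(dst.name)
    return created
```

Because the links are resolved relative to `tables/<entity>/<master>`, the computed target includes the master table directory, e.g. `../../wmo/0/0_0`.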

@ywangd
Owner

ywangd commented Mar 15, 2025

I may have been unclear in the previous comment. I'd prefer not to change the default directory layout. My view is that non-standard tables are better managed outside of the program itself. The builtin tables are mostly there to give a good getting-started experience with standard WMO tables. Once you have a serious need for local tables, it's better to manage them separately and with more nuance, e.g. with the structure suggested above. The existing -t option already supports loading tables from arbitrary locations. Hence I don't see the need for the changes proposed in the PR. I hope that makes sense. Thank you!

@ReallyNiceGuy
Author

ReallyNiceGuy commented Mar 15, 2025 via email

@ReallyNiceGuy
Author

ReallyNiceGuy commented Mar 15, 2025 via email

@ywangd
Owner

ywangd commented Mar 16, 2025

IIUC, you are proposing a directory structure as follows:

tables
└── 0
    ├── 0_0
    ├── 98_0
    ├── imd   <-- new local tables with a name for selection
    │   └── 42_0
    └── meteofrance   <-- another new local tables
        └── 42_0

The local tables can then be selected with a new option, say -l, --local-tables-selector, which works alongside the existing -t option. This sounds reasonable to me since it is compatible with the existing tables directory structure and is a pure extension.

@ReallyNiceGuy
Author

That would work fine, as long as I can have this structure also separated from the main table:

pybufrkit/tables
└── 0
    ├── 0_0
    └── 98_0
some_other_path/tables
└── 0
    ├── imd   <-- new local tables with a name for selection
    │   └── 42_0
    └── meteofrance   <-- another new local tables
        └── 42_0

By default, the local path could be the main path as in your example, but the user can select an external path.

This would allow me to create packages for each entity, without having to interfere with the pybufrkit installation at all.

@ReallyNiceGuy
Author

Please check this branch, if you agree with it, I can create a pull request.
https://github.com/ReallyNiceGuy/pybufrkit/tree/disjoint_local_tables

The changes are:

  1. By default, local tables are looked up in the same path as the root directory.
  2. You can pass a different local table directory with -l.
  3. If you pass -e, the entity is appended to the local table directory. This allows the use of the structure you provided without having to repeat the root directory as the local directory on the command line.
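The three rules can be sketched as a single lookup function. This is not the code from the branch: the name `local_centre_dir` and its parameters are hypothetical, and placing the entity between the master table number and the centre directory is an assumption taken from the directory trees earlier in this thread.

```python
from pathlib import Path


def local_centre_dir(tables_root_dir, master, centre, sub_centre,
                     local_dir=None, entity=None):
    """Resolve where local tables for one centre/sub-centre would be looked up.

    Rule 1: without local_dir, the root directory is used.
    Rule 2: local_dir (the -l value) replaces the root for local tables.
    Rule 3: entity (the -e value) adds one directory level, assumed here
            to sit between the master table number and the centre directory.
    """
    base = Path(local_dir) if local_dir is not None else Path(tables_root_dir)
    path = base / str(master)
    if entity:
        path = path / entity
    return path / f"{centre}_{sub_centre}"
```

For example, with root `pybufrkit/tables`, local directory `some_other_path/tables` and entity `imd`, the lookup for centre 42, sub-centre 0 resolves to `some_other_path/tables/0/imd/42_0`, while omitting both options keeps the existing `pybufrkit/tables/0/42_0` behaviour.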

If you don't like it for any reason, let me know and I will do the changes you feel necessary.

@ywangd
Owner

ywangd commented Mar 17, 2025

I actually quite like the idea of a separate directory (-l) for locating the local tables, since it decouples WMO tables from local tables. I do have a few questions about the other option (-e):

  1. It seems to be only a convenience feature, and is most useful when the default tables directory contains sub-directories beginning with entity names. But your branch contains no change to the default tables. Is this option still useful for you? I proposed it on that assumption, but I am not sure whether you still need it now, given the idea of a completely separate local table directory.
  2. If we do want it, is it possible to find an alternative word for entity? Maybe it is due to me being unfamiliar with Met centres, but "entity" sounds too generic to me, that is, I don't quite get what it refers to. I'd personally suggest something like variant, selector, or provider. What do you think?

Overall the idea is sound to me. Please feel free to open a PR and I will give it a proper review. Thanks!

@ReallyNiceGuy
Author

Actually, 1. is for when we have the local tables directly under the root directory, without an entity name. I implemented it this way because it is backwards compatible: if a user has local tables under the root directory, they will just keep working.
2. selector sounds good to me. I will change it.
