Add NX docstring as attribute #415
[submodule "src/pynxtools/definitions"]
path = src/pynxtools/definitions
url = https://github.com/FAIRmat-NFDI/nexus_definitions.git
branch = nxmpes-unit-doc-test
reminder to remove before merging
Any practical solution works in my opinion. The only issue will be how to report/add this in the NXDL structure. It's a bit meta on top of the already meta attributes we have in NXDL. Would something like adding this renameable field, FIELDNAME__docs, in NXobject work out nicely in the NXDL framework?
I didn't go this far. For groups and fields, I basically added an attribute
Ah, alright. So it's only for attributes that you add a suffix. You're right, they will remain undocumented. To see how it goes in practice, we can leave it undocumented for now. It will make NeXus files like this easier to understand and more self-sufficient. And it seems this is the best we can do without overcomplicating it.
Notes from TF meeting:
…F5 attrs

Rebased from PR #415 onto nexus-inheritance-concept-paths, rewriting the doc extraction to use NexusNode instead of get_inherited_nodes(). Key changes:
- writer.py: new __nxdl_docs() using get_inheritance_concept_paths(); writes @docs on groups/fields, @<attr>_docs on attributes (single underscore per TF decision); _format_doc() helper for optional RST rendering via docutils
- convert.py: extract write_docs/docs_format from kwargs before Writer.write()
- cli.py: --write-docs flag and --docs-format choice option with guard
- NXtest.nxdl.xml: add <doc> to version attribute for test coverage
- pyproject.toml: add docutils as hard dependency
- test_writer.py: test_write_docs() covering appdef root, field, attr, group docs
@mkuehbach @rettigl I made some changes here, following the changes to the What changed (since rebase onto latest
find_node_at_path was removed from NexusNode; use the resolve_path() function from nexus.schema_resolver and tree.get_inheritance_concept_paths() for the appdef root doc special case instead of get_inheritance_docs().
h5web renders newlines as literal \n in attribute values. For 'plain' format, reflow paragraphs to a single string (intra-paragraph line breaks become spaces, multi-paragraph docs concatenated with space). Use ': ' between concept label and doc text, ' | ' between inheritance levels.
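The reflow described in this commit message could look roughly like the sketch below. It is not the code from writer.py; the function names `reflow_plain` and `join_levels` and the list-of-pairs input are assumptions made for illustration.

```python
import re


def reflow_plain(doc: str) -> str:
    """Collapse a multi-paragraph docstring into a single line.

    Line breaks inside a paragraph become spaces, and paragraphs
    are concatenated with a single space.
    """
    paragraphs = re.split(r"\n\s*\n", doc.strip())
    # str.split() without arguments also swallows newlines and indentation.
    return " ".join(" ".join(p.split()) for p in paragraphs if p.strip())


def join_levels(levels):
    """Join (concept label, doc) pairs, one pair per inheritance level.

    The pair structure is hypothetical; only the ': ' and ' | '
    separators are taken from the commit message.
    """
    return " | ".join(f"{label}: {reflow_plain(doc)}" for label, doc in levels)
```

With this shape, no newline can survive in the resulting attribute value, which is what the 'plain' format is meant to guarantee for h5web.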
…write-docs
- test_writer.py: parametrize test_write_docs over default/plain; plain variant additionally asserts no embedded newlines in any docs attribute
- dataconverter-and-readers.md: add section explaining --write-docs / --docs-format, attribute naming convention, and format comparison table
…ocument link behaviour
- writer.py: __nxdl_docs() returns (doc_text, docs_url) tuple; URL from node.get_link() pointing to the online NeXus manual entry; writes @docs_url on groups/fields and @<attr>_docs_url on attribute datasets
- annotator.py: _annotate_attribute() skips @docs, @docs_url, @*_docs and @*_docs_url so pynx read output stays uncluttered
- dataconverter-and-readers.md: expand docs_url description; expand warning admonition to cover both write-time (links skip docs) and read-time (links to doc'd files expose source-concept docs at destination)
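The skip rule for the annotator can be expressed as a small name check. This is only a sketch of the naming rule listed above, not the actual `_annotate_attribute()` code; the helper name `is_docs_attribute` is hypothetical.

```python
def is_docs_attribute(name: str) -> bool:
    """Return True for attribute names that only carry documentation.

    Covers @docs and @docs_url on groups/fields, plus the
    @<attr>_docs and @<attr>_docs_url companions of attributes.
    """
    return name in ("docs", "docs_url") or name.endswith(("_docs", "_docs_url"))


# Example: filter a list of attribute names before annotating them.
names = ["units", "docs", "version", "version_docs", "version_docs_url"]
visible = [n for n in names if not is_docs_attribute(n)]
```

One consequence of a suffix-based rule like this is that a genuine NeXus attribute whose name happens to end in `_docs` would be skipped as well.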

@rettigl this is a way to add the NX docstrings to the HDF5 files as attributes. The idea is that you can pass the write-docs flag to the dataconverter and then the docstrings are added. By default, this is turned off so as not to change any existing workflows. We can discuss whether we want this to happen by default.
Here's an example using the xps reader:
output.nxs.zip
The implementation is relatively trivial. There are, however, two open questions for me:
How to handle inherited docs
Usually, our appdefs have very sparse documentation since all the docstrings are in the base classes or in appdefs that are further up the chain. My understanding is that the docs are also extended (not overwritten) when you inherit from a class. So technically, one should concatenate the docs of all inherited nodes. However, this gets quite unwieldy and messy as the docs in the files get rather large and sometimes they might even contradict each other. My suggestion (implemented here) was that as we travel up the inherited nodes, as soon as there is a docstring we only use that one (and don't add any docs coming from even earlier nodes). But this is probably not strictly correct. Maybe @sanbrock can comment.
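To make the two options concrete, here is a minimal sketch. It assumes the inherited nodes have already been reduced to a list of docstrings ordered from the appdef upwards; `first_doc` and `all_docs` are illustrative names, not functions in pynxtools.

```python
from typing import Iterable, List, Optional


def first_doc(chain: Iterable[Optional[str]]) -> Optional[str]:
    """Return the first non-empty docstring found going up the chain.

    This mirrors the suggestion above: stop at the nearest docstring
    and ignore anything defined further up.
    """
    for doc in chain:
        if doc and doc.strip():
            return doc.strip()
    return None


def all_docs(chain: Iterable[Optional[str]]) -> List[str]:
    """Collect every non-empty docstring, i.e. the 'extend' reading."""
    return [doc.strip() for doc in chain if doc and doc.strip()]


# Hypothetical chain: appdef node first, base class last.
chain = [None, "  ", "Doc from the parent appdef.", "Doc from the base class."]
```

For the hypothetical chain above, `first_doc` keeps only the parent appdef's text, while `all_docs` would return both entries and leave the concatenation format open.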
Docs for NeXus attributes
NeXus attributes are already written as HDF5 attributes. Since HDF5 attributes cannot have attributes themselves, the question is where to place the docs for these attributes. My solution here: write another attribute, <attribute>__docs (e.g. entry/definition/version__docs), to the HDF5 file.
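As a rough sketch of that naming scheme (not the writer code in this PR), the helper below adds a companion entry next to each documented attribute. It works on any mutable mapping, so a plain dict stands in here for an h5py `attrs` object, which also supports item assignment. The function name and the `suffix` parameter are assumptions.

```python
from typing import Mapping, MutableMapping


def add_attribute_docs(
    attrs: MutableMapping[str, object],
    attr_docs: Mapping[str, str],
    suffix: str = "__docs",
) -> None:
    """Write '<attribute><suffix>' next to each documented attribute.

    Docs are only added for attributes that actually exist, so no
    orphaned documentation entries are created.
    """
    for name, doc in attr_docs.items():
        if name in attrs:
            attrs[f"{name}{suffix}"] = doc


# A dict standing in for the attributes of entry/definition.
attrs = {"version": "1.0"}
add_attribute_docs(
    attrs,
    {"version": "Version of the definition.", "missing": "Never written."},
)
```

Keeping the suffix as a parameter makes it cheap to switch conventions, for instance to the single underscore mentioned in the later commit messages.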