Create typed IRI concept for Publication Channels #2437

Draft
brinxmat wants to merge 4 commits into main from create-typed-iri-concept

@brinxmat

Collaborator

Motivation

The main motivation is to simplify date checks on what is essentially a composite string.

  • When we create the publication channel URIs, we encode a lot of information in the path: /{type}/{identifier}/{year}.

Secondly, the data is now validated.

  • Any operation on the data is risky if the value may be an arbitrary URI or is invalid in some other way. Any of the three data elements in the URI can be wrong, in any combination, and the endpoint part of the URI can be incorrect as well.

Thirdly, we are interested in the data independently of the environment.

  • If we wish to copy data from a production environment, with e.g. host "api.nva.unit.no", to demo-environment-15, with host "api.demo-15.nva.aws.unit.no", we are not actually interested in manipulating every channel in every location in every database entry.

What is done

  • Created a data object that represents PublicationChannelId, which is now a concept in the model. It is neither a "journal" nor a "publisher"; it is simply a representation of the data in @id. It provides accessors to the elements that form a valid publication channel id and fails with a relevant error message when it encounters anything invalid.
  • methods:
    • fromUri(URI uri) creates the PublicationChannelId from a URI
    • uri(String host) creates the URI from the PublicationChannelId
    • value() provides the concatenated string for JSON output.
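
A minimal sketch of what such a typed id could look like, assuming the path layout /publication-channels-v2/{type}/{identifier}/{year} seen in the example URIs in this PR. The record name ChannelIdSketch, the constants ROOT and TYPES, and the hard-coded type strings are hypothetical and are not the code under review:

```java
import java.net.URI;
import java.time.DateTimeException;
import java.time.Year;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch, not the PR's implementation.
record ChannelIdSketch(String type, UUID identifier, Year year) {

    // Assumed first path element and channel types, taken from the example URIs.
    private static final String ROOT = "publication-channels-v2";
    private static final Set<String> TYPES = Set.of("publisher", "serial-publication");

    // Parses /{ROOT}/{type}/{identifier}/{year} from the URI path; the host is ignored.
    static ChannelIdSketch fromUri(URI uri) {
        if (uri == null || uri.getPath() == null) {
            throw new IllegalArgumentException("Not a publication channel id: " + uri);
        }
        // The leading slash produces an empty first element, so a valid path has 5 parts.
        String[] parts = uri.getPath().split("/");
        if (parts.length != 5 || !parts[0].isEmpty()
                || !ROOT.equals(parts[1]) || !TYPES.contains(parts[2])) {
            throw new IllegalArgumentException("Not a publication channel id: " + uri);
        }
        try {
            return new ChannelIdSketch(parts[2], UUID.fromString(parts[3]), Year.parse(parts[4]));
        } catch (IllegalArgumentException | DateTimeException e) {
            throw new IllegalArgumentException("Invalid identifier or year in: " + uri, e);
        }
    }

    // Rebuilds an absolute URI for the given host (https is assumed here).
    URI uri(String host) {
        if (host == null || host.isBlank()) {
            throw new IllegalArgumentException("Host must not be blank");
        }
        return URI.create("https://" + host + value());
    }

    // Host-less string form, suitable as a relative @id in JSON output.
    String value() {
        return "/%s/%s/%s/%s".formatted(ROOT, type, identifier, year);
    }
}
```

Note that UUID.toString() emits lowercase hex digits, so in this sketch value() does not preserve the uppercase spelling used in the examples.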

Usage

Initially, we will use PublicationChannelId::fromUri and PublicationChannelId::uri to migrate the platform away from URI.

Later, we will use PublicationChannelId::value to remove the URI from persisted data.

This latter statement sounds weird, how will that work?

Consider the following, noting the @base and lack of host in the channel @id:

{
  "@context": {
    "@vocab": "https://schema.org/",
    "@base": "https://api.nva.unit.no"
  },
  "channel": {
    "@id": "/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025",
    "@type": "Journal",
    "owner": "Casey",
    "label": "Something"
  }
}

This serialises to N-Triples as:

<https://api.nva.unit.no/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://schema.org/Journal> .
<https://api.nva.unit.no/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025> <https://schema.org/label> "Something" .
<https://api.nva.unit.no/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025> <https://schema.org/owner> "Casey" .
_:b0 <https://schema.org/channel> <https://api.nva.unit.no/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025> .

Since the base depends on the environment we are in, and we already know it from the environment variables, this change is a simple way to make the data portable between environments (note that we do not persist the context in the database). Where we do persist the context, we can check a single field in the @context object instead of every field containing a URI that may or may not match a predefined expectation.
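
The same resolution can be sketched outside a JSON-LD processor with java.net.URI, assuming the API host is supplied by the environment. The class name BaseResolutionSketch and the https scheme are assumptions made for illustration:

```java
import java.net.URI;

// Hypothetical sketch: resolves a host-less channel id against an environment-specific base.
final class BaseResolutionSketch {

    private BaseResolutionSketch() {
    }

    // Mirrors how @base applies to a relative @id: an id that starts with "/"
    // keeps the scheme and host of the base and replaces its path.
    static URI resolve(String apiHost, String relativeId) {
        URI base = URI.create("https://" + apiHost + "/");
        return base.resolve(relativeId);
    }
}
```

With the persisted value "/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025", passing "api.nva.unit.no" or "api.demo-15.nva.aws.unit.no" as the host gives the production or demo URI respectively, without touching the stored data.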

Downside

@context becomes environment specific

@github-actions

github-actions Bot commented Nov 29, 2025

Test Results

  286 files  + 2    286 suites  +2   24m 7s ⏱️ +12s
9 067 tests +50  9 056 ✅ +50  11 💤 ±0  0 ❌ ±0 
9 549 runs  +60  9 538 ✅ +60  11 💤 ±0  0 ❌ ±0 

Results for commit e2e57eb. ± Comparison against base commit 5263c03.

This pull request removes 304 and adds 195 tests. Note that renamed tests count towards both.
            "composer" : "DtePAk2ZjLw8Lzr",
            "day" : "3syDklZHbnBwWUfV5"
            "day" : "rPHYfsH5VIzz"
            "description" : "Some description"
            "extent" : "GuJRYj8fAOMcQPQSMI"
            "formatted" : "M-2306-7118-7"
            "introduction" : {
            "month" : "agqDekENHtR2Gy",
            "month" : "dmslbxWc3hmaUBQiv",
            "name" : "OVuxPTXnSLd7jcg"
…
no.sikt.nva.brage.migration.merger.PublicationInstanceMergerTest ‑ [21] no.unit.nva.model.instancetypes.journal.ProfessionalArticle@4825c949
no.sikt.nva.brage.migration.merger.PublicationInstanceMergerTest ‑ [22] no.unit.nva.model.instancetypes.journal.AcademicArticle@805aee7
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] "/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025"
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] https://api.nva.unit.no/publication-channel-v2/publisher/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] https://api.nva.unit.no/publication-channel-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] https://api.nva.unit.no/publication-channels-v2/publisher/360B8D2C-736F-450A-8D34-9596BFE28CB4/
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] https://api.nva.unit.no/publication-channels-v2/publisher/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025A
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] https://api.nva.unit.no/publication-channels-v2/publisher/360B8D2C-736F-450A-8D34-9596BFE28CBK/2025
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] https://api.nva.unit.no/publication-channels-v2/publishers/360B8D2C-736F-450A-8D34-9596BFE28CB4/2025
no.sikt.nva.iri.ChannelIdentifierTest ‑ [1] https://api.nva.unit.no/publication-channels-v2/serial-publication/360B8D2C-736F-450A-8D34-9596BFE28CB4/
…

♻️ This comment has been updated with latest results.

@brinxmat brinxmat marked this pull request as draft November 29, 2025 10:00
@brinxmat brinxmat requested a review from truhacevkir November 29, 2025 10:00
Comment on lines +13 to +15
public SerialPublicationId(UuidYearPair validate) {
    this(validate.uuid(), validate.year());
}

Collaborator

The parameter name (validate) is wrong.

Comment on lines +43 to +51
static PublicationChannelId from(URI uri) {
    if (nonNull(uri) && uri.toString().contains(ChannelType.PUBLISHER.getType())) {
        return PublisherId.from(uri);
    } else if (nonNull(uri) && uri.toString().contains(ChannelType.SERIAL_PUBLICATION.getType())) {
        return SerialPublicationId.from(uri);
    } else {
        throw new IllegalArgumentException("Encountered URI that is not a valid publication channel id");
    }
}

Collaborator

If we want to be very strict, should we validate that ChannelType.PUBLISHER.getType() and ChannelType.SERIAL_PUBLICATION.getType() are in the right places? It seems that we do it downstream :)

Collaborator Author

That happens downstream; this simply differentiates the two so that they end up being processed correctly.
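
For illustration only, a sketch of what a position-aware dispatch could look like if it were ever wanted at this level. It assumes the type is the second path segment and that the type strings are "publisher" and "serial-publication", as in the test URIs. ChannelTypeDispatchSketch and Kind are hypothetical names, not part of this PR:

```java
import java.net.URI;

// Hypothetical sketch: dispatch on the path segment instead of a substring match.
final class ChannelTypeDispatchSketch {

    enum Kind { PUBLISHER, SERIAL_PUBLICATION }

    private ChannelTypeDispatchSketch() {
    }

    // Looks only at the second path segment, so a host or identifier that happens
    // to contain "publisher" cannot influence the result.
    static Kind kindOf(URI uri) {
        if (uri == null || uri.getPath() == null) {
            throw new IllegalArgumentException("Encountered URI that is not a valid publication channel id");
        }
        // parts[0] is empty because the path starts with "/".
        String[] parts = uri.getPath().split("/");
        String type = parts.length > 2 ? parts[2] : "";
        return switch (type) {
            case "publisher" -> Kind.PUBLISHER;
            case "serial-publication" -> Kind.SERIAL_PUBLICATION;
            default -> throw new IllegalArgumentException(
                "Encountered URI that is not a valid publication channel id");
        };
    }
}
```

Full validation of the identifier and year would still happen downstream, as described above.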

String value();

static String value(ChannelType type, UUID identifier, Year year) {
    return "/%s/%s/%s/%s".formatted(PATH_ELEMENT_ONE, type.getType(), identifier, year);

Collaborator

You can use PATH_TEMPLATE here instead of a string literal.

}

default URI uri(String host) {
    if (isNull(host) || host.isBlank()) {

Collaborator

StringUtils.isBlank() combines both checks in a single method.

Collaborator Author

Not sure that I want the entirety of commons here yet.

import static java.util.Objects.isNull;
import static java.util.Objects.nonNull;

public interface PublicationChannelId {

Collaborator

90% of this interface is common validation rules for PublicationChannelId, plus 4 public getters. What about moving all the validation to a PublicationChannelIdValidator? Separation of concerns :)

Collaborator Author

Yep, this is a good idea. I want to see if this idea works for other types; the validator concept will be more important then, since a lot of the IRIs we create are structured in similar ways.

}

public static SerialPublicationId from(URI uri) {
    return new SerialPublicationId(PublicationChannelId.validate(uri, CHANNEL_TYPE));

Collaborator

A method named validate implies a void or boolean return type, but here we return an object.
What about doing the validation first and then extracting the identifier and year with methods in the common interface?

  public static SerialPublicationId from(URI uri) {
      PublicationChannelId.validate(uri, CHANNEL_TYPE);
      var identifier = PublicationChannelId.extractIdentifier(uri);
      var year = PublicationChannelId.extractYear(uri);
      return new SerialPublicationId(identifier, year);
  }

Then we may not need the wrapper object UuidYearPair.

Collaborator Author

Yep, this was a quick fix for an earlier issue, but I think it is probably important that we know the pair is a single thing (the identifier).

import java.time.Year;
import java.util.UUID;

public record SerialPublicationId(UUID identifier, Year year) implements PublicationChannelId {

Collaborator

A record has a public canonical constructor, which allows us to instantiate an invalid PublicationChannelId. Maybe it should be a class with a private constructor and a factory method, which forces the caller to validate the object before instantiation. For example:

@Test
    void possibleToInstantiateInvalidPublicationChannelId() {
        assertDoesNotThrow(() -> new SerialPublicationId(null, Year.parse("1999")));
    }
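
A different option from the private-constructor class suggested above is to keep the record and put the checks in a compact canonical constructor, which runs on every instantiation. A minimal sketch, assuming only null checks are needed; StrictSerialPublicationId is a hypothetical name, not a type in this PR:

```java
import java.time.Year;
import java.util.Objects;
import java.util.UUID;

// Hypothetical sketch: a record that cannot be instantiated with null components.
record StrictSerialPublicationId(UUID identifier, Year year) {

    // Compact canonical constructor: executed for every `new`, including calls
    // that delegate to it from other constructors or factory methods.
    StrictSerialPublicationId {
        Objects.requireNonNull(identifier, "identifier must not be null");
        Objects.requireNonNull(year, "year must not be null");
    }
}
```

With this in place, new StrictSerialPublicationId(null, Year.parse("1999")) throws a NullPointerException instead of producing an invalid id.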

Collaborator Author

I think what the interface is is a factory…I will refactory it :D
