Draft
71 commits
ff94dc6
restructure
theosanderson Dec 15, 2025
c7bebc3
update
theosanderson Dec 15, 2025
84d32f4
fixup
theosanderson Dec 15, 2025
4d1fb0f
update
theosanderson Dec 15, 2025
8bce4f6
fixup
theosanderson Dec 15, 2025
3faee25
up
theosanderson Dec 15, 2025
b5b7a3a
fixup
theosanderson Dec 15, 2025
eb3d998
fixup
theosanderson Dec 15, 2025
94e803d
upd
theosanderson Dec 15, 2025
0bf193e
up
theosanderson Dec 15, 2025
d3e9f0e
u
theosanderson Dec 15, 2025
2de5151
format
anna-parker Jan 5, 2026
30f9586
remove duplication
anna-parker Jan 5, 2026
22b423e
delete unused file
anna-parker Jan 5, 2026
64dade1
fix order of reference and segments
anna-parker Jan 5, 2026
b12bbec
fixup
anna-parker Jan 5, 2026
a68c52f
change prepro config as well
anna-parker Jan 5, 2026
8b687ee
fixup
anna-parker Jan 9, 2026
432a4e4
feat(website): format
anna-parker Jan 9, 2026
0d6795d
testing
anna-parker Jan 9, 2026
5b4e253
feat(config): update the config to use format decided on in slack
anna-parker Jan 12, 2026
02f2693
fixup
anna-parker Jan 12, 2026
b58d1e0
fix reading in config
anna-parker Jan 12, 2026
d9ada3a
more changes
anna-parker Jan 12, 2026
981556f
improve
anna-parker Jan 12, 2026
650f4c6
small improvements
anna-parker Jan 12, 2026
a5429ad
fix config
anna-parker Jan 12, 2026
12c11a9
fix more code
anna-parker Jan 12, 2026
4bde3dd
fix up
anna-parker Jan 12, 2026
8fcc48e
fixup
anna-parker Jan 12, 2026
a527b3f
testing
anna-parker Jan 12, 2026
cc8dd71
testing
anna-parker Jan 12, 2026
6b73354
more
anna-parker Jan 12, 2026
70a2295
fixup
anna-parker Jan 12, 2026
d277d17
fix referenceSelector
anna-parker Jan 12, 2026
8f7f697
make referenceSelector for each segment
anna-parker Jan 12, 2026
4d12b34
fix more
anna-parker Jan 12, 2026
cae8b3b
finally
anna-parker Jan 12, 2026
a7d7e5f
fixup
anna-parker Jan 13, 2026
febf14e
fix EVs
anna-parker Jan 13, 2026
88acb27
fixup
anna-parker Jan 13, 2026
aaf4871
fix tests
anna-parker Jan 13, 2026
1c7fc11
cut down code complexity
anna-parker Jan 13, 2026
ac4e5ff
revert formatting changes
anna-parker Jan 13, 2026
69cc8c4
more fixes
anna-parker Jan 13, 2026
5e26002
wupps
anna-parker Jan 13, 2026
9fc950d
ugly fix
anna-parker Jan 13, 2026
dbe896c
fixes
anna-parker Jan 13, 2026
e44b9ee
set state correctly
anna-parker Jan 13, 2026
b7ccfc6
fix selector
anna-parker Jan 13, 2026
cd2106f
extend for multi-segment
anna-parker Jan 13, 2026
f22b9ba
add multi-seg, multi-ref example
anna-parker Jan 13, 2026
a4e5c7e
also add multiple references for the M segment
anna-parker Jan 13, 2026
8e9c1ae
wupps
anna-parker Jan 13, 2026
654cc02
make reference selectable
anna-parker Jan 13, 2026
7c407e6
get rid of warning
anna-parker Jan 13, 2026
9f3706f
improve
anna-parker Jan 13, 2026
2719ff2
format
anna-parker Jan 13, 2026
c9f62e5
fix reference selector again
anna-parker Jan 13, 2026
613c62d
improve more
anna-parker Jan 13, 2026
9b06328
fix some bugs
anna-parker Jan 13, 2026
4c0d47a
fix mutation search
anna-parker Jan 14, 2026
be9d589
fix download
anna-parker Jan 14, 2026
fb0d3ed
fix download button
anna-parker Jan 14, 2026
6bffb36
fix integration tests
anna-parker Jan 14, 2026
0e6e3d4
fix: dont duplicate reference selection on search page
anna-parker Jan 14, 2026
c8b2753
fix mutation search for EVs
anna-parker Jan 14, 2026
5cf4112
more fixes
anna-parker Jan 14, 2026
cda058c
fix some bugs
anna-parker Jan 15, 2026
2aae4ad
fix revocation page bug
anna-parker Jan 15, 2026
ad66f84
fix revocation viewer
anna-parker Jan 15, 2026
58 changes: 28 additions & 30 deletions backend/docs/organismWithSuborganisms.md
@@ -1,6 +1,6 @@
# Solution Design - Organisms With Suborganisms
# Solution Design - Organisms With Multiple References ("Subtypes")

The purpose of this feature is to allow a single top level organism to contain multiple suborganisms.
The purpose of this feature is to allow a single top level organism to contain multiple references, or multiple "suborganisms".

Motivation:

@@ -39,7 +39,7 @@ defaultOrganisms:
metadataAdd:
- name: clade_cv_a10
# tells the website to only show this field on the search page when CV-A10 is selected
onlyForSuborganism: CV-A10
onlyForReference: CV-A10
preprocessing:
args:
segment: CV-A10
@@ -62,10 +62,10 @@ defaultOrganisms:
perSegment: true
website:
<<: *website
# When the website needs to know which suborganism a sequence entry belongs to,
# it will look up the value of this metadata field.
# When the website needs to know which suborganism a sequence entry belongs to (i.e. which reference it aligns to),
# it will look up the value of this metadata field (this metadata field will exist for each segment, e.g. genotype_L, genotype_M).
# Preprocessing must make sure that this field is always populated.
suborganismIdentifierField: genotype
referenceIdentifierField: genotype
preprocessing:
- <<: *preprocessing
configFile:
@@ -88,22 +88,20 @@ defaultOrganisms:
# `referenceGenomes` is now an object { suborganismName: referenceGenomeOfThatSuborganism }
# The special suborganism name `singleReference` must be used when there is only a single suborganism
referenceGenomes:
CV-A10:
nucleotideSequences:
- name: main
- name: main
references:
- reference_name: CV-A10
sequence: "..."
insdcAccessionFull: ...
genes:
- name: VP4
sequence: "..."
EV-A71:
nucleotideSequences:
- name: main
genes:
- name: VP4
sequence: "..."
- reference_name: EV-A71
sequence: "..."
insdcAccessionFull: ...
genes:
- name: VP2
sequence: "..."
genes:
- name: VP2
sequence: "..."
```

The website will then receive the `referenceGenomes` as configured above.
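
For orientation, the new `referenceGenomes` layout in this hunk reads as a list of segments, each carrying its own list of references. The following is a minimal Python sketch of walking that shape, assuming the YAML has already been parsed into plain lists and dicts. The helper name `references_per_segment` and the sample values are illustrative and not part of the PR.

```python
from typing import Any


def references_per_segment(reference_genomes: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map each segment name to the names of the references configured for it."""
    return {
        segment["name"]: [ref["reference_name"] for ref in segment.get("references", [])]
        for segment in reference_genomes
    }


# Sample loosely mirroring the single-segment, two-reference config (sequences elided).
reference_genomes = [
    {
        "name": "main",
        "references": [
            {"reference_name": "CV-A10", "sequence": "..."},
            {"reference_name": "EV-A71", "sequence": "..."},
        ],
    }
]

print(references_per_segment(reference_genomes))
```

A segment with exactly one entry in `references` corresponds to the single-reference case that the Helm templates later in this PR treat specially.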
@@ -322,10 +320,10 @@ The reference genome will be a product "suborganism x segment":
{"name": "suborganism2", "sequence": "..."}
],
"genes": [
{"name": "suborganism1_gene1", "sequence": "..."},
{"name": "suborganism1_gene2", "sequence": "..."},
{"name": "suborganism2_gene1", "sequence": "..."},
{"name": "suborganism2_gene2", "sequence": "..."}
{"name": "gene1_suborganism1", "sequence": "..."},
{"name": "gene2_suborganism1", "sequence": "..."},
{"name": "gene1_suborganism2", "sequence": "..."},
{"name": "gene2_suborganism2", "sequence": "..."}
]
}
```
@@ -335,16 +333,16 @@ for multi-segment:
```json
{
"nucleotideSequences": [
{"name": "suborganism1_segment1", "sequence": "..."},
{"name": "suborganism1_segment2", "sequence": "..."},
{"name": "suborganism2_segment1", "sequence": "..."},
{"name": "suborganism2_segment2", "sequence": "..."}
{"name": "segment1_suborganism1", "sequence": "..."},
{"name": "segment2_suborganism1", "sequence": "..."},
{"name": "segment1_suborganism2", "sequence": "..."},
{"name": "segment2_suborganism2", "sequence": "..."}
],
"genes": [
{"name": "suborganism1_gene1", "sequence": "..."},
{"name": "suborganism1_gene2", "sequence": "..."},
{"name": "suborganism2_gene1", "sequence": "..."},
{"name": "suborganism2_gene2", "sequence": "..."}
{"name": "gene1_suborganism1", "sequence": "..."},
{"name": "gene2_suborganism1", "sequence": "..."},
{"name": "gene1_suborganism2", "sequence": "..."},
{"name": "gene2_suborganism2", "sequence": "..."}
]
}
```
12 changes: 6 additions & 6 deletions ena-submission/scripts/get_ena_submission_list.py
@@ -43,24 +43,24 @@ class SubmissionResults:
@dataclass
class ENAOrganism:
enaOrganismName: str # noqa: N815
suborganismIdentifierField: str | None = None # noqa: N815
referenceIdentifierField: str | None = None # noqa: N815


def loculus_organism_to_ena_organism(config: Config) -> dict[str, list[ENAOrganism]]:
loculus_organism_to_ena_organism: dict[str, list[ENAOrganism]] = defaultdict(list)
for ena_organism, details in config.enaOrganisms.items():
if details.loculusOrganism:
if not details.suborganismIdentifierField:
if not details.referenceIdentifierField:
error_msg = (
"Could not find suborganismIdentifierField in enaOrganism "
"Could not find referenceIdentifierField in enaOrganism "
f"config for {ena_organism}"
)
logger.error(error_msg)
raise ValueError(error_msg) from None
loculus_organism_to_ena_organism[details.loculusOrganism].append(
ENAOrganism(
enaOrganismName=ena_organism,
suborganismIdentifierField=details.suborganismIdentifierField,
referenceIdentifierField=details.referenceIdentifierField,
)
)
continue
@@ -94,11 +94,11 @@ def assign_ena_organism(
entry: dict[str, Any],
ena_organisms: list[ENAOrganism],
) -> str:
"""Assign the correct ena organism based on suborganismIdentifierField if present."""
"""Assign the correct ena organism based on referenceIdentifierField if present."""
if len(ena_organisms) == 1:
return ena_organisms[0].enaOrganismName
for ena_organism in ena_organisms:
suborganism_field = ena_organism.suborganismIdentifierField
suborganism_field = ena_organism.referenceIdentifierField
if (
suborganism_field
and entry["metadata"].get(suborganism_field) == ena_organism.enaOrganismName
2 changes: 1 addition & 1 deletion ena-submission/src/ena_deposition/config.py
@@ -51,7 +51,7 @@ class EnaOrganismDetails(BaseModel):
topology: Topology = Topology.LINEAR
segments: list[str]
loculusOrganism: str | None = None # noqa: N815
suborganismIdentifierField: str | None = None # noqa: N815
referenceIdentifierField: str | None = None # noqa: N815

def is_multi_segment(self) -> bool:
return len(self.segments) > 1
16 changes: 8 additions & 8 deletions kubernetes/loculus/templates/_common-metadata.tpl
@@ -304,8 +304,8 @@ organisms:
{{- if .includeInDownloadsByDefault }}
includeInDownloadsByDefault: {{ .includeInDownloadsByDefault }}
{{- end }}
{{- if .onlyForSuborganism }}
onlyForSuborganism: {{ .onlyForSuborganism }}
{{- if .onlyForReference }}
onlyForReference: {{ .onlyForReference }}
{{- end }}
{{- if .customDisplay }}
customDisplay:
@@ -331,7 +331,7 @@ organisms:

{{/* Generate website metadata from passed metadata array */}}
{{- define "loculus.generateWebsiteMetadata" }}
{{- $rawUniqueSegments := (include "loculus.extractUniqueRawNucleotideSequenceNames" .referenceGenomes | fromYaml).segments }}
{{- $rawUniqueSegments := (include "loculus.getNucleotideSegmentNames" .referenceGenomes | fromYaml).segments }}
{{- $isSegmented := gt (len $rawUniqueSegments) 1 }}
{{- $metadataList := .metadata }}
fields:
@@ -440,7 +440,7 @@ fields:

{{/* Generate backend metadata from passed metadata array */}}
{{- define "loculus.generateBackendMetadata" }}
{{- $rawUniqueSegments := (include "loculus.extractUniqueRawNucleotideSequenceNames" .referenceGenomes | fromYaml).segments }}
{{- $rawUniqueSegments := (include "loculus.getNucleotideSegmentNames" .referenceGenomes | fromYaml).segments }}
{{- $isSegmented := gt (len $rawUniqueSegments) 1 }}
{{- $metadataList := .metadata }}
fields:
@@ -464,7 +464,7 @@ fields:

{{/* Generate backend metadata from passed metadata array */}}
{{- define "loculus.generateBackendExternalMetadata" }}
{{- $rawUniqueSegments := (include "loculus.extractUniqueRawNucleotideSequenceNames" .referenceGenomes | fromYaml).segments }}
{{- $rawUniqueSegments := (include "loculus.getNucleotideSegmentNames" .referenceGenomes | fromYaml).segments }}
{{- $isSegmented := gt (len $rawUniqueSegments) 1 }}
{{- $metadataList := .metadata }}
fields:
@@ -527,11 +527,11 @@ enaOrganisms:
{{- end }}
{{- with $instance.schema }}
{{ $configFile.configFile | toYaml | nindent 4 }}
{{- if $configFile.suborganismIdentifierField }}
suborganismIdentifierField: {{ quote $configFile.suborganismIdentifierField }}
{{- if $configFile.referenceIdentifierField }}
referenceIdentifierField: {{ quote $configFile.referenceIdentifierField }}
{{- end }}
organismName: {{ quote .organismName }}
{{- $rawUniqueSegments := (include "loculus.extractUniqueRawNucleotideSequenceNames" $instance.referenceGenomes | fromYaml).segments }}
{{- $rawUniqueSegments := (include "loculus.getNucleotideSegmentNames" $instance.referenceGenomes | fromYaml).segments }}
segments: {{ $rawUniqueSegments | toYaml | nindent 6 }}
externalMetadata:
{{- $args := dict
90 changes: 50 additions & 40 deletions kubernetes/loculus/templates/_merged-reference-genomes.tpl
@@ -1,60 +1,70 @@
{{- define "loculus.mergeReferenceGenomes" -}}
{{- $referenceGenomes := . -}}
{{- $segmentWithReferencesList := . -}}
{{- $lapisNucleotideSequences := list -}}
{{- $lapisGenes := list -}}

{{- if len $referenceGenomes | eq 1 }}
{{- include "loculus.generateReferenceGenome" (first (values $referenceGenomes)) -}}
{{- else }}
{{- range $suborganismName, $referenceGenomeRaw := $referenceGenomes -}}
{{- $referenceGenome := include "loculus.generateReferenceGenome" $referenceGenomeRaw | fromYaml -}}

{{- $nucleotideSequences := $referenceGenome.nucleotideSequences -}}
{{- if $nucleotideSequences -}}
{{- if eq (len $nucleotideSequences) 1 -}}
{{- $lapisNucleotideSequences = append $lapisNucleotideSequences (dict
"name" $suborganismName
"sequence" (first $nucleotideSequences).sequence)
-}}
{{- else -}}
{{- range $sequence := $nucleotideSequences -}}
{{- $lapisNucleotideSequences = append $lapisNucleotideSequences (dict
"name" (printf "%s-%s" $suborganismName $sequence.name)
"sequence" $sequence.sequence
) -}}
{{- end -}}
{{- end -}}
{{- if or (not $segmentWithReferencesList) (eq (len $segmentWithReferencesList) 0) -}}
{{- $result := dict "nucleotideSequences" (list) "genes" (list) -}}
{{- $result | toYaml -}}
{{- else -}}

{{- $singleSegment := eq (len $segmentWithReferencesList) 1 -}}

{{- range $segment := $segmentWithReferencesList -}}
{{- $segmentName := $segment.name -}}
{{- $singleReference := eq (len $segment.references) 1 -}}
{{- range $reference := $segment.references -}}
{{- $referenceName := $reference.reference_name -}}
{{- if $singleReference -}}
{{/* Single reference mode - no suffix */}}
{{- $lapisNucleotideSequences = append $lapisNucleotideSequences (dict
"name" $segmentName
"sequence" $reference.sequence
) -}}
{{- else -}}
{{- $name := printf "%s%s" (ternary "" (printf "%s-" $segmentName) $singleSegment) $referenceName -}}
{{- $lapisNucleotideSequences = append $lapisNucleotideSequences (dict
"name" $name
"sequence" $reference.sequence
) -}}
{{- end -}}

{{- if $referenceGenome.genes -}}
{{- range $gene := $referenceGenome.genes -}}
{{- $lapisGenes = append $lapisGenes (dict
"name" (printf "%s-%s" $suborganismName $gene.name)
"sequence" $gene.sequence)
-}}
{{/* Add genes if present */}}
{{- if $reference.genes -}}
{{- range $gene := $reference.genes -}}
{{- if $singleReference -}}
{{- $lapisGenes = append $lapisGenes (dict
"name" $gene.name
"sequence" $gene.sequence
) -}}
{{- else -}}
{{- $geneName := printf "%s-%s" $gene.name $referenceName -}}
{{- $lapisGenes = append $lapisGenes (dict
"name" $geneName
"sequence" $gene.sequence
) -}}
{{- end -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{- $result := dict "nucleotideSequences" $lapisNucleotideSequences "genes" $lapisGenes -}}
{{- $result | toYaml -}}
{{- end -}}

{{- end -}}

{{- $result := dict "nucleotideSequences" $lapisNucleotideSequences "genes" $lapisGenes -}}
{{- $result | toYaml -}}
{{- end -}}

{{- define "loculus.extractUniqueRawNucleotideSequenceNames" -}}
{{- $referenceGenomes := . -}}
{{- $segmentNames := list -}}

{{- range $suborganismName, $referenceGenomeRaw := $referenceGenomes -}}
{{- $referenceGenome := include "loculus.generateReferenceGenome" $referenceGenomeRaw | fromYaml -}}
{{- define "loculus.getNucleotideSegmentNames" -}}
{{- $segmentWithReferencesList := . -}}

{{- range $sequence := $referenceGenome.nucleotideSequences -}}
{{- $segmentNames = append $segmentNames $sequence.name -}}
{{- end -}}
{{/* Extract segment names directly from .name */}}
{{- $segmentNames := list -}}
{{- range $segment := $segmentWithReferencesList -}}
{{- $segmentNames = append $segmentNames $segment.name -}}
{{- end -}}

segments:
{{- $segmentNames | uniq | toYaml | nindent 2 -}}
{{- $segmentNames | sortAlpha | toYaml | nindent 2 -}}
{{- end -}}
2 changes: 1 addition & 1 deletion kubernetes/loculus/templates/_preprocessingFromValues.tpl
@@ -58,7 +58,7 @@
{{- $metadata := .metadata }}
{{- $referenceGenomes := .referenceGenomes}}

{{- $rawUniqueSegments := (include "loculus.extractUniqueRawNucleotideSequenceNames" $referenceGenomes | fromYaml).segments }}
{{- $rawUniqueSegments := (include "loculus.getNucleotideSegmentNames" $referenceGenomes | fromYaml).segments }}
{{- $isSegmented := gt (len $rawUniqueSegments) 1 }}

{{- range $metadata }}
2 changes: 1 addition & 1 deletion kubernetes/loculus/templates/_siloDatabaseConfig.tpl
@@ -13,7 +13,7 @@

{{- define "loculus.siloDatabaseConfig" }}
{{- $schema := .schema }}
{{- $rawUniqueSegments := (include "loculus.extractUniqueRawNucleotideSequenceNames" .referenceGenomes | fromYaml).segments }}
{{- $rawUniqueSegments := (include "loculus.getNucleotideSegmentNames" .referenceGenomes | fromYaml).segments }}
{{- $isSegmented := gt (len $rawUniqueSegments) 1 }}
schema:
instanceName: {{ $schema.organismName }}
2 changes: 1 addition & 1 deletion kubernetes/loculus/templates/ingest-config.yaml
@@ -8,7 +8,7 @@
{{- $values := $item.contents }}
{{- if $values.ingest }}
{{- $metadata := (include "loculus.patchMetadataSchema" $values.schema | fromYaml).metadata }}
{{- $rawUniqueSegments := (include "loculus.extractUniqueRawNucleotideSequenceNames" $values.referenceGenomes | fromYaml).segments }}
{{- $rawUniqueSegments := (include "loculus.getNucleotideSegmentNames" $values.referenceGenomes | fromYaml).segments }}
---
apiVersion: v1
kind: ConfigMap