Restructure threat-model sources by controller#32
Conversation
Replace the "Data sources" section's trusted/untrusted split with a "Sources" section organised by who controls each input: configuration (operator-controlled), structural identifiers and control (developer-controlled), and content (user-controlled). This makes the trust classification explicit and gives a clean basis for triaging which reports are in scope. Thread context values are classified as untrusted content. The trust level of thread context keys is deferred to discussion apache/logging-log4j2#4132 and marked as a known open gap. The adversary-capabilities section now references the new content subsection, drops thread context keys from the in-scope channels pending that discussion, and marks an adversary who controls developer-controlled structural or control inputs as out of scope. Sink concepts (structured vs unstructured layouts, active vs passive sinks, downstream-destination trust) are intentionally left for a follow-up PR. Assisted-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR restructures the "Data sources" section of the common threat model document into a "Sources" section organized by who controls each input (operator/developer/user), making the trust classification explicit. It also aligns the adversary-capabilities section with the new categories and defers the classification of thread context keys to an external discussion, marking it as a known open gap.
Changes:
- Replace the flat trusted/untrusted split with three controller-based subsections: Configuration (operator), Structural identifiers and control (developer), and Content (user).
- Add an explicit remediation rule (trusted inputs may be rejected; user-controlled content must not be rejected) and move the deserialization/object-stringification guidance under Content.
- Update adversary-capability cross-references to the new anchors, drop thread context keys from in-scope channels, and mark developer-controlled structural/control inputs as out of scope.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
FreeAndNil
left a comment
There was a problem hiding this comment.
Maybe we should put the string conversion topic in a separate note to move it a little bit away from the bullet points before about what we don't trust?
| * The logging frameworks do not trust neither the keys nor the values in the thread context. | ||
| * They **do not** trust the **values** stored in the thread context. | ||
|
|
||
| The frameworks **trust** that the objects passed to a log statement can be safely converted to strings. |
There was a problem hiding this comment.
| The frameworks **trust** that the objects passed to a log statement can be safely converted to strings. | |
| [NOTE] | |
| ==== | |
| Although the frameworks accept arbitrary content, they **trust** that the objects passed to a log statement can be safely converted to strings. |
Maybe we should put this in a separate note to move it a little bit away from the bullet points before about what we don't trust?
|
|
||
| The frameworks **trust** that the objects passed to a log statement can be safely converted to strings. | ||
| They **should not** be used to log deserialized data from untrusted sources; see https://owasp.org/www-community/vulnerabilities/Deserialization_of_untrusted_data[the related OWASP guide]. | ||
|
|
Restructures the "Data sources" section of the common threat model into a Sources section organised by who controls each input, rather than a flat trusted/untrusted split.
The three categories are defined by their controller:
MSGIDandSD-IDfields), and the format string of a parameterized log statement.Why
Triaging recent reports showed that reporters have a difficulty into classifying sources correctly, such as the structural identifiers of a structured message. Organising sources by controller makes the trust classification explicit and gives a clean basis for deciding which reports are in scope.
It also lets the model state the remediation rule that follows from the classification: trusted inputs may be rejected (fail-fast, for example by throwing), while user-controlled content must be accepted and never rejected, since rejecting it would turn a malicious value into a denial of service.
Thread context keys
Thread context values are classified as untrusted content. The trust level of thread context keys is deferred to discussion apache/logging-log4j2#4132 and marked in the document as a known open gap, rather than silently choosing a side.
Dependent changes
#threat-common-sourcesanchor is retained.Out of scope for this PR
Sink-side concepts (structured versus unstructured layouts, active versus passive sinks, and downstream-destination trust) are intentionally left for a follow-up PR.