[FFL-1720] Evaluation Logging: Aggregation Engine & Test Utilities by typotter · Pull Request #3145 · DataDog/dd-sdk-android

typotter · 2026-01-22T08:48:46Z

🥞 Evaluation Logging Stacked Pull Requests 🥞

🔲 Integration & Configuration (PR #3147)
🔲 Storage & Network Infrastructure (PR #3146)
👉 Aggregation Engine & Test Utilities (this PR)
☑️ FlagEvaluation Schema (PR #3166)
☑️ Event Schema & Data Models (PR #3144)
☑️ Evaluations Subfeature (PR #3159)
⎿ feature/flags-evaluations-logging (feature branch)

Datadog Internal
🎟️ Ticket: FFL-1720 - Implement Evaluation Logging for Android SDK

What does this PR do?

Implements the core evaluations aggregation engine that groups feature flag evaluations by key characteristics and manages periodic flushing from memory to persistent storage (via StorageBackedFeature/EvaluationsFeature).

The EvaluationEventProcessor manages concurrent access and mutation of AggregationStats instances in memory. Each Stats object is given an AggregationKey - a composite key made of the set of identifiers across the evaluation (targetingKey, variationKey, etc.) and SDK context (rum View ID). This ensures proper evaluation counting under the "same" circumstances.

The processor also manages the periodic and size-based triggers for flushing the in-memory aggregations to persistent storage.

Motivation

We need to implement Evaluation Logging to provide comprehensive visibility into all feature flag evaluations, including defaults, errors, and successful matches. This goes beyond exposure logging by capturing aggregated metrics about evaluation frequency, error rates, and runtime default usage across all flags.

Description

This PR adds the core business logic for evaluation logging:

Implementation:

AggregationKey: Defines how evaluations are grouped (by flag, variant, allocation, targeting key, and error code)
AggregationStats: Tracks count, first/last timestamps, and last error message for each aggregation
EvaluationEventsProcessor: Orchestrates aggregation, manages flush triggers (time-based, size-based, shutdown)
EvaluationEventWriter: Interface for persisting evaluation events (implementation in PR [FFL-1720] Evaluation Logging: Storage & Network Infrastructure #3146)

Additional Notes

Review checklist (to be filled by reviewers)

Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
Make sure you discussed the feature or bugfix with the maintaining team in an Issue
Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

typotter · 2026-01-22T09:27:30Z

[FFL-1720] Evaluation Logging: Integration #3147
[FFL-1720] Evaluation Logging: Storage & Network Infrastructure #3146
[FFL-1720] Evaluation Logging: Aggregation Engine & Test Utilities #3145 👈 (View in Graphite)
[FFL-1720] Evaluation Logging: Event Schema & Data Models #3144
feature/flags-evaluations-logging

This stack of pull requests is managed by Graphite. Learn more about stacking.

...oid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationStats.kt

...ndroid-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventsProcessor.kt

...oid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationStats.kt

dd-oleksii · 2026-01-23T17:30:01Z

...oid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationStats.kt

+        // Take atomic snapshot of statistics to prevent torn reads
+        val snapshotCount: Int
+        val snapshotFirst: Long
+        val snapshotLast: Long
+        val snapshotMessage: String?
+        synchronized(this) {
+            snapshotCount = count
+            snapshotFirst = firstEvaluation
+            snapshotLast = lastEvaluation
+            snapshotMessage = lastErrorMessage
+        }


minor: this piece is confusing. These fields are volatile, so torn read is not possible. Otherwise, this class is not designed to handle concurrent calls to recordEvaluation and toEvaluationEvent as that may leave some events uncounted.

perhaps this is being overly defensive. Because of the CAS of the map of aggregation stats for flushing, _no other thread should be calling toEvaluationEvent or recordEvaluation on this AggregationStats instance.

Without this synchronized, recordEvaluation could change the computed values while we're getting them for constructing the FlagEvaluation

With the synchronized CAS snapshotting the aggregationMap for flushing, this method should be protected from another thread calling recordEvaluation which could change the values as they're being grabbed for the FlagEvaluation constructor
Is this too defensive?

...droid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationKey.kt

...dk-android-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventWriter.kt

...ndroid-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventsProcessor.kt

codecov-commenter · 2026-01-27T20:04:57Z

Codecov Report

❌ Patch coverage is 97.18310% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.99%. Comparing base (986afe7) to head (2c13aff).

Files with missing lines	Patch %	Lines
...flags/internal/aggregation/EvaluationAggregator.kt	93.55%	0 Missing and 2 partials ⚠️
...ndroid/flags/internal/EvaluationEventsProcessor.kt	98.25%	1 Missing ⚠️
...oid/flags/internal/aggregation/AggregationStats.kt	97.92%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@                          Coverage Diff                          @@
##           feature/flags-evaluations-logging    #3145      +/-   ##
=====================================================================
+ Coverage                              70.81%   70.99%   +0.18%     
=====================================================================
  Files                                    901      905       +4     
  Lines                                  33172    33314     +142     
  Branches                                5596     5618      +22     
=====================================================================
+ Hits                                   23489    23648     +159     
+ Misses                                  8118     8106      -12     
+ Partials                                1565     1560       -5

Files with missing lines	Coverage Δ
...droid/flags/internal/aggregation/AggregationKey.kt	`100.00% <100.00%> (ø)`
...ndroid/flags/internal/EvaluationEventsProcessor.kt	`98.25% <98.25%> (ø)`
...oid/flags/internal/aggregation/AggregationStats.kt	`97.92% <97.92%> (ø)`
...flags/internal/aggregation/EvaluationAggregator.kt	`93.55% <93.55%> (ø)`

... and 35 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

typotter

Thanks Oleksii, as always for your 👀 . ptal

dd-oleksii · 2026-02-03T14:22:56Z

...oid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationStats.kt

+    private val context: EvaluationContext,
+    private val service: String?,
+    private val rumApplicationId: String?,
+    private val rumViewName: String?,


nit: view name is duplicated in aggregation key. do we need it here?

...droid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationKey.kt

dd-oleksii · 2026-02-03T14:27:28Z

...oid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationStats.kt

+            runtimeDefaultUsed = reason == null ||
+                reason == ResolutionReason.DEFAULT.name ||
+                reason == ResolutionReason.ERROR.name


minor: the condition I would use here is aggregationKey.variantKey == null. Reasons are somewhat less reliable and it's valid to return runtime default with reasons besides "default" or "error"

good idea. fixed

dd-oleksii · 2026-02-03T14:41:40Z

...ndroid-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventsProcessor.kt

+        if (shouldFlush) {
+            flush()
+        }


minor: if there are many threads recording evaluations, it's still possible to flush too frequently:

Thread A records an event, shouldFlush is true

Thread B records an event, shouldFlush is true

Thread A flushes all events, aggregation map is now empty

Thread C records an event, shouldFlush is false

Thread B tries to flush, flushes a single event (written by C)

Note in my previous message, the size limit is re-checked after acquiring a write lock and holding it through to the flush.

Two ways we can implement this in the current structure: either add a parameter to flush() for it to do an extra check and only flush if the size exceeds the limit. Alternatively, we can make aggregator.record() return drained records instead of a flag—then we can keep this synchronization entirely inside aggregator.

return drained records instead of a flag—then we can keep this synchronization entirely inside aggregator.

Interesting approach

yeah, the only downside is that it doesn't cancel scheduled flush holding the same lock, so it's possible for:

Thread A calls aggregator.record() which triggers size limit and drains records

Thread B writes more records

Thread C — scheduled flush triggers

Thread A flushes returned records

Thread C continues with scheduled flush and writes just a few records

This is probably less severe than the previous situation with extra flushes due to size limit because there's at most one scheduled flush whereas the number of threads producing evaluations can be much higher.

...flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/EvaluationAggregator.kt

dd-oleksii · 2026-02-03T19:07:39Z

...flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/EvaluationAggregator.kt

+        }
+        val toDrain = aggregationMap
+        aggregationMap = ConcurrentHashMap()
+        return toDrain.map { (_, stats) -> stats.toEvaluationEvent() }


minor: this mapping is best done outside of write lock

dd-oleksii · 2026-02-03T19:08:35Z

...ndroid-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventsProcessor.kt

+        )
+
+        if (drainedEvents != null) {
+            writer.writeAll(drainedEvents)


re-schedule periodic flush here? (don't forget flushMutex, so we don't schedule multiple futures simultaneously)

0xnm · 2026-02-03T22:00:33Z

...dk-android-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventWriter.kt

+     *
+     * @param events the evaluation events to write to storage
+     */
+    fun writeAll(events: List<FlagEvaluation>)


nit: as I noted in another review, maybe it can be just write, especially if we don't expect another methods in this interface

0xnm · 2026-02-03T22:04:20Z

...droid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationKey.kt

+    val flagKey: String,
+    val variantKey: String? = null,
+    val allocationKey: String? = null,
+    val targetingKey: String?,


Is there specific reason why targetingKey has no default value as other nullable properties of this class?

Should this property be nullable?

It must be nullable because the user might not pass the targeting key. But null is not a good default value per se as that might hide issues/bugs.

I would suggest removing default values from the rest of the properties. If they are null, it's better to pass that explicitly

0xnm · 2026-02-03T22:14:23Z

...ndroid-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventsProcessor.kt

+import kotlin.concurrent.withLock
+
+/**
+ * Aggregates flag evaluation events before writing them to storage.


It seems that aggregation logic can be done on the backend side, is there a reason we do this on the client side? Seems like doing it on the client side adds some complexity.

Or is it because otherwise we have to send too many events to the intake?

Yep, the client-side aggregation is to reduce network usage. Flag evaluations can be on the hot path in users' code, and would generate tons of events without aggregation.

0xnm · 2026-02-05T07:44:36Z

...flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/EvaluationAggregator.kt

+        val toDrain = aggregationMap
+        aggregationMap = ConcurrentHashMap()


this can be simpler:

val toDrain = aggregationMap.toMap() aggregationMap.clear()

and then aggregationMap can be simply val without @Volatile

0xnm · 2026-02-05T07:53:19Z

...flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/EvaluationAggregator.kt

+            errorMessage = errorMessage
+        )
+
+        mapLock.read {


although aggregationMap.putIfAbsent below is a write operation

0xnm · 2026-02-05T07:58:38Z

...flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/EvaluationAggregator.kt

+            existing?.recordEvaluation(timestamp, errorMessage)
+        }
+
+        if (aggregationMap.size < maxAggregations) {


note:

* Bear in mind that the results of aggregate status methods including * {@code size}, {@code isEmpty}, and {@code containsValue} are typically * useful only when a map is not undergoing concurrent updates in other threads.

but probably it is fine is .size call is off by a bit

0xnm · 2026-02-05T08:12:00Z

...oid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationStats.kt

+    private var count: Int = 1
+    private var firstEvaluation: Long = firstTimestamp
+    private var lastEvaluation: Long = firstTimestamp
+    private var lastErrorMessage: String? = errorMessage


I think this can be done simpler, without mutability and synchronisation primitives.

AggregationStats can be just a data class, meaning when update is needed .copy method is called at the usage side.

0xnm · 2026-02-05T08:12:55Z

...oid-flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/AggregationStats.kt

+                    FlagEvaluation.Rum(
+                        application = FlagEvaluation.Application(id = appId),
+                        view = aggregationKey.viewName?.let { viewName ->
+                            FlagEvaluation.View(url = viewName)


Generally speaking View URL != View Name. Is there a mistake here?

0xnm · 2026-02-05T08:15:08Z

...flags/src/main/kotlin/com/datadog/android/flags/internal/aggregation/EvaluationAggregator.kt

+    @Volatile
+    private var aggregationMap = ConcurrentHashMap<AggregationKey, AggregationStats>()
+
+    private val mapLock = ReentrantReadWriteLock()


if we guard concurrent access to this class from the processor side, why do we need another guard here?

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from e1704c3 to 95f4c0a Compare January 22, 2026 09:27

This was referenced Jan 22, 2026

[FFL-1720] Evaluation Logging: Event Schema & Data Models #3144

Merged

[FFL-1720] Evaluation Logging: Storage & Network Infrastructure #3146

Open

[FFL-1720] Evaluation Logging: Integration #3147

Open

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from 95f4c0a to a0e7df9 Compare January 22, 2026 16:29

This comment has been minimized.

Sign in to view

dd-oleksii reviewed Jan 23, 2026

View reviewed changes

...dk-android-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventWriter.kt Outdated Show resolved Hide resolved

dd-oleksii reviewed Jan 23, 2026

View reviewed changes

...ndroid-flags/src/main/kotlin/com/datadog/android/flags/internal/EvaluationEventsProcessor.kt Outdated Show resolved Hide resolved

sameerank mentioned this pull request Jan 26, 2026

FFL-1721 Emit flagevaluation EVP Track DataDog/dd-sdk-ios#2646

Open

5 tasks

typotter mentioned this pull request Jan 27, 2026

[Flags] Evaluations subfeature #3159

Merged

3 tasks

typotter force-pushed the typo/FFL-1720-pr1-schema-models branch 2 times, most recently from eb13adb to c924067 Compare January 27, 2026 19:25

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from 97e4a4e to eb0ace5 Compare January 27, 2026 19:29

typotter force-pushed the typo/FFL-1720-pr1-schema-models branch from c924067 to a031b97 Compare January 28, 2026 06:34

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from 968dc68 to 80c8de7 Compare January 28, 2026 06:55

typotter force-pushed the typo/FFL-1720-pr1-schema-models branch 2 times, most recently from ddeaaa3 to 75827d0 Compare January 28, 2026 07:27

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from 2eb16ca to 0cfb30a Compare January 28, 2026 07:27

typotter requested a review from dd-oleksii January 28, 2026 07:27

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from 0cfb30a to 141cded Compare January 28, 2026 07:50

typotter marked this pull request as ready for review January 28, 2026 07:57

typotter requested a review from a team as a code owner January 28, 2026 07:57

typotter force-pushed the typo/FFL-1720-pr1-schema-models branch 2 times, most recently from 1c0c827 to 00b8f55 Compare January 28, 2026 16:30

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from 141cded to 4bfc37e Compare January 28, 2026 16:33

typotter force-pushed the typo/FFL-1720-pr1-schema-models branch from 00b8f55 to 7dba2fe Compare January 28, 2026 17:01

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from 4bfc37e to 550f9f0 Compare January 28, 2026 17:05

typotter added 18 commits February 2, 2026 11:39

Add RUM view to the evaluation aggregation key

fe8347c

create evaluation event context (inc. dd context) in evaluation event

2bac376

detekt

3fc36d8

internalize DDContext contruction

b51581e

improve tests

71974f1

aggregation key in stats, new tests, other fixes

170e50d

remove write method, fix tests

7cc7763

use singular schema instead of batched

3ad8195

reschedule default false

f7eabda

fix flakey tests

3fd3bcd

fix for new schema

2cd54a7

Refactor to DatadogContext

9473435

artifact

145a8b6

pass fields to exposure processor, leave fetch to different layer

24db497

refine locks and split processor into aggregator + processor

4460c80

lint

9fa27be

inject aggregator into processor

0636adb

simplify periodic schedule bool

66b62f4

typotter force-pushed the typo/FFL-1720-pr2-aggregation branch from e09cbe5 to 66b62f4 Compare February 2, 2026 18:47

typotter requested a review from dd-oleksii February 2, 2026 18:58

typotter commented Feb 2, 2026

View reviewed changes

dd-oleksii previously approved these changes Feb 3, 2026

View reviewed changes

typotter added 3 commits February 3, 2026 10:34

reduce redundant fields in aggregation objects

89b75cc

drain events on record, avoid spurious capacity flushes

eae5970

lint

2c13aff

typotter dismissed dd-oleksii’s stale review via 2c13aff February 3, 2026 19:02

typotter requested a review from dd-oleksii February 3, 2026 19:03

dd-oleksii approved these changes Feb 3, 2026

View reviewed changes

0xnm reviewed Feb 3, 2026

View reviewed changes

0xnm reviewed Feb 5, 2026

View reviewed changes

		val toDrain = aggregationMap
		aggregationMap = ConcurrentHashMap()

Conversation

typotter commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🥞 Evaluation Logging Stacked Pull Requests 🥞

What does this PR do?

Motivation

Description

Additional Notes

Review checklist (to be filled by reviewers)

Uh oh!

typotter commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment has been minimized.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov-commenter commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

typotter left a comment

Choose a reason for hiding this comment

Uh oh!

dd-oleksii Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dd-oleksii Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

typotter Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dd-oleksii Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

0xnm Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

typotter commented Jan 22, 2026 •

edited

Loading

typotter commented Jan 22, 2026 •

edited

Loading

codecov-commenter commented Jan 27, 2026 •

edited

Loading

dd-oleksii Feb 3, 2026 •

edited

Loading

dd-oleksii Feb 3, 2026 •

edited

Loading

typotter Feb 3, 2026 •

edited

Loading

dd-oleksii Feb 3, 2026 •

edited

Loading

0xnm Feb 3, 2026 •

edited

Loading