Problem
I need a single activity fact table that contains every activity type along with pre-computed temporal flags. The goal is to build a semantic layer for self-service analytics. I think this could be useful to others as well, since agents querying semantic layers will only become more common.
Why pre-computed flags?
- MetricFlow and Cube have limited query-time join capabilities on large datasets
- Reduces compute at runtime (e.g. when an agent issues many queries while looking for insights)
- Ensures metric consistency across queries
Why dataset_column?
The dataset_column pattern is well suited to reusable temporal flags, but it currently requires filtering to a specific activity type:
```sql
select all order (...)    -- ✅ Works, but only orders
select all pageview (...) -- ✅ Works, but only pageviews
select all * (...)        -- ❌ Not supported
```
We need temporal flags computed across the entire stream (or null where not applicable), not just filtered subsets.
Proposed syntax
```sql
{% set aql %}
using activity_stream
select all * (
    ts,
    entity_key,
    activity_id,
    activity,
    entity2_join_key
    ...
)
include (
    first_order_date,
    is_active_at_date,
    total_events_before_at_date,
    total_events_last_month_at_date,
)
{% endset %}
{{ dbt_activity_schema.dataset(aql) }}
```
I'm not sure whether my case is too specific (in which case I can just build window functions for my flags), or whether supporting this would help make the package future-ready.
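For reference, the window-function fallback would look roughly like the sketch below. It is only an illustration under assumptions: it takes a table named `activity_stream` with the columns from the proposed syntax (`ts`, `entity_key`, `activity_id`, `activity`), assumes orders are stored with `activity = 'order'`, and interprets `first_order_date` as the entity's earliest order overall. It covers only two of the flags and is not something the package generates today.

```sql
-- Sketch only: table name, column names, and the 'order' activity value
-- are assumptions taken from the proposed syntax, not package output.
select
    activity_id,
    entity_key,
    activity,
    ts,
    -- Earliest order timestamp per entity, repeated on every row;
    -- null for entities that have never ordered.
    min(case when activity = 'order' then ts end)
        over (partition by entity_key) as first_order_date,
    -- Number of events for the same entity that sort before this row
    -- (0 for the entity's first event; ties on ts are ordered arbitrarily).
    count(*) over (
        partition by entity_key
        order by ts
        rows between unbounded preceding and 1 preceding
    ) as total_events_before
from activity_stream
```

This works across all activity types in a single pass, but every flag has to be written and maintained by hand, which is the part I was hoping the package could take over.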