Move main graph to new API v2 endpoint #6152

Open
RobertJoonas wants to merge 13 commits into master from main-graph-to-api-v2

@RobertJoonas commented Mar 10, 2026

Changes

Needs a rebase and is most likely in conflict with frontend changes. Builds on top of an intermediate state of #6119.

  • Implement the required logic in the new /query endpoint to support everything that main-graph does
  • A major rewrite of main_graph_test.exs - populate_stats and the rest of the test setup remain the same, but the request is now made against the new endpoint. The assertions on the data have also changed significantly
  • Support returning comparisons in the API v2 response format. This was tricky to figure out. See below for details.
  • Fix a known bug in API v2 (see commit with description 4923839)
  • Remove the main graph endpoint
  • Remove comparisons logic from legacy timeseries (see commit with description 7ad120a)
  • Wire up the UI properly
  • Update changelog

New comparison response format for timeseries data

Returning comparison data in time buckets was not implemented before. This PR implements it. Time buckets differ from comparisons with empty dimensions or non-time dimensions, where every dimension group in the comparison results is guaranteed to have a counterpart in the original results:

  • In case of empty dimensions, the length is just 1.
  • In case of non-time dimensions (e.g. ["event:page"]), an IN filter guarantees that the comparison results are a subset of the original results and can never include a dimension group that's not in the original results.

Therefore, in these two cases, it's very easy to attribute a comparison object to each item (row) in the results list.
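
To illustrate why these two cases are simple, here is a minimal TypeScript sketch (not the actual Elixir implementation). The `Row` shape and the `attachComparisons` helper are hypothetical names, and using `null` for a group with no comparison data is an assumption made for the example.

```typescript
// Hypothetical row shapes, for illustration only.
interface Row {
  dimensions: string[];
  metrics: number[];
}

interface RowWithComparison extends Row {
  comparison: Row | null;
}

// Attach a comparison object to each original row by matching dimensions.
// Because of the IN filter, comparison rows are a subset of the original
// dimension groups, so a lookup keyed on dimensions is sufficient.
function attachComparisons(
  results: Row[],
  comparisonResults: Row[],
): RowWithComparison[] {
  const byDimensions = new Map<string, Row>();
  for (const row of comparisonResults) {
    byDimensions.set(JSON.stringify(row.dimensions), row);
  }
  return results.map((row) => ({
    ...row,
    // Assumption: null marks a group without comparison data.
    comparison: byDimensions.get(JSON.stringify(row.dimensions)) ?? null,
  }));
}
```

With empty dimensions, both lists hold a single row whose `dimensions` is `[]`, so the same lookup applies.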

With a time dimension, however, the comparison results might contain fewer or more rows than the original results. For that reason, when constructing the final results list, we iterate over time labels instead of the main results, and not just over the main query time labels: if comparison_time_labels is longer, it also makes the results list grow, filling the original results with nil rows. (See query_test.exs in the diff to get a better idea of how it works.)
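
The iteration described above can be sketched roughly as follows. This is a TypeScript illustration, not the Elixir code in this PR: `zipTimeBuckets` and the type names are hypothetical, and representing a "nil row" as `dimensions: null, metrics: null` (and a missing comparison as `null`) is an assumption. See query_test.exs for the real behaviour.

```typescript
type Metrics = number[];

// Hypothetical shapes, for illustration only.
interface Bucket {
  dimensions: [string];
  metrics: Metrics;
}

interface TimeRow {
  dimensions: [string] | null; // null when only the comparison has this bucket
  metrics: Metrics | null;
  comparison: Bucket | null; // null when the comparison has fewer buckets
}

// Pair original and comparison buckets by position in their label lists.
// Buckets the database did not return are filled with `emptyMetrics`.
function zipTimeBuckets(
  timeLabels: string[],
  comparisonTimeLabels: string[],
  main: Map<string, Metrics>,
  comparison: Map<string, Metrics>,
  emptyMetrics: Metrics,
): TimeRow[] {
  const length = Math.max(timeLabels.length, comparisonTimeLabels.length);
  const rows: TimeRow[] = [];
  for (let i = 0; i < length; i++) {
    const label: string | undefined = timeLabels[i];
    const cmpLabel: string | undefined = comparisonTimeLabels[i];
    rows.push({
      dimensions: label === undefined ? null : [label],
      metrics: label === undefined ? null : (main.get(label) ?? emptyMetrics),
      comparison:
        cmpLabel === undefined
          ? null
          : {
              dimensions: [cmpLabel],
              metrics: comparison.get(cmpLabel) ?? emptyMetrics,
            },
    });
  }
  return rows;
}
```

Pairing by index rather than by label value is what lets a longer comparison period extend the results list past the last original bucket.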

It's also important to note that ClickHouse queries do not return groups without any data. Therefore, for the response to indicate which comparison bucket corresponds to which original bucket, those groups need to be filled with empty metrics. This is how it worked before as well.

Whenever comparisons are not queried though, the results list doesn't include those empty metric buckets. Therefore, the frontend code (which is not yet implemented) needs to account for that (cc @apata):

  • When comparisons are queried
    • iterate over the results list to get both the plot and comparison_plot. There's meta.time_labels in the response, but no meta.comparison_time_labels. Comparison time labels should be read from the comparison object in each row (the dimensions key).
  • When comparisons are NOT queried
    • iterate over time labels to get the plot -- if results.find(r => r.dimensions[0] == time_label) returns a row, read its metrics; otherwise, it's up to the frontend to fill that gap in the plot with an "empty metric" value. Thinking of it now, it sounds easier to solve this problem on the backend (we wouldn't have to duplicate the logic of "what is an empty value for this metric" on the frontend either). If we decide to go that way though, we'll want to keep this exclusive to the internal API.
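
A rough TypeScript sketch of the two frontend cases, since that code is not written yet. The type and function names are hypothetical, the response shape is inferred from the description above, and `metricIndex` assumes that `metrics` is an array ordered like the queried metrics.

```typescript
type Metrics = number[];

// Hypothetical response shape, inferred from the PR description.
interface ResultRow {
  dimensions: [string] | null;
  metrics: Metrics | null;
  comparison?: { dimensions: [string]; metrics: Metrics } | null;
}

interface QueryResponse {
  results: ResultRow[];
  meta: { time_labels: string[] };
}

// Comparisons queried: one pass over `results` yields both series.
// Comparison labels come from each row's comparison.dimensions.
function plotsWithComparison(res: QueryResponse, metricIndex: number) {
  return {
    plot: res.results.map((r) => r.metrics?.[metricIndex] ?? null),
    comparisonPlot: res.results.map(
      (r) => r.comparison?.metrics[metricIndex] ?? null,
    ),
    comparisonLabels: res.results.map(
      (r) => r.comparison?.dimensions[0] ?? null,
    ),
  };
}

// Comparisons NOT queried: iterate over meta.time_labels and fill buckets
// missing from `results` with a caller-supplied empty value.
function plotWithoutComparison(
  res: QueryResponse,
  metricIndex: number,
  emptyValue: number,
): number[] {
  const byLabel = new Map<string, ResultRow>();
  for (const row of res.results) {
    if (row.dimensions !== null) byLabel.set(row.dimensions[0], row);
  }
  return res.meta.time_labels.map(
    (label) => byLabel.get(label)?.metrics?.[metricIndex] ?? emptyValue,
  );
}
```

The `emptyValue` parameter is exactly the per-metric knowledge that would no longer be needed on the frontend if the backend filled the gaps instead.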

Tests

  • Automated tests have been added

Changelog

  • Needs a changelog update

Documentation

  • This change does not need a documentation update

Dark mode

  • This PR does not change the UI

This is currently only used by the main graph, which is going to move to
a new endpoint in this PR.

* The CSV export is currently ignoring comparisons

* The `&compare=previous_period` option in Stats API v1 is ignored by
  the timeseries endpoint
* For time:hour and time:minute, sessions are smeared using time_slots.
  The fix is to filter out time_slots that fall outside of the UTC boundaries

* For any other time dimension, there's no session smearing, but since
  sessions are put into time buckets by the last event timestamps, the
  query might return buckets that are outside of the query time range.
  The fix is to clamp those sessions into the last bucket instead.
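
As an illustration of the clamping fix only (the real change lives in the ClickHouse query, which is not shown here), a minimal TypeScript sketch follows. The function names are hypothetical, and buckets are assumed to be ISO "YYYY-MM-DD" strings, so plain string comparison orders them chronologically.

```typescript
// A session is bucketed by the date of its last event. If that bucket falls
// after the last bucket of the query range, move it into the last bucket.
function clampToLastBucket(sessionBucket: string, lastBucket: string): string {
  return sessionBucket > lastBucket ? lastBucket : sessionBucket;
}

// Count sessions per bucket after clamping, so no bucket outside the
// query time range appears in the output.
function countSessionsPerBucket(
  sessionBuckets: string[],
  lastBucket: string,
): Map<string, number> {
  const counts = new Map<string, number>();
  for (const bucket of sessionBuckets) {
    const key = clampToLastBucket(bucket, lastBucket);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```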

@RobertJoonas force-pushed the main-graph-to-api-v2 branch from b75763c to e643b00 on March 11, 2026 15:28