7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,13 @@

# Changelog

### v1.1.0 (Mergebox RAM Residency Collector)

- **New `MergeboxCollector`** - Measures Meteor's MERGEBOX RAM residency (the per-session, server-side cache of published documents) and posts per-(publication, collection) rollups to `POST /api/v1/metrics/mergebox`. The collector walks `Meteor.server.sessions` read-only, estimates the resident bytes each session's mergebox holds per published collection (sizing each `SessionDocumentView.dataByKey` field value directly), reads the publication strategy via `Meteor.server.getPublicationStrategy()` (reverse-mapped to all four Meteor strategies — `SERVER_MERGE` / `NO_MERGE` / `NO_MERGE_NO_HISTORY` / `NO_MERGE_MULTI`; `unknown` only when the strategy genuinely can't be read), and attributes residency to subscriptions via a pure even-split across `existsIn`. The even-split is sum-preserving: the rows for a collection sum back to that collection's true residency. `connectionCount` is a count of distinct DDP sessions (never a list of connection ids).
- **Ships dark / opt-in** - `collectMergebox` defaults to **false**. Enable via `SkySignalAgent.configure({ collectMergebox: true })` or `SKYSIGNAL_COLLECT_MERGEBOX=true`.
- **Performance-bounded** - 60s default cadence (`mergeboxInterval`), per-session sampling (`mergeboxSampleRate`, every row stamps the rate for server-side extrapolation), `mergeboxMaxSessions` / `mergeboxMaxDocsPerSession` caps to bound a single synchronous tick, per-session try/catch, feature-detection of internal shapes, and a top-N output cap aligned with the server's 500-rows-per-POST limit. Read-only — never wraps `session.send` / `processMessage`.
- Requires platform-side support for the mergebox ingest endpoint (gated on the `ddp` feature / pro tier).
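
The even-split attribution described above can be illustrated with a minimal sketch. This is not the collector's actual code: the `docs` input shape, the `evenSplitResidency` name, and the `(unattributed)` bucket are assumptions made for illustration. Only the idea of dividing each document's estimated bytes equally across the subscriptions in its `existsIn` set comes from the changelog entry.

```javascript
// Hypothetical input: one entry per resident document, with an estimated
// byte size and the set of subscriptions that currently publish it
// (mirroring the `existsIn` set on a session document view).
function evenSplitResidency(docs) {
  const bytesBySubscription = new Map();

  const credit = (key, amount) => {
    bytesBySubscription.set(key, (bytesBySubscription.get(key) || 0) + amount);
  };

  for (const { bytes, existsIn } of docs) {
    if (!existsIn || existsIn.size === 0) {
      // Assumption: keep unowned bytes in a catch-all bucket so the
      // totals still add up to the collection's residency.
      credit("(unattributed)", bytes);
      continue;
    }
    // Each subscription holding the document gets an equal share.
    const share = bytes / existsIn.size;
    for (const subscriptionId of existsIn) {
      credit(subscriptionId, share);
    }
  }
  return bytesBySubscription;
}

// Usage: a 300-byte document shared by three subscriptions and a
// 100-byte document held by one.
const split = evenSplitResidency([
  { bytes: 300, existsIn: new Set(["subA", "subB", "subC"]) },
  { bytes: 100, existsIn: new Set(["subA"]) },
]);
const total = [...split.values()].reduce((sum, b) => sum + b, 0);
console.log(Object.fromEntries(split), total);
```

Because every document's bytes are divided, never duplicated, the per-subscription figures add back up to the collection total (up to floating-point rounding), which is the sum-preserving property the entry refers to.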

### v1.0.33 (App Version on Browser Errors)

- **Browser errors now carry the app version** - The client `ErrorTracker` stamps an `appVersion` onto every captured error (via a new `errorTracking.appVersion` setting, falling back to `Meteor.settings.public.skysignal.appVersion`, `Meteor.settings.public.appVersion`, then `__meteor_runtime_config__.appVersion`). The SkySignal Error Details screen shows this version, making it possible to tell whether a reported bug originates from an old or a current build. Errors reported without a version are unaffected. (fixes [#17](https://github.com/SkySignalAPM/agent/issues/17))
27 changes: 27 additions & 0 deletions README.md
@@ -134,6 +134,7 @@ Every server-side configuration option has a corresponding environment variable.
| `SKYSIGNAL_COLLECT_OUTBOUND_HTTP` | `collectOutboundHttp` | Boolean |
| `SKYSIGNAL_COLLECT_CPU_PROFILES` | `collectCpuProfiles` | Boolean |
| `SKYSIGNAL_COLLECT_LIVE_QUERIES` | `collectLiveQueries` | Boolean |
| `SKYSIGNAL_COLLECT_MERGEBOX` | `collectMergebox` | Boolean |
| `SKYSIGNAL_COLLECT_PUBLICATIONS` | `collectPublications` | Boolean |
| `SKYSIGNAL_COLLECT_ENVIRONMENT` | `collectEnvironment` | Boolean |
| `SKYSIGNAL_COLLECT_VULNERABILITIES` | `collectVulnerabilities` | Boolean |
@@ -216,8 +217,34 @@ All collection interval and performance options also have corresponding `SKYSIGN
| `collectPublications` | Boolean | `true` | Detect publication over-fetching and missing projections |
| `collectEnvironment` | Boolean | `true` | Capture environment metadata (packages, flags, OS info) |
| `collectVulnerabilities` | Boolean | `true` | Run `npm audit` scans and report high/critical CVEs |
| `collectMergebox` | Boolean | `false` | **Opt-in.** Attribute server-side mergebox RAM residency to publications/collections. Off by default; see [Mergebox Residency](#mergebox-residency-opt-in) for details and tuning |
| `ingestAggregation` | Boolean | `true` | Roll up live query / subscription telemetry into fixed-shape aggregates before shipping. Reduces server ingest row counts 10-100× on high-volume apps. Requires a platform version that supports the aggregate ingest endpoints (v1.0.30+). Set to `false` to post per-observer / per-subscription records instead. |

### Mergebox Residency (opt-in)

`collectMergebox` is **disabled by default** — it is the one collector you must explicitly enable. When on, the agent takes a low-frequency, read-only snapshot of each DDP session's mergebox (Meteor's per-connection, server-side document cache) and reports estimated RAM residency per publication and collection, along with the active publication strategy (`SERVER_MERGE` / `NO_MERGE` / `NO_MERGE_NO_HISTORY` / `NO_MERGE_MULTI`).

Enable it via settings or environment variable:

```json
{ "skysignal": { "collectMergebox": true } }
```

```bash
SKYSIGNAL_COLLECT_MERGEBOX=true
```

Snapshot cost scales with sessions × collections × documents, so the collector is bounded by these safety knobs. Each also has a `SKYSIGNAL_*` environment variable (`SKYSIGNAL_MERGEBOX_INTERVAL`, `SKYSIGNAL_MERGEBOX_SAMPLE_RATE`, `SKYSIGNAL_MERGEBOX_MAX_SESSIONS`, `SKYSIGNAL_MERGEBOX_MAX_DOCS_PER_SESSION`):

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `mergeboxInterval` | Number | `60000` | Residency snapshot interval in ms (1 minute) |
| `mergeboxSampleRate` | Number | `1.0` | Fraction of DDP sessions sampled per tick (0-1). Lower it on very large fleets; the rate is stamped on every row so the server can extrapolate to the full fleet |
| `mergeboxMaxSessions` | Number | `2000` | Max sessions walked per snapshot tick |
| `mergeboxMaxDocsPerSession` | Number | `50000` | Max documents inspected per session per tick |

Requires a SkySignal plan with the DDP feature and a platform version that exposes the mergebox ingest endpoint.
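
As a rough illustration of how `mergeboxSampleRate` and the stamped `sampleRate` field might interact, the sketch below samples sessions with a simple random draw and scales a row back up by dividing by the rate. Both helpers (`shouldSampleSession`, `extrapolateRow`) and the `estimated*` field names are hypothetical. The agent's real sampling logic and the server's extrapolation are not shown in this change and may differ.

```javascript
// Hypothetical sampler: include a session when a uniform draw falls below
// the configured rate. `random` is injectable so the decision is testable.
function shouldSampleSession(sampleRate, random = Math.random) {
  return random() < sampleRate;
}

// Hypothetical server-side scaling: if only a fraction of sessions were
// walked, dividing by that fraction estimates the full-fleet figure.
function extrapolateRow(row) {
  const rate = row.sampleRate;
  if (!(rate > 0 && rate <= 1)) {
    return null; // an invalid rate cannot be extrapolated
  }
  return {
    ...row,
    estimatedBytesHeld: row.bytesHeld / rate,
    estimatedDocCount: row.docCount / rate,
  };
}

// Usage: a row collected while sampling a quarter of the sessions.
const sampledRow = {
  collectionName: "tasks",
  publicationName: "tasks.byProject",
  strategy: "SERVER_MERGE",
  bytesHeld: 1000,
  docCount: 10,
  connectionCount: 3,
  sampleRate: 0.25,
};
const scaled = extrapolateRow(sampledRow);
console.log(scaled.estimatedBytesHeld, scaled.estimatedDocCount);
```

With the default rate of `1.0` every session is walked and the scaled values equal the measured ones, so lowering the rate trades precision for a cheaper snapshot tick.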

### Agent-Side Aggregation

When `ingestAggregation` is enabled (the default), the `LiveQueriesCollector` and `DDPCollector` pre-aggregate telemetry on the agent into two compact payload shapes (one per live query signature, one per publication + params signature) and POST them to:
28 changes: 27 additions & 1 deletion lib/SkySignalClient.js
@@ -35,6 +35,7 @@ const BATCH_ENDPOINTS = {
publications: "/api/v1/metrics/publications",
environment: "/api/v1/metrics/environment",
vulnerabilities: "/api/v1/metrics/vulnerabilities",
mergebox: "/api/v1/metrics/mergebox",
liveQueryAggregates: "/api/v1/live-queries/aggregates",
subscriptionAggregates: "/api/v1/subscriptions/aggregates"
};
@@ -64,6 +65,7 @@ const BATCH_PAYLOAD_KEYS = {
publications: "metrics",
environment: "metrics",
vulnerabilities: "metrics",
mergebox: "metrics",
liveQueryAggregates: "aggregates",
subscriptionAggregates: "aggregates"
};
@@ -171,7 +173,8 @@ export default class SkySignalClient {
deprecatedApis: [],
publications: [],
environment: [],
vulnerabilities: []
vulnerabilities: [],
mergebox: []
};

// Track batch sizes incrementally (O(1) add instead of O(n))
@@ -438,6 +441,29 @@ export default class SkySignalClient {
this._addToBatch("vulnerabilities", metric, "/api/v1/metrics/vulnerabilities");
}

/**
* Add a mergebox RAM residency rollup row to the batch.
*
* Each row is a pre-aggregated per-(publication, collection) residency rollup
* for a flush window (the MergeboxCollector does the even-split attribution
* agent-side). Rows are batched per-item like other per-window metrics; the
* server reads them under the "metrics" key at POST /api/v1/metrics/mergebox.
*
* @param {Object} metric - Mergebox residency rollup row
* @param {string} metric.collectionName - Published collection name
* @param {string} [metric.publicationName] - Publication name (omitted for auto-publish)
* @param {string} metric.strategy - SERVER_MERGE | NO_MERGE | NO_MERGE_NO_HISTORY | NO_MERGE_MULTI | unknown
* @param {number} metric.bytesHeld - Estimated mergebox residency bytes for this group
* @param {number} metric.docCount - Documents resident for this group
* @param {number} metric.connectionCount - Count of distinct DDP sessions holding this group
* @param {number} metric.sampleRate - Per-session sample rate used by the collector
* @param {string} [metric.buildHash] - Deployed build hash (omitted when unresolved)
* @returns {void}
*/
addMergeboxMetric(metric) {
this._addToBatch("mergebox", metric, "/api/v1/metrics/mergebox");
}

/**
* Add a Real User Monitoring (RUM) measurement to the batch.
*