The idea is for the Partout Master to control and statistically throttle the volume of events sent by large numbers of agents.
| Key | Field | Description |
|---|---|---|
| 1 | agent_uuid | Agent's unique id |
| | hostname | Agent's hostname |
| | arch | Agent's architecture |
| | platform | Agent's platform |
| | os_release | Agent's Operating System release |
| | os_family | Agent's Operating System family |
| | os_dist_name | Agent's Operating System Distribution name |
| | os_dist_version_id | Agent's Operating System Distribution version identifier |
| 2 | module | Name of the agent module sending this event |
| 3 | object | Name or title of the object subject to this event |
| 4 | msg | Description of the Event or Action |

Key 1 is the primary agent identifier; keys 1 to 4 are the fields used by the aggregation levels below.
| Level | Aggregates |
|---|---|
| 1 | Counts by agent_uuid |
| 2 | Counts by agent_uuid & module |
| 3 | Counts by agent_uuid & module & object |
| 4 | Counts by agent_uuid & module & object & msg |
The default aggregation level is 4 (all detail).
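The level table can be read as a set of grouping keys. The following is a minimal sketch, assuming events are plain objects carrying the `agent_uuid`, `module`, `object` and `msg` fields; `LEVEL_KEYS` and `countByLevel` are hypothetical names used for illustration only, and only the levels listed in the table are covered.

```javascript
// Hypothetical helper: fields that make up the aggregation key at each level.
const LEVEL_KEYS = {
  1: ['agent_uuid'],
  2: ['agent_uuid', 'module'],
  3: ['agent_uuid', 'module', 'object'],
  4: ['agent_uuid', 'module', 'object', 'msg']
};

// Count events grouped by the key fields of the given level (default 4).
function countByLevel(events, level = 4) {
  const keys = LEVEL_KEYS[level];
  if (!keys) throw new RangeError('unsupported aggregation level: ' + level);
  const counts = new Map();
  for (const ev of events) {
    // JSON-encode the key tuple so separators inside values cannot collide.
    const key = JSON.stringify(keys.map((k) => ev[k]));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}
```

Lower levels produce fewer, coarser counters, which is what reduces the volume sent to the master.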
The Master tells the agents, via the responses to their event sends (POSTs to /events), to use an event collection period based on:
Em = Events per minute arriving at the master
Ps = Collection Period in seconds
D = Event rate divisor
Pmin = Minimum floor collection period in seconds, e.g. 10 secs
Pmax = Maximum collection period in seconds, whereupon the agents must implement event level aggregation
Ps = int(Em / D)
if Ps < Pmin then Ps = Pmin
if Ps > Pmax then Ps = Pmax
if Ps ≥ Pmax then ...
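The period rule above can be sketched as a small function. This is an illustration under assumptions: the function name is hypothetical, and the default values for `d` and `pmax` are invented for the example (only the 10 second floor comes from the text). What happens once Ps reaches Pmax is left out, as it is elided above.

```javascript
// Sketch of the period rule: Ps = int(Em / D), clamped to [Pmin, Pmax].
// The defaults for d and pmax are illustrative assumptions, not Partout settings.
function collectionPeriodSecs(em, d = 10, pmin = 10, pmax = 300) {
  const ps = Math.trunc(em / d);
  return Math.min(pmax, Math.max(pmin, ps));
}

collectionPeriodSecs(50);    // 5 is below the floor, so 10
collectionPeriodSecs(600);   // 60
collectionPeriodSecs(10000); // 1000 is above the ceiling, so 300
```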
Tobj = 500
Tmod = 2000
Tuuid = 5000
Tmax = 10000
if Em < Tobj
  Alevel = 4 // all detail
if Em ≥ Tobj and Em < Tmod
  Alevel = 3 // aggregate at Object
if Em ≥ Tmod and Em < Tuuid
  Alevel = 2 // aggregate at Module
if Em ≥ Tuuid and Em < Tmax
  Alevel = 1 // aggregate at Agent UUID
if Em ≥ Tmax
  Alevel = 0 // aggregate on Agent alive msgs (same content as level 1)
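The threshold rules translate directly into code. A minimal sketch, using the threshold values listed above; the names `THRESHOLDS` and `aggregateLevel` are hypothetical.

```javascript
// Threshold values from the text, in events per minute arriving at the master.
const THRESHOLDS = { obj: 500, mod: 2000, uuid: 5000, max: 10000 };

// Map the arrival rate Em to an aggregation level between 4 and 0.
function aggregateLevel(em, t = THRESHOLDS) {
  if (em < t.obj) return 4;  // all detail
  if (em < t.mod) return 3;  // aggregate at object
  if (em < t.uuid) return 2; // aggregate at module
  if (em < t.max) return 1;  // aggregate at agent uuid
  return 0;                  // agent alive messages only
}
```

Because each check returns early, the lower bound of every range is implied by the check before it.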
{
  agent: { // level 0/1
    uuid: ...,
    count: n,
    modules: { // level 2
      module_name: {
        count: n,
        objects: { // level 3
          object_name_base64: {
            count: n,
            messages: { // level 4
              msg_base64: {
                level: error|info,
                count: n
              }, ...
            }
          }, ...
        }
      }, ...
    }
  },
  period_secs: n
}
e.g.:
{
  agent: {
    uuid: '89609ae7-c955-4109-a2b8-4d0e7edcf460',
    count: 1,
    modules: {
      'file': {
        count: 1,
        objects: {
          '8765765jhgfjfhgf876587': { // for special chars handling
            count: 1,
            messages: {
              '786576576576dsfsdfsfdsdf976876': {
                level: 'info',
                count: 1
              }
            }
          }
        }
      }
    }
  },
  period_secs: 10
}
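To show how an agent might assemble this structure at level 4, here is a sketch for Node.js. It assumes raw events look like `{ module, object, msg, level }`, which is an assumption made for the example; `b64`, `getOrCreate` and `buildAggregate` are hypothetical helpers, not Partout functions.

```javascript
// Base64-encode a string so that special characters are safe in object keys.
const b64 = (s) => Buffer.from(String(s), 'utf8').toString('base64');

// Return container[key], creating it with make() on first use.
function getOrCreate(container, key, make) {
  if (!Object.prototype.hasOwnProperty.call(container, key)) {
    container[key] = make();
  }
  return container[key];
}

// Build the nested level 4 aggregate for one agent from its raw events.
// Assumed event shape: { module, object, msg, level }.
function buildAggregate(uuid, events, periodSecs) {
  const agent = { uuid, count: 0, modules: {} };
  for (const ev of events) {
    agent.count += 1;
    const mod = getOrCreate(agent.modules, ev.module,
      () => ({ count: 0, objects: {} }));
    mod.count += 1;
    const obj = getOrCreate(mod.objects, b64(ev.object),
      () => ({ count: 0, messages: {} }));
    obj.count += 1;
    const msg = getOrCreate(obj.messages, b64(ev.msg),
      () => ({ level: ev.level, count: 0 }));
    msg.count += 1;
  }
  return { agent, period_secs: periodSecs };
}
```

For lower levels, the same object could be pruned before sending, for example by dropping `messages` at level 3 or `objects` at level 2.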
POST /events JSON as above
(Note: the /event API will be deprecated.)
The /events API always responds with the current aggregate throttling settings for the agents, e.g.:
res
  .status(200) // success: the settings are returned on every accepted POST
  .send({
    aggregate_period_secs: 60,
    aggregate_period_splay: 0.05,
    aggregate_level: 4,
    notify_alive_period_secs: 60 * 60 * 24
  });
Where:
- aggregate_period_secs = Ps
- aggregate_level = 0 - 4
- notify_alive_period_secs = final fallback level: notify that the agent is alive (includes the above aggregate counts at uuid detail level 0).
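On the agent side, these settings could drive the timing of the next send. The sketch below assumes that `aggregate_period_splay` is a fraction of the period applied as symmetric random jitter (so 0.05 means up to plus or minus 5%). The text does not define the splay, so this interpretation and the function name `nextSendDelayMs` are assumptions.

```javascript
// Assumption: splay is a fraction of the period applied as symmetric random
// jitter, so that many agents do not all post at the same instant.
// rand is injectable to make the function deterministic in tests.
function nextSendDelayMs(settings, rand = Math.random) {
  const base = settings.aggregate_period_secs * 1000;
  const jitter = (rand() * 2 - 1) * settings.aggregate_period_splay * base;
  return Math.round(base + jitter);
}

const settings = {
  aggregate_period_secs: 60,
  aggregate_period_splay: 0.05,
  aggregate_level: 4,
  notify_alive_period_secs: 60 * 60 * 24
};
const delayMs = nextSendDelayMs(settings); // between 57000 and 63000
```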
Data will be stored in the ArangoDB database.
| Period | Detail Level |
|---|---|
| Current day | Raw aggregate data from agents |
| Past 7 days | Hourly aggregates |
| Past 31 days | Daily aggregates |
| Past 365 days (or more) | Weekly aggregates |
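The retention table can be expressed as a lookup from the age of a record to its storage detail. This is only a sketch: the function name, the returned labels, and the exact boundary handling (for example whether day 7 still counts as hourly) are assumptions, and it does not touch ArangoDB.

```javascript
// Map the age of a record, in days, to the detail level kept in storage.
// Boundary handling is an assumption based on the retention table.
function storageDetail(ageDays) {
  if (ageDays < 0) throw new RangeError('ageDays must not be negative');
  if (ageDays < 1) return 'raw';     // current day: raw aggregates from agents
  if (ageDays <= 7) return 'hourly'; // past 7 days
  if (ageDays <= 31) return 'daily'; // past 31 days
  return 'weekly';                   // past 365 days or more
}
```

A periodic rollup job could use such a function to decide which records to merge into coarser buckets.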
Open question: should throttling combine per-Master traffic thresholds AND per-agent traffic thresholds?