refactor: implement structured event models and improve trace parsing
- Add Pydantic models for syscall events (ExecveEvent, ForkEvent, CloneEvent, ConnectEvent)
- Introduce UnparsedEvent model for raw trace lines that cannot be parsed
- Update TraceReader to yield structured event models instead of raw strings
- Refactor tests to handle new event model structure
- Update documentation to reflect new event model architecture
- Improve README with examples of structured event data
- Clean up project structure and remove legacy code markers
This change improves type safety and validation of event data while maintaining
backward compatibility through the UnparsedEvent model for unparseable lines.
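A minimal sketch of how the event models named above might be laid out. It uses standard-library dataclasses as a stand-in for Pydantic so that it runs without dependencies. The field names (`timestamp`, `pid`, `process_name`, `filename`, `child_pid`, `raw_line`) and the PID check are assumptions for illustration, not the fields defined in this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseSyscallEvent:
    """Fields shared by every traced syscall event (names are assumed)."""

    timestamp: float
    pid: int
    process_name: str

    def __post_init__(self) -> None:
        # Stand-in for the validation Pydantic would perform on construction.
        if self.pid < 0:
            raise ValueError(f"pid must be non-negative, got {self.pid}")


@dataclass(frozen=True)
class ExecveEvent(BaseSyscallEvent):
    """An execve call; `filename` is a hypothetical field."""

    filename: str = ""


@dataclass(frozen=True)
class ForkEvent(BaseSyscallEvent):
    """A fork call; `child_pid` is a hypothetical field."""

    child_pid: int = 0


@dataclass(frozen=True)
class UnparsedEvent:
    """Wraps a raw trace line that could not be parsed."""

    raw_line: str
```

With this layout, consumers can branch on the event type with `isinstance` and still keep unparseable lines instead of dropping them.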
The previous grouping by process name (`process_events`) in Cell reports might change based on how these structured events are aggregated. The core reporting hierarchy remains.
+
 ## Configuration

 Linux EDR can be configured using a `config.ini` file with the following options:
docs/architecture/overview.md (16 additions & 14 deletions)
@@ -17,7 +17,7 @@ For more details, see the [Clean Architecture](clean-architecture.md) page.
 ### Domain Layer

-- **Models**: Defines the structure of events and reports using Pydantic, ensuring data consistency and validation.
+- **Models**: Defines the structure of events and reports using Pydantic, ensuring data consistency and validation. Includes a base `BaseSyscallEvent` and specific models for traced syscalls (e.g., `ExecveEvent`, `ForkEvent`).

 ### Application Layer
@@ -45,15 +45,16 @@ For more details, see the [Clean Architecture](clean-architecture.md) page.
 ## Data Flow

-1. The `TraceReader` continuously reads `execve` events from the kernel trace pipe.
-2. Events are passed to the `Aggregator`, which buffers them in a thread-safe manner.
-3. A background scheduler triggers the appropriate use case at the configured interval.
-4. The use case retrieves a snapshot of events from the `Aggregator`.
-5. The service creates a Level 1 `Cell` report from the event snapshot.
-6. The `Cell` is passed to the `ReportManager`.
-7. The `ReportManager` saves the `Cell` and checks if enough Cells exist to create a Level 2 `Block`. This process continues up the hierarchy (Daily, Weekly, Monthly).
-8. The `Reporter` can optionally save the initial `Cell` report to a JSON file and send it to OpenAI for analysis.
-9. Higher-level reports (Blocks, etc.) can also be configured for AI analysis via the `ReportManager` interacting with the `Reporter`.
+1. The `TraceReader` continuously reads raw event strings from the kernel trace pipe.
+2. The `TraceReader` attempts to parse known syscall event lines (e.g., execve, fork, connect) into corresponding Pydantic models (`ExecveEvent`, `ForkEvent`, etc.). Unparsed lines are yielded as raw strings.
+3. Parsed event models (or raw strings if parsing fails) are passed to the `Aggregator`, which buffers them.
+4. A background scheduler triggers the report generation process at the configured interval.
+5. The reporting process retrieves a snapshot of buffered events (now structured models) from the `Aggregator`.
+6. A Level 1 `Cell` report is created from the event snapshot.
+7. The `Cell` is passed to the `ReportManager`.
+8. The `ReportManager` saves the `Cell` and checks if enough Cells exist to create a Level 2 `Block`. This process continues up the hierarchy (Daily, Weekly, Monthly).
+9. The `Reporter` can optionally save the initial `Cell` report to a JSON file and send it to OpenAI for analysis.
+10. Higher-level reports (Blocks, etc.) can also be configured for AI analysis via the `ReportManager` interacting with the `Reporter`.

 ## Project Structure
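The first steps of the data flow (read, parse, buffer, snapshot) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the trace line layout and regex are simplified and hypothetical, `parse_line` is an invented helper name, the models are plain dataclasses standing in for the Pydantic ones, and clearing the buffer on `snapshot()` is assumed behaviour. The commit message says unparseable lines are wrapped in `UnparsedEvent`, so the sketch follows that.

```python
import re
import threading
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class ExecveEvent:
    pid: int
    process_name: str
    filename: str


@dataclass(frozen=True)
class UnparsedEvent:
    raw_line: str


Event = Union[ExecveEvent, UnparsedEvent]

# Hypothetical, simplified layout: "<comm>-<pid> ... execve: filename=<path>"
_EXECVE_RE = re.compile(
    r"^\s*(?P<comm>\S+)-(?P<pid>\d+)\s.*\bexecve: filename=(?P<filename>\S+)"
)


def parse_line(line: str) -> Event:
    """Parse one trace line, falling back to UnparsedEvent on no match."""
    match = _EXECVE_RE.match(line)
    if match is None:
        return UnparsedEvent(raw_line=line.rstrip("\n"))
    return ExecveEvent(
        pid=int(match["pid"]),
        process_name=match["comm"],
        filename=match["filename"],
    )


class Aggregator:
    """Buffers events behind a lock so a reader thread and a scheduler can share it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events: List[Event] = []

    def add(self, event: Event) -> None:
        with self._lock:
            self._events.append(event)

    def snapshot(self) -> List[Event]:
        """Return the buffered events and start a fresh buffer (assumed behaviour)."""
        with self._lock:
            events, self._events = self._events, []
        return events
```

A reader loop would call `aggregator.add(parse_line(line))` for each line, and the scheduled reporting step would call `aggregator.snapshot()` to obtain the events for the next `Cell`.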
@@ -62,6 +63,7 @@ linux-edr/
 ├── linux_edr/ # Main source code package
 │ ├── domain/ # Core business logic
 │ │ └── models/ # Domain entities and value objects
+│ │ └── events/ # Pydantic models for specific syscall events
 │ ├── application/ # Application-specific business rules
 │ │ ├── services/ # Stateless operations
 │ │ └── use_cases/ # Business processes
@@ -71,12 +73,12 @@ linux-edr/
 │ │ └── controllers/ # Input adapters (CLI, API controllers)