Skip to content

Commit 44df600

Browse files
committed
docs: update docs
1 parent c5cf7c7 commit 44df600

14 files changed

Lines changed: 373 additions & 7 deletions

docs/api/aggregator.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
# Aggregator API
2+
3+
::: linux_edr.aggregator.Aggregator

docs/api/models.md

Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
# Models API
2+
3+
This page documents the Pydantic models used for representing events and reports.
4+
5+
::: linux_edr.models.CommandLine
6+
7+
::: linux_edr.models.ProcessEvents
8+
9+
::: linux_edr.models.SummaryReport
10+
11+
::: linux_edr.models.Cell
12+
13+
::: linux_edr.models.Block
14+
15+
::: linux_edr.models.DailyReport
16+
17+
::: linux_edr.models.WeeklyReport
18+
19+
::: linux_edr.models.MonthlyReport
20+
21+
::: linux_edr.models.DailySummary

docs/api/report_manager.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
# ReportManager API
2+
3+
::: linux_edr.report_manager.ReportManager

docs/api/reporter.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
# Reporter API
2+
3+
::: linux_edr.reporter.Reporter

docs/api/trace.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
# TraceReader API
2+
3+
::: linux_edr.trace.TraceReader

docs/architecture/ai-analysis.md

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
# AI-Enhanced Security Analysis
2+
3+
Linux EDR integrates with OpenAI's language models (defaulting to `gpt-4o-mini`, configurable via `model` setting) to provide automated analysis of system activity reports.
4+
5+
## Analysis Process
6+
7+
1. At each reporting interval (default 15 minutes), the generated `Cell` report can be sent to the configured OpenAI model.
8+
2. Higher-level reports (`Block`, `DailyReport`, etc.) generated by the `ReportManager` can also trigger analysis.
9+
3. The `Reporter` component formats the report data into a specific prompt tailored for security analysis.
10+
4. The prompt instructs the AI to act as a security analyst specializing in Linux systems and to look for potential threats based on command execution patterns.
11+
12+
## Focus Areas for AI Analysis
13+
14+
The AI analysis primarily looks for:
15+
16+
- **Unusual Command Patterns**: Execution of rare commands, unexpected command sequences, or commands run at odd times.
17+
- **Privilege Escalation Indicators**: Commands associated with gaining higher privileges (e.g., `sudo`, `su`, exploits).
18+
- **Data Exfiltration Attempts**: Use of tools like `scp`, `rsync`, `curl`, `wget` in suspicious contexts or targeting sensitive directories.
19+
- **Anomalous Network Activity**: Commands initiating unexpected network connections (though network traffic itself is not monitored, the commands *causing* it are).
20+
- **Suspicious File/Directory Operations**: Access to sensitive files (`/etc/shadow`), creation of hidden files/directories, or unusual use of file manipulation tools.
21+
- **Living-off-the-Land Techniques**: Misuse of standard system utilities for malicious purposes.
22+
23+
## Output
24+
25+
- The analysis text generated by the AI is logged by the application.
26+
- If an `output_file` is configured for the base `Cell` reports, the corresponding AI analysis is appended to a separate file named `<output_file>.analysis`.
27+
- For higher-level reports managed by `ReportManager`, the analysis text is stored directly within the `analysis` field of the respective report's JSON file (e.g., in `reports/blocks/block_....json`).
28+
29+
This automated analysis provides actionable security insights directly from the collected data, reducing the need for manual log review and helping to quickly identify potential threats.

docs/architecture/overview.md

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
# Architecture Overview
2+
3+
Linux EDR is designed with a modular and robust architecture to handle real-time event processing and reporting efficiently.
4+
5+
## Core Components
6+
7+
- **Trace Reader (`trace.py`)**: Uses non-blocking I/O (`selectors`) to read from the kernel's `trace_pipe` without impacting system performance. Includes robust error handling and automatic reconnection logic.
8+
- **Aggregator (`aggregator.py`)**: A thread-safe buffer (`deque`) that collects events from the trace reader. Implements backpressure using a maximum length and optional event age limits.
9+
- **Report Manager (`report_manager.py`)**: Orchestrates the creation, storage, and aggregation of hierarchical reports (Cells, Blocks, Daily, Weekly, Monthly). Manages the lifecycle of reports based on time and event counts.
10+
- **Models (`models.py`)**: Defines the structure of events and reports using Pydantic, ensuring data consistency and validation.
11+
- **Reporter (`reporter.py`)**: Handles the output of reports, including saving to JSON files and sending data to OpenAI for analysis.
12+
- **Summary (`summary.py`)**: Contains logic for building the initial summary reports (Cells) from aggregated events.
13+
- **Application (`app.py`)**: The main application class that initializes components, manages the scheduler (using `APScheduler`), and orchestrates the event processing pipeline.
14+
- **Configuration (`config.py`)**: Loads and provides access to configuration settings from `config.ini` files.
15+
- **CLI (`cli.py`)**: Provides the command-line interface using Typer.
16+
17+
## Data Flow
18+
19+
1. The `TraceReader` continuously reads `execve` events from the kernel trace pipe.
20+
2. Events are passed to the `Aggregator`, which buffers them in a thread-safe manner.
21+
3. A background scheduler triggers the `_summarize` method in `app.py` at the configured interval (`report_interval`).
22+
4. `_summarize` retrieves a snapshot of events from the `Aggregator`.
23+
5. `build_summary` creates a Level 1 `Cell` report from the event snapshot.
24+
6. The `Cell` is passed to the `ReportManager`.
25+
7. The `ReportManager` saves the `Cell` and checks if enough Cells exist to create a Level 2 `Block`. This process continues up the hierarchy (Daily, Weekly, Monthly).
26+
8. The `Reporter` can optionally save the initial `Cell` report to a JSON file (`output_file`) and send it to OpenAI for analysis.
27+
9. Higher-level reports (Blocks, etc.) can also be configured for AI analysis via the `ReportManager` interacting with the `Reporter`.
28+
29+
## Project Structure
30+
31+
```text
32+
linux-edr/
33+
├── linux_edr/ # Main source code package
34+
│ ├── __init__.py
35+
│ ├── cli.py # Typer-based CLI interface
36+
│ ├── app.py # Core application logic
37+
│ ├── config.py # Configuration management
38+
│ ├── trace.py # Non-blocking trace reader
39+
│ ├── aggregator.py # Thread-safe event buffering
40+
│ ├── summary.py # Initial report generation (Cells)
41+
│ ├── reporter.py # OpenAI integration and output handling
42+
│ ├── report_manager.py # Hierarchical report management
43+
│ └── models.py # Pydantic data models
44+
├── tests/ # Comprehensive test suite
45+
├── docs/ # Documentation source files
46+
├── .github/ # GitHub Actions workflows
47+
│ └── workflows/
48+
│ └── docs.yml # Documentation deployment workflow
49+
├── linux-edr.service # Systemd service definition
50+
├── pyproject.toml # Project metadata and dependencies
51+
├── mkdocs.yml # MkDocs configuration
52+
├── PRIVACY.md # Privacy policy
53+
└── README.md # Repository README
54+
```

docs/architecture/reporting.md

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,31 @@
1+
# Hierarchical Reporting Architecture
2+
3+
Linux EDR implements a sophisticated multi-tiered reporting system that provides security visibility across different time scales. This allows for analysis ranging from immediate, granular events to long-term strategic trends.
4+
5+
## Reporting Levels
6+
7+
The system aggregates data progressively through the following levels:
8+
9+
| Level | Coverage | Name | Source Components | Description |
10+
|:-----:|:--------------------|:---------------|:--------------------------|:-----------------------------------------------------|
11+
| 1 | 15 minutes | **Cell** | 1 Event Snapshot | Base unit capturing immediate system activity |
12+
| 2 | 16 Cells = 4 hours | **Block** | 16 Cells | Short-term patterns across multiple Cells |
13+
| 3 | 6 Blocks = 24 hours | **DailyReport**| 6 Blocks | Consolidated view of a full day's activity |
14+
| 4 | 7 DailyReports | **WeeklyReport**| 7 DailyReports | Week-long trends with daily breakdowns |
15+
| 5 | ~4 WeeklyReports | **MonthlyReport**| Approx. 4 WeeklyReports | Strategic view of monthly security posture |
16+
17+
*(Default intervals and aggregation counts are configurable in `config.ini`)*
18+
19+
## Benefits
20+
21+
This hierarchical architecture enables:
22+
23+
- **Immediate Threat Detection**: The `Cell` level provides a near real-time view (default 15 mins) of command executions, allowing for rapid identification of obviously malicious or unusual commands.
24+
- **Contextual Pattern Recognition**: The `Block` level (default 4 hours) aggregates data to reveal short-term patterns, such as repeated failed login attempts followed by a suspicious command, or unusual process behavior within a limited timeframe.
25+
- **Daily Security Posture Assessment**: The `DailyReport` consolidates a full day's activity, highlighting the most active processes and commands, and serving as a basis for identifying significant deviations from normal daily operations.
26+
- **Trend Identification**: The `WeeklyReport` analyzes trends over seven days, making it possible to spot recurring suspicious activities, track the evolution of potential incidents, and calculate weekly risk scores.
27+
- **Strategic Security Planning**: The `MonthlyReport` offers a high-level, long-term view of the system's security posture, summarizing key activities, risks, and incidents, suitable for strategic reviews and planning security improvements.
28+
29+
## Storage
30+
31+
All generated reports are automatically stored as individual JSON files within the directory specified by `reports_dir` in the configuration. They are organized into subdirectories corresponding to their level (e.g., `reports/cells/`, `reports/blocks/`, etc.).

docs/configuration.md

Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,66 @@
1+
# Configuration
2+
3+
Linux EDR behavior is controlled via a configuration file, typically named `config.ini`. The tool searches for this file in the following locations (in order):
4+
5+
1. `./config.ini` (current directory)
6+
2. `~/.config/linux_edr/config.ini` (user's config directory)
7+
3. `/etc/linux_edr/config.ini` (system-wide config)
8+
4. The default `config.ini` included with the package.
9+
10+
You can also specify a path directly using the `--config` command-line option.
11+
12+
## Configuration Options
13+
14+
Here are the available sections and options:
15+
16+
```ini
17+
[DEFAULT]
18+
# Path to the kernel trace_pipe used for monitoring execve events.
19+
# Default: /sys/kernel/tracing/trace_pipe
20+
trace_path = /sys/kernel/tracing/trace_pipe
21+
22+
# Interval (in minutes) at which summary reports (Cells) are generated.
23+
# Default: 15
24+
report_interval = 15
25+
26+
# The OpenAI model to use for security analysis (e.g., gpt-4o-mini, gpt-4).
27+
# Default: gpt-4o-mini
28+
model = gpt-4o-mini
29+
30+
# Enable verbose debug logging (true/false).
31+
# Default: false
32+
debug = false
33+
34+
# Path to save periodic JSON reports (Cells). Leave empty to disable file output.
35+
# The Report Manager will still store hierarchical reports in `reports_dir`.
36+
# Default: (empty string)
37+
output_file =
38+
39+
[OPENAI]
40+
# Your OpenAI API key. If left empty, the tool will attempt to read the
41+
# OPENAI_API_KEY environment variable.
42+
# Default: (empty string)
43+
api_key =
44+
45+
[REPORTS]
46+
# The base directory where hierarchical reports (Cells, Blocks, Daily, etc.)
47+
# will be stored in subdirectories.
48+
# Default: reports
49+
reports_dir = reports
50+
51+
[ADVANCED]
52+
# The maximum number of raw events to buffer in memory before being processed
53+
# into a Cell report. Acts as a backpressure mechanism.
54+
# Default: 10000
55+
max_events_buffer = 10000
56+
57+
# Limits the number of command examples per process included in the prompt
58+
# sent to the LLM for Cell-level analysis, preventing overly long prompts.
59+
# Default: 50
60+
max_summary_lines = 50
61+
62+
# Whether to include the raw event data within the saved JSON Cell reports.
63+
# Set to false to reduce storage space if raw data is not needed.
64+
# Default: true
65+
include_raw_events = true
66+
```

docs/development.md

Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,66 @@
1+
# Development Guide
2+
3+
Contributions and local development are welcome!
4+
5+
## Setup
6+
7+
1. **Clone the repository:**
8+
```bash
9+
git clone https://github.com/ParttimeWorks/linux_edr.git
10+
cd linux-edr
11+
```
12+
13+
2. **Install in editable mode with development dependencies:**
14+
We use `uv` for all dependency management.
15+
```bash
16+
# Installs the package itself and dependencies listed under [project.optional-dependencies]
17+
# in pyproject.toml (like pytest, mypy, black, mkdocs, etc.)
18+
uv pip install -e .[dev]
19+
```
20+
21+
## Running Tests
22+
23+
The project uses `pytest` for testing.
24+
25+
```bash
26+
# Run all tests
27+
uv run pytest
28+
29+
# Run with verbose output
30+
uv run pytest -v
31+
32+
# Run specific test files or functions
33+
uv run pytest tests/test_app.py::test_parse_execve
34+
```
35+
36+
## Type Checking
37+
38+
We use `mypy` for static type checking.
39+
40+
```bash
41+
uv run mypy linux_edr
42+
```
43+
44+
## Code Style
45+
46+
Code style is enforced using `black`.
47+
48+
```bash
49+
# Check formatting
50+
uv run black --check .
51+
52+
# Apply formatting
53+
uv run black .
54+
```
55+
56+
## Building Documentation
57+
58+
Documentation is built using `MkDocs`.
59+
60+
```bash
61+
# Serve documentation locally for preview (auto-reloads on changes)
62+
mkdocs serve
63+
64+
# Build the static documentation site (output in the `site/` directory)
65+
mkdocs build
66+
```

0 commit comments

Comments
 (0)