Merged
2 changes: 1 addition & 1 deletion Cargo.lock

58 changes: 48 additions & 10 deletions docs/data-format.md
@@ -4,23 +4,59 @@ Denet outputs JSON in a streaming format optimized for efficiency and time-serie

## Format Structure

**First line**: Process metadata (emitted once)
Every line carries a `"kind"` discriminator so downstream tooling can dispatch by type. The possible values are `env`, `metadata`, `sample`, and `tree`.

**Optional first line** (when `--write-env` is set): host/NUMA/affinity snapshot, emitted once.
```json
{"kind":"env","ts_ms":1748542000000,"host":"omnibenchmark","kernel":"6.18.7-...","lscpu":{...},"numa":{...},"affinity_inherited":"0-127", ...}
```

**Header line**: Process metadata, emitted once.
```json
{"pid": 1234, "cmd": ["sleep", "5"], "exe": "/usr/bin/sleep", "t0_ms": 1748542000000}
{"kind":"metadata","pid":1234,"cmd":["sleep","5"],"executable":"/usr/bin/sleep","t0_ms":1748542000000}
```

**Subsequent lines**: Process tree metrics (streamed continuously)
**Subsequent lines**: Process tree metrics, streamed continuously.
```json
{"ts_ms": 1748542001000, "parent": {...}, "children": [...], "aggregated": {...}}
{"kind":"tree","ts_ms":1748542001000,"parent":{...},"children":[...],"aggregated":{...}}
```

Single-process mode (`--exclude-children`) emits `{"kind":"sample",...}` records instead.

> **Back-compat:** files written before the `kind` field existed are still readable by the `stats` / `summary` subcommands and by the Python reader — parsers fall back to the legacy untagged shapes when no `kind` is present.
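
The dispatch-with-fallback behaviour described above can be sketched from the consumer side. This is a minimal illustration, not the project's Python reader: `classify_record` is a hypothetical helper, and the key-based rules for legacy untagged lines are an assumption inferred from the record shapes shown in this document.

```python
import json
from typing import Tuple

KNOWN_KINDS = {"env", "metadata", "sample", "tree"}


def classify_record(line: str) -> Tuple[str, dict]:
    """Return (kind, record) for one JSONL line, tagged or legacy.

    Hypothetical helper: the fallback rules below are an assumed heuristic,
    not the logic used by denet's own parsers.
    """
    record = json.loads(line)
    kind = record.get("kind")
    if kind in KNOWN_KINDS:
        return kind, record
    if kind is not None:
        # Tagged with a value this sketch does not know about.
        return "unknown", record
    # Legacy untagged line: infer the shape from the keys that are present.
    if "pid" in record and "t0_ms" in record:
        return "metadata", record
    if "aggregated" in record or "parent" in record:
        return "tree", record
    if "ts_ms" in record:
        return "sample", record
    return "unknown", record
```

A caller would typically loop over the file and branch on the returned kind, ignoring `"unknown"` so that record types added later do not break older tooling.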

## Env Record (reproducibility snapshot)

Enabled with `--write-env` on the CLI or `write_env=True` in the Python binding. Captured once at the start of monitoring; useful for benchmark reproducibility (NUMA placement, CPU governor, hyperthreading, cgroup limits).

| Field | Type | Description |
|-------|------|-------------|
| `ts_ms` | number | Capture timestamp (Unix milliseconds) |
| `host` | string | Hostname (`/proc/sys/kernel/hostname`) |
| `kernel` | string | Kernel release (`/proc/sys/kernel/osrelease`) |
| `lscpu.sockets` | number | Physical sockets |
| `lscpu.cores_per_socket` | number | Cores per socket |
| `lscpu.threads_per_core` | number | Threads per core (SMT siblings) |
| `lscpu.model` | string | First `model name` from `/proc/cpuinfo` |
| `numa.nodes` | number | NUMA node count |
| `numa.distances` | number[][] | Square distance matrix from `/sys/.../node*/distance` |
| `numa.node_sizes_mb` | number[] | `MemTotal` per node, in MB |
| `affinity_inherited` | string | CPU affinity of the monitor process as a range list (e.g. `"0-3,7-9"`) |
| `cpu_governor` | string[]? | `scaling_governor` per CPU (omitted if cpufreq is unavailable) |
| `cpu_freq_khz` | number[]? | `scaling_cur_freq` per CPU |
| `thp_enabled` | string? | `/sys/kernel/mm/transparent_hugepage/enabled` raw value |
| `smt_active` | bool? | `/sys/devices/system/cpu/smt/active` |
| `cgroup` | string? | `/proc/<pid>/cgroup` of the monitored process |

Optional fields degrade to `null`/absent on kernels, distros, or containers where the source file is missing. On non-Linux platforms only `ts_ms`/`host`/`kernel` are populated.
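
Because `affinity_inherited` is a range-list string rather than an array, consumers usually need to expand it before comparing runs. The following is a small sketch of that expansion; `parse_cpu_list` is a hypothetical helper name, and it assumes the string contains only comma-separated CPU numbers and inclusive `a-b` ranges, as in the `"0-3,7-9"` example from the table.

```python
from typing import List


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a range list such as "0-3,7-9" into sorted CPU ids.

    Assumes only plain numbers and inclusive "a-b" ranges appear.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate an empty string or a trailing comma
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_cpu_list("0-3,7-9"))  # [0, 1, 2, 3, 7, 8, 9]
```

For the optional fields, reading them with `record.get("cpu_governor")` rather than `record["cpu_governor"]` handles both the absent and the `null` case.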

## Metadata Fields

| Field | Type | Description |
|-------|------|-------------|
| `pid` | number | Process ID |
| `cmd` | string[] | Command line arguments |
| `exe` | string | Executable path |
| `executable` | string | Executable path |
| `t0_ms` | number | Process start time (Unix milliseconds) |
| `capabilities` | object? | Manifest of optional metric sources detected at startup. See below. |

@@ -94,22 +130,24 @@ Includes all fields from Individual Process Metrics plus:
- **`--nodump`**: Disable automatic JSON dump to `out.json`
- **`--out FILE`**: Write JSON output to specified file
- **`--stats FILE`**: Write summary statistics to specified file
- **`--write-env`**: Prepend a one-shot `env` record (host/NUMA/affinity/governor/THP/SMT/cgroup) for reproducibility

## Example Complete Record

The output is [JSON Lines](https://jsonlines.org/) — one JSON object per line, newline-delimited. Each line is a self-contained record and can be parsed independently (e.g. with `jq`).

The **first line is a header** containing process metadata (pid, command, `t0_ms`). All subsequent lines are metric samples.
Lines are tagged with `"kind"`. When `--write-env` is set, an `env` header precedes the `metadata` header; otherwise the file starts at `metadata`. All subsequent lines are `sample` (single-process) or `tree` (process tree, default).

Timestamps use Unix milliseconds (ms since 1970-01-01 00:00:00 UTC):
- `t0_ms`: process start time, in the header line only
- `ts_ms`: sample timestamp in every metrics line
- `t0_ms`: process start time, in the `metadata` line only
- `ts_ms`: capture/sample timestamp in every other line

To get the elapsed time of a sample relative to process start: `elapsed_ms = ts_ms - t0_ms`.

```json
{"pid":1234,"cmd":["python","script.py"],"exe":"/usr/bin/python3","t0_ms":1748542000000}
{"ts_ms":1748542001000,"parent":{"ts_ms":1748542001050,"cpu_usage":15.2,"mem_rss_kb":8192,"mem_vms_kb":32768,"disk_read_bytes":1024,"disk_write_bytes":2048,"net_rx_bytes":512,"net_tx_bytes":256,"thread_count":3,"uptime_secs":1},"children":[{"pid":1235,"command":"worker","metrics":{"ts_ms":1748542001060,"cpu_usage":5.1,"mem_rss_kb":4096,"mem_vms_kb":16384,"disk_read_bytes":512,"disk_write_bytes":0,"net_rx_bytes":0,"net_tx_bytes":0,"thread_count":1,"uptime_secs":1}}],"aggregated":{"ts_ms":1748542001000,"cpu_usage":20.3,"mem_rss_kb":12288,"mem_vms_kb":49152,"disk_read_bytes":1536,"disk_write_bytes":2048,"net_rx_bytes":512,"net_tx_bytes":256,"thread_count":4,"process_count":2,"uptime_secs":1}}
{"kind":"env","ts_ms":1748542000000,"host":"omnibenchmark","kernel":"6.18.7-76061807-generic","lscpu":{"sockets":1,"cores_per_socket":64,"threads_per_core":2,"model":"AMD EPYC 7742 64-Core Processor"},"numa":{"nodes":4,"distances":[[10,12,12,12],[12,10,12,12],[12,12,10,12],[12,12,12,10]],"node_sizes_mb":[64272,64500,64500,64481]},"affinity_inherited":"0-127","cpu_governor":["performance"],"thp_enabled":"always [madvise] never","smt_active":true,"cgroup":"0::/user.slice"}
{"kind":"metadata","pid":1234,"cmd":["python","script.py"],"executable":"/usr/bin/python3","t0_ms":1748542000000}
{"kind":"tree","ts_ms":1748542001000,"parent":{"ts_ms":1748542001050,"cpu_usage":15.2,"mem_rss_kb":8192,"mem_vms_kb":32768,"disk_read_bytes":1024,"disk_write_bytes":2048,"sys_net_rx_bytes":512,"sys_net_tx_bytes":256,"thread_count":3,"uptime_secs":1},"children":[{"pid":1235,"command":"worker","metrics":{"ts_ms":1748542001060,"cpu_usage":5.1,"mem_rss_kb":4096,"mem_vms_kb":16384,"disk_read_bytes":512,"disk_write_bytes":0,"sys_net_rx_bytes":0,"sys_net_tx_bytes":0,"thread_count":1,"uptime_secs":1}}],"aggregated":{"ts_ms":1748542001000,"cpu_usage":20.3,"mem_rss_kb":12288,"mem_vms_kb":49152,"disk_read_bytes":1536,"disk_write_bytes":2048,"sys_net_rx_bytes":512,"sys_net_tx_bytes":256,"thread_count":4,"process_count":2,"uptime_secs":1}}
```
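
The `elapsed_ms = ts_ms - t0_ms` rule can be applied to a tagged file as follows. This is a sketch under the assumption that the `metadata` line precedes the metric lines, as described above; `elapsed_ms_per_sample` is a hypothetical name, not part of denet.

```python
import json
from typing import Iterable, List


def elapsed_ms_per_sample(lines: Iterable[str]) -> List[int]:
    """Return ts_ms - t0_ms for every sample/tree line that follows the metadata line."""
    t0_ms = None
    elapsed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        kind = record.get("kind")
        if kind == "metadata":
            t0_ms = record["t0_ms"]
        elif kind in ("sample", "tree") and t0_ms is not None:
            elapsed.append(record["ts_ms"] - t0_ms)
        # env lines carry a ts_ms too, but they are not metric samples.
    return elapsed


example = [
    '{"kind":"env","ts_ms":1748542000000,"host":"omnibenchmark"}',
    '{"kind":"metadata","pid":1234,"t0_ms":1748542000000}',
    '{"kind":"tree","ts_ms":1748542001000,"aggregated":null}',
]
print(elapsed_ms_per_sample(example))  # [1000]
```

The sketch uses the top-level `ts_ms` of each `tree` line; the nested `parent` and `children` timestamps differ slightly, as the example record shows.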

## Statistics Output
12 changes: 12 additions & 0 deletions docs/python-api.md
@@ -61,6 +61,7 @@ exit_code, monitor = denet.execute_with_monitoring(
store_in_memory=True, # Store samples in memory
output_file=None, # Optional file output
write_metadata=False, # Write metadata as first line to output file (default False)
write_env=False, # Prepend a host/NUMA/affinity `env` record (default False)
include_children=True # Monitor child processes (default True)
)

@@ -111,6 +112,17 @@ exit_code, monitor = denet.execute_with_monitoring(
write_metadata=True # Includes metadata as first line: {"pid": 1234, "cmd": ["python", "script.py"], "executable": "/usr/bin/python", "t0_ms": 1625184000000}
)

# Capture host/NUMA/affinity reproducibility info as the very first line.
# Useful when comparing benchmark runs across machines or affinity settings.
exit_code, monitor = denet.execute_with_monitoring(
cmd=["python", "bench.py"],
output_file="metrics.jsonl",
write_env=True,
write_metadata=True,
)
# Or grab it as a string on demand:
env_line = monitor.get_env() # tagged JSON: {"kind":"env","host":...,"numa":{...},"affinity_inherited":"0-127",...}

# execute_with_monitoring also accepts subprocess.run arguments:
exit_code, monitor = denet.execute_with_monitoring(
cmd=["python", "script.py"],
31 changes: 23 additions & 8 deletions src/bin/denet.rs
@@ -3,7 +3,7 @@ use colored::Colorize;
#[cfg(feature = "ebpf")]
use denet::ebpf::debug;
use denet::error::Result;
use denet::monitor::{AggregatedMetrics, Metrics, Summary, SummaryGenerator};
use denet::monitor::{tagged_json, AggregatedMetrics, Metrics, Summary, SummaryGenerator};
use denet::ProcessMonitor;
use std::fs::File;
use std::io::{self, Write};
@@ -71,6 +71,10 @@ struct Args {
#[clap(long)]
no_polling: bool,

/// Write a host/NUMA/affinity `env` record before metadata (for reproducibility)
#[clap(long)]
write_env: bool,

#[command(subcommand)]
command: Commands,
}
@@ -306,12 +310,23 @@ fn execute_monitoring_with_output(
None
};

// Emit env record first (if requested) — captures host/NUMA/affinity for reproducibility.
if args.write_env {
let env_json = tagged_json("env", &monitor.get_env()).unwrap();
if let Some(file) = &mut file_handles.out_file {
writeln!(file, "{env_json}")?;
}
if args.json && !args.quiet {
println!("{env_json}");
}
}

// Get metadata
let metadata = monitor.get_metadata();

// Emit metadata first (always for files, only output to console if JSON mode)
// Emit metadata (always for files, only output to console if JSON mode)
if let Some(metadata_ref) = &metadata {
let metadata_json = serde_json::to_string(&metadata_ref).unwrap();
let metadata_json = tagged_json("metadata", metadata_ref).unwrap();
if let Some(file) = &mut file_handles.out_file {
writeln!(file, "{metadata_json}")?;
}
@@ -346,7 +361,7 @@ fn execute_monitoring_with_output(

let final_tree_metrics = monitor.sample_tree_metrics();
if args.json {
let json = serde_json::to_string(&final_tree_metrics).unwrap();
let json = tagged_json("tree", &final_tree_metrics).unwrap();
println!("{json}");
} else if let Some(agg) = final_tree_metrics.aggregated {
results.push(convert_aggregated_to_metrics(&agg));
@@ -375,7 +390,7 @@ fn execute_monitoring_with_output(

// Format and display metrics
if args.json {
let json = serde_json::to_string(&metrics).unwrap();
let json = tagged_json("sample", &metrics).unwrap();
if let Some(file) = &mut file_handles.out_file {
writeln!(file, "{json}")?;
}
@@ -401,7 +416,7 @@ fn execute_monitoring_with_output(
} else {
let formatted = format_metrics(&metrics);
if let Some(file) = &mut file_handles.out_file {
writeln!(file, "{}", serde_json::to_string(&metrics).unwrap())?;
writeln!(file, "{}", tagged_json("sample", &metrics).unwrap())?;
}
if !args.quiet {
if update_in_place {
@@ -441,7 +456,7 @@ fn execute_monitoring_with_output(

// Format and display tree metrics
if args.json {
let json = serde_json::to_string(&tree_metrics).unwrap();
let json = tagged_json("tree", &tree_metrics).unwrap();
if let Some(file) = &mut file_handles.out_file {
writeln!(file, "{json}")?;
}
@@ -469,7 +484,7 @@ fn execute_monitoring_with_output(
// Format and display tree metrics with parent and children
let formatted = format_aggregated_metrics(agg_metrics);
if let Some(file) = &mut file_handles.out_file {
writeln!(file, "{}", serde_json::to_string(&tree_metrics).unwrap())?;
writeln!(file, "{}", tagged_json("tree", &tree_metrics).unwrap())?;
}
if !args.quiet {
if update_in_place {
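
The body of `tagged_json` is not part of this diff, so its exact behaviour is not shown here. Judging only from the documented output (a `"kind"` field at the start of every line), the effect can be approximated by the Python sketch below. `tag_record` is a hypothetical name, and dropping a pre-existing `"kind"` key from the payload is an assumption made to keep the tag unambiguous.

```python
import json


def tag_record(kind: str, payload: dict) -> str:
    """Serialize payload as one compact JSON line with "kind" as the first key.

    Assumption: any "kind" key already present in the payload is replaced.
    """
    tagged = {"kind": kind}
    tagged.update({key: value for key, value in payload.items() if key != "kind"})
    # Compact separators keep each record on a single short line.
    return json.dumps(tagged, separators=(",", ":"))


print(tag_record("metadata", {"pid": 1234, "t0_ms": 1748542000000}))
# {"kind":"metadata","pid":1234,"t0_ms":1748542000000}
```

Tagging at serialization time, rather than adding a field to every metrics struct, keeps the in-memory types unchanged while still letting readers dispatch on `kind`.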
19 changes: 19 additions & 0 deletions src/config.rs
@@ -106,6 +106,9 @@ pub struct OutputConfig {
pub update_in_place: bool,
/// Whether to write metadata as first line when writing to file
pub write_metadata: bool,
/// Whether to write an `env` (host/NUMA/affinity) record before metadata
/// when writing to file. Captured once at the start of monitoring.
pub write_env: bool,
}

impl Default for OutputConfig {
@@ -117,6 +120,7 @@ impl Default for OutputConfig {
quiet: false,
update_in_place: true,
write_metadata: false,
write_env: false,
}
}
}
@@ -210,6 +214,7 @@ pub struct OutputConfigBuilder {
quiet: Option<bool>,
update_in_place: Option<bool>,
write_metadata: Option<bool>,
write_env: Option<bool>,
}

impl OutputConfigBuilder {
@@ -248,6 +253,11 @@ impl OutputConfigBuilder {
self
}

pub fn write_env(mut self, write: bool) -> Self {
self.write_env = Some(write);
self
}

pub fn build(self) -> OutputConfig {
OutputConfig {
output_file: self.output_file,
@@ -256,6 +266,7 @@ impl OutputConfigBuilder {
quiet: self.quiet.unwrap_or(false),
update_in_place: self.update_in_place.unwrap_or(true),
write_metadata: self.write_metadata.unwrap_or(false),
write_env: self.write_env.unwrap_or(false),
}
}
}
@@ -481,6 +492,7 @@ mod tests {
.quiet(true)
.update_in_place(false)
.write_metadata(true)
.write_env(true)
.build();

assert_eq!(config.output_file, Some(PathBuf::from("output.json")));
@@ -489,6 +501,13 @@ mod tests {
assert!(config.quiet);
assert!(!config.update_in_place);
assert!(config.write_metadata);
assert!(config.write_env);
}

#[test]
fn test_output_config_write_env_default_false() {
let config = OutputConfigBuilder::default().build();
assert!(!config.write_env);
}

#[test]
62 changes: 43 additions & 19 deletions src/core/process_monitor.rs
@@ -169,32 +169,35 @@ pub fn summary_from_json_file<P: AsRef<Path>>(path: P) -> io::Result<Summary> {
continue;
}

// Try to parse as different types of metrics
if let Ok(agg_metric) = serde_json::from_str::<AggregatedMetrics>(&line) {
// Got aggregated metrics
if first_timestamp.is_none() {
first_timestamp = Some(agg_metric.ts_ms);
}
last_timestamp = Some(agg_metric.ts_ms);
metrics_vec.push(agg_metric);
} else if let Ok(tree_metrics) = serde_json::from_str::<ProcessTreeMetrics>(&line) {
// Got tree metrics, extract aggregated metrics if available
if let Some(agg) = tree_metrics.aggregated {
// Try the tagged Record schema first; fall back to legacy untagged
// shapes for files written before the `kind` discriminator existed.
match crate::monitor::record::parse_record(&line) {
Some(crate::monitor::record::Record::Aggregated(agg)) => {
if first_timestamp.is_none() {
first_timestamp = Some(agg.ts_ms);
}
last_timestamp = Some(agg.ts_ms);
metrics_vec.push(agg);
metrics_vec.push(*agg);
}
} else if let Ok(metric) = serde_json::from_str::<Metrics>(&line) {
// Got regular metrics
if first_timestamp.is_none() {
first_timestamp = Some(metric.ts_ms);
Some(crate::monitor::record::Record::Tree(tree)) => {
if let Some(agg) = tree.aggregated {
if first_timestamp.is_none() {
first_timestamp = Some(agg.ts_ms);
}
last_timestamp = Some(agg.ts_ms);
metrics_vec.push(agg);
}
}
last_timestamp = Some(metric.ts_ms);
regular_metrics.push(metric);
Some(crate::monitor::record::Record::Sample(metric)) => {
if first_timestamp.is_none() {
first_timestamp = Some(metric.ts_ms);
}
last_timestamp = Some(metric.ts_ms);
regular_metrics.push(metric);
}
// Env / Metadata / unknown: header records, no metric content.
_ => {}
}
// Ignore metadata and other lines we can't parse
}

// Calculate total time
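
The timestamp bookkeeping in the loop above (track the first and last metric timestamp, skip header records) can be mirrored in Python for readers who post-process the files themselves. This is a sketch, not a port: `sample_time_span_ms` is a hypothetical name, and computing the span as last minus first is an assumption, since the Rust code that follows the loop is not shown in this hunk.

```python
import json
from typing import Iterable, Optional


def sample_time_span_ms(lines: Iterable[str]) -> Optional[int]:
    """Return last - first metric timestamp, ignoring env/metadata lines.

    Tree records contribute the timestamp of their `aggregated` block,
    matching the loop above; tree records without one are skipped.
    """
    first_ts = None
    last_ts = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unparseable lines are ignored
        if not isinstance(record, dict):
            continue
        kind = record.get("kind")
        if kind in ("env", "metadata"):
            continue  # header records carry no metric content
        if kind == "tree" or (kind is None and "aggregated" in record):
            source = record.get("aggregated")
        else:
            source = record
        ts = source.get("ts_ms") if isinstance(source, dict) else None
        if ts is None:
            continue  # e.g. legacy metadata lines, which have no ts_ms
        if first_ts is None:
            first_ts = ts
        last_ts = ts
    if first_ts is None:
        return None
    return last_ts - first_ts
```

Returning `None` for a file with no metric lines lets the caller distinguish "no samples" from a genuine zero-length run.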
@@ -858,6 +861,12 @@ impl ProcessMonitor {
self.include_children
}

/// Snapshot host/NUMA/affinity/governor state for the monitored PID.
/// One-shot, suitable for writing as the first JSONL line.
pub fn get_env(&self) -> crate::monitor::EnvRecord {
crate::monitor::EnvRecord::collect(self.pid as u32)
}

/// Returns metadata about the monitored process
// Get process metadata (static information)
pub fn get_metadata(&mut self) -> Option<ProcessMetadata> {
@@ -2226,6 +2235,21 @@ mod tests {
}
}

#[test]
fn test_get_env_returns_record_for_running_process() {
// get_env is a one-shot wrapper; just verify it returns a record
// whose ts_ms is populated and host/kernel are non-empty on Linux.
let cmd = vec!["sleep".to_string(), "1".to_string()];
let monitor = create_test_monitor(cmd).unwrap();
let env = monitor.get_env();
assert!(env.ts_ms > 0);
#[cfg(target_os = "linux")]
{
assert!(!env.host.is_empty());
assert!(!env.kernel.is_empty());
}
}

#[test]
fn test_process_metadata() {
use std::thread;