diff --git a/HOWTO.md b/HOWTO.md
index 123647c..e5ef6d5 100644
--- a/HOWTO.md
+++ b/HOWTO.md
@@ -1,16 +1,17 @@
+### Disclaimer
+
+Like many other debugging tools, **bpfvv** may help you better understand **what** is happening with the verification of your BPF program but it is up to you to figure out **why** it is happening.
+
# How to use bpfvv
-> [!WARNING]
-> The bpfvv app is in early stages of development, and you should expect
-> bugs, UI inconveniences and significant changes from week to week.
->
-> If you're working with BPF and you think this tool (or a better
-> version of it) would be useful, feel free to use it and don't be shy
-> to report issues and request features via github. Thanks!
+The tool itself is hosted here: https://libbpf.github.io/bpfvv/
-Go here: https://libbpf.github.io/bpfvv/
+You can load a log by pasting it into the text box or choosing a local file.
-Load a log by pasting it into the text box or choosing a file.
+You can also use the `url` query parameter to link to a raw log file, for example:
+```
+https://libbpf.github.io/bpfvv/?url=https://gist.githubusercontent.com/theihor/e0002c119414e6b40e2192bd7ced01b1/raw/866bcc155c2ce848dcd4bc7fd043a97f39a2d370/gistfile1.txt
+```
The app expects BPF verifier log of `BPF_LOG_LEVEL1`[^1]. This is a log
that you get when your BPF program has failed verification on load
@@ -42,14 +43,32 @@ lot of information about the interpreted state of the program on each
instruction. The app parses the log and re-constructs program states
in order to display potentially useful information in interactive way.
-There are two main views of the program:
-* (on the left) formatted log, instruction stream
-* (on the right) program state: known values of registers and stack slots
-
+## UI overview
+
+There are three main views of the program:
+* (on the left) C source view
+* (in the middle) interactive instruction stream
+* (on the right) program state: known values of registers and stack slots for the *selected log line*
+
+The left and right views are collapsible
-## What's in the log
+https://github.com/user-attachments/assets/758d650b-22f1-49f0-ab46-ae1a089667a8
-Notice that the displayed text has different content than the raw log.
+### Top bar
+
+The top bar contains basic app controls such as:
+* clear current log
+* load an example log
+* load a local file
+* link to this howto doc
+
+https://github.com/user-attachments/assets/4d3f8aa0-cb9d-46e0-ae46-a1224c7a5600
+
+### The instruction stream
+
+The main view of the log is the interactive instruction stream.
+
+Notice that the displayed text has content different from the raw log.
For example, consider this line:
```
1: (7b) *(u64 *)(r10 -24) = r2 ; R2_w=1 R10=fp0 fp-24_w=1
@@ -70,157 +89,152 @@ interactive features. Notable example is call instructions.
For example, consider the following raw log line:
```
-23: (85) call bpf_map_lookup_elem#1 ; R0=map_value_or_null(id=3,map=eventmap,ks=4,vs=2452)
+7: (85) call bpf_probe_read_user#112
```
It is displayed like this:
```
-r0 = call bpf_map_lookup_elem#1(r1, r2, r3, r4, r5)
+r0 = bpf_probe_read_user(dst: r1, size: r2, unsafe_ptr: r3)
```
+If bpfvv is aware of a helper signature, it knows the number and names of arguments and displays them in the format `name: reg`.
+For known helpers its name is also a link to documentation for that helper.
+
Notice also that the lines not recognized by the parser are greyed
out. If you notice an unrecognized instruction, please submit a bug
report.
-### Subprogram calls
+#### Data dependencies
+
+The app computes a use-def analysis [^2] and you can interactively view dependencies between the instructions.
+
+The concept is simple. Every instruction may read some slots (registers, stack, memory) and write to others.
+Knowing these it is possible to determine, for a given slot, where its value came from, from what slot, and at what instruction.
+
+You can view the results of this analysis by clicking on some instruction operands (registers and stack slots).
+
+The selected slot is identified by a box around it. This selection changes the log view, greying out "irrelevant" instructions, and leaving only data-dependent instructions in the foreground.
+
+On the left side of the instruction stream are the lines visualizing the dependencies. The lines are interactive and can be used for navigation.
+
+https://github.com/user-attachments/assets/82ae80d6-314e-47bf-9892-f5dded4b9944
+
+#### Subprogram calls
When there is a subprogram call in the log instruction stream, the
-stack frames are tracked by the app when computing state. When subprogram
-call is detected there is indentation and comments in the main log view to
-visualize it.
+stack frames are tracked by the app when computing state. When a subprogram
+call is detected it is visualized in the main log view.
-
+https://github.com/user-attachments/assets/14b2302e-9814-4d9a-ae94-e176727fd11a
+### The state panel
-## What can you do?
+The state panel displays the current state of the program based on the loaded log, with the current state determined by the line selected in the instruction stream view.
-### Step through the instruction stream
+Remember that the verifier log is a trace through the program.
+This means that a particular instruction may be visited more than once, and the state at the same instruction (but a different point of execution) is usually also different. And so a log line roughly represents a particular point of the program execution, as interpreted by the BPF verifier.
-The most basic feature of the visualizer is "stepping" through the
-log, similar to what you'd do in a debugger.
+The verifier reports changes in the program state like this:
+```
+1: (7b) *(u64 *)(r10 -24) = r2 ; R2_w=1 R10=fp0 fp-24_w=1
+```
+After the semicolon `;`, there are expressions showing relevant register and stack slot states. The visualizer accumulates this information from all the prior instructions, and in the state panel this accumulated state is displayed.
-You can select a line by clicking on it, or by navigating with arrows
-(you can also use pgup, pgdown, home and end). The selected line has
-light-blue background.
+The header of the state panel shows the context of the state: log line number, C line number, program counter (PC) and the stack frame index.
-When a line is selected, current state of known values is displayed in
-the panel on the right. By moving the selected line up/down the log,
-you can see how the values change with each instruction.
+The known values of the registers and stack slots are displayed in a table.
-In the "state panel", the values that are written by selected
-instruction are marked with light-red background and the previous
-value is also often displayed, for example:
+The background color of a row in the state panel indicates that the relevant value has been affected by the selected instruction.
+Rows marked with red background indicate a "write" and the previous value is also often displayed, for example:
```
r6 scalar(id=1) -> 0
```
-Means that current instruction changes the value of `r6` from
-`scalar(id=1)` to `0`.
+This means that current instruction changes the value of `r6` from `scalar(id=1)` to `0`.
-The values that are read by current instruction have light-green
-background.
+The values that are read by the current instruction have a blue background.
-Note that for "update" instructions (such as `r1 += 8`), the slot
-will be marked as written.
+Note that for "update" instructions (such as `r1 += 8`), the slot will be marked as written.
-#### Sometimes a value of a slot has changed, but it's not highlighted as a write. Is that a bug?
+This then allows you to "step through" the instruction stream and watch how the values are changing, similar to classic debugger UIs.
+You can click on the lines that interest you, or use arrow keys to navigate.
-Currently the visualizer only considers writes derived from the instructions
-themselves. For example, `r1 = r2` is a write by definition, or a call would
-scratch some registers.
+https://github.com/user-attachments/assets/c6b5b5b1-30fb-4309-a90a-1832a0a33502
-But remember that we are looking at the BPF verifier log. BPF verifier
-simulates execution of a program, which requires maintaining and continuously
-updating a virtual state of the program. This means that whenever the verifier
-gains some knowledge about a value (which is not necesarily a write instruction),
-it will update it.
+#### The rows in the state panel are clickable!
-For example when processing conditional jumps such as `if (r2 == 0) goto pc+6`,
-the verifier usually explores both branches. But in both cases it gained information
-about r2: it's either 0 or not. And so while there was no explicit write into r2,
-it's value is known (and has changed) after the jump instruction, when you look at
-it in the verifier log.
+It is sometimes useful to jump to the source of a particular slot value from the selected instruction, even if the slot is not relevant to that instruction.
-Going forward the visualizer will likely treat all value updates as writes,
-as it is useful to know at what point verifier inferred a particular value.
+https://github.com/user-attachments/assets/8f5d03cc-54a5-426b-8428-c8b11f4ccf11
-### View data dependencies
+### The C source view
-The app computes a use-def analysis [^2] and you can interactively
-view dependencies between the instructions.
+The C source view panel (on the left) shows reconstructed C source lines.
-The concept is simple. Every instruction may read some slots
-(registers, stack, memory) and write to others. Knowing these sets
-(verifier log contains enough information to compute them), it is
-possible to determine for a slot used by current instruction, where
-its value came from (from what slot in what instruction).
+A raw verifier log might contain source line information, and bpfvv attempts to reconstruct the source code and associate it with the instructions.
+Here is how it looks in the raw log:
+```
+1800: R1=scalar() R10=fp0
+; int rb_print(rbtree_t __arg_arena *rbtree) @ rbtree.bpf.c:507
+1800: (b4) w0 = -22 ; R0_w=0xffffffea
+; if (unlikely(!rbtree)) @ rbtree.bpf.c:517
+1801: (15) if r1 == 0x0 goto pc+132 ; R1=scalar(umin=1)
+```
-You can view the results of this analysis by clicking on some
-instruction operands (registers and stack slots).
+The original source code is not available in the log of course. So bpfvv doesn't have enough information to even format it properly.
-The selected slot is identified by a box. This selection changes the
-log view, greying out "irrelevant" instructions, and leaving only
-data-dependent instructions in the foreground.
-
+However, it allows you to see a rough correspondence between BPF instructions and the original C source code.
-#### What's clickable?
+Be aware though that this information is noisy and may be inaccurate, since it reached the visualizer through a long way:
+* the compiler generated DWARF with line info, which is already "best-effort"
+* DWARF was transformed into BTF with line data
+* BTF was processed by the verifier and available information was dumped interleaved with the program trace
-Registers r0-r9 and explicit stack accesses such as `*(u32 *)(r10 -8)`.
+https://github.com/user-attachments/assets/3e8c52f0-3823-4d5f-abbd-f7c2d8e31d19
-r10 (stack frame pointer) is not clickable because it's effectively a
-constant [^3].
+### The bottom panel
-Note that the stack slots may be accessed indirectly: if say `r6 = fp-64`
-and then you do `*(u32 *)(r6 -8)` it's equivalent to `*(u32 *)(r10 -72)`.
-The visualizer does not show such dependencies (yet). Although state values
-are tracked correctly.
+The bottom panel shows original log text for the selected line and for the current hovered line.
+It is sometimes useful to check the source of the information displayed by the visualizer.
-#### How deep is the displayed dependency chain?
-It depends, but usually not deep.
+## Not frequently asked questions
-The problem with showing all dependencies is that it's too much
-information, which renders it useless.
+### What exactly do "read" and "written" values means here?
-Currently the upstream instruction is highlighted if it's an
-unambiguous dependency. For example:
-```
-42: r1 = 13
-43: r7 = 0
-44: r2 = r1
-```
+Here is a couple of trivial examples:
+* `r1 = 0` this is a write to `r1`
+* `r2 = r3` this is a read of `r3` and write to `r2`
+* `r2 += 1` this is a read of `r2` and write to `r2`, aka an update
-Instruction 42 is an unambiguous dependency of instruction 44, because
-r1 is the only read slot, and there were no modifications to it along
-the way.
+Here is a couple of more complicated examples:
+* `*(u64 *)(r10 -32) = r1` this is a read of `r1` and a write to `fp-32`
+ * `r10` is effectively constant[^3], as it is always a pointer to the top of a BPF stack frame, so stores to `r10-offset` are writes to the stack slots, identified by `fp-off` or `fp[frame]-off` in the visualizer
+* `r1 = *(u64 *)(r2 -8)` this is a write to `r1` and a read of `r2`, however it may also be a read of the stack, if `r2` happens to contain a pointer to the stack slot
-All such direct dependencies up the chain are shown.
+Most instructions have intrinsic "read" and "write" sets, defined by its semantics. However context also matters, as you can see from the last example.
-However, when more than one value is read in the upstream instruction,
-the UI will stop highlighting at that instruction.
+The visualizer takes into account a few important things, when determining data dependencies:
+* it is aware of scratch and callee-saved register semantics of subprogram/helper calls
+* it is aware of the stack frames: we enter new stack memory in a subprogram, and pop back on exit
+* it is aware of indirect stack slot access and basic pointer arithmetic
-Consider an example:
-```
-42: r1 = r2
-43: r3 = *(u32 *)(r10 -16)
-44: r1 += r3
-45: *(u32 *)(r10 -64) = r1
-```
+### Side effects?
-If you select `r1` at instruction 45, only instruction 44 will be
-highlighted, even though 42 and 43 are its transitive dependencies
-(`r1 += r3` reads both `r1` and `r3`).
+One counterintuitive thing about data dependencies in the context of BPF verification is that the instructions which don't do any arithmetic or memory stores can still change the progam state.
-The reason for this UI behavior is that showing all dependencies (both
-r1 and r3 and in turn all their dependencies) may very quickly cover
-most of the instructions. This is especially true for call
-instructions, which read up to 5 registers.
+Remember, we are looking at the BPF verifier log.
+The BPF verifier simulates the execution of a program, which requires maintaining a virtual state of the program.
+This means that whenever the verifier gains some knowledge about a value (which is not necesarily an intrinsic write instruction), it will update the program state.
-On the other hand the app can't know what the user is looking for, and
-there is no point in guessing. So, for an instruction like `r1 += r3`,
-the user must choose specific operand (r1 or r3 in this case) to
-expand the dependency chain further.
+For example, when processing conditional jumps such as `if (r2 == 0) goto pc+6`,
+the verifier usually explores both branches. But in both cases it gained information
+about `r2`: it's either 0 or not. And so while there was no explicit write into r2,
+it's value is known (and has changed) after the jump instruction, when you look at
+it in the verifier log.
-#### Note on memory stores and loads
+https://github.com/user-attachments/assets/94d271e2-f033-439b-8554-d9f8a66b4143
+
+### What if we write to memory or a BPF arena?
Currently non-stack memory access is a "black hole" from the point of
view of use-def analysis in this app. The reason is that it's
@@ -235,6 +249,28 @@ dependencies. If you see `*(u32 *)(r8 +0)` down the instruction
stream, even if value of r8 hasn't changed, the analysis does not
recognize these slots as "the same".
+**Unless** `r8` contains a pointer to a stack slot.
+In that case you can click both on the register to see where its value came from, and on the dereference expression to see where the stack slot value came from.
+
+https://github.com/user-attachments/assets/f345ec63-b91d-411c-b1d2-3890ed8f1c99
+
+### An instruction is highlighted as dependency, but I don't understand why. Is that a bug?
+
+Probably not[^4].
+
+The visualizer has a single source of information: the verifier log.
+The log contains two streams of information: the instructions and the associated state change, as reported by the verifier.
+
+Some of the state that the visualizer computes is derived from the instructions themselves.
+However, the state reported by the verifier always takes precedence.
+
+Since the values in the context of the visualizer are just strings, if the verifier reported a slightly different string, we treat it as an update.
+For example, you might see something like this:
+```
+r8 ptr_or_null_node_data(id=9,ref_obj_id=9,off=16) -> ptr_node_data(ref_obj_id=9,off=16)
+```
+
+The verifier reported a different value, and that's what bpfvv shows.
## Footnotes
@@ -247,3 +283,5 @@ the browser will not be happy to render it.
[^2]: https://en.wikipedia.org/wiki/Use-define_chain
[^3]: https://docs.cilium.io/en/latest/reference-guides/bpf/architecture/
+
+[^4]: But maybe yes... If you suspect a bug, please report.
diff --git a/README.md b/README.md
index 8a828e4..b60daa4 100644
--- a/README.md
+++ b/README.md
@@ -1,21 +1,17 @@
-> [!WARNING]
-> The bpfvv app is in early stages of development, and you should expect
-> bugs, UI inconveniences and significant changes from week to week.
->
-> If you're working with BPF and you think this tool (or a better
-> version of it) would be useful, feel free to use it and don't be shy
-> to report issues and request features via github. Thanks!
-
[](https://github.com/libbpf/bpfvv/actions/workflows/ci.yml)
**bpfvv** stands for BPF Verifier Visualizer
https://libbpf.github.io/bpfvv/
-This project is an experiment about visualizing Linux Kernel BPF verifier log to help BPF programmers with debugging verification failures.
+BPF Verifier Visualizer is a tool to analyze Linux Kernel BPF verifier logs.
+
+The goal of bpfvv is to help BPF programmers debug verification failures.
The user can load a text file, and the app will attempt to parse it as a verifier log. Successfully parsed lines produce a state which is then visualized in the UI. You can think of this as a primitive debugger UI, except it interprets a log and not a runtime state of a program.
+For more information on how to use **bpfvv** see the [HOWTO.md](https://github.com/libbpf/bpfvv/blob/master/HOWTO.md)
+
## Development
- Fork the website repo: https://github.com/libbpf/bpfvv.git