Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
# Pull Request Template

## Goal
<!-- What does this PR accomplish? 1 sentence. -->

## Changes

-

## Testing
<!-- How did you verify it? -->

## Checklist

- [ ] Title is a clear sentence (≤ 70 chars)
- [ ] Commits are signed (`git log --show-signature`)
- [ ] `submissions/labN.md` updated
43 changes: 43 additions & 0 deletions submissions/attachments/lab4/lab4-trace.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
01:49:40.615211 IP6 ::1.53968 > ::1.8080: Flags [S], seq 3820193617, win 65476, options [mss 65476,sackOK,TS val 1817494082 ecr 0,nop,wscale 7], length 0
`..i.(.@.......................................Q.........0.........
lT.B........
01:49:40.615231 IP6 ::1.8080 > ::1.53968: Flags [S.], seq 1109512878, ack 3820193618, win 65464, options [mss 65476,sackOK,TS val 1817494082 ecr 1817494082,nop,wscale 7], length 0
`....(.@....................................B!.....R.....0.........
lT.BlT.B....
01:49:40.615245 IP6 ::1.53968 > ::1.8080: Flags [.], ack 1, win 512, options [nop,nop,TS val 1817494082 ecr 1817494082], length 0
`..i. .@.......................................RB!.......(.....
lT.BlT.B
01:49:40.615379 IP6 ::1.53968 > ::1.8080: Flags [P.], seq 1:175, ack 1, win 512, options [nop,nop,TS val 1817494082 ecr 1817494082], length 174: HTTP: POST /notes HTTP/1.1
`..i...@.......................................RB!.............
lT.BlT.BPOST /notes HTTP/1.1
Host: localhost:8080
User-Agent: curl/8.5.0
Accept: */*
Content-Type: application/json
Content-Length: 39

{"title":"trace me","body":"in flight"}
01:49:40.615388 IP6 ::1.8080 > ::1.53968: Flags [.], ack 175, win 511, options [nop,nop,TS val 1817494082 ecr 1817494082], length 0
`.... .@....................................B!...........(.....
lT.BlT.B
01:49:40.617285 IP6 ::1.8080 > ::1.53968: Flags [P.], seq 1:207, ack 175, win 512, options [nop,nop,TS val 1817494084 ecr 1817494082], length 206: HTTP: HTTP/1.1 201 Created
`......@....................................B!.................
lT.DlT.BHTTP/1.1 201 Created
Content-Type: application/json
Date: Tue, 16 Jun 2026 20:49:40 GMT
Content-Length: 93

{"id":5,"title":"trace me","body":"in flight","created_at":"2026-06-16T20:49:40.615872267Z"}

01:49:40.617332 IP6 ::1.53968 > ::1.8080: Flags [.], ack 207, win 511, options [nop,nop,TS val 1817494084 ecr 1817494084], length 0
`..i. .@........................................B!.}.....(.....
lT.DlT.D
01:49:40.617562 IP6 ::1.53968 > ::1.8080: Flags [F.], seq 175, ack 207, win 512, options [nop,nop,TS val 1817494085 ecr 1817494084], length 0
`..i. .@........................................B!.}.....(.....
lT.ElT.D
01:49:40.617680 IP6 ::1.8080 > ::1.53968: Flags [F.], seq 207, ack 176, win 512, options [nop,nop,TS val 1817494085 ecr 1817494085], length 0
`.... .@....................................B!.}.........(.....
lT.ElT.E
01:49:40.617702 IP6 ::1.53968 > ::1.8080: Flags [.], ack 208, win 512, options [nop,nop,TS val 1817494085 ecr 1817494085], length 0
`..i. .@........................................B!.~.....(.....
lT.ElT.E
174 changes: 174 additions & 0 deletions submissions/lab4.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,174 @@
# Lab 4 submission

Environment: WSL2 (Ubuntu 24.04). QuickNotes was cross-compiled for `linux/amd64` and run inside WSL so the loopback traffic is visible to `tcpdump -i lo`.

## Task 1: Trace a request end-to-end

### 1.1 Capture

```bash
# terminal A: run the server
./quicknotes

# terminal B: capture loopback traffic on port 8080
sudo tcpdump -i lo -nn -s 0 -A 'tcp port 8080' -w lab4-trace.pcap

# terminal C: fire exactly one request
curl -v -X POST http://localhost:8080/notes \
-H 'Content-Type: application/json' \
-d '{"title":"trace me","body":"in flight"}'
```

`curl` resolved `localhost` to `::1` first, so the whole exchange runs over the **IPv6 loopback** (`::1`).

### 1.2 Decode (annotated)

Full decode: [lab4-trace.txt](attachments/lab4/lab4-trace.txt). The four phases the lab asks for:

**1. TCP three-way handshake** (`SYN` -> `SYN/ACK` -> `ACK`):

```
::1.53968 > ::1.8080: Flags [S], seq 3820193617 # SYN
::1.8080 > ::1.53968: Flags [S.], seq 1109512878, ack 1 # SYN/ACK
::1.53968 > ::1.8080: Flags [.], ack 1 # ACK -> connection established
```

**2. HTTP request line + JSON body** (`PSH` carrying 174 bytes):

```
::1.53968 > ::1.8080: Flags [P.], seq 1:175 ... HTTP: POST /notes HTTP/1.1
POST /notes HTTP/1.1
Host: localhost:8080
User-Agent: curl/8.5.0
Content-Type: application/json
Content-Length: 39

{"title":"trace me","body":"in flight"}
```

**3. HTTP response line + response JSON** (`201 Created`, 206 bytes):

```
::1.8080 > ::1.53968: Flags [P.], seq 1:207 ... HTTP: HTTP/1.1 201 Created
HTTP/1.1 201 Created
Content-Type: application/json
Content-Length: 93

{"id":5,"title":"trace me","body":"in flight","created_at":"2026-06-16T20:49:40.615872267Z"}
```

**4. Connection close** (graceful `FIN` from both ends):

```
::1.53968 > ::1.8080: Flags [F.], seq 175 # client FIN
::1.8080 > ::1.53968: Flags [F.], seq 207 # server FIN
::1.53968 > ::1.8080: Flags [.], ack 208 # final ACK -> closed
```

### 1.3 Five debugging commands

**1. What is listening?** `sudo ss -tlnp | grep :8080`

```
LISTEN 0 4096 *:8080 *:* users:(("quicknotes",pid=471,fd=3))
```

`quicknotes` owns the listening socket on port 8080 (`sudo` needed to see the PID/process name).

**2. Routes from the host.** `ip route show`

```
default via 172.21.48.1 dev eth0 proto kernel
172.21.48.0/20 dev eth0 proto kernel scope link src 172.21.57.201
```

Default route goes out the WSL NAT gateway on `eth0`; our `localhost` request never touches it - it stays on `lo`.

**3. Reachability.** `mtr -rwc 5 localhost`

```
HOST: danielpancakePC Loss% Snt Last Avg Best Wrst StDev
1.|-- localhost 0.0% 5 0.1 0.1 0.1 0.1 0.0
```

One hop, 0% loss, ~0.1 ms - the loopback path is healthy.

**4. DNS works.** `dig +short example.com @1.1.1.1`

```
172.66.147.243
104.20.23.154
```

The external resolver answers, so DNS egress is fine.

**5. Service logs.** `journalctl --user -u quicknotes -n 20`

```
-- No entries --
```

QuickNotes runs as a foreground process, not a systemd unit, so journald has nothing for it. Its log line goes to stdout instead: `quicknotes listening on :8080 (notes loaded: 4)`.

### 1.4 What I would check first if QuickNotes returned 502

A 502 comes from a gateway/proxy in front of the app that could not get a valid response from the upstream, so I debug the proxy-to-app boundary outside-in. First I confirm the app is actually up and bound to the expected port (`ss -tlnp | grep :8080`), then I hit it directly, bypassing the proxy (`curl -s -o /dev/null -w '%{http_code}' http://localhost:8080/health`). If the direct call is `200`, the app is healthy and the fault is the proxy's upstream config - wrong port or host, or the app bound to `127.0.0.1` while the proxy dials a different address. If the direct call also fails, the app crashed or never bound, so I check its logs/exit status and whether something else already holds the port. The order is always: is it running, is it listening, is it reachable from where the proxy sits, and is the proxy pointed at the right upstream.

## Task 2: Outside-in debugging on a broken deploy

Reproduced with the same `~/lab4/quicknotes` linux binary, started twice on the same port to simulate a failed deploy.

### 2.1 Run a broken instance

```bash
ADDR=:8080 ./quicknotes > qn1.log 2>&1 & # instance 1 - binds :8080
sleep 1
ADDR=:8080 ./quicknotes > /tmp/qn-broken.log 2>&1 & # instance 2 - the "new deploy"
sleep 2
ps -ef | grep quicknotes | grep -v grep
```

Instance 1 (pid 487) stays up; instance 2 logs its error and exits immediately:

```
2026/06/17 01:53:55 quicknotes listening on :8080 (notes loaded: 5)
2026/06/17 01:53:55 listen: listen tcp :8080: bind: address already in use
```

Exact error: `bind: address already in use`. The new deploy never came up.

### 2.2 Outside-in chain

| Step | Command | Output | Decision |
| --- | --- | --- | --- |
| 1. Running? | `ps -ef \| grep quicknotes` | `... 487 ... ./quicknotes` (one PID) | A process IS running - but which instance? |
| 2. Listening? | `sudo ss -tlnp \| grep 8080` | `LISTEN *:8080 users:(("quicknotes",pid=487))` | Port 8080 is held by pid 487 (the first instance) |
| 3. Reachable? | `curl -s -o /dev/null -w "%{http_code}" .../health` | `200` | The socket answers, so L3/L4/L7 are fine |
| 4. Firewall? | `iptables -L` / `nft list ruleset` | neither installed, no rules | No packet filter in the path - not a firewall issue |
| 5. DNS? | `dig +short localhost` | `127.0.0.1` (`getent`: `::1 localhost`) | Name resolution works |

Every outside-in check is green, yet the deploy failed - that is the tell. The fault is not in the network path: a process is already bound to 8080, so the new instance could never start. Cross-referencing `ss` (pid **487** = the first instance) against the broken log (`address already in use`) pins it.

### 2.3 Repair + re-verify

```bash
kill 487 # free the port by killing the conflicting first instance
sleep 1
ADDR=:8080 ./quicknotes & # start the intended instance
curl -s http://localhost:8080/health
```

```
ss: LISTEN *:8080 users:(("quicknotes",pid=526)) # new pid now owns the port
curl: {"notes":5,"status":"ok"} # HTTP 200
```

Port freed, new instance bound, health green.

### Root cause

`listen tcp :8080: bind: address already in use` - a previous QuickNotes process still held port 8080, so the second `ListenAndServe` could not bind and `log.Fatalf` terminated the process.

### Mini-postmortem (blameless)

What is systemic here is not that someone "forgot to stop the old process" - it is that nothing in the path made the conflict visible or prevented it. A second instance was allowed to launch against an already-bound port, and its only signal was a single fatal line that scrolled past in a backgrounded process. From the outside everything looked healthy (a process was running, the port returned 200), which is exactly how a stale instance masquerades as a good deploy. The deeper gap is the missing supervisor: run under systemd or any process manager and the unit owns the port, a restart stops the old instance before starting the new one, and a failed start surfaces in `systemctl status` / `journalctl` instead of a lost stdout line. Tooling that prevents this class of failure: a real init/supervisor with restart-on-failure, a deploy health gate that confirms the *new* process answered (not just any process), and an `ss`/`lsof` preflight that fails fast on a busy port.