Lab 8: SRE and Monitoring #8

Open: tivdzualubem wants to merge 1 commit into main from feature/lab8

@tivdzualubem

Owner

Summary

This PR completes Lab 8 by adding an SRE monitoring stack to QuickNotes: Prometheus, Grafana, alerting, runbook documentation, and external synthetic monitoring.

Implemented

  • Prometheus scraping QuickNotes every 15 seconds
  • Provisioned Grafana Prometheus data source
  • Provisioned four-panel Golden Signals dashboard
  • Real request-latency histogram with P50 and P95 metrics
  • Traffic, error-rate, and saturation panels
  • High-error-rate alert that fires when the error rate stays above 5% for five minutes
  • severity: page label and linked operational runbook
  • Alert validation in both Pending and Firing states
  • Cloudflare Quick Tunnel through GitHub Actions
  • Checkly monitoring every minute from Frankfurt and Singapore
  • Same-window Prometheus and Checkly latency comparison
  • Complete evidence and design answers in submissions/lab8.md
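
For reviewers who want to sanity-check the P50 and P95 panels, the sketch below shows roughly how a quantile can be estimated from cumulative histogram buckets by linear interpolation, which is the general idea behind quantile queries over a Prometheus histogram. It is an illustration only, not the QuickNotes code: the `bucket` type, `estimateQuantile`, `exampleBuckets`, and the bucket boundaries and counts are all hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// bucket is one cumulative histogram bucket (hypothetical type):
// count is the number of observations <= upperBound.
type bucket struct {
	upperBound float64 // seconds; math.Inf(1) for the +Inf bucket
	count      float64 // cumulative count
}

// estimateQuantile approximates quantile q (0..1) by linear interpolation
// inside the bucket that contains the target rank. It assumes the bucket
// with the highest bound is the +Inf bucket.
func estimateQuantile(q float64, buckets []bucket) float64 {
	if len(buckets) < 2 || q < 0 || q > 1 {
		return math.NaN()
	}
	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]bucket(nil), buckets...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].upperBound < sorted[j].upperBound
	})
	total := sorted[len(sorted)-1].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total
	// First bucket whose cumulative count reaches the rank.
	i := sort.Search(len(sorted), func(i int) bool {
		return sorted[i].count >= rank
	})
	if i == len(sorted)-1 {
		// Rank falls in the +Inf bucket: report the highest finite bound.
		return sorted[len(sorted)-2].upperBound
	}
	lower, prevCount := 0.0, 0.0
	if i > 0 {
		lower = sorted[i-1].upperBound
		prevCount = sorted[i-1].count
	}
	inBucket := sorted[i].count - prevCount
	if inBucket == 0 {
		return lower
	}
	return lower + (sorted[i].upperBound-lower)*(rank-prevCount)/inBucket
}

// exampleBuckets returns made-up data for illustration only.
func exampleBuckets() []bucket {
	return []bucket{
		{0.05, 40}, {0.1, 70}, {0.25, 90}, {0.5, 98}, {math.Inf(1), 100},
	}
}

func main() {
	b := exampleBuckets()
	fmt.Printf("P50 ~ %.4fs\n", estimateQuantile(0.50, b))
	fmt.Printf("P95 ~ %.4fs\n", estimateQuantile(0.95, b))
}
```

Because the estimate is interpolated within a bucket, its accuracy depends on how finely the bucket boundaries are spaced around the latencies of interest, which is worth keeping in mind when comparing dashboard percentiles with externally measured latency.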

Validation

  • go test ./... passed
  • go vet ./... passed
  • Docker Compose configuration passed validation
  • Grafana dashboard JSON passed validation
  • Prometheus target reported up = 1
  • Alert transitioned from Pending to Firing
  • Checkly recorded 100% availability during the selected 30-minute window
  • The branch contains one signed Lab 8 commit and no unrelated lab submission files
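
The Pending to Firing transition noted above follows from the alert's hold duration: the condition must stay true for the full five minutes before the alert fires. The sketch below models that behavior under stated assumptions. The names `rule`, `tracker`, `simulate`, and `highErrorRate` are hypothetical, and the real rule is a Prometheus alerting rule, not Go code.

```go
package main

import (
	"fmt"
	"time"
)

type alertState string

const (
	inactive alertState = "inactive"
	pending  alertState = "pending"
	firing   alertState = "firing"
)

// rule is a hypothetical stand-in for an alerting rule with a hold duration.
type rule struct {
	threshold float64       // error ratio above which the condition is true
	holdFor   time.Duration // how long the condition must hold before firing
}

// highErrorRate mirrors the values in this PR: more than 5% for five minutes.
var highErrorRate = rule{threshold: 0.05, holdFor: 5 * time.Minute}

// tracker remembers when the condition first became true.
type tracker struct {
	rule        rule
	active      bool
	activeSince time.Time
}

// evaluate returns the alert state for one evaluation at time now.
func (t *tracker) evaluate(now time.Time, errorRatio float64) alertState {
	if errorRatio <= t.rule.threshold {
		t.active = false // condition cleared: the hold timer resets
		return inactive
	}
	if !t.active {
		t.active = true
		t.activeSince = now
	}
	if now.Sub(t.activeSince) >= t.rule.holdFor {
		return firing
	}
	return pending
}

// simulate evaluates the rule once per step over a series of error ratios.
func simulate(r rule, stepSeconds int, ratios []float64) []alertState {
	t := &tracker{rule: r}
	start := time.Unix(0, 0)
	states := make([]alertState, 0, len(ratios))
	for i, ratio := range ratios {
		now := start.Add(time.Duration(i*stepSeconds) * time.Second)
		states = append(states, t.evaluate(now, ratio))
	}
	return states
}

func main() {
	// Made-up series: 8% errors for six one-minute evaluations, then recovery.
	ratios := []float64{0.08, 0.08, 0.08, 0.08, 0.08, 0.08, 0.01}
	for i, s := range simulate(highErrorRate, 60, ratios) {
		fmt.Printf("minute %d: ratio=%.2f state=%s\n", i, ratios[i], s)
	}
}
```

In this model a brief recovery resets the timer, so a spike that keeps dipping back under 5% never pages. That is the trade-off a hold duration makes between noise and detection delay.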
