release: v0.4.0 #3

Merged: luceydav merged 8 commits into main from dev on Apr 9, 2026

luceydav (Owner) commented on Apr 9, 2026

Attribution fixes (Retrosheet), DRY cleanup (db_query), R CMD check fixes (zip Suggests, .Rbuildignore PDF), renv updated. 261 PASS | 0 errors | 0 warnings.

David Lucey and others added 8 commits March 31, 2026 13:11
The Lahman CSV files are frozen at 2021. BattingStats now conditionally
UNIONs FangraphsBattingWAR rows for yearID > 2021 when FangraphsBattingWAR,
ChadwickIDs, and People tables are all present in the database.

Join path: FangraphsBattingWAR.playerid -> ChadwickIDs.key_fangraphs ->
People.bbrefID -> People.playerID. Graceful fallback to Lahman-only when
FG tables are absent (e.g., in :memory: test DBs).

Adds message '(view, Lahman + FanGraphs (2022+))' on create to confirm
the extension is active.
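
A rough sketch of the conditional UNION with fallback described above, written in Python against sqlite3 purely for illustration (the package itself is R on DuckDB). The `LahmanBatting` table, the `WAR` and `yearID` columns on the FanGraphs table, the `key_bbref` column on `ChadwickIDs`, and the "Lahman only" message are assumptions, not names confirmed by this PR.

```python
import sqlite3

# Tables that must all exist before the FanGraphs rows are added.
REQUIRED_FG_TABLES = {"FangraphsBattingWAR", "ChadwickIDs", "People"}


def existing_tables(con):
    """Return the set of table names present in the connected database."""
    rows = con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return {name for (name,) in rows}


def create_batting_stats(con, cutoff_year=2021):
    """Create the BattingStats view and return a status message."""
    # Lahman-only base query (LahmanBatting is a hypothetical table name).
    sql = "SELECT playerID, yearID, WAR FROM LahmanBatting"
    label = "(view, Lahman only)"  # hypothetical fallback message
    if REQUIRED_FG_TABLES <= existing_tables(con):
        # Join path: FG playerid -> ChadwickIDs -> People.bbrefID -> playerID.
        sql += f"""
        UNION
        SELECT p.playerID, fg.yearID, fg.WAR
        FROM FangraphsBattingWAR AS fg
        JOIN ChadwickIDs AS c ON c.key_fangraphs = fg.playerid
        JOIN People AS p ON p.bbrefID = c.key_bbref
        WHERE fg.yearID > {int(cutoff_year)}
        """
        label = "(view, Lahman + FanGraphs (2022+))"
    con.execute("DROP VIEW IF EXISTS BattingStats")
    con.execute(f"CREATE VIEW BattingStats AS {sql}")
    return label
```

In an in-memory database that holds only the Lahman table, the function falls through to the Lahman-only branch, which is the behavior the test databases rely on.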

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove mcptools MCP server entry: mcp_server() with no tools arg
  errors immediately; session_tools=TRUE would expose arbitrary R code
  execution via list/select_r_sessions -- not acceptable
- Fix shell injection in DuckDB server command: replace sh -c with
  Python subprocess so LAHMANS_DBDIR is passed as a list arg, never
  interpolated into a shell string
- Remove .copilot/mcp-config.json from .gitignore so the config is
  version-controlled and reviewed like other code
- Add .github/CODEOWNERS: require owner review on copilot-instructions.md
  and mcp-config.json to protect against prompt injection via PRs
- Update copilot-instructions: fix test count (71 blocks / 202 assertions),
  add MCP server docs, add Security section explaining attack surface and
  CODEOWNERS rationale
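
To illustrate the shell-injection fix in the second bullet, here is a minimal sketch of passing a directory value as a list argument to `subprocess.run`. The child command is a stand-in that only echoes its argument back; the real DuckDB server command is not shown in this PR, and `run_with_dbdir` is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_dbdir(db_dir):
    """Launch a child process with db_dir as a single argv element.

    No shell is involved, so metacharacters in db_dir are never parsed.
    The child here is a placeholder that prints the argument it received.
    """
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, db_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


if __name__ == "__main__":
    # With sh -c and string interpolation, "; echo INJECTED" would run as
    # a second command. As a list argument it stays literal text.
    print(run_with_dbdir("/tmp/db; echo INJECTED"))
```

The value would typically come from the `LAHMANS_DBDIR` environment variable; reading it with `os.environ` and passing it through unchanged keeps it out of any shell string.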

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add vignettes/franchise-efficiency.qmd: two-stage playoff efficiency
  analysis narrative (Stage 1: getting to October; Stage 2: going deep)
- DESCRIPTION: add quarto, ggrepel, scales, knitr to Suggests
- DESCRIPTION: add VignetteBuilder: quarto

Vignette covers: payroll paradox, dead-money burden, homegrown pipelines,
postseason WAR retention proxy, and 5-dimension franchise scorecard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ordering

- Add WAR background section with interpretation table
- Add mission statement (5 management dimensions explained)
- Add data coverage limitations note (postseason capped at 2021)
- Fix short_names to use franchID keys (NYY/NYM/LAD not NYA/NYN/LAN)
- Remove manual HIGHLIGHTS filter -- label all franchises in Stage 1 chart
- Fix scorecard: add OVERALL column with thick border, sort by overall mean
- Add callout box explaining Red Sox paradox (95th pct achievement, mid-table overall)
- Update regression takeaway: R² < 0.001, slope slightly negative but not significant

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Use // (integer division) instead of / for year extraction from
  YYYYMMDD integer dates; DuckDB's / on BIGINT returns DOUBLE
- Replace year-matched Teams join with ROW_NUMBER() latest-year lookup:
  Teams only covers through 2021, causing 0-row results for 2022+
- Remove p_gdp reference from PitchingPost query (column absent in
  Retrosheet pitching.csv); replaced with 0::BIGINT
- Restructure to single CSV read per table (all stat columns in src CTE)
- Remove unused src_alias argument from round_cte_sql() helper
- Verified: 1706 BattingPost + 793 PitchingPost + 44 SeriesPost rows
  loaded for 2022-2025; WS outcomes match known results (HOU/TEX/LAN)
- All 251 tests pass
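
The first two fixes in this list can be sketched as follows, using Python and sqlite3 as a stand-in for DuckDB (window functions need SQLite 3.25 or newer). The `Teams` columns shown and the helper names are assumptions for illustration only.

```python
import sqlite3


def year_from_yyyymmdd(date_int):
    """Extract the year from a YYYYMMDD integer with integer division.

    True division would give a fractional result such as 2022.1028.
    """
    return date_int // 10000


# Pick each team's most recent Teams row instead of matching on year,
# so games after the last Teams season still resolve to a franchise.
LATEST_TEAM_SQL = """
SELECT teamID, franchID
FROM (
    SELECT teamID, franchID,
           ROW_NUMBER() OVER (
               PARTITION BY teamID ORDER BY yearID DESC
           ) AS rn
    FROM Teams
) AS ranked
WHERE rn = 1
"""


def latest_franchise(con):
    """Return {teamID: franchID} taken from each team's latest season."""
    return dict(con.execute(LATEST_TEAM_SQL).fetchall())


def build_demo_db():
    """Create a tiny in-memory Teams table that stops at 2021."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE Teams (teamID TEXT, yearID INTEGER, franchID TEXT);
        INSERT INTO Teams VALUES ('LAN', 2020, 'LAD');
        INSERT INTO Teams VALUES ('LAN', 2021, 'LAD');
        INSERT INTO Teams VALUES ('HOU', 2021, 'HOU');
        """
    )
    return con
```

A year-matched join on a 2022 game date would find no `Teams` row here, while the latest-year lookup still maps `LAN` to `LAD`.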

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- setup_baseball_db(): add load_retrosheet and retrosheet_zip params so
  load_retrosheet_post() runs automatically during DB build
- playoff_efficiency: extend SeriesPost queries to 1995-2025 (excl. 2020);
  team_rs_war / rs_pa / rs_ip extended to 2025 via FanGraphs + SalariesAll;
  ws_wins and all Stage 1 queries use fran_lookup (eliminates 6 repeated CTEs);
  total_ach_by_fran computed once and reused; n_playoffs collision fixed in syn merge
- vignettes/franchise-efficiency.qmd: update data coverage section and captions
  to reflect Retrosheet 2022-2025 postseason and WAR coverage notes
- tests/testthat/test-loaders.R: add 4 tests for load_retrosheet_post()
  (missing tables error, bad zip path error, skip when already loaded,
  integration test appending rows to BattingPost/PitchingPost/SeriesPost)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Data from Retrosheet is used via load_retrosheet_post(). Retrosheet terms
require a prominent notice in any work that includes their data. The roxygen
docs already carry the notice; this file makes it visible at the package level.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add Retrosheet attribution to README.md attribution table and DESCRIPTION
- DRY: replace 3 raw as.data.table(dbGetQuery()) calls with db_query()
- Add zip to Suggests (used in test-loaders.R)
- Add PDF to .Rbuildignore to fix top-level file NOTE
- Update renv.lock: magrittr 2.0.5, rlang 1.2.0, cli 3.6.6, zip 2.3.3, quarto 1.5.1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
luceydav merged commit d249ea6 into main on Apr 9, 2026
1 check failed