The Lahman CSV files are frozen at 2021. BattingStats now conditionally UNIONs FangraphsBattingWAR rows for yearID > 2021 when the FangraphsBattingWAR, ChadwickIDs, and People tables are all present in the database.

Join path: FangraphsBattingWAR.playerid -> ChadwickIDs.key_fangraphs -> People.bbrefID -> People.playerID.

Falls back gracefully to Lahman-only when the FG tables are absent (e.g., in :memory: test DBs). Emits the message '(view, Lahman + FanGraphs (2022+))' on create to confirm the extension is active.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
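The conditional UNION and fallback described above can be sketched roughly as follows. This is an illustration only: it uses Python's standard-library sqlite3 as a stand-in for the package's R/DuckDB code, and the `Season` column, the `ChadwickIDs.key_bbref` column used to reach `People.bbrefID`, the `source` label, and the helper names are all assumptions, not taken from the commit.

```python
import sqlite3

# Tables that must all exist before FanGraphs rows are added (from the commit message).
REQUIRED_FG_TABLES = {"FangraphsBattingWAR", "ChadwickIDs", "People"}


def existing_tables(con):
    """Return the set of table names present in the connected database."""
    rows = con.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    return {name for (name,) in rows}


def batting_stats_sql(tables):
    """Build the view body: Lahman only, or Lahman plus FanGraphs rows after 2021."""
    lahman = "SELECT playerID, yearID, 'lahman' AS source FROM Batting"
    if not REQUIRED_FG_TABLES <= tables:
        return lahman  # fallback path, e.g. a bare in-memory test database
    # Assumed columns: fg.Season (season year) and c.key_bbref (Baseball-Reference id).
    fangraphs = (
        "SELECT p.playerID, fg.Season AS yearID, 'fangraphs' AS source "
        "FROM FangraphsBattingWAR fg "
        "JOIN ChadwickIDs c ON c.key_fangraphs = fg.playerid "
        "JOIN People p ON p.bbrefID = c.key_bbref "
        "WHERE fg.Season > 2021"
    )
    # UNION ALL is used here for simplicity; the commit only says "UNION".
    return lahman + " UNION ALL " + fangraphs


def create_view(con):
    """Create BattingStats and report whether the FanGraphs extension is active."""
    sql = batting_stats_sql(existing_tables(con))
    con.execute("CREATE VIEW BattingStats AS " + sql)
    return "UNION ALL" in sql
```

Checking for the tables once, at view-creation time, keeps the fallback decision out of every query that reads the view.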
- Remove mcptools MCP server entry: mcp_server() with no tools arg errors immediately; session_tools=TRUE would expose arbitrary R code execution via list/select_r_sessions -- not acceptable
- Fix shell injection in DuckDB server command: replace sh -c with Python subprocess so LAHMANS_DBDIR is passed as a list arg, never interpolated into a shell string
- Remove .copilot/mcp-config.json from .gitignore so the config is version-controlled and reviewed like other code
- Add .github/CODEOWNERS: require owner review on copilot-instructions.md and mcp-config.json to protect against prompt injection via PRs
- Update copilot-instructions: fix test count (71 blocks / 202 assertions), add MCP server docs, add Security section explaining attack surface and CODEOWNERS rationale

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
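A minimal sketch of why passing the directory as a list argument closes the injection hole. The child command here is a hypothetical stand-in (a Python one-liner that echoes its argument), not the actual DuckDB server command, and `build_argv` and `run_with_dir` are illustrative names.

```python
import subprocess
import sys


def build_argv(db_dir):
    """Build an argv list; db_dir is one element and is never parsed by a shell."""
    # Hypothetical child process: echoes the single argument it received.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", db_dir]


def run_with_dir(db_dir):
    """Run the child without a shell and return what it saw as its argument."""
    result = subprocess.run(
        build_argv(db_dir),
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits non-zero
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # With sh -c and string interpolation, the text after ';' would run as a command.
    # Here it arrives at the child as plain data.
    print(run_with_dir("/tmp/db; echo INJECTED"))
```

Because no shell ever parses the value, metacharacters such as `;`, `$( )`, and backticks in LAHMANS_DBDIR have no special meaning.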
- Add vignettes/franchise-efficiency.qmd: two-stage playoff efficiency analysis narrative (Stage 1: getting to October; Stage 2: going deep)
- DESCRIPTION: add quarto, ggrepel, scales, knitr to Suggests
- DESCRIPTION: add VignetteBuilder: quarto

Vignette covers: payroll paradox, dead-money burden, homegrown pipelines, postseason WAR retention proxy, and 5-dimension franchise scorecard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ordering

- Add WAR background section with interpretation table
- Add mission statement (5 management dimensions explained)
- Add data coverage limitations note (postseason capped at 2021)
- Fix short_names to use franchID keys (NYY/NYM/LAD not NYA/NYN/LAN)
- Remove manual HIGHLIGHTS filter -- label all franchises in Stage 1 chart
- Fix scorecard: add OVERALL column with thick border, sort by overall mean
- Add callout box explaining Red Sox paradox (95th pct achievement, mid-table overall)
- Update regression takeaway: R² < 0.001, slope slightly negative but not significant

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Use // (integer division) instead of / for year extraction from YYYYMMDD integer dates; DuckDB's / on BIGINT returns DOUBLE
- Replace year-matched Teams join with ROW_NUMBER() latest-year lookup: Teams only covers through 2021, causing 0-row results for 2022+
- Remove p_gdp reference from PitchingPost query (column absent in Retrosheet pitching.csv); replaced with 0::BIGINT
- Restructure to single CSV read per table (all stat columns in src CTE)
- Remove unused src_alias argument from round_cte_sql() helper
- Verified: 1706 BattingPost + 793 PitchingPost + 44 SeriesPost rows loaded for 2022-2025; WS outcomes match known results (HOU/TEX/LAN)
- All 251 tests pass

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
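The first two fixes above can be illustrated with a small sketch. It uses Python and the standard-library sqlite3 module rather than DuckDB (it assumes a bundled SQLite new enough for window functions, 3.25 or later), and the helper names are illustrative. Python's `/` and `//` happen to mirror the DOUBLE versus integer behaviour the commit describes for DuckDB.

```python
import sqlite3


def year_from_yyyymmdd(date_int):
    """Extract the year from an integer date such as 20221105."""
    # Floor division keeps the result an integer; plain / would give 2022.1105.
    return date_int // 10000


def latest_franchise_by_team(con):
    """Map each teamID to the franchID from its most recent Teams row.

    Picking the latest row per team, instead of joining on the game's year,
    still resolves teams for seasons that Teams does not cover.
    """
    rows = con.execute(
        """
        SELECT teamID, franchID
        FROM (
            SELECT teamID, franchID,
                   ROW_NUMBER() OVER (PARTITION BY teamID ORDER BY yearID DESC) AS rn
            FROM Teams
        )
        WHERE rn = 1
        """
    ).fetchall()
    return dict(rows)
```

A year-matched join on `Teams.yearID = 2022` would find nothing when Teams stops at 2021; the `rn = 1` lookup returns one row per team regardless of the game year.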
- setup_baseball_db(): add load_retrosheet and retrosheet_zip params so load_retrosheet_post() runs automatically during DB build
- playoff_efficiency: extend SeriesPost queries to 1995-2025 (excl. 2020); team_rs_war / rs_pa / rs_ip extended to 2025 via FanGraphs + SalariesAll; ws_wins and all Stage 1 queries use fran_lookup (eliminates 6 repeated CTEs); total_ach_by_fran computed once and reused; n_playoffs collision fixed in syn merge
- vignettes/franchise-efficiency.qmd: update data coverage section and captions to reflect Retrosheet 2022-2025 postseason and WAR coverage notes
- tests/testthat/test-loaders.R: add 4 tests for load_retrosheet_post() (missing tables error, bad zip path error, skip when already loaded, integration test appending rows to BattingPost/PitchingPost/SeriesPost)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Data from Retrosheet is used via load_retrosheet_post(). Retrosheet terms require a prominent notice in any work that includes their data. The roxygen docs already carry the notice; this file makes it visible at the package level.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add Retrosheet attribution to README.md attribution table and DESCRIPTION
- DRY: replace 3 raw as.data.table(dbGetQuery()) calls with db_query()
- Add zip to Suggests (used in test-loaders.R)
- Add PDF to .Rbuildignore to fix top-level file NOTE
- Update renv.lock: magrittr 2.0.5, rlang 1.2.0, cli 3.6.6, zip 2.3.3, quarto 1.5.1

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Attribution fixes (Retrosheet), DRY cleanup (db_query), R CMD check fixes (zip Suggests, .Rbuildignore PDF), renv updated. 261 PASS | 0 errors | 0 warnings.