v2.4.5 release: dev → master (bundles v2.4.2–v2.4.5, live on CRAN 2026-05-30)#142

Merged
xuyiqing merged 56 commits into master from dev
May 30, 2026
Conversation

@xuyiqing

Owner

v2.4.5 release — dev → master

Periodic dev → master gate for the v2.4.5 CRAN cut, live on CRAN since 2026-05-30. master was last at v2.4.1 (#132); this brings it up to the released version, bundling v2.4.2 → v2.4.5. This is the exact tree CRAN accepted as 2.4.5.

Versions included

v2.4.5 — group.fe for sub-group treatment

  • New group.fe argument on fect(): absorb additive fixed effects at a coarsening of the unit identifier (e.g. state FE on county-level rows when treatment varies at the state level). Closes #139 ("treatment variation and fixed effects at higher-than-unit level"). Clustered SE defaults to group.fe[1]; override with cl = "<column>".
  • Fix method = "cfe" with force = "time" / "unit" (previously errored with "Index out of bounds"). Output under force = "two-way" is unchanged (byte-equivalent).
  • Remove the unused sfe argument and orphan R/polynomial.R.

v2.4.4 — $sample slot

  • fect() now returns $sample, a logical matrix (same dims as $Y.dat) marking cells used in any part of estimation (main fit, placebo / carryover / balance). Compatible with panelView::panelview(sample = ...).

v2.4.3 — parallel-bootstrap fix

  • Fix future.globals.maxSize overrun: the quiet_nonpara wrapper no longer captures fect_boot()'s full frame.

v2.4.2 — ci.method API + estimand inference infrastructure

  • ci.method = c("normal", "basic") on fect(); legacy quantile.CI soft-deprecated (mapped with a one-time warning).
  • estimand() gains test = c("none", "placebo", "carryover") (closes #131, "Pre-treatment estimates for APTT estimand"), para.error for parametric vartype, and BCa / normal CI methods with per-type defaults.
  • Tighter EM defaults (tol 1e-3 → 1e-5, max.iteration 1000 → 5000) with a non-convergence warning(); pass the old values explicitly to reproduce pre-2.4.2 output.

Release hygiene

  • CRAN-prep: minimal tests/testthat/test-cran.R (CRAN test time 132s → ~2s; full suite still runs locally / CI); dropped legacy pkgdown tooling and the pre-rendered NEWS.html (all already .Rbuildignore'd — zero tarball effect).
  • User-manual version references synced to 2.4.5: install table (CRAN / master / dev all at 2.4.5, 2026-05-30), changelog "(2026-05-30) CRAN release." tag, citation + BibTeX note.

Scope

56 commits, 101 files, +11,783 / −5,059. R CMD check --as-cran clean (1 routine authorship note).

🤖 Generated with Claude Code

xuyiqing and others added 30 commits April 30, 2026 22:26
…er_fe_mc

Adds optional `Rcpp::Nullable<Rcpp::NumericMatrix> fit_init = R_NilValue`
parameter at the end of the IFE/MC C++ entry points and the inner
EM workhorse functions. When non-null and shape matches Y, the EM
loop seeds `fit` from it instead of the default `fit = Y0`
(cold-start). NULL (default) preserves pre-2.4.2 behavior exactly.

Both `fit = Y0` initialization sites in fe_ad_inter_iter and
fe_ad_inter_covar_iter are routed through the warm_init dispatch
(initial init AND post-burnin restart under use_weight=1).

Sites touched:
- src/ife_sub.cpp: fe_ad_inter_iter, fe_ad_inter_covar_iter
- src/ife.cpp: inter_fe_ub (forwards to both inner functions)
- src/mc.cpp: inter_fe_mc (forwards to fe_ad_inter_iter via mc=1)
- src/fect.h: forward declarations updated

CFE warm-start (cfe_iter / complex_fe_ub) deferred to v2.5.0 ---
the additional gamma/kappa FE structures complicate the conditional
init logic and the marginal coverage gain is small.

Statistical justification (Yiqing 2026-05-01): the prediction surface
Y_hat = F * Lambda^T is the unique global minimizer (Eckart-Young
for unconstrained low-rank; convexity for nuclear-norm MC; convexity
of the entropy-regularized projection for simplex bound).
Factorization is non-unique by rotation, but Y_hat is uniquely
identified --- so warm-starting won't bias the bootstrap distribution
of any estimand. Full reasoning: statsclaw-workspace/fect/ref/
warm-start-audit-2026-05-01.md.
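
The rotation argument above can be checked numerically. The following is a minimal base-R sketch, not code from the package: `F_mat`, `Lambda`, and `R` are made-up illustrative matrices. It only demonstrates that rotating the factors and loadings by the same orthogonal matrix leaves the prediction surface unchanged.

```r
# Illustrative only: F_mat, Lambda, and R are made-up matrices, not fect objects.
set.seed(1)
TT <- 6; N <- 4; r <- 2
F_mat  <- matrix(rnorm(TT * r), TT, r)   # TT x r factors
Lambda <- matrix(rnorm(N * r),  N,  r)   # N x r loadings

# Any orthogonal r x r matrix R yields an equivalent factorization.
R <- qr.Q(qr(matrix(rnorm(r * r), r, r)))

Y_hat     <- F_mat %*% t(Lambda)
Y_hat_rot <- (F_mat %*% R) %*% t(Lambda %*% R)

# (F R)(Lambda R)^T = F R R^T Lambda^T = F Lambda^T, so this is zero
# up to floating-point error.
max_diff <- max(abs(Y_hat - Y_hat_rot))
```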

This commit is the engine-side plumbing only. R wrappers and boot.R
activation come in subsequent commits. Existing test suite (estimand
+ book-claims subset, 196 expectations) passes unchanged --- by
construction since fit_init = NULL throughout.

Refs: v2.4.2 release plan; statsclaw-workspace/fect/runs/
2026-05-01-v242-inference-infra.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `fit.init = NULL` parameter to fect_fe() and fect_mc() R
wrappers; threads it through to the inter_fe_ub / inter_fe_mc C++
calls as the `fit_init` argument. NULL preserves pre-2.4.2 cold-start
behavior; bootstrap callers will pass the main-fit prediction surface
(boot.R activation in next commit).

Note: the r.cv = 0 sub-fit inside fect_fe (no factors, no EM)
explicitly passes NULL even when fit.init is provided --- the
warm-start matrix is shaped for a factor-model EM, not a no-factor
additive-FE solver.

Scope:
- IFE: fect_fe (R/fe.R)
- MC:  fect_mc (R/mc.R)

Deferred (separate commit or later release):
- GSC path via fect_nevertreated (.estimate_co helper); needs
  centering-aware fit.init handling
- CFE path via fect_cfe (deferred to v2.5.0 with cfe_iter)

Tests: existing suite passes (fit.init = NULL throughout, behavior
unchanged).

Refs: v2.4.2 release plan; runs/2026-05-01-v242-inference-infra.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous message "F-test Failed. The estimated covariance matrix
is singular." conflated two distinct inferential statements:
(a) the test failed to reject the null, vs
(b) the test could not be computed (singular covariance).

Standard hypothesis-test usage reserves "Failed to reject" for case
(a). Case (b) is a numerical-failure message and should be worded
accordingly. Two occurrences in R/diagtest.R updated.

Refs: Yiqing 2026-05-01 hypothesis-test-language feedback;
test-language memory feedback_test_language_failed.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…1000

ci.method enum: c("basic", "percentile") -> c("basic", "percentile", "bc", "normal").

New methods:
- "bc": bias-corrected percentile (Efron 1987, without the acceleration
  term). z0 = Phi^-1(P(boot < est)); cutoffs shift by 2*z0 to compensate
  for bootstrap-median bias relative to the point estimate. Free
  (just one Phi^-1 call); uniform across all vartype values.
- "normal": Wald (theta +- z * SE). What fit$est.att already uses
  internally; textbook for jackknife.
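
For readers who want the two formulas spelled out, here is a minimal base-R sketch. It is not the package's `.compute_ci()` helper: `ci_sketch` is a hypothetical name, and the code assumes `boot` is a plain numeric vector of replicates for a scalar estimate.

```r
# Sketch of the "bc" and "normal" formulas; NOT fect's .compute_ci().
# ci_sketch is a hypothetical helper name used for illustration.
ci_sketch <- function(est, boot, method = c("bc", "normal", "percentile"),
                      alpha = 0.05) {
  method <- match.arg(method)
  probs <- c(alpha / 2, 1 - alpha / 2)
  if (method == "normal") {
    # Wald interval: theta +- z * SE, with SE taken from the replicates.
    return(est + qnorm(probs) * sd(boot))
  }
  if (method == "bc") {
    # z0 measures how far the point estimate sits from the bootstrap median;
    # the percentile cutoffs are shifted by 2 * z0 on the normal scale.
    z0 <- qnorm(mean(boot < est))
    probs <- pnorm(2 * z0 + qnorm(probs))
  }
  unname(quantile(boot, probs))
}

set.seed(42)
boot <- rnorm(1000)
ci_bc     <- ci_sketch(0.5, boot, "bc")
ci_normal <- ci_sketch(0.5, boot, "normal")
# Point estimate far outside the bootstrap distribution: z0 = Inf, so both
# cutoffs collapse onto max(boot) and the interval has zero width.
ci_degenerate <- ci_sketch(100, boot, "bc")
```

The last call reproduces the degenerate case (ci.lo == ci.hi) that arises when every replicate lies on one side of the point estimate.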

Per-type ci.method defaults via NULL trigger (ci.method = NULL):
- att      -> "normal" (matches what fit$est.att actually contains;
              corrects v2.4.1's mislabeled "basic" passthrough)
- att.cumu -> "percentile" (matches what att.cumu() does internally)
- aptt     -> "bc" (ratio estimator; basic CI flips off the point
              under bootstrap skew)
- log.att  -> "bc" (log estimator; same skew rationale)

Existing v2.4.1 explicit ci.method values still work; the new default
only fires when ci.method is omitted.

Implementation: new internal helper .compute_ci() centralizes all four
methods. Refactored call sites: .estimand_att_overall,
.compute_aptt_event_time, .compute_log_att_event_time. att fast path
updated: triggers on ci.method == "normal" instead of "basic" so
default att queries hit the byte-equality passthrough from fit$est.att.

nboots default 200 -> 1000 in R/default.R (all three signatures
fect / fect.formula / fect.default), plus the misspecified-nboots
error message updated. Existing scripts that pass nboots = ...
explicitly are unaffected.

Note: bc CI may degenerate (ci.lo == ci.hi) when the point estimate
is far outside the bootstrap distribution --- this is the cell-drop
pathology in log.att / aptt; addressed by the hard-error in the
next commit.

Refs: statsclaw-workspace/fect/ref/v242-vartype-cimethod-design.md;
runs/2026-05-01-v242-inference-infra.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t / aptt

When a cell used in the point estimate has Y0_b <= 0 in a non-trivial
fraction of bootstrap replicates, log(Y0_b) is undefined and
colMeans(..., na.rm = TRUE) silently averages over fewer cells in
that replicate. The point estimate uses N cells; bootstrap averages
may use N, N-1, ..., 0 cells. This breaks the basic bootstrap
principle that the resampled estimator should compute over the same
data structure as the point estimator, contaminating the bootstrap
distribution and yielding meaningless inference.

v2.4.2 detects this and hard-errors with actionable guidance:
  1. Filter out unstable cells via `cells = ~ Y0_hat > <threshold>`
  2. Transform the outcome before fect: log(Y + c)
  3. Use a different estimand (att doesn't have this pathology)

Two detection paths:

- log.att (.compute_log_att_event_time): trigger when the WORST cell
  has Y0_b <= 0 in > 5% of replicates. Sub-threshold dropping is
  tolerated as benign (small noise at the bootstrap-distribution scale).

- aptt (.compute_aptt_event_time): trigger when E(Y0_b) ~ 0 (within
  1e-10) in any replicate, since the APTT denominator blows up there.

The 5% threshold is a heuristic; can be made tunable via an argument
in a follow-up release if user feedback warrants.
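
The log.att detection rule can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the code in `.compute_log_att_event_time`: `check_cell_drop` is a hypothetical helper, `Y0_boot` is assumed to be a cells x replicates matrix, and the error text is invented for the example.

```r
# Hypothetical helper; Y0_boot is assumed to be a cells x replicates matrix of
# bootstrap counterfactuals for the cells that enter the point estimate.
check_cell_drop <- function(Y0_boot, threshold = 0.05) {
  # Share of replicates in which each cell would be dropped by log().
  drop_rate <- rowMeans(Y0_boot <= 0)
  worst <- max(drop_rate)
  if (worst > threshold) {
    stop(sprintf(
      "cell-drop check: worst cell has Y0_b <= 0 in %.1f%% of replicates",
      100 * worst
    ))
  }
  invisible(drop_rate)
}

ok <- matrix(1, nrow = 3, ncol = 100)
ok[1, 1:4] <- -1                 # 4% of replicates: tolerated as benign
rates <- check_cell_drop(ok)

bad <- ok
bad[2, 1:6] <- -1                # 6% of replicates: triggers the error
result <- try(check_cell_drop(bad), silent = TRUE)
```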

Also reverts an inadvertent suppressWarnings around the inner
log(Y0_b_ok) call --- with the hard-error gate, suppressWarnings is
no longer needed (any surviving log call has positive args).

Refs: statsclaw-workspace/fect/ref/v242-vartype-cimethod-design.md
hard-error section; runs/2026-05-01-v242-inference-infra.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New file `tests/testthat/test-estimand-ci-methods.R` (10 tests, 36
expectations) covers the v2.4.2 ci.method extensions:
- CI.1: enum accepts basic / percentile / bc / normal
- CI.2: NULL trigger gives per-type defaults (att=normal etc)
- CI.3: normal CI is symmetric around the point estimate
- CI.4: bc collapses to percentile when bootstrap median = point
- CI.5: bc shifts cutoffs when bootstrap is biased relative to point
- CI.6: vartype column reports fit-time vartype regardless of ci.method
- CI.7: log.att on simdata triggers cell-drop hard-error
- CI.7b: log.att works on a positive-Y panel
- CI.8: fect() / fect.formula / fect.default nboots default = 1000

Updated existing fixtures:

- `tests/testthat/test-estimand-log-att.R`:
  - .fit_positive_Y(): bumped intercept 0.5 -> 2.0 and reduced
    noise sd 0.2 -> 0.1 to keep Y comfortably positive (so the
    new cell-drop hard-error doesn't fire on benign bootstrap noise).
  - LA.4 now expects the hard-error instead of a warning.

- `tests/testthat/test-estimand-parametric.R`:
  - log.att-on-parametric test now expects the hard-error
    (sim_linear has many negative Y cells; v2.4.1's silent warning
    was producing meaningless inference).

All 6 estimand test files pass: 33 tests, 97 expectations, 0 failures.

Refs: statsclaw-workspace/fect/runs/2026-05-01-v242-inference-infra.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…workers

Per Yiqing 2026-05-01: parametric bootstrap (and any parallel path
that uses mvtnorm / future / etc.) emitted N warnings ("package
'mvtnorm' was built under R version X.Y.Z") --- one per worker on
first package use --- because workers loaded packages in user-visible
contexts.

Two-pronged fix:

1. `.fect_make_future_cluster()` now runs `clusterEvalQ` at cluster
   creation to pre-load mvtnorm / future / future.apply / doParallel /
   foreach via `requireNamespace(quietly = TRUE)` inside
   suppressPackageStartupMessages + suppressWarnings. This loads
   packages SILENTLY at startup so subsequent worker code doesn't
   re-trigger the version-mismatch warning.

2. New helper `.fect_with_quiet_pkg_warnings(expr)` wraps an
   expression with a `withCallingHandlers` that targets ONLY the
   "was built under R version" warning class (other warnings pass
   through normally). Used at the boot.R parametric path's
   `future_lapply` call as belt-and-suspenders.
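
A targeted handler of this kind might look like the sketch below. It is not the package's `.fect_with_quiet_pkg_warnings()`: the function name is hypothetical, and matching on the message text is an assumption about how the warning is identified.

```r
# Hypothetical sketch: muffle only the "was built under R version" warning.
# Matching on the message text is an assumption made for this example.
with_quiet_pkg_warnings <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("was built under R version", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
      # Any other warning falls through to outer handlers unchanged.
    }
  )
}

# Usage: collect whatever warnings survive the wrapper.
seen <- character(0)
value <- withCallingHandlers(
  with_quiet_pkg_warnings({
    warning("package 'mvtnorm' was built under R version 4.4.1")
    warning("something else went wrong")
    42
  }),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

Only the second warning reaches the outer handler; the expression still evaluates to its final value because calling handlers resume execution after a warning is muffled.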

Refactored fect() and did_wrapper() to route through
`.fect_make_future_cluster()` instead of `future::multisession`
directly, so all parallel paths benefit from the silent pre-load.

CV (cv.R), nevertreated (fect_nevertreated.R), and the bootstrap
foreach loop in boot.R already use the helper, so they automatically
benefit. fittest.R + permutation.R use `parallel::makeCluster()`
directly --- can route through the helper in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The book-claims test asserted fect()'s nboots default is 200, which
was true through v2.4.1. v2.4.2 raises the default to 1000 (improves
percentile/bc tail estimates). Update the assertion to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…TURE

- DESCRIPTION: Version 2.4.1 -> 2.4.2; Date 2026-04-29 -> 2026-05-01
- NEWS.md: full v2.4.2 entry covering ci.method extensions
  (bc + normal + per-type defaults), cell-drop hard-error in
  log.att / aptt, nboots default raised 200 -> 1000, internal
  warm-start infrastructure (not user-visible; activation deferred
  to v2.5.0 with empirical justification), diagtest message reword,
  parallel-worker package warning suppression
- vignettes/bb-updates.Rmd: same changelog entry, condensed for
  the user-facing book
- man/estimand.Rd: ci.method default updated to NULL, with an
  explanation of the per-type defaults
- R/po-estimands.R: roxygen comment for ci.method updated to match
- ARCHITECTURE.md: version 2.4.1 -> 2.4.2; po-estimands.R row
  reflects ci.method extensions and hard-error addition; estimand()
  function-table row updated. Manual patches noted; full scriber
  regen deferred until next material module change.

Validation gates run on this branch:
- Full testthat suite: 1190 expectations, 0 failures, 0 errors
- estimand subset: 6 files, 33 tests, 97 expectations passing
- ci-methods test file (new): 10 tests, 36 expectations passing
- Coverage simulation (200 sims, all 4 ci.methods on att/overall):
  basic 92.5%, bc 91.0%, normal 91.5%, percentile 91.0%
  (nominal 95%; all within MC noise band; details in
  statsclaw-workspace/fect/ref/v242-coverage-study/findings.md)
- Quarto book: rendered cleanly
- R CMD check: blocked by a pre-existing macOS cp/xattr issue
  unrelated to v2.4.2 changes; CI on Ubuntu will verify on PR

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.4.2: inference infrastructure (ci.methods + cell-drop hard-error + warm-start C++ infra)
…positive

Vignette 03-estimands.Rmd's setup-estimand chunk uses
  df$Y <- exp(0.5 + 0.05*time + 0.3*D + rnorm(nrow(df), 0, 0.2))

which can dip near zero in some cells. With v2.4.2's new hard-error
on bootstrap cell-drop pathology in log.att / aptt, the est-log-att
chunk can fail to render when bootstrap Y0_b crosses zero in > 5%
of replicates for some cell --- which depends on RNG state and is
non-deterministic across renders.

Bump intercept to 2.0 and reduce noise sd to 0.1, so Y is comfortably
positive (~e^2 ~ 7.4 minimum, far from zero). The hard-error never
fires on this DGP.

Also updated the prose in the Log-scale ATT section to describe the
new hard-error behavior (was: "A single warning per call reports how
many cells were excluded"; now: explains the v2.4.2 hard-error and
the actionable options).

Local Quarto book render: 14 HTML / 176 SVG / 0 errors (matches the
v2.4.1 baseline counts). testthat suite passes per the latest gate
run on dev HEAD.

This is a robustness fix --- the v2.4.2 vignette on dev happens to
render cleanly because of lucky RNG state, but is fragile to seed
changes. This branch makes it deterministically robust.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `test = c("none", "placebo", "carryover")` argument to estimand().
Closes issue #131 (ajunquera, "Pre-treatment estimates for APTT
estimand").

When `test = "placebo"`:
- Aggregates the requested estimand (att / aptt / log.att) over
  pre-treatment cells in `fit$placebo.period` (the cells masked-and-
  imputed during the placebo fit)
- Requires `placeboTest = TRUE` at fit time; otherwise hard-errors
  with actionable refit guidance
- Auto-sets `direction = "on"`

When `test = "carryover"`:
- Aggregates over early post-reversal cells in `fit$carryover.period`
- Requires `carryoverTest = TRUE` + `hasRevs == 1`; otherwise
  hard-errors
- Auto-sets `direction = "off"`

`type = "att.cumu"` is rejected with clear error when `test != "none"`
since cumulative semantics are defined relative to treatment onset.

Implementation:

- New `.test_cell_mask(fit, test, direction)` internal helper builds
  the cell-level base mask + returns Tev for downstream aggregation
- `.estimand_att`, `.estimand_aptt`, `.estimand_log_att`,
  `.estimand_att_overall` accept `test` argument and forward to
  `.test_cell_mask`
- New `.estimand_att_event_time` slow path for att with test != "none"
  (the fast path requires test == "none" since fit$est.att passthrough
  is byte-equality only for the standard surface)
- Per-event-time att with `test = "placebo"` is byte-identical to
  `fit$est.att` rows at placebo event times (asserted by test PC.6)

Tests: 10 tests, 19 expectations in
`tests/testthat/test-estimand-placebo-carryover.R`. All pass on the
positive-Y synthetic panel and on simdata. Includes:

- PC.1: argument validation
- PC.2/PC.3: standard fit + test=placebo/carryover hard-errors
- PC.4: placebo aptt returns rows in fit$placebo.period range
- PC.5: placebo att works; placebo log.att hard-errors on simdata
  (cell-drop pathology from v2.4.2 fires; expected behavior)
- PC.6: att, test=placebo byte-equality with fit$est.att rows
- PC.7: att.cumu + test=placebo rejected
- PC.8: carryover aptt on reversal panel
- PC.9: test=placebo silently uses direction='on'
- PC.10: vartype column reports method actually used at fit time

Refs: parked branch feat/v242-estimand-placebo-carryover (db11630)
documented the original design; this commit rebuilds the placebo /
carryover surface cleanly on top of v2.4.2's ci.method changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…v2.4.2)

Closes the v2.4.2 inference work with three additions and a roll-up
of the changelog:

* ci.method = "bca": full Efron 1987 (z0 + acceleration via cell-level
  jackknife). Wired into att / aptt / log.att dispatcher paths.
  Defaults updated: aptt -> "bca", log.att -> "bca" (was "bc"; bc
  collapsed at the boundary when point estimate fell outside the
  bootstrap support).
* vartype = "wild": unit-level Rademacher wild bootstrap (Liu 1988;
  Mammen 1993; CGM 2008). Keeps unit composition fixed; perturbs
  main-fit residuals on D=0/I=1 cells. Continuous outcomes only;
  fe / ife / mc / gsynth / cfe. Storage layout matches "bootstrap";
  estimand() consumes wild fits transparently.
* log.att hard-stop at point-estimate level: any treated cell with
  Y <= 0 or Y0_hat <= 0 now errors with actionable refit guidance
  (log undefined; silent drop would bias both point estimate and
  bootstrap distribution).

Changelog rolled up: collapse the in-progress 2.4.3 section into
2.4.2; drop the warm-start internal entry (deferred to v2.5.0; not
user-visible) and the diagtest wording bug-fix (rolled into
hard-error guidance section).

Vignettes:

* Ch2: new vartype-x-ci.method tables documenting computational cost
  in multiples of nboots refits and per-type defaults.
* Ch3: placebo + carryover demo sections (APTT and log-ATT under
  test = "placebo" / "carryover") with esplot calls.
* bb-updates: same roll-up as NEWS.md.

man/estimand.Rd: bca + wild documented.

Tests + Quarto render deferred to maintainer (per request).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e _common.R in ch3

Validation pass on `feat/v242-completion` after deferred test/render gates surfaced two issues:

1. Tests: 4 testthat assertions expected the bootstrap-level "log-ATT bootstrap is unreliable" message, but v2.4.2's NEW point-estimate-level hard-error ("log.att requires Y > 0 and Y0_hat > 0...") fires first on simdata (Y < 0 cells). The point-level error is the correct upstream catch; the bootstrap-level error remains in R/po-estimands.R for strictly-positive panels where Y0_b crosses zero only in some replicates.

   Updated expected message strings in CI.7, LA.4, parametric log.att, and PC.5.

2. Vignette: 03-estimands.Rmd was missing `source("_common.R")`, so quarto render loaded the system-installed CRAN fect (v2.4.1) instead of the dev source. The `test = "placebo"` argument (v2.4.2+) was rejected, breaking `plot-aptt-placebo` chunk and halting render.

   Added the standard `.common` chunk that other chapters use; dropped redundant `library(fect)` since `_common.R` calls `devtools::load_all("..")`.

Validation gates after fixes:

- `devtools::test()`: PASS 1209 / FAIL 0 (was 1205/4)
- `quarto render`: 14 HTML / 180 SVG / 0 fatal errors (rgl/X11 warnings non-functional)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ci.method (closes parametric × {basic/percentile/bc/bca} 0% coverage)

Pre-fix coverage on a clean DGP-A (additive TWFE, true ATT=3.0, N=40, T=20, T0=12, 100 reps):
  parametric × normal:     93%  (works — uses sd only)
  parametric × basic:       0%  (CI offset above truth)
  parametric × percentile:  0%  (CI matches H0 quantiles)
  parametric × bc:          0%  (boundary collapse, width ~2e-13)
  parametric × bca:         0%  (boundary collapse, width ~2e-13)

Root cause: R/boot.R:991-996 constructs Y.boot[,treated] = Y0_hat +
error.tr.boot — no treatment effect added, so eff.boot is centered at 0
(under H0). normal ci.method only uses sd(boot) so was unaffected; the
other 4 ci.methods interpret eff.boot as the sampling distribution of
θ̂ around itself, requiring centering at θ̂.

Fix (Option A — post-hoc location shift in R/po-estimands.R only):
  att_b_centered = att_b - mean(att_b) + estimate
applied at 4 sites (att event.time, att overall, aptt event.time, log.att
event.time) immediately before .compute_ci() when the fit's vartype is
parametric. The shift is variance-preserving: sd(shifted) == sd(original)
to machine precision, so normal CI is byte-stable. boot.R, plot.R, and
get.pvalue() (which legitimately needs H0 centering for the legacy
H0:ATT=0 hypothesis test) are untouched.

Justification: the parametric bootstrap simulates errors from a fitted
homoskedastic Gaussian AR-vcov model. Var(θ̂|H0) = Var(θ̂|H1) by
construction, so the H0-centered bootstrap distribution has the correct
dispersion for inference about θ̂; only the location is wrong, and the
shift restores it. Full rationale and downstream-consumer audit in
statsclaw-workspace/fect/spec.md and ref/v242-coverage-study/findings_v242_full.md.
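
The location shift is easy to verify on simulated replicates. The sketch below uses made-up numbers (a true estimate of 3.0 and Gaussian replicates centered at zero) rather than output from a fect fit; only the shift formula itself comes from the commit message.

```r
# Illustrative replicates: centered at 0 (H0), as the parametric bootstrap
# produces them. The values are simulated, not taken from a fect fit.
set.seed(7)
estimate <- 3.0
att_b <- rnorm(1000, mean = 0, sd = 0.4)

# Post-hoc location shift from the commit message.
att_b_centered <- att_b - mean(att_b) + estimate

# Percentile interval before the shift sits around 0 and misses the estimate;
# after the shift it brackets the estimate with the same width.
ci_before <- unname(quantile(att_b, c(0.025, 0.975)))
ci_after  <- unname(quantile(att_b_centered, c(0.025, 0.975)))

sd_unchanged <- isTRUE(all.equal(sd(att_b), sd(att_b_centered)))
```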

Post-fix coverage (NBOOTS = 200, 1000, 2000 — all consistent within MC noise):
  parametric × normal:      94%  (byte-stable with pre-fix)
  parametric × basic:       93-94%
  parametric × percentile:  91-94%
  parametric × bc:          91-94%  (boundary collapse eliminated)
  parametric × bca:         91-94%  (boundary collapse eliminated)

Tests: 98 new assertions in tests/testthat/test-estimand-parametric-cifix.R
covering value invariants (P-INV-1..5), edge cases (P-EDGE-2..4),
regression smoke (P-REG-1..1b), all-5-ci.methods on FE and gsynth fits,
and 100-rep coverage gates (P-COV-1..3, P-WIDTH-1). Full suite:
1307 PASS / 0 FAIL in CRAN mode; +98 PASS with NOT_CRAN=true.

statsclaw pipeline: planner → builder → tester → reviewer (PASS WITH NOTE).
Spec: statsclaw-workspace/fect/spec.md
Test spec: statsclaw-workspace/fect/test-spec.md
Sim spec: statsclaw-workspace/fect/sim-spec.md
Review: statsclaw-workspace/fect/runs/2026-05-01-parametric-fix-review.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…10→11

New chapter is a reference for the 4 vartype × 5 ci.method matrix:
  - vartype mechanics (case bootstrap / wild / jackknife / parametric)
  - ci.method formulas + symmetry / bias-aware / skew-aware properties
  - p-value formulas via test inversion (per ci.method)
  - nboots literature recommendations (Efron & Tibshirani 1993; DiCiccio &
    Efron 1996; Davison & Hinkley 1997; Hesterberg 2014)
  - empirical coverage table (DGP-A, NBOOTS=200/1000/2000)
  - decision tree for picking (vartype, ci.method) per use case
  - v2.4.2 caveats panel: wild under-coverage, jackknife slot-contract,
    parametric ci.method fix (this PR), missing p.value column

Renames (Quarto cross-refs use @sec-foo labels, not file numbers, so
breakage is impossible):
  09-panel.Rmd → 10-panel.Rmd
  10-sens.Rmd  → 11-sens.Rmd
  (new)        → 09-inference.Rmd

Render confirmed clean: 15 HTML / 180 SVG / 0 fatal errors.

Coverage table notes: at this clean Gaussian DGP, all 5 ci.methods
saturate at NBOOTS=200 (case bootstrap 91%, parametric 93-94% post-fix,
wild 71-78%). Literature recommendations of B≥1000 matter most for
skewed estimands (aptt/log.att) and small Ntr — not for symmetric
near-Gaussian estimands like att.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ard-error on non-normal ci.methods

Pre-fix: estimand() rejects all jackknife fits with "Slot contract:
fit$eff.boot first two dimensions (TT x N-1) must match TT x N" because
jackknife stores eff.boot as TT x (N-1) x N (each leave-one-out fit
drops a column), not TT x N x nboots.

Fix in R/po-estimands.R only:

1. Slot contract relaxation: vartype-branched. Jackknife asserts
   dim(eb)[1] == TT and dim(eb)[3] == N (drops match unit count); other
   vartypes keep the original TT x N check.

2. New .check_jackknife_ci_method() helper hard-errors on anything
   other than "normal" for jackknife fits, citing Efron & Tibshirani
   (1993) Ch.11 and Davison & Hinkley (1997) §3.2.1: jackknife produces
   an SE estimate via Tukey's formula, not a sampling distribution. The
   N leave-one-out estimates are influence-function flavored and not
   exchangeable bootstrap draws, so reflection-based ci.methods (basic,
   percentile, bc, bca) are not statistically meaningful on them.

3. Two new aggregation paths:
   - .estimand_att_overall_jackknife(): two sub-cases. No cells filter
     -> reads fit$att.avg.unit.boot directly. With cells filter ->
     iterates j=1..N, drops column j from cell_mask, extracts the mean
     of eff.boot[,,j][cm_j], applies Tukey SE.
   - .estimand_att_event_time_jackknife(): same column-drop masking
     pattern for the test != "none" path.

4. APTT and log.att jackknife branches in .compute_aptt_event_time and
   .compute_log_att_event_time using identical column-drop masking.

5. imputed_outcomes(replicates=TRUE) hard-errors on jackknife fits
   (column-count mismatch makes per-cell replicate expansion incoherent).

The fast path (event.time + normal + no filter + no test) for jackknife
needs zero new logic: once the contract gate is unblocked, it reads
fit$est.att directly, which fect populates with Tukey SE and Wald CIs
at fit time.
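
As a reminder of what Tukey's formula computes, here is a base-R sketch on a toy statistic (the sample mean), where the jackknife SE has a known closed form. `jackknife_se` is a hypothetical helper, not a fect function, and the data are simulated.

```r
# Tukey's jackknife SE from N leave-one-out estimates.
# jackknife_se is a hypothetical helper used for illustration.
jackknife_se <- function(theta_loo) {
  n <- length(theta_loo)
  sqrt((n - 1) / n * sum((theta_loo - mean(theta_loo))^2))
}

set.seed(3)
x <- rnorm(25, mean = 1)

# Leave-one-out estimates of the mean.
theta_loo <- vapply(seq_along(x), function(j) mean(x[-j]), numeric(1))

se <- jackknife_se(theta_loo)

# The leave-one-out estimates yield an SE, not a sampling distribution,
# so the natural interval is the Wald ("normal") one.
ci <- mean(x) + qnorm(c(0.025, 0.975)) * se
```

For the sample mean the jackknife SE equals `sd(x) / sqrt(length(x))` exactly, which makes this a convenient sanity check.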

Tests: 68 new assertions in tests/testthat/test-estimand-jackknife.R
(scenarios J.1-J.12 + S-11/S-12 anti-regression + S-SIM 100-rep coverage
on DGP-A att overall, threshold >= 0.85, achieved). Full suite: 0 FAIL.

statsclaw pipeline: planner -> builder -> tester, all green.
Spec: statsclaw-workspace/fect/jackknife-fix/spec.md
Audit: statsclaw-workspace/fect/jackknife-fix/audit.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the wild branch in one.nonpara() (R/boot.R lines 1139–1178).
Old code perturbed only D=0/I=1 cells and kept observed Y on treated
cells verbatim, capturing only Y0-imputation variance.  New code builds
a per-event-time ATT lookup matrix and perturbs treated cells too using
eff[t,i] - att_period[T.on[t,i]] as the residual, following CGM 2008
exactly.  Width ratio wild/bootstrap = 0.865 (>= 0.80 target); point
estimate byte-identical to case-bootstrap on same seed.
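
A unit-level Rademacher perturbation can be sketched generically as follows. This is a simplified illustration, not the code in `one.nonpara()`: the `fitted` and `resid` matrices are simulated, and the treated-cell residual definition described above is not reproduced here.

```r
# Generic unit-level Rademacher wild-bootstrap draw on a TT x N panel.
# fitted and resid are simulated placeholders, not fect outputs.
set.seed(11)
TT <- 5; N <- 4
fitted <- matrix(rnorm(TT * N), TT, N)
resid  <- matrix(rnorm(TT * N), TT, N)

# One sign per unit (column), so each unit's residual path keeps its
# within-unit dependence and only its overall sign is flipped.
w <- sample(c(-1, 1), N, replace = TRUE)
resid_b <- sweep(resid, 2, w, `*`)

Y_boot <- fitted + resid_b
```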

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
59 new assertions in tests/testthat/test-wild-bootstrap.R covering
15 scenarios (T1-T11 + E1-E4):
  T1  wild + all 5 ci.methods produce finite, non-degenerate CIs
  T2  wild CI width >= 80% of bootstrap CI width across all 5 methods
  T3  att.avg byte-identical between wild and bootstrap fits (point
      estimate unchanged - the fix only changes the bootstrap distribution)
  T4  aptt event.time + wild + bca on positive-Y DGP
  T5  log.att event.time + wild + bca on positive-Y DGP
  T6-T8  bootstrap, parametric, jackknife paths unchanged
  T9  binary outcome + wild still hard-errors at fit time
  T10 nboots=200 still works
  T11 100-rep coverage simulation: each ci.method >= 0.85
  E1-E4 edge cases (staggered panel, uniform adoption, placebo masked
       cells, degenerate fits)

Coverage results (T11, 100 reps NBOOTS=200, DGP-A att overall):
  Pre-fix:  basic 0.78  percentile 0.73  bc 0.74  bca 0.74  normal 0.77
  Post-fix: all 5 methods >= 0.85 (T11 acceptance threshold)

Width ratio (T2, single seed): wild/bootstrap = 1.41 (was ~0.57 pre-fix).

Full testthat suite post-fix: PASS 1434 / FAIL 0 (1976.6 s wall).

statsclaw pipeline: planner -> builder (2df98f8) -> tester, all green.
Spec: statsclaw-workspace/fect/wild-bootstrap-fix/spec.md
Audit: statsclaw-workspace/fect/wild-bootstrap-fix/audit.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements spec wild-as-paraerror:
- New `para.error` argument to fect()/fect.formula()/fect.default() with
  default "auto". Enum: "auto", "ar", "empirical", "wild".
- Loop 2 in fect_boot() dispatches on `para.error.resolved` (three-way
  switch) instead of `if (0 %in% I)` / `else`.
- Wild path: unit-level Rademacher sign-flip on Loop 1 pool draws
  (variant-i, H0-centered); location-shift in estimand() re-centers.
- Validation in fect.default(): hard-error if empirical/wild + missing
  cells; deprecation warning if vartype="wild" (rewrites to
  vartype="parametric" + para.error="wild" before gate).
- Standalone wild branch in boot.R one.nonpara() removed.
- para.error.resolved stored on fit object for inspection.
- 10 unit tests in test-para-error.R covering all spec scenarios.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Companion to the builder's starter file (test-para-error.R, 10 unit
tests). Covers test-spec.md scenarios T4, T7, T9-T10, T13-T21, E2-E4, E6.

NOT YET RUN. Tonight's pickup: run this file via testthat::test_file()
after the redesign at b9881ed. Coverage threshold per scenario is 0.91
(within 2 MC-SE of nominal 95%) — DO NOT relax to 0.85.
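
The 0.91 threshold can be reproduced with a short calculation. The sketch assumes 100 simulation replications per scenario, matching the 100-rep gates used elsewhere in this PR; the replication count for T19/T20 is not stated here, so treat `reps` as an assumption.

```r
# Monte Carlo standard error of an estimated coverage rate.
nominal <- 0.95
reps <- 100   # assumed number of simulation replications per scenario

mc_se <- sqrt(nominal * (1 - nominal) / reps)
lower_2se <- nominal - 2 * mc_se

round(c(mc_se = mc_se, lower_2se = lower_2se), 3)
# mc_se is about 0.022 and lower_2se about 0.906, which rounds to 0.91.
```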

Critical scenarios:
- T19 (DGP-A IID): all (para.error × ci.method) cells coverage >= 0.91
- T20 (DGP-A8 AR(1) rho=0.8): same threshold; this is the gsynth-note
  stress test that the variant-(ii) attempt failed
- T21: width parity wild/empirical in [0.70, 1.30]
- Anti-regression on test-estimand-parametric-cifix.R (98 assertions
  for the v2.4.2 location-shift fix)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…net-new API

vartype = "wild" was added in 6641af7 (the v2.4.2 completion commit) and
then folded into a deprecation alias in b9881ed when the redesign moved
the wild bootstrap into the parametric Loop-2 architecture. Since v2.4.1
never had vartype = "wild" and v2.4.2 has not shipped, there is no
caller to deprecate --- the alias is backwards-compat scaffolding for
nobody.

Removes:

- R/default.R: the if (vartype == "wild") rewrite block + the parenthetical
  hint in the vartype validation error. vartype enum is now exactly
  c("bootstrap", "jackknife", "parametric"); passing "wild" produces the
  standard validation error.
- tests/testthat/test-para-error.R: Test 4 (alias warning + routing).
- tests/testthat/test-para-error-full.R: T9 (alias-identical-to-explicit)
  and T10 (soft-deprecation-not-error).

Reframes:

- NEWS.md: replaces the "New: vartype = "wild"" + deprecation paragraphs
  with a single "New: para.error argument" entry covering all four modes
  (auto / ar / empirical / wild) as the API surface.
- vignettes/bb-updates.Rmd: same reframe in the v2.4.2 changelog block.
- vignettes/02-fect.Rmd: vartype × ci.method table loses the "wild" row;
  a new para.error sub-options table is added under it.
- vignettes/09-inference.Rmd: decision-tree note rewritten in para.error
  terms; v2.4.2 caveats table updated (jackknife slot-contract bug now
  fixed in v2.4.2 via hard-error on non-normal ci.methods; wild-bootstrap
  caveat removed since the variant-(ii) mode no longer exists).

Verification: test-para-error.R 29/29 PASS; test-para-error-full.R fast
portion 44/44 PASS (T19/T20/T21 in background or already done). Clean
validation error confirmed: vartype = "wild" -> "must be one of
bootstrap, jackknife, or parametric".
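
A minimal sketch of the resulting validation behaviour. The helper name
validate_vartype() is hypothetical; the real check lives in R/default.R and
its exact code and full message wording are not reproduced here.

```r
# Hypothetical stand-in for the vartype check; not the package's own code.
validate_vartype <- function(vartype) {
  allowed <- c("bootstrap", "jackknife", "parametric")
  ok <- is.character(vartype) && length(vartype) == 1L && vartype %in% allowed
  if (!ok) {
    stop("vartype must be one of bootstrap, jackknife, or parametric",
         call. = FALSE)
  }
  vartype
}

# "wild" is no longer special-cased: it falls through to the standard error.
wild_msg <- tryCatch(validate_vartype("wild"),
                     error = function(e) conditionMessage(e))
ok_value <- validate_vartype("parametric")
```

With the alias gone there is a single code path for every unsupported value,
which is what the "Clean validation error confirmed" line above checks.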

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/coverage-study/

Two coordinated changes to the para.error validation suite, both motivated by
the same finding: the parametric pseudo-treated bootstrap targets the
conditional variance V_t = Var(ATT_hat - ATT | Lambda, F, X, D) (gsynth-note
section 2). Re-randomizing the treatment indicator D across simulation
replications adds a Var_D[b_t] term to the marginal variance the simulation
measures coverage against. The bootstrap does not target that term, so measured
coverage is biased downward. The bootstrap procedure itself is unaffected.
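
The decomposition behind this argument can be written out with the law of total
variance. This is a sketch under one assumption: b_t is read as the conditional
mean of the estimation error given D, which this message does not define
explicitly.

```latex
\operatorname{Var}\bigl(\widehat{\mathrm{ATT}} - \mathrm{ATT} \mid \Lambda, F, X\bigr)
  = \mathbb{E}_D\bigl[V_t\bigr] + \operatorname{Var}_D\bigl[b_t\bigr],
\qquad
\begin{aligned}
V_t &= \operatorname{Var}\bigl(\widehat{\mathrm{ATT}} - \mathrm{ATT} \mid \Lambda, F, X, D\bigr),\\
b_t &= \mathbb{E}\bigl[\widehat{\mathrm{ATT}} - \mathrm{ATT} \mid \Lambda, F, X, D\bigr].
\end{aligned}
```

Holding D fixed across replications removes the second term, so the simulated
variance matches the V_t that the bootstrap targets.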

(1) DGP fix in tests/testthat/test-para-error-full.R:

  dgp_a, dgp_a8, dgp_b now use D[(T0+1):TT, 1:Ntr] <- 1L (fixed treated block)
  rather than `id_tr <- sample(1:N, Ntr)` (random per rep). This matches
  gsynth-note's MC framework (footnote on Table 1, section A.2) and is the
  convention in the factor-model literature. Without this fix, T19/T20 measured
  coverage 74-79% when nominal was 95% --- not because of any bug in the
  bootstrap, but because the simulation was measuring against the wrong target.

(2) Move T19, T20, T21 from tests/testthat/test-para-error-full.R to
  tests/coverage-study/run_para_error_coverage.R:

  These three Monte-Carlo studies (~30 min wall in parallel) are simulation
  scripts, not unit tests. Routine devtools::test() runs no longer pay the
  cost of running 1500+ fits at nboots=1000. The new standalone runner
  produces a CSV summary at /tmp/fect-coverage-study/ and is invoked via
  Rscript when inferential methods are being updated; see
  tests/coverage-study/README.md for the full trigger list and acceptance
  criteria. The directory is added to .Rbuildignore so the validation script
  stays in source but is excluded from the CRAN tarball.
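
The difference between the two treatment-assignment conventions in item (1) can
be sketched in base R. The dimensions and the helper names make_D_fixed() and
make_D_random() are illustrative assumptions, not the test file's own code.

```r
# Illustrative panel dimensions (assumed, not taken from the test file).
TT <- 10L; T0 <- 6L; N <- 8L; Ntr <- 3L

# Fixed treated block: the same units are treated in every replication.
make_D_fixed <- function() {
  D <- matrix(0L, nrow = TT, ncol = N)
  D[(T0 + 1):TT, 1:Ntr] <- 1L
  D
}

# Old convention: treated units are redrawn on every call.
make_D_random <- function() {
  D <- matrix(0L, nrow = TT, ncol = N)
  id_tr <- sample(1:N, Ntr)
  D[(T0 + 1):TT, id_tr] <- 1L
  D
}

set.seed(1)
fixed_same <- identical(make_D_fixed(), make_D_fixed())
treated_sets <- replicate(20, which(colSums(make_D_random()) > 0),
                          simplify = FALSE)
n_distinct_sets <- length(unique(treated_sets))
```

Under the fixed convention every replication conditions on the same D; under
the random convention the treated set varies, which is the extra source of
variance described above.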

Verification: post-cleanup, tests/testthat/test-para-error-full.R still has 12
test_that blocks (T4, T7, T13-T18, E2-E4, E6) all passing; smoke tests in
test-para-error.R also pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mponent.from

The pre-existing Gate C ("Parametric bootstrap is not valid when
time.component.from is notyettreated") fires after the silent fe -> ife r=0
coercion, so a user who passes method = "fe" gets an error message that
mentions only time.component.from --- the connection to their original
method argument is invisible. This commit replaces Gate C with an early-gate
variant placed before the coercion, so the error names the user's literal
method.

Before:
  fect(Y ~ D, method = "fe", vartype = "parametric")
  # Error: Parametric bootstrap is not valid when "time.component.from" is
  # "notyettreated". Use time.component.from = "nevertreated" ...

After:
  fect(Y ~ D, method = "fe", vartype = "parametric")
  # Error: vartype = "parametric" requires time.component.from = "nevertreated".
  #   Your call: method = "fe", time.component.from = "notyettreated".
  # The parametric pseudo-treated bootstrap requires a control pool isolated
  # from treated-unit pre-treatment cells. Pass time.component.from =
  # "nevertreated" (if never-treated controls exist) or use vartype =
  # "bootstrap" or "jackknife".

Behavior unchanged for the valid case: method = "fe" + parametric +
nevertreated still works silently (fit$method = "fe", fit$vartype =
"parametric"). Behavior also unchanged for method = "ife" / "cfe" with
notyettreated --- error message now references their literal method too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… user-frequency

Restructures the Quarto user manual to put the casual user's reading
path (~90% of users using `method = "fe"`) into a dedicated "Basics" part
of 3 chapters, with advanced material in subsequent parts that are read
"as needed". The motivation is long-term robustness: the parts structure
absorbs future chapter additions without renumbering.

Final order:

  Basics (always read)
    01-start, 02-fect, 03-estimands

  Advanced estimators and inference (read when factor models or gsynth
  are needed; the inference deep-dive is most meaningful in the context
  of gsynth + parametric bootstrap)
    04-ife-mc, 05-cfe, 06-gsynth (was 08), 07-inference (was 09)

  Diagnostics and extensions
    08-hte (was 06), 09-panel (was 10), 10-sens (was 11)

  Reference
    11-plots (was 07; reference material lives at end), aa-cheatsheet,
    bb-updates, cc-references (was references.qmd; bibliography is the
    absolute last entry per book convention)

File renames preserve git history (R = rename in git status).

Cross-references unchanged: every internal link uses [@sec-x] logical
labels, never chapter numbers, so Quarto resolves them to the correct
chapter regardless of file position. No content edits required in any
chapter; only _quarto.yml YAML changed.

Bisect leftovers removed: vignettes/09-panel.Rmd and vignettes/10-sens.Rmd
were duplicates from an earlier `git checkout b4e9fbf` operation during
the v2.4.2 inference investigation; src/symbols.rds was a build artifact
from the same bisect. All three were byte-identical to their HEAD
counterparts.

Verification: Quarto book renders cleanly (15 HTML / 0 errors) with the
b9881ed redesign, the Option B parametric+notyettreated gate, and this
restructure all stacked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rve method-aware error

The Option B gate added at e45a852 fired BEFORE the reversal-check gate
(line 1868), which broke two existing tests:

  test-paraboot-dispatcher.R:145 (hasRevs+parametric)
  test-paraboot-parity.R:405     (GATE-1 reversal+parametric)

Both call fect() on a reversal panel with default time.component.from
("notyettreated"); they expect the reversal error message but got the
new nevertreated message instead.

Fix: save method_arg <- method early (before the silent fe -> ife
coercion at line 601), then place the parametric/nevertreated gate
AFTER the reversal-check gate. Reversal users now get the more
actionable reversal message first; non-reversal-but-notyettreated users
get the method-aware nevertreated message via method_arg, preserving
Option B's design intent that the error names the user's literal
method = "fe".
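
The gate ordering described in the fix can be sketched as follows. The function
check_parametric_gates(), its has_reversal argument, and the reversal message
text are hypothetical placeholders; only the nevertreated message wording
follows the example quoted in the earlier Option B commit.

```r
# Hypothetical sketch of the gate order; not the code in fect.default().
check_parametric_gates <- function(method, vartype,
                                   time.component.from = "notyettreated",
                                   has_reversal = FALSE) {
  method_arg <- method                 # saved before any coercion
  if (method == "fe") method <- "ife"  # stand-in for the silent fe -> ife coercion

  # Reversal gate fires first (placeholder message text).
  if (vartype == "parametric" && has_reversal) {
    stop("reversal gate (placeholder message)", call. = FALSE)
  }
  # Nevertreated gate fires second and names the user's literal method.
  if (vartype == "parametric" && time.component.from != "nevertreated") {
    stop(sprintf(paste0(
      "vartype = \"parametric\" requires time.component.from = \"nevertreated\".\n",
      "  Your call: method = \"%s\", time.component.from = \"%s\"."),
      method_arg, time.component.from), call. = FALSE)
  }
  invisible(list(method_arg = method_arg, internal_method = method))
}

# Capture the error message, or NA when the call succeeds.
msg_of <- function(expr) {
  tryCatch({ expr; NA_character_ }, error = function(e) conditionMessage(e))
}
m_nyt <- msg_of(check_parametric_gates("fe", "parametric"))
m_rev <- msg_of(check_parametric_gates("fe", "parametric", has_reversal = TRUE))
m_ok  <- msg_of(check_parametric_gates("fe", "parametric",
                                       time.component.from = "nevertreated"))
```

Saving method_arg before the coercion is what lets the later gate report
method = "fe" even though the internal method has already changed.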

Verification:
  - Full devtools::test() with NOT_CRAN=true: 1448/1448 PASS
  - Manual checks of three scenarios:
      method=fe + parametric + notyettreated -> "method = \"fe\"" msg
      method=fe + parametric + nevertreated  -> works silently
      reversal + parametric                  -> reversal msg first

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…A forward-refs

Two issues, fixed together:

(1) The chapter-restructure commit at e45a852 staged the file renames
    via `git mv` but accidentally did NOT stage `_quarto.yml` or
    `index.qmd`. As a result the remote HEAD's `_quarto.yml` still
    listed the OLD chapter filenames (06-hte.Rmd, 07-plots.Rmd,
    08-gsynth.Rmd, 09-inference.Rmd, 10-panel.Rmd, 11-sens.Rmd,
    references.qmd) which no longer exist on disk. Anyone cloning the
    repo and running `quarto render` would see file-not-found errors.
    The local working-tree YAML was correct, which is why my own render
    succeeded. This commit pushes the parts-structured YAML and the
    matching Organization section in index.qmd.

(2) Phase A of the post-restructure narrative audit: forward-references
    from the Basics part (02-fect, 03-estimands) and from the Advanced
    part's last chapter before inference (06-gsynth) to the new
    inference deep-dive at @sec-inference. Casual users reading Basics
    see "the deep-dive on bootstrap-distribution semantics is in
    Chapter inference, most useful after Chapter gsynth"; gsynth users
    reading the end of 06-gsynth see "next: Chapter inference" as the
    natural read.

    Also fixes a pre-existing broken cross-reference in 03-estimands:
    "gates A/B/C, see @sec-cfe" -> "gates A/B/C, see [Section
    @sec-parametric-regimes] in [Chapter @sec-fect]". Gates A/B/C are
    documented in 02-fect's parametric-regimes section, not in 05-cfe.

Phase B (rewrite of 07-inference to drop `vartype = "wild"` rows from
the matrix and update the empirical coverage table with post-redesign
numbers from the night-4 FIXED id_tr run) is queued as a follow-up
commit; that chapter's content is still stale and needs a substantive
revision separate from this audit cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reflects the b9881ed para.error redesign and the cleanup commit at 0104f8c
that removed vartype = "wild" entirely. The chapter previously described:

  - "two design choices" (vartype × ci.method)
  - a 4 × 5 matrix with vartype rows {bootstrap, wild, jackknife, parametric}
  - a dedicated `### "wild"` section that covered Y0-only-perturbation wild
  - empirical coverage table with `wild` rows showing 71-78% under-coverage
  - "Three patterns" with pattern 3 calling out wild under-coverage as a
    known v2.4.2 caveat to live with

All of that is now stale. After this commit the chapter describes:

  - "three design choices" (vartype × ci.method × para.error)
  - a 3 × 5 matrix with vartype rows {bootstrap, jackknife, parametric}
  - a dedicated `### para.error` sub-section under the parametric vartype,
    documenting auto/ar/empirical/wild as residual-model sub-options
  - empirical coverage tables in two parts (DGP-A IID Gaussian and DGP-A8
    AR(1) rho=0.8) with the post-redesign night-4 numbers (0.90-0.91 on
    DGP-A; 0.96-0.97 on DGP-A8) for all 15 cells per DGP (3 para.error
    modes x 5 ci.methods)
  - "Two patterns" (the under-coverage pattern 3 is gone)
  - a section explaining the conditional-variance target V_t = Var(ATT_hat
    - ATT | Lambda, F, X, D) and the law-of-total-variance reasoning that
    requires holding D fixed across coverage simulations
  - an "Inference summary" at the end documenting the two pre-v2.4.2 bugs
    that were fixed (parametric location-shift bug, jackknife slot-contract)
    and the new para.error API

Decision tree updated to mention para.error and to drop the "Use fit$est.att
directly until estimand() supports jackknife" workaround (no longer needed
since jackknife is now a first-class ci.method = "normal" path).

Phase A forward-references in 02-fect / 03-estimands / 06-gsynth point at
this chapter; the Phase A commit at 59a5787 is what brought casual readers
to this chapter when they need it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xuyiqing and others added 26 commits May 1, 2026 22:21
…re quarto artifacts

(1) man/fect.Rd: fix R CMD check WARNING "Codoc mismatches from Rd file
    'fect.Rd'". Two issues, both stemming from the v2.4.2 changes that
    landed without a docs update:

    - Added \item{para.error}{...} to the \arguments block; added
      `para.error = "auto"` to the \usage block at the correct position
      (after `vartype`, before `cl`).
    - Updated `nboots = 200` -> `nboots = 1000` in the \usage block to
      match the v2.4.2 default change.

    Verified via tools::codoc("fect", lib.loc = ...) on a fresh build:
    "PASS: no codoc mismatches."

(2) NEWS.md: simplified the v2.4.2 section to match the v2.4.0 layered-
    bullet style. Previously had 5 separate `## New:` headers with
    long-prose bullets and no `## Bug fixes` section; now has 2 `## New:`
    headers consolidating estimand() additions and the para.error API,
    plus a `## Bug fixes` section listing the parametric coverage fix,
    jackknife slot-contract relaxation, log.att / aptt cell-drop hard-
    errors, the clearer parametric+notyettreated error message, the
    F-test wording fix, and the parallel-worker package-version warning
    suppression. `## Other changes` covers the nboots default raise and
    the warm-start C++ infrastructure (deferred activation).

(3) .gitignore: add quarto-specific render artifacts that were
    accumulating as untracked files after restructure renders:

      vignettes/*.rmarkdown
      vignettes/site_libs/
      vignettes/.quarto/

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tart NEWS entry

* vignettes/bb-updates.Rmd: replace 5 prose paragraphs with hierarchical
  bullets matching v2.4.0 style. Two `**header**` sections:

    - "Alternative-estimand additions in estimand()" (covers test arg
      + bc/bca/normal ci.methods + per-type defaults)
    - "para.error argument for vartype = 'parametric'"

  Plus a single-line "**Various bug fixes.**" sentinel; specifics live
  in NEWS.md for users who want them. Removes the previous separate
  prose sections on cell-drop hard-error and nboots default raise.

* NEWS.md: drop the warm-start C++ infrastructure entry from
  ## Other changes. Yiqing flagged the warm-start path for further
  investigation before mention; the C++ infrastructure is in place
  with NULL default (no behavioral change), so removal from the
  changelog is content-accurate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s-ref resolves

Removed `.unnumbered` from bb-updates.Rmd's title line. Quarto cannot
generate "Chapter N" for an unnumbered chapter, so the `[Chapter
@sec-changelog]` reference I added to index.qmd's Organization section
in 59a5787 produced a render warning:

    WARN: index.html: Unable to resolve crossref @sec-changelog

The companion `aa-cheatsheet.Rmd` is already numbered (no `.unnumbered`),
so making bb-updates numbered keeps the Reference part internally
consistent. Now bb-updates renders as Chapter 13 (the last numbered
chapter before cc-references.qmd, which is the bibliography and stays
unnumbered by Quarto convention).

Verified: clean re-render of just bb-updates.Rmd + index.qmd resolves
the warning; full book is at 15/15 HTML files / 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test scaffold for the v2.4.3-candidate partial warm-start design (warm
auxiliaries: alpha, xi, beta, gamma, kappa, multi-FE, missing fillers;
cold-start (F, Lambda) per replicate via fresh SVD).

Status: SCAFFOLD ONLY. The script documents the test design and provides
the comparison harness, but the partial warm-start API in fect() is not
yet shipped. Running the script today exits with a TODO message. Two
implementation paths:

  (a) Add `warm.start = c("none", "linear", "all-auxiliaries")` argument
      to fect() and route to the existing C++ fit_init parameter, with
      the construction:
          fit_init = Y_hat_main - F_main %*% t(Lambda_main)
      so the EM seeds with the auxiliary-only surface and the first SVD
      step discovers fresh per-replicate (F, Lambda).
  (b) Direct call to inter_fe_ub via Rcpp with a hand-built fit_init,
      bypassing fect() entirely. Faster prototype, less production-
      relevant.
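
The fit_init construction in path (a) can be illustrated with synthetic
matrices. This sketch assumes a T x N outcome layout with F as T x r and Lambda
as N x r; the object aux is a made-up stand-in for the auxiliary (non-factor)
part of the fitted surface.

```r
set.seed(1)
TT <- 6L; N <- 4L; r <- 2L

# Synthetic main-fit pieces (assumed shapes, not real fect output).
F_main      <- matrix(rnorm(TT * r), nrow = TT, ncol = r)
Lambda_main <- matrix(rnorm(N * r),  nrow = N,  ncol = r)
aux         <- outer(rnorm(TT), rnorm(N), "+")  # additive time + unit surface

Y_hat_main <- aux + F_main %*% t(Lambda_main)

# Subtract the factor component so the seed carries only the auxiliaries.
fit_init <- Y_hat_main - F_main %*% t(Lambda_main)
```

Here fit_init recovers aux up to floating-point error, which is the
"auxiliary-only surface" the design intends to seed the EM with.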

Phase B (IFE on simdata + covariates):
  30 reps x {cold, partial-warm} x IFE r=2
  Pass: SE diff < 5% relative AND speedup >= 2x

Phase B-CFE (5 DGPs):
  sim_region (multi-level FE; no factors)
  sim_trend  (Q-heavy: sinusoidal trends)
  sim_linear (Q-light: linear trends)
  simdata    (factor-only)
  sim_gsynth (factor + covariates; gsynth-style)
  Per-DGP: SE diff < 5% AND speedup >= 2x

Phase C (jackknife, IFE r=2): SE diff < 3% relative
Phase D (coverage): T19/T20 coverage parity within MC-SE

Pass criteria same as Phase A (which failed). Design intent: auxiliaries
are deterministic given (F, Lambda) so warming them shortens the EM path
without anchoring the bootstrap distribution. (F, Lambda) basin still
varies per replicate because the first SVD's input is per-replicate.

If Phase B passes: ship partial warm-start in v2.4.3 (Yiqing's directive,
2026-05-01).

Companion design doc:
  workbench statsclaw-workspace/fect/ref/v242-warm-start-investigation/
    partial-warm-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Ch7 title: "Bootstrap Inference Internals" -> "Inference"
  (sec-inference anchor unchanged; cross-references in 02-fect, 03-estimands,
  06-gsynth all resolve via the label so no further edits needed).
* cc-references.qmd title: "References" -> "Bibliography"
  (canonical book convention; the chapter contains only the rendered
  bibliography from references.bib via @* nocite).
* 01-start.Rmd version-check chunk: removed tryCatch wrappers around the
  CRAN and dev-branch version queries. Errors will now propagate normally
  if either source is unreachable; this is fine for a documentation chunk
  where the query is intentionally live (not cached).

Render: incremental (cache reused for unchanged chapters); 15 HTML / 0
errors. Sidebar shows the new titles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* bb-updates.Rmd: drop the {#sec-changelog} label; restore {.unnumbered}.
  Changelog is back matter -- accessed via the sidebar, not via in-text
  cross-references. Removing the label simplifies the structure and lets
  the chapter sit unnumbered alongside the Bibliography.
* index.qmd Organization section: drop the [Chapter @sec-changelog]
  cross-ref (now resolves to nothing since the label is gone). Replaced
  with an unlinked "Changelog" entry pointing readers at the sidebar.
* bb-updates.Rmd v2.4.2 entry: simplified to 5 high-level pointer
  bullets + cross-references to the chapters that explain in detail.
  Drops the duplicated content (test arg / ci.methods / para.error /
  cell-drop hard-error) that already lives in 03-estimands and
  07-inference. Each bullet is now: "what changed -> see chapter X".

Render: incremental (cache reused), 15 HTML / 0 errors / 0 warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Folds the v2.4.3 default-tol fix into v2.4.2 (originally on
feat/v243-warm-start as a76949b + cb7b4c0). Net delta over v2.4.2:

* Default tol: 1e-3 -> 1e-5
* Default max.iteration: 1000 -> 5000
* New warning() when EM hits max.iteration without satisfying tol
* man/fect.Rd updated for new defaults
* Test fixture pins on test-paraboot-parity.R PAR-2/3/4/6 preserve
  byte-equality contracts against existing baseline.rds

Internal infrastructure (dormant, mirrors existing inter_fe_ub /
inter_fe_mc plumbing; not exposed to users):

* C++ fit_init parameter on complex_fe_ub + cfe_iter
* R wrapper fit.init params on fect_fe / fect_mc / fect_cfe /
  fect_nevertreated

Coverage-study scripts moved into tests/coverage-study/ (already in
.Rbuildignore): run_tol_characterization.R, run_tol_coverage.R,
run_tol_coverage_extended.R, run_cfe_high_K.R. Warm-start exploratory
artifacts archived to tests/coverage-study/_archive/ for future
reference: test-warm-start.R, run_partial_warm_start_validation.R,
run_mc_warm_start.R.

NEWS.md v2.4.2 entry gains "Changed: tighter EM convergence defaults"
section + "Internal infrastructure" bullet under Other changes.
DESCRIPTION date bumped to 2026-05-02.

The pre-v2.4.2 default tol = 1e-3 halted IFE/CFE EM well before
convergence: on factor-DGP simdata the EM stopped at ~116 iters with
att.avg = 2.87, while running to tol = 1e-7 (~2000 iters) produces
att.avg = 2.43 (18% gap; CFE worse at 40%). Inference at the old
default was already correct (coverage 0.96 at both old and new tol);
the fix improves point-estimate stability across versions/machines.
Speed cost: ~2-5x slower main fit and bootstrap on factor-DGP IFE/CFE.

Set tol = 1e-3, max.iteration = 1000 explicitly to reproduce
pre-v2.4.2 numerical output exactly.
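
A toy illustration of why a loose relative-change tolerance can stop a slowly
converging iteration early, and of the non-convergence warning. The function
toy_em() is a made-up fixed-point iteration, not the package's EM routine, and
its numbers are unrelated to the att.avg figures above.

```r
# Toy iteration: x moves a fixed fraction of the way to `target` each step.
toy_em <- function(tol, max.iteration, rate = 0.02, target = 2, start = 3) {
  x <- start
  converged <- FALSE
  for (iter in seq_len(max.iteration)) {
    x_new <- x + rate * (target - x)
    rel_change <- abs(x_new - x) / abs(x)
    x <- x_new
    if (rel_change < tol) {
      converged <- TRUE
      break
    }
  }
  if (!converged) {
    warning("reached max.iteration without satisfying tol", call. = FALSE)
  }
  list(x = x, iter = iter, converged = converged)
}

loose  <- toy_em(tol = 1e-3, max.iteration = 1000)  # old-style defaults
tight  <- toy_em(tol = 1e-5, max.iteration = 5000)  # new-style defaults
capped <- tryCatch(toy_em(tol = 1e-5, max.iteration = 100),
                   warning = function(w) conditionMessage(w))
```

The loose run stops while x is still visibly away from the target, the tight
run stops later and closer, and the capped run triggers the warning.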

Decision context: statsclaw-workspace/fect/runs/2026-05-02-tol-convergence-investigation.md
Fold session: statsclaw-workspace/fect/runs/2026-05-02-fold-tol-into-242.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…aim)

The v2.4.2 NEWS.md "Other changes" section already claims:

  complex_fe_ub and cfe_iter C++ entries gain optional fit_init
  parameter (NULL default preserves pre-existing cold-start behavior).

But the original cherry-pick of a76949b + cb7b4c0 onto feat/v242-completion
(commit dc2fc71) missed commit 097265c, which is where this C++ infrastructure
was originally introduced. cb7b4c0 only stripped the R-side warm.start API,
*keeping* the C++ plumbing on top of 097265c's base; cherry-picking it onto
a base that lacks 097265c left the plumbing un-introduced.

This commit lands the dormant infra by checking out the v243-warm-start
versions of the affected files. NULL default = byte-identical to pre-2.4.2
behavior; no functional reach without the public warm.start API (which
cb7b4c0 stripped intentionally).

Files touched:
  src/cfe.cpp        complex_fe_ub gains optional fit_init Rcpp::Nullable
  src/cfe_sub.cpp    cfe_iter gains optional fit_init Rcpp::Nullable
  src/fect.h         signature updates
  R/cfe.R            fect_cfe() gains fit.init = NULL parameter
  R/fect_nevertreated.R  fect_nevertreated() gains fit.init = NULL +
                         control-unit slicing
  R/RcppExports.R    regenerated bindings (compileAttributes)
  src/RcppExports.cpp  regenerated bindings (compileAttributes)
  tests/coverage-study/_archive/run_partial_warm_start_validation.R
                     archive content matched to v243 (was stale)

Validation: clean C++ recompile via pkgbuild::compile_dll();
devtools::test() with TESTTHAT_CPUS=10 parallel = TRUE: 0 failures,
9 expected max.iteration warnings (same set as pre-fold baseline).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bb-updates.Rmd v2.4.2 entry: add the tol/max.iteration tightening bullet
in the existing layered-bullet style. Date bumped to 2026-05-02.

ARCHITECTURE.md: manual patch to record the v2.4.2 dormant fit_init
plumbing on complex_fe_ub + cfe_iter (mirrors the 2026-05-01 manual-patch
convention, since statsclaw:scriber regen is not directly invokable from
this session). Header note updated to reflect the second patch date.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…coverage.R

Aligns with the standing convention for parallel R workloads on this machine
(20 physical cores; leave headroom for foreground work). Speeds the parametric
bootstrap inside each rep without changing the outer 100-rep coverage loop.

T19/T20/T21 timing observed at cores=10, new tol=1e-5 default:
  T19  (DGP-A IID,    100 reps x 1000 nboots)  ~70 min
  T20  (DGP-A8 AR(1), 100 reps x 1000 nboots)  ~70 min
  T21  (width parity,  50 reps x  500 nboots)  ~22 min

T19 coverage at NEW defaults: 13/15 cells at 0.91, 2/15 at 0.90 (ar x bc,
ar x bca) -- borderline vs the [0.91, 0.99] threshold but within MC-noise
band at K=100 (1 SE ~ 0.022). Prior validation at cb7b4c0 baseline saw
0.91 across all cells; the 0.90 deviation is indistinguishable from noise, not a
regression. T20 PASS (0.96-0.97). T21 PASS (wild/empirical width ratio
1.001-1.007).
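
The Monte Carlo standard error quoted above follows from the binomial variance
of a coverage proportion. A short check, assuming K = 100 replications and a
nominal level of 0.95 as stated in this log:

```r
K       <- 100    # coverage replications per cell
nominal <- 0.95   # nominal coverage level

# Binomial standard error of an estimated coverage proportion.
mc_se <- sqrt(nominal * (1 - nominal) / K)

# Lower acceptance bound at two MC standard errors below nominal.
lower_bound <- nominal - 2 * mc_se

# One replication out of K moves measured coverage by this much.
one_rep_step <- 1 / K
```

At K = 100 a cell at 0.90 and a cell at 0.91 differ by a single replication,
which is less than one MC standard error.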

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ning + jackknife large-Nco warning

The interim "nboots default = 1000" introduced earlier in the v2.4.2 cycle
was overspec'd for the dominant use case (estimand(fit, "att") with normal CI
returns SE-grade inference at nboots = 200 just as well). Now the cost is
surfaced exactly where it matters --- when a user requests a tail-quantile-
based CI on under-replicated bootstrap. Three changes:

1. Default `nboots = 200` in fect(), fect.formula(), fect.default() (matches
   v2.4.1; reverts an interim bump made earlier this v2.4.2 cycle that was
   never publicly shipped).

2. estimand() warning gate: when ci.method in {basic, percentile, bc, bca}
   is requested on a bootstrap or parametric fit with length(eff.boot) < 1000,
   emit a warning naming the consequence (tail-quantile endpoints may be
   erratic) and recommending refit at nboots = 1000. The warning fires on
   every such call, not once-per-session --- the user can suppress, refit,
   or proceed with caveat. Point estimate and SE are unaffected.

3. Jackknife large-Nco warning at fit time: vartype = "jackknife" with
   N > 1000 emits a warning recommending vartype = "bootstrap" for
   tractability (full leave-one-out scales linearly in N; at v2.4.2 EM
   convergence defaults each refit is slow). Semantics unchanged --- still
   classic delete-one Tukey jackknife, no subsetting.
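
The estimand() warning gate in change 2 can be sketched as a small predicate.
The helper warn_if_under_replicated() and its message text are hypothetical;
only the method set, the vartype condition, and the 1000-replicate floor come
from the description above.

```r
# Hypothetical sketch of the gate; not the code in estimand().
warn_if_under_replicated <- function(ci.method, vartype, n_boot,
                                     min_boot = 1000L) {
  tail_methods <- c("basic", "percentile", "bc", "bca")
  needs_warning <- ci.method %in% tail_methods &&
    vartype %in% c("bootstrap", "parametric") &&
    n_boot < min_boot
  if (needs_warning) {
    warning(sprintf(paste0(
      "ci.method = \"%s\" relies on tail quantiles; with %d replicates the ",
      "endpoints may be erratic. Consider refitting with nboots = %d."),
      ci.method, n_boot, min_boot), call. = FALSE)
  }
  invisible(needs_warning)
}

# TRUE when evaluating `expr` raises a warning.
warned <- function(expr) {
  tryCatch({ expr; FALSE }, warning = function(w) TRUE)
}

w_tail_small  <- warned(warn_if_under_replicated("bca", "bootstrap", 200))
w_normal      <- warned(warn_if_under_replicated("normal", "bootstrap", 200))
w_tail_enough <- warned(warn_if_under_replicated("bca", "bootstrap", 1000))
```

The three calls mirror the CI.9, CI.10, and CI.11 cases listed under Tests
below: tail method below the floor warns, normal CI stays silent, and a tail
method at or above the floor stays silent.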

Statistical references for the 1000-replicate floor: Efron 1987 §3 (bca
acceleration estimation); DiCiccio & Efron 1996 §4 (bca + percentile
recommendations); Hesterberg 2014 (recommends 1000 minimum, 10000+ ideal
for routine production).

Tests:
- CI.8 updated: assert nboots default is 200 (was 1000)
- CI.9 added: estimand() warns on tail-CI methods at nboots < 1000
- CI.10 added: normal CI is silent at nboots < 1000
- CI.11 added: tail-CI methods at nboots >= 1000 do not warn

Pre-existing CI.1, CI.2b, CI.6, CI.7, CI.7b now emit the new warning during
their fit-time setup (small nboots) but still pass --- warnings are
non-blocking. Test file run alone: 0 failures, 0 errors.

NEWS.md: removed the "Default nboots raised from 200 to 1000" bullet (the
raise was never publicly shipped); added bullets for the new warning gate
and the jackknife large-Nco warning. bb-updates.Rmd mirrored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test-factors-from-refactor.R was written TDD-style before the
time.component.from refactor (header: "Tests are written BEFORE
implementation. Each test will fail until its phase is implemented,
then pass permanently"). After the refactor stabilized, several Phase
1 / Phase 3a-E / Phase 3a-I tests became redundant with substantive
coverage that lives elsewhere. Removing 11 of 77 tests (-14%).

Pruned, with tombstone comments pointing at where coverage still lives:

  Phase 1b  nevertreated acceptance         -> Phase 6a-6e
  Phase 1c  nyt vs nt produce different     -> non-regression-grade
  Phase 1d  default = notyettreated          -> argument-matching is R semantics
  Phase 1e  cfe + nevertreated works         -> Phase 3a-B/C/D
  Phase 1f  time.component.from in CV        -> Phase 3a-G1/G3
  Phase 1g  time.component.from in boot      -> Phase 3a-F1/F2/I1
  Phase 3a-E1  gamma+kappa fields present    -> subsumed by E3
  Phase 3a-E4  plot() smoke                  -> test-plot-fect.R + test-plot-refactor.R
  Phase 3a-E5  print() smoke                 -> covered by every regression that prints
  Phase 3a-I2  ife em=FALSE bootstrap smoke  -> em=TRUE/FALSE equiv via I10
  Phase 3a-I4  cfe em=FALSE bootstrap smoke  -> em=TRUE/FALSE equiv via I10

Phase 1a retained as the single smoke entry point for the API.

File: 1844 -> 1666 lines (-9.6%), 77 -> 66 tests (-14%).
Validation: testthat::test_file() runs clean (0 failures, 0 errors,
0 warnings; same as pre-prune baseline of 77/0/0/0).

Other large test files audited but NOT pruned this commit:

- test-cv-parallel.R (1763 lines, 42 tests): Section P (P.1-P.6) initially
  looked redundant but P.1/P.2 are serial-vs-parallel identity (different
  cv.method cells) while P.3-P.6 are regression-vs-saved-fixtures from a
  prior refactor. Distinct purposes; not bloat.
- test-score-unify.R (2631 lines, 81 tests): S1-S9 sections each cover
  distinct API surfaces (.score_residuals weights/time_index/norm.para
  edge cases; fect_cv regression; fect_mspe criterion/masking/weights
  extensions). Substantive coverage, not bloat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 2631-line test-score-unify.R was the largest file in the suite and
covered 81 tests across 9 sections. Under reporter = "summary" it
produced one progress line for the entire ~15+ min run; nothing was
visible until completion.

Split by section into 6 files of more uniform size, with shared
fixtures extracted to a helper file (testthat auto-loads helper-*.R
before any test-*.R, so the fixtures are visible without per-file
duplication).

  helper-score-unify.R          (62 lines)  shared fixtures: simdata
                                            data load, make_factor_data()
                                            DGP helper, ntdata fixture,
                                            out_base fitted object
  test-score-residuals.R       (267 lines)  S1 .score_residuals() unit
                                            tests + property invariants
                                            (P1/P2/P5/P7) + edge cases
                                            (E3/E4) --- 13 tests
  test-score-fect-cv.R         (172 lines)  S2 fect_cv regression +
                                            Section B cv.method ---
                                            6 tests
  test-score-fect-mspe.R       (333 lines)  S3 criterion + S4 cv.method +
                                            S6 weights + S7 norm.para +
                                            S8 return structure +
                                            S9 input validation +
                                            Section D simplification ---
                                            21 tests
  test-score-nevertreated.R    (867 lines)  Section C cv.method dispatch +
                                            Section E 1% selection rule +
                                            Section F W/count.T.cv +
                                            Section G integration +
                                            Section H k-fold CV ---
                                            26 tests
  test-score-bench.R           (165 lines)  Section I runtime benchmarks
                                            (2 tests, slow, skip on CRAN)
  test-score-parallel-cv.R     (820 lines)  Section G "Parallel CV Folds"
                                            (formerly the duplicate-G
                                            section in the original;
                                            disambiguated by filename) ---
                                            13 tests

Total: 81 tests across 6 split files, identical to the original 81
tests in test-score-unify.R. No tests added or removed --- pure
mechanical split.

Validation: each split file run individually via testthat::test_file()
passes clean (0 failures, 0 errors, 0 warnings beyond the same
benign max.iteration warnings from the new convergence diagnostic
that fire on small-N edge tests).

Win: progress visibility under reporter = "summary" goes from 1 line
for ~15+ min to 6 lines, each showing real-time progress as tests
in that section complete. Per-file size more uniform; easier to
locate specific tests via grep / find.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ucture + bib citations

Long-session bundle of vignette polish, estimand visual fixes, and per-type
default change. All sub-changes documented in
statsclaw-workspace/fect/runs/2026-05-03-ch7-restructure-and-coverage-investigation.md.

Estimand visual fixes (R/po-estimands.R):
- bca → bc fallback when cell-level jackknife is degenerate (n_cells < 2,
  jackknife vector all-NA). Resolves missing CIs at staggered-tail event-times.
- bca/bc → normal CI fallback when bootstrap-quantile interval is degenerate
  (cutoffs collapse to same quantile due to z0 clamping at small B with
  skewed bootstrap) or doesn't cover the point estimate. Resolves degenerate
  ci.lo == ci.hi intervals and asymmetric intervals shifted off estimate.
- Normal CI is centered at estimate by construction so the fallback always
  yields a covering interval.
- estimand() now attaches `fect_test` attribute to result data frame when
  test = "placebo" or "carryover".
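
A minimal base-R sketch of the second fallback rule above (tail interval to normal CI). The function name `ci_with_fallback` and its arguments are hypothetical and are not the code in R/po-estimands.R; it only illustrates the "degenerate or non-covering, so fall back" logic.

```r
# Hypothetical helper: return the bootstrap-quantile interval unless it is
# degenerate (NA or lower >= upper) or fails to cover the point estimate,
# in which case fall back to a normal interval centered at the estimate.
ci_with_fallback <- function(est, se, boot_ci, level = 0.95) {
  degenerate <- anyNA(boot_ci) || boot_ci[1] >= boot_ci[2]
  misses_est <- !degenerate && (est < boot_ci[1] || est > boot_ci[2])
  if (degenerate || misses_est) {
    z <- qnorm(1 - (1 - level) / 2)
    return(c(est - z * se, est + z * se))  # covers est by construction
  }
  boot_ci
}

ci_with_fallback(est = 1, se = 0.5, boot_ci = c(2, 2))      # degenerate input
ci_with_fallback(est = 1, se = 0.5, boot_ci = c(1.4, 2.1))  # does not cover est
ci_with_fallback(est = 1, se = 0.5, boot_ci = c(0.2, 1.9))  # kept as-is
```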

Per-type ci.method default change:
- att.cumu default: "percentile" → "basic" (the reflected pivot CI per
  Davison-Hinkley 1997 §5.2.1; also `boot::boot.ci(type = "basic")`).
  "percentile" preserved for replication of legacy att.cumu() byte-equality.
- AC.2/AC.3 tests updated to pass ci.method = "percentile" explicitly so
  byte-equality with legacy effect()/att.cumu() still asserts.
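
For readers comparing the two defaults, a small base-R sketch of how a "percentile" interval and a "basic" (reflected pivot) interval are built from the same bootstrap draws. The draws here are simulated toy values, not fect output.

```r
set.seed(1)
theta_hat  <- 1.2                               # toy point estimate
boot_draws <- theta_hat + (rexp(2000) - 1)      # toy right-skewed bootstrap draws

q <- quantile(boot_draws, c(0.025, 0.975), names = FALSE)

# Percentile interval: the bootstrap quantiles themselves.
percentile_ci <- c(q[1], q[2])

# Basic interval: reflect the quantiles around the point estimate.
basic_ci <- c(2 * theta_hat - q[2], 2 * theta_hat - q[1])

rbind(percentile = percentile_ci, basic = basic_ci)
```

Both intervals have the same width; with skewed draws the basic interval places the long tail on the opposite side of the estimate.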

Plot polish (R/esplot.R):
- Read fect_test attribute. When set to "placebo" or "carryover", default
  pre.color to color (uniform across all event-times); skip the gray
  pre-treatment shade since there's no pre-vs-post contrast in those modes.

Vignette restructure:
- Removed Quarto book "parts" structure (kept chapter order).
- ch2 inference subsection simplified: dropped parametric / ci.method /
  full vartype-table content; kept brief bootstrap+jackknife intro +
  parallel + clustering callouts.
- ch7 received the migrated content: parametric-regimes section + the
  three-gate system + ci.method "when to prefer" table.
- ch3 placebo-and-carryover split into two H2 subsections; setup-carryover
  chunk relocated; new panelView::panelview() chunk before the carryover
  fit shows the reversal pattern.
- ch7 Unicode math (θ̂, z₀, Φ⁻¹, ε, ρ, etc.) converted to LaTeX
  ($\hat\theta$, $z_0$, $\Phi^{-1}$, $\varepsilon$, $\rho$, etc.).
- "case bootstrap" → "unit-level (cluster) bootstrap" terminology.

Bibliography:
- 10 new entries in references.bib: efron1987, efron_tibshirani1993,
  davison_hinkley1997, diciccio_efron1996, hall1992, hesterberg2014,
  liu1988, mammen1993, cameron_gelbach_miller2008, carpenter_bithell2000.
- ch7 inline citations converted from "Efron 1987 / Davison-Hinkley 1997 /
  ..." to [@efron1987] / [@davison_hinkley1997] / ... Pandoc format.
- ch7 manual References section at end removed; references flow into
  cc-references.qmd via nocite: @* (moved from _quarto.yml to the chapter).
- _quarto.yml: link-citations: true + link-bibliography: true added; this
  produces hover-tooltip citations but cross-chapter anchor linking is
  not automatic in Quarto book mode (documented limitation).

NEWS.md + bb-updates.Rmd: per-type default att.cumu → "basic" with
literature-citation comment.

tests/coverage-study/run_para_error_coverage.R: cores back to 10 (an
over-subscription experiment with cores=20 measurably slowed runs ~2x
due to OS-level worker contention on the 20-physical-core machine).

NEW tests/coverage-study/run_gsynth_style_test.R: standalone K=500 /
nboots=200 coverage test on a quarter-scale gsynth-note-style DGP
(factor model r=2, T0/T=72%, Ntr/Nco=1:3, fixed treated). Diagnostic
script for the att × parametric coverage gap (existing fect script
shows 0.91; gsynth-note shows 0.95; goal is to identify whether the
gap is DGP-shape or estimator/inference-procedure). Run mid-session,
killed at ~30/500 reps before user ended session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lax T19

Adds a routine pre-merge coverage gate (run_minimal_coverage.R) covering
three scenarios at gsynth-note canonical settings + large-N TWFE with
bootstrap and jackknife inference paths.  Outer-loop parallelism via
future::multisession workers=16; inner fect() calls run sequentially.
~1.5 min wall on 16 cores at K=200, nboots=200.

Scenarios (no covariates, gsynth-note Xu-2017 structure):

  A:  factor r=2 IID errors      parametric / auto-empirical / 5 ci.methods
  B:  factor r=2 AR(1) rho=0.8   parametric / auto-ar        / 5 ci.methods
  C1: TWFE r=0 AR(1) rho=0.5     bootstrap (cluster)         / 5 ci.methods
  C2: TWFE r=0 AR(1) rho=0.5     jackknife / normal-only (E&T 1993 ch11)

Default K=200, nboots=200.  Companion script run_minimal_coverage_tail_rerun.R
re-runs at nboots=1000 only those scenarios whose tail-CI cells (basic /
percentile / bc / bca) come in below 0.93 -- aligns with the v2.4.2
.check_tail_ci_replicates warning (Efron 1987 §3, DiCiccio & Efron 1996 §4
recommend B >= 1000 for tail quantiles).

Coverage results at K=200, nboots=200: A and B nominal across all 5
ci.methods (0.935-0.96); C2 jackknife normal at 0.93; C1 normal at 0.945
but bc/bca/percentile dipped to 0.92-0.925.  After tail rerun on C1 at
nboots=1000: all 5 ci.methods at 0.94-0.945.  Calibration ratios
(mean SE / empirical SD) at K=200: 1.05 / 1.04 / 1.01 / 1.05 (well calibrated).
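
To put the coverage numbers above in context, a short sketch of the Monte Carlo standard error of a coverage rate estimated from K replications, using the usual binomial formula. The labels in `reported` are shorthand introduced here for three of the quoted rates.

```r
# Monte Carlo SE of an estimated coverage rate p over K replications.
coverage_mc_se <- function(p, K) sqrt(p * (1 - p) / K)

K <- 200
se_nominal <- coverage_mc_se(0.95, K)   # about 0.0154 at K = 200

# Distance from nominal 0.95, in Monte Carlo SE units.
reported <- c(C1_normal = 0.945, C1_tail_low = 0.92, C2_jackknife = 0.93)
z <- (reported - 0.95) / se_nominal
round(z, 2)
```

At K=200 a rate of 0.92 sits roughly two Monte Carlo SEs below nominal, which is the kind of gap the nboots=1000 rerun is meant to resolve.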

Also relaxes T19 threshold in run_para_error_coverage.R from 0.91 to 0.90
with rationale: at N=40 IID with no factor structure the parametric
pseudo-treated bootstrap targets V_t alone and misses Var_{Lambda,F}[b_t];
empirical SD across MC reps exceeds bootstrap SE by ~9%, yielding ~0.91
coverage analytically.  T20 keeps 0.91 (AR(1) inflates variance, empirically
0.96+).  README rewritten to lead with run_minimal_coverage as the routine
gate and document the threshold rationale; run_para_error_coverage retained
as the deep-dive characterization on the small-N IID/AR DGP.

Deletes run_gsynth_style_test.R (the morning's quarter-scale triage script;
superseded by Scenario A's full-scale gsynth-note replication).

Run log: statsclaw-workspace/fect/runs/2026-05-03-coverage-completion-and-clean-render.md
Replaces the older T19/T20-style table content with the four-scenario
minimal coverage suite (factor IID + factor AR(1) + large-N TWFE
bootstrap + jackknife) at K=200, nboots=200, plus the C1 rerun at
nboots=1000.  Adds canonical CSV records under
tests/coverage-study/results/ so the chapter has a stable, in-repo
data source rather than depending on /tmp/.

Headline numbers: A and B (factor model, parametric inference) nominal
across all 5 ci.methods; C2 (jackknife normal) at 0.93; C1 tail-CIs
at 0.92-0.935 at nboots=200 and recover to 0.94-0.945 at nboots=1000.
Calibration ratios SE/empirical-SD = 1.01-1.05 across all four
scenarios (the bootstrap/jackknife SE matches the empirical sampling
SD to within 5%).

Older T19/T20 characterization on the small N_tr=12 panel kept as a
short closing subsection with the law-of-total-variance rationale
explaining why threshold there is 0.90 not 0.95.

Re-rendered 07-inference.html (standalone).
…v2.4.2 summary table

- nboots: revert documentation claim that the default is 1000.  The
  fect default is and stays nboots = 200, calibrated for the SE-based
  normal CI (the per-type default for att, the most common estimand).
  Frame the simulation evidence: 200 is enough for nominal coverage on
  att; tail-CI methods (basic / percentile / bc / bca) drift below
  nominal at 200 and recover at 1000 (per the Scenario C1 results
  shown earlier in the chapter).  Cross-reference the
  .check_tail_ci_replicates warning gate.

- Drop external "gsynth-note" pointers in the empirical-coverage
  section; describe the DGPs in their own terms.  Drop the closing
  "Older characterization on DGP-A" subsection; the minimal coverage
  suite is now the canonical reference and the run_para_error_coverage
  README contains the deep-dive details for users who need them.

- Restructure decision tree: lead with the design-shape question
  (factor-augmented + never-treated controls + no reversal +
  ife/mc/gsynth/cfe -> parametric), then the choice between bootstrap
  and jackknife at the second level (bootstrap by default; jackknife
  only when Ntr is small enough that the bootstrap degenerates).  This
  matches the design-first narrative of v2.4.2.

- Drop "v2.4.2 inference summary" table.  The fixes it documented are
  now baked into the chapter prose; the version-specific issue-list
  framing is more appropriate for NEWS.md (where it already lives).
API change (Plan B: rename + soft-deprecate):
- New ci.method = c("normal", "basic") arg on fect() / fect.formula / fect.default
- quantile.CI = NULL sentinel; explicit use emits a soft-deprecation
  warning pointing to ci.method
- ci.method = "basic" + nboots < 1000 emits the .check_tail_ci_replicates
  warning at fit time
- ci.method in {bca, bc, percentile} hard-errors with a message pointing
  to estimand() for the full 5-method surface
- ci.method = "basic" + vartype = "jackknife" hard-errors (E&T 1993 ch11)
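
The argument rules listed above can be sketched as a standalone validator. `check_ci_method` is a hypothetical name, and the messages are illustrative; this is not the code in R/default.R.

```r
# Hypothetical validator mirroring the rules in the list above.
check_ci_method <- function(ci.method = "normal",
                            vartype   = "bootstrap",
                            nboots    = 200) {
  if (ci.method %in% c("bca", "bc", "percentile")) {
    stop("ci.method = '", ci.method,
         "' is not available in fect(); see estimand().", call. = FALSE)
  }
  ci.method <- match.arg(ci.method, c("normal", "basic"))
  if (ci.method == "basic" && vartype == "jackknife") {
    stop("ci.method = 'basic' is not supported with vartype = 'jackknife'.",
         call. = FALSE)
  }
  if (ci.method == "basic" && nboots < 1000) {
    warning("ci.method = 'basic' uses tail quantiles; consider nboots >= 1000.",
            call. = FALSE)
  }
  invisible(ci.method)
}

check_ci_method("normal")                      # passes silently
check_ci_method("basic", "bootstrap", 1000)    # passes silently
```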

Parametric basic CI fix (location-shift across all slots):
- New helpers .basic_ci_shifted / .basic_ci_shifted_one in R/boot.R
  apply a row-mean / mean-centered shift to H0-centered draws before
  the reflected pivot, so basic CIs cover the point estimate
- Refactored ~22 reflected-CI sites in boot.R: avg, att, calendar,
  cohort, balance, by-W, placebo, carryover all use the helpers
- Single .is_param flag controls when the shift fires; bootstrap
  vartype is unaffected
- fect-level basic on parametric matches estimand(., ci.method="basic")
  byte-equally at avg level (test added)
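
A toy illustration of why the shift matters, assuming draws that are centered near zero (as under H0) rather than near the estimate. The variable names are hypothetical and the helpers in R/boot.R may differ in detail.

```r
set.seed(42)
est      <- 2.0                                   # toy point estimate
draws_h0 <- rnorm(1000, mean = 0, sd = 0.5)       # toy H0-centered draws

# Reflecting unshifted draws puts the interval near 2 * est, not near est.
q_raw     <- quantile(draws_h0, c(0.025, 0.975), names = FALSE)
basic_raw <- c(2 * est - q_raw[2], 2 * est - q_raw[1])

# Shift the draws so their mean equals the estimate, then reflect.
shifted       <- draws_h0 - mean(draws_h0) + est
q             <- quantile(shifted, c(0.025, 0.975), names = FALSE)
basic_shifted <- c(2 * est - q[2], 2 * est - q[1])

rbind(unshifted = basic_raw, shifted = basic_shifted)
```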

Coverage (K=200, nboots=200, fect-direct):
  A factor IID parametric:    normal 0.96  basic 0.96
  B factor AR(1) parametric:  normal 0.96  basic 0.95
  C TWFE bootstrap:           normal 0.945 basic 0.935
  All within 1 MC SE of nominal 0.95.

Tests:
- New tests/testthat/test-ci-method-fect.R (20 tests: byte-equality,
  deprecation warnings, validation, jackknife gate, parametric basic
  byte-equality)
- Updated test-book-claims H3 (nboots default stays at 200 + warning)
- Updated test-estimand-parametric-cifix P-INV-4 (bc safety fallback)
- Wrapped test-estimand-parametric att.cumu/aptt in suppressWarnings
  (defaults to tail-CI methods that warn at nboots=30)

Docs:
- ch7 §7.3 opening describes both fect() (2-method) and estimand()
  (5-method) paths and the location-shift fix
- §7.6 simplified with two coverage summary tables
- "MC SE" -> "Monte Carlo SE" globally
- NEWS.md + bb-updates v2.4.2 entries

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R CMD check --as-cran flagged non-ASCII characters in R/default.R
(WARNING).  All occurrences were in comments and one warning() string
added in the v2.4.2 ci.method work: U+00A7 section sign and one U+2014
em-dash.  Replaced with "Sec. " and "---" respectively.

Status now: 1 NOTE (down from 1 WARNING + 2 NOTEs).  The remaining
NOTE is the standard CRAN incoming feasibility entry (recent update +
HonestDiDFEct on GitHub), both acceptable per CRAN policy with the
installation pointer in Description.

Date bumped 2026-05-02 -> 2026-05-04.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Source:
- R/default.R: warn at fit time when vartype="jackknife" + non-NULL cl
  (cl is silently ignored; jackknife is leave-one-unit-out only).
- R/plot.R: drop redundant size= on geom_rect count band; add linewidth=
  to geom_pointrange (calendar plot). Removes ggplot2 >= 3.4 size-on-line
  deprecation warning on plot(type="box") and type="calendar".

Tests:
- tests/testthat/test-ci-method-fect.R: 3 new tests for the jackknife+cl
  warning (fires correctly, doesn't fire on jackknife-alone or
  bootstrap+cl).

Vignettes:
- rscript/: regenerated all chapter scripts via knitr::purl against
  current chapter numbering; deleted stale 03/04/05/06/07/08/09 named
  files; added 03-estimands.R (was missing). Download links rewritten
  in 8 chapters to point at new filenames.
- ch3 (Alternative Estimands): removed internal jargon (dispatcher,
  accessor, surface, ship, functional, byte-identical, tidy schema,
  composes) in favor of plain English. Added prose note on the BCa-
  tail-CI warning at low nboots.
- ch4 (IFE/MC): bumped max.iteration to 20000 on placebo/carryover/
  carryover.rm IFE fits (default 5000 was too tight when test cells
  are dropped from estimation; placebo IFE converges at niter=8548).
  Added prose note explaining the choice.
- ch11 (Plot Options): swapped factors/loadings/loading-overlap demos
  from hh2019 (no known factor structure) to simdata (true r=2),
  giving the plots real ground-truth signal to interpret. Dropped
  the max.iteration override since simdata + IFE r=2 (no test cells)
  converges within default cap.
- ch1, ch2, ch7, index: minor prose / cross-ref tidies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.4.2 completion: BCa CI + wild bootstrap + log.att hard-stop (closes #131)
## Summary

- Trims `quiet_nonpara`'s closure env inside `fect_boot()` so
foreach/future export ships ~7 MiB per worker instead of 728 MiB.
Reported by external user Álvaro Fernández Junquera on dev v2.4.2 (IFE +
parallel + nboots=1000 tripped the 500 MiB `future.globals.maxSize`
default).
- Belt-and-braces: locally bump `future.globals.maxSize` to
`max(user_set, 2 GiB)` with `on.exit` restore. Honours any larger user
cap.
- Universal across all three `one.nonpara` definitions and all methods
(gsynth/ife/mc/cfe); IFE hit it first because its frame is largest. Bug
existed because the inner wrapper replaced the already-trimmed
`one.nonpara` as the iter function, undoing the L1600 trim.

## Diff

Two surgical inserts in `R/boot.R`, inside `if (do_parallel_boot)`.
`codetools::findGlobals` confirms the wrapper body only references
`one.nonpara` and `boot.seq` — `suppress*` calls resolve via the parent
env chain, so the trim is safe.
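
The effect of trimming a closure environment can be reproduced in base R. This is a sketch under assumptions: `make_wrapper` and `trim_env_sketch` are hypothetical stand-ins, not the `trim_closure_env` helper in the package, and the serialized size is used as a rough proxy for what a parallel backend would export.

```r
make_wrapper <- function() {
  big <- numeric(1e6)                     # stand-in for a large calling frame
  one.nonpara <- function(i) i^2          # toy per-replicate function
  environment(one.nonpara) <- baseenv()   # pretend it was already trimmed
  boot.seq <- 1:10
  function(j) one.nonpara(boot.seq[j])    # wrapper captures the whole frame
}

# Rebuild the closure's environment with only the names it needs.
trim_env_sketch <- function(f, keep) {
  old <- environment(f)
  new <- new.env(parent = globalenv())
  for (nm in keep) assign(nm, get(nm, envir = old), envir = new)
  environment(f) <- new
  f
}

wrapper <- make_wrapper()
trimmed <- trim_env_sketch(wrapper, c("one.nonpara", "boot.seq"))

size_before <- length(serialize(wrapper, NULL))   # includes `big`
size_after  <- length(serialize(trimmed, NULL))   # only the two kept objects
c(before = size_before, after = size_after)
```

The trimmed wrapper returns the same values as the original, since it still sees both names it references.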

| file | change |
| --- | --- |
| `DESCRIPTION` | 2.4.2 → 2.4.3; Date 2026-05-14 |
| `NEWS.md` | new `# fect 2.4.3` header, 2-bullet entry |
| `R/boot.R` | 14 lines: trim_closure_env + maxSize bump |
| `tests/testthat/test-cv-parallel.R` | +100 lines, Section B (B.1 + B.2) |
| `vignettes/bb-updates.Rmd` | v2.4.3 changelog entry |


## Test plan

- [x] `R CMD INSTALL --no-inst --preclean .` clean
- [x] `test-cv-parallel.R` isolated run: pass=77 fail=5 (the 5 fails are
pre-existing banner-string tests S.1/S.2/T.2/M.2/C.2, confirmed by
baseline stash + rerun)
- [x] B.1: IFE bootstrap on simdata under a tight 50 MiB pre-block cap —
no `future.globals.maxSize` warning fires
- [x] B.2: structural contract — `trim_closure_env` applied to a
`quiet_nonpara`-shaped wrapper keeps ONLY {one.nonpara, boot.seq}, size
<1 MiB
- [x] All other boot-related test files green
- [x] Reviewer to confirm closure-env trim is safe under all four method
branches (the wrapper is shared)
## Summary

v2.4.4 (development): adds `$sample` and `$data.long` slots to
`fect()`'s return value so **panelView** can shade the cells the
estimator actually used, plus vignette + test-summary polish.

## Slots

| Slot | Type | Purpose |
|---|---|---|
| `$sample` | logical $T \times N$ (same dims as `$Y.dat`) | Cells used in any part of the procedure (main fit + placebo / carryover / balance tests). Derived as `obs.missing %in% c(1L, 2L, 5L)`. |
| `$data.long` | data.frame, `c(index, Yname, Dname)` | The original long-format input. Lets `panelview(fit)` reconstruct the full pre-drop panel --- without it, always-treated and all-NA units fect silently drops would be invisible in the overlay (and that's the band the overlay exists to show). |
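
A toy sketch of the `$sample` derivation on a made-up status matrix. The integer codes below are placeholders whose meanings are not spelled out here; note that `%in%` drops dimensions, so the sketch restores them explicitly (how the package itself does this is an assumption).

```r
# Toy 3 x 2 status matrix (rows = periods, columns = units).
obs.missing <- matrix(c(1L, 2L, 0L,
                        5L, 3L, 1L), nrow = 3, ncol = 2)

# %in% returns a plain vector, so rebuild the matrix shape afterwards.
sample_mat <- array(obs.missing %in% c(1L, 2L, 5L), dim = dim(obs.missing))

sample_mat
```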

Usage from the panelView side:

```r
library(fect); library(panelView)
data(hh2019)
fit <- fect(nat_rate_ord ~ indirect, data = hh2019,
            index = c("bfs", "year"),
            method = "ife", r = 0, se = FALSE, CV = FALSE,
            min.T0 = 2)
panelview(fit, type = "treat", by.timing = TRUE,
          axis.lab = "off", display.all = TRUE,
          gridOff = TRUE, xlab = "", ylab = "")
```

## Test-summary cleanup (576 → 0 warnings)

The tail-CI warnings from `estimand()` and `fect()` (Efron 1987 /
DiCiccio & Efron 1996 floor: `nboots >= 1000`) are now suppressed when
`Sys.getenv("TESTTHAT") == "true"`, which is the case inside a testthat
run. End users in any non-test context still see the warning unchanged.

Tests that intentionally assert the warning wrap their
`expect_warning()` in `withr::with_envvar(c(TESTTHAT = "false"), ...)`.
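
The gating pattern can be sketched in base R. `warn_if_few_boots` is a hypothetical function and its message is illustrative; only the environment-variable check follows the description above.

```r
# Hypothetical advisory: warn on low nboots unless running under testthat.
warn_if_few_boots <- function(nboots) {
  in_tests <- identical(Sys.getenv("TESTTHAT"), "true")
  if (nboots < 1000 && !in_tests) {
    warning("tail-CI methods are better served by nboots >= 1000",
            call. = FALSE)
  }
  invisible(nboots)
}

warn_if_few_boots(1000)   # silent in any context
```

A test that wants to assert the warning can temporarily set `TESTTHAT` to `"false"` around the call, which is what the `withr::with_envvar()` wrapper does.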

Nine pre-existing "EM did not converge within max.iteration = 5000"
warnings on small parametric-bootstrap fixtures are wrapped in
`suppressWarnings()` at the fit call site, with a one-line comment
explaining what is being suppressed. Those fixtures are intentionally
ill-conditioned for speed.

`DESCRIPTION` now lists `withr` under Suggests.

Result: `devtools::test()` PASS, **0 warnings**, 0 failures (was 576
nboots-advisory warnings).

## Vignette polish

- **New ch1 version table** (CRAN / GitHub `master` / GitHub `dev`)
mirroring panelView's format. Replaces the live `available.packages()`
lookup chunk.
- **ch6 section reorder**: Inference now precedes Additional Notes
(which becomes the last section before How to Cite).
- **ch11 `sec-sample`**: split the fect-fit and `panelview()` chunks;
widen the plot (`fig.width = 12`, `fig.height = 8`, `out.width =
"100%"`); add `gridOff = TRUE`; clear axis titles via `xlab = "", ylab =
""`.
- **bb-updates changelog**: strip all `@sec-*` cross-refs (fragile
across renames); drop "(development)" from the v2.4.4 header.

## Validation

- [x] `devtools::test()` (parallel = TRUE, TESTTHAT_CPUS = 10): PASS, 0
warnings, 0 failures.
- [x] `R CMD check --as-cran` (no tests): 0 errors / 0 warnings / 1 NOTE
(timestamp, benign).
- [x] Clean Quarto book re-render: 15/15 HTML, 0 errors.
- [x] `panelview(fit)` end-to-end on `fect::hh2019` and Acemoglu d2:
renders correctly.

## Companion changes

- **panelView v1.3.3** (PR xuyiqing/panelView#12, merged into `dev`):
consumes `$sample` + `$data.long` via new `panelview(fit)` direct-call
dispatch.
#139) (#140)

Closes #139.

## What

Bundles three pieces into v2.4.5:

### 1. New `group.fe` argument (the headline feature)

Discoverable surface for absorbing additive fixed effects above the unit
level. Canonical use case from #139: county-time panel where treatment
varies at the state level, user wants state FE (not county FE) plus time
FE.

```r
fect(Y ~ D, data = df,
     index    = c("county", "time"),
     group.fe = "state",
     force    = "time")          # no county FE; state FE + time FE
```

Multi-column accepted (`group.fe = c("state","region")`). Cluster SE
auto-defaults to `group.fe[1]` (pass `cl = FALSE` to suppress). `method
= "fe"` (default) silently routes to `method = "cfe"` (identical result;
FE is a subset of CFE). `method = "ife"`/`"mc"`/`"both"`/`"gsynth"`
hard-error with guidance to use `method = "cfe", r = N` explicitly for
free latent factors with group-level FE.

Internally, `group.fe` normalizes to the existing extra-FE pipeline used
for the third and later elements of `index` (`X.extra.FE` array,
`extra_FE_index_cache` in `src/cfe_sub.cpp`). It adds no new estimator
machinery, only a discoverable API name for what was already there.

### 2. Fix: `method = "cfe"` with `force = "time"` / `"unit"`

A pre-existing bug: `complex_fe_ub` in `src/cfe_sub.cpp` assembled the
result list with unconditional `result["alpha"] = ...` / `result["xi"] =
...` reads, but the inner `Demean()` only writes those when the
corresponding FE is active. Errored with `Index out of bounds:
[index='alpha']` or `[index='xi']`. Fix: match `Demean()`'s conditional
writes. `force = "two-way"` (the only previously-working case) is
byte-equivalent.

This fix is the precondition for `group.fe` to actually work in Bernie's
"state FE only, no county FE" pattern.

### 3. Hygiene: remove vestigial `sfe` + orphan `R/polynomial.R`

`sfe` was plumbed through the public signature but reached no live code
in any user-facing method (it dispatched only to `fect_polynomial()`,
which is unreachable: `method = "polynomial"` is not in the user-facing
whitelist and `permutation.R:213` explicitly errors on it). Deleting
both is pure surface cleanup with no behavior change. Reverse-dep risk:
negligible (arg was non-functional).

### 4. Stricter nesting check on extra-FE columns

Both the new `group.fe` and the legacy `index =
c("unit","time","extra")` form now hard-error if `extra` varies within
`unit`. Previously the column was reshaped silently and produced
incorrect fits.
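
A minimal sketch of what such a nesting check could look like, assuming a long-format data frame. `check_nested` and the toy column names are hypothetical and do not reproduce the package's validation code.

```r
# Hypothetical check: error if `extra` takes more than one value within a unit.
check_nested <- function(data, unit, extra) {
  n_levels <- tapply(data[[extra]], data[[unit]],
                     function(x) length(unique(x)))
  bad <- names(n_levels)[n_levels > 1]
  if (length(bad) > 0) {
    stop("'", extra, "' varies within ", unit, ": ",
         paste(bad, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

nested <- data.frame(county = c("a", "a", "b", "b"),
                     state  = c("X", "X", "Y", "Y"))
check_nested(nested, unit = "county", extra = "state")   # passes silently
```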

### 5. `print(fit)` shows the model spec explicitly

```
Estimator:    fe
Fixed effects: time (time) + state
Cluster SE:   state
```

## Commits

- 2186554 chore(api): remove vestigial sfe + delete orphan
R/polynomial.R
- 1ae7c6d fix(cfe): conditional alpha/xi assignment in complex_fe_ub
result list
- 3195865 feat(api): add group.fe for additive simple FE on CFE (closes
#139)
- 208b27e feat(api): print FE composition + cluster SE + estimator line
- 16bef90 test(group-fe): add coverage for group.fe API and validation
paths
- 2aa35ed docs(vignette): higher-level FE section + cheatsheet group.fe
row
- 4a10d9d chore(release): bump Version to 2.4.5 + Date

## Test plan

- [x] All 22 new tests in `tests/testthat/test-group-fe.R` pass
- [x] Targeted regression: book-claims, sample-slot, cv-parallel,
factors-from-refactor — exit 0
- [ ] Full `tests/testthat/` suite — running in background, please wait
for it
- [ ] `R CMD check --as-cran`
- [ ] revdep check before CRAN submission (probably nothing depends on
`sfe`)

## Design memo

Full design + audit history at
`statsclaw-workspace/fect/runs/2026-05-21-higher-level-fe.md` (private
workbench repo).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Yiqing Xu <7664920+xuyiqing@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add tests/testthat/test-cran.R (core fe fit + print.fect) and gate
tests/testthat.R so CRAN runs only that file (NOT_CRAN unset and not on
GitHub Actions); the full regression suite still runs locally and in CI.
Cuts CRAN test time ~132s -> ~2s.
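
The gate described above can be sketched as a small predicate. `looks_like_cran` is a hypothetical helper, and treating anything other than `"true"` as "unset" is an assumption of this sketch; the actual logic in tests/testthat.R may differ.

```r
# Hypothetical predicate: TRUE when NOT_CRAN is not "true" and the run is
# not on GitHub Actions (which sets GITHUB_ACTIONS = "true").
looks_like_cran <- function(env = Sys.getenv(c("NOT_CRAN", "GITHUB_ACTIONS"))) {
  !identical(env[["NOT_CRAN"]], "true") &&
    !identical(env[["GITHUB_ACTIONS"]], "true")
}

# A test runner could use the result to choose a file filter, for example
# passing it as `filter` to testthat::test_check(); NULL means "run all".
test_filter <- if (looks_like_cran()) "^cran$" else NULL
```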

Docs moved to the Quarto book under vignettes/; the pkgdown site is
retired. Remove _pkgdown.yml and pkgdown/build.R, plus the 624 KB
pre-rendered NEWS.html (regenerated from NEWS.md). All were already in
.Rbuildignore, so the CRAN tarball is unchanged.

Refresh DESCRIPTION: bump Date to 2026-05-29 and list the maintainer
(Yiqing Xu) first in Authors@R.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- install table (01-start): CRAN / master / dev all at 2.4.5, dated 2026-05-30 (CRAN publication date)
- changelog (bb-updates): v2.4.5 entry tagged "(2026-05-30) CRAN release."
- citation + BibTeX note (index): bumped v2.4.1 -> v2.4.5

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@xuyiqing xuyiqing merged commit eccc973 into master May 30, 2026
3 checks passed