…er_fe_mc
Adds optional `Rcpp::Nullable<Rcpp::NumericMatrix> fit_init = R_NilValue` parameter at the end of the IFE/MC C++ entry points and the inner EM workhorse functions. When non-null and shape matches Y, the EM loop seeds `fit` from it instead of the default `fit = Y0` (cold-start). NULL (default) preserves pre-2.4.2 behavior exactly.
Both `fit = Y0` initialization sites in fe_ad_inter_iter and fe_ad_inter_covar_iter are routed through the warm_init dispatch (initial init AND post-burnin restart under use_weight=1).
Sites touched:
- src/ife_sub.cpp: fe_ad_inter_iter, fe_ad_inter_covar_iter
- src/ife.cpp: inter_fe_ub (forwards to both inner functions)
- src/mc.cpp: inter_fe_mc (forwards to fe_ad_inter_iter via mc=1)
- src/fect.h: forward declarations updated
CFE warm-start (cfe_iter / complex_fe_ub) deferred to v2.5.0 --- the additional gamma/kappa FE structures complicate the conditional init logic and the marginal coverage gain is small.
Statistical justification (Yiqing 2026-05-01): the prediction surface Y_hat = F * Lambda^T is the unique global minimizer (Eckart-Young for unconstrained low-rank; convexity for nuclear-norm MC; convexity of the entropy-regularized projection for simplex bound). Factorization is non-unique by rotation, but Y_hat is uniquely identified --- so warm-starting won't bias the bootstrap distribution of any estimand. Full reasoning: statsclaw-workspace/fect/ref/warm-start-audit-2026-05-01.md.
This commit is the engine-side plumbing only. R wrappers and boot.R activation come in subsequent commits.
Existing test suite (estimand + book-claims subset, 196 expectations) passes unchanged --- by construction since fit_init = NULL throughout.
Refs: v2.4.2 release plan; statsclaw-workspace/fect/runs/2026-05-01-v242-inference-infra.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `fit.init = NULL` parameter to fect_fe() and fect_mc() R wrappers; threads it through to the inter_fe_ub / inter_fe_mc C++ calls as the `fit_init` argument. NULL preserves pre-2.4.2 cold-start behavior; bootstrap callers will pass the main-fit prediction surface (boot.R activation in next commit).
Note: the r.cv = 0 sub-fit inside fect_fe (no factors, no EM) explicitly passes NULL even when fit.init is provided --- the warm-start matrix is shaped for a factor-model EM, not a no-factor additive-FE solver.
Scope:
- IFE: fect_fe (R/fe.R)
- MC: fect_mc (R/mc.R)
Deferred (separate commit or later release):
- GSC path via fect_nevertreated (.estimate_co helper); needs centering-aware fit.init handling
- CFE path via fect_cfe (deferred to v2.5.0 with cfe_iter)
Tests: existing suite passes (fit.init = NULL throughout, behavior unchanged).
Refs: v2.4.2 release plan; runs/2026-05-01-v242-inference-infra.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous message "F-test Failed. The estimated covariance matrix is singular." conflated two distinct inferential statements: (a) the test failed to reject the null, vs (b) the test could not be computed (singular covariance). Standard hypothesis-test usage reserves "Failed to reject" for case (a). Case (b) is a numerical-failure message and should be worded accordingly. Two occurrences in R/diagtest.R updated. Refs: Yiqing 2026-05-01 hypothesis-test-language feedback; test-language memory feedback_test_language_failed.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…1000
ci.method enum: c("basic", "percentile") -> c("basic", "percentile", "bc", "normal").
New methods:
- "bc": bias-corrected percentile (Efron 1987 minus acceleration).
z0 = Phi^-1(P(boot < est)); cutoffs shift by 2*z0 to compensate
for bootstrap-median bias relative to the point estimate. Free
(just one Phi^-1 call); uniform across all vartype values.
- "normal": Wald (theta +- z * SE). What fit$est.att already uses
internally; textbook for jackknife.
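The "bc" rule above can be sketched in a few lines of base R. This is an illustration of the formula as described (z0 from the share of replicates below the point estimate, cutoffs shifted by 2*z0), not the package's `.compute_ci()` implementation; the function name `bc_ci` and its arguments are hypothetical.

```r
# Hypothetical sketch of a bias-corrected (bc) percentile interval:
# Efron's BCa with the acceleration term set to zero.
# `boot` is a numeric vector of bootstrap replicates, `est` the point estimate.
bc_ci <- function(boot, est, level = 0.95) {
  alpha <- 1 - level
  # z0 measures how far the point estimate sits from the bootstrap median
  z0 <- qnorm(mean(boot < est))
  # shift both normal cutoffs by 2 * z0, then map back to probabilities
  p_lo <- pnorm(2 * z0 + qnorm(alpha / 2))
  p_hi <- pnorm(2 * z0 + qnorm(1 - alpha / 2))
  unname(quantile(boot, probs = c(p_lo, p_hi)))
}

set.seed(1)
boot <- rnorm(1000, mean = 0.3, sd = 0.1)
ci_centered <- bc_ci(boot, est = median(boot))  # z0 = 0: same as percentile
ci_shifted  <- bc_ci(boot, est = 0.35)          # est above the median: cutoffs move up
ci_outside  <- bc_ci(boot, est = max(boot) + 1) # est outside the bootstrap support
```

In this sketch, a point estimate outside the bootstrap support gives z0 = Inf, so both cutoffs map to the same quantile and the interval collapses to a single point.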
Per-type ci.method defaults via NULL trigger (ci.method = NULL):
- att -> "normal" (matches what fit$est.att actually contains;
corrects v2.4.1's mislabeled "basic" passthrough)
- att.cumu -> "percentile" (matches what att.cumu() does internally)
- aptt -> "bc" (ratio estimator; basic CI flips off the point
under bootstrap skew)
- log.att -> "bc" (log estimator; same skew rationale)
Existing v2.4.1 explicit ci.method values still work; the new default
only fires when ci.method is omitted.
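A minimal sketch of the NULL-trigger dispatch described above, assuming the per-type defaults listed in this commit. The helper name `resolve_ci_method` is hypothetical and is not taken from the package source.

```r
# Hypothetical resolver: an explicit ci.method wins; NULL falls back to the
# per-type default listed in this commit message.
resolve_ci_method <- function(type, ci.method = NULL) {
  allowed  <- c("basic", "percentile", "bc", "normal")
  defaults <- c(att = "normal", att.cumu = "percentile",
                aptt = "bc", log.att = "bc")
  if (is.null(ci.method)) {
    if (!type %in% names(defaults)) stop("unknown estimand type: ", type)
    return(unname(defaults[type]))
  }
  # match.arg() errors on anything outside the enum
  match.arg(ci.method, allowed)
}

resolve_ci_method("att")                 # default for att
resolve_ci_method("log.att")             # default for log.att
resolve_ci_method("att", "percentile")   # explicit value is kept
```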
Implementation: new internal helper .compute_ci() centralizes all four
methods. Refactored call sites: .estimand_att_overall,
.compute_aptt_event_time, .compute_log_att_event_time. att fast path
updated: triggers on ci.method == "normal" instead of "basic" so
default att queries hit the byte-equality passthrough from fit$est.att.
nboots default 200 -> 1000 in R/default.R (all three signatures
fect / fect.formula / fect.default), plus the misspecified-nboots
error message updated. Existing scripts that pass nboots = ...
explicitly are unaffected.
Note: bc CI may degenerate (ci.lo == ci.hi) when the point estimate
is far outside the bootstrap distribution --- this is the cell-drop
pathology in log.att / aptt; addressed by the hard-error in the
next commit.
Refs: statsclaw-workspace/fect/ref/v242-vartype-cimethod-design.md;
runs/2026-05-01-v242-inference-infra.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t / aptt
When a cell used in the point estimate has Y0_b <= 0 in a non-trivial fraction of bootstrap replicates, log(Y0_b) is undefined and colMeans(..., na.rm = TRUE) silently averages over fewer cells in that replicate. The point estimate uses N cells; bootstrap averages may use N, N-1, ..., 0 cells. This breaks the basic bootstrap principle that the resampled estimator should compute over the same data structure as the point estimator, contaminating the bootstrap distribution and yielding meaningless inference.
v2.4.2 detects this and hard-errors with actionable guidance:
1. Filter out unstable cells via `cells = ~ Y0_hat > <threshold>`
2. Transform the outcome before fect: log(Y + c)
3. Use a different estimand (att doesn't have this pathology)
Two detection paths:
- log.att (.compute_log_att_event_time): trigger when the WORST cell has Y0_b <= 0 in > 5% of replicates. Sub-threshold dropping is tolerated as benign (small noise at the bootstrap-distribution scale).
- aptt (.compute_aptt_event_time): trigger when E(Y0_b) ~ 0 (within 1e-10) in any replicate, since the APTT denominator blows up there.
The 5% threshold is a heuristic; can be made tunable via an argument in a follow-up release if user feedback warrants.
Also reverts an inadvertent suppressWarnings around the inner log(Y0_b_ok) call --- with the hard-error gate, suppressWarnings is no longer needed (any surviving log call has positive args).
Refs: statsclaw-workspace/fect/ref/v242-vartype-cimethod-design.md hard-error section; runs/2026-05-01-v242-inference-infra.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
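The log.att detection rule (worst cell has Y0_b <= 0 in more than 5% of replicates) can be illustrated as follows. This is a sketch under assumptions: the helper name `check_cell_drop`, the cells-by-replicates matrix layout, and the error wording are all hypothetical.

```r
# Hypothetical helper: Y0_b is a cells x replicates matrix of bootstrap
# counterfactual predictions for the cells used in the point estimate.
check_cell_drop <- function(Y0_b, threshold = 0.05) {
  # share of replicates in which each cell would be dropped by log()
  drop_frac <- rowMeans(Y0_b <= 0)
  worst <- max(drop_frac)
  if (worst > threshold) {
    stop(sprintf("worst cell has Y0_b <= 0 in %.1f%% of replicates (limit %.0f%%)",
                 100 * worst, 100 * threshold), call. = FALSE)
  }
  invisible(drop_frac)
}

# 3 cells, 100 replicates: one cell is non-positive in 3% of replicates
benign <- matrix(2, nrow = 3, ncol = 100)
benign[1, 1:3] <- -1
frac_ok <- check_cell_drop(benign)   # tolerated: below the 5% threshold

# a second cell is non-positive in 10% of replicates: the gate fires
unstable <- benign
unstable[2, 1:10] <- -1
msg <- tryCatch(check_cell_drop(unstable), error = function(e) conditionMessage(e))
```

Gating on the worst cell rather than the average keeps a single badly behaved cell from hiding behind many stable ones.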
New file `tests/testthat/test-estimand-ci-methods.R` (10 tests, 36
expectations) covers the v2.4.2 ci.method extensions:
- CI.1: enum accepts basic / percentile / bc / normal
- CI.2: NULL trigger gives per-type defaults (att=normal etc)
- CI.3: normal CI is symmetric around the point estimate
- CI.4: bc collapses to percentile when bootstrap median = point
- CI.5: bc shifts cutoffs when bootstrap is biased relative to point
- CI.6: vartype column reports fit-time vartype regardless of ci.method
- CI.7: log.att on simdata triggers cell-drop hard-error
- CI.7b: log.att works on a positive-Y panel
- CI.8: fect() / fect.formula / fect.default nboots default = 1000
Updated existing fixtures:
- `tests/testthat/test-estimand-log-att.R`:
- .fit_positive_Y(): bumped intercept 0.5 -> 2.0 and reduced
noise sd 0.2 -> 0.1 to keep Y comfortably positive (so the
new cell-drop hard-error doesn't fire on benign bootstrap noise).
- LA.4 now expects the hard-error instead of a warning.
- `tests/testthat/test-estimand-parametric.R`:
- log.att-on-parametric test now expects the hard-error
(sim_linear has many negative Y cells; v2.4.1's silent warning
was producing meaningless inference).
All 6 estimand test files pass: 33 tests, 97 expectations, 0 failures.
Refs: statsclaw-workspace/fect/runs/2026-05-01-v242-inference-infra.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…workers
Per Yiqing 2026-05-01: parametric bootstrap (and any parallel path
that uses mvtnorm / future / etc.) emitted N warnings ("package
'mvtnorm' was built under R version X.Y.Z") --- one per worker on
first package use --- because workers loaded packages in user-visible
contexts.
Two-pronged fix:
1. `.fect_make_future_cluster()` now runs `clusterEvalQ` at cluster
creation to pre-load mvtnorm / future / future.apply / doParallel /
foreach via `requireNamespace(quietly = TRUE)` inside
suppressPackageStartupMessages + suppressWarnings. This loads
packages SILENTLY at startup so subsequent worker code doesn't
re-trigger the version-mismatch warning.
2. New helper `.fect_with_quiet_pkg_warnings(expr)` wraps an
expression with a `withCallingHandlers` that targets ONLY the
"was built under R version" warning class (other warnings pass
through normally). Used at the boot.R parametric path's
`future_lapply` call as belt-and-suspenders.
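The targeted handler in item 2 might look like the sketch below. It assumes the filter matches on the warning's message text; the function name `with_quiet_pkg_warnings` is hypothetical and the package's `.fect_with_quiet_pkg_warnings()` may differ in detail.

```r
# Hypothetical sketch: muffle only the "was built under R version" warning
# and let every other warning propagate to the caller unchanged.
with_quiet_pkg_warnings <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("was built under R version", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Demo: record which warnings escape the wrapper.
seen <- character(0)
res <- withCallingHandlers(
  with_quiet_pkg_warnings({
    warning("package 'mvtnorm' was built under R version 4.4.1")
    warning("something else")
    42
  }),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

A calling handler that returns without invoking a restart passes the condition on to outer handlers, which is why unrelated warnings stay visible.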
Refactored fect() and did_wrapper() to route through
`.fect_make_future_cluster()` instead of `future::multisession`
directly, so all parallel paths benefit from the silent pre-load.
CV (cv.R), nevertreated (fect_nevertreated.R), and the bootstrap
foreach loop in boot.R already use the helper, so they automatically
benefit. fittest.R + permutation.R use `parallel::makeCluster()`
directly --- can route through the helper in a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The book-claims test asserted fect()'s nboots default is 200, which was true through v2.4.1. v2.4.2 raises the default to 1000 (improves percentile/bc tail estimates). Update the assertion to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…TURE
- DESCRIPTION: Version 2.4.1 -> 2.4.2; Date 2026-04-29 -> 2026-05-01
- NEWS.md: full v2.4.2 entry covering ci.method extensions (bc + normal + per-type defaults), cell-drop hard-error in log.att / aptt, nboots default raised 200 -> 1000, internal warm-start infrastructure (not user-visible; activation deferred to v2.5.0 with empirical justification), diagtest message reword, parallel-worker package warning suppression
- vignettes/bb-updates.Rmd: same changelog entry, condensed for the user-facing book
- man/estimand.Rd: ci.method default updated NULL with per-type default explanation
- R/po-estimands.R: roxygen comment for ci.method updated to match
- ARCHITECTURE.md: version 2.4.1 -> 2.4.2; po-estimands.R row reflects ci.method extensions and hard-error addition; estimand() function-table row updated. Manual patches noted; full scriber regen deferred until next material module change.
Validation gates run on this branch:
- Full testthat suite: 1190 expectations, 0 failures, 0 errors
- estimand subset: 6 files, 33 tests, 97 expectations passing
- ci-methods test file (new): 10 tests, 36 expectations passing
- Coverage simulation (200 sims, all 4 ci.methods on att/overall): basic 92.5%, bc 91.0%, normal 91.5%, percentile 91.0% (nominal 95%; all within MC noise band; details in statsclaw-workspace/fect/ref/v242-coverage-study/findings.md)
- Quarto book: rendered cleanly
- R CMD check: blocked by a pre-existing macOS cp/xattr issue unrelated to v2.4.2 changes; CI on Ubuntu will verify on PR
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.4.2: inference infrastructure (ci.methods + cell-drop hard-error + warm-start C++ infra)
…positive
Vignette 03-estimands.Rmd's setup-estimand chunk uses df$Y <- exp(0.5 + 0.05*time + 0.3*D + rnorm(0, 0.2)) which can dip near zero in some cells. With v2.4.2's new hard-error on bootstrap cell-drop pathology in log.att / aptt, the est-log-att chunk can fail to render when bootstrap Y0_b crosses zero in > 5% of replicates for some cell --- which depends on RNG state and is non-deterministic across renders.
Bump intercept to 2.0 and reduce noise sd to 0.1, so Y is comfortably positive (~e^2 ~ 7.4 minimum, far from zero). The hard-error never fires on this DGP.
Also updated the prose in the Log-scale ATT section to describe the new hard-error behavior (was: "A single warning per call reports how many cells were excluded"; now: explains the v2.4.2 hard-error and the actionable options).
Local Quarto book render: 14 HTML / 176 SVG / 0 errors (matches the v2.4.1 baseline counts). testthat suite passes per the latest gate run on dev HEAD.
This is a robustness fix --- the v2.4.2 vignette on dev happens to render cleanly because of lucky RNG state, but is fragile to seed changes. This branch makes it deterministically robust.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `test = c("none", "placebo", "carryover")` argument to estimand().
Closes issue #131 (ajunquera, "Pre-treatment estimates for APTT
estimand").
When `test = "placebo"`:
- Aggregates the requested estimand (att / aptt / log.att) over
pre-treatment cells in `fit$placebo.period` (the cells masked-and-
imputed during the placebo fit)
- Requires `placeboTest = TRUE` at fit time; otherwise hard-errors
with actionable refit guidance
- Auto-sets `direction = "on"`
When `test = "carryover"`:
- Aggregates over early post-reversal cells in `fit$carryover.period`
- Requires `carryoverTest = TRUE` + `hasRevs == 1`; otherwise
hard-errors
- Auto-sets `direction = "off"`
`type = "att.cumu"` is rejected with a clear error when `test != "none"`,
since cumulative semantics are defined relative to treatment onset.
Implementation:
- New `.test_cell_mask(fit, test, direction)` internal helper builds
the cell-level base mask + returns Tev for downstream aggregation
- `.estimand_att`, `.estimand_aptt`, `.estimand_log_att`,
`.estimand_att_overall` accept `test` argument and forward to
`.test_cell_mask`
- New `.estimand_att_event_time` slow path for att with test != "none"
(the fast path requires test == "none" since fit$est.att passthrough
is byte-equality only for the standard surface)
- Per-event-time att with `test = "placebo"` is byte-identical to
`fit$est.att` rows at placebo event times (asserted by test PC.6)
Tests: 10 tests, 19 expectations in
`tests/testthat/test-estimand-placebo-carryover.R`. All pass on the
positive-Y synthetic panel and on simdata. Includes:
- PC.1: argument validation
- PC.2/PC.3: standard fit + test=placebo/carryover hard-errors
- PC.4: placebo aptt returns rows in fit$placebo.period range
- PC.5: placebo att works; placebo log.att hard-errors on simdata
(cell-drop pathology from v2.4.2 fires; expected behavior)
- PC.6: att, test=placebo byte-equality with fit$est.att rows
- PC.7: att.cumu + test=placebo rejected
- PC.8: carryover aptt on reversal panel
- PC.9: test=placebo silently uses direction='on'
- PC.10: vartype column reports method actually used at fit time
Refs: parked branch feat/v242-estimand-placebo-carryover (db11630)
documented the original design; this commit rebuilds the placebo /
carryover surface cleanly on top of v2.4.2's ci.method changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…v2.4.2)
Closes the v2.4.2 inference work with three additions and a roll-up of the changelog:
* ci.method = "bca": full Efron 1987 (z0 + acceleration via cell-level jackknife). Wired into att / aptt / log.att dispatcher paths. Defaults updated: aptt -> "bca", log.att -> "bca" (was "bc"; bc collapsed at the boundary when point estimate fell outside the bootstrap support).
* vartype = "wild": unit-level Rademacher wild bootstrap (Liu 1988; Mammen 1993; CGM 2008). Keeps unit composition fixed; perturbs main-fit residuals on D=0/I=1 cells. Continuous outcomes only; fe / ife / mc / gsynth / cfe. Storage layout matches "bootstrap"; estimand() consumes wild fits transparently.
* log.att hard-stop at point-estimate level: any treated cell with Y <= 0 or Y0_hat <= 0 now errors with actionable refit guidance (log undefined; silent drop would bias both point estimate and bootstrap distribution).
Changelog rolled up: collapse the in-progress 2.4.3 section into 2.4.2; drop the warm-start internal entry (deferred to v2.5.0; not user-visible) and the diagtest wording bug-fix (rolled into hard-error guidance section).
Vignettes:
* Ch2: new vartype-x-ci.method tables documenting computational cost in multiples of nboots refits and per-type defaults.
* Ch3: placebo + carryover demo sections (APTT and log-ATT under test = "placebo" / "carryover") with esplot calls.
* bb-updates: same roll-up as NEWS.md.
man/estimand.Rd: bca + wild documented.
Tests + Quarto render deferred to maintainer (per request).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…e _common.R in ch3
Validation pass on `feat/v242-completion` after deferred test/render gates surfaced two issues:
1. Tests: 4 testthat assertions expected the bootstrap-level "log-ATT bootstrap is unreliable" message, but v2.4.2's NEW point-estimate-level hard-error ("log.att requires Y > 0 and Y0_hat > 0...") fires first on simdata (Y < 0 cells). The point-level error is the correct upstream catch; the bootstrap-level error remains in R/po-estimands.R for strictly-positive panels where Y0_b crosses zero only in some replicates.
Updated expected message strings in CI.7, LA.4, parametric log.att, and PC.5.
2. Vignette: 03-estimands.Rmd was missing `source("_common.R")`, so quarto render loaded the system-installed CRAN fect (v2.4.1) instead of the dev source. The `test = "placebo"` argument (v2.4.2+) was rejected, breaking `plot-aptt-placebo` chunk and halting render.
Added the standard `.common` chunk that other chapters use; dropped redundant `library(fect)` since `_common.R` calls `devtools::load_all("..")`.
Validation gates after fixes:
- `devtools::test()`: PASS 1209 / FAIL 0 (was 1205/4)
- `quarto render`: 14 HTML / 180 SVG / 0 fatal errors (rgl/X11 warnings non-functional)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ci.method (closes parametric × {basic/percentile/bc/bca} 0% coverage)
Pre-fix coverage on a clean DGP-A (additive TWFE, true ATT=3.0, N=40, T=20, T0=12, 100 reps):
parametric × normal: 93% (works — uses sd only)
parametric × basic: 0% (CI offset above truth)
parametric × percentile: 0% (CI matches H0 quantiles)
parametric × bc: 0% (boundary collapse, width ~2e-13)
parametric × bca: 0% (boundary collapse, width ~2e-13)
Root cause: R/boot.R:991-996 constructs Y.boot[,treated] = Y0_hat +
error.tr.boot — no treatment effect added, so eff.boot is centered at 0
(under H0). normal ci.method only uses sd(boot) so was unaffected; the
other 4 ci.methods interpret eff.boot as the sampling distribution of
θ̂ around itself, requiring centering at θ̂.
Fix (Option A — post-hoc location shift in R/po-estimands.R only):
att_b_centered = att_b - mean(att_b) + estimate
applied at 4 sites (att event.time, att overall, aptt event.time, log.att
event.time) immediately before .compute_ci() when the fit's vartype is
parametric. The shift is variance-preserving: sd(shifted) == sd(original)
to machine precision, so normal CI is byte-stable. boot.R, plot.R, and
get.pvalue() (which legitimately needs H0 centering for the legacy
H0:ATT=0 hypothesis test) are untouched.
Justification: the parametric bootstrap simulates errors from a fitted
homoskedastic Gaussian AR-vcov model. Var(θ̂|H0) = Var(θ̂|H1) by
construction, so the H0-centered bootstrap distribution has the correct
dispersion for inference about θ̂; only the location is wrong, and the
shift restores it. Full rationale and downstream-consumer audit in
statsclaw-workspace/fect/spec.md and ref/v242-coverage-study/findings_v242_full.md.
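The location shift is small enough to check numerically. The sketch below applies the formula quoted in this message to simulated H0-centered replicates. The helper name `recenter_boot` and the simulated values are illustrative assumptions, not package code or study output.

```r
# Post-hoc location shift from this commit message:
#   att_b_centered = att_b - mean(att_b) + estimate
recenter_boot <- function(att_b, estimate) att_b - mean(att_b) + estimate

set.seed(42)
att_b    <- rnorm(500, mean = 0, sd = 0.4)  # replicates centered at 0 (under H0)
estimate <- 3.0                             # illustrative point estimate
shifted  <- recenter_boot(att_b, estimate)

# Percentile intervals before and after the shift
ci_before <- unname(quantile(att_b,   c(0.025, 0.975)))
ci_after  <- unname(quantile(shifted, c(0.025, 0.975)))

# Dispersion is untouched, so a Wald interval built from sd() does not move
sd_equal <- isTRUE(all.equal(sd(att_b), sd(shifted)))
```

Before the shift the percentile interval brackets 0 rather than the estimate; after it, the same spread is centered on the estimate.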
Post-fix coverage (NBOOTS = 200, 1000, 2000 — all consistent within MC noise):
parametric × normal: 94% (byte-stable with pre-fix)
parametric × basic: 93-94%
parametric × percentile: 91-94%
parametric × bc: 91-94% (boundary collapse eliminated)
parametric × bca: 91-94% (boundary collapse eliminated)
Tests: 98 new assertions in tests/testthat/test-estimand-parametric-cifix.R
covering value invariants (P-INV-1..5), edge cases (P-EDGE-2..4),
regression smoke (P-REG-1..1b), all-5-ci.methods on FE and gsynth fits,
and 100-rep coverage gates (P-COV-1..3, P-WIDTH-1). Full suite:
1307 PASS / 0 FAIL in CRAN mode; +98 PASS with NOT_CRAN=true.
statsclaw pipeline: planner → builder → tester → reviewer (PASS WITH NOTE).
Spec: statsclaw-workspace/fect/spec.md
Test spec: statsclaw-workspace/fect/test-spec.md
Sim spec: statsclaw-workspace/fect/sim-spec.md
Review: statsclaw-workspace/fect/runs/2026-05-01-parametric-fix-review.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…10→11
New chapter is a reference for the 4 vartype × 5 ci.method matrix:
- vartype mechanics (case bootstrap / wild / jackknife / parametric)
- ci.method formulas + symmetry / bias-aware / skew-aware properties
- p-value formulas via test inversion (per ci.method)
- nboots literature recommendations (Efron & Tibshirani 1993; DiCiccio &
Efron 1996; Davison & Hinkley 1997; Hesterberg 2014)
- empirical coverage table (DGP-A, NBOOTS=200/1000/2000)
- decision tree for picking (vartype, ci.method) per use case
- v2.4.2 caveats panel: wild under-coverage, jackknife slot-contract,
parametric ci.method fix (this PR), missing p.value column
Renames (Quarto cross-refs use @sec-foo labels, not file numbers, so
breakage is impossible):
09-panel.Rmd → 10-panel.Rmd
10-sens.Rmd → 11-sens.Rmd
(new) → 09-inference.Rmd
Render confirmed clean: 15 HTML / 180 SVG / 0 fatal errors.
Coverage table notes: at this clean Gaussian DGP, all 5 ci.methods
saturate at NBOOTS=200 (case bootstrap 91%, parametric 93-94% post-fix,
wild 71-78%). Literature recommendations of B≥1000 matter most for
skewed estimands (aptt/log.att) and small Ntr — not for symmetric
near-Gaussian estimands like att.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ard-error on non-normal ci.methods
Pre-fix: estimand() rejects all jackknife fits with "Slot contract:
fit$eff.boot first two dimensions (TT x N-1) must match TT x N" because
jackknife stores eff.boot as TT x (N-1) x N (each leave-one-out fit
drops a column), not TT x N x nboots.
Fix in R/po-estimands.R only:
1. Slot contract relaxation: vartype-branched. Jackknife asserts
dim(eb)[1] == TT and dim(eb)[3] == N (drops match unit count); other
vartypes keep the original TT x N check.
2. New .check_jackknife_ci_method() helper hard-errors on anything
other than "normal" for jackknife fits, citing Efron & Tibshirani
(1993) Ch.11 and Davison & Hinkley (1997) §3.2.1: jackknife produces
an SE estimate via Tukey's formula, not a sampling distribution. The
N leave-one-out estimates are influence-function flavored and not
exchangeable bootstrap draws, so reflection-based ci.methods (basic,
percentile, bc, bca) are not statistically meaningful on them.
3. Two new aggregation paths:
- .estimand_att_overall_jackknife(): two sub-cases. No cells filter
-> reads fit$att.avg.unit.boot directly. With cells filter ->
iterates j=1..N, drops column j from cell_mask, extracts the mean
of eff.boot[,,j][cm_j], applies Tukey SE.
- .estimand_att_event_time_jackknife(): same column-drop masking
pattern for the test != "none" path.
4. APTT and log.att jackknife branches in .compute_aptt_event_time and
.compute_log_att_event_time using identical column-drop masking.
5. imputed_outcomes(replicates=TRUE) hard-errors on jackknife fits
(column-count mismatch makes per-cell replicate expansion incoherent).
The fast path (event.time + normal + no filter + no test) for jackknife
needs zero new logic: once the contract gate is unblocked, it reads
fit$est.att directly, which fect populates with Tukey SE and Wald CIs
at fit time.
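For reference, the Tukey jackknife SE and the Wald interval built from it can be sketched as below. The function name `jackknife_se` is hypothetical, and the sample mean stands in for the ATT because its jackknife SE has a known closed form, sd(x) / sqrt(n).

```r
# Tukey's jackknife standard error from N leave-one-out estimates:
#   se = sqrt((N - 1) / N * sum((theta_j - mean(theta))^2))
jackknife_se <- function(theta_loo) {
  n <- length(theta_loo)
  sqrt((n - 1) / n * sum((theta_loo - mean(theta_loo))^2))
}

set.seed(7)
x <- rnorm(30, mean = 1)
# leave-one-out estimates of the mean (stand-in for refits that drop unit j)
theta_loo <- sapply(seq_along(x), function(j) mean(x[-j]))

se_jack <- jackknife_se(theta_loo)
# Wald ("normal") interval: the only ci.method that uses nothing beyond the SE
wald_ci <- mean(x) + c(-1, 1) * qnorm(0.975) * se_jack
```

The leave-one-out values enter only through their spread, which is consistent with restricting jackknife fits to the "normal" ci.method.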
Tests: 68 new assertions in tests/testthat/test-estimand-jackknife.R
(scenarios J.1-J.12 + S-11/S-12 anti-regression + S-SIM 100-rep coverage
on DGP-A att overall, threshold >= 0.85, achieved). Full suite: 0 FAIL.
statsclaw pipeline: planner -> builder -> tester, all green.
Spec: statsclaw-workspace/fect/jackknife-fix/spec.md
Audit: statsclaw-workspace/fect/jackknife-fix/audit.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the wild branch in one.nonpara() (R/boot.R lines 1139–1178). Old code perturbed only D=0/I=1 cells and kept observed Y on treated cells verbatim, capturing only Y0-imputation variance. New code builds a per-event-time ATT lookup matrix and perturbs treated cells too using eff[t,i] - att_period[T.on[t,i]] as the residual, following CGM 2008 exactly. Width ratio wild/bootstrap = 0.865 (>= 0.80 target); point estimate byte-identical to case-bootstrap on same seed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
59 new assertions in tests/testthat/test-wild-bootstrap.R covering
14 scenarios (T1-T11 + E1-E4):
T1 wild + all 5 ci.methods produce finite, non-degenerate CIs
T2 wild CI width >= 80% of bootstrap CI width across all 5 methods
T3 att.avg byte-identical between wild and bootstrap fits (point
estimate unchanged - the fix only changes the bootstrap distribution)
T4 aptt event.time + wild + bca on positive-Y DGP
T5 log.att event.time + wild + bca on positive-Y DGP
T6-T8 bootstrap, parametric, jackknife paths unchanged
T9 binary outcome + wild still hard-errors at fit time
T10 nboots=200 still works
T11 100-rep coverage simulation: each ci.method >= 0.85
E1-E4 edge cases (staggered panel, uniform adoption, placebo masked
cells, degenerate fits)
Coverage results (T11, 100 reps NBOOTS=200, DGP-A att overall):
Pre-fix: basic 0.78 percentile 0.73 bc 0.74 bca 0.74 normal 0.77
Post-fix: all 5 methods >= 0.85 (T11 acceptance threshold)
Width ratio (T2, single seed): wild/bootstrap = 1.41 (was ~0.57 pre-fix).
Full testthat suite post-fix: PASS 1434 / FAIL 0 (1976.6 s wall).
statsclaw pipeline: planner -> builder (2df98f8) -> tester, all green.
Spec: statsclaw-workspace/fect/wild-bootstrap-fix/spec.md
Audit: statsclaw-workspace/fect/wild-bootstrap-fix/audit.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rap fix" This reverts commit 731866e.
…ion A)" This reverts commit 2df98f8.
Implements spec wild-as-paraerror:
- New `para.error` argument to fect()/fect.formula()/fect.default() with default "auto". Enum: "auto", "ar", "empirical", "wild".
- Loop 2 in fect_boot() dispatches on `para.error.resolved` (three-way switch) instead of `if (0 %in% I)` / `else`.
- Wild path: unit-level Rademacher sign-flip on Loop 1 pool draws (variant-i, H0-centered); location-shift in estimand() re-centers.
- Validation in fect.default(): hard-error if empirical/wild + missing cells; deprecation warning if vartype="wild" (rewrites to vartype="parametric" + para.error="wild" before gate).
- Standalone wild branch in boot.R one.nonpara() removed.
- para.error.resolved stored on fit object for inspection.
- 10 unit tests in test-para-error.R covering all spec scenarios.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
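A unit-level Rademacher sign-flip, in its simplest form, draws one sign per unit and applies it to that unit's whole residual series. The sketch below assumes a time-by-unit residual matrix; the function name `rademacher_flip` is hypothetical and the code does not reproduce the Loop 1 / Loop 2 structure of fect_boot().

```r
# Hypothetical sketch: resid is a TT x N matrix (rows = periods, cols = units).
# Each unit gets a single +1 / -1 draw, so within-unit serial dependence
# in the residuals is preserved.
rademacher_flip <- function(resid) {
  signs <- sample(c(-1, 1), size = ncol(resid), replace = TRUE)
  list(signs = signs, resid = sweep(resid, 2, signs, `*`))
}

set.seed(2026)
resid <- matrix(rnorm(20 * 5), nrow = 20, ncol = 5)
draw  <- rademacher_flip(resid)
```

Flipping at the unit level rather than the cell level changes only the sign of each unit's series, so magnitudes and within-unit patterns carry over to every replicate.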
Companion to the builder's starter file (test-para-error.R, 10 unit tests). Covers test-spec.md scenarios T4, T7, T9-T10, T13-T21, E2-E4, E6.
NOT YET RUN. Tonight's pickup: run this file via testthat::test_file() after the redesign at b9881ed. Coverage threshold per scenario is 0.91 (within 2 MC-SE of nominal 95%) — DO NOT relax to 0.85.
Critical scenarios:
- T19 (DGP-A IID): all (para.error × ci.method) cells coverage >= 0.91
- T20 (DGP-A8 AR(1) rho=0.8): same threshold; this is the gsynth-note stress test that the variant-(ii) attempt failed
- T21: width parity wild/empirical in [0.70, 1.30]
- Anti-regression on test-estimand-parametric-cifix.R (98 assertions for the v2.4.2 location-shift fix)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…net-new API
vartype = "wild" was added in 6641af7 (the v2.4.2 completion commit) and then folded into a deprecation alias in b9881ed when the redesign moved the wild bootstrap into the parametric Loop-2 architecture. Since v2.4.1 never had vartype = "wild" and v2.4.2 has not shipped, there is no caller to deprecate --- the alias is backwards-compat scaffolding for nobody.
Removes:
- R/default.R: the if (vartype == "wild") rewrite block + the parenthetical hint in the vartype validation error. vartype enum is now exactly c("bootstrap", "jackknife", "parametric"); passing "wild" produces the standard validation error.
- tests/testthat/test-para-error.R: Test 4 (alias warning + routing).
- tests/testthat/test-para-error-full.R: T9 (alias-identical-to-explicit) and T10 (soft-deprecation-not-error).
Reframes:
- NEWS.md: replaces the "New: vartype = "wild"" + deprecation paragraphs with a single "New: para.error argument" entry covering all four modes (auto / ar / empirical / wild) as the API surface.
- vignettes/bb-updates.Rmd: same reframe in the v2.4.2 changelog block.
- vignettes/02-fect.Rmd: vartype × ci.method table loses the "wild" row; a new para.error sub-options table is added under it.
- vignettes/09-inference.Rmd: decision-tree note rewritten in para.error terms; v2.4.2 caveats table updated (jackknife slot-contract bug now fixed in v2.4.2 via hard-error on non-normal ci.methods; wild-bootstrap caveat removed since the variant-(ii) mode no longer exists).
Verification: test-para-error.R 29/29 PASS; test-para-error-full.R fast portion 44/44 PASS (T19/T20/T21 in background or already done). Clean validation error confirmed: vartype = "wild" -> "must be one of bootstrap, jackknife, or parametric".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/coverage-study/
Two coordinated changes to the para.error validation suite, both motivated by the same finding: the parametric pseudo-treated bootstrap targets the conditional variance V_t = Var(ATT_hat - ATT | Lambda, F, X, D) (gsynth-note section 2). Re-randomizing the treatment indicator D across simulation replications adds a Var_D[b_t] term to the marginal variance the simulation measures coverage against, biasing measured coverage downward by exactly that amount. The bootstrap procedure itself is unaffected.
(1) DGP fix in tests/testthat/test-para-error-full.R:
dgp_a, dgp_a8, dgp_b now use D[(T0+1):TT, 1:Ntr] <- 1L (fixed treated block) rather than `id_tr <- sample(1:N, Ntr)` (random per rep). This matches gsynth-note's MC framework (footnote on Table 1, section A.2) and is the convention in the factor-model literature. Without this fix, T19/T20 measured coverage 74-79% when nominal was 95% --- not because of any bug in the bootstrap, but because the simulation was measuring against the wrong target.
(2) Move T19, T20, T21 from tests/testthat/test-para-error-full.R to tests/coverage-study/run_para_error_coverage.R:
These three Monte-Carlo studies (~30 min wall in parallel) are simulation scripts, not unit tests. Routine devtools::test() runs no longer pay the cost of running 1500+ fits at nboots=1000. The new standalone runner produces a CSV summary at /tmp/fect-coverage-study/ and is invoked via Rscript when inferential methods are being updated; see tests/coverage-study/README.md for the full trigger list and acceptance criteria. The directory is added to .Rbuildignore so the validation script stays in source but is excluded from the CRAN tarball.
Verification: post-cleanup, tests/testthat/test-para-error-full.R still has 12 test_that blocks (T4, T7, T13-T18, E2-E4, E6) all passing; smoke tests in test-para-error.R also pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
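The difference between the two treatment-assignment conventions in (1) can be sketched as follows. The function names are hypothetical; the dimensions N = 40, TT = 20, T0 = 12 follow the DGP-A description elsewhere in this log, and Ntr = 10 is an assumed value for illustration only.

```r
# Fixed treated block: the same units are treated in every simulation
# replication, so D is held constant across reps.
make_fixed_D <- function(TT, N, T0, Ntr) {
  D <- matrix(0L, nrow = TT, ncol = N)
  D[(T0 + 1):TT, 1:Ntr] <- 1L
  D
}

# Random assignment per replication: treated unit ids are redrawn each time,
# which adds variation over D to whatever the simulation measures.
make_random_D <- function(TT, N, T0, Ntr) {
  D <- matrix(0L, nrow = TT, ncol = N)
  id_tr <- sample(1:N, Ntr)
  D[(T0 + 1):TT, id_tr] <- 1L
  D
}

D_rep1 <- make_fixed_D(TT = 20, N = 40, T0 = 12, Ntr = 10)
D_rep2 <- make_fixed_D(TT = 20, N = 40, T0 = 12, Ntr = 10)
D_rand <- make_random_D(TT = 20, N = 40, T0 = 12, Ntr = 10)
```

With the fixed block, every replication conditions on the same D, which matches the conditional variance target named in the message.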
…mponent.from
The pre-existing Gate C ("Parametric bootstrap is not valid when
time.component.from is notyettreated") fires after the silent fe -> ife r=0
coercion, so a user who passes method = "fe" gets an error message that
mentions only time.component.from --- the connection to their original
method argument is invisible. This commit replaces Gate C with an early-gate
variant placed before the coercion, so the error names the user's literal
method.
Before:
fect(Y ~ D, method = "fe", vartype = "parametric")
# Error: Parametric bootstrap is not valid when "time.component.from" is
# "notyettreated". Use time.component.from = "nevertreated" ...
After:
fect(Y ~ D, method = "fe", vartype = "parametric")
# Error: vartype = "parametric" requires time.component.from = "nevertreated".
# Your call: method = "fe", time.component.from = "notyettreated".
# The parametric pseudo-treated bootstrap requires a control pool isolated
# from treated-unit pre-treatment cells. Pass time.component.from =
# "nevertreated" (if never-treated controls exist) or use vartype =
# "bootstrap" or "jackknife".
Behavior unchanged for the valid case: method = "fe" + parametric +
nevertreated still works silently (fit$method = "fe", fit$vartype =
"parametric"). Behavior also unchanged for method = "ife" / "cfe" with
notyettreated --- error message now references their literal method too.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… user-frequency
Restructures the Quarto user manual to put the casual user's reading
path (~90% of users using `method = "fe"`) into a dedicated "Basics" part
of 3 chapters, with advanced material in subsequent parts that are
"as needed". The motivation is long-term robustness: the parts structure absorbs
future chapter additions without renumbering.
Final order:
Basics (always read)
01-start, 02-fect, 03-estimands
Advanced estimators and inference (read when factor models or gsynth
are needed; the inference deep-dive is most meaningful in the context
of gsynth + parametric bootstrap)
04-ife-mc, 05-cfe, 06-gsynth (was 08), 07-inference (was 09)
Diagnostics and extensions
08-hte (was 06), 09-panel (was 10), 10-sens (was 11)
Reference
11-plots (was 07; reference material lives at end), aa-cheatsheet,
bb-updates, cc-references (was references.qmd; bibliography is the
absolute last entry per book convention)
File renames preserve git history (R = rename in git status).
Cross-references unchanged: every internal link uses [@sec-x] logical
labels, never chapter numbers, so Quarto resolves them to the correct
chapter regardless of file position. No content edits required in any
chapter; only _quarto.yml YAML changed.
Bisect leftovers removed: vignettes/09-panel.Rmd and vignettes/10-sens.Rmd
were duplicates from an earlier `git checkout b4e9fbf` operation during
the v2.4.2 inference investigation; src/symbols.rds was a build artifact
from the same bisect. All three were byte-identical to their HEAD
counterparts.
Verification: Quarto book renders cleanly (15 HTML / 0 errors) at
b9881ed redesign + Option B parametric+notyettreated gate + this
restructure all stacked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rve method-aware error
The Option B gate added at e45a852 fired BEFORE the reversal-check gate
(line 1868), which broke two existing tests:
  test-paraboot-dispatcher.R:145 (hasRevs+parametric)
  test-paraboot-parity.R:405 (GATE-1 reversal+parametric)
Both call fect() on a reversal panel with default time.component.from
("notyettreated"); they expect the reversal error message but got the new
nevertreated message instead.
Fix: save method_arg <- method early (before the silent fe -> ife coercion
at line 601), then place the parametric/nevertreated gate AFTER the
reversal-check gate. Reversal users now get the more actionable reversal
message first; non-reversal-but-notyettreated users get the method-aware
nevertreated message via method_arg, preserving Option B's design intent
that the error names the user's literal method = "fe".
Verification:
- Full devtools::test() with NOT_CRAN=true: 1448/1448 PASS
- Manual checks of three scenarios:
  method=fe + parametric + notyettreated -> "method = \"fe\"" msg
  method=fe + parametric + nevertreated -> works silently
  reversal + parametric -> reversal msg first
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
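The gate ordering described in this commit can be sketched in base R as follows. This is an illustration only: `check_gates()` and its arguments are hypothetical, the reversal message text is a placeholder, and the real checks inside fect() are not reproduced here.

```r
# Hypothetical sketch of the gate ordering; not the package's actual code.
check_gates <- function(method, vartype, time.component.from, has_reversal) {
  method_arg <- method                  # save the user's literal method first
  if (method == "fe") method <- "ife"   # silent fe -> ife (r = 0) coercion

  # Gate 1: the reversal check fires first (placeholder message text).
  if (vartype == "parametric" && has_reversal) {
    stop("Parametric bootstrap is not available for panels with treatment reversal.")
  }
  # Gate 2: the parametric/nevertreated check, worded with the saved method_arg.
  if (vartype == "parametric" && time.component.from != "nevertreated") {
    stop(sprintf(
      "vartype = \"parametric\" requires time.component.from = \"nevertreated\". Your call: method = \"%s\", time.component.from = \"%s\".",
      method_arg, time.component.from
    ))
  }
  invisible(method)
}
```

Because `method_arg` is captured before the coercion, the second gate can sit anywhere after it and still report `method = "fe"` rather than the coerced value.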
…A forward-refs
Two issues, fixed together:
(1) The chapter-restructure commit at e45a852 staged the file renames via
`git mv` but accidentally did NOT stage `_quarto.yml` or `index.qmd`. As a
result the remote HEAD's `_quarto.yml` still listed the OLD chapter
filenames (06-hte.Rmd, 07-plots.Rmd, 08-gsynth.Rmd, 09-inference.Rmd,
10-panel.Rmd, 11-sens.Rmd, references.qmd) which no longer exist on disk.
Anyone cloning the repo and running `quarto render` would see
file-not-found errors. The local working-tree YAML was correct, which is
why my own render succeeded. This commit pushes the parts-structured YAML
and the matching Organization section in index.qmd.
(2) Phase A of the post-restructure narrative audit: forward-references
from the Basics part (02-fect, 03-estimands) and from the Advanced part's
last chapter before inference (06-gsynth) to the new inference deep-dive at
@sec-inference. Casual users reading Basics see "the deep-dive on
bootstrap-distribution semantics is in Chapter inference, most useful after
Chapter gsynth"; gsynth users reading the end of 06-gsynth see "next:
Chapter inference" as the natural read.
Also fixes a pre-existing broken cross-reference in 03-estimands:
"gates A/B/C, see @sec-cfe" -> "gates A/B/C, see [Section
@sec-parametric-regimes] in [Chapter @sec-fect]". Gates A/B/C are
documented in 02-fect's parametric-regimes section, not in 05-cfe.
Phase B (rewrite of 07-inference to drop `vartype = "wild"` rows from the
matrix and update the empirical coverage table with post-redesign numbers
from the night-4 FIXED id_tr run) is queued as a follow-up commit; that
chapter's content is still stale and needs a substantive revision separate
from this audit cleanup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reflects the b9881ed para.error redesign and the cleanup commit at 0104f8c
that removed vartype = "wild" entirely. The chapter previously described:
- "two design choices" (vartype × ci.method)
- a 4 × 5 matrix with vartype rows {bootstrap, wild, jackknife, parametric}
- a dedicated `### "wild"` section that covered Y0-only-perturbation wild
- empirical coverage table with `wild` rows showing 71-78% under-coverage
- "Three patterns" with pattern 3 calling out wild under-coverage as a
  known v2.4.2 caveat to live with
All of that is now stale. After this commit the chapter describes:
- "three design choices" (vartype × ci.method × para.error)
- a 3 × 5 matrix with vartype rows {bootstrap, jackknife, parametric}
- a dedicated `### para.error` sub-section under the parametric vartype,
  documenting auto/ar/empirical/wild as residual-model sub-options
- empirical coverage tables in two parts (DGP-A IID Gaussian and DGP-A8
  AR(1) rho=0.8) with the post-redesign night-4 numbers (0.90-0.91 on
  DGP-A; 0.96-0.97 on DGP-A8) for all 15 cells per DGP (3 para.error
  modes x 5 ci.methods)
- "Two patterns" (the under-coverage pattern 3 is gone)
- a section explaining the conditional-variance target
  V_t = Var(ATT_hat - ATT | Lambda, F, X, D) and the law-of-total-variance
  reasoning that requires holding D fixed across coverage simulations
- an "Inference summary" at the end documenting the two pre-v2.4.2 bugs
  that were fixed (parametric location-shift bug, jackknife slot-contract)
  and the new para.error API
Decision tree updated to mention para.error and to drop the "Use
fit$est.att directly until estimand() supports jackknife" workaround (no
longer needed since jackknife is now a first-class ci.method = "normal"
path).
Phase A forward-references in 02-fect / 03-estimands / 06-gsynth point at
this chapter; the Phase A commit at 59a5787 is what brought casual readers
to this chapter when they need it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…re quarto artifacts
(1) man/fect.Rd: fix R CMD check WARNING "Codoc mismatches from Rd file
'fect.Rd'". Two issues, both stemming from the v2.4.2 changes that
landed without a docs update:
- Added \item{para.error}{...} to the \arguments block; added
`para.error = "auto"` to the \usage block at the correct position
(after `vartype`, before `cl`).
- Updated `nboots = 200` -> `nboots = 1000` in the \usage block to
match the v2.4.2 default change.
Verified via tools::codoc("fect", lib.loc = ...) on a fresh build:
"PASS: no codoc mismatches."
(2) NEWS.md: simplified the v2.4.2 section to match the v2.4.0 layered-
bullet style. Previously had 5 separate `## New:` headers with
long-prose bullets and no `## Bug fixes` section; now has 2 `## New:`
headers consolidating estimand() additions and the para.error API,
plus a `## Bug fixes` section listing the parametric coverage fix,
jackknife slot-contract relaxation, log.att / aptt cell-drop hard-
errors, the clearer parametric+notyettreated error message, the
F-test wording fix, and the parallel-worker package-version warning
suppression. `## Other changes` covers the nboots default raise and
the warm-start C++ infrastructure (deferred activation).
(3) .gitignore: add quarto-specific render artifacts that were
accumulating as untracked files after restructure renders:
vignettes/*.rmarkdown
vignettes/site_libs/
vignettes/.quarto/
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tart NEWS entry
* vignettes/bb-updates.Rmd: replace 5 prose paragraphs with hierarchical
bullets matching v2.4.0 style. Two `**header**` sections:
- "Alternative-estimand additions in estimand()" (covers test arg
+ bc/bca/normal ci.methods + per-type defaults)
- "para.error argument for vartype = 'parametric'"
Plus a single-line "**Various bug fixes.**" sentinel; specifics live
in NEWS.md for users who want them. Removes the previous separate
prose sections on cell-drop hard-error and nboots default raise.
* NEWS.md: drop the warm-start C++ infrastructure entry from
## Other changes. Yiqing flagged the warm-start path for further
investigation before mention; the C++ infrastructure is in place
with NULL default (no behavioral change), so removal from the
changelog is content-accurate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s-ref resolves
Removed `.unnumbered` from bb-updates.Rmd's title line. Quarto cannot
generate "Chapter N" for an unnumbered chapter, so the
`[Chapter @sec-changelog]` reference I added to index.qmd's Organization
section in 59a5787 produced a render warning:
  WARN: index.html: Unable to resolve crossref @sec-changelog
The companion `aa-cheatsheet.Rmd` is already numbered (no `.unnumbered`),
so making bb-updates numbered keeps the Reference part internally
consistent. Now bb-updates renders as Chapter 13 (the last numbered chapter
before cc-references.qmd, which is the bibliography and stays unnumbered by
Quarto convention).
Verified: clean re-render of just bb-updates.Rmd + index.qmd resolves the
warning; full book is at 15/15 HTML files / 0 errors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test scaffold for the v2.4.3-candidate partial warm-start design (warm
auxiliaries: alpha, xi, beta, gamma, kappa, multi-FE, missing fillers;
cold-start (F, Lambda) per replicate via fresh SVD).
Status: SCAFFOLD ONLY. The script documents the test design and provides
the comparison harness, but the partial warm-start API in fect() is not
yet shipped. Running the script today exits with a TODO message. Two
implementation paths:
(a) Add `warm.start = c("none", "linear", "all-auxiliaries")` argument
to fect() and route to the existing C++ fit_init parameter, with
the construction:
fit_init = Y_hat_main - F_main %*% t(Lambda_main)
so the EM seeds with the auxiliary-only surface and the first SVD
step discovers fresh per-replicate (F, Lambda).
(b) Direct call to inter_fe_ub via Rcpp with a hand-built fit_init,
bypassing fect() entirely. Faster prototype, less production-
relevant.
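The construction in path (a) can be illustrated on a toy panel. This is a minimal base-R sketch with made-up dimensions and a hypothetical additive auxiliary surface (unit plus time effects only); it does not call fect() or the C++ entry points.

```r
# Toy illustration of the auxiliary-only seed; all objects here are hypothetical.
set.seed(1)
TT <- 6; N <- 4; r <- 2
F_main      <- matrix(rnorm(TT * r), TT, r)   # main-fit factors (TT x r)
Lambda_main <- matrix(rnorm(N * r),  N,  r)   # main-fit loadings (N x r)
alpha <- rnorm(N)                             # unit effects
xi    <- rnorm(TT)                            # time effects

aux        <- outer(xi, alpha, "+")               # auxiliary surface, TT x N
Y_hat_main <- aux + F_main %*% t(Lambda_main)     # full main-fit prediction surface

# Subtract the factor component so the seed carries only the auxiliaries.
fit_init <- Y_hat_main - F_main %*% t(Lambda_main)
```

Here `fit_init` has the same TT x N shape as the outcome matrix and equals the auxiliary surface up to floating-point error, so a replicate's first SVD step would still have to find its own (F, Lambda).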
Phase B (IFE on simdata + covariates):
30 reps x {cold, partial-warm} x IFE r=2
Pass: SE diff < 5% relative AND speedup >= 2x
Phase B-CFE (5 DGPs):
sim_region (multi-level FE; no factors)
sim_trend (Q-heavy: sinusoidal trends)
sim_linear (Q-light: linear trends)
simdata (factor-only)
sim_gsynth (factor + covariates; gsynth-style)
Per-DGP: SE diff < 5% AND speedup >= 2x
Phase C (jackknife, IFE r=2): SE diff < 3% relative
Phase D (coverage): T19/T20 coverage parity within MC-SE
Pass criteria same as Phase A (which failed). Design intent: auxiliaries
are deterministic given (F, Lambda) so warming them shortens the EM path
without anchoring the bootstrap distribution. (F, Lambda) basin still
varies per replicate because the first SVD's input is per-replicate.
If Phase B passes: ship partial warm-start in v2.4.3 (Yiqing's directive,
2026-05-01).
Companion design doc:
workbench statsclaw-workspace/fect/ref/v242-warm-start-investigation/
partial-warm-design.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Ch7 title: "Bootstrap Inference Internals" -> "Inference" (sec-inference
  anchor unchanged; cross-references in 02-fect, 03-estimands, 06-gsynth
  all resolve via the label so no further edits needed).
* cc-references.qmd title: "References" -> "Bibliography" (canonical book
  convention; the chapter contains only the rendered bibliography from
  references.bib via @* nocite).
* 01-start.Rmd version-check chunk: removed tryCatch wrappers around the
  CRAN and dev-branch version queries. Errors will now propagate normally
  if either source is unreachable; this is fine for a documentation chunk
  where the query is intentionally live (not cached).
Render: incremental (cache reused for unchanged chapters); 15 HTML / 0
errors. Sidebar shows the new titles.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* bb-updates.Rmd: drop the {#sec-changelog} label; restore {.unnumbered}.
Changelog is back matter -- accessed via the sidebar, not via in-text
cross-references. Removing the label simplifies the structure and lets
the chapter sit unnumbered alongside the Bibliography.
* index.qmd Organization section: drop the [Chapter @sec-changelog]
cross-ref (now resolves to nothing since the label is gone). Replaced
with an unlinked "Changelog" entry pointing readers at the sidebar.
* bb-updates.Rmd v2.4.2 entry: simplified to 5 high-level pointer
bullets + cross-references to the chapters that explain in detail.
Drops the duplicated content (test arg / ci.methods / para.error /
cell-drop hard-error) that already lives in 03-estimands and
07-inference. Each bullet is now: "what changed -> see chapter X".
Render: incremental (cache reused), 15 HTML / 0 errors / 0 warnings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Folds the v2.4.3 default-tol fix into v2.4.2 (originally on
feat/v243-warm-start as a76949b + cb7b4c0).
Net delta over v2.4.2:
* Default tol: 1e-3 -> 1e-5
* Default max.iteration: 1000 -> 5000
* New warning() when EM hits max.iteration without satisfying tol
* man/fect.Rd updated for new defaults
* Test fixture pins on test-paraboot-parity.R PAR-2/3/4/6 preserve
  byte-equality contracts against existing baseline.rds
Internal infrastructure (dormant, mirrors existing inter_fe_ub /
inter_fe_mc plumbing; not exposed to users):
* C++ fit_init parameter on complex_fe_ub + cfe_iter
* R wrapper fit.init params on fect_fe / fect_mc / fect_cfe /
  fect_nevertreated
Coverage-study scripts moved into tests/coverage-study/ (already in
.Rbuildignore): run_tol_characterization.R, run_tol_coverage.R,
run_tol_coverage_extended.R, run_cfe_high_K.R.
Warm-start exploratory artifacts archived to
tests/coverage-study/_archive/ for future reference: test-warm-start.R,
run_partial_warm_start_validation.R, run_mc_warm_start.R.
NEWS.md v2.4.2 entry gains "Changed: tighter EM convergence defaults"
section + "Internal infrastructure" bullet under Other changes.
DESCRIPTION date bumped to 2026-05-02.
The pre-v2.4.2 default tol = 1e-3 halted IFE/CFE EM well before
convergence: on factor-DGP simdata the EM stopped at ~116 iters with
att.avg = 2.87, while running to tol = 1e-7 (~2000 iters) produces
att.avg = 2.43 (18% gap; CFE worse at 40%). Inference at the old default
was already correct (coverage 0.96 at both old and new tol); the fix
improves point-estimate stability across versions/machines.
Speed cost: ~2-5x slower main fit and bootstrap on factor-DGP IFE/CFE. Set
tol = 1e-3, max.iteration = 1000 explicitly to reproduce pre-v2.4.2
numerical output exactly.
Decision context:
statsclaw-workspace/fect/runs/2026-05-02-tol-convergence-investigation.md
Fold session:
statsclaw-workspace/fect/runs/2026-05-02-fold-tol-into-242.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
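The interaction between tol, max.iteration and the new non-convergence warning can be sketched with a generic fixed-point loop. This is an assumption-laden illustration: `run_em_sketch()` is hypothetical, the relative-change stopping rule is assumed rather than taken from the C++ code, and the toy update stands in for a real EM step.

```r
# Hypothetical sketch: iterate an update until the relative change drops below tol,
# and warn if max.iteration is exhausted first. Not the package's EM implementation.
run_em_sketch <- function(update, fit0, tol = 1e-5, max.iteration = 5000L) {
  fit <- fit0
  converged <- FALSE
  iter <- 0L
  for (iter in seq_len(max.iteration)) {
    fit_new <- update(fit)
    # Assumed criterion: Frobenius-norm change relative to the current fit.
    rel_change <- sqrt(sum((fit_new - fit)^2)) /
      max(sqrt(sum(fit^2)), .Machine$double.eps)
    fit <- fit_new
    if (rel_change < tol) {
      converged <- TRUE
      break
    }
  }
  if (!converged) warning("EM reached max.iteration without satisfying tol.")
  list(fit = fit, niter = iter, converged = converged)
}

# Toy contraction x -> 0.5 * x + 1, whose fixed point is 2.
res <- run_em_sketch(function(x) 0.5 * x + 1, fit0 = 10)
```

A loose tolerance stops such a loop while the iterate is still drifting toward its fixed point, which is the mechanism behind the att.avg gap reported in this commit.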
…aim)
The v2.4.2 NEWS.md "Other changes" section already claims:
  complex_fe_ub and cfe_iter C++ entries gain optional fit_init parameter
  (NULL default preserves pre-existing cold-start behavior).
But the original cherry-pick of a76949b + cb7b4c0 onto
feat/v242-completion (commit dc2fc71) missed commit 097265c, which is where
this C++ infrastructure was originally introduced. cb7b4c0 only stripped
the R-side warm.start API, *keeping* the C++ plumbing on top of 097265c's
base; cherry-picking it onto a base that lacks 097265c left the plumbing
un-introduced.
This commit lands the dormant infra by checking out the v243-warm-start
versions of the affected files. NULL default = byte-identical to pre-2.4.2
behavior; no functional reach without the public warm.start API (which
cb7b4c0 stripped intentionally).
Files touched:
  src/cfe.cpp            complex_fe_ub gains optional fit_init Rcpp::Nullable
  src/cfe_sub.cpp        cfe_iter gains optional fit_init Rcpp::Nullable
  src/fect.h             signature updates
  R/cfe.R                fect_cfe() gains fit.init = NULL parameter
  R/fect_nevertreated.R  fect_nevertreated() gains fit.init = NULL +
                         control-unit slicing
  R/RcppExports.R        regenerated bindings (compileAttributes)
  src/RcppExports.cpp    regenerated bindings (compileAttributes)
  tests/coverage-study/_archive/run_partial_warm_start_validation.R
                         archive content matched to v243 (was stale)
Validation: clean C++ recompile via pkgbuild::compile_dll();
devtools::test() with TESTTHAT_CPUS=10 parallel = TRUE: 0 failures, 9
expected max.iteration warnings (same set as pre-fold baseline).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bb-updates.Rmd v2.4.2 entry: add the tol/max.iteration tightening bullet in
the existing layered-bullet style. Date bumped to 2026-05-02.
ARCHITECTURE.md: manual patch to record the v2.4.2 dormant fit_init
plumbing on complex_fe_ub + cfe_iter (mirrors the 2026-05-01 manual-patch
convention, since statsclaw:scriber regen is not directly invokable from
this session). Header note updated to reflect the second patch date.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…coverage.R
Aligns with the standing convention for parallel R workloads on this
machine (20 physical cores; leave headroom for foreground work). Speeds the
parametric bootstrap inside each rep without changing the outer 100-rep
coverage loop.
T19/T20/T21 timing observed at cores=10, new tol=1e-5 default:
  T19 (DGP-A IID, 100 reps x 1000 nboots)    ~70 min
  T20 (DGP-A8 AR(1), 100 reps x 1000 nboots) ~70 min
  T21 (width parity, 50 reps x 500 nboots)   ~22 min
T19 coverage at NEW defaults: 13/15 cells at 0.91, 2/15 at 0.90 (ar x bc,
ar x bca) -- borderline vs the [0.91, 0.99] threshold but within MC-noise
band at K=100 (1 SE ~ 0.022). Prior validation at cb7b4c0 baseline saw 0.91
across all cells; the 0.90 deviation is indistinguishable from noise, not a
regression.
T20 PASS (0.96-0.97). T21 PASS (wild/empirical width ratio 1.001-1.007).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
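The "1 SE ~ 0.022" figure is the binomial Monte-Carlo standard error of a coverage rate. A minimal sketch, assuming it was evaluated at the nominal rate 0.95 with K = 100 replications (the helper name `mc_se` is hypothetical):

```r
# Monte-Carlo standard error of an estimated coverage rate p over K replications.
mc_se <- function(p, K) sqrt(p * (1 - p) / K)

se_100 <- mc_se(0.95, 100)   # about 0.022
```

On that scale a 0.01 gap between two cells is less than half of one Monte-Carlo standard error.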
…ning + jackknife large-Nco warning
The interim "nboots default = 1000" introduced earlier in the v2.4.2 cycle
was overspecified for the dominant use case (estimand(fit, "att") with normal CI
returns SE-grade inference at nboots = 200 just as well). Now the cost is
surfaced exactly where it matters --- when a user requests a tail-quantile-
based CI on under-replicated bootstrap. Three changes:
1. Default `nboots = 200` in fect(), fect.formula(), fect.default() (matches
v2.4.1; reverts an interim bump made earlier this v2.4.2 cycle that was
never publicly shipped).
2. estimand() warning gate: when ci.method in {basic, percentile, bc, bca}
is requested on a bootstrap or parametric fit with length(eff.boot) < 1000,
emit a warning naming the consequence (tail-quantile endpoints may be
erratic) and recommending refit at nboots = 1000. The warning fires on
every such call, not once-per-session --- the user can suppress, refit,
or proceed with caveat. Point estimate and SE are unaffected.
3. Jackknife large-Nco warning at fit time: vartype = "jackknife" with
N > 1000 emits a warning recommending vartype = "bootstrap" for
tractability (full leave-one-out scales linearly in N; at v2.4.2 EM
convergence defaults each refit is slow). Semantics unchanged --- still
classic delete-one Tukey jackknife, no subsetting.
Statistical references for the 1000-replicate floor: Efron 1987 §3 (bca
acceleration estimation); DiCiccio & Efron 1996 §4 (bca + percentile
recommendations); Hesterberg 2014 (recommends 1000 minimum, 10000+ ideal
for routine production).
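The warning gate in item 2 can be sketched as a small standalone check. This is a hypothetical illustration: `check_tail_ci_sketch()` and its message wording are made up for the example and are not the package's internal helper.

```r
# Hypothetical sketch of the tail-CI replicate gate; not the package's internal helper.
check_tail_ci_sketch <- function(ci.method, eff.boot, min_reps = 1000L) {
  tail_methods <- c("basic", "percentile", "bc", "bca")
  if (ci.method %in% tail_methods && length(eff.boot) < min_reps) {
    warning(sprintf(
      "ci.method = \"%s\" relies on tail quantiles of only %d replicates; endpoints may be erratic. Consider refitting with nboots = %d.",
      ci.method, length(eff.boot), min_reps
    ))
    return(invisible(FALSE))
  }
  invisible(TRUE)   # normal CI, or enough replicates: stay silent
}
```

The check keys on the number of replicates actually available rather than on the requested nboots, and it warns on every call instead of once per session, matching the behavior described above.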
Tests:
- CI.8 updated: assert nboots default is 200 (was 1000)
- CI.9 added: estimand() warns on tail-CI methods at nboots < 1000
- CI.10 added: normal CI is silent at nboots < 1000
- CI.11 added: tail-CI methods at nboots >= 1000 do not warn
Pre-existing CI.1, CI.2b, CI.6, CI.7, CI.7b now emit the new warning during
their fit-time setup (small nboots) but still pass --- warnings are
non-blocking. Test file run alone: 0 failures, 0 errors.
NEWS.md: removed the "Default nboots raised from 200 to 1000" bullet (the
raise was never publicly shipped); added bullets for the new warning gate
and the jackknife large-Nco warning. bb-updates.Rmd mirrored.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test-factors-from-refactor.R was written TDD-style before the
time.component.from refactor (header: "Tests are written BEFORE
implementation. Each test will fail until its phase is implemented, then
pass permanently"). After the refactor stabilized, several Phase 1 /
Phase 3a-E / Phase 3a-I tests became redundant with substantive coverage
that lives elsewhere. Removing 11 of 77 tests (-14%).
Pruned, with tombstone comments pointing at where coverage still lives:
  Phase 1b     nevertreated acceptance          -> Phase 6a-6e
  Phase 1c     nyt vs nt produce different      -> non-regression-grade
  Phase 1d     default = notyettreated          -> argument-matching is R semantics
  Phase 1e     cfe + nevertreated works         -> Phase 3a-B/C/D
  Phase 1f     time.component.from in CV        -> Phase 3a-G1/G3
  Phase 1g     time.component.from in boot      -> Phase 3a-F1/F2/I1
  Phase 3a-E1  gamma+kappa fields present       -> subsumed by E3
  Phase 3a-E4  plot() smoke                     -> test-plot-fect.R + test-plot-refactor.R
  Phase 3a-E5  print() smoke                    -> covered by every regression that prints
  Phase 3a-I2  ife em=FALSE bootstrap smoke     -> em=TRUE/FALSE equiv via I10
  Phase 3a-I4  cfe em=FALSE bootstrap smoke     -> em=TRUE/FALSE equiv via I10
Phase 1a retained as the single smoke entry point for the API.
File: 1844 -> 1666 lines (-9.6%), 77 -> 66 tests (-14%).
Validation: testthat::test_file() runs clean (0 failures, 0 errors, 0
warnings; same as pre-prune baseline of 77/0/0/0).
Other large test files audited but NOT pruned this commit:
- test-cv-parallel.R (1763 lines, 42 tests): Section P (P.1-P.6) initially
  looked redundant but P.1/P.2 are serial-vs-parallel identity (different
  cv.method cells) while P.3-P.6 are regression-vs-saved-fixtures from a
  prior refactor. Distinct purposes; not bloat.
- test-score-unify.R (2631 lines, 81 tests): S1-S9 sections each cover
  distinct API surfaces (.score_residuals weights/time_index/norm.para edge
  cases; fect_cv regression; fect_mspe criterion/masking/weights
  extensions). Substantive coverage, not bloat.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 2631-line test-score-unify.R was the largest file in the suite and
covered 81 tests across 9 sections. Under reporter = "summary" it
produced one progress line for the entire ~15+ min run; nothing was
visible until completion.
Split by section into 6 files of more uniform size, with shared
fixtures extracted to a helper file (testthat auto-loads helper-*.R
before any test-*.R, so the fixtures are visible without per-file
duplication).
helper-score-unify.R (62 lines) shared fixtures: simdata
data load, make_factor_data()
DGP helper, ntdata fixture,
out_base fitted object
test-score-residuals.R (267 lines) S1 .score_residuals() unit
tests + property invariants
(P1/P2/P5/P7) + edge cases
(E3/E4) --- 13 tests
test-score-fect-cv.R (172 lines) S2 fect_cv regression +
Section B cv.method ---
6 tests
test-score-fect-mspe.R (333 lines) S3 criterion + S4 cv.method +
S6 weights + S7 norm.para +
S8 return structure +
S9 input validation +
Section D simplification ---
21 tests
test-score-nevertreated.R (867 lines) Section C cv.method dispatch +
Section E 1% selection rule +
Section F W/count.T.cv +
Section G integration +
Section H k-fold CV ---
26 tests
test-score-bench.R (165 lines) Section I runtime benchmarks
(2 tests, slow, skip on CRAN)
test-score-parallel-cv.R (820 lines) Section G "Parallel CV Folds"
(formerly the duplicate-G
section in the original;
disambiguated by filename) ---
13 tests
Total: 81 tests across 6 split files, identical to the original 81
tests in test-score-unify.R. No tests added or removed --- pure
mechanical split.
Validation: each split file run individually via testthat::test_file()
passes clean (0 failures, 0 errors, 0 warnings beyond the same
benign max.iteration warnings from the new convergence diagnostic
that fire on small-N edge tests).
Win: progress visibility under reporter = "summary" goes from 1 line
for ~15+ min to 6 lines, each showing real-time progress as tests
in that section complete. Per-file size more uniform; easier to
locate specific tests via grep / find.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ucture + bib citations
Long-session bundle of vignette polish, estimand visual fixes, and per-type
default change. All sub-changes documented in
statsclaw-workspace/fect/runs/2026-05-03-ch7-restructure-and-coverage-investigation.md.
Estimand visual fixes (R/po-estimands.R):
- bca → bc fallback when cell-level jackknife is degenerate (n_cells < 2,
jackknife vector all-NA). Resolves missing CIs at staggered-tail event-times.
- bca/bc → normal CI fallback when bootstrap-quantile interval is degenerate
(cutoffs collapse to same quantile due to z0 clamping at small B with
skewed bootstrap) or doesn't cover the point estimate. Resolves degenerate
ci.lo == ci.hi intervals and asymmetric intervals shifted off estimate.
- Normal CI is centered at estimate by construction so the fallback always
yields a covering interval.
- estimand() now attaches `fect_test` attribute to result data frame when
test = "placebo" or "carryover".
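The bca/bc to normal fallback logic can be sketched as follows. This is a hypothetical helper, `ci_with_fallback()`, written only to illustrate the two trigger conditions named above (degenerate interval, or interval not covering the estimate); the actual code in R/po-estimands.R is not reproduced.

```r
# Hypothetical sketch of the fallback rule; not the code in R/po-estimands.R.
ci_with_fallback <- function(estimate, se, ci.lo, ci.hi, alpha = 0.05) {
  # Degenerate: missing endpoints, or cutoffs collapsed to the same quantile.
  degenerate <- !is.finite(ci.lo) || !is.finite(ci.hi) || ci.lo >= ci.hi
  # Off-estimate: a valid interval that does not contain the point estimate.
  off_estimate <- !degenerate && (estimate < ci.lo || estimate > ci.hi)
  if (degenerate || off_estimate) {
    z <- qnorm(1 - alpha / 2)
    # The normal CI is centered at the estimate, so it always covers it.
    return(c(ci.lo = estimate - z * se, ci.hi = estimate + z * se, fallback = 1))
  }
  c(ci.lo = ci.lo, ci.hi = ci.hi, fallback = 0)
}
```

For example, `ci_with_fallback(1, 0.5, 2, 2)` hits the degenerate branch and `ci_with_fallback(1, 0.5, 1.2, 2.0)` hits the off-estimate branch, while an interval such as (0.2, 1.9) is returned unchanged.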
Per-type ci.method default change:
- att.cumu default: "percentile" → "basic" (the reflected pivot CI per
Davison-Hinkley 1997 §5.2.1; also `boot::boot.ci(type = "basic")`).
"percentile" preserved for replication of legacy att.cumu() byte-equality.
- AC.2/AC.3 tests updated to pass ci.method = "percentile" explicitly so
byte-equality with legacy effect()/att.cumu() still asserts.
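The difference between the "percentile" and "basic" intervals is a reflection of the bootstrap quantiles around the point estimate. A minimal base-R sketch on toy skewed draws (the data and object names are made up for illustration):

```r
# Toy comparison of percentile vs basic (reflected) bootstrap intervals.
set.seed(42)
theta_hat <- 1.0
boot_draws <- theta_hat + rexp(2000) - 1   # skewed toy bootstrap distribution
alpha <- 0.05
q <- quantile(boot_draws, c(alpha / 2, 1 - alpha / 2), names = FALSE)

percentile_ci <- c(q[1], q[2])
# Basic CI: reflect the quantiles around the point estimate.
basic_ci <- c(2 * theta_hat - q[2], 2 * theta_hat - q[1])
```

Both intervals have the same width; with a skewed bootstrap distribution the basic interval places its longer arm on the opposite side of the estimate from the percentile interval.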
Plot polish (R/esplot.R):
- Read fect_test attribute. When set to "placebo" or "carryover", default
pre.color to color (uniform across all event-times); skip the gray
pre-treatment shade since there's no pre-vs-post contrast in those modes.
Vignette restructure:
- Removed Quarto book "parts" structure (kept chapter order).
- ch2 inference subsection simplified: dropped parametric / ci.method /
full vartype-table content; kept brief bootstrap+jackknife intro +
parallel + clustering callouts.
- ch7 received the migrated content: parametric-regimes section + the
three-gate system + ci.method "when to prefer" table.
- ch3 placebo-and-carryover split into two H2 subsections; setup-carryover
chunk relocated; new panelView::panelview() chunk before the carryover
fit shows the reversal pattern.
- ch7 plain-text Unicode math (θ̂, z₀, Φ⁻¹, ε, ρ, etc.) converted to LaTeX
($\hat\theta$, $z_0$, $\Phi^{-1}$, $\varepsilon$, $\rho$, etc.).
- "case bootstrap" → "unit-level (cluster) bootstrap" terminology.
Bibliography:
- 10 new entries in references.bib: efron1987, efron_tibshirani1993,
davison_hinkley1997, diciccio_efron1996, hall1992, hesterberg2014,
liu1988, mammen1993, cameron_gelbach_miller2008, carpenter_bithell2000.
- ch7 inline citations converted from "Efron 1987 / Davison-Hinkley 1997 /
..." to [@efron1987] / [@davison_hinkley1997] / ... Pandoc format.
- ch7 manual References section at end removed; references flow into
cc-references.qmd via nocite: @* (moved from _quarto.yml to the chapter).
- _quarto.yml: link-citations: true + link-bibliography: true added; this
produces hover-tooltip citations but cross-chapter anchor linking is
not automatic in Quarto book mode (documented limitation).
NEWS.md + bb-updates.Rmd: per-type default att.cumu → "basic" with
literature-citation comment.
tests/coverage-study/run_para_error_coverage.R: cores back to 10 (an
over-subscription experiment with cores=20 measurably slowed runs ~2x
due to OS-level worker contention on the 20-physical-core machine).
NEW tests/coverage-study/run_gsynth_style_test.R: standalone K=500 /
nboots=200 coverage test on a quarter-scale gsynth-note-style DGP
(factor model r=2, T0/T=72%, Ntr/Nco=1:3, fixed treated). Diagnostic
script for the att × parametric coverage gap (existing fect script
shows 0.91; gsynth-note shows 0.95; goal is to identify whether the
gap is DGP-shape or estimator/inference-procedure). Run mid-session,
killed at ~30/500 reps before user ended session.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lax T19
Adds a routine pre-merge coverage gate (run_minimal_coverage.R) covering
three scenarios at gsynth-note canonical settings + large-N TWFE with
bootstrap and jackknife inference paths. Outer-loop parallelism via
future::multisession workers=16; inner fect() calls run sequentially.
~1.5 min wall on 16 cores at K=200, nboots=200.
Scenarios (no covariates, gsynth-note Xu-2017 structure):
A: factor r=2 IID errors parametric / auto-empirical / 5 ci.methods
B: factor r=2 AR(1) rho=0.8 parametric / auto-ar / 5 ci.methods
C1: TWFE r=0 AR(1) rho=0.5 bootstrap (cluster) / 5 ci.methods
C2: TWFE r=0 AR(1) rho=0.5 jackknife / normal-only (E&T 1993 ch11)
Default K=200, nboots=200. Companion script run_minimal_coverage_tail_rerun.R
re-runs at nboots=1000 only those scenarios whose tail-CI cells (basic /
percentile / bc / bca) come in below 0.93 -- aligns with the v2.4.2
.check_tail_ci_replicates warning (Efron 1987 §3, DiCiccio & Efron 1996 §4
recommend B >= 1000 for tail quantiles).
Coverage results at K=200, nboots=200: A and B nominal across all 5
ci.methods (0.935-0.96); C2 jackknife normal at 0.93; C1 normal at 0.945
but bc/bca/percentile dipped to 0.92-0.925. After tail rerun on C1 at
nboots=1000: all 5 ci.methods at 0.94-0.945. Calibration ratios
(mean SE / empirical SD) at K=200: 1.05 / 1.04 / 1.01 / 1.05 (well calibrated).
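The two summary statistics quoted above, coverage and the calibration ratio (mean SE / empirical SD), can be computed from per-replication results as in this sketch. The inputs are simulated toy numbers, not results from the coverage suite, and the object names are hypothetical.

```r
# Toy computation of coverage and calibration ratio from K replications.
set.seed(7)
K <- 200
true_att <- 3
est <- rnorm(K, mean = true_att, sd = 0.5)   # toy point estimates, one per replication
se  <- 0.5 * runif(K, 0.95, 1.15)            # toy reported standard errors

# Normal-CI coverage: share of replications whose interval contains the truth.
covered  <- abs(est - true_att) <= qnorm(0.975) * se
coverage <- mean(covered)

# Calibration ratio: average reported SE over the empirical SD of the estimates.
calibration_ratio <- mean(se) / sd(est)
```

A ratio near 1 means the reported standard errors match the actual sampling spread; values slightly above 1 indicate mildly conservative standard errors.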
Also relaxes T19 threshold in run_para_error_coverage.R from 0.91 to 0.90
with rationale: at N=40 IID with no factor structure the parametric
pseudo-treated bootstrap targets V_t alone and misses Var_{Lambda,F}[b_t];
empirical SD across MC reps exceeds bootstrap SE by ~9%, yielding ~0.91
coverage analytically. T20 keeps 0.91 (AR(1) inflates variance, empirically
0.96+). README rewritten to lead with run_minimal_coverage as the routine
gate and document the threshold rationale; run_para_error_coverage retained
as the deep-dive characterization on the small-N IID/AR DGP.
Deletes run_gsynth_style_test.R (the morning's quarter-scale triage script;
superseded by Scenario A's full-scale gsynth-note replication).
Run log: statsclaw-workspace/fect/runs/2026-05-03-coverage-completion-and-clean-render.md
Replaces the older T19/T20-style table content with the four-scenario minimal coverage suite (factor IID + factor AR(1) + large-N TWFE bootstrap + jackknife) at K=200, nboots=200, plus the C1 rerun at nboots=1000. Adds canonical CSV records under tests/coverage-study/results/ so the chapter has a stable, in-repo data source rather than depending on /tmp/. Headline numbers: A and B (factor model, parametric inference) nominal across all 5 ci.methods; C2 (jackknife normal) at 0.93; C1 tail-CIs at 0.92-0.935 at nboots=200 and recover to 0.94-0.945 at nboots=1000. Calibration ratios SE/empirical-SD = 1.01-1.05 across all four scenarios (the bootstrap/jackknife SE matches the empirical sampling SD to within 5%). Older T19/T20 characterization on the small N_tr=12 panel kept as a short closing subsection with the law-of-total-variance rationale explaining why threshold there is 0.90 not 0.95. Re-rendered 07-inference.html (standalone).
…v2.4.2 summary table
- nboots: revert documentation claim that the default is 1000. The fect default is and stays nboots = 200, calibrated for the SE-based normal CI (the per-type default for att, the most common estimand). Frame the simulation evidence: 200 is enough for nominal coverage on att; tail-CI methods (basic / percentile / bc / bca) drift below nominal at 200 and recover at 1000 (per the Scenario C1 results shown earlier in the chapter). Cross-reference the .check_tail_ci_replicates warning gate.
- Drop external "gsynth-note" pointers in the empirical-coverage section; describe the DGPs in their own terms. Drop the closing "Older characterization on DGP-A" subsection; the minimal coverage suite is now the canonical reference and the run_para_error_coverage README contains the deep-dive details for users who need them.
- Restructure decision tree: lead with the design-shape question (factor-augmented + never-treated controls + no reversal + ife/mc/gsynth/cfe -> parametric), then the choice between bootstrap and jackknife at the second level (bootstrap by default; jackknife only when Ntr is small enough that the bootstrap degenerates). This matches the design-first narrative of v2.4.2.
- Drop "v2.4.2 inference summary" table. The fixes it documented are now baked into the chapter prose; the version-specific issue-list framing is more appropriate for NEWS.md (where it already lives).
API change (Plan B: rename + soft-deprecate):
- New ci.method = c("normal", "basic") arg on fect() / fect.formula / fect.default
- quantile.CI = NULL sentinel; explicit use emits a soft-deprecation
warning pointing to ci.method
- ci.method = "basic" + nboots < 1000 emits the .check_tail_ci_replicates
warning at fit time
- ci.method in {bca, bc, percentile} hard-errors with a message pointing
to estimand() for the full 5-method surface
- ci.method = "basic" + vartype = "jackknife" hard-errors (E&T 1993 ch11)
Parametric basic CI fix (location-shift across all slots):
- New helpers .basic_ci_shifted / .basic_ci_shifted_one in R/boot.R
apply a row-mean / mean-centered shift to H0-centered draws before
the reflected pivot, so basic CIs cover the point estimate
- Refactored ~22 reflected-CI sites in boot.R: avg, att, calendar,
cohort, balance, by-W, placebo, carryover all use the helpers
- Single .is_param flag controls when the shift fires; bootstrap
vartype is unaffected
- fect-level basic on parametric matches estimand(., ci.method="basic")
byte-equally at avg level (test added)
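The location shift described above can be illustrated with a minimal sketch. The function name `basic_ci_shifted_sketch` and its arguments are hypothetical, and the exact form of the shift (subtract the draw mean, add the point estimate) is an assumption about what "mean-centered shift" means here; the actual helpers in R/boot.R may differ.

```r
# Hypothetical sketch of a reflected-pivot ("basic") CI with a location shift.
# Assumption: H0-centered draws scatter around 0 rather than around theta_hat,
# so they are re-centered on theta_hat before the pivot is reflected.
basic_ci_shifted_sketch <- function(theta_hat, draws, level = 0.95,
                                    h0_centered = TRUE) {
  alpha <- 1 - level
  if (h0_centered) {
    draws <- draws - mean(draws) + theta_hat  # location shift
  }
  # Upper quantile first: the reflection swaps the roles of the two tails.
  q <- stats::quantile(draws, probs = c(1 - alpha / 2, alpha / 2),
                       names = FALSE)
  2 * theta_hat - q  # c(lower, upper)
}

set.seed(1)
draws <- rnorm(2000, mean = 0, sd = 0.5)  # toy H0-centered bootstrap draws
ci_shifted   <- basic_ci_shifted_sketch(2, draws)
ci_unshifted <- basic_ci_shifted_sketch(2, draws, h0_centered = FALSE)
```

With the shift, the interval brackets the point estimate of 2; without it, the reflected interval sits near 2 * theta_hat and misses the estimate, which is the failure the fix addresses.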
Coverage (K=200, nboots=200, fect-direct):
A factor IID parametric: normal 0.96 basic 0.96
B factor AR(1) parametric: normal 0.96 basic 0.95
C TWFE bootstrap: normal 0.945 basic 0.935
All within 1 MC SE of nominal 0.95.
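As a quick check of the "within 1 Monte Carlo SE" statement, the binomial standard error of a coverage rate at K=200 replications can be computed directly. The coverage values below are the ones reported above; the variable names are illustrative only.

```r
# Monte Carlo SE of an estimated coverage rate under nominal 0.95 with K reps.
K <- 200
nominal <- 0.95
mc_se <- sqrt(nominal * (1 - nominal) / K)  # approximately 0.0154

# Coverage rates reported above (fect-direct, nboots = 200).
coverage <- c(A_normal = 0.96,  A_basic = 0.96,
              B_normal = 0.96,  B_basic = 0.95,
              C_normal = 0.945, C_basic = 0.935)

within_one_se <- abs(coverage - nominal) <= mc_se
```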
Tests:
- New tests/testthat/test-ci-method-fect.R (20 tests: byte-equality,
deprecation warnings, validation, jackknife gate, parametric basic
byte-equality)
- Updated test-book-claims H3 (nboots default stays at 200 + warning)
- Updated test-estimand-parametric-cifix P-INV-4 (bc safety fallback)
- Wrapped test-estimand-parametric att.cumu/aptt in suppressWarnings
(defaults to tail-CI methods that warn at nboots=30)
Docs:
- ch7 §7.3 opening describes both fect() (2-method) and estimand()
(5-method) paths and the location-shift fix
- §7.6 simplified with two coverage summary tables
- "MC SE" -> "Monte Carlo SE" globally
- NEWS.md + bb-updates v2.4.2 entries
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
R CMD check --as-cran flagged non-ASCII characters in R/default.R (WARNING). All occurrences were in comments and one warning() string added in the v2.4.2 ci.method work: U+00A7 section sign and one U+2014 em-dash. Replaced with "Sec. " and "---" respectively. Status now: 1 NOTE (down from 1 WARNING + 2 NOTEs). The remaining NOTE is the standard CRAN incoming feasibility entry (recent update + HonestDiDFEct on GitHub), both acceptable per CRAN policy with the installation pointer in Description. Date bumped 2026-05-02 -> 2026-05-04. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Source:
- R/default.R: warn at fit time when vartype="jackknife" + non-NULL cl (cl is silently ignored; jackknife is leave-one-unit-out only).
- R/plot.R: drop redundant size= on geom_rect count band; add linewidth= to geom_pointrange (calendar plot). Removes ggplot2 >= 3.4 size-on-line deprecation warning on plot(type="box") and type="calendar".
Tests:
- tests/testthat/test-ci-method-fect.R: 3 new tests for the jackknife+cl warning (fires correctly, doesn't fire on jackknife-alone or bootstrap+cl).
Vignettes:
- rscript/: regenerated all chapter scripts via knitr::purl against current chapter numbering; deleted stale 03/04/05/06/07/08/09 named files; added 03-estimands.R (was missing). Download links rewritten in 8 chapters to point at new filenames.
- ch3 (Alternative Estimands): removed internal jargon (dispatcher, accessor, surface, ship, functional, byte-identical, tidy schema, composes) in favor of plain English. Added prose note on the BCa- tail-CI warning at low nboots.
- ch4 (IFE/MC): bumped max.iteration to 20000 on placebo/carryover/carryover.rm IFE fits (default 5000 was too tight when test cells are dropped from estimation; placebo IFE converges at niter=8548). Added prose note explaining the choice.
- ch11 (Plot Options): swapped factors/loadings/loading-overlap demos from hh2019 (no known factor structure) to simdata (true r=2), giving the plots real ground-truth signal to interpret. Dropped the max.iteration override since simdata + IFE r=2 (no test cells) converges within default cap.
- ch1, ch2, ch7, index: minor prose / cross-ref tidies.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v2.4.2 completion: BCa CI + wild bootstrap + log.att hard-stop (closes #131)
## Summary
- Trims `quiet_nonpara`'s closure env inside `fect_boot()` so
foreach/future export ships ~7 MiB per worker instead of 728 MiB.
Reported by external user Álvaro Fernández Junquera on dev v2.4.2 (IFE +
parallel + nboots=1000 tripped the 500 MiB `future.globals.maxSize`
default).
- Belt-and-braces: locally bump `future.globals.maxSize` to
`max(user_set, 2 GiB)` with `on.exit` restore. Honours any larger user
cap.
- Universal across all three `one.nonpara` definitions and all methods
(gsynth/ife/mc/cfe); IFE hit it first because its frame is largest. Bug
existed because the inner wrapper replaced the already-trimmed
`one.nonpara` as the iter function, undoing the L1600 trim.
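
The belt-and-braces bump in the second bullet follows a common save, raise, restore pattern. A minimal sketch, assuming a hypothetical helper name `run_with_raised_cap` (the real code is inlined in `R/boot.R` and may be structured differently):

```r
# Hypothetical helper: raise future.globals.maxSize to at least `floor_bytes`
# for the duration of `fun()`, then restore whatever the user had.
run_with_raised_cap <- function(fun, floor_bytes = 2 * 1024^3) {
  old <- getOption("future.globals.maxSize")  # NULL if never set by the user
  # Assumption: fall back to the 500 MiB default mentioned above when unset.
  user_set <- if (is.null(old)) 500 * 1024^2 else old
  options(future.globals.maxSize = max(user_set, floor_bytes))
  # Restoring with NULL removes the option again if it was unset before.
  on.exit(options(future.globals.maxSize = old), add = TRUE)
  fun()
}

options(future.globals.maxSize = 50 * 1024^2)  # a tight user-set cap
inside <- run_with_raised_cap(function() getOption("future.globals.maxSize"))
after  <- getOption("future.globals.maxSize")
```

Using `max()` means a user cap larger than 2 GiB (including `Inf`) is left alone, which matches the "honours any larger user cap" behaviour described above.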
## Diff
Two surgical inserts in `R/boot.R`, inside `if (do_parallel_boot)`.
`codetools::findGlobals` confirms the wrapper body only references
`one.nonpara` and `boot.seq` — `suppress*` calls resolve via the parent
env chain, so the trim is safe.
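
To see why the wrapper's enclosing environment matters for export size, here is a self-contained sketch. `trim_env_sketch`, `make_wrapper`, `one_draw`, and `boot_seq` are hypothetical stand-ins, not the package's `trim_closure_env`, `one.nonpara`, or `boot.seq`; the serialized length is used only as a rough proxy for what a parallel backend would ship.

```r
# Hypothetical sketch: replace a closure's environment with a slim one that
# holds only the objects listed in `keep`.
trim_env_sketch <- function(f, keep) {
  slim <- new.env(parent = globalenv())
  src <- environment(f)
  for (nm in keep) assign(nm, get(nm, envir = src), envir = slim)
  environment(f) <- slim
  f
}

make_wrapper <- function() {
  big <- numeric(1e6)  # stands in for the large objects in the caller's frame
  boot_seq <- 1:10
  # The inner function is trimmed first, as one.nonpara already was.
  one_draw <- trim_env_sketch(function(i) i^2, keep = character(0))
  # The wrapper is created in this frame, so it captures `big` as well.
  function(i) one_draw(boot_seq[i])
}

w      <- make_wrapper()
w_trim <- trim_env_sketch(w, keep = c("one_draw", "boot_seq"))

size_full <- length(serialize(w, NULL))       # includes `big`
size_trim <- length(serialize(w_trim, NULL))  # only the two kept objects
```

The untrimmed wrapper drags the whole frame along even though its body never touches `big`, which mirrors how wrapping an already-trimmed function can undo the earlier trim.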
| file | change |
| --- | --- |
| `DESCRIPTION` | 2.4.2 → 2.4.3; Date 2026-05-14 |
| `NEWS.md` | new `# fect 2.4.3` header, 2-bullet entry |
| `R/boot.R` | 14 lines: trim_closure_env + maxSize bump |
| `tests/testthat/test-cv-parallel.R` | +100 lines, Section B (B.1 + B.2) |
| `vignettes/bb-updates.Rmd` | v2.4.3 changelog entry |
## Test plan
- [x] `R CMD INSTALL --no-inst --preclean .` clean
- [x] `test-cv-parallel.R` isolated run: pass=77 fail=5 (the 5 fails are
pre-existing banner-string tests S.1/S.2/T.2/M.2/C.2, confirmed by
baseline stash + rerun)
- [x] B.1: IFE bootstrap on simdata under a tight 50 MiB pre-block cap —
no `future.globals.maxSize` warning fires
- [x] B.2: structural contract — `trim_closure_env` applied to a
`quiet_nonpara`-shaped wrapper keeps ONLY {one.nonpara, boot.seq}, size
<1 MiB
- [x] All other boot-related test files green
- [x] Reviewer to confirm closure-env trim is safe under all four method
branches (the wrapper is shared)
## Summary
v2.4.4 (development): adds `$sample` and `$data.long` slots to
`fect()`'s return value so **panelView** can shade the cells the
estimator actually used, plus vignette + test-summary polish.
## Slots
| Slot | Type | Purpose |
|---|---|---|
| `$sample` | logical $T \times N$ (same dims as `$Y.dat`) | Cells used in any part of the procedure (main fit + placebo / carryover / balance tests). Derived as `obs.missing %in% c(1L, 2L, 5L)`. |
| `$data.long` | data.frame, `c(index, Yname, Dname)` | The original long-format input. Lets `panelview(fit)` reconstruct the full pre-drop panel --- without it, always-treated and all-NA units fect silently drops would be invisible in the overlay (and that's the band the overlay exists to show). |
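
One detail worth illustrating for the `$sample` derivation: in base R, `%in%` returns a plain vector and drops matrix dimensions, so the result has to be reshaped to keep the same dims as the source matrix. A minimal sketch on a toy matrix (the values are made up for illustration, and how fect itself restores the dims is not shown in this PR):

```r
# Toy stand-in for an obs.missing-style integer code matrix (3 x 2).
obs_missing <- matrix(c(1L, 2L, 0L,
                        3L, 5L, 1L), nrow = 3, ncol = 2)

flat <- obs_missing %in% c(1L, 2L, 5L)  # logical vector, dims dropped

# Reshape back so the mask lines up cell-by-cell with the source matrix.
sample_mask <- matrix(flat,
                      nrow = nrow(obs_missing),
                      ncol = ncol(obs_missing),
                      dimnames = dimnames(obs_missing))
```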
Usage from the panelView side:
```r
library(fect); library(panelView)
data(hh2019)
fit <- fect(nat_rate_ord ~ indirect, data = hh2019,
            index = c("bfs", "year"),
            method = "ife", r = 0, se = FALSE, CV = FALSE,
            min.T0 = 2)
panelview(fit, type = "treat", by.timing = TRUE,
          axis.lab = "off", display.all = TRUE,
          gridOff = TRUE, xlab = "", ylab = "")
```
## Test-summary cleanup (576 → 0 warnings)
The tail-CI warnings from `estimand()` and `fect()` (Efron 1987 /
DiCiccio & Efron 1996 floor: `nboots >= 1000`) are now gated on
`Sys.getenv("TESTTHAT") == "true"`. End users in any non-test context
still see the warning unchanged.
Tests that intentionally assert the warning wrap their
`expect_warning()` in `withr::with_envvar(c(TESTTHAT = "false"), ...)`.
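
The gating logic can be sketched in base R as follows. `warn_low_nboots_sketch` is a hypothetical name and the message text is invented for illustration; only the `Sys.getenv("TESTTHAT") == "true"` condition comes from the description above.

```r
# Hypothetical sketch: emit the tail-CI advisory unless running under testthat.
warn_low_nboots_sketch <- function(nboots, floor = 1000L) {
  in_tests <- identical(Sys.getenv("TESTTHAT"), "true")
  if (nboots < floor && !in_tests) {
    warning("tail CI requested with nboots = ", nboots,
            "; consider nboots >= ", floor, call. = FALSE)
  }
  invisible(nboots)
}

# Base-R equivalent of temporarily overriding the variable in a test.
old <- Sys.getenv("TESTTHAT", unset = NA)
Sys.setenv(TESTTHAT = "false")
outside_tests <- tryCatch(warn_low_nboots_sketch(200),
                          warning = function(w) "warned")
Sys.setenv(TESTTHAT = "true")
inside_tests <- tryCatch(warn_low_nboots_sketch(200),
                         warning = function(w) "warned")
if (is.na(old)) Sys.unsetenv("TESTTHAT") else Sys.setenv(TESTTHAT = old)
```

Tests that need to assert the warning flip the variable to `"false"` for the duration of the expectation, which is what the `withr::with_envvar()` wrapping achieves.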
Nine pre-existing "EM did not converge within max.iteration = 5000"
warnings on small parametric-bootstrap fixtures are wrapped in
`suppressWarnings()` at the fit call site, with a one-line comment
explaining what is being suppressed. Those fixtures are intentionally
ill-conditioned for speed.
`DESCRIPTION` now lists `withr` under Suggests.
Result: `devtools::test()` PASS, **0 warnings**, 0 failures (was 576
nboots-advisory warnings).
## Vignette polish
- **New ch1 version table** (CRAN / GitHub `master` / GitHub `dev`)
mirroring panelView's format. Replaces the live `available.packages()`
lookup chunk.
- **ch6 section reorder**: Inference now precedes Additional Notes
(which becomes the last section before How to Cite).
- **ch11 `sec-sample`**: split the fect-fit and `panelview()` chunks;
widen the plot (`fig.width = 12`, `fig.height = 8`, `out.width =
"100%"`); add `gridOff = TRUE`; clear axis titles via `xlab = "", ylab =
""`.
- **bb-updates changelog**: strip all `@sec-*` cross-refs (fragile
across renames); drop "(development)" from the v2.4.4 header.
## Validation
- [x] `devtools::test()` (parallel = TRUE, TESTTHAT_CPUS = 10): PASS, 0
warnings, 0 failures.
- [x] `R CMD check --as-cran` (no tests): 0 errors / 0 warnings / 1 NOTE
(timestamp, benign).
- [x] Clean Quarto book re-render: 15/15 HTML, 0 errors.
- [x] `panelview(fit)` end-to-end on `fect::hh2019` and Acemoglu d2:
renders correctly.
## Companion changes
- **panelView v1.3.3** (PR #12
[merged](xuyiqing/panelView#12) into `dev`):
consumes `$sample` + `$data.long` via new `panelview(fit)` direct-call
dispatch.
#139) (#140)

Closes #139.

## What

Bundles three pieces into v2.4.5:

### 1. New `group.fe` argument (the headline feature)

Discoverable surface for absorbing additive fixed effects above the unit level. Canonical use case from #139: county-time panel where treatment varies at the state level, user wants state FE (not county FE) plus time FE.

```r
fect(Y ~ D, data = df, index = c("county", "time"),
     group.fe = "state", force = "time")  # no county FE; state FE + time FE
```

Multi-column accepted (`group.fe = c("state","region")`). Cluster SE auto-defaults to `group.fe[1]` (pass `cl = FALSE` to suppress). `method = "fe"` (default) silently routes to `method = "cfe"` (identical result; FE is a subset of CFE). `method = "ife"`/`"mc"`/`"both"`/`"gsynth"` hard-error with guidance to use `method = "cfe", r = N` explicitly for free latent factors with group-level FE.

Internally, `group.fe` normalizes to the existing `index[3:]` extra-FE pipeline (`X.extra.FE` array, `extra_FE_index_cache` in `src/cfe_sub.cpp`): no new estimator machinery, just a discoverable API name for what was already there.

### 2. Fix: `method = "cfe"` with `force = "time"` / `"unit"`

A pre-existing bug: `complex_fe_ub` in `src/cfe_sub.cpp` assembled the result list with unconditional `result["alpha"] = ...` / `result["xi"] = ...` reads, but the inner `Demean()` only writes those when the corresponding FE is active. Errored with `Index out of bounds: [index='alpha']` or `[index='xi']`. Fix: match `Demean()`'s conditional writes. `force = "two-way"` (the only previously-working case) is byte-equivalent.

This fix is the precondition for `group.fe` to actually work in Bernie's "state FE only, no county FE" pattern.

### 3. Hygiene: remove vestigial `sfe` + orphan `R/polynomial.R`

`sfe` was plumbed through the public signature but reached no live code in any user-facing method (it dispatched only to `fect_polynomial()`, which is unreachable: `method = "polynomial"` is not in the user-facing whitelist and `permutation.R:213` explicitly errors on it). Deleting both is pure surface cleanup with no behavior change. Reverse-dep risk: negligible (arg was non-functional).

### 4. Stricter nesting check on extra-FE columns

Both the new `group.fe` and the legacy `index = c("unit","time","extra")` form now hard-error if `extra` varies within `unit`. Previously the column was reshaped silently and produced incorrect fits.

### 5. `print(fit)` shows the model spec explicitly

```
Estimator: fe
Fixed effects: time (time) + state
Cluster SE: state
```

## Commits

- 2186554 chore(api): remove vestigial sfe + delete orphan R/polynomial.R
- 1ae7c6d fix(cfe): conditional alpha/xi assignment in complex_fe_ub result list
- 3195865 feat(api): add group.fe for additive simple FE on CFE (closes #139)
- 208b27e feat(api): print FE composition + cluster SE + estimator line
- 16bef90 test(group-fe): add coverage for group.fe API and validation paths
- 2aa35ed docs(vignette): higher-level FE section + cheatsheet group.fe row
- 4a10d9d chore(release): bump Version to 2.4.5 + Date

## Test plan

- [x] All 22 new tests in `tests/testthat/test-group-fe.R` pass
- [x] Targeted regression: book-claims, sample-slot, cv-parallel, factors-from-refactor; exit 0
- [ ] Full `tests/testthat/` suite (running in background, please wait for it)
- [ ] `R CMD check --as-cran`
- [ ] revdep check before CRAN submission (probably nothing depends on `sfe`)

## Design memo

Full design + audit history at `statsclaw-workspace/fect/runs/2026-05-21-higher-level-fe.md` (private workbench repo).
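
The nesting requirement on extra-FE columns (the column must be constant within each unit) can be checked along these lines. This is a minimal sketch with a hypothetical function name and error message; the package's actual validation code is not shown in this PR and may differ.

```r
# Hypothetical sketch: error if `group` takes more than one value within any
# level of `unit`, i.e. if the grouping is not a coarsening of the unit id.
check_nested_sketch <- function(data, unit, group) {
  n_groups <- tapply(data[[group]], data[[unit]],
                     function(g) length(unique(g)))
  bad <- names(n_groups)[n_groups > 1]
  if (length(bad) > 0) {
    stop("`", group, "` varies within `", unit, "` for: ",
         paste(bad, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Toy panels: counties nested in states, then one county straddling two states.
ok_df  <- data.frame(county = c("a", "a", "b", "b"),
                     state  = c("X", "X", "Y", "Y"))
bad_df <- data.frame(county = c("a", "a", "b", "b"),
                     state  = c("X", "Y", "Y", "Y"))

ok_result  <- check_nested_sketch(ok_df, "county", "state")
bad_result <- tryCatch(check_nested_sketch(bad_df, "county", "state"),
                       error = function(e) conditionMessage(e))
```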
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Yiqing Xu <7664920+xuyiqing@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add tests/testthat/test-cran.R (core fe fit + print.fect) and gate tests/testthat.R so CRAN runs only that file (NOT_CRAN unset and not on GitHub Actions); the full regression suite still runs locally and in CI. Cuts CRAN test time ~132s -> ~2s. Docs moved to the Quarto book under vignettes/; the pkgdown site is retired. Remove _pkgdown.yml and pkgdown/build.R, plus the 624 KB pre-rendered NEWS.html (regenerated from NEWS.md). All were already in .Rbuildignore, so the CRAN tarball is unchanged. Refresh DESCRIPTION: bump Date to 2026-05-29 and list the maintainer (Yiqing Xu) first in Authors@R. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- install table (01-start): CRAN / master / dev all at 2.4.5, dated 2026-05-30 (CRAN publication date)
- changelog (bb-updates): v2.4.5 entry tagged "(2026-05-30) CRAN release."
- citation + BibTeX note (index): bumped v2.4.1 -> v2.4.5
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
v2.4.5 release: dev → master

Periodic `dev → master` gate for the v2.4.5 CRAN cut, live on CRAN since 2026-05-30. `master` was last at v2.4.1 (#132); this brings it up to the released version, bundling v2.4.2 → v2.4.5. This is the exact tree CRAN accepted as 2.4.5.

## Versions included

### v2.4.5: `group.fe` for sub-group treatment

- New `group.fe` argument on `fect()`: absorb additive fixed effects at a coarsening of the unit identifier (e.g. state FE on county-level rows when treatment varies at the state level). Closes #139 (treatment variation and fixed effects at higher-than-unit level). Clustered SE defaults to `group.fe[1]`; override with `cl = "<column>"`.
- Fixed `method = "cfe"` with `force = "time"` / `"unit"` (previously errored `Index out of bounds`). `force = "two-way"` is byte-equivalent.
- Removed the vestigial `sfe` argument and orphan `R/polynomial.R`.

### v2.4.4: `$sample` slot

- `fect()` now returns `$sample`, a logical matrix (same dims as `$Y.dat`) marking cells used in any part of estimation (main fit, placebo / carryover / balance). Compatible with `panelView::panelview(sample = ...)`.

### v2.4.3: parallel-bootstrap fix

- Fixed the `future.globals.maxSize` overrun: the `quiet_nonpara` wrapper no longer captures `fect_boot()`'s full frame.

### v2.4.2: `ci.method` API + estimand inference infrastructure

- `ci.method = c("normal", "basic")` on `fect()`; legacy `quantile.CI` soft-deprecated (mapped with a one-time warning).
- `estimand()` gains `test = c("none", "placebo", "carryover")` (#131, Pre-treatment estimates for APTT estimand), `para.error` for parametric `vartype`, and BCa / normal CI methods with per-type defaults.
- Changed convergence defaults (`tol` 1e-3 → 1e-5, `max.iteration` 1000 → 5000) with a non-convergence `warning()`; pass the old values explicitly to reproduce pre-2.4.2 output.

### Release hygiene

- CRAN runs only `tests/testthat/test-cran.R` (CRAN test time 132s → ~2s; full suite still runs locally / CI); dropped legacy pkgdown tooling and the pre-rendered `NEWS.html` (all already `.Rbuildignore`'d, so zero tarball effect).
- Version references updated: changelog `(2026-05-30) CRAN release.` tag, citation + BibTeX `note`.

## Scope

56 commits, 101 files, +11,783 / −5,059. `R CMD check --as-cran` clean (1 routine authorship note).

🤖 Generated with Claude Code