
Address #412 holistic re-audit residuals (R2)#430

Merged: igerber merged 5 commits into main from fix-audit-412-r2 on May 14, 2026

@igerber igerber commented May 14, 2026

Summary

A holistic codex review of the merged #412 + cleanup #422 state surfaced three documentation/test gaps that the per-PR cleanup review path could not see, because that path is scoped only to the cleanup diff. All three are at most P3 in severity, but each is real claim-vs-coverage drift.

  • REGISTRY schema list omits `het_*` columns: the top-level `Note (Phase 3 by_path ...)` `to_dataframe(level="by_path")` schema list still describes the pre-heterogeneity column set. Add the six `het_*` columns that the `_to_dataframe` path has emitted since Phase 5.

  • Parity tests under-assert the golden payload: `TestDCDHDynRParityHeterogeneity` and `TestDCDHDynRParityByPathHeterogeneity` assert only `beta` and `se`, leaving `t_stat`, `p_value`, `conf_int`, and `n_obs` unpinned. A regression in the inference extraction or final-`df_survey` refresh could ship while parity still passes. Pin `t_stat` at `SE_RTOL` (`t = beta / se` is invariant to the Wald-test critical-value distribution) and `n_obs` exactly.

  • Real undocumented Z-vs-t structural deviation: extending the parity assertions surfaced a real, previously undocumented Python-vs-R deviation. `_compute_heterogeneity_test` passes `df=None` to `safe_inference`, so Python uses the normal Z critical value (~1.96) for `p_value` and `conf_int`, whereas R `did_multiplegt_dyn(..., predict_het)` uses the t-distribution with df = n - k from the WLS regression. This structural gap produces relative differences of roughly 0.1-2% on CIs and p-values, which exceed `SE_RTOL` (verified empirically: CI gap ~0.17% on h=1). Document the deviation in the heterogeneity R-parity Note; pin only `beta`, `se`, `t_stat`, and `n_obs` in the parity tests, and intentionally skip `p_value` / `conf_int` parity. Add a TODO row tracking the optional df-threading work.

No estimator behavior, weighting, variance/SE computation, or default-statistical surface changed - documentation accuracy plus expanded regression coverage only.
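To make the Z-vs-t gap concrete, the sketch below compares the two-sided 95% normal critical value with Student-t critical values at a few degrees of freedom. It is only an illustration: the df values are assumptions rather than the parity fixture's actual n - k, and `t_pdf`, `t_cdf`, `t_crit`, and `ci_halfwidth_gap` are hypothetical stdlib-only helpers (Simpson integration plus bisection), not part of `diff_diff`.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: float, steps: int = 1000) -> float:
    """CDF for x >= 0: Simpson's rule on [0, x] plus the symmetric lower half."""
    if x == 0:
        return 0.5
    h = x / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_crit(df: float, alpha: float = 0.05) -> float:
    """Two-sided critical value, found by bisection on the CDF."""
    lo, hi, target = 0.0, 50.0, 1 - alpha / 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Critical value implied by a normal reference distribution (the df=None case).
z_crit = NormalDist().inv_cdf(0.975)


def ci_halfwidth_gap(df: float) -> float:
    """Relative widening of the CI half-width when t replaces Z."""
    return t_crit(df) / z_crit - 1.0


if __name__ == "__main__":
    for df in (30, 100, 1000):  # illustrative df values only
        print(f"df={df:5d}  t_crit={t_crit(df):.4f}  gap={ci_halfwidth_gap(df):.2%}")
```

Because `beta` and `se` are shared, the CI half-width ratio equals the ratio of critical values, so the gap shrinks toward zero as df grows while `t_stat = beta / se` is unaffected.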

Methodology references

  • Method: `ChaisemartinDHaultfoeuille` per-path heterogeneity testing (Web Appendix Section 1.5 / Lemma 7), composed with the existing `by_path` / `paths_of_interest` selector machinery.
  • Source: de Chaisemartin & D'Haultfoeuille (2020); dynamic companion paper.
  • Deviation documented: Z-vs-t critical value in the heterogeneity inference layer; tracked in TODO.

Test plan

  • CI - `pytest tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityHeterogeneity tests/test_chaisemartin_dhaultfoeuille_parity.py::TestDCDHDynRParityByPathHeterogeneity` both pass locally with the expanded assertions.
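The shape of the expanded parity assertions can be sketched as follows. This is a minimal illustration, not the test code in the repository: the dict layout, the `assert_het_parity` helper, the numeric payloads, and the `SE_RTOL` value shown here are all assumptions.

```python
import math

# Hypothetical tolerance; the real SE_RTOL constant lives in the parity test module.
SE_RTOL = 1e-4


def assert_het_parity(py: dict, r: dict) -> None:
    """Pin only the fields that do not depend on the critical-value distribution."""
    assert math.isclose(py["beta"], r["beta"], rel_tol=SE_RTOL)
    assert math.isclose(py["se"], r["se"], rel_tol=SE_RTOL)
    # t = beta / se, so it matches whether the reference law is Z or t.
    assert math.isclose(py["t_stat"], r["t_stat"], rel_tol=SE_RTOL)
    # Sample size is an integer count and must match exactly.
    assert py["n_obs"] == r["n_obs"]
    # p_value and conf_int are deliberately NOT compared: Python uses Z, R uses t.


# Made-up payloads for one horizon; only the p-values differ.
py_entry = {"beta": 0.50, "se": 0.10, "t_stat": 5.0, "n_obs": 200, "p_value": 5.7e-7}
r_entry = {"beta": 0.50, "se": 0.10, "t_stat": 5.0, "n_obs": 200, "p_value": 1.3e-6}

assert_het_parity(py_entry, r_entry)  # passes despite the differing p_value
```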

🤖 Generated with Claude Code

A holistic codex review of the merged #412 + cleanup #422 state
surfaced three documentation/test gaps that the per-PR cleanup review
path could not see (it only scopes to the cleanup diff). All three
are at-most P3 in severity but each is real claim-vs-coverage drift.

1. REGISTRY's top-level `Note (Phase 3 by_path ...)` `to_dataframe(
   level="by_path")` schema list omits the `het_*` columns
   (`het_beta`, `het_se`, `het_t_stat`, `het_p_value`,
   `het_conf_int_lower`, `het_conf_int_upper`) that `_to_dataframe`
   has always emitted since the Phase 5 heterogeneity wave landed.
   Add them to the schema list so the registry contract matches the
   implementation.

2. The two new parity tests
   (`TestDCDHDynRParityHeterogeneity`,
    `TestDCDHDynRParityByPathHeterogeneity`) assert only `beta` and
   `se` from the R golden payload, leaving `t_stat`, `p_value`,
   `conf_int`, and `n_obs` unpinned. A regression in the inference
   extraction or final-`df_survey` refresh could ship while parity
   still passes. Pin `t_stat` at `SE_RTOL` (invariant to critical-
   value distribution since `t = beta / se`) and `n_obs` exactly.

3. While extending the parity assertions, surfaced a real Python-vs-R
   structural deviation that was undocumented:
   `_compute_heterogeneity_test` passes `df=None` to `safe_inference`,
   so Python uses the normal Z critical value (~1.96) for `p_value`
   and `conf_int`. R `did_multiplegt_dyn(..., predict_het)` uses the
   t-distribution with df = n - k from the WLS regression. The
   structural gap produces ~0.1-2% rtol gaps on CIs and p-values
   that exceed `SE_RTOL` (verified empirically on the parity fixture:
   CI gap ~0.17% on h=1). Document the deviation in the heterogeneity
   R-parity Note. Pin only `beta`, `se`, `t_stat`, `n_obs` in the
   parity tests; `p_value` and `conf_int` parity intentionally
   skipped. Add a TODO row tracking the optional df-threading work.

No estimator behavior, weighting, variance/SE computation, or
default-statistical surface changed - documentation accuracy plus
expanded regression coverage only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

Overall Assessment

✅ Looks good

Executive Summary

  • Affected method: ChaisemartinDHaultfoeuille heterogeneity testing / predict_het, including the by_path surface from Web Appendix Section 1.5 / Lemma 7.
  • The diff is documentation + test-only; _compute_heterogeneity_test() and inference behavior are unchanged.
  • The new REGISTRY/TODO entries correctly document the existing Python-vs-R heterogeneity inference gap: Python still calls safe_inference(..., df=None) on the non-survey heterogeneity path, so p_value/conf_int are z-based while R uses finite-df t inference (diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106, TODO.md:L64-L64).
  • The schema correction is accurate: to_dataframe(level="by_path") already emits the six het_* columns and fills placebo rows with NaN (diff_diff/chaisemartin_dhaultfoeuille_results.py:L1819-L1941).
  • New parity assertions materially improve coverage by pinning t_stat and n_obs.
  • P3 only: there is still one wording mismatch in the refreshed REGISTRY note, and one remaining non-survey test blind spot for heterogeneity p_value/conf_int.
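As a rough picture of the schema point above, the sketch below builds by_path-style rows in which the six `het_*` columns are always present and placebo rows carry NaN. Only the `het_*` column names come from this PR; the `path`, `horizon`, and `effect` keys, the `by_path_row` helper, and the numbers are hypothetical.

```python
import math

# The six heterogeneity columns named in the PR.
HET_COLUMNS = (
    "het_beta",
    "het_se",
    "het_t_stat",
    "het_p_value",
    "het_conf_int_lower",
    "het_conf_int_upper",
)


def by_path_row(path, horizon, effect, het=None):
    """Build one row; `het` is None for rows without a heterogeneity test."""
    row = {"path": path, "horizon": horizon, "effect": effect}
    for col in HET_COLUMNS:
        # Columns are always emitted; missing results are NaN-filled.
        row[col] = math.nan if het is None else het[col]
    return row


het_h1 = {
    "het_beta": 0.5,
    "het_se": 0.1,
    "het_t_stat": 5.0,
    "het_p_value": 5.7e-7,
    "het_conf_int_lower": 0.304,
    "het_conf_int_upper": 0.696,
}
effect_row = by_path_row(path=(0, 1, 1), horizon=1, effect=0.8, het=het_h1)
placebo_row = by_path_row(path=(0, 1, 1), horizon=-1, effect=0.02)
```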

Methodology

  • P3 docs/methodology/REGISTRY.md:L643-L643, diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106: The refreshed **Per-path heterogeneity testing** paragraph still describes the non-survey branch in WLS terms ("standard WLS vcov for non-survey" / "df = n - k from the WLS regression"), while the implementation’s non-survey path is plain OLS with no weights. Impact: the methodology note now contains a claim-vs-code mismatch in the same paragraph used to explain the new Z-vs-t deviation. Concrete fix: change the non-survey wording to OLS, and only use WLS language for the survey_design path unless an exact R source citation is added for the R-side regression type.
  • No other methodology findings. The actual Z-vs-t divergence is now properly documented and tracked, so it is informational rather than a blocker under the stated review rules (docs/methodology/REGISTRY.md:L643-L643, TODO.md:L64-L64).

Code Quality

  • None.

Performance

  • None.

Maintainability

  • None.

Tech Debt

  • None. The finite-df heterogeneity follow-up is now properly tracked in TODO.md:L64-L64.

Security

  • None.

Documentation/Tests

  • P3 tests/test_chaisemartin_dhaultfoeuille_parity.py:L1394-L1401, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1491-L1505, tests/test_chaisemartin_dhaultfoeuille.py:L2822-L2869, tests/test_chaisemartin_dhaultfoeuille.py:L10132-L10148: After intentionally skipping R parity for heterogeneity p_value/conf_int, the suite still does not backfill a local non-survey invariant that these fields equal safe_inference(beta, se, df=None) on the global and by_path heterogeneity surfaces. Impact: a regression isolated to heterogeneity p_value/conf_int extraction could still pass. Concrete fix: add one global and one by_path regression test that recompute safe_inference from stored beta/se and assert t_stat, p_value, and conf_int.
  • Verification note: I could not execute the touched tests in this environment because pytest and pandas are unavailable.

Restored CI reviewer R0 on PR #430 flagged two follow-on items:

1. REGISTRY's refreshed heterogeneity Note described the non-survey
   variance branch in WLS terms ("standard WLS vcov for non-survey",
   "df = n - k from the WLS regression"). The implementation at
   `chaisemartin_dhaultfoeuille.py:5095-5106` is plain OLS via
   `solve_ols(design, dep_arr, return_vcov=True)` with no weights
   on the non-survey path; only the survey branch uses WLS. Adjust
   the wording so the deviation paragraph and the variance-machinery
   description both correctly attribute the df to the OLS regression.
   Same wording fix applied to the matching TODO row.

2. By intentionally skipping R parity for heterogeneity `p_value`
   and `conf_int` (Python Z vs R t deviation), the suite no longer
   covers a regression isolated to those fields' extraction or
   `_refresh_path_inference` ordering. Backfill that gap with a
   Python-side invariant: assert that stored `t_stat` / `p_value` /
   `conf_int` equal `safe_inference(beta, se, df=None)` on every
   populated horizon. Two tests:
     - `TestHeterogeneityTesting::test_heterogeneity_inference_matches_safe_inference`
       (global surface, all populated horizons)
     - `TestByPathHeterogeneity::test_per_path_heterogeneity_inference_matches_safe_inference`
       (by_path surface, all populated (path, horizon) entries)
   Both close the regression-coverage gap without re-introducing
   the Z-vs-t structural mismatch in R parity.

No runtime behavior change.
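The invariant these two tests rely on can be sketched as below. `normal_inference` is a hypothetical stand-in for `safe_inference(beta, se, df=None)`, assuming the normal-based behavior described above; the real function's return shape and the stored-result layout are assumptions here.

```python
import math
from statistics import NormalDist


def normal_inference(beta: float, se: float, alpha: float = 0.05):
    """Hypothetical stand-in: Z-based t_stat, two-sided p-value, and CI."""
    nd = NormalDist()
    t_stat = beta / se
    p_value = 2 * (1 - nd.cdf(abs(t_stat)))
    z = nd.inv_cdf(1 - alpha / 2)
    return t_stat, p_value, (beta - z * se, beta + z * se)


def check_entry(entry: dict) -> None:
    """Recompute inference from stored beta/se and compare with stored fields."""
    t_stat, p_value, (lo, hi) = normal_inference(entry["beta"], entry["se"])
    assert math.isclose(entry["t_stat"], t_stat, rel_tol=1e-12)
    assert math.isclose(entry["p_value"], p_value, rel_tol=1e-12)
    assert math.isclose(entry["conf_int"][0], lo, rel_tol=1e-12)
    assert math.isclose(entry["conf_int"][1], hi, rel_tol=1e-12)


def make_entry(beta: float, se: float) -> dict:
    """Simulate what an estimator would store for one horizon (made-up data)."""
    t_stat, p_value, conf_int = normal_inference(beta, se)
    return {"beta": beta, "se": se, "t_stat": t_stat,
            "p_value": p_value, "conf_int": conf_int}


# Made-up per-horizon results keyed by horizon.
stored = {1: make_entry(0.40, 0.15), 2: make_entry(-0.10, 0.20)}
for entry in stored.values():
    check_entry(entry)
```

A check of this kind catches a regression that alters only `p_value` or `conf_int`, which the R parity tests no longer cover.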

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: e6d050a5749c12e1d62720e56f020c9005750a89


Overall Assessment

✅ Looks good

Executive Summary

  • Re-review scope: the prior P3 coverage gap is addressed. New non-survey regression tests now assert heterogeneity t_stat, p_value, and conf_int equal safe_inference(beta, se, df=None) on both global and by_path surfaces (tests/test_chaisemartin_dhaultfoeuille.py:L2870-L2906, tests/test_chaisemartin_dhaultfoeuille.py:L10120-L10166).
  • The schema documentation fix is correct: to_dataframe(level="by_path") already emits the six het_* columns and NaN-fills placebo rows (docs/methodology/REGISTRY.md:L643-L643, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1819-L1941).
  • The Z-vs-t heterogeneity inference divergence is now properly documented and tracked, and the parity tests appropriately pin only beta, se, t_stat, and n_obs (docs/methodology/REGISTRY.md:L643-L643, TODO.md:L64-L64, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1378-L1401, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1478-L1505).
  • P3: one residual prose mismatch remains: some updated methodology/test/TODO text still says “WLS” for the non-survey heterogeneity parity gap even though _compute_heterogeneity_test()’s non-survey branch is plain OLS (diff_diff/chaisemartin_dhaultfoeuille.py:L4896-L4902, diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106, docs/methodology/REGISTRY.md:L643-L643, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1394-L1401, TODO.md:L64-L64).

Methodology

  • Severity P3. Impact: the refreshed by-path heterogeneity note still attributes the parity tolerance to “small WLS denominator” drift, and the parity/TODO prose still refers to a “WLS regression/df”, while the implementation and main heterogeneity note describe the non-survey branch as plain OLS (diff_diff/chaisemartin_dhaultfoeuille.py:L4896-L4902, diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106, docs/methodology/REGISTRY.md:L637-L643, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1394-L1401, TODO.md:L64-L64). Concrete fix: replace the remaining “WLS” references with “OLS” or neutral “finite-df regression” wording in the registry note, parity comment, and TODO row.
  • No other methodology findings. The actual Z-vs-t deviation is documented in docs/methodology/REGISTRY.md and tracked in TODO.md, so it is informational rather than blocking under the stated review rules (docs/methodology/REGISTRY.md:L643-L643, TODO.md:L64-L64).

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No new untracked tech-debt findings. The remaining heterogeneity critical-value divergence and the deferred per-path placebo-heterogeneity question are explicitly tracked in TODO.md:L63-L64.

Security

  • No findings.

Documentation/Tests

  • No findings. The previous P3 test blind spot is addressed by the new local safe_inference invariance tests and expanded parity assertions (tests/test_chaisemartin_dhaultfoeuille.py:L2870-L2906, tests/test_chaisemartin_dhaultfoeuille.py:L10120-L10166, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1378-L1401, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1478-L1505).
  • Verification note: I could not execute the targeted tests in this environment because pytest is not installed.

…aces

R1 review caught three residual "WLS" references in the heterogeneity
deviation paragraphs that escaped the first R1 commit:

1. REGISTRY R-parity note still attributed the SE tolerance to "small
   WLS denominator-and-cohort-recentering numerical drift". The non-
   survey path is OLS; the recentering observation applies to the OLS
   denominator. Adjust the wording.

2. Parity-test inline comment still said "uses t-distribution with df
   = n - k from the WLS regression". Same fix.

3. TODO row still said "Either thread the WLS df into safe_inference".
   Same fix.

The legitimate "uses WLS" reference in the heterogeneity Note (under
the "Note (survey support)" sub-paragraph) is intentionally retained -
the SURVEY path IS WLS; only the non-survey heterogeneity branch is
OLS. The remaining WLS occurrence at REGISTRY.md:1271 is in an
unrelated estimator's note.

No runtime behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: 21ffbcbf2f281fa7f933a63133e1217b5c936de2


Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • Methods touched are ChaisemartinDHaultfoeuille heterogeneity testing and by_path heterogeneity testing (Web Appendix Section 1.5 / Lemma 7). The diff changes docs/tests/TODO only; no estimator implementation changed.
  • The prior re-review wording issue is resolved in the diffed files: the updated registry/TODO/parity comments now correctly describe non-survey predict_het as plain OLS, while the implementation continues to use WLS only on the survey branch (docs/methodology/REGISTRY.md:L637-L643, TODO.md:L64-L64, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1385-L1401, diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5223).
  • The schema documentation fix is correct: to_dataframe(level="by_path") now documents the het_* columns already emitted by the results layer (docs/methodology/REGISTRY.md:L643-L643, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1819-L1941).
  • The earlier heterogeneity test blind spot is closed: new local invariants pin stored inference to safe_inference(beta, se, df=None), and parity now asserts t_stat and n_obs while intentionally leaving p_value/conf_int unpinned because the Z-vs-t deviation is documented and tracked (tests/test_chaisemartin_dhaultfoeuille.py:L2870-L2906, tests/test_chaisemartin_dhaultfoeuille.py:L10120-L10167, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1378-L1401, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1468-L1505).
  • [Newly identified] One adjacent in-code docstring remains stale: path_heterogeneity_effects still says beta is a WLS coefficient, which is only true under survey_design (diff_diff/chaisemartin_dhaultfoeuille_results.py:L427-L440).

Methodology

  • Severity P3 [Newly identified]. Impact: diff_diff/chaisemartin_dhaultfoeuille_results.py:L427-L440 still documents per-path heterogeneity beta as a WLS coefficient, contradicting both the non-survey implementation (diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106) and the updated registry/TODO/parity text. This is documentation drift only and does not affect estimates or inference. Concrete fix: update that result-class docstring to say the non-survey path is plain OLS and the survey path is WLS.
  • No other methodology findings. The Z-vs-t heterogeneity inference divergence is explicitly documented in docs/methodology/REGISTRY.md:L643-L643 and tracked in TODO.md:L64-L64, so it is non-blocking under the stated review rules.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings beyond the docstring drift noted above.

Tech Debt

  • No findings. The new TODO entry at TODO.md:L64-L64 properly tracks the deferred heterogeneity critical-value mismatch.

Security

  • No findings.

Documentation/Tests

  • No findings in the diff. The added regression coverage is appropriate and does not paper over the documented R/Python inference difference (tests/test_chaisemartin_dhaultfoeuille.py:L2870-L2906, tests/test_chaisemartin_dhaultfoeuille.py:L10120-L10167, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1378-L1401, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1468-L1505).
  • Verification note: I could not execute the targeted tests in this environment because the default Python is missing pytest, numpy, and pandas.

R2 review caught one more stale "WLS" reference in the result-class
docstring: `path_heterogeneity_effects` described `beta` as a WLS
coefficient, but the non-survey path is plain OLS (only the survey
branch is WLS). Clarify that `beta` is OLS on the non-survey path
and WLS-on-pweights only under `survey_design`.

The global `heterogeneity_effects` docstring is already generic
("Per-horizon heterogeneity test results") and needs no change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: f94a88ae3964843c4346f1cb6691dbe69d798dcc


Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • Affected methods are ChaisemartinDHaultfoeuille heterogeneity testing and per-path heterogeneity testing under by_path/paths_of_interest (Web Appendix Section 1.5, Lemma 7).
  • This diff does not change estimator behavior, weighting, SE computation, or defaults; it changes documentation, TODO tracking, and regression coverage only.
  • The prior stale per-path heterogeneity results docstring is fixed and now matches the implementation’s non-survey OLS / survey WLS split in diff_diff/chaisemartin_dhaultfoeuille_results.py:L427-L441 and diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5223.
  • The Z-vs-t heterogeneity inference divergence is now documented in docs/methodology/REGISTRY.md:L643-L643 and tracked in TODO.md:L64-L64, so it is non-blocking under the stated review rules.
  • The new tests appropriately close the earlier blind spot by pinning t_stat/n_obs in R parity and locally asserting safe_inference(beta, se, df=None) for non-survey heterogeneity surfaces in tests/test_chaisemartin_dhaultfoeuille.py:L2870-L2906, tests/test_chaisemartin_dhaultfoeuille.py:L10120-L10168, and tests/test_chaisemartin_dhaultfoeuille_parity.py:L1373-L1505.
  • One minor documentation drift remains: the new registry text says the non-survey per-path heterogeneity path uses “plain OLS vcov,” but the implementation calls solve_ols(..., return_vcov=True) with the default vcov_type="hc1".

Methodology

  • Severity P3 [Newly identified]. Impact: docs/methodology/REGISTRY.md:L643-L643 now describes the non-survey per-path heterogeneity variance path as “plain OLS vcov,” but _compute_heterogeneity_test calls solve_ols(..., return_vcov=True) without overriding vcov_type, so the actual non-survey variance is HC1 heteroskedasticity-robust, not classical OLS (diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106, diff_diff/linalg.py:L451-L488). This is documentation drift only; no estimator output changes. Concrete fix: change the registry wording to “non-survey via solve_ols(..., return_vcov=True) (HC1 by default)” or equivalent.
  • No other methodology findings. The Z-vs-t heterogeneity inference difference is now explicitly documented and tracked, so it is mitigated under the review policy.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings beyond the registry wording drift above.

Tech Debt

  • No findings. The new deferred-work entry in TODO.md:L64-L64 appropriately tracks the documented heterogeneity Z-vs-t critical-value gap as non-blocking follow-up work.

Security

  • No findings.

Documentation/Tests

  • No findings. The expanded heterogeneity coverage is well targeted and does not mask the documented R/Python inference-critical-value deviation.
  • Verification note: I could not execute the targeted tests in this environment because /usr/bin/python does not have pytest installed.

The R1 wording fix replaced "WLS" with "plain OLS vcov" in the
non-survey heterogeneity description, but `solve_ols(..., return_
vcov=True)` defaults to `vcov_type="hc1"` (linalg.py:464), so the
actual non-survey variance is HC1 heteroskedasticity-robust, not
classical OLS. Adjust the registry wording to "HC1-robust OLS vcov
for non-survey via solve_ols(..., return_vcov=True) (vcov_type=
\"hc1\" default)".

The `beta` field is still plain OLS (the point estimate solves the
normal equations); only the vcov/SE side uses HC1. The result-class
docstring's "plain OLS on the non-survey path" wording refers to
the point estimate and is correct.
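The distinction between the point estimate and the variance estimator can be illustrated with a one-regressor example. This sketch uses the closed-form slope formulas for a regression with an intercept; it does not call `solve_ols`, and the `slope_and_ses` helper and the data are made up for illustration.

```python
import math


def slope_and_ses(x, y):
    """Return (beta, classical SE, HC1 SE) for y = a + b*x fitted by OLS."""
    n, k = len(x), 2  # k counts the intercept and the slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    # The point estimate solves the normal equations; no vcov choice enters here.
    beta = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    alpha = ybar - beta * xbar
    resid = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    # Classical (homoskedastic) variance of the slope.
    classical_var = sum(e * e for e in resid) / (n - k) / sxx
    # HC0 sandwich for the slope, then the HC1 small-sample factor n / (n - k).
    hc0_var = sum((xi - xbar) ** 2 * e * e for xi, e in zip(x, resid)) / sxx ** 2
    hc1_var = hc0_var * n / (n - k)
    return beta, math.sqrt(classical_var), math.sqrt(hc1_var)


# Made-up data whose noise grows with x.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0.1, 0.9, 2.3, 2.6, 4.6, 4.2, 7.1, 5.9]
beta, se_classical, se_hc1 = slope_and_ses(x, y)
print(f"beta={beta:.4f}  classical SE={se_classical:.4f}  HC1 SE={se_hc1:.4f}")
```

The same `beta` is paired with two different standard errors, which is why "plain OLS" is accurate for the coefficient but not for the vcov.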

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: e71027f4a2611011db4374452105179d17a512ce


Overall Assessment

✅ Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • Affected method: ChaisemartinDHaultfoeuille heterogeneity testing, global and by_path, per Web Appendix Section 1.5 / Assumption 15 / Lemma 7. This PR is docs/tests/TODO only; estimator behavior is unchanged. diff_diff/chaisemartin_dhaultfoeuille.py:L4880-L5234
  • The prior re-review P3 is addressed: the registry now correctly documents the non-survey per-path heterogeneity variance path as HC1 via solve_ols(..., return_vcov=True) and records the Z-vs-t R divergence. diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106, diff_diff/linalg.py:L451-L465, docs/methodology/REGISTRY.md:L643-L643
  • The new TODO.md entry properly tracks the non-blocking heterogeneity critical-value deviation, so it is mitigated under the stated review rules. TODO.md:L64-L64
  • The by-path schema docs now match the implementation’s always-present het_* columns. diff_diff/chaisemartin_dhaultfoeuille_results.py:L1528-L1537, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1820-L1825, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1937-L1942
  • The added tests close the earlier blind spot: parity now pins t_stat and n_obs, while local invariants assert p_value/conf_int equal safe_inference(beta, se, df=None) on populated non-survey heterogeneity entries. tests/test_chaisemartin_dhaultfoeuille_parity.py:L1373-L1401, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1468-L1504, tests/test_chaisemartin_dhaultfoeuille.py:L2870-L2906, tests/test_chaisemartin_dhaultfoeuille.py:L10120-L10168
  • Verification note: I could not execute the targeted pytest cases here because pytest is not installed (/bin/bash: pytest: command not found).

Methodology

No findings. The diff does not change estimator math, weighting, variance/SE computation, or defaults; the updated registry/TODO text is consistent with the current implementation and with the project’s documented handling of the heterogeneity R divergence. diff_diff/chaisemartin_dhaultfoeuille.py:L5094-L5106, docs/methodology/REGISTRY.md:L643-L643, TODO.md:L64-L64

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings. The newly added deferred-work entry is correctly tracked in TODO.md, so it is non-blocking under the review policy. TODO.md:L64-L64

Security

No findings.

Documentation/Tests

No findings. The registry schema list now matches to_dataframe(level="by_path"), and the new tests improve coverage without masking the documented Z-vs-t deviation versus R. diff_diff/chaisemartin_dhaultfoeuille_results.py:L1528-L1537, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1820-L1825, diff_diff/chaisemartin_dhaultfoeuille_results.py:L1937-L1942, tests/test_chaisemartin_dhaultfoeuille.py:L2870-L2906, tests/test_chaisemartin_dhaultfoeuille.py:L10120-L10168, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1373-L1401, tests/test_chaisemartin_dhaultfoeuille_parity.py:L1468-L1504

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 14, 2026
@igerber igerber merged commit 93ff96d into main May 14, 2026
31 of 32 checks passed
@igerber igerber deleted the fix-audit-412-r2 branch May 14, 2026 12:55