Improve tests assertion#237
Conversation
Signed-off-by: Hao Wu <skyw@nvidia.com>
Signed-off-by: Hao Wu <skyw@nvidia.com>
Greptile SummaryThis PR centralises test assertion helpers into a new
Confidence Score: 4/5Safe to merge; the migration is purely mechanical and the new helpers are correct. The one item worth a follow-up before merging is the unexplained 10× tolerance widening in test_polargrad.py. All assert_equal substitutions preserve exact bitwise semantics. The assert_close_to_identity and assert_close_to_orthogonal helpers are logically correct, including the rtol/atol equivalence in the test_spel migration. The only unresolved question is the tolerance loosening in test_polargrad.py (atol/rtol widened from 1e-5 to 1e-4 with no explanation), which could be hiding a real precision regression or just a noisy test that needed stabilising. tests/test_polargrad.py — the tolerance change at line 103 lacks a comment explaining whether it reflects a known float32 precision floor or was used to silence intermittent failures. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
CMP["tests/_comparison.py"]
AE["assert_equal\n(functools.partial of assert_close\natol=rtol=0)"]
ACI["assert_close_to_identity\n(diag vs ones, off-diag vs zeros\nwith separate atols)"]
ACO["assert_close_to_orthogonal\n(builds Gram matrix,\nthen calls assert_close_to_identity)"]
CMP --> AE
CMP --> ACI
CMP --> ACO
AE --> T1["test_distributed_muon_utils_cpu\ntest_distributed_rekls_cpu\ntest_distributed_soap_utils_cpu\ntest_eig_utils / test_muon_utils\ntest_normalized_optimizer\ntest_orthogonalized_optimizer\ntest_polargrad / test_scalar_optimizers\ntest_soap / test_soap_utils\ntest_triton_kernels\ntest_utils_modules / test_weight_decay_mixin"]
ACI --> T2["test_soap\ntest_soap_utils\ntest_spel"]
ACO --> T3["test_muon_utils\n(new test_polar_express_16steps)"]
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
CMP["tests/_comparison.py"]
AE["assert_equal\n(functools.partial of assert_close\natol=rtol=0)"]
ACI["assert_close_to_identity\n(diag vs ones, off-diag vs zeros\nwith separate atols)"]
ACO["assert_close_to_orthogonal\n(builds Gram matrix,\nthen calls assert_close_to_identity)"]
CMP --> AE
CMP --> ACI
CMP --> ACO
AE --> T1["test_distributed_muon_utils_cpu\ntest_distributed_rekls_cpu\ntest_distributed_soap_utils_cpu\ntest_eig_utils / test_muon_utils\ntest_normalized_optimizer\ntest_orthogonalized_optimizer\ntest_polargrad / test_scalar_optimizers\ntest_soap / test_soap_utils\ntest_triton_kernels\ntest_utils_modules / test_weight_decay_mixin"]
ACI --> T2["test_soap\ntest_soap_utils\ntest_spel"]
ACO --> T3["test_muon_utils\n(new test_polar_express_16steps)"]
Reviews (1): Last reviewed commit: "Add assert function for identity and ort..." | Re-trigger Greptile |
| torch.testing.assert_close( | ||
| rotated, |
There was a problem hiding this comment.
Unexplained tolerance relaxation
The tolerance on RightPolarGradOrthFnTest was silently widened by 10× (from 1e-5 to 1e-4). If this was triggered by a flaky CI failure, it is worth adding a comment documenting whether the looser bound reflects observed float32 error or was simply the minimum needed to make the test pass consistently.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
@greptileai it was a flaky test, ignore.
There was a problem hiding this comment.
Got it, makes sense — flaky tests from float32 precision are a common reason to relax tolerances. I'll drop this.
Would you like me to remember a rule: "Do not flag tolerance relaxations in tests when they are used to fix flaky CI failures"?
|
/ok to test 9708732 |
Test Results 81 files ±0 151 suites ±0 1m 47s ⏱️ +3s Results for commit 9708732. ± Comparison against base commit 9ab7b03. This pull request removes 3 and adds 5 tests. Note that renamed tests count towards both.♻️ This comment has been updated with latest results. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Add assert_equal and assertion for identity, orthogonality, etc.