Skip to content

feat(proportion.single.inequality): add exact test#334

Draft
Snoopy1866 wants to merge 1 commit into
mainfrom
feat-proportion-single-inequality-exact
Draft

feat(proportion.single.inequality): add exact test#334
Snoopy1866 wants to merge 1 commit into
mainfrom
feat-proportion-single-inequality-exact

Conversation

@Snoopy1866
Copy link
Copy Markdown
Owner

No description provided.

@github-actions github-actions Bot added the documentation Improvements or additions to documentation label May 11, 2026
@read-the-docs-community
Copy link
Copy Markdown

Documentation build overview

📚 pystatpower | 🛠️ Build #32630960 | 📁 Comparing dfda73b against latest (a1af8d3)

  🔍 Preview build  

2 files changed
± api/proportion/single/inequality/index.html
± models/proportion/single/inequality/index.html

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces exact binomial test power and sample size calculations for one-sample proportion tests. The feedback identifies several critical issues: the signature change in the _power function breaks existing calls in other modules, and the solve_size implementation for exact tests is incomplete, lacking both a return statement and the logic to determine the minimum sample size. Furthermore, the power calculations for lower and two-sided tails are currently anti-conservative and require adjustments to the critical values. Finally, the documentation for the new exact test feature needs to be populated with mathematical background.

alpha: float,
phat: bool,
continuity_correction: bool,
exact: bool,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Adding exact as a required positional argument to _power breaks existing calls in solve_null_proportion and solve_proportion, which currently only pass 7 arguments. Consider providing a default value for exact to maintain backward compatibility and fix the TypeError in those functions.

Suggested change
exact: bool,
exact: bool = False,

Comment on lines +252 to +262
if exact:
initial_size = solve_size(null_proportion, proportion, alternative, alpha, power, exact=False)
search_interval_lower_bound = max(floor(0.5 * initial_size), 1)
search_interval_upper_bound = min(ceil(initial_size * 1.5), SAMPLE_SIZE_SEARCH_MAX)

return ceil(brentq(func, 0.000001, SAMPLE_SIZE_SEARCH_MAX))
rng = range(search_interval_lower_bound, search_interval_upper_bound)
dict_power = dict.fromkeys(rng)
for size in rng:
dict_power[size] = solve_power(null_proportion, proportion, size, alternative, alpha, exact)
else:
return ceil(brentq(func, 0.000001, SAMPLE_SIZE_SEARCH_MAX))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The implementation for exact=True in solve_size is incomplete and contains a critical bug: it lacks a return statement, causing the function to return None. Additionally, the current logic populates a dictionary but does not use it to find the minimum sample size. The search logic should iterate through the range and return the smallest size that achieves the target power. Also, ensure that exact is passed correctly to solve_power (it is currently being passed as the 6th positional argument, which corresponds to phat).

Suggested change
if exact:
initial_size = solve_size(null_proportion, proportion, alternative, alpha, power, exact=False)
search_interval_lower_bound = max(floor(0.5 * initial_size), 1)
search_interval_upper_bound = min(ceil(initial_size * 1.5), SAMPLE_SIZE_SEARCH_MAX)
return ceil(brentq(func, 0.000001, SAMPLE_SIZE_SEARCH_MAX))
rng = range(search_interval_lower_bound, search_interval_upper_bound)
dict_power = dict.fromkeys(rng)
for size in rng:
dict_power[size] = solve_power(null_proportion, proportion, size, alternative, alpha, exact)
else:
return ceil(brentq(func, 0.000001, SAMPLE_SIZE_SEARCH_MAX))
if exact:
initial_size = solve_size(null_proportion, proportion, alternative, alpha, power, exact=False)
search_interval_lower_bound = max(floor(0.5 * initial_size), 1)
search_interval_upper_bound = min(ceil(initial_size * 1.5), SAMPLE_SIZE_SEARCH_MAX)
for size in range(search_interval_lower_bound, search_interval_upper_bound + 1):
if solve_power(null_proportion, proportion, size, alternative, alpha, phat, continuity_correction, exact) >= power:
return size
return search_interval_upper_bound
else:
return ceil(brentq(func, 0.000001, SAMPLE_SIZE_SEARCH_MAX))

Comment on lines +132 to +133
z_alpha = binom.ppf(alpha, size, null_proportion)
power = binom.cdf(z_alpha, size, proportion)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The power calculation for the lower tail is anti-conservative. binom.ppf(alpha, ...) returns the smallest $k$ such that $P(X \le k) \ge \alpha$. If $P(X \le k) > \alpha$, using this $k$ as the critical value results in a test with a significance level greater than $\alpha$. For an exact test, the critical value should be the largest $k$ such that $P(X \le k) \le \alpha$.

Suggested change
z_alpha = binom.ppf(alpha, size, null_proportion)
power = binom.cdf(z_alpha, size, proportion)
z_alpha = binom.ppf(alpha, size, null_proportion)
if binom.cdf(z_alpha, size, null_proportion) > alpha:
z_alpha -= 1
power = binom.cdf(z_alpha, size, proportion)

Comment on lines +136 to +137
z_alpha_2 = binom.ppf(alpha / 2, size, null_proportion)
power = 1 - binom.cdf(z_alpha_1, size, proportion) + binom.cdf(z_alpha_2, size, proportion)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Similar to the one-sided lower tail, the two-sided test implementation might be anti-conservative because z_alpha_2 (the lower critical value) is calculated using ppf without checking if the resulting CDF exceeds $\alpha/2$.

Suggested change
z_alpha_2 = binom.ppf(alpha / 2, size, null_proportion)
power = 1 - binom.cdf(z_alpha_1, size, proportion) + binom.cdf(z_alpha_2, size, proportion)
z_alpha_2 = binom.ppf(alpha / 2, size, null_proportion)
if binom.cdf(z_alpha_2, size, null_proportion) > alpha / 2:
z_alpha_2 -= 1
power = 1 - binom.cdf(z_alpha_1, size, proportion) + binom.cdf(z_alpha_2, size, proportion)

n = \frac{n'}{4} \left( 1 + \sqrt{1 + \frac{2}{|p_1 - p_0| n'}}\right)^2
$$

## 精确检验 {exact-test}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The "精确检验" (Exact Test) section is currently empty. Please add the mathematical background and formulas for the binomial exact test power calculation to provide complete documentation for this new feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant