This repository was archived by the owner on Mar 10, 2026. It is now read-only.

Trying to understand slight differences in p values between Samplics and R's survey/srvyr packages #64

@kburchfiel

Description

I am testing out the Samplics library by comparing its output to that of R's survey and srvyr packages. I'm finding very slight variations in p-values between Samplics and R when running chi-squared tests, and I'd love your help in figuring out the cause of these variations.

Here's my initial data import code:

from samplics.estimation import TaylorEstimator
from samplics.utils.types import PopParam
from samplics.categorical import CrossTabulation
import pandas as pd
import numpy as np
# Reading in dataset
df_car_survey = pd.read_csv(
    'https://raw.githubusercontent.com/ifstudies/carsurveydata/\
refs/heads/main/car_survey.csv')
df_car_survey['Enjoy_Driving_Fast'] = (
    df_car_survey['Enjoy_Driving_Fast'].str.replace(' ', '_'))
print(df_car_survey.head())

Output:

  Car_Color    Weight Enjoy_Driving_Fast  Count  Response_Sort_Map
0       Red  1.975884     Strongly_Agree      1                  0
1       Red  0.943725     Strongly_Agree      1                  0
2       Red  1.342593     Strongly_Agree      1                  0
3       Red  1.704274     Strongly_Agree      1                  0
4       Red  0.348622     Strongly_Agree      1                  0
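(As background for the estimates below: a survey-weighted proportion point estimate is just the sum of weights in a cell divided by the sum of weights in its domain. A minimal pandas sketch with toy, hypothetical rows — not the real car_survey.csv data — illustrates the idea; the survey libraries are needed for the standard errors, not the point estimates themselves.)

```python
import pandas as pd

# Toy stand-in for df_car_survey (hypothetical values)
toy = pd.DataFrame({
    "Car_Color":          ["Red", "Red", "Black", "Black", "Black"],
    "Enjoy_Driving_Fast": ["Agree", "Disagree", "Agree", "Agree", "Disagree"],
    "Weight":             [2.0, 1.0, 1.0, 1.0, 2.0],
})

# Weighted cell totals divided by weighted domain totals
cell_totals = toy.groupby(["Car_Color", "Enjoy_Driving_Fast"])["Weight"].sum()
domain_totals = toy.groupby("Car_Color")["Weight"].sum()
props = cell_totals.div(domain_totals, level="Car_Color")

print(props)
# Red/Agree = 2/3, Red/Disagree = 1/3; Black/Agree = Black/Disagree = 1/2
```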

Here is the code I used to estimate proportions and run a chi-squared test within Samplics:

# Creating filtered copy for use within Samplics
df_car_survey_filtered = df_car_survey.query(
    "Car_Color in ['Red', 'Black']").copy()

print("Samplics proportion estimates:")
df_survey_estimator = TaylorEstimator(param=PopParam.prop)
df_survey_estimator.estimate(
    y=df_car_survey_filtered["Enjoy_Driving_Fast"],
    domain=df_car_survey_filtered["Car_Color"],
    samp_weight=df_car_survey_filtered['Weight'],
    remove_nan=True
)
print(df_survey_estimator.to_dataframe())


print("\nSamplics chi squared test:")
rb_chisq_initial = CrossTabulation(param=PopParam.prop)
rb_chisq_initial.tabulate(
    vars=df_car_survey_filtered[["Car_Color", "Enjoy_Driving_Fast"]],
    samp_weight=df_car_survey_filtered['Weight'],
    remove_nan=True
)
print(rb_chisq_initial)

Samplics output:

Samplics proportion estimates:
           _param _domain             _level  _estimate  _stderror      _lci  \
0   PopParam.prop   Black              Agree   0.238712   0.024280  0.194339   
1   PopParam.prop   Black           Disagree   0.038357   0.010993  0.021736   
2   PopParam.prop   Black     Slightly_Agree   0.084976   0.015647  0.058874   
3   PopParam.prop   Black  Slightly_Disagree   0.049802   0.012735  0.029975   
4   PopParam.prop   Black     Strongly_Agree   0.475709   0.028543  0.420207   
5   PopParam.prop   Black  Strongly_Disagree   0.112444   0.018153  0.081427   
6   PopParam.prop     Red              Agree   0.190626   0.026221  0.144352   
7   PopParam.prop     Red           Disagree   0.029421   0.011059  0.013973   
8   PopParam.prop     Red     Slightly_Agree   0.128908   0.022078  0.091400   
9   PopParam.prop     Red  Slightly_Disagree   0.040227   0.012731  0.021466   
10  PopParam.prop     Red     Strongly_Agree   0.481333   0.032954  0.417302   
11  PopParam.prop     Red  Strongly_Disagree   0.129485   0.021744  0.092436   

        _uci       _cv  
0   0.289574  0.101711  
1   0.066820  0.286600  
2   0.121160  0.184140  
3   0.081639  0.255706  
4   0.531818  0.060001  
5   0.153305  0.161438  
6   0.247444  0.137553  
7   0.060892  0.375888  
8   0.178780  0.171270  
9   0.074140  0.316491  
10  0.545983  0.068465  
11  0.178463  0.167926  

Samplics chi squared test:

Cross-tabulation of Car_Color and Enjoy_Driving_Fast
 Number of strata: 1
 Number of PSUs: 718
 Number of observations: 718
 Degrees of freedom: 717.00

 Car_Color Enjoy_Driving_Fast  PopParam.prop  stderror  lower_ci  upper_ci
    Black              Agree       0.135314  0.014665  0.109015  0.166771
    Black           Disagree       0.021743  0.006286  0.012289  0.038188
    Black     Slightly_Agree       0.048169  0.009040  0.033221  0.069361
    Black  Slightly_Disagree       0.028230  0.007305  0.016931  0.046712
    Black     Strongly_Agree       0.269657  0.019140  0.233767  0.308836
    Black  Strongly_Disagree       0.063739  0.010575  0.045870  0.087929
      Red              Agree       0.082569  0.012114  0.061692  0.109685
      Red           Disagree       0.012744  0.004830  0.006038  0.026697
      Red     Slightly_Agree       0.055836  0.009952  0.039222  0.078910
      Red  Slightly_Disagree       0.017424  0.005577  0.009269  0.032520
      Red     Strongly_Agree       0.208488  0.017628  0.175985  0.245208
      Red  Strongly_Disagree       0.056086  0.009785  0.039695  0.078692

Pearson (with Rao-Scott adjustment):
	Unadjusted - chi2(5): 6.3434 with p-value of 0.2742
	Adjusted - F(4.99, 3579.82): 0.9571  with p-value of 0.4427

  Likelihood ratio (with Rao-Scott adjustment):
	 Unadjusted - chi2(5): 6.3364 with p-value of 0.2748
	 Adjusted - F(4.99, 3579.82): 0.9561  with p-value of 0.4434
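(For reference, the printed Pearson p-values can be reproduced from the reported statistics with scipy. The inputs below are the rounded statistics from the output above, so the recomputed values agree only to about three decimal places.)

```python
from scipy.stats import chi2, f

# Statistics reported by Samplics for the Pearson test (rounded above)
p_unadj = chi2.sf(6.3434, 5)           # unadjusted: chi-square with 5 df
p_adj = f.sf(0.9571, 4.99, 3579.82)    # Rao-Scott adjusted: F reference

print(round(p_unadj, 4), round(p_adj, 4))  # close to 0.2742 and 0.4427
```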

Meanwhile, here are my tests within R. (This code was executed in the same Python notebook using rpy2's %%R and %R magics.)

Initializing R and loading survey dataset:

%load_ext rpy2.ipython
%R -i df_car_survey # Imports DataFrame into R

Here is my R code for estimating proportions and running a chi-squared test. (This code is based on chapters 5 and 6 of Exploring Complex Survey Data Analysis Using R.)

%%R
library(survey)
library(srvyr)

print("R proportion estimates:")
# Creating a survey design object:
df_des <- df_car_survey %>% as_survey_design(weights=Weight)
# Filtering this survey design object:
df_des_filtered <- df_des %>% filter(Car_Color %in% c('Red', 'Black'))
# df_des_filtered %>% group_by(Car_Color, Enjoy_Driving_Fast) %>% 
#     summarize(row_count = n())
df_proportions <- df_des_filtered %>% group_by(
    Car_Color, Enjoy_Driving_Fast) %>% 
    summarize(proportion = survey_prop(
        vartype=c('ci', 'var', 'cv')))
print(df_proportions)

print("R chi squared test:")
df_chi2 <- df_des_filtered %>% svychisq(
    formula = ~ Car_Color + Enjoy_Driving_Fast,
    design = .,
    statistic = "Chisq",
    na.rm = TRUE)
print(df_chi2)

R Output:

[1] "R proportion estimates:"
# A tibble: 12 × 7
# Groups:   Car_Color [2]
   Car_Color Enjoy_Driving_Fast proportion proportion_low proportion_upp
   <chr>     <chr>                   <dbl>          <dbl>          <dbl>
 1 Black     Agree                  0.239          0.194          0.290 
 2 Black     Disagree               0.0384         0.0217         0.0668
 3 Black     Slightly_Agree         0.0850         0.0589         0.121 
 4 Black     Slightly_Disagree      0.0498         0.0300         0.0816
 5 Black     Strongly_Agree         0.476          0.420          0.532 
 6 Black     Strongly_Disagree      0.112          0.0814         0.153 
 7 Red       Agree                  0.191          0.144          0.247 
 8 Red       Disagree               0.0294         0.0140         0.0609
 9 Red       Slightly_Agree         0.129          0.0914         0.179 
10 Red       Slightly_Disagree      0.0402         0.0215         0.0741
11 Red       Strongly_Agree         0.481          0.417          0.546 
12 Red       Strongly_Disagree      0.129          0.0924         0.178 
# ℹ 2 more variables: proportion_var <dbl>, proportion_cv <dbl>
[1] "R chi squared test:"

	Pearson's X^2: Rao & Scott adjustment

data:  NextMethod()
X-squared = 6.3434, df = 5, p-value = 0.4464

The p-value reported by R for the Pearson chi-squared test (0.4464) differs slightly from the equivalent Samplics p-value (0.4427), even though both R and Samplics report the same chi-squared statistic (6.3434). Do you know why this might be the case? I know the difference is slight, but depending on the underlying p-value and confidence level, a discrepancy like this could mean the difference between a significant and a non-significant result.
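(One avenue worth checking, offered as an assumption rather than a confirmed diagnosis: since both packages agree on the statistic, the gap presumably comes from the reference distribution. Samplics compares its adjusted statistic to an F(4.99, 3579.82) reference, while, as I read the survey package documentation, svychisq with statistic = "Chisq" computes its p-value from a Satterthwaite approximation to the distribution of X². The df = 5 that R prints does not appear to be the df actually used for the p-value — a plain chi-square reference with 5 df would give a very different number:)

```python
from scipy.stats import chi2

x2, df = 6.3434, 5  # statistic and df as printed by svychisq

# p-value a plain chi-square reference would assign to R's statistic
p_plain = chi2.sf(x2, df)
print(round(p_plain, 4))  # ~0.2743, nowhere near the reported 0.4464
```

If that reading is right, the 0.4464 vs 0.4427 difference would reflect a Satterthwaite-adjusted chi-square reference versus a Rao-Scott F reference rather than any disagreement about the statistic itself; rerunning svychisq with its default statistic = "F" might be a useful comparison.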
