CellBioStats is a web application that creates publication-quality SuperPlots and runs statistical analysis on hierarchical data from common experiments in cell biology, molecular biology, and biochemistry. It helps you visualize complex datasets and avoid pitfalls such as pseudoreplication.
The current version covers a limited set of use cases, but the project is intended to become more comprehensive over time.
- Interactive SuperPlots: Generates publication-quality SuperPlots, showing individual data points, replicate means, and overall treatment means with the standard error of the mean (SEM).
- Automated Statistical Analysis: Automatically performs normality checks, variance homogeneity tests, and selects the appropriate statistical test for your data.
- Hierarchical Statistics: Avoids pseudoreplication by correctly performing statistical tests on replicate means, not on raw technical measurements.
- Paired & Unpaired Data: Handles both independent (unpaired) and repeated measures (paired) experimental designs.
- Data Upload: Supports both `.csv` and `.xlsx` file formats.
- Plot Customization: Allows customization of plot aesthetics (font size, color schemes, marker size, and plot templates).
- Downsampling for plot visualization: Option to display a random subset of your data points to keep plots clean and responsive with very large datasets.
- Export Results: Download the detailed statistical summary as a `.txt` file and the plot as a `.png`, `.jpeg`, `.svg`, or `.pdf` file directly from the app.
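
The downsampling feature can be pictured with a minimal sketch. This is an illustration only, not the app's code: the function name `downsample`, its parameters, and the fixed seed are hypothetical, and whether the app samples per treatment or across the whole dataset is not stated here.

```python
import random


def downsample(points, max_points, seed=0):
    """Return at most max_points items picked at random, keeping their original order.

    Hypothetical helper: the name, signature, and seeding are assumptions.
    """
    if max_points < 0:
        raise ValueError("max_points must be non-negative")
    if len(points) <= max_points:
        return list(points)
    rng = random.Random(seed)  # fixed seed so the same subset is drawn each time
    keep = sorted(rng.sample(range(len(points)), max_points))
    return [points[i] for i in keep]


measurements = list(range(1000))       # stand-in for a large set of data points
shown = downsample(measurements, 50)   # subset intended for display only
```

A subset like this would only affect what is drawn; summary values would still be computed from the full dataset.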
CellBioStats is deployed on Render. Visit the live app
- Clone the repository:
  - `git clone https://github.com/brunicardoso/CellBioStats.git`
  - `cd CellBioStats`
- Create a virtual environment and install dependencies:
  - `python -m venv venv`
  - `source venv/bin/activate` (on Windows: `venv\Scripts\activate`)
  - `pip install -r requirements.txt`
- Run the application:
  - `uvicorn backend.main:app --reload`
- Open your web browser and navigate to http://localhost:8000.
- Upload Data: Drag and drop (or click to browse) your `.csv` or `.xlsx` file. You can try it on our sample data.
- Map Columns: Select the appropriate columns from your file for Treatment, Value, and Replicate.
- Select Test Type: (Optional) If your experiment uses a repeated measures or paired design, toggle "Use paired/repeated measures tests". Read this if you are not sure about choosing paired or unpaired tests.
- Analyze: Click the "Analyze" button.
- Customize: Expand "Plot customization" to adjust appearance (font size, colors, marker size, template, downsampling). Click "Analyze" again to update.
- Download: Use the download buttons to save the plot (PNG/JPEG/SVG/PDF) or statistical summary (.txt).
The application expects your data in a specific format, as in the example below. You need at least three columns:
- Treatment Column: Identifies the different experimental groups (e.g., 'Control', 'Drug A').
- Value Column: Contains the numeric measurement data (the dependent variable, e.g., cell size or expression level).
- Replicate Column: Identifies the independent experimental replicates (e.g., the experiment number for replicates performed on different days, or an animal ID).
Here is an example of a valid data structure:
| Treatment | Cell_size | Replicate |
|---|---|---|
| Control | 10.5 | 1 |
| Control | 11.2 | 1 |
| Control | 12.1 | 2 |
| Control | 11.8 | 2 |
| Drug A | 15.3 | 1 |
| Drug A | 14.9 | 1 |
| Drug A | 16.5 | 2 |
| Drug A | 17.1 | 2 |
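
To check a file against this layout before uploading, something like the following sketch may help. It uses only the Python standard library and the example table above written as CSV text. The function name `load_rows` and its error messages are hypothetical and are not part of CellBioStats.

```python
import csv
import io

# The example table above, written out as CSV text.
EXAMPLE_CSV = """Treatment,Cell_size,Replicate
Control,10.5,1
Control,11.2,1
Control,12.1,2
Control,11.8,2
Drug A,15.3,1
Drug A,14.9,1
Drug A,16.5,2
Drug A,17.1,2
"""


def load_rows(csv_text, treatment_col, value_col, replicate_col):
    """Parse CSV text into (treatment, value, replicate) tuples.

    Hypothetical helper: raises ValueError if a required column is missing
    or a value is not numeric.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {treatment_col, value_col, replicate_col} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            value = float(row[value_col])
        except (TypeError, ValueError):
            raise ValueError(
                f"Line {line_no}: non-numeric value {row[value_col]!r}"
            ) from None
        rows.append((row[treatment_col], value, row[replicate_col]))
    return rows


rows = load_rows(EXAMPLE_CSV, "Treatment", "Cell_size", "Replicate")
print(len(rows), "measurements,", len({t for t, _, _ in rows}), "treatments")
# prints: 8 measurements, 2 treatments
```

The column names are passed in rather than hard-coded, mirroring the "Map Columns" step where you choose which columns hold Treatment, Value, and Replicate.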
CellBioStats is designed to perform statistically sound analysis by respecting the hierarchical nature of typical biological data.
- Data Aggregation: All primary statistical tests are performed on the means of each replicate, not on the raw technical measurements. This avoids pseudoreplication and ensures that the statistical power reflects the number of independent experiments.
- Assumption Checks:
  - Normality: The Shapiro-Wilk test is run on the replicate means for each treatment group.
  - Homoscedasticity (Equal Variances): Levene's test is run to check for equality of variances across groups.
- Automated Test Selection: Based on the number of groups, the experimental design (paired/unpaired), and the results of the assumption checks, the app automatically selects the most appropriate statistical test.
| # of Groups | Design | Assumptions Met (Normal & Homoscedastic) | Assumptions Not Met |
|---|---|---|---|
| 2 | Unpaired | Student's t-test | Mann-Whitney U test |
| >2 | Unpaired | One-way ANOVA + Tukey HSD post-hoc | Kruskal-Wallis + Mann-Whitney U post-hoc (Bonferroni) |
| 2 | Paired | Paired t-test | Wilcoxon signed-rank test |
| >2 | Paired | Repeated Measures ANOVA + Paired t-test (Bonferroni) | Friedman test + Wilcoxon post-hoc (Bonferroni) |
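
The aggregation step and the selection table can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the app's implementation: the names `replicate_means`, `sem`, and `choose_test` are hypothetical, and the `assumptions_met` flag stands in for the outcome of the Shapiro-Wilk and Levene checks, which are not computed here.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def replicate_means(rows):
    """Collapse (treatment, value, replicate) rows to one mean per replicate."""
    grouped = defaultdict(lambda: defaultdict(list))
    for treatment, value, replicate in rows:
        grouped[treatment][replicate].append(value)
    return {t: {r: mean(v) for r, v in reps.items()} for t, reps in grouped.items()}


def sem(values):
    """Standard error of the mean; needs at least two replicate means."""
    return stdev(values) / sqrt(len(values))


def choose_test(n_groups, paired, assumptions_met):
    """Look up the test named in the table above (hypothetical helper)."""
    if n_groups < 2:
        raise ValueError("need at least two treatment groups")
    # Keys: (more than two groups?, paired design?, assumptions met?)
    table = {
        (False, False, True): "Student's t-test",
        (False, False, False): "Mann-Whitney U test",
        (True, False, True): "One-way ANOVA + Tukey HSD post-hoc",
        (True, False, False): "Kruskal-Wallis + Mann-Whitney U post-hoc (Bonferroni)",
        (False, True, True): "Paired t-test",
        (False, True, False): "Wilcoxon signed-rank test",
        (True, True, True): "Repeated Measures ANOVA + Paired t-test (Bonferroni)",
        (True, True, False): "Friedman test + Wilcoxon post-hoc (Bonferroni)",
    }
    return table[(n_groups > 2, paired, assumptions_met)]


# Rows from the example data table: (treatment, value, replicate).
rows = [
    ("Control", 10.5, 1), ("Control", 11.2, 1),
    ("Control", 12.1, 2), ("Control", 11.8, 2),
    ("Drug A", 15.3, 1), ("Drug A", 14.9, 1),
    ("Drug A", 16.5, 2), ("Drug A", 17.1, 2),
]
means = replicate_means(rows)  # two replicate means per treatment, not four raw values
control_sem = sem(list(means["Control"].values()))
print(choose_test(len(means), paired=False, assumptions_met=True))
# prints: Student's t-test
```

Because every test sees only the replicate means, the sample size per group is the number of independent replicates (two in this toy example), regardless of how many measurements each replicate contains.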
- Backend: FastAPI serving REST API endpoints and static files
- Frontend: HTML + Tailwind CSS + vanilla JavaScript + Plotly.js
- Deployment: Render via `render.yaml`
- Lord, S. J., Velle, K. B., Mullins, R. D., & Fritz-Laylin, L. K. (2020). SuperPlots: Communicating reproducibility and variability in cell biology. Journal of Cell Biology, 219(6). https://doi.org/10.1083/jcb.202001064
- Pollard, D. A., Pollard, T. D., & Pollard, K. S. (2019). Empowering statistical methods for cellular and molecular biologists. Molecular Biology of the Cell, 30(12), 1359-1368. https://doi.org/10.1091/mbc.e15-02-0076
Bruni-Cardoso, A. (2025). CellBioStats, an application for robust visualization and statistical analysis of data from cell and molecular biology experiments
This project is licensed under the MIT License - see the LICENSE file for details.
