PLN-team · Bastien-mva · May 5, 2025 · Apr 27, 2025 · Apr 27, 2025 · May 1, 2025
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -40,24 +40,24 @@ jobs:
       - name: Run Pylint
         run: find . -type f -name "*.py" | xargs pylint --disable=import-error,missing-module-docstring,invalid-name,not-callable,duplicate-code --load-plugins=pylint.extensions.docparams
 
-  # tests:
-  #   runs-on: ubuntu-22.04
-  #   container:
-  #     image: ghcr.io/bastien-mva/docker_image:latest
-  #   steps:
-  #     - uses: actions/checkout@v4
-  #     - name: Install package locally and run tests
-  #       run: |
-  #         pip install '.[tests]'
-  #         pip install -e .
-  #         jupyter nbconvert Getting_started.ipynb --to python --output tests/untestable_getting_started
-  #         cd tests
-  #         python _create_readme_getting_started_and_docstrings_tests.py
-  #         pytest --cov --cov-branch --cov-report=xml .
-  #     - name: Upload coverage reports to Codecov
-  #       uses: codecov/codecov-action@v5
-  #       with:
-  #         token: ${{ secrets.CODECOV_TOKEN }}
+  tests:
+    runs-on: ubuntu-22.04
+    container:
+      image: ghcr.io/bastien-mva/docker_image:latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Install package locally and run tests
+        run: |
+          pip install '.[tests]'
+          pip install -e .
+          jupyter nbconvert Getting_started.ipynb --to python --output tests/untestable_getting_started
+          cd tests
+          python _create_readme_getting_started_and_docstrings_tests.py
+          pytest --cov --cov-branch --cov-report=xml .
+      - name: Upload coverage reports to Codecov
+        uses: codecov/codecov-action@v5
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}
 
 
   build_package:
@@ -80,30 +80,30 @@ jobs:
           name: dist
           path: dist/
 
-  # publish_package:
-  #   runs-on: ubuntu-22.04
-  #   needs:
-  #     - build_package
-  #     - tests
-  #   if: github.event_name == 'release'
-  #   steps:
-  #     - uses: actions/checkout@v4
-  #     - name: Set up Python
-  #       uses: actions/setup-python@v4
-  #       with:
-  #         python-version: '3.9'
-  #     - name: Install Twine
-  #       run: pip install twine
-  #     - name: download artifacts and publish
-  #       uses: actions/download-artifact@v4
-  #       with:
-  #         name: dist
-  #         path: dist/
-  #     - name: Publish package
-  #       env:
-  #         TWINE_USERNAME: __token__
-  #         TWINE_PASSWORD: ${{ secrets.PYPLN_TOKEN }}
-  #       run: python -m twine upload dist/*
+  publish_package:
+    runs-on: ubuntu-22.04
+    needs:
+      - build_package
+      - tests
+    if: github.event_name == 'release'
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.9'
+      - name: Install Twine
+        run: pip install twine
+      - name: download artifacts and publish
+        uses: actions/download-artifact@v4
+        with:
+          name: dist
+          path: dist/
+      - name: Publish package
+        env:
+          TWINE_USERNAME: __token__
+          TWINE_PASSWORD: ${{ secrets.PYPLN_TOKEN }}
+        run: python -m twine upload dist/*
 
   pages:
     runs-on: ubuntu-22.04
@@ -122,7 +122,10 @@ jobs:
           wget https://github.com/jgm/pandoc/releases/download/1.15.1/pandoc-1.15.1-1-amd64.deb
           sudo dpkg -i pandoc-1.15.1-1-amd64.deb
       - name: Convert README
-        run: pandoc README.md --from markdown --to rst -s -o docs/source/readme.rst
+        run: |
+          pandoc README.md --from markdown --to rst -s -o docs/source/readme.rst
+          echo "HEEEEEERE"
+          cat docs/source/readme.rst
       - name: Build docs
         run: |
           pip install .
@@ -136,3 +139,4 @@ jobs:
         with:
           github_token: ${{ secrets.GITHUB_TOKEN }}
           publish_dir: ./docs/build/html
+          force: true
diff --git a/README.md b/README.md
@@ -46,6 +46,9 @@ that an `R` version of the package is available [here](https://pln-team.github.i
 ```sh
 pip install pyPLNmodels
 ```
+The package depends on resource-intensive libraries like `torch`, so it may
+require significant storage space.
+
 
 ## Statistical description
 

diff --git a/docs/source/tutorials/_quarto.yml b/docs/source/tutorials/_quarto.yml
@@ -46,3 +46,4 @@ format:
     css: custom_css_yml.css
     theme: cosme
     code-copy: true
+    # page-navigation: true
diff --git a/docs/source/tutorials/autoreg.html b/docs/source/tutorials/autoreg.html
diff --git a/docs/source/tutorials/basic_analysis.html b/docs/source/tutorials/basic_analysis.html
diff --git a/docs/source/tutorials/basic_analysis.qmd b/docs/source/tutorials/basic_analysis.qmd
@@ -49,7 +49,7 @@ and the model parameters are:
 These models aim to capture the structure of the data through the latent variables $Z$.
 
 The `Pln` model assumes $\Sigma$ has full rank, while the `PlnPCA` model
-assumes $\Sigma$ has a low rank, which must be specified by the user. A lower
+assumes $\Sigma$ has a low rank, which must be specified by the user.
  A lower rank introduces a trade-off, reducing computational complexity but potentially
 compromising parameter estimation accuracy.
 
@@ -61,16 +61,20 @@ The `pyPLNmodels` package is designed to:
 * Retrieve the latent variables $Z$ (which typically contains more information than $Y$)
 * Visualize the latent variables and their relationships
 
-This is achieved using the input count matrix $Y$, along with optional covariate matrix $X$ (defaulting to a vector of 1's) and offsets $O$ (defaulting to a matrix of 0's).
+This is achieved using the input count matrix $Y$, along with optional covariate matrix $X$ (defaulting to a vector of 1s) and offsets $O$ (defaulting to a matrix of 0s).
 
 
 # Importing Data
 
-In this example, we analyze single-cell RNA-seq data provided by the `load_scrna` function in the package. Each column in the dataset represents a gene, while each row corresponds to a cell (i.e., an individual). Covariates for cell types (`labels`) are also included. For simplicity, we limit the analysis to 20 variables (dimensions).
+In this example, we analyze single-cell RNA-seq data provided by the
+`load_scrna` function in the package. Each column in the dataset represents a
+gene, while each row corresponds to a cell (i.e., an individual). Covariates
+for cell types (`labels`) are also included. For simplicity, we limit the
+analysis to $10$ variables (dimensions).
 
 ```{python}
 from pyPLNmodels import load_scrna
-rna = load_scrna(dim=20)
+rna = load_scrna(dim=10)
 print('Data: ', rna.keys())
 ```
 
@@ -118,17 +122,15 @@ To gain deeper insights into the model parameters and the optimization process,
 pln.show()
 ```
 
-Monitoring the norm of each parameters allows to know if the model has
-converged. If it has not converged, one can refit the model with a lower
-tolerance (`tol`) and a bigger number iterations than the default (`maxiter=400`):
+Monitoring the norm of each parameter is essential to assess model convergence.
+If the model has not converged, consider refitting it with additional iterations and
+a reduced tolerance (`tol`). To adjust the number of iterations, use the
+`maxiter` parameter:
 
 ```{python}
-#|eval : false
-pln.fit(maxiter=1000, tol = 0)
+pln.fit(maxiter=1000, tol = 0).show()
 ```
 
-
-
 ## Exploring Latent Variables
 
 The latent variables $Z$, which capture the underlying structure of the data, are accessible via the `latent_variables` attribute, or the `.transform()` method:
@@ -139,20 +141,23 @@ Z = pln.transform()
 print('Shape of Z:', Z.shape)
 ```
 
-The effect of covariates on the latent variables can be removed by using the `remove_exog_effect` keyword:
+
+You can visualize these latent variables using the `.viz()` method:
 
 ```{python}
-Z_moins_XB = pln.transform(remove_exog_effect=True)
+pln.viz(colors=cell_type)
 ```
 
-
-You can visualize these latent variables using the `.viz()` method:
+By default, the effect of covariates on the latent variables is included in the
+visualization. This means that the latent variables are represented as $Z +
+XB$. The effect of covariates on the latent variables can be removed by using
+the `remove_exog_effect` keyword:
 
 ```{python}
-pln.viz(colors=cell_type)
+Z_moins_XB = pln.transform(remove_exog_effect=True)
 ```
 
-To visualize the latent positions without the effect of covariates (i.e., \(Z - XB\)), set the `remove_exog_effect` parameter to `True`:
+To visualize the latent positions without the effect of covariates (i.e., $Z - XB$), set the `remove_exog_effect` parameter to `True` in the `.viz()` method:
 
 ```{python}
 pln.viz(colors=cell_type, remove_exog_effect=True)
@@ -163,6 +168,7 @@ Additionally, you can generate a pair plot of the first Principal Components (PC
 ```{python}
 pln.pca_pairplot(n_components=4, colors=cell_type)
 ```
+The `remove_exog_effect` parameter is also available in the `pca_pairplot` method.
 
 # Analyzing Covariate Effects
 
@@ -175,6 +181,8 @@ To summarize the model, including confidence intervals and p-values, use the `su
 ```{python}
 pln.summary()
 ```
+The p-values correspond to the one-hot encoded covariates, with
+`Macrophages` set as the reference category.
 
 You can also visualize confidence intervals for regression coefficients using the `plot_regression_forest` method:
 
@@ -220,12 +228,23 @@ pca = PlnPCA.from_formula('endog ~ 1 + labels', data=high_d_rna, rank=5).fit()
 
 **⚠️  Note:** P-values are not available in the `PlnPCA` model.
 
-
 ```{python}
 print(pca)
 ```
 
-This model is particularly efficient for high-dimensional datasets, offering significantly reduced computation time compared to `Pln`:
+A low-dimensional representation (of dimension `rank`) of the latent variables can be obtained using the `project` keyword of the `.transform()` method:
+
+```{python}
+Z_low_dim = pca.transform(project=True)
+print('Shape of Z_low_dim:', Z_low_dim.shape)
+```
+
+
+
+This model is particularly efficient for high-dimensional datasets, offering
+significantly reduced computation time compared to `Pln`. See [this
+paper](https://joss.theoj.org/papers/10.21105/joss.06969) for a computational
+comparison between `Pln` and `PlnPCA`.
 
 ## Selecting the Rank