Skip to content

Performance improvements with KD-tree and grid-based algorithms#12

Merged
MayarMAhmed merged 15 commits intomainfrom
perf-improvements
Mar 11, 2026
Merged

Performance improvements with KD-tree and grid-based algorithms#12
MayarMAhmed merged 15 commits intomainfrom
perf-improvements

Conversation

@aalexmmaldonado
Copy link
Member

This PR replaces dense pairwise distance matrices throughout the POVME codebase with spatial data structures that operate in O(N log N) time and O(N) memory. The test suite runs 10.8× faster (291s → 27s), and peak memory usage for large systems drops substantially depending on the operation.

The changes preserve algorithmic correctness — the same points are kept/removed, the same volumes are computed, and the same pockets are identified. All existing tests pass with unchanged expected values.

Motivation

POVME's core operations (clash detection, neighbor counting, pocket separation, contiguity enforcement) all require answering the question "which points are close to which other points?" The previous implementation answered this by computing all pairwise distances via cdist, pdist, or sparse_distance_matrix().todense(), producing N×N or N×M dense matrices. For typical production systems (50K grid points, 10K atoms), these matrices consumed massive amounts of RAM and dominated both runtime and memory.

The key insight is that POVME only needs nearby pairs (within 2–3 Å), and on a regular grid, each point has at most 26 neighbors. Spatial data structures like KD-trees and integer-grid algorithms provide direct access to these sparse neighbor relationships without materializing the dense matrix.

Changes by file

hull.py — Per-frame volume calculation

Operation Before After
Contiguity neighbor counting squareform(pdist(pts)) _count_neighbors_kdtree() using KDTree.query_ball_point(return_length=True)
Contiguous pocket flood fill cdist(pts, contig_pts) in while-loop flood_fill_contiguous_kdtree() using KD-tree BFS
Convex hull point-in-hull test Python for-loop over all points × all triangles scipy.spatial.Delaunay.find_simplex() with _vectorized_hull_check() fallback

gridmesh.py — Pocket detection

Operation Before After
Hull point-in-hull test [pt for pt in pts if hull.inside_hull(pt)] Vectorized NumPy triangle test with early exit (largest time impact)
Atom clash removal scipy.spatial.KDTree + Ray chunking + sparse_distance_matrix Single KDTree.query_ball_tree() call
Isolated point filtering sparse_distance_matrix().todense() KDTree.query_ball_point(return_length=True)
Pocket separation BFS on dense N×N matrix scipy.ndimage.label on 3D boolean grid
Point expansion Triple-nested Python for-loop np.meshgrid + broadcast addition

volume.py — Point field generation

Operation Before After
Exclusion point removal cdist(pts, pts_exclusion) < 1e-7 np.isin on np.void views

regions.py — Region point generation

Operation Before After
Spherical region distance cdist(result, center.reshape(1, -1)) np.einsum vectorized norm

detect.py — Pocket detection pipeline

Operation Before After
Cluster radius calculation cdist(np.array([center]), cluster_pts) np.einsum vectorized norm

config/io.py — Pydantic config

  • Fix PydanticDeprecatedSince211 warning: access model_fields on the class (self.__class__.model_fields) instead of the instance.

Performance impact

Test suite runtime on my laptop.
(Super reproducible by the way /s)

Stage Time
Before all changes 291s
After hull.py + volume.py + regions.py changes 183s
After gridmesh.py vectorized hull + KDTree changes 27s

Design decisions

Delaunay vs custom gift-wrapping hull

The volume path now uses Delaunay.find_simplex() for convex hull tests, which is fully C-level and ~100× faster. A try/except block falls back to the vectorized triangle test, handling degenerate geometries where the Delaunay construction fails.

scipy.ndimage.label for pocket separation

Since grid points lie on a regular lattice, they can be mapped to integer indices (i, j, k) and processed as a 3D image. Connected-component labeling with 26-connectivity replaces BFS for dense distance matrices.

Set difference for exclusion removal

Both inclusion and exclusion point sets are snapped to the same grid by snap_points(), guaranteeing bit-identical coordinates for coincident points. This allows np.isin to be used on structured-array views instead of cdist.

…lgorithms

The GridMesh class had several operations that materialized dense N×N
distance matrices, causing excessive memory usage and slow runtimes
(e.g., 20 GB for 50K points). This commit replaces them with spatial
data structures and image-processing algorithms that operate in O(N)
memory.

Changes:

- `TaskRemovePointsOutsideHull`: Replace per-point Python loop calling
  `hull.inside_hull()` with vectorized NumPy triangle test. For each hull
  triangle, all candidate points are tested simultaneously via a single
  matrix-vector dot product. Points classified as outside are excluded
  from subsequent triangle iterations (early exit). This was the single
  highest-impact change, reducing detection test time from ~180s to ~27s.
- `filter_isolated_points_until_no_change`: Replace
  `sparse_distance_matrix().todense()` (which converted a sparse matrix
  to a full dense N×N matrix) with `KDTree.query_ball_point()` using
  `return_length=True`.
- `separate_out_pockets`: Replace BFS on the dense distance matrix with
  `scipy.ndimage.label` on a 3D boolean grid. Points are mapped to
  integer (i,j,k) indices on the regular lattice, and connected-
  component labeling with 26-connectivity identifies pockets in O(N)
  time and memory.
- `expand_around_existing_points`: Replace triple-nested Python for loop
  with `np.meshgrid` offset generation and NumPy broadcast addition.

The `__keep_limited_points` and `__delete_limited_points` helper methods
are now unused and are removed.
…ence

`gen_points()` previously used `cdist(pts, pts_exclusion) < 1e-7` to
find inclusion grid points that overlap with exclusion grid points.
This materialized a full (N × M) float64 distance matrix to perform what is
logically a set difference.

Since both point sets are snapped to the same regular grid by
`snap_points()`, coincident points have bit-identical coordinates.
The new `remove_exclusion_points()` function exploits this by:

1. Reinterpreting each (x, y, z) row as a single `np.void` element
   via a zero-copy structured-array view (`_to_void`).
2. Using `np.isin()` for hash-based membership testing.

Also extracted `remove_exclusion_points()` as a standalone function
(previously inline in `gen_points()`) and added module and function
docstrings.
…launay algorithms

The per-frame `ConvexHull.volume()` method had three operations that
materialized dense pairwise distance matrices or used pure-Python
loops over all grid points. This commit replaces them with KD-tree
queries and scipy's C-level Delaunay triangulation.

Changes:

- Contiguity neighbor counting: Replace `squareform(pdist(pts))`
  with `_count_neighbors_kdtree()`, a new helper that uses
  `KDTree.query_ball_point(..., return_length=True)` to count
  neighbors in O(N log N) time and O(N) memory.
- Contiguous pocket flood fill: Replace the `while`-loop that called
  `cdist(pts, contig_pts)` on every iteration (materializing an
  N×M matrix that grew each iteration) with
  `flood_fill_contiguous_kdtree()`, a new function that builds a
  KD-tree once, pre-computes all neighbor lists in a single
  `query_ball_point` call, then runs BFS over the adjacency graph.
- Convex hull point-in-hull test: Replace the pure-Python loop
  (`for pt in old_pts: outside_hull(pt, hull)`) with
  `scipy.spatial.Delaunay.find_simplex()`, which performs all
  point-in-hull tests in a single C-level call. Falls back to
  `_vectorized_hull_check()` (new helper using NumPy broadcast
  triangle tests) if Delaunay construction fails on degenerate
  geometry.

New module-level helpers:

- `_count_neighbors_kdtree()`: KD-tree neighbor counting.
- `flood_fill_contiguous_kdtree()`: KD-tree BFS flood fill.
- `_vectorized_hull_check()`: NumPy-vectorized triangle half-space
  test (Delaunay fallback).
@aalexmmaldonado aalexmmaldonado requested review from a team and MayarMAhmed as code owners March 10, 2026 18:49
@codecov-commenter
Copy link

Codecov Report

❌ Patch coverage is 77.14286% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.01%. Comparing base (d9fa4ac) to head (ccb125c).

Files with missing lines Patch % Lines
povme/points/hull.py 58.33% 23 Missing and 2 partials ⚠️
povme/points/gridmesh.py 89.83% 3 Missing and 3 partials ⚠️
povme/config/io.py 50.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #12      +/-   ##
==========================================
+ Coverage   77.31%   79.01%   +1.69%     
==========================================
  Files          15       15              
  Lines        1133     1158      +25     
  Branches      157      162       +5     
==========================================
+ Hits          876      915      +39     
+ Misses        192      180      -12     
+ Partials       65       63       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@MayarMAhmed MayarMAhmed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I tested this against the CNR1 case (472 residues) that had been triggering the memory explosion previously, and it worked.

@MayarMAhmed MayarMAhmed merged commit 0567210 into main Mar 11, 2026
2 checks passed
@MayarMAhmed MayarMAhmed deleted the perf-improvements branch March 11, 2026 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants