Skip to content

Replace deepcopy with coord save/restore in CIF writing#15

Open
longleo17 wants to merge 1 commit intobytedance:mainfrom
longleo17:pr/avoid-deepcopy-cif
Open

Replace deepcopy with coord save/restore in CIF writing#15
longleo17 wants to merge 1 commit intobytedance:mainfrom
longleo17:pr/avoid-deepcopy-cif

Conversation

@longleo17
Copy link

Summary

Replaces copy.deepcopy(atom_array) with a coord save/restore pattern in save_structure_cif(). For each of N_sample outputs (100-500), the old code deep-copied the entire AtomArray just to set new coordinates. The swap pattern just reassigns the .coord attribute reference — no data is copied.

Measured impact (synthetic benchmark)

  • 39.5x faster on the coordinate assignment path
  • Small fraction of total inference time, but eliminates 100-500 unnecessary full deep copies per run

Changes

  • pxdesign/runner/dumper.pysave_structure_cif(): save original coords, assign predicted coords, write CIF, restore original

Test plan

  • pytest tests/test_avoid_deepcopy_cif.py -v
  • Run inference, verify CIF output is identical

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR optimizes CIF dumping during inference by avoiding copy.deepcopy(atom_array) and instead temporarily swapping AtomArray.coord to the predicted coordinates while writing, then restoring the original coordinates.

Changes:

  • Updated save_structure_cif() in the runner dumper to use coord save/restore instead of deep-copying the full AtomArray.
  • Added a test to enforce that copy.deepcopy() is no longer used in pxdesign/runner/dumper.py and to validate the basic coord swap/restore pattern.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

File Description
pxdesign/runner/dumper.py Replaces deep-copy-based CIF writing with coord swapping for improved performance.
tests/test_avoid_deepcopy_cif.py Adds regression tests to prevent reintroduction of copy.deepcopy() and to validate coord restoration behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +197 to +205
original_coord = atom_array.coord
atom_array.coord = pred_coordinate.cpu().numpy()
save_atoms_to_cif(
output_fpath,
pred_atom_array,
atom_array,
entity_poly_type,
pdb_id,
)
atom_array.coord = original_coord
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

atom_array.coord is restored only on the happy path. If save_atoms_to_cif() raises (I/O error, invalid structure, etc.), the function exits early and leaves atom_array mutated for the rest of the run. Wrap the coord swap in a try/finally so the original coordinates are restored even on exceptions.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants