Better typing of cysteines #33
Open
arohou wants to merge 4 commits into
added 4 commits
May 14, 2026 16:29
…r agentic control, because otherwise popup windows requiring the human to click were getting in the way.
Adds `isolde validate {peptidebonds,rama,rotamers,clashes}` for
agent / scripted access to the same scoring as the GUI Validate
tab, plus a no-op `isolde validate` parent that lists the
subcommands when called bare (same for `isolde preflight`). Each
subcommand returns a structured dict (summary + items) and accepts
shared `log` / `saveFile` / `limit` keywords; summary lines include
a hint pointing the caller at how to see the full list.
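For a sense of what a "summary + items" return value might look like, here is a minimal sketch. The key names, the `build_report` helper and the wording of the hint are all assumptions made for illustration; the commit message only states that each subcommand returns a structured dict and accepts a `limit` keyword.

```python
from typing import Any, Dict, List, Optional


def build_report(kind: str, items: List[Dict[str, Any]],
                 limit: Optional[int] = None) -> Dict[str, Any]:
    """Bundle validation results into a summary plus a (possibly truncated) item list."""
    shown = items if limit is None else items[:limit]
    summary = f"{len(items)} {kind} outlier(s) found"
    if len(shown) < len(items):
        # Point the caller at how to retrieve everything (hypothetical wording).
        summary += f"; showing {len(shown)}, raise 'limit' to see the full list"
    return {"summary": summary, "items": shown}


report = build_report(
    "rotamer",
    [{"residue": "A:154"}, {"residue": "A:201"}, {"residue": "B:17"}],
    limit=2,
)
print(report["summary"])
```

Returning the truncated list alongside a count of the full list lets a scripted caller decide whether to re-query with a larger `limit`.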
Refactors the GUI peptide-bond and clashes panels and the
`rama` / `rota` `report=True` text dumps to share the new compute
helpers (`_compute_rama_report`, `_compute_rotamer_report`,
`classify_peptide_bonds`, `clash_atom_label`) so the new commands
and the legacy GUI/CLI surfaces stay in lock-step. `RamaMgr.cis()`
and `twisted()` now also read their cutoffs from
`defaults.CIS_PEPTIDE_BOND_CUTOFF` / `defaults.TWISTED_PEPTIDE_BOND_DELTA`
instead of hardcoded `radians(30)` / `radians(150)`.
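As a rough illustration of the cutoff change, the sketch below classifies a peptide bond from its omega dihedral using named constants rather than inline `radians(...)` literals. The constant names mirror those in the commit message, but their values (30 degrees each) and the function `classify_omega` are assumptions for illustration, not ISOLDE's actual code.

```python
from math import pi, radians

# Assumed values: the commit message names these constants but does not
# state them. 30 degrees matches the old hardcoded radians(30), and
# pi - radians(30) reproduces the old radians(150) threshold.
CIS_PEPTIDE_BOND_CUTOFF = radians(30)
TWISTED_PEPTIDE_BOND_DELTA = radians(30)


def classify_omega(omega: float) -> str:
    """Classify a peptide bond as 'cis', 'trans' or 'twisted' from omega (radians)."""
    magnitude = abs(omega)
    if magnitude <= CIS_PEPTIDE_BOND_CUTOFF:
        return "cis"
    if magnitude >= pi - TWISTED_PEPTIDE_BOND_DELTA:
        return "trans"
    return "twisted"
```

Keeping both thresholds in one place means the GUI panels and the scripted commands cannot drift apart if a cutoff is later tuned.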
`cys_type()` previously only special-cased Cys SG bonded to an atom literally named `CH3` (the ACEcyc head cap used for cyclic-peptide thioethers). All other external carbon partners -- covalent ligand warhead carbons, post-translationally modified Cys, designed bioconjugates, etc. -- fell through to the metal-binding branch and were mis-parameterised as CYM, with the wrong SG charge and an unstable S--C bond during simulation.
Match on `a.element.name == "C"` instead of the literal atom name so any external C--S bond picks the CYScyc / CCYScyc thioether template. These templates only depend on SG having one external bond; the partner's atom name does not affect the internal charges, so the broadened match is safe.
N-terminal Cys with an external C--S bond (residue carries H1) now explicitly returns CYM, since no NCYScyc template ships in termods.xml. Disulfide, metal, and iron-sulfur paths are unchanged.
Owner
|
Will take a look at this when I get the chance. In the meantime, I recently
pushed a commit that should help you with exactly this scenario. Not yet
polished enough to officially document, but it will let you add an
`isolde_template_name` property to any residue, which if present takes
precedence over the automatic assignment mechanisms.
…On Fri, 22 May 2026 at 02:48, Alexis Rohou ***@***.***> wrote:
I am aiming to enable more covalent modifications of Cys residues. In the
process of working it out, my agent made the following analysis.
I have NOT yet tested this, but I'd appreciate getting your review when
you get a chance.
Summary
cys_type() in openmm_interface.py decides which Amber template variant
(CYS / CYX / CYScyc / CYM / ...) applies to every CYS residue, by
inspecting the bonds on SG. Its thioether branch only matched a partner
atom whose *name* is literally "CH3" -- the head-cap carbon of the
synthetic ACEcyc residue used for cyclic-peptide thioethers. Any other
thioether partner on SG (a covalent-inhibitor warhead carbon, a
post-translationally modified Cys, a designed bioconjugate, ...) fell
through to the metal-binding branch and was assigned CYM (deprotonated
cysteine), which gives SG the wrong partial charge and breaks the
cross-residue bond during simulation.
Observed symptom
A protein with a small molecule covalently attached to a cysteine SG (e.g.
the warhead carbon of a bromoacetamide / acrylamide / etc. inhibitor, named
something like C1, C2, Cb) is loaded into ChimeraX with the
protein--ligand bond present in the model. After loading the ligand's
ffXML via *"Load residue MD definitions"* and clicking
*"Start simulation"*:
- The cysteine is parameterised as CYM (visible by inspecting
  find_residue_templates()'s assignment for that residue).
- During simulation, the SG--C bond between the residue and the ligand
  pulls apart -- the deprotonated CYM template assigns the wrong partial
  charge to SG, and with ignoreExternalBonds=True in
  _create_openmm_system the cross-residue bond is held only by the
  ChimeraX-side connectivity, so the mismatched template charges win.
The current practical workaround is a manual rename on the ChimeraX command
line before every simulation start, e.g.
    setattr #1/A:154 residues name CYScyc
which is fragile and easy to forget.
Root cause
In cys_type():
    for a in bonded_atoms:
        if a.residue != residue:
            if a.name == "SG":
                ...
                return 'CYX'
            elif a.name == "CH3":
                if 'OXT' in names:
                    return 'CCYScyc'
                else:
                    return 'CYScyc'
            # Assume metal binding - will eventually need to do something better here
            return 'CYM'
The elif a.name == "CH3": branch was intended to detect a cyclic-peptide
thioether (ACEcyc head-cap to Cys side-chain), but it only ever fires for
partners whose atom name happens to be CH3. Any other external atom bonded
to SG -- regardless of element -- falls into the "Assume metal binding"
path and returns CYM.
The thioether templates CYScyc / CCYScyc in amberff/termods.xml do not
depend on the partner atom's name; they just declare that SG carries one
external bond, with thioether-appropriate charges on CB, SG, and the
backbone. They are the correct template for *any* Cys-SG--carbon
thioether, not just the ACEcyc one.
Fix
Match on the partner atom's *element* instead of its name:
    elif a.element.name == "C":
        # SG bonded to any external carbon -- a thioether.
        # Previously this branch only matched a partner atom
        # literally named "CH3" (the ACEcyc head cap), which
        # missed thioether bonds to any other external carbon
        # (covalent-inhibitor warheads, post-translational
        # modifications, designed bioconjugates, ...). The
        # CYScyc / CCYScyc templates only depend on SG having
        # one external bond; the partner atom's name does not
        # affect the internal charges, so the broadened match
        # is safe.
        if 'OXT' in names:
            return 'CCYScyc'
        if 'H1' in names:
            # No NCYScyc template ships in termods.xml yet, so
            # return CYM rather than silently mis-parameterise
            # an N-terminal Cys with an S--C external bond.
            return 'CYM'
        return 'CYScyc'
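To make the behavioural difference concrete, here is a minimal standalone sketch comparing the old name-based branch with the new element-based one. `FakeAtom`, `old_thioether_branch` and `new_thioether_branch` are hypothetical stand-ins written for this example (they do not use ChimeraX objects), and a `None` return simply marks "fell through to the CYM fallback".

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class FakeAtom:
    """Hypothetical stand-in for a ChimeraX atom: a name and an element symbol."""
    name: str
    element: str


def old_thioether_branch(partner: FakeAtom, names: Set[str]) -> Optional[str]:
    """Old behaviour: only a partner literally named CH3 counts as a thioether."""
    if partner.name == "CH3":
        return "CCYScyc" if "OXT" in names else "CYScyc"
    return None  # falls through to the metal-binding fallback (CYM)


def new_thioether_branch(partner: FakeAtom, names: Set[str]) -> Optional[str]:
    """Proposed behaviour: any external carbon counts as a thioether partner."""
    if partner.element == "C":
        if "OXT" in names:
            return "CCYScyc"
        if "H1" in names:
            return "CYM"  # no NCYScyc template is available
        return "CYScyc"
    return None  # non-carbon partner: still falls through


mid_chain = {"N", "CA", "C", "O", "CB", "SG"}
warhead = FakeAtom(name="C1", element="C")
zinc = FakeAtom(name="ZN", element="Zn")

print(old_thioether_branch(warhead, mid_chain))  # None
print(new_thioether_branch(warhead, mid_chain))  # CYScyc
print(new_thioether_branch(zinc, mid_chain))     # None
```

The warhead carbon is the only case whose result changes here; the zinc partner still reaches the fallback under both versions.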
Why this is safe
- The original name == "CH3" case is a strict subset of element == "C",
  so previously supported scenarios keep returning the same template. The
  one exception is a Cys that carries H1: the old branch returned CYScyc
  for a CH3 partner, whereas the new branch returns CYM (see the known
  limitations).
- The thioether templates only define charges and bonds within the
  cysteine itself; they declare SG as an <ExternalBond> site. The
  partner atom is supplied by ChimeraX connectivity and is not modelled
  inside the cysteine template, so the partner's name and chemical
  identity never appear in the parameters consumed by OpenMM.
- The disulfide path (a.name == "SG"), the metal-binding fallback, the
  iron-sulfur cluster branch, and the free-Cys path are all unchanged.
Files changed
- isolde/src/openmm/openmm_interface.py -- cys_type()
Test scenario
1. Open in ChimeraX a structure with a thioether covalent bond between a
   non-terminal Cys SG and a small-molecule carbon. (Any covalent-inhibitor
   structure with the protein--ligand bond modelled as a single CONECT/LINK
   between SG and the warhead carbon will do.)
2. Load the ligand's ffXML via ISOLDE's *"Load residue MD definitions"*
   button.
3. Start a simulation (*"Start simulation"*).
4. *Before fix:* The cysteine is parameterised as CYM. The S--C bond
   between cysteine and ligand pulls apart over the first few frames of
   simulation.
5. *After fix:* The cysteine is parameterised as CYScyc (or CCYScyc if
   it is the C-terminal residue). The simulation runs cleanly; the S--C
   bond is preserved.
Regression checks
- *Cyclic-peptide thioether* (ACEcyc.CH3 -- CYS.SG): still picks
CYScyc / CCYScyc. The new branch is a strict superset of the old
name == "CH3" branch.
- *Disulfide* (Cys-Cys via SG--SG): still picks CYX / CCYX /
NCYX.
- *Free Cys*: still picks CYS / CCYS / NCYS.
- *Metal-coordinating Cys* (e.g. Zn-coordinating): still picks CYM
because the metal atom is not a carbon.
- *Iron-sulfur cluster Cys* (residue neighbours include SF4 / FES):
still picks MC_CYF via the unchanged early return.
Known limitations (out of scope for this fix)
- *N-terminal Cys with an external C--S bond* (residue carries H1):
termods.xml does not yet contain an NCYScyc template, so the new code
explicitly returns CYM for this case rather than silently picking the
wrong template. Adding the N-terminal variant is a separate change to
termods.xml.
- *Non-carbon thioether-like external partners* on SG (S--N, S--P):
still fall through to CYM. No biologically motivated case has come up
yet; a future PR could broaden this further if needed.
Commit Summary
- d4e0b47: Add preflight commands for disulfides and altlocs. This was needed
  for agentic control, because otherwise popup windows requiring the human
  to click were getting in the way.
- b53ae38: Add read-only `isolde validate` command suite
- 5098f1e: Commit change that had been missed earlier - needed for validate
  clash command
- da296bf: Detect thioether-bonded cysteines for any external carbon partner
File Changes
(13 files <https://github.com/tristanic/isolde/pull/33/files>)
- *M* isolde/docs/source/commands/isolde.rst
<https://github.com/tristanic/isolde/pull/33/files#diff-684f8e90a0485ff1e4ff6ed40f50c9d281510a90dff42e59e4e574797677d299>
(113)
- *M* isolde/src/atomic/building/build_utils.py
<https://github.com/tristanic/isolde/pull/33/files#diff-10df0eb152b4a194b4ed55b6ab7db33f126db2ed28931e32ab04b644406c5536>
(29)
- *M* isolde/src/atomic/util.py
<https://github.com/tristanic/isolde/pull/33/files#diff-3e5336b0d82d799fdf1dc01984ade93b7791a40c3c152fa7729ac36f32b67d95>
(23)
- *M* isolde/src/cmd/cmd.py
<https://github.com/tristanic/isolde/pull/33/files#diff-88e72fba07882e3da08e87885ec3aafe2abaff86b533b3e8c0e04216db175af1>
(87)
- *M* isolde/src/isolde.py
<https://github.com/tristanic/isolde/pull/33/files#diff-c3d807da03edc176a099da05002adaf36646e0cca05571ee12f9d08bf147533a>
(27)
- *M*
isolde/src/menu/model_building/disulphides/make_all_sensible_disulphides.py
<https://github.com/tristanic/isolde/pull/33/files#diff-cff6c318cf2b76853095be9bdfb2b37e3f930c9ae811205027a5b2a5e5b99713>
(10)
- *M* isolde/src/molobject.py
<https://github.com/tristanic/isolde/pull/33/files#diff-e168c16677e4e92592ae255e1c3c84b2cfbb6c98f84ef05b5420b8d1095a7ac6>
(19)
- *M* isolde/src/openmm/openmm_interface.py
<https://github.com/tristanic/isolde/pull/33/files#diff-793e6f0f9f6e552630540f573f4151c32b0cfe7dabacf241b831da36f6d537e5>
(20)
- *M* isolde/src/ui/main_win.py
<https://github.com/tristanic/isolde/pull/33/files#diff-033d8913cb48d3cae7a734807d17ecf816d53e76be3eeab94f36cfcc182a0877>
(23)
- *M* isolde/src/ui/validation_tab/clashes.py
<https://github.com/tristanic/isolde/pull/33/files#diff-7a9e4aeb58c51027c240cb6f551ae3f42f8303cc9bedc01bc87be11e16c4b4b2>
(9)
- *M* isolde/src/ui/validation_tab/peptide_bond.py
<https://github.com/tristanic/isolde/pull/33/files#diff-20351388606a2ba4d74f3802a380837479b183adb28eef05003d0e97dacf35ee>
(34)
- *M* isolde/src/validation/clashes.py
<https://github.com/tristanic/isolde/pull/33/files#diff-4b7c41e4ed9032b8b26c53f72f9bfd5f571d69d35b0c5328eaf107893b068068>
(11)
- *M* isolde/src/validation/cmd.py
<https://github.com/tristanic/isolde/pull/33/files#diff-2548bd4fd1ae5196764b6aa05892fc21ad5bf6427a46af0f45d727530cb45525>
(1094)
Patch Links:
- https://github.com/tristanic/isolde/pull/33.patch
- https://github.com/tristanic/isolde/pull/33.diff
|
Owner
|
Forgot to mention: you can make the assignment from the command line with
“setattr {atomspec} isolde_template_name {name}”.
Something I should have done ages ago, tbh.
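The precedence described here can be pictured with a small sketch: an explicit `isolde_template_name` attribute, when present, wins over whatever the automatic logic would choose. `FakeResidue`, `auto_template` and `template_for` are hypothetical names for this example; only the attribute name comes from the comments above, and how ISOLDE actually reads it is not shown in this thread.

```python
from typing import Callable, Optional


class FakeResidue:
    """Hypothetical stand-in for a residue object that accepts custom attributes."""

    def __init__(self, name: str):
        self.name = name


def auto_template(residue: FakeResidue) -> str:
    """Placeholder for automatic assignment (e.g. a cys_type()-style lookup)."""
    return "CYM" if residue.name == "CYS" else residue.name


def template_for(residue: FakeResidue,
                 fallback: Callable[[FakeResidue], str] = auto_template) -> str:
    """Return the user-supplied template name if set, else the automatic one."""
    override: Optional[str] = getattr(residue, "isolde_template_name", None)
    return override if override else fallback(residue)


cys = FakeResidue("CYS")
print(template_for(cys))            # CYM (automatic choice in this toy model)
cys.isolde_template_name = "CYScyc"
print(template_for(cys))            # CYScyc (explicit override wins)
```

Checking the override first keeps the automatic logic untouched, so residues without the attribute behave exactly as before.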
|
Author
|
OK thanks. My agent had recommended doing Also - you'll want to review my |