Skip to content

Conversation

@tanii1125
Copy link
Contributor

@tanii1125 tanii1125 commented Dec 9, 2025

This PR implements enhancement proposed in issue #5170

Summary

EinsteinMSD currently lowercases the msd_type string but does not:

  • strip whitespace
  • validate the type
  • handle mixed-case safely

What This PR Does?

  1. Add .strip() before .lower() to handle inputs like " xy ".
  2. Add type validation (raise TypeError when msd_type is not a string).
  3. Add new tests to cover:
    • whitespace (" xy ")
    • uppercase/mixed case (" Xz ")
    • non-string inputs

All tests pass locally


📚 Documentation preview 📚: https://mdanalysis--5173.org.readthedocs.build/en/5173/

@codecov
Copy link

codecov bot commented Dec 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.73%. Comparing base (bbcef1b) to head (83594a4).
⚠️ Report is 5 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #5173      +/-   ##
===========================================
+ Coverage    92.72%   92.73%   +0.01%     
===========================================
  Files          180      180              
  Lines        22472    22473       +1     
  Branches      3188     3189       +1     
===========================================
+ Hits         20837    20841       +4     
+ Misses        1177     1175       -2     
+ Partials       458      457       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment on lines 385 to 390
# Validate type first
if not isinstance(self.msd_type, str):
raise TypeError("msd_type must be a string")

# strip whitespace + lowercase
self.msd_type = self.msd_type.strip().lower()
Copy link
Contributor Author

@tanii1125 tanii1125 Dec 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added these lines in msd.py , they do -

  1. Explicitly checks that msd_type is a string to avoid unexpected AttributeErrors.
  2. Uses .strip().lower() to handle inputs with leading/trailing whitespace
    (e.g. " xy ", " Xz "), while preserving existing case-insensitive behavior.

Comment on lines 126 to 139
def test_msd_type_whitespace(self, u, SELECTION):
m = MSD(u, SELECTION, msd_type=" xy ", fft=False)
assert m.dim_fac == 2
assert m._dim == [0, 1]

def test_msd_type_uppercase(self, u, SELECTION):
m = MSD(u, SELECTION, msd_type=" Xz ", fft=False)
assert m.dim_fac == 2
assert m._dim == [0, 2]

def test_msd_type_nonstring(self, u, SELECTION):
with pytest.raises(TypeError):
MSD(u, SELECTION, msd_type=123, fft=False)

Copy link
Contributor Author

@tanii1125 tanii1125 Dec 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests covering msd_type input handling in EinsteinMSD:

  • whitespace handling (e.g. " xy ")
  • case-insensitive parsing (existing behavior)
  • non-string inputs raising TypeError

@marinegor marinegor self-requested a review December 23, 2025 17:29
@Dreamstick9
Copy link

Hey! Just checking out the changes.

Since this ecosystem relies heavily on NumPy, users might be passing msd_type as a numpy.str_ (like if they read their config from an array).
Screenshot 2025-12-24 at 9 39 02 AM
This check would strictly raise a TypeError for them, even though it likely worked fine before via duck typing.

It might be safer to either check against (str, np.str_) or just let the .strip().lower() call handle it implicitly so we don't break those workflows.

Also, minor edge case: if the input is just whitespace (e.g. " "), strip() will result in an empty string "". It might be worth raising a ValueError there so it doesn't fail silently downstream?

Nice cleanup on the parsing logic though!

@tanii1125
Copy link
Contributor Author

@Dreamstick9 thank you for pointing this out !

Since the use of .strip() is in question, I’ll update this PR accordingly:

  • keep .lower() while avoiding breaking string-like inputs (e.g. numpy.str_)
  • add an explicit check to raise a clear error for empty input

@Dreamstick9
Copy link

One more thing I noticed in test_msd.py:

test_msd_type_whitespace_around_valid_value expects a ValueError for " xy ".

Since the PR goal is to support whitespace, shouldn't this test expect success instead of an error?

Currently, this test will likely fail because the code raises a KeyError (due to the unstripped string) rather than the expected ValueError.

@tanii1125
Copy link
Contributor Author

This is intentional. As discussed in #5170, we decided not to silently normalize whitespace. Inputs like " xy " are treated as invalid, and the test documents that behavior explicitly.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the associated issue, the enhancement by adding extra checks is likely uncessary.

My proposal would be to move the lower() call to line 388 instead when we set self._dim and then add an AttributeError to the except.

@tanii1125
Copy link
Contributor Author

@IAlibay Thanks for the clarification.
I agree that keeping the behavior minimal is preferable.
I’ll update the implementation to move the .lower() call into the dictionary lookup and handle AttributeError explicitly there, without adding additional normalization or checks. I’ll also adjust the tests accordingly.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add your handle to the changelog please.

try:
self._dim = keys[self.msd_type]
self._dim = keys[self.msd_type.lower()]
except AttributeError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't double except here, just do the one exception message and add the word "string" in the message.

Comment on lines 126 to 133
def test_msd_type_uppercase(self, u, SELECTION):
m = MSD(u, SELECTION, msd_type="Xz", fft=False)
assert m.dim_fac == 2
assert m._dim == [0, 2]

def test_msd_type_nonstring(self, u, SELECTION):
with pytest.raises(TypeError):
MSD(u, SELECTION, msd_type=123, fft=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make these extra parameters of the test_msdtype_error method above.

@tanii1125 tanii1125 requested a review from IAlibay January 3, 2026 17:44
"xy, xz, yz, x, y, z".format(self.msd_type)
self._dim = keys[self.msd_type.lower()]
except (AttributeError, KeyError):
raise TypeError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raise TypeError(
raise ValueError(

Keep the value error, it's an error in the input value.

self._dim = keys[self.msd_type.lower()]
except (AttributeError, KeyError):
raise TypeError(
"msd_type must be a string and one of: xyz, xy, xz, yz, x, y, z"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"msd_type must be a string and one of: xyz, xy, xz, yz, x, y, z"
f"Invalid msd_type {msd_type} must be a string and one of: xyz, xy, xz, yz, x, y, z"

Keep the parameter exposed in the error message.

Comment on lines 120 to 134
@pytest.mark.parametrize(
"msd_type, exc",
[
("Xz", None), # valid, mixed case.
(123, TypeError), # non-string.
],
)
def test_msdtype_error(self, u, SELECTION, msd_type, exc):
if exc is None:
m = MSD(u, SELECTION, msd_type=msd_type, fft=False)
assert m.dim_fac == 2
assert m._dim == [0, 2]
else:
with pytest.raises(exc):
MSD(u, SELECTION, msd_type=msd_type, fft=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't do this, keep all the old parameters ("foo", "bar", etc...) and solely check for the ValueError to be raised, just like the old test.

],
)
def test_msdtype_error(self, u, SELECTION, msd_type, exc):
if exc is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do this, keep this solely to checking for an error being raised with bad inputs.

@tanii1125 tanii1125 requested a review from IAlibay January 4, 2026 10:55
@IAlibay IAlibay merged commit a029dcd into MDAnalysis:develop Jan 4, 2026
16 of 24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants