Refactor classification code to handle multiple classifications #44

alanocallaghan · 2025-10-20T14:04:23Z

Previously, the code assumed that a classification has one name. However, QuPath features can have multiple classifications: an object with class "A: B: C" is converted to JSON something like {[blah], "properties": {"classification": {"names": ["A", "B", "C"]}}}, whereas an object with class "A" is converted to something like {[blah], "properties": {"classification": {"name": "A"}}}.
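
For illustration, a minimal sketch of how both JSON shapes could be read into a single tuple of names. The helper `names_from_json` is hypothetical and is not part of qubalab; only the "name" and "names" keys come from the description above.

```python
from typing import Any, Mapping, Tuple


def names_from_json(classification: Mapping[str, Any]) -> Tuple[str, ...]:
    """Return the names from either JSON shape (hypothetical helper)."""
    if "names" in classification:
        # Multiple classifications, e.g. {"names": ["A", "B", "C"]}
        return tuple(classification["names"])
    if "name" in classification:
        # Single classification, e.g. {"name": "A"}
        return (classification["name"],)
    raise ValueError("classification has neither 'name' nor 'names'")


multi = {"properties": {"classification": {"names": ["A", "B", "C"]}}}
single = {"properties": {"classification": {"name": "A"}}}

print(names_from_json(multi["properties"]["classification"]))   # ('A', 'B', 'C')
print(names_from_json(single["properties"]["classification"]))  # ('A',)
```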

This PR removes the assumption of a single class while ensuring a single name can be fetched using a mocked name property that joins on ": ", similar to objects with multiple classes in QuPath.
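
A rough sketch of the idea, assuming the names are stored as a tuple and `name` is derived from them. `ClassificationSketch` is an illustrative stand-in, not the actual qubalab `Classification` class.

```python
from typing import Iterable, Tuple


class ClassificationSketch:
    """Illustrative stand-in, not the qubalab Classification class."""

    def __init__(self, names: Iterable[str]):
        self._names = tuple(names)

    @property
    def names(self) -> Tuple[str, ...]:
        # All name parts, in order
        return self._names

    @property
    def name(self) -> str:
        # Read-only: derived from names by joining on ": "
        return ": ".join(self._names)


c = ClassificationSketch(("A", "B", "C"))
print(c.names)  # ('A', 'B', 'C')
print(c.name)   # A: B: C
```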

Resolve #43

qubalab/objects/classification.py

Rylern

Looks good

petebankhead · 2025-10-21T05:47:48Z

Can you add code examples to the PR for how this should look?

I think I like the idea that classification.names should return all the names.

From the description, I think classification.name exists as a read-only property, and I’m not sure if that’s desirable. In QuPath, the concatenation is only applied with PathClass.toString() so the closest here will be through str(classification) I think.
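
A sketch of the `str(classification)` alternative, assuming the class exposes only `names` and leaves the joined form to `__str__`. `StrOnlyClassification` is a made-up name for illustration and does not reflect the merged code.

```python
class StrOnlyClassification:
    """Illustrative only: no `name` attribute, joined form via str()."""

    def __init__(self, names):
        self.names = tuple(names)

    def __str__(self) -> str:
        # Concatenation happens only when a string is requested
        return ": ".join(self.names)


c = StrOnlyClassification(("Tumor", "Positive"))
print(c.names)             # ('Tumor', 'Positive')
print(str(c))              # Tumor: Positive
print(hasattr(c, "name"))  # False
```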

I mention it because I have some regrets at introducing both PathObject.classification and PathObject.classifications since it is really easy to mix them up.

Also, it seems counterintuitive to me that names is not made up of multiple name parts, but rather name is created by combining the names.

However the main thing I’d like is a clear example in the first post showing how the Python code looks. The description is a bit hard to parse, since it’s not clear if the changes are made only for JSON purposes.

alanocallaghan · 2025-10-21T06:46:37Z

> Can you add code examples to the PR for how this should look?

I'm not sure what I would be demonstrating?

> From the description, I think classification.name exists as a read-only property, and I’m not sure if that’s desirable.

I would tend to treat these as effectively immutable anyway.

> Also, it seems counterintuitive to me that names is not made up of multiple name parts, but rather name is created by combining the names.

It is admittedly not the best time to try parsing this but I don't understand what you mean.

> The description is a bit hard to parse, since it’s not clear if the changes are made only for JSON purposes.

Not entirely; previously we did not handle multiple classifications in any meaningful way. Technically, I suppose you could have done so by manually manipulating parts of the hierarchical name (without parsing them from QuPath).

petebankhead · 2025-10-21T07:47:54Z

> > Can you add code examples to the PR for how this should look?
>
> I'm not sure what I would be demonstrating?

Example of intended use and behavior:

classification = Classification.get_cached_classification(('Tumor', 'Positive'))

classification.names # ('Tumor', 'Positive')
classification.name  # 'Tumor: Positive'

As mentioned at #45 I presume this would 'work', but could be problematic:

classification = Classification(('Tumor', 'Positive'))

Although writing that has me thinking that the following could be problematic as well:

classification = Classification('Tumor: Positive')
classification = Classification.get_cached_classification('Tumor: Positive')

And I wonder what would happen if someone tries

classification = Classification(['Tumor', 'Positive'])

(I haven't tried running the code, this is only from reading it - maybe there are guards I'm missing, or my Python is too rusty)
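
One way the different inputs above could be made equivalent, as a sketch only. The helper `normalize_names` is hypothetical, and whether a joined string such as 'Tumor: Positive' should be split at all is a design decision left open here; the sketch assumes it is split on ":" with surrounding whitespace stripped.

```python
from typing import Iterable, Tuple, Union


def normalize_names(value: Union[str, Iterable[str]]) -> Tuple[str, ...]:
    """Coerce a string, tuple, or list into a tuple of name parts (hypothetical helper)."""
    if isinstance(value, str):
        # Assumption: a joined string is split on ":" and each part is stripped
        parts = tuple(part.strip() for part in value.split(":"))
    else:
        # Tuples, lists, and other iterables become a hashable tuple
        parts = tuple(value)
    if not parts or not all(isinstance(p, str) and p for p in parts):
        raise ValueError(f"invalid classification names: {value!r}")
    return parts


print(normalize_names("Tumor: Positive"))        # ('Tumor', 'Positive')
print(normalize_names(("Tumor", "Positive")))    # ('Tumor', 'Positive')
print(normalize_names(["Tumor", "Positive"]))    # ('Tumor', 'Positive')
```

Converting to a tuple up front also matters for caching: a list is unhashable, so any cache keyed on the raw argument would raise a TypeError when given a list.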

> > From the description, I think classification.name exists as a read-only property, and I’m not sure if that’s desirable.
>
> I would tend to treat these as effectively immutable anyway.

Fair. It's the use of name and names that I think is more problematic, since they are similar. It's also inconsistent with QuPath, e.g. if I have a classification Tumor: Positive then calling PathClass.getName() would give me Positive - and PathClass.toString() would give me Tumor: Positive.
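
To make the difference concrete, here is a Python sketch that mimics the two QuPath behaviors as described above. The function names are invented for illustration; this is not QuPath or qubalab code.

```python
from typing import Sequence


def get_name_like_qupath(names: Sequence[str]) -> str:
    # Last part only, matching the getName() behavior described above
    return names[-1]


def to_string_like_qupath(names: Sequence[str]) -> str:
    # All parts joined on ": ", matching the toString() behavior described above
    return ": ".join(names)


names = ("Tumor", "Positive")
print(get_name_like_qupath(names))   # Positive
print(to_string_like_qupath(names))  # Tumor: Positive
```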

I'm not saying we should replicate QuPath's confusing behavior, but maybe an alternative to name can be found.

> > Also, it seems counterintuitive to me that names is not made up of multiple name parts, but rather name is created by combining the names.
>
> It is admittedly not the best time to try parsing this but I don't understand what you mean.

I'm referring to the normal use of singular and plural forms.

Expected
Apples 🍎🍎🍎
Apple 🍎

Unexpected
Apples (🍎, 🍎, 🍎)
Apple 🍎: 🍎: 🍎

alanocallaghan · 2025-10-21T08:40:29Z

> Fair. It's the use of name and names that I think is more problematic, since they are similar. [...]
> I'm not saying we should replicate QuPath's confusing behavior, but maybe an alternative to name can be found.

I was aiming to parallel classification and classifications in QuPath. I could change to name_parts...?

> It's also inconsistent with QuPath, e.g. if I have a classification Tumor: Positive then calling PathClass.getName() would give me Positive - and PathClass.toString() would give me Tumor: Positive.

This is extremely unintuitive to me to the point that I don't understand why it's implemented this way or why this behaviour is at all desirable.

It seems like the API requires further changes. I will write some unit tests, update the documentation, and add example code in a future PR.

alanocallaghan · 2025-10-21T08:41:48Z

Also I would suggest we keep further discussion in one thread? Fine with the issue or here

petebankhead · 2025-10-21T09:23:19Z

> It seems like the API requires further changes. I will write some unit tests, update the documentation, and add example code in a future PR.

Yes. Before going too deep, we need a clear definition of what exactly the code is meant to be doing. There are a lot of issues / tradeoffs to balance:

- Support for single and multiple classifications
- Equality testing and ordering (CD3: CD8 vs. CD8: CD3)
- Uniqueness of classification names (CD3: CD3 possible?)
- Creating a convenient string representation (if this is even necessary?)
  - ":" may have been a bad choice originally, see "Class label entry adds space after a colon" qupath#507
- JSON serializability (and conversion to/from a QuPath-friendly format)
- Linking colors with classifications consistently
- Access to constructors or use of singletons for efficiency
  - QuPath enforces singletons partly because creating 1,000,000 objects or 1,000,000 cells would have a non-trivial overhead
- Similarity to QuPath's API
- Feels sensible in Python... not weirdly-ported Java
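
To illustrate how two of these points interact (singletons and ordering), here is a minimal sketch that caches one instance per ordered tuple of names. `CachedClassification` and its `get` method are hypothetical and are not the qubalab implementation; treating order as significant is an assumption, not a decision.

```python
from typing import Dict, Tuple


class CachedClassification:
    """Illustrative only; not the qubalab implementation."""

    _cache: Dict[Tuple[str, ...], "CachedClassification"] = {}

    def __init__(self, names: Tuple[str, ...]):
        self.names = names

    @classmethod
    def get(cls, *names: str) -> "CachedClassification":
        # One shared instance per ordered tuple of names
        key = tuple(names)
        if key not in cls._cache:
            cls._cache[key] = cls(key)
        return cls._cache[key]


a = CachedClassification.get("CD3", "CD8")
b = CachedClassification.get("CD3", "CD8")
c = CachedClassification.get("CD8", "CD3")

print(a is b)  # True: same names, same order, same instance
print(a is c)  # False: order is significant with a tuple key
print(frozenset(a.names) == frozenset(c.names))  # True: an unordered comparison would treat them as equal
```

Keying on a tuple keeps "CD3: CD8" and "CD8: CD3" distinct; keying on a frozenset would merge them but would also make "CD3: CD3" indistinguishable from "CD3".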

> Also I would suggest we keep further discussion in one thread? Fine with the issue or here

Sure, we can drop this thread and factor the list above into a new discussion on any proposed redesign.

> This is extremely unintuitive to me to the point that I don't understand why it's implemented this way or why this behaviour is at all desirable.

QuPath's approach has (just about) served its purpose for a decade, but I'm not going to argue it's how it ought to have been designed. So this is a chance to do something better.

alanocallaghan requested a review from Rylern October 20, 2025 14:04

Apply suggestion from @alanocallaghan

f880565

Rylern approved these changes Oct 20, 2025

Merge branch 'main' into handle-classifications

7a34137

alanocallaghan merged commit 499fa1f into qupath:main Oct 20, 2025
3 checks passed

alanocallaghan deleted the handle-classifications branch October 20, 2025 19:02

petebankhead mentioned this pull request Oct 21, 2025

Creating classifications and checking equality could be confusing #45

Closed

alanocallaghan mentioned this pull request Oct 21, 2025

Make classifications singleton #46

Merged
