Skip to content

Conversation

@alanocallaghan
Copy link
Contributor

Previously the assumption was a classification has one name. However QuPath features can have multiple classifications: an object with class "A: B: C" is converted to JSON something like {[blah], "properties": {"classification": {"names": ["A", "B", "C"]}}} whereas an object with class "A" is converted something like {[blah], "properties": {"classification": {"name": "A"}}}

This PR removes the assumption of a single class while ensuring a single name can be fetched using a mocked name property that joins on ": ", similar to objects with multiple classes in QuPath.

Resolve #43

Previously the assumption was a classification has one name.
However QuPath features can have multiple classifications:
an object with class "A: B: C" is converted to JSON something like
{[blah], "properties": {"classification": {"names": ["A", "B", "C"]}}}
whereas an object with class "A" is converted something like
{[blah], "properties": {"classification": {"name": "A"}}}

This PR removes the assumption of a single class while ensuring a single
name can be fetched using a mocked `name` property that joins on ": ",
similar to objects with multiple classes in QuPath.

Resolve qupath#43
@alanocallaghan alanocallaghan requested a review from Rylern October 20, 2025 14:04
Copy link
Contributor

@Rylern Rylern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@alanocallaghan alanocallaghan merged commit 499fa1f into qupath:main Oct 20, 2025
3 checks passed
@alanocallaghan alanocallaghan deleted the handle-classifications branch October 20, 2025 19:02
@petebankhead
Copy link
Member

Can you add code examples to the PR for how this should look?

I think I like the ideal that classification.names should return all the names.

From the description, I think classification.name exists as a read-only property, and I’m not sure if that’s desirable. In QuPath, the concatenation is only applied with PathClass.toString() so the closest here will be through str(classification) I think.

I mention it because I have some regrets at introducing both PathObject.classification and PathObject.classifications since it is really easy to mix them up.

Also, it seems counterintuitive to me that names is not made up of multiple name parts, but rather name is created by combining the names.

However the main thing I’d like is a clear example in the first post showing how the Python code looks. The description is a bit hard to parse, since it’s not clear if the changes are made only for JSON purposes.

@alanocallaghan
Copy link
Contributor Author

Can you add code examples to the PR for how this should look?

I'm not sure what I would be demonstrating?

From the description, I think classification.name exists as a read-only property, and I’m not sure if that’s desirable.

I would tend to treat these as effectively immutable anyway.

Also, it seems counterintuitive to me that names is not made up of multiple name parts, but rather name is created by combining the names.

It is admittedly not the best time to try parsing this but I don't understand what you mean.

The description is a bit hard to parse, since it’s not clear if the changes are made only for JSON purposes.

Not entirely; previously we did not really handle multiple classifications in any meaningful way. Technically I suppose you could have by manually manipulating parts of the hierarchical name (without parsing them from qupath)

@petebankhead
Copy link
Member

Can you add code examples to the PR for how this should look?
I'm not sure what I would be demonstrating?

Example of intended use and behavior:

classification = Classification.get_cached_classification(('Tumor', 'Positive'))

classification.names # ('Tumor', 'Positive')
classification.name  # 'Tumor: Positive'

As mentioned at #45 I presume this would 'work', but could be problematic:

classification = Classification(('Tumor', 'Positive'))

Although writing that has me thinking that the following could be problematic as well:

classification = Classification('Tumor: Positive'))
classification = Classification.get_cached_classification('Tumor: Positive')

And I wonder what would happen if someone tries

classification = Classification(['Tumor', 'Positive'])

(I haven't tried running the code, this is only from reading it - maybe there are guards I'm missing, or my Python is too rusty)

From the description, I think classification.name exists as a read-only property, and I’m not sure if that’s desirable.

I would tend to treat these as effectively immutable anyway.

Fair. It's the use of name and names that I think is more problematic, since they are similar. It's also inconsistent with QuPath, e.g. if I have a classification Tumor: Positive then calling PathClass.getName() would give me Positive - and PathClass.toString() would give me Tumor: Positive.

I'm not saying we should replicate QuPath's confusing behavior, but maybe an alternative to name can be found.

Also, it seems counterintuitive to me that names is not made up of multiple name parts, but rather name is created by combining the names.

It is admittedly not the best time to try parsing this but I don't understand what you mean.

I'm referring to the normal use of singular and plural forms.

Expected
Apples 🍎🍎🍎
Apple 🍎

Unexpected
Apples (🍎, 🍎, 🍎)
Apple 🍎: 🍎: 🍎

@alanocallaghan
Copy link
Contributor Author

alanocallaghan commented Oct 21, 2025

Fair. It's the use of name and names that I think is more problematic, since they are similar. [...]
I'm not saying we should replicate QuPath's confusing behavior, but maybe an alternative to name can be found.

I was aiming to parallel classification and classifications in QuPath. I could change to name_parts...?

It's also inconsistent with QuPath, e.g. if I have a classification Tumor: Positive then calling PathClass.getName() would give me Positive - and PathClass.toString() would give me Tumor: Positive.

This is extremely unintuitive to me to the point that I don't understand why it's implemented this way or why this behaviour is at all desirable.

It seems like the API requires further changes, I will write some unit tests, update the documentation, and add example code in a future PR.

@alanocallaghan
Copy link
Contributor Author

Also I would suggest we keep further discussion in one thread? Fine with the issue or here

@petebankhead
Copy link
Member

It seems like the API requires further changes, I will write some unit tests, update the documentation, and add example code in a future PR.

Yes. Before going too deep, we need a clear definition of what exactly the code is meant to be doing. There are a lot of issues / tradeoffs to balance:

  1. Support for single and multiple classifications
  2. Equality testing and ordering (CD3: CD8 vs. CD8: CD3)
  3. Uniqueness of classification names (CD3: CD3 possible?)
  4. Creating a convenient string representation (if this is even necessary?)
  5. JSON serializability (and conversion to/from a QuPath-friendly way)
  6. Linking colors with classifications consistently
  7. Access to constructors or use of singletons for efficiency
    • QuPath enforces singletons partly because creating 1,000,000 objects or 1,000,000 cells would have a non-trivial overhead
  8. Similarity to QuPath's API
  9. Feels sensible in Python... not weirdly-ported Java

Also I would suggest we keep further discussion in one thread? Fine with the issue or here

Sure, we can drop this thread and factor the list above into a new discussion on any proposed redesign.

This is extremely unintuitive to me to the point that I don't understand why it's implemented this way or why this behaviour is at all desirable.

QuPath's approach has (just about) served its purpose for a decade, but I'm not going to argue it's how it ought to have been designed. So this is a chance to do something better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Handle multiple classifications

3 participants