…ement counting logic with unit tests
Thanks for the pull request, @tbain! This repository is currently maintained by Once you've gone through the following steps, feel free to tag them in a comment and let them know that your changes are ready for engineering review.
🔘 Get product approval
If you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide context
To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green build
If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.
Details
Where can I find more information?
If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:
When can I expect my changes to be merged?
Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR.
jesperhodge
left a comment
There seem to be changes missing. For example, src/taxonomy/data/api.ts.
Could you:
- review this PR and make sure that all necessary changes are in this branch? Compare to the open Unicon PR.
- review discussions in the Unicon PR and either resolve them or copy them here to be addressed here.
- fix any pipeline errors.
Since we're no longer using recursive SQL for this, is it possible to update the PR description for accuracy?
…bain/253_add_tags_count_rebased # Conflicts: # src/openedx_tagging/models/base.py # tests/openedx_tagging/test_api.py
Pull request overview
Adds rolled-up, de-duplicated tag usage counts (including ancestor rollups) to the tag listing query so the Taxonomies UI can display accurate “Usage Count” values per tag.
Changes:
- Replaced the prior per-tag direct usage counting subquery with a dynamic, depth-aware subquery that rolls counts up to ancestors with per-object de-duplication.
- Updated existing API/model tests to reflect rolled-up counts and added a broader set of usage-count test cases.
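To make the rollup and de-duplication rule concrete, here is a minimal pure-Python sketch. It is not the Django subquery from this PR: the tag names, the `PARENT` and `APPLIED` data, and the `rolled_up_usage_counts` helper are all hypothetical and only illustrate the counting rule, assuming each tag has at most one parent.

```python
from collections import defaultdict

# Hypothetical taxonomy: child tag -> parent tag (None marks a root tag).
PARENT = {
    "Science": None,
    "Biology": "Science",
    "Chemistry": "Science",
    "Genetics": "Biology",
}

# Hypothetical (object_id, tag) pairs, one per applied tag.
APPLIED = [
    ("course-1", "Biology"),
    ("course-1", "Chemistry"),  # sibling of Biology on the same object
    ("course-2", "Genetics"),
]


def rolled_up_usage_counts(parent, applied):
    """Count distinct objects per tag, including objects tagged with any descendant."""
    objects_by_tag = defaultdict(set)
    for object_id, tag in applied:
        node = tag
        while node is not None:  # walk from the applied tag up to its root
            objects_by_tag[node].add(object_id)  # the set de-duplicates per object
            node = parent[node]
    return {tag: len(objects_by_tag[tag]) for tag in parent}


counts = rolled_up_usage_counts(PARENT, APPLIED)
print(counts)
```

In this sketch `Science` ends up with a count of 2 rather than 3, because `course-1` is counted once even though two sibling tags under `Science` are applied to it.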
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| src/openedx_tagging/models/base.py | Centralizes and updates include_counts behavior by annotating tag querysets with rolled-up, de-duplicated usage_count via a subquery. |
| tests/openedx_tagging/test_models.py | Updates expected usage counts and adds multiple new test scenarios validating ancestor rollup and sibling de-duplication. |
| tests/openedx_tagging/test_api.py | Updates autocomplete/search test expectations to reflect rolled-up usage counts returned by the API when include_counts=True. |
    def add_counts_query(self, qs: models.QuerySet):
        """
        Adds a subquery to the passed-in queryset that returns the usage_count
        for a given tag, or the appropriate count with de-deuplication per Object
Typo in docstring: “de-deuplication” should be “deduplication”.
    - for a given tag, or the appropriate count with de-deuplication per Object
    + for a given tag, or the appropriate count with deduplication per Object
    for a given tag, or the appropriate count with de-deuplication per Object
    for the parents of a used child tag
    :param qs: The QuerySet to annotate with usage counts.
    :return: the queryset annotated with the usage counts
This docstring uses Sphinx-style ":param"/":return" fields, but other docstrings in this module don’t. For consistency (and to avoid mixed docstring formats), please rewrite this docstring to match the prevailing style used elsewhere in this file.
    - for a given tag, or the appropriate count with de-deuplication per Object
    - for the parents of a used child tag
    - :param qs: The QuerySet to annotate with usage counts.
    - :return: the queryset annotated with the usage counts
    + for a given tag, or the appropriate count with de-duplication per object
    + for the parents of a used child tag.
    + The ``qs`` argument is the QuerySet to annotate with usage counts, and
    + the returned queryset is annotated with those usage counts.
    # build a list of lineage paths to be used in the query, so we're not hard coding to
    # a certain number of levels. This will build an array containing something like:
    # ['tag_id', 'tag__parent_id', 'tag__parent__parent_id', 'tag__parent__parent__parent_id', ...]
    lineage_paths = [f"tag{'__parent' * i}_id" for i in range(TAXONOMY_MAX_DEPTH+1)]
PEP8/style consistency: add spaces around the "+" in range(TAXONOMY_MAX_DEPTH+1) (elsewhere in this file it’s written as TAXONOMY_MAX_DEPTH + 1).
    - lineage_paths = [f"tag{'__parent' * i}_id" for i in range(TAXONOMY_MAX_DEPTH+1)]
    + lineage_paths = [f"tag{'__parent' * i}_id" for i in range(TAXONOMY_MAX_DEPTH + 1)]
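For readers who want to see what this comprehension produces, here is a small standalone sketch. The value of `TAXONOMY_MAX_DEPTH` below is an assumption chosen for illustration; the real constant is defined in the project.

```python
# Assumed value for illustration only; the real constant lives in the project.
TAXONOMY_MAX_DEPTH = 3

# Repeating "__parent" i times yields one lookup path per ancestor level,
# starting with the tag itself (i == 0).
lineage_paths = [f"tag{'__parent' * i}_id" for i in range(TAXONOMY_MAX_DEPTH + 1)]

for path in lineage_paths:
    print(path)
```

With the assumed depth of 3 this gives four paths, which matches the shape of the list shown in the code comment.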
    # change in TAXONOMY_MAX_DEPTH, now it is dynamic to reduce maintenace
    # (Thanks Google for helping me build this)
The parenthetical “Thanks Google for helping me build this” doesn’t add technical context and is likely to become noise in long-lived code. Please remove it or replace it with a brief explanation of the actual rationale/constraints (e.g., why the Q needs to be built dynamically).
    - # change in TAXONOMY_MAX_DEPTH, now it is dynamic to reduce maintenace
    - # (Thanks Google for helping me build this)
    + # change in TAXONOMY_MAX_DEPTH; now it is built dynamically to reduce
    + # maintenance overhead and keep the query aligned with the configured depth.
    def test_usage_count_multiple_objects_same_tag_deep(self) -> None:
        """
        AI/Claude4.6 generated via IntelliJ IDEA AI Assistant
        When two distinct objects (e.g. seperate courses, modules, etc.) are tagged
Typo in docstring: “seperate” should be “separate”.
    - When two distinct objects (e.g. seperate courses, modules, etc.) are tagged
    + When two distinct objects (e.g. separate courses, modules, etc.) are tagged
    # ...
    # ```
    # Previously the above was hard coded and needed to be changed with every
    # change in TAXONOMY_MAX_DEPTH, now it is dynamic to reduce maintenace
Typo in comment: “maintenace” should be “maintenance”.
    - # change in TAXONOMY_MAX_DEPTH, now it is dynamic to reduce maintenace
    + # change in TAXONOMY_MAX_DEPTH, now it is dynamic to reduce maintenance
            qs = self.add_counts_query(qs)

            return qs  # type: ignore[return-value]

        def add_counts_query(self, qs: models.QuerySet):
add_counts_query is only used internally by get_filtered_tags paths. Making it a public Taxonomy method expands the model’s API surface unnecessarily; consider renaming to _add_counts_query (or similar) and adding an explicit return type (e.g., TagDataQuerySet/models.QuerySet) to clarify intended usage.
    - qs = self.add_counts_query(qs)
    - return qs  # type: ignore[return-value]
    - def add_counts_query(self, qs: models.QuerySet):
    + qs = self._add_counts_query(qs)
    + return qs  # type: ignore[return-value]
    + def _add_counts_query(self, qs: TagDataQuerySet) -> TagDataQuerySet:
    """
    Test that the usage count in the results is right
    Test that the usage count in the results is right for a basic case;
    many objects tagged seperately should return a simple usage count that
Typo in test docstring: “seperately” should be “separately”.
    - many objects tagged seperately should return a simple usage count that
    + many objects tagged separately should return a simple usage count that
    """
    AI/Claude4.6 generated via IntelliJ IDEA AI Assistant
    When a child tag (depth 3) is applied to an object, it should
    roll up the count to all its ancestors when using _get_filtered_tags_deep.
    The child tag and each of its ancestors should have usage_count=1.
    """
Several new test docstrings include tool-attribution text (e.g., “AI/Claude4.6 generated via IntelliJ IDEA AI Assistant”). This doesn’t document test behavior and is inconsistent with typical test docstrings; please remove the attribution lines and keep the docstrings focused on the scenario/assertions.
    # build a list of lineage paths to be used in the query, so we're not hard coding to
    # a certain number of levels. This will build an array containing something like:
    # ['tag_id', 'tag__parent_id', 'tag__parent__parent_id', 'tag__parent__parent__parent_id', ...]
    lineage_paths = [f"tag{'__parent' * i}_id" for i in range(TAXONOMY_MAX_DEPTH+1)]
Instead of using TAXONOMY_MAX_DEPTH for this query, what about using the actual max depth of the current taxonomy? e.g. max_depth = qs.aggregate(models.Max("depth", default=0))["depth__max"] ?
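As a rough illustration of this suggestion, the sketch below derives the lineage paths from the deepest tag actually present instead of a fixed constant. It uses plain Python data in place of a queryset; the `tags` list, the `lineage_paths_for` helper, and the assumption that `depth` is zero-based (root tags have depth 0) are all hypothetical.

```python
# Hypothetical tags with an assumed zero-based depth (root tags have depth 0).
tags = [
    {"value": "Science", "depth": 0},
    {"value": "Biology", "depth": 1},
]


def lineage_paths_for(max_depth):
    """Build one lookup path per level, from the tag itself up to max_depth ancestors."""
    return [f"tag{'__parent' * i}_id" for i in range(max_depth + 1)]


# Plain-Python stand-in for an aggregate over the queryset; default=0 covers
# the case where the taxonomy has no tags at all.
max_depth = max((tag["depth"] for tag in tags), default=0)
paths = lineage_paths_for(max_depth)
print(paths)
```

Under these assumptions a two-level taxonomy needs only two paths, so a shallow taxonomy would follow fewer parent lookups than it would with the global maximum.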
Description
This implements openedx/modular-learning#253, the task to add tag usage counts to the tags table under the taxonomies table. The frontend piece, where the results of this aggregation are displayed, is part of a separate PR to openedx/frontend-app-authoring. This change adds a subquery annotation to the Django query that retrieves tags. The original implementation counted only the raw usage of each tag, rather than the aggregate usage of a tag and its child tags with sibling de-duplication (e.g. when two sibling nodes are applied to the same course, module, etc., that still counts as 1 for any parent or grandparent node), as specified in the AC for the issue above. It was therefore replaced with this more complicated subquery, which sums tag usage across the courses, sections, modules, and libraries that might use a tag.
Supporting information
GitHub issue with AC: openedx/modular-learning#253
Testing instructions
Refer to the AC in the GitHub issue. Steps to verify this is implemented and working via UX (note: this depends on the frontend part of this ticket):
Other information
Include anything else that will help reviewers and consumers understand the change.