Skip to content

Add relationship-aware query APIs to reduce raw SQL in app code #142

@seapagan

Description

@seapagan

Proposal: relationship-aware existence/filter/projection APIs to reduce raw SQL in app code

I want to tighten up the query builder so more real-world repository code can stay fully inside the ORM/query API instead of dropping to handwritten SQL for reverse-FK and many-to-many cases.

The immediate motivation is Cogitus, where TagRepository.list_with_usage() can now likely move to with_count(), but list_in_use() and similar queries still push user code toward raw SQL or awkward post-processing. The broader goal is to make common "has related rows", "filter through related rows", and "annotate models with related counts/existence" cases first-class.

What I want to add

1. Relationship existence filters

Add APIs along these lines:

  • has("ideas")
  • where_has("ideas", ...)
  • doesnt_have("ideas")

These should work for reverse FK and M2M relationships, and ideally for nested paths too.

Examples:

db.select(Tag).has("ideas").order("name").fetch_all()
db.select(Author).doesnt_have("books").fetch_all()
db.select(Tag).where_has("ideas", title__icontains="python").fetch_all()

This is the cleanest answer for queries like "tags currently in use".

2. Relationship traversal in normal filters beyond forward FKs

filter() should be able to traverse reverse FK and M2M paths, not just forward FK joins.

Examples:

db.select(Tag).filter(ideas__title__icontains="python").fetch_all()
db.select(Article).filter(tags__name="sqlite").fetch_all()
db.select(User).filter(groups__permissions__name="admin").fetch_all()

This would eliminate a lot of manual JOIN SQL in repositories.

3. Better projection fetch shape than fetch_dicts()

Projection mode is now powerful enough to express grouped/aggregate queries, but fetch_dicts() is still clumsy when the caller wants "model instance + computed columns".

Add a higher-level fetch mode for projection queries, something like:

  • fetch_annotated()
  • fetch_model_tuples()
  • fetch_models_with(...)

I do not want to over-design the naming yet, but the intended outcome is to support code like:

rows = (
    db.select(Tag)
    .with_count("ideas", alias="usage")
    .order("name")
    .fetch_annotated()
)

with a result shape that is nicer than "dicts then reconstruct the model manually".

4. Generalize relationship aggregates

with_count() is useful, but it feels like a special case. I want a path-aware aggregate API that can grow naturally.

Possible directions:

db.select(Tag).annotate(usage=func.count_related("ideas"))
db.select(Author).annotate(book_count=func.count_related("books"))

and later maybe related-path sums/avgs if that proves useful.

I do not necessarily want to replace with_count() immediately, but I do want the aggregate story to feel coherent rather than split between base-field aggregates and relationship counting.

5. Add relationship existence annotations

A boolean annotation API would also be useful:

  • with_exists("ideas", alias="in_use")
  • or annotate(in_use=func.exists_related("ideas"))

Examples:

db.select(Tag).with_exists("ideas", alias="in_use").fetch_dicts()

This is often a better fit than counting when the caller only needs a boolean.

Implementation notes

  • Keep all table/junction/column resolution internal to the query layer.
  • For M2M traversal, always derive junction and FK column names from public relationship metadata (sql_metadata) rather than reconstructing them ad hoc.
  • Prefer parameterized value binding everywhere it applies.
  • For relationship existence filters, EXISTS (...) may be cleaner than aggregate/grouping in many cases, but the exact generated SQL can be decided based on simplicity and reuse.
  • Nested path support would be ideal from the start if the internal path-resolution code can be shared with existing with_count() traversal.

Priority order

If I want the highest-value additions first:

  1. has() / where_has() / doesnt_have()
  2. reverse-FK and M2M path traversal in filter()
  3. a better projection fetch API than fetch_dicts()
  4. generalized relationship aggregate helpers
  5. existence annotations like with_exists()

Success criteria

When this is done, common repository queries such as:

  • rows with at least one related record
  • rows with no related records
  • rows filtered by reverse/M2M related fields
  • rows annotated with related counts/existence

should no longer require handwritten SQL in user code.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions