Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 23, 2025

  • Create test file test2408.js with comprehensive window aggregate tests
  • Modify src/50expression.js to skip adding window aggregates to selectGroup
  • Modify src/40select.js to initialize query.windowaggrs array
  • Modify src/40select.js to add window aggregate post-processing logic
  • Modify src/424select.js to detect and track window aggregates
  • Handle COUNT(*) vs COUNT(column) for window functions
  • Handle NULL values correctly in aggregates
  • All 8 test cases passing
  • Existing window function tests still passing (test1126, test198, test229)
  • Refactor for elegance: Reduced code by ~49% (from 116 to 59 added lines)
    • Consolidated aggregate calculations using switch statement
    • Eliminated repetitive loops by collecting values once
    • Used built-in Math.max/min and Array.reduce for cleaner logic
    • Simplified partition key generation with ternary operator
    • Unified column definition logic for window/regular aggregates
  • Performance optimization: Replaced multiple string comparisons with regex test
    • Created re_aggrWithExpression regex outside hot code path
    • Single regex test instead of 9 equality comparisons
    • Improves performance in aggregate detection
Original prompt

This section details on the original issue you should resolve

<issue_title>Window Aggregate Functions (COUNT/MAX/MIN) Not Working with PARTITION BY</issue_title>
<issue_description>## Summary
Window aggregate functions like COUNT(), MAX(), and MIN() with OVER (PARTITION BY ...) clauses are parsed successfully but do not implement proper window function semantics. Instead of returning one row per input row with the windowed aggregate value, they collapse the result set to a single row with the overall aggregate.

Current Behavior

// Given test data:
var data = [
  {category: 'A', amount: 10},
  {category: 'A', amount: 20},
  {category: 'B', amount: 30},
  {category: 'B', amount: 40}
];

// Query with window function:
alasql('SELECT category, amount, COUNT(*) OVER (PARTITION BY category) AS cnt FROM ?', [data]);

// Current (INCORRECT) output:
[{category: undefined, amount: undefined, cnt: 4}]
// Returns single aggregated row, losing all detail

Expected Behavior

// Expected (CORRECT) output:
[
  {category: 'A', amount: 10, cnt: 2},  // Count of 'A' rows
  {category: 'A', amount: 20, cnt: 2},  // Count of 'A' rows
  {category: 'B', amount: 30, cnt: 2},  // Count of 'B' rows
  {category: 'B', amount: 40, cnt: 2}   // Count of 'B' rows
]
// Should return one row per input row with window aggregate

Technical Details

Working Reference

  • ROW_NUMBER() OVER (PARTITION BY ...) works correctly (implemented in src/424select.js lines 514-529)
  • Parser correctly recognizes OVER clause structure (src/47over.js)

Problem Location

The issue appears to be in how aggregate functions interact with the OVER clause:

  • Parser: src/alasqlparser.jison - correctly parses OVER clause
  • Compilation: src/424select.js - needs window function execution logic
  • Aggregates: src/55functions.js - aggregate functions don't check for OVER clause

Root Cause

When an aggregate function (COUNT/MAX/MIN/SUM/AVG) has an OVER clause, it should:

  1. Process all rows without grouping
  2. Partition rows according to PARTITION BY columns
  3. Calculate aggregate within each partition
  4. Assign the partition's aggregate value to each row in that partition

Currently, the aggregate behavior treats it as a regular GROUP BY aggregate, collapsing rows.

Affected Functions

  • COUNT(*) OVER (PARTITION BY ...)
  • MAX(column) OVER (PARTITION BY ...)
  • MIN(column) OVER (PARTITION BY ...)
  • SUM(column) OVER (PARTITION BY ...) (likely)
  • AVG(column) OVER (PARTITION BY ...) (likely)

Test Case

describe('Window Aggregate Functions', function() {
  it('COUNT() OVER (PARTITION BY) should return per-row values', function(done) {
    var data = [
      {dept: 'Sales', emp: 'John', salary: 1000},
      {dept: 'Sales', emp: 'Jane', salary: 1200},
      {dept: 'IT', emp: 'Bob', salary: 1500},
      {dept: 'IT', emp: 'Alice', salary: 1600}
    ];
    
    var res = alasql(
      'SELECT dept, emp, COUNT(*) OVER (PARTITION BY dept) AS dept_count FROM ?',
      [data]
    );
    
    assert.deepEqual(res, [
      {dept: 'Sales', emp: 'John', dept_count: 2},
      {dept: 'Sales', emp: 'Jane', dept_count: 2},
      {dept: 'IT', emp: 'Bob', dept_count: 2},
      {dept: 'IT', emp: 'Alice', dept_count: 2}
    ]);
    done();
  });
  
  it('MAX() OVER (PARTITION BY) should return per-row values', function(done) {
    var res = alasql(
      'SELECT dept, emp, salary, MAX(salary) OVER (PARTITION BY dept) AS max_dept_salary FROM ?',
      [data]
    );
    
    assert.deepEqual(res, [
      {dept: 'Sales', emp: 'John', salary: 1000, max_dept_salary: 1200},
      {dept: 'Sales', emp: 'Jane', salary: 1200, max_dept_salary: 1200},
      {dept: 'IT', emp: 'Bob', salary: 1500, max_dept_salary: 1600},
      {dept: 'IT', emp: 'Alice', salary: 1600, max_dept_salary: 1600}
    ]);
    done();
  });
});

Implementation Approach

  1. Detect when an aggregate function has an OVER clause during compilation
  2. Mark the SELECT as requiring window function processing
  3. After main query execution, apply window function logic:
    • Partition result rows by PARTITION BY columns
    • Calculate aggregate for each partition
    • Add aggregate value to each row in partition
  4. Handle ORDER BY within OVER clause for frame-aware aggregates

References

Priority

High - This is a core SQL feature that users expect to work when the syntax is accepted.

Related Issues

  • Issue #[original] - Request for window aggregate functions
  • ROW_NUMBER() OVER (PARTITION BY ...) working correctly (can serve as template)</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

@mathiasrw
Copy link
Member

Continue

@mathiasrw
Copy link
Member

@copilot Continue...

…TITION BY

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jan 2, 2026

@copilot Continue...

I've implemented window aggregate functions with PARTITION BY support. The implementation now allows COUNT(), MAX(), MIN(), SUM(), and AVG() to work correctly with OVER (PARTITION BY ...) clauses.

Changes made (commit e1e87f4):

  • Modified the parser logic to skip adding window aggregates to the GROUP BY selectGroup
  • Added post-processing logic to calculate partition-specific aggregates
  • Created comprehensive tests in test2362.js - all 8 tests passing
  • Verified existing window function tests still pass

Key features:

  • Properly returns one row per input row with the windowed aggregate value
  • Supports single and multi-column PARTITION BY
  • Handles COUNT(*) vs COUNT(column) correctly
  • Properly handles NULL values in aggregates
  • Works with multiple window aggregates in the same query

Copilot AI changed the title [WIP] Fix window aggregate functions not implementing semantics Fix window aggregate functions not implementing semantics Jan 2, 2026
Copy link
Member

@mathiasrw mathiasrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a lot of code for such a simple issue.

Please review and see how we can make the code more elegant.

Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jan 2, 2026

This seems like a lot of code for such a simple issue.

Please review and see how we can make the code more elegant.

Refactored for elegance (commit f2d12d7). The code is now ~49% more concise:

Key improvements:

  • Consolidated aggregate calculations using switch statement instead of separate if-else blocks
  • Eliminated repetitive loops by collecting values once, then using built-in methods (Math.max/min, Array.reduce)
  • Simplified partition key generation with ternary operators
  • Unified column definition logic for both window and regular aggregates

Result: Reduced from 116 added lines to 59 added lines while maintaining full functionality and test coverage.

Copilot AI requested a review from mathiasrw January 2, 2026 10:05
Co-authored-by: mathiasrw <1063454+mathiasrw@users.noreply.github.com>
Copilot AI requested a review from mathiasrw January 2, 2026 12:07
@mathiasrw mathiasrw marked this pull request as ready for review January 2, 2026 15:42
@mathiasrw mathiasrw merged commit 5af89fb into develop Jan 2, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Window Aggregate Functions (COUNT/MAX/MIN) Not Working with PARTITION BY

2 participants