gh-144888: Replace bloom filter linked lists with continuous arrays to optimize executor invalidating performance#145873

Open
cocolato wants to merge 2 commits into python:main from cocolato:gh-144888

Contributor

@cocolato cocolato commented Mar 12, 2026

During JIT compilation, when function objects are destroyed or code objects are modified, every executor must be traversed so that its dependencies can be checked and the affected executors invalidated. The original implementation stored executors in singly linked lists, so each traversal chased many pointers and made poor use of the CPU cache.

This PR changes the executor storage structure from a linked list to a contiguous array, reducing pointer jumps during traversal to improve CPU cache efficiency. It also implements O(1) deletion using swap-remove, thereby accelerating dependency invalidation operations.
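To illustrate the idea, here is a minimal, self-contained sketch of a contiguous executor table with a bloom-filter scan and swap-remove. It is not the code in this PR: `Bloom`, `Executor`, `ExecutorTable`, and the `table_*` functions are hypothetical stand-ins, and the bloom filter is simplified to a fixed array of bit words.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the CPython definitions. */
#define BLOOM_WORDS 8

typedef struct { uint32_t bits[BLOOM_WORDS]; } Bloom;

typedef struct Executor {
    int32_t array_idx;   /* slot in the table, or -1 when not linked */
    int valid;
} Executor;

typedef struct {
    Bloom *blooms;       /* contiguous, scanned linearly */
    Executor **ptrs;     /* ptrs[i] is the executor that owns blooms[i] */
    int32_t count;
    int32_t capacity;
} ExecutorTable;

/* Returns 1 if every bit of `needle` is set in `hay`.
   False positives are possible, false negatives are not. */
static int
bloom_may_contain(const Bloom *hay, const Bloom *needle)
{
    for (int i = 0; i < BLOOM_WORDS; i++) {
        if ((hay->bits[i] & needle->bits[i]) != needle->bits[i]) {
            return 0;
        }
    }
    return 1;
}

/* Appends an executor and its dependency filter; returns 0 on success. */
static int
table_add(ExecutorTable *t, Executor *e, const Bloom *deps)
{
    if (t->count == t->capacity) {
        int32_t cap = t->capacity ? t->capacity * 2 : 4;
        Bloom *b = realloc(t->blooms, (size_t)cap * sizeof(Bloom));
        if (b == NULL) {
            return -1;
        }
        t->blooms = b;
        Executor **p = realloc(t->ptrs, (size_t)cap * sizeof(Executor *));
        if (p == NULL) {
            return -1;
        }
        t->ptrs = p;
        t->capacity = cap;
    }
    t->blooms[t->count] = *deps;
    t->ptrs[t->count] = e;
    e->array_idx = t->count++;
    return 0;
}

/* O(1) removal: move the last entry into the freed slot. */
static void
table_remove(ExecutorTable *t, Executor *e)
{
    int32_t i = e->array_idx;
    int32_t last = t->count - 1;
    assert(i >= 0 && i <= last && t->ptrs[i] == e);
    if (i != last) {
        t->blooms[i] = t->blooms[last];
        t->ptrs[i] = t->ptrs[last];
        t->ptrs[i]->array_idx = i;   /* keep the moved entry's back-index in sync */
    }
    t->count = last;
    e->array_idx = -1;
}

/* Invalidates every executor whose filter may contain `obj`.
   Scanning backwards means the entry swapped into slot i has
   already been checked, so nothing is skipped or visited twice. */
static int
table_invalidate(ExecutorTable *t, const Bloom *obj)
{
    int invalidated = 0;
    for (int32_t i = t->count - 1; i >= 0; i--) {
        if (bloom_may_contain(&t->blooms[i], obj)) {
            Executor *e = t->ptrs[i];
            e->valid = 0;
            table_remove(t, e);
            invalidated++;
        }
    }
    return invalidated;
}
```

Keeping the filters in their own array means the scan touches only filter data until a match is found, and the back-index stored in each executor is what makes removal constant time.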

Member

@markshannon markshannon left a comment

Thanks for doing this.
I've only had time to do a quick scan, but this looks like it should speed up the scan considerably.

_PyBloomFilter bloom;
_PyExecutorLinkListNode links;
int32_t bloom_array_idx; // Index in interp->executor_blooms/executor_ptrs.
_PyExecutorLinkListNode links; // Used by deletion list.
Member

Is this necessary now? We can traverse all executors using the executor_ptrs array.

Contributor Author

I think we still need it for the pending deletion list:

cpython/Python/optimizer.c, lines 332 to 338 in 08a018e

static void
uop_dealloc(PyObject *op) {
    _PyExecutorObject *self = _PyExecutorObject_CAST(op);
    executor_invalidate(op);
    assert(self->vm_data.code == NULL);
    add_to_pending_deletion_list(self);
}
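
For readers unfamiliar with the pattern, the following is a rough sketch of why an intrusive link can still be useful once the executor has left the array. It assumes that freeing is deferred until some later safe point and that an invalidated executor no longer has an array slot. `Node`, `Interp`, `add_to_pending`, and `flush_pending` are hypothetical names, not the CPython ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an executor with an intrusive link. */
typedef struct Node {
    struct Node *next;   /* only meaningful while queued for deletion */
} Node;

/* Hypothetical stand-in for per-interpreter state. */
typedef struct {
    Node *pending_head;
} Interp;

/* Queues an object for later freeing without allocating memory,
   which matters because this runs inside a dealloc path. */
static void
add_to_pending(Interp *interp, Node *n)
{
    n->next = interp->pending_head;
    interp->pending_head = n;
}

/* Frees everything queued so far; returns how many were freed. */
static int
flush_pending(Interp *interp)
{
    int freed = 0;
    Node *n = interp->pending_head;
    interp->pending_head = NULL;
    while (n != NULL) {
        Node *next = n->next;   /* read the link before freeing the node */
        free(n);
        freed++;
        n = next;
    }
    return freed;
}
```

Because the link lives inside the object itself, queueing cannot fail and needs no extra storage, which a separate array of pending pointers would not guarantee.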

@cocolato
Contributor Author

@Fidget-Spinner gentle ping: if you have time, please take a look at this. Thanks!
