Split the node_id_to_def_id table into a per-owner table #138995
oli-obk wants to merge 5 commits into rust-lang:main
@bors try @rust-timer queue
[perf experiment] Split the resolver tables into per-owner tables

r? `@ghost`

just doing some experiments to see if splitting `hir_crate` is feasible by checking if splitting the resolver's output into per-owner queries is feasible (rust-lang#95004)

Basically necessary for rust-lang#138705 as that can't be landed perf-wise while the `hir_crate` query is still a thing
☀️ Try build successful - checks-actions
Finished benchmarking commit (792af13): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never

Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage)
Results (primary 1.3%, secondary 0.5%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles
Results (secondary 1.5%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size
This benchmark run did not return any relevant results for this metric.

Bootstrap: 777.548s -> 776.554s (-0.13%)
@bors try @rust-timer queue
☀️ Try build successful - checks-actions
Finished benchmarking commit (66f172c): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never

Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

Max RSS (memory usage)
Results (primary 1.6%, secondary 2.7%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Cycles
Results (secondary 2.9%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

Binary size
This benchmark run did not return any relevant results for this metric.

Bootstrap: 778.99s -> 777.791s (-0.15%)
ok... better, but not great yet either
@bors try @rust-timer queue
☀️ Try build successful - checks-actions
@bors try @rust-timer queue
Finished benchmarking commit (b4dc833): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf.

Next, please: If you can, justify the regressions found in this try perf run in writing along with @bors rollup=never

Instruction count
Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage)
Results (primary 0.3%, secondary 1.6%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles
Results (primary -4.2%, secondary -5.1%)
A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size
This perf run didn't have relevant results for this metric.

Bootstrap: 498.27s -> 499.245s (0.20%)
yay, getting there
unused-warnings is a small regression, but somewhat expected considering the high number of unused imports that all run some extra logic for obtaining information from the hash tables.

The primary regressions (serde, bitmaps, clap) are very small, largely inliner noise (local testing says so) and to a smaller part because of some extra

Overall it's a wash, so I think we should just merge it. Especially considering that some of the refactorings that enabled this being perf neutral were perf improvements themselves.

@rustbot ready
The Clippy subtree was changed

cc @rust-lang/clippy
Now they unfortunately land in their parent, which is not necessary
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
      F: Fn(T) -> UnordItems<O, U>,
  {
-     UnordItems(self.0.flat_map(f))
+     UnordItems(self.0.flat_map(move |x| f(x).0))
I don't understand the motivation behind these changes, could you explain?
The follow-up commit uses flat_map in def_id_to_node_id. To flatten NodeMap<PerOwner...>, we need an UnordItems over the owners table so that we can flat_map over each owner's inner node_id_to_def_id, which is also unordered.
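The flattening described in this reply can be sketched roughly as follows. This is a simplified stand-in, not the compiler's code: `UnordItems` here is a cut-down wrapper, `PerOwner` is a hypothetical per-owner table, `u32` stands in for both `NodeId` and `LocalDefId`, and `into_sorted` and `all_def_to_node` are helpers invented for the illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an unordered-iterator wrapper: the iteration
/// order of the wrapped iterator must not be relied upon.
pub struct UnordItems<T, I: Iterator<Item = T>>(I);

impl<T, I: Iterator<Item = T>> UnordItems<T, I> {
    pub fn new(iter: I) -> Self {
        UnordItems(iter)
    }

    /// The closure returns another `UnordItems`, so both the outer and the
    /// inner iteration stay behind the "unordered" wrapper.
    pub fn flat_map<O, U, F>(self, f: F) -> UnordItems<O, impl Iterator<Item = O>>
    where
        U: Iterator<Item = O>,
        F: Fn(T) -> UnordItems<O, U>,
    {
        UnordItems(self.0.flat_map(move |x| f(x).0))
    }

    /// Hypothetical order-independent sink: sorting removes any dependence
    /// on the underlying iteration order.
    pub fn into_sorted(self) -> Vec<T>
    where
        T: Ord,
    {
        let mut items: Vec<T> = self.0.collect();
        items.sort();
        items
    }
}

/// Hypothetical per-owner table (u32 stands in for NodeId and LocalDefId).
pub struct PerOwner {
    pub node_id_to_def_id: HashMap<u32, u32>,
}

/// Flattens every owner's node_id_to_def_id into (def_id, node_id) pairs.
pub fn all_def_to_node(owners: &HashMap<u32, PerOwner>) -> Vec<(u32, u32)> {
    UnordItems::new(owners.values())
        .flat_map(|owner| {
            UnordItems::new(owner.node_id_to_def_id.iter().map(|(&node, &def)| (def, node)))
        })
        .into_sorted()
}
```

Requiring the closure to return `UnordItems` rather than any `IntoIterator` means a caller cannot accidentally flatten into an ordered view of a hash map.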
  impl<T, I: Iterator<Item = T>> UnordItems<T, I> {
      #[inline]
      pub fn wrap(iter: I) -> Self {

Suggested change:
-     pub fn wrap(iter: I) -> Self {
+     pub fn new(iter: I) -> Self {
More idiomatic naming for the constructor.
  impl<T, I: Iterator<Item = T>> UnordItems<T, I> {
      #[inline]
      pub fn wrap(iter: I) -> Self {
          Self(iter)

Suggested change:
-         Self(iter)
+         UnordItems(iter)
Could you avoid Self in non-generic code, at least in constructors like Self { ... } or Self(...)? Makes it hard to search code.
There are several instances in this PR.
      /// The id of the owner
      pub id: ast::NodeId,
      /// The `DefId` of the owner, can't be found in `node_id_to_def_id`.
      pub def_id: LocalDefId,
Is this an optimization?
Or was putting it into node_id_to_def_id a hack in the first place?
Putting it in the hash map was a "preserve more previous behaviour" thing. Just wanted to keep the diff small. I have a commit somewhere from last year where I added the DefId here directly. In fact, it should be an OwnerId, and then we can get rid of the one from the AST lowerer.
But yes, for the purposes of this PR it's an optimization.
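A minimal sketch of what this looks like for a lookup, assuming a hypothetical `PerOwnerData` struct with the two fields from the diff plus the map, `u32` stand-ins for `ast::NodeId` and `LocalDefId`, and an invented `def_id_of` helper: the owner's own id is answered from the field, and only the other nodes go through the hash map.

```rust
use std::collections::HashMap;

// Stand-ins for `ast::NodeId` and `LocalDefId` (assumed to be plain u32 here).
type NodeId = u32;
type LocalDefId = u32;

/// Hypothetical per-owner data, modelled on the fields shown in the diff.
pub struct PerOwnerData {
    /// The id of the owner.
    pub id: NodeId,
    /// The def id of the owner; deliberately not stored in the map below.
    pub def_id: LocalDefId,
    /// Def ids of the non-owner nodes inside this owner.
    pub node_id_to_def_id: HashMap<NodeId, LocalDefId>,
}

impl PerOwnerData {
    /// Hypothetical lookup helper: the owner itself needs no hash lookup.
    pub fn def_id_of(&self, node: NodeId) -> Option<LocalDefId> {
        if node == self.id {
            return Some(self.def_id);
        }
        self.node_id_to_def_id.get(&node).copied()
    }
}
```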
(Will continue reviewing tomorrow or later today.)
My goal is to split all the resolver tables that get passed to AST lowering into per-owner tables, so that all information that AST lowering needs from the resolver is separated by owner. This should allow us to fully split AST lowering into one query invocation per owner, each of which steals the individual resolver results for its owner.
part of rust-lang/rust-project-goals#620
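The shape of the per-owner split described above could look roughly like the sketch below. All names (`OwnerTables`, `SplitResolverOutput`, `steal`, `lower_owner`) are hypothetical, `u32` stands in for the compiler's id types, and "stealing" is modelled simply as removing the entry from a map, which is an assumption made for illustration rather than the compiler's actual mechanism.

```rust
use std::collections::HashMap;

// Stand-ins for the compiler's id types (assumed to be plain u32 here).
type NodeId = u32;
type OwnerKey = u32;

/// Hypothetical bundle of everything lowering needs for one owner.
pub struct OwnerTables {
    pub node_id_to_def_id: HashMap<NodeId, u32>,
}

/// Hypothetical resolver output, already split by owner.
pub struct SplitResolverOutput {
    owners: HashMap<OwnerKey, OwnerTables>,
}

impl SplitResolverOutput {
    pub fn new(owners: HashMap<OwnerKey, OwnerTables>) -> Self {
        SplitResolverOutput { owners }
    }

    /// Hands out an owner's tables exactly once; a second call for the
    /// same owner returns `None`.
    pub fn steal(&mut self, owner: OwnerKey) -> Option<OwnerTables> {
        self.owners.remove(&owner)
    }
}

/// Hypothetical per-owner lowering step: it only sees its own tables and
/// here just reports how many nodes it would have lowered.
pub fn lower_owner(resolver: &mut SplitResolverOutput, owner: OwnerKey) -> Option<usize> {
    let tables = resolver.steal(owner)?;
    Some(tables.node_id_to_def_id.len())
}
```

Because each invocation consumes only its own owner's tables, lowering one owner does not need to read, or depend on, the resolver results of any other owner.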