Commit cf0f23a

format
1 parent e4f1b15 commit cf0f23a

1 file changed

Lines changed: 10 additions & 5 deletions

src/blog/tanstack-router-route-matching-tree-rewrite.md

@@ -7,7 +7,7 @@ authors:
77

88
![Big performance number](/blog-assets/tanstack-router-route-matching-tree-rewrite/header.png)
99

10-
We achieved a 20,000× performance improvement in route matching in TanStack Router. Let's be honest, this is *definitely* cherry-picked, but the number is real and comes from a real production application. More importantly, it shows that matching a pathname to a route is no longer bottlenecked by the number of routes in your application.
10+
We achieved a 20,000× performance improvement in route matching in TanStack Router. Let's be honest, this is _definitely_ cherry-picked, but the number is real and comes from a real production application. More importantly, it shows that matching a pathname to a route is no longer bottlenecked by the number of routes in your application.
1111

1212
## The Real Problem: correctness, not speed
1313

@@ -24,20 +24,24 @@ We now parse the route tree into a segment trie, and matching is done by travers
2424
A trie ([wikipedia](https://en.wikipedia.org/wiki/Trie)) is a tree structure where each node corresponds to the common string prefix shared by all of the node's children. The concept maps very well to a representation of the routes in an app, where each node is a URL pathname segment.
2525

2626
Given a single route `/users/$id`, our segment trie would look like this:
27+
2728
```
2829
root
2930
└── users
3031
└── $id => match /users/$id
3132
```
3233

3334
We can add more routes to get a complete picture:
35+
3436
```
3537
/users/$id
3638
/users/$id/posts
3739
/users/profile
3840
/posts/$slug
3941
```
42+
4043
This yields the following tree:
44+
4145
```
4246
root
4347
├── users
@@ -49,6 +53,7 @@ root
4953
```
5054

5155
To match `/users/123`, we:
56+
5257
1. Start at root, look for "users" → found
5358
2. Move to users node, look for "123" → matches $id pattern
5459
3. Check if this node has a route → yes, return `/users/$id`
@@ -60,11 +65,11 @@ The reason we can get such a massive performance boost is because we've changed
6065
- Old approach: `O(N)` where `N` is the number of routes in the tree.
6166
- New approach: `O(M)` where `M` is the number of segments in the pathname.
6267

63-
(This is very simplified, it's probably more something like `O(N * M)` vs. `O(M * log(N))`, but the point stands: we're scaling differently now.)
68+
(This is simplified, in practice it's more like `O(N * M)` vs. `O(M * log(N))` in the average case, but the point is that we've changed which variable dominates the complexity.)
6469

6570
Using this new tree structure, each check eliminates a large number of possible routes, allowing us to quickly zero in on the correct match.
6671

67-
For example, imagine we have a route tree with 450 routes (fairly large app) and the tree can only eliminate 50% of routes at each segment check (this is unusually low, it's often much higher). With this bad setup, we have found a match in 9 checks (`2**9 > 450`). By contrast, the old approach *could* have found the match on the first check, but in the worst case it would have had to check all 450 routes, which yields an average of 225 checks. Even in this simplified case, we are looking at a 25× performance improvement.
72+
For example, imagine we have a route tree with 450 routes (fairly large app) and the tree can only eliminate 50% of routes at each segment check (this is unusually low, it's often much higher). With this bad setup, we have found a match in 9 checks (`2**9 > 450`). By contrast, the old approach _could_ have found the match on the first check, but in the worst case it would have had to check all 450 routes, which yields an average of 225 checks. Even in this simplified case, we are looking at a 25× performance improvement.
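The arithmetic in this example can be reproduced with a few lines of TypeScript. This is only a back-of-the-envelope sketch: the helper names `trieChecks` and `averageLinearChecks` are made up for illustration, and the linear-scan average uses the same `N / 2` approximation as the text.

```ts
// Assumed helper names; only the inputs (450 routes, 50% eliminated
// per check) come from the text.

/** Checks needed until one candidate remains, if each check keeps `keptFraction` of the routes. */
function trieChecks(routeCount: number, keptFraction: number): number {
  return Math.ceil(Math.log(routeCount) / Math.log(1 / keptFraction))
}

/** Average checks for a linear scan, using the text's approximation of N / 2. */
function averageLinearChecks(routeCount: number): number {
  return routeCount / 2
}

const routes = 450
const treeChecks = trieChecks(routes, 0.5) // 9, because 2 ** 9 = 512 > 450
const linearChecks = averageLinearChecks(routes) // 225
const speedup = linearChecks / treeChecks // 25
```

If the tree eliminates more than 50% of the candidates per check, `trieChecks` shrinks further while the linear average stays the same.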
6873

6974
This is what makes tree structures so powerful.
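To make the idea concrete, here is a minimal sketch of a segment trie built from the four example routes above. It is not TanStack Router's actual implementation: the `TrieNode` shape and the `insertRoute` and `matchPathname` helpers are hypothetical, and it only handles static and `$param` segments, trying static children before dynamic ones.

```ts
// Illustrative sketch only; all names here are assumptions.
interface TrieNode {
  staticChildren: Map<string, TrieNode>
  dynamicChild?: TrieNode // a `$param` segment
  route?: string // set when a route ends at this node
}

function createNode(): TrieNode {
  return { staticChildren: new Map() }
}

function insertRoute(root: TrieNode, route: string): void {
  let node = root
  for (const segment of route.split('/').filter(Boolean)) {
    if (segment.startsWith('$')) {
      if (!node.dynamicChild) node.dynamicChild = createNode()
      node = node.dynamicChild
    } else {
      let child = node.staticChildren.get(segment)
      if (!child) {
        child = createNode()
        node.staticChildren.set(segment, child)
      }
      node = child
    }
  }
  node.route = route
}

function matchSegments(
  node: TrieNode,
  segments: string[],
  index: number,
): string | undefined {
  const segment = segments[index]
  if (segment === undefined) return node.route // consumed every segment

  // Static segments take priority over dynamic ones.
  const staticChild = node.staticChildren.get(segment)
  if (staticChild) {
    const found = matchSegments(staticChild, segments, index + 1)
    if (found !== undefined) return found
  }
  // Fall back to the dynamic child if the static branch did not match.
  if (node.dynamicChild) {
    return matchSegments(node.dynamicChild, segments, index + 1)
  }
  return undefined
}

function matchPathname(root: TrieNode, pathname: string): string | undefined {
  return matchSegments(root, pathname.split('/').filter(Boolean), 0)
}

const root = createNode()
for (const route of ['/users/$id', '/users/$id/posts', '/users/profile', '/posts/$slug']) {
  insertRoute(root, route)
}

matchPathname(root, '/users/123') // '/users/$id'
matchPathname(root, '/users/profile') // '/users/profile'
```

The work done per lookup depends on the number of segments in the pathname and the branches tried at each level, not on the total number of routes registered.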
7075

@@ -85,7 +90,7 @@ We use a stack to manage our traversal of the tree, because the presence of dyna
8590

8691
The ideal algorithm would be depth-first search (DFS) in order of highest priority, so that we can return as soon as we find a match. In practice, we have very few possibilities of early exit; but a fully static path should still be able to return immediately.
8792

88-
To accomplish this, we use an array as the stack. We know that `.push()` and `.pop()` at the end of an array are O(1) operations, while `.shift()` and `.unshift()` from the start are O(N), and we want to avoid the latter entirely. At each segment, we iterate candidates in *reverse* order of priority, pushing them onto the stack. This way, when we pop from the stack, we get the highest priority candidates first.
93+
To accomplish this, we use an array as the stack. We know that `.push()` and `.pop()` at the end of an array are O(1) operations, while `.shift()` and `.unshift()` from the start are O(N), and we want to avoid the latter entirely. At each segment, we iterate candidates in _reverse_ order of priority, pushing them onto the stack. This way, when we pop from the stack, we get the highest priority candidates first.
8994

9095
```ts
9196
const stack = [
@@ -144,7 +149,7 @@ The downside is that this limits us to 32 segments, because in JavaScript bitwis
144149

145150
### Reusing Typed Arrays for Segment Parsing
146151

147-
When building the segment trie, we need to parse each route (e.g., `/users/$userId/{-$maybe}`) into its constituent segments (e.g. `static:user`, `dynamic:userId`, `optional:maybe`). Doing this is basically running the same parsing algorithms hundreds of times, every time extracting the same structured data (i.e. segment type, value, prefix, suffix, where the next segment starts, etc).
152+
When building the segment trie, we need to parse each route (e.g., `/users/$userId/{-$maybe}`) into its constituent segments (e.g. `static:users`, `dynamic:userId`, `optional:maybe`). Doing this is basically running the same parsing algorithms hundreds of times, every time extracting the same structured data (i.e. segment type, value, prefix, suffix, where the next segment starts, etc).
148153

149154
Instead of re-creating a new object every time, we can reuse the same object across all parsing operations to avoid allocations in the hot path.
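As a rough illustration of the idea, the sketch below writes each parsed segment into one shared `Uint16Array` instead of returning a fresh object. The buffer layout, the constants, and the `parseSegment` and `describeSegments` helpers are all assumptions made for this example, not the library's real code. It also ignores prefixes and suffixes, and a `Uint16Array` caps offsets at 65535.

```ts
// Hypothetical layout, for illustration only:
// [type, valueStart, valueEnd, nextSegmentStart]
const SEGMENT_STATIC = 0
const SEGMENT_DYNAMIC = 1
const SEGMENT_OPTIONAL = 2

// Allocated once and overwritten by every call to parseSegment.
const sharedOutput = new Uint16Array(4)

function parseSegment(
  path: string,
  start: number,
  out: Uint16Array = sharedOutput,
): Uint16Array {
  const slash = path.indexOf('/', start)
  const end = slash === -1 ? path.length : slash

  if (path.startsWith('{-$', start) && path.charCodeAt(end - 1) === 125 /* '}' */) {
    out[0] = SEGMENT_OPTIONAL
    out[1] = start + 3 // skip `{-$`
    out[2] = end - 1 // drop the closing `}`
  } else if (path.charCodeAt(start) === 36 /* '$' */) {
    out[0] = SEGMENT_DYNAMIC
    out[1] = start + 1 // skip `$`
    out[2] = end
  } else {
    out[0] = SEGMENT_STATIC
    out[1] = start
    out[2] = end
  }
  out[3] = end === path.length ? end : end + 1 // skip the `/`
  return out
}

// Demo helper: it allocates strings, which a real hot path would avoid.
function describeSegments(path: string): string[] {
  const names = ['static', 'dynamic', 'optional']
  const result: string[] = []
  let cursor = path.startsWith('/') ? 1 : 0
  while (cursor < path.length) {
    const seg = parseSegment(path, cursor)
    result.push(`${names[seg[0]]}:${path.slice(seg[1], seg[2])}`)
    cursor = seg[3]
  }
  return result
}

describeSegments('/users/$userId/{-$maybe}')
// ['static:users', 'dynamic:userId', 'optional:maybe']
```

Because every call returns the same buffer, callers have to read the values they need before parsing the next segment.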
150155
