Description
I propose using memchr (and memrchr) to speed up str.split (and str.rsplit).
Currently, str.split (and str.rsplit) use a basic loop to iterate through the string to find delimiters.
Using memchr and memrchr can provide a significant performance boost as they are typically heavily optimised / implemented using SIMD instructions.
Note that no complicated change is needed to make this happen, as both functions exist within STRINGLIB (find_char and rfind_char).
Interestingly, a comment in the code today states that using memchr does not provide any meaningful improvement, but the benchmarks seem to disagree. I suspect this is because the code was written ~16 years ago (f2c5484), and memchr implementations (and, potentially, hardware) have evolved and started to make a difference since then.
Proposed change
I have a branch in my fork that implements this change, and I can open a PR if we want to go forward with it.
In short, code-wise, the existing for loop that iterates over characters would be replaced with a call to STRINGLIB(find_char).
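To make the difference between the two strategies concrete, here is a rough Python model of them. This is only a sketch: the real code is C inside stringlib, the function names below are made up for illustration, and `str.find` merely stands in for `STRINGLIB(find_char)`.

```python
def split_char_loop(s: str, sep: str) -> list:
    """Model of the current approach: look at every character in turn."""
    parts = []
    start = 0
    for i, ch in enumerate(s):
        if ch == sep:
            parts.append(s[start:i])
            start = i + 1
    parts.append(s[start:])
    return parts


def split_char_find(s: str, sep: str) -> list:
    """Model of the proposal: jump directly to the next delimiter."""
    parts = []
    start = 0
    while True:
        # Stand-in for STRINGLIB(find_char); returns -1 when nothing is left.
        i = s.find(sep, start)
        if i < 0:
            break
        parts.append(s[start:i])
        start = i + 1
    parts.append(s[start:])
    return parts


sample = "alpha beta  gamma"
print(split_char_loop(sample, " ") == split_char_find(sample, " ") == sample.split(" "))
```

Both helpers return the same list as `str.split` with an explicit single-character separator; the only thing that changes is how the next delimiter is located.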
Benchmarks
I made this pyperf benchmarking script which should show whether the proposed change helps. (I don't think there's a pyperformance benchmark for that.)
import pyperf

SIZES = [100, 1_000, 10_000, 100_000, 1_000_000]
SEGMENT_LENGTHS = [2, 10, 50, 250, 500, 1000, 10_000, 25_000, 100_000, 250_000, 500_000, 999_999]

CHAR_SETS = {
    "ascii": ("a", " "),
    "latin": ("é", " "),
    # we have two CJK cases because find_char has a different implementation for
    # (sep & 0xff == 0) and (sep & 0xff != 0)
    "cjk": ("\u4e16", "\u3000"),  # CJK ideograph + ideographic space (sep & 0xff == 0)
    "cjk_nz": ("\u4e16", "\u3001"),  # CJK ideograph + ideographic comma (sep & 0xff != 0)
    "emoji": ("\U0001f600", " "),
}


def make_string(char: str, sep: str, size: int, seg_len: int) -> str:
    segment = char * seg_len + sep
    repeats = max(1, size // len(segment))
    return (segment * repeats)[:size]


def bench_split_sep(s: str, sep: str):
    s.split(sep)


def add_benchmarks(runner: pyperf.Runner):
    for charset_name, (char, sep) in CHAR_SETS.items():
        for size in SIZES:
            for seg_len in SEGMENT_LENGTHS:
                if seg_len > size:
                    continue
                s = make_string(char, sep, size, seg_len)
                name = f"split_{charset_name}_size{size}_seg{seg_len}"
                runner.bench_func(name, bench_split_sep, s, sep)


runner = pyperf.Runner()
add_benchmarks(runner)

I ran the benchmarks on two computers (running macOS and Linux), with --enable-optimizations --enable-lto. I don't have a Windows machine around, let alone one with a C compiler available, but if needed I may be able to find one.
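For readers interpreting the benchmark names, it may help to see what `make_string` produces. The snippet below copies the function from the script and runs it on two small inputs chosen here for illustration only. It suggests that the result can be slightly shorter than `size`, and that when `seg_len == size` the string appears to contain no separator at all.

```python
def make_string(char: str, sep: str, size: int, seg_len: int) -> str:
    # Same logic as in the benchmark script above.
    segment = char * seg_len + sep
    repeats = max(1, size // len(segment))
    return (segment * repeats)[:size]


# Segments of 2 characters plus a separator, target size 10.
short = make_string("a", " ", 10, 2)
print(repr(short), len(short))   # 'aa aa aa ' 9
print(short.split(" "))          # ['aa', 'aa', 'aa', '']

# seg_len == size: the separator is cut off by the final slice.
no_sep = make_string("a", " ", 100, 100)
print(" " in no_sep)             # False
print(len(no_sep.split(" ")))    # 1
```

If that reading is right, the cases where the segment length equals the size (for example size1000_seg1000) measure a scan of the whole string that never finds a delimiter.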
Optimisation flags do matter here. I initially ran the benchmarks on a build without LTO, and it led to slightly worse results in certain cases (probably because find_char was not being inlined, although I don't have proof).
Overall, what the benchmarks show is that:
- There is a performance gain (and a significant one)
- The performance gains grow with the size of segments (which makes sense, given that what we optimise here is how fast we find the next delimiter)
  - At lower segment sizes, the performance gain is near zero, and the macOS run even shows a slowdown for the shortest segments (seg2)
  - At higher segment sizes, the performance gain can exceed 90%
- Performance gains are much lower (or even nonexistent) when the delimiter (searched character) is "not well supported"
  - Splits on ASCII show significant improvements
  - Same goes for Latin
  - CJK shows or doesn't show an improvement depending on the delimiter (find_char has a different implementation depending on whether sep & 0xff is zero); in the "happy case" the improvement is comparable to ASCII/Latin
  - Emoji text shows no improvement on macOS (the Linux run does show gains)
- One caveat to all that is that the absolute numbers are already not exactly huge today. However, when splitting many strings in a loop, they could quickly add up.
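The two CJK cases differ only in the low byte of the separator's code point, which is what the comment in the benchmark script refers to. A small check (the helper name `low_byte` is made up for this illustration) shows which side of the `sep & 0xff` test each benchmark separator falls on:

```python
def low_byte(sep: str) -> int:
    """Return the least significant byte of a one-character separator."""
    return ord(sep) & 0xFF


for label, sep in [("space", " "), ("cjk", "\u3000"), ("cjk_nz", "\u3001")]:
    print(f"{label}: U+{ord(sep):04X} low byte = {low_byte(sep):#04x}")
# space: U+0020 low byte = 0x20
# cjk: U+3000 low byte = 0x00
# cjk_nz: U+3001 low byte = 0x01
```

Only the ideographic space (U+3000) has a zero low byte, which matches the "cjk" rows being the ones with the smaller gains.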
The raw results are attached here.
Note: the Change column is the relative difference in time between the optimised version and the baseline, so negative values mean the optimised version is faster.
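As a sanity check on how to read the Change column, the figures in the tables appear consistent with (optimised - baseline) / baseline. The helper below is an assumption about how the column was derived, not the script that produced it; the inputs are the first rows of the Linux and macOS tables.

```python
def relative_change(optimised: float, baseline: float) -> float:
    """Relative time difference in percent; negative means the optimised run is faster."""
    return (optimised - baseline) / baseline * 100.0


# Linux, split_ascii_size100_seg2: 1.22 us vs 1.35 us
print(round(relative_change(1.22, 1.35), 1))    # -9.6

# macOS, split_ascii_size100_seg2: 435 ns vs 403 ns
print(round(relative_change(435.0, 403.0), 1))  # 7.9
```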
Linux
Benchmark Optimised Baseline Change
----------------------------------------------------------------------------------
split_ascii_size100_seg2 1.22 us +- 0.02 us 1.35 us +- 0.03 us -9.6%
split_ascii_size100_seg10 421 ns +- 6 ns 497 ns +- 4 ns -15.3%
split_ascii_size100_seg50 212 ns +- 1 ns 242 ns +- 12 ns -12.4%
split_ascii_size1000_seg2 11.1 us +- 0.2 us 12.3 us +- 0.2 us -9.8%
split_ascii_size1000_seg10 3.30 us +- 0.05 us 4.08 us +- 0.05 us -19.1%
split_ascii_size1000_seg50 768 ns +- 9 ns 1.22 us +- 0.03 us -37.0%
split_ascii_size1000_seg250 289 ns +- 5 ns 611 ns +- 8 ns -52.7%
split_ascii_size1000_seg500 245 ns +- 9 ns 429 ns +- 3 ns -42.9%
split_ascii_size1000_seg1000 191 ns +- 5 ns 529 ns +- 17 ns -63.9%
split_ascii_size10000_seg2 105 us +- 4 us 116 us +- 3 us -9.5%
split_ascii_size10000_seg10 28.4 us +- 0.4 us 36.8 us +- 0.3 us -22.8%
split_ascii_size10000_seg50 7.05 us +- 0.09 us 11.7 us +- 0.1 us -39.7%
split_ascii_size10000_seg250 1.74 us +- 0.05 us 5.95 us +- 0.20 us -70.8%
split_ascii_size10000_seg500 1.84 us +- 0.06 us 5.61 us +- 0.08 us -67.2%
split_ascii_size10000_seg1000 1.27 us +- 0.01 us 4.43 us +- 0.04 us -71.3%
split_ascii_size10000_seg10000 255 ns +- 4 ns 3.52 us +- 0.02 us -92.8%
split_ascii_size100000_seg2 1.54 ms +- 0.03 ms 1.65 ms +- 0.03 ms -6.7%
split_ascii_size100000_seg10 290 us +- 21 us 361 us +- 4 us -19.7%
split_ascii_size100000_seg50 65.8 us +- 1.1 us 110 us +- 1 us -40.2%
split_ascii_size100000_seg250 18.0 us +- 0.2 us 57.7 us +- 1.8 us -68.8%
split_ascii_size100000_seg500 19.3 us +- 0.3 us 56.4 us +- 0.7 us -65.8%
split_ascii_size100000_seg1000 12.7 us +- 0.4 us 47.3 us +- 0.3 us -73.2%
split_ascii_size100000_seg10000 4.26 us +- 0.24 us 33.6 us +- 1.0 us -87.3%
split_ascii_size100000_seg25000 3.51 us +- 0.18 us 28.2 us +- 1.2 us -87.6%
split_ascii_size100000_seg100000 1.35 us +- 0.02 us 33.2 us +- 0.9 us -95.9%
split_ascii_size1000000_seg2 18.8 ms +- 0.5 ms 20.7 ms +- 0.7 ms -9.2%
split_ascii_size1000000_seg10 5.40 ms +- 0.08 ms 6.12 ms +- 0.06 ms -11.8%
split_ascii_size1000000_seg50 1.19 ms +- 0.04 ms 1.59 ms +- 0.01 ms -25.2%
split_ascii_size1000000_seg250 677 us +- 10 us 1.07 ms +- 0.01 ms -36.7%
split_ascii_size1000000_seg500 200 us +- 2 us 554 us +- 2 us -63.9%
split_ascii_size1000000_seg1000 143 us +- 2 us 459 us +- 9 us -68.8%
split_ascii_size1000000_seg10000 71.9 us +- 2.6 us 375 us +- 10 us -80.8%
split_ascii_size1000000_seg25000 77.2 us +- 2.2 us 369 us +- 4 us -79.1%
split_ascii_size1000000_seg100000 61.6 us +- 1.6 us 346 us +- 13 us -82.2%
split_ascii_size1000000_seg250000 45.0 us +- 1.0 us 284 us +- 5 us -84.2%
split_ascii_size1000000_seg500000 27.5 us +- 0.8 us 189 us +- 5 us -85.4%
split_ascii_size1000000_seg999999 73.7 us +- 3.4 us 387 us +- 2 us -81.0%
split_latin_size100_seg2 1.53 us +- 0.07 us 1.62 us +- 0.01 us -5.6%
split_latin_size100_seg10 499 ns +- 13 ns 570 ns +- 12 ns -12.5%
split_latin_size100_seg50 217 ns +- 3 ns 255 ns +- 4 ns -14.9%
split_latin_size1000_seg2 14.4 us +- 0.2 us 15.2 us +- 0.2 us -5.3%
split_latin_size1000_seg10 4.13 us +- 0.11 us 4.81 us +- 0.10 us -14.1%
split_latin_size1000_seg50 940 ns +- 12 ns 1.62 us +- 0.10 us -42.0%
split_latin_size1000_seg250 307 ns +- 2 ns 835 ns +- 12 ns -63.2%
split_latin_size1000_seg500 247 ns +- 1 ns 584 ns +- 12 ns -57.7%
split_latin_size1000_seg1000 190 ns +- 4 ns 844 ns +- 7 ns -77.5%
split_latin_size10000_seg2 138 us +- 3 us 145 us +- 4 us -4.8%
split_latin_size10000_seg10 37.6 us +- 0.7 us 44.0 us +- 0.5 us -14.5%
split_latin_size10000_seg50 9.51 us +- 0.59 us 15.4 us +- 0.5 us -38.2%
split_latin_size10000_seg250 2.07 us +- 0.07 us 8.76 us +- 0.11 us -76.4%
split_latin_size10000_seg500 1.92 us +- 0.02 us 8.46 us +- 0.13 us -77.3%
split_latin_size10000_seg1000 1.38 us +- 0.03 us 7.30 us +- 0.10 us -81.1%
split_latin_size10000_seg10000 255 ns +- 2 ns 6.90 us +- 0.32 us -96.3%
split_latin_size100000_seg2 2.26 ms +- 0.21 ms 2.26 ms +- 0.27 ms ~same
split_latin_size100000_seg10 371 us +- 5 us 435 us +- 11 us -14.7%
split_latin_size100000_seg50 86.8 us +- 1.2 us 146 us +- 5 us -40.5%
split_latin_size100000_seg250 20.9 us +- 0.4 us 87.0 us +- 0.7 us -76.0%
split_latin_size100000_seg500 21.4 us +- 0.6 us 87.5 us +- 1.0 us -75.5%
split_latin_size100000_seg1000 13.7 us +- 0.5 us 78.8 us +- 0.4 us -82.6%
split_latin_size100000_seg10000 4.27 us +- 0.19 us 63.0 us +- 1.1 us -93.2%
split_latin_size100000_seg25000 3.60 us +- 0.23 us 52.6 us +- 0.9 us -93.2%
split_latin_size100000_seg100000 1.34 us +- 0.01 us 66.3 us +- 2.6 us -98.0%
split_latin_size1000000_seg2 26.0 ms +- 0.8 ms 26.5 ms +- 0.7 ms -1.9%
split_latin_size1000000_seg10 7.18 ms +- 0.23 ms 8.05 ms +- 0.62 ms -10.8%
split_latin_size1000000_seg50 1.88 ms +- 0.01 ms 2.47 ms +- 0.09 ms -23.9%
split_latin_size1000000_seg250 709 us +- 8 us 1.37 ms +- 0.01 ms -48.2%
split_latin_size1000000_seg500 217 us +- 5 us 869 us +- 11 us -75.0%
split_latin_size1000000_seg1000 149 us +- 2 us 781 us +- 5 us -80.9%
split_latin_size1000000_seg10000 74.9 us +- 3.7 us 700 us +- 7 us -89.3%
split_latin_size1000000_seg25000 76.7 us +- 2.4 us 690 us +- 3 us -88.9%
split_latin_size1000000_seg100000 62.1 us +- 2.2 us 639 us +- 14 us -90.3%
split_latin_size1000000_seg250000 45.3 us +- 1.3 us 529 us +- 3 us -91.4%
split_latin_size1000000_seg500000 27.5 us +- 0.9 us 357 us +- 11 us -92.3%
split_latin_size1000000_seg999999 74.8 us +- 3.2 us 757 us +- 56 us -90.1%
split_cjk_size100_seg2 1.46 us +- 0.02 us 1.54 us +- 0.04 us -5.2%
split_cjk_size100_seg10 519 ns +- 5 ns 605 ns +- 4 ns -14.2%
split_cjk_size100_seg50 248 ns +- 4 ns 278 ns +- 1 ns -10.8%
split_cjk_size1000_seg2 13.8 us +- 0.5 us 14.2 us +- 0.1 us -2.8%
split_cjk_size1000_seg10 4.53 us +- 0.26 us 5.03 us +- 0.12 us -9.9%
split_cjk_size1000_seg50 1.52 us +- 0.04 us 1.99 us +- 0.05 us -23.6%
split_cjk_size1000_seg250 880 ns +- 19 ns 1.17 us +- 0.03 us -24.8%
split_cjk_size1000_seg500 630 ns +- 5 ns 807 ns +- 9 ns -21.9%
split_cjk_size1000_seg1000 842 ns +- 18 ns 1.18 us +- 0.01 us -28.6%
split_cjk_size10000_seg2 130 us +- 1 us 136 us +- 8 us -4.4%
split_cjk_size10000_seg10 41.5 us +- 0.3 us 46.1 us +- 1.5 us -10.0%
split_cjk_size10000_seg50 14.6 us +- 0.5 us 19.4 us +- 0.6 us -24.7%
split_cjk_size10000_seg250 10.6 us +- 0.2 us 14.1 us +- 0.1 us -24.8%
split_cjk_size10000_seg500 9.07 us +- 0.06 us 12.3 us +- 0.1 us -26.3%
split_cjk_size10000_seg1000 7.44 us +- 0.15 us 10.5 us +- 0.2 us -29.1%
split_cjk_size10000_seg10000 6.80 us +- 0.12 us 10.1 us +- 0.0 us -32.7%
split_cjk_size100000_seg2 2.18 ms +- 0.18 ms 2.21 ms +- 0.20 ms -1.4%
split_cjk_size100000_seg10 410 us +- 17 us 443 us +- 4 us -7.4%
split_cjk_size100000_seg50 138 us +- 4 us 188 us +- 12 us -26.6%
split_cjk_size100000_seg250 105 us +- 1 us 145 us +- 6 us -27.6%
split_cjk_size100000_seg500 92.8 us +- 0.7 us 127 us +- 1 us -26.9%
split_cjk_size100000_seg1000 80.5 us +- 1.7 us 113 us +- 1 us -28.8%
split_cjk_size100000_seg10000 66.5 us +- 0.3 us 96.1 us +- 0.6 us -30.8%
split_cjk_size100000_seg25000 55.8 us +- 1.1 us 80.6 us +- 1.6 us -30.8%
split_cjk_size100000_seg100000 65.7 us +- 0.4 us 98.4 us +- 0.5 us -33.2%
split_cjk_size1000000_seg2 25.7 ms +- 1.8 ms 25.3 ms +- 0.9 ms +1.6%
split_cjk_size1000000_seg10 7.40 ms +- 0.30 ms 7.86 ms +- 0.31 ms -5.9%
split_cjk_size1000000_seg50 2.76 ms +- 0.25 ms 3.17 ms +- 0.22 ms -12.9%
split_cjk_size1000000_seg250 1.05 ms +- 0.01 ms 1.42 ms +- 0.06 ms -26.1%
split_cjk_size1000000_seg500 949 us +- 7 us 1.32 ms +- 0.07 ms -28.1%
split_cjk_size1000000_seg1000 804 us +- 14 us 1.13 ms +- 0.01 ms -28.8%
split_cjk_size1000000_seg10000 748 us +- 15 us 1.07 ms +- 0.01 ms -30.1%
split_cjk_size1000000_seg25000 741 us +- 4 us 1.06 ms +- 0.00 ms -30.1%
split_cjk_size1000000_seg100000 682 us +- 7 us 976 us +- 3 us -30.1%
split_cjk_size1000000_seg250000 572 us +- 22 us 812 us +- 4 us -29.6%
split_cjk_size1000000_seg500000 390 us +- 21 us 550 us +- 3 us -29.1%
split_cjk_size1000000_seg999999 815 us +- 4 us 1.14 ms +- 0.01 ms -28.5%
split_cjk_nz_size100_seg2 1.49 us +- 0.02 us 1.54 us +- 0.04 us -3.2%
split_cjk_nz_size100_seg10 499 ns +- 8 ns 609 ns +- 11 ns -18.1%
split_cjk_nz_size100_seg50 223 ns +- 5 ns 282 ns +- 10 ns -20.9%
split_cjk_nz_size1000_seg2 14.0 us +- 0.1 us 14.3 us +- 0.3 us -2.1%
split_cjk_nz_size1000_seg10 3.95 us +- 0.05 us 5.01 us +- 0.04 us -21.2%
split_cjk_nz_size1000_seg50 951 ns +- 17 ns 1.98 us +- 0.08 us -52.0%
split_cjk_nz_size1000_seg250 376 ns +- 10 ns 1.16 us +- 0.01 us -67.6%
split_cjk_nz_size1000_seg500 312 ns +- 11 ns 814 ns +- 33 ns -61.7%
split_cjk_nz_size1000_seg1000 203 ns +- 12 ns 1.18 us +- 0.03 us -82.8%
split_cjk_nz_size10000_seg2 132 us +- 3 us 134 us +- 1 us -1.5%
split_cjk_nz_size10000_seg10 36.0 us +- 1.2 us 45.4 us +- 0.5 us -20.7%
split_cjk_nz_size10000_seg50 9.05 us +- 0.07 us 19.2 us +- 0.4 us -52.9%
split_cjk_nz_size10000_seg250 3.93 us +- 0.03 us 14.6 us +- 1.1 us -73.1%
split_cjk_nz_size10000_seg500 2.70 us +- 0.02 us 12.3 us +- 0.1 us -78.0%
split_cjk_nz_size10000_seg1000 1.65 us +- 0.02 us 10.4 us +- 0.1 us -84.1%
split_cjk_nz_size10000_seg10000 344 ns +- 5 ns 10.1 us +- 0.1 us -96.6%
split_cjk_nz_size100000_seg2 2.18 ms +- 0.20 ms 2.18 ms +- 0.21 ms ~same
split_cjk_nz_size100000_seg10 347 us +- 5 us 444 us +- 14 us -21.8%
split_cjk_nz_size100000_seg50 87.6 us +- 5.6 us 186 us +- 8 us -52.9%
split_cjk_nz_size100000_seg250 41.0 us +- 1.0 us 141 us +- 1 us -70.9%
split_cjk_nz_size100000_seg500 29.0 us +- 0.8 us 127 us +- 3 us -77.2%
split_cjk_nz_size100000_seg1000 17.7 us +- 0.3 us 113 us +- 1 us -84.3%
split_cjk_nz_size100000_seg10000 8.53 us +- 0.52 us 99.7 us +- 5.7 us -91.4%
split_cjk_nz_size100000_seg25000 7.04 us +- 0.46 us 80.5 us +- 1.3 us -91.3%
split_cjk_nz_size100000_seg100000 2.50 us +- 0.05 us 98.4 us +- 0.3 us -97.5%
split_cjk_nz_size1000000_seg2 25.1 ms +- 0.9 ms 25.5 ms +- 0.6 ms -1.6%
split_cjk_nz_size1000000_seg10 7.01 ms +- 0.17 ms 7.82 ms +- 0.21 ms -10.4%
split_cjk_nz_size1000000_seg50 2.26 ms +- 0.16 ms 3.24 ms +- 0.16 ms -30.2%
split_cjk_nz_size1000000_seg250 440 us +- 27 us 1.41 ms +- 0.02 ms -68.8%
split_cjk_nz_size1000000_seg500 353 us +- 10 us 1.29 ms +- 0.01 ms -72.6%
split_cjk_nz_size1000000_seg1000 235 us +- 5 us 1.13 ms +- 0.01 ms -79.2%
split_cjk_nz_size1000000_seg10000 193 us +- 3 us 1.07 ms +- 0.00 ms -82.0%
split_cjk_nz_size1000000_seg25000 197 us +- 3 us 1.07 ms +- 0.01 ms -81.6%
split_cjk_nz_size1000000_seg100000 162 us +- 3 us 979 us +- 8 us -83.5%
split_cjk_nz_size1000000_seg250000 122 us +- 3 us 813 us +- 4 us -85.0%
split_cjk_nz_size1000000_seg500000 72.9 us +- 2.8 us 552 us +- 6 us -86.8%
split_cjk_nz_size1000000_seg999999 234 us +- 9 us 1.14 ms +- 0.00 ms -79.5%
split_emoji_size100_seg2 1.69 us +- 0.06 us 1.83 us +- 0.02 us -7.7%
split_emoji_size100_seg10 615 ns +- 38 ns 674 ns +- 8 ns -8.8%
split_emoji_size100_seg50 249 ns +- 9 ns 300 ns +- 5 ns -17.0%
split_emoji_size1000_seg2 15.6 us +- 0.7 us 17.1 us +- 0.5 us -8.8%
split_emoji_size1000_seg10 4.92 us +- 0.12 us 5.67 us +- 0.10 us -13.2%
split_emoji_size1000_seg50 1.22 us +- 0.01 us 2.10 us +- 0.09 us -41.9%
split_emoji_size1000_seg250 620 ns +- 11 ns 1.39 us +- 0.04 us -55.4%
split_emoji_size1000_seg500 331 ns +- 7 ns 816 ns +- 5 ns -59.4%
split_emoji_size1000_seg1000 231 ns +- 2 ns 1.19 us +- 0.00 us -80.6%
split_emoji_size10000_seg2 148 us +- 3 us 164 us +- 1 us -9.8%
split_emoji_size10000_seg10 45.9 us +- 0.5 us 52.1 us +- 0.3 us -11.9%
split_emoji_size10000_seg50 11.9 us +- 0.1 us 20.2 us +- 0.2 us -41.1%
split_emoji_size10000_seg250 5.82 us +- 0.15 us 15.7 us +- 0.1 us -62.9%
split_emoji_size10000_seg500 3.45 us +- 0.05 us 12.7 us +- 0.1 us -72.8%
split_emoji_size10000_seg1000 2.21 us +- 0.03 us 10.7 us +- 0.1 us -79.3%
split_emoji_size10000_seg10000 669 ns +- 12 ns 10.6 us +- 0.8 us -93.7%
split_emoji_size100000_seg2 2.53 ms +- 0.02 ms 2.62 ms +- 0.12 ms -3.4%
split_emoji_size100000_seg10 457 us +- 4 us 514 us +- 6 us -11.1%
split_emoji_size100000_seg50 115 us +- 1 us 194 us +- 1 us -40.7%
split_emoji_size100000_seg250 64.5 us +- 1.8 us 159 us +- 3 us -59.4%
split_emoji_size100000_seg500 35.4 us +- 1.2 us 128 us +- 0 us -72.3%
split_emoji_size100000_seg1000 26.7 us +- 1.2 us 118 us +- 1 us -77.4%
split_emoji_size100000_seg10000 19.1 us +- 1.0 us 105 us +- 2 us -81.8%
split_emoji_size100000_seg25000 15.0 us +- 1.1 us 87.3 us +- 1.4 us -82.8%
split_emoji_size100000_seg100000 4.63 us +- 0.04 us 98.8 us +- 2.1 us -95.3%
split_emoji_size1000000_seg2 31.4 ms +- 0.7 ms 32.7 ms +- 0.8 ms -4.0%
split_emoji_size1000000_seg10 9.30 ms +- 0.18 ms 10.1 ms +- 0.4 ms -7.9%
split_emoji_size1000000_seg50 3.68 ms +- 0.09 ms 4.47 ms +- 0.04 ms -17.7%
split_emoji_size1000000_seg250 748 us +- 12 us 1.62 ms +- 0.03 ms -53.8%
split_emoji_size1000000_seg500 546 us +- 11 us 1.36 ms +- 0.01 ms -59.9%
split_emoji_size1000000_seg1000 412 us +- 6 us 1.22 ms +- 0.01 ms -66.2%
split_emoji_size1000000_seg10000 429 us +- 4 us 1.19 ms +- 0.01 ms -63.9%
split_emoji_size1000000_seg25000 382 us +- 19 us 1.17 ms +- 0.01 ms -67.4%
split_emoji_size1000000_seg100000 338 us +- 4 us 1.07 ms +- 0.01 ms -68.4%
split_emoji_size1000000_seg250000 314 us +- 4 us 924 us +- 6 us -66.0%
split_emoji_size1000000_seg500000 237 us +- 2 us 652 us +- 6 us -63.7%
split_emoji_size1000000_seg999999 869 us +- 6 us 1.77 ms +- 0.14 ms -50.9%
macOS
Benchmark Optimised Baseline Change
----------------------------------------------------------------------------------
split_ascii_size100_seg2 435 ns +- 28 ns 403 ns +- 10 ns +7.9%
split_ascii_size100_seg10 145 ns +- 3 ns 150 ns +- 8 ns -3.3%
split_ascii_size100_seg50 59.8 ns +- 3.5 ns 70.2 ns +- 0.5 ns -14.8%
split_ascii_size1000_seg2 4.50 us +- 0.15 us 3.86 us +- 0.05 us +16.6%
split_ascii_size1000_seg10 1.15 us +- 0.01 us 1.35 us +- 0.02 us -14.8%
split_ascii_size1000_seg50 291 ns +- 6 ns 573 ns +- 34 ns -49.2%
split_ascii_size1000_seg250 92.6 ns +- 0.6 ns 321 ns +- 10 ns -71.2%
split_ascii_size1000_seg500 94.3 ns +- 1.5 ns 233 ns +- 5 ns -59.5%
split_ascii_size1000_seg1000 64.4 ns +- 0.6 ns 316 ns +- 3 ns -79.6%
split_ascii_size10000_seg2 45.6 us +- 0.3 us 37.0 us +- 0.8 us +23.2%
split_ascii_size10000_seg10 11.4 us +- 0.1 us 12.1 us +- 0.2 us -5.8%
split_ascii_size10000_seg50 2.86 us +- 0.03 us 5.79 us +- 0.12 us -50.6%
split_ascii_size10000_seg250 684 ns +- 9 ns 3.61 us +- 0.05 us -81.1%
split_ascii_size10000_seg500 846 ns +- 16 ns 3.44 us +- 0.30 us -75.4%
split_ascii_size10000_seg1000 539 ns +- 11 ns 2.80 us +- 0.02 us -80.8%
split_ascii_size10000_seg10000 231 ns +- 2 ns 2.63 us +- 0.02 us -91.2%
split_ascii_size100000_seg2 495 us +- 5 us 413 us +- 6 us +19.9%
split_ascii_size100000_seg10 115 us +- 3 us 129 us +- 1 us -10.9%
split_ascii_size100000_seg50 27.8 us +- 0.3 us 54.6 us +- 1.5 us -49.1%
split_ascii_size100000_seg250 8.16 us +- 0.07 us 37.4 us +- 0.8 us -78.2%
split_ascii_size100000_seg500 10.9 us +- 0.2 us 37.3 us +- 0.9 us -70.8%
split_ascii_size100000_seg1000 7.30 us +- 0.18 us 32.2 us +- 0.1 us -77.3%
split_ascii_size100000_seg10000 2.99 us +- 0.04 us 24.8 us +- 0.4 us -87.9%
split_ascii_size100000_seg25000 2.14 us +- 0.02 us 20.3 us +- 0.1 us -89.5%
split_ascii_size100000_seg100000 1.69 us +- 0.01 us 25.8 us +- 0.2 us -93.4%
split_ascii_size1000000_seg2 5.26 ms +- 0.19 ms 4.47 ms +- 0.05 ms +17.7%
split_ascii_size1000000_seg10 1.43 ms +- 0.01 ms 1.56 ms +- 0.06 ms -8.3%
split_ascii_size1000000_seg50 329 us +- 3 us 628 us +- 14 us -47.6%
split_ascii_size1000000_seg250 77.4 us +- 0.5 us 373 us +- 6 us -79.2%
split_ascii_size1000000_seg500 108 us +- 1 us 380 us +- 7 us -71.6%
split_ascii_size1000000_seg1000 79.8 us +- 0.7 us 337 us +- 5 us -76.3%
split_ascii_size1000000_seg10000 34.3 us +- 0.3 us 274 us +- 2 us -87.5%
split_ascii_size1000000_seg25000 27.6 us +- 0.3 us 264 us +- 1 us -89.5%
split_ascii_size1000000_seg100000 28.6 us +- 0.3 us 248 us +- 2 us -88.5%
split_ascii_size1000000_seg250000 24.0 us +- 0.3 us 206 us +- 2 us -88.3%
split_ascii_size1000000_seg500000 15.3 us +- 0.2 us 136 us +- 3 us -88.8%
split_ascii_size1000000_seg999999 29.8 us +- 0.5 us 271 us +- 2 us -89.0%
split_latin_size100_seg2 484 ns +- 6 ns 444 ns +- 8 ns +9.0%
split_latin_size100_seg10 155 ns +- 6 ns 159 ns +- 2 ns -2.5%
split_latin_size100_seg50 60.1 ns +- 0.6 ns 70.6 ns +- 1.4 ns -14.9%
split_latin_size1000_seg2 5.09 us +- 0.06 us 4.58 us +- 0.13 us +11.1%
split_latin_size1000_seg10 1.42 us +- 0.01 us 1.49 us +- 0.01 us -4.7%
split_latin_size1000_seg50 336 ns +- 6 ns 573 ns +- 13 ns -41.4%
split_latin_size1000_seg250 102 ns +- 1 ns 336 ns +- 7 ns -69.6%
split_latin_size1000_seg500 95.3 ns +- 1.2 ns 232 ns +- 3 ns -58.9%
split_latin_size1000_seg1000 66.0 ns +- 0.7 ns 327 ns +- 2 ns -79.8%
split_latin_size10000_seg2 51.2 us +- 0.4 us 47.4 us +- 0.5 us +8.0%
split_latin_size10000_seg10 13.8 us +- 0.6 us 14.3 us +- 0.5 us -3.5%
split_latin_size10000_seg50 3.54 us +- 0.06 us 5.94 us +- 0.07 us -40.4%
split_latin_size10000_seg250 844 ns +- 10 ns 3.80 us +- 0.03 us -77.8%
split_latin_size10000_seg500 919 ns +- 10 ns 3.50 us +- 0.02 us -73.7%
split_latin_size10000_seg1000 565 ns +- 9 ns 2.88 us +- 0.03 us -80.4%
split_latin_size10000_seg10000 234 ns +- 8 ns 2.79 us +- 0.04 us -91.6%
split_latin_size100000_seg2 574 us +- 5 us 541 us +- 5 us +6.1%
split_latin_size100000_seg10 136 us +- 1 us 145 us +- 1 us -6.2%
split_latin_size100000_seg50 35.3 us +- 0.3 us 59.9 us +- 1.9 us -41.1%
split_latin_size100000_seg250 9.38 us +- 0.19 us 39.1 us +- 0.2 us -76.0%
split_latin_size100000_seg500 11.7 us +- 0.1 us 38.5 us +- 0.2 us -69.6%
split_latin_size100000_seg1000 7.48 us +- 0.04 us 32.9 us +- 0.2 us -77.3%
split_latin_size100000_seg10000 3.01 us +- 0.03 us 25.3 us +- 0.1 us -88.1%
split_latin_size100000_seg25000 2.15 us +- 0.02 us 20.8 us +- 0.4 us -89.7%
split_latin_size100000_seg100000 1.69 us +- 0.01 us 27.5 us +- 0.4 us -93.9%
split_latin_size1000000_seg2 6.49 ms +- 0.07 ms 5.99 ms +- 0.07 ms +8.3%
split_latin_size1000000_seg10 1.75 ms +- 0.02 ms 1.85 ms +- 0.02 ms -5.4%
split_latin_size1000000_seg50 413 us +- 3 us 666 us +- 20 us -38.0%
split_latin_size1000000_seg250 90.0 us +- 2.0 us 390 us +- 9 us -76.9%
split_latin_size1000000_seg500 118 us +- 1 us 390 us +- 2 us -69.7%
split_latin_size1000000_seg1000 83.5 us +- 0.6 us 343 us +- 6 us -75.7%
split_latin_size1000000_seg10000 34.7 us +- 0.3 us 281 us +- 2 us -87.7%
split_latin_size1000000_seg25000 27.8 us +- 0.2 us 270 us +- 2 us -89.7%
split_latin_size1000000_seg100000 28.8 us +- 0.3 us 251 us +- 2 us -88.5%
split_latin_size1000000_seg250000 24.0 us +- 0.3 us 211 us +- 1 us -88.6%
split_latin_size1000000_seg500000 15.3 us +- 0.3 us 141 us +- 6 us -89.1%
split_latin_size1000000_seg999999 29.7 us +- 0.4 us 279 us +- 2 us -89.4%
split_cjk_size100_seg2 577 ns +- 15 ns 549 ns +- 14 ns +5.1%
split_cjk_size100_seg10 212 ns +- 3 ns 195 ns +- 3 ns +8.7%
split_cjk_size100_seg50 74.5 ns +- 0.4 ns 73.5 ns +- 2.7 ns +1.4%
split_cjk_size1000_seg2 5.62 us +- 0.17 us 5.29 us +- 0.07 us +6.2%
split_cjk_size1000_seg10 1.87 us +- 0.06 us 1.81 us +- 0.05 us +3.3%
split_cjk_size1000_seg50 590 ns +- 6 ns 592 ns +- 11 ns ~same
split_cjk_size1000_seg250 429 ns +- 13 ns 413 ns +- 10 ns +3.9%
split_cjk_size1000_seg500 249 ns +- 4 ns 260 ns +- 4 ns -4.2%
split_cjk_size1000_seg1000 322 ns +- 9 ns 330 ns +- 2 ns -2.4%
split_cjk_size10000_seg2 54.0 us +- 0.5 us 51.6 us +- 0.6 us +4.7%
split_cjk_size10000_seg10 17.5 us +- 0.2 us 16.7 us +- 0.2 us +4.8%
split_cjk_size10000_seg50 6.03 us +- 0.06 us 6.07 us +- 0.10 us ~same
split_cjk_size10000_seg250 4.77 us +- 0.10 us 4.86 us +- 0.04 us -1.9%
split_cjk_size10000_seg500 3.65 us +- 0.03 us 3.73 us +- 0.04 us -2.1%
split_cjk_size10000_seg1000 3.00 us +- 0.02 us 3.06 us +- 0.03 us -2.0%
split_cjk_size10000_seg10000 2.63 us +- 0.02 us 2.80 us +- 0.07 us -6.1%
split_cjk_size100000_seg2 587 us +- 17 us 568 us +- 7 us +3.3%
split_cjk_size100000_seg10 178 us +- 3 us 168 us +- 3 us +6.0%
split_cjk_size100000_seg50 59.2 us +- 0.6 us 59.3 us +- 0.6 us ~same
split_cjk_size100000_seg250 51.8 us +- 0.4 us 52.7 us +- 1.2 us -1.7%
split_cjk_size100000_seg500 41.4 us +- 0.4 us 41.9 us +- 0.3 us -1.2%
split_cjk_size100000_seg1000 36.5 us +- 0.9 us 37.0 us +- 0.2 us -1.4%
split_cjk_size100000_seg10000 25.7 us +- 0.2 us 26.2 us +- 0.2 us -1.9%
split_cjk_size100000_seg25000 22.6 us +- 0.2 us 23.1 us +- 0.4 us -2.2%
split_cjk_size100000_seg100000 25.8 us +- 0.1 us 27.8 us +- 0.5 us -7.2%
split_cjk_size1000000_seg2 6.57 ms +- 0.10 ms 6.38 ms +- 0.11 ms +3.0%
split_cjk_size1000000_seg10 2.18 ms +- 0.04 ms 2.11 ms +- 0.05 ms +3.3%
split_cjk_size1000000_seg50 708 us +- 6 us 711 us +- 10 us ~same
split_cjk_size1000000_seg250 514 us +- 10 us 520 us +- 2 us -1.2%
split_cjk_size1000000_seg500 414 us +- 2 us 420 us +- 3 us -1.4%
split_cjk_size1000000_seg1000 374 us +- 5 us 381 us +- 4 us -1.8%
split_cjk_size1000000_seg10000 285 us +- 2 us 290 us +- 2 us -1.7%
split_cjk_size1000000_seg25000 294 us +- 2 us 302 us +- 8 us -2.6%
split_cjk_size1000000_seg100000 262 us +- 2 us 268 us +- 2 us -2.2%
split_cjk_size1000000_seg250000 216 us +- 8 us 220 us +- 1 us -1.8%
split_cjk_size1000000_seg500000 143 us +- 1 us 145 us +- 1 us -1.4%
split_cjk_size1000000_seg999999 284 us +- 2 us 289 us +- 2 us -1.7%
split_cjk_nz_size100_seg2 597 ns +- 7 ns 552 ns +- 12 ns +8.2%
split_cjk_nz_size100_seg10 197 ns +- 3 ns 196 ns +- 3 ns ~same
split_cjk_nz_size100_seg50 64.8 ns +- 1.8 ns 88.6 ns +- 17.2 ns -26.9%
split_cjk_nz_size1000_seg2 6.41 us +- 0.08 us 5.34 us +- 0.12 us +20.0%
split_cjk_nz_size1000_seg10 1.86 us +- 0.02 us 1.80 us +- 0.03 us +3.3%
split_cjk_nz_size1000_seg50 397 ns +- 4 ns 593 ns +- 7 ns -33.1%
split_cjk_nz_size1000_seg250 212 ns +- 9 ns 413 ns +- 11 ns -48.7%
split_cjk_nz_size1000_seg500 107 ns +- 1 ns 260 ns +- 2 ns -58.8%
split_cjk_nz_size1000_seg1000 94.0 ns +- 0.9 ns 329 ns +- 2 ns -71.4%
split_cjk_nz_size10000_seg2 63.9 us +- 0.7 us 51.8 us +- 0.7 us +23.4%
split_cjk_nz_size10000_seg10 18.0 us +- 0.3 us 16.7 us +- 0.3 us +7.8%
split_cjk_nz_size10000_seg50 4.15 us +- 0.08 us 6.04 us +- 0.05 us -31.3%
split_cjk_nz_size10000_seg250 2.08 us +- 0.03 us 4.84 us +- 0.03 us -57.0%
split_cjk_nz_size10000_seg500 1.32 us +- 0.03 us 3.72 us +- 0.02 us -64.5%
split_cjk_nz_size10000_seg1000 922 ns +- 9 ns 3.07 us +- 0.02 us -70.0%
split_cjk_nz_size10000_seg10000 393 ns +- 3 ns 2.80 us +- 0.02 us -86.0%
split_cjk_nz_size100000_seg2 697 us +- 19 us 567 us +- 14 us +22.9%
split_cjk_nz_size100000_seg10 178 us +- 2 us 168 us +- 2 us +6.0%
split_cjk_nz_size100000_seg50 40.6 us +- 0.4 us 59.2 us +- 0.7 us -31.4%
split_cjk_nz_size100000_seg250 25.2 us +- 0.2 us 52.6 us +- 0.3 us -52.1%
split_cjk_nz_size100000_seg500 16.9 us +- 0.1 us 42.0 us +- 0.9 us -59.8%
split_cjk_nz_size100000_seg1000 13.5 us +- 0.2 us 37.0 us +- 0.2 us -63.5%
split_cjk_nz_size100000_seg10000 5.35 us +- 0.06 us 26.1 us +- 0.2 us -79.5%
split_cjk_nz_size100000_seg25000 5.75 us +- 0.07 us 23.1 us +- 0.2 us -75.1%
split_cjk_nz_size100000_seg100000 3.39 us +- 0.08 us 27.6 us +- 0.4 us -87.7%
split_cjk_nz_size1000000_seg2 7.77 ms +- 0.18 ms 6.43 ms +- 0.14 ms +20.8%
split_cjk_nz_size1000000_seg10 2.18 ms +- 0.03 ms 2.13 ms +- 0.06 ms +2.3%
split_cjk_nz_size1000000_seg50 527 us +- 5 us 711 us +- 7 us -25.9%
split_cjk_nz_size1000000_seg250 244 us +- 2 us 521 us +- 2 us -53.2%
split_cjk_nz_size1000000_seg500 167 us +- 1 us 420 us +- 2 us -60.2%
split_cjk_nz_size1000000_seg1000 141 us +- 2 us 381 us +- 3 us -63.0%
split_cjk_nz_size1000000_seg10000 61.0 us +- 0.6 us 290 us +- 2 us -79.0%
split_cjk_nz_size1000000_seg25000 74.2 us +- 0.6 us 301 us +- 2 us -75.3%
split_cjk_nz_size1000000_seg100000 59.0 us +- 1.2 us 268 us +- 2 us -78.0%
split_cjk_nz_size1000000_seg250000 46.0 us +- 0.5 us 218 us +- 1 us -78.9%
split_cjk_nz_size1000000_seg500000 29.8 us +- 0.3 us 146 us +- 1 us -79.6%
split_cjk_nz_size1000000_seg999999 58.8 us +- 0.4 us 289 us +- 2 us -79.7%
split_emoji_size100_seg2 591 ns +- 13 ns 560 ns +- 6 ns +5.5%
split_emoji_size100_seg10 224 ns +- 3 ns 211 ns +- 6 ns +6.2%
split_emoji_size100_seg50 83.5 ns +- 1.1 ns 80.3 ns +- 1.3 ns +4.0%
split_emoji_size1000_seg2 5.98 us +- 0.11 us 5.49 us +- 0.06 us +8.9%
split_emoji_size1000_seg10 2.02 us +- 0.04 us 1.93 us +- 0.02 us +4.7%
split_emoji_size1000_seg50 649 ns +- 5 ns 647 ns +- 7 ns ~same
split_emoji_size1000_seg250 423 ns +- 3 ns 422 ns +- 3 ns ~same
split_emoji_size1000_seg500 260 ns +- 2 ns 269 ns +- 2 ns -3.3%
split_emoji_size1000_seg1000 328 ns +- 12 ns 333 ns +- 2 ns -1.5%
split_emoji_size10000_seg2 58.6 us +- 0.8 us 54.8 us +- 0.5 us +6.9%
split_emoji_size10000_seg10 19.4 us +- 0.2 us 18.7 us +- 0.4 us +3.7%
split_emoji_size10000_seg50 6.61 us +- 0.10 us 6.60 us +- 0.08 us ~same
split_emoji_size10000_seg250 5.00 us +- 0.05 us 5.05 us +- 0.03 us ~same
split_emoji_size10000_seg500 3.94 us +- 0.03 us 4.02 us +- 0.02 us -2.0%
split_emoji_size10000_seg1000 3.28 us +- 0.03 us 3.31 us +- 0.03 us ~same
split_emoji_size10000_seg10000 2.64 us +- 0.01 us 2.79 us +- 0.03 us -5.4%
split_emoji_size100000_seg2 689 us +- 6 us 642 us +- 5 us +7.3%
split_emoji_size100000_seg10 194 us +- 6 us 183 us +- 2 us +6.0%
split_emoji_size100000_seg50 64.4 us +- 0.8 us 63.8 us +- 0.5 us ~same
split_emoji_size100000_seg250 57.2 us +- 0.3 us 58.0 us +- 0.3 us -1.4%
split_emoji_size100000_seg500 49.3 us +- 0.2 us 49.9 us +- 0.4 us -1.2%
split_emoji_size100000_seg1000 41.7 us +- 0.9 us 42.0 us +- 0.5 us ~same
split_emoji_size100000_seg10000 31.9 us +- 0.2 us 32.5 us +- 0.3 us -1.8%
split_emoji_size100000_seg25000 24.1 us +- 0.2 us 24.9 us +- 0.8 us -3.2%
split_emoji_size100000_seg100000 25.7 us +- 0.1 us 27.7 us +- 0.4 us -7.2%
split_emoji_size1000000_seg2 7.40 ms +- 0.42 ms 7.18 ms +- 0.27 ms +3.1%
split_emoji_size1000000_seg10 2.51 ms +- 0.07 ms 2.40 ms +- 0.03 ms +4.6%
split_emoji_size1000000_seg50 916 us +- 16 us 929 us +- 36 us -1.4%
split_emoji_size1000000_seg250 566 us +- 4 us 576 us +- 4 us -1.7%
split_emoji_size1000000_seg500 495 us +- 2 us 500 us +- 4 us ~same
split_emoji_size1000000_seg1000 427 us +- 10 us 427 us +- 3 us ~same
split_emoji_size1000000_seg10000 353 us +- 4 us 362 us +- 14 us -2.5%
split_emoji_size1000000_seg25000 315 us +- 2 us 319 us +- 3 us -1.3%
split_emoji_size1000000_seg100000 287 us +- 2 us 292 us +- 2 us -1.7%
split_emoji_size1000000_seg250000 237 us +- 13 us 239 us +- 2 us ~same
split_emoji_size1000000_seg500000 155 us +- 1 us 158 us +- 1 us -1.9%
split_emoji_size1000000_seg999999 309 us +- 2 us 315 us +- 2 us -1.9%
I have also ensured the tests pass with my changes.
$ ./python.exe -m unittest Lib/test/test_str.py -v -k split
test_additional_rsplit (Lib.test.test_str.StrTest.test_additional_rsplit) ... ok
test_additional_split (Lib.test.test_str.StrTest.test_additional_split) ... ok
test_rsplit (Lib.test.test_str.StrTest.test_rsplit) ... ok
test_split (Lib.test.test_str.StrTest.test_split) ... ok
test_splitlines (Lib.test.test_str.StrTest.test_splitlines) ... ok
test_formatter_field_name_split (Lib.test.test_str.StringModuleTest.test_formatter_field_name_split) ... ok
----------------------------------------------------------------------
Ran 6 tests in 0.001s
OK

I haven't (yet) made benchmarks for the rsplit case, but if that would be useful, I can make some too (I just wanted to probe for interest / validity first).
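For reference, an rsplit counterpart could probably mirror the split script almost line for line. The following is only a sketch of what that might look like: the reduced size lists and the `rsplit_` name prefix are choices made here for brevity, and no results have been produced with it.

```python
# Reduced parameter grid, for illustration only.
SIZES = [100, 1_000, 10_000]
SEGMENT_LENGTHS = [2, 10, 50, 250, 1000]
CHAR_SETS = {
    "ascii": ("a", " "),
    "cjk_nz": ("\u4e16", "\u3001"),
}


def make_string(char: str, sep: str, size: int, seg_len: int) -> str:
    # Same helper as in the split benchmark.
    segment = char * seg_len + sep
    repeats = max(1, size // len(segment))
    return (segment * repeats)[:size]


def bench_rsplit_sep(s: str, sep: str):
    s.rsplit(sep)


def add_rsplit_benchmarks(runner):
    """Register one rsplit benchmark per (charset, size, segment length)."""
    for charset_name, (char, sep) in CHAR_SETS.items():
        for size in SIZES:
            for seg_len in SEGMENT_LENGTHS:
                if seg_len > size:
                    continue
                s = make_string(char, sep, size, seg_len)
                name = f"rsplit_{charset_name}_size{size}_seg{seg_len}"
                runner.bench_func(name, bench_rsplit_sep, s, sep)
```

To run it, pass a `pyperf.Runner()` instance to `add_rsplit_benchmarks`, as the split script does with `add_benchmarks`.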