speedup fix - avoid creating new array everytime #145
jucas1 wants to merge 4 commits into decalage2:master
Conversation
Fix for a performance bottleneck with larger FAT tables.
Hi @jucas1, thanks a lot for this PR. Do you have a sample where it makes a big difference? I would just like to test the improvement.
@decalage2 Concatenating lists with the "+" operator has quadratic overall cost, while .extend is amortized linear. In my use case that gave a 5-10x speedup (for batch processing). The "+=" operator has the same effect, and surprisingly is even slightly faster than .extend:
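A minimal micro-benchmark illustrating the cost difference being described (the function names and sizes are illustrative, not taken from the PR): `acc = acc + chunk` copies the whole accumulator on every iteration, so total work is quadratic, while `.extend()` and `+=` (which calls `list.__iadd__` and extends in place) are amortized linear.

```python
import timeit

CHUNK = list(range(100))
N_CHUNKS = 1000

def build_plus():
    # "+" allocates a brand-new list each iteration: O(n^2) total copying.
    acc = []
    for _ in range(N_CHUNKS):
        acc = acc + CHUNK
    return acc

def build_extend():
    # .extend() appends in place: amortized O(n) total.
    acc = []
    for _ in range(N_CHUNKS):
        acc.extend(CHUNK)
    return acc

def build_iadd():
    # "+=" on a list extends in place, like .extend().
    acc = []
    for _ in range(N_CHUNKS):
        acc += CHUNK
    return acc

if __name__ == "__main__":
    for fn in (build_plus, build_extend, build_iadd):
        print(fn.__name__, timeit.timeit(fn, number=3))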
Approx. 20% speedup
# TODO: would it be more efficient using a dict or hash values, instead
# of a list of long ?
+1 for this change: storing the used stream indexes in a list makes parsing quadratic in the number of streams, because each `in` check scans the whole list; switching to a set is a very easy fix.
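A sketch of the pattern being described, under the assumption that the parser deduplicates stream indexes with an `in` test (the function names and the flat `fat` input are simplified stand-ins, not olefile's actual code). The list version is O(n²) because membership testing scans all prior entries; the set version keeps membership checks at average O(1) while a side list preserves insertion order.

```python
def collect_used_streams_list(fat):
    # Quadratic: "sid not in used" is a linear scan per element.
    used = []
    for sid in fat:
        if sid not in used:
            used.append(sid)
    return used

def collect_used_streams_set(fat):
    # Linear on average: set membership is O(1); "order" keeps the
    # original first-seen ordering if callers depend on it.
    used = set()
    order = []
    for sid in fat:
        if sid not in used:
            used.add(sid)
            order.append(sid)
    return order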