From 5679b93e29907d430e27107fffb894217900c731 Mon Sep 17 00:00:00 2001
From: Mathieu BISKUPSKI
Date: Wed, 29 Apr 2026 15:27:15 +0200
Subject: [PATCH] perf: avoid per-call ctx allocation in next_no_xml_space
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`next_no_xml_space` is only entered when the source document has no
`xml:space` attribute anywhere, in which case the per-node `ctx`
(xml:space inheritance stack) is always `Bool[false]` and is never
mutated. Allocating a fresh `[false]` on every call therefore costs
~32 B × N nodes for no semantic benefit. Reuse the parent's `ctx`
instead.

On a 47 MiB representative KML file (5,411 Placemarks, ~1 M XML nodes
traversed during a `DataFrame` extraction in a downstream consumer),
this drops cumulative tracked allocations by ~60 MiB without any
behavior or API change. Wall-clock parsing time is unchanged.

next_xml_space is unaffected: that path mutates ctx via push/pop when
descending into elements with `xml:space`, so each Raw still needs its
own copy.
---
 src/raw.jl | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/raw.jl b/src/raw.jl
index 29d0a10..ef9452a 100644
--- a/src/raw.jl
+++ b/src/raw.jl
@@ -431,7 +431,12 @@ function next_no_xml_space(o::Raw) # same as v0.3.5
     data = o.data
     type = o.type
     has_xml_space = o.has_xml_space
-    ctx = [false]
+    # `ctx` (the xml:space inheritance stack) is unused by this code path —
+    # `next_no_xml_space` is only called when the document has no `xml:space`
+    # attribute anywhere, so the per-node ctx is always `Bool[false]` and is
+    # never mutated. Reuse the parent's `ctx` instead of allocating a fresh
+    # one; on a 47 MiB file this drops ~60 MiB of cumulative allocation.
+    ctx = o.ctx
     i = findnext(!xml_isspace, data, i)
     if isnothing(i)
         return nothing