Describe the bug:
HTTPCoreAsyncInstrumentation.call in elasticapm/instrumentation/packages/httpx/async/httpcore.py constructs a corrupted URL that later crashes urllib.parse inside Span.autofill_service_target, propagating a ValueError out of the user's httpx request.
This is normally invisible because the inner httpcore span is dropped under the leaf httpx span. It becomes observable when the outer httpx span itself is a DroppedSpan — e.g. after transaction_max_spans is exhausted in a long-running transaction with many outbound calls. At that point the httpcore span is created as a real Span carrying the malformed URL, and crashes when it ends.
Two bugs in httpcore.py (async/httpcore.py, same pattern likely in sync/httpcore.py):
    async def call(self, module, method, wrapped, instance, args, kwargs):
        url, method, headers = utils.get_request_data(args, kwargs)
        scheme, host, port, target = url
        if port != default_ports.get(scheme):  # bug 1
            host += ":" + str(port)
        ...
        url = "%s://%s%s" % (scheme, host, url)  # bug 2
- When port is None (URL uses the scheme's default port), None != 443 evaluates to True, so host becomes e.g. api.example.com:None.
- The third %s is given the 4-tuple url = (scheme, host, port, target) instead of target. str() of that tuple is concatenated into the URL string.
Resulting context["http"]["url"]:
https://api.example.com:None('https', 'api.example.com', None, '/v1/accounts/<ACCOUNT_ID>/transactions/?limit=200&...&fingerprint=')
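The two faulty lines can be exercised in isolation to see how that string comes about. This is a minimal sketch that does not import elasticapm: `DEFAULT_PORTS` and `build_url_buggy` are hypothetical stand-ins for the agent's `default_ports` mapping and the relevant part of `call`, and the input tuple is assumed to already be decoded to `str`.

```python
# Standalone illustration of the two bugs; names here are hypothetical,
# not the agent's actual API.
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_url_buggy(url):
    """Mirror the two faulty lines on a (scheme, host, port, target) tuple."""
    scheme, host, port, target = url
    if port != DEFAULT_PORTS.get(scheme):  # bug 1: None != 443 is True
        host += ":" + str(port)
    # bug 2: the whole tuple is formatted instead of target
    return "%s://%s%s" % (scheme, host, url)


result = build_url_buggy(("https", "api.example.com", None, "/path/?param=value"))
print(result)
# https://api.example.com:None('https', 'api.example.com', None, '/path/?param=value')
```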
Span.autofill_service_target (elasticapm/traces.py) then reads that string back:
    url = self.context["http"]["url"]
    parsed_url = urllib.parse.urlparse(url)
    service_target["name"] = parsed_url.hostname
    if parsed_url.port:  # raises here
        service_target["name"] += f":{parsed_url.port}"
Because the netloc is api.example.com:None('https', 'api.example.com', None, ', parsed_url.port raises:
ValueError: Port could not be cast to integer value as "None('https', 'api.example.com', None, '"
The exception escapes async_capture_span.__aexit__ and surfaces in user code at the httpx call site, not in APM code.
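The parsing failure can be reproduced with the standard library alone, without the agent. The sketch below feeds a malformed URL of the shape described above (with a shortened, made-up path) to `urllib.parse.urlparse`. It shows that `hostname` is still readable and that only the `port` property raises.

```python
from urllib.parse import urlparse

# Malformed URL of the shape produced by the buggy string formatting.
bad_url = "https://api.example.com:None('https', 'api.example.com', None, '/path/?param=value')"
parsed = urlparse(bad_url)

# The netloc runs up to the first "/" and therefore swallows part of the tuple repr.
print(parsed.netloc)
print(parsed.hostname)  # api.example.com

# Accessing .port is what raises; urlparse() itself does not.
try:
    parsed.port
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None
print(error_message)
```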
Suggested fix
    if port is not None and port != default_ports.get(scheme):
        host += ":" + str(port)
    ...
    url = "%s://%s%s" % (scheme, host, target)
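As a quick sanity check, the same logic can be run outside the agent. This is only a sketch under the assumption that the URL arrives as a decoded `(scheme, host, port, target)` tuple. `DEFAULT_PORTS` and `build_url_fixed` are hypothetical names, not part of elasticapm.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the agent's default_ports mapping.
DEFAULT_PORTS = {"http": 80, "https": 443}


def build_url_fixed(url):
    """Apply the suggested fix to a (scheme, host, port, target) tuple."""
    scheme, host, port, target = url
    # Only append the port when it is explicit and not the scheme's default.
    if port is not None and port != DEFAULT_PORTS.get(scheme):
        host += ":" + str(port)
    return "%s://%s%s" % (scheme, host, target)


default_port_url = build_url_fixed(("https", "api.example.com", None, "/path/?param=value"))
custom_port_url = build_url_fixed(("https", "api.example.com", 8443, "/path/"))

print(default_port_url)  # https://api.example.com/path/?param=value
print(custom_port_url)   # https://api.example.com:8443/path/

# Both results parse cleanly, so reading .port no longer raises.
print(urlparse(default_port_url).port)  # None
print(urlparse(custom_port_url).port)   # 8443
```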
Workaround
Add DISABLE_INSTRUMENTATIONS=['httpcore'] to the APM config. httpx instrumentation alone still produces the outbound span; per its own docstring, httpcore is only there to propagate distributed-tracing headers.
To Reproduce
- Configure elastic-apm with the default transaction_max_spans (500) and start a long transaction.
- Inside the transaction make many outbound httpx.AsyncClient calls so the budget is exhausted and httpx's leaf span becomes a DroppedSpan.
- Issue another httpx GET to a URL that uses the scheme's default port (e.g. https://api.example.com/path/?param=value, no explicit port).
- On span end, Span.autofill_service_target raises ValueError: Port could not be cast to integer value as "None(...)" out of httpx.AsyncClient.send.
Environment (please complete the following information)
- OS: Linux (kernel 6.12)
- Python version: 3.12
- Framework and version: httpx async client over httpcore >= 0.13
- APM Server version: n/a (bug is purely client-side, in the agent)
- Agent version: 6.26.1 (latest on PyPI as of filing); same code present on main
Additional context
Source links to current main: