httpcore async instrumentation builds malformed URL and crashes urllib when parent httpx span is dropped #2667

@Pentusha

Describe the bug:

HTTPCoreAsyncInstrumentation.call in elasticapm/instrumentation/packages/httpx/async/httpcore.py constructs a corrupted URL that later crashes urllib.parse inside Span.autofill_service_target, propagating a ValueError out of the user's httpx request.

This is normally invisible because the inner httpcore span is dropped under the leaf httpx span. It becomes observable when the outer httpx span itself is a DroppedSpan — e.g. after transaction_max_spans is exhausted in a long-running transaction with many outbound calls. At that point the httpcore span is created as a real Span carrying the malformed URL, and crashes when it ends.

Two bugs in httpcore.py (async/httpcore.py, same pattern likely in sync/httpcore.py):

async def call(self, module, method, wrapped, instance, args, kwargs):
    url, method, headers = utils.get_request_data(args, kwargs)
    scheme, host, port, target = url

    if port != default_ports.get(scheme):          # bug 1
        host += ":" + str(port)
    ...
    url = "%s://%s%s" % (scheme, host, url)        # bug 2

  1. When port is None (URL uses the scheme's default port), None != 443 evaluates to True, so host becomes e.g. api.example.com:None.
  2. The third %s is given the 4-tuple url = (scheme, host, port, target) instead of target. str() of that tuple is concatenated into the URL string.
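
The two bugs can be reproduced without the agent or httpcore installed. The sketch below is a standalone approximation, not the agent's code: build_url_buggy is a hypothetical helper that mirrors only the two flagged lines, and default_ports is a stand-in mapping assumed to resemble the one the instrumentation consults.

```python
# Stand-in for the agent's port table (assumption: only these two entries matter here).
default_ports = {"http": 80, "https": 443}


def build_url_buggy(url):
    """Mirror the two flagged lines; `url` is a (scheme, host, port, target) tuple."""
    scheme, host, port, target = url
    # Bug 1: None != 443 is True, so ":None" is appended for default-port URLs.
    if port != default_ports.get(scheme):
        host += ":" + str(port)
    # Bug 2: the whole tuple is formatted into the string instead of `target`.
    return "%s://%s%s" % (scheme, host, url)


malformed = build_url_buggy(("https", "api.example.com", None, "/path/?param=value"))
print(malformed)
# https://api.example.com:None('https', 'api.example.com', None, '/path/?param=value')
```

The printed string has the same shape as the context["http"]["url"] value reported in this issue.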

Resulting context["http"]["url"]:

https://api.example.com:None('https', 'api.example.com', None, '/v1/accounts/<ACCOUNT_ID>/transactions/?limit=200&...&fingerprint=')

Span.autofill_service_target (elasticapm/traces.py) then reads that string back:

url = self.context["http"]["url"]
parsed_url = urllib.parse.urlparse(url)
service_target["name"] = parsed_url.hostname
if parsed_url.port:                                # raises here
    service_target["name"] += f":{parsed_url.port}"

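Because urlparse ends the netloc at the first "/", the netloc is api.example.com:None('https', 'api.example.com', None, ' and everything after the first colon is treated as the port, so parsed_url.port raises: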

ValueError: Port could not be cast to integer value as "None('https', 'api.example.com', None, '"
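
The parsing failure can be checked with the standard library alone. This is a minimal sketch: bad_url is a shortened, made-up example of the malformed string, and port_error is a hypothetical helper used only to capture the exception message.

```python
from urllib.parse import urlparse

# Shortened example of the malformed URL shape described above (illustrative value).
bad_url = "https://api.example.com:None('https', 'api.example.com', None, '/v1/path')"


def port_error(url):
    """Return the ValueError message raised by accessing .port, or None if it parses."""
    parsed = urlparse(url)  # parsing itself succeeds; the port is validated lazily
    try:
        parsed.port
    except ValueError as exc:
        return str(exc)
    return None


parsed = urlparse(bad_url)
print(parsed.hostname)      # hostname is still readable
print(port_error(bad_url))  # message comes from the .port property, not from urlparse()
```

Note that urlparse() does not raise. The error only appears when the port attribute is read, which is why the failure shows up when the span ends rather than where the URL is built.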

The exception escapes async_capture_span.__aexit__ and surfaces in user code at the httpx call site, not in APM code.

Suggested fix

if port is not None and port != default_ports.get(scheme):
    host += ":" + str(port)
...
url = "%s://%s%s" % (scheme, host, target)
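
A quick way to sanity-check the suggested fix in isolation is sketched below. build_url_fixed is a hypothetical helper wrapping the two corrected lines, and default_ports is again a stand-in mapping rather than the agent's own table.

```python
from urllib.parse import urlparse

# Stand-in for the agent's port table (assumption).
default_ports = {"http": 80, "https": 443}


def build_url_fixed(url):
    """Apply the two corrected lines; `url` is a (scheme, host, port, target) tuple."""
    scheme, host, port, target = url
    # Only append the port when it is explicit and differs from the scheme default.
    if port is not None and port != default_ports.get(scheme):
        host += ":" + str(port)
    # Format the path component, not the whole tuple.
    return "%s://%s%s" % (scheme, host, target)


cases = [
    ("https", "api.example.com", None, "/path/?param=value"),  # default port, not given
    ("https", "api.example.com", 443, "/path/?param=value"),   # default port, explicit
    ("https", "api.example.com", 8443, "/path/?param=value"),  # non-default port
]
for case in cases:
    fixed = build_url_fixed(case)
    # Reading .port no longer raises for any of these inputs.
    print(fixed, urlparse(fixed).port)
```

With these inputs the first two cases produce a URL without a port and the third keeps :8443, so reading parsed_url.port on the result is safe in each case.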

Workaround

Add DISABLE_INSTRUMENTATIONS=['httpcore'] to the APM config. httpx instrumentation alone still produces the outbound span; per its own docstring, httpcore is only there to propagate distributed-tracing headers.

To Reproduce

  1. Configure elastic-apm with the default transaction_max_spans (500) and start a long transaction.
  2. Inside the transaction make many outbound httpx.AsyncClient calls so the budget is exhausted and httpx's leaf span becomes a DroppedSpan.
  3. Issue another httpx GET to a URL that uses the scheme's default port (e.g. https://api.example.com/path/?param=value, no explicit port).
  4. On span end, Span.autofill_service_target raises ValueError: Port could not be cast to integer value as "None(...)" out of httpx.AsyncClient.send.

Environment (please complete the following information)

  • OS: Linux (kernel 6.12)
  • Python version: 3.12
  • Framework and version: httpx async client over httpcore >= 0.13
  • APM Server version: n/a (bug is purely client-side, in the agent)
  • Agent version: 6.26.1 (latest on PyPI as of filing); same code present on main

Additional context

Source links to current main:
