feat(search): add you_com as a search provider #28370

Open
brainsparker wants to merge 6 commits into BerriAI:litellm_internal_staging from brainsparker:you-com-search-provider

Conversation

@brainsparker

@brainsparker brainsparker commented May 20, 2026

Summary

Registers You.com as a first-class search_provider in the search_tools registry, alongside Tavily, Exa AI, Perplexity, Parallel AI, Brave, Google PSE, DataForSEO, Firecrawl, SearXNG, Linkup, DuckDuckGo, SearchAPI, and Serper.

Once registered, you_com works transparently with the litellm_web_search interception layer across all 25+ supported LLM providers (OpenAI, Anthropic, Vertex, Bedrock, etc.).

Keyless free tier by default. If YOUCOM_API_KEY is not set, the adapter falls through to You.com's keyless endpoint (api.you.com/v1/agents/search, ~100 queries/day, IP-throttled, no signup). This matches the existing keyless-default pattern used by duckduckgo and searxng and means LiteLLM users can try this provider with zero configuration. Setting YOUCOM_API_KEY upgrades to the keyed endpoint (ydc-index.io/v1/search) with higher rate limits.
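
The selection rule in the paragraph above can be sketched roughly as follows. This is an illustration under stated assumptions, not the adapter's actual code: the function name `resolve_you_com_endpoint` and the constant names are hypothetical. Only the two URLs, the `YOUCOM_API_KEY` and `YOUCOM_API_BASE` variables, and the `X-API-Key` header come from this PR.

```python
from typing import Dict, Mapping, Tuple

# URLs as stated in the PR description.
KEYED_URL = "https://ydc-index.io/v1/search"
KEYLESS_URL = "https://api.you.com/v1/agents/search"


def resolve_you_com_endpoint(env: Mapping[str, str]) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: pick the URL and auth headers from an environment mapping."""
    api_key = env.get("YOUCOM_API_KEY") or None
    headers: Dict[str, str] = {}
    if api_key:
        # The keyed tier authenticates with an X-API-Key header.
        headers["X-API-Key"] = api_key
    # An explicit base override is used as-is; otherwise the key decides the tier.
    override = env.get("YOUCOM_API_BASE")
    if override:
        return override, headers
    return (KEYED_URL if api_key else KEYLESS_URL), headers
```

Passing `os.environ` as `env` would reproduce the zero-config behavior described above: no key means the keyless URL and no auth header.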

Refs upstream expansion signal: #15942.

What's in here

New adapter — litellm/llms/you_com/search/transformation.py

  • Endpoint selection:
    • YOUCOM_API_KEY set → POST https://ydc-index.io/v1/search with X-API-Key header
    • no key → POST https://api.you.com/v1/agents/search (keyless free tier)
    • YOUCOM_API_BASE override honored as-is
  • Parameter mapping (Perplexity unified spec → You.com):
    • max_results → count
    • search_domain_filter → include_domains
    • country → country (lowercased to match Tavily's convention)
    • max_tokens_per_page → not applicable, ignored
  • Response normalization:
    • Flattens results.web + results.news into a single SearchResult list
    • snippet prefers snippets[0], falls back to description
    • page_age → date
  • Both endpoints return an identical JSON shape, verified against live API.
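
The parameter mapping and response normalization listed above might look something like the sketch below. The helper names `map_unified_params` and `flatten_results` are hypothetical, and plain dictionaries stand in for LiteLLM's `SearchResult` type. The field names (`count`, `include_domains`, `snippets`, `description`, `page_age`) are the ones named in this PR.

```python
from typing import Any, Dict, List


def map_unified_params(optional_params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: translate unified-spec names to You.com names."""
    request: Dict[str, Any] = {}
    if "max_results" in optional_params:
        request["count"] = optional_params["max_results"]
    if "search_domain_filter" in optional_params:
        request["include_domains"] = optional_params["search_domain_filter"]
    if "country" in optional_params:
        request["country"] = str(optional_params["country"]).lower()
    # max_tokens_per_page has no You.com equivalent, so it is dropped.
    return request


def flatten_results(raw: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: merge results.web and results.news, web first."""
    results = raw.get("results") or {}
    flat: List[Dict[str, Any]] = []
    for item in (results.get("web") or []) + (results.get("news") or []):
        snippets = item.get("snippets") or []
        flat.append(
            {
                "title": item.get("title", ""),
                "url": item.get("url", ""),
                # Prefer the first snippet, fall back to the description.
                "snippet": snippets[0] if snippets else item.get("description", ""),
                "date": item.get("page_age"),
            }
        )
    return flat
```

Because unified-spec names are copied selectively rather than passed through, unknown or inapplicable parameters never reach the upstream request.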

Registry wiring

  • SearchProviders.YOU_COM = "you_com" in litellm/types/utils.py
  • YouComSearchConfig wired into ProviderConfigManager.get_provider_search_config() in litellm/utils.py

Pricing

  • Placeholder you_com/search entry in model_prices_and_context_window.json at 0.0 per query. Happy to update to the public pricing number you'd prefer — let me know.

Tests — tests/search_tests/test_you_com_search.py (7 tests, all pass locally)

  1. test_you_com_search_request_payload — keyed URL, X-API-Key header, count mapping, response normalization
  2. test_you_com_search_domain_filter_and_country: include_domains / lowercased country mapping; unified-spec param names don't leak through
  3. test_you_com_search_snippet_fallback_to_description — snippet falls back when snippets is empty
  4. test_you_com_search_news_results_appended — news results flatten in after web
  5. test_you_com_search_complete_url_handles_trailing_slash — normalize trailing slashes on custom api_base
  6. test_you_com_search_keyless_free_tier — keyless URL + absence of X-API-Key when no key configured
  7. test_you_com_search_validate_environment_keyless — config doesn't raise on missing key (keyless is valid)

Usage

Zero-config (keyless free tier):

import litellm
resp = litellm.search(query="latest AI developments", search_provider="you_com")

With API key (higher limits):

import os, litellm
os.environ["YOUCOM_API_KEY"] = "sk-..."
resp = litellm.search(query="latest AI developments", search_provider="you_com", max_results=5)

Router YAML:

search_tools:
  - search_tool_name: "my-you-com-search"
    litellm_params:
      search_provider: "you_com"

Test plan

  • pytest tests/search_tests/test_you_com_search.py -v — 7/7 pass locally
  • tests/search_tests/test_tavily_search.py — passes (adjacent regression sanity)
  • Live verification: both ydc-index.io/v1/search and api.you.com/v1/agents/search return matching JSON shapes
  • CI run on this PR

Registers You.com Search API as a first-class `search_provider` in the
`search_tools` registry, alongside Tavily, Exa, Perplexity, etc.

- New adapter: litellm/llms/you_com/search/transformation.py
  - POSTs to https://ydc-index.io/v1/search
  - Auth: X-API-Key from YOUCOM_API_KEY (or explicit api_key)
  - Maps Perplexity unified spec: max_results -> count,
    search_domain_filter -> include_domains, country -> country
  - Flattens results.web + results.news into a single SearchResult list;
    snippet prefers snippets[0], falls back to description; page_age -> date
- Registry: SearchProviders.YOU_COM in litellm/types/utils.py and wired
  into ProviderConfigManager.get_provider_search_config()
- Pricing entry: model_prices_and_context_window.json (placeholder $0.0;
  happy to adjust to maintainers' preferred public number)
- Docs: example router config snippet and example proxy yaml updated
- Tests: tests/search_tests/test_you_com_search.py - 5 mocked tests
  (payload shape, domain filter mapping, snippet fallback, news flattening,
  missing-api-key error)

Refs upstream expansion signal: BerriAI#15942
@CLAassistant

CLAassistant commented May 20, 2026

CLA assistant check
All committers have signed the CLA.

@greptile-apps
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR registers You.com as a new search_provider in the search_tools registry, following the same adapter pattern as Tavily, Exa, Perplexity, and the other existing providers.

  • New adapter (litellm/llms/you_com/search/transformation.py): POSTs to https://ydc-index.io/v1/search, authenticates via X-API-Key, maps the unified Perplexity spec params (max_results → count, search_domain_filter → include_domains), and flattens results.web + results.news into a single SearchResult list.
  • Registry wiring: SearchProviders.YOU_COM added to the enum in litellm/types/utils.py and mapped to YouComSearchConfig in ProviderConfigManager.get_provider_search_config() in litellm/utils.py.
  • Pricing placeholder: you_com/search entry added to model_prices_and_context_window.json at $0.0 per query.

Confidence Score: 4/5

Safe to merge; the changes are additive and self-contained, touching only the new provider module and its registry hookup.

The core wiring and response normalisation are correct and mirror existing providers closely. Two small issues in the adapter itself are worth addressing before shipping: the URL normalisation logic can double-append /v1/search if a custom base URL has a trailing slash, and the country parameter is not lowercased while every other provider that supports it does lowercase it. Tests also mutate the process environment without cleanup, which could silently affect unrelated test runs.

litellm/llms/you_com/search/transformation.py (URL construction and country normalisation) and tests/search_tests/test_you_com_search.py (env var cleanup).

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/llms/you_com/search/transformation.py | New You.com search adapter: transforms queries, maps params, normalises responses. Minor URL-path construction edge case and country case inconsistency. |
| litellm/types/utils.py | Adds YOU_COM = "you_com" to the SearchProviders enum; straightforward and consistent with other entries. |
| litellm/utils.py | Wires YouComSearchConfig into ProviderConfigManager; mirrors the pattern used by all other search providers. |
| tests/search_tests/test_you_com_search.py | Five mock-based tests covering core paths. Sets YOUCOM_API_KEY directly in os.environ without fixture-based cleanup, which can pollute subsequent tests. |
| model_prices_and_context_window.json | Adds you_com/search entry at 0.0 input_cost_per_query; placeholder pricing, correctly structured. |
| litellm/integrations/websearch_interception/ARCHITECTURE.md | Adds you_com example to the provider snippet; documentation change that should be in the litellm-docs repo per project policy. |
| litellm/proxy/example_config_yaml/websearch_interception_config.yaml | Adds commented-out you_com example; harmless documentation comment. |

Reviews (1): Last reviewed commit: "feat(search): add you_com as a search pr..."

Comment on lines +76 to +77

    if not api_base.endswith("/v1/search"):
        api_base = f"{api_base.rstrip('/')}/v1/search"

Contributor

P2 The trailing-slash guard and the path-append are applied in the wrong order. endswith("/v1/search") is evaluated on the raw string, so a custom base ending in /v1/search/ fails the check, then rstrip('/') removes only the slash, and the result is …/v1/search/v1/search. The base should be normalised before the check.

Suggested change

Before:

    if not api_base.endswith("/v1/search"):
        api_base = f"{api_base.rstrip('/')}/v1/search"

After:

    api_base = api_base.rstrip("/")
    if not api_base.endswith("/v1/search"):
        api_base = f"{api_base}/v1/search"
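
To make the ordering problem concrete, the two variants can be run side by side. The wrapper function names and the `example.test` base URL are hypothetical; the bodies are the two snippets quoted in this comment.

```python
def complete_url_buggy(api_base: str) -> str:
    # Original order: the suffix check sees the raw string.
    if not api_base.endswith("/v1/search"):
        api_base = f"{api_base.rstrip('/')}/v1/search"
    return api_base


def complete_url_fixed(api_base: str) -> str:
    # Suggested order: strip trailing slashes first, then check the suffix.
    api_base = api_base.rstrip("/")
    if not api_base.endswith("/v1/search"):
        api_base = f"{api_base}/v1/search"
    return api_base


base = "https://example.test/v1/search/"  # hypothetical custom base
print(complete_url_buggy(base))  # https://example.test/v1/search/v1/search
print(complete_url_fixed(base))  # https://example.test/v1/search
```

Both variants agree on bases without a trailing slash; they differ only when the base already ends in `/v1/search/`.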

Comment on lines +110 to +111

    if "country" in optional_params:
        request_data["country"] = optional_params["country"]

Contributor

P2 TavilySearchConfig.transform_search_request normalises the country value with .lower() before sending it upstream (see litellm/llms/tavily/search/transformation.py:141). The You.com adapter passes the value through as-is, so a caller using country="US" (uppercase, which is the idiomatic form in the unified spec) may receive different behaviour depending on how You.com's API validates it. Consider applying .lower() here for consistency or documenting the accepted case.

Suggested change

Before:

    if "country" in optional_params:
        request_data["country"] = optional_params["country"]

After:

    if "country" in optional_params:
        request_data["country"] = optional_params["country"].lower()

Comment on lines +23 to +24
Validate the You.com search request payload structure without real API calls.
"""
Contributor

P2 Environment variable leaks between tests

Each async test opens by writing os.environ["YOUCOM_API_KEY"] = "test-api-key" directly and never removes it; test_you_com_search_raises_without_api_key then calls os.environ.pop("YOUCOM_API_KEY", None) to simulate a missing key, but that mutation is permanent for the rest of the session. If test ordering changes or tests are parallelised, this creates false positives or false negatives. Prefer monkeypatch.setenv / monkeypatch.delenv (or a pytest.fixture with yield) so env state is restored automatically after each test.
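
The restore-after-test behavior requested here can be illustrated with the standard library alone. This sketch swaps in `unittest.mock.patch.dict` purely so that it runs without pytest; inside a pytest test, `monkeypatch.setenv` and `monkeypatch.delenv` give the same automatic restoration. The key values are placeholders.

```python
import os
from unittest import mock

os.environ.pop("YOUCOM_API_KEY", None)  # start from a known state

# Keyed case: the variable exists only inside the block.
with mock.patch.dict(os.environ, {"YOUCOM_API_KEY": "test-api-key"}):
    assert os.environ["YOUCOM_API_KEY"] == "test-api-key"
key_leaked = "YOUCOM_API_KEY" in os.environ

# Keyless case: a removal made inside the block is undone on exit.
os.environ["YOUCOM_API_KEY"] = "outer-value"
with mock.patch.dict(os.environ):
    os.environ.pop("YOUCOM_API_KEY", None)
    assert "YOUCOM_API_KEY" not in os.environ
restored_value = os.environ.pop("YOUCOM_API_KEY")
```

In both cases the environment after the block matches the environment before it, so test ordering and parallel execution no longer change the outcome.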

Comment on lines 244 to 250
  - search_tool_name: "my-tavily-tool"
    litellm_params:
      search_provider: "tavily"
  - search_tool_name: "my-you-com-tool"
    litellm_params:
      search_provider: "you_com"
```
Contributor

P2 Documentation changes should live in the litellm-docs repo

Per project policy, documentation additions belong in the external docs repo rather than here. This snippet addition and the corresponding change to litellm/proxy/example_config_yaml/websearch_interception_config.yaml appear to be documentation-only content that should be tracked separately.

Rule Used: Prevent documentation from being added - needs to ... (source)


@codecov

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 94.28571% with 4 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| litellm/llms/you_com/search/transformation.py | 95.45% | 3 Missing ⚠️ |
| litellm/utils.py | 50.00% | 1 Missing ⚠️ |


…o test

Addresses Greptile inline review comments on BerriAI#28370:

- get_complete_url: strip trailing slashes from api_base *before* the
  endswith("/v1/search") check, so a custom base like ".../v1/search/"
  doesn't become ".../v1/search/v1/search".
- transform_search_request: .lower() country before sending, matching
  Tavily's convention so callers using the unified spec form ("US") get
  consistent behavior across providers.
- Tests: replace direct os.environ writes with an autouse monkeypatch
  fixture so YOUCOM_API_KEY is set per-test and removed afterwards.
  The missing-key test now uses monkeypatch.delenv. New test asserts the
  trailing-slash normalization above.

Reverts the ARCHITECTURE.md / example yaml edits per the reviewer note
that documentation changes belong in the litellm-docs repo.
@brainsparker
Author

Thanks for the review! Pushed fixups in 903e44e:

  • get_complete_url — strip trailing slashes from api_base before the endswith("/v1/search") check, so a custom base like .../v1/search/ doesn't become .../v1/search/v1/search. New test test_you_com_search_complete_url_handles_trailing_slash covers this.
  • Country lowercase: transform_search_request now calls .lower() on country before sending, matching Tavily's behavior. Test updated to assert "us" instead of "US".
  • Test env-var leak — replaced direct os.environ writes with an autouse monkeypatch fixture (scoped per-test). The missing-key test uses monkeypatch.delenv instead of a permanent pop.
  • Docs — reverted the ARCHITECTURE.md and example yaml edits per the note that documentation changes belong in the litellm-docs repo. Happy to send a follow-up PR there if you point me at the repo.

All 6 tests pass locally.

You.com offers an IP-throttled keyless endpoint that returns the same
response shape as the keyed one (~100 queries/day, no signup). This is a
significant onboarding lever - mirrors the keyless DuckDuckGo/SearXNG
providers already in the search_tools registry.

Behavior:
- YOUCOM_API_KEY set        -> keyed:  POST https://ydc-index.io/v1/search
                                       (X-API-Key header)
- no key                    -> free:   POST https://api.you.com/v1/agents/search
                                       (no auth)
- YOUCOM_API_BASE override  -> honored as-is

Tests:
- New: test_you_com_search_keyless_free_tier - asserts URL + absence of
  X-API-Key when no key is configured.
- New: test_you_com_search_validate_environment_keyless - asserts the
  config no longer raises when the key is absent.
- Removed: test_you_com_search_raises_without_api_key (the precondition
  no longer holds).
- Existing payload/domain-filter/etc tests still cover keyed mode via
  the autouse YOUCOM_API_KEY fixture.

Verified both endpoints accept POST + return identical JSON shape:
  results.web[] / results.news[] with title, url, snippets, description,
  page_age.
@brainsparker
Author

Added keyless free-tier support in 6eb1912.

If YOUCOM_API_KEY is not set, the adapter now targets api.you.com/v1/agents/search (IP-throttled ~100 queries/day, no signup) instead of erroring. Setting the key upgrades to the keyed endpoint with higher rate limits. Mirrors the existing keyless-default pattern in duckduckgo / searxng.

| Mode | Endpoint | Auth |
| --- | --- | --- |
| YOUCOM_API_KEY set | POST https://ydc-index.io/v1/search | X-API-Key |
| no key | POST https://api.you.com/v1/agents/search | none |

Both endpoints return the same results.web[] / results.news[] JSON shape (verified against live API). Two new tests (test_you_com_search_keyless_free_tier, test_you_com_search_validate_environment_keyless) cover the keyless path. 7/7 tests pass locally.

PR description updated to reflect the dual-tier behavior.

Adding `litellm/llms/you_com/` requires a corresponding entry in
provider_endpoints_support.json or the
code-quality/check_provider_folders_documented CI check fails.

Follows the compact tavily/serper pattern - endpoints: { search: true }.
Local run of the check now reports "All 114 provider folders are documented".
@krrish-berri-2
Contributor

Merge Confidence: 3/5 ❌ BLOCKED
⚠️ 1 check failing: codecov/patch
1 PR-related CI failure needs to be fixed first.

Score docked for: 1 PR-related CI failure (codecov/patch).

Drill-down

PR-related failures
• codecov/patch — The new adapter file litellm/llms/you_com/search/transformation.py is in this diff and has 0% patch coverage (65 lines missing); codecov/patch fails because the added production code is not exercised by the CI test suite even though unit tests exist locally.

@krrish-berri-2
Contributor

please include a video of this working as expected

@krrish-berri-2
Contributor

you should also file an adjacent pr on litellm-docs so people know this exists

The litellm CI workflows scope unit tests to `tests/test_litellm/...`
(see test-unit-llm-providers.yml: `tests/test_litellm/llms` path), so
tests living under `tests/search_tests/` are never run in CI - which is
why codecov reports 0% patch coverage for the new adapter even though
the unit tests exist and pass locally.

Move test_you_com_search.py into `tests/test_litellm/llms/you_com/` so
the test-unit-llm-providers job picks it up. 7/7 tests still pass at
the new location.

(Sibling search-only providers - tavily, exa_ai, brave, etc. - still
live only in `tests/search_tests/` and would benefit from the same
move, but that is out of scope for this PR.)
@brainsparker
Author

Thanks @krrish-berri-2 — addressing your feedback:

1. codecov/patch blocker — root cause was that tests/search_tests/ isn't included in any CI workflow path (all test-unit-* jobs scope to tests/test_litellm/... or tests/proxy_*), so the unit tests existed but CI never ran them. Moved the file to tests/test_litellm/llms/you_com/test_you_com_search.py in 65ac846 so the test-unit-llm-providers job picks it up. 7/7 tests still pass at the new location.

2. Adjacent docs PR — filed at BerriAI/litellm-docs#182 (new docs/search/you_com.md page, sidebar entry, supported-providers table row).

3. Demo video — recording in progress, will attach to this PR shortly.

FYI on (1): every other search-only provider in the registry (tavily, exa_ai, brave, serper, linkup, firecrawl, etc.) currently has the same hidden-from-CI issue — their tests also live only in tests/search_tests/. Happy to send a follow-up PR moving those too if useful, but kept the scope tight here.

The keyless free-tier endpoint (api.you.com/v1/agents/search) advertises
Content-Encoding: gzip but returns a body that httpx's decoder rejects
with `zlib.error: Error -3 while decompressing data: incorrect header
check`, surfacing as litellm.APIConnectionError in user code. curl works
because it doesn't request compression by default.

Pin Accept-Encoding: identity in validate_environment so the upstream
server skips compression entirely. Harmless on the keyed endpoint
(ydc-index.io/v1/search) which negotiates content-encoding correctly.

The header uses setdefault so a caller-supplied Accept-Encoding still
takes precedence. (Server-side bug has been flagged to the You.com team
separately - once fixed there, this workaround can be removed.)

New unit test: test_you_com_search_pins_identity_accept_encoding.
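
The `setdefault` behavior described in this commit message can be sketched as below. The helper name `with_identity_encoding` is hypothetical and the actual change lives in `validate_environment`; this only shows why a caller-supplied value survives.

```python
from typing import Dict, Optional


def with_identity_encoding(headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Hypothetical helper: default Accept-Encoding to identity."""
    merged = dict(headers or {})
    # setdefault only writes when the caller has not supplied the header.
    merged.setdefault("Accept-Encoding", "identity")
    return merged
```

One caveat of this sketch: a plain dict lookup is case-sensitive, so a caller header spelled `accept-encoding` would not be recognized as already set. Whether the real adapter normalizes header case is not stated in this PR.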
@brainsparker
Author

Demo recordings, as promised:

Keyless (no YOUCOM_API_KEY set, routes to api.you.com/v1/agents/search):
asciicast

Keyed (YOUCOM_API_KEY set, routes to ydc-index.io/v1/search):
asciicast

Both hit POST /v1/search/{search_tool_name} with search_provider: you_com configured — exercises YouComSearchConfig via ProviderConfigManager. The endpoint divergence is visible in the --detailed_debug log lines at the end of each cast.

