Skip to content

Refactor: slim akd-core — drop search/gap/storm agents and langchain deps#432

Merged
NISH1001 merged 18 commits into
developfrom
refactor/remove-bloat
Apr 23, 2026
Merged

Refactor: slim akd-core — drop search/gap/storm agents and langchain deps#432
NISH1001 merged 18 commits into
developfrom
refactor/remove-bloat

Conversation

@NISH1001
Copy link
Copy Markdown
Collaborator

Continuation of the akd-core slim-down (follow-up to #426). Rips out every in-core agent that's either superseded by a future tool+ReAct loop implementation or scoped to live in external packages (akd_ext, backend). Also retires the langchain dependency tree in favor of LiteLLM — already akd's single LLM client of record for LiteLLMInstructorBaseAgent.

~14,900 lines deleted, ~300 added across 86 files. No runtime behavior changes for remaining agents/tools.

Major Changes

  • Removed every in-core search agentDeepLitSearchAgent, ControlledSearchAgent, CodeSearchAgent, QuestionAnsweringAgent, AspectSearchAgent, plus the SearchAgent / LitBaseAgent base classes and the LitSearchAgent* / SearchAgent* schema family. Also akd/agents/search/components/ (triage, clarification, instruction_builder, research_synthesis, content_condensation). Search will come back as a thin tool+ReAct loop in a later PR.
  • Removed GapAgent — entire akd/agents/gap_analysis/ subpackage and its profilers. Downstream packages (akd_ext) already ship their own gap agent and can register it at runtime via AgentRegistry.register_agent(...).
  • Removed StormAgent — entire akd/agents/storm/ subpackage + akd/configs/storm_config.py. Had zero tests and only one external reference (a gitignored marimo notebook).
  • Removed LitAgent deprecation shimakd/agents/litsearch.py. No remaining imports in-tree.
  • Planner registry is now empty by designAVAILABLE_AGENTS cleared in akd/planner/registry.py. The machinery stays intact for runtime registration via AgentRegistry.register_agent(YourAgent); backends / akd_ext populate it at startup. JSON caches in akd/mapping/ emptied to match.
  • Migrated LLMFallbackMapper from langchain_openai.ChatOpenAI to LiteLLMakd/mapping/mappers.py now uses litellm.acompletion(...) directly. Completes the pre-existing # TODO: implement liteLLM interface and replace this comment.
  • Gated AKDSerializer's langgraph dependency via lazy importAKDSerializer now extends JsonPlusSerializer only when langgraph is importable; the _convert_pydantic_to_dict helper (the only part the mapping system uses) works with no langgraph installed. Checkpoint serde role (dumps / loads / dumps_typed / loads_typed) is preserved for backends that install the serializer extra.
  • Moved langchain/langgraph out of core dependenciespyproject.toml no longer pulls langchain-community, langchain-core, langchain-google-community, langchain-huggingface, langchain-openai, or langgraph. A new serializer optional extra ships langgraph==1.0.3 for backends that use AKDSerializer() as a checkpoint serde. uv.lock regenerated — ~20 transitive langchain-* packages gone.

Minor Changes

  • Removed dead extraction schemasExtractionSchema, SingleEstimation, ResearchData, and the LitSearchResult = SearchResultItem alias dropped from akd/structures.py and its re-exports in akd/__init__.py. Nothing in-tree still referenced them after the ExtractionAgent had been removed in an earlier PR. PaperDataItem kept (used by the Semantic Scholar tool).
  • Scripts / examples deletedscripts/demo_deep_search.py, scripts/test_guardrails_code_search.py, scripts/run_lit_agent.py, plus all of scripts/profilers/ (both deep-search and gap-agent profilers, the memray profiler, and PROFILING_README.md). examples/deep_search_test.py and examples/code_search_test.py deleted.
  • Tests deletedtests/agents/search/, tests/agents/gap_analysis/, tests/code_search_tool_test.py, and four planner test files (test_registry.py, test_workflow_builder.py, test_field_mapping.py, test_llm_planner.py) whose fixtures were hard-coded to the now-deleted deep_search / code_search / gap_analysis agent IDs. tests/planner/test_workflow_format.py kept (tests the workflow-format validator with generic placeholders). tests/mapping/test_mappers.py pruned of the 4 cases coupled to the removed LitSearchAgent* schemas.
  • create_lit_agent() dropped from akd/agents/factory.py.
  • Code-search tools removedakd/tools/search/code_search.py deleted (only consumer was the agent being removed); akd/tools/search/composite.py docstring example rewritten to use remaining tools.
  • AKDSerializer raises a clear RuntimeError pointing at pip install akd[serializer] if dumps / dumps_typed is called without langgraph installed.
  • Stale docs cleaneddocs/node-template.md deleted (entire file documented a NodeTemplate class that never shipped); langgraph-specific code block removed from docs/design_philosophy.md; docs/specs/STREAMING.md samples retargeted from DeepLitSearchAgent to a generic MyResearchAgent placeholder; akd/mapping/README.md and akd/planner/README.md retargeted to generic agent IDs.
  • README — new "Optional extras" table documenting serializer / ml / dev / local extras, plus install examples for each.
  • Top-level docs refreshedCLAUDE.md agents/configs tree trimmed to the remaining modules; README.md agents table pruned to the utility agents that remain (IntentAgent, QueryAgent, FollowUpQueryAgent, RelevancyAgent, MultiRubricRelevancyAgent, BaseAgent, LiteLLMInstructorBaseAgent) plus a note on runtime registration; speculative "Conflict Agent" / "Full Attribution Chain" roadmap items removed.
  • requirements.txt — dropped the 6 langchain/langgraph lines so CI (.github/workflows/*.yml uses pip install -r requirements.txt) stays in sync with pyproject.toml. Broader sync of that file vs pyproject is out of scope.
  • akd/utils.pyPartialModel docstring example rewritten to stop importing the deleted LitSearchAgentOutputSchema.
  • akd/tools/utils.py — stale to_langchain_structured_tool() line removed from tool_wrapper docstring (method never existed).

Dead code removed

  • akd/agents/search/ (subpackage) + akd/agents/search/components/ + akd/agents/search/aspect_search/
  • akd/agents/gap_analysis/ (subpackage)
  • akd/agents/storm/ (subpackage) + akd/configs/storm_config.py
  • akd/agents/litsearch.py (deprecation shim)
  • akd/tools/search/code_search.py
  • Test, example, script, and profiler files coupled to the above

Backend coordination (akd-framework)

akd-framework uses AKDSerializer as the serde for AsyncPostgresSaver (langgraph checkpointer). After this PR lands, backend needs to install akd[serializer] (or pin langgraph itself) so AKDSerializer.dumps / dumps_typed / loads / loads_typed remain wired up. The lazy import makes the class importable either way — only the checkpoint-serde methods require langgraph.

Test plan

  • uv run pytest tests/mapping/ — 20 passed after mapper migration; LLM-fallback tests correctly skip without an OpenAI key
  • uv run python -c "from akd.agents import search" / gap_analysis / storm / litsearch all raise ModuleNotFoundError
  • uv run python -c "from akd.planner.registry import AgentRegistry; ..." — empty registry, runtime register_agent(QueryAgent) works
  • AKDSerializer._convert_pydantic_to_dict(M(...)) works; dumps_typed works with langgraph installed
  • uv lock regenerates cleanly; langchain-community / -google-community / -huggingface / -openai all removed from lockfile
  • Full uv run pytest locally on final branch state (after last push)
  • CI (.github/workflows/cicd.yml) green on the branch
  • akd-framework install bumped to akd[serializer] before this merges or in the same window

@github-actions
Copy link
Copy Markdown

✅ Tests passed

📊 Test Results

  • Passed: 428
  • Failed: 0
  • Skipped: 34
  • Warnings: 162
  • Coverage: 73%

Branch: refactor/remove-bloat
PR: #432
Commit: b892c7f

📋 Full coverage report and logs are available in the workflow run.

Copy link
Copy Markdown
Collaborator

@muthukumaranR muthukumaranR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve on the reduction, with some notes:

  1. why keep these things?
    | Utility | IntentAgent | User intent classification |
    | QueryAgent | Query reformulation and refinement |
    | FollowUpQueryAgent | Follow-up query generation |

  2. looking at toml, I feel it’s still a little bloated. google, duckduckgo etc. should we do optional dependency groups / [tagged] installation, where the base is just bare essentials.

  • example groups: [evals, search_tools, etc]
  1. is the intent to add AKDSerializer /  langgraph to go towards unified AKD [core + backend] and towards the SDK?

@NISH1001
Copy link
Copy Markdown
Collaborator Author

Approve on the reduction, with some notes:

1. why keep these things?
   | **Utility** | `IntentAgent` | User intent classification |
   | `QueryAgent` | Query reformulation and refinement |
   | `FollowUpQueryAgent` | Follow-up query generation |

2. looking at toml, I feel it’s still a little bloated. google, duckduckgo etc. should we do optional dependency groups / [tagged] installation, where the base is just bare essentials.


* example groups:  [evals, search_tools, etc]


3. is the intent to add AKDSerializer /  langgraph to go towards unified AKD [core + backend] and towards the SDK?

Good call. Let me trim down the toml but need to validate as well.

langgraph to go towards unified AKD [core + backend] and towards the SDK

Kind of. Since we no longer have much langchain/langgraph dependency, I think it's best for compatibility to maintain an extra deps for it and lazy load for time being. The akd.serializers could be more maybe best on other serialization needed as per.

@github-actions
Copy link
Copy Markdown

✅ Tests passed

📊 Test Results

  • Passed: 428
  • Failed: 0
  • Skipped: 34
  • Warnings: 165
  • Coverage: 73%

Branch: refactor/remove-bloat
PR: #432
Commit: 09e9832

📋 Full coverage report and logs are available in the workflow run.

@github-actions
Copy link
Copy Markdown

✅ Tests passed

📊 Test Results

  • Passed: 428
  • Failed: 0
  • Skipped: 34
  • Warnings: 165
  • Coverage: 73%

Branch: refactor/remove-bloat
PR: #432
Commit: 1611b76

📋 Full coverage report and logs are available in the workflow run.

@NISH1001 NISH1001 merged commit 7ef2e39 into develop Apr 23, 2026
1 check passed
@NISH1001 NISH1001 deleted the refactor/remove-bloat branch April 23, 2026 15:52
@NISH1001 NISH1001 mentioned this pull request Apr 23, 2026
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants