After aggressive refactoring and architecture realignment, our testing philosophy is:
- Clean boundaries: Unit tests for isolated components, integration tests for cross-component behavior
- Fast execution: Unit tests run in milliseconds, mypy completes in seconds
- Modern patterns: Type-safe fixtures, clear separation of concerns
- Minimal mocking: Only mock external services, test real internal behavior
```bash
# Run all tests
./Taskfile test

# Run specific test categories
pytest tests/unit/auth/        # Authentication tests
pytest tests/unit/services/    # Service layer tests
pytest tests/integration/      # Cross-component integration tests (core)
pytest tests/plugins           # All plugin tests
pytest tests/plugins/metrics   # Single plugin tests
pytest tests/performance/      # Performance benchmarks

# Run with coverage
./Taskfile test-coverage

# Type checking and quality (now sub-second)
./Taskfile typecheck
./Taskfile pre-commit
```

Clean architecture after aggressive refactoring. Removed 180+ tests and 3000+ lines of problematic code:
```
tests/
├── conftest.py                  # Essential fixtures (515 lines, was 1117)
├── unit/                        # True unit tests (mock at service boundaries)
│   ├── api/                     # Remaining lightweight API tests
│   │   ├── test_mcp_route.py                    # MCP permission routes
│   │   ├── test_plugins_status.py               # Plugin status endpoint
│   │   ├── test_reset_endpoint.py               # Reset endpoint
│   │   └── test_analytics_pagination_service.py # Pagination service
│   ├── services/                # Core service tests
│   │   ├── test_adapters.py                 # OpenAI↔Anthropic conversion
│   │   ├── test_streaming.py                # Streaming functionality
│   │   ├── test_confirmation_service.py     # Confirmation service (cleaned)
│   │   ├── test_scheduler.py                # Scheduler (simplified)
│   │   ├── test_scheduler_tasks.py          # Task management
│   │   ├── test_claude_sdk_client.py        # Claude SDK client
│   │   └── test_pricing.py                  # Token pricing
│   ├── auth/                    # Authentication tests
│   │   ├── test_auth.py                     # Core auth (cleaned of HTTP testing)
│   │   ├── test_oauth_registry.py           # OAuth registry
│   │   ├── test_authentication_error.py     # Error handling
│   │   └── test_refactored_auth.py          # Refactored patterns
│   ├── config/                  # Configuration tests
│   │   ├── test_claude_sdk_options.py       # Claude SDK config
│   │   ├── test_claude_sdk_parser.py        # Config parsing
│   │   ├── test_config_precedence.py        # Priority handling
│   │   └── test_terminal_handler.py         # Terminal handling
│   ├── utils/                   # Utility tests
│   │   ├── test_binary_resolver.py          # Binary resolution
│   │   ├── test_startup_helpers.py          # Startup utilities
│   │   └── test_version_checker.py          # Version checking
│   ├── cli/                     # CLI command tests
│   │   ├── test_cli_config.py               # CLI configuration
│   │   ├── test_cli_serve.py                # Server CLI
│   │   └── test_cli_confirmation_handler.py # Confirmation CLI
│   ├── test_caching.py          # Caching functionality
│   ├── test_plugin_system.py    # Plugin system (cleaned)
│   └── test_hook_ordering.py    # Hook ordering
├── integration/                 # Cross-component tests (moved from unit)
│   ├── test_analytics_pagination.py         # Full analytics flow
│   ├── test_confirmation_integration.py     # Permission flows
│   ├── test_metrics_plugin.py               # Metrics collection
│   ├── test_plugin_format_adapters_v2.py    # Format adapter system
│   ├── test_plugins_health.py               # Plugin health checks
│   └── docker/                  # Docker integration tests (moved)
│       └── test_docker.py       # Docker functionality
├── performance/                 # Performance tests (separated)
│   └── test_format_adapter_performance.py   # Benchmarks
├── factories/                   # Simplified factories (362 lines, was 651)
│   ├── __init__.py              # Factory exports
│   └── fastapi_factory.py       # Streamlined FastAPI factories
├── fixtures/                    # Essential fixtures only
│   ├── claude_sdk/              # Claude SDK mocking
│   ├── external_apis/           # External API mocking
│   └── responses.json           # Mock data
├── helpers/                     # Test utilities
├── plugins/                     # Plugin tests (centralized)
│   └── my_plugin/
│       ├── unit/                # Plugin unit tests
│       └── integration/         # Plugin integration tests
└── test_handler_config.py       # Handler configuration tests
```
Unit Tests (`tests/unit/`):
- Mock at service boundaries only - never mock internal components
- Test pure functions and single components in isolation
- No HTTP layer testing - use service layer mocks instead
- No timing dependencies - all `asyncio.sleep()` calls removed
- No database operations - moved to integration tests
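A minimal sketch of the boundary rule, using a hypothetical `BillingService` and pricing client (names invented for illustration, not from the codebase): the external API client is the only thing mocked, while the service's own logic runs for real.

```python
from unittest.mock import Mock


class BillingService:
    """Hypothetical service wrapping an external pricing API."""

    def __init__(self, pricing_client: object) -> None:
        self._client = pricing_client

    def cost_for_tokens(self, model: str, tokens: int) -> float:
        # Real internal logic under test; only the boundary call is faked.
        per_token = self._client.price_per_token(model)
        return round(per_token * tokens, 6)


def test_cost_for_tokens() -> None:
    """Mock the external pricing API, exercise real service behavior."""
    client = Mock()
    client.price_per_token.return_value = 0.00001
    service = BillingService(pricing_client=client)
    assert service.cost_for_tokens("claude-3", 1000) == 0.01
    client.price_per_token.assert_called_once_with("claude-3")
```

The mock sits exactly at the network boundary; everything inside the service is real code, so a bug in `cost_for_tokens` would actually fail the test.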
Integration Tests (`tests/integration/`):
- Test cross-component interactions with minimal mocking
- Include HTTP client testing with FastAPI TestClient
- Test background workers and async coordination
- Validate configuration end-to-end
- External APIs only: Claude API, OAuth endpoints, Docker processes
- Internal services: Use real implementations with dependency injection
- Configuration: Use test settings objects, not mocks
- No mock explosion: Removed 300+ redundant test fixtures
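The "test settings objects, not mocks" rule can be sketched as follows; the `ServerSettings` shape here is a simplified stand-in, not the real ccproxy `Settings` model.

```python
from dataclasses import dataclass


@dataclass
class ServerSettings:
    """Illustrative stand-in for the real settings model."""

    port: int = 8000
    enable_plugins: bool = False


def make_test_settings(**overrides: object) -> ServerSettings:
    """Build a real settings object with test-specific overrides."""
    defaults: dict[str, object] = {"port": 8080, "enable_plugins": True}
    defaults.update(overrides)
    return ServerSettings(**defaults)  # type: ignore[arg-type]


def test_settings_object_not_mock() -> None:
    settings = make_test_settings(enable_plugins=False)
    # A real object: attribute access and defaults behave exactly as
    # in production code, which a Mock would silently paper over.
    assert settings.port == 8080
    assert settings.enable_plugins is False
```

Because the object is real, typos in attribute names or invalid values fail loudly instead of being absorbed by a permissive mock.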
- Add unit coverage for `ModelMapper` (ordering, regex, prefix/suffix) and the alias-restore helpers.
- Integration tests covering provider adapters should assert that mapped requests still emit the original client `model` in downstream responses (JSON and streaming SSE).
- `/models` endpoint tests should configure `models_endpoint` in test settings instead of patching routes directly.
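As a sketch of the cases worth covering, the stand-in below approximates mapper behavior (ordered rules, first match wins) and alias restoration; the real `ModelMapper` API may differ.

```python
import re

# Simplified stand-in for ModelMapper: ordered (pattern, target) rules,
# first match wins. Not the real implementation.
RULES: list[tuple[str, str]] = [
    (r"^gpt-4o$", "claude-3-5-sonnet"),  # exact alias
    (r"^gpt-4.*", "claude-3-opus"),      # broader prefix-style regex
]


def map_model(client_model: str) -> tuple[str, str]:
    """Return (mapped_model, original_alias); identity if no rule matches."""
    for pattern, target in RULES:
        if re.match(pattern, client_model):
            return target, client_model
    return client_model, client_model


def restore_alias(response: dict, original: str) -> dict:
    """Rewrite the upstream model name back to the client's alias."""
    return {**response, "model": original}


def test_ordering_and_alias_restore() -> None:
    mapped, alias = map_model("gpt-4o")
    # Rule ordering matters: the exact rule must win over the broader regex.
    assert mapped == "claude-3-5-sonnet"
    upstream = {"model": mapped, "content": "hi"}
    # Downstream responses must carry the client's original model name.
    assert restore_alias(upstream, alias)["model"] == "gpt-4o"
```

The same alias-restore assertion should be repeated against each streamed SSE chunk in the integration variant, not only the final JSON body.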
REQUIREMENT: All test files MUST pass type checking and linting. This is not optional.

- All test files MUST pass mypy type checking - no `Any` types unless absolutely necessary
- All test files MUST pass ruff formatting and linting - code must be properly formatted
- Add proper type hints to all test functions and fixtures - include return types and parameter types
- Import necessary types - use `from typing import ...` for type annotations
- Test functions: must have a `-> None` return type annotation
- Fixtures: must have proper return type hints
- Parameters: must have type hints where not inferred from fixtures
- Variables: add type hints for complex objects when not obvious
```python
from typing import Any

from fastapi.testclient import TestClient


def test_service_endpoint(client: TestClient) -> None:
    """Test service endpoint with proper typing."""
    response = client.get("/api/models")
    assert response.status_code == 200
    data: dict[str, Any] = response.json()
    assert "models" in data
```

```python
from typing import Generator

import pytest
from fastapi import FastAPI
from fastapi.testclient import TestClient


@pytest.fixture
def app() -> FastAPI:
    """Create test FastAPI application."""
    from ccproxy.api.app import create_app

    return create_app()


@pytest.fixture
def client(app: FastAPI) -> Generator[TestClient, None, None]:
    """Create test client."""
    with TestClient(app) as test_client:
        yield test_client
```

After aggressive cleanup, we maintain only essential, well-typed fixtures:
Integration fixtures:
- `integration_app_factory` - Dynamic FastAPI app creation with plugin configs
- `integration_client_factory` - Creates async HTTP clients with custom settings
- `metrics_integration_client` - Session-scoped client for metrics tests (high performance)
- `disabled_plugins_client` - Session-scoped client with plugins disabled

Settings fixtures:
- `base_integration_settings` - Minimal settings for fast test execution
- `test_settings` - Clean test configuration
- `isolated_environment` - Temporary directory isolation
- `auth_settings` - Basic auth configuration
- `claude_sdk_environment` - Claude SDK test environment

Mocking strategy:
- Simple auth patterns without combinatorial explosion
- External API mocking only (Claude API, OAuth endpoints)
- No internal service mocking - use real implementations
- Removed 200+ redundant mock fixtures

Mock data:
- `claude_responses` - Essential Claude API responses
- `mock_claude_stream` - Streaming response patterns
- Removed complex test data generators

Test markers:
- `@pytest.mark.unit` - Fast unit tests (default)
- `@pytest.mark.integration` - Cross-component integration tests
- `@pytest.mark.performance` - Performance benchmarks
- `@pytest.mark.asyncio` - Async test functions
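If `--strict-markers` is enabled, custom markers like these must be registered in pytest configuration; the exact file used in this repo is an assumption, but in `pyproject.toml` the registration would look like:

```toml
[tool.pytest.ini_options]
markers = [
    "unit: fast unit tests (default)",
    "integration: cross-component integration tests",
    "performance: performance benchmarks",
]
```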
- Clean boundaries - Unit tests mock at service boundaries only
- Fast execution - Unit tests run in milliseconds, no timing dependencies
- Type safety - All fixtures properly typed, mypy compliant
- Real components - Test actual internal behavior, not mocked responses
- Performance-optimized patterns - Use session-scoped fixtures for expensive operations
- Modern async patterns - `@pytest.mark.asyncio(loop_scope="session")` for integration tests
- No overengineering - Removed 180+ tests, 3000+ lines of complexity
Use session-scoped fixtures when:
- Plugin integration tests - Plugin initialization is expensive
- Database/external service tests - Connection setup overhead
- Complex app configuration - Multiple services, middleware stacks
- Consistent test state needed - Tests require same app configuration

Use factory patterns when:
- Dynamic configurations - Each test needs different plugin settings
- Isolation required - Tests might interfere with shared state
- Simple setup - Minimal overhead for app creation
- Use `ERROR` level - Minimal logging for faster test execution
- Disable JSON logs - `json_logs=False` for better performance
- Manual setup required - Call `setup_logging()` explicitly in test environment
```python
import pytest
from httpx import AsyncClient


# Use session-scoped app creation for expensive plugin initialization
@pytest.mark.asyncio(loop_scope="session")
async def test_plugin_functionality(metrics_integration_client: AsyncClient) -> None:
    """Test plugin with session-scoped app for optimal performance."""
    # App is created once per test session, client per test
    resp = await metrics_integration_client.get("/metrics")
    assert resp.status_code == 200
    assert "prometheus_metrics" in resp.text
```

```python
@pytest.mark.asyncio
async def test_dynamic_plugin_config(integration_client_factory) -> None:
    """Test with dynamic plugin configuration."""
    client = await integration_client_factory(
        {"metrics": {"enabled": True, "custom_setting": "value"}}
    )
    async with client:
        resp = await client.get("/metrics")
        assert resp.status_code == 200
```

```python
from ccproxy.utils.caching import TTLCache


def test_cache_basic_operations() -> None:
    """Test cache basic operations."""
    cache: TTLCache[str, int] = TTLCache(maxsize=10, ttl=60)

    # Test real cache behavior
    cache["key"] = 42
    assert cache["key"] == 42
    assert len(cache) == 1
```

For integration tests that need consistent app state and optimal performance:
```python
import pytest
from httpx import AsyncClient


# Session-scoped app creation (expensive operations done once)
@pytest.fixture(scope="session")
def metrics_integration_app():
    """Pre-configured app for metrics plugin integration tests."""
    from ccproxy.core.logging import setup_logging
    from ccproxy.config.settings import Settings
    from ccproxy.api.bootstrap import create_service_container
    from ccproxy.api.app import create_app

    # Set up logging once per session
    setup_logging(json_logs=False, log_level_name="ERROR")

    settings = Settings(
        enable_plugins=True,
        plugins={
            "metrics": {
                "enabled": True,
                "metrics_endpoint_enabled": True,
            }
        },
        logging={
            "level": "ERROR",  # Minimal logging for speed
            "verbose_api": False,
        },
    )
    service_container = create_service_container(settings)
    return create_app(service_container), settings


# Test-scoped client (reuses shared app)
@pytest.fixture
async def metrics_integration_client(metrics_integration_app):
    """HTTP client for metrics integration tests."""
    from httpx import ASGITransport, AsyncClient
    from ccproxy.api.app import initialize_plugins_startup

    app, settings = metrics_integration_app
    await initialize_plugins_startup(app, settings)

    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as client:
        yield client


# Test using session-scoped pattern
@pytest.mark.asyncio(loop_scope="session")
async def test_metrics_endpoint_available(metrics_integration_client: AsyncClient) -> None:
    """Test metrics endpoint availability."""
    resp = await metrics_integration_client.get("/metrics")
    assert resp.status_code == 200
    assert b"# HELP" in resp.content or b"# TYPE" in resp.content
```

For tests that need different configurations:
```python
@pytest.mark.asyncio
async def test_custom_plugin_config(integration_client_factory) -> None:
    """Test with custom plugin configuration."""
    client = await integration_client_factory(
        {
            "metrics": {
                "enabled": True,
                "metrics_endpoint_enabled": True,
                "include_labels": True,
            }
        }
    )
    async with client:
        resp = await client.get("/metrics")
        assert resp.status_code == 200
        # Test custom configuration behavior
        assert "custom_label" in resp.text
```

```python
from pathlib import Path

from ccproxy.config.settings import Settings


def test_config_loading(tmp_path: Path) -> None:
    """Test configuration file loading."""
    config_file: Path = tmp_path / "config.toml"
    config_file.write_text("port = 8080")

    settings: Settings = Settings(_config_file=config_file)
    assert settings.server.port == 8080
```

```bash
# Type checking (MUST pass) - now sub-second
./Taskfile typecheck
uv run mypy tests/

# Linting and formatting (MUST pass)
./Taskfile lint
./Taskfile format
uv run ruff check tests/
uv run ruff format tests/

# Run all quality checks
./Taskfile pre-commit
```

Convenience scripts live in scripts/ to speed up local testing and debugging:
- `scripts/debug-no-stream-all.sh`: exercise non-streaming endpoints quickly
- `scripts/debug-stream-all.sh`: exercise streaming endpoints
- `scripts/show_request.sh` / `scripts/last_request.sh`: inspect recent requests
- `scripts/test_streaming_metrics_all.py`: ad-hoc streaming metrics checks
- `scripts/run_integration_tests.py`: advanced integration runner (filters, timing)
These are optional helpers for dev workflows; the standard Taskfile targets and pytest remain the primary interface.
```bash
./Taskfile test              # Run all tests with coverage
./Taskfile test-unit         # Fast unit tests only
./Taskfile test-integration  # Integration tests (core + plugins)
./Taskfile test-plugins      # Only plugin tests
./Taskfile test-coverage     # With coverage report
```

```bash
pytest -v                             # Verbose output
pytest -k "test_auth"                 # Run matching tests
pytest --lf                           # Run last failed
pytest -x                             # Stop on first failure
pytest --pdb                          # Debug on failure
pytest -m unit                        # Unit tests only
pytest -m integration                 # Integration tests only
pytest tests/plugins                  # All plugin tests
pytest tests/plugins/metrics -m unit  # Single plugin unit tests
```
Note: tests run with `--import-mode=importlib` via Taskfile to avoid module name clashes.

- Start here: Read this file and `tests/fixtures/integration.py`
- Run tests: `./Taskfile test` to ensure everything works (606 optimized tests)
- Choose pattern:
  - Session-scoped fixtures for plugin tests (`metrics_integration_client`)
  - Factory patterns for dynamic configs (`integration_client_factory`)
  - Unit tests for isolated components
- Performance first: Use `ERROR` logging level, session-scoped apps for expensive operations
- Type safety: All test functions need a `-> None` return type, proper fixture typing
- Modern async: Use `@pytest.mark.asyncio(loop_scope="session")` for integration tests
- Mock external only: Don't mock internal components, test real behavior

All existing test patterns still work - but new tests should use the performance-optimized patterns:

- Session-scoped integration fixtures - `metrics_integration_client`, `disabled_plugins_client`
- Async factory patterns - `integration_client_factory` for dynamic configs
- Manual logging setup - `setup_logging(json_logs=False, log_level_name="ERROR")`
- Session loop scope - `@pytest.mark.asyncio(loop_scope="session")` for integration tests
- Service container pattern - `create_service_container()` + `create_app()`
- Plugin lifecycle management - `initialize_plugins_startup()` in fixtures
- Minimal logging - ERROR level only, no JSON logging, plugin logging disabled
- Session-scoped apps - Expensive plugin initialization done once per session
- Streamlined fixtures - 515 lines (was 1117), focused on essential patterns
- Real component testing - Mock external APIs only, test actual internal behavior
Plugin tests are now centralized under `tests/plugins/<plugin>/{unit,integration}` instead of co-located in `plugins/<plugin>/tests`. Update any paths and imports accordingly.
The architecture has been significantly optimized for performance while maintaining full functionality.