Contributing to ChatSEEK

Thank you for your interest in contributing to ChatSEEK! This document provides guidelines for contributing to the project.

Getting Started

Prerequisites

Python 3.8 or higher
Neo4j database (5.18.1+)
Git

Development Setup

Clone the repository:

```
git clone https://github.com/yourorg/chatseek.git
cd chatseek
```

Create a virtual environment:

```
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

Install development dependencies:
```
pip install -e ".[dev]"
```

Set up environment variables:

```
cp .env.example .env
# Edit .env with your Neo4j and API credentials
```
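
Before running the tests, it can help to confirm that `.env` defines the values you expect. The sketch below parses a dotenv-style file with the standard library only and reports missing keys. The key names (`NEO4J_URI`, `NEO4J_USERNAME`, `NEO4J_PASSWORD`) are assumptions for illustration; use whatever `.env.example` actually lists.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    # Hypothetical key names; replace with the ones in .env.example.
    required = ["NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"]
    env_path = Path(".env")
    values = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    print("Missing keys:", missing_keys(values, required))
```

This is a simplified parser, not a replacement for a full dotenv library; it does not handle multi-line values or `export` prefixes.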

Run tests to verify setup:
```
pytest tests/ -v
```

Development Workflow

1. Create a Feature Branch

```
git checkout -b feature/your-feature-name
```

2. Make Your Changes

Write clear, documented code
Follow existing code style (Black formatter, line length 100)
Add docstrings to all public functions and classes
Update relevant documentation
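
To check the docstring guideline locally, something like the following could be used. It is a minimal sketch, not part of the ChatSEEK tooling: `find_missing_docstrings` is a hypothetical helper that treats any function or class whose name does not start with an underscore as public.

```python
import ast
from typing import List


def find_missing_docstrings(source: str) -> List[str]:
    """Return names of public functions and classes that lack a docstring."""
    missing: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # "Public" here is judged by name only (no leading underscore).
            is_public = not node.name.startswith("_")
            if is_public and ast.get_docstring(node) is None:
                missing.append(node.name)
    return sorted(missing)


sample = '''
def documented():
    """Has a docstring."""

def undocumented():
    pass

def _private():
    pass
'''
report = find_missing_docstrings(sample)
```

Here `report` contains only `undocumented`, since `_private` is skipped and `documented` already has a docstring.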

3. Run Tests

```
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=chatseek --cov-report=html

# Run specific test file
pytest tests/unit/test_query_engine.py -v
```

4. Format Code

```
# Format with Black
black chatseek/ tests/

# Check with Ruff
ruff check chatseek/ tests/
```

5. Commit Your Changes

Use clear, descriptive commit messages:

```
git add .
git commit -m "feat: Add support for new GEO template type"
```

Start each commit message with one of these type prefixes:

feat: - New feature
fix: - Bug fix
docs: - Documentation changes
test: - Test additions or modifications
refactor: - Code refactoring
chore: - Maintenance tasks
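
The prefix convention above can be checked mechanically. The following is a rough sketch under two assumptions: the prefix is followed by a single space and a non-empty summary, and scoped forms such as `feat(geo):` are not accepted, since this guide does not mention them. `is_valid_commit_message` is a hypothetical helper, not an existing ChatSEEK hook.

```python
import re

# Prefixes listed in this guide.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")

# "<type>: <summary>" on the first line, with a non-empty summary.
COMMIT_PATTERN = re.compile(r"^(%s): \S.*$" % "|".join(ALLOWED_TYPES))


def is_valid_commit_message(message: str) -> bool:
    """Check that the first line starts with an allowed prefix and a summary."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None
```

Only the first line is inspected, so a longer commit body is free-form.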

6. Push and Create Pull Request

```
git push origin feature/your-feature-name
```

Then create a pull request on GitHub with:

Clear description of changes
Reference to any related issues
Screenshots/examples if applicable

Code Style Guidelines

Python Code

Follow PEP 8 style guide
Use Black formatter (line length: 100)
Use type hints where appropriate
Write comprehensive docstrings

Example:

```
from typing import Any, Dict

# Assumption: BaseChatModel is LangChain's chat model base class.
from langchain_core.language_models import BaseChatModel


def extract_entities(query: str, llm: BaseChatModel) -> Dict[str, Any]:
    """
    Extract entities from a natural language query.

    Args:
        query: Natural language question
        llm: Language model instance for extraction

    Returns:
        Dictionary containing extracted entities with keys:
        - study: Study name (if present)
        - samples: List of sample UIDs
        - assay: Assay type

    Raises:
        ExtractionError: If entity extraction fails
    """
    # Implementation
```

Documentation

Update README.md if adding user-facing features
Add examples to examples/ directory for new features
Update ROADMAP.md if implementing planned features
Keep IMPLEMENTATION_STATUS.md current with progress

Testing Guidelines

Test Structure

```
tests/
├── unit/              # Unit tests for individual components
├── integration/       # Integration tests for workflows
└── fixtures/          # Test data and fixtures
```

Writing Tests

Write tests for all new functionality
Aim for 80%+ code coverage
Use descriptive test names
Mock external dependencies (Neo4j, LLMs)

Example:

```
# Assumed import path, based on chatseek/graphrag/entity_extractor.py
from chatseek.graphrag.entity_extractor import EntityExtractor


def test_entity_extractor_identifies_study_name(mock_llm):
    """Test that entity extractor correctly identifies study names."""
    query = "Find samples in the GBM Study"
    extractor = EntityExtractor(mock_llm)

    result = extractor.extract(query)

    assert result["study"] == "GBM Study"
    assert result["intent"] == "find_samples"
```
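
The test above receives a `mock_llm` argument, which in a pytest suite would normally be supplied by a fixture. One way such a stand-in could be built with `unittest.mock` is sketched below. It assumes the model is called through `.invoke(prompt)` and that the reply text is read from `.content`; ChatSEEK's real fixture and LLM interface may differ, and `make_mock_llm` is a hypothetical name.

```python
import json
from unittest.mock import MagicMock


def make_mock_llm(entities: dict) -> MagicMock:
    """Build a fake LLM whose invoke() returns a canned JSON reply.

    Assumption: the real model is called via .invoke(prompt) and the reply
    text is read from .content. Adjust to match the actual interface.
    """
    llm = MagicMock(name="mock_llm")
    llm.invoke.return_value = MagicMock(content=json.dumps(entities))
    return llm


# Usage: no network call is made, and the interaction can be inspected.
llm = make_mock_llm({"study": "GBM Study", "intent": "find_samples"})
reply = llm.invoke("Find samples in the GBM Study")
parsed = json.loads(reply.content)
llm.invoke.assert_called_once_with("Find samples in the GBM Study")
```

Wrapping the helper in a `@pytest.fixture` (for example in `conftest.py`) would make it available to tests by argument name.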

Adding New Features

New GEO Templates

See docs/guides/CUSTOM_TEMPLATE_GUIDE.md for detailed instructions.

Quick overview:

Create template in chatseek/geo/templates.py
Define required fields and sections
Add validation logic
Write tests in tests/unit/test_geo_templates.py
Add example to examples/geo_examples.py
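
As a rough illustration of the first three steps, a template with required fields and validation might look like the sketch below. The real base class and field names in `chatseek/geo/templates.py` are not shown in this guide, so every name here (`GeoTemplate`, `required_fields`, `sections`, `validate`, and the example field names) is hypothetical; follow `CUSTOM_TEMPLATE_GUIDE.md` for the actual interface.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GeoTemplate:
    """Hypothetical template: a name, required fields, and named sections."""

    name: str
    required_fields: List[str]
    sections: List[str] = field(default_factory=list)

    def validate(self, record: Dict[str, Any]) -> List[str]:
        """Return a list of problems; an empty list means the record is valid."""
        errors = []
        for key in self.required_fields:
            value = record.get(key)
            if value is None or value == "":
                errors.append(f"Missing required field: {key}")
        return errors


# Illustrative field and section names only.
template = GeoTemplate(
    name="example_template",
    required_fields=["title", "organism", "assay"],
    sections=["SERIES", "SAMPLES"],
)
problems = template.validate({"title": "Example series", "organism": ""})
```

Returning a list of problems rather than raising on the first one lets a caller report every missing field in a single pass.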

New Query Types

Add query template to chatseek/graphrag/query_builder.py
Update entity extraction patterns in chatseek/graphrag/entity_extractor.py
Add integration test
Document in README.md
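
For orientation, a query template keyed by intent could be organized along these lines. This is a sketch under assumptions, not the contents of `query_builder.py`: the node labels, relationship type, property names, and the `build_query` helper are all hypothetical. The one firm point is that values are passed as Cypher parameters (`$study`) instead of being formatted into the query string.

```python
from typing import Any, Dict, Tuple

# Hypothetical templates keyed by intent. Values are passed as Cypher
# parameters ($name) rather than formatted into the query string.
QUERY_TEMPLATES: Dict[str, str] = {
    "find_samples": (
        "MATCH (s:Sample)-[:PART_OF]->(st:Study {name: $study}) "
        "RETURN s.uid AS uid"
    ),
}

# Entities each template needs before it can run.
REQUIRED_PARAMS: Dict[str, Tuple[str, ...]] = {
    "find_samples": ("study",),
}


def build_query(intent: str, entities: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Look up the template for an intent and collect its parameters."""
    if intent not in QUERY_TEMPLATES:
        raise KeyError(f"No query template for intent: {intent}")
    missing = [p for p in REQUIRED_PARAMS[intent] if p not in entities]
    if missing:
        raise ValueError(f"Missing entities for {intent}: {missing}")
    params = {p: entities[p] for p in REQUIRED_PARAMS[intent]}
    return QUERY_TEMPLATES[intent], params


cypher, params = build_query(
    "find_samples", {"study": "GBM Study", "intent": "find_samples"}
)
```

The returned pair can then be handed to a Neo4j session as the query text and its parameter dictionary.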

Project Structure

```
chatseek/
├── chatseek/          # Main package
│   ├── core/         # Core infrastructure (config, database)
│   ├── graphrag/     # GraphRAG query system
│   ├── geo/          # GEO submission system
│   ├── cli/          # Command-line interface
│   └── utils/        # Utility functions
├── tests/            # Test suite
├── examples/         # Example scripts
├── demos/            # Streamlit demo app
├── docs/             # Documentation
└── notebooks/        # Jupyter tutorials
```

Documentation Structure

User-facing docs: Root-level .md files
Guides: docs/guides/
Archived docs: docs/archive/
Code docs: Inline docstrings

Getting Help

Issues: Check existing issues or open a new one
Discussions: Use GitHub Discussions for questions
Documentation: Review README.md and QUICKSTART.md

Code Review Process

All contributions go through code review:

Automated checks: Tests, coverage, linting must pass
Manual review: Maintainer reviews code quality and design
Feedback: Address any requested changes
Merge: Once approved, PR is merged

Release Process

ChatSEEK uses semantic versioning (MAJOR.MINOR.PATCH):

MAJOR: Breaking API changes
MINOR: New features (backward compatible)
PATCH: Bug fixes
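
The bumping rules can be written out as a small function. This is a minimal sketch for plain `MAJOR.MINOR.PATCH` strings; it ignores pre-release and build suffixes, and `bump_version` is a hypothetical helper, not part of the release tooling.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":      # breaking API change: reset minor and patch
        return f"{major + 1}.0.0"
    if change == "minor":      # backward-compatible feature: reset patch
        return f"{major}.{minor + 1}.0"
    if change == "patch":      # bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown change type: {change!r}")
```

For example, a bug fix on `1.4.2` gives `1.4.3`, a new feature gives `1.5.0`, and a breaking change gives `2.0.0`.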

Releases are managed by project maintainers.

License

By contributing, you agree that your contributions will be licensed under the MIT License.

Questions?

Feel free to open an issue or reach out to the maintainers. We appreciate your contributions!

Thank you for helping make ChatSEEK better!
