Thank you for considering contributing to plexe! Your contributions help improve this project for everyone.
By participating in this project, you agree to uphold our Code of Conduct, which outlines expectations for respectful and inclusive interactions.
If you encounter a bug, please:
- Search Existing Issues: Check if the issue has already been reported.
- Open a New Issue: If not found, create a new issue and include:
  - A descriptive title.
  - Steps to reproduce the bug.
  - Expected and actual behavior.
  - Screenshots or code snippets, if applicable.
To propose new features or improvements:
- Search Existing Issues: Ensure the suggestion hasn't been made.
- Open a New Issue: Provide:
  - A clear description of the enhancement.
  - Rationale for the suggestion.
  - Any relevant examples or references.
For code contributions:
- Fork the Repository: Create your own copy of the repo.
- Create a Branch: Use a descriptive name (e.g., `feature/new-model` or `bugfix/issue-123`).
- Make Changes: Implement your changes with clear and concise code.
- Write Tests: Ensure new features or bug fixes are covered by tests.
- Commit Changes: Follow our commit message guidelines.
- Push to Your Fork: Upload your changes.
- Open a Pull Request: Provide a detailed description of your changes and reference any related issues.
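As a sketch of the "Write Tests" step above, a minimal pytest-style test might look like the following. The helper function is hypothetical and not part of the plexe API; it only illustrates the shape of a test that covers a change:

```python
# Hypothetical helper and test, for illustration only -- the function
# below is not an actual plexe API.

def normalize_column_name(name: str) -> str:
    """Lower-case a column name and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


def test_normalize_column_name():
    # pytest discovers functions prefixed with `test_` automatically.
    assert normalize_column_name("  Sale Price ") == "sale_price"
    assert normalize_column_name("id") == "id"
```

Running `pytest` (or `poetry run pytest`) from the repository root picks up `test_*` functions in files named `test_*.py`.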
To set up the development environment:
- Clone the Repository:

  ```bash
  git clone https://github.com/plexe-ai/plexe.git
  cd plexe
  ```

- Install Dependencies:

  ```bash
  pip install poetry

  # Enter the virtual environment.
  # Older Poetry (1.x) uses 'shell'; Poetry 2.x uses 'env activate' instead.
  poetry shell  # OR: poetry env activate (for newer versions or specific setups)

  # Install specific extras as needed (e.g. AWS, PySpark).
  # Note: pyspark and databricks-connect are mutually exclusive.
  poetry install -E aws -E pyspark

  # Run setup.py to configure pre-commit hooks.
  python setup.py
  ```
- Run Tests:

  ```bash
  # If 'poetry shell' above was successful, you can run directly:
  pytest
  # If you encounter "ModuleNotFoundError", use the more robust:
  poetry run pytest
  ```
- Run staged integration tests before opening a PR:

  ```bash
  # Requires ANTHROPIC_API_KEY and local Spark/Java setup
  bash scripts/tests/run_integration_staged.sh
  ```

  The staged suite runs three pytest phases with hard barriers:

  - `integration_seed`: builds reusable checkpoints through phase 3
  - `integration_search`: resumes from seeds and runs model search
  - `integration_eval`: resumes from search checkpoints, runs evaluation, and validates predictor inference

  This `tests/integration` suite is the primary pre-PR integration workflow. Makefile Docker targets remain optional/manual end-to-end checks.
Ensure all tests pass before making contributions.
Adhere to PEP 8 guidelines for Python code. Key points include:
- Use 4 spaces per indentation level.
- Limit lines to 79 characters.
- Use meaningful variable and function names.
- Include docstrings for all public modules, classes, and functions.
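For instance, a function following these conventions might look like the one below. This is a hypothetical example, not code from the plexe codebase:

```python
def moving_average(values: list[float], window: int = 3) -> list[float]:
    """Return the trailing moving average of a numeric series.

    Args:
        values: Numeric series to smooth.
        window: Number of trailing points per average (must be > 0).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for i in range(len(values)):
        # Average over up to `window` points ending at index i.
        chunk = values[max(0, i - window + 1) : i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages
```

Note the 4-space indentation, lines under 79 characters, descriptive names, and a docstring on the public function.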
Write clear and concise commit messages:
- Format: `<type>(<scope>): <subject>`
  - Type: `feat`, `fix`, `docs`, `style`, `refactor`, `test`, `chore`
  - Scope: Optional, e.g., `data`, `model`
  - Subject: Brief description (max 50 characters)
- Example: `feat(model): add support for gemini`
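To illustrate, the format above can be checked with a small regular expression. This is a sketch for clarity, not an actual hook shipped with the project:

```python
import re

# Sketch of a checker for the <type>(<scope>): <subject> convention
# described above; not part of the plexe tooling.
COMMIT_RE = re.compile(
    r"^(feat|fix|docs|style|refactor|test|chore)"  # type
    r"(\([a-z0-9_-]+\))?"                          # optional (scope)
    r": (?=.{1,50}$).+"                            # subject, max 50 chars
)


def is_valid_commit_subject(line: str) -> bool:
    """Return True if a commit subject line follows the convention."""
    return COMMIT_RE.match(line) is not None
```

For example, `is_valid_commit_subject("feat(model): add support for gemini")` returns `True`, while a message without a recognized type fails the check.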