122 changes: 122 additions & 0 deletions build_stream/README.md
@@ -1,2 +1,124 @@
# Build Stream

**Build Stream** is a **RESTful API** (Representational State Transfer Application Programming Interface) service that orchestrates the creation and management of build jobs for the Omnia infrastructure platform. It provides a centralized interface for managing software catalog parsing, local repository creation, image building, and validation workflows.

## Architecture Overview

Build Stream follows a clean architecture pattern with clear separation of concerns:

- **API Layer** (`api/`): FastAPI routes and HTTP handling
- **Core Layer** (`core/`): Business logic, entities, and domain services
- **Orchestrator Layer** (`orchestrator/`): Use cases that coordinate workflows
- **Infrastructure Layer** (`infra/`): External integrations and data persistence
- **Common Layer** (`common/`): Shared utilities and configuration

## High-Level Workflow

1. **Authentication**: **JWT** (JSON Web Token) authentication secures all API endpoints
2. **Job Creation**: Clients submit build requests through the jobs API
3. **Stage Processing**: Jobs are broken into stages (catalog parsing, local repo, build image, validation)
4. **Async Execution**: Stages execute asynchronously with result polling
5. **Artifact Management**: Build artifacts are stored and tracked throughout the process
6. **Audit Trail**: All operations are logged for traceability and compliance
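
The stage-based, asynchronous flow described above can be sketched in plain Python. This is a minimal illustration only: the stage names come from the list above, while `Job`, `run_stages`, `poll_until_finished`, and `submit_and_wait` are hypothetical names and not part of the Build Stream codebase.

```python
import asyncio
from dataclasses import dataclass, field

# Stage names taken from the workflow list; everything else is hypothetical.
STAGES = ["catalog_parsing", "local_repo", "build_image", "validation"]


@dataclass
class Job:
    job_id: str
    # Maps each stage name to "pending", "running", or "done".
    stage_status: dict = field(
        default_factory=lambda: {stage: "pending" for stage in STAGES}
    )
    audit_log: list = field(default_factory=list)

    @property
    def finished(self) -> bool:
        return all(status == "done" for status in self.stage_status.values())


async def run_stages(job: Job) -> None:
    """Run each stage in order and record an audit entry per transition."""
    for stage in STAGES:
        job.stage_status[stage] = "running"
        job.audit_log.append(f"{job.job_id}: {stage} started")
        await asyncio.sleep(0)  # Placeholder for the real stage work.
        job.stage_status[stage] = "done"
        job.audit_log.append(f"{job.job_id}: {stage} finished")


async def poll_until_finished(job: Job, interval: float = 0.01) -> Job:
    """Poll the job until every stage reports "done"."""
    while not job.finished:
        await asyncio.sleep(interval)
    return job


async def submit_and_wait(job_id: str) -> Job:
    job = Job(job_id)
    task = asyncio.create_task(run_stages(job))  # Stages run in the background.
    await poll_until_finished(job)
    await task
    return job
```

Calling `asyncio.run(submit_and_wait("job-1"))` returns a job whose stages are all marked done and whose audit log holds one start and one finish entry per stage.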

## Configuration

Configuration is managed through:
- Environment variables for runtime settings
- `build_stream.ini` for artifact store configuration
- Vault integration for secure credential management
- Database configuration for persistent storage

Key configuration areas:
- Database connections (PostgreSQL)
- Artifact storage backend (file system or in-memory)
- Vault endpoints and authentication
- **CORS** (Cross-Origin Resource Sharing) and server settings
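
As a rough sketch of how environment variables and `build_stream.ini` might be combined, the example below reads both with the standard library. The `[artifact_store]` section, the `backend` key, the backend value spellings, and the default host and port are assumptions made for illustration; check the real file for the actual names.

```python
import configparser
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical spellings of the two backends mentioned in this section.
ALLOWED_BACKENDS = {"file_system", "in_memory"}


@dataclass
class Settings:
    host: str
    port: int
    artifact_backend: str


def load_settings(ini_text: str, env: Optional[Mapping[str, str]] = None) -> Settings:
    """Combine INI content and environment variables into one settings object."""
    env = os.environ if env is None else env

    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Section and key names are assumptions, not the documented schema.
    backend = parser.get("artifact_store", "backend", fallback="file_system")
    if backend not in ALLOWED_BACKENDS:
        raise ValueError(f"unsupported artifact backend: {backend}")

    return Settings(
        host=env.get("HOST", "127.0.0.1"),
        port=int(env.get("PORT", "8000")),
        artifact_backend=backend,
    )
```

Passing `env` explicitly keeps the function easy to test; leaving it out falls back to the process environment.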

## Getting Started

### For Developers

**Primary Entry Points:**
- `main.py` - FastAPI application entry point
- `api/router.py` - API route aggregation
- `container.py` - Dependency injection setup

**Key Workflows:**
- [Jobs Management](./doc/jobs.md) - Job lifecycle and orchestration
- [Catalog Processing](./doc/catalog.md) - Software catalog parsing and role generation
- [Local Repository](./doc/local_repo.md) - Local package repository creation
- [Image Building](./doc/build_image.md) - Container image build workflows
- [Validation](./doc/validation.md) - Input and output validation

**Development Setup:**
```bash
# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Set environment variables (replace the placeholders with real values)
export HOST="<host-ip>"
export PORT="<port>"

# Run development server
uvicorn main:app --reload

# Run tests
pytest
```
Comment on lines +54 to +69 (Collaborator):

These only talk about the API server; we also need to mention the other prerequisite infra setup.


**API Documentation:**
- See Omnia ReadTheDocs for complete API documentation

### Architecture Components

**Core Services:**
- **Job Service**: Manages job lifecycle and state transitions
- **Catalog Service**: Parses software catalogs and generates roles
- **Local Repo Service**: Creates and manages local repositories
- **Build Service**: Orchestrates container image builds
- **Validation Service**: Validates inputs and outputs

**Data Flow:**
1. Client requests → API routes → Use cases → Core services → Repositories
2. Async job processing with stage-based execution
3. Result polling and webhook notifications
4. Artifact storage and metadata tracking
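
The first step of the data flow (route, use case, core service, repository) can be illustrated with plain classes. This is a sketch under assumptions: all class and function names below are hypothetical, and the route is a plain function standing in for a FastAPI handler so that the example runs without third-party packages.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class JobRecord:
    job_id: str
    state: str


class InMemoryJobRepository:
    """Infrastructure layer: stores job records (stand-in for a database)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobRecord] = {}

    def save(self, job: JobRecord) -> None:
        self._jobs[job.job_id] = job

    def get(self, job_id: str) -> Optional[JobRecord]:
        return self._jobs.get(job_id)


class JobService:
    """Core layer: owns job creation and state."""

    def __init__(self, repository: InMemoryJobRepository) -> None:
        self._repository = repository

    def create(self, job_id: str) -> JobRecord:
        job = JobRecord(job_id=job_id, state="created")
        self._repository.save(job)
        return job


class CreateJobUseCase:
    """Orchestrator layer: coordinates core services for one workflow."""

    def __init__(self, service: JobService) -> None:
        self._service = service

    def execute(self, job_id: str) -> dict:
        job = self._service.create(job_id)
        return {"job_id": job.job_id, "state": job.state}


def create_job_route(use_case: CreateJobUseCase, payload: dict) -> dict:
    """API layer: translates a request payload into a use-case call."""
    return use_case.execute(payload["job_id"])
```

Each layer depends only on the one beneath it, which is what lets the repository be swapped (for example, in-memory for tests) without touching the route or the use case.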

**Security:**
- JWT token-based authentication
- Vault integration for secret management
- Role-based access control

Collaborator: Not available today; can be removed.

Collaborator Author: To build an image we can pass any roles, so I kept this.

Collaborator: These are not Ansible roles. This refers to logged-in user roles such as admin.

- Audit logging for compliance

## Workflow Areas

Each major workflow area has dedicated documentation:

- **Jobs** - Job creation, monitoring, and lifecycle management
- **Catalog** - Software catalog parsing and role generation
- **Local Repo** - Local package repository setup and management
- **Build Image** - Container image build orchestration
- **Validation** - Input validation and output verification

See the `doc/` directory for detailed workflow documentation.

## Dependencies

Build Stream relies on the following key dependencies:
- FastAPI/Uvicorn for web framework
- SQLAlchemy for database **ORM** (Object-Relational Mapping)
- Dependency Injector for **IoC** (Inversion of Control) container
- PyJWT for **JWT** (JSON Web Token) authentication
- Ansible for infrastructure automation
- Vault client for secret management

## Support

For troubleshooting and development guidance:
1. Check the workflow-specific documentation in `doc/`
2. Review API logs for error details
3. Consult the audit trail for job execution history
4. Refer to the health check endpoint: `/health`

59 changes: 59 additions & 0 deletions build_stream/doc/README.md
@@ -0,0 +1,59 @@
# Build Stream Documentation

This directory contains comprehensive documentation for the Build Stream module and its workflows.

## Documentation Structure

### Overview Documentation
- **[Developer Guide](./developer-guide.md)** - Complete development guide with architecture deep dive
- **[Main README](../README.md)** - High-level overview and getting started guide

### Workflow Documentation
- **[Jobs Management](./jobs.md)** - Job lifecycle and orchestration
- **[Catalog Processing](./catalog.md)** - Software catalog parsing and role generation
- **[Local Repository](./local_repo.md)** - Local package repository creation
- **[Image Building](./build_image.md)** - Container image build workflows
- **[Validation](./validation.md)** - Input and output validation

## Quick Navigation

### For New Contributors
1. Start with the [main README](../README.md) for architecture overview
2. Read the [Developer Guide](./developer-guide.md) for detailed understanding
3. Explore specific workflow documentation based on your area of focus

### For Debugging Issues
1. Check the relevant workflow documentation for your issue area
2. Use the Developer Guide for troubleshooting steps
3. Review the audit trail and logging sections

### For Feature Development
1. Read the Developer Guide for architecture and patterns
2. Review the relevant workflow documentation
3. Follow the contribution guidelines in the Developer Guide

## Documentation Standards

All Build Stream documentation follows these standards:
- **No sensitive data** - Never include passwords, tokens, or secrets
- **Developer-focused** - Written for technical contributors
- **Cross-referenced** - Links between related documentation
- **Example-driven** - Includes practical examples and code snippets
- **Maintainable** - Easy to update as the codebase evolves

## Getting Help

If you need additional help beyond the documentation:
1. Check the troubleshooting sections in workflow docs
2. Review the audit trail and error handling patterns
3. Consult the architecture diagrams in the Developer Guide
4. Reach out to the Build Stream development team

## Contributing to Documentation

When contributing to Build Stream:
1. Update relevant documentation for API changes
2. Add new workflow documentation for new features
3. Keep cross-references up to date
4. Follow the established documentation standards
5. Include examples and troubleshooting information
98 changes: 98 additions & 0 deletions build_stream/doc/build_image.md
@@ -0,0 +1,98 @@
# OS Image Building

The OS Image Building workflow orchestrates operating system image creation for functional roles in the Omnia platform.

## What It Does

The OS Image Building workflow provides:
- OS image build orchestration for functional roles
- Multi-architecture OS image support (x86_64, aarch64)
- Package installation and configuration management

## Inputs/Outputs

**Inputs:**
- Catalog files defining functional roles and packages
- Generated input configuration files
- PXE mapping file for deployment configuration

**Outputs:**
- Built OS images for functional roles
- OS image metadata and manifests
- Package installation logs and validation reports
- OS image deployment configurations

## Key Logic Locations

**Primary Files:**
- `api/build_image/routes.py` - HTTP endpoints for OS build operations
- `orchestrator/build_image/use_cases/` - OS build orchestration logic
- `core/build_image/entities.py` - OS build domain entities
- `core/build_image/repositories.py` - OS build data access
- `core/build_image/services.py` - OS build management services

**Main Components:**
- **BuildOSImageUseCase** - Orchestrates OS image build processes for functional roles
- **OSService** - Manages OS build execution and monitoring
- **MultiArchOSBuilder** - Handles multi-architecture OS builds
- **PackageInstaller** - Manages package installation and configuration

## Workflow Flow

1. **Build Request**: Client submits image build request for functional roles
2. **OS Context Preparation**: Base functional role packages assembled
3. **Multi-Arch Setup**: OS build configurations prepared for target architectures
4. **Package Installation**: Functional role packages installed and configured
5. **OS Customization**: System settings and configurations applied
6. **Image Creation**: OS images built and optimized for deployment

## Architecture Support

Supports multiple CPU architectures:
- **x86_64** - Standard 64-bit Intel/AMD processors
- **aarch64** - 64-bit ARM processors


## Build Optimization

Optimizations include:
- **Package caching** - Reusing downloaded packages across builds
- **Parallel builds** - Concurrent building for multiple architectures
- **Dependency resolution** - Efficient package dependency management
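
The parallel-build idea can be sketched with a thread pool that starts one build per architecture. The `build_image` function here is a hypothetical stub that only returns an image name; the real build logic and its naming scheme are not shown in this document.

```python
from concurrent.futures import ThreadPoolExecutor

# Architectures listed under "Architecture Support".
ARCHITECTURES = ("x86_64", "aarch64")


def build_image(role: str, arch: str) -> str:
    """Hypothetical stand-in for a real image build; returns an image name."""
    return f"{role}-{arch}.img"


def build_all(role: str, architectures=ARCHITECTURES) -> dict:
    """Build one image per architecture concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=len(architectures)) as pool:
        futures = {
            arch: pool.submit(build_image, role, arch) for arch in architectures
        }
        # result() blocks until that build finishes and re-raises its errors.
        return {arch: future.result() for arch, future in futures.items()}
```

For example, `build_all("slurm_node")` returns a dictionary keyed by architecture with one image name per entry.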

## Security Features

Security capabilities include:
- **Package verification** - Automated package integrity validation
- **Base OS validation** - Verified base OS sources and configurations
- **Signature verification** - Package signature and checksum validation
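
Checksum validation of a downloaded package can look roughly like the sketch below. It assumes SHA-256 digests published as hex strings, which this document does not confirm; the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_checksum(data: bytes, expected_hex: str) -> bool:
    """Compare the computed digest with the expected one in constant time."""
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())
```

A package whose bytes do not match the published digest makes `verify_checksum` return `False`, so the caller can reject it before installation.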


## Integration Points

- Receives packages from local repository workflow
- Integrates with validation workflow for quality checks
- Uses Vault for secure credential management
- Connects with deployment systems for functional role provisioning

## Configuration

Build configuration includes:
- OS build parameters and environment variables
- Functional role specifications and requirements
- Package installation policies and configurations
- Architecture-specific OS settings

## Error Handling

- Detailed OS build error reporting
- Step-by-step build progress tracking
- Rollback capabilities for failed builds
- Automated retry for transient failures
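
Automated retry for transient failures is commonly implemented with exponential backoff. The sketch below shows one possible shape; `TransientError`, the attempt count, and the delay values are assumptions for illustration, not the behavior of the Build Stream code.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call operation until it succeeds or the attempts are used up.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** (attempt - 1))
```

Only `TransientError` is retried; any other exception propagates immediately, so permanent failures are reported without delay.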

## Monitoring

- Real-time OS build progress monitoring
- Resource usage tracking (CPU, memory, storage)
- Build success/failure metrics
- Package installation result tracking
74 changes: 74 additions & 0 deletions build_stream/doc/catalog.md
@@ -0,0 +1,74 @@
# Catalog Processing

The Catalog workflow handles software catalog parsing and role generation for the Omnia platform.

## What It Does

The Catalog workflow provides:
- Software catalog parsing from JSON files
- Role generation based on catalog contents
- Package categorization and dependency resolution
- Integration with Ansible for role creation
- Validation of catalog structure and contents

## Inputs/Outputs

**Inputs:**
- Software catalog JSON files
- Package configuration mappings
- Role templates and definitions
- Platform-specific parameters

**Outputs:**
- Generated Ansible roles
- Package dependency mappings
- Validated catalog structures
- Role metadata and documentation

## Key Logic Locations

**Primary Files:**
- `api/catalog_roles/routes.py` - HTTP endpoints for catalog operations
- `api/parse_catalog/routes.py` - Catalog parsing endpoints
- `orchestrator/catalog/use_cases/parse_catalog.py` - Catalog parsing logic
- `orchestrator/catalog/use_cases/generate_input_files.py` - Input file generation

**Main Components:**
- **ParseCatalogUseCase** - Handles catalog parsing and validation
- **GenerateInputFilesUseCase** - Creates Ansible input files
- **CatalogRolesService** - Role generation and management
- **CatalogRepository** - Catalog data persistence

## Workflow Flow

1. **Catalog Upload**: Client submits catalog via `/api/v1/parse_catalog` endpoint
2. **Structure Validation**: Catalog schema and structure validated
3. **Package Parsing**: Individual packages extracted and categorized
4. **Dependency Resolution**: Package dependencies analyzed and resolved
5. **Role Generation**: Ansible roles generated based on packages
6. **Input File Creation**: Configuration files created for downstream workflows
7. **Validation**: Generated artifacts validated for completeness
8. **Storage**: Results stored in artifact repository

## Package Categorization

Packages are categorized into:
- **Base OS Bundles**: Operating system packages (e.g., rhel)
- **Driver Bundles**: Hardware driver packages (e.g., nvidia_gpu_driver)
- **Functional Bundles**: Core service packages (service_k8s, slurm_custom, additional_packages)
- **Infrastructure Bundles**: CSI and infrastructure packages (csi_driver_powerscale)
- **Miscellaneous**: Additional packages that don't fit other categories
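
A simple way to picture the categorization is a lookup table keyed by bundle name, with anything unknown falling into the miscellaneous group. The bundle names below are the examples given in this section; the `packages` key, the category labels, and the `categorize` function are hypothetical and do not reflect the real catalog schema.

```python
import json

# Example bundle names from this section mapped to illustrative category labels.
CATEGORY_BY_PACKAGE = {
    "rhel": "base_os",
    "nvidia_gpu_driver": "driver",
    "service_k8s": "functional",
    "slurm_custom": "functional",
    "additional_packages": "functional",
    "csi_driver_powerscale": "infrastructure",
}


def categorize(catalog_json: str) -> dict:
    """Group the package names in a catalog by category."""
    catalog = json.loads(catalog_json)
    packages = catalog.get("packages")  # Assumed top-level key.
    if not isinstance(packages, list):
        raise ValueError("catalog must contain a 'packages' list")

    grouped: dict = {}
    for name in packages:
        category = CATEGORY_BY_PACKAGE.get(name, "miscellaneous")
        grouped.setdefault(category, []).append(name)
    return grouped
```

A catalog containing `rhel`, `service_k8s`, and an unrecognized name would produce one entry each under `base_os`, `functional`, and `miscellaneous`.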

## Integration Points

- Feeds into local repository creation workflow
- Provides input for image building workflows
- Integrates with validation workflow for quality checks
- Uses Vault for secure access to package repositories

## Configuration

Catalog processing is configured through:
- Package mapping files
- Adapter policy configurations
- Validation rules and schemas