Overview
This issue proposes optional improvements to the environment configuration strategy to make the codebase more maintainable, safer, and less brittle when adding new environments.
Priority: Low (nice-to-have improvements)
Scope: Code quality, developer experience, safety
Current Issues
1. Hardcoded Environment Lists
Problem: Adding new environments requires finding and updating multiple hardcoded lists.
Current code:
# main.py:759
if ENVIRONMENT not in ["prod", "test"]: # Easy to forget to update
# Multiple workflow files with similar checks
if [ "${{ inputs.environment }}" == "prod" ]; then
elif [ "${{ inputs.environment }}" == "test" ]; then
Risk: When ci-test was added, it fell through to dev behavior because the list wasn't updated (see #216).
2. Unsafe Default Environment
Problem: ENVIRONMENT defaults to "prod" if not set.
Current code:
# main.py:64
ENVIRONMENT = os.getenv("ENVIRONMENT", "prod")
Risk: If someone forgets to set the environment variable, they could accidentally modify production resources.
3. No Environment Validation
Problem: Typos in environment names fail silently or cause unexpected behavior.
Example:
export ENVIRONMENT=pruduction # Typo!
# Falls through to dev behavior (local state)
Risk: Silent failures, or resources created in the wrong environment.
4. Backend File Manipulation at Runtime
Problem: Deleting backend.tf at runtime is fragile.
Current code:
# main.py:760
(TERRAFORM_DIR / "backend.tf").unlink(missing_ok=True)
Risk: If the process crashes during initialization, the working directory is left in an inconsistent state (backend.tf may or may not have been deleted).
5. Generic Lock Table Name
Problem: DynamoDB table name "lock-table" could conflict with other projects.
Current:
dynamodb_table = "lock-table"
Risk: If multiple projects share an AWS account, they can collide on the table name or unintentionally share the same lock table.
6. Duplicated Logic Across Workflows
Problem: Environment decision logic is duplicated in multiple places in lablink-images.yml.
Locations:
- Lines 69-98: Dockerfile selection
- Lines 104-113: Environment suffix
- Lines 219-225, 260, 271, 309-316, 352, 363: Verification conditionals
Risk: Easy to update one place but miss others (inconsistent behavior).
Proposed Improvements
Priority 1: Centralize Environment Configuration
Create packages/allocator/src/lablink_allocator_service/conf/environments.yaml:
environments:
  dev:
    backend_type: local
    description: "Local development"
  test:
    backend_type: s3
    description: "Automated staging environment"
  ci-test:
    backend_type: s3
    description: "Manual CI testing environment"
  prod:
    backend_type: s3
    description: "Production environment"
Benefits:
- Single source of truth for valid environments
- Easy to add new environments
- Self-documenting
Usage in code:
from .conf.environments import load_environment_config, VALID_ENVIRONMENTS

if ENVIRONMENT not in VALID_ENVIRONMENTS:
    raise ValueError(f"Invalid environment: {ENVIRONMENT}")

env_config = load_environment_config(ENVIRONMENT)
if env_config.backend_type == "s3":
    ...  # Use S3 backend
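One possible shape for the conf/environments module is sketched below. It is only an illustration: the EnvironmentConfig dataclass and the parse_environments helper are hypothetical names, and RAW_CONFIG stands in for the mapping a YAML parser would return after reading environments.yaml, so the sketch runs without a YAML dependency.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class EnvironmentConfig:
    """Settings for one deployment environment (hypothetical type)."""

    name: str
    backend_type: str
    description: str = ""


def parse_environments(raw: Mapping[str, Any]) -> Dict[str, EnvironmentConfig]:
    """Build one EnvironmentConfig per entry under the 'environments' key."""
    return {
        name: EnvironmentConfig(name=name, **settings)
        for name, settings in raw["environments"].items()
    }


# Assumption: stand-in for the parsed contents of environments.yaml.
RAW_CONFIG = {
    "environments": {
        "dev": {"backend_type": "local", "description": "Local development"},
        "test": {"backend_type": "s3", "description": "Automated staging environment"},
        "ci-test": {"backend_type": "s3", "description": "Manual CI testing environment"},
        "prod": {"backend_type": "s3", "description": "Production environment"},
    }
}

ENVIRONMENTS = parse_environments(RAW_CONFIG)
# Derived from the config, so adding an environment needs no code change.
VALID_ENVIRONMENTS = tuple(ENVIRONMENTS)


def load_environment_config(name: str) -> EnvironmentConfig:
    """Return the config for `name`, rejecting unknown environments."""
    try:
        return ENVIRONMENTS[name]
    except KeyError:
        raise ValueError(
            f"Invalid environment: {name}. "
            f"Must be one of: {', '.join(VALID_ENVIRONMENTS)}"
        ) from None
```

Because VALID_ENVIRONMENTS is derived from the config, callers never maintain their own list of environment names.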
Priority 2: Add Environment Validation
import os

VALID_ENVIRONMENTS = ["dev", "test", "ci-test", "prod"]
ENVIRONMENT = os.getenv("ENVIRONMENT", "dev")  # Safer default

if ENVIRONMENT not in VALID_ENVIRONMENTS:
    raise ValueError(
        f"Invalid ENVIRONMENT: {ENVIRONMENT}. "
        f"Must be one of: {', '.join(VALID_ENVIRONMENTS)}"
    )
Benefits:
- Fails fast on typos
- Clear error messages
- Prevents silent failures
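If the check is wrapped in a function that accepts the environment mapping as a parameter, it can be unit-tested without touching the real process environment. The sketch below assumes a hypothetical helper named resolve_environment; it is not existing code in main.py.

```python
import os
from typing import Mapping, Optional

VALID_ENVIRONMENTS = ("dev", "test", "ci-test", "prod")
DEFAULT_ENVIRONMENT = "dev"  # Safer default than "prod"


def resolve_environment(environ: Optional[Mapping[str, str]] = None) -> str:
    """Read ENVIRONMENT from `environ` (os.environ if omitted) and validate it."""
    if environ is None:
        environ = os.environ
    value = environ.get("ENVIRONMENT", DEFAULT_ENVIRONMENT)
    if value not in VALID_ENVIRONMENTS:
        # Fail fast instead of silently falling through to dev behavior.
        raise ValueError(
            f"Invalid ENVIRONMENT: {value}. "
            f"Must be one of: {', '.join(VALID_ENVIRONMENTS)}"
        )
    return value
```

With this shape, the typo from the example above (ENVIRONMENT=pruduction) raises a ValueError at startup rather than creating resources with local state.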
Priority 3: Improve DynamoDB Lock Table Naming
# backend-client-*.hcl files
dynamodb_table = "lablink-terraform-lock" # Or "tf-lock-lablink-<environment>"
Benefits:
- Avoids conflicts with other projects
- Clear ownership
- Better for multi-project AWS accounts
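If the per-environment naming variant is chosen, the table name could be derived in one place rather than typed into each backend file. A minimal sketch, assuming a hypothetical helper lock_table_name and the tf-lock-<project>-<environment> pattern suggested above; the regular expression reflects DynamoDB's table naming rules (3 to 255 characters from letters, digits, underscore, hyphen, and dot).

```python
import re

# DynamoDB table names: 3-255 characters from a-z, A-Z, 0-9, '_', '-', '.'.
_TABLE_NAME_RE = re.compile(r"[A-Za-z0-9_.-]{3,255}")


def lock_table_name(project: str, environment: str) -> str:
    """Return a project- and environment-scoped Terraform lock table name."""
    name = f"tf-lock-{project}-{environment}"
    if not _TABLE_NAME_RE.fullmatch(name):
        raise ValueError(f"Not a valid DynamoDB table name: {name!r}")
    return name


print(lock_table_name("lablink", "ci-test"))  # tf-lock-lablink-ci-test
```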
Priority 4: Add Resource Tagging
# In Terraform client VM creation
tags = {
  Project     = "lablink"
  Environment = var.environment
  ManagedBy   = "terraform"
  Purpose     = "compute"
}
Benefits:
- Easy cost tracking per environment
- Helps identify orphaned resources
- Better AWS resource management
Priority 5: Safer Backend Handling
Instead of deleting backend.tf, consider:
- Option A: Use the -backend=false flag for dev
- Option B: Use Terraform workspaces
- Option C: Keep separate backend files and symlink based on environment
Benefits:
- No runtime file manipulation
- More robust initialization
- Clearer intent
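As a rough illustration of Option A, the terraform init command line could be chosen from the environment's backend_type instead of deleting backend.tf. The helper terraform_init_args and the backend-client-test.hcl file name used below are hypothetical. Note that -backend=false only skips backend initialization, so whether later plan and apply steps behave as intended with a backend block still present would need to be verified before adopting this.

```python
from typing import List, Optional


def terraform_init_args(
    backend_type: str, backend_config: Optional[str] = None
) -> List[str]:
    """Build the `terraform init` command for the given backend type."""
    args = ["terraform", "init"]
    if backend_type == "local":
        # Skip backend initialization rather than removing backend.tf.
        args.append("-backend=false")
    elif backend_type == "s3":
        if backend_config is not None:
            # Partial backend configuration read from a separate file.
            args.append(f"-backend-config={backend_config}")
    else:
        raise ValueError(f"Unknown backend_type: {backend_type}")
    return args


# Hypothetical file name, for illustration only.
print(terraform_init_args("s3", "backend-client-test.hcl"))
print(terraform_init_args("local"))
```

The resulting list can be passed directly to subprocess.run, which keeps the decision in one function and leaves the files on disk untouched.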
Priority 6: Consolidate Workflow Logic
Create a workflow-level environment configuration or use a matrix strategy to reduce duplicated environment checks throughout the workflow file.
Benefits:
- Single source of truth
- Easier to update
- Less duplication
Implementation Approach
Phase 1 (Quick wins):
- Add environment validation
- Change default to "dev" instead of "prod"
- Update lock table name
Phase 2 (Architecture):
- Create environments.yaml config
- Refactor code to use centralized config
- Add resource tagging
Phase 3 (Workflow refactor):
- Consolidate workflow decision logic
- Update documentation
Non-Goals
- This does NOT propose changing the number of environments
- This does NOT propose separate S3 buckets per environment (the current single-bucket strategy is fine for manual workflows)
- This does NOT propose auto-cleanup or complex lifecycle management
Related
Acceptance Criteria