cat5inthecradle (Contributor) commented Jan 7, 2026

This PR lowers `min_num_instances` for the Mistral model in the Gen AI production CloudFormation stack from 3 to 2. The goal is to reduce the Mistral endpoint's baseline resource consumption while keeping autoscaling in place.
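As a rough illustration of what the lower floor changes, here is a minimal sketch of how an autoscaler typically clamps its desired instance count to the configured bounds. The function name `clamp_instance_count` and the `MAX_NUM_INSTANCES` value are hypothetical; this PR does not state the maximum, and the sketch is not taken from the stack's template.

```python
def clamp_instance_count(desired: int, min_num_instances: int, max_num_instances: int) -> int:
    """Clamp a desired instance count to the configured [min, max] range."""
    if min_num_instances > max_num_instances:
        raise ValueError("min_num_instances must not exceed max_num_instances")
    return max(min_num_instances, min(desired, max_num_instances))


# Hypothetical upper bound, chosen only for illustration.
MAX_NUM_INSTANCES = 6

if __name__ == "__main__":
    for desired in (0, 2, 5, 10):
        before = clamp_instance_count(desired, 3, MAX_NUM_INSTANCES)  # old minimum
        after = clamp_instance_count(desired, 2, MAX_NUM_INSTANCES)   # new minimum
        print(f"desired={desired}: before={before}, after={after}")
```

Under these assumptions, only the idle and low-traffic cases differ: the floor drops from 3 instances to 2, while behavior above the minimum is unchanged.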

This change must be deployed manually via `aws/cloudformation/standalone/gen_ai_curriculum/deployment/deploy_gen_ai_curriculum_stack`.

cat5inthecradle requested a review from a team as a code owner on January 7, 2026, 20:55

2 participants