AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Hyperstrate AI proxy — API + async job worker
Parameters:
  DatabaseDSN:
    Type: String
    # The SQLite default is suitable for local SAM testing only. In Lambda,
    # /tmp is private to each execution environment, so the API and worker
    # functions would not see the same database file. For a deployed stack,
    # either mount a shared EFS file system in both functions or use a real
    # shared database (RDS/Aurora) connection string.
    Default: "file:/tmp/hyperstrate.db?cache=shared&_fk=1"
    Description: >
      GORM data source name. For production, replace with a PostgreSQL/MySQL DSN
      pointing to RDS / Aurora Serverless so both Lambdas share the same DB.
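  # A production-oriented variant of this parameter might look like the sketch
  # below. This is an assumption, not the project's actual configuration: it
  # drops the SQLite default and sets NoEcho so the DSN, which carries
  # credentials, is masked in CloudFormation console and API output. The host,
  # user, and database names in the example DSN are placeholders. The value
  # would be supplied at deploy time, for instance through
  # `sam deploy --parameter-overrides`.
  #
  # DatabaseDSN:
  #   Type: String
  #   NoEcho: true
  #   Description: >
  #     GORM PostgreSQL DSN, for example
  #     "host=db.example.internal user=app password=CHANGE_ME dbname=hyperstrate port=5432 sslmode=require"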
Globals:
  Function:
    Runtime: provided.al2023
    Architectures:
      - x86_64
    Environment:
      Variables:
        APP_ENV: production
        DATABASE_DSN: !Ref DatabaseDSN
Resources:
  # ── SQS ──────────────────────────────────────────────────────────────────────
  JobDLQ:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: !Sub "${AWS::StackName}-job-dlq"
      MessageRetentionPeriod: 1209600 # 14 days
  JobQueue:
    Type: AWS::SQS::Queue
    Properties:
      QueueName: !Sub "${AWS::StackName}-job-queue"
      # VisibilityTimeout must be >= the worker Lambda timeout (900 s); 960 s
      # leaves a 60 s buffer. AWS's general guidance for Lambda-triggered
      # queues is at least 6x the function timeout, which leaves room for
      # retries when invocations are throttled.
      VisibilityTimeout: 960
      RedrivePolicy:
        deadLetterTargetArn: !GetAtt JobDLQ.Arn
        maxReceiveCount: 3
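  # Optional sketch (an assumption, not part of the original stack): a
  # CloudWatch alarm that fires as soon as any message is visible in the DLQ,
  # so jobs that exhausted maxReceiveCount do not go unnoticed. The logical ID
  # JobDLQAlarm is hypothetical, and no notification target is wired up here.
  #
  # JobDLQAlarm:
  #   Type: AWS::CloudWatch::Alarm
  #   Properties:
  #     AlarmDescription: Jobs have landed in the dead-letter queue
  #     Namespace: AWS/SQS
  #     MetricName: ApproximateNumberOfMessagesVisible
  #     Dimensions:
  #       - Name: QueueName
  #         Value: !GetAtt JobDLQ.QueueName
  #     Statistic: Maximum
  #     Period: 300
  #     EvaluationPeriods: 1
  #     Threshold: 0
  #     ComparisonOperator: GreaterThanThreshold
  #     TreatMissingData: notBreaching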
  # ── API Lambda ────────────────────────────────────────────────────────────────
  ApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: .
      # On provided.al2023 Lambda always executes the file named `bootstrap` in
      # the package root; Handler is only passed to the process as _HANDLER.
      Handler: bootstrap
      Timeout: 30
      MemorySize: 256
      Environment:
        Variables:
          SQS_QUEUE_URL: !Ref JobQueue
      Policies:
        - SQSSendMessagePolicy:
            QueueName: !GetAtt JobQueue.QueueName
      Events:
        ProxyApi:
          Type: HttpApi
          Properties:
            Path: /{proxy+}
            Method: ANY
        RootApi:
          Type: HttpApi
          Properties:
            Path: /
            Method: ANY
    Metadata:
      # Built by the `build-ApiFunction` target of the Makefile in CodeUri.
      BuildMethod: makefile
  # ── Worker Lambda ─────────────────────────────────────────────────────────────
  WorkerFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: .
      # Handler is informational on provided.al2023: the worker binary must
      # still be packaged as `bootstrap` by the `build-WorkerFunction` target.
      Handler: worker-bootstrap
      # Allow up to 15 minutes (the Lambda maximum) for long-running AI jobs
      # (Kling video gen, etc.).
      Timeout: 900
      MemorySize: 512
      Events:
        SQSTrigger:
          Type: SQS
          Properties:
            Queue: !GetAtt JobQueue.Arn
            # Process one job per invocation so a single slow job doesn't block others.
            BatchSize: 1
            # Report individual message failures so SQS retries only the failed item.
            FunctionResponseTypes:
              - ReportBatchItemFailures
    Metadata:
      BuildMethod: makefile
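  # Note on ReportBatchItemFailures. The worker code is not part of this file,
  # so this describes what Lambda expects rather than what the worker does.
  # The handler's return value lists the messages to retry, in the shape below
  # (shown as YAML; the handler returns the JSON equivalent). An empty list
  # marks the whole batch as processed, while an unhandled error or a timeout
  # fails every message in the batch.
  #
  #   batchItemFailures:
  #     - itemIdentifier: <messageId of the failed SQS record>
  #
  # If many slow jobs arrive at once, the SQS event can also cap how many
  # worker invocations run concurrently. A sketch of the trigger with an
  # assumed limit of 5 (the value is illustrative, not from the project):
  #
  #   SQSTrigger:
  #     Type: SQS
  #     Properties:
  #       Queue: !GetAtt JobQueue.Arn
  #       BatchSize: 1
  #       FunctionResponseTypes:
  #         - ReportBatchItemFailures
  #       ScalingConfig:
  #         MaximumConcurrency: 5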
Outputs:
  ApiUrl:
    Description: API Gateway endpoint URL
    Value: !Sub "https://${ServerlessHttpApi}.execute-api.${AWS::Region}.amazonaws.com"
  JobQueueURL:
    Description: URL of the SQS queue to which the API publishes job IDs
    Value: !Ref JobQueue
  JobDLQURL:
    Description: URL of the dead-letter queue for jobs that failed all retries
    Value: !Ref JobDLQ