BackupHighlights automates fetching sports highlights, stores data in S3 and DynamoDB, processes videos, and runs on a schedule using ECS Fargate and EventBridge. It uses templated JSON files with environment variable injection for easy configuration and deployment.
Before running the scripts, ensure you have the following:
- A Rapidapi.com account, which is needed to access highlight images and videos. This example uses NCAA (USA College Basketball) highlights because they are included for free in the basic plan. The endpoint we will be using is the Sports Highlights API.
- Docker, which should be pre-installed in most regions. Verify with:
docker --version
- AWS CLI, which is pre-installed in AWS CloudShell. Verify with:
aws --version
- Python 3, which should also be pre-installed. Verify with:
python3 --version
- The gettext package. It provides envsubst, a command-line utility used for environment variable substitution in shell scripts and text files.
- Your AWS Account ID. Once logged in to the AWS Management Console, click your account name in the top right corner to see your account ID. Copy it and save it somewhere safe, because you will need it to update the code in the labs later.
- An AWS access key. You can check whether you have one in the IAM dashboard: under Users, click a user, open "Security Credentials", and scroll down to the Access Key section. An existing secret access key cannot be retrieved, so if you have not saved yours, you need to create a new access key.
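The tool checks above can be bundled into one quick pass. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it only tests whether each command is on the PATH, not whether its version is suitable.

```shell
# Hypothetical helper: print each named tool that is not found on PATH.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Check the tools this guide relies on (envsubst comes from the gettext package).
missing=$(missing_tools docker aws python3 envsubst)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All prerequisite tools found."
fi
```

If a tool is reported as missing, install it before continuing with the steps below.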
git clone https://github.com/alahl1/SportsDataBackup
cd src
Search and replace the following values:
- Your-AWS-Account-ID
aws sts get-caller-identity --query "Account" --output text
- Your-RAPIDAPI-Key
- Your-AWS-Access-Key
- Your-AWS-Secret-Access-key
- S3_BUCKET_NAME=your-alias
- Your-MediaConvert-Endpoint
aws mediaconvert describe-endpoints
- SUBNET_ID=subnet-
- SECURITY_GROUP_ID=sg-
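The search and replace can also be scripted instead of done by hand. This is a minimal sketch under stated assumptions: the helper `replace_placeholder` is hypothetical, and it assumes the placeholders appear literally in a file such as `.env`. The value is escaped first, because AWS secret keys may contain characters that are special to sed.

```shell
# Hypothetical helper: replace every occurrence of a literal placeholder in a file.
# Usage: replace_placeholder FILE PLACEHOLDER VALUE
replace_placeholder() (
  file=$1; placeholder=$2; value=$3
  # Escape characters that are special in a sed replacement: &, \ and the | delimiter.
  escaped=$(printf '%s' "$value" | sed 's/[&|\\]/\\&/g')
  tmp=$(mktemp)
  sed "s|$placeholder|$escaped|g" "$file" > "$tmp" && mv "$tmp" "$file"
)

# Demonstration on a scratch file, so nothing in the repository is touched.
demo=$(mktemp)
printf 'AWS_ACCOUNT_ID=Your-AWS-Account-ID\n' > "$demo"
replace_placeholder "$demo" Your-AWS-Account-ID 123456789012
cat "$demo"   # prints AWS_ACCOUNT_ID=123456789012
rm -f "$demo"

# Against the real file, assuming it is named .env, this could look like:
# replace_placeholder .env Your-AWS-Account-ID "$(aws sts get-caller-identity --query Account --output text)"
```

Note that the placeholder is treated as a sed pattern, which is fine for the hyphenated names in this list but would need escaping for placeholders containing regex characters.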
Steps for SUBNET_ID and SECURITY_GROUP_ID:
- In the GitHub repo, open the resources folder and copy the entire contents of the vpc_setup.sh script.
- In AWS CloudShell or the VS Code terminal, create the file vpc_setup.sh and paste the script inside.
- Run the script:
bash vpc_setup.sh
- The output lists variables; paste their values into SUBNET_ID and SECURITY_GROUP_ID.
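A pasted ID that is truncated or still holds the bare `subnet-` or `sg-` placeholder only fails later, when the task is launched. The sketch below catches that earlier. The function names are hypothetical, and the patterns assume the usual AWS ID format of a prefix followed by 8 or 17 lowercase hexadecimal characters.

```shell
# Hypothetical helpers: succeed only if the argument looks like a complete ID.
is_subnet_id() {
  printf '%s\n' "$1" | grep -Eq '^subnet-([0-9a-f]{8}|[0-9a-f]{17})$'
}

is_security_group_id() {
  printf '%s\n' "$1" | grep -Eq '^sg-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Report on the values currently set in the shell (empty if not set yet).
if is_subnet_id "${SUBNET_ID:-}"; then
  echo "SUBNET_ID looks valid"
else
  echo "SUBNET_ID is missing or malformed"
fi

if is_security_group_id "${SECURITY_GROUP_ID:-}"; then
  echo "SECURITY_GROUP_ID looks valid"
else
  echo "SECURITY_GROUP_ID is missing or malformed"
fi
```

This only checks the shape of the values; it does not confirm that the subnet or security group exists in your account.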
- In the CLI, run the following command to create an on-demand DynamoDB table:
aws dynamodb create-table \
--table-name SportsHighlights \
--attribute-definitions AttributeName=id,AttributeType=S \
--key-schema AttributeName=id,KeyType=HASH \
--billing-mode PAY_PER_REQUEST
- Load the variables from the .env file into your environment
set -a
source .env
set +a
Optional - Verify the variables are loaded
echo $AWS_LOGS_GROUP
echo $TASK_FAMILY
echo $AWS_ACCOUNT_ID
- ECS Task Definition
envsubst < taskdef.template.json > taskdef.json
- S3/DynamoDB Policy
envsubst < s3_dynamodb_policy.template.json > s3_dynamodb_policy.json
- ECS Target
envsubst < ecsTarget.template.json > ecsTarget.json
- ECS Events Role Policy
envsubst < ecseventsrole-policy.template.json > ecseventsrole-policy.json
Optional - Open the generated files using cat or a text editor to confirm that all placeholders have been correctly replaced
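One thing to watch for when inspecting the generated files: envsubst replaces a variable that is not exported with an empty string, so a missing value shows up as a blank rather than as a leftover placeholder. The sketch below lists, for each template, any referenced variable that is unset or empty in the exported environment. The function names are hypothetical, and it only recognizes the `${VAR}` form, which is an assumption about how the templates are written.

```shell
# Hypothetical helper: list the distinct ${VAR} names referenced in a template.
template_vars() {
  grep -oE '\$\{[A-Za-z_][A-Za-z0-9_]*\}' "$1" | tr -d '${}' | sort -u
}

# Hypothetical helper: print each referenced variable that is unset or empty
# in the exported environment, which is what envsubst actually reads.
missing_vars() {
  for name in $(template_vars "$1"); do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
    fi
  done
}

# Check each template named in the steps above, skipping any that are absent.
for template in taskdef.template.json s3_dynamodb_policy.template.json \
                ecsTarget.template.json ecseventsrole-policy.template.json; do
  if [ -f "$template" ]; then
    missing=$(missing_vars "$template")
    if [ -n "$missing" ]; then
      echo "$template: unset variables:" $missing
    fi
  fi
done
```

Running this after loading the .env file with set -a and before envsubst makes it easier to spot a variable that was never exported.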
- Create an ECR Repo
aws ecr create-repository --repository-name sports-backup
- Log In To ECR
aws ecr get-login-password --region ${AWS_REGION} | docker login --username AWS --password-stdin ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com
- Build the Docker Image
docker build -t sports-backup .
- Tag the Image for ECR
docker tag sports-backup:latest ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/sports-backup:latest
- Push the Image
docker push ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/sports-backup:latest
- Register the ECS Task Definition
aws ecs register-task-definition --cli-input-json file://taskdef.json --region ${AWS_REGION}
- Create the CloudWatch Logs Group
aws logs create-log-group --log-group-name "${AWS_LOGS_GROUP}" --region ${AWS_REGION}
- Attach the S3/DynamoDB Policy to the ECS Task Execution Role
aws iam put-role-policy \
--role-name ecsTaskExecutionRole \
--policy-name S3DynamoDBAccessPolicy \
--policy-document file://s3_dynamodb_policy.json
- Set up the ECS Events Role
Create the Role with Trust Policy
aws iam create-role --role-name ecsEventsRole --assume-role-policy-document file://ecsEventsRole-trust.json
Attach the Events Role Policy
aws iam put-role-policy --role-name ecsEventsRole --policy-name ecsEventsPolicy --policy-document file://ecseventsrole-policy.json
- Create the Rule
aws events put-rule --name SportsBackupScheduleRule --schedule-expression "rate(1 day)" --region ${AWS_REGION}
- Add the Target
aws events put-targets --rule SportsBackupScheduleRule --targets file://ecsTarget.json --region ${AWS_REGION}
- Run the ECS task manually
aws ecs run-task \
--cluster sports-backup-cluster \
--launch-type FARGATE \
--task-definition ${TASK_FAMILY} \
--network-configuration "awsvpcConfiguration={subnets=[\"${SUBNET_ID}\"],securityGroups=[\"${SECURITY_GROUP_ID}\"],assignPublicIp=\"ENABLED\"}" \
--region ${AWS_REGION}
- Using templates to generate JSON files
- Integrating DynamoDB to store data backup
- CloudWatch for logging
- Integrate exporting a table from DynamoDB to an S3 bucket
- Configure an automated backup
- Creating batch processing of the entire JSON file (importing more than 10 videos)
