-
Notifications
You must be signed in to change notification settings - Fork 9
Expand file tree
/
Copy pathaccess-control-template.yml
More file actions
112 lines (108 loc) · 4.87 KB
/
access-control-template.yml
File metadata and controls
112 lines (108 loc) · 4.87 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Metadata:
AWS::ServerlessRepo::Application:
Name: ComprehendPiiAccessControlS3ObjectLambda
Description: Deploys a Lambda which will provide capability to control access to text files with PII (Personally Identifiable Information). This Lambda can be used as a s3 object lambda which will be triggered on get-object call when configured with access point
Author: AWS Comprehend
# SPDX License Id, e.g., MIT, MIT-0, Apache-2.0. See https://spdx.org/licenses for more details
SpdxLicenseId: MIT-0
LicenseUrl: LICENSE
ReadmeUrl: ACCESS_CONTROL_README.md
Labels: [serverless,comprehend,nlp,pii]
HomePageUrl: https://aws.amazon.com/comprehend/
SemanticVersion: 1.0.2
SourceCodeUrl: https://github.com/aws-samples/amazon-comprehend-s3-object-lambda-functions
Parameters:
LogLevel:
Type: String
Description: Log level for Lambda function logging, e.g., ERROR, INFO, DEBUG, etc.
Default: INFO
UnsupportedFileHandling:
Type: String
Description: Handling logic for Unsupported files. Valid values are PASS and FAIL.
Default: FAIL
IsPartialObjectSupported:
Type: String
Description: Whether to support partial objects or not. Accessing partial object through http headers such byte-range can corrupt the object and/or affect PII detection accuracy.
Default: FALSE
DocumentMaxSizeContainsPiiEntities:
Type: Number
Description: Maximum document size (in bytes) to be used for making calls to Comprehend's ContainsPiiEntities API.
Default: 50000
PiiEntityTypes:
Type: String
Description: List of comma separated PII entity types to be considered for access control. Refer Comprehend's documentation page for list of supported PII entity types.
Default: ALL
SubsegmentOverlappingTokens:
Type: Number
Description: Number of tokens/words to overlap among segments of a document in case chunking is needed because of maximum document size limit.
Default: 20
DocumentMaxSize:
Type: Number
Description: Default maximum document size (in bytes) that this function can process otherwise will throw exception for too large document size.
Default: 102400
ConfidenceThreshold:
Type: Number
Description: The minimum prediction confidence score above which PII classification and detection would be considered as final answer. Valid range (0.5 to 1.0).
Default: 0.5
MaxCharsOverlap:
Type: Number
Description: Maximum characters to overlap among segments of a document in case chunking is needed because of maximum document size limit.
Default: 200
DefaultLanguageCode:
Type: String
Description: Default language of the text to be processed. This code will be used for interacting with Comprehend.
Default: en
ContainsPiiEntitiesThreadCount:
Type: Number
Description: Number of threads to use for calling Comprehend's ContainsPiiEntities API. This controls the number of simultaneous calls that will be made from this Lambda.
Default: 20
PublishCloudWatchMetrics:
Type: String
Description: True if publish metrics to Cloudwatch, false otherwise. See README.md for details on CloudWatch metrics.
Default: True
Resources:
PiiAccessControlFunction:
Type: AWS::Serverless::Function
Properties:
CodeUri: src/
Handler: handler.pii_access_control_handler
Runtime: python3.8
Tracing: Active
Timeout: 60
Policies:
- Statement:
- Sid: ComprehendPiiDetectionPolicy
Effect: Allow
Action:
- comprehend:ContainsPiiEntities
Resource: '*'
- Sid: S3AccessPointCallbackPolicy
Effect: Allow
Action:
- s3-object-lambda:WriteGetObjectResponse
Resource: '*'
- Sid: CloudWatchMetricsPolicy
Effect: Allow
Action:
- cloudwatch:PutMetricData
Resource: '*'
Environment:
Variables:
LOG_LEVEL: !Ref LogLevel
UNSUPPORTED_FILE_HANDLING: !Ref UnsupportedFileHandling
IS_PARTIAL_OBJECT_SUPPORTED: !Ref IsPartialObjectSupported
DOCUMENT_MAX_SIZE_CONTAINS_PII_ENTITIES: !Ref DocumentMaxSizeContainsPiiEntities
PII_ENTITY_TYPES: !Ref PiiEntityTypes
SUBSEGMENT_OVERLAPPING_TOKENS: !Ref SubsegmentOverlappingTokens
DOCUMENT_MAX_SIZE: !Ref DocumentMaxSize
CONFIDENCE_THRESHOLD: !Ref ConfidenceThreshold
MAX_CHARS_OVERLAP: !Ref MaxCharsOverlap
DEFAULT_LANGUAGE_CODE: !Ref DefaultLanguageCode
CONTAINS_PII_ENTITIES_THREAD_COUNT: !Ref ContainsPiiEntitiesThreadCount
PUBLISH_CLOUD_WATCH_METRICS: !Ref PublishCloudWatchMetrics
Outputs:
PiiAccessControlFunctionName:
Description: "PII Access Control Function Name"
Value: !Ref PiiAccessControlFunction