chore(SECURITY.md): add more description regarding the UDF security model #4299
Open: bobbai00 wants to merge 9 commits into apache:main from bobbai00:claude/update-security-policy-nNITK
9 commits:
- 57e4adb Update SECURITY.md with expanded UDF security details and March 2026 … (claude)
- 5f8b662 Revert formatting and metadata changes per PR review feedback (claude)
- 1003c19 Restore trailing newline at end of SECURITY.md (claude)
- c178bc0 Merge branch 'main' into claude/update-security-policy-nNITK (bobbai00)
- 9683b5d Address review feedback from Yicong-Huang on SECURITY.md (claude)
- f2463ea Merge branch 'main' into claude/update-security-policy-nNITK (bobbai00)
- ea9593c Merge branch 'main' into claude/update-security-policy-nNITK (Yicong-Huang)
- 2af5a01 Merge branch 'main' into claude/update-security-policy-nNITK (chenlica)
- 6155c28 Refine UDF security tone: known limitation with mitigation guidance (claude)
@@ -86,6 +86,8 @@ account.
 - Network and firewall settings
 - Container orchestration

+**Important**: Texera's security model defines distinct roles with different privilege levels. However, REGULAR and ADMIN users can execute arbitrary code within computing units through User-Defined Functions (UDFs), which is a known limitation that can break the intended role boundaries. UDF code may access resources available in the execution environment, including environment variables, configuration values, and application state. Deployment managers are responsible for mitigating this by applying techniques such as sandboxing UDF execution and disallowing in-process (coordinator JVM) UDFs. See [Deployments and Computing Units](#deployments-and-computing-units) and [What is NOT a Security Issue](#what-is-not-a-security-issue) for more details.
+
 **Roles**: UI users are assigned one of four roles (INACTIVE, RESTRICTED, REGULAR, ADMIN) that control their permissions
 within the Texera application.

@@ -157,8 +159,9 @@ our [wiki](https://github.com/apache/texera/wiki/How-to-run-Texera-on-local-Kube
 ### Computing Unit Types

 Texera executes workflows on **computing units**. UI users (REGULAR and ADMIN) can execute arbitrary code (e.g., through
-UDFs written in Python, R, Scala) within computing units as part of their workflows. This code is currently not
-sandboxed or restricted by Texera. Deployment managers configure which types of computing units are available:
+UDFs written in Python, R, Java, Scala) within computing units as part of their workflows. UDF execution is a known limitation that can break the intended privilege boundaries between roles — UDF code may access resources available in the execution environment, such as environment variables, configuration values, and other application state. Deployment managers are responsible for mitigating this risk by applying techniques such as sandboxing UDF execution, disallowing in-process (coordinator JVM) UDFs, and ensuring that only trusted users are granted roles that permit code execution.
+
+Deployment managers configure which types of computing units are available:

 #### Local Computing Units

Review comment (Contributor, Author) on the added "UDFs written in Python, R, Java, Scala) ..." line: Replace "in-process UDFs" with "Java UDFs". Remove the "(coordinator JVM)".

@@ -174,6 +177,8 @@ Local computing units run as processes on the same machine as the Texera service
 **Security considerations**:

 - Users' workflow code executes on the host machine with limited isolation
+- UDF code is a known limitation that can break role boundaries — it may access application configuration and state in the execution environment
+- Deployment managers should mitigate this by sandboxing UDF execution or disallowing in-process (coordinator JVM) UDFs
 - Deployment managers must trust all REGULAR and ADMIN users
 - Resource exhaustion by one user can affect all users

@@ -194,6 +199,7 @@ a user needs it.
 - Better isolation between users compared to local computing units
 - Kubernetes provides namespace and pod-level isolation
 - Resource limits prevent individual users from consuming excessive resources
+- UDF code within a pod can still access resources available inside that pod's environment (e.g., environment variables, mounted secrets)
 - Container security and image scanning should be implemented
 - Deployment managers must secure the Kubernetes cluster infrastructure

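To make the added bullet about pod-visible resources concrete, here is a minimal Python sketch of what any code running inside a process can see of that process's environment. The function name `visible_secret_like_names` and the marker list are hypothetical and chosen only for illustration; this is plain Python, not Texera's UDF API, and the name-based matching is a rough heuristic rather than a real secret scanner.

```python
import os


def visible_secret_like_names(environ=None,
                              markers=("SECRET", "TOKEN", "PASSWORD", "KEY")):
    """Return sorted names of environment variables whose names hint at credentials.

    `environ` defaults to the current process environment, which is exactly
    what user code running in that process (or pod) would be able to read.
    The markers are a hypothetical heuristic, not an exhaustive list.
    """
    env = os.environ if environ is None else environ
    return sorted(
        name for name in env
        if any(marker in name.upper() for marker in markers)
    )
```

Calling `visible_secret_like_names()` with no arguments inspects the real process environment, so a deployment manager could run it inside a computing unit to get a rough idea of which variables user code would be able to read there.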
@@ -202,6 +208,7 @@ a user needs it.
 Texera's security model does NOT guarantee:

 - Protection against malicious code in user workflows (users can execute arbitrary code)
+- Isolation of application secrets from UDF code executing within the same process or pod
 - Strong isolation between workflows in local computing units
 - Complete isolation between workflows in Kubernetes computing units within the same namespace
 - Protection against infrastructure-level compromises
@@ -215,10 +222,11 @@ The following are **NOT considered security vulnerabilities** in Texera:

 ### User Code Execution

-REGULAR and ADMIN users can execute arbitrary code (Python, R, Scala) within computing units. This is by design - Texera
-is a data analytics platform where custom code execution is a core feature. The system currently does not sandbox user
-code beyond the isolation provided by the deployment environment (local processes or Kubernetes pods). Deployment
-managers should use resource limits, monitor usage, and restrict user roles appropriately.
+Texera's security model defines distinct user roles with different privilege levels. However, REGULAR and ADMIN users can execute arbitrary code (Python, R, Java, Scala) within computing units through UDFs. This is by design — Texera is a data analytics platform where custom code execution is a core feature.
+
+UDF execution is a known limitation that can break the intended privilege boundaries between roles. UDF code may access resources available in the execution environment, including application configuration, environment variables, and other process state. This is not considered a vulnerability, given that Texera's security model expects deployment managers to actively mitigate this risk.
+
+Deployment managers are responsible for making UDF execution more secure by applying techniques such as sandboxing UDF execution, disallowing in-process (coordinator JVM) UDFs, restricting user roles appropriately, and monitoring resource usage.

 ### Resource Consumption

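One of the mitigation directions named in this hunk is keeping user code out of the service's own process. The sketch below shows one narrow piece of that idea, assuming a deployment that launches user code as a separate Python child process with an allow-listed environment. The helper `run_with_scrubbed_env` and the `ALLOWED_ENV` tuple are hypothetical names, not part of Texera. Scrubbing the environment alone is not a sandbox: the child can still read files and use the network unless other controls are in place.

```python
import subprocess
import sys

# Hypothetical allow-list: only these variables are passed to the child process.
ALLOWED_ENV = ("PATH", "LANG")


def run_with_scrubbed_env(code, parent_env, allowed=ALLOWED_ENV, timeout=10):
    """Run `code` in a separate Python process that sees only allow-listed variables.

    Returns the child's standard output as text. This limits what the child
    can read from the environment; it does not restrict file or network access.
    """
    child_env = {k: v for k, v in parent_env.items() if k in allowed}
    result = subprocess.run(
        # -I runs the interpreter in isolated mode (ignores PYTHON* variables
        # and the user site directory).
        [sys.executable, "-I", "-c", code],
        env=child_env,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

In practice `parent_env` would typically be `dict(os.environ)`. An allow-list is used rather than a deny-list so that newly introduced secrets are excluded by default.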
@@ -262,7 +270,7 @@ lists and website.

 ---

-**Last Updated**: November 2025
+**Last Updated**: March 2026

 **Disclaimer**: This project is currently undergoing incubation at The Apache Software Foundation (ASF). Incubation is
 required of all newly accepted projects until a further review indicates that the infrastructure, communications, and
Review comment: Fix the what-is-not-a-security-issue link.