From 57e4adb6ca67fd467809861c0786bc8102480bc2 Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 17 Mar 2026 01:24:47 +0000 Subject: [PATCH 1/5] Update SECURITY.md with expanded UDF security details and March 2026 date - Add UDF security warning in UI Users section about unsandboxed code execution - Expand Computing Unit Types with detailed UDF access implications - Add Java to supported UDF languages throughout - Add "Isolation of application secrets" to NOT Guaranteed list - Expand User Code Execution section with known limitation details - Update Last Updated date to March 2026 https://claude.ai/code/session_01UnG9kQ8oHvtDXPMmouf7z5 --- SECURITY.md | 29 +++++++++++++++++++---------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index c293b53e971..895d3650d0c 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -26,7 +26,7 @@ Texera's security architecture is built around: In Texera, a **resource** is any object within the system that can be created, accessed, modified, or shared by users via the web application. Understanding resource types and how access to them is managed is critical to following -Texera’s security model. +Texera's security model. ### Resource Types @@ -78,7 +78,7 @@ account. **Who They Are**: Individuals who interact with Texera through the web interface. -**Access Level**: Application-level access only. UI users work within the Texera platform but do not have access to: +**Access Level**: UI users interact with Texera at the application level through the web interface. They do not have direct access to: - The underlying infrastructure (servers, Kubernetes cluster) - Database administration @@ -86,6 +86,8 @@ account. - Network and firewall settings - Container orchestration +**Important**: REGULAR and ADMIN users can execute arbitrary code within computing units through User-Defined Functions (UDFs). This UDF code is currently not sandboxed and runs within the same process or pod as Texera's computing engine. As a result, UDF code may have access to resources available in the execution environment, including environment variables, configuration values, and application state within the same process. Deployment managers should be aware of this and should only grant REGULAR or ADMIN roles to trusted individuals. See [Deployments and Computing Units](#deployments-and-computing-units) and [What is NOT a Security Issue](#what-is-not-a-security-issue) for more details. + **Roles**: UI users are assigned one of four roles (INACTIVE, RESTRICTED, REGULAR, ADMIN) that control their permissions within the Texera application. @@ -150,15 +152,17 @@ They cannot: deployment manager access is required. ## Deployments and Computing Units -Texera can be deployed in several configurations, such as local development, single-node setups, or distributed Kubernetes +Texera can be deployed in several configurations, such as local development, single-node setups, or distributed Kubernetes clusters. For details on supported deployment options and their operational differences, see the deployment guides in our [wiki](https://github.com/apache/texera/wiki/How-to-run-Texera-on-local-Kubernetes). ### Computing Unit Types Texera executes workflows on **computing units**. UI users (REGULAR and ADMIN) can execute arbitrary code (e.g., through -UDFs written in Python, R, Scala) within computing units as part of their workflows. This code is currently not -sandboxed or restricted by Texera. Deployment managers configure which types of computing units are available: +UDFs written in Python, R, Java, Scala) within computing units as part of their workflows. This code is currently not +sandboxed or restricted by Texera. As a result, UDF code may be able to access resources available in the execution environment, such as environment variables, JVM classpath entries, configuration values loaded into the process, and other application state. Deployment managers are responsible for configuring the execution environment to limit exposure of sensitive information and for ensuring that only trusted users are granted roles that permit code execution. + +Deployment managers configure which types of computing units are available: #### Local Computing Units @@ -174,6 +178,7 @@ Local computing units run as processes on the same machine as the Texera service **Security considerations**: - Users' workflow code executes on the host machine with limited isolation +- UDF code running in the same JVM process can potentially access application configuration and state available on the classpath - Deployment managers must trust all REGULAR and ADMIN users - Resource exhaustion by one user can affect all users @@ -194,6 +199,7 @@ a user needs it. - Better isolation between users compared to local computing units - Kubernetes provides namespace and pod-level isolation - Resource limits prevent individual users from consuming excessive resources +- UDF code within a pod can still access resources available inside that pod's environment (e.g., environment variables, mounted secrets) - Container security and image scanning should be implemented - Deployment managers must secure the Kubernetes cluster infrastructure @@ -202,6 +208,7 @@ a user needs it. Texera's security model does NOT guarantee: - Protection against malicious code in user workflows (users can execute arbitrary code) +- Isolation of application secrets (e.g., JWT keys, database credentials) from UDF code executing within the same process or pod - Strong isolation between workflows in local computing units - Complete isolation between workflows in Kubernetes computing units within the same namespace - Protection against infrastructure-level compromises @@ -215,10 +222,13 @@ The following are **NOT considered security vulnerabilities** in Texera: ### User Code Execution -REGULAR and ADMIN users can execute arbitrary code (Python, R, Scala) within computing units. This is by design - Texera +REGULAR and ADMIN users can execute arbitrary code (Python, R, Java, Scala) within computing units. This is by design - Texera is a data analytics platform where custom code execution is a core feature. The system currently does not sandbox user -code beyond the isolation provided by the deployment environment (local processes or Kubernetes pods). Deployment -managers should use resource limits, monitor usage, and restrict user roles appropriately. +code beyond the isolation provided by the deployment environment (local processes or Kubernetes pods). + +Because UDF code is not sandboxed, it may be able to access resources available in the execution environment, including but not limited to application configuration, environment variables, and JVM classpath entries. This includes the possibility of reading sensitive values such as JWT secrets or database credentials if they are accessible within the process. This is a known limitation, not a vulnerability, given Texera's security model, which requires deployment managers to grant code-execution roles only to trusted users. + +Deployment managers should use resource limits, monitor usage, restrict user roles appropriately, and consider isolating sensitive configuration from the execution environment where possible. ### Resource Consumption @@ -262,11 +272,10 @@ lists and website. --- -**Last Updated**: November 2025 +**Last Updated**: March 2026 **Disclaimer**: This project is currently undergoing incubation at The Apache Software Foundation (ASF). Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision-making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. - From 5f8b6622c85ebac8993e2d4c4f6d6521302fa629 Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 17 Mar 2026 02:11:25 +0000 Subject: [PATCH 2/5] Revert formatting and metadata changes per PR review feedback - Restore Unicode right single quotation mark in Resources section - Revert Access Level wording to original phrasing - Restore trailing space in Deployments section line - Revert Last Updated date back to November 2025 All substantive UDF security content additions are preserved. https://claude.ai/code/session_01UnG9kQ8oHvtDXPMmouf7z5 --- SECURITY.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index 895d3650d0c..970b71e0b41 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -26,7 +26,7 @@ Texera's security architecture is built around: In Texera, a **resource** is any object within the system that can be created, accessed, modified, or shared by users via the web application. Understanding resource types and how access to them is managed is critical to following -Texera's security model. +Texera’s security model. ### Resource Types @@ -78,7 +78,7 @@ account. **Who They Are**: Individuals who interact with Texera through the web interface. -**Access Level**: UI users interact with Texera at the application level through the web interface. They do not have direct access to: +**Access Level**: Application-level access only. UI users work within the Texera platform but do not have access to: - The underlying infrastructure (servers, Kubernetes cluster) - Database administration @@ -152,7 +152,7 @@ They cannot: deployment manager access is required. ## Deployments and Computing Units -Texera can be deployed in several configurations, such as local development, single-node setups, or distributed Kubernetes +Texera can be deployed in several configurations, such as local development, single-node setups, or distributed Kubernetes clusters. For details on supported deployment options and their operational differences, see the deployment guides in our [wiki](https://github.com/apache/texera/wiki/How-to-run-Texera-on-local-Kubernetes). @@ -272,7 +272,7 @@ lists and website. --- -**Last Updated**: March 2026 +**Last Updated**: November 2025 **Disclaimer**: This project is currently undergoing incubation at The Apache Software Foundation (ASF). Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and From 1003c199794a7905dbaa5423ca52e0b05cd4a7a4 Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 17 Mar 2026 02:55:57 +0000 Subject: [PATCH 3/5] Restore trailing newline at end of SECURITY.md Reverts the remaining change flagged in PR review comment on line 272. https://claude.ai/code/session_01UnG9kQ8oHvtDXPMmouf7z5 --- SECURITY.md | 1 + 1 file changed, 1 insertion(+) diff --git a/SECURITY.md b/SECURITY.md index 970b71e0b41..d3ae9d41e37 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -279,3 +279,4 @@ required of all newly accepted projects until a further review indicates that th decision-making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF. + From 9683b5dbb2d846a55eeda73c3f99f92bc2d2ae57 Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 17 Mar 2026 21:58:03 +0000 Subject: [PATCH 4/5] Address review feedback from Yicong-Huang on SECURITY.md - Update Last Updated date to March 2026 - Generalize JVM-specific references to runtime-agnostic language - Remove specific mentions of JWT secrets and database credentials https://claude.ai/code/session_01UnG9kQ8oHvtDXPMmouf7z5 --- SECURITY.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index d3ae9d41e37..2a38a4d4096 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -160,7 +160,7 @@ our [wiki](https://github.com/apache/texera/wiki/How-to-run-Texera-on-local-Kube Texera executes workflows on **computing units**. UI users (REGULAR and ADMIN) can execute arbitrary code (e.g., through UDFs written in Python, R, Java, Scala) within computing units as part of their workflows. This code is currently not -sandboxed or restricted by Texera. As a result, UDF code may be able to access resources available in the execution environment, such as environment variables, JVM classpath entries, configuration values loaded into the process, and other application state. Deployment managers are responsible for configuring the execution environment to limit exposure of sensitive information and for ensuring that only trusted users are granted roles that permit code execution. +sandboxed or restricted by Texera. As a result, UDF code may be able to access resources available in the execution environment, such as environment variables, runtime classpath entries, configuration values loaded into the process, and other application state. Deployment managers are responsible for configuring the execution environment to limit exposure of sensitive information and for ensuring that only trusted users are granted roles that permit code execution. Deployment managers configure which types of computing units are available: @@ -178,7 +178,7 @@ Local computing units run as processes on the same machine as the Texera service **Security considerations**: - Users' workflow code executes on the host machine with limited isolation -- UDF code running in the same JVM process can potentially access application configuration and state available on the classpath +- UDF code running in the same process can potentially access application configuration and state available in the execution environment - Deployment managers must trust all REGULAR and ADMIN users - Resource exhaustion by one user can affect all users @@ -208,7 +208,7 @@ a user needs it. Texera's security model does NOT guarantee: - Protection against malicious code in user workflows (users can execute arbitrary code) -- Isolation of application secrets (e.g., JWT keys, database credentials) from UDF code executing within the same process or pod +- Isolation of application secrets from UDF code executing within the same process or pod - Strong isolation between workflows in local computing units - Complete isolation between workflows in Kubernetes computing units within the same namespace - Protection against infrastructure-level compromises @@ -226,7 +226,7 @@ REGULAR and ADMIN users can execute arbitrary code (Python, R, Java, Scala) with is a data analytics platform where custom code execution is a core feature. The system currently does not sandbox user code beyond the isolation provided by the deployment environment (local processes or Kubernetes pods). -Because UDF code is not sandboxed, it may be able to access resources available in the execution environment, including but not limited to application configuration, environment variables, and JVM classpath entries. This includes the possibility of reading sensitive values such as JWT secrets or database credentials if they are accessible within the process. This is a known limitation, not a vulnerability, given Texera's security model, which requires deployment managers to grant code-execution roles only to trusted users. +Because UDF code is not sandboxed, it may be able to access resources available in the execution environment, including but not limited to application configuration, environment variables, and runtime classpath entries. This includes the possibility of reading sensitive values if they are accessible within the process. This is a known limitation, not a vulnerability, given Texera's security model, which requires deployment managers to grant code-execution roles only to trusted users. Deployment managers should use resource limits, monitor usage, restrict user roles appropriately, and consider isolating sensitive configuration from the execution environment where possible. @@ -272,7 +272,7 @@ lists and website. --- -**Last Updated**: November 2025 +**Last Updated**: March 2026 **Disclaimer**: This project is currently undergoing incubation at The Apache Software Foundation (ASF). Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and From 6155c280e6f364abe0408bd5a4a3f7b6a1da89a0 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 18 Mar 2026 18:19:53 +0000 Subject: [PATCH 5/5] Refine UDF security tone: known limitation with mitigation guidance Restructure all UDF-related sections to follow a consistent 3-point tone: 1. State that the security model defines distinct roles with different privileges 2. Acknowledge UDFs as a known limitation that can break role boundaries 3. State deployment managers' responsibility to mitigate via sandboxing, disallowing in-process (coordinator JVM) UDFs, and role restrictions https://claude.ai/code/session_01UnG9kQ8oHvtDXPMmouf7z5 --- SECURITY.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/SECURITY.md b/SECURITY.md index 2a38a4d4096..d391a0da3bc 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -86,7 +86,7 @@ account. - Network and firewall settings - Container orchestration -**Important**: REGULAR and ADMIN users can execute arbitrary code within computing units through User-Defined Functions (UDFs). This UDF code is currently not sandboxed and runs within the same process or pod as Texera's computing engine. As a result, UDF code may have access to resources available in the execution environment, including environment variables, configuration values, and application state within the same process. Deployment managers should be aware of this and should only grant REGULAR or ADMIN roles to trusted individuals. See [Deployments and Computing Units](#deployments-and-computing-units) and [What is NOT a Security Issue](#what-is-not-a-security-issue) for more details. +**Important**: Texera's security model defines distinct roles with different privilege levels. However, REGULAR and ADMIN users can execute arbitrary code within computing units through User-Defined Functions (UDFs), which is a known limitation that can break the intended role boundaries. UDF code may access resources available in the execution environment, including environment variables, configuration values, and application state. Deployment managers are responsible for mitigating this by applying techniques such as sandboxing UDF execution and disallowing in-process (coordinator JVM) UDFs. See [Deployments and Computing Units](#deployments-and-computing-units) and [What is NOT a Security Issue](#what-is-not-a-security-issue) for more details. **Roles**: UI users are assigned one of four roles (INACTIVE, RESTRICTED, REGULAR, ADMIN) that control their permissions within the Texera application. @@ -159,8 +159,7 @@ our [wiki](https://github.com/apache/texera/wiki/How-to-run-Texera-on-local-Kube ### Computing Unit Types Texera executes workflows on **computing units**. UI users (REGULAR and ADMIN) can execute arbitrary code (e.g., through -UDFs written in Python, R, Java, Scala) within computing units as part of their workflows. This code is currently not -sandboxed or restricted by Texera. As a result, UDF code may be able to access resources available in the execution environment, such as environment variables, runtime classpath entries, configuration values loaded into the process, and other application state. Deployment managers are responsible for configuring the execution environment to limit exposure of sensitive information and for ensuring that only trusted users are granted roles that permit code execution. +UDFs written in Python, R, Java, Scala) within computing units as part of their workflows. UDF execution is a known limitation that can break the intended privilege boundaries between roles — UDF code may access resources available in the execution environment, such as environment variables, configuration values, and other application state. Deployment managers are responsible for mitigating this risk by applying techniques such as sandboxing UDF execution, disallowing in-process (coordinator JVM) UDFs, and ensuring that only trusted users are granted roles that permit code execution. Deployment managers configure which types of computing units are available: @@ -178,7 +177,8 @@ Local computing units run as processes on the same machine as the Texera service **Security considerations**: - Users' workflow code executes on the host machine with limited isolation -- UDF code running in the same process can potentially access application configuration and state available in the execution environment +- UDF code is a known limitation that can break role boundaries — it may access application configuration and state in the execution environment +- Deployment managers should mitigate this by sandboxing UDF execution or disallowing in-process (coordinator JVM) UDFs - Deployment managers must trust all REGULAR and ADMIN users - Resource exhaustion by one user can affect all users @@ -222,13 +222,11 @@ The following are **NOT considered security vulnerabilities** in Texera: ### User Code Execution -REGULAR and ADMIN users can execute arbitrary code (Python, R, Java, Scala) within computing units. This is by design - Texera -is a data analytics platform where custom code execution is a core feature. The system currently does not sandbox user -code beyond the isolation provided by the deployment environment (local processes or Kubernetes pods). +Texera's security model defines distinct user roles with different privilege levels. However, REGULAR and ADMIN users can execute arbitrary code (Python, R, Java, Scala) within computing units through UDFs. This is by design — Texera is a data analytics platform where custom code execution is a core feature. -Because UDF code is not sandboxed, it may be able to access resources available in the execution environment, including but not limited to application configuration, environment variables, and runtime classpath entries. This includes the possibility of reading sensitive values if they are accessible within the process. This is a known limitation, not a vulnerability, given Texera's security model, which requires deployment managers to grant code-execution roles only to trusted users. +UDF execution is a known limitation that can break the intended privilege boundaries between roles. UDF code may access resources available in the execution environment, including application configuration, environment variables, and other process state. This is not considered a vulnerability, given that Texera's security model expects deployment managers to actively mitigate this risk. -Deployment managers should use resource limits, monitor usage, restrict user roles appropriately, and consider isolating sensitive configuration from the execution environment where possible. +Deployment managers are responsible for making UDF execution more secure by applying techniques such as sandboxing UDF execution, disallowing in-process (coordinator JVM) UDFs, restricting user roles appropriately, and monitoring resource usage. ### Resource Consumption