diff --git a/.codemapignore b/.codemapignore deleted file mode 100644 index 0d212e1db6..0000000000 --- a/.codemapignore +++ /dev/null @@ -1,6 +0,0 @@ -# -*- sh -*- -# gitignore-like file for Codemap (see https://github.com/aryx/codemap) -# This is useful to perform LOC stats on our different codebases -# by skipping autogenerated code and count only real LOC we wrote. - -/static diff --git a/mintlify-docs/LICENSE b/mintlify-docs/LICENSE new file mode 100644 index 0000000000..5411374274 --- /dev/null +++ b/mintlify-docs/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2023 Mintlify + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. 
\ No newline at end of file diff --git a/mintlify-docs/api-reference/Authentication.mdx b/mintlify-docs/api-reference/Authentication.mdx new file mode 100644 index 0000000000..2e9d29663a --- /dev/null +++ b/mintlify-docs/api-reference/Authentication.mdx @@ -0,0 +1,7 @@ +--- +title: "Authentication" +--- + +The API supports authentication with an API token with the "Web API" permission, without limited scopes of access. + +You can provision an API token [from the Settings page](https://semgrep.dev/orgs/-/settings/tokens). \ No newline at end of file diff --git a/mintlify-docs/api-reference/DeploymentsService.mdx b/mintlify-docs/api-reference/DeploymentsService.mdx new file mode 100644 index 0000000000..70c42bc2a5 --- /dev/null +++ b/mintlify-docs/api-reference/DeploymentsService.mdx @@ -0,0 +1,4 @@ +--- +title: "Deployment" +description: "Deployments encapsulate your organization's security organization, with multiple projects, policies, and integrations. As the root object of the organization, they're similarly the root object of the API." +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/FindingsService.mdx b/mintlify-docs/api-reference/FindingsService.mdx new file mode 100644 index 0000000000..c2f1118e5d --- /dev/null +++ b/mintlify-docs/api-reference/FindingsService.mdx @@ -0,0 +1,4 @@ +--- +title: "Code, Supply Chain, and AI-Powered Scan" +description: "Manage and retrieve code, supply chain, and AI-powered scan findings from Semgrep scans" +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/Introduction.mdx b/mintlify-docs/api-reference/Introduction.mdx new file mode 100644 index 0000000000..d138e3afcf --- /dev/null +++ b/mintlify-docs/api-reference/Introduction.mdx @@ -0,0 +1,18 @@ +--- +title: "Introduction" +description: "Welcome to Semgrep's portal for the Semgrep AppSec Platform web API." 
+--- + +Semgrep is a fast, open-source, static analysis tool for finding bugs and enforcing code standards at editor, commit, and CI time. [Get started.](https://semgrep.dev/getting-started/) + +Semgrep analyzes code locally on your computer or in your build environment: **code is never uploaded.** + +This API is documented in the **OpenAPI format**. + + +Download OpenAPI specification: + + + + + diff --git a/mintlify-docs/api-reference/MiscService.mdx b/mintlify-docs/api-reference/MiscService.mdx new file mode 100644 index 0000000000..f3244a79b8 --- /dev/null +++ b/mintlify-docs/api-reference/MiscService.mdx @@ -0,0 +1,4 @@ +--- +title: "Other" +description: "Utility endpoints." +--- diff --git a/mintlify-docs/api-reference/PoliciesService.mdx b/mintlify-docs/api-reference/PoliciesService.mdx new file mode 100644 index 0000000000..706772331b --- /dev/null +++ b/mintlify-docs/api-reference/PoliciesService.mdx @@ -0,0 +1,4 @@ +--- +title: "Policies" +description: "View and manage the Policies of your organization." +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/ScansService.mdx b/mintlify-docs/api-reference/ScansService.mdx new file mode 100644 index 0000000000..185c4fef02 --- /dev/null +++ b/mintlify-docs/api-reference/ScansService.mdx @@ -0,0 +1,4 @@ +--- +title: "Scans" +description: "View details of scans associated with projects in your organization." +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/SecretsService.mdx b/mintlify-docs/api-reference/SecretsService.mdx new file mode 100644 index 0000000000..e0c7cda1e5 --- /dev/null +++ b/mintlify-docs/api-reference/SecretsService.mdx @@ -0,0 +1,4 @@ +--- +title: "Secrets" +description: "View and manage the Secrets of your organization." 
+--- diff --git a/mintlify-docs/api-reference/SupplyChainService.mdx b/mintlify-docs/api-reference/SupplyChainService.mdx new file mode 100644 index 0000000000..79b96cf4aa --- /dev/null +++ b/mintlify-docs/api-reference/SupplyChainService.mdx @@ -0,0 +1,6 @@ +--- +title: "Supply Chain" +description: "Manage the Supply Chain findings and dependencies of your organization." +--- + +A request body is required, but may be an empty object. \ No newline at end of file diff --git a/mintlify-docs/api-reference/Terms-of-Use.mdx b/mintlify-docs/api-reference/Terms-of-Use.mdx new file mode 100644 index 0000000000..801c6415a7 --- /dev/null +++ b/mintlify-docs/api-reference/Terms-of-Use.mdx @@ -0,0 +1,5 @@ +--- +title: "Terms of Use" +--- + +Please note, the materials made available herein are subject to the [Semgrep Terms of Use](https://semgrep.dev/resources/website-terms/), and your access or use of any of the same is your acknowledgment and acceptance of such terms. \ No newline at end of file diff --git a/mintlify-docs/api-reference/TicketingService.mdx b/mintlify-docs/api-reference/TicketingService.mdx new file mode 100644 index 0000000000..f55f70497a --- /dev/null +++ b/mintlify-docs/api-reference/TicketingService.mdx @@ -0,0 +1,4 @@ +--- +title: "Ticketing" +description: "Create and manage external tickets" +--- diff --git a/mintlify-docs/api-reference/TriageService.mdx b/mintlify-docs/api-reference/TriageService.mdx new file mode 100644 index 0000000000..232c7b9812 --- /dev/null +++ b/mintlify-docs/api-reference/TriageService.mdx @@ -0,0 +1,4 @@ +--- +title: "Triage" +description: "View and manage the triage of your organization." 
+--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/deploymentsservice/list-deployments.mdx b/mintlify-docs/api-reference/deploymentsservice/list-deployments.mdx new file mode 100644 index 0000000000..2b0ceaa963 --- /dev/null +++ b/mintlify-docs/api-reference/deploymentsservice/list-deployments.mdx @@ -0,0 +1,4 @@ +--- +title: "List deployments" +openapi: get /api/v1/deployments +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/findingsservice/list-code-or-supply-chain-findings.mdx b/mintlify-docs/api-reference/findingsservice/list-code-or-supply-chain-findings.mdx new file mode 100644 index 0000000000..b913c989db --- /dev/null +++ b/mintlify-docs/api-reference/findingsservice/list-code-or-supply-chain-findings.mdx @@ -0,0 +1,4 @@ +--- +title: "List code, supply chain, or AI-powered scan findings" +openapi: get /api/v1/deployments/{deploymentSlug}/findings +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/miscservice/[beta]-get-sms-vpc-bootstrap-cloudformation-template.mdx b/mintlify-docs/api-reference/miscservice/[beta]-get-sms-vpc-bootstrap-cloudformation-template.mdx new file mode 100644 index 0000000000..ace4156376 --- /dev/null +++ b/mintlify-docs/api-reference/miscservice/[beta]-get-sms-vpc-bootstrap-cloudformation-template.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/bootstrap-sms-vpc +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/miscservice/ping.mdx b/mintlify-docs/api-reference/miscservice/ping.mdx new file mode 100644 index 0000000000..29d5fd4a25 --- /dev/null +++ b/mintlify-docs/api-reference/miscservice/ping.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/ping +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/policiesservice/list-policies.mdx b/mintlify-docs/api-reference/policiesservice/list-policies.mdx new file mode 100644 index 0000000000..da1276b469 --- /dev/null +++ 
b/mintlify-docs/api-reference/policiesservice/list-policies.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/deployments/{deploymentId}/policies +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/policiesservice/list-policy-rules.mdx b/mintlify-docs/api-reference/policiesservice/list-policy-rules.mdx new file mode 100644 index 0000000000..8118e32777 --- /dev/null +++ b/mintlify-docs/api-reference/policiesservice/list-policy-rules.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/deployments/{deploymentId}/policies/{policyId} +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/policiesservice/update-policy.mdx b/mintlify-docs/api-reference/policiesservice/update-policy.mdx new file mode 100644 index 0000000000..5f2da7ef7b --- /dev/null +++ b/mintlify-docs/api-reference/policiesservice/update-policy.mdx @@ -0,0 +1,3 @@ +--- +openapi: put /api/v1/deployments/{deploymentId}/policies/{policyId} +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/projectsservice/add-tags-to-project.mdx b/mintlify-docs/api-reference/projectsservice/add-tags-to-project.mdx new file mode 100644 index 0000000000..1eaf4bb2ee --- /dev/null +++ b/mintlify-docs/api-reference/projectsservice/add-tags-to-project.mdx @@ -0,0 +1,3 @@ +--- +openapi: put /api/v1/deployments/{deploymentSlug}/projects/{projectName}/tags +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/projectsservice/delete-project.mdx b/mintlify-docs/api-reference/projectsservice/delete-project.mdx new file mode 100644 index 0000000000..e54e5d61ae --- /dev/null +++ b/mintlify-docs/api-reference/projectsservice/delete-project.mdx @@ -0,0 +1,3 @@ +--- +openapi: delete /api/v1/deployments/{deploymentSlug}/projects/{projectName} +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/projectsservice/get-project-details.mdx b/mintlify-docs/api-reference/projectsservice/get-project-details.mdx new file mode 100644 index 
0000000000..fdf7c8ffd4 --- /dev/null +++ b/mintlify-docs/api-reference/projectsservice/get-project-details.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/deployments/{deploymentSlug}/projects/{projectName} +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/projectsservice/list-all-projects.mdx b/mintlify-docs/api-reference/projectsservice/list-all-projects.mdx new file mode 100644 index 0000000000..456b0a1d6d --- /dev/null +++ b/mintlify-docs/api-reference/projectsservice/list-all-projects.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/deployments/{deploymentSlug}/projects +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/projectsservice/remove-tags-from-project.mdx b/mintlify-docs/api-reference/projectsservice/remove-tags-from-project.mdx new file mode 100644 index 0000000000..9ea0689288 --- /dev/null +++ b/mintlify-docs/api-reference/projectsservice/remove-tags-from-project.mdx @@ -0,0 +1,3 @@ +--- +openapi: delete /api/v1/deployments/{deploymentSlug}/projects/{projectName}/tags +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/projectsservice/toggle-managed-scans-for-a-project.mdx b/mintlify-docs/api-reference/projectsservice/toggle-managed-scans-for-a-project.mdx new file mode 100644 index 0000000000..836ad45c31 --- /dev/null +++ b/mintlify-docs/api-reference/projectsservice/toggle-managed-scans-for-a-project.mdx @@ -0,0 +1,3 @@ +--- +openapi: patch /api/v1/deployments/{deploymentSlug}/projects/{projectName}/managed-scan +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/projectsservice/update-project-details.mdx b/mintlify-docs/api-reference/projectsservice/update-project-details.mdx new file mode 100644 index 0000000000..08cb8c77c0 --- /dev/null +++ b/mintlify-docs/api-reference/projectsservice/update-project-details.mdx @@ -0,0 +1,3 @@ +--- +openapi: patch /api/v1/deployments/{deploymentSlug}/projects/{projectName} +--- \ No newline at end of file diff --git 
a/mintlify-docs/api-reference/scansservice/get-scan-details.mdx b/mintlify-docs/api-reference/scansservice/get-scan-details.mdx new file mode 100644 index 0000000000..1fdc876ac4 --- /dev/null +++ b/mintlify-docs/api-reference/scansservice/get-scan-details.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/deployments/{deploymentId}/scan/{scanId} +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/scansservice/list-scans-beta.mdx b/mintlify-docs/api-reference/scansservice/list-scans-beta.mdx new file mode 100644 index 0000000000..b1e83e36ac --- /dev/null +++ b/mintlify-docs/api-reference/scansservice/list-scans-beta.mdx @@ -0,0 +1,3 @@ +--- +openapi: post /api/v1/deployments/{deploymentId}/scans/search +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/secretsservice/list-secrets.mdx b/mintlify-docs/api-reference/secretsservice/list-secrets.mdx new file mode 100644 index 0000000000..b6880982c5 --- /dev/null +++ b/mintlify-docs/api-reference/secretsservice/list-secrets.mdx @@ -0,0 +1,3 @@ +--- +openapi: get /api/v1/deployments/{deploymentId}/secrets +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/supplychainservice/create-a-new-sbom-export-job.mdx b/mintlify-docs/api-reference/supplychainservice/create-a-new-sbom-export-job.mdx new file mode 100644 index 0000000000..b48150086b --- /dev/null +++ b/mintlify-docs/api-reference/supplychainservice/create-a-new-sbom-export-job.mdx @@ -0,0 +1,3 @@ +--- +openapi: post /api/v1/deployments/{deploymentId}/sbom/export +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/supplychainservice/get-the-status-of-a-sbom-export-job.mdx b/mintlify-docs/api-reference/supplychainservice/get-the-status-of-a-sbom-export-job.mdx new file mode 100644 index 0000000000..0393e5ba28 --- /dev/null +++ b/mintlify-docs/api-reference/supplychainservice/get-the-status-of-a-sbom-export-job.mdx @@ -0,0 +1,3 @@ +--- +openapi: get 
/api/v1/deployments/{deploymentId}/sbom/export/{taskToken} +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/supplychainservice/list-dependencies.mdx b/mintlify-docs/api-reference/supplychainservice/list-dependencies.mdx new file mode 100644 index 0000000000..fe10e5f4d7 --- /dev/null +++ b/mintlify-docs/api-reference/supplychainservice/list-dependencies.mdx @@ -0,0 +1,3 @@ +--- +openapi: post /api/v1/deployments/{deploymentId}/dependencies +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/supplychainservice/list-lockfiles-in-a-given-repository-with-dependencies.mdx b/mintlify-docs/api-reference/supplychainservice/list-lockfiles-in-a-given-repository-with-dependencies.mdx new file mode 100644 index 0000000000..39b500be2e --- /dev/null +++ b/mintlify-docs/api-reference/supplychainservice/list-lockfiles-in-a-given-repository-with-dependencies.mdx @@ -0,0 +1,3 @@ +--- +openapi: post /api/v1/deployments/{deploymentId}/dependencies/repositories/{repositoryId}/lockfiles +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/supplychainservice/list-repositories-with-dependencies.mdx b/mintlify-docs/api-reference/supplychainservice/list-repositories-with-dependencies.mdx new file mode 100644 index 0000000000..071f10a7b4 --- /dev/null +++ b/mintlify-docs/api-reference/supplychainservice/list-repositories-with-dependencies.mdx @@ -0,0 +1,3 @@ +--- +openapi: post /api/v1/deployments/{deploymentId}/dependencies/repositories +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/ticketingservice/create-jira-tickets.mdx b/mintlify-docs/api-reference/ticketingservice/create-jira-tickets.mdx new file mode 100644 index 0000000000..fa4e40f42d --- /dev/null +++ b/mintlify-docs/api-reference/ticketingservice/create-jira-tickets.mdx @@ -0,0 +1,3 @@ +--- +openapi: post /api/v1/deployments/{deploymentSlug}/tickets +--- \ No newline at end of file diff --git 
a/mintlify-docs/api-reference/ticketingservice/unlink-a-jira-ticket.mdx b/mintlify-docs/api-reference/ticketingservice/unlink-a-jira-ticket.mdx new file mode 100644 index 0000000000..9778e152f6 --- /dev/null +++ b/mintlify-docs/api-reference/ticketingservice/unlink-a-jira-ticket.mdx @@ -0,0 +1,3 @@ +--- +openapi: delete /api/v1/deployments/{deploymentId}/ticketing/v2/tickets/{externalTicketId} +--- \ No newline at end of file diff --git a/mintlify-docs/api-reference/triageservice/bulk-triage.mdx b/mintlify-docs/api-reference/triageservice/bulk-triage.mdx new file mode 100644 index 0000000000..f0920b259d --- /dev/null +++ b/mintlify-docs/api-reference/triageservice/bulk-triage.mdx @@ -0,0 +1,3 @@ +--- +openapi: post /api/v1/deployments/{deploymentSlug}/triage +--- \ No newline at end of file diff --git a/mintlify-docs/category/bitbucket-pr-comments.mdx b/mintlify-docs/category/bitbucket-pr-comments.mdx new file mode 100644 index 0000000000..586ded3ace --- /dev/null +++ b/mintlify-docs/category/bitbucket-pr-comments.mdx @@ -0,0 +1,13 @@ +--- +title: "Bitbucket PR comments" +--- + + + + Enable PR comments in your Bitbucket Cloud repositories to display Semgrep findings to developers. + + + + Enable PR comments in your Bitbucket Data Center repositories to display Semgrep findings to developers. + + \ No newline at end of file diff --git a/mintlify-docs/category/ci-references-1.mdx b/mintlify-docs/category/ci-references-1.mdx new file mode 100644 index 0000000000..fae970452c --- /dev/null +++ b/mintlify-docs/category/ci-references-1.mdx @@ -0,0 +1,21 @@ +--- +title: "CI references" +--- + + + + Configure Semgrep in CI by setting various environment variables. Enable diff-aware scanning, connect to Semgrep AppSec Platform, and more. + + + + View sample configuration files to run Semgrep with various CI/CD providers such as GitHub, GitLab, Jenkins, Buildkite, CircleCI, and more. + + + + Learn how Semgrep Pro tracks findings and triage states in CI pipelines. 
+ + + + Packages included in the latest Semgrep docker image. + + \ No newline at end of file diff --git a/mintlify-docs/category/ci-references.mdx b/mintlify-docs/category/ci-references.mdx new file mode 100644 index 0000000000..de7b2346b7 --- /dev/null +++ b/mintlify-docs/category/ci-references.mdx @@ -0,0 +1,21 @@ +--- +title: "CI references" +--- + + + + Configure Semgrep in CI by setting various environment variables. Enable diff-aware scanning, connect to Semgrep AppSec Platform, and more. + + + + View sample configuration files to run Semgrep with various CI/CD providers such as GitHub, GitLab, Jenkins, Buildkite, CircleCI, and more. + + + + Learn how Semgrep Pro tracks findings and triage states in CI pipelines. + + + + Packages included in the latest Semgrep docker image. + + \ No newline at end of file diff --git a/mintlify-docs/category/deployment-at-scale.mdx b/mintlify-docs/category/deployment-at-scale.mdx new file mode 100644 index 0000000000..15d3f700b8 --- /dev/null +++ b/mintlify-docs/category/deployment-at-scale.mdx @@ -0,0 +1,21 @@ +--- +title: "Deployment at scale" +--- + + + + 1 item + + + + Manage tokens used to authorize requests to Semgrep AppSec Platform and API. + + + + Guidelines on how to add or remove tags through Semgrep AppSec Platform and semgrepconfig.yml file. + + + + Learn how to set up the Semgrep Network Broker, which facilitates secure access between Semgrep and your private network. + + \ No newline at end of file diff --git a/mintlify-docs/category/glossaries-1.mdx b/mintlify-docs/category/glossaries-1.mdx new file mode 100644 index 0000000000..5f9b6cd97d --- /dev/null +++ b/mintlify-docs/category/glossaries-1.mdx @@ -0,0 +1,12 @@ +--- +title: "Glossaries" +--- + + + +Definitions of Semgrep Code product-specific terms. + + +Definitions of Semgrep Supply Chain and software composition analysis (SCA) terms. 
+ + \ No newline at end of file diff --git a/mintlify-docs/category/glossaries.mdx b/mintlify-docs/category/glossaries.mdx new file mode 100644 index 0000000000..023b8b0251 --- /dev/null +++ b/mintlify-docs/category/glossaries.mdx @@ -0,0 +1,13 @@ +--- +title: "Glossaries" +--- + + + + Definitions of Semgrep Code product-specific terms. + + + + Definitions of Semgrep Supply Chain and software composition analysis (SCA) terms. + + \ No newline at end of file diff --git a/mintlify-docs/category/go.mdx b/mintlify-docs/category/go.mdx new file mode 100644 index 0000000000..3b96a378ec --- /dev/null +++ b/mintlify-docs/category/go.mdx @@ -0,0 +1,15 @@ +--- +title: "Go" +description: "Security guides and cheatsheets for the Go programming language and related frameworks." +--- + + + + + Cheat sheet for the prevention of Command Injection vulnerabilities for Go. + + + + Cheat sheet for the prevention of Cross-site Scripting (XSS) vulnerabilities for Go and net/http. + + \ No newline at end of file diff --git a/mintlify-docs/category/java.mdx b/mintlify-docs/category/java.mdx new file mode 100644 index 0000000000..5eb408017a --- /dev/null +++ b/mintlify-docs/category/java.mdx @@ -0,0 +1,22 @@ +--- +title: "Java" +description: "Security guides and cheatsheets for the Java programming language and related frameworks." +--- + + + + Cheat sheet for the prevention of Code Injection vulnerabilities for Java. + + + + Cheat sheet for the prevention of Command Injection vulnerabilities for Java. + + + + Cheat sheet for the prevention of Cross-site Scripting (XSS) vulnerabilities for Java and Java Server Pages (JSP). + + + + Cheat sheet for the prevention of XML External Entity (XXE) vulnerabilities for Java. 
+ + \ No newline at end of file diff --git a/mintlify-docs/category/javascript.mdx b/mintlify-docs/category/javascript.mdx new file mode 100644 index 0000000000..670e2534be --- /dev/null +++ b/mintlify-docs/category/javascript.mdx @@ -0,0 +1,18 @@ +--- +title: "JavaScript" +description: "Security guides and cheatsheets for the JavaScript programming language, Node and related frameworks." +--- + + + + Cheat sheet for the prevention of Code Injection vulnerabilities for JavaScript. + + + + Cheat sheet for the prevention of Command Injection vulnerabilities for JavaScript. + + + + Cheat sheet for the prevention of Cross-site Scripting (XSS) vulnerabilities for ExpressJS. + + \ No newline at end of file diff --git a/mintlify-docs/category/language-reference.mdx b/mintlify-docs/category/language-reference.mdx new file mode 100644 index 0000000000..ee90444fda --- /dev/null +++ b/mintlify-docs/category/language-reference.mdx @@ -0,0 +1,16 @@ +--- +title: "Language reference" +sidebarTitle: "Language reference" +--- + + + + Definitions for language maturity levels across Semgrep products. + + + Definitions for Semgrep Code and Supply Chain analysis features. + + + Proprietary Semgrep features for the Java language that can increase true positives and reduce false positives. + + diff --git a/mintlify-docs/category/language-specific-features.mdx b/mintlify-docs/category/language-specific-features.mdx new file mode 100644 index 0000000000..c75ba35ea5 --- /dev/null +++ b/mintlify-docs/category/language-specific-features.mdx @@ -0,0 +1,9 @@ +--- +title: "Language-specific features" +--- + + + + Proprietary Semgrep features for the Java language that can increase true positives and reduce false positives. 
+ + \ No newline at end of file diff --git a/mintlify-docs/category/local-and-cli-scans.mdx b/mintlify-docs/category/local-and-cli-scans.mdx new file mode 100644 index 0000000000..4faea5b9d8 --- /dev/null +++ b/mintlify-docs/category/local-and-cli-scans.mdx @@ -0,0 +1,25 @@ +--- +title: "Local and CLI scans" +--- + + + + Learn how to set up Semgrep, scan your first project for security issues, and view your findings in the CLI. + + + + Learn how to use local Semgrep rules in your scans. + + + + Update Semgrep by running the correct commands for your environment or operating system. + + + + Send your local scans to Semgrep AppSec Platform to view and track your findings. + + + + Get more information when Semgrep hangs, crashes, times out, or runs very slowly. + + \ No newline at end of file diff --git a/mintlify-docs/category/pr-or-mr-comments.mdx b/mintlify-docs/category/pr-or-mr-comments.mdx new file mode 100644 index 0000000000..8e20e6ef24 --- /dev/null +++ b/mintlify-docs/category/pr-or-mr-comments.mdx @@ -0,0 +1,21 @@ +--- +title: "PR or MR comments" +--- + + + + Enable PR comments in your Azure DevOps repositories to display Semgrep findings to developers. + + + + Enable pull request (PR) comments in your GitHub repositories to display Semgrep findings to developers. + + + + Enable merge request (MR) comments in your GitLab repositories to display Semgrep findings to developers. + + + + 2 items + + \ No newline at end of file diff --git a/mintlify-docs/category/python.mdx b/mintlify-docs/category/python.mdx new file mode 100644 index 0000000000..e859592005 --- /dev/null +++ b/mintlify-docs/category/python.mdx @@ -0,0 +1,28 @@ +--- +title: "Python" +description: "Security guides and cheatsheets for the Python programming language and related frameworks." +--- + + + + + + Cheat sheet for the prevention of Code Injection vulnerabilities for Python. + + + + Cheat sheet for the prevention of Command Injection vulnerabilities for Python. 
+ + + + Cheat sheet for the prevention of Cross-site Scripting (XSS) vulnerabilities for Python and Django. + + + + Cheat sheet for the prevention of Cross-site Scripting (XSS) vulnerabilities for Python and Flask. + + + + Learn about Insecure Deserialization vulnerabilities for Python + + \ No newline at end of file diff --git a/mintlify-docs/category/ruby.mdx b/mintlify-docs/category/ruby.mdx new file mode 100644 index 0000000000..36259f64f3 --- /dev/null +++ b/mintlify-docs/category/ruby.mdx @@ -0,0 +1,18 @@ +--- +title: "Ruby" +description: "Security guides and cheatsheets for the Ruby programming language and related frameworks." +--- + + + + Cheat sheet for the prevention of Code Injection vulnerabilities for Ruby. + + + + Cheat sheet for the prevention of Command Injection vulnerabilities for Ruby. + + + + Cheat sheet for the prevention of Cross-site Scripting (XSS) vulnerabilities for Ruby on Rails. + + \ No newline at end of file diff --git a/mintlify-docs/category/scan-repositories-with-the-appsec-platform.mdx b/mintlify-docs/category/scan-repositories-with-the-appsec-platform.mdx new file mode 100644 index 0000000000..644006bd3d --- /dev/null +++ b/mintlify-docs/category/scan-repositories-with-the-appsec-platform.mdx @@ -0,0 +1,45 @@ +--- +title: "Scan repositories with the AppSec Platform" +--- + + + + 4 items + + + + 1 item + + + + Set up your CI pipeline with Semgrep AppSec Platform for centralized rule and findings management. + + + + Set up your CI pipeline manually with Semgrep AppSec Platform for centralized rule and findings management. + + + + Customize your CI job to fit your organization's workflows. + + + + Configure how Semgrep in CI pipelines handles errors and blocks findings. + + + + 1 item + + + + View projects, detailed logs, and information for any scan. + + + + Set your primary or default branch to ensure Semgrep full scans display accurate counts and deduplicated findings. 
+ + + + Not seeing what you expect in Semgrep AppSec Platform? Follow these troubleshooting steps or find out how to get one-on-one help. + + \ No newline at end of file diff --git a/mintlify-docs/cheat-sheets/django-xss.mdx b/mintlify-docs/cheat-sheets/django-xss.mdx new file mode 100644 index 0000000000..ff62ec82dd --- /dev/null +++ b/mintlify-docs/cheat-sheets/django-xss.mdx @@ -0,0 +1,359 @@ +--- +title: "Prevent XSS in Django" +sidebarTitle: "XSS in Django" +--- + +This is a cross-site scripting (XSS) prevention cheat sheet by Semgrep, Inc. It contains code patterns of potential XSS in an application. Instead of scrutinizing code for exploitable vulnerabilities, the recommendations in this cheat sheet pave a safe road for developers that mitigates the possibility of XSS in your code. By following these recommendations, you can be reasonably sure your code is free of XSS. + + +Learn more about [Cross-site Scripting](/learn/vulnerabilities/cross-site-scripting) vulnerability concepts. + +## Mitigation summary + +In general, always use the template engine provided by Django using `render()`. If you need to bypass HTML escaping, use `mark_safe()` combined with `format_html()` and review each individual usage carefully. Once reviewed, mark with `# nosem`. Beware of putting data in dangerous locations in templates. And as always, run a security checker continuously on your code. + +Semgrep ruleset for this cheatsheet: [https://semgrep.dev/p/minusworld.django-xss](https://semgrep.dev/p/minusworld.django-xss) + +### Check your project using Semgrep + +```bash +semgrep --config p/minusworld.django-xss +``` + +## 1. Server code: Marking "safe" content, which does not escape HTML + +### 1.A. Using **mark_safe()** + +`mark_safe()` marks the returned content as "safe to render." This instructs the template engine to bypass HTML escaping, creating the possibility of a XSS vulnerability. 
+ +Example: + +```python +mark_safe(html_content) +``` + +#### References: + +- [`mark_safe()` documentation](https://docs.djangoproject.com/en/3.1/ref/utils/#django.utils.safestring.mark_safe) +- [Bandit Check B703 - Django `mark_safe()`](https://bandit.readthedocs.io/en/latest/plugins/b703_django_mark_safe.html) +- [`format_html()` documentation](https://docs.djangoproject.com/en/3.0/ref/utils/#django.utils.html.format_html) + + +#### Mitigation + +Ban `mark_safe()`. Alternatively, if needed, use in combination with `format_html()` and review each usage carefully. Create an exemption with `# nosem`. + +#### Semgrep rule + +[`python.django.security.audit.avoid-mark-safe.avoid-mark-safe`](https://semgrep.dev/r/python.django.security.audit.avoid-mark-safe.avoid-mark-safe) + +### 1.B. Using the **SafeString** class directly + +The `SafeString` class is how Django determines which variables should be escaped and which should not. Elements passed to `mark_safe()` are returned as a `SafeString`. Invoking `SafeString` directly will bypass HTML escaping, which could create a XSS vulnerability. + +Example: +```python +SafeString(f"
{request.POST.get('name')}
") +``` + +#### References: + +- [Filters and auto-escaping in Django](https://docs.djangoproject.com/en/3.1/howto/custom-template-tags/#filters-and-auto-escaping) +- [`SafeString` documentation](https://docs.djangoproject.com/en/3.1/ref/utils/#django.utils.safestring.SafeString) + +#### Mitigation + +Ban `SafeString()`. Alternatively, prefer `mark_safe()` if necessary. + +### 1.C. Registering a custom filter with **is_safe=True** + + +Registering a filter with `is_safe=True` indicates to Django that the filter absolutely does not introduce any unsafe HTML characters. The value returned from the filter will be marked as "safe" when the input is also marked "safe". Generally, this is acceptable, but if you cannot be certain the filter is safe, it may introduce a XSS vulnerability. + +Example: + +```python +@register.filter(is_safe=True) +def myfilter(value): + return value +``` + +#### References: + +- [Custom filters and auto-escaping](https://docs.djangoproject.com/en/3.1/howto/custom-template-tags/#filters-and-auto-escaping) + + +#### Mitigation + +Do not mark filters with `is_safe=True`. Alternatively, prefer `mark_safe()` if necessary. + +#### Semgrep rule + +python.django.security.audit.xss.filter-with-is-safe + + + +### 1.D. Use of the **__html__** magic method in a class + +The `__html__` magic method is used by the Django template engine to determine whether the object should be escaped. If available, the value returned by the method will not be escaped and could introduce a XSS vulnerability. + +Example: + +```python +class RawHtml(str): + def __html__(self): + return str(self) +``` + +#### References: + +- [`conditional_escape()` documentation](https://docs.djangoproject.com/en/3.0/ref/utils/#django.utils.html.conditional_escape) +- [`conditional_escape()` source code](https://docs.djangoproject.com/en/3.0/_modules/django/utils/html/#conditional_escape) + +#### Mitigation + +Ban `__html__` in classes. Alternatively, prefer `mark_safe()` if necessary. 
+ +#### Semgrep rule + +[`python.django.security.audit.xss.html-magic-method.html-magic-method`](https://semgrep.dev/r/python.django.security.audit.xss.html-magic-method.html-magic-method) + + +### 1.E. Using **html_safe()** + +The `html_safe()` decorator adds the `__html__` magic method to the supplied class. The added `__html__` magic method returns the exact string representation of the class (for example `str(self)`). Because objects with the `__html__` method are not escaped, this could create a XSS vulnerability. + +Example: + +```python +@html_safe +class RawHtml(str): + pass +``` + +#### References: + +- [`html_safe()` documentation](https://docs.djangoproject.com/en/3.0/ref/utils/#django.utils.html.html_safe) + +#### Mitigation: + +Ban `html_safe()`. Alternatively, prefer `mark_safe()` if necessary. + +#### Semgrep rule + +[`python.django.security.audit.xss.html-safe.html-safe`](https://semgrep.dev/r/python.django.security.audit.xss.html-safe.html-safe) + +## 2. Server code: Bypassing the template engine + + +### 2.A. Directly writing a response using **HttpResponse** or similar classes + + +Writing results directly to `HttpResponse` or similar classes bypasses the Django template engine. This also bypasses the HTML escaping built into the template engine and creates the possibility of a XSS vulnerability. Use `render()` with a template instead. + +Example: + +```python +return HttpResponse("Hello, " + name) +``` + +#### References: + +- [Django Book - Security: XSS](https://django-book.readthedocs.io/en/latest/chapter20.html#cross-site-scripting-xss) +- [Example of XSS via `HttpResponseBadRequest`](https://semgrep.dev/blog/2020/be-careful-what-you-request-for-django-method/) +- [HttpResponse subclasses](https://docs.djangoproject.com/en/3.1/ref/request-response/#httpresponse-subclasses) + + +#### Mitigation: + +Ban `HttpResponse` and similar classes. Alternatively, use `render()`. 
+ +#### Semgrep rule + +[`python.django.security.audit.xss.direct-use-of-httpresponse`](https://semgrep.dev/r/python.django.security.audit.xss.direct-use-of-httpresponse) + +### 2.B. Globally disabling autoescape + +Autoescaping can be globally disabled in Django settings. This should never be done if you are rendering HTML; now, every response returned to the user will need to be audited to ensure it is free of XSS vulnerabilities. + +Example: + +```python +TEMPLATES = [ + { + ..., + 'OPTIONS': {'autoescape': False} + } +] +``` + +#### References: + +- [Django template settings documentation](https://docs.djangoproject.com/en/3.1/topics/templates/#django.template.backends.django.DjangoTemplates) + +#### Mitigation: + +Ban globally disabling autoescape. Alternatively, do not globally disable escaping. If HTML escaping is necessary, use `mark_safe()`. + +#### Semgrep rule + +[`python.django.security.audit.xss.global-autoescape-off.global-autoescape-off`](https://semgrep.dev/r/python.django.security.audit.xss.global-autoescape-off.global-autoescape-off) + +### 2.C. Setting **autoescape=False** in a template context + + +Setting `autoescape=False` in a template context will disable HTML escaping for that template. Any data rendered in that template could be a XSS vulnerability. 
+ +Example: + +```python +response = render(request, "index.html", {"autoescape": False}) +``` + +#### References: + +- [Context source code](https://github.com/django/django/blob/54ea290e5bbd19d87bd8dba807738eeeaf01a362/django/template/context.py#L135) +- [`Template.render()` documentation](https://docs.djangoproject.com/en/3.1/ref/templates/api/#django.template.Template.render) +- [`render_to_string()` documentation](https://docs.djangoproject.com/en/3.1/topics/templates/#django.template.loader.render_to_string) +- [`render()` documentation](https://docs.djangoproject.com/en/3.1/topics/http/shortcuts/#django.shortcuts.render) + + +#### Mitigation: + +Ban `autoescape=False` in template contexts. Alternatively, use `mark_safe()` if necessary. + +#### Semgrep rule + +`python.django.security.audit.xss.context-autoescape-off.context-autoescape-off` + + +## 3. Templates: unescaped variables + +### 3.A. Use of the **| safe** filter + +The `| safe` filter marks the content as "safe for rendering." This has the same effect as `mark_safe()` in Python code. This will permit direct rendering of HTML and create a possible XSS vulnerability. + +Example: + +```django +{{ name | safe }} +``` + +#### References: + +- [`| safe` filter documentation](https://docs.djangoproject.com/en/3.0/ref/templates/builtins/#safe) + +#### Mitigation: + +Ban `| safe`. Alternatively, use `mark_safe()` in Python if necessary. + +#### Semgrep rule + +[`python.flask.security.xss.audit.template-unescaped-with-safe.template-unescaped-with-safe`](https://semgrep.dev/r/python.flask.security.xss.audit.template-unescaped-with-safe.template-unescaped-with-safe) + +### 3.B. Use of the **| safeseq** filter + +The `| safeseq` filter marks the content as "safe for rendering." This has the same effect as `mark_safe()` in Python code. This will permit direct rendering of HTML and create a possible XSS vulnerability. 
+ +Example: + +```django +{{ names | safeseq | join:", " }} +``` + +#### References: + +- [`| safeseq` documentation](https://docs.djangoproject.com/en/3.0/ref/templates/builtins/#safeseq) + +#### Mitigation: + +Ban `| safeseq`. Alternatively, use `mark_safe()` in Python if necessary. + +#### Semgrep rule + +[`python.django.security.audit.xss.template-var-unescaped-with-safeseq.template-var-unescaped-with-safeseq`](https://semgrep.dev/r/python.django.security.audit.xss.template-var-unescaped-with-safeseq.template-var-unescaped-with-safeseq) + +### 3.C. The **\{% autoescape off %\}** block + +The `{% autoescape off %}` block disables autoescaping for whole portions of the template. Disabling autoescaping allows HTML characters to be rendered directly onto the page, which could create XSS vulnerabilities. + +Example: + +```django +{% autoescape off %} +``` + +#### References: + +- [`autoescape` block documentation](https://docs.djangoproject.com/en/3.0/ref/templates/builtins/#autoescape) + +#### Mitigation: + +Ban `{% autoescape off %}`. Alternatively, use `mark_safe()` in Python if necessary. + +#### Semgrep rule + +[`python.django.security.audit.xss.template-autoescape-off.template-autoescape-off`](https://semgrep.dev/r/python.django.security.audit.xss.template-autoescape-off.template-autoescape-off) + + +## 4. Templates: Variable in dangerous location + +### 4.A. Unquoted variable in HTML attribute + +Unquoted template variables rendered into HTML attributes are a potential XSS vector because an attacker could inject JavaScript handlers that do not require HTML characters. An example handler might look like: `onmouseover=alert(1)`. HTML escaping will not mitigate this. The variable must be quoted to avoid this. + +Example: + +```django +
+``` + +#### References: + +- [Flask cross-site scripting considerations](https://flask.palletsprojects.com/en/1.1.x/security/#cross-site-scripting-xss) + +#### Mitigation: + +Flag unquoted HTML attributes with Jinja expressions. Alternatively, always use quotes around HTML attributes. + +#### Semgrep rule + +[`python.flask.security.xss.audit.template-unquoted-attribute-var.template-unquoted-attribute-var`](https://semgrep.dev/r/python.flask.security.xss.audit.template-unquoted-attribute-var.template-unquoted-attribute-var) + +### 4.B. Variable in **href** attribute + +Template variables in a `href` value could still accept the `javascript:` URI. This could be a XSS vulnerability. HTML escaping will not prevent this. Use `url_for` to generate links. + +Example: + +```django + +``` + +#### References: + +- [Flask cross-site scripting considerations](https://flask.palletsprojects.com/en/1.1.x/security/#cross-site-scripting-xss) + +#### Mitigation: + +Flag template variables in `href` attributes. Alternatively, use `url_for` to generate links. + +#### Semgrep rule + +[`python.django.security.audit.xss.template-href-var.template-href-var`](https://semgrep.dev/r/python.django.security.audit.xss.template-href-var.template-href-var) + +### 4.C. Variable in **<script>** block + +Template variables placed directly into JavaScript or similar are now directly in a code execution context. Normal HTML escaping will not prevent the possibility of code injection because code can be written without HTML characters. This creates the potential for XSS vulnerabilities, or worse. 
+ +#### References: + +- [Template engines: Why default encoders are not enough](https://www.veracode.com/blog/secure-development/nodejs-template-engines-why-default-encoders-are-not-enough) +- [Safely including data for JavaScript in a Django template](https://adamj.eu/tech/2020/02/18/safely-including-data-for-javascript-in-a-django-template/) +- [`json_script` documentation](https://docs.djangoproject.com/en/3.0/ref/templates/builtins/#json-script) + +Example: +```django + +``` + +#### Mitigation: + +Ban template variables in ` +``` + +#### References +- [Template engines: Why default encoders are not enough](https://www.veracode.com/blog/secure-development/nodejs-template-engines-why-default-encoders-are-not-enough) +- [Protecting against XSS in Rails - JavaScript contexts. (Relevant to all template engines.)](https://blog.ircmaxell.com/2018/06/protecting-rails-xss.html) + +#### Mitigation + +Ban template variables in ` +``` + +#### References + +- [Template engines: Why default encoders are not enough](https://www.veracode.com/blog/secure-development/nodejs-template-engines-why-default-encoders-are-not-enough) +- [Protecting against XSS in Rails - JavaScript contexts. (Relevant to all template engines.)](https://blog.ircmaxell.com/2018/06/protecting-rails-xss.html) + +#### Mitigation + +Ban template variables in ` +``` + +#### Mitigation + +Ban template variables in ` +``` + +#### Mitigation + +Ban template variables in ` +``` + +#### References + +- [Template engines: Why default encoders are not enough](https://www.veracode.com/blog/secure-development/nodejs-template-engines-why-default-encoders-are-not-enough) +- [Protecting against XSS in Rails - JavaScript contexts](https://blog.ircmaxell.com/2018/06/protecting-rails-xss.html) +- [`escape_javascript` documentation](https://api.rubyonrails.org/classes/ActionView/Helpers/JavaScriptHelper.html#method-i-escape_javascript) + +#### Mitigation + +Ban template variables in `<script>` blocks. 
Alternatively, if necessary, use the `escape_javascript` function or its alias, `j`. Review each usage carefully and exempt with `# nosem`. + +#### Semgrep rule + +[`ruby.rails.security.audit.xss.templates.var-in-script-tag.var-in-script-tag`](https://semgrep.dev/r/ruby.rails.security.audit.xss.templates.var-in-script-tag.var-in-script-tag) diff --git a/mintlify-docs/cheat-sheets/ruby-code-injection.mdx b/mintlify-docs/cheat-sheets/ruby-code-injection.mdx new file mode 100644 index 0000000000..97f1cfad8e --- /dev/null +++ b/mintlify-docs/cheat-sheets/ruby-code-injection.mdx @@ -0,0 +1,88 @@ +--- +title: "Prevent Code Injection for Ruby" +sidebarTitle: "Code Injection for Ruby" +--- + +This is a code injection prevention cheat sheet by Semgrep, Inc. It contains code patterns of potential ways to run arbitrary code in an application. Instead of scrutinizing code for exploitable vulnerabilities, the recommendations in this cheat sheet pave a safe road for developers that mitigate the possibility of code injection in your code. By following these recommendations, you can be reasonably sure your code is free of code injection. + +Learn more about [Code Injection](/learn/vulnerabilities/code-injection) vulnerability concepts. + +### Check your project using Semgrep + +```bash +semgrep --config auto . +``` + +## 1. Evaluating code + +### 1.A. Evaluating code with `eval` + +Evaluating code can be dangerous if dynamic content is used as input. If this input originates from outside of the program it can lead to a code injection vulnerability. + +Examples: + +```ruby +# safe +str = "hello" +eval "str + ' Fred'" + +# vulnerable +str = "hello" +user_input = "system('cat /etc/passwd')" # Value supplied by user +eval "str + #{user_input}" +``` + +```ruby +class Thing +end + +# safe +Thing.module_eval(%q{def hello() "Hello there!"
end}) + +# vulnerable +user_input = "system('cat /etc/passwd')" # Value supplied by user +Thing.module_eval(%q{def hello() "#{user_input}" end}) +``` + +#### References + +- [eval() documentation](https://www.rubydoc.info/stdlib/core/Kernel:eval) + +#### Mitigation + +- Don't use `eval()`, `class_eval()`, `module_eval()`, or `instance_eval()` if possible. +- If you need to use `eval()`, `class_eval()`, `module_eval()`, or `instance_eval()` with non-literal values, ensure that executed content is not controllable by external sources. +- If it's not possible, strip everything except alphanumeric characters from the input. + +#### Semgrep rule + +[`ruby.lang.security.no-eval.ruby-eval`](https://semgrep.dev/r/ruby.lang.security.no-eval.ruby-eval) + +### 1.B. Evaluating code with RubyVM::InstructionSequence + +The `InstructionSequence` class represents compiled instructions for the Ruby Virtual Machine. See details in [RubyVM::InstructionSequence documentation](https://ruby-doc.org/core-2.6/RubyVM/InstructionSequence.html). The `RubyVM` class itself is **not** intended for regular users. As the `RubyVM` class enables compiling code it may insecurely interpret user input. Providing user input to this class or its methods can result in a code injection vulnerability. + +Example: + +```ruby +# safe +RubyVM::InstructionSequence.compile("a = 1 + 2") + +# vulnerable +user_input = "system('cat /etc/passwd')" # Value supplied by user +RubyVM::InstructionSequence.compile("a = 1 + #{user_input}") +``` + +#### References + +- [RubyVM documentation](https://ruby-doc.org/core-2.7.0/RubyVM.html) +- [RubyVM::InstructionSequence documentation](https://ruby-doc.org/core-2.6/RubyVM/InstructionSequence.html) + +#### Mitigation + +- Don't use `RubyVM`, or `RubyVM::InstructionSequence` if possible. +- If you need to use `RubyVM` or `RubyVM::InstructionSequence` with non-literal values or user input, ensure that inputs are from trusted sources. 
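The mitigation above can be sketched as follows, assuming the application only needs an integer from the user. Instead of interpolating the value into code that is compiled or evaluated, the input is parsed with `Integer()` and used as data. The `add_one` helper is hypothetical and is not part of the original cheat sheet or of any library.

```ruby
# Hypothetical helper (an assumption for illustration): accepts only a
# base-10 integer and never compiles or evaluates the input as Ruby code.
def add_one(user_input)
  1 + Integer(user_input, 10) # raises ArgumentError for non-numeric input
end

add_one("2") # returns 3

begin
  add_one("system('cat /etc/passwd')")
rescue ArgumentError => e
  # The payload is rejected as data; it is never executed.
  warn "rejected input: #{e.message}"
end
```

Parsing to a narrow type is one option; an allowlist of accepted values would serve the same purpose when the input is not numeric.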
+ +#### Semgrep rule + +[`ruby.lang.security.no-eval.ruby-eval`](https://semgrep.dev/r/ruby.lang.security.no-eval.ruby-eval) diff --git a/mintlify-docs/cheat-sheets/ruby-command-injection.mdx b/mintlify-docs/cheat-sheets/ruby-command-injection.mdx new file mode 100644 index 0000000000..8a4bd9fa28 --- /dev/null +++ b/mintlify-docs/cheat-sheets/ruby-command-injection.mdx @@ -0,0 +1,314 @@ +--- +title: "Prevent Command Injection for Ruby" +sidebarTitle: "Command Injection for Ruby" +--- + +This is a command injection prevention cheat sheet by Semgrep, Inc. It contains code patterns of potential ways to run an OS command in an application. Instead of scrutinizing code for exploitable vulnerabilities, the recommendations in this cheat sheet pave a safe road for developers that mitigate the possibility of command injection in your code. By following these recommendations, you can be reasonably sure your code is free of command injection. + +Learn more about [Command Injection](/learn/vulnerabilities/command-injection) vulnerability concepts. + +### Check your project using Semgrep + +```bash +semgrep --config auto . +``` + +## 1. Running OS commands + +### 1.A. Open3 module + +`Open3` grants access to the standard input, output, and error streams of a child process, plus a thread to wait on it, when running another program. For more information, see [Ruby documentation](https://docs.ruby-lang.org/en/2.0.0/Open3.html). Methods such as `capture2`, `capture2e`, `capture3`, `popen2`, `popen2e`, `popen3`, `pipeline`, `pipeline_r`, `pipeline_rw`, `pipeline_start` and `pipeline_w` can run a command provided as a single string. Allowing user-supplied data into a command string that is passed to one of these methods can create an opportunity for a command injection vulnerability.
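As a hedged sketch of a safer calling convention: when the program and each argument are passed to an `Open3` method as separate strings, Ruby starts the program directly instead of handing a command line to the shell. The `user_input` value and the use of `echo` are illustrative assumptions, and the example assumes a Unix-like system with `echo` on the `PATH`.

```ruby
require 'open3'

# Illustrative user-controlled value containing shell metacharacters.
user_input = "; echo injected"

# Program name and arguments are separate strings, so no shell parses them.
# The metacharacters reach `echo` as literal text in a single argument.
stdout, status = Open3.capture2("echo", "listing", user_input)

puts stdout
puts "exit status: #{status.exitstatus}"
```

This only prevents shell interpretation. The called program still receives the raw value, so validating or allowlisting the argument remains necessary.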
+ +Examples: + +```ruby +require 'open3' + +# safe +Open3.popen3("ls -la") + +# vulnerable +user_input = " && cat /etc/passwd" # Value supplied by user +Open3.popen3("ls #{user_input}") +``` + +```ruby +require 'open3' + +# safe +fname = "/usr/share/man/man1/ls.1.gz" +Open3.pipeline(["zcat", fname], "nroff -man", "colcrt") + +# vulnerable +user_input = " && cat /etc/passwd" # Value supplied by user +Open3.pipeline("zcat #{user_input}", "nroff -man", "colcrt") +``` + +#### References + +- [`Open3`](https://docs.ruby-lang.org/en/2.0.0/Open3.html) documentation. + +#### Mitigation + +- Do not pass user input to `Open3` methods. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command. Use internal language features instead of invoking commands that can be exploited. +- If user-controlled input is unavoidable, use an allowlist for inputs. +- Do not include command arguments in a command string; use parameterization instead. For example:
+ + Instead of the following code: + ```ruby + Open3.pipeline(["bash", "-c", "myCommand myArg1 " + input_value]) + ``` + + Use: + ```ruby + Open3.pipeline(["/path/to/myCommand", "myArg1", input_value]) + ``` +- Define a list of allowed arguments. +- Avoid non-literal values for the command string. Strip everything except alphanumeric characters from an input provided for the command string and arguments. + +#### Semgrep rule + +[`ruby.lang.security.dangerous-open3-pipeline.dangerous-open3-pipeline`](https://semgrep.dev/r/ruby.lang.security.dangerous-open3-pipeline.dangerous-open3-pipeline) + +### 1.B. open() function + +The `open(...)` function creates an input/output (I/O) object connected to a stream, file, or subprocess. If the first argument starts with a pipe character (`|`), it creates a subprocess. A command injection vulnerability can arise when user input reaches the first argument of the `open()` function. + +Example: + +```ruby +# safe +open("my_file.txt") + +# vulnerable +user_input = "|cat /etc/passwd" # Value supplied by user +open(user_input) +``` + +#### References + +- [open](https://apidock.com/ruby/Kernel/open) documentation. + +#### Mitigation + +- Do not provide raw user input to the `open()` function. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command. Use internal language features instead of invoking commands that can be exploited. +- If the use of user input is unavoidable, create an allowlist for inputs, such as allowed command arguments. +- Strip everything except alphanumeric characters from an input provided for the command string and arguments. + +#### Semgrep rule + +[`ruby.lang.security.dangerous-open.dangerous-open`](https://semgrep.dev/r/ruby.lang.security.dangerous-open.dangerous-open) + +### 1.C. system() function + +The `system()` function executes OS commands in a subshell.
This might potentially lead to a command injection vulnerability when used with user input. A malicious actor can potentially run OS commands to exploit the system. + +Example: + +```ruby +# safe +system("ls -lah /tmp") + +# vulnerable +user_input = ' && cat /etc/passwd' # Value supplied by user +system("ls #{user_input}") +``` + +#### References + +- [`system()` documentation](https://apidock.com/ruby/Kernel/system) + +#### Mitigation + +- Do not provide raw user input to the `system()` function. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command. Use internal language features instead of invoking commands that can be exploited. +- If the use of user input is unavoidable, create an allowlist for inputs, such as allowed arguments. +- Strip everything except alphanumeric characters from an input provided for the command string and arguments. + +#### Semgrep rule + +[`ruby.lang.security.dangerous-exec.dangerous-exec`](https://semgrep.dev/r/ruby.lang.security.dangerous-exec.dangerous-exec) + +### 1.D. exec() function + +The `exec()` function executes OS commands. This might potentially lead to a command injection vulnerability when used with user input. A malicious actor can potentially run OS commands to exploit the system. + +Example: + +```ruby +# safe +exec("ls -lah /tmp") + +# vulnerable +user_input = ' && cat /etc/passwd' # Value supplied by user +exec("ls #{user_input}") +``` + +#### References + +- [`exec()` documentation](https://apidock.com/ruby/Kernel/exec) + +#### Mitigation + +- Do not provide raw user input to the `exec()` function. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command. Use internal language features instead of invoking commands that can be exploited. +- If the use of user input is unavoidable, create an allowlist for inputs, such as allowed arguments. +- Strip everything except alphanumeric characters from an input provided for the command string and arguments. 
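The allowlist mitigation above can be sketched as follows. The `ALLOWED_DIRS` list and the `ls_argv` helper are hypothetical names introduced for illustration; a real application would define its own allowlist. The helper builds an argument vector only for approved directories.

```ruby
# Hypothetical allowlist (an assumption): directories the app may list.
ALLOWED_DIRS = ["/tmp", "/var/log"].freeze

# Returns an argument vector for `ls`, or raises if the directory is not
# on the allowlist. Nothing is executed here.
def ls_argv(dir)
  unless ALLOWED_DIRS.include?(dir)
    raise ArgumentError, "directory not allowed: #{dir.inspect}"
  end
  ["ls", "-lah", dir]
end

argv = ls_argv("/tmp")
# exec(*argv) would now replace the current process with `ls`. Because the
# command and its arguments are separate strings, no shell is involved.
```

The `exec` call is left as a comment because it replaces the running process. The same argument vector can be passed to `system(*argv)` or `spawn(*argv)`.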
+ +#### Semgrep rule + +[`ruby.lang.security.dangerous-exec.dangerous-exec`](https://semgrep.dev/r/ruby.lang.security.dangerous-exec.dangerous-exec) + +### 1.D. spawn() function + +The `spawn()` function executes OS commands. This might potentially lead to a command injection vulnerability when used with user input. A malicious actor can potentially run OS commands to exploit the system. + +Example: + +```ruby +# safe +pid = spawn("ls -lah /tmp") +Process.wait pid + +# vulnerable +user_input = ' && cat /etc/passwd' # Value supplied by user +pid = spawn("ls #{user_input}") +Process.wait pid +``` + +#### References + +- [`spawn()` documentation](https://apidock.com/ruby/Kernel/spawn) + +#### Mitigation + +- Do not provide raw user input to the `spawn()` function. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command. Use internal language features instead of invoking commands that can be exploited. +- If the use of user input is unavoidable, create an allowlist for inputs, such as allowed arguments. +- Strip everything except alphanumeric characters from an input provided for the command string and arguments. + +#### Semgrep rule + +[`ruby.lang.security.dangerous-exec.dangerous-exec`](https://semgrep.dev/r/ruby.lang.security.dangerous-exec.dangerous-exec) + +### 1.E. Backticks (``) or %x[command] methods + +Backticks ``` `` ``` or `%x[command]` methods allow Ruby developers to execute system commands and return their outputs. Both methods accept string interpolation. As for other methods mentioned in this cheat sheet, when this method is used with user input, it can lead to a command injection vulnerability. + +Ruby interprets the text inside of backticks as an OS command. For example, ``` `ls -l` ``` interpreted by Ruby prints the contents of current working directory. In addition, if the `%x` is used with various delimiters, it is also interpreted as an OS command. 
The ``` `ls -l` ``` in Ruby is equivalent to the following: + +- `` %x`ls -l` `` +- `` %x;ls -l; `` +- `` %x(ls -l) `` +- `` %x"ls -l" `` +- `` %x{ls -l} `` +- `` %x:ls -l: `` +- `` %x'ls -l' `` +- `` %x[ls -l] `` + +Example: + +```ruby +# safe +`ls -lah /tmp` + +%x[ ls -lah /tmp ] +%x{ ls -lah /tmp } + +# vulnerable +user_input = ' && cat /etc/passwd' # Value supplied by user +`ls #{user_input}` + +%x{ls #{user_input}} +``` + +#### References + +- Ruby [Kernel](https://ruby-doc.org/3.2.1/Kernel.html) documentation. +- Ruby [command injection](https://ruby-doc.org/3.2.1/command_injection_rdoc.html) documentation. + +#### Mitigation + +- Do not provide raw user input to ``` `` ``` or `%x` methods. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command. Use internal language features instead of invoking commands that can be exploited. +- If the use of user input is unavoidable, create an allowlist for inputs, such as allowed arguments. +- Strip everything except alphanumeric characters from an input provided for the command string and arguments. + +#### Semgrep rule + +[`ruby.lang.security.dangerous-subshell.dangerous-subshell`](https://semgrep.dev/r/ruby.lang.security.dangerous-subshell.dangerous-subshell) + +### 1.F. Process.spawn and Process.exec methods + +The `Process.spawn` and `Process.exec` methods execute a system command: `Process.spawn` starts it in a child process and returns the PID, while `Process.exec` replaces the current process with the command. Both methods accept a command string built with string interpolation. As with the other methods in this cheat sheet, using either method with user input can lead to a command injection vulnerability.
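A minimal sketch of the difference between a command string and separate arguments, assuming a Unix-like system with `echo` on the `PATH`. The `user_input` value is illustrative. The child's standard output is captured through a pipe so that the effect of the metacharacters can be inspected.

```ruby
# Illustrative user-controlled value containing shell metacharacters.
user_input = "&& echo injected"

reader, writer = IO.pipe

# Command name and arguments are separate strings, so no shell parses them.
pid = Process.spawn("echo", "ls", user_input, out: writer)
writer.close          # close the parent's copy so the read can reach EOF
Process.wait(pid)     # reap the child; its status is then available in $?

output = reader.read
reader.close

puts output
```

Here `&&` is printed as ordinary text. The same value interpolated into a single command string would be interpreted by the shell as a command separator.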
+ +Example: + +```ruby +# safe +Process.spawn("ls -alh") +Process.spawn("ls", "-alh") +Process.spawn(["ls", "ls"], "-alh") # [command, argv0] form, then arguments + +# vulnerable +user_input = ' && cat /etc/passwd' # Value supplied by user +Process.spawn("ls #{user_input}") + +# safe +Process.exec("ls -alh") +Process.exec("ls", "-alh") +Process.exec(["ls", "ls"], "-alh") # [command, argv0] form, then arguments + +# vulnerable +user_input = ' && cat /etc/passwd' # Value supplied by user +Process.exec("ls #{user_input}") +``` + +#### References + +- [Process](https://ruby-doc.org/3.2.1/Process.html) documentation. + +#### Mitigation + +- Do not provide raw user input to the `Process.spawn` and `Process.exec` methods. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command. Use internal language features instead of invoking commands that can be exploited. +- If the use of user input is unavoidable, create an allowlist for inputs, such as allowed arguments. +- Strip everything except alphanumeric characters from an input provided for the command string and arguments. + +#### Semgrep rule + +[`ruby.lang.security.dangerous-exec.dangerous-exec`](https://semgrep.dev/r/ruby.lang.security.dangerous-exec.dangerous-exec) + +### 1.F. PTY.spawn method + +The `PTY.spawn` method executes OS commands in a new terminal. This might potentially lead to a command injection vulnerability when used with user input. A malicious actor can potentially run OS commands to exploit the system. + +Example: + +```ruby +require 'pty' + +# safe +stdout,stdin,pid = PTY.spawn("ls -lah") + +# vulnerable +user_input = ' && cat /etc/passwd' # Value supplied by user +stdout,stdin,pid = PTY.spawn("ls #{user_input}") +``` + +#### References + +- [PTY](https://ruby-doc.org/3.2.1/exts/pty/PTY.html) library documentation. + +#### Mitigation + +- Do not provide raw user input to the `PTY.spawn` method. +- Always try to use the internal Ruby API (if it exists) instead of running an OS command.
Use internal language features instead of invoking commands that can be exploited. +- If the use of user input is unavoidable, create an allowlist for inputs, such as allowed arguments. +- Strip everything except alphanumeric characters from an input provided for the command string and arguments. + +#### Semgrep rule + +[`ruby.lang.security.dangerous-exec.dangerous-exec`](https://semgrep.dev/r/ruby.lang.security.dangerous-exec.dangerous-exec) \ No newline at end of file diff --git a/mintlify-docs/cli-reference.mdx b/mintlify-docs/cli-reference.mdx new file mode 100644 index 0000000000..dad48c1c7d --- /dev/null +++ b/mintlify-docs/cli-reference.mdx @@ -0,0 +1,1302 @@ +--- +title: "CLI reference" +sidebarTitle: "CLI reference" +--- + +This document provides the outputs of the following [Semgrep CLI](https://github.com/semgrep/semgrep) tool commands: + +- `semgrep --help` +- `semgrep scan --help` +- `semgrep ci --help` + +In addition, this page also gives an overview of the Semgrep CLI exit codes. + +## Semgrep commands + +For a list of available commands, run the following command: + +```bash +semgrep --help +``` + +Command output: + +```bash expandable +Usage: semgrep [OPTIONS] COMMAND [ARGS]... + + To get started quickly, run `semgrep scan --config auto` + + Run `semgrep SUBCOMMAND --help` for more information on each subcommand + + If no subcommand is passed, will run `scan` subcommand by default + +Options: + -h, --help Show this message and exit. 
+ +Commands: + ci Run Semgrep on a git diff (for use in CI) + install-semgrep-pro Install the Semgrep Pro Engine + login Obtain and save credentials for semgrep.dev + logout Remove locally stored credentials to semgrep.dev + lsp Start the Semgrep LSP server (useful for IDEs) + publish Upload rule to semgrep.dev + scan Run Semgrep rules on local folders or files + show Show various types of information + test Test the rules (EXPERIMENTAL improvements over scan --test) + validate Validate the rules (EXPERIMENTAL improvements over scan --validate) + mcp Start the Semgrep MCP server +``` + +## `semgrep ci` and `semgrep scan` command options + +You can invoke Semgrep using the CLI with either `semgrep ci` or `semgrep scan`. + + + +The `semgrep scan` command is primarily used for local scans and is suitable if you want to scan your codebase for security issues without requiring a Semgrep account. You can run scans using specific rules or rulesets. For example, to use the default ruleset, the command would be `semgrep scan --config "p/default"`. By default, these scans don't return failing error codes on findings for further handling. + +The `semgrep ci` command is primarily used in CI pipelines for both full scans of codebases, as well as diff-aware scans that are initiated in the context of a pull request or a merge request. With `semgrep ci`, Semgrep uses the policies and rules defined by your organization. It also uses cross-file (interfile) and cross-function (intrafile) analysis for improved results. By default, these scans return failing error codes on findings for further handling. + + + +You can list all available `semgrep ci` or `semgrep scan` options by running `semgrep ci --help` or `semgrep scan --help`, respectively. 
The available options are also listed below; **select the tab that best fits the command that you're using.** + + + +```bash expandable +NAME + semgrep scan - run semgrep rules on files + +SYNOPSIS + semgrep scan [OPTION]… [TARGETS]… + +DESCRIPTION + Searches TARGET paths for matches to rules or patterns. Defaults to + searching entire current working directory. + + To get started quickly, run + + semgrep --config auto . + + This will automatically fetch rules for your project from the Semgrep + Registry. NOTE: Using `--config auto` will log in to the Semgrep + Registry with your project URL. + + For more information about Semgrep, go to https://semgrep.dev. + + NOTE: By default, Semgrep will report pseudonymous usage metrics to + its server if you pull your configuration from the Semgrep registry. + To learn more about how and why these metrics are collected, please + see https://semgrep.dev/metrics. To modify this behavior, see the + --metrics option below. + +ARGUMENTS + TARGETS + Files or folders to be scanned by semgrep. + +OPTIONS + -a, --autofix + Apply autofix patches. WARNING: data loss can occur with this + flag. Make sure your files are stored in a version control system. + Note that this mode is experimental and not guaranteed to function + properly. + + --allow-local-builds + Experimental: allow building projects contained in the repository. + This allows Semgrep to identify dependencies and dependency + relationships when lockfiles are not present or are insufficient. + However, building code may inherently require the execution of + code contained in the scanned project or in its dependencies, + which is a security risk. + + --allow-untrusted-validators + Allows running rules with validators from origins other than + semgrep.dev. Avoid running rules from origins you don't trust. + + --baseline-commit=VAL (absent SEMGREP_BASELINE_COMMIT env) + Only show results that are not found in this commit hash. 
Aborts + run if not currently in a git directory, there are unstaged + changes, or given baseline hash doesn't exist. + + -d, --dump-command-for-core + + + --dataflow-traces + Explain how non-local values reach the location of a finding (only + affects text and SARIF output). + + --disable-nosem + negates --enable-nosem + + --disable-version-check + negates --enable-version-check + + --dryrun + If --dryrun, does not write autofixes to a file. This will print + the changes to the console. This lets you see the changes before + you commit to them. Only works with the --autofix flag. Otherwise + does nothing. + + --dump-ast + If --dump-ast, shows AST of the input file or passed expression + and then exit (can use --json). + + --dump-engine-path + + + -e VAL, --pattern=VAL + Code search pattern. See + https://semgrep.dev/writing-rules/pattern-syntax for + information on pattern features. + + --emacs + Output results in Emacs single-line format. + + --emacs-output=VAL + Write a copy of the emacs output to a file or post to URL. + + --enable-nosem + Enables 'nosem'. Findings will not be reported on lines containing + a 'nosem' comment at the end. Enabled by default. + + --enable-version-check (absent SEMGREP_ENABLE_VERSION_CHECK env) + Checks Semgrep servers to see if the latest version is run; + disabling this may reduce exit time after returning results. + + --error + Exit 1 if there are findings. Useful for CI and scripts. + + --exclude=PATTERN + Skip any file or directory whose path that matches PATTERN. + '--exclude=*.py' will ignore the following: 'foo.py', + 'src/foo.py', 'foo.py/bar.sh'. '--exclude=tests' will ignore + 'tests/foo.py' as well as 'a/b/tests/c/foo.py'. Multiple + '--exclude' options may be specified. PATTERN is a glob-style + pattern that uses the same syntax as gitignore and semgrepignore, + which is documented at + https://git-scm.com/gitignore#_pattern_format + + --exclude-minified-files + Skip minified files. 
These are files that are < 7% whitespace, or + which have an average of > 1000 bytes per line. By default + minified files are scanned. + + --exclude-rule=VAL + Skip any rule with the given id. Can add multiple times. + + -f VAL, -c VAL, --config=VAL (absent SEMGREP_RULES env) + YAML configuration file, directory of YAML files ending in + .yml|.yaml, URL of a configuration file, or Semgrep registry entry + name. Use --config auto to automatically obtain rules tailored to + this project; your project URL will be used to log in to the + Semgrep registry. To run multiple rule files simultaneously, use + --config before every YAML, URL, or Semgrep registry entry name. + For example `semgrep --config p/python --config + myrules/myrule.yaml` See + https://semgrep.dev/writing-rules/rule-syntax for information + on configuration file format. + + --files-with-matches + Output only the names of files containing matches. REQUIRES + --experimental + + --force-color (absent SEMGREP_FORCE_COLOR env) + Always include ANSI color in the output, even if not writing to a + TTY; defaults to using the TTY status + + --gitlab-sast + Output results in GitLab SAST format. + + --gitlab-sast-output=VAL + Write a copy of the GitLab SAST output to a file or post to URL. + + --gitlab-secrets + Output results in GitLab Secrets format. + + --gitlab-secrets-output=VAL + Write a copy of the GitLab Secrets output to a file or post to + URL. + + --historical-secrets + Scans git history using Secrets rules. + + --include=PATTERN + Specify files or directories that should be scanned by semgrep, + excluding other files. This filter is applied after these other + filters: '--exclude' options, any filtering done by git (or other + SCM), and filtering by '.semgrepignore' files. Multiple + '--include' options can be specified. A file path is selected if + it matches at least one of the include patterns. PATTERN is a + glob-style pattern such as 'foo.*' that must match the path. 
For + example, specifying the language with '-l javascript' might + preselect files 'src/foo.jsx' and 'lib/bar.js'. Specifying one of + '--include=src', '--include=*.jsx', or '--include=src/foo.*' will + restrict the selection to the single file 'src/foo.jsx'. A choice + of multiple '--include' patterns can be specified. For example, + '--include=foo.* --include=bar.*' will select both 'src/foo.jsx' + and 'lib/bar.js'. Glob-style patterns follow the syntax supported + by gitignore and semgrepignore, which is documented at + https://git-scm.com/gitignore#_pattern_format + + --incremental-output + Output results incrementally. REQUIRES --experimental + + --interfile-timeout=INT (absent=0) + Maximum time to spend on interfile analysis. If set to 0 will not + have time limit. Defaults to 0 s for all CLI scans. For CI scans, + it defaults to 3 hours. + + -j VALUE, --jobs=VALUE (absent=3) + Degree of parallelism to use for parallel scanning, either using + shared-memory threads (the default) or the legacy process-based + parallelism (enabled with the deprecated --x-parmap flag). Semgrep + recommends under-provisioning the job count by 10-15 percent to + account for overhead from the garbage collector managing the + shared heap (for example, on a 12-core box, a -j value of 10 or 11 + would be considered a good starting value). We highly recommend + that users do not _oversubscribe_ threads to CPUs, since this has + been seen to induce significant GC latency and slow scan times. + (Doing so will log a warning in debug mode.) The default jobs + value is derived from the number of logical cores that are + detected by Semgrep, scaled by 0.85. + + --json + Output results in Semgrep's JSON format. + + --json-output=VAL + Write a copy of the json output to a file or post to URL. + + --junit-xml + Output results in JUnit XML format. + + --junit-xml-output=VAL + Write a copy of the JUnit XML output to a file or post to URL. 
+ + -l VAL, --lang=VAL + Parse pattern and all files in specified language. Must be used + with -e/--pattern. + + --matching-explanations + Add debugging information in the JSON output to trace how + different parts of a rule are matched (a.k.a., "Inspect Rule" in + the Semgrep playground) + + --max-chars-per-line=INT (absent=160) + Maximum number of characters to show per line. + + --max-lines-per-finding=INT (absent=10) + Maximum number of lines of code that will be shown for each match + before trimming (set to 0 for unlimited). + + --max-log-list-entries=INT (absent=100) + Maximum number of entries that will be shown in the log (e.g., + list of rule ids, list of skipped files). A zero or negative value + disables this filter. Defaults to 100 + + --max-memory=INT (absent=0) + Maximum system memory in MiB to use during the interfile + pre-processing phase, or when running a rule on a single file. If + set to 0, will not have memory limit. Defaults to 0. For CI scans + that use the Pro Engine, defaults to 5000 MiB. + + --max-target-bytes=VALUE (absent=1000000) + Maximum size for a file to be scanned by Semgrep, e.g '1.5MB'. Any + input program larger than this will be ignored. A zero or negative + value disables this filter. Defaults to 1000000 bytes + + --metrics=ENUM (absent=auto or SEMGREP_SEND_METRICS env) + Configures how usage metrics are sent to the Semgrep server. If + 'auto', metrics are sent whenever the --config value pulls from + the Semgrep server or if the user is logged in. If 'on', metrics + are always sent. If 'off', metrics are disabled altogether and not + sent. If absent, the SEMGREP_SEND_METRICS environment variable + value will be used. If no environment variable, defaults to + 'auto'. 
+ + --no-autofix + negates -a/--autofix + + --no-dryrun + negates --dryrun + + --no-error + negates --error + + --no-exclude-minified-files + negates --exclude-minified-files + + --no-force-color + negates --force-color + + --no-git-ignore + negates --use-git-ignore + + --no-rewrite-rule-ids + negates --rewrite-rule-ids + + --no-secrets-validation + Disables secret validation. + + --no-strict + negates --strict + + --no-test-ignore-todo + negates --test-ignore-todo + + --no-time + negates --time + + --novcs + Assume the project is not managed by a version control system + (VCS), even if the project appears to be under version control + based on the presence of files such as '.git' or similar. REQUIRES + --experimental or --semgrepignore-v2. + + -o VAL, --output=VAL + Save search results to a file or post to URL. Default is to print + to stdout. + + --optimizations=VALUE (absent=all) + Turn on/off optimizations. Default = 'all'. Use 'none' to turn all + optimizations off. + + --oss-only + Run using only the OSS engine, even if the Semgrep Pro toggle is + on. This may still run Pro rules, but only using the OSS features. + + --pro + Inter-file analysis and Pro languages (currently Apex, C#, and + Elixir. Requires Semgrep Pro Engine. See + https://semgrep.dev/products/pro-engine/ for more. + + --pro-intrafile + Intra-file inter-procedural taint analysis. Implies + --pro-languages. Requires Semgrep Pro Engine. See + https://semgrep.dev/products/pro-engine/ for more. + + --pro-languages + Enable Pro languages (currently Apex, C#, and Elixir). Requires + Semgrep Pro Engine. See https://semgrep.dev/products/pro-engine/ + for more. + + --pro-path-sensitive + Path sensitivity. Implies --pro-intrafile. Requires Semgrep Pro + Engine. See https://semgrep.dev/products/pro-engine/ for more. + + --project-root=VAL + Semgrep normally determines the type of project (git or novcs) and + the project root automatically. 
The project root is then used to + locate and use '.gitignore' and '.semgrepignore' files which + determine target files that should be ignored by semgrep. This + option forces the project root to be a specific folder and assumes + a local project without version control (novcs). This option is + useful to ensure the '.semgrepignore' file that may exist at the + project root is consulted when the scanning root is not the + current folder '.'. A valid project root must be a folder (path + referencing a directory) whose physical path is a prefix of the + physical path of the scanning roots passed on the command line. + For example, the command 'semgrep scan --project-root . src' is + valid if '.' is '/home/me' and 'src' is a directory or a symbolic + link to a '/home/me/sources' directory or a symbolic link to a + 'sources' directory but not if it is a symbolic link to a + directory '/var/sources' (assuming '/var' is not a symbolic link). + REQUIRES --experimental or --semgrepignore-v2. + + --remote=VAL + Remote will quickly check out and scan a remote git repository of + the format "http[s]:///.../.git". Must be run with + --pro. Incompatible with --project-root. Note this requires an + empty CWD as this command will clone the repository into the CWD. + REQUIRES --experimental + + --replacement=VAL + An autofix expression that will be applied to any matches found + with --pattern. Only valid with a command-line specified pattern. + + --rewrite-rule-ids + Rewrite rule ids when they appear in nested sub-directories (Rule + 'foo' in test/rules.yaml will be renamed 'test.foo'). + + --sarif + Output results in SARIF format. + + --sarif-output=VAL + Write a copy of the SARIF output to a file or post to URL. + + --scan-unknown-extensions + If true, target files specified directly on the command line will + bypass normal language detection. 
They will be analyzed according + to the value of --lang if applicable, or otherwise with the + analyzers/languages specified in the Semgrep rule(s) regardless of + file extension or file type. This setting doesn't apply to target + files discovered by scanning folders. Defaults to false. + + --secrets + Run Semgrep Secrets product, including support for secret + validation. Requires access to Secrets, contact + support@semgrep.com for more information. + + --secrets-timeout=INT (absent=30) + Timeout in seconds for each secrets validation HTTP request. If + set to 0, no timeout is applied. Defaults to 30. + + --semgrepignore-v2 + [DEPRECATED] '--semgrepignore-v2' used to force the use of the + newer Semgrepignore v2 implementation for discovering and + filtering target files. It is now the default and only behavior. + The transitional option '--no-semgrepignore-v2' is no longer + available. + + --severity=ENUM + Report findings only from rules matching the supplied severity + level. By default all applicable rules are run. Can add multiple + times. Each should be one of INFO, WARNING, or ERROR. + + --show-supported-languages + Print a list of languages that are currently supported by Semgrep. + + --skip-unknown-extensions + negates --scan-unknown-extensions + + --strict + Return a nonzero exit code when WARN level errors are encountered. + Fails early if invalid configuration files are present. Defaults + to --no-strict. + + --test + Run test suite. + + --test-ignore-todo + If --test-ignore-todo, ignores rules marked as '#todoruleid:' in + test files. + + --text + Output results in text format. + + --text-output=VAL + Write a copy of the text output to a file or post to URL. + + --time + Include a timing summary with the results. If output format is + json, provides times for each pair (rule, target). This feature is + meant for internal use and may be changed or removed without + warning. At the current moment, --trace is better supported. 
+ + --timeout=DOUBLE (absent=5.) + Maximum time to spend running a rule on a single file in seconds. + If set to 0 will not have time limit. Defaults to 5.0 s. + + --timeout-threshold=INT (absent=3) + Maximum number of rules that can time out on a file before the + file is skipped. If set to 0 will not have limit. Defaults to 3. + + --use-git-ignore + '--use-git-ignore' is Semgrep's default behavior. Under the + default behavior, Git-tracked files are not excluded by Gitignore + rules and only untracked files are excluded by Gitignore rules. + '--no-git-ignore' causes semgrep to not call 'git' and not consult + '.gitignore' files to determine which files semgrep should scan. + As a result of '--no-git-ignore', gitignored files and Git + submodules will be scanned unless excluded by other means + ('.semgrepignore', '--exclude', etc.). This flag has no effect if + the scanning root is not in a Git repository. + + --validate + Validate configuration file(s). This will check YAML files for + errors and run 'p/semgrep-rule-lints' on the YAML files. No search + is performed. + + --version + Show the version and exit. + + --vim + Output results in vim single-line format. + + --vim-output=VAL + Write a copy of the vim output to a file or post to URL. + + --x-mem-policy=VAL + [INTERNAL] Heap and GC tuning policy. Only affects the Pro Engine. + +COMMON OPTIONS + --debug + All of --verbose, but with additional debugging information. + + --develop + Living on the edge. + + --experimental + Enable experimental features. + + --help[=FMT] (default=auto) + Show this help in format FMT. The value FMT must be one of auto, + pager, groff or plain. With auto, the format is pager or plain + whenever the TERM env var is dumb or undefined. + + --legacy + Prefer old (legacy) behavior. + + --no-trace + negates --trace + + --profile + Record profiles via Pyro Caml. By default sends them to + localhost:4040 + + -q, --quiet + Only output findings. 
+ + --trace + Record traces from Semgrep scans to help debugging. This feature + is meant for internal use and may be changed or removed without + warning. + + --trace-endpoint=VAL + Endpoint to send OpenTelemetry traces to, if `--trace` is present. + The value may be `semgrep-prod` (default), `semgrep-dev`, + `semgrep-local`, or any valid URL. This feature is meant for + internal use and may be changed or removed without warning. + + -v, --verbose + Show more details about what rules are running, which files failed + to parse, etc. + +EXPERIMENTAL OPTIONS + Any option starting with '--x-' is experimental and may be removed + from semgrep without notice. + + --no-x-run-taint-once + [INTERNAL] Disable running taint analysis just once + + --x-disable-transitive-reachability + [INTERNAL] Disable transitive reachability analysis regardless of + app-based configuration. + + --x-dump-symbol-analysis + [INTERNAL] Dump symbol analysis results in JSON format, ATD type + 'symbol_analysis'. + + --x-eio + [INTERNAL] + + --x-group-taint-rules + [INTERNAL] Do not use + + --x-ignore-semgrepignore-files + [INTERNAL] Ignore all '.semgrepignore' files found in the project + tree for the purpose of selecting target files to be scanned by + semgrep. Other filters may still apply. THIS OPTION IS NOT PART OF + THE SEMGREP API AND MAY CHANGE OR DISAPPEAR WITHOUT NOTICE. + + --x-ls + [INTERNAL] List the selected target files before any rule-specific + or language-specific filtering. Then exit. The default output + format is one path per line. THIS OPTION IS NOT PART OF THE + SEMGREP API AND MAY CHANGE OR DISAPPEAR WITHOUT NOTICE. + + --x-ls-long + [INTERNAL] Show selected targets and skipped targets with reasons + why they were skipped, using an unspecified output format. Implies + --x-ls. THIS OPTION IS NOT PART OF THE SEMGREP API AND MAY CHANGE + OR DISAPPEAR WITHOUT NOTICE. + + --x-mcp + [INTERNAL] This flag indicates that the scan is run by the MCP + server. 
It is used to output extra info (e.g. rules, num bytes + scanned) at the end of the scan for the MCP server to use and + makes sure that metrics are not sent so that the MCP server can + send its own metrics. + + --x-no-python-schema-validation + [INTERNAL] Skip JSON schema validation; rely on osemgrep parser to + validate rules files + + --x-parmap + [INTERNAL] Rely on legacy Parmap-based parallelism + + --x-pro-naming + [INTERNAL] Do not use + + --x-run-taint-once + [INTERNAL] Run taint analysis just once (default: true) + + --x-semgrepignore-filename=FILENAME + [INTERNAL] Files named FILENAME shall be consulted instead of the + files named '.semgrepignore'. This option can be useful for + testing semgrep on intentionally broken code that should normally + be ignored. + + --x-simple-profiling + Upon exit, print on stderr a report showing how long certain + operations took, in an unspecified text format. + + --x-tr, --x-enable-transitive-reachability + [INTERNAL] Enable transitive reachability analysis regardless of + app-based configuration. Typically used with + '--allow-local-builds'. + +EXIT STATUS + semgrep scan exits with: + + 0 OK + + 1 some findings + + 2 fatal error + + 3 invalid target code + + 4 invalid pattern + + 5 unparseable YAML + + 7 missing configuration + + 8 invalid language + + 13 invalid API key + + 99 not implemented in osemgrep + +ENVIRONMENT + These environment variables affect the execution of semgrep scan: + + SEMGREP_BASELINE_COMMIT + See option --baseline-commit. + + SEMGREP_ENABLE_VERSION_CHECK + See option --enable-version-check. + + SEMGREP_FORCE_COLOR + See option --force-color. + + SEMGREP_RULES + See option --config. + + SEMGREP_SEND_METRICS + See option --metrics. + +AUTHORS + Semgrep Inc. 
+ +BUGS + If you encounter an issue, please report it at + https://github.com/semgrep/semgrep/issues +``` + + +```bash expandable +NAME + semgrep ci - the recommended way to run semgrep in CI + +SYNOPSIS + semgrep ci [OPTION]… + +DESCRIPTION + In pull_request/merge_request (PR/MR) contexts, `semgrep ci` will only + report findings that were introduced by the PR/MR. + + When logged in, `semgrep ci` runs rules configured on Semgrep App and + sends findings to your findings dashboard. + + Only displays findings that were marked as blocking. + +OPTIONS + -a, --autofix + Currently ignored. + + --allow-local-builds + Experimental: allow building projects contained in the repository. + This allows Semgrep to identify dependencies and dependency + relationships when lockfiles are not present or are insufficient. + However, building code may inherently require the execution of + code contained in the scanned project or in its dependencies, + which is a security risk. + + --allow-untrusted-validators + Allows running rules with validators from origins other than + semgrep.dev. Avoid running rules from origins you don't trust. + + --audit-on=VAL (absent SEMGREP_AUDIT_ON env) + + --baseline-commit=VAL (absent SEMGREP_BASELINE_COMMIT env) + Only show results that are not found in this commit hash. Aborts + run if not currently in a git directory, there are unstaged + changes, or given baseline hash doesn't exist. + + --code + Run Semgrep Code (SAST) product. + + -d, --dump-command-for-core + + + --dataflow-traces + Explain how non-local values reach the location of a finding (only + affects text and SARIF output). + + --disable-nosem + negates --enable-nosem + + --disable-version-check + negates --enable-version-check + + --dry-run + When set, will not start a scan on semgrep.dev and will not report + findings. Instead will print out json objects it would have sent. + + --dryrun + Currently ignored. + + --emacs + Output results in Emacs single-line format. 
+ + --emacs-output=VAL + Write a copy of the emacs output to a file or post to URL. + + --enable-nosem + Enables 'nosem'. Findings will not be reported on lines containing + a 'nosem' comment at the end. Enabled by default. + + --enable-version-check (absent SEMGREP_ENABLE_VERSION_CHECK env) + Checks Semgrep servers to see if the latest version is run; + disabling this may reduce exit time after returning results. + + --exclude=PATTERN + Skip any file or directory whose path that matches PATTERN. + '--exclude=*.py' will ignore the following: 'foo.py', + 'src/foo.py', 'foo.py/bar.sh'. '--exclude=tests' will ignore + 'tests/foo.py' as well as 'a/b/tests/c/foo.py'. Multiple + '--exclude' options may be specified. PATTERN is a glob-style + pattern that uses the same syntax as gitignore and semgrepignore, + which is documented at + https://git-scm.com/gitignore#_pattern_format + + --exclude-minified-files + Skip minified files. These are files that are < 7% whitespace, or + which have an average of > 1000 bytes per line. By default + minified files are scanned. + + --exclude-rule=VAL + Skip any rule with the given id. Can add multiple times. + + -f VAL, -c VAL, --config=VAL + Not supported in 'ci' mode + + --fake-backend=VAL + Internal flag. + + --files-with-matches + Output only the names of files containing matches. REQUIRES + --experimental + + --force-color (absent SEMGREP_FORCE_COLOR env) + Always include ANSI color in the output, even if not writing to a + TTY; defaults to using the TTY status + + --gitlab-sast + Output results in GitLab SAST format. + + --gitlab-sast-output=VAL + Write a copy of the GitLab SAST output to a file or post to URL. + + --gitlab-secrets + Output results in GitLab Secrets format. + + --gitlab-secrets-output=VAL + Write a copy of the GitLab Secrets output to a file or post to + URL. + + --historical-secrets + Scans git history using Secrets rules. 
+ + --include=PATTERN + Specify files or directories that should be scanned by semgrep, + excluding other files. This filter is applied after these other + filters: '--exclude' options, any filtering done by git (or other + SCM), and filtering by '.semgrepignore' files. Multiple + '--include' options can be specified. A file path is selected if + it matches at least one of the include patterns. PATTERN is a + glob-style pattern such as 'foo.*' that must match the path. For + example, specifying the language with '-l javascript' might + preselect files 'src/foo.jsx' and 'lib/bar.js'. Specifying one of + '--include=src', '--include=*.jsx', or '--include=src/foo.*' will + restrict the selection to the single file 'src/foo.jsx'. A choice + of multiple '--include' patterns can be specified. For example, + '--include=foo.* --include=bar.*' will select both 'src/foo.jsx' + and 'lib/bar.js'. Glob-style patterns follow the syntax supported + by gitignore and semgrepignore, which is documented at + https://git-scm.com/gitignore#_pattern_format + + --incremental-output + Output results incrementally. REQUIRES --experimental + + --interfile-timeout=INT (absent=0) + Maximum time to spend on interfile analysis. If set to 0 will not + have time limit. Defaults to 0 s for all CLI scans. For CI scans, + it defaults to 3 hours. + + --internal-ci-scan-results + Internal flag. + + -j VALUE, --jobs=VALUE (absent=3) + Degree of parallelism to use for parallel scanning, either using + shared-memory threads (the default) or the legacy process-based + parallelism (enabled with the deprecated --x-parmap flag). Semgrep + recommends under-provisioning the job count by 10-15 percent to + account for overhead from the garbage collector managing the + shared heap (for example, on a 12-core box, a -j value of 10 or 11 + would be considered a good starting value). 
We highly recommend + that users do not _oversubscribe_ threads to CPUs, since this has + been seen to induce significant GC latency and slow scan times. + (Doing so will log a warning in debug mode.) The default jobs + value is derived from the number of logical cores that are + detected by Semgrep, scaled by 0.85. + + --json + Output results in Semgrep's JSON format. + + --json-output=VAL + Write a copy of the json output to a file or post to URL. + + --junit-xml + Output results in JUnit XML format. + + --junit-xml-output=VAL + Write a copy of the JUnit XML output to a file or post to URL. + + --log-backend=VAL + Internal flag. + + --matching-explanations + Add debugging information in the JSON output to trace how + different parts of a rule are matched (a.k.a., "Inspect Rule" in + the Semgrep playground) + + --max-chars-per-line=INT (absent=160) + Maximum number of characters to show per line. + + --max-lines-per-finding=INT (absent=10) + Maximum number of lines of code that will be shown for each match + before trimming (set to 0 for unlimited). + + --max-log-list-entries=INT (absent=100) + Maximum number of entries that will be shown in the log (e.g., + list of rule ids, list of skipped files). A zero or negative value + disables this filter. Defaults to 100 + + --max-memory=INT (absent=0) + Maximum system memory in MiB to use during the interfile + pre-processing phase, or when running a rule on a single file. If + set to 0, will not have memory limit. Defaults to 0. For CI scans + that use the Pro Engine, defaults to 5000 MiB. + + --max-target-bytes=VALUE (absent=1000000) + Maximum size for a file to be scanned by Semgrep, e.g '1.5MB'. Any + input program larger than this will be ignored. A zero or negative + value disables this filter. Defaults to 1000000 bytes + + --metrics=ENUM (absent=auto or SEMGREP_SEND_METRICS env) + Configures how usage metrics are sent to the Semgrep server. 
If + 'auto', metrics are sent whenever the --config value pulls from + the Semgrep server or if the user is logged in. If 'on', metrics + are always sent. If 'off', metrics are disabled altogether and not + sent. If absent, the SEMGREP_SEND_METRICS environment variable + value will be used. If no environment variable, defaults to + 'auto'. + + --no-autofix + negates -a/--autofix + + --no-dryrun + negates --dryrun + + --no-exclude-minified-files + negates --exclude-minified-files + + --no-force-color + negates --force-color + + --no-git-ignore + negates --use-git-ignore + + --no-rewrite-rule-ids + negates --rewrite-rule-ids + + --no-secrets-validation + Disables secret validation. + + --no-suppress-errors + negates --suppress-errors + + -o VAL, --output=VAL + Save search results to a file or post to URL. Default is to print + to stdout. + + --optimizations=VALUE (absent=all) + Turn on/off optimizations. Default = 'all'. Use 'none' to turn all + optimizations off. + + --oss-only + Run using only the OSS engine, even if the Semgrep Pro toggle is + on. This may still run Pro rules, but only using the OSS features. + + --pro + Inter-file analysis and Pro languages (currently Apex, C#, and + Elixir. Requires Semgrep Pro Engine. See + https://semgrep.dev/products/pro-engine/ for more. + + --pro-intrafile + Intra-file inter-procedural taint analysis. Implies + --pro-languages. Requires Semgrep Pro Engine. See + https://semgrep.dev/products/pro-engine/ for more. + + --pro-languages + Enable Pro languages (currently Apex, C#, and Elixir). Requires + Semgrep Pro Engine. See https://semgrep.dev/products/pro-engine/ + for more. + + --pro-path-sensitive + Path sensitivity. Implies --pro-intrafile. Requires Semgrep Pro + Engine. See https://semgrep.dev/products/pro-engine/ for more. + + --rewrite-rule-ids + Rewrite rule ids when they appear in nested sub-directories (Rule + 'foo' in test/rules.yaml will be renamed 'test.foo'). + + --sarif + Output results in SARIF format. 
+ + --sarif-output=VAL + Write a copy of the SARIF output to a file or post to URL. + + --scan-unknown-extensions + If true, target files specified directly on the command line will + bypass normal language detection. They will be analyzed according + to the value of --lang if applicable, or otherwise with the + analyzers/languages specified in the Semgrep rule(s) regardless of + file extension or file type. This setting doesn't apply to target + files discovered by scanning folders. Defaults to false. + + --secrets + Run Semgrep Secrets product, including support for secret + validation. Requires access to Secrets, contact + support@semgrep.com for more information. + + --secrets-timeout=INT (absent=30) + Timeout in seconds for each secrets validation HTTP request. If + set to 0, no timeout is applied. Defaults to 30. + + --semgrepignore-v2 + [DEPRECATED] '--semgrepignore-v2' used to force the use of the + newer Semgrepignore v2 implementation for discovering and + filtering target files. It is now the default and only behavior. + The transitional option '--no-semgrepignore-v2' is no longer + available. + + --skip-unknown-extensions + negates --scan-unknown-extensions + + --subdir=VAL + Scan only a subdirectory of this folder. This creates a project + specific to the subdirectory unless SEMGREP_REPO_DISPLAY_NAME is + set. Expects a relative path. (Note that when two scans have the + same SEMGREP_REPO_DISPLAY_NAME but different targeted directories, + the results of the second scan overwrite the first.) + + --supply-chain + Run Semgrep Supply Chain product. + + --suppress-errors (absent SEMGREP_SUPPRESS_ERRORS env) + Configures how the CI command reacts when an error occurs. If + true, encountered errors are suppressed and the exit code is zero + (success). If false, encountered errors are not suppressed and the + exit code is non-zero (failure). + + --text + Output results in text format. 
+ + --text-output=VAL + Write a copy of the text output to a file or post to URL. + + --timeout=DOUBLE (absent=5.) + Maximum time to spend running a rule on a single file in seconds. + If set to 0 will not have time limit. Defaults to 5.0 s. + + --timeout-threshold=INT (absent=3) + Maximum number of rules that can time out on a file before the + file is skipped. If set to 0 will not have limit. Defaults to 3. + + --use-git-ignore + '--use-git-ignore' is Semgrep's default behavior. Under the + default behavior, Git-tracked files are not excluded by Gitignore + rules and only untracked files are excluded by Gitignore rules. + '--no-git-ignore' causes semgrep to not call 'git' and not consult + '.gitignore' files to determine which files semgrep should scan. + As a result of '--no-git-ignore', gitignored files and Git + submodules will be scanned unless excluded by other means + ('.semgrepignore', '--exclude', etc.). This flag has no effect if + the scanning root is not in a Git repository. + + --vim + Output results in vim single-line format. + + --vim-output=VAL + Write a copy of the vim output to a file or post to URL. + + --x-enable-mal-deps + Enable malicious dependency rules for this scan. + + --x-mem-policy=VAL + [INTERNAL] Heap and GC tuning policy. Only affects the Pro Engine. + +COMMON OPTIONS + --debug + All of --verbose, but with additional debugging information. + + --develop + Living on the edge. + + --experimental + Enable experimental features. + + --help[=FMT] (default=auto) + Show this help in format FMT. The value FMT must be one of auto, + pager, groff or plain. With auto, the format is pager or plain + whenever the TERM env var is dumb or undefined. + + --legacy + Prefer old (legacy) behavior. + + --no-trace + negates --trace + + --profile + Record profiles via Pyro Caml. By default sends them to + localhost:4040 + + -q, --quiet + Only output findings. + + --trace + Record traces from Semgrep scans to help debugging. 
This feature + is meant for internal use and may be changed or removed without + warning. + + --trace-endpoint=VAL + Endpoint to send OpenTelemetry traces to, if `--trace` is present. + The value may be `semgrep-prod` (default), `semgrep-dev`, + `semgrep-local`, or any valid URL. This feature is meant for + internal use and may be changed or removed without warning. + + -v, --verbose + Show more details about what rules are running, which files failed + to parse, etc. + +EXPERIMENTAL OPTIONS + Any option starting with '--x-' is experimental and may be removed + from semgrep without notice. + + --no-x-run-taint-once + [INTERNAL] Disable running taint analysis just once + + --x-computed-dependencies-dir=VAL + Internal flag. + + --x-disable-transitive-reachability + [INTERNAL] Disable transitive reachability analysis regardless of + app-based configuration. + + --x-dump-rule-partitions=INT (absent=0) + Internal flag. + + --x-dump-rule-partitions-dir=VAL + Internal flag. + + --x-dump-rule-partitions-strategy=VAL + Internal flag. + + --x-dump-scan-config-path=VAL + Internal flag. + + --x-dump-subprojects-and-exit=VAL + Internal flag. + + --x-eio + [INTERNAL] + + --x-ignore-semgrepignore-files + [INTERNAL] Ignore all '.semgrepignore' files found in the project + tree for the purpose of selecting target files to be scanned by + semgrep. Other filters may still apply. THIS OPTION IS NOT PART OF + THE SEMGREP API AND MAY CHANGE OR DISAPPEAR WITHOUT NOTICE. + + --x-mcp + [INTERNAL] This flag indicates that the scan is run by the MCP + server. It is used to output extra info (e.g. rules, num bytes + scanned) at the end of the scan for the MCP server to use and + makes sure that metrics are not sent so that the MCP server can + send its own metrics. + + --x-merge-partial-results-dir=DIR + Internal flag. + + --x-merge-partial-results-output=VAL + Internal flag. 
+ + --x-no-python-schema-validation + [INTERNAL] Skip JSON schema validation; rely on osemgrep parser to + validate rules files + + --x-parmap + [INTERNAL] Rely on legacy Parmap-based parallelism + + --x-partial-config=VAL + Internal flag. + + --x-partial-output=VAL + Internal flag. + + --x-pro-naming + [INTERNAL] Do not use + + --x-run-taint-once + [INTERNAL] Run taint analysis just once (default: true) + + --x-semgrepignore-filename=FILENAME + [INTERNAL] Files named FILENAME shall be consulted instead of the + files named '.semgrepignore'. This option can be useful for + testing semgrep on intentionally broken code that should normally + be ignored. + + --x-simple-profiling + Upon exit, print on stderr a report showing how long certain + operations took, in an unspecified text format. + + --x-tr, --x-enable-transitive-reachability + [INTERNAL] Enable transitive reachability analysis regardless of + app-based configuration. Typically used with + '--allow-local-builds'. + + --x-upload-partial-results=VAL + Internal flag. + + --x-upload-partial-results-scan-id=INT + Internal flag. + + --x-use-saved-scan-config-path=VAL + Internal flag. + + --x-validate-partial-results-actual=VAL + Internal flag. + + --x-validate-partial-results-expected=VAL + Internal flag. + +EXIT STATUS + semgrep ci exits with: + + 0 OK + + 1 some findings + + 2 fatal error + + 3 invalid target code + + 4 invalid pattern + + 5 unparseable YAML + + 7 missing configuration + + 8 invalid language + + 13 invalid API key + + 99 not implemented in osemgrep + +ENVIRONMENT + These environment variables affect the execution of semgrep ci: + + SEMGREP_AUDIT_ON + See option --audit-on. + + SEMGREP_BASELINE_COMMIT + See option --baseline-commit. + + SEMGREP_ENABLE_VERSION_CHECK + See option --enable-version-check. + + SEMGREP_FORCE_COLOR + See option --force-color. + + SEMGREP_SEND_METRICS + See option --metrics. + + SEMGREP_SUPPRESS_ERRORS + See option --suppress-errors. + +AUTHORS + Semgrep Inc. 
+ +BUGS + If you encounter an issue, please report it at + https://github.com/semgrep/semgrep/issues + +``` + + + +## Ignore files + +The Semgrep command line tool supports a `.semgrepignore` file that follows `.gitignore` syntax and is used to skip files and directories during scanning. This is commonly used to avoid vendor and test-related code. For a complete example, see the [.semgrepignore file in Semgrep’s source code](https://github.com/semgrep/semgrep/blob/develop/.semgrepignore). + +In addition to `.semgrepignore`, there are several other methods to set up ignore patterns. See [Ignoring files, folders, or code](/ignoring-files-folders-code). + +## Connect to Semgrep Registry through a proxy + +Semgrep uses the Python3 `requests` library. Set the following environment variables to point to your proxy: + +```bash +export HTTP_PROXY="HTTP_PROXY_URL" + +export HTTPS_PROXY="HTTPS_PROXY_URL" +``` + +For example: + +```bash +export HTTP_PROXY="http://10.10.1.10:3128" + +export HTTPS_PROXY="http://10.10.1.10:1080" +``` + +## Exit codes + +Semgrep can finish with the following exit codes: + +- **0**: Semgrep ran successfully and found no issues (or found issues, but the `--error` flag is **not** being used). +- **1**: Semgrep ran successfully and found issues in your code (while using the `--error` flag). +- **2**: Semgrep failed. +- **3**: Invalid syntax of the scanned language. This error occurs only while using the `--strict` flag. +- **4**: Semgrep encountered an invalid pattern in the rule schema. +- **5**: Semgrep configuration is not valid YAML. +- **7**: At least one rule in the configuration is invalid. +- **8**: Semgrep does not understand the specified language. +- **13**: The API key is invalid. +- **14**: [Deprecated] Semgrep scan failed. + + +**TIP** + +To view the exit code when running `semgrep scan`, enter the following command immediately after the Semgrep scan finishes: +```bash +echo $?
+``` +The output is a single exit code, such as: +```bash +1 +``` + + +Not finding what you need in this doc? Ask questions in our [Community Slack group](https://go.semgrep.dev/slack), or see [Support](/support) for other ways to get help. \ No newline at end of file diff --git a/mintlify-docs/compliance/compliance-overview.mdx b/mintlify-docs/compliance/compliance-overview.mdx new file mode 100644 index 0000000000..5b833b99d8 --- /dev/null +++ b/mintlify-docs/compliance/compliance-overview.mdx @@ -0,0 +1,51 @@ +--- +title: "Compliance" +description: "Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program." +--- + +Semgrep can help address security requirements in the following compliance frameworks and standards: + +### Government and federal standards + +- **[FedRAMP](/compliance/fedramp):** Federal Risk and Authorization Management Program for cloud services used by U.S. 
federal agencies +- **[NIST 800-171](/compliance/nist-800-171):** Protecting Controlled Unclassified Information (CUI) in nonfederal systems + +### Healthcare and privacy + +- **[HIPAA/HITRUST](/compliance/hipaa-hitrust):** Health Insurance Portability and Accountability Act and HITRUST Common Security Framework +- **[GDPR](/compliance/gdpr):** General Data Protection Regulation for protecting personal data of EU residents + +### Financial services + +- **[PCI DSS](/compliance/pci-dss):** Payment Card Industry Data Security Standard for protecting cardholder data + +### Information security standards + +- **[ISO 27001](/compliance/iso27001):** International standard for information security management systems (ISMS) +- **[ISO 27017](/compliance/iso-27017):** Code of practice for information security controls for cloud services + +### SOC 2 + +- **[SOC 2](/compliance/soc2):** Service Organization Control 2 for security, availability, processing integrity, confidentiality, and privacy + +## Getting started with compliance + + + +**Review the specific framework page** relevant to your organization from the list above + + +**Understand which controls** Semgrep can help address in your compliance program + + +**Deploy Semgrep** following the [core deployment guide](/deployment/core-deployment) + + +**Configure policies** that align with your compliance requirements + + +**Work with your compliance team** to incorporate Semgrep into your compliance documentation and audit processes + + + +For questions about how Semgrep fits into your specific compliance program, contact your compliance team or [Semgrep support](/support). 
\ No newline at end of file diff --git a/mintlify-docs/compliance/fedramp.mdx b/mintlify-docs/compliance/fedramp.mdx new file mode 100644 index 0000000000..c8b73e4bdf --- /dev/null +++ b/mintlify-docs/compliance/fedramp.mdx @@ -0,0 +1,36 @@ +--- +title: "FedRAMP compliance" +sidebarTitle: "FedRAMP" +--- + +**Disclaimer:** *Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program.* + +**Last updated:** November 2025 + +FedRAMP (Federal Risk and Authorization Management Program) provides standardized security assessment and authorization for cloud services used by federal agencies. The FedRAMP Authorization Boundary Guidance v3.0 Section 7 indicates that corporate services are outside the FedRAMP Authorization Boundary so long as they do not contain federal data. When Semgrep scans code, it collects and stores metadata about scan results. For details, see the [metrics documentation](/metrics). + + +**WARNING** + +Federal data should not exist in code repositories. Semgrep scans code repositories, not production systems or databases containing federal data. Federal data is typically absent from code repositories. + + +Semgrep may help address FedRAMP security requirements derived from NIST SP 800-53 Rev 5: + +- **RA-5 (vulnerability monitoring and scanning):** SAST scanning helps provide continuous automated vulnerability detection. Semgrep identifies OWASP Top 10 vulnerabilities including SQL injection, broken authentication, and security misconfigurations before code reaches production FedRAMP systems. Audit logs help provide timestamped evidence of continuous vulnerability monitoring that 3PAO assessors can review during annual assessments. 
+ +- **IA-5 (authenticator management):** [Secrets detection](/semgrep-secrets/conceptual-overview) helps prevent AWS GovCloud API tokens, Azure Government credentials, database passwords, and private keys from being committed to source code. Custom rules enforce that authentication mechanisms use federal identity providers rather than hardcoded credentials, helping agencies meet OMB Memorandum M-22-09 requirements for phishing-resistant MFA. + +- **SA-11 (developer security testing):** Policy enforcement demonstrates that security testing is mandatory in the secure software development lifecycle. When configured with CI/CD platforms, Semgrep helps block vulnerable code at the pull request level before deployment. Custom policies can enforce agency-specific secure coding standards and can be configured to match security control baselines (low, moderate, or high) required by your authorization. + +- **AU-2 and AU-3 (event logging and audit record content):** Audit logs document every scan execution, security finding, policy violation, and remediation action with timestamps in UTC format. Logs are exportable in JSON format for integration with federal SIEM systems and compliance reporting tools. + +- **SI-2 (flaw remediation):** SAST scanning combined with Supply Chain vulnerability detection helps provide comprehensive flaw identification across custom code and third-party dependencies. [Jira integration creates documented remediation](/semgrep-appsec-platform/jira) workflows with discovery timestamps and resolution status. Audit logs track flaw remediation timelines segmented by severity level, helping agencies meet requirements for remediating high-risk flaws within 30 days and moderate-risk flaws within 90 days. + +- **SI-3 (malicious code protection):** Supply Chain scanning detects malicious packages, typosquatting attacks, dependency confusion vulnerabilities, and known malicious packages before they reach production federal systems. 
Reachability analysis determines whether malicious dependencies are actually invoked in your application code. + +- **SA-15 and SR-3 (software supply chain security):** [SBOM generation](/semgrep-supply-chain/sbom) helps provide visibility into the federal software supply chain by documenting all third-party components in CycloneDX and SPDX formats. For agencies responding to Executive Order 14028 and OMB Memorandum M-22-18 requirements for software supply chain security, SBOMs document the composition of software deployed in FedRAMP environments. Supply Chain scanning evaluates dependencies against security criteria, identifies components with known vulnerabilities, malicious packages, and abandoned projects. Policy enforcement can block dependencies that fail supply chain risk criteria. + +### Deployment considerations + +CLI and on-premises CI/CD deployments keep code entirely within agency-controlled infrastructure. For agencies using FedRAMP-authorized CI/CD platforms (GitHub Enterprise Server in GovCloud, GitLab Dedicated for Government, Azure Government DevOps), Semgrep integrates with existing workflows. [Semgrep Managed Scans](/getting-started/quickstart-managed-scans) (AWS deployed) may be acceptable for repositories without federal data per FedRAMP Authorization Boundary Guidance v3.0 Section 7, but requires a case-by-case assessment with your authorizing official. diff --git a/mintlify-docs/compliance/gdpr.mdx b/mintlify-docs/compliance/gdpr.mdx new file mode 100644 index 0000000000..b90fc86350 --- /dev/null +++ b/mintlify-docs/compliance/gdpr.mdx @@ -0,0 +1,27 @@ +--- +title: "GDPR compliance" +sidebarTitle: "GDPR" +--- + +**Disclaimer:** *Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. 
Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program.* + +**Last updated:** November 2025 + +GDPR (General Data Protection Regulation) governs how organizations collect, store, and process personal data of EU residents. Organizations must implement appropriate technical and organizational measures to protect personal data and demonstrate compliance with supervisory authorities. + + + +**WARNING** + +Personal data should not exist in code repositories. Semgrep scans code repositories, not production systems or databases containing customer data. EU resident personal data is typically absent from code repositories. + + + +Semgrep helps reduce GDPR violation risk: + +- **Article 25 (data protection by design and by default):** [Policy enforcement](/semgrep-code/policies) demonstrates that security is built into your development process from the start. When properly configured with CI/CD systems, Semgrep can enforce secure coding practices at the pull request level. For details about proper configuration, please chat with the [Semgrep team](/support/). + +- **Article 32 (security of processing):** [SAST scanning](/semgrep-code/overview) detects injection flaws, broken authentication, and insecure configurations that attackers exploit to access databases containing customer personal data. [Secrets detection](/semgrep-secrets/conceptual-overview) stops hardcoded API keys, database credentials, or access tokens that could provide unauthorized access to systems processing EU resident data. [Audit logs](/semgrep-code/findings) provide documented evidence of technical measures to protect personal data. + +- **Articles 44-50 (data transfers to third countries):** Deployment flexibility allows you to meet data residency requirements. CLI and on-premises CI/CD keep code in your environment. For cloud deployments, you can choose EU-region CI/CD providers. 
For Semgrep Multimodal, you can bring your own API keys with EU-based providers. Semgrep provides Data Processing Agreements with Standard Contractual Clauses for trans-Atlantic transfers. + diff --git a/mintlify-docs/compliance/hipaa-hitrust.mdx b/mintlify-docs/compliance/hipaa-hitrust.mdx new file mode 100644 index 0000000000..7e2bfd880c --- /dev/null +++ b/mintlify-docs/compliance/hipaa-hitrust.mdx @@ -0,0 +1,36 @@ +--- +title: "HIPAA/HITRUST compliance" +sidebarTitle: "HIPAA/HITRUST" +--- + + +**Disclaimer:** *Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program.* + +**Last updated:** November 2025 + +HIPAA (Health Insurance Portability and Accountability Act) establishes national standards for protecting medical records and health information. HITRUST CSF v11 (Common Security Framework) is a certifiable framework that harmonizes multiple security and privacy standards including HIPAA requirements. + + +**WARNING** + +Protected Health Information (PHI) should not exist in code repositories. Semgrep scans code repositories, not production systems or databases containing PHI data. PHI data is typically absent from code repositories. + + +Semgrep may help address HIPAA Security Rule requirements and HITRUST CSF v11 control families: + +- **HIPAA Technical Safeguard 164.312(a)(1) and HITRUST control 01.m (access control):** SAST scanning detects SQL injection, authentication bypasses, and broken authorization that attackers exploit to access PHI databases. Policy enforcement blocks these vulnerabilities at the pull request level before code reaches production systems handling PHI. Audit logs create timestamped records showing when access control vulnerabilities were detected and fixed. 
+ +- **HIPAA Technical Safeguard 164.312(a)(2)(i) and HITRUST control 01.q (unique user identification):** [Secrets detection helps prevent hardcoded database credentials](/semgrep-secrets/conceptual-overview), API keys, authentication tokens, and service account passwords from reaching production code that accesses PHI. Custom rules can enforce that all authentication uses centralized identity providers rather than hardcoded credentials. + +- **HIPAA Administrative Safeguard 164.308(a)(8) and HITRUST control 10.m (evaluation of security):** Policy enforcement demonstrates active preventive controls. When configured with CI/CD systems, Semgrep blocks vulnerable code at the PR level. Audit logs document policy enforcement activity showing security controls ran on every code change and blocked violations. + +- **HIPAA Technical Safeguard 164.312(b) and HITRUST control 10.k (audit controls):** Audit logs document every scan execution, security finding, remediation action, and status change with timestamps and user attribution. Exportable logs provide compliance evidence for auditor review showing continuous monitoring and systematic remediation over the audit period. + +- **HIPAA Administrative Safeguard 164.308(a)(1)(ii)(A) and HITRUST control 03.a (risk management):** [Jira integration](/semgrep-appsec-platform/jira) creates documented remediation workflows with timestamps, assignments, priority levels, and resolution timelines. This provides evidence that security findings are systematically identified, tracked, prioritized, and resolved according to risk management procedures. + +- **HIPAA Technical Safeguard 164.312(c)(1) and HITRUST control 01.o (integrity):** SAST rules detect code patterns that could allow data tampering, unauthorized modification of PHI records, or integrity violations. [Custom rules](/semgrep-code/editor) can enforce data validation requirements and detect missing integrity checks in code that modifies PHI. 
+ +- **HIPAA Technical Safeguard 164.312(e)(1) and HITRUST control 09.n (transmission security):** Custom SAST rules enforce TLS requirements for PHI transmission, detect insecure HTTP usage in healthcare applications, and flag missing encryption for data in transit. Rules can verify that all PHI transmission uses appropriate cryptographic protocols. + +- **Supply Chain Security for Healthcare:** [SBOM generation](/semgrep-supply-chain/sbom) provides visibility into third-party components used in healthcare applications including medical device software, connected health platforms, and patient portals. Supply Chain scanning detects vulnerabilities in healthcare-specific libraries such as HL7 parsers, FHIR implementations, and DICOM handlers. Reachability analysis shows which vulnerable dependencies actually process or access PHI, helping security teams prioritize remediation for components in the PHI data path. + diff --git a/mintlify-docs/compliance/iso-27017.mdx b/mintlify-docs/compliance/iso-27017.mdx new file mode 100644 index 0000000000..59947c182a --- /dev/null +++ b/mintlify-docs/compliance/iso-27017.mdx @@ -0,0 +1,29 @@ +--- +title: "ISO 27017 compliance" +sidebarTitle: "ISO 27017" +--- + + +**Disclaimer:** Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program. + +**Last updated:** November 2025 + +ISO 27017 extends ISO 27001 with cloud-specific security guidance for protecting customer data in cloud environments. This standard applies to cloud service providers and cloud customers. + +Semgrep may help address ISO 27017 cloud security guidance: + +- **Cloud service development:** Continuous vulnerability scanning and policy enforcement can help demonstrate security controls in development processes. 
When properly configured with CI/CD systems, Semgrep can enforce secure coding practices at the pull request level. For details around proper configuration please chat with the Semgrep team. + +- **Vulnerability management:** Automated detection and tracking of security issues in code that runs in cloud environments. Audit logs document security scanning activity, findings, and remediation with timestamps. + +- **Logging and monitoring:** Audit logs provide documented evidence of continuous security monitoring across your cloud application codebase. + +- **Supply chain security:** [SBOM generation](/semgrep-supply-chain/sbom) provides inventory of third-party components and dependencies deployed in cloud services, giving visibility into supply chain risk. + +- **Change management:** [Jira integration](/semgrep-appsec-platform/jira) documents how security issues are tracked and remediated through your change management process with timestamps, assignments, and resolution status. Policy enforcement can help prevent vulnerable code from reaching cloud production environments. + +### Deployment and certification + +ISO 27017 applies to cloud service providers and customers. If you provide cloud services to customers, your deployment of Semgrep should align with your cloud security architecture. + +Semgrep Inc. is not itself ISO 27017 certified. However, for CLI deployments, scans run on customer infrastructure. For on-premises CI/CD, scans run on customer-controlled infrastructure. Cloud CI/CD providers (GitHub, GitLab, Azure DevOps, Bitbucket) maintain ISO 27017 certification or equivalent cloud security controls. For Semgrep Multimodal, the default provider (OpenAI) operates in ISO 27017-compliant infrastructure. For Semgrep Managed Scans, AWS infrastructure maintains ISO 27017 certification for cloud security controls. 
\ No newline at end of file diff --git a/mintlify-docs/compliance/iso27001.mdx b/mintlify-docs/compliance/iso27001.mdx new file mode 100644 index 0000000000..c6771df1ed --- /dev/null +++ b/mintlify-docs/compliance/iso27001.mdx @@ -0,0 +1,26 @@ +--- +title: "ISO 27001 compliance" +sidebarTitle: "ISO 27001" +--- + + +**Disclaimer:** *Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program.* + +**Last updated:** November 2025 + +ISO 27001 is the international standard for information security management systems. Organizations must demonstrate continuous security testing and risk management, not just point-in-time assessments. + +Semgrep helps address multiple ISO 27001:2022 Annex A controls: + +- **Control A.8.8 (management of technical vulnerabilities):** Semgrep provides continuous [vulnerability scanning](/semgrep-code/overview) on every code change. [Audit logs](/semgrep-code/findings) document vulnerability detection and remediation timelines, giving auditors automated proof that controls are operational rather than requiring manual evidence collection during audit season. + +- **Controls A.8.25 through A.8.32 (secure development lifecycle):** When properly configured with CI/CD systems, [policy enforcement](/semgrep-code/policies) can help demonstrate active enforcement of secure coding practices. Auditors can see documented evidence that security policies were run on every code change. Note that developers with appropriate permissions can override policy blocks when necessary. For details around proper configuration, please chat with the [Semgrep team](/support/). 
+ +- **Controls A.8.9 and A.8.32 (configuration management and change management):** [Jira integration](/semgrep-appsec-platform/jira) documents how security issues are tracked and remediated through your change management process with timestamps, assignments, and resolution status. + +- **Controls A.5.19 through A.5.23 (information security in supplier relationships):** [SBOM generation](/semgrep-supply-chain/sbom) provides a documented inventory of third-party components and their vulnerabilities, proving you have visibility into supply chain risk. + + +### Deployment and certification + +Semgrep Inc. is **not** itself ISO 27001 certified, but the product can be used in deployment models that support an organization's ISO 27001 certification efforts. For CLI and customer-managed CI/CD deployments, scans run on customer-controlled infrastructure. For Semgrep Multimodal, the default provider (OpenAI) maintains ISO 27001 certification. For Semgrep Managed Scans (SMS), AWS infrastructure is ISO 27001 certified. diff --git a/mintlify-docs/compliance/nist-800-171.mdx b/mintlify-docs/compliance/nist-800-171.mdx new file mode 100644 index 0000000000..8ef8836863 --- /dev/null +++ b/mintlify-docs/compliance/nist-800-171.mdx @@ -0,0 +1,36 @@ +--- +title: "NIST 800-171 compliance" +sidebarTitle: "NIST 800-171" +--- + +**Disclaimer:** Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program. + +**Last updated:** November 2025 + +NIST SP 800-171 Revision 2 specifies 110 security requirements across 14 control families for protecting Controlled Unclassified Information (CUI) in non-federal systems. 
Defense contractors and government contractors handling CUI must maintain CUI within systems that implement these 110 security requirements and remain under contractor control. + + +**WARNING** + +NIST 800-171 applies only to systems that store, process, or transmit CUI. Not all code is CUI. You must assess each repository to determine whether it contains CUI. Commercial software, internal tools, and projects unrelated to government contracts typically do not contain CUI. + + +**Contractor-controlled systems defined:** Under NIST 800-171, contractor-controlled systems are information systems that are owned, operated, and maintained by the contractor (not the government), where the contractor implements all required security controls and maintains full administrative access. This includes on-premises infrastructure in contractor facilities and contractor-managed cloud infrastructure where the contractor implements the 110 NIST 800-171 security requirements. Standard commercial cloud services ([GitHub.com](http://GitHub.com), [GitLab.com](http://GitLab.com), [Azure DevOps Services](https://azure.microsoft.com/en-us/products/devops)) where the service provider controls security configurations generally do not meet the definition of contractor-controlled for CUI. + +When Semgrep scans your source code, it analyzes code for security vulnerabilities and policy violations. If your code does not contain CUI, NIST 800-171 requirements do not apply to code scanning. + +For repositories that **do not contain CUI**, Semgrep may help with your overall security posture: + +- **3.14.1 (flaw remediation):** SAST scanning detects security weaknesses in code. Audit logs document vulnerability detection and remediation timelines. + +- **3.5.10 (authenticator management):** Secrets detection helps prevent hardcoded credentials that provide unauthorized access from reaching production. 
+ +- **3.3.1 (audit record creation):** Audit logs document security scanning activity and findings with timestamps and user attribution. + +- **3.4.7 (least functionality):** Policy enforcement can help block vulnerable code at the pull request level. When properly configured with CI/CD systems, Semgrep can enforce security policies on every code change. For details around proper configuration, please chat with the [Semgrep team](/support/). + +### Deployment requirements for CUI + +If your code contains CUI, you must ensure your Semgrep deployment keeps CUI within contractor-controlled systems implementing all 110 NIST 800-171 security requirements. Currently the only Semgrep deployment that would support NIST SP 800-171 is the Semgrep CLI tool. The CLI tool runs entirely on local systems. If your local systems are contractor-controlled and implement all 110 NIST 800-171 requirements, CLI deployment keeps CUI within compliant systems. + +Not finding what you need in this doc? Ask questions in our [Community Slack group](https://go.semgrep.dev/slack), or see [Support](/support/) for other ways to get help. diff --git a/mintlify-docs/compliance/pci-dss.mdx b/mintlify-docs/compliance/pci-dss.mdx new file mode 100644 index 0000000000..0fefd072d1 --- /dev/null +++ b/mintlify-docs/compliance/pci-dss.mdx @@ -0,0 +1,30 @@ +--- +title: "PCI DSS compliance" +sidebarTitle: "PCI DSS" +--- + + +**Disclaimer:** *Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program.* + +**Last updated:** November 2025 + +PCI DSS (Payment Card Industry Data Security Standard) is mandatory for organizations that store, process, or transmit payment data. QSAs (Qualified Security Assessors) require documented evidence of security controls during assessments. 
+ + +**WARNING** + +Cardholder data should never exist in code repositories. Use designated test card numbers for testing. If no cardholder data exists in your code, PCI DSS does not apply to your SAST scanning. + + +Semgrep helps address PCI DSS requirements: + +- **Requirement 6.2 (ensure all systems are protected from known vulnerabilities):** [SAST scanning](/semgrep-code/overview) detects injection flaws, broken authentication, and insecure configurations that could expose cardholder data. [Audit logs](/semgrep-code/findings) provide documented evidence of vulnerability detection and remediation timelines. QSAs require quarterly validation, and Semgrep provides continuous evidence rather than point-in-time snapshots. + +- **Requirement 6.3.1 (removal of custom application accounts, user IDs, and passwords before applications become active):** [Secrets detection](/semgrep-secrets/conceptual-overview) helps prevent hardcoded credentials that provide access to payment systems from reaching production. + +- **Requirement 6.3.2 (secure coding practices):** QSAs expect to see evidence of vulnerability scanning, such as SAST, in the development process. When properly configured with CI/CD systems, [policy enforcement](/semgrep-code/policies) can help block risky code at the pull request level, creating a preventive control. Developers with appropriate permissions can override blocks when necessary. Every policy violation is documented for auditors. For configuration help, please contact [Semgrep](/support/). + +### Deployment guidance + +CLI and on-premises CI/CD keep code in customer-controlled infrastructure. Cloud CI/CD and Semgrep Managed Scans only process code repositories that should not contain cardholder data. If cardholder data is present in the code, verify that your deployment option meets your PCI scope requirements. 
+ diff --git a/mintlify-docs/compliance/soc2.mdx b/mintlify-docs/compliance/soc2.mdx new file mode 100644 index 0000000000..013f378025 --- /dev/null +++ b/mintlify-docs/compliance/soc2.mdx @@ -0,0 +1,20 @@ +--- +title: "SOC 2 compliance" +sidebarTitle: "SOC 2" +--- + +**Disclaimer:** *Semgrep provides security tooling that can support compliance efforts, but does not guarantee compliance. Organizations remain responsible for meeting all compliance requirements. Consult with your compliance team and auditors to determine how Semgrep fits into your compliance program.* + +**Last updated:** November 2025 + +Organizations pursuing SOC 2 Type II certification need to demonstrate that security controls are operational and effective over time (typically 6-12 months), not just implemented at a point in time. + +When Semgrep scans your code, it generates [audit logs](/semgrep-code/findings) that document every scan execution, security finding, remediation action, and status change with timestamps and user attribution. These logs provide evidence for SOC 2 Trust Services Criteria, including CC6.6 (vulnerabilities are identified and addressed), CC7.2 (system monitoring), and CC7.3 (evaluation of security events). + +When properly configured with CI/CD systems, Semgrep [policy enforcement](/semgrep-code/policies) allows security teams to define [custom security rules that can block code](/semgrep-ci/configuring-blocking-and-errors-in-ci#blocking-findings) from merging when violations are detected. This demonstrates preventive controls (CC6.1, CC6.6) rather than detective controls. Auditors want to see that you stop security issues before they reach production, not just detect them afterward. Note that developers with appropriate permissions can override policy blocks when necessary. For details around proper configuration, please chat with the Semgrep team. 
+ +[Jira integration](/semgrep-appsec-platform/jira) documents your remediation workflow with timestamps and assignments, giving auditors clear evidence that security issues are identified, tracked, and resolved systematically (CC8.1 change management). [SBOM generation](/semgrep-supply-chain/sbom) provides supply chain visibility for vendor risk management controls (CC9.1). + +### Deployment and certification + +Semgrep Inc. is SOC 2 Type II certified. For CLI deployments, scans run on customer infrastructure (which may or may not be SOC 2 certified, depending on customer controls). For on-premises CI/CD, scans run on customer-controlled infrastructure. Cloud CI/CD providers (GitHub, GitLab, Azure DevOps, Bitbucket) are SOC 2 certified. For Semgrep Managed Scans, scans run on Semgrep's SOC 2 Type II-certified AWS infrastructure. diff --git a/mintlify-docs/contributing/adding-a-language.mdx b/mintlify-docs/contributing/adding-a-language.mdx new file mode 100644 index 0000000000..7b51834925 --- /dev/null +++ b/mintlify-docs/contributing/adding-a-language.mdx @@ -0,0 +1,420 @@ +--- +title: "How to add support for a new language" +sidebarTitle: "Add support for a new language" +--- + +This document is about adding support for a new programming language in Semgrep using the [tree-sitter](https://tree-sitter.github.io/tree-sitter/) technology. Most languages in semgrep use `tree-parser` though you may also need to update the `menhir` parser. + +Repositories involved directly: + +* [**semgrep**](https://github.com/semgrep/semgrep): the semgrep command line program. +* [**ocaml-tree-sitter-semgrep**](https://github.com/semgrep/ocaml-tree-sitter-semgrep): language-specific setup, generates C/OCaml parsers for semgrep. +* A new repository **semgrep-LANG** for the language you're adding: this is a C or OCaml parser generated from `ocaml-tree-sitter-semgrep` by a Semgrep administrator. 
+* [**semgrep-interfaces**](https://github.com/semgrep/semgrep-interfaces/blob/main/generate.py) + +## Placeholder values + +This document uses the placeholder LANG to indicate that you should substitute the name of your language as the value in the given context. For example, if your language is Ruby, and the document's instructions read: + +> Create a new file `TEST_LANG_.txt` where LANG is in small caps. + +The name of your file should be `TEST_LANG_ruby.txt` + +> Create a file `Pretty_print.**_EXTENSION_**` with the filename extension of your language: + +The name of your file should be `Pretty_print.rb`. + +## `semgrep` repository overview + +There are some GitHub repositories involved in porting a language. +Here is the file hierarchy of the [`semgrep` +repository](https://github.com/semgrep/semgrep): + +```text +/languages +├── bash + ... +├── swift + ├── generic + └── tree-sitter + └── semgrep-swift # generated tree-sitter parsers +``` + +When you're done with the work in [`ocaml-tree-sitter-semgrep`](https://github.com/semgrep/ocaml-tree-sitter-semgrep), you'll need a new repository **`semgrep-LANG`** to host the generated parser code. + +Ask someone from the Semgrep team to create one for you. For this, they should use the template +[`semgrep-lang-template`](https://github.com/semgrep/semgrep-lang-template) when creating the repository. + +The instructions for adding a language start in [`ocaml-tree-sitter-semgrep`](https://github.com/semgrep/ocaml-tree-sitter-semgrep), as indicated below. Be careful that you are always in the correct repository! + +## Set up `ocaml-tree-sitter-semgrep` + +As a model, you can use the existing setup for `ruby` or `javascript`. The most complicated setup is for `typescript` and `tsx`. + +### Expedited setup + +If you're lucky, the language you want to add can be added with the script `add-simple-lang`: + +```bash +cd lang +./add-simple-lang --help +``` + +Follow the instructions from --help. 
+ +This often works with languages that define a single dialect using a `grammar.js` file at the root of the project. If this simplified approach fails, use the [Manual setup](#manual-setup) instructions below to understand what's going on or to set things up manually. + +### Manual setup + +From the `ocaml-tree-sitter-semgrep` repository, do the following: + + + +Create a `lang/LANG` folder. + + +Make a `test/ok` directory. Inside the directory, create a simple `hello-world` program for the language you are porting. Name the program `hello-world.EXTENSION`. + + +Now make a file called `extensions.txt` and input all the language extensions (.rb, .kt, etc) for your language in the file. + + +Create a file called `fyi.list` with all the information files, such as + `semgrep-grammars/src/tree-sitter-LANG/LICENSE`, + `semgrep-grammars/src/tree-sitter-LANG/grammar.js`, + `semgrep-grammars/src/semgrep-LANG/grammar.js`, etc. + to bundle with the final OCaml/C project. + + +Link the Makefile.common to a Makefile in the directory with: + `ln -s ../Makefile.common Makefile` + + + +Create a test corpus. You can do this by: + * Running `most-starred-for-language` to gather projects + on which to run parsing stats. Run with the following command: + `./scripts/most-starred-for-language LANG YOUR_USERNAME API_KEY` + * Using github advanced search to find the most starred or most forked repositories. + + +Copy the generated `projects.txt` file into the `lang/LANG` directory. + + +Add in extra projects and extra input sets as you see necessary. + + + +Here's the file hierarchy for Ruby: + +```bash +lang/ruby # language name of the form [a-z][a-z0-9]* +├── extensions.txt # standard name. Required for stats. +├── fyi.list # list of informational files to copy. Recommended. +├── Makefile -> ../Makefile.common +├── projects.txt # standard name. Required for stats. 
+└── test # sample input files + ├── ok # contains input files supported by the current grammar + │ ├── comment.rb + │ ├── ex1.rb + │ ├── ex2.rb + │ ├── hello.rb + │ └── poly.rb + └── xfail # contains input files that are expected to fail + └── rating.rb +``` + +To test a language in `ocaml-tree-sitter-semgrep`, you must build the +`ocaml-tree-sitter-semgrep` OCaml code generator, run it to produce a parser, +then run some tests for the parser. Full instructions for this +are given in [updating-a-grammar](/contributing/updating-a-grammar) under +"Testing". The short instructions are: +1. For the first time, build everything with `./scripts/rebuild-everything`. +2. Subsequently, work from the `lang/LANG` folder and run + `make` and `make test`. + +### The `fyi.list` file + +The `fyi.list` file was created to specify informational files that +should accompany the generated files. These files are typically: + +* the source grammar, most often a single `grammar.js` file. +* the licensing conditions usually specified in a `LICENSE` file. + +Example: + +```text +# Comments are allowed on their own line. +# Blank lines are ok. + +# Each path is relative to ocaml-tree-sitter-semgrep/lang +semgrep-grammars/src/tree-sitter-ruby/LICENSE +semgrep-grammars/src/tree-sitter-ruby/grammar.js +semgrep-grammars/src/semgrep-ruby/grammar.js +``` + +The files listed in `fyi.list` end up in a `fyi` folder in +tree-sitter-lang. For example, +[see `ruby/fyi`](https://github.com/semgrep/semgrep-ruby/tree/main). + +## Extend the original grammar with semgrep syntax + +This is best done after everything else is set up. Some constructs +such as semgrep metavariables (`$FOO`) may already be valid constructs +in the language, in which case there's nothing to do. Some support for +the semgrep ellipsis `...` usually needs to be added as well. + +You'll need to learn [how to create tree-sitter +grammars](https://tree-sitter.github.io/tree-sitter/creating-parsers). 
+ + + +Work from `semgrep-grammars/src/semgrep-LANG` and use `make` and + `make test` to build and test. + + +Add new test cases to `test/corpus/semgrep.text`. + + +Edit `grammar.js`. + + +Refer to the original grammar in + `semgrep-grammars/src/tree-sitter-LANG` to determine which rules to + extend. + + + +For an example of how to extend a language, you can: +* Look at what was done for the semgrep extensions of other languages + in their respective `semgrep-*` folders. +* Look at how `tree-sitter-typescript` extends the JavaScript grammar. + This is the file [`common/define-grammar.js` in the + tree-sitter-typescript repository](https://github.com/tree-sitter/tree-sitter-typescript/blob/master/common/define-grammar.js). + +Avoiding parsing conflicts is the trickiest part. Asking for help is encouraged. + + + +**💡 A NOTE ON THE JAVASCRIPT SYNTAX THAT'S HEAVILY USED TO DEFINE AND EXTEND GRAMMARS:** + +When possible, the development team prefers **shorthand** notation for anonymous functions made of a single expression: +```js +(x) => x +``` +which is the same as +```js +(x) => { return x; } +``` +which is itself the same as +```js +function(x) { return x; } +``` + +When extending any rule with an alternate choice such as `$.ellipsis`, +the simpler way is this one: + +```js +expression: ($, previous) => choice(previous, $.ellipsis), +``` + +However, if the `previous` rule is known to be a `choice()`, you can avoid +one level of nesting and append to the original list of choices, which +is done as follows: +```js +expression: ($, previous) => choice(...previous.members, $.ellipsis), +``` + +Whether to use one or the other is a matter of taste. 
+ + + +Finally, on rare occasions where the rule body is more than a single expression, you'll have to use the curly brace or return syntax: +```js +expression: ($, previous) => { + if (semgrep_ext) + return choice(...previous.members, $.ellipsis); + else + return previous; +}, +``` + +## Parsing statistics + +From a language's folder such as `lang/csharp`, two targets are +available to exercise the generated parser: + +* `make test`: runs on `test/ok` and `test/xfail` +* `make stat`: downloads the code specified in `projects.txt` and + parses the files whose extension matches those in `extensions.txt`, + reporting parsing success in the form of a CSV file. + +For gathering a good test corpus, you can use [GitHub +Search](https://github.com/search/advanced) or the script provided in +`scripts/most-starred-for-language.py`. For github searches, filter by +programming language and use a constraint to select large projects, +such as "> 100 forks". Collect the repository URLs and put them into +`projects.txt`. + +## Publish generated parsers + +After you have pushed your ocaml-tree-sitter-semgrep changes to the main +branch, do the following: + + + +Check that the original `grammar.js`, `src/scanner.c`/`.cc` (if + applicable) look clean and have minimal external dependencies. + + +In `ocaml-tree-sitter/lang/Makefile`, add language under + 'SUPPORTED_LANGUAGES' and 'STAT_LANGUAGES'. + + +In `ocaml-tree-sitter/lang` directory, run `./release LANG --dry-run`. + If this looks good, please [ask someone from the Semgrep team](https://github.com/semgrep/ocaml-tree-sitter-semgrep/blob/main/doc/release.md) to + publish the code using `./release LANG`. + + + +### Troubleshooting + +Various errors can occur along the way. + +Compilation errors in C or C++ are usually due to a missing source +file `scanner.c` or `scanner.cc`, or a grammar with a name that +doesn't match the name inside the scanner file. 
JavaScript files may +also be missing, in particular in the case of grammars that extend +existing grammars such as C++ for C or TypeScript for +JavaScript. Check for `require()` calls in `grammar.js` and learn how +this NodeJS primitive resolves paths. + +There may also be errors when generating or compiling +OCaml code. These are likely bugs in ocaml-tree-sitter-semgrep and they should +be reported or fixed right away. + +Here are some known types of parsing errors: + +* A syntax error. The input program is in the wrong syntax or uses a + recent feature that's not supported yet: `make test` or directly the + `parse_LANG` program will show the tree produced by tree-sitter with + one or more `ERROR` nodes. +* A "reparsing" error. It's an error generated after the first + successful parsing pass by the tree-sitter parser, during the + reparsing pass by the OCaml code performed by the generated + `Parse.ml` file. The error message should tell you something like + "cannot interpret tree-sitter's output", with details on what code + failed to match what pattern. This is most likely a bug in + `ocaml-tree-sitter-semgrep`. +* A segmentation fault. This could be due to a bug in the + OCaml/tree-sitter C bindings and should be fixed. A simple test case + that reproduces the problem would be nice. + See https://github.com/semgrep/ocaml-tree-sitter-semgrep/issues/65 + +Parsing errors that are due to an incomplete or incorrect grammar should be recorded, and eventually reported or fixed in the upstream project. + +We keep failing test cases in a `fail/` folder, preferably in the form of the minimal program suitable for a bug report, with a comment describing what was expected and what's going on. 
+ + +## Update the `semgrep` repository + +Now that you have added your new language LANG to `tree-sitter`, do the following: + + + +Update [`generate.py`](https://github.com/semgrep/semgrep-interfaces/blob/main/generate.py) in the `semgrep-interfaces` repository with your new language. + + +In the `semgrep` repository, go to [`/src/parsing/Check_pattern.ml`](https://github.com/semgrep/semgrep/blob/develop/src/parsing/Check_pattern.ml), and add LANG to `lang_has_no_dollar_ids`. If the grammar has no dollar identifiers, add LANG above 'true'. Otherwise, add it above 'false'. + + +In [`/src/printing/Pretty_print_AST.ml`](https://github.com/semgrep/semgrep/blob/develop/src/printing/Pretty_print_AST.ml), add LANG to the appropriate functions: + * `print_bool` + * `if_stmt` + * `while_stmt` + * `do_while` + * `for_stmt` + * `def_stmt` + * `return` + * `break` + * `continue` + * `literal` + + +In [`/src/parsing/tests/Test_parsing.ml`](https://github.com/semgrep/semgrep/blob/develop/src/parsing/tests/Test_parsing.ml), add in LANG to `dump_tree_sitter_cst_lang`. + + +Inspect the other languages in `/languages` as a reference for what + code to add. Create a new folder for your language. + + +Add the `semgrep-LANG` repository as a submodule under + `/languages/LANG/tree-sitter/` (`git submodule add ...`). + + +Create a file + `/languages/LANG/tree-sitter/Parse_LANG_tree_sitter.ml` + by copying the generated template `Boilerplate.ml` that you'll find + in the `semgrep-LANG` submodule. + Add basic functionality to + define the function `parse` and import the module + `Parse_tree_sitter_helpers`. + Look at other languages to get a better idea of how to + define the parse file function. 
This file should contain something + similar to: + ```ocaml + module H = Parse_tree_sitter_helpers + + let parse file = + H.wrap_parser + (fun () -> + Parallel.backtrace_when_exn := false; + Parallel.invoke Tree_sitter_X.Parse.file file () + ) + ``` + + +Create the missing `dune` files wherever you have OCaml source + files (`.ml`, `.mli`) by imitating what was done for other + languages. + + +Write a basic test case for your language in + `tests/LANG/hello-world.EXT`. This + can just be a hello-world function. + + +Try to build the project using the usual commands + (`make` or `make dev`). + + +Test that the command + `semgrep-core/bin/semgrep-core -dump_tree_sitter_cst test/LANG/hello-world` + prints out a CST for your language. + + + +At this point, you're ready to start writing the translator from +the CST produced by the tree-sitter parser for LANG +into the generic AST used by Semgrep, accommodating all the languages +in a single AST type. It's recommended but not required to first +translate the CST into a language-specific AST before translating it +into the generic AST in a second step. + +## Legal concerns + +Be thankful to the authors of the original code, keep license notices +clearly visible, and make it easy to get back to the original projects: + +* Make sure to preserve the `LICENSE` files. These should be listed in + the `fyi.list` file. +* For sample input in `test/`, consider Public Domain ("The + Unlicense") files or write your own, for simplicity. + [GitHub Search](https://github.com/search/advanced) + allows you to filter projects by license and by programming language. 
+ +## See also + + + + \ No newline at end of file diff --git a/mintlify-docs/contributing/contributing-code.mdx b/mintlify-docs/contributing/contributing-code.mdx new file mode 100644 index 0000000000..f4b17df92d --- /dev/null +++ b/mintlify-docs/contributing/contributing-code.mdx @@ -0,0 +1,279 @@ +--- +title: "Contributing code" +--- + +Semgrep welcomes contributions from anyone. If you have an idea for a feature +or notice a bug please [open an issue](https://github.com/semgrep/semgrep/issues/new/choose). +Creating an issue first is preferable to moving directly to a pull request so +that we can ensure you're on the right track without any wasted effort. This +is also a great way to contribute to Semgrep even if you're not making changes +yourself. + +This README gives an overview of the repository. For further information on building, you will be directed to [semgrep-core contributing](/contributing/semgrep-core-contributing) and/or [semgrep-cli contributing](/contributing/semgrep-contributing) in [Making a Change](#making-a-change). + +## File structure + +Semgrep consists of a Python wrapper (`semgrep-cli`) around an OCaml engine (`semgrep-core`) which performs the core parsing/matching work. Within `semgrep-core`, there are two sources of parsers, `pfff` and `tree-sitter-lang` using [tree-sitter](https://github.com/tree-sitter/tree-sitter). Additionally, `semgrep-core` contains a subengine, `spacegrep`, for generic matching. + +You may also be interested in `perf`, which contains our code for running repositories against specific rulesets. + +There are many other files, but the below diagram broadly displays the file structure. + +```text +. 
+├── cli/ (Python wrapper) +│ └── src/ +│ └── semgrep/ +│ +├── src/ (semgrep-core) +│ │── analyzing/ (Dataflow analysis) +│ │── core_cli/ (Entrypoint for semgrep-core) +│ └── matching/ (Matching engine) +│ +├── languages/ (Language parsers) +│ +├── libs/ (Library components) +│ │── ast_generic/ (Generic AST) +│ └── spacegrep/ (Generic matching) +│ +└── perf/ (Performance benchmarking) +``` + +Most of Semgrep's logic is in `cli/src` and `src`. + +## Code relationship + +The `semgrep-core` binary stands alone. Once built, it is possible to run `semgrep-core` on a semgrep rule for a given language with a file/directory and receive matches. + +For example, say you create the config file `unsafe-exec.yaml` and the program `unsafe-exec.py`: + +```yaml +rules: +- id: unsafe-exec + pattern: exec(...); + message: Avoid use of exec; it can lead to a remote code execution. + severity: MEDIUM + languages: [python] +``` + +```python +exec("ls"); +``` + +If you run `semgrep-core -config unsafe-exec.yaml unsafe-exec.py -lang python`, it will output + +```text +unsafe-exec.py:1 with rule unsafe-exec + exec("ls"); +``` + +If you run `semgrep --config unsafe-exec.yaml unsafe-exec.py`, it will output + +```text +running 1 rules... +unsafe-exec.py +severity:warning rule:unsafe-exec: Avoid use of exec; it can lead to a remote code execution. +1:exec("ls"); +ran 1 rules on 1 files: 1 findings +``` + +The matched code is the same, but with `semgrep-cli` the output is more polished and includes the message. + +`semgrep-cli` invokes the `semgrep-core` binary as a subprocess, with a flag to request JSON output. It reads the `semgrep-core` output and transforms it appropriately. + +Currently, depending on the flags used, `spacegrep` is invoked both independently by `semgrep-cli` as a subprocess and by `semgrep-core` as a subfolder. Therefore, `semgrep-cli` requires the `spacegrep` binary, but building `semgrep-core` will build `spacegrep` as well. 
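The subprocess hand-off described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `semgrep-cli` code: the function name `run_engine_json`, the stand-in command `FAKE_ENGINE`, and the JSON payload are all hypothetical, and the real `semgrep-core` flags and output schema are not reproduced here.

```python
import json
import subprocess
import sys
from typing import Any, List


def run_engine_json(cmd: List[str]) -> Any:
    """Run a command as a subprocess and parse its stdout as JSON.

    check=True raises CalledProcessError if the command exits non-zero,
    so a failing engine is not silently treated as "no matches".
    """
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


# Stand-in for the engine binary: a tiny Python process that prints JSON.
# Hypothetical payload; the real semgrep-core output schema differs.
FAKE_ENGINE = [
    sys.executable,
    "-c",
    'import json; print(json.dumps({"matches": [{"rule": "unsafe-exec", "line": 1}]}))',
]

if __name__ == "__main__":
    result = run_engine_json(FAKE_ENGINE)
    # Post-processing step: turn the raw JSON into user-facing text.
    for match in result["matches"]:
        print(f'{match["rule"]} at line {match["line"]}')
```

Keeping the wrapper limited to "run, parse, reformat" mirrors the split described in this section: the engine produces structured results, and the Python layer decides how to present them.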
+ +## Making a change + +Semgrep runs on Python versions >= 3.8. If you don't have one of these versions installed, please install one before proceeding. + +Because the Python and OCaml development paths are relatively independent, the instructions are divided into Python ([semgrep-cli contributing](/contributing/semgrep-contributing)) and OCaml ([semgrep-core contributing](/contributing/semgrep-core-contributing)). + +To fully build Semgrep from source, start at [semgrep-core contributing](/contributing/semgrep-core-contributing). It will direct you to [semgrep-cli contributing](/contributing/semgrep-contributing) when appropriate. + +Depending on what change you want to make, it might be simpler to build only `semgrep-cli` or only `semgrep-core`. For example, if you only want to modify Python code, you can skip installing OCaml by downloading binaries for the OCaml parts. Similarly, if you only want to modify OCaml code, you can work on `semgrep-core`/`spacegrep` directly. + +If you only want to build `semgrep-cli`, go straight to [semgrep-cli contributing](/contributing/semgrep-contributing). Otherwise, follow the instructions in [semgrep-core contributing](/contributing/semgrep-core-contributing). + +Below is a guide for what functionality each of `semgrep-cli` and `semgrep-core` controls. + +### Only `semgrep-cli` + +The Python code for Semgrep performs pre- and post-processing work. You likely need to touch only `semgrep-cli` if you want to affect + +* How output is formatted +* What files are scanned for each language +* The message that is displayed + +Go to [semgrep-cli contributing](/contributing/semgrep-contributing) + +### Only `semgrep-core` + +The OCaml code for Semgrep performs all the parsing and matching work. 
You likely need to touch only `semgrep-core` if you want to + +* Fix a parse error +* Fix a matching error +* Improve Semgrep's performance + +Go to [semgrep-core contributing](/contributing/semgrep-core-contributing) + +### Both `semgrep` and `semgrep-core` + +There are some features that cross through both OCaml and Python code. You will likely need to touch both `semgrep-cli` and `semgrep-core` if you want to + +* Fix a rule-defined fix error +* Add a new language +* Change error reporting + +Go to [semgrep-core contributing](/contributing/semgrep-core-contributing). It will direct you to [semgrep-cli contributing](/contributing/semgrep-contributing) when appropriate. + +## Development workflow + +Before each commit Semgrep will run [`pre-commit`](https://pre-commit.com/) to +ensure files are well-formatted and check for basic linting bugs. If you don't +have `pre-commit` installed, the following command installs it: + +```bash +python -m pip install pre-commit +``` + +Our `pre-commit` configuration uses Docker images. Please ensure you have +[Docker installed](https://docs.docker.com/get-docker/) before running +`pre-commit`. Install the `pre-commit` hooks with the following command: + +```bash +pre-commit install +``` + +To ensure `pre-commit` is working as expected, run the following command: + +```bash +pre-commit run --all +``` + +Once `pre-commit` is working you may commit code and create pull requests as +you would expect. Pull requests require approval of at least one maintainer and +[CI to be passing](https://github.com/semgrep/semgrep/actions). + +### Explaining code + +It's important for code to be easy to maintain. This allows all of us to +spend more time on new features rather than spending it on studying +legacy code. As a general rule of thumb, assume that all context that +is not written down will be lost and forgotten. Useful context includes: + +* Why does this code exist? +* What or who uses this code? +* What does this code achieve? 
+* Could this code be replaced by an off-the-shelf component? Why not? +* Does it implement a formal specification or a well-known pattern? Where can + we learn more about it? + +We ask that **each source file start with one comment** that +concisely answers these questions. + +Here's a short example: +```ocaml +(* + Generate unique names with a given prefix. +*) +``` + +It can be improved by explaining the code's uses: +```ocaml +(* + Generate unique names with a given prefix. This is used to + name new grammar rules and new OCaml variables. +*) +``` + +### Adding a changelog entry + +#### Quick reference + +Add a new file named like `changelog.d/gh-1234.fixed` that contains +a single paragraph of Markdown text such as: +```text +Fix emojis absorbed by the fleeb generator +``` + +File name format: +```text +gh-1234.fixed + ^^^^ ^^^^^ + | | + | one of: "added", "changed", "fixed", "infra" + GitHub issue or pull request ID +``` + +Valid changelog file suffixes are: +- `added` - New features or other previously non-existing functionality +- `changed` - Items that have changed the way Semgrep functions +- `fixed` - Bug fixes or other improvements +- `infra` - Workflow improvements or other non-code updates + +#### When to add a changelog entry + +If you contribute code that affects users, you must add an entry +to the changelog, in the [`changelog.d` +folder](https://github.com/semgrep/semgrep/tree/develop/changelog.d). At +each Semgrep release, these files are automatically gathered and formatted to +produce [release notes](https://github.com/semgrep/semgrep/blob/develop/CHANGELOG.md). + +A changelog entry is required if you are: +- Adding new features or other previously non-existing functionality. +- Including important changes in the way Semgrep functions. +- Submitting bug fixes or other improvements. +- Creating workflow improvements or other non-code updates. + +A tool called [`towncrier`](https://github.com/twisted/towncrier) is +used for changelog management. 
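The file name format from the quick reference can be checked mechanically. The sketch below is illustrative only and is not part of `towncrier` or the Semgrep tooling: the helper `parse_changelog_name` is hypothetical, and it assumes every entry uses the `gh-<number>.<suffix>` form shown above; any other naming forms the repository may accept are not covered.

```python
import re
from typing import Optional, Tuple

# Suffixes listed in the quick reference above.
VALID_SUFFIXES = ("added", "changed", "fixed", "infra")

# "gh-" + numeric issue or pull request ID + "." + one of the valid suffixes.
_ENTRY_RE = re.compile(r"gh-(\d+)\.(" + "|".join(VALID_SUFFIXES) + r")")


def parse_changelog_name(filename: str) -> Optional[Tuple[int, str]]:
    """Return (issue_or_pr_id, suffix) for a well-formed name, else None."""
    m = _ENTRY_RE.fullmatch(filename)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)


if __name__ == "__main__":
    for name in ("gh-1234.fixed", "gh-1234.removed", "1234.fixed"):
        print(name, "->", parse_changelog_name(name))
```

A check like this could run before committing to catch a mistyped suffix early, since a misnamed file would otherwise only surface when the release notes are assembled.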
+ +### Troubleshooting pre-commit + +On M1 Macs, some `pre-commit` checks may fail. + +If those checks run in Docker containers (such as `hadolint`) and exit with code 137, they are hitting a memory limit. +This happens because Docker runs x86_64 images on an M1 Mac through QEMU emulation, which can increase memory consumption. +To fix this, raise the memory limit in Docker Desktop, in the Resources section of the Preferences; 8 GB should be sufficient. + +### Working with git submodules + +A submodule is a reference to a specific commit in another git +repository. This results in a subfolder containing a checkout of that +repository at that particular commit. Submodules have a reputation of +being tricky to use. To minimize problems, make sure to follow these +guidelines: + +* When checking out a new branch or commit, update the submodules + using the command `git submodule update --init --recursive`. + Adding a shortcut to your shell can be useful. The following is a + Bash function that lets you call `gitup`. It goes into your `~/.bashrc`: + +```bash +gitup() { + echo "git submodule update --init --recursive" + git submodule update --init --recursive +} +``` + +* When modifying both a parent repo A and one of its submodules B, + make one pull request for each (PR A, PR B).

+ i. Before merging PR B, make sure the branch on repo B is **not + lagging behind** the main branch. This ensures that the submodule + includes all the latest changes made by others.

+ ii. Make sure PR B is merged **before** PR A. + This ensures that other developers will pick up the changes on B + when making their own changes.

+ iii. After merging PR A, check that submodule B is still up-to-date + with respect to its main branch, especially if PR B was merged + more than an hour ago. + Good to know: + - Merging in B can be done with a merge commit or by squashing the + commits. + - If squashing commits in B, you must know that the original commit + referenced by A becomes orphaned when the branch is deleted but + remains cached by git for a while. This is usually sufficient to + not require A to point to the newly-squashed commit. _If this turns + out to be problematic in practice, we may have to disallow + commit squashing in the future._ diff --git a/mintlify-docs/contributing/contributing-to-semgrep-rules-repository.mdx b/mintlify-docs/contributing/contributing-to-semgrep-rules-repository.mdx new file mode 100644 index 0000000000..c69d55a0d4 --- /dev/null +++ b/mintlify-docs/contributing/contributing-to-semgrep-rules-repository.mdx @@ -0,0 +1,534 @@ +--- +title: "Contribute rules to the Semgrep Registry" +--- + + +Publish rules to the Semgrep Registry to share them with the Semgrep community and contribute to the field of software security. There are two ways in which you can contribute rules to the Semgrep Registry: + +**For users of Semgrep AppSec Platform** + +Contribute new rules to the Semgrep Registry through Semgrep AppSec Platform. This workflow is recommended. See [Contribute through Semgrep AppSec Platform (recommended)](#contribute-through-semgrep-appsec-platform-recommended). This workflow creates the necessary pull request for you and streamlines the whole process. + +**For contributors to the repository through GitHub** + +Contribute rules to the Semgrep Registry, or suggest changes to existing rules, through a pull request to `semgrep-rules`. See the [Contribute through GitHub](#contribute-through-github) section for detailed information. + +## Contribute through Semgrep AppSec Platform (recommended) + +This is the recommended path for adding a new rule. 
To suggest a change to an existing rule, see [Update existing rules in Semgrep Registry](#update-existing-rules-in-semgrep-registry). + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Go to the [Semgrep Playground](https://semgrep.dev/playground/new). + + +Click **Create New Rule**. + + +Choose one of the following: + - Create a new rule and test code by clicking **plus** icon, select **New rule** and then click **Save**. Note: The test file must contain at least one true positive and one true negative test case to be approved. See the [Tests](#tests) section of this document for more information. + - In the **Library** panel, select a rule from a category in **Semgrep Registry**. Click **Fork**, modify the rule or test code, and then click **Save**. + + +Click **Share**. + + +Click **Publish to Registry**. + + +Fill in the required and optional fields. + + +Click **Continue**, and then click **Create PR**. + + + +This workflow automatically creates a pull request in the GitHub [Semgrep Registry](https://github.com/semgrep/semgrep-rules). Find more about the Semgrep Registry by reading the [Rule writing](#write-a-rule-for-semgrep-registry) and [Tests](#tests) sections. + +You can also publish rules as private rules outside of Semgrep Registry. These rules are not included in the Semgrep Registry, but they are accessible to your Semgrep organisation. See the [Private rules](/writing-rules/private-rules) documentation for more information. + +## Contribute through GitHub + + + +Create a pull request in the [semgrep/semgrep-rules](https://github.com/semgrep/semgrep-rules) repository. The pull request requires two files: + - The Semgrep rule saved as a YAML file. + - The test file with the file extension of the language or framework. The test file must contain at least one true positive and one true negative test case to be approved. See the [Tests](#tests) section of this document for more information. 
+ + +Sign the Contributor License Agreement (CLA) on GitHub; this is required before Semgrep can accept your contributions. + +Pull requests require the approval of at least one maintainer and successfully passed [CI jobs](https://github.com/semgrep/semgrep-rules/actions). + + + +Find more about the Semgrep Registry by reading the [Rule writing](#write-a-rule-for-semgrep-registry) and [Tests](#tests) sections. + +## Licensing + +The Semgrep Registry can import rules from different repositories. These repositories can enforce their own licensing for rules. If you'd like to enforce a specific license, such as the MIT license or GNU Lesser GPL: + + + +Create a GitHub repository and store your rules there. + + +Reach out to the Semgrep team through the [Community Slack](https://go.semgrep.dev/slack) or [Support](/support) + + + +## Write a rule for Semgrep Registry + +The following sections document necessary fields in rule files of Semgrep Registry, provide information about rule messages, inform about test files, mention rule quality checkers, and describe additional fields required by rules in the security category. + +### General rule requirements + +All rules in general, regardless of whether they are intended only as local rules or for Semgrep Registry, have the same initial requirements. The following table is also included in the [Rule Syntax](/writing-rules/rule-syntax) article. + +All required fields must be present at the top level of a rule immediately under the `rules` key. + +|**Field**|**Type**|**Description**| +|---|---|---| +|`id`|`string`|Unique, descriptive identifier, for example: `no-unused-variable`| +|`message`|`string`|Message that includes why Semgrep matched this pattern and how to remediate it. See also [Rule messages](/contributing/contributing-to-semgrep-rules-repository#rule-messages).| +|`severity`|`string`|Severity can be `LOW`, `MEDIUM`, `HIGH`, or `CRITICAL`. It indicates the criticality of issues detected by a rule. 
Note: Semgrep Supply Chain uses [CVE assignments for severity](/semgrep-supply-chain/findings#filter-findings), while the rule author sets severity for Code and Secrets. The older levels `ERROR`, `WARNING`, and `INFO` match `HIGH`, `MEDIUM`, and `LOW`. Severity values remain backwards compatible.| +|`languages`|`array`|See [language extensions and tags](/writing-rules/rule-syntax#language-extensions-and-languages-key-values).| +|`pattern`*|`string`|Find code matching this expression| +|`patterns`*|`array`|Logical `AND` of multiple patterns| +|`pattern-either`*|`array`|Logical `OR` of multiple patterns| +|`pattern-regex`*|`string`|Find code matching this [PCRE2](https://www.pcre.org/current/doc/html/pcre2pattern.html)\-compatible pattern in multiline mode| + + +**INFO** + +Only one of the following keys are required: `pattern`, `patterns`, `pattern-either`, `pattern-regex` + + +Every rule also requires a test file in the language that the rule is targeting. See [Tests](#tests) for more details. + +### Semgrep registry rule requirements + +In addition to the fields mentioned above, rules submitted to Semgrep Registry have additional required fields: + +| Field | Description | Possible values | Example | +|---|---|---|---| +| `metadata` | All rules require `technology`, `category`, and `references`. The `category: security` has more requirements. See Including fields required by security category. | Required by all Semgrep Registry rules:


Additional keys required when `category` is `security`: | `metadata:`
`cwe:`
`- "CWE-94: (...)"`
`category: security`
`technology:`
`- unicode`
`references:`
- https://trojansource.codes/| +| `technology` | Nested under the `metadata` field. Additional information about the technology. This helps to specify rulesets in Semgrep Registry. | | `metadata:`
`technology:`
`- react`| +| `category` | Nested under the `metadata` field. If you use category `security`, include additional metadata. See Including fields required by security category. | | `category: security`| +| `references` | Additional information that gives more context to the user of the rule. This helps developers understand the issue and how to fix it. | No finite value. Any additional information that gives more context. | `references:`
- [OWASP DOM based XSS Prevention Cheat Sheet][OWASP-DOM-based-XSS-prevention] | + + + +**INFO** + +- If you use category security, include additional metadata. See Including fields required by security category. +- Cross-file (interfile) analysis requires `interfile: true` under the `options` key in YAML rules. For more information, see [Creating rules that analyze across files](/semgrep-code/semgrep-pro-engine-intro/#write-rules-that-analyze-across-files-and-functions). + + + +### Rule namespace + +The namespacing format for contributing rules in the [Semgrep Registry](https://github.com/semgrep/semgrep-rules) is `///$MORE`. If the rule does not belong to a particular framework, add it to the language directory, which uses the word `lang` in place of the `` - `/`. + +### Tests + +Include a test file in the language that your rule is targeting. A test file includes the following: + +- At least one test where the rule detects a finding. This is called a true positive finding. +- At least one test where the rule does **not** detect a finding. This is called a true negative finding. + +Test file names must match the rule filename, except for the file extension. For example, if the rule is in `my-rule.yaml`, the test filename must be `my-rule.js`. Use any valid extension for the target language. + + + +**REQUIREMENTS OF TEST FILES** + +- In the test file, include examples that mark: + - What is expected to be a finding. + - What is not a finding. +- The test filename must match the rule filename, except for the file extension. + + + + + +See the examples of the rule and test file below: + +Rule file: +```yaml +rules: +- id: my-rule + pattern: var $X = "..."; + … +``` + +In the test file, mark an expected finding with a comment tag and the `ruleid` of your rule in the comment before the expected finding. Also, mark the code that is expected not to be a finding with a comment stating `ok` and add the `ruleid` also. 
See the example below: +```js +// ruleid: my-rule +var strdata = "hello"; +// ok: my-rule +var numdata = 1; +``` + +For more information, visit [Testing rules](/writing-rules/testing-rules). + +### Rule messages + +Include a rule message that provides details about the matched pattern and informs about how to mitigate any related issues. Provide the following information in a rule message: + +1. Description of the pattern. For example: missing parameter, dangerous flag, out-of-order function calls. +2. Description of why this pattern was detected. For example: logic bug, introduces a security vulnerability, bad practice. +3. An alternative that resolves the issue. For example: Use another function, validate data first, and discard the dangerous flag. + +Use the YAML multiline string operator `>-` when rule messages span multiple lines. This presents the best-looking rule message on the command line without having to worry about line wrapping or escaping the quote or using the backslash. + +For an example of a good rule message, see: [this rule for Django's `mark_safe`](https://semgrep.dev/r?q=python.django.security.audit.avoid-mark-safe.avoid-mark-safe). + + + +**RULE MESSAGE EXAMPLE** + +`mark_safe()` is used to mark a string as safe for HTML output. This disables escaping and may expose the content to XSS attacks. Instead, use `django.utils.html.format_html()` to build HTML for rendering. + + + + + +### Rule quality checker + +When you contribute rules to the Semgrep Registry, our quality checkers (linters) evaluate if the rule conforms to Semgrep, Inc. standards. The `semgrep-rule-lints` job runs linters on a new rule to check for mistakes, performance problems, and best practices for submitting to the Semgrep Registry. To improve your rule writing, use Semgrep itself to [scan semgrep-rules](https://semgrep.dev/blog/2021/how-we-made-semgrep-rules-run-on-semgrep-rules/). 
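As a minimal sketch of the rule message guidance given earlier, the following rule reuses the `mark_safe` message quoted above and writes it with the `>-` operator. The rule `id` and the single `pattern` are assumptions made for illustration; the published Registry rule may be structured differently.

```yaml
rules:
  # Hypothetical rule id and simplified pattern, for illustration only.
  - id: example-avoid-mark-safe
    languages:
      - python
    severity: MEDIUM
    # `>-` folds the lines below into a single string and strips the
    # trailing newline, so no quoting or backslash escapes are needed.
    message: >-
      `mark_safe()` is used to mark a string as safe for HTML output.
      This disables escaping and may expose the content to XSS attacks.
      Instead, use `django.utils.html.format_html()` to build HTML for
      rendering.
    pattern: django.utils.safestring.mark_safe(...)
```

The message follows the three-part structure described above: what the pattern is, why it is a problem, and which alternative resolves it.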
+ +### Fields required by the `security` category + +Rules in category `security` in the Semgrep Registry require specific metadata fields that ensure consistency across the ecosystem in both Semgrep AppSec Platform and Semgrep CLI. Nest these metadata under the `metadata` field. + +If your rule has a `category: security`, the following metadata are required: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Required metadata field | Values | Example use |
+|---|---|---|
+| `cwe` | A Common Weakness Enumeration (CWE) | `cwe: "CWE-502: Deserialization of Untrusted Data"` |
+| `owasp` | An OWASP Top 10 category | `owasp:`<br />`- A05:2021 - Security Misconfiguration` |
+| `confidence` | `HIGH`, `MEDIUM`, `LOW` | `confidence: MEDIUM` |
+| `likelihood` | `HIGH`, `MEDIUM`, `LOW` | `likelihood: MEDIUM` |
+| `impact` | `HIGH`, `MEDIUM`, `LOW` | `impact: HIGH` |
+| `subcategory` | `vuln`, `audit`, `secure default` | `subcategory:`<br />`- vuln` |
+| `vulnerability_class` | See [Vulnerability class](#vulnerability-class) for a list of sample values. Accepts custom values. | `vulnerability_class:`<br />`- Hard-coded Secrets` |
+ +These fields help you find rules in different categories, such as: +- High confidence security rules for CI pipelines. +- OWASP Top 10 or CWE Top 25 rulesets. +- Technology-specific rulesets. For example, `react` makes it easy to find React rulesets. +- Lower-confidence audit rules intended for code auditors. + +Examples of rules with a full list of required metadata: +- High confidence JavaScript and TypeScript rule: [`javascript.express.security.audit.express-open-redirect.express-open-redirect`](https://semgrep.dev/r/javascript.express.security.audit.express-open-redirect.express-open-redirect) +- Medium confidence Python rule: [`python.lang.security.dangerous-system-call.dangerous-system-call`](https://semgrep.dev/r/python.lang.security.dangerous-system-call.dangerous-system-call) +- Low confidence C# rule: [`csharp.lang.security.ssrf.rest-client.ssrf`](https://semgrep.dev/r/csharp.lang.security.ssrf.rest-client.ssrf) + + +**NOTE** + +Details of each field mentioned above are provided in the subsections below with examples. + + + +#### CWE + +Include the appropriate Common Weakness Enumeration (CWE). The CWE explains what vulnerability your rule is trying to find. Examples: + +If you write an SQL Injection rule, use the following: +```yaml +cwe: + - "CWE-89: Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')" +``` + +If you write an XSS rule, use the following: +```yaml +cwe: + - "CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')" +``` + +#### Confidence + +Indicate how confident you are that the rule detects true positives. See the possible options below: + +- **HIGH** - Security concern with a high rate of true positives. Useful in CI/CD pipelines. +- **MEDIUM** - Security concern, but with some false positives. Useful in CI/CD pipelines. +- **LOW** - Expect a fair number of false positives, similar to audit-style rules. 
+ +##### HIGH + +HIGH confidence rules can use Semgrep advanced features such as `metavariable-comparison` or `taint mode`, to detect true positives. See examples below: + +- [`go.lang.security.audit.crypto.use_of_weak_rsa_key.use-of-weak-rsa-key`](https://semgrep.dev/r/go.lang.security.audit.crypto.use_of_weak_rsa_key.use-of-weak-rsa-key) +- [`javascript.express.security.audit.express-open-redirect.express-open-redirect`](https://semgrep.dev/r/javascript.express.security.audit.express-open-redirect.express-open-redirect) +- [`javascript.jose.security.jwt-hardcode.hardcoded-jwt-secret`](https://semgrep.dev/r/javascript.jose.security.jwt-hardcode.hardcoded-jwt-secret) + +```yaml +confidence: HIGH +``` + +##### MEDIUM + +MEDIUM confidence rules can use Semgrep advanced features such as `metavariable-comparison` or `taint mode`, but with some false positives. See examples below: + +- [`javascript.express.security.audit.express-ssrf.express-ssrf`](https://semgrep.dev/r/javascript.express.security.audit.express-ssrf.express-ssrf) +- [`javascript.express.security.express-xml2json-xxe.express-xml2json-xxe`](https://semgrep.dev/r/javascript.express.security.express-xml2json-xxe.express-xml2json-xxe) + +```yaml +confidence: MEDIUM +``` + +##### LOW + +Low confidence rules generally find something which appears to be dangerous while reporting a lot of false positives. See examples below: + +- [`php.lang.security.eval-use.eval-use`](https://semgrep.dev/r/php.lang.security.eval-use.eval-use) +- [`javascript.browser.security.dom-based-xss.dom-based-xss`](https://semgrep.dev/r/javascript.browser.security.dom-based-xss.dom-based-xss) + +```yaml +confidence: LOW +``` + +#### Likelihood + +Specify how likely it is that an attacker can exploit the issue that has been found. The possible values are `LOW`, `MEDIUM`, `HIGH`. + +##### HIGH + +HIGH likelihood rules specify a very high concern that the vulnerability can be exploited. 
Examples: + +- The use of weak encryption: [`go.lang.security.audit.crypto.use_of_weak_rsa_key.use-of-weak-rsa-key`](https://semgrep.dev/r/go.lang.security.audit.crypto.use_of_weak_rsa_key.use-of-weak-rsa-key) +- Disabled security feature in a configuration: [`javascript.angular.security.detect-angular-sce-disabled.detect-angular-sce-disabled`](https://semgrep.dev/r/javascript.angular.security.detect-angular-sce-disabled.detect-angular-sce-disabled) +- Hardcoded secrets that use a constant value `"..."`: [`javascript.jose.security.jwt-hardcode.hardcoded-jwt-secret`](https://semgrep.dev/r/javascript.jose.security.jwt-hardcode.hardcoded-jwt-secret) +- Rules that leverage `taint mode sources` which indicate sources that can come from an attacker. Such as HTTP `POST`, `GET`, `PUT`, and `DELETE` request values. For example: [`javascript.express.security.audit.express-open-redirect.express-open-redirect`](https://semgrep.dev/r/javascript.express.security.audit.express-open-redirect.express-open-redirect) + +```yaml +likelihood: HIGH +``` + +##### MEDIUM + +MEDIUM likelihood rules detect a vulnerability in most circumstances. Although it can be hard for an attacker to exploit them. Also, these rules can detect part of a problem, but not the whole issue. 
Examples: + +- `taint mode sources` that reach a `taint mode sink`, but the source is only vulnerable under certain conditions, for example OS environment variables or data loaded from disk: [`python.aws-lambda.security.dangerous-spawn-process.dangerous-spawn-process`](https://semgrep.dev/r/python.aws-lambda.security.dangerous-spawn-process.dangerous-spawn-process) +- `taint mode sources` that reach a `taint mode sink` in a rule that is missing a `taint mode sanitizer`, which can introduce more false positives: [`javascript.express.security.express-puppeteer-injection.express-puppeteer-injection`](https://semgrep.dev/r/javascript.express.security.express-puppeteer-injection.express-puppeteer-injection) + +```yaml +likelihood: MEDIUM +``` + +##### LOW + +LOW likelihood rules tend to find something dangerous but do not evaluate whether it is truly vulnerable. For example: + +- `taint mode sources`, such as function arguments that may or may not be tainted, that reach a `taint mode sink`: [`typescript.react.security.audit.react-href-var.react-href-var`](https://semgrep.dev/r/typescript.react.security.audit.react-href-var.react-href-var) +- A rule that uses `search mode` to find the use of a dangerous function, for example `trustAsHTML`, `bypassSecurityTrust()`, `eval()`, or `innerHTML`: [`javascript.browser.security.dom-based-xss.dom-based-xss`](https://semgrep.dev/r/javascript.browser.security.dom-based-xss.dom-based-xss) + +```yaml +likelihood: LOW +``` + +#### Impact + +Indicate how much damage a vulnerability can cause. Use `LOW`, `MEDIUM`, or `HIGH`. + + +##### HIGH + +HIGH impact rules can detect extremely damaging vulnerabilities, such as injection vulnerabilities. 
Examples: + +- [`javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection`](https://semgrep.dev/r/javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection) +- [`ruby.rails.security.audit.xxe.xml-external-entities-enabled.xml-external-entities-enabled`](https://semgrep.dev/r/ruby.rails.security.audit.xxe.xml-external-entities-enabled.xml-external-entities-enabled) + +```yaml +impact: HIGH +``` + +##### MEDIUM + +MEDIUM impact rules are issues that are less likely to lead to full system compromise but still are fairly damaging. Examples: + +- [`python.flask.security.injection.raw-html-concat.raw-html-format`](https://semgrep.dev/r/python.flask.security.injection.raw-html-concat.raw-html-format) +- [`python.flask.security.injection.ssrf-requests.ssrf-requests`](https://semgrep.dev/r/python.flask.security.injection.ssrf-requests.ssrf-requests) + +```yaml +impact: MEDIUM +``` + +##### LOW + +LOW impact rules are rules that leverage a security issue, but the impact is not too damaging to the application if discovered. 
+ +- [`go.gorilla.security.audit.session-cookie-missing-secure.session-cookie-missing-secure`](https://semgrep.dev/r/go.gorilla.security.audit.session-cookie-missing-secure.session-cookie-missing-secure) +- [`javascript.browser.security.raw-html-join.raw-html-join`](https://semgrep.dev/r/javascript.browser.security.raw-html-join.raw-html-join) + +```yaml +impact: LOW +``` + +#### References + +References help provide more context to a developer on what the issue is, and how to remediate the vulnerability, see examples below: + +- A rule that is finding an issue in React: [`typescript.react.security.audit.react-href-var.react-href-var`](https://semgrep.dev/r/typescript.react.security.audit.react-href-var.react-href-var) + ```yaml + references: + - https://reactjs.org/blog/2019/08/08/react-v16.9.0.html#deprecating-javascript-urls + ``` +- A rule that is detecting an issue in Express: [`javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection`](https://semgrep.dev/r/javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection) + ```yaml + references: + - https://sequelize.org/v6/core-concepts/raw-queries/#replacements + ``` + +#### Subcategory + +Include a subcategory to explain what is the type of the rule. See the subsections below for more details. + + + +A vulnerability rule is something that developers certainly want to resolve. For example, an SQL Injection rule that uses taint mode. Example: + +- [`javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection`](https://semgrep.dev/r/javascript.sequelize.security.audit.sequelize-injection-express.express-sequelize-injection) + +```yaml +subcategory: + - vuln +``` + +##### audit + +An audit rule is useful for code auditors. For example, an SQL rule which finds all uses of the `database.exec(...)` that can be problematic. 
Example: + +- [`generic.html-templates.security.unquoted-attribute-var.unquoted-attribute-var`](https://semgrep.dev/r/generic.html-templates.security.unquoted-attribute-var.unquoted-attribute-var) + +```yaml +subcategory: + - audit +``` + +##### secure default + +A secure default rule makes use of inherently secure libraries, frameworks, configurations, or settings. These rules enforce the mitigation of common security concerns, such as preventing cross-site request forgery (CSRF) by properly verifying inbound requests in Django or Flask applications. + +A secure default rule must contain remediation that suggests applying a one-time setting that ensures security throughout the codebase without the need for repeated application by developers. For example, configuring a global security setting in a web application framework that applies to all routes and inputs. + +```yaml +subcategory: + - secure default +``` + +#### Technology + +Technology helps to define specific rulesets for languages, libraries, and frameworks that are available in [Semgrep Registry](https://semgrep.dev/explore), for example `express` will be included in the `p/express` ruleset. + +- [`javascript.express.security.audit.express-open-redirect.express-open-redirect`](https://semgrep.dev/r/javascript.express.security.audit.express-open-redirect.express-open-redirect) + +```yaml +technology: + - express +``` + +#### Vulnerability class + +The vulnerability class defines the category to which a rule and its resulting findings belong. The categories are used to group rules in Semgrep AppSec Platform's **Policies** page to help find similar rules. The category is also displayed on the **Finding Details** pages. + +You can provide custom values. 
Sample values include: + +- Active Debug Code +- Code Injection +- Command Injection +- Cookie Security +- Cross-Site Request Forgery (CSRF) +- Cross-Site-Scripting (XSS) +- Cryptographic Issues +- Dangerous Method or Function +- Denial-of-Service (DoS) +- Hard-coded Secrets +- Improper Authentication +- Improper Authorization +- Improper Encoding +- Improper Validation +- Insecure Deserialization +- Insecure Hashing Algorithm +- Insufficient Logging +- LDAP Injection +- Mass Assignment +- Memory Issues +- Mishandled Sensitive information +- Open Redirect +- Other Security +- Path Traversal +- SQL Injection +- Server-Side Request Forgery (SSRF) +- XML Injection +- XPath Injection + +## Update existing rules in Semgrep Registry + + + +Find a rule you want to update in the [semgrep-rules](https://github.com/semgrep/semgrep-rules/) repository. + + +Submit a PR to the repository with your update. + + +Follow the same instructions and recommendations given in the rest of this document. For example, the `security` category has specific metadata requirements. + + +Leave a message in the PR that explains why you are making the change and what motivates the update. + + + +See a [PR example](https://github.com/semgrep/semgrep-rules/pull/2730). + +The repository's pipeline can report messages with specific details about your rule. Ensure that your rule fulfills all of the requirements. Sometimes, however, the pipeline running in the [semgrep-rules](https://github.com/semgrep/semgrep-rules/) repository has issues of its own. In that case, wait for a Semgrep reviewer's help. 
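To tie together the metadata requirements that apply when you submit or update a `security` rule, here is a minimal sketch of a complete rule. The rule `id`, message, pattern, and the specific metadata values are assumptions chosen for illustration, not a rule taken from the Registry.

```yaml
rules:
  # Hypothetical rule: the id, message, and pattern are illustrative only.
  - id: example-eval-use
    languages:
      - javascript
    severity: MEDIUM
    message: >-
      Found a call to `eval()`. Evaluating dynamic content can allow code
      injection if the argument is influenced by user input. Avoid `eval()`
      or validate the input against an allowlist first.
    pattern: eval(...)
    metadata:
      # Required by all Semgrep Registry rules.
      category: security
      technology:
        - javascript
      references:
        - https://owasp.org/Top10/A03_2021-Injection/
      # Additionally required because `category` is `security`.
      cwe:
        - "CWE-95: Improper Neutralization of Directives in Dynamically Evaluated Code ('Eval Injection')"
      owasp:
        - A03:2021 - Injection
      confidence: LOW
      likelihood: LOW
      impact: HIGH
      subcategory:
        - audit
      vulnerability_class:
        - Code Injection
```

Because this sketch only searches for a dangerous function and does not track whether the argument is attacker-controlled, it is labeled as a LOW confidence, LOW likelihood `audit` rule, in line with the guidance in the Confidence, Likelihood, and Subcategory sections.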
+ + +[OWASP-DOM-based-XSS-prevention]: https://cheatsheetseries.owasp.org/cheatsheets/DOM_based_XSS_Prevention_Cheat_Sheet.html diff --git a/mintlify-docs/contributing/contributing.mdx b/mintlify-docs/contributing/contributing.mdx new file mode 100644 index 0000000000..1526ab11d0 --- /dev/null +++ b/mintlify-docs/contributing/contributing.mdx @@ -0,0 +1,20 @@ +--- +title: "Contributing overview" +description: "Your contributions to Semgrep Community Edition (CE) are welcome!" +--- + +To contribute, read and agree with the [Contributor Covenant Code of Conduct](https://github.com/semgrep/semgrep/blob/develop/CODE_OF_CONDUCT.md). + +Your contributions can help in various places: + +| Contribution | Where to contribute | +| :--- | :--- | +| File a Semgrep CE issue. | See the [Semgrep GitHub repository](https://github.com/semgrep/semgrep/issues/new/choose). | +| Contribute code changes. | Follow the [Contributing code](/contributing/contributing-code) document. Find a task in the [list of good first issues](https://github.com/semgrep/semgrep/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22). | +| Contribute rules to the Semgrep Registry. | Add new rules through Semgrep AppSec Platform or GitHub. See [Contributing rules](/contributing/contributing-to-semgrep-rules-repository). | +| Update the documentation you are reading right now! | Create a PR or an issue in the [Documentation repository](https://github.com/semgrep/semgrep-docs). | +| File an issue for our Semgrep Visual Studio Code extension or help us to improve it. |See the [semgrep-vscode](https://github.com/semgrep/semgrep-vscode) repository. | +| File an issue for our Semgrep IntelliJ Plugin help us to improve it. |See the [semgrep-intellij](https://github.com/semgrep/semgrep-intellij) repository. | +| Help others in the community. | Check [Semgrep Community Slack](https://go.semgrep.dev/slack). 
| + +For any contribution to Semgrep code (bug fix or fixed issue, feature), read more about development workflow and testing in the [contribution guidelines](/contributing/contributing-code). For a high-level view of Semgrep’s design principles, see the [Semgrep philosophy](/contributing/contributing-code). diff --git a/mintlify-docs/contributing/cst-to-ast-tips.mdx b/mintlify-docs/contributing/cst-to-ast-tips.mdx new file mode 100644 index 0000000000..62a98a607e --- /dev/null +++ b/mintlify-docs/contributing/cst-to-ast-tips.mdx @@ -0,0 +1,193 @@ +--- +title: "Tips for converting CST to AST" +--- + + +Once you have copied the generated `Boilerplate.ml` for your language +`foo` into `Parse_foo_tree_sitter.ml`, you can start editing it. The +goal is replace all the calls like `todo env x` by the construction +of a node of the AST. The destination AST can be a language-specific +AST or directly the generic AST. If we're mapping to a +language-specific AST, this language-specific AST needs to be created +first. The advantage of going through a language-specific AST is more +visibility into which constructs are valid for the language, compared +to the generic AST which supports many more constructs. + +Besides writing and updating the tree-sitter grammar, this step is +where the most time will be spent to integrate a language in semgrep. +This is a collection of tips to make this tedious task somewhat easier. + +## Use editor/IDE with good OCaml support + +Make sure to set up your editor with a proper ocaml mode, so that you +can see the inferred type of expressions and get the ability to jump +to the definitions. + +Popular editors include emacs, vim, vscode. They all have their own +OCaml extension or plugin which relies on merlin. + +## Editing the boilerplate + +### Study examples + +`Parse_foo_tree_sitter.ml` is copied from the generated file +[`Boilerplate.ml`](https://github.com/semgrep/semgrep-go/blob/main/lib/Boilerplate.ml). 
The `todo env x` calls are typically replaced by the +construction of a node of the AST. +See how it's done for example in [`Parse_go_tree_sitter.ml`](https://github.com/semgrep/semgrep/blob/develop/languages/go/tree-sitter/Parse_go_tree_sitter.ml). + +### Learn OCaml basics + +CST and AST type definitions make heavy use of algebraic data types to +accommodate nodes of different kinds under the same type. +Those are known as variants (e.g. `Expr e`) and +polymorphic variants in OCaml jargon (e.g. `` `Expr e``). + +Parametrized types in OCaml are like generics in languages like Java. +The OCaml type for a list of ints is denoted `int list`, which would +be denoted `List` in a Java-like language. + +Run `utop` (`opam install utop`) and go over [this tutorial about OCaml +types at ocaml.org](https://ocaml.org/learn//tutorials/data_types_and_matching.html). + +### Preserve structure, assign useful names + +Consider this example of typical generated code in `Boilerplate.ml` file: + +```ocaml +let rec anon_choice_type_id_42c0412 (env : env) (x : CST.anon_choice_type_id_42c0412) = + match x with + | `Id tok -> [ identifier env tok ] (* identifier *) + | `Nested_id x -> nested_identifier env x +``` + +The name `anon_choice_type_id_42c0412` was generated from an anonymous +node in the grammar and it's not meaningful. However, it's used in multiple +spots, which is why it has its own function definition. It occurs for example +here: +```ocaml + | `Choice_id_opt_type_args (v1, v2) -> + let v1 = anon_choice_type_id_42c0412 env v1 in + let id = concat_nested_identifier v1 in + let _v2 = +``` + +Instead, it works better to give it a meaningful name like `id_or_nested_id` as +follows: + +```ocaml +let rec id_or_nested_id (env : env) (x : CST.anon_choice_type_id_42c0412) = + match x with + | `Id tok -> [ identifier env tok ] (* identifier *) + | `Nested_id x -> nested_identifier env x +``` + +and replace it at every point of use. 
Now, the snippet where it's used +makes more sense: + +```ocaml + | `Choice_id_opt_type_args (v1, v2) -> + let v1 = id_or_nested_id env v1 in + let id = concat_nested_identifier v1 in + let _v2 = +``` + +Another helpful transformation is to assign a name to every new +value instead of `v1`, `v2`, etc. The above snippet becomes something like + +```ocaml + | `Choice_id_opt_type_args (v1, v2) -> + let ids = id_or_nested_id env v1 in + let id = concat_nested_identifier ids in + let type_args = +``` + +Note that we're still keeping the original `v1` and `v2`, because it's +not very useful to find names for them. + +Finally, it is very useful to specify the return type of the function +so as to figure out type errors a lot more easily. +```ocaml +let rec id_or_nested_id (env : env) (x : CST.anon_choice_type_id_42c0412) = +``` +becomes +```ocaml +let rec id_or_nested_id (env : env) (x : CST.anon_choice_type_id_42c0412) : ident = +``` + +Summary: + +* Replace generated function names by something meaningful. +* Replace `let v1 =` by a meaningful name. +* Specify the return type of functions that map CST to AST. +* Preserve the general structure of the generated functions. + +This all helps with updating the code when the grammar changes. + +## Compile regularly + +Compile regularly so as to perform type checking. This is general +advice for OCaml development. If you're working on a single file, you +don't need to recompile the project, though. Merlin will take care of +checking types when you save the file. + +The initial template with all the todos should compile successfully, +but of course will fail at runtime. Type errors produced by the +compiler can be tricky to understand but it's good to learn how to +interpret them. Sometimes they're just too long, though. + +## Keep the boilerplate structure intact + +Leave the original structure in place as much as possible. 
This is +important for later when we want to update the grammar and need to +compare the new boilerplate with the old/edited one. + +Add type annotations +--- + +The generated boilerplate looks like this, i.e. the return type is left +unspecified. + +```ocaml +and formal_parameter (env : env) (x : CST.formal_parameter) = + ... +``` + +The return type will be an AST node determined by the programmer. +OCaml performs full type inference, so it's not technically required +to specify a return type. +However, given the peculiar nature of this exercise, we recommend specifying +return types. It makes it easier to see expectations and get clear error +messages when the time comes to upgrade the grammar. A return type annotation +looks like this: + +```ocaml +and formal_parameter (env : env) (x : CST.formal_parameter) : AST.parameter = + ... +``` + +## Consult the original `grammar.js` + +The original `grammar.js`, or sometimes another javascript file, +contains the bulk of the original rules for the grammar. This is +usually a better reference than the generated code. + +The generated boilerplate `Boilerplate.ml` is similar to the type definitions +`CST.ml` which is our interpretation of the original +`grammar.js`. So, it is useful to consult `CST.ml` as well. + +What tends to work well is to keep 4 windows open: +* `Parse_foo_tree_sitter.ml` (2 windows) +* `grammar.js` +* `AST_generic.ml` + +The last file `AST_generic.ml` contains the type definitions of the +AST we're mapping to. + +## Extend the generic AST with moderation + +Each programming language comes with a few features that other +languages don't offer, but typically not that many. The generic AST is +designed to accommodate all the constructs for all the programming +languages. So, we sometimes have to extend the generic AST with new +kinds of nodes. We try to do so sparingly, since we try to keep the +generic AST as simple as possible and it's already very rich. 
diff --git a/mintlify-docs/contributing/semgrep-contributing.mdx b/mintlify-docs/contributing/semgrep-contributing.mdx new file mode 100644 index 0000000000..3953ef296d --- /dev/null +++ b/mintlify-docs/contributing/semgrep-contributing.mdx @@ -0,0 +1,192 @@ +--- +title: "semgrep-cli contributing" +sidebarTitle: "semgrep contributing" +--- + +This article explains how to build `semgrep-cli` so that you can make and test changes to the Python wrapper. + +The `semgrep-cli` name refers to the project that exposes the actual `semgrep` command. The README explains the relationship between `semgrep-cli` and `semgrep-core`. + +## Prerequisite + +- Python >= 3.10 installed in your local machine. +- [`pipenv`](https://github.com/pypa/pipenv) for managing your virtual +environment. + - Install it by following the `pipenv` [documentation](https://pipenv.pypa.io/en/latest/installation.html). + - Ensure that `pipenv` is on your `$PATH` before proceeding. + +## Set up the environment + +Most Python development is done inside the `cli` directory: + +```bash +cd cli +``` + +Next, initialize and enter the virtual environment. The following command installs developer dependencies, such as `pytest`, and installs `semgrep` in editable mode in the virtual environment. From the `cli` directory, run the following command: + +```bash +pipenv shell +``` + +By convention, your shell prompt is prepended with `(cli)` when the virtual environment is active. + +Next, install the Python dependencies: + +```bash +SEMGREP_SKIP_BIN=true pipenv install --dev +``` + + +**INFO** + +`SEMGREP_SKIP_BIN` tells the installer that you'll use your own `semgrep-core`; see below.* + + +Running `which semgrep` should return a path within your virtual environment. On macOS, this is likely contained within `$HOME/.local/share/virtualenvs/`. + +## Get the `semgrep-core` binary + +Almost all usages of `semgrep-cli` require the `semgrep-core` binary. 
+To get the binary, follow the instructions in [Building `semgrep-core`](/contributing/semgrep-core-contributing#build-semgrep-core). It takes approximately 20 minutes. + +### Use a precompiled binary + +You can use a precompiled binary, but note two downsides: + +- You cannot modify `semgrep-core`, for example, to fix a parse error. +- Semgrep scans fail if the interface between `semgrep-cli` and `semgrep-core` has changed since the binary was compiled. This has happened roughly every two months historically, but can happen at any time without notice. + +If you installed Semgrep using Homebrew (with `brew install semgrep`), a `semgrep-core` binary was bundled within that installation. However, it is not made available on your `$PATH` by default. + +You can add the bundled binary to your `$PATH` with this series of commands, provided you have `jq` installed: + +```bash +export SEMGREP_BREW_INSTALLED_VERSION="$(brew info --json semgrep | jq '.[0].installed[0].version' -r)" +export SEMGREP_BREW_INSTALL_PATH="$(brew --cellar semgrep)/${SEMGREP_BREW_INSTALLED_VERSION}" +export SEMGREP_BREW_PYTHON_PACKAGE_PATH="$(${SEMGREP_BREW_INSTALL_PATH}/libexec/bin/python -m pip list -v | grep '^semgrep\b' | awk '{ print $3 }')" +export SEMGREP_BREW_CORE_BINARY_PATH="${SEMGREP_BREW_PYTHON_PACKAGE_PATH}/semgrep/bin" +export PATH="${SEMGREP_BREW_CORE_BINARY_PATH}:${PATH}" +``` + +## Run `semgrep-cli` + +Ensure that you are in the `cli/` directory, and then issue the following command: + +```bash +pipenv run semgrep --help +``` + +To try a simple analysis, run: + +```bash +echo 'if 1 == 1: pass' | semgrep --lang python --pattern '$X == $X' - +``` + +You now have Semgrep running locally. + +## Install `semgrep` + +You can always run `semgrep` from `cli/`, which will use your latest changes in that directory, but you may also want to install the `semgrep` binary. 
To do this, run + +```bash +pipenv install --dev +``` + +If you encounter difficulties, reach out to the [`semgrep` team on Slack](https://go.semgrep.dev/slack). + +Now you can run `semgrep --help` from anywhere. + +If you have installed `semgrep-core` from source, there are convenient targets in the root Makefile that let you update all binaries. After you pull, run: + +```bash +make rebuild +``` + +See the Makefile in `cli/` + +## Add Python packages to `semgrep` + +Semgrep uses `mypy` to do static type-checking of its Python code. Therefore, when adding a new Python package, you also need to add typing stubs for that package. This can be done in three steps. For example, suppose you are adding the package `pyyaml` to Semgrep. + + + +Install the corresponding package with typing stubs. For this `pyyaml` example, the corresponding package is `types-pyyaml`. In the following command, `--dev` specifies that this package is needed for development but not in production. This command updates `cli/Pipfile` with the typing stubs package, and adds both the typing stubs and the package itself to your `Pipfile.lock`. This allows you to import the package in your code (for example, `import yaml as pyyaml`). + + ```bash + pipenv install --dev types-pyyaml + ``` + + +Add the typing stubs package to `.pre-commit-config.yaml` so that the pre-commit `mypy` hook can find the package. + + ```yaml + - id: mypy + additional_dependencies: &mypy-deps + - ... + - types-PyYAML + ``` + + +Add the original package to `cli/setup.py` in the `install_requires` list variable. You can find the version number either in the `Pipfile.lock` file or by looking up the most recent major version of the package online. + + ```text + install_requires = [ + ... + "pyyaml~=6.0", + ] + ``` + +This change makes your package a dependency of published Semgrep. 
Without this change, if you create a pull request, the CI job called `build docker image` fails with a `ModuleNotFoundError`, indicating it cannot find your package. + + + +## Troubleshooting + +For a reference build that's known to work, consult the root `Dockerfile` +to build Semgrep inside a container. You can check that it builds with + +```bash +docker build -t semgrep . +``` + +## Testing + +`semgrep-cli` uses [`pytest`](https://docs.pytest.org/en/latest/) for testing. + +To run tests, run the following command: + +```bash +pipenv run pytest +``` + +There are some much slower tests that run Semgrep on many open source projects. To run these slow tests, run: + +```bash +pipenv run pytest tests/qa +``` + +If you want to update the tests to match the current output: + +```bash +make regenerate-tests +``` + +If you want to run a single test file: + +```bash +pipenv run pytest path/to/test.py +``` + +Or run an individual test function: + +```bash +pipenv run pytest path/to/test.py::test_func_name +``` + +`semgrep-cli` also includes [`pytest-benchmark`](https://pytest-benchmark.readthedocs.io/en/latest/) +to allow for basic benchmarking functionality. Run the following command: + +```bash +pipenv run pytest --benchmark-only +``` diff --git a/mintlify-docs/contributing/semgrep-core-contributing.mdx b/mintlify-docs/contributing/semgrep-core-contributing.mdx new file mode 100644 index 0000000000..ae76cdcb6f --- /dev/null +++ b/mintlify-docs/contributing/semgrep-core-contributing.mdx @@ -0,0 +1,664 @@ +--- +title: "semgrep-core contributing" +--- + +The following explains how to build `semgrep-core` so you can make and test changes to the OCaml code. Once you have `semgrep-core` installed, you can refer to [semgrep-contributing](/contributing/semgrep-contributing) to see how to build and run the Semgrep application. + +## Build `semgrep-core` + +This document assumes you are building on MacOS and have already installed the Homebrew package manager. 
Installation commands and package names for different OSes may vary slightly. + +### Check out the code + +Begin by cloning the Semgrep repo from Git. Each parser's tree-sitter code is managed as a separate submodule, so pass `--recurse-submodules` to ensure they are cloned as well. + +```bash +git clone --recurse-submodules https://github.com/semgrep/semgrep +cd semgrep +``` + +If you have already cloned without submodules, you can check them out as a second separate step from the root of the repository: + +```bash +git submodule update --init --recursive +``` + +### Prerequisites + + +`semgrep-core` is written primarily in OCaml. You must [install OCaml](https://opam.ocaml.org/doc/Install.html) and its package manager OPAM, and pin the current compiler version. On MacOS, it is done through the following steps: + +```bash +brew install opam +opam init +opam switch create semgrep 5.3.0 +eval $(opam env) +``` + +Next, install some base packages required for setup and compilation. + +```bash +brew install pkg-config bash +``` + +Lastly, you will almost certainly want the Python environment for `semgrep-cli` +configured before proceeding. Please refer to the [Set up the environment](/contributing/semgrep-contributing#set-up-the-environment) documentation. + +Once you've returned here, ensure that your shell is able to enter the Python +virtual environment. + +```bash +cd cli; pipenv shell # enter the virtual environment +cd .. # from within the virtual environment, return to the repo root +``` + +### First-time installation + +The root `Makefile` contains targets that take care of building the +right things. It is commented. Please refer to it and keep it +up-to-date. 
+ +To install all necessary dependencies, run + +```bash +make setup +``` + +Next, to install `semgrep-core`, run + +```bash +make core +``` + +Finally, test the installation with + +```bash +bin/semgrep-core -help +``` + +If you would like to finish the Semgrep installation, return to the +[Python-side instructions](/contributing/semgrep-contributing). + +### Rebuild after a change + +Unless there is a significant dependency change, you won't need to run `make dev-setup` again. + +The Semgrep team has provided useful targets to help you build and link the entire semgrep project, including both `semgrep-core` and `semgrep`. You may find these helpful. + +To install the latest OCaml binaries and `semgrep` binary after pulling source code changes from Git, run: + +```bash +make rebuild +``` + +To install after you make a change locally, run + +```bash +make build # or just `make` +``` + +After making either of these targets, `semgrep` runs with all your local changes, OCaml and Python both. + +``` +Because this updates the `semgrep` binary, if you do not have your Python environment configured properly, you will encounter errors when running these commands. Follow the procedure under [Development](#development) +``` + +## Development + +In practice, it is not always convenient to use `make build` or `make rebuild`. `make rebuild` will update everything within the project; `make build` will compile and install all the binaries. You can do this yourself in a more targeted fashion. + +Below is a flow appropriate for frequent developers of `semgrep-core` + +After you pull, run + +```bash +git submodule update --recursive +``` + +This will update internal dependencies. (We suggest aliasing it to `uu`) + +After `tree-sitter` is updated, you may need to reconfigure it. 
If so, run + +```bash +make config +``` + +### Develop `semgrep-core` + +If you are developing `semgrep-core`, use the `Makefile` in the repository root for the `core` and `core-test` targets; the code is primarily in `src/`. + +The following assumes you are in the repository root. + +After you pull or make a change, compile using + +```bash +make +``` + +This will build an executable for `semgrep-core` in `_build/default/src/main/Main.exe` (we suggest aliasing this to `sc`). Try it out by running + +```bash +_build/default/src/main/Main.exe -help +``` + +When you are done, test your changes with + +```bash +make core-test +``` + +Finally, to update the `semgrep-core` binary used by `semgrep`, run + +```bash +make copy-core-for-cli +``` + +### Test `semgrep-core` + +`make test` in the repository root directory will run tests that check code is correctly parsed +and patterns perform as expected. To add a test in an appropriate language subdirectory, `tests/patterns/[LANG]`, create a target file (with the file extension expected for the language) and a .sgrep file with a pattern. The test suite checks that every location marked with an `ERROR` comment is matched by the pattern in the .sgrep file. See existing tests for more clarity. + +If you are diagnosing test failures, it is time-consuming to re-run the entire test suite. +`make retest` will only re-run tests that failed. + +### Development environment + +OCaml installations include a language server that most modern editors like +Neovim and Emacs support out of the box. + +You can also use Visual Studio Code \(vscode\) to edit the code of Semgrep. The [reason-vscode](https://marketplace.visualstudio.com/items?itemName=jaredly.reason-vscode) Marketplace extension adds support for OCaml/Reason. + +The [OCaml and Reason IDE extension](https://github.com/reasonml-editor/vscode-reasonml) by @freebroccolo is another valid extension, but it seems not as actively maintained as reason-vscode. 
+ +The source of Semgrep contains also a .vscode/ directory at its root containing a task file to automatically build Semgrep from vscode. + +Note that dune and ocamlmerlin must be in your PATH for vscode to correctly build and provide cross-reference on the code. In case of problems, do: + +```bash +cd /path/to/semgrep +eval $(opam env) +dune --version # just checking dune is in your PATH +ocamlmerlin -version # just checking ocamlmerlin is in your PATH +code . +``` + +## Test Semgrep performance + +### Explore results from a slow run of Semgrep + +#### Interpret the result object + +For full timing information, run Semgrep with `--time` and `--json` flags. In addition, you can add `time` at the beginning of the command to get the true wall time. The `--json` argument produces a large amount of output, so redirecting the output to a file with `-o` is recommended. + +See the following example for the full command: + +```bash +time semgrep --config=auto --time --json -o result.json PATH/TO/SRC +``` + +Substitute the optional placeholder `PATH/TO/SRC` with the path to your source code. + +Here is an example result object. + +```json expandable + { "results": [], + "paths": {}, + "errors": [], + "time": { + "max_memory_bytes": 48693248, + "profiling_times": { + "config_time": 0.0624239444732666, + "core_time": 0.11341428756713867, + "ignores_time": 0.00017690658569335938, + "total_time": 0.17628788948059082 + }, + "rules": [ + { + "id": "test-rule" + } + ], + "rules_parse_time": 0.0013418197631835938, + "targets": [ + { + "match_times": [ + 5.9604644775390625e-06 + ], + "num_bytes": 340, + "parse_times": [ + 0.0071868896484375 + ], + "path": "test_functions.java", + "run_time": 0.011521100997924805 + } + ], + "total_bytes": 340 + } +} +``` + +All the information about timing is contained under `time`. + +The first section is `profiling_times`. 
This contains wall time durations of various relevant steps: +* Getting the rule config files (`config_time`) +* Running the main engine (`core_time`) +* Processing the ignores (`ignores_time`) + +The `total_time` field represents the sum of these steps. + +The remaining fields report engine performance. Together, `rules_parse_time` and `targets` should capture all the time spent running `semgrep-core`. + +`rules_parse_time` is straightforward. It records the time spent parsing the rules file. + +`targets` poses more difficulty. Since files are run in parallel, the amount of time spent parsing (`parse_times`) and matching (`match_times`) cannot be meaningfully compared against `total_time` or `core_time`. Therefore, the total run time (`run_time`) of each target is measured within the parallel run. This helps contextualize the time spent parsing and matching each target. The sum of the run times thus can (and usually should) be longer than the total time. + +The lists `match_times` and `parse_times` are in the same order as `rules`. That is, the match time of rule `rules[0]` is `match_times[0]`. + +Note that `parse_times` is given for each rule, but a file should only be parsed once (the first number). Afterwards, the parse time represents the time spent retrieving the file's AST from the cache. + +#### Negative values in the metrics + +When a time is not measured, by default it has the value -1. It is common to have a normal run time but -1 for the parse time or match time; this indicates an error in parsing. + +#### Tips for exploring Semgrep results + +There are several scripts already written to analyze and summarize these timing data. Find them in [`scripts/processing-output`](https://github.com/semgrep/semgrep/tree/develop/scripts/processing-output). 
If you have a timing file, you can run + +```bash +python read_timing.py [your_timing_file] +``` + +You may need to adjust the line `result_times = results` based on whether you have a timing file or the full results (in which case this should be `result_times = results["time"]`) + +### Profile code + +You can pass the -profile command-line argument to semgrep-core to get +a short profile of the code. For example, running: + +```bash +cd semgrep-core +./bin/semgrep-core -profile -e foo tests/python +``` +will output: + +```bash +--------------------- +profiling result +--------------------- +Main total : 1.975 sec 1 count +Parse_python.parse : 0.828 sec 1 count +... +``` + +You can also instead set the environment variable SEMGREP_CORE_PROFILE to 1 to get the same information: + +```bash +cd semgrep-core +export SEMGREP_CORE_PROFILE=1 +./bin/semgrep-core -e foo tests/python +``` +will output: +```bash +--------------------- +profiling result +--------------------- +Main total : 1.975 sec 1 count +Parse_python.parse : 0.828 sec 1 count +... +``` + +This is especially useful when you don't call directly semgrep-core, but +instead use the python wrapper semgrep. + +Note that since semgrep 0.82, you can pass the `--dump-command-for-core` (or the shorter `-d`) to `semgrep` to get the command the python wrapper will use to call semgrep-core (this is an hidden option, which is why you will not see it in `semgrep --help`). For example: + +```bash +semgrep --dump-command-for-core --config bench/zulip/input/rules/zulip/rules.zulip.semgrep.yml.yaml bench/zulip/input/zulip/ +``` +will output: +```bash +Running 10 rules... +/home/pad/github/semgrep/cli/src/semgrep/bin/semgrep-core -json -rules semgrep_rules.yaml -j 20 -targets semgrep_targets.txt -timeout 30 -timeout_threshold 0 -max_memory 0 -json_time -fast +``` + +where `semgrep_rules.yaml` and `semgrep_targets.txt` are files created by `semgrep` that respectively contain the list of rules and targets. 
It is easy then to copy-paste this command and possibly add a `-profile` or `-debug` to get more information. + + +You can also use the SEMGREP_CORE_DEBUG environment variable to add debugging +information, for example: + +```bash +export SEMGREP_CORE_DEBUG=1 +export SEMGREP_CORE_PROFILE=1 +pipenv run semgrep -f ../semgrep-core/tests/PERF/ajin.yaml ../semgrep-core/tests/PERF/three.js +``` +will output: +```bash +Debug mode On +Executed as: semgrep-core -lang javascript -rules_file /tmp/tmpy5pzp3p_ -j 8 ../semgrep-core/tests/PERF/three.js +Profile mode On +disabling -j when in profiling mode +PARSING: ../semgrep-core/tests/PERF/three.js +saving rules file for debugging in: /tmp/semgrep_core_rule-97ae74.yaml +--------------------- +profiling result +--------------------- +Main total : 1.975 sec 1 count +Parse_js.parse : 0.828 sec 1 count +Semgrep.check : 0.791 sec 1 count +Semgrep.match_sts_sts : 0.559 sec 185064 count +... +``` + +### Benchmark code + +We have two sets of benchmarks, one on a suite of real repositories against real rulesets (real benchmarks), another that highlights specific slow (rule, file) pairs (micro benchmarks). + +To run the micro benchmarks, go to `perf/perf-matching/`, and run `./run-perf-suite`. + +To run the real benchmarks, go to `perf`, and run `./run-benchmarks`. See the perf [readme](https://github.com/semgrep/semgrep/blob/develop/perf/README.md) for more details on how these are set up. + +There are a number of flags (`./run-benchmarks --help` to see them) which may be helpful if you are using the benchmarks for local development. For example, `./run-benchmarks --plot_benchmarks` will output a graph of the benchmark results at the end. + +If you are concerned about performance, the recommended way to test is to hide your change behind a flag and add that flag to run-benchmarks. Add a flag in `src/configuring/Flag_semgrep.ml`. These are ref cells, so you can check whether the flag is enabled or not via `!Flag_semgrep.your_flag`. 
In `src/core_cli/Core_CLI.ml`, go to options, and add a flag that sets the appropriate `Flag_semgrep`. Then, in `perf/run-benchmarks`, go to the `SemgrepVariants` list, and add your variant. + +You can also test the impact of your change by running `./run-benchmarks --std_only` in `perf`, which will only run the default version of semgrep. + + +In these next sections we will give an overview of `semgrep-core` and then some tips for making common changes to `semgrep-core`. These are only tips; without seeing an error, we cannot know its cause and proper resolution, but hopefully they give useful direction. + +## Cheatsheet + +The following assumes you are in the root of the repository. + +Compilation: + +* To compile: `make` +* To run the test suite: `make test` +* To install the `semgrep-core` binary: `make install` +* The `semgrep-core` executable produced by `make`: `_build/default/src/main/Main.exe` (alias `sc`) + +Running (examples in Python): + +* To match a rule file against a target: `sc -rules [your-rule].yaml [your-target].py -lang python` +* To match a pattern against a target: `sc -f [your-pattern].sgrep [your-target].py -lang python` +* To dump a pattern AST: `sc -dump_pattern [your-pattern].sgrep -lang python` +* To dump a target AST: `sc -dump_ast [your-target].py -lang python` +* To dump a pattern Python AST: `pf -dump_python [your_pattern].sgrep -sgrep_mode -lang python` +* To dump a target Python AST: `pf -dump_python [your_target].py -lang python` + +Debugging: +* To get the semgrep-core command the python wrapper will use: `semgrep --dump-command-for-core --config [your-config] [your-target-directory]` + +Try it out: `sc -f tests/python/dots_stmts.sgrep tests/python/dots_stmts.py -lang python` + +## `semgrep-core` overview + +### Entry point + +The entry point to `semgrep-core` is `Core_CLI.ml`, in `src/core_cli/`. This is where you add command-line arguments. 
It calls functions depending on the mode in which `semgrep-core` was invoked (`-config` for a yaml file, `-f` for a single pattern, etc.) + +When invoked by `semgrep`, `semgrep-core` is called by default with `-config`. This corresponds to the function `semgrep_with_rules_file`, which in turn calls `semgrep_with_rules`. These functions will parse and then match the rule and targets. + +### Parsing + +`semgrep-core` uses external modules to parse code into augmented language-specific abstract syntax trees (ASTs). Though we call these ASTs, they additionally contain token information such as parentheses that are traditionally only present in concrete syntax trees (CSTs) so that we can output results in the correct range. + +When `semgrep-core` receives a rule or a target, it will first need to parse it. The functions that do this are located in `src/parsing/`. + +* If it reads a rule, it will go through `Parse_rule.ml`, which uses `Parse_pattern.ml` to parse the code-like portions of the rule +* If it reads a target, it will go through `Parse_target.ml` + +Depending on the language, `Parse_pattern.ml` and `Parse_target.ml` will invoke parsers to parse the code. For example, if we have Java code, it will first be parsed into a Java-specific AST. + +### Converting to the generic AST + +`semgrep-core` does not match based on the Java AST. It has a generic AST, defined in `AST_generic.ml` (in `libs/ast_generic/`), which all language-specific ASTs are converted to. + +The functions for this conversion are in either `languages/[LANG]/generic/`. They are named with the appropriate language in a consistent convention. + +### Matching + +The matching functions are contained in `src/engine/` (e.g. `Match_rules.ml`, `Match_patterns.ml`) and `src/matching/` (e.g. `Generic_vs_generic.ml`). 
There are several possible matchers to invoke + +* spacegrep (for generic mode) +* regexp (to match by regexp instead of semgrep patterns) +* comby (an experimental mode for languages we don't yet support) +* pattern (the main mode) + +We will only talk about the last for now. In most cases, `Match_rules.ml` will invoke the `check` function in `Match_patterns.ml`. This will visit the target AST and try to match the pattern to it at each point. If the pattern and the target node correspond, it will call the relevant function in `Generic_vs_generic.ml`. + +The core of the matching is done by `Generic_vs_generic.ml`. The logic for whether two expressions, statements, etc. match is contained within this file. + +### Report results + +The results of the match will be returned to the calling function in `Main.ml` (for example, `semgrep_with_rules`). From there, the results are formatted and outputted. + +There are two modes for outputting: JSON and text. JSON output is processed by functions in `JSON_report.ml` in `semgrep-core/src/reporting/` + +## Fix a parse error + +Before you start fixing a parse error, you need to know what parser was used. This bears some explanation. + +### Guide to parsers + +The parsers used by semgrep fall into these categories: +* legacy parsers (pfff): implemented directly in OCaml via a parser generator +* tree-sitter parsers: third-party parsers implemented as + [tree-sitter](https://tree-sitter.github.io/) grammars +* generic parser (spacegrep): fallback for unsupported languages, + comes with its own matching engine + +For each language, we need a parser for target files and a parser for +semgrep patterns. For a given language, ideally both would use the same +parser. For historical reasons, some languages use a legacy +parser for patterns and a tree-sitter parser for target code. 
+Here's the breakdown by language as of February 2021: + +* legacy parser for both pattern and target: + - OCaml + - PHP + - Python +* legacy parser for pattern, tree-sitter parser for target: + - C + - Go + - Java + - JavaScript, JSX, JSON + - Ruby + - TypeScript, TSX +* tree-sitter parser for both pattern and target: + - C# + - Kotlin + - Lua + - R + - Rust + +### Fix a `pfff` parse error + +#### Parse with `pfff` + +[`pfff`](https://github.com/semgrep/pfff) is an OCaml project that we plug into `semgrep-core` as a git submodule. It uses menhir to generate parsers from a defined grammar. + +Consider a Python pattern (or target). To parse it into a generic AST form, we transform the code as follows: + +Text -- (via `Lexer_python.mll`) --> Tokens -- (via `Parser_python.mly`) --> `Ast_python` -- (via `Python_to_generic.ml`) --> `AST_generic` + +These files live in different places. Specifically, + +* `Lexer_python.mll` is in `semgrep-core/src/pfff/lang_python/parsing` +* `Parser_python.mly` is in `semgrep-core/src/pfff/lang_python/parsing` +* `AST_python.ml` is in `semgrep-core/src/pfff/lang_python/parsing` +* `Python_to_generic.ml` is in `semgrep-core/src/parsing/pfff` +* `AST_generic.ml` is in `semgrep-core/src/core/ast` + +You will notice that the first three, `Lexer_python.mll`, `Parser_python.mly`, and `AST_python.ml` are in `semgrep-core/src/pfff/`, which is a submodule. This means that when you modify them, you modify the submodule rather than `semgrep-core`. You can develop as usual---`pfff` is compiled when you run `make` in `semgrep-core/`---but will need to go through an extra step to make a pull request (explained later). + +When a language is particularly complicated, it can be convenient to first parse into a CST, then convert to the AST. Currently, we only do this for PHP. 
In this case, there is an extra step: + +Tokens -- (via `Parser_php.mly`) --> `Cst_php` -- (via `Ast_php_build.ml`) --> `Ast_php` + +The lexers and parsers apply for both patterns and targets of a given language. To avoid parsing invalid targets, we have a function `Flag_semgrep.sgrep_guard` which fails when parsing constructs that only appear in patterns if a target is being parsed. + +#### Identify the error + +The source of the error can be anywhere along the Text --> `AST_generic` path, so you will want to identify which file is causing it. + +First, create a minimum failing case. If you are debugging a rule, isolate this to an individual pattern if possible, saved in a `.sgrep` file. + +For simplicity, we will use Python in the examples, but you can substitute Python for any language parsed with `pfff`. + +If the problem is in `Lexer_python.mll`, you will probably get a helpful error message which should tell you what you need to change. + +If the problem is in `Parser_python.mly`, you will probably not get a helpful error message, because the error will be reported in the generated parser, not the grammar. To identify which production within the grammar is problematic, you will want to see what AST the parser is trying to produce. Modify the failing case minimally until it parses successfully. + +Now, you need to see what generic AST is produced by this similar code. You can actually do this in the playground, by going to Tools -> Dump AST. On the command line, you can run + +* For a pattern: + ```bash + sc -dump_pattern -f [your_pattern].sgrep -lang python + ``` +* For a target: + ```bash + sc -dump_ast [your_target].py -lang python + ``` + +where `sc` is an alias for `semgrep-core/_build/default/src/cli/Main.exe`, the executable produced by running `make` in `semgrep-core/`. If you have installed `semgrep-core`, you can instead use `semgrep-core` here, but each time you make a change you will need to compile (`make`) and then install (`make install`). 
+ +By default, tokens are not shown in full in the dumped AST. Their presence is indicated by `()`. + +You may also find it useful to see the Python AST representation of the pattern. Just as `make` produces an executable for `semgrep-core` in `semgrep-core/_build/default/src/cli/Main.exe`, it also produces one for `pfff` in `semgrep-core/_build/default/src/pfff/cli/Main.exe` (alias to `pf` for these docs). + +To dump the Python AST, run + +* For a pattern: + ```bash + pf -dump_python [your_pattern].sgrep -sgrep_mode -lang python + ``` +* For a target: + ```bash + pf -dump_python [your_target].py -lang python + ``` + +(Note that `-sgrep_mode` does not always work with incomplete programs. You may need to wrap your pattern so that it is a valid program for that language, except for semgrep constructs such as `...`) + +#### Fix the error + +At this point, the relevant change you need to make will vary depending on your goal. It may be as simple as adding `...` as a possible case. It may require you to introduce a new construct and add it to `AST_generic` and `Ast_python`. As a rule of thumb, prefer to avoid changing `AST_generic` if possible. This will also make your life easier! + +If you add a pattern-specific feature, remember to use `Flag_semgrep.sgrep_guard` so that an invalid target does not parse successfully. + +When you change the grammar, it is important that you do not introduce conflicts. Check the conflicts before you start by forcing dune to compile the grammar. (You can either use `make clean` and read through the output or make a change in `Parser_python.mly`, run `make`, then remove the change and run `make` again.) Then, after you change the grammar, see if there are any more conflicts than there were before your change. + +It can sometimes be okay to introduce a `shift/reduce` conflict, though avoid doing this if possible. It is never okay to introduce a `reduce/reduce` conflict. 
To understand why, read about [LR(1) parsers](https://en.wikipedia.org/wiki/Canonical_LR_parser). + +If you do introduce a conflict, you can figure out how to resolve it by running + +```bash +menhir --explain Parser_python.mly +``` + +This will produce the file `Parser_python.conflicts` in the same folder as `Parser_python.mly`, which will show the two possible interpretations Menhir is considering for each conflict. + +Unfortunately, it will also produce `Parser_python.ml` and `Parser_python.mli`, which will confuse dune when it tries to build. Remove these files before you run `make` again. + +#### Commit the fix + +Once you have made your desired pattern or target parse, you need to make sure it doesn't break anything else. In `semgrep-core/`, run `make test`. If at the end it says `Ok`, you can commit your fix! + +First, if you have any changes in `pfff`, go into the `semgrep-core/src/pfff/` directory, checkout `develop`, pull, and then make a pull request as usual with your changes. This will make a PR to [`pfff`](https://github.com/semgrep/pfff). + +When you change files in `pfff`, `semgrep-core` will realize that `pfff` is different (though not which file within `pfff`). If you go back up to `semgrep-core/` and run `git status`, you will see `modified: src/pfff (modified content)`. To pin your latest `pfff` changes to `semgrep-core`, add `src/pfff`. + +Now, make the rest of your pull request for `semgrep-core` as usual. + +If you haven't changed `pfff`, don't worry about this. Just make a pull request with your changes. + +Remember to add test cases so that future changes don't break your example! See [Test `semgrep-core`](#test-semgrep-core) + +### Fix a Tree-sitter parse error + +There is more information in [Add Support for a Language](#add-support-for-a-language) on tree-sitter which will be helpful. Also, see `semgrep-core/src/parsing/tree-sitter/`. 
+ +## Fix a match error + +The first thing you will need to do is understand what you expected and why you aren't getting that. If possible, reduce your rule to a single pattern that doesn't match. You may need to experiment with the clauses in your rule. For example, if you are getting too many matches, it may be because the pattern in `pattern-not` doesn't match what you expect. + +If you are unable to do so, you may need to investigate `Match_rules.ml`. + +Otherwise, produce a minimal failing pattern/target pair. You will need to compare the ASTs to see which portion is not matching as you expect. Run + +```bash +sc -dump_ast [your_target].py -lang py +``` + +and then + +```bash +sc -dump_pattern [your_pattern].sgrep -lang py +``` + +It can be hard to figure out where in the AST you are looking. You can make it easier by using a distinctive variable name in the section you're interested in. + +Once you've isolated the parts that aren't matching, try to figure out where they're different, taking into account special features like metavariables and ellipses. It is unlikely (though not impossible) that the problem would ever be that two identical code segments aren't matching or that there is some AST element that ellipses refuse to match. You might find it helpful to write out the AST parts you want to match on a whiteboard, indicating which part is matched by a special feature. Pare down the code as much as possible and try changing the bit you're interested in. + +When you are sure you know what ought to have happened, make it happen. If two pieces of code should match but don't, change `Generic_vs_generic.ml` so that the pattern matches the target. + +Oftentimes, a matching error is actually a parsing error. You may want to change how `Parser_python.mly` reduces the construct or how it gets converted in `Python_to_generic.ml`. Refer to [Fix a Parse Error](#fix-a-parse-error) for advice. 
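
The tip above about using a distinctive variable name can be scripted once you have saved a dump to a string or file. The following is a minimal sketch, not part of the Semgrep tooling: `find_in_dump` is a hypothetical helper, and `SAMPLE_DUMP` is a made-up excerpt that only imitates the shape of an AST dump; it is not real `semgrep-core` output.

```python
def find_in_dump(dump_text: str, marker: str, context: int = 2) -> list[str]:
    """Return numbered lines of an AST dump around each occurrence of `marker`.

    `marker` is the distinctive identifier you planted in the pattern or
    target; `context` is how many lines to keep before and after each hit.
    """
    lines = dump_text.splitlines()
    keep: set[int] = set()
    for i, line in enumerate(lines):
        if marker in line:
            lo = max(0, i - context)
            hi = min(len(lines), i + context + 1)
            keep.update(range(lo, hi))
    # Prefix each kept line with its 1-based line number in the dump.
    return [f"{i + 1}: {lines[i]}" for i in sorted(keep)]


# Made-up dump text for illustration only (hypothetical node names).
SAMPLE_DUMP = "\n".join(
    [
        "Program(",
        "  Assign(",
        '    Id("zz_marker"),',
        "    Int(1)),",
        "  Return(",
        '    Id("other")))',
    ]
)

if __name__ == "__main__":
    for hit in find_in_dump(SAMPLE_DUMP, "zz_marker", context=1):
        print(hit)
```

Running the helper on both the pattern dump and the target dump with the same marker gives two small excerpts that are easier to compare side by side than the full dumps.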
+ +At the end, confirm the match with + +```bash +sc -f [your_pattern].sgrep [your_target].py -lang py +``` + +## Fix a Rule-defined fix error + +Rule-defined fix runs through both `semgrep-core` and `semgrep`, but the most common error you are likely to encounter is an incorrect range. This happens because `semgrep-core` determines the range of a match based on the locations of the tokens stored in the AST. When the range is incorrect, that usually means a token is missing. You can see token location information with + +```bash +sc -full_token_info -dump_ast [your_target].py -lang py +``` + +See [Fix a Parse Error](#fix-a-parse-error) for more on parsing. + +## Debugging resources + +In the process of debugging, you will probably want to print things. We provide a function `pr2` in `Common.ml` (in `semgrep-core/src/pfff/commons/`) to print strings. You can also use the `Printf` module. + +If you would like to print an AST element, you can use a `show` function. For example, to print a node of type `any` in `AST_generic`, you can use + +`pr2 (show_any your_node)` + +Any type that includes `[@@deriving show]` in its definition can be converted to a string in this way. + +We also provide some flags that are useful. If you run with `-debug`, you can see the steps `semgrep-core` is taking. You can see more information (and change what you want to see) using `-log_config_file`, which takes a file. You can use one of `semgrep-core/log_config.json.ex1` or `semgrep-core/log_config.json.ex2` to start. + +Additionally, the [OCaml debugger](https://ocaml.org/manual/debugger.html) is a great resource. + +## Add support for a language + +There are some cases where we have chosen to implement a new parser in `pfff`, but in general new languages should use tree-sitter. + +### Tree-Sitter parsers + +Tree-sitter parsers exist as individual public projects. They are +shared with other users of tree-sitter outside of semgrep.
Our +[ocaml-tree-sitter](https://github.com/semgrep/ocaml-tree-sitter-semgrep) +project adds the necessary extensions for supporting semgrep patterns +(ellipsis `...` and such). It also contains the machinery for turning +a tree-sitter grammar into a usable, typed concrete syntax tree (CST). + +For example, for the Kotlin language we have: +* input: [tree-sitter-kotlin](https://github.com/fwcd/tree-sitter-kotlin) +* output: [semgrep-kotlin](https://github.com/semgrep/semgrep-kotlin) + +Assuming the tree-sitter grammar works well enough, most of the work +consists in mapping the CST to the generic abstract syntax tree (AST) +shared by all languages in semgrep. + +These guides go over the integration work in more details: + + + + + + \ No newline at end of file diff --git a/mintlify-docs/contributing/semgrep-philosophy-1.mdx b/mintlify-docs/contributing/semgrep-philosophy-1.mdx new file mode 100644 index 0000000000..9773958e4c --- /dev/null +++ b/mintlify-docs/contributing/semgrep-philosophy-1.mdx @@ -0,0 +1,8 @@ +--- +title: "Semgrep Community Edition (CE) philosophy" +sidebarTitle: "Semgrep CE philosophy" +--- + +import SemgrepCEPhilosophy from "/snippets/contributing/semgrep-philosophy.mdx" + + \ No newline at end of file diff --git a/mintlify-docs/contributing/semgrep-philosophy-2.mdx b/mintlify-docs/contributing/semgrep-philosophy-2.mdx new file mode 100644 index 0000000000..163cb17c65 --- /dev/null +++ b/mintlify-docs/contributing/semgrep-philosophy-2.mdx @@ -0,0 +1,7 @@ +--- +title: "Semgrep Community Edition (CE) philosophy" +--- + +import SemgrepCEPhilosophy from "/snippets/contributing/semgrep-philosophy.mdx" + + \ No newline at end of file diff --git a/mintlify-docs/contributing/semgrep-philosophy.mdx b/mintlify-docs/contributing/semgrep-philosophy.mdx new file mode 100644 index 0000000000..9773958e4c --- /dev/null +++ b/mintlify-docs/contributing/semgrep-philosophy.mdx @@ -0,0 +1,8 @@ +--- +title: "Semgrep Community Edition (CE) philosophy" +sidebarTitle: 
"Semgrep CE philosophy" +--- + +import SemgrepCEPhilosophy from "/snippets/contributing/semgrep-philosophy.mdx" + + \ No newline at end of file diff --git a/mintlify-docs/contributing/troubleshooting.mdx b/mintlify-docs/contributing/troubleshooting.mdx new file mode 100644 index 0000000000..a3d42857c6 --- /dev/null +++ b/mintlify-docs/contributing/troubleshooting.mdx @@ -0,0 +1,28 @@ +--- +title: "Troubleshooting" +--- + +## Make errors + + + +There are probably changes to submodules that you don't have. Run `git submodule update --recursive`. + + +## Pre-commit + + + +Make sure to follow the [Development Workflow](/contributing/contributing-code/#development-workflow) so that pre-commit will run on commit + + + +Sometimes changes you make will cause pre-commit errors in code you haven't touched--for example, if you change a function's return type. However, if you're absolutely sure you didn't cause this, you can run `git commit --no-verify` to commit without running `pre-commit`. + + + +## Exotic + + +Run `pip3 show semgrep` to find the location semgrep was installed in. `semgrep-core` will be in that path/semgrep/bin/semgrep-core + \ No newline at end of file diff --git a/mintlify-docs/contributing/updating-a-grammar.mdx b/mintlify-docs/contributing/updating-a-grammar.mdx new file mode 100644 index 0000000000..ccf6c9bd70 --- /dev/null +++ b/mintlify-docs/contributing/updating-a-grammar.mdx @@ -0,0 +1,207 @@ +--- +title: "How to upgrade the grammar for a language" +--- + +Like for adding a language, most of these instructions happen in +[ocaml-tree-sitter-semgrep](https://github.com/semgrep/ocaml-tree-sitter-semgrep). + +Let's assume we are upgrading the grammar for the programming language `$PL`. +(Consider adding an environment variable to your shell to make copying some of the commands below easier). + +## Summary (ocaml-tree-sitter) + +In ocaml-tree-sitter: + + +Update submodule `tree-sitter-$PL`. + + +From `lang/`, run `./test-lang $PL`. 
+ + +From `lang/`, ask a Semgrep team developer to run `./release $PL`. + + +In semgrep: + + +In the semgrep repo, update submodule `semgrep-$PL`. + + +In the semgrep repo, update the OCaml code that maps the CST to the generic AST. + + +In the end, **make sure the generated code used by the main branch of +semgrep can be regenerated** from the main branch of ocaml-tree-sitter: + + +Merge your semgrep branch. + + +Merge your ocaml-tree-sitter branch. + + + +## Components + +Here are the main components: + +* the OCaml code generator + [ocaml-tree-sitter](https://github.com/semgrep/ocaml-tree-sitter-semgrep): + generates OCaml parsing code from tree-sitter grammars extended + with `...` and such. Publishes code into the git repos of the + form `semgrep-$PL`. +* the original tree-sitter grammar `tree-sitter-$PL` e.g., + [tree-sitter-ruby](https://github.com/tree-sitter/tree-sitter-ruby): + the original tree-sitter grammar for the language. + This is the git submodule `lang/semgrep-grammars/src/tree-sitter-$PL` + in ocaml-tree-sitter. It is installed at the project's root + in `node_modules` by invoking `npm install`. +* syntax extensions to support semgrep patterns, such as ellipses + (`...`) and metavariables (`$FOO`). + This is `lang/semgrep-grammars/src/semgrep-$PL`. It can be tested from + that folder with `make && make test`. +* an automatically-modified grammar for language `$PL` in `lang/$PL`. + It is modified so as to accommodate various requirements of the + ocaml-tree-sitter code generator. `lang/$PL/src` and + `lang/$PL/ocaml-src` contain the C/C++/OCaml code that will be published + into `semgrep-$PL` e.g. + [semgrep-ruby](https://github.com/semgrep/semgrep-ruby) + and used by semgrep. +* [semgrep-$PL](https://github.com/semgrep/semgrep-ruby): + provides generated OCaml/C parsers as a dune project. Is a submodule + of semgrep. +* [semgrep](https://github.com/semgrep/semgrep): uses the parsers + provided by `semgrep-$PL`, which produce a CST.
The + program's CST or pattern's CST is further transformed into an AST + suitable for pattern matching. + +Make sure the above is clear in your mind before proceeding further. +If you have questions, the best way is to reach out on the [Semgrep +Community Slack channel](https://go.semgrep.dev/slack). + +## Before upgrading + +Make sure the `grammar.js` file or equivalent source files +defining the grammar are included in the `fyi.list` file in +`ocaml-tree-sitter/lang/$PL`. + +Why: It is important for tracking and _understanding_ the changes made at the +source. + +How: See [How to add support for a new language](/contributing/adding-a-language). + +## Upgrade the tree-sitter-$PL submodule + +Say you want to upgrade (or downgrade) `tree-sitter-$PL` from some old +commit to commit `602f12b`. This follows the standard git submodule workflow, with +no special steps. The commands might be something like this: + +```bash +git submodule update --init --recursive --depth 1 +git checkout -b upgrade-$PL +cd lang/semgrep-grammars/src/tree-sitter-$PL +git fetch origin --unshallow +git checkout 602f12b +cd .. +``` + +## Testing + +First, build and install ocaml-tree-sitter normally, based on the +instructions found in the [main README](https://github.com/semgrep/ocaml-tree-sitter-semgrep/blob/main/README.md). + +```bash +./configure +make setup +make +make install +``` + +Then, build support for your language in `lang/`. The following +commands will build and test the language: + +```bash +cd lang + ./test-lang $PL +``` + + +**CAUTION** + +Check the generated code for the presence of `Blank` nodes. Those +correspond to [missing tokens](https://github.com/tree-sitter/tree-sitter/issues/1151). + + + +Check with: +```bash +grep Blank lang/$PL/ocaml-src/lib/CST.ml +``` +If anything comes up, you must modify the grammar so as to create +a named rule for the node of the `Blank` kind. Eventually, the generated +`CST.ml` should not have `Blank` nodes anymore but a token type instead.
+Where a `Blank` node exists, we won't be able to get a token or its location +at parsing time. + +If this works, we're all set. Commit the new commit for the +`tree-sitter-$PL` submodule: +```bash +git status +git commit semgrep-languages/semgrep-$PL +git push origin upgrade-$PL +``` + +Then make a pull request to merge this into ocaml-tree-sitter's +main branch. It's ok to merge at this point, even if the generated code +hasn't been exported (**Publishing** section below) or if you haven't +done the necessary changes in semgrep (**Semgrep integration** below). + +We can now consider publishing the code to `semgrep-$PL`. + +## Publishing + +_Please [ask someone at Semgrep, Inc. to run this step](https://github.com/semgrep/ocaml-tree-sitter-semgrep/blob/main/doc/release.md)._ + +From the `lang` folder of ocaml-tree-sitter, we'll perform the +release. This step redoes some of the work that was done earlier and +checks that everything is clean before committing and pushing the +changes to semgrep-$PL. + +```bash +cd lang + ./release --dry-run $PL # dry-run release + ... # 'git status' will show changes for language $PL + ./release $PL # commits and pushes to semgrep-$PL +``` + +This step is safe. Semgrep at this point is unaffected by those +changes. There is now a new commit at +`https://github.com/semgrep/semgrep-$PL` e.g. +https://github.com/semgrep/semgrep-javascript. +The [`fyi/` folder](https://github.com/semgrep/semgrep-javascript/tree/main/fyi) +contains original files from which the code was generated. +[`fyi/versions`](https://github.com/semgrep/semgrep-javascript/blob/main/fyi/versions) +shows the last change for each file, allowing you to check that you +got the correct version of `grammar.js` or some other source file. + +## Semgrep integration + +From the semgrep repository, point the submodule for `semgrep-$PL` to the +latest commit from the "Publishing" step. Then rebuild semgrep-core, +which will normally fail if the grammar changed. 
If the source +`grammar.js` was included in the `fyi` folder for `semgrep-$PL` (as it +should be), `git diff HEAD^` should help figure out the changes since the +last version. + +## Conclusion + +The main difficulty is to understand how the different git projects +interact and to not make mistakes when dealing with git submodules, +which takes a bit of practice. + +## See also + + + + \ No newline at end of file diff --git a/mintlify-docs/customize-semgrep-ce.mdx b/mintlify-docs/customize-semgrep-ce.mdx new file mode 100644 index 0000000000..f4e455c69e --- /dev/null +++ b/mintlify-docs/customize-semgrep-ce.mdx @@ -0,0 +1,145 @@ +--- +sidebarTitle: "Customize scans" +title: "Customize Semgrep Community Edition (CE) scans" +description: "This article shows you how to customize your local scans with Semgrep Community Edition (CE). Before proceeding with this article, ensure that you are familiar with [scanning a project using Semgrep CE](/getting-started/quickstart-ce)." +--- + +## Scan your codebase and export results + +Navigate to the root of your codebase to run your first scan. The specific command you use depends on how you want to view the results. + +To view the results in the CLI: + +```bash +semgrep scan +``` + +To export the results to a plain text file: + +```bash +semgrep scan --text --text-output=semgrep.txt +``` + +To export the results to a SARIF file: + +```bash +semgrep scan --sarif --sarif-output=semgrep.sarif +``` + +To export the results to a JSON file: + +```bash +semgrep scan --json --json-output=semgrep.json +``` + +> The JSON schema for Semgrep's CLI output can be found in [semgrep/semgrep-interfaces](https://github.com/semgrep/semgrep-interfaces/blob/main/semgrep_output_v1.jsonschema).
+ +In addition to the `--text`, `--json`, and `--sarif` flags, which set the primary output formats, and the `--output=FILE_OR_URL` flag that saves the results to a file or posts to a URL, you can append `--FORMAT-output=FILE` to obtain additional output streams: + +```bash expandable +# prints findings in SARIF format to standard output and writes in JSON format to `findings.json`. +semgrep scan --sarif --json-output=findings.json + +# prints findings in text to standard output and writes JSON output to `findings.json`. +semgrep scan --json-output=findings.json + +# prints text output to `findings.txt` and writes in SARIF to `findings.sarif`. +semgrep scan --output=findings.txt --sarif-output=findings.sarif + +# writes text to `semgrep.txt`, JSON to `semgrep.json`, and SARIF to `semgrep.sarif`. +semgrep scan --text --output=semgrep.txt --json-output=semgrep.json --sarif-output=semgrep.sarif +``` + +Accepted values for `FORMAT`: `text`, `json`, `sarif`, `gitlab-sast`, `gitlab-secrets`, `junit-xml`, `emacs`, `vim` + +## Scan your codebase with a specific ruleset + +You can scan your codebase using `--config auto` to run Semgrep with rules that apply to your programming languages and frameworks: + +```bash +semgrep scan --config auto +``` + + +**INFO** + +Semgrep collects pseudonymous metrics when you use rules from the Registry. You can turn this off with `--metrics=off`. + + +To scan your codebase with a specific ruleset, either one that you write or one that you obtain from the [Semgrep Registry](https://semgrep.dev/explore), use the `--config` flag. + +```bash +# Scan with the JavaScript rules from Semgrep Registry +semgrep scan --config p/javascript +``` + +```bash +# Scan with the rules defined in your custom rules.yaml file +semgrep scan --config rules.yaml +``` + +You can include as many configuration flags as necessary.
+ +```bash +# Scan with rules defined in two separate config files +semgrep scan --config rules.yaml --config more_rules.yaml +``` + +Rules stored under a **hidden directory**, such as `dir/.hidden/myrule.yml`, are processed by Semgrep when scanning with the `--config` flag. + +Scan with rules in a **directory** and **all** its subdirectories: + +```bash +semgrep scan --config DIRECTORY_NAME +``` + +Scan with all YAML rules detected in the **current working directory** and all its **subdirectories**: + +```bash +semgrep scan --config . +``` + +#### Test custom rules + +Semgrep includes features to [test the custom rules that you write](/writing-rules/testing-rules): + +```bash +semgrep scan --test +``` + +## Improve performance for large codebases + +You can set the number of subprocesses Semgrep uses to run checks in parallel: + +```bash +semgrep scan -j NUMBER_OF_SUBPROCESSES +``` + +By default, the number of jobs Semgrep uses is equivalent to the number of cores detected on the system. + + +Semgrep doesn't currently support parallelism on Windows. + + +## Set log levels + +Semgrep provides three levels of logging: + +| **Log level** | **Flag** | **Description** | +| :--- | :--- | :--- | +| Default | None | Prints scan progress, findings information, warnings, and errors. | +| Verbose | `-v` or `--verbose` | Includes everything printed when using the default logging level, adding a list of rules and details such as skipped files. | +| Debug | `--debug` | Logs the entire scan process at a high level of detail. | + +### Example usage + +To set the logging level for a scan, include the flag when scanning your project: + +```bash +# run a scan and get debug logs +semgrep scan --debug +``` + +## Exit codes + +The command `semgrep scan` finishes with exit code `0` as long as the scan completes, regardless of whether there were findings. To finish with exit code `1` when there are findings, pass in the `--error` flag. 
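
The exit-code behavior described above can be checked in a wrapper script. The following is a minimal sketch, not an official Semgrep utility: `describe_scan_status` is a hypothetical helper name, and treating every code other than `0` or `1` as a failed scan is an assumption made here rather than something this page documents.

```bash
# Hypothetical helper (not part of Semgrep): turn the exit code of
# `semgrep scan` into a short human-readable message.
describe_scan_status() {
  case "$1" in
    # 0: the scan completed (with --error, this also means no findings).
    0) echo "scan completed" ;;
    # 1: only expected when --error is passed and there are findings.
    1) echo "scan completed with findings" ;;
    # Anything else is treated as a failure in this sketch (assumption).
    *) echo "scan failed with exit code $1" ;;
  esac
}

# Usage, assuming semgrep is installed and on PATH:
#   semgrep scan --error --config auto
#   describe_scan_status "$?"
```

Keeping the interpretation in one function makes it easier to decide in a single place whether findings should fail a local script.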
diff --git a/mintlify-docs/deployment/add-ai-to-scans.mdx b/mintlify-docs/deployment/add-ai-to-scans.mdx new file mode 100644 index 0000000000..96d7c6a347 --- /dev/null +++ b/mintlify-docs/deployment/add-ai-to-scans.mdx @@ -0,0 +1,103 @@ +--- +title: "Scan with AI-powered detection (beta)" +sidebarTitle: "Scan with AI" +--- + + +This page provides step-by-step instructions on enabling and running an AI-powered scan. For details on what AI-powered detection can uncover, known limitations, and beta considerations, see [AI-powered detection overview](/semgrep-code/ai-powered-detection-concepts). + +## Prerequisites + +To run Semgrep Code's [AI-powered detection](/semgrep-code/overview#ai-powered-detection-beta-feature), ensure that you meet the following requirements: + +* You have added your projects to [Semgrep Managed Scans](/getting-started/quickstart-managed-scans#add-projects-to-semgrep-managed-scans). Look for the `managed-scan` tag in the [**Projects** section of the Semgrep AppSec Platform](https://semgrep.dev/orgs/-/projects/scanning). +* You have enabled [Semgrep Multimodal](/semgrep-multimodal/getting-started#enable-multimodal) for your organization. + +## Enable or disable AI-powered detection + +This feature is enabled by default for all Semgrep Multimodal users. + +To enable or disable AI-powered detection in Semgrep AppSec Platform, go to [**Settings** > **Code**](https://semgrep.dev/orgs/-/settings/general/code) and then toggle **AI-powered scanning** on or off. + +## Scan with AI-powered detection + + + +Log in to Semgrep AppSec Platform. + + +In the **navigation bar**, click on **Projects**. + + + +To scan the default or main branch: + + + +Choose the projects by selecting the checkboxes next to their names. This enables the **Run a new scan** drop-down menu. + + +Click **Run a new scan > AI-powered detection**. + + +A dialog appears that displays the number of projects that were selected for scanning. Click **Scan** to begin. 
+ + * If you would like Semgrep to automatically perform an AI scan on these projects every week, select **Enable weekly scans**. + + + +To scan a non-default branch: + + + +Click **Details** for your project of interest. On the project's **Details** page, click **Run a new scan** and choose **AI-powered detection.** + + +In the dialog, enter the name of the branch you want to scan. + + + +## View findings + +Findings generated by AI-powered detection scans are part of [Semgrep Code findings](/semgrep-code/findings) and are listed on the **[Code](https://semgrep.dev/orgs/-/findings)** page. You can use the filters icon to filter for **AI-powered scan findings**. + +The findings card indicates whether a finding was detected by an AI-powered scan or a Rule-based scan. + + +## Add additional context to AI-powered detection scans + + +**INFO** + +Only **Admins** can upload context documents to Semgrep projects. + + +By uploading project-specific context such as design documents, threat models, or instructional markdown, you can provide additional information for Semgrep to use during AI-powered scans. This enables Semgrep to show higher-impact findings and reduce false positives based on how your application is designed and used. + + +To upload a project-specific context document: + + + +Log in to Semgrep AppSec Platform. + + +In the **navigation bar**, go to **Rules & Policies > Memories**. + + +Go to the **Documents** tab and click **Add document**. + + +Drag the document to the **File upload** box or click **Choose a file** to select and upload your context document. + + Optionally: Add a **Description** of the document. This information will be used as additional context for AI-powered detection scans. + + + +The finding **Details** page references the uploaded context document under the finding description. 
+ + + +For an in-depth understanding of how AI-powered detection works, see [AI-powered detection: concepts, limitations, and FAQs](/semgrep-code/ai-powered-detection-concepts). + + diff --git a/mintlify-docs/deployment/add-semgrep-to-ci.mdx b/mintlify-docs/deployment/add-semgrep-to-ci.mdx new file mode 100644 index 0000000000..b16ffd36f4 --- /dev/null +++ b/mintlify-docs/deployment/add-semgrep-to-ci.mdx @@ -0,0 +1,255 @@ +--- +title: "Add Semgrep to CI" +--- + + + +**YOUR DEPLOYMENT JOURNEY** + +- You have gained the necessary [resource access and permissions](/deployment/checklist) required for deployment. +- You have [created a Semgrep account and organization](/deployment/create-account-and-orgs). +- For GitHub and GitLab users: You have [connected your source code manager](/deployment/connect-scm). +- Optionally, you have [set up SSO](/deployment/sso). + + +Semgrep is integrated into CI environments by creating a **job** that is run by the CI provider. After a scan, findings are sent to Semgrep AppSec Platform for triage and remediation. + +By integrating Semgrep into your CI environment, your development cycle benefits from the automated scanning of repositories at various events, such as: + +- Push events +- Pull requests or merge requests (PRs or MRs) +- User-initiated events (such as GitHub Action's `workflow_dispatch`) + + +**SEMGREP MANAGED SCANS** + +As an alternative to integrating Semgrep into your CI/CD system, consider [Semgrep Managed Scans](/deployment/managed-scanning/overview), which enables you to bulk onboard and scan your repositories without requiring changes to your CI. 
+ + +## Guided setup for CI providers in Semgrep AppSec Platform + +This guide walks you through creating a Semgrep job in the following CI providers, which are explicitly supported in Semgrep AppSec Platform: + +- GitHub Actions +- GitLab CI/CD +- Jenkins +- Bitbucket +- CircleCI +- Buildkite +- Azure Pipelines +- Semaphore + +If your provider is **not** on this list, you can still integrate Semgrep into your CI workflows by following the steps in [ Add Semgrep to other CI providers](/deployment/add-semgrep-to-other-ci-providers). + +## Projects + +Adding a Semgrep job to your CI provider also adds the repository's records, including findings, as a **project** in Semgrep AppSec Platform. Each project can be individually configured to send notifications or tickets. + + +## Add Semgrep to CI + + + + +To add a Semgrep job to your CI provider: + + + +Ensure you are signed in to Semgrep AppSec Platform. + + +Click **[Projects](https://semgrep.dev/orgs/-/projects)** on the left sidebar. + + +Click **Scan new project > CI/CD**. + + +Click the name of the CI provider you use. You are taken to the **Add job** page. + + +Follow the steps provided on the page. The process varies depending on your CI provider, but generally includes the following steps:

+ i. Click **Create new token** to create a `SEMGREP_APP_TOKEN`, which is used when sending results to Semgrep AppSec Platform.

+ ii. Copy and paste the `SEMGREP_APP_TOKEN` and its value. Store it as an environment variable or secret in your CI provider.

+ iii. Optional: Click **Review CI config** to see Semgrep's default YAML configuration file for your CI provider.

+ iv. Click **Copy snippet** and paste it into your CI provider's configuration file (the filename is typically indicated on the page). Depending on your CI provider, you may have to create a custom configuration file or use an existing one.

+ v. Commit the configuration file to your repository.

+ vi. Return to Semgrep AppSec Platform and click **Check connection**. +
+
+ +You have now added a Semgrep job to your CI provider; this starts your first **full scan**. Its findings are sent to Semgrep AppSec Platform for triage and remediation. + +
+ + +To add a CI job to GitHub Actions: + + + +Ensure you are signed in to Semgrep AppSec Platform. + + +Click **[Projects](https://semgrep.dev/orgs/-/projects)** on the left sidebar. + + +Click **Scan new project > CI/CD**. + + +Click **GitHub Actions**. + + +A list of repositories appears. Select all the repositories you want to add a Semgrep job to. + + +If you do not see the repository you want to add, adjust [ GitHub Application's Repository Access](https://github.com/settings/installations) configuration. See [Detecting GitHub repositories](#detecting-github-repositories) for more information. + + +Click **Add CI job**. You are taken to the Add CI job page. + + +Optional: Click **Review CI config** to see Semgrep's default YAML configuration file. + + +Click **Commit file**. + + + +You have now added a Semgrep job to GitHub Actions. A **full scan** begins automatically after adding a new repository. Its findings are sent to Semgrep AppSec Platform for triage and remediation. + +### Detecting GitHub repositories + +If you aren't seeing your GitHub repos in the Cloud Platform, complete the following steps to ensure that your GitHub repository is **detected** by Semgrep AppSec Platform: + + + +Log in to GitHub. + + +Perform one of the following steps:

+ i. For repositories in personal accounts: Click your **profile photo > Settings > Applications**.

+ ii. For repositories in org accounts: Click your **profile photo > Your organizations > NAME_OF_ORG > Settings > GitHub Apps**. +
+ +On the `semgrep-app` entry, click **Configure**. + + +Under **Repository access** select an option to provide access:

+ i. All repositories will display all current and future public and private repositories.

+ ii. Only select repositories will display explicitly selected repositories. +
+
+ +
+
+ + +**TIP** + +You can edit your configuration files to send findings to **GitHub Advanced Security Dashboard (GHAS)** and **GitLab SAST Dashboard**. Refer to the following samples: + +- [GitHub Advanced Security Dashboard](/semgrep-ci/sample-ci-configs/#upload-findings-to-github-advanced-security-dashboard) +- [GitLab SAST Dashboard](/semgrep-ci/sample-ci-configs/#upload-findings-to-gitlab-security-dashboard) + + + +### Sample CI configuration snippets + +Refer to the following table for links to sample CI configuration snippets: + +| In-app CI provider | Sample CI configuration snippet | +| :--- | :--- | +| Azure Pipelines | [`azure-pipelines.yml`](/semgrep-ci/sample-ci-configs/#azure-pipelines) | +| Bitbucket Pipelines | [`bitbucket-pipelines.yml`](/semgrep-ci/sample-ci-configs/#bitbucket-pipelines) | +| Buildkite | [`pipelines.yml`](/semgrep-ci/sample-ci-configs/#buildkite) | +| CircleCI | [`config.yml`](/semgrep-ci/sample-ci-configs/#circleci) | +| GitHub Actions | [`semgrep.yml`](/semgrep-ci/sample-ci-configs/#github-actions) | +| GitLab CI/CD | [`.gitlab-ci.yml`](/semgrep-ci/sample-ci-configs/#gitlab-cicd) | +| Jenkins | [`Jenkinsfile`](/semgrep-ci/sample-ci-configs/#jenkins) | +| Semaphore | [`semaphore.yml`](/semgrep-ci/sample-ci-configs/#semaphore) | + +### Data collected by Semgrep + +When running in CI, Semgrep runs fully in the CI build environment. Unless you have explicitly granted code access to Semgrep, your code is not sent anywhere. + +- Semgrep collects [findings data](/semgrep-ci/findings-ci), which includes the line number of the code match, but not the code. It is hashed using a one-way hashing function. +- Findings data is used to generate line-specific hyperlinks to your source code management system and support other Semgrep functions. + +### Delete a project + +Deleting a project removes all of its findings, metadata, and other records from Semgrep AppSec Platform. + + + +In Semgrep AppSec Platform, click **Projects**. 
+ + +Search for your repository's name. + + +Click the **windows icon** to access the settings page for that project. + + +Click the **three-dot (...) button** at the header and click **Delete project**. + + + +To delete an archived project: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Switch to the **Not Scanning** tab of the **Projects** page. + + +Select the checkbox to **Show archived** projects. + + +Search for the archived repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + + +**INFO** + +It can take up to a day **(24 hours)** for the [Dashboard](/semgrep-appsec-platform/dashboard) to correctly update and remove findings associated with a recently deleted project. + + +## Scan scope + +Semgrep scans can be classified by **scope**. The scope of a scan refers to what lines of code are scanned in a codebase. When classifying scans by scope, there are two types of scans: + +### Full scan + +A full scan runs on your entire codebase and reports every finding in the codebase. It is recommended to perform a full scan of your default branch, such as main or master at a regular cadence, such as every night or every week. This ensures that Semgrep AppSec Platform has a full list of all findings in your code base, regardless of when they were introduced. To run a full scan, run `semgrep ci` without setting the `SEMGREP_BASELINE_REF` environment variable. Full scans are triggered at a scheduled time, when the `semgrep.yml` file is edited, or manually by a user. + +### Diff-aware scan + +A diff-aware scan runs on your code before and after some "baseline" and only reports findings that are newly introduced in the commits after that baseline. Typically, Semgrep runs diff-aware scans upon the creation of a new pull request or merge request. + +For example, imagine a hypothetical repository with 10 commits. 
You set commit number 8 as the baseline. Consequently, Semgrep only returns scan results introduced by changes in commits 9 and 10. This is how `semgrep ci` can run in pull requests and merge requests, since it reports only the findings that are created by those code changes. + +To run a diff-aware scan, use `SEMGREP_BASELINE_REF=REF semgrep ci` where `REF` can be a commit hash, branch name, or other Git reference. Note that the `SEMGREP_BASELINE_REF` does not apply to GitHub Actions and GitLab CI/CD environments. This variable cannot be set to turn a diff-aware scan in GitHub Actions or GitLab CI/CD into a full scan. + + +### Default branch names + +When you add a Semgrep CI job to your repository for the first time, Semgrep performs a full scan on the primary, or default, branches. In many cases, Semgrep automatically detects these branches as primary branches. However, you can also [set the primary branch name](/deployment/primary-branch). This is useful for repositories with unique names. This lets Semgrep know what branch to prioritize and perform full scans on. + +## Next steps + +You've set up Semgrep to scan in your repository and send findings after each scan. Your core deployment is almost complete. + +Remaining steps include: + +- Optional: [ Customize your CI job](/deployment/customize-ci-jobs). +- For software composition analysis (SCA) scans using **Jenkins or Maven**: [ Set up SCA scans for your infrastructure.](/semgrep-supply-chain/setup-infrastructure) +- For Jenkins users: Set up a separate CI job for diff-aware scans for feature branches (non-trunk branches) when a pull request or merge request is open. This is a prerequisite to receiving PR or MR comments. See Set up diff-aware scans. +- Set up **PR or MR comments**, which post findings to developers in your SCM. This involves developers in the security process as active participants. See [ PR or MR comments](/category/pr-or-mr-comments) for next steps. 
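
The two scan scopes described under **Scan scope** can be sketched as a small shell wrapper. This is an illustrative sketch only: `run_semgrep_ci` and the `SEMGREP_BIN` override are hypothetical names introduced here (the override exists so the function can be exercised without Semgrep installed), while `semgrep ci` and `SEMGREP_BASELINE_REF` come from the text above. As that section notes, the variable does not apply in GitHub Actions or GitLab CI/CD.

```bash
# Hypothetical wrapper (not part of Semgrep).
# With no argument: full scan, SEMGREP_BASELINE_REF is left unset.
# With one argument: diff-aware scan against that Git reference.
run_semgrep_ci() {
  if [ -n "${1:-}" ]; then
    # Set the baseline only for this one command.
    SEMGREP_BASELINE_REF="$1" "${SEMGREP_BIN:-semgrep}" ci
  else
    # Run in a subshell so a stale baseline in the caller's
    # environment cannot turn this into a diff-aware scan.
    ( unset SEMGREP_BASELINE_REF; "${SEMGREP_BIN:-semgrep}" ci )
  fi
}

# Usage, assuming semgrep is installed and configured for your environment:
#   run_semgrep_ci        # full scan
#   run_semgrep_ci main   # report only findings introduced since `main`
```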
diff --git a/mintlify-docs/deployment/add-semgrep-to-other-ci-providers.mdx b/mintlify-docs/deployment/add-semgrep-to-other-ci-providers.mdx new file mode 100644 index 0000000000..a71de66413 --- /dev/null +++ b/mintlify-docs/deployment/add-semgrep-to-other-ci-providers.mdx @@ -0,0 +1,223 @@ +--- +title: "Add Semgrep manually to CI providers" +--- + + +**YOUR DEPLOYMENT JOURNEY** + +- You have gained the necessary [resource access and permissions](/deployment/checklist) required for deployment. +- You have [created a Semgrep account and organization](/deployment/create-account-and-orgs). +- For GitHub and GitLab users: You have [connected your source code manager](/deployment/connect-scm). +- Optionally, you have [set up SSO](/deployment/sso). + + +This guide documents the steps required to create a Semgrep job for CI providers for which Semgrep AppSec Platform offers no explicit guidance. + +See [ Add Semgrep to CI](/deployment/add-semgrep-to-ci/#guided-setup-for-ci-providers-in-semgrep-appsec-platform) before proceeding to ensure that this guide applies to your CI provider. + +Skip this guide if you have already configured a CI job. + +The steps provided here are known to work with the following CI providers: + +- AppVeyor +- Bamboo +- Bitrise +- Buildbot +- Codeship +- Codefresh +- Drone CI +- Nomad +- Semaphore +- TeamCity CI +- Travis CI + +## General steps + +The following steps provide an overview of the process. View the succeeding sections for detailed instructions. + + + +Create a token called `SEMGREP_APP_TOKEN`. + + +Add this token as a credential, secret, or token to your CI provider. + + +Create a CI job that runs Semgrep. This step is typically achieved by committing a CI configuration file. The syntax of the configuration file depends on your CI provider. + + +The CI job can automatically start to run depending on your configuration. If the job does not start, run the job through the CI provider's interface or by committing code. 
+ + +Semgrep detects the `SEMGREP_APP_TOKEN`, sends it to Semgrep AppSec Platform for verification, and if verified, sends findings to Semgrep AppSec Platform. + + +Define additional environment variables to enable other Semgrep AppSec Platform features. This is done last because it is easier to troubleshoot modifications to jobs after ensuring that the base CI job runs correctly. + + + +The next sections go over these steps in detail. + +### Create a SEMGREP_APP_TOKEN + +To create a `SEMGREP_APP_TOKEN`, follow these steps: + + + +Sign in to [ Semgrep AppSec Platform](https://semgrep.dev/login). + + +Click **[ Settings](https://semgrep.dev/orgs/-/settings/tokens)** > **Tokens**. + + +Click **Create new token**. + + +Copy the name and value, then click **Save**. + + +Store the token value into your CI provider. Tokens can also be referred to as secrets, credentials, or secure variables. The steps to do this vary depending on your CI provider. + + + +### Create a Semgrep CI job + + + +Add Semgrep to your CI pipeline. Do either of the following:

+ i. Reference or add the [Semgrep Docker image](https://hub.docker.com/r/semgrep/semgrep). This is the recommended method.

+ ii. Add `pipx install semgrep` (or `uv tool install semgrep` if you use [`uv`](https://docs.astral.sh/uv/)) into your configuration file as a step or command, depending on your CI provider's syntax. See the [Python Packaging guide](https://packaging.python.org/en/latest/guides/installing-stand-alone-command-line-tools/) for more on installing standalone Python CLI tools. +
+ +Add `semgrep ci` as a step or command. + + +Set the `SEMGREP_APP_TOKEN` environment variable within your configuration file. + +
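For CI providers that run plain shell steps, the last two steps above can be sketched as a small wrapper. This is only an illustration: the function name `run_semgrep_ci` and the `SEMGREP_DRY_RUN` switch are hypothetical and not part of Semgrep. Only `semgrep ci` and `SEMGREP_APP_TOKEN` come from this guide. The sketch assumes Semgrep is already installed in the build environment and that your CI provider exposes the stored token as an environment variable.

```shell
#!/bin/sh
# Sketch of a provider-agnostic CI step.
# run_semgrep_ci and SEMGREP_DRY_RUN are hypothetical names used only here.
run_semgrep_ci() {
  # Semgrep needs the token to connect the scan to Semgrep AppSec Platform,
  # so fail early with a clear message if the CI secret was not exposed.
  if [ -z "${SEMGREP_APP_TOKEN:-}" ]; then
    echo "error: SEMGREP_APP_TOKEN is not set" >&2
    return 1
  fi

  # With SEMGREP_DRY_RUN=1, print the command instead of running it.
  if [ "${SEMGREP_DRY_RUN:-0}" = "1" ]; then
    echo "semgrep ci"
    return 0
  fi

  # Run the scan; findings are sent to Semgrep AppSec Platform.
  semgrep ci
}
```

Call `run_semgrep_ci` from the step that follows the Semgrep installation. Failing before the scan starts when the token is missing can make a misconfigured secret easier to spot in the job log.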
+ +The following example is a Jenkinsfile that adds Semgrep through the Docker image: + + + +```java expandable +pipeline { + agent any + environment { + // The following variable is required for a Semgrep AppSec Platform-connected scan: + SEMGREP_APP_TOKEN = credentials('SEMGREP_APP_TOKEN') + + // Uncomment the following line to scan changed + // files in PRs or MRs (diff-aware scanning): + // SEMGREP_BASELINE_REF = "main" + + // Troubleshooting: + + // Uncomment the following lines if Semgrep AppSec Platform > Findings Page does not create links + // to the code that generated a finding or if you are not receiving PR or MR comments. + // SEMGREP_JOB_URL = "${BUILD_URL}" + // SEMGREP_COMMIT = "${GIT_COMMIT}" + // SEMGREP_BRANCH = "${GIT_BRANCH}" + // SEMGREP_REPO_NAME = env.GIT_URL.replaceFirst(/^https:\/\/github.com\/(.*).git$/, '$1') + // SEMGREP_REPO_URL = env.GIT_URL.replaceFirst(/^(.*).git$/,'$1') + // SEMGREP_PR_ID = "${env.CHANGE_ID}" + } + stages { + stage('Semgrep-Scan') { + steps { + sh '''docker pull semgrep/semgrep && \ + docker run \ + -e SEMGREP_APP_TOKEN=$SEMGREP_APP_TOKEN \ + -e SEMGREP_REPO_URL=$SEMGREP_REPO_URL \ + -e SEMGREP_REPO_NAME=$SEMGREP_REPO_NAME \ + -e SEMGREP_BRANCH=$SEMGREP_BRANCH \ + -e SEMGREP_COMMIT=$SEMGREP_COMMIT \ + -e SEMGREP_PR_ID=$SEMGREP_PR_ID \ + -v "$(pwd):$(pwd)" --workdir $(pwd) \ + semgrep/semgrep semgrep ci ''' + } + } + } +} +``` + + + +The next example is a Jenkins configuration file that installs Semgrep: + + + + +```java expandable +pipeline { + agent any + environment { + // You need to set the token as an environment variable + // (see Create a `SEMGREP_APP_TOKEN` section). + SEMGREP_APP_TOKEN = credentials('SEMGREP_APP_TOKEN') + } + stages { + stage('Semgrep-Scan') { + steps { + // Install and run Semgrep: + sh 'pipx install semgrep' + sh 'semgrep ci' + } + } + } +} +``` + + + +### Run the job + +Depending on your CI provider and configuration, the job runs automatically. 
Otherwise, trigger the job by committing code or opening a PR or MR. + +### Verify the connection + +To verify that your Semgrep CI job is connected to Semgrep AppSec Platform: + + + +Go to your Semgrep AppSec Platform [Projects page](https://semgrep.dev/orgs/-/projects). + + +Verify that your repository is listed on the Projects page and that Semgrep AppSec Platform is running a scan. + + + +### Troubleshoot your CI job + +Semgrep attempts to automatically detect certain CI values, such as your repository's name and URL. These values are used to provide context to findings in Semgrep AppSec Platform and hyperlinks to the code that generated the finding. + +Refer to the following table for common issues and the corresponding environment variables you can set to fix them: + +| Issue | Environment variable to set | Affected CI providers | +| :--- | :--- | :--- | +| Can't establish a connection to Semgrep AppSec Platform. | `SEMGREP_APP_TOKEN` | Must be set for all CI providers. | +| Semgrep doesn't scan your PRs or MRs. | `SEMGREP_BASELINE_REF` | Required for CI providers **except** GitHub Actions or GitLab CI/CD. | +| Can't click hyperlinks to your repository from Semgrep AppSec Platform, nor can Semgrep AppSec Platform create PR or MR comments. | `SEMGREP_REPO_NAME` | Set these environment variables as needed to troubleshoot broken links for any CI provider **except** GitHub Actions and GitLab CI/CD. | +| Can't click hyperlinks to your repository from Semgrep AppSec Platform, nor can Semgrep AppSec Platform create PR or MR comments. | `SEMGREP_REPO_URL` | Set these environment variables as needed to troubleshoot broken links for any CI provider **except** GitHub Actions and GitLab CI/CD. | +| Can't click hyperlinks to your repository from Semgrep AppSec Platform, nor can Semgrep AppSec Platform create PR or MR comments. 
| `SEMGREP_BRANCH` | Set these environment variables as needed to troubleshoot broken links for any CI provider **except** GitHub Actions and GitLab CI/CD. | +| Can't click hyperlinks to your repository from Semgrep AppSec Platform, nor can Semgrep AppSec Platform create PR or MR comments. | `SEMGREP_JOB_URL` | Set these environment variables as needed to troubleshoot broken links for any CI provider **except** GitHub Actions and GitLab CI/CD. | +| Can't click hyperlinks to your repository from Semgrep AppSec Platform, nor can Semgrep AppSec Platform create PR or MR comments. | `SEMGREP_COMMIT` | Set these environment variables as needed to troubleshoot broken links for any CI provider **except** GitHub Actions and GitLab CI/CD. | +| Can't click hyperlinks to your repository from Semgrep AppSec Platform, nor can Semgrep AppSec Platform create PR or MR comments. | `SEMGREP_PR_ID` | Required to enable hyperlinks for **Azure Pipelines**. | + +## Data collected by Semgrep AppSec Platform + +When running in CI, Semgrep runs fully in the CI build environment. Unless you have explicitly granted code access to Semgrep, your code is not sent anywhere. + +- Semgrep AppSec Platform collects [findings](/semgrep-ci/findings-ci), which includes the line number of the code match, but not the code. It is hashed using a one-way hashing function. +- Findings data is used to generate line-specific hyperlinks to your source code management system and support other Semgrep functions. + +## Next steps + +You've set up Semgrep to scan in your repository and send findings after each scan. Your core deployment is almost complete. + +Remaining steps include: + +- Optional: [ Customize your CI job](/deployment/customize-ci-jobs). 
+- For software composition analysis (SCA) scans using **Jenkins or Maven**: [ Set up SCA scans for your infrastructure.](/semgrep-supply-chain/setup-infrastructure) +- Set up diff-aware scanning for feature branches (non-trunk branches) when a pull request or merge request is open. This is a prerequisite to receiving PR or MR comments. See Set up diff-aware scans. +- Set up **PR or MR comments**, which post findings to developers in your SCM. This involves developers in the security process as active participants. See [ PR or MR comments](/category/pr-or-mr-comments) for next steps. diff --git a/mintlify-docs/deployment/beyond-core-deployment.mdx b/mintlify-docs/deployment/beyond-core-deployment.mdx new file mode 100644 index 0000000000..ac3520d9dc --- /dev/null +++ b/mintlify-docs/deployment/beyond-core-deployment.mdx @@ -0,0 +1,33 @@ +--- +title: "Customize a core deployment" +description: "Now that you've finished your Semgrep core deployment, you can either customize Semgrep's scan behavior or continue to enable additional deployment features. The following sections list common tasks after you've finished your core deployment." +--- + +## Customize Semgrep scans or triage workflow + +| Concern | Guide | +| :--- | :--- | +| Semgrep scans irrelevant files. | [Ignore files, folders, or code](/ignoring-files-folders-code). | +| Semgrep Code is too noisy. | Enable [cross-file (interfile) analysis](/semgrep-code/semgrep-pro-engine-intro) or remove rules and rulesets through the [Policies page](/semgrep-code/policies). | +| I want my developers to see certain security issues in their pull request or merge request. | Configure [Comment mode](/semgrep-code/policies#block-a-pr-or-mr-through-rule-modes) in the Policies page. | +| I want to prevent developers from using dependencies with certain licenses. | Set up [license compliance](/semgrep-supply-chain/license-compliance).| +| I want to receive AI assistance when I triage findings. 
| Enable [Semgrep Multimodal](/semgrep-multimodal/overview). | +| I want to enforce my organization's coding standards. | Write a [custom rule](/writing-rules/overview) and add it to your Policies page. | + +## Enable additional deployment features + +| Concern | Guide | +| :--- | :--- | +| I want to receive notifications in my environment. | Set up [notifications](/semgrep-appsec-platform/notifications). | +| I want my developers to use Semgrep on their IDE. | Install and set up available [IDE extensions](/extensions/overview). | +| I'm scanning too many projects (repositories onboarded to Semgrep) and want to group them somehow. | [Tag your projects](/semgrep-appsec-platform/tags). | +| I'd like to manage access to the resources that developers can view or change in Semgrep AppSec Platform. | Configure [roles and users](/deployment/teams/overview). | + +## Stay up-to-date on new Semgrep features + +Subscribe to the: + + + + + \ No newline at end of file diff --git a/mintlify-docs/deployment/checklist.mdx b/mintlify-docs/deployment/checklist.mdx new file mode 100644 index 0000000000..2855489313 --- /dev/null +++ b/mintlify-docs/deployment/checklist.mdx @@ -0,0 +1,306 @@ +--- +title: "Pre-deployment checklist" +--- + +Before starting the deployment setup, use this checklist to ensure that: + +- You and your organization agree on the **scope** of the deployment. +- You are aware of **permissions** that Semgrep needs to provide certain functions. +- You have **access** to the resources needed to carry out the deployment. + + +**INFO** + +Ensure that your infrastructure meets all the [ Prerequisites](/prerequisites) to run Semgrep. + + +## Stakeholders and deployment team + +For medium-to-large teams, typically with more than 10 developers, coordinating with other departments before starting the deployment is crucial to an efficient roll-out. A complete deployment ensures that your licenses are fully used. 
+ +Here are some teams or departments that may be responsible for parts of your Semgrep deployment: + +| Department | Tasks related to deployment | +| :--- | :--- | +| Infrastructure | SSO, CI/CD, and source code manager (SCM) configuration. | +| Engineering | Repository ownership, displaying findings to developers in PRs or MRs. | +| IT | Firewall or VPN configuration. | + +## Scope + +Scope refers to the breadth of deployment integration within your organization. The more users and repositories you onboard to Semgrep, the more crucial training becomes for **security champions** within your organization. + +Ensure that all stakeholders agree on: + +- Which users and departments will use Semgrep. +- Which repositories you will scan with Semgrep. +- How frequently you run Semgrep scans, such as daily or weekly, and at what time. This may affect other processes, such as PR approvals. +- A timeframe for deployment. You may divide this into phases. + +**Deployment times** vary greatly depending on your processes and size. + + +**ON SCHEDULING SCANS** + +Monorepos may take longer to finish scanning. Semgrep provides several options to improve performance, including piecemeal scanning of the monorepo. See [ Scanning a monorepo in parts](/kb/semgrep-ci/scan-monorepo-in-parts) for more information. + + +## Roles + +Semgrep provides three primary roles: **admin**, **member**, and **readonly**. + +Deployments can also enable a fourth role, **manager**, through the [Teams](/deployment/teams/overview) feature, which provides project-level role-based access control. + +For **single-user deployments**, you are the sole **admin** of your deployment. + +For **multi-user deployments**, determine the following: + +- The administrators (**admins**) that own the Semgrep deployment. 
+- For **members**, ensure that they have a sign-in method: + - SSO + - GitHub Cloud + - GitLab Cloud + +## Required permissions and access + +The following checklist breaks down permissions required by Semgrep features. + +| Feature | Permission required | +| :--- | :--- | +| Run Semgrep continuously in your CI workflows. | Adding or making changes to CI jobs. This includes committing configuration files for each repository. | +| Run Semgrep continuously in your CI workflows. | Defining environment variables and storing secrets. | +| Run Semgrep continuously **without** changing your CI workflows. | Read access to user-selected repositories. | +| Manage user authentication with SSO. | Viewing and editing of SSO configurations. | +| Receive Slack notifications. | Being a **Slack workspace owner**; alternatively, coordinate with the team responsible. | +| Send pull requests or merge requests to your SCM. | Editing firewall or VPN allowlist for self-hosted repositories. | + + +### SCM-specific required permissions + + + + + + +#### Azure DevOps + +| Feature | Permission | +| :--- | :--- | +| Pull request (PR) comments. | Able to create [user personal access tokens](https://learn.microsoft.com/en-us/azure/devops/organizations/accounts/use-personal-access-tokens-to-authenticate). | + + + + + +#### Bitbucket + +| Feature | Permission | +| :--- | :--- | +| Pull request (PR) comments. | Able to create **repository variables**. | + + + + + +#### GitHub + +| Feature | Permission required | +| :--- | :--- | +| Create CI jobs for repositories in bulk and detect GitHub repositories automatically. | Installing GitHub apps. | +| AI-assisted triage and recommendations. | Code access. | + + + + + +#### GitLab + +| Feature | Permission required | +| :--- | :--- | +| Merge request (MR) comments. | Create personal access tokens. | +| AI-assisted triage and recommendations. | Create personal or project-level access tokens. | +| AI-assisted triage and recommendations. 
| Read access to user-selected repositories. | + + + + + + +## Appendices + +### Permissions + + + + +#### Permissions for GitHub + +This section explains Semgrep AppSec Platform permissions that are requested in two different events: + +* When you first sign in through GitHub. +* When you first add, integrate, or onboard your repositories to Semgrep AppSec Platform. + +#### Permissions when signing in with GitHub + +Semgrep AppSec Platform requests the following standard permissions set by GitHub when you first sign in. However, not all permissions are used by Semgrep AppSec Platform. + + + +**Verify your GitHub identity**
+Enables Semgrep AppSec Platform to read your GitHub profile data, such as your username. + +**Know which resources you can access**
+Semgrep does not use or access any resources when first logging in. However, you can choose to share resources at a later point to add repositories into Semgrep AppSec Platform. + +**Act on your behalf**
+Enables Semgrep AppSec Platform to perform certain tasks **only on resources that you choose to share with Semgrep AppSec Platform**. Semgrep AppSec Platform never uses this permission and never performs any actions on your behalf, even after you have installed `semgrep-app`. See [When does a GitHub App act on your behalf?](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/authorizing-github-apps) in GitHub documentation. +
+ +#### Permissions when adding members or repositories into Semgrep AppSec Platform + +The public GitHub integration app is called [`semgrep-app`](https://github.com/apps/semgrep-app). This app is used to integrate Semgrep into user-selected GitHub repositories. + + + +**Reading metadata of the repositories you select**
+Enables Semgrep AppSec Platform to list repository names on the project setup page. + +**Reading the list of organization members**
+Enables Semgrep AppSec Platform to determine who can manage your Semgrep organization based on your GitHub organization's members list. + +**Reading and writing pull requests**
+Enables Semgrep AppSec Platform to comment about findings on pull requests. + +**Reading and writing actions**
+Enables Semgrep AppSec Platform to cancel stuck jobs, rerun jobs, pull logs from jobs, and perform on-demand scanning. + +**Reading [GitHub Checks](https://docs.github.com/en/rest/reference/checks)**
+Facilitates debugging of Semgrep AppSec Platform when configured out of [GitHub Actions](https://docs.github.com/en/actions). + +**Reading and writing security events**
+Enables integration with GitHub Advanced Security (for example, to show Semgrep results). + +**Reading and writing secrets**
+Enables automatically adding the Semgrep AppSec Platform token to your repository secrets when onboarding projects. Note: Semgrep cannot read the values of your existing or future secrets (only the names). + +**Reading and writing 2 files**
+Enables Semgrep AppSec Platform to configure itself to run in CI by writing to `.github/workflows/semgrep.yml` and `.semgrepignore` files. + +**Reading and writing workflows**
+Enables Semgrep AppSec Platform to configure itself to run in CI by writing to `.github/workflows/semgrep.yml`. GitHub allows writing to files within `.github/workflows/` directory only if this permission is granted along with "Writing a single file." + +**Reading and writing pull requests**
+Write permissions allow Semgrep AppSec Platform to leave pull request comments about findings. Read permissions allow Semgrep AppSec Platform to automatically remove findings when the pull request that introduced them is closed without merging. +
+ + +#### Permissions when adding repositories into Semgrep AppSec Platform through Semgrep Managed Scans or using AI features + +You can optionally create a private GitHub app, which follows the naming convention **Semgrep Code - YOUR_ORG_NAME**. This private app is used for the following features: + +- To add repositories to Semgrep AppSec Platform without changing your existing CI workflows. To learn more, see [ Semgrep Managed Scans](/deployment/managed-scanning/overview). +- To integrate AI-assisted features into your Semgrep organization. To learn more, see [ Semgrep Multimodal overview](/semgrep-multimodal/overview). + + +**INFO** + +These features require **read access** to your code. + + + + +**Reading metadata of the repositories you select**
+Lets Semgrep list their names on the project setup page. + +**Reading the list of organization members**
+Lets Semgrep determine who can manage your Semgrep organization based on your GitHub organization's members list. + +**Writing (and reading) pull requests**
+Lets Semgrep comment about findings on pull requests. + +**Writing (and reading) actions**
+Allows Semgrep AppSec Platform to cancel stuck jobs, rerun jobs, pull logs from jobs, and perform on-demand scanning. + +**Reading checks**
+Facilitates debugging of Semgrep AppSec Platform when configured out of GitHub Actions. + +**Writing (and reading) security events**
+Enables integration with GitHub Advanced Security (for example, to show Semgrep results). + +**Writing (and reading) secrets**
+Enables automatically adding the Semgrep AppSec Platform token to your repository secrets when onboarding projects. Note: Semgrep cannot read the values of your existing or future secrets (only the names). + +**Writing (and reading) 2 files**
+Lets Semgrep configure itself to run in CI by writing to `.github/workflows/semgrep.yml` and `.semgrepignore`. + +**Writing (and reading) workflows**
+Lets Semgrep configure itself to run in CI by writing to `.github/workflows/semgrep.yml`. GitHub allows writing to files within `.github/workflows/` only if this permission is granted along with "Writing a single file." + +**Read source code of the repositories you select**
+Allows Semgrep Multimodal to fetch source code files on-demand to construct AI prompts. +
+ +
+ + + +#### Permissions for GitLab + +Semgrep requires the following permissions (scopes) to enable the authentication of a session: + +* `openid` +* `email` +* `profile` +* `API` + + +
+ +### IP addresses + +If you are behind a firewall, are using a virtual private network (VPN), or have network restrictions regarding access, you may need to add the following IP addresses to the **ingress** allowlist and **egress** allowlist: + +```bash +# Ingress IP addresses (from Semgrep to your infrastructure) +# and egress IP addresses (from your infrastructure to Semgrep) +35.166.231.235 +52.35.248.246 +52.34.137.110 +44.225.64.41 +``` + +#### Additional egress IP addresses + +You must also add **CloudFront IP addresses** to your **egress** allowlist. Refer to [ Locations and IP address ranges of CloudFront edge servers](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/LocationsOfEdgeServers.html) for a list of IP addresses. + +#### Allowlists when using Semgrep Network Broker + +The [Semgrep Network Broker](/semgrep-ci/network-broker) facilitates secure access with Semgrep, and its use can replace the allowlisting of the IP addresses required for ingress. The Network Broker, however, only facilitates requests from Semgrep to your network and *doesn't* assist with requests originating from your network, including those from your network to Semgrep. + +In other words, the only address you would have to allow inbound is `wireguard.semgrep.dev` on UDP port `51820`, or your tenant's equivalent. Depending on how restrictive your network is, you may also need to modify your allowlist to include the egress IP addresses provided in [IP addresses](#ip-addresses). + +#### Features that require inbound network connectivity + +- [Source code manager connections](/deployment/connect-scm#connect-to-on-premise-orgs-and-projects) +- [PR and MR comments](/category/pr-or-mr-comments) +- [Semgrep Managed Scans](/deployment/managed-scanning/overview) +- [Semgrep Multimodal](/semgrep-multimodal/getting-started) + +### Semgrep versions + +Many improvements to the Semgrep AppSec Platform experience only work with up-to-date Semgrep CLI versions. 
As such, Semgrep AppSec Platform only supports the 10 most recent minor versions of Semgrep CLI. For example, if the latest release was 1.60.0, all versions greater than 1.50.0 are supported, while earlier versions, such as 1.49.0, can be deprecated or can result in failures. + +To update Semgrep, see [Update Semgrep](/update). + +Docker users: use [the **latest** tag](https://hub.docker.com/r/semgrep/semgrep/tags?page=1&name=latest) to ensure you are up to date. + +### Semgrep AppSec Platform session details + +- The time before you need to reauthenticate to Semgrep AppSec Platform is 7 days. +- A Semgrep AppSec Platform session token is valid for 7 days. +- This session timeout is not configurable. +- Semgrep AppSec Platform does not use cookies; instead it uses `localStorage` to store access tokens. The data in `localStorage` expires every 7 days. + +## Additional resources + +Check out [How to introduce Semgrep to your organization](https://blog.trailofbits.com/2024/01/12/how-to-introduce-semgrep-to-your-organization/) from Trail of Bits for tips on how to evaluate and deploy Semgrep for your org. diff --git a/mintlify-docs/deployment/claim-a-license.mdx b/mintlify-docs/deployment/claim-a-license.mdx new file mode 100644 index 0000000000..47116a4c91 --- /dev/null +++ b/mintlify-docs/deployment/claim-a-license.mdx @@ -0,0 +1,27 @@ +--- +title: "Claim a license" +description: "Once you've purchased a subscription, you should receive an email from Semgrep with your license information. Follow the instructions provided in the email to claim your license and begin onboarding your Semgrep products." +--- + + +## For license key holders or manual license claims + + +**CAUTION** + +For **single-tenant** users, reach out to the [Semgrep Support Team](/support) directly. Please do not attempt to claim a license manually. 
+ + +If you have been provided a license key by Semgrep or if you would like to claim a license manually: + + + + Sign up or log in to your Semgrep account. + + + [Create an org](/deployment/create-account-and-orgs/#initial-sign-in-to-semgrep-appsec-platform) if you haven't already done so. + + + Navigate to `http://semgrep.dev/orgs/-/settings/upgrade/YOUR_LICENSE_KEY`, making sure that you **replace** the `YOUR_LICENSE_KEY` placeholder with your license key value. + + diff --git a/mintlify-docs/deployment/connect-scm.mdx b/mintlify-docs/deployment/connect-scm.mdx new file mode 100644 index 0000000000..87e77179eb --- /dev/null +++ b/mintlify-docs/deployment/connect-scm.mdx @@ -0,0 +1,510 @@ +--- +title: "Connect a source code manager" +--- + + +**YOUR DEPLOYMENT JOURNEY** + +- You have gained the necessary [resource access and permissions](/deployment/checklist) required for deployment. +- You have [created a Semgrep account and organization](/deployment/create-account-and-orgs). + + +Linking a source code manager provides the following benefits: + +- Allows the Semgrep org membership to be managed by GitHub or GitLab. +- For GitHub users: + - Provides Semgrep access to post PR or MR comments. + - For GitHub Actions users: Enables you to add a Semgrep CI job to repositories in bulk. +- Allows you to scan and manage your Azure DevOps and Bitbucket projects in Semgrep AppSec Platform. +- Allows the Semgrep platform to generate hyperlinks to code in findings. + +If your organization uses both GitHub and GitLab to manage source code, log in with the source code manager that you would prefer to use to manage Semgrep org membership. You can still scan repositories from other sources, including Azure DevOps and Bitbucket, though you will need to use a separate SSO provider to manage the authentication of your users in such cases. 
+ +The process to connect a source code manager depends on whether your SCM tool is cloud-hosted by the service provider, hosted on-premise, or hosted as a single tenant by the service provider. + +## Connect to cloud-hosted orgs + +If you opted to scan a GitHub or GitLab repository when you initially signed in, you may have already performed these steps and can skip to [Next steps](#next-steps). + + + + + +### Azure DevOps Cloud + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Optional: If you have created more than one Semgrep account, select the account you want to make a connection for by clicking on the **Navigation bar > Your account name > The account you want to connect**.
+ + + +
+ +Go to **Settings > Source code managers > Add > Azure DevOps**. + + +In the **Connect your Azure DevOps Project** dialog box, provide: + - The **Name of your Azure DevOps Organization**. + - The **Name of your Azure DevOps Project**. The name of your Azure DevOps organization and project can be seen in the project URL, for example `https://dev.azure.com/organization/project`. + - Your **Access token**. See [User personal access tokens](https://learn.microsoft.com/en-us/azure/devops/organizations/accounts/use-personal-access-tokens-to-authenticate) for information on generating a token. + + +Click **Connect** to save and proceed. + + +The Azure DevOps project is now listed under **Source code managers**. Click **Test** to verify that the new connection is installed correctly. + +
+ +
+ + +### Bitbucket Cloud + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Optional: If you have created more than one Semgrep account, select the account you want to make a connection for by clicking on the **Navigation bar > Your account name > The account you want to connect**.
+ + + +
+ +Go to **Settings > Source code managers > Add > Bitbucket Cloud**. + + +In the **Connect your Bitbucket Workspace** dialog box, provide: + - The **Name of your Bitbucket Workspace** + - Your **Access token**. Semgrep requires a [workspace-level access token](https://support.atlassian.com/bitbucket-cloud/create-a-workspace-access-token/). + + +Click **Connect** to save and proceed. + + +The Bitbucket project is now listed under **Source code managers**. Click **Test** to verify that the new connection is installed correctly. + +
+ +
+ + +### GitHub Cloud with GitHub SSO + +These steps are for users that sign in to Semgrep through GitHub. + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Optional: If you have created more than one Semgrep account, select the account you want to make a connection for by clicking on the **Navigation bar > Your account name > The account you want to connect**.
+ + + +
+ +From the **Navigation bar**, click **Settings > Source code managers**. + + +Click **Add > GitHub**. + + +Review the permissions requested by Semgrep, then click **Continue**. + + +Click the organization you want to install Semgrep on. + + +Choose to authorize and install Semgrep for **All repositories** or **Only select repositories**. + + +Click **Install and authorize**. + + +After a successful link, you are signed out of Semgrep AppSec Platform automatically, as your credentials have changed after linking an organization. + + +Sign back in to Semgrep AppSec Platform. + +
+ +If you'd like to connect multiple GitHub orgs, use the instructions for [GitHub Cloud with non-GitHub SSO](#github-cloud-with-non-github-sso). + +### GitHub Cloud with non-GitHub SSO + +These steps are for users who: + +- Sign in to Semgrep through a **non-GitHub** SSO provider +- Have connected to a GitHub org already, but want to add additional GitHub connections + +You can connect to GitHub using Semgrep's GitHub app and one of the following: a personal access token or your individual GitHub account. + + + +Navigate to the following link: [Semgrep GitHub app](https://github.com/marketplace/semgrep-dev) and install the Semgrep GitHub app onto the GitHub org you want to connect to. + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login) using your non-GitHub SSO provider. + + +From the **Navigation bar**, go to **Settings > Source code managers**. + + +Click **Add > GitHub**. + + +In the **Connect your GitHub Organization** modal, enter the name of your GitHub organization. Then, either: + - Enter a GitHub personal access token and click **Connect**. + - Click the **Authenticate with GitHub** button without providing a token. + + +Your GitHub organization is now listed under **Source Code managers**. Click **Test** to verify that the new connection is installed correctly. + + + +Alternatively, you can set up the [Semgrep GitHub app](https://github.com/marketplace/semgrep-dev). Then, [contact Support](/support#contact-support) and inform them which Semgrep account needs to be connected to the GitHub org. Support can help you finalize the connection. + +### GitHub Enterprise Cloud with data residency + +If your GitHub Enterprise instance contains many orgs, you must **choose an org** among your accounts that acts as the **owner** of the Semgrep App. As the owner, this org controls the settings and permissions granted to the app. Throughout the setup process, ensure that you select this org consistently when prompted. 
+ +Perform the following steps to set up the connection: + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Optional: If you have created more than one Semgrep account, select the account you want to make a connection for by clicking on the **Navigation bar > Your account name > The account you want to connect**.
+ + + +
+ +From the **Navigation bar**, click **Settings > Source code managers**. + + +Click **Add > GitHub Enterprise**. + + +In the **Connect your GitHub Organization** dialog that appears, provide: + - The **Name of your GitHub Organization** + - The **URL** used to access the GitHub instance + + +Add the Semgrep GitHub App:
+ i. Under **Enter GitHub information**, indicate that you want to install the app on your **Organization**, and select the **Organization name** where the app is installed. If you have multiple GitHub organizations that you'd like to use with Semgrep, ensure that you select the **Use for multiple GitHub orgs** box.
+ ii. Under **Select features to enable**, indicate whether you would like to grant code access to Semgrep.
+ iii. Review the permissions requested by Semgrep.
+ iv. Click **Register a Semgrep GitHub App**. Semgrep asks if you'd like to be redirected to GitHub to continue creating the app. Click **Continue** to proceed.
+ v. You are taken to your GHE instance and asked to name your app. You can choose whatever name you'd like, but Semgrep recommends that you name it something that indicates that this is the Semgrep GHE app.
+ vi. After you name your app, choose the GHE org you want to install it on.
+ vii. Select the org, then click **Install**.
+ viii. Wait for the installation to complete. When done, you are redirected to Semgrep.
+ ix. Verify the installation by navigating to **Settings** > **Source code managers**. Ensure that the entry for your GitHub organization shows a **Connected** badge.
+ x. In GHE, you should see the app listed as installed on the **GitHub Apps** page.
+ - You can click **Configure** to choose the repositories to which the app has access. Additionally, you can go to **App settings** to customize the permissions granted to the app. +
+ +If you have additional GHE orgs you'd like to add, you can do so by repeating the previous steps 1-6. + + At this point, you've successfully installed the GHE Semgrep App on the owner GHE org. In the future, other members of your GHE instance can install the app on their GHE orgs using the public link if they have the proper permissions. You can get the public link from GHE by going to **GitHub Apps** > **App settings**. + + ![App installation page](/images/ghe-11-7886f9cda872941a6168734de5d98115.png) + + +
+#### Install the app for subsequent GitHub orgs + +You can install the Semgrep app onto additional GitHub orgs at any time. To do so: + + + +Go to the public link for the app. Click **Install**. + + ![App installation page](/images/ghe-12-93385476be689b84117756f9f5f08a46.png) + + + +Choose the GitHub org to which you want the app installed, and click **Install**. + + ![Org list](/images/ghe-13-b946c5e48935bfe8265a6f7720b5a748.png) + + + +In the popup confirmation message, click **Install**. + + ![GitHub installation prompt](/images/ghe-14-a39feb416a34d2da6cab74cf02df9213.png) + + + +The GitHub org should now be listed under **Source code managers**. + + + +You have successfully connected Semgrep to your GitHub organization. + +
+ + + +### GitLab Cloud + + + +Create a PAT by following the steps outlined in this [guide to creating a PAT](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html). Ensure that the PAT is created with the required `api` scope. + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Optional: If you have created more than one Semgrep account, select the account you want to make a connection for by clicking on the **Navigation bar > Your account name > The account you want to connect**.
+ + + +
+ +Click **Settings > Source code managers > Add > GitLab Cloud**. + + +Enter the personal access token you generated into the **Access token** field. + + +Enter your GitLab group's name into the **Name of your GitLab Group** field. If your repositories are organized in subgroups, you only need to provide the name of the top-level group. + + +Optional, but recommended: if you have multiple GitLab groups in your GitLab account, create a source code manager per group. Repeat steps 1, 3-4 for each GitLab group. + + +The GitLab groups are now listed under **Source code managers**. Click **Test** to verify that the new connection is configured correctly. + +
+ +You have successfully connected an org in Semgrep AppSec Platform with an organization in your source code management tool. + +
+
+ +## Connect to on-premise orgs and projects + + + + + +### Bitbucket Data Center + + + +Create an HTTP Access Token for your project following the steps outlined in [Bitbucket Data Center documentation](https://confluence.atlassian.com/bitbucketserver/http-access-tokens-939515499.html). Ensure that the access token is created with `PROJECT_ADMIN` permissions. + + +Copy the token for use in the next steps. + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Optional: If you have created more than one Semgrep account, select the account you want to make a connection for by clicking on the **Navigation bar > Your account name > The account you want to connect**.
+ + + +
+ +Go to **Settings** > **Source code managers**, and click **Add > Bitbucket Data Center**. + + +In the **Connect your Bitbucket project (key)** dialog box, provide: + - The **Name of your Bitbucket project (key)**. This must be the project key, which you can find by navigating to `/projects`. + - The **URL** to access your installation of Bitbucket Data Center; this is your fully qualified domain name. + - The **Access Token** that grants Semgrep permission to communicate with your project. Semgrep expects an [HTTP access token](https://confluence.atlassian.com/bitbucketserver/http-access-tokens-939515499.html) with `PROJECT_ADMIN` permissions. + + +Click **Connect** to save and proceed. + + +The Bitbucket project is now listed under **Source code managers**. Click **Test** to verify that the new connection was installed correctly. + + +To enable merge request comments, click **Incoming webhooks**. + + +Optional: Click **Auto scan** to onboard all current and future repositories under your project to Semgrep Managed Scans. + +
+ +
+ + +### GitHub Enterprise + +This section is applicable to users on a **GitHub Enterprise Server** plan. + +The **Semgrep App for GitHub Enterprise (GHE)** creates a connection between Semgrep +and orgs in your GHE deployment. There are two primary installation steps: + + + +Install the Semgrep App for the first time using the GHE organization (org) that "owns" the app. + + +Install the app for additional GHE orgs. + + + +#### Initial Semgrep App installation + +If your deployment contains many orgs, you must **choose an org** among your accounts that acts as the **owner** of the Semgrep App. As the owner, this org controls the settings and permissions granted to the app. + +Ensure that you have selected the intended owner by viewing the account name in the navigation bar: + + +
+Choose another account by clicking the **account name** and selecting an account from the drop-down box. Then, perform the following steps to set up the connection: + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login/). + + +Click **Settings** > **Source code managers > Add > GitHub Enterprise**. + + +In the **Connect your GitHub Organization** dialog box, provide: + - The **Name of your GitHub Organization** + - The **URL** to access your deployment + + +Click **Connect** to save your changes. + + +In the **Add GitHub App** page that you're redirected to, ensure that: + - You've selected **Organization**. + - The **GitHub Organization name** is populated; if not, enter the name of your org. + - You've selected the **Use for multiple GitHub orgs (Enterprise-public app)** checkbox. + + +Select the features you'd like enabled. Enabling PR comments, Multimodal recommendations, and Semgrep Managed Scans requires you to grant Semgrep Code Access, while enabling only PR comments does not. + + +Review the permissions for the app; as the app owner, note that you can change these permissions later. + + +Click **Register GitHub App** to proceed. + + +You are taken to your GHE instance and asked to name your app. You can choose whatever name you'd like, but Semgrep recommends that you name it something that indicates that this is the Semgrep GHE app. + + +After you name your app, choose the GHE org to which you want it installed. + + +Select the org that you want to act as the owner of the app, and click **Install**. + + +Wait for the installation to complete. When done, you will be redirected to Semgrep. + + +Verify the installation by navigating to **Settings** > **Source code managers**. Ensure that the entry for your SCM shows a **Connected** badge. + + +In GHE, you should see the app listed as installed on the **GitHub Apps** page. 
+ + ![GHE showing installed Semgrep App](/images/ghe-9-0f05ee3bc7f1ef7c1c1e41da3d20c2c2.png) + + You can click **Configure** to choose the repositories to which the app has access. Additionally, you can go to **App settings** to customize the permissions granted to the app. + + ![GitHub Apps page showing App settings link](/images/ghe-10-b108fbec17da29321cdaf8b2141e625d.png) + + + +If you have additional GHE orgs you'd like to add, you can do so by repeating steps 2-15. + + + +At this point, you've successfully installed the GHE Semgrep App on the owner GHE org. In the future, other members of your GHE instance can install the app on their GHE orgs using the public link if they have the proper permissions. You can get the public link from GHE by going to **GitHub Apps** > **App settings**. + + + ![App installation page](/images/ghe-11-7886f9cda872941a6168734de5d98115.png) + +#### Install the app for subsequent GHE orgs + +You can install the Semgrep app onto additional GHE orgs at any time. To do so: + + + +Go to the public link for the app shared with you by your admin. Click **Install**. + + ![App installation page](/images/ghe-12-93385476be689b84117756f9f5f08a46.png) + + + +Choose the GHE org to which you want the app installed, and click **Install**. + + ![Org list](/images/ghe-13-b946c5e48935bfe8265a6f7720b5a748.png) + + + +In the popup confirmation message, click **Install**. + + ![GitHub installation prompt](/images/ghe-14-a39feb416a34d2da6cab74cf02df9213.png) + + + +The GHE org should now be listed under **Source code managers**. + + + +You have successfully connected Semgrep to your GitHub Enterprise Server. + +
+ + +### GitLab Self-Managed + +This section is applicable to users with subscriptions to any **GitLab self-managed plan**. + +Connect Semgrep and GitLab Self-Managed accounts by creating a PAT and providing it to Semgrep using Semgrep AppSec Platform: + + + +Create a PAT by following the steps outlined in this [guide to creating a PAT](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html). Ensure that the PAT is created with the required `api` scope. + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Optional: If you have created more than one Semgrep account, select the account you want to make a connection for by clicking on the **Navigation bar > Your account name > The account you want to connect**.
+ + + +
+ +Click **Settings > Source code managers > Add > GitLab Self-Managed** and enter the personal access token you generated into the **Access token** field. + + +Enter your GitLab Self-Managed (GLSM) base URL into the **URL** field. + + +Enter your GitLab group's name into the **Name of your GitLab Group** field. If your repositories are organized in subgroups, you only need to provide the name of the top-level group. + + +If you have multiple GitLab groups in your GitLab account, you need to create a source code manager per group. Repeat steps 1, 3-5 for each GitLab group. + + +The GitLab groups are now listed under **Source code managers**. Click **Test** to verify that the new connection is installed correctly. + +
+ +
+
+ +## Next steps + +- Optional: See [ SSO authentication](/deployment/sso) to set up user management through SSO. +- You are ready to scan your org's repositories with Semgrep. + diff --git a/mintlify-docs/deployment/core-deployment.mdx b/mintlify-docs/deployment/core-deployment.mdx new file mode 100644 index 0000000000..152f942540 --- /dev/null +++ b/mintlify-docs/deployment/core-deployment.mdx @@ -0,0 +1,108 @@ +--- +title: "Core deployment" +description: "Semgrep can be set up to scan repositories of any size." +--- + +Once added to Semgrep, a codebase, repository, or subfolder within a monorepo is referred to as a **project**. + +**Deployment** refers to the process of integrating Semgrep into your developer and infrastructure workflows. Completing the deployment process provides you with the Semgrep features that meet your security program's needs. + +Deployment includes: + +- Running Semgrep scanners as part of your CI. These scans can be any combination of SAST (Static Application Security Testing), SCA (Software Composition Analysis), or Secrets, depending on your plan. +- Managing team members' access and authentication. +- Ensuring that Semgrep has sufficient access to your self-hosted source code manager (SCM), such as GitLab Self-Managed. + +Semgrep does not require code access to complete the core deployment process. Your code is not sent anywhere. + + +**ARE THESE GUIDES FOR YOU?** + +- These guides outline procedures for the deployment of Semgrep as part of a security program. To try out Semgrep, refer to the [ Quickstart](/getting-started/quickstart) document. +- Individual users can also use these guides to deploy Semgrep as part of their personal security. + + +Many deployment features are set up through **Semgrep AppSec Platform**. + +Deployment does **not** include: + +- Customizing your SAST, SCA, or secrets scans +- Custom rule writing +- Triage + +For these features, refer to the **Scan and Triage** section in the navigation bar. 
+ +### All Semgrep deployment features + +Semgrep supports many different technology stacks. Refer to the following table to evaluate which deployment features of Semgrep you can use based on your technologies. + +#### Core deployment + +These are the absolute minimum Semgrep features for any deployment. + +| Deployment feature | Notes | +| :--- | :--- | +| SAST scanning | Check that Semgrep: | +| SCA scanning | Check that Semgrep either supports your manifest file or lockfile and package manager. | +| Secrets scanning | Check that your services, such as Slack or Twilio, can be validated by Semgrep. Semgrep Secrets is available through Semgrep Sales, so you must Book a demo. | +| SSO | Semgrep supports:
  • OpenID Connect or OAuth 2
  • SAML 2.0
| +| Organizations | Semgrep can connect to orgs from GitHub and GitLab. Connecting an org enables Semgrep AppSec Platform to authenticate new users from the same org easily.

If you use Bitbucket or Azure Repos, you can use SSO to manage the authentication of your users, then add repositories for scanning through your CI provider. | +| Scanning remote repositories through CI | Semgrep fully supports many popular CI providers. See Add Semgrep to CI. | +| Managed Scans: scanning remote repositories in bulk without CI changes | An alternative method of scanning many repositories with Semgrep that doesn't require integration with your CI. Requires read access to user-selected repositories. See Add repositories to Semgrep in bulk. | +| PR or MR comments | Semgrep can post PR or MR comments in the following SCMs:
  • GitHub
  • GitLab
  • Bitbucket
| + +#### Additional deployment features + +Useful features that you can add based on your tech stack. You can integrate these features further into your security workflows after some initial testing of your core deployment. + +| Deployment feature | Notes | +| :--- | :--- | +| Notifications | Semgrep can send notifications through the following channels:
  • Slack
  • Email
  • Webhooks
| +| AI-assisted triage and remediation | Semgrep can give AI-assisted recommendations on whether a finding is a true or false positive as well as suggest code fixes for true positive findings. | +| IDE integration | Encourage developers to run Semgrep in their IDE. Officially supported extensions include:
  • Microsoft Visual Studio Code
  • IntelliJ Ultimate IDEA
  • Emacs
| +| API | Check that Semgrep's API meets your needs. See API docs. | + + +## Core deployment process + +At the minimum, your deployment of Semgrep consists of the following steps: + + + +Each user of Semgrep has one account. + + +Each Semgrep account can have many orgs. Orgs are logical groupings of related projects and users. + + + - For GitHub or GitLab users, you can connect your Semgrep org to the orgs in your source code manager (SCM). This means that any member of an org in your SCM can sign in to your Semgrep deployment. + - You can also use SSO to manage user authentication. + + +This step ensures that your Semgrep deployment is up and running and that you receive **findings** of security issues in Semgrep AppSec Platform. + + + + ![Core deployment steps](/images/core-deployment-7c163d2788754757edf2b150a5fff4e6.png) + + + + +To manage a large volume of users and projects, you may need to perform additional steps: + +- Role management +- Tagging projects + +These steps are covered in the section [Deployment at scale](/category/deployment-at-scale). + +Team size isn't necessarily indicative of deployment needs. Features for large teams can be deployed for smaller teams as well, and are available on the Semgrep Team Tier. + +## Deploy Semgrep in phases + +It is recommended to finish the core deployment of Semgrep to a few repositories or departments in your organization first before attempting to deploy to the majority. + +This **initial phase** prepares you to deploy Semgrep to the rest of the organization. Organizational infrastructure can vary greatly and the initial deployment can help you identify and address issues so that they do not recur in a wider deployment. + +## Next steps + +Click **Next** to begin setting up your core deployment. 
diff --git a/mintlify-docs/deployment/create-account-and-orgs.mdx b/mintlify-docs/deployment/create-account-and-orgs.mdx new file mode 100644 index 0000000000..41c8d7e716 --- /dev/null +++ b/mintlify-docs/deployment/create-account-and-orgs.mdx @@ -0,0 +1,240 @@ +--- +title: "Create a Semgrep account and set up organizations" +sidebarTitle: "Create an account" +--- + + + +**YOUR DEPLOYMENT JOURNEY** + +- You have gained the necessary [resource access and permissions](/deployment/checklist) required for deployment. + + +Create a Semgrep account by signing in to Semgrep AppSec Platform with your GitHub or GitLab account. This enables you to: + +* Add the rest of your GitHub or GitLab organization (org) members to Semgrep. +* Configure Semgrep to scan repositories in other source code managers, such as Bitbucket. + + +**USING SSO FOR YOUR INITIAL SIGN-IN** + +Alternatively, reach out to [ sales@semgrep.com](mailto:sales@semgrep.com) to set up SSO. This removes the need to sign in through a GitHub or GitLab account if you don't have one. + + +## Semgrep AppSec Platform + +Semgrep AppSec Platform is used to manage all Semgrep products, and it is where you can: + +- View and manage your Semgrep findings. +- Customize how Semgrep scans your code. +- Manage the users associated with your Semgrep organization. +- Set up alerts and notifications, including Slack alerts, emails, and pull request or merge request comments pushed to your source code manager + +## Initial sign in to Semgrep AppSec Platform + +The following steps walk you through creating a **user account** and your first **organization**: + + + + +To sign in using your GitHub account: + + + +Navigate to the [ Semgrep AppSec Platform login page](https://semgrep.dev/login/) and click **Sign in with GitHub**. + + +Click **Authorize semgrep-app** to [grant Semgrep the needed permissions](/deployment/checklist#permissions) and proceed. 
+ + +Enter an **organization name** when prompted then click **Create new organization**. This organization name is typically the name of the org in GitHub that you want to connect Semgrep to. For individual users, this can also be a personal account. + + +Either select a **scan environment** or click **Don't want to connect to anything yet?** + + ![Select a scan environment](/images/onboarding-scan-location-edb6c536ee437983c9b1cc9403a325d4.png) + + + +If you selected a **scan environment**: + + i. Follow the prompts to set up the scan. + + +If you clicked **Don't want to connect to anything yet**: + + i. Choose either **Skip setup** if you prefer not to scan anything yet or **See demo project** to view how Semgrep scans and presents findings from a demo `juice-shop` project. + + + + + +To sign in using your GitLab account: + + + +Navigate to the [ Semgrep AppSec Platform login page](https://semgrep.dev/login/) and click **Sign in with GitLab**. + + +Click **Authorize** to grant Semgrep the needed permissions and proceed. + + +Enter an **organization name** when prompted then click **Create new organization**. This organization name is typically the name of the org in GitLab that you want to connect Semgrep to. For individual users, this can also be a personal account. + + +Either select a **scan environment** or click **Don't want to connect to anything yet?** + + ![Select a scan environment](/images/onboarding-scan-location-edb6c536ee437983c9b1cc9403a325d4.png) + + + +If you selected a **scan environment**: + + i. Follow the prompts to set up the scan. + + +If you clicked **Don't want to connect to anything yet**: + + ii. Choose either **Skip setup** if you prefer not to scan anything yet or **See demo project** to view how Semgrep scans and presents findings from a demo `juice-shop` project. + + + + + +You have successfully created an account, your first organization, and have optionally run your first scan. 
+ +## Set up organizations + +Organizations (orgs) in Semgrep enable users to share access to, and management of, Semgrep resources such as findings and reports. + +Semgrep organizations can be **connected** to equivalent GitHub, GitLab, and SSO organizations, which enables users from those organizations to easily join your Semgrep deployment through their existing credentials. + +### Next steps for GitHub and GitLab users + +- Connect your Semgrep org to your GitHub or GitLab SCM. Refer to [ Connect a source code manager](/deployment/connect-scm) for steps. + +### Next steps for Bitbucket and Azure Repos users + +- Connect your Semgrep org to your Bitbucket Data Center project or your Azure DevOps project. Refer to [ Connect a source code manager](/deployment/connect-scm) for steps. +- To add members to your Semgrep organization, set up [ SSO authentication](/deployment/sso). +- You can also opt to scan a repository instead. + +## Appendices + + +**NOTE** + +These sections are helpful, but are not necessary to set up a deployment. + + +### How Semgrep organizations work + +Users can have more than one organization, and an organization can consist of one or many user accounts. Users must belong to at least one organization when they first sign in to Semgrep. + +Organizations can be as small as a single user in a department, or encompass whole companies. + +By default, orgs do not manage any authentication or repositories. You add resources and users to an org by connecting to an SCM or SSO, or setting up a Semgrep scan. + +Once you have connected to your SSO or SCM, any team member from your GitHub, GitLab, or SSO organization can sign in to Semgrep. This includes developers not part of your security team. To control which resources they are able to see or what policies they can change, configure their **role** through [ user access control features](/deployment/teams/overview). 
+ +### Create additional orgs + +After you create your first org, you can create multiple orgs to group related resources together: + + + +In Semgrep AppSec Platform, click the drop-down box with your organization name, located at the sidebar. + + +Click **Add org**. + + +Click **Create an organization**. + + +In the popup, provide an **Organization display name**. + + + +### Organization setup examples + +The following examples illustrate what a completed organizational set-up can look like. + +#### Single-user organization in GitLab + +- In this example, a single GitLab user, `john-doe`, has a Semgrep org account with the same name. +- He has set up his CI workflow to scan `repo-A` and `repo-B` in his GitLab account. The CI job sends scan results (findings) to Semgrep AppSec Platform. +- This is similar to a **personal account** in GitHub or GitLab. + + + ![A simple example of a single-user, single-org setup.](/images/personal-org-97a1b6929bdbfcc0b563fdb397a995a2.png) + + +#### Enterprise org with SSO and multiple orgs in GitHub + +In this example, a `parent-company` has multiple `subsidiaries`, and wants to use SSO for user authentication: + +- Each `subsidiary` is its own GitHub organization. +- The security team is responsible for all `subsidiaries` in `parent-company`. Thus, the security team is a part of all `subsidiaries`. +- The `parent-company` enforces SSO for all of its `subsidiaries`. +- Here, membership and repository scanning are separately managed by two different services. + +The Semgrep deployment could look like this: + +- Each GitHub org has a corresponding Semgrep org. +- The security team has configured SSO for each Semgrep org. + - This means that `team-member-R` can also access `subsidiary-1-org`. The resources they are able to view or change can be constrained through **roles**. 
+ + + ![A complex organization setup using SSO and multiple GitHub orgs.](/images/multiple-orgs-74e1ab656f772b1908062b0659e95051.png) + + +### Join an existing org + +Team members can join a Semgrep organization by logging in through the auth provider specified by your Semgrep admin: + + + +To join an existing org using your GitHub or GitLab credentials: + + + +Sign in to [ Semgrep AppSec Platform](https://semgrep.dev/login) with the account credentials specified by your admin. + + +Follow the on-screen prompts to grant Semgrep the needed permissions and proceed. This creates your **personal** Semgrep account. + + +Click the organization name displayed at the top of the **navigation bar** to expand the drop-down menu. + + +Click **Add org > Join an organization**. + + +Provide the name of the organization you'd like to join. Then, click **Join**. + + + + +To join an existing org through your SSO provider: + + + +Sign in to [ Semgrep AppSec Platform](https://semgrep.dev/login) with the account credentials specified by your admin. + + +You are automatically signed in to all organizations that your admin has set up for you. + + + + + + +**TIP** + +Semgrep admins can also [send developers invites to join their Semgrep org](/deployment/teams/manage#invite-a-user-through-email). + + +### Delete an existing org + +Reach out to [Support](/support) to delete an organization. diff --git a/mintlify-docs/deployment/customize-ci-jobs.mdx b/mintlify-docs/deployment/customize-ci-jobs.mdx new file mode 100644 index 0000000000..920660ec42 --- /dev/null +++ b/mintlify-docs/deployment/customize-ci-jobs.mdx @@ -0,0 +1,65 @@ +--- +title: "Customize your CI job" +--- + + +**YOUR DEPLOYMENT JOURNEY** + +- You have gained the necessary [resource access and permissions](/deployment/checklist) required for deployment. +- You have [created a Semgrep account and organization](/deployment/create-account-and-orgs). 
+- For GitHub and GitLab users: You have [connected your source code manager](/deployment/connect-scm). +- Optionally, you have [set up SSO](/deployment/sso). +- You have successfully added a [Semgrep job](/deployment/add-semgrep-to-ci) to your CI workflow. + + +Customize your CI job to achieve the following goals: + +* **Run Semgrep on a schedule**. Run full scans on main or trunk branches at the time that is least disruptive to developer teams. +* **Run Semgrep when an event triggers**. Run Semgrep when a pull request or merge request (PR or MR) is created. +* **Set a timeout to increase or decrease Semgrep's overall runtime.** If scans are taking too long, or rules aren't running, customize your per-rule timeout. + + +## Set up diff-aware scans + + +**INFO** + +Follow the steps in this section only for the following CI providers: + +- Jenkins +- CI providers without guidance from Semgrep AppSec Platform + + +Some Semgrep CI jobs require manual configuration of diff-aware scans, which scan pull requests or merge requests in feature branches. For the CI providers outlined in the preceding list, you can configure a diff-aware job by performing the following steps: + +1. Create a separate CI job following the steps in [Add Semgrep to CI through Semgrep AppSec Platform](/deployment/add-semgrep-to-ci/#add-semgrep-to-ci-1). +1. Set the `SEMGREP_BASELINE_REF` variable in your CI configuration file. The value of this environment variable is typically your trunk branch, such as `main` or `master`. + +## Set a scan schedule + +The following table is a summary of methods and resources to set up schedules for different CI providers. 
+ +| CI provider | Where to set schedule | +| :--- | :--- | +| GitHub Actions | See [Sample CI configs](/semgrep-ci/sample-ci-configs#sample-github-actions-configuration-file) for information on how to modify your `semgrep.yml` file | +| GitLab CI/CD | Refer to [GitLab documentation](https://docs.gitlab.com/ee/ci/pipelines/schedules.html) | +| Jenkins | Refer to [Jenkins documentation](https://www.jenkins.io/doc/book/pipeline/running-pipelines/#scheduling-jobs-in-jenkins) | +| Bitbucket Pipelines | Refer to [Bitbucket documentation](https://support.atlassian.com/bitbucket-cloud/pipeline-triggers/) | +| CircleCI | Refer to [CircleCI documentation](https://circleci.com/scheduled-pipelines#get-started-with-scheduled-pipelines-in-circleci) | +| Buildkite | Refer to [Buildkite documentation](https://buildkite.com/pipelines/scheduled-builds) | +| Azure Pipelines | Refer to [Azure documentation](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/scheduled-triggers?view=azure-devops&tabs=yaml) | +| Semaphore | Refer to [Semaphore documentation](https://docs.semaphore.io/using-semaphore/tasks) | + +## Set a custom timeout + +By default, Semgrep spends a maximum of **5 seconds** to scan with **each rule** on each targeted file. To **set a different timeout**, set the `SEMGREP_TIMEOUT` environment variable (the value is in seconds). Decreasing this value speeds up your scans, but with the possibility of skipping some rules. Alternatively, increasing this value ensures that your most complex rules finish running. For example: + +```bash +SEMGREP_TIMEOUT="3" # Sets the per-rule timeout to 3 seconds. +``` + + +**CAUTION** + +Setting this variable to **0** removes the time limit, meaning that rules can take any amount of time to run. This is not recommended. 
+ diff --git a/mintlify-docs/deployment/local-to-scp-scans.mdx b/mintlify-docs/deployment/local-to-scp-scans.mdx new file mode 100644 index 0000000000..738dcd2bd3 --- /dev/null +++ b/mintlify-docs/deployment/local-to-scp-scans.mdx @@ -0,0 +1,138 @@ +--- +title: "Scan local repositories and upload findings" +sidebarTitle: "Upload local scan findings" +--- + + +You can send findings (scan results) from a local repository to Semgrep AppSec Platform. The local repository is a separate **project** from its remote counterpart. This is useful for testing rules and policies, or simply scanning your own work before it is merged to your organization's trunk branch. + +## Prerequisites + +- Locally installed `semgrep`. + +## Best practices + +You can keep your local scans private and separate from your team by creating a Semgrep organization with only a single user. This is a **personal** org, similar to a personal account in your source code manager (SCM). This separation ensures that your findings data does not affect organizational records and trends. + +To create an org, perform the steps in [Create additional orgs](/deployment/create-account-and-orgs#create-additional-orgs). You don't need to perform any other steps. + +## Send findings from local repository scan to Semgrep AppSec Platform + + + +Ensure that you are signed into Semgrep AppSec Platform and you've switched to the org you want to send findings to. It is recommended to send local repository findings to your **personal** org. + + +In your CLI, log in to Semgrep: + +```bash +semgrep login +``` + + +Click the login URL provided, or copy and paste it into your browser's address bar. You are taken to your web browser to complete the login process. + + +Follow any additional steps.
+ + +After logging in, start a scan in your CLI: + +```bash +semgrep ci +``` + + + +## Project separation between local and remote repositories + +The project slug for a **remote repository** takes the form `ACCOUNT-NAME/REPOSITORY-NAME`. + +The project slug for a **local repository** takes the form `local_scan/REPOSITORY-NAME`. + +- **For personal orgs:** A local repository scan does **not** overwrite the findings records of its remote counterpart. They are two separate projects. Personal orgs only have one team member or user: you. +- **For organization orgs**: A local repository scan does **not** overwrite findings records of its remote counterpart. However, if two members have both cloned the same repository, such as `RepoA`, and both send local `RepoA` findings, one set of findings may unintentionally overwrite the other. This is because orgs can have more than one team member, but all local scans are sent to the same project slug. + +## Link local scans to their remote repositories + +When sending findings from local repositories to Semgrep AppSec Platform, the links shown on the **Findings** page are not generated correctly. They may be missing, or they may not link to the correct file. This is because the scan was performed on your local repository, not the remote one. + +You can optionally set up cross-linking between local and remote repositories to create the correct hyperlinks. To do so, set up environment variables through the CLI: + + +Navigate to the root of your repository. + + +Create the `SEMGREP_REPO_URL` variable, setting it to the URL you'd use to access your online repository: + +```bash +export SEMGREP_REPO_URL=URL_ADDRESS +``` + + + +Create the `SEMGREP_BRANCH` variable: + + i. Retrieve the branch name: + + ```bash + git rev-parse --abbrev-ref HEAD + ``` + ii.
Set the variable as shown, making sure that you replace the `BRANCH_NAME` placeholder: + + ```bash + export SEMGREP_BRANCH=BRANCH_NAME + ``` + + +Create the `SEMGREP_REPO_NAME` variable, setting it to the name of your repository: + +```bash +export SEMGREP_REPO_NAME=REPO_NAME +``` + + +Create the `SEMGREP_COMMIT` variable: + + i. Retrieve the commit hash: + + ```bash + git log -n 1 + ``` + + ii. Set the variable by entering the text below, substituting `COMMIT_HASH` with the value from the previous step. + + ```bash + export SEMGREP_COMMIT=COMMIT_HASH + ``` + + + +After performing these steps, rescan your repository to correctly generate links in Semgrep AppSec Platform. + + +### Sample values + +The following is an example of the variables you'd need to create to generate links in Semgrep AppSec Platform, along with sample values: + +``` +# Set the repository URL +export SEMGREP_REPO_URL=https://github.com/corporation/s_juiceshop + +# Set the repository name +export SEMGREP_REPO_NAME=corporation/s_juiceshop + +# Retrieve the branch +git rev-parse --abbrev-ref HEAD +s_update + +# Set the branch +export SEMGREP_BRANCH=s_update + +# Retrieve the commit hash +git log -n 1 +commit fa4e36b9369e5b039bh2220b5h9R61a38b077f29 (HEAD -> s_juiceshop, origin/main, origin/HEAD, master) + +# Set the commit hash +export SEMGREP_COMMIT=fa4e36b9369e5b039bh2220b5h9R61a38b077f29 +``` \ No newline at end of file diff --git a/mintlify-docs/deployment/manage-projects.mdx b/mintlify-docs/deployment/manage-projects.mdx new file mode 100644 index 0000000000..c5660d6b9d --- /dev/null +++ b/mintlify-docs/deployment/manage-projects.mdx @@ -0,0 +1,145 @@ +--- +title: "Manage projects" +description: "View, sort, and tag your projects through the **Projects** page. Refer to this page to manage and troubleshoot thousands of repositories by identifying scan issues or scans with a high number of findings." 
+--- + + +**WHAT IS A PROJECT?** + +A **project** is a repository, or part of a repository, that you scan through Semgrep AppSec Platform, either using CI or Semgrep Managed Scans. This also includes local CLI scans whose results you have sent for viewing on Semgrep AppSec Platform. A project's scans can be viewed on the **Project details** page, and its findings can be viewed on the individual Semgrep products' **Findings** pages. + + +The **Projects** page features two tabs: + +- The **Scanning** tab lists all projects that have been provisioned or scanned by Semgrep, regardless of whether the project is actively being scanned. If the project's repository has been archived in the source code manager, it is listed under **Not scanning**. +- The **Not scanning** tab lists projects that are associated with [source code manager (SCM) connections that you've added](/deployment/connect-scm), but these projects aren't actively being scanned by Semgrep. The **Not scanning** tab also lists projects where you've archived the corresponding GitHub repositories. + + + +## Sort projects + +View all projects by navigating to [Semgrep AppSec Platform](https://semgrep.dev/login) and clicking **Projects**. + +To sort projects, click the attribute you want to sort by on the header row. You can only sort by one attribute. + +Sort by the following attributes: + +- **Project**: Click to toggle between sorting project names alphabetically in ascending or descending order. +- **Last scan**: Click to toggle between sorting the projects' latest scans in ascending or descending order. The sorting is based on when the last scan **started**, regardless of its status. For this reason, you may see that scans with statuses such as **Not started** or **Never finished** are not necessarily grouped together. + +## Filter a project's scans + + + +Navigate to the **Projects** section in [Semgrep AppSec Platform](https://semgrep.dev/login).
+ + +Click the name of the project of interest to open its **Project details** page. + + +The following filters are available in the first column: +- **Time period**: 7 days or 1 month +- **Scan type**: Full or diff-aware scans +- **Status**: Running, completed, error, or never finished +- **Products**: Code, Supply Chain, Secrets, or AI +- **Duration**: The amount of time the scan took to complete in hours or minutes + + + + +**NOTE** + +Scan details, such as logs, are available for scans run in the past **1 month**. Semgrep AppSec Platform does not display scan details older than 30 days, since this introduces performance issues due to the increased volume of stored scan data. + + +## Run scans in bulk + +You can scan multiple projects at once from the **Projects** page. This is useful when you want to rescan multiple projects after changing your ruleset or configuration. + +To run scans in bulk, select all the projects of interest and click **Scan**. + +## Scan details and logs + +To view the latest scan's details from the **Projects** page: + + + +Hover over the project's latest scan status. This displays the **drawer icon**. + + +Click the **drawer icon** to view the scan details drawer. This drawer displays both an **overview** of the scan and **CI or Managed Scan logs**. Local scans do not have a **Logs** tab. + + + +### Permalinks to scan details + +You can link to a specific scan's details to share with collaborators or for troubleshooting. Click the **link icon** on the header to copy the permalink. + +## Project details page + +Each project listed on the **Projects** page has its own **Project details** page, which you can access by clicking the **window icon** under the **Details** column. The **Project details** page is where you can filter scans, configure settings, and view detailed logs for each scan that has been run. Use the **Project details** page to: + +- View trends over time, such as longer or shorter scan durations.
+- Share information when troubleshooting scans through the **Scans** tab. +- Update a project's tags, primary branch, and path ignores through the **Settings** tab. + +Additionally, the Semgrep API allows you to filter tags for use in additional workflows and integrations within your own systems. Create tags based on engineering or department teams, external-facing or internal codebases, and so on. See [Tags](/semgrep-appsec-platform/tags) for more information. + +### Configure project settings + +You can configure a project's settings by going to the **Project details** page and clicking the **Settings** tab. + +See the following pages for more information: + +- [Configure Semgrep AppSec Platform to ignore specific file paths](/ignoring-files-folders-code). +- For Semgrep Managed Scans users: [configure your scans](/deployment/managed-scanning/overview). +- [Set a primary branch](/deployment/primary-branch). +- [Set tags](/semgrep-appsec-platform/tags). + +## Delete a project + +Deleting a project removes all of its findings, metadata, and other records from Semgrep AppSec Platform. + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that project. + + +Click the **three-dot (...) button** at the header and click **Delete project**. + + + +To delete an archived project: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Switch to the **Not scanning** tab of the **Projects** page. + + +Select the checkbox to **Show archived** projects. + + +Search for the archived repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + + +**INFO** + +It can take up to a day **(24 hours)** for the [Dashboard](/semgrep-appsec-platform/dashboard) to correctly update and remove findings associated with a recently deleted project.
+ diff --git a/mintlify-docs/deployment/managed-scanning/azure.mdx b/mintlify-docs/deployment/managed-scanning/azure.mdx new file mode 100644 index 0000000000..db1daf958a --- /dev/null +++ b/mintlify-docs/deployment/managed-scanning/azure.mdx @@ -0,0 +1,382 @@ +--- +title: "Add an Azure DevOps repository to Semgrep Managed Scans" +sidebarTitle: "Azure DevOps" +--- + +Add Azure DevOps repositories to your Semgrep organization in bulk without adding or changing your existing CI workflows through **Managed Scans**. + +## Prerequisites and permissions + +- Semgrep Managed Scans require repositories hosted by Azure DevOps Services. Azure DevOps Server is not supported. +- Semgrep recommends setting up and configuring Semgrep Managed Scans with an Azure DevOps service account, not a personal account. Regardless of whether you use a personal or service account, the account must be assigned the **Owner** or **Project Collection Administrator** role for the organization. +- During setup and configuration, you must provide a personal access token generated by the account. This token must be authorized with **Full access**. + - Once you have Managed Scans fully configured, you can add restrictions to the token provided to Semgrep. The scopes you must assign to the token include: + - `Code: Read` + - `Code: Status` + - `Member Entitlement Management: Read` + - `Project and Team: Read & write` + - `Pull Request Threads: Read & write` + +## Enable Managed Scans and scan your first repository + + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click **Scan new project > Semgrep Managed Scan**. + + +Select **Azure Devops** as your source code manager. + + +On the **Add to Azure DevOps Pipeline** page, provide the following information:

+ i. Your **Access token**. See [User personal access tokens](https://learn.microsoft.com/en-us/azure/devops/organizations/accounts/use-personal-access-tokens-to-authenticate) for token generation information. Ensure you set the Azure DevOps SCM name to `organization_name/project_name`.

+ ii. The name of your **Azure DevOps Project**. +
+ +Click **Connect** to proceed. + +
+ +You have finished setting up a Semgrep managed scan. Click **Back to Managed Scans** to see your projects. + +- After enabling Managed Scans, Semgrep performs a full scan on all the repositories in batches. +- Once a repository has been added to Semgrep AppSec Platform, it becomes a **project**. A Semgrep AppSec Platform project includes all the repository's findings, history, and scan metadata. +- Projects with a Managed Scan configuration are tagged with `managed-scan`, regardless of whether the project is actively being scanned by Semgrep Managed Scans or not. The **Projects** list also contains pending scans and scans that never started. + +## Add additional Azure DevOps projects + +You can enable Semgrep Managed Scans for additional repositories after onboarding using the following steps: + + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click **Scan new project > Semgrep Managed Scan**. + + +On the **Enable Managed Scans for repos** page, select the repositories you want to add to Semgrep Managed Scans.

+ i. Optional: If you don't see the repository you want to add, click **Sync projects**. +
+ +Select the repositories you want to scan from the list. + + +Click **Enable Managed Scans**. The **Enable Managed Scans** dialog appears. By default, Semgrep runs both full and diff-aware scans. + + +Optional: Disable PR or MR diff-aware scans by turning off the **Enable PR/MR scans** toggle. + + +Click **Enable**. + +
+ +### If the page doesn't display any repositories + + + +In Semgrep AppSec Platform, click **Projects**. + + +If the page doesn't display the repository you want to add, click **Sync projects**. + + +If the page doesn't display any repositories, click **Sync projects**. + + +Optional: Perform a hard refresh (Ctrl+F5 or Cmd+Shift+R). + + + +### Convert or migrate an existing Semgrep CI job + +You can immediately add any existing project to Managed Scans. + + + +Follow the steps in [Add additional Azure DevOps projects](#add-additional-azure-devops-projects). + + +Delete the existing pipeline configuration file in your repository if appropriate. + + + +If you plan to continue running some scans in Azure DevOps Pipelines (for example, using Managed Scans to run weekly full scans but Pipelines for diff-aware scans) you can leave the workflow file in place, and edit it to reflect your desired configuration. + + +**TIP** + +Semgrep preserves your findings, scans, and triage history. + + +## Scan management and configuration + +### Manually run a full scan + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **gear icon** to access the settings page for that repository. + + +Click **Run a new scan > Rule-based detection**. + + + +> You can manually run a full scan for both primary and non-primary branches. + +### Re-run a failed scan or a scan that never finished + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click on the project name. + + +Find the scan that failed or never finished using the **Status** column, and click **Details** to open the **Scan logs** dialog. + + +Ensure that you're on the **Overview** tab of the **Scan logs** dialog, then click **Retry scan**. + + + +### Disable diff-aware scans on PRs + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. 
+ + +Click the toggle for diff-aware scans. + + + +### Delete a project + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + +To delete an archived project: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Switch to the **Not Scanning** tab of the **Projects** page. + + +Select the checkbox to **Show archived** projects. + + +Search for the archived repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + + +### Configure fail open to prevent diff-aware scans from blocking pull requests and merge requests + +By default, diff-aware managed scans are set to **fail open** if a scan errors out or takes too long. This means that diff-aware scans are marked as successful on the pull request (PR) or merge request (MR), even if they haven't completed after the specified timeout, allowing you to make the Semgrep status check required in your source code manager (SCM) while not blocking someone from merging a PR or MR if the check encounters an unexpected issue or takes too long. + + +![Sample pull request showing the status of a diff-aware scan.](/images/pr-status-check-06f80c71ec387d84294255bdbfcdc25e.png) + + +#### How fail open works + + + +If enabled, the fail open feature is triggered whenever you open a PR or MR. + + +Initially, Semgrep sends an update to mark the PR or MR as `pending`. + + +Once the diff-aware scan begins, the PR or MR is updated to a status of `running`. + + +The diff-aware scan completes, and the PR or MR is updated to a status of `succeeded` or `failed`. 
+ + +If the diff-aware scan is in `pending` or `running` status longer than the configured timeout, then the fail open process updates the PR or MR to display a status of `succeeded`. This prevents the Semgrep scan from blocking the developer from merging their changes. + + + +If Semgrep marks a PR or MR as `succeeded`, you can merge the PR or MR without waiting for the diff-aware scan to complete. However, if the PR or MR is still open and the scan completes *after* the fail open timeout is reached, Semgrep can still report the findings and mark the status as `failed`. + +#### Configure fail open + +By default, fail open is enabled. However, you can disable this feature and adjust the timeout value: + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Go to **Settings > General > Managed Scans**. + + +Click the **Fail open** toggle to turn off this feature. + + +Set the **Timeout** value in minutes. The default value is **10 minutes**, the minimum value is **1 minute**, and the maximum value is **60 minutes**. + + ![Semgrep AppSec Platform settings page with fail open configuration options.](/images/fail-open-config-ccc1c4b81d5710e419b725b1a1d193e0.png) + + + + +## Disable webhooks + +Managed Scans of Azure DevOps projects require webhooks. The webhooks are enabled by default when you add Azure DevOps as a source code manager when setting up Managed Scans. Webhooks are required for diff-aware scans and triaging by PR or MR comments. + +You can turn off webhooks at any time by following these steps: + + + +In Semgrep AppSec Platform, go to [Settings > Source code managers](https://semgrep.dev/orgs/-/settings/source-code). + + +Find your Azure DevOps connection, and click the toggle to turn off **Incoming webhooks**. + + + +## Revoke Semgrep's access to your repositories + +The following steps revoke the code access you previously granted Semgrep for all repositories you selected. 
+ + + +In Semgrep AppSec Platform, click **Settings > Source Code Managers**. + + +Find the Azure DevOps entry on the list of **Source code managers** and click **Remove**. + + +Click **Remove** to confirm. + + + +## Turn off Managed Scans for specific repositories in Semgrep AppSec Platform + + + +Sign in to Semgrep AppSec Platform. + + +Go to Projects and find the project you no longer want scanned with Semgrep Managed Scanning. Click the project's **Details** page > **Settings** tab. + + +Toggle the switch for **Managed diff scans** to turn off scans of new pull requests and merge requests and **Managed full scans** to turn off full scans of the base branch. + + ![Semgrep AppSec Platform toggles to turn off managed scans of repositories](/images/turn-off-sms-4b1b14b41b9fb0f8b3f2e9bffc2d96e8.png) + + + +## Enable status checks + +To protect branches whose repositories are automatically scanned by Semgrep, enable Azure DevOps status checks: + + + +Sign in to Azure DevOps and navigate to the Azure DevOps project you've connected to Semgrep. + + +Go to **Repos > Branches**. + + +Find the branch to which the status check should be applied, and click the three vertical dots to open up the **More options** dialog. + + +Select **Branch policies**. + + +Ensure that the branch to which you want the status check applied is selected. Navigate to **Status Checks**, and click the **Add +** button to proceed. + + ![Configure status checks for a branch in Azure DevOps](/images/ado-status-checks-setup-7b2a3ba57922077d629fca96080ad7a7.png) + + + +In the dialog that appears:

+ i. Leave the **Status to check** box blank, since this value is auto-populated as you provide values in subsequent steps.

+ ii. Select the **Enter genre/name separately** box. Provide the following values:

+ a. **Genre**: `security`

+ b. **Name**: `semgrep-cloud-platform/scan` + + Once you provide the **Genre** and **Name**, Azure DevOps auto-populates **Status to check**.

+ iii. Choose whether the status check needs to succeed or not to complete pull requests. Selecting **Required** means that a status of `succeeded` is necessary to complete pull requests. Selecting **Optional** means that a status of `failed` will not block the completion of pull requests. + + ![Add status policy dialog in Azure DevOps.](/images/ado-add-status-policy-5c20588b6ec763758c00bf97fe67c5c0.png) + +
+ +Click **Save** to proceed. + +
+ +At this point, all subsequent pull requests opened against this branch are subject to the status check you created. + + +![PR notification after a status check passes.](/images/ado-status-checks-ccb36f40dfe28484c85cd7b9944e1b53.png) + + +See [Configure a branch policy for an external service](https://learn.microsoft.com/en-us/azure/devops/repos/git/pr-status-policy?view=azure-devops) for additional information about status checks. + +## Troubleshooting: multiple projects + +If you currently scan Azure DevOps repositories in your CI pipeline, you may see findings assigned to two separate projects once you enable Semgrep Managed Scans. For example, findings from Managed Scans go to the `semgrep/frontend/webpage` project, while findings from CI scans go to the `frontend/webpage` project. If this is the case, Semgrep AppSec Platform flags these findings with **Possible duplicate**. Please [contact support](/support) for additional assistance. + +## Appendices + + +### Scan logs + +To view your scan logs in Semgrep AppSec Platform, go to **Projects**, then click on the project name. The projects in the list are sorted by scan date, with the most recent scans listed first. + + +**INFO** + +It can take a few minutes for your latest scan logs to appear. However, if the logs do not update 15 minutes after the scan, there may be issues with the scan itself. + + +### Scan statistics + +**Scan statistics**, such as how many of your repositories are being scanned, the scan success rate, and so on, can be provided once a week upon request. Contact your Semgrep account manager to request scan statistics.
+ + \ No newline at end of file diff --git a/mintlify-docs/deployment/managed-scanning/bitbucket.mdx b/mintlify-docs/deployment/managed-scanning/bitbucket.mdx new file mode 100644 index 0000000000..303b27566a --- /dev/null +++ b/mintlify-docs/deployment/managed-scanning/bitbucket.mdx @@ -0,0 +1,347 @@ +--- +title: "Add a Bitbucket repository to Semgrep Managed Scans" +sidebarTitle: "Bitbucket" +description: "Add Bitbucket repositories to your Semgrep organization in bulk without adding or changing your existing CI workflows through **Managed Scans**." +--- + +## Prerequisites and permissions + +Semgrep Managed Scans require one of the following plans: + +- Bitbucket Cloud Premium +- Bitbucket Data Center (v8.8 or above for diff-aware scans) + +### Bitbucket Cloud + +You must provide a Bitbucket [workspace access token](https://support.atlassian.com/bitbucket-cloud/workspace-access-tokens/) to Semgrep, which can be created by a user with the `Product Admin` role. Once you have Semgrep Managed Scans fully configured, you can update the token provided to Semgrep to one that's more restrictive. The scopes you must assign to the token include: + +- `webhook (read and write)` +- `repository (read and write)` +- `pullrequest (read and write)` +- `project (admin)` +- `account (read)` + +Webhook permissions are required to support diff-aware scans. + +### Bitbucket Data Center + +You must provide a Bitbucket [HTTP access token](https://confluence.atlassian.com/bitbucketserver/http-access-tokens-939515499.html) to Semgrep, which can be created by a user with the `Project Admin` role. This access token must be created with `PROJECT_ADMIN` permissions. + +Project-level webhooks are required to support diff-aware scans. + +## Enable Semgrep Managed Scans and scan your first repository + + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click **Scan new project > Semgrep Managed Scan**. + + +Click **Manage Connections** and then **+ Connect more**. 
+ + +Select **Bitbucket**. + + +In the **Set up Managed Scans** page that appears, provide the information needed by Semgrep to connect to your Bitbucket project:

+ i. Select **Bitbucket** or **Bitbucket Data Center**.

+ ii. Provide your **Access token**.

+ iii. Provide the name of your **Bitbucket workspace**.

+ iv. *For Bitbucket Data Center users only*: provide the **Bitbucket Data Center URL**.

+ v. Click **Connect**. +
+ +Repeat the steps above for each additional Bitbucket workspace you'd like added to Semgrep. + +
+ + +You have successfully set up Managed Scans for your workspace or project. + +- After enabling Managed Scans, Semgrep performs a full scan in batches on all the repositories in the workspace. +- Once a repository has been added to Semgrep AppSec Platform, it becomes a **project**. A project in Semgrep AppSec Platform includes all the findings, history, and scan metadata of that repository. +- Projects with a Managed Scan configuration are tagged with `managed-scan`, regardless of whether the project is actively being scanned by Semgrep Managed Scans or not. The **Projects** list also contains pending scans and scans that never started. + +## Add additional Bitbucket projects + +You can enable Managed Scans for additional repositories after onboarding using the following steps: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click **Scan new project > Semgrep Managed Scan**. + + +In the **Enable Managed Scans for repos** page, select the repositories you want to add to Semgrep Managed Scans.

+ i. Optional: If you don't see the repository you want to add, click **Can't find your project?** and follow the troubleshooting steps provided. +
+ +Select the repositories you want to scan from the list. + + +Click **Enable Managed Scans**. The **Enable Managed Scans** dialog appears. By default, Semgrep runs both full and diff-aware scans. + + +Optional: Disable PR or MR diff-aware scans by turning off the **Enable PR/MR scans** toggle. + + +Click **Enable**. + +
+ +### If the page doesn't display any repositories + + + +Ensure that you've connected your Bitbucket account by following the steps in [Connect a source code manager](/deployment/connect-scm), and confirm that the workspace access token was created by a user with the `Product Admin` role and has the required scopes listed above. + + +In Semgrep AppSec Platform, click **Projects**. + + +If the page doesn't display the repository you want to add, click **Can't find your project? > Sync projects**. + + +If the page doesn't display any repositories, click **Sync projects**. + + +Optional: Perform a hard refresh (Ctrl+F5 or Cmd+Shift+R). + + + +### Convert or migrate an existing Semgrep CI job + +You can immediately add any existing project to Managed Scans. + + + +Follow the steps in [Enable Semgrep Managed Scans](#enable-semgrep-managed-scans-and-scan-your-first-repository). + + +Delete the `bitbucket-pipelines.yml` file in your Bitbucket repository if appropriate. + + + +If you plan to continue running some scans in Bitbucket Pipelines (for example, using Managed Scans to run weekly full scans but Bitbucket Pipelines for diff-aware scans) you can leave the workflow file in place, and edit it to reflect your desired configuration. + + +**TIP** + +Semgrep preserves your findings, scans, and triage history. + + +## Scan management and configuration + +### Manually run a full scan + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **gear icon** to access the settings page for that repository. + + +Click **Run a new scan > Rule-based detection**. + + + +> You can manually run a full scan for both primary and non-primary branches. + +### Re-run a failed scan or a scan that never finished + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click on the project name. + + +Find the scan that failed or never finished using the **Status** column, and click **Details** to open the **Scan logs** dialog.
+ + +Ensure that you're on the **Overview** tab of the **Scan logs** dialog, then click **Retry scan**. + + + +### Disable diff-aware scans on PRs + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the toggle for diff-aware scans. + + + +### Delete a project + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + +To delete an archived project: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Switch to the **Not Scanning** tab of the **Projects** page. + + +Select the checkbox to **Show archived** projects. + + +Search for the archived repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + + +### Configure fail open to prevent diff-aware scans from blocking pull requests and merge requests + +By default, diff-aware managed scans are set to **fail open** if a scan errors out or takes too long. This means that diff-aware scans are marked as successful on the pull request (PR) or merge request (MR), even if they haven't completed after the specified timeout, allowing you to make the Semgrep status check required in your source code manager (SCM) while not blocking someone from merging a PR or MR if the check encounters an unexpected issue or takes too long. + + +![Sample pull request showing the status of a diff-aware scan.](/images/pr-status-check-06f80c71ec387d84294255bdbfcdc25e.png) + + +#### How fail open works + + + +If enabled, the fail open feature is triggered whenever you open a PR or MR. 
+ + +Initially, Semgrep sends an update to mark the PR or MR as `pending`. + + +Once the diff-aware scan begins, the PR or MR is updated to a status of `running`. + + +The diff-aware scan completes, and the PR or MR is updated to a status of `succeeded` or `failed`. + + +If the diff-aware scan is in `pending` or `running` status longer than the configured timeout, then the fail open process updates the PR or MR to display a status of `succeeded`. This prevents the Semgrep scan from blocking the developer from merging their changes. + + + +If Semgrep marks a PR or MR as `succeeded`, you can merge the PR or MR without waiting for the diff-aware scan to complete. However, if the PR or MR is still open and the scan completes *after* the fail open timeout is reached, Semgrep can still report the findings and mark the status as `failed`. + +#### Configure fail open + +By default, fail open is enabled. However, you can disable this feature and adjust the timeout value: + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Go to **Settings > General > Managed Scans**. + + +Click the **Fail open** toggle to turn off this feature. + + +Set the **Timeout** value in minutes. The default value is **10 minutes**, the minimum value is **1 minute**, and the maximum value is **60 minutes**. + + ![Semgrep AppSec Platform settings page with fail open configuration options.](/images/fail-open-config-ccc1c4b81d5710e419b725b1a1d193e0.png) + + + + +## Disable webhooks + +Performing diff-aware Managed Scans of Bitbucket projects requires webhooks to be enabled. Webhooks are enabled by default when you add Bitbucket as a source code manager when setting up Semgrep Managed Scans. You can disable webhooks at any time by following these steps: + + + +In Semgrep AppSec Platform, go to [Settings > Source code managers](https://semgrep.dev/orgs/-/settings/source-code). + + +Find your Bitbucket connection, and click the toggle to disable **Incoming webhooks**. 
+ + + +This will stop any diff-aware scans of your projects. + +## Revoke Semgrep's access to your repositories + +The following steps revoke the code access you previously granted Semgrep for all repositories you selected. + + + +In Semgrep AppSec Platform, click **Settings > Source Code Managers**. + + +On the entry of the SCM you want to remove, click **Remove app**. + + +Click **Remove** to confirm. + + + +## Turn off Managed Scans for specific repositories in Semgrep AppSec Platform + + + +Sign in to Semgrep AppSec Platform. + + +Go to Projects and find the project you no longer want scanned with Semgrep Managed Scanning. Click the project's **Details** page > **Settings** tab. + + +Toggle the switch for **Managed diff scans** to turn off scans of new pull requests and merge requests and **Managed full scans** to turn off full scans of the base branch. + + ![Semgrep AppSec Platform toggles to turn off managed scans of repositories](/images/turn-off-sms-4b1b14b41b9fb0f8b3f2e9bffc2d96e8.png) + + + +## Appendices + +### Scan logs + +To view your scan logs in Semgrep AppSec Platform, go to **Projects**, then click on the project name. The projects in the list are sorted by scan date, with the most recent scans listed first. + + +**INFO** + +It can take a few minutes for your latest scan logs to appear. However, if the logs do not update 15 minutes after the scan, there may be issues with the scan itself. + + +### Scan statistics + +**Scan statistics**, such as how many of your repositories are being scanned, the scan success rate, and so on, can be provided once a week upon request. Contact your Semgrep account manager to request scan statistics. 
diff --git a/mintlify-docs/deployment/managed-scanning/github.mdx b/mintlify-docs/deployment/managed-scanning/github.mdx new file mode 100644 index 0000000000..1b10f928ba --- /dev/null +++ b/mintlify-docs/deployment/managed-scanning/github.mdx @@ -0,0 +1,364 @@ +--- +title: "Add a GitHub repository to Semgrep Managed Scans" +sidebarTitle: "GitHub" +description: "Add GitHub repositories to your Semgrep organization in bulk without adding or changing your existing CI workflows through **Managed Scans**. " +--- + + +## Permissions + +To add a repository, you must install the public Semgrep GitHub app and create and install a private Semgrep GitHub App. + +- The public Semgrep GitHub app is required to easily add members of your GitHub org to your Semgrep org. +- The private Semgrep GitHub app is required to enable code access for Managed Scans. + +If you haven't completed the installation of public and private Semgrep GitHub apps, Semgrep prompts you to do so when adding a repository. + +See [Pre-deployment checklist > Permissions](/deployment/checklist#permissions) for more information about the permissions used by Semgrep. + +## Add a repository + + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click **Scan new project > Semgrep Managed Scan**. + + +If you haven't completed the installation of public and private Semgrep GitHub apps, you are redirected to the **Set up Managed Scans** page, which facilitates the creation of both.

+ i. Follow the steps on the page to create and register both a public and private Semgrep GitHub app. +
+ +In the **Enable Managed Scans for repos** page, select the repositories you want to add to Semgrep Managed Scans.

+ i. Optional: If you don't see the repository you want to add, click **Can't find your project?** and follow the troubleshooting steps provided. +
+ +Select the repositories you want to scan from the list. + + +Click **Enable Managed Scans**. The **Enable Managed Scans** dialog appears. By default, Semgrep runs both full and diff-aware scans. + + +Optional: Disable PR or MR diff-aware scans by turning off the **Enable PR/MR scans** toggle. + + +Click **Enable**. + + +If you use the **Semgrep Network Broker**, you must edit your Broker configuration file; refer to [Use Semgrep Network Broker with Managed Scans](/semgrep-ci/network-broker#use-semgrep-network-broker-with-managed-scans). + +
+ + +You have finished setting up a Semgrep managed scan. + +- After enabling Managed Scans, Semgrep performs a full scan in batches on all the repositories. +- Once a repository has been added to Semgrep AppSec Platform, it becomes a **project**. A project in Semgrep AppSec Platform includes all the findings, history, and scan metadata of that repository. +- Projects with a Managed Scan configuration are tagged with `managed-scan`, regardless of whether the project is actively being scanned by Semgrep Managed Scans or not. The **Projects** list also contains pending scans and scans that never started. + +### Troubleshoot your Semgrep GitHub app installation + +A complete installation is displayed in the Source Code Manager entry as follows: + + +![GitHub entry with public and private GitHub app connection](/images/zcs-code-access-enabled-a492223d4f03823cdfcc88f42371197d.png) + +_**Figure**. **Semgrep AppSec Platform > Settings > Source Code Managers** displaying a completed Managed Scans setup._ + +You can also confirm a complete installation through your GitHub settings page, which should have two Semgrep apps: + + +![GitHub settings page](/images/zcs-github-apps-840f39a3c2638010a403c42f3b125ecd.png) + +_**Figure**. **GitHub > Settings > Applications** displaying both Semgrep apps. The private Semgrep app follows the convention **Semgrep Code - YOUR_ORG_NAME**_. + +### If the page doesn't display any repositories + + + +Ensure that **both** the private and public Semgrep GitHub apps have access to the repositories you want to scan by following the steps in [Permissions and synchronicity](#permissions-and-synchronicity). + + +In Semgrep AppSec Platform, click **Projects**. + + +If the page doesn't display the repository you want to add, click **Can't find your project? > Sync projects**. + + +If the page doesn't display any repositories, click **Sync projects**. + + +Optional: Perform a hard refresh (Ctrl+F5 or Cmd+Shift+R).
+ + + +Repositories must be accessible to both the public Semgrep GitHub app and the private Semgrep GitHub app. + +### Convert or migrate an existing Semgrep CI job + +You can immediately add any existing project to Managed Scans. + + + +Follow the steps in [Add a repository](#add-a-repository). + + +Delete the `/.github/workflows/semgrep.yml` file in your GitHub repository if appropriate. + + + +If you plan to continue running some scans in GitHub Actions (for example, using Managed Scans to run weekly full scans but GitHub Actions for diff-aware scans) you can leave the workflow file in place, and edit it to reflect your desired configuration. + + +**TIP** + +Semgrep preserves your findings, scans, and triage history. + + +## Scan management and configuration + +### Manually run a full scan + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **gear icon** to access the settings page for that repository. + + +Click **Run a new scan > Rule-based detection**. + + + +> You can manually run a full scan for both primary and non-primary branches. + +### Re-run a failed scan or a scan that never finished + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click on the project name. + + +Find the scan that failed or never finished using the **Status** column, and click **Details** to open the **Scan logs** dialog. + + +Ensure that you're on the **Overview** tab of the **Scan logs** dialog, then click **Retry scan**. + + + +### Disable diff-aware scans on PRs + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the toggle for diff-aware scans. + + + +### Delete a project + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. 
+ + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + +To delete an archived project: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Switch to the **Not Scanning** tab of the **Projects** page. + + +Select the checkbox to **Show archived** projects. + + +Search for the archived repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + + +### Configure fail open to prevent diff-aware scans from blocking pull requests and merge requests + +By default, diff-aware managed scans are set to **fail open** if a scan errors out or takes too long. This means that diff-aware scans are marked as successful on the pull request (PR) or merge request (MR), even if they haven't completed after the specified timeout, allowing you to make the Semgrep status check required in your source code manager (SCM) while not blocking someone from merging a PR or MR if the check encounters an unexpected issue or takes too long. + + +![Sample pull request showing the status of a diff-aware scan.](/images/pr-status-check-06f80c71ec387d84294255bdbfcdc25e.png) + + +#### How fail open works + + + +If enabled, the fail open feature is triggered whenever you open a PR or MR. + + +Initially, Semgrep sends an update to mark the PR or MR as `pending`. + + +Once the diff-aware scan begins, the PR or MR is updated to a status of `running`. + + +The diff-aware scan completes, and the PR or MR is updated to a status of `succeeded` or `failed`. + + +If the diff-aware scan is in `pending` or `running` status longer than the configured timeout, then the fail open process updates the PR or MR to display a status of `succeeded`. This prevents the Semgrep scan from blocking the developer from merging their changes. 
+ + + +If Semgrep marks a PR or MR as `succeeded`, you can merge the PR or MR without waiting for the diff-aware scan to complete. However, if the PR or MR is still open and the scan completes *after* the fail open timeout is reached, Semgrep can still report the findings and mark the status as `failed`. + +#### Configure fail open + +By default, fail open is enabled. However, you can disable this feature and adjust the timeout value: + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Go to **Settings > General > Managed Scans**. + + +Click the **Fail open** toggle to turn off this feature. + + +Set the **Timeout** value in minutes. The default value is **10 minutes**, the minimum value is **1 minute**, and the maximum value is **60 minutes**. + + ![Semgrep AppSec Platform settings page with fail open configuration options.](/images/fail-open-config-ccc1c4b81d5710e419b725b1a1d193e0.png) + + + + +## Revoke Semgrep's access to your repositories + +### Remove the private app + +The following steps revoke the code access you previously granted Semgrep for all repositories you selected. + + + +In Semgrep AppSec Platform, click **Settings > Source Code Managers**. + + +On the entry of the SCM you want to remove, click **Remove app**. + + +Click **Remove** to confirm. + + + +### Limit access to specific repositories + + + +Navigate to your [GitHub settings page](https://github.com/settings/installations/). + + +On the entry of your private Semgrep GitHub app, click **Configure**. + +![GitHub settings page](/images/zcs-github-apps-840f39a3c2638010a403c42f3b125ecd.png) + + + +Under **Repository access**, de-select the repositories you no longer want to grant Semgrep access to. + + + +## Turn off Managed Scans for specific repositories in Semgrep AppSec Platform + + + +Sign in to Semgrep AppSec Platform. + + +Go to Projects and find the project you no longer want scanned with Semgrep Managed Scanning. 
Click the project's **Details** page > **Settings** tab. + + +Toggle the switch for **Managed diff scans** to turn off scans of new pull requests and merge requests and **Managed full scans** to turn off full scans of the base branch. + + ![Semgrep AppSec Platform toggles to turn off managed scans of repositories](/images/turn-off-sms-4b1b14b41b9fb0f8b3f2e9bffc2d96e8.png) + + + + +**WARNING** + +If your [source code manager has Auto-scan enabled](https://semgrep.dev/orgs/-/settings/source-code) so that Semgrep automatically scans new repositories, turn off Managed Scans for specific repositories using Semgrep AppSec Platform. **Do not turn off Managed Scans by deleting the repository from Semgrep AppSec Platform.** If you have Auto-scan enabled and you delete your repository from the platform, Semgrep re-syncs the repository you deleted. + + +## Appendices + +### Permissions and synchronicity + +Both the public and private Semgrep GitHub apps must have access to the repositories you want to scan. + +To **view** the repositories you have granted access to: + + + +Navigate to your [GitHub settings page](https://github.com/settings/installations/). + + +On the entry of your public Semgrep GitHub app, typically **semgrep-app**, click **Configure**. + + +Review the repositories under **Repository access**. + + +Perform steps 2 and 3 on the entry of your private Semgrep GitHub app. + + + +### Scan logs + +To view your scan logs in Semgrep AppSec Platform, go to **Projects**, then click on the project name. The projects in the list are sorted by scan date, with the most recent scans listed first. + + +**INFO** + +It can take a few minutes for your latest scan logs to appear. However, if the logs do not update 15 minutes after the scan, there may be issues with the scan itself. + + +### Scan statistics + +**Scan statistics**, such as how many of your repositories are being scanned, the scan success rate, and so on, can be provided once a week upon request.
Contact your Semgrep account manager to request scan statistics. + +### Git Large File Storage + +Semgrep Managed Scans skips files stored in Git Large File Storage (LFS). In general, Semgrep [skips large files](/ignoring-files-folders-code#files-folders-and-code-beyond-semgreps-scope) when scanning projects. diff --git a/mintlify-docs/deployment/managed-scanning/gitlab.mdx b/mintlify-docs/deployment/managed-scanning/gitlab.mdx new file mode 100644 index 0000000000..d2b47bdf79 --- /dev/null +++ b/mintlify-docs/deployment/managed-scanning/gitlab.mdx @@ -0,0 +1,346 @@ +--- +title: "Add a GitLab repository to Semgrep Managed Scans" +sidebarTitle: "GitLab" +description: "Add GitLab repositories to your Semgrep organization in bulk without adding or changing your existing CI workflows through **Managed Scans**. " +--- + +## Prerequisites and permissions + +Semgrep Managed Scanning (SMS) requires one of the following plans: + +- GitLab Premium +- GitLab Ultimate +- GitLab Self-Managed + +You must provide a GitLab group access token or personal access token to Semgrep. The token must have the `api` scope assigned to it. + +During SMS onboarding, the group or user to which the token is assigned must have one of the following roles: + +- `Maintainer` +- `Owner` +- `Admin` + +These roles are required because managed scans of GitLab repositories rely on webhooks for diff-aware scans and for the merge request comments that Semgrep creates. The webhooks are enabled by default when you set up Managed Scans and add GitLab as a source code manager. Once onboarding is complete, you can downgrade the role assigned to the token to `Developer`. + +See [Pre-deployment checklist > Permissions](/deployment/checklist#permissions) for more information about the permissions used by Semgrep. + +## Enable Semgrep Managed Scans and scan your first repository + + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click **Scan new project > Semgrep Managed Scan**.
+ + +In the **Enable Managed Scans for repos** page, select the repositories you want to add to Semgrep Managed Scans.

+ i. Optional: If you don't see the repository you want to add, click **Can't find your project?** and follow the troubleshooting steps provided. +
+ +Click **+ Connect more**. + + +Select **GitLab**. + + +In the **Set up Managed Scans** page that appears, provide the information needed by Semgrep to connect to your GitLab project:

+ i. Select **GitLab Cloud** or **GitLab Self-Managed**.

+ ii. Provide your **Access token**.

+ iii. Provide your **GitLab group**.

+ iv. *For GitLab Self-Managed users only*: provide the **GitLab URL**.

+ v. Click **Connect**. +
+ +Repeat the steps above for each additional GitLab group you'd like added to Semgrep. + +
+ + +You have finished setting up a Semgrep managed scan. + +- After enabling Managed Scans, Semgrep performs a full scan in batches on all the repositories. +- Once a repository has been added to Semgrep AppSec Platform, it becomes a **project**. A project in Semgrep AppSec Platform includes all the findings, history, and scan metadata of that repository. +- Projects with a Managed Scan configuration are tagged with `managed-scan`, regardless of whether the project is actively being scanned by Semgrep Managed Scans or not. The **Projects** list also contains pending scans and scans that never started. + +## Add additional GitLab projects + +You can enable Semgrep Managed Scans for additional repositories after onboarding using the following steps: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click **Scan new project > Semgrep Managed Scan**. + + +In the **Enable Managed Scans for repos** page, select the repositories you want to add to Semgrep Managed Scans.

+ i. Optional: If you don't see the repository you want to add, click **Can't find your project?** and follow the troubleshooting steps provided. +
+ +Select the repositories you want to scan from the list. + + +Click **Enable Managed Scans**. The **Enable Managed Scans** dialog appears. By default, Semgrep runs both full and diff-aware scans. + + +Optional: Disable PR or MR diff-aware scans by turning off the **Enable PR/MR scans** toggle. + + +Click **Enable**. + +
+ + +### If the page doesn't display any repositories + + + +Ensure that you've connected your GitLab account by following the steps in [Connect a source code manager](/deployment/connect-scm) and confirm that the [personal access token (PAT) was created with the required `api` scope](https://docs.gitlab.com/user/profile/personal_access_tokens/#personal-access-token-scopes) by someone assigned the [role of **Maintainer** or **Owner**](https://docs.gitlab.com/ee/user/permissions.html#roles).

+ i. Once you successfully create the connection, the role for the person who owns the token can be downgraded to **Developer**. +
+ +In Semgrep AppSec Platform, click **Projects**. + + +If the page doesn't display the repository you want to add, click **Can't find your project? > Sync projects**. + + +If the page doesn't display any repositories, click **Sync projects**. + + +Optional: Perform a hard refresh (Ctrl+F5 or Cmd+Shift+R). + +
+ +### Convert or migrate an existing Semgrep CI job + +You can immediately add any existing project to Managed Scans. + + + +Follow the steps in [Enable Semgrep Managed Scans](#enable-semgrep-managed-scans-and-scan-your-first-repository). + + +Remove the Semgrep job from the `.gitlab-ci.yml` file in your GitLab repository, or delete the file if it contains only the Semgrep job. + + + +If you plan to continue running some scans in GitLab CI/CD Pipelines (for example, using Managed Scans to run weekly full scans but GitLab CI/CD Pipelines for diff-aware scans), you can leave the Semgrep job in place and edit it to reflect your desired configuration. + + +**TIP** + +Semgrep preserves your findings, scans, and triage history. + + +## Scan management and configuration + +### Manually run a full scan + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **gear icon** to access the settings page for that repository. + + +Click **Run a new scan > Rule-based detection**. + + + +> You can manually run a full scan for both primary and non-primary branches. + +### Re-run a failed scan or a scan that never finished + + + +In Semgrep AppSec Platform, click **Projects**. + + +Click on the project name. + + +Find the scan that failed or never finished using the **Status** column, and click **Details** to open the **Scan logs** dialog. + + +Ensure that you're on the **Overview** tab of the **Scan logs** dialog, then click **Retry scan**. + + + +### Disable diff-aware scans on PRs + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the toggle for diff-aware scans. + + + +### Delete a project + + + +In Semgrep AppSec Platform, click **Projects**. + + +Search for your repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository.
+ + +Click the dropdown at the header and click **Delete project**. + + + +To delete an archived project: + + + +In Semgrep AppSec Platform, click **Projects**. + + +Switch to the **Not Scanning** tab of the **Projects** page. + + +Select the checkbox to **Show archived** projects. + + +Search for the archived repository's name. + + +Click the **window icon** under **Details** to access the settings page for that repository. + + +Click the dropdown at the header and click **Delete project**. + + + + +### Configure fail open to prevent diff-aware scans from blocking pull requests and merge requests + +By default, diff-aware managed scans are set to **fail open** if a scan errors out or takes too long. This means that diff-aware scans are marked as successful on the pull request (PR) or merge request (MR), even if they haven't completed after the specified timeout, allowing you to make the Semgrep status check required in your source code manager (SCM) while not blocking someone from merging a PR or MR if the check encounters an unexpected issue or takes too long. + + +![Sample pull request showing the status of a diff-aware scan.](/images/pr-status-check-06f80c71ec387d84294255bdbfcdc25e.png) + + +#### How fail open works + + + +If enabled, the fail open feature is triggered whenever you open a PR or MR. + + +Initially, Semgrep sends an update to mark the PR or MR as `pending`. + + +Once the diff-aware scan begins, the PR or MR is updated to a status of `running`. + + +The diff-aware scan completes, and the PR or MR is updated to a status of `succeeded` or `failed`. + + +If the diff-aware scan is in `pending` or `running` status longer than the configured timeout, then the fail open process updates the PR or MR to display a status of `succeeded`. This prevents the Semgrep scan from blocking the developer from merging their changes. + + + +If Semgrep marks a PR or MR as `succeeded`, you can merge the PR or MR without waiting for the diff-aware scan to complete. 
However, if the PR or MR is still open and the scan completes *after* the fail open timeout is reached, Semgrep can still report the findings and mark the status as `failed`. + +#### Configure fail open + +By default, fail open is enabled. However, you can disable this feature and adjust the timeout value: + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Go to **Settings > General > Managed Scans**. + + +Click the **Fail open** toggle to turn off this feature. + + +Set the **Timeout** value in minutes. The default value is **10 minutes**, the minimum value is **1 minute**, and the maximum value is **60 minutes**. + + ![Semgrep AppSec Platform settings page with fail open configuration options.](/images/fail-open-config-ccc1c4b81d5710e419b725b1a1d193e0.png) + + + + +## Disable webhooks + +Semgrep Managed Scans of GitLab projects require webhooks. The webhooks are enabled by default when you add GitLab as a source code manager when setting up Managed Scans. You can disable webhooks at any time by following these steps: + + + +In Semgrep AppSec Platform, go to [Settings > Source code managers](https://semgrep.dev/orgs/-/settings/source-code). + + +Find your GitLab connection, and click the toggle to disable **Incoming webhooks**. + + + +## Revoke Semgrep's access to your repositories + +The following steps revoke the code access you previously granted Semgrep for all repositories you selected. + + + +In Semgrep AppSec Platform, click **Settings > Source Code Managers**. + + +On the entry of the SCM you want to remove, click **Remove app**. + + +Click **Remove** to confirm. + + + +## Turn off Managed Scans for specific repositories in Semgrep AppSec Platform + + + +Sign in to Semgrep AppSec Platform. + + +Go to Projects and find the project you no longer want scanned with Semgrep Managed Scanning. Click the project's **Details** page > **Settings** tab. 
+ + +Toggle the switch for **Managed diff scans** to turn off scans of new pull requests and merge requests and **Managed full scans** to turn off full scans of the base branch. + + ![Semgrep AppSec Platform toggles to turn off managed scans of repositories](/images/turn-off-sms-4b1b14b41b9fb0f8b3f2e9bffc2d96e8.png) + + + +## Appendices + +### Scan logs + +To view your scan logs in Semgrep AppSec Platform, go to **Projects**, then click on the project name. The projects in the list are sorted by scan date, with the most recent scans listed first. + + +**INFO** + +It can take a few minutes for your latest scan logs to appear. However, if the logs do not update 15 minutes after the scan, there may be issues with the scan itself. + + +### Scan statistics + +**Scan statistics**, such as how many of your repositories are being scanned, the scan success rate, and so on, can be provided once a week upon request. Contact your Semgrep account manager to request scan statistics. diff --git a/mintlify-docs/deployment/managed-scanning/overview.mdx b/mintlify-docs/deployment/managed-scanning/overview.mdx new file mode 100644 index 0000000000..e8509a3a88 --- /dev/null +++ b/mintlify-docs/deployment/managed-scanning/overview.mdx @@ -0,0 +1,97 @@ +--- +title: "Semgrep Managed Scans" +sidebarTitle: "Managed Scans" +description: "Add repositories to your Semgrep organization in bulk without adding or changing your existing CI workflows through **Managed Scans**. Similar to CI workflows, Managed Scans also integrates into developer workflows through pull request (PR) or merge request (MR) comments." +--- + +This is an alternative method to [adding Semgrep in CI](/deployment/add-semgrep-to-ci). Instead of adding a Semgrep job or workflow to your CI/CD pipeline, repositories are added to Semgrep AppSec Platform. 
+ +## Feature maturity and support + +You must be an existing [Semgrep AppSec Platform](https://semgrep.dev/orgs/-/) user with one of the following plans: + - Bitbucket Cloud Premium plans or Bitbucket Data Center (v8.8 or above for diff-aware scans) + - Hosted GitHub (GitHub.com) and GitHub Enterprise Server plans + - GitLab Cloud and GitLab self-managed plans and a Premium or Ultimate subscription + - Azure DevOps Cloud repositories + +Managed Scans is available for all Semgrep products you have purchased, including: + - Semgrep Code + - Semgrep Supply Chain + - Semgrep Secrets + +Semgrep performs full scans on a weekly basis and diff-aware scans when you create a pull request or merge request. + + +**INFO** + +- To receive Supply Chain findings, you must have a supported manifest file or lockfile in your repository. Managed Scans does **not** support generation of these files. +- For existing Semgrep projects, custom `semgrep.yml` configurations are not copied or detected when you use Managed Scans. If you have additional build steps when scanning, use [Semgrep in CI instead](/deployment/add-semgrep-to-ci). + + +Please leave feedback by either contacting your technical account manager (TAM) or through the Feedback form in Semgrep AppSec Platform's navigation bar. + +## Security + +Managed Scans require **read access** to your code for the repositories you choose to scan. Semgrep clones your repository at the beginning of every scan. Once the scan completes, the clone is destroyed and is not persisted anywhere. + +For GitHub users, access to your code is facilitated by a **private Semgrep GitHub app** that you create and register in your GitHub organization. For GitLab users, access to your code is facilitated by a **personal access token** that you generate and provide to Semgrep. + +- You are in control of the app and can revoke access to repositories at any time. +- GitHub users only: you can limit access to specific repositories. 
+ +Managed scans are specifically designed to limit the amount of time that code remains within Semgrep infrastructure. + +### Code security measures + +Semgrep’s Managed Scans infrastructure ensures that customer code is scanned in a vacuum and inaccessible from other Kubernetes cluster resources. Semgrep does this by employing the following features and best practices: + +- **Ephemeral pods** + - Each scan creates a new pod from scratch, ensuring there is never leftover data from previous scans. + - Customer code is cloned into the new pod, scanned, and deleted once the scan is completed. The pod is then destroyed. + - Pods do not share volumes and do not persist after a scan is completed. Once a pod is destroyed, its volume and the data it contains are destroyed as well. +- **Network isolation** + - Pod network capabilities are completely locked down to ensure only allowed IP addresses are accessible. + - Pods are unable to access other pods within the cluster. This ensures that the customer code cloned to one pod is not accessible from another pod. + +### Life cycle of a managed scan + +1. When a scan begins, Semgrep creates an ephemeral container and clones the repository into it. +1. Semgrep runs the scan from that container. Diff-aware scans typically take seconds, while full scans can take minutes to hours to complete. +1. The ephemeral container is immediately and automatically destroyed post-scan along with all contents in it. + +## Default configuration + +By default, projects on Managed Scans are configured with: + +- **Weekly full scans** of the entire repository. When a project is first added to Managed Scans, the AppSec Platform performs an initial scan and then sets a random time up to 6 days after to perform a weekly full scan. Each weekly scan occurs on that same day and time. If a full scan doesn't complete, Semgrep re-attempts the scan once, in case it was affected by a temporary error. 
+- **Diff-aware scans** that run on every pull request. These diff-aware scans follow the **rule modes** set in your Policies, ensuring that developers are only notified of findings from high-signal rules you place in Comment or Block mode. + +## Run scans in bulk + +Semgrep Managed Scans enables you to scan multiple projects simultaneously, which is especially useful after updating your ruleset or configuration. + +To run scans in bulk, go to the **Projects** page, select the projects of interest, and click **Scan**. + +## Re-run scans + +You can re-run full scans from the **Projects** page in Semgrep AppSec Platform. + +There is no manual "re-run" action for pull request (PR) or merge request (MR) Semgrep Managed Scans. To re-run a PR or MR scan, push a new commit to the PR or MR branch. This triggers a new scan automatically. + +If no code changes are needed, you can push an empty commit: + +```bash +git commit --allow-empty -m "Trigger Semgrep scan" +git push +``` + +## Add a repository to Semgrep Managed Scans + +Learn how to add a repository to Semgrep Managed Scans: + + + + + + + \ No newline at end of file diff --git a/mintlify-docs/deployment/oss-deployment.mdx b/mintlify-docs/deployment/oss-deployment.mdx new file mode 100644 index 0000000000..cced6f8c17 --- /dev/null +++ b/mintlify-docs/deployment/oss-deployment.mdx @@ -0,0 +1,289 @@ +--- +title: "Semgrep Community Edition in CI" +description: "Semgrep Community Edition (CE) can be set up to run static application security testing (SAST) scans on repositories of any size." +sidebarTitle: "Semgrep CE in CI" +--- + +This guide explains how to set up Semgrep CE in your CI pipeline using entirely open source components, also known as a **stand-alone** CI setup. The preferred Semgrep CE command is `semgrep scan`. + +## Prerequisites + +- Sufficient permissions in your repository to: + - Commit a CI configuration file. + - Start or stop a CI job. +- Optional: Create environment variables.
+ +## Ensure your scans use open source components + +This setup uses only the **LGPL 2.1** Semgrep CLI tool. It is not subject to the usage limits of Semgrep AppSec Platform. In order to remain strictly open source, you must ensure that the rules you run use open source licenses or are your own custom Semgrep rules. + +To verify a rule's license, read the `license` key under the `metadata` of a Semgrep rule. + + + +This rule's last line displays a `license: MIT` key-value pair. + +```yaml expandable +rules: + - id: eslint.detect-object-injection + patterns: + - pattern: $O[$ARG] + - pattern-not: $O["..."] + - pattern-not: "$O[($ARG : float)]" + - pattern-not-inside: | + $ARG = [$V]; + ... + <... $O[$ARG] ...>; + - pattern-not-inside: | + $ARG = $V; + ... + <... $O[$ARG] ...>; + - metavariable-regex: + metavariable: $ARG + regex: (?![0-9]+) + message: Bracket object notation with user input is present, this might allow an + attacker to access all properties of the object and even it's prototype, + leading to possible code execution. + languages: + - javascript + - typescript + severity: MEDIUM + metadata: + cwe: "CWE-94: Improper Control of Generation of Code ('Code Injection')" + primary_identifier: eslint.detect-object-injection + secondary_identifiers: + - name: ESLint rule ID security/detect-object-injection + type: eslint_rule_id + value: security/detect-object-injection + license: MIT +``` + + +For a comparison of the behavior between Semgrep CE CI scans and Semgrep AppSec Platform scans, see [Semgrep AppSec Platform versus Semgrep Community Edition](/semgrep-pro-vs-oss-1). + +## Set up the CI job + +### Use template configuration files + +Click the link of your CI provider to view a configuration file you can commit to your repository to create a Semgrep job: + + + + + + + + + + + +### Use other methods + +Use either of the following methods to run Semgrep on other CI providers. 
+ +#### Direct Docker usage + +Reference or add the [semgrep/semgrep](https://hub.docker.com/r/semgrep/semgrep) Docker image directly. The method to add the Docker image varies based on the CI provider. This method is used in the [Bitbucket Pipelines code snippet](/semgrep-ci/sample-ci-configs#sample-bitbucket-pipelines-configuration-snippet). + +#### Install `semgrep` within your CI job + +If you cannot use the Semgrep Docker image, install Semgrep as a step or command within your CI job: + +1. Add `pipx install semgrep` (or `uv tool install semgrep` if you use [`uv`](https://docs.astral.sh/uv/)) to the configuration file as a step or command, depending on your CI provider's syntax. See the [Python Packaging guide](https://packaging.python.org/en/latest/guides/installing-stand-alone-command-line-tools/) for more on installing standalone Python CLI tools. +2. Run any valid `semgrep scan` command, such as `semgrep scan --config auto`. + +For an example, see the [Azure Pipelines code snippet](/semgrep-ci/sample-ci-configs/#sample-azure-pipelines-configuration-snippet). + +## Configure your CI job + +The following sections describe methods to customize your CI job. + +### Schedule your scans + +The following table summarizes where to set up scan schedules for each CI provider.
+ +| CI provider | Where to set schedule | +| :--- | :--- | +| GitHub Actions | See [Sample CI configs](/semgrep-ci/sample-ci-configs#sample-github-actions-configuration-file) for information on how to modify your `semgrep.yml` file | +| GitLab CI/CD | Refer to [GitLab documentation](https://docs.gitlab.com/ee/ci/pipelines/schedules.html) | +| Jenkins | Refer to [Jenkins documentation](https://www.jenkins.io/doc/book/pipeline/running-pipelines/#scheduling-jobs-in-jenkins) | +| Bitbucket Pipelines | Refer to [Bitbucket documentation](https://support.atlassian.com/bitbucket-cloud/pipeline-triggers/) | +| CircleCI | Refer to [CircleCI documentation](https://circleci.com/scheduled-pipelines#get-started-with-scheduled-pipelines-in-circleci) | +| Buildkite | Refer to [Buildkite documentation](https://buildkite.com/pipelines/scheduled-builds) | +| Azure Pipelines | Refer to [Azure documentation](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/scheduled-triggers?view=azure-devops&tabs=yaml) | +| Semaphore | Refer to [Semaphore documentation](https://docs.semaphore.io/using-semaphore/tasks) | + +### Customize rules and rulesets + +#### Add rules to scan with `semgrep scan` + +You can customize what rules to run in your CI job. The rules and rulesets can come from the [Semgrep Registry](https://semgrep.dev/explore/), or your own rules. The sources for rules to scan with are: + +* The value of the `SEMGREP_RULES` environment variable. +* The value passed after `--config`. You can use multiple `--config` arguments, one per value. For example: `semgrep scan --config p/default --config p/comment`. + +The `SEMGREP_RULES` environment variable accepts a list of local and remote rules and rulesets to run. The `SEMGREP_RULES` list is delimited by a space (` `) if the variable is exported from a shell command or script block. For example, see the following BitBucket Pipeline snippet: + +```yaml +# ... 
+ script: + - export SEMGREP_RULES="p/nginx p/ci no-exec.yml" + - semgrep ci +# ... +``` + +The line defining `SEMGREP_RULES` defines three different sources, delimited by a space: + +```bash +- export SEMGREP_RULES="p/nginx p/ci no-exec.yml" +``` + +The example references two rulesets from Semgrep Registry (`p/nginx` and `p/ci`) and a rule available in the repository (`no-exec.yml`). + +If the `SEMGREP_RULES` environment variable is defined from a YAML block, the list of rules and rulesets to run is delimited by a newline. See the following example of a GitLab CI/CD snippet: + +```yaml +# ... +variables: + SEMGREP_RULES: >- + p/nginx + p/ci + no-exec.yml +# ... +``` + +#### Write your own rules + +Write custom rules to enforce your team's coding standards and security practices. Rules can be forked from existing community-written rules. + +See [Writing rules](/writing-rules/overview) to learn how to write custom rules. + +### Ignore files + +See [ Ignore files, folders, and code](/ignoring-files-folders-code). + +By default `semgrep ci` skips files and directories such as `tests/`, `node_modules/`, and `vendor/`. It uses the default `.semgrepignore` file which you can find in the [Semgrep GitHub repository](https://github.com/semgrep/semgrep/blob/develop/cli/src/semgrep/templates/.semgrepignore). This default is used when no explicit `.semgrepignore` file is found in the root of your repository. + +Optional: Copy and commit the default `.semgrepignore` file to the **root of your repository** and extend it with your own entries or write your `.semgrepignore` file from scratch. If Semgrep detects a `.semgrepignore` file within your repository, it does not append entries from the default `.semgrepignore` file. + +For a complete example, see the [.semgrepignore file in Semgrep’s source code](https://github.com/semgrep/semgrep/blob/develop/.semgrepignore). + + +**CAUTION** + +`.semgrepignore` is only used by Semgrep. 
Integrations such as [GitLab's Semgrep SAST Analyzer](https://gitlab.com/gitlab-org/security-products/analyzers/semgrep) do not use it. + + +### Save or export findings to a file + +To save or export findings, pass file format options and send the formatted findings to a file. + +For example, to save to a JSON file: + +`semgrep scan --json > findings.json` + +> The JSON schema for Semgrep's CLI output can be found in [semgrep/semgrep-interfaces](https://github.com/semgrep/semgrep-interfaces/blob/main/semgrep_output_v1.jsonschema). + +You can also use the SARIF format: + +`semgrep scan --sarif > findings.sarif` + +Refer to the [CLI reference](/cli-reference) for output formats. + +## Migrate to Semgrep AppSec Platform from a stand-alone CI setup + +Migrate to Semgrep AppSec Platform to: + +* **View and manage findings in a centralized location**. False positives can be ignored through triage actions. These actions can be undertaken in bulk. +* **Configure rules and actions to undertake when a finding is generated by the rule**. You can undertake the following actions: + * Audit the rule. This means that findings are kept within Semgrep's **Findings** page and are not surfaced to your team's SCM. + * Show the finding to your team through the use of PR and MR comments. + * Block the pull request or merge request. + +To migrate to Semgrep AppSec Platform: + + + +Create an account in [Semgrep AppSec Platform](https://semgrep.dev/login). + + + Click **[Projects](https://semgrep.dev/orgs/-/projects)** > **Scan New Project** > Run scan in CI. + + + Follow the steps in the setup page to complete your migration. + + + Optional: Remove the old CI job that does not use Semgrep AppSec Platform. 
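
As a follow-up to the JSON export described under "Save or export findings to a file", the sketch below shows one way to summarize an exported `findings.json` by severity. It assumes the report has a top-level `results` list whose entries carry a severity under `extra.severity`; check the JSON schema linked in that section for the authoritative field names. The function names `load_report` and `summarize_findings` are illustrative and are not part of Semgrep.

```python
import json
from collections import Counter


def load_report(path: str) -> dict:
    """Load a report written by `semgrep scan --json > findings.json`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarize_findings(report: dict) -> Counter:
    """Count findings per severity in a parsed Semgrep JSON report.

    Assumption: each entry in "results" stores its severity under
    result["extra"]["severity"]. Entries without one are counted as UNKNOWN.
    """
    counts = Counter()
    for result in report.get("results", []):
        severity = result.get("extra", {}).get("severity", "UNKNOWN")
        counts[severity] += 1
    return counts
```

To use it, call `summarize_findings(load_report("findings.json"))` after a scan and print or archive the resulting counts, for example as a lightweight record of what a stand-alone job reported.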
+ + + +## Semgrep CE jobs versus Semgrep jobs + +| Feature | Semgrep CI (`semgrep ci`) | Semgrep CE CI (`semgrep scan`) | +| :--- | :--- | :--- | +| Customized SAST scans | ✔️ | ✔️ | +| [SCA (software composition analysis) scans](/semgrep-supply-chain/overview) | ✔️ | -- | +| [Secrets scans](/semgrep-secrets/conceptual-overview) | ✔️ | -- | +| [PR (pull request) or MR (merge request) comments](/category/pr-or-mr-comments) | ✔️ | -- | +| [Finding status tracked over lifetime](/semgrep-code/findings) | ✔️ | -- | diff --git a/mintlify-docs/deployment/primary-branch.mdx b/mintlify-docs/deployment/primary-branch.mdx new file mode 100644 index 0000000000..d5f389efcd --- /dev/null +++ b/mintlify-docs/deployment/primary-branch.mdx @@ -0,0 +1,104 @@ +--- +title: "Set a primary branch" +--- + +A **primary branch** is the base or target branch for pull requests and merge requests. It is usually referred to as a **default branch** or **trunk** by your source code manager (SCM). Typical names for a primary branch include `dev`, `production`, or `develop`. + +In many cases, Semgrep automatically detects a project's primary branch when it first scans the project. If you have projects (repositories) with unique primary branch names, you can set them through the Semgrep web app. + +A primary branch enables Semgrep to filter your findings by branch and to accurately deduplicate findings. The primary branch is also used to analyze the deployment of [secure guardrails](/secure-guardrails/secure-guardrails-in-semgrep) to your developers; findings fixed before they are merged into the primary branch reduce the overall production backlog. + +The following video provides an introduction and walkthrough: + + + + + +## Prerequisite + +Ensure that the project you want to set a primary branch for has completed **at least one full scan** successfully.
+ +## Find projects without a primary branch + +Projects without primary branches have an orange information icon next to their name in the **Projects** page. + +## Changes to existing URLs + +For Semgrep AppSec Platform users whose accounts were created prior to September 4, 2024, this feature may affect any bookmarks or saved links created for custom views or slices in product pages such as **Code**, **Supply Chain > Vulnerabilities**, and **Secrets**. The primary branch feature deprecates certain filters, which affects the parameters in your URLs. In these cases, you may have to re-create your bookmarks. + +- The following parameters are deprecated: + - `ref=_default` + - `ref=_other` +- For the **Code** page and the **Supply Chain > Vulnerabilities** tab: + - Bookmarks that use the `ref` parameter without a `repo` parameter are redirected to the default view. + - Bookmarks that use any number of `repo` parameters without a `ref` parameter display the findings of primary branches for all repositories selected. + - Any filters using multiple `ref` parameters now show only one `ref`, such as the primary branch. + +## Set a project's primary branch + +- Primary branches are set on a **per-project** basis in the Semgrep web app. To quickly update your primary branches, use the [API endpoint](#through-an-api-endpoint). +- For more information on how primary branches may affect the behavior of existing projects, see: + - [Changes to existing URLs](#changes-to-existing-urls) + - [How Semgrep counts findings in the projects page](/deployment/primary-branch#how-semgrep-counts-findings-in-the-projects-page) + +### Through the web app + + +**INFO** + + For Semgrep AppSec Platform users whose accounts were created prior to + September 4, 2024, you may have to sign out and sign in again for this feature + to appear. + + + + +In the Semgrep web app, click **Projects**. + + +Search for your project's name. + + +Click the **gear icon** to access the settings page for that project.
+ + +In the **Primary branch** section, click the drop-down box and select a branch. The drop-down menu shows a list of **scanned branches**. + + +Click **Save**. + + + + + ![Primary branch selection](/images/primary-branch-ded868163a3219896c9b4bee2914c9f9.png) + +_**Figure**. Projects > Project settings page > Primary branch selection._ + +### Through an API endpoint + +You can also send a `PATCH` request to the following endpoint: [Deployment > Project endpoint](https://semgrep.dev/api/v1/docs/#tag/ProjectsService/operation/ProjectsService_UpdateProject). Add the `primary_branch` key to the request body. + +### How Semgrep counts findings in the Projects page + +You can view a total count of findings in the **Projects** page for all Semgrep products. + +- For Code and Supply Chain, this total count is computed from the **primary branch**. +- For Secrets, this total count is computed from deduplicated findings across all branches. + +This means that the count of findings in your Code, Secrets, or Supply Chain page may differ from the counts in your Projects page. + +The following links explain how Semgrep presents findings for each Semgrep product on its respective page: + + + + + + diff --git a/mintlify-docs/deployment/sso.mdx b/mintlify-docs/deployment/sso.mdx new file mode 100644 index 0000000000..8ed877f293 --- /dev/null +++ b/mintlify-docs/deployment/sso.mdx @@ -0,0 +1,271 @@ +--- +title: "Single sign-on (SSO) configuration" +sidebarTitle: "SSO authentication" +--- + + +**YOUR DEPLOYMENT JOURNEY** + +- You have gained the necessary [resource access and permissions](/deployment/checklist) required for deployment. +- You have [created a Semgrep account and organization](/deployment/create-account-and-orgs). +- You are an admin for both your Semgrep deployment and your identity provider (IdP). +- For GitHub and GitLab users: You have [connected your source code manager](/deployment/connect-scm). + + +This article walks you through single sign-on (SSO) configuration.
Semgrep supports SSO through [OpenID Connect / OAuth 2.0](#openid-connect--oauth-20) and [SAML 2.0](#saml-20). + +After setting up SSO, users are provisioned and managed on your IdP. Semgrep grants access to the deployment to any user at the configured domain who logs in and has the correct permissions in the IdP. If a user attempts to log in through GitHub or GitLab with an email at your configured domain, Semgrep prompts them to log in using corporate SSO instead. + + + + +### OpenID Connect / OAuth 2.0 + + +**MICROSOFT ENTRA ID** + +Semgrep AppSec Platform does not support using OpenID with Microsoft Entra ID. Follow the instructions to [set up SAML SSO with Microsoft Entra ID](/kb/semgrep-appsec-platform/saml-microsoft-entra-id) instead. + + +To set up SSO in Semgrep AppSec Platform: + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Go to [**Settings > Access > Login methods**](https://semgrep.dev/orgs/-/settings/access/loginMethods). + + +In the **Single sign-on (SSO)** section, provide a valid **Email domain**, then click **Initialize**. + + +The **Configure Single Sign-On** dialog appears. Begin by selecting your identity provider, or choose **Custom OIDC**. + + +Follow the instructions provided on the subsequent **Configure Single Sign-On** dialog pages to complete this process. When you've completed the required steps, use **Test sign-in** to test the connection. + + +Once test sign-in has passed, close the test page. Verify that the **Connection details** shown on the **Connection activated** screen are correct and close the dialog. + + +Verify that the **Connection status** is now **active** under the **Single sign-on (SSO)** section in Semgrep AppSec Platform. + + +To use the new connection, log out of Semgrep, then log back in using SSO. + + + +If you encounter issues during the setup process, please [reach out to support](/support) for assistance. 
+ +### SAML 2.0 + + +**GOOGLE WORKSPACE SAML** + +If you're using Google Workspace SAML, see [SAML Single Sign-on with Google Workspace](/kb/semgrep-appsec-platform/saml-google-workspace) for specific guidance. + + +SAML2.0 is configured through **Semgrep AppSec Platform**. To set up SSO: + + + +Create a SAML app with your authentication provider. + + +With your authentication provider, add in two attribute statements: `name` and `email`. + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Go to [**Settings > Access > Login methods**](https://semgrep.dev/orgs/-/settings/access/loginMethods). + + +In the **Single sign-on (SSO)** section, provide a valid **Email domain**, then click **Initialize**. + + +The **Configure Single Sign-On** dialog appears to guide you through the remaining configuration steps. Begin by selecting your identity provider, or choose **Custom SAML**. + + +Follow the instructions provided on the subsequent **Configure Single Sign-On** dialog pages to complete this process. If prompted, add in the requested attribute statements. Semgrep recommends the following mappings: + | Name | Value | + | :--- | :--- | + | id | `user.login` **OR** `user.email` | + | email | `user.email` | + | firstName | `user.firstName` | + | lastName | `user.lastName` | + When you've completed the required steps, use **Test sign-in** to test the connection. + + +Once test sign-in has passed, close the test page. Verify that the **Connection details** shown on the **Connection activated** screen are correct and close the dialog. + + +Verify that the **Connection status** is now **active** under the **Single sign-on (SSO)** section in Semgrep AppSec Platform. + + +To use the new connection, log out of Semgrep, then log back in using SSO. + + + + + + + +### OpenID Connect / OAuth 2.0 + + +**MICROSOFT ENTRA ID** + +Semgrep AppSec Platform does not support using OpenID with Microsoft Entra ID. 
Follow the instructions to [set up SAML SSO with Microsoft Entra ID](/kb/semgrep-appsec-platform/saml-microsoft-entra-id) instead. + + +To set up SSO in Semgrep AppSec Platform: + + + +Sign in to Semgrep AppSec Platform. + + +Navigate to **[Settings > Access > Login methods](https://semgrep.dev/orgs/-/settings/access/loginMethods)**. + + +Click **Add SSO configuration** and select **OpenID SSO**. + + +Provide a **Display name** and the **Email domain**. + + +Copy the **Redirect URL**, and provide it to your authentication provider. + + ![SSO configuration form displaying the redirect URL](/images/sso-redirect-url-6174b1e776a42c1c4915495349005d66.png) + + + +Generate a **Client ID** and **Client Secret** through your authentication provider and paste these values into Semgrep. + + ![Generating Client ID and Client Secret via the Okta](/images/sso-clientID-clientSecret-310403f630d93b528baaf02f4215c86d.png) + + + +From your authentication provider, copy the **Base URL** value, and provide it to Semgrep. For example, if you're using Okta SSO, the base URL is the **Okta domain**. + + +Optional: provide the following values from your authentication provider if necessary: + - **Well Known URL** + - **Authorize URI** + - **Token URI** + - **Userinfo URI** + + +Click **Save** to proceed. + + + +If you encounter issues during the setup process, please [reach out to support](/support) for assistance. + +### SAML 2.0 + + +**GOOGLE WORKSPACE SAML** + +If you're using Google Workspace SAML, see [SAML Single Sign-on with Google Workspace](/kb/semgrep-appsec-platform/saml-google-workspace) for specific guidance. + + +SAML2.0 is configured through **Semgrep AppSec Platform**. To set up SSO: + + + +Create a SAML app with your authentication provider. + + ![Creating SAML app through Okta](/images/saml-creating-app-377a1d48768c46f026f9940f5512b773.png) + + + +With your authentication provider, add in two attribute statements: `name` and `email`. 
+ + ![Filling in attribute statements in Okta](/images/saml-attribute-statements-2ac1ee3e4d422a51d0c3d0ad8c95d332.png) + + + +Sign in to Semgrep AppSec Platform. + + +Navigate to **[Settings > Access > Login methods](https://semgrep.dev/orgs/-/settings/access/loginMethods)**. + + +Click **Add SSO configuration** and select **SAML2 SSO**. + + +Provide a **Display name** and the **Email domain**. + + +Copy the **SSO URL** and **Audience URL (SP Entity ID)**, and provide it to your authentication provider. + + ![Finding Single sign on URL, and Audience URI via Semgrep AppSec Platform](/images/saml-copy-urls-76bcb9e07c9e9d3b64f2e29e3943ff9f.png) + + + +From your authentication provider, copy your **IdP SSO URL** and **IdP Issuer ID** values, and download the **X509 Certificate**. + + ![Finding IdP SSO URL, IdP Issuer ID, and X509 Certificate through Okta](/images/saml-copy-IdPSSO-IdPID-and-X509-61f4be0b299a3bc32c4cf3fb1c49a387.png) + + + +Return to Semgrep AppSec Platform, and paste the **IdP SSO URL** and **IdP Issuer ID** values, and upload your **X509 Certificate**. + + ![Filling in IdP SSO URL, IdP Issuer ID, and X509 Certificate on Semgrep](/images/saml-filling-IdpSSO-IdpID-X509-9b0c382094a3881dd89b8f6191e8db76.png) + + + +Select the box next to **This SSO supports non-password authentication mechanisms (e.g. MFA, X509, PasswordLessPhoneSignin)** if applicable. + + +Click **Save** to proceed. + + + + + + +If you encounter issues during the setup process, [reach out to support](/support) for assistance. + + +**ADMIN AND ORG OWNER ACCOUNTS** + +By default, Semgrep creates new SSO accounts with the **Member** role assigned. You can change the default role assigned to a new user by going to [Settings > Access](https://semgrep.dev/orgs/-/settings/access/defaults). + +If you're an admin setting up SSO, and Semgrep creates an SSO account for you with the role of **Member**, you can elevate the permissions granted to your SSO account. 
To do so, log in to Semgrep with your admin account using the original login method, then [change the role](https://semgrep.dev/orgs/-/settings/access/members) of your newly created SSO account to **Admin**. + + +### Turn off sign in with GitHub / GitLab + +If you have SSO enabled, you can turn off login using GitHub or GitLab credentials. Doing so forces members of your organization to log in using an email address with an approved domain. + + + +Sign in to your [Semgrep account](https://semgrep.dev/login). + + +Navigate to [**Settings > Access > Login methods**](https://semgrep.dev/orgs/docs-test/settings/access/loginMethods). + + +GitHub users: Click the **GitHub SSO** toggle to turn off logins using GitHub. + + +GitLab users: Click the **GitLab SSO** toggle to turn off logins using GitLab. + + + + +**WARNING** + +Ensure that you have at least one user who can log in as an admin through SSO before disabling sign in with GitHub or GitLab. + + +### See also + + + + + \ No newline at end of file diff --git a/mintlify-docs/deployment/teams/manage.mdx b/mintlify-docs/deployment/teams/manage.mdx new file mode 100644 index 0000000000..a4a9483a58 --- /dev/null +++ b/mintlify-docs/deployment/teams/manage.mdx @@ -0,0 +1,259 @@ +--- +title: "Manage teams and roles" +--- + +Semgrep allows you to manage user membership and access to Semgrep resources, such as scans, findings, and repositories or codebases you have added to Semgrep. To configure those settings, go to **[Settings > Access](https://semgrep.dev/orgs/-/settings/access)** in Semgrep AppSec Platform. + +## Invite a user through email + +You can add new users to your organization by sending them an email. This email contains instructions for them to join your org through the same auth provider configured for your account. The invitation only facilitates access for users who are already provisioned in the configured auth provider. + +You must be an **admin** to perform this operation. 
+ + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Click **Settings > Access**. This brings you to the **Users** tab. + + +Click **Invite users**. + + +In the dialog, enter your team members' email addresses. You can invite up to 20 users at a time. Separate each email address with a Space or Tab key. You can also paste a comma-separated list of email addresses. + + +Click **Send invites**. + + + +## Set a default role for the organization + +Users are assigned a role based on your organization's default. New organizations are created with the default role set to **admin**. To change this setting, perform the following steps: + + + +In Semgrep AppSec Platform, click **Settings**. + + +Click **Access > Defaults**. + + + +## Change a user's role + +You must be an **admin** to perform this operation. + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Click **Settings > Access**. + + +Search for the user whose role you want to change. + + +Click on the user's current role, under the role header. A drop-down box appears. + + +Select the new role for the user. + + + + +**NOTE** + +You cannot change your own role. + + +## Enable teams + + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Click **[Settings > Access > Teams](https://semgrep.dev/orgs/-/settings/access/teams)**. + + +Optional: Click **Yes, add new users to the default team** if you want new members and projects to be added to the default team. + + +Click **Enable**. + + +Read the dialog box to ensure that your settings are correct, then click **Enable beta**. + + + +When you enable teams for the first time, a team is automatically created with the name of your deployment. This preserves the settings you previously had using the **Users** feature; all current members retain their existing projects. + +## View your teams + +You must be an admin or manager to view the **Teams** tab.
+ + + +Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + +Click **[Settings > Access > Teams](https://semgrep.dev/orgs/-/settings/access/teams)**. + + + +## Create a team + + + +In the [**Teams** tab](https://semgrep.dev/orgs/-/settings/access/teams), click **New team**. The **Create New Team** form appears. + + +Enter a **Name** for the team. + + +The **Projects** tab opens. Click the checkbox next to the names of the projects you want to give access to. You can also use the **Search** box or **tags** to help you find projects. + + +Click the **Users** tab, then click the checkbox next to the names of the team members you want to add. You can also use the **Search** box to help you find members. + + +Optional: Appoint a manager. Under the **Role** column, click the drop-down box and select **Manager**. + + +Click **Create**. + + + +### Create a subteam + + + +In the [**Teams** tab](https://semgrep.dev/orgs/-/settings/access/teams), click **Add subteam** next to the name of the top-level team you want to create a subteam for. The **Create new subteam** form appears. + + +Enter a **Name** for the subteam. + + +The **Projects** tab opens. Click the **checkbox** next to the names of the projects you want to give access to. You can also use the **Search** box or **tags** to help you find projects. + + +Click the **Users** tab, then click the **checkbox** next to the names of the team members you want to add. You can also use the **Search** box to help you find members. + + +Optional: Appoint a manager. Under the **Role** column, click the drop-down box and select **Manager**. + + +Click **Create**. + + + + +**INFO** + +- You must have at least one team before you can create a subteam. +- In subteams, you can add members who are not part of the top-level team.
+ + +## Manage your teams + +### Update an existing team or subteam + + + +In the [**Teams** tab](https://semgrep.dev/orgs/-/settings/access/teams), click the **edit** icon on the row of the team or subteam you want to edit. + + +Make your changes. + + +Click **Review > Save changes**. + + + +### Delete a team or subteam + + + +If you are deleting a team, delete its subteams first. + + i. In the [**Teams** tab](https://semgrep.dev/orgs/-/settings/access/teams), click the **down arrow** to show all subteams under a team, then follow steps 2-3. + + +Click the **trash can** icon. + + +Click **Delete** to confirm. + + + +### Appoint a manager + +To set a member as a manager for a team or subteam: + + + +In the [**Teams** tab](https://semgrep.dev/orgs/-/settings/access/teams), click the **edit** icon on the row of the team or subteam you want to edit. + + +Click on the **Users** tab. + + +Under the **Role** column of the member you want to appoint, click the drop-down box and select **Manager**. Perform this step for all members you want to set as managers. + + +Click **Review**. + + +Click **Save changes**. + + + +#### View and edit subteams + + +**INFO** + +This feature is currently in invite-only beta. Please contact [Semgrep Support](/support) for more information. + + + + +In the [**Teams** tab](https://semgrep.dev/orgs/-/settings/access/teams), click the **edit** icon on the row of the team or subteam you want to edit. + + +Find the team to which the subteam should be added. Click **Add subteam**. + + +Provide a **Team name**. Click **Add projects**. + + +Select one or more projects to add to the subteam. Click **Add members**. + + +Select one or more users to add to the subteam. Click **Review**. + + +Review the changes you have made. If they look correct, click **Create team** to proceed. + + + +Managers can view their subteams by going to the **Settings > Access > Teams** tab.
Within this tab, they are also able to assign any of the projects they manage from one subteam to another. + +Note that this feature allows managers to view **all projects** in the **Edit teams** panel, including projects they are not assigned to. However, they cannot perform admin-level actions on those projects, such as assigning projects they are not designated to manage. + +### Filter findings for a team's projects + + + +Navigate to the **Findings** page. + + +Click the **Teams** filter. This filter displays teams you have access to. + + +Select the teams you want to see findings for. + + diff --git a/mintlify-docs/deployment/teams/overview.mdx b/mintlify-docs/deployment/teams/overview.mdx new file mode 100644 index 0000000000..5368bc4738 --- /dev/null +++ b/mintlify-docs/deployment/teams/overview.mdx @@ -0,0 +1,145 @@ +--- +title: "Manage user access to projects" +sidebarTitle: "Teams and users" +description: "Basic access control, which determines which users can manage Semgrep resources such as scans, projects, and findings, is managed in Semgrep AppSec Platform. This allows you to configure different levels of collaboration and visibility for users in your organization with access to Semgrep." +--- + + +Semgrep primarily divides users into three roles: + +- **Admin** +- **Member** +- **Read-only** + +Optionally, you can appoint members to a fourth role: the **manager** role. Managers are a subset of members with some additional capabilities and scopes. In particular, they are able to assign specific projects to members through the creation of [teams](#teams-beta). + +### User permissions and visibility + +**Admins** have full permissions, scopes, and visibility into all aspects of Semgrep. + +**Members** can *edit* the following page in Semgrep AppSec Platform: + +- **Findings**: They can view **all projects** in the Findings page, and can sort and triage findings. 
+ +**Members** can *view* the following pages in Semgrep AppSec Platform: + +- **Dashboard**: They are able to see the total count of findings for all projects in the org. +- **Editor**: They can view an org's rules, but they can't write rules for the org. They can still write rules for their personal Semgrep orgs. +- **Registry**: They can view, but not add, rules and rule packs. +- **Docs**: Anyone can view the docs. + +**Members** *cannot view or perform any actions* on the following pages: + +- **Policies** +- **Projects** +- **Settings** + +## Teams (beta) + +The **Teams** feature enables admins to grant or limit access to **specific projects** in Semgrep AppSec Platform. This provides more granular control than the [**Users**](#user-permissions-and-visibility) feature alone. Teams helps security engineers and developers in large organizations focus on projects relevant to their specific department or team. + +You can quickly assign projects to large groups of users by first assigning users to teams and subteams within your organization. Once you've limited a user's access to a subset of your projects, their **Dashboard** and **Findings** pages all reflect that change. For example, their finding count is based on the total number of findings in the projects they can access. + +## Roles and access + +The Teams feature extends the existing roles defined in the **Users** tab. + +- **Admin** + - A user who has access to all features, resources, and projects of their Semgrep deployment. Admins can also change the role of members and managers. + - When creating teams, admins are automatically included in all teams and can't be removed from any team. The access of an admin cannot be restricted except by making them a member. + - An org admin can change the role of any other user, including a fellow admin. +- **Member** + - A user who has access to some features, resources, and projects of their Semgrep deployment. 
+ - To grant members access to a project and its findings, you must add the members to a team, and that team must be assigned to the project. + - Members can scan their local or personal repositories through a personal account. + - Members can also be assigned as **Managers** within a team. +- **Read-only** + - A user who can only view projects and issues of their Semgrep deployment. +- **Manager** + - A member who can grant access to projects by creating subteams and assigning members to these subteams. + - A manager role is restricted to the teams where they have been assigned as a manager. Users can be managers of some projects, but members for others. For more information, see [the manager role](#the-manager-role). + +### Page and feature access per role + +| Page | Read-only | Member | Manager | Admin | +| :--- | :--- | :--- | :--- | :--- | +| **Dashboard** | ⚠️ Restricted. Scope is limited based on team assignments and the project access granted to those teams. | ⚠️ Restricted. Scope is limited based on team assignments and the project access granted to those teams. | ⚠️ Restricted. Scope is limited based on team assignments and the project access granted to those teams. | ✅ Yes | +| **Projects** | ⚠️ Restricted. Projects assigned to teams are visible to users assigned to those teams. | ⚠️ Restricted. Projects assigned to teams are visible to users assigned to those teams. | ⚠️ Restricted. Projects assigned to teams are visible to users assigned to those teams. | ✅ Yes. Admins can see all projects. | +| **Findings** | ⚠️ Restricted. Read-only users can perform no triage operations. | ⚠️ Restricted. Members can perform all triage operations on Projects assigned to them. | ⚠️ Restricted. Managers can perform all triage operations on Projects assigned to them. | ✅ Yes | +| **Policies** | ❌ No | ❌ No | ❌ No | ✅ Yes. Only admins can view and edit policies. | +| **Editor** | ❌ No | 👁️ Read-only. 
Members can view all rules of an organization, but can't edit or create their own. They can create their own rules in their personal account. | 👁️ Read-only. Managers can view all rules of an organization, but can't edit or create their own. They can create their own rules in their personal account. | ✅ Yes | +| **Settings** | ❌ No | ❌ No | ⚠️ Restricted. Managers can see the **Access** and **Account** subpages. On the **Access** page, they can edit the subteams to which they are assigned as manager. | ✅ Yes | + +### Operations permitted per role + +| Capability | Read-only | Member | Manager | Admin | Notes | +| :--- | :--- | :--- | :--- | :--- | :--- | +| Create or edit projects | ❌ No | ⚠️ Restricted | ⚠️ Restricted | ✅ Yes | | +| Change policies | ❌ No | ❌ No | ❌ No | ✅ Yes | | +| Triage findings | ❌ No | ⚠️ Restricted | ⚠️ Restricted | ✅ Yes | Members can perform all triage operations on Projects assigned to them. | +| Assign roles | ❌ No | ❌ No | ❌ No | ✅ Yes | | +| Create or edit teams | ❌ No | ❌ No | ❌ No | ✅ Yes | | +| Create or edit subteams | ❌ No | ❌ No | ✅ Yes | ✅ Yes | | +| Delete teams | ❌ No | ❌ No | ❌ No | ✅ Yes | | +| Delete subteams | ❌ No | ❌ No | ✅ Yes | ✅ Yes | A manager can delete a subteam they are assigned to manage, as long as no resources, such as projects, are assigned to that subteam. | +| API | ❌ No | ❌ No | ❌ No | ✅ Yes | | + + +**INFO** + +Members and managers can create projects by scanning a repository using the Semgrep CLI tool, but they can't access the project related to the repository in Semgrep AppSec Platform unless an admin provides them explicit access to the project. 
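The operations table above can be read as a lookup from a role and an operation to an access level. The following is a minimal sketch of that lookup, assuming nothing beyond the table itself. The names `ROLES`, `OPERATIONS`, and `permission` are hypothetical and are not part of Semgrep's API or CLI.

```python
# Illustrative encoding of the "Operations permitted per role" table.
# ROLES, OPERATIONS, and permission are hypothetical names chosen for
# this sketch; they do not correspond to any Semgrep API.
ROLES = ("read-only", "member", "manager", "admin")

# Each operation maps to one access level per role, in ROLES order.
OPERATIONS = {
    "create_or_edit_projects": ("no", "restricted", "restricted", "yes"),
    "change_policies": ("no", "no", "no", "yes"),
    "triage_findings": ("no", "restricted", "restricted", "yes"),
    "assign_roles": ("no", "no", "no", "yes"),
    "create_or_edit_teams": ("no", "no", "no", "yes"),
    "create_or_edit_subteams": ("no", "no", "yes", "yes"),
    "delete_teams": ("no", "no", "no", "yes"),
    "delete_subteams": ("no", "no", "yes", "yes"),
    "api": ("no", "no", "no", "yes"),
}


def permission(role: str, operation: str) -> str:
    """Return "yes", "restricted", or "no" for the given role and operation."""
    return OPERATIONS[operation][ROLES.index(role)]
```

For instance, `permission("manager", "delete_subteams")` evaluates to `"yes"`, while `permission("member", "create_or_edit_subteams")` evaluates to `"no"`. A "restricted" result still depends on team and project assignments, which this sketch does not model.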
+ + +### Semgrep Multimodal features permitted per role + +| Feature | Read-only | Member | Manager | Admin | +| :--- | :--- | :--- | :--- | :--- | +| Add a memory | ❌ No | ❌ No | ❌ No | ✅ Yes | +| Receive weekly priority emails | ❌ No | ❌ No | ❌ No | ✅ Yes | +| Add a memory during triage | ❌ No | ❌ No | ❌ No | ✅ Yes | + +## How team access works + +- Members of a top-level team gain access to the projects of its subteams. They are indirect members of those subteams. +- Members of a subteam do not have access to the projects of teams or subteams above it. + +In the following diagram, team 1 gains access to subteam 1b's projects, but subteam 1b does not gain access to projects from team 1. + + + ![Team scopes diagram](/images/access-diagram-038e7132f085cf2adbc652f67ad56477.png) + + +- The members Alexis, Pam, and Raj have access to the following projects: + - App + - Microservices + - Frontend +- The members David, Sebas, and Phaedra have access to the following projects: + - Frontend + +If a user is assigned **read-only** access to the deployment but needs to be a member of specific teams, you must modify the user's roles at the team level to ensure that they can access those projects. + +### The manager role + +Use the **manager role** to delegate the assignment of projects across many users. Managers can speed up the deployment of Semgrep into your organization by creating subteams to grant members access to projects. + +Given a security engineer who is a manager of **team A** but a member of **team B**, with both teams having the same projects: + +- The security engineer has manager **access** to the projects. +- The security engineer can create subteams for team A but can't create subteams for team B. + +Additionally, managers can do the following: + +- Scan, including managed scans on new projects through the **Projects** page. +- Edit projects that their team is assigned to. + +Managers cannot remove themselves from their team. 
Admins and co-managers of the same team or subteam can remove other managers. + +Managers can view and assign any of the projects they manage from one subteam to another at any time. + +For example, if Bob is a manager of `Team A` (assigned to projects `Foo` and `Bar`) and `Team B` (assigned to project `Baz`), Bob has access to all three projects: `Foo`, `Bar`, and `Baz`. Bob can also assign `Baz` to `Team A`. + +## Tips for creating teams and subteams + +- **Assign projects to only one team.** +- **Use subteams to grant access to a specific department's repositories**: Create a top-level team for managers or security engineers in your organization who have broad access to a variety of repositories, then create subteams for members to grant them limited access to their specific department's repositories. +- **Use flat teams to grant access to central projects that are used by a broad group of developers**: It is best to create a separate flat team, without any subteams, and grant the users access to foundational or central repositories from that team. For example, a team that covers projects all engineers commit to can be named the Engineering Team. diff --git a/mintlify-docs/deployment/tokens.mdx b/mintlify-docs/deployment/tokens.mdx new file mode 100644 index 0000000000..431622c985 --- /dev/null +++ b/mintlify-docs/deployment/tokens.mdx @@ -0,0 +1,134 @@ +--- +title: "Access tokens" +sidebarTitle: "Tokens" +description: "An access token is a secure credential used to authorize requests to Semgrep AppSec Platform or the Semgrep API without a username and password. Each token is associated with a specific Semgrep account and has a defined set of [scopes](#token-scopes) that determine the permissions granted to its bearer." 
+--- + +## Types of access tokens + +Semgrep uses the following types of access tokens: + +- API tokens +- CLI tokens +- Service tokens + +### API tokens + +API tokens can be created by admins and are used for calls to the Semgrep API and to set up third-party integrations. For auditing purposes, API tokens are associated with the user who created them. However, they remain valid until manually revoked, even if the creator is no longer associated with the deployment. + +### CLI tokens + +CLI tokens authenticate users who run scans or publish rules from the Semgrep CLI. Both members and admins of a deployment can create CLI tokens. The CLI token allows users to run scans on their local machine using the `semgrep ci` command. This sends findings data to Semgrep AppSec Platform. It also allows users to [publish rules](/writing-rules/private-rules#creating-private-rules) using `semgrep publish`. + +For auditing purposes, Semgrep [records the user who generated the CLI token](https://semgrep.dev/orgs/-/settings/tokens/cli), but the user's actions are attributed to the token rather than the user. + +Logging out of the Semgrep CLI with `semgrep logout` removes the local token, but it does not invalidate it. + +### Service tokens + +Service tokens are functionally the same as API tokens, but instead of being manually generated by a user, they are automatically generated during repository onboarding for CI/CD scans or when repositories are added to Semgrep AppSec Platform. These tokens authenticate agents running automated scans. The default scope for these tokens is Agent/CI, but admins can edit the token and grant them the API scope as well. 
+ +## Token scopes + +The following table displays the scopes assigned to each token: + +| Token | Send findings from a remote repository | Send findings from a local repository | Connect to Semgrep API | +| :--- | :--- | :--- | :--- | +| API | ❌ No | ❌ No | ✔️ Yes | +| CLI | ❌ No | ✔️ Yes | ❌ No | +| Service (CI) | ✔️ Yes | ✔️ Yes | ❌ No | + +The following table displays typical uses for token scopes: + +| Token | Use | +| :--- | :--- | +| API | Used to access Semgrep's API. | +| CLI | Auto-generated by Semgrep when a user logs in through the Semgrep CLI. Use this token to scan your code locally using your organization's configured policies, including private rules. | +| Service (CI) | Generated by Semgrep when onboarding (adding) a repository to Semgrep AppSec Platform. | + +## View and manage tokens + +You can view a list of tokens for your deployment [in Semgrep AppSec Platform under **Settings > Tokens**](https://semgrep.dev/orgs/-/settings/tokens). + +Each token type has its own page that lists all existing tokens of that type. Use the search bar to help find a specific token. + +For **API tokens**, you can use the drop-down menu to view only those tokens associated with specific roles, such as **Admin** or **Member**. + +For **Service tokens**, you can use the drop-down menu to view tokens for specific services, such as **Semgrep Managed Scans**, **Autofix**, or **AI Scan**. + +### Create an API token + + + + Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + + Go to [**Settings > Tokens > API tokens**](https://semgrep.dev/orgs/-/settings/tokens/api). + + + Click **Create new token**. + + + Copy the **Secrets name** and the **Secrets value**, and save these values. The **Secrets value** is your token and is only shown at this time. + + + Select the **Token scopes**. + + + Optional: change the **Name** of the token. This is the value used in the list of tokens associated with your Semgrep deployment. + + + Click **Save** to proceed. 
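Once an API token exists, a request to the Semgrep API carries it as a credential. The following is a minimal sketch of building such a request, assuming bearer-token authentication and a `https://semgrep.dev/api/v1` base URL with a `/deployments` path; confirm both against the API reference before relying on them. The environment variable name `SEMGREP_API_TOKEN` and the helper `build_request` are hypothetical and chosen only for this illustration.

```python
import os
import urllib.request

# Assumed base URL for the Semgrep API; verify against the API reference.
API_BASE = "https://semgrep.dev/api/v1"


def build_request(token: str, path: str = "/deployments") -> urllib.request.Request:
    """Build a GET request that sends the token as a bearer credential.

    build_request is a hypothetical helper, not part of any Semgrep library.
    """
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    # SEMGREP_API_TOKEN is a hypothetical variable name for this sketch.
    token = os.environ.get("SEMGREP_API_TOKEN")
    if token:
        with urllib.request.urlopen(build_request(token)) as response:
            print(response.status)
```

Reading the token from the environment keeps the secret out of source control, which matters because the token value is only shown once at creation time.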
+ + + +### Create a CLI token + +Once you've [set up the Semgrep CLI](/getting-started/cli#set-up-semgrep), create a CLI token by running the following command: + +```bash +semgrep login +``` + +Running this command launches a browser window, but you can also use the link that's returned in the CLI to proceed. In the **Semgrep CLI login** window, click **Activate** to proceed. + +### Edit a token + + + + Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + + Go to **Settings > Tokens**. + + + Go to one of the following pages based on the type of token you're interested in: **API tokens**, **CLI tokens**, or **Semgrep service tokens**. + + + Find the token, and click **Edit**. + + + In the dialog that appears, change the **Token scopes** or the displayed **Name**. + + + Click **Save** to proceed. + + + +### Revoke a token + + + + Sign in to [Semgrep AppSec Platform](https://semgrep.dev/login). + + + Go to **Settings > Tokens**. + + + Go to one of the following pages based on the type of token you're interested in: **API tokens**, **CLI tokens**, or **Semgrep service tokens**. + + + Find the token, and click **Revoke**. 
+ + + diff --git a/mintlify-docs/docs.json b/mintlify-docs/docs.json new file mode 100644 index 0000000000..0a514b86f0 --- /dev/null +++ b/mintlify-docs/docs.json @@ -0,0 +1,999 @@ +{ + "$schema": "https://mintlify.com/docs.json", + "theme": "mint", + "name": "Semgrep", + "colors": { + "primary": "#008456", + "light": "#008456", + "dark": "#624DEF" + }, + "background": { + "color": { + "light": "#f8f9f9", + "dark": "#131a2c" + }, + "decoration": "gradient" + }, + "favicon": "/favicon.png", + "navigation": { + "tabs": [ + { + "tab": "Home", + "pages": ["index"] + }, + { + "tab": "Scan with Semgrep", + "groups": [ + { + "group": "Get started", + "pages": [ + "getting-started/quickstart", + "getting-started/quickstart-managed-scans", + "prerequisites", + "getting-started/scm-support", + { + "group": "Supported languages", + "root": "supported-languages", + "pages": [ + "languages/csharp", + "languages/go", + "languages/java", + "languages/javascript", + "languages/kotlin", + "languages/python", + "languages/ruby", + "languages/scala", + "languages/swift" + ] + }, + { + "group": "Local and CLI scans", + "root": "category/local-and-cli-scans", + "pages": [ + "getting-started/cli", + "running-rules", + "update", + "deployment/local-to-scp-scans", + "troubleshooting/semgrep" + ] + } + ] + }, + { + "group": "Set up and deploy scans", + "pages": [ + { + "group": "Core deployment", + "root": "deployment/core-deployment", + "pages": [ + "deployment/checklist", + "deployment/create-account-and-orgs", + { + "group": "Connect a source code manager", + "root": "deployment/connect-scm", + "pages": ["semgrep-appsec-platform/scm-code-access"] + }, + "deployment/sso", + { + "group": "Scan repositories with the AppSec Platform", + "root": "category/scan-repositories-with-the-appsec-platform", + "pages": [ + { + "group": "Managed Scans", + "root": "deployment/managed-scanning/overview", + "pages": [ + "deployment/managed-scanning/azure", + "deployment/managed-scanning/bitbucket", + 
"deployment/managed-scanning/github", + "deployment/managed-scanning/gitlab" + ] + }, + { + "group": "AI-powered detection", + "root": "semgrep-code/ai-powered-detection-concepts", + "pages": ["deployment/add-ai-to-scans"] + }, + "deployment/add-semgrep-to-ci", + "deployment/add-semgrep-to-other-ci-providers", + "deployment/customize-ci-jobs", + "semgrep-ci/configuring-blocking-and-errors-in-ci", + { + "group": "Configuring SCA scans", + "root": "semgrep-supply-chain/setup-infrastructure", + "pages": ["semgrep-supply-chain/setup-maven"] + }, + "deployment/manage-projects", + "deployment/primary-branch", + "troubleshooting/semgrep-app" + ] + }, + { + "group": "PR or MR comments", + "root": "category/pr-or-mr-comments", + "pages": [ + "semgrep-appsec-platform/azure-pr-comments", + "semgrep-appsec-platform/github-pr-comments", + "semgrep-appsec-platform/gitlab-mr-comments", + { + "group": "Bitbucket PR comments", + "root": "category/bitbucket-pr-comments", + "pages": [ + "semgrep-appsec-platform/bitbucket-cloud-pr-comments", + "semgrep-appsec-platform/bitbucket-data-center-pr-comments" + ] + }, + { + "group": "Customize core deployment", + "root": "deployment/beyond-core-deployment", + "pages": [ + { + "group": "Ignore files, folders, and code", + "pages": [ + "ignoring-files-folders-code", + "semgrep-supply-chain/ignoring-dependencies" + ] + }, + "semgrep-code/semgrep-pro-engine-intro", + "semgrep-code/policies", + "semgrep-supply-chain/license-compliance", + "writing-rules/overview-1" + ] + } + ] + } + ] + }, + { + "group": "Deployment at scale", + "root": "category/deployment-at-scale", + "pages": [ + { + "group": "Teams and users", + "root": "deployment/teams/overview", + "pages": ["deployment/teams/manage"] + }, + "deployment/tokens", + "semgrep-appsec-platform/tags", + "semgrep-ci/network-broker" + ] + }, + { + "group": "Secure guardrails", + "root": "secure-guardrails/secure-guardrails-in-semgrep", + "pages": [ + "secure-guardrails/secure-defaults", + 
"secure-guardrails/custom-guardrails-rules" + ] + }, + { + "group": "Notifications", + "root": "semgrep-appsec-platform/notifications", + "pages": [ + "semgrep-appsec-platform/slack-notifications", + "semgrep-appsec-platform/email-notifications", + "semgrep-appsec-platform/webhooks" + ] + }, + "semgrep-appsec-platform/dashboard", + { + "group": "Extensions", + "root": "extensions/overview", + "pages": [ + "extensions/semgrep-vs-code", + "extensions/semgrep-intellij", + "extensions/pre-commit" + ] + }, + { + "group": "Integrations", + "pages": [ + "guardian", + "semgrep-appsec-platform/cortex", + "semgrep-appsec-platform/jira", + "semgrep-appsec-platform/sysdig", + "semgrep-appsec-platform/wiz" + ] + } + ] + }, + { + "group": "Scan and triage", + "pages": [ + { + "group": "SAST (Code)", + "pages": [ + "semgrep-code/overview", + { + "group": "AI-powered detection", + "root": "semgrep-code/ai-powered-detection-concepts", + "pages": ["deployment/add-ai-to-scans"] + }, + { + "group": "View findings", + "root": "semgrep-code/findings", + "pages": ["semgrep-code/finding-details"] + }, + { + "group": "Triage and remediation", + "root": "semgrep-code/triage-remediation", + "pages": ["semgrep-code/triage-remediation/autofix"] + }, + { + "group": "Manage rules and policies", + "root": "semgrep-code/policies", + "pages": ["semgrep-code/pro-rules", "semgrep-code/editor"] + }, + { + "group": "Perform cross-file analysis", + "root": "semgrep-code/semgrep-pro-engine-intro", + "pages": ["semgrep-code/semgrep-pro-engine-examples"] + }, + "semgrep-code/remove-duplicates" + ] + }, + { + "group": "SCA (Supply Chain)", + "pages": [ + "semgrep-supply-chain/overview", + { + "group": "Coverage", + "pages": [ + "semgrep-supply-chain/sca-package-manager-support", + "semgrep-supply-chain/sca-feature-support" + ] + }, + { + "group": "Open source security vulnerabilities", + "root": "semgrep-supply-chain/getting-started", + "pages": [ + { + "group": "View findings", + "root": 
"semgrep-supply-chain/findings", + "pages": ["semgrep-supply-chain/finding-details"] + }, + "semgrep-supply-chain/triage-and-remediation", + "semgrep-supply-chain/advisories", + "semgrep-supply-chain/policies", + "semgrep-supply-chain/ignoring-deps" + ] + }, + "semgrep-supply-chain/sbom", + "semgrep-supply-chain/dependency-search", + "semgrep-supply-chain/license-compliance", + "semgrep-supply-chain/malicious-dependencies" + ] + }, + { + "group": "Secrets", + "pages": [ + "semgrep-secrets/conceptual-overview", + { + "group": "Scan for secrets", + "root": "semgrep-secrets/getting-started", + "pages": [ + "semgrep-secrets/historical-scanning", + "semgrep-secrets/generic-secrets" + ] + }, + { + "group": "View findings", + "root": "semgrep-secrets/findings", + "pages": ["semgrep-secrets/finding-details"] + }, + "semgrep-secrets/triage-remediation", + "semgrep-secrets/policies", + "semgrep-secrets/rules-1", + "semgrep-secrets/validators-1", + "semgrep-secrets/glossary" + ] + } + ] + }, + { + "group": "Semgrep Multimodal", + "pages": [ + { + "group": "Overview", + "root": "semgrep-multimodal/overview", + "pages": ["semgrep-multimodal/metrics"] + }, + { + "group": "Getting started", + "root": "semgrep-multimodal/getting-started", + "pages": [ + "semgrep-multimodal/customize", + "semgrep-multimodal/best-practices-for-memories" + ] + }, + "semgrep-multimodal/analyze", + "semgrep-multimodal/privacy" + ] + }, + { + "group": "Semgrep Community Edition", + "pages": [ + { + "group": "Get started", + "root": "getting-started/quickstart-ce", + "pages": ["customize-semgrep-ce"] + }, + "semgrep-ce-languages", + "deployment/oss-deployment", + { + "group": "About Semgrep CE", + "pages": [ + "contributing/semgrep-philosophy-1", + "semgrep-pro-vs-oss-1", + "faq/comparisons/opengrep-1" + ] + } + ] + }, + { + "group": "References", + "pages": [ + { + "group": "CI references", + "root": "category/ci-references-1", + "pages": [ + "semgrep-ci/ci-environment-variables-1", + 
"semgrep-ci/sample-ci-configs-1", + "semgrep-ci/findings-ci-1", + "semgrep-ci/packages-in-semgrep-docker-1" + ] + }, + { + "group": "Language reference", + "root": "category/language-reference", + "pages": [ + "references/language-maturity-levels", + "references/feature-definitions", + "semgrep-code/java" + ] + }, + { + "group": "Glossaries", + "root": "category/glossaries-1", + "pages": [ + "semgrep-code/glossary", + "semgrep-supply-chain/glossary" + ] + }, + "semgrepignore-v2-reference", + "cli-reference", + "semgrep-appsec-platform/json-and-sarif" + ] + } + ] + }, + { + "tab": "Write rules", + "groups": [ + { + "group": "Write rules for Semgrep Code", + "pages": [ + "writing-rules/overview", + { + "group": "Rule structure syntax", + "root": "writing-rules/rule-syntax", + "pages": ["writing-rules/rule-ideas"] + }, + { + "group": "Rule pattern syntax", + "root": "writing-rules/pattern-syntax", + "pages": ["writing-rules/pattern-examples"] + }, + { + "group": "Advanced rule-writing techniques", + "pages": [ + "writing-rules/rule-defined-fix", + "writing-rules/data-flow/data-flow-overview", + "writing-rules/data-flow/constant-propagation", + { + "group": "Taint analysis", + "root": "writing-rules/data-flow/taint-mode/overview", + "pages": ["writing-rules/data-flow/taint-mode/advanced"] + }, + "writing-rules/data-flow/status", + "writing-rules/generic-pattern-matching", + "writing-rules/metavariable-analysis", + { + "group": "Experiments 🧪", + "pages": [ + "writing-rules/experiments/introduction", + "writing-rules/experiments/pattern-syntax", + "writing-rules/experiments/aliengrep", + { + "group": "Join mode", + "root": "writing-rules/experiments/join-mode/overview", + "pages": [ + "writing-rules/experiments/join-mode/recursive-joins" + ] + }, + "writing-rules/experiments/symbolic-propagation", + "writing-rules/experiments/display-propagated-metavariable", + "writing-rules/experiments/multiple-focus-metavariables", + 
"writing-rules/experiments/r2c-internal-project-depends-on", + "writing-rules/experiments/metavariable-type", + "writing-rules/experiments/deprecated-experiments" + ] + } + ] + }, + "writing-rules/private-rules", + "writing-rules/testing-rules", + "troubleshooting/rules", + "writing-rules/glossary" + ] + }, + { + "group": "Write rules for Semgrep Secrets", + "pages": ["semgrep-secrets/rules", "semgrep-secrets/validators"] + } + ] + }, + { + "tab": "Learning guides", + "groups": [ + { + "group": "Application Security", + "pages": [ + "learn", + { + "group": "Security Foundations", + "root": "learn/security-foundations/overview", + "pages": [ + "learn/security-foundations/sast/overview", + "learn/security-foundations/supply-chain-security", + "learn/security-foundations/security-testing-workflow" + ] + }, + { + "group": "Vulnerabilities", + "root": "learn/vulnerabilities/overview", + "pages": [ + "learn/vulnerabilities/code-injection", + { + "group": "Command Injection", + "root": "learn/vulnerabilities/command-injection", + "pages": [ + "learn/vulnerabilities/command-injection/argo-injection", + "learn/vulnerabilities/command-injection/github-actions-injection" + ] + }, + "learn/vulnerabilities/cross-site-scripting", + "learn/vulnerabilities/insecure-deserialization", + "learn/vulnerabilities/idor", + "learn/vulnerabilities/open-redirect", + "learn/vulnerabilities/server-side-request-forgery", + "learn/vulnerabilities/sql-injection", + "learn/vulnerabilities/xml-security" + ] + } + ] + }, + { + "group": "Secure Coding", + "pages": [ + "cheat-sheets/overview", + { + "group": "Go", + "root": "category/go", + "pages": [ + "cheat-sheets/go-command-injection", + "cheat-sheets/go-xss" + ] + }, + { + "group": "Java", + "root": "category/java", + "pages": [ + "cheat-sheets/java-code-injection", + "cheat-sheets/java-command-injection", + "cheat-sheets/java-jsp-xss", + "cheat-sheets/java-xxe" + ] + }, + { + "group": "JavaScript", + "root": "category/javascript", + "pages": [ 
+ "cheat-sheets/javascript-code-injection", + "cheat-sheets/javascript-command-injection", + "cheat-sheets/express-xss" + ] + }, + { + "group": "Python", + "root": "category/python", + "pages": [ + "cheat-sheets/python-code-injection", + "cheat-sheets/python-command-injection", + "cheat-sheets/django-xss", + "cheat-sheets/flask-xss", + "learn/vulnerabilities/insecure-deserialization/python" + ] + }, + { + "group": "Ruby", + "root": "category/ruby", + "pages": [ + "cheat-sheets/ruby-code-injection", + "cheat-sheets/ruby-command-injection", + "cheat-sheets/rails-xss" + ] + } + ] + } + ] + }, + { + "tab": "API", + "groups": [ + { + "group": " ", + "pages": [ + "api-reference/Introduction", + "api-reference/Authentication", + "api-reference/Terms-of-Use", + { + "group": "Deployment", + "root": "api-reference/DeploymentsService", + "pages": ["api-reference/deploymentsservice/list-deployments"] + }, + { + "group": "Code, Supply Chain, and AI-Powered Scan", + "root": "api-reference/FindingsService", + "pages": [ + "api-reference/findingsservice/list-code-or-supply-chain-findings" + ] + }, + { + "group": "Other", + "root": "api-reference/MiscService", + "pages": [ + "api-reference/miscservice/[beta]-get-sms-vpc-bootstrap-cloudformation-template", + "api-reference/miscservice/ping" + ] + }, + { + "group": "Policies", + "root": "api-reference/PoliciesService", + "pages": [ + "api-reference/policiesservice/list-policies", + "api-reference/policiesservice/list-policy-rules", + "api-reference/policiesservice/update-policy" + ] + }, + { + "group": "Projects", + "pages": [ + "api-reference/projectsservice/list-all-projects", + "api-reference/projectsservice/delete-project", + "api-reference/projectsservice/get-project-details", + "api-reference/projectsservice/update-project-details", + "api-reference/projectsservice/toggle-managed-scans-for-a-project", + "api-reference/projectsservice/remove-tags-from-project", + "api-reference/projectsservice/add-tags-to-project" + ] + }, + { + 
"group": "Scans", + "root": "api-reference/ScansService", + "pages": [ + "api-reference/scansservice/get-scan-details", + "api-reference/scansservice/list-scans-beta" + ] + }, + { + "group": "Secrets", + "root": "api-reference/SecretsService", + "pages": ["api-reference/secretsservice/list-secrets"] + }, + { + "group": "Supply Chain", + "root": "api-reference/SupplyChainService", + "pages": [ + "api-reference/supplychainservice/list-dependencies", + "api-reference/supplychainservice/list-repositories-with-dependencies", + "api-reference/supplychainservice/list-lockfiles-in-a-given-repository-with-dependencies", + "api-reference/supplychainservice/create-a-new-sbom-export-job", + "api-reference/supplychainservice/get-the-status-of-a-sbom-export-job" + ] + }, + { + "group": "Ticketing", + "root": "api-reference/TicketingService", + "pages": [ + "api-reference/ticketingservice/unlink-a-jira-ticket", + "api-reference/ticketingservice/create-jira-tickets" + ] + }, + { + "group": "Triage", + "root": "api-reference/TriageService", + "pages": ["api-reference/triageservice/bulk-triage"] + } + ] + } + ] + }, + { + "tab": "Help", + "dropdowns": [ + { + "dropdown": "Knowledge base", + "icon": "book", + "groups": [ + { + "group": "Knowledge base", + "root": "kb", + "pages": [ + { + "group": "Semgrep Multimodal", + "root": "kb/semgrep-multimodal", + "pages": [ + "kb/semgrep-multimodal/azure-openai-error-429", + "kb/semgrep-multimodal/missing-pr-mr-comments" + ] + }, + { + "group": "Semgrep Code", + "root": "kb/semgrep-code", + "pages": [ + "kb/semgrep-code/InvalidHeaderValue", + "kb/semgrep-code/collect-cli-logs", + "kb/semgrep-code/finding_all_taints", + "kb/semgrep-code/gitlab-group-variables", + "kb/semgrep-code/support-for-language-versions", + "kb/semgrep-code/reduce-false-positives", + "kb/semgrep-code/run-specific-version", + "kb/semgrep-code/scan-engine-kill", + "kb/semgrep-code/semgrep-scan-troubleshooting", + "kb/semgrep-code/semgrepignore-ignored", + 
"kb/semgrep-code/unexpected-new-findings" + ] + }, + { + "group": "Semgrep Supply Chain (SSC)", + "root": "kb/semgrep-supply-chain", + "pages": [ + "kb/semgrep-supply-chain/exclude-rule", + "kb/semgrep-supply-chain/incident-response", + "kb/semgrep-supply-chain/scanning_multiple_lockfiles", + "kb/semgrep-supply-chain/ssc-lockfiles-circleci", + "kb/semgrep-supply-chain/ssc-python-lockfiles", + "kb/semgrep-supply-chain/why-no-findings" + ] + }, + { + "group": "Semgrep AppSec Platform", + "root": "kb/semgrep-appsec-platform", + "pages": [ + "kb/semgrep-appsec-platform/act-on-your-behalf", + "kb/semgrep-appsec-platform/api-404-token-scope", + "kb/semgrep-appsec-platform/automate-rules-deployment", + "kb/semgrep-appsec-platform/cannot-access-semgrep-after-github-login", + "kb/semgrep-appsec-platform/dependency-count-differ-platform", + "kb/semgrep-appsec-platform/error-externally-managed-environment", + "kb/semgrep-appsec-platform/fedramp-with-semgrep", + "kb/semgrep-appsec-platform/findings-count-differ-api-platform", + "kb/semgrep-appsec-platform/findings-count-differ-platform", + "kb/semgrep-appsec-platform/inline-pr-comments", + "kb/semgrep-appsec-platform/missing-pr-comments", + "kb/semgrep-appsec-platform/no-runs-in-github-merge-queues", + "kb/semgrep-appsec-platform/projects-not-yet-started-sms", + "kb/semgrep-appsec-platform/remove-users", + "kb/semgrep-appsec-platform/rerun-managed-scans", + "kb/semgrep-appsec-platform/saml-attributestatement", + "kb/semgrep-appsec-platform/saml-authentication-method-match", + "kb/semgrep-appsec-platform/saml-bad-signature", + "kb/semgrep-appsec-platform/saml-google-workspace", + "kb/semgrep-appsec-platform/saml-microsoft-entra-id", + "kb/semgrep-appsec-platform/saml-stops-working", + "kb/semgrep-appsec-platform/scan-duration-discrepancy", + "kb/semgrep-appsec-platform/search-filter-sort-findings", + "kb/semgrep-appsec-platform/semgrep-login-cli-tenant", + "kb/semgrep-appsec-platform/sso-attribute-error" + ] + }, + { + "group": 
"Semgrep Secrets", + "root": "kb/semgrep-secrets", + "pages": [ + "kb/semgrep-secrets/no-example-secrets-found", + "kb/semgrep-secrets/per-product-ignore-not-working" + ] + }, + { + "group": "Semgrep in CI", + "root": "kb/semgrep-ci", + "pages": [ + "kb/semgrep-ci/azure-self-hosted-ubuntu", + "kb/semgrep-ci/azure-using-templates-with-semgrep", + "kb/semgrep-ci/bitbucket-jenkins", + "kb/semgrep-ci/ci-vs-cli", + "kb/semgrep-ci/collect-gha-logs", + "kb/semgrep-ci/collect-gitlab-logs", + "kb/semgrep-ci/git-command-errors", + "kb/semgrep-ci/github-repository-rulesets-semgrep", + "kb/semgrep-ci/github-reusable-workflows-semgrep", + "kb/semgrep-ci/github-upload-findings-in-security-dashboard", + "kb/semgrep-ci/jenkins-diff-scans", + "kb/semgrep-ci/mr-comments-through-gitlab-runner", + "kb/semgrep-ci/new-scm-connections", + "kb/semgrep-ci/scan-compressed-files-artifacts", + "kb/semgrep-ci/scan-monorepo-in-parts", + "kb/semgrep-ci/semaphore-pipelines", + "kb/semgrep-ci/trigger-diff-scans-env-var", + "kb/semgrep-ci/upload-ci-findings-to-github", + "kb/semgrep-ci/upload-ci-findings-to-gitlab", + "kb/semgrep-ci/using-nonroot-docker-image-with-gha", + "kb/semgrep-ci/why-duplicate-findings" + ] + }, + { + "group": "Integrations", + "root": "kb/integrations", + "pages": [ + "kb/integrations/customize-semgrep-precommit", + "kb/integrations/defect-dojo-integration", + "kb/integrations/pagination" + ] + }, + { + "group": "Rules", + "root": "kb/rules", + "pages": [ + "kb/rules/changing-rule-severity-and-other-metadata", + "kb/rules/ellipsis-metavariables", + "kb/rules/exclude_rule_for_certain_filetypes", + "kb/rules/match-absence", + "kb/rules/match-comments", + "kb/rules/pattern-parse-error", + "kb/rules/pro-vs-community-secrets-vs-code-rules", + "kb/rules/rule-file-perf-principles", + "kb/rules/ruleset-default-mode", + "kb/rules/run-all-available-rules", + "kb/rules/understand-severities", + "kb/rules/using-pattern-not-inside", + "kb/rules/using-semgrep-rule-schema-in-vscode" + ] + 
} + ] + } + ] + }, + { + "dropdown": "Support", + "icon": "headset", + "groups": [ + { + "group": "Support & resources", + "pages": [ + "support", + { + "group": "Usage and billing", + "root": "usage-and-billing/overview", + "pages": [ + "deployment/claim-a-license", + "usage-and-billing/plan-changes-and-payments", + "usage-and-billing/contributor-count-explained", + "usage-and-billing/reconciliation" + ] + }, + "licensing", + { + "group": "Compliance", + "root": "compliance/compliance-overview", + "pages": [ + "compliance/fedramp", + "compliance/gdpr", + "compliance/hipaa-hitrust", + "compliance/iso27001", + "compliance/iso-27017", + "compliance/nist-800-171", + "compliance/pci-dss", + "compliance/soc2" + ] + }, + "trophy-case", + "run-a-successful-pov", + "metrics-1", + "security", + { + "group": "Contribute to Semgrep", + "pages": [ + "contributing/contributing", + "contributing/contributing-to-semgrep-rules-repository", + "contributing/contributing-code", + "contributing/semgrep-core-contributing", + "contributing/semgrep-contributing", + "contributing/adding-a-language", + "contributing/updating-a-grammar", + "contributing/troubleshooting" + ] + } + ] + } + ] + } + ] + }, + { + "tab": "Explore", + "dropdowns": [ + { + "dropdown": "What's Semgrep", + "icon": "sparkles", + "groups": [ + { + "group": "What's Semgrep", + "pages": [ + "introduction", + "faq/overview", + "run-a-successful-pov-1", + "semgrep-pro-vs-oss", + "contributing/semgrep-philosophy", + { + "group": "Comparisons with other tools", + "pages": [ + "faq/comparisons/codeql", + "faq/comparisons/endor-labs", + "faq/comparisons/opengrep", + "faq/comparisons/snyk", + "faq/comparisons/sonarqube" + ] + }, + "integrating", + "metrics" + ] + } + ] + }, + { + "dropdown": "For developers", + "icon": "code", + "pages": [ + "for-developers/overview", + "for-developers/signin", + { + "group": "Resolve findings", + "pages": [ + "for-developers/resolve-findings-through-comments", + 
"for-developers/resolve-findings-through-app" + ] + }, + { + "group": "Run scans", + "pages": ["for-developers/cli", "for-developers/ide"] + }, + { + "group": "References", + "pages": ["for-developers/detection"] + } + ] + }, + { + "dropdown": "References", + "icon":"code-simple", + "pages":[ + { + "group": "CI references", + "root": "category/ci-references", + "pages": [ + "semgrep-ci/ci-environment-variables", + "semgrep-ci/sample-ci-configs", + "semgrep-ci/findings-ci", + "semgrep-ci/packages-in-semgrep-docker" + ] + }, + { + "group": "Language-specific features", + "root": "category/language-specific-features", + "pages": [ + "semgrep-code/java" + ] + }, + { + "group": "Glossaries", + "root": "category/glossaries", + "pages": [ + "semgrep-code/glossary", + "semgrep-supply-chain/glossary" + ] + }, + "references/language-maturity-levels", + "references/feature-definitions" + ] + }, + { + "dropdown": "Release notes", + "icon": "bullhorn", + "groups": [ + { + "group": "Most recent posts", + "pages": [ + "release-notes", + { + "group": "2026", + "pages": [ + "release-notes/may-2026", + "release-notes/april-2026", + "release-notes/march-2026", + "release-notes/february-2026", + "release-notes/january-2026", + "release-notes/december-2025" + ] + }, + { + "group": "2025", + "pages": [ + "release-notes/november-2025", + "release-notes/october-2025", + "release-notes/september-2025", + "release-notes/august-2025", + "release-notes/july-2025", + "release-notes/june-2025", + "release-notes/may-2025", + "release-notes/april-2025" + ] + } + ] + } + ] + } + ] + } + ], + "global": { + "anchors": [ + { + "anchor": "Blog", + "href": "https://semgrep.dev/blog", + "icon": "newspaper" + }, + { + "anchor": "Academy", + "href": "https://academy.semgrep.dev", + "icon": "graduation-cap" + } + ] + } + }, + "logo": { + "light": "logo/light.svg", + "dark": "logo/dark.svg" + }, + "navbar": { + "links": [ + { + "label": "Login", + "href": "https://semgrep.dev/orgs/-" + } + ], + "primary": { 
+ "type": "github", + "href": "https://github.com/semgrep/semgrep" + } + }, + "contextual": { + "options": [ + "copy", + "view", + "chatgpt", + "claude", + "perplexity", + "mcp", + "cursor", + "vscode" + ] + }, + "footer": { + "socials": { + "x": "https://x.com/semgrep", + "github": "https://github.com/semgrep/semgrep", + "linkedin": "https://www.linkedin.com/company/semgrep" + } + } +} diff --git a/mintlify-docs/extensions/overview.mdx b/mintlify-docs/extensions/overview.mdx new file mode 100644 index 0000000000..9c329aaf96 --- /dev/null +++ b/mintlify-docs/extensions/overview.mdx @@ -0,0 +1,36 @@ +--- +title: "Extensions" +description: "Several third-party tools include Semgrep extensions." +--- + + +## Official IDE extensions + +| Name | Marketplace link | Documentation | +| :--- | :--- | :--- | +| Microsoft Visual Studio Code | [`semgrep-vscode`](https://marketplace.visualstudio.com/items?itemName=semgrep.semgrep) | [Semgrep VS Code extension](/extensions/semgrep-vs-code) | +| IntelliJ IDEA Ultimate and many other JetBrains products | [`semgrep-intellij`](https://plugins.jetbrains.com/plugin/22622-semgrep) | [Semgrep IntelliJ extension](/extensions/semgrep-intellij) | +| Emacs | [`lsp-mode`](https://github.com/emacs-lsp/lsp-mode) | See repository README | + +## Use of Language Server Protocol (LSP) + +All of the official IDE extensions use the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/) to communicate with Semgrep. This allows the team to focus on one codebase that can be shared across most modern editor platforms. + +## `pre-commit` + +Prevent secrets or security issues from entering your Git source control history by running Semgrep as a [pre-commit](https://pre-commit.com/) hook. See [`pre-commit` documentation](/extensions/pre-commit) for details. + +## Semgrep as an engine + +Many other tools have capabilities powered by Semgrep. Add yours [with a pull request](https://github.com/semgrep/semgrep-docs)!
+ + + + + + + + + + + \ No newline at end of file diff --git a/mintlify-docs/extensions/pre-commit.md.template.mdx b/mintlify-docs/extensions/pre-commit.md.template.mdx new file mode 100644 index 0000000000..4168afa9cf --- /dev/null +++ b/mintlify-docs/extensions/pre-commit.md.template.mdx @@ -0,0 +1,59 @@ +--- +title: "Run scans on pre-commit" +--- + + +The [pre-commit framework](https://pre-commit.com/) can run `semgrep` when you commit changes. This is helpful in preventing secrets and security issues from leaking into your Git history. + +## Prerequisites + +[The `pre-commit` framework](https://pre-commit.com). + +## `pre-commit` with Semgrep Community Edition (no login) + +Use these instructions to run `pre-commit` without logging in. You can still use custom rules or rules from the Semgrep Registry. + +Add the following to your `.pre-commit-config.yaml` file: + +```yaml +repos: +- repo: https://github.com/semgrep/pre-commit + rev: 'SEMGREP_VERSION_LATEST' + hooks: + - id: semgrep + entry: semgrep + # Replace with your custom rule source + # or see https://semgrep.dev/explore to select a ruleset and copy its URL + args: ['--config', '', '--error', '--skip-unknown-extensions'] +``` + +## `pre-commit` with your Semgrep AppSec Platform configuration + +You can also run custom rules and rulesets from Semgrep AppSec Platform, similar to running `semgrep ci`. + +Ensure that you are logged in: + + + + Log in to your Semgrep account. Running this command launches a browser window, but you can also use the link that's returned in the CLI to proceed: + + ```bash + semgrep login + ``` + + + In the **Semgrep CLI login**, click **Activate** to proceed.
+ + + +Add the following to your `.pre-commit-config.yaml` file: + +```yaml +repos: +- repo: https://github.com/semgrep/pre-commit + rev: 'SEMGREP_VERSION_LATEST' + hooks: + - id: semgrep-ci +``` + +For guidance on customizing Semgrep's behavior in pre-commit, see [Customize Semgrep in pre-commit](/kb/integrations/customize-semgrep-precommit). diff --git a/mintlify-docs/extensions/pre-commit.mdx b/mintlify-docs/extensions/pre-commit.mdx new file mode 100644 index 0000000000..093f9e5b9d --- /dev/null +++ b/mintlify-docs/extensions/pre-commit.mdx @@ -0,0 +1,58 @@ +--- +title: "Run scans on pre-commit" +--- + +The [pre-commit framework](https://pre-commit.com/) can run `semgrep` when you commit changes. This is helpful in preventing secrets and security issues from leaking into your Git history. + +## Prerequisites + +[The `pre-commit` framework](https://pre-commit.com). + +## `pre-commit` with Semgrep Community Edition (no login) + +Use these instructions to run `pre-commit` without logging in. You can still use custom rules or rules from the Semgrep Registry. + +Add the following to your `.pre-commit-config.yaml` file: + +```yaml +repos: +- repo: https://github.com/semgrep/pre-commit + rev: 'SEMGREP_VERSION_LATEST' + hooks: + - id: semgrep + entry: semgrep + # Replace with your custom rule source + # or see https://semgrep.dev/explore to select a ruleset and copy its URL + args: ['--config', '', '--error', '--skip-unknown-extensions'] +``` + +## `pre-commit` with your Semgrep AppSec Platform configuration + +You can also run custom rules and rulesets from Semgrep AppSec Platform, similar to running `semgrep ci`. + +Ensure that you are logged in: + + + + Log in to your Semgrep account. Running this command launches a browser window, but you can also use the link that's returned in the CLI to proceed: + + ```bash + semgrep login + ``` + + + In the **Semgrep CLI login**, click **Activate** to proceed. 
+ + + +Add the following to your `.pre-commit-config.yaml` file: + +```yaml +repos: +- repo: https://github.com/semgrep/pre-commit + rev: 'SEMGREP_VERSION_LATEST' + hooks: + - id: semgrep-ci +``` + +For guidance on customizing Semgrep's behavior in pre-commit, see [Customize Semgrep in pre-commit](/kb/integrations/customize-semgrep-precommit). diff --git a/mintlify-docs/extensions/semgrep-intellij.mdx b/mintlify-docs/extensions/semgrep-intellij.mdx new file mode 100644 index 0000000000..89b399c6a5 --- /dev/null +++ b/mintlify-docs/extensions/semgrep-intellij.mdx @@ -0,0 +1,124 @@ +--- +title: "Semgrep IntelliJ extension" +sidebarTitle: "IntelliJ extension" +--- + +[Semgrep](https://semgrep.dev/) swiftly scans code and package dependencies for known issues, software vulnerabilities, and detected secrets. Run Semgrep in your developer environment with the IntelliJ extension to catch code issues as you type. By default, the Semgrep IntelliJ extension scans code whenever you change or open files. + + +**INFO** + +Semgrep's IntelliJ extension for Windows users is currently in beta. + + +## Prerequisites + +The Semgrep IntelliJ extension communicates with Semgrep command-line interface (CLI) to run scans. Install Semgrep CLI before you can use the extension. To install Semgrep CLI: + +```bash +# For macOS +$ brew install semgrep + +# For Ubuntu/Windows/Linux/macOS, using pipx (https://pipx.pypa.io/stable/how-to/install-pipx/) +$ pipx install semgrep + +# Or, using uv (https://docs.astral.sh/uv/) +$ uv tool install semgrep +``` + +## Quickstart + + + + Install the Semgrep extension: + - Visit [ Semgrep's page on the JetBrains Marketplace](https://plugins.jetbrains.com/plugin/22622-semgrep). + - In IntelliJ: **Settings/Preferences > Plugins > Marketplace > Search for `semgrep-intellij` > Install**. You may need to restart IntelliJ for the Semgrep extension to be installed. 
+ + + Sign in: Press Ctrl+⇧Shift+A (Windows) or ⌘Command+⇧Shift+A (macOS) and sign in to Semgrep AppSec Platform by selecting the following command: + + ```bash + Sign in with Semgrep + ``` + + + Test the extension by pressing Ctrl+⇧Shift+A (Windows) or ⌘Command+⇧Shift+A (macOS) and running the following command: + + ```bash + Scan workspace with Semgrep + ``` + + + See Semgrep findings: Hold the pointer over the code that has the red underline. + + + + +**FEATURE MATURITY** + +The Semgrep IntelliJ extension is currently in beta and supports only Semgrep Community Edition (CE); it doesn't support Semgrep Supply Chain, Secrets, Pro rules, or Pro Engine. Please join the [Semgrep community Slack workspace](https://go.semgrep.dev/slack) and let the Semgrep team know if you encounter any issues. + + +## Supported JetBrains products + +Semgrep's IDE extension is available in many JetBrains products: + +- AppCode +- Aqua +- CLion +- DataSpell +- DataGrip +- GoLand +- IntelliJ IDEA Ultimate +- PhpStorm +- PyCharm Professional +- Rider +- RubyMine +- RustRover +- WebStorm + + +**INTELLIJ EXTENSION DOES NOT SUPPORT:** + +- IntelliJ IDEA Community Edition. + +Semgrep does not offer an IDE integration with IntelliJ IDEA Community Edition because [this version lacks support for the Language Server Protocol (LSP)](https://plugins.jetbrains.com/intellij/language-server-protocol.html#supported-ides), which is essential for enabling Semgrep’s code scanning features. IntelliJ IDEA Ultimate, which includes LSP support, is required to use Semgrep's IDE integration. + + + +## Commands + +Run Semgrep extension commands through the IntelliJ Command Palette. You can access the Command Palette by pressing Ctrl+⇧Shift+A (Windows) or ⌘Command+⇧Shift+A (macOS) on your keyboard. + +- `Sign in with Semgrep`: Sign up or log in to the Semgrep AppSec Platform (this command opens a new window in your browser).
Alternatively, you can log in through your command-line interface by running `semgrep login`. +- `Sign out of Semgrep`: Log out of Semgrep AppSec Platform. If you are logged out, you lose access to Semgrep Supply Chain and Semgrep Secrets. Alternatively, you can sign out through your command-line interface by running `semgrep logout`. +- `Scan workspace with Semgrep`: Scan files that have been changed since the last commit in your current workspace. +- `Scan workspace with Semgrep (Including Unmodified Files)`: Scan all files in the current workspace. + + +**TIP** + +You can also click the Semgrep icon in the IntelliJ toolbar to quickly access all available commands. + + +## Features + +### Automatic scanning + +When you open a file, Semgrep scans it right away. + +### Rule Quick Links + +Hover over a match and click the link. + +## Support + +If you need our support, join the [Semgrep community Slack workspace](https://go.semgrep.dev/slack) and tell us about any problems you encountered. + +## Limitations + +Semgrep's VS Code extension supports the use of Pro rules and cross-file analysis. Other IDE scans use Semgrep Community Edition (CE) for its speed, and these scans are limited to single-file analysis. As a result, you may encounter a higher rate of false positives. + +## License + +The Semgrep IntelliJ extension is licensed under the LGPL 2.1 license. diff --git a/mintlify-docs/extensions/semgrep-vs-code.mdx b/mintlify-docs/extensions/semgrep-vs-code.mdx new file mode 100644 index 0000000000..5722172898 --- /dev/null +++ b/mintlify-docs/extensions/semgrep-vs-code.mdx @@ -0,0 +1,139 @@ +--- +title: "Semgrep Visual Studio Code extension" +sidebarTitle: "Visual Studio Code extension" +--- + + +[Semgrep's Visual Studio Code (VS Code) Extension](https://marketplace.visualstudio.com/items?itemName=Semgrep.semgrep) allows you to scan lines when you open and change files in your workspace. 
It offers: + +- Automatic scans whenever you open a file +- Inline results and problem highlighting, as well as quick links to the definitions of the rules underlying the findings +- Rule-defined fix, which allows you to apply Semgrep's suggested resolution for the findings + +## Prerequisites + +- See [Supported Languages](/supported-languages) to verify that the extension supports your project. +- Windows users must use Semgrep VS Code extension v1.6.2 or later. + +## Quickstart + + + + [Install the Semgrep extension](https://code.visualstudio.com/editor/extension-marketplace#_install-an-extension). If you're unfamiliar with installing VS Code extensions, see the Extension Marketplace's article [Install an Extension](https://code.visualstudio.com/editor/extension-marketplace#_install-an-extension). + + + Use Ctrl+⇧Shift+P or ⌘Command+⇧Shift+P (macOS) to launch the Command Palette, and run the following to sign in to Semgrep AppSec Platform: + + ```bash + Semgrep: Sign in + ``` + + You can use the extension without signing in, but doing so enables better results since you benefit from [Semgrep Code](/semgrep-code/overview) and its [Pro rules](/semgrep-code/pro-rules). + + + Launch the Command Palette using Ctrl+⇧Shift+P or ⌘Command+⇧Shift+P (macOS), and scan your files by running: + + ```bash + Semgrep: Scan all files in workspace + ``` + + + To see detailed vulnerability information, hover over the code underlined in yellow. You can also see the findings identified by Semgrep using ⇧Shift+Ctrl+M or ⌘Command+⇧Shift+M (macOS) and opening the **Problems** tab. + + + +## Commands + +Run Semgrep extension commands through the [Visual Studio Code Command Palette](https://code.visualstudio.com/getstarted/userinterface#_command-palette). You can access the Command Palette using Ctrl+⇧Shift+P or ⌘Command+⇧Shift+P (macOS). 
The following list includes all available Semgrep extension commands: + +- `Semgrep: Scan all files in a workspace`: Scan all files in the current workspace. +- `Semgrep Search: Clear`: Clear pattern searches from the Primary Side Bar's Semgrep Search view. +- `Semgrep Search: Focus on Search Results View`: Bring the Primary Side Bar's Semgrep Search view into focus. +- `Semgrep Restart Language Server`: Restart the language server. +- `Semgrep: Scan changed files in a workspace`: Scan files that have been changed since the last commit in your current workspace. +- `Semgrep: Search by pattern`: Search for patterns in code using Semgrep pattern syntax. For more information, see [Pattern syntax](/writing-rules/pattern-syntax) documentation. +- `Semgrep: Show Generic AST`: Show the generic AST in a new window. +- `Semgrep: Show named Generic AST`: Show the named AST in a new window. +- `Semgrep: Sign in`: Sign up or log in to the Semgrep AppSec Platform (this command opens a new window in your browser). When you sign in, you can automatically scan with Semgrep [Pro rules](/semgrep-code/pro-rules) and add additional rules to the [Policies](https://semgrep.dev/orgs/-/policies) in Semgrep Code. Alternatively, you can log in through your command-line interface by running `semgrep login`; if you are already logged in that way, you are also signed in with the Visual Studio Code Semgrep extension. +- `Semgrep: Sign out`: Log out from Semgrep AppSec Platform. Alternatively, you can sign out through your command-line interface by running `semgrep logout`. +- `Semgrep: Update rules`: For logged-in users. If the rules in the [Policies](https://semgrep.dev/orgs/-/policies) or rules included through the **Semgrep › Scan: Configuration** configuration option have been changed, this command loads the updated rule configuration for your next scan.
+ + +**TIP** + +You can click the Semgrep icon in Visual Studio Code to quickly access all available commands. + + +## Additional extension features + +Use auto-fix to apply code change suggestions from Semgrep to remediate the security issue. + + +