Semgrep

What is Semgrep?
Semgrep is an extensible application security platform that scans source code to find and help remediate security issues using AI-assisted static analysis, software composition analysis (SCA), and secrets detection.

Profile

Semgrep is a fast, lightweight static analysis tool that helps developers and security teams detect bugs and vulnerabilities and enforce code standards across 30+ programming languages. It originated at Facebook in 2009 as Sgrep and was relaunched as Semgrep in 2020 by r2c (now Semgrep, Inc.). The tool combines semantic pattern matching with dataflow analysis to produce high-confidence security findings. The core engine remains open source under LGPL 2.1, while commercial offerings extend its capabilities through the Semgrep AppSec Platform. Recognized in Gartner's Magic Quadrant for Application Security Testing, Semgrep serves organizations from startups to Fortune 500 enterprises, processing millions of scans across diverse development environments.

Focus

Semgrep addresses the tension between comprehensive security analysis and developer experience by matching code patterns on the structure of the code rather than its exact text. Traditional static analysis tools often generate excessive false positives and require specialized expertise. Semgrep instead lets developers write security rules in YAML that resemble the code they already write, without requiring knowledge of abstract syntax trees or complex regular expressions. The tool serves application security engineers enforcing organizational standards, individual developers catching vulnerabilities before they commit, security consultants scanning client codebases, and DevSecOps practitioners automating security programs. By reducing false positives through semantic analysis and running inside existing development workflows, Semgrep turns security scanning from a compliance burden into actionable developer feedback.

Background

Semgrep evolved from Sgrep, an open-source semantic grep tool developed at Facebook in 2009 as part of the pfff program analysis library and inspired by Coccinelle, a tool for analyzing C programs. After original author Yoann Padioleau joined r2c in 2019, the tool was repurposed and, in April 2020, renamed Semgrep. Founded by Isaac Evans, Drew Dennison, and Luke O'Malley, Semgrep, Inc. (formerly r2c) has raised substantial venture funding, including a Series D round led by Menlo Ventures. The tool is actively maintained with weekly releases and serves organizations including Figma, Dropbox, Slack, Snowflake, and Lyft. The core engine remains open source under LGPL 2.1, while proprietary rules and commercial platform features support an open-core business model.

Main features

Semantic pattern matching with metavariables

Semgrep's pattern matching engine works on the parsed structure of code rather than its raw text, so a rule matches functionally equivalent code regardless of formatting variations. Metavariables (written like $X) act like capture groups but with full code awareness, allowing patterns to match when specific values are unknown. The ellipsis operator (...) matches any number of arguments, fields, or statements, so one general pattern can catch many variations without overly specific rules. Rules are written in YAML and resemble the target code, eliminating the need for abstract syntax tree manipulation or complex regular expressions. This approach makes security rule writing accessible to developers without specialized program analysis expertise while maintaining the precision necessary for production security enforcement.
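The idea behind metavariables and the ellipsis operator can be sketched with a toy matcher built on Python's standard ast module. This is an illustration only, not Semgrep's engine or rule syntax: the helper names (parse_pattern, match, find_matches) and the internal _MV_ prefix are assumptions made for the example, and it needs Python 3.9 or later for ast.unparse.

```python
import ast

PREFIX = "_MV_"  # "$X" is not valid Python, so the toy rewrites it to "_MV_X"

def parse_pattern(pattern):
    """Parse a pattern expression, turning $X metavariables into plain names."""
    return ast.parse(pattern.replace("$", PREFIX), mode="eval").body

def is_metavar(node):
    return isinstance(node, ast.Name) and node.id.startswith(PREFIX)

def is_ellipsis(node):
    return isinstance(node, ast.Constant) and node.value is Ellipsis

def match(pat, tgt, env):
    """Structurally compare a pattern node with a target node, binding metavariables."""
    if is_metavar(pat):
        if not isinstance(tgt, ast.AST):
            return False
        bound = env.setdefault(pat.id, tgt)
        # A metavariable used twice must bind to structurally identical code.
        return ast.dump(bound) == ast.dump(tgt)
    if isinstance(pat, list):
        return isinstance(tgt, list) and match_list(pat, tgt, env)
    if isinstance(pat, ast.AST):
        if type(pat) is not type(tgt):
            return False
        return all(match(getattr(pat, f, None), getattr(tgt, f, None), env)
                   for f in pat._fields if f != "ctx")
    return pat == tgt  # plain values such as identifiers and literals

def match_list(pats, tgts, env):
    """Match a list of nodes; a literal ... in the pattern absorbs any run of items."""
    if not pats:
        return not tgts
    head, rest = pats[0], pats[1:]
    if is_ellipsis(head):
        for i in range(len(tgts) + 1):
            trial = dict(env)
            if match_list(rest, tgts[i:], trial):
                env.clear()
                env.update(trial)
                return True
        return False
    return bool(tgts) and match(head, tgts[0], env) and match_list(rest, tgts[1:], env)

def find_matches(pattern, source):
    """Return one dict of metavariable bindings per matching node in source."""
    pat = parse_pattern(pattern)
    hits = []
    for node in ast.walk(ast.parse(source)):
        env = {}
        if match(pat, node, env):
            hits.append({k.replace(PREFIX, "$"): ast.unparse(v) for k, v in env.items()})
    return hits

code = "subprocess.run(\n    cmd,\n    shell = True,\n)\nsubprocess.run(['ls'])\n"
print(find_matches("subprocess.run($CMD, shell=True)", code))  # [{'$CMD': 'cmd'}]
```

Because the comparison happens on syntax trees, the oddly formatted first call still matches, while the second call (no shell=True) does not. A text-based regular expression would need to account for every whitespace and line-break variation explicitly.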

Cross-file dataflow and taint analysis

Semgrep's Pro Engine performs interfile and cross-function dataflow analysis, tracking how potentially malicious data flows through an application across file boundaries and function calls. Taint tracking follows untrusted data from sources, where external input enters the system, to sinks, where that data is used in dangerous operations such as SQL queries or command execution. This capability detects vulnerabilities that are invisible to single-file analysis, such as tainted data passing through multiple abstraction layers before reaching a sink. Because findings are reported only when a complete path from source to sink exists, these techniques reduce false positives while identifying complex attack vectors that simpler tools miss.
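The source-to-sink idea can be sketched in a few lines of Python. This toy handles only straight-line, single-file code, so it shows the taint concept rather than the interfile analysis described above; the SOURCES, SINKS, and SANITIZERS name sets and the function find_tainted_sinks are hypothetical choices for the example, not Semgrep configuration.

```python
import ast

SOURCES = {"input"}              # hypothetical: calls that return untrusted data
SINKS = {"execute", "system"}    # hypothetical: calls that are dangerous with tainted args
SANITIZERS = {"escape"}          # hypothetical: calls whose result is considered clean

def call_name(node):
    """Return the function or method name of a call node, else None."""
    if isinstance(node, ast.Call):
        func = node.func
        return func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
    return None

def is_tainted(expr, tainted):
    """An expression is tainted if it reads a tainted variable or calls a source."""
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    if isinstance(expr, ast.Call):
        name = call_name(expr)
        if name in SOURCES:
            return True
        if name in SANITIZERS:
            return False
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(expr))

def find_tainted_sinks(source):
    """Return line numbers where tainted data reaches a sink (top-level statements only)."""
    tainted, findings = set(), []
    for stmt in ast.parse(source).body:
        # Check sinks first, using the taint state before this statement runs.
        for node in ast.walk(stmt):
            if call_name(node) in SINKS and any(is_tainted(a, tainted) for a in node.args):
                findings.append(node.lineno)
        # Then propagate taint through simple assignments.
        if isinstance(stmt, ast.Assign):
            hot = is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    (tainted.add if hot else tainted.discard)(target.id)
    return findings

program = (
    "name = input()\n"
    "query = \"SELECT * FROM users WHERE name = '\" + name + \"'\"\n"
    "cursor.execute(query)\n"
    "safe = escape(name)\n"
    "cursor.execute('SELECT * FROM users WHERE name = ' + safe)\n"
)
print(find_tainted_sinks(program))  # [3]
```

Only the first execute call is reported: the second receives a value that passed through the sanitizer, which is how taint tracking avoids flagging every use of a dangerous function.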

Reachability analysis for dependency vulnerabilities

Semgrep Supply Chain determines whether vulnerable functions in dependencies are actually invoked by a project's code, answering whether a vulnerability poses real risk to a specific application. According to Semgrep, this reachability analysis reduces false positives by up to 98% compared to traditional dependency scanning tools that report all vulnerabilities regardless of usage. The capability is available for eleven languages including Java, JavaScript, TypeScript, Python, Go, and Ruby, with support for critical-severity CVEs. By eliminating findings for unreachable vulnerabilities, Semgrep turns dependency scanning from an overwhelming list of potential issues into an actionable inventory of genuine risks, enabling security teams to prioritize remediation effectively.
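A minimal sketch of the reachability question, assuming advisories are expressed as sets of dotted function names: an advisory counts as reachable only if the project calls one of its vulnerable functions. The advisory IDs, the package names (yamlish, imagelib), and the triage helper are all hypothetical, and real reachability analysis also has to resolve imports, aliases, and transitive calls, which this toy ignores. It needs Python 3.9 or later for ast.unparse.

```python
import ast

def called_functions(source):
    """Collect the dotted name of every call in the source, e.g. 'yamlish.load'."""
    return {
        ast.unparse(node.func)
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
    }

def triage(advisories, source):
    """Label each advisory by whether any of its vulnerable functions is called."""
    calls = called_functions(source)
    return {
        advisory_id: "reachable" if vulnerable & calls else "unreachable"
        for advisory_id, vulnerable in advisories.items()
    }

# Hypothetical advisories: ID -> vulnerable functions in a dependency.
advisories = {
    "ADV-1": {"yamlish.load"},
    "ADV-2": {"imagelib.decode"},
}

project = (
    "import yamlish, imagelib\n"
    "config = yamlish.load(open('config.yml'))\n"
    "imagelib.resize(image)\n"
)
print(triage(advisories, project))  # {'ADV-1': 'reachable', 'ADV-2': 'unreachable'}
```

A version-only dependency scanner would report both advisories because both packages are installed. Checking call sites lets the second one be deprioritized, since the project never calls the vulnerable function.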
