
Ignore 98% of dependency alerts: introducing Semgrep Supply Chain


I've always thought that dependabot was busy-work, a waste of time. This article makes a good point that drives it home: alarms that aren't real make all alarms useless. Dependabot is especially painful in dynamically typed languages (Python, Ruby, and especially Javascript), where "upgrading" a library can break things in ways there's no way to know about until production.

Maybe the constant work, extra build time (and cash for all that), and risk of breaking production, is worth it for the 0.01% of the time there's a real vulnerability? It seems like a high price to pay though. When there are major software vulnerabilities (like log4j), the whole industry usually swarms around it, and the alarm has high value.

I just realized how much CircleCI probably loves Dependabot. I wonder how big a hit their margins would take if we moved off it collectively as an industry.


> When there are major software vulnerabilities (like log4j), the whole industry usually swarms around it, and the alarm has high value.

You're leaving me with the impression that you think we should only patch major software vulnerabilities. This I would disagree with. Minor vulnerabilities can be used, especially in groups, to do things we don't anticipate. It's not just about a single vulnerability but about how an attacker can leverage multiple different vulnerabilities together.




If you use vendoring, it's also worth considering that there's always some inherent security risk in upgrading dependencies. If an attacker takes control of a package somewhere in your dependency tree, you don't get compromised until you actually install a new version of that package. This risk can often outweigh the risk of very minor/dev-facing CVEs.


Shameless plug: This is what I’m building to solve.

Socket watches for changes to “package manifest” files such as package.json, package-lock.json, and yarn.lock. Whenever a new dependency is added in a pull request, Socket analyzes the package's behavior and leaves a comment if it is a security risk.
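A minimal sketch of the "new dependency added in a pull request" check, assuming both versions of package.json are available as strings. The helper name `added_dependencies` and the sample manifests are made up for illustration; this is not Socket's implementation.

```python
import json


def added_dependencies(old_manifest: str, new_manifest: str) -> dict[str, str]:
    """Return dependencies present in the new package.json but not the old one."""
    sections = ("dependencies", "devDependencies")
    old, new = json.loads(old_manifest), json.loads(new_manifest)
    # Collect every dependency name known before the change.
    old_names = {name for s in sections for name in old.get(s, {})}
    return {
        name: spec
        for s in sections
        for name, spec in new.get(s, {}).items()
        if name not in old_names
    }


before = '{"dependencies": {"lodash": "^4.17.21"}}'
after = '{"dependencies": {"lodash": "^4.17.21", "left-pad": "^1.3.0"}}'
print(added_dependencies(before, after))
```

Only the names returned here would need a closer look in review; version bumps of existing dependencies are deliberately ignored in this sketch.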

You can see some real-world examples here:


We use Socket and my favorite feature is when you highlight new dependencies with a post-install hook. It’s not always a problem, but almost always a smell.

One feature request: please allow me to “suppress” warnings for a specific package+version combo. This is useful for activist libs that take a political stance - I know it happens, but often cannot remove them, and don’t want to continue flagging the same problem at every sec review.


I kind of feel like dependabot alerts should be treated like a coding convention error - that extra whitespace isn't actually causing a problem but we fix it right away.

Otherwise you have to start analyzing the alerts, and good luck with that. The low severity ones are marked critical and the scary ones are marked low. Suddenly you have 200 unfixed alerts and it's impossible to know if somewhere in that haystack is an important one.


Impossible? The article we're commenting on describes a tool that does this.


> can break things that there's no way to know until production

I would argue that any production system should have enough tests that upgrading a dependency that breaks compatibility should cause failure of the test suite in some way


You can argue that but I've seen plenty of production systems relying on dynamically typed languages where breaking third-party dependency changes were indeed undetectable until a new version was deployed and particular features stopped working at runtime. It's certainly not unheard of for the same to occur with statically typed languages but at least there's a whole class of potential errors that are caught automatically by compilers/linkers with no extra work on behalf of the programmers. And even when errors do occur at runtime due to JIT compilation etc. they're more likely to have helpful information attached to them. Either way, all the automated tests in the world aren't likely to catch a problem that's related to incorrect library versions being loaded from some global cache that you have limited control over (having moved back from SaaS to on-prem based software recently I've been very quickly reminded of that reality!).


This is a very good argument for why Ruby should never be used for production systems, and neither should Javascript without using Typescript.


Why specifically Ruby here? How about Python, PHP, etc.? Also, many dynamic languages have static type checkers if you want to use them (e.g. Ruby has Sorbet).


IMO Dependabot is really dreadful at its job. Try Renovate - it's really brilliant, fast, flexible, supports properly binding PRs/MRs.


> where "upgrading" a library can break things that there's no way to know until production.

unless you have automated testing


This is a similar mechanism to govulncheck, which has been quite nice to use in practice. Because it only cares about vulnerable code that is actually possible to call, it's quiet enough to use as a presubmit check without annoying people. Nice to see this for other languages.
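The "only cares about code that is actually possible to call" idea can be sketched as a graph search, assuming a call graph has already been extracted from the program. The graph contents and the `is_reachable` helper are hypothetical; the real govulncheck analysis is considerably more involved.

```python
from collections import deque


def is_reachable(call_graph: dict[str, set[str]], entry: str, target: str) -> bool:
    """Breadth-first search from entry; True if target can be called transitively."""
    seen, queue = {entry}, deque([entry])
    while queue:
        fn = queue.popleft()
        if fn == target:
            return True
        for callee in call_graph.get(fn, set()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


# Hypothetical call graph: the XML code ships with the dependency but is never called.
call_graph = {
    "main": {"handle_request"},
    "handle_request": {"jsonlib.parse"},
    "jsonlib.parse": set(),
    "xmllib.parse": {"xmllib.expand_entities"},
}
print(is_reachable(call_graph, "main", "jsonlib.parse"))
print(is_reachable(call_graph, "main", "xmllib.expand_entities"))
```

An alert on `xmllib.expand_entities` would be suppressed here, which is only as trustworthy as the call graph itself.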


How does it deal with vulnerability alerts which don't say anything about what code is affected?


Both Semgrep Supply Chain and govulncheck (AFAIK) are doing this work manually, for now. It would indeed be nice if the vulnerability reporting process had a way to provide metadata, but there's no real consensus on what format that data would take. We take advantage of the fact that Semgrep makes it much easier than other commercial tools (or even most linters) to write a rule quickly.

The good news is there's a natural power-law distribution: most alerts come from a few vulnerabilities in the most popular (and often large) libraries, so you get significant lift just by writing rules starting with those libraries.


(Disclaimer: I work at Phylum, which has a very similar capability)

Not all of it has to be manual. Some vulnerabilities come with enough information to deduce vulnerability reachability with a high degree of confidence with some slightly clever automation.

Not all vulns come with this information, but as time goes on the percentage that do is increasing. I'm very optimistic that automation + a bit of human curation can drastically improve the S/N for open source library vulns.

A nice property of this is: you only have to solve it once per vuln. If you look at the total set of vulns (and temporarily ignore super old C stuff) it's not insurmountable at all.


> Both Semgrep Supply Chain and govulncheck (AFAIK) are doing this work manually, for now.

Ya I get that, but surely you don't have 100% coverage. What does your code do for the advisories which you don't have coverage for? Alert? Ignore?


From "A vulnerability database is populated with reports using information from the data pipeline. All reports in the database are reviewed and curated by the Go Security team."

I would imagine that's what Semgrep is doing as well. You're paying for the analysis; the code is the easy part.


Really nice idea to only show warnings if they are relevant. It's indeed annoying if you need to upgrade lodash only to make your audit tool not show critical warnings because of some function that is not used at all.

This is not open source, though? It does make a big difference for some whether you're able to run the check offline or you're forced to upload your code to some service.

One feature I'd love in such a tool would be the ability to get the relevant parts of the changelog of the package that needs to be upgraded. It's not responsible to just run the upgrade command without checking the changelog for breaking or relevant changes. That's exactly why upgrades tend to be done very late: there is a real risk of breaking something even if it's just a minor version.


All the engine functionality is FOSS, but the rules are currently private (may change in the future).

As with all other Semgrep scanning, the analysis is done locally and offline -- which is a major contrast to most other vendors. See #12 on our development philosophy for more details:

Relevant part of the changelog is a good idea--others have also come out with statistical approaches based on upgrades others made (eg dependabot has a compatibility score which is based on "when we made PRs for this on other repos, what % of the time did tests pass vs fail")
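As a rough illustration of that kind of statistic, the score is just the share of upgrade PRs whose tests passed. The function name and the sample data below are assumptions, not Dependabot's actual implementation.

```python
from typing import Optional


def compatibility_score(ci_passed: list[bool]) -> Optional[float]:
    """Percentage of upgrade PRs (one bool per repo) whose test suite passed."""
    if not ci_passed:
        return None  # no data, so no score rather than a misleading 0 or 100
    return 100 * sum(ci_passed) / len(ci_passed)


# Hypothetical results from four repos that received the same upgrade PR.
print(compatibility_score([True, True, False, True]))
```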


Ah okay, thanks for the information.


Here is some code on GitHub that does call site checking using Semgrep:

(Note: I helped write that. We're building a similar service to the r2c one.)

You're right that patching is hard because of opaque package diffs. I've seen some tools coming out like which show a diff between versions.

But, that said, this is still a hard problem to solve and it's happened before that malware[0][1] has been silently shipped because of how opaque packages are.




Thanks for mentioning :)

Looking at package diffs is super important because of the rise of "protestware". For example, a maintainer of the event-source-polyfill package recently added code which redirects website visitors located in Eastern European timezones to a petition page. This means that real users are being navigated to this random URL in production.

See the attack code here:

It’s very unlikely that users of event-source-polyfill are aware that this hidden behavior has been added to the package. And yet, the package remains available on npm many months after it was initially published. We think that supply chain security tools like Socket have an important role to play in warning npm users when unwanted ‘gray area’ code is added to packages they use.


Thanks for the pointers!


There are definitely other approaches that don't require code to be uploaded anywhere. For example, we work with your package managers to understand what dependencies your program has, and then analyze that metadata on the back end. Net result is still to be able to see what vulnerabilities are truly exploitable and which are not.


How does this tool go from a vuln. in a library to a set of affected functions/control paths? My understanding was that the CVE format is unstructured, which makes an analysis like this difficult.


We added support to the Semgrep engine for combining package metadata restrictions (from the CVE format) with code search patterns that indicate you're using the vulnerable library (we're writing those mostly manually, but Semgrep makes it pretty easy):

    - id: vulnerable-awscli-apr-2017
      pattern-either:
      - pattern: boto3.resource('s3', ...)
      - pattern: boto3.client('s3', ...)
      r2c-internal-project-depends-on:
        namespace: pypi
        package: awscli
        version: "<= 1.11.82"
      message: this version of awscli is subject to a directory traversal vulnerability in the s3 module
This is still experimental and internal, but eventually we'd like to promote it and also maybe open up our CVE rules more as well!
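The logic of such a rule (report only when the installed version is in the vulnerable range and the code uses the affected API) can be sketched in a few lines. This is an illustration under loud assumptions: regular expressions stand in for Semgrep's pattern matching, versions are assumed to be plain dotted integers, and `is_flagged` is a made-up helper.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.11.82' into (1, 11, 82); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


# Crude stand-ins for the two code patterns in the rule above.
USAGE_PATTERNS = [
    re.compile(r"boto3\.resource\(\s*['\"]s3['\"]"),
    re.compile(r"boto3\.client\(\s*['\"]s3['\"]"),
]
MAX_VULNERABLE = parse_version("1.11.82")


def is_flagged(installed_awscli: str, source: str) -> bool:
    """Flag only if the version constraint AND a usage pattern both match."""
    version_vulnerable = parse_version(installed_awscli) <= MAX_VULNERABLE
    uses_s3 = any(p.search(source) for p in USAGE_PATTERNS)
    return version_vulnerable and uses_s3


s3_code = "import boto3\ns3 = boto3.client('s3', region_name='us-east-1')\n"
ec2_code = "import boto3\nec2 = boto3.client('ec2')\n"
print(is_flagged("1.11.80", s3_code))   # old version, uses s3
print(is_flagged("1.16.0", s3_code))    # patched version
print(is_flagged("1.11.80", ec2_code))  # old version, s3 module never used
```

Only the first case would produce an alert; a version-only scanner would alert on the third as well.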


Here is a good writeup of some of the pros and cons of using a "reachability" approach.

>Unfortunately, no technology currently exists that can tell you whether a method is definitively not called, and even if it is not called currently, it’s just one code change away from being called. This means that reachability should never be used as an excuse to completely ignore a vulnerability, but rather reachability of a vulnerability should be just one component of a more holistic approach to assessing risk that also takes into account the application context and severity of the vulnerability.


Err, "no technology currently exists" is wrong; it should be "no technology can possibly exist" to say whether something is definitively called.

It's an undecidable problem in any of the top programming languages, and some of the sub problems (like aliasing) themselves are similarly statically undecidable in any meaningful programming language.

You can choose between over-approximation or under-approximation.
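A toy illustration of that choice, using Python's `ast` module on a made-up snippet where the callee is picked at runtime. The module name `lib`, the function name `vulnerable`, and both helper functions are hypothetical; real analyzers are far more precise than this.

```python
import ast

# The callee depends on a runtime string, so no static scan can be certain.
SOURCE = '''
import lib

def run(name):
    return getattr(lib, name)()
'''


def has_direct_call(tree: ast.AST, attr: str) -> bool:
    """True if some call site syntactically looks like <something>.attr(...)."""
    return any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == attr
        for node in ast.walk(tree)
    )


def has_dynamic_lookup(tree: ast.AST) -> bool:
    """True if the code calls getattr(...), which could resolve to anything."""
    return any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "getattr"
        for node in ast.walk(tree)
    )


tree = ast.parse(SOURCE)
# Under-approximation: report only calls we can see (may miss real ones).
under = has_direct_call(tree, "vulnerable")
# Over-approximation: any dynamic lookup might reach it (may raise false alarms).
over = under or has_dynamic_lookup(tree)
print(under, over)
```

The under-approximating scan stays quiet on this snippet while the over-approximating one raises an alert, and neither can be called definitively right.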


I saw that Java support was still in beta. But it makes me wonder if it's going to come with a "don't use reflection" disclaimer, then...?


My question too. All I see is this citation:

> [1] We’ll be sharing more details about this work later in October. Stay tuned!


Joke's on you, I already ignore 100% of them /s

I like the promise, but how can I completely trust that the ignored part is not actually reachable? All the languages (except a few) do some magic that might not be detected. At my previous job we were bombarded with dependency upgrades; I can still feel the pain in my bones.


> "Have you ever gotten a "critical vulnerability in dependency" alert and when you look at it, it's something like “XML parser vulnerability in some buried transitive library that you actually use for parsing JSON and therefore aren’t affected by at all?"

Stop right there pal.

This amateurish risk assessment is part of the problem. How do you know that, say, an XML file cannot be smuggled disguised as a JSON into your app?


Or all the regex DoS vulnerabilities in the dev part of my package json. The worst that can happen is my build process hanging forever if someone commits special code to trigger the issue…


I know people won't like this solution but my solution is to use as few dependencies as possible. When I look for a new library, I check its dependencies, the fewer the better, 0 is best. I look through the dependencies as well, even looking into the source of the library and the source of the dependencies. What are they doing, do they have a reason to exist, was the dev being prudent or lazy. Were they coupling things that shouldn't have been coupled.

I'll also evaluate if I really need a library. Maybe the thing I need I can do myself in 3-30 lines of code and believe I'm unlikely to run into edge cases for my use case. If so I'll write the code rather than deal with another dependency.


I see this kind of take very often and I always wonder what people take in as dependencies that could be replaced by a few lines of code. Literally none of my direct dependencies could be replaced without thousands of lines of code and weeks or even months of testing. And I still have dozens of dependencies.


Because we are told to move fast and import things


Except I've learned the hard way it's not fast. Every week or so I lose an entire day to trying to update dependencies and then fixing what broke. So instead of going fast I'm stuck doing maintenance.

Even on my own projects there's been so many times when I had some free time I thought I'd try to make some progress on a personal project I haven't touched in 3-4 months but all that happens is 2-4hrs wasted on old dependencies.


Sounds nice. I've never worked with a tool like this that doesn't turn up a ridiculous number of false positives.


How the hell do you end up with 1644 vulnerable packages anyways?

* rhetorical question, JS...

It was actually one of the main drivers for me to start using Go instead of JavaScript for server-side applications and CLIs about 8 years ago.


Roughly: NPM, GitHub, and others funded open bug bounties for all popular NPM packages. These bug bounties led to a rash of security "vulnerabilities" being reported against open source projects, to satisfy the terms of the bounty conditions. Public bug bounty "intermediary" companies are a major culprit here: they have an incentive to push maintainers to accept even trivial "vulnerabilities", since their success is tied to "number of vulnerabilities reported" and "amount of bounties paid out". This leads to classes of vulnerabilities like ReDoS or prototype pollution that would never have been noticed or worth any money otherwise.


The problem really comes down to data quality in disclosing vulnerabilities.

With higher quality data, better CVSS scores can be calculated. With higher quality data, affected code paths can be better disclosed. With higher quality data, unknown vulnerabilities may be found in parallel to the known ones.

I don’t think any tool or automation can solve the problem of high quality data. Humans have to discern to provide it. No amount of code analysis can solve that. But it sure can help.


You're right. Nobody bothers to make scanners because there's no data, and nobody has come up with a good format to convey the data between producers (like NVD) and consumers (like dependabot).

I wrote a blog post talking about some of this stuff:

It truly is a chicken and egg problem. There are next to no automated scanners that make use of data like that, semgrep is the furthest along and my company is close behind them at taking a stab at it as far as I can tell. Heck there are hardly any that do anything with the existing "Environmental" part of the CVSS, and that has been pretty well populated by NVD, I believe.

The existing interchange formats for vulnerability data, such as OSV, are underdesigned to the point that it feels like GitHub Copilot designed them. It's real work to even get to the point that you can consume them, given all the weird choices in there. Sorry if I'm salty.

There is an attempt to create a standard for situational vulnerability exposure called "VEX" (Vulnerability Exploitability eXchange), but it's almost entirely focused on conveying information about what vulnerabilities have been manually eliminated, so that software "vendors" can satisfy their customers, especially in government contracts. It's not modeling the full picture of what can happen in a dependency tree and all the useful false-positive information in there.


Yeah agreed. When I see these problem statements, I see us addressing problems that are by-products of vulnerability fatigue.

I.e., “be lazy and ignore those vulnerabilities by using our tools!”

It hardly solves the true issue: an industry-wide lack of useful information, or even of transparency about that information, from responsible parties. I believe this laziness is what got us here in the first place.


Possibly this isn't the wonder tool for busting through our current semiconductor supply chain problems that I hoped it was ...


This looks really cool. However, for regulated industries, auditors will never accept "We're not vulnerable to CVE-1234 in Blah-blah-blah Library because our code doesn't use the vulnerable functions." All auditors are concerned with is version numbers.


> All auditors are concerned with is version numbers.

That's because they're all* automatons with hard-coded brains that compel them to reduce fuzzy concepts like risk to a 1 or a 0. /vent

*Well, every one I've ever interacted with, at least.