Today’s JavaScript trash fire and pile on
Four years ago I wrote a slightly inflammatory blog post on what still sucks in front-end dev. One of the points was on Bower and how JS library management sucks. Sadly, the tools have changed, but JS library management still sucks.
We’ve had a few NPM-related trash fires over the years. left-pad is still a meme, and today another one happened. It turns out the ecosystem is as fragile as ever. Let’s start by assuming everyone here is acting in good faith, apart from whoever wrote the malicious code (that’s not necessarily the case, but let’s do the whole innocent until proven guilty thing).
9th September 2018: GitHub user “right9ctrl” adds flatmap-stream as a dependency of the JavaScript package event-stream. They’re a new maintainer who took over maintaining the package from the original author, Dominic Tarr.
16th September: right9ctrl rewrites the code to remove the dependency on flatmap-stream and pushes out a new version as a semver-major version bump, which means existing users will not be upgraded to it automatically.
5th October: Someone using the NPM account of user “hugeglass” pushes flatmap-stream 0.1.1 to NPM. Appended to the end of the minimised code that was pushed to NPM was an obfuscated snippet that decrypted some further content — the “payload” that did something nefarious.
29th October: Through sheer luck, the malicious code that bootstraps the malware uses an API that’s been deprecated in Node 10. Jayden Seric notices the warning and raises an issue on one of the packages which brings in flatmap-stream as a transitive dependency.
20th November: Ayrton Sparling identifies that the bug on the Nodemon package is a result of the compromised flatmap-stream package, and raises an issue on event-stream, blaming right9ctrl for introducing the issue and Dominic Tarr for handing over maintenance to another user.
26th November: I’m made aware of the issue through my employer, and initial evaluation reveals the encrypted payload is decrypted based on the presence of a specific value of description in package.json. It looks like a highly targeted attack but no-one is sure what the actual payload is as we do not know what package is targeted. Through a bit of teamwork and communication via the issue, Jacob Burroughs successfully finds the key and decodes the payload.
It turns out the attack was highly targeted at a cryptocurrency wallet product, with the aim of stealing the encryption keys for cryptocurrency wallets. Along with the rest of the dev community, we’ve wiped our brows because we got away with it: we didn’t have malicious code running on our dev machines, our CI servers, or in prod.
This time.
Nothing’s stopping this happening again, and it’s terrifying. The Hacker News and GitHub threads on the issue quickly degenerated into a blame game, pulling out license clauses and debating what open source etiquette means. However, this is no one person’s fault; it’s a systemic failure that has led to this.
User right9ctrl has had to bear the blame for this, but despite the trial by Internet jury, there’s quite a lot of reasonable doubt from the evidence we’ve seen so far. And even if that user was acting maliciously, what does it say about the strength of our ecosystem that a single user can cause such chaos? In The Full Stack Developer (my upcoming book, nudge nudge wink wink) I discuss how trust is at the foundation of our industry (in how we interact with peers, and our users). But when it comes to security and integrity, we must not only trust, but trust and verify. And the JavaScript package ecosystem has not been built to make verification easy.
NPM has been a fantastic example of a smooth developer experience: introducing new packages and dependencies is easy! But it focussed too much on that and failed to address some of the less visible requirements. The first is in package size. The JavaScript ecosystem has often had a focus on making small “does one thing well” style packages, partly so as to minimise the size of JavaScript front-end bundles by only importing very tightly scoped functionality. This has happened because, in JavaScript, a package is often the smallest unit of importable functionality. Newer build tools do better at this, but there’s a long legacy that’s established a culture of many small packages.
The second is in package distribution. Locking package versions did not come until relatively recently (npm-shrinkwrap was usually a sub-optimal experience, but has been replaced by Yarn and NPM package locks), and two other facets that have long been common in distributing Linux system packages have not been adopted by NPM: code-signing and reproducible builds.
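To make the locking point concrete, here is a rough sketch of how a caret range picks up new releases when nothing is locked. It is a simplified matcher written for this post, not NPM’s actual resolver: it ignores pre-release tags and every other range syntax. The `^0.1.0` range and all version numbers other than flatmap-stream 0.1.1 are assumptions for illustration.

```javascript
// Parse "1.2.3" into [major, minor, patch] numbers.
function parse(version) {
  return version.split('.').map(Number);
}

// Compare two parsed versions: negative, zero, or positive.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Simplified caret matching: "^X.Y.Z" accepts any version at or above X.Y.Z
// that leaves the left-most non-zero component unchanged.
function satisfiesCaret(version, range) {
  const v = parse(version);
  const base = parse(range.slice(1));
  if (compare(v, base) < 0) return false;
  let pivot = base.findIndex((n) => n !== 0);
  if (pivot === -1) pivot = 2; // "^0.0.0" only ever matches 0.0.0
  for (let i = 0; i <= pivot; i++) {
    if (v[i] !== base[i]) return false;
  }
  return true;
}

// Pick the highest published version that satisfies the range.
function resolve(range, published) {
  const matching = published.filter((v) => satisfiesCaret(v, range));
  matching.sort((a, b) => compare(parse(a), parse(b)));
  return matching.length > 0 ? matching[matching.length - 1] : null;
}

// A patch release is picked up automatically on the next unlocked install.
console.log(resolve('^0.1.0', ['0.1.0', '0.1.1'])); // 0.1.1

// A semver-major release is not: the range stays on the old major version.
console.log(resolve('^3.3.4', ['3.3.5', '3.3.6', '4.0.0'])); // 3.3.6
```

A lock file sidesteps this by recording the exact version that was resolved, so a fresh install gets what you tested with rather than whatever was published most recently.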
If we had code-signing, would it have solved the issue? Maybe. It would have flagged to the maintainers of dependent packages that the dependency had now been published by a different person, which may have caused someone to re-review the code. But they may just have accepted it and moved on. After all, who has time to review the code? And if you trust someone for one build, that means you trust them for all future builds too.
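As a minimal sketch of what that flag could look like, the snippet below uses Node’s built-in `crypto` module to sign releases with Ed25519 keys and remembers the first signer it sees for each package. This is not something NPM does; the publishers, the `publish` and `checkRelease` helpers, and the package name are all hypothetical.

```javascript
const crypto = require('node:crypto');

// Two hypothetical publishers, each with their own Ed25519 key pair.
const alice = crypto.generateKeyPairSync('ed25519');
const mallory = crypto.generateKeyPairSync('ed25519');

// A publisher signs the package contents and ships their public key with it.
function publish(contents, keyPair) {
  return {
    contents,
    signature: crypto.sign(null, Buffer.from(contents), keyPair.privateKey),
    signerKey: keyPair.publicKey.export({ type: 'spki', format: 'pem' }),
  };
}

// Trust on first use: remember who signed the first release we accepted.
const trustedSigners = new Map();

function checkRelease(name, release) {
  const valid = crypto.verify(
    null,
    Buffer.from(release.contents),
    release.signerKey,
    release.signature
  );
  if (!valid) return 'invalid-signature';

  const known = trustedSigners.get(name);
  if (known === undefined) {
    trustedSigners.set(name, release.signerKey);
    return 'trusted-on-first-use';
  }
  return known === release.signerKey ? 'same-signer' : 'signer-changed';
}

console.log(checkRelease('some-package', publish('v1 code', alice)));   // trusted-on-first-use
console.log(checkRelease('some-package', publish('v2 code', alice)));   // same-signer
console.log(checkRelease('some-package', publish('v3 code', mallory))); // signer-changed
```

Note what this does not catch: a release signed by the already trusted key passes as `same-signer` whatever it contains, which is exactly the “trust them for all future builds” problem.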
If we had reproducible builds, would it have solved this issue? Maybe, as it would have made it easier to compare the source in GitHub to what actually got installed. In this model, NPM would have to build the packages from the committed source code, rather than it being built and published locally, to ensure that the published open source code and built component correspond, to make review easier. But someone would still need to have reviewed the source code that built the dependency.
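Here is a rough sketch of that comparison, assuming the build is deterministic. The `build` function is a toy stand-in for a real minifier, and the source and artifacts are made up. The digest is formatted in the same `sha512-<base64>` style that lock files use for their `integrity` field.

```javascript
const crypto = require('node:crypto');

// Toy stand-in for a deterministic build step: collapse all whitespace.
// A real build would be a minifier or bundler pinned to an exact version.
function build(source) {
  return source.replace(/\s+/g, ' ').trim();
}

// Digest in the "sha512-<base64>" style used by lock file integrity fields.
function digest(artifact) {
  return 'sha512-' + crypto.createHash('sha512').update(artifact).digest('base64');
}

// Rebuild from the committed source and compare against what was published.
function matchesSource(committedSource, publishedArtifact) {
  return digest(build(committedSource)) === digest(publishedArtifact);
}

const committedSource = 'module.exports = function (x) {\n  return x;\n};\n';
const honestArtifact = build(committedSource);
const tamperedArtifact = honestArtifact + ' /* extra code appended before publishing */';

console.log(matchesSource(committedSource, honestArtifact));   // true
console.log(matchesSource(committedSource, tamperedArtifact)); // false
```

A mismatch only tells you that the published artifact did not come from the committed source. It says nothing about whether the committed source itself is safe.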
So it all comes down to trust. Even if we had reproducible builds (so you can review the human readable source and have confidence that it corresponds to the built product), reviewing every change you take into your product isn’t a feasible solution. Reviewing your immediate dependencies may be, but then you’re assuming that the people you depend on are reviewing theirs, which is a form of transitive trust. Combining this with code signing, you can perhaps increase your trust to only trusting on the initial import into your app, and then trusting that future changes by the same person are valid. But it’s still based on trust, and the distinct nature of JavaScript development has meant that we have to trust a large group of people due to the sheer number of dependencies we can have, with no easy way to verify that trust. Until we have more effective ways than code review to verify this, there’ll be problems again and again. Tools will get better and make it rarer, but we can’t completely eliminate the issue.
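To get a feel for how quickly transitive trust grows, here is a small sketch that walks a dependency graph and counts everything an app ends up relying on. The graph and every package name in it are invented for illustration; a real project would read this from its lock file.

```javascript
// Hypothetical dependency graph: package name -> its direct dependencies.
const dependencyGraph = {
  'my-app': ['dev-server', 'http-client'],
  'dev-server': ['process-tools', 'file-watcher'],
  'process-tools': ['stream-utils'],
  'stream-utils': ['flat-mapper', 'pipe-helper'],
  'http-client': [],
  'file-watcher': [],
  'flat-mapper': [],
  'pipe-helper': [],
};

// Collect every package reachable from the root, direct or transitive.
function transitiveDependencies(root, graph) {
  const seen = new Set();
  const stack = [...(graph[root] || [])];
  while (stack.length > 0) {
    const name = stack.pop();
    if (seen.has(name)) continue; // already visited, also guards against cycles
    seen.add(name);
    stack.push(...(graph[name] || []));
  }
  return seen;
}

const direct = dependencyGraph['my-app'].length;
const total = transitiveDependencies('my-app', dependencyGraph).size;
console.log(`${direct} direct, ${total} in total`); // 2 direct, 7 in total
```

Reviewing the two direct dependencies is plausible. The other five arrived because someone else chose them, and in this made-up graph `flat-mapper` sits four levels away from the app.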
I have no answer for how to eliminate this completely. But let’s take this as yet another cautionary tale, learn what we can from it, and take a small step towards getting better.
I might have mentioned earlier in the post that I’m bringing out a book, something I’ve been working on for the past 3 years. It’s called The Full Stack Developer and it looks at the underlying principles of full-stack web development, rather than focussing on a particular framework. I’d really appreciate it if you could have a look at it and consider pre-ordering if it sounds like the kind of thing you could learn something from. Thanks!