Security thoughts on Maven and clojars

Distributing Clojure (and Java) libraries

Posted on Saturday, 2 January, 2016

TL;DR

I think the way forward is to retain https://clojars.org to allow the authors of fledgling projects to distribute their libraries without the extra encumbrances of Maven Central and jcenter. Once those libraries become established, we should encourage their authors to register with keybase.io, start signing their jars and git commits, and upload to jcenter.

The long version

I've recently finished up a consulting stint at a well-known bank that has been embracing Clojure in a number of projects.

Without going into too much detail, one of the hurdles Clojure developers faced there was in accessing Clojure libraries. As in other large organisations, firewall rules are such that you can't simply download third-party libraries from somewhere like Maven Central or clojars.org. To solve this problem, organisations often use an internal binary repository (in this case, Sonatype's Nexus) as a proxy. Developers are allowed to contact the proxy, and the proxy is allowed to contact the external repositories.

The problem in this case was that clojars.org was not on the list of allowed repositories, which caused a good deal of additional complexity and forced the use of another 'legacy' repository that did have access.

The experience prompted me to consider whether the Clojure community (of which I am a part) needs to think more seriously about security, in order to prepare the ground for wider adoption in corporate settings.

Let me explain the problem with an example.

  1. Alice creates a useful Clojure library and uploads it onto clojars. She blogs about how useful it is.

  2. Bob reads about it and adds Alice's library details from her GitHub README into the dependencies of his project.clj (or pom.xml) file.

  3. Chuck forks Alice's project and adds a bit of code that steals the customer records of the organisation it runs in and uploads them to a secret site.

  4. Bob upgrades to the new version of the library and downloads and runs Chuck's version, thinking it to be from Alice.

There are a few mitigating factors that make this attack less likely.

Firstly, clojars.org only accepts uploads from users who have a username and password, and once a project name is established, other users cannot usurp it, so you can trust that new releases will be from the same origin. Also, clojars encourages users to encrypt their usernames and passwords with GPG (but can't actually force users not to keep their passwords around in plain text).
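The name-ownership rule can be sketched in a few lines of Java. This is only an illustration of the idea, not how clojars.org is actually implemented: the ProjectNameRegistry class, its acceptUpload method, and the project and user names are all hypothetical, and the user is assumed to have been authenticated already.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of first-come project-name ownership in a repository.
// It is not the clojars.org implementation.
final class ProjectNameRegistry {
    // Maps each established project name to the user who first claimed it.
    private final Map<String, String> ownerByProject = new HashMap<>();

    // Accepts an upload if the name is unclaimed (the uploader becomes the owner)
    // or if the uploader already owns the name. Anyone else is rejected.
    boolean acceptUpload(String project, String authenticatedUser) {
        String owner = ownerByProject.putIfAbsent(project, authenticatedUser);
        return owner == null || owner.equals(authenticatedUser);
    }

    public static void main(String[] args) {
        ProjectNameRegistry registry = new ProjectNameRegistry();
        // Alice establishes the name with her first release.
        System.out.println(registry.acceptUpload("useful-lib", "alice"));
        // Chuck tries to publish his fork under the same name.
        System.out.println(registry.acceptUpload("useful-lib", "chuck"));
        // Alice can still publish new releases.
        System.out.println(registry.acceptUpload("useful-lib", "alice"));
    }
}
```

Note that a rule like this only protects the name within a single repository, and only for as long as the owner's credentials stay secret.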

Obviously this assumes that you can trust the library author in the first place. Trust can be established by a combination of

  • the reputation of the individual
  • community vetting
  • evidence such as the number of GitHub followers they have and the stars their projects have earned
  • whether the project is public and open-source (which would make it harder to hide malware)
  • trust in GitHub itself

There are, however, a number of attack vectors left for Chuck to explore. One might be to upload a copy of the modified library to another repository, such as Maven Central. Most repositories are configured to try there first. So securing the repositories themselves is not sufficient on its own; we might need to start thinking about signing our libraries too. If libraries were signed, and their signatures were checked, then even if Chuck had control of a repository used by Alice's organisation, his library version would be automatically rejected if his public key wasn't trusted.
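To make the signature check concrete, here is a minimal sketch in Java using only the standard java.security API. It is a simplification under stated assumptions: throwaway RSA key pairs stand in for the authors' PGP keys, the "jar" is just a byte array, and the TrustedSignatureCheck class and its method names are hypothetical. Real tooling would verify detached PGP signatures instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.List;

// Hypothetical sketch: accept an artifact only if its signature verifies
// against one of the public keys the organisation has chosen to trust.
final class TrustedSignatureCheck {

    // Generates a throwaway RSA key pair standing in for an author's PGP key.
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Signs the artifact bytes with the author's private key.
    static byte[] sign(byte[] artifact, KeyPair author) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(author.getPrivate());
            signer.update(artifact);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true only if some trusted key verifies the signature over the artifact.
    static boolean isTrusted(byte[] artifact, byte[] signature, List<PublicKey> trustedKeys) {
        for (PublicKey key : trustedKeys) {
            try {
                Signature verifier = Signature.getInstance("SHA256withRSA");
                verifier.initVerify(key);
                verifier.update(artifact);
                if (verifier.verify(signature)) {
                    return true;
                }
            } catch (GeneralSecurityException e) {
                // A malformed signature or unusable key counts as "not verified".
            }
        }
        return false;
    }

    public static void main(String[] args) {
        KeyPair alice = newKeyPair();
        KeyPair chuck = newKeyPair();
        // Bob's organisation trusts Alice's key only.
        List<PublicKey> trusted = List.of(alice.getPublic());

        byte[] genuine = "useful-lib jar contents".getBytes(StandardCharsets.UTF_8);
        byte[] modified = "useful-lib jar contents plus Chuck's code".getBytes(StandardCharsets.UTF_8);

        // Alice's jar with Alice's signature.
        System.out.println(isTrusted(genuine, sign(genuine, alice), trusted));
        // Chuck's jar with Chuck's signature: his key is not in the trusted list.
        System.out.println(isTrusted(modified, sign(modified, chuck), trusted));
        // Chuck's jar with a signature copied from Alice's jar: the contents do not match.
        System.out.println(isTrusted(modified, sign(genuine, alice), trusted));
    }
}
```

In this sketch only the first check should succeed. The hard part, which the code simply assumes away, is deciding which keys belong in the trusted list in the first place.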

So what's the next step? Just now, Frankie Sardo made the following comment on our JUXT slack.

Can’t hack on my side projects today, clojars is still down :disappointed: I like clojars but, why isn’t everyone just using jcenter?

I think this could be a really good idea.

One advantage to using jcenter is that it has thought through security, perhaps better than Sonatype's Maven Central. I remember Sonatype being reluctant to move to HTTPS by default because their revenue model was based on charging for it. That may be unfair to Sonatype, but they seem to think that forcing users to sign their libraries gives some additional security. In the absence of a process for building trust in keys, that isn't nearly adequate. Here's a great article from jcenter that explains why: http://blog.bintray.com/2014/08/04/feel-secure-with-ssl-think-again/

The fact that Google have adopted jcenter for Android, where there is no doubting the critical relevance of security, really adds confidence.

What remains is to build a web-of-trust-based library signature verification model inside Maven. I'm not sure what the status of this is yet. Here's an article from 2012 that indicates we're still quite far off this. (Again we hear that Sonatype are building their business model on the withholding of security, which should be built-in.)

I think Arch Linux's approach is relevant here. With Arch Linux, you use pacman-key --refresh-keys to download a set of trusted keys from a secure repository. Every package has to be signed, and every signature is verified before the package can be installed. If any of the keys are compromised, they can be removed from the repository. Other Linux distributions, such as Debian, have had these security measures in place for a long time, which is a key reason why Linux is more secure than Mac OS X, which often encourages more ad-hoc installation of software, particularly for devs.

To create a viable key-verification system with the widespread community of Clojure library authors, we'd need to adopt a web-of-trust mechanism. That's the big problem we need to solve.

A number of developers have signed up with keybase.io, including myself, James Reeves, Martin Trojer, Philip Meier and James Henderson. If enough Clojure devs start using this service, we can begin to build a trusted library distribution system based on it, and perhaps get Maven to use it for verifying a user-defined subset of signatures.

Signing jars is already easy to do. In fact, I have been doing this for years for all my library releases and tags (bidi, yada, etc.), using my PGP key. It's a bit of a pain typing in a long passphrase every time I tag a commit or upload a jar, but I'm soon going to use my YubiKey to make this easier.