Wednesday, April 16, 2014

The Delightful World of Open Source Software

The Heartbleed vulnerability was an awful bug, and caused incredible damage to the web of trust we have built around online transactions. But there was a silver lining: people became aware of how much of our critical Internet infrastructure relies on old, poorly-maintained code that nobody wants to support any more. That's, um, a bad thing, but now that we're aware of it people can start fixing it!

By far my favorite thing I've seen online recently is the git commit log for (a copy of) OpenSSL, the software with the Heartbleed bug. Once people started paying attention, interest in the project spiked, and they started seriously working to not just plug Heartbleed, but to fix the code in general. This led to a flurry of activity, and a fascinating kind of oral history that reminds me of someone discovering the Necronomicon: initial excitement and a sense of superiority gradually give way to a creeping suspicion that something is not right here, and finally to a howling wave of madness as everything you love in the world is destroyed. Here are some of my favorite commit messages, all from the last three days or so, presented in chronological order:

I am completely blown away that the same IETF that cannot efficiently allocate needed protocol, service numbers, or other such things when they are needed, can so quickly and easily rubber stamp the addition of a 64K Covert Channel in a critical protocol.  The organization should look at itself very carefully, find out how this this happened, and everyone who allowed this to happen on their watch should be evicted from the decision making process.  IETF, I don't trust you.

remove more cases of MS_STATIC, MS_CALLBACK, and MS_FAR. Did you know that MS_STATIC doesn't mean it is static?  How far can lies and half-truths be layered?  I wonder if anyone got fooled, and actually returned a pointer..


Remove various horrible socket syscall wrappers, especially SHUTDOWN* which did shutdown + close, all nasty and surprising.  Use the raw syscalls that everyone knows the behaviour of.


First pass at applying KNF to the OpenSSL code, which almost makes it readable.

Flense all use of BIO_snprintf from ssl source - use the real one instead

o_dir.c has a questionable odor

Toss a `unifdef -U OPENSSL_SYS_WINDOWS' bomb into crypto/bio.


No longer mention OPENSSL_EC_BIN_PT_COMP being required to allow for `compressed' EC point representation.
First, as researched by djb, quoting from :
``It should, in any case, be obvious to the reader that a patent cannot
  cover compression mechanisms published seven years before the patent
  was filed.'' 

Second, that define was actually removed from the code in in OpenSSL 1.0.0.


remove FIPS mode support. people who require FIPS can buy something that meets their needs, but dumping it in here only penalizes the rest of us.


Q: How would you like your lies, sir?
A: Rare.


just like every web browser expands until it can read mail, every modular library expands until it has its own dlfcn wrapper, and libcrypto is no exception.

The NO_ASN1_OLD define was introduced in 0.9.7, 8 years ago, to allow for obsolete (and mostly internal) routines to be compiled out. We don't expect any reasonable software to stick to these interfaces, so better clean up the view and unifdef -DNO_ASN1_OLD. The astute reader will notice the existence of NO_OLD_ASN1 which serves a similar purpose, but is more entangled. Its time will come, soon.


imake died in a fire a long time ago


we don't use these files for building


we don't use this makefile


the VMS code is legion 

remove ssl2 support even more completely. in the process, always include ssl3 and tls1, we don't need config options for them. when the time comes to expire ssl3, it will be with an ax.

Remove wraparounds for operating systems which lack issetugid(). I will note that some were missing, looking at you Solaris!!!  Anyone home? Using my own copyright on the file now, since this is a rewrite of a trivial wrapper around a system call I invented.

use explicit_bzero instead of a bizarro "no compiler could ever be smart enough to optimize this" monstrosity.


Three wrappers in this file: OPENSSL_strncasecmp, OPENSSL_strcasecmp, and OPENSSL_memcmp. All modern systems have strncasecmp.  No need to rewrite it. Same with memcmp, call the system one!  It is more likely to be hot in the icache, and is specifically optimized for the platform.  I thought these OpenSSL people cared about performance?


you do not want to do the things this program does

strncpy(d, s, strlen(s)) is a special kind of stupid. even when it's right,it looks wrong. replace with auditable code and eliminate many strlen calls to improve efficiency. (wait, did somebody say FASTER?)

spray the apps directory with anti-VMS napalm. so that its lovecraftian horror is not forever lost, i reproduce below a comment from the deleted code.
        /* 2011-03-22 SMS.
         * If we have 32-bit pointers everywhere, then we're safe, and
         * we bypass this mess, as on non-VMS systems.  (See ARGV,
         * above.)
         * Problem 1: Compaq/HP C before V7.3 always used 32-bit
         * pointers for argv[].
         * Fix 1: For a 32-bit argv[], when we're using 64-bit pointers
         * everywhere else, we always allocate and use a 64-bit
         * duplicate of argv[].
         * Problem 2: Compaq/HP C V7.3 (Alpha, IA64) before ECO1 failed
         * to NULL-terminate a 64-bit argv[].  (As this was written, the
         * compiler ECO was available only on IA64.)
         * Fix 2: Unless advised not to (VMS_TRUST_ARGV), we test a
         * 64-bit argv[argc] for NULL, and, if necessary, use a
         * (properly) NULL-terminated (64-bit) duplicate of argv[].
         * The same code is used in either case to duplicate argv[].
         * Some of these decisions could be handled in preprocessing,
         * but the code tends to get even uglier, and the penalty for
         * deciding at compile- or run-time is tiny.


Remove non-posix support. Why is OPENSSL_isservice even here? Is this a crypto library or a generic platform abstraction library? "A hack to make Visual C++ 5.0 work correctly" ... time to upgrade.

Your operating system memory allocation functions are your friend. If they are not please fix your operating system.


Make this byzantine horror a shell of it's former self by stubbing the functions. The ability to set the debug mem functions died with mem.c

Actually, now that I look at all that together at once, it's really reminding me of Johnny's journal entries from House of Leaves.
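
The strncpy complaint in particular is easier to appreciate with code in hand. Here is a minimal sketch in C, not taken from OpenSSL; copy_suspicious and copy_auditable are hypothetical names made up for illustration. Bounding the copy by strlen(s) sizes it by the source rather than the destination, so it cannot prevent an overflow, and it stops just short of writing the terminating NUL.

```c
#include <stddef.h>
#include <string.h>

/* The pattern the commit message mocks. The bound comes from the source,
 * so it offers no protection for d, and because exactly strlen(s) bytes
 * are copied, the terminating NUL is never written. */
void copy_suspicious(char *d, const char *s)
{
    strncpy(d, s, strlen(s));
}

/* An auditable alternative: the bound is the destination's capacity.
 * Returns 0 on success, -1 if s (plus its NUL) does not fit in d. */
int copy_auditable(char *d, size_t dcap, const char *s)
{
    size_t n = strlen(s);   /* measured once, reused below */

    if (n >= dcap)
        return -1;          /* refuse rather than truncate silently */
    memcpy(d, s, n + 1);    /* n bytes plus the terminating NUL */
    return 0;
}
```

A compiler may well warn about the first function, which is rather the point. The second one measures the string a single time, which is presumably the kind of strlen saving the commit message has in mind.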

Now, for an open-source enthusiast like myself, the entire Heartbleed saga has been distressing on a psychological, even philosophical level. One of the core axioms that open-source advocates embrace is the belief that open software leads to greater security and stability in code. When you offer up your source code for the entire world, you gain thousands of eyeballs, any of whom can spot bugs in the code and offer solutions. The thinking is that this is much safer than the closed-source Microsoft world, where bugs lie hidden in compiled code, and can't be discovered until a nefarious actor exploits them.

Heartbleed totally inverted that expectation: all the lame companies who used Microsoft IIS came out of the incident with flying colors, while all of the cool companies running LAMP-style stacks seemed like dupes. The truly embarrassing thing is that the bug was in OpenSSL for over two years before it was finally noticed and fixed. That seems to disprove Eric Raymond's Bazaar argument that "given enough eyeballs, all bugs are shallow."

Now that the dust has settled somewhat, a more nuanced view seems to be emerging. The reality is that there were almost no actual eyeballs on OpenSSL; even though six billion people could have looked at it, only three people were spending a couple of hours a month on it. Why? Well, because:
  1. It's ancient software. Programmers are always excited by newer and better things; who wants to waste time trawling through outdated code?
  2. It's a nightmare to read. As Neal Stephenson noted in his chapter on Linux from In the Beginning..., most low-level open-source software is written in C, and contains such a staggering amount of boilerplate preprocessor definitions that it's a struggle just to find where the actual code in the project lives. Many or most of the above-quoted commits are related to this complaint.
  3. Tied to the first two points, virtually all open source software is maintained on a purely volunteer basis. These are almost always programmers with full-time day jobs, so they'll spend their limited free-time programming hours on software that excites them and/or that has potential for future career advancement; which, respectively, means well-written projects and/or projects using cutting-edge technology. Exciting new projects with clean code and active communities like Django get a lot of volunteers and can advance very quickly; fugly old legacy projects like OpenSSL don't get volunteers.
  4. So, because nobody wants to work on these projects that are critically important but mind-numbingly dull, it's up to "the community" to fund continued support. Last year, the project received $2000 in donations, which isn't much at all, and most of which came from a couple of Internet companies. That's wildly out of sync with how important the software is, and a staggeringly small amount of resources from the Internet companies (Google, Amazon, Facebook, etc.) that rely on the software. (By my calculations, the total budget last year for OpenSSL was equivalent to roughly 1.5% of a single Google engineer's salary.)
Fundamentally, Raymond isn't wrong, but we were wrong to assume that just because a limitless number of people could review software, enough people were reviewing it. 

It will be interesting to see how the industry as a whole responds to this incident. In the short term, I'm concerned that we might see a surge of similar exploits: now that it's widely known that vulnerabilities can languish for years in these open-source projects, criminal hackers are probably poring over CVS repositories looking for the next unpatched buffer overflow exploit. Any such additional vulnerabilities would intensify a call to action, but one is probably coming anyway. What needs to happen? In my opinion, those who have profited most from the labor of the community should contribute more to its survival, either financially (via donations of money) or in kind (by assigning employees to maintain these projects, as Google currently does). Hopefully self-interest will motivate the big companies to do so; if not, it may be helpful to "name and shame" any freeloaders. (And, really, we're not talking about huge sums of money here, certainly far less than paying a license for commercial software would cost.)

The other big thing that needs to happen is what those wonderful, doomed folks at OpenSSL are doing now: throwing back the curtain around this old software, gaping in unfeigned shock at how awful it is, and taking a chainsaw to the worst bits of it, trying to hack it down to a state that's possible to comprehend and maintain. Any software developer will tell you that this is the worst kind of programming imaginable: fixing bad code that you didn't write in the first place. But, it's crucially important. (I'll be curious to see if this incident also adds a sense of urgency for rewriting software components in more modern languages that don't even allow frequently-exploited "features" like buffer overflows.)
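
To make the buffer overflow point concrete: Heartbleed was an over-read, where the code trusted a length field supplied by the other side of the connection. What follows is a simplified sketch in C, not OpenSSL's actual code; the function name, the parameters, and the layout are all hypothetical. It shows the kind of check that C leaves entirely to the programmer, and that a bounds-checked language would perform on every access.

```c
#include <stddef.h>
#include <string.h>

/* Echo back a payload whose length the sender claims is claimed_len.
 * payload and received_len describe the bytes that actually arrived.
 * Returns the number of bytes written to out, or -1 if rejected. */
long echo_payload(const unsigned char *payload, size_t received_len,
                  size_t claimed_len,
                  unsigned char *out, size_t out_cap)
{
    /* Without this check, memcpy would read past the end of payload
     * and copy whatever happens to sit next to it in memory. */
    if (claimed_len > received_len)
        return -1;

    /* The destination needs the same scrutiny as the source. */
    if (claimed_len > out_cap)
        return -1;

    memcpy(out, payload, claimed_len);
    return (long)claimed_len;
}
```

Asking for 4 bytes of a 4-byte payload succeeds; asking for 500 bytes of the same payload is refused instead of leaking 496 bytes of neighboring memory.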

With that in mind, I'm even happier to read the words in this commit log. Not just because they're funny, not just because they're enlightening, but because they're a part of the long oral tradition that is open source software. More so than any other area of software development, open source relies on programmers being brutally candid about what's going on: if someone writes a hack, you can be sure that they'll leave a comment pointing out that it is a hack, explaining why they had to do it, and what the implications of that hack are. Without any corporate PR arm around to vet them, coders can be perfectly frank about their opinion of any software that they're writing or reading. Back when I was in college and first getting into Linux, whenever I was bored (and didn't feel like piping random text files to /dev/dsp) I would open up a console prompt and type grep -R <my favorite curse-word> /usr/src/linux. This would inevitably bring me to the most interesting parts of the Linux codebase: not the reams of #ifdef preprocessor directives, not the fiddly make settings for obscure hardware, but the places where some philosophical debate was occurring between different generations of Linux developers. Heartbleed is almost certainly the biggest threat that open source software has faced in the past twenty years, but if it can quickly respond with transparency, candor, and action, it will emerge stronger than ever before.

1 comment:

  1. It sounds like a lot of OpenSSL was written by Steve in a hurry, especially the FIPS bits.

    Let's not be so quick to judge and complain for the blood, sweat and tears millions of people depend on but don't pay for, but quick to offer alternatives, suggestions and maybe some help.