Sunday, February 01, 2009

Coder

Like many (most?) young boys, I went through a time in my childhood when I had a mild obsession with cryptography.  I read books on code-making and code-breaking, many of them beyond my grade level, and enjoyed solving the puzzles.  I learned the elementary tricks that you can use to break the simplest types of ciphers: look for one- and two-letter words.  Count the most frequent letters.  I learned the difference between a code and a cipher: a cipher replaced letters, so it could encrypt any message, but could be broken by a determined adversary; a code replaced entire words and phrases, and a well-chosen and well-kept code was practically unbreakable, but required you to plan your code words in advance, and left you in deep trouble should the code ever fall into the enemy's hands.
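
For the curious, those two elementary tricks are easy to sketch in a few lines of Python.  This is only an illustration, not anything from the books I read as a kid: the function names and the sample ciphertext (a simple shift of three letters) are made up for the example.

```python
from collections import Counter


def letter_frequencies(ciphertext):
    """Return (letter, count) pairs, most common first, ignoring non-letters."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    return Counter(letters).most_common()


def short_words(ciphertext, max_len=2):
    """Collect the distinct one- and two-letter words, which make good first guesses."""
    words = ciphertext.upper().split()
    return sorted({w for w in words if w.isalpha() and len(w) <= max_len})


# Hypothetical ciphertext: every letter of a short sentence shifted forward by three.
sample = "L DP D FRGHU DQG L OLNH WR UHDG"
print(letter_frequencies(sample)[:3])
print(short_words(sample))
```

In English, a one-letter word is almost certainly "A" or "I", and the most frequent letters are likely to stand for common ones such as "E" or "T", which is usually enough to start unravelling a simple substitution.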

Looking back, I'm pretty amused at how much of my youth was devoted to finding ways to thwart "the enemy."  I obsessed over ways to hide information, planned escape routes, practiced plausible denials, diagrammed moats, built elaborate security systems, came up with extensive systems of ranks... quite a lot of effort for a kid who spent most of his time reading books and hoping nobody would punch him!

I even remember being proud when I came up with an "unbreakable" cipher.  As I recall, I created a 5x5 grid and filled it with letters of the alphabet.  Because I only had 25 letters to work with, I modified the alphabet a little... if I remember right, there was no "C" on the grid, so I would use either "K" or "S" depending on its sound.  I think that I also got rid of "Q" and "Z", and replaced them with dummy squares that didn't mean anything.

The real security was that the grid wasn't static.  You would create it yourself for each message you sent, and your receiver would know how to build the corresponding square to decode.  The details of the construction are lost to me... it may have had something to do with the initial letters of the message, or perhaps the first digits in the transmitted message would indicate how to construct it.  In either case, I was convinced that this would be unbreakable.  By cleverly altering the alphabet, I destroyed the 1-to-1 correlation that tripped up other ciphers.  By using a grid, I could send messages as a string of seemingly meaningless digits instead of letters; for example, 4235222213 instead of HELLO.  At last, my most important correspondence would be free from the prying eyes of The Enemy!
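
Here is a rough sketch of how a grid like that might work.  Since the original rules are lost, everything specific is an assumption: the grid is built from a shared keyword (keyword letters first, then the rest of the alphabet, then two dummy squares), each letter becomes its row and column number, and "C", "Q" and "Z" are crudely respelled instead of being chosen by sound.

```python
ALPHABET = "ABDEFGHIJKLMNOPRSTUVWXY"  # 23 letters: no C, Q, or Z
DUMMIES = "##"                        # two filler squares that carry no meaning

# Crude stand-ins (an assumption): the original picked K or S by sound.
RESPELL = {"C": "K", "Q": "K", "Z": "S"}


def build_grid(keyword):
    """Build a 5x5 grid: keyword letters (deduplicated), remaining letters, dummies."""
    seen = []
    for ch in keyword.upper():
        ch = RESPELL.get(ch, ch)
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    rest = [ch for ch in ALPHABET if ch not in seen]
    cells = seen + rest + list(DUMMIES)
    return [cells[i:i + 5] for i in range(0, 25, 5)]


def encode(message, grid):
    """Replace each letter with its 1-based row and column digits; skip anything else."""
    pos = {
        ch: f"{r + 1}{c + 1}"
        for r, row in enumerate(grid)
        for c, ch in enumerate(row)
        if ch != "#"
    }
    out = []
    for ch in message.upper():
        ch = RESPELL.get(ch, ch)
        if ch in pos:
            out.append(pos[ch])
    return "".join(out)


def decode(digits, grid):
    """Read the digits two at a time and look each pair up in the grid."""
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return "".join(grid[int(p[0]) - 1][int(p[1]) - 1] for p in pairs)


grid = build_grid("ENEMY")  # hypothetical shared keyword
secret = encode("HELLO", grid)
print(secret, decode(secret, grid))
```

Within a single message each letter still maps to one fixed pair of digits, so counting the most frequent pairs works just as well as counting the most frequent letters.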

I found myself nostalgically remembering this and other explorations while reading through The Code Book, an excellent piece of nonfiction from Simon Singh.  It may be the most readable book on the topic that I've read yet, and it combines a broad scope with approachable prose.  Simon treats his readers with respect, offering up actual examples of real codes, keeping his hand-waving to a minimum (or, better, to the brief and excellently chosen appendices).  His approach combines a history of cryptography with an excellent primer on increasingly complicated systems of cryptanalysis.  He will typically introduce a period in history, explain the current political situation, describe an advance in code-making or code-breaking, analyze its significance, and talk about the repercussions.  As such, I think this book will interest students of history even if they aren't already interested in ciphers.  You can see how codes helped Louis XIV's empire expand, how breaking Enigma shortened World War II by several years, and other interesting anecdotes.

The book isn't just interesting; it's also fun.  Simon includes a few challenges, and you can run ahead and try to tackle them with the tools you've learned so far.  Ciphers exercise your brain in an interesting way, related to solving a crossword puzzle or Sudoku but also very different, especially if you are confronted with a cipher that does not indicate what encryption method it uses.

I also really appreciated this book as a computer scientist and as a technology geek.  I've been talking about public-key encryption for years, but this book was the first time I've actually understood the math behind how it works.  In recent years I've taken on more and more projects that involve some sort of encryption, and have come to understand the terminology and trade-offs of modern computer-aided crypto (like block ciphers, stream ciphers, key exchanges, and certification authorities), yet after reading this book, I feel more equipped than ever to put my knowledge to use and make more informed decisions about encrypted protocols.

The book also drove home for me that true security goes far beyond choosing a crypto package or slapping "https://" at the front of a URL.  One of the most amazing things to see is how, throughout history, tenacious codebreakers have been able to find a single mistake made by their opponents and exploit it to crack the entire code.  Enigma was a fearsome cipher, but the operators made the mistake of repeating each message key twice at the start of every message.  This may seem innocuous, but brilliant codebreakers, first in Poland and later at Bletchley Park, worked out how this repetition let them deduce the Enigma settings for the day.  One of the most fascinating examples actually deals with deciphering lost scripts, like Linear B from Crete or Egyptian hieroglyphics, and shows how they resisted all attempts at analysis (or even comprehension) until someone was able to guess that a certain sequence of symbols represented a particular name.  Again, much like solving a crossword or Sudoku, sometimes all it takes is a single crack, and then everything else swiftly follows.

Getting back to my professional interest - this book's lasting legacy will likely be to make me paranoid about all the secure code that I write.  It won't be enough to just encrypt things; if I want to be truly secure, I need to make sure that my keys are obfuscated and non-obvious; that unencrypted data never travels over an insecure channel or is stored on the filesystem; that all the partners I exchange data with also follow good security practices; etc.  Again, these are things that I "knew" to do before, but now I have a much fuller appreciation for how important they are.  128-bit encryption may be amazingly secure, but if someone can find the key by typing "strings chris.exe", I'd almost be better off with no crypto at all - at least then I wouldn't be lulled into a false sense of security.
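
To show how little effort that takes, here is a small sketch that does roughly what the Unix "strings" tool does: it pulls runs of printable characters out of raw bytes.  The function name, the fake binary contents and the key embedded in them are all made up for illustration.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Hypothetical stand-in for a compiled program with a hard-coded key inside it.
fake_binary = (
    b"\x00\x01\x02"
    + b"SECRET_KEY=hunter2"
    + b"\x00\xff"
    + b"main"
    + b"\x00"
)

for text in printable_strings(fake_binary):
    print(text)
```

Anything stored as plain text inside the executable falls out immediately, no matter how strong the cipher that uses it.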

The book continues to progress through the ages, past RSA and PGP, and ends with an interesting chapter on the future - specifically, quantum computing (which may be able to defeat even the strongest public-key encryption) and quantum encryption (which Simon believes has been mathematically proven to be impossible to break, should it ever be created).  It's all kind of heady stuff, and here Simon acts as a Gladwell-like interlocutor, explaining the lofty theories of brainy eggheads to curious citizens like us. 

It was a fun romp and a great read.  It follows up on other excellent cryptography-related works which I can highly recommend to everyone: Cryptonomicon, by my beloved Neal Stephenson, is the most complete and useful book on cryptography in the fiction section (and incidentally overlaps quite a bit with the exciting World War II era and the modern connections between crypto and money); and "Code", an almost beautiful nonfiction book that walks you through the evolution of computer programming.  "Code" is the one book I know of that makes sense of the dual use of that word.

Anyways - for anyone with a bent towards history or problem solving, pick up The Code Book.  It's a fun read, and will expand your horizons.

1 comment:

  1. Bro- good stuff here. Remind me to show you the rudimentary cipher I put together to take notes in dull classes.
