Thoughts on the Debian openssl vulnerability

I’ve been following the Debian openssl key generation vulnerability for the last few days. It’s an interesting bug that involved a lot of different people and it’s hard to know who should take the blame.

Recap

Luciano Bello discovered that the random number generator in Debian’s patched version of openssl was not actually very random. Instead of generating one of 2^1024 values it generated only one of 2^15. All ssh keys, OpenVPN keys, and certificates generated on a Debian-based system from September 2006 on are therefore weak and guessable.
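To get a feel for how small 2^15 is, here is a minimal sketch. It assumes, as was widely reported, that the process ID was the only varying input still reaching the generator, and that process IDs top out at 32768 (the usual Linux default). java.util.Random is only a stand-in for a deterministic generator here; it is not what openssl uses, and the class and method names are made up for the illustration.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

final class WeakSeedDemo {
    // Assumption: default Linux maximum process ID, 2^15.
    static final int MAX_PID = 32768;

    // Stand-in for key generation: the "key" depends on nothing but the PID.
    static long generateKey(int pid) {
        Random rng = new Random(pid);
        return rng.nextLong();
    }

    // Enumerates every key the generator can ever produce.
    static int countPossibleKeys() {
        Set<Long> keys = new HashSet<>();
        for (int pid = 1; pid <= MAX_PID; pid++) {
            keys.add(generateKey(pid));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println("possible keys: " + countPossibleKeys());
    }
}
```

An attacker can run the same loop, which is why every key produced this way can be precomputed and guessed.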

Back in 2006 a Debian developer was testing openssl through valgrind and found that the library was using uninitialized memory, something which is always suspicious, especially in such a sensitive package. The developer started a discussion about this on the openssl-dev mailing list, which he believed was the proper channel to raise the issue.

Since no one on the list objected to the proposed change, the patch was applied to the Debian package. Unfortunately the attempted patch was incorrect and the flaw was introduced.

One of the openssl developers has now basically said the openssl-dev mailing list was not actually the way to raise the issue — the developers did not monitor the list.

Some of the problems:

  1. A package maintainer made a change to a package that he probably should not have made. He should have waited for the openssl developers to make the change.
  2. The openssl developers didn’t provide a way to reach them, and did not monitor the only list they published in their documentation.
  3. The code in question was sketchy to begin with.

Don’t rely on uninitialized memory

I’m not too surprised about the communication issues between developers on this. After all, they are all volunteers and were trying to do what they thought was best.

I am surprised that the original code was written to rely on uninitialized memory as a source of random data. Uninitialized memory is in no way random. When a process is initialized all memory given out by the kernel is zeroed out. The process does not see random data from previous processes (which would be a serious security issue) and memory does not just become random over time.

There are probably some limited cases where some pieces of process memory could become more and more random over time (maybe if you were developing a multithreaded key generation server), but even then it would be unlikely. The stack would most likely have the same pattern at the same point of execution. For a tool like ssh-keygen uninitialized memory would not be random. A compiler is always allowed to do whatever it wants with uninitialized memory and would be within its rights to zero it out or fill it with a fixed pattern.

This practice was raised years ago in a bug filed in 2003. The response to the bug was basically that using uninitialized memory is a fine thing to do and don’t ask about it again since it’s a FAQ.

The code should have been eliminated long ago to avoid legitimate confusion about its purpose.

Google Apps email list spam prevention

At work we moved all our mail services to Google Apps. Maintaining our own mail server was a waste of time and resources. Our server spent most of its time processing spam, and went down at least once or twice. This downtime was more than enough to justify the price of the move.

You give up a lot of flexibility once you switch over to Google’s servers, though. You can’t run procmail, and the email list functionality is limited.

We quickly found out that some of our older email lists were spam targets, and mail that came to them flooded everyone’s spam folder. Google’s spam filtering is very good, but if you get hundreds of spam messages a day it’s next to impossible to search through them for false positives.

To keep spam sent to email lists away from the users I set up a layer of indirection. (In this example I’ll use “staff” as the email list name.) I made one new mailbox account called “mail-router” and added staff as a nickname of this account. Then I created a new email list with an obscured name (xx-staff) that held the addresses of the employees who were meant to receive the mail. Finally, in the mail-router account I created a filter to forward mail addressed to staff to xx-staff.

Gmail doesn’t forward spam, so all spam to the staff alias gets left in the mail-router account. The two hidden addresses (mail-router and xx-staff) can both be changed at any time in case they too are targeted with spam.

Because the forwarding rule is an ordinary filter, any filter feature can be used: you can also delete some mail, or forward mail only if it matches a whitelist. To make a whitelist filter, use the Has the words field and enter an OR expression like this: from:gooduser@gooddomain.com OR from:gooduser2@otherdomain.com
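If the whitelist grows past a handful of senders, typing the OR expression by hand gets error-prone. Here is a small sketch that builds the same expression from a list of addresses. The class and method names are hypothetical, and the result still has to be pasted into the Has the words field by hand; nothing here talks to Gmail.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class WhitelistFilter {
    // Joins the senders into "from:a OR from:b", the form used in the filter field.
    static String query(List<String> senders) {
        return senders.stream()
                .map(sender -> "from:" + sender)
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        List<String> senders = Arrays.asList(
                "gooduser@gooddomain.com",
                "gooduser2@otherdomain.com");
        System.out.println(query(senders));
    }
}
```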

The mail-router account costs an extra $50 per year (easily justified), but it can be used for any number of email lists. Hopefully Google will eventually introduce at least a simple whitelist/blacklist at the email list level.

Is Google actually producing?

Large software companies can accomplish projects that smaller groups cannot come close to replicating. These big projects become the flagship products of the company, and each large company has at least one of these. Examples are Sun’s Java environment, Microsoft’s .NET framework, VMware’s technology, or Oracle’s database.

Some companies have many of these complex projects. Microsoft has DirectX, the Microsoft C compiler, Visual Studio, Xbox 360, and Microsoft Office, to name a few.

Google has one project like this, their search engine.

In the recent past, Google has gone on a hiring binge, acquiring big name open source developers and opening offices around the world. Expectations are high for a group of developers like this. My question is what do these people do?

My impression of Google is that everyone is working on their own or in small groups. Developers choose their own direction, working on their side projects, either unofficially in their 20% free time or officially. Sometimes these projects are made public, leading to hit or miss offerings like video.google.com and toys like Google Sky. The services Google has come up with recently have been mostly smaller (though well implemented) productivity applications. You get the impression that no product produced by Google has more than a small group of developers (besides the search engine).

Gmail is probably Google’s second most advanced service, and it is built on a distributed storage system that already existed as part of their search engine. Google Maps is well done, but the hard part (the data behind the maps) is licensed from NAVTEQ.

Do the best developers at Google work only on smaller independent projects? Are they doing internal projects that support the search engine? There’s a limit to what small groups can accomplish. Does Google have the logistical ability to produce another service as advanced as their search engine?

Not all products and services require highly sophisticated implementations. However, small competitors can clone simple applications (Google Calendar, Google Docs). This leaves Google vulnerable if its flagship product is toppled. If you took the search engine away they would not survive with what’s left.

Frustration a necessary part of coding?

I had a few frustrating days this week. I decided to reformat my hard drive using LVM. (I started by trying to reformat my root partition with LVM, then gave up on that.) I did this mainly to try Xen, which I promptly learned does not support the binary Nvidia drivers, requiring an uninstall.

I spent the day today trying to figure out why some of my unit tests were failing half the time, spewing build failure emails to the team. (I eventually figured it out after countless test runs). Then I fixed a bug, accidentally implementing a feature that turned out to be unnecessary.

I used to feel like days like these were wasted, as if I had lost all this time trying to get something to work when I could have been more productive doing real work. I’ve learned to stop feeling this way. Figuring something out yourself can be the only way to truly learn something cold. Reading about it is nice but there’s no substitute for debugging, experimenting, searching the forums, searching the bug databases for a fix, stepping through source, or throwing out a failed implementation and starting again. A lack of frustration really means you’re not pushing your boundaries enough.

Correct use of ConcurrentHashMap

ConcurrentHashMap has been pitched as a simple alternative to HashMap that eliminates the need for synchronized blocks. I had some simple event counting code that created count records on the fly. Although I could have used synchronized blocks for safety I used ConcurrentHashMap for this situation, partly for efficiency but mostly for the exercise. Going through this made me realize how carefully ConcurrentHashMap must be used for your code to work correctly and efficiently.

When using a HashMap, the standard idiom to add a value if it doesn’t exist is to use code that looks something like this:

synchronized (this) {
    // check-then-put is safe here only because the whole block holds the lock
    Record rec = records.get(id);
    if (rec == null) {
        rec = new Record(id);
        records.put(id, rec);
    }
    return rec;
}

If you were to simply replace HashMap with ConcurrentHashMap and remove the synchronized keyword your code would be exposed to a race condition. If another thread put a new Record into the map just after your call to get returned null, your put would overwrite that thread’s value, and the two threads would end up holding different Record objects for the same key. You could add synchronized back in but this defeats the purpose of using ConcurrentHashMap.
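The race is hard to trigger on demand, so the sketch below replays the unlucky interleaving by hand on a single thread. The two "callers" A and B, the Record stand-in, and the "login" key are all made up for the illustration; only get and put on ConcurrentHashMap are real API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class LostUpdateDemo {
    // Hypothetical stand-in for the article's Record class.
    static final class Record {
        final String id;
        Record(String id) { this.id = id; }
    }

    // Returns true if caller A ends up holding a Record that is not in the map.
    static boolean firstCallerHoldsOrphan() {
        ConcurrentMap<String, Record> records = new ConcurrentHashMap<>();
        String id = "login";

        Record seenByA = records.get(id);   // A checks: nothing there
        Record seenByB = records.get(id);   // B checks before A has put anything

        Record recA = new Record(id);
        records.put(id, recA);              // A stores its record
        Record recB = new Record(id);
        records.put(id, recB);              // B overwrites A's record

        // A keeps using recA, but later lookups return recB.
        return seenByA == null && seenByB == null && records.get(id) != recA;
    }

    public static void main(String[] args) {
        System.out.println("A holds an orphaned record: " + firstCallerHoldsOrphan());
    }
}
```

Any counts A adds to its record after this point are silently lost, because no other thread will ever see that object.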

To safely create values on demand you must use putIfAbsent (and avoid making extra calls to get in the process).

First check to see if a value with the key already exists in the map and use this value if it does. Otherwise, create a new value for the map and add it with putIfAbsent. putIfAbsent returns any existing value if there is one, otherwise null (this is why ConcurrentHashMap can’t contain null values).

private ConcurrentMap<String, Record> records =
     new ConcurrentHashMap<String, Record>();

private Record getOrCreate(String id) {
    Record rec = records.get(id);
    if (rec == null) {
        // record does not yet exist
        Record newRec = new Record(id);
        rec = records.putIfAbsent(id, newRec);
        if (rec == null) {
            // put succeeded, use new value
            rec = newRec;
        }
    }
    return rec;
}

If putIfAbsent does return a value, it’s the one that must be used. It may have already been used by other threads at this point. The new value created must be abandoned. Although it sounds wasteful this case should happen very infrequently.
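Here is a rough way to exercise the pattern under contention. It repeats the getOrCreate method from above inside a self-contained class and adds an AtomicLong counter to Record, which is an assumption on my part since the original Record is not shown. The class name, the "page-view" key, and the thread counts are likewise made up.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class EventCounter {
    // Hypothetical Record: the counter field is an assumption.
    static final class Record {
        final String id;
        final AtomicLong count = new AtomicLong();
        Record(String id) { this.id = id; }
    }

    private final ConcurrentMap<String, Record> records = new ConcurrentHashMap<>();

    Record getOrCreate(String id) {
        Record rec = records.get(id);
        if (rec == null) {
            Record newRec = new Record(id);
            rec = records.putIfAbsent(id, newRec);
            if (rec == null) {
                rec = newRec;   // our record won; otherwise use the existing one
            }
        }
        return rec;
    }

    // Counts events from several threads and returns the final total.
    static long countConcurrently(int threads, int eventsPerThread) {
        EventCounter counter = new EventCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < eventsPerThread; i++) {
                    counter.getOrCreate("page-view").count.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getOrCreate("page-view").count.get();
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(8, 10_000));
    }
}
```

Because every thread ends up incrementing the one Record that made it into the map, no increments are lost even if several threads race to create it.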

I’ve seen other code on the net that ignores the return value of putIfAbsent and makes another call to get at the end to figure out which value made it into the map (the new value created or a value from another thread). Although this will work it introduces an unnecessary lookup.
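For what it’s worth, on Java 8 and later the same create-on-demand step can be written with computeIfAbsent, which ConcurrentHashMap performs atomically and which only builds the value when the key is missing. A minimal sketch, again with a made-up Record class and a construction counter added purely so the behavior can be checked:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class ComputeIfAbsentDemo {
    // Counts how many Record objects were constructed (for illustration only).
    final AtomicInteger created = new AtomicInteger();

    // Hypothetical stand-in for the article's Record class.
    final class Record {
        final String id;
        Record(String id) {
            this.id = id;
            created.incrementAndGet();
        }
    }

    private final ConcurrentMap<String, Record> records = new ConcurrentHashMap<>();

    Record getOrCreate(String id) {
        // The mapping function runs only if no value exists for the key.
        return records.computeIfAbsent(id, Record::new);
    }

    public static void main(String[] args) {
        ComputeIfAbsentDemo demo = new ComputeIfAbsentDemo();
        Record first = demo.getOrCreate("login");
        Record second = demo.getOrCreate("login");
        System.out.println("same record: " + (first == second));
        System.out.println("records created: " + demo.created.get());
    }
}
```

The trade-off is that the mapping function runs while the map holds an internal lock on that entry, so it should stay short and must not touch the same map.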
