What I want in a browser

As I’m sure everyone has heard, Google is releasing a new browser. It will be interesting to see whether and how it takes off. As a developer, it sounds like another headache. At least it’s based on WebKit, so it’s not a completely new rendering engine. Seeing the announcement got me thinking about what I would want in a browser.

  1. Garbage collection: All browsers seem to be filled with memory leaks. No matter how much the developers talk about the leak fixes they made and malloc implementations that reduce fragmentation, they still haven’t fixed the problem: keep a browser open for a while and it will use a ton of memory. It’s basically impossible to implement a true, compacting garbage collector for a program written in C, so the browser would have to be written from scratch in a different language, an enormous task. Lobo is written in Java and could be interesting at some point in the future. Maybe it would be possible to write a browser in D.
  2. Total keyboard support: I would love to see a browser written with the goal of making it usable with a keyboard only. Opera has spatial navigation. Browser plugins can be used to select links by typing in an ID that pops up next to each link. These still haven’t gotten to the point where it’s possible or comfortable to navigate with the keyboard. I’m sure 90% of people are more comfortable with a mouse anyway, so I’m not holding my breath on this one.
  3. Speed: Browsers still feel slow to me. Obviously some, if not most, of the responsibility lies with the site you’re browsing. But why do Firefox and Opera still peg the CPU, stutter, or hang on large pages? I have a very fast 8-core machine.

My preferred browser is actually not Firefox but Opera. It’s really designed more as a power user’s browser in my opinion. Plus once you start using mouse gestures you can’t go back. (I find myself using them in apps that don’t support them.) In general it’s closer to meeting my top three requirements but still definitely falls short on all three.

Java as a scripting language?

I came across a language comparison (which I wish I could still find) where the author presented a code sample in many different languages. The example he chose was computing the MD5 digest of a string. He showed a verbose Java version, some Python, and so on. Finally, PHP:

md5("Hello World");

This example, the author asserted, showed that PHP coders can do as much with one line as a Java coder can do in 20. Of course any problem is easy in any language if there’s an available library call that’s just right.
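To make the library point concrete, here is a rough sketch of how the standard `java.security.MessageDigest` API could be wrapped so that Java also gets a one-line call. The `Md5` class and its `hex` method are names invented for this example; they are not part of the JDK and not the code from the comparison.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: hides the MessageDigest boilerplate behind one call.
class Md5 {
    static String hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            // Format the 16-byte hash as 32 lowercase hex digits, keeping leading zeros.
            return String.format("%032x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Md5.hex("Hello World"));
    }
}
```

With a helper like this in place, the caller writes `Md5.hex("Hello World")`, which is about as short as the PHP version.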

When I have to write a program to work with lines of text I’ll usually turn to Ruby or possibly Groovy (sorry Perl – not going there anymore). Scripting languages like these are usually geared toward text processing and their built in libraries make these jobs easy. I wouldn’t jump into Java because it’s such a pain to do this kind of thing.

Ruby

# File.foreach closes the file once iteration finishes
File.foreach("myfile.txt") { |line|
    puts line if line =~ /blue/
}

Some standard Java to do the same thing:

import java.io.*;

public class ProcessWords {
  public static void main(String [] args) throws IOException {
    BufferedReader input = new BufferedReader(
        new FileReader("myfile.txt"));
    String line;
    while ((line = input.readLine()) != null)
        if (line.indexOf("blue") != -1)
            System.out.println(line);
    input.close();
  }
}

The Java code is obviously more verbose and uglier, matching many people’s opinion of Java in general. But is this because of the language or the API? Java’s APIs are all very general and give you complete control over everything you do. Ruby’s APIs make it easy to perform this common task.

You could easily create a TextFile class in Java with a linesMatching method that returned Iterable<String> allowing you to iterate over lines that matched a regular expression. Now the task is easy:

import java.io.IOException;

public class ProcessWords {
  public static void main(String [] args) throws IOException {
      for (String line : new TextFile("myfile.txt").linesMatching("blue")) {
          System.out.println(line);
      }
  }
}
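A minimal sketch of what such a TextFile class might look like follows. It is only one possible shape: the class is hypothetical, it reads the whole file eagerly to stay short, and it assumes the file is UTF-8 text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical convenience class; nothing like it ships with the JDK.
class TextFile {
    private final Path path;

    TextFile(String fileName) {
        this.path = Paths.get(fileName);
    }

    // Returns every line that contains a match for the regular expression.
    // The whole file is read up front, which keeps the sketch short but is
    // not suitable for very large files.
    Iterable<String> linesMatching(String regex) throws IOException {
        Pattern pattern = Pattern.compile(regex);
        List<String> matches = new ArrayList<>();
        for (String line : Files.readAllLines(path, StandardCharsets.UTF_8)) {
            if (pattern.matcher(line).find()) {
                matches.add(line);
            }
        }
        return matches;
    }
}
```

A lazier version could return an Iterator that reads one line at a time, at the cost of having to decide who closes the file.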

The designers of Ant decided programmers like writing in XML. But I don’t. I’d rather write in Java than XML. Would ant build scripts be better expressed in Java?

My hypothetical translation of a small Ant build file, with clean and compile targets, into Java:

public class Ant implements ProjectDefinition {
    public void targets(Project project) {  // must be public if ProjectDefinition is an interface
        project
            .add(new Target("clean") {
                    void execute(Tasks tasks) {
                        tasks.delete("build");
                    }
                })

            .add(new Target("compile") {
                    void execute(Tasks tasks) {
                        tasks.mkdir("build/classes");
                        tasks.javac("src").destdir("build/classes");
                    }
                });
    }
}

Both definitions are roughly the same size. The hypothetical Java version definitely has some cruft but is reasonably compact. Like Rake, though, the Java version would allow you to use the power of a real language in your build script: conditionals, loops, dynamic targets, variables. On top of that, a Java version would give you instant IDE support, debugging, and profiling.
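As a rough illustration of the "dynamic targets" point, the sketch below defines one clean target per module in a loop. Every type here (Project, Target, Tasks, DynamicTargets) is a hypothetical stand-in for the imagined build API, and the tasks only record what they would do instead of touching the file system.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical build definition: targets are created in a loop, not written out by hand.
class DynamicTargets {
    static Project define(String[] modules) {
        Project project = new Project();
        for (String module : modules) {
            project.add(new Target("clean-" + module) {
                void execute(Tasks tasks) {
                    tasks.delete(module + "/build");
                }
            });
        }
        return project;
    }

    public static void main(String[] args) {
        Tasks tasks = new Tasks();
        define(new String[] {"core", "web"}).run("clean-web", tasks);
        System.out.println(tasks.log);
    }
}

// Stand-in for the task library; it logs actions rather than performing them.
class Tasks {
    final List<String> log = new ArrayList<>();

    void delete(String dir) {
        log.add("delete " + dir);
    }
}

// A named unit of work, in the style of the anonymous classes used above.
abstract class Target {
    final String name;

    Target(String name) {
        this.name = name;
    }

    abstract void execute(Tasks tasks);
}

// Holds targets by name; add returns this so definitions can be chained.
class Project {
    private final Map<String, Target> targets = new LinkedHashMap<>();

    Project add(Target target) {
        targets.put(target.name, target);
        return this;
    }

    void run(String name, Tasks tasks) {
        Target target = targets.get(name);
        if (target == null) {
            throw new IllegalArgumentException("No such target: " + name);
        }
        target.execute(tasks);
    }
}
```

Doing the same in plain Ant XML would mean repeating the target for each module or reaching for an extension task.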

Both the line iteration and ant project definition examples show internal domain specific languages. The first domain is text processing. Scripting languages compete in this space, but I would argue it has a lot more to do with the libraries they provide than the language syntax. The second domain is build configurations. With the right APIs Java would do a very good job here too.

Of course Java does have several strikes against it when you actually consider its use for scripting:

  1. You have to compile all the files, which turns some people off. You could easily write a ClassLoader to compile the code on the fly, though.
  2. Java starts up slowly, which kills the performance of very quick scripts, if that matters.
  3. Java has no metaprogramming. This is probably the biggest issue, although reflection and code generation can help here. (You could generate the Ant task APIs for my build script example.)
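For the first point, compiling on the fly can be sketched with the standard `javax.tools` compiler API and a `URLClassLoader`. This is one possible approach, not a finished tool: the `ScriptRunner` class and the `HelloScript` sample are made up for the example, and it assumes the program runs on a full JDK, since a bare JRE has no system compiler.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles one source file at run time and loads the result.
class ScriptRunner {
    // Compiles `source` as the top-level class `className` and returns the loaded class.
    static Class<?> compileAndLoad(String className, String source) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler found; a full JDK is required");
        }

        // Write the source into a fresh temporary directory.
        Path dir = Files.createTempDirectory("script");
        Path file = dir.resolve(className + ".java");
        Files.write(file, source.getBytes(StandardCharsets.UTF_8));

        // Compile into the same directory; a non-zero status means compile errors.
        int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
        if (status != 0) {
            throw new IllegalArgumentException("Compilation failed for " + className);
        }

        // The loader is left open on purpose so the class can still load dependencies later.
        URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()});
        return loader.loadClass(className);
    }

    public static void main(String[] args) throws Exception {
        String source =
            "public class HelloScript { public static String greet() { return \"hi\"; } }";
        Class<?> script = compileAndLoad("HelloScript", source);
        Method greet = script.getMethod("greet");
        System.out.println(greet.invoke(null));
    }
}
```

This does nothing about the startup cost in the second point; it only removes the separate compile step.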

I think more user-friendly APIs could be written for Java for areas like text processing, XML parsing and creation, threading (even easier than java.util.concurrent), and file operations. Joda Time is a great example of a library that is clearly superior to Java’s built-in date and time classes in this respect.

When I write an API (in any language) I try to think of what would make the user happiest when coding against it, not what necessarily matches the implementation. Chained method calls for example don’t help at all in the implementation of a class, but returning this from each method can help the callers in some cases and is easy enough to justify. In some cases providing a little internal DSL instead of a collection of getters/setters and unrelated methods makes the code a lot more readable and helps keep the focus on the caller instead of the implementation.
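The chained-call idea can be sketched like this. The `JavacTask` class is hypothetical and only echoes the `javac(...).destdir(...)` call imagined in the build example; it assembles an argument list and runs nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical task class showing setters that return this for chaining.
class JavacTask {
    private final String srcDir;
    private String destDir = ".";
    private boolean debug = false;

    JavacTask(String srcDir) {
        this.srcDir = srcDir;
    }

    // Returning this costs the implementation nothing and lets callers chain.
    JavacTask destdir(String dir) {
        this.destDir = dir;
        return this;
    }

    JavacTask debug(boolean on) {
        this.debug = on;
        return this;
    }

    // Builds the equivalent javac command line; nothing is executed.
    List<String> arguments() {
        List<String> args = new ArrayList<>();
        args.add("javac");
        if (debug) {
            args.add("-g");
        }
        args.add("-d");
        args.add(destDir);
        args.add("-sourcepath");
        args.add(srcDir);
        return args;
    }

    public static void main(String[] args) {
        System.out.println(
            new JavacTask("src").destdir("build/classes").debug(true).arguments());
    }
}
```

The caller's code reads as one sentence, where plain setters would take three separate statements on a named variable.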

Thoughts on the Debian openssl vulnerability

I’ve been following the Debian openssl key generation vulnerability for the last few days. It’s an interesting bug that involved a lot of different people and it’s hard to know who should take the blame.

Recap

Luciano Bello discovered that the random number generator in Debian’s patched version of openssl was not actually very random. Instead of generating one of 2^1024 values it generated only one of 2^15. All ssh keys, OpenVPN keys, and certificates generated on a Debian based system from September 2006 on are therefore weak and guessable.
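To see why 2^15 possible values is so weak, here is a small illustration. It is not the openssl code: `java.util.Random` stands in for the broken generator, the 15-bit seed models the reduced state space, and the `WeakKeys` class is invented for the example.

```java
import java.util.Arrays;
import java.util.Random;

// Illustration only: a generator whose entire state comes from a 15-bit seed.
class WeakKeys {
    static final int SEED_SPACE = 1 << 15; // 32768 possible seeds

    // Derives a 16-byte "key" from a seed; the same seed always gives the same key.
    static byte[] keyFor(int seed) {
        byte[] key = new byte[16];
        new Random(seed).nextBytes(key);
        return key;
    }

    // Tries every seed until one reproduces the target key; returns -1 if none does.
    static int recoverSeed(byte[] target) {
        for (int seed = 0; seed < SEED_SPACE; seed++) {
            if (Arrays.equals(keyFor(seed), target)) {
                return seed;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] victim = keyFor(12345);
        int seed = recoverSeed(victim);
        System.out.println("found a seed that regenerates the key: " + (seed >= 0));
    }
}
```

However long the key looks, an attacker only has to try 32768 candidates, which takes a fraction of a second.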

Back in 2006 a Debian developer was testing openssl through valgrind and found that the library was using uninitialized memory, something which is always suspicious, especially in such a sensitive package. The developer started a discussion about this on the openssl-dev mailing list, which he believed was the proper channel to raise the issue.

Since no one on the list objected to the proposed change, the patch was applied to the Debian package. Unfortunately, the attempted patch was incorrect and the flaw was introduced.

One of the openssl developers has now basically said the openssl-dev mailing list was not actually the way to raise the issue — the developers did not monitor the list.

Some of the problems:

  1. A package maintainer made a change to a package that he probably should not have made. He should have waited for the openssl developers to make the change.
  2. The openssl developers didn’t provide a way to reach them, and did not monitor the only list they published in their documentation.
  3. The code in question was sketchy to begin with.

Don’t rely on uninitialized memory

I’m not too surprised about the communication issues between developers on this. After all, they are all volunteers and were trying to do what they thought was best.

I am surprised that the original code was written to rely on uninitialized memory as a source of random data. Uninitialized memory is in no way random. When a process is initialized all memory given out by the kernel is zeroed out. The process does not see random data from previous processes (which would be a serious security issue) and memory does not just become random over time.

There are probably some limited cases where some pieces of process memory could become more and more random over time (maybe if you were developing a multithreaded key generation server), but even then it would be unlikely. The stack would most likely have the same pattern at the same point of execution. For a tool like ssh-keygen, uninitialized memory would not be random. A compiler is always allowed to do whatever it wants with uninitialized memory and would be within its rights to zero it out or fill it with a fixed pattern.

This practice was raised years ago in a bug filed in 2003. The response to the bug was basically that using uninitialized memory is a fine thing to do and don’t ask about it again since it’s a FAQ.

The code should have been eliminated long ago to avoid legitimate confusion about its purpose.

Google Apps email list spam prevention

At work we moved all our mail services to Google Apps. Maintaining our own mail server was a waste of time and resources. Our server spent most of its time processing spam, and went down at least once or twice. This downtime was more than enough to justify the price of the move.

You give up a lot of flexibility once you switch over to Google’s servers, though. You can’t run procmail, and the email list functionality is limited.

We quickly found out that some of our older email lists were spam targets, and mail that came to them flooded everyone’s spam folder. Google’s spam filtering is very good, but if you get hundreds of spams a day it’s next to impossible to search through them for false positives.

To keep spam sent to email lists away from the users I set up a layer of indirection. (In this example I’ll use “staff” as the email list name). I made one new mailbox account called “mail-router” and added staff as a nickname of this account. Then I created a new email list with an obscured name (xx-staff) that held the addresses of the employees that were meant to receive the mail. Now, in the mail-router account I created a filter to forward mail addressed to staff to xx-staff.

Gmail doesn’t forward spam, so all spam to the staff alias gets left in the mail-router account. The two hidden addresses (mail-router and xx-staff) can both be changed at any time in case they too are targeted with spam.

Since every filter feature is available, you can also use filters to delete some mail, or to forward mail only if it matches a whitelist. To make a whitelist filter, use the “Has the words” field and enter an OR expression like this: from:gooduser@gooddomain.com OR from:gooduser2@otherdomain.com

The mail-router account costs an extra $50 per year (easily justified), but it can be used for any number of email lists. Hopefully Google will eventually introduce at least a simple whitelist/blacklist at the email list level.

Is Google actually producing?

Large software companies can accomplish projects that smaller groups cannot come close to replicating. These big projects become the flagship products of the company, and each large company has at least one of these. Examples are Sun’s Java environment, Microsoft’s .NET framework, VMware’s technology, or Oracle’s database.

Some companies have many of these complex projects. Microsoft has DirectX, the Microsoft C compiler, Visual Studio, Xbox 360, and Microsoft Office, to name a few.

Google has one project like this, their search engine.

In the recent past, Google has gone on a hiring binge, acquiring big-name open source developers and opening offices around the world. Expectations are high for a group of developers like this. My question is: what do these people do?

My impression of Google is that everyone is working on their own or in small groups. Developers choose their own direction, working on their side projects, either unofficially in their 20% free time or officially. Sometimes these projects are made public, leading to hit or miss offerings like video.google.com and toys like Google Sky. The services Google has come up with recently have been mostly smaller (though well implemented) productivity applications. You get the impression that no product produced by Google has more than a small group of developers (besides the search engine).

Gmail is probably Google’s second most advanced service, and it is built on a distributed storage system that already existed as part of their search engine. Google Maps is well done, but the hard part (the data behind the maps) is licensed from NAVTEQ.

Do the best developers at Google work only on smaller independent projects? Are they doing internal projects that support the search engine? There’s a limit to what small groups can accomplish. Does Google have the logistical ability to produce another service as advanced as their search engine?

Not all products and services require highly sophisticated implementations. However, small competitors can clone simple applications (Google Calendar, Google Docs). This leaves Google vulnerable if its flagship product is toppled. If you took the search engine away, they would not survive on what’s left.
