A better MySQL replication heartbeat

If you’ve used MySQL replication you’ve probably discovered that slave machines can lag behind the master. Replication can also break completely, requiring hours (or days) for the slave to catch up. Monitoring is required to catch issues before the slaves get too far behind.

Jeremy Zawodny has suggested a heartbeat mechanism to monitor the delay between the master and the slave. (I’m not sure if he came up with this solution). His suggestion is to periodically insert a row into a heartbeat table on the master. Then you poll the table on the slave, waiting for the row to appear. The length of time you spend polling is a rough estimate for how far behind the slave is at that moment.

There are a few problems with this solution. You have to write code to poll the slave. If you poll very frequently (every second) you’ll be polling too often if replication is actually hours behind. When do you stop polling? If you poll less frequently (every minute) your estimate gets that much less accurate. You also have to poll every slave if there is more than one.

A new solution

You can get MySQL to do the hard work for you by taking advantage of the difference in behavior between SYSDATE and CURRENT_TIMESTAMP. In almost all cases when a slave runs a SQL statement it temporarily sets the “current time” to the time the statement was executed on the master. If you insert NOW at 12:00:04 on the master the row will hold exactly 12:00:04 on the slave, no matter when it’s run. However, the SYSDATE function does not follow this behavior. It always uses the value of the slave’s system clock.

If you insert a row into the master with one column holding the value of NOW or CURRENT_TIMESTAMP and the other holding the value of SYSDATE, you can use the difference between the two values on the slave to see how far behind it is. If the slave is in sync, the two values will be identical. If the slave is one second behind, the column holding SYSDATE will be one second ahead of the column holding NOW. No polling is required to determine the current lag.

Implementation

First, create the heartbeat table on the master. master_time will hold the time the row was inserted on the master. slave_time will hold the time the row was inserted on the slave.

create table heartbeat(
    master_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    slave_time TIMESTAMP NOT NULL
) ENGINE=MyISAM;

Periodically (I do it every minute), insert a row into the heartbeat table on the master.

insert into heartbeat(slave_time) values(SYSDATE());

To see the current replication lag at any time, calculate the difference between the current time and the time the most recent row was inserted on the master. (This estimate can be off by up to one heartbeat period.) Run this query on a slave.

select timediff(NOW(), max(master_time)) from heartbeat;

You can see how the replication delay changed over time by selecting all rows within a range. This example shows delay for every minute of the current day. The delays are accurate to within 1 second (the max resolution of MySQL).

select master_time, timediff(slave_time, master_time) from heartbeat where DATE(master_time) = DATE(NOW()) order by master_time;
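If you read the heartbeat rows from application code instead of running timediff in SQL, the same subtraction is easy to do yourself. This is only a minimal sketch: the HeartbeatLag class and its lagSeconds method are hypothetical names, the sample timestamps are made up, and it assumes the values arrive as strings in MySQL's default "yyyy-MM-dd HH:mm:ss" display format.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: mirrors timediff(slave_time, master_time) for one heartbeat row.
final class HeartbeatLag {
    // Assumes the default MySQL TIMESTAMP display format.
    private static final DateTimeFormatter MYSQL_TIMESTAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Whole seconds between the master's NOW() and the slave's SYSDATE().
    static long lagSeconds(String masterTime, String slaveTime) {
        LocalDateTime master = LocalDateTime.parse(masterTime, MYSQL_TIMESTAMP);
        LocalDateTime slave = LocalDateTime.parse(slaveTime, MYSQL_TIMESTAMP);
        return Duration.between(master, slave).getSeconds();
    }

    public static void main(String[] args) {
        // Sample rows, invented for illustration.
        System.out.println(lagSeconds("2008-09-01 12:00:04", "2008-09-01 12:00:04")); // in sync
        System.out.println(lagSeconds("2008-09-01 12:00:04", "2008-09-01 12:00:05")); // one second behind
        System.out.println(lagSeconds("2008-09-01 12:00:04", "2008-09-01 13:30:04")); // 90 minutes behind
    }
}
```

A negative result would mean the slave's clock is behind the master's, so the numbers are only as good as the clock synchronization between the two machines.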

Improvement as a developer

I was doing some thinking about areas where I’m lacking as a developer. There are many, but I’ll list a few here.

Testing Improvements

I think I’ve come a long way in the last few years in the area of testing. I’m disciplined about testing now and don’t write code without writing unit tests at the same time. If I’m modifying existing code without any tests I spend the time to write tests first, even if this takes longer than making the change. That said, my tests still need work.

Most of my tests hit the positive cases. I achieve good coverage and verify most features of the code but only really in the positive cases. I need to write more negative test cases.

I write unit tests which occasionally end up more like functional/integration tests. I don’t always set out to write good, complete functional tests. Most of the bugs I find after a release would be found with good functional tests written at a higher level than my unit tests, verifying the interaction between classes. My unit tests do catch a whole class of bugs that used to slip through before I started writing them, but going a level up as well would catch more (and would also be a good source of documentation).

I have a habit of testing for performance only when it’s too late. If I’m writing code that I know will be performance critical I’ll profile. It’s the code I had no idea would be a bottleneck that ends up being the problem. This ties back to having better functional tests. These would make it possible to profile more realistic chunks of code. Profiling a unit test usually doesn’t help discover the real bottlenecks.

Like performance testing, multi-threaded testing is something I need to be more proactive about. The biggest problem is it’s hard to do. You really can’t write a test to prove code operates correctly in the presence of multiple threads. If your test does uncover a problem, great. If it doesn’t it might just mean you haven’t looked hard enough. My defense against multi-threading issues is to be extremely careful when writing the code in the first place. I carefully double check the code to make sure everything is properly synchronized and thread-to-thread communication is safe. Static analysis tools like Findbugs also help. Adding good multi-threaded testing would be another safeguard.
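For what it's worth, the usual shape of such a test is a stress test: release many threads at once against the shared object and check an invariant afterwards. The sketch below is one possible version, not a recipe. CounterStressTest and hammer are hypothetical names, and an AtomicInteger stands in for whatever code is really under test. A pass only means no lost update was observed in this run; it doesn't prove the code is thread-safe.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress test: many threads increment one shared counter.
final class CounterStressTest {
    static int hammer(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger(); // stand-in for the code under test
        CountDownLatch start = new CountDownLatch(1); // holds workers so they begin together
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet();
                }
            });
        }
        start.countDown(); // release all workers at once to maximize contention
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        int expected = 8 * 100_000;
        int actual = hammer(8, 100_000);
        System.out.println(actual == expected ? "no lost updates seen" : "lost updates: " + actual);
    }
}
```

Swapping the AtomicInteger for a plain int field would make this test fail on most runs, but not necessarily every run, which is exactly the problem with this kind of testing.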

While I avoid copy and paste and code duplication in my code as much as possible, it tends to end up in my tests more than I would like it to. If I want to make a second test like the first I try to extract methods out of the first as much as possible to make the test small. Then I clone the first test and modify it to make it test the new condition. This does lead to some duplication. It’s usually something I can live with but I’d like to be more disciplined about avoiding it.

Project Issues

I find that I get distracted more frequently when I hit a tough part of a project. I bring up the code and start working on it, then find myself reading my news reader or trying to fix some unimportant issue with my machine (maybe getting sound working in Flash). I also get distracted like this when the requirements of what I’m working on are not yet well defined. I don’t know which way to start moving forward and so I just end up sitting still.

The last 10% is always the toughest in any project. Even when the code is out the door it’s not really complete. There are always a handful of small but difficult issues that need addressing. Too often I find myself relieved to have made the release and start working on something new instead of pushing to get that last 10% completed.

Skills

If I could choose one area where I would most like to improve it would be writing. I consider myself an average writer but I would like to work to become a good writer. Trying to write more is one way I’m working on this but it seems like something you’re either born with or you’re not.

I’m not good at delegating. My first instinct when I see an interesting problem is to solve it myself. When I see a boring or tedious problem I want someone else to do it, but I usually end up deciding it would take me longer to explain the issue than to do the work and end up doing it myself. I would like to be more selective about the work I do myself.

Making improvements

I could list ten things here I’d like to do to address the weaknesses listed above but that’s kind of like making a New Year’s resolution. I’m going to start by trying harder to avoid distraction and go from there.

Two useless Java customs

A couple common Java practices that annoy me.

Calling super() in a constructor

If you use Eclipse to generate a constructor it will look something like this:

public MyClass(int x) {
    super();
    this.x = x;
}

Eclipse always adds the call to super(). PMD has a rule for this (in its Controversial Rules category). The explanation is simply “It is a good practice to call super() in a constructor”.

If you omit super() in a constructor the compiler adds the call to super() for you. The compiled code is identical, whether you write the call or not. Effective Java doesn’t seem to address this. Is there any benefit to writing the call in all your constructors?
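A small demo makes the equivalence visible at runtime. ImplicitSuperDemo and its nested classes are hypothetical names used only for illustration. The sketch shows that the superclass constructor runs either way; it doesn't compare bytecode, which you could do separately with javap -c.

```java
// Hypothetical demo: the superclass constructor runs with or without an explicit super().
final class ImplicitSuperDemo {
    static final StringBuilder log = new StringBuilder();

    static class Base {
        Base() { log.append("Base;"); }
    }

    // No explicit super(): the compiler inserts the call.
    static class WithoutSuper extends Base {
        WithoutSuper() { log.append("WithoutSuper;"); }
    }

    // Explicit super(), the way Eclipse generates it.
    static class WithSuper extends Base {
        WithSuper() {
            super();
            log.append("WithSuper;");
        }
    }

    // Builds one object and returns the order in which the constructors ran.
    static String construct(boolean explicit) {
        log.setLength(0);
        if (explicit) {
            new WithSuper();
        } else {
            new WithoutSuper();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(construct(false)); // Base;WithoutSuper;
        System.out.println(construct(true));  // Base;WithSuper;
    }
}
```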

Supplying messages to JUnit assert calls

JUnit assert methods support a message argument, displayed if the test fails:

assertEquals("a != b", a, b);

This message does make a test failure slightly easier to read but I don’t believe the benefit outweighs the tedious overhead of typing messages for all your asserts. Tests should fail rarely, and when they do they almost always require stepping through or at least looking at the source of the test to figure out why they are failing.

I always omit the message unless it’s explaining something that isn’t obvious when looking at the code.

32 bit JDK on a 64 bit Ubuntu system

If you have more than 3GB on your machine and you’re running Ubuntu you’ve probably had to figure out how to access that additional memory – the default Ubuntu desktop kernel will only allow access to the first 3GB. You can install the server kernel, but that’s been tuned for a server with different latency settings, etc. You can recompile the desktop kernel with HIGHMEM64 set, but then you’re stuck building the video drivers yourself.

My latest strategy has been to use the 64 bit kernel. 64 bit support is not bad now and most apps run normally. Of course they use about double the memory. If you’re running a lot of Java processes this 64 bit tax is very noticeable. For my needs, 32 bit Java is fine, even with a 64 bit kernel. Ubuntu/Debian ship a 32 bit JRE (ia32-sun-java6-bin). This package provides only the runtime environment (no javac) and the client VM so it has limited usefulness for a developer.

To install the 32 bit JDK from Sun on a 64 bit system you can use java-package. I’ve been running Eclipse and all my development applications and finally have some free memory again.

Installation

First, download the latest 32 bit JDK (not JRE) from Sun. At the time this was jdk-6u7-linux-i586.bin for me.

Install java-package:

sudo apt-get install java-package

Now use java-package to build a .deb package from the binary you downloaded. You have to trick it into building the 32 bit package:

DEB_BUILD_GNU_TYPE=i486-linux-gnu DEB_BUILD_ARCH=i386 fakeroot make-jpkg jdk-6u7-linux-i586.bin

This should generate a .deb package. For some reason the package name has the _amd64 suffix. Install the package:

sudo dpkg -i sun-j2sdk1.6_1.6.0+update7_amd64.deb

Use update-alternatives to select the new JDK. It was installed at /usr/lib/j2sdk1.6-sun for me.

sudo update-alternatives --config java

If you run java -version you should see the correct version:

java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
Java HotSpot(TM) Server VM (build 10.0-b23, mixed mode)
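The version banner doesn't say outright whether the VM is 32 bit. One way to check is to ask the JVM itself. This is a rough sketch with a hypothetical class name, JvmBitness. The os.arch property is standard, but sun.arch.data.model is specific to Sun/HotSpot-style VMs, so the code falls back to "unknown" when it's missing.

```java
// Hypothetical check: report the architecture the running JVM was built for.
final class JvmBitness {
    static String describe() {
        // Standard property; on a 32 bit JVM this is expected to name a 32 bit arch such as i386.
        String arch = System.getProperty("os.arch");
        // Sun/HotSpot-specific property holding "32" or "64"; may be absent on other VMs.
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        return "os.arch=" + arch + ", data model=" + dataModel;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run it with the java binary that update-alternatives selected. If the data model reports 64, the 64 bit VM is still the one on the path.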

32 bit Eclipse

I had to reinstall the 32 bit version of Eclipse (since SWT contains native code). I also had to delete my ~/.eclipse directory or Eclipse wouldn’t start (this requires reinstalling new versions of any plugins). Finally, add the new JRE in Java->Installed JREs using the install location (/usr/lib/j2sdk1.6-sun) and select it as the default.

What I want in a browser

As I’m sure everyone has heard, Google is releasing a new browser. It will be interesting to see how/if it takes off. As a developer it sounds like another headache. At least it’s based on WebKit so it’s not a completely new rendering engine. Seeing the announcement got me thinking about what I would want in a browser.

  1. Garbage collection – All browsers seem to be filled with memory leaks. No matter how much the developers talk about implementations of malloc that reduce fragmentation and leak fixes they made they still haven’t fixed the problem – keep a browser open for a while and it will use a ton of memory. The reason is that it’s basically impossible to implement a true, compacting garbage collector for a program written in C. This means the browser has to be written from scratch in a different language, an enormous task. Lobo is written in Java and could be interesting at some point in the future. Maybe it would be possible to write a browser in D.
  2. Total keyboard support – I would love to see a browser written with the goal of making it possible to use with a keyboard only. Opera has spatial navigation. Browser plugins can be used to select links by typing in an ID that pops up next to each link. These still haven’t gotten to the point where it’s possible or comfortable to navigate with the keyboard. I’m sure 90% of people are more comfortable with a mouse anyway, so I’m not holding my breath on this one.
  3. Speed – Browsers still feel slow to me. Obviously some/most of the responsibility lies with the site you’re browsing. But why do Firefox and Opera still peg the CPU, stutter, or hang on large pages? I have a very fast 8 core machine.

My preferred browser is actually not Firefox but Opera. It’s really designed more as a power user’s browser in my opinion. Plus once you start using mouse gestures you can’t go back. (I find myself using them in apps that don’t support them.) In general it’s closer to meeting my top three requirements but still definitely falls short on all three.
