I’ve often thought that developers have machines that are entirely too fast. In my case, I’ve got a relatively recent dual core machine with an SSD. It’s awesome.
Except I find that some people running my software are on machines with considerably less capable disks.
In membase, disk performance differences can be noticeable. When I’m testing with an SSD and a customer is running with a 7200rpm disk (it happens), I can’t see the kinds of situations they run into from my development machine.
There are several options out there.
I played with cpulimit for a bit, but it was too coarse and really did awful things on my Mac, since it basically strobes the process with SIGSTOP and SIGCONT on an interval.
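To make "strobing" concrete, here is a rough sketch of that style of throttling: let the target run for a bit, freeze it with SIGSTOP, then thaw it with SIGCONT, over and over. This is not cpulimit's actual code; `strobe_process` and its parameters are names I made up for illustration.

```c
#define _DEFAULT_SOURCE   /* exposes usleep() and kill() on glibc */
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from cpulimit): let `pid` run for run_us
 * microseconds, freeze it for stop_us microseconds, and repeat `cycles`
 * times. Returns 0 on success, -1 if a signal could not be delivered. */
int strobe_process(pid_t pid, useconds_t run_us, useconds_t stop_us, int cycles) {
    for (int i = 0; i < cycles; i++) {
        usleep(run_us);                 /* target runs at full speed */
        if (kill(pid, SIGSTOP) != 0) {  /* freeze every thread in the target */
            return -1;
        }
        usleep(stop_us);                /* target makes no progress at all */
        if (kill(pid, SIGCONT) != 0) {  /* resume the target */
            return -1;
        }
    }
    return 0;
}
```

The whole process stops, not just its disk IO, which is why this approach is so coarse.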
I experimented with a ptrace-based solution to let me slow things down at a finer granularity, but it didn’t help, and it turns out that ptrace is kind of broken on OS X anyway, so it’s neither a portable thing to do nor really all that useful.
So I wrote a library interposer.
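If you haven't seen one before, an interposer is a shared library that gets loaded ahead of libc and defines a function with the same name as the one you want to intercept. The sketch below shows the general technique for lseek. It is a minimal illustration rather than the actual code, and the `SLOW_SEEK_US` environment variable is an assumption invented for the example.

```c
#define _GNU_SOURCE   /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical knob: microseconds to stall before every seek.
 * The variable name SLOW_SEEK_US is made up for this sketch. */
static useconds_t seek_delay_us(void) {
    const char *s = getenv("SLOW_SEEK_US");
    return s ? (useconds_t)strtoul(s, NULL, 10) : 0;
}

/* Same name and signature as libc's lseek, so calls from the program
 * resolve to this definition first. */
off_t lseek(int fd, off_t offset, int whence) {
    static off_t (*real_lseek)(int, off_t, int);
    if (!real_lseek) {
        /* Find the next lseek in library search order (normally libc's). */
        real_lseek = (off_t (*)(int, off_t, int))dlsym(RTLD_NEXT, "lseek");
        if (!real_lseek) {
            errno = ENOSYS;
            return (off_t)-1;
        }
    }
    useconds_t delay = seek_delay_us();
    if (delay > 0) {
        usleep(delay);   /* pretend the disk is slow */
    }
    return real_lseek(fd, offset, whence);
}
```

On Linux you would typically build this as a shared object (something like `cc -shared -fPIC -o slowseek.so slowseek.c -ldl`, file names hypothetical) and load it with `LD_PRELOAD`. On OS X the rough equivalent is `DYLD_INSERT_LIBRARIES`, which may also need `DYLD_FORCE_FLAT_NAMESPACE` set.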
It was pretty cool to see it do stuff, but I didn’t want to tell people to recompile the whole thing every time they wanted to change a delay or something. I decided to toss in Lua.
For example, what if you wanted a seek to take a full second 1% of the time (and, for the fun of it, log that it did)? You can write the following and feed it to the interposer:
    function before_lseek(fd, offset, whence)
       if math.random(1, 100) == 13 then
          io.write(string.format("Slowing a seek on fd=%d to %d (%d)\n",
                                 fd, offset, whence))
          usleep(1000000)
       end
    end
Now I can remember what it was like to have a rotating disk in my laptop again.
Right now, my immediate need is solved, but it’s pretty easy to add functionality, so I’m thinking about making it be a full-on fault injection framework. I looked at fiu briefly along my path and found that it was pretty interesting, but didn’t work on MacOS and was still a bit too invasive for where I wanted to be (which includes doing random stuff to third-party apps).
In addition to the before_lseek as above, I would imagine an after_lseek and perhaps even an around_lseek allowing for full AOP on your deployed C programs.
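To show what I mean by before, after, and around, here is a rough C sketch of how such hooks could wrap a seek. None of this exists yet; `seek_hooks`, `hooked_seek`, `count_before`, and `fail_around` are hypothetical names used only for illustration. The around hook receives the real function and decides whether to call it at all, which is what makes fault injection possible.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

typedef off_t (*seek_fn)(int fd, off_t offset, int whence);

/* Hypothetical hook table; NULL entries are skipped. */
struct seek_hooks {
    void  (*before)(int fd, off_t offset, int whence);
    void  (*after)(off_t result, int fd, off_t offset, int whence);
    off_t (*around)(seek_fn proceed, int fd, off_t offset, int whence);
};

/* Run the hooks around `real`. An around hook may call `proceed`
 * zero or more times, or replace the result entirely. */
off_t hooked_seek(const struct seek_hooks *h, seek_fn real,
                  int fd, off_t offset, int whence) {
    if (h->before) {
        h->before(fd, offset, whence);
    }
    off_t result = h->around ? h->around(real, fd, offset, whence)
                             : real(fd, offset, whence);
    int saved_errno = errno;   /* keep the seek's errno across the after hook */
    if (h->after) {
        h->after(result, fd, offset, whence);
    }
    errno = saved_errno;
    return result;
}

/* Example before hook: count how many seeks were attempted. */
int seeks_seen = 0;
void count_before(int fd, off_t offset, int whence) {
    (void)fd; (void)offset; (void)whence;
    seeks_seen++;
}

/* Example around hook: inject an IO error without touching the disk. */
off_t fail_around(seek_fn proceed, int fd, off_t offset, int whence) {
    (void)proceed; (void)fd; (void)offset; (void)whence;
    errno = EIO;
    return (off_t)-1;
}
```

A caller would pass the real lseek as `real`, for example `hooked_seek(&hooks, lseek, fd, 0, SEEK_SET)`.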
But for now, it just slows stuff down.