I wrote an erlang implementation of a memcached server speaking the binary protocol over the weekend. You’re all probably wondering why.
It was half a learning exercise, and half a way to plug a memcapable interface into more applications. In its current form, I consider it a framework for taking existing erlang backend stores (e.g. dict, ets, dets, mnesia, riak) and putting a memcached interface on them. On its own, it’s a little boring.
The first implementation of the binary protocol was in my memcached-test project (which I still actively use as a reference implementation when playing around with features and testing clients and things). It’s a simple asyncore based server in python (with a sample synchronous client).
A bit later came the actual production C server version we all know and love (which also had my java client to talk to).
I wrote twisted memcached and built an S3-backed server one weekend.
That leads us back to the present erlang-based server.
If you have a basic understanding of erlang, you may find this implementation to be the best documentation of the protocol in existence.
In the main loop of a connection, all I’m doing is calling process_message in a loop with the connected socket, a reference to the storage server (an erlang gen_server implementation), and the result of a call to gen_tcp:recv(Socket, 24). That last call will either return {ok, SomeData} or {error, SomeReason}.
The only definition of process_message/3 I have is shown below. The only valid way to call this is when the third argument is the {ok, Data} tuple where Data in this case is a binary pattern. Some of the values are filled in (which means they must match for this function body to be invoked), and some are bindings which will receive the value.
In the code below (extracted from mc_connection.erl), you’ll see that the first 8 bits must be exactly the defined REQ_MAGIC constant, and then the next 8 bits are stored in OpCode, and so on.
Any attempt to process a message not in this form will result in the connection process crashing (the effect of which is your client being disconnected from it).
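To make the matching concrete, here is a minimal sketch of that kind of header match. It is not the code from mc_connection.erl: the module name, the function name parse_header/1, and the returned tuple are assumptions for illustration. The field layout is the 24-byte request header of the memcached binary protocol (magic, opcode, key length, extras length, data type, reserved, total body length, opaque, CAS), and 16#80 is the protocol's request magic.

```erlang
-module(mc_header_sketch).
-export([parse_header/1]).

%% Request magic byte from the memcached binary protocol.
-define(REQ_MAGIC, 16#80).

%% There is deliberately only one clause. It matches the {ok, Data}
%% result of gen_tcp:recv(Socket, 24) when Data is a well-formed
%% 24-byte request header. Anything else ({error, Reason}, a bad
%% magic byte, a short read) raises function_clause and crashes
%% the calling process.
parse_header({ok, <<?REQ_MAGIC:8, OpCode:8, KeyLen:16,
                    ExtraLen:8, _DataType:8, _Reserved:16,
                    BodyLen:32, Opaque:32, CAS:64>>}) ->
    {OpCode, KeyLen, ExtraLen, BodyLen, Opaque, CAS}.
```

Literal values in the pattern (the magic byte) act as guards, while the unbound names pick up whatever the client sent. Calling parse_header({error, closed}) fails with function_clause, which is the crash behaviour described above.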
So you can see how erlang easily allows us to rip the bits we need out of the header for dispatch, so the next thing is to ask for the remaining data (extra headers, key, and body) before dispatching to the storage server process.
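Fetching the rest of the message can be sketched the same way. This is an illustration under assumptions rather than the actual connection code: the module and function names are made up, and the socket is assumed to be a passive-mode socket opened with [binary, {packet, raw}, {active, false}]. In the binary protocol the total body length covers the extras, the key, and the value, in that order.

```erlang
-module(mc_body_sketch).
-export([read_body/4, split_body/3]).

%% Read the BodyLen bytes that follow the header and split them up.
%% gen_tcp:recv(Socket, 0) means "return whatever is available",
%% so an empty body must not touch the socket at all.
read_body(_Socket, ExtraLen, KeyLen, 0) ->
    split_body(ExtraLen, KeyLen, <<>>);
read_body(Socket, ExtraLen, KeyLen, BodyLen) ->
    {ok, Body} = gen_tcp:recv(Socket, BodyLen),
    split_body(ExtraLen, KeyLen, Body).

%% Sizes that are already bound can be used directly in a binary
%% pattern; whatever is left after extras and key is the value.
%% Inconsistent lengths fail the match and crash the caller.
split_body(ExtraLen, KeyLen, Body) ->
    <<Extra:ExtraLen/binary, Key:KeyLen/binary, Value/binary>> = Body,
    {Extra, Key, Value}.
```

With the three pieces in hand, dispatch is a gen_server:call/2 to the storage server with the {OpCode, ExtraHeader, Key, Value, CAS} tuple.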
A storage server process is a gen_server implementation whose handle_call implementations take a tuple of {OpCode, ExtraHeader, Key, Value, CAS} and return a mc_response record.
For an example storage server, consider my two flush implementations in my hashtable store (noting that flush has one 32-bit integer in extra header, no key, and no value):
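As a rough sketch of what a storage server of this shape could look like, here is a dict-backed gen_server with get, set, and two flush clauses. It is not the actual hashtable store: the module name, the fields of the mc_response record, the dict state, and the handling of the flush delay (treated as seconds and scheduled with erlang:send_after/3) are all assumptions for illustration. The opcodes (get 16#00, set 16#01, flush 16#08) and the key-not-found status (16#0001) come from the binary protocol.

```erlang
-module(mc_store_sketch).
-behaviour(gen_server).
-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical response record; the real mc_response may differ.
-record(mc_response, {status = 0, extra = <<>>, key = <<>>,
                      body = <<>>, cas = 0}).

-define(GET, 16#00).
-define(SET, 16#01).
-define(FLUSH, 16#08).
-define(KEY_ENOENT, 16#0001).

start_link() -> gen_server:start_link(?MODULE, [], []).

init([]) -> {ok, dict:new()}.

%% GET: no extras, no value. Reply with the stored flags and value.
handle_call({?GET, <<>>, Key, <<>>, _CAS}, _From, Store) ->
    Reply = case dict:find(Key, Store) of
                {ok, {Flags, Value}} ->
                    #mc_response{extra = <<Flags:32>>, body = Value};
                error ->
                    #mc_response{status = ?KEY_ENOENT}
            end,
    {reply, Reply, Store};
%% SET: extras carry 32-bit flags and a 32-bit expiration
%% (the expiration is ignored in this sketch).
handle_call({?SET, <<Flags:32, _Expiration:32>>, Key, Value, _CAS},
            _From, Store) ->
    {reply, #mc_response{}, dict:store(Key, {Flags, Value}, Store)};
%% FLUSH with a zero delay: drop everything right away.
handle_call({?FLUSH, <<0:32>>, <<>>, <<>>, _CAS}, _From, _Store) ->
    {reply, #mc_response{}, dict:new()};
%% FLUSH with a non-zero delay: acknowledge now, flush later.
handle_call({?FLUSH, <<Delay:32>>, <<>>, <<>>, _CAS}, _From, Store) ->
    erlang:send_after(Delay * 1000, self(), flush),
    {reply, #mc_response{}, Store}.

handle_cast(_Msg, Store) -> {noreply, Store}.

%% The delayed flush arrives as a plain message.
handle_info(flush, _Store) -> {noreply, dict:new()};
handle_info(_Info, Store) -> {noreply, Store}.
```

The two flush clauses differ only in the extra header pattern: the literal <<0:32>> is tried first, and any other 32-bit value falls through to the clause that binds Delay. A request whose shape matches no clause crashes the server with function_clause, in the same spirit as the connection code.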
That’s pretty much it. Even if nobody uses this code, it’s useful to me as a protocol reference since it’s easier to read than even the binary specification.