Why we love and use ZeroMQ at HootSuite

At HootSuite we are always on the lookout for technologies that will make our product better.

Recently we started exploring ZeroMQ (a.k.a. ØMQ, 0MQ, ZMQ). In this post, I will describe why ZeroMQ fits right into our technology stack.

ZeroMQ is often bundled into a class of products with RabbitMQ and others that implement the Advanced Message Queuing Protocol (AMQP). After all, ZeroMQ has MQ in its name and both ZeroMQ and the original AMQP were created by Pieter Hintjens. Be that as it may, grouping ZeroMQ with AMQP products is not quite the right model for some key reasons.

In particular, the fact that ZeroMQ is a socket library, and not a packaged solution like RabbitMQ, affords it an order of magnitude more flexibility and versatility than AMQP solutions.

For starters, ZeroMQ has 30+ language bindings available to support writing your system on top of the underlying socket library.

Another difference between ZeroMQ and AMQP products is the set of product architectures that each supports. AMQP relies on a central message broker through which all messages must pass. As mature and stable as these products may be, this broker is a single point of failure (updated: as Alvaro Videla pointed out, RabbitMQ provides a reliable HA feature to mitigate this issue, and we do make extensive use of Rabbit’s HA Cluster at HootSuite). ZeroMQ is distributed by nature. Messages do not have to travel through any single point of failure and can be delivered from source to destination directly.

RabbitMQ supports complex routing patterns out of the box, whereas ZeroMQ supplies a few simple routing patterns on top of which you can build your own additional patterns if required.

ZeroMQ is also more flexible at the transport layer. RabbitMQ message exchanges take place on top of TCP. ZeroMQ supports TCP, but also inproc, IPC, and multicast.

For these reasons, it’s not entirely accurate to group ZeroMQ in the same product bin as RabbitMQ. ZeroMQ’s flexibility enables it to solve a different class of problems.

At HootSuite we’ve been using, and will continue to use, RabbitMQ for certain subsystems, but ZeroMQ is gaining more and more traction.

So, why do we love ZeroMQ?

At its core, HootSuite’s web product is built on a Linux-Apache-MySQL-PHP stack (LAMP). One of the inherent challenges of the stack is that PHP does not support multi-threading, which means async operations do not come without help from other technologies. Gearman is one such async-enabling technology that has worked for us. Adding this kind of middleware, however, creates additional development and maintenance costs. On top of that, Gearman itself becomes a single point of failure for many use cases, and message routing control with Gearman is limited.

With ZeroMQ we can make PHP processes talk to one another locally through ZeroMQ’s IPC transport, or remotely over TCP. Over the past few months, we’ve been migrating some of our distributed tasks and async jobs from Gearman to ZeroMQ and have been very pleased with the results.

We’re pleased because ZeroMQ is fast, simple and reliable. We are able to push hundreds of thousands of 500-byte messages per second between Amazon EC2 m1.large instances. The limit is essentially the point at which a single connection saturates its bandwidth. In other words, even if we were to build all of our own socket manipulation logic, we could not send messages much faster. But ZeroMQ IS the socket manipulation logic, so it hides the complexities of TCP connections and buffering. That allows us to scale better, eliminate single points of failure and, most importantly, concentrate on building product instead of dealing with things like making socket handling resilient to brief network failures.

(Incidentally, this particular common problem of brief network failures is elegantly handled by ZeroMQ queuing. Messages are queued safe and sound locally on both the sending and receiving ends, so if the receiver is temporarily down and back again no messages are lost. For the most mission critical systems, messages can be written to disk.)

While LAMP is at the core of HootSuite’s web product, other languages like Scala are in production too. Over the long run, this set of languages will inevitably evolve, which is why we love ZeroMQ’s essentially language-agnostic design. As we transition more and more towards a Service Oriented Architecture (SOA), we expect ZeroMQ’s simplicity, flexibility and performance will provide a lot of the plumbing.

Because different services may be implemented using different languages, our service communication mechanism pretty much comes down to a handful of choices:

  • Thrift – it’s doable but we don’t like the complexity and overhead
  • REST – which is what we are doing right now, but it’s slow and only allows a request/response pattern
  • WS/SOAP – Yuck. Just out of the question.

I mentioned that ZeroMQ is very fast, but it also provides other benefits. It not only gives us the traditional client/server communication pattern but also offers pub/sub and pipeline out of the box, which will be extremely useful for things like event publishing, real time service registry/discovery, and bulk processing. ZeroMQ messages contain nothing but an array of frames (each frame is just a string of bytes), which gives us the flexibility to define our own simple communication protocol among our services that fits our needs, no more, no less. At HootSuite we’ve already initiated the project to move our existing SOA onto ZeroMQ and to build our future SOA on top of it.
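To make the idea of a frame-based protocol concrete, here is a minimal sketch in Python. The four-frame layout (version tag, service name, method name, JSON body) and every name in it are hypothetical, chosen only for illustration; this is not the protocol we actually run.

```python
import json

# Hypothetical version tag, invented for this sketch.
PROTOCOL_VERSION = b"DEMO01"


def encode_request(service, method, params):
    """Build a request as a list of byte-string frames."""
    return [
        PROTOCOL_VERSION,
        service.encode("utf-8"),
        method.encode("utf-8"),
        json.dumps(params, sort_keys=True).encode("utf-8"),
    ]


def decode_request(frames):
    """Validate the frame list and unpack it into (service, method, params)."""
    if len(frames) != 4:
        raise ValueError("expected 4 frames, got %d" % len(frames))
    version, service, method, body = frames
    if version != PROTOCOL_VERSION:
        raise ValueError("unsupported protocol version: %r" % version)
    return (
        service.decode("utf-8"),
        method.decode("utf-8"),
        json.loads(body.decode("utf-8")),
    )
```

With the pyzmq binding, a list like this could be handed to `socket.send_multipart(frames)` and read back with `socket.recv_multipart()`. Keeping the version tag in its own frame means a receiver can reject unknown versions without parsing the body.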

Like any new technology, ZeroMQ is not perfect. Even though it’s been around for two years and is gaining traction, community support has not achieved the critical mass of many open source projects that have been around as long. Perhaps this is because ZeroMQ solves a class of problems reserved for organizations that are truly confronting big scalability issues, and in reality that is a small subset of all SaaS companies.

As a result, we’ve had to work through ZeroMQ idiosyncrasies and behaviours that don’t feel intuitive. We’ve also had to put effort into building basic libraries on top of ZeroMQ for things like packet format and authentication, which could just as easily have been produced already by an open source community. We’ve achieved success through lots of experimentation.

One example of many is the work we’ve done to figure out what ZeroMQ packet distribution patterns are most applicable in different situations. A trivial example serves to illustrate the point. ZeroMQ comes with a DEALER pattern. Intuition suggests that a DEALER would only apply on a message broker or a load balancing node. But in fact, we’ve found that the DEALER pattern also makes a lot of sense on client side services for its asynchronous sending and receiving capabilities. The takeaway is that pattern characteristics are more important than names when figuring out what to apply to specific use cases.
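The asynchronous behaviour of DEALER on the client side can be sketched as follows. This assumes the pyzmq binding is installed; the endpoint name `inproc://demo-service` and the toy "upper-case the payload" service are made up for the example, and both sockets live in one process only to keep the sketch self-contained.

```python
import zmq


def dealer_round_trip(requests):
    """Send every request up front on a DEALER socket, then collect the replies."""
    ctx = zmq.Context()
    router = ctx.socket(zmq.ROUTER)          # stands in for the service
    router.bind("inproc://demo-service")     # hypothetical endpoint name
    dealer = ctx.socket(zmq.DEALER)          # the client
    dealer.connect("inproc://demo-service")

    # Unlike REQ, DEALER does not force send/receive lockstep,
    # so the client can fire off all requests without waiting.
    for request in requests:
        dealer.send(request)

    # Service side: ROUTER prepends a frame identifying the sender,
    # which is echoed back so the reply is routed to the right client.
    for _ in requests:
        identity, payload = router.recv_multipart()
        router.send_multipart([identity, payload.upper()])

    # The client collects its replies whenever it is ready.
    replies = [dealer.recv() for _ in requests]

    dealer.close(linger=0)
    router.close(linger=0)
    ctx.term()
    return replies
```

For example, `dealer_round_trip([b"ping", b"pong"])` sends both messages before reading anything back. A REQ socket in the same position would have to wait for each reply before sending the next request.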

So far our experiments have yielded success, and we’ve deployed a few ZeroMQ-based subsystems to production with fabulous results. As is our way, we will continue experimenting. Look for future posts sharing more ZeroMQ lessons, and of course the gotchas, as we continue to scale.

Until then, fasten your seatbelts. On the HootSuite engineering team, ZeroMQ has become an adverb. As in – it’s going to be ZeroMQ Fast.