How Hootsuite does Microservices
While working on Hootsuite’s Facebook Real-Time service for the past few months, I have had an extremely mind-opening experience dealing with back end development and the underlying architecture which makes all of the internal services work together. I will highlight the architecture of a Hootsuite service with reference to Facebook Real-Time.
OverviewThe Facebook Real-Time service consists of two microservices; Facebook Streaming and Facebook Processing. A microservice is an approach of service oriented architectures which divides a service into smaller, more flexible, and specialized pieces of deployable software. This design feature was decided to ensure each service does one thing very well. It allows scaling of modules of a service according to the resource usage it needs, and allows greater control over it. One key reason is that Facebook Processing does a lot of JSON parsing which consumes more CPU. The way we utilize microservices, is by implementing each microservice as a specialized piece of software. In the case of Facebook Real-Time, Facebook Streaming specializes in collecting event payload data from Facebook, while Facebook Processing parses this data into specific event types.
Facebook StreamingFacebook Streaming collects large payloads of information from Facebook when certain real-time events occur; a streaming service. Data streaming, in this case, is the act of dynamic data being generated by Facebook on a continual basis and being sent to the Facebook Streaming Microservice. These events can include actions such as a post on a Facebook page, or the simple press of the ‘like’ button on a post. Facebook Processing parses these large payloads of data into specific event types.
Webhooks are used in the collection of events from a Facebook page. They register callbacks to Facebook to receive real-time events. When data is updated on Facebook, their webhook will send an HTTP POST request to a callback url belonging to the Facebook Streaming Microservice and this will send the event payloads.
Facebook ProcessingThe Facebook Processing Microservice parses the event payload from Facebook Streaming into more specific events. So a published post will be one kind of event, and a ‘like’ will be another type of event. Here, the number of events which are assigned event types, can be controlled. This is important as event payloads are large, and requires a lot of CPU to parse. Limiting the set of event payloads being parsed at once, reduces CPU usage. So instead of parsing the event payloads at the rate they are received from Facebook Streaming, it can consume a set of these event payloads at a time, while the rest are in queue until the set of consumed events are already parsed.
We have also built a registration endpoint into the Facebook Processing service. Instead of manually adding facebook pages to the database of pages registered for streaming, the endpoint can be called by a service and will register the specified Facebook page.
Event BusA payload consists of batches of events from different Facebook pages; we call this a ‘Raw Event’. These raw events are published from Facebook Streaming to the Event Bus. The Event Bus is a message bus which allows different micro-services Apache Kafka technology and consists of a set of Kafka clusters with topics. A topic corresponds to a specific event type and consumers can subscribe to these topics. The event data corresponding to these topics will be collected by the consumer. A service can consume events from the event bus or produce events to the event bus, or both! Each service is configured to know which topics to consume or produce.
Event messages are formatted using protocol buffers. Protocol buffers (Protobuf) are a mechanism for serializing structured data. So the structure of a type of event only needs to be defined once, and it can be read or written easily from a variety of data streams and languages. We decided to use protocol buffers because it has an efficient binary format and can be compiled into implementations in all our supported languages. These include Scala, PHP, and Python. Event payloads can also be easily made backwards compatible, which avoids a log of ‘ugly code’ compared to using JSON. These are just some key examples, but there exist many other benefits to using Protobuf. With the growing number of services at Hootsuite, there was the problem where the occurrence of an event would need to be communicated to other services, asynchronously. The value which the Event Bus provides is that a service where the event is happening does not need to know about all the other services which the event might affect.
Giving an overview of how the Facebook Real-Time Service interacts with the Event Bus, raw events are first published to the Event Bus from the Facebook Streaming Microservice and consumed by Facebook Processing, where the events are assigned specific event topics and sent back to the Event Bus. This allows for specific events to be consumed by other services. Additionally, the events stay on the event bus for a period of time until they expire. In our case, because we wanted to publish raw events and then parse them at a different rate than the events are being sent to us, this allowed us to use the Event Bus as a temporary storage tool and enable Facebook Real-Time to consist of two, separate microservices. It also allows us to offset the producer versus consumer load used for processing events in the Facebook Processing service.
Service DiscoverySkyline routing is used to send HTTP requests to the Facebook Real-Time service. Skyline is a service discovery mechanism which ‘glues’ services together; it allows services to communicate with each other and indicate if they are available and in a healthy state. This makes our services more reliable and we are able to build features faster. Skyline routing allows a request to be sent to a service without knowing the specific server which the service is hosted on. The request is sent, and redirected to the appropriate server corresponding to the service. It routes requests from clients to instances of a service, so it can reroute the request to another server if the service worker fails. This also includes when a service is being restarted, so if there are several instances of the service running, then requests will go to another instance of the service while the other instance is restarting. This functionality also allows for new instances of services to be set up if it is being overloaded and there are too many events in queue, improving response time.
In addition, the client can access all the services routed by skyline via a single URL based namespace (localhost:5040), by specifying the service name and associated path. The request will be routed to the corresponding address where the service is hosted.
In conclusion, a microservice publishes events to the event bus, which contains a large pool of events. These events can be consumed by another microservice, which will effectively use the event data. And the services can communicate with each other via HTTP using Skyline routing, a service discovery mechanism.
About the AuthorXintong is a Co-op Software Developer on Hootsuite Platform’s Promise team. She works closely with the Facebook Streaming Microservies which is built on Hootsuite’s deployment pipeline and uses the event bus to communicate event information to the Inbound Automation Service. Xintong also enjoys the arts and likes to invest her creativity into fine arts and music.