By George Gao on December 13, 2016
Hootsuite’s recent transition from Monolith to Microservices, like any large-scale change, has surfaced many challenges. As we move towards a SOA (Service Oriented Architecture), we need to build new infrastructure, tools, and pipelines. This new architecture requires our web applications to be composed of many small components, which we achieve through containerization. These containers then need to be orchestrated in order to truly run as a distributed system. An issue arose when our initial choice of container orchestrator was gradually outclassed by alternatives. Looking at where we wanted to be in the near future, we decided to migrate to a new platform. This post describes the technical decisions that drove the change.
What is container orchestration/scheduling? Why should you care?

Consider a web developer who just finished building a simple PHP application. He has made sure every component functions properly on his localhost server. He uploads his code and assets to a host on the internet for the public, but is he guaranteed that everything will function the same as it did locally? The answer is ‘no’. The system environment could be very different between his local server and the external internet server. There might be missing dependencies, or the operating system could be entirely different. This is where a containerization tool called Docker comes in. Docker solves the inconsistency problem by packaging the entire environment, along with the web app, into a “Docker image”. Any server with Docker installed can then run this image, regardless of its own environment.
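As a rough sketch, a minimal Dockerfile for a PHP application like the one above might look like this (the base image tag and source directory are illustrative, not from the original post):

```dockerfile
# Start from an official PHP image that bundles the Apache web server
FROM php:7-apache

# Copy the application code and assets into Apache's document root
COPY src/ /var/www/html/

# The base image already starts Apache and serves on port 80;
# EXPOSE just documents that fact
EXPOSE 80
```

Running `docker build` against this file produces an image that behaves the same on the developer’s laptop and on any internet-facing server with Docker installed, because the PHP runtime, web server, and code all travel together inside the image.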
Running a single Docker image on one web server is pretty straightforward. In a highly available SOA, however, there would ideally be many services, each packaged in its own Docker image, running on many servers scattered across large geographical areas. Important services expecting high traffic would have multiple copies of the same image running behind a load balancer. How do we allocate server resources like CPU and memory to all these Docker containers? How do we update and scale our services without downtime? These are the problems container orchestration platforms and schedulers try to solve. And like all computer science problems, the solution is adding a level of indirection.
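To make those questions concrete, here is a sketch of how one popular orchestrator, Kubernetes, expresses them: a Deployment manifest that asks for three copies of an image and declares the CPU and memory each copy needs. (The post has not yet named a specific orchestrator; the service name, image, and resource values below are purely illustrative.)

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service
spec:
  replicas: 3                  # run three copies behind a load balancer
  selector:
    matchLabels:
      app: example-service
  template:
    metadata:
      labels:
        app: example-service
    spec:
      containers:
        - name: example-service
          image: example/php-app:1.0   # the Docker image to schedule
          resources:
            requests:
              cpu: "250m"              # a quarter of a CPU core
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
```

The scheduler is the level of indirection: instead of an operator assigning containers to machines by hand, it finds servers with enough free CPU and memory for each replica, and a rolling update replaces the copies one at a time so the service stays up.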