Tagging Infrastructure Design
The Inbox Project
For my co-op term, my team and I tackled the Inbox project, a platform that lets users read private messages across multiple social networks – think a Gmail or Facebook Messenger type of experience for all of your social networks. On top of basic messaging features, we wanted to integrate Inbox with existing Hootsuite functionality, including Hootsuite’s tagging feature. This would let users add tags to private messages, remove them, and filter on them. This was an interesting problem that I tackled, and it involved some high-level architecture decisions. Here is a general overview of all the relevant pieces before we made any changes.
Our requirements were:
- We must be able to see tags on messages even if they were applied in another part of the Hootsuite system
- We must be able to apply tags and they should show up in other parts of Hootsuite
- We want to be able to filter and search on tags relatively quickly
- Ideally this should all be event-based, and any changes to tags made elsewhere should show up in the inbox without refreshing the browser
Background
The dashboard is our original monolith service that serves a good portion of Hootsuite. It is already very complex and we do not want to add any functionality to the dashboard – as a company we’re trying to reduce our PHP dashboard to a thin layer, with most work being done by smaller services written in statically typed languages. However, part of the tagging infrastructure is currently in the dashboard and is exposed through a set of endpoints called Service Oriented Monolith, or SOM. There is also a separate Tag Service, which only contains the tags themselves (i.e. tag 123 has label “customer complaint”), but not the association between tags and messages (i.e. message 789 has tags 123 and 456). It’s a relatively new service and will have quick response times. Finally, all tagging-related events are published on the message queue and are consumed by other areas of Hootsuite.
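To make the split concrete, here is a minimal sketch of the two pieces of data described above. The names (`Tag`, `TAGS`, `MESSAGE_TAGS`, `labels_for_message`) and the label for tag 456 are made up for illustration; this is not the real schema of either system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """What the Tag Service stores: the tag itself."""
    tag_id: int
    label: str


# Tag Service side: tag id -> tag. The label for tag 456 is hypothetical.
TAGS = {
    123: Tag(123, "customer complaint"),
    456: Tag(456, "follow up"),
}

# Dashboard side: message id -> ids of the tags applied to that message.
MESSAGE_TAGS = {
    789: {123, 456},
}


def labels_for_message(message_id: int) -> list[str]:
    """Join the two stores to get the labels to show on a message."""
    tag_ids = MESSAGE_TAGS.get(message_id, set())
    return sorted(TAGS[tag_id].label for tag_id in tag_ids if tag_id in TAGS)
```

Showing tags on a message needs both halves: the associations to know which tags apply, and the tags themselves to know what to display.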
Solutions We Considered
We considered three approaches to adding tagging functionality to Inbox, weighing the time needed, code quality, and performance.
The first solution was to simply ingest tagged messages from the message queue and store them with our own messages. This solution is relatively simple to implement because any actions on tags are published over the message queue. We would simply need to listen for any CRUD operations on tags and store the results in our own message store. However, this plan falls apart when we have to write our own tags. We would have to make sure that the dashboard listens to any events performed in Inbox, which means adding functionality to the dashboard. This is largely frowned upon because Hootsuite as a whole is trying to break features off the dashboard instead of adding them. Furthermore, we would be trying to keep two separate databases synchronized, which is redundant and a waste of money.
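As a rough sketch of what the read side of this first approach could look like, the handler below applies tagging events to a local store. The event shape (`type`, `message_id`, `tag_id`) and the event names are assumptions for illustration, not the real format on our message queue, and an in-memory dict stands in for the database.

```python
from collections import defaultdict

# Inbox-side store: message id -> set of tag ids (in-memory stand-in for a database).
message_tags = defaultdict(set)


def apply_tag_event(store, event):
    """Apply one tagging event to the local store.

    Set operations make the handler idempotent, so a redelivered
    event leaves the store unchanged.
    """
    message_id = event["message_id"]
    tag_id = event["tag_id"]
    if event["type"] == "tag_added":
        store[message_id].add(tag_id)
    elif event["type"] == "tag_removed":
        store[message_id].discard(tag_id)
    else:
        raise ValueError(f"unknown event type: {event['type']}")


# Hypothetical events, in the order they might arrive from the queue.
events = [
    {"type": "tag_added", "message_id": 789, "tag_id": 123},
    {"type": "tag_added", "message_id": 789, "tag_id": 456},
    {"type": "tag_added", "message_id": 789, "tag_id": 456},  # duplicate delivery
    {"type": "tag_removed", "message_id": 789, "tag_id": 123},
]
for event in events:
    apply_tag_event(message_tags, event)
```

Note that this only covers reading. Nothing here lets Inbox write a tag in a way the rest of Hootsuite would see, which is exactly where the plan falls apart.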
The second solution was to go through SOM, the set of endpoints on the dashboard exposed to internal services. This solution was appealing because it involves very little code. We would simply call the API every time we wanted to add or remove a tag. This way, every time a message passes through the inbox, we would enrich it with tags from SOM, and every time someone tags a message, we would send the write request through SOM. However, developers are discouraged from using SOM endpoints, as SOM is considered legacy and Hootsuite as a whole is moving towards a microservice architecture. Furthermore, supporting search and filter by tags would mean adding more functionality to the dashboard, which is not ideal since we want all tagging logic to live inside the Tag Service. We would also be hitting this endpoint with requests any time anyone looks at a message, which is bad since the dashboard is already under heavy load and, as a large PHP monolith, each request it services is relatively expensive.
The third solution was to migrate message tags into the Tag Service. This is intuitive because the Tag Service should be responsible for anything tag-related, and it would become a one-stop destination for tags. However, this solution involves a database migration, which is non-trivial to execute. It would also make the work more complex, since we would have to port all our logic from PHP to our new service. Finally, it could affect other consumers of tagging-related events if we change any of the logic during the migration, whether accidentally or on purpose. This solution would take the longest to implement by far, but it would cleanly separate the logical components of the codebase.
The Solution We Chose
We went with a mixture of the first solution and the third solution. We decided to go forward with the DB migration because message tags should not live in the dashboard’s DB. This would clean up a lot of tech debt and ensure that any new integration with tagging would be straightforward and clean, since the Tag Service will be completely decoupled from the dashboard. However, we would still consume tagging events to allow Inbox to be event-driven, and to allow for complex search queries on tags.
Our plan of action would be:
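One way consuming events could support tag queries is a local inverted index from tag to messages, sketched below. This is an illustration under assumptions, not how Inbox actually stores or queries tags: `TagIndex` and its methods are hypothetical names, and message 790 is made up for the example.

```python
from collections import defaultdict


class TagIndex:
    """Inverted index (tag id -> message ids), kept up to date from tag events."""

    def __init__(self):
        self._messages_by_tag = defaultdict(set)

    def add(self, message_id, tag_id):
        self._messages_by_tag[tag_id].add(message_id)

    def remove(self, message_id, tag_id):
        self._messages_by_tag[tag_id].discard(message_id)

    def with_all(self, tag_ids):
        """Messages carrying every one of the given tags."""
        sets = [self._messages_by_tag.get(tag_id, set()) for tag_id in tag_ids]
        return set.intersection(*sets) if sets else set()

    def with_any(self, tag_ids):
        """Messages carrying at least one of the given tags."""
        return set().union(
            *(self._messages_by_tag.get(tag_id, set()) for tag_id in tag_ids)
        )


index = TagIndex()
index.add(789, 123)
index.add(789, 456)
index.add(790, 123)  # hypothetical second message
both = index.with_all([123, 456])    # messages tagged with 123 AND 456
either = index.with_any([123, 456])  # messages tagged with 123 OR 456
```

Because the index lives next to our messages, filters like these can be answered without a round trip to another service per message, while the Tag Service stays the source of truth for writes.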
- migrate all the logic to the Tag Service
- migrate the tagging data to the tag service db
- point existing clients to the Tag Service
- deprecate the dashboard endpoints
Current Progress
We have already begun to consume tag events and have started to migrate logic to the Tag Service. Once we are done, our tagging infrastructure will look like this:
We can see that centralizing all tagging functionality greatly reduces the dependencies and complexity of the system. Now any feature that requires tagging data can simply talk to the Tag Service, and the dashboard will have nothing to do with tagging, which falls in line with the microservice principle that each service marks a distinct separation of responsibility.
About the Author
Henry is a co-op :p. He enjoys powering the human connection through social media. I’m on the right.