Distributed Coordination with ZooKeeper

Written on August 27, 2014

For my 8-month co-op term at Hootsuite, I have been working on the Streaming team. The Streaming infrastructure allows services to receive real-time updates for a large user set. Streaming is responsible for delivering push notifications, analytics, and interaction history to subscribed Hootsuite users. One example is receiving a push notification on your mobile device for a Twitter retweet or mention from the Hootsuite application. Streaming is a distributed system and every complex distributed application needs a coordination and orchestration system of some sort, so the team decided to integrate ZooKeeper into the Streaming infrastructure back in 2011. ZooKeeper is a software project of the Apache Software Foundation, providing an open source distributed configuration service, synchronization service, leader election, and naming registry for large distributed systems.

One of my projects during my co-op term was fixing the previous implementation of ZooKeeper, as a number of things were not working properly. Originally, ZooKeeper was embedded within the Streaming application, and there were a number of issues with this implementation.


Life as a Hootsuite Co-Op

Written on August 14, 2014

In 8 months, I’ll be graduating from the University of British Columbia with a degree in Physics and Computer Science. I’ll be looking for jobs shortly thereafter, armed with a résumé to convince future employers that I am an amazing iOS developer. Hootsuite will be at the top of my ‘Work Experience’ section, along with a list of the various technologies I used and the tasks I was assigned. It’ll probably look something like this:

Hootsuite – iOS Developer Co-op
January – August 2014

Worked with Xcode 5 and 6, Objective-C, Swift, and iOS 6, 7, and 8. Contributed to new features and bug fixes for the Hootsuite app, and helped it become featured in the iOS App Store. Gave presentations, collaborated with teammates, and wrote automated tests using KIF. Used Facebook, Twitter, and Google APIs, communication patterns such as KVO, delegation, and notification, and Interface Builder with auto-layout.


Automate everything – start up and shut down the apps you need in seconds

Written on August 7, 2014

Do you hate coming to the office, firing up your favourite text editor, getting ready to do some work, and… you forgot to start your Vagrant machine? Now you have to open up the terminal, type vagrant up, wait 90+ seconds for the beast to load and… it doesn’t work because you forgot to connect to the VPN and Puppet cannot correctly provision the box without it. So, you do that too.

Now that everything is set up, you code the entire day, pack your things, arrive home, start browsing threads on Hacker News and… the low battery warning comes up because you forgot to shut down Vagrant.


CSS at Hootsuite

Written on August 5, 2014

We’ve really enjoyed reading the posts from Mark Otto, Ian Feather, Chris Coyier and others about CSS on the projects they work on. I’ve personally learned a lot from those individuals and picked up some more tricks from those posts about how CSS works on large projects.

This is a little bit about how CSS works at Hootsuite and the processes we have found work best for us at this point in time.

DOMObserver – react to changes in the DOM

Written on July 15, 2014

What: DOMObserver – a JavaScript library for easily tracking DOM mutations.
Code: https://github.com/uberVU/dom-observer/
Demo: http://ubervu.github.io/dom-observer/

Here at Hootsuite, we love to open source components that we develop for ourselves and that might be useful to others. We feel that this is a great way to give back and contribute to the Open Source ecosystem.

While performing some refactoring on our front-end code, we managed to isolate a component that we use for observing DOM changes, as well as reacting to those changes. The reason we developed this code was that alternatives such as mutation-summary did not do all we wanted, and even now, they don’t offer fallback support, which we need since we support older versions of IE as well.

Enter DOMObserver, a JavaScript library that lets you observe and react to DOM changes in an easy and friendly manner. It uses either the new MutationObserver API or the older Mutation Events API, whichever the browser supports, and offers you control in a unified way. We use it for managing the widgets we show on a page, but you can use it for any changes to the DOM that you care about.

DOMObserver exposes an API that lets you set handlers for when:

  • a node is added to the DOM
  • a node is removed from the DOM
  • a node’s attributes are changed

The handlers receive the mutation target as a parameter, so you know which node changed. Furthermore, if you want to do more advanced things, the handlers also receive the mutation (event) object, so you can tell exactly what has changed.

As far as performance goes – you can set the handler on a specific DOM node, so there is no need to listen on the whole document. Also, for attribute changes you can filter the attributes you care about (so for example you only get notified when data-loaded is changed, but not for src changes). When you are no longer interested in the DOM changes (say your app reaches a constant, immutable state), you can simply shut down the observer.
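The handler routing and attribute filtering described above can be sketched without touching the DOM. This is only an illustration of the idea, not DOMObserver's actual API: the `MutationLike` shape, the option names, and `dispatchMutations` are all hypothetical, and plain strings stand in for DOM nodes. See the README on GitHub for the real interface.

```typescript
// Hypothetical shapes for illustration only; not DOMObserver's real types.
type MutationKind = "added" | "removed" | "attribute";

interface MutationLike {
  kind: MutationKind;
  target: string;         // stands in for the DOM node that changed
  attributeName?: string; // only set for attribute mutations
}

type Handler = (target: string, mutation: MutationLike) => void;

interface ObserverOptions {
  onAdded?: Handler;
  onRemoved?: Handler;
  onAttributeChanged?: Handler;
  attributeFilter?: string[]; // if given, only these attributes trigger the handler
}

// Route each mutation to the matching handler. Every handler receives the
// mutation target first and the full mutation object second.
function dispatchMutations(mutations: MutationLike[], options: ObserverOptions): void {
  for (const mutation of mutations) {
    if (mutation.kind === "added") {
      if (options.onAdded) options.onAdded(mutation.target, mutation);
    } else if (mutation.kind === "removed") {
      if (options.onRemoved) options.onRemoved(mutation.target, mutation);
    } else {
      const filter = options.attributeFilter;
      const name = mutation.attributeName;
      // Skip attribute changes that are not in the filter.
      if (filter && (name === undefined || filter.indexOf(name) === -1)) continue;
      if (options.onAttributeChanged) options.onAttributeChanged(mutation.target, mutation);
    }
  }
}

// Only the node addition and the data-loaded change are reported;
// the src change is filtered out.
const seen: string[] = [];
dispatchMutations(
  [
    { kind: "added", target: "widget-1" },
    { kind: "attribute", target: "img-1", attributeName: "src" },
    { kind: "attribute", target: "widget-1", attributeName: "data-loaded" },
  ],
  {
    onAdded: (target) => { seen.push("added:" + target); },
    onAttributeChanged: (target, m) => { seen.push("attr:" + target + ":" + m.attributeName); },
    attributeFilter: ["data-loaded"],
  }
);
```

In a browser, the mutation list would come from a MutationObserver callback or from Mutation Events rather than from a hand-written array.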

Check out the demo to get a hands-on feel of what DOMObserver can do for you, and make sure to check out the README file on GitHub for more information.


uberVU is part of the Hootsuite team. Read more about it here

GridList – building responsive dashboards with resizable widgets

Written on July 6, 2014

What?

GridList is a JS library for creating two-dimensional, resizable and responsive lists that we built to help create highly customisable dashboards. The library is used in one of our flagship features we call Boards and was developed with the needs of this project in mind.

Why a grid?

Our strongest point is our data: collecting and analysing great volumes of it is what we do best. But data alone is not enough. It is critical to show the results of all that collecting and analysing in an easy-to-comprehend way. We needed a way to make it easy to discover patterns and take action on them.

We needed a data visualisation tool that would be:

  • easy and intuitive to use;
  • easy to customise and manage;
  • responsive – meaning that it will look great no matter what screen it is shown on, a mobile phone or a big 4k display;

A dashboard is a way of visualising data that, if properly built, would tick all the boxes of our requirements. We are passionate about dashboards and went through a few iterations and versions of displaying data in a dashboard form to understand what makes a dashboard great.

Given all of this, it was clear that we needed a grid system that supported drag and drop. We had two options: find and adapt an existing one, or make our own.

Why build a new grid system?

In the end we decided to build our own library. We looked at existing grid systems and most closely at gridster.js, but all of them fell short and did not satisfy our requirements fully or would have required too much effort to adapt.

Our requirements for the grid system were:

  • support of horizontal grids – gridster.js works vertically and our design is horizontal. We looked at making gridster.js work both vertically and horizontally but the code required too much effort and was too complex to approach.

  • responsive – as mentioned earlier, it had to work on all kinds of displays
  • full-height widgets for our timelines, a particular type of widget that occupies a full column in horizontal grids or a full row in vertical ones; gridster.js did not have this feature
  • we wanted the grid logic to be a DOM-less library outside the jQuery plugin. This allows us to compute grid positions on the server-side and run extremely fast tests with Node.
  • a very good user experience that would not frustrate users with large boards in which the order of the items is important. This is a major point that is difficult to get right because it is quite subjective. We believe we nailed this part by implementing a well-thought-out collision mechanism that goes a step beyond the basic collision mechanism implemented by gridster.js and similar libraries, and by following a few principles that are discussed in depth later.

While solving all of this, our library ended up having 5 times fewer lines of code than gridster.js.

Principles

When building this project we stuck to a few principles:

Don’t break user input

This is a fundamental part of GridList and one that is easily missed. It is what gives the feeling that it works as it should or as expected when you drag an item around. The principle can be described best as – no surprises for the user. When you drag an item to a new position, that item will be placed there and nowhere else. The grid system will not do any magic afterwards and start moving the item around to make it fit. After an item is placed in the desired position, the collision mechanism kicks in and the items that have to move are arranged so that as few position changes as possible are needed.

Collisions can be solved in two ways. First, an attempt to resolve them locally is made, meaning that the moved item tries to swap position with the overlapped item(s). This is the preferred fair trade. If this doesn’t work out and after swapping we still have collisions inside the grid, the entire grid will be recalculated, starting with the moved item fixed in its new position. In the latter case, all the items around and to the right of the moved item might have their position slightly altered.
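The two-step resolution can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not GridList's actual code: the `GridItem` shape, `moveItem`, and `firstFreeSlot` are hypothetical names, the grid is assumed to be horizontal with a fixed number of rows, and the fallback repacks every other item from the left, which is cruder than altering only the items around and to the right of the moved one.

```typescript
// Hypothetical item shape: position and size in grid cells.
interface GridItem { id: string; x: number; y: number; w: number; h: number }

// Two rectangles overlap when they intersect on both axes.
function overlaps(a: GridItem, b: GridItem): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

function hasCollisions(items: GridItem[]): boolean {
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (overlaps(items[i], items[j])) return true;
    }
  }
  return false;
}

// Scan columns left to right, rows top to bottom, for the first free position.
function firstFreeSlot(item: GridItem, placed: GridItem[], rows: number): GridItem {
  const maxY = Math.max(0, rows - item.h);
  for (let col = 0; ; col++) {
    for (let row = 0; row <= maxY; row++) {
      const candidate = { ...item, x: col, y: row };
      if (!placed.some((p) => overlaps(p, candidate))) return candidate;
    }
  }
}

// Move an item; the moved item always ends up exactly where it was dropped.
function moveItem(items: GridItem[], id: string, x: number, y: number, rows: number): GridItem[] {
  const source = items.filter((item) => item.id === id)[0];
  if (!source) throw new Error("unknown item " + id);
  const moved: GridItem = { ...source, x, y };
  const others = items.filter((item) => item.id !== id);

  // Step 1: local resolution. Overlapped items take the position the moved item vacated.
  const swapped = others.map((item) =>
    overlaps(item, moved) ? { ...item, x: source.x, y: source.y } : item
  );
  const local = [moved, ...swapped];
  if (!hasCollisions(local)) return local;

  // Step 2: the swap left collisions, so recompute the grid around the fixed moved item.
  const placed: GridItem[] = [moved];
  const ordered = others.slice().sort((a, b) => a.x - b.x || a.y - b.y);
  for (const item of ordered) {
    placed.push(firstFreeSlot(item, placed, rows));
  }
  return placed;
}

// Dragging "a" onto "b" swaps the two; "c" is untouched.
const layout: GridItem[] = [
  { id: "a", x: 0, y: 0, w: 1, h: 1 },
  { id: "b", x: 1, y: 0, w: 1, h: 1 },
  { id: "c", x: 2, y: 0, w: 1, h: 1 },
];
const result = moveItem(layout, "a", 1, 0, 3);
```

Because the moved item is placed first and never repositioned in either step, the "no surprises" principle holds regardless of which branch resolves the collision.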

As independent from jQuery and other libraries as possible

This was achieved by splitting the GridList library into two parts, based on their roles:

  • an agnostic GridList class that manages the two-dimensional positions of a list of items within a virtual matrix
  • a jQuery plugin built on top of the GridList class that translates the generic item positions into responsive DOM elements with drag and drop capabilities

Only the second part has anything to do with the actual DOM; the core of the grid system does not need to know about it. The jQuery part only translates the results of the computations into pixels and handles drag and drop events.
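As a rough illustration of that translation step, the sketch below converts a grid position into a pixel rectangle. The names (`toPixelRect`, `PixelRect`) are hypothetical, and it assumes square cells whose size is derived from the container height and the number of rows, which may differ from what the plugin actually does.

```typescript
// Hypothetical shapes: a grid position in cells and its pixel equivalent.
interface GridItem { x: number; y: number; w: number; h: number }
interface PixelRect { left: number; top: number; width: number; height: number }

// Assumption: in a horizontal grid the row count is fixed, so the cell size
// follows from the container height, and cells are square.
function toPixelRect(item: GridItem, containerHeight: number, rows: number): PixelRect {
  const cell = containerHeight / rows;
  return {
    left: item.x * cell,
    top: item.y * cell,
    width: item.w * cell,
    height: item.h * cell,
  };
}

// A 1x2 widget in the third column, second row, of a 600px-high, 3-row grid.
const rect = toPixelRect({ x: 2, y: 1, w: 1, h: 2 }, 600, 3);
```

Keeping this conversion out of the core class is what lets the position logic run on the server or in Node tests with no DOM present.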

The size of the grid can be changed at any time

This makes it possible to have a responsive grid. It also makes it easy to have zoom in and out controls. Our goal here was to keep the widget positions in place as much as possible when the grid size changes.

Agnostic to how the grid is stored and efficient with reads and writes

We made the grid completely independent of how the actual grid data is saved. We also put a lot of effort into making sure we only send the actual changes. Once an item is dragged and dropped into a new position, the grid system only notifies about the widgets that have a new position, instead of resending the whole grid. We went a step further by supporting bulk requests.
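A minimal sketch of the "only send the actual changes" idea: compare the layout before and after a drop and keep only the widgets whose position differs. The `Layout` type and `changedPositions` are hypothetical names used for illustration, not GridList's API, and how the resulting object is persisted is left open, as in the library.

```typescript
// Hypothetical layout representation: widget id mapped to its grid position.
interface Position { x: number; y: number }
type Layout = Record<string, Position>;

// Return only the widgets that are new or whose position changed.
function changedPositions(before: Layout, after: Layout): Layout {
  const changes: Layout = {};
  for (const id of Object.keys(after)) {
    const prev = before[id];
    const next = after[id];
    if (next === undefined) continue;
    if (!prev || prev.x !== next.x || prev.y !== next.y) {
      changes[id] = next;
    }
  }
  return changes;
}

// "a" and "b" swapped places, "c" stayed put, so only "a" and "b" are reported.
const before: Layout = { a: { x: 0, y: 0 }, b: { x: 1, y: 0 }, c: { x: 2, y: 0 } };
const after: Layout = { a: { x: 1, y: 0 }, b: { x: 0, y: 0 }, c: { x: 2, y: 0 } };
const changes = changedPositions(before, after);
```

Sending `changes` as a single payload is one way to implement a bulk request: one write covers every widget that moved in a drop.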

Built for the open-source community

The whole project was conceived as being open source from the very start. We are heavy users of many open source projects and tools and contribute to many of them. This was an opportunity in which we had something to give back so we wrote GridList from the start as an open source project and we will continue to improve and maintain it.

The result

The effort of building the grid materialised in our boards feature. We made a horizontal, responsive, resizable dashboard that showcases the data we crunch very well.

If you want to start using GridList, find out more about the technical aspects, or contribute, check out the GitHub page.



Github-changelog – robust notification system built on top of GitHub

Written on June 24, 2014

What is it?

Github-changelog provides a mechanism to communicate to the users of your web app that updates are available and that they should reload to see the changes. Github repository: Check it out here
Demo: See the demo

The source of the need

As developers, most of us strive to reach the Holy Grail of continuous deployment, so that any member of the team can push fixes and features as soon as they are built. Mean and lean. For most of us, the days of huge releases are gone, replaced by an unpredictable and continuous flow of fixes and features. And these can be many. Etsy, a company we admire for their well-oiled deployment system, managed to have 517 individual deploys in a single month. Although we are not quite at that scale yet, we manage to have 15+ daily deploys on a productive day.

Scala Trivial Refactoring Examples

Written on June 5, 2014

Here at HootSuite, we use Scala for some of our applications. Some of our Scala developers have a Java/PHP background, while others have a background in Haskell. After writing Scala for a few years, we feel that we have (more or less) figured out a happy medium of how we should be writing Scala with a functional style we can all agree on.

This blog post will show the actual code examples we’ve come across while refactoring as part of our code review / pair programming sessions. This is production-level code, but I have simplified it to demonstrate the changes.


Animating a header out of a ListView

Written on May 1, 2014

When we enabled Social Sign In for our mobile users in the HootSuite Android app, we wanted to keep the flow simple and light with as few screens and dialogs as possible. This meant that we needed to defer collecting the user’s email address in certain cases until after sign-up.

As such, we wanted to surface an inline notification asking users to enter their email. The design was pretty straightforward: put a dialog inline with the main tab view content where the user can insert the information or dismiss the dialog.

I had done this a few times before, and the cleanest way to implement this, in my opinion, is to create the inline element as a view and add it as a header to the ListView object. What I had never really had to do before was dismiss it with a collapsing animation. When implementing that animation, I ran into an issue with ListView that I did not expect to find.


DDoS Attack Thursday March 20th – Post Mortem

Written on March 27, 2014

Last Thursday at about 6:40 a.m. we received an email threatening a Distributed Denial of Service (DDoS) attack on our systems, unless we paid a ransom of 2.5 bitcoins. Within minutes an attack was launched and hootsuite.com became inaccessible. Our OnCall team immediately went to work to mitigate the attack.

In a DDoS attack, a large number of compromised computers (collectively called a botnet) repeatedly sends many simultaneous web requests to a website, which overwhelms the site’s ability to process regular traffic.

To a regular hootsuite.com customer, the site appears unresponsive or, in the best case, really, really slow. Though such an attack is annoying and potentially costly to the customer, it’s important to note that in no way was any hootsuite.com customer data compromised. The attack was directed at the system that handles incoming web requests – our load balancer grid. Our databases and other internal systems were unaffected. In fact, during the attack scheduled messages continued to be sent.

After verifying the attack was legitimate, HootSuite’s OnCall team focused on getting our load balancer (LB) grid back online. We first attempted to identify and isolate the malicious traffic so that we could block it before it reached the LBs. However, our LBs were unable to perform diagnostic activity because they were maxed out handling incoming traffic. Next, we turned our attention to scaling up our capacity to handle the incoming traffic. Fortunately, we have a solid configuration management system in place powered by Ansible, which allows us to spin up new LBs and add them to our grid quickly. That, combined with the elasticity provided by AWS, enabled us to ramp up the capacity of our LB grid to handle the traffic from the attack and process regular traffic at the same time – bringing hootsuite.com back online for our users at approximately 9:40 a.m. We then continued to triage the malicious traffic in an attempt to block it; however, about an hour after we brought the site back up, the attack stopped.

We recognize that HootSuite is an essential tool for our customers, and as such we invest a lot of effort in making sure it is available 24/7. We’re taking the following steps to reduce downtimes and improve resiliency to DDoS attacks in the future:

  1. Hardening the outer layer of our architecture so malicious traffic gets dropped instead of bogging down servers
  2. Improving the effectiveness of our monitoring systems, which will allow us to pinpoint problems quickly
  3. Improving our auto scaling infrastructure to make it even faster and easier to add capacity to handle attacks

You may be wondering why we didn’t just fork over the 2.5 bitcoins (about $1200) to pay off the attacker. For one, we are not interested in negotiating with criminals. Additionally, paying the ransom would likely lead to future attacks.

It’s probable that the attacker was testing our willingness to bargain – 2.5 bitcoins is high enough to sound legitimate, but low enough that there was a reasonable chance we would pay. If we handed over the money, it’s likely another ransom demand would follow with an even higher price tag.

It’s important to us that our customers know what’s going on during an outage – we commit to posting an update to status.hootsuite.com as soon as we detect an issue and at least every 15 minutes after that until it’s resolved. The steps outlined above will ensure we can diagnose issues even faster and get updates out to customers as quickly as possible.

I would like to thank the folks at meetup.com who were eager to help and provided valuable information to our team, having suffered a similar attack a week before. Our team would be happy to collaborate with anyone suffering a similar attack – feel free to reach out to me on Twitter at @sedsimon if you have information. In closing, I would like to offer my sincere apologies to all customers affected by this attack, and to thank you for your patience and continued support.