Posts from September 2014

During my time as a member of Hootsuite’s Customer Happiness team, I’ve come to understand our techniques for debugging a large-scale web application. I want to share the five best techniques I’ve learned for debugging issues on an application as massive as Hootsuite, though they apply to web applications of any size.

Read More …

Functional Reactive Programming has been adopted in many programming communities, and for good reason. Managing multiple asynchronous calls usually results in a mess of code that is tricky to debug and difficult to maintain and build upon, not to mention the many “gotchas” surrounding the use of AsyncTasks on different versions of the Android SDK. This kind of code usually ends up being the bane of an Android developer’s existence.

Enter the Android module for RxJava.

By using a souped-up version of the Observer pattern, we gain the ability to create incredibly powerful chains of logic that can be run on specific threads (UI, background, etc.) and that users of these observables can modify further to tune them for exactly the UI they are trying to populate.
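To make the idea of a chain whose stages run on chosen threads concrete, here is a minimal sketch. It deliberately does not use RxJava: it relies on `CompletableFuture` from the Java standard library as a stand-in, and the class name `ThreadedChain`, the helper `singleThread`, and the thread names `background` and `ui` are all hypothetical names chosen for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadedChain {

    // Hypothetical helper: a single-threaded executor whose thread has a fixed name.
    // On Android, the "ui" executor would stand in for the main thread.
    static ExecutorService singleThread(String name) {
        return Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true); // let the JVM exit without an explicit shutdown
            return t;
        });
    }

    // Builds a chain: produce a value on `background`, transform it,
    // then hand the result to a final stage that runs on `ui`.
    static CompletableFuture<String> loadGreeting(ExecutorService background,
                                                  ExecutorService ui) {
        return CompletableFuture
                .supplyAsync(() -> "hello", background) // simulated slow call
                .thenApply(String::toUpperCase)         // transformation step
                .thenApplyAsync(
                        text -> text + " (delivered on "
                                + Thread.currentThread().getName() + ")",
                        ui);                            // final stage pinned to "ui"
    }

    public static void main(String[] args) {
        ExecutorService background = singleThread("background");
        ExecutorService ui = singleThread("ui");
        // join() blocks until the whole chain has completed.
        System.out.println(loadGreeting(background, ui).join());
    }
}
```

A caller can append further stages to the returned future before consuming it, which loosely mirrors how consumers of an observable can adapt it to their own UI.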

Read More …

Listening to Lightning Talks

Reposted from our Hootsuite Blog

After attending Vancouver’s Polyglot Unconference in May 2013, Beier Cai and Rahim Lalani approached me with an idea: kickstart short, unconference-style talks on technical topics within our Engineering group. I loved this riff on the traditional lunch-and-learn and decided to set a short time limit of just 5 minutes per talk.

Why so short?

Short is all you need to start a conversation. Short also suits our modern-day attention spans. And short is hard, because the constraint requires you to think critically and creatively about how to convey only essential information. We started out with only technical topics but soon expanded to include non-technical ones, and we now include speakers from other parts of our organization as a way to expose our audience to different roles and functions. Read More …

At Hootsuite, we analyze user data to help us answer questions about product usability and give us insight into trends. Recently, we designed and implemented a RESTful API that provides access to our aggregated data sets and shields our data consumers from changes in infrastructure that might be happening in the backend. We built our RESTful API using the Play Framework with Scala and we believe it was a great decision for multiple reasons.

Currently, all of our logs are stored in HDFS and S3. We use Apache Spark to aggregate raw logs into more meaningful and usable data sets. For instance, one of our Spark jobs calculates an aggregate count of Hootsuite’s unique daily users across all our different platforms and plan types.
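As a rough illustration of the kind of aggregation described above, the following plain-Java sketch counts distinct users per day, platform, and plan type. It is not Spark and not the actual job: the `Event` fields, the `day|platform|plan` key format, and the sample data are assumptions made up for the example, not the real log schema.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class UniqueDailyUsers {

    // Hypothetical log event; the field names are assumptions for this sketch.
    static final class Event {
        final String day;
        final String platform;
        final String plan;
        final String userId;

        Event(String day, String platform, String plan, String userId) {
            this.day = day;
            this.platform = platform;
            this.plan = plan;
            this.userId = userId;
        }
    }

    // Counts distinct user IDs for each (day, platform, plan) group.
    static Map<String, Integer> countUniqueUsers(List<Event> events) {
        // First pass: collect the set of users seen in each group.
        Map<String, Set<String>> usersByGroup = new TreeMap<>();
        for (Event e : events) {
            String key = e.day + "|" + e.platform + "|" + e.plan;
            usersByGroup.computeIfAbsent(key, k -> new HashSet<>()).add(e.userId);
        }
        // Second pass: reduce each set to its size.
        Map<String, Integer> counts = new TreeMap<>();
        usersByGroup.forEach((key, users) -> counts.put(key, users.size()));
        return counts;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("2014-09-01", "web", "pro", "u1"),
                new Event("2014-09-01", "web", "pro", "u1"), // repeat visit, counted once
                new Event("2014-09-01", "web", "pro", "u2"),
                new Event("2014-09-01", "android", "free", "u3"));
        countUniqueUsers(events)
                .forEach((key, count) -> System.out.println(key + " -> " + count));
    }
}
```

The sets make repeat visits by the same user collapse to a single entry, which is the property that distinguishes a unique-user count from a raw event count.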

The aggregated data sets are stored in a central RDBMS and are used for data visualizations on dashboards around the office and for reports that serve multiple internal stakeholders. Since we are continuously experimenting with different data processing pipelines and data stores, we decided that we needed to provide a stable, well-defined way to access our data through an API. Providing an API also makes it easier to communicate clearly which data sets are available, and it makes documentation much easier.

We came across multiple features that demonstrated the advantages of using Play to design and implement our API: Read More …

Software-as-a-service organizations live and die on service reliability and uptime. Having an “on call” team (formal or otherwise) that can quickly react to incidents and outages is mission critical to the business. Hootsuite is no different: our customers around the globe rely on us being available 24/7. But even software engineers need sleep. So, who holds down the fort if the unthinkable happens outside business hours?

Beier Cai recovering from an outage. Beier is Hootsuite’s Director of Engineering - Platform and UserID #1. Photo credit: Simon Stanlake

This year we overhauled our On Call practice to version 3.0. This post walks through the evolution of our On Call model: the stark reality of 1.0, the noble intentions of 2.0, and the practicality of 3.0.

On Call 1.0

Our first On Call system was informal and happened reactively, out of necessity. A handful of the original software engineers and operations engineers who built Hootsuite were unofficially on call all. the. time. No sleep allowed, no knowledge transfer, no metrics, and no visibility into incidents, issues, or remedies. Read More …