Build a Protocol Buffer Powered Tracking Pixel in Go
At Reverb we’ve been working hard to take ownership of our data. This process is important to us as we grow as an organization. If you want to know more about the “why”, I encourage you to read Joe Levering’s previous post.
Today I want to show “how” we collect some of this data.
Tracking pixels, or web beacons, have been around for a long time. Open up the source for an email from any marketer or e-commerce website and you’ll likely find a line that looks like this:
<img src="http://yourfriends.us10.list-manage.com/track/open.php?u=3D9b25be73813defa5034315098a&id=3Dc368123b1a&e=3D6c44e4dd9b" height="1" width="1">
This 1x1 pixel was created just for me and will call back to the marketer when I open the email and images are requested. I see nothing because the returned pixel is 1x1 and transparent, but the marketer can log these requests to track open rates. In addition, these pixels usually carry extra information in query params (e.g. who opened the email).
Let’s Build an API
At Reverb we already have a microservice that takes tracking information from our frontend and places it in our event pipeline. It is written in Go and is nothing more than a thin proxy around fluentd. This seems like an obvious place to add a new tracking pixel feature.
The EventAPI service is built using Gin, a microframework popular in the Go ecosystem. It provides a simple DSL around Go’s http.Handler interface.
Now we have an endpoint set up for our new tracking pixel, but we don’t know a lot about the request that just came in. While this implementation would be useful for knowing our general open rate, it tells us nothing about which user or message the request originated from.
Dealing with Query Params
The simplest thing to do would be to add a few query params to the end of the request and log those. In fact, many tracking pixels are built using this technique.
http://your-event-service.com/v1/event.gif?user_name=theclash@casbah.com&message_id=32
But if you have highly structured or deeply nested objects this quickly becomes problematic.
http://your-event-service.com/v1/event.gif?user[id]=12&user[email]=theclash@casbah.com&user[experiments][0]=london&user[experiments][1]=calling&message[id]=12&message[experiments][0]=riot
Nested query params like this are hard to parse and even harder to standardize on, and our URL is quickly becoming very large.
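To make the parsing problem concrete, here is a small sketch of what Go's standard library does with bracketed params. url.ParseQuery treats the brackets as part of the key, so any nesting has to be rebuilt by hand. The helper name flatKeys and the shortened query string are made up for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
)

// flatKeys (hypothetical helper) parses a raw query string and returns its
// keys in sorted order, exactly as the standard library sees them.
func flatKeys(rawQuery string) ([]string, error) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return nil, err
	}
	keys := make([]string, 0, len(values))
	for k := range values {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	keys, err := flatKeys("user[id]=12&user[experiments][0]=london&message[id]=12")
	if err != nil {
		panic(err)
	}
	// Each key is a flat string such as "user[experiments][0]";
	// no nested structure is produced.
	for _, k := range keys {
		fmt.Println(k)
	}
}
```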
Web development of course has a common and beloved format for dealing with this problem: JSON. With a bit of clever encoding we can use it for our tracking pixel as well. We could escape our JSON and put it on the end of our query:
http://your-event-service.com/v1/event.gif?q="{\"user\":{\"id\":10,\"experiments\":[{\"name\":\"london\",\"value\":\"calling\"}]},\"message\":{\"id\":10,\"content_version\":2},\"sent_at\":\"2015-10-11\",\"mail_server\":\"hendrix\"}"
While we’ve solved the nested or complex object issue, we’re still stuck with a long URL and a lot of escaping to ensure that this query doesn’t break the recipient’s client.
To solve the escaping issue some analytics services will instead encode their JSON queries with Base64.
http://your-event-service.com/v1/event.gif?q=eyJ1c2VyIjp7ImlkIjoxMCwiZXhwZXJpbWVudHMiOlt7Im5hbWUiOiJsb25kb24iLCJ2YWx1ZSI6ImNhbGxpbmcifV19LCJtZXNzYWdlIjp7ImlkIjoxMCwiY29udGVudF92ZXJzaW9uIjoyfSwic2VudF9hdCI6IjIwMTUtMTAtMTEiLCJtYWlsX3NlcnZlciI6ImhlbmRyaXgifQ
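A minimal sketch of the Base64 approach in Go might look like the following. The event and user types are hypothetical, trimmed-down stand-ins for the payload above. The choice of the unpadded, URL-safe alphabet (base64.RawURLEncoding) is an assumption; it avoids the characters +, / and =, which would otherwise need escaping in a query string.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// user and event are hypothetical types for illustration only.
type user struct {
	ID          int      `json:"id"`
	Experiments []string `json:"experiments"`
}

type event struct {
	User      user `json:"user"`
	MessageID int  `json:"message_id"`
}

// encodeEvent serializes the event to JSON and Base64-encodes the result
// so that it can be used as the value of the q query param.
func encodeEvent(e event) (string, error) {
	raw, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// decodeEvent reverses encodeEvent on the receiving side.
func decodeEvent(q string) (event, error) {
	var e event
	raw, err := base64.RawURLEncoding.DecodeString(q)
	if err != nil {
		return e, err
	}
	err = json.Unmarshal(raw, &e)
	return e, err
}

func main() {
	q, err := encodeEvent(event{User: user{ID: 10, Experiments: []string{"london"}}, MessageID: 32})
	if err != nil {
		panic(err)
	}
	fmt.Println("http://your-event-service.com/v1/event.gif?q=" + q)

	back, err := decodeEvent(q)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", back)
}
```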
At 254 characters, we’re still well short of the ~2k character limit for a URL, which means we can pack a lot more data into that payload, but it does have its limits.
Enter Protocol Buffers
At Reverb we’ve been using Protocol Buffers to define event schemas internally. Since our tracking pixel event gets dumped into our event pipeline, we had already defined a schema for our tracking pixel events.
Swapping out our Base64 encoded JSON for a Base64 encoded Protocol Buffer at the end of our /v1/event.gif?q= call means a smaller payload, but more importantly it means a well-defined payload that we can place directly into our event pipeline. Because we defined our message as a Protocol Buffer, we can generate Ruby code that builds these pixels very simply in our Rails backend.
That makes our new query string a lot more compact:
/v1/event.gif?q=EgIxMhoGT1BFTkVE
In addition, because we're using Protocol Buffers, our Go EventAPI can share the same schema via some code generation.
By using our Base64 encoded Protocol Buffers as our query string we've accomplished the goal of transmitting highly structured data via an HTTP GET request, limiting the size of the payload, and ensuring that the message is well structured enough for downstream systems to use these events for analytics.
Now our TrackingPixelHandler can look like this:
That’s it! Now we have our events flowing through our API in a well-structured way that internal services can use for analytics. We could make some additional improvements, like a generic wrapper for our events or decorating them with User-Agent information, but we’ll leave that for another blog post.