Deploying Lambdas with S3, Jenkins, and Gulp

Serverless architectures are a hot topic right now, popularized in large part by the growing NoOps movement. Products like Heroku, Elastic Beanstalk, and Convox aim to build a layer of abstraction between software engineers and the operations environment surrounding their code. Lambda takes yet another step past automating operations by eliminating them entirely; Lambda functions are stateless, created and destroyed per request.

Because Lambdas are such a raw, context-driven solution, deployment processes are equally raw and context-driven. The only solution provided out of the box is to bundle your Node.js package (or Python, or Java, though in this post we’ll be focusing on Node.js) into a zip file – node_modules folder included, try not to gasp too loudly – and either manually upload it through Amazon’s UI or push it with their CLI tools via an API call.

I’ll go out on a limb at this point and assume we’re thinking the same thing: that’s probably not a tenable or scalable solution. It’s time to automate.

Let’s survey the landscape of tooling available to us. Lambda is an offering from AWS (Amazon Web Services), which means we have the entirety of AWS at arm’s reach to support our Lambda deployments. We also have tools such as Terraform, a HashiCorp offering that automates AWS infrastructure setup and teardown. For our Lambda, we settled on the following tooling:

  • Node.js 4.3
  • S3
  • Gulp.js (with gulp-clean, gulp-zip)
  • Another Lambda function (we’ll discuss this later)
  • The S3 Node.js module
  • Jenkins
  • Terraform

The Basic Idea

The way the deployment system works is like this: we have two Lambda functions, one of which is the actual Lambda we want to deploy, and one of which is the Lambda function that actually deploys other Lambdas. Let that sink in for a second.

The flow looks like this: we have a task in Jenkins to deploy a Lambda. The Jenkins task calls a Gulp.js task called deploy. That Gulp.js task runs clean (delete any existing archives), build (zip everything up), and then deploys the archive to an S3 bucket specified in an environment variable.
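
In rough strokes, the gulpfile looks something like this (a sketch; the task names, paths, and DEPLOY_BUCKET variable are illustrative rather than our exact build file):

// gulpfile.js
var gulp = require('gulp');
var clean = require('gulp-clean');
var zip = require('gulp-zip');
var AWS = require('aws-sdk');
var fs = require('fs');

// delete any existing archives
gulp.task('clean', function () {
  return gulp.src('dist', { read: false }).pipe(clean());
});

// zip everything up, node_modules included
gulp.task('build', ['clean'], function () {
  return gulp.src(['index.js', 'node_modules/**/*'], { base: '.' })
    .pipe(zip('lambda.zip'))
    .pipe(gulp.dest('dist'));
});

// push the archive to the bucket Jenkins hands us via an environment variable
gulp.task('deploy', ['build'], function (done) {
  new AWS.S3().upload({
    Bucket: process.env.DEPLOY_BUCKET,
    Key: 'my-function/lambda.zip',
    Body: fs.createReadStream('dist/lambda.zip')
  }, done);
});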

The Lambda function that deploys other Lambda functions is listening for events on our S3 bucket. So, when a new archive is uploaded to the bucket, that Lambda function runs and calls updateFunctionCode on the other Lambda function to update it from the archive it received in S3. When the event says a file was deleted, the function looks up previous revisions of the file and restores the most recent non-deleted version, again calling updateFunctionCode to do so. There’s also a Gulp.js task called revert that will do this manually – which is called by Jenkins for rollbacks.
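
A stripped-down version of that deployer function might look like the following; this sketch only covers the upload path, and the key-to-function-name mapping is an assumption for illustration:

// deployer.js – the Lambda that deploys other Lambdas
var AWS = require('aws-sdk');
var lambda = new AWS.Lambda();

exports.handler = function (event, context, callback) {
  var record = event.Records[0];
  var bucket = record.s3.bucket.name;
  var key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

  // assume archives are uploaded as "<function-name>/lambda.zip"
  var functionName = key.split('/')[0];

  lambda.updateFunctionCode({
    FunctionName: functionName,
    S3Bucket: bucket,
    S3Key: key
  }, callback);
};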

Here’s what it looks like:

[Diagram: Lambda deploy pipeline (Cloudcraft)]

Everything within AWS – the API Gateway, S3 Bucket, Lambda definitions, policies, roles – is provisioned by Terraform.

As you can see in the diagram, we also use NGINX to reverse proxy requests to an AWS API Gateway, which calls the Lambda function. This approach means that you can effectively build an API layer with Lambda functions and very little operational overhead.

The Web Request

If you’re using Lambda to build an API endpoint, you’re going to need a way to get that request all the way through to Lambda. That’s where Amazon’s API Gateway comes in handy.

API Gateway will do exactly what the name promises – provide an HTTP endpoint that acts as an API Gateway into a number of different services, including Lambda. There’s one caveat with Lambda, however: it can only be invoked via an HTTP POST. So, when setting up the integration between API Gateway and Lambda, regardless of the incoming HTTP verb, make sure that the gateway POSTs to the Lambda.

API Gateway will give you a kind of nonsensical-looking URL, so if you’re consuming this elsewhere in your codebase you probably want another layer on top of that. For us, this came down to either using Cloudfront or our existing NGINX setup to forward requests on from our existing API request paths to the new endpoint. We decided to set up an NGINX reverse proxy that passed the request on to API Gateway. This actually gets a little tricky as well, because of the expectations API Gateway has for secure handshakes. Our proxy ended up looking like this:
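
In rough terms (the location, hostname, and stage below are placeholders for our real API Gateway endpoint):

location /api/my-endpoint/ {
    # resolve API Gateway's hostname at runtime via Google's DNS
    resolver 8.8.8.8;
    set $api_gateway "abc123.execute-api.us-east-1.amazonaws.com";

    # API Gateway expects TLS (and SNI) on the handshake
    proxy_ssl_server_name on;
    proxy_set_header Host $api_gateway;

    # pass any query parameters straight through
    proxy_pass https://$api_gateway/prod/my-endpoint$is_args$args;
}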

This does a few things: it uses Google’s DNS to perform lookups, enables TLS for the proxy (which API Gateway expects), and then passes any query parameters on.

So now we have an NGINX reverse proxy that sends a request to API Gateway, which invokes a Lambda function and pipes the response all the way back upstream.

If you find this kind of work interesting – or even if you don’t, but you love music and code – we’re hiring.

Jeff Meyers

How to Build a Protocol Buffer Powered Tracking Pixel in Go

At Reverb we’ve been working hard to take ownership of our data. This process is important to us as we grow as an organization, and if you want to know more about the “why,” I would encourage you to read Joe Levering’s previous post.

Today I want to show “how” we collect some of this data.

Tracking pixels, or web beacons, have been around for a long time. Open up the source of an email from any marketer or e-commerce website and you’ll likely find a line that looks like this:

<img src="http://yourfriends.us10.list-manage.com/track/open.php?u=9b25be73813defa5034315098a&id=c368123b1a&e=6c44e4dd9b" height="1" width="1">

This 1×1 pixel was created just for me and will call back to the marketer when I open the email and images are requested. I see nothing because the pixel returned is 1×1 and transparent, but the marketer can log these requests to track their open rates. In addition, these pixels usually attach extra information via query params (e.g. who opened the email).

Let’s Build an API

At Reverb we already have a microservice that takes tracking information from our frontend and places it in our event pipeline. It is written in Go and is nothing more than a thin proxy around fluentd. This seems like an obvious place to add a new tracking pixel feature.

The EventAPI service is built using Gin, a micro framework popular in the Go ecosystem. It provides a simple DSL around Go’s http.Handler interface.

// Router setup
router := gin.New()
gifHandler := &TrackingPixelHandler{fluentClient: client}

router.POST("/v1/events", eventHandler.Handler) // existing bulk event endpoint
router.GET("/v1/event.gif", gifHandler.Handler) // new tracking pixel endpoint
// ...

// tracking_pixel_handler.go

// GIF is a 1x1 transparent GIF, served as the pixel itself
var GIF = []byte{
    71, 73, 70, 56, 57, 97, 1, 0, 1, 0, 128, 0, 0, 0, 0, 0,
    255, 255, 255, 33, 249, 4, 1, 0, 0, 0, 0, 44, 0, 0, 0, 0,
    1, 0, 1, 0, 0, 2, 1, 68, 0, 59,
}

func (h *TrackingPixelHandler) Handler(c *gin.Context) {
    h.fluentClient.Post("ReceivedEventGIFRequest") // Post to FluentD

    c.Header("Cache-Control", "no-cache, no-store, must-revalidate")
    c.Header("Content-Type", "image/gif")
    c.Writer.Write(GIF)
}

Now we have an endpoint set up for our new tracking pixel, but we don’t know a lot about the request that just came in. While this implementation would be useful for knowing our general open rate, we don’t know which user or message the request originated from.

Dealing with Query Params

The simplest thing to do would be to add some query params to the end of the request and log those. In fact, many tracking pixels are built using this simple technique.

http://your-event-service.com/v1/event.gif?user_name=theclash@casbah.com&message_id=32

But if you have highly structured or deeply nested objects this quickly becomes problematic.

http://your-event-service.com/v1/event.gif?user[id]=12&user[email]=theclash@casbah.com&user[experiments][0]=london&user[experiments][1]=calling&message[id]=12&message[experiments][0]=riot

Nested query params like this are hard to parse and harder to standardize on, and our URL is quickly getting very long.

Web development of course has a common and beloved format for dealing with this problem: JSON. With a bit of clever encoding we can use it for our tracking pixel as well. We could escape our JSON and put it on the end of our query:

http://your-event-service.com/v1/event.gif?q="{\"user\":{\"id\":10,\"experiments\":[{\"name\":\"london\",\"value\":\"calling\"}]},\"message\":{\"id\":10,\"content_version\":2},\"sent_at\":\"2015-10-11\",\"mail_server\":\"hendrix\"}"

While we’ve solved the nested or complex object issue, we’re still stuck with a long url and a lot of escaping to ensure that this query doesn’t break the recipient’s client.

To solve the escaping issue some analytics services will instead encode their JSON queries with Base64.

http://your-event-service.com/v1/event.gif?q=eyJ1c2VyIjp7ImlkIjoxMCwiZXhwZXJpbWVudHMiOlt7Im5hbWUiOiJsb25kb24iLCJ2YWx1ZSI6ImNhbGxpbmcifV19LCJtZXNzYWdlIjp7ImlkIjoxMCwiY29udGVudF92ZXJzaW9uIjoyfSwic2VudF9hdCI6IjIwMTUtMTAtMTEiLCJtYWlsX3NlcnZlciI6ImhlbmRyaXgifQ

At roughly 250 characters, we’re still well under the ~2k practical character limit for a URL, which means we can pack a lot more data into that payload, but it has its limits.

Enter Protocol Buffers

At Reverb we’ve been using Protocol Buffers to define event schemas internally. Since our tracking pixel event gets dumped into our event pipeline, we had already defined a schema for our tracking pixel events.

// Example event for an email we send to alert users of a new message from a buyer or seller
message MessagesMailer {
  enum ACTION {
    SENT = 0;
    OPENED = 1;
  }

  optional string email = 1;
  optional string message_id = 2;
  optional ACTION action = 3;
}

Swapping out our Base64-encoded JSON for a Base64-encoded Protocol Buffer at the end of our /v1/event.gif?q= call means a smaller payload, but more importantly it means a well-defined payload that we can place directly into our event pipeline. Because we defined our message as a Protocol Buffer, we can generate Ruby code to build these pixels very simply in our Rails backend.

>> Base64.urlsafe_encode64(Reverb::Event::MessagesMailer.new(message_id: "12", action: Reverb::Event::MessagesMailer::ACTION::OPENED.name, email: "casbah@clash.com").encode)
=> "EgIxMhoGT1BFTkVE"

In addition, because we’re using Protocol Buffers, our Go EventAPI can share the same schema via code generation.
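
For example, a build step along these lines (the file path is illustrative, and it assumes the protoc-gen-go plugin is installed) regenerates the Go bindings from the same .proto file:

# regenerate the Go structs whenever the schema changes
protoc --go_out=. events/messages_mailer.proto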

By using Base64-encoded Protocol Buffers as our query string, we’ve accomplished our goals: transmitting highly structured data via an HTTP GET request, limiting the size of the payload, and ensuring that the message is well structured enough for downstream systems to use these events for analytics.

Now our TrackingPixelHandler can look like this:

func (h *TrackingPixelHandler) Handler(c *gin.Context) {
    rawEvent := c.Query("q")

    event, err := decodeEvent(rawEvent)
    if err != nil {
        c.String(400, err.Error())
        return
    }

    h.fluentClient.Post("tracking-pixel-event", *event)

    c.Header("Cache-Control", "no-cache, no-store, must-revalidate")
    c.Header("Content-Type", "image/gif")
    c.Writer.Write(GIF)
}

// decodeEvent unpacks the Base64-encoded query param back into our
// generated Protocol Buffer struct. The Rails side uses
// Base64.urlsafe_encode64, so we decode with the URL-safe alphabet.
func decodeEvent(event string) (*reverb_event.MessagesMailer, error) {
    rawProto, err := base64.URLEncoding.DecodeString(event)
    if err != nil {
        return nil, err
    }

    pixelEvent := &reverb_event.MessagesMailer{}
    if err := proto.Unmarshal(rawProto, pixelEvent); err != nil {
        return nil, err
    }

    return pixelEvent, nil
}

That’s it! Now we have our events flowing through our API in a well structured way that internal services can use for analytics. We can make some additional improvements like a generic wrapper for our events or decorating UserAgent information but we’ll leave that for another blog post.

@erikbenoist

Rates of Change

I keep coming back to an idea I picked up somewhere in Kent Beck’s writings: do not mix different rates of change. This fundamental principle applies to every aspect of a technology stack, and even the organization at large. Let’s take a look at some examples:

Business logic and persistence
Persistence changes slowly; business logic changes quickly. If you mix them you get the classical Rails Fat Model that keeps growing as business logic changes. If you keep them apart, you have single purpose business use case classes and a persistence model that changes slowly.

Sidekiq queues
Don’t mix fast and slow jobs in one queue. If you do, the slow jobs will tie up the threads and you’ll never get to the fast jobs. This can be especially important if you have fast jobs that schedule slow jobs. Corollary: don’t mix fast and slow queues on a single worker instance. If you have multiple worker instances, you can have some focusing on slow, low priority jobs, and others focusing on fast, high priority jobs.
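
As a concrete sketch (queue names and settings invented), splitting by rate of change might mean giving each group of worker instances its own config and only the queues that match its pace, then starting each group with `sidekiq -C <config file>`:

# config/sidekiq_fast.yml – instances dedicated to fast, high-priority jobs
:concurrency: 25
:queues:
  - critical
  - default

# config/sidekiq_slow.yml – separate instances for slow, low-priority jobs
:concurrency: 5
:queues:
  - bulk
  - mailers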

Types of data (read/write, heavy/light)
Our ElasticSearch cluster had regular indexes for all our typical search needs (products, users, etc) and it had one extremely high write index (Feed). The high write index took up 80% of the space and 90% of the writes to the cluster, impacting performance in other areas. Again, rates of change tell us to separate these out.

People
Developers working on long features can be kept more productive by insulating them from disruptive bugfixing or firefighting duties. We have a triage rotation here where one developer every week is on the front lines of fighting bugs while others focus on longer term projects.

What are some other examples?

@skwp

Owning Our Data

When a company is young, time/money/resources are scarce and it’s usually easiest to rely on external services for non-critical needs. As a company matures, room for error decreases, which often necessitates replacing these services with custom-built solutions. At Reverb.com, we’ve recently begun the process of owning our data. Today I want to summarize briefly our first steps in this process and what benefits we hope to reap.

But first, some context.

What do I mean by “data”?

In general, I’m speaking of analytics data generated by user actions and the aggregation of these “events”. A few months ago, our analytics were a mess. We sent back-end events to a variety of services including our own ELK (Elasticsearch, Logstash, Kibana) stack. We piped our front-end events to external services as well, but not the same ones that received our back-end events. Our mobile apps also had their own services for analytics tracking.

With data spread all over the place, it became difficult to get a big-picture view of our analytics. The services we used allowed us to answer most of our smaller, more feature-focused questions, but we wanted to start consolidating our data to derive deeper insights.

Controlling the Pipeline

We started by refining our own pipeline. Instead of sending events to a variety of services, we wanted one place to send and store our data. We created an event-logging microservice in Go with a simple bulk /events endpoint that takes a JSON payload which includes an event name and arbitrary attributes. The events are delivered directly to FluentD, which prisms those streams to a number of collectors, including Elasticsearch. We wired this API up to our front-end, with the intention of having our mobile apps use the same API in the near future.

Standardizing Events

Once our data was flowing to the same place, we worked on standardizing our events. Previously we had data logged in a variety of different formats. Some events were logged using a custom logger that placed the majority of its information under a “data” key in a hash. Some used the built-in Rails logger with no formatting. Others still, like our API, used their own unique format. The result was that events contained a variety of information under a variety of different keys (or no key at all). Searching through logs was tedious at best and impossible at worst.

To resolve this, we configured all of our loggers to use a custom formatter. With every event being formatted the same, we were able to iterate through several different universal formats for events, ultimately settling on JSON objects with top-level @event_name, @event_source, and data keys. This allows us to quickly find events via their name (“analytics.worker.add_to_wishlist”), their origin (Rails, API, ElasticSearch, etc.), and also to provide a clear place for freeform, more detailed information.
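
For illustration (the data fields here are invented), a standardized event ends up looking something like this:

{
  "@event_name": "analytics.worker.add_to_wishlist",
  "@event_source": "rails",
  "data": {
    "user_id": 12345,
    "product_id": 67890
  }
}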

What do we get out of this?

There are immediate benefits related to enhancing the searchability of our data: developers can troubleshoot problems more effectively, and data scientists have more confidence in their research. But that’s really just the tip of the iceberg. With the ability to generate more comprehensive business analytics, we’re now squarely on the path to better site monitoring and personalization.

We can now more easily track performance of systems (like search), allowing us to iterate on algorithms until we nail something that works well. We can also A/B test more effectively, allowing data to better inform our design decisions.

With full control of our analytics, we can also achieve deep insights into what our customers are doing, what they like, and what they want to see. We can use this information to create relevance engines to help us to profile our customers and determine what content is most important to them. Subsequently, we can use these tools to create more pertinent and engaging pages, emails, and articles, all of which can be surfaced dynamically depending on the (analytics-driven) interests of the user.

We’re excited about these possibilities and more! As an envoy of Reverb’s newly-minted Discovery team, I hope you’ll join us in the future as we continue to document our quest to own our data and, ultimately, to bring world-class personalization and search to Reverb!

Joe Levering
joe@reverb.com

 

Testing Android Activity Results

TLDR

Espresso doesn’t have a way to test ActivityResults, but they’re an important thing to test. You can do it by creating an Activity that launches your subject and listens for its results, then using matchers against that Activity to check that your subject’s results are as expected. You can read the whole story here: https://gist.github.com/saxophone/961ceceea43f8501cbaf.

The problem

Activity results often inform an app’s behavior and are a common way to communicate between Activities. They are an important feature to test, but Espresso does not provide a simple interface for performing these tests. Who knows if Espresso will ever get there; until it does, the burden of creating a reusable solution is on the tester.

Making a plan

One way to approach a testing solution is to have a testing-specific activity (ResultTestActivity) start the activity you want to test (SubjectActivity) and record the result code and result data in ResultTestActivity. It is then possible to write a Matcher against ResultTestActivity matching for the result code and result data from SubjectActivity. The testing flow would be as follows:

  1. Start ResultTestActivity
  2. ResultTestActivity starts SubjectActivity
  3. Trigger SubjectActivity to finish with some result through UI actions or otherwise
  4. ResultTestActivity receives SubjectActivity’s results and stores them locally
  5. Use custom matchers to verify that ResultTestActivity’s stored results from SubjectActivity are as expected

Writing the code

Step 1 is simple: define a method that creates an Intent for ResultTestActivity and stores SubjectActivity’s intent as an extra:

private final static String EXTRA_SUBJECT_ACTIVITY_INTENT = "extraStartActivityIntent";

public static Intent createIntent(Intent subjectIntent) {
    Intent intent = new Intent(getInstrumentation().getTargetContext(),
                                ResultTestActivity.class);
    intent.putExtra(EXTRA_SUBJECT_ACTIVITY_INTENT, subjectIntent);
    intent.setFlags(Intent.FLAG_ACTIVITY_NEW_TASK);
    return intent;
}

Step 2 involves starting SubjectActivity using subjectIntent from within ResultTestActivity’s lifecycle:

private final static int REQUEST_CODE = 9999;

@Override
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    Intent startActivityIntent = getIntent().getParcelableExtra(EXTRA_SUBJECT_ACTIVITY_INTENT);
    startActivityForResult(startActivityIntent, REQUEST_CODE);
}

Step 3 is specific to SubjectActivity’s behavior and whatever will cause it to finish with a result. Two other aspects of this process are SubjectActivity-specific as well: creating subjectIntent and getting the Matcher for step 5. It is most convenient to roll these three abstractions into a single interface whose methods will be called in order to execute the test:

public interface ActivityResultTest {
    /**
     * @return the intent with the appropriate extras that will start
     * your subject activity
     */
    Intent getSubjectIntent();

    /**
     * Perform the UI actions necessary to trigger the subject
     * activity finishing with a result.
     */
    void triggerActivityResult();

    /**
     * @return a matcher for the result test activity to match the target
     * activity's result
     */
    Matcher<ResultTestActivity> getActivityResultMatcher();
}

Step 4 is again simple: catch SubjectActivity’s result and store the data:

private int mResultCode;
private Intent mResultData;

@Override
public void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    if (requestCode == REQUEST_CODE) {
        mResultCode = resultCode;
        mResultData = data;
    }
}

Step 5 calls the last method in the previously defined interface. A very useful Matcher<ResultTestActivity> is also defined for ease of use:

public static Matcher<ResultTestActivity>
receivedExpectedResult(final Matcher<Integer> resultCodeMatcher,
                       final Matcher<Intent> resultDataMatcher) {
    return new TypeSafeMatcher<ResultTestActivity>() {
        @Override
        protected boolean matchesSafely(ResultTestActivity item) {
            return resultCodeMatcher.matches(item.mResultCode) &&
                    resultDataMatcher.matches(item.mResultData);
        }
        @Override
        public void describeTo(Description description) {
            description.appendText("with result code=")
                    .appendDescriptionOf(resultCodeMatcher);
            description.appendText(" and with intent=")
                    .appendDescriptionOf(resultDataMatcher);
        }
    };
}

Putting it all together

The 5 steps can be summarized in a single static method:

public static void runActivityResultTest(ActivityResultTest test) {
    ResultTestActivity resultTestActivity = (ResultTestActivity) getInstrumentation().startActivitySync(test.getSubjectIntent());

    test.triggerActivityResult();

    assertThat(resultTestActivity, test.getActivityResultMatcher());

    resultTestActivity.finish();
}

Invoking this test is also simple. If SubjectActivity should finish with a result code of RESULT_OK and data that holds a String extra named “RESULT_STRING” with a value of “resultString”, the test becomes:

@Test
public void subjectActivityShouldReturnCorrectActivityResult() {
    runActivityResultTest(new ActivityResultTest() {
        @Override
        public Intent getSubjectIntent() {
            return new Intent(getInstrumentation().getTargetContext(), SubjectActivity.class);
        }

        @Override
        public void triggerActivityResult() {
            // Perform the appropriate actions necessary to trigger 
            // SubjectActivity's result
        }

        @Override
        public Matcher<ResultTestActivity> getActivityResultMatcher() {
            return receivedExpectedResult(is(RESULT_OK),
                 IntentMatchers.hasExtra("RESULT_STRING", "resultString"));
        }
    });
}

Caveats

This looks like it will work as-is, but there’s a high chance it won’t. Enabling the “Don’t keep activities” flag in Developer options on your test device will cause two problems. First, once your subject finishes, ResultTestActivity will launch the subject’s intent again; this is easily resolved by wrapping that logic in a null check on savedInstanceState. The harder problem is that the ResultTestActivity returned from startActivitySync() isn’t the same object that receives the result, because the activity got recreated. Instead of using that object, we have to get the current activity on the stack and test against that.
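
The first fix is a small guard in ResultTestActivity’s onCreate, along these lines:

@Override
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    // Only launch the subject on the first creation, not when the activity
    // is recreated after "Don't keep activities" destroys it.
    if (savedInstanceState == null) {
        Intent startActivityIntent = getIntent().getParcelableExtra(EXTRA_SUBJECT_ACTIVITY_INTENT);
        startActivityForResult(startActivityIntent, REQUEST_CODE);
    }
}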

These enhancements are reflected in the whole story, which you can read here: https://gist.github.com/saxophone/961ceceea43f8501cbaf.

Happy testing!

A gem for your unique constraints

A few weeks ago, we wrote about how much of a pain it is to handle unique constraints correctly in Rails and showed some code to deal with it.

Good news! We just released a very tiny gem called rescue_unique_constraint that adds this capability to your models. Now, with one line of code, you can ask your model to rescue unique constraint failures and turn them into regular model errors that can be safely rendered in your views.

You can download the gem from RubyGems (gem install rescue_unique_constraint) and GitHub.

Here’s a quick example of what it looks like:

class Thing < ActiveRecord::Base
  rescue_unique_constraint index: "my_unique_index", field: "somefield"
end

thing = Thing.create(somefield: "foo") # saves
dupe = Thing.create(somefield: "foo")  # violates my_unique_index
dupe.persisted?
=> false
dupe.errors[:somefield]
=> ["has already been taken"]

- @skwp

Database unique constraints in Rails

TLDR

ActiveRecord uniqueness validations are not good enough for distributed systems. Instead, database-level unique constraints must be used. When they are, custom logic must be implemented on the Rails side to trap these errors and report them as standard AR errors rather than exceptions.

How race conditions happen

Note that any system running more than one thread or process (even two Unicorn workers) is susceptible to this.
Here’s how a race condition occurs:

1. Thread 1 checks for presence of a record; it is false
2. Thread 2 checks for presence of a record; it is also false
3. Thread 1 and Thread 2 write to the database.
4. The database now contains duplicate rows, which can’t be re-saved because they will fail Rails validations.

How to fix this

First, add a unique index. Keep in mind that this index should be added concurrently to avoid locking up the table if you’re in a high-volume system.

As part of your migration, provide code to de-duplicate the existing entries, or the index creation will fail but leave the index partially created, forcing you to drop it prior to re-creation.

Indexes can contain WHERE clauses to reduce their scope. Below, we have an example where the index should not be checked when the rows have been deleted.

Example:
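
A sketch of such a migration (the table, columns, and index name are illustrative, and the de-duplication step is omitted):

class AddUniqueIndexToThings < ActiveRecord::Migration
  # required so Postgres can build the index with algorithm: :concurrently
  disable_ddl_transaction!

  def change
    add_index :things, [:user_id, :somefield],
              unique: true,
              algorithm: :concurrently,
              where: "deleted_at IS NULL",
              name: "index_things_on_user_id_and_somefield"
  end
end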

Second, add handling to the ActiveRecord model for capturing the database-level constraint failure. The best way we have so far is to override the `create_or_update` method, which is called by ActiveRecord during `save` and `save!`, like this:
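
A sketch of that override (again with illustrative index and field names):

class Thing < ActiveRecord::Base
  private

  # called by ActiveRecord from both `save` and `save!`
  def create_or_update(*args)
    super
  rescue ActiveRecord::RecordNotUnique => e
    case e.message
    when /index_things_on_user_id_and_somefield/
      errors.add(:somefield, :taken)
      false # `save` returns false; `save!` raises ActiveRecord::RecordNotSaved
    else
      raise
    end
  end
end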

If your table has multiple unique constraints, you can add a clause to the case statement for each index.

Note that we add the standard `taken` error to the appropriate field; this is the error Rails would normally add for a uniqueness failure. You can adjust or override the message in a translations file like this:
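
For example, in config/locales/en.yml (the model and attribute names are placeholders):

en:
  activerecord:
    errors:
      models:
        thing:
          attributes:
            somefield:
              taken: "is already in use"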

Caveats

If you have code that rescues ActiveRecord::RecordInvalid, you should realize that it’s possible for the record to be valid from the Rails standpoint but still fail database level constraints. When that happens, you will get back an ActiveRecord::RecordNotUnique or RecordNotSaved if you implemented the create_or_update override suggested above.

These are all subclasses of ActiveRecord::ActiveRecordError, which is what you should rescue if for some reason you are calling `save!` and want to rescue the result.

Till next time,
@skwp

Introducing migr8 a Concurrent Redis Migration Utility Written in Go

Here at Reverb, we’ve got quite a few places that we like to store our data. One of those places is Redis. We use Redis in quite a few ways including our job queues for Sidekiq and our analytics tracking for our internal service called Bump.

As a scrappy startup we thought to ourselves, “oh, one Redis instance should be just fine forever and ever”…until it wasn’t. Earlier this year we started looking at the rate of growth of our Redis keyspace and noticed we were quickly running out of memory. We knew something had to be done.

We came up with a plan: split Bump out into its own Redis instance. With this plan in mind, we started looking to see if anybody else had solved this problem before us. We stumbled upon this script, which was the initial inspiration for our tool Migr8. One of the first problems we noted about this script was its use of “keys *”.

Running “keys *” is a pretty bad idea if you’ve got a decent-sized data set in Redis. The command is fine to run in development or staging, but please heed our warning (we’ve made the mistake): do not run “keys *” in production. Your Redis instance will lock while trying to process the command and will likely fail in the process. Redis locks on “keys *” because it is O(n) with the number of keys; if you have a significant number of keys, lock city awaits your arrival.

Luckily we ran the command on a slave, so we just had to resync the slave with the master. You’ve been warned. :)

So after a lot of toying around with different implementations in Ruby, we decided to take a shot at writing the tool in Go. Our initial Ruby implementations were processing keys at a rate of about 100 keys per second. Ruby’s GIL (Global Interpreter Lock) prevents true thread-level parallelism, which makes it a poor choice when we want fast, concurrent code. Go has concurrency built natively into the language, and the Go implementation made our network card the bottleneck at 20k keys per second. Yeah, wow. Go is pretty fast.

At the time we had to move around 40 million keys from our main Redis instance to the new Bump instance. If we had stuck with the Ruby implementation, it would have taken us over 100 hours to migrate the keys from one instance to the other (40 million keys at ~100 keys per second). That is way too long.

Using the Migr8 utility, we were able to complete the migration in about 30 minutes. Now that’s a much more acceptable number in terms of downtime.

Migr8’s README on GitHub (linked below) includes some quick examples of how to use the utility.

Using Go for this tool was a huge win over Ruby. So now we’d like to share the tool with you in hopes that it helps you move some Redis.

Til next time,

@atom_enger

@erikbenoist

@kylecrum

Github link to migr8

 

Communicating via Code

For us at Reverb, as we’ve been growing, communicating effectively between teams and coders has been crucial to our ability to scale and to create great software that our customers love. As an organization that likes to stay small and agile, we find that one of the best ways to communicate with each other is through the code we write.

I recently got a chance to synthesize some of the ideas about communicating with code that we use here at Reverb during the Windy City Rails Conference.

Enjoy the talk, and any feedback is welcome.

Kyle (@kylecrum)

From Handcrafted to the Assembly Line: Terraforming Reverb.com

At Reverb, we’re always thinking about ways to improve our workflow. Whether it’s in our application, our customer experience or our infrastructure, we know there’s always room for improvement.

One of the areas where we still have a lot of black boxes and not-so-obvious knowledge is the infrastructure that powers Reverb.com. When I started at Reverb in November 2014, all of our servers were built and maintained by hand. I knew that this approach was not feasible if we wanted to continue scaling our platform.

My first pass at revamping the infrastructure included writing a Chef cookbook for every service and using Chef to manage the servers. This offered us a lot of benefits, such as repeatability and being able to document the actions required to configure our servers.

While we made some strides at the operating system and application level, we were still building what I like to call ‘artisanally crafted infrastructure’: networks, subnets and load balancers were all set up by hand.

As the year went on, a lot of questions arose from the team: “Why does this server have X amount of RAM? Why is X service in this subnet? Why does the load balancer listen on this port?” I knew that if I wanted to scale this platform, I had to shift the way I approached our infrastructure. I knew I had to document it entirely in code.

Enter Terraform.

Terraform has given us the ability to create and spin up new environments in just minutes. Not only do the files document the infrastructure themselves, it’s also incredibly useful to be able to tear down and spin up an entire environment by running one command: `terraform apply`.

Here’s an example of a Terraform plan that we use to set up one of our staging environments.
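
A trimmed-down sketch of the shape of that plan (the AMIs, names, and IDs are placeholders, and the RDS and ElastiCache resources are omitted for brevity):

resource "aws_instance" "example" {
  count         = 2
  ami           = "ami-123456"
  instance_type = "m3.medium"
  subnet_id     = "subnet-abc123"

  connection {
    user = "ubuntu"
  }

  # bootstrap the instance against the Chef server once it's up
  provisioner "chef" {
    server_url             = "https://chef.example.com/organizations/reverb"
    node_name              = "staging-web-${count.index}"
    run_list               = ["role[web]"]
    validation_client_name = "reverb-validator"
    validation_key         = "${file("validation.pem")}"
  }
}

resource "aws_elb" "web" {
  name      = "staging-web"
  subnets   = ["subnet-abc123"]
  instances = ["${aws_instance.example.*.id}"] # use every instance, however many there are

  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }
}

resource "aws_route53_record" "web" {
  zone_id = "ZEXAMPLE123"
  name    = "staging.example.com"
  type    = "CNAME"
  ttl     = 300
  records = ["${aws_elb.web.dns_name}"]
}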

So far, we’ve rebuilt every one of our staging environments using a similar Terraform plan. The plan brings up our load balancer, database, ElastiCache instance and the instance that will run the Reverb code. It even configures the DNS record pointing to the CNAME of the load balancer.

After the instance has been provisioned, Terraform does us another solid: it bootstraps the instance against the Chef server.

Terraform allows you to dynamically reference resources as they’re created. You’ll notice that in the load balancer resource I’m referencing ${aws_instance.example.*.id}, which is Terraform’s string interpolation (splat) syntax. Basically I’m telling the load balancer, “I don’t care how many instances there are, just use them all!”.

Another great feature of Terraform is that it allows you to generate dependency graphs so you can easily describe your infrastructure to others in a visual format:

[Terraform dependency graph for one of our staging environments]
Lastly, one of the things I’m really loving about this approach is that creating a new environment to test some crazy change is as easy as typing:

cp -r old-env/ new-env/

in Vim: %s/old-env-name/new-env-name/g

and Finally: terraform apply

Next time you find yourself logging into AWS to make a handcrafted server sandwich with an applewood smoked load balancer, ask yourself “Is this something I could document and share with my team using a Terraform plan?”.

More than likely the answer will be yes. Not only are you spreading the knowledge it took to create that piece of the Rube Goldberg machine, you’re also saving yourself hours of pain later on, figuring out how you set the damn thing up months ago.

Like what we’re doing here and want to contribute to the best place to buy music gear on the web? We’re hiring for a Jr DevOps Engineer and more!

Til next time,

@atom_enger

Stay safe while using html_safe in Rails

Whether you’re a junior dev, product designer or senior level software engineer, it’s easy to fall on your face when using `html_safe` in Rails.

The thing about this method is: it’s terribly named. I mean really, it’s a horrible name. When you call a method on an object which transforms the original object, the method name should describe the transformation which is about to happen.

The html_safe method makes you think that the transformation you’re doing to the string is actually going to be safe. It can be safe. It can be very unsafe, too.

I’m going to go on record stating that we should call this method something more sane, like html_beware. Why beware? Because as a code committer, you should be very aware of the string that you’re calling this method on. If the string contains user-controlled input of any kind, you should certainly not call html_safe on it. This method should make you think twice about what you’re doing, and calling it “safe” doesn’t make you think at all.

Let’s go over some code examples and explain exactly how html_safe works, and why it’s unsafe in certain contexts.
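
In a nutshell (a simplified illustration, not code from our app): html_safe performs no escaping at all. It just marks the string as already safe, so anything interpolated into that string reaches the browser untouched.

# html_safe just flips a flag – it does not escape anything itself
"<strong>Hello</strong>".html_safe          # SafeBuffer; <%= %> renders real markup
"<script>alert('xss')</script>"             # plain String; <%= %> escapes it to &lt;script&gt;...
"<script>alert('xss')</script>".html_safe   # SafeBuffer; the script tag reaches the DOM and runs

# the dangerous pattern: user input interpolated before the string is marked safe
"<p>Hi, #{params[:name]}</p>".html_safe     # params[:name] is rendered unescaped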

Now that we’ve looked at how to use html_safe properly, let’s look at an example of how we at Reverb fell on our face. Not too long ago we shipped some code which allowed user-controlled input to be inserted into the DOM. This resulted in a stored XSS attack, which you can see here:

[Screenshot: the injected JavaScript alert firing]

Here’s the bad code:
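
The actual diff isn’t shown here, but it boiled down to a pattern like this (the attribute name is invented):

<%# user-controlled text interpolated into a string that is then marked safe %>
<%= "<div class='callout'>#{listing.description}</div>".html_safe %>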

And here’s how we fixed it:
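
Conceptually, the fix is to escape the user-controlled piece before the string is marked safe, for example with Rails’ h helper (building the markup with content_tag works too):

<%# escape the user-controlled value; only our own literal markup is trusted %>
<%= "<div class='callout'>#{h(listing.description)}</div>".html_safe %>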

While there’s nothing inherently harmful about a JavaScript alert besides a minor annoyance, this attack vector illustrates that a user could inject any kind of HTML tag into the DOM, including script tags. That could be disastrous if the vector were used to steal session cookies or login information. Thankfully we caught this error ourselves and it was not exploited.

Keep this in mind while you’re building your next awesome project, and know exactly where the string you’re calling html_safe on comes from. Even if you’re not building something new and have inherited an older codebase, consider grepping your codebase for string interpolations combined with html_safe:

.*(\+|\}).*html_safe

So while nothing is perfect, this method name included, the lesson is that it pays to be careful about the kind of user data you’re working with. Here at Reverb, we believe in owning mistakes and fully understanding why they happened.

That being said, we also believe that nothing is perfect and mistakes will happen. If you believe you’ve found a bug on our platform, please securely and responsibly disclose it to us at security@reverb.com. We will work with you to confirm, close and patch the hole. We offer a bounty for critical bugs and swag for bugs with a lower risk profile.

Until next time, stay html_safe!

@atom_enger

@joekur

Rails and Ember Side by Side

This is not a blog post about embedding Ember CLI in your Rails app. Instead, it’s a post about how to get the two to live in harmony next to each other by separately deploying Rails and Ember, but making them feel like one app.

Our first attempt

Last week we launched our first foray into Ember – an admin-facing utility that helps us organize, curate, and police content on our site. Our admin is developed primarily in Rails, but we wanted one page to be the Ember app.

Our first instinct was to look at ways to integrate our Ember app directly into the Rails admin so it could live “inside” the page. We tried ember-cli-rails, a project that promised a lot of magic.

With a few lines of configuration, we could get Rails to compile our ember app and ship it along with our asset pipeline. Great! Ship it! But…disaster struck.

Problems with ember-cli-rails

1. It forces an ember dependency on all our Rails developers. They now need to know about npm, bower, and more in order to get their Rails app to even boot. This is sadness.

2. It bloats our Rails codebase by introducing another big hunk of code into it (an entire ember app).

3. The worst part: it turned our relatively snappy 2-minute Jenkins deploy into an 8-minute deploy (!). The issue appeared to be in the asset pipeline: something was causing a drastic slowdown in compilation, right around the time of dealing with Ember’s vendor assets (things like ember-data). Whether this is a bug in ember-cli-rails or simply the asset pipeline being the slow beast that it is remains to be seen.

We could probably get over #1 and #2 after some initial pain, but a four-fold increase in deploy times was an unacceptable tradeoff for making Ember part of our Rails app.

Solutions?

When we ran ember-cli’s preferred compilation method (ember build), the build time was just fine. In fact, on the same Jenkins box where the Rails asset pipeline took 4 minutes to concatenate assets, the Ember build took less than 20 seconds!

So we decided we were going to separate the two apps. But we still wanted it to feel like one app. Let’s get to work.

1. The Ember app should share a session with the Rails app

Because we didn’t want to deal with fancy things like OAuth or token-based authentication against our API, we could simply share session cookies with the Rails app by running on the same domain. So we decided to serve the Ember app on the same domain – https://reverb.com/app_goes_here. Living at the same domain, it shares cookies, and the Rails app sees its requests as “logged in”.

So the first thing we need to do is get it into a public directory on our existing web servers. We’ll talk about this in the deploy section below.

2. The Ember app should be environment aware so it can point to different backends

When you build your Ember app, you can pass in an environment with “ember build --environment production”. To make our app aware of the different endpoints, we added this to its config/environment.js:

https://gist.github.com/skwp/0bc41973a8952652f47d
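
The gist boils down to something like this (the host names are placeholders):

// config/environment.js
module.exports = function(environment) {
  var ENV = {
    // ...the usual ember-cli settings...
    apiHost: 'http://localhost:3000' // development default
  };

  if (environment === 'production') {
    ENV.apiHost = 'https://reverb.com';
  }

  return ENV;
};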

3. What about CSRF?

Rails comes with CSRF protection out of the box. The way it normally works is that Rails returns your CSRF token in a meta tag in the HTML it renders, and you submit forms back to Rails with that token. Ember doesn’t pull HTML from Rails, and all of its requests are asynchronous. How to fix?

1. Make Rails return the CSRF token in a cookie for Ember to read

https://gist.github.com/skwp/130d6b18ee90c1c93799
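
The Rails side is roughly this (a sketch, not the exact gist):

class ApplicationController < ActionController::Base
  protect_from_forgery with: :exception

  after_filter :set_csrf_cookie

  private

  # expose the CSRF token in a cookie the Ember app can read
  def set_csrf_cookie
    cookies['XSRF-TOKEN'] = form_authenticity_token if protect_against_forgery?
  end
end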

2. Make Ember pull that cookie and set it on every outgoing request

https://gist.github.com/skwp/f0d09dc9adac07e597bf
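
And the Ember side, again roughly (an initializer that adds the header to every jQuery request; an ember-data adapter hook would work too):

// app/initializers/csrf-token.js
import Ember from 'ember';

function readCookie(name) {
  var match = document.cookie.match(new RegExp(name + '=([^;]+)'));
  return match ? decodeURIComponent(match[1]) : null;
}

export default {
  name: 'csrf-token',
  initialize: function() {
    // send the token back to Rails on every outgoing request
    Ember.$.ajaxPrefilter(function(options, originalOptions, xhr) {
      xhr.setRequestHeader('X-CSRF-Token', readCookie('XSRF-TOKEN'));
    });
  }
};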

Done.

4. How to deploy?

Ok, now the fun part. What is an Ember app at its core? It’s just static HTML and JavaScript. We know how to deploy that – we just put it in the public dir of our Rails app, right? So all we need to do is:

1. compile the ember app (npm/bower/ember build)
2. upload it to an s3 bucket
3. tell all our servers to download it

This is not particularly polished, but you get the idea:
https://gist.github.com/skwp/92c569cb622c47d3a1b5
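
In outline (bucket names and paths invented), the script does something like:

# 1. compile the ember app
npm install && bower install
ember build --environment production

# 2. upload the build to an S3 bucket
aws s3 sync dist/ s3://reverb-ember-builds/admin-app/ --delete

# 3. on each web server, pull it into the Rails public dir
aws s3 sync s3://reverb-ember-builds/admin-app/ /var/www/reverb/current/public/admin-app/
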
Done.

5. Bonus: make it feel like part of the app

I’ll describe this one instead of giving you code. We wanted the ember app to have the same “layout” as the rest of our admin interface. Ajax to the rescue: just make a controller to render a partial and have ember pull it using jQuery.load into a div of your choice. Style it similar to your Rails app, and the illusion is complete.

One thing to note is that the Ember app is currently fully self-contained in terms of assets. So in order to mimic the look and feel of our admin (which was based on Bootstrap), we had to pull Bootstrap into the Ember project. In the future, we may want to pull assets from Rails to avoid duplicating CSS. We have some ideas on how to do this using a controller that serves up the asset paths via an API, but we’ll blog about that once we have a working prototype.

Yan Pritzker – @skwp

Organizing your Grape API endpoints

The following is taken from a Reverb Architecture Decision Document

TLDR

Grape endpoints (classes inheriting from Grape::API) are basically equivalent to Rails controllers. As such, they can contain many unrelated methods (index/show/delete/create). As they grow, the code becomes harder to maintain because helper methods usually only apply to one of the endpoints, similar to Rails controller private methods.

Decision

Grape endpoints should be delivered as independent classes for each action. For example, instead of:

    
# app/api/reverb/api/my_resource.rb
class MyResource < Grape::API
  get '/something' do
  end

  post '/something' do
  end
end

Create separate classes (and files) for each verb:


# app/api/reverb/api/my_resource/index.rb
module MyResource
  class Index < Grape::API
    get '/something' do
    end
  end
end

# app/api/reverb/api/my_resource/create.rb
module MyResource
  class Create < Grape::API
    post '/something' do
    end
  end
end

This allows us to define helper methods in each endpoint specific to that endpoint. Additionally, prefer creating model classes to one-off helper methods for endpoints when appropriate.

Positive Programming with Junior Devs

Hello, World. I’m Tam, and I am writing to you fresh from my third week on the engineering team at Reverb. I also just crossed into my second year as a professional programmer. Milestones! Growth! Vim!

I think of myself as an experienced novice. Thanks to my origins in a programming bootcamp, I know a lot of other people in my boat. It’s becoming more of a ship, actually — a sizable fleet, and we are crash-landing at your company in numbers never-before-seen! Prepare thyself accordingly:

Kindness
The first few days I showed up, different team members took me out to lunch. They all already knew my name. This made me feel welcome, which goes a long way in those strange first days.

Transparency
Within my first week, I received a document: “Expectations of Junior Developers.” This inspired my trust and confidence: they have invested time and thought into how they can smoothly onboard me. It also gave me a roadmap to judge my own progress. Building self-sufficiency feels good; provide people with tools that they may do so.

Patience
We share vim configurations here, and one of our key mappings is [,][t]. It maps to fuzzy file searching. Now, I have been typing since I was 10. I can type really quickly! But every comma I’ve ever typed has been followed by a whitespace.  Do you have any idea how many times I screwed up typing comma-t while my pair waited? We likely spent an entire collective day waiting on my fumbling fingers. I couldn’t even remember the keystrokes at first. Herein lies an opportunity for immense frustration on all sides. I urge you, experienced team member, to have patience. You are in a leadership position. If you get too frustrated too quickly, your junior stands no chance. Be patient: they are trying really hard, and it is exhausting.

We can teach you things
This week I unintentionally taught our CTO that you can split a git hunk. That was really exciting! There is a lot to know about software development. If you stay receptive, we may be able to teach you something in return.

The bottom line is, you have to be excited that we’re here. Every junior I know is thrilled, nervous, and doing everything they can to stay afloat. If you’ve screened them, you know they have potential. Try not to get in the way!

To the juniors of the world, don’t be afraid. You can do this. Find a supportive environment, keep friends close, and … Go!

@tamatojuice

Making inheritance less evil

Sometimes you come up against a problem that just seems to want to be solved with inheritance. In a lot of cases, you can get away from that approach by flipping the problem upside down and injecting dependencies. Sandi Metz’s RailsConf talk “Nothing is Something” does a great job of exploring this concept in a really fun way.

But if you have decided that inheritance is truly the right approach, here is something you can do to make your life just a little easier. It’s called DelegateClass.

Let’s quickly summarize a few reasons why inheritance is evil, especially in Ruby:
1. You inherit the entire API of your superclass including any future additions. As the superclass grows, so do the subclasses, making the system more tightly coupled as more users appear for your ever-growing API.
2. You can access the private methods of your superclass (yes, really). This means that refactorings of the superclass can easily break subclasses.
3. You can access the private instance variables of your superclass (yes, really). If you set what you think are your own instance variables, your superclass implementation can overwrite them.
4. You can override methods from the superclass and supply your own implementation. Some think this is a feature (see: template method pattern), but almost always this leads to pain as the superclass changes and affects every subclass implementation. You can invert this pattern by using the strategy pattern, which solves the same problem through composition.

Sometimes, though, there are legitimate situations where you want to inherit the entire interface of another object. A realistic example from Reverb is our view model hierarchy, where various search views are all essentially “subclasses” of a parent view object that defines the basics every view uses, and each view can then define additional methods.

In these cases, one of the cleanest solutions is the DelegateClass pattern in Ruby. This is basically a decorator that delegates all missing methods to the wrapped object, just like inheritance would, but without giving you access to that object’s private methods or instance variables.

Check out this example that illustrates both classical and DelegateClass-based inheritance:
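
The class names below are invented for illustration (not our actual view models):

require 'delegate'

class SearchView
  def initialize(results)
    @results = results
  end

  def total_hits
    @results.size
  end

  private

  def secret_internal_state
    "visible to classical subclasses"
  end
end

# Classical inheritance: the subclass inherits everything, including
# SearchView's private methods and instance variables.
class ProductSearchView < SearchView
  def heading
    "#{total_hits} products (and I can peek at #{secret_internal_state.inspect})"
  end
end

# DelegateClass: public methods are forwarded to the wrapped SearchView,
# but its private methods and instance variables stay out of reach.
class CmsSearchView < DelegateClass(SearchView)
  def heading
    "#{total_hits} articles"
    # secret_internal_state would raise NoMethodError here
  end
end

view = CmsSearchView.new(SearchView.new([1, 2, 3]))
puts view.total_hits # => 3 (delegated)
puts view.heading    # => "3 articles"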

– Yan Pritzker (@skwp)