adrift on a cosmic ocean

Writings on various topics (mostly technical) from Oliver Hookins and Angela Collins. We have lived in Berlin since 2009, have two kids, and have far too little time to really justify having a blog.

Golang, testing and HTTP router package internals

Posted by Oliver on the 18th of January, 2015 in category Tech
Tagged with: encapsulationgolangprogrammingtestingyaks

We have an internal service that takes requests of the form something like /foo/{ID}/bar which basically is expected to generate data of the form "bar" about the ID entity within the collection "foo". Real clear! Because this service has a bunch of similar routes that have a resource identifier as part of the request path, we use the Gorilla Mux package. There are a lot of different approaches to HTTP request routing, many different packages (and some approaches recommending you use no external packages)

  • this is out of scope of this article!

This week we realised that for a given endpoint, one other service that calls ours was URL-encoding the resource ID. There are two parts to this that need to be stated:

  • IDs are meant to be opaque identifiers like id:12345 - so they have at least one colon in them.
  • A colleague later pointed out that encoding of path parts is actually up to the application to define (although I haven't read the part of the spec that says this, recently). URL-encoding thus is not strictly necessary except when it comes to query parameters, but in this case we have something being encoded and need to deal with it.

The fix is relatively simple. We already had some code pulling out the ID:

id := mux.Vars(request)["id"]

However, this was just passing the ID to another part of the code that validated that the ID is properly formed, with a regular expression. It was expecting, again, something like id:12345 and not id%3A12345. So we introduce a very simple change:

    encodedID := mux.Vars(request)["id"]
    id, err := url.QueryUnescape(encodedID)
    if err != nil {
        return err
    }

OK, this will work, but we should test this. We've introduced two new "happy paths" (receive a request with the ID not encoded and the decoded version is exactly the same, receive a request with the ID encoded, and the decoded version is what we expect for later validation) and one new "unhappy path" where the ID is malformed and doesn't decode properly. The problem here is that we need to test this function and pass in a request that has an appropriate ID for the path we are trying to test.

Already, this sets us down the road to the yak shaving shop. We can assemble a request ourselves, and attempt to set up all of the right internal variables necessary to be able to pull them out again inside our own function in the way it expects. A slightly harder way would be to set up a test server with httptest, set up the routing with mux and make a real request through them, which will ensure the request that makes it to our function under test is as real as can be. However we are now also effectively testing the request handling and routing as well - not the minimum code surface area.

As it turns out, neither of these options are particularly good. I'll start with the latter. You start up a test server and assemble a request like so (some parts have been deliberately simplified):

testID := "id%XX12345" // deliberately malform the encoded part
ts, _ := httptest.NewServer(...)
url := fmt.Sprintf("%s/foo/%s/bar", ts.URL, testID)
resp, err := http.Get(url)

Uh-oh, the malformed encoded characters will be caught by http.Get() before the request is even sent by the client. The same goes for http.NewRequest() We can't test like this, but what if we assemble the request ourselves?

serverAddr := strings.Split(ts.URL, "http://")[1]
req := &http.Request{
    Method: "GET",
 URL: &url.URL{
     Host:   serverAddr,
        Scheme: "http",
        Opaque: "/foo/id%XX12345/bar",
 },
}
resp, err := http.DefaultClient.Do(req)

We can send this request and it will make it to the server, but now the request handling on the server side will parse the path and catch the error there - it still won't make it to the function under test. We could write our own test server that has its own request handling wrapped around a TCP connection, but that's far more work. We also have to determine whether the request has succeeded or failed via the test server's response codes (and possibly response body text) which is really not ideal.

So, onto testing with a "fake" request. Looking back at our code, we notice that we are pulling the ID out of the request via mux.Vars(request)["id"]. When you don't use any request routing package like this, basically all request path and query parameter variables are accessible directly on the request object, but my suspicion was that mux.Vars didn't just simply wrap around data already in the request object but that it stored it elsewhere in a different way. Looking at the mux code, it actually uses a data structure defined in the context package, a very gnarly nesting of maps. The outer level keys off the request pointer, and each unique request pointer will have a map containing different "context keys" - either varsKey or routeKey depending on where the parameters are coming from (but I'll not dive into that in this article).

The part of the request we are interested in is grouped under varsKey, which is the first iota constant, so it is 0. We can use the context.Set() to set the appropriate data we want to fake out in our request, with a relatively bizarre invocation:

type contextKey int
const (
  varsKey contextKey iota
  routeKey
)
context.Set(request, varsKey, map[string]string{"id": "id%XX12345"})

This appeared to be enough to work, but inevitably the test would fail due to the result of mux.Vars(request)["id"] being an empty string. I added some debugging Printf's to the mux and context packages to print out what was being set and what was being accessed, and universally it looked like what was created should have been correct:

map[0x10428000:map[0:map[id:id%XX12345]]]

The request pointer keying into the top-level map was the same in both cases, but the map of parameter names to values was only there when setting it in the test - what mux.Vars() was accessing simply didn't have them.

The problem is of course simple. The mux package is keying into the second-level map with a variable of value 0 but type mux.contextKey. I was attempting to fool it by keying into the map with a main.contextKey of the same value. The only reason this worked at all was due to the inner map of data being map[interface{}]interface{} - effectively untyped - the two zero-valued variables of different types (and even then, only different by virtue of being in different packages) did not collide and hence there was no way to get out the value I had previously set.

Since mux.contextKey is not exported, there is actually no way to fake out that request data (well, I'm sure it can be done with some reflect package magic, but that's definitely indicative of code smell). The end result was that this small code change is untestable in the unhappy path. I'm still relatively sure nothing unexpected will happen at runtime since the request handling above the function will catch malformed encodings, and some alternatives do exist, such as doing this kind of decoding in its own handler wrapping around what we already have set up, or not using the mux package in the first place and simplifying our request routes.

It is yet again, a great example of why sometimes the simplest changes can take the most time, discussion, agonising over testing methodologies and greatest personal sacrifice of software development values. The only reason I didn't spend any longer on it (and I definitely could have) was because it was blocking other teams from progressing on their own work (outside of the obvious wastage in my own productive time).

© 2010-2018 Oliver Hookins and Angela Collins