
Golang, testing and HTTP router package internals

by Oliver on Sunday, January 18th, 2015.

We have an internal service that takes requests of the form /foo/{ID}/bar, which is expected to generate data of the form “bar” about the ID entity within the collection “foo”. Real clear! Because this service has a bunch of similar routes that carry a resource identifier in the request path, we use the Gorilla Mux package. There are a lot of different approaches to HTTP request routing and many different packages (and some approaches recommend using no external packages at all), but that is out of scope for this article!

This week we realised that for a given endpoint, one other service that calls ours was URL-encoding the resource ID. There are two parts to this that need to be stated:

  • IDs are meant to be opaque identifiers like id:12345 – so they have at least one colon in them.
  • A colleague later pointed out that encoding of path parts is actually up to the application to define (although I haven’t read the part of the spec that says this, recently). URL-encoding thus is not strictly necessary except when it comes to query parameters, but in this case we have something being encoded and need to deal with it.

The fix is relatively simple. We already had some code pulling out the ID:

id := mux.Vars(request)["id"]


However, this was just passing the ID to another part of the code that validated that the ID is properly formed, with a regular expression. It was expecting, again, something like id:12345 and not id%3A12345. So we introduce a very simple change:

    encodedID := mux.Vars(request)["id"]
    id, err := url.QueryUnescape(encodedID)
    if err != nil {
        return err
    }


OK, this will work, but we should test this. We’ve introduced two new “happy paths” (receive a request with the ID not encoded and the decoded version is exactly the same, receive a request with the ID encoded, and the decoded version is what we expect for later validation) and one new “unhappy path” where the ID is malformed and doesn’t decode properly. The problem here is that we need to test this function and pass in a request that has an appropriate ID for the path we are trying to test.
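
Those three paths are easy to see once the decoding is looked at in isolation. Here is a minimal sketch, assuming a hypothetical helper called decodeID (not part of our service) that does nothing more than wrap url.QueryUnescape:

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeID is a hypothetical helper for this sketch only. It wraps
// url.QueryUnescape so the three paths can be tried with plain strings.
func decodeID(encoded string) (string, error) {
	return url.QueryUnescape(encoded)
}

func main() {
	inputs := []string{
		"id:12345",   // happy path: not encoded, comes back unchanged
		"id%3A12345", // happy path: encoded colon is decoded
		"id%XX12345", // unhappy path: %XX is not a valid escape
	}
	for _, in := range inputs {
		id, err := decodeID(in)
		fmt.Printf("%q -> %q, err=%v\n", in, id, err)
	}
}
```

One thing worth keeping in mind: url.QueryUnescape also turns a "+" into a space, which would matter if an ID could ever contain a plus sign. Newer Go releases (1.8 and later) also have url.PathUnescape, which leaves "+" alone and may be a better fit for path segments.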

Already, this sets us down the road to the yak shaving shop. We can assemble a request ourselves, and attempt to set up all of the right internal variables necessary to be able to pull them out again inside our own function in the way it expects. A slightly harder way would be to set up a test server with httptest, set up the routing with mux and make a real request through them, which will ensure the request that makes it to our function under test is as real as can be. However we are now also effectively testing the request handling and routing as well – not the minimum code surface area.

As it turns out, neither of these options is particularly good. I’ll start with the latter. You start up a test server and assemble a request like so (some parts have been deliberately simplified):

testID := "id%XX12345" // deliberately malform the encoded part
ts := httptest.NewServer(...) // NewServer returns a single *httptest.Server
defer ts.Close()
reqURL := fmt.Sprintf("%s/foo/%s/bar", ts.URL, testID)
resp, err := http.Get(reqURL)


Uh-oh, the malformed encoded characters will be caught by http.Get() before the request is even sent by the client. The same goes for http.NewRequest(). We can’t test like this, but what if we assemble the request ourselves?

serverAddr := strings.TrimPrefix(ts.URL, "http://") // host:port of the test server
req := &http.Request{
	Method: "GET",
	URL: &url.URL{
		Host:   serverAddr,
		Scheme: "http",
		Opaque: "/foo/id%XX12345/bar",
	},
}
resp, err := http.DefaultClient.Do(req)


We can send this request and it will make it to the server, but now the request handling on the server side will parse the path and catch the error there – it still won’t make it to the function under test. We could write our own test server that has its own request handling wrapped around a TCP connection, but that’s far more work. We also have to determine whether the request has succeeded or failed via the test server’s response codes (and possibly response body text) which is really not ideal.
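
The client-side half of this is easy to reproduce without a server at all. A minimal sketch, assuming a placeholder host (example.com) and a hypothetical helper name, clientRejects; nothing is actually sent over the network:

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// clientRejects is a hypothetical helper for this sketch. It reports whether
// both url.Parse and http.NewRequest refuse the given URL outright.
func clientRejects(rawURL string) bool {
	_, parseErr := url.Parse(rawURL)
	_, reqErr := http.NewRequest("GET", rawURL, nil)
	return parseErr != nil && reqErr != nil
}

func main() {
	// Malformed escape: rejected while parsing, before any request exists.
	fmt.Println(clientRejects("http://example.com/foo/id%XX12345/bar")) // true

	// Well-formed escape: accepted.
	fmt.Println(clientRejects("http://example.com/foo/id%3A12345/bar")) // false
}
```

Since http.NewRequest parses the URL string it is given, any test that builds its request from a string hits this wall, which is why the hand-built http.Request with an Opaque path was needed to get the bytes onto the wire.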

So, onto testing with a “fake” request. Looking back at our code, we notice that we are pulling the ID out of the request via mux.Vars(request)["id"]. When you don’t use any request routing package like this, basically all request path and query parameter variables are accessible directly on the request object, but my suspicion was that mux.Vars didn’t just simply wrap around data already in the request object but that it stored it elsewhere in a different way. Looking at the mux code, it actually uses a data structure defined in the context package, a very gnarly nesting of maps. The outer level keys off the request pointer, and each unique request pointer will have a map containing different “context keys” – either varsKey or routeKey depending on where the parameters are coming from (but I’ll not dive into that in this article).
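
To make that nesting a bit more concrete, here is a simplified sketch of the shape being described. It is not the actual gorilla/context source; the names set, get and demo are my own, and string keys are used purely for illustration:

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// The outer map keys off the request pointer; each request gets its own
// inner map from "context key" to value.
var (
	mu   sync.RWMutex
	data = make(map[*http.Request]map[interface{}]interface{})
)

// set stores val under key for the given request.
func set(r *http.Request, key, val interface{}) {
	mu.Lock()
	defer mu.Unlock()
	if data[r] == nil {
		data[r] = make(map[interface{}]interface{})
	}
	data[r][key] = val
}

// get returns the value stored under key for the given request, or nil.
func get(r *http.Request, key interface{}) interface{} {
	mu.RLock()
	defer mu.RUnlock()
	return data[r][key] // reading from a missing (nil) inner map yields nil
}

// demo stores vars against one request and looks them up on two requests.
func demo() (hit, miss interface{}) {
	a, b := &http.Request{}, &http.Request{}
	set(a, "vars", map[string]string{"id": "id:12345"})
	return get(a, "vars"), get(b, "vars")
}

func main() {
	hit, miss := demo()
	fmt.Println(hit)  // map[id:id:12345]
	fmt.Println(miss) // <nil>
}
```

The point of the sketch is only that the data lives beside the request, in package-level state, rather than on the request object itself. That is why a hand-assembled request carries none of it.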

The part of the request we are interested in is grouped under varsKey, which is the first iota constant, so it is 0. We can use context.Set() to set the appropriate data we want to fake out in our request, with a relatively bizarre invocation:

// Mirror mux's unexported key type and constants in the test package.
type contextKey int
const (
	varsKey contextKey = iota
	routeKey
)
context.Set(request, varsKey, map[string]string{"id": "id%XX12345"})


This appeared to be enough to work, but inevitably the test would fail due to the result of mux.Vars(request)["id"] being an empty string. I added some debugging Printf’s to the mux and context packages to print out what was being set and what was being accessed, and universally it looked like what was created should have been correct:

map[0x10428000:map[0:map[id:id%XX12345]]]


The request pointer keying into the top-level map was the same in both cases, but the map of parameter names to values was only there when setting it in the test – what mux.Vars() was accessing simply didn’t have them.

The problem is of course simple. The mux package is keying into the second-level map with a variable of value 0 but type mux.contextKey. I was attempting to fool it by keying into the map with a main.contextKey of the same value. The only reason this compiled and ran at all is that the inner map is a map[interface{}]interface{}, which accepts keys of any type. Interface keys are compared by dynamic type as well as by value, so the two zero-valued keys of different types (different only by virtue of being declared in different packages) were distinct entries that did not collide, and hence there was no way for mux.Vars() to get out the value I had previously set.
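
The effect can be reproduced in a few lines. This is a sketch under one simplifying assumption: both key types live in the same package here, with muxKey standing in for mux.contextKey and testKey for the copy declared in the test (both names are hypothetical):

```go
package main

import "fmt"

// muxKey stands in for mux's unexported contextKey; testKey stands in for
// the look-alike type declared in the test. Same underlying type, same value.
type muxKey int
type testKey int

// lookup stores a value under testKey(0) and tries to read it back both ways.
func lookup() (viaTestKey, viaMuxKey bool) {
	inner := map[interface{}]interface{}{}
	inner[testKey(0)] = map[string]string{"id": "id%XX12345"} // what the test stored

	_, viaTestKey = inner[testKey(0)] // same type and value: found
	_, viaMuxKey = inner[muxKey(0)]   // same value, different type: not found
	return viaTestKey, viaMuxKey
}

func main() {
	viaTestKey, viaMuxKey := lookup()
	fmt.Println(viaTestKey, viaMuxKey) // true false

	// With %v both keys print as plain 0, which is why the debug output
	// looked correct. %T shows the difference.
	fmt.Printf("%v %v\n", testKey(0), muxKey(0)) // 0 0
	fmt.Printf("%T %T\n", testKey(0), muxKey(0)) // main.testKey main.muxKey
}
```

Printing the keys with %T or %#v instead of %v in my debugging Printf’s would have exposed the mismatch straight away.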

Since mux.contextKey is not exported, there is actually no way to fake out that request data (well, I’m sure it can be done with some reflect package magic, but that’s definitely indicative of code smell). The end result was that this small code change is untestable in the unhappy path. I’m still relatively sure nothing unexpected will happen at runtime since the request handling above the function will catch malformed encodings, and some alternatives do exist, such as doing this kind of decoding in its own handler wrapping around what we already have set up, or not using the mux package in the first place and simplifying our request routes.

It is, yet again, a great example of why sometimes the simplest changes can take the most time, discussion, agonising over testing methodologies and greatest personal sacrifice of software development values. The only reason I didn’t spend any longer on it (and I definitely could have) was because it was blocking other teams from progressing on their own work (outside of the obvious wastage in my own productive time).


Setting goals for learning for 2014

by Oliver on Tuesday, February 18th, 2014.

Perhaps a little late in the year to be conducting a personal retrospective on the years past, but I feel at this point I’m starting to wonder about the challenges ahead. The last two to three years I’ve distinctly changed my career direction from systems engineering, to “DevOps” (whatever that means anymore), to developer. Sure, I’m technically a Computer Scientist by tertiary education standards but I’ve been outside of that field for a large part of my professional career. I’ve now almost completed the book Seven Languages in Seven Weeks – not just reading through it, but dedicating enough time to absorbing the material and implementing all of the problems given. In actual day-to-day programming terms I keep myself busy largely with Golang, occasionally Ruby, and occasionally ActionScript. Perhaps with exception of ActionScript I find myself solidly in the realm of “backend languages”.

That’s a pretty fair assessment, as I am employed as a backend engineer. From time to time I do need to delve into Javascript and front-end tasks, and I feel like everything goes to pieces. It conjures up the same feelings I had when working as a systems administrator, seeing an error and diving into the (usually C) codebase, only to stare at the screen utterly confused and not knowing what to do. The spirit was willing but the flesh was weak: and I feel the same way when getting into Javascript and/or front-end development territory.

Not influencing my thoughts around this, but also not entirely unrelated is this blog post by Ian Bicking (of SQLObject, Paste, virtualenv and pip fame (as well as many other excellent pieces of software)). Ian expresses some interesting points including “the browser seems like the most interesting platform” which does resonate with me – the HTML5 media realm is where a lot of my time is spent but without really understanding what is going on. For that reason I’m dedicating (at least a significant amount of) my mental space in 2014 to Javascript and front-end development learning.

If you’ve been writing a personal web page (or perhaps on behalf of a friend) and been stuck using tables or frames, or resorted to using Twitter Bootstrap out of frustration and even then not really knowing what you are doing, you’ll understand the desire to know more of how all that web magic works. I’m totally happy writing an API in some of the previously mentioned languages, but when it comes to actually making something that works in the browser that doesn’t look like it’s been transported from 1995, well – there’s something more to be learned.


Seven Languages – Clojure

by Oliver on Monday, February 10th, 2014.

I notice my pace has yet again slowed between the last chapter of the book – Erlang – and this one. Another five months have passed since I finished the chapter on Erlang! In actual fact, I haven’t been slaving away on the next language that whole time – decompression of sorts has to follow each chapter, and dealing with a manic three-year-old, finding some time for a bit of exercise and trying to learn a spoken language (German) all take a decent amount away from my free time.

The sixth chapter of Seven Languages in Seven Weeks is Clojure – a challenging language, but after getting through the previous five chapters this one only took me about three weeks of real world time (spent on-and-off) to conquer the last exercise of the chapter.

Since I tend to ramble on about the experiences I had while learning the new language, I’m going to break it down into a series of (hopefully) short points – what I liked about it and what I disliked. Do bear in mind that I’m no expert in Clojure, with only a brief learning period dedicated to it.

What I liked:

  • It seems to have everything. The transactional memory support, power/libraries/community of the JVM, and many programming paradigms baked into the one language. This felt a lot like my experience with Scala, and I’m not sure if it is due to the JVM powering the runtime or the intents of the creators of the language.
  • What I learned previously about how to best utilise recursion from Prolog and Erlang was also quite applicable here (albeit in slightly different form using loop/recur).
  • The Leiningen tool and its REPL make getting into Clojure relatively easy, without having to initially bother with much of the JVM-required compilation/classpathery stuff (which frankly, I still don’t understand).
  • After just a small amount of time, the initial perception of it all being a mountain of parentheses dissipates reasonably quickly (but not entirely). Prefix notation is actually not that bad.

What I disliked:

  • Despite my last point in the section above, parentheses and punctuation remain a big problem to newcomers to the language. If you are not used to Lisp-based languages, there is a big learning curve here. Similar to Scala, I found the large amount of other punctuation (which is used extensively in the language core as well) to be quite hard to understand. Some areas that provide interoperability with Java also have their own unique operators which makes it even harder to wrap your head around.
  • There are often several ways to do things which are not obvious to a newcomer (e.g. creating classes with deftype vs defrecord vs a regular map, or when to use atoms vs refs vs other blocking data structures from the Java library). Some are still listed as experimental alpha features. Fortunately there are plenty of resources out there, either via Google or Stack Overflow.
  • The language is powerful and sophisticated, but I think this requires a corresponding amount of sophistication on the part of the programmer to use it without constructing a monstrosity. Macros take a while to wrap your head around (and I still couldn’t tell you with certainty exactly when things need to be quoted and when not).
  • Without being very familiar with Java (and its libraries) or the JVM, I felt at a disadvantage. I think a lot of parts of Clojure and Scala are framed in terms of how they wrap around the JVM, or solve a Java problem in a better or more understandable way than simply standing on their own. If you want to use the extensive Java interoperability then you have no choice but to learn how that works and its requirements (and with such extensive facilities on the Java side, it frequently makes sense to use the Java interop).
  • To me it just doesn’t feel like a great general-purpose language, but that is probably just because it seems quite academic. I can’t imagine doing very rapid iteration web-app development in it, for example (although I know some people at my work that are doing just that). I guess what it comes down to, is that you would need a lot more experience in this language than you would if you were to pick up Ruby and start developing with Rails for example.

If this all seems like I’m not in favour of the language, that’s not the case at all. Despite its challenges, I see Clojure as a very tempting and powerful language. If I were suddenly in a position where I had to do 100% of my coding in this language, I would see it as a good thing. For the moment though, there are simpler languages that accomplish everything that I need, and I don’t feel the desire to become an expert in every language I have managed to familiarise myself with.

Sidebar: Spoken vs Programming Languages

After doing this much study on a variety of programming languages I don’t use on a day-to-day basis, and having been learning German for a few years now (with varying levels of dedication) I’ve naturally been comparing how learning and knowledge of the two different types of language differs. I’ll preface everything I say below with the fact that I’m not a linguist and haven’t researched this topic academically whatsoever.

Firstly, there exists a certain type of programmer, computer nerd, systems engineer, etc. that will list (somewhat facetiously) their known languages (e.g. on Facebook, LinkedIn etc.) like this – English, German (or some other spoken language), Pig Latin, C, Python etc. etc. Maybe even Klingon. Their argument is that all languages are equivalent and that they know C just as well as they do English. The intent of listing languages in these data fields is usually just for natural spoken languages, but they have mixed the two “types” of language together.

To the majority of us, this argument is plainly false. I recall briefly reading some discussion on this from actual linguists, and at a purely biological level, using spoken languages and computer languages exercise completely different parts of the brain. There are different amounts of reasoning, analysis and plain communication going on depending on whether you are speaking to another human being or expressing an algorithm to a computer.

The grammar of spoken languages is complex, has many exceptions, idioms, and is constantly evolving, whereas in computer languages it is extremely well defined, seldom changes and must be understood by the computer and programmer in 100% of cases. Spoken languages have tens or hundreds of thousands of words, whereas computer languages often have just dozens or hundreds of identifiers at their core. Fluency is defined in a spoken language as basically needing no assistance to communicate with anyone in that language, whether it be spoken or written; even warping the language outside of its usual boundaries while remaining understood by other fluent speakers. Fluency in a computer language, it could be argued, might still permit a user of the language to consult references from time to time. Computer languages are also almost exclusively written, permitting more leisurely consideration of the correct grammar, syntax and vocabulary with which to express one’s self.

This seems like a fairly compelling argument for the two types of language to be vastly different, but recently I’ve been thinking more and more about another level of similarities beyond those points I’ve raised above. I would argue that true fluency in a computer language would in fact allow you to converse (perhaps not exclusively) with another fluent “speaker” of that language in actual spoken words, without aid of references. Anyone who has taken an interview at Google would know the requirement for whiteboarding a solution to a given problem in the language of your choice. You have no option but to be able to express yourself instantaneously, without references, and without making any mistakes – much like natural spoken languages.

Once you take into account all of the standard libraries, commonly used libraries outside of that, frameworks, extensions, plugins etc of a given computer language, the vocabulary is extended dramatically past the dozens or hundreds of words barrier. You can even draw a parallel between learning a given framework in a computer language, and becoming a specialist in a given occupational field – medicine for example introduces a new range of specialist language, just as the next new web-app framework might in your computer language of choice.

When speaking a computer language, the barrier for understandability is actually in some ways higher than for natural spoken languages with a human partner. A human has the benefit of context, shared common knowledge and culture, observable body language, and can grant understandability concessions when the grammar, vocabulary or syntax is not entirely correct but can be inferred. A computer knows none of these and will not accept anything less than 100% accuracy.

Computers are hard, cold, reasoning machines and computer languages are expressly designed to convey meaning as efficiently as possible and with little room for interpretive error. Spoken languages are the result of centuries or millennia of evolution and culture, not to mention the development and psychology of the human brain itself. In some ways it is amazing that they are able to be compared at all, given their origins are so vastly different.

After dedicating my little free time over the last three weeks to Clojure it is now back to German until I have finished the current teaching book I’m working through. The unifying factor for me personally is that I find learning both spoken and computer languages challenging, mind-bending but exciting. I have no intention of becoming “fluent” in more than a very small amount of programming languages (a passing familiarity is probably sufficient) but I would be significantly upset if I never become fluent in German.

On a related note, if you haven’t yet checked out Hello World Quiz, it is frustrating but simultaneously a lot of fun 🙂


Seven Languages – Erlang

by Oliver on Friday, September 27th, 2013.

Well, well, well. One virtual “week” later, but two real world months later, I have finished the chapter on Erlang from Seven Languages in Seven Weeks. Based on the Git repository where I am recording my coding efforts through the book, I’ve been working at it for almost 11 months, which is slightly disappointing, but I have noticed during the chapter on Erlang a subtle shift in how my brain is functioning. Perhaps it is Erlang, perhaps it is the book as a whole, but I do notice that I am considering programming languages in a more balanced, perhaps philosophical way.

After five chapters it is starting to become evident that the languages were presented in a very well-crafted order. Ruby is undeniably the easiest and most familiar to the hordes of programmers out there largely experienced in imperative or even vaguely C-like syntax languages. As a side-note, since we are hosting several RailsGirls teams in the building, we recently started a “book club” going through this book chapter by chapter with these new software developers and some of my coworkers. It is proving very stimulating discussing languages, especially when we have different opinions about them and different experiences. I’m looking forward to hearing others share their delight in features I’ve hated, and vice versa.

So far we have only just finished Ruby and started on Io (which I covered what feels like an age ago). From there it gets progressively more diverse, more challenging by the chapter but each time you reach a new language your thinking has been altered in such a way that the challenges are within reach. This is perhaps a bit abstract so I’ll get back to talking about Erlang, and perhaps my point will become clear.

I find it terribly ironic that I almost bitterly finished the chapter on Prolog, thankful I’d never see that syntax or computing model again, and found it staring me in the face with Erlang. Erlang originally grew out of Prolog so a lot of the syntax is very similar, but somehow after conquering the chapter on Prolog previously, the problems framed around learning Erlang didn’t seem as hard.

That being said, familiarity with the basic syntax only got me so far. The concurrency model, OTP services and supervisors/monitors are a lot to wrap your head around and I found this chapter utterly mind-bending, but in a good way. I gave myself plenty of time (about two months in total) but made sure I finished every exercise, and I believe it was well worth the effort, as I feel I have a much better understanding of how it all works now. It’s not a language I would reach for instinctively, but at least I feel I could navigate my way around it (and we do in fact have some services written in Erlang at SoundCloud). It does intrigue me to know that robust services such as CouchDB and the latest Chef Server have been written successfully in this language.

What specifically did I find interesting about the language?

  • The use of Prolog-inspired pattern-matching, guard statements and the catch-all underscore all make for some very interesting ways of expressing data-driven decision-making or branching.
  • Function overloading by using different pattern matchers in the input parameters.
  • The actor- and message-based concurrency model.
  • The various ways of linking and monitoring processes to ensure they robustly resist entire system failure.
  • Atoms. I’m not sure what I make of them exactly – they feel a bit like symbols in Ruby, but evidently have a much wider namespace as you can register a process with an atom as its name, then refer to it in different places to where you registered it. But they are definitely convenient.
  • The fact that you can send messages asynchronously or synchronously, either by expecting a response only as a return message to another process, or by waiting for a return value.
  • The slightly different syntax to io:format.
  • How you have to do something that feels like force-quitting the Erlang VM to exit (Ctrl-C then q).

Of course these are all observations based on very beginner-usage of Erlang. It has its differences which I find both challenging and interesting, not necessarily flaws. If you’re playing along at home, feel free to check out the code I managed to create while working through this chapter, and of course I’m always interested to read your opinions on the subject so feel free to leave a comment. Onward and upward to Clojure!


Seven Languages – Scala

by Oliver on Friday, July 26th, 2013.

Yet another instalment in my journey through Seven Languages in Seven Weeks – this time on Scala. The iteration period has gone down significantly to almost an actual week, so that’s some marked improvement on previous chapters!

I would say that I haven’t even really used Java seriously – what little I did in my university time I’ve forgotten, and I don’t believe I was ever competent in the language. So I’m approaching Scala from what I imagine is a fairly different vector to the typical Java “refugee” (yes, there are actually many blog posts using that exact term). That being said, I found Scala to have a somewhat similar syntax to Ruby and thus familiar in a sense. The strict typing is welcome after programming for quite a few months in Go and the language clearly has some power behind it.

Much like the sentiment in the book in the closing of the Scala chapter, I also found some of the more complicated syntax hard to wrap my head around. The idiomatic patterns seem to leave out parentheses, dots and introduce all manner of curly bracket blocks and closures which are a bit confusing for just a few days of Scala practice. I’m sure I have misunderstood aspects of the typing system, and I was quite confused by situations where Option or Any types were being returned. There seems to be a very powerful functional programming environment lurking under the covers but sadly the book and my practice barely touched on it.

Having grown to love Golang’s channels and goroutines, actors felt quite familiar but in practice behave quite differently. I do like the concept and think I could grow to love the actor model in Scala and many other aspects of the language, but that will heavily depend on what use I have for it. With Golang in my tool belt, it doesn’t seem I’ll need to reach for Scala much at all, sadly.

The next chapter is on Erlang (also the subject of a great YouTube video) which I am looking forward to even more than Scala. Stay tuned.


Seven Languages – Prolog

by Oliver on Sunday, July 7th, 2013.

Yet another installment in my journey through Seven Languages in Seven Weeks. At this stage it is so far off seven weeks, it’s definitely going to be more than seven months, and I’m just hoping it won’t take seven years! But it is enjoyable nonetheless – if a little painful.

This time, I’ve just completed the chapter on Prolog. The previous chapter, which I found reasonably painful as well (but for different reasons) was Io. In the meantime, I’ve been doing a lot more work in another prototype-based language – JavaScript – and grown accustomed to the paradigm somewhat. For all its flaws (perceived or otherwise), I find JavaScript much easier to work with than Io, although I have to admit the learning experience with Io was still worth some of the pain.

Prolog has a deeper hold on me than Io, though. I once took a second year university course on Logic Programming which had Prolog at the very core. The textbook and much of the workload for the course relied on using and understanding Prolog to learn the fundamentals of the course, and I did extremely poorly in it. In fact, I more or less gave up trying to understand and earned the worst grade out of my entire university career in that subject. So this time around I felt that I had something I needed to prove, at least to myself.

All of that said, it was still a big struggle. The paradigm is inherently unfamiliar to me, and it took a long time to understand even the basic exercises. The last couple of days I actually managed to implement something resembling a recursive insertion sort from scratch, which I was relatively pleased with, and the rest of the chapter was at least understandable. Take a look at my Github account if you feel like it – there are a lot of examples from the book and some of my own solutions to the exercises.

Would I use Prolog at my day job? Almost certainly not, but I feel like I’ve at least partially conquered the demons from university and I have definitely expanded my mind. Every time I learn a new language or new paradigm I feel the same exhilaration I did when moving permanently from shell script to Ruby. Now I couldn’t imagine even attempting any given problem in anything less than a complete programming language, and shudder at the memory of some of the horrors I used to write in Bash.

If you haven’t picked it up, I strongly recommend reading and working through this book, even if you don’t consider yourself a programmer (maybe you’ll find yourself one by the end of it).


Seven Languages – Io

by Oliver on Saturday, January 12th, 2013.

I’ve mentioned a couple of times that I started reading through Seven Languages in Seven Weeks, and even though I’ve recently been heavily sidetracked by Learning Go I just finished chapter two which dealt with Io.

The book gushes over the language, and I’ve read a lot of other people’s blogs where they seem quite excited about it. In the end I couldn’t finish the last exercise, even having a working example to go off. It was just a bit too painful trying to find the right object context, navigate around the strange method syntax and other oddities in the language. Of course, it would be wrong to blame the language, so I’m just going to leave it saying that Io didn’t resonate with me.

Interestingly, even though I haven’t used a great deal of Javascript I wasn’t too bothered by the prototyping paradigm of the language. The main confusion (aside from general syntax) was the control you had over in whose context the method arguments would be evaluated – the sender’s or the target’s. It took a bit of playing around to figure out which was the correct alternative in all instances.

For now, I’m conquered. Maybe I’ll come back to that exercise and solve it, but probably not. On to Prolog, which I did get into briefly around 2001 (with fairly awful results). Hopefully the experience will be better this time.


Another Personal Evolution – From Ruby to Go

by Oliver on Friday, January 4th, 2013.

Almost two years ago now, I wrote a post about how I was fed up with resorting to shell scripting as my knee-jerk reaction to computer problems. At the time, I had been attacking any problem that required more than a couple of commands at the prompt by writing a shell (usually BASH) script and hit major limitations that I really should have been solving with a legitimate programming language. I resolved to only resort to Ruby or Python and in that goal I’ve actually been very successful (although I’ve ended up using Ruby around 90% of the time and Python only 10% of the time, which I wish was a little more evenly distributed).

Now I feel as if there is another evolution happening which I need to apply myself to. As a side-effect of the kind of work I’ve been doing, Ruby is just not cutting it. I love the flexibility of it (even despite the numerous ways you can shoot yourself in the foot), and there are some really great libraries like the AWS Ruby SDK which I’ve been using a lot lately. However, when you start wanting to do highly parallelised or concurrent tasks (and this is an excellent talk on the subject), it all starts getting a bit painful. I dabbled in event-based programming last year with NodeJS but found the spaghetti callbacks a bit mind-bending. Similarly with Ruby and EventMachine the code can be less than perfectly understandable. Goliath makes the task somewhat easier (if you are writing a web-service), and em-synchrony follows a similar pattern with Ruby Fibers but they all fall down if you need to use any libraries which don’t make use of asynchronous IO. I briefly looked at Python’s Twisted framework but didn’t find it much better (although that may be an unfair statement, as I didn’t spend much time on it).

I tried a different approach recently and attempted to use the quite awesome JRuby and solve the problem with native threads and the power of the JVM, but hit similar problems with libraries just not working in JRuby. This seems to be a common problem still, unfortunately. The overall result is having no clear option from a Ruby point of view when attempting to make a high-performance application that is also readable and understandable. It’s a bit of a blanket statement, granted, and if I had more constraints on my development environment I might have persisted with one of the options above (there are certainly workarounds to most of the problems I’ve experienced).

Fortunately for me, I have a flexible working environment, buy-in with alternative languages is pretty good and I’m willing to learn something new. Go is a relatively new language, having only been around (publicly) for just over three years, but quite nicely fits my current needs. I won’t go into it technically, as it is all over the interwebs, but I find it relatively easy to read (even for a newbie), and similarly easy to write.

However, I find myself in the same situation I was in almost two years ago: it will take some effort to stop the now familiar knee-jerk reaction – this time towards Ruby – and establish the new habit of using Go wherever possible. I’ve just finished up a recent small spare-time project which utilised Ruby so I have free rein to indulge in Go at every possible opportunity. It is scary, but also very exciting – just as it was declaring my intention to use only Ruby almost two years ago.

That’s not to say I’m going to use Go exclusively – I still have to finish up reading (and working) through Seven Languages in Seven Weeks. My intention is not to become a polyglot (I think that’s a bit beyond my capabilities), but I’d at least like to be reasonably proficient in at least one language that solves a given set of problems well. I found that niche with Ruby, and now I am hoping to find that niche with Go. If you haven’t tried it, I thoroughly recommend it.


Friday, January 4th, 2013 Tech 2 Comments

I’m done with shell scripting

by Oliver on Saturday, February 12th, 2011.

I think I will call this week the last I use shell script as my primary go-to language. Yes, by trade I am a systems administrator but I do have a Bachelor of Computer Science degree and there is a dormant programmer inside me desperately trying to get out. I feel like shell script has become the familiar crutch that I go back to whenever faced with a problem, and that is becoming frustrating to me.

Don’t get me wrong – there is a wealth of things that shell script (and I’m primarily referring to BASH (or Bourne Again SHell) here rather than C SHell, Korn SHell or the legendary Z SHell) can do, even more so with the old-school UNIX tools like grep, sed, awk, cut, tr and pals. In fact, if you have the displeasure of being interviewed by me, a good deal of familiarity with these tools will be expected of you. They have their place, that is what I am trying to say, but the reflex of reaching for these tools needs to be quietened somewhat.

The straw that broke my camel’s back in this instance was this sorry piece of scripting by yours truly. It’s not an exemplary piece of “code” and I think that demonstrates how little I cared about it at this point. I was briefly entertained by the idea of implementing a simple uploader for Flickr in shell script, and I did actually manage to write it up in a fairly short amount of time, and it did then successfully upload around 4GB of images. The problem was that while the initial idea was simple enough, the script took on a life of its own (especially once the intricacies of Flickr’s authentication API were fully realised) and became much more complex than initially envisaged.

Despite this, I had started out with the goal of making a reasonably “pure” shell uploader, and stuck to my guns. What I should have done was call it quits when I started parsing the REST interface’s XML output with grep – that was a major warning sign. Now I have a reasonably inflexible program that barely handles errors at all and only just gets the job done. I had a feature request from some poor soul who decided to use it and I was actually depressed at the prospect of having to implement it – that’s not how a programmer should react to being able to extend the use of his/her work!

From a technical standpoint, shell is a terrible excuse for a “language”. It has a poor typing system, excruciating handling of anything that should be an “object” when all you generally have to work with are string manipulation tools, and a “library” that is basically limited by what commands you have available on the system. I know that I have probably barely plumbed the depths of what BASH is capable of, but when the basics are just so hard to use for what are frequently used programming patterns, I don’t really see the point.

So from next week, I’ve decided to reach for Python or Ruby when I have to code something up that is more than a few lines’ worth, or of reasonable complexity. Not that I don’t already use Python and Ruby when the occasion calls for it, but I think that those occasions are too few and far between. Shell scripting is an old-school sysadmin crutch and it is time to fully embrace the DevOps mentality and get into serious programming mode.


Saturday, February 12th, 2011 Tech, Thoughts 2 Comments

My love/hate relationship with Ruby

by Oliver on Wednesday, November 3rd, 2010.

I consider myself a failed programmer, having never really excelled at it during University and only really having come to terms with some of the concepts several years later. I’ve always liked programming but at some point years ago I decided I didn’t want to be a programmer/developer so that was that. Since cementing myself in the realm of Systems Administration I’ve come to miss the programming that I was once so terrible at (and probably still am), but I never have quite enough time to catch up on what I’ve missed. The programming landscape seems to have changed so much in the years since I joined the workforce that it feels like an ever increasing amount of new things to learn.

While working at Anchor I came to grips with Python which was at the time the “standard” language for the company (although I see now that their website is probably running on Ruby on Rails). I like Python, and find it logical and convenient (if not the best supported language out there at the moment). Ruby is actually not so much the new kid on the block any more but still has all of the Fanboyism that it gained a few years ago (if not more). Like the die-hard Mac users, Ruby programmers will defy all logic to defend their beloved language.

Critics of Ruby have made their opinions known far and wide around the Internets so I won’t repeat them here. I actually quite like Ruby because it is easy to use, has a huge collection of Gems to add functionality (and all-important code-reuse) and it is the language of Puppet which is my favourite configuration management tool, so I have to use Ruby to interface with it. I can get by with Ruby, but I also hate so many things about it.

One of the favourite lines of Ruby fans is how efficient Ruby is with simple string handling, thanks to the feature known as symbols. These are basically just a string of characters (with certain limitations) prefixed by a colon character, like :symbol. The efficiencies come from only storing the one copy of a symbol in memory at any time, even if it is used in many different places. I was intrigued by this claim when I first read it and set out to test the theory.
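The "one copy in memory" claim can be checked directly. The following is a minimal sketch, not a measurement: it compares object identity for two symbol literals and for two strings with the same contents. The variable names are made up for illustration, and `String.new` is used so that the two strings are guaranteed to be separate allocations regardless of any frozen-string-literal setting.

```ruby
# Two occurrences of the same symbol literal refer to one shared object.
sym_a = :abcdefghijklmnopqrstuvwxyz
sym_b = :abcdefghijklmnopqrstuvwxyz

# String.new always allocates, so these are two separate objects
# that merely have equal contents.
str_a = String.new("abcdefghijklmnopqrstuvwxyz")
str_b = String.new("abcdefghijklmnopqrstuvwxyz")

puts sym_a.equal?(sym_b)  # identity check on the symbols
puts str_a.equal?(str_b)  # identity check on the strings
puts str_a == str_b       # content check on the strings
```

`equal?` compares object identity while `==` compares contents, so the symbols are the same object and the strings are only equal by value.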

#!/usr/bin/ruby
# Assign the same symbol literal 100 million times.
100000000.times do
  foo = :abcdefghijklmnopqrstuvwxyz
end

That’s my basic testing framework. It is probably very naive, but I was looking for simplicity. To get an idea of how minuscule the “efficiencies” we gain are, we have to run this loop 100 million times just to see numbers that differ significantly. The first time I ran this test over a year ago, I got slower results using symbols than using strings (“abcdefghijklmnopqrstuvwxyz” or ‘abcdefghijklmnopqrstuvwxyz’ rather than the symbol above) and laughed long and hard. I’ve now just retested and got the following results:

Symbol: 44.661 seconds
Single-quoted string: 53.224 seconds
Double-quoted string: 53.276 seconds

Wow, there actually is a benefit in using symbols. But bear in mind, we only saved about 9 seconds over 100 million invocations, which works out to roughly 86 nanoseconds per assignment. You would have to be doing some pretty serious symbol use to gain performance from this. Ruby fans will take exception to this, saying that the point of symbols is not performance but memory consumption, to which I would respond that Ruby has far more serious memory issues than handling a few duplicate strings. Seriously.

The reason I tested single- and double-quoted strings is due to Ruby needing to check for interpolated variables within the double-quoted string. I had expected there to be more of a difference in performance but clearly there is not.
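For anyone wanting to repeat the comparison without editing the script three times, here is a rough sketch using Ruby's standard `Benchmark` module. The helper name `time_literals` and the reduced iteration count are assumptions made for illustration, not what was used to produce the numbers above, and the timings will vary with machine and Ruby version.

```ruby
require 'benchmark'

# Hypothetical helper: times `iterations` assignments of each literal kind
# and returns a hash of label => elapsed wall-clock seconds.
def time_literals(iterations)
  {
    'Symbol' =>
      Benchmark.realtime { iterations.times { foo = :abcdefghijklmnopqrstuvwxyz } },
    'Single-quoted string' =>
      Benchmark.realtime { iterations.times { foo = 'abcdefghijklmnopqrstuvwxyz' } },
    'Double-quoted string' =>
      Benchmark.realtime { iterations.times { foo = "abcdefghijklmnopqrstuvwxyz" } }
  }
end

# One million iterations keeps the run short; raise it to 100_000_000
# to match the loop in the original test.
time_literals(1_000_000).each do |label, seconds|
  puts format('%s: %.3f seconds', label, seconds)
end
```

Running all three variants in one process at least keeps the interpreter and machine state the same between measurements, although a single pass like this is still a fairly naive benchmark.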

Out of interest I tried the same loop test in Python:

#!/usr/bin/python

i = 1
while i <= 100000000:
    foo = 'abcdefghijklmnopqrstuvwxyz'
    i += 1

How long did it take? 20.634 seconds.


Wednesday, November 3rd, 2010 Tech No Comments