adrift on a cosmic ocean

Writings on various topics (mostly technical) from Oliver Hookins and Angela Collins. We have lived in Berlin since 2009, have two kids, and have far too little time to really justify having a blog.

Client applications and TCP reset handling

Posted by Oliver on the 10th of June, 2018 in category Tech
Tagged with: tcp, testing, sockets, programming, network, linux, kernel, iptables, ip

Disclaimer: Most of this post was written in 2014, but it's been languishing in my drafts folder for a long time and I thought I'd push the publish button rather than waiting to perfect it. The original context was that I was attempting to perform integration testing on a client application that needed to handle a variety of different network behaviours - including TCP resets. I mulled over how this could be achieved and wrote a bit about it, but ultimately never solved the problem.

So: it is surprisingly hard to trigger a TCP reset.

Let's say you have a client application that you suspect reacts incorrectly to TCP reset packets that it receives from a server on the internet, but these resets happen quite infrequently (worse still, only for users of your product, but never for you personally). The logical first step to dealing with the problem is to find a way to reproduce locally and see what happens. I found that it was quite difficult to create a test harness that sent an RST packet to my client application - partially due to it being some time since I last studied the exact definition of the TCP RST flag, and also due to the relatively high level of abstraction most networking libraries give you around the network stack. Most of the time you cannot explicitly set TCP flags, but can only influence them through the state of your application.

[Figure: the typical TCP state machine diagram, showing the transitions between different connection phases]

Around the time I had this problem, I attempted to create a small NodeJS HTTP server which would break the connection after sending a small number of bytes back to the client, for example by using the request abort function. To my dismay it would wrap up the TCP connection in an orderly fashion with the regular FIN/ACK flow. How the heck do I get it to send a reset?

After this I basically dropped the issue as it wasn't worth pursuing at the time, but I soon remembered a very obvious reason for this behaviour: the application is not in control of the connection - the kernel is! When making a TCP connection, a client application asks the kernel to connect to a given IP and port on its behalf, and to hand back an established connection once it is successful (or an error if not). A server application binds to an IP and port, starts listening, and then waits for the kernel to hand it connections (which have already been established). If a connection is started but cannot be successfully established, the application will never see it. This is especially important if you have a lot of network problems and aren't tracking anything but application metrics - you'll miss failed connection attempts.
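A small way to see the kernel doing the work: the sketch below (Python, assuming Linux and the loopback interface; the function name is made up for illustration) opens a listening socket that never calls accept(), and a client that connects to it anyway. The handshake completes in the kernel's backlog without the server application being involved at all.

```python
import socket


def connect_without_accept() -> bool:
    """Return True if connect() succeeds even though the server never accepts."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)               # from here on the kernel completes handshakes
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        # The three-way handshake is finished by the kernel; the "server"
        # application code never runs accept() at any point.
        client.connect(("127.0.0.1", port))
        return True
    except OSError:
        return False
    finally:
        client.close()
        listener.close()
```

Calling connect_without_accept() should report a successful connection, which is the same mechanism that hides failed handshakes from application-level metrics.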

But back to the problem at hand. Our application can't influence the establishing of the connection and it has only a small amount of influence over the semantics of the connection during the lifetime of that connection (e.g. being able to signal to the kernel to push data out in the case of time- or latency-sensitive payloads like SSH session keystrokes or console output). How would we then simulate a reset?
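As an illustration of that "small amount of influence", one knob an application does have is the TCP_NODELAY socket option, which disables Nagle's algorithm so small writes are sent without waiting to be coalesced. This is only a sketch of one such option (the helper name is hypothetical), and it may not be the exact mechanism meant above.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to send small segments immediately rather than
    # buffering them; useful for keystroke-sized, latency-sensitive payloads.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Note that this changes when data is sent, not which flags end a connection, so it does not help with producing a reset.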

One possibility would be the Linux Netfilter Queue target in conjunction with a userland programme using libipq. Using ipq_set_verdict, however, you can basically only accept the packet, drop it or requeue it; there is no reject verdict, which is what might have helped achieve what we want. Dropping the packet would simply lead to retransmission in most cases, and requeuing defers the decision process to another rule in the chain or potentially another userland queue programme.

My only conclusion (which I didn't manage to implement) is that you need to interact with packets at a lower layer - either by opening a raw socket and managing (i.e. implementing) the TCP flow yourself, or writing a kernel module yourself to influence the packet flow natively inside of iptables. This would give access to all possible ways of influencing packet flow, but at a significant cost of harder implementation.
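To give a feel for what the raw socket route involves, here is a minimal Python sketch that only builds the 20-byte TCP header of a bare RST segment, including the checksum over the IPv4 pseudo-header. The function names and parameters are hypothetical. Actually sending it is left out: that would need a raw socket and root privileges, and the receiver will only honour the reset if the sequence number fits the state of the connection.

```python
import socket
import struct

TCP_FLAG_RST = 0x04  # RST bit in the TCP flags field


def internet_checksum(data: bytes) -> int:
    """One's-complement sum over 16-bit words, as used by IP, TCP and UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_rst_segment(src_ip: str, dst_ip: str,
                      src_port: int, dst_port: int, seq: int) -> bytes:
    """Build a TCP header (no options, no payload) with only RST set."""
    offset_and_flags = (5 << 12) | TCP_FLAG_RST  # header length: 5 x 32-bit words

    def header(checksum: int) -> bytes:
        # src port, dst port, seq, ack, offset/flags, window, checksum, urgent
        return struct.pack("!HHIIHHHH", src_port, dst_port, seq, 0,
                           offset_and_flags, 0, checksum, 0)

    # IPv4 pseudo-header: src addr, dst addr, zero, protocol, TCP length
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_TCP, 20))
    return header(internet_checksum(pseudo + header(0)))
```

Even this leaves out the hard part, which is tracking the sequence numbers of a live connection so that the forged segment is accepted; that is the "implementing the TCP flow yourself" cost mentioned above.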

At the original time of having this problem, I did investigate just a little bit more and encountered this document describing how to send a TCP reset from a userland application. It is implemented in C, though, and I haven't tried it. Perhaps it would work for the original use case though - let me know if you have done this and how you fared with the same problem!
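For what it's worth, one well-known userland technique (not necessarily the one in the linked document) is to enable SO_LINGER with a zero timeout before closing the socket, which makes the kernel abort the connection with an RST instead of the usual FIN/ACK flow. The sketch below is in Python rather than C, assumes Linux (where struct linger is two ints) and the loopback interface, and uses hypothetical function names.

```python
import socket
import struct


def close_with_reset(conn: socket.socket) -> None:
    """Close a connected TCP socket abortively, so the peer receives an RST."""
    # struct linger { int l_onoff; int l_linger; } on Linux:
    # linger enabled with a zero timeout means "abort on close".
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def client_sees_reset() -> bool:
    """Return True if a local client observes a connection reset."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=2.0)
    conn, _ = listener.accept()
    close_with_reset(conn)           # server side aborts the connection
    try:
        client.recv(1)               # an orderly close would return b"" here
        return False
    except ConnectionResetError:     # ECONNRESET: the RST arrived
        return True
    finally:
        client.close()
        listener.close()
```

A test harness along these lines could send part of a response and then call close_with_reset() to exercise the client's reset handling, although whether it reproduces the resets seen in the wild is a separate question.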

© 2010-2018 Oliver Hookins and Angela Collins