4 December 2018 | C++
Sometimes I wonder why some things are inside the C and C++ standard libraries, and some aren’t.
As far as I could be bothered to read the actual “standards” documents (which are mostly written in legalese, not in English that you and I can understand), these languages are defined against an “abstract machine”, and the real-world implementations of them, on computers that actually exist, should follow the behavior described for that machine, modulo some implementation details.
Aside from the specific case of running these languages in a “freestanding environment” (meaning that the code doesn’t rely on being executed inside an operating system, but runs directly on the bare hardware), it seems that some notions, like the OS having a “filesystem” where textual paths can point to files that can be opened, read and modified, are pretty standard.
What is strange is that, while the notion of files is part of the standard library for both C and C++ (and has been since the beginning), networking sockets exist in neither language. And C++ only gained the ability to create and manipulate multiple threads of execution inside its standard library in 2011.
All of these concepts (files, threads, and sockets) are operating-system-specific constructs. Opening a file on Linux is fairly different from opening a file on Windows, for example. But the standard library offers a single, portable way of doing so.
These three things have been present in every operating system in common use for decades now (since the 70’s at least?). I find it strange that the C standard library only includes files. Since C++ now also has threads, I will consider this a non-issue. So let’s talk about the other one…
A song of files and sockets
I would like to take some time to discuss writing low-level network code in C++. The interface we use today to get data to and from a network is built on a notion called sockets.
For lack of a better analogy, a socket can be thought of as some kind of “magical file”: when you write into it, the bytes you wrote are sent over the network, and when bytes are received, you get them by reading from that same file. This notion comes from the UNIX world, where everything is effectively a file. And nothing is wrong with that, it’s actually a really simple, straightforward way of doing things.
Most of the operating systems that matter today use this analogy. When I say OSes that matter today, I’m thinking of both the modern UNIX derivatives (Linux, macOS, and the BSD family) and Microsoft Windows.
The Windows socket API was mostly borrowed from BSD anyway. If you set aside some oddities like a few renamed functions, a few changed data types, and the added initialization procedure, Windows sockets are equivalent to the sockets you have on Linux.
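To make the “magical file” idea concrete, here is a minimal sketch, assuming a POSIX system (Linux, macOS, BSD). It uses socketpair() to get two already-connected sockets, so no real network is involved, and then talks to them with the very same write() and read() calls you would use on a regular file. The function name echo_through_socket is mine, made up for this illustration.

```cpp
#include <sys/socket.h> // socketpair (POSIX)
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // read, write, close
#include <cstddef>
#include <string>

// Writes `message` into one end of a connected socket pair and reads it back
// from the other end, using only the generic file calls write() and read().
// Returns the bytes received, or an empty string on failure.
// Illustration only: messages longer than 256 bytes come back truncated.
std::string echo_through_socket(const std::string& message)
{
    if (message.empty())
        return {}; // nothing to send, and read() would block forever

    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return {};

    std::string received;
    // Same call as for a file descriptor obtained from open().
    const ssize_t written = write(fds[0], message.data(), message.size());
    if (written > 0)
    {
        char buffer[256];
        const ssize_t count = read(fds[1], buffer, sizeof buffer);
        if (count > 0)
            received.assign(buffer, static_cast<std::size_t>(count));
    }

    close(fds[0]);
    close(fds[1]);
    return received;
}
```

Calling echo_through_socket("hello") should hand the same five bytes back. With a TCP socket the descriptors would come from socket() and connect() instead, but the reading and writing part looks the same.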
100% non-standard code
But none of this exists inside the standard libraries of these languages. When you are doing socket programming on Linux, you are not calling functions of the language’s standard library, you are calling thin wrappers around Linux kernel system calls, and you are dealing with file descriptors and bytes.
On Windows, you are calling part of the Win32 API (called Winsock, or WSA, for Windows Sockets API). This situation is unfortunate, because it means that I, as a C++ programmer, need to make sure that my code will work both under Linux and under Windows. There’s no single networking API that I can use everywhere without thinking about it. Sure, it’s 80% similar, or maybe 90% similar, but still, if I need to put #ifdef WIN32 in my code for something as fundamental as sending bytes to another computer on a network in 2018, we are doing something wrong.
Moreover, all these OS-level APIs are implemented in C. Not C++. This means that all you have are functions and structures. When describing a socket configuration, you fill structures and pass them to functions. When you have to reference a specific socket, you need to keep a little token and give it to a function. When you need to read data, you need to have a buffer with the correct number of bytes ready, give a pointer to it alongside a variable containing the size of said buffer to a function, and make sure that you don’t mix them up.
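Here is roughly what that preprocessor dance looks like, as a sketch rather than production code. The helper names (socket_handle, network_startup, close_socket and so on) are made up for this example. I test the compiler-defined _WIN32 macro here, and on Windows you would also have to link against ws2_32.

```cpp
#ifdef _WIN32
  #include <winsock2.h>
  #include <ws2tcpip.h>
  using socket_handle = SOCKET;            // an unsigned integer type
  const socket_handle invalid_handle = INVALID_SOCKET;
#else
  #include <sys/socket.h>
  #include <unistd.h>
  using socket_handle = int;               // a plain file descriptor
  const socket_handle invalid_handle = -1;
#endif

// Winsock must be initialized before any socket call; POSIX needs nothing.
bool network_startup()
{
#ifdef _WIN32
    WSADATA data;
    return WSAStartup(MAKEWORD(2, 2), &data) == 0;
#else
    return true;
#endif
}

// Matching teardown, again only meaningful on Windows.
void network_cleanup()
{
#ifdef _WIN32
    WSACleanup();
#endif
}

// Same operation, different name: closesocket() on Windows, close() elsewhere.
int close_socket(socket_handle handle)
{
#ifdef _WIN32
    return closesocket(handle);
#else
    return close(handle);
#endif
}

// This call happens to be spelled the same on both platforms.
socket_handle open_tcp_socket()
{
    return socket(AF_INET, SOCK_STREAM, 0);
}
```

Three of the four functions need a platform branch just to open and close one socket, and that is before error reporting (errno versus WSAGetLastError()) enters the picture.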
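As an illustration of the “fill a structure, pass it to a function” style, this is a small sketch of preparing an IPv4 address with the POSIX API. The wrapper name make_ipv4_address is hypothetical; sockaddr_in, htons() and inet_pton() are the real calls.

```cpp
#include <arpa/inet.h>  // inet_pton, htons
#include <netinet/in.h> // sockaddr_in
#include <sys/socket.h> // AF_INET
#include <cstdint>
#include <cstring>

// Fills `out` for an IPv4 address written in dotted notation.
// Returns false if `ip` is not a valid IPv4 address.
bool make_ipv4_address(const char* ip, std::uint16_t port, sockaddr_in& out)
{
    std::memset(&out, 0, sizeof out); // the structure must start zeroed
    out.sin_family = AF_INET;
    out.sin_port = htons(port);       // the port is stored in network byte order
    return inet_pton(AF_INET, ip, &out.sin_addr) == 1;
}
```

The filled structure then has to be cast to a sockaddr pointer and handed to connect() or bind() together with its size, which is exactly the kind of bookkeeping that is easy to get wrong.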
Basically, you are doing 1970’s level computer science. This is fine for the lowest level code out there, but not for the code of an application.
Unnecessary added complexity
There are solutions to this problem, and some are even well along the path to getting standardized inside the C++ language, like Boost.Asio. But, in my humble opinion, there’s a fundamental problem with Asio itself: it is Asio.
For those who aren’t familiar with Asio, its name stands for “asynchronous input and output”. It’s a library, and a good library for what its name stands for: doing input and output in an asynchronous manner. To do this, Asio has multiple contexts and constructs to deal with multiple threads and non-blocking calls, with who executes them, and with when they are executed.
The problem is: if I just want to connect to the network and exchange data, do I need to worry about io_context, handlers, and executors? Probably not.
Keep It Simple, Stupid.
In the C++ world, we struggle to keep simple things simple. The collection of libraries from the Boost project is one good example. Don’t get me wrong, these are high-quality, peer-reviewed C++ libraries. They are good code written by smart people, with the seal of approval of other smart people.
They are a demonstration of what you can do when you want to push the language forward. They contain a lot of useful pieces of generic code that you can reuse. A lot of really important and useful things from Boost have landed inside the C++ standard library since 2011, like smart pointers, chrono, array, and lambdas, and probably a lot more are going to make the jump from Boost into the standard.
And if today you ask for a recommendation of something to use for networking code in C++, I’m almost sure that you’ll get pointed to either Asio, the version of Asio inside of Boost, or the Networking TS (an addition to the standard that will probably land in a future version of the C++ language) that is… basically based on Asio.
As you can guess, it’s not that I don’t like Asio. I find it genuinely interesting, and potentially useful. But I’m not sure it is the thing that I want to see standardized.
As stated earlier, if you’re just going to do some TCP/UDP communication, Asio comes wrapped in unnecessary complexity.
Moreover, your OS comes with a socket API, but it isn’t super convenient to use in C++, and it’s not portable without doing the ugly #ifdef preprocessor dance.
A few months ago, I was thinking about this situation, and thought: why not just wrap the OS library in a nice C++17 interface?
Kissnet is a little personal project I started during the summer, and that I tweak a little from time to time. It’s mostly for fun, but I feel like some people could have some use for something like this.
The design goals of kissnet are pretty straightforward:
- Be a single header library, nothing to link against
- Be a single API for all supported operating systems
- Use C++17 std::byte as a “safe” way of representing non-arithmetic binary data
- Be the lightest, thinnest possible layer on top of the operating system
- Handle all (or most) of what TCP and UDP sockets can do in IPv4/IPv6
- Don’t require the user to worry about the endianness of the network layer.
- Just transport the bytes, and do nothing else
- Hide OS-specific weirdness
- Optional exception support (you can choose to replace throwing exceptions with the program either aborting or calling your own error handler)
- Stay simple
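To give an idea of what “optional exception support” can mean in practice, here is one possible way to build such a switch. This is a hedged sketch, not kissnet’s actual code: the sketch namespace, the SKETCH_NO_EXCEPTIONS macro, the on_error hook and the fail() function are all names I am making up for the illustration.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

namespace sketch {

using error_handler = void (*)(const std::string& message);

// Hypothetical hook: only consulted when exceptions are compiled out.
inline error_handler on_error = nullptr;

// Single place where the library reports a fatal error.
[[noreturn]] inline void fail(const std::string& message)
{
#ifdef SKETCH_NO_EXCEPTIONS
    if (on_error != nullptr)
        on_error(message); // let the application log or clean up first
    std::abort();          // then stop, since we cannot throw
#else
    throw std::runtime_error(message);
#endif
}

} // namespace sketch
```

In the default build, sketch::fail("connect failed") throws a std::runtime_error carrying that message. Compiling with SKETCH_NO_EXCEPTIONS defined turns the same call into “run the user handler if there is one, then abort”.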
Kissnet only implements 3 kinds of objects:
- A socket class. The behavior of the socket is templated around the protocol used (TCP vs UDP) and the version of IP used (IPv4 vs IPv6)
- An endpoint class that lets you specify a host and port as just a string and a number, or as a “hostname:port” string.
- A “buffer<size>” class that is just syntactic sugar around a std::array<std::byte, size>
Buffers are for holding received data. A buffer knows its own size, so the correct number of bytes can be read for you. Kissnet doesn’t care what data is sent, that’s not kissnet’s job.
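The idea of a buffer that knows its own size can be sketched like this, assuming a POSIX system. The alias mirrors the std::array<std::byte, size> description above, but the receive_into() helper and the sketch namespace are assumptions of mine, not kissnet’s real interface.

```cpp
#include <sys/socket.h> // recv (POSIX)
#include <sys/types.h>  // ssize_t
#include <array>
#include <cstddef>      // std::byte, std::size_t

namespace sketch {

template <std::size_t size>
using buffer = std::array<std::byte, size>;

// The capacity is deduced from the buffer's type, so the caller never passes
// a separate length that could get out of sync with the actual storage.
// Returns the number of bytes received (0 on error or closed connection).
template <std::size_t size>
std::size_t receive_into(int socket_fd, buffer<size>& destination)
{
    const ssize_t count =
        recv(socket_fd, destination.data(), destination.size(), 0);
    return count > 0 ? static_cast<std::size_t>(count) : 0;
}

} // namespace sketch
```

With a sketch::buffer<8> and three bytes waiting on the socket, receive_into() reports 3 and never writes past the eighth byte. Since std::byte has no arithmetic operators, reading a value back out takes an explicit std::to_integer<char>(data[0]), which is the “safe” part.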
Sockets are non-copyable (but movable) objects, and they implement the typical operations you can apply to a socket (bind, listen, accept, connect, send, receive).
Kissnet automatically manages the initialization/de-initialization of the underlying socket API if needed (like on Windows). This is done by exploiting the reference-counting of std::shared_ptr<>. This is the only overhead on top of holding a socket “file descriptor” ( = a simple integer variable).
I’ve only used kissnet in a couple of toy programs and not in a real project. However, I already think that I prefer this simple, down-to-the-metal yet type-safe and cross-platform library to something like Asio. Asio feels like using a bazooka to kill a fly. I’ve heard a few opinions going the same way as mine. This is why I’ve put this little experimental project on GitHub, under a permissive MIT license.
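A non-copyable but movable socket boils down to the usual C++ ownership pattern. Below is a minimal sketch, assuming POSIX file descriptors. The class name unique_socket and its members are hypothetical and much smaller than a real socket class.

```cpp
#include <sys/socket.h>
#include <unistd.h>     // close
#include <type_traits>
#include <utility>      // std::exchange

namespace sketch {

class unique_socket
{
public:
    unique_socket() = default;
    explicit unique_socket(int fd) noexcept : fd_(fd) {}

    // Copying would leave two objects closing the same descriptor.
    unique_socket(const unique_socket&) = delete;
    unique_socket& operator=(const unique_socket&) = delete;

    // Moving transfers ownership and leaves the source empty.
    unique_socket(unique_socket&& other) noexcept
        : fd_(std::exchange(other.fd_, -1)) {}

    unique_socket& operator=(unique_socket&& other) noexcept
    {
        if (this != &other)
        {
            reset();
            fd_ = std::exchange(other.fd_, -1);
        }
        return *this;
    }

    ~unique_socket() { reset(); }

    bool is_valid() const noexcept { return fd_ >= 0; }
    int native_handle() const noexcept { return fd_; }

private:
    void reset() noexcept
    {
        if (fd_ >= 0)
        {
            close(fd_);
            fd_ = -1;
        }
    }

    int fd_ = -1;
};

static_assert(!std::is_copy_constructible_v<unique_socket>);
static_assert(std::is_move_constructible_v<unique_socket>);

} // namespace sketch
```

The descriptor is closed exactly once, by whichever object owns it last, and trying to copy the socket simply doesn’t compile.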
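One possible way to get that behavior out of std::shared_ptr is sketched below. To keep it runnable anywhere, the constructor and destructor of api_session only bump counters where a Windows build would call WSAStartup() and WSACleanup(). Everything in the sketch namespace is a made-up name, and this version is not thread-safe.

```cpp
#include <memory>

namespace sketch {

// Counters so the behavior can be observed without Windows.
inline int startup_calls = 0;
inline int cleanup_calls = 0;

struct api_session
{
    api_session() { ++startup_calls; }  // stand-in for WSAStartup()
    ~api_session() { ++cleanup_calls; } // stand-in for WSACleanup()
};

// Returns the session shared by all live sockets, creating it on demand.
// The weak_ptr observes the session without keeping it alive by itself.
inline std::shared_ptr<api_session> acquire_session()
{
    static std::weak_ptr<api_session> current;
    std::shared_ptr<api_session> session = current.lock();
    if (!session)
    {
        session = std::make_shared<api_session>();
        current = session;
    }
    return session;
}

// Every socket-like object holds one reference for its whole lifetime.
struct socket_like
{
    std::shared_ptr<api_session> session = acquire_session();
};

} // namespace sketch
```

The first socket_like to be created triggers the startup, later ones just share it, and the cleanup runs when the last one is destroyed. Creating another one afterwards starts a fresh session.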