HTTP abuse

This is not another rant about the current state of the web, but rather one about networking in general. Nor is it a criticism of HTTP itself: some other protocol could have been in its place.

HTTP is often used where it does not seem to be the best fit: IMs, video streaming, and all kinds of things that are far from hypertext transfer. They get crammed into the protocol's restrictions, and workarounds (e.g., "push technology") are introduced to circumvent those. HTTP is not unique in that, and some of the software I generally like and use also gets used that way: Emacs, for instance, though it is not as widespread and has little to do with networking, and hence is mostly harmless. Among other network protocols, perhaps SMTP and Finger can be mentioned, and among broader technologies, VPNs seem to be used quite often where a SOCKS proxy would suffice. RFC 3117, "On the Design of Application Protocols" (one of the BEEP RFCs), considers HTTP, SMTP, and FTP as common and reusable protocols, though only HTTP is used that much these days.

Evidently such a protocol or framework is needed: TCP and UDP (and even (D)TLS) are too low-level to build many custom application protocols on directly. Doing so would lead to duplication of functionality and effort, and programmers often fail to use even plain TCP properly.
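
To illustrate what "using plain TCP properly" involves, here is a minimal sketch of message framing, one of the things every custom protocol on top of TCP has to get right. TCP delivers a byte stream, not messages, so a single recv() may return part of a message or several messages glued together. The function names and the 4-byte length prefix are my own assumptions for the example, not taken from any particular protocol.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty read means the peer closed the connection.
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send one message: a 4-byte big-endian length, then the payload."""
    # sendall() retries until everything is written; send() may not.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_message(sock):
    """Receive one message framed by send_message()."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

A real protocol would also need a maximum message size, timeouts, and some form of versioning, which is exactly the kind of functionality that keeps getting duplicated.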

Something similar happens with technologies adjacent to HTTP, too: JavaScript adoption outside of web browsers, for instance, is caused primarily by the growth of the WWW rather than by JavaScript's merits. There are other awkward yet popular languages (and other technologies), though, with various historical circumstances bringing them popularity and leading to many more suboptimal solutions; HTTP and the adjacent technologies are not unique in that either.

HTTP(S) nowadays covers not just the OSI model's application layer, but everything above TCP. Newer HTTP versions (2 and 3) spread even further across OSI layers. The OSI model may be imperfect, and protocols tend to fit poorly into it, but a clean separation of layers is still missing in practice: many common protocols handle everything above TCP on their own, occasionally incorporating TLS (which does not quite fit into OSI, either), and at best using SASL and standardised serialisation formats. In the case of protocols built on top of HTTP, some awkward authentication scheme usually gets defined, JSON is often used for serialisation, and HTTP verbs, query strings, and headers are used to carry metadata that belongs to different OSI layers. Some protocols, such as SSH, define and use a few separate layers at once, yet in the case of SSH they are coupled together and barely reusable (SSH is extensible, but the extensions are for private use and/or may lead to conflicts); some protocols work on top of SSH (via a pseudo-terminal), but that is still slightly awkward, and not a complete solution for common needs. The issues are similar to those with distributed systems, and with much of software: a lot of stuff that is hard to get right gets reimplemented over and over, differently, and not in a reusable manner. This happens even when HTTP(S) is reused: its own functionality gets reinvented on top of it.
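
As a rough illustration of how such metadata gets spread over HTTP, here is a sketch of what "send a chat message" may look like in a made-up HTTP-based IM API. Everything here is hypothetical: the URL layout, the notify parameter, and the bearer token are assumptions for the example and do not describe any real service. The request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_send_message_request(base_url, token, room, text):
    """Build (but do not send) a request for a hypothetical chat API."""
    # Addressing goes into the path, a delivery option into the query.
    path = "/rooms/" + urllib.parse.quote(room, safe="") + "/messages"
    query = urllib.parse.urlencode({"notify": "1"})
    # The payload itself is serialised as JSON.
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + path + "?" + query,
        data=body,
        method="POST",  # the operation is encoded as an HTTP verb
        headers={
            # Authentication and content description go into headers.
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )
```

One logical operation ends up split between the verb, the path, the query, the headers, and the body, each with its own escaping and conventions.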

Sometimes HTTP abuse does not look that bad, though: in most cases it is still better than a completely custom protocol, more often than not programmers seem to manage to use it without breaking things, and it is not as if we have a wide choice of protocols for which a "just grab that data" function can be implemented easily in common languages, with readily available libraries. It still seems awkward and wrong, but so does most other tech. Something that was at least designed for the task (maybe even BEEP or XMPP) probably would have worked better, but here we are.
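
For a sense of how low the bar is, a "just grab that data" function can look roughly like this in Python, using only the standard library. The name grab and the default timeout are my own choices for the sketch; error handling, redirect policy, and size limits are left out.

```python
import urllib.request


def grab(url, timeout=10.0):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Few other protocols offer anything comparably short out of the box, which goes a long way towards explaining why HTTP gets picked for everything.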