Kaazing | Stop Using the WebSocket API

KWICies #008: How to Reinvent the Wheel to Run Yourself Over

Those who cannot remember the past are condemned to repeat it.
-George Santayana

If I’m not back in five minutes, just wait longer.
-Ace Ventura

Ever since its standardization back in 2011, the popularity of WebSocket for live, real-time websites, reactive frameworks and streaming APIs has grown rapidly. Stack Overflow is buzzing with the geekosphere seeking answers and suggestions on real-time architecture, thorny issues, mental blockers or just best practices for this IETF wire protocol and W3C-blessed API.

But many of these questions share a surprisingly common theme.

First, how would you answer this question:

I just learned about TCP today. Does anyone know how I can implement a message-passing subsystem with TCP?

I’ll get a coffee while you ruminate…

Funny… I ordered a latte…

You smartly respond with “Why would anyone consider writing low-level TCP code to implement something that was created years ago and already works well?”

Yep, exactly.

Most programmers on the planet are application developers. Yes… you, the one with the Earl Grey tea stain on your blue shirt. I’m talking to you. The goal of an application programmer is to write useful programs for users. Quickly. And as soon as you complete your application and high-five a group of happy users, these audacious users want enhancements (the nerve!) to the app or even worse… they want more apps! Being agile just doesn’t really describe our lives nowadays. We need to be hyper-accelerated beings, like the Scalosians (ok, so I’m an old-school trekkie).

All of us need high-level tools, libraries and frameworks to be productive at writing applications. Yes, it’s interesting and useful to understand how things work under the hood, but right now you need to deliver an app to a group of waiting users. You do want to go home at the end of the day, don’t you?

A Persistent Connection

Now our old friend the WebSocket is a low-level transport protocol. This wire protocol was well-defined bit for bit by the IETF in a standard (RFC 6455) back in December 2011. It is a peer protocol to HTTP. This means both WebSocket and HTTP (and their TLS/SSL encrypted versions) are physically implemented using TCP. In addition, there is an official W3C JavaScript API standard to use the protocol. There are many other language implementations that mimic this official API. You can easily find Java, Android, Objective-C/Swift-iOS, C#/.NET, C/C++, Python, Go, Haskell, Clojure, Ruby and Erlang bindings.

At this point WebSocket is ubiquitous.

So is TCP.

But… when was the last time you called setsockopt() on a Linux machine to set up TCP connection timeouts, juggled the event bits in poll() for I/O multiplexing in an effort to get asynchronous messaging, or developed some makeshift monitoring system to detect a broken TCP connection? Using these low-level routines is tedious and difficult to get correct. So system programmers developed high-level abstractions to make it easier for application developers to create useful programs for actual users (oh yes, “them”). Providing high-level libraries allowed us to focus on the application semantics and develop robust, reliable apps faster.
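To make the tedium concrete, here is a minimal sketch (in Python, for brevity) of just one of those chores: carving a TCP byte stream into discrete messages. The 4-byte length prefix and the names recv_exact, send_message, recv_message and demo are assumptions chosen for illustration, not part of any standard. The demo uses a local socket pair rather than a real network connection.

```python
import socket
import struct

# Assumed framing for this sketch: a 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty read means the peer closed the connection.
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_message(sock):
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo():
    """Send two messages over a local socket pair and read them back."""
    a, b = socket.socketpair()
    try:
        # Without a timeout, a stalled peer would block recv() indefinitely.
        a.settimeout(2.0)
        b.settimeout(2.0)
        send_message(a, b"hello")
        send_message(a, b"world")
        return [recv_message(b), recv_message(b)]
    finally:
        a.close()
        b.close()
```

Even this toy leaves out reconnection, keepalives, partial-write handling under load and multiplexing across many sockets, which is exactly the work a messaging library does for you.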

This aha moment happened over 30 years ago while bands like REO Speedwagon played endlessly on music television. And yes, that was very, very, very painful.

So now we have WebSocket, a persistent connection over the web. The API that wraps the protocol seems simple enough to use. There are only a few API calls.

How hard can it be, right? There’s even a really good hello-world implementation that you can cut-and-paste into a working app on websocket.org and immediately test it. Yep, piece of cake.

But if it’s so simple, how come we see these types of WebSocket questions on Stack Overflow:

“I have 1,000 clients connected to my server. How can I direct certain messages to specific clients using WebSocket?”
“How can I make sure I receive exactly the same bytes that I send over WebSocket?”
“My server is sending one large JSON payload across multiple WebSocket frames. How can I parse my JSON properly?”
“What is the best way to implement different chat rooms with WebSocket?”
“How can I guarantee a WebSocket message is received?”
“I am receiving WebSocket messages out of order. How can I prevent this from occurring?”
“How do I slow the sending speed of data sent via WebSocket to mobile devices in spotty coverage areas?”
“My onMessage() is getting multiple responses from many client apps. How can I tell which WebSocket message belongs to which sender?”
“I’m sending a file over WebSocket. If there is a disconnection during the transfer, how can I restart from where I left off?”
“What is the best way to detect if a person is online and available for a chat session with WebSocket?”

There are many other WebSocket questions similar to these on Stack Overflow and other techie forums.
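Several of these questions (who sent this, did it arrive, did it arrive twice) come down to the same missing piece: the raw transport hands you opaque messages, so the application has to add its own envelope. Here is a rough sketch of what people end up hand-rolling. The envelope fields (sender, seq, body), the wrap function and the Inbox class are all hypothetical names invented for this illustration, and the cumulative-acknowledgment scheme is borrowed from reliable transports rather than anything the WebSocket API provides.

```python
import json


def wrap(sender, seq, body):
    """Encode one application message as a JSON text envelope."""
    return json.dumps({"sender": sender, "seq": seq, "body": body})


class Inbox:
    """Delivers each sender's messages in order, exactly once."""

    def __init__(self):
        self.next_seq = {}   # sender -> next sequence number expected
        self.delivered = []  # (sender, body) pairs handed to the application

    def receive(self, raw):
        """Accept one raw message and return the ack to send back."""
        msg = json.loads(raw)
        sender, seq = msg["sender"], msg["seq"]
        expected = self.next_seq.get(sender, 0)
        if seq == expected:
            self.delivered.append((sender, msg["body"]))
            self.next_seq[sender] = expected + 1
        # Duplicates (seq < expected) and gaps (seq > expected) are dropped;
        # the ack tells the sender where to resume.
        return {"ack_for": sender, "next_expected": self.next_seq.get(sender, 0)}
```

A sender that keeps unacknowledged messages in a buffer can resend everything from next_expected onward after a disconnect. That is a small piece of what an established messaging protocol already specifies, tests and documents.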

Now here’s an exercise. Replace “WebSocket” with “TCP” in most of those questions.
You sit motionless and stroke your long, white, virtual beard for several minutes…

“Weren’t all of these questions basically answered decades ago?!!” you shout back to the disembodied voice in this document while drawing the attention of everyone on your morning commuter train.

The simple fact is the WebSocket API was not intended for the average application programmer. Just like TCP, WebSocket is a low-level transport. Application protocols that power publish/subscribe, chat, database transactions, tuple spaces, telemetry, data acquisition, system monitoring and other high-level application semantics can certainly be implemented with WebSocket. But similar to other low-level transports, application programmers should use higher-level APIs to obtain these behaviors under the hood.
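As a rough illustration of the difference in altitude, the sketch below shows application code written against publish/subscribe verbs instead of raw sends. The Broker and FakeConnection classes are hypothetical and exist only for this example. They are not the JMS or AMQP APIs, and a real broker would add acknowledgments, persistence, security and much more. FakeConnection stands in for whatever transport sits underneath.

```python
from collections import defaultdict


class FakeConnection:
    """Stand-in for one client connection; records what would be sent."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def send(self, text):
        self.sent.append(text)


class Broker:
    """Toy topic-based publish/subscribe layer over any object with send()."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of connections

    def subscribe(self, topic, conn):
        if conn not in self._subscribers[topic]:
            self._subscribers[topic].append(conn)

    def unsubscribe(self, topic, conn):
        if conn in self._subscribers[topic]:
            self._subscribers[topic].remove(conn)

    def publish(self, topic, body):
        """Send body to every subscriber of topic; return how many got it."""
        targets = list(self._subscribers.get(topic, []))
        for conn in targets:
            conn.send(f"{topic}:{body}")
        return len(targets)
```

With this shape, "chat rooms" and "messages for specific clients" become a matter of choosing topic names. The application never touches a frame or a socket.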

Our JMS and AMQP Gateways allow you to develop publish/subscribe applications using the familiar JMS and AMQP APIs and not worry about the WebSocket API. You don’t even have to think about whether WebSocket is physically available on your client device. We implement the WebSocket API regardless of whether it’s physically there or not. We even have a product that takes existing TCP apps and makes them work over the Web using WebSocket with no application code changes.

Of course, a few of you may have a legitimate reason to use the WebSocket API or its underlying protocol directly. You may be implementing your own WebSocket server or developing a high-level protocol or framework that is based on WebSocket. However, the vast majority of programmers encased forever in the large part of the Gaussian curve do not. Those of us who are responsible for developing cool and useful applications for our user community should not have to program at such a low level.

Let’s not spend the next ten years reinventing 30-year-old application protocols. Let’s move forward and focus on true innovation to advance the state of the art in distributed computing. Application protocols and high-level APIs were invented for a reason.

Use a higher-level API with the semantics you need in your application. Don’t reinvent the wheel… or pub/sub… or chat… or tuple-spaces… or presence… or file transfer… or…

Frank Greco