Table of contents
- 1 What is WebSocket?
- 2 Steps to establish WebSocket
- 3 How to send and receive data via WebSocket
- 4 WebSocket test program
What is WebSocket?
In a word, WebSocket is a protocol that compensates for the shortcomings of HTTP by allowing real-time bidirectional communication in small-sized data frames.
WebSocket is not a particularly new protocol, but the browser must support it. This is not a problem, since almost all common browsers now do.
A general explanation of WebSocket is best left to web specialists. Since we approach it from the embedded application side, we limit the explanation to the parts that are specific to implementing it on embedded MCU systems.
Steps to establish WebSocket
It is not difficult to understand how WebSocket communication works, but it is quite difficult to implement for the first time. The biggest challenge in implementation is generating the access key to return to the browser from the WebSocket key issued by the browser.
I will now review the steps to establish a connection for WebSocket communication.
The following diagram shows the flow of establishing a WebSocket connection, starting with accessing the server from a browser.
(1) First, when an IP address/port is specified in the browser to access the server as usual, an HTTP request is sent from the browser to the server.
(4) When the server recognizes that the request from the browser is a WebSocket request, it extracts the WebSocket key, generates an access key, and returns it to the browser.
(5) If the browser can verify that the access key returned from the server is valid, it establishes a WebSocket connection and fires an open event.
This is all that is required to establish a connection. Now the server and browser can communicate in both directions using the WebSocket protocol. The rest is non-blocking socket communication. The flow itself is simple enough to understand, but there are many hurdles to overcome when implementing it, so we will explain them one by one.
Web page with embedded WebSocket request
The code that starts a WebSocket follows a fairly fixed pattern and begins by creating a WebSocket object. The object can be given any name. In the sample below, we use wsocket.
- open event: onopen event handler property
Occurs when a WebSocket connection is opened
- close event: onclose event handler property
Occurs when a WebSocket connection is disconnected
- message event: onmessage event handler property
Occurs when data is received via WebSocket
- error event: onerror event handler property
Occurs when a WebSocket connection is disconnected due to an error
The WebSocket methods are called when the browser sends data or closes the WebSocket connection at an arbitrary time.
- send method: wsocket.send(data) sends data
- close method: wsocket.close() closes the WebSocket connection
The state of the connection can be read from the readyState property:
0: CONNECTING Connection not yet established
1: OPEN Connection established
2: CLOSING Connection closing
3: CLOSED Connection closed
Ex.: var connectionstate=wsocket.readyState //0 - 3
Access key generation from WebSocket key
Executing the WebSocket request code on the browser side sends an HTTP upgrade request containing the WebSocket key to the server side.
The main point for establishing a WebSocket connection is to generate an access key from the WebSocket key passed from the browser.
The enclosed 24-character code is the key issued by the browser for the WebSocket connection. As an example, a server recognizes a WebSocket request if it finds "Sec-WebSocket-Key" in the header of the received GET request (the part before the blank line), and then extracts the 24-character key (xxx…. xxx..xxx==).
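The key search described above could be sketched in C roughly as follows. This is a minimal sketch, not the article's actual code: the function name ws_extract_key is hypothetical, and it assumes the whole request header is already in a NUL-terminated buffer and that the browser writes the field name with exactly this capitalization.

```c
#include <stddef.h>
#include <string.h>

/* Copy the value of the Sec-WebSocket-Key header into key_out.
 * Returns the key length, or -1 if the header is missing or does not fit.
 * Assumption: the field name appears with this exact capitalization. */
int ws_extract_key(const char *request, char *key_out, size_t key_out_size)
{
    static const char field[] = "Sec-WebSocket-Key:";
    const char *p = strstr(request, field);
    if (p == NULL)
        return -1;                      /* not a WebSocket request */

    p += sizeof(field) - 1;             /* skip the field name */
    while (*p == ' ' || *p == '\t')     /* skip blanks after the colon */
        p++;

    size_t len = strcspn(p, "\r\n");    /* the value runs to the end of the line */
    while (len > 0 && (p[len - 1] == ' ' || p[len - 1] == '\t'))
        len--;                          /* drop trailing blanks */
    if (len == 0 || len >= key_out_size)
        return -1;

    memcpy(key_out, p, len);
    key_out[len] = '\0';
    return (int)len;
}
```

A return value of -1 can be treated as an ordinary HTTP request, so the same receive routine can serve both the HTML page and the WebSocket upgrade.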
Next comes the biggest challenge: generating an access key from the WebSocket key of the request.
The access key is defined as the value obtained by concatenating the browser-generated key given in the request with the fixed value "258EAFA5-E914-47DA-95CA-C5AB0DC85B11", computing the SHA-1 hash of the result (SHA-1 is a hash function, not encryption), and Base64-encoding that hash. It is returned to the browser in the Sec-WebSocket-Accept header. The following table shows the data.
Procedures for generating hash and BASE64 encoded values
- Browser-generated key:
Extract only hqkH4S/djHSSovAPaDdycg== from the request header
- GUID concatenation (using the standard function strcat):
Concatenate the GUID to the extracted key to get hqkH4S/djHSSovAPaDdycg==258EAFA5-E914-47DA-95CA-C5AB0DC85B11
- SHA-1 hash generation (20 bytes, or 40 hexadecimal characters):
Hashing the GUID-concatenated key yields 1c10aa3dd498c5bfb39a95c5c10277e6770f28c1 (20 bytes of binary data, shown here in hexadecimal)
- Base64 encoding of the 20-byte hash value:
Base64-encoding the 20-byte SHA-1 hash value yields HBCqPdSYxb+zmpXFwQJ35ncPKME= (28 characters of text)
The procedure for generating an access key is as described above. Since the algorithm for generating a SHA-1 hash value is difficult to understand, it is better to use a general-purpose library than to write it all yourself.
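For readers who want to see roughly what such a library does, here is a compact sketch of the concatenation and SHA-1 steps in C. It is an illustration under assumptions, not the code used in this article: the names sha1, sha1_block, and ws_hash_key are hypothetical, the whole message is assumed to be in memory, and a tested library is still the safer choice for a real product.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WS_GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

static uint32_t rol32(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

/* Process one 64-byte block and update the five 32-bit state words. */
static void sha1_block(uint32_t h[5], const uint8_t blk[64])
{
    uint32_t w[80];
    for (int i = 0; i < 16; i++)
        w[i] = ((uint32_t)blk[4 * i] << 24) | ((uint32_t)blk[4 * i + 1] << 16) |
               ((uint32_t)blk[4 * i + 2] << 8) | (uint32_t)blk[4 * i + 3];
    for (int i = 16; i < 80; i++)
        w[i] = rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    for (int i = 0; i < 80; i++) {
        uint32_t f, k;
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
        uint32_t t = rol32(a, 5) + f + e + k + w[i];
        e = d; d = c; c = rol32(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}

/* Compute the 20-byte SHA-1 digest of msg[0..len-1]. */
void sha1(const uint8_t *msg, size_t len, uint8_t out[20])
{
    uint32_t h[5] = {0x67452301u, 0xEFCDAB89u, 0x98BADCFEu, 0x10325476u, 0xC3D2E1F0u};
    uint8_t blk[64];
    size_t i = 0;

    for (; i + 64 <= len; i += 64)          /* full blocks */
        sha1_block(h, msg + i);

    size_t rem = len - i;                   /* final partial block plus padding */
    memset(blk, 0, sizeof blk);
    memcpy(blk, msg + i, rem);
    blk[rem] = 0x80;
    if (rem >= 56) {                        /* no room left for the length field */
        sha1_block(h, blk);
        memset(blk, 0, sizeof blk);
    }
    uint64_t bits = (uint64_t)len * 8;      /* message length in bits, big-endian */
    for (int j = 0; j < 8; j++)
        blk[63 - j] = (uint8_t)(bits >> (8 * j));
    sha1_block(h, blk);

    for (int j = 0; j < 5; j++) {
        out[4 * j]     = (uint8_t)(h[j] >> 24);
        out[4 * j + 1] = (uint8_t)(h[j] >> 16);
        out[4 * j + 2] = (uint8_t)(h[j] >> 8);
        out[4 * j + 3] = (uint8_t)h[j];
    }
}

/* Concatenate the 24-character browser key with the GUID and hash the result. */
int ws_hash_key(const char *key, uint8_t digest[20])
{
    char buf[24 + sizeof(WS_GUID)];         /* 24 + 36 characters + NUL */
    if (strlen(key) != 24)
        return -1;
    strcpy(buf, key);
    strcat(buf, WS_GUID);
    sha1((const uint8_t *)buf, strlen(buf), digest);
    return 0;
}
```

The 20 bytes written to digest are raw binary, so they must not be handled with string functions before the Base64 step.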
Base64 encoding is not as esoteric, but plenty of libraries are available for it as well, so it may be in your best interest to use one.
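To show how little is involved, a Base64 encoder can be sketched in C as below. The function name base64_encode is hypothetical and this is only one possible way to write it. It maps every 3 input bytes to 4 output characters and pads the last group with '='.

```c
#include <stddef.h>
#include <stdint.h>

/* Base64-encode len bytes from src into dst as a NUL-terminated string.
 * dst must hold at least 4 * ((len + 2) / 3) + 1 characters.
 * Returns the number of characters written, not counting the NUL. */
size_t base64_encode(const uint8_t *src, size_t len, char *dst)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t o = 0;

    for (size_t i = 0; i < len; i += 3) {
        /* Pack up to three bytes into a 24-bit value. */
        uint32_t v = (uint32_t)src[i] << 16;
        if (i + 1 < len) v |= (uint32_t)src[i + 1] << 8;
        if (i + 2 < len) v |= (uint32_t)src[i + 2];

        /* Emit four 6-bit groups, padding with '=' where input ran out. */
        dst[o++] = tbl[(v >> 18) & 0x3F];
        dst[o++] = tbl[(v >> 12) & 0x3F];
        dst[o++] = (i + 1 < len) ? tbl[(v >> 6) & 0x3F] : '=';
        dst[o++] = (i + 2 < len) ? tbl[v & 0x3F] : '=';
    }
    dst[o] = '\0';
    return o;
}
```

For the 20-byte hash value in the table above, the output is 28 characters long, so a 29-byte buffer is enough for the access key.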
WebSocket connection establishment
Once the access key is generated, an HTTP upgrade response is returned to the browser to establish a WebSocket connection.
The format of the response should be as shown in the example below. The status line in the first line of the response can be either "HTTP/1.1 101 Switching Protocols" or "HTTP/1.1 101 OK".
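As a rough sketch, the upgrade response could be assembled in C like this. The function name ws_build_handshake and the buffer handling are assumptions for illustration; the header lines themselves (Upgrade, Connection, Sec-WebSocket-Accept) are the ones the browser expects in a WebSocket handshake response.

```c
#include <stddef.h>
#include <stdio.h>

/* Write the HTTP upgrade response containing the access key into out.
 * Returns the response length, or -1 if the buffer is too small. */
int ws_build_handshake(const char *accept_key, char *out, size_t out_size)
{
    int n = snprintf(out, out_size,
                     "HTTP/1.1 101 Switching Protocols\r\n"
                     "Upgrade: websocket\r\n"
                     "Connection: Upgrade\r\n"
                     "Sec-WebSocket-Accept: %s\r\n"
                     "\r\n",                 /* blank line ends the header */
                     accept_key);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

The returned length can be passed directly to whatever TCP send function the network controller's driver provides.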
When the browser recognizes that the generated access key is valid, a WebSocket connection is established and bidirectional communication using the WebSocket protocol begins. This means that a handshake has been established between the server and the client (browser).
Once established, the open event is fired and the handler defined for it is executed. For example, the onopen handler displays the "Websocket Connect!" message.
How to send and receive data via WebSocket
Once a WebSocket connection is established, bi-directional communication is possible between the browser and server at any given time. WebSocket communication can handle not only text but also binary data.
Data is handled during transmission and reception according to a format called WebSocket data frames.
For more information on data frames, please visit the website at the link below.
It is relatively simple if the data sent at one time is limited to 125 bytes or less, whether text or binary. Up to that size, the length fits directly in the 7-bit length field of the second byte; the values 126 and 127 are reserved to signal an extended length field.
A data frame is made up of 1-byte blocks: the first byte block specifies whether the data is the last packet (FIN) and the type of data (opcode); the second byte block specifies whether the data is masked (MASK) and the data length.
As for masking, the rule is that data sent from the browser is always masked, and data sent to the browser is never masked.
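The two header bytes described above can be decoded with a few bit operations. The following C sketch uses a hypothetical struct ws_header_t and function ws_parse_header, and deliberately handles only short frames without an extended length field.

```c
#include <stdint.h>

typedef struct {
    uint8_t fin;     /* 1 = last packet of the message */
    uint8_t opcode;  /* 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong */
    uint8_t masked;  /* 1 = payload is masked (data from the browser) */
    uint8_t length;  /* 7-bit payload length field */
} ws_header_t;

/* Decode the first two byte blocks of a frame.
 * Returns 0 for a short frame, or -1 if the length field is 126 or 127,
 * which means an extended length follows (not handled in this sketch). */
int ws_parse_header(const uint8_t *buf, ws_header_t *h)
{
    h->fin    = (buf[0] >> 7) & 0x01;  /* top bit of the first byte */
    h->opcode = buf[0] & 0x0F;         /* low 4 bits of the first byte */
    h->masked = (buf[1] >> 7) & 0x01;  /* top bit of the second byte */
    h->length = buf[1] & 0x7F;         /* low 7 bits of the second byte */
    return (h->length <= 125) ? 0 : -1;
}
```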
This is hard to explain in words alone and easier to understand with concrete examples of the data actually sent and received, so we will look at sending and receiving in turn.
Data transmission from browser to server
As a concrete example, let's check the case where a browser sends the text string "test". Transmission and reception are performed by TCP socket communication.
The send method is used for sending from the browser.
Execute the send method: wsocket.send("test").
For a 4-character text, the browser sends a 10-byte WebSocket data frame. Assuming that the TCP receive buffer on the server side is data_buffer, an example of the data actually read into this receive buffer is shown below.
Since the 4-character text data fits in a single packet, the first byte block data_buffer[0] is 0x81 in hexadecimal: the first bit FIN is 1 and the opcode is 1 for text data.
The second byte block data_buffer[1] comes from the browser and is therefore masked, so the first bit MASK is 1; with a payload length of 4, it is 0x84 in hexadecimal.
The 3rd to 6th byte blocks data_buffer[2]-[5] are the mask key added by the browser. The key changes each time, even when the same data is sent again.
The seventh and subsequent byte blocks data_buffer[6]-[9], one per character, hold the text data sent by the browser in masked form. This data is decoded with the mask key to obtain the extracted data unmasked_str[i].
The extracted data unmasked_str[i] is obtained by unmasking (decoding), that is, by taking the XOR (a logical operation) of Mask_Key[i % 4] and masked_data[i].
The data is obtained from unmasked_str[i] = Mask_Key[i % 4] ^ masked_data[i].
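Put together, the unmasking of a short frame might look like the following C sketch. The function name ws_unmask_short is hypothetical, and it assumes the complete frame has already been read into data_buffer and that the payload is 125 bytes or less.

```c
#include <stddef.h>
#include <stdint.h>

/* Unmask a short masked frame received from the browser.
 * Returns the payload length, or -1 if the frame is unmasked, incomplete,
 * longer than 125 bytes, or does not fit in unmasked_str. */
int ws_unmask_short(const uint8_t *data_buffer, size_t received,
                    char *unmasked_str, size_t out_size)
{
    if (received < 6 || (data_buffer[1] & 0x80) == 0)
        return -1;                               /* header incomplete or MASK = 0 */

    size_t len = data_buffer[1] & 0x7F;          /* payload length field */
    if (len > 125 || received < 6 + len || out_size < len + 1)
        return -1;

    const uint8_t *mask_key    = &data_buffer[2]; /* 4-byte mask key */
    const uint8_t *masked_data = &data_buffer[6]; /* masked payload */

    for (size_t i = 0; i < len; i++)
        unmasked_str[i] = (char)(mask_key[i % 4] ^ masked_data[i]);
    unmasked_str[len] = '\0';
    return (int)len;
}
```

As a made-up example, with the mask key 0x12 0x34 0x56 0x78 the text "test" arrives as the frame 81 84 12 34 56 78 66 51 25 0C, and the function recovers the original four characters.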
Browsers have other codes to send besides arbitrary text data and binary data. As an example, let's check the data exchanged when wsocket.close() is executed with the close method.
The data sequence sent by the browser in this case is a 6-byte data frame that does not contain any real data. The opcode of the first byte block is 0x8 when the close method is executed. The mask key is also sent, but it has no meaning here.
If the first byte block data_buffer[0] received on the server side is 0x88, the browser has executed the close method.
Other commonly used WebSocket opcodes are 0x9 (Ping) and 0xA (Pong). These can be used for detecting and handling disconnections, etc.
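A server normally answers a Close frame with a Close frame of its own and a Ping with a Pong carrying the same payload. The C sketch below shows one way to build such replies; the function name ws_control_reply is hypothetical, and it assumes the payload has already been unmasked and is 125 bytes or less.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { WS_OP_CLOSE = 0x8, WS_OP_PING = 0x9, WS_OP_PONG = 0xA };

/* Build the server's reply to a Close or Ping frame.
 * first_byte is the first byte block of the received frame, and
 * payload/len is its unmasked payload (len may be 0).
 * Returns the reply length, 0 if no reply is needed, or -1 on error. */
int ws_control_reply(uint8_t first_byte, const uint8_t *payload, size_t len,
                     uint8_t *reply, size_t reply_size)
{
    uint8_t opcode = first_byte & 0x0F;
    if (opcode != WS_OP_CLOSE && opcode != WS_OP_PING)
        return 0;                                /* data frame or Pong */
    if (len > 125 || reply_size < 2 + len)
        return -1;

    /* FIN = 1; answer Ping with Pong and Close with Close. */
    reply[0] = (uint8_t)(0x80 | (opcode == WS_OP_PING ? WS_OP_PONG : WS_OP_CLOSE));
    reply[1] = (uint8_t)len;                     /* MASK = 0 toward the browser */
    if (len > 0)
        memcpy(&reply[2], payload, len);
    return (int)(2 + len);
}
```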
Checking the concrete data that is actually exchanged should make this quick to understand. Next, we will check the transmission from the server.
Data transmission from server to browser
Now I will check the behavior of sending the text string "test" in the server to the browser. The case from the server is very simple, since no mask is needed.
I will check the case where text data *str_send="test" in the server is sent to the browser for display.
When sending 4-character text, no mask is used, and the data format to be sent is specified in the first and second byte blocks of the WebSocket data frame, followed by the text data.
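A frame like this can be built with only a few lines of C. In the sketch below, the function name ws_build_text_frame is hypothetical and the text is assumed to be 125 bytes or less, so no extended length field is needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build an unmasked text frame for sending str_send to the browser.
 * Returns the frame length, or -1 if the text is longer than 125 bytes
 * or the frame buffer is too small. */
int ws_build_text_frame(const char *str_send, uint8_t *frame, size_t frame_size)
{
    size_t len = strlen(str_send);
    if (len > 125 || frame_size < 2 + len)
        return -1;

    frame[0] = 0x81;                   /* FIN = 1, opcode = 0x1 (text) */
    frame[1] = (uint8_t)len;           /* MASK = 0, payload length */
    memcpy(&frame[2], str_send, len);  /* text data follows as-is */
    return (int)(2 + len);
}
```

For "test" this produces the 6 bytes 81 04 74 65 73 74, compared with the 10 bytes the browser needs for the same text because of its mask key.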
Compared to the HTTP protocol, the amount of information to be sent is smaller because of the smaller headers, especially during transmission. The smaller the data to be sent, the more pronounced the difference.
The frame is sent to the browser by TCP send() (the actual function depends on the TCP stack or controller in use).
The onmessage event is fired when data is received by the browser, so please set up the processing you wish to perform, such as displaying text.
A feature of WebSocket is that these outgoing blocks can be sent at any time and reflected in the browser.
WebSocket test program
The wiring is the same as before when checking the contents on the actual device.
Specifying the IP address/port of the server from a browser opens the HTML page described in the body message.
Press the "Connect" button to establish a WebSocket connection. In this sample, the WebSocket request is sent after the "Connect" button is pressed.
"Websocket Connectioned!" is displayed by the open event that fires when a WebSocket connection is established.
When the HTML slider is operated, "Slider output" displays the numerical value within the data range set in JavaScript. This data is generated and displayed by the browser, so it appears regardless of whether a WebSocket connection is established.
When the slider is operated while a WebSocket connection is established, a numerical value is displayed in "NUCLEO Loopback" at the same time. This is data received by the server and sent back to the browser as is, so the values on both sides are linked almost in real time, which is a characteristic of bidirectional communication using WebSockets.
When the "Close" button is pressed, the close method is executed to close the connection, and the "Websocket DisConnected…" message is displayed when the close event fires.
Once a WebSocket connection is established, bidirectional communication using the WebSocket protocol is relatively simple. However, in actual operation, communication tends to be more unstable than TCP socket communication, so some additional effort is required to achieve stable communication.
The following demo video sends variables in the server to the browser for display. The transmission interval is 100 ms.
With this kind of display speed, it could be applied to real-time monitoring of sensor values, etc. that involve movement.
In this demo video, the count value is converted to a 3-digit fixed-length number (ASCII code) in the server, and the following transmission function is executed in 100ms cycles and sent to the browser.
websocket_send(3-digit number converted to string);
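The conversion to a 3-digit fixed-length string could be sketched in C as follows. The helper name count_to_3digits is hypothetical, and wrapping the count at 1000 is an assumption; the signature of websocket_send is not shown in this article, so the call is left out of the sketch.

```c
#include <stdio.h>

/* Convert a count value to exactly three ASCII digits plus a NUL,
 * for example 7 becomes "007". Values of 1000 or more wrap around
 * (an assumption made for this sketch). */
void count_to_3digits(unsigned int count, char out[4])
{
    snprintf(out, 4, "%03u", count % 1000u);
}
```

Keeping the length fixed means the frame sent every cycle always has the same size, which keeps the display code on the browser side simple.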
The transmission becomes unstable when the cycle is set to 100 ms or less. This speed is somewhat insufficient for real-time communication in embedded devices, so the advantages of WebSocket are not fully utilized. Identifying the cause of the freezes and disconnections and, if possible, achieving more stable and faster communication remain future work.
This sample circuit simply connects a Nucleo to a W5500 Ethernet controller, but since this technology is more about software than embedded hardware, it can be used not only for TCP communication but also for HTTP communication, WebSocket communication, and many other applications.