Measuring Performance of Messaging Protocols for HF Radio
This white paper shares measurements of the performance of selected HF messaging protocols: ACP 142, CFTP and SLEP. These protocols and other messaging protocols are described in the companion white paper Messaging Protocols for HF Radio. Optimizing throughput over HF is the key challenge for bulk protocols such as messaging. The primary measurements in this paper look at throughput for varying link speeds, error rates and message sizes, using Isode’s STANAG 5066 and messaging products. Some latency measurements are also made.
Why Measure at Application level for HF?
When evaluating HF Radio systems, careful performance measurements are usually made at the modem level. Quite often, application performance is reported on subjectively in terms of user experience. This is unfortunate, as application performance is critical to the end user. HF is often used in mission-critical situations where information needs to be delivered as quickly as possible.
For bulk applications such as messaging, throughput is the most important measurement. It is also the hardest to optimize, as a wide range of parameters need to be selected. If too slow a transmission speed is selected, capacity is wasted. If too fast a speed is selected, data loss is too high. This paper looks primarily at throughput measurements, as messaging is a bulk application.
Latency is also important. For a bulk application such as messaging, best latency will typically be achieved when throughput is optimized, as latency is essentially the time taken to transfer the bulk data. Where data volumes are low relative to link speed, latency can be optimized by choosing conservative transmission speeds (to minimize retransmission) and keeping transmissions short. Optimizing for latency is generally easier than optimizing for throughput.
Message Protocols Measured
The following messaging protocols are considered.
- ACP 142. This is a high functionality NATO standard protocol, recommended by Isode, that provides unicast and multicast functionality using non-ARQ operation over STANAG 5066. Measurements were made using STANAG 4406 Annex E with ARQ operation over STANAG 5066, to give a direct comparison with the other two protocols. Measurements could have been made with MULE (SMTP transport over ACP 142) and similar results would be expected. More information is provided in the white paper [ACP 142: SMTP & STANAG 4406 Messaging for Constrained Networks].
- CFTP. This is a widely deployed protocol, with less functionality and poorer resilience than ACP 142.
- SLEP. This is a new protocol, optimized for unicast operation. SLEP is specified in [S5066-EP3] and is based on the Reliable Datagram Service. It does not have multicast support, but otherwise provides the same functionality as ACP 142.
Further information on these protocols is provided in the white paper [Messaging Protocols for HF Radio]. This first version of the paper does not include SLEP measurements; these will be added in a future version. It is anticipated that SLEP will give performance (slightly) superior to both CFTP and ACP 142 in all situations.
How Throughput was Measured
The above diagram shows the test setup, which has the following components:
- MoRaSky is Isode’s Modem Radio and Ionosphere simulator. This was configured for “clear channel” except where specifically noted otherwise. Waveforms are selected by Icon-5066 and noted in the measurements below.
- Icon-5066 is Isode’s STANAG 5066 Product.
- M-Switch is Isode’s Message Transfer Agent product family. M-Switch products support each of the protocols being measured.
- The Isode test client submits messages for the test. Equivalent tests could also be carried out with standard clients.
- Messages are delivered by the receiving client into Isode’s M-Box product.
- The received messages were analysed by reading the received messages in Microsoft Outlook, connecting to M-Box.
The test model is that random binary data is transferred, and throughput is measured in terms of that data. It is important to use random data, as this cannot be compressed in transfer. The test goal is to measure protocol efficiency, not to evaluate compression algorithms.
The test here could be run with standard email clients and attaching data such as JPEG Photos. It would be straightforward to run comparative tests without any special components.
Isode’s test tool (available to Isode partners and customers) generates a configurable number of messages, each with a binary attachment of configurable size. This enables systematic measurements to be made easily. It is important for tests to run for a reasonably long time with multiple messages, in order to get an accurate performance measurement. Throughput measurements were typically run for about an hour. The following screenshot shows the results of one test run.
There is sufficient information in the test message to enable throughput to be calculated from the subject line and the time at which the message was delivered.
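The calculation described above can be sketched as follows. This is a minimal illustration, not Isode's test tool: the function name and parameters are hypothetical, and it assumes the payload size and the elapsed time between the start of the run and the final delivery are already known.

```python
def link_utilization(payload_bytes: int, elapsed_seconds: float, link_bps: float) -> float:
    """Return the fraction of the nominal link rate used to carry payload data.

    payload_bytes   -- total random binary data received during the run
    elapsed_seconds -- time from start of the run to final delivery
    link_bps        -- nominal modem speed in bits per second
    """
    if elapsed_seconds <= 0 or link_bps <= 0:
        raise ValueError("elapsed_seconds and link_bps must be positive")
    # Effective transfer speed of the payload, in bits per second.
    effective_bps = payload_bytes * 8 / elapsed_seconds
    return effective_bps / link_bps


if __name__ == "__main__":
    # Hypothetical run: 600 bytes delivered in one second over a 9600 bps link.
    print(f"{link_utilization(600, 1.0, 9600):.1%}")
```

Only the attachment payload is counted, so message headers and protocol framing show up as lost utilization rather than as transferred data.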
Core Measurements Based on Message Size
The first set of measurements was made at 9600 bits/second using emulation of STANAG 4539 waveform with short interleaver. The utilization was determined by measuring the volume of random binary data received during the test and determining effective transfer speed of that data. This enables link utilization to be determined. STANAG 5066 overhead was determined by independent measurement using Isode test tools. This enables the messaging protocol overhead to be determined.
Protocol | Size | Utilization | Message Protocol Overhead | STANAG 5066 Overhead |
---|---|---|---|---|
ACP 142 | 1 Kbyte | 46% | 46.5% | 7.5% |
CFTP | 1 Kbyte | 55.5% | 37% | 7.5% |
ACP 142 | 10 Kbyte | 83% | 9.5% | 7.5% |
CFTP | 10 Kbyte | 82% | 10.5% | 7.5% |
ACP 142 | 100 Kbyte | 89.5% | 3% | 7.5% |
CFTP | 100 Kbyte | 88% | 4.5% | 7.5% |
ACP 142 | 1 Mbyte | 91% | 1.5% | 7.5% |
CFTP | 1 Mbyte | 89.5% | 3% | 7.5% |
Notes on the measurements:
- For larger messages, over 90% link utilization can be achieved. This seems a good result.
- For 1 Mbyte messages the overhead associated with the ACP 142 layer is around 1.5%. This modest overhead is quite acceptable.
- For larger messages, most of the remaining overhead is with STANAG 5066. This overhead is discussed later in the paper.
- For messages of 10 Kbyte and larger, CFTP is slightly less efficient than ACP 142. This is primarily due to CFTP being a 7-bit protocol, which leads to overhead to carry binary data.
- As message size decreases, the per-message overhead associated with the messaging protocol increases. This is to be expected, as the overhead from message headers is larger for small messages.
- For very small messages, ACP 142 has a higher overhead than CFTP. This is due to the additional ACP 142 protocol needed to enable multicast operation.
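The table columns are related by a simple decomposition: whatever capacity is not used for payload, and is not accounted for by the independently measured STANAG 5066 overhead, is attributed to the messaging protocol. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def message_protocol_overhead(utilization_pct: float, s5066_overhead_pct: float) -> float:
    """Attribute the capacity left after payload and STANAG 5066 overhead
    to the messaging protocol. All values are percentages of link capacity."""
    remainder = 100.0 - utilization_pct - s5066_overhead_pct
    if remainder < 0:
        raise ValueError("utilization and STANAG 5066 overhead exceed 100%")
    return remainder


if __name__ == "__main__":
    # Rows taken from the table above.
    print(message_protocol_overhead(91.0, 7.5))   # ACP 142, 1 Mbyte
    print(message_protocol_overhead(55.5, 7.5))   # CFTP, 1 Kbyte
```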
Measurements with varying Speed
This section looks at the impact of varying speed, from 75 bps (slowest HF speed) to 240 kbps (top Wideband HF speed). The following table shows performance measurements at slower speeds, all using STANAG 4539 and short interleaver. 10 Kbyte messages were used for these tests, as larger messages would be prohibitively slow at 75 bps.
Protocol | Speed | Utilization | Message Protocol Overhead | STANAG 5066 Overhead |
---|---|---|---|---|
ACP 142 | 75 bps | 60% | 7% | 33% |
CFTP | 75 bps | 60.5% | 6.5% | 33% |
ACP 142 | 1200 bps | 79.5% | 9% | 11.5% |
CFTP | 1200 bps | 79% | 9.5% | 11.5% |
ACP 142 | 9600 bps | 83% | 10% | 7% |
CFTP | 9600 bps | 82% | 11% | 7% |
It can be seen that the utilization decreases as speed decreases. The reason for this is that Icon-5066 (default configuration) will reduce C_PDU maximum segment size as speed falls. This is because the choice of value for optimum performance varies with speed. At lower speeds, the cost of retransmission is higher, so it is preferable to reduce size to make retransmissions smaller. Although use of max size C_PDU segment (1023 bytes) would increase throughput, use of values appropriate to the chosen speed is more realistic. The values used are: 75 bytes (75 bps); 300 bytes (1200 bps) and 800 bytes (9600 bps).
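The speed-dependent segment sizes can be expressed as a lookup table. The sketch below is illustrative only and is not Icon-5066's configuration logic: the 75, 1200 and 9600 bps values are those quoted above, the 57.6 kbps entry is the 1023-byte maximum that this paper reports for the faster tests, and the handling of intermediate speeds (use the entry for the highest listed speed not exceeding the requested one) is an assumption.

```python
import bisect

# Speeds in bps and the maximum C_PDU segment size (bytes) used in this paper.
_SPEEDS_BPS = [75, 1200, 9600, 57600]
_SEGMENT_BYTES = [75, 300, 800, 1023]


def max_segment_size(speed_bps: int) -> int:
    """Return the C_PDU maximum segment size for a transmission speed.

    Assumption: speeds between the listed values take the size of the
    nearest listed speed below them; speeds above 57.6 kbps use 1023.
    """
    if speed_bps < _SPEEDS_BPS[0]:
        raise ValueError("speed below the slowest HF speed (75 bps)")
    index = bisect.bisect_right(_SPEEDS_BPS, speed_bps) - 1
    return _SEGMENT_BYTES[index]


if __name__ == "__main__":
    for speed in (75, 1200, 9600, 57600, 240000):
        print(speed, max_segment_size(speed))
```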
1 Mbyte payload is used for faster speeds, with the following waveforms chosen:
- STANAG 4539, Short interleaver, 9600 bps.
- STANAG 5069, 15 kHz bandwidth, Short interleaver, 57.6 kbps.
- STANAG 5069, 48 kHz bandwidth, Short interleaver, 240 kbps.
This gives a top narrowband speed, top WBHF speed and an intermediate speed.
Protocol | Speed | Utilization | Message Protocol Overhead | STANAG 5066 Overhead |
---|---|---|---|---|
ACP 142 | 9.6 kbps | 91% | 1.5% | 7.5% |
CFTP | 9.6 kbps | 89.5% | 3% | 7.5% |
ACP 142 | 57.6 kbps | 93% | 1.5% | 5.5% |
CFTP | 57.6 kbps | 90.5% | 4% | 5.5% |
ACP 142 | 240 kbps | 92.5% | 1.5% | 6% |
CFTP | 240 kbps | 90% | 4% | 6% |
These results show that good utilization is achieved at higher speeds. Notes:
- ACP 142 continues to have slightly better performance than CFTP.
- Performance at 57.6 kbps is slightly higher than at 9600 bps. This is primarily because the C_PDU maximum segment size is increased from 800 to 1023 bytes (the maximum).
- Performance at 240 kbps is very slightly lower. This reflects simulation of a small additional modem delay due to the increased modem processing needed at higher speeds. The delay is based on Isode measurements on an early WBHF modem, and may not be present in more recent products.
Measurements with varying Error Rate
Some measurements were made to examine the impact of modem errors. These measurements were made with 100 Kbyte payload at 9600 bps. A simple random bit error pattern was used, with two different error rates:
- 10⁻⁶: This is a relatively low error rate in the range of what might typically be experienced when transmission speed is chosen to maximize throughput.
- 10⁻⁵: This is a relatively high error rate in the range of what might typically be experienced when transmission speed is chosen to maximize throughput.
These are artificial values, chosen to show broad characteristics. There is a discussion on real errors, later in this paper.
Protocol | Error Rate | Utilization | Message Protocol Overhead | STANAG 5066 Overhead |
---|---|---|---|---|
ACP 142 | Clear | 89.5% | 3.5% | 7% |
CFTP | Clear | 88% | 5% | 7% |
ACP 142 | BER 10⁻⁶ | 88% | 3.5% | 8.5% |
CFTP | BER 10⁻⁶ | 86% | 5% | 9% |
ACP 142 | BER 10⁻⁵ | 72% | 3% | 25% |
CFTP | BER 10⁻⁵ | 64% | 4.5% | 30.5% |
Unlike the clear channel tests, runs with twenty 100 Kbyte messages under error conditions showed quite significant variation of throughput between runs. This is because there will be different throughput impacts dependent on exactly which frames are damaged. The error measurements are in line with what is expected, but the numbers are approximate. The values give a sense of behavior as errors increase. Essentially, as errors increase, the overhead to reliably transfer data over STANAG 5066 increases.
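To give a rough sense of why the overhead rises with error rate, the chance that a segment is damaged can be estimated from the bit error rate. This sketch assumes independent random bit errors (as in the simple pattern used for these tests), uses the 800-byte segment size quoted earlier for 9600 bps, and ignores D_PDU headers; the function name is hypothetical. It estimates segment damage only and does not attempt to reproduce the measured overheads.

```python
def frame_error_probability(ber: float, frame_bytes: int) -> float:
    """Probability that at least one bit in a frame is corrupted,
    assuming independent bit errors at the given bit error rate."""
    if not 0.0 <= ber <= 1.0:
        raise ValueError("ber must be between 0 and 1")
    bits = 8 * frame_bytes
    return 1.0 - (1.0 - ber) ** bits


if __name__ == "__main__":
    for ber in (1e-6, 1e-5):
        p = frame_error_probability(ber, 800)
        print(f"BER {ber:g}: {p:.2%} of 800-byte segments damaged")
```

Under these assumptions, well under 1% of segments are damaged at the lower error rate and around 6% at the higher one, which is consistent in direction with the jump in STANAG 5066 overhead between the two error rates.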
Latency Measurements
Latency measurements were made to show how long message transfers take. These measurements were made with simulation of STANAG 4539 short interleaver at 9600 bps. Messages with 1 Kbyte of payload were used, so data transfer should be less than two seconds. Ten messages were sent with 60 second gaps between messages.
Two scenarios were tested. In the first, the transfers excluded setup of STANAG 5066 CAS-1 soft links. This was achieved by using a long CAS-1 soft link timeout so that the link did not drop. Results are in the table below.
Protocol | Min | Max | Mean |
---|---|---|---|
ACP 142 | 12 secs | 13 secs | 13.6 secs |
CFTP | 12 secs | 13 secs | 13.6 secs |
It can be seen that the latency is consistent and reasonably short. The latency is almost identical between the protocols, and the slightly larger ACP 142 PDU is transferred in the same time as CFTP. We believe there is potential to reduce this time.
In the second scenario, the transfers included setup of STANAG 5066 CAS-1 soft links. This was achieved by using a very short CAS-1 soft link timeout so that the link was dropped between each message. Results in table below.
Protocol | Min | Max | Mean |
---|---|---|---|
ACP 142 | 20 secs | 23 secs | 20.8 secs |
CFTP | 20 secs | 22 secs | 20.7 secs |
It can be seen that establishing the soft link adds approximately seven seconds, which is reasonable for the CAS-1 link setup exchange.
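The soft link setup cost quoted above is simply the difference between the mean latencies in the two scenarios. A minimal sketch using the mean values from the two tables (the names are illustrative):

```python
# Mean latencies in seconds, copied from the two latency tables.
MEAN_WITHOUT_SETUP = {"ACP 142": 13.6, "CFTP": 13.6}
MEAN_WITH_SETUP = {"ACP 142": 20.8, "CFTP": 20.7}


def soft_link_setup_cost(protocol: str) -> float:
    """Extra mean latency (seconds) when the CAS-1 soft link must be
    established for each message, rounded to one decimal place."""
    return round(MEAN_WITH_SETUP[protocol] - MEAN_WITHOUT_SETUP[protocol], 1)


if __name__ == "__main__":
    for name in MEAN_WITH_SETUP:
        print(name, soft_link_setup_cost(name), "secs")
```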
Over the Air Measurements
The measurements here are made using the Isode MoRaSky modem simulation. We believe that similar numbers can be obtained with real radios operating Over the Air. It is intended to include such numbers in a future update of this paper.
“Above Modem” Architecture Measurements
An architecture to deliver bulk data over HF, using messaging protocols or otherwise, will have a number of layers above the modem, referred to here as the “Above Modem” architecture. It is useful to measure such an architecture using “perfect” communication in the modem and radio systems. This has been done here for STANAG 5066 and messaging protocols operating over STANAG 5066. This “perfect” measurement will show a maximum that can be achieved in practice.
The measurements here have shown that for blocks of data of 1 Mbyte or larger, 93% utilization can be achieved (i.e., a 7% overhead). This is a good baseline against which to measure alternative architectures.
A number of proposals are being made for “above modem” architectures, such as STANAG 5070 and various IP Services oriented architectures. Measuring these architectures over a “perfect modem” is a good starting point to determine if such architectures are viable alternatives to the protocols and architecture measured here.
Notes on STANAG 5066 Protocol Performance
Most of the overhead in the measurements here is at the STANAG 5066 layer. For larger blocks of data, the overhead of the messaging protocols is very small (approximately 1.5% for ACP 142).
The protocol overhead of STANAG 5066 is quite small. When the maximum C_PDU segment size (1023 bytes) is used, the basic overhead is around 2%. For bulk data, most packets will be at this size, although some will be smaller and there is some additional overhead. A 3% overhead might be expected overall.
At slower speeds it will make sense to use a smaller maximum C_PDU segment size. This will increase overhead. However, it means that D_PDUs are less likely to get errors, and when errors do occur the cost of retransmission will be lower.
At higher speeds, STANAG 5066 Ed3 suffers from window exhaustion for ARQ transfers, such as the ones used in these measurements. This is addressed in [STANAG 5066 Large Windows Support (S5066-EP5)]. In making these measurements, it was found that even at 9600 bps with a clear channel, S5066-EP5 improved performance by 3-4%. With data loss and higher speeds, the improvements arising from S5066-EP5 will be significantly larger.
The other key overhead is turnaround time. When transferring ARQ data, the other end needs to transmit acknowledgements. For the 9600 bps tests, the turnaround time was 4 seconds, reflecting common modem and radio characteristics without encryption. The measurements used (maximum) 127.5 second transmissions, which leads to approximately 3% overhead from turnaround. Note that if transmission length is reduced, which may improve latency for some applications, the throughput overhead would increase. For example, for one minute transmissions, the overhead would increase to 6%. This is considered in more detail in the Isode white paper [Reducing Turnaround Times in STANAG 5066], which also notes approaches to improving performance by reducing turnaround times.
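The turnaround figures can be reproduced with a simple model. This is a sketch under an assumption: overhead is taken as the turnaround time divided by the full cycle (transmission plus turnaround). The paper does not state the exact formula used, and the function name is hypothetical.

```python
def turnaround_overhead(tx_seconds: float, turnaround_seconds: float) -> float:
    """Fraction of each transmit/turnaround cycle lost to turnaround.

    Assumed model: one turnaround of fixed length follows every
    transmission of tx_seconds.
    """
    if tx_seconds <= 0 or turnaround_seconds < 0:
        raise ValueError("tx_seconds must be positive, turnaround non-negative")
    return turnaround_seconds / (tx_seconds + turnaround_seconds)


if __name__ == "__main__":
    # 4 second turnaround, as used for the 9600 bps tests.
    print(f"127.5 s transmissions: {turnaround_overhead(127.5, 4.0):.1%}")
    print(f"60 s transmissions:    {turnaround_overhead(60.0, 4.0):.1%}")
```

With a 4 second turnaround this gives roughly 3% for maximum-length transmissions and a little over 6% for one minute transmissions, in line with the figures above.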
Performance with Variable Speed
In a real deployment, the STANAG 5066 layer has additional parameters that it can control to optimize performance. These are discussed in the Isode white paper [Optimizing STANAG 5066 Parameter Settings for HF & WBHF]. The most critical parameter to select is transmission speed, but interleaver, maximum C_PDU segment size and transmission time are also important.
In evaluating real measurements, the “above modem” architecture throughput (with clear channel) is an important reference point. For example, with ACP 142 and transmission speeds around 9600 bps, this reference throughput is 91%. If real measurements were close to 91%, it is likely that speed selection is too conservative: a faster speed with higher error rate is likely to give better throughput. If the real measurement was (significantly) less than 45% (i.e., less than half the reference throughput), it is likely that speed selection is being too aggressive and that a slower speed would lead to better throughput.
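This rule of thumb can be written as a small check. The sketch below is illustrative: the function name and return strings are hypothetical, and the 5 percentage point margin used to decide what counts as "close to" the reference is an assumption, not a value from this paper.

```python
def speed_selection_hint(measured_pct: float,
                         reference_pct: float = 91.0,
                         close_margin_pct: float = 5.0) -> str:
    """Classify a measured utilization against the clear channel reference.

    reference_pct    -- clear channel utilization (91% for ACP 142 at 9600 bps)
    close_margin_pct -- assumed margin for "close to the reference"
    """
    if measured_pct >= reference_pct - close_margin_pct:
        # Almost no loss: a faster speed would probably carry more data.
        return "too conservative"
    if measured_pct < reference_pct / 2:
        # Less than half the reference: retransmissions dominate.
        return "too aggressive"
    return "plausible"


if __name__ == "__main__":
    for measured in (90.0, 70.0, 40.0):
        print(measured, speed_selection_hint(measured))
```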
Conclusions
This paper has provided a standard approach for measuring application level bulk throughput for messaging protocols. It has shown that protocols operating over STANAG 5066 ARQ can achieve greater than 90% throughput. This provides a reference number against which alternative “above modem” architectures can be evaluated.
The paper has shown that ACP 142 provides slightly better performance for data volumes over 10 Kbytes. For small messages, CFTP has slightly better performance. ACP 142 has other advantages over CFTP (multicast, priority handling, delivery reports, and much better compression where compressible attachments are sent).
It is anticipated that a future version of this paper will show that SLEP has (slightly) better performance than either CFTP or ACP 142, but does not support multicast.