FT8 hash collisions

By   7 March 2023 21:00

What the… are hash collisions? What has it to do with FT8?

This article explains that the FT8 protocol can confuse call signs and the software seems to behave erratically.

What is a hash?
In short, a hash code is a number, usually calculated via bit operations. Hash codes are widely used in data storage applications. The general idea is to calculate short numeric values which are unique for each record. The hashes are stored in an index table. Searching through hash tables is a lot faster than searching in the longer records themselves.

However, the shorter the hash code, the greater the probability that identical hash codes are found for differing records.This ambiguity is known as hash collisions.

The number of bits determine the amount of possible unique values. With 1 bit, we have 2 possible values, 0 and 1. When we add bits, the value is doubled with each additional bit. 2 bits can represent 4 binary values, 00, 01, 10 and 11. This corresponds with decimal values 0, 1, 2 and 3.

A byte, for example, has 8 bits and can represent 256 different values.

The FT8 protocol uses hash codes of 10, 12 and 22 bits. The short ones have 1024 or 4096 possible values and that limitation can cause problems. First, we will have a closer look at the FT8 protocol.

Amateur radio call signs in FT8
The team behind the FT8 protocol did a tremendous job. Many just see the screen and make QSO’s, not realising what happens under the hood.

The developers tried to minimise messages as much as possible, to maximise the efficiency. I will not go into detail, but one of the “tricks” is to translate call signs into bit sequences which are much shorter than transmitting one symbol for each character. More details are found in the appendix below.

Standard call signs
Call signs of up to six characters and which conform to usual call sign patterns are coded in 28 bits, without ambiguity. The receiver is therefore always able to decode standard calls correctly. Longer call signs cannot be coded this way and require more bits.

The full message uses 77 bits, of which 3 bits are used to identify the message type, leaving 74 bits for the message content. Additional bits are transmitted on top of the 77 bits for error correction and synchronisation.

If two standard call signs are transmitted, we need 2 x 28 = 56 bits. The remaining bits until we reach 74 are used to exchange additional information, like report, grid, 73 etc.

Non standard call signs
FT8 can handle non standard call signs up to 11 characters or ones that do not fit into the standard pattern, for example F/PA2S or PA2023EVENT.  The non standard coding scheme requires 58 bits, which only leaves 16 bits remaining. The second call sign does not fit into this space. In order to get two call signs into one message, FT8 shortens one of the two call signs to a hash code of 12 bits (or 10 bits for a DXpedition).

When the software receives a message with a hash code, it tries to find a recently decoded call sign for which the received hash code matches. WSJT-X shows the matching call sign as <CALLSIGN>. If no match is found, the missing information is displayed as <…>.

The hash code can also match with the hash of the own call (“mycall”), in which case the software will display the own call sign as <MYCALL>.

Suppose that F/PA2S calls ZL2CC. The message will contain the long 58 bits code for F/PA2S and only a hash code for ZL2CC.

The hash code received by ZL2CC should match with the code for his own call sign and WSJT-X will then display <ZL2CC> F/PA2S.

But there is a problem
It is not difficult to understand that 4096 different hash codes can never represent a far greater number of call signs and collisions exist. It is not possible to reconstruct the call sign from the hash code. Even if we knew all assigned call signs, we would find that many of them would have the same hash code.

It is a bit like a situation where persons with the same name are called. Both will respond when they hear their name. The name resembles the hash, which is too short to identify each of those persons. One would need an extra FT8 period to ask the caller who was addressed.

This is what happened recently when I called ZL2CC from France as F/PA2S. Later on, ZL2CC told via e-mail that he did not see me calling him, but saw me calling <PG6PEACE> F/PA2S a number of times.

I had a suspicion that it could have to do with the hash codes, but I had not studied the FT8 protocol in detail and did not know the hash code algorithm in order to check if my suspicion was correct.

An internet search identified some helper programs, published by the WSJT-X developers as additional files on the ARRL website. One of the helper programs can calculate the FT8 hash codes. It was confirmed that the 12 bit hash codes for ZL2CC and PG6PEACE are identical and the software displayed the “wrong” call. I could also confirm in the ALL.TXT file of ZL2CC that the PG6PEACE call sign was decoded about 15 minutes earlier on 21 MHz. The result was that WSJT-X mapped the hash code to the decode of PG6PEACE instead of to his own call ZL2CC.

hashcodes_output

Similar effects were seen recently on 60 metres with FM/VE3DZ and VK4MA, both having the same hash code. Some stations mapped it to the VK callsign and started calling the VK, hoping for a new one. I assume that the VK call was decoded before on a different band, because VK does not have an allocation for 60 m.

Another recent example was 3C3CA (Equatorial Guinea) who was a “new one” from France. I called many times and did not understand why I did not get a response. I normally use JTDX, because it decodes better than WSJT-X and I also prefer the user interface. But I gave it a try with WSJT-X instead but it did not help. Next, I switched to my French call sign F4VUY and voila, QSO made after a few calls.

The next day, I called 3C3CA again with F/PA2S and it worked flawlessly.

I have not seen any files in the AppData directory with hashes or the like, so I assume that hash codes are kept in memory and are flushed after stopping the software. I have not looked into the inner workings yet, also because it requires quite some time to study the code. But it seems likely that restarting can flush hash code mappings.

Resolution?
It will be difficult to find a solution that fits all. Especially if two stations with the same hash are operating at the same time, it will definitely lead to confusion, which cannot be resolved as far as I can see. Even if the own call sign would be given preference, it would imply that contacts can be made with the wrong station, being interpreted as having worked the wanted one.

It is obvious that non standard call signs have a hash collision risk and should better be avoided. For me, it means that I will switch to F4VUY if a “new one” appears, to be on the safe side.

But it also implies that any addition to the call sign like /QRP is not preferable. On 60 metres, adding QRP is a bit nonsensical anyway, because 15 W is QRP (for me at least).

The same goes for special event call signs, which can complicate things as well, especially on a busy band.

I would also think that stopping and restarting the software can erase earlier hashes and as long as “conflicting” call signs are not active at the same time, the problem will be less likely to occur.

Finally, it might be a good idea to get a warning that hash codes have collided with the own call. Perhaps it would also be of interest to see which other call(s) were matching the hash code. The operator can then decide what to do. Maybe it would be an option to add a feature to delete the confusing call sign(s) from memory.

Appendix

Standard call sign coding method
Let us take my previous call sign, PA2HJS. In 8 bit ASCII code, this would require six bytes, totalling 48 bits. Because the ASCII code includes lower case characters plus all sorts of other characters, which we do not need, it implies that we transmit more bits than needed.

We could reduce to 6 a bit code (64 combinations, which would be sufficient). It would result in (6 x 6 =) 36 bits.

A simple ASCII like coding scheme would still require 6 + 6 + 5 + 5 + 5 + 5 bits, totalling 32 bits. Further reduction could be achieved by not using unneeded characters in the coding scheme (like values after the Z) in which case “unused” bits are eliminated.

FT8 only uses 28 bits to code call signs. This was achieved by creating tables of characters with only those necessary. The position (index) in the table is used for the call sign message part.

For the first and second character of the call signs, 0-9 and A-Z are valid (26+10=36 combinations), the third character only allows 0-9 (10 combinations) and for characters on positions 4 to 6, only A-Z is valid (26 combinations). Anything not fitting in this scheme, is assumed to be a non standard call sign.

Each character of the call sign corresponds to an index in the ranges 0-9-A-Z, 0-9 or A-Z range. The indices found are multiplied and added.

The example below shows some codes. We see that if we change the last character from A to B, the result is one higher. If we change the fifth character from A to B, the result is 27 higher than AA0AAA.

Callsign: AA0AAA  c28 as decimal integer:  86171633

Callsign: AA0AAB  c28 as decimal integer:  86171634

Callsign: AA0ABA  c28 as decimal integer:  86171660

The call sign is found by reversing the calculation, and looking up the characters in the tables.

Message contents
This article focuses on the way, call signs are handled and the disadvantage of the use of hashes to fit longer or compound call signs into the protocol.

The FT8/FT4 messages contain a number of parts, like call signs, reports, RRR/RR73/73 plus fields for contests and field days. A random text message of up to 13 characters can also be sent. More information is found in reference [1].

References
[1] Joe Taylor, K1JT, Steve Franke, K9AN, and Bill Somerville, G4WJS, The FT4 and FT8 Communication Protocols, QEX July/August 2020 (ARRL)