10 years of Fonetic Voice Surveillance in the trading room

Fonetic Team
Wednesday, 05 December 2018 / Published in Compliance, machine learning, voice surveillance, Trading

Many banks feel that voice surveillance is still something reserved for the movies. Having been in production in trading rooms for 10 years now, we can safely say that it is a reality for those who have invested in it. In fact, we would go further to say that effective voice surveillance is not only possible, it's also accurate and in Fonetic's case, it's the most accurate on the market today.

Do you want to know how we did it? 

Ten years ago, a global bank asked Fonetic to extend its voice analysis capabilities to the trading floor. The bank's aim was to automatically analyse every voice communication to check for misconduct or market abuse. They succeeded, and 6 months later they left voice sampling behind. We spent the following 9 years and 6 months running into stumbling blocks and overcoming them, developing a solution which is without doubt the most accurate voice surveillance software around for trading floor environments.

This article will show you why you should do the same with your voice surveillance.


When a global bank tasked Fonetic with expanding its capabilities in voice analysis, it marked the start of a brand-new era for this financial institution and for Fonetic as a company. Today, the relationship continues. What started as voice communications surveillance alone has grown into a global initiative that stretches over many countries and includes not only audio monitoring but also e-comms surveillance and trade reconstruction. The solution remains one of the most sophisticated surveillance systems within a financial institution.

To celebrate our 10th anniversary in the voice domain for FIs, here are the 8 major achievements by Fonetic in voice comms analysis over the past decade:

1. Seamless Data Capture

We don’t need our customers to provide the recordings. We simply connect to the source and retrieve their voice comms from turrets, recorders, MVR and carriers, as well as from attachments in emails, WeChat and other content traditionally treated as e-comms. Simple. Straightforward. Solved by our experts.

2. Unique Audio Processing

Every audio channel is different, and every phone call has its quirks. Whether working in noisy environments or with non-linear conversations, we select the right acoustic models and parameters for each call environment to improve audio quality and transcription accuracy. This means we achieve the best possible results even when employees are using trading jargon or not speaking in a linear fashion, which is not uncommon on the trading floor.

Read more on tackling conduct risk by better voice surveillance techniques.

3. Multi-language support

In an increasingly globalised society, banks need to stay ahead of the curve when it comes to monitoring their employees. English was once the standard language for banks to trade in, but more and more languages are now being used on trading floors. While Wall Street still reigns as the financial capital, with the two largest exchanges in Manhattan, Europe is home to the third largest exchange, where the most common languages are French, Italian and Spanish. And that’s not to mention Japan and China.

Our system currently works in more than 20 languages and variations, giving financial institutions a solid roadmap towards global voice surveillance. We also support much more than that: nearly twice as many languages are ready and waiting to be deployed on request.

4. Multi-language or language switching detection

Employees on trading floors in mainland Europe are far more likely to be multilingual than those in the UK or US. This means that employees will often switch languages halfway through a phone call when they want to bypass communications surveillance software. This tends to happen when they intend to behave inappropriately or commit market manipulation.

We recently answered the question of why detecting multiple languages within the same call is essential to modern-day compliance. It’s the key to preventing traders from avoiding surveillance systems by changing languages mid-conversation. We detect the language being spoken before transcribing, which makes these kinds of tactics useless against Fonetic's technology.
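To make the idea of detecting the language before transcribing more concrete, here is a minimal sketch. It assumes that some language-identification step has already labelled short audio segments with a language code. The segment format and the function names `language_runs` and `has_language_switch` are hypothetical and chosen for illustration only; they do not describe Fonetic's implementation.

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical input: one (start_seconds, end_seconds, language_code) tuple
# per short audio segment, produced by an assumed language-identification step.
Segment = Tuple[float, float, str]


def language_runs(segments: List[Segment]) -> List[Segment]:
    """Merge consecutive segments that share a language into a single run."""
    runs: List[Segment] = []
    for lang, group in groupby(segments, key=lambda seg: seg[2]):
        members = list(group)
        # A run starts where its first segment starts and ends where its last ends.
        runs.append((members[0][0], members[-1][1], lang))
    return runs


def has_language_switch(segments: List[Segment]) -> bool:
    """A call contains a switch when it has more than one language run."""
    return len(language_runs(segments)) > 1


call = [(0.0, 5.0, "en"), (5.0, 10.0, "en"), (10.0, 15.0, "es"), (15.0, 20.0, "en")]
for start, end, lang in language_runs(call):
    # Each run could then be sent to the transcription model for its language.
    print(f"{start:>5.1f}s to {end:>5.1f}s: {lang}")
```

Grouping the call into language runs first means each stretch of speech can be routed to a model for the right language, so a mid-call switch does not leave part of the conversation untranscribed.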

5. The highest speech to text accuracy on the market

The amount of data that compliance teams must process daily is insurmountable. Advances in technology such as Machine Learning (ML) and Natural Language Processing (NLP) have greatly improved the ability to automatically understand communications.

Fonetic's technology is currently capable of detecting 60 different kinds of entities with an F1 score of up to 90% in real trading voice communications. These 60 entities - ranging from general words like “countries” to very specific terminology such as “swap type” - allow our customers to understand communications without the need to explicitly define values for these entities.
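For readers less familiar with the metric, the F1 score is the harmonic mean of precision (how many detected entities are correct) and recall (how many real entities are detected). The sketch below shows the standard calculation. The counts used in the example are made up for illustration and are not Fonetic's results.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct detections: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only: 90 entities found correctly, 10 spurious detections,
# 10 entities missed. Precision and recall are both 0.9 here.
score = f1_score(true_positives=90, false_positives=10, false_negatives=10)
print(f"F1 = {score:.2f}")
```

Because F1 balances precision and recall, a system cannot score well simply by flagging everything or by flagging almost nothing.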

6. Voice biometrics

Speaker recognition is the science of identifying a person from the characteristics of their voice. This is now a reality for customer authentication. At Fonetic, we have used voice biometrics to overcome challenges such as legal restrictions and cumbersome enrolment processes, and we have successfully implemented solutions in different business environments, with use cases including black lists, identification and verification.

7. Voice sentiment and noise event detection

Detecting anger, whispering, laughing, noise and noise events is now possible with Machine Learning algorithms and the right selection of features to train the systems. We have successfully implemented more than 5 different audio-based indicators to enhance understanding of the emotional context in which communication takes place, beyond what is said.

8. Background noise and silence detection

In addition to recognising sentiment in a person’s voice, Fonetic also detects background noise such as what you would find in street environments or on a trading floor. Noisy audio tracks are subject to signal treatment, which improves the overall quality of the output and the accuracy of language detection and speech to text transcription.

Detecting silences or duplicates on calls enables the system to remove unwanted audio and withhold it from processing, thus decreasing the amount of comms to process.
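As a simplified illustration of this kind of filtering, the sketch below drops audio frames whose energy falls under a threshold and removes byte-identical duplicate recordings. The energy threshold, the frame representation and the function names are assumptions made for this example; a production system would typically use more robust techniques than a fixed threshold and an exact hash.

```python
import hashlib
from typing import List, Sequence


def is_silent(frame: Sequence[float], threshold: float = 0.01) -> bool:
    """Treat a frame as silent when its RMS energy is below the threshold."""
    if not frame:
        return True
    rms = (sum(sample * sample for sample in frame) / len(frame)) ** 0.5
    return rms < threshold


def drop_silence(frames: List[Sequence[float]], threshold: float = 0.01) -> List[Sequence[float]]:
    """Keep only the frames that carry audible signal."""
    return [frame for frame in frames if not is_silent(frame, threshold)]


def drop_duplicates(recordings: List[bytes]) -> List[bytes]:
    """Keep the first copy of each recording, comparing by SHA-256 digest."""
    seen = set()
    unique = []
    for audio in recordings:
        digest = hashlib.sha256(audio).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(audio)
    return unique


# Samples are assumed to be floats normalised to the range [-1.0, 1.0].
frames = [[0.0, 0.0], [0.5, -0.5], [0.001, 0.002]]
kept_frames = drop_silence(frames)
kept_recordings = drop_duplicates([b"call-a", b"call-b", b"call-a"])
print(len(kept_frames), len(kept_recordings))
```

Filtering before transcription means the more expensive steps only run on audio that can actually contain speech.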

Fonetic Trade Comms Suite now provides financial institutions with complete understanding of fraud, misconduct and execution through monitoring every communications channel. E-comms and v-comms are automatically linked to their related trades and orders and offer a deeper understanding of the trading activity.

This wouldn’t have been possible without our engineering team, our delivery teams and the continuous reinvestment in R&D, amounting to over $12m invested so far, as we strive to remain leaders in voice surveillance for financial services.

Everything started with voice, the hardest part of the problem.

To all who made this possible and all who already enjoy the benefits, happy 10th birthday to Fonetic Voice Surveillance!

Find out more about our surveillance solutions with this free datasheet.
