The AI bot that speaks Arabic better than most native speakers

The Saudi company Maqsam has launched a new Arabic voice bot that sets new standards and can be used to communicate with customers. In this interview, Maqsam’s developers explain why it is so difficult to create an Arabic AI voice bot — and how they finally managed to do it.

Last updated: 1 month ago

Anyone who speaks Arabic knows the problem: as soon as you speak to someone from another Arab country, it quickly becomes difficult to understand what is being said. This is where AI technology can help.

The new Arabic chatbot from Maqsam has an amazingly high level of Arabic grammar and pronunciation that puts most native Arabic speakers to shame. I had previously tested Maqsam’s speech recognition tool and have now taken a look at the new Arabic chatbot, which is quite impressive.

In the following interview, Rana Qubain (رنا قبعين) explains on behalf of her colleagues and developers how Maqsam’s chatbot works and where the difficulties lie in developing Arabic AI chatbots.

About the company Maqsam

Maqsam (مَقسَم) started as a phone company offering cloud communication services. The startup was co-founded in 2019 by Sinan Taifour (سنان طيفور) and Fouad Jeryes (فؤاد جريس), both specialists in the regional technology industry. Today, Maqsam is one of the leading AI companies in the Middle East. It is a Saudi company with subsidiaries in Cairo, Amman, the UAE, and Qatar.


What is Maqsam’s AI Voice Bot?

Our AI Voice Bot is a dual-modal chatbot capable of receiving text and audio input and generating text and audio output. It has been trained to understand and reason across different domains and Arabic dialects, and has been instructed to interact in a conversational manner, especially in conversations involving customer interactions and inquiries.

How is your Arabic voice bot different from other Arabic chatbots?

The system manages both speech and text input and output, providing a flexible and comprehensive communication experience. It features enhanced reasoning capabilities, enabling it to deliver more refined and intelligent responses. This is achieved by effectively handling the grammatical complexity of modern Arabic, resulting in superior performance compared to existing market solutions. In addition, the system demonstrates robustness through its ability to adapt to different vocabularies and everyday conversational scenarios, including chats and inquiries.

It also supports a wide range of Arabic dialects and accepts both text and voice input, enhancing its usability across different linguistic communities. The system offers extensive customization options, allowing it to be tailored to different industries and use cases, making it ideal for multi-domain applications.


An example of how the Arabic AI bot deals with questions

In this funny example, a woman sarcastically asked the bot if the insurance company could compensate her for her dead husband by giving her a new one. The bot, however, stayed calm and did not give any crazy answer. Instead, the bot answered her normally and in context.

You may need to open the video in full-screen mode to be able to read the question. But the most impressive part is how the bot speaks Arabic (time stamp 0:11).

Would you like to give the bot a try yourself?


How big is Maqsam’s team?

We have a team of scientists and a network of advisors around the world with experience at leading companies such as Google, Waymo, and OpenAI.

Our research unit, headed by Asma Hakouz, has nine machine learning and software engineers with experience in speech and text processing, particularly Arabic Natural Language Processing (NLP) systems.

Asma Hakouz started her journey with deep learning in bioinformatics, focusing on predicting protein interactions for drug design during her master’s studies. She then moved on to automatic speech recognition (ASR) and natural language processing (NLP), and also gained knowledge of MLOps while working at Maqsam.

Who can use your Arabic chatbot and for what purposes?

Maqsam’s AI voice bot is mainly designed to be used by customer support and sales teams and to manage general inquiries from a business’s end customers. The product is not yet available to customers. We plan to offer flexible pricing that varies based on how much a company uses the bot, its features, and how much customization is needed.

The bot can be used in many ways. It can be integrated across a company’s communication channels, wherever our customers deem it useful: in-app, on-site, over messaging, and over traditional voice and telecom channels. It can also be added to applications via an SDK (Software Development Kit) and used on digital communication channels such as WhatsApp or a website via a widget. In addition, because the model is built in-house, companies can install it on-premises.


What does the Arabic word Maqsam (مَقسَم) mean?

The name “Maqsam” (مَقسَم) was chosen by the co-founders after an extensive brainstorming session. They considered various names and ultimately decided that “Maqsam” sounded the best and was closest to what the product offered at the time. In Arabic, “Maqsam” refers to a device or system used to route telephone calls to the appropriate destinations. This term was mainly used in the past, before the advent of digital systems and Internet calls.

Now, what kind of word is مَقسَم? The root is ق-س-م which has the core meaning of to split, but also to distribute. The noun مَقْسَمٌ is the noun of place (اِسْمُ الْمَكانِ) and denotes the place where things are distributed/split.

If you can’t yet imagine what the Arabic word “Maqsam” means in telecommunication, here’s a video that shows some devices. Note that this video has nothing to do with the company Maqsam.

YouTube video about a telephone switchboard system (مَقسَم)

On your website it says “Trained to accurately understand Arabic diacritics”. What do you mean by that?

One of the challenges in developing accurate and natural text-to-speech systems in Arabic is correct pronunciation, which is usually achieved by correct diacritization of the text. Accurate Arabic pronunciation has long been a challenge due to the complexity of the language’s orthography.

Without proper diacritics, a written Arabic text can be unclear, leading to misinterpretations and problems in speech synthesis. As a result, we paid special attention to building the model to generate proper Arabic diacritics, ensuring that the generated speech sounded natural and pronounced the sentence accurately.
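To see why missing diacritics make a text ambiguous, here is a minimal Python sketch. It is only an illustration and not Maqsam's method: the function name `strip_diacritics` and the choice of example words are assumptions made for this article. It removes the vowel signs from three different words built on the root ك-ت-ب and shows that they all collapse into the same written form.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks (Unicode category Mn), which include the
    Arabic short-vowel signs such as fatha, damma, kasra and sukun."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Three fully vocalized words that share the same consonant skeleton.
# (Illustrative examples, not taken from the interview.)
words = {
    "كَتَبَ": "kataba, 'he wrote'",
    "كُتِبَ": "kutiba, 'it was written'",
    "كُتُب": "kutub, 'books'",
}

for word, gloss in words.items():
    # Every line prints the same bare form: كتب
    print(strip_diacritics(word), "<-", word, f"({gloss})")
```

A text-to-speech system that only sees the bare form كتب has to work out from the context which of these pronunciations is meant, which is the step the diacritization described above takes care of.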



What Arabic data was used to train the AI?

A collection of labeled and unlabeled data was used for training and tuning the model. We used Arabic data available in books, websites and Wikipedia to improve the model’s understanding of the Arabic language and its dialects.

The conversational data was built in-house, drawing on our experience with conversations and customer interactions. This data is essential for moving the language model from text completion mode to conversational mode.

What is the most difficult aspect of creating an Arabic chatbot?

The scarcity of Arabic data, its dialectal variation, and the fact that the NLP community does not support Arabic as well as it supports English (a lack of open-source models, toolkits, packages, etc.).

The scarcity of data is also magnified by the fact that Arabic speakers do not have a standard way of writing, which means that the difference in dialects is also shown in writing, unlike English where the different dialects are mainly represented in the spoken form.

What Arabic dialects does the bot understand?

The model can understand and directly process Modern Standard Arabic, Gulf dialects, and Levantine dialects.

What types of Arabic text are hard for the bot to understand?

The low availability of Arabic data affects the tokenizer, which is responsible for converting the text into numbers to be processed by the model. The difference in data size is reflected in the number of tokens needed per word.

For example, each word in English corresponds to an average of 1.2 tokens, while this average is about 3.8 for Arabic. Since the window size for the language model is limited to 8k tokens, it is easier for the model to process longer English texts than Arabic texts.
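The effect of these averages can be worked out with simple arithmetic. The sketch below only uses the figures quoted above (1.2 and 3.8 tokens per word). It reads "8k" as 8,000 tokens, which is an assumption, since the exact window could also be 8,192, and the helper name `words_that_fit` is made up for this example.

```python
def words_that_fit(window_tokens: int, tokens_per_word: float) -> int:
    """Approximate number of words that fit into a context window."""
    return int(window_tokens / tokens_per_word)


WINDOW_TOKENS = 8_000  # "8k" from the interview, read as 8,000 (assumption)

english_words = words_that_fit(WINDOW_TOKENS, 1.2)  # 6666
arabic_words = words_that_fit(WINDOW_TOKENS, 3.8)   # 2105

print(f"English: about {english_words} words fit into the window")
print(f"Arabic:  about {arabic_words} words fit into the window")
print(f"English texts can be about {3.8 / 1.2:.1f} times longer")
```

Under these assumptions, the same window holds roughly 6,700 English words but only about 2,100 Arabic words, so an English text can be around three times as long.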


Deep dive: What is a tokenizer in AI technology?

A “tokenizer” plays a crucial role in AI language models by transforming raw text into manageable units called tokens. These tokens can be as small as single characters or as large as entire words or phrases, depending on the specific requirements of the AI model.

Tokenization allows the model to break down complex text into smaller, more digestible pieces, enabling more effective processing and understanding of language patterns. This preprocessing step is fundamental for many natural language processing (NLP) tasks, as it helps bridge the gap between human language and machine-understandable data.

For example, consider the sentence: “The cat sat on the mat.” A tokenizer might break this down into tokens such as [“The”, “cat”, “sat”, “on”, “the”, “mat”].

This tokenized output can then be fed into an AI model for further processing, such as sentiment analysis or language translation. By converting text into consistent and meaningful tokens, tokenizers help AI models perform a variety of linguistic tasks with greater efficiency and accuracy.
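The idea can be sketched in a few lines of Python. This is a deliberately simple word-level tokenizer written for this article, with the hypothetical helper names `tokenize` and `build_vocab`. It is not how Maqsam's tokenizer works: production language models typically use subword tokenizers, which is why a single Arabic word can end up as several tokens.

```python
import re


def tokenize(text: str) -> list[str]:
    """Word-level tokenization: keep runs of letters and digits, drop punctuation."""
    return re.findall(r"\w+", text)


def build_vocab(tokens: list[str]) -> dict[str, int]:
    """Give every distinct token an integer ID, in order of first appearance."""
    vocab: dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


sentence = "The cat sat on the mat."
tokens = tokenize(sentence)
vocab = build_vocab(tokens)

# The model never sees the words themselves, only their numeric IDs.
ids = [vocab[token] for token in tokens]

print(tokens)  # ['The', 'cat', 'sat', 'on', 'the', 'mat']
print(ids)     # [0, 1, 2, 3, 4, 5]
```

Note that “The” and “the” get different IDs here because this toy tokenizer is case-sensitive. Real tokenizers make such design choices deliberately and learn their vocabulary from large amounts of text.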


How important are grammar mistakes when talking to a chatbot in Arabic?

Input is processed contextually at the sentence or input level, rather than being parsed word by word. As a result, minor spelling or grammatical errors that do not change the overall meaning are acceptable.

The model is designed to handle these errors comfortably, ensuring accurate understanding and response despite such imperfections.

Does the chatbot understand Arabic text in transcription or transliteration?

Yes, the system can understand Arabic text written in transliteration and generate appropriate responses to such queries, but it does not produce responses in transliterated form.


Rana Qubain (رنا قبعين), thank you very much for your time and for sharing your expertise.

