A research team at Carnegie Mellon University in Qatar (CMU-Q), a Qatar Foundation partner university, is exploring and analysing dialects in Qatar.
"Our main goal is to expand Qatar's knowledge base when it comes to the Qatari dialect, heritage, culture, and identity," says Zeinab Ibrahim, teaching professor of Arabic studies and the lead principal investigator on the project to create an interactive map of the Qatari dialect.
The project is funded by the Qatar National Research Fund's (QNRF) National Priorities Research Programme.
Principal investigators include Houda Bouamor, assistant teaching professor of information systems at CMU-Q, as well as Aisha Sultan from Doha International Family Institute and Hany Abdelrhem from Georgetown University in Qatar.
For the project, the research team is tracing the social and geographic variations of the Qatari dialect over generations, and creating a digital tool to explore pronunciation, usage, and expressions.
Ibrahim believes CMU-Q's research can help preserve and promote Arabic language learning in Qatar. "I have been living in Qatar for quite a while now, and I've noticed a lack of references on the local dialect, which has changed over the years. Also, a lot of people move and work here, and would like to learn the Qatari dialect, but there is neither a reference nor a textbook available for that," she says.
"Thus, the outcome of this research effort can be used to develop curricula that help Qatari students learn Standard Arabic."
Bouamor is working on the second part of the project: to assess Qatari dialect usage from a computational linguistics perspective. "Looking at the way people write on social media, for example, we notice that they either use English or the Qatari dialect. It is therefore important to establish references of the language resources used. Dialects differ from one country to another, even within the Arabian Gulf. The Emirati dialect differs from the Kuwaiti one, for example. Therefore, we must conduct actual research to determine whether a general 'Gulf' dialect exists or if each country has its own specific dialect."
"We notice that many people over the age of 60, for example, use language expressions different from those used by youth in their 20s. Thus, we must track these changes, and draw up a reference map."
As part of the efforts undertaken to develop an interactive linguistic map of the Qatari dialect, the research team is working on collecting data from native speakers and gathering linguistic vocabulary in its basic form.
With the participation of Qatari researchers, the project features interviews with Qatari individuals in an effort to build standard written conventions for the Qatari dialect, and to digitise and analyse this information using natural language processing and machine learning techniques.
Hamed al-Qahtani is a research assistant on the project and represents the Bedouin dialect. "As part of our work, we had to conduct interviews with people of different ages, covering five specific topics relating to heritage and old customs and how they changed over time, as well as the nature of past work and the difference between the past and the present. We also asked participants about their take on contemporary issues, such as Qatar hosting the World Cup."
The research endeavour has nevertheless faced several challenges, the most prominent of which has been gaining people's trust to get them to speak naturally and spontaneously, says Delma al-Hajri, another research assistant. "Among the difficulties we encountered is some people's reluctance to participate in interviews. Some were not interested in the subject, or didn't want the conversation on record for privacy concerns, despite being assured that the information would be used for research purposes only."
The research project also provides a valuable resource that can be used to create tools that automatically process the Qatari dialect, starting with a linguistic map of Qatar, says Bouamor.
"From a computational perspective, this is a great resource. For example, we can draw a map showing where specific words are more commonly used, which represents one of the key objectives of this project. From a natural language processing perspective, we can build a morphological analysis tool using machine learning. Another application is to build machine translation systems or tools to search for documents and information using a purely local dialect."
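As a rough illustration of the word-usage map Bouamor describes, the sketch below counts how often each word appears in transcribed speech from each area and reports where a given word is relatively most common. It is only a minimal sketch under stated assumptions: the region labels, placeholder tokens, and function names are hypothetical and are not taken from the project's data or tools.

```python
from collections import Counter, defaultdict


def relative_frequencies(records):
    """Return {region: {word: share of that region's tokens}}.

    `records` is an iterable of (region, utterance) pairs; utterances are
    split on whitespace, which is a simplifying assumption.
    """
    counts = defaultdict(Counter)
    for region, utterance in records:
        counts[region].update(utterance.split())

    freqs = {}
    for region, counter in counts.items():
        total = sum(counter.values())
        freqs[region] = {word: n / total for word, n in counter.items()}
    return freqs


def top_region(freqs, word):
    """Return the region where `word` has the highest share, or None if unseen."""
    best = max(freqs, key=lambda r: freqs[r].get(word, 0.0), default=None)
    if best is None or freqs[best].get(word, 0.0) == 0.0:
        return None
    return best


# Hypothetical placeholder data, not real interview transcripts.
records = [
    ("region_a", "word_x word_y word_x"),
    ("region_a", "word_x word_z"),
    ("region_b", "word_y word_y word_z"),
]

freqs = relative_frequencies(records)
for word in ("word_x", "word_y", "word_z"):
    print(word, "is most common in", top_region(freqs, word))
```

Using each region's share of tokens rather than raw counts keeps areas with more recorded speech from dominating the comparison. A real map would also need geographic coordinates and far more data per area.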