From left to right: The Bil-Arabi initiative to encourage Arab people to use the Arabic language exclusively on the Internet; the non-profit website nok6a, which gathered 50 specialized Arab volunteers who translated and wrote 3200 scientific articles.
Sawsan al-Abtah’s recent article in Asharq al-Awsat exposed the lackluster presence of Arabic on the internet today. This topic is hardly a new one: since the early 2010s, discussions about broadening and streamlining the use of Arabic online have raised concerns about the language’s underrepresentation in the digital world. The outbreak of the Arab Spring then led to a larger Arab userbase, with increased citizen journalism, organized activism, and the availability of platforms allowing more freedom of speech and expression. With an abundance of Arabic-speaking users, one would expect Arabic content to develop naturally. On the contrary, al-Abtah points out that Arabic content has been and continues to be overshadowed by content in other languages. Today, we have become more conscious of the shortcomings of Arabic content amid a global pandemic that has pushed large portions of work, education, and social life online, with very few Arabic-language resources available.
“Strong content is a source of pride for people and would open new economic opportunities,” explains al-Abtah, who calls for more significant efforts in bridging the content gap. She notes, however, that Arabic makes up only 5% of total content online and contributes 4% to the new economy, versus 22% from content in other languages. The 2020 study “Top Ten Languages Used on the Web” recorded Arabic as the 4th most used language among internet users, accounting for 5.2% (237,418,349) of users worldwide. By comparison, actual Arabic content on the web ranks 11th out of 34 languages at 1.1% of websites published online, according to the study “Usage Statistics of Content Languages for Websites.”
The lack of original Arabic content may be one of the most significant issues facing the language online. Arab countries produce little of their own content and rely on importing others’ digital innovations. In the words of al-Abtah, “soon, the valuable information we need in medicine or literature, and perhaps entertainment will come at a price.” Most users spend their time checking emails, downloading files, socializing, job-hunting or shopping, activities which “involve extracting content from the internet, as opposed to introducing it,” according to Angela Franco from the blog Pangeanic. Online Arabic resources lack research, articles, ebooks, documentaries, and educational videos. The Arabic Wikipedia, for example, remains incomplete.
The problem is not just a lack of quantity but also of quality. Al-Abtah notes that Arabic content online includes inaccurate and unreliable information. It also suffers from irrelevancy: searching for information in Arabic often yields irrelevant results or results in the wrong language, said Franco. Dima Abusamra of the website Medium found in Motaz K. Saad and Wesam Ashour’s “OSAC: Open Source Arabic Corpora” report that most content is user-generated or machine-translated, with a severe lack of original, localized, and high-quality content. The need for variety and quality in content has pushed many to rely on material in other languages. Abdel Salam Haikal, the founder of the company “Galaxy,” which is trying to mend the situation, explained in Asharq al-Awsat, “Most of the material in Arabic is entertainment, religious, or just a snapshot of time. This compels the Arab, even though he is often not deeply proficient in a foreign language, to resort to a language other than his own when he needs useful information.” According to al-Abtah, this situation is a waste of energy and a waste of economic opportunities.
Before they can make any real progress, digital experts face several challenges stemming from the weak state of online Arabic content. Al-Abtah pinpoints two major setbacks: the lack of a coordinated Arab technological vision and the lack of research necessary for technological advancement, including networking, data collection, analysis, and utilization of this data. Unlike the European Union, which worked in solidarity towards a technical vision, Arab nations operate individualistically, al-Abtah writes.
Underlying issues with e-commerce and copyright law also make many hesitant to publish works online, including “absent or insufficient legislation on digital copyright and e-commerce,” in the words of Abusamra. Hamdy Soliman Mubarak, a senior software engineer at Qatar Computing Research Institute, suggests that one of the biggest concerns among authors about publishing their content is the widespread failure to observe intellectual property rights in Arab countries.
Arabic also poses linguistic obstacles when transferred into the digital sphere. Because of its unique structure and features, current technology and applications have difficulty processing and sharing Arabic content. Eman Kamel in Al-Fanar Media highlights a few of the various problems: “In Arabic, one ‘root,’ or combination of several consonant sounds in a certain order, can generate many words having different meanings. Also, the shape of the same letter differs depending on its position within the word.” Diacritics (the symbols placed above or below letters) alter the pronunciation, grammatical formulation, and meaning, which “confuses search systems and produces poor search results.” Since Arabic letters do not have upper- and lower-case forms, she says, identifying proper names is difficult.
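To make the diacritics and letter-shape problems concrete, the following is a minimal Python sketch of one common way a search system might normalize Arabic text before comparing it. It is an illustration only, not a method described by Kamel or used by any tool named in this essay; the function name `fold_arabic` and the choice of which marks to remove are assumptions made for the example.

```python
import re
import unicodedata

# Assumed set of characters to drop: the short-vowel and related marks
# U+064B..U+0652, the superscript alef U+0670, and the tatweel
# (elongation stroke) U+0640.
_MARKS = re.compile("[\u064B-\u0652\u0670\u0640]")


def fold_arabic(text: str) -> str:
    """Return a simplified search key for Arabic text (hypothetical helper)."""
    # NFKC maps position-dependent presentation forms (initial, medial,
    # final, isolated) back to the base letter.
    text = unicodedata.normalize("NFKC", text)
    # Remove diacritics so vocalized and unvocalized spellings compare equal.
    return _MARKS.sub("", text)


kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # "kataba" (he wrote), fully vocalized
kutub = "\u0643\u064F\u062A\u064F\u0628"         # "kutub" (books), vocalized
bare = "\u0643\u062A\u0628"                      # the same three letters, no diacritics

print(fold_arabic(kataba) == bare)                # True
print(fold_arabic(kataba) == fold_arabic(kutub))  # True
```

The second comparison shows the trade-off: folding lets a query without diacritics find vocalized text, but it also collapses two different words built on the same root into one key, which is part of why simple matching produces poor results for Arabic.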
Arabic suffers not only from underrepresentation online but also from neglect by its speakers outside of the web. According to poet and journalist Mahmoud Abdel Raziq Jumaa in Al-Fanar Media, “Poor Arabic digital content results from poor education systems that reduce the Arabic language to abstract rules that students study just to pass their exams.” Education systems in the Arab world prioritize foreign languages over Arabic, resulting in Arab graduates with poor Arabic writing skills. This problem hinders the production of Arabic content, as Abusamra notes.
Efforts to increase Arabic language representation have been in motion since the early 2010s. In 2013, the NGO “Arab Thought Foundation” launched a website to collect and analyze Arabic information on the internet. The website separates online texts, audio, and video across 22 Arab countries and 66 non-Arab countries into “informatics units,” updating the database every six months. Some categories include social and humanitarian texts, media and communication, culture and thought, and education and science.
More recently, the Qatar Foundation has taken on several endeavors to increase Arabic content online and develop tools to make the process easier. “Jalees,” an e-reader that supports reading Arabic text, allows users to switch between right-to-left and left-to-right reading orientations for use by teachers and students. It includes interactive games, videos, and simulations within books to stimulate young readers and promote learning. “Ethraa,” a tool for professionally translating digital content into Arabic, is collaborating with Wikimedia and aims to translate medical information to make medical resources more accessible in Arabic. “Farasa,” an open-source text-processing toolkit for Arabic text, may be one of the most promising tools for Arabic’s digital future. It uses artificial intelligence to streamline word segmentation and improve the output quality of machine translation and information retrieval. Because it can be used alongside other tools, such as text-to-speech software, search engines, machine translation, and social media analysis, Farasa will allow smoother conversion of content from other languages into Arabic.
Projects like these are a much-needed step in the right direction but cannot flourish if the Arab world operates without a clear technological vision or unity. In the words of al-Abtah, “While the slogans of Arabism and unity were resounding loudly from the microphones, each country presented its shallow interests...We must all cooperate and toil with our projects to inject the pulse of creativity and innovation into [Arabic’s] veins, or we will continue fighting over the means of communication, throwing accusations and insults, and become extinct like dinosaurs.”
Al Jadid editors contributed Arabic translations to this essay.
Copyright © 2021 by Al Jadid