Training an Arabic LLM that reflects local values

The Arab world did not play a key role in the PC, internet and mobile eras. In the AI era, it will be different. (Shutterstock)

Advances in the large language models that underpin generative AI are changing everything, from medicine and education to entertainment.

Our relationship with technology is becoming more intimate as machines change from passive tools into active assistants that amplify our innate human abilities.

This new era poses both a challenge and an opportunity for the Middle East.

The challenge is that the leaders in this new field, like OpenAI’s ChatGPT and Google’s Gemini, come from Silicon Valley, or from China, where my team at 01.AI has built models that rival those of the Americans. In Europe, too, startups such as France’s Mistral have entered the race.

The opportunity is for the Middle East to join this league and make sure its voice is heard.

Inspired by my latest trip to Riyadh, I decided to test how the current crop of AI models would handle a simple request. I imagined myself as a young Saudi getting ready to host a dinner party and asked ChatGPT to prepare a menu.

The food it recommended sounded delicious — stuffed grape leaves, tabouleh salad, mandi and stuffed dates. But the beverages were a problem.

Aside from drinks such as mint lemonade and jallab, a mixture of dates, grape molasses and rose water, ChatGPT also offered this: “For alcoholic beverages, you could offer a selection of international wines, beers, or non-alcoholic mocktails.”

To its credit, when I repeated the question, it offered only non-alcoholic drinks.

If a model recommends breaking both the law and cultural norms, imagine how it might answer other, more sensitive questions about politics or religion. Indeed, researchers have shown that some models exhibit an anti-Muslim bias.

My modest test underlines the urgent need to develop an Arabic large language model that reflects local values.

The first step to building this is creating enough high-quality Arabic digitized data to properly train a new generation of models.

Although there are 400 million Arabic speakers, only an estimated 2 percent of online content is in Arabic. Meta’s open-source LLM Llama is overwhelmingly trained on English data, with Arabic comprising less than 0.1 percent of the data.

The lack of data naturally skews the results. To fix this dearth of data, either a visionary entrepreneur or a government-backed organization should collect, digitize and convert the many Arabic books into training data for Arabic models.

Once the data is gathered, it can be fed into the breakthrough pre-training process, which reads trillions of words and creates its own virtual concept space or model of the world. This concept space has been shown to be mostly in English and Chinese.

Adding a sizable number of texts in Arabic, which has enormous cultural output and significance, will make the concept space more knowledgeable about Arabic and more balanced in its concepts and views.

After such pre-training, the model needs to be fine-tuned with data and labels from the Arab world, which will align it with the values of the region. That sets it apart from American models, which are aligned to US values, and Chinese models, which reflect Chinese values.

The collection of alignment data, the coordination of human labeling and the alignment process will need to be done in-region by AI experts.

Finally, safety modules will need to be added to ensure legal compliance and to avoid harm. These will also need to be developed locally.

The above steps will create localized, sovereign models that reflect the traditions of the Middle East. Privately developed or government-backed, they could be the foundation for a new wave of Arabic AI innovation.

A new Arabic-enhanced large language model could encourage entrepreneurs and developers to build new applications tailored to the needs of their nations.

Imagine an AI tool that could find, summarize, organize and write insightful content, an AI teacher that makes learning fun and customized, an AI doctor that is more knowledgeable than any human, an AI engineer that can write software and applications, and an AI assistant that knows its owner better than the owner themselves.

The Arab world did not play a leading role in the PC, internet and mobile eras. In the AI era, it will be different.

This transformation is by no means an easy feat. It will require an unprecedented investment of money, energy and human capital.

Middle Eastern leaders like Saudi Crown Prince Mohammed bin Salman and others have shown that they have the vision, determination and resources to lead their countries into the future.

Standing on my hotel balcony in Jeddah recently, overlooking the King Abdullah University of Science and Technology, I saw part of that vision coming to fruition.

Universities such as KAUST and the Mohamed bin Zayed University of Artificial Intelligence in the UAE are striking examples of the resources that have already been poured into this transformation.

These world-class academic institutions can attract and retain top-tier global talent. It is especially important to bring in the world’s best computer engineers to help fulfill this vision of the future of AI.

Our team at 01.AI has shown what a group of talented and motivated computer scientists can achieve in just one year. With the right commitment of resources and drawing upon the best talent, countries like Saudi Arabia can easily catch up with their global peers.

The Middle East can also lead the world in the use of renewables to run power-hungry generative AI models.

As it seeks to diversify its economy, Saudi Arabia is actively promoting the use of alternative energy sources such as solar, which could power server farms and reduce their carbon footprint — a growing concern as AI becomes more widespread.

It may take time for countries to figure out their strategy for building a sovereign AI. But it is critical for the Arab world to quickly catalyze the creation of culturally appropriate LLMs and build a rich ecosystem to allow AI-powered Arabic apps to blossom.

A recent encounter with a female sales assistant at a computer store in Riyadh served as an apt reminder of what is at stake. Dressed in jeans and sporting a tattoo, she was a reminder of the transformative changes that the country is undergoing.

Where are you from, I asked. “I’m Saudi,” she said. “One day I want to be Saudi Arabia’s Elon Musk.” I hope on my next visit she will pitch me a homegrown AI app.

Kai-Fu Lee is a computer scientist, CEO of 01.AI, chairman of Sinovation Ventures, former president of Google China, and author of “AI 2041” and “AI Superpowers”
 

Disclaimer: Views expressed by writers in this section are their own and do not necessarily reflect Arab News' point of view

Glenn Phillips ton lifts New Zealand to 330-6 against Pakistan in tri-series

  • Phillips was ably supported by Daryl Mitchell with 81 and Kane Williamson with 58 runs
  • Pakistan’s Shaheen Shah Afridi ended up with expensive figures of 3-88 from his 10 overs

LAHORE: Glenn Phillips cracked a maiden century to lift New Zealand to 330-6 against Pakistan in the tri-series opener in Lahore on Saturday.
Phillips hit 106 not out from 74 balls, with seven sixes and six boundaries, after New Zealand won the toss and batted.
He was ably supported by Daryl Mitchell with 81 and Kane Williamson (58).
Phillips added a quickfire 54 off just 47 balls with Michael Bracewell for the sixth wicket. Bracewell scored 31 from 23 balls, with three sixes.

New Zealand’s Glenn Phillips (R) is congratulated by Mitchell Santner after scoring a century during the tri-series ODI cricket match between Pakistan and New Zealand at Qaddafi Stadium in Lahore on February 8, 2025. (AP)

New Zealand plundered 123 runs in the last 10 overs, including 84 from the final five.
Phillips smashed a boundary and two sixes off pace bowler Shaheen Shah Afridi to reach his hundred off 72 balls, taking 25 in the 50th over.

Pakistan’s Mohammad Rizwan walks off the field as New Zealand’s players celebrate after his dismissal during the tri-series ODI cricket match between Pakistan and New Zealand at Qaddafi Stadium in Lahore on February 8, 2025. (AP)

Shaheen ended up with expensive figures of 3-88 from his 10 overs, although he gave Pakistan an early breakthrough by removing opener Will Young for four with the fourth ball of the match.
Spinner Abrar Ahmed had opener Rachin Ravindra caught and bowled for 25 but Williamson and Mitchell then added 95 off 112 balls to rebuild the innings.

Pakistan’s Babar Azam (R) and Fakhar Zaman run between the wickets during the tri-series ODI cricket match between Pakistan and New Zealand at Qaddafi Stadium in Lahore on February 8, 2025. (AP)

Williamson, playing his first one-day international since November 2023, hit seven boundaries in his 46th half century before edging Shaheen to wicketkeeper Mohammad Rizwan.
Mitchell appeared well set for a hundred but miscued a shot off Abrar in the 38th over to be caught after hitting four sixes and two boundaries.
Pakistan was hit hard when pace bowler Haris Rauf walked off in the 37th over after suffering a side strain, having bowled 6.2 overs that included the wicket of Tom Latham for nought.


Senior UN official slams inadequate global support for Pakistan’s climate efforts

  • Mohamed Yahya urges polluting countries to show ‘stronger solidarity’ to rebuild destroyed homes in Pakistan
  • The country faced devastating floods in 2022 that killed 1,739 people, resulting in $14.9 billion in damages

ISLAMABAD: United Nations Resident and Humanitarian Coordinator Mohamed Yahya criticized the lack of global support for Pakistan in combating climate change this week, urging “stronger solidarity” with the South Asian nation to aid in the reconstruction of homes following the floods over two years ago.
In 2022, floods inundated one-third of Pakistan, hitting the southeastern Sindh and southwestern Balochistan provinces especially hard. They affected 33 million people, caused 1,739 deaths and resulted in $14.9 billion (Rs4.1 trillion) in damage and $15.2 billion (Rs4.2 trillion) in economic losses, according to Pakistan’s National Disaster Management Authority.
The Global Climate Risk Index says Pakistan is among the countries most at risk from climate change. Extreme weather events like floods, droughts, cyclones, torrential rainstorms and heatwaves have been occurring more frequently and with greater intensity across Pakistan in recent years.
“One other thing we are concerned about is the lack of stronger solidarity for Pakistan around the reconstruction after the 2022 floods,” Yahya told Arab News on the sidelines of the Breathe Pakistan Climate Conference in Islamabad on Friday.
He noted this was despite the fact that “Pakistan contributes even less than one percent of global emission and is in the top five countries impacted by climate change.”
Yahya described it as “unjust” for Pakistan to be asked to take loans for rebuilding homes destroyed in floods and mitigating a crisis caused by other countries, noting that 20 countries were responsible for 80 percent of global emissions.
According to the UN, the 20 countries contributing the most to global greenhouse gas emissions include China, the United States, India, Russia, Japan, Germany and Iran.
“We obviously welcome the loans Pakistan has received but Pakistan should not be using or taking loans to rebuild things that it had very little to do with and that we think is not just,” he added.
The UN official maintained the world body consistently urged polluting countries, which have contributed to the climate change disaster, to do more and show solidarity and support to the countries bearing the brunt of the climate change impact.
International donors in January 2023 committed over $9 billion (Rs2.5 trillion) to help Pakistan recover from ruinous floods a year earlier, exceeding its external financing goals.
Officials from some 40 countries as well as private donors and international financial institutions gathered at a meeting in Geneva as Islamabad sought funds to cover around half of a recovery bill amounting to $16.3 billion (Rs4.5 trillion).
Prime Minister Shehbaz Sharif also called this week for grants-based and flexible financial assistance to build climate resilience in developing nations like Pakistan.
He told the Breathe Pakistan Climate Conference that without global empathy and support, “the path to climate adaptation and green transformation will remain elusive.”


Pakistan’s Imran Khan writes another letter to army chief as party stages protest

  • The opposition party’s ‘Black Day’ protest is to mark the first anniversary of last year’s election
  • The ex-PM warns in his letter of a rift between the army and the people due to crackdown on PTI

KARACHI: Pakistan’s jailed former Prime Minister Imran Khan said on Saturday he has written another open letter to Chief of Army Staff General Asim Munir, complaining about the allegedly shrinking democratic space in the country since what he called “pre-poll rigging” in last year’s general elections, as his party marks a “Black Day” on the first anniversary of the electoral contest.
The letter is Khan’s second to the country’s powerful army chief this month. In the previous one, he had called for a reevaluation of current political policies while alleging that his party, Pakistan Tehreek-e-Insaf (PTI), was being targeted by the state.
Khan’s PTI and another opposition faction, Jamaat-e-Islami (JI), decided to stage protests today on the first anniversary of the last general elections. The PTI initially planned to hold a rally in Lahore but, after being denied permission by the local administration, relocated it to Swabi in Khyber Pakhtunkhwa, where the party is in power.
As protests continued in different cities, Khan warned in his letter of a widening rift between the army and the people.
“Using agencies for pre-poll rigging and manipulating election results to establish an orderly government, forcing a constitutional amendment through parliament under duress to subjugate the judiciary, recruiting handpicked judges, enforcing draconian laws like PECA [Prevention of Electronic Crimes Act] to suppress dissent, and involving state institutions in political engineering rather than their constitutional duties is not only hurting public sentiment but also deepening the divide between the people and the army,” he wrote.
“The army is a crucial institution of the country, but a few black sheep within it are harming the entire institution,” he added.
Khan also criticized state policies, saying that “Internet censorship and social media restrictions” were creating problems for the country.
He blamed “a handful of individuals” for undermining the public mandate, leading to economic instability that has pushed investors and skilled professionals to leave Pakistan.
“Economic instability is at its peak,” he said. “The growth rate is at zero, and investment in Pakistan is nearly nonexistent. Poverty and unemployment are soaring.”

Pakistan police stand guard near a red zone in Karachi on February 8, 2025, as opposition parties protest to mark anniversary of Pakistan national polls, which they say were rigged to benefit their opponents. (AN Photo)

Khan also accused the authorities of damaging the military’s reputation among the public, arguing that national security depended on a strong bond between the people and the armed forces.
“Our soldiers are sacrificing their lives for Pakistan,” he continued. “To succeed in the fight against terrorism, the nation must stand behind the army. But the establishment’s policies and illegal actions have only worsened the army’s reputation among the people.”
There has been no official response from the army or the government to Khan’s letter yet.
Meanwhile, in Karachi, a PTI protest at the Press Club failed to draw large crowds, with party leaders blaming heavy security restrictions.
“How can anyone come to the protest?” asked Khair-un-Nisa, PTI’s Women District Manager in Karachi. “All the roads leading to [the protest venue] have been blocked. Troops have been deployed. They have started the arrests. What kind of law is this?”
Another PTI office bearer described the situation as “very unfortunate.”
“Freedom of association is a basic and fundamental right ensured by the Constitution of Pakistan,” said Advocate Maqsood Alam, Vice President of PTI’s Karachi Division. “But look here. You can see that the people of Pakistan, the citizens of Pakistan, cannot raise their voice independently. They cannot protest according to the constitution.”
Arrests of Opposition Workers
Earlier, police arrested multiple opposition members ahead of planned protests by PTI and JI to observe February 8 as a “Black Day” to highlight alleged election irregularities.
Pakistan’s general election was marred by a mobile Internet shutdown and unusually delayed results. The elections resulted in a hung National Assembly, followed by weeks of opposition protests alleging vote fraud. The caretaker government and the Election Commission of Pakistan (ECP) have denied the charges, but the US House of Representatives and several European countries have called for an independent probe, an initiative Pakistan has so far rejected.
PTI candidates contested the elections as independents after the party was barred from running under its symbol. While they won the most seats, they fell short of a majority, allowing a coalition of rival parties, led by Prime Minister Shehbaz Sharif, to form the government.


Sudan army says retakes key district in Khartoum North

Updated 08 February 2025
  • Military spokesman Nabil Abdullah said that army forces, alongside allied units, had “completed on Friday the clearing of” Kafouri and other areas in Sharq El Nil
  • The army has in recent weeks surged through Bahri pushing the paramilitaries to the outskirts

PORT SUDAN: Sudan’s military said Saturday that it had regained control of a key district in greater Khartoum as it presses its advance against the paramilitary Rapid Support Forces (RSF).
The district of Kafouri in Khartoum North, or Bahri, had been under RSF control since war between the army and the paramilitaries began in April 2023.
In a statement, military spokesman Nabil Abdullah said that army forces, alongside allied units, had “completed on Friday the clearing of” Kafouri and other areas in Sharq El Nil, 15 kilometers to the east, of what he described as “remnants of the Dagalo terrorist militias.”
The army has in recent weeks surged through Bahri — an RSF stronghold since the start of the war — pushing the paramilitaries to the outskirts.
The Kafouri district, one of Khartoum’s wealthiest neighborhoods, had served as a key base for RSF leaders.
Among the properties in the area was the residence of Abdel Rahim Dagalo, the brother of RSF leader Mohamed Hamdan Dagalo and his deputy in the paramilitary group.
The recapture of Kafouri further weakens the RSF’s hold in the capital and signals the army’s continued advance to retake full control of Khartoum North, which is home to one million people.
Khartoum North, Omdurman across the Nile River, and the city center to the south make up greater Khartoum.
On Thursday, a military source told AFP that the army was advancing toward the center of Khartoum, nearly two years after the city fell to the RSF at the start of the war.
Eyewitnesses in southern Khartoum reported hearing explosions and clashes coming from central Khartoum Saturday morning.
The developments mark one of the army’s most significant offensives since the war broke out between army chief Abdel Fattah Al-Burhan and his erstwhile ally Dagalo’s RSF, which quickly seized much of Khartoum and other strategic areas.
The conflict has devastated the country, displacing more than 12 million people and plunging Sudan into the “biggest humanitarian crisis ever recorded,” according to the International Rescue Committee.


War-torn Lebanon forms its first government in over 2 years

  • Salam’s cabinet of 24 ministers, split evenly between Christian and Muslim sects, was formed less than a month after he was appointed
  • Lebanon is also still in the throes of a crippling economic crisis, now in its sixth year

BEIRUT: Lebanon’s new prime minister on Saturday formed the country’s first full-fledged government since 2022.
President Joseph Aoun announced in a statement that he had accepted the resignation of the former caretaker government and signed a decree with new Prime Minister Nawaf Salam forming the new government.
Salam’s cabinet of 24 ministers, split evenly between Christian and Muslim sects, was formed less than a month after he was appointed. It comes at a time when Lebanon is scrambling to rebuild its battered southern region and maintain security along its southern border after a devastating war between Israel and the Hezbollah militant group. A US-brokered ceasefire deal ended the war in November.
Lebanon is also still in the throes of a crippling economic crisis, now in its sixth year, which has battered its banks, destroyed its state electricity sector and left many in poverty unable to access their savings.
Salam, a diplomat and former president of the International Court of Justice, has vowed to reform Lebanon’s judiciary and battered economy and bring about stability in the troubled country, which has faced numerous economic, political, and security crises for decades.
Though Hezbollah did not endorse Salam as prime minister, the Lebanese group did engage in negotiations with the new prime minister over the Shiite Muslim seats in government, as per Lebanon’s power-sharing system.
Lebanon’s new authorities also mark a shift away from leaders that are close to Hezbollah, as Beirut hopes to continue improving ties with Saudi Arabia and other Gulf nations that have been concerned by Hezbollah’s growing political and military power over the past decade.
In early January, former army chief Aoun was elected president, ending the vacancy in that position. He, too, was a candidate not endorsed by Hezbollah and its key allies.
Aoun has shared similar sentiments to Salam, also vowing to consolidate the state’s right to “monopolize the carrying of weapons,” in an apparent reference to the arms of Hezbollah.

Salam’s 24-member cabinet includes Deputy Prime Minister Tarek Mitri; Defense Minister Michel Menassa; Minister of Interior and Municipalities Ahmad Al-Hajjar; Minister of Foreign Affairs and Emigrants Youssef Rajji; Minister of Telecommunications Charles Hajj; Minister of Energy and Water Joseph Saddi; Minister of Justice Adel Nassar; Minister of Finance Yassine Jaber; Minister of Public Health Rakan Nasser Eldine; Minister of Culture Ghassan Salameh; Minister of Industry Joe Issa Al-Khoury; Minister of Economy and Trade Amer Al-Bisat; Minister of Agriculture Nizar Hani; Minister of Information Dr Paul Morcos; Minister of Social Affairs Haneen Sayed; Minister of Youth and Sports Nora Bayrakdarian; Minister of Tourism Laura El-Khazen Lahoud; Minister of Education Rima Karami; Minister of Environment Tamara El-Zein; Minister of Public Works and Transport Fayez Rasamny; Minister of Displacement and State Minister for Technology Affairs and AI Kamal Shehadeh; Minister of Labor Mohamed Haider; and Minister of Administrative Development Fadi Makki.