DUBAI: Israel’s military is developing an advanced artificial intelligence tool, similar to ChatGPT, by training it on Arabic conversations obtained through the surveillance of Palestinians living under occupation.
These are the findings of a joint investigation by The Guardian, Israeli-Palestinian publication +972 Magazine, and Hebrew-language outlet Local Call.
The tool is being built by the Israeli army’s secretive cyber warfare Unit 8200. The division is training the AI tool to understand colloquial Arabic by feeding it vast quantities of phone calls and text messages between Palestinians, obtained through surveillance.
Three Israeli security sources with knowledge of the matter confirmed the existence of the AI tool to the outlets conducting the investigation.
The model was still undergoing training last year, and it is unclear whether it has been deployed or to what end. However, sources said that the tool’s ability to rapidly process large quantities of surveillance material to “answer questions” about specific individuals would be a huge benefit to the Israeli army.
During the investigation, several sources highlighted that Unit 8200 had used smaller-scale machine learning models in recent years.
One source said: “AI amplifies power; it’s not just about preventing shooting attacks. I can track human rights activists, monitor Palestinian construction in Area C (of the West Bank). I have more tools to know what every person in the West Bank is doing. When you hold so much data, you can direct it toward any purpose you choose.”
An Israel Defense Forces spokesperson declined to respond to The Guardian’s question about the new AI tool, but said the military “deploys various intelligence methods to identify and thwart terrorist activity by hostile organizations in the Middle East.”
Unit 8200’s previous AI tools, such as The Gospel and Lavender, were among those used during the war on Hamas. These tools played a key role in identifying potential targets for strikes and bombardments.
Moreover, for nearly a decade, the unit has used AI to analyze the communications it intercepts and stores, sorting information into categories and learning to recognize patterns and make predictions.
When ChatGPT was released to the public in November 2022, the Israeli army set up a dedicated intelligence team to explore how generative AI could be adapted for military purposes, according to former intelligence officer Chaked Roger Joseph Sayedoff.
However, OpenAI, the company behind ChatGPT, rejected Unit 8200’s request for direct access to its large language model and refused to allow its integration into the unit’s systems.
Sayedoff highlighted another problem: existing language models could only process standard Arabic, not the spoken language in its various dialects, so Unit 8200 needed to develop its own model.
One source said: “There are no transcripts of calls or WhatsApp conversations on the internet. It doesn’t exist in the quantity needed to train such a model.”
In October 2023, Unit 8200 began recruiting experts from private tech companies to serve as reservists. Ori Goshen, co-CEO and co-founder of the Israeli tech company AI21 Labs, confirmed that his employees participated in the project during their reserve duty.
The challenge for Unit 8200 was to “collect all the (spoken Arabic) text the unit has ever had and put it into a centralized place,” a source said, adding that the model’s training data eventually consisted of about 100 billion words.
Another source familiar with the project said the communications analyzed and fed into the model for training included conversations in Lebanese and Palestinian dialects.
Goshen explained the benefits of LLMs for intelligence agencies but added that “these are probabilistic models — you give them a prompt or a question, and they generate something that looks like magic, but often the answer makes no sense.”
Zach Campbell, a senior surveillance researcher at Human Rights Watch, called such AI tools “guessing machines.”
He said: “Ultimately, these guesses can end up being used to incriminate people.”
Campbell and Nadim Nashif, director and founder of the Palestinian digital rights and advocacy group 7amleh, also raised concerns about the collection of data and its use in training the AI tool.
Campbell said: “We are talking about highly personal information, taken from people who are not suspected of any crime, to train a tool that could later help establish suspicion.”
Nashif said: “Palestinians have become subjects in Israel’s laboratory to develop these techniques and weaponize AI, all for the purpose of maintaining (an) apartheid and occupation regime where these technologies are being used to dominate a people, to control their lives.
“This is a grave and continuous violation of Palestinian digital rights, which are human rights.”