How to Collect Sources from Syria If You Don't Read Arabic
Information from groups involved in international conflicts has never been as accessible as it is now. Rebel groups run social media accounts, videos of missile launches are posted to Twitter, and the production value of propaganda has skyrocketed.
For researchers and journalists who have not mastered Arabic, the same problem still exists: how does one, with no knowledge of a region’s language, collect accurate and timely information?
While traditional resources like translators and sources remain important to any non-native speaker, a number of tools and strategies exist on the internet for doing some if not most of the work yourself.
This guide will introduce you to some free and easy-to-use tools that should get you started on researching groups and verifying content you find online.
Establishing a Basic Understanding
The Syrian civil war is a complex, convoluted mess of alliances, backers, and enemies. In a matter of days, a group can go from ally to enemy and back to close partner.
As you can see from this detailed chart of groups and factions by Cody Roche, which may already be partly outdated, keeping track of groups is at best extremely difficult.
You will need to familiarize yourself with the basic actors and overall factions, and learn to recognize major areas of interest. This involves reading mainstream sources like the Associated Press or any other major news organization for a broad overview, or simply using a site like Wikipedia.
This will give you not only the initial insight you need to begin researching and identifying a group, but also important details such as city names, province names, and other geographical identifiers.
Once you have established a basic understanding, you’ll be ready to jump into what’s available in English-language resources.
- Live Map: the most up-to-date map of territorial control and news
- Wikipedia page for the Syrian Civil War
- Faction list of government and rebel forces
- Background information on factions and the war compiled by /r/SyrianRebels
- List of rebel and opposition media sources (a little outdated, but includes a large list of media)
- Map of de-escalation zones (territorial control is outdated, but important for context on the Turkish operation in northern Syria)
Basic Tools for Arabic Research
- Arabic keyboard (physical or digital)
- Image editing software (Paint, Photoshop, or open-source GIMP)
- Google Translate
- Optical Character Recognition tool
- Accounts on the most common social media platforms (Telegram, Facebook, YouTube, Twitter)
At this point you have some idea of the conflict. You know the major factions, and you’ve gathered some solid English language sources of media.
Before going over any Arabic-related content, it’s important to have some basic tools to help you with your research.
Google Translate is an obvious one, giving you the ability to translate phrases and names into Arabic when doing searches online.
Optical Character Recognition (OCR) will help you identify text in images that you would otherwise be unable to read and that is not included in the text of the post.
Neither of these tools is guaranteed to be perfect. When the output doesn’t match the original text, you’ll need to correct the errors yourself.
To do so, you’ll need to install an Arabic-language keyboard on your computer or use a physical one. Since this is a guide for people who don’t speak Arabic, I’ve put together the cheat sheet below to help you identify and match any letters that Google Translate or the OCR software was not able to pick up.
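As a small supplement to the cheat sheet (not a replacement for it), you can also ask Python to name each letter in a piece of Arabic text. The sketch below uses only the standard `unicodedata` module; the function name `describe_letters` is made up for this example.

```python
import unicodedata

def describe_letters(text):
    """Return (character, Unicode name) pairs for every non-space character."""
    rows = []
    for ch in text:
        if ch.isspace():
            continue
        # Characters without an assigned name fall back to "UNKNOWN".
        rows.append((ch, unicodedata.name(ch, "UNKNOWN")))
    return rows

# Example: the Arabic spelling of Daraa used later in this guide.
for ch, name in describe_letters("درعا"):
    print(ch, name)
```

Seeing the letter names side by side makes it easier to find the right key on an Arabic keyboard when you need to retype a character by hand.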
The Process
So you’ve come across a post on Liveuamap about a group named al-Binyan al-Marsus in Manshia, Daraa.
Our goal will be to find their Telegram or other social media accounts.
Without knowing much about the group, we can run the English name through Google Translate and get a pretty rough translation: “عمليات البنيان المرسي”. We’ll take this, plug it into Google, and search.
This will bring up the Facebook page of a similarly named Libyan rebel group. This isn’t really useful for our search, but you’ll notice that their name is written as “عمليات البنيان المرصوص”.
Because we can’t expect Google Translate to properly translate every single phrase, double-checking against other search results is a great way of finding alternative or correct spellings of group names.
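If you can’t spot the difference between two spellings by eye, a few lines of Python can point to it. This is only a rough sketch using the standard `difflib` module; the helper names are invented for this example, and the word-by-word comparison assumes both spellings have their words in the same order.

```python
import difflib

def differing_words(candidate, reference):
    """Return (candidate_word, reference_word) pairs that differ at the same position.

    zip() stops at the shorter phrase, so extra trailing words are ignored.
    """
    pairs = zip(candidate.split(), reference.split())
    return [(c, r) for c, r in pairs if c != r]

def similarity(a, b):
    """Return a rough 0..1 similarity score between two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()

machine = "عمليات البنيان المرسي"    # rough Google Translate output
found = "عمليات البنيان المرصوص"     # spelling seen in search results

print(differing_words(machine, found))
print(round(similarity(machine, found), 2))
```

A high score with only one differing word suggests the machine translation was close and that the spelling used by native sources is the better search term.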
We can plug the name into Facebook, Twitter, or YouTube and look for possible matches, meaning anything that matches content from the initial post we found.
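To avoid retyping the same Arabic phrase on every site, you can build the search links in one go. The sketch below is an illustration only: the function name is made up, and the URL patterns are the public search addresses of each site as I understand them, which may change over time.

```python
from urllib.parse import quote_plus

def build_search_urls(*terms):
    """Combine one or more search terms and return a search URL per site."""
    query = " ".join(t.strip() for t in terms if t.strip())
    encoded = quote_plus(query)  # percent-encodes Arabic, turns spaces into "+"
    return {
        # Assumed URL patterns; check them in your browser before relying on them.
        "google": "https://www.google.com/search?q=" + encoded,
        "twitter": "https://twitter.com/search?q=" + encoded,
        "youtube": "https://www.youtube.com/results?search_query=" + encoded,
    }

urls = build_search_urls("عمليات البنيان المرصوص")
for site, url in urls.items():
    print(site, url)
```

Extra terms, such as a place name, can be passed as additional arguments and will be joined into a single query.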
To save time, we can take the translation of Daraa, “درعا”, and search for the two together. Searching for “درعا عمليات البنيان المرصوص” on Twitter leads us to this page:
Combing through their tweets or media releases on the other social media channels you’ve searched, you will find images and videos that include their social media handles.
For the most part, these groups want to be found. If anything, you’ll be in a struggle with YouTube and Facebook as they take down pages and content.
By finding original sources you’ll often find hashtags and names of organizations and conflicts. You can use these to narrow down your search and find groups linked to each other.
Most importantly, in any images you find you should look for the markings of social media sites. Telegram, an encrypted messaging service, is one of the most popular and consistent platforms for groups. If you find a group’s Telegram page, you’ll have a much more stable source of information from these groups.
Due to efforts by some of these platforms to censor or delete pages with graphic content, it’s recommended that you try to find at least two or three pages you can follow them on. Groups, especially larger ones that are consistently targeted on social media, are always sharing their most recent pages.
Using OCR
Optical Character Recognition (OCR) is a powerful tool when you do not have access to anything but an image of the text you are trying to translate.
Because there is no set standard for media releases and the level of professionalism can vary from group to group, not all releases include editable text or even written summaries.
To get around this and save time, you can use an OCR tool. There are many available, so try them out and see which one works best for you.
Using the previous Tweet from Bunyan Marsous, which has matching text in the tweet, we can compare the results.
First, to make it easier for the software to recognize the letters, we’ll crop out all the unnecessary parts of the image.
In New OCR, we’ll upload the image and select the language to detect.
Pressing upload and OCR will give us the following result.
Without knowing any Arabic, we can see that the first and last sentences were good matches. There are some errors, but it’s close.
Now we can start tuning our input. By changing the bounding box around the text, the OCR site gives us this:
There are still some issues, but we now have three of the five lines written down, with some fixable errors.
There could be a variety of reasons why the site is not picking up the third or fourth lines. In this case, the background includes a lot of mixed graphical elements. By editing the image to fill that space with black and feeding it back into the site, we then get:
We now have four lines of something workable. Isolating the sentence that was not being identified at all gives us one more rough transcription:
Comparing the original text (first) with the OCR output (second)…
أَم حَسِبتُم أَن تَدخُلُوا الجَنَّةَ وَلَمّا يَأتِكُم مَثَلُ الَّذينَ خَلَوا مِن قَبلِكُم مَسَّتهُمُ البَأساءُ وَالضَّرّاءُ وَزُلزِلوا حَتّى يَقولَ الرَّسولُ وَالَّذينَ آمَنوا مَعَهُ مَتى نَصرُ اللَّهِ أَلا إِنَّ نَصرَ اللَّهِ قَريبٌ
أم خسـبتم أن تدخنـور الجنة ولمـ بأتكـم مثل الذين خنـوا مـن قبيخـم هشـتهم التأسـاغ والضـزاغ وزلزنوا حتـى تقول السول والذين آقنـوا فغــة فتى نصز الله أال إن نص الله قريت
…shows a flawed but ultimately workable piece of text. In combination with the cheat sheet provided, we can make the appropriate corrections to translate the text or search for its origin. This example text is verse 214 of al-Baqarah, the second surah of the Qur’an.
Most of the time you won’t be working on complicated texts like this, but it shows how powerful OCR can be when you’re working on translation.
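If you want to find the words that need fixing without reading them letter by letter, the comparison can be roughly automated. The sketch below is one possible approach, not the method used by any OCR site: it strips the vowel marks and the stretching character (tatweel) that OCR tools often add or drop, then lists the words that still differ. The function names are invented for this example.

```python
import difflib
import unicodedata

TATWEEL = "\u0640"  # the "stretching" character, purely decorative

def strip_marks(text):
    """Remove Arabic vowel marks (non-spacing marks) and tatweel."""
    return "".join(
        ch for ch in text
        if unicodedata.category(ch) != "Mn" and ch != TATWEEL
    )

def word_mismatches(reference, ocr_output):
    """Return (reference_words, ocr_words) pairs for every stretch that differs."""
    ref_words = strip_marks(reference).split()
    ocr_words = strip_marks(ocr_output).split()
    matcher = difflib.SequenceMatcher(None, ref_words, ocr_words)
    flagged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            flagged.append((ref_words[i1:i2], ocr_words[j1:j2]))
    return flagged

# The last three words of the example above: original first, OCR output second.
reference = "نَصرَ اللَّهِ قَريبٌ"
ocr_output = "نص الله قريت"
for ref_part, ocr_part in word_mismatches(reference, ocr_output):
    print(ref_part, "<->", ocr_part)
```

Each flagged pair is a word to check against the cheat sheet. Words that match after the marks are removed can usually be left alone.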
Keywords and Key Images
The most important thing to remember when searching for groups is to always look for unique keywords or hashtags.
These channels are run by human beings who are doing their best to make their releases easy to find with the least amount of effort.
Finding the hashtags, keywords, or phrases that are repeated throughout releases will therefore often lead you to other groups.
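Once you have copied the text of a handful of posts, a short script can show which hashtags keep coming back. This is a minimal sketch using the standard `re` and `collections` modules; the sample posts are invented placeholders built from names used in this guide, not real releases.

```python
import re
from collections import Counter

# \w matches Arabic letters, digits, and the underscores common in Arabic hashtags.
HASHTAG = re.compile(r"#(\w+)")

def common_hashtags(posts, min_count=2):
    """Return (hashtag, number_of_posts) pairs for tags seen in at least min_count posts."""
    counts = Counter()
    for post in posts:
        # A set counts each hashtag once per post, however often it is repeated.
        counts.update(set(HASHTAG.findall(post)))
    return [(tag, n) for tag, n in counts.most_common() if n >= min_count]

# Placeholder posts for illustration only.
sample_posts = [
    "#درعا #البنيان_المرصوص",
    "#البنيان_المرصوص",
    "#درعا #درعا",
]
for tag, n in common_hashtags(sample_posts):
    print(tag, n)
```

Hashtags that show up across several channels are good search terms for finding related or backup accounts.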
The same idea helps when it comes to images and logos. Keeping note of common imagery and media styles helps to identify allied groups, backup channels, and ideologically related groups.
As you go through these channels, make sure you’re always asking whether what you’re reading is in line with the other channels you’ve seen. Disinformation and fake channels exist and can be very well made.
Limitations and Recommendations
What you’ve gained and learned from the guide will serve as a stepping stone into a larger world of open source Arabic research.
This guide is not a replacement for Arabic interpreters or for learning the language. A basic knowledge of Arabic, even from a short introductory course, can give you an even greater understanding of the content, and I can’t recommend it enough.
But even without a full mastery of the language, this guide should serve as a foundation for better understanding how to research groups in Syria and the region.