Automating Photo Retrieval for Geolocating - Part 1: Panoramio
This post was originally published on AutomatingOSINT.com.
When geolocating a photograph, part of the effort involves looking at photos from various websites (Flickr, Panoramio, Wikimapia) that may lie within the area of interest. We are going to use a previous Bellingcat post (Geolocating Tunisian Jihadists in Raqqah) as a starting point for beginning to automate photo retrieval. Over the course of a few blog posts, I am going to teach you how to build a Python script that can take a geographic area of interest and retrieve photos from various photo sharing sites so that you can quickly cycle through them looking for landmarks. To start with, our script will just create an HTML page with a collage of photos that you can scroll through; if you see a piece of imagery that matches, clicking on it will take you to that location on a Google Map. Even this small exercise will make going through photos much faster than clicking around on a map or having to continually submit manual searches on a site.
The first photo sharing site we will use is Panoramio. It is designed to allow people to upload photos and tag locations for those photos. Panoramio focuses on places not people, so it is a good choice for assisting with geolocating photographs, and in fact the original Bellingcat post used it to begin geolocating one of the buildings in their target video.
Do not be afraid to code. If you can run a spreadsheet, send an email, or send a Tweet, it’s only a small step to start writing code as well. Coding is much like any other skill: focus on the typing to start with, develop that muscle memory, and then begin to work on deeper understanding. Feel free to email or Tweet at me if you run into problems or need further assistance. I am also happy to write additional posts on the basics, just let me know.
Installing Python and Pip
Python 2.7.9 and later include Pip, a simple Python package management system that allows us to quickly and easily add libraries to Python. The fastest way for you to get up and running with Python is to watch the following two videos.
For Windows users:
Installing Python 2.7 on Windows
For Mac users:
This should get you set up with the basics and ready to start writing some Python scripts. You can write Python scripts with any text editor, such as Notepad (Windows) or TextEdit (Mac), but I really recommend that you install a copy of WingIDE (free version here), and you can watch a quick tutorial here.
Panoramio API
Now that we have everything installed, we are going to leverage a Python library called Pynoramio that is specifically designed to interact with the Panoramio API using Python. The first step is to get it installed; if you forget how to do this, see the previous section on how to install and use pip.
C:\Users\Justin> pip install pynoramio
Perfect. Now we can begin to look at how to actually use the API. If you read the documentation it will tell you that the Panoramio API expects a bounding box in order to search for photos.
Bounding Boxes
Think of a bounding box as an imaginary rectangle that covers a geographic area. A lot of APIs use bounding boxes to return geotagged results. In many cases a bounding box is defined by two latitude/longitude pairs: one for the southwest corner of the box and one for the northeast corner. In other cases it is expressed as a minimum/maximum latitude and longitude, which is precisely the same thing: the minimums are the southwest corner and the maximums are the northeast corner.
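To make the idea concrete, here is a small sketch of a bounding box stored as minimum/maximum values, along with a check for whether a point falls inside it. The coordinates are illustrative placeholders (not output from the bounding box tool), and the name in_bounding_box is made up for this example.

```python
# A bounding box stored as min/max values. These numbers are
# illustrative placeholders, not values taken from the tool.
bounding_box = {
    "min_lat": 35.90, "min_lon": 38.95,   # southwest corner
    "max_lat": 36.00, "max_lon": 39.10,   # northeast corner
}


def in_bounding_box(lat, lon, box):
    """Return True if the point lies inside the box (edges included)."""
    return (box["min_lat"] <= lat <= box["max_lat"] and
            box["min_lon"] <= lon <= box["max_lon"])


inside = in_bounding_box(35.95, 39.00, bounding_box)   # point within the box
outside = in_bounding_box(36.50, 39.00, bounding_box)  # point north of the box
print("inside: %s, outside: %s" % (inside, outside))
```

This simple comparison assumes the box does not cross the 180° meridian, which is fine for an area like a single city.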
I have built a tool to assist with this here.
To use the tool simply:
- Enter “Raqqah Syria” without the quotes into the location box.
- Click the Jump to Location button.
- Adjust the zoom level on the map so that you can see all of Raqqah.
- Click the little rectangle drawing button in the top of the map.
- Draw a rectangle around all of Raqqah, resize and move as you see fit.
If you scroll down the page you will see the various bounding box coordinates that we can then copy and paste into our Python script. Speaking of which…
Coding Up Our Panoramio Search
Let’s open up a new Python script in whatever editor you chose (again I strongly suggest using WingIDE here) and let’s enter the following code (you can download the full script from GitHub here):
So let’s take a look at what we have so far. Note that any line beginning with # is a code comment that helps explain what the code does; typing these out is optional:
- Lines 1-3: these are Python imports. The import keyword tells Python that we want to pull in code from these libraries. You can see that we are pulling in the pynoramio library that we used pip to install.
- Line 6: here we are just naming the search we are performing. Feel free to call this whatever you please.
- Lines 9 & 10: here we are pasting in the values from using the bounding box tool. Paste your own values in here.
- Lines 13 & 14: here we are splitting the comma separated values into two separate numbers. This is just to save us time so we aren’t copying and pasting a bunch of values from the bounding box tool.
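As a rough illustration of the splitting step, the sketch below turns two comma separated strings into four numbers. The variable names, the helper split_pair, and the sample values are assumptions made for this example, and it assumes each pair is written latitude first, so check the order the bounding box tool actually shows before pasting.

```python
# Hypothetical values in "latitude,longitude" form; paste your own
# from the bounding box tool instead.
lower_left = "35.90,38.95"
upper_right = "36.00,39.10"


def split_pair(pair):
    """Split a 'lat,lon' string into two floating point numbers."""
    lat, lon = pair.split(",")
    return float(lat), float(lon)


lat_min, lon_min = split_pair(lower_left)
lat_max, lon_max = split_pair(upper_right)

print("%s %s %s %s" % (lat_min, lon_min, lat_max, lon_max))
```

Converting to float right away means a typo in the pasted text fails immediately with a clear error, rather than later during the search.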
Great start so far! Let’s save the file as auto_image_search.py and add some more code to it as shown below.
Ok let’s pick this code apart a little bit:
- Line 20: the def keyword means that we are creating a function in Python. This particular function is going to be for performing Panoramio searches and it takes a single parameter. This parameter, called fd, is a file handle (often loosely called a file descriptor) that lets us write the results of our search to an HTML file. Think of it like having to hold a physical piece of paper on your desk in order to write on it with a pen.
- Line 22: this is where we initialize the Pynoramio API so that we can use it.
- Line 25: here is where we tell the Pynoramio API to search based on the latitude and longitude values that we set using our bounding box.
- Line 28: we are now checking to see if we have any results from our search. If we do, we begin to loop over each search result so that we can add it to our HTML file.
- Line 33: this is where we are looping over each search result.
- Line 36: here we are taking the search result and formatting it into a Google Maps link so that when we load our HTML page we can click on an image from Panoramio and be shown the tagged location for that photo.
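To make the looping and link-building steps concrete, here is a minimal sketch rather than the script's actual function. It leaves out the Pynoramio call entirely and works on a hand-made list of results. The function name write_photo_links, the dictionary keys (photo_file_url, latitude, longitude), and the Google Maps URL pattern are all assumptions chosen for illustration.

```python
import sys


def write_photo_links(fd, photos):
    """Write one clickable image per photo to the open file fd.

    Returns the number of photos written.
    """
    count = 0
    for photo in photos:
        # Build a Google Maps link pointing at the photo's coordinates.
        # This URL pattern is one commonly used form, assumed here.
        maps_url = "https://www.google.com/maps?q=%s,%s" % (
            photo["latitude"], photo["longitude"])

        # Wrap the image in a link that opens in a new window.
        fd.write('<a href="%s" target="_blank"><img src="%s"></a><br/>\n'
                 % (maps_url, photo["photo_file_url"]))
        count += 1
    return count


# A made-up record standing in for a real search result.
sample_photos = [
    {"photo_file_url": "http://example.com/photo1.jpg",
     "latitude": 35.95, "longitude": 39.0},
]

# sys.stdout behaves like an open file, so we can preview the HTML.
written = write_photo_links(sys.stdout, sample_photos)
```

Because the function only needs something it can call write on, you can preview the output on screen first and switch to a real HTML file afterwards.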
Ok! So this was a pretty heavy chunk of code but you made it. I want you to note that where you see indentation in the code, it is important that you indent (using the TAB key on your keyboard) exactly as you see it in the example. Python relies on this indentation to process your code.
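As a small, self-contained illustration of how indentation groups code (the list of file names is made up for this example):

```python
results = ["photo1.jpg", "photo2.jpg"]
shown = []

if len(results) > 0:
    # Indented once: runs only when there is at least one result.
    for name in results:
        # Indented twice: runs once for every result.
        shown.append(name)

# Back at the left margin: runs after the if block either way.
print("%d photos processed" % len(shown))
```

Whichever way your editor indents, be consistent: mixing tabs and spaces within the same file is a common source of confusing errors.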
Let’s add a bit more code and then we’ll be ready to take it for a spin.
Alright! Let’s take a quick look at what we have here:
- Lines 42-43: we want to create a folder for our new HTML file, so here we are just checking to see if we have a folder already, and if we don’t, we use os.mkdir to create a new folder with our search name.
- Line 46: here we are opening our HTML file so that we can write our search results to it. The %s part of the code is just going to take our search_name variable and substitute it in. So in this case we would have already created a new folder “BellingcatRaqqah” and then we are creating a new HTML file located in “BellingcatRaqqah\BellingcatRaqqah.html”.
- Line 49: here we are just creating the first part of the HTML file so that our browser (I use Google Chrome) will open it properly.
- Line 52: this is where the magic happens. We are calling our panoramio_search function that we created previously to perform the actual search and to update the HTML file.
- Lines 55-56: here we just write out the closing HTML tags for our HTML file and close the file.
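The folder and file handling described above can be sketched roughly as follows. This is an illustration under assumptions, not the script itself: the function write_results_page, the placeholder_body stand-in for the search function, and the use of a temporary directory for the demo are all invented for this example.

```python
import os
import tempfile


def write_results_page(base_dir, search_name, write_body):
    """Create <base_dir>/<search_name>/<search_name>.html and fill it in."""
    folder = os.path.join(base_dir, search_name)

    # Only create the folder if it is not already there.
    if not os.path.exists(folder):
        os.mkdir(folder)

    # %s is replaced by the search name to build the file name.
    html_path = os.path.join(folder, "%s.html" % search_name)

    # The with statement closes the file for us, even on errors.
    with open(html_path, "w") as fd:
        fd.write("<html><head></head><body>\n")   # opening tags
        write_body(fd)                            # search results go here
        fd.write("</body></html>\n")              # closing tags
    return html_path


def placeholder_body(fd):
    """Stand-in for the search function; writes one line of HTML."""
    fd.write("<p>search results go here</p>\n")


# Use a throwaway directory so the demo does not touch your Desktop.
demo_dir = tempfile.mkdtemp()
page = write_results_page(demo_dir, "BellingcatRaqqah", placeholder_body)
print(page)
```

Using os.path.join instead of typing the slash by hand keeps the path correct on both Windows and Mac.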
Whew! Alright, now we are ready to run this bad boy. If you saved it to your Desktop you can run it by double-clicking on it, or you can run it from your command line:
C:\Users\Justin\Desktop> python auto_image_search.py
If you ran the script from your Desktop, you should see a new folder called “BellingcatRaqqah” appear there, with an HTML file inside it. If you double-click the HTML file you should be shown a gigantic page with one photo after another. Click on a photo and your browser should open a new window that highlights a location on Google Maps.
Your Next Investigation
So now in the future, you can repeat the steps where you set a bounding box, copy the values into your Python script, set the search_name variable to reflect your new investigation and run the script again. You can see how this exercise would only take a few seconds, and would save you minutes (potentially hours) of clicking around Panoramio.
In our next post we will look at how to expand this script to add Wikimapia photos to our HTML file.