This article was originally posted to the AutomatingOSINT.com blog.
More and more investigations are being conducted on Tor and many of them can also include investigating Bitcoin transactions. The nature of Bitcoin is such that the transactions themselves are designed to be anonymous but there are many other factors that can dictate whether the owner of a Bitcoin wallet is protecting their identity correctly. By performing secondary searches for Bitcoin addresses you can typically find relationships or interesting trails of information that you can follow.
In this blog post we are going to develop a tool where we can target a particular Bitcoin address, visualize the transactions flowing in and out of it and then perform secondary dark web searches using Webhose.io to see if we can find hidden services where the bitcoin wallets have been mentioned. We will of course visualize all of this so that we can explore the data when we are finished.
Let’s get started!
Prerequisites
First make sure you have Python installed, and then install the requests library: pip install requests. If you are unsure of how to do this then check out the videos on this page under the Setup section.
Next head to Webhose.io and request an API access key.
You will also require a few Python libraries to be installed. So do the following:
Now let’s get coding!
Coding It Up
Crack open a new Python script, name it bitcoindarkweb.py (you can download the full source here) and start hammering out the following code:
All of this is pretty straightforward but we’ll talk about a few items of importance first:
- Line 5: this is the access token you need from Webhose.io in order for the dark web searches to work.
- Line 7: this is a blacklist of Tor hidden services that we will not include in our searches or our graph. The reason we have this list is because there are various Bitcoin hidden services that might list piles of addresses, and we aren’t really interested in having them come back as results. Feel free to add to this list as well.
The rest of the code deals with setting up some placeholder variables, and setting up the commandline parsing for the script. Now let’s continue adding code:
- Line 29: we setup our get_all_transactions function to take in a Bitcoin address. This function will be responsible for downloading all transactions for this address which we will use later.
- Line 35: here we are building up a BlockExplorer URL in the format that they require in order for us to download chunks of transactions from their service.
- Lines 37-47: we send off the request to BlockExplorer.com (37) and then attempt to parse the JSON result (40). If the parsing fails we ouput a message (42) and return an empty list of transactions (43). If the parsing was successful but there are no transactions for that address (45) we output a message (46) and return an empty list (47).
- Line 49: we take the list of transactions and add them to our main transaction list.
- Lines 51-62: we setup a loop to continue grabbing transactions in blocks of fifty at a time (51) and then we continually increase the from (53) and to (54) variables to “page” through the transactions. We pass these variables along to the BlockExplorer API (56) and then add the results to our main transactions list (62) before continuing to the top of the loop again.
- Line 66: now that we are done collecting all of the transactions, we return the transaction list.
Ok cool, we now have a method to collect all transactions flowing into and out of a particular Bitcoin address. Now let’s create a function that can loop through all of the Bitcoin transactions and extract all of the unique Bitcoin addresses. This will give us a unique list of Bitcoin addresses to search Webhose.io for later on. Punch in the following code:
- Line 72: we setup the function to take in our list of transactions that we previously have retrieved.
- Lines 79-80: we check the sending address to see if it is an address we haven’t seen before and if it is new, we add it to our bitcoin_addresses list (80).
- Lines 83-91: we loop through the receiving side of the transaction (83), and then walk through all of the addresses that are contained (87). For each address we encounter we check to see if it is in our list already (89) and if it is a new address we add it to our bitcoin_addresses (91) list.
- Lines 93-95: once we have retrieved all of the unique addresses that are related to our target address, we return this list so that we can use it in our Webhose.io searches.
Cool at this point we have all of the Bitcoin transactions for an address, and we have nailed down all of the unique addresses that have sent or received Bitcoin from that address. Now let’s implement the Webhose.io searching so we can pull on some additional intelligence threads to determine potential hidden services where these addresses have been mentioned. Continue adding code to your script:
- Line 101: we define our search_webhose function to take in the list of Bitcoin addresses that we want to search for.
- Lines 106-115: we begin looping over our list of Bitcoin addresses (106) and then add it to the Webhose search URL (111), before sending off the request (113) and processing the JSON response (115).
- Lines 118-128: we setup a loop to download all results (118) and then start walking through each search result (121). We test where we already have the current Bitcoin address in our bitcoin_to_hidden_services dictionary (123) and if not we add it and initialize an empty list (124). We then do a secondary check to make sure that the current hidden service result is not in the list associated with the current Bitcoin address (126) and if not then we add it (128). This whole dance is so that we can associate a Bitcoin address to a list of unique hidden services.
- Lines 130-131: if we have less than 10 results (130) we break out of the loop (131) and return to the top of the function to check the next Bitcoin address in the list.
- Lines 135-142: we want to now continue grabbing additional search results from Webhose, and how we do this is by using our original query, and then adding some blacklisted sites that we do not was results for. This will also lower the total number of API calls required to get all of our results. We do this by looping over the discovered hidden services (137) and build a negative query for each of them (138). We then loop over the blacklist we have at the top of the script (141) and also add those hidden services to our negative query (142).
- Lines 144-148: we send off the new query (144) and then parse the JSON response (148) before returning to the top of the loop on line 118 to continue processing newly discovered hidden services.
Oof! That’s a lot of code. At this point in our script we can get all Bitcoin transactions, all unique Bitcoin addresses related to those transactions and any mentions in the dark web that Webhose.io has stored. Now what we need to do is tie this all together into a tidy visualization that will allow us to review these results easily. Time to add our magical graphing function, so keep adding code to the script:
- Line 160: we define our build_graph function to take in the source Bitcoin address, the list of Bitcoin transactions and our hidden services dictionary.
- Line 162: we create a NetworkX graph object. Note that this is a DiGraph or directional graph which will show us directions for the Bitcoin transactions (in and out).
- Lines 165-173: we loop through the list of transactions (165), and pluck out the sender address (168). If we detect the sender is our target Bitcoin address (170) we add a node to the graph with the special type attribute set to “Target Bitcoin Wallet” (171). Otherwise we just set the attribute to “Bitcoin Address”. The reason we have the different attributes is so that we can color the dots on the graph in Gephi based on these attributes.
- Lines 177-187: now walk through the receiving side of the transaction (177) and then walk through each of the addresses that are found within the transaction (180). We do the same check for the target Bitcoin wallet (182) and apply the same attributes as we did previously. The last thing we do is connect the sender to the receiver on the graph by adding an edge (a line) between them. Notice that the sender is first in the add_edge function which will draw the arrow out to the receiver address when we visualize the graph.
- Lines 189-195: now we need to add our hidden services that were discovered. We do this by walking through the dictionary (189) and then walking through each hidden service in stored in with each dictionary key (191). If the hidden service is not in the blacklist (193) we add a new node to the graph with the type attribute set to “Hidden Service” (194) and then add an edge between the associated Bitcoin address and the hidden service (195).
- Line 199: the last thing we do is output the entire graph object to a GEXF file that we can open in Gephi.
Sweet! Now the last step is to tie together all of these functions so that we can take the whole thing for a test drive! We are almost finished, just a little more code to go:
- Line 207: we first call our get_all_transactions function to retrieve our transaction list for the specified target Bitcoin wallet.
- Lines 209-212: if we have a valid list of transactions (209) we then get all of the unique Bitcoin addresses from the transaction list (212).
- Line 216: we leverage Webhose.io to try to find any hidden services that have references to our discovered Bitcoin addresses.
- Line 219: we build the graph so that we can visually explore our results!
Nice work! If you have your Webhose.io token setup and a target Bitcoin address, you should be ready to take this script for a spin.
Let It Rip
Now give your script a run, in my case I used the following. You can download the resulting GEXF file here.
[*] Retrieving all transactions from the blockchain for 19m9yEChBSPuzCzEMmg1dNbPvdLdWA59rS
[*] Retrieved 594 bitcoin transactions.
[*] Identified 1998 unique bitcoin addresses.
[*] Searching 1 of 1998 bitcoin addresses.
[*] Searching 2 of 1998 bitcoin addresses.
[*] Searching 3 of 1998 bitcoin addresses.
[*] Discovered 2 hidden services connected to 19m9yEChBSPuzCzEMmg1dNbPvdLdWA59rS
[*] Searching 4 of 1998 bitcoin addresses.
…
[*] Searching 1997 of 1998 bitcoin addresses.
[*] Searching 1998 of 1998 bitcoin addresses.
[*] Done! Open the graph file and happy hunting!
Cool, now you can open up the graph in Gephi (I cover Gephi usage here) and begin exploring some of the transactions and hidden services that you have discovered. Another technique you could do to enhance your results would be to include normal web searches through Searx to find additional results. I have a supporting post here that can help!