SpiderFoot and Tor


In this post I will explain a little about the Tor service, how SpiderFoot integrates with it, and how you can use this capability to protect your anonymity during scans and improve your results.

Introduction

One of the most interesting things about performing reconnaissance/footprinting is the wealth of information that starts to flow in about your target from the many and varied data sources on the internet. Performing reconnaissance manually is time-consuming and often tedious, but there are also challenges with automating it.

This is where Tor becomes very useful, and so in this post I will explain a little bit about Tor, how SpiderFoot integrates with it, and how you can use this new capability in SpiderFoot 2.5.0 to improve your reconnaissance results.

What is Tor?

Taken from the Tor website, the Tor network is:

… a group of volunteer-operated servers that allows people to improve their privacy and security on the Internet. Tor’s users employ this network by connecting through a series of virtual tunnels rather than making a direct connection, thus allowing both organizations and individuals to share information over public networks without compromising their privacy…

Using the Tor network is as simple as installing the client software, which basically acts as a SOCKS-compatible proxy, and proxying TCP connections from your SOCKS-capable client (e.g. a web browser) through it. The target server you are connecting to does not see your IP address, but instead the IP address of the Tor “exit node” your connection is routed through after having hopped through other nodes within the Tor network (even the Tor exit node doesn’t know your IP address). That set of hops through the Tor network is known as your “Tor circuit” and, by default, is automatically changed about every ten minutes.
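To make "proxying a TCP connection through a SOCKS proxy" a little more concrete, here is a minimal sketch of the SOCKS5 handshake using only the Python standard library. It assumes Tor's SOCKS listener is on 127.0.0.1:9050 (the port used later in this post) and that no SOCKS authentication is required. The function names are my own, and in practice you would normally let a SOCKS library such as PySocks do this for you.

```python
import socket
import struct


def build_connect_request(host, port):
    """Build a SOCKS5 CONNECT request that passes the hostname to the proxy."""
    name = host.encode("idna")
    if not 0 < len(name) <= 255:
        raise ValueError("hostname must be 1-255 bytes long")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name), length, name, port
    return b"\x05\x01\x00\x03" + bytes([len(name)]) + name + struct.pack(">H", port)


def recv_exact(sock, count):
    """Read exactly `count` bytes or raise if the proxy hangs up."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("proxy closed the connection")
        data += chunk
    return data


def open_via_socks5(host, port, proxy_host="127.0.0.1", proxy_port=9050, timeout=30):
    """Return a socket connected to host:port through a SOCKS5 proxy.

    The proxy address defaults are assumptions matching this post's Tor setup.
    """
    sock = socket.create_connection((proxy_host, proxy_port), timeout=timeout)
    try:
        # Greeting: version 5, one auth method offered, 0x00 = "no authentication"
        sock.sendall(b"\x05\x01\x00")
        if recv_exact(sock, 2) != b"\x05\x00":
            raise ConnectionError("proxy did not accept the no-auth method")
        sock.sendall(build_connect_request(host, port))
        version, reply, _reserved, addr_type = recv_exact(sock, 4)
        if version != 5 or reply != 0:
            raise ConnectionError("SOCKS5 connect failed, reply code %d" % reply)
        # Discard the bound address and port that follow the reply header
        if addr_type == 1:
            recv_exact(sock, 4 + 2)
        elif addr_type == 4:
            recv_exact(sock, 16 + 2)
        elif addr_type == 3:
            recv_exact(sock, recv_exact(sock, 1)[0] + 2)
        else:
            raise ConnectionError("unexpected address type in SOCKS5 reply")
        return sock
    except Exception:
        sock.close()
        raise

# Example (needs a running Tor client):
#   conn = open_via_socks5("example.com", 80)
```

Because the request carries the hostname rather than an IP address, name resolution for that connection is left to the proxy instead of happening locally.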

Depending on your Tor configuration, the Tor client will listen on two ports - one for proxying connectivity and the other for accepting control commands. The control port is used by SpiderFoot to request refreshing the circuit (what I refer to as “re-circuiting”).
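As an illustration of what a "re-circuit" request looks like on the control port, the sketch below speaks the Tor control protocol directly over a socket: it authenticates, then sends SIGNAL NEWNYM. This is not SpiderFoot's code. It assumes the control port is 127.0.0.1:9051 and that an empty or simple password is accepted; if your Tor uses cookie authentication, this sketch will not work as written. The function names are hypothetical.

```python
import socket


def read_reply(stream):
    """Read one control-port reply and return its three-digit status code."""
    while True:
        line = stream.readline()
        if not line:
            raise ConnectionError("control connection closed unexpectedly")
        text = line.decode("ascii", "replace").rstrip("\r\n")
        # The final line of a reply has a space after the code, e.g. "250 OK";
        # continuation lines use "-" or "+" in that position instead.
        if len(text) == 3 or (len(text) > 3 and text[3] == " "):
            return text[:3]


def request_new_circuit(host="127.0.0.1", port=9051, password=""):
    """Ask Tor to use a fresh circuit for subsequent connections.

    Assumes password-style authentication; the password must not contain
    double quotes or backslashes, since no escaping is done here.
    """
    commands = ('AUTHENTICATE "%s"' % password, "SIGNAL NEWNYM")
    with socket.create_connection((host, port), timeout=10) as sock:
        with sock.makefile("rwb") as stream:
            for command in commands:
                stream.write(command.encode("ascii") + b"\r\n")
                stream.flush()
                code = read_reply(stream)
                if code != "250":
                    raise RuntimeError("Tor rejected %r with status %s"
                                       % (command.split()[0], code))

# Example (needs a running Tor client with the control port enabled):
#   request_new_circuit()
```

A status of 250 means Tor accepted the command; anything else is treated as a failure here.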

Running Tor

A wealth of information about installing and configuring Tor is available on the web, but for your purposes with SpiderFoot, you simply need to download and run the Tor client and enable control connections so that SpiderFoot can control it.

NOTE: The Tor Project offers a “Tor Browser”, which is NOT what you need for SpiderFoot - you need the Tor “stand-alone” client.

For Linux/BSD

  1. Go to the Tor download page and download the package for your platform.
  2. Compile/Install the package as per the instructions provided.
  3. Run Tor as follows:
tor --SocksPort 9050 --ControlPort 9051

Output from the process should indicate any errors and general status updates, but a message like this would indicate you are successfully set up:

[notice] Tor has successfully opened a circuit. Looks like client functionality is working.

For Windows

  1. Go to the Tor download page, click Windows and then download the Expert Bundle. Do not download the “Tor Browser”!
  2. Unzip the package to a directory of your choice, open the Windows command line and change to the unzipped package “Tor” directory.
  3. Run Tor as follows:
tor --SocksPort 9050 --ControlPort 9051

Check that the Tor process is running using Task Manager. The netstat command should then show it listening on both ports:

C:\Tor>netstat /na | findstr 905
  TCP    127.0.0.1:9050         0.0.0.0:0              LISTENING
  TCP    127.0.0.1:9051         0.0.0.0:0              LISTENING
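If you would rather check from a script, or want the same check on any platform, a small Python sketch like the following tries to open a TCP connection to each port. The helper name is my own, and the ports are the ones passed to Tor on the command line above.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts
        return False

# Example:
#   for port in (9050, 9051):
#       print(port, "listening" if is_listening("127.0.0.1", port) else "not listening")
```

This only shows that some process is accepting connections on the port, not that the process is Tor.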

Configuring SpiderFoot for Tor

To enable Tor in SpiderFoot, go to the Settings menu, and then the Global tab. Scroll down and you will see the following options:

Here is each option explained:

Once you have changed these settings, click the Save Changes button and you are ready to run your scan through Tor. Check the SpiderFoot logs for your scan to see any errors that might be related to Tor.

Warning: One very critical caveat is that Tor only carries TCP traffic and explicitly does not support UDP, so any DNS look-ups performed directly by SpiderFoot’s sfp_dns module bypass Tor and go straight to your configured DNS server.

How does it work?

Various SpiderFoot modules have mechanisms to detect when CAPTCHAs or other blocking measures are being used by intelligence sources (e.g. Google, Bing, etc.). When SpiderFoot detects this, it uses the Stem library to send a NEWNYM command to Tor. Tor then sets up a new circuit for subsequent connections. SpiderFoot will attempt to switch circuits up to three times before giving up.
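The behaviour described above can be sketched roughly as follows. This is my own illustration of the "detect a block, re-circuit, retry up to three times" pattern, not SpiderFoot's actual implementation: fetch_with_recircuit and its parameters are hypothetical names, and the block detection is left to a caller-supplied function. The newnym_via_stem helper uses Stem's Controller to send the NEWNYM signal and assumes the control port is 9051.

```python
def newnym_via_stem(port=9051, password=None):
    """Ask Tor for a new circuit through Stem (requires the stem package)."""
    # Imported here so the rest of the sketch works without Stem installed
    from stem import Signal
    from stem.control import Controller

    with Controller.from_port(port=port) as controller:
        controller.authenticate(password=password)
        controller.signal(Signal.NEWNYM)


def fetch_with_recircuit(fetch, looks_blocked, request_new_circuit, max_switches=3):
    """Fetch a result, switching Tor circuits when the source blocks us.

    fetch               -- callable returning a response
    looks_blocked       -- callable(response) -> True if a CAPTCHA/block was seen
    request_new_circuit -- callable that asks Tor for a new circuit
    Returns the first unblocked response, or None after max_switches switches.
    """
    response = fetch()
    switches = 0
    while looks_blocked(response) and switches < max_switches:
        request_new_circuit()
        switches += 1
        response = fetch()
    return None if looks_blocked(response) else response

# Example (needs Tor, Stem and your own fetch/detection functions):
#   page = fetch_with_recircuit(fetch_page, has_captcha, newnym_via_stem)
```

Passing the re-circuit step in as a callable keeps the retry logic independent of how Tor is actually controlled, which also makes it easy to exercise without a running Tor client.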