SpiderFoot

SpiderFoot is an open source intelligence automation tool. Its goal is to automate the process of gathering intelligence about a given target.

Purpose

There are three main areas where SpiderFoot can be useful:

  1. If you are a pen-tester, SpiderFoot will automate the reconnaissance stage of the test, giving you a rich set of data to help you pinpoint areas of focus for the test.

  2. Understand what your network/organisation is openly exposing to the outside world. Such information in the wrong hands could be a significant risk.

  3. SpiderFoot can also be used to gather threat intelligence about suspected malicious IPs you might be seeing in your logs or have obtained via threat intelligence data feeds.

Features

  • Utilises a shedload of data sources; over 50 so far and counting, including SHODAN, RIPE, Whois, PasteBin, Google, SANS and more.

  • Designed for maximum data extraction; every piece of data is passed on to modules that may be interested, so that they can extract valuable information. No piece of discovered data is spared from analysis.

  • Runs on Linux and Windows, and is fully open source, so you can fork it on GitHub and do whatever you want with it.

  • Visualisations. Built-in JavaScript-based visualisations or export to GEXF/CSV for use in other tools, like Gephi for instance.

  • Web-based UI and CLI. Choose between a GUI that is easy to use and a powerful command-line interface. Take a look through the gallery for screenshots of the GUI.

  • Highly configurable. Almost every module is configurable so you can define the level of intrusiveness and functionality.

  • Modular. Each major piece of functionality is a module, written in Python. Feel free to write your own and submit them to be incorporated!

  • SQLite back-end. All scan results are stored in a local SQLite database, so you can play with your data to your heart’s content.

  • Simultaneous scans. Each footprint scan runs as its own thread, so you can perform footprinting of many different targets simultaneously.

  • So much more... Check out the documentation for more information.
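
To make the "SQLite back-end" and "Simultaneous scans" points more concrete, here is a minimal sketch of several scans running as separate threads and writing to one local SQLite file. It is an illustration only, not SpiderFoot's actual code: the table layout, the run_scan function, the scan IDs and the placeholder findings are all hypothetical.

```python
import os
import sqlite3
import tempfile
import threading

# Hypothetical database location and schema, chosen only for this sketch.
DB_PATH = os.path.join(tempfile.mkdtemp(), "scans.db")


def init_db(path):
    """Create the (hypothetical) results table if it does not exist yet."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success
            conn.execute(
                "CREATE TABLE IF NOT EXISTS results "
                "(scan_id TEXT, type TEXT, data TEXT)"
            )
    finally:
        conn.close()


def run_scan(scan_id, target, path):
    """Stand-in for a footprint scan: records placeholder findings."""
    findings = [("TARGET", target), ("NOTE", "placeholder finding for " + target)]
    # Each thread opens its own connection; the timeout lets a writer wait
    # for another thread's write lock instead of failing immediately.
    conn = sqlite3.connect(path, timeout=30)
    try:
        with conn:
            conn.executemany(
                "INSERT INTO results VALUES (?, ?, ?)",
                [(scan_id, ftype, data) for ftype, data in findings],
            )
    finally:
        conn.close()


init_db(DB_PATH)
targets = {"scan-1": "example.com", "scan-2": "example.org", "scan-3": "192.0.2.10"}

# One thread per scan, all started before any is joined.
threads = [
    threading.Thread(target=run_scan, args=(scan_id, target, DB_PATH))
    for scan_id, target in targets.items()
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Because everything sits in one SQLite file, results can be queried freely.
conn = sqlite3.connect(DB_PATH)
rows = conn.execute(
    "SELECT scan_id, COUNT(*) FROM results GROUP BY scan_id ORDER BY scan_id"
).fetchall()
conn.close()
print(rows)
```

Opening one connection per thread is a deliberate choice in this sketch, since Python's sqlite3 connections are not meant to be shared across threads by default.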

Data Sources

This is an ever-growing list of data sources SpiderFoot uses to gather intelligence about your target. A few require API keys but they are freely available.

Module | Name | Description
sfp_accounts.py | Accounts | Look for possible associated accounts on nearly 200 websites like Ebay, Slashdot, reddit, etc.
sfp_adblock.py | AdBlock Check | Check if linked pages would be blocked by AdBlock Plus.
sfp_alienvault.py | AlienVault OTX | Obtain information from AlienVault Open Threat Exchange (OTX)
sfp_bingsearch.py | Bing | Some light Bing scraping to identify sub-domains and links.
sfp_binstring.py | Binary String Extractor | Attempt to identify strings in binary content.
sfp_bitcoin.py | Bitcoin Finder | Identify bitcoin addresses in scraped webpages.
sfp_blacklist.py | Blacklist | Query various blacklist databases for open relays, open proxies, vulnerable servers, etc.
sfp_botscout.py | BotScout | Searches botscout.com’s database of spam-bot IPs and e-mail addresses.
sfp_censys.py | Censys | Obtain information from Censys.io
sfp_clearbit.py | Clearbit | Check for names, addresses, domains and more based on lookups of e-mail addresses on clearbit.com.
sfp_coderepo.py | Code Repos | Identify associated public code repositories (Github only for now).
sfp_cookie.py | Cookies | Extract Cookies from HTTP headers.
sfp_crossref.py | Cross-Reference | Identify whether other domains are associated (‘Affiliates’) of the target.
sfp_crt.py | Certificate Transparency | Gather hostnames from historical certificates in crt.sh.
sfp_cymon.py | Cymon | Obtain information from Cymon.io
sfp_darksearch.py | Darknet | Search Tor ‘Onion City’ search engine for mentions of the target domain.
sfp_defaced.py | Defacement Check | Check if a hostname/domain appears on the zone-h.org ‘special defacements’ RSS feed.
sfp_dns.py | DNS | Performs a number of DNS checks to obtain Sub-domains/Hostnames, IP Addresses and Affiliates.
sfp_duckduckgo.py | DuckDuckGo | Query DuckDuckGo’s API for descriptive information about your target.
sfp_email.py | E-Mail | Identify e-mail addresses in any obtained data.
sfp_errors.py | Errors | Identify common error messages in content like SQL errors, etc.
sfp_filemeta.py | File Metadata | Extracts meta data from documents and images.
sfp_geoip.py | GeoIP | Identifies the physical location of IP addresses identified.
sfp_googlemaps.py | Google Maps | Identifies potential physical addresses and latitude/longitude coordinates.
sfp_googlesearch.py | Google Search | Some light Google scraping to identify sub-domains and links.
sfp_historic.py | Historic Files | Identifies historic versions of interesting files/pages from the Wayback Machine.
sfp_honeypot.py | Honeypot Checker | Query the projecthoneypot.org database for entries.
sfp_hosting.py | Hosting Providers | Find out if any IP addresses identified fall within known 3rd party hosting ranges, e.g. Amazon, Azure, etc.
sfp_hunter.py | Hunter.io | Check for e-mail addresses and names on hunter.io.
sfp_intfiles.py | Interesting Files | Identifies potential files of interest, e.g. office documents, zip files.
sfp_ir.py | Internet Registries | Queries Internet Registries to identify netblocks and other info.
sfp_junkfiles.py | Junk Files | Looks for old/temporary and other similar files.
sfp_malcheck.py | Malicious Check | Check if a website, IP or ASN is considered malicious by various sources. Includes TOR exit nodes and open proxies.
sfp_malwarepatrol.py | MalwarePatrol | Searches malwarepatrol.net’s database of malicious URLs/IPs.
sfp_names.py | Name Extractor | Attempt to identify human names in fetched content.
sfp_pageinfo.py | Page Info | Obtain information about web pages (do they take passwords, do they contain forms, etc.)
sfp_pastes.py | Pastes | PasteBin, Pastie and Notepad.cc scraping (via Google) to identify related content.
sfp_pgp.py | PGP Key Look-up | Look up e-mail addresses in PGP public key servers.
sfp_phone.py | Phone Numbers | Identify phone numbers in scraped webpages.
sfp_portscan_tcp.py | Port Scanner - TCP | Scans for commonly open TCP ports on Internet-facing systems.
sfp_psbdmp.py | Psbdmp.com | Check psbdmp.com (PasteBin Dump) for potentially hacked e-mails and domains.
sfp_pwned.py | Pwned Password | Check Have I Been Pwned? for hacked e-mail addresses identified.
sfp_s3bucket.py | S3 Bucket Finder | Search for potential S3 buckets associated with the target.
sfp_sharedip.py | Shared IP | Search Bing and/or Robtex.com and/or HackerTarget.com for hosts sharing the same IP.
sfp_shodan.py | SHODAN | Obtain information from SHODAN about identified IP addresses.
sfp_similar.py | Similar Domains | Search various sources to identify similar looking domain names, for instance squatted domains.
sfp_social.py | Social Networks | Identify presence on social media networks such as LinkedIn, Twitter and others.
sfp_socialprofiles.py | Social Media Profiles | Identify the social media profiles for human names identified.
sfp_spider.py | Spider | Spidering of web-pages to extract content for searching.
sfp_sslcert.py | SSL | Gather information about SSL certificates used by the target’s HTTPS sites.
sfp_strangeheaders.py | Strange Headers | Obtain non-standard HTTP headers returned by web servers.
sfp_threatcrowd.py | ThreatCrowd | Obtain information from ThreatCrowd about identified IP addresses, domains and e-mail addresses.
sfp_tldsearch.py | TLD Search | Search all Internet TLDs for domains with the same name as the target (this can be very slow.)
sfp_virustotal.py | VirusTotal | Obtain information from VirusTotal about identified IP addresses.
sfp_vuln.py | Vulnerable | Check external vulnerability scanning/reporting services (for now only openbugbounty.org) to see if the target is listed.
sfp_webframework.py | Web Framework | Identify the usage of popular web frameworks like jQuery, YUI and others.
sfp_websvr.py | Web Server | Obtain web server banners to identify versions of web servers being used.
sfp_whois.py | Whois | Perform a WHOIS look-up on domain names and owned netblocks.
sfp_wikileaks.py | Wikileaks | Search Wikileaks for mentions of domain names and e-mail addresses.
sfp_xforce.py | XForce Exchange | Obtain information from IBM X-Force Exchange
sfp_yahoosearch.py | Yahoo | Some light Yahoo scraping to identify sub-domains and links.
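
As noted under Features, every piece of data is passed on to any module that may be interested in it. The sketch below shows one way such a flow could work, with a content event feeding an e-mail extractor whose output feeds a further module. It is an assumption-laden illustration rather than SpiderFoot's real module API: the Event, Module, EmailExtractor and DomainFromEmail classes, the event type names and the run_scan function are all hypothetical.

```python
import re
from collections import defaultdict, deque


class Event:
    """A piece of discovered data (hypothetical structure)."""

    def __init__(self, etype, data, source=None):
        self.etype = etype    # e.g. "WEB_CONTENT", "EMAIL", "DOMAIN"
        self.data = data
        self.source = source  # the event this one was derived from


class Module:
    """Base class: a module declares the event types it wants to see."""

    watched = ()

    def handle(self, event):
        """Return an iterable of new Events derived from `event`."""
        return []


class EmailExtractor(Module):
    watched = ("WEB_CONTENT",)

    def handle(self, event):
        # Deliberately simple pattern; real-world matching is messier.
        for addr in re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", event.data):
            yield Event("EMAIL", addr, event)


class DomainFromEmail(Module):
    watched = ("EMAIL",)

    def handle(self, event):
        yield Event("DOMAIN", event.data.split("@", 1)[1], event)


def run_scan(seed, modules):
    """Deliver each event to every module watching its type, breadth-first."""
    subscribers = defaultdict(list)
    for module in modules:
        for etype in module.watched:
            subscribers[etype].append(module)

    queue = deque([seed])
    seen = set()   # avoid processing the same (type, data) pair twice
    results = []
    while queue:
        event = queue.popleft()
        key = (event.etype, event.data)
        if key in seen:
            continue
        seen.add(key)
        results.append(event)
        for module in subscribers[event.etype]:
            queue.extend(module.handle(event))
    return results


seed = Event("WEB_CONTENT", "Contact alice@example.com or bob@mail.example.org.")
results = run_scan(seed, [EmailExtractor(), DomainFromEmail()])
for event in results:
    print(event.etype, event.data)
```

In this sketch neither module knows the other exists; each only names the event types it watches, so adding a new module does not require changing existing ones.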