One of the first things an adversary or a penetration tester might do when targeting an organization is perform passive reconnaissance to identify initial entry points into the organization. One of the best things a defender can do in response is perform the same reconnaissance in order to identify the systems most likely to be targeted in an initial intrusion and to proactively protect them.

The collection of information in this manner is referred to as Open Source Intelligence or OSINT. The figure below shows the kinds of information one might collect as part of this process.


In this lab, we will perform OSINT information gathering on publicly available electronic information for an organization using freely available tools.

Given the tools above, we'll now analyze the activity from the network forensics lab. In that lab, a machine has downloaded a binary from http://micropcsystem.com located at 198.54.126.123. Malicious downloads often come from sites that have been recently registered and that are quickly taken down once identified. As part of incident response, we might want to investigate both the domain and its IP address. On the Kali VM, begin by performing a DNS and a whois query on the name.

dig micropcsystem.com
whois micropcsystem.com
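Since recently registered domains are a warning sign, it can help to turn the whois output into a domain age. The sketch below is one way to do that, assuming the output contains a "Creation Date:" line with an ISO 8601 timestamp, as many registries print. Formats vary by registry, so treat the pattern, the function names, and the sample text in the usage note as illustrative assumptions rather than a general parser.

```python
import re
from datetime import datetime, timezone

# Assumed field label; many gTLD registries print "Creation Date: <ISO 8601>".
CREATION_RE = re.compile(r"^\s*Creation Date:\s*(\S+)", re.IGNORECASE | re.MULTILINE)


def creation_date(whois_text):
    """Return the registration time found in whois output, or None."""
    match = CREATION_RE.search(whois_text)
    if match is None:
        return None
    # Older Python versions reject a trailing "Z", so spell out the UTC offset.
    stamp = match.group(1).replace("Z", "+00:00")
    created = datetime.fromisoformat(stamp)
    if created.tzinfo is None:
        # Assume UTC when the registry omits the time zone.
        created = created.replace(tzinfo=timezone.utc)
    return created


def domain_age_days(whois_text, now=None):
    """Return the age of the domain in whole days, or None if unknown."""
    created = creation_date(whois_text)
    if created is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (now - created).days
```

To try it, save the output of the whois command to a file and pass its contents to domain_age_days(). A small result, such as a few days or weeks, would be consistent with a throwaway domain.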

Regional Internet registries (RIRs) are organizations that manage the allocation of Internet addresses within a region. IANA (the Internet Assigned Numbers Authority) delegates addresses to these registries (APNIC, ARIN, RIPE NCC, LACNIC, and AFRINIC). The registries track who is responsible for which addresses so that abusive behavior can be reported and stopped. The whois tool allows us to query the RIR databases for information on arbitrary IP addresses. Run a whois query on the IP address:

whois <IP_address>
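Under the hood, the whois tool speaks a very simple protocol: it opens a TCP connection to port 43 on a registry's server, sends the query followed by a carriage return and line feed, and reads until the server closes the connection. The sketch below shows that exchange in Python. The function name and the server table are our own additions; the hostnames are the ones the five RIRs commonly publish, but check them before relying on the table.

```python
import socket

# Whois servers commonly published by the five RIRs (verify before use).
RIR_WHOIS_SERVERS = {
    "ARIN": "whois.arin.net",
    "RIPE NCC": "whois.ripe.net",
    "APNIC": "whois.apnic.net",
    "LACNIC": "whois.lacnic.net",
    "AFRINIC": "whois.afrinic.net",
}


def whois_query(query, server, port=43, timeout=10):
    """Send one whois query and return the server's full text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # The protocol is a single line of text terminated by CRLF.
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

For example, whois_query("198.54.126.123", RIR_WHOIS_SERVERS["ARIN"]) would ask ARIN about the address hosting the download. If the address belongs to another region, the response typically points to the responsible registry.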

For the IP address hosting the download:

Several companies aggregate OSINT data from multiple sources to provide a comprehensive picture of arbitrary IP addresses. One such site is ipinfo.io. Visit the site at https://ipinfo.io and enter the IP address the download came from in the search box. Note that this may require you to sign in with an account. Examine the output returned and answer the following questions:

Look up each of the following IP addresses:

1.1.1.1
45.35.81.2
35.233.233.233
82.102.19.137
104.149.133.54
107.189.11.228
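If you would rather script these lookups than paste each address into the search box, something like the sketch below may work. It assumes that ipinfo.io serves a JSON record at https://ipinfo.io/<ip>/json, that an optional token query parameter identifies your account, and that the record contains ip, org, and country fields. Confirm all three against the site's current documentation, since none of them come from this lab.

```python
import ipaddress
import json
import urllib.request


def fetch_ipinfo(ip, token=None):
    """Fetch the JSON record for one address (URL layout is an assumption)."""
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    url = f"https://ipinfo.io/{ip}/json"
    if token:
        url += f"?token={token}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize(record):
    """Reduce a record to one line; the field names are assumptions."""
    ip = record.get("ip", "?")
    org = record.get("org", "unknown org")
    country = record.get("country", "??")
    return f"{ip}: org={org}, country={country}"
```

Calling summarize(fetch_ipinfo(addr)) in a loop over the list above would print one line per address, which makes it easier to compare the organizations and countries side by side.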

Typically, if one wants to integrate network intelligence information into a running web application, a REST-like API is helpful. Most network intelligence platforms provide free APIs for leveraging their services, charging for use only after a certain limit is reached. An example of this is at https://getipintel.net. The site leverages a large collection of dynamic datasets of network behavior, then performs statistical analyses and machine learning on them in order to estimate the likelihood that a particular IP address is a proxy, a VPN, or the source of malicious activity. When a user supplies an IP address, the site returns a number between 0 and 1, where 0 indicates an address that falls into none of those categories and 1 indicates an address that does.

We'll create a short Python script that queries the getipintel API endpoint for each address in the prior list. On the Kali VM, create the Python script below, filling in the rest of the ip list and supplying your PSU e-mail address to identify your queries to the service.

query_getipintel.py

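import requests

# Addresses to check; fill in the rest of the list from above.
ip = ['1.1.1.1', ... ]
api = "http://check.getipintel.net/check.php"
# Contact e-mail that identifies your queries to the service.
email = '...'
for i in ip:
  # Pass the query parameters as a dict so that requests URL-encodes them.
  resp = requests.get(api, params={'ip': i, 'contact': email, 'format': 'json'})
  print(resp.json())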

Create a Python virtual environment, activate it, install the requests package, then run your program.

virtualenv -p python3 env
source env/bin/activate
pip install requests
python query_getipintel.py

Programmatically, you can take the dictionary returned by parsing the JSON response and use it to make decisions within your code. Modify your code with a new set of IP addresses:

ip = ['82.22.67.20', '131.252.220.66', '197.210.33.5']

Then, adapt your program to use the returned dictionary from the JSON object to print out the IP address and the result score for each address on successful queries.

  res = resp.json()
  if 'status' in res and 'success' in res['status']:
    print(res['queryIP'], ...)
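Once you have a score in hand, making a decision with it usually comes down to a threshold. The sketch below shows one way to do that. The function name, the labels, and the 0.95 cutoff are all our own assumptions, so pick a cutoff that suits how costly a false positive is for your application. It also treats anything outside the 0 to 1 range as invalid, on the assumption that such values signal an error rather than a score.

```python
def classify_score(score, threshold=0.95):
    """Map a 0-to-1 likelihood score to a coarse label.

    The score may arrive as a string or a number, so it is converted first.
    """
    value = float(score)
    if not 0.0 <= value <= 1.0:
        # Assumed convention: out-of-range values indicate a failed query.
        return "invalid"
    return "suspicious" if value >= threshold else "ok"
```

A web application might, for instance, require an extra verification step for addresses labeled "suspicious" rather than blocking them outright.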

Another source of intelligence is distributed blocklists, which collect information from across the Internet on IP addresses that have been the source of attacks. One aggregator of blocklists is DNSBL (https://dnsbl.info). Look up the IP address the download came from on the site and wait for the results of all lookups.
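DNS-based blocklists are queried through ordinary DNS: the octets of the IPv4 address are reversed, the blocklist's zone is appended, and an A record lookup is made on the resulting name. An answer means the address is listed, and a "no such name" result means it is not. The sketch below shows the mechanics. The function names are ours, and the zone dnsbl.example in the usage note is a placeholder, so substitute a real blocklist zone whose usage terms you have checked.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError if not valid IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the blocklist answers for this address.

    Any resolution failure is treated as "not listed", which also hides
    resolver or network problems, so do not read False as proof of a clean IP.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For the address the download came from, dnsbl_query_name("198.54.126.123", "dnsbl.example") produces the name 123.126.54.198.dnsbl.example, which is the kind of query an aggregator issues on your behalf, once per blocklist.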

Similarly, Threat Intelligence Platform provides intelligence aggregation for IP addresses. Visit its site at https://threatintelligenceplatform.com/ and enter the IP address the download came from. The site will list all of the prior DNS names that have resolved to the IP address.

Finally, the Shodan and Censys search engines not only aggregate information but also actively probe network addresses to discover what they are running. These services cut both ways: defenders can use them to measure attack surface and to identify sources of attacks (and potentially hack back), while adversaries can use them to find vulnerable machines to attack.

Visit https://shodan.io and enter the IP address the download came from.

Repeat the search on Censys at https://search.censys.io/
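The same host lookup can be scripted. The sketch below assumes the third-party shodan Python package (pip install shodan), an API key from your Shodan account, and a host record shaped like a dictionary with ip_str, org, and a data list of per-port banners. The helper names are ours and the record layout should be checked against Shodan's documentation before you depend on it.

```python
def summarize_host(host):
    """Render a host record as text, one line per open port."""
    lines = [f"{host.get('ip_str', '?')} ({host.get('org', 'n/a')})"]
    banners = sorted(host.get("data", []), key=lambda b: b.get("port", 0))
    for banner in banners:
        # "product" is only present when the service was identified.
        product = banner.get("product", "unknown service")
        lines.append(f"  port {banner.get('port')}: {product}")
    return "\n".join(lines)


def lookup(ip, api_key):
    """Fetch the host record for one address (requires network and a key)."""
    import shodan  # imported here so the summary helper works without it

    return shodan.Shodan(api_key).host(ip)
```

Running print(summarize_host(lookup(ip, key))) gives a quick view of what a host exposes, which is the same view an adversary would get.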

While we have shown how network intelligence can improve security, it can just as easily be leveraged by adversaries. One kind of information adversaries can use is a user's geolocation, which lets them target the user with convincing lures. A particularly sensitive source of geolocation data is the location of individual wireless access points. Collecting such information is a threat to privacy, as companies such as Google have discovered. However, it is difficult to stop crowd-sourced collection of such data since wireless access point names and MAC addresses are public.

Wigle is a crowd-sourced wireless intelligence site that collects access point information including SSIDs, MAC addresses, and their associated geographic locations. It has submissions from hundreds of thousands of "stumblers", enumerating close to a billion WiFi networks. It also exports a query interface for searching particular addresses in order to identify their geographic location.

To begin with, visit the Wigle site and create an account on it.

Most operating systems keep log data that tracks the access points, along with their MAC addresses, that you've connected to in the past. Doing so allows your machine to reconnect to these access points automatically. Suppose you have captured my laptop and want to figure out where I live and work.

Digging through my laptop, you find the following MAC address, which happens to be one I use at work.

00:F2:8B:F9:9E:7D

Use Wigle's search interface to discover the geographic location of the access point.

Next, you find another MAC address I frequently use, which happens to be one I may use at home. Search Wigle again to discover my home's geographic location.

C0:56:27:60:EA:EA
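MAC addresses pulled from logs show up in several notations (colons, dashes, dots, mixed case), so it helps to normalize them before searching. The sketch below does that and then builds a search URL. The endpoint https://api.wigle.net/api/v2/network/search and its netid parameter are assumptions about Wigle's API that you should confirm in its documentation, and the API would also require credentials from your account. Both function names are ours.

```python
import re
from urllib.parse import urlencode


def normalize_bssid(raw):
    """Return a MAC address as upper-case, colon-separated hex pairs."""
    digits = re.sub(r"[:.\-]", "", raw.strip())
    if not re.fullmatch(r"[0-9A-Fa-f]{12}", digits):
        raise ValueError(f"not a MAC address: {raw!r}")
    digits = digits.upper()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def wigle_search_url(bssid):
    """Build a search URL for one access point (endpoint is an assumption)."""
    query = urlencode({"netid": normalize_bssid(bssid)})
    return f"https://api.wigle.net/api/v2/network/search?{query}"
```

Normalizing first means that c0-56-27-60-ea-ea and C0:56:27:60:EA:EA produce the same query, so an address copied from any log format can be searched without manual cleanup.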