How to Create a Custom Wordlist in Python

Learn how to build a custom wordlist generator with options of minimum and maximum characters, and more with itertools in Python.
5 min read · Updated Dec 2023 · Ethical Hacking · Python Standard Library


Suppose that, as an ethical hacker or security professional, you are tasked with cracking the passwords of documents recently seized from cybercriminals. One very important tool you need to accomplish this task is a wordlist. Now, what is a wordlist?

A wordlist is a collection or list of words, typically organized in a specific order or categorized based on certain criteria. Wordlists can serve various purposes and can be used in different contexts, such as cryptography, password generation, password cracking, and more. 

Plenty of existing tools, such as crunch, already generate wordlists for us. But because we’re programmers and like to build our tools ourselves, we’re building our own custom tool using Python.

In the context of the scenario at hand (password cracking), wordlists can be used to crack passwords through:

Dictionary Attacks

In a dictionary attack, attackers use a predefined list of words, known as a wordlist or dictionary, to attempt to crack passwords. These wordlists can include commonly used passwords, words from dictionaries, phrases, and combinations of characters. Attackers systematically try each word in the list until they find a match or exhaust the entire list.
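To make the idea concrete, here is a minimal sketch of a dictionary attack against a single password hash. It assumes the target is stored as an unsalted SHA-256 hex digest, which is a simplification chosen for illustration; the function name `dictionary_attack` and the sample words are hypothetical.

```python
import hashlib

def dictionary_attack(target_hash, candidates):
    """Return the first candidate whose SHA-256 digest matches target_hash, else None."""
    for word in candidates:
        # Hash each candidate the same way the target was hashed (assumed: unsalted SHA-256).
        if hashlib.sha256(word.encode("utf-8")).hexdigest() == target_hash:
            return word
    return None

# Hypothetical target: the digest of a known test password.
target = hashlib.sha256(b"sunshine").hexdigest()
found = dictionary_attack(target, ["password", "123456", "sunshine"])
print(found)  # sunshine
```

The loop stops at the first match, and returns None once the list is exhausted.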

Brute-Force Attacks with Wordlists

Wordlists are often used in conjunction with brute-force attacks. In a brute-force attack, attackers systematically try all possible combinations of characters, starting from the shortest and progressing to longer ones. Wordlists optimize this process by trying likely passwords first, which saves time compared to exhaustively searching the entire keyspace. In this tutorial, we will build a brute-force wordlist generator with custom options.

Open a Python file, name it meaningfully (like wordlist_gen.py) and follow along.

We’ll start by importing the necessary libraries:

# Import the argparse module for handling command line arguments.
# Import the itertools module for generating combinations.
import argparse, itertools

The argparse module is used for parsing command-line arguments in Python scripts. It provides a convenient way to specify the input parameters for a script and retrieve those values within the script.

The itertools module provides a set of fast, memory-efficient tools for working with iterators. It includes functions for creating and manipulating iterators, such as infinite iterators, chaining multiple iterators, and generating permutations, combinations, and Cartesian products. We use its product() function to create our wordlist, because it allows the same character to appear in more than one position.
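As a small illustration of why product() fits here, the sketch below compares it with permutations() on a tiny three-character set. The character set "ab1" is just an example.

```python
import itertools

# product() with repeat=2 yields every ordered pair, including repeated characters.
# Each result is a tuple of characters, so we join it into a string.
pairs = [''.join(p) for p in itertools.product("ab1", repeat=2)]
print(pairs)
# ['aa', 'ab', 'a1', 'ba', 'bb', 'b1', '1a', '1b', '11']

# permutations() never reuses a character, so words like 'aa' are missing.
perms = [''.join(p) for p in itertools.permutations("ab1", 2)]
print(perms)
# ['ab', 'a1', 'ba', 'b1', '1a', '1b']
```

Since real passwords often repeat characters, a brute-force wordlist needs the product() behavior.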

Next, we create a function that does the wordlist generation:

# Define a function to generate a wordlist based on given parameters.
def generate_wordlist(characters, min_length, max_length, output_file):
   # Open the output file in write mode
   with open(output_file, 'w') as file:
       # Iterate over the range of word lengths from min_length to max_length.
       for length in range(min_length, max_length + 1):
           # Generate every possible arrangement of the characters (repeats allowed) with the given length.
           for combination in itertools.product(characters, repeat=length):
               # Join the characters to form a word and write it to the file.
               word = ''.join(combination)
               file.write(word + '\n')

The generate_wordlist() function takes parameters for characters, minimum and maximum word lengths, and an output file, then generates all possible combinations of characters within the specified length range and writes them as words into the given output file.

From this function, we can tell the program the word length to start from (minimum) and the length it shouldn’t exceed (maximum). This is very useful when we have an idea of how many characters the target’s password has.
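It helps to estimate the output before generating it, because a set of n characters produces n^k words of length k. The sketch below is a rough estimator; the helper names `count_words` and `estimate_size_bytes` are hypothetical, and the size estimate assumes one byte per character and a single-byte newline.

```python
def count_words(num_characters, min_length, max_length):
    """Total number of words: the sum of n**k for every length k in the range."""
    return sum(num_characters ** length
               for length in range(min_length, max_length + 1))

def estimate_size_bytes(num_characters, min_length, max_length):
    """Approximate file size, assuming 1-byte characters and a 1-byte newline per word."""
    return sum((length + 1) * num_characters ** length
               for length in range(min_length, max_length + 1))

# Six characters, lengths 3 to 6.
print(count_words(6, 3, 6))          # 55944
print(estimate_size_bytes(6, 3, 6))  # 380592

# 62 characters (a-z, A-Z, 0-9), lengths 4 to 6.
print(count_words(62, 4, 6))         # 57731144752
```

With 62 characters and lengths 4 to 6, the estimate comes to roughly 400 GB, so narrowing the character set or the length range matters a great deal.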

Finally, we accept arguments from the command line using argparse:

# Create an ArgumentParser object for handling command line arguments
parser = argparse.ArgumentParser(description="Generate a custom wordlist similar to crunch.")
# Define command line arguments
parser.add_argument("-c", "--characters", type=str, default="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789",
                   help="Set of characters to include in the wordlist")
parser.add_argument("-min", "--min_length", type=int, default=4, help="Minimum length of the words")
parser.add_argument("-max", "--max_length", type=int, default=6, help="Maximum length of the words")
parser.add_argument("-o", "--output_file", type=str, default="custom_wordlist.txt", help="Output file name")
# Parse the command line arguments
args = parser.parse_args()
# Call the generate_wordlist function with the provided arguments
generate_wordlist(args.characters, args.min_length, args.max_length, args.output_file)
# Print a message indicating the wordlist has been generated and saved
print(f"[+] Wordlist generated and saved to {args.output_file}")

In this part of the code, we accept arguments (specifications) from the user, generate the wordlist accordingly, and save the generated output in the specified file.
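The script as written accepts any integers for the lengths and any string for the characters. As an optional hardening step, the sketch below rejects an invalid length range and drops duplicate characters, which would otherwise produce duplicate words. The wrapper function `parse_and_validate` is a hypothetical name, and the specific checks are assumptions about what is worth validating, not part of the original script.

```python
import argparse

def parse_and_validate(argv=None):
    """Parse the same options as the script, then sanity-check them."""
    parser = argparse.ArgumentParser(description="Generate a custom wordlist similar to crunch.")
    parser.add_argument("-c", "--characters", type=str, default="abc123")
    parser.add_argument("-min", "--min_length", type=int, default=4)
    parser.add_argument("-max", "--max_length", type=int, default=6)
    args = parser.parse_args(argv)

    # parser.error() prints a usage message and exits with status 2.
    if args.min_length < 1:
        parser.error("--min_length must be at least 1")
    if args.max_length < args.min_length:
        parser.error("--max_length must be greater than or equal to --min_length")

    # Remove repeated characters while keeping their original order.
    args.characters = ''.join(dict.fromkeys(args.characters))
    return args

args = parse_and_validate(["-c", "aab1", "-min", "2", "-max", "3"])
print(args.characters, args.min_length, args.max_length)  # ab1 2 3
```

Passing an explicit list to parse_args() makes the function easy to try out; with argv=None it reads the real command line.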

Let’s run our code:

$ python wordlist_gen.py -c abc123 -min 3 -max 6 -o generated_passwords.txt

When we look at our generated_passwords.txt file:
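One quick way to preview how the file begins, without opening it, is to regenerate the first few words in memory. This sketch assumes the same character set (abc123) and minimum length (3) as the command above; the choice of eight words is arbitrary.

```python
import itertools

# islice() takes only the first 8 items, so the full product is never built.
first_words = [''.join(c)
               for c in itertools.islice(itertools.product("abc123", repeat=3), 8)]
print(first_words)
# ['aaa', 'aab', 'aac', 'aa1', 'aa2', 'aa3', 'aba', 'abb']
```

The rightmost position cycles fastest, following the order in which the characters were passed to -c.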

There you have it! We have created our own custom wordlist. Pretty cool, right?

If you want to know how to take this wordlist further by learning how to crack passwords, check out these tutorials:

Happy generating ♥
