A Guide to Python Programming for Cybersecurity
Cybersecurity is the practice of protecting networks, systems, and programs from digital attacks. The industry was estimated to be worth $112 billion in 2019, with a projected 3.5 million unfilled jobs by 2021.
Many programming languages are used for everyday cybersecurity tasks, but one of them has emerged as the industry standard: Python.
Python has a syntax that is easy to read and understand and a wide range of applications that make it a very versatile programming language for any aspiring cybersecurity professional.
Python’s growth over the last few years has been incredible, and it’s now considered one of the most popular languages across all industries, according to Stack Overflow.
If you’re a programmer thinking of transitioning to security, this post will show you how you can use your existing skill set in another high-income, low-unemployment industry. You could do that, for instance, by either automating repetitive processes to save your team countless hours or by creating security tools that can be used to test the security of applications or systems.
Imperva, a leading cybersecurity software and service provider, reports that 77% of the websites it protects were attacked by a Python-based tool. As security professionals, part of our job is to mimic real-life attacks to ensure that companies are ready when real attacks occur. Understanding the language and libraries used in real attacks, and being able to replicate those tools, is a very valuable skill set.
However, not all Python experience is equal in the security field. To build an effective portfolio, develop effective software, and properly demonstrate your value, you need to focus on learning the right Python libraries and frameworks for the industry.
So let’s look at some of the different Python libraries that you need to know to thrive in these areas.
Firstly, you want to be able to write effective Python scripts to automate many of the day-to-day tasks of a security professional.
Python has been widely used in security work because of its easy-to-learn syntax and wide range of libraries, which give it a lot of functionality. While other languages can be used to perform these tasks, I recommend learning Python. That’s what the majority of the industry will be using, and collaboration is important.
Many security tasks require you to apply the same operation across hundreds or thousands of endpoints. For example, let’s look at configuration management. This is the practice of defining a secure template for a system, including things like what services are allowed to be on the machine, what ports will be open, firewall rules, etc.
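To make the idea of a secure template concrete, here is a minimal sketch of a baseline check. The `BASELINE` dictionary, its keys, and the `audit_host` and `audit_fleet` functions are hypothetical names invented for illustration. In practice, both the template and each host's observed state would come from your own configuration management tooling.

```python
# Hypothetical secure template: the only ports and services a host may expose.
BASELINE = {
    "allowed_ports": {22, 443},
    "allowed_services": {"sshd", "nginx"},
}


def audit_host(observed, baseline=BASELINE):
    """Return the ports and services on a host that the template does not allow."""
    return {
        "unexpected_ports": sorted(
            set(observed["open_ports"]) - baseline["allowed_ports"]
        ),
        "unexpected_services": sorted(
            set(observed["services"]) - baseline["allowed_services"]
        ),
    }


def audit_fleet(hosts, baseline=BASELINE):
    """Audit every host and keep only those that drift from the template."""
    report = {}
    for name, observed in hosts.items():
        findings = audit_host(observed, baseline)
        if any(findings.values()):
            report[name] = findings
    return report
```

Because the same function runs unchanged against every endpoint, a check like this scales from one machine to thousands.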
The ability to automate these processes will reduce not only the time they take but also the number of errors. Up to 90% of security incidents are a direct result of human error, so the less you rely on human actors, the better it is from a security perspective. This leads to the question: how can I learn to automate processes like this?
Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows programmers to write scripts that interact with AWS services like Amazon Simple Storage Service (S3), Amazon Elastic Compute Cloud (EC2), and Amazon Virtual Private Cloud (VPC).
With Boto3 you can start and stop servers on demand, terminate instances that do not conform to your organization’s security standards, perform updates and patch management, and much more. Being familiar with this SDK is very valuable for any professional working with AWS.
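As a rough sketch of the "stop what does not conform" idea, the functions below look for running EC2 instances that lack a required tag and optionally stop them. The `SecurityReviewed` tag is a made-up example of an organizational standard, and the function names are my own. The `ec2` argument is assumed to be a Boto3 EC2 client, and the two client calls used are `describe_instances` and `stop_instances`.

```python
def find_untagged_instances(ec2, required_tag="SecurityReviewed"):
    """Return IDs of running instances that lack the required tag.

    `ec2` is expected to be a Boto3 EC2 client, e.g. boto3.client("ec2").
    The tag name is a hypothetical example of an organizational standard.
    """
    response = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )
    offenders = []
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            tag_keys = {tag["Key"] for tag in instance.get("Tags", [])}
            if required_tag not in tag_keys:
                offenders.append(instance["InstanceId"])
    return offenders


def stop_untagged_instances(ec2, required_tag="SecurityReviewed", apply=False):
    """Report non-compliant instances; stop them only when apply=True."""
    offenders = find_untagged_instances(ec2, required_tag)
    if offenders and apply:
        ec2.stop_instances(InstanceIds=offenders)
    return offenders
```

With credentials configured, this could be called as `stop_untagged_instances(boto3.client("ec2"), apply=True)`. The sketch reads only a single page of results, so a larger account would need pagination on top of it.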
Regex is short for regular expression, a tool that lets you search for specific patterns within a block of text. In Python, it is provided by the built-in re module. This is very useful for extracting information from log files during an investigation or when scraping information from the internet.
By combining this module with other standard Python libraries, you can create some very useful programs. For example, you can use regex to search log files for IP addresses to help determine whether someone was able to hack into your network, what actions they performed, and when the event took place.
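Here is a small sketch of the log-search idea using only the standard library. The pattern and the `count_ips` helper are illustrative choices rather than a canonical recipe, and real log formats may need a different pattern.

```python
import re
from collections import Counter

# Matches dotted-quad IPv4 addresses; each octet is limited to 0-255.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4 = re.compile(rf"\b{OCTET}(?:\.{OCTET}){{3}}\b")


def count_ips(log_lines):
    """Count how often each IPv4 address appears in an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        counts.update(IPV4.findall(line))
    return counts
```

Passing an open log file to `count_ips` and then calling `.most_common(10)` on the result gives a quick view of the addresses that show up most often, which is usually a reasonable place to start an investigation.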
Pyautogui and Web Browser
Pyautogui allows your scripts to control the mouse and keyboard, letting you imitate the actions of a human user. The webbrowser module allows you to launch a browser at a specified URL.
You can use these in programs to automate any task that requires you to visit a website and perform an action, such as logging in, filling out a web form, posting information, or downloading files.
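The sketch below shows how the two might fit together. `FORM_URL` is a placeholder address and `build_url` and `open_and_fill` are hypothetical helpers. The sketch also assumes the page focuses its first input field on load, which many real pages do not.

```python
import time
import webbrowser
from urllib.parse import urlencode

# Placeholder address for illustration only; substitute the page you need.
FORM_URL = "https://example.com/report"


def build_url(base, **params):
    """Append URL-encoded query parameters to a base URL."""
    return f"{base}?{urlencode(params)}" if params else base


def open_and_fill(url, field_values, typing_delay=0.05, load_wait=3):
    """Open `url` in a new browser tab and type each value, tabbing between fields."""
    import pyautogui  # third-party; imported here because it needs a graphical session

    webbrowser.open_new_tab(url)
    time.sleep(load_wait)  # crude wait for the page to finish loading
    for value in field_values:
        pyautogui.write(value, interval=typing_delay)
        pyautogui.press("tab")
```

Fixed sleeps like this are fragile, so treat the approach as a fit for quick personal automation rather than anything you need to be reliable.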
Pyperclip gives you the ability to access the clipboard directly from your Python scripts. While this can be done with the pyautogui library, pyperclip makes the process much simpler and adds flexibility to your scripts.
It’s particularly useful for scripts that involve large bodies of text. For example, say you’re scanning an entire PDF for names, addresses, and phone numbers. Just by highlighting the PDF text and copying it to the clipboard, you can have pyperclip feed it to your script as input, saving you a significant amount of time.
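A minimal sketch of that workflow might look like the following. The two patterns are deliberately simplified assumptions (a generic email shape and a US-style phone number), and `find_contacts` and `contacts_from_clipboard` are names made up for this example. The only pyperclip calls used are `paste()` and `copy()`.

```python
import re

# Simplified patterns for illustration; real-world formats vary widely.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
US_PHONE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def find_contacts(text):
    """Return the unique email addresses and phone numbers found in `text`."""
    return {
        "emails": sorted(set(EMAIL.findall(text))),
        "phones": sorted(set(US_PHONE.findall(text))),
    }


def contacts_from_clipboard():
    """Read whatever text is on the clipboard and extract contacts from it."""
    import pyperclip  # third-party; imported here so find_contacts works without it

    results = find_contacts(pyperclip.paste())
    # Put a tidy summary back on the clipboard for pasting elsewhere.
    pyperclip.copy("\n".join(results["emails"] + results["phones"]))
    return results
```

After copying the document text, calling `contacts_from_clipboard()` returns the matches and leaves a one-per-line summary on the clipboard.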
Faker is a library dedicated to producing fake data that can be used to test your programs. This is important to ensure that whatever scripts or tools you write perform as intended.
For example, if you have a script that extracts URLs, you may want to generate some fake text containing URLs and run your script against it to confirm that it finds them all. Faker can generate random data such as names, addresses, emails, countries, text, URLs, etc.
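The URL example could be sketched like this. `extract_urls` stands in for the script you want to test, and `make_sample` is a hypothetical helper that plants known URLs in filler text. The `fake` argument is assumed to be a `faker.Faker` instance, and only its `sentence()` and `url()` methods are used.

```python
import re

URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(text):
    """The script under test: pull every http(s) URL out of a block of text."""
    return URL_PATTERN.findall(text)


def make_sample(fake, url_count=3):
    """Build filler text with `url_count` known URLs planted in it.

    Returns the text together with the list of planted URLs, so a test can
    check that the extractor finds exactly those.
    """
    planted = [fake.url() for _ in range(url_count)]
    parts = []
    for url in planted:
        parts.append(fake.sentence())
        parts.append(f"See {url} for details.")
    return " ".join(parts), planted
```

With the library installed, `from faker import Faker` followed by `text, planted = make_sample(Faker())` produces a fresh sample, and comparing `extract_urls(text)` with `planted` becomes the check.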
Another important application of Python programming in cybersecurity is in the area of penetration testing. A penetration test is the process of trying to hack into a website, application, device, or network in order to test the security of that entity.
In order to perform these tests effectively, many professionals create their own tools and scripts that function exactly as they need them to for the test, and this is where knowing Python becomes very useful.
Python is largely used in this area to develop custom scripts and tools used to perform the attacks. If you want to be successful in this area, knowing how to write effective scripts and how to read and understand tools written by others will be very valuable to you. Here are some of the key libraries you need to be familiar with.
Nmap is a very widely used port scanner. Port scanning is the process of checking what ports are open on a computer and what services are running on that machine so you can start to determine how that machine may be vulnerable to getting hacked.
The Python Nmap library makes it easy to use Nmap functionality from your Python scripts, speeding up the process of scanning a target computer for vulnerabilities and giving you more customization in your scans. This library allows you to analyze Nmap scan results, perform custom scans, and import Nmap results into other tools.
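As a hedged sketch of analyzing scan results, the helper below summarizes the open TCP ports in a scan. It assumes the python-nmap package, whose `PortScanner.scan()` method returns a dictionary that keeps per-host data under a `"scan"` key. The function names and the default port range are my own choices, and running `scan_host` also requires the nmap binary itself.

```python
def open_tcp_ports(scan_result):
    """Map each scanned host to its open TCP ports and detected service names.

    `scan_result` is assumed to be the dictionary returned by
    nmap.PortScanner().scan(), which keeps per-host data under "scan".
    """
    summary = {}
    for host, data in scan_result.get("scan", {}).items():
        ports = {
            port: info.get("name", "")
            for port, info in data.get("tcp", {}).items()
            if info.get("state") == "open"
        }
        if ports:
            summary[host] = ports
    return summary


def scan_host(target, ports="22-443"):
    """Run an Nmap service scan against `target` and summarize the open ports."""
    import nmap  # python-nmap; also requires the nmap binary to be installed

    scanner = nmap.PortScanner()
    return open_tcp_ports(scanner.scan(hosts=target, ports=ports, arguments="-sV"))
```

Keeping the parsing separate from the scan means you can feed it saved results as easily as live ones.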
Socket is a low-level network interfacing library that allows you to establish client-server connections. In the context of cybersecurity, this is important because it allows you to connect to any machine on a specified port, with a specific protocol, and send data to that machine.
This can be used for port scanning of a machine as well as sending data to or extracting information from a machine. Data exfiltration occurs at a later stage of pen testing, typically after a system has been exploited. Any project that requires you to communicate over a network interface will likely use Socket.
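A basic TCP connect scan can be sketched with the standard library alone. The function names and the one-second default timeout are illustrative choices, and the scan checks ports one at a time, so it is slow compared with dedicated tools.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def scan_ports(host, ports, timeout=1.0):
    """Check each port in turn and return the ones that accepted a connection."""
    return [port for port in ports if is_port_open(host, port, timeout)]
```

For example, `scan_ports("127.0.0.1", range(20, 1025))` would list the well-known ports accepting connections on your own machine.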
Scapy is a packet manipulation library that can forge and decode packets across many different network protocols.
In cybersecurity, there are situations where you need to monitor the packets being sent across a computer network. It could be to determine if someone hacked into your environment, see what ports and services are running on a machine, or troubleshoot a network problem.
Whatever the reason, this library is great for performing packet analysis and can provide much of the same functionality as popular tools such as Nmap, Wireshark, and tcpdump.
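To illustrate the Nmap-like side of Scapy, here is a sketch of a single SYN probe: forge a TCP packet with only the SYN flag set, send it, and interpret the flags of whatever comes back. `classify_reply` and `syn_probe` are hypothetical helpers, and the "filtered" label for a missing or non-TCP reply is a simplification.

```python
SYN, RST, ACK = 0x02, 0x04, 0x10  # TCP flag bits


def classify_reply(flags):
    """Interpret the TCP flags of a reply to a SYN probe (None means no reply)."""
    if flags is None:
        return "filtered"
    if flags & SYN and flags & ACK:
        return "open"
    if flags & RST:
        return "closed"
    return "unknown"


def syn_probe(host, port, timeout=1.0):
    """Send a single forged SYN packet and classify the response.

    Sending raw packets normally requires root or administrator privileges.
    """
    from scapy.all import IP, TCP, sr1  # third-party; imported only when probing

    reply = sr1(IP(dst=host) / TCP(dport=port, flags="S"), timeout=timeout, verbose=0)
    if reply is None or not reply.haslayer(TCP):
        return classify_reply(None)
    return classify_reply(int(reply[TCP].flags))
```

Unlike a full connection made with Socket, this probe never completes the handshake, which is the sort of low-level control that makes Scapy useful.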
Requests is pretty self-explanatory: it allows programmers to send HTTP requests from their scripts. This is useful in pen testing because it lets you craft custom payloads and attacks against web applications.
Requests can reproduce much of the functionality of a tool like Burp Suite, with more customization to your needs. Imperva researchers found that Requests was the most popular Python library in web-based attacks, used in 89% of Python-based attacks.
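As one example of a custom payload script, the sketch below sends a list of test strings in a query parameter and reports which ones the server echoes back unchanged. `find_reflections` is a hypothetical helper, and the URL and parameter name in the usage note are placeholders. An echoed payload is only a hint that a parameter deserves a closer look, not proof of a vulnerability.

```python
def find_reflections(url, param, payloads, session=None, timeout=5):
    """Send each payload in query parameter `param` and report which come back verbatim."""
    if session is None:
        import requests  # third-party; a Session reuses the connection across probes

        session = requests.Session()

    reflected = []
    for payload in payloads:
        response = session.get(url, params={param: payload}, timeout=timeout)
        if payload in response.text:
            reflected.append(payload)
    return reflected
```

A call might look like `find_reflections("https://example.com/search", "q", ["<script>alert(1)</script>"])`, run only against applications you are authorized to test.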
Beautiful Soup specializes in assisting the information-gathering phase of penetration testing.
It allows you to parse data from HTML and XML files, letting you automate data-scraping tasks. Data scraping can be important during the open-source intelligence phase of a penetration test, as this phase is dedicated to finding as much information about the target of the test as possible.
For this reason, you may want to create scripts to automate this phase, searching in places like GitHub to find information on your target company. This information could include IP addresses, or user IDs and passwords that are often accidentally committed by developers to public repositories.
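A small scraping sketch for this phase might collect every link on a page and group the links by host, which shows what other sites a target points to. The function names are made up for this example. The HTML is assumed to have been fetched already, and the parsing relies on Beautiful Soup's `find_all` with the built-in `html.parser`.

```python
from urllib.parse import urljoin, urlparse


def extract_links(html, base_url):
    """Return every hyperlink in `html` as an absolute URL."""
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


def group_by_host(links):
    """Group absolute URLs by hostname to show which sites a page points to."""
    hosts = {}
    for link in links:
        host = urlparse(link).hostname
        if host:  # skips mailto: and other links without a hostname
            hosts.setdefault(host, []).append(link)
    return hosts
```

Chaining the two, as in `group_by_host(extract_links(html, base_url))`, gives a quick map of a page's outbound links that you can then follow up by hand.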
Each of these libraries adds important functionality, but to get proficient with writing scripts related to security, it’s best to learn them in a structured way.
When it comes to automation tools, I highly recommend these two resources because they cover all of the core Python libraries used in automation of everyday tasks, and they guide you through several projects that you can put in your portfolio to demonstrate your knowledge to a recruiter.
- Automatetheboringstuff.com: This free ebook walks you through all of these libraries and more, related to automating everyday work tasks using Python. It is by far the most comprehensive guide I’ve found and comes with practice exercises, projects, and walkthroughs.
- Google’s Automation with Python Professional Certificate: Google has a crash course to introduce you to the language and walks you through important aspects of automation for an IT professional.
As you’re learning Python, I would highly recommend you keep all of the code that you write in these courses and use it in a portfolio. An easy and free way to do this is through a GitHub portfolio.
Each of these courses comes with several practice project ideas that you can do, but some of the key skills you want to demonstrate are the ability to read and write to files, extract information from text, and interact with online services through Application Programming Interfaces (APIs).
If you’re interested in learning Python directly for pen testing, here are some good places to start. These books go into great detail on how to use Python for security-specific activities, like automating security tasks, developing tools for security testing, and writing scripts for computer forensics. They are also well respected by the security community, which is a testament to their quality.
- Violent Python: A Cookbook for Hackers, Forensic Analysts, Penetration Testers and Security Engineers
- Black Hat Python: Python Programming for Hackers and Pentesters
- Grey Hat Python: Python Programming for Hackers and Reverse Engineers
The ability to program is a valuable asset for any aspiring security professional, especially if you’re interested in technical roles such as being a security engineer or penetration tester.
Python Is Crucial in Cybersecurity
Python is the most prevalent programming language in cybersecurity, and demonstrating your ability to program in this language can greatly improve your chances of landing a job.
To build a strong programming portfolio, focus on demonstrating that you can automate everyday tasks with Python as well as create security tools for pen testing web applications, networks, and computer systems.
Cybersecurity is one of the highest paying tech industries, and it’s only projected to grow, presenting a big opportunity for those who are qualified.