As I reflect on my early days learning about digital forensics and incident response, I recall a particularly challenging experience that had an impact on my approach to threat hunting. It was during a SANS FOR500 course where we were presented with a case involving a missing person’s laptop hard drive. The capstone task was: analyze the device and determine what happened to the individual.
I froze, overwhelmed by the number of techniques I had learned but didn’t know how to apply. That experience taught me the importance of frameworks in analysis: with a clear framework to follow, I would have been more confident and effective.
That’s why I created the PRECEED framework, an acronym that represents a model designed to help threat hunters understand and stop insider threats. By categorizing activities into seven phases (Pivot Point, Reconnaissance, Evasion (Pre-Exfiltration), Collection, Exfiltration, Evasion (Post-Exfiltration), and Damage), we can develop a proactive approach to identify potential insider threats.
Understanding the PRECEED Framework
The PRECEED framework offers a structured approach to insider threat hunting, ensuring that you don’t miss critical indicators throughout the incident. Here’s how each element fits into the model:
Pivot Point: This marks the beginning of the malicious activity sequence. It could be triggered by recruitment from a foreign nation-state, bribery, or negative interactions with managers.
Reconnaissance: An insider may start exploring internal systems to gather sensitive information or identify methods for data extraction.
Evasion (Pre-Exfiltration): Within the PRECEED framework, there are two phases of evasion. Pre-exfiltration Evasion involves avoiding security controls to aggregate and exfiltrate data.
Collection: The insider is actively moving or downloading data in preparation for exfiltration.
Exfiltration: This phase involves removing data from the environment, possibly by uploading it to a website, printing the information, or transferring it to a removable storage device.
Evasion (Post-Exfiltration): This phase focuses on covering tracks and deleting evidence (e.g. downloading and executing CCleaner).
Damage: We won’t focus our hunting efforts on this final phase of the insider threat incident, because by the time you reach it, the insider has achieved their actions on objectives and the incident has been discovered.
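One way to make the phases usable during a hunt is to treat them as an ordered list and tag each observation with the phase it belongs to. The following is a minimal sketch, not part of the framework itself: the names Phase, HUNTABLE, and furthest_phase are hypothetical, and the two sample observations simply reuse examples from the phase descriptions above.

```python
from enum import IntEnum


class Phase(IntEnum):
    """PRECEED phases, numbered in the order they are described above."""
    PIVOT_POINT = 1
    RECONNAISSANCE = 2
    EVASION_PRE_EXFIL = 3
    COLLECTION = 4
    EXFILTRATION = 5
    EVASION_POST_EXFIL = 6
    DAMAGE = 7


# Damage is left out of hunting: by then the incident has been discovered.
HUNTABLE = [p for p in Phase if p is not Phase.DAMAGE]


def furthest_phase(observations):
    """Return the latest phase among (phase, note) pairs, or None if empty."""
    phases = [phase for phase, _note in observations]
    return max(phases) if phases else None


observations = [
    (Phase.EXFILTRATION, "files copied to removable storage"),
    (Phase.EVASION_POST_EXFIL, "CCleaner downloaded and executed"),
]
print(furthest_phase(observations).name)  # EVASION_POST_EXFIL
```

Knowing the furthest phase observed tells you which earlier phases to hunt backwards through for supporting evidence.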
The PRECEED framework is more than just a model – it’s a tool that empowers you with confidence and expertise in threat hunting. By leveraging its structure, you can identify potential threats earlier, contain them faster, and ultimately prevent data breaches and other security incidents.
Hypothesis chaining is a method that enables threat hunters to narrow down search results during a hunt by appending or branching off of their original hypothesis.
This technique helps threat hunters take an overwhelming output from a hunt and continue their investigation without being hindered by a large volume of results.
For example, a threat hunter hypothesizes that threat actors may be using DGAs (Domain Generation Algorithms) for phishing, malware distribution, or command and control (C2). They generate a hunt in DNS or Secure Web Gateway logs, looking for users interacting with .xyz TLDs. However, upon running the initial query to validate their hypothesis, the results yield tens of thousands of entries. Despite efforts to group or count the domains, there are still over 1,000 distinct domains in the results.
This is where hypothesis chaining becomes useful. The threat hunter can append one or more sub-hypotheses to further narrow the results.
Original hypothesis: Threat actors may be using DGAs for phishing, malware distribution, or C2.
Sub-hypothesis 1: The malicious domain may be used for phishing; therefore, we filter the original result set down to domains with more than 20 transactions.
Sub-hypothesis 2: The malicious domain may be used for C2, which we hypothesize would average one heartbeat every hour (regardless of jitter). We therefore narrow the results further to domains that were interacted with 118-218 times in the last 7 days (1 transaction per hour × 24 hours = 24 per day, or 168 per week, plus or minus 50 = 118-218).
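The chain above can be sketched as a series of small filters, each taking the previous result set as input. This is an illustration only: the function names are hypothetical, the events list stands in for domains pulled from DNS or Secure Web Gateway logs over 7 days, and the thresholds are the ones stated in the sub-hypotheses.

```python
from collections import Counter

HOURS_PER_WEEK = 24 * 7   # 168 expected hourly heartbeats in 7 days
TOLERANCE = 50            # plus or minus 50, as in sub-hypothesis 2


def count_by_domain(events):
    """Original hypothesis: keep only .xyz domains and count transactions."""
    return Counter(d for d in events if d.lower().endswith(".xyz"))


def chain_phishing(counts, min_hits=20):
    """Sub-hypothesis 1: keep domains with more than min_hits transactions."""
    return {d: n for d, n in counts.items() if n > min_hits}


def chain_c2(counts, expected=HOURS_PER_WEEK, tolerance=TOLERANCE):
    """Sub-hypothesis 2: keep domains hit roughly once per hour for 7 days."""
    low, high = expected - tolerance, expected + tolerance
    return {d: n for d, n in counts.items() if low <= n <= high}


# Made-up sample data: one domain name per logged transaction.
events = ["a.xyz"] * 5 + ["b.xyz"] * 170 + ["c.xyz"] * 900 + ["example.com"] * 300

counts = count_by_domain(events)
print(chain_c2(chain_phishing(counts)))  # {'b.xyz': 170}
```

Because each filter accepts and returns the same shape of data, sub-hypotheses can be appended in sequence or branched off the original counts independently.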
Running an “ipport:53” process search in Carbon Black has helped me locate hosts and processes performing an abnormal number of DNS queries. Recently, however, I wanted to hunt for processes generating punycode IDN DNS queries on a client network. Alerts were triggered by the client’s SIEM, which was ingesting Microsoft Server DNS logs. The Sumo Logic alert was created using the following query (for reference):
_sourceCategory=Prod/*/DC/Windows/DNS | parse regex "^(?[1]?\d/[123]?\d/\d{4}) (?)$" | replace (question_name, /^(\d+)/, "") as question_name | replace (question_name, /(0)$/, "") as question_name | replace (question_name, /(\d+)/, ".") as question_name | where question_name matches /xn--/ | count by question_name, ip_address
Once a DNS query triggered an email alert, I pivoted into Carbon Black using the process search query: domain:xn--*
Double-clicking on one of the results, we can see a ‘netconn’ to a punycode IDN that looks like LastPass.
If the punycode is converted to Unicode, we can see special characters at the end of the domain:
While the illustration above is a false positive, it demonstrates the procedure for hunting the root cause (the process) that generated a punycode IDN query, whereas DNS server log analysis only shows the source system that made the query. Alerting on punycode IDNs often uncovers phishing attempts and user access of malicious or foreign sites. If logs are not being ingested by a SIEM, Carbon Black watchlists could be used as an alternative.
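The clean-up and conversion steps described above can be sketched in Python. This is an illustration under assumptions: it assumes the question name arrives in the length-prefixed form used in Windows DNS debug logs, such as (3)www(7)example(3)com(0), the function names are hypothetical, and the sample domain is a well-known benign IDN rather than the one from this hunt. Python's built-in idna codec follows the older IDNA 2003 rules, which is enough for a quick look.

```python
import re


def normalize_dns_name(raw):
    """Turn '(3)www(7)example(3)com(0)' into 'www.example.com'."""
    name = re.sub(r"^\(\d+\)", "", raw)    # drop the leading length label
    name = re.sub(r"\(0\)$", "", name)     # drop the trailing root marker
    return re.sub(r"\(\d+\)", ".", name)   # remaining length labels become dots


def is_punycode(name):
    """True when any label carries the IDN 'xn--' prefix."""
    return any(label.lower().startswith("xn--") for label in name.split("."))


def to_unicode(name):
    """Decode punycode labels using Python's built-in 'idna' codec."""
    return name.encode("ascii").decode("idna")


raw = "(14)xn--mnchen-3ya(2)de(0)"
name = normalize_dns_name(raw)
if is_punycode(name):
    print(name, "->", to_unicode(name))  # xn--mnchen-3ya.de -> münchen.de
```

Comparing the decoded form against the brand it imitates makes lookalike characters obvious at a glance.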
Before starting, big shout out to Eric Zimmerman (https://github.com/EricZimmerman/) for creating so many great free DFIR tools.
KAPE can be downloaded here: https://www.kroll.com/en/insights/publications/cyber/kroll-artifact-parser-extractor-kape
KAPE is a standalone program that does not need to be installed. Decompress the zip file to a directory of your choosing and you are ready to go. KAPE requires administrator rights when run (luckily Carbon Black Live Response runs as SYSTEM). Once KAPE is unzipped, the following files and directories will be available:
If you’re new to using KAPE, start with gkape.exe (GUI version) to help build the command line for you:
Log into Carbon Black and go to the endpoint/sensor of the suspect system.
Click the “Go Live” button in the top right corner.
Carbon Black Live Response will drop you into C:\Windows\CarbonBlack\.
Type the following to execute cmd.exe, make the KAPE directory, move into the KAPE directory, and create a Targets directory on the suspect machine:
Move the kape.exe binary into C:\Windows\CarbonBlack\KAPE and the targets file (e.g. !SANS_Triage.tkape) into the C:\Windows\CarbonBlack\KAPE\Targets directory.
> put C:\Windows\CarbonBlack\KAPE
> put C:\Windows\CarbonBlack\KAPE\Targets
You’re ready to execute KAPE CLI. As a reminder, I recommend using the GUI version of KAPE to build your syntax.
KAPE will take a few minutes to collect all the evidence. In the example below, it took 312 seconds to collect 2.5 GB of data. The part that takes some time is downloading the zipped VHDX. Alternatively, you could set up KAPE to send the evidence to an S3 bucket or FTP server.
Once the evidence collection is done, download the evidence:
> get "C:\Windows\CarbonBlack\KAPE\example_file.zip"
Once you have the .zip file on your system and extracted, you’ll see a .vhdx file that can be mounted on a Windows 10+ system.
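If you prefer to script the extraction step on the analysis workstation, a minimal Python sketch could look like the following. The function name extract_vhdx and the destination folder are hypothetical; it only unpacks the archive and reports where the .vhdx landed, and mounting is still done in Windows.

```python
import zipfile
from pathlib import Path


def extract_vhdx(zip_path, dest_dir):
    """Extract an evidence zip and return the paths of any .vhdx files inside."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    # Search recursively in case the archive nests the image in subfolders.
    return sorted(p for p in dest.rglob("*") if p.suffix.lower() == ".vhdx")
```

For example, extract_vhdx("example_file.zip", "evidence") would unpack the download from the previous step into an evidence folder and return a list of the .vhdx paths it found.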
From here, you can use a forensic tool like AXIOM to parse the triage data or manually browse the data:
I often get asked which books are worth reading for an aspiring InfoSec pro. While there are many great books out there, each serving a unique purpose, here are my general recommendations.
The Basics:
Wireshark 101: Essential Skills for Network Analysis – Second Edition: Wireshark Solution Series
Practical Packet Analysis, 3E: Using Wireshark to Solve Real-World Network Problems
The Art of Deception: Controlling the Human Element of Security
Defensive Security Handbook: Best Practices for Securing Infrastructure
Open Source Intelligence Techniques: Resources for Searching and Analyzing Online Information
TCP/IP Illustrated, Volume 1: The Protocols (2nd Edition) (Addison-Wesley Professional Computing Series)
Linux Basics for Hackers: Getting Started with Networking, Scripting, and Security in Kali
A Bug Hunter’s Diary
Ghost in the Wires
Violent Python: A Cookbook for Hackers, Forensic Analysts, Penetration Testers and Security Engineers
Offensive Security Books:
RTFM: Red Team Field Manual
The Hacker Playbook 3: Practical Guide To Penetration Testing
Penetration Testing: A Hands-On Introduction to Hacking
The Hacker Playbook: Practical Guide To Penetration Testing
Hash Crack: Password Cracking Manual (v3)
The Hacker Playbook 2: Practical Guide To Penetration Testing
Defensive Security Books:
Blue Team Field Manual (BTFM)
The Practice of Network Security Monitoring: Understanding Incident Detection and Response
Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software
The Art of Memory Forensics: Detecting Malware and Threats in Windows, Linux, and Mac Memory
Blue Team Handbook: SOC, SIEM, and Threat Hunting (V1.02): A Condensed Guide for the Security Operations Team and Threat Hunter
Blue Team Handbook: Incident Response Edition: A condensed field guide for the Cyber Security Incident Responder. Second Edition.
Network Forensics: Tracking Hackers Through Cyberspace
The Tao of Network Security Monitoring: Beyond Intrusion Detection
NOTE: This vulnerability was discovered in early April of 2018. I immediately contacted Intuit support and their security team to responsibly disclose the vulnerability. I offered suggestions and help in finding a solution to protect the public at no cost. As of 12/25/2018 the software remains vulnerable, and I am disclosing my findings in the hope of better security for all tax software.
While wrapping up a penetration test earlier this year at a CPA firm, I discovered unusual amounts of SMBv2 traffic traversing the network. I usually find some type of data exposure while capturing packets, but this was different. Most of the time it’s self-service password reset tools, scripts, or other IT tools spewing credentials in unencrypted cleartext on a flat network. This time the data exposure was more than I’d expected. I saw a lot of SMBv2 traffic with paths such as \\[server_name]\lacerte\17tax\idata\[customer_last_name]. I thought I’d found another low- to medium-risk item for my report: I could tell my client that their tax software, Intuit Lacerte, was exposing their customers’ last names. After the capture, I analyzed the results and decided to follow one of the SMB TCP streams to see what data was being accessed by the software. Maybe I’d find bad permissions or more data leaking from the tax software.
To my surprise, when the CPA authenticated to the client-side software, it made a request to read the customer list from the server. The server responded over SMB dumping the entire customer database.
What I saw on the screen was over 1,000 customer records containing the following for each person: Client ID, First Name & Initial, Last Name, Title, Social Security Number, Occupation, Date of Birth, Dependencies, Spouse’s First Name & Initial, Spouse’s Last Name, Spouse’s Title, Spouse’s Social Security Number, Spouse’s Occupation, Spouse’s Date of Birth, Home Address, Home Phone, Work Phone, Work Extension, Daytime Phone, Mobile Phone, Fax Number, Email Address, Spouse’s Home Phone, Spouse’s Work Phone, Spouse’s Extension, Spouse’s Daytime Phone, Spouse’s Mobile, Spouse’s Fax Number, Spouse’s Email Address, Driver’s License Number, Spouse’s Driver’s License Number, and more.
There isn’t much more a criminal could ask for. Everything needed to commit identity theft or fraudulently file taxes on 1000 people’s behalf is presented on a silver platter to anyone on the local network or a compromised workstation.
In my penetration test and lab setup, I tested the software in a client/server configuration, which is the common setup for CPA firms with more than one employee.
After further testing, I discovered that the tax software exposes every customer in the database immediately after the CPA launches the Lacerte client on their workstation (or authenticates to it, which is not required by default). The client software submits an SMBv2 Read Request for \\[server_name]\[share]\idata\DATA1I17.dbf. The server happily responds with an SMBv2 Read Response containing the contents of the .dbf file.
Below is a sample of captured data when launching the Lacerte client on a workstation:
The following scenarios could be leveraged by a bad actor to obtain the exposed data traversing the network:
Be on the same collision domain as the client/server (e.g. same Wireless network)
Be on the same broadcast domain (e.g. switched network) and perform ARP poisoning
Any number of other man-in-the-middle scenarios
I continued to test the Lacerte software while running a packet capture tool on the collision domain to find out how often or what other events would trigger data exposure. In addition to the initial client application launch, I was able to capture other data leaks when almost any field within a customer’s record was modified.
While single-field modifications didn’t trigger the entire database to be retransmitted, changing values such as the dollar amount for Wages earned generated large SMB transfers. Analyzing the SMB traffic, I could see the client send the server the entire customer record containing Client ID, Full Name, SSN, Phone Number, etc. as mentioned above. The server then responds to the client with the same copy of the data. The server then sends a tabulated copy of the customer’s financial record with information such as Wages, Social Security Wages, Medicare, Tips, Depend Benefits, Tax Withholding, Social Security Withholding, etc. Essentially, if a bad actor were able to capture these packets, he or she would have a copy of the customer’s tax return and be able to identify targets based on income level.
Below is an example of modifying the Wages field for the test customer from a blank value to 9999991 and the corresponding capture in Wireshark to illustrate the data was intercepted in transit.
Initially, when I reported this vulnerability, Intuit denied that it was a vulnerability at all, blaming the underlying unencrypted cleartext protocol (SMBv2) instead. I responded that this is like building a bank website using HTTP instead of HTTPS and blaming HTTP when customers’ logins are stolen.
Further investigation of the tax software demonstrated that the Intuit Lacerte databases are not only traversing the network in unencrypted cleartext, they are also stored on the server in unencrypted cleartext.
The Intuit Lacerte software does not require a password upon setup, although there is an option to password protect access to the tax software. However, access controls to the database can be easily bypassed, since the sensitive data in Lacerte is stored in unencrypted cleartext as strings.
In the screenshot below, I extracted strings from the data1i17.dbf file to see what was readable without using a username/password. To my surprise, I found all customer records containing Full Name, SSN, Driver’s License Number, Address, etc. as seen below:
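For readers who want to reproduce this kind of check on their own test data, here is a rough Python equivalent of running a strings utility over a file and looking for SSN-shaped values. It is a sketch, not the tooling used in the screenshot: the function names and the SSN pattern are assumptions, and the sample bytes are synthetic stand-ins rather than real Lacerte data.

```python
import re

PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")     # 4+ printable ASCII bytes
SSN_LIKE = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")   # 123-45-6789 or 123456789


def extract_strings(data):
    """Return printable ASCII runs from raw bytes, like the strings utility."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def find_ssn_like(strings):
    """Return substrings that look like a Social Security Number."""
    hits = []
    for s in strings:
        hits.extend(SSN_LIKE.findall(s))
    return hits


# Synthetic bytes standing in for a database file; not real customer data.
sample = b"\x00\x01DOE       JOHN      \x00\x02123-45-6789\x00\xff\x10ab\x00"
strings = extract_strings(sample)
print(find_ssn_like(strings))  # ['123-45-6789']
```

To run it against a real file, read the file in binary mode and pass the bytes to extract_strings. Any hits mean the data is readable without going through the application's login.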
One of the foundational security concepts for security professionals is the CIA triad. We must ensure Confidentiality, Integrity, and Availability of data and systems. Confidentiality means ensuring only those who are authorized to access data can access it. That includes at rest, in transit, and in process. Intuit Lacerte failed to provide confidentiality of customers’ sensitive personally identifiable information and tax records at rest and in transit. I have not yet tested what data is being leaked in process; however, I have my assumptions.
I have responsibly reported the vulnerability to the National Vulnerability Database:
CVE-2018-11338
Intuit Lacerte 2017 for Windows in a client/server configuration transfers the entire customer list in cleartext over SMB, which allows attackers to (1) obtain sensitive information by sniffing the network or (2) conduct man-in-the-middle (MITM) attacks via unspecified vectors. The customer list contains each customer’s full name, social security number (SSN), home address, job title, tax year being filed, phone number, Email address, spouse’s phone number/Email address, and other sensitive information.
After the client software authenticates to the server database, the server sends the customer list to the client over SMB in cleartext. Following the TCP stream displays the entire customer list.
There is no need for further exploitation as all sensitive data is exposed. This vulnerability was validated on Intuit Lacerte 2017, however older versions of Lacerte may be vulnerable.
NOTE: This vulnerability was discovered in late July of 2018. I immediately contacted Thomson Reuters support and their sales team, and tweeted at them, to responsibly disclose the vulnerability. I offered suggestions and help in finding a solution to protect the public at no cost.
After recent discoveries of the Intuit Lacerte data exposure vulnerability and my talk being accepted to the Breaking Ground track at BSidesLV 2018, I decided to test other popular tax prep software.
My lab setup includes a client and a server. I installed the Thomson Reuters UltraTax CS software on the server, which auto-shared the directory. The share permissions included Administrator and SYSTEM.
I navigated to the client system and mapped the share, allowing the client software to be installed. After modifying the location of the customer database to point to the location on the server, I opened a packet capturing tool and re-launched the client application.
Unlike the massive client database dump Intuit Lacerte transferred over unencrypted cleartext, UltraTax CS prompted for a client to be selected before the data started to leak.
Similar to the way I saw data being leaked by Intuit Lacerte, once a customer is selected, all their sensitive information is transferred over the network in cleartext. I can see the following unencrypted data was captured: Client ID, Full Name, Spouse’s Full Name, Social Security Number, Spouse’s Social Security Number, Occupation, Spouse’s Occupation, Daytime Phone, Home Phone, Tax Preparer, Federal and State Taxes to File, Bank Name and Bank Account Number.
I assume the CPAs are maintaining a record of customers’ bank information for e-file purposes; however, I can only imagine that a criminal performing a man-in-the-middle attack would appreciate also getting the taxpayer’s bank account number.
Below is a screenshot of the data being leaked by Thomson Reuters UltraTax CS upon opening a customer in the software.
There isn’t much more a criminal could ask for. Everything needed to commit identity theft or fraudulently file taxes on 1000 people’s behalf is presented on a silver platter to anyone on the local network or a compromised workstation.
The following scenarios could be leveraged by a bad actor to obtain the exposed data traversing the network:
Be on the same collision domain as the client/server (e.g. same Wireless network)
Be on the same broadcast domain (e.g. switched network) and perform ARP poisoning
Any number of other man-in-the-middle scenarios
Further investigation of the tax software demonstrated that the UltraTax databases are not only traversing the network in unencrypted cleartext, they are also stored on the server in unencrypted cleartext.
The UltraTax CS software does not require a password upon setup, although there is an option to password protect access to the tax software. However, access controls to the database can be easily bypassed, since the sensitive data in UltraTax is stored in unencrypted cleartext as strings.
In the screenshot below, I extracted strings from the \\[server_name]\WinCSI\UT17DATA\[customer_name]\U0001TXP.XX17 file (note: database names could be different in other environments) to see what was readable without using a username/password. To my surprise, I found all customer records containing Full Name, SSN, Driver’s License Number, Address, etc. as seen below:
One of the foundational security concepts for security professionals is the CIA triad. We must ensure Confidentiality, Integrity, and Availability of data and systems. Confidentiality means ensuring only those who are authorized to access data can access it. That includes at rest, in transit, and in process. UltraTax CS failed to provide confidentiality of customers’ sensitive personally identifiable information and tax records at rest and in transit. I have not yet tested what data is being leaked in process; however, I have my assumptions.
There is no need for further exploitation as all sensitive data is exposed. This vulnerability was validated on UltraTax CS 2017, however older versions of UltraTax may be vulnerable.
Summary of Findings:
Thomson Reuters UltraTax CS for Windows in a client/server configuration transfers customer records and bank account numbers in unencrypted cleartext over SMBv2, which allows attackers to (1) obtain sensitive information by sniffing the network or (2) conduct man-in-the-middle (MITM) attacks via unspecified vectors. The customer record transferred in cleartext contains: Client ID, Full Name, Spouse’s Full Name, Social Security Number, Spouse’s Social Security Number, Occupation, Spouse’s Occupation, Daytime Phone, Home Phone, Tax Preparer, Federal and State Taxes to File, Bank Name, Bank Account Number, and possibly other sensitive information.
UltraTax stores customer data in unique directories (%install_path%\WinCSI\UT17DATA\[client_ID]\[file_name].XX17), and its access controls can be bypassed without authentication by examining the strings of the .XX17 file. The strings stored in the .XX17 file contain each customer’s: Full Name, Spouse’s Name, Social Security Number, Date of Birth, Occupation, Home Address, Daytime Phone Number, Home Phone Number, Spouse’s Address, Spouse’s Daytime Phone Number, Spouse’s Social Security Number, Spouse’s Home Phone Number, Spouse’s Occupation, Spouse’s Date of Birth, and Spouse’s Filing Status.
There is no need for further exploitation, as all sensitive data is exposed without need for authentication. This vulnerability was validated on UltraTax CS 2017; however, older versions of UltraTax may be vulnerable.