Don’t Upload Your HAR Files Anywhere (And If You Do, Encrypt Them)

This blog post is the first part in a series called “Let’s Go on an Internet Safari.” The series will explore internet-facing vulnerabilities, misconfigurations, exposures and other “goldmines” that can be found proactively for good, or “evil” (if you’re an authorized pen tester!).

 

 

Lessons from the Okta Breach for Managing Browser Logs and Tokens

I was recently researching the digital footprints and service exposures of popular cloud-hosted SIEM solutions on Shodan and VirusTotal (VT) when I came across some interesting file relations that VT listed as referring files. The first four files seemed to be ordinary licensing agreements that had been uploaded for unknown reasons. However, among them, I found a .HAR file that contained a high-privilege MFA-authenticated SSO token.

 

For those who are not familiar with HAR files, they are the sensitive browser activity logs that were obtained by a threat actor in the November 2023 Okta security incident. In this blog post, I will discuss what HAR files are, explain how to hunt for them on VirusTotal (and other platforms), address the risks they pose to your organization if leaked and share various recommendations for defending against token theft attacks.

 

Note: I responsibly disclosed these findings by emailing the user whose SSO token was leaked and cc’ing their organization’s security team.

 

 

What is a HAR File?

The HTTP Archive (HAR) format is a JSON-based archive format that captures the interactions between a browser and a website. The file extension is generally .HAR. HAR files contain a breadth of secrets and technical details that are generally not presented to users in the browser. These files are useful for troubleshooting web apps, but they are also extremely valuable to adversaries who know a thing or two about what they’re doing. Upon examining the aforementioned HAR file I discovered on VirusTotal, I realized that it contained MFA-authenticated SSO tokens and other goodies for a large organization that was likely in the process of migrating its SIEM.
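
To make the risk concrete, here is a minimal Python sketch of how little effort it takes to pull credential-bearing headers out of a HAR file. It relies only on the layout defined by the HAR format (`log.entries[].request.headers` and `log.entries[].response.headers`). The header list, the function name and the sample document are illustrative assumptions, not taken from any of the files discussed in this post.

```python
import json

# Header names that commonly carry credentials.
# Illustrative assumption: this list is not exhaustive.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}


def find_sensitive_headers(har_text):
    """Return (url, header_name) pairs for credential-bearing headers in a HAR document."""
    har = json.loads(har_text)
    findings = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        url = request.get("url", "")
        # Request headers are checked first, then response headers.
        for header in request.get("headers", []) + response.get("headers", []):
            name = header.get("name", "")
            if name.lower() in SENSITIVE_HEADERS:
                findings.append((url, name))
    return findings


# Tiny hypothetical HAR document with placeholder values only.
SAMPLE_HAR = json.dumps({
    "log": {
        "version": "1.2",
        "creator": {"name": "WebInspector", "version": "537.36"},
        "entries": [{
            "request": {
                "method": "GET",
                "url": "https://app.example.com/api/me",
                "headers": [
                    {"name": "Accept", "value": "application/json"},
                    {"name": "Authorization", "value": "Bearer REDACTED"},
                ],
            },
            "response": {
                "status": 200,
                "headers": [{"name": "Set-Cookie", "value": "sid=REDACTED"}],
            },
        }],
    }
})
```

Calling `find_sensitive_headers(SAMPLE_HAR)` reports the `Authorization` request header and the `Set-Cookie` response header for the sample URL. A real capture typically contains hundreds of entries, so the same loop surfaces every session cookie and bearer token the browser handled during the recording.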

 


Figure 1: A redacted screenshot of a HAR file containing an MFA-authenticated SSO token

 

Conveniently, the HAR file included an email address in one of its fields, identifying the owner of the token. A LinkedIn search of this person’s name revealed they are a cybersecurity manager who specializes in a variety of security technologies. Given their role, we can safely presume they have high-privilege access to this organization’s environment. After uncovering this particularly egregious example of an inappropriate VirusTotal upload, I became curious about what I would find if I wrote a VTGrep query to search for other HAR files. I was able to identify a number of other highly sensitive HAR files belonging to different organizations. As shown below, I found files containing access tokens to an IBM cloud tenant, a HubSpot app and a lot of Roblox.

 

 


Figure 2: HAR file containing an API token for an IBM cloud tenant

 

 


Figure 3: HAR file containing an API token for a HubSpot app

 

 


Figure 4: I have no idea why there was so much Roblox. Maybe poor instructions in a ‘tuts’ somewhere?

 

 

 

Writing a VTGrep Query

VTGrep is a feature in VirusTotal Enterprise that allows users to search the content of VirusTotal using a variety of tags and filters to “find evil.” As VirusTotal states, “VT-Grep not only matches the raw content of the file, but it also searches over uncompressed and unpacked files plus VBA Code streams.” This feature is most useful for security researchers and SOC or DFIR analysts, as well as other professionals seeking to better understand threats and serve their clients. This may sound overwhelming, but it really is simple to use when searching for many types of files, such as private keys, configuration files and more.

 

The query I created for this particular hunt was rather simple. I started by looking at the specific HAR file I had observed earlier and found consistent content that I could search for. I noticed that all the HAR files I found valuable contained the field “name”: “WebInspector”. We know from the HAR format specification that all HAR files are JSON-formatted, and we only want “useful” files containing tokens and other secrets. Using a file type filter and the content string for WebInspector, I was able to produce a query that generated reams of HAR files. I then narrowed them down to only the interesting ones by further specifying that my results must contain “token,” “secret,” etc. Here is a sample query:

 

VTGrep Query: content:"WebInspector" AND content:"token" AND type:json
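
The same three criteria (JSON content, a WebInspector creator and a keyword such as “token”) can also be checked locally against files you already hold. The Python sketch below is a rough, hypothetical equivalent of the query above, not a reimplementation of VTGrep. The function name, the keyword tuple and the case-insensitive keyword match are assumptions made for illustration.

```python
import json

# Keywords that make a HAR file "interesting" (assumed list; extend as needed).
KEYWORDS = ("token", "secret")


def looks_like_interesting_har(text, creator="WebInspector", keywords=KEYWORDS):
    """Return True if text is JSON, has log.creator.name == creator, and mentions a keyword."""
    try:
        doc = json.loads(text)
    except ValueError:
        # Not valid JSON, so it cannot be a HAR file.
        return False
    log = doc.get("log") if isinstance(doc, dict) else None
    if not isinstance(log, dict):
        return False
    creator_obj = log.get("creator")
    name = creator_obj.get("name") if isinstance(creator_obj, dict) else None
    if name != creator:
        return False
    # Assumption: match keywords case-insensitively anywhere in the raw text.
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)
```

Running this over a folder of exported files gives a quick triage list. Anything it flags deserves the same handling as a password.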

 

 


Figure 5: HAR file header (Chromium browser) 

 

Using this same methodology to define the criteria of a HAR file (or any file), you could create a query on any platform to search for inadvertently exposed HAR files. I’ve done the same thing here with Google Search. In the example shown below, I found a GitHub repository containing a log from MITMProxy!

 

Google Search Query: filetype:har token

 

 


Figure 6: A Google search to find HAR files

 

 


Figure 7: Log output from MITMProxy inadvertently exposed on GitHub

 

Conclusion

Unfortunately, my time is limited, and I cannot spend my days endlessly scouring the web for treasures and goodies. But when I do, I always cherish the moments. Unlike the cited Okta incident, the root cause of the issue here is likely automated tools carelessly uploading things to VT, or users who are unaware of how sensitive these files are uploading them unwittingly. I’d like to take this opportunity to offer some advice to those who may seek to proactively hunt for similar leaks and protect their organization against token-theft attacks.

 

  1. Employ a proactive or continuous dark web monitoring and search capability.

    1. You or your service provider should search for potential data leaks or indications of possible imminent threats that may affect your organization.
    2. Look for references to your organization such as:
      1. Obscure references to your production apps
      2. Token/password/cookie etc. followed by COMPANY_NAME
      3. Config file formats that also contain your COMPANY_NAME
      4. Combo lists that contain your users’ email addresses
      5. References to honeytokens and “fake accounts” in your organization

  2. Configure conditional access and/or other token lifetime policies that will shorten the validity period of any authentication token, cookie, etc.

    1. Typically, a 7-day lifetime is recommended as a balance between user convenience and security.
    2. In sensitive environments, it might make sense to reduce token lifetimes to shorter periods, such as 1 day or even 8 hours.
    3. Services like Entra ID and Okta are continually developing new features to protect against token-theft attacks. However, many of them need to be specifically turned on by your organization. Simply “having the best license” is not enough to protect your environment. You must also use the entitlements.

  3. Do not upload HAR files or other debug logs in plaintext…ever…

    1. Should you ever need to transmit these files for debug purposes, it is highly advisable to encrypt the file before sending it.
    2. Simple solutions such as a password-protected 7-Zip archive are significantly better than transmitting the file in plaintext. Ideally, you should use a stronger encryption solution such as GNU Privacy Guard (GnuPG).
    3. Developers and admins alike should be trained on the sensitivity of HAR files and other debug logs and config files. The training should instruct them to treat these files as delicately as they would a password.

  4. Watch what you upload to VirusTotal (or any threat intelligence/malware analysis platform).

    1. Anything you upload to VirusTotal is available to security researchers, your competitors, cybercriminals and the rest of the world!
      1. If you’re a blue teamer, consider that file published to the whole internet.
      2. If you’re a red teamer, consider the upload “burned” and indexed by every decent signature-based antivirus (AV) available.
    2. If you run a security team, SOC or MSSP, you need a standard operating procedure (SOP) for use of tools like VirusTotal. Querying file hashes and atomic indicators is generally safe, but uploading files should be a last resort action performed with the understanding that you are making evidence available to the world.

  5. Look beyond VirusTotal.

    1. In this particular case, I searched within VirusTotal because it was of interest to me and was what I had readily available. But these files can easily be in many other places, like your file servers, S3 buckets, desktops and more.
    2. If you have access to Osquery, or a similar tool, hunt for any .HAR files sitting in insecure locations and remove them if possible.
    3. You could search for these files using a similar query on any threat intel platform, or possibly even Google!

  6. If available, use token binding in your organization’s IdP or other applications.

    1. One of Okta’s post-incident product enhancements was to implement session token binding based on network location to combat the threat of session token theft against Okta administrators.
    2. In practice, this means a token presented from any IP other than the one bound to it is effectively useless.

  7. Hunt for obscurities in your logs.

    1. As mentioned in my 2023 Source Zero Con talk, obscure user agents such as “Python Requests” are a dead giveaway that something funny is happening. Proactively hunting for strange or anomalous user agents in your various logs is a great way to get ahead of threats.
    2. Search for atypical token characteristics or token use from strange locations (impossible travel, odd locations, etc.).
    3. Use threat intelligence services in your log analysis workflow to identify IPs that are known to have recently conducted automated attacks and other malicious activities.

  8. Prepare your organization for incident response.

    1. Failure to plan is a plan to fail. Modern organizations should have robust incident response plans and playbooks that guide their technical teams on how to respond appropriately to incidents.
    2. Incidents that are complex by nature or scale should involve the assistance of a third-party incident response team. Having an incident response team on retainer ensures that you can have consistent quality service available to you at the best possible cost, 24/7/365.
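
For the recommendation above about encrypting HAR files before transmitting them, here is a minimal Python sketch that wraps GnuPG’s symmetric (passphrase-based) mode. It assumes the `gpg` binary is installed and on the PATH. The function names and the `.gpg` output naming are hypothetical choices for illustration.

```python
import subprocess
from pathlib import Path


def build_gpg_command(har_path):
    """Build a gpg command that symmetrically encrypts har_path to har_path + '.gpg'."""
    source = Path(har_path)
    target = source.with_name(source.name + ".gpg")
    return [
        "gpg",
        "--symmetric",             # passphrase-based encryption, no key pair needed
        "--cipher-algo", "AES256",
        "--output", str(target),
        str(source),
    ]


def encrypt_har(har_path):
    """Run gpg; it prompts for the passphrase itself, so none is stored in code."""
    subprocess.run(build_gpg_command(har_path), check=True)
```

Calling `encrypt_har("capture.har")` would produce `capture.har.gpg`. Share the passphrase over a different channel than the file, and delete the plaintext copy once the encrypted one is confirmed to decrypt.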

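For the “Look beyond VirusTotal” recommendation, a dedicated endpoint tool is the better option at scale, but the idea can be sketched in a few lines of Python for a single host or file share. This is an assumption-laden illustration: the function name is hypothetical, and it matches on the file extension only, so renamed captures would be missed.

```python
import os


def find_har_files(root):
    """Walk root recursively and return sorted paths of files ending in .har (any case)."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.lower().endswith(".har"):
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)
```

Pointing `find_har_files` at a downloads folder, a shared drive mount or a synced bucket gives a list of files to review, encrypt or delete.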
 
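For the log-hunting recommendation, the sketch below flags events whose user agent looks like a script or command-line client. The event schema (a dictionary with a `user_agent` key), the marker list and the sample events are hypothetical. In practice you would adapt the field names to your own log source and tune the markers to what is normal in your environment.

```python
# Substrings of user agents sent by common scripting and CLI clients.
# Illustrative assumption: this list is a starting point, not a complete one.
SUSPICIOUS_UA_MARKERS = ("python-requests", "curl/", "go-http-client")


def flag_suspicious_user_agents(events, markers=SUSPICIOUS_UA_MARKERS):
    """Return the events whose user agent contains any marker (case-insensitive)."""
    flagged = []
    for event in events:
        user_agent = event.get("user_agent", "").lower()
        if any(marker in user_agent for marker in markers):
            flagged.append(event)
    return flagged


# Hypothetical sign-in events for the same account.
SAMPLE_EVENTS = [
    {"user": "alice", "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    {"user": "alice", "user_agent": "python-requests/2.31.0"},
]
```

Here `flag_suspicious_user_agents(SAMPLE_EVENTS)` returns only the second event. A scripted client on an interactive user’s session is not proof of compromise, but it is a cheap signal worth correlating with token use from unusual locations.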

Justin Safa
Digital Forensics and Incident Response Consultant | Optiv
Justin Safa is a digital forensics and incident response consultant at Optiv on our Enterprise Incident Management team. Justin is a subject matter expert in Microsoft 365, Cisco Security, Carbon Black, SentinelOne and a variety of other technologies. He has led many incident response engagements involving ransomware, targeted attacks, zero-days and other sophisticated threats, and has worked for a diverse range of clients, including various levels of government and the Fortune 500.
