Being Certain about Estimative Uncertainty

I love it when my teenager says something like, “You know Dad, that’s how they designed it,” when in fact my child has no evidence to support his conclusion. It’s spoken as a hard truth, with enthusiastic declaration. If it’s a fact, fine, but stick to the facts, people. Too often we find individuals involved in cyber defense, threat research and response doing the same thing. It’s so easy to jump to conclusions, be tricked by nefarious threat agents and make mistakes in complex digital crime investigations. Sometimes a researcher just wants to be right or to have that new ‘big news’ discovery. This is why it’s so essential that analysts be certain about estimative uncertainty.

 

If there were a CIA triad for threat analysts it would probably be something like “C” for checking your facts, “I” for integrity and “A” for auditing your work. Making sure you have properly researched, qualified and articulated finished intelligence is essential. To drive this point home I often use the following real-world example in training:

 

While researching a possible exploit kit URL, a researcher receives a 404 error. After finishing the investigation, the researcher states the following in a report:

 

“The exploit kit was offline at the time of testing.”

 

The above conclusion is a perfectly normal and reasonable assumption, but it was incorrect in this case. Imagine being the analyst and company that says an exploit kit is offline when in fact it’s still up and running, and exploiting at will! In this real-world example the researcher had previously used the same IP address to research kits. The threat agents behind the exploit kit had added that IP to a blacklist, so any request from it to their exploit kit link was redirected to a fake 404 (“not found”) error page. In the security world’s game of cat and mouse between the good guys and the bad guys, this is all too common.

 

Knowing the tactics, techniques and procedures (TTPs) of nefarious actors could tip off the analyst that there may be more going on than meets the eye in the research case. But let’s face it: even if you’re a ninja you can’t know it all, and eventually something will get past all your experience, knowledge and “Spidey” senses. So, how do you avoid being wrong, misleading or tricked in such reporting and findings?

 

One thing my teenager had right was the attitude of sticking to the facts. If we do that in this case, we see that on a certain date and time, from a certain IP, operating system, user-agent, etc., a 404 error was the result. Notice that this explanation adds details not included previously. Being factual and very specific helps to create context. Countless times I’ve read through finished intel only to ask myself, “Did they test it with a Windows- or Linux-based user-agent string or software? Did they try going to the site from different geo-located IPs and countries?” Including specific details in your reporting helps the reader understand what they need to know so they can take action. Take, for example, the following:

 

“A ransomware sample was executed on a Windows 10 endpoint used by a developer but encryption did not take place.”

 

This is another real-world example where details like the operating system version and the user (a developer) matter. An individual reading the report may be able to use this in their own testing and research. It could help them discover that the installation of Python on the developer’s endpoint stopped the ransomware from executing its payload because the malware thought it was being monitored or researched. Some organizations use this type of information to get creative, such as installing Python on all machines to help mitigate the potential impact of ransomware on an endpoint.
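
To make the 404 example concrete, here is a minimal Python sketch of recording a test together with its context, so that the report states what was observed rather than what was concluded. The class name, field names, URL and IP addresses are all hypothetical, chosen for illustration only, and are not taken from any real tool or case.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """One test of a URL, with the context needed to repeat it (hypothetical schema)."""
    url: str
    observed_at: datetime  # should be timezone-aware
    source_ip: str
    operating_system: str
    user_agent: str
    http_status: int

    def as_finding(self) -> str:
        # State only what was seen; "offline" would be a conclusion, not a fact.
        when = self.observed_at.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
        return (
            f"On {when}, a request to {self.url} from {self.source_ip} "
            f"({self.operating_system}, user-agent '{self.user_agent}') "
            f"returned HTTP {self.http_status}."
        )


def statuses_agree(observations: list[Observation]) -> bool:
    """True when every recorded test saw the same HTTP status."""
    return len({o.http_status for o in observations}) <= 1


# Made-up example data: the same URL tested from two different source IPs.
first = Observation("http://kit.example/landing",
                    datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc),
                    "192.0.2.10", "Windows 10", "Mozilla/5.0", 404)
second = Observation("http://kit.example/landing",
                     datetime(2024, 1, 15, 14, 45, tzinfo=timezone.utc),
                     "198.51.100.7", "Linux", "curl/8.0", 200)

print(first.as_finding())
print(statuses_agree([first, second]))  # differing results deserve a closer look
```

If tests from different source IPs disagree, that is a reason to investigate further before writing anything about whether the kit is online or offline.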

 

Also, using qualified estimative language is essential in writing finished intelligence. While you can go off the deep end on this subject, it comes down to accurately and objectively describing your findings instead of presenting conclusions and interpretations as fact. Take, for example, the tricky subject of passive DNS, where former domains are associated with an IP that may or may not be related to a threat investigation:

 

“The following passive domains were found to be related to the espionage C&C IP within the past six months. Based upon the format of the domain names and patterns of WHOIS registration, it appears likely that these domains are part of the espionage campaign.”

 

It’s easy to say, “Ah ha! We have more C&Cs baby!” Everyone wants to be the superhero or make that discovery, but in a big data global world we all increasingly value signal over noise. You can establish standard terms, backed by policy and procedure, to express likelihood, certainty, confidence, and the trustworthiness of sources and analysts as you write your finished intel. This is how one can be certain about the uncertainty that comes with intel, since we normally see only bits and pieces of much larger campaigns and nefarious activities.

 

As a practical take-away, consider using the following terms of estimative probability in your reporting: almost certain, probable, chances about even, probably not, almost certainly not. When you’re done you’ll find that you’ve likely done a better job of assessing uncertainty than of eliminating it (pun intended).
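
One way to apply these terms consistently is to agree on a probability band for each one. The Python sketch below shows the idea. The numeric cut-offs are illustrative assumptions only (this article does not define any), and the function name is hypothetical. An organization would set its own bands in policy.

```python
# Illustrative lower bounds only; these numbers are assumptions, not a standard
# defined in this article. Ordered from most to least likely.
ESTIMATIVE_TERMS = [
    (0.87, "almost certain"),
    (0.63, "probable"),
    (0.40, "chances about even"),
    (0.20, "probably not"),
    (0.00, "almost certainly not"),
]


def estimative_term(probability: float) -> str:
    """Map an analyst's estimated probability (0 to 1) to an agreed term."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for lower_bound, term in ESTIMATIVE_TERMS:
        if probability >= lower_bound:
            return term
    return ESTIMATIVE_TERMS[-1][1]  # unreachable, since the last bound is 0.0


print(estimative_term(0.7))   # probable
print(estimative_term(0.05))  # almost certainly not
```

The value is less in the exact numbers than in every analyst on a team meaning the same thing by the same word.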

 

In conclusion, it’s your job to establish integrity in the outcome of your research and response derivatives. Taking full ownership, cautioning against assumptions and leaning heavily upon tradecraft like estimative geek speak within finished intelligence (FINTEL) is essential in an increasingly complex global cyber threat landscape. Stick to the facts that you can prove and repeat in your findings, and let the reader decide next steps as they consume your intel.

Ken Dunham
Senior Director, Technical Cyber Threat Intelligence
Ken Dunham has spent 30 years in cybersecurity, consulting in adversarial counterintelligence, forensics, Darknet Special Ops, phishing and hacking schemes, AI/BI, machine learning and threat identification.