PROMPT:
After completing your Chapter 14 reading, consider the “Intelligence Cycle for NSM” section and how IDS and IPS technologies help you gather data about activities in your network. In your initial post, select one step of the intelligence cycle and discuss how IDS/IPS false positives or negatives could impact your selected step.
CHAPTER 14: Friendly and Threat Intelligence
Abstract
The ability to generate intelligence related to friendly and hostile systems can be the defining factor that makes or breaks an investigation. This chapter begins with an introduction to the traditional intelligence cycle and how it relates to NSM analysis intelligence. Following this, we look at methods for generating friendly intelligence by generating asset data from network scans and leveraging PRADS data. Finally, we examine the types of threat intelligence and discuss some basic methods for researching tactical threat intelligence related to hostile hosts.
Keywords
Network Security Monitoring; Analysis; Intelligence; Threat; Hostile; Friendly; PRADS; nmap; Tactical; Strategic; Intel
CHAPTER CONTENTS
The Intelligence Cycle for NSM
Defining Requirements
Planning
Collection
Processing
Analysis
Dissemination
Generating Friendly Intelligence
The Network Asset History and Physical
Defining a Network Asset Model
Passive Real-time Asset Detection System (PRADS)
Making PRADS Data Actionable
Generating Threat Intelligence
Researching Hostile Hosts
Internal Data Sources
Open Source Intelligence
Researching Hostile Files
Open Source Intelligence
Conclusion
Intelligence has many definitions depending on the application. The definition that most closely aligns with NSM and information security is drawn from Department of Defense Joint Publication 1-02, which says that “intelligence is a product resulting from the collection, processing, integration, evaluation, analysis, and interpretation of available information concerning foreign nations, hostile or potentially hostile forces or elements, or areas of actual or potential operations.”1
While this definition might not fit perfectly for a traditional SOC performing NSM services (particularly the part about information concerning foreign nations), it does provide the all-important framing required to begin thinking about generating intelligence. The key component of this definition is that intelligence is a product. This doesn’t mean that it is bought or sold for profit, but more specifically, that it is produced from collected data, based upon a specific requirement. This means that an IP address, or the registered owner of that address, or the common characteristics of the network traffic generated by that IP address are not intelligence products. When those things are combined with context through the analysis process and delivered to meet a specific requirement, they become an intelligence product.
Most SOC environments are generally concerned with the development of two types of intelligence products: friendly intelligence and threat intelligence. In this chapter, we will take a look at the traditional intelligence cycle and methods that can be used to generate these intelligence products. This includes the creation of friendly intelligence products, as well as threat products associated with tactical threat intelligence. While reading, you should keep in mind that there are many components to intelligence as a whole, and we are only covering a small subset of that here.
The Intelligence Cycle for NSM
The generation of intelligence products in a SOC requires the coordinated effort of multiple stakeholders within the organization. Because there are so many moving parts, it helps to organize the intelligence generation process into a structured, repeatable framework. The framework that the government and military intelligence community (IC) have relied on for years is called the Intelligence Cycle.
Depending on the source you reference, the intelligence cycle can be broken down into any number of steps. For the purposes of this book, we will look at a model that uses six steps: defining requirements, planning, collection, processing, analysis, and dissemination. These steps form a cycle that can continually feed itself, ultimately allowing its products to shape how newer products are developed (Figure 14.1).
FIGURE 14.1 The Traditional Intelligence Cycle
Let’s go through each of these steps to illustrate how this cycle applies to the development of friendly and hostile intelligence for NSM.
Defining Requirements
An intelligence product is generated based upon a defined requirement. This requirement is what all other phases of the intelligence cycle are derived from. Just like a movie can’t be produced without a script, an intelligence product can’t be produced without a clearly defined intelligence requirement.
In terms of information security and NSM, that requirement is generally focused on a need for information related to assets you are responsible for protecting (friendly intelligence), or focused on information related to hosts that pose a potential threat to friendly assets (hostile intelligence).
These requirements are, essentially, requests for information and context that can help NSM analysts make judgments relevant to their investigations. This phase is ultimately all about asking the right questions, and those questions depend on whether the intelligence requirement is continual or situational. For instance, the development of a friendly intelligence product is a continual process, meaning that questions should be phrased in a broad, repeatable manner.
Some examples of questions designed to create baselines for friendly communication patterns might be:
What are the normal communication patterns occurring between friendly hosts?
What are the normal communication patterns occurring between sensitive friendly hosts and unknown external entities?
What services are normally provided by friendly hosts?
What is the normal ratio of inbound to outbound communication for friendly hosts?
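As a rough illustration of how the last question could be turned into a repeatable measurement, the sketch below computes an inbound-to-outbound byte ratio per friendly host. The record layout (source, destination, bytes), the sample values, and the function names are assumptions made for illustration; the friendly range is borrowed from the Nmap examples later in this chapter. Real session data would need to be mapped into this shape first.

```python
import ipaddress
from collections import defaultdict

# Assumption: the friendly range, borrowed from the chapter's Nmap examples.
FRIENDLY_NET = ipaddress.ip_network("172.16.16.0/24")

# Hypothetical session records: (source IP, destination IP, bytes sent).
SESSIONS = [
    ("172.16.16.10", "203.0.113.5", 1200),
    ("203.0.113.5", "172.16.16.10", 4800),
    ("172.16.16.10", "198.51.100.7", 800),
    ("198.51.100.9", "172.16.16.20", 500),
]


def is_friendly(ip):
    """True when the address falls inside the friendly network range."""
    return ipaddress.ip_address(ip) in FRIENDLY_NET


def inbound_outbound_ratio(sessions):
    """Return {friendly_ip: inbound_bytes / outbound_bytes}.

    A host with no outbound bytes gets None instead of a ratio.
    """
    inbound = defaultdict(int)
    outbound = defaultdict(int)
    for src, dst, nbytes in sessions:
        if is_friendly(dst):
            inbound[dst] += nbytes
        if is_friendly(src):
            outbound[src] += nbytes
    ratios = {}
    for host in sorted(set(inbound) | set(outbound)):
        ratios[host] = inbound[host] / outbound[host] if outbound[host] else None
    return ratios


ratios = inbound_outbound_ratio(SESSIONS)
for host, ratio in ratios.items():
    print(host, ratio)
```

Tracked over time, a figure like this gives an analyst something concrete to compare against when a host's traffic suddenly shifts direction.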
On the other end of the spectrum, the development of a threat intelligence product is a situational process, meaning that questions are often specific, and designed to generate a single intelligence product for a current investigation:
Has the specific hostile host ever communicated with friendly hosts before, and if so, to what extent?
Is the specific hostile host registered to an ISP where previous hostile activity has originated?
How does the content of the traffic generated by the specific hostile host compare to activity that is known to be associated with currently identified hostile entities?
Can the timing of this specific event be tied to the goals of any particular organization?
Once you have asked the right question, the rest of the cards should begin to fall into place. We will delve further into the nature of friendly and threat intelligence requirements later in their respective sections.
Planning
With an intelligence requirement defined, appropriate planning can ensure that the remaining steps of the intelligence cycle can be completed. This involves planning each of these steps and assigning resources to them. In NSM terms, this means different things for different steps. For instance, during the collection phase this may mean assigning level three analysts (thinking back to our Chapter 1 discussion of classifying analysts) and systems administrators to work with sensors and collection tools. In the processing and analysis phase this may mean assigning level one and two analysts to these processes and sectioning off a portion of their time to work on this task.
Of course, the types of resources, both human and technical, that you assign to these tasks will vary depending upon your environment and the makeup of your technical teams. In larger organizations you may have a separate team specifically for generating intelligence products. In smaller organizations, you might be a one-man show responsible for the entirety of intelligence product creation. No matter how large or small your organization, you can participate in the development of friendly and threat intelligence.
Collection
The collection phase of the intelligence cycle deals with the mechanisms used for collecting the data that supports the outlined requirements. This data will eventually be processed, analyzed, and disseminated as the intelligence product.
In a SOC environment, you may find that your collection needs for intelligence purposes will force you to modify your overall collection plan. For the purposes of continual friendly intelligence collection, this can include the collection of useful statistics, like those discussed in Chapter 11, or the collection of passive real-time asset data, like the data generated with a tool we will discuss later, called PRADS.
When it comes to situational threat intelligence collection, data will typically be collected from existing NSM data sources like FPC or session data. This data will generally be focused on what interaction the potentially hostile entity had with trusted network assets. In addition, open source intelligence gathering processes are utilized to ascertain publicly available information related to the potentially hostile entity. This might include items like information about the registrant of an IP address, or known intelligence surrounding a suspicious file.
In order for intelligence collection to occur in an efficient manner, collection processes for certain types of data (FPC, PSTR, Session, etc.) should be well-documented and easily accessible.
Processing
Once data has been collected, some types of data must be further processed to become useful for analysis. This can mean a lot of different things for a lot of different types of data.
At a higher level, processing can mean just paring down the collected data set into something more immediately useful. This might mean applying filters to a PCAP file to shrink the total working data set, or selecting log files of only a certain type from a larger log file collection.
At a more granular level, this might mean taking the output from a third party or custom tool and using some BASH commands to format the output of those tools into something more easily readable. In cases where an organization is using a custom tool or database for intelligence collection, it might mean writing queries to insert data into this format, or pull it out of that format into something more easily readable.
Ultimately, processing can sometimes be seen as an extension of collection where collected data is pared down, massaged, and tweaked into a form that is ideal for the analyst.
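The paring down and reformatting described in this section can be sketched in a few lines. The example below is only an illustration: the four-column log layout, the sample records, and the select_records function are all hypothetical, and a real data source would need its own parsing.

```python
# Hypothetical, simplified log lines in "timestamp type source destination" form.
RAW_LOGS = [
    "2013-10-01T10:00:01 dns 172.16.16.10 172.16.16.2",
    "2013-10-01T10:00:02 http 172.16.16.10 203.0.113.5",
    "2013-10-01T10:00:03 http 172.16.16.20 198.51.100.7",
    "2013-10-01T10:00:04 http 203.0.113.5 172.16.16.10",
]


def select_records(lines, record_type, host):
    """Keep only records of one type that involve one host, reformatted for reading."""
    selected = []
    for line in lines:
        timestamp, rtype, src, dst = line.split()
        if rtype == record_type and host in (src, dst):
            selected.append(f"{timestamp}  {src} -> {dst}")
    return selected


http_for_host = select_records(RAW_LOGS, "http", "172.16.16.10")
for record in http_for_host:
    print(record)
```

The same two moves, filtering to the relevant subset and then reshaping it for the reader, apply whether the input is a log file, a PCAP, or tool output.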
Analysis
The analysis phase is where multiple collected and processed items are examined, correlated, and given the necessary context to make them useful. This is where intelligence goes from just being loosely related pieces of data to a finished product that is useful for decision-making.
In the analysis and generation of both friendly and threat intelligence products, the analyst will take the output of several tools and data sources and combine those data points on a per host basis, painting a picture of an individual host. A great deal more intelligence will be available for local hosts, and might allow this picture to include details about the tendencies and normal communication partners of the host. The analysis of potentially hostile hosts will be generated from a much smaller data set, and require the incorporation of open source intelligence into the analysis process.
What ultimately results from this process is the intelligence product, ready to be parsed by the analyst.
Dissemination
In most practical cases, an organization won’t have a dedicated intelligence team, meaning the NSM analysts will be generating intelligence products for their own use. This is a unique advantage, because the consumer of the intelligence will usually be the same person who generated it, or will at least be in the same room or under the same command structure. In the final phase of the intelligence cycle, the intelligence product is disseminated to the individual or group who initially identified the intelligence requirement.
In most cases, the intelligence product is constantly being evaluated and improved. The positive and negative aspects of the final product are critiqued, and this critique goes back into defining intelligence requirements and planning the product creation process. This is what makes this an intelligence cycle, rather than just an intelligence chain.
The remainder of this chapter is devoted to the friendly and threat intelligence products, and ways to generate and obtain that data. While the intelligence framework might not be referenced exclusively, the actions described in these sections will most certainly fit into this framework in a manner that can be adapted to nearly any organization.
Generating Friendly Intelligence
You cannot effectively defend your network if you do not know what is on it, and how it communicates. This statement cannot be emphasized enough. No matter how simple or sophisticated an attack may be, if you don’t know the roles of the devices on your network, especially those where critical data exists, then you won’t be able to effectively identify when an incident has occurred, contain that incident, or eradicate the attacker from the network. That’s why the development of friendly intelligence is so important.
In the context of this book, we present friendly intelligence as a continually evolving product that can be referenced to obtain information about hosts an analyst is responsible for protecting. This information should include everything the analyst needs to aid in the event of an investigation, and should be able to be referenced at any given time. Generally, an analyst might be expected to reference friendly intelligence about a single host any time they are investigating alert data associated with that host. This would typically be when the friendly host appears to be the target of an attack. Because of that, it isn’t uncommon for an analyst to reference this data dozens of times per shift for a variety of hosts. Beyond this, you should also consider that the analysis of friendly intelligence could also result in the manual observance of anomalies that can spawn investigations. Let’s look at a few ways to create friendly intelligence from network data.
The Network Asset History and Physical
When a physician assesses a new patient, the first thing they perform is an evaluation of the medical history and physical condition of the patient. This is called a patient history and physical, or an H&P. This concept provides a useful framework that can be applied to the friendly intelligence of network assets.
The patient history assessment includes current and previous medical conditions that could impact the patient’s current or future health. This also usually includes a history of the patient’s family’s health conditions, so that risk factors for those conditions in the patient can be identified and mitigated.
Shifting this concept to a network asset, we can translate a network asset’s medical history to its connection history. This involves assessing previous communication transactions between the friendly host and other hosts on the network, as well as hosts outside of the network. This connection profiling extends beyond the hosts involved in the communication to the services used by the host, both as a client and as a server. If we can assess this connection history, we can make educated guesses about the validity of new connections a friendly host makes in the context of an investigation.
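One way to picture a connection history is as a per-host set of previously seen peers, ports, and roles. The sketch below is a minimal illustration under assumed inputs: the (client, server, port) record shape, the sample addresses, and the build_history and is_known helpers are hypothetical and are not drawn from any particular tool.

```python
from collections import defaultdict

# Hypothetical past sessions: (client IP, server IP, server port).
PAST_SESSIONS = [
    ("172.16.16.10", "172.16.16.2", 53),
    ("172.16.16.10", "203.0.113.5", 443),
    ("172.16.16.30", "172.16.16.10", 22),
]


def build_history(sessions):
    """Map each host to the set of (peer, port, role) tuples seen so far."""
    history = defaultdict(set)
    for client, server, port in sessions:
        history[client].add((server, port, "client"))
        history[server].add((client, port, "server"))
    return history


def is_known(history, host, peer, port, role):
    """True when this host has already been seen with this peer, port, and role."""
    return (peer, port, role) in history.get(host, set())


history = build_history(PAST_SESSIONS)
print(is_known(history, "172.16.16.10", "203.0.113.5", 443, "client"))
print(is_known(history, "172.16.16.10", "198.51.100.7", 6667, "client"))
```

A connection that is absent from the history is not automatically hostile; it is simply something the analyst has no precedent for, which is useful context during an investigation.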
The patient physical exam captures the current state of a patient’s physical health, and measures items such as the patient’s demographic information, their height and weight, their blood pressure, and so on. The product of the physical exam is an overall assessment of a patient’s health. Often physical exams will be conducted with a targeted goal, such as assessments that are completed for the purposes of health insurance, or for clearance to play a sport.
When we think about a friendly network asset in terms of the patient physical exam, we can begin to identify criteria that help define the state of the asset on the network, as opposed to a state of health in a patient. These criteria include items such as the IP address and DNS name of the asset, the VLAN it is located in, the role of the device (workstation, web server, etc.), the operating system architecture of the device, or its physical network location. The product of this assessment on the friendly network asset is a state of its operation on the network, which can be used to make determinations about the activity the host is presenting in the context of an investigation.
Now, we will talk about some methods that can be used to create a network asset H&P. This will include using tools like Nmap to define the “physical exam” portion of an H&P through the creation of an asset model, as well as the use of PRADS to help with the “history” portion of the H&P by collecting passive real-time asset data.
Defining a Network Asset Model
A network asset model is, very simply, a list of every host on your network and the critical information associated with it. This includes things like the host’s IP address, DNS name, general role (server, workstation, router, etc), the services it provides (web server, SSH server, proxy server, etc), and the operating system architecture. This is the most basic form of friendly intelligence, and something all SOC environments should strive to generate.
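To make the idea concrete, the sketch below shows one possible in-memory shape for an asset model, keyed by IP address for quick lookup during an investigation. The Asset class, its field names, and the sample hosts are all assumptions for illustration; an organization's actual model will depend on the tools it already has in place.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One entry in a hypothetical network asset model."""
    ip: str
    dns_name: str
    role: str                                      # e.g. server, workstation, router
    services: list = field(default_factory=list)   # e.g. web server, SSH server
    os: str = "unknown"                            # operating system architecture


# Sample entries; the hostnames and details are invented for illustration.
ASSETS = [
    Asset("172.16.16.10", "web01.example.local", "server",
          ["web server", "SSH server"], "Linux"),
    Asset("172.16.16.50", "ws-accounting.example.local", "workstation"),
]

# Index the model by IP address so an analyst can look up a host directly.
asset_model = {asset.ip: asset for asset in ASSETS}

print(asset_model["172.16.16.10"].role)
print(asset_model["172.16.16.50"].os)
```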
As you might imagine, there are a number of ways to build a network asset model. Most organizations will employ some form of enterprise asset management software, and this software often has the capacity to provide this data. If that is true for your organization, then that is often the easiest way to get this data to your analysts.
If your organization doesn’t have anything like that in place, then you may be left to generate this type of data yourself. In my experience, there is no discrete formula for creating an asset model. If you walk into a dozen organizations, you will likely find a dozen different methods used to generate the asset model and a dozen more ways to access and view that data. The point of this section isn’t to tell you exactly how to generate this data, because that is something that will really have to be adapted from the technologies that exist in your organization. The goal here is simply to provide an idea of what an asset model looks like, and to provide some idea of how you might start generating this data in the short term.
Caution
Realistically, asset inventories are rarely 100% accurate. In larger organizations with millions of devices, it just isn’t feasible to create asset models that are complete and always up to date. That said, you shouldn’t strive to achieve a 100% solution if it just isn’t possible. In this case, sometimes it’s acceptable to shoot for an 80% solution because it is still 80% better than 0%. If anything, do your best to generate asset models of critical devices that are identified while doing collection planning.
One way to actively generate asset data is through internal port scanning. This can be done with commercial software, or with free software like Nmap. For instance, you can run a basic ping scan with this command:
nmap -sn 172.16.16.0/24
This command will perform a basic ICMP (ping) scan against all hosts in the 172.16.16.0/24 network range, and generate output similar to Figure 14.2.
FIGURE 14.2 Ping Scan Output from Nmap
As you can see in the data shown above, any host that is allowed to respond to ICMP echo request packets will respond with an ICMP echo reply. Assuming all of the hosts on your network are configured to respond to ICMP traffic (or they have an exclusion in a host-based firewall), this should allow you to map the active hosts on the network. The information provided to us is a basic list of IP addresses.
We can take this a step further by utilizing more advanced scans. A SYN scan will attempt to communicate with any host on the network that has an open TCP port. This command can be used to initiate a SYN scan:
nmap -sS 172.16.16.0/24
This command will send a TCP SYN packet to the top 1000 most commonly used ports of every host on the 172.16.16.0/24 network. The output is shown in Figure 14.3.
FIGURE 14.3 SYN Scan Output from Nmap
This SYN scan gives us a bit more information. So now, in addition to IP addresses of live hosts on the network, we also have a listing of open ports on these devices, which can indicate the services they provide.
We can extend this even further by using the version detection and operating system fingerprinting features of Nmap:
nmap -sV -O 172.16.16.0/24
The command will perform a standard SYN port scan, followed by tests that will attempt to assess the services listening on open ports, and a variety of tests that will attempt to guess the operating system architecture of the device. This output is shown in Figure 14.4.
FIGURE 14.4 Version and Operating System Detection Scan Output
This type of scan will generate quite a bit of additional traffic on the network, but it will help round out the asset model by providing the operating system architecture and helping clarify the services running on open ports.
The data shown in the screenshots above is easy to read when Nmap outputs it in its default format; however, it isn’t the easiest to search through. We can fix this by forcing Nmap to output its results in a single-line format. This format is easily searchable with the grep tool, and very practical for analysts to reference. To force Nmap to output its results in this format, simply add -oG <filename> at the end of any of the commands shown above. In Figure 14.5, we use the grep command to search for data associated with a specific IP address (172.16.16.10) in a file that is generated using this format (data.scan).
FIGURE 14.5 Greppable Nmap Output
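Beyond grep, the single-line format also lends itself to simple scripted parsing. The sketch below is a hedged illustration rather than a complete parser. It assumes each host line has tab-separated fields and that port entries look like port/state/protocol/owner/service/..., and it ignores edge cases such as unusual characters in version strings. The sample lines and hostnames are invented, not taken from Figure 14.5.

```python
# Invented sample lines that follow the assumed single-line layout.
SAMPLE_OUTPUT = [
    "# Nmap scan initiated (comment lines start with #)",
    "Host: 172.16.16.10 (web01.example.local)\t"
    "Ports: 22/open/tcp//ssh///, 80/open/tcp//http///, 443/closed/tcp//https///",
    "Host: 172.16.16.20 ()\tStatus: Up",
]


def parse_grepable_line(line):
    """Return (ip, [(port, protocol, service), ...]) for a Host line, else None."""
    if not line.startswith("Host:"):
        return None
    fields = line.strip().split("\t")
    ip = fields[0].split()[1]
    open_ports = []
    for item in fields[1:]:
        if not item.startswith("Ports:"):
            continue
        for entry in item[len("Ports:"):].split(","):
            parts = entry.strip().split("/")
            # Keep only entries that are complete enough and marked open.
            if len(parts) >= 5 and parts[1] == "open":
                open_ports.append((int(parts[0]), parts[2], parts[4]))
    return ip, open_ports


parsed = [parse_grepable_line(line) for line in SAMPLE_OUTPUT]
hosts = dict(result for result in parsed if result is not None)
print(hosts)
```

Parsed results like these could feed directly into whatever structure holds the asset model, rather than being searched by hand each time.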
You should keep in mind that using a scanner like Nmap isn’t always the most conclusive way to build friendly intelligence. Most organizations schedule noisy scans like these in the evening, and this creates a scenario where devices might be missed in the scan because they are turned off. This also doesn’t account for mobile devices that are only periodically connected to the network, like laptops that employees take home at night, or laptops belonging to traveling staff. Because of this, intelligence built from network scan data should combine the results of multiple scans taken at different time periods. You may also need to use multiple scan types to ensure that all devices are detected. Generating an asset model with scan data is much more difficult than firing off a single scan and storing the results. It requires a concerted effort and may take quite a bit of finessing in order to get the results you are looking for on a consistent basis.
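Combining scans taken at different times can be sketched as a simple merge that records when each host was first and last seen. The scan labels, the sample host sets, and the merge_scans function below are hypothetical; the sketch assumes each scan has already been reduced to a set of responding IP addresses and that scans are supplied in chronological order.

```python
# Hypothetical scan results in chronological order: (label, set of responding IPs).
SCANS = [
    ("mon-evening", {"172.16.16.10", "172.16.16.20"}),
    ("tue-midday", {"172.16.16.10", "172.16.16.50"}),
    ("wed-evening", {"172.16.16.10", "172.16.16.20"}),
]


def merge_scans(scans):
    """Merge per-scan host sets into one view with first/last seen and a count."""
    merged = {}
    for label, scanned_hosts in scans:
        for ip in scanned_hosts:
            entry = merged.setdefault(
                ip, {"first_seen": label, "last_seen": label, "times_seen": 0}
            )
            entry["last_seen"] = label
            entry["times_seen"] += 1
    return merged


merged = merge_scans(SCANS)
for ip in sorted(merged):
    print(ip, merged[ip])
```

In this sample, the host that only answered the midday scan would have been missed entirely by an evening-only schedule, which is the gap that multiple scans are meant to close.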
No matter how reliable your scan data may seem, it should be combined with another data source that can be used to validate the results. This can be something that is already generated on your network, like DNS transaction logs, or something that is part of your NSM data set, like session data. Chapter 4 and