A Deep-dive on Exploit Prediction Scoring System (EPSS) — Part 1

May 16, 2024

In today’s rapidly evolving cyber landscape, vulnerability management — a practice of identifying, prioritising, and remediating known software vulnerabilities — has been a continuous challenge for organisations.

One contributing factor is the growing number of vulnerabilities identified each year: published vulnerabilities rose by 24.3% in 2022 and by 15.6% in 2023, in each case over the previous year. This rise can be attributed to several factors, including:

digital transformation has made software more ubiquitous;
the speed of innovation may inadvertently introduce more vulnerabilities; and
the growing vigilance of the cybersecurity community has exposed more vulnerabilities.

The issue is exacerbated by the shortage of skilled cybersecurity professionals. With increasing awareness of software vulnerabilities and limited capacity to remediate them, vulnerability prioritisation and remediation have become both chronic and acute concerns for organisations attempting to reduce their attack surface.

At one extreme, an organisation could remediate every vulnerability, achieving maximum coverage at the cost of low efficiency. At the other, it could remediate only selected high-risk vulnerabilities, gaining efficiency but risking that other vulnerabilities, which may go on to be exploited, are missed and leave the organisation exposed.
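The trade-off between coverage and efficiency can be made concrete with a small calculation. The sketch below assumes the definitions commonly used in the EPSS literature: coverage is the share of exploited vulnerabilities that were remediated, and efficiency is the share of remediated vulnerabilities that were actually exploited. The function name and the `VULN-n` identifiers are hypothetical and purely illustrative.

```python
def coverage_and_efficiency(remediated, exploited):
    """Return (coverage, efficiency) for one remediation strategy.

    coverage   = exploited vulnerabilities that were remediated / all exploited
    efficiency = remediated vulnerabilities that were exploited / all remediated
    """
    remediated, exploited = set(remediated), set(exploited)
    hits = remediated & exploited
    coverage = len(hits) / len(exploited) if exploited else 0.0
    efficiency = len(hits) / len(remediated) if remediated else 0.0
    return coverage, efficiency


# Illustrative environment: 100 vulnerabilities, 2 of which end up exploited.
all_vulns = [f"VULN-{i}" for i in range(1, 101)]
exploited = {"VULN-3", "VULN-42"}

# Strategy A: remediate everything (full coverage, very low efficiency).
fix_everything = coverage_and_efficiency(all_vulns, exploited)

# Strategy B: remediate two hand-picked items, one of which was exploited.
fix_selected = coverage_and_efficiency(["VULN-3", "VULN-7"], exploited)

print(fix_everything)  # (1.0, 0.02)
print(fix_selected)    # (0.5, 0.5)
```

Strategy A never misses an exploited vulnerability but spends 98% of its effort on vulnerabilities that were never exploited, while Strategy B wastes less effort but misses half of the exploited ones.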

A study conducted by the Cyentia Institute has shown that organisations are able to remediate only approximately 10% of the vulnerabilities in their environments, regardless of the organisation’s size or the maturity of its vulnerability management program. Other studies indicate that while up to 31% of published vulnerabilities may have associated exploit code, on average fewer than 2% of vulnerabilities are ever weaponized or exploited in the wild.

The Exploit Prediction Scoring System (EPSS) offers a data-driven approach to this challenge. This blog post explores what EPSS is, the advantages it offers over other vulnerability scoring systems such as the Common Vulnerability Scoring System (CVSS), and how it can transform vulnerability management practices by enabling organisations to anticipate threats proactively and allocate resources more effectively.

I invite you to dive into the world of EPSS to understand how leveraging real-world data and predictive analytics can enhance your cybersecurity strategy.

What is EPSS?

The Exploit Prediction Scoring System (EPSS) is a data-driven framework managed by the Forum of Incident Response and Security Teams (FIRST), which helps estimate the likelihood that a particular vulnerability will be exploited in the wild within the next 30 days. The goal is to assist organisations in better prioritising their vulnerability remediation efforts.

Unlike traditional vulnerability scoring systems, such as Common Vulnerability Scoring System (CVSS), which assess the severity rating of vulnerabilities based on their inherent characteristics, EPSS uses machine learning to predict the probability of exploitation based on a combination of factors, including real-world exploit data and threat intelligence feeds.

The EPSS model produces probability scores between 0 and 1 (0% and 100%), with higher scores indicating a greater likelihood of exploitation. This system helps organisations prioritise vulnerabilities that are more likely to be exploited, thus enabling more efficient resource allocation towards mitigating critical threats.
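Because each score is a probability, scores can be combined with ordinary probability arithmetic. As a rough sketch, the chance that at least one vulnerability in a group is exploited is one minus the product of the individual chances of not being exploited. This assumes exploitation events are independent of each other, which is a simplification and not something the EPSS model itself guarantees; the function name and the input scores are made up for illustration.

```python
import math


def prob_at_least_one(scores):
    """P(at least one exploited) = 1 - product(1 - p), assuming independence."""
    for p in scores:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"EPSS scores must lie in [0, 1], got {p}")
    return 1.0 - math.prod(1.0 - p for p in scores)


# Two hypothetical vulnerabilities, each with a 50% score.
combined = prob_at_least_one([0.5, 0.5])
print(f"{combined:.0%}")  # 75%
```

Even a collection of individually low scores can add up to a meaningful chance that something in the environment is exploited, which is one reason to look at scores across an asset or estate and not only one CVE at a time.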

EPSS generates daily scores for all published CVEs. This reflects the dynamic nature of cybersecurity threats and underscores the need for timely data to support swift action on reducing organisational risk.

EPSS Model — Evolution and Improvements

The EPSS effort began with the publication of a research paper in the Journal of Cybersecurity in July 2020, and the first version of the model was introduced in April 2021.

Since its inception, EPSS has evolved from a simpler logistic regression model to a more complex machine learning model using techniques such as Extreme Gradient Boosting (XGBoost). This evolution reflects ongoing improvements in predictive accuracy and the ability to handle a broader array of data inputs.

The model has seen three major versions since it first emerged:

EPSS v1.0 (April 2021): This initial version used a logistic regression model that integrated data from a limited number of variables. It aimed to predict the likelihood of exploitation within the first year of a vulnerability’s publication. Although the model showed improvements in efficiency and coverage over CVSS, it had several limitations, owing to the limited data set used to train it.
EPSS v2.0 (February 2022): In response to the desire for more robust data and less hands-on scoring by end-users, EPSS v2.0 moved to XGBoost, a more complex and better-performing gradient boosted tree-based machine learning model, and significantly increased the number of variables considered, from 16 in the first model to 1,164. This version aimed to predict the likelihood of exploitation activity within the next 30 days, shifting the predictive focus to a more immediate timeframe and resulting in significant improvements over the previous version.
EPSS v3.0 (March 2023): The latest version further refined the predictive capabilities of the system. With continuous updates to the machine learning model and data sources, EPSS v3.0 predicts with greater efficiency, factoring in a broader array of data points from diverse sources, including historical vulnerability data and daily exploit data, and achieving an overall 82% improvement over v2.0. This version keeps the objective of predicting short-term exploitation risk, with enhanced accuracy and performance.

Each iteration of EPSS has aimed to improve the predictiveness of the system by expanding the data sources used, increasing the sophistication of the machine learning algorithms, and refining the model’s focus to provide more timely and relevant predictions for vulnerability exploitation. The improvements made to the EPSS model across versions, and over the CVSS model, are shown in the diagram below:

Ref: https://arxiv.org/pdf/2302.14172

The evolution of EPSS reflects a concerted effort to provide a more effective tool for cybersecurity professionals to prioritise vulnerabilities based on the realistic likelihood of being exploited in the wild. This ongoing development signifies FIRST’s commitment to enhancing the practical utility of the EPSS for a comprehensive vulnerability management approach.

EPSS Model — Constituents and Features

The EPSS model incorporates multiple constituents or features, each selected for its relevance in predicting the likelihood of a vulnerability being exploited.

The EPSS model leverages a variety of data sources, including known vulnerabilities from the MITRE CVE list, exploit databases like ExploitDB, and real-world data on exploits. This comprehensive data approach allows EPSS to provide updated and relevant predictions daily. The details of the data sources in use are provided in the table below:

Source: https://arxiv.org/pdf/2302.14172

The significance of each constituent is rooted in its ability to provide a different perspective or piece of information about the vulnerability, which collectively enhances the model’s predictive accuracy. Here are the main constituents and their significance:

Common Vulnerabilities and Exposures (CVE) Data: This includes the specifics of the vulnerability such as the type, affected systems, keyword description of a vulnerability, and potential impact.
Number of Days Since Publication: The age of a CVE is a significant predictor because newer vulnerabilities might attract more attention from attackers and researchers alike, potentially leading to earlier exploits. Research indicates that 50% of exploits are published within two weeks, and 13% emerge within a month or so after a new vulnerability is published.
Published Exploit Code: The presence of published exploit code in repositories like Metasploit, ExploitDB, or GitHub significantly increases the likelihood of exploitation, as it makes the process easier for attackers by providing them with ready-to-use tools. The chances of exploitation in the wild are seven times higher when exploit code is published.
Security Scanner Data: Inputs from multiple security scanners about a vulnerability’s detectability and exploitability can provide insights into how easily a vulnerability can be exploited and thus its attractiveness to attackers.
CVSS Scores: While EPSS and CVSS are different, the CVSS scores, particularly the base metrics, provide insight into the severity and potential impact of a vulnerability. These are used in EPSS to provide context about severity, although EPSS focuses on the likelihood of exploitation rather than on severity.
Common Platform Enumeration (CPE) Data: This indicates the specific vendor platforms (software or hardware) affected by the vulnerability. Understanding the platforms involved can help in assessing the potential reach and impact of an exploit, thereby influencing the likelihood of exploitation.
Common Weakness Enumeration (CWE): The type of weakness associated with a vulnerability helps inform the attractiveness of a vulnerability to adversaries.
Machine Learning Models: EPSS has evolved to use advanced machine learning algorithms, specifically a gradient boosted tree model in its latest iteration. These algorithms can handle a large variety of input features and find complex patterns in data that might not be immediately apparent to human analysts. The use of machine learning allows EPSS to continuously learn from new data and improve its predictions over time.
Real-world Exploit Data: By incorporating data about actual exploits from various threat intelligence feeds, the model gains a dynamic component that reflects current attack trends and techniques, thus enhancing its relevance and timeliness.
Vendor Products: Specific vendors and their products may be more attractive to attackers due to a specific product’s install base and the associated vulnerabilities.
Social Media: Discussions and mentions of a CVE on social media, such as Twitter, may help correlate information about exploitation activity.
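To give a feel for how constituents like these become model inputs, the sketch below flattens one vulnerability record into a dictionary of numeric features. It is an illustration only: the record layout, the field names, and the `build_features` helper are all hypothetical, and the real EPSS feature set is far larger and is defined by the EPSS team, not by this example.

```python
from datetime import date


def build_features(record, today):
    """Turn one hypothetical CVE record into a flat feature dictionary."""
    return {
        # Age of the CVE in days (the "Number of Days Since Publication" input).
        "days_since_published": (today - record["published"]).days,
        # CVSS base score, used as context about severity.
        "cvss_base_score": record["cvss_base_score"],
        # 1 if exploit code is known in any repository, else 0.
        "has_exploit_code": int(bool(record["exploit_sources"])),
        # One indicator per weakness type and per vendor (one-hot style).
        f"cwe:{record['cwe']}": 1,
        f"vendor:{record['vendor']}": 1,
        # Number of social media mentions observed for the CVE.
        "social_media_mentions": record["social_media_mentions"],
    }


# A made-up record; none of these values describe a real CVE.
record = {
    "published": date(2024, 5, 2),
    "cvss_base_score": 9.8,
    "exploit_sources": ["ExploitDB"],
    "cwe": "CWE-79",
    "vendor": "example-vendor",
    "social_media_mentions": 12,
}
features = build_features(record, today=date(2024, 5, 16))
print(features["days_since_published"])  # 14
```

A tree-based model such as the one EPSS uses would be trained on many rows of this kind, each labelled with whether exploitation activity was later observed.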

The diagram below shows the 30 most significant features and their influence on the final predictive values produced by the model.

Source: https://arxiv.org/pdf/2302.14172

Each of these components contributes to the overall effectiveness of EPSS by providing comprehensive and nuanced insights into both the nature of the vulnerability and the context in which it exists. This multifaceted approach allows EPSS to offer a probabilistic estimate of a vulnerability being exploited, helping organisations prioritise their security measures more effectively.

Using EPSS for Better Vulnerability Management

Organisations can use EPSS scores, available through FIRST.org’s API and downloadable datasets, to prioritise vulnerabilities that pose a real threat of being exploited. This helps in efficiently directing remediation efforts towards the most critical threats.
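As a minimal sketch of what retrieving scores might look like, the snippet below queries the EPSS API using only the Python standard library. The endpoint URL, the `cve` query parameter, and the `data`, `cve`, `epss` and `percentile` field names are assumptions based on FIRST's public API documentation and should be checked against the current documentation before use. The helper names are hypothetical, and the sample payload is invented so that the parsing step can be tried offline.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against FIRST's current API documentation.
EPSS_API_URL = "https://api.first.org/data/v1/epss"


def parse_epss_response(payload):
    """Map each CVE ID to (epss score, percentile) from a decoded JSON payload."""
    return {
        row["cve"]: (float(row["epss"]), float(row["percentile"]))
        for row in payload.get("data", [])
    }


def fetch_epss(cve_ids, timeout=10):
    """Fetch scores for the given CVE IDs (performs a network request)."""
    query = urllib.parse.urlencode({"cve": ",".join(cve_ids)})
    with urllib.request.urlopen(f"{EPSS_API_URL}?{query}", timeout=timeout) as resp:
        return parse_epss_response(json.load(resp))


# Offline example with an invented payload; the CVE ID and values are not real.
sample_payload = {
    "data": [{"cve": "CVE-2099-0001", "epss": "0.25", "percentile": "0.9"}]
}
parsed = parse_epss_response(sample_payload)
print(parsed)  # {'CVE-2099-0001': (0.25, 0.9)}
```

In practice, a call such as `fetch_epss(["CVE-2021-44228"])` would be made once per day, since the scores are refreshed daily.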

EPSS can significantly enhance an organisation’s vulnerability management practices in several key ways:

Prioritisation of Remediation Efforts: EPSS helps organisations prioritise vulnerabilities based on the likelihood of exploitation rather than just severity. This is crucial because not all high-severity vulnerabilities are exploited with the same frequency or immediacy. By focusing on the likelihood of exploitation, organisations can allocate resources more efficiently, addressing the most pressing threats first.
Resource Allocation: By providing a probability score for each vulnerability, EPSS enables organisations to make informed decisions about where to allocate their limited security resources. This can lead to more effective risk management, as teams can focus on patching vulnerabilities that are most likely to be exploited in the near term. Using the EPSS model, teams can make a trade-off between efficiency and coverage. For example, resource-constrained organisations may focus more on improved ‘efficiency’, whereas better-resourced organisations with mature vulnerability management programs could focus more on improved ‘coverage’.
Enhanced Risk and Security Posture: With EPSS, organisations can enhance their overall risk and security posture by staying ahead of potential threats. Since the system is updated daily with new data, it provides a dynamic and current assessment of the threat landscape, allowing organisations to respond quickly to emerging threats before they are exploited. This may even require organisations to update their existing vulnerability management processes to respond to real-world threat activity based on dynamic EPSS scores.
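The prioritisation and resource allocation points above can be sketched as a simple selection rule: keep the vulnerabilities whose EPSS score meets a chosen threshold, rank them from most to least likely to be exploited, and cut the list at the team's capacity. The threshold of 0.1, the `prioritise` helper, and the scores below are all illustrative assumptions; each organisation would need to choose its own threshold.

```python
def prioritise(scores, threshold=0.1, capacity=None):
    """Return CVE IDs with score >= threshold, highest score first.

    capacity optionally limits how many items the team can take on.
    """
    selected = [(cve, p) for cve, p in scores.items() if p >= threshold]
    selected.sort(key=lambda item: item[1], reverse=True)
    return [cve for cve, _ in selected[:capacity]]


# Invented scores for hypothetical CVE IDs.
scores = {
    "CVE-2099-0001": 0.92,
    "CVE-2099-0002": 0.03,
    "CVE-2099-0003": 0.41,
    "CVE-2099-0004": 0.12,
}

# A lower threshold favours coverage: more vulnerabilities make the list.
broad = prioritise(scores, threshold=0.1)

# A higher threshold, or a capacity limit, favours efficiency.
narrow = prioritise(scores, threshold=0.5)
limited = prioritise(scores, threshold=0.1, capacity=2)

print(broad)    # ['CVE-2099-0001', 'CVE-2099-0003', 'CVE-2099-0004']
print(narrow)   # ['CVE-2099-0001']
print(limited)  # ['CVE-2099-0001', 'CVE-2099-0003']
```

Because EPSS scores change daily, a list like this would need to be recomputed regularly and not treated as a one-off ranking.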

To keep this article from becoming too long, I have divided it into two parts.

In this part, I have captured the details of the EPSS model, its evolution and history, main components, and how organisations can benefit by incorporating EPSS into their vulnerability management strategies.

In my forthcoming article, I will analyse some of the vulnerability management strategies discussed above in more depth, compare EPSS with CVSS, and explain what EPSS is not. So, stay tuned!

Vishal Garg
