Optimal CyberSecurity Staffing Plan

OR/SYST 699 Final Report

Jennifer Krajic, Kendrick van Doorn, Thomas Lepp

Contents

1.0 Executive Summary

2.0 Background

2.1 IDS

2.2 Literature Review and Findings

2.3 Problem Statement

3.0 Problem Scope

3.1 Primary Problem Requirements

4.0 Technical Approach (Assumptions / Limitations)

4.1 Assumptions

4.2 Approach

4.2.1 Time Period Analysis

4.2.2 Model Architecture

4.2.3 Model Part 1: Input

4.2.4 Model Part 2: Optimize

4.2.5 Model Part 3: Assign

4.3 Results

5.0 Sensitivity Analysis

6.0 Evaluation

7.0 Recommendations and Future Work

8.0 References

9.0 Appendix A

10.0 Appendix B

Table of Tables

Table 1: Time Periods and Frequency of Alerts

Table 2: Alert Frequency Mean Arrival Rate

Table 3: Time Period to Minimum Payroll Total Cost Analysis

Table 4: Analyst Alert Completion and Income Per Hour

Table 5: Weekend Date Constraints

Table of Figures

Figure 1: Intrusion Detection System

Figure 2: Alert Demand Frequency

Figure 3: Poisson vs Average Alert Distribution for Seven Days

Figure 4: Model Part 1 Output

Figure 5: Equations of Minimization Staffing Problem

Figure 6: Shift Pattern by Analyst Type

Figure 7: Demand Versus Supply of Alert

Figure 8: Cross Reference of Shift Pattern to Analyst

Figure 9: Part 3: Assign

Figure 10: 14-Day Intermediate Analyst Schedule

Figure 11: 14-Day Junior Analyst Schedule

Figure 12: 14-Day Senior Analyst Schedule

Figure 13: Model Results Alert to Analyst

Figure 14: 14-Day Senior Analyst Work Schedule with Weekend Constraint

Figure 15: 14-Day Constraint Identified

Figure 16: 14-Day Senior Analyst Work Schedule without Weekend Constraint

Figure 17: Model Validation Supply Vs. Demand

Figure 18: EVM Earned Value

Figure 19: EVM CPI & SPI

1.0 Executive Summary

A CyberSecurity Operations Center (CSOC) protects against emerging CyberSecurity threats by analyzing all alerts within a short turnaround time. The alerts are received from sensors that are strategically placed in a network to collect data. The CSOC employs analysts of varying skill levels, categorized as Junior, Intermediate, and Senior, for the initial investigation of all alerts. Each skill level has a different payroll cost and expected alert analysis rate.

The objective of the Optimal CyberSecurity Staffing Plan project is to find an optimal CyberSecurity analyst staffing plan that minimizes the payroll cost of the workforce. The project team created and delivered a three-part model in which each part builds on the previous one. The model was developed in Python and uses the Gurobi solver for optimization.

Alert demand is generated from a Poisson distribution with varying alert arrival rates: high, moderate, or low. Next, the model solves for feasible shift patterns within the identified constraints. The alert arrival quantity per time period and the feasible shift patterns determine the number and type of analysts required. Using the minimum number of work schedules required, the model then creates analyst schedules by assigning shifts with a First Fit Decreasing (FFD) heuristic. The FFD heuristic sorts the shift patterns into decreasing order, then assigns each shift pattern to the first schedule into which it fits.

In conclusion, the Optimal CyberSecurity Staffing Plan team delivered a working model and a static staffing schedule for the CSOC, and has multiple recommendations for future work to continue this project. These include creating input preferences for each employee and a manager-friendly schedule that allows for adjustments. In addition, an alert backlog analysis should be performed, including a review of surge analysts, to ensure all alerts are reviewed within a set time period.

2.0 Background

A CSOC protects against emerging CyberSecurity threats. Strategically placed sensors in a network collect data, which is analyzed by an intrusion detection system (IDS). The IDS issues alerts based on automated techniques such as pattern matching. The CSOC employs CyberSecurity analysts who work in shifts and investigate these alerts.

2.1 IDS

The IDS is a device or software application that monitors a network or system for malicious activity or policy violations. The main detection methods are signature-based, anomaly-based, and stateful protocol analysis. Some limitations are packet noise, false alarms, legacy software, and lag time.

Figure 1: Intrusion Detection System

Figure 1: Intrusion Detection System displays the traditional workflow of alerts being passed to analysts at a CSOC for review. The workflow begins with multiple sensors, placed all over the world, actively monitoring network traffic that is of concern for the CSOC’s organization. The data is then transferred to the IDS or Security Information and Event Management (SIEM) system. The systems have libraries of previous attacks and behaviors that have been viewed before.

The IDS or SIEM takes the incoming data, processes it against set rules, correlates multiple alerts, and transfers alerts to the CSOC. The number of alerts that can be transferred to the CSOC is limited by the amount of network availability the IDS is able to analyze.

Once the alerts are transferred to the CSOC, they must be processed within an established amount of time because of the high risk to the organization's security, information, and financial accounts. To accomplish this task, there must be enough analysts on staff to perform a pre-review process of categorizing complexity and checking for false-negatives. The most significant alerts are passed on to the following stages, while anomalies are added to the database of previous alerts.

CyberSecurity is a dynamic field that requires constant vigilance and adaptation to evolving threats. An IDS is utilized to generate alerts for CyberSecurity analysts to review for potential danger and risk to the network. In recent news, data breaches across both private and commercial sectors have drastically increased. The data breaches have cost individuals and companies millions of dollars in damages and credibility.

At the core of CyberSecurity is monitoring. Monitoring is the critical action that is seen across all CyberSecurity methodologies. It is no longer a viable method to configure a system or network to be “secure.” The dynamic nature of CyberSecurity threats today requires constant monitoring for anomalies or atypical events in the system and network. Monitoring can include system logs, vulnerability scans, and IDS alerts. Without complete monitoring coverage, a system, network, or company is at risk.

2.2 Literature Review and Findings

The Optimal CyberSecurity Analyst Staffing Plan builds on the current 12-hour static staffing plan found in the research article “Dynamic Scheduling of CyberSecurity Analysts for Minimizing Risk Using Reinforcement Learning”. Because that staffing plan does not provide overlapping schedules or adapt to variations in alert generation during a day, the Optimal CyberSecurity Analyst Staffing Plan team created a schedule that adapts to alert demand fluctuations. This is accomplished by using varying shift patterns, which reduce the need for dynamic surge support and avoid burdening the next shift.

The starting parameters taken from the same research article were the alert investigation rates for the three levels of analysts (Junior, Intermediate, and Senior) and the assumed moderate Poisson alert arrival rate of 9 alerts per hour per sensor.

2.3 Problem Statement

A CSOC protects against emerging and dynamic CyberSecurity threats. It is critical that all alerts are reviewed in a timely manner to reduce risk to the organization, while minimizing payroll costs. Multiple variables require consideration in the creation of a staffing schedule, such as different shift options (4 to 12 hours per shift), night shifts, weekend shifts, off-work times, staffing constraints, and analysts’ skill levels.

3.0 Problem Scope

The Optimal CyberSecurity Analyst Staffing Plan Team is tasked with delivering a 14-day staffing schedule model for a CSOC that minimizes payroll costs. This must be accomplished by scheduling an adequate number of analysts (Junior, Intermediate, and Senior) for the initial investigation of varying alert generation patterns. The staffing schedule will include variable overlapping shift patterns that satisfy staffing requirements.

Deliverables:

●Staffing Model and Schedule

●Sensitivity Analysis of Results

●Model Validation

●Recommendations for Next Steps

Areas that are out of scope include:

●False negative alerts are not incorporated into the types of alerts that are received by the analysts.

●Updates and changes to the IDS are not included in the analysis and expectations of the model.

●Manager-friendly and customizable schedules that allow for employee leave and personal preferences are not included in the model.

3.1 Primary Problem Requirements

The staffing plan must meet the following shift and staffing requirements:

●A minimum of two analysts must be on schedule every hour, with at least one being a Senior.

●A shift length can range from 4 to 12 hours.

●Employees must have a minimum of 8 hours off work between shifts.

●Analysts require every other weekend off-work.

●Analysts cannot work more than six consecutive days.

Figure 2: Alert Demand Frequency

In addition to staffing and shift requirements, alert volume ranges from high to low and repeats weekly. Figure 2: Alert Demand Frequency illustrates the Poisson distribution of each day. To generate the Poisson distribution, the team assumed that 10 sensors were available to collect information for the IDS.

Time Period / Frequency of Alert
06:00 AM to 10:00 AM / High
10:00 AM to 02:00 PM / Moderate
02:00 PM to 06:00 PM / High
06:00 PM to 10:00 PM / Moderate
10:00 PM to 02:00 AM / Low
02:00 AM to 06:00 AM / Low

Table 1: Time Periods and Frequency of Alerts

Table 1: Time Periods and Frequency of Alerts displays the six four-hour time periods. Each time period has an alert frequency of high, moderate, or low. Each frequency has a mean arrival rate, in alerts per hour per sensor, shown in Table 2: Alert Frequency Mean Arrival Rate.

Alert Frequency / Mean Arrival Rate
High / 12
Moderate / 9
Low / 6

Table 2: Alert Frequency Mean Arrival Rate
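The two tables above can be combined into mean alert volumes. The following is a minimal sketch, not the team's code: it assumes the Table 2 rates are per hour per sensor (as stated for the moderate rate in Section 2.2) and uses the 10 sensors assumed above. The names MEAN_RATE, PERIODS, and mean_alerts are illustrative.

```python
# Mean arrival rates in alerts per hour per sensor (Table 2).
MEAN_RATE = {"High": 12, "Moderate": 9, "Low": 6}

# Six four-hour time periods and their alert frequency (Table 1).
PERIODS = [
    ("06:00-10:00", "High"),
    ("10:00-14:00", "Moderate"),
    ("14:00-18:00", "High"),
    ("18:00-22:00", "Moderate"),
    ("22:00-02:00", "Low"),
    ("02:00-06:00", "Low"),
]

SENSORS = 10          # assumption stated in Section 3.1
HOURS_PER_PERIOD = 4


def mean_alerts(sensors=SENSORS):
    """Return {period: (mean alerts per hour, mean alerts per 4-hour period)}."""
    table = {}
    for label, level in PERIODS:
        hourly = MEAN_RATE[level] * sensors
        table[label] = (hourly, hourly * HOURS_PER_PERIOD)
    return table


for label, (hourly, per_period) in mean_alerts().items():
    print(f"{label}: {hourly} alerts/hour, {per_period} alerts/period")
```

Under these assumptions a high period averages 120 alerts per hour across all sensors, a moderate period 90, and a low period 60.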

4.0 Technical Approach (Assumptions / Limitations)

4.1 Assumptions

Assumptions made for developing the Optimal CyberSecurity Analyst Staffing plan are as follows:

●Alerts are batched for each time period and the batch is presented at the beginning of the time period.

●All alerts are investigated by the end of the time period received.

●Analysts work the entire time period.

●Investigation rates incorporate nominal work breaks.

●Each weekday has the same alert pattern and quantity as the same weekday in the following week. For example, Monday in week 1 has the same pattern and alert quantity as Monday in week 2.

4.2 Approach

4.2.1 Time Period Analysis

An initial analysis was conducted to determine alert analysis time period lengths and shift options. The time period increments analyzed were 1 hour and 4 hours. Shift lengths varied from 4 to 12 hours.

Scenario / Period Increment / Shift Options / Min Payroll Total Cost / 24 Hour Pd. Analyst Shifts
1 / 1 hour / 4-12 hours / $16,412 / Junior – 2, Intermediate – 2, Senior – 25
2 / 1 hour / 4, 8, 12 hours / $16,412 / Junior – 2, Intermediate – 2, Senior – 25
3 / 4 hour / 4, 8, 12 hours / $16,412 / Junior – 2, Intermediate – 2, Senior – 25

Table 3: Time Period to Minimum Payroll Total Cost Analysis

Table 3: Time Period to Minimum Payroll Total Cost Analysis displays the analysis performed on three scenarios. This analysis was performed on a single 24-hour day. Alert demand was set so that every hour within a 4-hour time period received the same number of alerts, which makes the 1-hour and 4-hour scenarios directly comparable.

Scenario 1 uses 1-hour time periods with shift lengths of 4 to 12 hours. This yielded a recommendation of 2 Junior, 2 Intermediate, and 25 Senior workers to meet the demand requirement. The minimum payroll total cost of this analysis is $16,412.

Scenario 2 uses 1-hour time periods with shift lengths of 4, 8, or 12 hours. This yielded a recommendation of 2 Junior, 2 Intermediate, and 25 Senior workers to meet the demand requirement. The minimum payroll total cost of this analysis is $16,412.

Scenario 3 uses 4-hour time periods with shift lengths of 4, 8, or 12 hours. This yielded a recommendation of 2 Junior, 2 Intermediate, and 25 Senior workers to meet the demand requirement. The minimum payroll total cost of this analysis is $16,412.

As seen from the initial test, the total costs and workforce required to meet demand are the same across the three scenarios. Since Scenario 3 had only 26 possible shift patterns for 1 day, compared to 224 possible shift patterns for 1 day in Scenarios 1 and 2, the team decided to design the model around 4-hour time periods, with employees working 4-, 8-, or 12-hour shifts.

4.2.2 Model Architecture

The mathematical model was created in three major parts, which build upon each other, and uses a combination of Python code, integer programming, and the FFD heuristic.

  • Input: Calculate average alert arrival rates and feasible shift patterns
  • Optimize: Minimize payroll costs
  • Assign: Minimize staff and create staff schedules

4.2.3 Model Part 1: Input

Identifying hourly demand based on average alerts is the initial step of the optimization model. Determining all feasible shift patterns for 14 days with four-hour time periods produces far more possibilities than doing so for 1 day. Therefore, the model begins with the creation of alert demand for 1 day and repeats this seven times to generate demand values for 1 week. Each of the seven days follows the varying alert pattern described in Section 3.1 Primary Problem Requirements and corresponds to a day of the week shown in Figure 3: Poisson vs Average Alert Distribution for Seven Days. The varying alert generation rates and seven-day patterns are drawn from a Poisson distribution, written and executed in Python. The alert demand used for the model is the average alert demand plus 2 standard deviations. Sunday is the first day of the week and Saturday is the seventh. The Poisson distribution is used because the number of alerts is bounded by the network throughput of the IDS, control of which is outside the scope of this project.

Figure 3: Poisson vs Average Alert Distribution for Seven Days

Figure 3: Poisson vs Average Alert Distribution for Seven Days displays the average alert demand in yellow, the average alert demand plus 2 sigma in grey, and the Poisson distribution values in blue. The data uses information from both Table 1: Time Periods and Frequency of Alerts and Table 2: Alert Frequency Mean Arrival Rate to generate the distribution. The seven days of alerts in the first week are repeated for the following seven days to create a 14-day pattern.
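The three series plotted in Figure 3 can be illustrated with a short sketch. It is not the team's code and rests on assumptions: the buffer is applied to the hourly mean across all 10 sensors, the standard deviation is taken as the square root of the Poisson mean, and the result is rounded up to a whole alert. The function names are hypothetical, and the sampler is Knuth's multiplication method from the standard library rather than whatever generator the team used.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count = 0
    running_product = rng.random()
    while running_product > threshold:
        count += 1
        running_product *= rng.random()
    return count


def buffered_demand(lam, sigmas=2):
    """Mean plus `sigmas` standard deviations; for a Poisson variable sigma = sqrt(mean)."""
    return math.ceil(lam + sigmas * math.sqrt(lam))


# Hourly means across 10 sensors: High = 120, Moderate = 90, Low = 60.
rng = random.Random(699)  # fixed seed so the sketch is repeatable
for level, lam in [("High", 120), ("Moderate", 90), ("Low", 60)]:
    print(level, "mean:", lam,
          "mean + 2 sigma:", buffered_demand(lam),
          "one Poisson draw:", poisson_sample(lam, rng))
```

Planning against the mean plus 2 sigma rather than a single random draw keeps the staffing requirement stable from run to run while still covering most of the upper tail of the distribution.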

Next, the model identifies all possible shift patterns that meet the requirements given in Section 3.1 Primary Problem Requirements, along with the payroll cost of each shift pattern for a Junior, Intermediate, or Senior analyst. Specifically, the feasible shift patterns meet the following staffing constraints:

  • Analysts can work a range of 4 hours to 12 hours within a single shift.
  • Analysts must receive a minimum of 8 hours off work in-between all shifts.
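The two constraints above are enough to enumerate the one-day shift patterns by brute force. The sketch below is an illustration under stated assumptions, not the team's code: a day is six four-hour periods, a shift is a run of one to three consecutive working periods, rest between two shifts in the same day must be at least two periods, and rest across the day boundary is not checked. The names is_feasible and feasible_patterns are hypothetical.

```python
from itertools import groupby, product

PERIODS_PER_DAY = 6    # six four-hour periods
MAX_SHIFT_PERIODS = 3  # 12 hours
MIN_REST_PERIODS = 2   # 8 hours


def is_feasible(pattern):
    """Check one 0/1 pattern against the shift-length and rest constraints."""
    if not any(pattern):
        return False  # an all-off day is not a shift pattern
    runs = [(value, len(list(group))) for value, group in groupby(pattern)]
    for index, (value, length) in enumerate(runs):
        if value == 1 and length > MAX_SHIFT_PERIODS:
            return False
        is_interior = 0 < index < len(runs) - 1
        if value == 0 and is_interior and length < MIN_REST_PERIODS:
            return False  # rest between two shifts is too short
    return True


def feasible_patterns():
    """Enumerate every feasible one-day pattern as a tuple of 0s and 1s."""
    candidates = product((0, 1), repeat=PERIODS_PER_DAY)
    return [pattern for pattern in candidates if is_feasible(pattern)]


patterns = feasible_patterns()
print(len(patterns), "feasible one-day patterns")
```

With these assumptions the enumeration yields 30 patterns: 15 with a single shift and 15 with two shifts separated by at least 8 hours.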

Payroll costs for each shift pattern are calculated in Part 1: Input. In addition to the base rate for each employee, shown in Table 4: Analyst Alert Completion and Income Per Hour, two circumstances increase the payroll cost for each employee:

  • Analysts receive a 10% rate increase for working between 10 PM and 6 AM.
  • Analysts receive a 10% rate increase for 4-hour shifts.
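The cost of a single shift pattern can be sketched as follows. This is an illustration under loudly stated assumptions, not the team's calculation: the report does not say how the two premiums combine, so the sketch adds them (a 4-hour night shift earns 20% extra), applies each premium to the whole four-hour period, and treats the 10 PM to 2 AM and 2 AM to 6 AM periods as the night periods. The hourly rates come from Table 4; the names are hypothetical.

```python
from itertools import groupby

HOURS_PER_PERIOD = 4
HOURLY_RATE = {"Junior": 38, "Intermediate": 49, "Senior": 61}  # Table 4
NIGHT_PERIODS = {4, 5}  # zero-based: 22:00-02:00 and 02:00-06:00
PREMIUM = 0.10


def pattern_cost(pattern, analyst_type):
    """Payroll cost of one 0/1 shift pattern for one analyst type.

    Assumption: the night and short-shift premiums are additive.
    """
    rate = HOURLY_RATE[analyst_type]
    total = 0.0
    start = 0
    for working, group in groupby(pattern):
        length = len(list(group))
        if working:
            for period in range(start, start + length):
                multiplier = 1.0
                if period in NIGHT_PERIODS:
                    multiplier += PREMIUM   # night premium
                if length == 1:
                    multiplier += PREMIUM   # 4-hour shift premium
                total += rate * HOURS_PER_PERIOD * multiplier
        start += length
    return total


# An 8-hour Senior day shift (06:00-14:00) carries no premium.
print(pattern_cost((1, 1, 0, 0, 0, 0), "Senior"))
# A 4-hour Junior night shift (22:00-02:00) carries both premiums.
print(pattern_cost((0, 0, 0, 0, 1, 0), "Junior"))
```

If the premiums were instead compounded, the only change would be multiplying the two factors rather than adding them.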

The output of Part 1: Input is a .CSV file displaying all possible shift patterns that can be assigned to employees during a 24-hour period and the corresponding cost values.

Figure 4: Model Part 1 Output

Figure 4: Model Part 1 Output shows the output of Part 1: Input. There were thirty shift patterns that met the staffing requirements. The six columns of Time Period display a 0 or 1. A 0 shows that the employee is not working during the 4-hour period. A 1 shows the employee is working during the four-hour period. The file is conditionally formatted to highlight each 1 for easier viewing of the shift pattern. The final three columns display the cost of Junior, Intermediate, and Senior employees working the shift.

Each row displays a unique work schedule for the 24-hour period. Part 2: Optimize and Part 3: Assign both reference this table.

4.2.4 Model Part 2: Optimize

The feasible shift patterns from Figure 4: Model Part 1 Output are the main input of Part 2: Optimize. Part 2: Optimize focuses on minimizing cost and outputs the number of resources required to meet the alert demand from the initial Poisson distribution. Figure 5: Equations of Minimization Staffing Problem displays the equations used by the model.

Figure 5: Equations of Minimization Staffing Problem

The objective function above minimizes the payroll cost of the analysts while meeting the alert demand for each time period. Each analyst is at either the Junior, Intermediate, or Senior level, with an alert analysis rate shown in Table 4: Analyst Alert Completion and Income Per Hour. At least two analysts must be working during each time period, and at least one Senior analyst must work during each time period.

Analyst Type / Number of Alerts per Hour / Income per Hour
Junior / 8 Alerts / $38
Intermediate / 10 Alerts / $49
Senior / 13 Alerts / $61

Table 4: Analyst Alert Completion and Income Per Hour
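For readers without access to Figure 5, the constraints described in the text can be written as an integer program. The formulation below is a sketch consistent with the prose, not a transcription of the figure, and all symbols are assumptions introduced here: K is the set of analyst types, P the set of feasible shift patterns, T the set of four-hour time periods, c_{kp} the payroll cost of pattern p for type k, r_k the alerts per hour for type k (Table 4), a_{pt} equal to 1 if pattern p covers period t, d_t the alert demand in period t, and x_{kp} the number of type-k analysts assigned to pattern p.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{k \in K} \sum_{p \in P} c_{kp}\, x_{kp} \\
\text{s.t.}\quad
& \sum_{k \in K} \sum_{p \in P} 4\, r_k\, a_{pt}\, x_{kp} \ge d_t && \forall t \in T \\
& \sum_{k \in K} \sum_{p \in P} a_{pt}\, x_{kp} \ge 2 && \forall t \in T \\
& \sum_{p \in P} a_{pt}\, x_{\mathrm{Senior},\,p} \ge 1 && \forall t \in T \\
& x_{kp} \in \mathbb{Z}_{\ge 0} && \forall k \in K,\ p \in P
\end{aligned}
```

The factor 4 converts the hourly analysis rate into capacity per four-hour period, which relies on the assumption in Section 4.1 that analysts work the entire time period.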

The output of Part 2: Optimize is the minimum payroll cost, together with the shift patterns and the number of analysts needed to fulfill the demand of alerts requiring analysis. Two types of information are output: the number of employees of each type working a specific shift pattern, and the demand versus supply comparison. An initial test for the first day was run and documented to display the following output as an example.
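The demand versus supply comparison can be reproduced for any candidate staffing with a few lines of Python. This sketch is not the team's Gurobi model and does not optimize anything; it only checks a given assignment against the three requirements in the text (demand met, at least two analysts, at least one Senior). The analysis rates are from Table 4. The example assignment and demand values are hypothetical illustrations, not model results.

```python
ALERTS_PER_HOUR = {"Junior": 8, "Intermediate": 10, "Senior": 13}  # Table 4
HOURS_PER_PERIOD = 4


def check_staffing(assignment, demand):
    """Compare alert supply with demand for each four-hour period.

    assignment: {(analyst_type, pattern): number of analysts}
    demand:     alerts to investigate in each period
    """
    periods = len(demand)
    supply = [0] * periods
    headcount = [0] * periods
    seniors = [0] * periods
    for (analyst_type, pattern), count in assignment.items():
        capacity = ALERTS_PER_HOUR[analyst_type] * HOURS_PER_PERIOD
        for t, working in enumerate(pattern):
            if working:
                supply[t] += count * capacity
                headcount[t] += count
                if analyst_type == "Senior":
                    seniors[t] += count
    return [
        {
            "period": t,
            "demand": demand[t],
            "supply": supply[t],
            "ok": supply[t] >= demand[t] and headcount[t] >= 2 and seniors[t] >= 1,
        }
        for t in range(periods)
    ]


# Hypothetical example: mean demand per period and two Senior shift patterns.
example_demand = [480, 360, 480, 360, 240, 240]
example_assignment = {
    ("Senior", (1, 1, 1, 0, 0, 0)): 10,
    ("Senior", (0, 0, 0, 1, 1, 1)): 7,
}
for row in check_staffing(example_assignment, example_demand):
    print(row)
```

A check like this is useful for validating solver output independently of the optimization code, since it depends only on the shift patterns and Table 4.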