Omar Al Barghouthi, Regional Director, Middle East at Dragos, said: “Companies running data centres in today’s threat environment implicitly understand that cybersecurity is paramount. Data centres stand as a prime target for cybersecurity adversaries seeking to steal sensitive data and disrupt business operations. However, in most instances, the investments in data centre cybersecurity focus solely on the IT systems contained within the facility.”
Security architects assume that the operational technology (OT) control systems that power, cool, and otherwise run the building are naturally secure because they are theoretically disconnected from data centre networks. Meanwhile, data centre architects primarily concern themselves with physical security and disaster-related risks to these critical environments. And so, the industrial control system (ICS) infrastructure, comprising underlying systems such as building automation systems, electric power monitoring systems, HVAC controls, alarm systems, and entry badging systems, remains quietly overlooked from a cybersecurity perspective.
There is a lot at stake here — physical damage, legal costs, repair costs, reputation, and the impact that downtime has on operations and morale. Attacks on some OT systems even have wide-reaching health and safety implications. So, let’s establish five best practices that allow data centre security teams to avoid disaster.
- Know your environment
Knowing the environment in detail is the first step towards securing it. Everything from network topology and asset inventory to the minutiae of security tool configuration settings and the details of security policies should be documented and readily available to analysts. Gathering this information ad hoc during an incident drains time and budget. Platforms that provide comprehensive, granular views of the environment will accelerate investigations.
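As a rough illustration of what a documented, queryable asset inventory could look like, here is a minimal Python sketch. The `Asset` fields, the device names, and the segment labels are all hypothetical and are not taken from any particular platform; a real inventory would carry far more detail.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str     # hypothetical device identifier
    system: str   # e.g. "HVAC controls"
    segment: str  # hypothetical network segment label
    owner: str    # team to contact during an incident


def group_by_segment(assets):
    """Map each network segment to the names of the assets on it."""
    grouped = defaultdict(list)
    for asset in assets:
        grouped[asset.segment].append(asset.name)
    return dict(grouped)


# Illustrative entries only; every value here is made up for the example.
inventory = [
    Asset("chiller-plc-01", "HVAC controls", "ot-cooling", "facilities"),
    Asset("pdu-monitor-01", "electric power monitoring", "ot-power", "facilities"),
    Asset("badge-ctrl-01", "entry badging", "ot-access", "physical security"),
    Asset("chiller-hmi-01", "HVAC controls", "ot-cooling", "facilities"),
]

print(group_by_segment(inventory))
```

Grouping by segment is just one possible view; the point is that an analyst can answer "what sits on this network?" without assembling the list mid-incident.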
- Have a plan
Once the environment is understood in depth, the next step is to write Incident Response Plans (IRPs), which describe in detail the actions and roles that come into play when an incident occurs. They define what constitutes an incident, when flags should be raised, and with whom. They provide templates for documenting occurrences and actions, and lay the foundations for effective forensics and rapid business recovery while also issuing directives that maintain compliance throughout.
Since OT requirements differ significantly from those of IT, IRPs should keep OT and IT workflows separate. Core plant or manufacturing processes must be highlighted in the IRP, along with what data is collected from such systems, where it is stored, and for how long. This is important because most OT systems require data collection to be performed locally, owing to limited network bandwidth and a range of regulatory and operational requirements. So, any forensic analysis of an incident in an ICS environment will lean heavily on the IRP for data collection, especially to ensure that it does not clash with the organisation’s emergency operations plans (EOPs), which always override IRPs when safety concerns arise.
- Log everything
A formal collection management framework (CMF) is critical to the identification of available evidence because it reduces investigation time and highlights monitoring gaps. For modern incident response, sources should include Windows event logs, Active Directory authentication, Sysmon, PowerShell logging, firewall logging, and VPN authentication data. Documentation on configuration change management is also important, as are DNS query and response logs, DHCP, NetFlow, and Web proxy logs. And don’t forget distributed control system (DCS) or supervisory control and data acquisition (SCADA) environments, where communication protocols are often proprietary.
Focus on chokepoints and perimeter log collection, as well as east-west network traffic, and pay due attention to any third-party network connections, as they greatly broaden the definition of ‘perimeter’.
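To make the idea of a CMF that highlights monitoring gaps more concrete, here is a minimal Python sketch. It is not a standard CMF format: the `LogSource` fields, the zone labels, the sample retention figures, and the 90-day threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogSource:
    name: str            # e.g. "DNS query and response logs"
    zone: str            # hypothetical network zone label
    retention_days: int  # how long the data is kept
    collected: bool      # whether collection is in place today


def monitoring_gaps(sources, min_retention_days=90):
    """Return names of sources that are not collected or are kept too briefly.

    The 90-day default is an assumed threshold, not a recommendation.
    """
    return [
        source.name
        for source in sources
        if not source.collected or source.retention_days < min_retention_days
    ]


# Illustrative entries only; the figures are invented for the example.
cmf = [
    LogSource("Windows event logs", "IT", 180, True),
    LogSource("Firewall logs", "perimeter", 30, True),
    LogSource("DNS query and response logs", "IT", 90, True),
    LogSource("SCADA protocol traffic", "OT", 0, False),
]

print(monitoring_gaps(cmf))  # ['Firewall logs', 'SCADA protocol traffic']
```

Even a table this simple shows at a glance which evidence would be missing if an investigation started today.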
- Consider budgeting for an incident-response retainer
All business stakeholders should be aware that going without an incident response retainer carries risk. While business cases against retainers may make sense at the time, the absence of a retainer in a crisis can lead to escalating hourly rates for response services.
In the worst-case scenario, response services may not be available at all, as security teams contact security firm after security firm to find one with qualified resources on standby. For those who still think the risk is worth taking, a look at the headlines will show the possible downsides. It is also worth noting that some security partners will allow unused retainer hours to be diverted to other services, such as proactive threat hunting or penetration testing.
- Perform due diligence on incident response analysts
It should come as no surprise that not all security firms are created equal. The regional skills gap in cybersecurity means many newly qualified or underqualified people are serving in the field. Even IT security is under-resourced, but industrial environments can differ so wildly from data-centric IT that OT security specialists are even scarcer. And an ill-informed incident response team can often do more harm than good: inadvertently destroying evidence, scanning sensitive industrial devices without due care, and failing to provide industry-standard reporting. The solution: vet every candidate firm thoroughly and establish that its employees have substantial familiarity with industrial safety measures and equipment.
Safety first
The costs of neglecting OT security in data centres could be devastating. Fortunately, we now know how to protect ourselves. OT-focused threat actors think they are in for an easy ride. Let’s show them how wrong they are.