Wendy

Five ways to improve your backup and disaster recovery plan’s resilience against a cyber attack

Accenture recently reported that global cyber intrusion activity jumped 125% in the first half of 2021. These intrusions spanned all industries, from banking to tourism and hospitality services.


The report also stated that ransomware continues to top the attack category list, representing 38% of cyber intrusion events. In May, a ransomware attack on the largest fuel pipeline in the U.S. shut down 40% of the U.S. East Coast’s fuel supply and was labelled a national emergency.


Ransomware generally withholds access to data, or to an entire system, until a ransom fee is paid. This is typically achieved by encrypting drives, in addition to changing system passwords and permissions.


However, paying the ransom fee is no guarantee that your attacker will honour their side of the transaction. It also reinforces the profitability of such attacks, and may increase your likelihood of being targeted again. Major security organisations, including the FBI and the Australian Signals Directorate, do not recommend paying ransomware demands for these very reasons.


Security software is an important piece of the puzzle. But there is no such thing as a completely secure system.

Still, there’s no reason to get fatalistic about it – individuals and organisations should still take preventative actions to greatly reduce their risk. For those who fall victim to ransomware, there are grassroots community efforts such as No More Ransom to turn to, which offer free tools to assist those impacted by ransomware.


There is, of course, no guarantee that a decryption tool will get your data back in one piece. But with a sound backup and disaster recovery plan, you might not even need one.


1 – Maintain a register for tracking vendor vulnerabilities

Your solution is only as secure as the weakest link in your supply chain. Create a vendor product register with zero trust principles front and centre.


Consider:


  • Is the software you use auditable? Does it rely on a mixture of vendor and third party code? If so, does your vendor review and test this third party code and monitor it for security risks?

  • Can your hardware vendors provide a forensic chain of accountability for what components were made in which factories? Hardware trojans will go completely under the radar of your security software, so it’s important that you have insight over both aspects of your solution.


By creating and maintaining a detailed register of your software and hardware products, you will be better equipped to respond quickly to news of a compromised component before it impacts your own organisation. Hopefully, you won’t need to create this from scratch – instead, extend your organisation’s asset register to track as much of this information as possible.
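To make this concrete, here’s a minimal sketch of what such a register might look like in code. The record fields, vendor names and helper function below are illustrative assumptions rather than a prescribed format – the point is simply that structured records can be filtered the moment a vendor advisory lands.

from dataclasses import dataclass, field

@dataclass
class ComponentRecord:
    """One entry in a vendor product register (illustrative fields only)."""
    name: str                      # product or component name
    vendor: str                    # who supplies it
    kind: str                      # "software" or "hardware"
    version: str                   # currently deployed version
    third_party_code: bool         # does it bundle third-party code?
    provenance_documented: bool    # can the vendor account for where it was built?
    deployed_on: list = field(default_factory=list)  # hosts / sites using it

def affected_by(register, vendor=None, name=None):
    """Return every record matching a published advisory's vendor and/or product."""
    return [r for r in register
            if (vendor is None or r.vendor == vendor)
            and (name is None or r.name == name)]

# Example: a hypothetical advisory lands for "ExampleVendor"
register = [
    ComponentRecord("BackupAgent", "ExampleVendor", "software", "4.2",
                    third_party_code=True, provenance_documented=True,
                    deployed_on=["backup-01", "backup-02"]),
    ComponentRecord("EdgeSwitch", "OtherVendor", "hardware", "rev-B",
                    third_party_code=False, provenance_documented=False,
                    deployed_on=["rack-3"]),
]
for record in affected_by(register, vendor="ExampleVendor"):
    print(f"Review {record.name} {record.version} deployed on {record.deployed_on}")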


2 – Follow the 3-2-1 rule for backups

The 3-2-1 rule is simple, but very effective.


  • Maintain three copies of your data

  • On two different types of media

  • With one copy off site


Having at least one copy off site is particularly valuable when a network is compromised by malware. Its physical separation reduces the likelihood of that copy being exposed to the malware before the intrusion is detected, allowing you to restore your critical data from a ‘clean’ copy.
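If you want to sanity-check the rule automatically, a minimal sketch might look like the following (the inventory format is an assumption for illustration, not a standard):

# Minimal 3-2-1 compliance check over a hypothetical backup inventory.
backups = [
    {"location": "primary-dc",   "media": "disk",           "offsite": False},
    {"location": "nas-office",   "media": "disk",           "offsite": False},
    {"location": "cloud-object", "media": "object-storage", "offsite": True},
]

def check_3_2_1(inventory):
    copies = len(inventory)
    media_types = {b["media"] for b in inventory}
    offsite_copies = sum(1 for b in inventory if b["offsite"])
    return {
        "three_copies": copies >= 3,
        "two_media":    len(media_types) >= 2,
        "one_offsite":  offsite_copies >= 1,
    }

print(check_3_2_1(backups))  # {'three_copies': True, 'two_media': True, 'one_offsite': True}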


Of course, if malware lurks undetected for some time in a system, it may be included with your backups as well. Deciding how to handle this is a matter of balance. You might choose to maintain a number of historical snapshots of your data over time, so that you can restore part or all of it from months earlier if need be. Or you might decide to increase investment in security monitoring software and business practices that increase your chance of detecting an intrusion early.


Another way to prevent your most critical backups from being tainted by malware is to make those backups immutable.
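For example, object stores that support write-once retention locks can enforce immutability for you. The sketch below assumes an S3 (or S3-compatible) bucket that was created with Object Lock enabled; the bucket name, key and retention period are placeholders.

import datetime
import boto3

# Write a backup object with a compliance-mode retention lock.
s3 = boto3.client("s3")

retain_until = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=90)

with open("critical-backup.tar.gz", "rb") as backup_file:
    s3.put_object(
        Bucket="example-backup-bucket",             # placeholder bucket with Object Lock enabled
        Key="2021-09/critical-backup.tar.gz",       # placeholder key
        Body=backup_file,
        ObjectLockMode="COMPLIANCE",                # retention cannot be shortened or removed
        ObjectLockRetainUntilDate=retain_until,     # object is immutable until this date
    )

In compliance mode the lock cannot be lifted before the retention date, even with administrative credentials – which is exactly the property you want if an attacker gains control of your accounts.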

 

3 – Have systems in place to check and maintain the integrity of your backup data

Avoid the heartbreak of restoring from backup only to discover that your copies are corrupted and completely unusable: put systems in place to ensure your data remains pristine and free from bit rot.


Bit rot (where individual bits in your data flip due to transient faults or gradual degradation of the storage media) is partially addressed by the 3-2-1 rule: the more copies of your data, and the more locations they are stored in, the less likely it is that all copies will be corrupted at the same time.


However, particularly if you count your production data as one of the copies in your 3-2-1 implementation, I'd recommend additional measures to verify and maintain your data integrity.
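One simple, storage-agnostic measure is a checksum manifest: record a hash of every file when the backup is written, then re-verify before you ever need to restore. The paths below are hypothetical, and this is a sketch rather than a full tool.

import hashlib
import json
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large backups don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(backup_dir, manifest_path):
    """Record a checksum for every file in the backup at the time it is taken."""
    manifest = {str(p): sha256_of(p) for p in Path(backup_dir).rglob("*") if p.is_file()}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))

def verify_manifest(manifest_path):
    """Re-hash every file and report anything that has silently changed (bit rot)."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [path for path, expected in manifest.items() if sha256_of(path) != expected]

# Hypothetical paths for illustration:
# write_manifest("/backups/2021-09-01", "/backups/2021-09-01.manifest.json")
# print(verify_manifest("/backups/2021-09-01.manifest.json"))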


Ceph, which offers unified object, block and file storage along with space-saving erasure coding as an alternative to straight replication, provides inbuilt protection against bit rot through ‘data scrubs’. Schedule regular scrubs to catch inconsistencies in object metadata across OSDs (Ceph’s storage daemons), and deep scrubs to do a bit-for-bit comparison of your replicated data to detect errors (the latter is I/O intensive, so keep that in mind when planning your scrub schedule).
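Scrub scheduling is controlled through Ceph’s configuration options. As a sketch – the option names below exist in recent Ceph releases, but check the documentation for your version before applying them, and treat the values as examples only:

import subprocess

def ceph_config_set(section, option, value):
    """Apply a Ceph config option from a host with admin access to the cluster."""
    subprocess.run(["ceph", "config", "set", section, option, str(value)], check=True)

ceph_config_set("osd", "osd_scrub_begin_hour", 22)         # start light scrubs at 10pm
ceph_config_set("osd", "osd_scrub_end_hour", 6)            # finish by 6am
ceph_config_set("osd", "osd_deep_scrub_interval", 604800)  # deep scrub each PG weekly (seconds)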


4 – Reduce single points of failure at every step of your backup solution

If you have five copies of your most critical data but they all become unavailable at once, you’re no better off than if you’d been maintaining a single copy. Whether the single point of failure is a power supply, a network connection, or a storage device, eliminating it is just as important as maintaining an adequate number of copies of your data.


Yes, this is another extension of the 3-2-1 rule: one that prioritises data availability along with its durability. Maintaining an off site copy and using two or more storage media is a start, but when downtime is not negotiable, be sure that your backups will remain available.

This can be achieved by maintaining more than one off site backup location, monitoring for hardware faults, and planning for the performance bottlenecks that could occur should a failure domain become unavailable.


This is where Ceph can again make your life easier. Ceph’s CRUSH (Controlled Replication Under Scalable Hashing) map allows complete control over where data is replicated – at the site, rack, device, and even storage drive level. Ceph uses its CRUSH rules to automatically place replicas across failure domains, and will immediately respond to the loss of a storage location – again, whether it is a storage drive, device, rack or entire data centre – by re-distributing the replicated data from the failed location to the remaining available storage.
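As a sketch of what this looks like in practice – the pool and rule names below are placeholders, and placement-group counts depend entirely on your cluster, so treat the numbers as examples only:

import subprocess

def ceph(*args):
    """Run a ceph CLI command from a host with admin credentials."""
    subprocess.run(["ceph", *args], check=True)

# Keep three replicas of a backup pool, with each replica forced onto a different rack.
ceph("osd", "crush", "rule", "create-replicated", "backup-per-rack", "default", "rack")
ceph("osd", "pool", "create", "backup-pool", "128", "128", "replicated", "backup-per-rack")
ceph("osd", "pool", "set", "backup-pool", "size", "3")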


5 – Ensure you know the true cost and duration involved with restoring from backups


I’ve focused on ways to ensure your backups are auditable, intact and available, but one other aspect of preparing your backup and disaster recovery solution against cyber attacks remains: that tricky ‘recovery’ aspect.


Depending on how you have chosen to implement your backup solution, the time it will take to restore from backup (and the potential expenses) can vary considerably. It’s for this reason that your backup testing plan should cover not only the how, but also the how long and the how much.


As a starting point – if you have to restore your critical data over a WAN connection, how long will that take? How long can you afford it to take? (In 2014, Gartner estimated the cost of downtime at an average of US$5,600 per minute.) Are you likely to incur excess network traffic charges from your internet provider?
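A back-of-the-envelope estimate is easy to automate. In the sketch below the dataset size, link speed and downtime cost are placeholders – substitute your own figures:

# Back-of-the-envelope restore estimate. All figures are illustrative placeholders.
dataset_tb = 5                      # size of the critical data to restore
wan_mbps = 500                      # usable WAN bandwidth in megabits per second
downtime_cost_per_minute = 5600     # e.g. Gartner's 2014 average, in US dollars

dataset_megabits = dataset_tb * 1_000_000 * 8   # TB -> megabits (decimal units)
restore_minutes = dataset_megabits / wan_mbps / 60

print(f"Estimated restore time: {restore_minutes / 60:.1f} hours")
print(f"Estimated downtime cost: ${restore_minutes * downtime_cost_per_minute:,.0f}")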


If restoring from backups does incur additional expenses, then this needs to be factored into your operational budget, to ensure that backup testing is fully accounted for. Commit to a regular backup restoration test schedule for predictable performance.


For smaller teams, backup testing can seem like a daunting and time consuming task. In these cases, it may be worth budgeting for extra support from data protection companies, who can assist with automating the restoration process.


