Wendy White

Is it safe to use open source SDS for your backup and data recovery?

Losing your data can mean losing your business. But backup and data recovery are growing more complex. Open source technologies give organisations a cost-effective way to access a greater range of tools and expert knowledge, but are they really safe to use?

With increased global collaboration, working from home arrangements and digital-first approaches to service delivery, organisations are managing more data than ever before – and the integrity of that data is more critical than ever.


How do organisations deal with this? Typically, by adding more protection tools: extra backup servers, virtualisation technologies, or locked-down network devices, to name a few. (That last one is not so easy in a bring your own device (BYOD) world, a practice that has become more common as more employees work from home.) And, of course, there are cloud backup and archiving services.

Each approach to backup and disaster recovery requires a separate set of skills, not to mention a separate set of management interfaces. This complexity can lead to both data silos and risk.

Managing a mixture of local file shares, cloud-based shares, data lakes, application databases, virtual machines, code repositories, public cloud services and edge data repositories?


Yeah. There’s a reason for that headache you’re experiencing right now.

Many organisations are coming to the realisation that the answer is software-defined.

Software-defined storage (SDS) enables use-driven customisation, scalability and optimisation beyond what a traditional solution offers. That’s way more effective than paracetamol when it comes to your storage-induced headaches.


A single platform to manage and protect your entire physical and virtual infrastructure? You mean, I can perform centralised monitoring, configuration and analytics across my solution?


Look, it’s fine. Give me a moment.


I’ve just got something in my eyes.


Okay.


There’s a variety of ways to implement SDS. There’s the proprietary hardware/software approach, though if you’re reading this article, you’re probably at the very least considering a solution that makes use of open source SDS. There are a few different paths you can take when it comes to designing an open source SDS solution, which you can read about in detail in Architecting IT’s eBook, Validating Software-Defined Storage Operating Models for The Enterprise.


So, let’s say you’re keen to try open source SDS. Perhaps it’s the lack of licensing fees, the absence of vendor lock-in, or the level of customisation you can achieve (likely all of the above). But there’s at least one thing holding you back from taking the next step – the reason you’re reading this article to begin with.

With open source architecture in the news as a recurring target for cyber attacks, is it safe to use in an enterprise environment?

Well, before we address the main question, let me nit-pick a moment: no solution, proprietary, open source, or carefully hand-crafted in your underground cyber lair from scratch, is guaranteed immune to cyber attacks. As folks on both sides of the attacks know: it’s only a matter of time and resources.


Instead, what we should be thinking about is:


  • your appeal as a target, and

  • your capacity to implement a zero trust environment.


Your appeal as a target

While there are hackers out there who will target big-name corporates and government agencies, whether it’s for the potential pay-off or the notoriety, those attacks are going to be just as ruthless regardless of the way you’ve implemented your SDS backup and data recovery solution. The less movie-plot-worthy reality is, however, that most attacks come as automated probes seeking out vulnerabilities. When a new security loophole is discovered, cyber criminals will seek to exploit that vulnerability in any system that has it, whether it’s a local council’s community portal or a small retail chain’s website. The same is true for your SDS solution.


It’s not personal. It’s just convenient.


Want to reduce your appeal as a target? Keep your software patches up-to-date. Very recently GitLab servers were being exploited in DDoS attacks in excess of 1 Tbps – but the vulnerability behind this attack had already been addressed in a patch seven months earlier. Anyone who had already applied the patch was at no risk of this hijacking succeeding.
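If you want to make "keep your patches up-to-date" a little more concrete, here is a minimal Python sketch of a patch-level check. Everything in it is an assumption made for illustration: the server names, the version numbers and the `FIRST_FIXED` release are hypothetical, and real projects often use version schemes that need a proper parser.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.10.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_patched(installed, first_fixed):
    """True if the installed version is at or beyond the first fixed release."""
    # Comparing tuples of ints avoids the string-comparison trap
    # where '2.10.0' would sort before '2.4.0'.
    return parse_version(installed) >= parse_version(first_fixed)


# Hypothetical inventory: server name -> installed version.
inventory = {
    "server-a": "2.4.1",
    "server-b": "2.3.9",
    "server-c": "2.10.0",
}

# Hypothetical first release that contains the fix.
FIRST_FIXED = "2.4.0"

unpatched = sorted(
    name for name, version in inventory.items()
    if not is_patched(version, FIRST_FIXED)
)
print(unpatched)
```

Anything that ends up in `unpatched` is still exposed to the automated probes described above, so that list is where the patching effort should go first.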


This is one of the many great things about open source software, especially projects with thriving and active communities such as Kubernetes or Ceph. There are many, many eyeballs looking over the underlying code, catching new issues rapidly. Once a vulnerability is identified, work on a patch can begin immediately – and there are no new licence fees to pay in order to keep your software up-to-date.


Your capacity to implement a zero trust environment

Implicit trust is how we still do a lot of things in the tech world today. We trust that our vendors have provided us what we’ve asked for – and nothing more. We trust that our email addresses and credit card details will be handled appropriately by those who come into contact with them. We trust that the supply chain hasn’t been compromised. And if our trust is proven to be unwarranted – we trust that our backup and disaster recovery systems will mitigate the potential fallout.


By comparison, a zero trust environment enforces clearly defined authentication and security rules, allowing for micro segmentation of resources. Every stage of an interaction, digital or physical, is verified. (Read more about zero trust here).
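To illustrate the idea, here is a minimal Python sketch of a deny-by-default check, which is the core habit of a zero trust approach: nothing is allowed unless the request is authenticated and an explicit rule permits it. The identities, segment names and the `ALLOWED` rule set are all hypothetical, and a real deployment would rely on a dedicated policy engine rather than a hard-coded set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    identity: str        # who is asking
    authenticated: bool  # has the identity been verified for this request?
    segment: str         # which micro-segment of resources is targeted
    action: str          # what the caller wants to do


# Hypothetical allow-list of (identity, segment, action) rules.
# Anything not listed here is denied.
ALLOWED = {
    ("backup-agent", "backup-storage", "write"),
    ("restore-operator", "backup-storage", "read"),
}


def is_allowed(request):
    """Deny by default: verify the caller first, then require an explicit rule."""
    if not request.authenticated:
        return False
    return (request.identity, request.segment, request.action) in ALLOWED


print(is_allowed(Request("backup-agent", True, "backup-storage", "write")))
print(is_allowed(Request("backup-agent", True, "backup-storage", "read")))
```

In this sketch the backup agent can write backups but cannot read them back, so a compromised agent has a much smaller blast radius than it would in an implicit-trust setup.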


For this to work, auditability is key. But how exactly do you scrutinise every line of code and every piece of hardware in your solution? It’s highly unlikely you’ll be able to do this with proprietary software – your vendor will instead require you to trust that they will be alert to new vulnerabilities and supply (and potentially install) patches in a timely fashion. This is, of course, not the case with open source software, which is available for close scrutiny by as many eyeballs as you care to throw at it – not to mention the eyeballs of the rest of the community for that software.


Auditing the hardware used for your open source SDS solution can be more difficult, however – though there is one company out there offering a complete forensic chain of accountability for every product they manufacture.


While implementing a zero trust environment and minimising your appeal as a target are the two critical elements of implementing a secure solution, there’s one other thing to consider if you’re still on the fence about open source SDS…


Open source software is the backbone of the internet.

Most internet services have been built on open protocols. The domain name system (DNS) is administered in the open, its rules community-driven. Apache is the most widely used web server software. Linux is without a doubt the most commonly used operating system for web-facing computers, and much more besides.


This isn’t a coincidence. Open source software underpins the majority of the internet for a reason. The strength of the community behind each of these large open source projects makes them ultimately more reliable and more secure than the alternatives, with the benefit of freedom from forcing users to be beholden to a single corporate interest. They have a life beyond a single person or organisation, one that depends on being as useful and relevant to their user base as possible.


A new open source SDS project just started up, with a total of three commits to its codebase? Sure, maybe don’t line that one up for your enterprise solution just yet. But an open source SDS with a thriving community behind it, such as Ceph, is not a shot in the dark. It’s a powerful tool that’s perfect for use in a zero trust approach to building a secure, reliable storage solution for your backup and disaster recovery needs.

Okay, so, the astute amongst you will have noticed I've mentioned Ceph a few times so far - I've actually recently become a Ceph Ambassador, as my current work has awakened a deeper interest in cloud infrastructure, and that led me to learning about Ceph - a very, very clever open-source distributed storage system. Note: This is a version of a post I originally wrote for the SoftIron blog, but the original post is no longer available so I am re-publishing it here.
