You, me and everyone else hated seeing this thing!
By Original uploader was Praseodymium at en.wikipedia – Transferred from en.wikipedia to Commons by Reader781 using CommonsHelper., Public Domain, https://commons.wikimedia.org/w/index.php?curid=4335321
Want to instill fear into any tech nerd or guru? Mention the Blue Screen of Death.
During Windows’s heyday, having your screen turn the color of a Smurf often meant a complete loss of your unsaved work.
I’ve heard countless groans, curses, and outright fits started by the BSoD. It was the absolute bane of our collective existence for over a decade.
In this article, you will learn:
- Why you should check your backups regularly.
- How to set up a recovery plan.
- How to create a backup-friendly mentality within your team.
Let’s talk about it!
Check your backups before it's too late
Backups and recovery plans are the fire extinguishers of the tech world. If you need one, you’d better hope it works. You don’t want a spreading fire to be your reminder to check if the extinguisher is charged and ready to use.
The same goes for backups and recovery plans. You have to be sure they will work ahead of time.
That is why it’s best to consult an expert, like my good friend and CFWhisperer Mike Brunt. His advice helps establish good habits that keep your backups useful and ready.
- Know your backups in and out
Vigilant CIOs can recite their company’s backup plan from memory. They know it as well as the employees who set it up. (The best CIOs set up the backups themselves.)
For in-house backups, that means identifying which parts of the company’s data are on which backup location. That could be either a single disk or an entire server.
Data centers and other hosted services may include automated backups as part of their offerings. Some back up a web server, others a data server. Brunt suggests you check the details in the fine print.
- Test your backups at least once a month
This may strike some as rudimentary. Yet few CIOs have a backup test built into their team’s workflow or routine.
- Don’t wait!
If you’re shouting “Do we have a backup?”, it’s already too late. Plan ahead!
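A monthly test can be as simple as restoring the backup to a scratch location and confirming that every file came back intact. Here is a minimal sketch in Python, assuming a checksum manifest was recorded when the backup was taken. The function names (`sha256_of`, `build_manifest`, `verify_restore`) are my own illustrations, not part of any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_restore(manifest: dict[str, str], restored_root: Path) -> list[str]:
    """Compare a restored copy against the manifest taken at backup time.

    Returns a list of problems; an empty list means the test passed.
    """
    restored = build_manifest(restored_root)
    problems = []
    for name, checksum in manifest.items():
        if name not in restored:
            problems.append(f"missing: {name}")
        elif restored[name] != checksum:
            problems.append(f"corrupt: {name}")
    return problems
```

Run `build_manifest` on the live data when the backup is taken, store the result alongside it, and call `verify_restore` after each test restore. Anything other than an empty list is your cue to investigate before a real emergency does it for you.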
What is a Disaster Recovery Plan?
Many steps separate your company’s stable operations from a complete system meltdown where everything goes haywire. And sometimes, unfortunately, the latter happens. Disaster Recovery Plans are the antidote to those excruciating moments. They help your company return to “normal” as soon as possible, without any sign that something went wrong.
A professional disaster recovery plan eliminates panic when everything seems to be on the fritz. The recovery plan is your roadmap out of a total breakdown.
At its best, an IT disaster recovery plan:
- Inventories all hardware and software, vital or trivial.
- Follows the company’s priorities. It will restore key components necessary to keep the essential operations functioning.
- Predicts scenarios where key components of a system fail individually, in unison, or in random groupings.
- Minimizes costly outages and delegates tasks.
- Restores components in a logical fashion. For example, hardware necessary to keep data flowing gets top priority, and less critical components follow further down the line.
Be sure to create the recovery plan in collaboration with your company’s business continuity plan.
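To make “restores components in a logical fashion” concrete, here is a rough Python sketch that derives a restore order from an inventory. The component names, dependencies, and priority numbers are all invented for illustration; your own inventory will look different. It uses the standard library’s `graphlib.TopologicalSorter` so nothing is restored before the things it depends on, and breaks ties by business priority.

```python
from graphlib import TopologicalSorter

# Hypothetical inventory: each component lists what must be restored before it.
DEPENDS_ON = {
    "network": [],
    "storage": [],
    "database": ["network", "storage"],
    "app_server": ["database"],
    "reporting": ["database"],
}

# Hypothetical business priority: a lower number means "restore sooner".
PRIORITY = {"network": 1, "storage": 1, "database": 2, "app_server": 2, "reporting": 5}


def restore_order(depends_on, priority):
    """Return components in an order that respects dependencies,
    always picking the highest-priority component that is ready."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order = []
    ready = list(sorter.get_ready())
    while ready:
        ready.sort(key=lambda name: (priority[name], name))
        current = ready.pop(0)
        order.append(current)
        sorter.done(current)  # may unlock components that depended on it
        ready.extend(sorter.get_ready())
    return order


print(restore_order(DEPENDS_ON, PRIORITY))
# ['network', 'storage', 'database', 'app_server', 'reporting']
```

A side benefit of writing the inventory down this way: a circular dependency, which would stall a real recovery, shows up as an error long before disaster strikes.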
How to create a reliable recovery plan
Every recovery plan, like every company, is unique. There are no universal plans to copy-paste. But your plan should have three goals in mind:
- Averting disaster altogether, by creating a fail-proof system of backups and other safety mechanisms.
- Discovering failures in real time, rather than waiting for the whole system to collapse. This could mean regular check-ups or routine tests to find new threats or weaknesses in the company’s infrastructure.
- Correcting whatever the failure was to bring systems back to their normal state and recovering from the disaster.
How does a company get to this point?
Creating a recovery plan requires a team effort, with several members assigned specific roles. When working in tandem, these team members can help focus efforts and coordinate the overall creation of the plan.
You will need:
- A Disaster Recovery Coordinator
As the title implies, this person takes the helm as soon as the alarms go off. This must be someone with the technical skills, quick pace, experience, and analytical abilities to see a disaster recovery through. It’s a high-level role in times of crisis, demanding poise and broad knowledge of the company’s IT infrastructure. It may be best for you, the CIO, to take on the role within smaller organizations.
- An auditor
Similar to an inspector, the auditor must “kick the tires,” so to speak, by following the plan through from start to finish. The auditor must assess the stability and viability of the plan, as well as the quality of the team members assigned to each task.
The auditor also tests the reliability of the recovery plan every time it is updated or changed.
- Outside teams working in tandem
Although not directly involved in the disaster recovery efforts, other groups within the company must still work in tandem with the disaster recovery group to keep the plan current and ensure it is executed properly. This means developers, engineers, and record keepers must all ensure any changes they make to infrastructure or vendor contracts are reflected in the recovery plan.
You must design your recovery plan around two concepts: the RTO and the RPO. What’s in the alphabet soup?
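- RTO refers to the Recovery Time Objective: the maximum acceptable time for your system to return to full operation after a failure.
- RPO is the Recovery Point Objective: the maximum amount of recent data, measured in time, that you can afford to lose. It determines which in the long line of backups you return to, and how often backups must be taken.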
Ideally, your recovery plan will have an accurate “worst-case scenario” map leading to a preselected RTO and RPO. This means backups must be coordinated and set to target specific data and apps. And team members must be in tune and ready to execute as the plan requires.
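Here is a small Python sketch of how a worst-case scenario can be checked against a chosen RTO and RPO. The backup schedule, failure time, restore duration, and both objectives below are made-up numbers, and the function names are hypothetical; plug in figures from your own dry runs.

```python
from datetime import datetime, timedelta


def pick_restore_point(backups, failure_time):
    """Return the most recent backup taken at or before the failure, or None."""
    candidates = [taken for taken in backups if taken <= failure_time]
    return max(candidates) if candidates else None


def check_scenario(backups, failure_time, restore_duration, rpo, rto):
    """Compare one failure scenario against the RPO and RTO."""
    restore_point = pick_restore_point(backups, failure_time)
    # Data loss is everything written between the restore point and the failure.
    data_loss = None if restore_point is None else failure_time - restore_point
    return {
        "restore_point": restore_point,
        "data_loss": data_loss,
        "rpo_met": data_loss is not None and data_loss <= rpo,
        "rto_met": restore_duration <= rto,
    }


# Made-up scenario: nightly backups at 02:00, failure late in the afternoon.
nightly = [datetime(2024, 3, day, 2, 0) for day in (1, 2, 3)]
report = check_scenario(
    backups=nightly,
    failure_time=datetime(2024, 3, 3, 17, 30),
    restore_duration=timedelta(hours=5),  # measured in a dry run
    rpo=timedelta(hours=24),
    rto=timedelta(hours=4),
)
print(report["restore_point"])  # 2024-03-03 02:00:00
print(report["data_loss"])      # 15:30:00
print(report["rpo_met"], report["rto_met"])  # True False
```

In this invented scenario the nightly schedule satisfies a 24-hour RPO, but a five-hour restore misses a four-hour RTO. That gap tells you to either speed up the restore or renegotiate the objective, well before a real outage forces the question.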
Before building the step-by-step process that’ll become your disaster recovery plan, you must:
- Assess your company’s IT systems and infrastructure:
  - Start with a top-to-bottom audit of both hardware and software, from surge protectors to cooling fans. Every department should have a guru who specializes in a certain chunk of your company’s IT machine.
  - Seek out weaknesses, whether they be dicey code or nuts-and-bolts problems such as power cables and surge protectors.
  - Fix any immediate dangers or looming headaches, for example an aging server in need of an update.
  - Make note of areas where your infrastructure is likely to fail. A list of suspect areas will make diagnostics easier in case of a failure.
- Make sure every group within the company chooses its own top priorities within the recovery plan.
- Collect and inventory all data which must be included in the recovery.
Once these three key data points are collected, you can start building a recovery plan. Assume, first, a worst-case scenario. Assign one for every department within the company. (A total disaster for IT and a total disaster for accounting are vastly different, I promise.)
Then make sure the backups are adequate and reflect those priorities. Be sure the step-by-step process is checked by all the necessary department heads, to ensure nothing important is left out.
Finally, hand everything over to the disaster recovery coordinator and auditor. Let them run tests on simulated meltdowns. These “dry runs” should be done regularly, and repeated with every major change introduced by a new acquisition or change in policy.
Yes, you really need to back up.
It feels ridiculous to even write this in this day and age. But you’d be amazed at the dumbfounded looks I encounter when I ask potential clients to describe their backup and recovery plans.
So please, let me reiterate:
- Make sure your team takes backups and recovery plans as seriously as any other part of their job. The mandate to monitor and back up data comes from the top. So model the seriousness you want to see in your team.
- Backups are the best way to recover data — you can’t rebuild your castle the same way twice.
- Having a stable, ready backup and a recovery plan removes a major worry from your day-to-day work life.
- A capable and ready backup and recovery plan shows initiative and forward-thinking to the higher-ups who sign your paycheck.
There are many more reasons to aim for a reliable data backup and recovery plan. The most important one: not having a plan could cost you your job. By creating a reliable backup and recovery plan, you’ll ensure your company doesn’t find itself facing the modern equivalent of the Blue Screen of Death.
And to continue learning how to make your ColdFusion apps more modern and alive, I encourage you to download our free ColdFusion Alive Best Practices Checklist.
Because… perhaps you are responsible for a mission-critical or revenue-generating CF application that you don’t trust 100%, where implementing new features is a painful ad-hoc process with slow turnaround even for simple requests.
What if you have no contingency plan for a sudden developer departure or a server outage? Perhaps every time a new freelancer works on your site, something breaks. Or your application availability, security, and reliability are poor.
And if you are depending on ColdFusion for your job, then you can’t afford to let your CF development methods die on the vine.
You’re making a high-stakes bet that everything is going to be OK using the same old app creation ways in that one language — forever.
All it would take is for your fellow CF developer to quit or for your CIO to decide to leave the (falsely) perceived sinking ship of CFML and you could lose everything—your project, your hard-won CF skills, and possibly even your job.
Luckily, there are a number of simple, logical steps you can take now to protect yourself from these obvious risks.
No Brainer ColdFusion Best Practices to Ensure You Thrive No Matter What Happens Next
Modern ColdFusion development best practices that reduce stress, inefficiency, and project lifecycle costs while simultaneously increasing project velocity and innovation.
√ Easily create a consistent server architecture across development, testing, and production
√ A modern test environment to prevent bugs from spreading
√ Automated continuous integration tools that work well with CF
√ A portable development environment baked into your codebase… for free!
Learn about these and many more strategies in our free ColdFusion Alive Best Practices Checklist.