by Brian McGinty.
Part 4: Using Runbooks for Disaster Recovery, in our 5-part series on ways to get your Disaster Recovery Plan back on track.
In part 3 of our series we discussed viewing data. In this post I am going to talk about Disaster Recovery runbooks. While a good part of your Disaster Recovery charter is assembling and organizing a great deal of data, there is another major piece – Disaster Recovery planning and testing.
To support a viable Disaster Recovery exercise, you must demonstrate on a regular basis the ability to successfully fail the business over to a DR site.
The detailed recovery plans (runbooks) can be the most challenging part because they are created manually and depend on different groups to build and maintain them.
Part of the evolution of TransitionManager (TM) over the last 10 years has centered on the TransitionManager Task Management system. Since TM’s roots are in the migration of applications from one site to another, it has the built-in capability to document and hold all the steps needed to get your applications from one site to another, regardless of migration method. Then, when you are ready to execute your DR plan, the steps needed to make it happen are viewed and managed through the TransitionManager Task Manager interface (My Tasks, Task Manager, Task Graph, Task Timeline). For any DR event you will have tasks defined to support safe recovery of your systems. The TM Task Management system will track and manage the progress of activity. The granularity of the tasks you track is up to you.
One of the first questions that comes up is: does TransitionManager execute the task? The answer is no. TM generates the tasks, but there are no APIs into your environment to actually execute, for example, the “Shutdown Server” task. While we can automatically generate the runbook, the runbook itself does not execute tasks. TransitionManager orchestrates and tracks activity as described above. Assigned resources are responsible for starting, managing, and completing Runbook tasks.
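To make the split between tracking and executing concrete, here is a minimal Python sketch of a tracked task. It is not TransitionManager code: the `Task` class, the `Status` names, and the `progress` helper are all hypothetical, chosen only to illustrate that a person moves the task through its states while the system records where things stand.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Hypothetical lifecycle states; the real product may use different ones.
    PENDING = "pending"
    STARTED = "started"
    COMPLETED = "completed"


@dataclass
class Task:
    title: str
    assignee: str
    status: Status = Status.PENDING

    def start(self) -> None:
        # Called when the assigned person begins the work.
        if self.status is not Status.PENDING:
            raise ValueError(f"cannot start a task that is {self.status.value}")
        self.status = Status.STARTED

    def complete(self) -> None:
        # Called when the assigned person reports the work as done.
        if self.status is not Status.STARTED:
            raise ValueError(f"cannot complete a task that is {self.status.value}")
        self.status = Status.COMPLETED


def progress(tasks: list[Task]) -> float:
    """Return the fraction of tasks that have been completed."""
    if not tasks:
        return 0.0
    return sum(t.status is Status.COMPLETED for t in tasks) / len(tasks)


tasks = [
    Task("Shutdown Server", assignee="alice"),
    Task("Verify application at DR site", assignee="bob"),
]
tasks[0].start()
tasks[0].complete()
print(progress(tasks))
```

Nothing in the sketch touches a server. The tracker only changes state when a person reports it, which mirrors the division of labor described above.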
What enables and builds the tasks and plan is the Recipe. The TM Recipe holds all the data rules your organization has defined to create the steps needed to execute the DR event. To enable as automated a process as possible, our team will work with you to identify those steps that are consistently used based on asset type.
Some simple examples:
The list can go on and on, but the key is that for any asset defined and captured in the TransitionManager repository, you can define a step to be generated based on the existence or absence of that asset or a characteristic of the asset.
Once the ruleset is defined, it is incorporated into a Recipe and the Recipe is run against the current in-scope asset list. The result is a Runbook that is visible as a table of tasks, a graphic view of tasks, or a timeline. If manual changes are needed for some reason, they can be made. But the objective is to define a ruleset that essentially allows the automatic generation of a Runbook when that ruleset is applied to the asset list in TM (or a defined subset of it).
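The idea of applying a ruleset to an asset list can be sketched in a few lines of Python. This is an illustration only and does not reflect the actual TM Recipe syntax: the `Asset` and `Rule` classes, the `generate_runbook` function, the `replicated` attribute, and the sample asset names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Asset:
    name: str
    asset_type: str                      # e.g. "server" or "database"
    attributes: dict = field(default_factory=dict)


@dataclass
class Rule:
    description: str
    applies_to: Callable[[Asset], bool]  # predicate on an asset
    task_template: str                   # "{name}" is filled with the asset name


def generate_runbook(assets: list[Asset], rules: list[Rule]) -> list[str]:
    """Apply each rule to each in-scope asset and collect the generated tasks."""
    tasks = []
    for rule in rules:
        for asset in assets:
            if rule.applies_to(asset):
                tasks.append(rule.task_template.format(name=asset.name))
    return tasks


# Hypothetical ruleset: one rule keyed on asset type, two keyed on the
# presence or absence of a characteristic.
rules = [
    Rule("Every server gets a shutdown step",
         lambda a: a.asset_type == "server",
         "Shutdown Server {name}"),
    Rule("Replicated databases get a replica check",
         lambda a: a.asset_type == "database" and a.attributes.get("replicated", False),
         "Verify replica for {name}"),
    Rule("Databases without replication get a restore step",
         lambda a: a.asset_type == "database" and not a.attributes.get("replicated", False),
         "Restore {name} from backup"),
]

assets = [
    Asset("web01", "server"),
    Asset("ordersdb", "database", {"replicated": True}),
    Asset("archivedb", "database"),
]

runbook = generate_runbook(assets, rules)
for step in runbook:
    print(step)
```

Because the tasks are derived from the asset list each time, rerunning the same rules after the inventory changes yields an updated task list without hand editing.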
Below is an example of what a completed Task Graph looks like. Similar to the Dependency Analyzer, you can zoom into any task for more detail.
In part 5 of this series, I will talk about other areas in your organization where TransitionManager can make a difference to your Disaster Recovery process.