IT4K12 – Disaster Recovery

November 17, 2016

These are my notes from various sessions at IT4K12 in Vancouver, Nov 17-18, 2016. They may be messy, there may be mistakes, and they may not be exactly what the presenter wanted to be remembered, but it’s what resonated with me. – Todd

Gregg Ferrie, Director of Information Technology
gferrie@sd63.bc.ca

Gregg reminded us of Murphy’s Law, where anything that can go wrong, will go wrong. Gregg does worry about what happens if we don’t have good DR. Gregg had piles of great examples and ideas about things to consider for a good DR and BC plan.

What kinds of disasters do we worry about? Well, on the island certainly we worry about Natural Disasters like Fire, Flood, Earthquake. But also Human Disasters: Loss of Internet, Sabotage, Theft, Extended Power Failure, Hack Attack, Terrorist Attack.

There have been a number of recent cases where things went wrong: the U of C ransomware attack, and King’s College London, which lost a single RAID array on October 17th and on October 31st was still not up and running. That is 29,000+ students, 9,000 staff and 1 backup server.

Disaster Recovery is a lot like insurance. You buy it every year, you pay premiums, you complain about the cost, but when you need it, you’re glad it’s there.

Delta Airlines had a small fire in their data centre that led to hundreds of millions of dollars lost: thousands of cancelled flights and lost future revenue.

Questions to ask yourself:

  • What would you do if your critical information systems became unavailable due to some catastrophic event, e.g. a fire or flood in the data centre with complete loss?
  • What if the School Board Office and the data centre burned to the ground one Friday evening just before Christmas break?
  • It has been said that “Failing to plan, is planning to fail.”

Saanichton – old elementary school building, with small data centre, etc.

Reasons Why You Should Prepare

  • Because your district auditors strongly recommend it as a good business practice
  • Because the Government of BC’s CIO office requires it
  • Because the architecture of the NGN could put you out of commission for up to 45 days and beyond at all sites
  • Jobs might depend on it – including yours
  • Because if you “fail to prepare you are preparing to fail”

What is Disaster Recovery?

  • DR is a set of policies and procedures to enable recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster
  • DRP typically is the domain of the technology department

What is Business Continuity?

  • Maintain a minimum level of service in the event of a disaster or catastrophe
  • It is about the ability to restore the district to business as usual
  • It is planning to mitigate unanticipated risk…

What are the differences?

  • BC is proactive, its focus is to avoid or mitigate the impact of risk
  • DR is reactive, its focus is to pick up the pieces and to restore the organization to business as usual after a disaster happens
  • DR is considered a subset of BC

Redundancy

  • Do site servers, schools, or departments have built-in redundancy? Including RAID, etc.
  • Are critical spares kept locally or at the district office?
  • Are offsite spares, equipment available quickly?

Backup Strategy

  • Are site and district servers backed up regularly?
  • Are they getting backed up to the Central Data Centre?
  • Are backups regularly verified and tested?
  • Do you also back up offsite (secondary site, tape or cloud)?
  • To be clear, good backups are NOT Disaster Recovery or Business Continuity!
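One way to approach the "regularly verified" question is to compare checksums between the live files and the backup copy. This is only a minimal sketch, not something shown in the session: the function names `sha256_of` and `verify_backup` and the directory layout are assumptions for illustration. A matching checksum also only shows the copy is intact; it is not a substitute for actually testing a restore.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir: Path, backup_dir: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the backup.

    Hypothetical helper: assumes the backup mirrors the source directory tree.
    """
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(backup_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list from `verify_backup` means every source file has an identical copy in the backup directory.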

Saanich is also looking for a tertiary backup system in addition to their primary and secondary. It could be cloud-based, for data alone. Most services are warm or hot. With a warm service the server is running, but the data may be a day out of date. Hot services, like e-mail, are synchronized all the time.
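The warm versus hot distinction can be expressed as a limit on how old the replicated data is allowed to be. The sketch below is an illustration only: the one-day limit for warm services comes from the notes above, while the five-minute limit for hot services and the names `MAX_AGE` and `is_stale` are assumptions.

```python
from datetime import datetime, timedelta

# Maximum tolerated age of replicated data per tier.
# "warm" follows the notes (up to a day out of date); "hot" is a hypothetical value.
MAX_AGE = {
    "hot": timedelta(minutes=5),
    "warm": timedelta(days=1),
}


def is_stale(tier: str, last_sync: datetime, now: datetime) -> bool:
    """Return True if the replica's last sync is older than its tier allows."""
    return now - last_sync > MAX_AGE[tier]
```

A monitoring job could call `is_stale` for each replicated service and raise an alert when it returns `True`.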

Data Centre Safeguards

  • Is the Data Centre secure?
  • Does the Data Centre have environmental controls?
  • Does the Data Centre have fire suppression designed for a computing environment?
  • Does it have a backup generator?

Disasters or Catastrophes

  • If you only maintain data backups how rapidly can you rebuild critical and non-essential systems?
  • Do you maintain spares of servers, drives, power supplies, etc?
  • How does the district establish essential servers and how quickly?
  • Hence the need for a Disaster Recovery Plan

Saanich’s site has the data copied over. They would need to change some IPs and fire up some equipment, and then they would be running. Wouldn’t it be great to have all of the central office services redundant to another site?

Disaster Recovery Planning

  • Start with the basics
  • Risk Analysis and Assessment is the first step
  • Review and change Backup and Restore procedures if necessary
  • Determine if a viable Failover site exists
  • Determine if you are going to have a cold, warm, or hot Failover – if at all
  • All of this is predicated on how quickly each department/service is required to be operational again
  • Education, for instance, might only require access to server-based files
  • HR/Finance/Payroll however might require minimal services in 72 hours but to be fully operational within a week
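The recovery-time requirements above can be written down as data and used to decide what gets restored first. This is a rough sketch under stated assumptions: the 72-hour and one-week figures for HR/Finance/Payroll come from the notes, but the Education numbers and the names `RecoveryTarget`, `recovery_order` and `missed_targets` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTarget:
    service: str
    minimal_hours: int  # time until minimal service must be back
    full_hours: int     # time until the service must be fully operational


def recovery_order(targets):
    """Sort services so the tightest deadlines are restored first."""
    return sorted(targets, key=lambda t: (t.minimal_hours, t.full_hours))


def missed_targets(targets, estimated_hours):
    """Return services whose estimated recovery time exceeds the minimal target."""
    return [t.service for t in targets if estimated_hours[t.service] > t.minimal_hours]


targets = [
    # Hypothetical numbers: the notes give no figures for Education.
    RecoveryTarget("Education file access", minimal_hours=120, full_hours=336),
    # From the notes: minimal services in 72 hours, fully operational within a week.
    RecoveryTarget("HR/Finance/Payroll", minimal_hours=72, full_hours=168),
]
```

Comparing these targets against an honest estimate of recovery time is one way to see whether a cold, warm, or hot failover is needed for a given service.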

Gregg reviewed the basic steps for building a DR plan and also for BCP planning. The big difference being DR is really just IT, but BCP involves people and needs to include all of the Sr. Leadership team. BCP team meets monthly, and sometimes it’s hard to keep everyone on task, but it’s important.

Gregg has some concerns over the NGN network, as all schools now go directly to their Board Office Primary. If the Board Office goes down, it can take a long time, 30-45 days, to re-route the NGN network. They are therefore working on an architecture to get a failover site also connected to the NGN network. This requires a Bias Failover line. The plan is that if the Board Office burns down, service can be running within about a week through the failover site.

Gregg is able to get better sleep at night as a result of the work and planning that their team has done.

 

 

 
