IT & Business Infrastructure

Disaster Recovery Planning Template News Feed



July 6th, 2008 - Disaster Planning Tips to Keep Your Doors Open

Some quick tips for the disaster planning process:

  • Ensure that your recovery plan is not attached to any one person.
  • Keep your plan portable, and keep a copy off-site.
  • Make arrangements in advance with software vendors for license keys, so that backup software at the disaster recovery site can be put into operation.
  • Contact phone lists should also include vendors.
  • Remember the little things, like mice -- companies that develop disaster recovery sites may have all the servers they need, but they sometimes overlook essential hardware peripherals.


Consider this: almost 40% of small businesses that close due to a disaster never re-open. What would you do if the building that houses your business were damaged or destroyed in a disaster? Where would you go to continue serving your customers? Would you be prepared, with the right resources, databases, contact information and other necessary items, to adapt to these changes? A disaster plan that identifies these important items will help ensure your business is prepared to survive unexpected and difficult times!

As historic floodwaters start to recede along the Mississippi and other Midwestern rivers, local businesses in affected communities like Cedar Falls, Iowa, are busy assessing the impact on IT equipment and whether disaster recovery plans stood the test.

A maker of computer games in Cedar Falls may be permanently displaced after Cedar River floodwaters reached 6 feet in its administrative offices and 5.5 feet in an adjoining warehouse. The company sustained about $250,000 in damage to inventory.

The firm's president said all 65 employees are now working temporarily in borrowed offices in three facilities.

As the floodwaters approached on June 9, employees scurried to save 120 PCs, 80 monitors and eight servers. Three high-end printers could not be removed in time.

The company plans to revise its disaster recovery plan. "When a river comes up 6 feet higher than it ever has before, it's tough to have that foresight," the president said. "But it is probably going to happen again."

A software development company had plans to deal with tornadoes and electrical outages, but executives never dreamed they would have to contend with the Cedar River surpassing 500-year-flood levels. "Going through this experience [will] make those plans [more] than just part of an IT checklist," the company said.

A key lesson learned was that companies must prepare for employees to miss work to help families and communities after natural disasters.


June 11th, 2008 - After You Recover from a Disaster You Must Handle the Media

After companies recover from a disaster, they need to manage their image. ThePlanet.com, an Internet services provider, did not do that after a major fire. Nothing was posted on its site. The only news was on a media site (IDG's Computerworld). The story follows:

(Computerworld) - ThePlanet.com Internet Services Inc. hopes to have all 9,000 of its servers in its Houston data center back online later tonight following a blast that shut down the facility on Saturday afternoon.


When firefighters arrived at around 5 p.m., they could see "light smoke" at the Planet data center -- the aftermath of an explosion in a network gear room that produced enough force to move walls. Sprinklers quickly doused whatever flames erupted; the fire was attributed to an electrical problem with a transformer, according to a Houston Fire Department spokeswoman. There were no injuries.

Although the data center says it has power systems that "are designed to run uninterrupted" and a "fully redundant network operations center" with diesel generators, the electrical problem exposed an apparent Achilles' heel in its business continuity planning. Firefighters told data center workers to turn off all the power, according to Planet spokeswoman Yvonne Donaldson. That meant the servers, even though they weren't damaged, were offline.

Approximately 6,000 of the affected servers were returned to service early this morning. Another 3,000 were due to return online by tonight, the company said. The Planet staff provided updates on the restoration on its customer forum site, including a message from CEO and Chairman Douglas Erwin, who wrote that some servers will be relying on generator power for a week until normal utility connections are restored.

The Planet operates more than 40,000 servers at multiple data centers and hosts more than 3 million Web sites.

While Planet data center staff worked to restore service, users -- many of them small business owners -- wrote of their frustrations over the outage on forum posts. Questions about the data center's backup capabilities were raised, as well. One person, flynnibus, wrote: "You shouldn't put all your money into one bank -- and you shouldn't put all your servers in one DC [data center] if you want to be truly resilient."


May 30th, 2008 - Many Disasters are Magnified by Human Error

(Computerworld)  A disk failure in a Sun Microsystems Inc. server caused the Federal Aviation Administration's NOTAM database to crash for nearly 20 hours last week, according to the FAA.

The NOTAM (notice to airmen) system provides notices to pilots regarding airports, equipment and security issues. The system went down late May 22 and was back up at around 7 p.m. on May 23.

Because of the disk failure, information had to be delivered to pilots through local air traffic controllers and alternate systems, including a Web site set up to disseminate the most up-to-date information, said a manager of aeronautical information management for the FAA. However, flight safety was never a problem, the FAA said.

"What happened was the drive in an end-of-life Sun box failed in the middle of updating the information on the hard drive, so it screwed up the database," the FAA said.

That was the beginning of the complications. The FAA team replaced the hardware and the drive, which got the system running again.

The FAA already had the replacement equipment on hand but had not yet installed it, which is why the hardware recovery was quite simple, according to the FAA.

But even then, the system was running slowly, or in a deteriorated mode, and it got so bad that the team decided to reopen the problem to see what was going on.

As the technicians were working to fix the database, they decided to go to the backup system. When they did, they soon realized that the error had been written over to the backup system and had corrupted that system as well.
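
The failure described above, where a corrupted primary is copied over the only good backup, can be guarded against by checking the primary before refreshing the backup and by keeping several backup generations. The following is a minimal sketch in Python, not a description of the FAA's systems. The function names, the `gen-NNNNNN` file naming and the caller-supplied `is_healthy` check are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh_backup(primary: Path, backup_dir: Path, is_healthy, keep: int = 3) -> bool:
    """Copy `primary` into a new backup generation, but only if it passes
    the caller-supplied `is_healthy(path)` check (hypothetical hook).

    Existing generations are never overwritten, so a bad primary cannot
    destroy the last good copy. `keep` must be at least 1.
    """
    if not is_healthy(primary):
        return False  # refuse to propagate a suspect primary

    backup_dir.mkdir(parents=True, exist_ok=True)
    existing = sorted(backup_dir.glob("gen-*"))
    next_number = int(existing[-1].name.split("-")[1]) + 1 if existing else 1
    target = backup_dir / f"gen-{next_number:06d}"

    shutil.copy2(primary, target)
    if sha256_of(target) != sha256_of(primary):
        target.unlink()  # the copy itself went wrong; discard it
        return False

    # Prune the oldest generations, keeping the newest `keep` copies.
    for old in sorted(backup_dir.glob("gen-*"))[:-keep]:
        old.unlink()
    return True
```

What counts as "healthy" depends on the application; for a database it might be the vendor's own integrity check rather than anything as simple as a file-level test.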


May 30th, 2008 - Role of IT in a Disaster Defined

The first steps the IT department should take depend on how seriously a disaster affects resources. Does it require a few desktops and a room off site to provide a temporary recovery solution? Or does a larger plan need to be activated to move PCs and servers to a "hot site" to restore entire applications and set up temporary work facilities for a limited number of key workers to operate until normalcy is restored?

But what good does it do for IT to restore applications and data if there is no one there to run things? It is only half the solution, albeit the first half. The second half is the contact information for the business continuity piece. Recovering from disaster is less a solution than a process. Governments must take control of their own destinies. In the event of a disaster, a core team of people across all departments is typically designated to continue business operations pending the restoration of a normal work environment. These people need accurate information with which to call on IT and on vendors for technical support or to report to work at a temporary site.


May 30th, 2008 - What Drives Disaster Recovery?

(Computerworld) As more organizations adopt replication as a primary component of disaster recovery, it's important to better understand some of the differences among replication technologies and to clearly set expectations with application owners when planning replication deployments.

A common area of confusion in dealing with replication is the distinction between consistency and synchronicity. Many newcomers to replication tend to focus on synchronization issues when, from a recovery perspective, consistency may be the application's true requirement.

So what is the difference and why is it important? Synchronization implies complete and continuous fidelity between local and replicated data stores. With true synchronous replication, a write operation is not acknowledged until it has been written to the local storage system and replicated to the remote storage system. This certainly provides a very high degree of consistency, but it also carries with it high costs and significant limitations regarding distance and latency that can impact application performance.

Synchronous replication is found primarily in the domain of the top tier of enterprise storage offerings and is usually reserved for those applications that are characterized by very high transaction rates where the recovery and re-execution of lost transactions would be difficult and costly.

The majority of replication is therefore of the asynchronous variety -- meaning that there is some degree of variance, based on change rate and available bandwidth, between the local and the replicated targets. In other words, by definition, the source and target are inconsistent with one another.
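
The difference in when a write is acknowledged can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the `ReplicatedStore` class and its methods are hypothetical, the "remote" is a dictionary rather than a storage system across a network, and the latency cost of the synchronous path is represented by nothing more than a comment.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of a local store with one remote replica (hypothetical)."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.local = {}
        self.remote = {}
        self.pending = deque()  # writes accepted locally but not yet replicated

    def write(self, key, value) -> str:
        self.local[key] = value
        if self.synchronous:
            # Synchronous: the remote copy is updated before the write is
            # acknowledged. In a real system this is a network round trip.
            self.remote[key] = value
        else:
            # Asynchronous: acknowledge now, replicate later.
            self.pending.append((key, value))
        return "ack"

    def drain(self, max_writes=None) -> int:
        """Ship queued writes to the remote in their original order."""
        shipped = 0
        while self.pending and (max_writes is None or shipped < max_writes):
            key, value = self.pending.popleft()
            self.remote[key] = value
            shipped += 1
        return shipped

    def lag(self) -> int:
        """Number of acknowledged writes the remote has not yet seen."""
        return len(self.pending)
```

In the asynchronous case `lag()` grows whenever writes arrive faster than `drain()` can ship them, which corresponds to the variance driven by change rate and available bandwidth described above.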

However, consistency still plays a critical role in the recoverability of asynchronously replicated data. The key is in understanding the interdependencies among related data components of a particular business function and ensuring that they are consistent among themselves at any given point in time at the target location. They may lag behind the original, but as long as they are equally behind, the function or application should be recoverable.

Although the notion of consistency groups is well established among enterprise-class storage systems, it may be less so for other forms of replication. Understanding consistency requirements and the ability of replication technologies to meet them should be a high-priority consideration in disaster recovery design.
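
One way to picture a consistency group is as a set of replication streams that share common checkpoint markers, with the target exposing only the state as of the latest checkpoint that every stream has received. The sketch below is a simplified model of that idea, not the mechanism of any particular storage product. The stream format, the `CHECKPOINT` marker and the `recoverable_state` function are assumptions introduced for illustration.

```python
CHECKPOINT = "checkpoint"


def recoverable_state(streams):
    """Return (cut, images) for a group of replication streams.

    `streams` maps a volume name to the ordered entries received at the
    target. Each entry is ("write", key, value) or ("checkpoint", n).
    `cut` is the latest checkpoint number present in every stream, and
    `images` holds each volume's state as of that checkpoint. If no
    checkpoint is common to all streams, `cut` is None and every image
    is empty.
    """
    reached = [
        {entry[1] for entry in stream if entry[0] == CHECKPOINT}
        for stream in streams.values()
    ]
    common = set.intersection(*reached) if reached else set()
    if not common:
        return None, {name: {} for name in streams}

    cut = max(common)
    images = {}
    for name, stream in streams.items():
        state = {}
        for entry in stream:
            if entry[0] == CHECKPOINT:
                if entry[1] == cut:
                    break  # ignore anything received after the cut
                continue
            _, key, value = entry
            state[key] = value
        images[name] = state
    return cut, images
```

If an "orders" volume has received checkpoint 2 but the related "ledger" volume has only received checkpoint 1, both are presented as of checkpoint 1. They lag the source, but they are equally behind, which is the property the article describes.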


May 13th, 2008 - Change Control Needs to be Implemented for DRP and BCP to Work

Analysts confirm that approximately 80% of all software released into production will fail, and that 70-80% of the cost of ownership of such business applications is related to finding and fixing these errors. In order to increase productivity and promote cost savings, it is imperative to consider the source of these failures, as well as the nature of the production environments.

Add to that the processes necessary to support a Business Continuity and Disaster Recovery Plan, and enterprises have an increasingly complex problem.


May 10th, 2008 - Disk from space shuttle crash recovered

(Computerworld) Researchers who extracted data from a hard drive onboard the ill-fated space shuttle Columbia say the device was so thoroughly damaged in the shuttle's fiery crash that it just looked like a cracked "hunk of metal" when it appeared at their door six months later.

Data recovery specialists at Kroll Ontrack Inc. painstakingly retrieved 99% of the information stored on the charred 400MB Seagate hard drive's 2.5-in. platters over a two-day period after the device was discovered six months after the 2003 shuttle crash. The device was found in a dried-up lake bed along the shuttle's debris area.


The successful retrieval of the data was disclosed in the April 2008 issue of the journal Physical Review E, which published data from tests performed by the shuttle astronauts on the critical viscosity of xenon gas, according to published reports. The results of the tests were stored on the disk and retrieved by Kroll.

The Columbia disintegrated upon re-entry into the atmosphere of Earth on Feb. 1, 2003, killing all seven crew members and scattering debris across Texas and Louisiana.


May 2nd, 2008 - A Business Resumption and Continuity Plan is Key to Disaster Planning

Disaster recovery has always been a key concern in virtually all companies. But the widespread damage from Hurricane Katrina has companies re-evaluating their planning, procedures and overall systems to make sure they can survive a major outage.


Wherever data resides, it must be protected. With this idea as the driving force, companies are looking for new and easier-to-manage ways to safeguard company databases, records and files.

When a disaster does strike (be it a fire, a flooded data center or a catastrophic malware attack) companies need to take several steps to reduce downtime and get operations back to normal. 

A business resumption and continuity plan should be in place before any disaster occurs.

May 1st, 2008 - Mac Back-up released

Berkeley Data Systems released Mac Mozy public beta, the first unlimited online backup service for Mac users worldwide. The service allows Mac users to encrypt and automatically back up all of their digital media content online, including collections from iTunes and iPhoto.


Designed as a consumer service, Mac Mozy leverages Apple's innovative Spotlight Search technology, allowing users to easily select the types of files they want to back up. The service installs quickly and runs quietly in the background. Backup speeds vary from user to user, largely determined by the upload speed of the consumer's internet connection.


Mac Mozy offers an added measure of privacy by allowing its users to choose between a Mozy encryption key and a private encryption key. Incremental backups and block-level differentials are included, which means subsequent backups complete at a much faster rate than the initial backup. Mozy's servers also retain the most recent version of a file as well as 30 days' worth of previously modified file versions. Customers may retrieve files or versions of the files via the internet or by requesting a DVD restore with next-day delivery.
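
The reason block-level differentials make later backups faster is that only the blocks that changed since the previous backup need to be sent. The following Python sketch shows the general idea; it is not Mozy's implementation. The 4096-byte block size, the use of SHA-256 and all function names are assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size for this sketch


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of `data`; kept from the previous backup."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(previous_hashes, data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: block_bytes} for blocks that are new or differ
    from the hashes recorded at the previous backup."""
    changes = {}
    for index, offset in enumerate(range(0, len(data), block_size)):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(previous_hashes) or previous_hashes[index] != digest:
            changes[index] = block
    return changes


def apply_changes(old_data: bytes, changes: dict, new_length: int,
                  block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file from the old copy plus the changed blocks."""
    buffer = bytearray(old_data[:new_length].ljust(new_length, b"\0"))
    for index, block in changes.items():
        start = index * block_size
        buffer[start:start + len(block)] = block
    return bytes(buffer)
```

For a file in which a single block was edited, `changed_blocks` returns one entry, so only that block has to travel over the user's upload link rather than the whole file.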


April 25th, 2008 - Risk Taken by Not Shipping Backup Tapes Off-Site

(Computerworld) University of Miami officials last week acknowledged that six backup tapes from its medical school that contained more than 2 million medical records were stolen in March from a van that was transporting the data to an off-site facility.


The vice president of communications at the university said a vehicle used by Archive America Ltd. to transport the patient data was broken into in downtown Coral Gables, Fla. Thieves removed a transport case carrying the school's computer backup tapes.

For reasons the VP could not explain, Archive America waited 48 hours before finally notifying the university about the break-in and theft. Officials from the transport firm could not be reached.

The university posted an alert about the incident a full month after the backup tapes were stolen. In a statement, the senior vice president for medical affairs and dean of the University of Miami Miller School of Medicine said that even though university officials were confident that the patients' data was safe, they felt it was in the best interest of the physician-patient relationship to be transparent about the incident.

The senior VP said that the university has temporarily stopped transporting backup data off-site since the incident. It will not transport anything until it has conducted its own internal evaluation of the incident to see whether anything could have been done differently or better.
