March 1, 2015

The Communications Plan

Filed under: Safety First — houtkin @ 2:44 pm

Today, I would like to discuss communications for the small business.

There are different people you may need to communicate with and, as a business owner, you may need to communicate beyond your business if you have staff or temporary staff.

The communications plan is one of the most important plans you can have in your arsenal of disaster planning tools.

The business contact list. The business contact list is a listing of all staff in your immediate world, whether full-time or temporary, along with the emergency numbers you put in your notification sheet, e.g. building manager, local police, fire department, hospital, etc. Always ensure that you have 2 phone numbers and 2 email addresses (1 business/1 personal). Ask your staff to add everyone’s information to their smartphone’s address book. Also suggest that they have one family member add these numbers to their own smartphone.
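The "2 phone numbers and 2 email addresses" rule is easy to check automatically if you keep the contact list in a file or spreadsheet export. Here is a minimal sketch in Python. The `Contact` record and the `missing_details` helper are hypothetical names made up for illustration, not part of any particular tool.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    """One entry in the business contact list (hypothetical structure)."""
    name: str
    phones: list[str] = field(default_factory=list)
    emails: list[str] = field(default_factory=list)


def missing_details(contact: Contact) -> list[str]:
    """Return a list of what is missing for this contact (empty if complete)."""
    problems = []
    # Duplicates do not count: two copies of the same number is still one number.
    if len(set(contact.phones)) < 2:
        problems.append("needs 2 phone numbers")
    if len(set(contact.emails)) < 2:
        problems.append("needs 2 email addresses")
    return problems


if __name__ == "__main__":
    john = Contact("John", ["555-0100"], ["john@work.example", "john@home.example"])
    print(john.name, missing_details(john))
```

Running a check like this each time you test the call tree is one way to catch entries that have gone stale or incomplete.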

The call tree. The call tree is not a contact list although it contains all of the information that you captured for the contact list. The call tree is a process. It ensures that there are at least 2 people who are responsible for calling each staff member.
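One simple way to guarantee that every staff member has at least 2 assigned callers is a round-robin assignment over the contact list. The sketch below is only one possible approach, assuming staff names are unique and there are at least 3 people on the list. The function name `build_call_tree` is hypothetical.

```python
def build_call_tree(staff: list[str], callers_per_person: int = 2) -> dict[str, list[str]]:
    """Map each staff member to the people responsible for calling them.

    Assumption: names in `staff` are unique. Each person is called by the
    `callers_per_person` people who precede them in the list (wrapping around),
    so nobody is assigned to call themselves.
    """
    n = len(staff)
    if n <= callers_per_person:
        raise ValueError("need more staff than callers per person")
    tree = {}
    for i, person in enumerate(staff):
        tree[person] = [staff[(i - k) % n] for k in range(1, callers_per_person + 1)]
    return tree


if __name__ == "__main__":
    for person, callers in build_call_tree(["Ann", "Bob", "Cy", "Dee"]).items():
        print(f"{person} is called by {', '.join(callers)}")
```

With this layout each person also has exactly 2 people to call, which keeps the workload even. A real call tree would likely also account for who works near whom and who is remote.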

There are many problems resulting from disasters but the biggest one is that we cannot dictate what will happen and how it will play out. We can never assume that someone else will call John and John may need help.

The call tree should be tested, along with the contact information for staff, at least 3-4 times a year, taking into consideration how fluid contact information is and how often it changes.

Always ensure that you have each staff member’s family contact information. Be sure that you have at least two contacts - as family members may also work in the same area and be dealing with the same event.

Finally, include two business associates that are remote to your state - who can also help you to make phone calls to find staff - on your behalf - while you are dealing with more immediate safety concerns.

This remote contact is also recommended for the family communications plan. Task 2 remote family members in different states with texting your family after an event.

Each remote business and family contact should be given the full emergency notification plan as they may have to handle business and be responsible for communications during the first few hours of an event. As well, should you not be able to communicate, they can call local first responders on your behalf. Ensure that they are entered into your phone as your ICE (in case of emergency) contacts - to close the loop.

We cannot always assume that we will be able to carry out our plans during an event. That’s why we include remote contacts in the communications plan. Send them your emergency notification list and your immediate contact list of staff or family.

Finally, ask your lawyer or banker to identify a remote contact in case there are legal issues that need a response. It would be a good time to check with your lawyer and banker for their communications plans and to find out how they can support your business during an event. Perhaps they can make payments for you, represent your business while you are focussed on immediate safety concerns, call your insurers, etc.

The other important aspect of your communications plan is a listing of your accounts - banking, insurance. You want to keep this encrypted if you keep it on your smartphone, but I always recommend that a hard copy version be kept in your evacuation bag, with your lawyer and with your remote family member. Even the generators that power cell towers and the local telephone central office only have a 5-hour limit, so you would either have to find an office with a land-line or move out of the immediate area to find cell towers that are still on-line.

Remember, the strangest things happen during events - you cannot always assume that you are going to have power, that the services you normally have will have power and that you can conduct your life or business as usual.

February 21, 2015

Accounting for Family and Staff

Filed under: Safety First — houtkin @ 2:40 pm

The last piece of the safety process is accounting for family or staff.  This part of the process requires some work and planning with your team and family members - beyond the immediate people in your home or business.

You can account for family and staff through a call tree, an evacuation assembly or meeting point and by including your staff’s family and remote family members into the call tree/communication part of your plan.

I’ve written an article on the accounting process.  I hope you capture some ideas that you may use for your own business and family.



February 15, 2015

Notification Card

Filed under: Safety First — houtkin @ 3:39 pm

Notification Card

Safety First/Take 2

Filed under: Safety First — houtkin @ 2:08 pm

For the last two weeks, I focused on the emergency notification card. This falls under SAFETY FIRST, which comes before we even consider discussions regarding Business Continuity or Disaster Recovery.

After 9/11, we identified that many people did not know who to call or how to handle emergencies in their office or home and, most importantly, did not know how to communicate.

Please focus on preparing this emergency notification card. It is important for you, your family, your staff and your customer. For many customers, we have to prove that we are responsible. This is one way to do that.

After 9/11 we identified that it took an average of 3 months for the customer/client to find the consultants as there was no way to really communicate with them. I recommend that you take the first step - and ensure your safety and well-being by creating your notification card and presenting it to your customer and posting it in your office, home office, etc.

I will be creating a sample and will post it to my website so you can download and begin creating your own.

The next step, after the emergency card, is your company Call Tree. As a small business, you may suggest that you do not have enough staff to create one, but your family is an important connection for you should you need help or should you evacuate without ample time to inform them.

Part of this is creating your ICE contact in your address book. Your ICE contact (in case of emergency) is the person you designate that first responders call in case you require help or if they need to report on your status.

Be aware that although many people feel that a passcode on their smartphone or phone is crucial in today’s security climate, not presenting your ICE information on your phone could potentially prevent someone from communicating with your family on a timely basis.

One more point about understanding your building’s evacuation process and route. Many of you, no doubt, work from home and sometimes visit a customer in their building. Take a moment to review what you would do in your home office should an evacuation be required - from a physical perspective — or if you should have a fire emergency. Do you have your go kit (which we’ll discuss in a later post), a fire extinguisher? Notification numbers? Clear stairways?

Take the time to ensure you can deal with a fire or other in-house/office emergency and then what you need to do/have to evacuate.

I visit my customers’ sites from time to time. I always take a look around for the exits based on where I am in the building and I always ask what the evacuation route is. I do not want to go down one stairwell that is actually the up stairwell for emergency responders. As well, some buildings have painted their evacuation stairwells with a product that absorbs and then provides light during evacuation should the electricity go out. Make sure you know how to evacuate - especially if you are in a major part of the city that could potentially be a soft target.

Know where you land.
Some stairwells go to the street and some go to the lobby. This should be part of your criteria in creating an evacuation plan if your building manager has not already defined one.

Some main lobby entrances actually take you between buildings - and some to the street. It’s a good idea to get the lay of the land, so you are comfortable and knowledgeable about a clear exit.

Why do we focus on SAFETY FIRST? Basically because you can have the best tools and the best technology but, bottom line, if you do not have anyone to engage them, they make no difference to your ability to maintain a presence as a working business after an incident.

Take the time now to check these items under SAFETY FIRST. Your business will succeed during the most difficult of times.

January 24, 2015

Safety First-Building Safety

Filed under: Safety First — houtkin @ 10:30 pm

Last post focused on gathering emergency information and compiling it into a single document that can be posted in your office and at the homes of your staff.  This post deals with safety in the building where your office is located.  Since 9/11, the New York City Fire Department has implemented an emergency process for buildings that includes regular evacuation and life safety drills at least once a year, a building emergency team, a designated meeting point for evacuated tenants, recommendations for staff accountability and new processes that support both evacuation and shelter-in-place scenarios.

Here is a more in-depth article on talking with your building manager about building safety.   http://www.houtkinconsulting.com/FacilitiesPosted.pdf

Some information that you may want to add to your emergency notification card that you would gather after your discussion with your building manager is:

  1. Building Manager contact
  2. Building Manager contact number
  3. Building Manager Office Location
  4. Building Emergency Number
  5. Evacuation path on your floor
  6. External building meeting point in the case of evacuation
  7. Your Floor Wardens: Names and contact Information
  8. Floor Searchers: Male and Female
  9. Personnel trained in CPR on your floor
  10. Location of Fire Phone on your Floor
  11. Location of defibrillator

Your business emergency notification may look like this with this additional information.


January 18, 2015

Small Business Disaster Recovery: Safety First

Filed under: Safety First — houtkin @ 2:00 pm

Safety First focuses on those processes and procedures dealing with two basic scenarios: evacuation and shelter-in-place.

In both cases, the required tools are:

1. Communications plans: Owner to Emergency Personnel/Organizations; Owner to building manager; owner to staff; Staff to Family

2. Communications tools: Municipal websites, email addresses and twitter accounts; Business Call Tree and annual test, wallet card; Family Call Tree.

3. Evacuation process

4. Accountability process

5. Accountability tools.

Communications Plans: Part 1 (20150118)


1. Create an emergency listing of local, municipal resources.  The main ones are: City Office of Emergency Management, Local Emergency number, Local Police Precincts, Local Firehouse, Local hospitals (at least 2-3), Poison Hotline.

Include all contact information: Municipal (311, 911), full address, current Captains, email and twitter accounts.

Everyone in your family and all staff should have access to this information, and it should be posted in several places in the office, on the phone and in the wallet card.

Speak with your local police and firehouse regarding the processes they have put in place in case of an incident, including when you would evacuate and who will communicate that; when you would shelter-in-place and who will communicate that; and recommendations for your family and business.

Check these numbers and municipal process at least twice a year.

See sample, below.

March 5, 2013

Lessons Learned - Sandy

Filed under: Disaster Recovery — houtkin @ 6:15 am

I had the pleasure of attending the latest conference of the Contingency Planning Exchange last week. The agenda was focussed on lessons learned for representatives of various sectors of the business, municipal and government entities. Net/net, the basic lessons learned, including concepts new to me, were:

1. Use of the transit strike map
The idea here was that since it was a city-wide event, the team would utilize the transit strike maps which go into effect in time of an incident. No-one anticipated that even these maps would not be of use because areas of the city were flooded.

LESSON: Meet with the city transit organization after an event to identify their updated transit recommendations/maps/tools for further planning.

2. Solar Phone Chargers

LESSON: No electricity results in a loss of the ability to communicate. Check this out. It’s a great idea.

3. Communications: push, SMS, email and voice
An emergency communications company identified that their successful modes of communication in order of greatest to best effort were: push, SMS, email and then voice.

LESSON: Meet with your notification provider and ask them for statistics captured re: their service during SANDY, consider the results for use with your team, Company, Business and then reconfigure before the next event. Inform your User-base so that they know what to expect and Test, test, test.
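To illustrate how the reported order (push, SMS, email, then voice) might be used when reconfiguring a notification process, here is a rough sketch in Python that tries each channel in turn until one reports success. Trying channels in sequence is my assumption for illustration only, and the `senders` functions are hypothetical stand-ins for whatever your notification provider actually offers.

```python
from typing import Callable, Optional

# Order reported by the emergency communications company at the conference.
CHANNEL_ORDER = ["push", "sms", "email", "voice"]


def notify(
    recipient: str,
    message: str,
    senders: dict[str, Callable[[str, str], bool]],
) -> Optional[str]:
    """Try each configured channel in order; return the first one that succeeds.

    `senders` maps a channel name to a hypothetical function that returns True
    when the message was delivered. Returns None if every channel failed.
    """
    for channel in CHANNEL_ORDER:
        send = senders.get(channel)
        if send is not None and send(recipient, message):
            return channel
    return None


if __name__ == "__main__":
    demo_senders = {
        "push": lambda recipient, message: False,  # simulate a failed push
        "sms": lambda recipient, message: True,    # simulate a delivered SMS
    }
    print(notify("John", "Office closed, check in with your caller.", demo_senders))
```

Your own provider's statistics from the event should decide the order you actually configure.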

4. Staff anxiety
Some entities identified a growing anxiety amongst “non-critical” staff who may not have been asked to come into the work-place or engage in work-related contingency process.

LESSON: Re-brand the concepts of “critical” and “non-critical” according to staff and identify, if possible, use of staff who may be closer and able to come to work. Educate.

5. Staff support and volunteering
Some companies organized teams of their staff, living closer to those who may be impacted, and provided money, food, places to stay, and general hands-on support to ensure accountability and availability of staff after an event. This was almost altruistic but very important for maintaining the company vision of the importance of staff. Staff retention is a very real concern after an incident. This support decreases the percentage of staff who leave.

LESSON: Look at your company’s vision and precepts and consider the opportunity of creating teams of volunteers in various areas as well as processes and procedures to help impacted staff. This also helps the staff anxiety concept identified in #4, above.

6. Use of VDI in support of your company’s resiliency objective.
One company used VDI to reduce the time to build alternative workgroups in place of buildings that were impacted.

LESSON: Look at the VDI solution for the desktop as a resiliency solution. In this case, the considerations include the impact on the data center; the ability to procure workstations as part of the solution; workstation/laptop/notebook requirements for use with VDI; which build to utilize, if your company uses several; and sites with WAN connections already in place. Clearly, there are many considerations and this may not be appropriate for your company. But it is a great resiliency solution.

8. Manual procedures and non-digital tools.
Although this discussion elicited some laughs from the audience, the old ways are still the best when you consider loss of electricity, technology, etc. After a regional loss of power, the phone systems supporting the land-line networks have 4-5 hours before their generators lose power. What differentiates land-line from cellular is, simply, power. If you have a phone that draws its electricity from the land-line RJ14 connection, you can send/receive calls while the phone company powers this network. So, some old-fashioned phones could be of use.

LESSON: Look at the logistics of your plan, the mission critical business process and ways that they can continue without technology; e.g. forms, non-feature phones, trade books, etc. To this day, I always keep old business forms at the workgroup site in case nothing else is available. I also work through passwords, etc. with the other side so that they know who they are talking to. It’s a bit of work, a bit of thinking like a movie script-writer, but sneaker-net has been known to keep businesses in business.

That’s all for today.

February 27, 2013

How much is Enough?

Filed under: Disaster Recovery — houtkin @ 11:56 am

You can only do so much to ensure that your solution will work but - it must work to support our business. We must keep in mind that 1) the scenario can never be accurately planned for (we are not fortune tellers - and we cannot control disasters and how they manifest); 2) businesses and their priorities change - impacting the technical side of our work.

What I have found, however, is that there are some fundamentals, that if considered in designing, implementing and validating the solution, can reach a consistent level of integrity that helps in answering your question - in a positive way.

1. A clearly written and agreed-to definition of what a successful disaster recovery event means to the business. This should be reviewed twice a year — even management experiences re-organizations.

2. A clear understanding of what the business defines as mission critical and the technology that supports these services and applications.

3. Enterprise architecture. The ability for the dr solution to integrate into the existing architecture or adhere to architectural precepts will help to cut down on some of the risk that the solution will not work.

4. The architecture of each application and its components - problems integrating into the overall architectural environment and those extra steps that are required to ensure that they can failover together in order to meet the RTO/RPO.

5. An extremely detailed failover plan - minute by minute and technology by technology — including all processes and procedures to fail over and fall-back - that becomes the fundamental training source and guideline for a disaster. This must be reviewed quarterly and tested as walkthroughs and through real testing multiple times during the year. As well, it should be audited once by an outside organization and upgraded for all new technology/upgraded technology integrated into the environment.

6. The skillset of staff required to support the dr solution and how often they are trained in the process - including a thought to outsourcing for the event if staff are not available.

7. A business continuity plan for the IT department - to ensure that disaster recovery teams with primary/secondary responsibilities are identified and practiced. Remember that IT representatives are people too and need to be considered in planning - in the same way that the business is.

8. All third-party software/carrier/infrastructure contracts are up-to-date and define roles/responsibilities for systems/technology/applications during a dr event, including how the vendors plan to handle their own dr event, as well as notification plans so you are aware of their issues before they become a problem for your business.

Most importantly, one thing that I learned in 9/11 was that until you have your approved dr solution in place, you have to identify temporary solutions - that are agreed-to by the business. You cannot build out technology overnight. However, you can have agreements for temporary solutions should an event occur while your overall solution is being built.


February 20, 2013


Filed under: Disaster Recovery — houtkin @ 11:17 am

A successful deployment of SRM is dependent on a thorough understanding of the business RTO/RPO requirements, data classification standing of applications, a technical preparatory analysis of the environment, storage and backup policy and administrative procedures; storage/replication design and whether there is a need to perform physical to virtual migrations.

Work with VMware to identify the full scope of the SRM deployment, the architectural design that maps to the enterprise, licensing and your company’s purchasing agreement with VMware. Also, consider SRM future growth, as the architecture changes after a threshold is met.

1. SRM is an engineering solution and should be fostered/owned by both server and storage engineering.

2. Identify an SRM owner and ensure that they are trained before and during the deployment.

3. Choose an integrator to support the project manager and the SRM owner. They will help design SRM for use in testing and disaster recovery.

4. If you do not have a replication solution in place or if you cannot fall back from the DR site, you may only be able to failover from prod-to-dr using SRM.

5. A data classification policy specified by the business will facilitate the deployment of SRM. This should classify the application and data by criticality based on direction from the business. The storage design would then be dictated to some level by the data classification requirements identified by the business, and this would find its way into the replication solution and schedule.

6. Identify what applications and application data will be configured into SRM. Here you can use the data classification policy and application qualifications, a business’ set of applications, or a particular business process and the applications/data used to manifest it.

7. If the goal is to configure all applications with physical server dependencies into SRM, a review of all of the applications resident on the physical servers is required to identify:
a. re-configuration needs of the application;
b. if the application can be migrated to virtual;
c. whether any applications require additional licenses and / or upgrades to be able to migrate from P-to-V. P-to-V migrations are a sub-project to the SRM deployment and need to include all application-owners who are responsible for the application through the complete migration and validation process.

8. If you plan on performing P-to-V migrations to accommodate your new SRM deployment, understand any additional ESX servers that you may require and the number of licenses to cover your complete solution.

9. The storage design should be analyzed to ensure:

a. The related data for these applications is on the same frame and not spread over various frames.
b. The related data is not spread over various vendor products.
c. The replication solution and schedule works and is in synch with the storage design and data classification requirements; e.g. does it meet the RTO and RPO?
d. There is enough storage to handle data requirements as a result of data classification requirements and its configuration in SRM.
e. Backup policies / administrative processes exist and can be tweaked as SRM is configured and tweaked.
f. Storage administration policies and administrative processes exist and can be tweaked as SRM is configured and tweaked.

10. A very strong testing and validation program with proven scripts owned by each respective technology layer.

11. Before you schedule any configuration of SRM, or P-to-V migrations to be able to configure applications into SRM, understand the business schedule to avoid impact to the technical plan as a result of month/quarter/year-end activities on the applications. So, change management is a very important aspect of the project methodology.
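The question in item 9c, whether the replication schedule meets the RPO, can be roughed out before any SRM configuration begins. The sketch below assumes periodic (scheduled) replication, where the worst-case data loss is approximately the interval between replication runs plus the time a run takes to transfer. The function name and the sample figures are hypothetical, and this is independent of anything SRM itself reports.

```python
from datetime import timedelta


def meets_rpo(replication_interval: timedelta,
              transfer_time: timedelta,
              rpo: timedelta) -> bool:
    """Rough check for scheduled replication.

    Assumption: the oldest unprotected data at the moment of failure is about
    one replication interval plus one transfer time old.
    """
    worst_case_loss = replication_interval + transfer_time
    return worst_case_loss <= rpo


if __name__ == "__main__":
    # Hypothetical figures: replicate every 15 minutes, 5 minutes to transfer,
    # business-agreed RPO of 30 minutes.
    print(meets_rpo(timedelta(minutes=15), timedelta(minutes=5), timedelta(minutes=30)))
```

A check like this only flags obvious mismatches between the schedule and the data classification requirements. It does not replace testing the replication solution itself.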

More tomorrow.

February 18, 2013

High-Level Framework: System/Technology/Application Recovery

Filed under: Disaster Recovery — houtkin @ 8:47 am

In the perfect dr world, all technology/systems/applications should go through 4 levels of testing before they go into production - and have an architectural / design document, as-built design, operations model and failover process - if you are lucky enough to have the staff and bandwidth to do this work. Reality dictates that this is not always available but we cannot get away with thinking we can recover an application/system/technology without understanding the basics: the business requirement / use of this application/technology/system and its criticality to the business; enterprise architecture; the architecture of the technology and how it integrates into the enterprise architectural precepts; and then how to successfully recover the system/application/technology.

So, the basics for a framework are an understanding of:
1. The business process that is manifested through the technology/system/application;
2. The RTO of the business process and the system/technology/application;
3. The application/system/technology architecture and how it integrates into the overall architecture of the technical environment;
4. The operational model - and how the system/technology/application is maintained.

Recovery does not necessarily mean a failover unless the time to recover surpasses the RTO agreed-to with the business. Items required for all system recovery:
1. architectural design document;
2. as-built document;
3. operations model and related processes/procedures;
4. recovery processes/procedures
5. testing script for both infrastructure (server, os, database) and application-levels
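The rule stated above, that recovery only becomes a failover when the time to recover surpasses the agreed RTO, can be written down as a simple decision. This is a minimal sketch in Python. The function name and the sample numbers are hypothetical, and a real decision would also weigh the dependencies and governance questions listed under the other considerations.

```python
from datetime import timedelta


def choose_action(estimated_recovery: timedelta, rto: timedelta) -> str:
    """Fail over only when recovering in place would exceed the agreed RTO."""
    if estimated_recovery > rto:
        return "failover"
    return "recover in place"


if __name__ == "__main__":
    # Hypothetical example: the fix is estimated at 6 hours, the RTO is 4 hours.
    print(choose_action(timedelta(hours=6), timedelta(hours=4)))
```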

Other considerations:
-What is recovered: application/technology/system AND data? If so, what is the RPO of the data and can your recovery methodology meet that expectation?
-What up/down-stream technical dependencies are impacted by the outage and then recovery of the technology/system/application.
-What core infrastructure comprises the application/system/technology and what application-level procedures require failover or not. In other words, based on what “goes down”, what is the path to technical least resistance to meet the RTO;
-What skillset is required to recover the application at the various levels (infrastructure/database/application).
-Recovery methodology: do you recover in isolation and then integrate into production, etc.
-Security requirements during recovery and integration back into production; e.g. access control; vulnerability, etc.
-What is the recovery SLA with the vendor, if a third-party or managed system/technology/application.
-What is the agreed-to scope of work with the vendor, if a third-party or managed system/technology/application.
-What policies are in place (or not) to handle recovery.
-Governance - who determines that the application/system/technology has been fully recovered?

Off the cuff - this is a baseline idea from the technical side.
