Introduction To Site Resiliency In Exchange Server 2010

admin | July 17th, 2014 | Exchange Server

Availability of Exchange Server has become a growing concern for administrators over the past years as dependency on email and messaging has increased. There is a huge difference between Exchange Server 5.5 and the all-new Exchange Server 2010. It is not that version 5.5 was especially unstable; the fact is that nobody bothered much at that time if email was unavailable for hours or even a day.

The word “Availability” means different things to different people. Typically, server availability means 99.99 percent uptime, which equals a little under one hour (about 53 minutes) of server downtime in a year. However, consultants have to understand the requirements of the organization and align the product accordingly. If downtime is defined as the situation where all employees of the organization are unable to access their mailboxes, the target is manageable. On the other hand, if downtime is counted as soon as the mailbox of a single employee becomes inaccessible, it is a matter of concern!
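The uptime arithmetic above is easy to check. The snippet below is a minimal sketch that converts an uptime percentage into a yearly downtime budget; the function name and the 365-day year are assumptions made for the illustration, not part of any Exchange tooling.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years (assumption)


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Compare a few common availability targets.
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows {minutes:.1f} minutes of downtime per year")
```

Each extra "nine" shrinks the budget by a factor of ten, which is why the definition of downtime in the SLA matters so much.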

Which availability technique has to be adopted depends entirely on the Service Level Agreements that are defined when the high availability design is deployed. There are two models that can be followed to work around disaster:

High Availability

Server Resilience: This means local availability of the database within the same site or cluster. If a server or any of its components fails, the problem can be tackled through component-level resilience such as redundant network cards, RAID arrays, or server clusters. Windows Server failover clustering, used by Cluster Continuous Replication in Exchange 2007 and by the Database Availability Group (DAG) in Exchange 2010 and later, provides automatic failover to another server that holds a copy of the active database. Replication services in Exchange have been a great hit because they remove the single point of failure (if one server goes down, the messaging service is not affected).

Site Resilience: This means that if the Exchange service fails at the primary site, it remains available from another site. Think of it as the building being blown up and everybody inside moving over to a cold standby site. In the case of Exchange Server, the word “Site” refers to an “Active Directory Site”. The introduction of standby continuous replication (SCR) in Exchange 2007 Service Pack 1 made site resilience possible without third-party solutions.

The Database Availability Group in Exchange 2010 is capable of providing both high availability and site resilience. A DAG that is deployed for HA alone can sit in a single Active Directory site or datacenter. A DAG is considered a suitable option for site resilience because it can be extended across datacenters, which protects against failure of the primary datacenter.

Database Availability Group

Failover between the two datacenters of a DAG does not take place automatically. It is a manual switchover that involves shrinking the cluster to the surviving members before bringing the secondary database copies online in place of the primary ones, and then redirecting related services, such as the DNS records for the Client Access Servers, to the secondary datacenter.
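Why the cluster has to be shrunk becomes clearer with a little vote counting. The sketch below assumes a simplified majority model in which every DAG member and the witness server holds one vote; the four-member layout and the function names are hypothetical and only illustrate the idea.

```python
def votes_needed(total_voters: int) -> int:
    """Votes required for a strict majority of the cluster's voters."""
    return total_voters // 2 + 1


def has_quorum(surviving_voters: int, total_voters: int) -> bool:
    """True when the surviving voters still form a majority."""
    return surviving_voters >= votes_needed(total_voters)


# Hypothetical layout: two members plus the witness in the primary datacenter,
# two members in the secondary datacenter.
primary_voters = 3
secondary_voters = 2
total_voters = primary_voters + secondary_voters

# The primary datacenter is lost: the secondary side holds 2 of 5 votes.
print("Secondary alone has quorum:", has_quorum(secondary_voters, total_voters))

# After shrinking the cluster to the surviving members and adding an
# alternate witness (assumption), the secondary side holds 3 of 3 votes.
shrunk_total = secondary_voters + 1
print("After shrinking:", has_quorum(secondary_voters + 1, shrunk_total))
```

In this model the secondary datacenter cannot reach a majority on its own, so the failed members have to be taken out of the vote count before databases can be mounted there.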

Nevertheless, when the original primary datacenter comes back online, it is necessary to prevent its servers from attempting to activate their own database copies. The last state those servers recorded before the failure was that they owned the active databases, so without a safeguard they would mount them again while the secondary datacenter is also serving them. The two sets of copies would then diverge, and the primary copies would have to be reseeded from the active secondary datacenter. This situation is called “Split Brain”, and it can be avoided through Datacenter Activation Coordination (DAC) mode and its Datacenter Activation Coordination Protocol (DACP). DACP uses an in-memory flag, the DACP bit, to stop the primary datacenter servers from mounting databases and taking ownership on their own. When the primary datacenter servers come online, they try to contact the other members of the DAG, and if they cannot reach the other Mailbox servers of the DAG, they do not attempt to mount databases.
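The mount decision described above can be modeled in a few lines. This is a simplified sketch, not Exchange code: it assumes each member starts with its bit at 0 and may mount only after contacting a member whose bit is 1, or after reaching every other member of the DAG. The class, the function, and the server names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class DagMember:
    name: str
    dacp_bit: int = 0  # starts at 0: not yet cleared to mount databases
    can_reach: Set[str] = field(default_factory=set)  # members it can contact


def may_mount(member: DagMember, dag: List[DagMember]) -> bool:
    """Assumed rule: allow mounting after hearing from a member whose bit
    is 1, or after reaching every other member of the DAG."""
    others = [m for m in dag if m.name != member.name]
    reached = [m for m in others if m.name in member.can_reach]
    if any(m.dacp_bit == 1 for m in reached):
        return True
    return bool(others) and len(reached) == len(others)


# Hypothetical four-member DAG; the secondary datacenter is already active.
p1 = DagMember("PRI-MBX1")
p2 = DagMember("PRI-MBX2")
s1 = DagMember("SEC-MBX1", dacp_bit=1)
s2 = DagMember("SEC-MBX2", dacp_bit=1)
dag = [p1, p2, s1, s2]

# Power returns to the primary datacenter, but the WAN link is still down,
# so PRI-MBX1 only sees its local neighbor, whose bit is also 0.
p1.can_reach = {"PRI-MBX2"}
blocked = may_mount(p1, dag)
print("WAN down, PRI-MBX1 may mount:", blocked)

# The WAN link is restored and PRI-MBX1 reaches a member whose bit is 1.
p1.can_reach = {"PRI-MBX2", "SEC-MBX1", "SEC-MBX2"}
allowed = may_mount(p1, dag)
if allowed:
    p1.dacp_bit = 1  # cleared to take part in the DAG again
print("WAN restored, PRI-MBX1 may mount:", allowed)
```

The point of the model is that two restarted primary servers that can only see each other never clear one another, which is what keeps them from mounting stale copies while the secondary datacenter is serving the databases.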

The site resilience facilities of the Database Availability Group in Exchange 2010 are initially hard to understand and complicated to execute. However, with proper planning and a clear understanding of how to handle a switchover, site resilience proves quite helpful and fairly straightforward. Together, DAG and site resilience help organizations minimize downtime and data loss!