Monday 24 December 2012

Achieving High Availability with Windows Azure Environment - Part 3


  1. Windows Azure SQL Database

    Windows Azure SQL Database provides 99.9 percent availability to subscribers by default.
    When a database is created on a SQL Azure server, it gets three replicas on three different, fully independent physical machines within the same data center. Of the three replicas, one acts as the primary replica and the other two act as secondary replicas. The architecture of a SQL Database is illustrated below.
    Figure 3
    A single SQL Database distribution across machines in a datacenter (Image from Microsoft wiki page - link)

    As shown in the picture, the database DB1 is created on three machines - Machine 4, Machine 5 and Machine 6 - which are fully separate from each other in hardware, network and SQL Server instance. Of these three replicas, one acts as the primary and the other two act as secondaries.

    All reads and writes against the database (DB1) are served only by the primary replica, and changes made to the primary replica are replicated to the secondary replicas asynchronously in the background. A transaction is considered complete only when it has finished successfully on the primary replica and on at least one secondary replica.

    When the primary replica fails because of a hardware, network or device failure, the SQL Database fabric detects the failure and fails over to one of the secondary replicas. The database will be inaccessible for a short period while the failover happens, so the application logic needs to retry the transaction, using the same connection string, to handle the failover event.
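
    As a rough illustration of that retry logic, the sketch below uses Python with the pyodbc driver to reconnect with the same connection string and retry the statement a few times; the connection string, the Orders table and the retry counts are placeholders for illustration, not part of any Azure SDK.

import time
import pyodbc

# Placeholder ODBC connection string for the SQL Database server (all values are illustrative).
CONNECTION_STRING = (
    "Driver={SQL Server Native Client 10.0};"
    "Server=tcp:myserver.database.windows.net,1433;"
    "Database=MyDatabase;Uid=myuser@myserver;Pwd=...;Encrypt=yes;"
)

def execute_with_retry(sql, params=None, retries=5, delay_seconds=10):
    """Run one statement, reconnecting with the same connection string if a failover drops the session."""
    for attempt in range(1, retries + 1):
        try:
            connection = pyodbc.connect(CONNECTION_STRING)
            try:
                cursor = connection.cursor()
                if params:
                    cursor.execute(sql, params)
                else:
                    cursor.execute(sql)
                connection.commit()
                return
            finally:
                connection.close()
        except pyodbc.Error:
            # A failover usually surfaces as a dropped or refused connection; wait and try again.
            if attempt == retries:
                raise
            time.sleep(delay_seconds)

# Hypothetical usage against a hypothetical Orders table:
# execute_with_retry("INSERT INTO Orders (CustomerId, Amount) VALUES (?, ?)", (42, 19.99))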

    A SQL Database can become inaccessible because of a hardware failure, an operating system crash, or an issue with the SQL Server instance in the data center. When any of these failures happens, the reconfiguration process starts - a process that creates a new replica to replace the failed one.

    When a machine fails, every database replica on that machine has to be replaced with a new one. The Partition Manager randomly selects one failed replica on the failed machine, starts creating the database on another machine and replicates the data into it. When that process completes, it takes the next replica and processes it the same way.

    The replicas on the failed machine will be a mix of primary replicas and secondary replicas of different databases, and the reconfiguration process is slightly different for each case.

    When a primary replica fails, the Partition Manager chooses one of the secondary replicas and promotes it to primary. The algorithm that picks between the two secondary replicas is not visible to users, but in most cases the secondary replica with the least load is promoted. When the primary replica fails, the client application receives a “disconnection” message the next time it tries to perform a transaction. Because the reconfiguration process can take up to 30 seconds, the client must be written to reconnect to the server after receiving the “disconnection” message in order to continue successfully.

    When a secondary replica fails, the Partition Manager checks what caused the failure. If it is a clean failure such as a system update, upgrade or reboot, it is not treated as an actual failure, because the replica is only temporarily out of service. After a short time the failed secondary replica comes back online and the database is brought back to a healthy state by running the necessary commands.

    But when the replica does not come back online within two hours (for a clean failure), or when it has failed permanently (for example through a hardware or network failure), the Partition Manager starts building a replacement replica on another machine and brings the database online. The time it takes to copy the entire data set depends on the size of the database.

    In addition to the three replicas, the Windows Azure SQL Database fabric maintains database backups for a minimum of 14 days, with a backup taken every five minutes. These backup files are stored in the same datacenter and are used in case of major hardware and system failures at the datacenter level.

    6.1 Protecting data from user deletions or modifications
    Sometimes a user deletes or modifies data accidentally or without realizing it. In that situation the database has to be restored, or a particular record has to be brought back into the existing database. The only way to handle this is to take database backups at regular intervals and keep them in a safe location for later reference or restore.

    SQL Azure provides multiple choices for backing up and restoring the database at regular intervals. Some of them are as follows:

    1. Database Copy command - creates a copy of a SQL Database as a new database on the same or another server (but within the same data center).
    2. BCP utility - exports and imports a database table to and from a flat file.

    For more information on each method, refer to the Export and Import SQL Databases and Tables section (6.3).

    6.2 Protecting data from data center damage
    The Database Copy command copies a SQL Database to a new database, which can be used for reference (when data has been deleted or modified unintentionally) or as a database to switch to in case of major data corruption.

    We also need data protection at the datacenter level, which helps when major damage happens at the datacenter's geographic location, such as earthquakes, wildfires, tornados, a nuclear reactor meltdown, etc. The solution for such datacenter-level damage is to keep a copy of the database in a different datacenter, that is, in a different geographic location.

    For example, take an application and database deployed in the East Europe datacenter. To keep the application and data protected from damage to the East Europe datacenter, the user must keep an online copy of the application and a database backup in another datacenter, be it West Europe, South Central US or any other datacenter.

    The different hosted service applications can be grouped together using Traffic Manager, so that when the East Europe datacenter is damaged, requests can be routed to whichever datacenters are online. The database connection string, however, is still configured to point to the East Europe datacenter. So the application must be able to verify the database's active status for some period (say 5 minutes) and, when the database is not responding, take the necessary action to switch the database either automatically or manually, as sketched below.
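
    As a rough sketch of that check, the Python code below (using the pyodbc driver) probes the primary connection string for a few minutes and falls back to the copy kept in the other datacenter when the primary stops answering; the connection strings, timings and the automatic fallback rule are all assumptions made for illustration.

import time
import pyodbc

# Placeholder connection strings; the secondary points at the database copy kept in another datacenter.
PRIMARY = (
    "Driver={SQL Server Native Client 10.0};"
    "Server=tcp:primaryserver.database.windows.net,1433;"
    "Database=AppDb;Uid=appuser@primaryserver;Pwd=...;Encrypt=yes;"
)
SECONDARY = (
    "Driver={SQL Server Native Client 10.0};"
    "Server=tcp:secondaryserver.database.windows.net,1433;"
    "Database=AppDb;Uid=appuser@secondaryserver;Pwd=...;Encrypt=yes;"
)

def is_alive(connection_string):
    """Return True when a trivial query succeeds against the given server."""
    try:
        connection = pyodbc.connect(connection_string)
        try:
            connection.cursor().execute("SELECT 1").fetchone()
            return True
        finally:
            connection.close()
    except pyodbc.Error:
        return False

def choose_connection_string(probe_minutes=5, interval_seconds=30):
    """Probe the primary for up to probe_minutes before falling back to the other datacenter."""
    deadline = time.time() + probe_minutes * 60
    while time.time() < deadline:
        if is_alive(PRIMARY):
            return PRIMARY
        time.sleep(interval_seconds)
    return SECONDARY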

    There are many ways to back up the online database to other datacenters.

    Some of them are as follows:
    1. SQL Database Import/Export Service
    2. SQL Data Sync
    3. Running Windows Azure Worker Role for syncing the data

    6.2.1 SQL Database Import/Export Service
    With this service, the user can take a database backup to the Blob Storage service, to an Azure Drive or to a folder within the enterprise. Once the backup completes, the backup file must be transferred to blob storage in another datacenter to protect against datacenter-level damage.

    The backup file can be restored (and then deleted) to bring the database online, which keeps the failover time short. But restoring to a new database, or keeping the backup file in the storage service, starts a billing cycle and adds cost for the enterprise on top of the data transfer cost.

    The SQL Database Import/Export service copies the database object schema and the data to a DAC export file (BACPAC). So when exporting a customer database, the output file would be customer.bacpac. The same file can be imported into another database on the same server or on another server.

    There are several methods of using the DAC framework for importing and exporting the database:
    1. Import and Export option with the Management Portal
    2. Using the Data-tier Application (DAC) framework
    3. Using an open source tool for automating SQL Azure backup

    6.2.2 SQL Data Sync
    The Windows Azure SQL Data Sync service can synchronize data in multiple ways:

    1. Synchronizing data between one SQL Azure database and another SQL Azure database, on the same or another server, and in the same or a different datacenter.
    2. Synchronizing data between a SQL Azure database and an on-premise SQL Server database.
    3. Synchronizing data between a SQL Azure database, an on-premise SQL Server database and another SQL Azure database.

    When using the SQL Data Sync service, one SQL Azure database acts as the hub and the other databases act as member databases in the sync group. The synchronization interval can be set from a minimum of five minutes to a maximum of one month.

    So by deploying SQL Azure Data Sync, the SQL Azure database can be backed up to a SQL Database in another datacenter or to an enterprise SQL Server database. When a datacenter is lost, the synchronized SQL Azure database can be brought online, or the on-premise database can be restored to Azure and brought online.

    For more information, please refer to the following URLs:
    http://msdn.microsoft.com/en-us/library/hh456371.aspx
    http://msdn.microsoft.com/en-us/library/hh667328.aspx
    http://msdn.microsoft.com/en-us/library/hh667301.aspx

    6.2.3 Running Windows Azure Worker Role for syncing the data
    This may seem a tedious method, but it gives the user the ability to customize the synchronization and to make sure the data is synced at a given point in time.

    In this method, whenever a transaction happens against the database, the data access layer adds a message to a queue describing which operation has to be applied to which data. The worker role picks up the message and applies it to the other Azure databases, whether they are on the same server, on a different server or in different datacenters. A minimal sketch of the replay step follows.
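
    The sketch below shows only the replay half of that pattern, in Python with the pyodbc driver: the queue plumbing is left out, and the message format, connection strings and Customers table are assumptions made for illustration rather than a prescribed design.

import json
import pyodbc

# Placeholder connection strings for the databases being kept in sync (other server or datacenter).
TARGET_DATABASES = [
    "Driver={SQL Server Native Client 10.0};"
    "Server=tcp:mirrorserver.database.windows.net,1433;"
    "Database=AppDb;Uid=appuser@mirrorserver;Pwd=...;Encrypt=yes;",
]

def apply_message(message_body):
    """Replay one queued transaction (a JSON message carrying the statement and its parameters)."""
    message = json.loads(message_body)
    for connection_string in TARGET_DATABASES:
        connection = pyodbc.connect(connection_string)
        try:
            connection.cursor().execute(message["sql"], message["params"])
            connection.commit()
        finally:
            connection.close()

# In the real worker role the message body would be read from (and then deleted from) the storage
# queue; here it is just an inline sample against a hypothetical Customers table.
apply_message(json.dumps({
    "sql": "UPDATE Customers SET Email = ? WHERE CustomerId = ?",
    "params": ["new@example.com", 42],
}))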

    6.3 Export and Import (Backup and Restore) SQL Databases and Tables
    Microsoft provides various ways of exporting and importing SQL databases and tables, each suited to different scenarios. Some of them are:

    1. Database Copy command
    2. Import and Export option with Management Portal
    3. Using the Data-tier Application (DAC) framework
    4. Using BCP utility
    5. Using an open source tool for automating SQL Azure backup

    6.3.1 Database Copy command
    The Database Copy command copies the entire database to a new database on the same server or on another server. To make sure the data is safe at any given point, a database copy operation can be started to create another copy of the same database, and it can be repeated at a regular interval - every day, every week, every few hours and so on.

    Once the database is copied successfully, the copy works as a standalone database using the users that exist on the server. So when copying the database to another server, we have to make sure the required users exist on that server.

    The command for copying a database to another is:
    CREATE DATABASE destination_database_name AS COPY OF 
    [source_server_name].source_database_name
    The Database Copy command works only when the target server is in the same datacenter.

    Once the database copy process has started, the progress of the operation can be checked by running the following command:
    SELECT [databases].[name], [copies].* 
    FROM sys.dm_database_copies copies 
    JOIN sys.databases databases 
    ON copies.database_id = databases.database_id
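
    For scheduled copies, both statements can also be driven from client code. The Python sketch below (pyodbc) starts a copy and polls the progress view until the copy row disappears; the server name, credentials and database names are placeholders, and note that CREATE DATABASE cannot run inside a user transaction, so the connection is opened with autocommit enabled.

import time
import pyodbc

# Placeholder connection to the master database of the destination server (required for the copy).
ADMIN_CONNECTION = (
    "Driver={SQL Server Native Client 10.0};"
    "Server=tcp:myserver.database.windows.net,1433;"
    "Database=master;Uid=admin@myserver;Pwd=...;Encrypt=yes;"
)

# CREATE DATABASE cannot be part of a user transaction, so open the connection with autocommit on.
connection = pyodbc.connect(ADMIN_CONNECTION, autocommit=True)
cursor = connection.cursor()
cursor.execute("CREATE DATABASE Database1_copy AS COPY OF Database1")

# Poll until the copy no longer appears in sys.dm_database_copies (the copy has finished or failed).
while True:
    row = cursor.execute(
        "SELECT copies.percent_complete "
        "FROM sys.dm_database_copies copies "
        "JOIN sys.databases databases ON copies.database_id = databases.database_id "
        "WHERE databases.name = 'Database1_copy'"
    ).fetchone()
    if row is None:
        break
    time.sleep(60)

connection.close()
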
    If an unwanted transaction is made at any point, the copied database can be compared against the online database values and the necessary corrections made. If required, the copied database can be renamed to the online database's name, so the application can keep running without any changes.

    Below is an example of renaming the copied database to the online database name.
    ALTER DATABASE Database1 
    MODIFY NAME = Database1_OLD
    GO
    WAITFOR DELAY '00:00:30'
    GO
    ALTER DATABASE Database1_copy_02_01_2012
    MODIFY NAME = Database1
    GO
    Some important notes when using Database Copy:

    • The copy command creates a new database, which is charged at the normal billing rate. Billing starts after the database copy operation completes successfully.
    • As the database copy is started manually by the user, the Windows Azure environment is not responsible for deleting the database when it is no longer required.
    • The charges for the newly created database follow the same rates defined for SQL Azure billing.
    • When restoring from the copy, the user needs to verify that the required data exists in the copied database.
    • If the database is federated and configured with one or more federation schemes, the database copy operation will fail.

    For more information on Database Copy, please refer to the following URL:
    http://msdn.microsoft.com/en-us/library/ff951631.aspx

    6.3.2 Import and Export option with Management Portal
    The SQL Azure Management Portal provides a facility to back up and restore a SQL Database. For more information on how to import and export a database with the Management Portal, please refer to the following URL:
    https://thirumalaipm.blogspot.com/2011/09/import-and-export-backup-and-restore.html

    6.3.3 Using the Data-tier Application (DAC) framework
    The Data-tier Application (DAC) framework tooling is a command line application that talks to SQL Azure through its APIs to import and export SQL databases. This tool can also be used for migrating an on-premise SQL Server database to a SQL Azure database.

    For more information, please refer to the following URLs:
    http://sqldacexamples.codeplex.com/
    https://thirumalaipm.blogspot.com/2011/09/import-and-export-backup-and-restore_18.html

    6.3.4 Using BCP utility
    BCP is another command line application which can be used for exporting and importing a single database table to/from a flat file.
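
    As a rough example of automating this, the Python snippet below shells out to the bcp command line tool to export one table to a character-format file and load it on another server; the server, database, table and credential values are placeholders.

import subprocess

# Placeholder servers and credentials for the export and import runs.
SOURCE_SERVER = "tcp:myserver.database.windows.net"
TARGET_SERVER = "tcp:otherserver.database.windows.net"
PASSWORD = "..."

# Export the Customers table to a character-format flat file.
subprocess.check_call([
    "bcp", "MyDatabase.dbo.Customers", "out", "customers.dat",
    "-c", "-U", "myuser@myserver", "-P", PASSWORD, "-S", SOURCE_SERVER,
])

# Import the same flat file into the matching table on another server.
subprocess.check_call([
    "bcp", "MyDatabase.dbo.Customers", "in", "customers.dat",
    "-c", "-U", "myuser@otherserver", "-P", PASSWORD, "-S", TARGET_SERVER,
])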

    For more information, please refer to the following URLs:
    http://msdn.microsoft.com/en-us/library/ms162802.aspx
    https://thirumalaipm.blogspot.com/2011/10/export-and-import-backup-and-restore.html

    6.3.5 Using an open source tool for automating SQL Azure backup
    For automating SQL Azure backup operations, Microsoft's open source portal (codeplex.com) hosts a project, Automating SQL Azure Backup using a Worker Role, under the Microsoft Public License (Ms-PL).
    The URL is http://azureautobackup.codeplex.com

  2. Reference
The other links on Achieving High Availability with Windows Azure Environment:
  1. Achieving High Availability with Windows Azure Environment - Part 1
  2. Achieving High Availability with Windows Azure Environment - Part 2
  3. Achieving High Availability with Windows Azure Environment - Part 3
