Database

Implementing Oracle9i Data Guard

Michael New, Gradation LLC ()

Abstract

Specific Data Guard Environment Presented

Overview Of Data Guard Concepts

Operational Requirements

Data Guard Architecture

Best Practices For Data Guard Configurations

Implement Backup and Recovery Strategy First

Set Appropriate Database Protection Mode

Use FORCE LOGGING Mode

Use Identical Primary And Secondary Sites

Recommended Initialization Parameter Settings

Recommended Oracle Network Structure

Consider Standby Redo Logs

Monitor Data Guard Configuration

Consider Data Guard Broker

Create Data Guard Environment

Implement FORCE LOGGING Mode (Pri)

Create Identical Primary & Secondary Sites (Pri+Sec)

Set Initialization Parameters – Use SPFILE (Pri+Sec)

Create Oracle Network Configuration Files (Pri+Sec)

Change To ARCHIVELOG Mode With Automatic Archiving (Pri)

Backup Primary Database (Pri)

Create Standby Control File (Pri)

Copy Backup To Standby Site (Sec)

Recover Standby Database (Pri,Sec)

Enable Data Guard Broker (Pri+Sec)

Appendix 1: SPFILEs

SPFILE for PROD1 on node1

SPFILE for PROD1 on node2

SPFILE for PROD2 on node1

SPFILE for PROD2 on node2

Appendix 2: Network Configuration Files

listener.ora for node1

listener.ora for node2

tnsnames.ora for node1

tnsnames.ora for node2

tnsnames.ora for Non-Privileged Clients

tnsnames.ora for DBA Clients

Appendix 3: Monitoring Scripts for Standby Databases

crontab For Monitoring Scripts

STBY_gap_check.ksh

setup_ora_STBY

recovery_mode_check.sql

archive_log_gap_check.sql

STBY_integrity_check.ksh

unrecoverable_change.sql

STBY_gap_check.ksh Email Output

STBY_integrity_check.ksh Email Output

begin_managed_standby.ksh

recover_managed_standby.ksh

Abstract

Oracle has introduced many new features in Oracle9i Data Guard to enhance Oracle8i standby database functionality. This white paper covers how to implement Oracle9i Data Guard. An actual production Data Guard environment is presented, a bidirectional physical standby database configuration between two Linux Red Hat 7.1 servers running two separate databases. However, the concepts presented apply to any platform. Step-by-step procedures and actual configuration files demonstrate a working Data Guard implementation.[1]

Specific Data Guard Environment Presented

The environment presented here is that from an implementation of Data Guard at a customer site. It was a standard bidirectional Data Guard configuration built for two large mission-critical data warehouses on two separate Linux servers, as shown in Figure 1 below:

node1node2

Primary database: PROD1Primary database: PROD2

Standby database: PROD2Standby database: PROD1

Figure 1: Standard bidirectional Data Guard configuration implemented.

The specifics of this implementation were to create a standby database on server node2 for an existing database, PROD1, on node1. Similarly, to create a standby database on server node1 for a separate existing database, PROD2, on node2. Because one implementation was the mirror image of the other, the generic steps were the same to build each standby. The first standby built was the PROD1 standby on node2. Then the PROD2 standby was built on node1. Whenever specific commands or values are indicated here, they are for the PROD1 standby implementation on node2.

Each server was running a separate instance of Oracle 9.2.0.1.0 configured for dedicated server on a Red Hat Linux 7.1 platform. Each server was a Dell Poweredge 2550 with two 1000Mhz P3 processors and 4GB RAM. The two servers were located on a WAN, and the network provided high throughput between the servers. These servers were located in different geographical locations, thereby providing disaster recovery. The primary database on one server had its standby database on the other server to make efficient use of each system with no idle hardware. If either primary database became incapacitated, the physical standby database at the other location could be failed over to the primary role so processing could continue.

Overview Of Data Guard Concepts

Data Guard is software that maintains a standby database, or real-time copy of a primary database. Data Guard is an excellent High Availability (HA) solution, and can be used for Disaster Recovery (DR) when the standby site is in a different geographical location than the primary site. When the sites are identical, and the physical location of the production database is transparent to the user, the production and standby roles can easily switch between sites for many different types of unplanned or planned outages.[2]

Oracle Data Guard manages the two databases by providing remote archiving, managed recovery, switchover and failover features. A secondary site that is identical to the primary site allows predictable performance and response time after failing over or switching over from the primary site. An identical secondary site also allows for identical procedures, processes, and management between sites. The secondary site is leveraged for all unplanned outages not resolved automatically or quickly on the primary site, and for many planned outages when maintenance is required on the primary site.

Data Guard with a physical standby database provides benefits, which fall into two broad classes:

  • Availability and disaster protection - provides protection from human errors, data failures, and from physical corruptions due to device failure. Provides switchover operations for primary site maintenance, and different database protection modes to minimize or create no data loss environments. A specified delay of redo application at the standby database can be configured to ensure that a logical corruption or error such as dropping a table will be detected before the change is applied to the standby database. Using the standby database, most database failures are resolved faster than by using on-disk backups since the amount of database recovery is dramatically reduced. The standby database can be geographically separate from the primary database, a feature that provides Disaster Recovery against local catastrophic events. Data Guard, therefore, provides a higher degree of availability than other HA methods that do not employ a second database, such as Real Application Clusters (RAC) or Highly Available Disk Arrays (HADA).
  • Manageability - provides a framework for remote archiving services and managed standby recovery, contains role management services such as switchover and failover, and allows offloading of backups and read-only activities from the production database. The Data Guard broker provides the Data Guard Manager GUI and command-line interface to automate the management and monitoring of the Data Guard environment.

Operational Requirements

Below are operational requirements for maintaining a standby database. Some of these requirements are more lax then Data Guard best practices would dictate (see Best Practices For Data Guard Configurations below):

  • The primary database must run in ARCHIVELOG mode.
  • The primary and standby databases must be the same database release. To use the Data Guard broker, the database server must be licensed for Oracle9i Enterprise Edition or Personal Edition. The operating system on the primary and standby sites must be the same, but the operating system release does not need to be the same. The hardware and operating system architecture on the primary and standby locations must be the same. For example, a Data Guard configuration with a primary database on a 32-bit Linux system must be configured with a standby database on a 32-bit Linux system.
  • The primary database can be a single instance database or a multi-instance Real Application Clusters database. The standby databases can be single instance databases or multi-instance Real Application Clusters databases, and these standby databases can be a mix of both physical and logical types.
  • If using a physical standby database, log transport services must be configured to specify a dedicated server process rather than a shared server (dispatcher) process in managed recovery mode. Although the read-only mode allows a shared server process, you must have a dedicated server once you open the database again in managed recovery mode.[3]
  • The hardware (for example, the number of CPUs, memory size, storage configuration) can be different between the primary and standby systems.
  • Each primary database and standby database must have its own control file.
  • If you place your primary and standby databases on the same system, you must adjust the initialization parameters correctly.

Data Guard Architecture

Oracle9i Data Guard uses several processes to achieve the automation necessary for disaster recovery and high availability:

  • On the primary location, log transport services use the following processes:
  • Log writer process (LGWR) – This process collects transaction redo and updates the online redo logs.
  • Archiver process (ARC) –These processes create a copy of the online redo logs, either locally (or remotely for standby databases).
  • Fetch archive log (FAL) process (physical standby databases only) – This process provides a client/server mechanism for resolving gaps detected in the range of archived redo logs generated at the primary database and received at the standby database. The FAL client requests the transfer of archived redo log files automatically when it detects a gap in the redo logs received by the standby database. The FAL server typically runs on the primary database and services the FAL requests coming from the FAL client. The FAL client and server are configured using the FAL_CLIENT and FAL_SERVER initialization parameters which are set on the standby location.
  • On the standby location, log transport services use the following processes:
  • Remote file server (RFS) – This process receives redo logs from the primary database.
  • Archiver process (ARC) – This process archives the standby redo logs when standby redo logs and LGWR are used.
  • On the standby location, log apply services use the following processes:
  • Managed recovery process (MRP) - For physical standby databases only, the MRP applies archived redo log information to the physical standby database.
  • Logical standby process (LSP) - For logical standby databases only, the LSP applies archived redo log information to the logical standby database, using SQL interfaces.
  • On the primary and standby locations, the Data Guard broker uses the following processes:
  • Data Guard broker monitor (DMON) process – These processes work cooperatively to manage the primary and standby databases as a unified configuration. The DMON processes work together to execute switchover and failover operations, monitor the status of the databases, and manage log transport services and log apply services.

Figure 2 identifies the relationships of these processes to the operations they perform and the database objects on which they operate in the absence of the Data Guard broker.

Figure 2: Data Guard architecture.

In this figure, the standby redo logs are optionally configured for physical standby databases, except when running in maximum protection mode, which requires physical standby databases and standby redo logs. Logical standby databases do not use standby redo logs.

Log Transport Services

The log transport services component of the Data Guard environment is responsible for automatic archiving of primary database online redo logs. Log transport services provide for the management of archived redo log permissions, destinations, transmission, reception, and transmission failure resolution. In a Data Guard environment, the log transport services component coordinates its activities with log apply services and role management services for switchover and failover operations.

Log Apply Services

Log apply services automatically apply archived redo logs to maintain transactional synchronization with the primary database, and allow transactionally consistent read-only access to the data.

The main difference between physical and logical standby databases is the manner in which log apply services apply the archived redo logs. For physical standby databases, log apply services maintain the standby database in managed recovery mode or open read-only mode. For logical standby databases, log apply services maintain the standby database in SQL apply mode. The following list summarizes these modes:

  • Managed recovery mode (physical standby databases only) - In this mode, log transport services archive logs to the standby site, and log apply services automatically apply these logs. If you want maximum protection against data loss or corruption, then maintain the standby database in managed recovery mode in a Data Guard environment.
  • Read-only mode (physical standby databases only) - Use read-only mode for supplemental reporting of data contained in the primary database. If you want to use the standby database for reporting purposes, then open it in read-only mode in a Data Guard environment. Log apply services cannot apply archived redo logs to the standby database when it is in this mode, but you can still execute queries on the database. While the standby database is in read-only mode, it can continue to receive archived redo logs from the primary database.
  • SQL apply mode (logical standby databases only) - Log apply services manage logical standby databases in SQL apply mode only. Only logical standby databases can be opened in read/write mode, but the target tables for the regenerated SQL statements are available only in read-only mode for reporting purposes. The SQL apply mode supports the application of SQL statements and reporting activities simultaneously to the logical standby database.

Role Management Services

A database operates in one of two mutually exclusive roles: primary or standby. Role management services operate in conjunction with the log transport services and log apply services to change these roles dynamically as a planned transition called a switchover operation, or as a result of a database failure through either a graceful failover or a forced failover operation. To summarize, here are the types of failover:

  • Database switchover - Switchover operations provide the means to transition database roles from a primary role to a standby role and visa-versa.
  • Graceful (no-data-loss) failover - A no-data-loss failover is possible if the corresponding primary database is operating in either the Maximum Protection or Maximum Availability data protection mode.
  • Forced (minimal-data-loss) failover - Minimal-data-loss failover occurs when primary database modifications are not yet available on the standby site at the time the failover operation occurs. Minimal data loss is possible during a forced failover when operating in Maximum Performance mode.

You should not fail over to a standby database except in an emergency, because the failover operation is an unplanned transition that may result in data loss. If you need to fail over before the primary and standby databases are resynchronized, and if the standby database becomes inaccessible, data on the primary database may diverge from data on standby databases. You can prevent this by using a combination of forced logging (to force the generation of redo records of changes against the database), standby redo logs, and the Maximum Protection mode with physical standby databases.

The amount of data differences or data loss you incur during a failover operation is directly related to how you set up the overall Data Guard configuration, and log transport services in particular, and can be prevented entirely by using Maximum Protection mode.

Best Practices For Data Guard Configurations

A robust Data Guard standby database setup is essential to ensuring that both databases work properly during switchover, failover and switchback operations. To this end, following is a summary of the recommended Data Guard configuration options. The exact options and parameters specified below are those implemented in the environment discussed below under Create Data Guard Environment.

A number of these best practices recommendations are attributed to an excellent Oracle White Paper called Maximum Availability Architecture (MAA).[4] However, an important distinction between the environment presented in this MAA White Paper and that here is that the former recommends the Maximum Protection mode whereas the customer here chose Maximum Performance mode. So while most of these options are best practices for any type of Data Guard configuration, a few options were affected by the choice of Maximum Performance mode, and are so noted. Examples of these options are several LOG_ARCHIVE_DEST_2 attributes, and choosing not to use standby redo logs (SRLs). Some options not implemented here (such as SRLs) are presented for sake of completeness, and are not necessarily inconsistent with Maximum Performance mode. See the MAA White Paper for settings more appropriate for Maximum Availability mode, or for more details on options not implemented in the particular environment presented here.

While Maximum Protection mode is certainly the more robust choice for High Availability, the term best practices is relative and must be adapted to a particular client’s environment and needs. In this case study, our customer had two separate data warehouse environments, both of which were recoverable from daily backups within the timeframe specified by their service level agreements. The benefit of increased database performance by using Maximum Performance mode outweighed the possibility of minimal data loss.