jump to navigation

Testing your Oracle Standby configuration using Oracle Broker. 4 April 2011

Posted by David Alejo Marcos in Dataguard Broker, Oracle 11.2, Standby.
Tags: , ,
add a comment

Several weeks ago I wrote a post regarding testing your standby database ( https://davidalejomarcos.wordpress.com/2011/03/01/how-to-test-your-dataguard-configuration-standby-database/ ). The database I tested did not have Oracle Broker configured.

Today I will show you how I did perform a switchover with Oracle Broker configured.

The problem:

Perform a switchover using Oracle Broker.

The solution:

It proved to be quite simple and trouble-free.

Below are the steps I followed:

Note.- Primary is my production primary database during all the exercise, independently of the role the database is running. PRMRY relates to the commands to be run on the current production database.

Note.- Standby is my production standby database during all the exercise, independently of the role the database is running. STNDBY relates to the commands to be run on the current standby database.

1.- Check primary and standby database are in sync:

PRMRY> select sequence# seq,thread# thread  from v$log where status = 'CURRENT' order by thread#;

SEQ     THREAD
---------- ----------
33388        1

PRMRY> alter system switch logfile;

System altered.

PRMRY> /

System altered.

PRMRY> /

System altered.

PRMRY> select sequence# seq,thread# thread  from v$log where status = 'CURRENT' order by thread#;

SEQ     THREAD
---------- ----------
33391        1

PRMRY> exit

STNDBY> select distinct max( sequence#) over (partition by thread# order by thread#) seq, thread# thread from gv$log_history;

SEQ     THREAD
---------- ----------
33390        1

STNDBY> exit

2.- Connect to Oracle Broker from the server where PRMRY is running and:

[oracle@sssss ~]$ dgmgrl
DGMGRL for Linux: Version 11.2.0.1.0 - 64bit Production

Copyright (c) 2000, 2009, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /
Connected.

2.1.- Check the configuration for errors:

DGMGRL> show configuration

Configuration - ssss_DG

Protection Mode: MaxPerformance
Databases:
PRIMARY    - Primary database
STANDBY    - Physical standby database

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS

2.2.- Check primary is in “TRANSPORT-ON” state


DGMGRL> show database primary

Database - primary

Enterprise Manager Name: sss_primary
Role:                    PRIMARY
Intended State:          TRANSPORT-ON
Instance(s):
sss

Database Status:
SUCCESS

2.3.- standby database is in “APPLY-ON” state:

DGMGRL> show database standby

Database - standby

Enterprise Manager Name: sss_standby
Role:                    PHYSICAL STANDBY
Intended State:          APPLY-ON
Transport Lag:           0 seconds
Apply Lag:               0 seconds
Real Time Query:         OFF
Instance(s):
sss

Database Status:
SUCCESS

3.- Perform the switchover using Oracle Broker:

DGMGRL> switchover to standby
Performing switchover NOW, please wait...
New primary database "standby" is opening...
Operation requires shutdown of instance "sss" on database "primary"
Shutting down instance "sss"...
ORA-01031: insufficient privileges

Warning: You are no longer connected to ORACLE.

Please complete the following steps to finish switchover:
shut down instance "sss" of database "primary"
start up and mount instance "sss" of database "primary"

DGMGRL>

4.- Restart the new standby database and register it with the listener:

[oracle@ssss ~]$ sqlplus '/as sysdba'

SQL*Plus: Release 11.2.0.1.0 Production on Sat Apr 2 09:02:20 2011

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning option

SQL> select open_mode from v$database;

OPEN_MODE
--------------------
MOUNTED

SQL> shutdown immediate

ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.

SQL> startup mount

Total System Global Area 1.2827E+10 bytes
Fixed Size            2225336 bytes
Variable Size         1207962440 bytes
Database Buffers     1.1610E+10 bytes
Redo Buffers            7348224 bytes
Database mounted.

SQL> alter system register;

System altered.

SQL> exit

5.- Check log files are being sent and applied on the new standby database.

As always, comments are welcome.

————————-

Appendix:

Below is the section of the primary database while switching over to standby:

Sat Apr 02 08:58:37 2011
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY  [Process Id: 8432] (sss)
Sat Apr 02 08:58:37 2011
Thread 1 cannot allocate new log, sequence 33392
Private strand flush not complete
Current log# 3 seq# 33391 mem# 0: /u03/oradata/sss/redo03.rdo
Thread 1 advanced to log sequence 33392 (LGWR switch)
Current log# 1 seq# 33392 mem# 0: /u03/oradata/sss/redo01.rdo
Waiting for all non-current ORLs to be archived...
Waiting for the ORL for thread 1 sequence 33391 to be archived...
Sat Apr 02 08:58:44 2011
Archived Log entry 68782 added for thread 1 sequence 33391 ID 0x2a7ee921 dest 1:
Sat Apr 02 08:58:51 2011
ORL for thread 1 sequence 33391 has been archived...
All non-current ORLs have been archived.
Waiting for all FAL entries to be archived...
All FAL entries have been archived.
Waiting for dest_id 2 to become synchronized...
Sat Apr 02 08:59:01 2011
Active, synchronized Physical Standby  switchover target has been identified
Sat Apr 02 08:59:01 2011
Thread 1 cannot allocate new log, sequence 33393
Private strand flush not complete
Current log# 1 seq# 33392 mem# 0: /u03/oradata/sss/redo01.rdo
Thread 1 advanced to log sequence 33393 (LGWR switch)
Current log# 2 seq# 33393 mem# 0: /u03/oradata/sss/redo02.rdo
ARCH: Standby redo logfile selected for thread 1 sequence 33392 for destination LOG_ARCHIVE_DEST_2
Archived Log entry 68783 added for thread 1 sequence 33392 ID 0x2a7ee921 dest 1:
Sat Apr 02 08:59:01 2011
Stopping background process CJQ0
Sat Apr 02 08:59:01 2011
SMON: disabling tx recovery
Stopping background process QMNC
CLOSE: killing server sessions.
Sat Apr 02 08:59:14 2011
CLOSE: all sessions shutdown successfully.
Sat Apr 02 08:59:14 2011
SMON: disabling cache recovery
Sat Apr 02 08:59:15 2011
Shutting down archive processes
Archiving is disabled
Sat Apr 02 08:59:15 2011
ARCH shutting down
Sat Apr 02 08:59:15 2011
ARCH shutting down
ARC9: Archival stopped
ARC7: Archival stopped
Sat Apr 02 08:59:15 2011
ARCH shutting down
ARC5: Archival stopped
Sat Apr 02 08:59:15 2011
ARCH shutting down
Sat Apr 02 08:59:15 2011
ARCH shutting down
ARC3: Archival stopped
ARC0: Archival stopped
Sat Apr 02 08:59:15 2011
ARCH shutting down
Sat Apr 02 08:59:15 2011
ARCH shutting down
ARC4: Archival stopped
ARC6: Archival stopped
Sat Apr 02 08:59:15 2011
ARCH shutting down
ARC2: Archival stopped
Sat Apr 02 08:59:15 2011
ARCH shutting down
ARC8: Archival stopped
Sat Apr 02 08:59:15 2011
ARCH shutting down
ARC1: Archival stopped
Thread 1 closed at log sequence 33393
Successful close of redo thread 1
ARCH: Noswitch archival of thread 1, sequence 33393
ARCH: End-Of-Redo Branch archival of thread 1 sequence 33393
Archived Log entry 68785 added for thread 1 sequence 33393 ID 0x2a7ee921 dest 1:
ARCH: Archiving is disabled due to current logfile archival
Primary will check for some target standby to have received all redo
Final check for a synchronized target standby. Check will be made once.
LOG_ARCHIVE_DEST_2 is a potential Physical Standby  switchover target
Active, synchronized target has been identified
Target has also applied all redo
Backup controlfile written to trace file /opt/oracle/diag/rdbms/primary/sss/trace/sss_rsm0_8432.trc
Clearing standby activation ID 712960289 (0x2a7ee921)
The primary database controlfile was created using the
'MAXLOGFILES 12' clause.
There is space for up to 9 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 262144000;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 262144000;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 262144000;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 262144000;
Archivelog for thread 1 sequence 33393 required for standby recovery
Switchover: Primary controlfile converted to standby controlfile succesfully.
Sat Apr 02 08:59:19 2011
MRP0 started with pid=18, OS id=10456
MRP0: Background Managed Standby Recovery process started (sss)
Serial Media Recovery started
Managed Standby Recovery not using Real Time Apply
Online logfile pre-clearing operation disabled by switchover
Media Recovery Log /u01/oradata/sss/arch/log1_33393_708427666.arc
Identified End-Of-Redo for thread 1 sequence 33393
Resetting standby activation ID 0 (0x0)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Applied until change 241462639865
MRP0: Media Recovery Complete: End-Of-REDO (sss)
MRP0: Background Media Recovery process shutdown (sss)
Sat Apr 02 08:59:25 2011
Switchover: Complete - Database shutdown required (sss)
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN
Using STANDBY_ARCHIVE_DEST parameter default value as /u01/oradata/sss/arch
ALTER SYSTEM SET log_archive_dest_2='' SCOPE=BOTH;
Using STANDBY_ARCHIVE_DEST parameter default value as /u01/oradata/sss/arch
ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH;
Sat Apr 02 08:59:29 2011
RFS[1]: Assigned to RFS process 11886
RFS[1]: Identified database type as 'physical standby': Client is ARCH pid 5025
RFS[1]: Opened log for thread 1 sequence 33394 dbid 666702615 branch 708427666
Sat Apr 02 08:59:29 2011
RFS[2]: Assigned to RFS process 11894
RFS[2]: Identified database type as 'physical standby': Client is ARCH pid 5021
Sat Apr 02 08:59:29 2011
RFS[3]: Assigned to RFS process 11898
RFS[3]: Identified database type as 'physical standby': Client is ARCH pid 5029
RFS[3]: Opened log for thread 1 sequence 33395 dbid 666702615 branch 708427666
Archived Log entry 68787 added for thread 1 sequence 33394 rlc 708427666 ID 0x2cbe3767 dest 2:
Archived Log entry 68788 added for thread 1 sequence 33395 rlc 708427666 ID 0x2cbe3767 dest 2:
Sat Apr 02 08:59:33 2011
RFS[4]: Assigned to RFS process 12348
RFS[4]: Identified database type as 'physical standby': Client is ARCH pid 5037
RFS[4]: Opened log for thread 1 sequence 33396 dbid 666702615 branch 708427666
Archived Log entry 68789 added for thread 1 sequence 33396 rlc 708427666 ID 0x2cbe3767 dest 2:
Sat Apr 02 08:59:33 2011
RFS[5]: Assigned to RFS process 12342
RFS[5]: Identified database type as 'physical standby': Client is LGWR ASYNC pid 24051
Primary database is in MAXIMUM PERFORMANCE mode
RFS[5]: Opened log for thread 1 sequence 33397 dbid 666702615 branch 708427666
Archived Log entry 68790 added for thread 1 sequence 33397 rlc 708427666 ID 0x2cbe3767 dest 2:
RFS[5]: Opened log for thread 1 sequence 33398 dbid 666702615 branch 708427666
Sat Apr 02 09:00:17 2011
RFS[6]: Assigned to RFS process 18674
RFS[6]: Identified database type as 'physical standby': Client is ARCH pid 5021
Sat Apr 02 09:00:23 2011
Archived Log entry 68791 added for thread 1 sequence 33398 rlc 708427666 ID 0x2cbe3767 dest 2:
RFS[5]: Opened log for thread 1 sequence 33399 dbid 666702615 branch 708427666

and here is the section of the alert.log file of the production standby database during the switchover:

Sat Apr 02 08:58:41 2011
Media Recovery Waiting for thread 1 sequence 33392
Sat Apr 02 08:58:41 2011
Archived Log entry 34084 added for thread 1 sequence 33391 ID 0x2a7ee921 dest 1:
Sat Apr 02 08:58:51 2011
RFS[6]: Assigned to RFS process 23317
RFS[6]: Identified database type as 'physical standby': Client is ARCH pid 8452
Sat Apr 02 08:59:01 2011
RFS[7]: Assigned to RFS process 23326
RFS[7]: Identified database type as 'physical standby': Client is Foreground pid 8432
RFS[7]: Selected log 6 for thread 1 sequence 33392 dbid 666702615 branch 708427666
Sat Apr 02 08:59:01 2011
Archived Log entry 34085 added for thread 1 sequence 33392 ID 0x2a7ee921 dest 1:
Sat Apr 02 08:59:02 2011
Media Recovery Log /u01/oradata/sss/arch/log1_33392_708427666.arc
Media Recovery Waiting for thread 1 sequence 33393
Sat Apr 02 08:59:11 2011
RFS[8]: Assigned to RFS process 23340
RFS[8]: Identified database type as 'physical standby': Client is ARCH pid 8452
Sat Apr 02 08:59:18 2011
RFS[9]: Assigned to RFS process 24039
RFS[9]: Identified database type as 'physical standby': Client is Foreground pid 8432
RFS[9]: Opened log for thread 1 sequence 33393 dbid 666702615 branch 708427666
Archived Log entry 34086 added for thread 1 sequence 33393 rlc 708427666 ID 0x2a7ee921 dest 2:
Sat Apr 02 08:59:18 2011
Media Recovery Log /u01/oradata/sss/arch/log1_33393_708427666.arc
Identified End-Of-Redo for thread 1 sequence 33393
Resetting standby activation ID 712960289 (0x2a7ee921)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Resetting standby activation ID 712960289 (0x2a7ee921)
Media Recovery Waiting for thread 1 sequence 33394
Sat Apr 02 08:59:19 2011
RFS[10]: Assigned to RFS process 24045
RFS[10]: Identified database type as 'physical standby': Client is Foreground pid 8432
Sat Apr 02 08:59:25 2011
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /opt/oracle/diag/rdbms/standby/sss/trace/sss_pr00_6126.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Waiting for MRP0 pid 6122 to terminate
Errors in file /opt/oracle/diag/rdbms/standby/sss/trace/sss_pr00_6126.trc:
ORA-16037: user requested cancel of managed recovery operation
Sat Apr 02 08:59:26 2011
MRP0: Background Media Recovery process shutdown (sss)
Managed Standby Recovery Canceled (sss)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN
ALTER DATABASE SWITCHOVER TO PRIMARY (sss)
Maximum wait for role transition is 15 minutes.
Backup controlfile written to trace file /opt/oracle/diag/rdbms/standby/sss/trace/sss_rsm0_5074.trc
SwitchOver after complete recovery through change 241462639865
Online log /u03/oradata/sss/redo01.rdo: Thread 1 Group 1 was previously cleared
Online log /u03/oradata/sss/redo02.rdo: Thread 1 Group 2 was previously cleared
Online log /u03/oradata/sss/redo03.rdo: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 241462639863
Switchover: Complete - Database mounted as primary
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN
Sat Apr 02 08:59:26 2011
ARC7: Becoming the 'no SRL' ARCH
Sat Apr 02 08:59:29 2011
ARC8: Becoming the 'no SRL' ARCH
ALTER SYSTEM SET log_archive_dest_2='service="horvitz"','LGWR ASYNC NOAFFIRM delay=0 optional compression=disable max_failure=0 max_connections=1 reopen=300 db_unique_name="horvitz" net_timeout=30','valid_for=(all_logfiles,primary_role)' SCOPE=BOTH;
ALTER DATABASE OPEN
Data Guard Broker initializing...
Switching redo format version from 11.1.0.0.0 to 11.2.0.0.0 at change 241462639868
Thread 1 advanced to log sequence 33395 (COMPATIBLE advance)
Sat Apr 02 08:59:29 2011
Assigning activation ID 750663527 (0x2cbe3767)
Thread 1 advanced to log sequence 33396 (thread open)
Sat Apr 02 08:59:29 2011
ARC9: Becoming the 'no SRL' ARCH
Thread 1 opened at log sequence 33396
Current log# 3 seq# 33396 mem# 0: /u03/oradata/sss/redo03.rdo
Successful open of redo thread 1
Sat Apr 02 08:59:29 2011
ARC0: Becoming the 'no SRL' ARCH
Archived Log entry 34087 added for thread 1 sequence 33394 ID 0x2cbe3767 dest 1:
Sat Apr 02 08:59:29 2011
NSA2 started with pid=32, OS id=24051
Sat Apr 02 08:59:29 2011
ARC1: Becoming the 'no SRL' ARCH
Archived Log entry 34088 added for thread 1 sequence 33395 ID 0x2cbe3767 dest 1:
Sat Apr 02 08:59:29 2011
SMON: enabling cache recovery
Thread 1 advanced to log sequence 33397 (LGWR switch)
Current log# 1 seq# 33397 mem# 0: /u03/oradata/sss/redo01.rdo
******************************************************************
LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_2
******************************************************************
Sat Apr 02 08:59:32 2011
Archived Log entry 34091 added for thread 1 sequence 33396 ID 0x2cbe3767 dest 1:
Successfully onlined Undo Tablespace 1.
Dictionary check beginning
Dictionary check complete
Verifying file header compatibility for 11g tablespace encryption..
Verifying 11g file header compatibility for tablespace encryption completed
SMON: enabling tx recovery
Database Characterset is AL32UTF8
No Resource Manager plan active
replication_dependency_tracking turned off (no async multimaster replication found)
Sat Apr 02 08:59:36 2011
Starting background process QMNC
Sat Apr 02 08:59:36 2011
QMNC started with pid=33, OS id=24060
LOGSTDBY: Validating controlfile with logical metadata
LOGSTDBY: Validation complete
Completed: ALTER DATABASE OPEN
ALTER SYSTEM SET log_archive_trace=0 SCOPE=BOTH SID='sss';
ALTER SYSTEM SET log_archive_format='log%t_%s_%r.arc' SCOPE=SPFILE SID='sss';
ALTER SYSTEM SET standby_file_management='AUTO' SCOPE=BOTH SID='*';
ALTER SYSTEM SET archive_lag_target=1800 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_max_processes=10 SCOPE=BOTH SID='*';
ALTER SYSTEM SET log_archive_min_succeed_dest=1 SCOPE=BOTH SID='*';
ALTER SYSTEM ARCHIVE LOG
Thread 1 advanced to log sequence 33398 (LGWR switch)
Current log# 2 seq# 33398 mem# 0: /u03/oradata/sss/redo02.rdo
Sat Apr 02 08:59:37 2011
ARC7: STARTING ARCH PROCESSES
Archived Log entry 34094 added for thread 1 sequence 33397 ID 0x2cbe3767 dest 1:
Sat Apr 02 08:59:37 2011
ARCa started with pid=34, OS id=24064
ARCa: Archival started
ARC7: STARTING ARCH PROCESSES COMPLETE
Sat Apr 02 08:59:43 2011
Starting background process CJQ0
Sat Apr 02 08:59:43 2011
CJQ0 started with pid=39, OS id=24099
Setting Resource Manager plan SCHEDULER[0xFD16]:DEFAULT_MAINTENANCE_PLAN via scheduler window
Setting Resource Manager plan DEFAULT_MAINTENANCE_PLAN via parameter
Sat Apr 02 08:59:47 2011
Starting background process VKRM
Sat Apr 02 08:59:47 2011
VKRM started with pid=38, OS id=24118
Sat Apr 02 09:00:17 2011
Shutting down archive processes
Sat Apr 02 09:00:17 2011
ARCH shutting down
ARCa: Archival stopped
Sat Apr 02 09:00:20 2011
ALTER SYSTEM ARCHIVE LOG
Sat Apr 02 09:00:20 2011
Thread 1 cannot allocate new log, sequence 33399
Checkpoint not complete
Current log# 2 seq# 33398 mem# 0: /u03/oradata/sss/redo02.rdo
Thread 1 advanced to log sequence 33399 (LGWR switch)
Current log# 3 seq# 33399 mem# 0: /u03/oradata/sss/redo03.rdo
Archived Log entry 34096 added for thread 1 sequence 33398 ID 0x2cbe3767 dest 1:

How to test your Dataguard Configuration (Standby Database) 1 March 2011

Posted by David Alejo Marcos in Oracle 11.2, RAC, Standby.
Tags: , ,
3 comments

In a couple of weeks we will have to test our DR (based on standby database) configuration for our 3-node cluster.

This cluster has 3 databases running and, as part of the DR test, we will do a controlled fail over to DR, run some batches over there and fail back to production.

The problem:

Activate physical standby for testing. As part of the test, we will run some batches to ensure DR if fit to purpose, flashback any changes, and fail back to production.

The solution:

Below are the steps we will follow.

— Create restore point on standby
STANDBY> SELECT flashback_on, log_mode FROM v$database;
STANDBY> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
STANDBY> CREATE RESTORE POINT before_dr_test GUARANTEE FLASHBACK DATABASE;
STANDBY> SELECT scn, database_incarnation#, guarantee_flashback_database, storage_size, time, name FROM gv$restore_point;
STANDBY> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

— Perform couple of logfile switches.
PRIMARY> ALTER SYSTEM SWITCH LOGFILE;
PRIMARY> ALTER SYSTEM SWITCH LOGFILE;
PRIMARY> ALTER SYSTEM SWITCH LOGFILE;

— Next step only if you are using Dataguard Broker.
PRIMARY> ALTER SYSTEM SET DG_BROKER_START=FALSE;

PRIMARY> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=DEFER;

— We will not ship any logfiles to old primary due to change of role.

STANDBY> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

STANDBY> ALTER SYSTEM SET LOG_ARCHVE_DEST_STATE_2=DISABLE;

— Convert standby database to a primary database.
STANDBY> ALTER DATABASE ACTIVATE STANDBY DATABASE;

STANDBY> STARTUP MOUNT FORCE;

— OPTIONAL. change dataguard protection mode to maximum performance if needed

STANDBY> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;

— Open database

STANDBY> ALTER DATABASE OPEN;

— Standby database is open for testing. No changes should be sent to primary database as destination is disabled.

— After testing, we perform Flashback to undo any changes and activate database as physical standby.
STANDBY> SHUTDOWN IMMEDIATE;

STANDBY> STARTUP MOUNT;

STANDBY> FLASHBACK DATABASE TO RESTORE POINT before_dr_test;

STANDBY> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;

STANDBY> SHUTDOWN IMMEDIATE;

STANDBY> STARTUP MOUNT;

STANDBY> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

— Enable logfile transfer on both, primary and standby.
PRIMARY> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;

STANDBY> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;

— Perform couple of logfile switches to verify configuration is working.
PRIMARY> ALTER SYSTEM SWITCH LOGFILE;
PRIMARY> ALTER SYSTEM SWITCH LOGFILE;
PRIMARY> ALTER SYSTEM SWITCH LOGFILE;

PRIMARY> SELECT DEST_ID, DEST_NAME, STATUS, PROTECTION_MODE, DESTINATION, ERROR, APPLIED_THREAD#, APPLIED_SEQ#, SYNCHRONIZED, GAP_STATUS FROM V$ARCHIVE_DEST_STATUS;

— Drop restore point on DR (this will avoid to run out of space on your FRA).
STANDBY> DROP RESTORE POINT before_dr_test;

As always, comments are welcome.

ORA-01555 caused by SQL statement below… on Active Dataguard after upgrade to 11.2.0.2 25 January 2011

Posted by David Alejo Marcos in Oracle 11.2, Standby, Tuning.
Tags: , ,
6 comments

Sometimes I wonder why we do upgrade our databases…

The problem:

after upgrading from 11.2.0.1 to 11.2.0.2, our standby databases (we are using active dataguard) was recording ORA-01555 on the alert log file after being running for some time.

We actively use our standby databases to run some queries for reporting.

The solution:

We have two standby databases, one local (less than 50km away) and another one on a remote location (250km) from primary.

The first thing we did was to raise a ticket with Oracle and change one of the standby databases from “open read only” to “mounted” (I had experience with problems on active dataguard in the past).

This proved useful in several ways:

1.- We narrowed the problem to active dataguard. The database in mount status was applying log files without issues and not logging any errors. The other standby started to log errors after a few hours of running (seems time depends on the workload the database is under).

2.- Once we stop the database and open it in read only, it will work for some time (not very elegant but still useful).

3.- Oracle Support came with a bug and a fix.

The bug number is 10320455 and the new patch is 10018789 (to be applied on top of 11.2.0.2).

The problem itself seems to be a hang (lock) in the library cache. This lock will cause the workload to accumulate and do not release resources and logging ORA-01555 after some time.

As the patch only updates binaries (546 kb in size) we might apply it to DR and, once we secure some downtime over the weekend, apply it to production. This will help us to minimize the time we impact business by having the standby databases in mounted status.

As always, comments are welcome.

ORA-16765 + ORA-16826: apply service state is inconsistent with the DelayMins property with Dataguard Broker 28 October 2010

Posted by David Alejo Marcos in Dataguard Broker, Oracle 11.2, Standby.
Tags: , ,
3 comments

Last Saturday I had to start-up our DR Database as the server went down 3 days before due to maintenance.

It should have been an easy task as the DR database was lagging “only” 400 files behind and all archivelogs were on the primary database.

The problem:

I had the following errors on the Dataguard Broker:

ORA-16765: Redo Apply is running

and

ORA-16826: apply service state is inconsistent with the DelayMins property

I first thought it was related to how much time we allow the standby database to be behind the primary, but it is not the case.

The solution:

I did copy all necessary archivelogs from the primary server to the standby server using SCP.

Once I had the necessary archivelogs, I registered using the following script:


spool /tmp/register.sql
begin
for i in 38010 .. 38060 loop
dbms_output.put_line ('alter database register logfile ''/u01/oradata/dbname/arch/1_'||i||'_697561259.arc'';');
end loop;
end;
/
@/tmp/register.sql

where:
1.- 38010 is the first archivelog file the standby was waiting for.
2.- 38060 is the last archivelog being generated by the primary database.

Once archivelogs were registered, the database started to apply them.

With only two logfiles behind, I decided to enable dataguard broker but the same error happened.

The only way I could get around was:

1.- To remove the standby database from the configuration:
DGMGRL> remove database 'dr_database';
Removed database "dr_database" from the configuration

2.- Add the database back to the configuration:
DGMGRL> add database 'dr_database' as connect identifier is 'dr_database';
Database "dr_database" added

3.- And then enable it again:
DGMGRL> enable database 'dr_database';
Enabled.

And voila, it started to work again.

As always, comments are welcome.

ORA-01187: cannot read from file because it failed verification tests + ORA-01110 on tempfiles 5 October 2010

Posted by David Alejo Marcos in Oracle 11.2, Standby.
Tags: ,
3 comments

Today I had a call from a Java developer asking me to help him with an error from our applications. The error being reported was:

java.sql.SQLException: ORA-01187: cannot read from file because it failed verification tests ORA-01110: data file 206: '/u03/oradata/xxx/tempapp22.dbf'

so I did have a look.

The problem:

ORA-01187 + ORA-01110 on the application logs. This part of the application runs on an Active Standby database.

This database has two temporary tablespaces called TEMPAPP and TEMPUSER. We assign TEMPAPP to application users and TEMPUSER to physical users.

The solution:

The first thing I did was to check the existence of the tempfile on the filesystem:

[oracle@sssss ~]$ ls -lrt /u03/oradata/xxx/tempapp22.dbf
-rw-r--r-- 1 oracle oinstall 26843553792 Oct 2 11:26 /u03/oradata/xxx/tempapp22.dbf

ok, this is a step forward as the file exists on the server. What did not look ok was the date of the file (October 2nd) and today is October 5th… maybe there is not much activity on the database?

Then I did have a look on the database:

SQL> select file_name from dba_temp_files;
ERROR:
ORA-01187: cannot read from file because it failed verification tests
ORA-01110: data file 205: '/u03/oradata/xxx/tempapp11.dbf'

This did not look good, specially because the tempfile being returned was a different one from the application. After some trial and error I decided the following strategy:

1.- Drop the tempfile from the database:
SQL> alter database tempfile '/u03/oradata/xxx/tempapp11.dbf' drop;

2.- Create a new tempfile for the temporary tablespace:
alter tablespace TEMPAPP1 add tempfile ‘/u03/oradata/xxx/tempapp11.dbf’ SIZE 25600M REUSE;

I did have to drop all tempfiles (from TEMPAPP and TEMPUSER) as all of them were giving problems when selecting from dba_temp_files

Once I did drop all tempfiles and created them again, the query did return the expected rows:

SQL> select file_name from dbA_temp_files;
FILE_NAME
--------------------------------------------------------------------------------
/u03/oradata/xxx/tempapp12.dbf
/u03/oradata/xxx/tempapp21.dbf
/u03/oradata/xxx/tempapp11.dbf
/u03/oradata/xxx/tempapp22.dbf
/u03/oradata/xxx/tempusers11.dbf
/u03/oradata/xxx/tempusers21.dbf

As always, comments are welcome.

Upgrade Oracle to 11.2 including Standby and DG Broker 1 July 2010

Posted by David Alejo Marcos in Dataguard Broker, Oracle 11.2, Standby.
Tags: , , ,
2 comments

This is a step by step guide of how I upgraded several databases from 11.1 to 11.2.0.1.

The problem:

We have a primary database and two standby managed by DG Broker. On this document, I will also include the steps to upgrade RDBMS DST (Daylight Savings Time) to 11g.

Standby databases are open in read only, the importance of this fact will be unveiled at the end of the article.

The server is running Linux 86-64 and the Oracle version was 11.1.0.7.

The solution:

The steps I followed were:

1.- Download and install Oracle following software:

Oracle 11.2:
linux.x64_11gR2_database_1of2.zip
linux.x64_11gR2_database_2of2.zip

Opatch:
p6880880_112000_Linux-x86-64.zip

PSU1:
p9352237_112010_Linux-x86-64.zip

2.- Check the previous backup run without problems or perform a full backup.

3.- Disable DG Broker:
connect to the primary box and execute:

Dgmgrl
Connect /
Disable fast_start failover
Disable configuration

4.- Connect to primary database and execute:
Sqlplus ‘/as sysdba’
Alter system set dg_broker_start=false scope=both;

5.- backup broker configuration on all 3 servers (primary and standby):
Sqlplus ‘/as sysdba’
show parameter dg_broker_config
copy those files (cp command) indicated on DG_BROKER_CONFIG_FILE1 and DG_BROKER_CONFIG_FILE2.

We are now in a position to start the upgrade.

6.- Check all Oracle components are VALID on the primary database:

SET pagesize500
SET linesize 100

SELECT SUBSTR (comp_name, 1, 40) comp_name,
status,
SUBSTR (version, 1, 10) version
FROM dba_registry
ORDER BY comp_name;

SELECT SUBSTR (object_name, 1, 40) object_name,
SUBSTR (owner, 1, 15) owner,
object_type
FROM dba_objects
WHERE status='INVALID'
ORDER BY owner, object_type;

SELECT owner,
object_type,
COUNT (*)
FROM dba_objects
WHERE status='INVALID'
GROUP BY owner, object_type
ORDER BY owner, object_type;

if the queries above reports invalid objects, we can recompile them by executing:
sqlplus ‘/as sysdba’
@?/rdbms/admin/utlrp.sql

7.- Check we do not have duplicates on SYS or SYSTEM:

SELECT object_name, object_type
FROM dba_objects
WHERE object_name ||object_type IN
(SELECT object_name||object_type FROM dba_objects WHERE owner = 'SYS')
AND owner = 'SYSTEM'

If the query returns the following, it is fine:

OBJECT_NAME OBJECT_TYPE
------------------------------ -------------------
AQ$_SCHEDULES TABLE
AQ$_SCHEDULES_PRIMARY INDEX
DBMS_REPCAT_AUTH PACKAGE
DBMS_REPCAT_AUTH PACKAGE BODY

if there are other duplications, we can follow NOTE.1030426.6 HOW TO CLEAN UP DUPLICATE OBJECTS OWNED BY SYS AND SYSTEM

8.- pre-upgrade checks:
copy
$ORACLE_HOME/rdbms/admin/utlu112i.sql
$ORACLE_HOME/rdbms/admin/utltzuv2.sql

to /tmp (ORACLE_HOME should be set to your 11.2 ORACLE_HOME).

execute:
cd /tmp
sqlplus '/ as sysdba'
SQL> spool upgrade_info.log
SQL> @utlu112i.sql
SQL> spool off

and check the output.
Note 1.- We kept one of the standby databases down during the upgrade and the testing phase as rollback mechanism.

9.- execute bdupgdiag.sql
Note 556610.1 Script to Collect DB Upgrade/Migrate Diagnostic Information (dbupgdiag.sql)

10.- Check all snapshot refreshes finished:
SELECT DISTINCT(TRUNC(last_refresh))
FROM dba_snapshot_refresh_times;

11.- Check there are not files that need media recovery:
SELECT * FROM v$recover_file;

12.- Check there are not files on backup mode:
SELECT * FROM v$backup WHERE status != 'NOT ACTIVE';

13.- Resolve outstanding distributed transactions prior to the upgrade.
SQL> select * from dba_2pc_pending;

If this returns rows you should do the following:

SQL> SELECT local_tran_id
FROM dba_2pc_pending;
SQL> EXECUTE dbms_transaction.purge_lost_db_entry('');
SQL> COMMIT;

14.- Check standby databases are in sync

15.- Disable batch and crontab jobs.

16.- Check default tablespace for users SYS and SYSTEM:
SQL> SELECT username, default_tablespace
FROM dba_users
WHERE username in ('SYS','SYSTEM');

If DEFAULT_TABLESPACE is anything other than SYSTEM tablespace, modify the default tablespace to SYSTEM by using the below command.

SQL> ALTER user SYS default tablespace SYSTEM;
SQL> ALTER user SYSTEM default tablespace SYSTEM;

17.- Check the aud$ table exists and that it is in the sys schema and in the system tablespace.
SQL> SELECT owner,tablespace_name
FROM dba_tables
WHERE table_name='AUD$';

18.- Stop listener on all three databases:
$ lsnrctl stop

19.- Stop dbconsole, agent and/or isqlplus:
$ emctl stop dbconsole
$ isqlplusctl stop
$ emctl stop agent

20.- Shutdown all three databases:
Primary:
$ sqlplus "/as sysdba"
SQL> shutdown immediate;

Standby:
(Althought it can be done using dgmgrl, for simplicity I have posted the “old style”):
$ sqlplus "/as sysdba"
SQL> alter database recover managed standby database cancel;
SQL> shutdown immediate;

21.- Create a new entry on /etc/oratab on all three servers:
Create entry on /etc/oratab:
db_name:/opt/oracle/product/11.2.0/dbhome_1:N
Comment out old entry:
#db_name:/opt/oracle/product/11.1.0/db_1:N

Execute:
. oratab
db_name

And check ORACLE_HOME and ORACLE_SID

21.1.- Copy Spfile (or init.ora file), orapw file, listener.ora, tnsnames.ora and sqlnet.ora from old ORACLE_HOME to new ORACLE_HOME.

21.2.- Modify ORACLE_HOME on the listener.ora under the SID_LIST_LISTENER to reflect the new ORACLE_HOME.

22.- Get ready to execute the upgrade:
On primary:
$ cd $ORACLE_HOME/rdbms/admin
$ lsnrctl start
$ sqlplus '/ as sysdba'
SQL> startup UPGRADE

On standby:
$ cd $ORACLE_HOME/rdbms/admin
$ lsnrctl start
$ sqlplus '/ as sysdba'
SQL> startup mount
SQL> alter database recover managed standby database disconnect;

23.- Execute the upgrade:
On primary:
$ sqlplus '/ as sysdba'
SQL> set echo on
SQL> SPOOL upgrade.log
SQL> @catupgrd.sql
SQL> spool off

check alert.log on standby to see archivelogs (or redo logs) are being applied.
Below is not true as broker is disabled. Put DB in recovery:
Alter database recover managed standby database disconnect;
Will give errors as listener is down. My decision is not to start listener until upgrade is done.

24.- Check logs:
Once the script has finished, check the log file to verify all is ok and run the Post-Upgrade Status tool

On primary:
$ sqlplus '/as sysdba'
SQL> STARTUP
SQL> @utlu112s.sql
SQL> @catuppst.sql
SQL> @utlrp.sql

25.- Change compatible parameter on primary and standby:
On primary:
$ sqlplus '/as sysdba'
SQL> alter system set compatible='11.2.0' scope=spfile;
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP

On Standby:
$ sqlplus "/as sysdba"
SQL> alter system set compatible='11.2.0' scope=spfile;
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT
SQL> alter database recover managed standby database cancel;

Now we have an upgraded database. The next steps are to upgrade the RDBMS DST:

26.- Check current RDBMS DST version and “DST UPGRADE STATUS”.
$ sqlplus '/as sysdba'
SQL> SELECT PROPERTY_NAME, SUBSTR (property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;

DST_PRIMARY_TT_VERSION should match the value found when selecting

$ sqlplus '/as sysdba'
SQL> SELECT version FROM v$timezone_file;

27.- Check if there is any data that cannot be upgraded:
$ sqlplus '/as sysdba'
SQL> exec DBMS_DST.BEGIN_PREPARE(11);

28.- Check for prepare status
$ sqlplus '/as sysdba'
SQL> SELECT PROPERTY_NAME, SUBSTR (property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;

The output should be:
— PROPERTY_NAME VALUE
— —————————— ——————————
— DST_PRIMARY_TT_VERSION
— DST_SECONDARY_TT_VERSION
— DST_UPGRADE_STATE PREPARE

29.- Truncate logging tables if they exist.
$ sqlplus '/as sysdba'
SQL> TRUNCATE TABLE SYS.DST$TRIGGER_TABLE;
SQL> TRUNCATE TABLE sys.dst$affected_tables;
SQL> TRUNCATE TABLE sys.dst$error_table;

30.- Log affected data
$ sqlplus '/as sysdba'
SQL> BEGIN
DBMS_DST.FIND_AFFECTED_TABLES
(affected_tables => 'sys.dst$affected_tables',
log_errors => TRUE,
log_errors_table => 'sys.dst$error_table');
END;
/

31.- Check what tables have affected data that cannot be resolved automatically.
if this gives no rows then there is no problem at all

$ sqlplus '/as sysdba'
SQL> SELECT * FROM sys.dst$affected_tables;

IF previous select gives rows then you can see what kind of problem there are in those rows:

SELECT * FROM sys.dst$error_table;

32.- Check for different errors:
All “error_on_overlap_time” rows
SELECT * FROM sys.dst$error_table where ERROR_NUMBER= '1883';

All “error_on_nonexisting_time” rows
SELECT * FROM sys.dst$error_table where ERROR_NUMBER= '1878';

Check for all other possible problems
SELECT * FROM sys.dst$error_table where ERROR_NUMBER not in ('1878','1883');

33.- End prepare window, the rows above will stay in those tables.

EXEC DBMS_DST.END_PREPARE;

Check if this is ended

SELECT PROPERTY_NAME, SUBSTR(property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;

The output should be:
— PROPERTY_NAME VALUE
— —————————- ——————————
— DST_PRIMARY_TT_VERSION
— DST_SECONDARY_TT_VERSION 0
— DST_UPGRADE_STATE NONE

34.- Startup the database for the upgrade:
$sqlplus '/as sysdba'
SQL> shutdown immediate;
SQL> startup upgrade;

Purge the Recycle Bin:
purge dba_recyclebin;

Clean used tables:

TRUNCATE TABLE SYS.DST$TRIGGER_TABLE;
TRUNCATE TABLE sys.dst$affected_tables;
TRUNCATE TABLE sys.dst$error_table;

Start upgrade window
SQL> EXEC DBMS_DST.BEGIN_UPGRADE(11);

Check the status:
SELECT PROPERTY_NAME, SUBSTR(property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;

The output should be similar to:
— PROPERTY_NAME VALUE
— ————————— ——————————
— DST_PRIMARY_TT_VERSION
— DST_SECONDARY_TT_VERSION
— DST_UPGRADE_STATE UPGRADE

Check what tables need to updated using

SELECT OWNER, TABLE_NAME, UPGRADE_IN_PROGRESS
FROM ALL_TSTZ_TABLES
WHERE UPGRADE_IN_PROGRESS='YES';

Restart the database
shutdown immediate
startup

36.- Upgrade the tables who need action
set serveroutput on
VAR numfail number
BEGIN
DBMS_DST.UPGRADE_DATABASE(:numfail,
parallel => TRUE,
log_errors => TRUE,
log_errors_table => 'SYS.DST$ERROR_TABLE',
log_triggers_table => 'SYS.DST$TRIGGER_TABLE',
error_on_overlap_time => FALSE,
error_on_nonexisting_time => FALSE);
DBMS_OUTPUT.PUT_LINE('Failures:'|| :numfail);
END;
/

If there where no failures then end the upgrade.

VAR fail number
BEGIN
DBMS_DST.END_UPGRADE(:fail);
DBMS_OUTPUT.PUT_LINE('Failures:'|| :fail);
END;
/

37.- Last checks
SELECT PROPERTY_NAME, SUBSTR(property_value, 1, 30) value
FROM DATABASE_PROPERTIES
WHERE PROPERTY_NAME LIKE 'DST_%'
ORDER BY PROPERTY_NAME;

The output should be something like:
— PROPERTY_NAME VALUE
— —————————- ——————————
— DST_PRIMARY_TT_VERSION
— DST_SECONDARY_TT_VERSION 0
— DST_UPGRADE_STATE NONE

SELECT * FROM v$timezone_file;

— needed output:
— FILENAME VERSION
— —————— ———-
— timezlrg_.dat

The last step is to configure broker:

38.- Connect to the primary and standby databases and execute:
Sqlplus '/as sysdba'
Alter system set dg_broker_start=true scope=both;

39.- Connect to the DG Broker and finish the configuration:
dgmgrl
Connect /
CREATE CONFIGURATION 'db_name_DG' AS PRIMARY DATABASE IS 'db_name' CONNECT IDENTIFIER IS db_name;
onserveroutputSQL> set

add standby databases:
ADD DATABASE 'stdby_name1' AS CONNECT IDENTIFIER IS stdby_name1 MAINTAINED AS
PHYSICAL;

ADD DATABASE 'stdby_name2' AS CONNECT IDENTIFIER IS stdby_name2 MAINTAINED AS
PHYSICAL;

ENABLE CONFIGURATION;

Connect to the standby database and re-start it:
sqlplus '/as sysdba'
SQL> shutdown immediate;
SQL> startup mount;

I will strongly recommend performing a full backup after the upgrade.

I kept the standby databases in mount state for some time (24 hours) as I found a problem with having the database in read only.

The error I was having on the primary database (alert log) was:

————————————————————-
Current log# 5 seq# 16477 mem# 0: /u04/oradata/DB_NAME/redo05.log
Thu Jul 01 13:12:22 2010

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xx.xx.xxx)(PORT=nnn))
WARNING: inbound connection timed out (ORA-3136)

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xx.xx.xxx)(PORT=nnnn))
WARNING: inbound connection timed out (ORA-3136)
Tue Jun 29 13:08:03 2010
Thread 1 advanced to log sequence 16053 (LGWR switch)
Current log# 1 seq# 16053 mem# 0: /u04/oradata/DB_NAME/redo01.log
Tue Jun 29 13:08:03 2010
***********************************************************************
Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.2.0.1.0 – Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.1.0 – Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.1.0 – Production
Time: 29-JUN-2010 13:08:03
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xx.xx.xx)(PORT=nnn))
WARNING: inbound connection timed out (ORA-3136)
————————————————————-

40.- Relink libobk.so if you are using NetBackup:

ln -sf /usr/openv/netbackup/bin/libobk.so64 /opt/oracle/product/11.2.0/dbhome_2/lib/libobk.so

to avoid:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of allocate command on c01 channel at 01/15/2011 14:16:34
ORA-19554: error allocating device, device type: SBT_TAPE, device name:
ORA-27211: Failed to load Media Management Library
Additional information: 2

There is also the consideration of changing ORACLE_HOME on your backup scripts if you are using catalog, etc…

As always, comments are welcome.

ORA-01017: invalid username/password; logon denied after upgrade 11.2 30 June 2010

Posted by David Alejo Marcos in Oracle 11.2, Standby.
Tags: ,
4 comments

Over the last several weeks, I have been upgrading several database to 11.2 with and without broker.

I will write a step-by-step guide later and some of the problems I found.

The interesting thing about my last upgrade was an error (not surprise here, if all went smooth I will not have anything to write about) with the standby database.

The problem:

The error was quite self-explanatory:

ORACLE Instance ddddd - Archival Error. Archiver continuing.
Errors in file /opt/oracle/diag/rdbms/ssssss/dddddd/trace/ddddd_arc0_6420.trc:
ORA-01017: invalid username/password; logon denied
Error 1017 received logging on to the standby
------------------------------------------------------------
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
------------------------------------------------------------
Errors in file /opt/oracle/diag/rdbms/ssssss/dddddd/trace/ddddd_arc0_6420.trc:
ORA-16191: Primary log shipping client not logged on standby
FAL[server, ARC0]: Error 16191 creating remote archivelog file 'delta'
FAL[server, ARC0]: FAL archive failed, see trace file.
Errors in file /opt/oracle/diag/rdbms/ssssss/dddddd/trace/ddddd_arc0_6420.trc:
ORA-16055: FAL request rejected
ARCH: FAL archive failed. Archiver continuing

and the contents of /opt/oracle/diag/rdbms/ssssss/dddddd/trace/ddddd_arc0_6420.trc are:

ORA-01017: invalid username/password; logon denied
OCI_DBVER attribute value retrieval failed error=1017
*** 2010-06-26 08:51:58.413 4132 krsh.c
Error 1017 received logging on to the standby
------------------------------------------------------------
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
------------------------------------------------------------
*** 2010-06-26 08:51:58.413 869 krsu.c

The solution:

The first thing I did was to check the value of “remote_log_passwordfile” parameter. The value was correct (exclusive).

The second step was trying to connect as sys from one box the other and it was also fine.

What I did next was to recreate the orapwd files on primary and standby, but with two extra options:

orapwd file=orapwXXXXX password=YYYYYY entries=NN ignorecase=y force=y

With the ignorecase, we are telling oracle to revert to the previous behavior (all capitals). Force will allow to overwrite the current passwordfile if it exists.

After stopping and starting the primary and standby databases, all started to work as normal.

I think the problem was related to some capitals on the passwordfile when it was first created, but this is just a guess.

As always, comments are welcome.

change archivelog destination using Dataguard broker 20 May 2010

Posted by David Alejo Marcos in Dataguard Broker, Grid Control, Standby.
Tags: ,
add a comment

We have recently taken over some systems from a sister company and I have been working on them.

One of those systems is using dataguard with dataguard broker. It is a very interesting configuration as the primary and one of the standby database are on the same country while the second standby is in a different country.

I am used to work without the broker, so simple task as disable recovering manager or the archivelog destination because something like a “mini challege”.

In this post I will explain who to change the archivelog destination on primary and standby databases using DGMGRL.

The problem:

We have tree standby databases, Ellison (primary), Larry (physical standby) and Veltman (physical standby). We want to change the archivelog destination from /u03/oradata/arch to /u07/oradata/arch on all three environments.

Databases are being monitored using DataGuard Broker.

The solution:

if we decide to change the property StandbyArchiveLocation on the broker using DGMGRL. If  the value of the string on StandbyArchiveLocation is different from log_archive_dest_1, DGMgRL will update the value of log_archive_dest_2.

With the above in mind, we execute the following using SQLPLUS on all three databases:
SQL> ALTER SYSTEM SET log_archive_dest_1='location="/u07/oradata/arch", valid_for=(ONLINE_LOGFILE,ALL_ROLES)' scope=both sid='*';

Once we have changed the parameter on all three database, we are ready to proceed with the next step.

Now we have to change the property on DGMGRL. We connect to the server running the primary database and execute:
[oracle@ellison ~]$ dgmgrl
DGMGRL for Linux: Version 11.1.0.7.0 - 64bit Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect /
Connected.
DGMGRL> show database larry standbyarchivelocation
StandbyArchiveLocation = '/u03/oradata/arch'
DGMGRL> show database ellison standbyarchivelocation
StandbyArchiveLocation = '/u03/oradata/arch'

Now that we have checked the current value, we execute the command to ammend it:

DGMGRL> edit database larry set property StandbyArchiveLocation = '/u07/oradata/arch';
DGMGRL> edit database ellison set property StandbyArchiveLocation = '/u07/oradata/arch';

And we check again to verify it:

DGMGRL> show database larry standbyarchivelocation
StandbyArchiveLocation = '/u07/oradata/arch'
DGMGRL> show database ellison standbyarchivelocation
StandbyArchiveLocation = '/u07/oradata/arch'

It is recommended to disable and enable the configuration to enforce the new parameters:
DGMGRL> disable configuration;
Disabled.
DGMGRL> enable configuration;
Enabled.

and to check alert.log files on all 3 servers.

I normally execute a couple of “alter system switch logfile;” on primary and check that the archivelogs are being applied.

Note:

Our standby databases are running in open read only mode. These changes did to work until i disabled the managed recovery process, performed a shutdown immediate and opened them for read only again. Not sure if there is a bug, it is recommended to stop the standby databases or I was just unlucky.

As always, comments are welcome.

Creating a 3-node standby database. Part III 29 April 2010

Posted by David Alejo Marcos in ASM, RAC, Standby.
Tags: , ,
2 comments

Time has arrived for us to test the results of our hard work by performing a switch between our single instance database to our three-node database.

I always recommend to start any test with a full backup if the space and time to do it does not jeopardize the deadlines.

Below is a list of some pre-checks worth doing:

1.- Check the full backup finished without errors.

2.- Make sure the init.ora file is the same on all 3 nodes (or pointing to the same spfile).

3.- The password file is correct (or recreate it) on all 3 nodes.

4.- check tnsnames.ora and listener.ora files are correct and the same on all 3 nodes.

5.- services are correct and running.

6.- peform a tnsping to and fro everywere to test connectivity and latency.

Ok, so we are ready for the switchover…., let’s do it:

1.- check the status of primary and standby databases:

SELECT switchover_status, database_role, open_mode FROM v$database;

2.- check the standby database is in synch ( I normally perfomr couple of switch logs and see that are being applied).

alter system switch logfile;
alter system switch logfile;

then you can check alert.log,  v$log_history, v$archived_log, etc…

3.- on Primary:

Disable job queue before the switchover on Primary Database (only in memory):

alter system set job_queue_processes=0;

4.- On primary:

Check for any active sessions appart from your own:

SELECT SID, PROCESS, PROGRAM
FROM GV$SESSION
WHERE TYPE = 'USER'
AND SID <> (SELECT DISTINCT SID FROM GV$MYSTAT);

5.- On primary

Perform switchover:

alter database commit to switchover to physical standby with session shutdown;
shutdown immediate;
startup mount;

On the alert.log of the new primary database you should see something like:

Sat Apr 24 12:13:17 2010
Media Recovery Log +EVB_RECOVERY/bvmprodsh/archivelog/2010_04_24/thread_1_seq_28879.314.717163985
Identified End-Of-Redo for thread 1 sequence 28879
Sat Apr 24 12:13:17 2010
Media Recovery End-Of-Redo indicator encountered
Sat Apr 24 12:13:17 2010
Media Recovery Applied until change 202225032180
Sat Apr 24 12:13:17 2010
MRP0: Media Recovery Complete: End-Of-REDO (bvmprod3)
Resetting standby activation ID 3970545885 (0xeca9b8dd)

6.- On all standby databases, check the status:

SELECT switchover_status, database_role, open_mode FROM v$database;

7.- Complete the switchover on the NEW Primary Database:

alter database commit to switchover to primary with session shutdown;
srvctl stop database -d bvmprodsh
srvctl start database -d bvmprodsh -i bvmprod1

8.- Enable block Change Tracking if you are using incremental backups is recommended:

alter database enable block change tracking using '+EVB_RECOVERY';

9.- On all standby databases (including the old production database), enable recovery:

alter database recover managed standby database disconnect;

10.- Perform several log switch and verify are being shipped and applied on standby databases:

alter system switch logfile;
alter system switch logfile;

Now you are on a cluster database… but there is some work to be done:

1.- Create undo tablespaces for all instances (this only has to be done the first time you open the RAC database):

create undo tablespace undotbs2 datafile size 5g;
create undo tablespace undotbs3 datafile size 5g;

alter system set undo_tablespace=’undotbs1′ scope=spfile sid=’bvmprod1′;
alter system set undo_tablespace=’undotbs2′ scope=spfile sid=’bvmprod2′;

alter system set undo_tablespace=’undotbs3′ scope=spfile sid=’bvmprod3′;

2.- Add logfiles to threads 2 and 3:

ALTER DATABASE ADD LOGFILE THREAD 2
GROUP 21 '+EVB_REDO1' SIZE 250M,
GROUP 22 '+EVB_REDO1' SIZE 250M,
GROUP 23 '+EVB_REDO1' SIZE 250M,
GROUP 24 '+EVB_REDO1' SIZE 250M,
GROUP 25 '+EVB_REDO1' SIZE 250M;

ALTER DATABASE ADD LOGFILE THREAD 3
GROUP 31 ‘+EVB_REDO1’ SIZE 250M,
GROUP 32 ‘+EVB_REDO1’ SIZE 250M,
GROUP 33 ‘+EVB_REDO1’ SIZE 250M,
GROUP 34 ‘+EVB_REDO1’ SIZE 250M,
GROUP 35 ‘+EVB_REDO1′ SIZE 250M;

alter system set thread=1 scope=spfile sid=’bvmprod1′;
alter system set thread=2 scope=spfile sid=’bvmprod2′;
alter system set thread=3 scope=spfile sid=’bvmprod3’;

3.- Define local_listeners (those entries must exists on the listener.ora on all 3 nodes):

alter system set local_listener='LISTENER_BVMPROD1' scope=spfile sid='bvmprod1';
alter system set local_listener='LISTENER_BVMPROD2' scope=spfile sid='bvmprod2';
alter system set local_listener='LISTENER_BVMPROD3' scope=spfile sid='bvmprod3';

4.- Defile remote_listener (entry must exists on the listener.ora on all 3 nodes):

alter system set remote_listener='listeners_bvmprod' scope=spfile sid='*';

5.- Some RAC parameters…

alter system set instance_number=1 scope=spfile sid='bvmprod1';
alter system set instance_number=2 scope=spfile sid='bvmprod2';
alter system set instance_number=3 scope=spfile sid='bvmprod3';

alter system set cluster_database=true scope=spfile sid=’*’;
alter system set cluster_database_instances=3 scope=spfile sid=’bvmprod3′;

alter database enable instance ‘bvmprod2’;
alter database enable instance ‘bvmprod3’;

alter database enable public thread 2;
alter database enable public thread 3;

6.- Stop the instance and start it as RAC:

srvctl stop database -d bvmprodsh -i bvmprod1
srvctl start database -d bvmprodsh
srvctl start service -d bvmprodsh
srvctl status database -d bvmprodsh
srvctl status service -d bvmprodsh

Thas it pretty much all, the only bit left is to test connectivity and keep monitoring archivelogs (or redo logs) are being transferred to all standby databases and applied.

As always, comments are welcome.

Creating a 3-node standby database. Part II 15 March 2010

Posted by David Alejo Marcos in ASM, RAC, RMAN, Standby.
Tags: , ,
add a comment

On Part I we installed the software and restored the database on a single node or a 3-node RAC Clusterware.

I will explain how I did configure the database on that  CRS and kept it in sync with the current production database. We have to bear in mind that the objective of this exercise is to move from the current production (single node) to an existing 3-node RAC using ASM.

Now we have a database that is an exact copy of our production database. We need to configure it as standby so we can keep it up to date until the migration.

1.- The first step is to create a copy of the current controlfile as standby:

$ rman target /
RMAN> backup current controlfile for standby format '/tmp/stdbyctl.bkp';

Once we have the controlfile for standby we copy it to the destination box using scp.

2.- We stop the database with shutdown immediate. It is important to check that LOG_FILE_NAME_CONVERT is defined using the correct directories to avoid problems in the following steps.

3.- Start the standby database as nomount.

4.- Restore the controlfile for standby:

$ rman nocatalog target /
RMAN> restore standby controlfile from '/tmp/stdbyctl.bkp';

5.- Mount standby database.

RMAN> alter database mount;

6.- Catalog datafiles of the standby database:

the following command will list all files and ask if you want them to be catalog. It is important to review the list and say YES if the list is correct. In our ASM configuration, the path for this database was ‘+EVB_DATA/BVMPRODSH/DATAFILE/’

RMAN> catalog start with '+EVB_DATA/BVMPRODSH/DATAFILE/';

7.- Once the command has finished, we have to “commit” those changes on the controlfile:

RMAN> switch database to copy;
RMAN> EXIT;

8.- If you were using Flashback, it is highly recommended to re-enable it again to avoid problems:

$ sqlplus / as sysdba
SQL> alter database flashback off;

Database altered.

SQL> alter database flashback on;

9.- Now we have to perform some housekeeping… Starting with the redo logs:

9.1.- Find out how many groups we have on the database and clear them:

SQL> select group# from v$log;

GROUP#
———-
1
2
3

SQL> alter database clear logfile group 1;

Database altered.

SQL> alter database clear logfile group 2;

Database altered.

SQL> alter database clear logfile group 3;

Database altered.

9.2.- Now we have to do the same for the standby redo logs:

SQL> select group# from v$standby_log;

GROUP#
———-
4
5
6

SQL> alter database clear logfile group 4;

Database altered.

SQL> alter database clear logfile group 5;

Database altered.

SQL> alter database clear logfile group 6;

Database altered.

OR

9.3.- Create standby redo logs if there are no present:

SQL> select group# from v$standby_log;

no row selected

SQL> alter database add standby logfile group 4 size 250m;

Database altered.

SQL> alter database add standby logfile group 5 size 250m;

Database altered.

SQL> alter database add standby logfile group 6 size 250m;

Database altered.

9.4.- Create spfile in ASM.

This step is not a must, but it is a good idea to use spfiles due to the flexibility that provides:

SQL> create spfile='+EVB_DATA' from pfile;

SQL> shutdown immediate

Edit init.ora file and paste the location of the spfile. This can be found using asmcmd under the diskgroup +EVB_DATA/BVMPRODSH/parameterfile

$ vi $ORACLE_HOME/dbms/init.ora
spfile='+EVB_DATA/BVMPRODSH/parameterfile/spfile.309.712071991'

SQL> startup mount;

10.- Configure the standby parameters, tnsnames.ora,…

10.1.- create entries on the TNSNames file to point to the current production on the standby server. Check you can connect as sys and you can tnsping to the standby server.

10.2.- create entries on the TNSNames file to point to the standby server on the current production server. Check you can connect as sys and you can tnsping to the production server.

10.3.- modify log_archive_config to have the primary database and ALL standby databases on all databases:

SQL> alter system set log_archive_config='DG_CONFIG=(<primary>,<standby1>,<standby2>...)' scope=both sid = '*';

10.4.- configure log_archive_dest_x to point to the new standby database on the primary database. Check log_archive_dest_state_x is enabled:

SQL> alter system set log_archive_dest_3='SERVICE= LGWR ASYNC VALID_FOR=(ALL_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=' scope=both sid = '*';

10.5.- configure log_archive_dest_x to point to the production database on the new standby database. Check log_archive_dest_state_x is enabled:

SQL> alter system set log_archive_dest_3='SERVICE=<db_unique_name> LGWR ASYNC VALID_FOR=(ALL_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=<db_unique_name>' scope=both sid = '*';

10.6.- Configure FAL_CLIENT and FAL_SERVER for gap resolution for primary and standby databases:

SQL> alter system set fal_client=<db_unique_name> scope=both sid='*';
SQL> alter system set fal_server=<db_unique_name for primary and standby databases> scope=both sid='*';

11.- We are good to start the recovery process on this standby database:

SQL> alter database recover managed standby database disconnect from session;

Database altered.

SQL> exit;

12.- Check the primary and standby databases are working properly by switching couple of logfiles and checking are being transferred and applied on all standby databases:

On primary database:

SQL> alter system switch logfile;

Now we have to register the database on the CRS…

1.- Add database
srvctl add database -d bvmprodsh -o -n evbprod

2.- Add instances
srvctl add instance -d bvmprodsh -i bvmprod1 -n
srvctl add instance -d bvmprodsh -i bvmprod2 -n
srvctl add instance -d bvmprodsh -i bvmprod3 -n

3.- Modify instance to add ASM
srvctl modify instance -d bvmprodsh -i bvmprod1 -s +ASM1
srvctl modify instance -d bvmprodsh -i bvmprod2 -s +ASM2
srvctl modify instance -d bvmprodsh -i bvmprod3 -s +ASM3

4.- Create and start services:
$ srvctl add service -d bvmprodsh -s evbprod -r <primary_node1>,<primary_node2>,... -a <available_node1><available_node2>,...
$ srvctl start service -d bvmprodsh -s evbprod
$ srvctl status service -d bvmprodsh

On part III, I will explain how we tested the configuration by doing a switchover from the current production single instance database to the 3-node RAC standby database.

As always, comments are welcome.