Got an ODG patched recently and after that, Dataguard was out of sync.
DGMGRL> show database verbose "PROD2"
Database Status:
DGM-17016: failed to retrieve status for database "PROD2"
ORA-01033: ORACLE initialization or shutdown in progress
ORA-16625: cannot reach database "PROD2"
PROD2 is the physicla standby. Normally in a Dataguard envrionment, it means something wrong with the password file. When looking at the password files on both primary and standby, somehow the password file on the primary got updated during the upgrade, alghouth I’ve confirmed the password for sys hadn’t changed on both sides.
--- Standby
[oracle@joestdb dbs]$ ls -lrt /opt/oracle/121020/dbs/orapwPROD
-rw-r----- 1 oracle oinstall 7680 Jan 15 2021 /opt/oracle/121020/dbs/orapwPROD
[oracle@joestdb dbs]$ md5sum /opt/oracle/121020/dbs/orapwPROD
a7f32dd2ba8c421bb980a754890e3ea3 /opt/oracle/121020/dbs/orapwPROD
--- Primary
[oracle@joepridb admin]$ ls -lrt /opt/oracle/121020/dbs/orapwPROD
-rw-r----- 1 oracle oinstall 7680 Sep 1 11:47 /opt/oracle/121020/dbs/orapwPROD
[oracle@joepridb admin]$ md5sum /opt/oracle/121020/dbs/orapwPROD
2afe514bf1b1942cb78c7f8291d74bea /opt/oracle/121020/dbs/orapwPROD
Once the root cause was known, it’s quite easy to fix it. Just copy the new password file from the primary server to the standby server to overwrite the old one, the issue was resolved.
I’ve seen in a RAC environment, this issue happens because the password file on the primary cluster databases is actually stored in ASM and when copied to the standby server, it is copied from the $ORACLE_HOME/dbs of the primary instead.
If there is a need to change the password of SYS in a dataguard envrionment with the physical standby, the procedure is
- Disable log shipping from the primary server to the standby server
- Change the SYS password on the primary server
- Copy the primary server password file to the standby server
- Enable log shipping between the two servers
To disable log shipping using Broker:
EDIT DATABASE 'PROD1' SET STATE='TRANSPORT-OFF';
Here PROD1 is the primary database. To enable it
EDIT DATABASE 'PROD1' SET STATE='TRANSPORT-ON';
Database Role | State in Broker |
---|---|
Primary | TRANSPORT-ON |
Primary | TRANSPORT-OFF |
Physical (Logical) Standby | APPLY-ON |
Physical (Logical) Standby | APPLY-OFF |
To stop redo apply on a standby PROD2:
EDIT DATABASE 'PROD2' SET STATE='APPLY-OFF';
To enable redo apply on a standby PROD2:
EDIT DATABASE 'PROD2' SET STATE='APPLY-ON';
Note, when using SQL command to disable shipping or redo apply, state in Broker won’t reflect that, instead you will see some ORA errors.
For example, disable log shipping with SQL (given log_archive_dest_1 is for the archive destination of physical standby PROD2) , Intended State within Broker still says TRANSPORT-ON, but shows ORA-16738
ALTER SYSTEM SET log_archive_dest_state_1='RESET' SCOPE=BOTH;
DGMGRL> show database "PROD1";
Database - PROD1
Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
PROD
Error: ORA-16738: redo transport service for database "PROD2" is not running
Database Status:
ERROR
Disable redo apply with SQL, Intended State still says APPLY-ON, but shows ORA-16766
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
DGMGRL> show database "PROD2";
Database - PROD2
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 7 seconds (computed 0 seconds ago)
Average Apply Rate: 1.00 KByte/s
Real Time Query: OFF
Instance(s):
PROD
Database Error(s):
ORA-16766: Redo Apply is stopped
Database Status:
ERROR
To enable log shipping & redo apply with SQL:
ALTER SYSTEM SET log_archive_dest_state_1='ENABLE' SCOPE=BOTH;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT NODELAY;