In my last post, I resolved an ODG issue that the primary server IP was wrong on the standby server. I walked away and one hour later, I was told the redo apply was still not working. “Something else is wrong?” I was thinking.
So I looked at the system again. I saw the error ORA-38760 in the alert log
2020-06-12T15:48:29.542757-04:00
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT NODELAY
Starting background process MRP0
2020-06-12T15:48:29.574028-04:00
MRP0 started with pid=46, OS id=6800
2020-06-12T15:48:34.605048-04:00
Started logmerger process
2020-06-12T15:48:34.620654-04:00
PR00 (PID:2948): MRP0: Background Media Recovery terminated with error 38760
2020-06-12T15:48:34.620654-04:00
Errors in file C:\PROD\DATA\LOGS\ORACLE\DIAG\diag\rdbms\PROD2\PROD\trace\PROD_pr00_2948.trc:
ORA-38760: This database instance failed to turn on flashback database
2020-06-12T15:48:35.729969-04:00
Recovery Slave PR00 previously exited with exception 38760
2020-06-12T15:48:35.823714-04:00
Errors in file C:\PROD\DATA\LOGS\ORACLE\DIAG\diag\rdbms\PROD2\PROD\trace\PROD_mrp0_6800.trc:
ORA-38760: This database instance failed to turn on flashback
The trace files didn’t contain too much information other than the ORA-38760 error itself. There were messages like below because OS is Windows:
Required IPC RDMAV_FORK_SAFE environment not set
Required IPC RDMAV_HUGEPAGES_SAFE environment not set
According to MOS note: 19c Database On Windows – ADR Is Flooded By Linux-related Trace Files (Doc ID 2588705.1), “This is not an error or warning in particular.”
The standby database has flashback_on set to yes. It looked like Oracle tried to turn it on after the media recovery started, but failed to do so for some reasons. Then recovery service terminated. So I decided to turn it off and see how it goes:
SQL> alter database flashback off;
Database altered.
SQL> alter database flashback on;
alter database flashback on
*
ERROR at line 1:
ORA-38706: Cannot turn on FLASHBACK DATABASE logging.
ORA-38788: More standby database recovery is needed
Now it’s clear why Oracle complained about “ORA-38760: This database instance failed to turn on flashback”. It was because the standby had fallen behind too much. And because of failing to turn on flashback as configured, redo log could not be applied — kind of like a deadlock. What I needed to do was to just start the media recovery with flashback off.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT NODELAY;
and wait for the redo applying to catch up. During this process, I see the warning ORA-16853 in ODG broker because DelayMins is set to “0”. It would disappear when the log gap was closed.
DGMGRL> show configuration;
Configuration - DG_PROD
Protection Mode: MaxPerformance
Members:
PROD1 - Primary database
PROD2 - Physical standby database
Warning: ORA-16853: apply lag has exceeded specified threshold
Fast-Start Failover: Disabled
Configuration Status:
WARNING (status updated 43 seconds ago)
DGMGRL> show database "PROD2"
Database - PROD2
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 1 second ago)
Apply Lag: 1 day(s) 24 minutes 33 seconds (computed 1 second ago)
Average Apply Rate: 10.81 MByte/s
Real Time Query: OFF
Instance(s):
mvf
Database Warning(s):
ORA-16853: apply lag has exceeded specified threshold
Database Status:
WARNING
DGMGRL> show database "PROD2" DelayMins
DelayMins = '0'
After the primary and the standby was in sync, I stopped redo apply on the standby, turned on flashback, then started redo apply again. The issue was fixed by then.
SQL> alter database recover managed standby database cancel;
SQL> alter database recover managed standby database disconnect nodelay;