In my last post, I resolved an ODG issue that the primary server IP was wrong on the standby server. I walked away and one hour later, I was told the redo apply was still not working. “Something else is wrong?” I was thinking.
So I looked at the system again. I saw the error ORA-38760 in the alert log
2020-06-12T15:48:29.542757-04:00 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT NODELAY Starting background process MRP0 2020-06-12T15:48:29.574028-04:00 MRP0 started with pid=46, OS id=6800 2020-06-12T15:48:34.605048-04:00 Started logmerger process 2020-06-12T15:48:34.620654-04:00 PR00 (PID:2948): MRP0: Background Media Recovery terminated with error 38760 2020-06-12T15:48:34.620654-04:00 Errors in file C:\PROD\DATA\LOGS\ORACLE\DIAG\diag\rdbms\PROD2\PROD\trace\PROD_pr00_2948.trc: ORA-38760: This database instance failed to turn on flashback database 2020-06-12T15:48:35.729969-04:00 Recovery Slave PR00 previously exited with exception 38760 2020-06-12T15:48:35.823714-04:00 Errors in file C:\PROD\DATA\LOGS\ORACLE\DIAG\diag\rdbms\PROD2\PROD\trace\PROD_mrp0_6800.trc: ORA-38760: This database instance failed to turn on flashback
The trace files didn’t contain too much information other than the ORA-38760 error itself. There were messages like below because OS is Windows:
Required IPC RDMAV_FORK_SAFE environment not set Required IPC RDMAV_HUGEPAGES_SAFE environment not set
According to MOS note: 19c Database On Windows – ADR Is Flooded By Linux-related Trace Files (Doc ID 2588705.1), “This is not an error or warning in particular.”
The standby database has flashback_on set to yes. It looked like Oracle tried to turn it on after the media recovery started, but failed to do so for some reasons. Then recovery service terminated. So I decided to turn it off and see how it goes:
SQL> alter database flashback off; Database altered. SQL> alter database flashback on; alter database flashback on * ERROR at line 1: ORA-38706: Cannot turn on FLASHBACK DATABASE logging. ORA-38788: More standby database recovery is needed
Now it’s clear why Oracle complained about “ORA-38760: This database instance failed to turn on flashback”. It was because the standby had fallen behind too much. And because of failing to turn on flashback as configured, redo log could not be applied — kind of like a deadlock. What I needed to do was to just start the media recovery with flashback off.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT NODELAY;
and wait for the redo applying to catch up. During this process, I see the warning ORA-16853 in ODG broker because DelayMins is set to “0”. It would disappear when the log gap was closed.
DGMGRL> show configuration; Configuration - DG_PROD Protection Mode: MaxPerformance Members: PROD1 - Primary database PROD2 - Physical standby database Warning: ORA-16853: apply lag has exceeded specified threshold Fast-Start Failover: Disabled Configuration Status: WARNING (status updated 43 seconds ago) DGMGRL> show database "PROD2" Database - PROD2 Role: PHYSICAL STANDBY Intended State: APPLY-ON Transport Lag: 0 seconds (computed 1 second ago) Apply Lag: 1 day(s) 24 minutes 33 seconds (computed 1 second ago) Average Apply Rate: 10.81 MByte/s Real Time Query: OFF Instance(s): mvf Database Warning(s): ORA-16853: apply lag has exceeded specified threshold Database Status: WARNING DGMGRL> show database "PROD2" DelayMins DelayMins = '0'
After the primary and the standby was in sync, I stopped redo apply on the standby, turned on flashback, then started redo apply again. The issue was fixed by then.
SQL> alter database recover managed standby database cancel; SQL> alter database recover managed standby database disconnect nodelay;