ORA-16778: redo transport error for one or more databases

Oracle Enterprise Manager (OEM) reports the Data Guard status of the database as Error ORA-16778: redo transport error for one or more databases.

Overview of the environment

Primary DB: ORCLCPG2
Secondary / physical standby / DR DB: ORCLRPG2
Operating System=AIX
Platform=powerpc, 2 Node RAC

OEM alert

Message=The Data Guard status of ORCLCPG2 is Error ORA-16778: redo transport error for one or more databases.

Verification

The apply lag is around 27 minutes.

From stdbysrvr

stdbysrvr:/u00/app/oracle:SID=ORCLRPG2 > date
Mon Feb 6 17:56:28 GMT 2017

From Data Guard Broker

DGMGRL> show database "ORCLRPG2"

Database - ORCLRPG2

Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 27 minutes 2 seconds (computed 26 seconds ago)
Apply Lag: 27 minutes 2 seconds (computed 26 seconds ago)
Apply Rate: 46.01 MByte/s
Real Time Query: OFF
Instance(s):
ORCLRPG2

Database Status:
SUCCESS

DGMGRL> show database "ORCLCPG2"

Database - ORCLCPG2

Role: PRIMARY
Intended State: TRANSPORT-ON
Instance(s):
ORCLCPG21
ORCLCPG22

Database Status:

SUCCESS

DGMGRL> exit
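
For scripted monitoring, the lag strings that DGMGRL prints can be converted to seconds and compared with a threshold. The following is a minimal shell sketch and not part of the original troubleshooting session: the function name lag_to_seconds and the 900-second limit are assumptions chosen for illustration, and the parser only handles the day/hour/minute/second wording seen in the broker output.

```shell
# Convert a DGMGRL lag string such as "27 minutes 2 seconds" to seconds.
# lag_to_seconds is a hypothetical helper name, not an Oracle utility.
lag_to_seconds() {
    printf '%s\n' "$1" | awk '{
        total = 0
        # Walk the "<number> <unit>" pairs and stop at anything else,
        # for example the trailing "(computed 26 seconds ago)".
        for (i = 1; i + 1 <= NF; i += 2) {
            if ($i !~ /^[0-9]+$/) break
            n = $i; unit = $(i + 1)
            if      (unit ~ /^day/)    total += n * 86400
            else if (unit ~ /^hour/)   total += n * 3600
            else if (unit ~ /^minute/) total += n * 60
            else if (unit ~ /^second/) total += n
            else break
        }
        print total
    }'
}

# Assumed threshold for illustration: warn when the lag exceeds 15 minutes.
LAG_LIMIT=900
lag=$(lag_to_seconds "27 minutes 2 seconds (computed 26 seconds ago)")
if [ "$lag" -gt "$LAG_LIMIT" ]; then
    echo "apply lag ${lag}s exceeds ${LAG_LIMIT}s"
fi
```

With the value from this incident, the lag works out to 27 * 60 + 2 = 1622 seconds, which is above the assumed limit.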

Verify the MRP process

stdbysrvr:/u00/app/oracle:SID=ORCLRPG2 > ps -ef | grep -i mrp
oracle 10223700 1 0 Dec 09 - 1:52 ora_mrp0_ORCLRPG2
oracle 17760362 5767174 0 17:57:34 pts/1 0:00 grep -i mrp

Verify mode and role etc

SQL> select name,open_mode,log_mode,database_role,created from gv$database;

NAME      OPEN_MODE            LOG_MODE     DATABASE_ROLE    CREATED
--------- -------------------- ------------ ---------------- ---------
ORCLCPG2  MOUNTED              ARCHIVELOG   PHYSICAL STANDBY 23-NOV-11

Verify Lag

SQL> select to_char(sysdate,'DD.MM.RR HH24:MI: SS') time, a.thread#,
       (select max(sequence#) from v$archived_log where archived='YES' and thread#=a.thread#) archived,
       max(a.sequence#) applied,
       (select max(sequence#) from v$archived_log where archived='YES' and thread#=a.thread#)-max(a.sequence#) gap
     from v$archived_log a
     where a.applied='YES'
     group by a.thread#;

TIME                  THREAD#   ARCHIVED    APPLIED        GAP
------------------ ---------- ---------- ---------- ----------
06.02.17 17:58: 05          1      99949      99948          1
06.02.17 17:58: 05          2      97504      97504          0

Reason

A network problem between the primary and standby servers is interrupting redo transport and causing these errors. No other errors appear in the alert log.
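
One way to confirm that the alert log holds no unexpected errors is to count the ORA- codes it contains. The sketch below is an illustration under assumptions, not the method used in this incident: ora_error_counts is a hypothetical helper name, and the alert log path shown in the comment is a placeholder that depends on the local diagnostic destination.

```shell
# Print each distinct ORA- code found in a log file with its occurrence count.
# ora_error_counts is a hypothetical helper name, not an Oracle utility.
ora_error_counts() {
    awk '{
        line = $0
        # A single line may mention several ORA- codes, so keep matching
        # on the remainder of the line after each hit.
        while (match(line, /ORA-[0-9]+/)) {
            count[substr(line, RSTART, RLENGTH)]++
            line = substr(line, RSTART + RLENGTH)
        }
    }
    END { for (code in count) print code, count[code] }' "$1" | sort
}

# Example call (placeholder path; adjust to the real alert log location):
# ora_error_counts /path/to/alert_ORCLCPG21.log
```

An empty result means the file contains no ORA- codes at all; otherwise each output line is a code followed by how often it occurred.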

Solution

Ask the network team to check connectivity between the primary and standby servers whenever this kind of issue occurs.
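
Before raising the issue with the network team, a quick reachability check from the primary host can narrow things down. This is a minimal sketch under assumptions: run_check and check_standby_link are hypothetical helper names, ORCLRPG2 is assumed to be the TNS alias used by the redo transport destination, and the exit status of tnsping is assumed to reflect success or failure (if it does not in a given release, inspect its output instead).

```shell
# Run one command quietly and report the result on a single line.
run_check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK     $label"
    else
        echo "FAILED $label"
        return 1
    fi
}

# Assumptions: stdbysrvr is the standby host named in this post and
# ORCLRPG2 resolves as a TNS alias on the primary. Adjust both as needed.
check_standby_link() {
    rc=0
    run_check "ping stdbysrvr"   ping -c 3 stdbysrvr || rc=1
    run_check "tnsping ORCLRPG2" tnsping ORCLRPG2    || rc=1
    return $rc
}

# Run on the primary host as the oracle user:
# check_standby_link
```

A failed ping with a working tnsping, or the reverse, is useful detail to pass on, since it hints at whether the problem is basic routing or the listener path.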

The complete alert from the OEM server looks like this.

Alert from OEM server
ORA-16778: redo transport error for one or more databases.

Host=oraix01
Target type=Cluster Database
Target name=ORCLCPG2
Categories=Availability
Message=The Data Guard status of ORCLCPG2 is Error ORA-16778: redo transport error for one or more databases.
Severity=Critical
Event reported time=Feb 6, 2017 4:37:23 PM GMT
Operating System=AIX
Platform=powerpc
Associated Incident Id=133027
Associated Incident Status=New
Associated Incident Owner=
Associated Incident Acknowledged By Owner=No
Associated Incident Priority=None
Associated Incident Escalation Level=0
Event Type=Metric Alert
Event name=dataguard:dg_status
Metric Group=Data Guard Status
Metric=Data Guard Status
Metric value=Error ORA-16778: redo transport error for one or more databases
Key Value=ORCLCPG2
Key Column 1=Name
Rule Name=DEFAULT_RULESET_FOR_ALL_TARGETS,METRIC_ALERT_INCIDENT_CREATION
Rule Owner=
Update Details:
The Data Guard status of ORCLCPG2 is Error ORA-16778: redo transport error for one or more databases.
Incident created by rule (Name = Incident management rule set for all targets, Create incident for critical metric alerts [System generated rule]).
