
Fixing Oracle RAC Node Problems With Addnode: DB Binaries

by Rodrigo Lima, June 1st, 2023

Too Long; Didn't Read

A DBA faces an Oracle RAC node that needs to be fixed, usually after applying a nasty patch. The first approach is to remove the node, then add it back. There are other methods to try fixing the problem, but they usually will take some time, such as opening an SR.

Occasionally, a DBA faces an Oracle RAC node that needs to be fixed, usually after applying a nasty patch.


Currently, my first approach is to remove the node, then add it back. There are other methods to try fixing the problem, but they usually will take some time, such as opening an SR with Oracle.


Even though it sounds complicated, it's not. I will show you how to recover a node from several disaster scenarios.


A. DB binaries corrupted on ol8-19-rac2

B. GRID binaries corrupted on ol8-19-rac2

C. GRID and DB binaries corrupted on ol8-19-rac2

My configuration

Oracle RAC with 2 nodes: ol8-19-rac1 and ol8-19-rac2

CDB: cdbrac (instances cdbrac1 and cdbrac2)

PDB: pdb1


Grid Version

34318175;TOMCAT RELEASE UPDATE 19.0.0.0.0 (34318175)

34160635;OCW RELEASE UPDATE 19.16.0.0.0 (34160635)

34139601;ACFS RELEASE UPDATE 19.16.0.0.0 (34139601)

34133642;Database Release Update : 19.16.0.0.220719 (34133642)

33575402;DBWLM RELEASE UPDATE 19.0.0.0.0 (33575402)


DB Version

34086870;OJVM RELEASE UPDATE: 19.16.0.0.220719 (34086870)

34160635;OCW RELEASE UPDATE 19.16.0.0.0 (34160635)

34133642;Database Release Update : 19.16.0.0.220719 (34133642)


When you see <dbenv>, load the DB HOME variables.

When you see <gridenv>, load the GRID HOME variables.

When you see <bnode>, execute the command on the broken node.

When you see <anode>, execute the command on any other working node.
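The `<dbenv>`/`<gridenv>` switches above can be implemented as simple shell functions. A minimal sketch follows; the Grid home path and the `ORACLE_SID` value are assumptions, so adjust them to match your installation:

```shell
# Hypothetical <dbenv>/<gridenv> helpers. The DB home path matches the one
# used later in this procedure; the Grid home path is an assumption.
dbenv() {
  export ORACLE_HOME=/u01/app/oracle/product/19.0.0/dbhome_1
  export ORACLE_SID=cdbrac2                 # assumed instance on this node
  export PATH=$ORACLE_HOME/bin:$PATH
}

gridenv() {
  export ORACLE_HOME=/u01/app/19.0.0/grid   # assumed Grid home
  export PATH=$ORACLE_HOME/bin:$PATH
}

dbenv
echo "$ORACLE_HOME"
```

Sourcing these functions from the oracle user's profile makes switching environments a one-word command.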


I am using an installation where both the DB and GRID homes are installed under the oracle user, and I set variables to switch between the two environments. The same procedure works even if the installation uses two different users (usually oracle and grid).


Note: Always validate any procedure in a test environment before you try it in production.

Scenario A - DB Binaries Corrupted On ol8-19-rac2 After Applying a Patch

In this scenario, the GRID binaries were unaffected, so we don't need to replace them. The database is already down, since its binaries are corrupted.


  1. Back up $ORACLE_HOME/network/admin
<bnode dbenv>
[oracle@ol8-19-rac2 admin]$ mkdir -p /tmp/oracle; tar cvf /tmp/oracle/db_netadm.tar -C $ORACLE_HOME/network/admin .
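Before relying on that backup in step 5, it's worth confirming the archive actually contains the network files. A small sketch of that check, using scratch directories in place of the real paths:

```shell
# Hypothetical verification sketch: mktemp scratch dirs stand in for
# $ORACLE_HOME/network/admin and /tmp/oracle, with dummy .ora files.
set -eu
ADMIN_DIR=$(mktemp -d)
BACKUP_DIR=$(mktemp -d)
printf 'LISTENER=...\n' > "$ADMIN_DIR/listener.ora"
printf 'PDB1=...\n'     > "$ADMIN_DIR/tnsnames.ora"

# Same tar invocation as the procedure, pointed at the scratch dirs.
tar cf "$BACKUP_DIR/db_netadm.tar" -C "$ADMIN_DIR" .

# List the archive and confirm the expected files are inside.
tar tf "$BACKUP_DIR/db_netadm.tar" | grep -q 'listener.ora' && echo "backup OK"
```

On the real node, the same `tar tf /tmp/oracle/db_netadm.tar` listing is all that's needed.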


  2. Deinstall the DB binaries
<bnode dbenv>
[oracle@ol8-19-rac2 admin]$ $ORACLE_HOME/deinstall/deinstall -local

Confirm the database name, and choose yes to delete the instance. It's not unusual to face file-deletion error messages; ignore them.


  3. Add the node back

If we were adding an actual new node, we would run some cluster verification checks to confirm that everything is OK, but since the node was already part of the cluster, we can skip that.

Because of that, it's not unusual to see the message "[WARNING] [INS-13014] Target environment does not meet some optional requirements." Ignore it.
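For reference, the verification being skipped is `cluvfy stage -pre nodeadd`. A dry-run sketch of the command you would run from a working node before a genuinely new node addition:

```shell
# Dry-run sketch: builds and prints the cluster verification command for a
# real node addition (skipped in this procedure because ol8-19-rac2 was
# already a cluster member). Run the printed command under <anode gridenv>.
NEW_NODE=ol8-19-rac2
CLUVFY_CMD="cluvfy stage -pre nodeadd -n ${NEW_NODE} -verbose"
echo "$CLUVFY_CMD"
```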


From any other node, add the ex-broken node back. This step takes a while, as it copies the files from one node to another.

<anode dbenv>
[oracle@ol8-19-rac1 ~]$ $ORACLE_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={ol8-19-rac2}"

As root, execute the following on the ex-broken node.
<bnode>
[root@ol8-19-rac2 scripts]# /u01/app/oracle/product/19.0.0/dbhome_1/root.sh


  4. Add the instance back

From any other node, as the oracle user, execute the following for each existing database to add the node's instance back (this recreates the undo tablespace, redo logs, etc.).

<anode dbenv>
[oracle@ol8-19-rac1 ~]$ dbca -silent -ignorePrereqFailure -addInstance -nodeName ol8-19-rac2 -gdbName cdbrac -instanceName cdbrac2 -sysDBAUserName sys -sysDBAPassword SysPassword1
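When the cluster hosts more than one database, that dbca call has to be repeated per database. A hypothetical sketch of such a loop; `otherdb` is a made-up second database name, and the `echo` makes this a dry run that only prints each command:

```shell
# Dry-run sketch: repeat dbca -addInstance for each database on the
# cluster. "otherdb" is a placeholder; remove the echo to actually run
# dbca (and supply -sysDBAPassword as in the original command).
NODE=ol8-19-rac2
NODE_NUM=2
for DB in cdbrac otherdb; do
  echo dbca -silent -ignorePrereqFailure -addInstance \
    -nodeName "$NODE" -gdbName "$DB" -instanceName "${DB}${NODE_NUM}" \
    -sysDBAUserName sys
done
```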


  5. If needed, restore $ORACLE_HOME/network/admin on the ex-broken node
<bnode dbenv>
[oracle@ol8-19-rac2 admin]$ tar xvf /tmp/oracle/db_netadm.tar -C $ORACLE_HOME/network/admin

Finally, check that both instances are running:

[oracle@ol8-19-rac2 ~]$ srvctl status database -db cdbrac
Instance cdbrac1 is running on node ol8-19-rac1
Instance cdbrac2 is running on node ol8-19-rac2


Voilà, that's it. Your database should now be back online on the ex-broken node.


In a follow-up article, I'll show how to recover from corrupted GRID binaries.