Monday, May 29, 2017

lsnodes: error while loading shared libraries: libskgxn2.so: cannot open shared object file: No such file or directory

Seeing error "lsnodes: error while loading shared libraries: libskgxn2.so: cannot open shared object file: No such file or directory" when trying to extend the DB home on to the RAC 2nd node.


Seeing the below error:

INFO: /u01/app/oracle/product/12.1.0.2/db_1/oui/bin/../bin/lsnodes: error while loading shared libraries: libskgxn2.so: cannot open shared object file: No such file or directory
INFO: Vendor clusterware is not detected.
INFO: Error ocurred while retrieving node numbers of the existing nodes. Please check if clusterware home is properly configured.
SEVERE: Error ocurred while retrieving node numbers of the existing nodes. Please check if clusterware home is properly configured.
INFO: Alert Handler not registered, using Super class functionality
INFO: Alert Handler not registered, using Super class functionality
INFO: User Selected: Yes/OK


This error is due to a wrong lsnodes binary in the DB_HOME.

The solution is to create the following symbolic links from the GI home to the DB home on node 1 before adding the 2nd node.

As the oracle db user (oradba):

$ export GRID_HOME=/u01/app/12.1.0.2/grid
$ export DB_HOME=/u01/app/oracle/product/12.1.0.2/db_1


$ cd $DB_HOME/oui/bin
$ mv lsnodes lsnodes.old
$ ln -s $GRID_HOME/bin/olsnodes $DB_HOME/oui/bin/lsnodes
$ ln -s $GRID_HOME/bin/olsnodes.bin $DB_HOME/oui/bin/lsnodes.bin
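To sanity-check the link before retrying, you can run the relinked lsnodes from the DB home; on a healthy cluster it should simply list the node names (rac1/rac2 below are placeholder names, not output from this environment):

$ $DB_HOME/oui/bin/lsnodes
rac1
rac2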

Now retry adding the Oracle home to the 2nd node.

Friday, February 10, 2017

CRS-2922: The attribute 'ORACLE_USER' is not supported for resource type 'ora.gipc.type'. CRS-4000: Command Add failed, or completed with errors.

When I tried to start the 12.1.0.2 CRS on one of the RAC nodes on Linux, I saw the error "CRS-2922: The attribute 'ORACLE_USER' is not supported for resource type 'ora.gipc.type'. CRS-4000: Command Add failed, or completed with errors."


Solution:-

As the root user, execute the following command:

crsctl add res ora.gipcd -type ora.gipc.type -attr  "ACL='owner:oragrid:rw-,pgrp:oinstall:rw-,other::r--,user:oragrid:rwx'" -init

Here oragrid is the grid user and oinstall is the group.
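To confirm the resource was registered with the intended ownership, you can dump its profile with crsctl and check the ACL attribute (same resource name as above; output trimmed to the relevant line):

# crsctl stat res ora.gipcd -init -p | grep ^ACL
ACL=owner:oragrid:rw-,pgrp:oinstall:rw-,other::r--,user:oragrid:rwx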

Now start the CRS: ./crsctl start crs

Friday, September 16, 2016

RMAN-08138: WARNING: archived log not deleted - must create more backups

When taking an RMAN backup to disk of a 12c Oracle database, I see the error "RMAN-08138: WARNING: archived log not deleted - must create more backups", and the archived logs are not deleted after the backup.

That is, RMAN is able to back up the database and the archived logs, but it does not delete the backed-up archived logs to free up space.

Solution:-

The problem was with the default configuration setting below; with this policy, RMAN only deletes archived logs that have been backed up to SBT_TAPE.

CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 1 TIMES TO 'SBT_TAPE';
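This is what an RMAN session reports for the current policy; the same command can be used later to confirm the change took effect:

RMAN> SHOW ARCHIVELOG DELETION POLICY;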


But since I am taking the backup to disk, I changed the configuration as below so that archived logs are deleted once they have been backed up to disk.

CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 1 TIMES TO DISK;

Now RMAN is able to delete the archived logs after the backup to disk.
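For reference, a typical disk backup run under this policy would look something like the following (an illustrative sketch; channel allocation and FORMAT clauses are omitted). With the policy set to BACKED UP 1 TIMES TO DISK, the DELETE command only removes archived logs that already have at least one disk backup:

RMAN> BACKUP DATABASE PLUS ARCHIVELOG;
RMAN> DELETE NOPROMPT ARCHIVELOG ALL;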

Wednesday, July 27, 2016

ORA-16826: apply service state is inconsistent with the DelayMins property

Seeing the warning "ORA-16826: apply service state is inconsistent with the DelayMins property" in the Data Guard Broker configuration.

DGMGRL> show configuration;

Configuration - orcl_dg

  Protection Mode: MaxPerformance
  Databases:
    orclprmy     - Primary database
    orclstby - Physical standby database
Warning: ORA-16826: apply service state is inconsistent with the DelayMins property

Fast-Start Failover: DISABLED

Configuration Status:
WARNING


Solution:-

The warning is due to a mismatch between the apply service state and the 'DelayMins' property. We need to convert the standby to real-time apply.

Note:- Standby redo logs must be created before converting the standby to real-time apply.
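A quick way to confirm that standby redo logs exist on the standby is to query v$standby_log (a generic check; group numbers and sizes will vary per environment):

SQL> select group#, thread#, bytes/1024/1024 as mb, status from v$standby_log;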


• Converting the standby to real-time apply:-
1. Cancel the current recovery mode on the standby database

SQL> alter database recover managed standby database cancel;

2. Enable real-time apply

SQL> alter database recover managed standby database using current logfile disconnect from session;
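To verify that real-time apply is now in effect, one option is to check the recovery mode reported for the standby destination on the primary; it should show MANAGED REAL TIME APPLY (assuming dest_id 2 is the standby destination in this configuration):

SQL> select dest_id, recovery_mode from v$archive_dest_status where dest_id = 2;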


• Now check the Data Guard configuration

DGMGRL> show configuration;

Configuration - orcl_dg

  Protection Mode: MaxPerformance
  Databases:
    orclprmy     - Primary database
    orclstby - Physical standby database
     

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS


Monday, July 4, 2016

ORA 600 [krr_process_read_error_1]: Oracle 12c Dataguard Error

After I upgraded the database from 11.2.0.4 to 12.1.0.2, I started seeing the error “ORA 600 [krr_process_read_error_1]” on standby databases on the AIX platform. Also, the MRP process is crashing.

When the MRP process crashes, the logs shipped from the primary to the standby are not applied until the MRP process is started again.


Workaround:

This problem is due to bug 22294260, which occurs only in AIX environments.

The only workaround for this problem is to restart the recovery on the standby database.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING ARCHIVED LOGFILE DISCONNECT; 
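After restarting recovery, you can confirm the MRP process is back by querying v$managed_standby on the standby (a generic check, not specific to this bug):

SQL> select process, status from v$managed_standby where process like 'MRP%';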


ORA-700 [kskvmstatact: excessive swapping observed] : Oracle 12c RAC Error

Recently I upgraded an 11.2.0.4 RAC to 12.1.0.2 on Linux. After upgrading both the GI and the database to 12.1.0.2, I started seeing “ORA-700 [kskvmstatact: excessive swapping observed]” alerts frequently.

Below is the alert log content when we got the alert:
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [1.96%] pct of memory swapped out [0.41%].
Please make sure there is no memory pressure and the SGA and PGA
are configured correctly. Look at DBRM trace file for more details.
Errors in file /u01/app/oracle/diag/rdbms/orcl/ORCL1/trace/ORCL1_dbrm_14607.trc  (incident=37241):
ORA-00700: soft internal error, arguments: [kskvmstatact: excessive swapping observed], [], [], [], [], [], [], [], [], [], [], []

Every time I get the alert, the sum of the percent of memory swapped in and swapped out is higher than 2%. In this case, swapped in is 1.96% and swapped out is 0.41%, a total of 2.37%.
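To see what the OS itself reports while these alerts fire, standard Linux tools are enough; for example, watch the si/so (swap-in/swap-out) columns with vmstat (a generic command, nothing cluster-specific):

$ vmstat 5 3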

 

Solution:-

This is expected behavior in 12c, and there is no way to get rid of this error for now.

Bug 19495842 exists to change the threshold of the swap warning in the alert log in future releases. Until then, we have to live with this alert in 12.1.0.2 :).


Thursday, June 23, 2016

Oracle RAC 12c MGMTDB (Management Database)

1. MGMTDB is a container used to store the diagnostic information collected by Cluster Health Monitor (CHM).
2. It is a single-instance database managed by Oracle Clusterware in Oracle 12c; in 11g, a Berkeley DB database was used for the same purpose.
3. Having MGMTDB is optional in 12.1.0.1; you can choose whether or not to have it at the time of Clusterware setup or upgrade from older versions to 12.1.0.1. But in 12.1.0.2 it became mandatory to have MGMTDB.
4. MGMTDB runs on the master node (i.e. the node on which MGMTDB runs can be considered the master node).
5. If the node where the Management Database is running crashes, MGMTDB automatically fails over to another node.
6. The Management Database is stored on the same storage as the OCR and voting disks. So, if you go with the MGMTDB option in 12c, you should have more than 5GB of space for the OCR and voting disks.

If you use smaller disks, it will not allow you to set up or upgrade the clusterware.

 

Commands on MGMTDB


1. Finding on which node MGMTDB is running:
$ oclumon manage -get MASTER

Master = rac1

2. Status of the MGMTDB
$ srvctl status mgmtdb
Database is enabled
Instance -MGMTDB is running on node rac1

3. Configuration of the Management Database (MGMTDB)
$ srvctl config mgmtdb
Database unique name: _mgmtdb
Database name:
Oracle home: <CRS home>
Oracle user: oragrid
Spfile: +OCR_VOTE/_MGMTDB/PARAMETERFILE/spfile.268.915007725
Password file:
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Type: Management
PDB name: crs_linux
PDB service: crs_linux
Cluster name: crs-linux
Database instance: -MGMTDB
4. Location of the Cluster Health Monitor (CHM) repository
$ oclumon manage -get reppath
CHM Repository Path = +OCR_VOTE/_MGMTDB/FD9B43BF6A646F8CE043B6A9E80A2815/DATAFILE/sysmgmtdata.269.915007863
5. CHM repository size
$ oclumon manage -get repsize

CHM Repository Size = 136320 seconds
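If the default retention is not sufficient, the CHM repository retention time can be adjusted with oclumon (the value is in seconds; the number below is only an example, check the 12c documentation for the supported range):

$ oclumon manage -repos changeretentiontime 260000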