Killing Oracle Sessions at the Database and OS Level

Identify the Session to be Killed

Killing sessions can be very destructive if you kill the wrong session, so be very careful when identifying the session to be killed. If you kill a session belonging to a background process you will cause an instance crash.

Identify the offending session using the [G]V$SESSION and [G]V$PROCESS views as follows.

SET LINESIZE 100
COLUMN spid FORMAT A10
COLUMN username FORMAT A10
COLUMN program FORMAT A45

SQL> SELECT s.inst_id,
       s.sid,
       s.serial#,
       p.spid,
       s.username,
       s.program
FROM   gv$session s
       JOIN gv$process p ON p.addr = s.paddr AND p.inst_id = s.inst_id
WHERE  s.type != 'BACKGROUND';

   INST_ID        SID    SERIAL# SPID       USERNAME   PROGRAM
---------- ---------- ---------- ---------- ---------- ---------------------------------------------
         1         30         15 3859       TEST       sqlplus@oel5-11gr2.localdomain (TNS V1-V3)
         1         23        287 3834       SYS        sqlplus@oel5-11gr2.localdomain (TNS V1-V3)
         1         40        387 4663                  oracle@oel5-11gr2.localdomain (J000)
         1         38        125 4665                  oracle@oel5-11gr2.localdomain (J001)

The SID and SERIAL# values of the relevant session can then be substituted into the commands in the following sections.

ALTER SYSTEM KILL SESSION

The basic syntax for killing a session is shown below.

SQL> ALTER SYSTEM KILL SESSION 'sid,serial#';

In a RAC environment, you can optionally specify the INST_ID, shown when querying the GV$SESSION view. This allows you to kill a session on a different RAC node.

SQL> ALTER SYSTEM KILL SESSION 'sid,serial#,@inst_id';

The KILL SESSION command doesn’t actually kill the session. It merely asks the session to kill itself. In some situations, like waiting for a reply from a remote database or rolling back transactions, the session will not kill itself immediately and will wait for the current operation to complete. In these cases the session will have a status of “marked for kill”. It will then be killed as soon as possible.
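You can confirm a session is in this state by checking its STATUS in [G]V$SESSION, which reports KILLED for a session that has been marked. A quick check, reusing SID 30 from the example above:

SQL> SELECT sid, serial#, status FROM v$session WHERE sid = 30;

A status of KILLED means the session is marked and will be cleaned up as soon as possible.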

In addition to the syntax described above, you can add the IMMEDIATE clause.

SQL> ALTER SYSTEM KILL SESSION 'sid,serial#' IMMEDIATE;

This does not affect the work performed by the command, but it returns control back to the current session immediately, rather than waiting for confirmation of the kill.

If the marked session persists for some time you may consider killing the process at the operating system level. Before doing this it’s worth checking to see if it is performing a rollback. You can do this by running this script (session_undo.sql). If the USED_UREC value is decreasing for the session in question you should leave it to complete the rollback rather than killing the session at the operating system level.
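If you do not have the script to hand, a minimal sketch of it can be built from V$SESSION and V$TRANSACTION, joined in the standard way on s.taddr = t.addr. Run it a few times; if USED_UREC is falling, a rollback is in progress:

SQL> SELECT s.sid,
       s.serial#,
       t.used_ublk,
       t.used_urec
FROM   v$session s
       JOIN v$transaction t ON t.addr = s.taddr;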

ALTER SYSTEM DISCONNECT SESSION

The ALTER SYSTEM DISCONNECT SESSION syntax is an alternative method for killing Oracle sessions. Unlike the KILL SESSION command, which asks the session to kill itself, the DISCONNECT SESSION command kills the dedicated server process (or virtual circuit when using Shared Server), which is equivalent to killing the server process from the operating system. The basic syntax is similar to the KILL SESSION command, with the addition of the POST_TRANSACTION clause. The SID and SERIAL# values of the relevant session can be substituted into one of the following statements.

SQL> ALTER SYSTEM DISCONNECT SESSION 'sid,serial#' POST_TRANSACTION;

SQL> ALTER SYSTEM DISCONNECT SESSION 'sid,serial#' IMMEDIATE;

The POST_TRANSACTION clause waits for ongoing transactions to complete before disconnecting the session, while the IMMEDIATE clause disconnects the session and ongoing transactions are rolled back immediately.

The POST_TRANSACTION and IMMEDIATE clauses can be used together, but the documentation states that in this case the IMMEDIATE clause is ignored. In addition, the syntax diagram suggests both clauses are optional, but in reality, one or both must be specified or you receive an error.

SQL> alter system disconnect session '30,7';

alter system disconnect session '30,7'
                                     *
ERROR at line 1:
ORA-02000: missing POST_TRANSACTION or IMMEDIATE keyword

SQL>

This command means you should never need to switch to the operating system to kill sessions, which reduces the chances of killing the wrong process.

The Windows Approach

To kill the session on the Windows operating system, first identify the session, then substitute the relevant SID and SPID values into the following command issued from the command line.

C:\> orakill ORACLE_SID spid

The session thread should be killed immediately and all resources released.
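For example, to kill the TEST session identified earlier (SPID 3859), assuming a hypothetical instance name of ORCL:

C:\> orakill ORCL 3859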

The UNIX Approach

Warning: If you are using the Multithreaded Model in Oracle 12c, you should not attempt to kill operating system processes, because under that model a single process hosts many session threads and killing it would take out unrelated sessions.

To kill the session on UNIX or Linux operating systems, first identify the session, then substitute the relevant SPID into the following command.

% kill spid

If after a few minutes the process hasn’t stopped, terminate the session using the following.

% kill -9 spid

If in doubt, check that the SPID matches the UNIX process ID shown by the following command.

% ps -ef | grep ora

The session thread should be killed immediately and all resources released.

Patching RAC (Grid plus Database) with a Bundle Patch (not a full release)


Environment: 11g RAC/DB (11.2.0.3), Windows Server 2008 R2 x64 Edition
Patch: 13965211 - 11.2.0.3 Patch 6 for Windows 64-bit; download the patch from My Oracle Support (MetaLink).
This patch is RAC rolling upgradable for the GRID and RDBMS homes.
You should patch one Oracle product (one Oracle home) at a time with its particular Bundle Patch Update.

DB 11.2.0.3 Patch 6 includes all bugs fixed in 11.2.0.3 Patch 1 to Patch 5. It must be applied on top of the 11.2.0.3 base release or on top of any earlier bundle patch from 11.2.0.3 Patch 1 to Patch 5.
The order of patching in a RAC install is GRID_HOME first, then RDBMS_HOME.
OPatch Info:
Oracle recommends that you use OPatch utility release 11.2.0.1.9 or later. You can download it from My Oracle Support as patch 6880880.

Patch Installation Instructions
Patching GRID Home

  1. Before patching GRID_HOME, stop the OCR-related resources, the Oracle High Availability Services (OHASD) stack and all Oracle services:
    > srvctl stop service -d <db_name> -s <service_name> -i <instance_name>
    > srvctl stop instance -d DBNAME -i INSTANCE -o abort
    > crsctl stop crs

    Make a backup of the Oracle home being patched (e.g. d:\app...)

  2. To verify that all resources have moved to the other nodes, run the following command from another node with GRID running:
    > crsctl status resource -t

     

  3. Then use the Computer Management Services GUI or the net stop command to stop ALL Oracle services running on that node:
        net stop OracleService<SID>
        net stop OracleTNSListenerLISTENER_<hostname>
        net stop OracleTNSListenerLISTENER_SCANn
        net stop OracleDBConsole<SID>
        net stop OracleASMService<+ASMn>
        net stop "Oracle VSS Writer Service"
        net stop OracleREMExecService
        net stop OracleOHService
        net stop "Oracle Object Service"
        net stop OracleClusterVolumeService
        net stop ocfs (required only for OCFS installations)
        net stop orafenceservice
        Note: An explicit 'srvctl start listener -n <node_name>' should be issued after patching
  4. Set the ORACLE_HOME environment variable properly
    D:\Sources\11.2.0.3\Patches\p6880880_112000_MSWIN-x86-64_optach\OPatch>set ORACLE_HOME=D:\app\11.2.0.3\grid
  5. Go to the directory where you downloaded the patch:
    D:\Sources\11.2.0.3\Patches\p6880880_112000_MSWIN-x86-64_optach\OPatch>cd D:\Sources\11.2.0.3\Patches\p13965211_112030_MSWIN-x86-64
    First check the inventory for the existing patches
    D:\Sources\11.2.0.3\Patches\p13965211_112030_MSWIN-x86-64>D:\Sources\11.2.0.3\Patches\p6880880_112000_MSWIN-x86-64_optach\OPatch\opatch lsinventory
Oracle Interim Patch Installer version 11.2.0.3.0
Copyright (c) 2012, Oracle Corporation.  All rights reserved.


Oracle Home       : D:\app\11.2.0.3\grid
Central Inventory : C:\Program Files\Oracle\Inventory
   from           : n/a
OPatch version    : 11.2.0.3.0
OUI version       : 11.2.0.3.0
Log file location : D:\app\11.2.0.3\grid\cfgtoollogs\opatch\opatch2012-05-26_10-47-13AM_1.log

Lsinventory Output file location : D:\app\11.2.0.3\grid\cfgtoollogs\opatch\lsinv\lsinventory2012-05-26_10-47-13AM.txt

--------------------------------------------------------------------------------
Installed Top-level Products (1):

Oracle Grid Infrastructure                                           11.2.0.3.0
There are 1 products installed in this Oracle Home.


There are no Interim patches installed in this Oracle Home.


Rac system comprising of multiple nodes
  Local node = P-HQ-CL-OR-11
  Remote node = P-HQ-CL-OR-12

--------------------------------------------------------------------------------

OPatch succeeded.

  6. D:\Sources\11.2.0.3\Patches\p13965211_112030_MSWIN-x86-64>cd 13965211
  7. D:\Sources\11.2.0.3\Patches\p13965211_112030_MSWIN-x86-64\13965211>D:\Sources\11.2.0.3\Patches\p6880880_112000_MSWIN-x86-64_optach\OPatch\opatch apply -local
  8. opatch will attempt to apply the patch on all the nodes in the cluster sequentially, but it prompts for user input before starting on the other nodes. At that point, ensure that the Oracle services on the target node are stopped before proceeding. Once the opatch session completes patching this node, start the services and proceed to apply the patch to the next node.
  9. Once the opatch session completes patching GRID_HOME on a node, run %ORACLE_HOME%\bin\acfsroot install before proceeding to patch GRID_HOME on the next node:
     D:\Sources\11.2.0.3\Patches\p13965211_112030_MSWIN-x86-64\13965211>%ORACLE_HOME%\bin\acfsroot install
    ACFS-9300: ADVM/ACFS distribution files found.
    ACFS-9307: Installing requested ADVM/ACFS software.
    ACFS-9308: Loading installed ADVM/ACFS drivers.
    ACFS-9327: Verifying ADVM/ACFS devices.
    ACFS-9157: Detecting driver ‘oracle oks’.
    ACFS-9157: Detecting driver ‘oracle advm’.
    ACFS-9157: Detecting driver ‘oracle acfs’.
    ACFS-9309: ADVM/ACFS installation correctness verified.

  10. After this you can start CRS, or move on to the RDBMS home patch without starting it. I started it:
    D:\app\11.2.0.3\grid\BIN>crsctl start crs
    CRS-4123: Oracle High Availability Services has been started.

    D:\app\11.2.0.3\grid\BIN>crsctl check crs
    CRS-4638: Oracle High Availability Services is online
    CRS-4537: Cluster Ready Services is online
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online
     
  11. Verify the patch installation:
      opatch lsinventory

    Patching RDBMS Home

    1. Before patching RDBMS_HOME, the GRID and ASM services can remain running.
      Stop the OCR-related resources and Oracle services on the current node as listed below:
      > srvctl stop service -d <db_name> -s <service_name> -i <instance_name>
      > srvctl stop instance -d <db_name> -i <instance_name> -o abort
      D:\app\11.2.0.3\grid\BIN>srvctl stop instance -d rac -i rac1
    2. Set ORACLE_HOME
      D:\Sources\11.2.0.3\Patches\p13965211_112030_MSWIN-x86-64\13965211>set ORACLE_HOME=D:\app\farif\product\11.2.0.3\dbhome_1
    3.  Apply patch
      D:\Sources\11.2.0.3\Patches\p13965211_112030_MSWIN-x86-64\13965211>D:\Sources\11.2.0.3\Patches\p6880880_112000_MSWIN-x86-64_optach\OPatch\opatch apply 
    4. Start the DB instances
      D:\app\11.2.0.3\grid\BIN>srvctl start instance -d rac -i rac1

      D:\app\11.2.0.3\grid\BIN>
    5. Post installation steps

      After installing the patch, perform the following actions:
      For each database instance running out of the ORACLE_HOME being patched, connect to the database using SQL*Plus as SYSDBA and run catwinbundle.sql as follows:

      > cd %ORACLE_HOME%\Bundle\Patch6
      > sqlplus /nolog
      SQL> CONNECT / AS SYSDBA
      SQL> STARTUP
      SQL> @catwinbundle.sql
      SQL> QUIT

C:\Users\farif>cd D:\app\farif\product\11.2.0.3\dbhome_1\bundle\patch6


D:\app\farif\product\11.2.0.3\dbhome_1\bundle\patch6>sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sun May 27 11:30:19 2012

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining and Real Application Testing options

SQL>

SQL> @catwinbundle.sql

SQL> ALTER SESSION SET current_schema = SYS;

Session altered.

SQL> PROMPT Updating registry...
Updating registry...
SQL> INSERT INTO registry$history
  2    (action_time, action,
  3     namespace, version, id,
  4     bundle_series, comments)
  5  VALUES
  6    (SYSTIMESTAMP, 'APPLY',
  7     SYS_CONTEXT('REGISTRY$CTX','NAMESPACE'),
  8     '11.2.0.3',
  9     6,
 10     'WINBUNDLE',
 11     '11.2.0.3 BP 6');

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SPOOL off
SQL> SET echo off
Check the following log file for errors:
D:\app\farif\cfgtoollogs\catbundle\catbundle_WINBUNDLE_RAC_APPLY_2012May27_11_31_37.log
SQL>

Check that all dependent objects are valid and, if not, compile them by running utlrp.sql.

SQL> select count(*) from dba_objects where status = 'INVALID';

  COUNT(*)
----------
     20083
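With that many invalid objects, a sensible follow-up is to run utlrp.sql from the RDBMS admin directory and then re-check the count, which should drop to or near zero:

SQL> @?/rdbms/admin/utlrp.sql
SQL> select count(*) from dba_objects where status = 'INVALID';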
Note: 
You must execute the steps in “Post Installation Steps” for any new database that was created by any of the following methods:

  • Using DBCA (Database Configuration Assistant) to select a sample database (General, Data Warehouse, Transaction Processing)
  • Using a script that was created by DBCA that creates a database from a sample database
  • Cloning a database that was created by either of the two preceding methods, and if “Post Installation Steps”   was not executed after Bundle Patch6 was applied

Using Services in an Oracle RAC Environment

10g

To add a service

Command: srvctl add service -d db_name -s service_name -r pref_insts [-a avail_insts] [-P TAF_policy]
Command: srvctl add service -d db_name -s service_name -u {-r “new_pref_inst” | -a “new_avail_inst”}
Example: srvctl add service -d RAC -s PRD -r RAC01,RAC02 -a RAC03,RAC04
Example: srvctl add serv -d CRM -s CRM -r CRM1 -a CRM3 -P basic

To remove a service

Command: srvctl remove service -d db_name -s service_name [-i inst_name] [-f]
Example: srvctl remove serv -d rac -s sales
Example: srvctl remove service -d rac -s sales -i rac01,rac02

To start a service

Command: srvctl start service -d db_name [-s service_names [-i inst_name]] [-o start_options]
Command: srvctl start service -d db_name -s service_names [-o open]
Command: srvctl start service -d db_name -s service_names -o nomount
srvctl start service -d db_name -s service_names -o mount
srvctl start serv -d dwh -s dwh

To stop a service

Command: srvctl stop service -d db_name [-s service_names [-i inst_name]] [-f]
srvctl stop serv -d dwh -s dwh

To view the status of a service

Command: srvctl status service -d db_name [-s service_names] [-f] [-v] [-S level]
srvctl status service -d rac -s rac

To enable the service at cluster startup

Command: srvctl enable service -d db_name -s service_names [-i inst_name]
srvctl enable service -d apps -s apps1

To disable the service at cluster startup

Command: srvctl disable service -d db_name -s service_names [-i inst_name]
Example: srvctl disable serv -d rac -s rac -i rac1

To view the service configuration

Command: srvctl config service -d db_name [-s service_name] [-a] [-S level]
Command: srvctl config service -d db_name -a    (the -a flag shows the TAF configuration)
Example: srvctl config service -d TEST -s test    (sample output: test PREF: TST1 AVAIL: TST2)

To modify the service

Command: srvctl modify service -d db_name -s service_name -i old_inst_name -t new_inst_name [-f]
Command: srvctl modify service -d db_name -s service_name -i avail_inst_name -r [-f]
Command: srvctl modify service -d db_name -s service_name -n -i preferred_list [-a available_list] [-f]
Command: srvctl modify service -d db_name -s service_name -i old_inst_name -a avail_inst -P TAF_policy
Example: srvctl modify serv -d PROD -s DWH -n -i I1,I2,I3,I4 -a I5,I6

To relocate service

Command: srvctl relocate service -d db_name -s service_name -i old_inst_name -t target_inst [-f]

To get/set the environment for a service

Command: srvctl getenv service -d db_name -s service_name -t name_list
Command: srvctl setenv service -d db_name [-s service_name] {-t “name=val[,name=val,…]” | -T “name=val”}
Command: srvctl unsetenv service -d db_name -s service_name -t name_list

11gR2

Command: srvctl add service -d db_unique_name -s service_name
-g server_pool [-c {UNIFORM|SINGLETON}] [-k network_number]
[-l [PRIMARY|PHYSICAL_STANDBY|LOGICAL_STANDBY|SNAPSHOT_STANDBY]
[-y {AUTOMATIC|MANUAL}] [-q {TRUE|FALSE}] [-j {SHORT|LONG}]
[-B {NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}]
[-m {NONE|BASIC}] [-P {BASIC|NONE|PRECONNECT}] [-x {TRUE|FALSE}]
[-z failover_retries] [-w failover_delay]

To add service with service goals

Example: srvctl add service -d rac -s rac1 -q TRUE -m BASIC -e SELECT -z 180 -w 5 -j LONG
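To confirm the goal settings took effect, you can check the service configuration (reusing the rac database and rac1 service names from the example above):

srvctl config service -d rac -s rac1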

To add a service with preconnect TAF

Command: srvctl add service -d db_unique_name -s service_name -u {-r preferred_list | -a available_list}
Example: srvctl add serv -d rac -s report -r rac01,rac02 -a rac03 -P PRECONNECT

To start service

Command: srvctl start service -d db_unique_name [-s “service_name_list” [-n node_name | -i instance_name]] [-o start_options]
Example: srvctl start serv -d rac -s rac
Example: srvctl start service -d rac -s rac -i rac2

To stop service

srvctl stop service -d db_unique_name [-s “service_name_list”] [-n node_name | -i instance_name] [-f]
srvctl stop service -d rac -s rac
srvctl stop serv -d rac -s rac -i rac2

To view service

srvctl status service -d db_unique_name [-s “service_name_list”] [-f] [-v]
srvctl status service -d rac -s rac -v

To modify the service

Command: srvctl modify service -d db_unique_name -s service_name
[-c {UNIFORM|SINGLETON}] [-P {BASIC|PRECONNECT|NONE}]
[-l {[PRIMARY]|[PHYSICAL_STANDBY]|[LOGICAL_STANDBY]|[SNAPSHOT_STANDBY]} [-q {TRUE|FALSE}] [-x {TRUE|FALSE}] [-j {SHORT|LONG}] [-B {NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}] [-m {NONE|BASIC}] [-z failover_retries] [-w failover_delay] [-y {AUTOMATIC|MANUAL}]

Command: srvctl modify service -d db_unique_name -s service_name -i old_instance_name -t new_instance_name [-f]
Command: srvctl modify service -d db_unique_name -s service_name -i avail_inst_name -r [-f]
Command: srvctl modify service -d db_unique_name -s service_name -n -i preferred_list [-a available_list] [-f]

Example: srvctl modify service -d rac -s rac -i rac1 -t rac2
Example: srvctl modify serv -d rac -s rac -i rac1 -r
Example: srvctl modify service -d rac -s rac -n -i rac1 -a rac2

To relocate the service

Command: srvctl relocate service -d db_unique_name -s service_name {-c source_node -n target_node|-i old_instance_name -t new_instance_name} [-f]
Example: srvctl relocate service -d rac -s rac -i rac1 -t rac3

How Oracle Clusterware Starts


Oracle Clusterware stores its vital files (the OCR and voting files) in an ASM diskgroup, yet ASM itself depends on Oracle Clusterware, so at first sight there appears to be a chicken-and-egg dilemma. The following is a brief overview of the logic used to start Oracle Clusterware when the clusterware files (OCR and voting files) are stored in an ASM diskgroup. Note that there are configuration-dependent variations on this logic, but what is described is the most common case.

1. When a node of an Oracle Clusterware cluster restarts, OHASD is started by platform-specific means. OHASD is the root for bringing up Oracle Clusterware. OHASD has access to the OLR (Oracle Local Registry) stored on the local file system. The OLR provides the data needed to complete OHASD initialization.

2. OHASD brings up GPNPD and CSSD. CSSD has access to the GPnP profile stored on the local file system. This profile contains the following vital bootstrap data:
a. ASM Diskgroup Discovery String 
b. ASM SPFILE location (Diskgroup name) 
c. Name of the ASM Diskgroup containing the Voting Files 

3. The voting files' locations on the ASM disks are read by CSSD via well-known pointers in the ASM disk headers, so CSSD is able to complete initialization and start or join an existing cluster.

4. OHASD starts an ASM instance, and ASM can now operate because CSSD is initialized and running. The ASM instance uses special code to locate the contents of the ASM SPFILE, assuming it is stored in a diskgroup.

5. With an ASM instance operating and its Diskgroups mounted, access to Clusterware’s OCR is available to CRSD. 

6. OHASD starts CRSD with access to the OCR in an ASM Diskgroup. 

7. Clusterware completes initialization and brings up other services under its control. 
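You can watch this bootstrap sequence on a running node by listing the lower-stack resources managed by OHASD; the output includes resources such as ora.cssd, ora.asm and ora.crsd, matching the steps above:

$ crsctl stat res -init -t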


The OLR holds metadata for the local node; together with the GPnP profile, it gives OHASD all the information initially required to join the node to the cluster. OHASD manages the OLR, so both the OLR and the GPnP profile are essential and required.

In RAC, it is from the GPnP profile that CSSD gets the information required to get hold of a majority of the voting disks and start.

Scanning the disks that match the asm_diskstring parameter, it searches every disk for the voting file signature. When the search completes, it records how many voting files it found and starts CSSD.

That information can be seen on any RAC node using the command below, run as the OS user that owns the GI installation.

$ gpnptool get 

You can see the asm_diskstring value in the above output.
If your system uses the default path for the OS, the value will be blank.
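For example, assuming GNU grep with the -o option, you can pull just the discovery string out of the profile XML:

$ gpnptool get 2>/dev/null | grep -o 'DiscoveryString="[^"]*"'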

In the non-RAC case, GPnP is not used in this way.
Only the OLR is used to initialize OHASD.
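The OLR itself can be inspected the same way as the OCR by passing the -local flag to ocrcheck (run as root, or add -config for a non-privileged check of the location):

# ocrcheck -local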

-------------------------------------

 ASMCMD> dsget

This fetches the ASM diskstring. The asmcmd command works in both RAC and non-RAC environments.
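A sample session (the diskstring values shown are illustrative):

ASMCMD> dsget
parameter:/dev/sd*
profile:/dev/sd*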

Relocating the voting disk and OCR from one diskgroup to another

C:\Users\farhat>crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   03c219af7dbe4f3dbf7ad4949cccbd42 (\\.\ORCLDISKOCRVOTE0) [OCRVOTE]
Located 1 voting disk(s).

C:\Users\farhat>crsctl replace votedisk +OCRVOTENEW
Successful addition of voting disk 7e56811064114ff9bfbee059c4ebacc6.
Successful deletion of voting disk 03c219af7dbe4f3dbf7ad4949cccbd42.
Successfully replaced voting disk group with +OCRVOTENEW.
CRS-4266: Voting file(s) successfully replaced

C:\Users\farhat>crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   7e56811064114ff9bfbee059c4ebacc6 (\\.\ORCLDISKOCRVOTENEW0) [OCRVOTENEW]
Located 1 voting disk(s).

Move OCR from diskgroup +OCRVOTE to +OCRVOTENEW

C:\Users\farhat>ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3316
Available space (kbytes) :     258804
ID                       : 1825388376
         Device/File Name         :   +OCRVOTE
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

----------------------------------------------------------------------
C:\Users\farhat>ocrconfig -add +OCRVOTENEW

C:\Users\farhat>ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3316
Available space (kbytes) :     258804
ID                       : 1825388376
Device/File Name         :   +OCRVOTE
Device/File integrity check succeeded
Device/File Name         : +OCRVOTENEW
Device/File integrity check succeeded

Device/File not configured

--------------------------------------------------------------------------

C:\Users\farhat>ocrconfig -delete +OCRVOTE

---------------------------------------------------------------------------

C:\Users\farhat>ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       3316
Available space (kbytes) :     258804
ID                       : 1825388376
         Device/File Name         : +OCRVOTENEW
Device/File integrity check succeeded

Device/File not configured

Device/File not configured

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded

Logical corruption check succeeded

Device/File not configured

Device/File not configured

Cluster registry integrity check succeeded
---------------------------------------------------------------------------

Changes in 11gR2 regarding the voting disk

There are a few noticeable changes in 11gR2 specific to voting disks.

As we are aware, the voting disk and OCR can now be stored in ASM. But we also know that in previous versions CSS needed the voting disk in order to start.

So how does the grid stack start when the voting disk is in an ASM diskgroup?

I have created an ASM diskgroup DATA, and my voting disk and OCR reside in it.

SQL> select name, type, total_mb, usable_file_mb
from v$asm_diskgroup;
NAME   TYPE     TOTAL_MB   USABLE_FILE_MB
------ ------   ---------- --------------
DATA   EXTERN   3072       2672

SQL> set line 200
SQL> column path format a30
SQL> select name, path, header_status from v$asm_disk;
NAME       PATH    HEADER_STATU
------     ------- ------------
DATA_0002 /dev/sdg   MEMBER
DATA_0000 /dev/sdi   MEMBER
DATA_0001 /dev/sdh   MEMBER

$ crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
 1. ONLINE 4f2e374b254e4f57bf85ac5db31a91fb (/dev/sdi) [DATA]
Located 1 voting disk(s).
A few more things:

1. Voting disks are stored on individual disks, unlike the OCR, which is stored as a file in the ASM diskgroup. So a voting disk will not span multiple disks; it is confined to a single disk.

2. At the start of ohasd, all disks that match asm_diskstring are scanned. ohasd knows the asm_diskstring from the GPnP profile. This functionality allows CSS to start before ASM.

3. How does ohasd know which disk contains a voting disk?

The following is the ASM disk header metadata (dumped using kfed). ohasd uses the vfstart and vfend markers; if the markers between vfstart and vfend are 0, the disk does *NOT* contain a voting disk.

grid@db1:~> kfed read '/dev/sdi' | grep vf
kfdhdb.vfstart: 128 ; 0x0ec: 0x00000080
kfdhdb.vfend: 160 ; 0x0f0: 0x000000a0

grid@db1:~> kfed read '/dev/sdg' | grep vf
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000

grid@db1:~> kfed read '/dev/sdh' | grep vf
kfdhdb.vfstart: 0 ; 0x0ec: 0x00000000
kfdhdb.vfend: 0 ; 0x0f0: 0x00000000

So from the above output, disk '/dev/sdi' contains a voting disk.

4. The number of voting disks is decided based on the diskgroup redundancy.

5. If we want to store a voting disk in a new diskgroup (apart from the one that already holds a voting disk), we need to create a QUORUM failure group; see the example after this list.

From Oracle docs – A quorum failure group is a special type of failure group and disks in these failure groups do not contain user data and are not considered when determining redundancy requirements.

6. A manual backup of the voting disk is not required anymore; the required data is backed up automatically.
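As a sketch of item 5, this is how a normal-redundancy diskgroup with a QUORUM failure group could be created (the diskgroup name and disk paths are hypothetical):

SQL> CREATE DISKGROUP ocrvote NORMAL REDUNDANCY
       FAILGROUP fg1 DISK '/dev/sdj'
       FAILGROUP fg2 DISK '/dev/sdk'
       QUORUM FAILGROUP fg3 DISK '/dev/sdl'
       ATTRIBUTE 'compatible.asm' = '11.2.0.0.0';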

How to read an ASM-stamped disk to find the voting disk without ASM started

Having the voting disk in ASM is an 11gR2 (11.2.0.1) Clusterware/database feature. The first time I heard about it, it got me thinking: the clusterware needs to be up before the ASM instance can start, because ASM relies on the clusterware for detecting machine and cluster problems. But ASM needs to be started to be able to find files inside ASM. A classic chicken-and-egg problem...

So how does the clusterware find the voting disk when it is starting? ASM is not up at that time, so it cannot ask the ASM instance.
What the clusterware does is query the disk headers of the ASM disks. In the (11gR2) ASM disk header, there is room to record whether, and where, a voting disk is present on that disk:
C:\Users\farif>kfed read \\.\ORCLDISKDATANEWSAN0
kfbh.endian:                          1 ; 0x000: 0x01
kfbh.hard:                          130 ; 0x001: 0x82
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt:                          1 ; 0x003: 0x01
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:              2147483648 ; 0x008: disk=0
kfbh.check:                  2894233388 ; 0x00c: 0xac827f2c
kfbh.fcn.base:                   760337 ; 0x010: 0x000b9a11
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr:ORCLDISKDATANEWSAN0 ; 0x000: length=19
kfdhdb.driver.reserved[0]:   1096040772 ; 0x008: 0x41544144
kfdhdb.driver.reserved[1]:   1398228302 ; 0x00c: 0x5357454e
kfdhdb.driver.reserved[2]:      3165761 ; 0x010: 0x00304e41
kfdhdb.driver.reserved[3]:            0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]:            0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]:            0 ; 0x01c: 0x00000000
kfdhdb.compat:                186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum:                        0 ; 0x024: 0x0000
kfdhdb.grptyp:                        1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts:                        3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname:              DGDUP_0000 ; 0x028: length=10
kfdhdb.grpname:                   DGDUP ; 0x048: length=5
kfdhdb.fgname:               DGDUP_0000 ; 0x068: length=10
kfdhdb.capname:                         ; 0x088: length=0
kfdhdb.crestmp.hi:             32969165 ; 0x0a8: HOUR=0xd DAYS=0xe MNTH=0x4 YEAR=0x7dc
kfdhdb.crestmp.lo:           3613329408 ; 0x0ac: USEC=0x0 MSEC=0x3c2 SECS=0x35 MINS=0x35
kfdhdb.mntstmp.hi:             33001418 ; 0x0b0: HOUR=0xa DAYS=0x1e MNTH=0x3 YEAR=0x7de
kfdhdb.mntstmp.lo:           2117023744 ; 0x0b4: USEC=0x0 MSEC=0x3ce SECS=0x22 MINS=0x1f
kfdhdb.secsize:                     512 ; 0x0b8: 0x0200
kfdhdb.blksize:                    4096 ; 0x0ba: 0x1000
kfdhdb.ausize:                  1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact:                    113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize:                  204800 ; 0x0c4: 0x00032000
kfdhdb.pmcnt:                         3 ; 0x0c8: 0x00000003
kfdhdb.fstlocn:                       1 ; 0x0cc: 0x00000001
kfdhdb.altlocn:                       2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn:                      2 ; 0x0d4: 0x00000002
kfdhdb.redomirrors[0]:                0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]:                0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]:                0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]:                0 ; 0x0de: 0x0000
kfdhdb.dbcompat:              168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi:             32969165 ; 0x0e4: HOUR=0xd DAYS=0xe MNTH=0x4 YEAR=0x7dc
kfdhdb.grpstmp.lo:           3612037120 ; 0x0e8: USEC=0x0 MSEC=0x2d4 SECS=0x34 MINS=0x35
kfdhdb.vfstart:                     128 ; 0x0ec: 0x00000080
kfdhdb.vfend:                       160 ; 0x0f0: 0x000000a0
kfdhdb.spfile:                        0 ; 0x0f4: 0x00000000
kfdhdb.spfflg:                        0 ; 0x0f8: 0x00000000
kfdhdb.ub4spare[0]:                   0 ; 0x0fc: 0x00000000
kfdhdb.ub4spare[1]:                   0 ; 0x100: 0x00000000
kfdhdb.ub4spare[2]:                   0 ; 0x104: 0x00000000
kfdhdb.ub4spare[3]:                   0 ; 0x108: 0x00000000
kfdhdb.ub4spare[4]:                   0 ; 0x10c: 0x00000000
kfdhdb.ub4spare[5]:                   0 ; 0x110: 0x00000000
kfdhdb.ub4spare[6]:                   0 ; 0x114: 0x00000000
kfdhdb.ub4spare[7]:                   0 ; 0x118: 0x00000000
kfdhdb.ub4spare[8]:                   0 ; 0x11c: 0x00000000
kfdhdb.ub4spare[9]:                   0 ; 0x120: 0x00000000
kfdhdb.ub4spare[10]:                  0 ; 0x124: 0x00000000
kfdhdb.ub4spare[11]:                  0 ; 0x128: 0x00000000
kfdhdb.ub4spare[12]:                  0 ; 0x12c: 0x00000000
kfdhdb.ub4spare[13]:                  0 ; 0x130: 0x00000000
kfdhdb.ub4spare[14]:                  0 ; 0x134: 0x00000000
kfdhdb.ub4spare[15]:                  0 ; 0x138: 0x00000000
kfdhdb.ub4spare[16]:                  0 ; 0x13c: 0x00000000
kfdhdb.ub4spare[17]:                  0 ; 0x140: 0x00000000
kfdhdb.ub4spare[18]:                  0 ; 0x144: 0x00000000
kfdhdb.ub4spare[19]:                  0 ; 0x148: 0x00000000
kfdhdb.ub4spare[20]:                  0 ; 0x14c: 0x00000000
kfdhdb.ub4spare[21]:                  0 ; 0x150: 0x00000000
kfdhdb.ub4spare[22]:                  0 ; 0x154: 0x00000000
kfdhdb.ub4spare[23]:                  0 ; 0x158: 0x00000000
kfdhdb.ub4spare[24]:                  0 ; 0x15c: 0x00000000
kfdhdb.ub4spare[25]:                  0 ; 0x160: 0x00000000
kfdhdb.ub4spare[26]:                  0 ; 0x164: 0x00000000
kfdhdb.ub4spare[27]:                  0 ; 0x168: 0x00000000
kfdhdb.ub4spare[28]:                  0 ; 0x16c: 0x00000000
kfdhdb.ub4spare[29]:                  0 ; 0x170: 0x00000000
kfdhdb.ub4spare[30]:                  0 ; 0x174: 0x00000000
kfdhdb.ub4spare[31]:                  0 ; 0x178: 0x00000000
kfdhdb.ub4spare[32]:                  0 ; 0x17c: 0x00000000
kfdhdb.ub4spare[33]:                  0 ; 0x180: 0x00000000
kfdhdb.ub4spare[34]:                  0 ; 0x184: 0x00000000
kfdhdb.ub4spare[35]:                  0 ; 0x188: 0x00000000
kfdhdb.ub4spare[36]:                  0 ; 0x18c: 0x00000000
kfdhdb.ub4spare[37]:                  0 ; 0x190: 0x00000000
kfdhdb.ub4spare[38]:                  0 ; 0x194: 0x00000000
kfdhdb.ub4spare[39]:                  0 ; 0x198: 0x00000000
kfdhdb.ub4spare[40]:                  0 ; 0x19c: 0x00000000
kfdhdb.ub4spare[41]:                  0 ; 0x1a0: 0x00000000
kfdhdb.ub4spare[42]:                  0 ; 0x1a4: 0x00000000
kfdhdb.ub4spare[43]:                  0 ; 0x1a8: 0x00000000
kfdhdb.ub4spare[44]:                  0 ; 0x1ac: 0x00000000
kfdhdb.ub4spare[45]:                  0 ; 0x1b0: 0x00000000
kfdhdb.ub4spare[46]:                  0 ; 0x1b4: 0x00000000
kfdhdb.ub4spare[47]:                  0 ; 0x1b8: 0x00000000
kfdhdb.ub4spare[48]:                  0 ; 0x1bc: 0x00000000
kfdhdb.ub4spare[49]:                  0 ; 0x1c0: 0x00000000
kfdhdb.ub4spare[50]:                  0 ; 0x1c4: 0x00000000
kfdhdb.ub4spare[51]:                  0 ; 0x1c8: 0x00000000
kfdhdb.ub4spare[52]:                  0 ; 0x1cc: 0x00000000
kfdhdb.ub4spare[53]:                  0 ; 0x1d0: 0x00000000
kfdhdb.acdb.aba.seq:                  0 ; 0x1d4: 0x00000000
kfdhdb.acdb.aba.blk:                  0 ; 0x1d8: 0x00000000
kfdhdb.acdb.ents:                     0 ; 0x1dc: 0x0000
kfdhdb.acdb.ub2spare:                 0 ; 0x1de: 0x0000

kfdhdb.vfstart gives the starting allocation unit offset of the voting disk, and kfdhdb.vfend gives the end offset.
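Combined with kfdhdb.ausize from the same dump, this pins down the voting file physically: here ausize is 1048576 bytes (1 MB), vfstart is 128 and vfend is 160, so the voting file occupies allocation units 128 through 159, i.e. it starts 128 MB into the disk and is (160 - 128) * 1 MB = 32 MB long.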

Checking properties of ASM disks

SET LINESIZE 150
COL PATH FOR A30
COL NAME FOR A20
COL HEADER_STATUS FOR A20
COL FAILGROUP FOR A20
COL FAILGROUP_TYPE FOR A20
COL VOTING_FILE FOR A20
SELECT NAME,PATH,HEADER_STATUS,FAILGROUP, FAILGROUP_TYPE, VOTING_FILE
FROM V$ASM_DISK;

NAME                 PATH                           HEADER_STATUS        FAILGROUP            FAILGROUP_TYPE       VOTING_FILE
-------------------- ------------------------------ -------------------- -------------------- -------------------- --------------------
                     \\.\ORCLDISKDATANEWSAN1        PROVISIONED                               REGULAR          N
DBDATA_0000          \\.\ORCLDISKDBDATA0            MEMBER               DBDATA_0000          REGULAR              N
DGDUP_0000           \\.\ORCLDISKDATANEWSAN0        MEMBER               DGDUP_0000           REGULAR              Y
DBFLASH_0000         \\.\ORCLDISKDBFLASH0           MEMBER               DBFLASH_0000         REGULAR              N

OCR and Voting Disk Replacement in 11gR2

OCR:
The OCR can be created at the time of the Grid installation. It stores the information used to manage Oracle Clusterware and its components, such as the RAC database, listener, VIP, SCAN IP and services.
A minimum of 1 and a maximum of 5 copies of the OCR are possible.

Some key points about OCR handling in 11gR2 ASM:

  • We can store the OCR and voting disk on ASM or a certified cluster file system.
  • We can dynamically add or replace the OCR.
  • The OCR can be kept in the same diskgroup or a different diskgroup.
  • The OCR automatic backups are kept together in a single file.
  • Automatic backup of the OCR happens every four hours, at the end of the day and at the end of the week; the backups can be listed as shown below.
  • You must have a root or sudo privileged account to manage it.
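The automatic backups mentioned above can be listed with ocrconfig as root; a restore would use ocrconfig -restore with one of the listed backup files:

# ocrconfig -showbackup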

To find the location of the current OCR:
[oracle@rac1]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       2964
Available space (kbytes) :     259156
ID                       : 1390115973
Device/File Name         :      +DATA
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check bypassed due to non-privileged user

Create a mirror copy of the OCR online:
[oracle@rac1]$ sudo ocrconfig -add +CRS
Password:

Check the location of the OCR after creating the mirror copy:
[root@rac1]# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       2964
Available space (kbytes) :     259156
ID                       : 1390115973
Device/File Name         :      +DATA
Device/File integrity check succeeded
Device/File Name         :       +CRS
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded

Another place to find the location of the OCR:
[root@rac1]# cat /etc/oracle/ocr.loc
#Device/file  getting replaced by device +CRS
ocrconfig_loc=+DATA
ocrmirrorconfig_loc=+CRS

Voting Disk:
The voting disk manages information about node membership. Each voting disk must be accessible by all nodes in the cluster. If any node is not passing a heartbeat to the other nodes or the voting disk, that node will be evicted via the voting disk mechanism.
A minimum of 1 and a maximum of 15 copies of the voting disk are possible.

Some key points about voting disk handling in 11gR2:
  • Voting disks can be kept in an ASM diskgroup.
  • We can dynamically add or replace voting disks.
  • The redundancy of the voting disks depends on the redundancy of the diskgroup.
  • The voting disk can be kept in the same diskgroup as the OCR or in a different diskgroup.
  • The voting disk automatic backup is kept together in a single file.
  • Backing up the voting disk using the "dd" command is not supported.
  • Voting disks are mirrored across failure groups.
  • Automatic backup of the voting disk and OCR happens every four hours, at the end of the day and at the end of the week.
  • You must have a root or sudo privileged account to manage them.
  • A vote file is not mirrored and striped like other files (data, redo, control, OCR, etc.); it has a fixed assignment to a failure group on one disk.
  • The vote disk can be accessed even if ASM is not up, whereas other files cannot be accessed if ASM is not up and running.
  • We cannot mix vote files on ASM with vote files on non-ASM storage. Either dedicate one ASM diskgroup to all the cluster vote files, or use non-ASM storage, but never both, except during the initial migration to 11.2 when previous versions were on non-ASM storage.
  • We cannot use 'crsctl add css votedisk' to add a vote disk on an ASM diskgroup or an ACFS file system.
  • A common example would be a diskgroup DATA with normal redundancy that has at least 3 failure groups and also contains the database area of a RAC database. In this case the 3 failure groups would each store one voting disk, unstriped, on one disk.

To find the current location of the voting disk:

SQL> set line 200
SQL> column path format a30
SQL> select name, path, header_status from v$asm_disk;

[oracle@rac1]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   6a60a2c3510c4fbfbff62dcdc279b247 (ORCL:DATA1) [DATA]

Relocate or multiplex the voting disk to another diskgroup (with normal redundancy):
[root@rac1]# /u01/app/11.2.0/grid/bin/crsctl replace votedisk +CRS
Successful addition of voting disk afb77b2693a24f1ebfe876784103e82a.
Successful addition of voting disk 3e2542c5b1154ffdbfc8b6dea7dce390.
Successful addition of voting disk 8e0f3c5921cc4f93bf223de1465d83cc.
Successful deletion of voting disk 6a60a2c3510c4fbfbff62dcdc279b247.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced

New location of the voting disk:
[root@rac1]# /u01/app/11.2.0/grid/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   afb77b2693a24f1ebfe876784103e82a (ORCL:DATA2) [CRS]
2. ONLINE   3e2542c5b1154ffdbfc8b6dea7dce390 (ORCL:DATA3) [CRS]
3. ONLINE   8e0f3c5921cc4f93bf223de1465d83cc (ORCL:DATA4) [CRS]
Located 3 voting disk(s).

[root@rac1]#
[root@rac1]# /u01/app/11.2.0/grid/bin/crsctl add css votedisk +data
CRS-4671: This command is not supported for ASM diskgroups.
CRS-4000: Command Add failed, or completed with errors.

[root@rac1]# /u01/app/11.2.0/grid/bin/crsctl add css votedisk /u01/vote_acfs.dsk
CRS-4258: Addition and deletion of voting files are not allowed because the voting files are on ASM
[root@raclinux2 bin]#
