Restoring OCR from backup
Testing Scenario:
The OCR and voting disk are located on the +OCR_VOTE diskgroup, which was created with external redundancy. The OCR/voting disk data is corrupted, or the diskgroup that holds it has a problem.
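Before simulating the failure, it is worth confirming that a recent automatic OCR backup exists, since the restore in step 5 depends on it (an extra precaution, not part of the original scenario):
[root@rac1 bin]# ./ocrconfig -showbackup    # run as root from $GRID_HOME/bin; lists the automatic OCR backups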
1- Stop the cluster on both nodes and delete the ASM disk used for the OCR and voting disk
[root@rac1 bin]# ./crsctl stop cluster -all
We will drop ASMDSK5, which is used by +OCR_VOTE, to simulate the failure.
[root@rac1 sbin]# oracleasm deletedisk ASMDSK5
Clearing disk header: done
Dropping disk: done
[root@rac2 sbin]# oracleasm deletedisk ASMDSK5
Clearing disk header: done
Dropping disk: done
We will repartition /dev/sdf, which was used by ASMDSK5 (the +OCR_VOTE diskgroup),
so the data on it, including the OCR and voting disk files, is destroyed.
[root@rac1 sbin]# fdisk /dev/sdf
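The fdisk session is interactive; a typical keystroke sequence is sketched below (assuming the old partition is dropped and re-created as a single primary partition; this is illustrative, not captured output):
d    # delete the existing partition
n    # new partition
p    # primary
1    # partition number 1 (accept the default first and last sectors)
w    # write the partition table and exit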
2- Recreate ASMDSK5 on both nodes
[root@rac1 sbin]# oracleasm createdisk ASMDSK5 /dev/sdf1
[root@rac2 sbin]# oracleasm createdisk ASMDSK5 /dev/sdf1
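Optionally, confirm the re-created disk is visible on both nodes before going further (a quick sanity check, not part of the original steps):
[root@rac1 sbin]# oracleasm listdisks    # ASMDSK5 should appear in the output
[root@rac2 sbin]# oracleasm listdisks    # and on the second node as well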
3- Start cluster in exclusive mode
After a reboot, the cluster processes will not start because Clusterware cannot locate and read the OCR, so to carry out the maintenance we will restart the cluster in exclusive mode. If we try to stop the whole cluster, some services that are already running will not stop cleanly. Because not all processes are stopped, disable Clusterware autostart and reboot the server to clear all pending processes.
[root@rac1 bin]# ./crsctl disable crs
CRS-4621: Oracle High Availability Services autostart is disabled.
[root@rac1 bin]# reboot
After the reboot, the cluster will not start because autostart is disabled. Start it in exclusive mode (as the root user):
[root@rac1 bin]# ./crsctl start crs -excl
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2679: Attempting to clean 'ora.diskmon' on 'rac1'
CRS-2681: Clean of 'ora.diskmon' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.drivers.acfs' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac1'
CRS-2676: Start of 'ora.crsd' on 'rac1' succeeded
NOTE: You could instead stop CRS on each node and start it in exclusive mode without the CRS daemon, using the commands below:
# crsctl stop crs -f
# crsctl start crs -excl -nocrs
The '-nocrs' option, introduced with 11.2.0.2, prevents the start of the ora.crsd resource. It is vital that this option is specified; otherwise the failure to start ora.crsd will tear down ora.cluster_interconnect.haip, which in turn will cause ASM to crash.
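After starting with '-nocrs', you can optionally confirm that the lower stack is up while ora.crsd stays offline (an extra check, not part of the original note):
# crsctl stat res -t -init    # ora.cssd and ora.asm should be ONLINE; ora.crsd should remain OFFLINE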
4- Create a new diskgroup for the OCR and voting disk
As the oracle user, connect to SQL*Plus:
[root@rac1 ~]# su - oracle
[oracle@rac1 ~]$ export ORACLE_SID=+ASM1
[oracle@rac1 ~]$ ./grid_env
[oracle@rac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Thu Mar 27 00:46:59 2014
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> alter system set asm_diskgroups='DATA','FLASH';
System altered.
SQL> create diskgroup OCR_VOTE external redundancy disk '/dev/oracleasm/disks/ASMDSK5' ATTRIBUTE 'compatible.rdbms' = '11.2', 'compatible.asm' = '11.2';
Diskgroup created.
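-- Optional sanity check (not in the original session): confirm ASM picked up the
-- re-created disk; the path below assumes the default oracleasm mount point.
SQL> select path, header_status, name from v$asm_disk where path like '%ASMDSK5%';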
SQL> show parameter asm
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskgroups                       string      DATA, FLASH, OCR_VOTE
asm_diskstring                       string      /dev/oracleasm/disks
asm_power_limit                      integer     1
asm_preferred_read_failure_groups    string
SQL> shutdown immediate;
ASM diskgroups volume disabled
ASM diskgroups dismounted
ASM instance shutdown
SQL> startup;
ASM instance started
Total System Global Area 284565504 bytes
Fixed Size 1336036 bytes
Variable Size 258063644 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
ASM diskgroups volume enabled
SQL> select name,state from v$asm_diskgroup;
NAME                           STATE
------------------------------ -----------
DATA                           MOUNTED
FLASH                          MOUNTED
OCR_VOTE                       MOUNTED
5- Restore OCR
First, check the location of the OCR:
$ cat /etc/oracle/ocr.loc
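The file typically contains entries of the form below (shown as the expected format, not output captured from this system); here ocrconfig_loc should point to +OCR_VOTE:
ocrconfig_loc=+OCR_VOTE
local_only=FALSE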
Locate the latest automatic OCR backup
$GRID_HOME/bin/ocrconfig -showbackup
[root@rac1 bin]# ./ocrconfig -restore /u01/app/11.2.0/grid/cdata/racscan/backup_20140327_002335.ocr
Verify that OCR is restored using ocrcheck
[root@rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 3
Total space (kbytes) : 262120
Used space (kbytes) : 2748
Available space (kbytes) : 259372
ID : 1499687051
Device/File Name : +OCR_VOTE
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
6- Initialize votedisk
Replace the voting disk so it is re-created on the new +OCR_VOTE diskgroup:
[root@rac1 bin]# ./crsctl replace votedisk +OCR_VOTE
Successful addition of voting disk 324b6b7134544f73bfb716c42f0f21c1.
Successful deletion of voting disk 0c1f71f3e5184f79bf79b85c77a79658.
Successfully replaced voting disk group with +OCR_VOTE.
CRS-4266: Voting file(s) successfully replaced
[root@rac1 bin]#
[root@rac1 bin]# ./crsctl query css votedisk
## STATE File Universal Id File Name Disk group
-- ----- ----------------- --------- ---------
1. ONLINE 324b6b7134544f73bfb716c42f0f21c1 (/dev/oracleasm/disks/ASMDSK5) [OCR_VOTE]
Located 1 voting disk(s).
[root@rac1 bin]#
7- Enable and start CRS
[root@rac1 bin]# ./crsctl enable crs
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@rac1 bin]# ./crsctl start crs
Alternatively, reboot both nodes; the cluster should now start.
NOTE: You could also stop CRS on each node and start it again as shown below:
# crsctl stop crs -f
# crsctl start crs
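Once both nodes are back up, a couple of optional final checks (not part of the original post) confirm the stack and the restored OCR are healthy:
# crsctl check cluster -all          # CRS, CSS and EVM should report online on every node
# cluvfy comp ocr -n all -verbose    # cluster verification of OCR integrity across all nodes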