Note: All the posts are based on practical approach avoiding lengthy theory. All have been tested on some development servers. Please don’t test any post on production servers until you are sure.

Monday, November 24, 2014

Exadata: Physical Disks, LUNs, and Cell Disks Mapping

As an Exadata DMA, you should know how to map the physical disks in an Exadata Storage Server to Logical Units (LUNs), and how to map LUNs to Exadata cell disks, in order to understand how Exadata’s disks are presented as usable storage entities.

LUN (Logical Unit Number)
Each Exadata Storage Server is presented with 12 physical disks. A LUN is automatically created on each physical disk and is mapped to the usable extents on the disk. An administrator cannot alter, create, or drop LUNs; they are wholly managed by the Exadata Storage Server software. The purpose of the LUN is to present the capacity available to cell disks: on the first two disks, the LUN maps to the extents not used for the System Area, and on the remaining ten disks, the LUN represents the entire physical disk.
Cell Disk
An Exadata cell disk is created on top of a LUN. The cell disk is another layer of logical abstraction from which grid disks can be created. An Exadata administrator can alter, create, or drop cell disks. A cell disk does not necessarily need to use all of the available storage that a LUN presents, but it most commonly does. A somewhat common scenario is for an administrator to define an interleaving attribute on the cell disk. When configured with interleaving, grid disks built on the cell disk have their extents “interleaved” across fixed extent boundaries on the physical disk, which balances the extents for the grid disks evenly, starting with the outermost disk tracks.
Grid Disk
The grid disk is the next logical storage entity inside Exadata storage cells. Grid disks are built on cell disks and are the storage entities on which Oracle ASM disk groups are built. Typically, grid disk planning and administration are done in tandem with ASM disk group planning.
ASM Disk Group
Oracle ASM disk groups are built on collections of Exadata grid disks. ASM disk groups are used to store database files.

You can map the physical disks, LUNs, and cell disks as follows.

1- Identify how the LUNs are used to create cell disks. Start by running the following lsscsi command:

[root@pk3-iub-cel-es01 ~]# lsscsi
[0:0:20:0]   enclosu ORACLE   CONCORD14        0d00  -
[0:2:0:0]    disk    LSI      MR9261-8i        2.12  /dev/sda
[0:2:1:0]    disk    LSI      MR9261-8i        2.12  /dev/sdb
[0:2:2:0]    disk    LSI      MR9261-8i        2.12  /dev/sdc
[0:2:3:0]    disk    LSI      MR9261-8i        2.12  /dev/sdd
[0:2:4:0]    disk    LSI      MR9261-8i        2.12  /dev/sde
[0:2:5:0]    disk    LSI      MR9261-8i        2.12  /dev/sdf
[0:2:6:0]    disk    LSI      MR9261-8i        2.12  /dev/sdg
[0:2:7:0]    disk    LSI      MR9261-8i        2.12  /dev/sdh
[0:2:8:0]    disk    LSI      MR9261-8i        2.12  /dev/sdi
[0:2:9:0]    disk    LSI      MR9261-8i        2.12  /dev/sdj
[0:2:10:0]   disk    LSI      MR9261-8i        2.12  /dev/sdk
[0:2:11:0]   disk    LSI      MR9261-8i        2.12  /dev/sdl
[1:0:0:0]    disk    ORACLE   UNIGEN-UFD       PMAP  /dev/sdm
[8:0:0:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdn
[8:0:1:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdo
[8:0:2:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdp
[8:0:3:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdq
[9:0:0:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdr
[9:0:1:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sds
[9:0:2:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdt
[9:0:3:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdu
[10:0:0:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdv
[10:0:1:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdw
[10:0:2:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdx
[10:0:3:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdy
[11:0:0:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdz
[11:0:1:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdaa
[11:0:2:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdab
[11:0:3:0]   disk    ATA      3E128-TS2-550B01 UI39  /dev/sdac
[root@pk3-iub-cel-es01 ~]#

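We can see 12 physical disks, identified by the LSI vendor string in the third column, at SCSI addresses 0:2:[0-11]:0. The sixth column displays the device node for each one, /dev/sda through /dev/sdl. The 16 ATA devices, /dev/sdn through /dev/sdac, are the flash modules that appear as flash disks in the CellCLI listings later in this post.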
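
If you want to pull the hard disk devices out of the lsscsi listing with a script rather than by eye, a minimal Python sketch could look like the one below. The function name devices_by_vendor and the shortened sample text are my own for illustration, and the parser assumes six whitespace-separated columns per line, as in the listing above (a model name containing spaces would break that assumption).

```python
# A few lines copied from the lsscsi listing above (shortened for the example).
LSSCSI_SAMPLE = """\
[0:0:20:0]   enclosu ORACLE   CONCORD14        0d00  -
[0:2:0:0]    disk    LSI      MR9261-8i        2.12  /dev/sda
[0:2:11:0]   disk    LSI      MR9261-8i        2.12  /dev/sdl
[1:0:0:0]    disk    ORACLE   UNIGEN-UFD       PMAP  /dev/sdm
[8:0:0:0]    disk    ATA      3E128-TS2-550B01 UI39  /dev/sdn
"""


def devices_by_vendor(lsscsi_text):
    """Group device nodes by the vendor column (hypothetical helper)."""
    groups = {}
    for line in lsscsi_text.splitlines():
        fields = line.split()
        # Expected columns: [H:C:T:L] type vendor model revision device
        if len(fields) != 6 or fields[1] != "disk":
            continue  # skips the enclosure line and anything unexpected
        vendor, device = fields[2], fields[5]
        groups.setdefault(vendor, []).append(device)
    return groups


groups = devices_by_vendor(LSSCSI_SAMPLE)
print("LSI (hard disks):", groups["LSI"])
print("ATA (flash modules):", groups["ATA"])
```

On a real cell you would feed the full lsscsi output into the function instead of the sample string.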

2-  We know from Oracle documentation that the first two disks contain the System Area, so if you do an fdisk on one of these, you’ll see which sections of this device are used for the System Area and which partitions are used for Exadata database storage:

[root@pk3-iub-cel-es01 ~]# fdisk -l /dev/sda

Disk /dev/sda: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          15      120456   fd  Linux raid autodetect
/dev/sda2              16          16        8032+  83  Linux
/dev/sda3              17       69039   554427247+  83  Linux
/dev/sda4           69040       72824    30403012+   f  W95 Ext'd (LBA)
/dev/sda5           69040       70344    10482381   fd  Linux raid autodetect
/dev/sda6           70345       71649    10482381   fd  Linux raid autodetect
/dev/sda7           71650       71910     2096451   fd  Linux raid autodetect
/dev/sda8           71911       72171     2096451   fd  Linux raid autodetect
/dev/sda9           72172       72432     2096451   fd  Linux raid autodetect
/dev/sda10          72433       72521      714861   fd  Linux raid autodetect
/dev/sda11          72522       72824     2433816   fd  Linux raid autodetect
[root@pk3-iub-cel-es01 ~]#

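This fdisk listing shows that /dev/sda3, a large partition starting at cylinder 17 and ending at cylinder 69039, is likely our non-System Area partition used for database storage. The remaining, much smaller partitions (mostly of type Linux raid autodetect) make up the System Area.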
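
As a quick sanity check, the size of /dev/sda3 can be worked out from the cylinder range and the Units line of the fdisk output. The sketch below is only arithmetic on the numbers shown above; the function name partition_bytes is my own, and the result is approximate because fdisk rounds partition boundaries to cylinders.

```python
# From the fdisk "Units" line: cylinders of 16065 * 512 = 8225280 bytes.
BYTES_PER_CYLINDER = 16065 * 512


def partition_bytes(start_cyl, end_cyl):
    """Approximate size of a partition spanning start_cyl..end_cyl inclusive."""
    return (end_cyl - start_cyl + 1) * BYTES_PER_CYLINDER


sda3_bytes = partition_bytes(17, 69039)   # Start/End columns for /dev/sda3
sda3_gib = sda3_bytes / 2**30             # convert bytes to GiB
print(f"/dev/sda3 is roughly {sda3_gib:.2f} GiB")
```

This works out to roughly 528.74 GiB, which is very close to the 528.734375G that CellCLI reports for the cell disk on /dev/sda3 in step 6.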

3- If you perform a similar fdisk -l on the third disk, /dev/sdc, you can see that there are no host-usable partitions:

[root@pk3-iub-cel-es01 ~]# fdisk -l /dev/sdc

Disk /dev/sdc: 598.9 GB, 598999040000 bytes
255 heads, 63 sectors/track, 72824 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdc doesn't contain a valid partition table

4- The lowest level disk entity for Exadata Storage Server disks is the physical disk. Using CellCLI, you can query your physical disks and their attributes:
CellCLI> list physicaldisk attributes name,diskType,luns,physicalsize,slotNumber
         20:0            HardDisk        0_0     558.9109999993816G      0
         20:1            HardDisk        0_1     558.9109999993816G      1
         20:2            HardDisk        0_2     558.9109999993816G      2
         20:3            HardDisk        0_3     558.9109999993816G      3
         20:4            HardDisk        0_4     558.9109999993816G      4
         20:5            HardDisk        0_5     558.9109999993816G      5
         20:6            HardDisk        0_6     558.9109999993816G      6
         20:7            HardDisk        0_7     558.9109999993816G      7
         20:8            HardDisk        0_8     558.9109999993816G      8
         20:9            HardDisk        0_9     558.9109999993816G      9
         20:10           HardDisk        0_10    558.9109999993816G      10
         20:11           HardDisk        0_11    558.9109999993816G      11
         FLASH_1_0       FlashDisk       1_0     93.13225793838501G      "PCI Slot: 1; FDOM: 0"
         FLASH_1_1       FlashDisk       1_1     93.13225793838501G      "PCI Slot: 1; FDOM: 1"
         FLASH_1_2       FlashDisk       1_2     93.13225793838501G      "PCI Slot: 1; FDOM: 2"
         FLASH_1_3       FlashDisk       1_3     93.13225793838501G      "PCI Slot: 1; FDOM: 3"
         FLASH_2_0       FlashDisk       2_0     93.13225793838501G      "PCI Slot: 2; FDOM: 0"
         FLASH_2_1       FlashDisk       2_1     93.13225793838501G      "PCI Slot: 2; FDOM: 1"
         FLASH_2_2       FlashDisk       2_2     93.13225793838501G      "PCI Slot: 2; FDOM: 2"
         FLASH_2_3       FlashDisk       2_3     93.13225793838501G      "PCI Slot: 2; FDOM: 3"
         FLASH_4_0       FlashDisk       4_0     93.13225793838501G      "PCI Slot: 4; FDOM: 0"
         FLASH_4_1       FlashDisk       4_1     93.13225793838501G      "PCI Slot: 4; FDOM: 1"
         FLASH_4_2       FlashDisk       4_2     93.13225793838501G      "PCI Slot: 4; FDOM: 2"
         FLASH_4_3       FlashDisk       4_3     93.13225793838501G      "PCI Slot: 4; FDOM: 3"
         FLASH_5_0       FlashDisk       5_0     93.13225793838501G      "PCI Slot: 5; FDOM: 0"
         FLASH_5_1       FlashDisk       5_1     93.13225793838501G      "PCI Slot: 5; FDOM: 1"
         FLASH_5_2       FlashDisk       5_2     93.13225793838501G      "PCI Slot: 5; FDOM: 2"
         FLASH_5_3       FlashDisk       5_3     93.13225793838501G      "PCI Slot: 5; FDOM: 3"

We can see physical disks of type HardDisk with names 20:0 through 20:11 and disks of type FlashDisk grouped in sets of four per PCI flash card. We are also displaying the luns attribute, which shows the physical disk to LUN mapping. If you would like to see all details for a physical disk, you can issue a list physicaldisk ... detail command in CellCLI:

CellCLI> list physicaldisk where name=20:2 detail
         name:                   20:2
         deviceId:               10
         diskType:               HardDisk
         enclosureDeviceId:      20
         errMediaCount:          0
         errOtherCount:          0
         foreignState:           false
         luns:                   0_2
         makeModel:              "HITACHI HUS1560SCSUN600G"
         physicalFirmware:       A700
         physicalInsertTime:     2013-05-13T11:14:58+03:00
         physicalInterface:      sas
         physicalSerial:         KT6G1N
         physicalSize:           558.9109999993816G
         slotNumber:             2
         status:                 normal
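
If you capture this detail output to a file or a variable, it is easy to turn it into a dictionary for scripting. The following Python sketch is only an illustration: parse_detail is a name I made up, and it assumes every line has the "attribute: value" shape shown above. It splits on the first colon only, so values such as 20:2 or the timestamp stay intact.

```python
# Part of the "list physicaldisk where name=20:2 detail" output shown above.
DETAIL_SAMPLE = """\
         name:                   20:2
         diskType:               HardDisk
         luns:                   0_2
         makeModel:              "HITACHI HUS1560SCSUN600G"
         physicalInsertTime:     2013-05-13T11:14:58+03:00
         physicalSize:           558.9109999993816G
         slotNumber:             2
         status:                 normal
"""


def parse_detail(text):
    """Turn 'attribute: value' lines into a dict (hypothetical helper)."""
    attrs = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only; values may contain colons themselves.
        key, _, value = line.partition(":")
        attrs[key.strip()] = value.strip().strip('"')
    return attrs


disk = parse_detail(DETAIL_SAMPLE)
print(disk["name"], "->", disk["luns"], "status:", disk["status"])
```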

5- Oracle builds Exadata LUNs on the usable portions of each physical disk. Using CellCLI, query the LUN details:  
CellCLI> list lun attributes name, deviceName, isSystemLun, physicalDrives, lunSize
         0_0     /dev/sda        TRUE    20:0            557.861328125G
         0_1     /dev/sdb        TRUE    20:1            557.861328125G
         0_2     /dev/sdc        FALSE   20:2            557.861328125G
         0_3     /dev/sdd        FALSE   20:3            557.861328125G
         0_4     /dev/sde        FALSE   20:4            557.861328125G
         0_5     /dev/sdf        FALSE   20:5            557.861328125G
         0_6     /dev/sdg        FALSE   20:6            557.861328125G
         0_7     /dev/sdh        FALSE   20:7            557.861328125G
         0_8     /dev/sdi        FALSE   20:8            557.861328125G
         0_9     /dev/sdj        FALSE   20:9            557.861328125G
         0_10    /dev/sdk        FALSE   20:10           557.861328125G
         0_11    /dev/sdl        FALSE   20:11           557.861328125G
         1_0     /dev/sdv        FALSE   FLASH_1_0       93.13225793838501G
         1_1     /dev/sdw        FALSE   FLASH_1_1       93.13225793838501G
         1_2     /dev/sdx        FALSE   FLASH_1_2       93.13225793838501G
         1_3     /dev/sdy        FALSE   FLASH_1_3       93.13225793838501G
         2_0     /dev/sdz        FALSE   FLASH_2_0       93.13225793838501G
         2_1     /dev/sdaa       FALSE   FLASH_2_1       93.13225793838501G
         2_2     /dev/sdab       FALSE   FLASH_2_2       93.13225793838501G
         2_3     /dev/sdac       FALSE   FLASH_2_3       93.13225793838501G
         4_0     /dev/sdr        FALSE   FLASH_4_0       93.13225793838501G
         4_1     /dev/sds        FALSE   FLASH_4_1       93.13225793838501G
         4_2     /dev/sdt        FALSE   FLASH_4_2       93.13225793838501G
         4_3     /dev/sdu        FALSE   FLASH_4_3       93.13225793838501G
         5_0     /dev/sdn        FALSE   FLASH_5_0       93.13225793838501G
         5_1     /dev/sdo        FALSE   FLASH_5_1       93.13225793838501G
         5_2     /dev/sdp        FALSE   FLASH_5_2       93.13225793838501G
         5_3     /dev/sdq        FALSE   FLASH_5_3       93.13225793838501G

The output displays LUNs for both hard disks and flash disks, along with the device name, TRUE or FALSE for the isSystemLun attribute, the physical disk each LUN is built on, and the LUN size. Only the LUNs on the first two hard disks (0_0 and 0_1) are system LUNs. At the LUN level of the storage hierarchy, the device mapping simply shows which physical device a LUN is built on; the actual physical partition used on the disks that hold the System Area is displayed at the cell disk level.

6- Exadata cell disks are built on LUNs. A CellCLI listing of cell disks is provided:
CellCLI> list celldisk attributes name,deviceName,devicePartition,interleaving,lun,size
         CD_00_pk3_iub_cel_es01  /dev/sda        /dev/sda3       none    0_0     528.734375G
         CD_01_pk3_iub_cel_es01  /dev/sdb        /dev/sdb3       none    0_1     528.734375G
         CD_02_pk3_iub_cel_es01  /dev/sdc        /dev/sdc        none    0_2     557.859375G
         CD_03_pk3_iub_cel_es01  /dev/sdd        /dev/sdd        none    0_3     557.859375G
         CD_04_pk3_iub_cel_es01  /dev/sde        /dev/sde        none    0_4     557.859375G
         CD_05_pk3_iub_cel_es01  /dev/sdf        /dev/sdf        none    0_5     557.859375G
         CD_06_pk3_iub_cel_es01  /dev/sdg        /dev/sdg        none    0_6     557.859375G
         CD_07_pk3_iub_cel_es01  /dev/sdh        /dev/sdh        none    0_7     557.859375G
         CD_08_pk3_iub_cel_es01  /dev/sdi        /dev/sdi        none    0_8     557.859375G
         CD_09_pk3_iub_cel_es01  /dev/sdj        /dev/sdj        none    0_9     557.859375G
         CD_10_pk3_iub_cel_es01  /dev/sdk        /dev/sdk        none    0_10    557.859375G
         CD_11_pk3_iub_cel_es01  /dev/sdl        /dev/sdl        none    0_11    557.859375G
         FD_00_pk3_iub_cel_es01  /dev/sdv        /dev/sdv        none    1_0     93.125G
         FD_01_pk3_iub_cel_es01  /dev/sdw        /dev/sdw        none    1_1     93.125G
         FD_02_pk3_iub_cel_es01  /dev/sdx        /dev/sdx        none    1_2     93.125G
         FD_03_pk3_iub_cel_es01  /dev/sdy        /dev/sdy        none    1_3     93.125G
         FD_04_pk3_iub_cel_es01  /dev/sdz        /dev/sdz        none    2_0     93.125G
         FD_05_pk3_iub_cel_es01  /dev/sdaa       /dev/sdaa       none    2_1     93.125G
         FD_06_pk3_iub_cel_es01  /dev/sdab       /dev/sdab       none    2_2     93.125G
         FD_07_pk3_iub_cel_es01  /dev/sdac       /dev/sdac       none    2_3     93.125G
         FD_08_pk3_iub_cel_es01  /dev/sdr        /dev/sdr        none    4_0     93.125G
         FD_09_pk3_iub_cel_es01  /dev/sds        /dev/sds        none    4_1     93.125G
         FD_10_pk3_iub_cel_es01  /dev/sdt        /dev/sdt        none    4_2     93.125G
         FD_11_pk3_iub_cel_es01  /dev/sdu        /dev/sdu        none    4_3     93.125G
         FD_12_pk3_iub_cel_es01  /dev/sdn        /dev/sdn        none    5_0     93.125G
         FD_13_pk3_iub_cel_es01  /dev/sdo        /dev/sdo        none    5_1     93.125G
         FD_14_pk3_iub_cel_es01  /dev/sdp        /dev/sdp        none    5_2     93.125G
         FD_15_pk3_iub_cel_es01  /dev/sdq        /dev/sdq        none    5_3     93.125G

The CellCLI list celldisk output tells us the following:
  • There is one cell disk per LUN.
  • We’re displaying cell disks for both hard disks and flash disks.
  • The devicePartition for cell disks CD_00_pk3_iub_cel_es01 and CD_01_pk3_iub_cel_es01, which reside on LUNs on the first two physical disks, maps to the /dev/sda3 and /dev/sdb3 partitions. This is consistent with what we expected from the previous fdisk listings.
  • The size of all but the first two hard disk cell disks is essentially the size of the LUN (557.859375G versus 557.861328125G). For the cell disks built on LUNs that contain a System Area, Exadata automatically carves the cell disk boundaries to reside outside the System Area partitions, which is why they are smaller (528.734375G).
  • None of the cell disks is configured with interleaving in this case; the interleaving attribute is none for every cell disk.
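
To tie the three levels together, you can join the list lun and list celldisk outputs on the LUN name. The Python sketch below does this for three rows copied from the listings above. The helper names size_gb and map_storage are hypothetical, and the parser assumes the same attribute order used in the two CellCLI commands in steps 5 and 6.

```python
# Rows copied from "list lun attributes name, deviceName, isSystemLun, physicalDrives, lunSize".
LUN_ROWS = """\
0_0     /dev/sda        TRUE    20:0            557.861328125G
0_2     /dev/sdc        FALSE   20:2            557.861328125G
1_0     /dev/sdv        FALSE   FLASH_1_0       93.13225793838501G
"""

# Rows copied from "list celldisk attributes name,deviceName,devicePartition,interleaving,lun,size".
CELLDISK_ROWS = """\
CD_00_pk3_iub_cel_es01  /dev/sda        /dev/sda3       none    0_0     528.734375G
CD_02_pk3_iub_cel_es01  /dev/sdc        /dev/sdc        none    0_2     557.859375G
FD_00_pk3_iub_cel_es01  /dev/sdv        /dev/sdv        none    1_0     93.125G
"""


def size_gb(text):
    """Convert a CellCLI size such as '93.125G' to a float."""
    return float(text.rstrip("G"))


def map_storage(lun_text, celldisk_text):
    """Join cell disks to their LUN and physical disk (hypothetical helper)."""
    luns = {}
    for line in lun_text.splitlines():
        if not line.strip():
            continue
        name, _device, is_system, physical, size = line.split()
        luns[name] = {"physical": physical,
                      "system": is_system == "TRUE",
                      "size": size_gb(size)}
    mapping = []
    for line in celldisk_text.splitlines():
        if not line.strip():
            continue
        name, _device, partition, _interleaving, lun, size = line.split()
        info = luns[lun]
        mapping.append({"physical": info["physical"],
                        "lun": lun,
                        "celldisk": name,
                        "partition": partition,
                        "system": info["system"],
                        # LUN capacity that the cell disk does not use
                        "unused_gb": info["size"] - size_gb(size)})
    return mapping


mapping = map_storage(LUN_ROWS, CELLDISK_ROWS)
for row in mapping:
    print(f'{row["physical"]:>10} -> {row["lun"]:>4} -> {row["celldisk"]} '
          f'on {row["partition"]} (unused: {row["unused_gb"]:.3f}G)')
```

For the system LUN 0_0 the unused figure comes to about 29G, which corresponds to the System Area, while for a non-system LUN such as 0_2 it is only about 0.002G.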
