Overview

In this section we will configure and manage the Exadata storage cells. In Exadata the storage cells are separate servers with their own operating system, a mini Oracle database and, of course, storage. They facilitate the smart scan capabilities of Exadata by being able to scan through Oracle blocks and return ONLY the necessary rows to a server process. Since they return only rows, that data cannot “live” in the SGA, but only in the PGA.
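Since smart scan results land in the PGA rather than the buffer cache, offload can also be observed from the database side. As a quick sketch (these are the standard Exadata offload statistics; the values will of course differ per system and workload):

--How much I/O was eligible for offload, and how much came back via smart scan:
SQL> select name, value
  2  from v$sysstat
  3  where name in ('cell physical IO bytes eligible for predicate offload',
  4                 'cell physical IO interconnect bytes returned by smart scan');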

You can see the basic architecture of the physical disks, LUNs, cell disks and grid disks below:

A more detailed view of a single disk is shown below:

Configuration

In this section we will configure and re-configure certain features of the cell storage server.

Enable Mail Notifications

To enable mail notifications from a certain cell, we can use the following commands:

Enable Mail Notification

--List cell Details:
CellCLI> list cell detail
name: qr01celadm01
cellVersion: OSS_12.1.2.1.0_LINUX.X64_141206.1
cpuCount: 2
diagHistoryDays: 7
fanCount: 0/0
fanStatus: normal
********************************************
***NO NOTIFICATION SETTINGS*****************
********************************************
CellCLI>

--Modify The cell:
CellCLI> alter cell smtpServer='my_mail.example.com', -
> smtpFromAddr='[email protected]', -
> smtpFrom='John Doe', -
> smtpToAddr='[email protected]', -
> notificationPolicy='critical,warning,clear', -
> notificationMethod='mail'
Cell qr01celadm01 successfully altered

--List Details again
CellCLI> list cell detail
name: qr01celadm01
cellVersion: OSS_12.1.2.1.0_LINUX.X64_141206.1
cpuCount: 2
diagHistoryDays: 7
fanCount: 0/0
fanStatus: normal
********************************************
notificationMethod: mail
notificationPolicy: critical,warning,clear
********************************************
offloadGroupEvents:
offloadEfficiency:

However, since the mail server doesn't exist, we will receive the following error if we try to validate it:

Validate Mail

CellCLI> alter cell validate mail
CELL-02578: An error was detected in the SMTP configuration:
CELL-05503: An error was detected during notification. The text
of the associated internal error is: Unknown SMTP host:
my_mail.example.com.
The notification recipient is [email protected].
Please verify your SMTP configuration.
CellCLI>

You can also validate the whole configuration using the following command:

Validate Exadata Configuration

CellCLI> alter cell validate configuration
Cell qr01celadm01 successfully altered
CellCLI>
--Note
Note that the ALTER CELL VALIDATE CONFIGURATION command does not perform I/O
tests against the cell’s hard disks and flash modules. You must use the CALIBRATE
command to perform such tests. The CALIBRATE command can only be executed in a
CellCLI session initiated by the root user.
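For reference, a calibrate run looks roughly like this (a hedged sketch: CALIBRATE stresses the disks, so run it on an idle cell; the FORCE option lets it run while CELLSRV is up):

[root@qr01celadm01 ~]# cellcli
CellCLI> calibrate force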

Storage Cell

List Cell Processes

The storage cell server runs a couple of key processes:

RS Server
[celladmin@qr01celadm01 ~]$ ps -ef | grep cellrs
root      1927     1  0 Nov29 ?        00:00:15 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/bin/cellrssrm -ms 1 -cellsrv 1
root      1934  1927  0 Nov29 ?        00:00:04 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/bin/cellrsbmt -rs_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellinit.ora -ms_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsms.state -cellsrv_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsos.state -debug 0
root      1935  1927  0 Nov29 ?        00:00:05 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/bin/cellrsmmt -rs_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellinit.ora -ms_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsms.state -cellsrv_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsos.state -debug 0
root      1936  1927  0 Nov29 ?        00:01:38 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/bin/cellrsomt -rs_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellinit.ora -ms_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsms.state -cellsrv_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsos.state -debug 0
root      1937  1934  0 Nov29 ?        00:00:00 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/bin/cellrsbkm -rs_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellinit.ora -ms_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsms.state -cellsrv_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsos.state -debug 0
root      1944  1937  0 Nov29 ?        00:00:04 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/bin/cellrssmt -rs_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellinit.ora -ms_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsms.state -cellsrv_conf /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/config/cellrsos.state -debug 0
MS Server
[celladmin@qr01celadm01 ~]$ ps -ef | grep msServer
root      2003  1938  0 Nov29 ?        00:04:39 /usr/java/jdk1.7.0_72/bin/java -client -Xms256m -Xmx512m -XX:CompileThreshold=8000 -XX:PermSize=128m -XX:MaxPermSize=256m -Dweblogic.Name=msServer -Djava.security.policy=/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/wls/wlserver_10.3/server/lib/weblogic.policy -XX:-UseLargePages -XX:ParallelGCThreads=8 -Dweblogic.ListenPort=8888 -Djava.security.egd=file:/dev/./urandom -Xverify:none -da -Dplatform.home=/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/wls/wlserver_10.3 -Dwls.home=/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/wls/wlserver_10.3/server -Dweblogic.home=/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/wls/wlserver_10.3/server -Dweblogic.management.discover=true -Dwlw.iterativeDev= -Dwlw.testConsole= -Dwlw.logErrorsToConsole= -Dweblogic.ext.dirs=/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/deploy/wls/patch_wls1036/profiles/default/sysext_manifest_classpath weblogic.Server
1000      7610  7441  0 03:21 pts/0    00:00:00 grep msServer
[celladmin@qr01celadm01 ~]$ 
CellSRV
[celladmin@qr01celadm01 ~]$ ps -ef | grep "/cellsrv "
root      1940  1936 22 Nov29 ?        05:34:49 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/cellsrv/bin/cellsrv 40 3000 9 5042
[celladmin@qr01celadm01 ~]$ 

It is important to note that both MS and CellSRV are children of the RS server (they have RS as their parent).
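A quick way to cross-check all three services without grepping ps output is to ask CellCLI for the corresponding status attributes (the same ones that show up in LIST CELL DETAIL); all three should report running:

[celladmin@qr01celadm01 ~]$ cellcli -e list cell attributes cellsrvStatus, msStatus, rsStatus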

Examine Storage Cell

With CellCLI we can also list the storage cell LUNs. P.S. Since I don't have the money for a real Exadata, there will be some discrepancies between my output and the output you will get on a real Exadata.

Let's start with the examination of the hard disk based storage:

Hard Disk Based

To list the status of a storage cell server, we can use the following command:

[celladmin@qr01celadm01 ~]$ cellcli -e list cell
	 qr01celadm01	 online
[celladmin@qr01celadm01 ~]$

We can obtain more detailed information using the CellCLI interface:

[celladmin@qr01celadm01 ~]$ cellcli
CellCLI: Release 12.1.2.1.0 - Production on Mon Nov 30 03:30:03 UTC 2020

Copyright (c) 2007, 2013, Oracle.  All rights reserved.
Cell Efficiency Ratio: 391

CellCLI> list cell detail
	 name:              	 qr01celadm01
	 cellVersion:       	 OSS_12.1.2.1.0_LINUX.X64_141206.1
	 cpuCount:          	 2
	 diagHistoryDays:   	 7
	 fanCount:          	 0/0
	 fanStatus:         	 normal
	 flashCacheMode:    	 WriteThrough
	 id:                	 ef92136a-837c-4e1d-88d2-e01f5ab89b7b
	 interconnectCount: 	 0
	 interconnect1:     	 ib0
	 interconnect2:     	 ib1
	 iormBoost:         	 0.0
	 ipaddress1:        	 192.168.1.105/24
	 ipaddress2:        	 192.168.1.106/24
	 kernelVersion:     	 2.6.39-400.243.1.el6uek.x86_64
	 makeModel:         	 Fake hardware
	 memoryGB:          	 4
	 metricHistoryDays: 	 7
	 offloadGroupEvents:
	 offloadEfficiency: 	 390.8
	 powerCount:        	 0/0
	 powerStatus:       	 normal
	 releaseVersion:    	 12.1.2.1.0
	 releaseTrackingBug:	 17885582
	 status:            	 online
	 temperatureReading:	 0.0
	 temperatureStatus: 	 normal
	 upTime:            	 1 days, 0:47
	 cellsrvStatus:     	 running
	 msStatus:          	 running
	 rsStatus:          	 running

CellCLI> 
List Lun
CellCLI> list lun
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK00	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK00 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK01 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK01 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK02 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK02 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK03 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK03 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK04 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK04 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK05 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK05 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK06 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK06 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK07 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK07 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK08 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK08 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK10 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK10 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK11 	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK11 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH00	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH00 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH01	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH01 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH02	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH02 normal
/opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH03	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/FLASH03 normal

The result you will get on a real Exadata will be similar to:

CellCLI> list lun
0_0 0_0 normal
0_1 0_1 normal
0_2 0_2 normal
0_3 0_3 normal
0_4 0_4 normal
0_5 0_5 normal
0_6 0_6 normal
0_7 0_7 normal
0_8 0_8 normal
0_9 0_9 normal
0_10 0_10 normal
0_11 0_11 normal
1_1 1_1 normal
2_1 2_1 normal
4_1 4_1 normal
5_1 5_1 normal
CellCLI>

The reason for that is that in my virtualized environment the cell disks are mapped to virtualized disks and virtualized flash devices, whereas on a real Exadata they are mapped to a PCI slot and device number.
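If you only need a subset of the LUN attributes shown below, LIST also accepts an explicit attribute list; for example:

CellCLI> list lun attributes name, diskType, isSystemLun, status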

We can list a specific LUN in more detail, as follows:

List Lun
CellCLI> list lun where name like '.*DISK09' detail
	 name:              	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 cellDisk:          	 CD_09_qr01celadm01
	 deviceName:        	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 diskType:          	 HardDisk
	 id:                	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 isSystemLun:       	 FALSE
	 lunSize:           	 11
	 physicalDrives:    	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 raidLevel:         	 "RAID 0"
	 status:            	 normal

CellCLI> 
List Physical Disk
CellCLI> list physicaldisk where luns like '.*DISK09' detail
	 name:              	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 deviceName:        	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 diskType:          	 HardDisk
	 luns:              	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 physicalInsertTime:	 2015-02-17T03:31:42+00:00
	 physicalSerial:    	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 physicalSize:      	 11
	 status:            	 normal

CellCLI> 
List celldisk
--List all cell disks
CellCLI> list celldisk where disktype=flashdisk
	 FD_00_qr01celadm01	 normal
	 FD_01_qr01celadm01	 normal
	 FD_02_qr01celadm01	 normal
	 FD_03_qr01celadm01	 normal

--List Cell disk details
CellCLI> list celldisk CD_09_qr01celadm01 detail
	 name:              	 CD_09_qr01celadm01
	 comment:           	 
	 creationTime:      	 2015-02-20T00:57:10+00:00
	 deviceName:        	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 devicePartition:   	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 diskType:          	 HardDisk
	 errorCount:        	 0
	 freeSpace:         	 0
	 id:                	 f369c761-a9a1-4d1b-aaa0-51fc32edbc42
	 interleaving:      	 none
	 lun:               	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 physicalDisk:      	 /opt/oracle/cell12.1.2.1.0_LINUX.X64_141206.1/disks/raw/DISK09
	 raidLevel:         	 "RAID 0"
	 size:              	 2G
	 status:            	 normal

CellCLI> 
List grid disk
CellCLI> list griddisk where celldisk=CD_09_qr01celadm01 detail
	 name:              	 DATA_QR01_CD_09_qr01celadm01
	 asmDiskGroupName:  	 DATA_QR01
	 asmDiskName:       	 DATA_QR01_CD_09_QR01CELADM01
	 asmFailGroupName:  	 QR01CELADM01
	 availableTo:       	 
	 cachingPolicy:     	 default
	 cellDisk:          	 CD_09_qr01celadm01
	 comment:           	 
	 creationTime:      	 2015-03-11T03:23:37+00:00
	 diskType:          	 HardDisk
	 errorCount:        	 0
	 id:                	 9a11b79a-1fa7-4527-85bf-28ad5cb98cac
	 offset:            	 400M
	 size:              	 720M
	 status:            	 active

	 name:              	 DBFS_DG_CD_09_qr01celadm01
	 asmDiskGroupName:  	 DBFS_DG
	 asmDiskName:       	 DBFS_DG_CD_09_QR01CELADM01
	 asmFailGroupName:  	 QR01CELADM01
	 availableTo:       	 
	 cachingPolicy:     	 default
	 cellDisk:          	 CD_09_qr01celadm01
	 comment:           	 
	 creationTime:      	 2015-03-11T03:23:35+00:00
	 diskType:          	 HardDisk
	 errorCount:        	 0
	 id:                	 e304a244-75de-4ea6-b92d-b480e2226615
	 offset:            	 48M
	 size:              	 352M
	 status:            	 active

	 name:              	 RECO_QR01_CD_09_qr01celadm01
	 asmDiskGroupName:  	 RECO_QR01
	 asmDiskName:       	 RECO_QR01_CD_09_QR01CELADM01
	 asmFailGroupName:  	 QR01CELADM01
	 availableTo:       	 
	 cachingPolicy:     	 default
	 cellDisk:          	 CD_09_qr01celadm01
	 comment:           	 
	 creationTime:      	 2015-03-11T03:23:40+00:00
	 diskType:          	 HardDisk
	 errorCount:        	 0
	 id:                	 9923a077-dc5e-4f53-adc5-4cc3eaaa9916
	 offset:            	 1.09375G
	 size:              	 928M
	 status:            	 active

CellCLI> 
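Note how the offsets line up: DBFS_DG starts at 48M and is 352M long, ending exactly at 400M, where DATA_QR01 begins; DATA_QR01's 720M then ends at 400M + 720M = 1120M = 1.09375G, which is precisely RECO_QR01's offset. The three grid disks tile the cell disk back to back.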

Stop / Start Cell Services

We can restart all cell services using the CellCLI interface as follows:

[root@qr01celadm01 ~]# cellcli
CellCLI: Release 12.1.2.1.0 - Production on Mon Nov 30 03:36:16 UTC 2020

Copyright (c) 2007, 2013, Oracle.  All rights reserved.
Cell Efficiency Ratio: 392

CellCLI> alter cell restart services all

Stopping the RS, CELLSRV, and MS services...
The SHUTDOWN of services was successful.
Starting the RS, CELLSRV, and MS services...
Getting the state of RS services...  running
Starting CELLSRV services...
The STARTUP of CELLSRV services was successful.
Starting MS services...
The STARTUP of MS services was successful.

CellCLI> 

This action will NOT cause downtime.
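If only one daemon needs a bounce, the same command also accepts an individual service name instead of ALL; for example:

CellCLI> alter cell restart services cellsrv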

Reconfigure GridDisk

Let's set ourselves a challenge: change the size of the ASM disks of one diskgroup, i.e. our grid disks, from 928 MB to 608 MB. That is a hard process if we want to avoid downtime. In a nutshell, what we will do is:

  1. Drop Disks on Cell Server 1
  2. Reconfigure the Disks on Cell Server 1
  3. Add the new disks to the diskgroup, while dropping the ones on the next Cell server

We will repeat steps 1), 2) and 3) until all disks have the same size. P.S. Having disks of different sizes in one diskgroup is very bad practice; it causes constant rebalancing. Before each drop we will also confirm that ASM can tolerate losing that cell's disks, as shown in the check below. Without further ado, let's get started:
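A hedged sketch of that safety check, run from CellCLI on the cell whose disks are about to go (asmDeactivationOutcome should report Yes for every grid disk before you proceed):

CellCLI> list griddisk attributes name, asmModeStatus, asmDeactivationOutcome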

Check Our Diskgroups

SQL> select dg.name, count(*), d.total_mb, d.os_mb,
  2  min(d.free_mb) MIN_FREE_MB, max(d.free_mb) MAX_FREE_MB
  3  from v$asm_disk d, v$asm_diskgroup dg
  4  where dg.group_number=d.group_number and d.mount_status='CACHED'
  5  group by dg.name, d.total_mb, d.os_mb;

NAME		  COUNT(*)   TOTAL_MB	   OS_MB MIN_FREE_MB MAX_FREE_MB
--------------- ---------- ---------- ---------- ----------- -----------
DBFS_DG 		36	  352	     352	   4	      84
DATA_QR01		36	  720	     720	 288	     328
RECO_QR01		36	  928	     928         888	     916        <- This one

Now, we can rebalance and manually resize the disks to 608 MB, but that alone won't help:

Rebalance

SQL> alter diskgroup reco_qr01 resize all size 608m
2 rebalance power 1024;
Diskgroup altered.
SQL>
SQL> select dg.name, count(*), d.total_mb, d.os_mb,
  2  min(d.free_mb) MIN_FREE_MB, max(d.free_mb) MAX_FREE_MB
  3  from v$asm_disk d, v$asm_diskgroup dg
  4  where dg.group_number=d.group_number and d.mount_status='CACHED'
  5  group by dg.name, d.total_mb, d.os_mb;

NAME		  COUNT(*)   TOTAL_MB	   OS_MB MIN_FREE_MB MAX_FREE_MB
--------------- ---------- ---------- ---------- ----------- -----------
DBFS_DG 		36	  352	     352	   4	      84
DATA_QR01		36	  720	     720	 288	     328
RECO_QR01		36	  608	     928         568	     596        <- This one

The problem here is that the grid disks on the storage cells are still 928 MB, as indicated by the “OS_MB” column. To change that, we need to work in rotations, just like with redo log files :)

Drop disks in Storage Cell Server 1

--Verify there is not rebalancing activity:
SQL> select * from gv$asm_operation;
no rows selected
SQL>

--Drop disks on Storage Cell Server 1:
SQL> alter diskgroup reco_qr01
2 drop disks in failgroup qr01celadm01                              <- that is our storage cell server 1
3 rebalance power 1024;
Diskgroup altered.

--Of course there will be a rebalance; wait for it to finish:
select * from gv$asm_operation;
INST_ID GROUP_NUMBER OPERA PASS STAT POWER ACTUAL SOFAR
---------- ------------ ----- --------- ---- ---------- ---------- ----------
EST_WORK EST_RATE EST_MINUTES ERROR_CODE
---------- ---------- ----------- --------------------------------------------
CON_ID
----------
2 3 REBAL RESYNC DONE 1024
0
2 3 REBAL RESILVER DONE 1024

--Some time later
SQL> select * from gv$asm_operation;
no rows selected
SQL>

--Check if they are mounted:
SQL> select path, free_mb, header_status, mount_status
2 from v$asm_disk
3 where path like '%RECO_QR01%celadm01';
PATH  								FREE_MB    HEADER_STATU MOUNT_S
--------------------------------------------------------------- ---------- ------------ -------
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_10_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_06_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_00_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_05_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_08_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_03_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_09_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_04_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_01_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_07_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_02_qr01celadm01               0 FORMER       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_11_qr01celadm01               0 FORMER       CLOSED
12 rows selected.

Now that the disks on cell storage server 1 are dropped from ASM, we can drop the actual grid disks from the storage cell. Go to your storage cell server 1 and drop them:

Drop Grid disks

--List grid disks
[celladmin@qr01celadm01 ~]$ cellcli
CellCLI: Release 12.1.2.1.0 - Production...
CellCLI>
CellCLI> list griddisk attributes name, size, ASMModeStatus
***********************************************************
RECO_QR01_CD_00_qr01celadm01 928M UNUSED
RECO_QR01_CD_01_qr01celadm01 928M UNUSED
RECO_QR01_CD_02_qr01celadm01 928M UNUSED
RECO_QR01_CD_03_qr01celadm01 928M UNUSED
RECO_QR01_CD_04_qr01celadm01 928M UNUSED
RECO_QR01_CD_05_qr01celadm01 928M UNUSED
RECO_QR01_CD_06_qr01celadm01 928M UNUSED
RECO_QR01_CD_07_qr01celadm01 928M UNUSED
RECO_QR01_CD_08_qr01celadm01 928M UNUSED
RECO_QR01_CD_09_qr01celadm01 928M UNUSED
RECO_QR01_CD_10_qr01celadm01 928M UNUSED
RECO_QR01_CD_11_qr01celadm01 928M UNUSED

--Drop the disks:
CellCLI> drop griddisk all prefix=reco_qr01
GridDisk RECO_QR01_CD_00_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_01_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_02_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_03_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_04_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_05_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_06_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_07_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_08_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_09_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_10_qr01celadm01 successfully dropped
GridDisk RECO_QR01_CD_11_qr01celadm01 successfully dropped
CellCLI>

--Create new disks with the correct size:
CellCLI> create griddisk all harddisk prefix=RECO_QR01, size=608M
GridDisk RECO_QR01_CD_00_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_01_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_02_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_03_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_04_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_05_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_06_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_07_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_08_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_09_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_10_qr01celadm01 successfully created
GridDisk RECO_QR01_CD_11_qr01celadm01 successfully created

--Aaaaand check again:
CellCLI> list griddisk attributes name, size, ASMModeStatus
***********************************************************
RECO_QR01_CD_00_qr01celadm01 608M UNUSED
RECO_QR01_CD_01_qr01celadm01 608M UNUSED
RECO_QR01_CD_02_qr01celadm01 608M UNUSED
RECO_QR01_CD_03_qr01celadm01 608M UNUSED
RECO_QR01_CD_04_qr01celadm01 608M UNUSED
RECO_QR01_CD_05_qr01celadm01 608M UNUSED
RECO_QR01_CD_06_qr01celadm01 608M UNUSED
RECO_QR01_CD_07_qr01celadm01 608M UNUSED
RECO_QR01_CD_08_qr01celadm01 608M UNUSED
RECO_QR01_CD_09_qr01celadm01 608M UNUSED
RECO_QR01_CD_10_qr01celadm01 608M UNUSED
RECO_QR01_CD_11_qr01celadm01 608M UNUSED
CellCLI>

Now that we have brand new disks, we have to add them back :) Remember: never drop more than you can carry.

In our case we have normal redundancy, so we go one cell at a time, just in case.
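A sanity check worth running before each rotation is to confirm the diskgroup still has enough headroom to re-mirror after losing a failgroup; for example:

SQL> select name, type, total_mb, free_mb, required_mirror_free_mb, usable_file_mb
  2  from v$asm_diskgroup;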

Verify the Disks

--In ASM we can check the new disks:
--Check if they are mounted:
SQL> select path, free_mb, header_status, mount_status
2 from v$asm_disk
3 where path like '%RECO_QR01%celadm01';
PATH  								FREE_MB    HEADER_STATU MOUNT_S
--------------------------------------------------------------- ---------- ------------ -------
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_10_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_06_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_00_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_05_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_08_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_03_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_09_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_04_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_01_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_07_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_02_qr01celadm01               0 CANDIDATE       CLOSED
o/192.168.1.105;192.168.1.106/RECO_QR01_CD_11_qr01celadm01               0 CANDIDATE       CLOSED
12 rows selected.

So we have 12 shiny new disks to add to our group; let's add them:

Add the new Disks

SQL> alter diskgroup reco_qr01 add disk
2 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_00_qr01celadm01',
3 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_01_qr01celadm01',
4 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_02_qr01celadm01',
5 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_03_qr01celadm01',
6 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_04_qr01celadm01',
7 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_05_qr01celadm01',
8 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_06_qr01celadm01',
9 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_07_qr01celadm01',
10 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_08_qr01celadm01',
11 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_09_qr01celadm01',
12 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_10_qr01celadm01',
13 'o/192.168.1.105;192.168.1.106/RECO_QR01_CD_11_qr01celadm01'
14 drop disks in failgroup qr01celadm02                               <- Drop the disks on the 2nd Cell Storage Server
15 rebalance power 1024;
Diskgroup altered.
SQL>

--Of course there will be some rebalance, but that will pass
SQL> select * from gv$asm_operation;
***********************************
---------- ------------ ----- --------- ---- ---------- ---------- ----------
EST_WORK EST_RATE EST_MINUTES ERROR_CODE
---------- ---------- ----------- --------------------------------------------
CON_ID
----------
2 3 REBAL COMPACT WAIT 1024
0

--Some time later:
SQL> select * from gv$asm_operation;
no rows selected
SQL>

Now we can see that we have one disk group with disks of two different sizes:

Check Diskgroups

SQL> select dg.name, count(*), d.total_mb, d.os_mb,
  2  min(d.free_mb) MIN_FREE_MB, max(d.free_mb) MAX_FREE_MB
  3  from v$asm_disk d, v$asm_diskgroup dg
  4  where dg.group_number=d.group_number and d.mount_status='CACHED'
  5  group by dg.name, d.total_mb, d.os_mb;

NAME		  COUNT(*)   TOTAL_MB	   OS_MB MIN_FREE_MB MAX_FREE_MB
--------------- ---------- ---------- ---------- ----------- -----------
DBFS_DG 		36	  352	     352	   4	      84
DATA_QR01		36	  720	     720	 288	     328
RECO_QR01		12	  608	     608         572	     592   <- New ones (the 12 on cell 1 which we just added)
RECO_QR01		12	  608	     928         576	     592   <- Old ones (the 12 still on cell 3; cell 2's were dropped in the previous step)
                                                                                        24 disks currently mounted

Again, that is very bad practice, so we should continue. As the steps repeat, I will be brief with the explanations :) In a nutshell, we have to move the remaining disks from 928 MB grid disks to 608 MB grid disks :)

Move the disks

--On the 2nd Cell Storage server
CellCLI> list griddisk attributes name, size, ASMModeStatus
RECO_QR01_CD_00_qr01celadm02 928M UNUSED
RECO_QR01_CD_01_qr01celadm02 928M UNUSED
RECO_QR01_CD_02_qr01celadm02 928M UNUSED
RECO_QR01_CD_03_qr01celadm02 928M UNUSED
RECO_QR01_CD_04_qr01celadm02 928M UNUSED
RECO_QR01_CD_05_qr01celadm02 928M UNUSED
RECO_QR01_CD_06_qr01celadm02 928M UNUSED
RECO_QR01_CD_07_qr01celadm02 928M UNUSED
RECO_QR01_CD_08_qr01celadm02 928M UNUSED
RECO_QR01_CD_09_qr01celadm02 928M UNUSED
RECO_QR01_CD_10_qr01celadm02 928M UNUSED
RECO_QR01_CD_11_qr01celadm02 928M UNUSED <- Dropped from ASM in the previous step

--Drop the grid disks:
CellCLI> drop griddisk all prefix=reco_qr01
GridDisk RECO_QR01_CD_00_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_01_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_02_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_03_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_04_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_05_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_06_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_07_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_08_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_09_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_10_qr01celadm02 successfully dropped
GridDisk RECO_QR01_CD_11_qr01celadm02 successfully dropped
CellCLI>

--Create new Grid Disks with new Size:
CellCLI> create griddisk all harddisk prefix=RECO_QR01, size=608M
GridDisk RECO_QR01_CD_00_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_01_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_02_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_03_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_04_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_05_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_06_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_07_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_08_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_09_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_10_qr01celadm02 successfully created
GridDisk RECO_QR01_CD_11_qr01celadm02 successfully created
CellCLI>

--On ASM, add the new disks to the "608" disk group:
SQL> alter diskgroup reco_qr01 add disk
2 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_00_qr01celadm02',
3 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_01_qr01celadm02',
4 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_02_qr01celadm02',
5 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_03_qr01celadm02',
6 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_04_qr01celadm02',
7 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_05_qr01celadm02',
8 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_06_qr01celadm02',
9 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_07_qr01celadm02',
10 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_08_qr01celadm02',
11 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_09_qr01celadm02',
12 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_10_qr01celadm02',
13 'o/192.168.1.107;192.168.1.108/RECO_QR01_CD_11_qr01celadm02'
14 drop disks in failgroup qr01celadm03                              <- Drop the disks on the 3rd Storage Cell Server
15 rebalance power 1024;
Diskgroup altered.
SQL>


--On the 3rd Storage Cell Server
CellCLI> list griddisk attributes name, size, ASMModeStatus
RECO_QR01_CD_00_qr01celadm03 928M UNUSED
RECO_QR01_CD_01_qr01celadm03 928M UNUSED
RECO_QR01_CD_02_qr01celadm03 928M UNUSED
RECO_QR01_CD_03_qr01celadm03 928M UNUSED
RECO_QR01_CD_04_qr01celadm03 928M UNUSED
RECO_QR01_CD_05_qr01celadm03 928M UNUSED
RECO_QR01_CD_06_qr01celadm03 928M UNUSED
RECO_QR01_CD_07_qr01celadm03 928M UNUSED
RECO_QR01_CD_08_qr01celadm03 928M UNUSED
RECO_QR01_CD_09_qr01celadm03 928M UNUSED
RECO_QR01_CD_10_qr01celadm03 928M UNUSED
RECO_QR01_CD_11_qr01celadm03 928M UNUSED

--Drop the grid disks:
CellCLI> drop griddisk all prefix=reco_qr01
GridDisk RECO_QR01_CD_00_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_01_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_02_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_03_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_04_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_05_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_06_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_07_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_08_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_09_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_10_qr01celadm03 successfully dropped
GridDisk RECO_QR01_CD_11_qr01celadm03 successfully dropped
CellCLI>

--Create new Disks with the correct size (608)
CellCLI> create griddisk all harddisk prefix=RECO_QR01, size=608M
GridDisk RECO_QR01_CD_00_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_01_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_02_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_03_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_04_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_05_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_06_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_07_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_08_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_09_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_10_qr01celadm03 successfully created
GridDisk RECO_QR01_CD_11_qr01celadm03 successfully created
CellCLI>

--On ASM, add the new disks (finally) to the disk group:
SQL> alter diskgroup reco_qr01 add disk
2 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_00_qr01celadm03',
3 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_01_qr01celadm03',
4 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_02_qr01celadm03',
5 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_03_qr01celadm03',
6 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_04_qr01celadm03',
7 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_05_qr01celadm03',
8 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_06_qr01celadm03',
9 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_07_qr01celadm03',
10 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_08_qr01celadm03',
11 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_09_qr01celadm03',
12 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_10_qr01celadm03',
13 'o/192.168.1.109;192.168.1.110/RECO_QR01_CD_11_qr01celadm03'
14 rebalance power 1024;
Diskgroup altered.
SQL>

Finally we can check the size of the disk groups again:

Check the results

SQL> select dg.name, count(*), d.total_mb, d.os_mb,
  2  min(d.free_mb) MIN_FREE_MB, max(d.free_mb) MAX_FREE_MB
  3  from v$asm_disk d, v$asm_diskgroup dg
  4  where dg.group_number=d.group_number and d.mount_status='CACHED'
  5  group by dg.name, d.total_mb, d.os_mb;

NAME		  COUNT(*)   TOTAL_MB	   OS_MB MIN_FREE_MB MAX_FREE_MB
--------------- ---------- ---------- ---------- ----------- -----------
DBFS_DG 		36	  352	     352	   4	      84
DATA_QR01		36	  720	     720	 288	     328
RECO_QR01		36	  608	     608         572	     592        <- New one

Flash Cache

Flash cache in Exadata is flash memory which stores frequently accessed data at the storage level. With flash cache you can achieve very good read performance. There are three cache write policies:

Write through

Using the write-through policy, data is written to the cache and the backing store location at the same time. The significance here is not the order in which it happens or whether it happens in parallel. The significance is that I/O completion is only confirmed once the data has been written to both places.

Advantage

Ensures fast retrieval while making sure the data is in the backing store and is not lost in case the cache is disrupted.

Disadvantage

Writes experience higher latency, as the data has to be written to two places every time.

What is it good for?

The write-through policy is good for applications that write and then re-read data frequently. This will result in slightly higher write latency but low read latency. So, it’s ok to spend a bit longer writing once, but then benefit from reading frequently with low latency.

Write-around

Using the write-around policy, data is written only to the backing store without writing to the cache. So, I/O completion is confirmed as soon as the data is written to the backing store.

Advantage

Good for not flooding the cache with data that may not subsequently be re-read.

Disadvantage

Reading recently written data will result in a cache miss (and so a higher latency) because the data can only be read from the slower backing store.

What is it good for?

The write-around policy is good for applications that don’t frequently re-read recently written data. This will result in lower write latency but higher read latency, which is an acceptable trade-off for these scenarios.

Write-back

Using the write-back policy, data is written to the cache and then I/O completion is confirmed. The data is typically also written to the backing store in the background, but the completion confirmation is not blocked on that.

Advantage

Low latency and high throughput for write-intensive applications.

Disadvantage

There is a data availability risk, because the cache could fail before the data is persisted to the backing store, resulting in data loss.

What is it good for?

The write-back policy is the best performer for mixed workloads as both read and write I/O have similar response time levels. In reality, you can add resiliency (e.g. by duplicating writes) to reduce the likelihood of data loss.

Management

The flash based modules can be examined just like the hard disk based ones:

List the cell disks
--Basic
CellCLI> list celldisk where disktype=flashdisk
	 FD_00_qr01celadm01	 normal
	 FD_01_qr01celadm01	 normal
	 FD_02_qr01celadm01	 normal
	 FD_03_qr01celadm01	 normal

--Detail
CellCLI> list flashcache detail
	 name:              	 qr01celadm01_FLASHCACHE
	 cellDisk:          	 FD_02_qr01celadm01,FD_03_qr01celadm01,FD_01_qr01celadm01,FD_00_qr01celadm01
	 creationTime:      	 2020-11-29T02:43:46+00:00
	 degradedCelldisks: 	 
	 effectiveCacheSize:	 1.0625G
	 id:                	 7fcc1eac-2214-40a1-9f27-eb988ec75340
	 size:              	 1.0625G
	 status:            	 normal

CellCLI> 

Apart from the flash cache, Exadata also has a flash log, to improve redo log write latency.

Flash log
--Detail
CellCLI> list flashlog detail
	 name:              	 qr01celadm01_FLASHLOG
	 cellDisk:          	 FD_01_qr01celadm01,FD_03_qr01celadm01,FD_00_qr01celadm01,FD_02_qr01celadm01
	 creationTime:      	 2020-11-29T02:43:44+00:00
	 degradedCelldisks: 	 
	 effectiveSize:     	 256M
	 efficiency:        	 100.0
	 id:                	 1bb5db75-8136-4952-9d20-bec44a303a17
	 size:              	 256M
	 status:            	 normal

CellCLI> 

--Content
CellCLI> list flashcachecontent detail
...
cachedKeepSize: 0
cachedSize: 262144
cachedWriteSize: 0
columnarCacheSize: 0
columnarKeepSize: 0
dbID: 2080757153
dbUniqueName: DBM
hitCount: 11345
missCount: 9
objectNumber: 4294967294
tableSpaceNumber: 0
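The cachedKeepSize field above corresponds to segments pinned in the flash cache via the KEEP policy. From the database side that is a segment-level storage attribute; a hedged example (the table name is made up):

SQL> alter table sales storage (cell_flash_cache keep);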

Drop/Create Flash Cache

We can use the distributed CLI interface provided with Exadata, “dcli”, to run a command on multiple storage cells:

--Drop Flash Cache:
[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e drop flashcache
qr01celadm01: Flash cache qr01celadm01_FLASHCACHE successfully dropped
qr01celadm02: Flash cache qr01celadm02_FLASHCACHE successfully dropped
qr01celadm03: Flash cache qr01celadm03_FLASHCACHE successfully dropped
[celladmin@qr01celadm01 ~]$

--Create Flash Cache
[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e create flashcache all
qr01celadm01: Flash cache qr01celadm01_FLASHCACHE successfully created
qr01celadm02: Flash cache qr01celadm02_FLASHCACHE successfully created
qr01celadm03: Flash cache qr01celadm03_FLASHCACHE successfully created
[celladmin@qr01celadm01 ~]$ 
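Instead of listing the cells with -c every time, dcli can also take a group file (one cell hostname per line) via -g, plus a user via -l; for example:

[celladmin@qr01celadm01 ~]$ cat cell_group
qr01celadm01
qr01celadm02
qr01celadm03
[celladmin@qr01celadm01 ~]$ dcli -g cell_group -l celladmin cellcli -e list cell attributes name, status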

Drop Flash Cache

We can drop the flash cache only after we stop the cluster.

Drop Cache

--Stop the cluster:
[root@qr01dbadm01 ~]# /u01/app/12.1.0.2/grid/bin/crsctl stop cluster -all
CRS-2673: Attempting to stop 'ora.crsd' on 'qr01dbadm01'
CRS-2673: Attempting to stop 'ora.crsd' on 'qr01dbadm02'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'qr01dbadm01'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'qr01dbadm01'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'qr01dbadm02'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'qr01dbadm01'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'qr01dbadm01'
CRS-2673: Attempting to stop 'ora.DBFS_DG.dg' on 'qr01dbadm02'
CRS-2673: Attempting to stop 'ora.dbm.db' on 'qr01dbadm02'

--Drop Flash cache
[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e drop flashcache
qr01celadm01: Flash cache qr01celadm01_FLASHCACHE successfully dropped
qr01celadm02: Flash cache qr01celadm02_FLASHCACHE successfully dropped
qr01celadm03: Flash cache qr01celadm03_FLASHCACHE successfully dropped
[celladmin@qr01celadm01 ~]$

Shutdown Services

Shutdown Services

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e alter cell shutdown services cellsrv
qr01celadm01: 
qr01celadm01: Stopping CELLSRV services...
qr01celadm01: The SHUTDOWN of CELLSRV services was successful.
qr01celadm02: 
qr01celadm02: Stopping CELLSRV services...
qr01celadm02: The SHUTDOWN of CELLSRV services was successful.
qr01celadm03: 
qr01celadm03: Stopping CELLSRV services...
qr01celadm03: The SHUTDOWN of CELLSRV services was successful.
[celladmin@qr01celadm01 ~]$

Enable Write-Back Flash Cache

Enable Write-Back Flash Cache

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e alter cell flashCacheMode = WriteBack
qr01celadm01: Cell qr01celadm01 successfully altered
qr01celadm02: Cell qr01celadm02 successfully altered
qr01celadm03: Cell qr01celadm03 successfully altered
[celladmin@qr01celadm01 ~]$ 

Restart CELLSRV Services

Restart CELLSRV Services

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e alter cell startup services cellsrv
qr01celadm01: 
qr01celadm01: Starting CELLSRV services...
qr01celadm01: The STARTUP of CELLSRV services was successful.
qr01celadm02: 
qr01celadm02: Starting CELLSRV services...
qr01celadm02: The STARTUP of CELLSRV services was successful.
qr01celadm03: 
qr01celadm03: Starting CELLSRV services...
qr01celadm03: The STARTUP of CELLSRV services was successful.
[celladmin@qr01celadm01 ~]$ 
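One thing to keep in mind: the flash cache object itself was dropped before the mode change, so after CELLSRV is back up it still has to be recreated, with the same command we used earlier:

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e create flashcache all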

Flush Caches

Flush Caches

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e alter flashcache all flush
qr01celadm01: Flash cache qr01celadm01_FLASHCACHE altered successfully
qr01celadm02: Flash cache qr01celadm02_FLASHCACHE altered successfully
qr01celadm03: Flash cache qr01celadm03_FLASHCACHE altered successfully
[celladmin@qr01celadm01 ~]$ 
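When there is dirty data the flush can take a while; its progress can be followed per cell disk (a hedged sketch; the flushStatus attribute should eventually report completed):

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e list celldisk attributes name, flushStatus, flushError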

Monitoring

We can monitor the flash cache using several commands:

Determine the Flash Cache Type

Determine Type

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e list cell attributes flashCacheMode
qr01celadm01: WriteBack
qr01celadm02: WriteBack
qr01celadm03: WriteBack
[celladmin@qr01celadm01 ~]$

Determine the amount of Dirty Data

Determine the amount of dirty Data

[celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e list metriccurrent FC_BY_DIRTY
qr01celadm01: FC_BY_DIRTY	 FLASHCACHE	 0.000 MB
qr01celadm02: FC_BY_DIRTY	 FLASHCACHE	 0.000 MB
qr01celadm03: FC_BY_DIRTY	 FLASHCACHE	 0.000 MB
[celladmin@qr01celadm01 ~]$
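FC_BY_DIRTY is just one of the flash cache metrics; the whole family can be listed in one go by filtering on the object type, for example:

[celladmin@qr01celadm01 ~]$ cellcli -e "list metriccurrent where objectType = 'FLASHCACHE'"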