Differences
This shows you the differences between two versions of the page.
oracle_exadata_storage_config [2020/11/30 16:02] – [Overview] andonovj | oracle_exadata_storage_config [2020/12/04 13:02] (current) – andonovj | ||
---|---|---|---|
Line 6: | Line 6: | ||
You can see the basic architecture of the physical disks, LUNs, cell disks and grid disks: | You can see the basic architecture of the physical disks, LUNs, cell disks and grid disks: | ||
+ | {{: | ||
- | {{ :exadata-storage-cell-disks.jpg? | + | A more detailed view of a disk is shown below: |
+ | {{: | ||
=====Configuration===== | =====Configuration===== | ||
+ | In this section we will configure and re-configure certain features of the cell storage server. | ||
+ | ====Enable Mail Notifications==== | ||
+ | To enable mail notifications from a cell, we can use the following commands: | ||
- | =====Management===== | + | < |
+ | --List cell Details: | ||
+ | CellCLI> list cell detail | ||
+ | name: qr01celadm01 | ||
+ | cellVersion: | ||
+ | cpuCount: 2 | ||
+ | diagHistoryDays: | ||
+ | fanCount: 0/0 | ||
+ | fanStatus: normal | ||
+ | ******************************************** | ||
+ | ***NO NOTIFICATION SETTINGS***************** | ||
+ | ******************************************** | ||
+ | CellCLI> | ||
+ | |||
+ | --Modify The cell: | ||
+ | CellCLI> alter cell smtpServer=' | ||
+ | > smtpFromAddr=' | ||
+ | > smtpFrom='John Doe', - | ||
+ | > smtpToAddr=' | ||
+ | > notificationPolicy=' | ||
+ | > notificationMethod=' | ||
+ | Cell qr01celadm01 successfully altered | ||
+ | |||
+ | --List Details again | ||
+ | CellCLI> list cell detail | ||
+ | name: qr01celadm01 | ||
+ | cellVersion: | ||
+ | cpuCount: 2 | ||
+ | diagHistoryDays: | ||
+ | fanCount: 0/0 | ||
+ | fanStatus: normal | ||
+ | ******************************************** | ||
+ | notificationMethod: | ||
+ | notificationPolicy: | ||
+ | ******************************************** | ||
+ | offloadGroupEvents: | ||
+ | offloadEfficiency: | ||
+ | </ | ||
+ | |||
+ | However since the mail doesn' | ||
+ | |||
+ | < | ||
+ | CellCLI> alter cell validate mail | ||
+ | CELL-02578: An error was detected in the SMTP configuration: | ||
+ | CELL-05503: An error was detected during notification. The text | ||
+ | of the associated internal error is: Unknown SMTP host: | ||
+ | my_mail.example.com. | ||
+ | The notification recipient is [email protected]. | ||
+ | Please verify your SMTP configuration. | ||
+ | CellCLI> | ||
+ | </ | ||
+ | |||
+ | You can also validate the whole configuration using the following command: | ||
+ | |||
+ | < | ||
+ | CellCLI> alter cell validate configuration | ||
+ | Cell qr01celadm01 successfully altered | ||
+ | CellCLI> | ||
+ | --Note | ||
+ | Note that the ALTER CELL VALIDATE CONFIGURATION command does not perform I/O | ||
+ | tests against the cell’s hard disks and flash modules. You must use the CALIBRATE | ||
+ | command to perform such tests. The CALIBRATE command can only be executed in a | ||
+ | CellCLI session initiated by the root user. | ||
+ | |||
+ | </ | ||
+ | =====Storage Cell===== | ||
====List Cell Processes==== | ====List Cell Processes==== | ||
Line 21: | Line 91: | ||
==RS Server== | ==RS Server== | ||
- | <sxh> | + | <Code:bash> |
[celladmin@qr01celadm01 ~]$ ps -ef | grep cellrs | [celladmin@qr01celadm01 ~]$ ps -ef | grep cellrs | ||
root 1927 | root 1927 | ||
Line 29: | Line 99: | ||
root 1937 1934 0 Nov29 ? 00:00:00 / | root 1937 1934 0 Nov29 ? 00:00:00 / | ||
root 1944 1937 0 Nov29 ? 00:00:04 / | root 1944 1937 0 Nov29 ? 00:00:04 / | ||
- | </sxh> | + | </Code> |
==MS Server== | ==MS Server== | ||
- | <sxh> | + | <Code:bash> |
[celladmin@qr01celadm01 ~]$ ps -ef | grep msServer | [celladmin@qr01celadm01 ~]$ ps -ef | grep msServer | ||
root 2003 1938 0 Nov29 ? 00:04:39 / | root 2003 1938 0 Nov29 ? 00:04:39 / | ||
1000 7610 7441 0 03:21 pts/0 00:00:00 grep msServer | 1000 7610 7441 0 03:21 pts/0 00:00:00 grep msServer | ||
[celladmin@qr01celadm01 ~]$ | [celladmin@qr01celadm01 ~]$ | ||
- | </sxh> | + | </Code> |
==CellSRV== | ==CellSRV== | ||
- | <sxh> | + | <Code:bash> |
[celladmin@qr01celadm01 ~]$ ps -ef | grep "/ | [celladmin@qr01celadm01 ~]$ ps -ef | grep "/ | ||
root 1940 1936 22 Nov29 ? 05:34:49 / | root 1940 1936 22 Nov29 ? 05:34:49 / | ||
[celladmin@qr01celadm01 ~]$ | [celladmin@qr01celadm01 ~]$ | ||
- | </sxh> | + | </Code> |
It is important to note that both MS and CellSRV are children of the RS server (they have RS as a parent). | It is important to note that both MS and CellSRV are children of the RS server (they have RS as a parent). | ||
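The parent/child relationship can also be checked programmatically. The following is a minimal Python sketch, assuming `ps -ef` style input; the command paths in `SAMPLE` are placeholders (the real paths are cut off in the listings above) and the helper names `parse_ps` and `has_ancestor` are hypothetical, not part of any Exadata tool.

```python
# Sketch: walk the PPID chain of a process and look for an RS ancestor.
# SAMPLE uses the PIDs shown above, but the command paths are placeholders.
SAMPLE = """
root 1934    1  0 Nov29 ? 00:00:10 /placeholder/cellrs_main
root 1936 1934  0 Nov29 ? 00:00:00 /placeholder/cellrs_helper_a
root 1938 1934  0 Nov29 ? 00:00:00 /placeholder/cellrs_helper_b
root 1940 1936 22 Nov29 ? 05:34:49 /placeholder/cellsrv
root 2003 1938  0 Nov29 ? 00:04:39 /placeholder/msServer
"""


def parse_ps(ps_output):
    """Return {pid: (ppid, command)} from `ps -ef` style lines."""
    procs = {}
    for line in ps_output.strip().splitlines():
        # Columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # skip the header and malformed lines
        procs[int(fields[1])] = (int(fields[2]), fields[7])
    return procs


def has_ancestor(procs, pid, marker):
    """True if any ancestor of `pid` has `marker` in its command."""
    seen = set()
    while pid in procs and pid not in seen:
        seen.add(pid)
        ppid, _ = procs[pid]
        if ppid in procs and marker in procs[ppid][1]:
            return True
        pid = ppid
    return False


if __name__ == "__main__":
    procs = parse_ps(SAMPLE)
    for pid in (1940, 2003):
        print(pid, has_ancestor(procs, pid, "cellrs"))
```

On a live cell you would feed the function the real output of `ps -ef` instead of `SAMPLE`.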
Line 59: | Line 129: | ||
To list the status of a storage cell server, we can use the following command: | To list the status of a storage cell server, we can use the following command: | ||
- | <sxh> | + | <Code:bash> |
[celladmin@qr01celadm01 ~]$ cellcli -e list cell | [celladmin@qr01celadm01 ~]$ cellcli -e list cell | ||
qr01celadm01 online | qr01celadm01 online | ||
[celladmin@qr01celadm01 ~]$ | [celladmin@qr01celadm01 ~]$ | ||
- | </sxh> | + | </Code> |
We can obtain more detailed information using the CellCLI interface: | We can obtain more detailed information using the CellCLI interface: | ||
- | <sxh> | + | <Code:bash> |
[celladmin@qr01celadm01 ~]$ cellcli | [celladmin@qr01celadm01 ~]$ cellcli | ||
CellCLI: Release 12.1.2.1.0 - Production on Mon Nov 30 03:30:03 UTC 2020 | CellCLI: Release 12.1.2.1.0 - Production on Mon Nov 30 03:30:03 UTC 2020 | ||
Line 108: | Line 178: | ||
CellCLI> | CellCLI> | ||
- | </sxh> | + | </Code> |
==List Lun== | ==List Lun== | ||
- | <sxh> | + | <Code:bash> |
CellCLI> list lun | CellCLI> list lun | ||
/ | / | ||
Line 129: | Line 199: | ||
/ | / | ||
/ | / | ||
- | </sxh> | + | </Code> |
The result you get on a real Exadata will be similar to: | The result you get on a real Exadata will be similar to: | ||
- | <sxh> | + | <Code:bash> |
CellCLI> list lun | CellCLI> list lun | ||
0_0 0_0 normal | 0_0 0_0 normal | ||
Line 151: | Line 221: | ||
5_1 5_1 normal | 5_1 5_1 normal | ||
CellCLI> | CellCLI> | ||
- | </sxh> | + | </Code> |
The reason is that in my virtualized environment the cells are mapped to virtualized disks and virtualized flash devices, whereas on a real Exadata they are mapped to a PCI slot and device number. | The reason is that in my virtualized environment the cells are mapped to virtualized disks and virtualized flash devices, whereas on a real Exadata they are mapped to a PCI slot and device number. | ||
Line 158: | Line 228: | ||
==List Lun== | ==List Lun== | ||
- | <sxh> | + | <Code:bash> |
CellCLI> list lun where name like ' | CellCLI> list lun where name like ' | ||
name: / | name: / | ||
Line 172: | Line 242: | ||
CellCLI> | CellCLI> | ||
- | </sxh> | + | </Code> |
==List Physical Disk== | ==List Physical Disk== | ||
- | <sxh> | + | <Code:bash> |
CellCLI> list physicaldisk where luns like ' | CellCLI> list physicaldisk where luns like ' | ||
name: / | name: / | ||
Line 189: | Line 259: | ||
CellCLI> | CellCLI> | ||
- | </sxh> | + | </Code> |
==List celldisk== | ==List celldisk== | ||
- | <sxh> | + | <Code:bash> |
--List all cell disks | --List all cell disks | ||
CellCLI> list celldisk where disktype=flashdisk | CellCLI> list celldisk where disktype=flashdisk | ||
Line 220: | Line 290: | ||
CellCLI> | CellCLI> | ||
- | </sxh> | + | </Code> |
==List grid disk== | ==List grid disk== | ||
- | <sxh> | + | <Code:bash> |
CellCLI> list griddisk where celldisk=CD_09_qr01celadm01 detail | CellCLI> list griddisk where celldisk=CD_09_qr01celadm01 detail | ||
name: DATA_QR01_CD_09_qr01celadm01 | name: DATA_QR01_CD_09_qr01celadm01 | ||
Line 275: | Line 345: | ||
CellCLI> | CellCLI> | ||
- | </sxh> | + | </Code> |
+ | |||
+ | ====Stop / Start Cell Services==== | ||
+ | We can restart all cell services using the CellCLI interface as follows: | ||
+ | |||
+ | < | ||
+ | [root@qr01celadm01 ~]# cellcli | ||
+ | CellCLI: Release 12.1.2.1.0 - Production on Mon Nov 30 03:36:16 UTC 2020 | ||
+ | |||
+ | Copyright (c) 2007, 2013, Oracle. | ||
+ | Cell Efficiency Ratio: 392 | ||
+ | |||
+ | CellCLI> alter cell restart services all | ||
+ | |||
+ | Stopping the RS, CELLSRV, and MS services... | ||
+ | The SHUTDOWN of services was successful. | ||
+ | Starting the RS, CELLSRV, and MS services... | ||
+ | Getting the state of RS services... | ||
+ | Starting CELLSRV services... | ||
+ | The STARTUP of CELLSRV services was successful. | ||
+ | Starting MS services... | ||
+ | The STARTUP of MS services was successful. | ||
+ | |||
+ | CellCLI> | ||
+ | </ | ||
+ | |||
+ | This action will NOT cause downtime. | ||
+ | |||
+ | ====ReConfigure GridDisk==== | ||
+ | Let's set ourselves a challenge: change the size of the ASM disks in one disk group, and of the underlying grid disks, from 928 MB to 608 MB. | ||
+ | That is a hard process if we want to avoid downtime. In a nutshell, what we will do is: | ||
+ | |||
+ | - Drop Disks on Cell Server 1 | ||
+ | - Reconfigure the Disks on Cell Server 1 | ||
+ | - Add the new disks to the diskgroup, while dropping the ones on the next Cell server | ||
+ | |||
+ | We will repeat steps 1), 2) and 3) until all disks have the same size. | ||
+ | P.S. Having disks of different sizes in one disk group is very bad practice, as it will cause constant rebalancing. | ||
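The rotation described above can be sketched as a small plan generator. This is an illustrative Python sketch only: the function name `rolling_plan` is hypothetical and the step texts are plain-English descriptions, not commands to paste into ASM or CellCLI.

```python
# Sketch: build the ordered list of steps for a rolling grid disk resize.
# Each step is (where, what): "asm" steps run on the ASM instance,
# "cell" steps run on the named storage cell.
def rolling_plan(cells):
    if not cells:
        raise ValueError("at least one cell is required")
    steps = []
    for i, cell in enumerate(cells):
        if i == 0:
            # Only the first cell needs a stand-alone drop in ASM;
            # later cells are dropped while the previous one is added back.
            steps.append(("asm", f"drop disks in failgroup {cell}"))
        steps.append(("cell", f"recreate grid disks on {cell} with the new size"))
        action = f"add the new disks from {cell}"
        if i + 1 < len(cells):
            action += f" and drop disks in failgroup {cells[i + 1]}"
        steps.append(("asm", action))
    return steps


if __name__ == "__main__":
    for where, what in rolling_plan(["qr01celadm01", "qr01celadm02", "qr01celadm03"]):
        print(f"[{where}] {what}")
```

Every "asm" step triggers a rebalance, so in practice each one is followed by waiting until `gv$asm_operation` returns no rows before touching the cell.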
+ | Without further ado, let's get started: | ||
+ | |||
+ | |||
+ | < | ||
+ | SQL> select dg.name, count(*), d.total_mb, d.os_mb, | ||
+ | 2 min(d.free_mb) MIN_FREE_MB, | ||
+ | 3 from v$asm_disk d, v$asm_diskgroup dg | ||
+ | 4 where dg.group_number=d.group_number and d.mount_status=' | ||
+ | 5 group by dg.name, d.total_mb, d.os_mb; | ||
+ | |||
+ | NAME | ||
+ | --------------- ---------- ---------- ---------- ----------- ----------- | ||
+ | DBFS_DG 36 352 | ||
+ | DATA_QR01 36 | ||
+ | RECO_QR01 36 | ||
+ | </ | ||
+ | |||
+ | We could simply resize the disks to 608 MB in ASM and rebalance, but that alone won't help: | ||
+ | |||
+ | |||
+ | < | ||
+ | SQL> alter diskgroup reco_qr01 resize all size 608m | ||
+ | 2 rebalance power 1024; | ||
+ | Diskgroup altered. | ||
+ | SQL> | ||
+ | SQL> select dg.name, count(*), d.total_mb, d.os_mb, | ||
+ | 2 min(d.free_mb) MIN_FREE_MB, | ||
+ | 3 from v$asm_disk d, v$asm_diskgroup dg | ||
+ | 4 where dg.group_number=d.group_number and d.mount_status=' | ||
+ | 5 group by dg.name, d.total_mb, d.os_mb; | ||
+ | |||
+ | NAME | ||
+ | --------------- ---------- ---------- ---------- ----------- ----------- | ||
+ | DBFS_DG 36 352 | ||
+ | DATA_QR01 36 | ||
+ | RECO_QR01 36 | ||
+ | </ | ||
+ | |||
+ | The problem here is the fact that the grid disk on the Storage cell is still 928, as indicated by the " | ||
+ | To change that, we need to do it in rotation, just like with redo log files :) | ||
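The mismatch can be spotted by comparing the two size columns the query selects from `v$asm_disk`: `TOTAL_MB` (the capacity ASM uses) and `OS_MB` (the size reported by the operating system for the underlying device). Below is a minimal Python sketch under stated assumptions: the rows are typed in by hand, `EXAMPLE_DG` is an invented disk group, and `shrunk_in_asm_only` is a hypothetical helper name.

```python
# Sketch: flag disk groups whose ASM size is smaller than the device size,
# i.e. groups that were resized in ASM but not on the storage cell.
from typing import NamedTuple


class DiskRow(NamedTuple):
    diskgroup: str
    total_mb: int  # capacity ASM uses for the disk
    os_mb: int     # size of the underlying grid disk


def shrunk_in_asm_only(rows):
    """Return the sorted names of disk groups with total_mb < os_mb."""
    return sorted({r.diskgroup for r in rows if r.total_mb < r.os_mb})


if __name__ == "__main__":
    rows = [
        DiskRow("RECO_QR01", 608, 928),   # resized in ASM, grid disk still 928 MB
        DiskRow("EXAMPLE_DG", 500, 500),  # invented row: sizes agree
    ]
    print(shrunk_in_asm_only(rows))
```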
+ | |||
+ | < | ||
+ | --Verify there is no rebalancing activity: | ||
+ | SQL> select * from gv$asm_operation; | ||
+ | no rows selected | ||
+ | SQL> | ||
+ | |||
+ | --Drop disks on Storage Cell Server 1: | ||
+ | SQL> alter diskgroup reco_qr01 | ||
+ | 2 drop disks in failgroup qr01celadm01 -- qr01celadm01 is our storage cell server 1 | ||
+ | 3 rebalance power 1024; | ||
+ | Diskgroup altered. | ||
+ | |||
+ | --Of course there will be a rebalance; wait for it to finish: | ||
+ | select * from gv$asm_operation; | ||
+ | INST_ID GROUP_NUMBER OPERA PASS STAT POWER ACTUAL SOFAR | ||
+ | ---------- ------------ ----- --------- ---- ---------- ---------- ---------- | ||
+ | EST_WORK EST_RATE EST_MINUTES ERROR_CODE | ||
+ | ---------- ---------- ----------- -------------------------------------------- | ||
+ | CON_ID | ||
+ | ---------- | ||
+ | 2 3 REBAL RESYNC DONE 1024 | ||
+ | 0 | ||
+ | 2 3 REBAL RESILVER DONE 1024 | ||
+ | |||
+ | --Some time later | ||
+ | SQL> select * from gv$asm_operation; | ||
+ | no rows selected | ||
+ | SQL> | ||
+ | |||
+ | --Check if they are mounted: | ||
+ | SQL> select path, free_mb, header_status, | ||
+ | 2 from v$asm_disk | ||
+ | 3 where path like ' | ||
+ | PATH FREE_MB | ||
+ | --------------------------------------------------------------- ---------- ------------ ------- | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | 12 rows selected. | ||
+ | </ | ||
+ | |||
+ | Now that the disks on Cell Storage Server 1 have been dropped from ASM, we can drop the actual grid disks on the storage cell. | ||
+ | Go to your storage cell server 1 and drop them: | ||
+ | |||
+ | < | ||
+ | --List grid disks | ||
+ | [celladmin@qr01celadm01 ~]$ cellcli | ||
+ | CellCLI: Release 12.1.2.1.0 - Production... | ||
+ | CellCLI> | ||
+ | CellCLI> list griddisk attributes name, size, ASMModeStatus | ||
+ | *********************************************************** | ||
+ | RECO_QR01_CD_00_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_01_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_02_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_03_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_04_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_05_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_06_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_07_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_08_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_09_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_10_qr01celadm01 928M UNUSED | ||
+ | RECO_QR01_CD_11_qr01celadm01 928M UNUSED | ||
+ | |||
+ | --Drop the disks: | ||
+ | CellCLI> drop griddisk all prefix=reco_qr01 | ||
+ | GridDisk RECO_QR01_CD_00_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_01_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_02_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_03_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_04_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_05_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_06_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_07_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_08_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_09_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_10_qr01celadm01 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_11_qr01celadm01 successfully dropped | ||
+ | CellCLI> | ||
+ | |||
+ | --Create new disks with the correct size: | ||
+ | CellCLI> create griddisk all harddisk prefix=RECO_QR01, | ||
+ | GridDisk RECO_QR01_CD_00_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_01_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_02_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_03_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_04_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_05_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_06_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_07_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_08_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_09_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_10_qr01celadm01 successfully created | ||
+ | GridDisk RECO_QR01_CD_11_qr01celadm01 successfully created | ||
+ | |||
+ | --Aaaaand check again: | ||
+ | CellCLI> list griddisk attributes name, size, ASMModeStatus | ||
+ | *********************************************************** | ||
+ | RECO_QR01_CD_00_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_01_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_02_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_03_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_04_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_05_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_06_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_07_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_08_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_09_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_10_qr01celadm01 608M UNUSED | ||
+ | RECO_QR01_CD_11_qr01celadm01 608M UNUSED | ||
+ | CellCLI> | ||
+ | </ | ||
+ | |||
+ | |||
+ | Now that we have brand new disks, we have to add them :) Remember: never drop more than you can carry. | ||
+ | |||
+ | * External Redundancy - Dropping disks will cause data loss | ||
+ | * Normal Redundancy - Dropping more than 1/2 will cause data loss | ||
+ | * High Redundancy - Dropping more than 2/3 will cause data loss. | ||
+ | |||
+ | In our case, we have Normal Redundancy, so we go one by one just in case. | ||
+ | |||
+ | |||
+ | < | ||
+ | --In ASM we can check the new disks: | ||
+ | --Check if they are mounted: | ||
+ | SQL> select path, free_mb, header_status, | ||
+ | 2 from v$asm_disk | ||
+ | 3 where path like ' | ||
+ | PATH FREE_MB | ||
+ | --------------------------------------------------------------- ---------- ------------ ------- | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | o/ | ||
+ | 12 rows selected. | ||
+ | </ | ||
+ | |||
+ | So we have 12 shiny new disks to add to our group. Let's add them: | ||
+ | |||
+ | < | ||
+ | SQL> alter diskgroup reco_qr01 add disk | ||
+ | 2 ' | ||
+ | 3 ' | ||
+ | 4 ' | ||
+ | 5 ' | ||
+ | 6 ' | ||
+ | 7 ' | ||
+ | 8 ' | ||
+ | 9 ' | ||
+ | 10 ' | ||
+ | 11 ' | ||
+ | 12 ' | ||
+ | 13 ' | ||
+ | 14 drop disks in failgroup qr01celadm02 | ||
+ | 15 rebalance power 1024; | ||
+ | Diskgroup altered. | ||
+ | SQL> | ||
+ | |||
+ | --Of course there will be some rebalance, but that will pass | ||
+ | SQL> select * from gv$asm_operation; | ||
+ | *********************************** | ||
+ | ---------- ------------ ----- --------- ---- ---------- ---------- ---------- | ||
+ | EST_WORK EST_RATE EST_MINUTES ERROR_CODE | ||
+ | ---------- ---------- ----------- -------------------------------------------- | ||
+ | CON_ID | ||
+ | ---------- | ||
+ | 2 3 REBAL COMPACT WAIT 1024 | ||
+ | 0 | ||
+ | |||
+ | --Some time later: | ||
+ | SQL> select * from gv$asm_operation; | ||
+ | no rows selected | ||
+ | SQL> | ||
+ | </ | ||
+ | |||
+ | Now we can see that we have one disk group with two different sizes: | ||
+ | < | ||
+ | SQL> select dg.name, count(*), d.total_mb, d.os_mb, | ||
+ | 2 min(d.free_mb) MIN_FREE_MB, | ||
+ | 3 from v$asm_disk d, v$asm_diskgroup dg | ||
+ | 4 where dg.group_number=d.group_number and d.mount_status=' | ||
+ | 5 group by dg.name, d.total_mb, d.os_mb; | ||
+ | |||
+ | NAME | ||
+ | --------------- ---------- ---------- ---------- ----------- ----------- | ||
+ | DBFS_DG 36 352 | ||
+ | DATA_QR01 36 | ||
+ | RECO_QR01 12 | ||
+ | RECO_QR01 12 | ||
+ | 36 disks in total | ||
+ | </ | ||
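A mixed-size state like the one shown above is easy to detect once the query result is available as rows. Here is a minimal Python sketch, assuming the rows are `(diskgroup, total_mb)` pairs entered by hand; `mixed_size_groups` is a hypothetical helper name.

```python
# Sketch: report disk groups that contain disks of more than one size.
from collections import defaultdict


def mixed_size_groups(rows):
    """rows: iterable of (diskgroup, total_mb). Returns {diskgroup: [sizes]}."""
    sizes = defaultdict(set)
    for name, total_mb in rows:
        sizes[name].add(total_mb)
    return {name: sorted(s) for name, s in sizes.items() if len(s) > 1}


if __name__ == "__main__":
    rows = [("DBFS_DG", 352), ("RECO_QR01", 928), ("RECO_QR01", 608)]
    print(mixed_size_groups(rows))
```

An empty result means every disk group is uniform again, which is the goal of the remaining steps.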
+ | |||
+ | Again, that is very bad practice, so we should continue. As the steps repeat, I will explain less :) In a nutshell, we have to replace the remaining 928 MB disks in the group with 608 MB disks :) | ||
+ | |||
+ | < | ||
+ | --On the 2nd Cell Storage server | ||
+ | CellCLI> list griddisk attributes name, size, ASMModeStatus | ||
+ | RECO_QR01_CD_00_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_01_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_02_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_03_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_04_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_05_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_06_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_07_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_08_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_09_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_10_qr01celadm02 928M UNUSED | ||
+ | RECO_QR01_CD_11_qr01celadm02 928M UNUSED <- We dropped them earlier | ||
+ | |||
+ | --Drop the grid disks: | ||
+ | CellCLI> drop griddisk all prefix=reco_qr01 | ||
+ | GridDisk RECO_QR01_CD_00_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_01_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_02_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_03_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_04_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_05_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_06_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_07_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_08_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_09_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_10_qr01celadm02 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_11_qr01celadm02 successfully dropped | ||
+ | CellCLI> | ||
+ | |||
+ | --Create new Grid Disks with new Size: | ||
+ | CellCLI> create griddisk all harddisk prefix=RECO_QR01, | ||
+ | GridDisk RECO_QR01_CD_00_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_01_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_02_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_03_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_04_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_05_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_06_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_07_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_08_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_09_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_10_qr01celadm02 successfully created | ||
+ | GridDisk RECO_QR01_CD_11_qr01celadm02 successfully created | ||
+ | CellCLI> | ||
+ | |||
+ | --On ASM, add the new disks to the " | ||
+ | SQL> alter diskgroup reco_qr01 add disk | ||
+ | 2 ' | ||
+ | 3 ' | ||
+ | 4 ' | ||
+ | 5 ' | ||
+ | 6 ' | ||
+ | 7 ' | ||
+ | 8 ' | ||
+ | 9 ' | ||
+ | 10 ' | ||
+ | 11 ' | ||
+ | 12 ' | ||
+ | 13 ' | ||
+ | 14 drop disks in failgroup qr01celadm03 | ||
+ | 15 rebalance power 1024; | ||
+ | Diskgroup altered. | ||
+ | SQL> | ||
+ | |||
+ | |||
+ | --On the 3rd Storage Cell Server | ||
+ | CellCLI> list griddisk attributes name, size, ASMModeStatus | ||
+ | RECO_QR01_CD_00_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_01_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_02_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_03_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_04_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_05_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_06_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_07_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_08_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_09_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_10_qr01celadm03 928M UNUSED | ||
+ | RECO_QR01_CD_11_qr01celadm03 928M UNUSED | ||
+ | |||
+ | --Drop the grid disks: | ||
+ | CellCLI> drop griddisk all prefix=reco_qr01 | ||
+ | GridDisk RECO_QR01_CD_00_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_01_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_02_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_03_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_04_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_05_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_06_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_07_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_08_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_09_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_10_qr01celadm03 successfully dropped | ||
+ | GridDisk RECO_QR01_CD_11_qr01celadm03 successfully dropped | ||
+ | CellCLI> | ||
+ | |||
+ | --Create new Disks with the correct size (608) | ||
+ | CellCLI> create griddisk all harddisk prefix=RECO_QR01, | ||
+ | GridDisk RECO_QR01_CD_00_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_01_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_02_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_03_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_04_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_05_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_06_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_07_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_08_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_09_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_10_qr01celadm03 successfully created | ||
+ | GridDisk RECO_QR01_CD_11_qr01celadm03 successfully created | ||
+ | CellCLI> | ||
+ | |||
+ | --On ASM, add the new disks (finally) to the disk group: | ||
+ | SQL> alter diskgroup reco_qr01 add disk | ||
+ | 2 ' | ||
+ | 3 ' | ||
+ | 4 ' | ||
+ | 5 ' | ||
+ | 6 ' | ||
+ | 7 ' | ||
+ | 8 ' | ||
+ | 9 ' | ||
+ | 10 ' | ||
+ | 11 ' | ||
+ | 12 ' | ||
+ | 13 ' | ||
+ | 14 rebalance power 1024; | ||
+ | Diskgroup altered. | ||
+ | SQL> | ||
+ | </ | ||
+ | |||
+ | Finally we can check the size of the disk groups again: | ||
+ | |||
+ | < | ||
+ | SQL> select dg.name, count(*), d.total_mb, d.os_mb, | ||
+ | 2 min(d.free_mb) MIN_FREE_MB, | ||
+ | 3 from v$asm_disk d, v$asm_diskgroup dg | ||
+ | 4 where dg.group_number=d.group_number and d.mount_status=' | ||
+ | 5 group by dg.name, d.total_mb, d.os_mb; | ||
+ | |||
+ | NAME | ||
+ | --------------- ---------- ---------- ---------- ----------- ----------- | ||
+ | DBFS_DG 36 352 | ||
+ | DATA_QR01 36 | ||
+ | RECO_QR01 36 | ||
+ | </ | ||
+ | |||
+ | =====Flash Cache===== | ||
+ | Flash cache in Exadata is flash memory that stores frequently accessed data at the storage level. With flash cache you can achieve very good read performance. | ||
+ | There are three types of cache write policy: | ||
+ | |||
+ | ===Write through=== | ||
+ | Using the write-through policy, data is written to the cache and the backing store location at the same time. The significance here is not the order in which it happens or whether it happens in parallel. The significance is that I/O completion is only confirmed once the data has been written to both places. | ||
+ | |||
+ | ==Advantage== | ||
+ | Ensures fast retrieval while making sure the data is in the backing store and is not lost in case the cache is disrupted. | ||
+ | |||
+ | ==Disadvantage== | ||
+ | Writing data will experience latency as you have to write to two places every time. | ||
+ | |||
+ | ==What is it good for?== | ||
+ | The write-through policy is good for applications that write and then re-read data frequently. This will result in slightly higher write latency but low read latency. So, it’s ok to spend a bit longer writing once, but then benefit from reading frequently with low latency. | ||
+ | |||
+ | ===Write-around=== | ||
+ | Using the write-around policy, data is written only to the backing store without writing to the cache. So, I/O completion is confirmed as soon as the data is written to the backing store. | ||
+ | |||
+ | ==Advantage== | ||
+ | Good for not flooding the cache with data that may not subsequently be re-read. | ||
+ | |||
+ | ==Disadvantage== | ||
+ | Reading recently written data will result in a cache miss (and so a higher latency) because the data can only be read from the slower backing store. | ||
+ | |||
+ | ==What is it good for?== | ||
+ | The write-around policy is good for applications that don’t frequently re-read recently written data. This will result in lower write latency but higher read latency, which is an acceptable trade-off for these scenarios. | ||
+ | |||
+ | |||
+ | ===Write-back=== | ||
+ | Using the write-back policy, data is written to the cache and then I/O completion is confirmed. The data is then typically also written to the backing store in the background, but the completion confirmation is not blocked on that. | ||
+ | |||
+ | ==Advantage== | ||
+ | Low latency and high throughput for write-intensive applications. | ||
+ | |||
+ | ==Disadvantage== | ||
+ | There is a data availability risk because the cache could fail before the data is persisted to the backing store, in which case the data is lost. | ||
+ | |||
+ | ==What is it good for?== | ||
+ | The write-back policy is the best performer for mixed workloads as both read and write I/O have similar response time levels. In reality, you can add resiliency (e.g. by duplicating writes) to reduce the likelihood of data loss. | ||
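The three policies can be compared side by side with a toy model. The Python sketch below only illustrates the general caching concepts described above; it is not how Exadata implements its flash cache, and the class and method names (`ToyCache`, `lose_cache`, and so on) are invented for the example.

```python
# Sketch: a toy cache in front of a backing store, with three write policies.
class ToyCache:
    POLICIES = ("write-through", "write-around", "write-back")

    def __init__(self, policy):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.cache = {}     # fast tier
        self.store = {}     # slow backing store
        self.dirty = set()  # keys written to cache but not yet to the store

    def write(self, key, value):
        if self.policy == "write-through":
            # Completion is confirmed only after both copies are written.
            self.cache[key] = value
            self.store[key] = value
        elif self.policy == "write-around":
            # Bypass the cache; drop any stale cached copy.
            self.store[key] = value
            self.cache.pop(key, None)
        else:  # write-back
            # Confirm after the cache write; persist later via flush().
            self.cache[key] = value
            self.dirty.add(key)

    def read(self, key):
        """Return (value, "hit" or "miss"); a miss populates the cache."""
        if key in self.cache:
            return self.cache[key], "hit"
        value = self.store[key]
        self.cache[key] = value
        return value, "miss"

    def flush(self):
        """Persist all dirty entries to the backing store."""
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()

    def lose_cache(self):
        """Simulate a cache failure; return the keys whose latest write is lost."""
        lost = sorted(self.dirty)
        self.cache.clear()
        self.dirty.clear()
        return lost


if __name__ == "__main__":
    for policy in ToyCache.POLICIES:
        c = ToyCache(policy)
        c.write("block1", "v1")
        print(policy, "in store:", "block1" in c.store, "dirty:", sorted(c.dirty))
```

The `dirty` set in the write-back case is the reason a flush is needed before a write-back cache can be dropped safely.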
- | ===Flash Based=== | + | ====Management==== |
The flash-based modules can be examined in the same way as the hard-disk-based ones: | The flash-based modules can be examined in the same way as the hard-disk-based ones: | ||
==List the cell disks== | ==List the cell disks== | ||
- | <sxh> | + | <Code:bash> |
--Basic | --Basic | ||
CellCLI> list celldisk where disktype=flashdisk | CellCLI> list celldisk where disktype=flashdisk | ||
Line 302: | Line 852: | ||
CellCLI> | CellCLI> | ||
- | </sxh> | + | </Code> |
Apart from the flash disks, Exadata also has a flash log, which improves redo log write latency. | Apart from the flash disks, Exadata also has a flash log, which improves redo log write latency. | ||
==Flash log== | ==Flash log== | ||
- | <sxh> | + | <Code:bash> |
--Detail | --Detail | ||
CellCLI> list flashlog detail | CellCLI> list flashlog detail | ||
Line 337: | Line 887: | ||
tableSpaceNumber: | tableSpaceNumber: | ||
- | </sxh> | + | </Code> |
+ | ==Drop/ | ||
+ | We can use the distributed CLI interface, provided with the exadata: " | ||
+ | < | ||
+ | --Drop Flash Cache: | ||
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01, | ||
+ | qr01celadm01: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | --Create Flash Cache | ||
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01, | ||
+ | qr01celadm01: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </ | ||
- | ====Stop / Start Cell Services==== | + | ===Drop Flash Cache=== |
- | We can Restart all cell services using the cellCli interface as follows: | + | We can drop the flash cache only after we have stopped the cluster. |
+ | < | ||
+ | --Stop the cluster: | ||
+ | [root@qr01dbadm01 ~]# / | ||
+ | CRS-2673: Attempting to stop ' | ||
+ | CRS-2673: Attempting to stop ' | ||
+ | CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on ' | ||
+ | CRS-2673: Attempting to stop ' | ||
+ | CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on ' | ||
+ | CRS-2673: Attempting to stop ' | ||
+ | CRS-2673: Attempting to stop ' | ||
+ | CRS-2673: Attempting to stop ' | ||
+ | CRS-2673: Attempting to stop ' | ||
- | <sxh> | + | --Drop Flash cache |
- | [root@qr01celadm01 ~]# cellcli | + | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01, |
- | CellCLI: Release 12.1.2.1.0 - Production on Mon Nov 30 03:36:16 UTC 2020 | + | qr01celadm01: Flash cache qr01celadm01_FLASHCACHE successfully dropped |
+ | qr01celadm02: Flash cache qr01celadm02_FLASHCACHE successfully dropped | ||
+ | qr01celadm03: Flash cache qr01celadm03_FLASHCACHE successfully dropped | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </ | ||
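Because `dcli` prefixes every output line with the cell name, the result is easy to check per cell. The following Python sketch assumes output in the `cell: message` form shown above; the helper names `group_by_cell` and `cells_missing` are hypothetical.

```python
# Sketch: group dcli output by cell and find cells that did not report success.
SAMPLE = """\
qr01celadm01: Flash cache qr01celadm01_FLASHCACHE successfully dropped
qr01celadm02: Flash cache qr01celadm02_FLASHCACHE successfully dropped
qr01celadm03: Flash cache qr01celadm03_FLASHCACHE successfully dropped
"""


def group_by_cell(dcli_output):
    """Return {cell: [messages]} from lines of the form 'cell: message'."""
    result = {}
    for line in dcli_output.splitlines():
        cell, sep, message = line.partition(":")  # split on the first colon only
        if not sep:
            continue  # not a 'cell: message' line
        result.setdefault(cell.strip(), []).append(message.strip())
    return result


def cells_missing(result, expected_cells, needle):
    """Return the expected cells with no message containing `needle`."""
    return [
        cell for cell in expected_cells
        if not any(needle in msg for msg in result.get(cell, []))
    ]


if __name__ == "__main__":
    cells = ["qr01celadm01", "qr01celadm02", "qr01celadm03"]
    grouped = group_by_cell(SAMPLE)
    print(cells_missing(grouped, cells, "successfully dropped"))
```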
- | Copyright (c) 2007, 2013, Oracle. All rights reserved. | + | ===Shutdown Services=== |
- | Cell Efficiency Ratio: 392 | + | < |
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02,qr01celadm03 cellcli -e alter cell shutdown services cellsrv | ||
+ | qr01celadm01: | ||
+ | qr01celadm01: | ||
+ | qr01celadm01: The SHUTDOWN of CELLSRV services was successful. | ||
+ | qr01celadm02: | ||
+ | qr01celadm02: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | qr01celadm03: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </ | ||
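The `dcli` invocation used above follows a fixed pattern: a comma-separated cell list after `-c`, followed by the command to run on each cell. A small Python sketch that assembles such a command line is shown below; `dcli_cellcli` is a hypothetical helper, and it only builds the argument list without executing anything.

```python
# Sketch: assemble a `dcli ... cellcli -e ...` argument list for a set of cells.
def dcli_cellcli(cells, cellcli_command):
    """Return the argv list for running one CellCLI command on several cells."""
    if not cells:
        raise ValueError("at least one cell is required")
    # Naive whitespace split: fine for simple commands like the one above,
    # but it would mangle quoted attribute values that contain spaces.
    return ["dcli", "-c", ",".join(cells), "cellcli", "-e"] + cellcli_command.split()


if __name__ == "__main__":
    cells = ["qr01celadm01", "qr01celadm02", "qr01celadm03"]
    print(" ".join(dcli_cellcli(cells, "alter cell shutdown services cellsrv")))
```

The returned list could be handed to `subprocess.run` on a host where `dcli` is installed; that part is left out here because it cannot be exercised outside an Exadata environment.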
- | CellCLI> alter cell restart services all | + | ===Enable Flash Through Cache=== |
+ | < | ||
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01, | ||
+ | qr01celadm01: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </ | ||
- | Stopping the RS, CELLSRV, and MS services... | ||
- | The SHUTDOWN of services was successful. | ||
- | Starting the RS, CELLSRV, and MS services... | ||
- | Getting the state of RS services... | ||
- | Starting CELLSRV services... | ||
- | The STARTUP of CELLSRV services was successful. | ||
- | Starting MS services... | ||
- | The STARTUP of MS services was successful. | ||
- | CellCLI> | + | ===Restart Flash Cache Services=== |
- | </sxh> | + | < |
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01, | ||
+ | qr01celadm01: | ||
+ | qr01celadm01: | ||
+ | qr01celadm01: | ||
+ | qr01celadm02: | ||
+ | qr01celadm02: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | qr01celadm03: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </Code> | ||
- | This action, will NOT cause downtime. | + | ===Flush Caches=== |
+ | < | ||
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01,qr01celadm02, | ||
+ | qr01celadm01: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </ | ||
+ | |||
+ | ====Monitoring==== | ||
+ | We can monitor the flash cache using several commands: | ||
+ | |||
+ | ===Determine the Flash Cache Type=== | ||
+ | < | ||
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01, | ||
+ | qr01celadm01: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </ | ||
+ | |||
+ | ===Determine the amount of Dirty Data=== | ||
+ | < | ||
+ | [celladmin@qr01celadm01 ~]$ dcli -c qr01celadm01, | ||
+ | qr01celadm01: | ||
+ | qr01celadm02: | ||
+ | qr01celadm03: | ||
+ | [celladmin@qr01celadm01 ~]$ | ||
+ | </ | ||