Shutdown and Startup of Exadata Cell Nodes in Rolling Fashion

 

High Level Steps:

a) Increase the disk repair time in ASM so that disks are not dropped from ASM.

b) Execute all commands on the cell node as the root user.

c) Check disk status from the OS prompt using the "cellcli" utility.

d) Take all grid disks offline on the respective cell node before shutting it down.

e) Shut down the cell node.

f) Power on the cell node using ILOM.

g) Bring the disks back online once the cell node is up.

h) Repeat steps a to g for the remaining 13 cell nodes in the cluster.

 

1. Adjust Parameters at ASM Level

If the ASM disks will be offline for longer than the default DISK_REPAIR_TIME of 3.6 hours, increase the attribute for each disk group before starting. Run the commands below on a database node as the ASM software owner (the oracle user here).

Log in to database node "exanode01" as the oracle user and set the ASM environment, entering +ASM1 when oraenv prompts for the ORACLE_SID:

. oraenv

+ASM1

sqlplus / as sysasm

ALTER DISKGROUP DATAC1 SET ATTRIBUTE 'DISK_REPAIR_TIME'='8.5H';

ALTER DISKGROUP RECOC1 SET ATTRIBUTE 'DISK_REPAIR_TIME'='8.5H';

ALTER DISKGROUP DBFS_DG SET ATTRIBUTE 'DISK_REPAIR_TIME'='8.5H';
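
When a cluster has more disk groups than the three shown here, the statements can be generated rather than typed. The following is a minimal sketch, not an Oracle-supplied tool: the function name repair_time_sql is hypothetical, and it only prints SQL text without connecting to ASM.

```shell
# Print one ALTER DISKGROUP statement per disk group name.
# Usage: repair_time_sql <value> <diskgroup>...
# Hypothetical helper: it generates text only and runs nothing.
repair_time_sql() {
  local value
  local dg
  value="$1"
  shift
  for dg in "$@"; do
    printf "ALTER DISKGROUP %s SET ATTRIBUTE 'DISK_REPAIR_TIME'='%s';\n" "$dg" "$value"
  done
}

# Example (commented out because it needs a running ASM instance):
# repair_time_sql 8.5H DATAC1 RECOC1 DBFS_DG | sqlplus -S / as sysasm
```

Reviewing the printed statements before piping them into sqlplus keeps the change auditable.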

2. Log in to the cell exacellnode01 as root and run the commands below:

cellcli -e list celldisk

cellcli -e list physicaldisk

cellcli -e list griddisk

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

cellcli -e list griddisk attributes name,asmmodestatus

 

3. Check that ASM can tolerate the grid disks on this cell going offline. The following command should return asmdeactivationoutcome 'Yes' for every grid disk listed:

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome
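
On a cell with many grid disks it is easy to miss a single row that is not 'Yes'. The check can be sketched as a small filter over the command output. This assumes whitespace-separated rows with the name in column 1, asmmodestatus in column 2 and asmdeactivationoutcome starting in column 3; the function name safe_to_deactivate is hypothetical.

```shell
# Read "name asmmodestatus asmdeactivationoutcome" rows on stdin.
# Print the name of every grid disk whose outcome is not Yes.
# Exit status is 0 only if at least one row was read and all rows say Yes.
safe_to_deactivate() {
  awk '
    NF == 0 { next }                      # skip blank lines
    { rows++ }
    $3 != "Yes" { print $1; bad = 1 }     # outcome is missing or a warning text
    END { if (bad || rows == 0) exit 1 }  # empty input counts as a failure
  '
}

# Example on a cell (not run here):
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome | safe_to_deactivate
```

Treating empty input as a failure is a deliberate choice in this sketch, so that a failed cellcli call is not mistaken for a clean result.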

 

4. Run the cellcli command to inactivate all grid disks on the cell:

cellcli -e alter griddisk all inactive

 

5. Confirm that the grid disks are now offline:

(a) Run the command below. Once the disks are offline in ASM, the output should show asmmodestatus=OFFLINE or asmmodestatus=UNUSED, together with asmdeactivationoutcome=Yes, for every grid disk. Only then is it safe to shut down or restart the cell:

cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome

(b) List the grid disks to confirm that all of them now show inactive:

cellcli -e list griddisk
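
The condition in step 5(a) can be expressed as a gate that must pass before the shutdown command is typed. This is a sketch under the same assumption about whitespace-separated columns (name, asmmodestatus, asmdeactivationoutcome); confirm_offline is a hypothetical name, not a CellCLI feature.

```shell
# Read "name asmmodestatus asmdeactivationoutcome" rows on stdin.
# A row passes when the status is OFFLINE or UNUSED and the outcome is Yes.
# Print "name status" for each failing row; exit 0 only if every row passes.
confirm_offline() {
  awk '
    NF == 0 { next }
    { rows++ }
    !(($2 == "OFFLINE" || $2 == "UNUSED") && $3 == "Yes") { print $1, $2; bad = 1 }
    END { if (bad || rows == 0) exit 1 }
  '
}

# Example on a cell (not run here):
# cellcli -e list griddisk attributes name,asmmodestatus,asmdeactivationoutcome | confirm_offline \
#   && echo "all grid disks offline, safe to shut down"
```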

6. Shut down the cell node using the Linux shutdown command.

(a) The following command, run as root, shuts down the Oracle Exadata Storage Server immediately:

# shutdown -h now

7. Power on the cell node and bring all services and disks back online (steps 8 to 13).

 

8. Connect to the ILOM (Integrated Lights Out Manager) IP address of the cell from a PuTTY terminal.

 

Log in as root and execute the command below:

show /SYS

This command shows the status of the cell node. "power_state = Off" means the node is down.

 

9. Bring up the cell by executing the command below and pressing Enter:

 

start /SYS

ILOM asks for confirmation (y/n).

Enter "y".

 

Wait for the cell node to come online.

 

10. Once the cell is back online, reactivate the grid disks:

cellcli -e alter griddisk all active

11. Run the command below; all disks should show 'active':

cellcli -e list griddisk

12. Verify grid disk status:

(a) Check the ASM status of all grid disks using the following command:

cellcli -e list griddisk attributes name,asmmodestatus

(b) Wait until asmmodestatus is ONLINE for all grid disks. Each disk goes to a 'SYNCING' state first, then 'ONLINE'. The following is an example of the output while synchronization is still in progress:

DATAC1_CD_00_exacellnode01 ONLINE

DATAC1_CD_01_exacellnode01 SYNCING

DATAC1_CD_02_exacellnode01 OFFLINE

(c) Oracle ASM synchronization is complete only when all grid disks show asmmodestatus=ONLINE.
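
Waiting for every disk to reach ONLINE can be scripted as a polling loop. The following is a sketch only: the function names all_online and wait_until_online are hypothetical, the attempt count and delay are arbitrary, and the parser assumes whitespace-separated rows with asmmodestatus in column 2.

```shell
# Exit 0 only when at least one row is read and column 2 is ONLINE on every row.
all_online() {
  awk '
    NF == 0 { next }
    { rows++ }
    $2 != "ONLINE" { pending++ }
    END { if (pending || rows == 0) exit 1 }
  '
}

# Run a status command repeatedly until all_online accepts its output.
# Usage: wait_until_online <attempts> <delay-seconds> <command> [args...]
# Returns 0 once all disks are ONLINE, 1 if the attempts run out first.
wait_until_online() {
  local attempts
  local delay
  local i
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" | all_online; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example on a cell (not run here): poll once a minute for up to two hours.
# wait_until_online 120 60 cellcli -e list griddisk attributes name,asmmodestatus
```

A bounded number of attempts is used so that a disk stuck in OFFLINE surfaces as a failure instead of an endless wait.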

 

13. Oracle ASM Synchronization

Before taking another storage server offline, Oracle ASM synchronization must complete on the restarted Oracle Exadata Storage Server. If synchronization is not complete, then the check performed on another storage server will fail. The following is an example of the output:

 

CellCLI> list griddisk attributes name where asmdeactivationoutcome != 'Yes'

DATAC1_CD_00_exacellnode02 "Cannot de-activate due to other offline disks in the diskgroup"

 

To check the status of the cell services, run as root on the cell:

service celld status

 

14. Repeat steps 1 to 13 for each of the remaining 13 cell nodes.

Source: Oracle Support.

 

See also: