Note: All the posts are based on practical approach avoiding lengthy theory. All have been tested on some development servers. Please don’t test any post on production servers until you are sure.

Thursday, October 31, 2013

Exadata: Health Checking Exadata (Exachk)

Oracle’s exachk utility performs a comprehensive health check of an Oracle Exadata Database Machine. It is non-intrusive: it audits important configuration settings without changing anything in the environment. The components examined are the database servers, storage servers, InfiniBand fabric, InfiniBand switches, and Ethernet network.
Run exachk as the Oracle software owner on a database node after the initial Oracle Exadata Database Machine deployment, as part of the routine maintenance schedule (at least monthly), and before and after any system configuration change. Run only one exachk instance at a time.
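The "one instance at a time" rule can be enforced with a small wrapper. The following is only a sketch: the lock path and the function name `run_exclusive` are assumptions made for this example and are not part of exachk.

```bash
#!/usr/bin/env bash
# Sketch: allow only one exachk run at a time using an atomic mkdir lock.
# LOCK_DIR is a hypothetical path chosen for this example.
LOCK_DIR="${LOCK_DIR:-/tmp/exachk.lock}"

run_exclusive() {
    # mkdir is atomic: it fails if the directory already exists,
    # which means another run currently holds the lock.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another exachk run appears to be in progress ($LOCK_DIR exists)" >&2
        return 1
    fi
    # Run the given command, remember its exit status, then release the lock.
    local rc=0
    "$@" || rc=$?
    rmdir "$LOCK_DIR"
    return "$rc"
}

# Example (as the Oracle software owner on a database node):
#   run_exclusive ./exachk -a
```

If a run is killed before it releases the lock, the directory stays behind and has to be removed by hand before the next run.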


[oracle@exadb sw]$ ./exachk -v

EXACHK  VERSION: 2.2.3_20131007
[oracle@exadb sw]$ ./exachk -h
Usage : ./exachk [-abvhpfmsuSo:c:t:]
        -a      All (Perform best practice check and recommended patch check)
        -b      Best Practice check only. No recommended patch check
        -h      Show usage
        -v      Show version 
        -p      Patch check only
        -m      exclude checks for Maximum Availability Architecture scorecards(see user guide for more details)
        -u      Run exachk to check pre-upgrade or post-upgrade best practices for 11.2.0.3,11.2.0.4.0 and 12.1.0.1
                -o pre or -o post is mandatory with -u option like ./exachk -u -o pre
        -f      Run Offline.Checks will be performed on data already collected from the system
        -o      Argument to an option. if -o is followed by v,V,Verbose,VERBOSE or Verbose, it will print checks which passs on the screen
                if -o option is not specified,it will print only failures on screen. for eg: exachk -a -o v 

        -clusternodes
                Pass comma separated node names to run exachk only on subset of nodes.
        -dbnames
               Pass comma separated database names to run exachk only on subset of databases
        -localonly
                Run exachk only on local node.

        -nopass
                Skip PASS'ed check to print in exachk report and upload to database.

        -noscore
                Do not print healthscore in HTML report.

        -diff <Old Report> <New Report> [-outfile <Output HTML>]
                Diff two exachk reports. Pass directory name or zip file or html report file as <Old Report> & <New Report>

        -c     Used only under the guidance of Oracle support or development to override default components 

        -d
               start      : Start the exachk daemon
               stop       : Stop the exachk daemon
               status     : Check if the exachk daemon is running
               nextautorun: print the next auto run time
        -daemon
               run exachk only if daemon is running
        -nodaemon
               Dont use daemon to run exachk 
        -set 
               configure exachk daemon parameter like "param1=value1;param2=value2... "
               
                 Supported parameters are:-

                 AUTORUN_INTERVAL :- Automatic rerun interval in daemon mode.Set it zero to disable automatic rerun which is zero.

                 AUTORUN_SCHEDULE * * * *       :- Automatic run at specific time in daemon mode.
                                  - - - -
                                  | | | |
                                  | | | +----- day of week (0 - 6) (0 to 6 are Sunday to Saturday)
                                  | | +---------- month (1 - 12)
                                  | +--------------- day of month (1 - 31)
                                  +-------------------- hour (0 - 23)

                     example: exachk -set "AUTORUN_SCHEDULE=8,20 * * 2,5" will schedule runs on tuesday and friday at 8 and 20 hour.
                  
                 AUTORUN_FLAGS : exachk flags to use for auto runs.
         
                     example: exachk -set "AUTORUN_INTERVAL=12h;AUTORUN_FLAGS=-profile sysadmin" to run sysadmin profile every 12 hours

                              exachk -set "AUTORUN_INTERVAL=2d;AUTORUN_FLAGS=-profile dba" to run dba profile once every 2 days.

                 NOTIFICATION_EMAIL : Email address used for notifications by daemon if mail server is configured.

                 PASSWORD_CHECK_INTERVAL : Interval to verify passwords in daemon mode

        -get
               Print the value of parameter

        -profile Pass specific profile. 
                 List of supported profiles: 

                 asm             asm Checks          
                 clusterware     clusterware checks  
                 dba             dba Checks          
                 el_lite         Exalogic-Lite Checks(Exalogic Only)
                 el_rackcompare  Data Collection for Exalogic Rack Comparison Tool(Exalogic Only)
                 goldengate      oracle goldengate checks
                 maa             Maximum Availability Architecture Checks
                 obiee           obiee Checks(Exalytics Only)
                 storage         Storage Server Checks
                 switch          Infiniband switch checks
                 sysadmin        sysadmin checks     
                 timesten        timesten Checks(Exalytics Only)
                 virtual_infra   OVS, Control VM,  NTP-related and stale VNICs check (Exalogic Only)
                 zfs             ZFS storage appliances checks (Exalogic Only)

        -cells
                Pass comma separated storage server names to run exachk only on selected storage servers.

        -ibswitches
                Pass comma separated infiniband switch names to run exachk only on selected infiniband switches.

        -zfsnodes
                Pass comma separated ZFS storage appliance names to run exachk only on selected storage appliances.
[oracle@exadb sw]$
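
The AUTORUN_SCHEDULE value in the help text has four fields (hour, day of month, month, day of week), which is easy to confuse with a five-field crontab entry. As a rough aid, the sketch below spells out such a value before it is passed to `exachk -set`. The function name `explain_schedule` is made up for this example, and it only handles `*` or comma-separated numbers in the day-of-week field.

```bash
#!/usr/bin/env bash
# Sketch: spell out an AUTORUN_SCHEDULE value
# (fields: hour, day of month, month, day of week).
explain_schedule() {
    local hour dom month dow part days=""
    local -a parts
    local names=(Sunday Monday Tuesday Wednesday Thursday Friday Saturday)

    # Split the value into its four space-separated fields.
    read -r hour dom month dow <<< "$1"
    if [ -z "$dow" ]; then
        echo "expected 4 fields: hour day-of-month month day-of-week" >&2
        return 1
    fi

    if [ "$dow" = "*" ]; then
        days="*"
    else
        # Translate each comma-separated day number (0-6) to its name.
        IFS=',' read -r -a parts <<< "$dow"
        for part in "${parts[@]}"; do
            case "$part" in
                [0-6]) days="${days:+$days,}${names[$part]}" ;;
                *) echo "invalid day of week: $part" >&2; return 1 ;;
            esac
        done
    fi
    echo "hour=$hour day_of_month=$dom month=$month day_of_week=$days"
}

# Example, using the schedule from the help text:
#   explain_schedule "8,20 * * 2,5"
```

For the help text's own example this reports hours 8 and 20 on Tuesday and Friday, which matches the description given there.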

[oracle@exadb sw]$ ./exachk
List of running databases
1. exadb
2. None of above

Select databases from list for checking best practices. For multiple databases, select 1 for All or comma separated number like 1,2 etc [1-2][1].1
. . 

Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS
. . . . . . . . . . . . . . . 
-------------------------------------------------------------------------------------------------------
                                                 Oracle Stack Status                            
-------------------------------------------------------------------------------------------------------
Host Name  CRS Installed  ASM HOME       RDBMS Installed  CRS UP    ASM UP    RDBMS UP  DB Instance Name
-------------------------------------------------------------------------------------------------------
exadb       No              No              Yes             No         No       Yes      exadb     
-------------------------------------------------------------------------------------------------------
Copying plug-ins
. . . . . . . . . . . . . . . . . . 
. . . . . . 

*** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***
Collections and audit checks log file is 
/sw/exachk_exadb_exadb_103113_171641/log/exachk.log

Checking for prompts in /home/oracle/.bash_profile on exadb for oracle user...
=============================================================
                    Node name - exadb                                
=============================================================
Collecting - Database Parameters for exadb database 
Collecting - Database Undocumented Parameters for exadb database 
Collecting - RDBMS Feature Usage for exadb database 
Collecting - CPU Information
Collecting - DiskMount Information
Collecting - Kernel parameters
Collecting - Maximum number of semaphore sets on system
Collecting - Maximum number of semaphores on system
Collecting - Maximum number of semaphores per semaphore set
Collecting - Memory Information
Collecting - OS Packages
Collecting - Patches for RDBMS Home 
Collecting - number of semaphore operations per semop system call

Data collections completed. Checking best practices on exadb.
--------------------------------------------------------------------------------------
 FAIL =>    Database parameter DB_BLOCK_CHECKSUM is NOT set to recommended value on exadb instance
 FAIL =>    Database parameter DB_LOST_WRITE_PROTECT is NOT set to recommended value on exadb instance
 WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for exadb
 INFO =>    Operational Best Practices
 INFO =>    Database Consolidation Best Practices
 INFO =>    Computer failure prevention best practices
 INFO =>    Data corruption prevention best practices
 INFO =>    Logical corruption prevention best practices
 INFO =>    Database/Cluster/Site failure prevention best practices
 INFO =>    Client failover operational best practices
 WARNING => oracleasm (asmlib) module is NOT loaded
 WARNING => Redo log file size should be sized to switch every 20 minutes during peak redo generation for exadb
 WARNING => RAC Application Cluster is not being used for database high availability on exadb instance
 FAIL =>    Flashback on PRIMARY is not configured for exadb
 INFO =>    Database failure prevention best practices
 WARNING => fast_start_mttr_target has NOT been changed from default on exadb instance
 WARNING => Database Archivelog Mode should be set to ARCHIVELOG for exadb
 FAIL =>    Primary database is NOT protected with Data Guard (standby database) for real-time data protection and availability for exadb
 FAIL =>    Active Data Guard is not configured for exadb
 WARNING => Redo log write time is more than 500 milliseconds for exadb
 INFO =>    Parallel Execution Health-Checks and Diagnostics Reports for exadb
 INFO =>    Oracle recovery manager(rman) best practices


Best Practice checking completed.Checking recommended patches on exadb.
---------------------------------------------------------------------------------

Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11.2.0/dbhome_1 
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
4 Recommended RDBMS patches for 112030 from /u01/app/oracle/product/11.2.0/dbhome_1 on exadb
---------------------------------------------------------------------------------
Patch#   RDBMS    ASM     type                Patch-Description                       
---------------------------------------------------------------------------------
13923374  no             merge               DATABASE PATCH SET UPDATE 11.2.0.3.3 (IN
14727310  no             merge               DATABASE PATCH SET UPDATE 11.2.0.3.5 (IN
16056266  no             merge               DATABASE PATCH SET UPDATE 11.2.0.3.6 (IN
16619892  no             merge               DATABASE PATCH SET UPDATE 11.2.0.3.7 (IN
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
              RDBMS homes patches summary report
---------------------------------------------------------------------------------
Total patches  Applied on RDBMS Applied on ASM ORACLE_HOME    
---------------------------------------------------------------------------------
 4              0              0                /u01/app/oracle/product/11.2.0/dbhome_1
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------

Detailed report (html) - /sw/exachk_exadb_exadb_103113_171641/exachk_exadb_exadb_103113_171641.html
UPLOAD(if required) - /sw/exachk_exadb_exadb_103113_171641.zip
[oracle@exadb sw]$ 
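
When the console output is long, a quick tally of findings by severity helps to compare runs. The sketch below assumes the console output was saved to a text file (the file name in the usage comment is hypothetical) and relies only on the `FAIL =>`, `WARNING =>` and `INFO =>` prefixes visible in the output above. The function name `count_findings` is invented for this example.

```bash
#!/usr/bin/env bash
# Sketch: count exachk findings per severity from saved console output.
count_findings() {
    # Reads console output on stdin; prints one "LEVEL count" line
    # per severity, always in the order FAIL, WARNING, INFO.
    awk '
        /^[ \t]*(FAIL|WARNING|INFO) =>/ { n[$1]++ }
        END {
            printf "FAIL %d\nWARNING %d\nINFO %d\n", n["FAIL"], n["WARNING"], n["INFO"]
        }'
}

# Example (exachk_console.txt is a hypothetical capture of the run):
#   count_findings < exachk_console.txt
```

Lines that do not start with one of the three prefixes, such as the "Collecting - ..." progress lines, are ignored.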

exachk also produces an HTML report of findings, with the most important exceptions listed first by component. The main sections of the report are listed below; the exact list varies with the arguments passed to exachk:

  •  Cluster Summary
  •  Findings Needing Attention
  •  MAA Scorecard
  •  Findings Passed
  •  Systemwide firmware and software versions
  •  Killed Processes
  •  Skipped Checks
  •  Excluded Checks
Below is partial output from the HTML report that exachk generated in my test environment.

Oracle Database Assessment Report

System Health Score is 85 out of 100 (detail)

Summary

OS/Kernel Version          LINUX X86-64 OELRHEL 5 2.6.32-200.13.1.el5uek
DB Home - Version - Names  /u01/app/oracle/product/11.2.0/dbhome_1 - 11.2.0.3.0 - exadb
Database Server            exadb
exachk Version             2.2.3_20131007
Collection                 exachk_exadb_exadb_103113_171641.zip
Collection Date            31-Oct-2013 17:17:15

Database Server

Status | Type | Message | Status On | Details
WARNING | OS Check | Redo log write time is more than 500 milliseconds | All Database Servers | View
INFO | OS Check | Parallel Execution Health-Checks and Diagnostics Reports | All Database Servers | View

RDBMS patch recommendation Detailed report

4 Recommended RDBMS patches for 112030 from /u01/app/oracle/product/11.2.0/dbhome_1
Patch# | RDBMS | ASM | Type | Patch-Description
13923374 | not-applied | n/a | merge | DATABASE PATCH SET UPDATE 11.2.0.3.3 (INCLUDES CPU JUL2012)
14727310 | not-applied | n/a | merge | DATABASE PATCH SET UPDATE 11.2.0.3.5 (INCLUDES CPUJAN2013)
16056266 | not-applied | n/a | merge | DATABASE PATCH SET UPDATE 11.2.0.3.6 (INCLUDES CPUAPR2013)
16619892 | not-applied | n/a | merge | DATABASE PATCH SET UPDATE 11.2.0.3.7 (INCLUDES CPUJUL2013)

Findings Passed


Database Server

Status | Type | Message | Status On | Details
PASS | OS Check | pam_limits configured properly for shell limits | All Database Servers | View
PASS | OS Check | Package glibc-2.5-24-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package elfutils-libelf-0.125-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package glibc-headers-2.5-12-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package elfutils-libelf-devel-0.125-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | umask for RDBMS owner is set to 0022 | All Database Servers | View
PASS | OS Check | ip_local_port_range is configured according to recommendation | All Database Servers | View
PASS | OS Check | kernel.shmmax parameter is configured according to recommendation | All Database Servers | View
PASS | OS Check | Kernel Parameter fs.file-max is configuration meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Shell limit hard stack for DB is configured according to recommendation | All Database Servers | View
PASS | OS Check | Free space in /tmp directory meets or exceeds recommendation of minimum 1GB | All Database Servers | View
PASS | OS Check | Shell limit soft nofile for DB is configured according to recommendation | All Database Servers | View
PASS | OS Check | Shell limit hard nproc for DB is configured according to recommendation | All Database Servers | View
PASS | OS Check | Shell limit hard nofile for DB is configured according to recommendation | All Database Servers | View
PASS | OS Check | Shell limit soft nproc for DB is configured according to recommendation | All Database Servers | View
PASS | OS Check | Linux Swap Configuration meets or exceeds Recommendation | All Database Servers | View
PASS | OS Check | Package glibc-devel-2.5-24-i386 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package compat-libstdc++-33-3.2.3-61-i386 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libstdc++-4.1.2-42.el5-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package glibc-devel-2.5-24-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package compat-libstdc++-33-3.2.3-61-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package sysstat-7.0.2-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libgcc-4.1.2-42.el5-i386 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libstdc++-4.1.2-42.el5-i386 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package glibc-2.5-24-i386 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package unixODBC-devel-2.2.11-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libstdc++-devel-4.1.2-42.el5-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package gcc-c++-4.1.2-42.el5-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package binutils-2.17.50.0.6-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package make-3.81-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libaio-devel-0.3.106-3.2-i386 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libgcc-4.1.2-42.el5-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libaio-devel-0.3.106-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libaio-0.3.106-i386 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package libaio-0.3.106-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Package glibc-common-2.5-x86_64 meets or exceeds recommendation | All Database Servers | View
PASS | OS Check | Kernel Parameter SEMMNS OK | All Database Servers | View
PASS | OS Check | Kernel Parameter kernel.shmmni OK | All Database Servers | View
PASS | OS Check | Kernel Parameter SEMMSL OK | All Database Servers | View
PASS | OS Check | Kernel Parameter SEMMNI OK | All Database Servers | View
PASS | OS Check | Kernel Parameter SEMOPM OK | All Database Servers | View
PASS | OS Check | Kernel Parameter kernel.shmall OK | All Database Servers | View
PASS | OS Check | The number of async IO descriptors is sufficient (/proc/sys/fs/aio-max-nr) | All Database Servers | View
PASS | OS Check | net.core.rmem_max is Configured Properly | All Database Servers | View
PASS | OS Check | net.core.rmem_default Is Configured Properly | All Database Servers | View

Note:
To omit the “System Health Score” section from the final report of findings, add the “-noscore” option to the exachk command line.

The “-localonly” command line option restricts exachk to the node it is launched on. If you want to cover the cluster this way, run exachk with “-localonly” on each database server in turn.
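
As a rough illustration of running with “-localonly” node by node, the sketch below only prints the commands rather than executing them. The node names, the use of `ssh -t` for exachk's interactive prompts, and the function name `print_localonly_commands` are all assumptions for this example; the `/sw` directory is simply where exachk was unpacked in the session above.

```bash
#!/usr/bin/env bash
# Sketch: print one "-localonly" exachk command per database server.
# Nothing is executed; the output is meant to be reviewed and run by hand,
# one node at a time.
print_localonly_commands() {
    # $1: comma-separated database server names (hypothetical input)
    # $2: directory that holds exachk on each server
    local nodes_csv="$1" dir="$2" node
    local -a nodes
    IFS=',' read -r -a nodes <<< "$nodes_csv"
    for node in "${nodes[@]}"; do
        printf "ssh -t oracle@%s 'cd %s && ./exachk -localonly'\n" "$node" "$dir"
    done
}

# Example with made-up node names:
#   print_localonly_commands "exadb01,exadb02" /sw
```

Printing instead of executing keeps the runs sequential, which fits the advice to run only one exachk instance at a time.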

