README_sys_diag.txt   (copyright 1999-2007 Todd A. Jobson)
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
_____________________________________

0.  Outline of this README document :
_____________________________________

    1.  sys_diag v.7.04 Overview
    2.  Common Command Line usage and available parameters
    3.  Common Command Line Usage Examples
    4.  Examples for capturing sys_diag Command Line output
    5.  Examples of sys_diag Crontab entries
    6.  sys_diag DIRECTORIES and DATA FILE Descriptions
    7.  Sample Command Line Output
    8.  For more Information : Reference Links & Feedback

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
________________________________

1.  sys_diag v.7.04 Overview :
________________________________

BACKGROUND / INTRODUCTION :

sys_diag is a Solaris utility (ksh script) that can perform several functions, among them:
system configuration 'snapshot' and reporting (detailed or high-level), plus workload
characterization/profiling via performance data gathering (over a specified duration, or as a
point in time 'snapshot'), high-level analysis, and reporting of findings/exceptions (based
upon perf thresholds that can be easily changed within the script header).

The output is provided in a single .tar.Z of output and corresponding data files, and a local
sub-directory where report/data files are stored. The report format is provided in .html,
.txt, and .ps as a single file for easy review (without requiring trudging through several
subdirectories of separate files to manually correlate and review).

sys_diag runs on any Solaris 2.6 (or above) Sun platform, including reporting of new
Solaris 10 capabilities (zones/containers, SVM, zfs pools, fmd, ipfilter/ipnat, link aggr,
Dtrace probing, etc...).

Beyond the Sun configuration reporting commands [System/storage HW config, OS config, kernel
tunables, network/IPMP/Trunking/LLT config, FS/VM/NFS, users/groups, security, NameSvcs, pkgs,
patches, errors/warnings, and system/network performance metrics...], sys_diag also captures
relevant application configuration details, such as Sun N1, Sun Cluster 2.x/3.x, Veritas
VCS/VM/vxfs.., Oracle .ora/listener files, etc.., along with detailed configuration capture
of key files (and tracking of changes via -t).

Of all the capabilities, the greatest benefit comes from being able to run this single ksh
script on a system and do the analysis from one single report/file... offline/elsewhere (in
addition to being capable of historically archiving system configurations, for disaster
recovery.. or to allow for tracking system chgs over time.. after things are
built/tested/certified).

One nice feature for performance analysis is that the vmstat and netstat data is exported in
a text format that is easy to import and create graphs from in StarOffice or Excel.. as well
as creating IO and NET device Averages from IOSTAT / Netstat data (# IO's per device,
AVG R/W K, etc..) along with peak exceptions for CPU / MEM / IO / NET ..

Although this tool isn't meant to replace long-term historical Performance Trending and
Capacity Planning packages (Teamquest, etc..), it provides a very robust starting point (and
actually is much better at point in time workload characterization and root cause analysis of
bottlenecks, where very granular detailed data correlation is required).

Although I'm a Sun employee, this has been personally developed over many years, in my spare
time, in order to make my life a lot easier and more efficient. Hopefully others will find
this utility capable of doing the same for them, also making use of its legwork.. to
streamline the admin/analysis activities required of them.

This has been an invaluable tool used to diagnose / analyze hundreds of performance and/or
configuration issues.

Regarding the system overhead, sys_diag runs all commands in a serial fashion (waiting for
each command to complete before running the next), impacting system performance the same as
if an admin were typing these commands one at a time on a console.. with the exception of the
background vmstat/mpstat/iostat/netstat that's done when (-g) gathering performance data over
some interval for report/analysis (which generally has minimal impact on a system, especially
if the sample interval [-I] is not every second).

sys_diag is generally run from /var/tmp as "sys_diag -l" for creating a detailed long report,
or via "sys_diag -g -l" for gathering performance data and generating a long/detailed
config/analysis report. It also offers many other command line parameters, documented within
the script header or via "sys_diag -?".

** READ the Usage below, as well as the Performance Parameters sections for further
   enlightenment.. ;)

NOTE: For the best .html viewing experience, do NOT use the MS Internet Explorer browser, as
      it varies in support of HTML stds for formatting and iframe file inclusion (ending up
      opening many windows vs embedding output files within the single .html report).

      ** USE Netscape, Mozilla, Firefox, etc.. browsers, ensuring that your display
         resolution is set to the maximum resolution, and font sizes are defaults or not made
         too large (for best viewing open full screen)

*** As is the best practice for any environment, first TEST thoroughly on a representative
    TEST configuration PRIOR to running this or making any production system changes.
    (read the sys_diag ksh headers for disclaimer and support notes) ***

** See http://blogs.sun.com/toddjobson/ for a blog relating to system performance, capacity
   planning, and systems architecture / availability.

   For the last BigAdmin released version of sys_diag, or the version from SunFreeware.com,
   see Section 8 at the end of this document for URLs.
___________________________________________________________________________________________
___________________________________________________________________________________________
_________________________________________________________

2.  Common Command Line usage and available parameters :
_________________________________________________________

COMMAND USAGE :

# sys_diag [-a -A -c -C -d_ -D -f_ -g -G -H -I_ -l -L_ -n -o_ -p -P -s -S -T_ -t -u -v -V -h|-? ]

   -a               Application details (included in -l/-A)
   -A               ALL Options are turned on, except Debug and -u
   -c               Configuration details (included in -l/-A)
   -C               Cleanup Files and remove Directory if tar works
   -d path          Base directory for data directory / files
   -D               Debug mode (ksh set -x .. echo statements/variables/evaluations)
   -f input_file    Used with -t to list configuration files to Track changes for
   -g               gather Performance data (2 sec intervals for 5 mins, unless -I |-T exist)
   -G               GATHER Extra Perf data (S10 Dtrace, more lockstats, pmap/pfiles) vs -g
   -h | -?          Help / Command Usage (this listing)
   -H               HA config and stats
   -I secs          Perf Gathering Sample Interval (default is 2 secs)
   -l               Long Listing (most details, but not -g,-V,-A,-t,-D)
   -L label_descr_nospaces
                    (Descriptive Label For Report)
   -n               Network configuration and stats (also included in -l/-A except ndd settings)
   -o outfile       Output filename (stored under sub-dir created)
   -p               Generate Postscript Report, along with .txt, and .html
   -P -d ./data_dir_path
                    Post-process the Perf data skipped with -S and finish .html rpt
   -s               Security configuration
   -S               SKIP POST PROCESSing of Performance data (use -P -d data_dir to complete)
   -t               Track configuration / cfg_file changes (Saves/Rpts cfg/file chgs *see -f)
   -T secs          Perf Gathering Total Duration (default is 300 secs =5 mins)
   -u               unTar'ed (do NOT create a tar file)
   -v               version Information for sys_diag
   -V               Verbose Mode (adds path_to_inst, network dev's ndd settings, mdb, snoop..)
                    Longer message/error/log listings.
                    Additionally, pmap is run if -g ||-G, and the probe duration for Dtrace
                    and lockstat sampling is widened from 2 seconds (during -G) to 5 seconds
                    (if -G && -V). Ping is also run against the default route and google.com
                    to gauge latency.

NOTE: NO args equates to a brief rpt (No -A,-g/I,-l,-t,-D,-V,..)

** Also, note that option/parameter ordering is flexible, as is the use of white space before
   arguments to parameters (or not). The only requirement is to list every option/parameter
   separately with a preceding - (-g -l , but not -gl).

   BOTH of the following command line syntax examples are functionally the same :

   eg.   ./sys_diag -g -I 1 -T 1800 -t -f ./config_files -l
                    OR
         ./sys_diag -g -l -t -f./config_files -I1 -T1800

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
____________________________

3.  Common Usage Examples :
____________________________

./sys_diag -l
      Creates a LONG /detailed configuration rpt (.html/.txt).
      Without -l, the config report created has basic system cfg details.

./sys_diag -g -l
      gathers performance data at the default sampling rate of 2 secs for a total duration
      of 5 mins, adding a color coded performance header/Dashboard Summary section and any
      performance findings/exceptions found to the long (-l) cfg rpt. Also takes (3)
      starting/midpt/endpoint snapshots using minimal lockstat/kstat (1sec)

      NOTE: -g is meant to gather perf data without overhead, therefore only 1 second
      lockstat samples are taken. Use -G and/or -V for more detailed system probing (see
      examples and notes below). Using -V with -g adds pmap/pfiles snapshots, vs. using -G
      to also capture Dtrace and extended lockstat probing.

      * Any time that sys_diag is run with either -g or -G, the performance section of the
      command line output is appended to the file sys_diag_perflog.out, which gets copied
      and archived as part of the final .tar.Z output file.
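
As a quick reference for the probing levels described above, the following ksh / POSIX shell
sketch maps a flag combination to the per-snapshot probe duration stated in this README.
probe_secs is a hypothetical helper written only for illustration; it is not part of
sys_diag. The value printed for -g combined with -V is an assumption (this README only says
that -V adds pmap/pfiles snapshots to -g, not that the lockstat duration changes).

```shell
# Hypothetical helper, not part of sys_diag: prints the per-snapshot probe
# duration (in seconds) implied by this README for a given flag combination.
#   $1 = gather mode: g (light) or G (deep)
#   $2 = V if -V is also given (optional)
probe_secs() {
    mode=$1
    verbose=${2:-}
    case "$mode$verbose" in
        g)  echo 1 ;;   # -g alone: minimal 1 second lockstat samples, no Dtrace
        gV) echo 1 ;;   # ASSUMPTION: -V adds pmap/pfiles to -g, duration unchanged
        G)  echo 2 ;;   # -G alone: 2 second Dtrace/lockstat probes
        GV) echo 5 ;;   # -G with -V: probes widened to 5 seconds
        *)  echo "usage: probe_secs g|G [V]" >&2; return 1 ;;
    esac
}

probe_secs G V
```

Longer probes give more visibility but delay the start of the standard data collection, so
the lighter -g level is the usual choice on a busy production system.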
      (*Examples for capturing ALL output are in the next section *)

./sys_diag -g -I 1 -T 600 -l
      gathers perf data at 1 sec samples for 10 mins and creates a long config rpt as noted
      above. Also does basic start/mid/endpoint sampling using lockstat/kstat/pmap.

./sys_diag -l -C
      creates long config rpt, and Cleans up.. aka removes the data directory after tar.Z
      completes

./sys_diag -d base_directory_path
      (changes the base dir for datafiles from curr dir)

./sys_diag -G -I 1 -T 600 -l
      Gathers DEEP performance & Dtrace/lockstat/pmap data at 1 sec samples for 10 mins &
      creates a long cfg rpt (in addition to the standard data gathering from -g).

      *NOTE: this runs all Dtrace/Lockstat/Pmap probing during 3 snapshot intervals
      (beginning_0/midpoint_1/ and endpoint_#2 snapshots), limiting probing overhead to
      BEFORE/AFTER the standard data gathering begins (vmstat, mpstat, iostat, netstat, ..
      from -g). The MIDPOINT probing occurs at a known point, so as not to confuse this
      activity with other system processing.

      *Because of this, standard data collection may not start for 30+ seconds, or until
      the beginning snapshot (snapshot_#0) is complete. (-g snapshot_#0 activities only take
      a couple seconds to complete, since they do not include any Dtrace/lockstat.. beyond
      1 sec samples).

./sys_diag -G -V -I 1 -T 600
      Gathers DEEP, VERBOSE, performance & Dtrace/lockstat/pmap data at 1 sec samples for
      10 mins, using 5 second Dtrace and Lockstat snapshots (vs. 2 second probes for -G
      alone), in addition to the standard data gathering from -g.

./sys_diag -g -l -S
      (gathers perf data, runs long config rpt, and SKIPS Post-Processing and .html report
      generation)

      ** This allows for completing the post-processing/analysis activities either on
      another system, or at a later time, as long as the data_directory exists (which can
      be extracted from the .tar.Z, then referred to as -d data_dir_path ).
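
The deferred workflow above (gather with -S on one system, finish with -P -d elsewhere) can
be sketched as the dry-run helper below. show_postprocess_cmds is a hypothetical function,
not part of sys_diag; it only prints the commands. The archive name is illustrative, and the
sketch assumes the .tar.Z unpacks the data directory into the current directory.

```shell
# Hypothetical dry-run helper, not part of sys_diag: prints the commands for
# finishing a run that was started with -S. Nothing is executed.
#   $1 = path to the .tar.Z archive copied from the monitored system
#   $2 = name of the data directory contained in that archive
show_postprocess_cmds() {
    archive=$1
    data_dir=$2
    echo "zcat $archive | tar xf -"        # unpack the compressed tar archive
    echo "./sys_diag -P -d ./$data_dir"    # complete post-processing and .html rpt
}

# Example names are illustrative only (ASSUMPTION: archive path and layout).
show_postprocess_cmds /var/tmp/example.tar.Z sysd_socrates_070511_0355
```

Printing the commands first makes it easy to review the paths before running them by hand
on the analysis system.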
      ** See the next example using -P -d data_path **

./sys_diag -P -d ./data_dir_path
      (Completes Skipped Post-Processing & .html rpt creation)

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
_____________________________________________

4.  Capturing sys_diag command line output :
_____________________________________________

To capture all cmd line output (stdout/stderr) to a file use either :

      script [-a] /var/tmp/sys_diag.out     (then after running sys_diag, type exit)
                    OR
      ./sys_diag -g [..other options..] 1>/var/tmp/sys_diag.out 2>&1
      (this will hide all command line output .. all instead going to the file)

NOTE: If the filename used for capturing command line output is /var/tmp/sys_diag.out or
      uses the same path as the -d base_data_directory , then that file will be
      automatically copied as part of the .tar.Z created.

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
_____________________________________________

5.  Executing sys_diag via CRONTAB entries :
_____________________________________________

To run /var/tmp/sys_diag as a CRON entry (@9am every Friday), with data stored in (-d)
/var/tmp, with all cmd line output appended to /var/tmp/sys_diag.out :

(EDITOR=vi; export EDITOR .. then as root run "crontab -e", adding the following line)

0 9 * * 5 /var/tmp/sys_diag -g -d /var/tmp 1>>/var/tmp/sys_diag.out 2>&1

To run /var/tmp/sys_diag for tracking configuration and configuration file changes (-t) at
midnight every day, using an input file to specify the list of files to track and report on
(-f /var/tmp/sysd_tfiles), storing the data directory for runs under the base directory
(-d /var/tmp).
All output from sys_diag gets saved (appended) in /var/tmp/sys_diag.out

0 0 * * * /var/tmp/sys_diag -t -l -f /var/tmp/sysd_tfiles -d /var/tmp 1>>/var/tmp/sys_diag.out 2>&1

Note that the following describes the first 5 fields for crontab entries :

      minute (0-59), hour (0-23), day of the month (1-31), month of the year (1-12),
      day of the week (0-6 with 0=Sunday).

* Listing a field as a comma or dash separated list allows multiple times/days
  (eg. "0 9 * * 1-5" runs Mon-Fri @9am, while "0 9 * * 1,5" runs on Mon & Fri only) *

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
______________________________________________________

6.  sys_diag DIRECTORIES and DATA FILE Descriptions :
______________________________________________________

The following list is a description of the files you will encounter within the default base
directory that sys_diag uses for its data files (or the directory identified with -d) :

[NOTE: "socrates" is the hostname of the system used to generate the following filenames.
       * most files use the following naming convention : sysd_*_hostname_YYMMDD_HHMM ]

# ls ./sys*
-rwxr-xr-x   1 root     root      186900 May 11 03:44 sys_diag
drwxr-xr-x   1 root     root        2560 May 11 03:44 sysd_socrates_070511_0355
drwxr-xr-x   2 root     root        1024 May 11 03:56 sysd_cfg_mgt

The listing above shows the sys_diag script itself, as well as the 2 directories that are
created if run with the -A (or -t) options.

The sysd_hostname_YYMMDD_HHMM directory is the data directory where all the data files are
stored for the reporting and performance data capture. The last directory listed,
sysd_cfg_mgt, is only created/used if you run with either -t or -A to initiate tracking of
system configuration changes.
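
When archiving many runs, it can help to split a data directory name back into its parts.
The sketch below follows the sysd_hostname_YYMMDD_HHMM convention described above, using ksh /
POSIX parameter expansion. parse_sysd_dir is a hypothetical helper, not part of sys_diag, and
it assumes the hostname itself contains no underscore.

```shell
# Hypothetical helper, not part of sys_diag: splits a data directory name of
# the form sysd_hostname_YYMMDD_HHMM into host, date and time fields.
# ASSUMPTION: the hostname contains no underscore characters.
parse_sysd_dir() {
    name=${1%/}            # drop a trailing slash, if any
    name=${name##*/}       # keep only the last path component
    hhmm=${name##*_}       # text after the last underscore
    rest=${name%_*}        # everything before the last underscore
    yymmdd=${rest##*_}     # next field from the right
    rest=${rest%_*}
    host=${rest#sysd_}     # strip the fixed sysd_ prefix
    echo "host=$host date=$yymmdd time=$hhmm"
}

parse_sysd_dir ./sysd_socrates_070511_0355/
```

Parsing from the right keeps the date and time fields correct regardless of the hostname
length.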

The details and descriptions of the contents of both directories are listed below :

# ls ./sysd_socrates_070511_0355/ :

                    SYS_DIAG DATA DIRECTORY (sysd_hostname_YYMMDD_HHMM)

Filename                                          Arg          Description
________________________________________________  ___________  __________________________________________________
sys_diag                                          *            A copy of the sys_diag script used
sys_diag.out                                      -            sys_diag command line output (if captured)
sys_diag_perflog.out                              -g|-G        Performance Summary cmdline output (history)
sysd_socrates_070511_0355.out.html                *            **Final (Main) .html Report**
sysd_socrates_070511_0355.out.ps                  -p           Postscript Report 2 pages/pg landscape
sysd_socrates_070511_0355.out.dash.html           -g|-G        Performance Analysis Dashboard .html piece
sysd_socrates_070511_0355.out                     *            Sys_diag main .txt output file (for .html / .ps)
sysd_net1_socrates_070511_035522.out              -g|-G        NIC1s netstat output file (NIC1= lo0)
sysd_net1_socrates_070511_0357.gr.txt             -g|-G        NIC1s graph-reformatted netstat .txt output file
sysd_net1x_socrates_070511_0357.out               -g|-G        NIC1 netstat traffic (exceptions) beyond thresholds
sysd_net2_socrates_070511_035522.out              -g|-G        NIC2 netstat output file
sysd_net2_socrates_070511_0357.gr.txt             -g|-G        NIC2 graph-reformatted netstat .txt output file
sysd_net2x_socrates_070511_0357.out               -g|-G        NIC2 netstat traffic (exceptions) beyond thresholds
  .... etc.. for all network cards ...
sysd_ifcfg_socrates_070511_0356.out               -n|-l|-g     Network ifconfig -a output for host socrates
sysd_netstata_socrates_070511_035608.out          -n|-l|-g     netstat -a output
sysd_netstat0_socrates_070511_035504.out          -g|-G        netstat -i -a stats summary (snapshot #0)
sysd_netstat1_socrates_070511_035604.out          -g|-G        netstat -i -a stats summary (snapshot #1)
sysd_netstat2_socrates_070511_035722.out          -g|-G        netstat -i -a stats summary (snapshot #2)
sysd_netavg1_socrates_070511_0357.out             -g|-G        Network average/Peak calculations output file #1
sysd_netavg2_socrates_070511_0357.out             -g|-G        Network average/Peak calculations output file #2
sysd_knetb_hme0_socrates_070511_035522.out        -g|-G        Kstat output beginning snapshot for hme0
sysd_knetb_lo0_socrates_070511_035522.out         -g|-G        Kstat output beginning snapshot for lo0
sysd_knete_hme0_socrates_070511_035721.out        -g|-G        Kstat output ending snapshot for hme0
sysd_knete_lo0_socrates_070511_035721.out         -g|-G        Kstat output ending snapshot for lo0
  .... etc.. for all network cards ...
sysd_io_socrates_070511_035503.out                -g|-G        iostat data captured (raw format)
sysd_iox_socrates_070511_0357.out                 -g|-G        iostat exceptions beyond thresholds
sysd_ioavg_socrates_070511_0357.out               -g|-G        iostat device avgs & peaks from post-processing
sysd_iocavg_socrates_070511_0357.out              -g|-G        iostat controller averages
sysd_vxstat0_socrates_070511_035504.out           -g|-G        vxstat FS stats (snapshot #0)
sysd_vxstat1_socrates_070511_035604.out           -g|-G        vxstat FS stats (snapshot #1)
sysd_vxstat2_socrates_070511_035722.out           -g|-G        vxstat FS stats (snapshot #2)
sysd_mp_socrates_070511_035503.out                -g|-G        mpstat data captured (raw format)
sysd_mpx_socrates_070511_0357.out                 -g|-G        mpstat exceptions beyond thresholds
sysd_mdb0_socrates_070511_035504.out              -G && -V     mdb kernel memory profile (snapshot #0)
sysd_mdb1_socrates_070511_035604.out              -G && -V     mdb kernel memory profile (snapshot #1)
sysd_mdb2_socrates_070511_035722.out              -G && -V     mdb kernel memory profile (snapshot #2)
sysd_memx_socrates_070511_0357.out                -g|-G        vmstat memory exceptions
sysd_vm_socrates_070511_035503.out                -g|-G        vmstat data captured (raw format)
sysd_vm_socrates_070511_035503.out.gr.txt         -g|-G        vmstat reformatted graph datafile (S08)
sysd_vmx_socrates_070511_0357.out                 -g|-G        vmstat exceptions beyond thresholds
sysd_vmavg_socrates_070511_0357.out               -g|-G        vmstat averages and Peak entries
sysd_lI0_socrates_070511_035504.out               -g|-G        Lockstat -I -W -s (snap #0)
sysd_lI1_socrates_070511_035604.out               -g|-G        Lockstat -I -W -s (snap #1)
sysd_lI2_socrates_070511_035722.out               -g|-G        Lockstat -I -W -s (snap #2)
sysd_lA0_socrates_070511_035513.out               -g|-G        Lockstat -A -D (snap #0)
sysd_lA1_socrates_070511_035613.out               -g|-G        Lockstat -A -D (snap #1)
sysd_lA2_socrates_070511_035730.out               -g|-G        Lockstat -A -D (snap #2)
sysd_ls0_socrates_070511_035504.out               -G           Lockstat -s -D (snap #0)
sysd_ls1_socrates_070511_035604.out               -G           Lockstat -s -D (snap #1)
sysd_ls2_socrates_070511_035722.out               -G           Lockstat -s -D (snap #2)
sysd_lP0_socrates_070511_035513.out               -G           Lockstat -AP -D (snap #0)
sysd_lP1_socrates_070511_035613.out               -G           Lockstat -AP -D (snap #1)
sysd_lP2_socrates_070511_035730.out               -G           Lockstat -AP -D (snap #2)
sysd_psc0_socrates_070511_035504.out              -g|-G        Ps sorted by cpu (snap #0)
sysd_psc1_socrates_070511_035604.out              -g|-G        Ps sorted by cpu (snap #1)
sysd_psc2_socrates_070511_035721.out              -g|-G        Ps sorted by cpu (snap #2)
sysd_psm0_socrates_070511_035504.out              -g|-G        Ps sorted by mem (snap #0)
sysd_psm1_socrates_070511_035604.out              -g|-G        Ps sorted by mem (snap #1)
sysd_psm2_socrates_070511_035721.out              -g|-G        Ps sorted by mem (snap #2)
sysd_PSc_socrates_070511_035543.out               -g|-G        Ps (baseline) sorted by %cpu
sysd_PSm_socrates_070511_035543.out               -g|-G        Ps (baseline) sorted by %mem
sysd_warn_socrates_070511_035503.out              -l|-g|-G     Warning Messages from dmesg/messages/syslog...
sysd_error_socrates_070511_035503.out             -l|-g|-G     Error Messages from dmesg/messages/syslog...
sysd_pkg_socrates_070511_035503.out               -l           pkginfo -l (listing)
sysd_snoop_socrates_070511_035522.out             -g &(-n|-V)  network snoop output
sysd_swapl_socrates_070511_035622.out             -l|(-g|-G)   Physical Swap (swap -l) and phys RAM output
sysd_sys_socrates_070511_035503.out               -l|-c        /etc/system kernel parameters/tunables file
sysd_lwp_socrates_070511_035543.out               -l|-g|-G     Top processes via ps ..sorted by # LWP
sysd_cputrk0_socrates_070511_035504.out           -G           Cputrack top PID data (TLB_misses & % FP) (snap #0)
sysd_cputrk1_socrates_070511_035604.out           -G           Cputrack top PID data (TLB_misses & % FP) (snap #1)
sysd_cputrk2_socrates_070511_035704.out           -G           Cputrack top PID data (TLB_misses & % FP) (snap #2)
sysd_pmap0_socrates_070511_035504.out             -G           Top 5 PID details (pmap, pfiles, ptree) (snap #0)
sysd_pmap1_socrates_070511_035604.out             -G           Top 5 PID details (pmap, pfiles, ptree) (snap #1)
sysd_pmap2_socrates_070511_035704.out             -G           Top 5 PID details (pmap, pfiles, ptree) (snap #2)
sysd_dpio0_socrates_070511_035504.out             -G           Dtrace : IOsnoop for top pids (snap #0)
sysd_dpio1_socrates_070511_035604.out             -G           Dtrace : IOsnoop for top pids (snap #1)
sysd_dpio2_socrates_070511_035704.out             -G           Dtrace : IOsnoop for top pids (snap #2)
sysd_diow0_socrates_070511_035504.out             -G           Dtrace : File IO/IO waits (snap #0)
sysd_diow1_socrates_070511_035604.out             -G           Dtrace : File IO/IO waits (snap #1)
sysd_diow2_socrates_070511_035704.out             -G           Dtrace : File IO/IO waits (snap #2)
sysd_dmpc0_socrates_070511_035504.out             -G           Dtrace : Top ICSW/SMTX/XCAL (#0, if -V||avg_icsw > HWM)
sysd_dmpc1_socrates_070511_035604.out             -G           Dtrace : Top ICSW/SMTX/XCAL (#1, if -V||avg_icsw > HWM)
sysd_dmpc2_socrates_070511_035704.out             -G           Dtrace : Top ICSW/SMTX/XCAL (#2, if -V||avg_icsw > HWM)
sysd_dsyscall_counts0_socrates_070511_035504.out  -G           Dtrace syscall counts by call (snap #0)
sysd_dsyscall_counts1_socrates_070511_035604.out  -G           Dtrace syscall counts by call (snap #1)
sysd_dsyscall_counts2_socrates_070511_035704.out  -G           Dtrace syscall counts by call (snap #2)
sysd_dcalls_by_procs0_socrates_070511_035504.out  -G           Dtrace process syscalls (snap #0)
sysd_dcalls_by_procs1_socrates_070511_035604.out  -G           Dtrace process syscalls (snap #1)
sysd_dcalls_by_procs2_socrates_070511_035704.out  -G           Dtrace process syscalls (snap #2)
sysd_dintrtm0_socrates_070511_035504.out          -G           Dtrace Interrupt times (snap #0)
sysd_dintrtm1_socrates_070511_035604.out          -G           Dtrace Interrupt times (snap #1)
sysd_dintrtm2_socrates_070511_035704.out          -G           Dtrace Interrupt times (snap #2)
sysd_dsdtcnt0_socrates_070511_035504.out          -G           Dtrace sdt_ counts (snap #0)
sysd_dsdtcnt1_socrates_070511_035604.out          -G           Dtrace sdt_ counts (snap #1)
sysd_dsdtcnt2_socrates_070511_035704.out          -G           Dtrace sdt_ counts (snap #2)
sysd_dsinfo_by_procs0_socrates_070511_035504.out  -G           Dtrace process sysinfo counts (snap #0)
sysd_dsinfo_by_procs1_socrates_070511_035604.out  -G           Dtrace process sysinfo counts (snap #1)
sysd_dsinfo_by_procs2_socrates_070511_035704.out  -G           Dtrace process sysinfo counts (snap #2)
sysd_dtcp_rx0_socrates_070511_035504.out          -G           Dtrace process tcp reads (snap #0)
sysd_dtcp_rx1_socrates_070511_035604.out          -G           Dtrace process tcp reads (snap #1)
sysd_dtcp_rx2_socrates_070511_035704.out          -G           Dtrace process tcp reads (snap #2)
sysd_dtcp_tx0_socrates_070511_035504.out          -G           Dtrace process tcp writes (snap #0)
sysd_dtcp_tx1_socrates_070511_035604.out          -G           Dtrace process tcp writes (snap #1)
sysd_dtcp_tx2_socrates_070511_035704.out          -G           Dtrace process tcp writes (snap #2)
sysd_dR_by_procs0_socrates_070511_035504.out      -G           Dtrace process read calls (snap #0)
sysd_dR_by_procs1_socrates_070511_035604.out      -G           Dtrace process read calls (snap #1)
sysd_dR_by_procs2_socrates_070511_035704.out      -G           Dtrace process read calls (snap #2)
sysd_dW_by_procs0_socrates_070511_035504.out      -G           Dtrace process write calls (snap #0)
sysd_dW_by_procs1_socrates_070511_035604.out      -G           Dtrace process write calls (snap #1)
sysd_dW_by_procs2_socrates_070511_035704.out      -G           Dtrace process write calls (snap #2)
lockstat_files.out                                -g|-G        Lockstat syntax and output file list
socrates_change_log.out                           -t|-A        Configuration Tracking change log copy
README_sys_diag.txt                               *            This file.

** NOTE: the vmstat and netstat .gr.txt files above can easily be imported/inserted using
   StarOffice 8 or Excel to Generate GRAPHS.

   For StarOffice 8, Insert->sheet from file (delimited by space).. then hide any columns
   that you don't want graphed.. following the wizard for graph choices/options.

   For Excel, File->Open (type *.txt) -> Text Import Wizard (Delimited-> Space), then after
   import, delete un-needed columns.

--------------------------------------------------------------------------------------------

                    **Configuration Management / Tracking Directory**

# ls ./sysd_cfg_mgt

Filename                      Description
____________________________  _________________________________________________
cfgadm_last.cfg               Last captured /usr/sbin/cfgadm output
eeprom_last.cfg               Last captured /usr/sbin/eeprom output
metastat_last.cfg             Last captured /usr/sbin/metastat output
metadb_last.cfg               Last captured /usr/sbin/metadb output
psrinfo_last.cfg              Last captured /usr/sbin/psrinfo output
prtconf_last.cfg              Last captured /usr/sbin/prtconf output
prtdiag_last.cfg              Last captured /usr/platform/*/sbin/prtdiag -v
sysdef_last.cfg               Last captured /usr/sbin/sysdef -D output
F_hosts_last.cfg              Last captured FILE: /etc/hosts
F_mnttab_last.cfg             Last captured FILE: /etc/mnttab
F_nsswitch_last.cfg           Last captured FILE: /etc/nsswitch.conf
F_resolve_last.cfg            Last captured FILE: /etc/resolv.conf
F_syslog_last.cfg             Last captured FILE: /etc/syslog.conf file
F_system_last.cfg             Last captured FILE: /etc/system file
socrates_change_log.out       Change log of past/current configuration chgs
070511_0356_cfgadm.cfg        Date stamped historical cmd output files
070511_0356_df.cfg
070511_0356_eeprom.cfg
070511_0356_metastat.cfg
070511_0356_metadb.cfg
070511_0356_psrinfo.cfg
070511_0356_prtconf.cfg
070511_0356_prtdiag.cfg
070511_0356_sysdef.cfg
070511_0356_F_hosts.cfg       Date stamped historical configuration FILES
070511_0356_F_mnttab.cfg
070511_0356_F_nsswitch.cfg
070511_0356_F_resolve.cfg
070511_0356_F_syslog.cfg
070511_0356_F_system.cfg

******* ** NOTE: If the -f input_file option is used with -t, then all files listed within
           the input_file (as one absolute file path per line) will also be tracked for chgs.

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
_________________________________

7.  Sample Command Line Output :
_________________________________

The following output was captured recently from running sys_diag v.7.04 on a Sun Ultra60
2 cpu test system in my lab (note that sys_diag has been run on virtually every type of Sun
system, running Solaris 2.6 -> S10).

Note the list of utilities run and types of data captured, as well as the final performance
summary (a small summary of the complete color coded HTML dashboard available in the full
.html report). Realize that 70% of sys_diag 's benefit is in working from the .html
aggregated report file.. that links and correlates all the independent data files together
with findings and exceptions via a nice color-coded header / dashboard / and Table of
Contents. (either way, the legwork is all done for you !)

The following example does the deepest level of Performance data Gathering (-G, which
includes Dtrace and pmap/pfiles snapshots vs. -g for light-weight perf gathering), Verbose
output (-V), in addition to creation of a long/detailed configuration report (-l). The
sampling rate used is 1 second intervals (-I1) for a total duration of 298 seconds (-T298).

*Without -I || -T, the defaults are 2 second samples for 5 minutes total data gathering.
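
As a rough guide to how much data a run collects, the sketch below computes the number of
samples per collector from the -I and -T values. sample_count is a hypothetical helper, not
part of sys_diag. It assumes the count is simply the total duration divided by the interval;
the vmstat/iostat/mpstat command lines in this section's sample output (interval 1, count
298) are consistent with that, but only for the 1 second case.

```shell
# Hypothetical helper, not part of sys_diag: estimates samples per collector.
#   $1 = sample interval in seconds (-I), default 2
#   $2 = total duration in seconds  (-T), default 300
# ASSUMPTION: count = duration / interval (integer division).
sample_count() {
    interval=${1:-2}
    total=${2:-300}
    echo $(( total / interval ))
}

sample_count          # defaults: 2 second samples for 5 minutes
sample_count 1 298    # the -I1 -T298 run shown in this section
```

Under that assumption the two calls print 150 and 298, so halving the interval doubles the
size of each raw vmstat/mpstat/iostat/netstat data file for the same duration.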

Also note that when -G && -V are used together, the initial Dtrace and Lockstat snapshots
take a couple minutes to complete, prior to beginning the data collection for 298 seconds
(since the duration of probing is expanded with -V to 5 seconds vs 2 seconds with -G alone,
or 1 second minimal lockstat data with -g.. but no Dtrace or pmap/pfiles visibility).

root@/var/tmp # ./sys_diag -G -V -l -I1 -T298

sys_diag:0717_033209: GATHER Extra PERFORMANCE DATA (-G)
sys_diag:0717_033209: VERBOSE (-V)
sys_diag:0717_033209: INTERVAL : 1 second sampling (-I1)
sys_diag:0717_033209: TIME Duration: 298 seconds (-T298)
sys_diag:0717_033209: LONG report (-l)
sys_diag:0717_033209: # Creating ... README_sys_diag.txt ...

sys_diag: ------- Beginning Process SNAPSHOT (# 0) -------
sys_diag:0717_033209: Dtrace: TCP write bytes by process ...(_dtcp_tx Snap 0)
sys_diag:0717_033209: Dtrace: TCP read bytes by process ... (_dtcp_rx Snap 0)
sys_diag:0717_033209: Dtrace: systemwide IO / IO wait... (_diow Snap 0)
sys_diag:0717_033235: Dtrace: Syscall count by process... (_dcalls_ Snap 0)
sys_diag:0717_033243: Dtrace: Syscall count by syscall... (_dsyscall_ Snap 0)
sys_diag:0717_033251: Dtrace: Read bytes by process... (_dR_ Snap 0)
sys_diag:0717_033258: Dtrace: Write bytes by process... (_dW_ Snap 0)
sys_diag:0717_033306: Dtrace: Sysinfo counts by process... (_dsinfo_ Snap 0)
sys_diag:0717_033314: Dtrace: Sdt_counts ... (_dsdtcnt_ Snap 0)
sys_diag:0717_033321: Dtrace: Interupt Times [sdt:::intr].. (_dintrtm_ Snap 0)
sys_diag:0717_033321: # ps -e -o ...(by %CPU) ... Snapshot # 0
sys_diag:0717_033321: # ps -e -o ...(by %MEM) ... Snapshot # 0
sys_diag:0717_033332: # pmap -xs 519 ...
sys_diag:0717_033332: # pmap -S 519 ...
sys_diag:0717_033332: # pmap -r 519 ...
sys_diag:0717_033332: # ptree -a 519 ...
sys_diag:0717_033332: # pfiles 519 ...
sys_diag:0717_033333: Dtrace: IO by process 519 ... (_dpio Snap 0)
sys_diag:0717_033339: # pmap -xs 448 ...
sys_diag:0717_033339: # pmap -S 448 ...
sys_diag:0717_033339: # pmap -r 448 ...
sys_diag:0717_033339: # ptree -a 448 ...
sys_diag:0717_033339: # pfiles 448 ...
sys_diag:0717_033340: Dtrace: IO by process 448 ... (_dpio Snap 0)
sys_diag:0717_033346: # pmap -xs 90 ...
sys_diag:0717_033346: # pmap -S 90 ...
sys_diag:0717_033346: # pmap -r 90 ...
sys_diag:0717_033346: # ptree -a 90 ...
sys_diag:0717_033346: # pfiles 90 ...
sys_diag:0717_033347: Dtrace: IO by process 90 ... (_dpio Snap 0)
sys_diag:0717_033353: # pmap -xs 825 ...
sys_diag:0717_033353: # pmap -S 825 ...
sys_diag:0717_033353: # pmap -r 825 ...
sys_diag:0717_033353: # ptree -a 825 ...
sys_diag:0717_033353: # pfiles 825 ...
sys_diag:0717_033353: Dtrace: IO by process 825 ... (_dpio Snap 0)
sys_diag:0717_033353: # /usr/bin/netstat -i -a ...
sys_diag:0717_033400: # Snapshot Kernel Memory Usage.. ::memstat | mdb -k ...
sys_diag:0717_033409: # /usr/sbin/lockstat -IW -n 100000 -s 13 sleep 5 ...
sys_diag:0717_033419: # /usr/sbin/lockstat -A -n 90000 -D15 sleep 5 ...
sys_diag:0717_033431: # /usr/sbin/lockstat -A -s8 -n 90000 -D10 sleep 5 ...
sys_diag:0717_033446: # /usr/sbin/lockstat -AP -n 90000 -D10 sleep 5 ...
sys_diag:0717_033521: Dtrace: Involuntary Context Switches (icsw) by process .. (_dmpc Snap 0)
sys_diag:0717_033526: Dtrace: Cross CPU Calls (xcal) caused by process ........ (_dmpc Snap 0)
sys_diag:0717_033531: Dtrace: MUTEX try lock (smtx) by lwp/process ............ (_dmpc Snap 0)

sys_diag: --**-- (Background) DATA COLLECTION FOR 298 secs STARTED --**--
sys_diag:0717_033531: # /usr/bin/vmstat -q 1 298 > ./sysd_socrates_070717_0332/sysd_vm_socrates_070717_033209.out 2>&1 &
sys_diag:0717_033531: # /usr/bin/iostat -xn 1 298 > ./sysd_socrates_070717_0332/sysd_io_socrates_070717_033209.out 2>&1 &
sys_diag:0717_033531: # /usr/bin/mpstat -q 1 298 > ./sysd_socrates_070717_0332/sysd_mp_socrates_070717_033209.out 2>&1 &
sys_diag:0717_033537: # /usr/bin/netstat -i -I lo0 1 298 > ./sysd_socrates_070717_0332/sysd_net1_socrates_070717_033537.out 2>&1 &
sys_diag:0717_033537: # /usr/bin/kstat -p -T u -n lo0 1> ./sysd_socrates_070717_0332/sysd_knetb_lo0_socrates_070717_033537.out 2>&1
sys_diag:0717_033538: # /usr/bin/netstat -i -I hme0 1 298 > ./sysd_socrates_070717_0332/sysd_net2_socrates_070717_033538.out 2>&1 &
sys_diag:0717_033538: # /usr/bin/kstat -p -T u -n hme0 1> ./sysd_socrates_070717_0332/sysd_knetb_hme0_socrates_070717_033538.out 2>&1
sys_diag:0717_033538: # /usr/sbin/snoop ...

sys_diag: ------- (Foreground) Gathering System Configuration Details -------
sys_diag:0717_033539: # uname -a ...
sys_diag:0717_033539: # hostid ...
sys_diag:0717_033539: # domainname (DNS) ...
sys_diag:0717_033539: ###### SYSTEM CONFIGURATION / DEVICE INFO ######
sys_diag:0717_033539: # prtdiag ...
sys_diag:0717_033539: # prtconf | grep Memory ...
sys_diag:0717_033539: # /usr/sbin/psrinfo -v ...
sys_diag:0717_033539: # /usr/sbin/psrinfo -pv ...
sys_diag:0717_033539: # /usr/sbin/psrset -q ...
sys_diag:0717_033539: # cfgadm -l ...
sys_diag:0717_033539: # cfgadm -al ...
sys_diag:0717_033539: # cfgadm -v ...
sys_diag:0717_033539: # cfgadm -av | grep memory | grep perm ...
sys_diag:0717_033541: ###### E10K / E25K / SunFire System INFO ######
sys_diag:0717_033541: # Checking Kernel Cage settings ...
sys_diag:0717_033541: # eeprom ...
sys_diag:0717_033541: # /usr/bin/coreadm ...
sys_diag:0717_033541: # /usr/sbin/dumpadm ...
sys_diag:0717_033541: # modinfo ...
sys_diag:0717_033541: # /usr/sbin/lustatus ...
sys_diag:0717_033541: # cat /etc/path_to_inst ...
sys_diag:0717_033542: ###### WORKLOAD CHARACTERIZATION ######
sys_diag:0717_033542: # prstat -c -a 1 1 ...
sys_diag:0717_033542: # prstat -c -J 1 1 ...
sys_diag:0717_033542: # prstat -c -Z 1 1 ...
sys_diag:0717_033542: # prstat -c 1 2 ...
sys_diag:0717_033544: # prstat -c -v 1 3 ...
sys_diag:0717_033546: # ps -e -o ...(by %CPU) ...
sys_diag:0717_033546: # ps -e -o ...(by %MEM) ...
sys_diag:0717_033546: # ps -e -o ...(by LWP) ...
sys_diag:0717_033546: ###### PERFORMANCE PROFILING (System / Kernel) ######
sys_diag:0717_033547: # vmstat 1 5 ...
sys_diag:0717_033551: # /usr/bin/mpstat 1 3 ...
sys_diag:0717_033551: # /usr/bin/isainfo -v ...
sys_diag:0717_033553: # /usr/bin/ipcs -a ...
sys_diag:0717_033553: # /usr/bin/pagesize ...
sys_diag:0717_033553: # swap -l ...
sys_diag:0717_033553: # swap -s ...
sys_diag:0717_033553: # /usr/bin/vmstat -s ...
sys_diag:0717_033553: # /usr/bin/kstat -n system_pages ...
sys_diag:0717_033553: # /usr/bin/kstat -n vm ...
sys_diag:0717_033554: # /usr/sbin/trapstat 1 2 ...
sys_diag:0717_033554: # /usr/sbin/trapstat -t 1 2 ...
sys_diag:0717_033554: # /usr/sbin/trapstat -l ...
sys_diag:0717_033554: # /usr/sbin/trapstat -t 1 2 ...
sys_diag:0717_033554: # /usr/sbin/trapstat -T 1 2 ...
sys_diag:0717_033554: # /usr/sbin/intrstat 1 2 ...
sys_diag:0717_033554: # /usr/bin/vmstat -i ...
sys_diag:0717_033554: ###### KERNEL ZONES/ SRM / Acctg / TUNABLES ######
sys_diag:0717_033554: # /usr/sbin/zoneadm list -v ...
sys_diag:0717_033554: # /usr/bin/projects -l ...
sys_diag:0717_033554: # /usr/sbin/psrset -i ...
sys_diag:0717_033554: # /usr/sbin/psrset -p ...
sys_diag:0717_033554: # /usr/sbin/psrset -q ...
sys_diag:0717_033554: # /usr/sbin/rctladm -l ...
sys_diag:0717_033554: # /usr/bin/priocntl -l ...
sys_diag:0717_033554: # /usr/sbin/acctadm ...
sys_diag:0717_033554: # /usr/sbin/acctadm -r...
sys_diag:0717_033554: # tail -80 /etc/system ...
sys_diag:0717_033554: # sysdef | tail -85 ...
sys_diag:0717_033554: # tail -40 /etc/init.d/sysetup ...
sys_diag:0717_033554: # cat /etc/power.conf ...
sys_diag:0717_033612: ###### STORAGE / ARRAY INFO ######
sys_diag:0717_033612: # prtconf -pv ...
sys_diag:0717_033613: # luxadm probe ...
sys_diag:0717_033614: ###### STORAGE VOLUME MANAGEMENT INFO ######
sys_diag:0717_033614: ###### SOLARIS (SDS/SVM) VOLUME MANAGER Info ######
sys_diag:0717_033614: # /sbin/metadb ...
sys_diag:0717_033614: # /sbin/metastat ...
sys_diag:0717_033614: # /sbin/metastat -p...
sys_diag:0717_033614: ###### Sun STMS / MPxIO Info ######
sys_diag:0717_033614: # cat /kernel/drv/fp.conf ...
sys_diag:0717_033614: # cat /kernel/drv/fcp.conf ...
sys_diag:0717_033614: ###### FILESYSTEM INFO ######
sys_diag:0717_033614: # df ...
sys_diag:0717_033614: # df -k ...
sys_diag:0717_033614: # mount -v ...
sys_diag:0717_033614: # /usr/sbin/showmount -a ...
sys_diag:0717_033614: # cat /etc/vfstab ...
sys_diag:0717_033614: # /usr/bin/cachefsstat ...
sys_diag:0717_033614: ###### I/O STATS ######
sys_diag:0717_033614: # /usr/bin/iostat -nxe 3 2 ...
sys_diag:0717_033614: # /usr/bin/iostat -xcC 3 2 ...
sys_diag:0717_033614: # /usr/bin/iostat -xnE ...
sys_diag:0717_033614: ###### NFS INFO ######
sys_diag:0717_033614: # /usr/bin/nfsstat ...
sys_diag:0717_033614: # /usr/bin/nfsstat -m ...
sys_diag:0717_033614: ###### NETWORKING INFO ######
sys_diag:0717_033614: # cat /etc/hosts ...
sys_diag:0717_033614: # /usr/sbin/ifconfig -a ...
sys_diag:0717_033614: # /usr/bin/netstat -i ...
sys_diag:0717_033614: # /usr/bin/netstat -r ...
sys_diag:0717_033614: # /usr/sbin/arp -a ...
sys_diag:0717_033614: # /usr/sbin/ping -s 192.168.200.1 56 10 ...
sys_diag:0717_033614: # /usr/sbin/ping -s 192.168.200.1 1016 10 ...
sys_diag:0717_033614: # /usr/sbin/ping -s google.com 56 10 ...
sys_diag:0717_033614: # /usr/sbin/ping -s google.com 1016 10 ...
sys_diag:0717_033614: # cat /etc/hostname.hme0 ...
sys_diag:0717_033614: # cat /etc/inet/networks ...
sys_diag:0717_033614: # cat /etc/netmasks ...
sys_diag:0717_033614: # tail -30 /etc/inet/ntp.server ...
sys_diag:0717_033614: # /usr/sbin/dladm show-dev ...
sys_diag:0717_033614: # /usr/sbin/dladm show-link ...
sys_diag:0717_033614: # /usr/sbin/dladm show-aggr ...
sys_diag:0717_033614: # /usr/sbin/pntadm -L ...
sys_diag:0717_033703: # /usr/bin/kstat -c net ...
sys_diag:0717_033703: # ndd -get /dev/tcp ...
sys_diag:0717_033703: # ndd -get /dev/udp ...
sys_diag:0717_033703: # ndd -get /dev/ip ...
sys_diag:0717_033706: # ndd -set /dev/hme instance 0 ...
sys_diag:0717_033706: # ndd -get /dev/hme ...
sys_diag:0717_033706: # /usr/bin/netstat -a ...
sys_diag:0717_033711: # /usr/bin/netstat -s ...
sys_diag:0717_033711: ###### TTY / MODEM INFO ######
sys_diag:0717_033711: # /usr/sbin/pmadm -l ...
sys_diag:0717_033711: # cat /etc/remote ...
sys_diag:0717_033711: # cat /var/adm/aculog ...
sys_diag:0717_033711: ###### USER / ACCOUNT / GROUP Info ######
sys_diag:0717_033711: # w ...
sys_diag:0717_033711: # who -a ...
sys_diag:0717_033711: # cat /etc/passwd ...
sys_diag:0717_033711: # cat /etc/group ...
sys_diag:0717_033711: ###### SERVICES / NAMING RESOLUTION ######
sys_diag:0717_033711: # /usr/bin/svcs -v ...
sys_diag:0717_033711: # cat /etc/services ...
sys_diag:0717_033711: # cat /etc/inetd.conf ...
sys_diag:0717_033711: # cat /etc/inittab ...
sys_diag:0717_033711: # cat /etc/nsswitch.conf ...
sys_diag:0717_033711: # cat /etc/resolv.conf ...
sys_diag:0717_033711: # cat /etc/auto_master ...
sys_diag:0717_033711: # cat /etc/auto_home ...
sys_diag:0717_033712: # /usr/bin/ypwhich ...
sys_diag:0717_033712: # /usr/bin/nisdefaults ...
sys_diag:0717_033712: ###### SECURITY / CONFIG FILES ######
sys_diag:0717_033712: # cat /etc/syslog.conf ...
sys_diag:0717_033712: # cat /etc/pam.conf ...
sys_diag:0717_033712: # cat /etc/default/login ...
sys_diag:0717_033712: # tail -250 /var/adm/sulog ...
sys_diag:0717_033712: # /usr/bin/last reboot ...
sys_diag:0717_033712: # /usr/bin/last -200 ...
sys_diag:0717_033712: # /usr/sbin/ipf -T list ...
sys_diag:0717_033712: # cat /etc/ipf/ipf.conf ...
sys_diag:0717_033712: # cat /etc/ipf/pfil.ap ...
sys_diag:0717_033712: # /usr/sbin/ipnat -vls ...
sys_diag:0717_033713: ###### HA/ CLUSTERING INFO ######
sys_diag:0717_033713: ###### SUN N1 Configuration INFO ######
sys_diag:0717_033713: ###### APPLICATION / ORACLE CONFIG FILES ######
sys_diag:0717_033713: ###### PACKAGE INFO / SOLARIS REGISTRY ######
sys_diag:0717_033713: # /usr/bin/prodreg browse ...
sys_diag:0717_033713: # /usr/bin/pkginfo ...
sys_diag:0717_033713: # /usr/bin/pkginfo -l ...
sys_diag:0717_033713: ###### PATCH INFO ######
sys_diag:0717_033713: # /usr/bin/showrev -p ...
sys_diag:0717_033713: # /usr/sadm/bin/smpatch analyze NOT RUN, passwd required....
sys_diag:0717_033753: ###### CRONTAB FILE LISTINGS ######
sys_diag:0717_033753: ###### FMD / SYSTEM MESSAGE/LOG FILES ######
sys_diag:0717_033753: # /usr/sbin/fmadm config ...
sys_diag:0717_033753: # /usr/sbin/fmdump ...
sys_diag:0717_033753: # /usr/sbin/fmstat ...
sys_diag:0717_033753: # tail -250 /var/adm/messages ...
sys_diag:0717_033753: # /usr/bin/dmesg ...
sys_diag:0717_033753: # tail -500 /var/log/syslog ...
sys_diag:0717_033754: ...WAITING 12 seconds for midpoint data collection...

sys_diag: ------- MidPoint Process SNAPSHOT (# 1) -------

sys_diag:0717_033806: Dtrace: TCP write bytes by process ...(_dtcp_tx Snap 1)
sys_diag:0717_033806: Dtrace: TCP read bytes by process ... (_dtcp_rx Snap 1)
sys_diag:0717_033806: Dtrace: systemwide IO / IO wait... (_diow Snap 1)
sys_diag:0717_033832: Dtrace: Syscall count by process... (_dcalls_ Snap 1)
sys_diag:0717_033840: Dtrace: Syscall count by syscall... (_dsyscall_ Snap 1)
sys_diag:0717_033847: Dtrace: Read bytes by process... (_dR_ Snap 1)
sys_diag:0717_033855: Dtrace: Write bytes by process... (_dW_ Snap 1)
sys_diag:0717_033903: Dtrace: Sysinfo counts by process... (_dsinfo_ Snap 1)
sys_diag:0717_033911: Dtrace: Sdt_counts ... (_dsdtcnt_ Snap 1)
sys_diag:0717_033918: Dtrace: Interupt Times [sdt:::intr].. (_dintrtm_ Snap 1)
sys_diag:0717_033918: # ps -e -o ...(by %CPU) ... Snapshot # 1
sys_diag:0717_033918: # ps -e -o ...(by %MEM) ... Snapshot # 1
sys_diag:0717_033929: # pmap -xs 4188 ...
sys_diag:0717_033929: # pmap -S 4188 ...
sys_diag:0717_033929: # pmap -r 4188 ...
sys_diag:0717_033929: # ptree -a 4188 ...
sys_diag:0717_033929: # pfiles 4188 ...
sys_diag:0717_033929: Dtrace: IO by process 4188 ... (_dpio Snap 1)
sys_diag:0717_033935: # pmap -xs 4181 ...
sys_diag:0717_033935: # pmap -S 4181 ...
sys_diag:0717_033935: # pmap -r 4181 ...
sys_diag:0717_033935: # ptree -a 4181 ...
sys_diag:0717_033935: # pfiles 4181 ...
sys_diag:0717_033936: Dtrace: IO by process 4181 ... (_dpio Snap 1)
sys_diag:0717_033942: # /usr/bin/netstat -i -a ...
sys_diag:0717_033942: # Snapshot Kernel Memory Usage.. ::memstat | mdb -k ...
sys_diag:0717_033952: # /usr/sbin/lockstat -IW -n 100000 -s 13 sleep 5 ...
sys_diag:0717_034002: # /usr/sbin/lockstat -A -n 90000 -D15 sleep 5 ...
sys_diag:0717_034015: # /usr/sbin/lockstat -A -s8 -n 90000 -D10 sleep 5 ...
sys_diag:0717_034037: # /usr/sbin/lockstat -AP -n 90000 -D10 sleep 5 ...
sys_diag:0717_034051: Dtrace: Involuntary Context Switches (icsw) by process .. (_dmpc Snap 1)
sys_diag:0717_034056: Dtrace: Cross CPU Calls (xcal) caused by process ........ (_dmpc Snap 1)
sys_diag:0717_034101: Dtrace: MUTEX try lock (smtx) by lwp/process ............ (_dmpc Snap 1)

sys_diag: ------- EndPoint Process SNAPSHOT (# 2) -------

sys_diag:0717_034101: # /usr/bin/kstat -p -T u -n lo0 2>&1
sys_diag:0717_034101: # /usr/bin/kstat -p -T u -n hme0 2>&1
sys_diag:0717_034107: Dtrace: TCP write bytes by process ...(_dtcp_tx Snap 2)
sys_diag:0717_034107: Dtrace: TCP read bytes by process ... (_dtcp_rx Snap 2)
sys_diag:0717_034107: Dtrace: systemwide IO / IO wait... (_diow Snap 2)
sys_diag:0717_034133: Dtrace: Syscall count by process... (_dcalls_ Snap 2)
sys_diag:0717_034141: Dtrace: Syscall count by syscall... (_dsyscall_ Snap 2)
sys_diag:0717_034149: Dtrace: Read bytes by process... (_dR_ Snap 2)
sys_diag:0717_034156: Dtrace: Write bytes by process... (_dW_ Snap 2)
sys_diag:0717_034204: Dtrace: Sysinfo counts by process... (_dsinfo_ Snap 2)
sys_diag:0717_034212: Dtrace: Sdt_counts ... (_dsdtcnt_ Snap 2)
sys_diag:0717_034220: Dtrace: Interupt Times [sdt:::intr].. (_dintrtm_ Snap 2)
sys_diag:0717_034220: # ps -e -o ...(by %CPU) ... Snapshot # 2
sys_diag:0717_034220: # ps -e -o ...(by %MEM) ... Snapshot # 2
sys_diag:0717_034230: # pmap -xs 519 ...
sys_diag:0717_034230: # pmap -S 519 ...
sys_diag:0717_034230: # pmap -r 519 ...
sys_diag:0717_034230: # ptree -a 519 ...
sys_diag:0717_034230: # pfiles 519 ...
sys_diag:0717_034231: Dtrace: IO by process 519 ... (_dpio Snap 2)
sys_diag:0717_034237: # pmap -xs 448 ...
sys_diag:0717_034237: # pmap -S 448 ...
sys_diag:0717_034237: # pmap -r 448 ...
sys_diag:0717_034237: # ptree -a 448 ...
sys_diag:0717_034237: # pfiles 448 ...
sys_diag:0717_034238: Dtrace: IO by process 448 ... (_dpio Snap 2)
sys_diag:0717_034244: # pmap -xs 90 ...
sys_diag:0717_034244: # pmap -S 90 ...
sys_diag:0717_034244: # pmap -r 90 ...
sys_diag:0717_034244: # ptree -a 90 ...
sys_diag:0717_034244: # pfiles 90 ...
sys_diag:0717_034245: Dtrace: IO by process 90 ... (_dpio Snap 2)
sys_diag:0717_034251: # pmap -xs 825 ...
sys_diag:0717_034251: # pmap -S 825 ...
sys_diag:0717_034251: # pmap -r 825 ...
sys_diag:0717_034251: # ptree -a 825 ...
sys_diag:0717_034251: # pfiles 825 ...
sys_diag:0717_034251: Dtrace: IO by process 825 ... (_dpio Snap 2)
sys_diag:0717_034251: # /usr/bin/netstat -i -a ...
sys_diag:0717_034258: # Snapshot Kernel Memory Usage.. ::memstat | mdb -k ...
sys_diag:0717_034307: # /usr/sbin/lockstat -IW -n 100000 -s 13 sleep 5 ...
sys_diag:0717_034317: # /usr/sbin/lockstat -A -n 90000 -D15 sleep 5 ...
sys_diag:0717_034329: # /usr/sbin/lockstat -A -s8 -n 90000 -D10 sleep 5 ...
sys_diag:0717_034344: # /usr/sbin/lockstat -AP -n 90000 -D10 sleep 5 ...
sys_diag:0717_034358: Dtrace: Involuntary Context Switches (icsw) by process .. (_dmpc Snap 2)
sys_diag:0717_034404: Dtrace: Cross CPU Calls (xcal) caused by process ........ (_dmpc Snap 2)
sys_diag:0717_034408: Dtrace: MUTEX try lock (smtx) by lwp/process ............ (_dmpc Snap 2)

sys_diag:0717_034408: ------- Data Collection COMPLETE -------

sys_diag:0717_034408: ###### SYSTEM ANALYSIS : INITIAL FINDINGS ... ######
sys_diag:0717_034414: ###### PERFORMANCE DATA : POTENTIAL ISSUES ######
_____________________________________________________________________________________
sys_diag:0717_034414: ## Analyzing VMSTAT CPU Datafile : ./sysd_socrates_070717_0332/sysd_vm_socrates_070717_033209.out ...

  * NOTE: 2.6936 % : 8 of 297 VMSTAT CPU entries are WARNINGS!! *

  TOTAL CPU AVGS : RUNQ= 0.1 : BThr= 0.0 : USR= 15.0 : SYS= 11.2 : IDLE= 73.5
  PEAK CPU HWMs  : RUNQ= 8 : BThr= 0 : USR= 51 : SYS= 96 : IDLE= 0
___________________________________________________________________________________
sys_diag:0717_034414: ## Analyzing VMSTAT MEMORY from Datafile : ./sysd_socrates_070717_0332/sysd_vm_socrates_070717_033209.out ...

  * NOTE: 0.673401 % : 2 of 297 VMSTAT MEMORY entries are WARNINGS!! *

  TOTAL MEM AVGS : SR= 0.0 : SWAP_free= 747697.4 K : FREE_RAM= 287786.6 K
  PEAK MEM Usage : SR= 0 : SWAP_free= 500128.0 K : FREE_RAM= 57080.0 K
___________________________________________________________________________________
sys_diag:0717_034414: ## Analyzing MPSTAT Datafile : ./sysd_socrates_070717_0332/sysd_mp_*.out ...

  * NOTE: 5.20134 % : 31 of 596 MPSTAT CPU entries are WARNINGS!! *

  CPU MP AVGS  : Wt= 0: Xcal= 736: csw= 120: icsw= 3: migr= 5: smtx= 3: syscl= 1024
  PEAK MP HWMs : Wt= 0: Xcal= 51771: csw= 14108: icsw= 32: migr= 55: smtx= 79: syscl= 25836

  NOTE: 0.2% CPU cycles handling TLB MISSES (0.0% ITLB_misses: 0.2% DTLB_misses)
_____________________________________________________________________________________
sys_diag:0717_034414: ## Analyzing IOSTAT Datafile : ./sysd_socrates_070717_0332/sysd_io_*.out ...

  * NOTE: 14.4578 % : 24 of 166 IOSTAT entries are WARNINGS!! *

  TOP 10 Slowest IO Devices (* AVG of non-zero device entries *) :

     r/s    w/s   kr/s   kw/s  actv  wsvc_t  asvc_t   %w  %b  device  # I/O Samples
    32.6   10.8  263.6   24.6   0.8     0.0    13.7  0.0  19  c0t0d0  164
    34.0    7.5   10.8    0.0   0.0     0.0     0.0  0.0   0  c0t1d0  2
_____________________________________________________________________________________
  CONTROLLER IO : AVG and TOTAL Throughput per HBA (*active/non-zero entries only*) :
  ------------
  c0 : AVG  : 32.6 r/s | 10.8 w/s | 260.6 kr/s | 24.3 kw/s |
  c0 : TOTAL: 5408 r | 1790 w | 43258 kr | 4037 kw | 166 entries
_____________________________________________________________________________________
sys_diag:0717_034414: ## Analyzing NETSTAT Datafiles : ...

  * lo0 : NOTE: 0 % : 0 of 297 NETSTAT entries are WARNINGS!! *
  * hme0 : NOTE: 0 % : 0 of 297 NETSTAT entries are WARNINGS!! *

  ------------  *MAX_RX_PKTS*  AVG_RX_PKTS  AVG_RX_ERRS  AVG_TX_PKTS  AVG_TX_ERRS  AVG_COLL
  NET1 : lo0  :       4            0.0          0.0          0.0          0.0        0.0

  ------------  *MAX_RX_PKTS*  AVG_RX_PKTS  AVG_RX_ERRS  AVG_TX_PKTS  AVG_TX_ERRS  AVG_COLL
  NET2 : hme0 :      14            0.4          0.0          0.4          0.0        0.0

       : hme0 :  TOT_RX_Bytes  TOT_TX_Bytes  TOT_RX_Packets  TOT_TX_Packets  TOTAL_Seconds
                    22210         30348           124             112             328

       : hme0:1: TOT_RX_Packets  TOT_TX_Packets
       : hme0:1:        0               0

  NOTE: ** 2 ESTABLISHED connections (sockets) exist**
_____________________________________________________________________________________

  * NOTE: CPU=GRN : MEM=GRN : IO=YEL : NET=GRN *
_____________________________________________________________________________________
sys_diag:0717_034417: ... gen_html_hdr ...
sys_diag:0717_034417: ... gen_html_rpt ...
sys_diag:0717_034419: ## Generating TAR file : ./sysd_socrates_070717_0332.tar ...

tar -cvf ./sysd_socrates_070717_0332.tar ./sysd_socrates_070717_0332 1>/dev/null
compress ./sysd_socrates_070717_0332.tar

Data files have been TARed and compressed in :

   *** ./sysd_socrates_070717_0332.tar.Z ***

------- Sys_Diag Complete -------

#

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
__________________________________________________

8. For more Information : Resources & Feedback :
__________________________________________________

The following recent articles on my Sun external blog page provide an extended
overview of sys_diag and its capabilities :

   http://blogs.sun.com/toddjobson/

   *_What is sys_diag ?? .. Automating Solaris Performance Profiling and Workload Characterization._*
   http://blogs.sun.com/toddjobson/entry/what_is_sys_diag_automating

   *_sys_diag v.7.04 command line output ..._*
   http://blogs.sun.com/toddjobson/entry/sys_diag_v_7_04

   *_Solaris Performance Analysis and Monitoring Tools... at what cost ?..._*
   http://blogs.sun.com/toddjobson/entry/solaris_performance_monitoring_tools

Note:

 * Read the ksh script header pages and/or this file prior to using, and ALWAYS
   test first on a representative non-production system, as is the best practice
   when making ANY production environment changes... ;)

 * The latest release of sys_diag is available from BigAdmin at :

   http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sys_diag__solaris_c
   http://www.sun.com/bigadmin/scripts/submittedScripts/sys_diag.txt

 * In addition to posting on BigAdmin, sys_diag should be available soon from :

   http://sunfreeware.com/

   (If you haven't checked out either of these sites, do so asap; they are two of
   the best sites for Solaris administrators, architects, and developers!)

------------------------------
FEEDBACK, Questions, & RFE's :
------------------------------

Forward any questions/comments back to me at todd.jobson@sun.com, along with any
RFE's for future releases. (Make sure to put "sys_diag" somewhere in the Subject line.)

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------