At-A-Glance Monitoring

AAG is the top solution to integrate IBM i systems into any Nagios monitoring configuration. With new additions being developed constantly and added to our current 135+ check commands, AAG provides an exceptional number of data points to monitor and track your IBM i systems.


Overview

High Availability software is only part of the solution in today’s distributed environments. Ensuring the system is always available requires notification of any system events that could impede application processing or lead to a system loss. This applies not only to the production system but also to the recovery system: we have seen a number of occasions where a recovery system was offline for a significant period of time because it was not being monitored.

Nagios is a well-known player in the Enterprise Monitoring Solution market but has very little IBM i integration, even with the community plugins provided. At Shield Advanced Solutions we initially looked at the plugins available from community members and IBM for adding IBM i monitoring to Nagios, but soon decided that an alternative approach was required. This resulted in a new agent for the IBM i and a new plugin for Nagios.

AAG is the Nagios plugin that allows system stats to be collected from the IBM i and reported back to Nagios, while NG4i is the IBM i responder application that provides the data back to AAG. NG4i is provided as an IBM LPP. Configuration is very simple using the panel groups provided, so an IBM i instance can be installed and running in minutes.
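
To illustrate how a plugin reports a collected value back to Nagios, here is a minimal Python sketch of the standard Nagios plugin convention: one line of status text, optional performance data after a pipe character, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). This is not AAG's implementation. The function name, the thresholds and the sample value are assumptions made for illustration, and a real check would obtain its value from the NG4i responder.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(label, value, warn, crit, unit="%"):
    """Map a measured value to a Nagios state and a status line.

    Hypothetical helper: higher values are treated as worse.
    Returns a tuple of (exit_code, status_line).
    """
    if value is None:
        return UNKNOWN, f"{LABELS[UNKNOWN]} - {label}: no data returned"
    if value >= crit:
        state = CRITICAL
    elif value >= warn:
        state = WARNING
    else:
        state = OK
    # Text before the pipe is shown to the user; text after it is performance data.
    perfdata = f"{label}={value}{unit};{warn};{crit}"
    return state, f"{LABELS[state]} - {label} is {value}{unit} | {perfdata}"


# Sample value standing in for a figure collected from the IBM i.
code, line = evaluate("cpu_used", 91.5, warn=80, crit=90)
print(line)
```

A real plugin would finish by exiting with the returned code (for example `sys.exit(code)`), since Nagios derives the service state from the exit code rather than from the text.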

Since V2R0-100323, AAG has supported monitoring of the HMC. Shield provides HMC-specific check_commands and handles communication between your Nagios environment and the HMC, in a similar manner to how AAG monitors your IBM i.

Raspberry Pi

AAG is able to run on many different Linux platforms. We have had great success using a credit-card-sized Linux board called the Raspberry Pi. Shield is able to provide a plug-and-play solution for your monitoring by shipping a pre-configured and assembled RPi that is matched to your infrastructure. There is no need to question the reliability or longevity of micro-SD cards, as all RPis provided by Shield will be converted to support NVMe drives. Due to the current chip shortage around the world, Raspberry Pis are in short supply. Contact us today to check availability.


Request a Demo TODAY!

With AAG being Shield's most recent development we are offering demos in order to provide a more personalized overview of our new solution. Contact us today to request a demo and find out what AAG can do for you!


Pushover Notification Integration

AAG has been developed to alert users of their infrastructure's status as quickly and efficiently as possible. In order to keep the user informed AAG uses several methods of notification. Nagios provides standard email notification for host or service issues. To improve upon this AAG also implements the Pushover API to send notifications either directly to a user's device or broadcast to a user group.

Pushover is a cost-effective, single-purchase application that provides simple and effective notifications on Apple and Android devices or through a browser interface. Find out more here.
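
As a rough illustration of what a Pushover notification involves, the following Python sketch builds a request for the Pushover message endpoint (`https://api.pushover.net/1/messages.json`) using its `token`, `user`, `title`, `message` and `priority` fields. It is not taken from AAG: the function names, the host and service values, and the mapping of CRITICAL alerts to high priority are all assumptions made for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_pushover_request(app_token, user_key, host, service, state, output):
    """Build (but do not send) a Pushover message request for a Nagios alert."""
    fields = {
        "token": app_token,  # application API token issued by Pushover
        "user": user_key,    # user key, or a group key to broadcast to a group
        "title": f"{state}: {host}/{service}",
        "message": output,
        # Assumed mapping: CRITICAL is sent as high priority (1), anything else as normal (0).
        "priority": "1" if state == "CRITICAL" else "0",
    }
    return Request(PUSHOVER_URL, data=urlencode(fields).encode("utf-8"), method="POST")


def send(request, timeout=10):
    """Send a prepared request and return the HTTP status code."""
    with urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the request construction separate from the network call makes the payload easy to inspect or test before any message is actually sent.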

NagiosTV, developed by Chris Carey, is integrated into the AAG Linux distribution. This browser application provides the user with real-time, accurate information regarding the status of their Nagios infrastructure.

HA4i Check Commands

The following lists show the check commands that can be run against an IBM i host. This set of commands relates to Shield's product HA4i and allows the user to monitor the host's replication status.

check_HA4i_RATE: Transfer rate between the HA4i *MGT and *NET system.
check_HA4i_APY: Apply status from the *NET system.
check_HA4i_RJRN: List each remote journal that is configured.
check_HA4i_OBJ: Object replication status from the *MGT system.
check_HA4i_SPLF: Spool file replication status from the *MGT system.
check_HA4i_SYNC: Sync Manager status from the *MGT system.
check_HA4i_RJOBS: Number of HA4i responder jobs running on the system.
check_HA4i_SPLFW: Number of spool files that have been marked for replication but are still waiting to be sent to the remote system.
check_HA4i_IJRN: Number of *INACTIVE journals, configured for replication.
check_HA4i_STATUS: HA4i Server status for each critical server running in the HA4i subsystem.
check_HA4i_AUDSTS: Returns information on the last audit that was run. Severity can be set on whether the audit has finished or not.
check_HA4i_NEWLIB: Lists libraries that have been added but not configured for replication.
check_HA4i_NEWDEV: Lists devices that have been created but not configured for replication.
check_HA4i_ROLESWAP: Returns information on the last Role-Swap.
check_HA4i_AUDERR: Returns any HA4i audits reporting errors in the last 24 hours.

EM4i Check Commands

This set of commands relates to Shield's product EM4i and allows the user to monitor the host's message monitoring application.

check_EM4i_RESPWAIT: Number of *INQ messages which EM4i is waiting for responses to.

Shield General Check Commands

This set contains Shield's general-use commands. With these checks, a user can monitor general values on an IBM i.

check_Shield_KEYEXP: Number of days until an LPP license key expires.
check_Shield_SBSSRCH: Number of jobs running in a specified subsystem with a *MSGW status.
check_Shield_JOBSRCH: Number of running jobs that match the specified search criteria.
check_Shield_RPYW: Number of messages awaiting a reply in a specific message queue.
check_Shield_DSBPRF: Number of profiles that are in a disabled state.
check_Shield_SBSJOB: Number of active jobs for a given subsystem.
check_Shield_JOBQ: Number of jobs on a specified job queue.
check_Shield_RCVR: Quantity and total size of the receivers in a specified library.
check_Shield_CACHEBAT: Cache battery state and quantity.
check_Shield_UPDLVL: Returns information about an installed Shield product. Displays PTF level, latest updates and maintenance expiry.
check_Shield_PRDSYNC: Ensures AAG and NG4i are in sync after updates.
check_Shield_APTUPD: Returns information about available apt updates for the Linux distribution. Run this command against the Nagios localhost.
check_Shield_PTF: Returns PTF levels for the OS. If a PTF is not up to date, both latest and installed levels are displayed.
check_Shield_JOBSCDE: Checks the status of a job schedule entry. Returns success state and time of last / next submission.
check_Shield_SSLCERT: Returns the number of SSL certificates expiring in the next x days, where x is set in the configuration.
check_Shield_DSKSTS: Displays number of disks reported / configured and any disk errors.
check_Shield_UPTIME: Returns the number of minutes since the last IPL.
check_Shield_SYSVAL: Compares the returned system value against a passed-in parameter.
check_Shield_SECUPD: Returns security bulletins from IBM since a specified date.
check_Shield_AUTUPD: Can be configured to automatically update AAG or to notify users when an update is available for AAG.
check_Shield_RCVRBKLG: Returns a specified journal receiver backlog and estimated number of minutes to clear.
check_Shield_PING: Returns ping response time between IBM i and another address.
check_Shield_OSLVL: Returns current OS level on IBM i.
check_Shield_ASPSTS: Returns current status of passed ASP.
check_Shield_ASPAVL: Returns disk space available on passed ASP.
check_Shield_ASPMIR: Returns current mirror status of passed ASP.
check_Shield_ASPLIFE: Percentage of life remaining for NVMe drives in passed ASP.
check_Shield_ASPDSK: Returns disk status for passed ASP.
check_Shield_ASPOVRFLW: Amount of overflow storage for passed ASP.
check_Shield_ASPGEOSTS: Geographic mirror data status.
check_Shield_DMGOBJ: Number of damaged objects in a library.
check_Shield_DEVSTS: Returns device status.
check_Shield_TOPINTTRANS: Top x jobs by interactive transactions.
check_Shield_TOPINTRS: Top x jobs by interactive response time.
check_Shield_TOPCPUTIME: Top x jobs by CPU time (ms).
check_Shield_TOPCPU: Top x jobs by CPU (%).
check_Shield_TOPDSKIO: Top x jobs by disk I/O.
check_Shield_JOBCPU: Returns information about jobs that match the entered parameters for CPU used and runtime.
check_Shield_JOBSTG: Returns information about jobs that match the entered parameters for QTEMP size and temporary storage used.
check_Shield_DQECOUNT: Returns number of data queue entries.
check_Shield_LIBSIZE: Returns library size.
check_Shield_WRKPRB: Problems on the IBM i returned by WRKPRB.
check_Shield_OUTQC: Returns the spool file count on an output queue.
check_Shield_SSTP: Returns the status, password expiry or password expiry date of an SST profile.
*NEW check_Shield_JOBESTS: Returns job END status. Uses search criteria to pull single or multiple jobs.

PowerHA Check Commands

This set of commands relates to the high availability product PowerHA.

check_PowerHA_SYNC: Returns sync percentage complete for PowerHA.

BRMS Check Commands

This set of commands relates to the product BRMS.

check_BRMS_WERR: Returns number of Write errors.
check_BRMS_RERR: Returns number of Read errors.
check_BRMS_USED: Returns number of times volume is used.
check_BRMS_FULL: Returns if volume is full.
check_BRMS_EXPD: Returns volume expiry status.
check_BRMS_EDAT: Returns days until volume expiry.
check_BRMS_DUPD: Returns duplication status for media.
check_BRMS_STS: Returns BRMS status for control group.

MiMiX Check Commands

This set of commands relates to your MiMiX replication status.

check_MMX_JRNSTS: Mimix Journal Status.
check_MMX_SYSSTS: Mimix System Status.
check_MMX_AGSTS: Mimix Application Group Status.
check_MMX_ARSTS: Mimix Receiver Information.
check_MMX_APYSTS: Mimix Apply Status.
check_MMX_CNTRSTS: Mimix Container Replication Status.
check_MMX_CFGCHG: Mimix Config Changes.
check_MMX_OTESTS: Mimix Object Tracking Entries.
check_MMX_ITESTS: Mimix IFS Tracking Entries.
check_MMX_FESTS: Mimix File Tracking Entries.
check_MMX_OBJAPY: Mimix Object Replication Process.
check_MMX_RJLNK: Mimix Remote Journal Link.
check_MMX_SWSTS: Mimix Switch Status.
check_MMX_DBSND: Mimix DB Send Status.

IBM i Status Check Commands

This set of commands covers the general status of the IBM i.

check_Status_AVLDISK: Available disk as a percentage.
check_Status_TOTDISK: Total disk in GB.
check_Status_AVLDISKGB: Available disk in GB.
check_Status_SYSNAME: System name.
check_Status_SYSSTATE: Returns system state.
check_Status_CPUUSED: Percentage of processor used.
check_Status_NUMJOB: Number of jobs running on system.
check_Status_PADDR: Percentage of permanent addresses used.
check_Status_TADDR: Percentage of temporary addresses used.
check_Status_ASP: Size of system ASP in GB.
check_Status_STORAGE: Total storage size in GB.
check_Status_UNPSTG: Size of unprotected storage in MB.
check_Status_MAXUNPSTG: Max size of unprotected storage in MB.
check_Status_NUMPART: Number of partitions on system.
check_Status_PARTID: Partition ID for host.
check_Status_CPUCAP: Processor capacity as a percentage.
check_Status_CPUSHARE: Processor sharing status.
check_Status_NUMCPU: Number of processors that are licensed.
check_Status_ACTJOB: Number of *ACTIVE jobs running on system.
check_Status_ACTTHD: Number of *ACTIVE threads on system.
check_Status_MAXJOB: Maximum number of jobs on system.
check_Status_TMP256: % of temporary 256MB segments used.
check_Status_PRM256: % of permanent 256MB segments used.
check_Status_TMP4GB: % of temporary 4GB segments used.
check_Status_PRM4GB: % of permanent 4GB segments used.
check_Status_UCAP: % of uncapped CPU used.
check_Status_SPOOL: % of shared processor pool used.
check_Status_MAINMEM: Amount of main memory in GB.
check_Status_PRCTTU: Amount of processor unit time used in ms for each job.
check_Status_INTTRN: Number of interactive transactions per job listed.
check_Status_DBLCKW: Amount of database lock waits per job listed.
check_Status_INTLCW: Amount of internal machine lock waits per job listed.
check_Status_NDBLCKW: Amount of non-database lock waits per job listed.
check_Status_AUXIOR: Amount of auxiliary I/O requests per job listed.
check_Status_PEAKTS: Amount of peak temporary storage per job listed.
check_Status_QTEMPS: Size of QTEMP library in MB per job listed.
check_Status_RESPTT: Total response time in seconds per job listed.
check_Status_TSDBLW: Total seconds spent in database lock wait, per job listed.
check_Status_TSINTL: Total seconds spent in internal lock wait, per job listed.
check_Status_TSNDBL: Total seconds spent in non-database lock wait, per job listed.
check_Status_TMPSTG: Temporary storage used in MB, per job listed.

HMC Check Commands

AAG now provides the ability to create an HMC host and monitor it with the following check_commands.

*NEW check_HMC_LED: Returns LED Status and SRC for partition under HMC.
*NEW check_HMC_STATE: Returns System State for partition under HMC.
*NEW check_HMC_MEMST: Returns Memory Defragmentation State for partition under HMC.
*NEW check_HMC_MIGST: Returns Migration State for partition under HMC.
*NEW check_HMC_HIBST: Returns Hibernation State for partition under HMC.
*NEW check_HMC_RMCST: Returns RMC connection State for partition under HMC.


Support

Most IBM i/Nagios administrators can install AAG themselves. However, if you need assistance, we provide highly trained consultants who can install and configure AAG to ensure you are monitoring everything correctly.

Any customer with a current maintenance contract in place can use the support portal to raise tickets for issues and questions they have about the products. The support portal also lists a number of FAQs that can help with product setup and configuration. Any tickets raised via the portal are immediately flagged to the support team to ensure a rapid response to your question(s). Access to the support portal is available here. If you are asked to start a remote desktop session (TeamViewer), use the following link to install the correct version of TeamViewer from our site: TeamViewer Version 8

AAG Latest Update: 1AAG210 Product level: AAG1053024

NG4i Latest PTF: 1NG2000 Product level: NG4I112023