Setting up Helium App Monitoring SQL View

Introduction

The purpose of this page is to give an provide an explanation on how to configure a Helium app for monitoring. This is to monitor any business rules and for data verification.

To enable this SQL view would need to exist in a helium application schema, this is view must be named __monitor_view__.

When the monitoring system detects that these views exist in a schema the results are pulled through to the monitoring server where results are automatically processed to be displayed on monitor.heliumapps.com with warnings or critical errors resulting in a JIRA-ticket created in the appropriate product teams project for investigation.

Use Cases

Typically we will use this to monitor critical in app business logic which require swift intervention by the product teams, this can include checks such as:

  • Data Discrepancies (logically impossible scenarios which are identified as problem areas in the application) - can be caused by bugs, sync issues or have unknown causes
  • The state of something like an SMS queue
  • Erroneous events detectable from either the DSL or SQL

Product teams would create an implementation of __monitor_view__ SQL on the respective production schemas, after which the monitor will automatically detect views and display the app on the monitor for activation, once activated the check will be visible as a service on the Helium Monitor.

The view can also return performance data (perfdata) which will be recored and represented visually on graphs. Use this to record interesting stats such as user adoption stats.

Performance

The nature of the monitoring is that we are calling this view regularly, often once a minute. The monitor view must therefore return within 5 seconds or less.

This means that it's often better to do any type of computation or heavy queries in a scheduled function than them being called from the view itself.

SQL View

Definition of __monitor_view__:

ColumnTypeDesc
categorytextThe 'name' of the application being monitored
statusinteger0 equals no issues, 1 equals warning, 2 equals critical issue
statustexttextA textual description of the issue such as 'The SMS queue is blocked, 40 messages outstanding' Or 'data discrepancy exist'
perfdatatext*Optional field - Semicolon ; separated key value pair of performance data stats such as 'stock_updates_submitted_today=16248;stock_updates_submitted_all_time=1506967;'

jira_project

textThe identifier of the JIRA project for the responsible product team (HEAL / HLTH / AGRI / AG)


Example "dummy" view which should be a good starting point:

/* 
 * This view performs the lookups within the app schema
 * deciding if there are issues that should be alerted
 * jira_project should be HEAL / HLTH / AGRI / AG
 */
CREATE OR REPLACE VIEW __monitor_view__ AS (
	select
		'Application Unique Name (SVS NDoH / Leap / AitaHealth)'::text as category,
		0 as status,
    	'No Issues'::text as statustext,
    	'stock_updates_submitted_today=16248;stock_updates_submitted_all_time=1506967'::text as perfdata,
    	'HEAL'::text as jira_project
);
Note

If the monitor view returns multiple records for different monitoring checks please ensure that the category name is different. If the category name is the same but the issues are different only one JIRA ticket will be created and the details of the ticket will be updated with the details of the last updated monitor record so you might miss previous failures.

 Permissions

On postgres the role helium_monitor_agent needs to be granted privileges on the tables and functions related to the monitoring queries otherwise the helium monitor will not be able to find the view and the result might be returned incorrectly. This includes all the tables and functions that are used lower down in the function tree.

Examples:

/* This is to grant privileges on the sql view __monitor_view__ */
GRANT ALL PRIVILEGES ON TABLE __monitor_view__ TO helium_monitor_agent;
 
/* This is to grant privileges on a sql function */
GRANT ALL PRIVILEGES ON FUNCTION __ping_monitor_function__() TO helium_monitor_agent;

Caveats

Although the Helium Monitor automatically discovers the views we have limited it so that:

  • It only applies to apps on the production servers. Snapshot apps are ignored.
  • Although the view is automatically discovered, it needs to be manually enabled in the Helium Monitor by the DevOps team. Log a ticket on Helium to have your view activated.
  • SMS/Email are enabled on a per-request basis.  A view may need to go through a trial period until it is trusted, as we cannot risk faulty views causing SMS or email spamming.

Some Examples

School Management

Warning (status returned is 1) when Live Change Audit Notification Queue greater than 20000

CREATE OR REPLACE VIEW ecdoe_001."__monitor_view__"
AS SELECT 'ECDOE'::text AS category,
        CASE
            WHEN count(*) > 20000 THEN 1
            ELSE 0
        END AS status,
        CASE
            WHEN count(*) > 20000 THEN 'Live Change Audit Notification Queue greater than 20000'::text
            ELSE 'No issues'::text
        END AS statustext,
    'EDU'::text AS jira_project,
    ''::text AS perfdata
   FROM ecdoe_001.change_audittrail c
  WHERE c.sent = 'No'::text
  GROUP BY c.sent;

Smart Asset Management Solution

Critical (status returned is 2) when financial year not properly configured for SAMS instance

CREATE OR REPLACE VIEW samsmtn_001."__monitor_view__"
AS SELECT c.app_name AS category,
    ( SELECT
                CASE
                    WHEN (( SELECT count(*) AS count
                       FROM samsmtn_001.financial_year f
                      WHERE f.active = true AND f.deleted = false
                      GROUP BY f._id_)) <> 1 THEN 2
                    ELSE 0
                END AS "case") AS status,
    'Financial year not properly configured for SAMS instance'::text AS statustext,
    ''::text AS perfdata,
    'SAMS'::text AS jira_project
   FROM samsmtn_001.client_config c;