Automatic Disabling of Uncompilable Apps

Background

When an app is upgraded to a new release using the no-ui dev client or Helium’s admin web interface, the Helium backend compiles the app and, where needed, upgrades its schema. This work is performed asynchronously.

This asynchronous process is triggered when the app is accessed: through an API call to the app, a scheduled function invocation, or access to the app’s web interface.

In some cases, even though the app might appear to successfully upgrade to a new release, the background upgrade process that is triggered might fail.

Each subsequent access can then trigger another compile / upgrade attempt. Many such attempts cause resource issues on the Helium server, which might in turn result in downtime for the entire Helium instance.

Automatic Disabling of Apps

To prevent the scenario described above, apps that fail to upgrade on the backend are disabled automatically.

Helium maintains a counter of consecutive failed upgrade attempts for each app. Once a threshold of 10 consecutive failures is reached, the app is disabled, preventing any further resource usage by the failing app.

The counter is reset when the app’s release is updated (using the no-ui dev client or Helium’s admin web interface) or when the app upgrade succeeds on the backend.
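
The counter behaviour described above can be sketched as follows. This is only an illustrative model, not Helium's implementation: the class name, method names and field names are hypothetical, and only the threshold of 10 and the two reset conditions come from the text above.

```python
from dataclasses import dataclass

# Threshold of consecutive failures stated in this document.
FAILURE_THRESHOLD = 10


@dataclass
class UpgradeTracker:
    """Hypothetical per-app model of the failure counter (not Helium code)."""

    consecutive_failures: int = 0
    disabled: bool = False

    def record_failure(self) -> None:
        # A failed background compile / upgrade attempt increments the counter.
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURE_THRESHOLD:
            # Reaching the threshold disables the app.
            self.disabled = True

    def record_success(self) -> None:
        # A successful backend upgrade resets the counter.
        self.consecutive_failures = 0

    def record_release_update(self) -> None:
        # Submitting a new release resets the counter. It does not
        # re-enable a disabled app; that remains a manual step.
        self.consecutive_failures = 0
```

In this model, nine failures followed by a success leave the app enabled, because only consecutive failures count towards the threshold.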

Monitoring

Monitoring of failing app upgrades is provided by the DevOps monitor. In addition, alerts will be created where appropriate. These will allow Jira tickets to be created and WhatsApp messages to be sent to developers as per the current monitor escalation and notification process.

Product teams can also manually check the health of their apps using the built-in inbound API provided for all apps:

https://mezzaninewiki.atlassian.net/wiki/spaces/HTUT/pages/5740441#InboundAPI-Example%3AAppHealth

The DevOps monitor will use the built-in health API for apps to check whether an app can be compiled.

Product teams can also use the same API to troubleshoot compile / upgrade issues for their apps.
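
A manual health check could be scripted along the following lines. This is a minimal sketch under stated assumptions: the function name and the URL passed in are hypothetical, and treating any 2xx HTTP status as "healthy" is an assumption made for illustration. The actual endpoint and response format are described on the inbound API page linked above.

```python
import urllib.error
import urllib.request


def check_app_health(url, opener=urllib.request.urlopen, timeout=10):
    """Return (healthy, detail) for a health URL.

    Assumption: a 2xx status means the app is healthy. The real
    response format is defined by the health API documentation.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return (200 <= resp.status < 300, body)
    except urllib.error.HTTPError as err:
        # The server answered with an error status; keep the body,
        # since it may carry details useful for troubleshooting.
        return (False, err.read().decode("utf-8", errors="replace"))
    except urllib.error.URLError as err:
        # The server could not be reached at all.
        return (False, str(err.reason))
```

The `opener` parameter exists only so that the function can be exercised without network access. In normal use it is left at its default, and the health URL of the app in question is passed as `url`.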

Developer Intervention

An app whose upgrades keep failing is disabled by Helium as a measure to protect the platform and the other apps running on it.

This, along with the monitoring mentioned above, will prompt intervention from the product team that owns the app.

To troubleshoot, developers can use the built-in health API or check their application logs.

Once the issue in the source code or app schema has been fixed, an upgrade for the app can be submitted to Helium as usual. The app can then be enabled on the Helium web interface (under the app update view), or using the DevOps portal interface or API.

If an app has been disabled due to consecutive failures, the most recent relevant error message and error details are available in the response from the built-in health API.

While apps are automatically disabled, they are not automatically enabled once the issue is fixed.

Developers need to manually re-enable their apps after the compile / upgrade issue has been fixed.