Critical Support & High Priority Issues Policy

Updated: Aug 1, 2022

Purpose of this policy

This policy exists to define critical support for Postscript and describe the processes in place for Postscript to respond to critical support requests within 30 minutes and high priority issues within 12 hours.

Policy definitions

Critical Support Incident:
A state of the Postscript web application in which all users are unable to use business-critical features such as:

  • Logging In

  • Sending / Receiving SMS Campaigns and Replies

High Priority Issue:
A state of the Postscript web application in which multiple users are unable to use business-critical features such as:

  • Logging In

  • Sending / Receiving SMS Campaigns and Replies

Critical Support Team:
The designated group of engineers who are “on-call” to respond to Critical Support Incidents within 30 minutes and High Priority Issues within 12 hours.

RCA:
Root Cause Analysis -- A report to determine the underlying cause of a Critical Support Incident and prescribe solutions to minimize the likelihood and potential impact of a future incident.
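As an illustration only, the definitions and response targets above could be sketched as follows. The names `Severity`, `classify`, and `respond_by` are hypothetical, and reading "multiple users" as "more than one user" is an assumption, not part of this policy.

```python
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Severity(Enum):
    CRITICAL = "Critical Support Incident"
    HIGH = "High Priority Issue"


# Response targets stated in this policy.
RESPONSE_TARGETS = {
    Severity.CRITICAL: timedelta(minutes=30),
    Severity.HIGH: timedelta(hours=12),
}


def classify(affected_users: int, total_users: int) -> Optional[Severity]:
    """Map the number of users locked out of a business-critical feature to a severity.

    Assumption: "multiple users" means more than one affected user.
    Returns None when the report falls outside both definitions.
    """
    if total_users <= 0:
        raise ValueError("total_users must be positive")
    if affected_users >= total_users:
        return Severity.CRITICAL  # all users affected
    if affected_users > 1:
        return Severity.HIGH  # multiple, but not all, users affected
    return None


def respond_by(severity: Severity, reported_at: datetime) -> datetime:
    """Return the latest time by which the Critical Support Team should respond."""
    return reported_at + RESPONSE_TARGETS[severity]
```

For example, under these assumptions a report in which every user is unable to log in would be classified as `Severity.CRITICAL`, with a response due 30 minutes after the report time.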

Policy

Prevention

The Postscript team takes the following measures during code development and deployment to keep Critical Support Incidents to a minimum:

  • Testing code before deployment.

  • Deploying and testing in a staging environment.

  • Red/Blue deployments to production environments that automatically roll back if issues arise.

  • Canary deployments to subsets of customers to test production code before rollout to the general population.
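The canary and automatic-rollback measures listed above can be illustrated with a minimal sketch. This is not Postscript's deployment tooling. The function names, the hash-based customer bucketing, and the 1% error-rate threshold are all assumptions chosen for the example.

```python
import hashlib


def in_canary(customer_id: str, percent: int) -> bool:
    """Decide whether a customer belongs to the canary subset.

    Hypothetical scheme: hash the customer ID into one of 100 buckets so the
    same customer always gets the same answer for a given percentage.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def should_roll_back(errors: int, requests: int, max_error_rate: float = 0.01) -> bool:
    """Return True when the new release's error rate exceeds the allowed threshold.

    The 1% default is an illustrative assumption, not a figure from this policy.
    """
    if requests == 0:
        return False  # no traffic yet, nothing to judge
    return errors / requests > max_error_rate
```

Hashing the customer ID, rather than sampling at random per request, keeps each customer consistently on either the old or the new release during the rollout.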

Detection

  1. The Postscript engineering team uses the Amazon CloudWatch service to conduct 24/7 monitoring of business-critical APIs and applications. If the service detects an outage, the Critical Support Team is notified to respond within 30 minutes.

  2. The email address critical at(@) postscript dot io can be used to notify the Critical Support Team 24/7. Emailing this address will notify the entire Critical Support Team via phone in the event of an emergency. All customers have access to this email address.

Reaction & Communication

When a Critical Support Incident is identified, customers will be notified via in-app message through an internal Postscript alert system.

The Critical Support Team will first diagnose the issue and then either (1) deploy a patch to the Postscript application or, if the outage is caused by a third-party provider, (2) deploy measures to minimize the customer impact of the third-party outage.

After initial measures are taken to minimize customer downtime, all customers will be notified of the incident via email and in-app message.

The Critical Support Team will then conduct an RCA for both internal Postscript teams and any customer directly affected by the outage.

The Postscript engineering team will prioritize and deploy any patches prescribed by the RCA within a 5-day period.