
Get started as an Account Owner

Start building your teams, integrate your tools, and create on-call schedules with Squadcast.


Last updated 1 year ago


1. Set up your on-call team

Create your Profile and add Notification Rules

To begin, configure your profile:

Navigate to the My Profile section to define your contact information, time zone, and notification preferences.

After you’ve set up your profile, head over to the Incident Notifications Rules section to create your paging policies.

Important:

Verify your contact information to start receiving notifications from Squadcast.

🔹 Best Practice Tip 🔹 Use the mobile application to receive push notifications. The app gives you instant access to all details and actions.

🔹 Best Practice Tip 🔹 According to Apple and Google documentation, push notifications operate on a "best effort" basis. Consider setting up backup contact methods (SMS, email, phone) for reliability if push notifications fail.

🔹 Best Practice Tip 🔹 Push notifications may also be impacted by energy-saving modes, low battery levels, or the app being force-stopped.

🔹 Best Practice Tip 🔹 Your primary notification rule should be the most attention-grabbing notification method. We recommend using a diverse notification rule (Push, SMS, Phone, Email) with multiple steps to avoid single points of notification failure.

🔹 Best Practice Tip 🔹 Use a custom notification rule during business hours, when aggressive notification may not be required.

🔹 Best Practice Tip 🔹 Include a phone call in the last step of your notification rule as a surefire way of getting alerted and acknowledging the incident.
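
The tips above can be pictured as an ordered list of notification steps. The sketch below is illustrative only: the `NotificationStep` structure, the delay values, and the `check_rule` helper are assumptions made for this example, not Squadcast's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationStep:
    delay_minutes: int  # minutes after the incident is triggered
    channel: str        # "push", "sms", "email", or "phone"


# Hypothetical primary rule: diverse channels, with a phone call as the last step.
PRIMARY_RULE = [
    NotificationStep(0, "push"),
    NotificationStep(2, "sms"),
    NotificationStep(5, "email"),
    NotificationStep(10, "phone"),
]


def check_rule(steps):
    """Return warnings for a rule, based on the best-practice tips above."""
    warnings = []
    channels = {step.channel for step in steps}
    if len(channels) < 2:
        warnings.append("single channel: add a backup contact method")
    if steps and max(steps, key=lambda s: s.delay_minutes).channel != "phone":
        warnings.append("last step is not a phone call")
    return warnings
```

A rule that relies on push alone would be flagged twice here: once for having no backup channel and once for not ending in a phone call.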

Download our Mobile App

Explore the mobile and web platforms to get comfortable before beginning your configurations.

Add Users

Next, start adding users and stakeholders to your organization. You can add each user manually or bulk import them using a .csv file. Alternatively, you can provision them automatically using SSO.

🔹 Best Practice Tip 🔹 For larger teams, the best way to add users is to bulk import them using a .csv file.

🔹 Best Practice Tip 🔹 Make sure that all users have verified their emails and phone numbers as soon as they are added, to start receiving notifications.
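
Before a bulk import, it can help to check the .csv file locally. The following is a minimal sketch under stated assumptions: the column names in `REQUIRED_COLUMNS` are hypothetical, so replace them with the columns from the import template Squadcast provides.

```python
import csv
import io

# Hypothetical column names; use the ones from the actual import template.
REQUIRED_COLUMNS = ("first_name", "last_name", "email")


def find_problem_rows(csv_text):
    """Return (line_number, reason) pairs for rows likely to fail an import."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [(1, "missing columns: " + ", ".join(missing))]

    problems = []
    seen_emails = set()
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        email = (row["email"] or "").strip().lower()
        if "@" not in email:
            problems.append((line_number, "invalid email"))
        elif email in seen_emails:
            problems.append((line_number, "duplicate email"))
        seen_emails.add(email)
    return problems
```

Catching malformed or duplicate email addresses up front matters because users must verify their contact details before they can receive notifications.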

Assign Permissions

Once you have added users to your organization, you can customize their access by granting additional permissions on top of the User Type they were added with. You can customize permissions only for users, not for stakeholders.

🔹 Best Practice Tip 🔹 Give the right org-level permissions to the right team members so that they have the visibility they need into the system settings.

Create a Team

Next, create teams to segregate data and have different environments for different functional teams. By default, all the users are added to the default team. The default team cannot be deleted.

🔹 Best Practice Tip 🔹 Keep a team naming convention that reflects each team's role or the alerts it works with (e.g., Support, Backend, Security, Data).

🔹 Best Practice Tip 🔹 Organize your teams according to the service they are responsible for. They will be able to manage their integrations and the whole alerting flow for themselves.

Assign Roles

Next, assign roles to the members of each Team. Squadcast provides the default roles Admin, User, and Observer, and you can also create custom roles.

🔹 Best Practice Tip 🔹 Restrict user access as much as possible to limit the number of users making changes. Recommendation: 1-2 admins per team, the rest as users or stakeholders.

🔹 Best Practice Tip 🔹 Use custom roles, or edit the default roles, to give each team member the correct level of access.

🔹 Best Practice Tip 🔹 Stakeholders added to teams can only carry the role of an Observer.

Create a Squad

Squads are sub-groups of people who handle a specific functionality, service, or project within the team. Squads are handy when you need to notify the whole group together, for instance when a coordinated response is required for high-urgency, high-complexity incidents, or at the end of an escalation policy when nobody has acknowledged the incident.

Examples:

  • Payment gateway Squad

  • Backend Squad

  • Frontend Squad

  • All Hands

Create Schedules

Once teams are created, you can set up your on-call schedules. An on-call schedule determines who is on-call at a given time. Schedules are based on time zones and configurable rotations.

Important:

Schedules are active only when added to an escalation policy.

🔹 Best Practice Tip 🔹 Keep your rotations as simple as possible, preferably with a continuous rotation of the same users to make your on-call schedule easy to manage. Remember that you can leverage scheduled overrides to address holidays or schedule conflicts.
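
To make the idea of a continuous rotation with overrides concrete, here is a small sketch of how "who is on-call?" could be computed. It is an illustration under simple assumptions (fixed-length shifts, a hypothetical `on_call` helper), not how Squadcast evaluates schedules internally.

```python
from datetime import datetime, timedelta, timezone


def on_call(users, rotation_start, shift_length, at, overrides=()):
    """Return who is on call at `at` for a continuous rotation.

    overrides: iterable of (start, end, user) tuples; an active override
    takes precedence over the rotation.
    """
    for start, end, user in overrides:
        if start <= at < end:
            return user
    if at < rotation_start:
        return None  # the rotation has not started yet
    shifts_elapsed = (at - rotation_start) // shift_length
    return users[shifts_elapsed % len(users)]


# Example: a weekly rotation of three users, with a holiday override.
utc = timezone.utc
users = ["asha", "ben", "chen"]
start = datetime(2024, 1, 1, tzinfo=utc)
week = timedelta(days=7)
holiday = [(datetime(2024, 1, 9, tzinfo=utc), datetime(2024, 1, 11, tzinfo=utc), "chen")]
```

With the same users cycling in a fixed order, the on-call person at any moment follows from simple arithmetic, which is why simple rotations are easier to reason about than many special cases.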

Create Escalation Policies

Next, create escalation policies, and add your on-call schedules to them. This will automatically notify your on-call engineers when an incident is triggered.

Examples:

  • Website Monitoring

  • Payment Portal Monitoring

  • Backend Issues

🔹 Best Practice Tip 🔹 When creating escalation policies, keep a naming convention that tells others the context and priority of incidents that come in through a specific escalation policy.

🔹 Best Practice Tip 🔹 Adding on-call schedules to your first escalation layer is the best way to notify your on-call engineers.

🔹 Best Practice Tip 🔹 For critical incidents, create separate layers with different methods of notification. The first layer contains non-intrusive methods like Email & Push, while the second layer contains intrusive methods like SMS & Phone calls.

🔹 Best Practice Tip 🔹 Add reminder notifications for acknowledged incidents and re-trigger unresolved incidents after a certain time to help reduce MTTR.
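
The layered approach described above can be sketched as data. The `Layer` structure, the target names, and the timings below are hypothetical and only illustrate the idea of escalating from non-intrusive to intrusive channels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    after_minutes: int  # minutes the incident has gone unacknowledged
    target: str         # a user, squad, or schedule (names are made up)
    channels: tuple     # notification methods used by this layer


POLICY = [
    Layer(0, "schedule:primary-on-call", ("email", "push")),  # non-intrusive first
    Layer(10, "schedule:primary-on-call", ("sms", "phone")),  # then intrusive
    Layer(20, "squad:backend", ("sms", "phone")),             # finally the whole squad
]


def layers_fired(policy, minutes_unacknowledged):
    """Return the layers that would have notified someone by this point."""
    return [layer for layer in policy if layer.after_minutes <= minutes_unacknowledged]
```

In this sketch, an incident left unacknowledged for 12 minutes has reached the second layer, so the on-call engineer has already received an SMS and a phone call.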

Add Services

Next, set up Services within Squadcast.

Services are at the core of Squadcast. A service represents an application or component that is crucial for your product or service. Services are created with an alert source integration through which incidents are triggered. Squadcast provides a Webhook URL to integrate with the tools you use.

🔹 Best Practice Tip 🔹 Give your services meaningful names that reflect the actual component name or functionality.

🔹 Best Practice Tip 🔹 You can assign a Squad as the owner of a service.

🔹 Best Practice Tip 🔹 You can use tags to differentiate between business and technical services.

🔹 Best Practice Tip 🔹 Only send critical, actionable alerts into Squadcast. Avoiding unnecessary or noisy alerts helps reduce alert fatigue and makes it easier to manage your incidents.

🔹 Best Practice Tip 🔹 See the blog post "How to configure services in Squadcast: Best practices to reduce MTTR" for more guidance.

Configure Integrations

You can search through our documentation to find helpful alert source integration guides to walk you through any particular integration.

🔹 Best Practice Tip 🔹 Make sure you are only sending critical, actionable alerts to Squadcast to avoid alert fatigue and confusion.

🔹 Best Practice Tip 🔹 Check whether your alert source can send tags/labels in its webhooks. If it can, use dynamic tagging rules to reflect them on the platform and put them to better use.

🔹 Best Practice Tip 🔹 We suggest using Incident Webhooks rather than emails to trigger incidents. This eliminates the dependency on third-party email clients for creating incidents.
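
As a rough illustration of triggering an incident through a webhook, the sketch below builds a JSON body that carries tags. The field names (`message`, `description`, `tags`, `status`, `event_id`) are assumptions for this example; confirm the exact payload format in the Incident Webhook documentation before relying on it.

```python
import json


def build_incident_payload(message, description, tags, event_id):
    """Build a JSON body for triggering an incident.

    The field names used here are assumptions for illustration only.
    """
    payload = {
        "message": message,
        "description": description,
        "tags": dict(tags),
        "status": "trigger",
        "event_id": event_id,
    }
    return json.dumps(payload)


body = build_incident_payload(
    message="Checkout latency above threshold",
    description="p99 latency exceeded 2s for 5 minutes",
    tags={"severity": "critical", "env": "production"},
    event_id="checkout-latency-1",
)
```

The resulting body would be sent with an HTTP POST to the Webhook URL that Squadcast provides for the service.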

2. Incident Response - Reduce MTTR with faster response

Add Extensions

Extensions are deeper integrations with other tools: actions taken from within Squadcast are reflected in the connected tool as well. You can find Extensions on the navigation sidebar.

Typically, extensions augment your incident management process by connecting with other tools where actions are required. ITSM, Communication, Web conferencing, Version Control, CI/CD, and SSO tools would typically act as extensions.

3. Incident Response - Noise Reduction & Contextual Awareness

Add Tags to incidents

Incident Tags add context to your incidents and help classify them. You configure tags through the Tagging Rules associated with a service. Tagging rules can use the incident JSON to add tags automatically when incidents are triggered, or you can create and update tags manually.

🔹 Best Practice Tip 🔹 While creating the automation rules, be sure to add the source name under each condition so that the rule applies only to that particular alert source.

🔹 Best Practice Tip 🔹 If you use the Incident Webhook to create incidents and send Tags with it, the Tags carried by the webhook are added to the incidents instead of the Tags configured in the platform for that alert source.
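
The sketch below shows the general idea of a tagging rule evaluated against an incident JSON payload, including the tip about restricting each rule to one alert source. The rule format and helper names are hypothetical and do not reflect Squadcast's actual rule syntax.

```python
def lookup(payload, dotted_path):
    """Walk a nested dict using a dotted path such as 'labels.severity'."""
    value = payload
    for part in dotted_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


# Hypothetical rules: each one names the alert source it applies to.
TAGGING_RULES = [
    {"source": "prometheus", "field": "labels.severity", "equals": "critical",
     "tags": {"severity": "critical"}},
    {"source": "prometheus", "field": "labels.team", "equals": "payments",
     "tags": {"owner": "payments"}},
]


def tags_for(payload, source, rules):
    """Collect tags from every rule that matches this payload and source."""
    tags = {}
    for rule in rules:
        if rule["source"] != source:
            continue  # rule is restricted to a different alert source
        if lookup(payload, rule["field"]) == rule["equals"]:
            tags.update(rule["tags"])
    return tags
```

Because each rule is scoped to a source, an identical payload arriving from a different alert source receives no tags from these rules.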

Configure Routing Rules for automatic overrides

Alert Routing allows you to configure rules to ensure that alerts are routed to the right responder with the help of event tags attached to each alert. Routing is a part of the rules engine associated with each service. You can access routing rules from a service’s options dropdown. Note that this rule will override the escalation policy attached to the service. This is typically used in cases where severities are configured via tags and each severity type is to be handled by a different level of on-call user.

🔹 Best Practice Tip 🔹 Make sure the names of your routing rules follow your Escalation Policy naming convention.
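
The severity-based use case described above can be sketched as a lookup from tags to an escalation policy. The rule format and the policy names are made up for illustration; the first-match-wins behavior is also an assumption of this sketch.

```python
# Hypothetical routing rules keyed on the "severity" tag.
ROUTING_RULES = [
    {"match": {"severity": "critical"}, "route_to": "Critical - Senior On-Call"},
    {"match": {"severity": "warning"}, "route_to": "Warning - Business Hours"},
]


def pick_escalation_policy(tags, rules, service_default):
    """Return the policy of the first matching rule, else the service default."""
    for rule in rules:
        if all(tags.get(key) == value for key, value in rule["match"].items()):
            return rule["route_to"]  # overrides the policy attached to the service
    return service_default
```

An alert tagged `severity: critical` would be routed to the senior on-call policy, while an untagged alert falls back to the escalation policy attached to the service.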

Deduplicate to reduce alert fatigue

Alert Deduplication helps you reduce alert noise by organizing and grouping related alerts. It also provides easy access to similar alerts when needed. You can configure deduplication rules with an incident JSON to automatically deduplicate and group similar incidents; the result is reflected on the incident dashboard.

🔹 Best Practice Tip 🔹 While creating the automation rules, be sure to add the source name under each condition so that the rule applies only to that particular alert source.

🔹 Best Practice Tip 🔹 Always arrange your automation rules in the right order based on the priority of how the rules have to be executed.
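
To show what grouping similar alerts means in practice, here is a minimal sketch of key-based deduplication within a time window. The alert fields (`source`, `host`, `ts`) and the windowing behavior are assumptions for this example rather than a description of Squadcast's rules engine.

```python
def deduplicate(alerts, key_fields, window_seconds):
    """Group alerts that share key fields and arrive within the window
    of their parent incident. Returns one dict per resulting incident."""
    incidents = []
    latest_by_key = {}
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        key = tuple(alert.get(field) for field in key_fields)
        parent = latest_by_key.get(key)
        if parent is not None and alert["ts"] - parent["ts"] <= window_seconds:
            parent["duplicates"] += 1  # attach to the existing incident
        else:
            parent = {"key": key, "ts": alert["ts"], "duplicates": 0}
            incidents.append(parent)
            latest_by_key[key] = parent
    return incidents


ALERTS = [
    {"source": "prometheus", "host": "db-1", "ts": 0},
    {"source": "prometheus", "host": "db-1", "ts": 60},
    {"source": "prometheus", "host": "db-2", "ts": 90},
    {"source": "prometheus", "host": "db-1", "ts": 4000},
]
```

With a 300-second window keyed on source and host, the four alerts above collapse into three incidents, and the second `db-1` alert is counted as a duplicate of the first.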

Suppress non-actionable alerts

Suppression Rules are part of the Squadcast Rules Engine. They let you automatically suppress non-actionable alerts such as warning, informational, or test alerts. All suppressed data remains available on the platform.

🔹 Best Practice Tip 🔹 While creating the automation rules, be sure to add the source name under each condition so that the rule applies only to that particular alert source.

🔹 Best Practice Tip 🔹 Always arrange your automation rules in the right order based on the priority of how the rules have to be executed.

4. Incident Communication

Set up your Public and Private Status Page

Status Page helps you communicate outages and scheduled maintenance of your services to your customers and stakeholders.

Status Pages can be either public (accessible by everyone) or private (accessible only by your team on Squadcast). You can also add a subscription option to your public status page so that customers are automatically informed of any updates.

Create a Postmortem Report

An Incident Postmortem is a post-incident review that allows users to learn from major incidents by providing a summary of events that transpired, how the response was handled, and what steps were taken to resolve the incident.

You can create an incident postmortem from within an incident page once the incident is resolved. You can choose from several popularly used postmortem templates or create custom templates for your Organization.

🔹 Best Practice Tip 🔹 Always add all the necessary information while creating a postmortem.

🔹 Best Practice Tip 🔹 Add detailed notes in the Incident Notes section on the Incident Details Page, and star them so that they are attached to the postmortem as well.

5. SRE Visibility and Insights

Set up Service Level Objectives (SLOs)

SLOs are used to define and track your service’s performance delivery. Any breach of SLOs will trigger an incident and notify the relevant Users, Squads, or Schedules.
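
As a worked example of what an SLO breach means numerically, the sketch below compares a measured success ratio against a target and reports how much of the error budget remains. The function and its inputs are illustrative assumptions; Squadcast's SLO Tracker may define and compute these values differently.

```python
def slo_status(target, good_events, total_events):
    """Evaluate an SLO.

    target is a fraction such as 0.999. Returns the measured ratio (SLI),
    whether the SLO is breached, and the fraction of error budget left
    (negative once the budget is overspent).
    """
    if not 0 < target < 1:
        raise ValueError("target must be between 0 and 1")
    if total_events == 0:
        return {"sli": None, "breached": False, "budget_remaining": 1.0}
    sli = good_events / total_events
    allowed_bad = (1 - target) * total_events
    bad = total_events - good_events
    return {
        "sli": sli,
        "breached": sli < target,
        "budget_remaining": 1 - bad / allowed_bad,
    }
```

For a 99.9% target over 1,000,000 requests, 250 failures leave roughly three quarters of the error budget, while 2,000 failures breach the SLO.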

Analytics and Reporting

Analytics help you view the performance of your Organization or Team over a given period. They give you insight into how your system is functioning and how your responders are doing.

You can also filter reports based on specific services, tags, and users.
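
MTTR, the metric referenced throughout this guide, can be computed from incident timestamps. The sketch below assumes a simple record format (`service`, `triggered_at`, `resolved_at`) chosen for this example; it is not the format of Squadcast's analytics data.

```python
from datetime import datetime
from statistics import mean


def mttr_minutes(incidents, service=None):
    """Mean time to resolve, in minutes, optionally filtered by service.

    Unresolved incidents are skipped. Returns None if nothing matches.
    """
    durations = [
        (i["resolved_at"] - i["triggered_at"]).total_seconds() / 60
        for i in incidents
        if i.get("resolved_at") and (service is None or i["service"] == service)
    ]
    return mean(durations) if durations else None


INCIDENTS = [
    {"service": "payments", "triggered_at": datetime(2024, 1, 1, 10, 0),
     "resolved_at": datetime(2024, 1, 1, 10, 30)},
    {"service": "website", "triggered_at": datetime(2024, 1, 2, 9, 0),
     "resolved_at": datetime(2024, 1, 2, 10, 30)},
    {"service": "website", "triggered_at": datetime(2024, 1, 3, 9, 0),
     "resolved_at": None},
]
```

Filtering by service mirrors the report filters mentioned above: here the overall MTTR is 60 minutes, while the `payments` service alone averages 30 minutes.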

The mobile app is available on both the App Store and Google Play.

See the documentation on managing users for details.

🔹 Best Practice Tip 🔹 Configuring SSO before adding users helps ensure all users link their SSO account. Squadcast supports any SAML 2.0-based Single Sign-On (SSO), and you can set it up for your Organization by following the SSO integration guide.

See the documentation on managing teams for details.

See the documentation on managing roles for details.

You can create multiple squads within a team. See the documentation on managing squads for details.

See the documentation on how to Manage Schedules and add overrides.

🔹 Best Practice Tip 🔹 Learn and understand the difference between a Rotation, a Shift, and an Escalation Policy. Taking the time to understand the relationship between these functions will help you determine the most effective way to configure your team’s on-call schedule.

Squadcast enables you to add time-based Escalation Rules for users, squads (a group of users), or schedules (on-call schedules).

See the documentation on managing Escalation Policies for details.

See the documentation on managing services for details.

To see the platform in action, integrate one of your existing tools. You can use a generic Email or API integration to get your alerts flowing, or use one of our native integrations.

See the documentation on Extensions for more about how to use them.

See the documentation on Tagging Rules for more.

See the documentation on Routing Rules for more.

See the documentation on Alert Deduplication for more.

See the documentation on Suppression Rules for more.

See the documentation on setting up Status Pages for details.

See the documentation on creating Postmortem templates and on creating Postmortems for details.

See the documentation on setting up SLOs for details.

See the documentation on using Analytics for details.

Have any questions? Ask the community.

Your Role as an Account Owner

You are responsible for the management of the overall configuration, workflow, user permissions, and billing. You are the root user of the organization.

Your Permissions as an Account Owner

You have access to all functionality across the platform including scheduling, integrations, teams, user permissions, and billing.