Connect Datadog with Slack

Implementation Guide

Overview: Connecting Datadog and Slack

The Datadog-to-Slack integration is the cornerstone of any mature incident response and observability workflow. Datadog is a cloud-scale monitoring platform that aggregates metrics, logs, traces, and synthetic test results across infrastructure, applications, and third-party services. Slack is the de facto synchronous communication layer for engineering teams. Connecting these two systems means that a breach of a Service Level Objective, a spike in error rate, a failed deployment marker, or a P1 infrastructure alert does not sit in a dashboard that nobody is watching — it becomes an actionable notification in the exact Slack channel where the relevant team is operating.

This is not simply a matter of posting a message. Enterprise-grade Datadog-to-Slack integrations require conditional routing (different severity levels to different channels), rich message formatting using Slack's Block Kit to surface actionable context without requiring a login to Datadog, bidirectional linkage between Datadog Incidents and Slack incident channels, and suppression logic to prevent alert storms from flooding channels with redundant notifications. This guide covers the full implementation of all these patterns using Datadog's native Slack integration, Webhook Monitors, and the Datadog Incident Management API.

Core Prerequisites

On the Datadog side, you must have a Datadog account at the Pro tier or above (the Incidents feature requires Enterprise). To install the native Slack integration, navigate to Integrations > Slack in the Datadog UI and authorize it against your Slack workspace using OAuth 2.0. The required Slack OAuth scopes that Datadog requests are: channels:read, channels:join, chat:write, chat:write.public, files:write, groups:read, links:read, reactions:write, and users:read. The Slack account performing the OAuth authorization must be a Workspace Admin or otherwise permitted to approve app installations. For Webhook-based integrations (used for custom payloads), you need a Datadog API key and optionally an Application Key, available under Organization Settings (API Keys and Application Keys, respectively).

If you are configuring alert routing through a Datadog Monitor, you need at minimum the Monitors Write permission in your Datadog RBAC role. For Incident Management integration, the Incidents Write permission is also required. The Slack Bot Token (starting with xoxb-) used for the native integration should be stored as a Datadog secret if you are using a custom Webhook setup.

Top Enterprise Use Cases

The primary enterprise use case is severity-tiered alert routing. A P1 alert (e.g., error rate exceeding 5% for 5 minutes) is routed to #incidents-p1 and simultaneously pages the on-call engineer via Slack DM. A P2 alert (warning threshold breached) goes to #alerts-warning. A recovery notification is sent as a reply to the original alert message's thread, keeping context grouped rather than generating a new top-level message.

The second use case is automated incident channel creation. When a Datadog Incident is created via the Incidents API or the UI, a Slack channel is automatically provisioned (e.g., #inc-2024-11-01-database-latency), relevant stakeholders are invited, and the Datadog Incident timeline is cross-linked. This eliminates the manual overhead of setting up a war room during a live incident.

The third use case is metric snapshot delivery. Datadog can attach a snapshot image of the triggering metric graph to the Slack notification: the native integration includes one automatically, and for custom webhook payloads the graph snapshot API can generate an image URL to embed. Either way, on-call engineers see the actual graph in Slack without needing to log in to Datadog to understand the scope of the issue.

Step-by-Step Implementation Guide

Begin with the native Datadog Slack integration. After completing the OAuth flow described in the prerequisites, navigate to a Monitor in Datadog (Monitors > All Monitors > select a monitor > Edit). In the "Notify your team" section, you can reference Slack channels using the @slack-WORKSPACE_NAME-CHANNEL_NAME mention syntax. For example, @slack-AcmeCorp-incidents-p1 will post to the #incidents-p1 channel in the AcmeCorp workspace.

The monitor message template supports conditional logic using Datadog's template variable system (tag template variables such as {{service.name}} resolve only when the monitor is grouped by that tag). A well-structured enterprise monitor notification message looks like this:

{{#is_alert}}
🔴 *ALERT: {{monitor.name}}*
Service: {{service.name}}
Environment: {{env.name}}
Triggered at: {{last_triggered_at}}
Value: {{value}} (Threshold: {{threshold}})
[View in Datadog]({{url}}) | [Runbook](https://wiki.acme.com/runbooks/{{service.name}})
@slack-AcmeCorp-incidents-p1
{{/is_alert}}

{{#is_recovery}}
✅ *RESOLVED: {{monitor.name}}*
Recovered at: {{last_triggered_at}}
@slack-AcmeCorp-incidents-p1
{{/is_recovery}}

{{#is_warning}}
🟡 *WARNING: {{monitor.name}}*
Value: {{value}} (Warning threshold: {{warn_threshold}})
@slack-AcmeCorp-alerts-warning
{{/is_warning}}

For richer formatting using Slack Block Kit, you must bypass the native integration and use a Datadog Webhook Monitor. Navigate to Integrations > Webhooks and create a new webhook pointing to a Slack Incoming Webhook URL or your Slack app's chat.postMessage API endpoint. Configure the webhook payload to use Slack's Block Kit JSON format. The Datadog Webhook integration passes a JSON payload and you can reference Datadog template variables within it. The following is a Block Kit payload template that renders a structured alert card:

{
  "channel": "#incidents-p1",
  "blocks": [
    {
      "type": "header",
      "text": {
        "type": "plain_text",
        "text": "🔴 Alert: $EVENT_TITLE"
      }
    },
    {
      "type": "section",
      "fields": [
        { "type": "mrkdwn", "text": "*Monitor:*\n$EVENT_TITLE" },
        { "type": "mrkdwn", "text": "*Status:*\n$ALERT_STATUS" },
        { "type": "mrkdwn", "text": "*Environment:*\n$TAGS" },
        { "type": "mrkdwn", "text": "*Triggered:*\n$DATE" }
      ]
    },
    {
      "type": "actions",
      "elements": [
        {
          "type": "button",
          "text": { "type": "plain_text", "text": "View in Datadog" },
          "url": "$LINK"
        }
      ]
    }
  ]
}

The variables $EVENT_TITLE, $ALERT_STATUS, $TAGS, $DATE, and $LINK are Datadog's native Webhook template variables. A full list is available in the Datadog documentation under Integrations > Webhooks > Variables.
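Before saving the webhook, it is worth simulating the substitution locally so a malformed template fails on your machine rather than during an incident. The following is an illustrative sketch, not part of the Datadog integration itself: the sample event values are hypothetical, and values containing double quotes would need JSON escaping before substitution.

```python
import json

# Minimal Block Kit template using the same $-variables Datadog's
# Webhooks integration substitutes before delivery.
TEMPLATE = """{
  "channel": "#incidents-p1",
  "blocks": [
    {"type": "header",
     "text": {"type": "plain_text", "text": "Alert: $EVENT_TITLE"}},
    {"type": "section",
     "fields": [{"type": "mrkdwn", "text": "*Status:*\\n$ALERT_STATUS"}]}
  ]
}"""

def render_webhook_payload(template: str, values: dict) -> dict:
    """Substitute $VARIABLES into the JSON template, then parse it.

    Caveat: values containing double quotes or backslashes must be
    JSON-escaped first, or the parse will fail."""
    for var, value in values.items():
        template = template.replace(var, value)
    return json.loads(template)

payload = render_webhook_payload(TEMPLATE, {
    "$EVENT_TITLE": "High error rate on checkout-service",  # hypothetical
    "$ALERT_STATUS": "Triggered",
})
```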

For the Incident Management bidirectional sync, use the Datadog Incidents API (POST /api/v2/incidents) combined with the Slack API's conversations.create endpoint. When a Datadog Incident is created via the API with the following payload, your automation layer (a Lambda function or a Datadog Workflow) provisions a dedicated Slack channel and links it back:

{
  "data": {
    "type": "incidents",
    "attributes": {
      "title": "Database write latency exceeding SLO",
      "customer_impact_scope": "All users in EU-WEST-1",
      "customer_impacted": true,
      "fields": {
        "severity": { "type": "dropdown", "value": "SEV-1" },
        "state": { "type": "dropdown", "value": "active" },
        "teams": { "type": "autocomplete", "value": "platform-engineering" }
      }
    }
  }
}
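A minimal sketch of the automation layer itself (the Lambda or Workflow step) might look like the following. The channel-naming helper mirrors the #inc-2024-11-01-database-latency convention above, and the Slack call uses conversations.create as described; the token handling and error handling are deliberately simplified for illustration.

```python
import json
import re
import urllib.request

def incident_channel_name(date_iso: str, title: str) -> str:
    """Derive a Slack-safe channel name (lowercase, digits, hyphens,
    within Slack's 80-character limit), e.g. inc-2024-11-01-database-latency."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"inc-{date_iso}-{slug}"[:80]

def provision_incident_channel(bot_token: str, name: str) -> str:
    """Create the channel via Slack's conversations.create; return its ID."""
    req = urllib.request.Request(
        "https://slack.com/api/conversations.create",
        data=json.dumps({"name": name}).encode(),
        headers={"Authorization": f"Bearer {bot_token}",
                 "Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        body = json.load(resp)
    if not body.get("ok"):
        raise RuntimeError(f"conversations.create failed: {body.get('error')}")
    return body["channel"]["id"]

name = incident_channel_name("2024-11-01", "Database write latency exceeding SLO")
```

The returned channel ID is what you would then write back to the Datadog Incident (and use for subsequent chat.postMessage calls), completing the cross-link.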

Common Pitfalls & Troubleshooting

A 403 Forbidden when Datadog attempts to post to a private Slack channel means the Datadog Slack app has not been added to that channel. Private channels require a Workspace Admin to manually invite the app using /invite @Datadog inside the channel. The native integration cannot join private channels automatically.

A 429 Too Many Requests from the Slack API is the most common issue during alert storms. Slack's rate limits are per method: chat.postMessage is limited to approximately one message per second per channel. If a Datadog monitor flaps repeatedly (its state oscillates between ALERT and OK within a short window), it generates a burst of notifications that exceeds this limit. Space out repeat notifications with the monitor's renotify_interval setting (note that notify_no_data governs missing-data alerts, not re-notification), and dampen the flapping itself by setting a recovery threshold under the monitor's advanced options so the monitor does not resolve until the metric is comfortably back within bounds.

If Slack message threading is not working (recovery messages appear as new top-level messages rather than replies to the original alert), the cause is that the native Datadog integration does not support threaded replies via thread_ts out of the box. Implement threading with a custom Webhook plus a stateful data store (e.g., a DynamoDB table) that records the Slack message_ts against the Datadog Monitor ID at alert time, then calls chat.postMessage with thread_ts populated at recovery time.
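That alert-to-recovery mapping can be sketched with an in-memory dict standing in for the DynamoDB table: handle_alert runs when the webhook fires on the ALERT transition (after capturing ts from the chat.postMessage response), and recovery_post_args builds the arguments for the recovery call.

```python
# A plain dict stands in for the DynamoDB table keyed by monitor ID.
def handle_alert(store: dict, monitor_id: str, message_ts: str) -> None:
    """At alert time, remember the ts of the top-level Slack message."""
    store[monitor_id] = message_ts

def recovery_post_args(store: dict, monitor_id: str,
                       channel: str, text: str) -> dict:
    """Build chat.postMessage arguments for the recovery notification.

    If we saw the original alert, reply in its thread via thread_ts;
    otherwise fall back to a new top-level message."""
    args = {"channel": channel, "text": text}
    thread_ts = store.pop(monitor_id, None)
    if thread_ts is not None:
        args["thread_ts"] = thread_ts
    return args

store: dict = {}
handle_alert(store, "monitor-123", "1712345678.000100")
recovery = recovery_post_args(store, "monitor-123",
                              "#incidents-p1", "RESOLVED: error rate normal")
```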

A 401 Unauthorized from the Datadog Webhooks integration typically means your Slack Bot Token has been rotated or the Slack app was uninstalled and reinstalled, generating a new token. Update the Authorization header value in the Datadog Webhook configuration to the new xoxb- token.
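A quick way to confirm the token itself is the problem is Slack's auth.test method, which returns "ok": false (with an invalid_auth error) for a revoked or rotated token. The sketch below pairs that call with a cheap local prefix check; timeouts and error handling are minimal.

```python
import json
import urllib.request

def looks_like_bot_token(token: str) -> bool:
    """Cheap local sanity check: Slack bot tokens start with xoxb-."""
    return token.startswith("xoxb-") and len(token) > len("xoxb-")

def slack_token_is_valid(token: str) -> bool:
    """Call Slack's auth.test; a rotated or revoked token returns ok: false."""
    req = urllib.request.Request(
        "https://slack.com/api/auth.test",
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return bool(json.load(resp).get("ok"))
```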