
Technical API Analysis: OpenAI GPT vs Anthropic Claude

A direct comparison of OpenAI and Anthropic APIs covering request structures, error handling, cost and specific use cases for integration engineers.

TL;DR Matrix

A summary of key technical differences for quick reference.

Dimension | OpenAI (GPT series) | Anthropic (Claude series)
API Endpoint | /v1/chat/completions | /v1/messages
System Prompt | Passed as a message object with role: "system" | Passed as a top-level system parameter
Tool/Function Calling | Native tools and tool_choice parameters | Beta tools parameter, requires specific model versions
Vision Support | content array with type: "image_url" objects | content array with type: "image" objects (base64 encoded)
Streaming | stream: true returns Server-Sent Events (SSE) | stream: true returns Server-Sent Events (SSE)
Key Headers | Authorization: Bearer $API_KEY | x-api-key: $API_KEY, anthropic-version: YYYY-MM-DD
Rate Limit Header | X-RateLimit-Remaining-Tokens | anthropic-ratelimit-requests-remaining
Cost Model | Per-token (input/output), tiered by model | Per-token (input/output), tiered by model

Use Cases

Choosing a model often depends on the job's specific requirements.

  • OpenAI: Best for complex agentic workflows needing reliable tool use. Its function calling is mature and well-documented, making it ideal for integrating with external APIs or databases. Fine-tuning capabilities also offer a path to specialised behaviour.
  • Anthropic: Excels at tasks requiring long context windows and strong adherence to safety guidelines. It's a good choice for summarising large documents, analysing legal texts or performing creative writing tasks where the guardrails of its Constitutional AI training are preferred.
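To make the tool-use difference concrete, here is a minimal sketch of how each provider expects a tool definition in the request body. These are request payloads only (no API call is made), and the get_weather tool is a hypothetical example, not a real function.

```python
# Hypothetical get_weather tool, expressed as raw request payloads for both
# APIs. Nothing is sent over the network; this only contrasts the two schemas.

weather_params = {
    "type": "object",
    "properties": {"city": {"type": "string", "description": "City name"}},
    "required": ["city"],
}

# OpenAI: each tool is wrapped in a "type": "function" object, with the
# schema under function.parameters; tool_choice controls invocation.
openai_payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Weather in London?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": weather_params,
        },
    }],
    "tool_choice": "auto",
}

# Anthropic: tools are flat objects and the schema lives under input_schema.
anthropic_payload = {
    "model": "claude-3-sonnet-20240229",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Weather in London?"}],
    "tools": [{
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "input_schema": weather_params,
    }],
}
```

The nesting difference matters when writing an abstraction layer: a shared tool registry needs a per-provider serialiser rather than a single shape.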

Technical Analysis

The two platforms have converged on a similar messages API structure, but key differences remain in request and response shapes.

OpenAI: Chat Completions API

OpenAI's API is structured around a list of message objects. The system prompt is the first message in this list.

Request (curl)

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant that speaks British English."
      },
      {
        "role": "user",
        "content": "What is the colour of the sky?"
      }
    ],
    "max_tokens": 50
  }'

Response (Abbreviated)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1714567890,
  "model": "gpt-4o-2024-05-13",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The colour of the sky is typically blue during a clear day."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 13,
    "total_tokens": 38
  }
}

Anthropic: Messages API

Anthropic's API requires a version header and separates the system prompt from the conversational messages.

Request (curl)

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "system": "You are a helpful assistant that speaks British English.",
    "messages": [
      {
        "role": "user",
        "content": "What is the colour of the sky?"
      }
    ],
    "max_tokens": 50
  }'

Response (Abbreviated)

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "model": "claude-3-sonnet-20240229",
  "content": [
    {
      "type": "text",
      "text": "The colour of the sky is typically blue on a clear day."
    }
  ],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 23,
    "output_tokens": 14
  }
}

The bit most guides skip: The primary structural difference is Anthropic's elevation of system to a first-class parameter outside the messages array. This isn't just syntactic sugar. Anthropic's models are tuned to treat this parameter with higher precedence, which can lead to more reliable instruction-following for persona and rule-setting. It also means you can't inject a system prompt halfway through a conversation, unlike with OpenAI. Anthropic's response structure is also more explicit. The content is an array of blocks (e.g. type: "text"), preparing for multi-modal outputs, whilst OpenAI's is a simpler string in message.content.
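As a sketch of how that shape difference plays out in client code, a thin normaliser can hide it from downstream callers. The field names below are taken directly from the abbreviated responses above.

```python
def extract_text(response: dict) -> str:
    """Return the assistant text from either provider's response dict.

    OpenAI puts a plain string at choices[0].message.content;
    Anthropic returns a list of typed content blocks.
    """
    if "choices" in response:
        # OpenAI chat.completion shape
        return response["choices"][0]["message"]["content"]
    # Anthropic message shape: concatenate the text blocks
    return "".join(
        block["text"] for block in response["content"] if block["type"] == "text"
    )
```

Normalising at this boundary keeps the block-based Anthropic shape from leaking into application code, and leaves room to handle future non-text block types in one place.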

Error Handling

Your integration's stability depends on correctly handling API errors, particularly 429 and 5xx status codes.

  • 429 Too Many Requests: Both APIs use this for rate limiting. Your code must handle it gracefully. OpenAI provides X-RateLimit-* headers to help you manage token and request limits proactively. Anthropic provides Retry-After and anthropic-ratelimit-requests-remaining headers. Your retry logic must respect these headers to avoid being blocked.
  • 5xx Server Error: These are transient server-side issues. Implement an exponential backoff strategy with jitter to handle them. A simple approach is to wait (2^attempt * base_delay) + random_jitter seconds before retrying. Don't retry indefinitely. Cap retries at 3-5 attempts before failing the operation.

Python Retry Logic Example

import time
import random
from openai import APITimeoutError, APIConnectionError, RateLimitError, APIStatusError

# This example uses OpenAI's exceptions, but the pattern is identical for Anthropic
def call_with_retry(api_call_func, max_retries=5, base_delay=1.0):
    """Calls an API function with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return api_call_func()
        except (APITimeoutError, APIConnectionError, RateLimitError, APIStatusError) as e:
            if isinstance(e, APIStatusError) and e.status_code < 500 and e.status_code != 429:
                raise # Don't retry on client errors like 400 Bad Request
            if attempt == max_retries - 1:
                raise # Re-raise the final exception

            delay = (base_delay * 2**attempt) + random.uniform(0, 1)
            time.sleep(delay)
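For a self-contained demonstration of the same backoff pattern, here is a variant that takes the retryable exception types as a parameter (so it needs no SDK installed), exercised with a stub call that fails twice before succeeding:

```python
import random
import time

def call_with_retry(api_call_func, max_retries=5, base_delay=0.01,
                    retryable=(TimeoutError, ConnectionError)):
    """Exponential backoff with jitter, with injectable exception types."""
    for attempt in range(max_retries):
        try:
            return api_call_func()
        except retryable:
            if attempt == max_retries - 1:
                raise  # Re-raise the final exception
            # (base_delay * 2^attempt) + jitter, as described above
            time.sleep((base_delay * 2**attempt) + random.uniform(0, base_delay))

# Stub standing in for a real SDK call: fails twice, then succeeds.
attempts = {"n": 0}
def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("simulated transient failure")
    return "ok"

result = call_with_retry(flaky_call)
print(result, attempts["n"])  # ok 3
```

The tiny base_delay keeps the demonstration fast; in production, a base delay of around one second is a more realistic starting point.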

Cost & Scalability

Both platforms operate on a pay-as-you-go, per-token model with different prices for input and output tokens.

  • Cost: Anthropic's Claude 3 Opus is generally more expensive than OpenAI's GPT-4o, whilst Claude 3 Sonnet is priced competitively against GPT-4 Turbo. The key cost driver for large tasks is the context window. Feeding a 150k token document to Claude for a small summary is expensive. A pre-processing step to extract relevant chunks is often more cost-effective than sending the entire document.
  • Scalability: Standard pay-as-you-go models have rate limits (tokens-per-minute and requests-per-minute). For high-throughput needs, both platforms offer provisioned throughput. This gives you a reserved amount of model processing capacity for a fixed price, removing rate limits and offering lower latency. It's a significant cost commitment and only makes sense for high-volume, predictable workloads.
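The "pre-process before you send" advice above is easy to quantify with back-of-envelope arithmetic. The per-million-token prices in this sketch are illustrative placeholders, not current list prices; always check each provider's pricing page.

```python
# Back-of-envelope cost estimate for a per-token pricing model.
# The prices used below are ILLUSTRATIVE, not real list prices.

def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost in dollars, given per-million-token input/output prices."""
    return (input_tokens / 1_000_000) * price_in_per_m \
         + (output_tokens / 1_000_000) * price_out_per_m

# Feeding a 150k-token document for a 500-token summary, at a hypothetical
# $5/M input and $15/M output:
full_doc = estimate_cost(150_000, 500, 5.0, 15.0)

# Extracting a relevant 10k-token chunk first, same output size:
chunked = estimate_cost(10_000, 500, 5.0, 15.0)

print(f"${full_doc:.4f} vs ${chunked:.4f}")  # $0.7575 vs $0.0575
```

At these illustrative prices the input tokens dominate, which is why chunking or retrieval ahead of the model call is usually the first cost optimisation worth making.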
