
Kafka vs Pub/Sub

The streaming backbone is the most consequential architectural decision in an event-driven platform. This page makes a clear recommendation based on what each system actually costs to run, operate, and scale.

LAST_UPDATED: 2025-07

The streaming backbone determines how you handle schema evolution, consumer lag, replay, fan-out, and operational overhead. Everything downstream adapts to it: your transformation pipelines, dead letter queues, observability tooling, and cost model all behave differently depending on which one you choose. Getting this wrong is expensive to undo.

The choice reduces to this: Kafka (Apache Kafka, Confluent Cloud, Amazon MSK) gives you a distributed, partitioned, retention-based log with full replay control and a mature schema registry. You pay for it in operational complexity or in Confluent licensing. Pub/Sub (Google Cloud Pub/Sub) is a fully managed subscription-based bus with no brokers to operate, first-class GCP integration, and a pricing model that punishes high consumer fan-out at scale.

The recommendation in this blueprint: use Pub/Sub as the default for GCP-native platforms with five or fewer consumer groups and moderate throughput. Switch to Kafka when you need exact replay semantics, more than five independent consumer groups reading the same topic, sub-20ms latency, or multi-cloud portability. The rest of this page explains why.

How Kafka works internally

Kafka stores messages as an ordered, partitioned, immutable log. Each topic is divided into partitions (typically 3 to 100+), each partition maintained on one broker (leader) with replicas on others. Producers route messages to partitions via a partition key: the same order ID always lands on the same partition, guaranteeing ordering per key. Each consumer in a group owns a set of partitions and tracks its own offset (position in the log). Consumer groups are independent: Group A at offset 84,000 and Group B at offset 12 are reading the same partition simultaneously.
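The per-key routing invariant can be illustrated with a toy partitioner. Real clients use murmur2-style hashes (not SHA-256 as here), so this sketch only demonstrates the property that matters: equal keys always map to the same partition.

```python
import hashlib

def assign_partition(key: bytes, num_partitions: int) -> int:
    """Toy stand-in for the producer's partitioner: a stable hash of the
    key modulo the partition count. Any deterministic hash gives the
    per-key ordering guarantee; real clients use murmur2 so that
    different client implementations agree on the mapping."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# The same order ID always routes to the same partition, so its events
# stay in order; different keys spread across the partition set.
p_a = assign_partition(b"ord-12345", 12)
p_b = assign_partition(b"ord-12345", 12)
assert p_a == p_b
```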

Replay is a first-class operation. Seek any consumer group to any offset or timestamp within the retention window (typically 7 days). Replaying the Bronze ingestion pipeline after a bad schema deployment takes two commands and does not affect any other consumer group. The Schema Registry (Confluent SR or Apicurio) enforces Avro or Protobuf compatibility at publish time: breaking changes are rejected before they reach the topic.
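The two-command replay can be driven from the kafka-consumer-groups CLI (--reset-offsets --to-datetime) or in a few lines of the consumer API. A hedged sketch of the API route, reusing the broker and topic names from the examples below; offsets_for_times resolves a timestamp to the first offset at or after it, per partition.

```python
from datetime import datetime, timezone

def to_epoch_ms(dt: datetime) -> int:
    """Kafka message timestamps are epoch milliseconds."""
    return int(dt.timestamp() * 1000)

def rewind_group(consumer, topic: str, partitions: list[int], dt: datetime) -> None:
    """Seek one consumer group to a wall-clock instant. Only this group's
    committed offsets move; every other group on the topic is untouched."""
    from confluent_kafka import TopicPartition  # deferred: only useful with a broker
    ts = to_epoch_ms(dt)
    # Ask the brokers which offset corresponds to the timestamp on each partition.
    wanted = [TopicPartition(topic, p, ts) for p in partitions]
    resolved = consumer.offsets_for_times(wanted, timeout=10.0)
    consumer.assign(resolved)                              # adopt the resolved offsets
    consumer.commit(offsets=resolved, asynchronous=False)  # persist them for the group
```

Run this with a consumer created under the target group.id, after stopping the group's other members: an active rebalance would otherwise override the seek.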

[Diagram: Kafka internal architecture]

How Pub/Sub works internally

Pub/Sub stores messages in a topic and delivers a copy to each subscription independently. There are no partitions or offsets: the service manages delivery internally. Pull subscribers call the API to receive messages; push subscriptions deliver to an HTTP endpoint. Every message has an acknowledgement (ack) deadline. If the subscriber does not ack within the window (default 10s, configurable to 600s), Pub/Sub redelivers. At-least-once delivery is guaranteed; exactly-once requires idempotent consumers.

After max_delivery_attempts failed attempts (configurable per subscription), Pub/Sub routes the message to a dead letter topic (DLT) automatically. Replay via Pub/Sub Seek rewinds a subscription to a snapshot or timestamp; timestamp seek only works if the subscription retains acknowledged messages, and because there are no offsets you cannot resume from an exact position the way a Kafka consumer group can. Pub/Sub Schemas provide basic Avro or Protobuf validation at publish time, but do not enforce compatibility evolution rules the way Confluent Schema Registry does.
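Seek from code is a single API call. A minimal sketch, assuming the project and subscription names used in the subscriber example below; remember that timestamp seek only does something useful if the subscription was created with retain_acked_messages.

```python
from datetime import datetime, timedelta, timezone

def seek_target(now: datetime, hours_back: int) -> datetime:
    """Wall-clock instant to rewind to."""
    return now - timedelta(hours=hours_back)

def replay_last_hours(hours_back: int) -> None:
    from google.cloud import pubsub_v1  # deferred: only useful with GCP credentials
    subscriber = pubsub_v1.SubscriberClient()
    sub = subscriber.subscription_path(
        "my-gcp-project", "orders.placed.v1-ingestion-pipeline"
    )
    target = seek_target(datetime.now(timezone.utc), hours_back)
    # Marks messages published before `target` acknowledged and later ones
    # unacknowledged, for this subscription only. Requires
    # retain_acked_messages=True and a sufficient message_retention_duration.
    subscriber.seek(request={"subscription": sub, "time": target})
```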

[Diagram: Pub/Sub internal architecture]

Platform fit: Kafka as the backbone

With Kafka, CDC from legacy databases uses Kafka Connect with Debezium source connectors. All producers register schemas in Confluent Schema Registry before publishing. Dataflow connects via the Kafka I/O connector. Consumer lag per partition per consumer group is the primary freshness signal, monitored via JMX or the Kafka AdminClient API. Schema evolution is enforced at the registry level: a producer cannot break a schema without incrementing the subject version.

[Diagram: Platform with Kafka backbone]

Platform fit: Pub/Sub as the backbone

With Pub/Sub, CDC from legacy databases goes through Google Datastream, which replicates directly into BigQuery or GCS without a Debezium cluster to manage. Domain event producers publish to Pub/Sub topics; Dataflow reads from subscriptions via the native PubSubIO connector. Freshness monitoring uses the Cloud Monitoring oldest_unacked_message_age metric per subscription. Access control is handled by IAM bindings at the subscription level, not by client-side ACL configuration.

[Diagram: Platform with Pub/Sub backbone]
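The freshness check can be scripted against the Cloud Monitoring API. A sketch, assuming the my-gcp-project project and subscription name from the examples below; the filter string is the part worth getting right, the rest is standard list_time_series plumbing.

```python
from datetime import datetime, timedelta, timezone

def freshness_filter(subscription_id: str) -> str:
    """Cloud Monitoring filter for the per-subscription freshness metric."""
    return (
        'metric.type = "pubsub.googleapis.com/subscription/oldest_unacked_message_age"'
        f' AND resource.labels.subscription_id = "{subscription_id}"'
    )

def oldest_unacked_age_seconds(project: str, subscription_id: str) -> int:
    from google.cloud import monitoring_v3  # deferred: only useful with GCP credentials
    client = monitoring_v3.MetricServiceClient()
    now = datetime.now(timezone.utc)
    series = client.list_time_series(
        request={
            "name": f"projects/{project}",
            "filter": freshness_filter(subscription_id),
            "interval": {"start_time": now - timedelta(minutes=5), "end_time": now},
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    # The metric is an int64 gauge in seconds; take the newest reported point.
    return max(point.value.int64_value for ts in series for point in ts.points)
```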

Decision matrix

Opinionated comparison based on production use. "Tie" is used only where it is genuinely the honest answer.

Concern | Kafka | Pub/Sub | Edge goes to
Peak throughput | Unlimited via horizontal partition scaling; 1M+ msg/sec achievable. | High but bounded: default quota ~1 GB/s per region; higher requires a quota increase. | Kafka
End-to-end latency | 5–20ms p99 with low-latency config (acks=1, linger.ms=0); tunable against throughput. | 30–100ms typical; not tunable. Sufficient for most data pipeline workloads. | Kafka
Replay | Full: seek any consumer group to any offset or timestamp; groups replay independently. | Seek rewinds a subscription to a snapshot or timestamp; requires retained (acked) messages and offers no offset-level precision. | Kafka
Consumer fan-out | Unlimited consumer groups, each with independent offsets; 10 groups cost the same as 1. | Each subscription is billed separately per GB delivered; 10 subscriptions = 10x delivery cost. | Kafka
Ordering | Strict per-partition order: same partition key = guaranteed ordering. | Unordered by default; message ordering keys enable per-key ordering within a region. | Tie
Operational complexity | High self-managed; moderate on Confluent Cloud/MSK. Partition rebalancing, JVM tuning, quota management. | Zero: no brokers, no rebalancing, no JVM. GCP manages everything. | Pub/Sub
GCP integration | Requires Kafka Connect or the Kafka I/O Dataflow connector; not native GCP. IAM via Confluent RBAC. | First-class: native Dataflow I/O, Cloud Monitoring, BigQuery subscriptions, IAM bindings. | Pub/Sub
Cost at low volume | Dedicated clusters are expensive even idle; serverless Kafka (Confluent) is cheaper but has limits. | Pay per message delivered; free tier covers the first 10 GB/month. | Pub/Sub
Cost at high fan-out | Flat: adding consumer groups does not change Kafka cluster cost. | Multiplies per subscription: 10 subscriptions at 10 TB/month = 100 TB of delivery charges. | Kafka
Schema registry | Confluent Schema Registry: full backward/forward/full compatibility enforcement, schema IDs in the wire format. | Pub/Sub Schemas: Avro/Protobuf validation at publish; no compatibility evolution rules. | Kafka
Multi-region | Confluent multi-region replication; self-managed via MirrorMaker 2. More control, more setup. | Global by default; no configuration needed for cross-region delivery. | Pub/Sub

Cost comparison

Scenario

Events/sec: 10,000
Avg payload: 1 KB
Retention: 7 days
Downstream consumers: 3

Ingress: 10 MB/s = 864 GB/day = ~25.9 TB/month. Retained at any point: ~6,050 GB (before replication).
Line item | Confluent Cloud Dedicated | GCP Pub/Sub
Cluster / base | 4 CKU × $0.44/CKU/hr × 730 hr = ~$1,285/mo | No cluster charge. $0
Storage (7-day, rf=3) | 18,150 GB × $0.10/GB-mo = ~$1,815/mo | No separate retention charge; included in delivery cost. ~$0
Publish / ingress | Included in cluster cost above. $0 additional | 25.9 TB: first 10 TB at $0.040/GB + 15.9 TB at $0.035/GB = ~$957/mo
Deliver (3 consumers) | Same-region egress: ~$0.01/GB × 77.7 TB = ~$777/mo (cross-cloud adds ~$0.09/GB) | 3 subscriptions × 25.9 TB × $0.037/GB avg = ~$2,875/mo
Schema Registry | Included in Dedicated. $0 | Pub/Sub Schemas included. $0
Monthly total (3 consumers) | ~$3,877/mo | ~$3,832/mo
Monthly total (10 consumers) | ~$5,690/mo (egress grows to ~$2,590) | ~$10,540/mo (delivery grows to ~$9,583)
Note
These are estimates using published list prices as of this page's last update. Confluent Cloud cross-cloud egress (Confluent on AWS, consumers on GCP) adds ~$0.09/GB and changes the picture significantly. Committed Use Discounts on both platforms reduce costs 20–40% at scale. The fan-out row is the key insight: Kafka cost scales with cluster size, not consumer count. Pub/Sub cost scales with delivery volume.
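The totals follow mechanically from the per-unit rates in the table. A small calculator, using those same illustrative rates (not authoritative pricing), makes the fan-out asymmetry easy to re-run against your own volumes:

```python
TB_GB = 1_000  # GB per TB (decimal, as the table uses)

def confluent_monthly(ingress_tb: float, consumers: int) -> float:
    """Cluster and storage are flat; only same-region egress scales with consumers."""
    cluster = 4 * 0.44 * 730                          # 4 CKU dedicated, ~$1,285/mo
    storage = 18_150 * 0.10                           # 7-day retention at rf=3, ~$1,815/mo
    egress = consumers * ingress_tb * TB_GB * 0.01    # ~$0.01/GB same-region
    return cluster + storage + egress

def pubsub_monthly(ingress_tb: float, consumers: int) -> float:
    """Publish is tiered; delivery is billed once per subscription."""
    publish = (min(ingress_tb, 10) * TB_GB * 0.040
               + max(ingress_tb - 10, 0) * TB_GB * 0.035)
    delivery = consumers * ingress_tb * TB_GB * 0.037  # blended rate from the table
    return publish + delivery

# At 3 consumers the two land within ~2% of each other (~$3,877 vs ~$3,832).
# Raising consumers to 10 moves only the egress/delivery terms: Pub/Sub's
# delivery multiplies with consumer count, while Kafka's flat cluster and
# storage terms continue to dominate its total.
kafka_3, pubsub_3 = confluent_monthly(25.9, 3), pubsub_monthly(25.9, 3)
kafka_10, pubsub_10 = confluent_monthly(25.9, 10), pubsub_monthly(25.9, 10)
```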

Code: production patterns

Kafka producer with Avro and Confluent Schema Registry

Partition key is the order_id. Same order always routes to same partition. acks=all means the leader waits for all in-sync replicas before acknowledging.

Python
from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

ORDER_SCHEMA = """
{
  "type": "record",
  "name": "OrderPlaced",
  "namespace": "com.example.orders",
  "fields": [
    {"name": "order_id",      "type": "string"},
    {"name": "customer_id",   "type": "string"},
    {"name": "amount_cents",  "type": "long"},
    {"name": "placed_at_ms",  "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
"""

sr_client = SchemaRegistryClient({"url": "https://sr.example.com"})
avro_serializer = AvroSerializer(sr_client, ORDER_SCHEMA)

producer = Producer({
    "bootstrap.servers": "kafka.example.com:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "API_KEY",
    "sasl.password": "API_SECRET",
    # Reliability: wait for all in-sync replicas before acknowledging;
    # idempotence prevents duplicates and reordering when retries fire
    "acks": "all",
    "enable.idempotence": True,
    "retries": 3,
})

def delivery_report(err, msg):
    if err:
        raise RuntimeError(f"Delivery failed: {err}")

event = {
    "order_id": "ord-12345",
    "customer_id": "cust-67890",
    "amount_cents": 4999,
    "placed_at_ms": 1710000000000,
}

producer.produce(
    topic="orders.placed.v1",
    key="ord-12345",           # partition key: same order_id always same partition
    value=avro_serializer(
        event,
        SerializationContext("orders.placed.v1", MessageField.VALUE)
    ),
    on_delivery=delivery_report,
)
producer.flush()

Kafka consumer with consumer group and manual offset commit

enable.auto.commit=False is non-negotiable for at-least-once delivery. Auto-commit can acknowledge messages before they have been processed successfully.

Python
from confluent_kafka import Consumer, KafkaException
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer
from confluent_kafka.serialization import SerializationContext, MessageField
import logging

sr_client = SchemaRegistryClient({"url": "https://sr.example.com"})
avro_deserializer = AvroDeserializer(sr_client)

consumer = Consumer({
    "bootstrap.servers": "kafka.example.com:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "API_KEY",
    "sasl.password": "API_SECRET",
    "group.id": "orders-ingestion-pipeline",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,   # manual commit: only after successful processing
    "max.poll.interval.ms": 300000,
})
consumer.subscribe(["orders.placed.v1"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            raise KafkaException(msg.error())

        event = avro_deserializer(
            msg.value(),
            SerializationContext(msg.topic(), MessageField.VALUE),
        )

        try:
            process_order(event)
            # Commit only after successful write to Bronze. Guarantees at-least-once delivery.
            consumer.commit(asynchronous=False)
        except Exception as exc:
            # Do NOT commit. Note that the in-memory position has already
            # advanced past this message, so redelivery happens when the group
            # restarts or rebalances and resumes from the last committed offset.
            logging.error("Processing failed, skipping commit", exc_info=exc)

except KeyboardInterrupt:
    pass
finally:
    consumer.close()

Pub/Sub publisher with message attributes

Message attributes carry routing and tracing metadata without polluting the payload schema. future.result() blocks until Pub/Sub acknowledges storage, not delivery.

Python
from google.cloud import pubsub_v1
import json

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-gcp-project", "orders.placed.v1")

def publish_order_placed(
    order_id: str,
    customer_id: str,
    amount_cents: int,
    placed_at_ms: int,
) -> str:
    payload = json.dumps({
        "order_id": order_id,
        "customer_id": customer_id,
        "amount_cents": amount_cents,
        "placed_at_ms": placed_at_ms,
    }).encode("utf-8")

    future = publisher.publish(
        topic_path,
        data=payload,
        # Message attributes used for filtering and tracing
        domain="orders",
        event_type="placed",
        schema_version="v1",
        correlation_id=order_id,
    )
    # .result() blocks until Pub/Sub acknowledges receipt
    message_id = future.result()
    return message_id

Pub/Sub pull subscriber with dead letter topic handling

The dead letter topic is configured on the subscription in Terraform, not in code. The subscriber only needs to distinguish permanent vs. transient failures when deciding whether to ack or nack.

Python
from google.cloud import pubsub_v1
import json
import logging

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(
    "my-gcp-project",
    "orders.placed.v1-ingestion-pipeline",  # subscription, not topic
)

def handle_message(message: pubsub_v1.subscriber.message.Message) -> None:
    try:
        payload = json.loads(message.data.decode("utf-8"))
        process_order(payload)

        # Ack removes the message from this subscription permanently
        message.ack()

    except json.JSONDecodeError as exc:
        # Permanent failure: bad payload. Ack to prevent infinite redelivery.
        # Route to dead letter via message attribute or separate pipeline.
        logging.error("Unparseable message, acking to DLT path", exc_info=exc)
        message.ack()

    except Exception as exc:
        # Transient failure: nack triggers redeliver after ack deadline.
        # After max_delivery_attempts the subscription routes to the dead letter topic.
        logging.warning("Transient failure, nacking for redelivery", exc_info=exc)
        message.nack()

streaming_pull_future = subscriber.subscribe(
    subscription_path,
    callback=handle_message,
)

with subscriber:
    try:
        streaming_pull_future.result()
    except Exception:
        streaming_pull_future.cancel()
        streaming_pull_future.result()

Advanced topics

Kafka Connect for CDC pipelines

Kafka Connect runs source connectors that pull from external systems and publish to Kafka topics. For CDC (Change Data Capture), the Debezium connector tails the database binlog (MySQL binary log, Postgres WAL, Oracle LogMiner) and publishes row-level changes as Kafka messages. Single Message Transforms (SMTs) let you rename fields, filter rows, or add metadata before the message reaches the topic, without writing a custom Kafka Streams job.
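For concreteness, here is a representative Debezium MySQL source connector definition as it would be POSTed to the Connect REST API. Hostnames, topic names, and the table list are placeholders; the unwrap transform is Debezium's SMT for flattening the CDC envelope down to the row image.

```json
{
  "name": "orders-mysql-cdc",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "legacy-db.example.com",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "${file:/secrets/db.properties:password}",
    "database.server.id": "184054",
    "topic.prefix": "legacy",
    "table.include.list": "sales.orders",
    "schema.history.internal.kafka.bootstrap.servers": "kafka.example.com:9092",
    "schema.history.internal.kafka.topic": "schema-history.legacy",
    "transforms": "unwrap",
    "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState"
  }
}
```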

Source connectors

Debezium MySQL, Postgres, Oracle, SQL Server. Each connector instance tracks its own binlog position.

SMT examples

MaskField (PII masking), ReplaceField (column rename), TimestampConverter (epoch to ISO 8601), ExtractField (flatten nested CDC envelope).

Gotchas

A connector restart loses its binlog position if offset.storage.topic is misconfigured. Always pin the Debezium version, because connector upgrades can change the CDC envelope format.

Schema registry comparison: Confluent SR vs Apicurio

Both support Avro, Protobuf, and JSON Schema. Both enforce compatibility modes (BACKWARD, FORWARD, FULL). The differences are operational: Confluent Schema Registry is a single binary with a REST API; it is simple to run and well-documented, but it is part of the Confluent Platform licensing model when used with Confluent Cloud. Apicurio Registry is an open-source alternative that also supports OpenAPI and AsyncAPI artifacts, making it a better fit if you need a central contract registry beyond just Kafka schemas. Use Confluent SR if you are already on Confluent Cloud, where it is bundled. Use Apicurio if you want a schema registry independent of Confluent licensing or if you need multi-format artifact storage.

Best practice
Regardless of which registry you choose, enforce BACKWARD_TRANSITIVE or FULL_TRANSITIVE compatibility for all topics that feed Bronze storage. BACKWARD allows old consumers to read new messages. TRANSITIVE means the rule applies to all historical versions, not just the previous one. This is the setting that prevents the "two versions ago" breakage that BACKWARD alone misses.
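Concretely, under BACKWARD compatibility the new schema must be able to read data written with the old one, so adding a field is legal only with a default (the new reader substitutes it when decoding old records). A compatible v2 of the OrderPlaced schema from the producer example below, with a hypothetical currency field:

```json
{
  "type": "record",
  "name": "OrderPlaced",
  "namespace": "com.example.orders",
  "fields": [
    {"name": "order_id",      "type": "string"},
    {"name": "customer_id",   "type": "string"},
    {"name": "amount_cents",  "type": "long"},
    {"name": "placed_at_ms",  "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "currency",      "type": "string", "default": "EUR"}
  ]
}
```

Registering the same v2 without the default would be rejected by a registry in BACKWARD mode.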

Kafka consumer group lag monitoring

Consumer lag is the difference between the latest produced offset and the last committed offset for a consumer group on a partition. It is the primary freshness signal for Kafka-backed pipelines: a lag of 50,000 messages at 10,000 msg/sec means you are 5 seconds behind. At 1,000 msg/sec on a low-throughput topic, the same lag means 50 seconds behind.

What to track per consumer group: max lag across all partitions, not the sum. The sum masks a single slow partition. Alert when max lag exceeds your SLO headroom.
Alert thresholds: warning when lag has grown for 5 consecutive minutes; critical when lag exceeds (expected_processing_time × 3). Both should fire even if absolute lag is small.
OpenTelemetry integration: use the Collector's kafkametrics receiver (or the JMX receiver pointed at the brokers) to collect broker and consumer-group metrics. Export kafka.consumer_group.lag as a gauge, tagged by consumer_group, topic, and partition.
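The max-versus-sum rule reduces to a one-liner once per-partition offsets are in hand (fetched with Consumer.get_watermark_offsets and Consumer.committed). A sketch with hypothetical numbers:

```python
def max_consumer_lag(offsets: dict[int, tuple[int, int]]) -> int:
    """offsets: partition -> (latest_produced_offset, last_committed_offset).
    Max, not sum: the sum hides a single stuck partition behind healthy ones."""
    return max(latest - committed for latest, committed in offsets.values())

# Partitions 0 and 1 are caught up; partition 2 is stuck 50,000 messages back.
lag = max_consumer_lag({
    0: (84_000, 83_990),
    1: (84_100, 84_100),
    2: (84_050, 34_050),
})
# lag == 50_000: five seconds behind at 10,000 msg/sec, fifty at 1,000 msg/sec
```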

Pub/Sub dead letter topics: configuration and behaviour

A dead letter topic (DLT) is configured at the subscription level, not the topic level. Each subscription can have a different DLT and a different max_delivery_attempts value (minimum 5, maximum 100). When a message is nacked or its ack deadline expires max_delivery_attempts times, Pub/Sub moves it to the DLT automatically, stamping CloudPubSubDeadLetterSource* message attributes that record the original subscription and the delivery attempt count.

The DLT is itself a Pub/Sub topic. Create a separate subscription on the DLT to inspect and replay failed messages. Pub/Sub does not automatically retry DLT messages: you own the replay logic. A common pattern is a Cloud Function or small Dataflow job subscribed to the DLT that logs the failure, sends an alert, and republishes to the original topic after human review or automated correction.
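Part of that review step can be automated. A sketch of the triage decision, reading the CloudPubSubDeadLetterSource* attributes Pub/Sub stamps on dead-lettered messages; the attempt threshold and the drain_dlt helper are assumptions for illustration, not library APIs.

```python
def should_auto_republish(attributes: dict[str, str], attempt_threshold: int = 25) -> bool:
    """Dead-lettered messages that failed only a few deliveries were likely
    victims of a transient outage and are safe to replay automatically;
    ones that burned a large attempt budget need human review."""
    delivered = int(attributes.get("CloudPubSubDeadLetterSourceDeliveryCount", "0"))
    return delivered < attempt_threshold

def drain_dlt(publisher, topic_path: str, dlt_messages) -> None:
    """Republish replayable messages to the original topic; leave the rest
    unacked on the DLT subscription for a human to inspect."""
    for msg in dlt_messages:
        if should_auto_republish(dict(msg.attributes)):
            publisher.publish(topic_path, data=msg.data)
            msg.ack()
```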

Risk
Setting max_delivery_attempts too low (5 is the minimum) causes transient downstream failures to land in the DLT before the upstream issue has time to recover. Set it to at least 20 for pipelines where the consumer depends on an external service that can have brief outages. Monitor the DLT message rate as an early health signal: a rising rate usually means a consumer is unhealthy before its health check catches it.

Anti-patterns

Kafka anti-patterns

  • !
    Using Kafka as a database. Setting retention to "infinite" and querying Kafka for current state via consumer seeks. Kafka is a log, not a queryable store. Current state belongs in a database or data warehouse. Use Kafka Streams or ksqlDB materialised views if you need stateful queries, but accept the operational cost.
  • !
    Too many small topics. Creating a separate topic for every microservice event type with 3 partitions each. 200 microservices × 5 event types × 3 partitions = 3,000 partitions on a 4-broker cluster. ZooKeeper (or KRaft) controller election time scales with partition count. Keep topics coarse-grained at the domain level. Use message type headers or schema fields to distinguish event types within a topic.
  • !
    No partition key discipline. Using null partition keys (round-robin assignment) for events that require ordering, then wondering why the order of events for the same entity is inconsistent. Establish and document the canonical partition key for each topic before the first message is published.
  • !
    No schema enforcement at produce time. Letting producers publish raw JSON without schema registration. A producer deploys a change, renames a field, and every downstream consumer crashes silently. The schema registry check at produce time is the only reliable gate.

Pub/Sub anti-patterns

  • !
    Assuming exactly-once delivery. Pub/Sub guarantees at-least-once. Under normal conditions most messages are delivered once, but during redelivery windows, restarts, or subscription resets, duplicates occur. Every consumer must be idempotent. Using a message ID or a business key as an upsert key in BigQuery or Spanner is the correct pattern.
  • !
    Ignoring ordering on ordered subscriptions. Enabling message ordering keys without understanding the performance trade-off: ordered delivery serialises processing per key. One slow ack for key "order-123" blocks all subsequent messages with the same key. For high-cardinality keys (order IDs) the impact is minimal. For low-cardinality keys (region codes) it can stall an entire subscription.
  • !
    Using Pub/Sub for high-fan-out CDC. Routing full-table CDC from a high-write-volume Oracle table through Pub/Sub to 10+ downstream subscriptions. At 5 TB/day ingress with 10 subscriptions, Pub/Sub delivery costs alone exceed $5,000/month. Kafka with 10 consumer groups on the same data costs roughly the same as 1 consumer group. CDC fan-out is a primary use case where Kafka's cost model wins.
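The idempotency requirement in the first anti-pattern above is a small amount of code when every write is an upsert keyed by a business key. A sketch with an in-memory dict standing in for a BigQuery MERGE or Spanner insert-or-update:

```python
class IdempotentOrderSink:
    """Upsert-by-business-key: replaying or redelivering the same event
    leaves the store unchanged. The dict is an in-memory stand-in for a
    real warehouse upsert."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}

    def process(self, event: dict) -> None:
        # order_id is the business key: a duplicate delivery overwrites the
        # existing row with identical data instead of creating a second row.
        self.rows[event["order_id"]] = event

sink = IdempotentOrderSink()
event = {"order_id": "ord-12345", "amount_cents": 4999}
sink.process(event)
sink.process(event)   # duplicate delivery: harmless
# len(sink.rows) == 1
```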

Key takeaways

  • 01 Use Pub/Sub as the default for GCP-native platforms with five or fewer consumer groups, moderate throughput, and no hard replay requirements. It costs nothing to operate and integrates directly with every GCP data service.
  • 02 Switch to Kafka when you need: exact per-partition replay for individual consumer groups, more than 5 independent consumers on high-volume topics, sub-20ms latency, multi-cloud portability, or full schema evolution enforcement. The operational cost is real, so price it in before committing.
  • 03 At moderate scale (3 consumers, ~25 TB/month), Kafka and Pub/Sub cost roughly the same. At high fan-out (10+ consumers), Kafka is substantially cheaper because consumer count does not change cluster cost. This is the single most important cost driver in the backbone decision.
  • 04 Schema enforcement is non-negotiable on either platform. Confluent Schema Registry enforces compatibility rules at produce time, which Pub/Sub Schemas does not. If you choose Pub/Sub and need strong schema governance, enforce it at the Dataflow ingestion step or via a schema validation Cloud Function on the topic.
  • 05 Both systems require idempotent consumers. Pub/Sub guarantees at-least-once delivery. Kafka with enable.auto.commit=True plus an application crash gives you the same problem. Design for duplicates regardless of which backbone you choose.

Failure modes

  • !
    Consumer lag grows silently. A Kafka consumer group falls behind because a single slow partition is not visible in average-lag dashboards. Max lag, not average lag, is the correct alert metric. By the time average lag is high, you are already hours behind.
  • !
    Schema registry outage stops all producers. The schema registry is a hard dependency in the produce path. If Confluent SR goes down, every producer that must register or fetch a schema at publish time fails. The serializers cache schema IDs after first use, so long-lived producers survive brief registry outages; cold-starting a producer during an outage does not.
  • !
    Pub/Sub message backlog causes cost spike. A downstream Dataflow job is redeployed and takes 30 minutes to start. During that window, 18 million messages accumulate. When processing resumes, the subscription delivers all 18 million in a burst. Pub/Sub charges for each delivery. Unexpected redelivery bursts after consumer downtime are the most common source of surprise Pub/Sub invoices.
  • !
    Partition rebalance during peak traffic. A Kafka consumer group rebalance (triggered by adding or removing a consumer) pauses all message processing for the group for the duration of the rebalance. On a busy topic this can cause 30–120 seconds of processing downtime. Use the cooperative sticky rebalance protocol (partition.assignment.strategy=CooperativeStickyAssignor) to reduce this window.
  • !
    DLT never monitored. Dead letter topics on Pub/Sub and DLQ topics on Kafka fill silently. Business-critical messages sit in the DLT for weeks because nobody has a subscription or alert on it. Every DLT and DLQ must have: a monitoring subscription, an alert on non-zero message count, and a named owner responsible for replay.

Checklist

  • Backbone choice documented in an ADR with explicit rationale and the rejected alternative.
  • Consumer fan-out count estimated for the next 12 months. If above 5, Kafka cost model evaluated.
  • Schema registry deployed and compatibility mode set to BACKWARD_TRANSITIVE or FULL_TRANSITIVE for all production topics.
  • All consumers implement idempotent processing with a business-key dedup strategy.
  • DLT/DLQ configured, monitored, and has a named owner with a documented replay runbook.
  • Kafka: enable.auto.commit set to False on all consumers. Pub/Sub: ack/nack logic distinguishes transient vs. permanent failures.
  • Kafka: consumer group lag alert configured on max lag per partition, not average. Pub/Sub: oldest_unacked_message_age alert configured per subscription.
  • Cost budget alert configured for the backbone service. For Pub/Sub: alert at 50% of monthly expected delivery cost. For Confluent: alert if CKU count autoscales above baseline.
  • Replay procedure documented and tested in a non-production environment before the first production incident.