The streaming backbone determines how you handle schema evolution, consumer lag, replay, fan-out, and operational overhead. Everything downstream adapts to it: your transformation pipelines, dead letter queues, observability tooling, and cost model all behave differently depending on which one you choose. Getting this wrong is expensive to undo.
The choice reduces to this: Kafka (Apache Kafka, Confluent Cloud, Amazon MSK) gives you a distributed, partitioned, retention-based log with full replay control and a mature schema registry. You pay for it in operational complexity or in Confluent licensing. Pub/Sub (Google Cloud Pub/Sub) is a fully managed subscription-based bus with no brokers to operate, first-class GCP integration, and a pricing model that punishes high consumer fan-out at scale.
The recommendation in this blueprint: use Pub/Sub as the default for GCP-native platforms with fewer than five consumer groups and moderate throughput. Switch to Kafka when you need exact replay semantics, more than five independent consumer groups reading the same topic, sub-20ms latency, or multi-cloud portability. The rest of this page explains why.
How Kafka works internally
Kafka stores messages as an ordered, partitioned, immutable log. Each topic is divided into partitions (typically 3 to 100+), each partition maintained on one broker (leader) with replicas on others. Producers route messages to partitions via a partition key: the same order ID always lands on the same partition, guaranteeing ordering per key. Each consumer in a group owns a set of partitions and tracks its own offset (position in the log). Consumer groups are independent: Group A at offset 84,000 and Group B at offset 12 are reading the same partition simultaneously.
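The per-key ordering guarantee falls out of deterministic key hashing. A minimal sketch of the idea (Kafka's default partitioner actually hashes the key bytes with murmur2; CRC32 here is an illustrative stand-in, not Kafka's algorithm):

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in for Kafka's default partitioner (which uses murmur2):
    # any deterministic hash gives the property that matters here.
    return zlib.crc32(key) % num_partitions

# The same order ID always lands on the same partition, so all events
# for that order are appended to one log and consumed in order.
assert partition_for(b"ord-12345", 12) == partition_for(b"ord-12345", 12)

# Caveat: changing the partition count remaps keys, so historical events
# for a key no longer share a partition with new ones. Size partition
# counts up front.
```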
Replay is a first-class operation. Seek any consumer group to any offset or timestamp within the retention window (typically 7 days). Replaying the Bronze ingestion pipeline after a bad schema deployment takes two commands and does not affect any other consumer group. The Schema Registry (Confluent SR or Apicurio) enforces Avro or Protobuf compatibility at publish time: breaking changes are rejected before they reach the topic.
How Pub/Sub works internally
Pub/Sub stores messages in a topic and delivers a copy to each subscription independently. There are no partitions or offsets: the service manages delivery internally. Pull subscribers call the API to receive messages; push subscriptions deliver to an HTTP endpoint. Every message has an acknowledgement (ack) deadline. If the subscriber does not ack within the window (default 10s, configurable to 600s), Pub/Sub redelivers. At-least-once delivery is guaranteed; exactly-once requires idempotent consumers.
After max_delivery_attempts failed attempts (configurable per subscription), Pub/Sub routes the message to a dead letter topic (DLT) automatically. Replay via Pub/Sub Seek lets you rewind all subscriptions on a topic to a snapshot or timestamp, but it cannot replay one subscription independently of others. Pub/Sub Schemas provide basic Avro or Protobuf validation at publish time, but do not enforce compatibility evolution rules the way Confluent Schema Registry does.
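The retry-then-dead-letter contract can be modelled in a few lines. This is a toy simulation of the subscription-side policy, not the client API; real Pub/Sub tracks delivery attempts server-side:

```python
def deliver_with_dead_letter(message, handler, dead_letter, max_delivery_attempts=5):
    """Toy model of a subscription's retry + dead-letter policy.
    Each failed handling counts as one delivery attempt (a nack or an
    expired ack deadline); after max_delivery_attempts the message is
    routed to the dead letter topic instead of being redelivered."""
    for attempt in range(1, max_delivery_attempts + 1):
        try:
            handler(message)
            return attempt   # acked on this attempt
        except Exception:
            continue         # nack -> redelivery
    dead_letter.append(message)
    return None              # never acked; now sitting on the DLT

dlt = []
assert deliver_with_dead_letter("ok", lambda m: None, dlt) == 1
assert deliver_with_dead_letter("bad", lambda m: 1 / 0, dlt) is None
assert dlt == ["bad"]
```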
Platform fit: Kafka as the backbone
With Kafka, CDC from legacy databases uses Kafka Connect with Debezium source connectors. All producers register schemas in Confluent Schema Registry before publishing. Dataflow connects via the Kafka I/O connector. Consumer lag per partition per consumer group is the primary freshness signal, monitored via JMX or the Kafka AdminClient API. Schema evolution is enforced at the registry level: a producer cannot break a schema without incrementing the subject version.
Platform fit: Pub/Sub as the backbone
With Pub/Sub, CDC from legacy databases goes through Google Datastream, which replicates directly into BigQuery or GCS without a Debezium cluster to manage. Domain event producers publish to Pub/Sub topics; Dataflow reads from subscriptions via the native PubSubIO connector. Freshness monitoring uses the Cloud Monitoring oldest_unacked_message_age metric per subscription. Access control is handled by IAM bindings at the subscription level, not by client-side ACL configuration.
Decision matrix
Opinionated comparison based on production use. "Tie" is used only where it is genuinely the honest answer.
| Concern | Kafka | Pub/Sub | Edge goes to |
|---|---|---|---|
| Peak throughput | Unlimited via horizontal partition scaling. 1M+ msg/sec achievable. | High but bounded. Default quota ~1 GB/s per region; requires quota increase for higher. | Kafka |
| End-to-end latency | 5–20ms p99 with low-latency config (acks=1, linger.ms=0). Tunable vs. throughput. | 30–100ms typical. Not tunable. Sufficient for most data pipeline workloads. | Kafka |
| Replay | Full: seek any consumer group to any offset or timestamp. Partitions replay independently. | Seek to snapshot or timestamp rewinds all subscriptions together. No per-subscription replay. | Kafka |
| Consumer fan-out | Unlimited consumer groups, each with independent offsets. 10 groups costs the same as 1. | Each subscription is billed separately per GB delivered. 10 subscriptions = 10x delivery cost. | Kafka |
| Ordering | Strict per-partition order. Same partition key = guaranteed ordering. | Unordered by default. Message ordering keys enable per-key ordering within a region. | Tie |
| Operational complexity | High self-managed; moderate on Confluent Cloud/MSK. Partition rebalancing, JVM tuning, quota management. | Zero. No brokers, no rebalancing, no JVM. GCP manages everything. | Pub/Sub |
| GCP integration | Requires Kafka Connect or custom Dataflow connector. Not native GCP. IAM via Confluent RBAC. | First-class: native Dataflow I/O, Cloud Monitoring, BigQuery subscription, IAM bindings. | Pub/Sub |
| Cost at low volume | Dedicated clusters are expensive even idle. Serverless Kafka (Confluent) is cheaper but has limits. | Pay per message delivered. Free tier covers first 10 GB/month. | Pub/Sub |
| Cost at high fan-out | Flat: adding consumer groups does not change Kafka cluster cost. | Multiplies per subscription. 10 subscriptions at 10 TB/month = 100 TB delivery charges. | Kafka |
| Schema registry | Confluent Schema Registry: full backward/forward/full compatibility enforcement, schema IDs in wire format. | Pub/Sub Schemas: Avro/Protobuf validation at publish. No compatibility evolution rules. | Kafka |
| Multi-region | Confluent multi-region replication. Self-managed: MirrorMaker2. More control, more setup. | Global by default. No configuration needed for cross-region delivery. | Pub/Sub |
Cost comparison
Scenario: ~25.9 TB/month ingress (roughly 10,000 msg/sec at 1 KB per message), 7-day retention at replication factor 3 (~18.1 TB stored), three consumer groups, single region.
| Line item | Confluent Cloud Dedicated | GCP Pub/Sub |
|---|---|---|
| Cluster / base | 4 CKU × $0.44/CKU/hr × 730 hr = ~$1,285/mo | No cluster charge. $0 |
| Storage (7-day, rf=3) | 18,150 GB × $0.10/GB-mo = ~$1,815/mo | Pub/Sub does not charge separately for retention. Included in delivery cost. ~$0 |
| Publish / ingress | Included in cluster cost above. $0 additional | 25.9 TB: first 10 TB at $0.040/GB + 15.9 TB at $0.035/GB = ~$957/mo |
| Deliver (3 consumers) | Same-region egress: ~$0.01/GB × 77.7 TB = ~$777/mo (cross-cloud adds ~$0.09/GB) | 3 subscriptions × 25.9 TB × $0.037/GB avg = ~$2,875/mo |
| Schema Registry | Included in Dedicated. $0 | Pub/Sub Schemas included. $0 |
| Monthly total (3 consumers) | ~$3,877/mo | ~$3,832/mo |
| Monthly total (10 consumers) | ~$5,690/mo (+$1,813 egress vs. 3 consumers) | ~$10,540/mo (delivery scales linearly with subscriptions) |
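The fan-out divergence in the table is easy to model. A sketch using the table's illustrative rates ($0.01/GB Kafka same-region egress, ~$0.037/GB blended Pub/Sub delivery; these are the scenario's assumptions, not current list prices):

```python
KAFKA_BASE = 1285 + 1815        # cluster + storage, $/month (from the table)
KAFKA_EGRESS_PER_GB = 0.01      # same-region egress
PUBSUB_PUBLISH = 957            # $/month publish charge at ~25.9 TB ingress
PUBSUB_DELIVER_PER_GB = 0.037   # blended delivery rate

INGRESS_GB = 25_900             # ~25.9 TB/month

def kafka_monthly(consumers: int) -> float:
    # Cluster cost is flat; only egress scales with consumer count.
    return KAFKA_BASE + consumers * INGRESS_GB * KAFKA_EGRESS_PER_GB

def pubsub_monthly(subscriptions: int) -> float:
    # Every subscription is billed for its own copy of the stream.
    return PUBSUB_PUBLISH + subscriptions * INGRESS_GB * PUBSUB_DELIVER_PER_GB

# Rough parity at 3 consumers; Pub/Sub is nearly 2x Kafka by 10.
assert abs(kafka_monthly(3) - pubsub_monthly(3)) < 200
assert pubsub_monthly(10) > 1.8 * kafka_monthly(10)
```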
Code: production patterns
Kafka producer with Avro and Confluent Schema Registry
Partition key is the order_id. Same order always routes to same partition. acks=all means the leader waits for all in-sync replicas before acknowledging.
```python
from confluent_kafka import Producer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

ORDER_SCHEMA = """
{
  "type": "record",
  "name": "OrderPlaced",
  "namespace": "com.example.orders",
  "fields": [
    {"name": "order_id", "type": "string"},
    {"name": "customer_id", "type": "string"},
    {"name": "amount_cents", "type": "long"},
    {"name": "placed_at_ms", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
"""

sr_client = SchemaRegistryClient({"url": "https://sr.example.com"})
avro_serializer = AvroSerializer(sr_client, ORDER_SCHEMA)

producer = Producer({
    "bootstrap.servers": "kafka.example.com:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "API_KEY",
    "sasl.password": "API_SECRET",
    # Reliability: wait for leader + all in-sync replicas
    "acks": "all",
    "retries": 3,
})

def delivery_report(err, msg):
    if err:
        raise RuntimeError(f"Delivery failed: {err}")

event = {
    "order_id": "ord-12345",
    "customer_id": "cust-67890",
    "amount_cents": 4999,
    "placed_at_ms": 1710000000000,
}

producer.produce(
    topic="orders.placed.v1",
    key="ord-12345",  # partition key: same order_id always same partition
    value=avro_serializer(
        event,
        SerializationContext("orders.placed.v1", MessageField.VALUE)
    ),
    on_delivery=delivery_report,
)
producer.flush()
```

Kafka consumer with consumer group and manual offset commit
enable.auto.commit=False is non-negotiable for at-least-once delivery. Auto-commit can acknowledge messages before they have been processed successfully.
```python
from confluent_kafka import Consumer, KafkaException
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer
from confluent_kafka.serialization import SerializationContext, MessageField
import logging

sr_client = SchemaRegistryClient({"url": "https://sr.example.com"})
avro_deserializer = AvroDeserializer(sr_client)

consumer = Consumer({
    "bootstrap.servers": "kafka.example.com:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "API_KEY",
    "sasl.password": "API_SECRET",
    "group.id": "orders-ingestion-pipeline",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,  # manual commit: only after successful processing
    "max.poll.interval.ms": 300000,
})
consumer.subscribe(["orders.placed.v1"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            raise KafkaException(msg.error())
        event = avro_deserializer(
            msg.value(),
            SerializationContext(msg.topic(), MessageField.VALUE),
        )
        try:
            # process_order: application-specific write to the Bronze layer.
            process_order(event)
            # Commit only after successful write to Bronze. Guarantees at-least-once delivery.
            consumer.commit(asynchronous=False)
        except Exception as exc:
            # Do NOT commit. The group resumes from the last committed offset after
            # a restart or rebalance, so this message is redelivered then. To retry
            # in place instead, seek this partition back to msg.offset() before polling.
            logging.error("Processing failed, skipping commit", exc_info=exc)
except KeyboardInterrupt:
    pass
finally:
    consumer.close()
```

Pub/Sub publisher with message attributes
Message attributes carry routing and tracing metadata without polluting the payload schema. future.result() blocks until Pub/Sub acknowledges storage, not delivery.
```python
from google.cloud import pubsub_v1
import json

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-gcp-project", "orders.placed.v1")

def publish_order_placed(
    order_id: str,
    customer_id: str,
    amount_cents: int,
    placed_at_ms: int,
) -> str:
    payload = json.dumps({
        "order_id": order_id,
        "customer_id": customer_id,
        "amount_cents": amount_cents,
        "placed_at_ms": placed_at_ms,
    }).encode("utf-8")
    future = publisher.publish(
        topic_path,
        data=payload,
        # Message attributes used for filtering and tracing
        domain="orders",
        event_type="placed",
        schema_version="v1",
        correlation_id=order_id,
    )
    # .result() blocks until Pub/Sub acknowledges receipt
    message_id = future.result()
    return message_id
```

Pub/Sub pull subscriber with dead letter topic handling
The dead letter topic is configured on the subscription in Terraform, not in code. The subscriber only needs to distinguish permanent vs. transient failures when deciding whether to ack or nack.
```python
from google.cloud import pubsub_v1
import json
import logging

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(
    "my-gcp-project",
    "orders.placed.v1-ingestion-pipeline",  # subscription, not topic
)

def handle_message(message: pubsub_v1.subscriber.message.Message) -> None:
    try:
        payload = json.loads(message.data.decode("utf-8"))
        process_order(payload)
        # Ack removes the message from this subscription permanently
        message.ack()
    except json.JSONDecodeError as exc:
        # Permanent failure: bad payload. Retrying can never succeed, so ack to
        # stop redelivery. Ack drops the message from this subscription, so
        # republish it to an error topic first if it must be preserved.
        logging.error("Unparseable message, acking after logging", exc_info=exc)
        message.ack()
    except Exception as exc:
        # Transient failure: nack triggers redelivery after the ack deadline.
        # After max_delivery_attempts the subscription routes to the dead letter topic.
        logging.warning("Transient failure, nacking for redelivery", exc_info=exc)
        message.nack()

streaming_pull_future = subscriber.subscribe(
    subscription_path,
    callback=handle_message,
)

with subscriber:
    try:
        streaming_pull_future.result()
    except Exception:
        streaming_pull_future.cancel()
        streaming_pull_future.result()
```

Advanced topics
Kafka Connect for CDC pipelines
Kafka Connect runs source connectors that pull from external systems and publish to Kafka topics. For CDC (Change Data Capture), the Debezium connector tails the database binlog (MySQL binary log, Postgres WAL, Oracle LogMiner) and publishes row-level changes as Kafka messages. Single Message Transforms (SMTs) let you rename fields, filter rows, or add metadata before the message reaches the topic, without writing a custom Kafka Streams job.
Source connectors
Debezium MySQL, Postgres, Oracle, SQL Server. Each connector instance tracks its own binlog position.
SMT examples
MaskField (PII masking), ReplaceField (column rename), TimestampConverter (epoch to ISO 8601), ExtractField (flatten nested CDC envelope).
Gotchas
Connector restart loses binlog position if offsets.storage.topic is misconfigured. Always pin the Debezium version because connector upgrades can change the CDC envelope format.
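Connectors are registered by POSTing JSON to the Kafka Connect REST API. A hedged sketch (the endpoint, database coordinates, and table list are placeholders; the connector class and the ExtractNewRecordState SMT are real Debezium identifiers):

```python
import json
import urllib.request

# Hypothetical Connect endpoint and database coordinates.
connector = {
    "name": "orders-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "db.example.com",
        "database.dbname": "orders",
        "table.include.list": "public.orders",
        # SMT: flatten the Debezium CDC envelope down to the row state.
        "transforms": "unwrap",
        "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
    },
}

request = urllib.request.Request(
    "http://connect.example.com:8083/connectors",
    data=json.dumps(connector).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request)  # not executed here: requires a live Connect cluster
```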
Schema registry comparison: Confluent SR vs Apicurio
Both support Avro, Protobuf, and JSON Schema. Both enforce compatibility modes (BACKWARD, FORWARD, FULL). The differences are operational: Confluent Schema Registry is a single binary with a REST API; it is simple to run and well-documented, but it is part of the Confluent Platform licensing model when used with Confluent Cloud. Apicurio Registry is an open-source alternative that also supports OpenAPI and AsyncAPI artifacts, making it a better fit if you need a central contract registry beyond just Kafka schemas. Use Confluent SR if you are already on Confluent Cloud, where it is bundled. Use Apicurio if you want a schema registry independent of Confluent licensing or if you need multi-format artifact storage.
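The compatibility modes both registries enforce reduce to reader/writer schema resolution. A toy BACKWARD check for flat Avro records, to make the rule concrete (real registries also handle type promotion, unions, and transitive modes):

```python
def backward_compatible(old_fields: list, new_fields: list) -> bool:
    """BACKWARD: the new (reader) schema must be able to read data written
    with the old schema. A field added without a default breaks this,
    because old writers never produced it; removing a field is safe
    for the reader."""
    old_names = {f["name"] for f in old_fields}
    return all(
        f["name"] in old_names or "default" in f
        for f in new_fields
    )

v1 = [{"name": "order_id", "type": "string"}]
v2_ok = v1 + [{"name": "channel", "type": "string", "default": "web"}]
v2_bad = v1 + [{"name": "channel", "type": "string"}]  # no default: rejected

assert backward_compatible(v1, v2_ok)
assert not backward_compatible(v1, v2_bad)
```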
Kafka consumer group lag monitoring
Consumer lag is the difference between the latest produced offset and the last committed offset for a consumer group on a partition. It is the primary freshness signal for Kafka-backed pipelines: a lag of 50,000 messages at 10,000 msg/sec means you are 5 seconds behind. At 1,000 msg/sec on a low-throughput topic, the same lag means 50 seconds behind.
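Lag is only meaningful relative to throughput, so convert it to seconds-behind before alerting. The arithmetic from the paragraph above:

```python
def seconds_behind(lag_messages: int, msgs_per_sec: float) -> float:
    # Freshness = backlog depth divided by the rate at which it drains.
    return lag_messages / msgs_per_sec

assert seconds_behind(50_000, 10_000) == 5.0   # busy topic: 5 seconds behind
assert seconds_behind(50_000, 1_000) == 50.0   # quiet topic: same lag, 50 seconds
```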
Pub/Sub dead letter topics: configuration and behaviour
A dead letter topic (DLT) is configured at the subscription level, not the topic level. Each subscription can have a different DLT and a different max_delivery_attempts value (minimum 5, maximum 100). When a message is nacked or the ack deadline expires max_delivery_attempts times, Pub/Sub moves it to the DLT automatically, attaching attributes (CloudPubSubDeadLetterSourceSubscription, CloudPubSubDeadLetterSourceDeliveryCount) that identify the original subscription and the delivery attempt count.
The DLT is itself a Pub/Sub topic. Create a separate subscription on the DLT to inspect and replay failed messages. Pub/Sub does not automatically retry DLT messages: you own the replay logic. A common pattern is a Cloud Function or small Dataflow job subscribed to the DLT that logs the failure, sends an alert, and republishes to the original topic after human review or automated correction.
Anti-patterns
Kafka anti-patterns
- ! Using Kafka as a database. Setting retention to "infinite" and querying Kafka for current state via consumer seeks. Kafka is a log, not a queryable store. Current state belongs in a database or data warehouse. Use Kafka Streams or ksqlDB materialised views if you need stateful queries, but accept the operational cost.
- ! Too many small topics. Creating a separate topic for every microservice event type with 3 partitions each. 200 microservices × 5 event types × 3 partitions = 3,000 partitions on a 4-broker cluster. ZooKeeper (or KRaft) controller election time scales with partition count. Keep topics coarse-grained at the domain level. Use message type headers or schema fields to distinguish event types within a topic.
- ! No partition key discipline. Using null partition keys (round-robin assignment) for events that require ordering, then wondering why the order of events for the same entity is inconsistent. Establish and document the canonical partition key for each topic before the first message is published.
- ! No schema enforcement at produce time. Letting producers publish raw JSON without schema registration. A producer deploys a change, renames a field, and every downstream consumer crashes silently. The schema registry check at produce time is the only reliable gate.
Pub/Sub anti-patterns
- ! Assuming exactly-once delivery. Pub/Sub guarantees at-least-once. Under normal conditions most messages are delivered once, but during redelivery windows, restarts, or subscription resets, duplicates occur. Every consumer must be idempotent. Using a message ID or a business key as an upsert key in BigQuery or Spanner is the correct pattern.
- ! Ignoring ordering on ordered subscriptions. Enabling message ordering keys without understanding the performance trade-off: ordered delivery serialises processing per key. One slow ack for key "order-123" blocks all subsequent messages with the same key. For high-cardinality keys (order IDs) the impact is minimal. For low-cardinality keys (region codes) it can stall an entire subscription.
- ! Using Pub/Sub for high-fan-out CDC. Routing full-table CDC from a high-write-volume Oracle table through Pub/Sub to 10+ downstream subscriptions. At 5 TB/day ingress with 10 subscriptions, Pub/Sub delivery charges alone approach $50,000/month (~1,500 TB delivered at ~$0.035/GB). Kafka with 10 consumer groups on the same data costs roughly the same as 1 consumer group. CDC fan-out is a primary use case where Kafka's cost model wins.
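The idempotent-consumer requirement from the first anti-pattern can be sketched as a business-key upsert. In production the dedup state lives in the sink (a BigQuery MERGE or Spanner upsert keyed on order_id); a dict stands in for it here:

```python
def apply_event(store: dict, event: dict) -> None:
    # Upsert keyed on the business key: redelivering or replaying the same
    # event overwrites the same row instead of creating a duplicate.
    store[event["order_id"]] = event

store = {}
event = {"order_id": "ord-12345", "amount_cents": 4999}
apply_event(store, event)
apply_event(store, event)        # duplicate delivery
apply_event(store, dict(event))  # duplicate from a replay
assert len(store) == 1           # at-least-once delivery, exactly-once effect
```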
Key takeaways
- 01 Use Pub/Sub as the default for GCP-native platforms with fewer than five consumer groups, moderate throughput, and no hard replay requirements. It costs nothing to operate and integrates directly with every GCP data service.
- 02 Switch to Kafka when you need: exact per-partition replay for individual consumer groups, more than 5 independent consumers on high-volume topics, sub-20ms latency, multi-cloud portability, or full schema evolution enforcement. The operational cost is real, so price it in before committing.
- 03 At moderate scale (3 consumers, ~25 TB/month), Kafka and Pub/Sub cost roughly the same. At high fan-out (10+ consumers), Kafka is substantially cheaper because consumer count does not change cluster cost. This is the single most important cost driver in the backbone decision.
- 04 Schema enforcement is non-negotiable on either platform. Confluent Schema Registry enforces compatibility rules at produce time, which Pub/Sub Schemas does not. If you choose Pub/Sub and need strong schema governance, enforce it at the Dataflow ingestion step or via a schema validation Cloud Function on the topic.
- 05 Both systems require idempotent consumers. Pub/Sub guarantees at-least-once delivery. Kafka with enable.auto.commit=True plus an application crash gives you the same problem. Design for duplicates regardless of which backbone you choose.
Failure modes
- ! Consumer lag grows silently. A Kafka consumer group falls behind because a single slow partition is not visible in average-lag dashboards. Max lag, not average lag, is the correct alert metric. By the time average lag is high, you are already hours behind.
- ! Schema registry outage stops all producers. The schema registry is a hard dependency in the produce path: any producer that must register or fetch a schema at publish time fails if Confluent SR is down. The Avro serializer caches schema IDs after the first successful lookup, so long-running producers usually survive brief outages; freshly started producers and first-time schema registrations do not. Treat the registry as tier-one infrastructure with its own availability target.
- ! Pub/Sub message backlog causes cost spike. A downstream Dataflow job is redeployed and takes 30 minutes to start. During that window, 18 million messages accumulate. When processing resumes, the subscription delivers all 18 million in a burst. Pub/Sub charges for each delivery. Unexpected redelivery bursts after consumer downtime are the most common source of surprise Pub/Sub invoices.
- ! Partition rebalance during peak traffic. A Kafka consumer group rebalance (triggered by adding or removing a consumer) pauses all message processing for the group for the duration of the rebalance. On a busy topic this can cause 30–120 seconds of processing downtime. Use the cooperative sticky rebalance protocol (partition.assignment.strategy=CooperativeStickyAssignor) to reduce this window.
- ! DLT never monitored. Dead letter topics on Pub/Sub and DLQ topics on Kafka fill silently. Business-critical messages sit in the DLT for weeks because nobody has a subscription or alert on it. Every DLT and DLQ must have: a monitoring subscription, an alert on non-zero message count, and a named owner responsible for replay.
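The "max lag, not average lag" rule from the first failure mode, as a check:

```python
def lag_alert(partition_lags: list, threshold: int) -> bool:
    # One stuck partition hides behind a healthy average; alert on the worst.
    return max(partition_lags) > threshold

healthy = [100, 120, 90]
one_stuck = [100, 120, 500_000]  # average ~167k, but one partition is hours behind

assert not lag_alert(healthy, threshold=10_000)
assert lag_alert(one_stuck, threshold=10_000)
```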
Checklist
- □ Backbone choice documented in an ADR with explicit rationale and the rejected alternative.
- □ Consumer fan-out count estimated for the next 12 months. If above 5, Kafka cost model evaluated.
- □ Schema registry deployed and compatibility mode set to BACKWARD_TRANSITIVE or FULL_TRANSITIVE for all production topics.
- □ All consumers implement idempotent processing with a business-key dedup strategy.
- □ DLT/DLQ configured, monitored, and has a named owner with a documented replay runbook.
- □ Kafka: enable.auto.commit set to False on all consumers. Pub/Sub: ack/nack logic distinguishes transient vs. permanent failures.
- □ Kafka: consumer group lag alert configured on max lag per partition, not average. Pub/Sub: oldest_unacked_message_age alert configured per subscription.
- □ Cost budget alert configured for the backbone service. For Pub/Sub: alert at 50% of monthly expected delivery cost. For Confluent: alert if CKU count autoscales above baseline.
- □ Replay procedure documented and tested in a non-production environment before the first production incident.