Azure Service Bus provides reliable asynchronous messaging between decoupled services. Choosing between queues and topic subscriptions, configuring message lock duration and retry settings correctly, and handling dead-letter messages are the three operational areas that most directly determine whether a messaging integration holds up under production conditions.
This post covers when to use each messaging model, how to configure reliability settings, and how to build a dead-letter remediation workflow.
1. Queues vs Topic Subscriptions
Service Bus supports two primary messaging models:
| Feature | Queue | Topic + Subscription |
|---|---|---|
| Delivery pattern | Point-to-point — one consumer receives each message | Publish-subscribe — each subscription receives an independent copy |
| Multiple consumers | Competing consumers; each message handled by exactly one receiver | Independent subscribers; every subscriber gets every message |
| Filtering | Not available at the queue level | SQL-based or correlation filter rules per subscription |
| Best for | Task dispatch, job queues, ordered processing | Event fanout, notifications, multi-system integration |
Use a queue when you have a producer dispatching discrete work items to be processed exactly once — a resize job, an email send, or a payment authorisation. Use a topic with subscriptions when a single event must be delivered independently to multiple consumers — an order placed that triggers both fulfillment and analytics processing.
A common mistake is building a single queue shared by multiple consumer types and having them compete for messages they cannot process. Topic subscriptions with filter rules solve this cleanly: each consumer subscribes only to the messages relevant to it.
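The filter-rule approach above can be sketched with the `ServiceBusAdministrationClient` from the Azure.Messaging.ServiceBus SDK. This is a minimal sketch, not a complete setup: `connectionString` and the entity names (`order-events`, `fulfillment`, `analytics`) are hypothetical, and the second call assumes the `analytics` subscription already exists.

```csharp
using Azure.Messaging.ServiceBus.Administration;

// Hypothetical connection string and entity names for illustration.
var adminClient = new ServiceBusAdministrationClient(connectionString);

// Create a subscription whose rule matches only order-placed messages,
// filtering on the message's Subject (Label) system property.
await adminClient.CreateSubscriptionAsync(
    new CreateSubscriptionOptions("order-events", "fulfillment"),
    new CreateRuleOptions("OrdersOnly",
        new CorrelationRuleFilter { Subject = "OrderPlaced" }));

// An existing subscription can instead use a SQL filter
// against application properties set by the publisher.
await adminClient.CreateRuleAsync(
    "order-events", "analytics",
    new CreateRuleOptions("HighValue", new SqlRuleFilter("amount > 1000")));
```

Each consumer then receives only the messages its rules match, instead of competing for (and abandoning) messages it cannot process.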
2. Configuring Lock Duration and Max Delivery Count
Two settings have the most direct impact on message processing reliability.
Lock duration is how long a consumer has to process and complete a message before Service Bus makes it visible again for redelivery. The default is 60 seconds. Set this to a value that reflects a realistic 95th-percentile processing time, not the average — particularly for steps that involve downstream API calls or database writes.
Max delivery count is how many delivery attempts are made before a message is moved to the dead-letter queue. The default is 10. Set this based on how many transient failures your consumer may legitimately encounter before a message should be treated as unprocessable.
To configure these settings:
- Navigate to Service Bus namespace > Queues (or Topics > [topic] > Subscriptions)
- Select + Queue or open an existing queue
- Set Lock duration and Max delivery count in the queue properties
- Select Create or Save
Following is a recommended starting configuration for a queue processing calls to an external API:
| Setting | Recommended value | Reasoning |
|---|---|---|
| Lock duration | 5 minutes | Accounts for slow downstream API responses |
| Max delivery count | 5 | Enough retries for transient errors; not so many that a poison message loops indefinitely |
| Message time-to-live | 7 days | Prevents accumulation if consumers are offline briefly |
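The portal steps above can also be expressed in code, which is preferable when queues are provisioned as part of deployment. The following sketch applies the table's recommended values using `ServiceBusAdministrationClient`; the queue name `external-api-calls` and `connectionString` are assumptions for illustration.

```csharp
using Azure.Messaging.ServiceBus.Administration;

var adminClient = new ServiceBusAdministrationClient(connectionString);

// Recommended starting configuration from the table above.
var options = new CreateQueueOptions("external-api-calls") // hypothetical queue name
{
    LockDuration = TimeSpan.FromMinutes(5),          // tolerate slow downstream APIs
    MaxDeliveryCount = 5,                            // retries before dead-lettering
    DefaultMessageTimeToLive = TimeSpan.FromDays(7), // bound accumulation
    DeadLetteringOnMessageExpiration = true          // keep expired messages visible for triage
};

await adminClient.CreateQueueAsync(options);
```

Defining these values in code rather than the portal keeps them versioned and reviewable alongside the consumer that depends on them.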
3. Understanding the Dead-Letter Queue
Every Service Bus queue and topic subscription has an automatically associated dead-letter queue (DLQ). Messages are moved to the DLQ when:
- The message has been delivered `Max delivery count` times without being completed
- The consumer explicitly calls `DeadLetterAsync` (for messages that fail business validation)
- The message TTL expires and `EnableDeadLetteringOnMessageExpiration` is set to `true`
The DLQ is a sub-resource of the parent queue, accessible at the path <queue-name>/$DeadLetterQueue. It uses the same lock and session mechanics as the main queue but does not itself have a dead-letter queue — messages that accumulate there and are not processed simply remain until explicitly removed.
I recommend treating a growing DLQ as a production alert condition, not a background concern. A message in the DLQ represents a business operation that has not completed and requires investigation.
4. Processing Dead-Letter Messages
Each dead-lettered message includes system properties that identify why it was moved. Inspect these before deciding how to handle the message.
| Property | Example value |
|---|---|
| DeadLetterReason | MaxDeliveryCountExceeded |
| DeadLetterErrorDescription | Message lock expired |
Following is a minimal C# example that reads from the DLQ, logs the reason, and completes the message after inspection:
```csharp
using Azure.Messaging.ServiceBus;

await using var client = new ServiceBusClient(connectionString);
await using var receiver = client.CreateReceiver(
    queueName,
    new ServiceBusReceiverOptions { SubQueue = SubQueue.DeadLetter });

// ReceiveMessageAsync returns null if no message arrives within the timeout.
ServiceBusReceivedMessage dlqMessage = await receiver.ReceiveMessageAsync();
if (dlqMessage != null)
{
    Console.WriteLine($"Dead-letter reason: {dlqMessage.DeadLetterReason}");
    Console.WriteLine($"Description: {dlqMessage.DeadLetterErrorDescription}");
    Console.WriteLine($"Body: {dlqMessage.Body}");

    // After investigation: complete to discard, or re-enqueue to the main queue to retry
    await receiver.CompleteMessageAsync(dlqMessage);
}
```
In production, I recommend a dedicated DLQ processor that writes dead-lettered messages to a Log Analytics workspace or storage account for triage, rather than automatically re-enqueueing without understanding why the original processing failed. Automatic retry without root cause analysis typically results in the same message dead-lettering again.
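A dedicated DLQ processor of this kind can be sketched with `ServiceBusProcessor` bound to the dead-letter sub-queue. This is an illustrative outline only: `PersistForTriageAsync` is a placeholder for whatever durable sink you use (Log Analytics ingestion, a storage account), and `connectionString` and `queueName` are assumed to exist.

```csharp
using Azure.Messaging.ServiceBus;

await using var client = new ServiceBusClient(connectionString);

// Processor bound to the dead-letter sub-queue of the main queue.
await using var processor = client.CreateProcessor(
    queueName,
    new ServiceBusProcessorOptions
    {
        SubQueue = SubQueue.DeadLetter,
        AutoCompleteMessages = false // complete only after the message is captured
    });

processor.ProcessMessageAsync += async args =>
{
    // PersistForTriageAsync is a hypothetical helper that writes the
    // message details to durable storage for investigation.
    await PersistForTriageAsync(
        args.Message.MessageId,
        args.Message.DeadLetterReason,
        args.Message.Body.ToString());

    // Remove from the DLQ only once the record is safely persisted.
    await args.CompleteMessageAsync(args.Message);
};

processor.ProcessErrorAsync += args =>
{
    Console.WriteLine($"DLQ processor error: {args.Exception.Message}");
    return Task.CompletedTask;
};

await processor.StartProcessingAsync();
```

Keeping the processor read-and-persist only, rather than re-enqueueing, forces the root-cause step before any retry.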
5. Monitoring Service Bus with Azure Monitor
Azure Monitor surfaces several Service Bus metrics that should form the basis of operational alerting:
| Metric | Alert condition | Meaning |
|---|---|---|
| Dead-lettered messages | Count > 0 sustained for 5 minutes | Messages failing processing |
| Active messages | Count exceeds expected queue depth | Consumer falling behind producer |
| Throttled requests | Count > 0 | Namespace under load; consider upgrading to Premium tier |
| Server errors | Count > 0 | Infrastructure-level issue |
To configure a dead-letter alert:
- Navigate to Service Bus namespace > Alerts > + Create > Alert rule
- Under Condition, select Dead-lettered messages
- Set Aggregation type to Total and Threshold to 1
- Assign an Action group to notify the responsible team
- Select Review + create
Enable diagnostic settings on the namespace to send metrics and logs to Log Analytics for retention and post-incident analysis. The AzureServiceBusLogs category captures message-level activity including dead-letter events with timestamps and message IDs.
Summary
Service Bus reliability depends on choosing the right messaging model for your delivery requirements, configuring lock duration and max delivery count to reflect realistic processing characteristics, and treating dead-letter queues as active operational signals. A queue with appropriate settings and alerting on dead-letter growth is substantially more resilient than one with default configuration and no observability. The time invested in getting these settings right at design time avoids significantly harder debugging when messages begin failing silently in production.