
Amazon SQS dead-letter queue with CDK and its problem

Updated: May 7

Amazon Simple Queue Service (Amazon SQS) is a distributed message queuing service introduced by Amazon. Amazon SQS offers a secure, durable, and available hosted queue that lets you integrate and decouple distributed software systems and components. It also offers common constructs such as dead-letter queues and cost allocation tags. Besides that, it provides a generic web services API that you can access using any programming language that the AWS SDK supports.

What is a dead-letter queue?

A dead-letter queue (DLQ) is a special type of message queue that temporarily stores messages that a software system cannot process due to errors. Message queues are software components that support asynchronous communication in a distributed system. They let you send messages between software services at any volume and don’t require the message receiver to always be available. A dead-letter queue specifically stores erroneous messages that have no destination or which can’t be processed by the intended receiver.

What are the benefits of a dead-letter queue?

Dead-letter queues (DLQs) bring two main benefits.

  1. Reduced communication costs: Regular or standard message queues keep processing messages until the retention period expires. This helps ensure continuous message processing and minimizes the chances of your queue being blocked. However, if your system processes thousands of messages, a large number of error messages will increase communication overhead costs and burden the communication system. Instead of trying to process failing messages until they expire, it’s better to move them to a dead-letter queue after a few processing attempts.

  2. Improved troubleshooting: Moving erroneous messages to the DLQ lets your developers focus on identifying the causes of the errors. They can investigate why the receiver couldn't process the messages, apply the fixes, and try to deliver the messages again. For example, a banking application might send thousands of credit card applications daily to its backend system for approval. The backend system receives the applications but cannot process all of them because some contain incomplete information. Instead of making endless attempts, the software moves those messages to the DLQ until the IT team resolves the problem. This allows the system to process and deliver the remaining messages without performance issues.
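To make the mechanism concrete, here is a minimal in-memory sketch of a queue that gives up on a message after a few failed attempts. It is not the SQS API: `InMemoryQueue`, `Message`, and the handler are hypothetical names made up for illustration, and the model is simplified (real SQS moves a message once its receive count exceeds maxReceiveCount, and relies on the visibility timeout rather than an exception).

```typescript
// Hypothetical in-memory model of a queue with an optional dead-letter queue.
interface Message {
  id: string;
  body: string;
  receiveCount: number;
}

class InMemoryQueue {
  readonly messages: Message[] = [];

  constructor(
    readonly name: string,
    // Loosely mirrors the deadLetterQueue + maxReceiveCount settings of a real queue.
    private readonly deadLetter?: { queue: InMemoryQueue; maxReceiveCount: number },
  ) {}

  send(id: string, body: string): void {
    this.messages.push({ id, body, receiveCount: 0 });
  }

  // Hands the next message to `handler`. A successful call deletes the message;
  // a failed call puts it back, or moves it to the DLQ once the limit is reached.
  receive(handler: (body: string) => void): void {
    const message = this.messages.shift();
    if (message === undefined) return;
    message.receiveCount += 1;
    try {
      handler(message.body);
    } catch {
      const dl = this.deadLetter;
      if (dl !== undefined && message.receiveCount >= dl.maxReceiveCount) {
        dl.queue.messages.push(message);
      } else {
        this.messages.push(message);
      }
    }
  }
}

const demoDlq = new InMemoryQueue('sqs-dlq');
const demoMain = new InMemoryQueue('sqs', { queue: demoDlq, maxReceiveCount: 3 });
demoMain.send('1', 'incomplete credit card application');

const alwaysFail = (): void => {
  throw new Error('cannot process');
};
for (let attempt = 0; attempt < 3; attempt++) {
  demoMain.receive(alwaysFail);
}

console.log(demoMain.messages.length, demoDlq.messages.length); // 0 1
```

After three failed attempts the message sits in the dead-letter queue, and the main queue is free to handle whatever comes next.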

Implementing SQS with dead-letter queue in CDK

Construct SQS

import { Duration } from 'aws-cdk-lib';
import { Queue, QueueProps } from 'aws-cdk-lib/aws-sqs';
import { Construct } from 'constructs';

export class Sqs extends Construct {
  // Settings shared by every queue created in this construct.
  private static readonly commonSqsQueueConfig: QueueProps = {
    visibilityTimeout: Duration.seconds(300),
    receiveMessageWaitTime: Duration.seconds(20),
    retentionPeriod: Duration.days(14),
  };

  readonly sqsQueue: Queue;

  constructor(
    readonly scope: Construct,
    readonly id: string,
  ) {
    super(scope, id);

    this.sqsQueue = new Queue(this, 'sqs-queue', {
      ...Sqs.commonSqsQueueConfig,
      queueName: `sqs`,
    });
  }
}

Construct its dead-letter queue

export class Sqs extends Construct {
  // Number of receives allowed before a message is moved to the DLQ.
  private static readonly MAX_RETRY_ATTEMPT = 3;
  private static readonly commonSqsQueueConfig: QueueProps = {
    visibilityTimeout: Duration.seconds(300),
    receiveMessageWaitTime: Duration.seconds(20),
    retentionPeriod: Duration.days(14),
  };

  readonly sqsQueue: Queue;
  readonly deadLetterQueue: Queue;

  constructor(
    readonly scope: Construct,
    readonly id: string,
    // EnvironmentVariables is a project-specific type that is not shown here.
    readonly env: EnvironmentVariables,
  ) {
    super(scope, id);

    // Create the DLQ first so the main queue can reference it.
    this.deadLetterQueue = new Queue(this, 'dead-letter-queue', {
      ...Sqs.commonSqsQueueConfig,
      queueName: `sqs-dlq`,
    });

    this.sqsQueue = new Queue(this, 'sqs-queue', {
      ...Sqs.commonSqsQueueConfig,
      queueName: `sqs`,
      deadLetterQueue: {
        queue: this.deadLetterQueue,
        maxReceiveCount: Sqs.MAX_RETRY_ATTEMPT,
      },
    });
  }
}

Configure re-drive policy

import { Queue, QueueProps, RedrivePermission } from 'aws-cdk-lib/aws-sqs';

export class Sqs extends Construct {
  private static readonly MAX_RETRY_ATTEMPT = 3;
  private static readonly commonSqsQueueConfig: QueueProps = {
    visibilityTimeout: Duration.seconds(300),
    receiveMessageWaitTime: Duration.seconds(20),
    retentionPeriod: Duration.days(14),
  };

  readonly sqsQueue: Queue;
  readonly deadLetterQueue: Queue;

  constructor(
    readonly scope: Construct,
    readonly id: string,
    readonly env: EnvironmentVariables,
  ) {
    super(scope, id);

    this.deadLetterQueue = new Queue(this, 'dead-letter-queue', {
      ...Sqs.commonSqsQueueConfig,
      queueName: `sqs-dlq`,
      redriveAllowPolicy: {
        redrivePermission: RedrivePermission.BY_QUEUE,
        sourceQueues: [this.sqsQueue],
      },
    });

    this.sqsQueue = new Queue(this, 'sqs-queue', {
      ...Sqs.commonSqsQueueConfig,
      queueName: `sqs`,
      deadLetterQueue: {
        queue: this.deadLetterQueue,
        maxReceiveCount: Sqs.MAX_RETRY_ATTEMPT,
      },
    });
  }
}

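Did you notice anything wrong? Yes, that's it!!! The main queue and its dead-letter queue depend on each other circularly: the dead-letter queue's redriveAllowPolicy needs the main queue, while the main queue's deadLetterQueue setting needs the dead-letter queue. On top of that, this.sqsQueue has not even been assigned yet when the dead-letter queue is created. Why CDK, why???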

So, instead of using the redriveAllowPolicy property, we have to implement the allow policy manually.

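  // Requires PolicyStatement and ServicePrincipal from 'aws-cdk-lib/aws-iam'.
  private constructDeadLetterQueuePolicy(): void {
    // Allow the SQS service to read, delete, and move messages on the DLQ.
    this.deadLetterQueue.addToResourcePolicy(
      new PolicyStatement({
        actions: [
          'sqs:StartMessageMoveTask',
          'sqs:ReceiveMessage',
          'sqs:DeleteMessage',
          'sqs:GetQueueAttributes',
          'sqs:CancelMessageMoveTask',
          'sqs:ListMessageMoveTasks',
        ],
        principals: [new ServicePrincipal('sqs.amazonaws.com')],
        resources: [this.deadLetterQueue.queueArn],
      }),
    );

    // Intended to let redriven messages be sent back to the main queue.
    // Attached to the DLQ's policy, but scoped to the main queue's ARN.
    this.deadLetterQueue.addToResourcePolicy(
      new PolicyStatement({
        actions: ['sqs:SendMessage'],
        principals: [new ServicePrincipal('sqs.amazonaws.com')],
        resources: [this.sqsQueue.queueArn],
      }),
    );
  }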

You can check for more SQS actions here.
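One way to see why the manual policy sidesteps the problem is to draw the references between resources as a small graph and check it for cycles. The sketch below is only an illustration: the resource names are hypothetical logical IDs, and it assumes that addToResourcePolicy ends up as a separate queue policy resource that points at both queues, while neither queue points back at it.

```typescript
// Each key is a resource; each value lists the resources it references.
type DependencyGraph = Record<string, string[]>;

// Depth-first search that reports whether any resource depends on itself.
function hasCycle(graph: DependencyGraph): boolean {
  const visiting = new Set<string>();
  const done = new Set<string>();

  const visit = (node: string): boolean => {
    if (done.has(node)) return false;
    if (visiting.has(node)) return true; // reached a node already on the current path
    visiting.add(node);
    for (const dependency of graph[node] ?? []) {
      if (visit(dependency)) return true;
    }
    visiting.delete(node);
    done.add(node);
    return false;
  };

  return Object.keys(graph).some((node) => visit(node));
}

// redriveAllowPolicy on the DLQ: both queues reference each other.
const withRedriveAllowPolicy: DependencyGraph = {
  MainQueue: ['DeadLetterQueue'],
  DeadLetterQueue: ['MainQueue'],
};

// Manual policy (assumed to be its own resource): only the policy references both queues.
const withSeparatePolicy: DependencyGraph = {
  MainQueue: ['DeadLetterQueue'],
  DeadLetterQueue: [],
  DeadLetterQueuePolicy: ['DeadLetterQueue', 'MainQueue'],
};

console.log(hasCycle(withRedriveAllowPolicy)); // true
console.log(hasCycle(withSeparatePolicy)); // false
```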

And the final result is:

import { Duration } from 'aws-cdk-lib';
import { PolicyStatement, ServicePrincipal } from 'aws-cdk-lib/aws-iam';
import { Queue, QueueProps } from 'aws-cdk-lib/aws-sqs';
import { Construct } from 'constructs';

export class Sqs extends Construct {
  private static readonly MAX_RETRY_ATTEMPT = 3;
  private static readonly commonSqsQueueConfig: QueueProps = {
    visibilityTimeout: Duration.seconds(300),
    receiveMessageWaitTime: Duration.seconds(20),
    retentionPeriod: Duration.days(14),
  };

  readonly sqsQueue: Queue;
  readonly deadLetterQueue: Queue;

  constructor(
    readonly scope: Construct,
    readonly id: string,
    // EnvironmentVariables is a project-specific type that is not shown here.
    readonly env: EnvironmentVariables,
  ) {
    super(scope, id);

    // No redriveAllowPolicy here: it is what caused the circular dependency.
    this.deadLetterQueue = new Queue(this, 'dead-letter-queue', {
      ...Sqs.commonSqsQueueConfig,
      queueName: `sqs-dlq`,
    });

    this.sqsQueue = new Queue(this, 'sqs-queue', {
      ...Sqs.commonSqsQueueConfig,
      queueName: `sqs`,
      deadLetterQueue: {
        queue: this.deadLetterQueue,
        maxReceiveCount: Sqs.MAX_RETRY_ATTEMPT,
      },
    });

    // Both queues exist now, so the policy can reference them safely.
    this.constructDeadLetterQueuePolicy();
  }

  // constructDeadLetterQueuePolicy() is the method shown above.
}

That's all. Let's test the queues above.

First, send a test error message

You will see 1 message in flight in the main queue

After the message has timed out the maximum number of times (3 in our case), it will be moved into the dead-letter queue
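If you are wondering how long to wait, here is a rough back-of-the-envelope sketch. It assumes the consumer picks the message up as soon as it becomes visible, never deletes it, and never changes its visibility timeout, so treat the result as a lower bound rather than an exact number. `secondsUntilDeadLetter` is a hypothetical helper, not part of CDK or the AWS SDK.

```typescript
// Rough lower bound, in seconds, before an unprocessed message can reach the DLQ:
// each of the allowed receives hides the message for one full visibility timeout.
function secondsUntilDeadLetter(
  visibilityTimeoutSeconds: number,
  maxReceiveCount: number,
): number {
  return visibilityTimeoutSeconds * maxReceiveCount;
}

// Values from the construct above: 300 second visibility timeout, 3 receives.
const waitSeconds = secondsUntilDeadLetter(300, 3);
console.log(`${waitSeconds} seconds (${waitSeconds / 60} minutes)`); // 900 seconds (15 minutes)
```

With the settings used in this post that comes to about 15 minutes, so lowering the visibility timeout while testing saves a lot of waiting.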

Lastly, let's redrive the message back to the main queue to test whether our policy works or not

Yassss, it worked :)))

Conclusion

That's all for my experience implementing a dead-letter queue with CDK. I hope this helps when you run into the same problem. Thanks and see ya in the next post!
