Message queues are one of the critical components in any software architecture: they help different components talk to each other asynchronously. In this article, let's discuss what message queues are, how they're used in modern architectures, and what problems they solve.
Let's say you want to withdraw some money from an ATM. An ATM machine can only allow one person to do transactions at a time. In this case, what do you do when you see there's a person already using the ATM machine? Do you wait in a queue or do you go back thinking you cannot withdraw money from this machine?
Exactly! You wait in a queue until your turn comes.
So when many people want to withdraw money, they have to wait in a queue and go to the ATM machine in the same order in which they entered the queue. This is one real-world example of a queue, and message queues serve the same purpose in a software architecture.
We will cover the following topics in this article:
- What are message queues?
- How are message queues used in a system?
- What problems do message queues solve?
- Other properties of message queues
- Resources
- Conclusion
What Are Message Queues?
A queue is a First-In First-Out data structure. If you have a computer science background you know how a queue data structure works: the first element pushed into the queue is the first item extracted. It's basically the same concept as what we discussed in our ATM machine example.
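As a small illustration of the First-In First-Out behaviour described above, here is a minimal sketch using Python's `collections.deque`. The names in the list are made up for the ATM example.

```python
from collections import deque

# People arrive at the ATM in this order (names are made up for illustration).
atm_line = deque()
for person in ["Asha", "Ben", "Chen"]:
    atm_line.append(person)  # join the back of the queue

# The person who arrived first is served first.
served = [atm_line.popleft() for _ in range(len(atm_line))]
print(served)  # ['Asha', 'Ben', 'Chen']
```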
A message queue is basically a queuing service used in microservices and serverless architectures to decouple different components, so that they are easier to develop, scale and maintain in the long term. Message queues help different components communicate asynchronously by passing messages between them.
Generally in applications where message queues are used there are producers who create and put the messages into message queues. And there are consumers or worker processes which read those messages and take actions.
A message is any data that is passed between two components. It contains all the information needed for the worker components to take actions.
Let's consider a use case where we have to recharge a customer's account. The worker component needs to know these details: account_id, card_id, amount (in dollars).
The message the producer creates and puts into a message queue will look something like this:
{
"account_id": 12345,
"card_id": 567,
"amount": 100
}
As you can see, the above message contains all the information that a worker process needs to recharge this customer's account.
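To make the producer and worker roles concrete, here is a minimal sketch that uses Python's in-process `queue.Queue` as a stand-in for a real message queue service. The function names and the `balances` dictionary are assumptions made for this illustration, not part of any particular queuing product.

```python
import json
import queue

# Stand-in for a real message queue service (assumption for this sketch).
message_queue = queue.Queue()

# Hypothetical in-memory account balances, in dollars.
balances = {12345: 20}


def produce_recharge(account_id, card_id, amount):
    """Producer: serialize the message and put it on the queue."""
    message = {"account_id": account_id, "card_id": card_id, "amount": amount}
    message_queue.put(json.dumps(message))


def process_next_message():
    """Worker: read one message and act on it."""
    message = json.loads(message_queue.get())
    # A real worker would charge the card identified by card_id here.
    balances[message["account_id"]] += message["amount"]
    message_queue.task_done()
    return message


produce_recharge(account_id=12345, card_id=567, amount=100)
processed = process_next_message()
print(balances[12345])  # 120
```

The producer never calls the worker directly. It only knows the shape of the message and where to put it.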
There are many third-party message queuing services, such as RabbitMQ, Kafka and Amazon SQS, to name a few. Each of them provides different features suitable for different use cases.
How Are Message Queues Used In A System?
As mentioned earlier, message queues are used to decouple different components in a system. But what does it mean to decouple a system? It basically means separating functions so that they're more independent and self-contained.
Let's discuss it with an example. Let's say we need an application to generate invoices and send them to our customers at the end of the month. For the sake of this discussion, let's assume that generating an invoice is time-consuming: it takes anywhere between a few seconds and a minute to generate a single invoice, which involves doing the steps below in order:
- Get customer account details
- Get customer billing details
- Get data for different products used by the customer during the month
- Apply any one-time charges and charge customers
- Finally generate an invoice PDF and send it to the customer
How would you implement such a functionality?
This is how a naive developer would approach it: build a single REST service which accepts requests and does all of the above steps.
This is a monolithic architecture, and it has many problems:
No Fault Tolerance: Fault tolerance is the ability for a system to continue operating without any interruption when one or more of its components fail. Since we have a single REST service, if the service goes down, the whole application stops working.
Components Cannot Be Scaled Independently: If we want to scale the application when the number of customer accounts increases, we have to scale the whole application. This means more resources will be used, since we cannot scale any individual component independently. So a use case like Get Account Details will be scaled even though it is unnecessary, because this component generally gets less traffic.
No Freedom of Language: Since all the components are built as a single application, a single language / framework needs to be used to build the complete application. We cannot use different languages for different components since the whole application is a single code repository which is built and deployed together.
Increase in Development and Deployment Time: Having all the components in one single codebase will increase the development time. Since this is a big application, it's possible that different teams will be working on different components of the system at any point in time. So all the teams need to take care of code consistency and code quality, and also have to spend time resolving merge conflicts since active development will be going on. This adds to the development time. Also, a bigger codebase takes more time to build and deploy.
As seen above, there are many problems if we go with a monolithic architecture for the given problem statement. We need to divide the application into multiple components like Invoice Generation Service, Invoice Generation Service Worker, Account Service, PDF Generation Service etc.
Let's see how dividing this into multiple services and having a message queue to communicate between them will solve these problems.
What Problems Do Message Queues Solve?
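With a microservices architecture, the system is split into an Invoice Generation Service, a message queue, and a Worker service that calls the Account, Billing and PDF Generation services.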
In this architecture the Invoice Generation Service will create a message for every customer account. This message would look something like this:
{
"account_id": 12345,
"invoice_start_date": "2020-10-01",
"invoice_end_date": "2020-10-31"
}
Therefore, if we have 1000 accounts there will be 1000 messages pushed into the queue. The worker service will contain multiple workers. Each worker will pick up a message and start processing it. All the workers will be running concurrently. Each of these workers will talk to the Account Service, Billing Service and PDF Generation Service, which are microservices themselves.
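The flow described above can be sketched with Python threads and an in-process `queue.Queue` standing in for the message queue. This is only an illustration under those assumptions: the function names are made up, and the calls to the Account, Billing and PDF services are replaced by a placeholder string.

```python
import queue
import threading

invoice_queue = queue.Queue()
generated = []                      # invoices produced by the workers
generated_lock = threading.Lock()   # protects the shared list


def enqueue_invoice_jobs(account_ids):
    """Invoice Generation Service: one message per customer account."""
    for account_id in account_ids:
        invoice_queue.put({
            "account_id": account_id,
            "invoice_start_date": "2020-10-01",
            "invoice_end_date": "2020-10-31",
        })


def worker():
    """Worker: keep picking up messages until the queue is empty."""
    while True:
        try:
            message = invoice_queue.get_nowait()
        except queue.Empty:
            return
        # Placeholder for calling the Account, Billing and PDF services.
        invoice = f"invoice-{message['account_id']}"
        with generated_lock:
            generated.append(invoice)
        invoice_queue.task_done()


enqueue_invoice_jobs(range(1, 1001))       # 1000 accounts, 1000 messages
workers = [threading.Thread(target=worker) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
print(len(generated))  # 1000
```

Changing the number of worker threads does not require any change to the producer, which is the point of putting a queue between the two.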
Now let's see how each of the problems mentioned above will be solved with this architecture.
Fault Tolerance: If one of the components fails, the others keep working. Let's say the Invoice Generation Service Worker is down for some reason: the Invoice Generation Service will continue to work and keep pushing messages into the queue. It does not depend on whether the Worker service is up or not. If the Invoice Generation Service is down, the Worker service will continue to pull messages from the queue and generate invoices. So now there is fault tolerance in the system.
Scalability: Since the components are now independent of one another, they can be scaled independently too. All the heavyweight work is done by the Worker service, while the Invoice Generation Service just creates messages and pushes them into the message queue. So we need more computing power for the Worker service. We can run the Worker service on powerful machines and add more workers if the number of accounts increases in the future, whereas the Invoice Generation Service can run on less powerful machines. With this architecture we're able to scale the different components independently.
Freedom of Language: Having multiple components like this gives us the freedom to implement different components in different programming languages and frameworks. For example, the Invoice Generation Service can be implemented in Java and the Worker can be implemented in Go. And if, let's say, we have another service which is responsible for generating the invoice PDF, it can be implemented in Python. So having multiple components and a message queue for communication between them gives us flexibility over the languages used in the architecture.
Improved Developer Productivity & Deployment Time: Since different components can be developed independently, developer productivity will increase because teams don't have to spend as much time coordinating when adding new features, resolving merge conflicts etc. Also, the codebase of each component or service will be lighter and will have a separate deployment pipeline. So code quality is easier to maintain, and build and deployment times decrease.
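The fault tolerance point above can be illustrated with a toy example: the producer keeps pushing messages while the worker is down, and the worker catches up when it comes back. The `backlog` deque is only a stand-in for a durable message queue, and the function names are hypothetical.

```python
from collections import deque

backlog = deque()  # stand-in for a durable message queue


def producer_push(account_id):
    # The producer only needs the queue to be available, not the worker.
    backlog.append({"account_id": account_id})


def worker_drain():
    """Process everything that accumulated while the worker was away."""
    done = []
    while backlog:
        done.append(backlog.popleft()["account_id"])
    return done


# Worker is down: messages simply accumulate.
for account_id in (1, 2, 3):
    producer_push(account_id)
waiting = len(backlog)

# Worker comes back up and catches up, in order.
processed_ids = worker_drain()
print(waiting, processed_ids)  # 3 [1, 2, 3]
```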
Other Properties Of Message Queues
Different message queuing services offer message queues with properties like visibility timeout, message retention period, message size etc., based on the popular use cases in the industry.
The message queuing service provided by AWS is called SQS (Simple Queue Service).
In this section, let's discuss the message queues SQS provides and their properties, which will give you an idea of the kinds of properties available with message queues in general.
AWS SQS offers two types of message queues:
Standard Queue:
Amazon SQS offers standard as the default queue type. Standard queues support a nearly unlimited number of API calls per second, per API action (SendMessage, ReceiveMessage, or DeleteMessage). Standard queues support at-least-once message delivery. However, occasionally (because of the highly distributed architecture that allows nearly unlimited throughput), more than one copy of a message might be delivered out of order. Standard queues provide best-effort ordering which ensures that messages are generally delivered in the same order as they're sent.
FIFO Queue:
FIFO (First-In-First-Out) queues are designed to enhance messaging between applications when the order of operations and events is critical, or where duplicates can't be tolerated.
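Because a standard queue may deliver the same message more than once, consumers are often written so that processing a duplicate is harmless. Below is a minimal sketch of that idea. The `message_id` field, the in-memory set and the simulated deliveries are assumptions for illustration, and this is not the SQS API.

```python
processed_message_ids = set()   # in production this would be a durable store
invoices_sent = []


def handle(message):
    """Process a message at most once, even if it is delivered twice."""
    if message["message_id"] in processed_message_ids:
        return False  # duplicate delivery, skip it
    invoices_sent.append(message["account_id"])
    processed_message_ids.add(message["message_id"])
    return True


# Simulated deliveries: message "m-1" arrives twice.
deliveries = [
    {"message_id": "m-1", "account_id": 12345},
    {"message_id": "m-2", "account_id": 67890},
    {"message_id": "m-1", "account_id": 12345},
]
results = [handle(m) for m in deliveries]
print(results)        # [True, True, False]
print(invoices_sent)  # [12345, 67890]
```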
Properties:
- Visibility Timeout: In a distributed application, when a worker picks up a message there's no guarantee that the message will be processed successfully. So SQS doesn't delete the message from the queue once it is picked up. It is the responsibility of the worker to delete it from the queue once it has processed the message successfully.
Immediately after a message is received, it remains in the queue. To prevent other consumers from processing the message again, Amazon SQS sets a visibility timeout, a period of time during which Amazon SQS prevents other consumers from receiving and processing the message. If the worker does not delete the message before the timeout expires, the message becomes visible again and another consumer can receive it.
- Message Retention Period: If a worker application is down, we do not want to store the messages in the queue forever. We will keep the message in the queue for a specified amount of time and then delete it from the queue.
Message retention period is the length of time, in seconds, for which Amazon SQS retains a message.
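To show how these two properties interact, here is a toy in-memory queue that imitates a visibility timeout and a retention period. It is a sketch for intuition only: the class, its method names and the explicit `now` argument (used instead of a real clock so the behaviour is repeatable) are all assumptions, and it does not reflect how SQS is implemented or called.

```python
import itertools


class ToyQueue:
    """Toy queue with a visibility timeout and a retention period (seconds)."""

    def __init__(self, visibility_timeout, retention_period):
        self.visibility_timeout = visibility_timeout
        self.retention_period = retention_period
        self._messages = {}  # message id -> body, sent_at, visible_at
        self._ids = itertools.count(1)

    def send(self, body, now):
        message_id = next(self._ids)
        self._messages[message_id] = {
            "body": body, "sent_at": now, "visible_at": now,
        }
        return message_id

    def _expire(self, now):
        # Drop messages that have outlived the retention period.
        expired = [mid for mid, m in self._messages.items()
                   if now - m["sent_at"] >= self.retention_period]
        for mid in expired:
            del self._messages[mid]

    def receive(self, now):
        """Return (id, body) of the first visible message, or None."""
        self._expire(now)
        for message_id, m in self._messages.items():
            if m["visible_at"] <= now:
                # Hide the message from other consumers for a while.
                m["visible_at"] = now + self.visibility_timeout
                return message_id, m["body"]
        return None

    def delete(self, message_id):
        """Called by the worker after successful processing."""
        self._messages.pop(message_id, None)


q = ToyQueue(visibility_timeout=30, retention_period=300)
job = q.send("invoice for account 12345", now=0)

first = q.receive(now=0)      # worker A receives the message
hidden = q.receive(now=10)    # worker B gets nothing: the message is in flight
retry = q.receive(now=31)     # A never deleted it, so it is visible again
q.delete(retry[0])            # processing succeeded this time
gone = q.receive(now=40)      # nothing left after the delete

q.send("never picked up", now=100)
expired = q.receive(now=400)  # 300 seconds later the message has been dropped
```

In this sketch the message is only removed when a worker deletes it or when it outlives the retention period, which mirrors the two behaviours described above.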
There are many properties like these associated with message queues, depending on the service and the use cases. I hope this gives you an idea of the types of properties available in message queues.
Resources
Learning about infrastructure and cloud computing services is very important to understand how distributed and scalable applications are built. Message queues are one of the integral components used in building these applications.
Personally, I loved the course The Good Parts of AWS by Daniel Vassallo, who has 10+ years of experience working within the AWS team. In this course Daniel talks about AWS SQS along with other critical services like S3, EC2, Autoscaling, Route53 etc.
This course is very helpful even for absolute beginners to get started with cloud computing services. I highly recommend buying this course if you have been wanting to learn about this topic. You'll not regret it!
Conclusion
- Message queues provide an asynchronous communication mechanism that lets two components talk to each other without waiting on one another.
- Producer applications create and insert messages into message queues. Consumer applications / worker processes read and process these messages.
- Message queues help in decoupling components in an architecture.
- Using message queues improves fault tolerance, scalability, development time and freedom of language.
- AWS SQS, Kafka, RabbitMQ are some of the popular message queuing services available.
- Different message queuing services provide different features depending on popular industry standard use cases.