Using CQRS - Command and Query Responsibility Segregation Pattern
Cloud (and preferably micro-service) design patterns are useful for building reliable, scalable, secure and reactive applications in the cloud. Now let’s take a closer look at one of them - CQRS.
CQRS design pattern
CQRS, short for Command and Query Responsibility Segregation, is a design pattern that separates read and update operations for a data store. In other words, instead of using one model to both retrieve and change information, CQRS lets us create two separate models, each managing its own respective operations (read or update).
Implementing CQRS can improve an application’s performance, scalability, and security. It also supports independence between the front-end (FE) and back-end (BE) layers, which is one way to achieve a loosely coupled and more reactive application architecture. Additionally, CQRS helps prevent merge conflicts at the domain level.
In some cases, this separation can be useful, but watch out as CQRS can add risky complexity.
A real implementation example of the CQRS pattern is also available in the CodeNOW demo application - Using CQRS in demo Reservation application
Context
The mainstream approach is to use the same data model to both query and update a database. While this is simple and efficient for basic CRUD operations, it stops working well in many situations. For instance, on the read side we may want to combine multiple sets of data into one, or display the same data in different forms. On the update (write) side, validation rules may allow only certain combinations of data to be stored, and those rules may conflict with the form in which the data arrives, so the data has to be transformed before it is saved. As a result, the application ends up overly complex and difficult to manage.
Developers and designers suffer the consequences of such a complex model, in which a single piece of information can have multiple representations. Users interact with these different representations of the same information, which can lead to confusion. The developers on the domain team that manages a given object should use a domain model to describe its structure and behavior, while its read-only representations can be tailored to the needs of consumers.
Such multiple layers of representation can get quite complicated, but they should still be mapped to the single conceptual representation (domain model) which acts as a translation point between all the representations.
The traditional single-model approach typically runs into the following problems:
- Consistency: In many cases, inconsistencies occur between the read and write representations of the data. They are often caused by minor things, such as properties that weren’t correctly updated. For example, a mobile operator’s common customer service or domain model stores some basic customer data, but it may also return additional (derived or computed) data, such as customer rating or segment assignment. If you change the basic data, the derived data has to be recalculated, and that logic is often complex.
- Throughput: When multiple operations are performed simultaneously on the same piece of data, they contend for it, which can lead to locking, conflicting updates and reduced throughput.
- Performance: Overall performance can degrade with the traditional approach, mostly because of the high load on the data store and the complexity of the queries needed to retrieve information.
- Security: Since each piece of information is subject to both read and write operations, it’s important not to disclose data in the wrong context, which complicates overall security. For example, if a mobile operator’s common customer service or domain model contains shared aggregated logic for multiple customer segments (family, retail, corporate), then you have to evaluate many complex and conflicting conditions to decide which content may be displayed for a given segment.
- Robustness and layer independence: A frequent problem in distributed systems (not necessarily microservice-based) is strong dependency among communicating services and components, typically through synchronous request-response calls. Failure of any integrated service called from the n-th layer of back-end systems can cause failure of the whole system (and, for example, a malfunction of the client front-end). See also DDD, bounded contexts and context maps.
Last but not least, you may have applied the Microservices architecture pattern and the Database per service pattern. As a result, it is no longer straightforward to implement queries that join data from multiple services (and their data sources). Likewise, if you have applied the Event sourcing pattern, the data is no longer easy to query. The problem, then, is how to implement a query that retrieves data from multiple services in a microservice architecture.
Solution
As stated earlier, CQRS splits a domain model into separate read and update models. This avoids ending up with a single conceptual model that does neither operation well. The projects that benefit most from CQRS are usually those with high traffic and complex write-to-read updates. The update/write model is also referred to as the Command model and the read model as the Query model.
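The split can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `Reservation` entity, the `ReservationSummary` DTO, both model classes, and the plain dictionary standing in for a data store. It is not taken from the CodeNOW demo application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Reservation:
    """Write-side entity: holds the state needed to enforce business rules."""
    reservation_id: str
    seat: str
    customer: str


@dataclass(frozen=True)
class ReservationSummary:
    """Read-side DTO: immutable and shaped for display."""
    reservation_id: str
    label: str


class ReservationCommandModel:
    """Command (write) model: validates and changes state, returns no data."""

    def __init__(self, store: Dict[str, Reservation]) -> None:
        self._store = store

    def reserve_seat(self, reservation_id: str, seat: str, customer: str) -> None:
        if any(r.seat == seat for r in self._store.values()):
            raise ValueError(f"seat {seat} is already reserved")
        self._store[reservation_id] = Reservation(reservation_id, seat, customer)


class ReservationQueryModel:
    """Query (read) model: only reads the store, never modifies it."""

    def __init__(self, store: Dict[str, Reservation]) -> None:
        self._store = store

    def list_summaries(self) -> List[ReservationSummary]:
        return [
            ReservationSummary(r.reservation_id, f"{r.customer} @ {r.seat}")
            for r in self._store.values()
        ]


# Both models share one store here; they could just as well use separate ones.
store: Dict[str, Reservation] = {}
commands = ReservationCommandModel(store)
queries = ReservationQueryModel(store)

commands.reserve_seat("r-1", "12A", "Alice")
summaries = queries.list_summaries()
```

The validation rule lives only in the command model, while the query model is free to return whatever shape its consumers need.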
CQRS solution aspects:
- Commands sent to the Command model shouldn’t be data-centric (that is, they shouldn’t describe what happens to the affected data). Instead, they should be task-based and name the business operation being performed. For example, instead of “Create new userID”, it should be “Register user”, or instead of “Set order status”, it should be “Confirm order”.
- Commands can be queued and processed asynchronously, which lets the caller get a response sooner.
- Queries solely retrieve data and never change it. When something is queried, a DTO is returned.
- The Query (view) database is updated in response to Domain events published by the domain that owns the data.
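The aspects listed above can be sketched together in a few lines. This is only an illustration under stated assumptions: the `ConfirmOrder` command, the `OrderConfirmed` event, the `OrderStatusDTO`, and the in-process `queue.Queue` standing in for a real message broker are all hypothetical.

```python
import queue
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfirmOrder:
    """Task-based command: names the business operation, not a field update."""
    order_id: str


@dataclass(frozen=True)
class OrderConfirmed:
    """Domain event published by the write side after the command succeeds."""
    order_id: str


@dataclass(frozen=True)
class OrderStatusDTO:
    """What a query returns: plain data with no behavior."""
    order_id: str
    status: str


orders: Dict[str, str] = {"o-1": "NEW"}   # write-side state (hypothetical)
order_view: Dict[str, str] = {}           # query (view) database, fed by events
command_queue: queue.Queue = queue.Queue()


def project(event: OrderConfirmed) -> None:
    """Update the query (view) database from a domain event."""
    order_view[event.order_id] = "CONFIRMED"


def handle(command: ConfirmOrder) -> None:
    """Command handler: enforce the rule, change state, publish the event."""
    if orders.get(command.order_id) != "NEW":
        raise ValueError(f"order {command.order_id} cannot be confirmed")
    orders[command.order_id] = "CONFIRMED"
    project(OrderConfirmed(command.order_id))


def drain_commands() -> None:
    """Stand-in for a background worker that processes queued commands."""
    while not command_queue.empty():
        handle(command_queue.get())


def get_order_status(order_id: str) -> OrderStatusDTO:
    """Query: reads the view and returns a DTO; it never changes anything."""
    return OrderStatusDTO(order_id, order_view.get(order_id, "UNKNOWN"))


command_queue.put(ConfirmOrder("o-1"))   # the caller does not wait for processing
before = get_order_status("o-1")         # the view has not been updated yet
drain_commands()
after = get_order_status("o-1")
```

Between queuing the command and processing it, the query still returns the old state, which is the price paid for the faster response.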
The command/query models may share the same (in-memory) database, in which case the database acts as the communication point between the two models. They may also use separate databases, often based on different technologies optimized for fast, efficient reads or for reliable writes, respectively. In that case, there needs to be some form of communication mechanism between the two models or their databases.
If you decide to separate the databases, they must be kept in sync. As mentioned before, this is where Domain events come in. Essentially, when the domain’s (Command) database is updated, the owning domain publishes an event that causes the Query database to be updated as well. Updating the database and publishing the event must occur in a single transaction, so that no change is stored without its event or the other way around. This way every change is reliably propagated to the Query (view) database, although the view may briefly lag behind the write side.
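One common way to make the update and the event atomic is to store the event in an "outbox" table inside the same database transaction and forward it afterwards. The sketch below assumes that technique; the text above does not prescribe it. It uses two in-memory SQLite databases, and the table names, the `CustomerRegistered` event and both functions are hypothetical.

```python
import json
import sqlite3

write_db = sqlite3.connect(":memory:")   # command-side database
read_db = sqlite3.connect(":memory:")    # query (view) database

write_db.execute("CREATE TABLE customer (id TEXT PRIMARY KEY, name TEXT)")
write_db.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
)
read_db.execute(
    "CREATE TABLE customer_view (id TEXT PRIMARY KEY, display_name TEXT)"
)


def register_customer(customer_id: str, name: str) -> None:
    """Store the change and its domain event in one transaction."""
    event = {"type": "CustomerRegistered", "id": customer_id, "name": name}
    with write_db:  # commits both inserts together, or rolls both back
        write_db.execute(
            "INSERT INTO customer VALUES (?, ?)", (customer_id, name)
        )
        write_db.execute(
            "INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),)
        )


def relay_events() -> int:
    """Forward pending events to the view database, then remove them."""
    rows = write_db.execute(
        "SELECT seq, payload FROM outbox ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        event = json.loads(payload)
        with read_db:
            # INSERT OR REPLACE keeps the projection safe to repeat.
            read_db.execute(
                "INSERT OR REPLACE INTO customer_view VALUES (?, ?)",
                (event["id"], event["name"]),
            )
        with write_db:
            write_db.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
    return len(rows)


register_customer("c-1", "Alice")
relayed = relay_events()
```

If the relay crashes after updating the view but before deleting the outbox row, the event is simply applied again on the next run, which is why the projection is written to be repeatable.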
The Query model could be an exact, read-only copy of the Command model, or it could have an entirely different structure. Having multiple Query models, each shaped for a particular kind of query, can improve read performance. Separating the two databases also allows each to be scaled independently to match its workload.
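As a rough illustration of several Query models fed by the same events, the sketch below keeps one view that mirrors the write data and one that is pre-aggregated. The `CustomerRegistered` event and both view classes are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CustomerRegistered:
    customer_id: str
    name: str
    segment: str  # for example "family", "retail" or "corporate"


class CustomerDetailView:
    """Query model 1: lookup by id, close to a read-only copy of the write data."""

    def __init__(self) -> None:
        self.by_id: Dict[str, Dict[str, str]] = {}

    def apply(self, event: CustomerRegistered) -> None:
        self.by_id[event.customer_id] = {
            "name": event.name,
            "segment": event.segment,
        }


class SegmentCountView:
    """Query model 2: pre-aggregated counts with an entirely different structure."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def apply(self, event: CustomerRegistered) -> None:
        self.counts[event.segment] += 1


detail_view = CustomerDetailView()
segment_view = SegmentCountView()

events = [
    CustomerRegistered("c-1", "Alice", "retail"),
    CustomerRegistered("c-2", "Bob", "corporate"),
    CustomerRegistered("c-3", "Carol", "retail"),
]

# Every query model receives the same stream of domain events.
for event in events:
    detail_view.apply(event)
    segment_view.apply(event)
```

Answering "how many retail customers are there?" is then a single lookup in `segment_view.counts` rather than a scan over all customers.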
CQRS Benefits
- Independent scaling - the workloads of the read and write models can differ, so scaling each side independently makes more efficient use of resources.
- Security - the separation makes it much simpler to check that changes to data are legitimate, because only the write model can modify data.
- Optimized schemas - each side can have its own schema tuned for its job: the write side is optimized for updates, the read side for queries.
- Separation of concerns - the segregation results in models that are easier to maintain and more flexible. As a result, both models can be quite simple.
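- Simpler queries - a read model that is already shaped for its consumers lets the application avoid complex joins at query time.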
- Performance - You can use a fast in-memory database (or cache) as the read model and thus avoid a lot of expensive database interactions.
Challenges
CQRS also has its challenges. Some of them include:
- Asynchronous messaging. While messaging isn’t a necessity in CQRS, it’s a convenient way of processing events and commands (via a dumb broker). The application then has to cope with message failures and duplicate messages.
- Complexity. The basic idea of CQRS is simple, but the resulting application can become complex. Having to deal with information that has multiple representations, or with Event sourcing, can result in a chaotic application.
- Stale data. The Query model must be updated frequently enough to reflect the changes in the Command model. Otherwise, a user may issue a request based on data that has not yet been updated.
- Coding. CQRS can’t simply be installed and expected to work on any given application. It has to be coded to fit the application’s specific needs, and scaffolding mechanisms are mostly useless in this case.
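Two of these challenges, duplicate messages and stale data, are often handled by versioning events. The sketch below is one possible approach under that assumption; the `PriceChanged` event, its `version` field and the `PriceView` class are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PriceChanged:
    product_id: str
    version: int       # increases with every change on the write side
    price_cents: int


class PriceView:
    """Query model that ignores duplicate or out-of-order events."""

    def __init__(self) -> None:
        # product_id -> (last applied version, price in cents)
        self.rows: Dict[str, Tuple[int, int]] = {}

    def apply(self, event: PriceChanged) -> bool:
        """Apply the event; return False if it was a duplicate or too old."""
        current = self.rows.get(event.product_id)
        if current is not None and event.version <= current[0]:
            return False
        self.rows[event.product_id] = (event.version, event.price_cents)
        return True

    def is_stale(self, product_id: str, write_version: int) -> bool:
        """True if the view has not yet caught up with the given write version."""
        current = self.rows.get(product_id)
        return current is None or current[0] < write_version


view = PriceView()
first = view.apply(PriceChanged("p-1", 1, 1000))
duplicate = view.apply(PriceChanged("p-1", 1, 1000))   # redelivered message
stale_before = view.is_stale("p-1", write_version=2)   # write side is at v2
second = view.apply(PriceChanged("p-1", 2, 1200))
stale_after = view.is_stale("p-1", write_version=2)
```

A caller that knows the version it has just written can use a check like `is_stale` to decide whether to wait, retry, or show the data with a warning.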
When to use
So when should we use CQRS? CQRS isn’t meant to be applied to a whole application. Instead, it should be applied to specific parts of a system, because each part needs a different schema in order to achieve its maximum performance.
As a general rule, CQRS can come in handy when the read and write sides of an application have little overlap. This is the case, for example, in applications with high performance requirements, where the ability to scale each model independently becomes a necessity because of the difference in workload.
Apart from workload, CQRS can also come in handy if we want to use different techniques on each side, for example when the write side needs a schema optimized for updating information and the read side a schema optimized for querying.
Yet another indicator might be the problem of how to implement a query that retrieves data from multiple services in a microservice architecture, e.g. in a Backend For Frontend (BE4FE) layer. You can implement the API Composition pattern, but it can lead to an undesired strong (synchronous) dependency. A better option is to aggregate data from all the services within a CQRS query model in the BE4FE. This makes the BE4FE and the services independent of each other, leading to more robust, responsive, fault-tolerant and highly available systems.
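A query model of this kind can be sketched as a view that subscribes to events from several services and answers queries from its own data. The two source services (customer and order), their events and the `CustomerOrdersView` class below are hypothetical and stand in for whatever the BE4FE layer actually consumes.

```python
from typing import Dict, List, Optional


class CustomerOrdersView:
    """BE4FE query model built from events of two independent services."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, object]] = {}

    def _row(self, customer_id: str) -> Dict[str, object]:
        # Events may arrive in any order, so create the row on first sight.
        return self._rows.setdefault(customer_id, {"name": None, "orders": []})

    def on_customer_registered(self, customer_id: str, name: str) -> None:
        """Handler for an event published by the (hypothetical) customer service."""
        self._row(customer_id)["name"] = name

    def on_order_placed(self, customer_id: str, order_id: str) -> None:
        """Handler for an event published by the (hypothetical) order service."""
        orders: List[str] = self._row(customer_id)["orders"]  # type: ignore[assignment]
        orders.append(order_id)

    def get(self, customer_id: str) -> Optional[Dict[str, object]]:
        """Query: answered locally, with no synchronous call to any service."""
        row = self._rows.get(customer_id)
        if row is None:
            return None
        return {"name": row["name"], "orders": list(row["orders"])}  # type: ignore[call-overload]


view = CustomerOrdersView()
view.on_order_placed("c-1", "o-10")           # order event arrives first
view.on_customer_registered("c-1", "Alice")
view.on_order_placed("c-1", "o-11")

result = view.get("c-1")
```

Because `get` reads only the local view, it keeps working even while the customer or order service is unavailable; the trade-off is that the answer may be slightly out of date.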
When not to use
While CQRS seems like a beneficial pattern, consider carefully whether moving from a CRUD model to CQRS is worth it. For example, if an application’s read and write schemas and workloads are similar, the basic CRUD model is the one to go for. Improper implementation of CQRS could result in unnecessary complexity, high maintenance costs and possibly serious difficulties.
The potential risk of CQRS complexity and the general complexity of microservice architecture could be reduced by using a mature platform such as CodeNOW to help efficiently develop, deploy, control and monitor complex distributed systems. In this way, the advantages of CQRS can outweigh the disadvantages.
Relation to other design patterns
CQRS naturally fits with some other architectural patterns.