DDD Part Seven: Performance Optimization of the Repository

In DDD, aggregate roots are persisted through a repository (Repository). The repository decouples aggregate root persistence from the underlying storage middleware (MySQL, Elasticsearch, MongoDB, etc.), so we can choose either a relational or a non-relational database to store an aggregate root, depending on the business characteristics of the aggregate.

Many readers may still wonder why the repository provides only a save method for persisting aggregate roots. The reason is that in DDD the repository is a container of aggregate roots, and it places no restriction on what the container is made of; this is the decoupling from the underlying storage mentioned above. If the container is a key-value database, it cannot update a single field, and it does not distinguish between insert and update. This is also how a repository differs from a DAO: to the domain model, the repository offers only two things, obtaining aggregate roots and persisting aggregate roots.
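To make the "container" idea concrete, here is a minimal sketch of such a repository contract backed by a key-value store. The Account aggregate and the KeyValueAccountRepository class are hypothetical names used only for illustration; only the method names findById, save, and deleteById come from this article.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal repository contract: obtain, persist, remove.
interface Repository<T, ID> {
    T findById(ID id);
    void save(T aggregate);
    void deleteById(ID id);
}

// Hypothetical aggregate root used only for this illustration.
final class Account {
    private final String id;
    private long balance;

    Account(String id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    String getId() { return id; }
    long getBalance() { return balance; }
    void deposit(long amount) { balance += amount; }
}

// A key-value "container": save overwrites the whole value stored under the key,
// so insert and update are one and the same operation.
final class KeyValueAccountRepository implements Repository<Account, String> {
    private final Map<String, Account> store = new HashMap<>();

    @Override
    public Account findById(String id) { return store.get(id); }

    @Override
    public void save(Account aggregate) { store.put(aggregate.getId(), aggregate); }

    @Override
    public void deleteById(String id) { store.remove(id); }
}
```

The domain model never sees whether save resulted in an insert or an update; that decision belongs to the repository implementation.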

If we choose a relational database as the container, the aggregate root and the entities under it may have to be split across multiple tables. Each save of the aggregate root may then execute multiple update statements, even if the entities under the aggregate root have not changed and only one field (value object) of the aggregate root was modified. This can seriously affect the performance of the application.

To solve the performance problems caused by choosing a relational database as the aggregate root container, we need to do some extra work, such as using in-memory snapshots to determine which tables actually need to be updated each time an aggregate root is saved.

Every business use case obtains the aggregate root through the repository and finally persists it through the repository. Based on this, we can create a snapshot when the aggregate root is obtained, compare (diff) the aggregate root against the snapshot when it is persisted, and execute only the updates that the differences require.

This article shares a scheme implemented by the author. Although each team defines its own DDD code conventions, repository implementations do not differ much, so the scheme should still have reference value.

First, we abstract the aggregate root snapshot store, AggregateRootSnapshot, which provides methods for caching an aggregate root snapshot, obtaining a snapshot by aggregate root ID, and removing a snapshot.

Tip: By convention, every aggregate root must inherit from the abstract class BaseAggregate, which defines the method for obtaining the aggregate root ID. When a snapshot is cached, the aggregate root ID is used as the cache key, so the snapshot can later be looked up by aggregate root ID.

We could use Redis to cache the aggregate root snapshots, but storing them in comparatively slow storage middleware is not recommended: far from optimizing the repository, the extra access could make its performance worse. The best approach is to keep the snapshots in memory, which trades space for time.

We use ThreadLocal to store aggregate root snapshots in the AggregateRootSnapshot implementation class.
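The original implementation is not reproduced here, so the following is only a minimal sketch of what a ThreadLocal-backed store could look like. The method names cache, get, and remove, the generic signatures, and the class name ThreadLocalAggregateRootSnapshot are assumptions; the ID comparison in get guards against returning a stale snapshot left on the thread by an earlier use case.

```java
// Assumed base class from the convention above: exposes the aggregate root ID.
abstract class BaseAggregate<ID> {
    abstract ID getId();
}

// Assumed store contract: cache a snapshot, look it up by ID, remove it.
interface AggregateRootSnapshot<T extends BaseAggregate<ID>, ID> {
    void cache(T snapshot);
    T get(ID id);
    void remove(ID id);
}

// Keeps at most one snapshot per thread, in memory.
final class ThreadLocalAggregateRootSnapshot<T extends BaseAggregate<ID>, ID>
        implements AggregateRootSnapshot<T, ID> {

    // Non-static on purpose: one ThreadLocal per store instance.
    private final ThreadLocal<T> holder = new ThreadLocal<>();

    @Override
    public void cache(T snapshot) {
        holder.set(snapshot);
    }

    @Override
    public T get(ID id) {
        T snapshot = holder.get();
        // Ignore a snapshot that belongs to a different aggregate root.
        if (snapshot == null || !snapshot.getId().equals(id)) {
            return null;
        }
        return snapshot;
    }

    @Override
    public void remove(ID id) {
        T snapshot = holder.get();
        if (snapshot != null && snapshot.getId().equals(id)) {
            holder.remove();
        }
    }
}

// Hypothetical aggregate root for trying the store out.
final class Order extends BaseAggregate<Long> {
    private final Long id;

    Order(Long id) { this.id = id; }

    @Override
    Long getId() { return id; }
}
```

With this sketch, caching a snapshot for order 1 and then asking for order 2 on the same thread returns null rather than the wrong snapshot.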

Suppose the aggregate root ID is not generated by the database (we do not recommend relying on the database to generate aggregate root IDs; the reason was given in a previous article). We must then avoid getting the wrong snapshot when an aggregate root is newly created. For example, while executing the previous business use case (an interface request), the thread only called the method that obtains the aggregate root (such as fetching the aggregate root's details) and never called the save method, so the snapshot was not removed. In the current use case a new aggregate root is created, so the repository's find method is not called and the snapshot is not refreshed. The snapshot obtained this time is therefore the previous one, which is why we still need to compare whether the aggregate root IDs are the same.

Of course, comparing the aggregate root ID alone does not guarantee that the snapshot is the right one. What makes the scheme reliable is also this condition: "every business use case first obtains the aggregate root through the repository and finally persists it through the repository". This condition is the most important one.

Tip: The ThreadLocal field is not static. Won't that cause a memory leak? The answer is no, as explained later.

Next, we write an abstract class for repositories that store aggregate roots in a relational database. Any repository that needs snapshots to optimize performance can inherit this abstract class.

RepositorySnapshotSupper implements the findById, save, and deleteById methods of the Repository interface and declares abstract methods for subclasses to implement. It creates and caches a snapshot of the aggregate root when findById obtains it, takes the snapshot out and completes the diff before the aggregate root is actually saved, and then hands the diff result to the subclass, so that the subclass's save implementation can use the diff result to avoid unnecessary SQL.
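The template described above can be sketched roughly as follows. This is a simplified illustration, not the author's class: the hook names doFindById, copyOf, diff, doSave, and doDeleteById are assumptions, the snapshot and diff logic are left as abstract hooks instead of shared utility classes, and clearing the snapshot after save is also an assumption. Note and NoteRepository are hypothetical and exist only to exercise the template.

```java
import java.util.HashMap;
import java.util.Map;

// Assumed base class: exposes the aggregate root ID.
abstract class BaseAggregate<ID> {
    abstract ID getId();
}

// Template: snapshot on find, diff on save, subclass decides which SQL to run.
abstract class RepositorySnapshotSupper<T extends BaseAggregate<ID>, ID, D> {
    // One snapshot per thread; the repository itself should be a singleton.
    private final ThreadLocal<T> snapshotHolder = new ThreadLocal<>();

    public final T findById(ID id) {
        T aggregate = doFindById(id);
        if (aggregate == null) {
            snapshotHolder.remove();
            return null;
        }
        snapshotHolder.set(copyOf(aggregate));
        return aggregate;
    }

    public final void save(T aggregate) {
        T snapshot = snapshotHolder.get();
        // A snapshot of another aggregate root is stale and must be ignored.
        if (snapshot != null && !snapshot.getId().equals(aggregate.getId())) {
            snapshot = null;
        }
        try {
            doSave(aggregate, diff(aggregate, snapshot));
        } finally {
            snapshotHolder.remove();
        }
    }

    public final void deleteById(ID id) {
        try {
            doDeleteById(id);
        } finally {
            snapshotHolder.remove();
        }
    }

    protected abstract T doFindById(ID id);
    protected abstract T copyOf(T aggregate);
    protected abstract D diff(T current, T snapshotOrNull);
    protected abstract void doSave(T aggregate, D diff);
    protected abstract void doDeleteById(ID id);
}

// Hypothetical aggregate root with a single value field.
final class Note extends BaseAggregate<Long> {
    private final Long id;
    String text;

    Note(Long id, String text) { this.id = id; this.text = text; }

    @Override
    Long getId() { return id; }
}

// Hypothetical repository: a map stands in for the table, writes counts the "SQL" executed.
final class NoteRepository extends RepositorySnapshotSupper<Note, Long, Boolean> {
    final Map<Long, String> table = new HashMap<>();
    int writes = 0;

    @Override
    protected Note doFindById(Long id) {
        String text = table.get(id);
        return text == null ? null : new Note(id, text);
    }

    @Override
    protected Note copyOf(Note note) { return new Note(note.getId(), note.text); }

    @Override
    protected Boolean diff(Note current, Note snapshotOrNull) {
        return snapshotOrNull == null || !snapshotOrNull.text.equals(current.text);
    }

    @Override
    protected void doSave(Note note, Boolean changed) {
        if (changed) {
            table.put(note.getId(), note.text);
            writes++;
        }
    }

    @Override
    protected void doDeleteById(Long id) { table.remove(id); }
}
```

In this sketch, loading a note and saving it unchanged performs no write at all, while a new note or a modified note is written once.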

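Tip: The snapshot store in RepositorySnapshotSupper is not static, and the ThreadLocal field of the snapshot store is not static either. We therefore need to ensure that each repository has only one instance (a singleton) so that ThreadLocal memory does not leak; the only cost is that each aggregate root strongly references one ThreadLocal.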

The above steps are not difficult. The difficulty lies in how to create a snapshot and implement diff.

Snapshot utility class (SnapshotUtils) implementation ideas:
Precondition: entities and aggregate roots must provide a private no-argument constructor so that instances can be created through reflection.

  • 1. Copy field values through reflection. Any field of the aggregate root that is not an entity type is a value object type, and for value objects we only need to copy the reference;
  • 2. For a collection of entities, create a new collection, add a copy of each entity element in the original collection to it, and assign the new collection to the snapshot. Entities are copied by the same rules as the aggregate root, so the copy can be implemented recursively.
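The two rules above can be sketched with plain reflection as follows. This is an illustrative version under stated assumptions, not the author's utility: the marker interface Entity, the method name snapshot, and the Order and OrderItem classes are hypothetical, and every collection is copied into an ArrayList, so the sketch only works for fields declared as List or Collection.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Assumed marker interface implemented by entities and aggregate roots.
interface Entity {}

final class SnapshotUtils {
    private SnapshotUtils() {}

    @SuppressWarnings("unchecked")
    static <T extends Entity> T snapshot(T source) {
        if (source == null) {
            return null;
        }
        try {
            Constructor<?> ctor = source.getClass().getDeclaredConstructor();
            ctor.setAccessible(true); // the no-argument constructor may be private
            Object copy = ctor.newInstance();
            // Walk the class hierarchy so inherited fields are copied too.
            for (Class<?> c = source.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    if (Modifier.isStatic(field.getModifiers())) {
                        continue;
                    }
                    field.setAccessible(true);
                    field.set(copy, copyValue(field.get(source)));
                }
            }
            return (T) copy;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot snapshot " + source.getClass(), e);
        }
    }

    private static Object copyValue(Object value) {
        if (value instanceof Entity) {
            return snapshot((Entity) value); // entities are copied recursively
        }
        if (value instanceof Collection) {
            // Simplification: every collection is snapshotted as an ArrayList.
            List<Object> copies = new ArrayList<>();
            for (Object element : (Collection<?>) value) {
                copies.add(copyValue(element));
            }
            return copies;
        }
        return value; // value object: sharing the reference is enough
    }
}

// Hypothetical entity and aggregate root, each with a private no-argument constructor.
final class OrderItem implements Entity {
    Long id;
    int quantity;

    private OrderItem() {}

    OrderItem(Long id, int quantity) { this.id = id; this.quantity = quantity; }
}

final class Order implements Entity {
    Long id;
    String address;
    List<OrderItem> items = new ArrayList<>();

    private Order() {}

    Order(Long id, String address) { this.id = id; this.address = address; }
}
```

Sharing references is safe only because value objects are expected to be immutable; a mutable "value object" would silently change inside the snapshot as well.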

Diff utility class implementation ideas:

First define the diff result types: unmodified (Non), new (Insert), updated (Update), and deleted (Delete).

  • 1. If the aggregate root has no snapshot, its diff result is Insert, and all entities under the aggregate root are also Insert;
  • 2. If the aggregate root has a snapshot, compare all of its fields except entity fields and entity collection fields: if any value object differs, the aggregate root's diff result is Update; otherwise it is Non;
  • 3. As long as the aggregate root is not newly added, whether or not the aggregate root itself is updated does not affect the diff of the entities under it;
  • 4. If the entity is one-to-one with the aggregate root, that is, the field is not a collection: if the entity snapshot does not exist, the diff result is Insert; if the entity snapshot exists but the current entity is null, it is Delete; otherwise compare each value object of the entity, and the result is Non if nothing was modified and Update if something was;
  • 5. If the entities are many-to-one with the aggregate root, that is, the field is a collection of entities (for example, an order with multiple order items), compare them one by one: an item that cannot be found in the snapshot is Insert; an item in the snapshot that no longer exists in the current collection is Delete; otherwise compare the item with its snapshot, and the result is Non if nothing was modified and Update if something was.
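The rules above can be sketched as follows. This is a simplified illustration rather than the author's diff class: the names DiffType, DiffUtils, diffEntity, and diffEntities are assumptions, entities are assumed to carry a non-null ID that is used to match them with their snapshots, and all collection-typed fields are skipped during value comparison.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

enum DiffType { NON, INSERT, UPDATE, DELETE }

// Assumed entity contract: entities are matched with their snapshots by ID.
interface Entity {
    Object getId();
}

final class DiffUtils {
    private DiffUtils() {}

    // Rules 1, 2 and 4: diff one aggregate root or one-to-one entity against its snapshot.
    static DiffType diffEntity(Entity current, Entity snapshot) {
        if (snapshot == null) {
            return current == null ? DiffType.NON : DiffType.INSERT;
        }
        if (current == null) {
            return DiffType.DELETE;
        }
        return sameValues(current, snapshot) ? DiffType.NON : DiffType.UPDATE;
    }

    // Rule 5: diff a collection of entities; the result is keyed by entity ID.
    static Map<Object, DiffType> diffEntities(List<? extends Entity> current,
                                              List<? extends Entity> snapshot) {
        Map<Object, Entity> remaining = new LinkedHashMap<>();
        for (Entity old : snapshot) {
            remaining.put(old.getId(), old);
        }
        Map<Object, DiffType> result = new LinkedHashMap<>();
        for (Entity entity : current) {
            result.put(entity.getId(), diffEntity(entity, remaining.remove(entity.getId())));
        }
        // Whatever is left in the snapshot no longer exists in the current collection.
        for (Object removedId : remaining.keySet()) {
            result.put(removedId, DiffType.DELETE);
        }
        return result;
    }

    // Compares value fields only; nested entities and collections are diffed separately.
    private static boolean sameValues(Entity a, Entity b) {
        try {
            for (Class<?> c = a.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    if (Modifier.isStatic(field.getModifiers())
                            || Entity.class.isAssignableFrom(field.getType())
                            || Collection.class.isAssignableFrom(field.getType())) {
                        continue;
                    }
                    field.setAccessible(true);
                    if (!Objects.equals(field.get(a), field.get(b))) {
                        return false;
                    }
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical order item entity for trying the diff out.
final class OrderItem implements Entity {
    private final Long id;
    private final int quantity;

    OrderItem(Long id, int quantity) { this.id = id; this.quantity = quantity; }

    @Override
    public Object getId() { return id; }
}
```

The comparison relies on Objects.equals, so value objects need a meaningful equals implementation for an unchanged value to be reported as Non.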

Define the class for storing diff results:

Since the BaseAggregate aggregate root implements the Entity interface (an aggregate root is also an entity), EntityDiff uses Entity to refer to both aggregate roots and entities. This makes it convenient to take the entity directly from the diff for insertion and update, or to take the entitySnapshot for deletion. (For an entity collection, the index of the entity within the collection can also be stored.)

If an entity field under the aggregate root is a collection type, the diff results are also stored in a collection.
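A minimal sketch of such result holders might look like this. Only the names EntityDiff, Entity, and entitySnapshot come from the text; DiffType, EntityCollectionDiff, the getters, and the Item class are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

enum DiffType { NON, INSERT, UPDATE, DELETE }

// Assumed marker interface shared by aggregate roots and entities.
interface Entity {}

// Diff result for one aggregate root or one entity.
final class EntityDiff {
    private final DiffType type;
    private final Entity entity;         // current state; null for DELETE
    private final Entity entitySnapshot; // state when loaded; null for INSERT

    EntityDiff(DiffType type, Entity entity, Entity entitySnapshot) {
        this.type = type;
        this.entity = entity;
        this.entitySnapshot = entitySnapshot;
    }

    DiffType getType() { return type; }
    Entity getEntity() { return entity; }
    Entity getEntitySnapshot() { return entitySnapshot; }
}

// Diff results for an entity collection field: one EntityDiff per element.
final class EntityCollectionDiff {
    private final List<EntityDiff> diffs = new ArrayList<>();

    void add(EntityDiff diff) { diffs.add(diff); }

    // Returns only the results of the given type, e.g. everything to delete.
    List<EntityDiff> ofType(DiffType type) {
        List<EntityDiff> matches = new ArrayList<>();
        for (EntityDiff diff : diffs) {
            if (diff.getType() == type) {
                matches.add(diff);
            }
        }
        return matches;
    }
}

// Hypothetical entity for the usage example.
final class Item implements Entity {
    final long id;

    Item(long id) { this.id = id; }
}
```

A repository can then ask the collection diff for the Insert, Update, and Delete results in turn and issue one kind of statement for each group.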

Implementation of the diff utility class:

Because it is inconvenient to post the project code, I wrote a simple test case here to share the results.

Order aggregate root:

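Tip: Lombok has a pitfall here. If you use the @Builder annotation, you need to provide a no-argument constructor (preferably private) and add the @Tolerate annotation to that constructor.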

Order item entity:

Order repository implementation:

  • When the diff result type of the aggregate root is Insert, the aggregate root and all entities under it are stored in full;
  • When the diff result type of the aggregate root is Non, the aggregate root itself does not need to be updated, but whether the entities under it need to be updated is still determined by the diff results of those entities;
  • When the diff result type of the aggregate root is Update, the aggregate root needs to be updated;
  • Get the diff result of each entity and decide, based on that result, whether to insert, update, delete, or do nothing.
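The decision logic in the list above can be sketched as follows. To stay self-contained, the sketch only returns short descriptions of the statements a save would run instead of executing SQL; OrderDiff, OrderSqlPlanner, and the statement strings are hypothetical, and a real repository would call its mapper or DAO at these points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

enum DiffType { NON, INSERT, UPDATE, DELETE }

final class OrderItem {
    final long id;
    final int quantity;

    OrderItem(long id, int quantity) { this.id = id; this.quantity = quantity; }
}

final class Order {
    final long id;
    final String address;
    final List<OrderItem> items;

    Order(long id, String address, List<OrderItem> items) {
        this.id = id;
        this.address = address;
        this.items = items;
    }
}

// Hypothetical diff result: one type for the aggregate root, one per order item ID.
final class OrderDiff {
    final DiffType root;
    final Map<Long, DiffType> items;

    OrderDiff(DiffType root, Map<Long, DiffType> items) {
        this.root = root;
        this.items = items;
    }
}

final class OrderSqlPlanner {
    private OrderSqlPlanner() {}

    // Returns descriptions of the statements that saving the order would execute.
    static List<String> plan(Order order, OrderDiff diff) {
        List<String> statements = new ArrayList<>();
        if (diff.root == DiffType.INSERT) {
            // New aggregate root: store the root and every entity in full.
            statements.add("INSERT order " + order.id);
            for (OrderItem item : order.items) {
                statements.add("INSERT order_item " + item.id);
            }
            return statements;
        }
        if (diff.root == DiffType.UPDATE) {
            statements.add("UPDATE order " + order.id);
        }
        // Entities are handled by their own diff results, whatever the root's result was.
        for (Map.Entry<Long, DiffType> entry : diff.items.entrySet()) {
            switch (entry.getValue()) {
                case INSERT:
                    statements.add("INSERT order_item " + entry.getKey());
                    break;
                case UPDATE:
                    statements.add("UPDATE order_item " + entry.getKey());
                    break;
                case DELETE:
                    statements.add("DELETE order_item " + entry.getKey());
                    break;
                default:
                    break; // NON: nothing to do
            }
        }
        return statements;
    }
}
```

For an order whose root is unchanged and whose three items are unchanged, updated, and deleted respectively, the plan contains only two statements instead of a full rewrite of every table.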

Unit test:

The unit test results are as follows:

Summary

This article introduced how to optimize repository performance by means of snapshot + diff. The approach works because every business use case first obtains the aggregate root through the repository and finally persists it through the repository. For performance reasons, we chose to trade space for time: ThreadLocal and reflection are used to create and cache aggregate root snapshots, and reflection is used again to complete the diff logic. Of course, there is still room for optimization in the diff class.

The snapshot introduced in this article is based on the aggregate root (DO). We could also implement it based on the persistent object (PO), which would be simpler.

  • Note: The code in the screenshots may contain bugs, because the screenshots were not retaken after the code was optimized. Treat it as a reference only.

References:

Thanks to Yin Hao, a technologist at Taobao, for the ideas!
