Microservice notes: Two books that millions of programmers have read!

There are many books about microservices, and it is not easy to figure out which one suits you. Here I recommend two that I have read and share the reading notes I organized along the way.

"Microservice Design"


Author: Sam Newman (USA)

This book is only about 200 pages, but small as the sparrow is, it has all the vital organs: it covers every aspect of microservice design, including the advantages of microservices, how to split them, host management for large-scale service-orientation, service deployment, service testing, service security, service monitoring, service governance, and Conway's law.

  • Service splitting: Splitting out microservices requires familiarity with the business domain; the common theoretical support is DDD. For unfamiliar domains, the author does not recommend rushing into services: keep a monolithic system and split gradually as you become familiar with the domain. For scenarios where domain boundaries are very clear, fine-grained splitting can be done for large-scale service-orientation. Maintaining service autonomy and independent deployment is the basic principle.
  • Service management: After large-scale service-orientation, the question of how to host the services naturally arises: physical machines, virtual machines, or containerized deployment? Containerization is clearly the future trend, because containers use resources more lightly, start faster, and suit elastic scaling of services.
  • Service monitoring: As the number of services grows, complexity grows rapidly; an enterprise may run thousands of services, and the probability of errors rises with them. Quickly locating the failing boundary when a problem occurs requires tracing the service invocation chain and monitoring the services: call metrics should be collected and, ideally, displayed in visual charts.
  • Service management and control: The service call chain needs fault-tolerance measures for when traffic exceeds its carrying capacity, such as rate limiting at the upstream service, circuit breaking on calls to downstream services, and degrading directly at the upstream when the downstream has problems (see the sketch after this list).
  • Service testing: For software development, the importance of testing is self-evident. Testing comes in three levels of granularity, from low to high: unit tests, service tests, and end-to-end tests. Unit tests can be wired into continuous integration (CI) so that every code commit runs them automatically, giving fast feedback on problems.
    Of course, there is other content as well, such as whether calls between services should be synchronous or asynchronous, and whether to use an RPC or REST protocol. Conway's law covers the relationship between the organization and the software architecture, and there is also material on how to handle security under a microservice architecture. I still recommend reading it to get an overall grasp of the points that need to be considered when designing microservices.
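
To make the control measures in the "service management and control" item concrete, here is a minimal sketch of a counting circuit breaker that wraps a downstream call and degrades to a fallback. The class name, thresholds, and structure are my own illustrative assumptions, not code from the book.

```java
import java.util.function.Supplier;

/** A minimal, illustrative circuit breaker: after too many consecutive
 *  failures it "opens" and degrades calls to a fallback for a cool-down period. */
public class SimpleCircuitBreaker {
    private final int failureThreshold;   // consecutive failures before opening
    private final long openMillis;        // how long to stay open (degraded)
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    public SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    public synchronized <T> T call(Supplier<T> downstream, Supplier<T> fallback) {
        long now = System.currentTimeMillis();
        // Open state: skip the downstream call and degrade immediately.
        if (consecutiveFailures >= failureThreshold && now - openedAt < openMillis) {
            return fallback.get();
        }
        try {
            T result = downstream.get();
            consecutiveFailures = 0;      // a success closes the breaker again
            return result;
        } catch (RuntimeException e) {
            if (++consecutiveFailures >= failureThreshold) {
                openedAt = now;           // trip the breaker
            }
            return fallback.get();        // degrade this particular call
        }
    }
}
```

Rate limiting at the upstream would be a similar wrapper that rejects or queues calls once a request counter for the current time window exceeds a configured threshold.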

"Microservice Governance: System, Architecture and Practice"


Author: Li Xin
The strongest impression this book left on me is one word: measurement. It has a bit of the flavor of measurement-driven development.

Measurement-driven development : covers the core indicators for measuring microservices and how to measure them. For example, service call counts are summarized, and the summary can be aggregated by minute or by hour.
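
As a rough illustration of this kind of measurement, the sketch below rolls call counts up into per-minute buckets, from which hourly summaries can be derived. The `CallCounter` class and its methods are hypothetical, not taken from the book.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Illustrative per-minute call counter for one service endpoint. */
public class CallCounter {
    // Key: epoch minute; value: number of calls observed in that minute.
    private final Map<Long, LongAdder> perMinute = new ConcurrentHashMap<>();

    public void record(Instant callTime) {
        long minute = callTime.getEpochSecond() / 60;
        perMinute.computeIfAbsent(minute, m -> new LongAdder()).increment();
    }

    /** Hourly totals are derived by summing the 60 minute-buckets of that hour. */
    public long countForHour(long epochHour) {
        long sum = 0;
        for (long m = epochHour * 60; m < (epochHour + 1) * 60; m++) {
            LongAdder bucket = perMinute.get(m);
            if (bucket != null) sum += bucket.sum();
        }
        return sum;
    }
}
```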

Robustness of the microservice architecture : The book then turns to the robustness of the microservice architecture and lists the industry's design principles for it: redundancy, elastic scalability, single-point statelessness, immutable infrastructure, blocking fault propagation (strategies such as circuit breaking, rate limiting, degradation, timeouts, and idempotence), infrastructure as code, and so on. Among them, "single-point statelessness" is a prerequisite for elastic scaling, and "infrastructure as code" is in turn a prerequisite for rapid deployment and elastic scaling. **"Infrastructure as code"** provides a virtual layer that shields users from the differences between underlying resources and services such as servers, networks, configuration, DNS, CDN, firewalls, logs, and monitoring, and abstracts them into objects in a virtual world that can be programmed, integrated, and scheduled in code. **Users can order resources and script large-scale deployments through this virtual layer.** In this way, the infrastructure can respond quickly to rapidly iterating product demands, and more efficient continuous delivery and DevOps become possible for the R&D team.
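
To show the "programmable virtual layer" idea in plain code, the sketch below orders a small cluster through code. The `Infrastructure`, `Server`, and `LoadBalancer` types are purely hypothetical stand-ins, not any real provider's or tool's API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical "virtual layer" over raw resources: servers, DNS entries and
 *  load balancers become ordinary objects that can be created and wired in code. */
interface Infrastructure {
    Server provisionServer(String image, int cpu, int memoryGb);
    LoadBalancer createLoadBalancer(String dnsName);
}

interface Server { String privateIp(); }
interface LoadBalancer { void addBackend(Server s); }

class Deployment {
    /** Ordering a small cluster "by programming": loop, provision, attach to the LB. */
    static List<Server> deployCluster(Infrastructure infra, int replicas) {
        LoadBalancer lb = infra.createLoadBalancer("orders.example.internal");
        List<Server> servers = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            Server s = infra.provisionServer("orders-service:1.4.2", 2, 4);
            lb.addBackend(s);
            servers.add(s);
        }
        return servers;   // the same code can later scale the cluster up or down
    }
}
```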

Cluster fault tolerance : There are many cluster fault-tolerance mechanisms, such as the Failsafe "the call always returns" mechanism, which returns a newly constructed result when a call throws an exception; the Failover mechanism, which fails the call over to another node; and the Failback mechanism, which re-initiates the failed remote call at fixed time intervals. Note the difference between cluster fault tolerance and fault-tolerant degradation: cluster fault tolerance is there to ensure the reliability of remote calls, while fault-tolerant degradation is there to ensure business availability.
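
Here is a minimal sketch of the Failover and Failback ideas as I read them; the interfaces, scheduling, and error handling are simplified assumptions, not the book's code.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

/** Illustrative cluster fault tolerance: Failover tries another node on error,
 *  Failback re-issues the failed call at a fixed interval until it succeeds. */
public class ClusterInvoker {
    private final ScheduledExecutorService retryPool =
            Executors.newSingleThreadScheduledExecutor();

    /** Failover: if the call to one provider node fails, try the next node. */
    public <T> T failover(List<String> nodes, Function<String, T> call) {
        RuntimeException last = null;
        for (String node : nodes) {
            try {
                return call.apply(node);
            } catch (RuntimeException e) {
                last = e;               // this node failed, move on to the next one
            }
        }
        throw new IllegalStateException("all provider nodes failed", last);
    }

    /** Failback: if the call fails, re-initiate it at a fixed interval. */
    public void failback(String node, Consumer<String> call, long intervalSeconds) {
        try {
            call.accept(node);
        } catch (RuntimeException e) {
            retryPool.schedule(() -> failback(node, call, intervalSeconds),
                               intervalSeconds, TimeUnit.SECONDS);
        }
    }
}
```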

Service online lifecycle management : concepts such as graceful service shutdown, blue-green release, gray release, and canary testing.
Blue-green release means adjusting routing and load-balancing strategies to switch traffic to the new version (the green cluster) all at once, while the old version (the blue cluster) stays online. The two clusters coexist, but the old cluster receives no traffic. If the new version misbehaves, traffic is quickly switched back to the old version (the blue cluster) by adjusting routing and load balancing again.
Gray release smoothly releases traffic in batches, gradually expanding the scope: selected online traffic is routed to the new version, feedback is collected in real time to verify the release, and a decision is then made to continue or roll back.
☆ So what is a canary test ? In the book's own words: "In a gray release, the first batch (or first N batches) of released service nodes and the user traffic switched over to those nodes have special significance. They often play the role of the 'first mover', and most anomalies can be found in this first batch of releases. Since the scope of the first batch (first N batches) is very small (generally no more than 1%), the impact is limited, so the first batch (first N batches), released separately, is called the 'canary test'."
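
A toy sketch of the routing side of gray release and canary testing: a configurable fraction of traffic goes to the new version, and expanding the scope batch by batch is just raising that fraction. The class and cluster names are illustrative assumptions.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Toy weighted router: sends a configurable fraction of requests to the new
 *  version (the canary / gray batch) and the rest to the stable old version. */
public class GrayRouter {
    private volatile double newVersionRatio;   // e.g. 0.01 = 1% canary traffic

    public GrayRouter(double initialRatio) { this.newVersionRatio = initialRatio; }

    /** Gradually expanding the scope is just raising this ratio batch by batch. */
    public void setNewVersionRatio(double ratio) { this.newVersionRatio = ratio; }

    public String pickCluster() {
        return ThreadLocalRandom.current().nextDouble() < newVersionRatio
                ? "green-cluster-v2"    // new version
                : "blue-cluster-v1";    // old version keeps serving the rest
    }
}
```

In this picture, blue-green release is the degenerate case where the ratio jumps straight from 0 to 1 while the old (blue) cluster stays online, ready for the ratio to be set back to 0 if the new version misbehaves.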

Service online stability guarantee
☆ Sort out online failure scenarios, rank them by probability of occurrence, and prepare contingency plans for them.
☆ Prioritize the plans by effectiveness; for example, for a high-traffic scenario, should you scale out or rate-limit, and what are the advantages and disadvantages of each?
☆ Rehearse the failures in advance, so that you are not scrambling when something really goes wrong!
☆ Hold fault reviews; it is recommended that everyone involved participate, while avoiding publicly blaming any person or team.
In addition, chaos engineering is mentioned; its essence is to deliberately create faults and expose weaknesses, and then improve on them.

Architecture governance : mainly involves the evolution of service layering. For example, the general business layer is split out to form the **business service layer**; the personalized functions that are unrelated to the business, such as protocol adaptation, packet unpacking, and security verification, are stripped out into the gateway as a **security gateway layer**; and the functions that assemble and aggregate the business service layer and the common service layer form the **aggregation service layer**.



The business domains each service layer covers are different, and so is its division of labor.
⭐️The general service layer spans multiple business domains and provides common services for the whole enterprise's business. For example, in the fund industry, the general service layer provides common services to different business domains such as direct sales, agency sales, and high-end wealth management.
⭐️The scope of the business service layer is confined to a single business domain. For example, the direct-sales business domain forms its own business service layer based on its business characteristics. There is no reuse relationship between the business service layers of different business domains; if a service in one business domain can be reused by other business domains, it should continue to sink down to the general service layer.
⭐️The **aggregation service layer** is more channel related and carries very little business logic; its main function is to aggregate and assemble the services of the business service layer and the general service layer.
Therefore, the three service layers (aggregation service layer, business service layer, and general service layer) can, put another way, be called the business front-end service layer, the business middle-office service layer, and the general back-end service layer.
Under a layered service architecture, the services at each layer are not static: common services keep sinking. The lower the layer, the higher its level of abstraction and the more general and stable it is, so it rarely changes; the closer a service is to the front end, the closer it is to the business, the less stable it is, changing continually as the business changes rapidly, and it must objectively stay lightweight.
Front-end services can therefore be divided in finer detail, so that they do not need large amounts of code to handle business logic: a small amount of glue code is enough to assemble and aggregate the middle-office services of their own business domain with the cross-domain general services of the back end, which fundamentally reduces the workload and cost of front-end service development (a sketch of this glue code follows below).
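
A sketch of the "small amount of glue code" idea: an aggregation-layer (front-end) service that only composes a middle-office business service with a back-end general service. All the interfaces and names here are hypothetical, chosen to echo the fund-industry example above.

```java
/** Hypothetical middle-office business service of the direct-sales domain. */
interface DirectSalesOrderService {
    String placeOrder(String userId, String fundCode, double amount);
}

/** Hypothetical cross-domain general (back-end) service. */
interface UserAccountService {
    boolean isAccountActive(String userId);
}

/** Aggregation-layer service: little business logic, mostly assembly of lower layers. */
class FundPurchaseFacade {
    private final UserAccountService accounts;
    private final DirectSalesOrderService orders;

    FundPurchaseFacade(UserAccountService accounts, DirectSalesOrderService orders) {
        this.accounts = accounts;
        this.orders = orders;
    }

    /** Glue code only: check the account via the general service, then delegate the
     *  actual purchase to the business service of the direct-sales domain. */
    String purchase(String userId, String fundCode, double amount) {
        if (!accounts.isAccountActive(userId)) {
            throw new IllegalStateException("account not active: " + userId);
        }
        return orders.placeOrder(userId, fundCode, amount);
    }
}
```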

R&D governance : such as design review, code review, automated testing, unit testing, and commissioning issues under microservices.

Operation and maintenance governance : isolation of online and offline environments, building a continuous-integration environment, and so on.
⭐️Continuous integration is the process of automatically detecting, pulling, building, and unit-testing source code after it changes. The goal of continuous integration is to quickly make sure that the new changes a developer submits are correct, that the new code integrates accurately with the existing code, and that it is fit for further use in the code base.

⭐️Continuous delivery : the components produced by continuous integration have mainly gone through unit tests and some integration tests. Those tests only prove that the merged branch code has no problems; they cannot guarantee that the component's overall quality meets the product requirements, so it must be deployed to test environments for further verification. The continuous-delivery process takes the components produced by continuous integration plus any other necessary components, performs an integrated build or assembly, and, combined with configuration management, produces the final deployable components; these are deployed to the various test environments to verify their functionality and performance. Integration tests are run against the deployable components together with the upstream services, databases, caches, message queues, and other resources they depend on, to verify that the deployable components work correctly and meet the performance requirements. Deployable components that pass this rigorous testing can be published to the component repository as needed, ready to be deployed to the production environment at any time.

⭐️Continuous deployment : continuous deployment can be carried out once deployable components have been obtained through the continuous-delivery pipeline. The goal of continuous deployment is to release the deployable components to the production environment, but this does not happen in a single step: if there is a pre-release (staging) environment, the components are first deployed there for verification, and production deployment is performed only after that verification passes.
Continuous delivery mainly ensures that there are components available for deployment, but they do not necessarily have to be deployed; it embodies a capability. Continuous deployment is an action, the means by which the value actually lands.
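
To summarize the division of labor between the three practices, here is a schematic pipeline written as code; every stage and helper name is invented for illustration and does not correspond to any particular CI/CD tool.

```java
/** Schematic pipeline: CI produces a tested build, continuous delivery produces a
 *  releasable deployable component, continuous deployment actually pushes it out. */
class Pipeline {
    // Continuous integration: build the change and run unit tests on every commit.
    String continuousIntegration(String commit) {
        String artifact = build(commit);
        runUnitTests(artifact);
        return artifact;                    // a build that merges cleanly
    }

    // Continuous delivery: assemble a deployable component, verify it in test
    // environments, and publish it to the component repository. It CAN be deployed.
    String continuousDelivery(String artifact) {
        String deployable = assembleWithConfig(artifact);
        runIntegrationTests(deployable);    // against dependent services, DB, MQ...
        publishToRepository(deployable);
        return deployable;
    }

    // Continuous deployment: the act of releasing, optionally via a staging check.
    void continuousDeployment(String deployable) {
        deployTo("staging", deployable);
        verify("staging");
        deployTo("production", deployable);
    }

    // The helpers below are placeholders for real build/test/deploy tooling.
    String build(String commit) { return commit + "-build"; }
    void runUnitTests(String artifact) { /* ... */ }
    String assembleWithConfig(String artifact) { return artifact + "-deployable"; }
    void runIntegrationTests(String deployable) { /* ... */ }
    void publishToRepository(String deployable) { /* ... */ }
    void deployTo(String env, String deployable) { /* ... */ }
    void verify(String env) { /* ... */ }
}
```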