[Cloud Computing Study Notes (19)] Detailed Explanation of Nova Service Architecture and Execution Process


This article is published by the official account [Developing Pigeon]. Welcome to follow!



One. Nova service

(1) Nova architecture

1. Architecture diagram

Nova's architecture is very complex and contains many components, which run as background daemons. The architecture diagram is as follows:

[Figure: Nova architecture diagram]

2. Component introduction

(1) API
1) nova-api

nova-api accepts and responds to client API calls. In addition to OpenStack's own API, it also supports the Amazon EC2 API, which means that nova-api is compatible with the EC2 API.

(2) Compute Core
1) nova-scheduler

Virtual machine scheduling service, this component is responsible for deciding which computing node should run the virtual machine.

2) nova-compute

This component is the core service for managing virtual machines, and realizes the life cycle management of virtual machines by calling the Hypervisor API.

3) Hypervisor

The hypervisor running on the computing node is the lowest-level program for virtual machine management. Different virtualization technologies provide their own hypervisors; commonly used ones include KVM, Xen, and VMware.

4) nova-conductor

Compute nodes often need to update the database, such as updating the status of virtual machines. For security and scalability considerations, nova-compute does not directly access the database, but instead delegates this task to nova-conductor.

(3) Console Interface
1) nova-console

Users can access the console of the virtual machine in a variety of ways:

nova-novncproxy: VNC access based on Web browser;

nova-spicehtml5proxy: SPICE access based on HTML5 browser;

nova-xvpvncproxy: VNC access based on a Java client;

2) nova-consoleauth

This component is responsible for token authentication for requests to access the virtual machine console.

3) nova-cert

Provide x509 certificate support.

(4) Database

Nova stores some of its data in a database, generally MySQL. The database is usually installed on the control node, and the database used by the Nova service is named nova.

(5) Message Queue

Nova contains many sub-services that need to coordinate and communicate with each other. To decouple them, Nova uses a Message Queue as the relay for messages between sub-services. The default is RabbitMQ.

3. Physical deployment plan of nova components

Nova's components will be deployed on two types of nodes: computing nodes and control nodes.

Computing node: Hypervisor (running virtual machine) and nova-compute;

Control nodes: nova-scheduler, nova-conductor, nova-consoleauth, nova-cert, nova-api, nova-novncproxy, nova-compute, rabbitmq-server (message queue), mysqld (MySQL server)

4. Collaborative work of nova sub-services

(1) The client sends a request to nova-api to create a virtual machine;

(2) After nova-api processes the request, it sends the message "Let the scheduler create a virtual machine" to the message queue (mq);

(3) nova-scheduler obtains the message addressed to it from mq, then executes the scheduling algorithm and selects node A from the available computing nodes;

(4) nova-scheduler sends the message "create virtual machine on node A" to mq;

(5) nova-compute on computing node A obtains the message sent to it by nova-scheduler from mq, and then starts the virtual machine on the hypervisor of this node;

(6) While the virtual machine is being created, if nova-compute on computing node A needs to query or update the database, it sends a message to nova-conductor through mq, and nova-conductor performs the database access.
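The six steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the `MessageQueue` class, the topic names, and the "most free memory" selection rule are hypothetical stand-ins for RabbitMQ and the real scheduler, not Nova's actual code.

```python
from collections import deque


class MessageQueue:
    """Hypothetical in-memory stand-in for the message queue."""

    def __init__(self):
        self.topics = {}

    def send(self, topic, message):
        self.topics.setdefault(topic, deque()).append(message)

    def receive(self, topic):
        return self.topics[topic].popleft()


def launch_flow(hosts_free_ram_mb, requested_ram_mb):
    """Walk through steps (1) to (6) and return what happened."""
    mq = MessageQueue()
    database = {}
    trace = []

    # (1)-(2) nova-api accepts the request and puts a message on the queue.
    mq.send("scheduler", {"action": "create", "ram_mb": requested_ram_mb})
    trace.append("api -> mq: let the scheduler create a virtual machine")

    # (3)-(4) nova-scheduler picks a node and addresses a message to it.
    request = mq.receive("scheduler")
    candidates = {h: r for h, r in hosts_free_ram_mb.items() if r >= request["ram_mb"]}
    chosen = max(candidates, key=candidates.get)
    mq.send("compute." + chosen, request)
    trace.append("scheduler -> mq: create virtual machine on " + chosen)

    # (5) nova-compute on the chosen node picks up the message and starts the VM.
    mq.receive("compute." + chosen)
    trace.append("compute@" + chosen + ": start virtual machine on hypervisor")

    # (6) nova-compute never touches the database; it asks nova-conductor via mq.
    mq.send("conductor", {"vm_state": "active", "host": chosen})
    database.update(mq.receive("conductor"))
    trace.append("conductor: database updated")

    return chosen, database, trace


chosen, database, trace = launch_flow({"node-a": 8192, "node-b": 2048}, 4096)
```

Note that no component calls another directly: every hand-off goes through the queue, which is the decoupling the text describes.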

(2) Nova component introduction

1.nova-api

nova-api is the entry point of the entire Nova component. All requests to Nova are processed by nova-api first. nova-api exposes several HTTP REST API interfaces. In Keystone we can query the endpoints of nova-api, and a client sends its request to the address given by the endpoint to ask nova-api for an operation. nova-api can respond to any operation related to the life cycle of a virtual machine.

2.nova-scheduler

(1) flavor

When creating an Instance, the user will propose resource requirements, such as how much CPU, memory, and disk are required. OpenStack will define these requirements in the flavor, and the user only needs to specify which flavor to use. Flavor defines four types of details: VCPU, RAM, DISK, and Metadata. Nova-scheduler will select the appropriate computing node according to the flavor.

(2) Parameter configuration of nova-scheduler

In /etc/nova/nova.conf, Nova configures nova-scheduler through three parameters: scheduler_driver, scheduler_available_filters, and scheduler_default_filters.
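As a rough illustration, the three scheduler options can be read from a nova.conf-style file with Python's standard `configparser`. The sample values below are assumptions for demonstration only; option names, sections, and defaults vary between OpenStack releases, so check the file on your own deployment.

```python
import configparser

# Illustrative content only; the values are assumptions, not a recommended config.
SAMPLE_NOVA_CONF = """
[DEFAULT]
scheduler_driver = filter_scheduler
scheduler_available_filters = nova.scheduler.filters.all_filters
scheduler_default_filters = RetryFilter, AvailabilityZoneFilter, RamFilter, DiskFilter, ComputeFilter
"""


def read_scheduler_options(conf_text):
    """Return the scheduler driver, the available-filter setting, and the enabled filters."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    defaults = parser["DEFAULT"]
    enabled = [name.strip() for name in defaults["scheduler_default_filters"].split(",")]
    return {
        "driver": defaults["scheduler_driver"],
        "available": defaults["scheduler_available_filters"],
        "enabled": enabled,
    }


options = read_scheduler_options(SAMPLE_NOVA_CONF)
```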

(3) Filter Scheduler

Filter Scheduler is the default scheduler of nova-scheduler. The scheduling process is divided into two steps:

Select the computing nodes (nova-compute) that meet the conditions by applying filters;

Create the Instance on the computing node with the largest weight, as determined by a weighting calculation;

Nova also allows third-party schedulers; just configure scheduler_driver.

(4) Filter

When the Filter scheduler needs to perform a scheduling operation, it will let the filter judge the computing node and return True or False.

The scheduler_available_filters option in nova.conf configures the filters available to the scheduler. By default, all the filters that come with Nova can be used for filtering.

Another option, scheduler_default_filters, specifies the filters actually used by the scheduler. The filters include the following:

1) RetryFilter

Its function is to filter out nodes that have already been tried. If the operation fails on a computing node that was selected earlier, RetryFilter removes that failed node from the candidates so that the retry is not sent to it again.

2) AvailabilityZoneFilter

To improve disaster tolerance and provide isolation, computing nodes can be divided into different Availability Zones. By default OpenStack has one Availability Zone named nova, and all computing nodes are initially placed in it. When creating an Instance, you can specify the Availability Zone to deploy it to.
AvailabilityZoneFilter filters out the computing nodes that do not belong to the specified Availability Zone.

3) RamFilter

RamFilter filters out computing nodes that cannot meet the memory requirement of the flavor. Note that, to improve resource utilization, the available memory of a computing node is allowed to be overcommitted, that is, allocations can exceed the actual memory size. This is controlled by ram_allocation_ratio in nova.conf, and the default value is 1.5.

4) DiskFilter

DiskFilter filters out computing nodes that cannot meet the disk requirement of the flavor. Disk also allows overcommit, which is controlled by disk_allocation_ratio in nova.conf, and the default value is 1.

5) CoreFilter

CoreFilter filters out computing nodes that cannot meet the VCPU requirement of the flavor. VCPU also allows overcommit, which is controlled by cpu_allocation_ratio in nova.conf, and the default value is 16.

6) ComputeFilter

ComputeFilter guarantees that only the computing nodes whose nova-compute service works normally can be scheduled by nova-scheduler.

7) ComputeCapabilitiesFilter

This filter selects nodes according to the characteristics of the computing node, for example whether its architecture is x86 or ARM.

8) ImagePropertiesFilter

This filter selects matching computing nodes according to the attributes of the chosen Image. For example, an Image can be restricted to run on a certain kind of hypervisor, and the filtering is based on the Image's Metadata. If the Image's Metadata is not set, the filter has no effect.
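The pass/fail behavior of the resource filters, including overcommit, can be sketched as follows. This is a simplified model, not Nova's implementation: the `Host` and `Flavor` classes and the function names are hypothetical, and only the three default ratios come from the text above.

```python
from dataclasses import dataclass

# Default overcommit ratios quoted in the text above.
RAM_ALLOCATION_RATIO = 1.5
DISK_ALLOCATION_RATIO = 1.0
CPU_ALLOCATION_RATIO = 16.0


@dataclass
class Flavor:
    vcpus: int
    ram_mb: int
    disk_gb: int


@dataclass
class Host:  # hypothetical view of one compute node
    name: str
    ram_mb: int
    used_ram_mb: int
    disk_gb: int
    used_disk_gb: int
    cpus: int
    used_vcpus: int


def ram_filter(host, flavor):
    # Allocations may exceed physical memory up to the overcommit limit.
    return host.used_ram_mb + flavor.ram_mb <= host.ram_mb * RAM_ALLOCATION_RATIO


def disk_filter(host, flavor):
    return host.used_disk_gb + flavor.disk_gb <= host.disk_gb * DISK_ALLOCATION_RATIO


def core_filter(host, flavor):
    return host.used_vcpus + flavor.vcpus <= host.cpus * CPU_ALLOCATION_RATIO


def filter_hosts(hosts, flavor, filters):
    """Keep only the hosts for which every filter returns True."""
    return [h for h in hosts if all(f(h, flavor) for f in filters)]


flavor = Flavor(vcpus=2, ram_mb=4096, disk_gb=20)
hosts = [
    Host("node-a", 8192, 10240, 100, 50, 4, 10),  # RAM limit 12288 MB exceeded
    Host("node-b", 8192, 6144, 100, 50, 4, 10),   # fits thanks to RAM overcommit
    Host("node-c", 8192, 0, 100, 90, 4, 0),       # not enough disk
]
passing = [h.name for h in filter_hosts(hosts, flavor, [ram_filter, disk_filter, core_filter])]
```

Here node-b passes even though 6144 + 4096 MB exceeds its 8192 MB of physical memory, because the limit after overcommit is 8192 × 1.5 = 12288 MB.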

(5) Weight

After filtering by a bunch of filters, nova-scheduler selects the computing nodes that can deploy the instance, and then uses weight to score each computing node, and the one with the highest score wins. The default weight calculation strategy of nova-scheduler is to calculate the weight value by calculating the amount of free memory of the node. The more free memory, the greater the weight.
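A minimal sketch of the default weighting idea, assuming free memory is the only input (the function name and the sample numbers are hypothetical):

```python
def rank_by_free_ram(free_ram_mb_by_host):
    """Order hosts from highest to lowest weight, where weight is free memory."""
    return sorted(free_ram_mb_by_host, key=free_ram_mb_by_host.get, reverse=True)


# Hosts that survived filtering, with their free memory in MB.
ranked = rank_by_free_ram({"node-a": 2048, "node-b": 6144, "node-c": 4096})
winner = ranked[0]
```

Because the host with the most free memory always wins, this strategy tends to spread instances across nodes rather than pack them onto one.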

(6) Scheduler log

The entire scheduling process is recorded in the nova-scheduler log, in /var/log/nova/scheduler.log. If you want to query the debug log, you need to turn on the debug option in /etc/nova/nova.conf.

3.nova-compute

(1) Hypervisor

nova-compute runs on compute nodes and is responsible for managing the instances on its node. nova-compute and the Hypervisor together implement OpenStack's management of the instance life cycle. nova-compute defines a unified interface for different hypervisors; a hypervisor only needs to implement this interface to be plugged into OpenStack in the form of a driver.
Configure the corresponding compute_driver in the configuration file /etc/nova/nova.conf on the compute node. For example, KVM uses the Libvirt driver.
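The driver idea can be illustrated with a small plug-in pattern. Everything here is hypothetical (the class names, the two methods, and the `DRIVERS` registry); it shows the shape of "one interface, many hypervisors", not Nova's real driver API.

```python
from abc import ABC, abstractmethod


class HypervisorDriver(ABC):
    """Hypothetical unified interface that every hypervisor driver implements."""

    @abstractmethod
    def spawn(self, instance_name):
        """Create and start an instance."""

    @abstractmethod
    def power_off(self, instance_name):
        """Stop a running instance."""


class FakeDriver(HypervisorDriver):
    """Toy driver that only records power states in memory."""

    def __init__(self):
        self.power_state = {}

    def spawn(self, instance_name):
        self.power_state[instance_name] = "running"

    def power_off(self, instance_name):
        self.power_state[instance_name] = "shutdown"


# Hypothetical registry keyed by the value of compute_driver.
DRIVERS = {"fake": FakeDriver}


def load_driver(compute_driver):
    """Instantiate the driver selected by the compute_driver setting."""
    return DRIVERS[compute_driver]()


driver = load_driver("fake")
driver.spawn("vm-1")
state_after_spawn = driver.power_state["vm-1"]
driver.power_off("vm-1")
state_after_power_off = driver.power_state["vm-1"]
```

The calling code depends only on `HypervisorDriver`, so switching hypervisors is a configuration change rather than a code change.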

(2) nova-compute function

The functions of nova-compute are divided into two categories:

1) Regularly report the status of the computing node to OpenStack

To get the resource usage of a computing node, nova-compute needs the resource consumption of every instance on the node. It obtains this information through the Hypervisor driver.

2) Realize the management of the instance life cycle

OpenStack's operations on the instance are all implemented through nova-compute, including the start, stop, restart, pause, resume, terminate, migration, and snapshot of the instance.
After nova-scheduler selects the compute node to deploy the instance, it will issue a command to start the instance to the selected compute node through RabbitMQ, and nova-compute on the compute node will execute the instance creation operation after receiving the message.

(3) Steps to create instance by nova-compute

It is divided into four steps:

1) Prepare resources for instance;

nova-compute first allocates memory, disk space, VCPU and network resources for the instance in turn according to the specified flavor.

2) Create the image file of the instance;

After the resources are prepared, nova-compute creates an image file for the instance. It takes the Image selected in Glance and checks whether that Image already exists on the compute node; if not, it downloads it from Glance to the compute node. It then uses the Image as a backing file to create the instance's image file, using the qemu-img command. Note that the image file is the file backing the instance's boot disk, while the Image is the template saved in Glance from which the instance is created.

3) Create an XML definition file of the instance;

Create an instance XML definition file.

4) Create a virtual network and start the virtual machine;

After creating a virtual network device for the instance, you can start the instance.
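Step 2 can be sketched as "download on cache miss, then create a disk on top of a backing file". The helper names and paths below are hypothetical, and the command is a plain `qemu-img create` with a backing file (`-b`) and backing format (`-F`); it is not necessarily the exact command line Nova issues. The code only builds the command and does not run it.

```python
def ensure_cached(image_id, cache, download):
    """Return the local path of the base Image, downloading only on a cache miss."""
    if image_id not in cache:
        cache[image_id] = download(image_id)
    return cache[image_id]


def backing_file_command(base_image_path, instance_disk_path, base_format="raw"):
    """Build a qemu-img command that creates a qcow2 disk backed by the base Image.

    base_format is an assumption about how the cached Image is stored.
    """
    return [
        "qemu-img", "create",
        "-f", "qcow2",
        "-b", base_image_path,
        "-F", base_format,
        instance_disk_path,
    ]


downloads = []


def fake_download(image_id):
    # Stand-in for fetching the Image from Glance.
    downloads.append(image_id)
    return "/tmp/example-cache/" + image_id


cache = {}
first = ensure_cached("image-1", cache, fake_download)
second = ensure_cached("image-1", cache, fake_download)  # served from the cache
command = backing_file_command(first, "/tmp/example-instance/disk")
```

Because the instance disk only stores differences from the backing file, many instances on one node can share a single cached copy of the Image.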

4.nova-conductor

nova-compute needs to read and update instance information in the database, but it does not access the database directly; database access goes through nova-conductor. This improves database access security and system scalability. For highly concurrent database access, several nova-conductor instances can be configured as a cluster to share the load.

(3) Detailed explanation of Nova operation

a) Routine operation

1.Launch

Create and start a new instance. The operation is coordinated through nova-api, nova-scheduler, RabbitMQ, nova-compute, and nova-conductor.
(1) Send a request to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) Nova-scheduler obtains messages from RabbitMQ and executes scheduling to select nodes;

(4) nova-scheduler sends a message to RabbitMQ;

(5) The node's nova-compute obtains the message from rabbitMQ, and creates an Instance through the node's Hypervisor Driver;

2.Shut Off

Stop an instance.

(1) Send a request to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) The node's nova-compute gets the message from rabbitMQ and stops the Instance;

3.Start

Start an instance that has been shut off.

(1) Send a request to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) After the node's nova-compute obtains the message from RabbitMQ, it begins the start procedure: it prepares the virtual network card, the XML file of the Instance, and the image file of the Instance, and finally starts the Instance;

4.Soft/Hard Reboot

There are two restart methods:
(1) Soft reboot restarts only the operating system; the Instance keeps running during the whole process.

(2) Hard reboot restarts the Instance itself, which is equivalent to powering it off and starting it again.

5.Lock/Unlock

To avoid accidental operations, you can lock the Instance (Lock) and later restore it to normal by unlocking it (Unlock). Lock/Unlock operations are handled entirely in nova-api. After the operation succeeds, nova-api updates the locked state of the Instance. When other operations are requested, nova-api decides whether to allow them according to the locked state.
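The check described above can be sketched as a guard at the API layer. The class and exception names are hypothetical; the point is only that a locked Instance is rejected before any message reaches the queue.

```python
class InstanceLockedError(Exception):
    """Raised when an operation is requested on a locked Instance."""


class ApiLayer:
    """Hypothetical stand-in for the part of nova-api that tracks locked state."""

    def __init__(self):
        self.locked = set()

    def lock(self, instance_id):
        self.locked.add(instance_id)

    def unlock(self, instance_id):
        self.locked.discard(instance_id)

    def request(self, instance_id, action):
        # The locked state is checked before anything is forwarded.
        if instance_id in self.locked:
            raise InstanceLockedError(instance_id + " is locked")
        return action + " accepted for " + instance_id


api = ApiLayer()
api.lock("vm-1")
try:
    api.request("vm-1", "reboot")
    rejected = False
except InstanceLockedError:
    rejected = True

api.unlock("vm-1")
result = api.request("vm-1", "reboot")
```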

6.Terminate

Delete Instance.
(1) Send a request to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) The node's nova-compute gets the message from rabbitMQ, closes the Instance, deletes the image file of the Instance, and releases other resources such as the virtual network;

7.Pause/Resume

To suspend an Instance for a short time, use the Pause operation, which saves the Instance's state in the host's memory. To restore it, execute the Resume operation, which reads the Instance's state back from memory and continues running the Instance.
(1) Send a request to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) The node's nova-compute gets the message from RabbitMQ and pauses the Instance, whose status then becomes Paused;

8.Suspend/Resume

To suspend an Instance for a long time, use the Suspend operation, which saves the state of the Instance to the host's disk. To restore it, perform the Resume operation, which reads the state of the Instance back from disk and continues running it. After Suspend, the state of the Instance is Shut Down.

9.Snapshot

Sometimes the operating system is badly damaged and cannot be restored through the Rescue operation, so consider using a backup to restore. Nova's backup operation is Snapshot. The working principle is to perform a full backup of the Instance's image file (system disk), generate an Image of type snapshot, and then save it on Glance.

(1) Send a request to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) The node's nova-compute obtains the message from RabbitMQ, pauses the Instance, takes a snapshot of the Instance's image file, resumes the Instance, and uploads the snapshot to Glance.

10.Resize

The Resize operation is used to adjust the size of the VCPU, memory, and disk resources of the Instance, that is, to select a new flavor for the Instance. Re-select a suitable computing node for Instance through nova-scheduler. If the selected node is not the same as the current node, then the Migrate operation is required.

b) Planned failure handling (system upgrade, hardware replacement)

1.Shelve

Although the Instance is in the Shut Down state after Suspend, the Hypervisor still reserves resources for it on the host so that it can Resume successfully. To release these resources, use the Shelve operation, which saves the Instance as an Image in Glance and then deletes the Instance from the host.

(1) Send a request to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) The node's nova-compute gets the message from RabbitMQ, closes the Instance, and then performs a snapshot of the Instance. On success, the Image generated by the snapshot is saved in Glance, and finally the Instance's resources on the host are deleted;

2.Unshelve

A shelved Instance is restored with the Unshelve operation. In effect it launches a new Instance from the Image saved in Glance, so nova-scheduler again selects a suitable computing node on which to create the Instance.

3.Migrate

The Migrate operation moves the Instance from its current computing node to another node. Migrate does not require the source and target nodes to share storage, but one condition must be met: passwordless access for the nova user must be configured between the computing nodes.

(1) Only Admin users can send requests to nova-api;

(2) nova-api sends a message to RabbitMQ;

(3) After receiving the message, nova-scheduler will select the appropriate computing node for the Instance, and then notify the computing node to migrate the Instance;

(4) The nova-compute of the source node gets the message from RabbitMQ. It first tries to touch a temporary file in the Instance directory on the target node through ssh, then closes the Instance and transfers the Instance's image file to the target node through scp. (Note that copying files across nodes requires that the user running the nova-compute process can access the other computing nodes without a password.)

(5) Start the Instance on the target node. The process is very similar to launching an Instance;

(6) At this point the Instance is in the "Confirm or Revert Resize/Migrate" state, which requires the user to confirm or revert the migration.

(7) When the Confirm button is pressed, the source computing node will delete the Instance directory and delete the Instance on the Hypervisor;

(8) After pressing the Revert button, close the Instance on the target node, delete the Instance directory, delete the Instance on the Hypervisor, and start the Instance on the source node.
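The outcome of steps (6) to (8) can be summarized in a few lines. This is only a model of the decision, with hypothetical names; it performs no real cleanup.

```python
def finish_migration(decision, source_node, target_node):
    """Return where the Instance ends up and which node's copy is deleted."""
    if decision == "confirm":
        # The migration is kept: the source copy is no longer needed.
        return {"runs_on": target_node, "cleaned_up": source_node}
    if decision == "revert":
        # The migration is undone: the target copy is removed.
        return {"runs_on": source_node, "cleaned_up": target_node}
    raise ValueError("decision must be 'confirm' or 'revert'")


confirmed = finish_migration("confirm", "node-a", "node-b")
reverted = finish_migration("revert", "node-a", "node-b")
```

Until the user decides, both nodes hold a copy of the Instance's files, which is what makes Revert possible.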

4.Live Migrate

The Migrate operation stops the Instance, so it is a cold migration. Live Migrate is a hot migration: the Instance is not shut down. Live Migrate comes in two types:
(1) The source and target nodes do not share storage. The Instance's image file must be transferred from the source node to the target node during migration. This is called block migration;

(2) The source and target nodes share storage. The Instance's image file does not need to be migrated; only the running state of the Instance needs to be migrated to the target node;

The source and target nodes need to meet certain conditions to realize Live Migrate:
(1) The CPU types of the source and target nodes are the same;

(2) The Libvirt versions of the source and target nodes are the same;

(3) The source and target nodes can identify each other's host name, such as adding each other's entry in /etc/hosts;

(4) Specify the TCP protocol to be used during online migration in /etc/nova/nova.conf of the source and target nodes;

(5) Since the Instance uses the config drive to save its metadata, the config drive also needs to be migrated to the target node during block migration. Since libvirt currently supports only vfat config drives, /etc/nova/nova.conf must specify that a vfat config drive is created when launching an Instance;

(6) The libvirt TCP remote listening service must be enabled on the source and target nodes;

Shared storage can be implemented in many ways, such as NFS servers, NAS servers, and distributed file systems. With shared storage there is no need to transfer image files, only the state of the Instance, which is much faster than block migration.
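A pre-flight check for some of the conditions above might look like the following sketch. The dictionary keys and sample values are hypothetical, and only three of the listed conditions are modeled.

```python
def live_migration_blockers(source, target):
    """Return the reasons why live migration between two nodes would not work."""
    blockers = []
    if source["cpu_type"] != target["cpu_type"]:
        blockers.append("CPU types differ")
    if source["libvirt_version"] != target["libvirt_version"]:
        blockers.append("libvirt versions differ")
    if target["hostname"] not in source["resolvable_hosts"]:
        blockers.append("source cannot resolve target host name")
    if source["hostname"] not in target["resolvable_hosts"]:
        blockers.append("target cannot resolve source host name")
    return blockers


def migration_kind(shared_storage):
    """Without shared storage the image file must be copied (block migration)."""
    return "shared-storage live migration" if shared_storage else "block migration"


# Made-up node descriptions for illustration.
node_a = {"hostname": "node-a", "cpu_type": "type-x", "libvirt_version": "1.0",
          "resolvable_hosts": {"node-b"}}
node_b = {"hostname": "node-b", "cpu_type": "type-x", "libvirt_version": "1.1",
          "resolvable_hosts": set()}

blockers = live_migration_blockers(node_a, node_b)
kind = migration_kind(shared_storage=False)
```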

c) Unplanned troubleshooting (file damage, hardware failure)

1.Rescue/Unrescue

When the operating system fails and cannot be restarted, we can boot the system from another system disk first and then try to recover. This approach, called Rescue, is suitable for less serious failures.
Rescue uses the specified Image as the boot disk to boot the Instance, and attaches the Instance's own system disk to the operating system as the second disk.

(1) Send a request to nova-api. Nova uses the Image used in Instance deployment by default;

(2) nova-api sends a message to RabbitMQ;

(3) The node's nova-compute gets the message from RabbitMQ, closes the Instance, creates a new boot disk from the Image, names it disk.rescue, and starts the Instance. When Rescue has executed successfully, you can view the XML file of the Instance with virsh edit <instance>: disk.rescue is the boot disk vda, and the original boot disk, named disk, is the second disk vdb.
After the repair is complete, use the Unrescue operation to reboot the Instance from its original boot disk.

2.Rebuild

The Rebuild operation restores from a snapshot: it replaces the image file of the current Instance with the snapshot, while keeping the Instance's other attributes, such as its network and resource allocation, unchanged.

(1) Send a request to nova-api to select the Image used for recovery;

(2) nova-api sends a message to RabbitMQ;

(3) The node's nova-compute gets the message from rabbitMQ, closes the Instance, downloads the new Image, prepares the image file of the Instance, and starts the Instance;

3.Evacuate

When a host is damaged, how can the instances on it be restored? With the Evacuate operation, when nova-compute on a node cannot work, the Instances of that node are moved to other computing nodes, provided that the image files of the Instances are placed on shared storage.

(4) OpenStack log reading

OpenStack logs record detailed information, which is of great help to us in troubleshooting.

1. Directory location

For OpenStack that is not installed by devstack, logs are generally placed in the /var/log/xxx directory.

2. Log format

OpenStack's log format is unified, as shown below:
<timestamp> <log level> <Request ID> <log content> <source code location>

Timestamp: the time of the log record;

Log level: INFO, WARNING, ERROR, DEBUG;

Request ID: each operation is assigned a unique Request ID, which makes its log entries easy to find;

Log content: the main body of the log, recording the current operations and results;

Source code location: the location of the logging code, including the method name, the directory location, and the line number of the source code file;
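To make the fields concrete, the sketch below splits one log line into the fields listed above with a regular expression. The sample line is invented for illustration and is deliberately simplified; real log lines carry additional fields and differ in layout, so the pattern would need adjusting for actual files.

```python
import re

# Pattern for the simplified field order: timestamp, level, request ID, content, location.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)?) "
    r"(?P<level>INFO|WARNING|ERROR|DEBUG) "
    r"(?P<request_id>req-[0-9a-f-]+) "
    r"(?P<content>.*) "
    r"(?P<location>\S+:\d+)$"
)


def parse_log_line(line):
    """Return the named fields of a log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Invented sample line, not copied from a real log.
sample = (
    "2024-01-01 12:00:00.123 INFO req-11111111-2222-3333-4444-555555555555 "
    "Instance spawned successfully example/module.py:42"
)
fields = parse_log_line(sample)
```

Once lines are parsed this way, grouping by `request_id` collects every entry that belongs to one operation.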

3. View the log

First, you need to understand how OpenStack operates in order to read logs in a targeted way; otherwise the volume of log content can be overwhelming. Work out which services an operation passes through, and then check the logs on the corresponding nodes.

You can narrow the range first. For example, start tail -f on the log file before performing the operation, so that the entries you need are guaranteed to be in the output printed after the operation;

In addition, the required log range can also be determined by the timestamp;
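Narrowing by timestamp can be sketched as follows, assuming every entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp. The sample lines are invented.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def entries_between(lines, start, end):
    """Keep the log lines whose leading timestamp falls inside [start, end]."""
    selected = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:19], STAMP_FORMAT)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if start <= stamp <= end:
            selected.append(line)
    return selected


# Invented sample lines for illustration.
sample_lines = [
    "2024-01-01 11:59:58 INFO before the operation",
    "2024-01-01 12:00:03 INFO operation started",
    "continuation line without a timestamp",
    "2024-01-01 12:00:07 ERROR operation failed",
    "2024-01-01 12:05:00 INFO unrelated later entry",
]
window = entries_between(
    sample_lines,
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 1, 0),
)
```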