
OpenStack Cloud System Tuning Guide for the Zhaoxin Platform
Published: 2024-07-22   Views: 238

This guide was written to help users of Zhaoxin-platform servers build an IaaS system more quickly and get the full performance out of the platform. It is not tied to any specific product model; its focus is on tuning methods for Zhaoxin-platform servers, and it will be refined gradually as evaluation and testing mature.

 

I. System Overview

 

1. Hardware environment

The IaaS system under test has 46 nodes, all built from Zhaoxin-platform dual-socket servers: 40 OpenStack nodes and 6 Ceph distributed-storage nodes, as listed below.

 

節(jié)點類�

節(jié)點配�

�(shù)�

管理節(jié)�

128G DDR4�(nèi)�+120G SSD+千兆�(wǎng)�(luò)x2

3

�(wǎng)�(luò)節(jié)�

128G DDR4�(nèi)�+120G SSD+千兆�(wǎng)�(luò)x3

1

塊存儲節(jié)�

128G DDR4�(nèi)�+120G SSD+千兆�(wǎng)�(luò)x2

1

對象存儲節(jié)�

128G DDR4�(nèi)�+120G SSD+4T HDDx3+千兆�(wǎng)�(luò)x2

1

計算節(jié)�

128G DDR4�(nèi)�+120G SSD+千兆�(wǎng)�(luò)x2

33

�(jiān)控節(jié)�

128G DDR4�(nèi)�+120G SSD+千兆�(wǎng)�(luò)x2

1

Ceph Mon節(jié)�

128G DDR4�(nèi)�+120G SSD+萬兆�(wǎng)�(luò)x2

3

Ceph OSD節(jié)�

128G DDR4�(nèi)�+250G SSD+500G SSDx3+2T HDDx12+萬兆�(wǎng)�(luò)x4

3

Rally Server

Dell E3-1220v3服務(wù)�

1

 

2. Software environment

| Software | Version | Notes |
| --- | --- | --- |
| Host OS | CentOS 7.6 | with ZX Patch v3.0.9.4 |
| Python | v2.7.5 | |
| Docker | v19.03.8 | |
| OpenStack | Stein | container-based deployment |
| MariaDB | v10.3.10 | multi-master Galera Cluster |
| SQLAlchemy | v0.12.0 | Python module |
| RabbitMQ | v3.8.14 | with Erlang v22.3.4.21 |
| HAProxy | v1.5.18 | |
| Keepalived | v1.3.5 | |
| Ceph | v14.2.19 | |
| Rally | v2.0 | concurrency test tool |
| CosBench | v0.4.2 | object storage benchmark |
| TestPerf | v2.16.0 | message queue benchmark |

 

3. Storage scheme

The local disks on each OpenStack node are used only for the operating system. All storage resources offered to users are backed by Ceph: nova, glance, cinder-volume, cinder-backup and manila all use Ceph as their backend. For deploying and tuning a Ceph cluster on the Zhaoxin platform, see the published guide "Best Practices for Deploying the Ceph Distributed Storage System on the Zhaoxin Platform".

The Swift object storage service is also deployed, on local disks, but it is not used by the other components and its performance is outside the scope of this document.

 

4. Network scheme

Based on the on-board NICs of the Zhaoxin-platform servers, the planned workload and the existing network environment, the cluster network is laid out as shown below. The five logical networks are collapsed onto three physical networks: the management and tunnel networks share one physical network, the deployment and storage networks share another, and the external network has its own.

 

- Deployment network: used for PXE boot and for reaching the local package mirror during installation.

- Management network: used for API access between nodes and for SSH access.

- Tunnel network: used for VM-to-VM traffic across compute nodes and for connections to the network nodes; it mainly carries east-west traffic.

- Storage network: used to access the unified storage backend.

 


Figure 1: Network topology

 

II. Tuning Strategy and Methods

 

1. Performance metrics

This document covers the performance of the IaaS platform itself, not of the guest VMs, so testing and tuning focus on performance tests of the key OpenStack components. The metrics we track are: completion time of a single request, 95th-percentile completion time of batch requests, and tail latency of batch requests.

 

2. Test methodology

The workflow for arriving at optimized parameters is: test, analyze, tune, then test again.

 

- Test tools

For OpenStack components we use Rally for batch testing, with either the test cases shipped with Rally or custom ones as needed. Rally's reports include detailed timing statistics. Batch tests in this work ran at 200 concurrency against ten compute nodes, with Cirros as the guest OS.

For single requests, OSprofiler can also be used. Most OpenStack requests pass through several components before completing; this tool helps analyze the processing path and spot likely hot spots before concurrency testing, so they can be optimized up front.

For RabbitMQ we used TestPerf with several typical workloads, focusing on queue throughput and message delivery latency.

 

- Analysis methods

Start from the test results and the logs, dealing first with warnings and errors in the logs, such as service or database connection timeouts caused by bursts of concurrent requests, database query failures, and resource exhaustion.

Next, using an understanding of the source code, add timestamps to measure the time spent at the key points of the control or data flow during the test, then analyze each expensive point in turn.

Another common method is benchmarking against a reference platform. It is used when, after the problem has been narrowed down, the performance issue still cannot be explained, or when its cause may be tied to the hardware architecture or microarchitecture.

 

- Optimization approaches

For the OpenStack control plane there are three main optimization goals: raising the concurrency success rate, shortening the mean completion time, and reducing tail latency.

Concurrency tests with default OpenStack settings show that the success rate at 200 concurrency varies widely with the function under test; in general, the more components involved and the higher the concurrency, the lower the success rate. Raising it relies on improving hardware performance and adjusting component configuration. On the Zhaoxin platform, beyond upgrading the key hardware (memory and NICs), two points matter most: first, the server OS must carry the Zhaoxin BSP patch; second, the deployment design should set CPU affinity according to the platform's NUMA topology from the start, avoiding cross-NUMA access where possible, and where it cannot be avoided, preferring adjacent nodes over cross-socket access. On the component side, trace the request processing path to find the slow stages. Because rising concurrency often triggers retries and timeouts, which in turn cause request failures, when a component cannot be made faster you should raise its retry count or timeout so requests do not fail outright.

 

To shorten the mean completion time, beyond the hardware improvements already mentioned, you must pinpoint the exact stage where each delay occurs. Various tracing techniques can be used to map the control and data flows, and log analysis can break down the time spent per stage. Once the expensive operations are found, the tuning options are:

 

1. Modify component configuration to add worker threads and make full use of the cores.

2. When component performance depends on OS settings, tune the OS; see the tuning documentation for the corresponding operating system and the other Zhaoxin server tuning guides.

3. Components often offer several mechanisms for the same function; change the configuration to use a more efficient one.

4. Look for upstream patches that improve request handling; some slowness stems from the implementation itself, and a patch can help markedly. If none exists, the only option is to develop one.

5. Optimize around Python's characteristics and avoid blocking threads.

6. Optimize the deployment: add component instances or nodes to match the concurrency pressure, or migrate components to balance load across nodes.

7. Update the server OS kernel with feature patches to raise kernel throughput.

8. For common middleware such as message queues, databases and load balancers, performance can also be gained by upgrading or switching to a faster equivalent.

 

The recommended parameters in the following chapters were all derived from test results.

 

Note that parameter values depend heavily on the test case. For example, if the guest image in a 200-concurrency VM-creation test is changed from Cirros to Ubuntu, many timeouts and retry counts will likely need to be raised to keep the success rate at 100%. The recommendations below are therefore tuning references, not production values.

 

III. Key OpenStack Component Configuration

 

1. Database

The database is a critical service in an OpenStack system. Every component keeps its own database there, storing data about services, resources and users; the database is also one of the mechanisms components use to coordinate with each other. Handling almost any request therefore involves some database reads and writes.

During cloud-platform tuning, database request latency deserves close attention. Runtime latency problems are usually complex, involving the database server/cluster, the proxy and the client (the component side); each must be checked in turn to locate the source and tune accordingly. This section covers the server-side and client-side parameters; for the proxy, see the later chapters.

 

1.1 MariaDB

The test system runs a multi-master Galera Cluster of three MariaDB nodes, fronted by HAProxy for active/standby high availability. General database tuning can follow the MySQL optimization manual; one parameter deserves special attention in an OpenStack cloud:

- max_allowed_packet

Sets the maximum packet size the MariaDB server will accept. Large inserts and updates can fail when max_allowed_packet is set too small.

- Recommended setting

The default is 1024, in KB. An OpenStack cluster generates packets larger than this default, so the value should be raised.

Edit the MariaDB configuration file galera.cnf:

[mysqld]

max_allowed_packet = 64M

Restart the mariadb service to take effect.

 

1.2 oslo_db

Components access the database through oslo_db; in our deployment oslo_db plus SQLAlchemy form the database client, so the client-side parameters and tuning methods follow the official SQLAlchemy documentation. Note that newer SQLAlchemy versions bring performance gains, but do not upgrade casually; version compatibility must be considered.

 

- slave_connection

The Stein version of oslo_db supports slave_connection, and some components (nova, for example) can already split reads from writes: write operations use connection while read operations use slave_connection, improving read performance.

Reference configuration:

1. Use HAProxy to provide separate forwarding entry points for reads and writes: writes on 3306, forwarded in one-master-two-standby mode; reads on 3307, forwarded by least connections.

2. For components that support read/write splitting, add a slave_connection entry to the [database] section of their configuration file.

Note: some components accept slave_connection but their code never actually uses it for reads; check carefully for your specific version.

 

- Connection pool

Establishing a database connection is relatively expensive, so a connection pool is provided; pooled connections are reused for efficiency. While tuning, watch the connections between each component and the database: the component's database request latency and the connection-related messages in its logs. If latency is high and investigation suggests the component is spending too long obtaining connections, try the parameters below. Our tests used the defaults; tune them according to runtime behavior. The main pool parameters are:

    min_pool_size: the pool keeps at least this many open SQL connections; default 1.

    max_pool_size: the maximum number of open SQL connections in the pool; default 5, 0 means unlimited.

    max_overflow: how many connections may be opened beyond the maximum; default 50.

    pool_timeout: if no idle connection is available and the count has reached max_pool_size + max_overflow, a requester waits up to pool_timeout (default 30s) before an exception is raised. If that exception appears, consider enlarging the pool.
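The interplay of these limits can be sketched in plain Python (a simplified model of the accounting, not oslo_db code; the names mirror the options above):

```python
def pool_capacity(max_pool_size, max_overflow):
    """Hard ceiling on open connections: pool plus overflow.
    oslo_db treats max_pool_size = 0 as unlimited (modeled as None here)."""
    if max_pool_size == 0:
        return None
    return max_pool_size + max_overflow


def will_wait(open_conns, max_pool_size, max_overflow):
    """A caller waits (up to pool_timeout) only once the ceiling is reached."""
    cap = pool_capacity(max_pool_size, max_overflow)
    return cap is not None and open_conns >= cap


# With the defaults (5 + 50), the 56th concurrent request finds no free slot:
print(will_wait(54, 5, 50))  # False
print(will_wait(55, 5, 50))  # True
```

So raising max_overflow widens the ceiling without keeping extra idle connections, while raising max_pool_size keeps more connections warm.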

 

2. Message queue

2.1 RabbitMQ

The test system runs a three-node RabbitMQ mirrored-queue cluster; all nodes are disk nodes.

RabbitMQ's main performance issue is message delivery latency. With this cluster layout, most of that latency goes into keeping the mirrored queues consistent, and that processing is complex and hard to tune. The current tuning options for a cluster are:

1. RabbitMQ runs on Erlang, and comparative testing shows large performance differences between versions, so use recent ones: RabbitMQ 3.8 or later with Erlang v22.3 or later (chosen to match the RabbitMQ version).

2. The more mirrors a queue has, the longer each message spends on consistency, so reduce the number of mirrors where circumstances allow.

3. If RabbitMQ is containerized, give its container more CPU time via Docker parameters so it is not starved of CPU.

 

The recommended tuning parameters are as follows.

- collect_statistics_interval

By default the RabbitMQ server gathers system statistics at a 5s interval; rates such as publish and delivery are computed over this period.

- Recommended setting

Raising this value reduces the CPU cost of the server collecting large amounts of state; the unit is ms.

Edit the RabbitMQ configuration file rabbitmq.conf:

collect_statistics_interval = 30000

Restart the RabbitMQ service to take effect.

 

- cpu_share

Docker lets you set a number per container as its CPU share; the default is 1024. Note that the share is relative and has no absolute meaning on its own: when several containers run on a host, each receives CPU time in proportion to its share of the total. The ratio only shows when CPU is contended; when CPU is idle, even a container with a low cpu_share can use more than its proportion.
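The proportional rule can be illustrated with a toy model (this is the scheduler's accounting in miniature, not Docker code; the container names are just examples):

```python
def cpu_fraction(shares, name):
    """Under full contention, a container's CPU fraction is its share
    divided by the sum of all running containers' shares."""
    return shares[name] / sum(shares.values())


# rabbitmq-server raised to 10240 next to three default-share (1024) containers:
shares = {"rabbitmq-server": 10240, "nova-api": 1024, "cinder-api": 1024, "glance-api": 1024}
print(round(cpu_fraction(shares, "rabbitmq-server"), 3))  # 0.769
```

With the default 1024 everywhere, the four containers would each get 25%; the 10× share gives RabbitMQ roughly 77% of contended CPU time.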

 

- Recommended setting

The control nodes host the OpenStack components' API servers as well as the RabbitMQ server. When running concurrency tests against RabbitMQ, raise the CPU share of the rabbitmq docker container on those nodes so it gets more CPU. Where resources allow, the RabbitMQ cluster should be deployed on its own servers so it does not compete with other services for CPU; a separately deployed cluster sustains higher concurrency.

Configuration:

docker update --cpu-shares 10240 rabbitmq-server

 

- ha-mode

RabbitMQ cluster mirrored queues replicate each queue across nodes: every queue has one master node and several slave nodes. Consumer operations are performed on the master and then repeated on the slaves; messages published by producers are synchronized to all nodes; other operations are relayed through the master, which applies them to the slaves. The mirroring policies are:

| ha-mode | ha-params | Description |
| --- | --- | --- |
| all | | every node in the cluster mirrors the queue |
| exactly | count | the queue is mirrored on exactly this many nodes |
| nodes | node names | the queue is mirrored on the listed nodes |

The default ha-mode is all; on a three-node cluster each queue is replicated on all three nodes. A policy is defined with:

    rabbitmqctl set_policy -p vhost

    pattern: a regular expression; the policy is applied to the exchanges or queues it matches.

    definition: the policy parameters.

- Recommended setting

rabbitmqctl set_policy -p / ha-exactly '^' '{"ha-mode":"exactly","ha-params":2}'

Two-way replication copes with high concurrency much better than three-way: in our tests it sustained 1.75× the concurrency.

 

- CPU binding

RabbitMQ runs in the Erlang VM. The Erlang version used in this environment supports SMP with a multi-scheduler, multi-run-queue design: at startup the VM launches one scheduler per logical CPU by default (limited by the +S flag), and each scheduler takes work from its own run queue. Because of the OS thread scheduler, however, Erlang scheduler threads migrate between cores, which increases cache misses and hurts performance. The +sbt flag binds schedulers to logical cores; Erlang supports several binding strategies, described in the Erlang documentation.

- Recommended setting

The default binding type is db, which binds schedulers round-robin across NUMA nodes so that all nodes are used. But because a balancing mechanism moves tasks between the schedulers' run queues, cache is used better in this test environment by setting +sbt to nnts, which binds scheduler threads in NUMA-node order.

Edit the RabbitMQ configuration file rabbitmq-env.conf:

RABBITMQ_SERVER_ADDITIONAL_ERL_ARGS='+sbt nnts'

Restart the RabbitMQ service to take effect.

Users can combine this with other Erlang VM tuning (such as enabling HiPE) as their environment requires, but the Erlang VM's flags and tuning methods are outside the scope of this document.

 

- Heartbeats

Heartbeats detect whether the peer of a connection is alive. The basic principle is to watch traffic on the corresponding socket: if nothing is sent or received for a while, a heartbeat probe is sent to the peer, and if no reply arrives within the timeout the peer is assumed to have crashed. Under heavy load a component may fail to process heartbeat messages in time, so the RabbitMQ server receives no answer within the timeout, closes the connection, and errors follow. Raise the heartbeat timeout between the components and the RabbitMQ server to avoid this; see also the related discussion in the Cinder chapter.

Edit the RabbitMQ configuration file rabbitmq.conf:

heartbeat = 180

Restart the RabbitMQ service to take effect.

 

2.2 oslo_messaging

All components connect to RabbitMQ through oslo_messaging. The following parameters of that library can be used to tune RPC handling under high load:

    rpc_conn_pool_size: size of the RPC connection pool; default 30.

    executor_thread_pool_size: number of threads or coroutines executing RPC handlers; default 64.

    rpc_response_timeout: RPC call response timeout; default 60s.

In our tests the first two kept their defaults, and rpc_response_timeout was raised moderately according to each component's actual processing capacity.

 

3. Nova

3.1 nova-api

- Workers

nova-api is a WSGI server (an "API server") that accepts external REST requests. At startup it creates a configured number of worker threads to receive requests; if there are too few workers for the concurrency, requests queue up and processing time grows. Administrators should therefore size the workers to the actual load. An API server with a node to itself can usually set workers equal to the CPU core count, but when several API servers share a node (as on the control nodes), physical CPU is limited and each server's workers must be set according to the node's capacity and each server's load.
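One way to split a shared node's cores among co-located API servers is to weight them by observed load. This is an illustrative heuristic, not an OpenStack formula; the service names and weights are examples:

```python
def size_workers(cores, load_weights):
    """Split a node's cores among co-located API servers in proportion to
    their observed load, giving every server at least one worker."""
    total = sum(load_weights.values())
    return {svc: max(1, round(cores * w / total)) for svc, w in load_weights.items()}


# A 16-core control node shared by three API servers, nova-api carrying
# roughly 3x the request load of the other two:
print(size_workers(16, {"nova-api": 3, "cinder-api": 1, "glance-api": 1}))
```

The resulting counts then go into each service's workers option (osapi_compute_workers, osapi_volume_workers, and so on) rather than into a single shared setting.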

 

For API servers deployed in containers, you can additionally pin each container's CPU and memory to a NUMA node, keeping the containers' NUMA affinities apart so they contend less for CPU and use the caches better. The same method applies to the other containerized services.

nova-api's configuration file has several relevant options, such as osapi_compute_workers and metadata_workers, which can be adjusted to the node's capacity and load.

Other components, such as glance, have their own API servers similar to nova-api; raising their worker counts improves request handling in the same way.

 

- Patch: live migration performance

When Nova Compute executes a live migration, the child thread running _live_migration_operation() and the main thread running _live_migration_monitor() both access the instance fields. If Instance.fields has not been accessed before, both may call Instance.obj_load_attr() (nova/objects/instance.py) at the same time; the re-entry of utils.temporary_mutation() then leaves Context.read_deleted set to "yes". A later _update_usage_from_instances() consequently counts already-deleted instances it should not, adding overhead. The Stein release of OpenStack has this bug; later releases fetch Instance.fields earlier for other reasons, so nova-compute no longer needs obj_load_attr() during live migration. Bug report: https://bugs.launchpad.net/nova/+bug/1941819. The patch https://github.com/openstack/nova/commit/84db8b3f3d202a234221ed265ec00a7cf32999c9 avoids the bug by prefetching Instance.fields in the Nova API.

See appendix A1.1 for the patch.

 

3.2 nova-scheduler & nova-conductor

Both services also have a workers option controlling the number of worker processes, tunable to the actual load; in our tests the defaults were sufficient. nova-conductor's default is the CPU core count. nova-scheduler's default is the CPU core count when the filter scheduler is used and 1 for other schedulers.

 

3.3 nova-compute

- vcpu_pin_set

Restricts the range of pCPUs on the compute node that guests may use, leaving some CPUs for the host so that it keeps running normally. This prevents heavily loaded VMs from starving the host of CPU.

- Recommended setting

Reserve one physical core per NUMA node for the host. On a two-NUMA-node platform, for example, reserve cpu0 and cpu15 for the host:

Edit the nova-compute configuration file nova.conf:

vcpu_pin_set = 1-14

Restart the nova_compute service to take effect.

 

- reserved_host_memory_mb

Reserves memory on the compute node for the host system, so that guests cannot consume so much memory that tasks on the host fail to run.

- Recommended setting

The value depends on the total memory, the tasks running on the host, and the maximum memory the guests are expected to use; reserve at least 1024MB for the system.

Edit the nova-compute configuration file nova.conf:

reserved_host_memory_mb = 4096  # in MB

Restart the nova_compute service to take effect.

 

- cpu_allocation_ratio

Sets the vCPU overcommit ratio.

- Recommended setting

Given ZHAOXIN CPU performance, one pCPU can back two vCPUs in a desktop cloud system.

Edit the nova-compute configuration file nova.conf:

[DEFAULT]

cpu_allocation_ratio = 2

Restart the nova_compute service to take effect.

 

- block_device_allocate_retries

Creating a VM with a block device requires creating a volume from blank, image or snapshot, and the volume must reach the "available" state before it can be attached. block_device_allocate_retries sets how many times nova checks whether the volume is "available". The related block_device_allocate_retries_interval sets the polling interval; default 3, in seconds.

- Recommended setting

The default is 60 checks. When cinder is heavily loaded, the volume may still not be "available" after 60 checks; raise the count to avoid VM-creation failures.

Edit the nova-compute configuration file nova.conf:

[DEFAULT]

block_device_allocate_retries = 150

Restart the nova_compute service to take effect.
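Together the two options bound how long nova-compute will wait for a volume; a quick sanity check of that bound (simple arithmetic, not nova code):

```python
def max_wait_seconds(retries, interval=3):
    """Upper bound on how long nova-compute polls for a volume to become
    'available': one check every `interval` seconds, `retries` times."""
    return retries * interval


print(max_wait_seconds(60))   # defaults: 180s
print(max_wait_seconds(150))  # recommended retries: 450s
```

If cinder routinely needs longer than this bound under load, VM creation fails however healthy the volume eventually becomes, which is why raising the count (or the interval) helps.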

 

- vif_plugging_timeout

How long nova-compute waits for the Neutron VIF plugging event message to arrive.

- Recommended setting

The default is 300, in seconds. At high VM-creation concurrency the event may not arrive within 300s; in the Zhaoxin desktop cloud test, creating 200 VMs concurrently took about 360s. Adjust the value to the system's maximum concurrency.

Edit the nova-compute configuration file nova.conf:

[DEFAULT]

vif_plugging_timeout = 500

Restart the nova_compute service to take effect.

 

- Patch: live migration performance

This patch does two things: it removes unnecessary get_volume_connector() calls during migration and reduces unneeded Neutron accesses. It makes live migration more efficient and shortens the migration's service downtime. Patch links:

https://review.opendev.org/c/openstack/nova/+/795027

https://opendev.org/openstack/nova/commit/6488a5dfb293831a448596e2084f484dd0bfa916

See appendix A1.2 for the patch.

 

4. Cinder

4.1 cinder-api

- Workers

See nova-api. The difference is that in Stein cinder-api is deployed under httpd, so besides cinder's own workers options (such as osapi_volume_workers) you can also tune processes and threads in cinder-wsgi.conf. keystone, horizon and similar components are deployed the same way.

Edit cinder-wsgi.conf:

WSGIDaemonProcess cinder-api processes=12 threads=3 user=cinder group=cinder display-name=%{GROUP} python-path=/var/lib/kolla/venv/lib/python2.7/site-packages

...

Restart the cinder-api service to take effect.

 

- rpc_response_timeout

How long cinder-api waits for an RPC reply.

- Recommended setting

The default is 60, in seconds. Under highly concurrent attach_volume requests, cinder-volume takes longer to answer cinder-api; if rpc timeout errors are reported, raise the value.

Edit the cinder configuration file cinder.conf:

[DEFAULT]

rpc_response_timeout = 600

Restart the cinder-api service to take effect.

 

4.2 cinder-volume

- Heartbeats

With RabbitMQ

See the "Heartbeats" subsection of the RabbitMQ chapter.

Cinder's heartbeat_timeout_threshold sets the heartbeat timeout; heartbeat probes are sent at half that interval.

- Recommended setting

cinder-volume's heartbeat_timeout_threshold defaults to 60, in seconds. Under heavy load it may fail to process heartbeat messages in time, so the RabbitMQ server receives no answer within the timeout, closes the connection, and a cascade of errors follows. Raise the heartbeat timeout between cinder-volume and the RabbitMQ server instead; disabling the heartbeat mechanism altogether (heartbeat = 0) is not recommended.

Edit the cinder-volume configuration file cinder.conf:

[oslo_messaging_rabbit]

heartbeat_timeout_threshold = 180

Restart the cinder-volume service to take effect.

 

Between services

OpenStack is a distributed system: services running on different hosts cooperate to get the work done. Each service periodically writes its update time to the database, and services judge whether a peer is alive by checking whether its update time is older than the configured service_down_time. This, too, can be seen as a heartbeat mechanism.

Under high load, database access latency grows, and the periodic reporting task may itself run late because the CPU is busy; both can cause false "service down" reports.

- Recommended setting

report_interval: status report (heartbeat) interval; default 10, in seconds.

service_down_time: maximum age of the last heartbeat; default 60, in seconds. A service with no heartbeat for longer than this is considered down.

report_interval must be smaller than service_down_time. Raise service_down_time so that cinder-volume is not falsely marked down when its periodic tasks hold the CPU and delay the status report.

Edit the cinder-volume configuration file cinder.conf:

service_down_time = 120

Restart the cinder_volume service to take effect.
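The constraint between the two options can be checked mechanically (an illustrative helper; the rule itself is the one stated above):

```python
def heartbeat_config_ok(report_interval, service_down_time):
    """A service is marked down after service_down_time seconds without a
    heartbeat, and heartbeats arrive every report_interval seconds, so the
    interval must be strictly smaller; room for several missed beats is
    safer still."""
    return report_interval < service_down_time


print(heartbeat_config_ok(10, 120))  # recommended values: True
print(heartbeat_config_ok(60, 60))   # no slack at all: False
```

With report_interval 10 and service_down_time 120, up to eleven consecutive reports can be lost or delayed before the service is declared down.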

 

- rbd_exclusive_cinder_pool

OpenStack Ocata introduced the rbd_exclusive_cinder_pool parameter. If the RBD pool is used exclusively by Cinder, set rbd_exclusive_cinder_pool = true: Cinder then obtains the provisioned size by querying its database instead of polling every volume in the backend, which markedly reduces query time and also lightens the load on the Ceph cluster and the cinder-volume service.

- Recommended setting

Edit the cinder-volume configuration file cinder.conf:

[DEFAULT]

enabled_backends = rbd-1

[rbd-1]

rbd_exclusive_cinder_pool = true

Restart the cinder-volume service to take effect.

 

- image_volume_cache_enabled

Since the Liberty release, Cinder can use an image volume cache, which speeds up creating volumes from images. The first time a volume is created from an image, a cached image-volume owned by the block storage Internal Tenant is created as well; later volumes from that image are cloned from the cached image-volume instead of downloading the image and writing it into the volume.

- Recommended setting

cinder_internal_tenant_project_id: the ID of the OpenStack project "service".

cinder_internal_tenant_user_id: the ID of the OpenStack user "cinder".

image_volume_cache_max_size_gb: maximum total size of the cached image-volumes; 0 means no limit.

image_volume_cache_max_count: maximum number of cached image-volumes; 0 means no limit.

Edit the cinder-volume configuration file cinder.conf:

[DEFAULT]

cinder_internal_tenant_project_id = c4076a45bcac411bacf20eb4fecb50e0

cinder_internal_tenant_user_id = 4fe8e33010fd4263be493c1c9681bec8

[backend_defaults]

image_volume_cache_enabled = True

image_volume_cache_max_size_gb = 0

image_volume_cache_max_count = 0

Restart the cinder-volume service to take effect.

 

5. Neutron

5.1 Neutron Server

neutron-server is neutron's API server; tune it as described for nova-api. The adjustable options are api_workers and metadata_workers.

- rpc_workers

In neutron's architecture, the core service and the plugins are handled by forking child processes from the main process and then creating coroutines inside them to run the handlers, so that all cores can be used concurrently. rpc_workers controls how many processes are created for RPC handling; its default is half of api_workers. Since our system is container-based, the default is sufficient.

 

5.2 Neutron DHCP Agent

The two patches here mainly affect the neutron services on the network nodes.

- Improving network port management efficiency

Patch 1

In the Neutron DHCP agent, replace the "ip route" Linux command from the oslo.rootwrap library with Pyroute2. This makes port creation and deletion by the agent more efficient. Patch link:

https://opendev.org/openstack/neutron/commit/06997136097152ea67611ec56b345e5867184df5

See appendix A1.3 for the patch.

Patch 2

In the Neutron DHCP agent, replace the "dhcp_release" Linux command from the oslo.rootwrap library with the oslo.privsep library. This makes port creation and deletion by the agent more efficient. Patch links:

https://opendev.org/openstack/neutron/commit/e332054d63cfc6a2f8f65b1b9de192ae0df9ebb3

https://opendev.org/openstack/neutron/commit/2864957ca53a346418f89cc945bba5cdcf204799

See appendix A1.4 for the patch.

 

5.3 Neutron OpenvSwitch Agent

- Improving network port handling efficiency

polling_interval

If the Neutron L2 agent is the openvswitch agent, neutron-openvswitch-agent runs an RPC loop after startup to handle port additions, deletions and changes. The polling_interval option sets the interval between loop iterations.

- Recommended setting

The default is 2, in seconds. A smaller value propagates port state faster; during live migration in particular it shortens the migration time. Setting it to 0, however, makes neutron-openvswitch-agent consume excessive CPU.

Edit the neutron-openvswitch-agent configuration file ml2_conf.ini:

[agent]

polling_interval = 1

Restart the neutron-openvswitch-agent service on the compute nodes to take effect.

Patch

In the Neutron openvswitch agent, replace the "iptables" and "ipset" commands from the oslo.rootwrap library with the oslo.privsep library. This makes network port handling by the agent more efficient. Patch links:

https://opendev.org/openstack/neutron/commit/6c75316ca0a7ee2f6513bb6bc0797678ef419d24

https://opendev.org/openstack/neutron/commit/5a419cbc84e26b4a3b1d0dbe5166c1ab83cc825b

See appendix A1.5 for the patch.

 

5.4 Reducing live migration downtime

In live migration tests on OpenStack Stein, after a VM moves to the target host its network does not answer ping immediately; there is a noticeable delay. The cause is that the VM sends RARP broadcasts right after migration succeeds, while the port backing its NIC is not yet really up. Bug reports:

https://bugs.launchpad.net/neutron/+bug/1901707

https://bugs.launchpad.net/neutron/+bug/1815989

See appendix B1.1-B1.7 for the patches, which touch both the neutron and nova modules:

https://review.opendev.org/c/openstack/neutron/+/790702

https://review.opendev.org/c/openstack/nova/+/767368

 

5.5 Network performance tuning

For stable, high network performance, when placing VMs it is best to confine the NIC's hardware interrupts and the corresponding VM to CPUs within the same cluster. This avoids unnecessary cache misses and so improves both the stability and the performance of the network.

 

5.6 VXLAN performance tuning

Mainstream tunnel networks, VXLAN included, are built on UDP. When the UDP checksum field is zero, the receiver cannot perform GRO (generic receive offload) on VXLAN packets in time, which hurts network performance badly. The issue has been fixed upstream; see:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=89e5c58fc1e2857ccdaae506fb8bc5fed57ee063

With this patch, the same VXLAN iperf3 test on a 10GbE NIC scores more than 2× higher.

See appendix C1.1 for the patch.

 

6. Keystone

- Worker processes

A parameter of the WSGIDaemonProcess directive: processes defines how many worker processes the daemon starts.

- Recommended setting

The default is 1. Under keystone load, a single WSGI process cannot handle high concurrency; raise processes appropriately. Values above the number of CPU cores bring little benefit.

Edit the keystone configuration file wsgi-keystone.conf:

WSGIDaemonProcess keystone-public processes=12 threads=1 user=keystone group=keystone display-name=%{GROUP} python-path=/var/lib/kolla/venv/lib/python2.7/site-packages

...

WSGIDaemonProcess keystone-admin processes=12 threads=1 user=keystone group=keystone display-name=%{GROUP} python-path=/var/lib/kolla/venv/lib/python2.7/site-packages

...

Restart the keystone service to take effect.

 

7. HAProxy

- Timeouts

1. timeout http-request: maximum time to wait for a complete HTTP request.

2. timeout queue: when a server has reached maxconn, new connections are added to the designated queue; a request waiting in the queue longer than timeout queue is considered unserviceable and dropped, and a 503 error is returned to the client.

3. timeout connect: timeout for establishing the connection to a backend server.

4. timeout client: maximum client-side inactivity time while sending data or awaiting a response.

5. timeout server: maximum server-side inactivity time.

- Recommended setting

Edit the HAProxy configuration file haproxy.cfg (a trailing s means seconds, m means minutes):

defaults

timeout http-request 100s

timeout queue 4m

timeout connect 100s

timeout client 10m

timeout server 10m

Restart the haproxy service to take effect.

 

- Maximum connections

HAProxy accepts a global maxconn defining the proxy's overall connection limit, a maxconn per backend service, and a maxconn per frontend port. The system's ulimit -n must be larger than maxconn.

- Recommended setting

The global maxconn defaults to 4000. During our runs the HAProxy console showed the global connection count running out, so it was raised to 40000.

Edit the HAProxy configuration file haproxy.cfg:

global

maxconn 40000

Restart the haproxy service to take effect.

 

- Worker processes

nbproc sets the number of load-balancing worker processes. OpenStack Stein ships HAProxy 1.5.18; the parameter was removed in HAProxy 2.5, where nbthread sets the thread count instead.

- Recommended setting

Since HAProxy carries a heavy load, raise this parameter moderately.

Edit the HAProxy configuration file haproxy.cfg:

global

nbproc 4

Restart the haproxy service to take effect.

 

- Weights

The HAProxy backend parameter weight sets a server's weight, from 0 to 256: the higher the weight, the more requests the server receives. A server with weight 0 is assigned no new connections. The default for all servers is 1.
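The effect of weights on the request split can be sketched as follows (a simplified weighted round-robin model, not HAProxy's actual scheduling code; the controller names match the keystone example in this chapter):

```python
def expected_share(weights, name):
    """Under weighted round-robin, a server's long-run share of requests
    is its weight over the sum of all servers' weights (weight-0 servers
    contribute nothing and receive nothing)."""
    return weights[name] / sum(weights.values())


# Weights 10/10/9: controller03 is slightly relieved, not drained.
weights = {"controller01": 10, "controller02": 10, "controller03": 9}
print(round(expected_share(weights, "controller03"), 3))
```

Lowering one weight by a single point shifts only a few percent of traffic, which is why small decrements are enough to relieve a hot control node.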

 

- Recommended setting

When load differs across the backend hosts, reduce the weight of the servers on the overloaded hosts so that the hosts' loads even out.

Take the three-node control cluster as an example: during testing, overall CPU usage on controller03 reached 95%+ while the other two control nodes sat around 70%, with keystone's CPU usage high on every control node. Reducing the weight of the keystone server on controller03 relieves its CPU pressure.

 

Edit the HAProxy configuration file keystone.cfg:

listen keystone_external

    mode http

    http-request del-header X-Forwarded-Proto

    option httplog

    option forwardfor

    http-request set-header X-Forwarded-Proto https if { ssl_fc }

    bind haproxy-ip-addr:5000

    maxconn 5000

    server controller01 server-ip-addr:5000 check inter 2000 rise 2 fall 5 maxconn 3000 weight 10

    server controller02 server-ip-addr:5000 check inter 2000 rise 2 fall 5 maxconn 3000 weight 10

    server controller03 server-ip-addr:5000 check inter 2000 rise 2 fall 5 maxconn 3000 weight 9

Restart the haproxy service to take effect.

~~~~~~~~~~~~~~~~~~

Appendix

A1.1

diff -ruN nova-bak/api/openstack/compute/migrate_server.py nova/api/openstack/compute/migrate_server.py

--- nova-bak/api/openstack/compute/migrate_server.py    2021-09-10 11:20:15.774990677 +0800

+++ nova/api/openstack/compute/migrate_server.py 2021-09-10 11:23:22.239098421 +0800

@@ -157,7 +157,9 @@

                               'conductor during pre-live-migration checks '

                               '%(ex)s', {'ex': ex})

             else:

-                raise exc.HTTPBadRequest(explanation=ex.format_message())

+               raise exc.HTTPBadRequest(explanation=ex.format_message())

+        except exception.OperationNotSupportedForSEV as e:

+            raise exc.HTTPConflict(explanation=e.format_message())

         except exception.InstanceIsLocked as e:

             raise exc.HTTPConflict(explanation=e.format_message())

         except exception.ComputeHostNotFound as e:

diff -ruN nova-bak/api/openstack/compute/suspend_server.py nova/api/openstack/compute/suspend_server.py

--- nova-bak/api/openstack/compute/suspend_server.py   2021-09-10 11:25:03.847439106 +0800

+++ nova/api/openstack/compute/suspend_server.py 2021-09-10 11:27:09.958950964 +0800

@@ -40,7 +40,8 @@

             self.compute_api.suspend(context, server)

         except exception.InstanceUnknownCell as e:

             raise exc.HTTPNotFound(explanation=e.format_message())

-        except exception.InstanceIsLocked as e:

+        except (exception.OperationNotSupportedForSEV,

+                exception.InstanceIsLocked) as e:

             raise exc.HTTPConflict(explanation=e.format_message())

         except exception.InstanceInvalidState as state_error:

             common.raise_http_conflict_for_instance_invalid_state(state_error,

diff -ruN nova-bak/compute/api.py nova/compute/api.py

--- nova-bak/compute/api.py   2021-09-10 11:31:55.278077457 +0800

+++ nova/compute/api.py 2021-09-10 15:32:28.131175652 +0800

@@ -215,6 +215,23 @@

         return fn(self, context, instance, *args, **kwargs)

     return _wrapped

 

+def reject_sev_instances(operation):

+    '''Decorator.  Raise OperationNotSupportedForSEV if instance has SEV

+    enabled.

+    '''

+

+    def outer(f):

+        @six.wraps(f)

+        def inner(self, context, instance, *args, **kw):

+            if hardware.get_mem_encryption_constraint(instance.flavor,

+                                                      instance.image_meta):

+                raise exception.OperationNotSupportedForSEV(

+                    instance_uuid=instance.uuid,

+                    operation=operation)

+            return f(self, context, instance, *args, **kw)

+        return inner

+    return outer

+

 

 def _diff_dict(orig, new):

     '''Return a dict describing how to change orig to new.  The keys

@@ -690,6 +707,9 @@

         '''

         image_meta = _get_image_meta_obj(image)

 

+        API._validate_flavor_image_mem_encryption(instance_type, image_meta)

+     

+

         # Only validate values of flavor/image so the return results of

         # following 'get' functions are not used.

         hardware.get_number_of_serial_ports(instance_type, image_meta)

@@ -701,6 +721,19 @@

         if validate_pci:

             pci_request.get_pci_requests_from_flavor(instance_type)

 

+    @staticmethod

+    def _validate_flavor_image_mem_encryption(instance_type, image):

+        '''Validate that the flavor and image don't make contradictory

+        requests regarding memory encryption.

+        :param instance_type: Flavor object

+        :param image: an ImageMeta object

+        :raises: nova.exception.FlavorImageConflict

+        '''

+        # This library function will raise the exception for us if

+        # necessary; if not, we can ignore the result returned.

+        hardware.get_mem_encryption_constraint(instance_type, image)

+

+

     def _get_image_defined_bdms(self, instance_type, image_meta,

                                 root_device_name):

         image_properties = image_meta.get('properties', {})

@@ -3915,6 +3948,7 @@

         return self.compute_rpcapi.get_instance_diagnostics(context,

                                                             instance=instance)

 

+    @reject_sev_instances(instance_actions.SUSPEND)

     @check_instance_lock

     @check_instance_cell

     @check_instance_state(vm_state=[vm_states.ACTIVE])

@@ -4699,6 +4733,7 @@

                                                      diff=diff)

         return _metadata

 

+    @reject_sev_instances(instance_actions.SUSPEND)

     @check_instance_lock

     @check_instance_cell

     @check_instance_state(vm_state=[vm_states.ACTIVE, vm_states.PAUSED])

diff -ruN nova-bak/exception.py nova/exception.py

--- nova-bak/exception.py        2021-09-10 11:35:25.491284738 +0800

+++ nova/exception.py     2021-09-10 11:36:09.799787563 +0800

@@ -536,6 +536,10 @@

     msg_fmt = _('Unable to migrate instance (%(instance_id)s) '

                 'to current host (%(host)s).')

 

+class OperationNotSupportedForSEV(NovaException):

+    msg_fmt = _("Operation '%(operation)s' not supported for SEV-enabled "

+                "instance (%(instance_uuid)s).")

+    code = 409

 

 class InvalidHypervisorType(Invalid):

     msg_fmt = _('The supplied hypervisor type of is invalid.')

diff -ruN nova-bak/objects/image_meta.py nova/objects/image_meta.py

--- nova-bak/objects/image_meta.py 2021-09-10 15:16:30.530628464 +0800

+++ nova/objects/image_meta.py      2021-09-10 15:19:26.999151245 +0800

@@ -177,6 +177,9 @@

         super(ImageMetaProps, self).obj_make_compatible(primitive,

                                                         target_version)

         target_version = versionutils.convert_version_to_tuple(target_version)

+       

+        if target_version < (1, 24):

+            primitive.pop('hw_mem_encryption', None)

         if target_version < (1, 21):

             primitive.pop('hw_time_hpet', None)

         if target_version < (1, 20):

@@ -298,6 +301,11 @@

         # is not practical to enumerate them all. So we use a free

         # form string

         'hw_machine_type': fields.StringField(),

+       

+        # boolean indicating that the guest needs to be booted with

+        # encrypted memory

+        'hw_mem_encryption': fields.FlexibleBooleanField(),

+

 

         # One of the magic strings 'small', 'any', 'large'

         # or an explicit page size in KB (eg 4, 2048, ...)

diff -ruN nova-bak/scheduler/utils.py nova/scheduler/utils.py

--- nova-bak/scheduler/utils.py 2021-09-10 15:19:58.172561042 +0800

+++ nova/scheduler/utils.py      2021-09-10 15:35:05.630393147 +0800

@@ -35,7 +35,7 @@

 from nova.objects import instance as obj_instance

 from nova import rpc

 from nova.scheduler.filters import utils as filters_utils

-

+import nova.virt.hardware as hw

 

 LOG = logging.getLogger(__name__)

 

@@ -61,6 +61,27 @@

         # Default to the configured limit but _limit can be

         # set to None to indicate 'no limit'.

         self._limit = CONF.scheduler.max_placement_results

+        image = (request_spec.image if 'image' in request_spec

+                 else objects.ImageMeta(properties=objects.ImageMetaProps()))

+        self._translate_memory_encryption(request_spec.flavor, image)

+

+    def _translate_memory_encryption(self, flavor, image):

+        '''When the hw:mem_encryption extra spec or the hw_mem_encryption

+        image property are requested, translate into a request for

+        resources:MEM_ENCRYPTION_CONTEXT=1 which requires a slot on a

+        host which can support encryption of the guest memory.

+        '''

+        # NOTE(aspiers): In theory this could raise FlavorImageConflict,

+        # but we already check it in the API layer, so that should never

+        # happen.

+        if not hw.get_mem_encryption_constraint(flavor, image):

+            # No memory encryption required, so no further action required.

+            return

+

+        self._add_resource(None, orc.MEM_ENCRYPTION_CONTEXT, 1)

+        LOG.debug('Added %s=1 to requested resources',

+                  orc.MEM_ENCRYPTION_CONTEXT)

+

 

     def __str__(self):

         return ', '.join(sorted(

diff -ruN nova-bak/virt/hardware.py nova/virt/hardware.py

--- nova-bak/virt/hardware.py  2022-02-23 10:45:42.320988102 +0800

+++ nova/virt/hardware.py        2021-09-10 14:05:25.145572630 +0800

@@ -1140,6 +1140,67 @@

 

     return flavor_policy, image_policy

 

+def get_mem_encryption_constraint(flavor, image_meta, machine_type=None):

+    '''Return a boolean indicating whether encryption of guest memory was

+    requested, either via the hw:mem_encryption extra spec or the

+    hw_mem_encryption image property (or both).

+    Also watch out for contradictory requests between the flavor and

+    image regarding memory encryption, and raise an exception where

+    encountered.  These conflicts can arise in three different ways:

+        1) the flavor requests memory encryption but the image

+           explicitly requests *not* to have memory encryption, or

+           vice-versa

+        2) the flavor and/or image request memory encryption, but the

+           image is missing hw_firmware_type=uefi

+        3) the flavor and/or image request memory encryption, but the

+           machine type is set to a value which does not contain 'q35'

+    This can be called from the libvirt driver on the compute node, in

+    which case the driver should pass the result of

+    nova.virt.libvirt.utils.get_machine_type() as the machine_type

+    parameter, or from the API layer, in which case get_machine_type()

+    cannot be called since it relies on being run from the compute

+    node in order to retrieve CONF.libvirt.hw_machine_type.

+    :param flavor: Flavor object

+    :param image_meta: an ImageMeta object

+    :param machine_type: a string representing the machine type (optional)

+    :raises: nova.exception.FlavorImageConflict

+    :raises: nova.exception.InvalidMachineType

+    :returns: boolean indicating whether encryption of guest memory

+    was requested

+    '''

+

+    flavor_mem_enc_str, image_mem_enc = _get_flavor_image_meta(

+        'mem_encryption', flavor, image_meta)

+

+    flavor_mem_enc = None

+    if flavor_mem_enc_str is not None:

+        flavor_mem_enc = strutils.bool_from_string(flavor_mem_enc_str)

+

+    # Image property is a FlexibleBooleanField, so coercion to a

+    # boolean is handled automatically

+

+    if not flavor_mem_enc and not image_mem_enc:

+        return False

+

+    _check_for_mem_encryption_requirement_conflicts(

+        flavor_mem_enc_str, flavor_mem_enc, image_mem_enc, flavor, image_meta)

+

+    # If we get this far, either the extra spec or image property explicitly

+    # specified a requirement regarding memory encryption, and if both did,

+    # they are asking for the same thing.

+    requesters = []

+    if flavor_mem_enc:

+        requesters.append('hw:mem_encryption extra spec in %s flavor' %

+                          flavor.name)

+    if image_mem_enc:

+        requesters.append('hw_mem_encryption property of image %s' %

+                          image_meta.name)

+

+    _check_mem_encryption_uses_uefi_image(requesters, image_meta)

+    _check_mem_encryption_machine_type(image_meta, machine_type)

+

+    LOG.debug('Memory encryption requested by %s', ' and '.join(requesters))

+    return True

 

 def _get_numa_pagesize_constraint(flavor, image_meta):

     '''Return the requested memory page size
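The `get_mem_encryption_constraint()` helper added above boils down to: parse the flavor extra spec as a boolean, take the already-coerced image property as-is, and reject contradictory or unbootable combinations. A simplified, self-contained sketch of that decision logic (illustrative names only, not nova's actual helpers):

```python
# Simplified sketch of the memory-encryption request check added by the
# patch above. Standalone and illustrative; nova uses strutils and
# ImageMeta objects rather than plain dicts.

def parse_bool(value):
    """Rough stand-in for oslo strutils.bool_from_string."""
    if value is None:
        return None
    return str(value).lower() in ('1', 'true', 'yes', 'on')

def mem_encryption_requested(flavor_extra_specs, image_properties):
    """Return True when the flavor extra spec or the image property asks
    for encrypted guest memory; raise on a contradictory request."""
    flavor_req = parse_bool(flavor_extra_specs.get('hw:mem_encryption'))
    image_req = image_properties.get('hw_mem_encryption')  # already boolean

    if not flavor_req and not image_req:
        return False
    # Flavor and image both set, but disagree: hard error.
    if (flavor_req is not None and image_req is not None
            and flavor_req != image_req):
        raise ValueError('flavor and image disagree on memory encryption')
    # Encrypted guests must boot with UEFI firmware.
    if image_properties.get('hw_firmware_type') != 'uefi':
        raise ValueError('memory encryption requires hw_firmware_type=uefi')
    return True
```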

A1.2

diff -ruN nova-bak/compute/manager.py nova/compute/manager.py

--- nova-bak/compute/manager.py  2021-07-07 14:40:15.570807168 +0800

+++ nova/compute/manager.py        2021-10-18 19:02:37.931655551 +0800

@@ -7013,7 +7013,8 @@

                                         migrate_data)

 

         # Detaching volumes.

-        connector = self.driver.get_volume_connector(instance)

+        connector = None

+        #connector = self.driver.get_volume_connector(instance)

         for bdm in source_bdms:

             if bdm.is_volume:

                 # Detaching volumes is a call to an external API that can fail.

@@ -7033,6 +7034,8 @@

                         # remove the volume connection without detaching from

                         # hypervisor because the instance is not running

                         # anymore on the current host

+                        if connector is None:

+                            connector = self.driver.get_volume_connector(instance)

                         self.volume_api.terminate_connection(ctxt,

                                                              bdm.volume_id,

                                                              connector)

@@ -7056,8 +7059,10 @@

 

         # Releasing vlan.

         # (not necessary in current implementation?)

-

-        network_info = self.network_api.get_instance_nw_info(ctxt, instance)

+

+        #changed by Fiona

+        #network_info = self.network_api.get_instance_nw_info(ctxt, instance)

+        network_info = instance.get_network_info()

 

         self._notify_about_instance_usage(ctxt, instance,

                                           'live_migration._post.start',
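The manager.py change above applies two micro-optimizations to post-live-migration cleanup: the volume connector is now fetched lazily (at most once, and only when a volume BDM actually needs it), and the network info cached on the instance object replaces a round-trip to the network API. The lazy-fetch pattern in isolation, as a toy sketch (the `Driver` class and names are illustrative, not nova code):

```python
# Toy illustration of the lazy volume-connector lookup in the patch above:
# defer an expensive driver call until it is actually needed, and do it
# at most once.

class Driver:
    def __init__(self):
        self.connector_calls = 0

    def get_volume_connector(self, instance):
        self.connector_calls += 1          # stands in for expensive host introspection
        return {'host': instance}

def detach_volumes(driver, instance, bdms):
    connector = None                        # previously fetched unconditionally
    for bdm in bdms:
        if bdm.get('is_volume'):
            if connector is None:           # fetch lazily, only on first need
                connector = driver.get_volume_connector(instance)
            # ... volume_api.terminate_connection(bdm, connector) ...
    return driver.connector_calls
```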

A1.3

diff -ruN neutron-bak/agent/l3/router_info.py neutron-iproute/agent/l3/router_info.py

--- neutron-bak/agent/l3/router_info.py    2020-12-14 18:00:23.683687327 +0800

+++ neutron-iproute/agent/l3/router_info.py     2022-02-23 15:18:15.650669589 +0800

@@ -748,8 +748,10 @@

         for ip_version in (lib_constants.IP_VERSION_4,

                            lib_constants.IP_VERSION_6):

             gateway = device.route.get_gateway(ip_version=ip_version)

-            if gateway and gateway.get('gateway'):

-                current_gateways.add(gateway.get('gateway'))

+#            if gateway and gateway.get('gateway'):

+#                current_gateways.add(gateway.get('gateway'))

+            if gateway and gateway.get('via'):

+                current_gateways.add(gateway.get('via'))

         for ip in current_gateways - set(gateway_ips):

             device.route.delete_gateway(ip)

         for ip in gateway_ips:

diff -ruN neutron-bak/agent/linux/ip_lib.py neutron-iproute/agent/linux/ip_lib.py

--- neutron-bak/agent/linux/ip_lib.py 2020-12-14 18:03:47.951878754 +0800

+++ neutron-iproute/agent/linux/ip_lib.py 2022-02-23 15:19:03.981457532 +0800

@@ -48,6 +48,8 @@

                   'main': 254,

                   'local': 255}

 

+IP_RULE_TABLES_NAMES = {v: k for k, v in IP_RULE_TABLES.items()}

+

 # Rule indexes: pyroute2.netlink.rtnl

 # Rule names: https://www.systutorials.com/docs/linux/man/8-ip-rule/

 # NOTE(ralonsoh): 'masquerade' type is printed as 'nat' in 'ip rule' command

@@ -592,14 +594,18 @@

     def _dev_args(self):

         return ['dev', self.name] if self.name else []

 

-    def add_gateway(self, gateway, metric=None, table=None):

-        ip_version = common_utils.get_ip_version(gateway)

-        args = ['replace', 'default', 'via', gateway]

-        if metric:

-            args += ['metric', metric]

-        args += self._dev_args()

-        args += self._table_args(table)

-        self._as_root([ip_version], tuple(args))

+#    def add_gateway(self, gateway, metric=None, table=None):

+#        ip_version = common_utils.get_ip_version(gateway)

+#        args = ['replace', 'default', 'via', gateway]

+#        if metric:

+#            args += ['metric', metric]

+#        args += self._dev_args()

+#        args += self._table_args(table)

+#        self._as_root([ip_version], tuple(args))

+

+    def add_gateway(self, gateway, metric=None, table=None, scope='global'):

+        self.add_route(None, via=gateway, table=table, metric=metric,

+                       scope=scope)

 

     def _run_as_root_detect_device_not_found(self, options, args):

         try:

@@ -618,41 +624,16 @@

         args += self._table_args(table)

         self._run_as_root_detect_device_not_found([ip_version], args)

 

-    def _parse_routes(self, ip_version, output, **kwargs):

-        for line in output.splitlines():

-            parts = line.split()

-

-            # Format of line is: '|default [] ...'

-            route = {k: v for k, v in zip(parts[1::2], parts[2::2])}

-            route['cidr'] = parts[0]

-            # Avoids having to explicitly pass around the IP version

-            if route['cidr'] == 'default':

-                route['cidr'] = constants.IP_ANY[ip_version]

-

-            # ip route drops things like scope and dev from the output if it

-            # was specified as a filter.  This allows us to add them back.

-            if self.name:

-                route['dev'] = self.name

-            if self._table:

-                route['table'] = self._table

-            # Callers add any filters they use as kwargs

-            route.update(kwargs)

-

-            yield route

-

-    def list_routes(self, ip_version, **kwargs):

-        args = ['list']

-        args += self._dev_args()

-        args += self._table_args()

-        for k, v in kwargs.items():

-            args += [k, v]

-

-        output = self._run([ip_version], tuple(args))

-        return [r for r in self._parse_routes(ip_version, output, **kwargs)]

+    def list_routes(self, ip_version, scope=None, via=None, table=None,

+                    **kwargs):

+        table = table or self._table

+        return list_ip_routes(self._parent.namespace, ip_version, scope=scope,

+                              via=via, table=table, device=self.name, **kwargs)

 

     def list_onlink_routes(self, ip_version):

         routes = self.list_routes(ip_version, scope='link')

-        return [r for r in routes if 'src' not in r]

+#        return [r for r in routes if 'src' not in r]

+        return [r for r in routes if not r['source_prefix']]

 

     def add_onlink_route(self, cidr):

         self.add_route(cidr, scope='link')

@@ -660,34 +641,12 @@

     def delete_onlink_route(self, cidr):

         self.delete_route(cidr, scope='link')

 

-    def get_gateway(self, scope=None, filters=None, ip_version=None):

-        options = [ip_version] if ip_version else []

-

-        args = ['list']

-        args += self._dev_args()

-        args += self._table_args()

-        if filters:

-            args += filters

-

-        retval = None

-

-        if scope:

-            args += ['scope', scope]

-

-        route_list_lines = self._run(options, tuple(args)).split('\n')

-        default_route_line = next((x.strip() for x in

-                                   route_list_lines if

-                                   x.strip().startswith('default')), None)

-        if default_route_line:

-            retval = dict()

-            gateway = DEFAULT_GW_PATTERN.search(default_route_line)

-            if gateway:

-                retval.update(gateway=gateway.group(1))

-            metric = METRIC_PATTERN.search(default_route_line)

-            if metric:

-                retval.update(metric=int(metric.group(1)))

-

-        return retval

+    def get_gateway(self, scope=None, table=None,

+                    ip_version=constants.IP_VERSION_4):

+        routes = self.list_routes(ip_version, scope=scope, table=table)

+        for route in routes:

+            if route['via'] and route['cidr'] in constants.IP_ANY.values():

+                return route

 

     def flush(self, ip_version, table=None, **kwargs):

         args = ['flush']

@@ -696,16 +655,11 @@

             args += [k, v]

         self._as_root([ip_version], tuple(args))

 

-    def add_route(self, cidr, via=None, table=None, **kwargs):

-        ip_version = common_utils.get_ip_version(cidr)

-        args = ['replace', cidr]

-        if via:

-            args += ['via', via]

-        args += self._dev_args()

-        args += self._table_args(table)

-        for k, v in kwargs.items():

-            args += [k, v]

-        self._run_as_root_detect_device_not_found([ip_version], args)

+    def add_route(self, cidr, via=None, table=None, metric=None, scope=None,

+                  **kwargs):

+        table = table or self._table

+        add_ip_route(self._parent.namespace, cidr, device=self.name, via=via,

+                     table=table, metric=metric, scope=scope, **kwargs)

 

     def delete_route(self, cidr, via=None, table=None, **kwargs):

         ip_version = common_utils.get_ip_version(cidr)

@@ -1455,3 +1409,53 @@

                 retval[device['vxlan_link_index']]['name'])

 

     return list(retval.values())

+

+def add_ip_route(namespace, cidr, device=None, via=None, table=None,

+                 metric=None, scope=None, **kwargs):

+    '''Add an IP route'''

+    if table:

+        table = IP_RULE_TABLES.get(table, table)

+    ip_version = common_utils.get_ip_version(cidr or via)

+    privileged.add_ip_route(namespace, cidr, ip_version,

+                            device=device, via=via, table=table,

+                            metric=metric, scope=scope, **kwargs)

+

+

+def list_ip_routes(namespace, ip_version, scope=None, via=None, table=None,

+                   device=None, **kwargs):

+    '''List IP routes'''

+    def get_device(index, devices):

+        for device in (d for d in devices if d['index'] == index):

+            return get_attr(device, 'IFLA_IFNAME')

+

+    table = table if table else 'main'

+    table = IP_RULE_TABLES.get(table, table)

+    routes = privileged.list_ip_routes(namespace, ip_version, device=device,

+                                       table=table, **kwargs)

+    devices = privileged.get_link_devices(namespace)

+    ret = []

+    for route in routes:

+        cidr = get_attr(route, 'RTA_DST')

+        if cidr:

+            cidr = '%s/%s' % (cidr, route['dst_len'])

+        else:

+            cidr = constants.IP_ANY[ip_version]

+        table = int(get_attr(route, 'RTA_TABLE'))

+        value = {

+            'table': IP_RULE_TABLES_NAMES.get(table, table),

+            'source_prefix': get_attr(route, 'RTA_PREFSRC'),

+            'cidr': cidr,

+            'scope': IP_ADDRESS_SCOPE[int(route['scope'])],

+            'device': get_device(int(get_attr(route, 'RTA_OIF')), devices),

+            'via': get_attr(route, 'RTA_GATEWAY'),

+            'priority': get_attr(route, 'RTA_PRIORITY'),

+        }

+

+        ret.append(value)

+

+    if scope:

+        ret = [route for route in ret if route['scope'] == scope]

+    if via:

+        ret = [route for route in ret if route['via'] == via]

+

+    return ret

diff -ruN neutron-bak/cmd/sanity/checks.py neutron-iproute/cmd/sanity/checks.py

--- neutron-bak/cmd/sanity/checks.py       2022-02-23 11:33:16.934132708 +0800

+++ neutron-iproute/cmd/sanity/checks.py        2022-02-23 15:20:10.562018672 +0800

@@ -36,6 +36,7 @@

 from neutron.common import utils as common_utils

 from neutron.plugins.ml2.drivers.openvswitch.agent.common \

     import constants as ovs_const

+from neutron.privileged.agent.linux import dhcp as priv_dhcp

 

 LOG = logging.getLogger(__name__)

 

@@ -230,8 +231,8 @@

 

 

 def dhcp_release6_supported():

-    return runtime_checks.dhcp_release6_supported()

-

+#    return runtime_checks.dhcp_release6_supported()

+    return priv_dhcp.dhcp_release6_supported()

 

 def bridge_firewalling_enabled():

     for proto in ('arp', 'ip', 'ip6'):

@@ -363,7 +364,8 @@

 

             default_gw = gw_dev.route.get_gateway(ip_version=6)

             if default_gw:

-                default_gw = default_gw['gateway']

+#                default_gw = default_gw['gateway']

+                default_gw = default_gw['via']

 

     return expected_default_gw == default_gw

 

diff -ruN neutron-bak/privileged/agent/linux/ip_lib.py neutron-iproute/privileged/agent/linux/ip_lib.py

--- neutron-bak/privileged/agent/linux/ip_lib.py 2020-12-14 18:26:08.339307939 +0800

+++ neutron-iproute/privileged/agent/linux/ip_lib.py 2022-02-23 15:20:39.477439105 +0800

@@ -634,3 +634,50 @@

         if e.errno == errno.ENOENT:

             raise NetworkNamespaceNotFound(netns_name=namespace)

         raise

+

+@privileged.default.entrypoint

+@lockutils.synchronized('privileged-ip-lib')

+def add_ip_route(namespace, cidr, ip_version, device=None, via=None,

+                 table=None, metric=None, scope=None, **kwargs):

+    '''Add an IP route'''

+    try:

+        with get_iproute(namespace) as ip:

+            family = _IP_VERSION_FAMILY_MAP[ip_version]

+            if not scope:

+                scope = 'global' if via else 'link'

+            scope = _get_scope_name(scope)

+            if cidr:

+                kwargs['dst'] = cidr

+            if via:

+                kwargs['gateway'] = via

+            if table:

+                kwargs['table'] = int(table)

+            if device:

+                kwargs['oif'] = get_link_id(device, namespace)

+            if metric:

+                kwargs['priority'] = int(metric)

+            ip.route('replace', family=family, scope=scope, proto='static',

+                     **kwargs)

+    except OSError as e:

+        if e.errno == errno.ENOENT:

+            raise NetworkNamespaceNotFound(netns_name=namespace)

+        raise

+

+

+@privileged.default.entrypoint

+@lockutils.synchronized('privileged-ip-lib')

+def list_ip_routes(namespace, ip_version, device=None, table=None, **kwargs):

+    '''List IP routes'''

+    try:

+        with get_iproute(namespace) as ip:

+            family = _IP_VERSION_FAMILY_MAP[ip_version]

+            if table:

+                kwargs['table'] = table

+            if device:

+                kwargs['oif'] = get_link_id(device, namespace)

+            return make_serializable(ip.route('show', family=family, **kwargs))

+    except OSError as e:

+        if e.errno == errno.ENOENT:

+            raise NetworkNamespaceNotFound(netns_name=namespace)

+        raise

+
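The new `list_ip_routes()` above replaces text parsing of `ip route` output with pyroute2 netlink messages, whose attributes arrive as `(name, value)` pairs. A toy version of that attribute unpacking (simplified to two output fields; neutron's real helper also resolves table names, scopes and devices):

```python
# Toy version of the netlink-attribute unpacking used by list_ip_routes()
# in the patch above. A route message is a dict with an 'attrs' list of
# (name, value) pairs, mirroring pyroute2's serialized form.

IP_ANY = {4: '0.0.0.0/0', 6: '::/0'}

def get_attr(msg, name):
    """Return the value of a netlink attribute, or None if absent."""
    return dict(msg.get('attrs', [])).get(name)

def route_to_dict(route, ip_version):
    cidr = get_attr(route, 'RTA_DST')
    # A route without RTA_DST is the default route for that family.
    cidr = '%s/%s' % (cidr, route['dst_len']) if cidr else IP_ANY[ip_version]
    return {'cidr': cidr, 'via': get_attr(route, 'RTA_GATEWAY')}
```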

A1.4

diff -ruN neutron-bak/agent/linux/dhcp.py neutron-dhcprelease/agent/linux/dhcp.py

--- neutron-bak/agent/linux/dhcp.py 2020-12-15 09:59:29.966957908 +0800

+++ neutron-dhcprelease/agent/linux/dhcp.py  2022-02-23 15:10:14.169101010 +0800

@@ -25,6 +25,7 @@

 from neutron_lib import constants

 from neutron_lib import exceptions

 from neutron_lib.utils import file as file_utils

+from oslo_concurrency import processutils

 from oslo_log import log as logging

 from oslo_utils import excutils

 from oslo_utils import fileutils

@@ -41,6 +42,7 @@

 from neutron.common import ipv6_utils

 from neutron.common import utils as common_utils

 from neutron.ipam import utils as ipam_utils

+from neutron.privileged.agent.linux import dhcp as priv_dhcp

 

 LOG = logging.getLogger(__name__)

 

@@ -476,7 +478,8 @@

 

     def _is_dhcp_release6_supported(self):

         if self._IS_DHCP_RELEASE6_SUPPORTED is None:

-            self._IS_DHCP_RELEASE6_SUPPORTED = checks.dhcp_release6_supported()

+            self._IS_DHCP_RELEASE6_SUPPORTED = (

+                priv_dhcp.dhcp_release6_supported())

             if not self._IS_DHCP_RELEASE6_SUPPORTED:

                 LOG.warning('dhcp_release6 is not present on this system, '

                             'will not call it again.')

@@ -485,24 +488,28 @@

     def _release_lease(self, mac_address, ip, ip_version, client_id=None,

                        server_id=None, iaid=None):

         '''Release a DHCP lease.'''

-        if ip_version == constants.IP_VERSION_6:

-            if not self._is_dhcp_release6_supported():

-                return

-            cmd = ['dhcp_release6', '--iface', self.interface_name,

-                   '--ip', ip, '--client-id', client_id,

-                   '--server-id', server_id, '--iaid', iaid]

-        else:

-            cmd = ['dhcp_release', self.interface_name, ip, mac_address]

-            if client_id:

-                cmd.append(client_id)

-        ip_wrapper = ip_lib.IPWrapper(namespace=self.network.namespace)

         try:

-            ip_wrapper.netns.execute(cmd, run_as_root=True)

-        except RuntimeError as e:

+            if ip_version == constants.IP_VERSION_6:

+                if not self._is_dhcp_release6_supported():

+                    return

+

+                params = {'interface_name': self.interface_name,

+                          'ip_address': ip, 'client_id': client_id,

+                          'server_id': server_id, 'iaid': iaid,

+                          'namespace': self.network.namespace}

+                priv_dhcp.dhcp_release6(**params)

+            else:

+                params = {'interface_name': self.interface_name,

+                          'ip_address': ip, 'mac_address': mac_address,

+                          'client_id': client_id,

+                          'namespace': self.network.namespace}

+#                LOG.info('Rock_DEBUG: DHCP release construct params %(params)s.', {'params': params})

+                priv_dhcp.dhcp_release(**params)

+        except (processutils.ProcessExecutionError, OSError) as e:

             # when failed to release single lease there's

             # no need to propagate error further

-            LOG.warning('DHCP release failed for %(cmd)s. '

-                        'Reason: %(e)s', {'cmd': cmd, 'e': e})

+            LOG.warning('DHCP release failed for params %(params)s. '

+                        'Reason: %(e)s', {'params': params, 'e': e})

 

     def _output_config_files(self):

         self._output_hosts_file()
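The reworked `_release_lease()` above no longer spawns `dhcp_release`/`dhcp_release6` through `ip netns exec`; it builds a per-IP-version keyword dict and hands it to a privsep helper. A sketch of just the parameter-dict construction (an illustrative standalone function, not the neutron method):

```python
# Illustrative sketch of the lease-release parameter handling in the patch
# above: IPv6 releases need client/server IDs and an IAID, IPv4 releases
# need the MAC address instead.

def build_release_params(ip_version, iface, ip, mac=None, client_id=None,
                         server_id=None, iaid=None, namespace=None):
    if ip_version == 6:
        return {'interface_name': iface, 'ip_address': ip,
                'client_id': client_id, 'server_id': server_id,
                'iaid': iaid, 'namespace': namespace}
    return {'interface_name': iface, 'ip_address': ip,
            'mac_address': mac, 'client_id': client_id,
            'namespace': namespace}
```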

diff -ruN neutron-bak/cmd/sanity/checks.py neutron-dhcprelease/cmd/sanity/checks.py

--- neutron-bak/cmd/sanity/checks.py       2022-02-23 11:33:16.934132708 +0800

+++ neutron-dhcprelease/cmd/sanity/checks.py 2022-02-23 15:11:07.536446402 +0800

@@ -36,6 +36,7 @@

 from neutron.common import utils as common_utils

 from neutron.plugins.ml2.drivers.openvswitch.agent.common \

     import constants as ovs_const

+from neutron.privileged.agent.linux import dhcp as priv_dhcp

 

 LOG = logging.getLogger(__name__)

 

@@ -230,8 +231,8 @@

 

 

 def dhcp_release6_supported():

-    return runtime_checks.dhcp_release6_supported()

-

+#    return runtime_checks.dhcp_release6_supported()

+    return priv_dhcp.dhcp_release6_supported()

 

 def bridge_firewalling_enabled():

     for proto in ('arp', 'ip', 'ip6'):

@@ -363,7 +364,8 @@

 

             default_gw = gw_dev.route.get_gateway(ip_version=6)

             if default_gw:

-                default_gw = default_gw['gateway']

+#                default_gw = default_gw['gateway']

+                default_gw = default_gw['via']

 

     return expected_default_gw == default_gw

 

diff -ruN neutron-bak/privileged/__init__.py neutron-dhcprelease/privileged/__init__.py

--- neutron-bak/privileged/__init__.py        2020-04-23 14:45:14.000000000 +0800

+++ neutron-dhcprelease/privileged/__init__.py 2022-02-23 15:10:29.209584186 +0800

@@ -27,3 +27,11 @@

                   caps.CAP_DAC_OVERRIDE,

                   caps.CAP_DAC_READ_SEARCH],

 )

+

+dhcp_release_cmd = priv_context.PrivContext(

+    __name__,

+    cfg_section='privsep_dhcp_release',

+    pypath=__name__ + '.dhcp_release_cmd',

+    capabilities=[caps.CAP_SYS_ADMIN,

+                  caps.CAP_NET_ADMIN]

+)

A1.5

diff -ruN neutron-bak/agent/linux/ipset_manager.py neutron/agent/linux/ipset_manager.py

--- neutron-bak/agent/linux/ipset_manager.py  2022-02-16 15:11:40.419016919 +0800

+++ neutron/agent/linux/ipset_manager.py        2022-02-16 15:17:02.328133786 +0800

@@ -146,7 +146,7 @@

             cmd_ns.extend(['ip', 'netns', 'exec', self.namespace])

         cmd_ns.extend(cmd)

         self.execute(cmd_ns, run_as_root=True, process_input=input,

-                     check_exit_code=fail_on_errors)

+                     check_exit_code=fail_on_errors, privsep_exec=True)

 

     def _get_new_set_ips(self, set_name, expected_ips):

         new_member_ips = (set(expected_ips) -

diff -ruN neutron-bak/agent/linux/iptables_manager.py neutron/agent/linux/iptables_manager.py

--- neutron-bak/agent/linux/iptables_manager.py      2022-02-16 15:05:53.853147520 +0800

+++ neutron/agent/linux/iptables_manager.py   2021-07-07 14:59:16.000000000 +0800

@@ -475,12 +475,15 @@

         args = ['iptables-save', '-t', table]

         if self.namespace:

             args = ['ip', 'netns', 'exec', self.namespace] + args

-        return self.execute(args, run_as_root=True).split('\n')

+        #return self.execute(args, run_as_root=True).split('\n')

+        return self.execute(args, run_as_root=True,

+                            privsep_exec=True).split('\n')

 

     def _get_version(self):

         # Output example is 'iptables v1.6.2'

         args = ['iptables', '--version']

-        version = str(self.execute(args, run_as_root=True).split()[1][1:])

+        #version = str(self.execute(args, run_as_root=True).split()[1][1:])

+        version = str(self.execute(args, run_as_root=True, privsep_exec=True).split()[1][1:])

         LOG.debug('IPTables version installed: %s', version)

         return version

 

@@ -505,8 +508,10 @@

             args += ['-w', self.xlock_wait_time, '-W', XLOCK_WAIT_INTERVAL]

         try:

             kwargs = {} if lock else {'log_fail_as_error': False}

+            #self.execute(args, process_input='\n'.join(commands),

+            #             run_as_root=True, **kwargs)

             self.execute(args, process_input='\n'.join(commands),

-                         run_as_root=True, **kwargs)

+                         run_as_root=True, privsep_exec=True, **kwargs)

         except RuntimeError as error:

             return error

 

@@ -568,7 +573,8 @@

             if self.namespace:

                 args = ['ip', 'netns', 'exec', self.namespace] + args

             try:

-                save_output = self.execute(args, run_as_root=True)

+                #save_output = self.execute(args, run_as_root=True)

+                save_output = self.execute(args, run_as_root=True, privsep_exec=True)

             except RuntimeError:

                 # We could be racing with a cron job deleting namespaces.

                 # It is useless to try to apply iptables rules over and

@@ -769,7 +775,8 @@

                 args.append('-Z')

             if self.namespace:

                 args = ['ip', 'netns', 'exec', self.namespace] + args

-            current_table = self.execute(args, run_as_root=True)

+            #current_table = self.execute(args, run_as_root=True)

+            current_table = self.execute(args, run_as_root=True, privsep_exec=True)

             current_lines = current_table.split('\n')

 

             for line in current_lines[2:]:

diff -ruN neutron-bak/agent/linux/utils.py neutron/agent/linux/utils.py

--- neutron-bak/agent/linux/utils.py 2022-02-16 15:06:03.133090388 +0800

+++ neutron/agent/linux/utils.py       2021-07-08 09:34:12.000000000 +0800

@@ -38,6 +38,7 @@

 from neutron.agent.linux import xenapi_root_helper

 from neutron.common import utils

 from neutron.conf.agent import common as config

+from neutron.privileged.agent.linux import utils as priv_utils

 from neutron import wsgi

 

 

@@ -85,13 +86,24 @@

     if run_as_root:

         cmd = shlex.split(config.get_root_helper(cfg.CONF)) + cmd

     LOG.debug('Running command: %s', cmd)

-    obj = utils.subprocess_popen(cmd, shell=False,

-                                 stdin=subprocess.PIPE,

-                                 stdout=subprocess.PIPE,

-                                 stderr=subprocess.PIPE)

+    #obj = utils.subprocess_popen(cmd, shell=False,

+    #                             stdin=subprocess.PIPE,

+    #                             stdout=subprocess.PIPE,

+    #                             stderr=subprocess.PIPE)

+    obj = subprocess.Popen(cmd, shell=False, stdin=subprocess.PIPE,

+                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

 

     return obj, cmd

 

+def _execute_process(cmd, _process_input, addl_env, run_as_root):

+    obj, cmd = create_process(cmd, run_as_root=run_as_root, addl_env=addl_env)

+    _stdout, _stderr = obj.communicate(_process_input)

+    returncode = obj.returncode

+    obj.stdin.close()

+    _stdout = helpers.safe_decode_utf8(_stdout)

+    _stderr = helpers.safe_decode_utf8(_stderr)

+    return _stdout, _stderr, returncode

+

 

 def execute_rootwrap_daemon(cmd, process_input, addl_env):

     cmd = list(map(str, addl_env_args(addl_env) + cmd))

@@ -103,31 +115,45 @@

     LOG.debug('Running command (rootwrap daemon): %s', cmd)

     client = RootwrapDaemonHelper.get_client()

     try:

-        return client.execute(cmd, process_input)

+        #return client.execute(cmd, process_input)

+        returncode, _stdout, _stderr = client.execute(cmd, process_input)

     except Exception:

         with excutils.save_and_reraise_exception():

             LOG.error('Rootwrap error running command: %s', cmd)

+    _stdout = helpers.safe_decode_utf8(_stdout)

+    _stderr = helpers.safe_decode_utf8(_stderr)

+    return _stdout, _stderr, returncode

 

 

 def execute(cmd, process_input=None, addl_env=None,

             check_exit_code=True, return_stderr=False, log_fail_as_error=True,

-            extra_ok_codes=None, run_as_root=False):

+            extra_ok_codes=None, run_as_root=False, privsep_exec=False):

     try:

         if process_input is not None:

             _process_input = encodeutils.to_utf8(process_input)

         else:

             _process_input = None

-        if run_as_root and cfg.CONF.AGENT.root_helper_daemon:

-            returncode, _stdout, _stderr = (

-                execute_rootwrap_daemon(cmd, process_input, addl_env))

+        #if run_as_root and cfg.CONF.AGENT.root_helper_daemon:

+        #    returncode, _stdout, _stderr = (

+        #        execute_rootwrap_daemon(cmd, process_input, addl_env))

+        #else:

+        #    obj, cmd = create_process(cmd, run_as_root=run_as_root,

+        #                              addl_env=addl_env)

+        #    _stdout, _stderr = obj.communicate(_process_input)

+        #    returncode = obj.returncode

+        #    obj.stdin.close()

+        #_stdout = helpers.safe_decode_utf8(_stdout)

+        #_stderr = helpers.safe_decode_utf8(_stderr)

+

+        if run_as_root and privsep_exec:

+            _stdout, _stderr, returncode = priv_utils.execute_process(

+                cmd, _process_input, addl_env)

+        elif run_as_root and cfg.CONF.AGENT.root_helper_daemon:

+            _stdout, _stderr, returncode = execute_rootwrap_daemon(

+                cmd, process_input, addl_env)

         else:

-            obj, cmd = create_process(cmd, run_as_root=run_as_root,

-                                      addl_env=addl_env)

-            _stdout, _stderr = obj.communicate(_process_input)

-            returncode = obj.returncode

-            obj.stdin.close()

-        _stdout = helpers.safe_decode_utf8(_stdout)

-        _stderr = helpers.safe_decode_utf8(_stderr)

+            _stdout, _stderr, returncode = _execute_process(

+                cmd, _process_input, addl_env, run_as_root)

 

         extra_ok_codes = extra_ok_codes or []

         if returncode and returncode not in extra_ok_codes:
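With the `privsep_exec` flag added above, `execute()` now picks an execution path in a fixed order: the privsep helper first, then the rootwrap daemon, then a plain subprocess. The dispatch order in isolation (string labels stand in for the three real helpers):

```python
# Minimal sketch of the dispatch order introduced in execute() above.

def choose_executor(run_as_root, privsep_exec, daemon_enabled):
    """Return which backend the patched execute() would use."""
    if run_as_root and privsep_exec:
        return 'privsep'                 # priv_utils.execute_process
    if run_as_root and daemon_enabled:
        return 'rootwrap-daemon'         # execute_rootwrap_daemon
    return 'subprocess'                  # _execute_process
```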

diff -ruN neutron-bak/cmd/ipset_cleanup.py neutron/cmd/ipset_cleanup.py

--- neutron-bak/cmd/ipset_cleanup.py      2022-02-16 15:18:00.727786180 +0800

+++ neutron/cmd/ipset_cleanup.py   2021-07-07 15:00:03.000000000 +0800

@@ -38,7 +38,8 @@

 def remove_iptables_reference(ipset):

     # Remove any iptables reference to this IPset

     cmd = ['iptables-save'] if 'IPv4' in ipset else ['ip6tables-save']

-    iptables_save = utils.execute(cmd, run_as_root=True)

+    #iptables_save = utils.execute(cmd, run_as_root=True)

+    iptables_save = utils.execute(cmd, run_as_root=True, privsep_exec=True)

 

     if ipset in iptables_save:

         cmd = ['iptables'] if 'IPv4' in ipset else ['ip6tables']

@@ -50,7 +51,8 @@

                 params = rule.split()

                 params[0] = '-D'

                 try:

-                    utils.execute(cmd + params, run_as_root=True)

+                    #utils.execute(cmd + params, run_as_root=True)

+                    utils.execute(cmd + params, run_as_root=True, privsep_exec=True)

                 except Exception:

                     LOG.exception('Error, unable to remove iptables rule '

                                   'for IPset: %s', ipset)

@@ -65,7 +67,8 @@

     LOG.info('Destroying IPset: %s', ipset)

     cmd = ['ipset', 'destroy', ipset]

     try:

-        utils.execute(cmd, run_as_root=True)

+        #utils.execute(cmd, run_as_root=True)

+        utils.execute(cmd, run_as_root=True, privsep_exec=True)

     except Exception:

         LOG.exception('Error, unable to destroy IPset: %s', ipset)

 

@@ -75,7 +78,8 @@

     LOG.info('Destroying IPsets with prefix: %s', conf.prefix)

 

     cmd = ['ipset', '-L', '-n']

-    ipsets = utils.execute(cmd, run_as_root=True)

+    #ipsets = utils.execute(cmd, run_as_root=True)

+    ipsets = utils.execute(cmd, run_as_root=True, privsep_exec=True)

     for ipset in ipsets.split('\n'):

         if conf.allsets or ipset.startswith(conf.prefix):

             destroy_ipset(conf, ipset)

diff -ruN neutron-bak/privileged/agent/linux/utils.py neutron/privileged/agent/linux/utils.py

--- neutron-bak/privileged/agent/linux/utils.py  1970-01-01 08:00:00.000000000 +0800

+++ neutron/privileged/agent/linux/utils.py        2021-07-07 14:58:21.000000000 +0800

@@ -0,0 +1,82 @@

+# Copyright 2020 Red Hat, Inc.

+#

+#    Licensed under the Apache License, Version 2.0 (the 'License'); you may

+#    not use this file except in compliance with the License. You may obtain

+#    a copy of the License at

+#

+#         http://www.apache.org/licenses/LICENSE-2.0

+#

+#    Unless required by applicable law or agreed to in writing, software

+#    distributed under the License is distributed on an 'AS IS' BASIS, WITHOUT

+#    WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the

+#    License for the specific language governing permissions and limitations

+#    under the License.

+

+import os

+import re

+

+from eventlet.green import subprocess

+from neutron_lib.utils import helpers

+from oslo_concurrency import processutils

+from oslo_utils import fileutils

+

+from neutron import privileged

+

+

+NETSTAT_PIDS_REGEX = re.compile(r'.* (?P<pid>\d{2,6})/.*')

+

+

+@privileged.default.entrypoint

+def find_listen_pids_namespace(namespace):

+    return _find_listen_pids_namespace(namespace)

+

+

+def _find_listen_pids_namespace(namespace):

+    '''Retrieve a list of pids of listening processes within the given netns

+    This method is implemented separately to allow unit testing.

+    '''

+    pids = set()

+    cmd = ['ip', 'netns', 'exec', namespace, 'netstat', '-nlp']

+    output = processutils.execute(*cmd)

+    for line in output[0].splitlines():

+        m = NETSTAT_PIDS_REGEX.match(line)

+        if m:

+            pids.add(m.group('pid'))

+    return list(pids)

+

+

+@privileged.default.entrypoint

+def delete_if_exists(path, remove=os.unlink):

+    fileutils.delete_if_exists(path, remove=remove)

+

+

+@privileged.default.entrypoint

+def execute_process(cmd, _process_input, addl_env):

+    obj, cmd = _create_process(cmd, addl_env=addl_env)

+    _stdout, _stderr = obj.communicate(_process_input)

+    returncode = obj.returncode

+    obj.stdin.close()

+    _stdout = helpers.safe_decode_utf8(_stdout)

+    _stderr = helpers.safe_decode_utf8(_stderr)

+    return _stdout, _stderr, returncode

+

+

+def _addl_env_args(addl_env):

+    '''Build arguments for adding additional environment vars with env'''

+

+    # NOTE (twilson) If using rootwrap, an EnvFilter should be set up for the

+    # command instead of a CommandFilter.

+    if addl_env is None:

+        return []

+    return ['env'] + ['%s=%s' % pair for pair in addl_env.items()]

+

+

+def _create_process(cmd, addl_env=None):

+    '''Create a process object for the given command.

+    The return value will be a tuple of the process object and the

+    list of command arguments used to create it.

+    '''

+    cmd = list(map(str, _addl_env_args(addl_env) + list(cmd)))

+    obj = subprocess.Popen(cmd, shell=False, stdin=subprocess.PIPE,

+                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)

+    return obj, cmd

B1.1

--- neutron-bak/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py  2022-08-02 17:02:51.213224245 +0800

+++ neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py 2022-08-02 17:02:09.181883012 +0800

@@ -161,8 +161,8 @@

         self.enable_distributed_routing = agent_conf.enable_distributed_routing

         self.arp_responder_enabled = agent_conf.arp_responder and self.l2_pop

 

-        host = self.conf.host

-        self.agent_id = 'ovs-agent-%s' % host

+        self.host = self.conf.host

+        self.agent_id = 'ovs-agent-%s' % self.host

 

         self.enable_tunneling = bool(self.tunnel_types)

 

@@ -245,7 +245,7 @@

             self.phys_ofports,

             self.patch_int_ofport,

             self.patch_tun_ofport,

-            host,

+            self.host,

             self.enable_tunneling,

             self.enable_distributed_routing,

             self.arp_responder_enabled)

@@ -289,7 +289,7 @@

         #                  or which are used by specific extensions.

         self.agent_state = {

             'binary': 'neutron-openvswitch-agent',

-            'host': host,

+            'host': self.host,

             'topic': n_const.L2_AGENT_TOPIC,

             'configurations': {'bridge_mappings': self.bridge_mappings,

                                c_const.RP_BANDWIDTHS: self.rp_bandwidths,

@@ -1671,6 +1671,7 @@

         skipped_devices = []

         need_binding_devices = []

         binding_no_activated_devices = set()

+        migrating_devices = set()

         agent_restarted = self.iter_num == 0

         devices_details_list = (

             self.plugin_rpc.get_devices_details_list_and_failed_devices(

@@ -1696,6 +1697,12 @@

                 skipped_devices.append(device)

                 continue

 

+            migrating_to = details.get('migrating_to')

+            if migrating_to and migrating_to != self.host:

+                LOG.info('Port %(device)s is being migrated to host %(host)s.',

+                         {'device': device, 'host': migrating_to})

+                migrating_devices.add(device)

+

             if 'port_id' in details:

                 LOG.info('Port %(device)s updated. Details: %(details)s',

                          {'device': device, 'details': details})

@@ -1729,7 +1736,7 @@

                 if (port and port.ofport != -1):

                     self.port_dead(port)

         return (skipped_devices, binding_no_activated_devices,

-                need_binding_devices, failed_devices)

+                need_binding_devices, failed_devices, migrating_devices)

 

     def _update_port_network(self, port_id, network_id):

         self._clean_network_ports(port_id)

@@ -1821,10 +1828,12 @@

         need_binding_devices = []

         skipped_devices = set()

         binding_no_activated_devices = set()

+        migrating_devices = set()

         start = time.time()

         if devices_added_updated:

             (skipped_devices, binding_no_activated_devices,

-             need_binding_devices, failed_devices['added']) = (

+             need_binding_devices, failed_devices['added'],

+                migrating_devices) = (

                 self.treat_devices_added_or_updated(

                     devices_added_updated, provisioning_needed))

             LOG.debug('process_network_ports - iteration:%(iter_num)d - '

@@ -1847,7 +1856,7 @@

         # TODO(salv-orlando): Optimize avoiding applying filters

         # unnecessarily, (eg: when there are no IP address changes)

         added_ports = (port_info.get('added', set()) - skipped_devices -

-                       binding_no_activated_devices)

+                       binding_no_activated_devices - migrating_devices)

         self._add_port_tag_info(need_binding_devices)

         self.sg_agent.setup_port_filters(added_ports,

                                          port_info.get('updated', set()))
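The bookkeeping B1.1 adds reduces to set arithmetic: a device whose RPC details name a different destination host is parked in `migrating_devices` and withheld from security-group filter setup. A toy sketch (device and host names are invented):

```python
def is_migrating_away(details, local_host):
    """Per the B1.1 hunk: a device counts as migrating when its RPC details
    carry a 'migrating_to' host that is not the agent's own host."""
    migrating_to = details.get('migrating_to')
    return bool(migrating_to and migrating_to != local_host)

def ports_to_filter(added, skipped, not_activated, migrating):
    """Mirror the added_ports computation: skipped, not-yet-activated and
    migrating devices are all withheld from security-group filter setup."""
    return added - skipped - not_activated - migrating

details_list = [
    {'device': 'tap-a'},                                # staying put
    {'device': 'tap-b', 'migrating_to': 'compute-09'},  # leaving this host
]
migrating = {d['device'] for d in details_list
             if is_migrating_away(d, 'compute-01')}
print(ports_to_filter({'tap-a', 'tap-b'}, set(), set(), migrating))
# {'tap-a'}
```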

B1.2

--- neutron-bak/conf/common.py     2022-08-02 17:07:18.239265163 +0800

+++ neutron/conf/common.py 2021-09-08 17:08:59.000000000 +0800

@@ -166,6 +166,24 @@

                help=_('Type of the nova endpoint to use.  This endpoint will'

                       ' be looked up in the keystone catalog and should be'

                       ' one of public, internal or admin.')),

+    cfg.BoolOpt('live_migration_events', default=True,

+                help=_('When this option is enabled, during the live '

+                       'migration, the OVS agent will only send the '

+                       '"vif-plugged-event" when the destination host '

+                       'interface is bound. This option also disables any '

+                       'other agent (like DHCP) to send to Nova this event '

+                       'when the port is provisioned.'

+                       'This option can be enabled if Nova patch '

+                       'https://review.opendev.org/c/openstack/nova/+/767368 '

+                       'is in place.'

+                       'This option is temporary and will be removed in Y and '

+                       'the behavior will be "True".'),

+                deprecated_for_removal=True,

+                deprecated_reason=(

+                    'In Y the Nova patch '

+                    'https://review.opendev.org/c/openstack/nova/+/767368 '

+                    'will be in the code even when running a Nova server in '

+                    'X.')),

 ]
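Assuming the option lands in the `[nova]` group (the code in B1.3 reads it as `cfg.CONF.nova.live_migration_events`), enabling it would look like this in `neutron.conf`:

```ini
[nova]
# Send the vif-plugged event only once the destination host interface is
# bound; requires the matching Nova patch (review 767368) on the compute side.
live_migration_events = true
```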

B1.3

--- neutron-bak/agent/rpc.py   2021-08-25 15:29:11.000000000 +0800

+++ neutron/agent/rpc.py 2021-09-15 16:34:09.000000000 +0800

@@ -25,8 +25,10 @@

 from neutron_lib import constants

 from neutron_lib.plugins import utils

 from neutron_lib import rpc as lib_rpc

+from oslo_config import cfg

 from oslo_log import log as logging

 import oslo_messaging

+from oslo_serialization import jsonutils

 from oslo_utils import uuidutils

 

 from neutron.agent import resource_cache

@@ -323,8 +325,10 @@

         binding = utils.get_port_binding_by_status_and_host(

             port_obj.bindings, constants.ACTIVE, raise_if_not_found=True,

             port_id=port_obj.id)

-        if (port_obj.device_owner.startswith(

-                constants.DEVICE_OWNER_COMPUTE_PREFIX) and

+        migrating_to = migrating_to_host(port_obj.bindings)

+        if (not (migrating_to and cfg.CONF.nova.live_migration_events) and

+                port_obj.device_owner.startswith(

+                    constants.DEVICE_OWNER_COMPUTE_PREFIX) and

                 binding[pb_ext.HOST] != host):

             LOG.debug('Device %s has no active binding in this host',

                       port_obj)

@@ -357,7 +361,8 @@

             'qos_policy_id': port_obj.qos_policy_id,

             'network_qos_policy_id': net_qos_policy_id,

             'profile': binding.profile,

-            'security_groups': list(port_obj.security_group_ids)

+            'security_groups': list(port_obj.security_group_ids),

+            'migrating_to': migrating_to,

         }

         LOG.debug('Returning: %s', entry)

         return entry

@@ -365,3 +370,40 @@

     def get_devices_details_list(self, context, devices, agent_id, host=None):

         return [self.get_device_details(context, device, agent_id, host)

                 for device in devices]

+

+# TODO(ralonsoh): move this method to neutron_lib.plugins.utils

+def migrating_to_host(bindings, host=None):

+    '''Return the host the port is being migrated to.

+

+    If the host is passed, the port binding profile with the 'migrating_to',

+    that contains the host the port is being migrated, is compared to this

+    value. If no value is passed, this method will return if the port is

+    being migrated ('migrating_to' is present in any port binding profile).

+

+    The function returns None or the matching host.

+    '''

+    #LOG.info('LiveDebug: enter migrating_to_host  001')

+    for binding in (binding for binding in bindings if

+                    binding[pb_ext.STATUS] == constants.ACTIVE):

+        profile = binding.get('profile')

+        if not profile:

+            continue

+        '''

+        profile = (jsonutils.loads(profile) if isinstance(profile, str) else
+                   profile)

+        migrating_to = profile.get('migrating_to')

+        '''

+        # add by michael

+        if isinstance(profile, str):

+            migrating_to = jsonutils.loads(profile).get('migrating_to')

+            #LOG.info('LiveDebug: migrating_to_host 001  migrating_to: %s', migrating_to)

+        else:

+            migrating_to = profile.get('migrating_to')

+            #LOG.info('LiveDebug: migrating_to_host 002  migrating_to: %s', migrating_to)

+

+        if migrating_to:

+            if not host:  # Just know if the port is being migrated.

+                return migrating_to

+            if migrating_to == host:

+                return migrating_to

+    return None
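`migrating_to_host()` above only needs the binding status and profile, so its handling of the two profile forms (JSON string vs. already-deserialized dict) can be checked standalone; here stdlib `json` stands in for `oslo_serialization.jsonutils` and the status constant is inlined:

```python
import json

ACTIVE = 'ACTIVE'

def migrating_to_host(bindings, host=None):
    """Return the target host if an active binding carries 'migrating_to'.

    The binding profile may arrive JSON-encoded (a string) or already
    deserialized (a dict); both forms are handled, as in the patch.
    """
    for binding in (b for b in bindings if b.get('status') == ACTIVE):
        profile = binding.get('profile')
        if not profile:
            continue
        if isinstance(profile, str):
            migrating_to = json.loads(profile).get('migrating_to')
        else:
            migrating_to = profile.get('migrating_to')
        if migrating_to and (host is None or migrating_to == host):
            return migrating_to
    return None

bindings = [
    {'status': ACTIVE, 'profile': json.dumps({'migrating_to': 'compute-07'})},
]
print(migrating_to_host(bindings))                     # compute-07
print(migrating_to_host(bindings, host='compute-01'))  # None
```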

B1.4

--- neutron-bak/db/provisioning_blocks.py        2021-08-25 15:43:47.000000000 +0800

+++ neutron/db/provisioning_blocks.py     2021-09-03 09:32:41.000000000 +0800

@@ -137,8 +137,7 @@

             context, standard_attr_id=standard_attr_id):

         LOG.debug('Provisioning complete for %(otype)s %(oid)s triggered by '

                   'entity %(entity)s.', log_dict)

-        registry.notify(object_type, PROVISIONING_COMPLETE,

-                        'neutron.db.provisioning_blocks',

+        registry.notify(object_type, PROVISIONING_COMPLETE, entity,

                         context=context, object_id=object_id)

 

B1.5

--- neutron-bak/notifiers/nova.py     2021-08-25 16:02:33.000000000 +0800

+++ neutron/notifiers/nova.py  2021-09-03 09:32:41.000000000 +0800

@@ -13,6 +13,8 @@

 #    License for the specific language governing permissions and limitations

 #    under the License.

 

+import contextlib

+

 from keystoneauth1 import loading as ks_loading

 from neutron_lib.callbacks import events

 from neutron_lib.callbacks import registry

@@ -66,6 +68,16 @@

             if ext.name == 'server_external_events']

         self.batch_notifier = batch_notifier.BatchNotifier(

             cfg.CONF.send_events_interval, self.send_events)

+        self._enabled = True

+

+    @contextlib.contextmanager

+    def context_enabled(self, enabled):

+        stored_enabled = self._enabled

+        try:

+            self._enabled = enabled

+            yield

+        finally:

+            self._enabled = stored_enabled

 

     def _get_nova_client(self):

         global_id = common_context.generate_request_id()

@@ -163,6 +175,10 @@

                 return self._get_network_changed_event(port)

 

     def _can_notify(self, port):

+        if not self._enabled:

+            LOG.debug('Nova notifier disabled')

+            return False

+

         if not port.id:

             LOG.warning('Port ID not set! Nova will not be notified of '

                         'port status change.')
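The `context_enabled()` addition in B1.5 is a standard save/restore context manager: it gates notifications off for the duration of a block and always restores the previous state. The same pattern in isolation (the `Notifier` class here is a toy stand-in, not the Nova notifier):

```python
import contextlib

class Notifier:
    """Toy stand-in for the Nova notifier with a suppressible send path."""
    def __init__(self):
        self._enabled = True
        self.sent = []

    @contextlib.contextmanager
    def context_enabled(self, enabled):
        # Save, override, and unconditionally restore the enabled flag.
        stored_enabled = self._enabled
        try:
            self._enabled = enabled
            yield
        finally:
            self._enabled = stored_enabled

    def notify(self, event):
        if self._enabled:
            self.sent.append(event)

n = Notifier()
n.notify('port-up')
with n.context_enabled(False):
    n.notify('suppressed-during-provisioning')
n.notify('port-down')
print(n.sent)  # ['port-up', 'port-down']
```

The OVS agent can thus wrap a provisioning step in `context_enabled(False)` to keep duplicate vif-plugged events from reaching Nova during live migration.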

B1.6

--- nova-bak/compute/manager.py  2022-08-02 16:27:45.943428128 +0800

+++ nova/compute/manager.py        2021-09-03 09:35:24.529858458 +0800

@@ -6637,12 +6637,12 @@

         LOG.error(msg, msg_args)

 

     @staticmethod

-    def _get_neutron_events_for_live_migration(instance):

+    def _get_neutron_events_for_live_migration(instance, migration):

         # We don't generate events if CONF.vif_plugging_timeout=0

         # meaning that the operator disabled using them.

-        if CONF.vif_plugging_timeout and utils.is_neutron():

-            return [('network-vif-plugged', vif['id'])

-                    for vif in instance.get_network_info()]

+        if CONF.vif_plugging_timeout:

+            return (instance.get_network_info()

+                    .get_live_migration_plug_time_events())

         else:

             return []

 

@@ -6695,7 +6695,8 @@

             '''

             pass

 

-        events = self._get_neutron_events_for_live_migration(instance)

+        events = self._get_neutron_events_for_live_migration(

+            instance, migration)

         try:

             if ('block_migration' in migrate_data and

                     migrate_data.block_migration):

B1.7

--- nova-bak/network/model.py        2022-08-02 16:27:47.490437859 +0800

+++ nova/network/model.py     2021-09-03 09:35:24.532858440 +0800

@@ -469,6 +469,14 @@

         return (self.is_hybrid_plug_enabled() and not

                 migration.is_same_host())

 

+    @property

+    def has_live_migration_plug_time_event(self):

+        '''Returns whether this VIF's network-vif-plugged external event will

+        be sent by Neutron at 'plugtime' - in other words, as soon as neutron

+        completes configuring the network backend.

+        '''

+        return self.is_hybrid_plug_enabled()

+

     def is_hybrid_plug_enabled(self):

         return self['details'].get(VIF_DETAILS_OVS_HYBRID_PLUG, False)

 

@@ -527,20 +535,26 @@

         return jsonutils.dumps(self)

 

     def get_bind_time_events(self, migration):

-        '''Returns whether any of our VIFs have 'bind-time' events. See

-        has_bind_time_event() docstring for more details.

+        '''Returns a list of external events for any VIFs that have

+        'bind-time' events during cold migration.

         '''

         return [('network-vif-plugged', vif['id'])

                 for vif in self if vif.has_bind_time_event(migration)]

 

+    def get_live_migration_plug_time_events(self):

+        '''Returns a list of external events for any VIFs that have

+        'plug-time' events during live migration.

+        '''

+        return [('network-vif-plugged', vif['id'])

+                for vif in self if vif.has_live_migration_plug_time_event]

+

     def get_plug_time_events(self, migration):

-        '''Complementary to get_bind_time_events(), any event that does not

-        fall in that category is a plug-time event.

+        '''Returns a list of external events for any VIFs that have

+        'plug-time' events during cold migration.

         '''

         return [('network-vif-plugged', vif['id'])

                 for vif in self if not vif.has_bind_time_event(migration)]

 

-

 class NetworkInfoAsyncWrapper(NetworkInfo):

     '''Wrapper around NetworkInfo that allows retrieving NetworkInfo

     in an async manner.
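The new `get_live_migration_plug_time_events()` in B1.7 is a comprehension over the VIF list keyed off the hybrid-plug detail. Sketched with plain dicts standing in for VIF objects (the `ovs_hybrid_plug` key mirrors `VIF_DETAILS_OVS_HYBRID_PLUG`):

```python
def has_live_migration_plug_time_event(vif):
    """Per B1.7: plug-time network-vif-plugged events are expected only for
    hybrid-plug (OVS + intermediate Linux bridge) VIFs."""
    return vif['details'].get('ovs_hybrid_plug', False)

def get_live_migration_plug_time_events(network_info):
    """Build the (event, vif_id) list Nova waits on during live migration."""
    return [('network-vif-plugged', vif['id'])
            for vif in network_info
            if has_live_migration_plug_time_event(vif)]

network_info = [
    {'id': 'vif-1', 'details': {'ovs_hybrid_plug': True}},
    {'id': 'vif-2', 'details': {}},
]
print(get_live_migration_plug_time_events(network_info))
# [('network-vif-plugged', 'vif-1')]
```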

C1.1

--- linux-3.10.0-1062.18.1.el7.orig/net/ipv4/udp_offload.c     2020-02-12 21:45:22.000000000 +0800

+++ linux-3.10.0-1062.18.1.el7/net/ipv4/udp_offload.c   2022-08-17 15:56:27.540557289 +0800

@@ -261,7 +261,7 @@ struct sk_buff **udp_gro_receive(struct

      struct sock *sk;

 

      if (NAPI_GRO_CB(skb)->encap_mark ||

-         (skb->ip_summed != CHECKSUM_PARTIAL &&

+         (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&

           NAPI_GRO_CB(skb)->csum_cnt == 0 &&

           !NAPI_GRO_CB(skb)->csum_valid))

             goto out;

 
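C1.1 relaxes the early bail-out in `udp_gro_receive()`: with `uh->check &&` in front, a zero UDP checksum (legal for tunnels such as VXLAN with checksum disabled) no longer disqualifies the packet from GRO aggregation. The condition on both sides of the patch, modeled as a boolean function (a sketch, not kernel code):

```python
def gro_bails(encap_mark, udp_check, csum_partial, csum_cnt, csum_valid,
              patched):
    """Return True when udp_gro_receive() gives up on a packet early."""
    csum_bad = (not csum_partial) and csum_cnt == 0 and (not csum_valid)
    if patched:
        # After the patch a zero UDP checksum (udp_check == 0) is accepted.
        return encap_mark or (udp_check != 0 and csum_bad)
    return encap_mark or csum_bad

# Zero-checksum UDP with no validated checksum: the stock kernel skips GRO,
print(gro_bails(False, 0, False, 0, False, patched=False))  # True
# while the patched condition lets the packet continue into GRO.
print(gro_bails(False, 0, False, 0, False, patched=True))   # False
```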
