server: Fix update capacity for hosts take long time if there are many service offerings#4623
Conversation
…y service offerings
Steps to reproduce the issue:
(1)Create 10000 service offerings (by db changes below or cloudmonkey).
```
DROP PROCEDURE IF EXISTS cloud.insert_service_offering;
DELIMITER $$
CREATE PROCEDURE cloud.insert_service_offering()
BEGIN
DECLARE count INT DEFAULT 10000;
SET @offeringid = (select max(id)+1 from disk_offering);
WHILE count > 0 DO
INSERT INTO disk_offering (id,name,uuid,display_text,disk_size,type,created) values (@offeringid,'test-offering-wei',uuid(), 'test-offering-wei',0,'Service',now());
INSERT INTO service_offering (id,cpu,speed,ram_size) values (@offeringid, 1, 500,256);
SET @offeringid = @offeringid + 1;
SET count = count - 1;
END WHILE;
END $$
DELIMITER ;
CALL cloud.insert_service_offering();
mysql> CALL cloud.insert_service_offering();
Query OK, 0 rows affected (2 min 30.85 sec)
```
(2) Check the total time of periodical capacity check in cloudstack.
Without this patch, it spend 2.5 seconds (2 hosts)
```
2021-01-15 16:10:12,793 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-5d5f3b3b) (logid:f5eb68ba) Running Capacity Checker ...
2021-01-15 16:10:15,287 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-5d5f3b3b) (logid:f5eb68ba) Done running Capacity Checker ...
```
With this patch ,it spend 1.3 seconds (2 hosts)
```
2021-01-15 16:12:43,604 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-a2a7f3f1) (logid:f7e0a4c5) Running Capacity Checker ...
2021-01-15 16:12:44,927 DEBUG [c.c.a.AlertManagerImpl] (CapacityChecker:ctx-a2a7f3f1) (logid:f7e0a4c5) Done running Capacity Checker ...
```
If there are 100 hosts, the total time will be reduced from 100+ seconds to around 10 seconds.
|
@blueorangutan package |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Packaging result: ✔centos7 ✖centos8 ✔debian. JID-2607 |
|
@blueorangutan test |
|
@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
|
Trillian test result (tid-3457)
|
|
@ustcweizhou can you review the |
@rhtyd I ran the test on 4.14 and 4.15, both succeed. |
|
@blueorangutan package |
|
@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress. |
|
Running failed test manually @weizhouapache @rhtyd |
|
Packaging result: ✔centos7 ✖centos8 ✔debian. JID-2633 |
DaanHoogland
left a comment
There was a problem hiding this comment.
looks like a sensible optimisation; fetch the offerings before looping over hosts instead of re-fetch for each host. I think only regression tests are needed for this one.
|
@blueorangutan test |
|
@DaanHoogland a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests |
Failing test verified manually, |
|
Trillian test result (tid-3474)
|
Description
This PR fixes the issue that update capacity for hosts take long time if there are many service offerings.
Steps to reproduce the issue:
(1)Create 10000 service offerings (by db changes or cloudmonkey).
(2) Check the total time of periodical capacity check in cloudstack.
Without this patch, it spend 2.5 seconds (2 hosts)
With this patch ,it spend 1.3 seconds (2 hosts)
If there are 100 hosts, the total time will be reduced from 100+ seconds to around 10 seconds.
This helps a lot to reduce the execution time of prometheus exporter.
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?