RvR: VPC redundant vrs run on same hypervisor #3421

Merged: yadvr merged 1 commit into apache:4.11 from ustcweizhou:rvr-vpc-same-hv on Jun 29, 2019

@ustcweizhou
Contributor

Description

For VPCs that support redundant VRs, when the second VR is started, the pod/cluster/host of the first VR should be added to the avoid list. This keeps the two VRs off the same hypervisor and so provides higher availability.

Network VRs already follow the same process.
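
The idea in the description can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in this PR: `Router`, `AvoidList`, `findPeerToAvoid` and `buildAvoidList` are hypothetical stand-ins for the CloudStack types, and treating a second running peer as an error is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: Router and AvoidList are hypothetical stand-ins,
// not CloudStack's DomainRouterVO or its deployment planner classes.
class RedundantVpcAvoidSketch {

    enum State { RUNNING, STOPPED }

    static final class Router {
        final long id;
        final Long hostId;      // null while the router is not placed on a host
        final long podId;
        final long clusterId;
        final boolean redundant;
        final State state;

        Router(long id, Long hostId, long podId, long clusterId, boolean redundant, State state) {
            this.id = id;
            this.hostId = hostId;
            this.podId = podId;
            this.clusterId = clusterId;
            this.redundant = redundant;
            this.state = state;
        }
    }

    static final class AvoidList {
        final Set<Long> podIds = new LinkedHashSet<>();
        final Set<Long> clusterIds = new LinkedHashSet<>();
        final Set<Long> hostIds = new LinkedHashSet<>();
    }

    // Returns the running redundant router of the VPC, or null if there is none.
    static Router findPeerToAvoid(List<Router> vpcRouters) {
        Router peer = null;
        for (Router r : vpcRouters) {
            if (r.hostId != null && r.redundant && r.state == State.RUNNING) {
                if (peer != null) {
                    // Assumption: more than one running peer is treated as an error.
                    throw new IllegalStateException("more than one running redundant router");
                }
                peer = r;
            }
        }
        return peer;
    }

    // Records the peer's pod, cluster and host so a planner could steer away from them.
    static AvoidList buildAvoidList(Router peer) {
        AvoidList avoid = new AvoidList();
        if (peer != null) {
            avoid.podIds.add(peer.podId);
            avoid.clusterIds.add(peer.clusterId);
            avoid.hostIds.add(peer.hostId);
        }
        return avoid;
    }

    public static void main(String[] args) {
        Router first = new Router(1L, 10L, 100L, 1000L, true, State.RUNNING);
        Router second = new Router(2L, null, 100L, 1000L, true, State.STOPPED);
        AvoidList avoid = buildAvoidList(findPeerToAvoid(Arrays.asList(first, second)));
        System.out.println("avoid hosts: " + avoid.hostIds);
    }
}
```

How strictly a planner applies the three sets (for example, whether it falls back from pod to cluster to host when capacity is short) is not covered by this sketch.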

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

@yadvr yadvr added this to the 4.13.0.0 milestone Jun 25, 2019
final List<DomainRouterVO> routerList = _routerDao.listByVpcId(router.getVpcId());
for (final DomainRouterVO rrouter : routerList) {
    if (rrouter.getHostId() != null && rrouter.getIsRedundantRouter() && rrouter.getState() == State.Running) {
        if (routerToBeAvoid != null) {
Member

Should this also check that routerList.size() is >= 2?

Contributor Author

I think it should be checked elsewhere.
I was lazy and just copied lines 380 to 390 from:
https://github.com/apache/cloudstack/blob/4.11/server/src/com/cloud/network/router/NetworkHelperImpl.java#L380,L390
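
A minimal sketch of the guard suggested in this thread, assuming it would sit in front of the peer lookup. `RouterCountGuardSketch` and `shouldCheckForPeer` are hypothetical names, not CloudStack code. When only one router is listed and it is the one being started, the loop likely finds no running peer anyway, so such a guard would mainly make the intent explicit.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; not part of NetworkHelperImpl.
class RouterCountGuardSketch {

    // Hypothetical guard: a peer can only exist when the VPC has at least two routers.
    static boolean shouldCheckForPeer(List<?> vpcRouters) {
        return vpcRouters != null && vpcRouters.size() >= 2;
    }

    public static void main(String[] args) {
        System.out.println(shouldCheckForPeer(Collections.singletonList("r-1"))); // false
        System.out.println(shouldCheckForPeer(Arrays.asList("r-1", "r-2")));      // true
    }
}
```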

Contributor

@anuragaw anuragaw left a comment

Overall LGTM but code could use some cleanup.

Comment thread server/src/com/cloud/network/router/NetworkHelperImpl.java
Contributor

@anuragaw anuragaw left a comment

Thanks @ustcweizhou !

LGTM

@yadvr
Member

yadvr commented Jun 27, 2019

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-47

@yadvr
Member

yadvr commented Jun 27, 2019

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-41)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33954 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3421-t41-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_internal_lb.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Smoke tests completed. 67 look OK, 2 have error(s)
Only failed tests results shown below:

Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_04_rvpc_internallb_haproxy_stats_on_all_interfaces | Error | 197.01 | test_internal_lb.py
test_05_rvpc_multi_tiers | Failure | 418.29 | test_vpc_redundant.py
test_05_rvpc_multi_tiers | Error | 442.26 | test_vpc_redundant.py

@yadvr
Member

yadvr commented Jun 27, 2019

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@ustcweizhou
Contributor Author

@rhtyd if it still fails, I will look into it tomorrow.

@blueorangutan

Trillian test result (tid-42)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 31840 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3421-t42-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_routers_network_ops.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_redundant.py
Intermittent failure detected: /marvin/tests/smoke/test_vpc_vpn.py
Smoke tests completed. 68 look OK, 1 have error(s)
Only failed tests results shown below:

Test | Result | Time (s) | Test File
--- | --- | --- | ---
test_04_rvpc_network_garbage_collector_nics | Failure | 330.93 | test_vpc_redundant.py
test_05_rvpc_multi_tiers | Failure | 412.71 | test_vpc_redundant.py
test_05_rvpc_multi_tiers | Error | 437.94 | test_vpc_redundant.py

@ustcweizhou
Contributor Author

@rhtyd
I checked the result but did not understand why the issue happened...

======================================================================
FAIL: Create a redundant VPC with 1 Tier, 1 VM, 1 ACL, 1 PF and test Network GC Nics
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/marvin/tests/smoke/test_vpc_redundant.py", line 597, in test_04_rvpc_network_garbage_collector_nics
    self.do_vpc_test(False)
  File "/marvin/tests/smoke/test_vpc_redundant.py", line 690, in do_vpc_test
    self.check_ssh_into_vm(vm.get_vm(), vm.get_ip(), expectFail=expectFail, retries=retries)
  File "/marvin/tests/smoke/test_vpc_redundant.py", line 533, in check_ssh_into_vm
    self.fail("Failed to SSH into VM - %s" % (public_ip.ipaddress.ipaddress))
AssertionError: Failed to SSH into VM - 10.1.34.69

and

======================================================================
FAIL: Create a redundant VPC with multiple tiers
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/marvin/tests/smoke/test_vpc_redundant.py", line 633, in test_05_rvpc_multi_tiers
    self.do_vpc_test(False)
  File "/marvin/tests/smoke/test_vpc_redundant.py", line 690, in do_vpc_test
    self.check_ssh_into_vm(vm.get_vm(), vm.get_ip(), expectFail=expectFail, retries=retries)
  File "/marvin/tests/smoke/test_vpc_redundant.py", line 533, in check_ssh_into_vm
    self.fail("Failed to SSH into VM - %s" % (public_ip.ipaddress.ipaddress))
AssertionError: Failed to SSH into VM - 10.1.34.71

and

======================================================================
ERROR: Create a redundant VPC with multiple tiers
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/marvin/tests/smoke/test_vpc_redundant.py", line 284, in tearDown
    raise Exception("Warning: Exception during cleanup : %s" % e)
Exception: Warning: Exception during cleanup : Execute cmd: deletenetworkoffering failed, due to: errorCode: 431, errorText:Can't delete network offering 78 as its used by 1 networks. To make the network offering unavaiable, disable it

@yadvr
Member

yadvr commented Jun 28, 2019

Okay @ustcweizhou I'll kick them one more time. Third's the winner ;)

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔centos6 ✔centos7 ✔debian. JID-62

@yadvr
Member

yadvr commented Jun 28, 2019

@blueorangutan test

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-55)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 25736 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr3421-t55-kvm-centos7.zip
Smoke tests completed. 69 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@yadvr yadvr merged commit 9ac32aa into apache:4.11 Jun 29, 2019
Slair1 pushed a commit to ippathways/cloudstack that referenced this pull request Jan 29, 2020