Problem with VRRP-A and Virtual Server IP (VIP)

Hi everyone:


We have a problem with our Thunder 930 cluster (two A10 Thunder 930 devices in an L3V scenario with VRRP-A and VCS). Several partitions are configured on the cluster, and we see strange behaviour when we change the VRRP-A priority in any of these partitions.


When we decrease the VRRP-A priority on an active partition, the MAC address of the floating IPs moves correctly to the standby device, and communication with the server side and with the upstream router works fine. However, the virtual server IPs (VIPs) stop working. In other words, everything below the cluster and upstream is OK, but no communication or service can be established through the VIPs.


We see that packets destined for any of the VIPs are routed back to the upstream router by the new active device. The upstream router then sends those packets to the active device again, creating an L3 loop.


However, if instead of decreasing the priority we shut down all interfaces on the active device (or shut down the active device itself), the failover to the standby device occurs and everything works OK, including the VIPs.


Both devices run version 4.1.1-P6 (build 62). The last upgrade was done in 2018, from 2.7.1-GR1 (build 58) to the current version.


Has anyone seen behavior similar to this?


Below is an example of the configuration of one of the affected partitions.



interface ve 1/201
 name SERVER-SIDE
 ip address 10.10.10.2 255.255.255.0
!
interface ve 1/301
 name UPSTREAM
 ip address 20.20.20.2 255.255.255.0
!
interface ve 2/201
 name SERVER-SIDE
 ip address 10.10.10.3 255.255.255.0
!
interface ve 2/301
 name UPSTREAM
 ip address 20.20.20.3 255.255.255.0


vlan 1/201
 tagged trunk 10
 router-interface ve 201
 name SERVER-SIDE
!
vlan 1/301
 tagged trunk 10
 router-interface ve 301
 name UPSTREAM
!
vlan 2/201
 tagged trunk 9
 router-interface ve 201
 name SERVER-SIDE
!
vlan 2/301
 tagged trunk 9
 router-interface ve 301
 name UPSTREAM


vrrp-a vrid 5
 floating-ip 10.10.10.1
 floating-ip 20.20.20.1
 device-context 1
  blade-parameters
   priority 200
   tracking-options
    vlan 301 timeout 5 priority-cost 100
    vlan 201 timeout 5 priority-cost 100
 device-context 2
  blade-parameters
   priority 150
   tracking-options
    vlan 301 timeout 5 priority-cost 100
    vlan 201 timeout 5 priority-cost 100


device-context 1
 ip route 0.0.0.0 /0 20.20.20.20
!
device-context 2
 ip route 0.0.0.0 /0 20.20.20.20
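
For illustration only, the priority change described at the top of this post would look roughly like the fragment below. The value 100 is a hypothetical example, not the value actually used: anything below device 2's 150 should make device 2 the preferred active, assuming preemption has not been disabled. Lines starting with ! are annotations, not commands.

```
vrrp-a vrid 5
 device-context 1
  blade-parameters
   ! hypothetical value, lowered from 200 to below device 2's 150
   priority 100
```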


Thanks in advance

Comments

  • mdunn Member

    Hello - This is an interesting scenario... One thought that comes to mind: I see you're using VRID 5 for VRRP. Do you have vrid 5 assigned to the VIPs that fail to fail over?

    slb virtual-server test 192.168.1.20
      vrid 5
    

    Mike

  • First of all, thanks for your reply, Mike.


    About your suggestion: we don't have that command in the virtual-server configuration. When we try to configure it, we receive this error:


    "You are trying to change the vrid of virtual server with nat pool on it, please delete nat pool first !"
    


    We use source-nat in our L3V scenario because servers in that partition need to reach virtual servers (VIPs) in the same partition.


    So we tried removing the NAT configuration first, then configuring "vrid xx" on the virtual-server, and then configuring the NAT again, but we received this error:

    "Invalid HA ID specified."
    

    Using the same image from the last post, this is an example of one of our slb configurations.


    slb server server-01 10.10.10.5
     port 443 tcp

    slb server server-02 10.10.10.6
     port 443 tcp

    slb service-group SG-test_443 tcp
     method least-connection
     extended-stats
     member server-01 443
     member server-02 443

    slb virtual-server VIP-test 30.30.30.10
     extended-stats
     port 443 tcp
      ha-conn-mirror
      extended-stats
      access-list 100 source-nat-pool source_nat
      service-group SG-test_443


    access-list 100 remark acl_source_nat
    access-list 100 permit ip 20.20.20.0 0.0.0.255 any


    ip nat pool source_nat 30.30.30.254 30.30.30.254 netmask /24


    What do you think is the mistake?

    Thanks in advance...

  • mdunn Member

    When a NAT pool is involved, you will also need to set the VRID on the pool to match the VIP:

    ip nat pool source_nat 30.30.30.254 30.30.30.254 netmask /24 vrid 4
    

    After that, you should be allowed to bind the pool to the vPort and test.
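
    As a rough sketch of the end state, assuming VRID 5 from the original post is the one this partition uses (the 4 in the line above may just be an example value), the pool and the VIP would both reference the same VRID. Only commands already shown in this thread are used; other options such as extended-stats and ha-conn-mirror are omitted for brevity, and lines starting with ! are annotations, not commands.

    ```
    ! NAT pool carries the same VRID as the VIP (5 is assumed here)
    ip nat pool source_nat 30.30.30.254 30.30.30.254 netmask /24 vrid 5
    !
    slb virtual-server VIP-test 30.30.30.10
     vrid 5
     port 443 tcp
      access-list 100 source-nat-pool source_nat
      service-group SG-test_443
    ```

    Since the earlier error asked for the NAT pool to be deleted before the VIP's vrid can be changed, the pool binding on the vPort would presumably have to be removed first and re-added last.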

    Hope this helps!
