Best practice - J2EE Load Balancing

Experimenter

Hello guys,

 

I have a question regarding the J2EE Load Balancing.

 

We had some hardware (Cisco) & software (Apache) load balancing working with TC10 for the J2EE Server Manager + Webtier.

 

However, for TC11 and TCIC (the CATIA integration) this solution is no longer suitable: with the Racless method, the load balancer apparently switches to the other server despite the big cookie or sticky session. We have a 50/50 chance of it working.

 

Is there any best practice for this? How should load balancing be set up between PoolA and PoolB (including the Webtier)? Currently we have the following setup, which is not working, in the test environment: Apache 2.2 on Server X (addressed as http://serverX/tc) with the load-balancing module, worker1=ServerPoolA and worker2=ServerPoolB, sticky session enabled.
The TreeCacheTCP.xml on PoolA has the following values:

<attribute name="ClusterName">Cluster_TCServer</attribute>
<attribute name="ClusterConfig">
<config>
<TCP start_port="17800" end_port="17800" sock_conn_timeout="2000" use_send_queues="false" enable_diagnostics="false"></TCP>
<TCPPING initial_hosts="TCPOOLSERVERA[17810]" port_range="1" timeout="15000" num_initial_members="1" up_thread="true" down_thread="true"></TCPPING>
<MERGE2 min_interval="5000" max_interval="10000"></MERGE2>
<FD_SOCK></FD_SOCK>
<FD timeout="10000" max_tries="5" shun="true" down_thread="false" up_thread="false"></FD>
<VERIFY_SUSPECT timeout="3000" down_thread="false" up_thread="false"></VERIFY_SUSPECT>
<pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100" retransmit_timeout="3000"></pbcast.NAKACK>
<pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false"></pbcast.STABLE>
<pbcast.GMS join_timeout="1250" join_retry_timeout="2000" shun="false" print_local_addr="true" down_thread="true" up_thread="true"></pbcast.GMS>
<pbcast.STATE_TRANSFER up_thread="true" down_thread="true"></pbcast.STATE_TRANSFER>
</config>
</attribute>

 

The TreeCacheTCP.xml on PoolB has the following values:

<attribute name="ClusterName">Cluster_TCServer</attribute>
<attribute name="ClusterConfig">
<config>
<TCP start_port="17800" end_port="17800" sock_conn_timeout="2000" use_send_queues="false" enable_diagnostics="false"></TCP>
<TCPPING initial_hosts="TCPOOLSERVERB[17810]" port_range="1" timeout="15000" num_initial_members="1" up_thread="true" down_thread="true"></TCPPING>
<MERGE2 min_interval="5000" max_interval="10000"></MERGE2>
<FD_SOCK></FD_SOCK>
<FD timeout="10000" max_tries="5" shun="true" down_thread="false" up_thread="false"></FD>
<VERIFY_SUSPECT timeout="3000" down_thread="false" up_thread="false"></VERIFY_SUSPECT>
<pbcast.NAKACK down_thread="true" up_thread="true" gc_lag="100" retransmit_timeout="3000"></pbcast.NAKACK>
<pbcast.STABLE desired_avg_gossip="20000" down_thread="false" up_thread="false"></pbcast.STABLE>
<pbcast.GMS join_timeout="1250" join_retry_timeout="2000" shun="false" print_local_addr="true" down_thread="true" up_thread="true"></pbcast.GMS>
<pbcast.STATE_TRANSFER up_thread="true" down_thread="true"></pbcast.STATE_TRANSFER>
</config>
</attribute>

 

Is this configuration OK? I am thinking we should set a different ClusterName for each, and maybe different initial_hosts entries (notice TCPoolServerA and TCPoolServerB). Any suggestions on how to get this working properly so I can test this use case with TCIC? Is there a best practice for this? I have read about some multicast options, a different configuration using the TreeCache.xml, or an entirely different solution.

 

Any help is appreciated. Thanks

7 REPLIES 7

Re: Best practice - J2EE Load Balancing

Solution Partner Esteemed Contributor
The Server Managers do not need to be load balanced (that's what the cluster does); only multiple Web Tiers do. You do have an issue in your TreeCacheTCP.xml files for A and B: each one refers to itself instead of the other members participating in the TreeCache.

On TCPOOLSERVERA, the TCPPING should list TCPOOLSERVERB and the WebTier.
On TCPOOLSERVERB, the TCPPING should list TCPOOLSERVERA and the WebTier.
TCPOOLSERVERA, TCPOOLSERVERB and the WebTier all need the same ClusterName [Cluster_TCServer].
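
As a rough sketch of what that could look like for the TCPPING line on TCPOOLSERVERA (TCPOOLSERVERB would mirror it with the host names swapped): WEBTIERHOST is a placeholder for wherever the WebTier runs, and both port numbers are assumptions taken from the values in the posted files, so check them against what each peer actually listens on.

```xml
<attribute name="ClusterName">Cluster_TCServer</attribute>
<attribute name="ClusterConfig">
<config>
<TCP start_port="17800" end_port="17800" sock_conn_timeout="2000" use_send_queues="false" enable_diagnostics="false"></TCP>
<!-- List the other TreeCache members here, not this host.
     WEBTIERHOST and the ports 17800/17810 are assumptions for illustration. -->
<TCPPING initial_hosts="TCPOOLSERVERB[17800],WEBTIERHOST[17810]" port_range="1" timeout="15000" num_initial_members="1" up_thread="true" down_thread="true"></TCPPING>
<!-- remaining protocol elements (MERGE2 through pbcast.STATE_TRANSFER) unchanged -->
</config>
</attribute>
```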

Hope that clears things up. With StartRaclessServer.bat, TCIC is using SOA services and I don't know how that affects LB. Keep us posted.

Randy Ellsworth, Teamcenter Architect, Applied CAx, LLC
NX 11 | SW 2016 | Creo 4 | TcUA 11.4
Evaluating: AW 3.4

Re: Best practice - J2EE Load Balancing

Experimenter

Hi Randy,

 

thanks for that info! I have done that now, and the first tests were successful. We will now test with multiple users to see whether the problem can be reproduced, and then proceed with the hardware load balancer.

Looking at our previous TC10 configuration raised one more question:

Can TCPPING in the TreeCacheTCP.xml list both TCPOOLSERVERA & TCPOOLSERVERB, or should it be set up specifically as stated in your reply?

Re: Best practice - J2EE Load Balancing

Solution Partner Esteemed Contributor
You don't want to include yourself, but you do want to include everyone else participating in the TreeCache. Basically, all other server managers and web tiers should be listed, but not "localhost".
https://docs.jboss.org/jbossas/docs/Clustering_Guide/5/html/jbosscache-jgroups-discovery-tcpping.htm...

Randy Ellsworth, Teamcenter Architect, Applied CAx, LLC

Re: Best practice - J2EE Load Balancing

Experimenter

OK, I did that and it worked, but I still have some problems loading data into CATIA: sometimes it works, sometimes not.

I am starting to suspect that this is not connected to the TreeCache but to the load balancer (currently Apache).

I am using Apache 2.2 with the following configuration (worker.properties):

worker.list=loadbalancer,jkstatus
# Define Worker Template
worker.template.port=8009
worker.template.type=ajp13
worker.template.prepost_timeout=10000
worker.template.connect_timeout=10000
worker.template.reply_timeout=360000
worker.template.socket_connect_timeout=10000
worker.template.connection_pool_size=300

# Define Balancer Member 1
worker.TOMCAT_S1.reference=worker.template
worker.TOMCAT_S1.host=SERVERIP
worker.TOMCAT_S1.lbfactor=1

# Define Balancer Member 2
worker.TOMCAT_S2.reference=worker.template
worker.TOMCAT_S2.host=SERVERIP
worker.TOMCAT_S2.lbfactor=1

# Define LB-Configuration
worker.loadbalancer.type=lb
worker.loadbalancer.balance_workers=TOMCAT_S1,TOMCAT_S2
worker.loadbalancer.sticky_session=true
worker.jkstatus.type=status
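
One thing worth checking with sticky_session in mod_jk: the balancer picks the worker from the route suffix of the session ID (for example a JSESSIONID ending in .TOMCAT_S1), and Tomcat only appends that suffix when jvmRoute is set on its Engine element. Without it, requests can land on either member even though sticky_session=true. A minimal sketch, assuming the web tier runs on Tomcat and the worker names above are kept; the surrounding server.xml content is not taken from this setup:

```xml
<!-- conf/server.xml on the host behind worker TOMCAT_S1 (assumed layout) -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="TOMCAT_S1">
  <!-- existing Realm and Host elements stay as they are -->
</Engine>

<!-- conf/server.xml on the host behind worker TOMCAT_S2 (assumed layout) -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="TOMCAT_S2">
  <!-- existing Realm and Host elements stay as they are -->
</Engine>
```

The jvmRoute value has to match the worker name in the balance_workers list. Whether the TCIC client sends the session cookie back on every request is a separate question that this sketch does not cover.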

 

And the uriworkermap.properties:

/tc=loadbalancer
/tc/*=loadbalancer
/manager=loadbalancer
/manager/*=loadbalancer
/jkstatus=jkstatus

 

The Apache load balancer is on Server3, which is the address set in the Teamcenter configuration for the clients: http://server3/tc

From there, requests are forwarded to the tc application on Server1 or Server2.

The problem now: if the Server Manager selects SERVER1 for the TC session and the SERVER3 LB also assigns SERVER1, loading into CATIA works. If the SERVER3 LB assigns SERVER2 instead, loading into CATIA fails.

Is there any other method we can try to avoid that? Perhaps without using SERVER3 for the LB? Or am I missing something? Thanks for any suggestions.

 

Re: Best practice - J2EE Load Balancing

Solution Partner Esteemed Contributor
I haven't used a software load balancer and don't know how the "sticky bit" gets set. Since you are only load balancing multiple web servers (the TreeCache takes care of tracking and routing to the correct tcserver), the load balancer needs to understand which user is assigned to which web server and route accordingly. It sounds like you don't have the "sticky bit" set, which allows the load balancer to randomly route the user to any available web server instead of routing them to the web server they originally connected to. That's fine for basic web traffic (like Google) but not correct behavior for applications.

Randy Ellsworth, Teamcenter Architect, Applied CAx, LLC

Re: Best practice - J2EE Load Balancing

Experimenter

I "inherited" the Teamcenter environment :) In the past, a hardware load balancer was active with TC10.1 and worked properly. When we updated to TC11.3 and TCIC 11.0.6, this setup no longer worked; loading to CATIA was a major problem. This is why we turned the HW LB off.

I am currently checking whether I can replicate the behaviour using a SW LB in a test environment and get it working there. If that is not possible, I will test with a HW LB.

So if I understand correctly, should everything work if we use a HW LB with the following configuration?

SERVER1 Treecache pointing to SERVER2

SERVER2 Treecache pointing to SERVER1

LB pointing to the webtier of SERVER1 & SERVER2

Is any other configuration needed besides the sticky bit?

Re: Best practice - J2EE Load Balancing

Solution Partner Esteemed Contributor
That's all that I'm aware of. Sounds like you're on the right track now.

Randy Ellsworth, Teamcenter Architect, Applied CAx, LLC