Long time for the cluster nodes to connect after the startup time (#298)
Closed
Artur Hefczyc opened 1 decade ago
Due Date
2016-10-06

All the details are on the forums: message#1591

Artur Hefczyc commented 1 decade ago

Wojtek, when we talked about this last time you mentioned that you had already committed some fixes for this, but the problem still appears to affect some users. So this might have more than one cause.

Could you please add some update on this issue in the comments?

wojciech.kapcia@tigase.net commented 9 years ago

With the latest code the nodes connect within ~5 seconds; however, the issue described in the thread still occurs - users that connected to each node before the cluster connection was established won't see each other's presence unless they re-broadcast it. Ideally, after establishing the cluster connection there should be a synchronization of online users (which I believe was implemented, but is removed now, vide: commit:499ba42281391b12259dae7483bfdfbc34ac4a27)

Artur Hefczyc commented 9 years ago

As far as I remember the synchronization is not removed, it was moved to a different place. It is no longer implemented inside the SM; it is implemented at the clustering strategy level, as a strategy command: tigase.server.cluster.strategy.cmd. At least this is how it is supposed to work.

Please verify that it works!

However, this is a different kind of presence synchronization from what you are talking about. The cluster synchronization is for synchronizing cluster cache data so that each node knows who is online and where the user is connected. It does not resend or rebroadcast the user's presence data to all contacts, and I think it should not do that.

A correct way to handle it is, in my opinion, to make Tigase delay accepting users' connections until cluster connections are established. Assuming that cluster connections are established within 5 seconds, we could implement a delay of 30 seconds in Tigase, after which it starts accepting users' connections when started in cluster mode.

Artur Hefczyc commented 9 years ago

This is actually important stuff for 7.1.0.

wojciech.kapcia@tigase.net commented 8 years ago

Artur Hefczyc wrote:

As far as I remember the synchronization is not removed, it was moved to a different place. It is no longer implemented inside the SM; it is implemented at the clustering strategy level, as a strategy command: tigase.server.cluster.strategy.cmd. At least this is how it is supposed to work.

Please verify that it works!

Yes, it was moved to ACS: tigase.server.cluster.strategy.cmd.RequestSyncOnlineCmd, and it works (and I guess there's not much sense to have it in the default strategy, as it doesn't utilize caching and the received sync data was ignored in this case).

However, this is a different kind of presence synchronization from what you are talking about. The cluster synchronization is for synchronizing cluster cache data so that each node knows who is online and where the user is connected. It does not resend or rebroadcast the user's presence data to all contacts, and I think it should not do that.

A correct way to handle it is, in my opinion, to make Tigase delay accepting users' connections until cluster connections are established. Assuming that cluster connections are established within 5 seconds, we could implement a delay of 30 seconds in Tigase, after which it starts accepting users' connections when started in cluster mode.

%kobit

I was doing some tests and also analysing the issue, and I think that simply delaying accepting user connections won't be an ideal solution (for example, when establishing the clustering connection takes longer for some reason).

I think that, with clustering mode enabled, we could use the following logic:

  • delay listening on the user connection ports;

  • wait till the cluster-repo is reloaded (so we know how many nodes there are) and only open user ports after all of them have connected.

What do you think?
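A rough sketch of the gating I have in mind (all class and method names here are made up for illustration, not the actual Tigase API):

import java.util.HashSet;
import java.util.Set;

// Illustrative only - not the real Tigase classes.
class ClusterStartupGate {

    private final Set<String> expectedNodes = new HashSet<>();   // nodes listed in the cluster-repo, excluding this node
    private final Set<String> connectedNodes = new HashSet<>();
    private boolean repoLoaded = false;
    private boolean userPortsOpen = false;

    // called once the cluster repository has been reloaded
    synchronized void onClusterRepoReloaded(Set<String> otherNodes) {
        expectedNodes.addAll(otherNodes);
        repoLoaded = true;
        maybeOpenUserPorts();
    }

    // called for every established cluster connection
    synchronized void onNodeConnected(String nodeId) {
        connectedNodes.add(nodeId);
        maybeOpenUserPorts();
    }

    private void maybeOpenUserPorts() {
        if (userPortsOpen || !repoLoaded) {
            return;
        }
        // single-node cluster: expectedNodes is empty, so this passes immediately;
        // otherwise wait until every node from the repository has connected
        if (connectedNodes.containsAll(expectedNodes)) {
            userPortsOpen = true;
            openUserPorts();
        }
    }

    private void openUserPorts() {
        // start listening on the c2s/BOSH/WebSocket ports here
    }
}

The key point is that the user ports stay closed until the repository has been read and every node listed in it has connected.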


Also, there is a possibly related problem: when the nodes are online and work correctly, but for some reason the cluster connection breaks, all the connections/disconnections/status changes during that time won't be visible after the cluster connection is re-established.

Should we also think of handling this case or do we consider it as 'abnormal'?

Artur Hefczyc commented 8 years ago

Wojciech Kapcia wrote:

Artur Hefczyc wrote:

As far as I remember the synchronization is not removed, it was moved to a different place. It is no longer implemented inside the SM; it is implemented at the clustering strategy level, as a strategy command: tigase.server.cluster.strategy.cmd. At least this is how it is supposed to work.

Please verify that it works!

Yes, it was moved to ACS: tigase.server.cluster.strategy.cmd.RequestSyncOnlineCmd, and it works (and I guess there's not much sense to have it in the default strategy, as it doesn't utilize caching and the received sync data was ignored in this case).

However, this is a different kind of presence synchronization from what you are talking about. The cluster synchronization is for synchronizing cluster cache data so that each node knows who is online and where the user is connected. It does not resend or rebroadcast the user's presence data to all contacts, and I think it should not do that.

A correct way to handle it is, in my opinion, to make Tigase delay accepting users' connections until cluster connections are established. Assuming that cluster connections are established within 5 seconds, we could implement a delay of 30 seconds in Tigase, after which it starts accepting users' connections when started in cluster mode.

%kobit

I was doing some tests and also analysing the issue, and I think that simply delaying accepting user connections won't be an ideal solution (for example, when establishing the clustering connection takes longer for some reason).

I think that, with clustering mode enabled, we could use the following logic:

  • delay listening on the user connection ports;
  • wait till the cluster-repo is reloaded (so we know how many nodes there are) and only open user ports after all of them have connected.

What do you think?

Yes, I think this is a good approach. By delaying "accepting user connections" I meant something like what you described above. I mean, some logic which either delays listening on user connection ports or prevents users from connecting to the node in some other way. I think the last point - loading the cluster-repo and waiting for all nodes to be connected - makes the most sense.


Also, there is a possibly related problem: when the nodes are online and work correctly, but for some reason the cluster connection breaks, all the connections/disconnections/status changes during that time won't be visible after the cluster connection is re-established.

Should we also think of handling this case or do we consider it as 'abnormal'?

This is something we should take care of as well, in future versions. The first, most important step would be to make sure all cluster nodes resynchronize after reconnecting; the second step would be to make sure users are up to date with all the changes which happened in the meantime.

The first step seems to be relatively simple but, I imagine, the second step might be quite difficult and maybe it is even overkill.

"Normally" we require very good connectivity between cluster nodes in order to consider clustering useful. We do not guarantee that clustering will work correctly if there are connectivity issues between cluster nodes.

wojciech.kapcia@tigase.net commented 8 years ago

I've implemented delaying listening on user ports and tied opening them to the appropriate events (opening when there is a single node in the cluster, or when there are more nodes and a connection has been established to all of them). I've also decreased the initial reload time, so establishing clustering connections now takes only a couple of seconds.

In addition, I've added an expiration timer to start listening for connections after roughly 2 minutes (just in case), and there are log entries informing that listening for user connections was delayed.
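For reference, a simplified sketch of the delay-plus-expiration behaviour described above (the class, method names and logging are invented for illustration; only the ~2-minute expiration value mirrors the actual change):

import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

// Illustrative only - not the real Tigase code.
class DelayedPortOpener {

    private static final Logger log = Logger.getLogger(DelayedPortOpener.class.getName());
    private static final long MAX_DELAY_MS = 2 * 60 * 1000L;  // "just in case" expiration

    private final AtomicBoolean opened = new AtomicBoolean(false);
    private final Timer timer = new Timer("delayed-port-opener", true);

    void onStartup() {
        log.info("Delaying opening of user connection ports until cluster connections are established");
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                // open the ports anyway if the cluster event never arrives
                openOnce("expiration timer fired");
            }
        }, MAX_DELAY_MS);
    }

    // called when connections to all known cluster nodes are established
    // (or immediately when this is the only node in the cluster)
    void onClusterConnected() {
        openOnce("cluster connections established");
    }

    private void openOnce(String reason) {
        if (opened.compareAndSet(false, true)) {
            timer.cancel();
            log.info("Opening user connection ports: " + reason);
            // ... start listening on the client ports here ...
        }
    }
}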

I've tested it quite thoroughly and everything was working correctly.

%andrzej.wojcik - could you review the changes in origin/1783-cluster-connections? If you don't find any issues I think we could still merge it to the current origin/stable, as it hasn't been released (instead of 7.1.1).

Andrzej Wójcik (Tigase) commented 8 years ago

I reviewed the changes from origin/1783-cluster-connection and I would say they are OK, with a few exceptions. Below I listed possible issues I can see with the current implementation.

  1. We cannot register the event handler in the getDefaults() method! This method may be called many times, and it is even possible to reconfigure the kernel, unload an instance and later load a new one. In such a case we would get more than one event handler registered! This is bad. Also, each handler will keep in memory the instance which was unloaded, due to the strong reference between the event handler and the ConnectionManager instance.

Solution: I think it would be far better if we moved the creation and registration of the event handler to the start() method and added a call to unregister the event handler during execution of the stop()/release() method. These methods are implemented by most (if not all) components (see the sketch below the list).

  2. What if we are reconfiguring the server without stopping it (yes, it is possible)? Let's say I disable the BOSH component and later enable it again - it will never receive CLUSTER_INITIALIZED_EVENT and it will never start to accept connections!

  3. We may have issues with see-other-host when we use hash, as other cluster nodes may already redirect users to a newly started cluster node which is not accepting connections - this will create a loop until the server properly starts and synchronizes with the whole cluster, which can take a while.

  4. I'm not sure, but I think we should block S2S connections as well. How can we route packets to users if we do not have knowledge about their presence?
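A rough sketch of the pattern suggested in point 1 (the event bus, handler and component types below are simplified stand-ins, not the real Tigase classes):

// Simplified stand-ins for the real event bus API.
interface EventHandler {
    void onEvent(String eventName);
}

interface EventBus {
    void registerHandler(EventHandler handler);
    void unregisterHandler(EventHandler handler);
}

class SomeConnectionManager {

    private final EventBus eventBus;
    // kept as a field so that the exact same instance can be unregistered later
    private EventHandler clusterInitializedHandler;

    SomeConnectionManager(EventBus eventBus) {
        this.eventBus = eventBus;
    }

    // getDefaults() is intentionally NOT used for registration: it may be called
    // many times, and every call would leak another handler holding this instance.

    void start() {
        clusterInitializedHandler = eventName -> {
            if ("CLUSTER_INITIALIZED_EVENT".equals(eventName)) {
                openUserPorts();
            }
        };
        eventBus.registerHandler(clusterInitializedHandler);
    }

    void stop() {
        // unregister in stop()/release() so an unloaded instance is not kept alive
        // by the event bus through a strong reference
        if (clusterInitializedHandler != null) {
            eventBus.unregisterHandler(clusterInitializedHandler);
            clusterInitializedHandler = null;
        }
    }

    private void openUserPorts() {
        // start accepting user connections here
    }
}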

wojciech.kapcia@tigase.net commented 8 years ago

Andrzej Wójcik wrote:

I reviewed the changes from origin/1783-cluster-connection and I would say they are OK, with a few exceptions. Below I listed possible issues I can see with the current implementation.

Thank you!

1 We cannot register the event handler in the getDefaults() method! This method may be called many times, and it is even possible to reconfigure the kernel, unload an instance and later load a new one. In such a case we would get more than one event handler registered! This is bad. Also, each handler will keep in memory the instance which was unloaded, due to the strong reference between the event handler and the ConnectionManager instance.

Solution: I think it would be far better if we moved the creation and registration of the event handler to the start() method and added a call to unregister the event handler during execution of the stop()/release() method. These methods are implemented by most (if not all) components.

OK, makes sense (not using OSGi nor having familiarised myself with the kernel yet made me miss those use-cases). I've made the suggested changes.

2 What if we are reconfiguring the server without stopping it (yes, it is possible)? Let's say I disable the BOSH component and later enable it again - it will never receive CLUSTER_INITIALIZED_EVENT and it will never start to accept connections!

OK, is this documented somewhere? What is done, what is called and what state can we expect at different moments? From what I remember it's possible to manage component configuration through ad-hocs (and some components don't support it, hence setProperties() calls with single-parameter maps are ignored in Tigase <= 7.1.x), but this doesn't affect the component accepting connections.

The other case is management of components through a groovy script, but is there a way to distinguish it (i.e. initialization of a single component vs initialization of the whole server, so we could assume the cluster is ready and avoid delaying opening connections)? Does OSGi mode utilize the same/similar approach? From what I've checked it could be quite tricky. We could store information about cluster availability in tigase.cluster.ClusterConnectionManager or a system property and enable delaying based on it, but this feels awkward at best.

From what I gather, with TKF this should be both easier to achieve and doable in a saner manner - correct?

However, please note that to avoid the problem of this event going missing for some reason I've included an expiration timer, as mentioned in the previous post - it has a default time-out of 2 minutes (default delay * 60, so this may be too long; granted, this is far from ideal).

3 We may have issues with see-other-host when we use hash, as other cluster nodes may already redirect users to a newly started cluster node which is not accepting connections - this will create a loop until the server properly starts and synchronizes with the whole cluster, which can take a while.

Right now establishing the cluster connection is quite fast, and ACS synchronization is done separately - correct?

I think the window would be really slim:

  • we have a cluster of 3 nodes already connected;

  • we are starting a new node - it initializes, reads the cluster repository, and starts new connections to the items read from the repository;

  • on the connection-receiving nodes we reload the repository on an incoming connection, therefore we have up-to-date data and passwords (this should be done a bit differently, in a separate thread at least, but that's a separate issue, i.e. #4127);

  • the connection is established, #onNodeConnected is called across the cluster, and only at that moment do we both start to accept new connections and include the new node in the redirection map.

Correct?

4 I'm not sure, but I think we should block S2S connections as well. How can we route packets to users if we do not have knowledge about their presence?

OK, in this case the chances of establishing an s2s connection and missing user presence are even slimmer - usually, in the case of a cluster, an s2s connection would already be established to one of the nodes. Of course we can add a similar mechanism there.

Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

Andrzej Wójcik wrote:

I reviewed the changes from origin/1783-cluster-connection and I would say they are OK, with a few exceptions. Below I listed possible issues I can see with the current implementation.

Thank you!

1 We cannot register the event handler in the getDefaults() method! This method may be called many times, and it is even possible to reconfigure the kernel, unload an instance and later load a new one. In such a case we would get more than one event handler registered! This is bad. Also, each handler will keep in memory the instance which was unloaded, due to the strong reference between the event handler and the ConnectionManager instance.

Solution: I think it would be far better if we moved the creation and registration of the event handler to the start() method and added a call to unregister the event handler during execution of the stop()/release() method. These methods are implemented by most (if not all) components.

OK, makes sense (not using OSGi nor having familiarised myself with the kernel yet made me miss those use-cases). I've made the suggested changes.

I'm not talking only about OSGi mode. We have the CompManager.groovy script which allows us to enable/disable and reconfigure components without the server being restarted. There are calls to getDefaults() and other calls which lead to reconfiguration via setProperties() (called by Configurator.updateMessageRouter()). With this script we can disable a component and some time later enable it once again (the same classes will be used, but different instances!).

2 What if we are reconfiguring the server without stopping it (yes, it is possible)? Let's say I disable the BOSH component and later enable it again - it will never receive CLUSTER_INITIALIZED_EVENT and it will never start to accept connections!

OK, is this documented somewhere? What is done, what is called and what state can we expect at different moments? From what I remember it's possible to manage component configuration through ad-hocs (and some components don't support it, hence setProperties() calls with single-parameter maps are ignored in Tigase <= 7.1.x), but this doesn't affect the component accepting connections.

The script which I mentioned above, CompManager, results in calls to setProperties() with the whole configuration being passed - so it is not a call with a single property, and it works in every Tigase XMPP Server version.

The other case is management of components through a groovy script, but is there a way to distinguish it (i.e. initialization of a single component vs initialization of the whole server, so we could assume the cluster is ready and avoid delaying opening connections)? Does OSGi mode utilize the same/similar approach? From what I've checked it could be quite tricky. We could store information about cluster availability in tigase.cluster.ClusterConnectionManager or a system property and enable delaying based on it, but this feels awkward at best.

OSGi uses the same methods in <= 7.1.0 as the CompManager.groovy adhoc script. I think there is no easy way to distinguish this. However, I think it could be possible to pass some additional properties to components if clustering has already started. But in the case of OSGi it is possible that the component will be registered (reconfigured) while the server is started but before the cluster nodes connect to each other. I know this is a rare scenario, but it is possible.

From what I gather, with TKF this should be both easier to achieve and doable in a saner manner - correct?

To be honest, I'm not sure if it will be easier. We will know if a component is started, stopped or reconfigured, but there will still be no way to tell whether the cluster is already synced.

However, please note that to avoid the problem of this event going missing for some reason I've included an expiration timer, as mentioned in the previous post - it has a default time-out of 2 minutes (default delay * 60, so this may be too long; granted, this is far from ideal).

I agree - the timer will somehow deal with this situation, but then we need to mention it in the documentation.

3 We may have issues with see-other-host when we use hash, as other cluster nodes may already redirect users to a newly started cluster node which is not accepting connections - this will create a loop until the server properly starts and synchronizes with the whole cluster, which can take a while.

Right now establishing the cluster connection is quite fast, and ACS synchronization is done separately - correct?

I think the window would be really slim:

  • we have a cluster of 3 nodes already connected;
  • we are starting a new node - it initializes, reads the cluster repository, and starts new connections to the items read from the repository;
  • on the connection-receiving nodes we reload the repository on an incoming connection, therefore we have up-to-date data and passwords (this should be done a bit differently, in a separate thread at least, but that's a separate issue, i.e. #4127);
  • the connection is established, #onNodeConnected is called across the cluster, and only at that moment do we both start to accept new connections and include the new node in the redirection map.

Correct?

For sure, #onNodeConnected is called after the cluster connection between nodes is established, so it should work OK. For some reason I was thinking that we were delaying until the ACS cache is in sync; if not, then OK.

4 I'm not sure, but I think we should block S2S connections as well. How can we route packets to users if we do not have knowledge about their presence?

OK, in this case the chances of establishing an s2s connection and missing user presence are even slimmer - usually, in the case of a cluster, an s2s connection would already be established to one of the nodes. Of course we can add a similar mechanism there.

If we wait only for cluster connections and not for the ACS cache sync, then it is OK with me.

wojciech.kapcia@tigase.net commented 8 years ago

Andrzej Wójcik wrote:

The other case is management of components through a groovy script, but is there a way to distinguish it (i.e. initialization of a single component vs initialization of the whole server, so we could assume the cluster is ready and avoid delaying opening connections)? Does OSGi mode utilize the same/similar approach? From what I've checked it could be quite tricky. We could store information about cluster availability in tigase.cluster.ClusterConnectionManager or a system property and enable delaying based on it, but this feels awkward at best.

OSGi uses the same methods in <= 7.1.0 as the CompManager.groovy adhoc script. I think there is no easy way to distinguish this. However, I think it could be possible to pass some additional properties to components if clustering has already started. But in the case of OSGi it is possible that the component will be registered (reconfigured) while the server is started but before the cluster nodes connect to each other. I know this is a rare scenario, but it is possible.

As mentioned earlier, there is a possibility to use the 'ugly' solution - we could have an isClusterReady() method in ClusterConnectionManager and access it using XMPPServer.getConfigurator().getComponent( "cl-comp" ) (which is already done in a couple of places...).
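Something along these lines - a sketch only, since isClusterReady() is the proposed method and does not exist yet, and this is just a helper-method fragment with simplified types:

// Sketch: treat a missing "cl-comp" component as "do not delay user connections".
boolean clusterReady() {
    Object clComp = XMPPServer.getConfigurator().getComponent("cl-comp");
    if (clComp instanceof ClusterConnectionManager) {
        return ((ClusterConnectionManager) clComp).isClusterReady();
    }
    // component not loaded (e.g. non-cluster mode) - nothing to wait for
    return true;
}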

Any cons?

From what I gather, with TKF this should be both easier to achieve and doable in a saner manner - correct?

To be honest, I'm not sure if it will be easier. We will know if a component is started, stopped or reconfigured, but there will still be no way to tell whether the cluster is already synced.

When I mentioned 'easier' I meant something along the lines of the above - more direct access to other components (and I think, in addition to configuration handling, this was the motivation behind the Kernel)...

However, please note that to avoid the problem of this event going missing for some reason I've included an expiration timer, as mentioned in the previous post - it has a default time-out of 2 minutes (default delay * 60, so this may be too long; granted, this is far from ideal).

I agree - the timer will somehow deal with this situation, but then we need to mention it in the documentation.

Will do, thanks for the comments so far.

Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

Andrzej Wójcik wrote:

The other case is management of components through a groovy script, but is there a way to distinguish it (i.e. initialization of a single component vs initialization of the whole server, so we could assume the cluster is ready and avoid delaying opening connections)? Does OSGi mode utilize the same/similar approach? From what I've checked it could be quite tricky. We could store information about cluster availability in tigase.cluster.ClusterConnectionManager or a system property and enable delaying based on it, but this feels awkward at best.

OSGi uses the same methods in <= 7.1.0 as the CompManager.groovy adhoc script. I think there is no easy way to distinguish this. However, I think it could be possible to pass some additional properties to components if clustering has already started. But in the case of OSGi it is possible that the component will be registered (reconfigured) while the server is started but before the cluster nodes connect to each other. I know this is a rare scenario, but it is possible.

As mentioned earlier, there is a possibility to use the 'ugly' solution - we could have an isClusterReady() method in ClusterConnectionManager and access it using XMPPServer.getConfigurator().getComponent( "cl-comp" ) (which is already done in a couple of places...).

Any cons?

I think there will be no cons in this case - apart from the direct access to the other component, but I think we can skip that con; just make sure this solution will work if we get null instead of the cl-comp component instance.

From what I gather, with TKF this should be both easier to achieve and doable in a saner manner - correct?

To be honest, I'm not sure if it will be easier. We will know if a component is started, stopped or reconfigured, but there will still be no way to tell whether the cluster is already synced.

When I mentioned 'easier' I meant something along the lines of the above - more direct access to other components (and I think, in addition to configuration handling, this was the motivation behind the Kernel)...

Yes, this would be allowed by adding a field and an annotation to that field.
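For example, something like this (a sketch assuming the kernel's @Inject field annotation - the exact package and attribute names may differ - and isClusterReady() is still the hypothetical method discussed above):

import tigase.cluster.ClusterConnectionManager;
import tigase.kernel.beans.Inject;

class SomeComponent {

    // the kernel injects the clustering component directly, instead of looking it
    // up via XMPPServer.getConfigurator().getComponent("cl-comp")
    @Inject(nullAllowed = true)
    private ClusterConnectionManager clusterConnectionManager;

    boolean clusterReady() {
        // no clustering component (non-cluster mode) means there is nothing to wait for
        return clusterConnectionManager == null || clusterConnectionManager.isClusterReady();
    }
}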

However, please note that to avoid the problem of this event going missing for some reason I've included an expiration timer, as mentioned in the previous post - it has a default time-out of 2 minutes (default delay * 60, so this may be too long; granted, this is far from ideal).

I agree - the timer will somehow deal with this situation, but then we need to mention it in the documentation.

Will do, thanks for the comments so far.

wojciech.kapcia@tigase.net commented 8 years ago

The changes have been completed. I've incorporated the above comments to avoid delaying client connectivity when reloading a component while the cluster is active; I've added configuration parameters to control it, and documentation with the rationale and a description of how it works.

Given that it was originally intended for 7.1.0, later pushed to 7.1.1, and that 7.1.0 hasn't been published yet, the change was finally merged to origin/release and will be included in 7.1.0.

Artur Hefczyc commented 8 years ago

Perfect! Thank you.

Type
Bug
Priority
Normal
Assignee
RedmineID
1783
Version
tigase-server-7.1.0
Spent time
231h
Issue Votes (0)
Watchers (0)
Reference
tigase/_server/server-core#298