Projects tigase _server server-core Issues #754
client-port-delay-listening causes exception (#754)
wojciech.kapcia@tigase.net opened 8 years ago
Due Date
2017-03-01
--client-port-delay-listening = false
2017-02-07 17:10:07.305 [ResultsListener-socketWriteThread-31]  SocketThread$ResultsListener.run()  WARNING: Protocol execution exception.
java.util.concurrent.ExecutionException: java.lang.NullPointerException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at tigase.net.SocketThread$ResultsListener.run(SocketThread.java:568)
Caused by: java.lang.NullPointerException
        at tigase.server.AbstractMessageReceiver.addTimerTask(AbstractMessageReceiver.java:1193)
        at tigase.server.AbstractMessageReceiver$PacketReceiverTask.<init>(AbstractMessageReceiver.java:1376)
        at tigase.server.AbstractMessageReceiver$PacketReceiverTask.<init>(AbstractMessageReceiver.java:1334)
        at tigase.server.AbstractMessageReceiver.addOutPacketWithTimeout(AbstractMessageReceiver.java:1179)
        at tigase.server.xmppclient.ClientConnectionManager.xmppStreamClosed(ClientConnectionManager.java:418)
        at tigase.server.xmppclient.ClientConnectionManager.serviceStopped(ClientConnectionManager.java:319)
        at tigase.server.xmppclient.ClientConnectionManager.serviceStopped(ClientConnectionManager.java:64)
        at tigase.net.IOService.forceStop(IOService.java:321)
        at tigase.xmpp.XMPPIOService.forceStop(XMPPIOService.java:300)
        at tigase.xmpp.XMPPIOService.processSocketData(XMPPIOService.java:750)
        at tigase.net.IOService.call(IOService.java:248)
        at tigase.xmpp.XMPPIOService.call(XMPPIOService.java:214)
        at tigase.xmpp.XMPPIOService.call(XMPPIOService.java:64)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
wojciech.kapcia@tigase.net commented 8 years ago

Actually the root cause seems to be different.

wojciech.kapcia@tigase.net commented 8 years ago

It looks like there root cause is the attempt at connection before server finished startup:

2017-02-08 18:22:35.678 [pool-9-thread-1]  ClientConnectionManager.xmppStreamOpened()  FINER: Sending a system command to SM: from=null, to=null, DATA=<iq from="c2s@atlantiscity.local/0:0:0:0:0:0:0:1_5222_0:0:0:0:0:0:0:1_50408" type="set" id="c2s--c2s1" to="sess-man@atlantiscity.local"><command xmlns="http://jabber.org/protocol/commands" node="STREAM_OPENED"><x xmlns="jabber:x:data" type="submit"><field var="session-id"><value>f374a34f-5f26-4ce5-9f71-ba547bb98c4b</value></field><field var="hostname"><value>atlantiscity</value></field><field var="xml:lang"><value>en</value></field></x></command></iq>, SIZE=455, XMLNS=null, PRIORITY=SYSTEM, PERMISSION=NONE, TYPE=set
2017-02-08 18:22:35.679 [pool-9-thread-1]  XMPPIOService.processSocketData()  INFO:     c2s@atlantiscity.local/0:0:0:0:0:0:0:1_5222_0:0:0:0:0:0:0:1_50408, type: accept, Socket: c2s@atlantiscity.local/0:0:0:0:0:0:0:1_5222_0:0:0:0:0:0:0:1_50408 Socket[addr=/0:0:0:0:0:0:0:1,port=50408,localport=5222], jid: null, Incorrect XML data: <?xml version="1.0"?>
<stream:stream xmlns:stream="http://etherx.jabber.org/streams" version="1.0" xmlns="jabber:client" to="atlantiscity" xml:lang="en" xmlns:xml="http://www.w3.org/XML/1998/namespace">
, stopping connection: c2s@atlantiscity.local/0:0:0:0:0:0:0:1_5222_0:0:0:0:0:0:0:1_50408, exception:
java.lang.NullPointerException
        at tigase.server.AbstractMessageReceiver.addTimerTask(AbstractMessageReceiver.java:1193)
        at tigase.server.AbstractMessageReceiver$PacketReceiverTask.<init>(AbstractMessageReceiver.java:1376)
        at tigase.server.AbstractMessageReceiver$PacketReceiverTask.<init>(AbstractMessageReceiver.java:1334)
        at tigase.server.AbstractMessageReceiver.addOutPacketWithTimeout(AbstractMessageReceiver.java:1179)
        at tigase.server.xmppclient.ClientConnectionManager.xmppStreamOpened(ClientConnectionManager.java:529)
        at tigase.xmpp.XMPPIOService.xmppStreamOpened(XMPPIOService.java:830)
        at tigase.xmpp.XMPPDomBuilderHandler.startElement(XMPPDomBuilderHandler.java:301)
        at tigase.xml.SimpleParser.parse(SimpleParser.java:343)
        at tigase.xmpp.XMPPIOService.processSocketData(XMPPIOService.java:702)
        at tigase.net.IOService.call(IOService.java:248)
        at tigase.xmpp.XMPPIOService.call(XMPPIOService.java:214)
        at tigase.xmpp.XMPPIOService.call(XMPPIOService.java:64)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2017-02-08 18:22:35.681 [main]             AbstractBeanConfigurator.configure()  CONFIG: Configuring bean 'cidConnectionsOpenerService'...
2017-02-08 18:22:35.683 [pool-9-thread-1]  ConnectionManager.serviceStopped()  FINER:   [[c2s]] Connection stopped: c2s@atlantiscity.local/0:0:0:0:0:0:0:1_5222_0:0:0:0:0:0:0:1_50408, type: accept, Socket: c2s@atlantiscity.local/0:0:0:0:0:0:0:1_5222_0:0:0:0:0:0:0:1_50408 Socket[addr=/0:0:0:0:0:0:0:1,port=50408,localport=5222], jid: null
2017-02-08 18:22:35.683 [pool-9-thread-1]  ClientConnectionManager.xmppStreamClosed()  FINER: Stream closed: c2s@atlantiscity.local/0:0:0:0:0:0:0:1_5222_0:0:0:0:0:0:0:1_50408
2017-02-08 18:22:35.684 [ResultsListener-socketWriteThread-32]  SocketThread$ResultsListener.run()  WARNING: Protocol execution exception.
java.util.concurrent.ExecutionException: java.lang.NullPointerException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at tigase.net.SocketThread$ResultsListener.run(SocketThread.java:568)
Caused by: java.lang.NullPointerException
        at tigase.server.AbstractMessageReceiver.addTimerTask(AbstractMessageReceiver.java:1193)
        at tigase.server.AbstractMessageReceiver$PacketReceiverTask.<init>(AbstractMessageReceiver.java:1376)
        at tigase.server.AbstractMessageReceiver$PacketReceiverTask.<init>(AbstractMessageReceiver.java:1334)
        at tigase.server.AbstractMessageReceiver.addOutPacketWithTimeout(AbstractMessageReceiver.java:1179)
        at tigase.server.xmppclient.ClientConnectionManager.xmppStreamClosed(ClientConnectionManager.java:418)
        at tigase.server.xmppclient.ClientConnectionManager.serviceStopped(ClientConnectionManager.java:319)
        at tigase.server.xmppclient.ClientConnectionManager.serviceStopped(ClientConnectionManager.java:64)
        at tigase.net.IOService.forceStop(IOService.java:321)
        at tigase.xmpp.XMPPIOService.forceStop(XMPPIOService.java:300)
        at tigase.xmpp.XMPPIOService.processSocketData(XMPPIOService.java:750)
        at tigase.net.IOService.call(IOService.java:248)
        at tigase.xmpp.XMPPIOService.call(XMPPIOService.java:214)
        at tigase.xmpp.XMPPIOService.call(XMPPIOService.java:64)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

After everything is correctly initialised exception is not thrown.

wojciech.kapcia@tigase.net commented 8 years ago

Ports are started before starting threads causing the issue:

"main@1" prio=5 tid=0x1 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
	  at tigase.server.ConnectionManager.startService(ConnectionManager.java:1231)
	  at tigase.server.ConnectionManager.access$200(ConnectionManager.java:65)
	  at tigase.server.ConnectionManager$PortConfigBean.beanConfigurationChanged(ConnectionManager.java:1584)
	  at tigase.server.ConnectionManager$PortConfigBean.initialize(ConnectionManager.java:1603)
	  at tigase.kernel.core.Kernel.initBean(Kernel.java:148)
	  at tigase.kernel.core.Kernel.injectDependencies(Kernel.java:551)
	  at tigase.kernel.core.Kernel.injectDependencies(Kernel.java:726)
	  at tigase.kernel.core.Kernel.registerBean(Kernel.java:929)
	  at tigase.kernel.core.BeanConfigBuilder.execWithoutInject(BeanConfigBuilder.java:113)
	  at tigase.kernel.core.BeanConfigBuilder.exec(BeanConfigBuilder.java:102)
	  at tigase.server.ConnectionManager$PortsConfigBean.initialize(ConnectionManager.java:1666)
	  at tigase.kernel.core.Kernel.initBean(Kernel.java:148)
	  at tigase.kernel.core.Kernel.injectDependencies(Kernel.java:551)
	  at tigase.kernel.core.Kernel.injectDependencies(Kernel.java:726)
	  at tigase.kernel.core.Kernel$DelayedDependenciesInjection.inject(Kernel.java:1288)
	  at tigase.kernel.core.Kernel.finishDependecyDelayedInjection(Kernel.java:965)
	  at tigase.kernel.core.Kernel.initBean(Kernel.java:139)
	  at tigase.kernel.core.Kernel.injectDependencies(Kernel.java:551)
	  at tigase.kernel.core.Kernel.initBean(Kernel.java:143)
	  at tigase.kernel.core.Kernel.getInstance(Kernel.java:371)
	  at tigase.server.Bootstrap.start(Bootstrap.java:108)
	  at tigase.server.XMPPServer.start(XMPPServer.java:137)
	  at tigase.server.XMPPServer.main(XMPPServer.java:118)

followed by:

"main@1" prio=5 tid=0x1 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
	  at tigase.server.AbstractMessageReceiver.startThreads(AbstractMessageReceiver.java:1249)
	  at tigase.server.AbstractMessageReceiver.start(AbstractMessageReceiver.java:820)
	  at tigase.server.xmppsession.SessionManager.start(SessionManager.java:527)
	  at tigase.server.MessageRouter.setComponentsAll(MessageRouter.java:138)
	  at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
	  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	  at java.lang.reflect.Method.invoke(Method.java:498)
	  at tigase.kernel.BeanUtils.setValue(BeanUtils.java:158)
	  at tigase.kernel.core.Kernel.inject(Kernel.java:530)
	  at tigase.kernel.core.Kernel.injectDependencies(Kernel.java:593)
	  at tigase.kernel.core.Kernel.initBean(Kernel.java:143)
	  at tigase.kernel.core.Kernel.getInstance(Kernel.java:371)
	  at tigase.server.Bootstrap.start(Bootstrap.java:108)
	  at tigase.server.XMPPServer.start(XMPPServer.java:137)
	  at tigase.server.XMPPServer.main(XMPPServer.java:118)
wojciech.kapcia@tigase.net commented 8 years ago

Andrzej, I fixed the issue at hand by introducing tigase.kernel.beans.Initializable#completed and opening the ports only after the MessageRouter bean has finished starting up (which should translate to all dependent beans finished as well) - could you review it?

During getting to know implementation of Kernel/DSL/Bootstrap a couple comments/ideas popped into my mind (and I refrained from going forward with some of them until feedback from you to have consistent implementation):

  • from what I can see currently tigase.kernel.core.DependencyManager#getBeanConfigs() seems to return only current beans - would it make sense to create additional method which would return a collection of all descendent beans? That is instead of returning only amp@, @s2s@, @userRepository@, etc… also return all it's registered beans (i.e.: for @amp would also include @broadcast@, @drop@, etc…)?

  • building on the above - it looks like tigase.kernel.core.DependencyGrapher#getDependencyGraph doesn't include those registered beans (separate task for that?);

  • actually client-port-delay-listening is not working - it was relying on tigase.server.ConnectionManager#connectWaitingTasks however currently tigase.server.ConnectionManager.PortsConfigBean is being used (and I was under the impression, that connectWaitingTasks may/should be removed?) thus I've paused with this one until your feedback.

Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

Andrzej, I fixed the issue at hand by introducing tigase.kernel.beans.Initializable#completed and opening the ports only after the MessageRouter bean has finished starting up (which should translate to all dependent beans finished as well) - could you review it?

I just reviewed it and adjusted to workflow which I think is a better fit.

During getting to know implementation of Kernel/DSL/Bootstrap a couple comments/ideas popped into my mind (and I refrained from going forward with some of them until feedback from you to have consistent implementation):

  • from what I can see currently tigase.kernel.core.DependencyManager#getBeanConfigs() seems to return only current beans - would it make sense to create additional method which would return a collection of all descendent beans? That is instead of returning only amp@, @s2s@, @userRepository@, etc… also return all it's registered beans (i.e.: for @amp would also include @broadcast@, @drop@, etc…)?

No. Bean you passed as an example are beans from subkernel of root kernel and this root kernel should not be aware of them to make sure that there will be no easy way to mix them - ie. what if I would deploy 2 pubsub components named pubsub1 and pubsub2 and both of them have repository beans named @repository@? Then I would have multiple beans for single name and this is not allowed.

  • building on the above - it looks like tigase.kernel.core.DependencyGrapher#getDependencyGraph doesn't include those registered beans (separate task for that?);

Beans you mentioned are registered only if parent bean is active during initialization of a bean instance as in other case there is no point of having them. So I think this works fine.

  • actually client-port-delay-listening is not working - it was relying on tigase.server.ConnectionManager#connectWaitingTasks however currently tigase.server.ConnectionManager.PortsConfigBean is being used (and I was under the impression, that connectWaitingTasks may/should be removed?) thus I've paused with this one until your feedback.

No, it was not considered to be removed as tasks may be registered from other beans (ie. subbeans) before component is fully initialized.

wojciech.kapcia@tigase.net commented 8 years ago

Andrzej Wójcik wrote:

  • actually client-port-delay-listening is not working - it was relying on tigase.server.ConnectionManager#connectWaitingTasks however currently tigase.server.ConnectionManager.PortsConfigBean is being used (and I was under the impression, that connectWaitingTasks may/should be removed?) thus I've paused with this one until your feedback.

No, it was not considered to be removed as tasks may be registered from other beans (ie. subbeans) before component is fully initialized.

side comment: As per discussion on IM - it was simply ignored and a direct call to startService() was used. waitingTasks should be used in this case.

Andrzej Wójcik (Tigase) commented 8 years ago

I decided to use different solution as it is not possible to remove entries from waitingTasks, so I modified PortConfigBean to delay initialization of port.

I've checked and in my opinion now it works properly.

%wojtek Please review my changes if now this works properly.

wojciech.kapcia@tigase.net commented 8 years ago

Andrzej Wójcik wrote:

I decided to use different solution as it is not possible to remove entries from waitingTasks, so I modified PortConfigBean to delay initialization of port.

I've checked and in my opinion now it works properly.

wojtek Please review my changes if now this works properly.

I've run a quick check on our node1/2.xmpp-test.net and while opening of the ports is in fact delayed and listening it doesn't work correctly (due to exception the cluster connection is not even established):

…
Loading component: c2s :: componentInfo{Title=Tigase XMPP Server, Version=7.2.0-SNAPSHOT-b4653/9e0edcc4 (2017-02-15/21:02:46), Class=tigase.cluster.ClientConnectionClustered}
Loading component: ws2s :: componentInfo{Title=Tigase XMPP Server, Version=7.2.0-SNAPSHOT-b4653/9e0edcc4 (2017-02-15/21:02:46), Class=tigase.cluster.WebSocketClientConnectionClustered}
Loading component: s2s :: componentInfo{Title=Tigase XMPP Server, Version=7.2.0-SNAPSHOT-b4653/9e0edcc4 (2017-02-15/21:02:46), Class=tigase.server.xmppserver.S2SConnectionManager}
Loading component: eventbus :: componentInfo{Title=Tigase XMPP Server, Version=7.2.0-SNAPSHOT-b4653/9e0edcc4 (2017-02-15/21:02:46), Class=tigase.eventbus.component.EventBusComponent}
2017-02-15 16:09:37.249 [cluster-nodes]    ConfigRepository$1.run()           SEVERE:   exception during reload of config repository items
java.lang.NullPointerException
        at tigase.server.AbstractMessageReceiver.addTimerTask(AbstractMessageReceiver.java:462)
        at tigase.server.ConnectionManager.reconnectService(ConnectionManager.java:1196)
        at tigase.server.ConnectionManager.addWaitingTask(ConnectionManager.java:896)
        at tigase.cluster.ClusterConnectionManager.itemAdded(ClusterConnectionManager.java:372)
        at tigase.cluster.ClusterConnectionManager.itemAdded(ClusterConnectionManager.java:89)
        at tigase.db.comp.ConfigRepository.addItemNoStore(ConfigRepository.java:175)
        at tigase.db.comp.ConfigRepository.addItem(ConfigRepository.java:163)
        at tigase.cluster.repo.ClConConfigRepository.itemLoaded(ClConConfigRepository.java:190)
        at tigase.cluster.repo.ClConSQLRepository.reload(ClConSQLRepository.java:299)
        at tigase.db.comp.ConfigRepository$1.run(ConfigRepository.java:95)
        at java.util.TimerThread.mainLoop(Timer.java:555)
        at java.util.TimerThread.run(Timer.java:505)
…
Andrzej Wójcik (Tigase) commented 8 years ago

Thank you for reporting this. I was focused on testing delayed opening of ports and did not run a test with more than 1 node.

I found a cause of this issue. It was caused by fact that ConnectionManager were marked as initialized during bean initializations while they are ready for processing data after they are started, so I moved code responsible for marking ConnectionManager as initialized to start() method after processing threads are started.

I've also found that ConnectionManager and BasicComponent have variable with same name (@initializationCompleted@) and now it is set to true in two different places. Maybe it would be a good idea to refactor this? and change initializationCompleted in ConnectionManager to @started@? but it would lead to more API changes in @ConnectionManager@.

wojciech.kapcia@tigase.net commented 8 years ago

Andrzej Wójcik wrote:

Thank you for reporting this. I was focused on testing delayed opening of ports and did not run a test with more than 1 node.

I found a cause of this issue. It was caused by fact that ConnectionManager were marked as initialized during bean initializations while they are ready for processing data after they are started, so I moved code responsible for marking ConnectionManager as initialized to start() method after processing threads are started.

That was my initial fix more or less ;-)

I've checked and now it looks it's working correctly. Only missing bit is the information that the port opening was (intentionally) delayed, which result in something like:

== 2017-02-17 16:36:02.573 Server finished starting up and (if there wasn't any error) is ready to use

2017-02-17 16:36:12.362 [ConnectionOpenThread]  SocketThread.<clinit>()       WARNING:  5 socketReadThreads started.
2017-02-17 16:36:12.376 [ConnectionOpenThread]  SocketThread.<clinit>()       WARNING:  5 socketWriteThreads started.
2017-02-17 16:36:12.474 [pool-1-thread-4]  ClientConnectionClustered.lambda$start$0()  WARNING: Starting listening on ports of component: c2s
2017-02-17 16:36:12.475 [pool-1-thread-1]  WebSocketClientConnectionClustered.lambda$start$0()  WARNING: Starting listening on ports of component: ws2s
2017-02-17 16:36:12.476 [pool-1-thread-2]  BoshConnectionClustered.lambda$start$0()  WARNING: Starting listening on ports of component: bosh

I've also found that ConnectionManager and BasicComponent have variable with same name (@initializationCompleted@) and now it is set to true in two different places. Maybe it would be a good idea to refactor this? and change initializationCompleted in ConnectionManager to @started@? but it would lead to more API changes in @ConnectionManager@.

@started@?

OK, I'll reference once more Startup Time - it has quite clear component lifecycle. currently initializationCompleted is in the @start()@:

public void start() {
	sslContextContainer.start();
	super.start();

	initializationCompleted = true;
	if (!delayPortListening) {
		connectWaitingTasks();
	}

	setupWatchdogThread();
	if ( null != watchdog ){
		watchdog.start();
	}
}

There is a tigase.server.Lifecycle interface, which seems like a good candidate for that purpose, but it may, as you suggested, require more changes.

Andrzej Wójcik (Tigase) commented 8 years ago

As suggested I renamed initializationCompleted in ConnectionManager to @started@.

I also added warnings about delayed opening of ports (originally lost during merging changes from 7.1.0).

wojciech.kapcia@tigase.net commented 8 years ago

Now everything works.

issue 1 of 1
Type
Bug
Priority
Normal
Assignee
RedmineID
4885
Version
tigase-server-8.0.0
Spent time
107h 30m
Issue Votes (0)
Watchers (0)
Reference
tigase/_server/server-core#754
Please wait...
Page is in error, reload to recover