Projects tigase _server server-core Issues #169
Tigase not reconnecting to DB (#169)
Artur Hefczyc opened 1 decade ago

In some cases the Tigase does not reconnect to the database and throws NPE instead. It might be related to a prolonged network outage or concurrent access to prepared statements...

bq. ??So I can reproduce the NPE by just disconnecting the DB and doing some logins. The problem there is that you synchronize on a preparedStatement that cannot be returned because there’s no connection. Generally, just confusing getting the NPE when perhaps it would be better getting a connection problem. So that’s a bit of a red herring. However.. The precipitating factor is that IT CANNOT get a connection… ??

?? here’s a trace ??

2012-11-22 02:13:13  HttpMethodBase.getResponseBody()    WARNING:  Going to buffer response body of large or unknown size. Using getResponseBodyAsStream 
                                     instead is recommended.
2012-11-22 02:13:28  WorkerThread.run()                  SEVERE:   tigase.server.xmppsession.SessionManager$ProcessorWorkerThread,(jabber:iq:auth Queue Worker 2) 
                                     Exception during packet processing: from=c2s@ip-10-159-3-99/10.159.3.99_5222_10.159.20.62_2626, to=sess-man@ip-10-159-3-99, DATA=
<iq id="libxmpp323" xmlns="jabber:client" type="set">
  <query xmlns="jabber:iq:auth">
    <username>some-user</username>
    <token>some-token</token>
  </query>
</iq>
, SIZE=270, XMLNS=jabber:client, PRIORITY=NORMAL, PERMISSION=NONE, TYPE=set
java.lang.NullPointerException
        at tigase.db.jdbc.JDBCRepository.getUserUID(JDBCRepository.java:616)
        at tigase.db.jdbc.JDBCRepository.getUserUID(JDBCRepository.java:1431)
        at tigase.db.jdbc.JDBCRepository.getNodeNID(JDBCRepository.java:1417)
        at tigase.db.jdbc.JDBCRepository.getData(JDBCRepository.java:302)
        at tigase.db.UserRepositoryMDImpl.getData(UserRepositoryMDImpl.java:138)
        at tigase.xmpp.RepositoryAccess.getData(RepositoryAccess.java:262)
        at tigase.xmpp.impl.DomainFilter.getDomains(DomainFilter.java:397)
        at tigase.xmpp.impl.DomainFilter.filter(DomainFilter.java:126)
        at tigase.server.xmppsession.SessionManager.addOutPackets(SessionManager.java:831)
        at tigase.server.xmppsession.SessionManager$ProcessorWorkerThread.process(SessionManager.java:2106)
        at tigase.util.WorkerThread.run(WorkerThread.java:132)

??Here’s it still having problems much later… (eg.. 7 HOURS LATER??

2012-11-22 09:06:52  WorkerThread.run()                  SEVERE:   tigase.server.xmppsession.SessionManager$ProcessorWorkerThread,(jabber:iq:auth Queue Worker 2)
                                     Exception during packet processing: from=c2s@ip-10-159-3-99/10.159.3.99_5222_10.28.4.137_44169, to=sess-man@ip-10-159-3-99, DATA=
<iq id="libxmpp1" xmlns="jabber:client" type="set">
  <query xmlns="jabber:iq:auth">
    <username>some-user</username>
    <token>some-token</token>
  </query>
</iq>
, SIZE=292, XMLNS=jabber:client, PRIORITY=NORMAL, PERMISSION=NONE, TYPE=set
java.lang.NullPointerException
        at tigase.db.jdbc.JDBCRepository.getUserUID(JDBCRepository.java:616)
        at tigase.db.jdbc.JDBCRepository.getUserUID(JDBCRepository.java:1431)
        at tigase.db.jdbc.JDBCRepository.getNodeNID(JDBCRepository.java:1417)
        at tigase.db.jdbc.JDBCRepository.getData(JDBCRepository.java:302)
        at tigase.db.UserRepositoryMDImpl.getData(UserRepositoryMDImpl.java:138)
        at tigase.xmpp.RepositoryAccess.getData(RepositoryAccess.java:262)
        at tigase.xmpp.impl.DomainFilter.getDomains(DomainFilter.java:397)
        at tigase.xmpp.impl.DomainFilter.filter(DomainFilter.java:126)
        at tigase.server.xmppsession.SessionManager.addOutPackets(SessionManager.java:831)
        at tigase.server.xmppsession.SessionManager$ProcessorWorkerThread.process(SessionManager.java:2106)
        at tigase.util.WorkerThread.run(WorkerThread.java:132)

??--à Some other problems of interest.. Key issues when adding Offline messages.??

??Here’s another issue I don’t believe is related to the JDBC reconnect problem.. HOWEVER.. it Is trying to inject Offline messages to a -1 nid which causes an out of range exception:??

2012-11-20 17:50:12  JDBCRepository.addDataList()        WARNING:  Error adding data list, user_id: user-id, subnode: offline/msgoffline, key: 
                                     messages, uid: 12924729, nid: -1, list: [
<message xml:lang="en_US" xmlns="jabber:client" type="ebisu-invite" to="user-to" from="user-from">
  <subject>71055;71055;2004552</subject>
  <body/>
  <delay xmlns="urn:xmpp:delay" from="chat-server" stamp="2012-11-20T17:49:36">Offline Storage</delay>
</message>]
com.mysql.jdbc.MysqlDataTruncation: Data truncation: Out of range value for column 'nid' at row 1
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3601)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
        at tigase.db.jdbc.JDBCRepository.addDataList(JDBCRepository.java:172)
        at tigase.db.jdbc.JDBCRepository.addDataList(JDBCRepository.java:204)
        at tigase.db.UserRepositoryMDImpl.addDataList(UserRepositoryMDImpl.java:75)
        at tigase.db.NonAuthUserRepositoryImpl.addOfflineDataList(NonAuthUserRepositoryImpl.java:124)
        at tigase.xmpp.impl.OfflineMessages$MsgRepositoryImpl.storeMessage(OfflineMessages.java:447)
        at com.ea.online.gcs.chat.plugin.EbisuOfflineMessages.savePacketForOffLineUser(EbisuOfflineMessages.java:92)
        at tigase.xmpp.impl.OfflineMessages.postProcess(OfflineMessages.java:130)
        at tigase.server.xmppsession.SessionManager.processPacket(SessionManager.java:1541)
        at tigase.cluster.SessionManagerClustered.processPacket(SessionManagerClustered.java:540)
        at tigase.server.AbstractMessageReceiver$QueueListener.run(AbstractMessageReceiver.java:1337)
2012-11-20 17:50:12  NonAuthUserRepositoryImpl.addOfflineDataList()  SEVERE: Problem accessing repository data.
tigase.db.TigaseDBException: Problem adding data list to repository
        at tigase.db.jdbc.JDBCRepository.addDataList(JDBCRepository.java:206)
        at tigase.db.UserRepositoryMDImpl.addDataList(UserRepositoryMDImpl.java:75)
        at tigase.db.NonAuthUserRepositoryImpl.addOfflineDataList(NonAuthUserRepositoryImpl.java:124)
        at tigase.xmpp.impl.OfflineMessages$MsgRepositoryImpl.storeMessage(OfflineMessages.java:447)
        at com.ea.online.gcs.chat.plugin.EbisuOfflineMessages.savePacketForOffLineUser(EbisuOfflineMessages.java:92)
        at tigase.xmpp.impl.OfflineMessages.postProcess(OfflineMessages.java:130)
        at tigase.server.xmppsession.SessionManager.processPacket(SessionManager.java:1541)
        at tigase.cluster.SessionManagerClustered.processPacket(SessionManagerClustered.java:540)
        at tigase.server.AbstractMessageReceiver$QueueListener.run(AbstractMessageReceiver.java:1337)
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Out of range value for column 'nid' at row 1
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3601)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2626)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
        at tigase.db.jdbc.JDBCRepository.addDataList(JDBCRepository.java:172)
        at tigase.db.jdbc.JDBCRepository.addDataList(JDBCRepository.java:204)
        at tigase.db.UserRepositoryMDImpl.addDataList(UserRepositoryMDImpl.java:75)
        at tigase.db.NonAuthUserRepositoryImpl.addOfflineDataList(NonAuthUserRepositoryImpl.java:124)
        at tigase.xmpp.impl.OfflineMessages$MsgRepositoryImpl.storeMessage(OfflineMessages.java:447)
        at com.ea.online.gcs.chat.plugin.EbisuOfflineMessages.savePacketForOffLineUser(EbisuOfflineMessages.java:92)
        at tigase.xmpp.impl.OfflineMessages.postProcess(OfflineMessages.java:130)
        at tigase.server.xmppsession.SessionManager.processPacket(SessionManager.java:1541)
        at tigase.cluster.SessionManagerClustered.processPacket(SessionManagerClustered.java:540)
        at tigase.server.AbstractMessageReceiver$QueueListener.run(AbstractMessageReceiver.java:1337)

??And an error on the get??

2012-11-21 20:14:49  DomainFilter.filter()               WARNING:  Can't access user repository.
tigase.db.TigaseDBException: Error getting user data for: user-id/null/allowed-domains
        at tigase.db.jdbc.JDBCRepository.getData(JDBCRepository.java:336)
        at tigase.db.UserRepositoryMDImpl.getData(UserRepositoryMDImpl.java:138)
        at tigase.xmpp.RepositoryAccess.getData(RepositoryAccess.java:262)
        at tigase.xmpp.impl.DomainFilter.getDomains(DomainFilter.java:397)
        at tigase.xmpp.impl.DomainFilter.filter(DomainFilter.java:126)
        at tigase.server.xmppsession.SessionManager.addOutPackets(SessionManager.java:831)
        at tigase.server.xmppsession.SessionManager$ProcessorWorkerThread.process(SessionManager.java:2106)
        at tigase.util.WorkerThread.run(WorkerThread.java:132)
Caused by: java.sql.SQLException: You are not owner of thread 109
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3603)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2620)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2570)
        at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:779)
        at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:622)
        at com.mysql.jdbc.StatementImpl$CancelTask$1.run(StatementImpl.java:106)
Artur Hefczyc commented 1 decade ago

My collegue Ryan discovered that data_repo.getPreparedStatement in the getUserUID method never returns on Windows if the database is down.. where as in

Linux this just returns a null.. which then causes the NPE at the synchronized statement.

This still does not sort Why our server was not reconnecting.

Artur Hefczyc commented 1 decade ago

Tested it with current master branch and Tigase seems to reconnect correctly every time MySQL is down and up. I run a test for several minutes with 20 users trying to login and disconnecting every 5 seconds. I stopped MySQL several times during the test and started it again and Tigase connected back to the DB every time without any delay.

So, this is either the problem which I cannot replicate or the new code is working differently from the version reported to have a problem.

For the nid = 1 problem I cannot find a solution. I do not know why this happens in the first place, it should not happen but it does. Hence I cannot provide a solution at this time.

issue 1 of 1
Type
Bug
Priority
Normal
Assignee
RedmineID
940
Version
tigase-server-5.2.0
Estimation
32h
Issue Votes (0)
Watchers (0)
Reference
tigase/_server/server-core#169
Please wait...
Page is in error, reload to recover