-
I've reviewed stored procedures and their SQL queries and I've found a suboptimal usage of indexes in one of the cases. I've modified source code of the stored procedures and committed them in a separate branch named
issue-84
. Those changes still need to be tested but initial tests confirmed that they improve performance of the data retrieval. -
Just got a notification from tigase.im:
[2025-01-24 02:52:04:022] [SEVERE ] [ in_2-message-archive ] StanzaProcessor.processPacket() : Error retrieving messages from database when processing from=sess-man@ip-172-31-22-51.us-west-2.compute.internal, to=message-archive@ip-172-31-22-51.us-west-2.compute.internal, serverAuthorisedStanzaFrom=Optional.empty, DATA=<iq type="set" xmlns="jabber:client" from="…@sure.im/monocles …" id="…"><query xmlns="urn:xmpp:mam:2" queryid="…"><x type="submit" xmlns="jabber:x:data"><field type="hidden" var="FORM_TYPE"><value>urn:xmpp:mam:2</value></field></x><set xmlns="http://jabber.org/protocol/rsm"><after>d7ae41ff-7f7f-4f0e-b5d6-524b890fc312</after><max>50</max></set></query></iq>, SIZE=398, XMLNS=jabber:client, PRIORITY=NORMAL, PERMISSION=AUTH, TYPE=set, STABLE_ID=null java.lang.RuntimeException: Error retrieving messages from database at tigase.xmpp.mam.modules.QueryModule.process(QueryModule.java:86) at tigase.component.modules.StanzaProcessor.execute(StanzaProcessor.java:219) at tigase.component.modules.StanzaProcessor.process(StanzaProcessor.java:206) at tigase.component.modules.StanzaProcessor.processPacket(StanzaProcessor.java:103) at tigase.component.AbstractKernelBasedComponent.processPacket(AbstractKernelBasedComponent.java:81) at tigase.server.AbstractMessageReceiver$QueueListener.run(AbstractMessageReceiver.java:1398) Caused by: tigase.db.TigaseDBException: Cound not retrieve items at tigase.archive.db.JDBCMessageArchiveRepository.queryItems(JDBCMessageArchiveRepository.java:212) at tigase.archive.db.JDBCMessageArchiveRepository.queryItems(JDBCMessageArchiveRepository.java:53) at jdk.internal.reflect.GeneratedMethodAccessor68.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.base/java.lang.reflect.Method.invoke(Unknown Source) at tigase.stats.StatisticsInvocationHandler.invoke(StatisticsInvocationHandler.java:75) at jdk.proxy2/jdk.proxy2.$Proxy61.queryItems(Unknown Source) at tigase.archive.db.MessageArchiveRepositoryPool.queryItems(MessageArchiveRepositoryPool.java:97) at tigase.archive.db.MessageArchiveRepositoryPool.queryItems(MessageArchiveRepositoryPool.java:45) at tigase.xmpp.mam.modules.QueryModule.process(QueryModule.java:84) ... 5 more Caused by: com.mysql.cj.jdbc.exceptions.MySQLTimeoutException: Statement cancelled due to timeout or client request at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:113) at com.mysql.cj.jdbc.StatementImpl.checkCancelTimeout(StatementImpl.java:2167) at com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:1069) at com.mysql.cj.NativeSession.execSQL(NativeSession.java:666) at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:930) at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1009) at com.mysql.cj.jdbc.CallableStatement.executeQuery(CallableStatement.java:891) at jdk.internal.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.base/java.lang.reflect.Method.invoke(Unknown Source) at tigase.db.jdbc.PreparedStatementInvocationHandler.invoke(PreparedStatementInvocationHandler.java:43) at jdk.proxy2/jdk.proxy2.$Proxy42.executeQuery(Unknown Source) at tigase.archive.db.JDBCMessageArchiveRepository.getItemsItems(JDBCMessageArchiveRepository.java:564) at tigase.archive.db.JDBCMessageArchiveRepository.queryItems(JDBCMessageArchiveRepository.java:209) ... 14 more
-
This could still be caused by total number of queries being run on the database. Current version running at tigase.im does not contain a fix for tigase/_server/tigase-push#65, so it is running "statistics" query every second from each cluster node. That with increased usage of tigase.im (usage peak) could cause this query to be slow (or timeout).
At this point I do not see anything more what we could do to improve the situation with optimization of SQL queries.
-
This could still be caused by total number of queries being run on the database. Current version running at tigase.im does not contain a fix for tigase/_server/tigase-push#65, so it is running "statistics" query every second from each cluster node. That with increased usage of tigase.im (usage peak) could cause this query to be slow (or timeout).
Should we update the installation just to make sure we are running latest version with all the fixes?
-
Again timeout:
[2025-02-06 12:28:04:448] [SEVERE ] [ in_5-message-archive ] StanzaProcessor.processPacket() : Error retrieving messages from database when processing from=sess-man@ip-172-31-0-99.us-west-2.compute.internal, to=message-archive@ip-172-31-0-99.us-west-2.compute.internal, serverAuthorisedStanzaFrom=Optional.empty, DATA=<iq xmlns="jabber:client" id="0tHuCT2cpqZI" from="…@tigase.im/blabber.im[3.1.4].z5P2" type="set"><query xmlns="urn:xmpp:mam:2" queryid="…"><x xmlns="jabber:x:data" type="submit"><field var="FORM_TYPE" type="hidden"><value>urn:xmpp:mam:2</value></field></x><set xmlns="http://jabber.org/protocol/rsm"><after>9c495ab8-cecd-4f5c-b8d6-a05e9b7f1301</after><max>50</max></set></query></iq>, SIZE=397, XMLNS=jabber:client, PRIORITY=NORMAL, PERMISSION=AUTH, TYPE=set, STABLE_ID=null java.lang.RuntimeException: Error retrieving messages from database at tigase.xmpp.mam.modules.QueryModule.process(QueryModule.java:86) at tigase.component.modules.StanzaProcessor.execute(StanzaProcessor.java:219) at tigase.component.modules.StanzaProcessor.process(StanzaProcessor.java:206) at tigase.component.modules.StanzaProcessor.processPacket(StanzaProcessor.java:103) at tigase.component.AbstractKernelBasedComponent.processPacket(AbstractKernelBasedComponent.java:81) at tigase.server.AbstractMessageReceiver$QueueListener.run(AbstractMessageReceiver.java:1398) Caused by: tigase.db.TigaseDBException: Cound not retrieve items at tigase.archive.db.JDBCMessageArchiveRepository.queryItems(JDBCMessageArchiveRepository.java:212) at tigase.archive.db.JDBCMessageArchiveRepository.queryItems(JDBCMessageArchiveRepository.java:53) at jdk.internal.reflect.GeneratedMethodAccessor74.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.base/java.lang.reflect.Method.invoke(Unknown Source) at tigase.stats.StatisticsInvocationHandler.invoke(StatisticsInvocationHandler.java:75) at jdk.proxy2/jdk.proxy2.$Proxy74.queryItems(Unknown Source) at tigase.archive.db.MessageArchiveRepositoryPool.queryItems(MessageArchiveRepositoryPool.java:97) at tigase.archive.db.MessageArchiveRepositoryPool.queryItems(MessageArchiveRepositoryPool.java:45) at tigase.xmpp.mam.modules.QueryModule.process(QueryModule.java:84) ... 5 more Caused by: com.mysql.cj.jdbc.exceptions.MySQLTimeoutException: Statement cancelled due to timeout or client request at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:113) at com.mysql.cj.jdbc.StatementImpl.checkCancelTimeout(StatementImpl.java:2167) at com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:1069) at com.mysql.cj.NativeSession.execSQL(NativeSession.java:666) at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:930) at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1009) at com.mysql.cj.jdbc.CallableStatement.executeQuery(CallableStatement.java:891) at jdk.internal.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.base/java.lang.reflect.Method.invoke(Unknown Source) at tigase.db.jdbc.PreparedStatementInvocationHandler.invoke(PreparedStatementInvocationHandler.java:43) at jdk.proxy2/jdk.proxy2.$Proxy43.executeQuery(Unknown Source) at tigase.archive.db.JDBCMessageArchiveRepository.getItemsCount(JDBCMessageArchiveRepository.java:618) at tigase.archive.db.JDBCMessageArchiveRepository.queryItems(JDBCMessageArchiveRepository.java:197) ... 14 more
-
In this case, I have no idea what we can do next. We have optimized current schema so it would be as performant as possible and use correct indexes.
The query that is now causing the issue is most likely related to fetching data/messages from the database based on offset, so it shouldn't be that problematic for MySQL unless it is required to fetch quite a lot of messages.
The only other solution that I see right now, would be a change of database schema to denormalize it and change primary keys to reduce indexes sizes, increase performance and maybe that would allow us to sort messages by "ids" instead of timestamps improving query performance as well. But that would be a bigger job and most likely for a "big" release.
-
-
Let's close it for now as it seems to be working well enough. I've created #tigase/_server/tigase-message-archiving#85 as a follwup.
Type |
Task
|
Priority |
Normal
|
Assignee | |
Version |
none
|
Sprints |
n/a
|
Customer |
n/a
|
-
tigase-server-8.5.0 Open
Fetching messages still causes timeouts on our installation when a lot of data to be retrieved. It would be good to review their performance and usage of indexes.