Messages getting lost when StreamManagement resumption is used (#685)
Andrzej Wójcik (Tigase) opened 8 years ago
Due Date: 2016-07-05

As reported in a comment on task #4142:

Consider this scenario:

  1. Alice sends a message to Bob's full JID
  2. But Bob has a broken connection and he never receives the message (or he does, but the server doesn't get the ack, which amounts to the same thing)
  3. Bob notices the broken connection (the server still has not) and reconnects
  4. Server disconnects the old client for conflict (because Bob is using the same resource)
  5. At this point, the message is marked with a delivery-error and stamped with time at point 5 and begins its redelivery process
  6. Message is stored to offline storage (including the delivery-error element) because no connection is available to handle it yet
  7. Bob sends the new available presence for the connection started at 4 and server redelivers messages from offline storage
  8. C2SDeliveryErrorProcessor sees that the timestamp of the delivery error is 5 while the only available connection was started at 4, and therefore discards the message

At point 9, C2SDeliveryErrorProcessor returned true even though no message was delivered. The message is now lost.
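
The timestamp comparison in this scenario can be sketched as below. This is only an illustration of the reported behaviour, not the actual C2SDeliveryErrorProcessor code: the class name, the method name and the rule "deliver only to connections started after the delivery-error stamp" are assumptions read off the steps above, and the instants are arbitrary.

```java
import java.time.Instant;

public class DeliveryErrorCheckSketch {

    // Hypothetical rule inferred from the report: a connection is eligible
    // only if it was started after the delivery-error stamp was taken.
    public static boolean eligibleForRedelivery(Instant deliveryErrorStamp,
                                                Instant connectionStartedAt) {
        return connectionStartedAt.isAfter(deliveryErrorStamp);
    }

    public static void main(String[] args) {
        Instant base = Instant.EPOCH;                  // arbitrary reference time
        Instant firstAttempt = base.plusSeconds(1);    // message first sent to Bob
        Instant newConnection = base.plusSeconds(4);   // Bob's new connection
        Instant conflictTime = base.plusSeconds(5);    // old session closed for conflict

        // Stamp taken at conflict time: the new connection looks too old.
        System.out.println(eligibleForRedelivery(conflictTime, newConnection)); // false

        // Stamp taken at the first delivery attempt: the new connection qualifies.
        System.out.println(eligibleForRedelivery(firstAttempt, newConnection)); // true
    }
}
```

Under this assumed rule the message is lost only if the stamp is taken later than the start of the replacement connection.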

Andrzej Wójcik (Tigase) commented 8 years ago

I tried to reproduce the message loss on a resource conflict, but could not do so using a single server. Delivery always worked properly.

During these tests I found that:

  • messages are stamped with a delay element carrying the time at which the server detected the broken stream

This suggests that a message could have reported an incorrect timestamp.

  • the delivery-error element of a message contains the correct stamp: the time of the first attempt to send the message to the client connection

So the delivery problem cannot occur in the way the report suggests (messages are stamped with the time from point 3, not point 5).

  • after a conflict (and after any stream closed by the server), messages were redelivered only after the resumption timeout had passed

This could be related to the delivery issue. When the session is closed (as in the case of a conflict), packets should be redelivered without delay.
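
The intended difference in redelivery delay can be sketched as follows. The enum, the method and the 60-second timeout are hypothetical and are not taken from the Tigase code base. The sketch only shows the decision: wait for the resumption timeout when the stream merely broke, redeliver at once when the server itself closed the session.

```java
import java.time.Duration;

public class RedeliveryDelaySketch {

    // Hypothetical reasons for a stream ending.
    public enum StreamEnd { CONNECTION_LOST, CLOSED_BY_SERVER }

    // A lost connection may still be resumed, so unacknowledged packets wait
    // for the resumption timeout. A session closed by the server (for example
    // on conflict) cannot be resumed, so there is nothing to wait for.
    public static Duration redeliveryDelay(StreamEnd end, Duration resumptionTimeout) {
        return end == StreamEnd.CLOSED_BY_SERVER ? Duration.ZERO : resumptionTimeout;
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofSeconds(60); // assumed value, for illustration only
        System.out.println(redeliveryDelay(StreamEnd.CONNECTION_LOST, timeout));
        System.out.println(redeliveryDelay(StreamEnd.CLOSED_BY_SERVER, timeout));
    }
}
```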

The following changes have been made:

  • fixed timestamping of messages in the delay element

  • forced redelivery without timeout when the session is closed (e.g. due to a conflict)

  • removed the delivery-error element when saving to the offline store (this should not change anything, but I do not think it needs to be stored there)
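
The last change can be illustrated with the JDK's DOM API rather than Tigase's own XML classes. The class and method names, the sample stanza and the `stamp` attribute are assumptions made for the example; only the idea of dropping delivery-error children before the message is stored comes from the list above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class OfflineStoreSketch {

    // Removes every direct <delivery-error/> child and returns how many were removed.
    public static int stripDeliveryError(Element stanza) {
        int removed = 0;
        Node child = stanza.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before detaching
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "delivery-error".equals(child.getNodeName())) {
                stanza.removeChild(child);
                removed++;
            }
            child = next;
        }
        return removed;
    }

    // Parses a stanza string into a DOM element (parse failures become unchecked).
    public static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("cannot parse stanza", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative stanza; attribute names and values are made up.
        Element message = parse(
                "<message to='bob@example.com/phone'>"
                + "<body>hi</body><delivery-error stamp='5'/></message>");
        System.out.println(stripDeliveryError(message)); // 1
        System.out.println(message.getElementsByTagName("delivery-error").getLength()); // 0
    }
}
```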

Type: Bug
Priority: Normal
Assignee:
RedmineID: 4262
Version: tigase-server-7.1.0
Spent time: 48h 45m
Reference: tigase/_server/server-core#685