Implement clustered Map (#605)
Open
Andrzej Wójcik (Tigase) opened 9 years ago
Due Date
2016-05-13

While working on #1601 I realized that what I am implementing is, in fact, a kind of clustered Map. In this map, all key-value pairs are available on every cluster node: the map is mirrored on every node (in my case, on every node on which the user is logged in).

Looking at this, I think it might be a good idea to use EventBus to build such a clustered Map. We could then use it to store data in the user session (XMPPSession) and keep that data in sync between cluster nodes. Since this would be transparent to the developer using the implementation, it would be well suited for use in XMPPProcessor implementations.
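
To make the mirroring idea concrete, here is a minimal in-process sketch. It is not Tigase code: `LocalBus` and `Replica` are hypothetical names, and the bus is a plain list of replicas standing in for the distributed EventBus, with each replica playing the role of one cluster node.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for the distributed event bus: it forwards each
// change made on one replica to every other registered replica.
final class LocalBus {
    private final List<Replica> replicas = new CopyOnWriteArrayList<>();

    // Creates a replica (one "cluster node") and registers it on the bus.
    Replica newReplica() {
        Replica replica = new Replica(this);
        replicas.add(replica);
        return replica;
    }

    // Delivers a put to all replicas except the one that originated it.
    void publishPut(Replica origin, String key, String value) {
        for (Replica replica : replicas) {
            if (replica != origin) {
                replica.applyRemotePut(key, value);
            }
        }
    }
}

// Hypothetical mirrored map: local writes are stored and then broadcast.
final class Replica {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    private final LocalBus bus;

    Replica(LocalBus bus) {
        this.bus = bus;
    }

    // Local write: update own copy, then notify the other replicas.
    void put(String key, String value) {
        data.put(key, value);
        bus.publishPut(this, key, value);
    }

    // Remote write: update own copy only, without re-broadcasting.
    void applyRemotePut(String key, String value) {
        data.put(key, value);
    }

    String get(String key) {
        return data.get(key);
    }
}
```

After `a.put("k", "v")` on one replica, `b.get("k")` on another replica of the same bus returns the same value, which is the behaviour described above for cluster nodes.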

Artur Hefczyc commented 9 years ago

This is something I was thinking about too. I think this could be very useful and could simplify clustering implementation in such a way that it is transparent to plugin APIs as we could have the session and common session data distributed over all cluster nodes if needed.

However, my concern is that it could significantly increase network traffic between cluster nodes. So I am in favor of implementing it, but I would be very cautious about using it. We will need to make sure the use of this Map is under strict control.

Bartosz Małkowski commented 9 years ago

A Map must be created with:

java.util.Map<String, String> map = ClusterMapFactory.get().createMap("type", String.class, String.class, "1", "2", "3");

Where "type" is a usage identifier. It should be used to tell other nodes how to treat the event carrying the newly created map: whether it should be stored in a user session or somewhere else.

Map key class and Map value class -- used for type conversion

Array of strings -- parameters, for example the ID of a user session

A Map must be destroyed when it is no longer used:

ClusterMapFactory.get().destroyMap(map);

The Map will be destroyed on each cluster node.

Events:

To catch the event fired when a map is created on another cluster node:

		eventBus.addHandler(MapCreatedEventHandler.MapCreatedEvent.class, new MapCreatedEventHandler() {
			@Override
			public void onMapCreated(Map map, String type, String... parameters) {
			}
		});

To catch the event fired when a map is destroyed on another cluster node:

		eventBus.addHandler(MapDestroyedEventHandler.MapDestroyedEvent.class, new MapDestroyedEventHandler() {
			@Override
			public void onMapDestroyed(Map mapX, String type) {
			}
		});

If you need any improvements, just tell me.

Artur Hefczyc commented 9 years ago

I have questions though.

Let's say we have 100 ClusterMaps on our server used by different components in different places.

  1. How would the handler know which Map was created or destroyed?

  2. Do you use high-priority Packets in Tigase and the high-priority cluster communication queues for Map updates in the cluster?

  3. What are the consequences if a record that was modified later is updated sooner on the cluster due to traffic? Do you have timestamps on the Map record updates?

  4. What is the impact of ClusteredMap usage on overall cluster traffic, in terms of the number of packets and their size?

Artur Hefczyc commented 9 years ago

Another request to Bartosz about the implementation: write tests for it and run them under load. Ideally, testing this functionality should be part of the cluster load tests which Eric runs periodically. Contact Eric and discuss with him the best way to add your cluster map tests to the cluster load tests.

What we need to know is:

  1. What is the min/max/average time of a Map record update during load tests?

  2. Do we lose any packets/data under high load?

  3. Does it happen that a record modified a few seconds later than another record is updated on the cluster sooner?
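
As a rough illustration of how the first figure could be collected, the sketch below accumulates update latencies with the JDK's `LongSummaryStatistics`. The class `UpdateLatencyRecorder` and the idea of passing send and apply times in milliseconds are assumptions made for this example; the thread does not specify how the load tests measure this, and comparing times taken on different nodes assumes their clocks are synchronized.

```java
import java.util.LongSummaryStatistics;

// Hypothetical helper for a load test: collects how long each Map record
// update took to reach a node, then reports min/max/average.
final class UpdateLatencyRecorder {
    private final LongSummaryStatistics stats = new LongSummaryStatistics();

    // Records one update; both arguments are timestamps in milliseconds.
    synchronized void record(long sentAtMillis, long appliedAtMillis) {
        stats.accept(appliedAtMillis - sentAtMillis);
    }

    synchronized long min() {
        return stats.getMin();
    }

    synchronized long max() {
        return stats.getMax();
    }

    synchronized double average() {
        return stats.getAverage();
    }

    synchronized long count() {
        return stats.getCount();
    }
}
```

Comparing `count()` on the receiving node with the number of updates sent would also give a first indication of lost updates, which relates to the second question.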

Artur Hefczyc commented 9 years ago

%daniel please incorporate a description of this new API into our development documentation.

Bartosz Małkowski commented 9 years ago

How would the handler know which Map was created or destroyed?

I don't understand your question.

Clustered Map uses the Distributed EventBus to inform other instances about changes.

Each instance of Map has a UID and internal listeners; that's why a new instance must be created via the Factory.

Do you use high-priority Packets in Tigase and the high-priority cluster communication queues for Map updates in the cluster?

I use the Distributed EventBus. DEB uses the default priority, but that is easy to change.

What are the consequences if a record that was modified later is updated sooner on the cluster due to traffic? Do you have timestamps on the Map record updates?

I don't use timestamps. It is a simple ConcurrentMap inside, so the last received data change event sets the value. We can make a better implementation later (the clustered map is hidden behind java.util.Map, so we have to create our own Map implementation from scratch, because it is impossible to extend implementations from JRE).

What is the impact of ClusteredMap usage on overall cluster traffic, in terms of the number of packets and their size?

Clustered Map uses the Distributed EventBus. Each data change causes an event to be sent to each listener (each cluster node).

This is sample event (it adds two items at once):

<ElementAdd xmlns="tigase:clustered:map">
  <uid>1-2-3</uid>
  <item>
    <key>xKEY</key>
    <value>xVALUE</value>
  </item>
  <item>
    <key>yKEY</key>
    <value>yVALUE</value>
  </item>
</ElementAdd>

I can make element names shorter.
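
To get a feel for the packet size, the sketch below builds an event with the same element names as the sample above and reports its size in UTF-8 bytes. It is not the ClusterMap serializer: `ElementAddSketch` is a hypothetical helper using plain string building, it emits the XML without whitespace, and the real packet on the wire will be larger because of whatever the EventBus and cluster connection wrap around the event.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper that mimics the shape of the sample ElementAdd event
// so that the payload size per update can be estimated.
final class ElementAddSketch {

    // Minimal escaping for XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds a compact ElementAdd event for the given map UID and items.
    static String build(String uid, Map<String, String> items) {
        StringBuilder xml = new StringBuilder();
        xml.append("<ElementAdd xmlns=\"tigase:clustered:map\">");
        xml.append("<uid>").append(escape(uid)).append("</uid>");
        for (Map.Entry<String, String> item : items.entrySet()) {
            xml.append("<item>");
            xml.append("<key>").append(escape(item.getKey())).append("</key>");
            xml.append("<value>").append(escape(item.getValue())).append("</value>");
            xml.append("</item>");
        }
        xml.append("</ElementAdd>");
        return xml.toString();
    }

    // Size of the serialized event in bytes when encoded as UTF-8.
    static int sizeInBytes(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

Since every change is sent to every cluster node, the traffic generated by one update is roughly this size multiplied by the number of other nodes.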

Artur Hefczyc commented 9 years ago

Bartosz Malkowski wrote:

How would the handler know which Map was created or destroyed?

I don't understand your question.

Clustered Map uses the Distributed EventBus to inform other instances about changes.

Each instance of Map has a UID and internal listeners; that's why a new instance must be created via the Factory.

What I mean is:

  1. Let's say we want to use ClusteredMap for some data in MUC and another ClusteredMap for different data in SM.

  2. We have 3 cluster nodes

  3. A ClusteredMap for MUC is created on a cluster node.

  4. A ClusteredMap for SM is created on a cluster node.

How do MUC/SM on the other cluster nodes know which ClusteredMap is which when they get the event that it was created?

Do you use high-priority Packets in Tigase and the high-priority cluster communication queues for Map updates in the cluster?

I use the Distributed EventBus. DEB uses the default priority, but that is easy to change.

I think it is very important to use high-priority packets for ClusteredMap events. But do not use high-priority packets for all EventBus events. You have to select which events are critical and which are less important.

What are the consequences if a record that was modified later is updated sooner on the cluster due to traffic? Do you have timestamps on the Map record updates?

I don't use timestamps. It is a simple ConcurrentMap inside, so the last received data change event sets the value. We can make a better implementation later (the clustered map is hidden behind java.util.Map, so we have to create our own Map implementation from scratch, because it is impossible to extend implementations from JRE).

We do not have to implement our own Map. I am not sure whether we can have a custom Map.Entry class which contains a timestamp. If not, we can have a custom key which contains the user's key and a timestamp. But it is not critical to have it right now; you can add it in the next version. Please create a task to add this functionality in the next version.
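
One possible shape for the timestamp idea is sketched below. Note that it is a variant of what is proposed above: it attaches the timestamp to the value rather than to the key or to Map.Entry, and all names (`TimestampedValue`, `LastWriterWinsStore`) are hypothetical, not part of the ClusterMap code. It also assumes the timestamps of different nodes are comparable, that is, that the node clocks are reasonably synchronized.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical wrapper: a value together with the time it was modified.
final class TimestampedValue<V> {
    final V value;
    final long timestamp;

    TimestampedValue(V value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }
}

// Hypothetical store that ignores updates older than what it already holds,
// so a late-arriving stale event cannot overwrite newer data.
final class LastWriterWinsStore<K, V> {
    private final ConcurrentMap<K, TimestampedValue<V>> data = new ConcurrentHashMap<>();

    // Applies an update; returns true if it was accepted, false if stale.
    boolean applyUpdate(K key, V value, long timestamp) {
        TimestampedValue<V> incoming = new TimestampedValue<>(value, timestamp);
        // merge() is atomic per key: keep whichever entry has the newer
        // timestamp; on equal timestamps the existing entry is kept.
        TimestampedValue<V> winner = data.merge(key, incoming,
                (current, candidate) ->
                        candidate.timestamp > current.timestamp ? candidate : current);
        return winner == incoming;
    }

    V get(K key) {
        TimestampedValue<V> entry = data.get(key);
        return entry == null ? null : entry.value;
    }
}
```

With this in place, an update stamped 100 that arrives after an update stamped 200 for the same key is rejected instead of replacing the newer value.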

What is the impact of ClusteredMap usage on overall cluster traffic, in terms of the number of packets and their size?

Clustered Map uses the Distributed EventBus. Each data change causes an event to be sent to each listener (each cluster node).

This is sample event (it adds two items at once):

[...]

I can make element names shorter.

No, we do not need to make it shorter. We just need to know.

Bartosz Małkowski commented 9 years ago

What I mean is:

Let's say we want to use ClusteredMap for some data in MUC and another ClusteredMap for different data in SM.

We have 3 cluster nodes

A ClusteredMap for MUC is created on a cluster node,

A ClusteredMap for SM is created on a cluster node.

How do MUC/SM on the other cluster nodes know which ClusteredMap is which when they get the event that it was created?

As I said earlier:

java.util.Map<String, String> map = ClusterMapFactory.get().createMap("type", String.class, String.class, "1", "2", "3");

For example:

ClusterMapFactory.get().createMap("Very_Important_Map_In_User_Session", JID.class, Boolean.class, "user-session-identifier-123");

will fire a MapCreatedEvent on the other cluster nodes, with the strings "Very_Important_Map_In_User_Session" and "user-session-identifier-123" passed as parameters to the onMapCreated() method.

The event consumer code must know what to do with a map of type "Very_Important_Map_In_User_Session". It may get the user session "user-session-identifier-123" and put the map in that session.

Or room.

Or something else.
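
The routing described here can be sketched as a small dispatcher keyed by the type string. This is only an illustration under assumptions: `MapCreatedCallback` and `MapTypeDispatcher` are hypothetical names that mirror the shape of onMapCreated() shown earlier, and the sketch does not use the real EventBus or ClusterMapFactory classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical callback with the same parameters as onMapCreated().
interface MapCreatedCallback {
    void onMapCreated(Map<?, ?> map, String type, String... parameters);
}

// Hypothetical dispatcher: each component (MUC, SM, ...) registers for the
// type string it owns and only sees the maps created with that type.
final class MapTypeDispatcher {
    private final Map<String, MapCreatedCallback> callbacks = new ConcurrentHashMap<>();

    void register(String type, MapCreatedCallback callback) {
        callbacks.put(type, callback);
    }

    // Routes a "map created" notification; returns false for unknown types.
    boolean dispatch(Map<?, ?> map, String type, String... parameters) {
        MapCreatedCallback callback = callbacks.get(type);
        if (callback == null) {
            return false;
        }
        callback.onMapCreated(map, type, parameters);
        return true;
    }
}
```

A session manager could register for "Very_Important_Map_In_User_Session" and use the first parameter to look up the session in which to store the map, while MUC registers for its own type and looks up a room instead.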

Artur Hefczyc commented 9 years ago

Ok, now I understand. The sample code is very helpful. In your API description the "type" was kind of misleading as it suggested some sort of Map type, whereas this is in fact the Map ID.

Daniel, please add the description to our development guide.

Daniel Wisnewski commented 9 years ago
Bartosz Małkowski commented 9 years ago

I changed two code samples, because API of EventBus changed a bit.

Looks good except for one thing:

I don't understand section "Map Changes".

The developer does not need to handle AddValue, RemoveValue, or other events with xmlns="tigase:clustered:map". This is done by the ClusterMap code.

The developer is only interested in the MapCreatedEvent.class and MapDestroyedEvent.class events, because they inform him that a map was created or destroyed and that he should do something with it.

Daniel Wisnewski commented 9 years ago

I figured it would be good to show how the change mechanism operates. However, if MapCreatedEvent fires every time the ClusterMap changes, and the whole map is rebuilt, then I can see where you are coming from. I'll remove the map changes section and upload the new document for now.

Type
New Feature
Priority
Blocker
Assignee
RedmineID
3713
Spent time
122h 30m
Reference
tigase/_server/server-core#605