Make sure we can get a list of pubsub nodes for a subscriber (#12)

Artur Hefczyc opened 1 decade ago

Let's say we have 30 million PubSub nodes and we want a list of the PubSub nodes for a given subscriber. We assume that a single subscriber has only a few (at most 100) PubSub nodes, so we do not worry about loading 30 million nodes; the problem is just searching the DB in an efficient way.

Can we do this efficiently? Given the way Tigase stores subscribers right now it might be problematic.

Andrzej, could you please review this problem and decide whether we need to do something and who should work on this if needed?

Activities

Andrzej Wójcik (Tigase) commented 1 decade ago
Now we have two different implementations used to store nodes, subscriptions and affiliations in database.

In PubSubDAO, the subscriptions for each node are stored separately in a single SQL field per node, so to load all subscriptions of a user we are currently forced to load the subscriptions for all nodes and parse them to find the nodes with a subscription for that particular user.

In PubSubDAOJDBC, subscriptions are stored for each node and for each subscriber in a separate SQL record, so we would be able to search for all nodes subscribed to by a user fast enough using the current database schema.

Unfortunately, we use common code to search in PubSubDAO and PubSubDAOJDBC, which forces us to load the subscriptions first and only then search in them. Because of that, I think it may be hard to search 30 million nodes for the nodes subscribed to by a user. To improve performance we would need to either improve the database API and use only PubSubDAOJDBC, or change the schema used by PubSubDAO and also improve the database API.
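To make the difference between the two storage layouts concrete, here is a minimal sketch using an in-memory SQLite database. The table names (`node_blob`, `node_sub`), the `;`-separated serialization and the sample JIDs are assumptions made up for illustration; they are not the actual Tigase schema.

```python
import sqlite3


def build_demo_db():
    """Create both hypothetical layouts side by side and fill them with sample data."""
    conn = sqlite3.connect(":memory:")
    # Layout A (hypothetical): one row per node, all subscribers serialized in one field.
    conn.execute("CREATE TABLE node_blob (node TEXT PRIMARY KEY, subscriptions TEXT)")
    # Layout B (hypothetical): one row per (node, subscriber) pair, indexed by subscriber.
    conn.execute("CREATE TABLE node_sub (node TEXT, jid TEXT, PRIMARY KEY (node, jid))")
    conn.execute("CREATE INDEX node_sub_jid ON node_sub (jid)")
    data = {
        "news": ["alice@example.com", "bob@example.com"],
        "sports": ["bob@example.com"],
        "weather": ["carol@example.com"],
    }
    for node, jids in data.items():
        conn.execute("INSERT INTO node_blob VALUES (?, ?)", (node, ";".join(jids)))
        conn.executemany("INSERT INTO node_sub VALUES (?, ?)", [(node, j) for j in jids])
    return conn


def nodes_from_blobs(conn, jid):
    """Layout A: every node row must be loaded and parsed in application code."""
    found = []
    for node, blob in conn.execute("SELECT node, subscriptions FROM node_blob"):
        if jid in blob.split(";"):
            found.append(node)
    return sorted(found)


def nodes_from_rows(conn, jid):
    """Layout B: the database answers the question directly via the jid index."""
    rows = conn.execute("SELECT node FROM node_sub WHERE jid = ? ORDER BY node", (jid,))
    return [row[0] for row in rows]
```

With layout A the cost grows with the total number of nodes, while with layout B it grows only with the number of nodes the subscriber actually has, which is the property the issue asks for.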

I think that anyone from our team can work on this issue but I suppose that we should also consider who will work on improvements in database API we considered some time ago.
Artur Hefczyc commented 1 decade ago

Ok, please start working on this.
Andrzej Wójcik (Tigase) commented 1 decade ago

I redesigned the database schema for the PubSub component. The new schema allows us to query for affiliations and subscriptions directly at the database level, which will improve the performance of the described features. This schema change will be part of Tigase PubSub Component 3.0.0.
Artur Hefczyc commented 1 decade ago
We need some tests to see the performance improvement. Could you please run some tests for the most typical PubSub use cases to see the performance difference? Ideally we would like to have a comparison between all 3 PubSub data storages. If we populate the DB with test data over XMPP, we could use exactly the same tool and tests for all 3 DBs. Even better, we could compare Tigase PubSub to other PubSub systems out there. Actually, the time necessary to load the DB over XMPP could also be measured and should be part of the test.

Eric can help you with setting up systems with Tigase in all 3 different configurations (and maybe other XMPP servers as well: ejabberd, openfire, prosody).

Then we could run tests:

100M PubSub nodes with up to 100 subscribers

10M nodes with up to 1k subscribers

1M PubSub nodes with 10k subscribers (if technically possible)

100k PubSub nodes with 10M subscribers
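As a rough sizing aid for the data sets listed above, the following sketch computes the upper bound on subscription records for each scenario. It assumes one record per (node, subscriber) pair and that every node carries the maximum number of subscribers; both are assumptions for illustration, and real data would likely be smaller.

```python
# (nodes, max subscribers per node) for each scenario listed above.
SCENARIOS = {
    "100M nodes x 100 subscribers": (100_000_000, 100),
    "10M nodes x 1k subscribers": (10_000_000, 1_000),
    "1M nodes x 10k subscribers": (1_000_000, 10_000),
    "100k nodes x 10M subscribers": (100_000, 10_000_000),
}


def max_subscription_rows(nodes, subscribers_per_node):
    """Upper bound on (node, subscriber) records if every node is fully subscribed."""
    return nodes * subscribers_per_node


if __name__ == "__main__":
    for name, (nodes, subs) in SCENARIOS.items():
        print(f"{name}: up to {max_subscription_rows(nodes, subs):.0e} records")
```

Under these assumptions the first three scenarios have the same upper bound, while the last one is two orders of magnitude larger, which may matter when deciding what is technically feasible to load.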

Tests:

We are interested here in DB access performance, so please plan a few tests which focus on DB performance rather than on generating high traffic:

querying nodes for a user,

querying subscribers for a node

getting a list of N recent publications

getting a list of publications since a given timestamp (the new API just added by Bartek)

publishing N new publications per second (how much can we get for each DB)

Adding/removing a subscriber, publisher

Anything else which comes to your mind (Bartek, Wojciech any suggestions???)

Andrzej - I also have one additional question which always worries me when I think of high PubSub usage. What would happen if you post a new publication to a node with 10M subscribers? Will this overload Tigase? I remember trying to add some logic to PubSub which would monitor memory usage and slowly send notifications to all subscribers. I do not remember if this code is in our current implementation, but please think about this. Do we have anything which prevents OOM in such a case?
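One way to picture the kind of protection asked about here is to deliver notifications in bounded batches instead of materializing all 10M notifications at once. The sketch below is only an illustration of that idea: `batched`, `notify_in_batches`, the `send` callback and the batch size are hypothetical names and values, not existing Tigase code, and a real implementation would also have to pause between batches based on memory or queue pressure.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items, pulling lazily from the source."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def notify_in_batches(subscribers, send, batch_size=1000):
    """Send one notification per subscriber, holding only one batch in memory.

    Returns (total sent, largest batch held at once).
    """
    sent = 0
    largest_batch = 0
    for chunk in batched(subscribers, batch_size):
        largest_batch = max(largest_batch, len(chunk))
        for jid in chunk:
            send(jid)  # hypothetical delivery callback
        sent += len(chunk)
        # A real implementation could wait here until outgoing queues drain.
    return sent, largest_batch
```

If the subscriber list is read lazily (for example from a database cursor), memory use is bounded by the batch size rather than by the number of subscribers.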
Andrzej Wójcik (Tigase) commented 1 decade ago

I will start with the last question. I haven't seen any code that would prevent an OOM in the case of 10M subscribers or anything like that, so it is possible that this could overload Tigase.

As for the tests, do we have a mechanism which should or could be used for them? TTS? We should test not only database performance, as the slowness of the PubSub component was caused by a suboptimal database schema and a lot of synchronization inside the DAO mechanisms (by default a single connection to the DB and a single writing thread for storing node configuration). Both things were improved in the same task, as they were part of the database access layer that had to be improved to create a better database schema. So I think we should test concurrent access - high traffic?
Artur Hefczyc commented 1 decade ago

Andrzej Wójcik wrote:

I will start with the last question. I haven't seen any code that would prevent an OOM in the case of 10M subscribers or anything like that, so it is possible that this could overload Tigase.

Ok, this is something to work on as well. I have created a new task for this: #1694.

As for the tests, do we have a mechanism which should or could be used for them? TTS?

TTS is not suitable for performance tests, unfortunately. We could use either Tsung or write a simple command-line client in Java or Groovy using JaXMPP2. With our own code we have more control over what we do and how we do it. Tsung makes it easier to run large, distributed load tests, but this is probably not what we want to do here.

There is also code built into Tigase for testing PubSub, so you actually do not need any external tool or user connections to put some load on PubSub: PubSubTestsTask

If you know how to use it, it might be good enough.

In order to activate the task you have to configure the StanzaReceiver component, then connect to the server with Psi, browse service discovery for the component and create a new task. Pick PubSubTest task from the list and that's it.

We should test not only database performance, as the slowness of the PubSub component was caused by a suboptimal database schema and a lot of synchronization inside the DAO mechanisms (by default a single connection to the DB and a single writing thread for storing node configuration). Both things were improved in the same task, as they were part of the database access layer that had to be improved to create a better database schema. So I think we should test concurrent access - high traffic?

I agree, this makes sense. But we are probably not interested in hundreds of thousands of user connections; 100 or 1,000 users would be good enough. Actually, we do not need users at all. It could be a single user accessing multiple nodes at the same time. What I mean is that the user simulator software could be a very simple thing.
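A very simple simulator of the kind described here could look like the sketch below: one client running the same operation against many nodes concurrently and recording the latency of each call. The `run_load` helper, the `operation` callback and the worker count are hypothetical; in a real test the callback would send an XMPP request (for example, a query for a node's subscribers) instead of calling a local function.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(operation, node_names, workers=8):
    """Call operation(node) for every node using a thread pool.

    Returns a dict mapping node name to elapsed seconds for that call.
    """
    def timed(node):
        start = time.perf_counter()
        operation(node)  # hypothetical per-node request
        return node, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(timed, node_names))


if __name__ == "__main__":
    # Stand-in operation that only sleeps briefly; replace with a real request.
    timings = run_load(lambda node: time.sleep(0.01), [f"node{i}" for i in range(20)])
    print(f"slowest call: {max(timings.values()):.3f}s")
```

Because the same harness can be pointed at any backend, the measured latencies would be comparable across the different data storages.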

Type	New Feature
Priority	Normal
Assignee	Andrzej Wójcik (Tigase)
RedmineID	1667

Reference

tigase/_server/tigase-pubsub#12