Projects tigase _server tigase-muc Issues #81
New optimized store for MUC room configuration (#81)
Closed
Andrzej Wójcik (Tigase) opened 9 years ago
Due Date
2016-10-30

Right now MUC component uses UserRepository implementation to keep most of informations about rooms.

Due to that single room read from database results in more than one request to database.

We should create alternative MucDAO implementation which would use plain access to database to read all needed data in one request if possible.

This means we would need to create new tables, stored procedures, schema for databases, etc.

Andrzej Wójcik (Tigase) commented 8 years ago

While working on this changes we need to take care of proper JID comparison - see #4655

Andrzej Wójcik (Tigase) commented 8 years ago

New schema is created for every supported database and in task #3995 for MongoDB.

Schema files will port MUC history to new table and rename old one to @muc_history_old@.

Separate class tigase.muc.repository.migration.Converter is executable class within MUC jar and is responsible for moving rooms configuration from old to new store.

To make it easier there are 2 files in database directory (@muc-db-migrate.sh@ and @muc-db-migrate.cmd@) which should run conversion class.

There are 2 important changes in this commit:

  • now MUC room configuration is kept in same database as MUC rooms history - so that data will be kept in one place

  • it is possible to configure separate storages for some domains

ie. keep rooms configurations and history for muc.example1.com in database example1 and for muc.example2.com in database example2

wojciech.kapcia@tigase.net commented 8 years ago

I've checked the changes and they look good. A couple of comments as usual :)

  • what is the purpose of event_type int@? It seems to be ignored and set to @1 either way?

  • in converter public void convert() throws RepositoryException you are destroying room and it's history after conversion:

oldRepo.destroyRoom(roomJid);
  • I think that when we deprecate something we should include information what is the replacement?

  • schema files are not included in resulting jar (and probably tigase-server needs to be updated as well, probably same solution as in tigase-pubsub case);

  • upgrade path should be documented (at first at the least in tigase-server, but later on we are probably going to extract component-specific parts of the documentation to relevant projects and then pull them and inline when building tigase-server package - Dan is working on this)

And one more, a bit separate:

  • it just occurred to me (rather general remark), that maybe we should also include a documentation very short description of the schema? We are moving towards per-component schemas, often with multiple tables and not always using prefixes, so it may be prudent to have something simple like: this component is using table x,y,z; table x stores user data, table y stores... - %kobit - comments?
Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

I've checked the changes and they look good. A couple of comments as usual :)

  • what is the purpose of event_type int@? It seems to be ignored and set to @1 either way?

I have no idea. It was there so I assumed that it may be used by some extension so I left it as it is. Maybe it was to store info about join/left with @2@?

%bmalkow - Do you have any idea why it is there?

  • in converter public void convert() throws RepositoryException you are destroying room and it's history after conversion:

[...] Maybe it would be better to make it optional (just in case to avoid situations where there could be some data lost?); there is no converter test case so maybe just in case?

I tested converted and worked fine, however I considered leaving it as not a good idea, as data is stored in UserRepo which will make it difficult to clean up later on.

Additionally removing data from UserRepo make it possible to resume conversion if it crashes/failes as I remove data from UserRepo after proper entries in new repository are created. So restart after crash will allow to continue this operation from point where it crashed.

  • I think that when we deprecate something we should include information what is the replacement?

I agree. I deprecated few interfaces, ie. Repository to make it easier to identify and later remove. I think we should add this informations for sure.

  • schema files are not included in resulting jar (and probably tigase-server needs to be updated as well, probably same solution as in tigase-pubsub case);

This is starting to be an issue. We need to implement this automatic update for DB schema mechanism.

  • upgrade path should be documented (at first at the least in tigase-server, but later on we are probably going to extract component-specific parts of the documentation to relevant projects and then pull them and inline when building tigase-server package - Dan is working on this)

You mean description what needs to be updated between versions, etc?

I agree, however having this automatic update mechanism with incremental SQL file with timestamp would make it easier and it could verify schema during startup and warn user what is missing and how to fix it.

And one more, a bit separate:

  • it just occurred to me (rather general remark), that maybe we should also include a documentation very short description of the schema? We are moving towards per-component schemas, often with multiple tables and not always using prefixes, so it may be prudent to have something simple like: this component is using table x,y,z; table x stores user data, table y stores... - %kobit - comments?

There was simiar documentation for PubSub at wiki if I'm correct. I think it may be useful for people. But we should do this for databases and for Mongo as well.

wojciech.kapcia@tigase.net commented 8 years ago

Andrzej Wójcik wrote:

  • in converter public void convert() throws RepositoryException you are destroying room and it's history after conversion:

[...] Maybe it would be better to make it optional (just in case to avoid situations where there could be some data lost?); there is no converter test case so maybe just in case?

I tested converted and worked fine, however I considered leaving it as not a good idea, as data is stored in UserRepo which will make it difficult to clean up later on.

Hmm, removing particular user JID (in this case multi-user-chat user) should remove all nodes/pairs for that user, however it would leave history intact (but this would be easier to handle as it's in separate table.

Additionally removing data from UserRepo make it possible to resume conversion if it crashes/failes as I remove data from UserRepo after proper entries in new repository are created. So restart after crash will allow to continue this operation from point where it crashed.

Possible solution - creating temporary user multi-user-chat-user and moving processed configuration to this JID.

But if you confirm it works OK and there won't be data loss then we can skip it.

  • I think that when we deprecate something we should include information what is the replacement?

I agree. I deprecated few interfaces, ie. Repository to make it easier to identify and later remove. I think we should add this informations for sure.

I've noticed that hence my comment :-)

  • schema files are not included in resulting jar (and probably tigase-server needs to be updated as well, probably same solution as in tigase-pubsub case);

This is starting to be an issue. We need to implement this automatic update for DB schema mechanism.

OK, having automatic updater will be handy, but... this is separate issue - we need to transfer schema from component to main tigase server distribution packages, hence we should include it in the resulting jar and then unpack it.

  • upgrade path should be documented (at first at the least in tigase-server, but later on we are probably going to extract component-specific parts of the documentation to relevant projects and then pull them and inline when building tigase-server package - Dan is working on this)

You mean description what needs to be updated between versions, etc?

Something like: http://docs.tigase.org/tigase-server/snapshot/Administration_Guide/html_chunk/oldVerSchemas.html#v710notice

I agree, however having this automatic update mechanism with incremental SQL file with timestamp would make it easier and it could verify schema during startup and warn user what is missing and how to fix it.

Yes, this is true.

And one more, a bit separate:

  • it just occurred to me (rather general remark), that maybe we should also include a documentation very short description of the schema? We are moving towards per-component schemas, often with multiple tables and not always using prefixes, so it may be prudent to have something simple like: this component is using table x,y,z; table x stores user data, table y stores... - %kobit - comments?

There was simiar documentation for PubSub at wiki if I'm correct. I think it may be useful for people. But we should do this for databases and for Mongo as well.

Yes, this wasn't exclusive to this particular issue and component.

There is a PubSubDAO description (v. https://projects.tigase.org/projects/tigase-pubsub/wiki/PubSub_DAO, a bit dated) but it doesn't cover what-is-where, and while it may be obvious, especially for us, but the question about tables pop up sometimes on the forums.

At any rate - it shouldn't be included in the wiki but in the future asciidoc of the component.

Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

Andrzej Wójcik wrote:

  • in converter public void convert() throws RepositoryException you are destroying room and it's history after conversion:

[...] Maybe it would be better to make it optional (just in case to avoid situations where there could be some data lost?); there is no converter test case so maybe just in case?

I tested converted and worked fine, however I considered leaving it as not a good idea, as data is stored in UserRepo which will make it difficult to clean up later on.

Hmm, removing particular user JID (in this case multi-user-chat user) should remove all nodes/pairs for that user, however it would leave history intact (but this would be easier to handle as it's in separate table.

No, no, no! Default room configuration is left in UserRepository as it fits there. If you would remove user, you would remove this configuration!

Additionally removing data from UserRepo make it possible to resume conversion if it crashes/failes as I remove data from UserRepo after proper entries in new repository are created. So restart after crash will allow to continue this operation from point where it crashed.

Possible solution - creating temporary user multi-user-chat-user and moving processed configuration to this JID.

But if you confirm it works OK and there won't be data loss then we can skip it.

I tested converted few times and in some cases I forced it to crash. Then I restarted it and it moved rest of data without any issues. In the end all data was moved correctly.

  • I think that when we deprecate something we should include information what is the replacement?

I agree. I deprecated few interfaces, ie. Repository to make it easier to identify and later remove. I think we should add this informations for sure.

I've noticed that hence my comment :-)

  • schema files are not included in resulting jar (and probably tigase-server needs to be updated as well, probably same solution as in tigase-pubsub case);

This is starting to be an issue. We need to implement this automatic update for DB schema mechanism.

OK, having automatic updater will be handy, but... this is separate issue - we need to transfer schema from component to main tigase server distribution packages, hence we should include it in the resulting jar and then unpack it.

Right

  • upgrade path should be documented (at first at the least in tigase-server, but later on we are probably going to extract component-specific parts of the documentation to relevant projects and then pull them and inline when building tigase-server package - Dan is working on this)

You mean description what needs to be updated between versions, etc?

Something like: http://docs.tigase.org/tigase-server/snapshot/Administration_Guide/html_chunk/oldVerSchemas.html#v710notice

I can add following entries, but in 7.2.0 we are updating:

  • config file (DSL)

  • schema for PubSub

  • schema for MUC (new thing)

  • schema for MA/UA (new thing)

so there will be a lot to put in there.

I would rather suggest to create asciidoc for each component and add there section "Required during update" and mention schema updates there.

Then we could somehow gather this data to Tigase XMPP Server "Upgrade guide".

I agree, however having this automatic update mechanism with incremental SQL file with timestamp would make it easier and it could verify schema during startup and warn user what is missing and how to fix it.

Yes, this is true.

And one more, a bit separate:

  • it just occurred to me (rather general remark), that maybe we should also include a documentation very short description of the schema? We are moving towards per-component schemas, often with multiple tables and not always using prefixes, so it may be prudent to have something simple like: this component is using table x,y,z; table x stores user data, table y stores... - %kobit - comments?

There was simiar documentation for PubSub at wiki if I'm correct. I think it may be useful for people. But we should do this for databases and for Mongo as well.

Yes, this wasn't exclusive to this particular issue and component.

There is a PubSubDAO description (v. https://projects.tigase.org/projects/tigase-pubsub/wiki/PubSub_DAO, a bit dated) but it doesn't cover what-is-where, and while it may be obvious, especially for us, but the question about tables pop up sometimes on the forums.

At any rate - it shouldn't be included in the wiki but in the future asciidoc of the component.

+1 - it should be placed in components asciidoc

wojciech.kapcia@tigase.net commented 8 years ago

Andrzej Wójcik wrote:

No, no, no! Default room configuration is left in UserRepository as it fits there. If you would remove user, you would remove this configuration!

OK, makes sense.

  • upgrade path should be documented (at first at the least in tigase-server, but later on we are probably going to extract component-specific parts of the documentation to relevant projects and then pull them and inline when building tigase-server package - Dan is working on this)

You mean description what needs to be updated between versions, etc?

Something like: http://docs.tigase.org/tigase-server/snapshot/Administration_Guide/html_chunk/oldVerSchemas.html#v710notice

I can add following entries, but in 7.2.0 we are updating:

  • config file (DSL)
  • schema for PubSub
  • schema for MUC (new thing)
  • schema for MA/UA (new thing)

so there will be a lot to put in there.

I would rather suggest to create asciidoc for each component and add there section "Required during update" and mention schema updates there.

Then we could somehow gather this data to Tigase XMPP Server "Upgrade guide".

As mentioned before - Dan is already working on in-lining all the documentations so this should be feasible. As this is in origin/master and intended for 7.2.0 then documentation can be migrated (and upgrade details) later on. I've created apropriate tasks for each listed component: #4714, #4715, #4716, #4717.

Marking this as resolved/closed.

wojciech.kapcia@tigase.net commented 8 years ago

Given that the schema is OK and the issue seems to be different I've moved the issue to separate task: #5006

Referenced from commit 8 months ago
issue 1 of 1
Type
New Feature
Priority
Normal
Assignee
RedmineID
3994
Issue Votes (0)
Watchers (0)
Reference
tigase/_server/tigase-muc#81
Please wait...
Page is in error, reload to recover