Exposing statistics from DB layer from Tigase

Exposing statistics from DB layer from Tigase - Monitored Methods (#216)

arun das opened 1 decade ago

Due Date
2017-04-20

It would be great if Tigase could expose statistics (such as average execution time, number of executions, exceptions, etc.) from classes which interact directly with the DB through query execution. [MonitoredMethods]

For example, MySqlHistoryProvider, which interacts with the MySQL DB to store/read the history of MUC activities.

This will help us in getting the actual performance from last layer in application to DB.

This will help us in figuring out the actual DB performance. In case of slow-running queries, full table scans, locked sessions, etc., these statistics will definitely help in pinpointing the issue.

People can use additional monitoring/alerting tools set with some thresholds on these statistics to send out alerts in case vital statistics go beyond a threshold.

Please let me know whether you will be able to accommodate this feature in the coming releases.

Related
- Separate Component-based statistics (#827) Closed

Activities

Artur Hefczyc commented 1 decade ago

Wojciech, please look at it while you work on the DB refactoring and SQL Server support. If this is too much work we will push it to version 5.2.1, if not too much then we can include it in 5.2.0.

I started working on a special API for accumulating timings and performance metrics. I suppose this could be used here as well.
arun das commented 1 decade ago

Would like to know about the planned release date for this.
Artur Hefczyc commented 1 decade ago

We have plans for rewriting part of DB API very soon. While we do that we will add statistics. I will let you know once we have ETA for the work.
Artur Hefczyc commented 10 years ago

Is this completed yet? I know some work has been already done for this.
Wojciech Kapcia (Tigase) commented 10 years ago

Some work has been done under #2908 - extending StatisticsAPI, adding generalised API to measure ad-hoc execution time and, for some PubSub queries - measuring execution time. This hasn't been done to all DB queries nor those in tigase-server.
Artur Hefczyc commented 10 years ago

Please estimate work effort to complete the task then.
Wojciech Kapcia (Tigase) commented 10 years ago

Judging from previous efforts 6-8h.
Wojciech Kapcia (Tigase) commented 9 years ago

Implementation discussion was in #4491#note-3
Andrzej Wójcik (Tigase) commented 9 years ago
In discussion in #4491#note-3 we decided to go with implementation of a wrapper for each interface of a repository to count execution time on a repository method level.

After looking at our code I found a better approach to this task. Instead of preparing a wrapper implementation for every repository we use, I used the Java Proxy class, which is able to do the same on the fly. I checked and it has similar performance to a wrapper, since Java in fact creates a class in memory and uses it as a wrapper. Thanks to that, there will be no need to update wrappers in the future if we add or remove any method from existing repository interfaces.
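The Proxy-based approach can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the actual Tigase code: HistoryRepository, MethodStats and TimingHandler are hypothetical names invented for the example, and the only thing taken from the comment is the use of java.lang.reflect.Proxy to wrap a repository interface and measure each method call.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical repository interface, used only for this illustration.
interface HistoryRepository {
    void addMessage(String room, String body);
    int deleteExpiredMessages();
}

// Per-method counters: executions, exceptions and total time in nanoseconds.
final class MethodStats {
    final AtomicLong executions = new AtomicLong();
    final AtomicLong exceptions = new AtomicLong();
    final AtomicLong totalNanos = new AtomicLong();

    // Average over all calls recorded so far (0 when nothing was recorded).
    long averageNanos() {
        long n = executions.get();
        return n == 0 ? 0 : totalNanos.get() / n;
    }
}

// Forwards every call to the real repository and records timing per method name.
final class TimingHandler implements InvocationHandler {
    private final Object target;
    final Map<String, MethodStats> stats = new ConcurrentHashMap<>();

    TimingHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        MethodStats s = stats.computeIfAbsent(method.getName(), k -> new MethodStats());
        long start = System.nanoTime();
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            s.exceptions.incrementAndGet();
            throw e.getCause(); // rethrow the repository's own exception
        } finally {
            s.executions.incrementAndGet();
            s.totalNanos.addAndGet(System.nanoTime() - start);
        }
    }

    // Creates the proxy "on the fly" for any repository interface.
    static <T> T wrap(Class<T> iface, TimingHandler handler) {
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[]{iface}, handler));
    }
}

final class ProxyStatsDemo {
    public static void main(String[] args) {
        HistoryRepository real = new HistoryRepository() {
            public void addMessage(String room, String body) { /* would run a query */ }
            public int deleteExpiredMessages() { return 0; }
        };
        TimingHandler handler = new TimingHandler(real);
        HistoryRepository repo = TimingHandler.wrap(HistoryRepository.class, handler);
        repo.addMessage("room", "hello");
        repo.deleteExpiredMessages();
        handler.stats.forEach((name, s) -> System.out.println(
                name + ": executions=" + s.executions.get()
                        + " exceptions=" + s.exceptions.get()
                        + " avgNanos=" + s.averageNanos()));
    }
}
```

Because the handler works purely on the interface, adding or removing a repository method needs no change here, which is the maintenance benefit mentioned above.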

Additionally, I modified and created classes extending MDRepositoryBean, SDRepositoryBean and MDPoolBean, which are used as base classes for multi-domain repository pools (in fact, in version 7.2.0 we use only them). In these extended classes (which are beans) I added a new configuration property, statistics; if this property is set to true, statistics will be gathered for this repository, separately for each connection pool.

At this point I gather the following information for every method of a repository interface:

message-archive/repositoryPool/default/deleteExpiredMessages/Executions[L] = 0
message-archive/repositoryPool/default/deleteExpiredMessages/Excutions last hour[L] = 0
message-archive/repositoryPool/default/deleteExpiredMessages/Excutions last minute[L] = 0
message-archive/repositoryPool/default/deleteExpiredMessages/Excutions last second[L] = 0
message-archive/repositoryPool/default/deleteExpiredMessages/Average processing time[L] = 0
message-archive/repositoryPool/default/deleteExpiredMessages/Exceptions during execution[L] = 0

which gives us fairly good knowledge about the performance of a method and any possible issues within the method's code.
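For anyone wanting to feed these entries into an external monitoring or alerting tool, as suggested in the original request, a line in the format shown above could be parsed roughly like this. It is a sketch only: RepositoryStat is a hypothetical helper, and the assumption that every entry looks like "path/method/metric[type] = number" is based solely on the sample lines in this comment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// One parsed statistics entry, e.g.
// "message-archive/repositoryPool/default/deleteExpiredMessages/Executions[L] = 0".
final class RepositoryStat {
    // Layout assumed from the sample lines: prefix / method / metric [type] = value
    private static final Pattern LINE =
            Pattern.compile("^(.+)/([^/]+)/([^/\\[]+)\\[[A-Z]\\] = (-?\\d+)$");

    final String prefix; // component, pool and data source part of the key
    final String method; // repository method name
    final String metric; // e.g. "Average processing time"
    final long value;

    private RepositoryStat(String prefix, String method, String metric, long value) {
        this.prefix = prefix;
        this.method = method;
        this.metric = metric;
        this.value = value;
    }

    static RepositoryStat parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised statistics line: " + line);
        }
        return new RepositoryStat(m.group(1), m.group(2), m.group(3), Long.parseLong(m.group(4)));
    }

    // True when the value is above the alerting threshold chosen by the operator.
    boolean exceeds(long threshold) {
        return value > threshold;
    }
}
```

A monitoring job could call RepositoryStat.parse on each retrieved line and raise an alert when, for example, the "Average processing time" entry of a method exceeds a chosen limit.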

Statistics can now be enabled for ANY class which uses one of MDRepositoryBean, SDRepositoryBean or MDPoolBean as a base multi-domain repository pool class, e.g.:

userRepository {
    statistics = true
}
authRepository {
    statistics = true
}
message-archive {
    repositoryPool {
        statistics = true
    }
}
muc {
    muc-dao {
        statistics = true
    }
    historyProviderPool {
        statistics = true
    }
}
pubsub {
    dao {
        statistics = true
    }
}
msgRepository {
    statistics = true
}

Because the DSL allows configuration parameters to be forwarded to sub-beans, it is now possible to enable statistics gathering at any level, e.g.:

To enable statistics for every bean in MUC which supports the statistics configuration property:

muc { statistics = true }

To enable statistics for the whole system (every bean which supports the statistics configuration property):

statistics = true

I'm assigning this task to %wojtek for a quick review of the gathered data and any comments on the configuration of statistics. If all is OK, we may update the documentation to mention these new statistics.
Wojciech Kapcia (Tigase) commented 8 years ago

Andrzej Wójcik wrote:

I'm assigning this task to wojtek for a quick review of gathered data and any comments on configuration of statistics. If all will be OK, then we may update documentations to mention this new statistics.

Looks good! Definitely should be part of the documentation.

Small comment - maybe it would be a good idea to enable it by default? I would suggest differentiating levels - leave count and average time with FINE but change remaining to FINEST
Andrzej Wójcik (Tigase) commented 8 years ago

I agree with changing levels of statistics, no problem.

But before I do that, I would like to know if I should enable it by default. It may slightly impact performance, and the statistics will be gathered by the statistics collector, which may lead to higher memory usage. So should I enable repository statistics by default?
Wojciech Kapcia (Tigase) commented 8 years ago

I think we can enable it by default, but on the condition that the levels are changed. This is because StatisticsCache stores only values for Level FINER and greater (see tigase.stats.StatisticsProvider.StatisticsCache), so while it will impact memory usage, getting basic statistics should still be useful. %kobit, comments?
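To illustrate why the level change matters, here is a small sketch of a "FINER and greater" check using java.util.logging.Level. It is not the actual StatisticsCache code: CachedLevelFilter is a hypothetical name, and the comparison by intValue() is an assumption about how such a filter would typically be written.

```java
import java.util.logging.Level;

final class CachedLevelFilter {
    // Assumed rule: keep a statistic in the cache only if its level is FINER or greater.
    static boolean isCached(Level statisticLevel) {
        return statisticLevel.intValue() >= Level.FINER.intValue();
    }

    public static void main(String[] args) {
        // Statistics left at FINE would pass the check, those moved to FINEST would not.
        System.out.println("FINE cached:   " + isCached(Level.FINE));
        System.out.println("FINEST cached: " + isCached(Level.FINEST));
    }
}
```

Under this assumption, moving the less important counters to FINEST keeps them out of the cached history, while count and average time at FINE remain available.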
Artur Hefczyc commented 8 years ago

OK.

A few comments though.

Do you have any idea how much this may impact performance?

I am also thinking, maybe it would be a good idea to collect somewhere, somehow all the performance related information. Kind of a guide or collection of suggestions on how to get max of Tigase in terms of performance, low resources usage, how to diagnose performance issues, etc.... What do you think?

For example, we could put there a point about disabling DB layer stats which reduces CPU usage by 2% and memory usage by 1GB.... and tricks and hints like this. Some of the suggestions can be, of course more thorough, such as how to read stats to find out which part of the Tigase is a bottleneck...

I am thinking of a more relaxed way to collect such information than our asciidoc documentation. Something like a wiki page that is easy to edit. We have the xmppscaling project, which was created for a similar purpose. We could change it to "Tigase XMPP Server Scaling" or something like this and start adding stuff like this.

%wojtek, %bmalkow, %andrzej.wojcik, %eric, %daniel: any suggestions are very welcome.
Andrzej Wójcik (Tigase) commented 8 years ago

Artur Hefczyc wrote:

OK.

A few comments though.

Do you have any idea how much this may impact performance?

I haven't done any testing on my own but code responsible for collection of statistics is simple and I used default mechanism in Java to generate dynamic proxies, which according to https://opencredo.com/dynamic-proxies-java-part-2/ is fast during execution.

I am also thinking, maybe it would be a good idea to collect somewhere, somehow all the performance related information. Kind of a guide or collection of suggestions on how to get max of Tigase in terms of performance, low resources usage, how to diagnose performance issues, etc.... What do you think?

For example, we could put there a point about disabling DB layer stats which reduces CPU usage by 2% and memory usage by 1GB.... and tricks and hints like this. Some of the suggestions can be, of course more thorough, such as how to read stats to find out which part of the Tigase is a bottleneck...

I am thinking of more relaxed way to collect such information than our asciidoc documentation. Something like a wiki page easy to edit. We have this project:xmppscaling project which was created for a similar purpose. We could change it to "Tigase XMPP Server Scaling" or something like this and start adding stuff like this.

I'm in favour of this idea. I think we should do that; however, I would like to stick to asciidoc. We can use the wiki to write down ideas, but later we should check them, measure them (i.e. some tests on our servers need to be done) and then write them up in asciidoc with the results we got.

I'm pushing for asciidoc because it has one great thing: versioning. Some tips may work in one version but not in another.

Artur, please assign issue back to me, so I may enable statistics gathering by default and change levels of gathered statistics.
Wojciech Kapcia (Tigase) commented 8 years ago

%kobit

Artur Hefczyc wrote:

Do you have any idea how much this may impact performance?

Me neither.

Regarding statistics I've only done a string interning optimization a while back which allowed saving dozens of MB of memory but that's about it.

What we could do is to count the size of a single complete Statistics collection at a given time and give that to the user (as part of such a statistic? or in the log?) so it would be possible to assess the increased memory consumption based on the configured statistics history size and currently enabled history. Please bear in mind that there is a task to further optimise statistics history (#4809).

I am also thinking, maybe it would be a good idea to collect somewhere, somehow all the performance related information. Kind of a guide or collection of suggestions on how to get max of Tigase in terms of performance, low resources usage, how to diagnose performance issues, etc.... What do you think?

That sounds like a good idea and I think that was the intention of the performance comparison tests run by %eric a while back.

For example, we could put there a point about disabling DB layer stats which reduces CPU usage by 2% and memory usage by 1GB.... and tricks and hints like this. Some of the suggestions can be, of course more thorough, such as how to read stats to find out which part of the Tigase is a bottleneck...

While this is a good idea and will definitely be helpful, if we want something performance related we could/should prepare a "use this spec for this kind of load" information so users, based on expected components and traffic could get the idea what hardware they need (and this is being asked quite often, not to mention it could be a good PR to publish those).

I am thinking of more relaxed way to collect such information than our asciidoc documentation. Something like a wiki page easy to edit. We have this project:xmppscaling project which was created for a similar purpose. We could change it to "Tigase XMPP Server Scaling" or something like this and start adding stuff like this.

Like %andrzej.wojcik I'm also in favour of putting this in asciidoc format in the repository (and the wiki can be used for drafting); asciidoc is similarly easy to edit (after a while its syntax is easier; it's a pity that Redmine doesn't support it). We already have a "High load settings" chapter in the Server guide, so that would fit quite nicely there.
Artur Hefczyc commented 8 years ago

Thank you for your comments.

I understand your motivations for using asciidoc. The problem is that it is not quick and easy to add a new part to the documentation. If I work on a problem and discover something, or have an idea or interesting data which is worth writing down in our documentation, then our current asciidoc system, committing changes to the git repo, etc. is something which I always want to postpone. As a result some things are never written down and get forgotten.

That's why I suggested simplified procedure to record some information.

By the way: https://github.com/gpr/redmine_asciidoc_formatter

If we could get this to work directly with the git repo somehow, so that Redmine displays documents directly from git, that would be great.
Andrzej Wójcik (Tigase) commented 8 years ago

I've enabled statistics and verified log levels. All was already set as requested (with exception of Exceptions during executions) but I decided to leave it in FINE as it may be important to know value of this counter.

%wojtek Please verify if it works correctly for you.

%kobit As I mentioned - we may use wiki for drafting and later (when we have time to do this) move things to asciidoc for preparing documentation on how to improve performance.
Daniel Wisnewski commented 8 years ago

Is there any way to set the level of statistics to FINEST or some other level? We do have different levels for statistics if they remain from the v7.1.0 statistics description, it would be helpful to filter them if we just want basic statistics on some components.

Also, do global statistics override component settings?
Wojciech Kapcia (Tigase) commented 8 years ago

Andrzej Wójcik wrote:

I've enabled statistics and verified log levels. All was already set as requested (with exception of Exceptions during executions) but I decided to leave it in FINE as it may be important to know value of this counter.

wojtek Please verify if it works correctly for you.

You are correct. I probably looked too hastily. Looks OK now and can be included in the documentation.

Daniel Wisnewski wrote:

Is there any way to set the level of statistics to FINEST or some other level?

You can select desired level of statistics when you retrieve statistics using XMPP or over JMX for example.

We do have different levels for statistics if they remain from the v7.1.0 statistics description, it would be helpful to filter them if we just want basic statistics on some components.

This works the same as in 7.1.0 - level wise.

Also, do global statistics override component settings?

IMHO bean config would override global settings.
Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

Andrzej Wójcik wrote:

I've enabled statistics and verified log levels. All was already set as requested (with exception of Exceptions during executions) but I decided to leave it in FINE as it may be important to know value of this counter.

wojtek Please verify if it works correctly for you.

You are correct. I probably looked too hastily. Looks OK now and can be included in the documentation.

Added.

Daniel Wisnewski wrote:

Is there any way to set the level of statistics to FINEST or some other level?

You can select desired level of statistics when you retrieve statistics using XMPP or over JMX for example.

We do have different levels for statistics if they remain from the v7.1.0 statistics description, it would be helpful to filter them if we just want basic statistics on some components.

This works the same as in 7.1.0 - level wise.

Also, do global statistics override component settings?

IMHO bean config would override global settings.

Yes, bean config will override global settings and global settings will override default settings for every repository (if needed).
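The precedence described here (bean config over global settings, global settings over the repository default) can be expressed as a short sketch. StatisticsSetting and its parameters are hypothetical and exist only to show the order of resolution; they are not part of the Tigase configuration API.

```java
import java.util.Optional;

final class StatisticsSetting {
    // Resolution order: bean-level value, then global value, then the repository default.
    static boolean resolve(Optional<Boolean> beanConfig,
                           Optional<Boolean> globalConfig,
                           boolean repositoryDefault) {
        return beanConfig.orElseGet(() -> globalConfig.orElse(repositoryDefault));
    }

    public static void main(String[] args) {
        // Global "statistics = true", but the bean explicitly sets "statistics = false".
        boolean enabled = resolve(Optional.of(false), Optional.of(true), true);
        System.out.println("statistics enabled for this bean: " + enabled);
    }
}
```

So a bean with its own statistics value ignores the global setting, and a bean without one inherits the global value if it is present.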
Wojciech Kapcia (Tigase) commented 8 years ago

Andrzej Wójcik wrote:

Wojciech Kapcia wrote:

You are correct. I probably looked too hastily. Looks OK now and can be included in the documentation.

Added.

As usual a couple of comments ;)

=== Number of active data sources

It could use some more description -- is it a completely total number of connections? or number of existing {dataSourceName}?

=== Average processing time of {method}

Clarification whether it's totals - i.e. of all time and all calls.

== Statistics common to custom {compname} component repositories

These statistics may be found in many components which are using repository implementations created just for them.

Are there still statistics exclusive to components in per-component documentation which could/should be included here? Or we skip inlining in this case?

%Daniel and %Andrzej - what do you think about turning this into table? after perusing it it struck me that it could make sense
Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

Andrzej Wójcik wrote:

Wojciech Kapcia wrote:

You are correct. I probably looked too hastily. Looks OK now and can be included in the documentation.

Added.

As usual a couple of comments ;)

=== Number of active data sources

It could use some more description -- is it a completely total number of connections? or number of existing {dataSourceName}?

Added answer to documentation.

=== Average processing time of {method}

Clarification whether it's totals - i.e. of all time and all calls.

Added answer to documentation.

== Statistics common to custom {compname} component repositories

These statistics may be found in many components which are using repository implementations created just for them.

Are there still statistics exclusive to components in per-component documentation which could/should be included here? Or we skip inlining in this case?

I do not think we have any other per-component statistics which should be described, but it will change in future and I strongly suggest to stop inlining entries related to component in server documentation.

Description of component features/options/settings/statistics/modules should be part of component documentation and later may be inlined in Tigase XMPP Server documentation during preparation of Tigase XMPP Server distribution build.

%Daniel Do you agree?

%Daniel and %Andrzej - what do you think about turning this into table? after perusing it it struck me that it could make sense

From my point of view a list is fine, but it could be transformed into a table as well, since our description of each method consists of the same entries: level, format, possible value, etc.

However, we are using code blocks to show a possible entry, which would be difficult to do in a table; in a table I think we could just drop the code blocks and wrap the entry in ````
Daniel Wisnewski commented 8 years ago

The only concern I have is readability. We can for sure create a component statistics documentation page for each component, and then add them to the index. However, one reason why I inlined common component statistics was to avoid multiple duplications. Taken all together they can be a lot to scroll through and lengthen the document considerably.

A potential fix for this is to move statistics description to an appendix of sorts at the end of documentation. In which case we can lay everything out.
Wojciech Kapcia (Tigase) commented 8 years ago
Andrzej Wójcik wrote:

Added answer to documentation.

Thank you for additions.

== Statistics common to custom {compname} component repositories

These statistics may be found in many components which are using repository implementations created just for them.

Are there still statistics exclusive to components in per-component documentation which could/should be included here? Or we skip inlining in this case?

I do not think we have any other per-component statistics which should be described, but it will change in future and I strongly suggest to stop inlining entries related to component in server documentation.

Description of component features/options/settings/statistics/modules should be part of component documentation and later may be inlined in Tigase XMPP Server documentation during preparation of Tigase XMPP Server distribution build.

+1

Except for general things (like in this case, where repository statistics, which were reported by the repository of a particular component, were in fact a result of data source statistics).

%Daniel and %Andrzej - what do you think about turning this into table? after perusing it it struck me that it could make sense

From my point of view list is fine but it could be transformed into table as well as our description of each method contains of same entries: level, format, possible value, etc.

Yes, that was my idea - each row = single setting. I think it would make it easier to grasp.

However, we are using code blocks to show a possible entry, which would be difficult to do in a table; in a table I think we could just drop the code blocks and wrap the entry in ````

That's rather cosmetic change, and using inline code formatting for a short excerpts should work just fine.

Daniel Wisnewski wrote:

The only concern I have is readability. We can for sure create a component statistics documentation page for each component, and then add them to the index. However, one reason why I inlined common component statistics was to avoid having multiple duplications. When all totaled together they can be a lot to scroll through and lengthens the size considerably.

I think that Andrzej wasn't suggesting expanding everything and having a detailed, repetitive description of all statistics.

Something like avoiding

pubsub/repository/pubsub-repo-specific-statistics[I]=123

included in tigase-server.

In principle -- if something isn't strictly tigase-server and it's about component then it should go into that component documentation (and now we have asciidoc documentation for almost all projects [all?]).
Andrzej Wójcik (Tigase) commented 8 years ago

Wojciech Kapcia wrote:

Andrzej Wójcik wrote:

Added answer to documentation.

Thank you for additions.

== Statistics common to custom {compname} component repositories

These statistics may be found in many components which are using repository implementations created just for them.

Are there still statistics exclusive to components in per-component documentation which could/should be included here? Or we skip inlining in this case?

I do not think we have any other per-component statistics which should be described, but it will change in future and I strongly suggest to stop inlining entries related to component in server documentation.

Description of component features/options/settings/statistics/modules should be part of component documentation and later may be inlined in Tigase XMPP Server documentation during preparation of Tigase XMPP Server distribution build.

+1

Except for general things (like in this case, where repository statistics, which were reported by the repository of a particular component, were in fact a result of data source statistics).

Right, common settings should be left in Tigase XMPP Server documentation.

%Daniel and %Andrzej - what do you think about turning this into table? after perusing it it struck me that it could make sense

From my point of view list is fine but it could be transformed into table as well as our description of each method contains of same entries: level, format, possible value, etc.

Yes, that was my idea - each row = single setting. I think it would make it easier to grasp.

+1

However, we are using code blocks to show a possible entry, which would be difficult to do in a table; in a table I think we could just drop the code blocks and wrap the entry in ````

That's rather cosmetic change, and using inline code formatting for a short excerpts should work just fine.

Yes, I know.

Daniel Wisnewski wrote:

The only concern I have is readability. We can for sure create a component statistics documentation page for each component, and then add them to the index. However, one reason why I inlined common component statistics was to avoid having multiple duplications. When all totaled together they can be a lot to scroll through and lengthens the size considerably.

I think that Andrzej wasn't suggesting expanding everything and having a detailed, repetitive description of all statistics.

Wojtek is correct here. As I mentioned few lines above - common description should be left in Tigase XMPP Server documentation.

Something like avoiding

[...]

included in tigase-server.

In principle -- if something isn't strictly tigase-server and it's about component then it should go into that component documentation (and now we have asciidoc documentation for almost all projects [all?]).

I think we have asciidoc for almost all projects and we are on a road to create separated documentation (asciidoc) for every project.
Wojciech Kapcia (Tigase) commented 8 years ago

Andrzej Wójcik wrote:

%Daniel and %Andrzej - what do you think about turning this into table? after perusing it it struck me that it could make sense

From my point of view list is fine but it could be transformed into table as well as our description of each method contains of same entries: level, format, possible value, etc.

Yes, that was my idea - each row = single setting. I think it would make it easier to grasp.

+1

Dan, can you make the change?
Daniel Wisnewski commented 8 years ago

All statistics are now in tables, separated by component. I found it hard to control some of the formatting, since a lot needed to fit into some descriptions, but I think this works and simplifies the document. It has also been moved to an appendix at the end of the document.
Wojciech Kapcia (Tigase) commented 8 years ago
Daniel Wisnewski wrote:

All statistics are now in tables, separated by component. Although I found it hard to control some formatting since a lot needed to fit in some descriptions, but I think this works and simplifies the document,

Definitely looks better. Non-server components would need to be extracted to per-component repositories but that's separate issue.

it's also now moved to an appendix at the end of the document.

OK, but there is some problem with indenting, take a look at http://docs.tigase.org/tigase-server/snapshot/User_Guide/html_chunk/ - we have two chapters:

21. Appendix I - Statistics Descriptions
22. Statistics description

Please correct this.
Daniel Wisnewski commented 8 years ago

Fixed the chapter names to prevent doubling of Appendix I. Wojciech, do we have known component-based statistics? Do they all display on server shutdown when components are turned on?
Wojciech Kapcia (Tigase) commented 8 years ago
Daniel Wisnewski wrote:

Fixed the chapter names to prevent doubling of Appendix I.

From the source it looks fixed. Hopefully next nightly will have it corrected on docs. website.

Wojciech, do we have known component-based statistics?

Some are already included in the guide, vide chapters:

- message-archive
- muc
- proxy
- pubsub
- rest

Those should be placed in corresponding projects.

Do they all display on server shutdown when components are turned on?

Yes, but only when the component is enabled (and with 7.2.0 more components are enabled)
Daniel Wisnewski commented 8 years ago

Closing issue, I have made a related task to separate statistics descriptions to keep work separate.

Type	New Feature
Priority	Normal
Assignee	Daniel Wisnewski
RedmineID	1477
Version	tigase-server-8.0.0
Spent time	0

Reference

tigase/_server/server-core#216