Using OldGen for memory reporting (#658)

Wojciech Kapcia (Tigase) opened 9 years ago

Due Date
2016-09-22

We are using tenured (Old Gen) heap space for memory usage reporting as per javadoc comment:

We try to return OLD memory pool size as this is what is the most interesting to us. If this is not possible then we return total Heap size.

and I think this doesn't give us a proper overview of the system: in theory that pool should contain only long-lived objects, so the figure ignores the Young Generation.

Screenshot 2016-03-07 19.04.02.png Screenshot 2016-03-07 19.03.34.png

Related
- Investigate GC settings (#521) Closed

Activities

Artur Hefczyc commented 9 years ago

Wojtek, I do not understand what you propose we do. And why is this set as a bug?
Wojciech Kapcia (Tigase) commented 9 years ago
OK, a little bit of clarification:

- in the JVM we have a couple of different regions of the heap (which ones depends on the type of GC used);

- Tigase currently reports memory usage only from the region that has "old" in its name (if such a region exists) instead of complete heap usage;

- this results in a slightly distorted picture of the heap usage.

Take a look at the examples: the VisualVM heap overview shows that with the CMS GC (top) total heap usage climbed slowly and then a Full GC occurred; by comparison, with the default ParallelGC (bottom) we see huge rises in total heap usage and frequent Full GCs.

Now compare this with what Tigase reports (MRTG screenshots): green (on the left, with CMS) looks as if its GC is behaving very sporadically, while orange (Parallel, on the right) shows rather low and constant memory usage. That is because it only shows OldGen: not many objects get promoted, while Eden space was used more (in terms of both usage and collection).

Bottom line: our current metric is (or could be) misleading when looking at memory usage.
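
To make the difference concrete, here is a minimal sketch (not Tigase's actual code) that compares the two figures using the standard `java.lang.management` API. The class name `OldGenVsHeap` and both helper names are made up for illustration; the lookup by "old" in the pool name simply mirrors the behaviour described above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class OldGenVsHeap {

    // First heap pool whose name contains "old" (e.g. "PS Old Gen", "CMS Old Gen",
    // "G1 Old Gen"); empty if the active collector exposes no such pool.
    static Optional<MemoryPoolMXBean> findOldGenPool(List<MemoryPoolMXBean> pools) {
        return pools.stream()
                .filter(p -> p.getType() == MemoryType.HEAP)
                .filter(p -> p.getName().toLowerCase(Locale.ROOT).contains("old"))
                .findFirst();
    }

    // Used memory as a percentage of max; falls back to committed when max is undefined (-1).
    static double usedPercent(MemoryUsage usage) {
        long limit = usage.getMax() > 0 ? usage.getMax() : usage.getCommitted();
        return limit > 0 ? 100.0 * usage.getUsed() / limit : 0.0;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // Old-gen usage if such a pool exists, otherwise the whole heap.
        MemoryUsage reported = findOldGenPool(ManagementFactory.getMemoryPoolMXBeans())
                .map(MemoryPoolMXBean::getUsage)
                .orElse(heap);
        System.out.printf("old-gen (or fallback): %.1f%%%n", usedPercent(reported));
        System.out.printf("whole heap:            %.1f%%%n", usedPercent(heap));
    }
}
```

Running this under different collectors should show the two percentages diverging whenever most live data sits in Eden/Survivor rather than in the old generation.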
Artur Hefczyc commented 9 years ago

OK, now I understand. The current behavior is intentional.

Please look at the subject from the admin's point of view. An admin is interested in the stability and performance of the service. Therefore, I believe we are only interested in the old-gen (or similar regions) of the heap. This is because it shows us real long-term memory usage and, as a result, the real memory requirements of the installation, plus potential memory leaks. When the old-gen size approaches the heap limit we know that there is a problem.

Please note, we are not interested in young-gen, which mostly contains the stanzas being processed. This memory is garbage-collected quickly enough and released back to the program, so short-term memory spikes are irrelevant. They depend on the traffic/load, the memory size allocated to the JVM and the GC algorithm used. When the young-gen approaches the heap limit it means nothing at all: GC will soon kick in and release all the objects.

However, from the performance point of view we are interested in young-gen, but only when we experiment with the GC settings. On a normal production system young-gen is not something we want to look at. This is because we want GC settings which reduce large stop-the-world GC runs and do most of the GC in the background, concurrently with normal server operation. While you run tests and experiment with GC, you are definitely interested in young-gen to see how GC handles it.
Wojciech Kapcia (Tigase) commented 9 years ago

This makes sense, with the caveat that the number of long-lived objects (which could roughly translate to connections/sessions) outweighs the number of stanzas processed (Old is usually bigger than Eden).

Nevertheless, the current label may be a bit misleading: "% HEAP" can be read as the whole heap. In addition, without configuring Xms=Xmx this value could fluctuate over time, so while the OldGen allocation in MB may vary, the percentage may stay constant.
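
A small worked example of that last point, using made-up numbers: if the percentage is computed against the currently committed size (which can grow when Xms is lower than Xmx), two quite different absolute usages can report the same percentage. The sample values, the 2048 MB maximum and the class name `HeapPercentDemo` are assumptions for illustration only.

```java
import java.util.Locale;

final class HeapPercentDemo {

    // Plain percentage of used memory against some limit (both in MB).
    static double percent(long usedMb, long limitMb) {
        return 100.0 * usedMb / limitMb;
    }

    public static void main(String[] args) {
        long maxMb = 2048; // hypothetical -Xmx2g
        // Each sample is {used MB, committed MB}; committed grows because Xms < Xmx.
        long[][] samples = { {300, 600}, {400, 800} };
        for (long[] s : samples) {
            System.out.println(String.format(Locale.ROOT,
                    "used=%d MB  vs committed: %.1f%%  vs max: %.1f%%",
                    s[0], percent(s[0], s[1]), percent(s[0], maxMb)));
        }
    }
}
```

Both samples come out at 50% of the committed size even though usage grew by 100 MB; measured against the fixed maximum the growth is visible (roughly 14.6% versus 19.5%).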
Artur Hefczyc commented 9 years ago

Wojciech Kapcia wrote:

This makes sense, with the caveat that the number of long-lived objects (which could roughly translate to connections/sessions) outweighs number of stanzas processed (Old is usually bigger than Eden).

This really depends on the traffic. On some installations Tigase handles 1M XMPP packets per second. Even on low-traffic installations with large memory, the young-gen may be much larger than the old-gen, so between GC runs the number of short-lived objects can be much greater than the number of long-lived objects.

Nevertheless, currently it may be a bit misleading ("% HEAP" can be considered as a whole heap; in addition - without configuring Xms=Xmx this value could fluctuate over time so while OldGen allocation in MB may vary percentage may be constant)

Indeed. We need to make the description clearer and also make sure we recommend correct, reliable memory settings for the JVM. Also, I think it makes sense to work on the memory metrics so they give us more detailed information that would be useful both to production system admins and during load tests. Maybe we could/should add more metrics describing memory usage and GC activity.
Wojciech Kapcia (Tigase) commented 9 years ago
Artur Hefczyc wrote:

This really depends on the traffic. On some installations Tigase handles 1M XMPP packets per second. Even on low traffic installations but with large memory, the young-gen may be much larger than old-gen, so between GC runs number of short-lived objects can be much greater than long-lived objects.

This is not exactly true:

By default, the Application Server is invoked with the Java HotSpot Server JVM. The default NewRatio for the Server JVM is 2: the old generation occupies 2/3 of the heap while the new generation occupies 1/3. The larger new generation can accommodate many more short-lived objects, decreasing the need for slow major collections. The old generation is still sufficiently large enough to hold many long-lived objects.

Hence I think that swapping those proportions could be a good starting point, although this would greatly depend on the use case (with lower traffic and more long-lived connections it wouldn't work too well). That is more in the context of the GC tuning ticket, though.
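
For reference, the arithmetic behind the quoted NewRatio default can be sketched as follows. This is only an illustration of the ratio (old:young = NewRatio:1); the 3072 MB heap and the class name `NewRatioSplit` are assumptions, and real JVMs round generation sizes to their own alignment.

```java
final class NewRatioSplit {

    // Young-generation size when old:young = newRatio:1 (the -XX:NewRatio semantics).
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    // The old generation gets the rest of the heap.
    static long oldMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    public static void main(String[] args) {
        long heapMb = 3072; // hypothetical heap size
        for (int ratio : new int[] { 2, 1 }) {
            System.out.println("NewRatio=" + ratio
                    + " young=" + youngMb(heapMb, ratio) + " MB"
                    + " old=" + oldMb(heapMb, ratio) + " MB");
        }
    }
}
```

With NewRatio=2 the 3072 MB heap splits into 1024 MB young and 2048 MB old. Since NewRatio is an integer ratio of old to young, making the young generation the larger part would need an explicit young-generation size (for example via -Xmn) rather than a smaller ratio.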

Nevertheless, currently it may be a bit misleading ("% HEAP" can be considered as a whole heap; in addition - without configuring Xms=Xmx this value could fluctuate over time so while OldGen allocation in MB may vary percentage may be constant)

Indeed. We need to make the description more clear and also make sure we recommend correct, reliable memory settings for JVM. Also, I think it makes sense to work on the memory metrics to provide us more detailed information. Information which would be useful for production systems admin and during load tests. Maybe we could/should add more metrics describing memory usage and GC activity.

I've already added some more information about GC to the logs, and I'll probably add more later.

Clarifying the naming and extending the statistics seems logical (currently we can retrieve those via JMX, but not through XMPP) - agreed?

I propose then:

- report memory usage, overall and per region;

- report GC activity (number of collections, their times and, if possible, whether they were STW).
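
As a rough idea of where the GC activity numbers could come from, the sketch below reads them from the standard `GarbageCollectorMXBean`s. It is an illustration, not the code that ended up in Tigase; the class name `GcActivityReport` and the `snapshot()` helper are hypothetical. Note that the standard bean only exposes a collection count and an accumulated time, so whether a collection was stop-the-world cannot be read from it directly.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

final class GcActivityReport {

    // Maps collector name -> {collection count, accumulated collection time in ms}.
    // Either value may be -1 if the collector does not report it.
    static Map<String, long[]> snapshot() {
        Map<String, long[]> result = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.put(gc.getName(),
                    new long[] { gc.getCollectionCount(), gc.getCollectionTime() });
        }
        return result;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, v) ->
                System.out.printf("%s: collections=%d, total time=%d ms%n", name, v[0], v[1]));
    }
}
```

Taking two snapshots some seconds apart and subtracting them would give the number of collections and the time spent in GC during that interval.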
Artur Hefczyc commented 9 years ago
Wojciech Kapcia wrote:

This is not exactly true:

By default, the Application Server is invoked with the Java HotSpot Server JVM. The default NewRatio for the Server JVM is 2: the old generation occupies 2/3 of the heap while the new generation occupies 1/3. The larger new generation can accommodate many more short-lived objects, decreasing the need for slow major collections. The old generation is still sufficiently large enough to hold many long-lived objects.

That's the theory.

I mean, the fact that the old-gen space is bigger than the new-gen space does not mean that the number of objects in new-gen is smaller than the number of old-gen objects in a particular use case. I think Tigase (and XMPP in general) is probably not a typical use case. Please note that in XMPP we have SM-related objects which stay in memory for hours or even days, while stanza-related objects usually disappear in seconds.

The traffic of 1M XMPP packets per second happens on an installation with between 1M and 2M online users. So in this particular use case we can easily have more stanza-related objects than user session objects. There are also use cases with lower traffic but a very large amount of memory assigned to Tigase. There the old-gen is relatively empty and the new-gen space slowly fills up until GC kicks in. Then there is a long GC collection which stops the VM for a few seconds.

Hence I think that swapping those could be a good starting point - but this would greatly depend on the use-case (with lower traffic and more long lived connections this wouldn't be too good); but this is more in the context of the GC tuning ticket.

Our goal is to minimize or, if possible, completely get rid of long-running GC collections. From the performance point of view it is better to have GC kicking in more often for shorter periods of time rather than once in a while for a few seconds or minutes. Ideally all the GC activity should happen in the background, concurrently with normal application activity, without stopping all threads.

I do not know what GC settings are best for our use case, but this is what we need to find out.

Clarification of naming and extending the statistics seems logical (currently we can retrieve those via JMX, but not through XMPP) - agreed?

I propose then:

Report Memory usage and usage per regions;

Report GC activity (number of collections, times and whether they were STW - if possible).

Why not through XMPP? Right now, all the Tigase metrics are accessible through either JMX or XMPP. I think we should continue on this path.
Wojciech Kapcia (Tigase) commented 9 years ago
Artur Hefczyc wrote:

That's the theory.

I mean, the fact that the old-gen space is bigger than the new-gen space does not mean that the number of objects in new-gen is smaller than the number of old-gen objects in a particular use case. I think Tigase (and XMPP in general) is probably not a typical use case. Please note that in XMPP we have SM-related objects which stay in memory for hours or even days, while stanza-related objects usually disappear in seconds.

The traffic of 1M XMPP packets per second happens on an installation with between 1M and 2M online users. So in this particular use case we can easily have more stanza-related objects than user session objects. There are also use cases with lower traffic but a very large amount of memory assigned to Tigase. There the old-gen is relatively empty and the new-gen space slowly fills up until GC kicks in. Then there is a long GC collection which stops the VM for a few seconds.

I think that we are talking about the same thing ;-)

I was just pointing out that:

- it's true that XMPP traffic is different and may impact heap usage differently;

- the space dedicated to (or intended for) old-gen and young-gen may not fit that scenario (you said "but with large memory, the young-gen may be much larger than old-gen", so I quoted documentation stating that by default the space assigned to old-gen is always larger, and that traffic doesn't affect it in any way);

- we should experiment more with the ratio of new-gen to old-gen to better match the traffic generated in XMPP (please see the line below from my comment).

Hence I think that swapping those could be a good starting point - but this would greatly depend on the use-case (with lower traffic and more long lived connections this wouldn't be too good); but this is more in the context of the GC tuning ticket.

Our goal is to minimize or, if possible, completely get rid of long-running GC collections. From the performance point of view it is better to have GC kicking in more often for shorter periods of time rather than once in a while for a few seconds or minutes. Ideally all the GC activity should happen in the background, concurrently with normal application activity, without stopping all threads.

I do not know what GC settings are best for our use case, but this is what we need to find out.

Agreed.

Clarification of naming and extending the statistics seems logical (currently we can retrieve those via JMX, but not through XMPP) - agreed?

I propose then:

Report Memory usage and usage per regions;

Report GC activity (number of collections, times and whether they were STW - if possible).

Why not through XMPP? Right now, all the Tigase metrics are accessible through either JMX or XMPP. I think we should continue on this path.

OK, let me clarify more:

- right now via XMPP we can access only a couple of memory-related metrics: heap usage in %, plus max/used/free in KB, all taken from a single heap generation, and the same set of metrics for non-heap memory (for example "message-router/HEAP usage [%]");

- more detailed statistics regarding heap/memory usage can be obtained through JMX, but these don't go through Tigase statistics like the above; they are metrics provided by the JVM itself.
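
For context, the JVM-provided per-region figures mentioned above are available in-process through the standard `MemoryPoolMXBean`s, which is one way they could be fed into application-level statistics. The following is only a sketch under that assumption; `MemoryPoolReport` and `describePools()` are hypothetical names, not part of Tigase.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class MemoryPoolReport {

    // One line per memory pool (heap and non-heap) known to the running JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            if (u == null) {
                continue; // the pool is no longer valid
            }
            // max is reported as -1 when the pool has no defined maximum.
            long maxKb = u.getMax() < 0 ? -1 : u.getMax() / 1024;
            lines.add(String.format("%s [%s] used=%d KB committed=%d KB max=%d KB",
                    pool.getName(), pool.getType(),
                    u.getUsed() / 1024, u.getCommitted() / 1024, maxKb));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

The pool names printed depend on the collector in use, which is why a name-based lookup for "old" may or may not find a match.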
Artur Hefczyc commented 9 years ago

OK
Wojciech Kapcia (Tigase) commented 9 years ago
I've:

- added the underlying JVM heap memory metrics, as well as GC statistics, to Tigase statistics;

- refactored Windows/Panels and de-duplicated code in Monitor;

- added a new Panel with JVM memory details and an option to select the preferred display unit (KB/MB/GB);

- added a configuration option to control whether the graphs should be approximated or not.

One comment: by default this works best with a relatively detailed data resolution, because the values can change frequently (and Tigase uses 10s by default).
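
The display-unit option can be pictured with a small conversion helper like the one below. This is a hypothetical sketch, not the Monitor code: the `DisplayUnits` class, its `Unit` enum and the choice of 1024-based units are all assumptions.

```java
import java.util.Locale;

final class DisplayUnits {

    // Binary (1024-based) units; each constant stores its size in bytes.
    enum Unit {
        KB(1024L), MB(1024L * 1024), GB(1024L * 1024 * 1024);

        final long bytes;

        Unit(long bytes) {
            this.bytes = bytes;
        }
    }

    // Converts a byte count into the selected unit.
    static double convert(long valueInBytes, Unit unit) {
        return (double) valueInBytes / unit.bytes;
    }

    // Formats a byte count with two decimals and the unit name.
    static String format(long valueInBytes, Unit unit) {
        return String.format(Locale.ROOT, "%.2f %s", convert(valueInBytes, unit), unit);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        for (Unit unit : Unit.values()) {
            System.out.println(format(used, unit));
        }
    }
}
```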

Type	Bug
Priority	Major
Assignee	Artur Hefczyc
RedmineID	4003
Version	tigase-server-7.1.0
Spent time	0

Reference

tigase/_server/server-core#658