-
~~I do not think we should use
Optional<>
in getters due to performance reasons, see https://pkolaczk.github.io/overhead-of-optional/~~ I've double-checked their results, and if my tests are correct (with use of JMH) usage ofOptional<Element>
is around 400-500 us slower per 1000000 operations. (It looks like mentioned article had worse values not due to the usage ofOptional
but due to conversion oflong
toLong
and back).Usage of empty
List
would, performance-wise, be equal to returningnull
for methods returningList
, but it would allow us to "drop" checks if we received an instance of aList
ornull
. -
I think we should provide 2 forms of methods for getting a child element:
Element getChild(String name, String xmlns); Optional<Element> findChild(String name, String xmlns);
I would also leave existing versions of methods accepting as a parameter
Matcher
/Predicate
as it makes querying more flexible (ie. we can query using different attributes or even by testing if children contains particular element). -
Here are benchmark results for the tests which I've implemented:
Benchmark Mode Cnt Score Error Units ElementPerformanceTest.benchmarkFindChildAndClassic avgt 5 14,577 ± 1,681 us/op ElementPerformanceTest.benchmarkFindChildAndOptional avgt 5 13,954 ± 1,230 us/op ElementPerformanceTest.benchmarkFindChildArrayList avgt 5 18,353 ± 0,433 us/op ElementPerformanceTest.benchmarkFindChildLinkedList avgt 5 18,815 ± 0,662 us/op ElementPerformanceTest.benchmarkFindChildNew avgt 5 1,685 ± 0,016 us/op ElementPerformanceTest.benchmarkFindChildNewStatic avgt 5 1,423 ± 0,028 us/op ElementPerformanceTest.benchmarkFindChildNewX avgt 5 1,733 ± 0,037 us/op ElementPerformanceTest.benchmarkFindChildOldEmpty avgt 5 0,019 ± 0,001 us/op ElementPerformanceTest.benchmarkFindChildOldOptional avgt 5 3,294 ± 0,456 us/op ElementPerformanceTest.benchmarkFindChildOldStatic avgt 5 1,456 ± 0,069 us/op ElementPerformanceTest.benchmarkFlatMap avgt 5 0,095 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToList avgt 5 0,278 ± 0,038 us/op ElementPerformanceTest.benchmarkFlatMapStreamToList2 avgt 5 0,193 ± 0,017 us/op ElementPerformanceTest.benchmarkHashMapGet avgt 5 0,057 ± 0,001 us/op ElementPerformanceTest.benchmarkHashMapGetIdentity avgt 5 0,477 ± 0,007 us/op ElementPerformanceTest.benchmarkHashMapGetMemoryOptimized avgt 5 0,067 ± 0,002 us/op ElementPerformanceTest.benchmarkHashMapSet avgt 5 0,072 ± 0,003 us/op ElementPerformanceTest.benchmarkHashMapSetIdentity avgt 5 0,517 ± 0,019 us/op ElementPerformanceTest.benchmarkHashMapSetMemoryOptimized avgt 5 0,093 ± 0,005 us/op ElementPerformanceTest.benchmarkIndentityMapGet avgt 5 0,507 ± 0,013 us/op ElementPerformanceTest.benchmarkIndentityMapGetStatic avgt 5 0,047 ± 0,001 us/op ElementPerformanceTest.benchmarkIndentityMapSet avgt 5 0,524 ± 0,017 us/op
Following things we should consider:
- Use
ArrayList
instead ofLinkedList
as it should be smaller and faster String.intern()
is slow.- Usage of
HashMap
is faster for lookup of arbitrary strings, asIdentityHashMap
requires usage ofString.intern()
. When using static stringsHashMap
is 21% slower onget
but is still faster onput
about 86% (as we do not haveStaticString
equivalent). - Usage of
HashMap
will increase memory usage as attribute keys will not be deduplicated (unless we would use "optimization", to replace most commonly used attribute names with staticString
s). - Element parsing will be faster when
String.intern()
is not used. - Comparing element names using identity instead of equality is about 18% faster.
Looking at those results, I think that we may consider replacing
LinkedList
withArrayList
andIdentityHashMap
withHashMap
.In case of
Map
replacement, while we may slightly slow down "get" requests, we would for sure increase performance of "set" (which is done during parsing of elements, element instantiation, etc.). We may also consider usage of "optimized" version, to reduce memory overhead of this decision.In case of
Stream
, we should use them in "non-critical" path as it is slower, but if we are creating many collections before we get our result, then usage ofStream
may be faster..In case of
Optional
, we should use them in "non-critical" path as it is slower.While I would like to "simplify" our API and remove "StaticString" methods, I think we would have to keep them as removing them would slow down child elements lookup by 18% (however, it would reduce element creation penalty "String.intern()", but that is one time operation). Optionally, we could create a similar optimization to proposed "memory optimization" for known attribute names, or even check if manually created pool of queried element names would not be faster.
All tests are now in a separate branch
issue #14
.@wojtek What do you think?
- Use
-
Use ArrayList instead of LinkedList as it should be smaller and faster
+1
I'm somewhat split regarding
String.intern()
- it indeed is quite slow (https://stackoverflow.com/questions/10624232/performance-penalty-of-string-intern#10628759), quite possibly "optimised" HashMap would be better option here. Same goes with comparison (identity vs equality). Yes, pure comparison with identity is faster but we should consider lookups in the pool. And I think that those tables could be relatively big (element names and attribute names could be small but their values not so much, considering high variation in JIDs for example; and I memory serves me right I saw that those tables used quite some space after a while). Again - we would have to review the code and check all the places where the comparison is made.Regarding Optionals and usage . The problem with "using them in non-critical paths" would be non-enforceable thus it could happen that someone would use it without knowing the penalty (This applies to
Stream
methods as well). And even though with Value Classes this should be somewhat mitigated, it's still future. Alternatively we could use@Nullable
annotation (could be from jetbrains) - it could add hints in IDE and we could add checks during build (we already havespotbugs-maven-plugin
but it's not enabled by default). -
Same goes with comparison (identity vs equality). Yes, pure comparison with identity is faster but we should consider lookups in the pool. And I think that those tables could be relatively big (element names and attribute names could be small but their values not so much, considering high variation in JIDs for example; and I memory serves me right I saw that those tables used quite some space after a while). Again - we would have to review the code and check all the places where the comparison is made.
True, if we would "cache" all element names and attribute names, those tables would be huge. In recent Java hash tables from
String.intern()
were increased and in theory, it should speed up processing, however, looking at my results, suggests that this change was not so beneficial for us. Moreover, if we callString.intern()
on many strings, this will slow down instance lookup internally inString.intern()
implementation (see http://java-performance.info/string-intern-in-java-6-7-8/)My optimized approach was to actually cache only "attribute names" known and commonly used (I know, this may have some drawbacks).
Please note, that we do not call
String.intern()
for attribute values.Regarding Optionals and usage . The problem with "using them in non-critical paths" would be non-enforceable thus it could happen that someone would use it without knowing the penalty (This applies to Stream methods as well). And even though with Value Classes this should be somewhat mitigated, it's still future. Alternatively we could use @Nullable annotation (could be from jetbrains) - it could add hints in IDE and we could add checks during build (we already have spotbugs-maven-plugin but it's not enabled by default).
We can add
@Nullable
annotation to the method to get attribute value.Usage of
Optional
andStream
is clean and convenient, but may have pently, so we need to decided if we accept it, or not. -
I think we (once again) have same stance on
intern()
and it's drawbacks, i.e. I was agreeing with you that our optimisation (for names) should be better for us.Regarding performance penalties: I would avoid exposing
Stream
.Optional
has some penalty but as you stated earlier - it's not that huge and in future JVM version should be even lower (though, at that time handling ofnull
s themself could be different. IMHO from the performance point of view anoting the methods with@Nullable
would be best here - we can have compile-time checks (SpotBugs or PMD) to make sure that we do check fornull
, which would be virtually the same as the check enforced byOptional
but with no overhead. -
Regarding performance penalties: I would avoid exposing Stream. Optional has some penalty but as you stated earlier - it's not that huge and in future JVM version should be even lower (though, at that time handling of nulls themself could be different. IMHO from the performance point of view anoting the methods with @Nullable would be best here - we can have compile-time checks (SpotBugs or PMD) to make sure that we do check for null, which would be virtually the same as the check enforced by Optional but with no overhead.
Maybe, just maybe, we could introduce functions:
-
T mapChild(String name, String xmlns, Function<Element,T>)
This method would returnnull
but the mapping function would be called only if the element was found. -
void ifChildPresence(String name, String xmlns, Consumer<Element>)
This would allow us to implement "Optional-list" processing without the overhead of creation of the
Optional
instance.I could prepare some implementations, so we could evaluate the performance of such methods to compare with the manual retrieval of element and check for
null
. -
-
I've rerun tests once again as those very high results of
Optional
were suspecious. The same tests with-Xms1g -Xmx1g
generated in following results:Benchmark Mode Cnt Score Error Units ElementPerformanceTest.benchmarkFindChildAndClassic avgt 5 11,334 ± 0,642 us/op ElementPerformanceTest.benchmarkFindChildAndOptional avgt 5 11,712 ± 0,445 us/op ElementPerformanceTest.benchmarkFindChildArrayList avgt 5 17,932 ± 0,809 us/op ElementPerformanceTest.benchmarkFindChildLinkedList avgt 5 18,308 ± 0,472 us/op ElementPerformanceTest.benchmarkFindChildNew avgt 5 1,663 ± 0,021 us/op ElementPerformanceTest.benchmarkFindChildNewStatic avgt 5 1,407 ± 0,025 us/op ElementPerformanceTest.benchmarkFindChildNewX avgt 5 1,709 ± 0,087 us/op ElementPerformanceTest.benchmarkFindChildOldEmpty avgt 5 0,019 ± 0,001 us/op ElementPerformanceTest.benchmarkFindChildOldOptional avgt 5 2,471 ± 0,075 us/op ElementPerformanceTest.benchmarkFindChildOldStatic avgt 5 1,408 ± 0,013 us/op ElementPerformanceTest.benchmarkFlatMap avgt 5 0,079 ± 0,008 us/op ElementPerformanceTest.benchmarkFlatMapStreamToList avgt 5 0,201 ± 0,011 us/op ElementPerformanceTest.benchmarkFlatMapStreamToList2 avgt 5 0,143 ± 0,003 us/op ElementPerformanceTest.benchmarkHashMapGet avgt 5 0,055 ± 0,003 us/op ElementPerformanceTest.benchmarkHashMapGetIdentity avgt 5 0,470 ± 0,047 us/op ElementPerformanceTest.benchmarkHashMapGetMemoryOptimized avgt 5 0,066 ± 0,004 us/op ElementPerformanceTest.benchmarkHashMapSet avgt 5 0,087 ± 0,002 us/op ElementPerformanceTest.benchmarkHashMapSetIdentity avgt 5 0,549 ± 0,005 us/op ElementPerformanceTest.benchmarkHashMapSetMemoryOptimized avgt 5 0,111 ± 0,005 us/op ElementPerformanceTest.benchmarkIndentityMapGet avgt 5 0,482 ± 0,024 us/op ElementPerformanceTest.benchmarkIndentityMapGetStatic avgt 5 0,047 ± 0,001 us/op ElementPerformanceTest.benchmarkIndentityMapSet avgt 5 0,523 ± 0,017 us/op
As we can see,
Optional
has still very high score, but almost 800us smaller. I suppose that this was due to the fact that new object is being created and JMH caused GC during the test and with increased memory heap, it was not run or had less executions.Also
Stream
is a lot faster, still slower than iteration overList
but almost 28% faster. -
I've fixed following tests and rerun tests (static string was passed instead of new instance, which caused equlas() to return true a lot faster than it should):
ElementPerformanceTest.benchmarkFindChildNew avgt 5 4,166 ± 0,259 us/op ElementPerformanceTest.benchmarkFindChildNewX avgt 5 4,476 ± 0,094 us/op
-
Maybe, just maybe, we could introduce functions:
As commented yesterday - I'm somewhat sceptical towards such API in Element class.
I suppose that this was due to the fact that new object is being created and JMH caused GC during the test and with increased memory heap, it was not run or had less executions.
This is interesting point - maybe the GC wasn't run in the test (with increased heap size) but there was additional allocation which can be problematic during normal operation (i.e. triggering GC more often). AFAIR it should be possible to measure allocations in JMH tests - could we compare it?
Also Stream is a lot faster, still slower than iteration over List but almost 28% faster.
Wouldn't
Stream
simply be based on existing collection? Thus we could return simpleList
and if someone would need fancier processing of elements then it could call.stream()
on that collection? Moreover, collections could be re-used,Streams
not so much. -
Some more current results.
Benchmark Mode Cnt Score Error Units ElementPerformanceTest.benchmarkFindAndMapChild avgt 5 1,742 ± 0,103 us/op ElementPerformanceTest.benchmarkFindChildAndClassic avgt 5 11,337 ± 0,743 us/op ElementPerformanceTest.benchmarkFindChildAndOptional avgt 5 11,694 ± 0,358 us/op ElementPerformanceTest.benchmarkFindChildArrayList avgt 5 17,862 ± 0,316 us/op ElementPerformanceTest.benchmarkFindChildLinkedList avgt 5 18,486 ± 0,930 us/op ElementPerformanceTest.benchmarkFindChildNew avgt 5 4,078 ± 0,074 us/op ElementPerformanceTest.benchmarkFindChildNewStatic avgt 5 1,411 ± 0,012 us/op ElementPerformanceTest.benchmarkFindChildNewX avgt 5 4,432 ± 0,213 us/op ElementPerformanceTest.benchmarkFindChildOldEmpty avgt 5 2,485 ± 0,090 us/op ElementPerformanceTest.benchmarkFindChildOldOptional avgt 5 2,504 ± 0,176 us/op ElementPerformanceTest.benchmarkFindChildOldStatic avgt 5 1,418 ± 0,014 us/op ElementPerformanceTest.benchmarkFlatMap avgt 5 0,081 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapManual avgt 5 0,072 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray2 avgt 5 0,148 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapStreamToList2 avgt 5 0,153 ± 0,004 us/op ElementPerformanceTest.benchmarkHashMapGet avgt 5 0,054 ± 0,001 us/op ElementPerformanceTest.benchmarkHashMapGetIdentity avgt 5 0,500 ± 0,008 us/op ElementPerformanceTest.benchmarkHashMapGetMemoryOptimized avgt 5 0,065 ± 0,002 us/op ElementPerformanceTest.benchmarkHashMapSet avgt 5 0,087 ± 0,002 us/op ElementPerformanceTest.benchmarkHashMapSetIdentity avgt 5 0,551 ± 0,024 us/op ElementPerformanceTest.benchmarkHashMapSetMemoryOptimized avgt 5 0,112 ± 0,003 us/op ElementPerformanceTest.benchmarkIndentityMapGet avgt 5 0,480 ± 0,010 us/op ElementPerformanceTest.benchmarkIndentityMapGetStatic avgt 5 0,049 ± 0,002 us/op ElementPerformanceTest.benchmarkIndentityMapSet avgt 5 0,524 ± 0,016 us/op ElementPerformanceTest.benchmarkMapStreamToList avgt 5 0,097 ± 0,003 us/op
NOTES:
- Results for methods named
benchmarkHashMap*
,benchmarkIdentityMap*
are times of 6 executions of those methods for different attribute names onElement
class. - Results for methods named
benchmarkFlatMap
andbenchmarkMapStreamToList
are times of a single execution of those methods onElement
class. - Results for methods named
benchmarkFindChild*
are times of 1000 executions of those methods onElement
class.
During this iteration of tests, I've added
benchmarkFindAndMapChild
test (based onequality
and not oninstance equality
, which uses "manually implemented"Optional
-like return type allowing mapping of elements, attribute values, etc. The difference is, that it does not haveisPresent()
method as it doesn't catch state of the expression but evaluates it. Itsget
method returnsnull
normally (in the oposition toOptional
class from Java) and have (for now) less methods. This return type would be beneficial if we would use it forfindChild
orgetAttribute
as in some cases it would allow us to "map" returned value and it would provide "null checks" internally. The only "last" check on the result ofget
method would be required. That could simplify some code without overhead introduced byOptional
.Back to your questions:
This is interesting point - maybe the GC wasn't run in the test (with increased heap size) but there was additional allocation which can be problematic during normal operation (i.e. triggering GC more often). AFAIR it should be possible to measure allocations in JMH tests - could we compare it?
I think that there was no real issue with GC (small short lived objects should be removed fast), but there was no
-Xms
so I suppose that small amount of the memory was allocated and that forced JVM to resize HEAP - allocate more memory from the OS).As for allocations, most likely small objects may increase GC, but they are short lived so that should be fast. If you would look closer on my tests na implementation of
getChildren()
andstreamChildren()
you will find out thatgetChildren()
createsLinkedList
every time it is called. On the other handstreamChildren()
builds stream without collection (most likely there is something used internally).To be honest, looking at latest results, stream with
flatMap()
(benchmarkFlatMapStreamToList2
) is 2 times slower thatbenchmarkFlatMap
. Both are using collections flattening.On the other hand if you would look at
benchmarkMapStreamToList
, where we have a stream of children mapped withgetName()
to list ofString
s, you can see thatStream
performance withoutflatMap
is not so bad (I would say that it is rather good).Wouldn't Stream simply be based on existing collection? Thus we could return simple List and if someone would need fancier processing of elements then it could call .stream() on that collection? Moreover, collections could be re-used, Streams not so much.
Basing steam on the collection which needs to be created (or filtered) was slower in my tests. Manual creation of a
Stream
like instreamChildren()
method was faster.To sum up.
- We need to keep "StaticString" methods as those are around 3 times faster.
- We should replace
LinkedList
withArrayList
for performance and memory usage improvements. - I'm in favour of keeping
Element:streamChildren()
as it is a single method and it's impact is small to none and it is still faster thatElement.getChildren()::stream()
. - With
IdentityHashMap
would suggest replacing it withHashMap
and add this "simple" method with a switch replacing commonly used attribute names with their "static" references (not all of them, just some listed, like "to", "from", "id", "name", "version", "xmlns", "var", and maybe few others). That should improve performance (due to dropped calls toString.intern()
) with small penalty for againstgetAttributeStaticString()
but being far faster thangetAttribute()
. - I have mixed fillings about using
Optional
. On the one hand, having methods returningOptional
could be good, but could increase complexity ofElement
class and overuse of it could cause performance issues and people can "easily" wrap results of our methods withOptional.ofNullable()
. On the other hand this performance may be an issue. It is about 80% slower, but execution times are quite fast (just 2.5us for a single execution) comparing to ie.getChildren()
.
- Results for methods named
-
Thank you for running those tests. I made a couple of changes (from commit):
- Add possibility to configure Element's List implementation
- Remove test mode from configuration options (to allow overriding via annotation)
- Add parameters to test execution: number of children elements of Element; whether to use first children from list or random (0…max children)
- Rename certain test methods to make them easier to discern/understand
- Removed for-loops in favour of running more iterations with Throughput benchmark mode
Could you please do check those changes and comment on them?
I added case with more sub-children as this can happen as well, though I think that majority of the request (so when we do perform lookup would be mostly IQ queries so rather limited number of children and most of the time checking first of them; with forms being the exception here). On the other hand results can have bigger amount of children but in that case we would most likely simply produce Element like that and there would be very limited querying/using "find*" methods IMHO thus I'm not sure how focused we should be on results with bigger stanzas / lookup of random child (sub-)element.
With the above I run two tests ( comparing
.getChild()
and checking for null versus usingOptional.ofNullable
) and the results are as follows:Benchmark (clazz) (maxChild) (randomChild) Mode Cnt Score Error Units ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 1 false thrpt 5 236,333 ± 3,763 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 1 true thrpt 5 234,281 ± 2,581 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 5 false thrpt 5 234,640 ± 4,701 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 5 true thrpt 5 116,522 ± 65,283 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 10 false thrpt 5 235,529 ± 5,840 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 10 true thrpt 5 95,278 ± 68,433 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 20 false thrpt 5 236,338 ± 5,741 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 20 true thrpt 5 79,458 ± 28,718 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 1 false thrpt 5 165,172 ± 4,443 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 1 true thrpt 5 167,101 ± 3,130 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 5 false thrpt 5 169,708 ± 2,190 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 5 true thrpt 5 83,583 ± 41,249 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 10 false thrpt 5 165,741 ± 4,012 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 10 true thrpt 5 81,997 ± 72,196 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 20 false thrpt 5 167,691 ± 1,806 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 20 true thrpt 5 63,790 ± 20,075 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 1 false thrpt 5 224,966 ± 14,476 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 1 true thrpt 5 225,953 ± 7,391 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 5 false thrpt 5 224,014 ± 5,323 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 5 true thrpt 5 107,219 ± 37,451 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 10 false thrpt 5 225,062 ± 3,534 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 10 true thrpt 5 83,064 ± 41,024 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 20 false thrpt 5 224,379 ± 6,866 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 20 true thrpt 5 78,561 ± 63,378 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 1 false thrpt 5 165,315 ± 4,145 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 1 true thrpt 5 166,405 ± 2,049 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 5 false thrpt 5 164,328 ± 3,828 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 5 true thrpt 5 103,478 ± 72,707 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 10 false thrpt 5 165,087 ± 1,678 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 10 true thrpt 5 87,306 ± 53,350 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 20 false thrpt 5 163,638 ± 4,182 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 20 true thrpt 5 50,018 ± 43,976 ops/us
So, for this small subset of tests:
ArrayList
is fasterLinkedList
quite significantly (~45%), even for random elements and not only the first one- checking for nullness vs using
Optional
gives the former slight advantage (~5%)
I'm going to run tests for all methods later on (don't want to use the machine while it runs to avoid interruptions).
I think that there was no real issue with GC (small short lived objects should be removed fast), but there was no -Xms so I suppose that small amount of the memory was allocated and that forced JVM to resize HEAP - allocate more memory from the OS). As for allocations, most likely small objects may increase GC, but they are short lived so that should be fast. If you would look closer on my tests na implementation of getChildren() and streamChildren() you will find out that getChildren() creates LinkedList every time it is called. On the other hand streamChildren() builds stream without collection (most likely there is something used internally).
Yes, those should be removed fast but still this is additional allocation and in the current JVM version (without Value Classes) I think we should keep it in mind considering Element is the most fundamental piece of Tigase. (Yes, I'm slightly leaning towards returning
null
and using@Nullable
annotation)To be honest, looking at latest results, stream with flatMap() (benchmarkFlatMapStreamToList2) is 2 times slower that benchmarkFlatMap. Both are using collections flattening. On the other hand if you would look at benchmarkMapStreamToList , where we have a stream of children mapped with getName() to list of Strings, you can see that Stream performance without flatMap is not so bad (I would say that it is rather good).
To be honest I checked results from the table you provided and could confirm your observations on that data… though, while looking at the code I wasn't sure we are making exactly the same comparison in all cases (it would be better to group cases that should give same outcome but do it differently somehow for more clarity - maybe different classes?)
Regarding "finding child" use case, here is the extract for the case with Element with 20 children sub-elements backed by ArrayList and returning first child (this is throughput so higher=better):
ElementPerformanceTest.benchmarkFindChildEquality ArrayList 20 false thrpt 5 536,225 ± 13,491 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 20 false thrpt 5 133,136 ± 1,619 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 20 false thrpt 5 132,562 ± 2,474 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 20 false thrpt 5 137,221 ± 3,510 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 20 false thrpt 5 563,661 ± 9,735 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 20 false thrpt 5 562,216 ± 18,203 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 20 false thrpt 5 553,154 ± 7,123 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 20 false thrpt 5 5,723 ± 0,883 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 20 false thrpt 5 10,629 ± 0,091 ops/us
What surprised me was poor result for
StaticStr
and findChild with predicate based on identity check (which is counterintuitive)Wouldn't Stream simply be based on existing collection? Thus we could return simple List and if someone would need fancier processing of elements then it could call .stream() on that collection? Moreover, collections could be re-used, Streams not so much.
Basing steam on the collection which needs to be created (or filtered) was slower in my tests. Manual creation of a Stream like in streamChildren() method was faster.
To sum up.
We need to keep "StaticString" methods as those are around 3 times faster.
I assume you mean methods that favours identity (references) comparison instead of equality?
We should replace LinkedList with ArrayList for performance and memory usage improvements.
Agreed completely from what I saw.
I'm in favour of keeping Element:streamChildren() as it is a single method and it's impact is small to none and it is still faster that Element.getChildren()::stream().
Will do more testing (I think that with Elements with more children Stream methods could be slower but as I said - more testing needed).
Though, I think we should avoid giving to many choices and simply provide user with "single best method" to avoid someone using something "convenient" that could later on become a bottleneck. Of course more convenience methods are OK while they don't degrade performance significantly.
With IdentityHashMap would suggest replacing it with HashMap and add this "simple" method with a switch replacing commonly used attribute names with their "static" references (not all of them, just some listed, like "to", "from", "id", "name", "version", "xmlns", "var", and maybe few others). That should improve performance (due to dropped calls to String.intern()) with small penalty for against getAttributeStaticString() but being far faster than getAttribute().
This seems ok.
I have mixed fillings about using Optional. On the one hand, having methods returning Optional could be good, but could increase complexity of Element class and overuse of it could cause performance issues and people can "easily" wrap results of our methods with Optional.ofNullable(). On the other hand this performance may be an issue. It is about 80% slower, but execution times are quite fast (just 2.5us for a single execution) comparing to ie. getChildren().
Same conclusion.
Complete result from the current set of tests:
Benchmark (clazz) (maxChild) (randomChild) Mode Cnt Score Error Units ElementPerformanceTest.benchmarkFindChildEquality ArrayList 1 false thrpt 5 539,085 ± 12,827 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 1 true thrpt 5 540,144 ± 3,835 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 5 false thrpt 5 535,902 ± 17,290 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 5 true thrpt 5 325,764 ± 432,496 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 10 false thrpt 5 539,969 ± 10,823 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 10 true thrpt 5 229,710 ± 573,576 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 20 false thrpt 5 536,225 ± 13,491 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 20 true thrpt 5 125,397 ± 174,318 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 1 false thrpt 5 523,468 ± 5,786 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 1 true thrpt 5 518,944 ± 21,567 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 5 false thrpt 5 534,196 ± 5,688 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 5 true thrpt 5 256,962 ± 274,198 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 10 false thrpt 5 537,580 ± 14,030 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 10 true thrpt 5 154,408 ± 348,009 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 20 false thrpt 5 534,719 ± 4,177 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 20 true thrpt 5 78,444 ± 97,766 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 1 false thrpt 5 747,264 ± 21,063 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 1 true thrpt 5 749,625 ± 13,905 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 5 false thrpt 5 328,784 ± 2,578 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 5 true thrpt 5 330,592 ± 5,394 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 10 false thrpt 5 218,086 ± 2,793 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 10 true thrpt 5 218,747 ± 4,519 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 20 false thrpt 5 133,136 ± 1,619 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 20 true thrpt 5 132,662 ± 0,402 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 1 false thrpt 5 871,265 ± 11,243 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 1 true thrpt 5 871,231 ± 7,048 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 5 false thrpt 5 324,462 ± 2,666 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 5 true thrpt 5 324,440 ± 3,570 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 10 false thrpt 5 162,820 ± 2,243 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 10 true thrpt 5 162,260 ± 2,747 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 20 false thrpt 5 76,111 ± 1,141 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 20 true thrpt 5 75,369 ± 0,667 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 1 false thrpt 5 680,177 ± 6,690 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 1 true thrpt 5 724,642 ± 20,068 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 5 false thrpt 5 332,793 ± 8,871 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 5 true thrpt 5 332,448 ± 11,752 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 10 false thrpt 5 216,507 ± 3,911 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 10 true thrpt 5 216,765 ± 2,027 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 20 false thrpt 5 132,562 ± 2,474 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 20 true thrpt 5 132,102 ± 0,899 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 1 false thrpt 5 762,051 ± 13,793 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 1 true thrpt 5 814,414 ± 10,604 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 5 false thrpt 5 340,997 ± 5,704 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 5 true thrpt 5 322,356 ± 10,800 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 10 false thrpt 5 163,174 ± 4,264 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 10 true thrpt 5 164,148 ± 4,126 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 20 false thrpt 5 75,072 ± 0,951 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 20 true thrpt 5 75,202 ± 0,373 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 1 false thrpt 5 829,788 ± 2,980 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 1 true thrpt 5 839,544 ± 19,739 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 5 false thrpt 5 329,963 ± 9,489 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 5 true thrpt 5 329,263 ± 1,186 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 10 false thrpt 5 215,261 ± 0,931 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 10 true thrpt 5 216,895 ± 4,538 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 20 false thrpt 5 137,221 ± 3,510 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 20 true thrpt 5 137,187 ± 2,423 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 1 false thrpt 5 862,107 ± 26,565 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 1 true thrpt 5 861,248 ± 12,594 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 5 false thrpt 5 347,114 ± 7,143 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 5 true thrpt 5 346,358 ± 10,022 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 10 false thrpt 5 169,617 ± 1,709 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 10 true thrpt 5 169,996 ± 4,278 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 20 false thrpt 5 79,374 ± 2,345 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 20 true thrpt 5 76,332 ± 0,584 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 1 false thrpt 5 549,403 ± 17,548 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 1 true thrpt 5 550,733 ± 12,833 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 5 false thrpt 5 558,461 ± 10,447 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 5 true thrpt 5 251,083 ± 325,713 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 10 false thrpt 5 557,759 ± 16,369 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 10 true thrpt 5 165,897 ± 190,680 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 20 false thrpt 5 563,661 ± 9,735 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 20 true thrpt 5 175,458 ± 270,762 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 1 false thrpt 5 543,895 ± 11,502 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 1 true thrpt 5 546,446 ± 5,638 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 5 false thrpt 5 557,672 ± 8,784 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 5 true thrpt 5 271,937 ± 373,001 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 10 false thrpt 5 561,501 ± 18,048 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 10 true thrpt 5 233,095 ± 412,933 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 20 false thrpt 5 561,397 ± 14,286 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 20 true thrpt 5 108,231 ± 188,377 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 1 false thrpt 5 558,711 ± 11,616 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 1 true thrpt 5 554,554 ± 7,128 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 5 false thrpt 5 562,505 ± 10,916 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 5 true thrpt 5 220,350 ± 203,325 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 10 false thrpt 5 560,544 ± 7,321 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 10 true thrpt 5 158,713 ± 134,166 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 20 false thrpt 5 562,216 ± 18,203 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 20 true thrpt 5 128,479 ± 178,694 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 1 false thrpt 5 540,242 ± 9,092 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 1 true thrpt 5 540,163 ± 15,128 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 5 false thrpt 5 559,354 ± 7,952 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 5 true thrpt 5 240,311 ± 294,342 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 10 false thrpt 5 551,215 ± 7,223 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 10 true thrpt 5 162,254 ± 91,732 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 20 false thrpt 5 565,501 ± 5,429 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 20 true thrpt 5 88,613 ± 90,993 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 1 false thrpt 5 559,473 ± 10,822 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 1 true thrpt 5 559,713 ± 34,239 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 5 false thrpt 5 554,958 ± 16,232 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 5 true thrpt 5 247,863 ± 236,474 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 10 false thrpt 5 554,524 ± 31,806 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 10 true thrpt 5 206,893 ± 268,016 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 20 false thrpt 5 553,154 ± 7,123 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 20 true thrpt 5 103,946 ± 73,610 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 1 false thrpt 5 544,949 ± 9,985 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 1 true thrpt 5 537,616 ± 19,101 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 5 false thrpt 5 555,715 ± 30,640 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 5 true thrpt 5 146,804 ± 121,396 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 10 false thrpt 5 552,609 ± 18,380 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 10 true thrpt 5 88,238 ± 99,413 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 20 false thrpt 5 563,031 ± 22,835 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 20 true thrpt 5 92,659 ± 150,276 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 1 false thrpt 5 26,496 ± 5,228 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 1 true thrpt 5 28,164 ± 15,615 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 5 false thrpt 5 13,133 ± 6,236 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 5 true thrpt 5 14,722 ± 1,615 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 10 false thrpt 5 9,755 ± 1,544 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 10 true thrpt 5 10,068 ± 2,511 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 20 false thrpt 5 5,723 ± 0,883 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 20 true thrpt 5 4,611 ± 1,429 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 1 false thrpt 5 25,445 ± 9,211 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 1 true thrpt 5 27,543 ± 10,269 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 5 false thrpt 5 14,769 ± 4,796 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 5 true thrpt 5 13,981 ± 2,563 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 10 false thrpt 5 9,699 ± 6,985 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 10 true thrpt 5 9,998 ± 5,641 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 20 false thrpt 5 6,045 ± 1,903 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 20 true thrpt 5 6,180 ± 2,457 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 1 false thrpt 5 95,110 ± 1,808 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 1 true thrpt 5 96,115 ± 1,684 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 5 false thrpt 5 40,001 ± 0,567 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 5 true thrpt 5 42,873 ± 0,697 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 10 false thrpt 5 31,145 ± 0,345 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 10 true thrpt 5 31,216 ± 0,493 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 20 false thrpt 5 10,629 ± 0,091 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 20 true thrpt 5 10,617 ± 0,226 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 1 false thrpt 5 93,644 ± 1,316 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 1 true thrpt 5 94,876 ± 1,754 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 5 false thrpt 5 40,840 ± 0,641 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 5 true thrpt 5 41,068 ± 1,384 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 10 false thrpt 5 28,466 ± 0,274 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 10 true thrpt 5 28,724 ± 0,299 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 20 false thrpt 5 10,396 ± 0,202 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 20 true thrpt 5 11,179 ± 0,176 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 1 false thrpt 5 234,901 ± 8,647 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 1 true thrpt 5 235,774 ± 5,994 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 5 false thrpt 5 236,102 ± 5,545 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 5 true thrpt 5 108,319 ± 40,833 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 10 false thrpt 5 235,172 ± 6,003 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 10 true thrpt 5 104,831 ± 75,528 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 20 false thrpt 5 236,018 ± 4,367 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 20 true thrpt 5 89,975 ± 65,232 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 1 false thrpt 5 165,081 ± 4,258 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 1 true thrpt 5 165,097 ± 10,391 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 5 false thrpt 5 166,439 ± 3,500 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 5 true thrpt 5 85,197 ± 87,759 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 10 false thrpt 5 169,921 ± 3,563 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 10 true thrpt 5 88,654 ± 101,693 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 20 false thrpt 5 169,330 ± 1,605 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 20 true thrpt 5 55,874 ± 49,351 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 1 false thrpt 5 225,389 ± 6,782 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 1 true thrpt 5 224,127 ± 5,849 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 5 false thrpt 5 225,726 ± 4,396 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 5 true thrpt 5 118,939 ± 35,815 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 10 false thrpt 5 219,142 ± 11,629 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 10 true thrpt 5 104,793 ± 30,632 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 20 false thrpt 5 224,347 ± 10,909 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 20 true thrpt 5 86,632 ± 78,354 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 1 false thrpt 5 166,167 ± 3,320 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 1 true thrpt 5 169,203 ± 2,698 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 5 false thrpt 5 165,869 ± 3,918 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 5 true thrpt 5 97,427 ± 66,403 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 10 false thrpt 5 165,894 ± 2,915 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 10 true thrpt 5 68,796 ± 15,427 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 20 false thrpt 5 164,723 ± 4,912 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 20 true thrpt 5 61,574 ± 99,175 ops/us ElementPerformanceTest.benchmarkFlatMap ArrayList 1 false avgt 5 0,024 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 1 true avgt 5 0,024 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 5 false avgt 5 0,101 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 5 true avgt 5 0,100 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 10 false avgt 5 0,216 ± 0,012 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 10 true avgt 5 0,234 ± 0,019 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 20 false avgt 5 0,416 ± 0,011 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 20 true avgt 5 0,409 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 1 false avgt 5 0,025 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 1 true avgt 5 0,025 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 5 false avgt 5 0,141 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 5 true avgt 5 0,171 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 10 false avgt 5 0,266 ± 0,009 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 10 true avgt 5 0,272 ± 0,012 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 20 false avgt 5 0,468 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 20 true avgt 5 0,479 ± 0,009 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 1 false avgt 5 0,030 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 1 true avgt 5 0,031 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 5 false avgt 5 0,126 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 5 true avgt 5 0,125 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 10 false avgt 5 0,247 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 10 true avgt 5 0,247 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 20 false avgt 5 0,493 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 20 true avgt 5 0,494 ± 0,013 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 1 false avgt 5 0,035 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 1 true avgt 5 0,035 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 5 false avgt 5 0,152 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 5 true avgt 5 0,148 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 10 false avgt 5 0,281 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 10 true avgt 5 0,285 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 20 false avgt 5 0,570 ± 0,009 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 20 true avgt 5 0,568 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 1 false avgt 5 0,140 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 1 true avgt 5 0,144 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 5 false avgt 5 0,397 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 5 true avgt 5 0,404 ± 0,016 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 10 false avgt 5 0,715 ± 0,013 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 10 true avgt 5 0,714 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 20 false avgt 5 1,520 ± 0,041 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 20 true avgt 5 1,454 ± 0,030 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 1 false avgt 5 0,142 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 1 true avgt 5 0,138 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 5 false avgt 5 0,375 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 5 true avgt 5 0,399 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 10 false avgt 5 0,654 ± 0,027 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 10 true avgt 5 0,698 ± 0,010 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 20 false avgt 5 1,323 ± 0,014 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 20 true avgt 5 1,255 ± 0,035 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 1 false avgt 5 0,059 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 1 true avgt 5 0,052 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 5 false avgt 5 0,207 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 5 true avgt 5 0,202 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 10 false avgt 5 0,226 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 10 true avgt 5 0,226 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 20 false avgt 5 0,589 ± 0,011 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 20 true avgt 5 0,594 ± 0,009 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 1 false avgt 5 0,058 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 1 true avgt 5 0,053 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 5 false avgt 5 0,136 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 5 true avgt 5 0,136 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 10 false avgt 5 0,227 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 10 true avgt 5 0,407 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 20 false avgt 5 0,626 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 20 true avgt 5 0,625 ± 0,019 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 1 false avgt 5 0,086 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 1 true avgt 5 0,086 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 5 false avgt 5 0,259 ± 0,013 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 5 true avgt 5 0,258 ± 0,010 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 10 false avgt 5 0,440 ± 0,012 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 10 true avgt 5 0,433 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 20 false avgt 5 0,639 ± 0,026 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 20 true avgt 5 0,646 ± 0,016 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 1 false avgt 5 0,089 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 1 true avgt 5 0,090 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 5 false avgt 5 0,235 ± 0,008 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 5 true avgt 5 0,236 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 10 false avgt 5 0,402 ± 0,012 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 10 true avgt 5 0,400 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 20 false avgt 5 0,637 ± 0,016 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 20 true avgt 5 0,638 ± 0,023 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 1 false avgt 5 0,060 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 1 true avgt 5 0,058 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 5 false avgt 5 0,134 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 5 true avgt 5 0,135 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 10 false avgt 5 0,384 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 10 true avgt 5 0,220 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 20 false avgt 5 0,565 ± 0,008 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 20 true avgt 5 0,548 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 1 false avgt 5 0,060 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 1 true avgt 5 0,062 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 5 false avgt 5 0,135 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 5 true avgt 5 0,233 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 10 false avgt 5 0,224 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 10 true avgt 5 0,222 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 20 false avgt 5 0,609 ± 0,014 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 20 true avgt 5 0,512 ± 0,008 us/op ElementPerformanceTest.benchmarkHashMapGet N/A N/A N/A avgt 5 0,031 ± 0,001 us/op ElementPerformanceTest.benchmarkHashMapGetIdentity N/A N/A N/A avgt 5 0,666 ± 0,007 us/op ElementPerformanceTest.benchmarkHashMapGetMemoryOptimized N/A N/A N/A avgt 5 0,054 ± 0,001 us/op ElementPerformanceTest.benchmarkHashMapSet N/A N/A N/A avgt 5 0,019 ± 0,001 us/op ElementPerformanceTest.benchmarkHashMapSetIdentity N/A N/A N/A avgt 5 0,662 ± 0,004 us/op ElementPerformanceTest.benchmarkHashMapSetMemoryOptimized N/A N/A N/A avgt 5 0,044 ± 0,001 us/op ElementPerformanceTest.benchmarkIndentityMapGet N/A N/A N/A avgt 5 0,647 ± 0,014 us/op ElementPerformanceTest.benchmarkIndentityMapGetStatic N/A N/A N/A avgt 5 0,032 ± 0,001 us/op ElementPerformanceTest.benchmarkIndentityMapSet N/A N/A N/A avgt 5 0,658 ± 0,007 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 1 false avgt 5 0,048 ± 0,002 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 1 true avgt 5 0,047 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 5 false avgt 5 0,088 ± 0,002 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 5 true avgt 5 0,087 ± 0,003 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 10 false avgt 5 0,117 ± 0,005 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 10 true avgt 5 0,114 ± 0,012 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 20 false avgt 5 0,213 ± 0,003 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 20 true avgt 5 0,212 ± 0,012 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 1 false avgt 5 0,048 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 1 true avgt 5 0,048 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 5 false avgt 5 0,084 ± 0,003 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 5 true avgt 5 0,083 ± 0,002 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 10 false avgt 5 0,123 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 10 true avgt 5 0,123 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 20 false avgt 5 0,266 ± 0,004 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 20 true avgt 5 0,267 ± 0,013 us/op
-
I've rerun tests on my device (just for confirmation of the results):
Benchmark (clazz) (maxChild) (randomChild) Mode Cnt Score Error Units ElementPerformanceTest.benchmarkFindChildEquality ArrayList 1 false thrpt 5 423,188 ± 10,806 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 1 true thrpt 5 423,261 ± 3,969 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 5 false thrpt 5 417,632 ± 35,821 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 5 true thrpt 5 154,873 ± 80,264 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 10 false thrpt 5 420,095 ± 13,134 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 10 true thrpt 5 86,880 ± 57,611 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 20 false thrpt 5 417,735 ± 7,310 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 20 true thrpt 5 91,805 ± 45,995 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 1 false thrpt 5 370,248 ± 3,082 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 1 true thrpt 5 369,220 ± 14,114 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 5 false thrpt 5 375,413 ± 18,409 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 5 true thrpt 5 231,619 ± 251,507 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 10 false thrpt 5 373,015 ± 42,821 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 10 true thrpt 5 87,188 ± 68,971 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 20 false thrpt 5 380,034 ± 7,764 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 20 true thrpt 5 120,767 ± 442,771 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 1 false thrpt 5 1120,717 ± 10,730 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 1 true thrpt 5 1121,619 ± 4,722 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 5 false thrpt 5 1122,085 ± 9,664 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 5 true thrpt 5 441,373 ± 356,780 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 10 false thrpt 5 1118,401 ± 10,219 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 10 true thrpt 5 387,174 ± 452,062 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 20 false thrpt 5 1113,951 ± 20,243 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr ArrayList 20 true thrpt 5 262,479 ± 233,681 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 1 false thrpt 5 1059,517 ± 29,829 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 1 true thrpt 5 1054,566 ± 18,752 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 5 false thrpt 5 1008,539 ± 13,921 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 5 true thrpt 5 537,933 ± 751,115 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 10 false thrpt 5 1009,320 ± 11,129 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 10 true thrpt 5 457,063 ± 875,516 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 20 false thrpt 5 995,362 ± 44,941 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStr LinkedList 20 true thrpt 5 168,266 ± 275,345 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 1 false thrpt 5 1113,747 ± 18,743 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 1 true thrpt 5 1100,001 ± 26,638 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 5 false thrpt 5 1104,330 ± 101,606 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 5 true thrpt 5 454,811 ± 367,371 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 10 false thrpt 5 1117,008 ± 4,354 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 10 true thrpt 5 423,192 ± 336,249 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 20 false thrpt 5 1100,559 ± 91,876 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional ArrayList 20 true thrpt 5 216,928 ± 252,530 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 1 false thrpt 5 1057,063 ± 13,297 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 1 true thrpt 5 1058,902 ± 15,367 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 5 false thrpt 5 1006,604 ± 10,338 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 5 true thrpt 5 522,616 ± 551,944 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 10 false thrpt 5 999,878 ± 21,332 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 10 true thrpt 5 286,018 ± 153,200 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 20 false thrpt 5 1002,240 ± 34,059 ops/us ElementPerformanceTest.benchmarkFindChildGetChildStaticStrOptional LinkedList 20 true thrpt 5 203,783 ± 318,287 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 1 false thrpt 5 478,378 ± 7,768 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 1 true thrpt 5 476,888 ± 9,098 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 5 false thrpt 5 282,199 ± 5,693 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 5 true thrpt 5 256,032 ± 27,086 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 10 false thrpt 5 182,493 ± 10,158 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 10 true thrpt 5 184,106 ± 4,837 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 20 false thrpt 5 104,716 ± 7,860 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 20 true thrpt 5 104,527 ± 1,885 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 1 false thrpt 5 529,952 ± 38,298 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 1 true thrpt 5 535,282 ± 8,899 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 5 false thrpt 5 292,701 ± 1,319 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 5 true thrpt 5 291,346 ± 16,194 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 10 false thrpt 5 191,617 ± 13,623 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 10 true thrpt 5 191,870 ± 4,746 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 20 false thrpt 5 91,443 ± 5,410 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 20 true thrpt 5 91,468 ± 3,164 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 1 false thrpt 5 441,789 ± 34,051 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 1 true thrpt 5 449,293 ± 6,070 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 5 false thrpt 5 445,836 ± 34,837 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 5 true thrpt 5 202,783 ± 190,396 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 10 false thrpt 5 448,042 ± 9,499 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 10 true thrpt 5 123,082 ± 161,207 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 20 false thrpt 5 446,923 ± 12,915 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher ArrayList 20 true thrpt 5 65,456 ± 44,718 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 1 false thrpt 5 375,530 ± 23,980 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 1 true thrpt 5 379,721 ± 4,949 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 5 false thrpt 5 396,859 ± 5,893 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 5 true thrpt 5 170,482 ± 236,733 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 10 false thrpt 5 387,271 ± 44,871 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 10 true thrpt 5 142,712 ± 264,626 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 20 false thrpt 5 392,170 ± 29,970 ops/us ElementPerformanceTest.benchmarkFindChildNewMatcher LinkedList 20 true thrpt 5 86,316 ± 118,652 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 1 false thrpt 5 367,263 ± 25,725 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 1 true thrpt 5 368,246 ± 7,986 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 5 false thrpt 5 367,446 ± 29,559 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 5 true thrpt 5 136,297 ± 134,230 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 10 false thrpt 5 369,060 ± 25,805 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 10 true thrpt 5 95,418 ± 43,454 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 20 false thrpt 5 365,791 ± 42,911 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName ArrayList 20 true thrpt 5 83,119 ± 35,294 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 1 false thrpt 5 345,709 ± 12,167 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 1 true thrpt 5 343,125 ± 6,945 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 5 false thrpt 5 365,058 ± 8,170 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 5 true thrpt 5 153,322 ± 205,752 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 10 false thrpt 5 366,896 ± 2,651 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 10 true thrpt 5 69,944 ± 13,082 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 20 false thrpt 5 366,292 ± 2,650 ops/us ElementPerformanceTest.benchmarkFindChildNullableAndGetName LinkedList 20 true thrpt 5 69,931 ± 107,890 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 1 false thrpt 5 365,005 ± 4,115 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 1 true thrpt 5 365,087 ± 6,733 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 5 false thrpt 5 366,408 ± 3,523 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 5 true thrpt 5 107,241 ± 87,631 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 10 false thrpt 5 365,261 ± 7,046 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 10 true thrpt 5 117,375 ± 71,044 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 20 false thrpt 5 365,732 ± 5,985 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName ArrayList 20 true thrpt 5 87,111 ± 118,842 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 1 false thrpt 5 344,761 ± 8,656 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 1 true thrpt 5 346,667 ± 5,627 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 5 false thrpt 5 365,892 ± 4,214 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 5 true thrpt 5 152,747 ± 147,831 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 10 false thrpt 5 365,291 ± 5,595 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 10 true thrpt 5 85,660 ± 66,327 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 20 false thrpt 5 366,091 ± 7,036 ops/us ElementPerformanceTest.benchmarkFindChildOptionalMapperAndGetName LinkedList 20 true thrpt 5 69,969 ± 75,417 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 1 false thrpt 5 23,497 ± 7,405 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 1 true thrpt 5 22,890 ± 10,142 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 5 false thrpt 5 13,440 ± 3,780 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 5 true thrpt 5 14,397 ± 5,174 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 10 false thrpt 5 8,761 ± 3,489 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 10 true thrpt 5 8,735 ± 3,663 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 20 false thrpt 5 5,324 ± 2,596 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity ArrayList 20 true thrpt 5 5,194 ± 2,759 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 1 false thrpt 5 22,971 ± 3,217 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 1 true thrpt 5 22,484 ± 3,902 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 5 false thrpt 5 13,323 ± 3,847 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 5 true thrpt 5 13,171 ± 2,877 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 10 false thrpt 5 9,126 ± 3,925 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 10 true thrpt 5 8,644 ± 3,996 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 20 false thrpt 5 5,115 ± 2,124 ops/us ElementPerformanceTest.benchmarkFindChildStreamDirectIdentity LinkedList 20 true thrpt 5 5,131 ± 1,933 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 1 false thrpt 5 73,515 ± 3,224 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 1 true thrpt 5 71,614 ± 13,624 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 5 false thrpt 5 32,084 ± 0,641 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 5 true thrpt 5 30,543 ± 1,601 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 10 false thrpt 5 22,185 ± 0,507 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 10 true thrpt 5 21,383 ± 0,890 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 20 false thrpt 5 9,211 ± 0,421 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity ArrayList 20 true thrpt 5 8,646 ± 0,081 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 1 false thrpt 5 71,451 ± 2,204 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 1 true thrpt 5 69,768 ± 13,318 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 5 false thrpt 5 30,896 ± 1,636 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 5 true thrpt 5 30,616 ± 0,720 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 10 false thrpt 5 21,131 ± 0,658 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 10 true thrpt 5 21,290 ± 0,552 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 20 false thrpt 5 9,016 ± 1,209 ops/us ElementPerformanceTest.benchmarkFindChildStreamIdentity LinkedList 20 true thrpt 5 8,820 ± 1,062 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 1 false thrpt 5 151,586 ± 3,986 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 1 true thrpt 5 150,688 ± 1,427 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 5 false thrpt 5 148,141 ± 26,464 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 5 true thrpt 5 104,481 ± 66,781 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 10 false thrpt 5 150,837 ± 3,175 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 10 true thrpt 5 79,177 ± 50,146 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 20 false thrpt 5 154,036 ± 3,832 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison ArrayList 20 true thrpt 5 56,121 ± 44,353 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 1 false thrpt 5 130,458 ± 23,240 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 1 true thrpt 5 129,622 ± 6,216 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 5 false thrpt 5 119,111 ± 11,161 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 5 true thrpt 5 89,627 ± 53,418 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 10 false thrpt 5 133,007 ± 5,347 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 10 true thrpt 5 75,890 ± 39,071 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 20 false thrpt 5 132,784 ± 1,293 ops/us ElementPerformanceTest.benchmarkGetChildAndNullComparison LinkedList 20 true thrpt 5 44,435 ± 21,761 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 1 false thrpt 5 168,183 ± 4,521 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 1 true thrpt 5 171,855 ± 4,203 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 5 false thrpt 5 167,970 ± 19,891 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 5 true thrpt 5 89,849 ± 28,360 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 10 false thrpt 5 172,189 ± 14,855 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 10 true thrpt 5 77,608 ± 51,003 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 20 false thrpt 5 174,644 ± 15,135 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional ArrayList 20 true thrpt 5 69,575 ± 90,217 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 1 false thrpt 5 143,939 ± 1,594 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 1 true thrpt 5 141,766 ± 3,514 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 5 false thrpt 5 140,080 ± 3,225 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 5 true thrpt 5 114,170 ± 63,605 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 10 false thrpt 5 142,609 ± 3,737 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 10 true thrpt 5 78,885 ± 64,933 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 20 false thrpt 5 139,749 ± 5,379 ops/us ElementPerformanceTest.benchmarkGetChildAndOptional LinkedList 20 true thrpt 5 64,690 ± 51,087 ops/us ElementPerformanceTest.benchmarkFlatMap ArrayList 1 false avgt 5 0,033 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 1 true avgt 5 0,033 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 5 false avgt 5 0,122 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 5 true avgt 5 0,124 ± 0,024 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 10 false avgt 5 0,236 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 10 true avgt 5 0,238 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 20 false avgt 5 0,480 ± 0,008 us/op ElementPerformanceTest.benchmarkFlatMap ArrayList 20 true avgt 5 0,493 ± 0,020 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 1 false avgt 5 0,033 ± 0,010 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 1 true avgt 5 0,031 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 5 false avgt 5 0,133 ± 0,019 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 5 true avgt 5 0,130 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 10 false avgt 5 0,256 ± 0,025 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 10 true avgt 5 0,237 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 20 false avgt 5 0,497 ± 0,052 us/op ElementPerformanceTest.benchmarkFlatMap LinkedList 20 true avgt 5 0,503 ± 0,079 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 1 false avgt 5 0,028 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 1 true avgt 5 0,029 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 5 false avgt 5 0,114 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 5 true avgt 5 0,117 ± 0,009 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 10 false avgt 5 0,224 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 10 true avgt 5 0,223 ± 0,022 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 20 false avgt 5 0,447 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapManual ArrayList 20 true avgt 5 0,448 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 1 false avgt 5 0,029 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 1 true avgt 5 0,030 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 5 false avgt 5 0,115 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 5 true avgt 5 0,115 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 10 false avgt 5 0,220 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 10 true avgt 5 0,220 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 20 false avgt 5 0,435 ± 0,008 us/op ElementPerformanceTest.benchmarkFlatMapManual LinkedList 20 true avgt 5 0,448 ± 0,085 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 1 false avgt 5 0,186 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 1 true avgt 5 0,187 ± 0,033 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 5 false avgt 5 0,529 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 5 true avgt 5 0,532 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 10 false avgt 5 0,933 ± 0,020 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 10 true avgt 5 0,972 ± 0,211 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 20 false avgt 5 1,859 ± 0,040 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect ArrayList 20 true avgt 5 1,852 ± 0,111 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 1 false avgt 5 0,186 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 1 true avgt 5 0,186 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 5 false avgt 5 0,521 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 5 true avgt 5 0,541 ± 0,009 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 10 false avgt 5 0,935 ± 0,018 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 10 true avgt 5 0,959 ± 0,018 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 20 false avgt 5 1,903 ± 0,131 us/op ElementPerformanceTest.benchmarkFlatMapStreamDirectToListStreamChildrenDirect LinkedList 20 true avgt 5 1,827 ± 0,017 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 1 false avgt 5 0,072 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 1 true avgt 5 0,073 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 5 false avgt 5 0,191 ± 0,013 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 5 true avgt 5 0,191 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 10 false avgt 5 0,336 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 10 true avgt 5 0,336 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 20 false avgt 5 0,680 ± 0,128 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray ArrayList 20 true avgt 5 0,688 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 1 false avgt 5 0,073 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 1 true avgt 5 0,073 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 5 false avgt 5 0,193 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 5 true avgt 5 0,194 ± 0,032 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 10 false avgt 5 0,322 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 10 true avgt 5 0,344 ± 0,055 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 20 false avgt 5 0,694 ± 0,097 us/op ElementPerformanceTest.benchmarkFlatMapStreamToArray LinkedList 20 true avgt 5 0,684 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 1 false avgt 5 0,117 ± 0,008 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 1 true avgt 5 0,118 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 5 false avgt 5 0,260 ± 0,026 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 5 true avgt 5 0,257 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 10 false avgt 5 0,436 ± 0,087 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 10 true avgt 5 0,428 ± 0,005 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 20 false avgt 5 0,913 ± 0,038 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream ArrayList 20 true avgt 5 0,906 ± 0,010 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 1 false avgt 5 0,119 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 1 true avgt 5 0,119 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 5 false avgt 5 0,273 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 5 true avgt 5 0,270 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 10 false avgt 5 0,451 ± 0,010 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 10 true avgt 5 0,442 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 20 false avgt 5 0,758 ± 0,018 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListGetChildrenStream LinkedList 20 true avgt 5 0,756 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 1 false avgt 5 0,072 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 1 true avgt 5 0,077 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 5 false avgt 5 0,180 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 5 true avgt 5 0,196 ± 0,034 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 10 false avgt 5 0,330 ± 0,006 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 10 true avgt 5 0,341 ± 0,053 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 20 false avgt 5 0,663 ± 0,015 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren ArrayList 20 true avgt 5 0,688 ± 0,004 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 1 false avgt 5 0,075 ± 0,001 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 1 true avgt 5 0,081 ± 0,053 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 5 false avgt 5 0,179 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 5 true avgt 5 0,183 ± 0,035 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 10 false avgt 5 0,313 ± 0,003 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 10 true avgt 5 0,285 ± 0,007 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 20 false avgt 5 0,665 ± 0,002 us/op ElementPerformanceTest.benchmarkFlatMapStreamToListStreamChildren LinkedList 20 true avgt 5 0,693 ± 0,103 us/op ElementPerformanceTest.benchmarkHashMapGet N/A N/A N/A avgt 5 0,055 ± 0,001 us/op ElementPerformanceTest.benchmarkHashMapGetIdentity N/A N/A N/A avgt 5 0,468 ± 0,005 us/op ElementPerformanceTest.benchmarkHashMapGetMemoryOptimized N/A N/A N/A avgt 5 0,064 ± 0,001 us/op ElementPerformanceTest.benchmarkHashMapSet N/A N/A N/A avgt 5 0,086 ± 0,002 us/op ElementPerformanceTest.benchmarkHashMapSetIdentity N/A N/A N/A avgt 5 0,563 ± 0,006 us/op ElementPerformanceTest.benchmarkHashMapSetMemoryOptimized N/A N/A N/A avgt 5 0,112 ± 0,010 us/op ElementPerformanceTest.benchmarkIndentityMapGet N/A N/A N/A avgt 5 0,467 ± 0,010 us/op ElementPerformanceTest.benchmarkIndentityMapGetStatic N/A N/A N/A avgt 5 0,046 ± 0,001 us/op ElementPerformanceTest.benchmarkIndentityMapSet N/A N/A N/A avgt 5 0,508 ± 0,033 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 1 false avgt 5 0,064 ± 0,003 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 1 true avgt 5 0,062 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 5 false avgt 5 0,117 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 5 true avgt 5 0,119 ± 0,003 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 10 false avgt 5 0,157 ± 0,024 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 10 true avgt 5 0,165 ± 0,003 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 20 false avgt 5 0,274 ± 0,012 us/op ElementPerformanceTest.benchmarkMapStreamToList ArrayList 20 true avgt 5 0,259 ± 0,004 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 1 false avgt 5 0,063 ± 0,002 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 1 true avgt 5 0,065 ± 0,017 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 5 false avgt 5 0,111 ± 0,001 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 5 true avgt 5 0,130 ± 0,002 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 10 false avgt 5 0,153 ± 0,006 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 10 true avgt 5 0,163 ± 0,014 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 20 false avgt 5 0,272 ± 0,005 us/op ElementPerformanceTest.benchmarkMapStreamToList LinkedList 20 true avgt 5 0,273 ± 0,016 us/op
-
Additionally, I've gathered String intern table metrics with
-XX:StringTableSize=1000003 -XX:+PrintStringTableStatistics
:SymbolTable statistics: Number of buckets : 32768 = 262144 bytes, each 8 Number of entries : 16885 = 270160 bytes, each 16 Number of literals : 16885 = 674664 bytes, avg 39.000 Total footprint : = 1206968 bytes Average bucket size : 0.515 Variance of bucket size : 0.511 Std. dev. of bucket size: 0.715 Maximum bucket size : 6 StringTable statistics: Number of buckets : 1048576 = 8388608 bytes, each 8 Number of entries : 4935 = 78960 bytes, each 16 Number of literals : 4935 = 318208 bytes, avg 64.000 Total footprint : = 8785776 bytes Average bucket size : 0.005 Variance of bucket size : 0.005 Std. dev. of bucket size: 0.069 Maximum bucket size : 2
which confirms that even with small no. of entries and increased bucket size by quite a lot there was no increase of performance of
String.intern()
method.From that, it is clear that the comparison of Strings using
==
instead ofequals()
is around 35% faster. On the other hand, each callString.intern()
takes around 0,6us (which is called on each creation of an element, or setting of xmlns. However, looking at around325
operations ofequals()
per each us I wonder if it is ok to callString.intern()
for eachElement
creation. So to overcome the overhead ofString.intern()
call, each String would need to be compared by==
at least 195 times. (assuming that element was in a list with at least 5 elements and in a random position - quite possible).I would assume that element name is compared 10-20 times on average in a single pass of a packet, but I may be wrong. We mostly check/compare element names of "outer" elements of the stanza (close to the root), while internal elements may newer be compared. We would need to measure this somehow... any ideas?
I'm not sure, if this call to
String.intern()
actually gives us a performance boost, or rather just "slightly" reduces memory usage due to common element names deduplication. -
I wonder if we shouldn't add
-XX:StringTableSize=1000003 -XX:+PrintStringTableStatistics
to our deployments to check actual usage ofString.intern()
, ie. to know how many strings were processed, know used memory and check if StringTableSize is big enough (too small will produce performance issues). -
Interesting observation regarding intern() performance. If indeed it usage is not that beneficial performance wise we could drop it in favour of something else (our own local "intern" table with common names?).
Regarding usage / comparison count - while we won't compare it normally Element could do it while performing lookup. I think if we want to measure it we could add basic statistics to Element (only temporarily for the testing purpose) and run... TTS-NG against server with such custom build of tigase-xmltools.
Adding StringTable statistics to our tigase.im/org deployment could also give us more information.
-
I think that knowing how many times elements (and xmlns) were interned would be good to know, but I suppose there is no way to count how many times we are comparing element names as those are instance comparisons so there is no way to measure that, right? We could check executions of
getChild()
, etc. -
Yes, I was thinking about counting usage of
.getChild()
and such however you rise a valid point that comparing instances is usually how it's done. Nevertheless, we usually don't compare completeElement
but rather it's features (name, xmlns) and in that case we do something along the lines ofElement.getName() == "message"
(interestingly, in tigase-server I found 70 usage of.getName()
with instance comparison and ~110 usages of calls with.getName().equals()
) -
I've tested a direct comparison of performance between comparing String with
String::equals()
, instance comparison, instance comparison with call tointern()
. Here are results:ElementPerformanceTest2.benchmarkStringEquals N/A N/A thrpt 5 560,065 ± 60,731 ops/us ElementPerformanceTest2.benchmarkStringIdentity N/A N/A thrpt 5 11,180 ± 0,572 ops/us ElementPerformanceTest2.benchmarkStringIdentityInternalized N/A N/A thrpt 5 3496,496 ± 169,305 ops/us ElementPerformanceTest2.benchmarkStringIdentityInternalizedOverkill N/A N/A thrpt 5 11,197 ± 1,314 ops/us ElementPerformanceTest2.benchmarkStringIdentityOneInternalized N/A N/A thrpt 5 21,179 ± 2,375 ops/us
From that we can see, that comparing
String
withequals()
is 7 times slower than instance comparison. On the other hand, it is 50 times faster than comparing 2 strings with an instance comparison if both of them need to be internalized and 27 times faster if at least one of the strings is static and the second one needs to be internalized.I will try to make some measurements to estimate how many times we are calling
String::intern()
and how many times we are comparing internalized versions ofString
. That should give us an average ratio and will help us decide what would be better for us.Alternatively, we could use the "mixed" mode, by internalizing in a global map all element names passed as parameters to
findChild()
methods and actually callString::intern()
only on elements which names or XMLNSs are compared. That could reduce no. of calls toString::intern()
for large stanzas (with a lot of internal elements which are never compared).Another approach could be internalizing only "known" element names (such as
message
,presence
,iq
,query
,body
, etc.) but that would require manually writingswitch
with cases, so we would need to measure times for those as well. -
A point in discussion https://shipilev.net/jvm/anatomy-quarks/10-string-intern/
In almost every project we were taking care of, removing
String.intern()
from the hotpaths, or optionally replacing it with a handrolled deduplicator, was the very profitable performance optimization. Do not useString.intern()
without thinking very hard about it, okay? -
As discussed, I've modified
Element
to support identity-based and nonidentity-based attributes holder, and here are results:Benchmark (attributesClass) Mode Cnt Score Error Units ElementAttributesPerformanceTest.benchmarkAttributeGet IdentityMap thrpt 5 22,342 ± 0,420 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet HashMap thrpt 5 268,068 ± 2,820 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet DedupHashMap thrpt 5 276,086 ± 6,135 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity IdentityMap thrpt 5 905,443 ± 50,518 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity HashMap thrpt 5 259,715 ± 50,871 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity DedupHashMap thrpt 5 280,487 ± 6,015 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut IdentityMap thrpt 5 22,347 ± 0,182 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut HashMap thrpt 5 149,649 ± 30,114 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut DedupHashMap thrpt 5 149,762 ± 11,182 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove IdentityMap thrpt 5 22,650 ± 0,922 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove HashMap thrpt 5 1030,129 ± 26,221 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove DedupHashMap thrpt 5 1004,137 ± 187,431 ops/us
It looks like
put
is 7 times slower whenintern()
is used. However, if we comparegetStatic
vsget
then it is 3.4 times faster. But if we "forget" about using "static", then identity-based solution is over 10x slower. Please note, thatput
is used for each attribute (even those which are never asked for!). Similarly,remove
is 45x slower when we are using identity-based solution. -
This seems to confirm that we should drop
String.intern()
use. Though, I think we should modify test cast and instead of using completely random attribute names also try with limited set of most popular attributes (this should especially show differences inDedupHashMap
implementation). -
Benchmark (attributeType) (attributesClass) Mode Cnt Score Error Units ElementAttributesPerformanceTest.benchmarkAttributeGet Dynamic IdentityMap thrpt 5 23,314 ± 1,812 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Dynamic HashMap thrpt 5 271,550 ± 11,128 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Dynamic DedupHashMap thrpt 5 266,812 ± 48,502 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Static IdentityMap thrpt 5 41,831 ± 3,953 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Static HashMap thrpt 5 351,414 ± 24,967 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Static DedupHashMap thrpt 5 602,491 ± 24,917 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Dynamic IdentityMap thrpt 5 924,518 ± 19,176 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Dynamic HashMap thrpt 5 278,129 ± 7,343 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Dynamic DedupHashMap thrpt 5 279,252 ± 2,265 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Static IdentityMap thrpt 5 927,816 ± 13,294 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Static HashMap thrpt 5 361,070 ± 4,145 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Static DedupHashMap thrpt 5 604,214 ± 66,511 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Dynamic IdentityMap thrpt 5 16,831 ± 0,423 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Dynamic HashMap thrpt 5 158,600 ± 11,464 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Dynamic DedupHashMap thrpt 5 155,214 ± 2,629 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Static IdentityMap thrpt 5 35,416 ± 0,401 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Static HashMap thrpt 5 130,465 ± 0,986 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Static DedupHashMap thrpt 5 185,464 ± 3,369 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Dynamic IdentityMap thrpt 5 22,139 ± 2,529 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Dynamic HashMap thrpt 5 1058,439 ± 22,766 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Dynamic DedupHashMap thrpt 5 1057,848 ± 14,043 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Static IdentityMap thrpt 5 40,887 ± 0,807 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Static HashMap thrpt 5 1037,860 ± 168,149 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Static DedupHashMap thrpt 5 1059,319 ± 2,549 ops/us
Here are results including case when "known" attribute name was used ("id") - "Static".
-
I've added one more test.
Attributes holder class named previously
DedupHashMap
was doing deduplication using "static" switch-case is now namedDedupStaticHashMap
. The new version, using HashMap for deduplication (allowing us to modify the list of deduplicated attribute names in runtime) is now namedDedupHashHashMap
.Here are results:
Benchmark (attributeType) (attributesClass) Mode Cnt Score Error Units ElementAttributesPerformanceTest.benchmarkAttributeGet Dynamic IdentityMap thrpt 5 15,457 ± 1,387 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Dynamic HashMap thrpt 5 271,397 ± 4,562 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Dynamic DedupStaticHashMap thrpt 5 275,557 ± 4,993 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Dynamic DedupHashHashMap thrpt 5 287,798 ± 5,461 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Static IdentityMap thrpt 5 40,866 ± 2,709 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Static HashMap thrpt 5 352,945 ± 4,543 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Static DedupStaticHashMap thrpt 5 605,102 ± 5,953 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGet Static DedupHashHashMap thrpt 5 598,744 ± 23,502 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Dynamic IdentityMap thrpt 5 922,564 ± 10,873 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Dynamic HashMap thrpt 5 284,818 ± 5,797 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Dynamic DedupStaticHashMap thrpt 5 266,303 ± 8,952 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Dynamic DedupHashHashMap thrpt 5 275,538 ± 6,721 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Static IdentityMap thrpt 5 913,211 ± 38,650 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Static HashMap thrpt 5 354,059 ± 4,145 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Static DedupStaticHashMap thrpt 5 603,541 ± 7,332 ops/us ElementAttributesPerformanceTest.benchmarkAttributeGetIdentity Static DedupHashHashMap thrpt 5 595,235 ± 29,891 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Dynamic IdentityMap thrpt 5 17,004 ± 0,538 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Dynamic HashMap thrpt 5 156,010 ± 11,896 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Dynamic DedupStaticHashMap thrpt 5 154,720 ± 6,253 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Dynamic DedupHashHashMap thrpt 5 142,995 ± 7,413 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Static IdentityMap thrpt 5 32,695 ± 3,046 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Static HashMap thrpt 5 128,166 ± 8,409 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Static DedupStaticHashMap thrpt 5 142,257 ± 2,664 ops/us ElementAttributesPerformanceTest.benchmarkAttributePut Static DedupHashHashMap thrpt 5 162,970 ± 0,597 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Dynamic IdentityMap thrpt 5 22,533 ± 2,082 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Dynamic HashMap thrpt 5 1019,817 ± 175,476 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Dynamic DedupStaticHashMap thrpt 5 1049,247 ± 11,586 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Dynamic DedupHashHashMap thrpt 5 1042,285 ± 78,440 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Static IdentityMap thrpt 5 39,838 ± 0,699 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Static HashMap thrpt 5 1054,993 ± 12,879 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Static DedupStaticHashMap thrpt 5 1018,364 ± 45,795 ops/us ElementAttributesPerformanceTest.benchmarkAttributeRemove Static DedupHashHashMap thrpt 5 1049,882 ± 25,535 ops/us
-
Here are some more tests to measure if usage of deduplication of element name would speed up element lookup.
Benchmark (clazz) (maxChild) (willDedup) Mode Cnt Score Error Units ElementPerformanceTest.benchmarkFindChildDedupEquality ArrayList 1 false thrpt 5 324,337 ± 5,819 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality ArrayList 1 true thrpt 5 276,836 ± 5,511 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality ArrayList 5 false thrpt 5 150,787 ± 17,099 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality ArrayList 5 true thrpt 5 166,530 ± 5,483 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality ArrayList 10 false thrpt 5 102,848 ± 1,524 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality ArrayList 10 true thrpt 5 125,910 ± 12,166 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality LinkedList 1 false thrpt 5 323,914 ± 3,066 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality LinkedList 1 true thrpt 5 262,859 ± 6,712 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality LinkedList 5 false thrpt 5 64,743 ± 15,393 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality LinkedList 5 true thrpt 5 66,343 ± 17,814 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality LinkedList 10 false thrpt 5 40,147 ± 0,712 ops/us ElementPerformanceTest.benchmarkFindChildDedupEquality LinkedList 10 true thrpt 5 39,947 ± 7,333 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized ArrayList 1 false thrpt 5 430,021 ± 6,353 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized ArrayList 1 true thrpt 5 424,701 ± 25,306 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized ArrayList 5 false thrpt 5 141,278 ± 31,980 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized ArrayList 5 true thrpt 5 161,832 ± 1,610 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized ArrayList 10 false thrpt 5 98,278 ± 2,041 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized ArrayList 10 true thrpt 5 117,119 ± 1,040 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized LinkedList 1 false thrpt 5 382,869 ± 50,581 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized LinkedList 1 true thrpt 5 385,171 ± 6,204 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized LinkedList 5 false thrpt 5 63,648 ± 13,582 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized LinkedList 5 true thrpt 5 61,753 ± 0,838 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized LinkedList 10 false thrpt 5 38,915 ± 0,639 ops/us ElementPerformanceTest.benchmarkFindChildDedupEqualityPossibleOptimized LinkedList 10 true thrpt 5 41,068 ± 0,739 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 1 N/A thrpt 5 430,022 ± 47,343 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 5 N/A thrpt 5 174,254 ± 15,473 ops/us ElementPerformanceTest.benchmarkFindChildEquality ArrayList 10 N/A thrpt 5 115,625 ± 1,765 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 1 N/A thrpt 5 387,752 ± 8,182 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 5 N/A thrpt 5 163,705 ± 2,547 ops/us ElementPerformanceTest.benchmarkFindChildEquality LinkedList 10 N/A thrpt 5 104,628 ± 2,562 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 1 N/A thrpt 5 1136,214 ± 24,443 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 5 N/A thrpt 5 335,138 ± 11,112 ops/us ElementPerformanceTest.benchmarkFindChildIdentity ArrayList 10 N/A thrpt 5 217,671 ± 4,249 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 1 N/A thrpt 5 1038,479 ± 4,364 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 5 N/A thrpt 5 333,967 ± 4,592 ops/us ElementPerformanceTest.benchmarkFindChildIdentity LinkedList 10 N/A thrpt 5 184,304 ± 3,872 ops/us
To make it better visible I've created a chart
From here, it looks like usage of plain
String.equals()
is the second best solution for performance of string comparison after usingString.intern()
. Deduplicated version of comparison (even optimized one) is slower than direct usage ofString.equals()
. (Even if comparison using identity is used for lookup after using deduplication).My guess is that this is caused by
HashMap
backing our deduplication mechanism, which forces Java to calculatehashCode()
for eachString
passed as name. This method iterates over eachchar
of theString
(unless hashCode value was already cached for this instance of aString
). Due to that, comparison has to iterate over each char:- for dedup:
- if string has deduplicated value - once (to find deduplicated version)
- if string has not deduplication value - twice (to look for deduplicated value and then during call to
equals()
)
- for optimized deduplication:
- if less than 5 children are in the collection - once (equality comparision)
- if string has deduplicated value - once (to find deduplicated version)
- if string has not deduplication value - twice (to look for deduplicated value and then during call to
equals()
)
- for equality - just once (call to
equals()
)
To sum this up, each method which has deduplication will slow down as it forces Java to calculate
hashCode
which is expensive (at the same level as call to equality). The only improvements could be seen if instance ofString
would be reused and deduplication will happen, but that is rather "unlikely" and would force use to deduplicate all element names inConcurrentHashMap
(which would lock and slow down processing and names could not be released from there). - for dedup:
-
To add some more performance data, here are the results of performance testing of
String.intern()
vs "deduplication".Benchmark Mode Cnt Score Error Units StringPerformanceTest.stringDeduplication thrpt 5 724,057 ± 31,020 ops/us StringPerformanceTest.stringIntern thrpt 5 26,079 ± 0,125 ops/us StringPerformanceTest.stringNope thrpt 5 4076,612 ± 271,274 ops/us
"stringNope" does nothing - just accesses string instance.
From that, it is really visible that calls to
String.intern()
while improving memory usage may be slowing us down (as each attribute name, element name, xmlns) is "internalized" (for each element - not only those used for comparison).It looks like,
String.intern()
is 27.8x times slower than "deduplication" which is only 5.6x times slower than using an original instance of aString
.In my opinion, we could keep "deduplication" just to reduce memory usage, but as deduplication will not improve performance (see the previous comment), we will sacrifice some of the performance gains to get lower memory usage.
Looking at those numbers, I wonder if we should not make "deduplication" optional. However, adding a simply "if" to decide if deduplication should be done or not, would slow down accessing string instance by ~35% (from my quick test), compared to using string without any conditions.
-
Looking at the results above, it looks like a single call to
String.intern()
requires as much time as 16 calls ofString.equal()
between 2 strings... not to mention that most likely we have a lot of attributes interned now and we are not using them in the comparison anywere. -
@kobit Looking at the results above, we are considering dropping the usage of
String.intern()
and switching from comparing element names and attribute names usingString.equals()
instead of==
of instances (at this is no longer an option with dropped usage ofString.intern()
. From our tests, it looks like we could improve the performance of parsing XML, and element processing should not be harmed as a single call toString.intern()
is expensive (even when it is later not used for comparison).Also, deduplication of strings (using HashMap with a configurable list of deduplicated names) may be used only for deduplication of strings during the creation of an element or setting an attribute as usage of it during comparisons and querying would slow us down.
Could you look at the comments since https://projects.tigase.net/issue/issue #14#focus=Comments-4-33480.0-0 up to this comment and let us know what you think about those changes (those changes would be part of Tigase XMPP Server 9.0.0)
-
Of course, if after tests the equals() gives better overall performance than intern() with ==, then yes, we should rework this code. Go ahead and do this.
I am only worried about all these places where == is used which could be missed after removing intern(). This may result in weird errors. Please be careful.
To be honest I have never tested a real performance of intern(). I just assumed a single intern() call would be about the same as a single equals().
For most small installations, performance is not a problem but memory usage is. Tigase from the start requires a lot of RAM, too much in my opinion. But I do not think this is the place to optimize for memory usage. Especially that even with zero traffic, Tigase still requires a lot of RAM.
So, to sum it up. Yes, please go ahead and remove usage of intern() + == and replace it with equals().
-
I am only worried about all these places where == is used which could be missed after removing intern(). This may result in weird errors. Please be careful.
I think that TTS-NG/TTS will help here and would catch if we miss something.
For most small installations, performance is not a problem but memory usage is. Tigase from the start requires a lot of RAM, too much in my opinion. But I do not think this is the place to optimize for memory usage. Especially that even with zero traffic, Tigase still requires a lot of RAM. So, to sum it up. Yes, please go ahead and remove usage of intern() + == and replace it with equals().
Unfortunately JVM itself is responsible as well (especially on small installations) but it's changing (jlink for example, which with removal of dependencies/groovy) should help quite a lot with memory usage here.
-
I am only worried about all these places where == is used which could be missed after removing intern(). This may result in weird errors. Please be careful.
As @wojtek mentioned, TTS-NG tests should help us find issues. Maybe we could try the lint plugin with some rules to proactively check that? On the other hand, I've suggested that we could add
boolean matches(String name, String xmlns)
method toElement
API, so that we could call this instead of callinggetName() == ""
which could ensure that checks would be done properly.For most small installations, performance is not a problem but memory usage is. Tigase from the start requires a lot of RAM, too much in my opinion. But I do not think this is the place to optimize for memory usage. Especially that even with zero traffic, Tigase still requires a lot of RAM.
If I recall, we already removed usage of
DirectByteBuffer
pools to reduce memory usage. I think that we could make a few more optimizations related to the memory, but in my opinion, the issue is also caused by the very high number of threads (each thread requires memory, and no. of threads depends on no. of CPUs) and task queues used by Tigase (each component has a task/packet queues (in & out), this no. depends on no. of components and no. of CPU cores). -
As for the threads number. Yes, the number might be scaled down somewhat. Specifically for small installations.
This part was extensively tested by me years back and the number of threads was adjusted to provide maximum throughput for large/huge installations. In short, I got much higher overall throughput with higher number of threads. Of course this translates into higher resource usage. But in such cases it is justified.
For small installations we deal with a few messages per minute sometimes per second, rather than thousands per second. So, we do not need a very high throughput and we could scale threads down without impact on the end-user.
-
I've applied discussed changes to the Tigase XML Tools package. @wojtek please review if all discussed changes were applied.
-
As we discussed removing "constructors" as we have now builder-like API for adding attributes and children, I've run some test to see if there will be any performance penalty. Here are results
Benchmark Mode Cnt Score Error Units ElementPerformanceTest.benchmarkElementWithAttributesCreationBuilder thrpt 5 21,436 ± 0,417 ops/us ElementPerformanceTest.benchmarkElementWithAttributesCreationDynamicArray thrpt 5 17,622 ± 4,194 ops/us ElementPerformanceTest.benchmarkElementWithAttributesCreationDynamicMap thrpt 5 11,659 ± 0,961 ops/us ElementPerformanceTest.benchmarkElementWithAttributesCreationDynamicMapOfEntries thrpt 5 9,028 ± 0,405 ops/us ElementPerformanceTest.benchmarkElementWithAttributesCreationMixedStatic thrpt 5 19,578 ± 0,294 ops/us ElementPerformanceTest.benchmarkElementWithAttributesCreationStaticArray thrpt 5 19,272 ± 0,097 ops/us ElementPerformanceTest.benchmarkElementWithAttributesCreationStaticMap thrpt 5 19,966 ± 0,611 ops/us
I would expect "Builder" to be slower than "StaticArray" and I would assume it is slower or at least has similar performance as I've had "builder" slower during a few runs when I was running less tests at once.
For sure, we should not use methods accepting
Map
for settings attributes unless we already have an instance of a map.Taking those results into account, I would say we should have following constructors:
Element(String name) Element(String name, String xmlns) Element(Element src)
I do not see a point in having other constructors as other cases can easily be taken care of by using builder pattern. The only issue may be during "transition" as we had a following constructor
Element(String name, String cdata)
which could be easily mistaken with the second one which I've proposed. Due to that I could lean on using just 2 constructors:
Element(String name) Element(Element src)
The second one is responsible for creation of a shallow copy.
-
+1 for 2 constructors (with name and
Element
) -- seems enough and with "withers" there is no need for others.As for default XMLNS - I vaguely remember that there were weird issues with missing xmlns (especially for
jabber:client
orjabber:server
and I think this default could be useful here). Though, during 9.0 development we could drop it and test if anything breaks. -
@wojtek Could you review the changes once again? I've dropped unnecessary constructors and
getChild()
method (which should be replaced by now withfindChild()
). Additionally, I've added JavaDoc forElement
methods.The last remaining thing will be adding test cases.
-
It looks good IMHO.
A couple of further comments:
- do we need
addAttribute*()
methods when they simply callsetAttribute*(...)
internally? - regarding
setAttributes(@NotNull String[] names, @NotNull String[] values)
- we already have one withMap
and you can useMap.of()
- I think we can utilize
Objects.requireNonNull()
- it does the same thing but would simplify a bit the code - do we need comments in
toString*()
methods regarding nulls? withFIXME
? It seems it's a non-issue now? - Should we leave FIXME comments in
setXMLNS()
? seems a good approach withremoveAttribute
/setAttribute
- Shouldn't we use
Deque<Element>
inDomBuilderHandler
? - In
Path
in// FIXME: I'm not sure about this.. maybe assert would be a better option..
- If we annote parameter withNotNull
then I think it would be sensible to useObjects.requireNonNull(element);
instead of returning empty collection.
Pushed some of the changes to the branch.
- do we need
-
do we need addAttribute*() methods when they simply call setAttribute*(...) internally?
I would prefer to use
addAttribute()
andremoveAttribute()
as we do not acceptnull
as a value, soset
will not remove an attribute if the value would be set tonull
. (as I would expect).regarding setAttributes(@NotNull String[] names, @NotNull String[] values) - we already have one with Map and you can use Map.of()
See performance comparison of using
Map.of()
->ElementPerformanceTest.benchmarkElementWithAttributesCreationDynamicMap
vsbenchmarkElementWithAttributesCreationDynamicArray
. We should move to use(add/set)Attribute()
only.I think we can utilize Objects.requireNonNull() - it does the same thing but would simplify a bit the code
+1
do we need comments in toString*() methods regarding nulls? with FIXME? It seems it's a non-issue now?
The checks for null were there, but I think there is no need for it as other methods guard against adding null. Due to that I've commented them out and left FIXME to verify that.
Should we leave FIXME comments in setXMLNS()? seems a good approach with removeAttribute/setAttribute
I've left them there to make it easy to find and verify that this is what we want. Generally, most of
FIXME
comments are to review the code there and then remove them.Shouldn't we use Deque in DomBuilderHandler?
Maybe.. I would even consider using ArrayDeque.. would need to test performance.
In Path in // FIXME: I'm not sure about this.. maybe assert would be a better option.. - If we annote parameter with NotNull then I think it would be sensible to use Objects.requireNonNull(element); instead of returning empty collection.
That was my assumption as well, that is why
FIXME
was there to ensure this will be discussed and reviewed. -
I would prefer to use
addAttribute()
andremoveAttribute()
as we do not accept null as a value, so set will not remove an attribute if the value would be set to null. (as I would expect).But semantics for add and set in case of null are somewhat (or can be) somewhat similar - we can simply reject
null
's here. I'm pondering the case where you would call add twice with the same attribute name - you wouldn't add second one but simply set value of (whether exists or not).See performance comparison of using Map.of() -> ElementPerformanceTest.benchmarkElementWithAttributesCreationDynamicMap vs benchmarkElementWithAttributesCreationDynamicArray. We should move to use (add/set)Attribute() only.
Good point, though in case of
benchmarkElementWithAttributesCreationDynamicArray
there is ~4k error margin (somewhat weird)The checks for null were there, but I think there is no need for it as other methods guard against adding null. Due to that I've commented them out and left FIXME to verify that.
Understood, +1 for removing null checks if not needed.
Maybe.. I would even consider using ArrayDeque.. would need to test performance.
+1
That was my assumption as well, that is why FIXME was there to ensure this will be discussed and reviewed.
+1
-
I've applied discussed changes and:
- renamed
children
collection tonodes
to reflect its actual content - moved filtering of
Element
s fromnodes
collection to a single placeprivate void forEachChild(Consumer<Element>)
method and using it ingetChildren()
andfindChildren()
methods
- renamed
-
It seems this was never merged (but got status "closed" and got sidetracked) after migration…
For now I think we should mark APIs that will be removed as deprecated.
(tentatively worth investigating: org.apache.commons.collections4.list.TreeList - could be faster but probably would use more memory than ArrayList, though probably still less than LinkedList; there is also brownies-collections with performance comparison and for small collections )
-
Possibly relevant: https://inside.java/2025/05/01/strings-just-got-faster/
(moving it to "in progress" as it's still to be done)
Type |
Task
|
Priority |
Normal
|
Assignee | |
Version |
tigase-server-9.0.0
|
-
tigase-server-8.5.0 Open
-
tigase-server-9.0.0 Open
Current
Element
API can be sometimes confusing and quite often, without looking at the sources it's difficult to tell what to expect. Improvements:null
s - use eitherOptional<>
or emptyList