-
I do not recall working on similar task, however I think that to fix this we would need to make changes in SimpleParser and add map with allowed or banned chars or even implement validation of chars based on char value.
Right now there is no such code which would filter incoming data and close connection so I do not think this is implemented.
Attached code (links to repository commits) points to code related to checking of validation of JID sent to PubSub as one of attributes so I would not say it fixes issue described in here, however it may be loosely connected.
-
Thank you for your comments.
Right now, there is a logic in the SimpleParser code to ignore certain characters. Actually one character: 0x0. But it could/should be easily changed or replaced with a code to signal an error in case a character is found which is not allowed for the XMPP stream.
This particular code is critical for us in regards to performance and resource consumption, so we should avoid using Map. I think the current code with array and binary search on the array is quite efficient and we should continue to use it, unless more efficient solution is found.
Andrzej, please take a look at the code and let me know.
-
I prepared modified version of check which in my testing decreased performance by less than 1% in typical use case.
Then I used more testing in which I used all chars from range - allowed and not then performance decrease was less than 4%.
In both cases I measured performance of whole parse method of SimpleParser - so decrease may be less depending on current state of JVM.
I would like to confirm two thing:
-
this change should be for 7.2.0? If so I will need to switch branches in repository before I push changes - so current SNAPSHOT will be for 7.1.0 and newly create version will be for 7.2.0
-
if char is not in supported ranges we should return error? (not skip char as it was implemented before?)
-
-
Andrzej Wójcik wrote:
I prepared modified version of check which in my testing decreased performance by less than 1% in typical use case.
Then I used more testing in which I used all chars from range - allowed and not then performance decrease was less than 4%.
In both cases I measured performance of whole parse method of SimpleParser - so decrease may be less depending on current state of JVM.
Perfect! Good results, I am impressed.
I would like to confirm two thing:
- this change should be for 7.2.0? If so I will need to switch branches in repository before I push changes - so current SNAPSHOT will be for 7.1.0 and newly create version will be for 7.2.0
No, it can go to 7.1.0. I assigned it to version 7.2.0 because I did not expect to have it ready for this version.
- if char is not in supported ranges we should return error? (not skip char as it was implemented before?)
I think the spec requires the server to return stream error and terminate connection, so this is what we should do. So the SimpleParser should return an error.
-
I finished implementation of this check (with minor refactoring at the end) and once again checked performance - same results as presented above.
I added test case to verify every build that this check works OK and I compiled Tigase XMPP Server with this change to make sure that client can connect and that this works OK.
All changes are now part of Tigase XML Tools 3.5.0-SNAPSHOT and will be part of Tigase XMPP Server 7.1.0 release
Type |
Bug
|
Priority |
Major
|
Assignee | |
RedmineID |
8
|
Version |
tigase-server-7.1.0
|
Estimation |
0
|
Spent time |
0
|
Hi! As far as I've been able to tell, XMPP servers should disconnect clients
that send illegal XML characters[1]. And more importantly, XMPP servers
should NOT pass through illegal XML characters.
The original RFC3920[2] is a little vague on this issue (search for
"well-formed"), but Peter Saint-Andre's current draft revision[3] is fairly
clear:
"An XMPP entity MUST NOT accept data that is not XML-well-formed; instead it
MUST return an stream error and close the stream over
which the data was received."
I'm able to reproduce this bug using Pidgin 2.7.3 in Linux.
will probably need to use the --multiple flag)
used tigase.im)
"test" then ++u then 013 then space. This will insert the ASCII
character 013 aka 0x0b aka vertical tab
0x0b is invalid in XML 1.0 according to the
spec. The only allowed characters are:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
I think maybe some of the characters which are invalid in XML 1.0 are
valid in 1.1, but it seems like XMPP might mandate XML 1.0? I
couldn't find any info in the original RFC, but section 11.8 of Peter
Saint-Andre's proposed draft says, "XMPP is an application profile of
XML 1.0. A future version of XMPP might be defined in terms of higher
versions of XML, but this specification defines XMPP only in terms of
XML 1.0."