Projects tigase _server server-core Issues #3
Tigase XMPP server should not pass through non-well-formed XML (#3)
Closed
Artur Hefczyc opened 1 decade ago
Due Date
2016-02-11

Hi! As far as I've been able to tell, XMPP servers should disconnect clients

that send illegal XML characters[1]. And more importantly, XMPP servers

should NOT pass through illegal XML characters.

The original RFC3920[2] is a little vague on this issue (search for

"well-formed"), but Peter Saint-Andre's current draft revision[3] is fairly

clear:

"An XMPP entity MUST NOT accept data that is not XML-well-formed; instead it

MUST return an stream error and close the stream over

which the data was received."

I'm able to reproduce this bug using Pidgin 2.7.3 in Linux.

  1. Start two instances of Pidgin (if you're using a single computer then you

will probably need to use the --multiple flag)

  1. In each instance, login to a separate account on a single Tigase server (I

used tigase.im)

  1. In one of the instances, set your status to "away" and type the message

"test" then ++u then 013 then space. This will insert the ASCII

character 013 aka 0x0b aka vertical tab

  1. The other instance will be disconnected

0x0b is invalid in XML 1.0 according to the

spec. The only allowed characters are:

#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

I think maybe some of the characters which are invalid in XML 1.0 are

valid in 1.1, but it seems like XMPP might mandate XML 1.0? I

couldn't find any info in the original RFC, but section 11.8 of Peter

Saint-Andre's proposed draft says, "XMPP is an application profile of

XML 1.0. A future version of XMPP might be defined in terms of higher

versions of XML, but this specification defines XMPP only in terms of

XML 1.0."

Artur Hefczyc commented 9 years ago

I recall you worked on something similar not long ago. Am I correct? If so, please review the task.

Andrzej Wójcik (Tigase) commented 9 years ago

I do not recall working on similar task, however I think that to fix this we would need to make changes in SimpleParser and add map with allowed or banned chars or even implement validation of chars based on char value.

Right now there is no such code which would filter incoming data and close connection so I do not think this is implemented.

Attached code (links to repository commits) points to code related to checking of validation of JID sent to PubSub as one of attributes so I would not say it fixes issue described in here, however it may be loosely connected.

Artur Hefczyc commented 9 years ago

Thank you for your comments.

Right now, there is a logic in the SimpleParser code to ignore certain characters. Actually one character: 0x0. But it could/should be easily changed or replaced with a code to signal an error in case a character is found which is not allowed for the XMPP stream.

This particular code is critical for us in regards to performance and resource consumption, so we should avoid using Map. I think the current code with array and binary search on the array is quite efficient and we should continue to use it, unless more efficient solution is found.

Andrzej, please take a look at the code and let me know.

Andrzej Wójcik (Tigase) commented 9 years ago

I prepared modified version of check which in my testing decreased performance by less than 1% in typical use case.

Then I used more testing in which I used all chars from range - allowed and not then performance decrease was less than 4%.

In both cases I measured performance of whole parse method of SimpleParser - so decrease may be less depending on current state of JVM.

I would like to confirm two thing:

  • this change should be for 7.2.0? If so I will need to switch branches in repository before I push changes - so current SNAPSHOT will be for 7.1.0 and newly create version will be for 7.2.0

  • if char is not in supported ranges we should return error? (not skip char as it was implemented before?)

Artur Hefczyc commented 9 years ago

Andrzej Wójcik wrote:

I prepared modified version of check which in my testing decreased performance by less than 1% in typical use case.

Then I used more testing in which I used all chars from range - allowed and not then performance decrease was less than 4%.

In both cases I measured performance of whole parse method of SimpleParser - so decrease may be less depending on current state of JVM.

Perfect! Good results, I am impressed.

I would like to confirm two thing:

  • this change should be for 7.2.0? If so I will need to switch branches in repository before I push changes - so current SNAPSHOT will be for 7.1.0 and newly create version will be for 7.2.0

No, it can go to 7.1.0. I assigned it to version 7.2.0 because I did not expect to have it ready for this version.

  • if char is not in supported ranges we should return error? (not skip char as it was implemented before?)

I think the spec requires the server to return stream error and terminate connection, so this is what we should do. So the SimpleParser should return an error.

Andrzej Wójcik (Tigase) commented 9 years ago

I finished implementation of this check (with minor refactoring at the end) and once again checked performance - same results as presented above.

I added test case to verify every build that this check works OK and I compiled Tigase XMPP Server with this change to make sure that client can connect and that this works OK.

All changes are now part of Tigase XML Tools 3.5.0-SNAPSHOT and will be part of Tigase XMPP Server 7.1.0 release

Artur Hefczyc commented 9 years ago

Excellent, thank you. Finally completed after 5 years :)

Referenced from commit 11 months ago
issue 1 of 1
Type
Bug
Priority
Major
Assignee
RedmineID
8
Version
tigase-server-7.1.0
Estimation
8h
Spent time
37h 30m
Issue Votes (0)
Watchers (0)
Reference
tigase/_server/server-core#3
Please wait...
Page is in error, reload to recover