-
Andrzej Wójcik wrote:
This issue is caused by the fact that we added validation of the characters allowed by the XML 1.0 specification, which the XMPP protocol is defined on. According to the protocol specification this is correct behavior for Tigase XMPP Server, but it may not be good for the end user.
Ok so what do you suggest?
From the client side, I could encode those characters as entities (&#xxxx;), but isn't that allowed for HTML only? I can't remember...
-
It depends on what the spec says. I think I saw some discussion about this on one of the XMPP mailing lists. If the emoji characters are allowed or are planned to be allowed for XMPP, then we should update our code to let them through. I think it is a safe assumption that these new emoji characters will be part of the XMPP spec at some point.
Andrzej, what do you think? If you think there are negative side effects of adding these chars to the allowed set, please update our code.
-
If the emoji characters are allowed or are planned to be allowed for XMPP
XMPP has always allowed Unicode emojis, because XML 1.0 § 2.2 allows #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] as code points in text, and this includes most emojis. The [#x10000-#x10FFFF] range needs special treatment in Java, since you have to handle surrogate pairs, which is often simply forgotten when implementing an XML validator. So if I can disconnect a stream by sending e.g. U+1F4A9, then this is definitely a Tigase issue. Openfire had the same issue, which was fixed with https://github.com/igniterealtime/Openfire/commit/c0a4fc1889d4cced817fc2bee8e0e4a92e06ba60
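To illustrate the surrogate pair point, here is a minimal sketch of a check against the XML 1.0 § 2.2 Char production that iterates by code point rather than by Java char. The class and method names (XmlCharCheck, isXml10Char, isValidXml10Text) are made up for this example; this is not the Tigase or Openfire code, just one way such a check could look.

```java
public class XmlCharCheck {

    /** Returns true if the code point matches the XML 1.0 section 2.2 Char production. */
    public static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Walks the text by code point, so a valid surrogate pair is checked as a
     * single character in the #x10000-#x10FFFF range. An unpaired surrogate is
     * returned by codePointAt as-is and is rejected by isXml10Char.
     */
    public static boolean isValidXml10Text(CharSequence text) {
        int i = 0;
        while (i < text.length()) {
            int cp = Character.codePointAt(text, i);
            if (!isXml10Char(cp)) {
                return false;
            }
            i += Character.charCount(cp); // 2 for a surrogate pair, otherwise 1
        }
        return true;
    }

    public static void main(String[] args) {
        // U+1F4A9 is stored as two Java chars (a surrogate pair).
        String emoji = new String(Character.toChars(0x1F4A9));
        System.out.println(isValidXml10Text("status " + emoji));   // prints "true"
        System.out.println(isValidXml10Text("bad" + (char) 0x1));  // prints "false"
    }
}
```

A validator that instead tests each Java char against the ranges would see the two halves of the pair (both inside #xD800-#xDFFF) and reject the emoji, which would match the disconnect described in this report.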
Type | Bug
Priority | Major
Assignee |
RedmineID | 3838
Spent time | 0
I just found out that with the latest release branch Tigase is disconnecting my client with "XML content parse error". I've investigated and I can reliably reproduce it if I put emojis in the XML stream (messages, presence status element, etc.).
I don't think a log is needed, but if you need one I can restart my server with more verbose logging to prove it.
I've been told by another XMPP software maintainer that Openfire had a similar issue and they had to fix it to allow a wider range of UTF-8 characters.
I'm marking this bug as High because it causes clients to disconnect.