Exception when storing a delayed-delivery message with Unicode characters in MySQL (#394)
Closed
Unknown opened 10 years ago
Due Date
2014-11-29

I am running Tigase with MySQL. When I sent a message containing Unicode characters to an offline user, I got this exception:

2014-11-11 07:16:18.400 [in_3-sess-man]    MsgRepository.storeMessage()       WARNING:  Problem adding new entry to DB:
java.sql.SQLException: Incorrect string value: '\xD7\x94\xD7\x95\xD7\x93...' for column 'message' at row 1
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693)
        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2395)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2313)
        at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2298)
        at tigase.server.amp.MsgRepository.storeMessage(MsgRepository.java:488)
        at tigase.xmpp.impl.OfflineMessages.savePacketForOffLineUser(OfflineMessages.java:300)
        at tigase.xmpp.impl.MessageAmp.postProcess(MessageAmp.java:194)
        at tigase.server.xmppsession.SessionManager.processPacket(SessionManager.java:1966)
        at tigase.server.xmppsession.SessionManager.processPacket(SessionManager.java:561)
        at tigase.server.AbstractMessageReceiver$QueueListener.run(AbstractMessageReceiver.java:1475)

After some research, I discovered that most tables in MySQL are created with the utf8_general_ci collation, but the msg_history table uses latin1_swedish_ci.
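To illustrate why a latin1 column rejects this value, here is a minimal Java sketch. It decodes the bytes quoted in the exception message as UTF-8 and checks whether the resulting text can be encoded in a single-byte Latin charset. The class and method names (CharsetCheck, canStore) are hypothetical, and ISO-8859-1 is only a stand-in for MySQL's latin1, which is not byte-for-byte identical to it; neither contains Hebrew letters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tigase.
class CharsetCheck {
    // The bytes quoted in the exception: '\xD7\x94\xD7\x95\xD7\x93...'
    static final byte[] REPORTED_BYTES = {
        (byte) 0xD7, (byte) 0x94, (byte) 0xD7, (byte) 0x95, (byte) 0xD7, (byte) 0x93
    };

    // True if every character of the text has a representation in the charset.
    static boolean canStore(String text, Charset columnCharset) {
        return columnCharset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Each pair of bytes is one Hebrew letter when read as UTF-8.
        String text = new String(REPORTED_BYTES, StandardCharsets.UTF_8);
        for (int i = 0; i < text.length(); i++) {
            System.out.println(String.format("U+%04X", (int) text.charAt(i)));
        }
        // ISO-8859-1 is used here as an approximation of MySQL's latin1.
        System.out.println("latin1-like: " + canStore(text, StandardCharsets.ISO_8859_1));
        System.out.println("utf8:        " + canStore(text, StandardCharsets.UTF_8));
    }
}
```

The decoded text is three characters in the Hebrew block (U+05D4, U+05D5, U+05D3), which a UTF-8 column can hold and a latin1 column cannot.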

In 5.2.1 I can see the cause: most tables are created by the initialization script (db-create-mysql.sh), but msg_history is created in code. Here is the difference:

In the script (taken from mysql-schema-4-schema.sql):

create table if not exists tig_nodes (
       nid bigint unsigned NOT NULL auto_increment,
       parent_nid bigint unsigned,
       uid bigint unsigned NOT NULL,

       node varchar(255) NOT NULL,

       primary key (nid),
       unique key tnode (parent_nid, uid, node),
       key node (node),
       key uid (uid),
       key parent_nid (parent_nid),
       constraint tig_nodes_constr foreign key (uid) references tig_users (uid)
)
ENGINE=InnoDB default character set utf8 ROW_FORMAT=DYNAMIC;

In code (taken from MsgRepository.java in 5.2.1; it looks like this still happens in master, in JDBCMsgRepository.java):

private static final String MYSQL_CREATE_MSG_TABLE =
        "create table " + MSG_TABLE + " ( " + "  "
        + MSG_ID_COLUMN + " serial," + "  "
        + MSG_TIMESTAMP_COLUMN + " TIMESTAMP DEFAULT CURRENT_TIMESTAMP," + "  "
        + MSG_EXPIRED_COLUMN + " DATETIME," + "  "
        + MSG_FROM_UID_COLUMN + " bigint unsigned," + "  "
        + MSG_TO_UID_COLUMN + " bigint unsigned NOT NULL," + "  "
        + MSG_BODY_COLUMN + " varchar(4096) NOT NULL," + "  "
        + " key (" + MSG_EXPIRED_COLUMN + "), "
        + " key (" + MSG_FROM_UID_COLUMN + ", " + MSG_TO_UID_COLUMN + "),"
        + " key (" + MSG_TO_UID_COLUMN + ", " + MSG_FROM_UID_COLUMN + "))";

As you can see, it is missing this:

ENGINE=InnoDB default character set utf8 ROW_FORMAT=DYNAMIC;

Is there any reason it is not utf8 as well?
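For illustration, a sketch of how the constant might look with the same table options appended. This is not the actual fix from the repository. The table name msg_history and the column name message come from the error above; every other column name, the class name MsgTableDdl, and the MYSQL_TABLE_OPTIONS constant are assumptions made so the example compiles on its own.

```java
// Hypothetical stand-alone sketch; constant values other than
// "msg_history" and "message" are assumed, not taken from Tigase.
class MsgTableDdl {
    static final String MSG_TABLE = "msg_history";
    static final String MSG_ID_COLUMN = "msg_id";            // assumed name
    static final String MSG_TIMESTAMP_COLUMN = "ts";         // assumed name
    static final String MSG_EXPIRED_COLUMN = "expired";      // assumed name
    static final String MSG_FROM_UID_COLUMN = "sender_uid";  // assumed name
    static final String MSG_TO_UID_COLUMN = "receiver_uid";  // assumed name
    static final String MSG_BODY_COLUMN = "message";

    // Same options the schema script uses for tig_nodes.
    static final String MYSQL_TABLE_OPTIONS =
            " ENGINE=InnoDB default character set utf8 ROW_FORMAT=DYNAMIC";

    static final String MYSQL_CREATE_MSG_TABLE =
            "create table " + MSG_TABLE + " ( "
            + MSG_ID_COLUMN + " serial, "
            + MSG_TIMESTAMP_COLUMN + " TIMESTAMP DEFAULT CURRENT_TIMESTAMP, "
            + MSG_EXPIRED_COLUMN + " DATETIME, "
            + MSG_FROM_UID_COLUMN + " bigint unsigned, "
            + MSG_TO_UID_COLUMN + " bigint unsigned NOT NULL, "
            + MSG_BODY_COLUMN + " varchar(4096) NOT NULL, "
            + "key (" + MSG_EXPIRED_COLUMN + "), "
            + "key (" + MSG_FROM_UID_COLUMN + ", " + MSG_TO_UID_COLUMN + "), "
            + "key (" + MSG_TO_UID_COLUMN + ", " + MSG_FROM_UID_COLUMN + "))"
            + MYSQL_TABLE_OPTIONS;   // explicit character set, as in the script

    public static void main(String[] args) {
        System.out.println(MYSQL_CREATE_MSG_TABLE);
    }
}
```

With the character set stated explicitly, the table no longer inherits whatever default the database or server happens to have.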

Artur Hefczyc commented 10 years ago

Yes, this is a bug and we will fix it.

wojciech.kapcia@tigase.net commented 10 years ago

Explicit declaration of Character Set added
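A change to the create statement only affects tables created afterwards, so an installation that already has a latin1 msg_history table would presumably still need a one-time conversion. A minimal sketch using MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET follows. The class and method names are hypothetical, and whether Tigase performs such a migration itself is not stated in this issue.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical migration helper, not part of Tigase.
class MsgTableMigration {
    // Builds the MySQL statement that converts an existing table to utf8.
    static String convertToUtf8Sql(String table) {
        // Identifiers cannot be bound as parameters, so restrict the name instead.
        if (!table.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected table name: " + table);
        }
        return "ALTER TABLE " + table + " CONVERT TO CHARACTER SET utf8";
    }

    // Runs the conversion on an open connection supplied by the caller.
    static void convertToUtf8(Connection conn, String table) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(convertToUtf8Sql(table));
        }
    }
}
```

Back up the table before running a conversion like this on a production database.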

Type
Bug
Priority
Normal
Assignee
RedmineID
2458
Spent time
3h
Reference
tigase/_server/server-core#394