Contributor
Posts: 14
Registered: ‎07-06-2011

Transaction Details API trashes UTF-8

Here's an actual example:

 

Correct:

José


Byte stream as downloaded from getTransactionDetails:

4a 6f 73 c3 83 c2 a9

J  o  s  Ã     ©


Correct byte stream:

4a 6f 73 c3 a9

J  o  s  é

 

The XML header returned by the getTransactionDetails API claims that the encoding is UTF-8, but the API is actually encoding the text as UTF-8 twice.
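To make the double encoding concrete, here is a minimal Python 3 sketch that reproduces the exact bytes above. It assumes the intermediate misreading is Latin-1; I can't see the server side, so that part is a guess that happens to match the observed output.

```python
# Reproduce the bytes seen in the download (assumption: the extra
# encoding step misreads UTF-8 bytes as Latin-1 characters).
correct = "José"

once = correct.encode("utf-8")       # proper UTF-8: 4a 6f 73 c3 a9
misread = once.decode("latin-1")     # each byte becomes one character: 'JosÃ©'
twice = misread.encode("utf-8")      # encoded again: 4a 6f 73 c3 83 c2 a9

print(once.hex(" "))
print(twice.hex(" "))
```

The second line printed matches the byte stream downloaded from getTransactionDetails.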

One frequent cause of this is a MySQL database whose tables are declared with the default latin1 charset while the data stored in them is actually UTF-8. MySQL doesn't care until you SELECT the data over a UTF-8 connection, at which point it treats the stored bytes as latin1 and converts them to UTF-8, encoding the already-UTF-8 data a second time.

 

Some versions of the mysqldump tool can also write the dump incorrectly, even if the tables are properly declared with a UTF-8 charset. A table restored from such a dump will contain doubly UTF-8-encoded text.


Workaround:

Decode the downloaded "UTF-8" byte stream to Unicode, turn each resulting character back into a single byte (that is, encode it as Latin-1) to recover the real UTF-8 bytes, then decode those bytes as UTF-8 once more. That fixes the problem.
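A minimal Python 3 sketch of that workaround follows. The function name `fix_double_utf8` is my own, and it assumes the extra layer went through Latin-1; if the server actually used Windows-1252, characters in the 0x80 to 0x9F range would need `"cp1252"` instead.

```python
def fix_double_utf8(raw: bytes) -> str:
    """Undo one extra layer of UTF-8 encoding.

    Falls back to the plain UTF-8 decoding when the text does not
    look double-encoded, so correct data passes through unchanged.
    """
    text = raw.decode("utf-8")
    try:
        # Map each character back to one byte, then decode the real UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


downloaded = bytes.fromhex("4a 6f 73 c3 83 c2 a9")
print(fix_double_utf8(downloaded))  # José
```

The fallback matters if the API is ever fixed: properly encoded names would then fail the second decode and be returned as they are instead of raising.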

All Star
Posts: 1,072
Registered: ‎08-17-2009

Re: Transaction Details API trashes UTF-8

Hey there,

 

Sorry for the delay, but I've got the development teams looking into this one. We'll post more once we hear back from them.

 

Thanks,

 

Michelle

Developer Community Manager

Administrator
Posts: 591
Registered: ‎08-21-2009

Re: Transaction Details API trashes UTF-8

I was able to confirm that this isn't exactly what is happening, so unfortunately it isn't just a simple adjustment to stop the behavior. You are correct that our database stores the values in a western character set. We are currently configured to expect that all text is sent to us in a western encoding. You should be able to send characters such as é using an extended western encoding. It is likely that we will move to support UTF-8 in the future, but it isn't something that we can provide a timeline on.
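For anyone wanting to try that, a rough Python 3 sketch of encoding a value in a western single-byte charset before sending it is shown below. The exact charset is not named above, so ISO-8859-1 is an assumption here (Windows-1252 is the other likely candidate), and `to_western` is a hypothetical helper, not part of any API.

```python
def to_western(text: str, encoding: str = "iso-8859-1") -> bytes:
    """Encode text in a single-byte western charset.

    Raises UnicodeEncodeError if a character does not exist in the
    chosen charset, rather than silently corrupting it.
    """
    return text.encode(encoding)


payload = to_western("José")
print(payload.hex(" "))  # 4a 6f 73 e9
```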