Here's an actual example:
Correct:
José
Byte stream as downloaded from getTransactionDetails:
4a 6f 73 c3 83 c2 a9
J o s à ©
Correct byte stream:
4a 6f 73 c3 a9
J o s é
The XML header returned by the getTransactionDetails API claims that the encoding is UTF-8,
but the API is actually UTF-8-encoding the text twice.
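The two byte streams above can be reproduced in a few lines of Python. This is only an illustration of the mechanism, assuming the intermediate step misreads the UTF-8 bytes as Latin-1; what the API does internally is not known.

```python
# Illustration only: reproduce the double-encoded bytes from the example.
correct = "José"

# Encoding once gives the correct byte stream.
utf8_once = correct.encode("utf-8")

# Assumption: the UTF-8 bytes are misread as Latin-1, one character per byte,
# and the resulting string is then encoded as UTF-8 a second time.
misread = utf8_once.decode("latin-1")
utf8_twice = misread.encode("utf-8")

print(" ".join(f"{b:02x}" for b in utf8_once))   # 4a 6f 73 c3 a9
print(" ".join(f"{b:02x}" for b in utf8_twice))  # 4a 6f 73 c3 83 c2 a9
```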
One frequent cause of this is a MySQL database whose tables are declared with the
default latin1 charset while the data stored in them is actually UTF-8. MySQL doesn't
care until you SELECT the data as UTF-8, at which point it treats the stored bytes as
latin1 characters and encodes the already-UTF-8 data as UTF-8 again.
Some versions of the mysqldump tool can also write the dump incorrectly,
even if the tables are properly declared with a UTF-8 charset.
When restored from such a dump, the table will contain doubly UTF-8-encoded text.
Workaround:
Decode the downloaded "UTF-8" byte stream to Unicode, re-encode that Unicode string
as Latin-1 to get back the underlying bytes (which are the real UTF-8),
then decode those bytes as UTF-8 once more. This fixes the problem.
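A minimal Python sketch of this workaround follows. The helper name fix_double_utf8 is hypothetical, and the code assumes the extra layer came from a Latin-1 misreading; text that does not round-trip is returned after a single decode.

```python
def fix_double_utf8(raw: bytes) -> str:
    """Undo one layer of double UTF-8 encoding (hypothetical helper)."""
    # First decode: the stream really is valid UTF-8, but yields 'JosÃ©'.
    text = raw.decode("utf-8")
    try:
        # Assumption: each character stands for one original byte (Latin-1).
        # Turn the characters back into bytes, then decode the real UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not double-encoded, or misread with another charset: leave as is.
        return text


downloaded = bytes.fromhex("4a6f73c383c2a9")
print(fix_double_utf8(downloaded))  # José
```

If the misreading used Windows-1252 instead of Latin-1, some characters will fall into the fallback branch unchanged; trying "cp1252" in place of "latin-1" may help in that case.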