Notifications

3887 views

Description

On instances before Kingston, if an email contains Unicode 4b emoji characters, like for example a smiley icon sent from a Windows phone, all content after that character will be missing when stored in the system.

The instance logs show the following exception:

java.sql.SQLException: Incorrect string value: '\xF0\x9F\x98\x83\xF0\x9F...' for column 'body' at row 1

Steps to Reproduce

  1. Send an email to an instance containing the following text in the body:

    Hello , happy face emoji     😄      <--- then more text after this

    Sending email with emoji

    Note that you only receive "Hello , happy face emoji." All data after the emoji is lost on the sys_email record.

    Data lost by emoji

Workaround

Our email form (sys_email) has support for Four-byte UTF8 Unicode characters on the subject, body, and body_text fields on Kingston releases and newer.

However, if an inbound action or processing is triggered, and the string contains Four-byte UTF8 Unicode characters, you need to ensure your target record fields support Four-byte UTF8 Unicode characters to avoid truncating data on those fields. The field type 'String (Full-UTF8)' is capable of handling the 4-byte emoji's character in all current releases.

Important: Before deciding to change the field types of body_html, body_text, and subject to prevent truncation, please review the other related pre-Kingston problems:

  • KB0621591 - Copy from String (Full-UTF8) to String causes data truncation if the content contains Unicode (PRB832997)
  • PRB833032 - Unable to create Journal Input that supports UTF-8
A recommended option, if you prefer not to make changes on the target fields, and you are using Four-byte UTF8 Unicode characters (emojis), it is to replace them on the incoming email.
 
Here is a simple example to remove the emoji characters on the Email (sys_email) form using a Business Rule:
  1. Table sys_email
  2. When to run: Select Before and mark insert
  3. Enable the Advanced checkbox
  4. On the Advanced tab (not the normal condition), set the condition: (current.type.toString() == "received")
  5. Empty the Script field and insert the following code:
(function executeRule(current, previous) {

function remove4ByteUTF8Unicodecharacters(vtext) {

return vtext ?
vtext.replace(/^[\0\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\u2600-\u26FF]/g, ".") :
"";
}

// replace the 4-byte unicode from body, subject and body_text.
current.setValue("body", remove4ByteUTF8Unicodecharacters(current.body + ""));
current.setValue("body_text", remove4ByteUTF8Unicodecharacters(current.body_text + ""));
current.setValue("subject", remove4ByteUTF8Unicodecharacters(current.subject + ""));
})(current, previous);

 

Here is how it looks:

 Business Rule

 

 

 

Related Problem: PRB627055

Seen In

Dublin Patch 1
Eureka Patch 10
Eureka Patch 11 Hot Fix 2
Eureka Patch 8
Eureka Patch 9 Hot Fix 1
Fuji Patch 12 Hot Fix 1
Fuji Patch 13 Hot Fix 1
Fuji Patch 5
Fuji Patch 7 Hot Fix 5
Fuji Patch 8
Geneva Patch 7
Helsinki Patch 2

Fixed In

Kingston

Associated Community Threads

There is no data to report.

Article Information

Last Updated:2018-11-13 01:27:46
Published:2018-11-01