Notifications

9228 views

Description

On instances before Kingston, if an email contains Unicode 4 byte emoji characters, like for example a smiley icon sent from a Windows phone, all content after that character will be missing in the sys_email body, body_text, or subject fields when the inbound email is stored in the sys_email record.

The instance logs show the following exception:

java.sql.SQLException: Incorrect string value: '\xF0\x9F\x98\x83\xF0\x9F...' for column 'body' at row 1

Steps to Reproduce

  1. Send an email to an instance containing the following text in the body:

    Hello , happy face emoji     😄      <--- then more text after this

    Sending email with emoji

    Note that you only receive "Hello , happy face emoji." All data after the emoji is lost on the sys_email record.

    Data lost by emoji

The truncation problem appears on deployments where MySQL is used as the database server. On deployments utilizing MariaDB (most farm instances), emojis are replaced with question marks (????). TASK463760 has the details.

Workaround

The email table (sys_email) supports four-byte UTF8 Unicode characters on the subject, body, and body_text fields on Kingston releases and newer.

 

The field type 'String (Full-UTF8)' is capable of handling the 4-byte emoji's character in all current releases.

Please ensure your target record fields are configured to allow four-byte UTF8 Unicode characters to avoid truncating data on those fields. 

 

For pre-Kingston

Important: Before deciding to change the field types of body_html, body_text, and subject to prevent truncation, please review the other related pre-Kingston problems:

  • KB0621591 - Copy from String (Full-UTF8) to String causes data truncation if the content contains Unicode (PRB832997)
  • PRB833032 - Unable to create Journal Input that supports UTF-8
A recommended option, if you prefer not to make changes on the target fields, and you are using Four-byte UTF8 Unicode characters (emojis), it is to replace them on the incoming email.
 
Note: the workaround isn't working for Journal Input fields, such as Work Notes. Please refer to PRB833032 for details.
 
Here is a simple example to remove the emoji characters on the Email (sys_email) form using a Business Rule:
  1. Table sys_email
  2. When to run: Select Before and mark insert
  3. Enable the Advanced checkbox
  4. On the Advanced tab (not the normal condition), set the condition: (current.type.toString() == "received")
  5. Empty the Script field and insert the following code:
(function executeRule(current, previous) {

function remove4ByteUTF8Unicodecharacters(vtext) {

return vtext ?
vtext.replace(/^[\0\uD7FF\uE000-\uFFFF]|[\uD800-\uDBFF][\uDC00-\uDFFF]|[\u2600-\u26FF]/g, ".") :
"";
}

// replace the 4-byte unicode from body, subject and body_text.
current.setValue("body", remove4ByteUTF8Unicodecharacters(current.body + ""));
current.setValue("body_text", remove4ByteUTF8Unicodecharacters(current.body_text + ""));
current.setValue("subject", remove4ByteUTF8Unicodecharacters(current.subject + ""));
})(current, previous);

 

Here is how it looks:

 Business Rule

 

 

 

Related Problem: PRB627055

Seen In

Dublin Patch 1
Eureka Patch 10
Eureka Patch 11 Hot Fix 2
Eureka Patch 8
Eureka Patch 9 Hot Fix 1
Fuji Patch 12 Hot Fix 1
Fuji Patch 13 Hot Fix 1
Fuji Patch 5
Fuji Patch 7 Hot Fix 5
Fuji Patch 8
Geneva Patch 7
Helsinki Patch 2

Fixed In

Kingston

Associated Community Threads

There is no data to report.

Article Information

Last Updated:2019-10-07 09:16:00
Published:2019-07-11