Much to my chagrin, a non-trivial amount of my development work involves dealing with the JavaMail API. It’s easily the worst part of my job. As my friend Rich Unger once said:
[JavaMail is] a classic example of an API that was designed to make it easy on the API writer instead of the API user.
All of the most annoying bugs I’ve had to deal with have been because of some awful subtlety of using the JavaMail API. This frustration was greatly aggravated by mixing in two more of life’s abominations: Hotmail and Internet Explorer. We had a bug in our application such that mail sent to a Hotmail account would display “garbage” characters when viewed in IE. I took a screenshot, because everyone loves visuals, but for the link-impaired, an otherwise plain-text ASCII email would contain characters like the following:

This particular sequence of characters represents the byte-order mark (BOM) for the document. Of course, the BOM is supposed to be invisible, but that doesn’t stop Hotmail and IE from outing it to the world.
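To make the "garbage" a bit more concrete, here is a small sketch of what happens when the three bytes of a UTF-8 BOM are decoded with a single-byte charset. The class and method names are made up for illustration, and I'm assuming the client fell back to something like ISO-8859-1; I can't say for certain which charset Hotmail and IE actually picked.

```java
import java.nio.charset.StandardCharsets;

final class BomDemo {
    // The three bytes that make up the UTF-8 byte-order mark.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Decode the bytes the way a client would if it assumed a single-byte
    // charset (ISO-8859-1 here, which is an assumption for this sketch).
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Decode the same bytes as UTF-8, the way they were meant to be read.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Misread as Latin-1: three separate, very visible characters.
        for (char c : decodeAsLatin1(UTF8_BOM).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
        // Read as UTF-8: a single U+FEFF, which renders as nothing.
        for (char c : decodeAsUtf8(UTF8_BOM).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Read as UTF-8, the bytes collapse into one invisible character. Read as a single-byte charset, each byte becomes its own visible character, which is the kind of junk that ends up at the top of the message.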
What made this bug particularly frustrating was that it was specific to Hotmail and IE. The same message sent to a GMail address would have the exact same content, but would display as expected in IE. Similarly, the Hotmail message viewed in a more reasonable browser would also get rendered correctly. Of course, one could “force” IE and Hotmail into compliance by explicitly setting the encoding to UTF-8 (don’t forget to turn off “Auto-select”!), but that’s hardly something I can expect from end-users.
This particular bug wasn’t necessarily limited to our application. If I created a plain-text message in Outlook and added extended characters, the same behavior would result: Firefox would display it fine, GMail would display it fine, but Hotmail+IE would blow up.
So, there wasn’t much we could do if the message indeed had extended characters. Part of the bug, though, was that the messages we were sending didn’t have any extended characters, but JavaMail still insisted on sticking the BOM in there. If I told JavaMail to encode the body of the message as “us-ascii” instead of “UTF-8”, the plain-text would show up fine, unless of course there were extended characters.
So, to avoid this Hotmail+IE bug, I essentially wanted to tell JavaMail to encode in UTF-8 if there were extended characters, but in US-ASCII otherwise. I was contemplating writing a “boolean hasExtendedCharacters(String)” method, until I decided to do some last-ditch spelunking in the JavaMail source code.
It turns out, if you don’t specify an encoding via the actual method calls, but rather rely on System properties (either file.encoding or mail.mime.charset) to specify the default encoding, JavaMail will do the right thing.
So, instead of adding text to our messages like this:
MimeMessage message = getMessage();
// this will *always* encode the text as UTF-8, forcing a BOM at the beginning
// and causing much grief to any Hotmail recipients using IE
message.setText("Hello, world!", "UTF-8");
it needs to be done like so:
MimeMessage message = getMessage();
// the message will default to US-ASCII, unless there are extended characters
message.setText("Hello, world!");
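If you do control the JVM, the default can be pinned with the mail.mime.charset property mentioned above. This is only a sketch: the class and method names are hypothetical, and I'm assuming the property is set early, before any JavaMail code has had a chance to read (and possibly cache) its default charset.

```java
final class MailCharsetDefaults {
    // System property JavaMail consults for its default MIME charset.
    static final String CHARSET_PROPERTY = "mail.mime.charset";

    // Set the default only if nothing (e.g. a -D flag on the command line)
    // has already provided one, then return the value now in effect.
    static String applyDefault(String charset) {
        if (System.getProperty(CHARSET_PROPERTY) == null) {
            System.setProperty(CHARSET_PROPERTY, charset);
        }
        return System.getProperty(CHARSET_PROPERTY);
    }

    public static void main(String[] args) {
        // Call this once at startup, before creating any messages.
        System.out.println(CHARSET_PROPERTY + " = " + applyDefault("UTF-8"));
    }
}
```

Passing -Dmail.mime.charset=UTF-8 on the command line should have the same effect without touching any code, since the sketch leaves an existing value alone.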
Ugh. I personally find relying on system properties to define the behavior of a library to be a real shortcoming of an API, especially one as complex as JavaMail. System properties are hard to document in a public manner, and many times one cannot rely on the ability to alter system properties (such as a webapp being deployed into a servlet container). If you find yourself in such a circumstance, I suggest checking to make sure you actually are sending text with extended characters before explicitly setting an encoding on your plaintext messages. You never know what kind of crappy mail clients your recipients will be using…
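For what it's worth, the check itself is short. Here is a rough sketch of the method I had been contemplating, using only the JDK; the class name and the chooseCharset helper are hypothetical, and "extended" here simply means "anything US-ASCII can't encode".

```java
import java.nio.charset.StandardCharsets;

final class CharsetChooser {
    // True if the text contains at least one character outside US-ASCII.
    static boolean hasExtendedCharacters(String text) {
        // newEncoder() returns a fresh encoder on every call, so nothing
        // is shared between threads.
        return !StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }

    // Pick the narrowest charset name that can represent the text.
    static String chooseCharset(String text) {
        return hasExtendedCharacters(text) ? "UTF-8" : "US-ASCII";
    }

    public static void main(String[] args) {
        System.out.println(chooseCharset("Hello, world!"));
        System.out.println(chooseCharset("Hello, w\u00F6rld!"));
    }
}
```

With something like this in place, the explicit call becomes message.setText(body, CharsetChooser.chooseCharset(body)), so plain ASCII messages are never labeled as UTF-8 in the first place.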




