Vamsi Pavan’s Place

When curiosity outbursts …

java.io.UTFDataFormatException: 5-byte UTF8 encoding not supported.

March 9th, 2009 · 1 Comment · Java, Jsp/servlets, Programming

This is an interesting exception I ran into while parsing XML content. While analysing it further, I came across other, similar errors as well.

Another similar exception is:

java.io.UTFDataFormatException: Invalid byte 1 of 1-byte UTF-8 sequence

Here, I was parsing an XML string by converting it to a byte array and passing that array to the XML reader as an input stream. I used java.lang.String.getBytes() for the conversion.

Unfortunately, one node in the XML had Chinese (non-ASCII, multi-byte in UTF-8) characters as its value, and that is what triggered the errors above. Later, I found that getBytes() with no argument encodes the string using the platform's default charset, which in our environment was a western encoding rather than UTF-8. The resulting bytes did not match the UTF-8 the parser expected. Using java.lang.String.getBytes("UTF-8") instead solved the issue.
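To make the fix concrete, here is a minimal sketch of the idea, not the exact code from our project. The class name Utf8XmlParseSketch, the method nodeText and the sample XML are made up for illustration, and I am assuming the standard JAXP DOM parser that ships with the JDK. The one line that matters is the getBytes("UTF-8") call.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
final class Utf8XmlParseSketch {

    // Parses the XML string and returns the text content of its root element.
    static String nodeText(String xml) {
        try {
            // Encode explicitly as UTF-8 instead of relying on the
            // platform default charset used by the no-argument getBytes().
            byte[] bytes = xml.getBytes("UTF-8");
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample document (an assumption): one node holding two Chinese characters.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<name>\u4e2d\u6587</name>";
        System.out.println(nodeText(xml).length());
    }
}
```

If you swap getBytes("UTF-8") for plain getBytes(), what happens depends on the default charset of the machine running the code, which is probably why this kind of bug shows up on one server and not on another.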

Worth keeping in mind when doing XML programming :)

1 response so far ↓

  • 1 wizardOfOzz // Jun 3, 2009 at 7:57 am

    Nice post, I didn’t know getBytes behaves like that. This applies to any string, not only XMLs.
