Convert UTF-8 to Unicode in Java
Before moving on to the conversion, let us learn about Unicode and UTF-8.
Unicode is an international character encoding standard capable of representing most of the world's written languages. Each character is assigned a numeric code point, conventionally written in hexadecimal. Code points range from U+0000 to U+10FFFF. In Java, a char is a 16-bit UTF-16 code unit, so a single char (or \uXXXX escape) covers \u0000 through \uFFFF, and characters above U+FFFF are stored as a pair of char values called a surrogate pair.
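To make the hexadecimal notation concrete, here is a minimal sketch that lists the code points of a string. The class name CodePointDump and the helper toCodePoints are illustrative names chosen for this example, not part of any library.

```java
public class CodePointDump {

    // Formats each code point of the string in the conventional U+XXXX notation.
    static String toCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: U+0068 U+0065 U+0079 U+6366
        System.out.println(toCodePoints("hey\u6366"));
    }
}
```

String.codePoints() is used rather than iterating over char values so that a character stored as two char values would still be reported as one code point.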
UTF-8 is a variable-width character encoding. It is as compact as ASCII for ASCII text, yet it can encode any Unicode character at the cost of more bytes for non-ASCII characters. UTF stands for Unicode Transformation Format. The '8' signifies that it uses 8-bit units (bytes) to encode a character. The number of bytes needed to represent a character varies from 1 to 4.
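The variable width can be observed by encoding single characters and counting the resulting bytes. The following is a small sketch; the class name Utf8Width and the helper utf8Length are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Width {

    // Returns how many bytes the UTF-8 encoding of the string occupies.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("h"));            // 1 (ASCII letter)
        System.out.println(utf8Length("\u00E9"));       // 2 (U+00E9, e with acute accent)
        System.out.println(utf8Length("\u6366"));       // 3 (U+6366, the character used in this article)
        System.out.println(utf8Length("\uD83D\uDE00")); // 4 (U+1F600, written as two char values)
    }
}
```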
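To convert UTF-8 to Unicode, we create a String object by passing two arguments to its constructor: the byte array that holds the UTF-8 data and the name of the charset those bytes are encoded in, i.e. UTF-8. A Java String represents text as a sequence of UTF-16 code units, so constructing it decodes the bytes into Unicode characters.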
Let us see a program to convert UTF-8 to Unicode by creating a new String Object.
Example
public class Example {
    public static void main(String[] args) throws Exception {
        String str = "hey\u6366";
        // Encode the string as UTF-8 bytes
        byte[] charset = str.getBytes("UTF-8");
        // Decode the UTF-8 bytes back into a String
        String result = new String(charset, "UTF-8");
        System.out.println(result);
    }
}
Output
hey捦
Let us understand the above program. First, we encoded a Unicode string as UTF-8 bytes using the getBytes() method, so that we have UTF-8 data to convert back −
String str = "hey\u6366";
byte[] charset = str.getBytes("UTF-8");
Then we converted the charset byte array to Unicode by creating a new String object as follows −
String result = new String(charset, "UTF-8");
System.out.println(result);
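One thing the program above does not show is what happens when the bytes are not valid UTF-8. The String(byte[], Charset) constructor replaces malformed sequences with the replacement character U+FFFD instead of failing. If invalid input should be detected, a CharsetDecoder configured to report errors is one option. The sketch below assumes that behavior is wanted; the class name StrictUtf8Decode and the helper isValidUtf8 are hypothetical names for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8Decode {

    // Returns true only if the whole byte array decodes as well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] good = "hey\u6366".getBytes(StandardCharsets.UTF_8);
        byte[] bad = {0x68, (byte) 0xFF}; // 0xFF never appears in well-formed UTF-8

        System.out.println(isValidUtf8(good)); // true
        System.out.println(isValidUtf8(bad));  // false

        // The lenient constructor does not throw; the bad byte becomes U+FFFD.
        String lenient = new String(bad, StandardCharsets.UTF_8);
        System.out.println(lenient.contains("\uFFFD")); // true
    }
}
```

Using the StandardCharsets.UTF_8 constant instead of the string "UTF-8" also avoids the checked UnsupportedEncodingException, so main no longer needs a throws clause.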
