Reading UTF8 data from a file using Java
In general, data is stored in a computer in the form of bits (1 or 0). Various encoding schemes specify the sequence of bytes that represents each character.
Unicode − A character standard developed by the Unicode Consortium. UTF stands for Unicode Transformation Format. If you want to create documents that use characters from multiple character sets, you can do so with a single Unicode encoding. Unicode provides 3 types of encodings.
UTF-8 − It comes in 8-bit units (bytes). A character in UTF-8 can be from 1 to 4 bytes long, making UTF-8 variable width.
UTF-16 − It comes in 16-bit units. A character can be 1 or 2 units (2 or 4 bytes) long, making UTF-16 variable width.
UTF-32 − It comes in 32-bit units. It is a fixed-width format: every character is exactly one 32-bit unit (4 bytes) long.
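The variable width of UTF-8 and UTF-16 can be observed by encoding a few characters and counting the resulting bytes. The following is a minimal sketch; the class name EncodingWidths and its helper methods are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
   // Number of bytes the text occupies when encoded as UTF-8
   static int utf8Length(String text) {
      return text.getBytes(StandardCharsets.UTF_8).length;
   }

   // Number of bytes the text occupies when encoded as UTF-16
   // (big-endian variant, so no byte order mark is added)
   static int utf16Length(String text) {
      return text.getBytes(StandardCharsets.UTF_16BE).length;
   }

   public static void main(String[] args) {
      String[] samples = {
         "A",            // Latin capital letter A
         "\u00E9",       // e with acute accent
         "\u20AC",       // euro sign
         "\uD83D\uDE00"  // an emoji outside the Basic Multilingual Plane
      };
      for (String s : samples) {
         System.out.println("UTF-8: " + utf8Length(s) + " bytes, UTF-16: " + utf16Length(s) + " bytes");
      }
   }
}
```

The four samples take 1, 2, 3, and 4 bytes in UTF-8, and 2, 2, 2, and 4 bytes in UTF-16.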
Reading UTF data using the readUTF() method
The readUTF() method of the java.io.DataInputStream class reads data that is in modified UTF-8 encoding into a String and returns it. It expects the format produced by the writeUTF() method of DataOutputStream (a two-byte length followed by the encoded bytes), so it is not suited to plain UTF-8 text files. Therefore, to read such UTF data from a file −
Instantiate the FileInputStream class by passing a String value representing the path of the required file as a parameter.
Instantiate the DataInputStream class by passing the above-created FileInputStream object as a parameter.
Read UTF data from the DataInputStream object using the readUTF() method.
Example
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

public class UTF8Example {
   public static void main(String args[]) {
      StringBuilder buffer = new StringBuilder();
      // try-with-resources closes both streams automatically.
      // The backslash in the Windows path must be escaped as \\
      try (FileInputStream fileIn = new FileInputStream("D:\\test.txt");
           DataInputStream inputStream = new DataInputStream(fileIn)) {
         // Reading UTF data (as written by writeUTF()) from the DataInputStream
         while (inputStream.available() > 0) {
            buffer.append(inputStream.readUTF());
         }
      } catch (EOFException ex) {
         System.out.println(ex.toString());
      } catch (IOException ex) {
         System.out.println(ex.toString());
      }
      System.out.println("Contents of the file: " + buffer.toString());
   }
}
Output
Contents of the file: టుటోరియల్స్ పాయింట్ కి స్వాగతిం
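Because readUTF() only understands data written by writeUTF(), it can help to see both halves together. The following is a minimal round-trip sketch, assuming a temporary file is acceptable for the demonstration; the class name ReadUtfRoundTrip and the method roundTrip() are hypothetical names used for illustration.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class ReadUtfRoundTrip {
   // Writes each value with writeUTF(), then reads everything back with readUTF()
   static String roundTrip(File file, String... values) throws IOException {
      try (DataOutputStream out = new DataOutputStream(new FileOutputStream(file))) {
         for (String value : values) {
            out.writeUTF(value); // two-byte length prefix + modified UTF-8 bytes
         }
      }
      StringBuilder buffer = new StringBuilder();
      try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
         while (in.available() > 0) {
            buffer.append(in.readUTF());
         }
      }
      return buffer.toString();
   }

   public static void main(String[] args) throws IOException {
      File file = File.createTempFile("utf-demo", ".bin");
      file.deleteOnExit();
      System.out.println("Contents of the file: " + roundTrip(file, "Hello, ", "w\u00F6rld"));
   }
}
```

For a plain text file saved as UTF-8 by an editor, a character-oriented reader with an explicit charset is usually the better fit.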
The newBufferedReader() method of the java.nio.file.Files class accepts an object of the class Path representing the path of the file and an object of the class Charset representing the encoding of the character sequences to be read. It returns a BufferedReader object that can read data in the specified encoding.
The value for the Charset can be StandardCharsets.UTF_8, StandardCharsets.UTF_16LE, StandardCharsets.UTF_16BE, StandardCharsets.UTF_16, StandardCharsets.US_ASCII, or StandardCharsets.ISO_8859_1.
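To illustrate why the Charset argument matters, the same bytes can be decoded with two different charsets. This is a minimal sketch; the class name CharsetMatters and the method decode() are hypothetical names used for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMatters {
   // Decodes the given bytes into a String using the given charset
   static String decode(byte[] bytes, Charset charset) {
      return new String(bytes, charset);
   }

   public static void main(String[] args) {
      // "café" encoded as UTF-8: the accented letter becomes two bytes, 0xC3 0xA9
      byte[] utf8Bytes = "caf\u00E9".getBytes(StandardCharsets.UTF_8);

      // Decoding with the matching charset restores the original 4 characters
      System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8).length());

      // Decoding with ISO_8859_1 maps each byte to one character, giving 5 characters
      System.out.println(decode(utf8Bytes, StandardCharsets.ISO_8859_1).length());
   }
}
```

Decoding with the wrong charset does not raise an error here; it silently produces different text, which is why the charset used for reading should match the one used for writing.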
To read UTF-8 data from a file using this method −
Create/get an object of the Path class representing the required path using the get() method of the java.nio.file.Paths class.
Create/get a BufferedReader object that can read UTF-8 data by passing the above-created Path object and StandardCharsets.UTF_8 as parameters to the newBufferedReader() method.
Using the readLine() method of the BufferedReader object, read the contents of the file.
Example
import java.io.BufferedReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class UTF8Example {
   public static void main(String args[]) throws Exception {
      // Getting the Path object (the backslash must be escaped as \\)
      String filePath = "D:\\samplefile.txt";
      Path path = Paths.get(filePath);
      StringBuilder buffer = new StringBuilder();
      // Creating a BufferedReader object; try-with-resources closes it automatically
      try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
         // Reading the UTF-8 data from the file, one line at a time.
         // readLine() returns null at end of file and strips line terminators
         String line;
         while ((line = reader.readLine()) != null) {
            buffer.append(line);
         }
      }
      System.out.println("Contents of the file: " + buffer.toString());
   }
}
Output
Contents of the file: టుటోరియల్స్ పాయింట్ కి స్వాగతిం
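The example above depends on a file that already exists on the D: drive. As a self-contained variation, the following sketch first writes a small UTF-8 file to a temporary location and then reads it back line by line. The class name ReadLinesUtf8, the method readLines(), and the sample text are hypothetical and chosen only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadLinesUtf8 {
   // Reads every line of a UTF-8 encoded text file into a list
   static List<String> readLines(Path path) throws IOException {
      List<String> lines = new ArrayList<>();
      try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
         String line;
         while ((line = reader.readLine()) != null) {
            lines.add(line);
         }
      }
      return lines;
   }

   public static void main(String[] args) throws IOException {
      // Create a temporary file and write two lines to it as UTF-8
      Path path = Files.createTempFile("utf8-demo", ".txt");
      Files.write(path, Arrays.asList("first line", "na\u00EFve caf\u00E9"), StandardCharsets.UTF_8);

      for (String line : readLines(path)) {
         System.out.println(line);
      }
      Files.delete(path);
   }
}
```

Keeping the lines in a list preserves the line boundaries, which are lost when every line is appended to a single buffer.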