Julia - Files I/O



Reading from files

The functions open(), read(), and close() are the standard tools for reading information from text files.

Opening a text file

To read text from a file, you first need to obtain a file handle. You can do this with the open() function as follows −

foo = open("C://Users//Leekha//Desktop//NLP.txt")

Now foo is Julia's connection (an IOStream) to the text file NLP.txt on disk.

Closing the file

Once we are done with the file, we should close the connection as follows −

close(foo)
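
If an error is thrown between open() and close(), the handle stays open. One common way to guard against this is a try/finally block. The sketch below is a minimal illustration that assumes a throwaway file created with mktemp(); the file and its contents are made up for the example.

```julia
# Create a temporary file so the example is self-contained (hypothetical contents).
path, io = mktemp()
write(io, "sample text\n")
close(io)

foo = open(path)
content = try
    read(foo, String)   # any work done with the open handle
finally
    close(foo)          # runs even if the code in the try block throws
end

println(content)
println(isopen(foo))    # false

rm(path)
```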

In Julia, it is recommended to wrap any file-processing functions inside a do block as follows −

open("NLP.txt") do file
   # here you can work with the open file
end

The advantage of wrapping file-processing code inside a do block is that the open file is closed automatically when the block finishes, even if an error is thrown inside it.
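
The automatic closing can be checked directly. The following sketch assumes a temporary file created with mktemp() (the contents are hypothetical) and keeps a reference to the handle only so that its state can be inspected after the block ends.

```julia
# Create a temporary file so the example is self-contained (hypothetical contents).
path, io = mktemp()
write(io, "first line\nsecond line\n")
close(io)

# Keep a reference to the handle so it can be inspected after the block.
handle = Ref{IO}()
first_line = open(path) do file
    handle[] = file
    readline(file)         # the value of the block is returned by open()
end

println(first_line)        # first line
println(isopen(handle[]))  # false: the file was closed when the block ended

rm(path)
```

Note that open() returns whatever the do block evaluates to, which is why first_line holds the line that was read.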

The following example records the time taken to read the file and the total number of lines in it −

julia> totaltime, totallines = open("C://Users//Leekha//Desktop//NLP.txt") do foo
            linecounter = 0
            timetaken = @elapsed for l in eachline(foo)
               linecounter += 1
            end
            (timetaken, linecounter)
         end
(0.0001184, 87)

Reading a file all at once

With the read() function, we can read the whole content of an open file at once, for example −

ABC = read(foo, String)

Similarly, the code below stores the contents of the file in ABC −

julia> ABC = open("C://Users//Leekha//Desktop//NLP.txt") do file
            read(file, String)
         end
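
read() also accepts a file name in place of an open handle, in which case it opens, reads, and closes the file in one call. A minimal sketch, assuming a temporary file with made-up contents:

```julia
# Create a temporary file so the example is self-contained (hypothetical contents).
path, io = mktemp()
write(io, "alpha\nbeta\n")
close(io)

# Passing a file name: read() handles opening and closing itself.
text = read(path, String)
println(length(text))    # 11

# Without the String argument, read() returns the raw bytes.
bytes = read(path)
println(typeof(bytes))   # Vector{UInt8} (shown as Array{UInt8,1} on older Julia versions)

rm(path)
```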

We can also read the whole file into an array with one element per line. Use readlines() as follows −

julia> foo = open("C://Users//Leekha//Desktop//NLP.txt")
IOStream(<file C://Users//Leekha//Desktop//NLP.txt>)


julia> lines = readlines(foo)
87-element Array{String,1}:
 "Natural Language Processing: Semantic Analysis "
 ""
 "Introduction to semantic analysis:"
"The purpose of semantic analysis is to draw exact meaning, or you can say dictionary meaning from the text. Semantic analyzer checks the text for meaningfulness. "………………………………

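readlines() can also be given a file name directly, and its keep keyword controls whether line endings are retained. The sketch below assumes a small temporary file with hypothetical contents.

```julia
# Create a temporary file so the example is self-contained (hypothetical contents).
path, io = mktemp()
write(io, "one\n\nthree\n")
close(io)

lines = readlines(path)              # line endings stripped
raw   = readlines(path; keep=true)   # line endings kept

println(lines)   # ["one", "", "three"]
println(raw)     # ["one\n", "\n", "three\n"]

# Drop blank lines, a common follow-up step.
nonempty = filter(!isempty, lines)
println(nonempty)   # ["one", "three"]

rm(path)
```
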
Reading line by line

We can also process a file line by line. For this task, Julia provides the eachline() function, which turns a source into an iterator over its lines.

julia> open("C://Users//Leekha//Desktop//NLP.txt") do file
         for ln in eachline(file)
            println("$(length(ln)), $(ln)")
         end
      end
47, Natural Language Processing: Semantic Analysis
0,
34, Introduction to semantic analysis:
…………………………………

If you want to keep track of the current line number while reading the file, use the approach below −

julia> open("C://Users//Leekha//Desktop//NLP.txt") do f
         line = 1
         while !eof(f)
            x = readline(f)
            println("$line $x")
            line += 1
         end
      end
1 Natural Language Processing: Semantic Analysis
2
3 Introduction to semantic analysis:
4 The purpose of semantic analysis is to draw exact meaning, or you can say dictionary meaning from the text. Semantic analyzer checks the text for meaningfulness.
5 We know that lexical analysis also deals with the meaning of the words then how semantic analysis is different from lexical analysis? The answer is that Lexical analysis is based on smaller token but on the other side semantic analysis focuses on larger chunks. That is why semantic analysis can be divided into the following two parts:
6 Studying the meaning of individual word: It is the first part of the semantic analysis in which the study of the meaning of individual words is performed. This part is called lexical semantics.
7 Studying the combination of individual words: In this second part, the individual words will be combined to provide meaning in sentences.
8 The most important task of semantic analysis is to get the proper meaning of the sentence. For example, analyze the sentence “Ram is great.” In this sentence, the speaker is talking either about Lord Ram or about a person whose name is Ram. That is why the job, to get the proper meaning of the sentence, of semantic analyzer is important.
9 Elements of semantic analysis:
10 Following are the elements of semantic analysis:……………………..
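
An alternative to the manual counter is to combine enumerate() with eachline(), which also accepts a file name. This is a sketch of that variant, not a replacement for the loop above; the temporary file and its contents are assumptions made for the example.

```julia
# Create a temporary file so the example is self-contained (hypothetical contents).
path, io = mktemp()
write(io, "red\ngreen\nblue\n")
close(io)

numbered = String[]
# enumerate() pairs each line with its 1-based position.
for (n, ln) in enumerate(eachline(path))
    push!(numbered, "$n $ln")
end

foreach(println, numbered)
# 1 red
# 2 green
# 3 blue

rm(path)
```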

Path and File Names

The table below lists functions that are useful for working with paths and file names −

Sl.No.  Function & Description
1       cd(path): Changes the current working directory.
2       pwd(): Returns the current working directory.
3       readdir(path): Returns a list of the contents of the named directory, or of the current directory if no path is given.
4       abspath(path): Adds the current directory's path to a file name to make an absolute pathname.
5       joinpath(str, str, ...): Assembles a pathname from pieces.
6       isdir(path): Tells you whether the path is a directory.
7       splitdir(path): Splits a path into a tuple of the directory name and the file name.
8       splitdrive(path): On Windows, splits a path into the drive letter part and the path part. On Unix systems, the first component is always the empty string.
9       splitext(path): If the last component of a path contains a dot, splits the path into everything before the dot and everything from the dot onward. Otherwise, returns a tuple of the unmodified argument and the empty string.
10      expanduser(path): Replaces a tilde character at the start of a path with the current user's home directory.
11      normpath(path): Normalizes a path, removing "." and ".." entries.
12      realpath(path): Canonicalizes a path by expanding symbolic links and removing "." and ".." entries.
13      homedir(): Returns the current user's home directory.
14      dirname(path): Returns the directory part of a path.
15      basename(path): Returns the file name part of a path.
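
A few of these functions can be combined as in the sketch below. The path components ("data", "reports", "summary.txt") are hypothetical and do not need to exist on disk, because these functions only manipulate strings.

```julia
# Build a path from pieces; joinpath() uses the separator of the current platform.
p = joinpath("data", "reports", "summary.txt")

dir, file = splitdir(p)      # directory part and file name part
stem, ext = splitext(file)   # name without extension, and the extension

println(file)   # summary.txt
println(stem)   # summary
println(ext)    # .txt

# dirname() and basename() return the same two parts individually.
println(dirname(p) == dir)     # true
println(basename(p) == file)   # true

# normpath() removes "." and ".." entries.
messy = joinpath("data", "..", "data", ".", "summary.txt")
clean = normpath(messy)
println(clean == joinpath("data", "summary.txt"))   # true

# abspath() prefixes the current working directory.
println(isabspath(abspath(p)))   # true
```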

Information about a file

We can use stat("pathname") to get information about a specific file.

Example

julia> for n in fieldnames(typeof(stat("C://Users//Leekha//Desktop//NLP.txt")))
            println(n, ": ", getfield(stat("C://Users//Leekha//Desktop//NLP.txt"),n))
         end
device: 3262175189
inode: 17276
mode: 33206
nlink: 1
uid: 0
gid: 0
rdev: 0
size: 6293
blksize: 4096
blocks: 16
mtime: 1.6017034024103658e9
ctime: 1.6017034024103658e9
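
When only one or two values are needed, the result of stat() can be stored once and its fields read directly. The sketch below assumes a temporary file with made-up contents; it also uses unix2datetime() from the Dates standard library to make the modification time readable.

```julia
using Dates

# Create a temporary file so the example is self-contained (hypothetical contents).
path, io = mktemp()
write(io, "hello")
close(io)

info = stat(path)            # call stat() once and reuse the result
println(info.size)           # 5 (size in bytes)
println(unix2datetime(info.mtime))   # modification time as a DateTime (UTC)

# Convenience functions for common questions about a path.
size_in_bytes = filesize(path)
is_file = isfile(path)
is_dir = isdir(path)
println((size_in_bytes, is_file, is_dir))   # (5, true, false)

rm(path)
```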

Interacting with the file system

To convert file names to absolute pathnames, you can use the abspath() function. We can map it over the list of files in the current directory as follows −

julia> map(abspath, readdir())
204-element Array{String,1}:
 "C:\\Users\\Leekha\\.anaconda"
 "C:\\Users\\Leekha\\.conda"
 "C:\\Users\\Leekha\\.condarc"
 "C:\\Users\\Leekha\\.config"
 "C:\\Users\\Leekha\\.idlerc"
 "C:\\Users\\Leekha\\.ipynb_checkpoints"
 "C:\\Users\\Leekha\\.ipython"
 "C:\\Users\\Leekha\\.julia"
 "C:\\Users\\Leekha\\.jupyter"
 "C:\\Users\\Leekha\\.keras"
 "C:\\Users\\Leekha\\.kindle"…………………………

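Because abspath() resolves names against the current working directory, mapping it over readdir(dir) gives wrong paths when dir is some other directory. The sketch below shows two ways around this, assuming Julia 1.4 or later for the join keyword; the temporary directory and the file names in it are made up for the example.

```julia
# Create a temporary directory with two empty files (hypothetical names).
dir = mktempdir()
touch(joinpath(dir, "a.txt"))
touch(joinpath(dir, "b.txt"))

names = readdir(dir)                  # bare file names, sorted
println(names)                        # ["a.txt", "b.txt"]

# Option 1: ask readdir() to prepend the directory (Julia 1.4+).
paths = readdir(dir; join=true)

# Option 2: join the directory onto each name by broadcasting joinpath().
paths2 = joinpath.(dir, names)

println(paths == paths2)              # true
println(all(isfile, paths))           # true
```
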
Writing to files

The writedlm() function from the DelimitedFiles package can be used to write the contents of an array to a delimited text file (tab-separated by default).

Example

julia> using DelimitedFiles

julia> test_numbers = rand(10,10)
10×10 Array{Float64,2}:
 0.457071 0.41895  0.63602  0.812757 0.727214 0.156181 0.023817 0.286904 0.488069 0.232787
 0.623791 0.946815 0.757186 0.822932 0.791591 0.67814 0.903542 0.664997 0.702893 0.924639
 0.334988 0.511964 0.738595 0.631272 0.33401 0.634704 0.175641 0.0679822 0.350901 0.0773231
 0.838656 0.140257 0.404624 0.346231 0.642377 0.404291 0.888538 0.356232 0.924593 0.791257
 0.438514 0.70627 0.642209 0.196252 0.689652 0.929208 0.19364 0.19769 0.868283 0.258201
 0.599995 0.349388 0.22805 0.0180824 0.0226505 0.0838017 0.363375 0.725694 0.224026 0.440138
 0.526417 0.788251 0.866562 0.946811 0.834365 0.173869 0.279936 0.80839 0.325284 0.0737317
 0.0805326 0.507168 0.388336 0.186871 0.612322 0.662037 0.331884 0.329227 0.355914 0.113426
 0.527173 0.0799835 0.543556 0.332768 0.105341 0.409124 0.61811 0.623762 0.944456 0.0490737
 0.281633 0.934487 0.257375 0.409263 0.206078 0.720507 0.867653 0.571467 0.705971 0.11014
 
julia> writedlm("C://Users//Leekha//Desktop//testfile.txt", test_numbers)
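
To check what was written, the file can be read back with readdlm() from the same package. The sketch below is a small round trip under stated assumptions: a 2×2 matrix instead of the random one above, a comma as the delimiter, and a temporary file name from tempname().

```julia
using DelimitedFiles

data = [1.0 2.0; 3.0 4.0]
path = tempname()              # hypothetical output location

# The optional third argument sets the delimiter (the default is a tab).
writedlm(path, data, ',')

text = read(path, String)
print(text)
# 1.0,2.0
# 3.0,4.0

# Read the file back into a Float64 matrix using the same delimiter.
back = readdlm(path, ',', Float64)
println(back == data)          # true

rm(path)
```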