# Julia - Dictionaries and Sets

Many of the functions we have seen so far work on arrays and tuples. Arrays are just one type of collection; Julia has other kinds of collections too. One such collection is the dictionary (Dict), which associates **keys** with **values**. That is why it is called an **‘associative collection’**.

To understand it better, we can compare it with a simple look-up table in which many types of data are organized. Given a single piece of information called the key, such as a number, string or symbol, the table provides the corresponding data value.

## Creating Dictionaries

The syntax for creating a simple dictionary is as follows −

    Dict("key1" => value1, "key2" => value2, …, "keyn" => valuen)

In the above syntax, key1, key2…keyn are the keys and value1, value2,…valuen are the corresponding values. The operator => constructs a Pair object, so "key1" => value1 is equivalent to Pair("key1", value1). We cannot have two keys with the same name because keys are always unique in dictionaries.

### Example

    julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
    Dict{String,Int64} with 3 entries:
      "Y" => 110
      "Z" => 220
      "X" => 100
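The relationship between `=>`, `Pair` and key uniqueness can be checked with a small sketch. The names `p`, `dup` and `typed` are illustrative only and do not come from the tutorial.

```julia
# `=>` builds a Pair; a Dict is constructed from a sequence of such pairs.
p = "X" => 100
println(p isa Pair)
println(p.first, " ", p.second)

# Hypothetical input: the key "X" is given twice, so only one entry
# survives and it holds the value that was supplied last.
dup = Dict("X" => 100, "X" => 999, "Y" => 110)
println(length(dup))
println(dup["X"])

# An explicitly typed, empty dictionary fixes the key and value types up front.
typed = Dict{String,Int64}()
typed["Z"] = 220
println(typed)
```

Declaring the types up front is optional, but it is the form used later in this chapter for the word counter.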

We can also create dictionaries with the help of comprehension syntax. The example is given below −

### Example

    julia> first_dict = Dict(string(x) => sind(x) for x = 0:5:360)
    Dict{String,Float64} with 73 entries:
      "320" => -0.642788
      "65"  => 0.906308
      "155" => 0.422618
      "335" => -0.422618
      "75"  => 0.965926
      "50"  => 0.766044
      ⋮     => ⋮
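A smaller comprehension may be easier to verify by hand. The sketch below uses made-up names (`squares`, `even_squares`) and also shows that a comprehension can carry an `if` filter.

```julia
# Map each integer from 1 to 5 to its square.
squares = Dict(n => n^2 for n in 1:5)

# A filter clause keeps only the even numbers between 1 and 10.
even_squares = Dict(n => n^2 for n in 1:10 if iseven(n))

println(squares[4])
println(length(even_squares))
```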

## Keys

As discussed earlier, dictionaries have unique keys. It means that if we assign a value to a key that already exists, we will not be creating a new entry but modifying the value stored under the existing key. Following are some operations on dictionaries regarding keys −

### Searching for a key

We can use the **haskey()** function to check whether the dictionary contains a key or not −

    julia> first_dict = Dict("X" => 100, "Y" => 110, "Z" => 220)
    Dict{String,Int64} with 3 entries:
      "Y" => 110
      "Z" => 220
      "X" => 100

    julia> haskey(first_dict, "Z")
    true

    julia> haskey(first_dict, "A")
    false

### Searching for a key/value pair

We can use the **in()** function to check whether the dictionary contains a key/value pair or not −

    julia> in(("X" => 100), first_dict)
    true

    julia> in(("X" => 220), first_dict)
    false

### Add a new key-value

We can add a new key-value pair to an existing dictionary as follows −

    julia> first_dict["R"] = 400
    400

    julia> first_dict
    Dict{String,Int64} with 4 entries:
      "Y" => 110
      "Z" => 220
      "X" => 100
      "R" => 400
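Assignment behaves differently depending on whether the key already exists, and looking up a missing key with square brackets throws a `KeyError`. The following sketch, with an illustrative dictionary named `scores`, shows both cases along with the `get` and `get!` functions from Base, which accept a fallback value.

```julia
scores = Dict("X" => 100, "Y" => 110)

scores["X"] = 150          # "X" already exists, so its value is replaced
scores["Z"] = 220          # "Z" is new, so an entry is added

# get returns the fallback instead of throwing a KeyError for a missing key.
missing_value = get(scores, "A", 0)

# get! also stores the fallback in the dictionary when the key is missing.
stored_value = get!(scores, "A", -1)

println(scores)
```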

### Delete a key

We can use the **delete!()** function to delete a key, together with its value, from an existing dictionary −

    julia> delete!(first_dict, "R")
    Dict{String,Int64} with 3 entries:
      "Y" => 110
      "Z" => 220
      "X" => 100
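When the removed value is still needed, `pop!` from Base can be used instead of `delete!`. This is a small sketch with a hypothetical dictionary named `inventory`.

```julia
inventory = Dict("X" => 100, "Y" => 110, "Z" => 220)

removed = pop!(inventory, "Z")       # removes "Z" and returns its value
fallback = pop!(inventory, "A", 0)   # "A" is absent, so the default 0 is returned

delete!(inventory, "A")              # deleting an absent key leaves the dictionary unchanged

println(removed, " ", fallback, " ", length(inventory))
```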

### Getting all the keys

We can use the **keys()** function to get all the keys from an existing dictionary −

    julia> keys(first_dict)
    Base.KeySet for a Dict{String,Int64} with 3 entries. Keys:
      "Y"
      "Z"
      "X"

## Values

Every key in a dictionary has a corresponding value. Following are some operations on dictionaries regarding values −

### Retrieving all the values

We can use the **values()** function to get all the values from an existing dictionary −

    julia> values(first_dict)
    Base.ValueIterator for a Dict{String,Int64} with 3 entries. Values:
      110
      220
      100
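Both `keys()` and `values()` return iterators rather than arrays. The sketch below, using an illustrative dictionary named `prices`, shows how they can be collected or passed straight to reducing functions.

```julia
prices = Dict("X" => 100, "Y" => 110, "Z" => 220)

# collect turns the key iterator into an array; sorting gives a stable order.
key_list = sort(collect(keys(prices)))

# Reductions work directly on the value iterator.
total = sum(values(prices))
largest = maximum(values(prices))

println(key_list, " ", total, " ", largest)
```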

### Dictionaries as iterable objects

Dictionaries are iterable objects, so we can process each key/value pair in a loop −

    julia> for kv in first_dict
               println(kv)
           end
    "Y" => 110
    "Z" => 220
    "X" => 100

Here, **kv** is a Pair object that holds one key and its value.
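As a brief sketch of what each iteration yields, the loop variable can be inspected through its `first` and `second` fields, or destructured into two names. The names `d` and `lines` are illustrative.

```julia
d = Dict("X" => 100, "Y" => 110, "Z" => 220)

for kv in d
    # Each element is a Pair; its parts are kv.first and kv.second.
    println(kv.first, " -> ", kv.second)
end

# Destructuring gives the key and the value separate names.
lines = String[]
for (key, value) in d
    push!(lines, "$key => $value")
end
println(sort(lines))
```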

## Sorting a dictionary

Dictionaries do not store their keys in any particular order, hence the entries of a dictionary are not displayed in sorted order. To obtain the items in order, we can sort the keys −

### Example

    julia> first_dict = Dict("R" => 100, "S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
    Dict{String,Int64} with 6 entries:
      "S" => 220
      "U" => 400
      "T" => 350
      "W" => 670
      "V" => 575
      "R" => 100

    julia> for key in sort(collect(keys(first_dict)))
               println("$key => $(first_dict[key])")
           end
    R => 100
    S => 220
    T => 350
    U => 400
    V => 575
    W => 670
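An alternative sketch is to collect the key/value pairs themselves and sort them with the `by` keyword, which avoids a second lookup per key and also allows ordering by value. The dictionary `stock` is a made-up example.

```julia
stock = Dict("R" => 100, "S" => 220, "T" => 350)

# first and last extract the key and the value of a Pair.
by_key = sort(collect(stock), by = first)                 # ascending keys
by_value = sort(collect(stock), by = last, rev = true)    # descending values

for (key, value) in by_key
    println("$key => $value")
end
```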

We can also use the **SortedDict** data type from the **DataStructures.jl** Julia package to make sure that the dictionary remains sorted at all times. You can check the example below −

### Example

    julia> import DataStructures

    julia> first_dict = DataStructures.SortedDict("S" => 220, "T" => 350, "U" => 400, "V" => 575, "W" => 670)
    DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 5 entries:
      "S" => 220
      "T" => 350
      "U" => 400
      "V" => 575
      "W" => 670

    julia> first_dict["R"] = 100
    100

    julia> first_dict
    DataStructures.SortedDict{String,Int64,Base.Order.ForwardOrdering} with 6 entries:
      "R" => 100
      "S" => 220
      "T" => 350
      "U" => 400
      "V" => 575
      "W" => 670

## Word Counting Example

One of the simple applications of dictionaries is to count how many times each word appears in a text. The concept behind this application is that each word is a key, and the value of that key is the number of times that word appears in the text.

In the following example, we will count the words in a file named NLP.txt (saved on the desktop) −

    julia> f = open("C://Users//Leekha//Desktop//NLP.txt")
    IOStream()

    julia> wordlist = String[]
    String[]

    julia> for line in eachline(f)
               words = split(line, r"\W")
               map(w -> push!(wordlist, lowercase(w)), words)
           end

    julia> filter!(!isempty, wordlist)
    984-element Array{String,1}:
     "natural"
     "language"
     "processing"
     "semantic"
     "analysis"
     "introduction"
     "to"
     "semantic"
     "analysis"
     "the"
     "purpose"
     ⋮

    julia> close(f)

We can see from the above output that wordlist is now an array of 984 elements.

We can create a dictionary to store the words and word count −

    julia> wordcounts = Dict{String,Int64}()
    Dict{String,Int64}()

    julia> for word in wordlist
               wordcounts[word] = get(wordcounts, word, 0) + 1
           end
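The same counting pattern can be tried without a file. The sketch below substitutes a short, made-up string for the contents of NLP.txt and uses the `keepempty` keyword of `split` to drop empty strings in the same step; the names `text`, `words` and `counts` are illustrative.

```julia
# Hypothetical sample text standing in for the contents of a file.
text = "The cat saw the dog. The dog saw nothing!"

# Split on runs of non-word characters and discard empty strings.
words = split(lowercase(text), r"\W+"; keepempty = false)

counts = Dict{String,Int}()
for w in words
    # get supplies 0 the first time a word is seen.
    counts[w] = get(counts, w, 0) + 1
end

println(counts)
```

The call `get(counts, w, 0)` is what makes the loop work for words that are not yet in the dictionary, since indexing with a missing key would raise a `KeyError`.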

To find out how many times the words appear, we can look up the words in the dictionary as follows −

    julia> wordcounts["natural"]
    1

    julia> wordcounts["processing"]
    1

    julia> wordcounts["and"]
    14

We can also sort the dictionary as follows −

    julia> for i in sort(collect(keys(wordcounts)))
               println("$i, $(wordcounts[i])")
           end
    1, 2
    2, 2
    3, 2
    4, 2
    5, 1
    a, 28
    about, 3
    above, 2
    act, 1
    affixes, 3
    all, 2
    also, 5
    an, 5
    analysis, 15
    analyze, 1
    analyzed, 1
    analyzer, 2
    and, 14
    answer, 5
    antonymies, 1
    antonymy, 1
    application, 3
    are, 11
    ⋮

To find the most common words, we can use **collect()** to convert the dictionary to an array of key/value pairs and then sort the array by count as follows −

    julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)
    276-element Array{Pair{String,Int64},1}:
     "the" => 76
     "of" => 47
     "is" => 39
     "a" => 28
     "words" => 23
     "meaning" => 23
     "semantic" => 22
     "lexical" => 21
     "analysis" => 15
     "and" => 14
     "in" => 14
     "be" => 13
     "it" => 13
     "example" => 13
     "or" => 12
     "word" => 12
     "for" => 11
     "are" => 11
     "between" => 11
     "as" => 11
     ⋮
     "each" => 1
     "river" => 1
     "homonym" => 1
     "classification" => 1
     "analyze" => 1
     "nocturnal" => 1
     "axis" => 1
     "concept" => 1
     "deals" => 1
     "larger" => 1
     "destiny" => 1
     "what" => 1
     "reservation" => 1
     "characterization" => 1
     "second" => 1
     "certitude" => 1
     "into" => 1
     "compound" => 1
     "introduction" => 1

We can check the first 10 words as follows −

    julia> sort(collect(wordcounts), by = tuple -> last(tuple), rev=true)[1:10]
    10-element Array{Pair{String,Int64},1}:
     "the" => 76
     "of" => 47
     "is" => 39
     "a" => 28
     "words" => 23
     "meaning" => 23
     "semantic" => 22
     "lexical" => 21
     "analysis" => 15
     "and" => 14

We can use the **filter()** function to find all the words that start with a particular letter (say ’n’) and, as in this example, appear fewer than 4 times −

    julia> filter(tuple -> startswith(first(tuple), "n") && last(tuple) < 4, collect(wordcounts))
    6-element Array{Pair{String,Int64},1}:
     "none" => 2
     "not" => 3
     "namely" => 1
     "name" => 1
     "natural" => 1
     "nocturnal" => 1
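Calling `collect()` first is optional here: `filter()` also accepts a dictionary directly, in which case the predicate receives each entry as a Pair and the result is again a dictionary. The sketch below uses a small, invented `counts` dictionary rather than the real word counts.

```julia
# Hypothetical counts; only some keys start with "n" and are rare.
counts = Dict("natural" => 1, "not" => 3, "name" => 1, "the" => 76, "nouns" => 7)

# The predicate sees each entry as a Pair with fields first and second.
rare_n = filter(p -> startswith(p.first, "n") && p.second < 4, counts)

println(rare_n)
```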

## Sets

A set, like an array or a dictionary, is a collection, but one that holds only unique elements. Following are the differences between sets and other kinds of collections −

- In a set, we can have only one of each element.
- The order of elements is not important in a set.

### Creating a Set

With the help of **Set** constructor function, we can create a set as follows −

    julia> var_color = Set()
    Set{Any}()

We can also specify the element type of the set as follows −

    julia> num_primes = Set{Int64}()
    Set{Int64}()

We can also create and fill the set as follows −

    julia> var_color = Set{String}(["red","green","blue"])
    Set{String} with 3 elements:
      "blue"
      "green"
      "red"

Alternatively, we can use the **push!()** function, as with arrays, to add elements to a set as follows −

    julia> push!(var_color, "black")
    Set{String} with 4 elements:
      "blue"
      "green"
      "black"
      "red"

We can use the **in()** function to check whether an element is in the set −

    julia> in("red", var_color)
    true

    julia> in("yellow", var_color)
    false
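The two defining properties of a set, uniqueness and indifference to order, can be seen in a short sketch. The name `digits_seen` is illustrative.

```julia
# Duplicates in the input collapse into a single element.
digits_seen = Set([3, 1, 3, 2, 1])

push!(digits_seen, 2)   # 2 is already present, so nothing changes
push!(digits_seen, 7)   # 7 is new, so the set grows by one

println(length(digits_seen))
println(7 in digits_seen)

# Two sets with the same elements are equal regardless of insertion order.
println(Set([1, 2, 3]) == Set([3, 2, 1]))
```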

## Standard operations

Union, intersection, and difference are some standard operations we can do with sets. The corresponding functions for these operations are **union(), intersect()** and **setdiff()**.

### Union

In general, the union operation returns a set containing every element that appears in either of the two sets.

**Example**

    julia> color_rainbow = Set(["red","orange","yellow","green","blue","indigo","violet"])
    Set{String} with 7 elements:
      "indigo"
      "yellow"
      "orange"
      "blue"
      "violet"
      "green"
      "red"

    julia> union(var_color, color_rainbow)
    Set{String} with 8 elements:
      "indigo"
      "yellow"
      "orange"
      "blue"
      "violet"
      "green"
      "black"
      "red"

### Intersection

In general, the intersection operation takes two or more sets as inputs and returns the elements common to all of them.

**Example**

    julia> intersect(var_color, color_rainbow)
    Set{String} with 3 elements:
      "blue"
      "green"
      "red"

### Difference

In general, the difference operation takes two or more sets as input and returns the elements of the first set that do not appear in any of the others.

**Example**

    julia> setdiff(var_color, color_rainbow)
    Set{String} with 1 element:
      "black"
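Base also offers a few related set functions that are not covered above. The following sketch, with illustrative set names, shows `issubset`, `symdiff` and the in-place `union!`.

```julia
primary = Set(["red", "green", "blue"])
rainbow = Set(["red", "orange", "yellow", "green", "blue", "indigo", "violet"])
palette = Set(["red", "black"])

# issubset checks whether every element of the first set is in the second.
println(issubset(primary, rainbow))

# symdiff keeps the elements that belong to exactly one of the two sets.
odd_ones = symdiff(palette, primary)
println(odd_ones)

# union! modifies its first argument instead of returning a new set.
union!(palette, primary)
println(length(palette))
```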

## Some Functions on Dictionary

In the example below, you will see that functions that work on arrays and sets also work on collections like dictionaries −

    julia> dict1 = Dict(100=>"X", 220 => "Y")
    Dict{Int64,String} with 2 entries:
      100 => "X"
      220 => "Y"

    julia> dict2 = Dict(220 => "Y", 300 => "Z", 450 => "W")
    Dict{Int64,String} with 3 entries:
      450 => "W"
      220 => "Y"
      300 => "Z"

### Union

    julia> union(dict1, dict2)
    4-element Array{Pair{Int64,String},1}:
     100 => "X"
     220 => "Y"
     450 => "W"
     300 => "Z"

### Intersect

    julia> intersect(dict1, dict2)
    1-element Array{Pair{Int64,String},1}:
     220 => "Y"

### Difference

    julia> setdiff(dict1, dict2)
    1-element Array{Pair{Int64,String},1}:
     100 => "X"

### Merging two dictionaries

    julia> merge(dict1, dict2)
    Dict{Int64,String} with 4 entries:
      100 => "X"
      450 => "W"
      220 => "Y"
      300 => "Z"
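In the example above both dictionaries agree on the value for key 220, so the merge has no conflict to resolve. The sketch below uses two made-up dictionaries, `a` and `b`, that disagree on a key: by default the value from the later dictionary is kept, and passing a combining function as the first argument of `merge` resolves the conflict differently.

```julia
a = Dict("x" => 1, "y" => 2)
b = Dict("y" => 10, "z" => 20)

# With a plain merge, "y" takes the value from b, the later argument.
last_wins = merge(a, b)

# With a combining function, the two values stored under "y" are added.
summed = merge(+, a, b)

println(last_wins["y"], " ", summed["y"])
```

Neither call modifies `a` or `b`; `merge!` is the in-place variant.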

### Finding the smallest element

    julia> dict1
    Dict{Int64,String} with 2 entries:
      100 => "X"
      220 => "Y"

    julia> findmin(dict1)
    ("X", 100)
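Note that `findmin` compares the values of the dictionary, not its keys: the result is the smallest value followed by the key under which it is stored. A short sketch with illustrative variable names makes the distinction explicit and shows one way to get the smallest key instead.

```julia
d = Dict(100 => "X", 220 => "Y")

# findmin returns (smallest value, key of that value).
smallest_value, its_key = findmin(d)

# findmax is the counterpart for the largest value.
largest_value, key_of_largest = findmax(d)

# To find the smallest key, reduce over the keys instead.
smallest_key = minimum(keys(d))

println(smallest_value, " ", its_key, " ", smallest_key)
```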