# Julia - Strings

A string is a finite sequence of zero or more characters, usually enclosed in double quotes, for example "This is Julia programming language". The following points about strings are important −

• Strings are immutable, i.e., we cannot change them once they are created.

• Two characters need special care: the double quote (") and the dollar sign ($). To include a double quote in a string, precede it with a backslash; otherwise the quote ends the string and the rest is interpreted as Julia code. To include a literal dollar sign, also precede it with a backslash, because the dollar sign is used for string interpolation.

• In Julia, the built-in concrete type used for strings and string literals is String, which supports the full range of Unicode characters via the UTF-8 encoding.

• All string types in Julia are subtypes of the abstract type AbstractString. If you want a function to accept any string type, declare the argument type as AbstractString.

• Julia has a first-class type for representing a single character. It is called Char, a subtype of the abstract type AbstractChar.

## Characters

A single character is represented by a Char value. Char is a 32-bit primitive type that can be converted to a numeric value (its Unicode code point).

julia> 'a'
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)
julia> typeof(ans)
Char

We can convert a Char to its integer value as follows −

julia> Int('a')
97
julia> typeof(ans)
Int64

We can also convert an integer value back to a Char as follows −

julia> Char(97)
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

With Char values, we can do some arithmetic as well as comparisons, which follow the order of the code points. This can be understood with the help of the following example −

julia> 'X' < 'x'
true
julia> 'X' <= 'x' <= 'Y'
false
julia> 'X' <= 'a' <= 'Y'
false
julia> 'a' <= 'x' <= 'Y'
false
julia> 'A' <= 'X' <= 'Y'
true
julia> 'x' - 'b'
22
julia> 'x' + 1
'y': ASCII/Unicode U+0079 (category Ll: Letter, lowercase)

### Delimited by double quotes or triple double quotes

Strings in Julia can be delimited by double quotes or by triple double quotes.
For example, if you need to include quotation marks in part of a string, you can use triple double quotes as shown below −

julia> str = "This is Julia Programming Language.\n"
"This is Julia Programming Language.\n"
julia> """See the "quote" characters"""
"See the \"quote\" characters"

### Performing arithmetic and other operations with end

Inside an indexing expression, end stands for the last index of the string. Just like a normal value, we can perform arithmetic and other operations with end. Check the example given below −

julia> str[end-1]
'.': ASCII/Unicode U+002E (category Po: Punctuation, other)
julia> str[end÷2]
'g': ASCII/Unicode U+0067 (category Ll: Letter, lowercase)

### Extracting substring by using range indexing

We can extract a substring from a string by using range indexing. Check the example given below −

julia> str[6:9]
"is J"

### Using SubString

In the above method, range indexing makes a copy of the selected part of the original string. To create a view into a string instead, we can use SubString as in the example below −

julia> substr = SubString(str, 1, 4)
"This"
julia> typeof(substr)
SubString{String}

## Unicode and UTF-8

Unicode characters and strings are fully supported by Julia. In character and string literals, the Unicode \u and \U escape sequences, as well as all the standard C escape sequences, can be used to represent Unicode code points. This is shown in the example below −

julia> s = "\u2200 x \u2203 y"
"∀ x ∃ y"

String literals are encoded using UTF-8, a variable-width encoding. Variable-width means that not all characters are encoded in the same number of bytes, i.e., code units. For example, in UTF-8 −

• ASCII characters (code points less than 0x80 (128)) are encoded using a single byte, as they are in ASCII.

• Code points 0x80 (128) and above are encoded using multiple bytes (up to four per character).

The code units (bytes for UTF-8) mentioned above are what String indices refer to in Julia.
Code units are the fixed-width building blocks used to encode arbitrary characters. As a result, not every index into a String is a valid index for a character. You can check out the example below, where ∀ occupies three bytes, so index 2 falls in the middle of that character and the next valid index is 4 −

julia> s[1]
'∀': Unicode U+2200 (category Sm: Symbol, math)
julia> s[2]
ERROR: StringIndexError("∀ x ∃ y", 2)
Stacktrace:
[1] string_index_err(::String, ::Int64) at .\strings\string.jl:12
[2] getindex_continued(::String, ::Int64, ::UInt32) at .\strings\string.jl:220
[3] getindex(::String, ::Int64) at .\strings\string.jl:213
[4] top-level scope at REPL[106]:1

## String Concatenation

Concatenation is one of the most useful string operations. The following example concatenates strings with the string function −

julia> A = "Hello"
"Hello"
julia> B = "Julia Programming Language"
"Julia Programming Language"
julia> string(A, ", ", B, ".\n")
"Hello, Julia Programming Language.\n"

We can also concatenate strings in Julia with the * operator. Given below is an example of the same −

julia> A = "Hello"
"Hello"
julia> B = "Julia Programming Language"
"Julia Programming Language"
julia> A * ", " * B * ".\n"
"Hello, Julia Programming Language.\n"

## Interpolation

Building strings by concatenation can become cumbersome. Julia therefore allows interpolation into string literals, which reduces the need for verbose calls to string or repeated use of *. Interpolation is done with the dollar sign ($). For example −

julia> A = "Hello"
"Hello"
julia> B = "Julia Programming Language"
"Julia Programming Language"
julia> "$A, $B.\n"
"Hello, Julia Programming Language.\n"


Julia takes the shortest complete expression after $ as the expression whose value is to be interpolated into the string. That is why we can interpolate any expression into a string by using parentheses. For example −

julia> "100 + 10 = $(100 + 10)"
"100 + 10 = 110"


Now if you want to use a literal $ in a string, you need to escape it with a backslash as follows −

julia> print("His salary is \$5000 per month.\n")
His salary is $5000 per month.
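The interpolation rules above can be sketched together in one short script. The variable names and values here (`name`, `version_major`, `items`) are hypothetical and chosen only for illustration.

```julia
# Hypothetical values used only for illustration.
name = "Julia"
version_major = 1
items = [3, 4, 5]

# A bare variable name after $ is interpolated directly.
greeting = "Hello, $name!"

# Parentheses delimit an arbitrary expression, including function calls.
report = "$(length(items)) items, total $(sum(items))"

# Parentheses also separate a variable from text that would otherwise
# be read as part of its name.
label = "$(name)Lang v$(version_major).x"

# A backslash keeps the dollar sign literal instead of starting interpolation.
price = "Cost: \$5000"

println(greeting)
println(report)
println(label)
println(price)
```

Using parentheses even around a plain variable is a reasonable habit whenever the following character could be part of an identifier.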


### Search operators

Julia provides the findfirst and findlast functions to search for the index of a particular character in a string. Both return nothing if the character is not found. The example below shows both functions −

julia> findfirst(isequal('o'), "Tutorialspoint")
4

julia> findlast(isequal('o'), "Tutorialspoint")
11


Julia also provides the findnext and findprev functions, which start the search for a character at a given index. The example below shows both functions −

julia> findnext(isequal('o'), "Tutorialspoint", 1)
4
julia> findnext(isequal('o'), "Tutorialspoint", 5)
11
julia> findprev(isequal('o'), "Tutorialspoint", 5)
4
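As a minimal sketch of how findfirst and findnext combine, the loop below collects the index of every occurrence of a character. The helper name `all_indices` is hypothetical, not a Julia built-in.

```julia
# all_indices is a hypothetical helper, not part of Julia's standard library.
# It collects every index at which the character c occurs in s.
function all_indices(c::Char, s::AbstractString)
    found = Int[]
    i = findfirst(isequal(c), s)
    while i !== nothing
        push!(found, i)
        # Resume just after the current match; nextind steps to the
        # next valid string index, which matters for multi-byte characters.
        i = findnext(isequal(c), s, nextind(s, i))
    end
    return found
end

positions = all_indices('o', "Tutorialspoint")

# The search functions return nothing when there is no match.
missing_char = findfirst(isequal('z'), "Tutorialspoint")

println(positions)
println(missing_char === nothing)
```

Checking the result against nothing before using it avoids errors when the character is absent.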


It is also possible to check whether a substring occurs within a string by using the occursin function. The example is given below −

julia> occursin("Julia", "This is, Julia Programming.")
true

julia> occursin("T", "Tutorialspoint")
true

julia> occursin("Z", "Tutorialspoint")
false
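The following sketch places occursin next to a few related checks. The sample sentence is reused from the example above; the variable names are assumptions made for illustration.

```julia
sentence = "This is, Julia Programming."

# occursin answers only whether the substring is present.
has_julia = occursin("Julia", sentence)

# With a substring argument, findfirst returns the range of matching indices.
where_julia = findfirst("Julia", sentence)

# occursin is case-sensitive; lowercasing the haystack is one simple way
# to get a case-insensitive check.
has_julia_ci = occursin("julia", lowercase(sentence))

# startswith and endswith test the two ends of the string.
starts = startswith(sentence, "This")
ends = endswith(sentence, "Programming.")

println(has_julia, " ", where_julia, " ", has_julia_ci, " ", starts, " ", ends)
```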


### The repeat() and join() functions

When working with strings in Julia, repeat and join are two useful functions. The example below shows their use −

julia> repeat("Tutorialspoint.com ", 5)
"Tutorialspoint.com Tutorialspoint.com Tutorialspoint.com Tutorialspoint.com Tutorialspoint.com "

julia> join(["TutorialsPoint","com"], " . ")
"TutorialsPoint . com"

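A few more variations on repeat and join are sketched below. The word list is hypothetical sample data.

```julia
# Hypothetical sample data.
words = ["alpha", "beta", "gamma"]

# join with a single delimiter.
csv_line = join(words, ",")

# An optional third argument gives the delimiter used before the last item.
sentence_list = join(words, ", ", " and ")

# join accepts any iterable; non-string items are converted as they are printed.
numbers = join(1:5, "-")

# repeat(s, n) and s^n both repeat a string n times.
rule = repeat("=", 20)
same_rule = "="^20

println(csv_line)
println(sentence_list)
println(numbers)
println(rule)
```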

## Non-standard String Literals

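A non-standard string literal looks like a regular double-quoted string literal but is immediately prefixed by an identifier, and it may behave differently from a normal string literal. Raw strings, byte arrays, version numbers and regular expressions, described below, are all examples.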

### Raw String Literals

Raw string literals are a useful non-standard string literal. They are written in the form raw"..." and create ordinary String objects whose contents are exactly as entered, with no interpolation or unescaping. The one exception is that quotation marks must still be escaped with a backslash, and backslashes that appear immediately before a quotation mark must themselves be escaped.

### Example

julia> println(raw"\\ \\\"")
\\ \"
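As a small sketch of where raw strings help, compare a Windows-style path written as an ordinary literal and as a raw literal. The path and the `amount` placeholder are made up for illustration.

```julia
# In an ordinary literal every backslash has to be doubled.
escaped = "C:\\temp\\new"

# In a raw literal the backslashes are taken as written, so \t and \n
# are not turned into a tab and a newline.
raw_path = raw"C:\temp\new"

# The dollar sign is also literal: no interpolation takes place,
# so the (hypothetical) variable amount does not need to exist.
no_interp = raw"Total: $amount"

println(escaped == raw_path)
println(raw_path)
println(no_interp)
```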


### Byte Array Literals

Byte array literals, written in the form b"...", are another useful non-standard string literal. They follow these rules −

• ASCII characters and ASCII escapes produce a single byte.

• \x and octal escape sequences produce the byte corresponding to the escape value.

• Unicode escape sequences produce the sequence of bytes encoding that code point in UTF-8.

These rules overlap to some extent: \x and octal escapes below 0x80 (128) are covered by both of the first two rules, and the two rules agree on the result.

### Example

julia> b"DATA\xff\u2200"
8-element Base.CodeUnits{UInt8,String}:
0x44
0x41
0x54
0x41
0xff
0xe2
0x88
0x80


The resulting byte array above is not a valid UTF-8 string, because the byte 0xff never appears in valid UTF-8, as you can see below −

julia> isvalid("DATA\xff\u2200")
false
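The sketch below shows one way to move between byte array literals and strings. Copying into a Vector{UInt8} with collect before building a String is a cautious choice made for this example, not the only possible approach.

```julia
bytes = b"DATA\xff\u2200"

# collect copies the code units into an ordinary Vector{UInt8}.
raw_bytes = collect(bytes)

# The first four bytes are plain ASCII and form a valid string.
# Range indexing makes a copy, so raw_bytes itself is left untouched.
ascii_part = String(raw_bytes[1:4])

# The full sequence contains 0xff, so the resulting String is not valid UTF-8.
whole = String(copy(raw_bytes))
whole_valid = isvalid(whole)

# Going the other way: codeunits exposes the UTF-8 bytes of a string.
utf8_bytes = collect(codeunits("∀"))

println(ascii_part)
println(whole_valid)
println(utf8_bytes)
```

String(v) takes ownership of a Vector{UInt8}, which is why the example passes copies rather than raw_bytes itself.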


### Version Number Literals

Version number literals are another useful non-standard string literal. They are written in the form v"..." and create VersionNumber objects, which follow the specifications of semantic versioning.

### Example

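We can define version-specific behavior by using a statement like the following −

julia> if v"1.0" <= VERSION < v"1.1-"
# do something specific to the 1.0 release series
end

The trailing - in v"1.1-" makes the upper bound lower than any 1.1 pre-release, so pre-release versions of 1.1 are excluded as well.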
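To illustrate why VersionNumber objects are preferable to plain strings for version checks, here is a brief sketch. The version values are arbitrary examples.

```julia
# Arbitrary example versions.
older = v"1.6.7"
newer = v"1.10.0"

# VersionNumber compares component by component, so 6 < 10.
numeric_order = older < newer

# Plain strings compare character by character, so '6' > '1'
# and the ordering comes out the wrong way round.
string_order = "1.6.7" < "1.10.0"

# A VersionNumber can also be built from a string at run time.
parsed = VersionNumber("1.9.3")
parts = (parsed.major, parsed.minor, parsed.patch)

# A pre-release sorts before the corresponding release.
prerelease_first = v"1.0.0-rc1" < v"1.0.0"

println(numeric_order, " ", string_order, " ", parts, " ", prerelease_first)
```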


### Regular Expressions

Julia has Perl-compatible regular expressions (regexes), written in the form r"...", which are related to strings in the following ways −

• Regexes are used to find regular patterns in strings.

• Regexes are themselves input as strings, which are parsed into a state machine that can then be used to search for patterns in strings efficiently.

### Example

julia> r"^\s*(?:#|$)"
r"^\s*(?:#|$)"

julia> typeof(ans)
Regex


We can use occursin as follows to check if a regex matches a string or not −

julia> occursin(r"^\s*(?:#|$)", "not a comment")
false

julia> occursin(r"^\s*(?:#|$)", "# a comment")
true
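Beyond occursin, regexes are commonly used with match, eachmatch and replace. The following is a minimal sketch; the input strings are invented for the example.

```julia
# match returns a RegexMatch object, or nothing if there is no match.
m = match(r"(\d+)-(\d+)", "range 10-20")

# Capture groups can be read by index once we know the match succeeded.
lo, hi = if m !== nothing
    (parse(Int, m[1]), parse(Int, m[2]))
else
    (0, 0)
end

# eachmatch iterates over all non-overlapping matches.
numbers = [x.match for x in eachmatch(r"\d+", "a1b22c333")]

# replace accepts a regex => replacement pair.
masked = replace("a1b22c333", r"\d+" => "#")

# The comment pattern from above also accepts leading whitespace.
is_comment = occursin(r"^\s*(?:#|$)", "   # indented comment")

println(lo, " ", hi)
println(numbers)
println(masked)
println(is_comment)
```

Testing the result of match against nothing before reading captures keeps the code safe when the pattern does not occur.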