Elixir - Sigils



In this chapter, we explore sigils, a mechanism the language provides for working with textual representations. A sigil starts with the tilde (~) character, followed by a letter that identifies the sigil and then a delimiter; optionally, modifiers can be added after the closing delimiter.

Regex

Regular expressions in Elixir are created with the ~r sigil. We have seen their use in the String chapter. Let us look at another example of how to use a regex in Elixir.

# A regular expression that matches strings which contain "foo" or
# "bar":
regex = ~r/foo|bar/
IO.puts("foo" =~ regex)
IO.puts("baz" =~ regex)

When the above program is run, it produces the following result −

true
false

Sigils support 8 different delimiters −

~r/hello/
~r|hello|
~r"hello"
~r'hello'
~r(hello)
~r[hello]
~r{hello}
~r<hello>

The reason behind supporting different delimiters is that different delimiters can be more suited for different sigils. For example, using parentheses for regular expressions may be a confusing choice as they can get mixed with the parentheses inside the regex. However, parentheses can be handy for other sigils, as we will see in the next section.
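
As a small illustration of why the choice of delimiter matters, the sketch below writes the same pattern twice. The pattern and the sample URL are made up for this example; with slashes as delimiters every slash in the pattern has to be escaped, while braces avoid that.

```elixir
# The same pattern written with two different delimiters.
with_slashes = ~r/https?:\/\//   # each "/" inside the pattern must be escaped
with_braces = ~r{https?://}      # no escaping needed

IO.puts("https://example.com" =~ with_slashes)
IO.puts("https://example.com" =~ with_braces)
```

Both lines print true, so the two forms behave the same way; the second is simply easier to read.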

Elixir supports Perl-compatible regexes and also supports modifiers. You can read more about regexes in the documentation of the Regex module.
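
A modifier is written directly after the closing delimiter. The following minimal sketch uses the i modifier, which makes the match case-insensitive; the sample string is only an illustration.

```elixir
# Without a modifier the match is case-sensitive.
IO.puts("HELLO world" =~ ~r/hello/)   # false

# The "i" modifier after the closing delimiter ignores case.
IO.puts("HELLO world" =~ ~r/hello/i)  # true
```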

Strings, Char lists and Word lists

Other than regexes, Elixir has several more built-in sigils. Let us have a look at three of them.

Strings

The ~s sigil is used to generate strings, like double quotes are. The ~s sigil is useful, for example, when a string contains both double and single quotes −

new_string = ~s(this is a string with "double" quotes, not 'single' ones)
IO.puts(new_string)

When the above program is run, it produces the following result −

this is a string with "double" quotes, not 'single' ones

Char Lists

The ~c sigil is used to generate char lists −

new_char_list = ~c(this is a char list containing 'single quotes')
IO.puts(new_char_list)

When the above program is run, it produces the following result −

this is a char list containing 'single quotes'
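
To make the difference from a string concrete, the short sketch below checks that a char list is an ordinary list of integer code points. The variable name chars is chosen only for this example.

```elixir
chars = ~c(abc)

IO.puts(is_list(chars))                  # true: a char list is a list
IO.puts(chars == [97, 98, 99])           # true: one code point per character
IO.puts(List.to_string(chars) == "abc")  # true: converted back to a string
```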

Word Lists

The ~w sigil is used to generate lists of words (words are just regular strings). Inside the ~w sigil, words are separated by whitespace.

new_word_list = ~w(foo bar bat)
IO.puts(new_word_list)

IO.puts writes the elements of the list one after another without a separator, so when the above program is run, it produces the following result −

foobarbat

The ~w sigil also accepts the c, s and a modifiers (for char lists, strings and atoms, respectively), which specify the data type of the elements of the resulting list −

new_atom_list = ~w(foo bar bat)a
# IO.puts raises an error for a list of atoms, so we use IO.inspect
IO.inspect(new_atom_list)

When the above program is run, it produces the following result −

[:foo, :bar, :bat]
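
The three modifiers can be compared side by side. This is a minimal sketch with illustrative variable names; it uses IO.inspect for the lists so that their structure stays visible.

```elixir
strings = ~w(foo bar bat)s      # same as the default, a list of strings
char_lists = ~w(foo bar bat)c   # a list of char lists
atoms = ~w(foo bar bat)a        # a list of atoms

IO.inspect(strings)  # ["foo", "bar", "bat"]
IO.inspect(atoms)    # [:foo, :bar, :bat]

IO.puts(strings == ~w(foo bar bat))                 # true
IO.puts(char_lists == [~c(foo), ~c(bar), ~c(bat)])  # true
```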

Interpolation and Escaping in Sigils

Besides lowercase sigils, Elixir supports uppercase sigils to deal with escaping characters and interpolation. While both ~s and ~S will return strings, the former allows escape codes and interpolation while the latter does not. Let us consider an example to understand this −

~s(String with escape codes \x26 #{"inter" <> "polation"})
# "String with escape codes & interpolation"
~S(String without escape codes \x26 without #{interpolation})
# "String without escape codes \\x26 without \#{interpolation}"
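
The same lowercase and uppercase distinction applies to the other built-in sigils. As a short sketch, here is the word list sigil in both forms; the variable names are chosen only for this example.

```elixir
name = "bar"

interpolated = ~w(foo #{name})  # lowercase: #{name} is interpolated
literal = ~W(foo #{name})       # uppercase: #{name} is kept as plain text

IO.inspect(interpolated)                 # ["foo", "bar"]
IO.puts(literal == ["foo", "\#{name}"])  # true
```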

Custom Sigils

We can easily create our own custom sigils. In this example, we will create a sigil to convert a string to uppercase.

defmodule CustomSigil do
   def sigil_u(string, []), do: String.upcase(string)
end

import CustomSigil

IO.puts(~u/tutorials point/)

When we run the above code, it produces the following result −

TUTORIALS POINT

First we define a module called CustomSigil, and within that module we create a function called sigil_u. As there is no existing ~u sigil among the built-in sigils, we are free to use it. The _u indicates that we wish to use u as the character after the tilde. The function must take two arguments: the text between the delimiters and a char list holding the modifiers written after the closing delimiter, which is empty when no modifiers are given.
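
The second argument becomes useful once a sigil accepts modifiers. The sketch below is one possible way to do that, not part of the example above: the module IntegerSigil, the ~i sigil and its n modifier are hypothetical names chosen for illustration.

```elixir
defmodule IntegerSigil do
  # Hypothetical ~i sigil: with no modifiers, parse the text as an integer.
  def sigil_i(string, []), do: String.to_integer(string)

  # With the "n" modifier the second argument is the char list [?n];
  # in that case the parsed integer is negated.
  def sigil_i(string, [?n]), do: -String.to_integer(string)
end

import IntegerSigil

IO.puts(~i(42))   # 42
IO.puts(~i(42)n)  # -42
```

Each function clause pattern matches on the modifier list, so a modifier that is not handled raises a FunctionClauseError instead of being silently ignored.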
