Python - Golomb Encoding for b=2^n and b≠2^n


Golomb encoding is a lossless data compression technique for non-negative integers. It works best when small values are much more likely than large ones, as in a geometric distribution. It was introduced by Solomon W. Golomb in 1966 and has been used in applications such as image and video compression, information retrieval, and data storage. In this article, we will explore Golomb encoding and its two cases: when the base is a power of 2 (b = 2^n) and when the base is not a power of 2 (b ≠ 2^n).

Golomb Encoding for b=2^n (Base is power of 2)

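When the base is a power of 2, Golomb encoding becomes simpler, because the remainder always fits in exactly log2(b) bits. This special case is also known as Rice coding. Let's consider an example where b = 4 (2^2).

Steps to find the Golomb encoding of a number when the base is a power of 2: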

Determining the quotient and remainder

To encode a number (n), we first divide it by b and obtain the quotient (q) and the remainder (r). In our example, let's assume n = 12.

n = 12

b = 4

q = n // b # quotient
r = n % b # remainder

So, in our example:

  • q = 12 // 4 = 3

  • r = 12 % 4 = 0

Encoding the Quotient

The quotient (q) is encoded using unary coding, which means it is represented by q consecutive 1s followed by a 0. In our case, q = 3, so the unary encoding of q is "1110" (three 1s followed by a 0).
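
As a small illustration of the unary step, the sketch below builds the codeword for a few quotients. The helper name `unary` is chosen here purely for illustration, and it follows the convention used in this article (q ones followed by a terminating zero); some references swap the roles of 0 and 1.

```python
def unary(q: int) -> str:
    """Return q ones followed by a single terminating zero."""
    if q < 0:
        raise ValueError("q must be non-negative")
    return "1" * q + "0"


# Print the unary codeword for the first few quotients.
for q in range(5):
    print(q, unary(q))
```

The codeword for q is always q + 1 bits long, which is why Golomb encoding only pays off when large quotients are rare.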

Encoding the Remainder

The remainder (r) is encoded using fixed-length binary encoding. Since b = 4 = 2^2, r can only take the values 0 to 3, so exactly log2(b) = 2 bits are needed. In our case, r = 0, which is represented as "00" in binary.

The final Golomb encoding for n = 12 and b = 4 is the concatenation of the unary encoding of q and the binary encoding of r. Therefore, the Golomb encoding for n = 12 is "111000".
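
The power-of-2 case can be sketched in a few lines. Because b = 2^k, the division and modulo can be replaced by a shift and a mask. The function name `golomb_pow2` and the choice to pass the exponent k instead of b are assumptions made for this sketch, not part of the article's implementation.

```python
def golomb_pow2(n: int, k: int) -> str:
    """Golomb codeword of n for base b = 2**k (hypothetical helper)."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    q = n >> k              # same as n // 2**k
    r = n & ((1 << k) - 1)  # same as n % 2**k
    # The remainder takes exactly k bits; for k = 0 (b = 1) it is empty.
    remainder_bits = format(r, "b").zfill(k) if k > 0 else ""
    return "1" * q + "0" + remainder_bits


print(golomb_pow2(12, 2))  # 111000

# Codewords for the first few numbers with b = 4.
for n in range(8):
    print(n, golomb_pow2(n, 2))
```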

Golomb Encoding for b ≠ 2^n (Base is not power of 2)

When the base is not a power of 2, the encoding process becomes slightly more complex, because the b possible remainders no longer fill a whole number of bits. Let's consider an example where b = 7.

Steps to find the Golomb encoding of a number when the base is not a power of 2:

Determining the quotient and remainder

Similar to the previous case, we divide the number (n) by b to obtain the quotient (q) and the remainder (r). Let's assume n = 23.

n = 23

b = 7

q = n // b # quotient
r = n % b # remainder

In our example:

  • q = 23 // 7 = 3

  • r = 23 % 7 = 2

Encoding the Quotient

The quotient (q) is encoded using unary coding, as in the previous case. Therefore, q = 3 will be represented by "1110" in unary.

Encoding the Remainder

The remainder (r) is encoded using truncated binary coding rather than fixed-length binary coding. (Rice coding is the name for the power-of-2 case described earlier, so it does not apply here.) Truncated binary coding gives the smallest remainders a codeword of k−1 bits and the remaining ones a codeword of k bits.

Truncated Binary Calculation

First we calculate k, which is the smallest integer satisfying 2^k ≥ b. In our example, b = 7, so k = 3. We then calculate the cutoff (R) as R = 2^k − b. In this case, R = 2^3 − 7 = 1.

  • If r < R, we encode r using binary coding with k−1 bits.

  • Otherwise, we encode r + R using binary coding with k bits.

In our example, r = 2 and R = 1. Since r ≥ R, we encode r + R = 3 using binary coding with k = 3 bits. Therefore, r = 2 will be represented by "011".

The final Golomb encoding for n = 23 and b = 7 is the concatenation of the unary encoding of q and the encoded remainder r. Hence, the Golomb encoding for n = 23 is "1110011".
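
The rule for the remainder (k−1 bits when r < R, otherwise r + R in k bits) can be sketched as follows. The function name `truncated_binary` and the variable `short_count` (the R of the text) are names chosen for this sketch. Printing every remainder for b = 7 makes it easy to see which codewords are short and that no codeword is a prefix of another.

```python
def truncated_binary(r: int, b: int) -> str:
    """Encode remainder r (0 <= r < b) using k-1 or k bits."""
    if not 0 <= r < b:
        raise ValueError("r must satisfy 0 <= r < b")
    if b == 1:
        return ""                 # only one possible remainder: no bits needed
    k = (b - 1).bit_length()      # smallest k with 2**k >= b
    short_count = (1 << k) - b    # R: how many remainders get k-1 bits
    if r < short_count:
        return format(r, "b").zfill(k - 1)
    return format(r + short_count, "b").zfill(k)


b = 7
for r in range(b):
    print(r, truncated_binary(r, b))
```

When b is a power of 2, `short_count` is 0 and every remainder gets k bits, so the same function reduces to plain fixed-length binary.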

Golomb encoding Implementation

We can implement the logic behind Golomb encoding in Python by creating separate functions for unary encoding and for encoding the remainder. The code implementation of Golomb encoding is shown below.

Example

In the below example, the unary_encoding function performs unary encoding for the given quotient (q). It creates a string of q consecutive "1" followed by a "0" to represent the quotient in unary. The binary_encoding function performs truncated binary encoding for the given remainder (r) and base (b). It calculates k as the smallest integer satisfying 2^k ≥ b and the cutoff as 2^k − b. If r is less than the cutoff, it encodes r using binary with k−1 bits. Otherwise, it encodes r plus the cutoff using binary with k bits. The golomb_encoding function performs Golomb encoding for the given number (n) and base (b). It calculates the quotient (q) and remainder (r) by performing integer division and modulo operation, respectively. The cases covered in the below example are:

  • If the base (b) is a power of 2 (checked using b & (b − 1) == 0), it directly converts the remainder (r) to binary and pads it with zeros to log2(b) bits, computed as b.bit_length() − 1.

  • If the base (b) is not a power of 2, it calls the binary_encoding function to encode the remainder (r) with truncated binary coding.

The encoded Golomb number is obtained by concatenating the unary encoded quotient (q) and the binary encoded remainder (r).

def unary_encoding(q):
    """
    Perform unary encoding for the given quotient (q):
    q ones followed by a terminating zero.
    """
    encoded = "1" * q + "0"
    return encoded


def binary_encoding(r, b):
    """
    Perform truncated binary encoding for the given remainder (r)
    and base (b), where 0 <= r < b.
    """
    k = (b - 1).bit_length()  # smallest k with 2**k >= b
    cutoff = 2 ** k - b       # remainders below this get k - 1 bits
    if r < cutoff:
        encoded = bin(r)[2:].zfill(k - 1)
    else:
        encoded = bin(r + cutoff)[2:].zfill(k)
    return encoded


def golomb_encoding(n, b):
    """
    Perform Golomb encoding for the given number (n) and base (b).
    """
    if n < 0 or b < 1:
        raise ValueError("n must be non-negative and b must be at least 1")

    q = n // b  # quotient
    r = n % b   # remainder

    unary_encoded = unary_encoding(q)
    if b & (b - 1) == 0:  # Check if b is a power of 2
        k = b.bit_length() - 1  # log2(b) bits for the remainder
        binary_encoded = bin(r)[2:].zfill(k) if k > 0 else ""
    else:
        binary_encoded = binary_encoding(r, b)

    encoded = unary_encoded + binary_encoded
    return encoded


# Example usage: the two worked examples from this article
for number, base in [(12, 4), (23, 7)]:
    encoded_number = golomb_encoding(number, base)
    print("Golomb Encoding for number", number, "with base", base, ":", encoded_number)

Output

Golomb Encoding for number 12 with base 4 : 111000
Golomb Encoding for number 23 with base 7 : 1110011
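
One way to sanity-check a Golomb encoder is to decode its output and compare the result with the original number. The sketch below is an illustration only and is separate from the implementation above: the names `golomb_encode` and `golomb_decode` are hypothetical, the encoder is restated compactly so the block is self-contained, and the decoder assumes its input string holds exactly one valid codeword.

```python
def golomb_encode(n: int, b: int) -> str:
    """Compact Golomb encoder: unary quotient + truncated binary remainder."""
    q, r = divmod(n, b)
    k = (b - 1).bit_length()   # smallest k with 2**k >= b
    cutoff = (1 << k) - b      # remainders below this use k - 1 bits
    if b == 1:
        tail = ""
    elif r < cutoff:
        tail = format(r, "b").zfill(k - 1)
    else:
        tail = format(r + cutoff, "b").zfill(k)
    return "1" * q + "0" + tail


def golomb_decode(bits: str, b: int) -> int:
    """Decode a single Golomb codeword produced by golomb_encode."""
    q = bits.index("0")        # count of leading ones = quotient
    if b == 1:
        return q
    pos = q + 1                # first bit after the unary terminator
    k = (b - 1).bit_length()
    cutoff = (1 << k) - b
    # Read k - 1 bits first (an empty slice counts as 0).
    r = int(bits[pos:pos + k - 1] or "0", 2)
    if r >= cutoff:
        # Long codeword: read one more bit, then undo the + cutoff offset.
        r = ((r << 1) | int(bits[pos + k - 1])) - cutoff
    return q * b + r


# Round-trip check over several bases, powers of 2 included.
for b in range(1, 10):
    for n in range(50):
        assert golomb_decode(golomb_encode(n, b), b) == n
print("round trip ok")
```

The decoder reads k−1 bits and only then decides whether one more bit is needed, which works because the short codewords are exactly the values below the cutoff.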

Conclusion

In this article, we discussed Golomb encoding and the two cases associated with it, i.e. b = 2^n and b ≠ 2^n. Golomb encoding is a useful data compression technique for non-negative integers, especially when small values occur far more often than large ones. Understanding Golomb encoding can be valuable when working with compression algorithms, information retrieval systems, and other data-intensive applications.

Updated on: 18-Jul-2023
