
How to encode multiple strings that have the same length using TensorFlow and Python?
Multiple strings of the same length can be encoded by passing a tf.Tensor of Unicode code points as the input value. When strings of varying lengths need to be encoded, a tf.RaggedTensor should be used as the input. If a tensor contains multiple strings in padded or sparse format, it needs to be converted to a tf.RaggedTensor first. Then tf.strings.unicode_encode can be called on it.
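As a minimal sketch of the same-length case described above, the following encodes three strings of three code points each. The variable names `code_points` and `encoded` are illustrative and not part of the original snippet.

```python
import tensorflow as tf

# Every row has the same number of code points, so a regular
# (rectangular) tf.Tensor can be passed directly.
code_points = tf.constant([[99, 97, 116],
                           [100, 111, 103],
                           [99, 111, 119]])

# Each row of code points becomes one UTF-8 encoded string.
encoded = tf.strings.unicode_encode(code_points, output_encoding='UTF-8')

print(encoded.numpy())  # [b'cat' b'dog' b'cow']
```

The result is a string tensor with one fewer dimension than the input, because the innermost dimension (the code points) is collapsed into a single string per row.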
Let us understand how to represent Unicode strings using Python and manipulate them using their Unicode equivalents. The Unicode equivalents of standard string ops can also be used to separate Unicode strings into tokens based on script detection.
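To make the idea of "Unicode equivalents" concrete, here is a small sketch in plain Python (no TensorFlow) that converts strings to lists of code points and back to UTF-8 bytes. The sample words are chosen to match the code points used in this article; the variable names are illustrative.

```python
# Sample words (chosen to match the code points used in this article).
words = ["cat", "dog", "cow"]

# ord() maps each character to its Unicode code point.
code_points = [[ord(ch) for ch in word] for word in words]
print(code_points)  # [[99, 97, 116], [100, 111, 103], [99, 111, 119]]

# chr() maps a code point back to a character; encode() produces UTF-8 bytes.
encoded = ["".join(chr(cp) for cp in row).encode("utf-8") for row in code_points]
print(encoded)  # [b'cat', b'dog', b'cow']
```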
We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, helps run Python code in the browser; it requires zero configuration and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.
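import tensorflow as tf

# Setup assumed from the credited TensorFlow tutorial; the snippet below
# uses these names without defining them.
batch_utf8 = [s.encode('UTF-8') for s in
              [u'hÃllo', u'What is the weather tomorrow', u'Göödnight', u'😊']]
batch_chars_ragged = tf.strings.unicode_decode(batch_utf8, input_encoding='UTF-8')
batch_chars_padded = batch_chars_ragged.to_tensor(default_value=-1)
batch_chars_sparse = batch_chars_ragged.to_sparse()

print("When encoding multiple strings of same lengths, tf.Tensor is used as input")
tf.strings.unicode_encode([[99, 97, 116], [100, 111, 103], [99, 111, 119]],
                          output_encoding='UTF-8')

print("When encoding multiple strings with varying length, a tf.RaggedTensor should be used as input:")
tf.strings.unicode_encode(batch_chars_ragged, output_encoding='UTF-8')

print("If there is a tensor with multiple strings in padded/sparse format, convert it to a tf.RaggedTensor before calling unicode_encode")
# Sparse format: convert to a RaggedTensor first.
tf.strings.unicode_encode(
    tf.RaggedTensor.from_sparse(batch_chars_sparse),
    output_encoding='UTF-8')
# Padded format: strip the -1 padding while converting to a RaggedTensor.
tf.strings.unicode_encode(
    tf.RaggedTensor.from_tensor(batch_chars_padded, padding=-1),
    output_encoding='UTF-8')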
Code credit: https://www.tensorflow.org/tutorials/load_data/unicode
Output
When encoding multiple strings of same lengths, tf.Tensor is used as input
When encoding multiple strings with varying length, a tf.RaggedTensor should be used as input:
If there is a tensor with multiple strings in padded/sparse format, convert it to a tf.RaggedTensor before calling unicode_encode
Explanation
- When encoding multiple strings of the same length, a tf.Tensor can be used as input.
- When encoding multiple strings of varying lengths, a tf.RaggedTensor should be used as input.
- When a tensor holds multiple strings in padded or sparse format, it needs to be converted to a tf.RaggedTensor before unicode_encode is called on it.
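The three cases above can be checked against each other with a small sketch. The sample data here (the words "hi" and "there" as code points) is hypothetical and chosen only for illustration; it is not taken from the original tutorial.

```python
import tensorflow as tf

# Hypothetical sample data: two words of different lengths as code points.
ragged = tf.ragged.constant([[104, 105], [116, 104, 101, 114, 101]])

# Case 1: varying lengths, pass the RaggedTensor directly.
encoded_ragged = tf.strings.unicode_encode(ragged, output_encoding='UTF-8')

# Case 2: padded format. Pad with -1, then strip the padding again
# while converting back to a RaggedTensor.
padded = ragged.to_tensor(default_value=-1)
encoded_padded = tf.strings.unicode_encode(
    tf.RaggedTensor.from_tensor(padded, padding=-1),
    output_encoding='UTF-8')

# Case 3: sparse format. Convert to a RaggedTensor before encoding.
sparse = ragged.to_sparse()
encoded_sparse = tf.strings.unicode_encode(
    tf.RaggedTensor.from_sparse(sparse),
    output_encoding='UTF-8')

print(encoded_ragged.numpy())  # [b'hi' b'there']
print(encoded_padded.numpy())  # [b'hi' b'there']
print(encoded_sparse.numpy())  # [b'hi' b'there']
```

All three routes produce the same strings, since the padded and sparse tensors are only alternative layouts of the same ragged data.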