- Trending Categories
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
Physics
Chemistry
Biology
Mathematics
English
Economics
Psychology
Social Studies
Fashion Studies
Legal Studies
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How can Tensorflow be used in the conversion between different string representations?
The encoded string scalar can be converted to a vector of code points using the ‘decode’ method. The vector of code points can be converted to an encoded string scalar using the ‘encode’ method. The encoded string scalar can be converted to a different encoding using the ‘transcode’ method.
Read More: What is TensorFlow and how Keras work with TensorFlow to create Neural Networks?
Let us understand how to represent Unicode strings using Python, and manipulate those using Unicode equivalents. First, we separate the Unicode strings into tokens based on script detection with the help of the Unicode equivalents of standard string ops.
We are using the Google Colaboratory to run the below code. Google Colab or Colaboratory helps run Python code over the browser and requires zero configuration and free access to GPUs (Graphical Processing Units). Colaboratory has been built on top of Jupyter Notebook.
print("Converting encoded string scalar to a vector of code points") tf.strings.unicode_decode(text_utf8,input_encoding='UTF-8') print("Converting vector of code points to an encoded string scalar") tf.strings.unicode_encode(text_chars, output_encoding='UTF-8') print("Converting encoded string scalar to a different encoding") tf.strings.unicode_transcode(text_utf8, input_encoding='UTF8', output_encoding='UTF-16-BE')
Code credit: https://www.tensorflow.org/tutorials/load_data/unicode
Output
Converting encoded string scalar to a vector of code points Converting vector of code points to an encoded string scalar Converting encoded string scalar to a different encoding <tf.Tensor: shape=(), dtype=string, numpy=b'\x8b\xed\x8a\x00Y\x04t\x06'>
Explanation
- The function 'unicode_decode' is used to convert encoded string scalar to vector of code points.
- The function 'unicode_encode' is used to convert vector of code points to an encoded string scalar.
- The function 'unicode_transcode' is used to convert encoded string scalar to a different encoding.