How can TensorFlow Text be used to preprocess text data?
TensorFlow Text is a package that can be used with the TensorFlow library. It must be installed explicitly before use (it is distributed on PyPI as 'tensorflow-text'). It can be used to preprocess data for text-based models.
The Keras Sequential API is helpful for building a sequential model, that is, a plain stack of layers where every layer has exactly one input tensor and one output tensor.
A neural network that contains at least one convolutional layer is known as a convolutional neural network (CNN). A CNN can be used to build a learning model.
TensorFlow Text contains a collection of text-related classes and ops that can be used with TensorFlow 2.0. It can be used to preprocess text for sequence modelling.
We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.
import tensorflow as tf
import tensorflow_text as text

print("Converting to UTF-8 encoding")
docs = tf.constant([u'Everything not saved will be lost.'.encode('UTF-16-BE'),
                    u'Sad☹'.encode('UTF-16-BE')])
utf8_docs = tf.strings.unicode_transcode(docs, input_encoding='UTF-16-BE', output_encoding='UTF-8')
Code credit: https://www.tensorflow.org/tutorials/tensorflow_text/intro
Output:
Converting to UTF-8 encoding
The sample strings are first encoded as UTF-16-BE with the help of Python's ‘encode’ method.
The ‘tf.strings.unicode_transcode’ op then transcodes them from UTF-16-BE to UTF-8, the encoding that most TensorFlow Text ops expect.
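To see what the transcoding step does to the underlying bytes, here is a minimal plain-Python sketch that needs no TensorFlow. It is only an analogue of 'tf.strings.unicode_transcode', not the op itself. The helper name 'transcode_to_utf8' and the variable names are hypothetical and chosen for illustration.

```python
# Plain-Python analogue of the transcoding step (no TensorFlow required).
# The same two sample strings, encoded as UTF-16-BE bytes.
docs_utf16 = [
    'Everything not saved will be lost.'.encode('UTF-16-BE'),
    'Sad☹'.encode('UTF-16-BE'),
]


def transcode_to_utf8(raw, input_encoding='UTF-16-BE'):
    """Decode bytes from input_encoding, then re-encode them as UTF-8.

    Hypothetical helper for illustration; it is not part of TensorFlow.
    """
    return raw.decode(input_encoding).encode('UTF-8')


docs_utf8 = [transcode_to_utf8(d) for d in docs_utf16]

# Compare the byte lengths before and after transcoding.
for before, after in zip(docs_utf16, docs_utf8):
    print(len(before), 'bytes in UTF-16-BE ->', len(after), 'bytes in UTF-8:', after)
```

The text itself does not change; only its byte representation does. UTF-16-BE uses two bytes for each of these characters, whereas UTF-8 uses one byte for each ASCII character and three bytes for '☹'.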