
What are Unicode scripts with respect to TensorFlow and Python?
Every Unicode code point belongs to a single collection of code points known as a script. A character's script helps determine which language the character might belong to. TensorFlow provides the tf.strings.unicode_script method to find which script a given code point uses. The script codes are int32 values that map to International Components for Unicode (ICU) UScriptCode values.
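As a small illustration of the two ideas above (code points and integer script codes), here is a plain-Python sketch that needs no TensorFlow. The `SCRIPT_NAMES` table and both helper functions are hypothetical names introduced for this example; the table lists only the four ICU UScriptCode values that appear in this article, not the full ICU enumeration.

```python
# Hypothetical lookup table: only the four ICU UScriptCode values that
# appear in this article. The full ICU enumeration is much larger.
SCRIPT_NAMES = {
    0: "USCRIPT_COMMON",
    8: "USCRIPT_CYRILLIC",
    17: "USCRIPT_HAN",
    25: "USCRIPT_LATIN",
}


def codepoints(text):
    """Return the Unicode code point of each character in text."""
    return [ord(ch) for ch in text]


def script_name(code):
    """Map an integer script code to its ICU name, if it is in the table."""
    return SCRIPT_NAMES.get(code, "UNKNOWN({})".format(code))


print(codepoints("芸Б"))  # [33464, 1041]
print(script_name(17))    # USCRIPT_HAN
print(script_name(8))     # USCRIPT_CYRILLIC
```

The integers printed by `codepoints` are the kind of values that tf.strings.unicode_script takes as input, and the integers passed to `script_name` are the kind it returns.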
We will now see how to represent Unicode strings in Python and manipulate them using the Unicode equivalents of standard string ops. Script detection is the first step toward separating Unicode strings into tokens.
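To make the idea of splitting on script boundaries concrete, here is a minimal plain-Python sketch. It is not the TensorFlow tutorial's implementation: `split_by_script` is a hypothetical helper, and the script codes are hard-coded here on the assumption that they were produced by tf.strings.unicode_script (25 for Latin letters, 0 for the Common script, which includes the space).

```python
from itertools import groupby


def split_by_script(text, script_codes):
    """Split text into runs of consecutive characters sharing a script code."""
    if len(text) != len(script_codes):
        raise ValueError("need exactly one script code per character")
    tokens = []
    position = 0
    # groupby yields one group per run of equal, adjacent script codes.
    for _, run in groupby(script_codes):
        run_length = len(list(run))
        tokens.append(text[position:position + run_length])
        position += run_length
    return tokens


# Script codes are assumed inputs: 25 is Latin, 0 is Common (the space).
print(split_by_script("What is", [25, 25, 25, 25, 0, 25, 25]))
# ['What', ' ', 'is']
```

A token boundary falls wherever the script code changes, so the space becomes its own token between the two Latin words.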
We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and provides free access to GPUs (Graphics Processing Units). Colaboratory is built on top of Jupyter Notebook.
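    import tensorflow as tf

    # batch_chars_ragged is not defined in the original snippet. The definition
    # below is an assumption taken from the credited TensorFlow tutorial.
    batch_utf8 = [s.encode('UTF-8') for s in
                  ['hÃllo', 'What is the weather tomorrow', 'Göödnight', '😊']]
    batch_chars_ragged = tf.strings.unicode_decode(batch_utf8, input_encoding='UTF-8')

    print("The below represent '芸' and 'Б' respectively")
    # Look up the script of two code points given as integers.
    uscript = tf.strings.unicode_script([33464, 1041])
    print(uscript.numpy())  # [17, 8] == [USCRIPT_HAN, USCRIPT_CYRILLIC]

    print("Applying to multidimensional strings")
    # The same op works on a RaggedTensor of code points.
    print(tf.strings.unicode_script(batch_chars_ragged))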
Code credit: https://www.tensorflow.org/tutorials/load_data/unicode
Output
    The below represent '芸' and 'Б' respectively
    [17 8]
    Applying to multidimensional strings
    <tf.RaggedTensor [[25, 25, 25, 25, 25], [25, 25, 25, 25, 0, 25, 25, 0, 25, 25, 25, 0, 25, 25, 25, 25, 25, 25, 25, 0, 25, 25, 25, 25, 25, 25, 25, 25], [25, 25, 25, 25, 25, 25, 25, 25, 25], [0]]>
Explanation
- Every Unicode code point belongs to a single collection of codepoints that is known as a script.
- A character's script helps determine which language the character could belong to.
- TensorFlow provides the tf.strings.unicode_script operation to find out which script a given codepoint uses.
- The script codes are int32 values that map to International Components for Unicode (ICU) UScriptCode values.
- The tf.strings.unicode_script operation can be applied to multidimensional tf.Tensors or tf.RaggedTensors of codepoints as well.
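The last point can also be tried on a dense tensor. The following is a small sketch, assuming TensorFlow 2.x with eager execution; the variable names and the choice of code points ('芸', 'Б', 'h', and a space) are illustrative only.

```python
import tensorflow as tf

# A dense 2 x 2 tensor of code points: '芸' and 'Б' in the first row,
# 'h' and a space in the second row.
codepoints = tf.constant([[33464, 1041], [104, 32]], dtype=tf.int32)

# unicode_script is applied elementwise, so the result has the same shape.
scripts = tf.strings.unicode_script(codepoints)
print(scripts.shape)
print(scripts.numpy())
```

Each output element is the script code of the code point at the same position, so the result keeps the shape of the input.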
