 
 Data Structure Data Structure
 Networking Networking
 RDBMS RDBMS
 Operating System Operating System
 Java Java
 MS Excel MS Excel
 iOS iOS
 HTML HTML
 CSS CSS
 Android Android
 Python Python
 C Programming C Programming
 C++ C++
 C# C#
 MongoDB MongoDB
 MySQL MySQL
 Javascript Javascript
 PHP PHP
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
How can data be normalized to predict the fuel efficiency with Auto MPG dataset using TensorFlow?
Tensorflow is a machine learning framework that is provided by Google. It is an open−source framework used in conjunction with Python to implement algorithms, deep learning applications and much more. The ‘tensorflow’ package can be installed on Windows using the below line of code −
pip install tensorflow
Tensor is a data structure used in TensorFlow. It helps connect edges in a flow diagram. This flow diagram is known as the ‘Data flow graph’. Tensors are nothing but multidimensional array or a list.
The aim behind a regression problem is to predict the output of a continuous or discrete variable, such as a price, probability, whether it would rain or not and so on.
The dataset we use is called the ‘Auto MPG’ dataset. It contains fuel efficiency of 1970s and 1980s automobiles. It includes attributes like weight, horsepower, displacement, and so on. With this, we need to predict the fuel efficiency of specific vehicles.
We are using the Google Colaboratory to run the below code. Google Colab or Colaboratory helps run Python code over the browser and requires zero configuration and free access to GPUs (Graphical Processing Units). Colaboratory has been built on top of Jupyter Notebook.
Following is the code snippet −
Example
print("Separating the label from features")
train_features = train_dataset.copy()
test_features = test_dataset.copy()
train_labels = train_features.pop('MPG')
test_labels = test_features.pop('MPG')
print("The mean and standard deviation of the training dataset : ")
train_dataset.describe().transpose()[['mean', 'std']]
print("Normalize the features since they use different scales")
print("Creating the normalization layer")
normalizer = preprocessing.Normalization()
normalizer.adapt(np.array(train_features))
print(normalizer.mean.numpy())
first = np.array(train_features[3:4])
print("Every feature has been individually normalized")
with np.printoptions(precision=2, suppress=True):
print('First example is :', first)
print()
print('Normalized data :', normalizer(first).numpy())
Code credit − https://www.tensorflow.org/tutorials/keras/regression
Output
Separating the label from features The mean and standard deviation of the training dataset : Normalize the features since they use different scales Creating the normalization layer [ 5.467 193.847 104.135 2976.88 15.591 75.934 0.168 0.197 0.635] Every feature has been individually normalized First example is : [[ 4. 105. 63. 2125. 14.7 82. 0. 0. 1. ]] Normalized data : [[−0.87 −0.87 −1.11 −1.03 −0.33 1.65 −0.45 −0.5 0.76]]
Explanation
- The target value (label) is separated from the features. 
- The label is the value that needs to be trained for the predictions to occur. 
- The features are normalized so that the training is stable. 
- The ‘Normalization’ function in Tensorflow preprocesses the data. 
- A first layer is created, and the mean and variances are stored in this layer. 
- When this layer is called, it returns the input data where every feature has been normalized. 
