How can TensorFlow be used to load CSV data from the abalone dataset?
The abalone dataset can be loaded with Pandas and then used with TensorFlow. The CSV file is hosted on Google Cloud Storage, so the pd.read_csv() function can read it directly from the URL. Because the file has no header row, the column names are specified explicitly through the names argument.
We will be using the abalone dataset, which contains measurements of abalone (a type of sea snail). The goal is to predict the age based on other physical measurements.
Loading the Abalone Dataset
Here's how to load the CSV data from the abalone dataset using TensorFlow and Pandas:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

print("Setting NumPy print options for better readability")
np.set_printoptions(precision=3, suppress=True)

print("Reading the CSV data from Google storage")
# The file has no header row, so the column names are passed via names=
abalone_train = pd.read_csv(
    "https://storage.googleapis.com/download.tensorflow.org/data/abalone_train.csv",
    names=["Length", "Diameter", "Height", "Whole weight",
           "Shucked weight", "Viscera weight", "Shell weight", "Age"])

print("Dataset shape:", abalone_train.shape)
print("\nFirst 5 rows:")
print(abalone_train.head())
Setting NumPy print options for better readability
Reading the CSV data from Google storage
Dataset shape: (3320, 8)

First 5 rows:
   Length  Diameter  Height  Whole weight  Shucked weight  Viscera weight  Shell weight  Age
0   0.435     0.335   0.110         0.334          0.1355          0.0775        0.0965    8
1   0.585     0.450   0.125         0.874          0.3545          0.2075        0.2255   13
2   0.655     0.510   0.160         1.092          0.4930          0.2145        0.2605   16
3   0.545     0.425   0.125         0.768          0.2940          0.1495        0.2600   12
4   0.550     0.440   0.150         0.894          0.3940          0.1695        0.2300   14
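To see why the column names are passed explicitly, the following minimal sketch reads a tiny headerless CSV from an in-memory string instead of the real URL. The two data rows are copied from the sample output; the names sample_csv, COLUMNS, with_names and without_names are illustrative assumptions and not part of the original code.

```python
import io
import pandas as pd

COLUMNS = ["Length", "Diameter", "Height", "Whole weight",
           "Shucked weight", "Viscera weight", "Shell weight", "Age"]

# Two headerless rows copied from the sample output (a stand-in for the real file).
sample_csv = (
    "0.435,0.335,0.110,0.334,0.1355,0.0775,0.0965,8\n"
    "0.585,0.450,0.125,0.874,0.3545,0.2075,0.2255,13\n"
)

# With names=..., pandas treats every line as data.
with_names = pd.read_csv(io.StringIO(sample_csv), names=COLUMNS)

# Without names, pandas takes the first line as the header,
# so one row of measurements is lost and the labels are meaningless.
without_names = pd.read_csv(io.StringIO(sample_csv))

print(with_names.shape)                  # (2, 8)
print(without_names.shape)               # (1, 8)
print(list(without_names.columns)[:3])   # first data values used as column labels
```

Passing header=None alone would also keep every row, but the columns would then be labelled with integers rather than descriptive names.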
Dataset Features
The dataset contains the following columns:
- Length: Longest shell measurement
- Diameter: Perpendicular to length
- Height: With meat in shell
- Whole weight: Whole abalone weight
- Shucked weight: Weight of meat
- Viscera weight: Gut weight after bleeding
- Shell weight: Weight after being dried
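- Age: Target variable. The column holds integer values (the ring count); in the original UCI dataset, the age in years is the number of rings + 1.5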
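Since Age is the target, a typical next step is to separate it from the input columns. The sketch below shows one way to do this, assuming a small stand-in DataFrame named abalone_sample (hypothetical, built from two rows of the sample output) in place of the full abalone_train frame.

```python
import numpy as np
import pandas as pd

# Small stand-in frame; in practice this would be the abalone_train DataFrame.
abalone_sample = pd.DataFrame({
    "Length": [0.435, 0.585],
    "Diameter": [0.335, 0.450],
    "Height": [0.110, 0.125],
    "Whole weight": [0.334, 0.874],
    "Shucked weight": [0.1355, 0.3545],
    "Viscera weight": [0.0775, 0.2075],
    "Shell weight": [0.0965, 0.2255],
    "Age": [8, 13],
})

# Copy first so the original frame keeps its "Age" column.
features = abalone_sample.copy()
labels = features.pop("Age")   # removes "Age" from features and returns it

# Convert the remaining seven numeric columns to a single float array.
features_array = np.array(features, dtype=np.float32)

print(features_array.shape)   # (2, 7)
print(labels.tolist())        # [8, 13]
```

Working on a copy keeps the loaded data intact, which helps if the same frame is inspected or split again later.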
Key Points
- The CSV file is loaded directly from Google's TensorFlow data storage
- Column names are specified explicitly since the file has no headers
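- NumPy print options are configured for more readable array output (they apply to NumPy arrays, not to the Pandas DataFrame printed by head())
- The dataset contains 3,320 samples with 8 columns each: 7 input features and the Age target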
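The effect of the NumPy print options can be illustrated with a short sketch. It uses np.printoptions, the context-manager form that applies the same settings as np.set_printoptions only inside the with block; the array values are made up for illustration.

```python
import numpy as np

# Illustrative values spanning several orders of magnitude.
values = np.array([0.00001234, 1.0, 123.456789])

# Default formatting falls back to scientific notation for this range.
default_text = str(values)

# precision=3 limits the digits after the decimal point;
# suppress=True prints small numbers in fixed-point form instead of scientific notation.
with np.printoptions(precision=3, suppress=True):
    compact_text = str(values)

print(default_text)
print(compact_text)
```

These settings change only how arrays are displayed; the stored values keep their full precision.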
Conclusion
Loading CSV data for use with TensorFlow is straightforward with pd.read_csv(). The abalone dataset is a good example of a regression task, where age is predicted from physical measurements.
