Linear Regression using TuriCreate
Linear regression is one of the foundational techniques of predictive modeling. TuriCreate, Apple's machine learning toolkit, provides a simple and scalable way to fit linear regression models in Python. This article walks through simple and multiple linear regression with TuriCreate using small worked examples.
What is Linear Regression?
Linear regression is a predictive modeling technique used to forecast the value of a dependent variable (target) based on one or more independent variables (features). It establishes a linear relationship between variables to make predictions.
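To make the idea of a "linear relationship" concrete, simple linear regression fits a line of the form target = intercept + slope × feature by minimizing the squared prediction error. The following is a minimal pure-Python sketch of the closed-form least-squares fit. The function name `fit_simple_linear_regression` and the sample numbers are illustrative assumptions, and this is not a description of how TuriCreate computes its fit internally.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data: price is exactly 200 times the living area
sqft = [1000, 1500, 2000, 2500, 3000]
price = [200000, 300000, 400000, 500000, 600000]

intercept, slope = fit_simple_linear_regression(sqft, price)
print(f"intercept={intercept:.1f}, slope={slope:.1f}")
```

Because the sample data lies exactly on a line through the origin, the fit recovers a slope of 200 and an intercept of 0. Real data is noisy, so the fitted line will only approximate the points.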
Introduction to TuriCreate
TuriCreate is Apple's machine learning framework designed to simplify model creation. It provides high-level APIs that don't require deep machine learning expertise, making it ideal for rapid prototyping and experimentation.
Installation
Install TuriCreate using pip:
pip install turicreate
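After installing, it can help to confirm that the package is visible to the Python interpreter you intend to use. The sketch below relies only on the standard library. The helper name `is_installed` is a hypothetical convenience, not part of TuriCreate.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be located for import."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("turicreate"):
    print("turicreate is available")
else:
    print("turicreate not found; run `pip install turicreate` in this environment")
```

If the package is reported as missing right after a successful install, pip and the interpreter are probably pointing at different environments.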
Simple Linear Regression Example
Let's create a simple linear regression model using synthetic housing data:
import turicreate as tc
# Create synthetic housing data
data = {
    'sqft_living': [1000, 1500, 2000, 2500, 3000, 1200, 1800, 2200, 2800, 3200],
    'price': [200000, 300000, 400000, 500000, 600000, 240000, 360000, 440000, 560000, 640000]
}
# Convert to SFrame
house_data = tc.SFrame(data)
print("Dataset:")
print(house_data)
Dataset:
+-------------+--------+
| sqft_living | price  |
+-------------+--------+
|     1000    | 200000 |
|     1500    | 300000 |
|     2000    | 400000 |
|     2500    | 500000 |
|     3000    | 600000 |
|     1200    | 240000 |
|     1800    | 360000 |
|     2200    | 440000 |
|     2800    | 560000 |
|     3200    | 640000 |
+-------------+--------+
[10 rows x 2 columns]
Training the Model
Split the data and create a linear regression model:
# Split data into training and test sets.
# Rows are assigned at random, so the split is only approximately 80/20.
train_data, test_data = house_data.random_split(0.8, seed=42)
# Create linear regression model
model = tc.linear_regression.create(
    train_data,
    target='price',
    features=['sqft_living']
)
print("Model created successfully!")
print("Coefficients:", model.coefficients)
Model created successfully!
Coefficients:
+-------------+-------+-------+
|     name    | index | value |
+-------------+-------+-------+
| (intercept) |  None |  0.0  |
| sqft_living |  None | 200.0 |
+-------------+-------+-------+
[2 rows x 3 columns]
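The coefficients table can be read as the equation price = intercept + slope × sqft_living. As a minimal sketch, the function below hard-codes the two values printed above as default arguments. The name `predict_price` is a hypothetical helper for illustration, not a TuriCreate API.

```python
def predict_price(sqft_living, intercept=0.0, slope=200.0):
    """Apply the fitted line by hand: intercept + slope * feature.

    The defaults are the coefficient values shown in the article's output
    and are assumed here for illustration.
    """
    return intercept + slope * sqft_living


for sqft in (1200, 2200):
    print(sqft, predict_price(sqft))
```

Each additional square foot adds the slope (here 200) to the predicted price, and the intercept is the prediction for a living area of zero.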
Making Predictions
Use the trained model to make predictions on test data:
# Make predictions
predictions = model.predict(test_data)
# Display predictions vs actual values
test_results = test_data.add_column(predictions, 'predicted_price')
print("Predictions vs Actual:")
print(test_results[['sqft_living', 'price', 'predicted_price']])
Predictions vs Actual:
+-------------+--------+-----------------+
| sqft_living | price  | predicted_price |
+-------------+--------+-----------------+
|     1200    | 240000 |     240000.0    |
|     2200    | 440000 |     440000.0    |
+-------------+--------+-----------------+
[2 rows x 3 columns]
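Which rows end up in the test set depends on the random split. `random_split` is documented as producing a first part of approximately the requested fraction, so with only ten rows the test set is not guaranteed to hold exactly two. The sketch below shows one way such a probabilistic split can behave, using only the standard library. The function `bernoulli_split` is a hypothetical stand-in and does not reproduce TuriCreate's internal sampling or its seed behavior.

```python
import random


def bernoulli_split(rows, fraction, seed):
    """Send each row to the first list with probability `fraction`."""
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        if rng.random() < fraction:
            first.append(row)
        else:
            second.append(row)
    return first, second


rows = list(range(10))
train_rows, test_rows = bernoulli_split(rows, 0.8, seed=42)
print("train size:", len(train_rows), "test size:", len(test_rows))
```

Fixing the seed makes the split repeatable, which is why the examples pass `seed=42`. On very small datasets it is worth printing the sizes of both parts before training.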
Model Evaluation
Evaluate model performance using built-in metrics:
# Evaluate the model
results = model.evaluate(test_data)
print("Evaluation Results:")
for metric, value in results.items():
    print(f"{metric}: {value:.2f}")
Evaluation Results:
rmse: 0.00
max_error: 0.00
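The two reported metrics are easy to compute by hand, which helps when sanity-checking a model. RMSE is the square root of the mean squared error, and max_error is the largest absolute difference between an actual and a predicted value. The following is a small pure-Python sketch. The function names and the `noisy_predictions` values are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def max_error(actual, predicted):
    """Largest absolute difference between actual and predicted values."""
    return max(abs(a - p) for a, p in zip(actual, predicted))


actual = [240000, 440000]

# Perfect predictions, as in the output above: both metrics are zero
perfect_predictions = [240000.0, 440000.0]
print(rmse(actual, perfect_predictions), max_error(actual, perfect_predictions))

# Hypothetical imperfect predictions, each off by 10,000
noisy_predictions = [250000.0, 430000.0]
print(rmse(actual, noisy_predictions), max_error(actual, noisy_predictions))
```

Both metrics are in the units of the target (here, price), so an RMSE of 10,000 means predictions are typically off by about that amount. A score of exactly zero is only expected on noise-free synthetic data like this example.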
Multiple Linear Regression
Let's extend our example with multiple features:
# Create data with multiple features
multi_data = {
    'sqft_living': [1000, 1500, 2000, 2500, 3000],
    'bedrooms': [2, 3, 3, 4, 4],
    'bathrooms': [1, 2, 2, 3, 3],
    'price': [200000, 320000, 410000, 520000, 630000]
}
multi_house_data = tc.SFrame(multi_data)
# Create multiple linear regression model
multi_model = tc.linear_regression.create(
    multi_house_data,
    target='price',
    features=['sqft_living', 'bedrooms', 'bathrooms']
)
print("Multiple Linear Regression Coefficients:")
print(multi_model.coefficients)
Multiple Linear Regression Coefficients:
+-------------+-------+----------+
|     name    | index |  value   |
+-------------+-------+----------+
| (intercept) |  None | -20000.0 |
| sqft_living |  None |  180.0   |
|   bedrooms  |  None | 10000.0  |
|  bathrooms  |  None | 15000.0  |
+-------------+-------+----------+
[4 rows x 3 columns]
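One caveat when reading these coefficients: in this sample data, bathrooms always equals bedrooms minus 1. Together with the intercept, the two columns are perfectly collinear, so more than one set of coefficients produces exactly the same predictions, and the individual bedroom and bathroom values should be interpreted with caution. The pure-Python sketch below checks the relationship and compares the coefficients printed above with an alternative set. The second set (`alternative`) and the helper `predict` are assumptions derived by hand for illustration, not TuriCreate output.

```python
sqft_living = [1000, 1500, 2000, 2500, 3000]
bedrooms = [2, 3, 3, 4, 4]
bathrooms = [1, 2, 2, 3, 3]

# If bedrooms - bathrooms is constant, the columns are collinear with the intercept
offsets = {bed - bath for bed, bath in zip(bedrooms, bathrooms)}
print("collinear with intercept:", len(offsets) == 1)


def predict(coefs):
    """Apply (intercept, sqft, bedrooms, bathrooms) coefficients to every row."""
    b0, b_sqft, b_bed, b_bath = coefs
    return [
        b0 + b_sqft * s + b_bed * bed + b_bath * bath
        for s, bed, bath in zip(sqft_living, bedrooms, bathrooms)
    ]


printed = (-20000, 180, 10000, 15000)      # values from the table above
alternative = (-35000, 180, 25000, 0)      # hand-derived using bathrooms = bedrooms - 1

print(predict(printed) == predict(alternative))
```

In practice this means adding a feature that duplicates information already present does not improve the model. Regularization (such as an L2 penalty) or dropping one of the redundant columns gives more stable coefficients.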
Key Features of TuriCreate Linear Regression
| Feature | Description | Benefit |
|---|---|---|
| Simple API | High-level functions | Easy to use for beginners |
| Automatic Evaluation | Built-in metrics | Quick performance assessment |
| SFrame Integration | Native data structure | Efficient data handling |
| Scalability | Handles large datasets | Production-ready |
Conclusion
TuriCreate provides an intuitive and powerful platform for implementing linear regression in Python. Its simple API and built-in evaluation metrics make it ideal for both beginners and experienced practitioners building predictive models.
