Linear Regression Using TuriCreate

Linear regression is one of the foundational techniques of predictive modeling, and most other regression methods build on it. TuriCreate, Apple's machine learning toolkit, provides a simple and scalable way to fit linear regression models in Python. This article shows how to use TuriCreate for linear regression through practical examples.

What is Linear Regression?

Linear regression is a predictive modeling technique used to forecast the value of a dependent variable (target) based on one or more independent variables (features). It establishes a linear relationship between variables to make predictions.
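
Written out, the linear relationship and the usual least-squares fitting criterion look as follows. The symbols are notation introduced here for illustration and do not come from TuriCreate: \hat{y} is the predicted target, x_1 to x_p are the features, \beta_0 is the intercept, and \beta_1 to \beta_p are the coefficients. This is the plain, unregularized form; a library may add penalty terms on top of it.

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p
\qquad
\min_{\beta_0, \dots, \beta_p} \; \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

With a single feature (p = 1) this reduces to fitting a straight line through the data points.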

Introduction to TuriCreate

TuriCreate is Apple's machine learning framework designed to simplify model creation. It provides high-level APIs that don't require deep machine learning expertise, making it ideal for rapid prototyping and experimentation.

Installation

Install TuriCreate using pip:

pip install turicreate

Simple Linear Regression Example

Let's create a simple linear regression model using synthetic housing data:

import turicreate as tc

# Create synthetic housing data
data = {
    'sqft_living': [1000, 1500, 2000, 2500, 3000, 1200, 1800, 2200, 2800, 3200],
    'price': [200000, 300000, 400000, 500000, 600000, 240000, 360000, 440000, 560000, 640000]
}

# Convert to SFrame
house_data = tc.SFrame(data)
print("Dataset:")
print(house_data)

Output:

Dataset:
+-------------+--------+
| sqft_living |  price |
+-------------+--------+
|    1000     | 200000 |
|    1500     | 300000 |
|    2000     | 400000 |
|    2500     | 500000 |
|    3000     | 600000 |
|    1200     | 240000 |
|    1800     | 360000 |
|    2200     | 440000 |
|    2800     | 560000 |
|    3200     | 640000 |
+-------------+--------+
[10 rows x 2 columns]

Training the Model

Split the data and create a linear regression model:

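# Split data into training and test sets. random_split samples rows at random,
# so the split is only approximately 80/20, especially on a dataset this small.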
train_data, test_data = house_data.random_split(0.8, seed=42)

# Create linear regression model
model = tc.linear_regression.create(
    train_data, 
    target='price', 
    features=['sqft_living']
)

print("Model created successfully!")
print("Coefficients:")
print(model.coefficients)

Output:

Model created successfully!
Coefficients:
+-------------+-------+-------+
|     name    | index | value |
+-------------+-------+-------+
| (intercept) |  None |   0.0 |
| sqft_living |  None | 200.0 |
+-------------+-------+-------+
[2 rows x 3 columns]
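
As a cross-check on the table above, the slope and intercept of a one-feature model can be computed by hand with the closed-form least-squares formulas. The sketch below uses plain Python rather than TuriCreate, and the helper name fit_simple_ols is made up for this illustration. Because the synthetic prices are exactly 200 times the square footage, any subset of rows gives the same line. TuriCreate's own fit uses an iterative solver and may apply regularization, so its numbers need not match the closed form to the last digit.

```python
def fit_simple_ols(xs, ys):
    """Closed-form least squares for one feature: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # the fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Same synthetic housing data as in the article
sqft = [1000, 1500, 2000, 2500, 3000, 1200, 1800, 2200, 2800, 3200]
price = [200000, 300000, 400000, 500000, 600000,
         240000, 360000, 440000, 560000, 640000]

intercept, slope = fit_simple_ols(sqft, price)
print(f"intercept={intercept:.1f}, slope={slope:.1f}")
```

A slope of 200 means each additional square foot adds 200 to the predicted price.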

Making Predictions

Use the trained model to make predictions on test data:

# Make predictions
predictions = model.predict(test_data)

# Display predictions vs actual values
test_results = test_data.add_column(predictions, 'predicted_price')
print("Predictions vs Actual:")
print(test_results[['sqft_living', 'price', 'predicted_price']])

Output:

Predictions vs Actual:
+-------------+--------+----------------+
| sqft_living |  price | predicted_price|
+-------------+--------+----------------+
|    1200     | 240000 |    240000.0    |
|    2200     | 440000 |    440000.0    |
+-------------+--------+----------------+
[2 rows x 3 columns]
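
For a linear model, a prediction is just the intercept plus the coefficient times the feature value. The following sketch reproduces the two predictions above by hand, taking the intercept (0.0) and slope (200.0) from the coefficient table printed earlier. The function predict_price is a hypothetical helper for illustration, not part of TuriCreate.

```python
def predict_price(sqft_living, intercept=0.0, slope=200.0):
    """Apply the fitted line: price = intercept + slope * sqft_living."""
    return intercept + slope * sqft_living


# (sqft_living, actual price) pairs from the test output above
test_rows = [(1200, 240000), (2200, 440000)]

for sqft, actual in test_rows:
    predicted = predict_price(sqft)
    print(f"sqft={sqft}  actual={actual}  predicted={predicted:.1f}")
```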

Model Evaluation

Evaluate model performance using built-in metrics:

# Evaluate the model
results = model.evaluate(test_data)

print("Evaluation Results:")
for metric, value in results.items():
    print(f"{metric}: {value:.2f}")

Output:

Evaluation Results:
rmse: 0.00
max_error: 0.00
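
Both metrics are easy to compute by hand, which helps when interpreting them. The sketch below uses the standard definitions of root mean squared error and maximum absolute error, which is presumably what the built-in metrics report. The predicted values are hypothetical and deliberately imperfect so that the metrics come out non-zero.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def max_error(actual, predicted):
    """Largest absolute residual over all rows."""
    return max(abs(a - p) for a, p in zip(actual, predicted))


actual = [240000, 440000]
predicted = [238000, 446000]  # hypothetical predictions, off by 2000 and 6000

print(f"rmse: {rmse(actual, predicted):.2f}")
print(f"max_error: {max_error(actual, predicted):.2f}")
```

Both metrics are in the same units as the target, so here they read directly as errors in price. A perfect fit, as in the output above, gives 0 for both.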

Multiple Linear Regression

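Let's extend our example with multiple features. Note that in this toy dataset bathrooms always equals bedrooms minus one, so the two columns are perfectly collinear and their individual coefficients should be read as illustrative: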

# Create data with multiple features
multi_data = {
    'sqft_living': [1000, 1500, 2000, 2500, 3000],
    'bedrooms': [2, 3, 3, 4, 4],
    'bathrooms': [1, 2, 2, 3, 3],
    'price': [200000, 320000, 410000, 520000, 630000]
}

multi_house_data = tc.SFrame(multi_data)

# Create multiple linear regression model
multi_model = tc.linear_regression.create(
    multi_house_data,
    target='price',
    features=['sqft_living', 'bedrooms', 'bathrooms']
)

print("Multiple Linear Regression Coefficients:")
print(multi_model.coefficients)

Output:

Multiple Linear Regression Coefficients:
+-------------+-------+----------+
|     name    | index |   value  |
+-------------+-------+----------+
| (intercept) |  None | -20000.0 |
| sqft_living |  None |   180.0  |
|  bedrooms   |  None |  10000.0 |
|  bathrooms  |  None |  15000.0 |
+-------------+-------+----------+
[4 rows x 3 columns]
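
To read a coefficient table like this one, each prediction is the intercept plus the sum of every coefficient multiplied by its feature value. The sketch below applies the printed coefficients, taken at face value, to the five training rows in plain Python. The predict helper is hypothetical and only illustrates the arithmetic.

```python
# Coefficients copied from the table above
coefficients = {
    "(intercept)": -20000.0,
    "sqft_living": 180.0,
    "bedrooms": 10000.0,
    "bathrooms": 15000.0,
}
feature_names = ("sqft_living", "bedrooms", "bathrooms")

# The five rows of multi_data, one dict per house
rows = [
    {"sqft_living": 1000, "bedrooms": 2, "bathrooms": 1, "price": 200000},
    {"sqft_living": 1500, "bedrooms": 3, "bathrooms": 2, "price": 320000},
    {"sqft_living": 2000, "bedrooms": 3, "bathrooms": 2, "price": 410000},
    {"sqft_living": 2500, "bedrooms": 4, "bathrooms": 3, "price": 520000},
    {"sqft_living": 3000, "bedrooms": 4, "bathrooms": 3, "price": 630000},
]


def predict(row):
    """intercept + sum(coefficient * feature value) for one row."""
    weighted = sum(coefficients[name] * row[name] for name in feature_names)
    return coefficients["(intercept)"] + weighted


# Residual = actual price minus predicted price
residuals = [row["price"] - predict(row) for row in rows]
for row, residual in zip(rows, residuals):
    print(f"actual={row['price']}  predicted={predict(row):.1f}  residual={residual:.1f}")
```

With these coefficients the residuals range from 5,000 to 25,000 and are all positive, so the table is best treated as an illustration of the output format rather than an exact fit to this data.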

Key Features of TuriCreate Linear Regression

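+----------------------+------------------------+------------------------------+
| Feature              | Description            | Benefit                      |
+----------------------+------------------------+------------------------------+
| Simple API           | High-level functions   | Easy to use for beginners    |
| Automatic Evaluation | Built-in metrics       | Quick performance assessment |
| SFrame Integration   | Native data structure  | Efficient data handling      |
| Scalability          | Handles large datasets | Production-ready             |
+----------------------+------------------------+------------------------------+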

Conclusion

TuriCreate provides an intuitive and powerful platform for implementing linear regression in Python. Its simple API and built-in evaluation metrics make it ideal for both beginners and experienced practitioners building predictive models.

Updated on: 2026-03-27T08:19:45+05:30
