Understanding Precision and Recall

Precision and recall are fundamental evaluation metrics in machine learning that measure different aspects of a model's performance. Understanding these concepts is crucial for building effective classification models, especially when dealing with imbalanced datasets or when certain types of errors are more costly than others.

Confusion Matrix

Before diving into precision and recall, we need to understand the confusion matrix. It's a table that shows how well a classification model performs by comparing predicted labels with actual labels.

                    Predicted Positive     Predicted Negative
Actual Positive     True Positive (TP)     False Negative (FN)
Actual Negative     False Positive (FP)    True Negative (TN)

The four components of the confusion matrix are:

  • True Positive (TP): Correctly predicted positive samples

  • False Positive (FP): Incorrectly predicted as positive (Type I error)

  • False Negative (FN): Incorrectly predicted as negative (Type II error)

  • True Negative (TN): Correctly predicted negative samples
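The four components can be read directly off scikit-learn's confusion_matrix. A minimal sketch with hypothetical labels (rows are actual classes, columns are predicted classes, negative class first):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for a small binary problem
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

# Row 0 holds the actual negatives (TN, FP); row 1 the actual positives (FN, TP)
cm = confusion_matrix(y_true, y_pred)
print(cm)

# ravel() flattens the 2x2 matrix in the order TN, FP, FN, TP
tn, fp, fn, tp = cm.ravel()
print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=2, FP=1, FN=1, TN=2
```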

Precision

Precision measures the accuracy of positive predictions. It answers the question: "Of all the samples predicted as positive, how many were actually positive?"

Formula: Precision = TP / (TP + FP)

Example

from sklearn.metrics import precision_score, confusion_matrix
import numpy as np

# Sample predictions and actual labels
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]

# Calculate precision
precision = precision_score(y_true, y_pred)
print(f"Precision: {precision:.3f}")

# Manual calculation
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
precision_manual = tp / (tp + fp)
print(f"Manual Precision: {precision_manual:.3f}")
print(f"TP: {tp}, FP: {fp}")
Precision: 0.667
Manual Precision: 0.667
TP: 4, FP: 2

Recall

Recall measures the model's ability to identify all positive samples. It answers: "Of all the actual positive samples, how many did we correctly identify?"

Formula: Recall = TP / (TP + FN)

Example

from sklearn.metrics import recall_score

# Using the same data
recall = recall_score(y_true, y_pred)
print(f"Recall: {recall:.3f}")

# Manual calculation
recall_manual = tp / (tp + fn)
print(f"Manual Recall: {recall_manual:.3f}")
print(f"TP: {tp}, FN: {fn}")
Recall: 0.800
Manual Recall: 0.800
TP: 4, FN: 1

Precision vs Recall Trade-off

There's often a trade-off between precision and recall: tuning the decision threshold to increase one typically decreases the other.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve, precision_score, recall_score

# Generate sample data
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)

# Train model
model = LogisticRegression()
model.fit(X, y)

# Get prediction probabilities for the positive class
y_scores = model.predict_proba(X)[:, 1]

# Calculate the precision-recall curve
precision, recall, thresholds = precision_recall_curve(y, y_scores)

# Show how different thresholds trade precision against recall
for threshold in [0.3, 0.5, 0.7]:
    y_pred_thresh = (y_scores >= threshold).astype(int)
    prec = precision_score(y, y_pred_thresh)
    rec = recall_score(y, y_pred_thresh)
    print(f"Threshold {threshold}: Precision={prec:.3f}, Recall={rec:.3f}")
Threshold 0.3: Precision=0.883, Recall=0.923
Threshold 0.5: Precision=0.906, Recall=0.906
Threshold 0.7: Precision=0.938, Recall=0.857

When to Use Precision vs Recall

Metric     Focus                      Use When                        Example
Precision  Minimize false positives   Cost of false alarms is high    Spam detection, medical diagnosis
Recall     Minimize false negatives   Missing positives is costly     Cancer screening, fraud detection
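In practice, this choice often comes down to picking a decision threshold. As a sketch (reusing the same synthetic data and model as the trade-off example above; the 0.90 target is an illustrative value), you can select the highest threshold that still meets a target recall:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

# Rebuild the same data and model as in the trade-off example
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
model = LogisticRegression().fit(X, y)
y_scores = model.predict_proba(X)[:, 1]

precision, recall, thresholds = precision_recall_curve(y, y_scores)

# thresholds has one fewer entry than precision/recall, so drop the
# final (recall=0, precision=1) point when aligning them
target_recall = 0.90  # illustrative target
ok = recall[:-1] >= target_recall

# Recall falls as the threshold rises, so the largest qualifying
# threshold gives the best precision while still meeting the target
best_threshold = thresholds[ok].max()
print(f"Threshold for recall >= {target_recall}: {best_threshold:.3f}")
```

The same idea works in reverse for a precision target: filter `precision[:-1] >= target` and pick the lowest qualifying threshold to keep recall as high as possible.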

F1-Score: Balancing Precision and Recall

The F1-score combines precision and recall into a single metric using their harmonic mean:

Formula: F1 = 2 x (Precision x Recall) / (Precision + Recall)

from sklearn.metrics import f1_score, precision_score, recall_score

# Calculate F1-score
f1 = f1_score(y_true, y_pred)
print(f"F1-Score: {f1:.3f}")

# Manual calculation (recompute the scalar precision and recall, since the
# trade-off example reassigned those names to arrays)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1_manual = 2 * (precision * recall) / (precision + recall)
print(f"Manual F1-Score: {f1_manual:.3f}")
print(f"Precision: {precision:.3f}, Recall: {recall:.3f}")
F1-Score: 0.727
Manual F1-Score: 0.727
Precision: 0.667, Recall: 0.800
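When you want all three metrics at once, scikit-learn's classification_report prints precision, recall, F1, and support for each class in a single table (shown here with the same example labels used throughout):

```python
from sklearn.metrics import classification_report

y_true = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 1, 1, 0]

# Per-class precision, recall, F1, and support in one table
print(classification_report(y_true, y_pred))
```

Passing output_dict=True instead returns the same numbers as a nested dictionary, which is convenient for logging or programmatic checks.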

Conclusion

Precision focuses on the accuracy of positive predictions, while recall measures the model's ability to find all positive samples. Choose precision when false positives are costly, and recall when missing positive cases is more problematic. The F1-score provides a balanced measure when both metrics are important.

Updated on: 2026-03-27T00:28:05+05:30
