C program to compute linear regression

Linear regression is a statistical method used to find the relationship between two variables by fitting a linear equation to observed data. In C programming, we can implement linear regression to find the slope (m) and y-intercept (c) of the line that best fits the data points.

Syntax

y = mx + c
where:
m = (n*Σ(xy) - Σ(x)*Σ(y)) / (n*Σ(x²) - (Σ(x))²)
c = (Σ(y)*Σ(x²) - Σ(x)*Σ(xy)) / (n*Σ(x²) - (Σ(x))²)

Linear Regression Formula

The least squares method gives closed-form expressions for the slope (m) and intercept (c), which share a common denominator (d):

  • Slope (m): Change in y per unit change in x
  • Intercept (c): Value of y when x equals zero
  • Denominator (d): n*Σ(x²) - (Σ(x))², shared by both formulas
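
For readers who want to see where these expressions come from, the following is a brief sketch of the standard least squares derivation. The symbols S, x_i and y_i are notation introduced here for illustration and do not appear in the program.

```latex
% Sum of squared vertical errors for the line y = m x + c
S(m, c) = \sum_{i=1}^{n} (y_i - m x_i - c)^2

% Setting both partial derivatives to zero gives the normal equations
\frac{\partial S}{\partial m} = 0 \;\Rightarrow\; m \sum x_i^2 + c \sum x_i = \sum x_i y_i
\frac{\partial S}{\partial c} = 0 \;\Rightarrow\; m \sum x_i + n c = \sum y_i

% Solving the 2x2 system (for example with Cramer's rule)
d = n \sum x_i^2 - \Bigl(\sum x_i\Bigr)^2
m = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{d}
c = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{d}
```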

Example

The following C program computes the linear regression line for data entered by the user:

#include <stdio.h>

int main(void) {
    int n, i;
    float x, y, m, c, d;
    float sumx = 0, sumxsq = 0, sumy = 0, sumxy = 0;

    printf("Enter the number of data points: ");
    if (scanf("%d", &n) != 1 || n < 2) {
        printf("At least two data points are required.\n");
        return 1;
    }

    printf("\nEnter %d pairs of (x, y) values:\n", n);

    for (i = 0; i < n; i++) {
        printf("Point %d - x: ", i + 1);
        scanf("%f", &x);
        printf("Point %d - y: ", i + 1);
        scanf("%f", &y);

        // Accumulate the four sums used by the formulas
        sumx = sumx + x;
        sumxsq = sumxsq + (x * x);
        sumy = sumy + y;
        sumxy = sumxy + (x * y);
    }

    // Calculate the common denominator
    d = n * sumxsq - sumx * sumx;

    // A zero denominator means all x values are identical
    if (d == 0) {
        printf("\nCannot fit a line: all x values are identical.\n");
        return 1;
    }

    // Calculate slope (m) and intercept (c)
    m = (n * sumxy - sumx * sumy) / d;
    c = (sumy * sumxsq - sumx * sumxy) / d;

    printf("\nLinear Regression Results:\n");
    printf("Slope (m) = %.3f\n", m);
    printf("Y-intercept (c) = %.3f\n", c);
    printf("Linear equation: y = %.3fx + %.3f\n", m, c);

    return 0;
}

Output

Enter the number of data points: 5

Enter 5 pairs of (x, y) values:
Point 1 - x: 1
Point 1 - y: 5
Point 2 - x: 2
Point 2 - y: 6
Point 3 - x: 2
Point 3 - y: 4
Point 4 - x: 3
Point 4 - y: 7
Point 5 - x: 1
Point 5 - y: 1

Linear Regression Results:
Slope (m) = 2.000
Y-intercept (c) = 1.000
Linear equation: y = 2.000x + 1.000
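
The result can be checked by hand. The worked arithmetic below uses the five points from the sample run, (1, 5), (2, 6), (2, 4), (3, 7) and (1, 1), and substitutes their sums into the formulas given earlier.

```latex
\sum x = 1 + 2 + 2 + 3 + 1 = 9
\sum y = 5 + 6 + 4 + 7 + 1 = 23
\sum x^2 = 1 + 4 + 4 + 9 + 1 = 19
\sum xy = 5 + 12 + 8 + 21 + 1 = 47

d = 5 \cdot 19 - 9^2 = 95 - 81 = 14
m = \frac{5 \cdot 47 - 9 \cdot 23}{14} = \frac{235 - 207}{14} = \frac{28}{14} = 2
c = \frac{23 \cdot 19 - 9 \cdot 47}{14} = \frac{437 - 423}{14} = \frac{14}{14} = 1
```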

How It Works

  1. Read n data points (x, y coordinates)
  2. Calculate summation values: Σx, Σy, Σx², Σxy
  3. Apply least squares formulas to find slope and intercept
  4. Display the linear equation y = mx + c
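
The steps above can also be sketched as a reusable function that works on arrays instead of interactive input. The name fit_line, its signature, its return codes and the use of double are assumptions made for this illustration; they are not part of the program above.

```c
#include <stddef.h>

/* Hypothetical helper (illustration only): fits y = m*x + c to n points
 * held in the arrays x and y. Returns 0 on success, or -1 when the fit
 * is undefined (fewer than two points, or a zero denominator). */
int fit_line(const double *x, const double *y, size_t n,
             double *m, double *c)
{
    double sumx = 0.0, sumy = 0.0, sumxsq = 0.0, sumxy = 0.0;

    if (n < 2)
        return -1;

    /* Step 2: accumulate the four sums. */
    for (size_t i = 0; i < n; i++) {
        sumx   += x[i];
        sumy   += y[i];
        sumxsq += x[i] * x[i];
        sumxy  += x[i] * y[i];
    }

    /* Common denominator. The exact comparison with zero is a
     * simplification; with inexact inputs a small tolerance may be
     * more appropriate. */
    double d = (double)n * sumxsq - sumx * sumx;
    if (d == 0.0)
        return -1;

    /* Step 3: least squares formulas for slope and intercept. */
    *m = ((double)n * sumxy - sumx * sumy) / d;
    *c = (sumy * sumxsq - sumx * sumxy) / d;
    return 0;
}
```

Called with the five sample points from the example, this sketch stores 2.0 in m and 1.0 in c. Returning a status code instead of printing keeps the calculation separate from input and output, so the caller decides how to report a failed fit.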

Key Points

  • Linear regression assumes a linear relationship between variables
  • The algorithm minimizes the sum of squared errors
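  • More data points generally give more stable estimates of the slope and intercept
  • The denominator is zero when all x-values are identical (or when only one point is given), so the division is undefined in that case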
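
To make the "sum of squared errors" point concrete, the following minimal sketch computes that quantity for any candidate line. The function name sum_squared_error and its signature are hypothetical and are used here only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (illustration only): sum of squared vertical
 * errors between the points (x[i], y[i]) and the line y = m*x + c. */
double sum_squared_error(const double *x, const double *y, size_t n,
                         double m, double c)
{
    double sse = 0.0;

    for (size_t i = 0; i < n; i++) {
        /* Residual: observed y minus the y predicted by the line. */
        double residual = y[i] - (m * x[i] + c);
        sse += residual * residual;
    }
    return sse;
}
```

For the five sample points from the example, the fitted line y = 2x + 1 gives a sum of squared errors of 10, while another line such as y = x + 2 gives 16. The least squares line is the one for which this value is smallest.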

Conclusion

Linear regression in C helps analyze relationships between variables by calculating the best-fit line. The program computes slope and intercept using mathematical formulas, providing a foundation for predictive analysis.

Updated on: 2026-03-15T14:13:20+05:30
