Represent a Given Set of Points by the Best Possible Straight Line in C++

This tutorial discusses how to represent a set of points by the best possible straight line. We are given the (x, y) values of a set of points, and we need to find the best-fit straight line y = mx + c, so all we need is to find the values of m and c. For example:

Input: no_of_points = 4
x1 = 2, y1 = 3,
x2 = 5, y2 = 6,
x3 = 1, y3 = 3,
x4 = 4, y4 = 5.

Output: m = 0.8, c = 1.85
Explanation: Substituting these values of m and c into y = mx + c gives the straight line that fits all the points most closely, so for any point (xi, yi) the value m * xi + c is close to yi.
Putting the values of m and c in for (x2, y2),
R.H.S : mx + c = 0.8 * 5 + 1.85 = 5.85
L.H.S : y = 6, which is nearly equal to the R.H.S.

Input: no_of_points = 3
x1 = 3, y1 = 6,
x2 = 2, y2 = 4,
x3 = 1, y3 = 3.

Output: m = 1.5, c = 1.33

Approach to Find the Solution

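To solve this problem, we need to find the values of m and c. When there are exactly two points with different x values, there is a unique line that passes through both. But when the number of points is greater than two, a line passing exactly through all of them may or may not exist, so we look for the line that comes closest to all the points.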

Let’s take the number of points to be n.

So we will have n equations, fi = m * xi + c, for i = 1 to n.

For this line to be the best fit, each fi should be equal or close to yi.

Let’s take Z = sigma( ( fi - yi )^2 ), summed over all points; now, we need to make this value minimum. We square the term ( fi - yi ) so that positive and negative differences do not cancel each other.
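To make the quantity Z concrete, here is a minimal sketch that evaluates it for a candidate line. The function name sumSquaredError and the use of std::vector are assumptions made for illustration; they are not part of the program shown later.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns Z, the sum of squared differences between
// the line's value m * x[i] + c and the observed y[i].
double sumSquaredError(const std::vector<double>& x,
                       const std::vector<double>& y,
                       double m, double c) {
    assert(x.size() == y.size());  // every x needs a matching y
    double z = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        double diff = (m * x[i] + c) - y[i];  // fi - yi
        z += diff * diff;                     // squared so signs cannot cancel
    }
    return z;
}
```

For the four points of the first example, calling this with m = 0.8 and c = 1.85 gives a value of about 0.35, while a guessed line such as m = 1, c = 1 gives 1, so the first line is the closer fit.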

For Z to be minimum, it must satisfy,

∂Z / ∂m = 0 and ∂Z / ∂c = 0.
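As a sketch of the intermediate step, substituting fi = m * xi + c into Z and differentiating gives the two conditions in expanded form. The subscripted symbols below are only a typeset form of the fi, xi and yi used above, with Z taken as the sum over all n points.

```latex
\begin{aligned}
Z &= \sum_{i=1}^{n} (m x_i + c - y_i)^2 \\
\frac{\partial Z}{\partial m} &= 2 \sum_{i=1}^{n} x_i \, (m x_i + c - y_i) = 0 \\
\frac{\partial Z}{\partial c} &= 2 \sum_{i=1}^{n} (m x_i + c - y_i) = 0
\end{aligned}
```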

On solving these equations,

sigma(y) = m * sigma(x) + no_of_points * c, and

sigma(xy) = m * sigma(x^2) + c * sigma(x).

Solving these two equations for m and c gives,

m = ( no_of_points * sigma(xy) - sigma(x) * sigma(y) ) / ( no_of_points * sigma(x^2) - ( sigma(x) )^2 ), and

c = ( sigma(y) - m * sigma(x) ) / no_of_points.

So now we have a direct formula to find m and c of the final equation.
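As a quick check, the formulas can be applied by hand to the first example, the points (2, 3), (5, 6), (1, 3), (4, 5) with no_of_points = 4:

```latex
\begin{aligned}
\sum x &= 12, \quad \sum y = 17, \quad \sum xy = 59, \quad \sum x^2 = 46 \\
m &= \frac{4 \cdot 59 - 12 \cdot 17}{4 \cdot 46 - 12^2} = \frac{32}{40} = 0.8 \\
c &= \frac{17 - 0.8 \cdot 12}{4} = \frac{7.4}{4} = 1.85
\end{aligned}
```

This agrees with the output m = 0.8, c = 1.85 given for that input.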

Example

C++ Code for the Above Approach

#include <iostream>
using namespace std;

int main() {
    int X[] = { 3, 2, 1 };
    int Y[] = { 6, 4, 3 };
    int no_of_points = sizeof(X) / sizeof(X[0]);
    // Sums are kept as double so the divisions below are done in floating point.
    double sum_of_X = 0, sum_of_X2 = 0, sum_of_Y = 0, sum_of_XY = 0;
    // Calculating all the terms of the equation.
    for (int i = 0; i < no_of_points; i++) {
        sum_of_X += X[i];
        sum_of_X2 += X[i] * X[i];
        sum_of_Y += Y[i];
        sum_of_XY += X[i] * Y[i];
    }
    // Calculating the values of m and c using the formula.
    double m = (no_of_points * sum_of_XY - sum_of_X * sum_of_Y) /
               (no_of_points * sum_of_X2 - sum_of_X * sum_of_X);
    double c = (sum_of_Y - m * sum_of_X) / no_of_points;
    cout << "m = " << m;
    cout << "\nc = " << c << "\n";
    return 0;
}

Output

m = 1.5
c = 1.33333
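The same formula can also be wrapped in a reusable function. The following is only a sketch under stated assumptions: the name fitLine, the std::vector parameters and the bool return value are illustrative choices, not part of the program above. It also guards the case where the denominator of m is zero, which happens when all x values are equal (a vertical line) or when fewer than two points are given.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fits y = m * x + c by the formula derived above.
// Returns false (leaving m and c untouched) when the denominator is zero,
// i.e. when no line of the form y = m * x + c can be determined.
bool fitLine(const std::vector<double>& x, const std::vector<double>& y,
             double& m, double& c) {
    assert(x.size() == y.size());  // every x needs a matching y
    const double n = static_cast<double>(x.size());
    double sumX = 0.0, sumX2 = 0.0, sumY = 0.0, sumXY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sumX += x[i];
        sumX2 += x[i] * x[i];
        sumY += y[i];
        sumXY += x[i] * y[i];
    }
    // Assumption: an exact comparison is adequate for small integer-valued
    // inputs; noisy floating-point data may need a tolerance instead.
    const double denominator = n * sumX2 - sumX * sumX;
    if (denominator == 0.0) {
        return false;
    }
    m = (n * sumXY - sumX * sumY) / denominator;
    c = (sumY - m * sumX) / n;
    return true;
}
```

Called with the points of the first example, (2, 3), (5, 6), (1, 3), (4, 5), this should reproduce m = 0.8 and c = 1.85 up to floating-point rounding. Called with points that all share the same x value, it returns false instead of dividing by zero.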

Conclusion

In this tutorial, we discussed finding the best-fit straight line to represent a given set of points. We discussed a simple approach: first deriving the formulas for m and c, and then simply applying them. We also discussed a C++ program for this problem; the same approach can be implemented in other programming languages such as C, Java, and Python. We hope you find this tutorial helpful.