Scatter Plot


Introduction

A scatter plot is a type of plot or mathematical diagram that uses Cartesian coordinates to display the values of, typically, two variables for a set of data. It is also called a scatter graph, scatter chart, scattergram, or scatter diagram. The data are shown as a collection of points: the value of one variable determines each point's position on the horizontal axis, and the value of the other variable determines its position on the vertical axis. If the points are coded by colour, shape, or size, an additional variable can be displayed. This article covers what scatter plots are, how they show correlation, and how to construct them.

Definition

A scatter plot, also known as an XY graph, plots pairs of numerical data with one variable on each axis to show the relationship between them. It can be used when both continuous variables are independent of each other, or when one continuous variable is controlled by the researcher and the other depends on it. A parameter that the experimenter systematically increases or decreases is called the control parameter or independent variable, and it is customarily plotted along the horizontal axis. The measured or dependent variable is usually plotted along the vertical axis. If there is no dependent variable, either variable can be placed on either axis, and the scatter plot shows only the degree of correlation between the two variables, not a causal link.

Graphs

Scatter plots present large amounts of data at a glance. They are useful in the following circumstances −

  • There is a large number of data points.

  • Each data point consists of a pair of values.

  • The data are numerical.

Correlation

Correlation is a statistical measure of how two variables move in relation to each other. If the variables are correlated, the points on a scatter plot fall along a line or curve, and the stronger the correlation, the more closely the points hug that line. The scatter diagram is one of the seven basic quality tools, where it is used for cause analysis. It shows how closely two characteristics or variables are related. There are three possible scenarios:

  • Positive Correlation − Two variables are positively correlated when they move in the same direction: one variable increases as the other increases, or decreases as the other decreases.

  • Negative Correlation − Two variables are negatively correlated when they move in opposite directions: one variable rises as the other falls, and vice versa.

  • No Correlation − Zero correlation means there is no relationship between the two variables. In other words, a change in one variable is entirely unrelated to any change in the other.
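The three scenarios can also be told apart numerically. The sketch below assumes the Pearson correlation coefficient as the measure of linear correlation; the data lists, the `describe` helper, and its 0.3 cut-off are illustrative choices made for this example, not values from the text.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, threshold=0.3):
    """Label a coefficient; the threshold is an arbitrary illustrative cut-off."""
    if r > threshold:
        return "positive correlation"
    if r < -threshold:
        return "negative correlation"
    return "no correlation"

# Made-up data, chosen only to illustrate the three cases.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # moves in the same direction as x
falling = [10, 8, 6, 4, 2]   # moves in the opposite direction to x
unrelated = [3, 1, 4, 1, 3]  # no linear trend with x

for name, y in [("rising", rising), ("falling", falling), ("unrelated", unrelated)]:
    print(name, describe(pearson_r(x, y)))
```

A coefficient near +1 or -1 corresponds to points lying close to a straight line, while a coefficient near 0 corresponds to a shapeless cloud of points.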

Uses and Examples

Scatter plots place data points against a horizontal and a vertical axis to show how strongly one variable is related to another. Each row in the data table is represented by a marker whose position is determined by the values in the columns assigned to the X and Y axes. A third variable can be mapped to the size or colour of the markers, adding a further dimension to the plot. The connection between the two plotted variables is called their correlation. The correlation is high if the markers lie close to a straight line. If the markers are spread evenly over the plot, the correlation is negligible or zero.
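Mapping a third variable to marker colour and size might look like the following sketch, which assumes matplotlib is installed. The height, weight, and age values are invented for illustration, and `plot_with_third_variable` is a hypothetical helper name.

```python
import matplotlib.pyplot as plt

# Hypothetical data table: each position in the lists is one row.
height_cm = [150, 160, 165, 172, 180, 185]
weight_kg = [52, 58, 63, 70, 79, 88]
age_years = [18, 25, 31, 40, 47, 55]  # third variable

def plot_with_third_variable(x, y, z, x_label, y_label, z_label):
    """Scatter x against y, encoding z as both marker colour and marker size."""
    fig, ax = plt.subplots()
    # c= maps z onto a colour scale; s= sets marker areas (scaled up for visibility).
    points = ax.scatter(x, y, c=z, s=[4 * v for v in z], cmap="viridis")
    fig.colorbar(points, ax=ax, label=z_label)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    return fig, points

fig, points = plot_with_third_variable(
    height_cm, weight_kg, age_years,
    "Height (cm)", "Weight (kg)", "Age (years)",
)
# Call plt.show() or fig.savefig("third_variable.png") to view the result.
```

Encoding the same variable as both colour and size is redundant by design here; in practice the two channels can carry two different variables.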

Example

The following scatter plot, with a straight trend line, shows the marks out of 10 obtained by students in the Math A and Math B papers.

Solved Examples

1. For the data below, create a scatter plot of the number of games played (x) against the final score (y).

x 2 4 6 7 8 4 2 3 3 7 8
y 30 40 30 40 45 50 30 40 60 90 100

Solution −

To obtain the scatter plot for the above data, we put the number of games played on the x-axis and the scores on the y-axis.
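One way to draw this plot is sketched below, assuming matplotlib is available. The data are the x and y values from the problem; the `draw_scatter` helper and the axis label wording are choices made for this example.

```python
import matplotlib.pyplot as plt

# Data from solved example 1.
games = [2, 4, 6, 7, 8, 4, 2, 3, 3, 7, 8]
scores = [30, 40, 30, 40, 45, 50, 30, 40, 60, 90, 100]

def draw_scatter(x, y, x_label, y_label):
    """Draw one marker per (x, y) pair and label both axes."""
    fig, ax = plt.subplots()
    points = ax.scatter(x, y)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    return fig, points

fig, points = draw_scatter(games, scores, "Games played (x)", "Final score (y)")
# Call plt.show() or fig.savefig("games_vs_scores.png") to view the result.
```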

2. For the data below, create a scatter plot of the daytime temperature (x) in degrees Celsius against the traffic rating (y) out of 10.

x 20 20 25 28 30 32 32 35
y 4 5 5 2 2 2 6 1

Solution −

To obtain the scatter plot for the above data, we put the temperature in degrees Celsius on the x-axis and the traffic rating out of 10 on the y-axis.
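A straight trend line, like the one mentioned in the earlier example, can be fitted to these points. The sketch below assumes an ordinary least-squares fit, which is one common way to obtain a trend line and is not stated in the text; `fit_trend_line` is a hypothetical helper name.

```python
from math import sqrt

# Data from solved example 2.
temperature = [20, 20, 25, 28, 30, 32, 32, 35]
traffic = [4, 5, 5, 2, 2, 2, 6, 1]

def fit_trend_line(xs, ys):
    """Least-squares line y = slope * x + intercept, plus Pearson's r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

slope, intercept, r = fit_trend_line(temperature, traffic)
print(f"trend line: y = {slope:.3f} * x + {intercept:.3f}, r = {r:.3f}")
```

For this data the slope comes out negative (roughly -0.17) and r is about -0.51, which suggests a moderate negative correlation: the traffic rating tends to fall as the temperature rises, although the point (32, 6) sits well above the line.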

Conclusion

  • To look for a relationship between paired numerical data, a scatter diagram graphs the pairs with one variable on each axis. If the variables are correlated, the points fall along a line or curve. The stronger the correlation, the more closely the points hug the line.

  • Scatter plots are used when trying to determine whether two variables are related, for example: when trying to identify the potential root causes of a problem; after listing causes and effects with a fishbone diagram, to verify whether a particular cause and effect are related; when determining whether two effects that appear to be related share the same cause; and when testing for autocorrelation before constructing a control chart.

FAQs

1. What are some typical scatter plot problems?

Overplotting and the mistaken interpretation of correlation as causation are two major problems with scatter plots. Overplotting occurs when there are so many data points that they overlap and individual points can no longer be distinguished.

2. What three things do scatter plots show?

A scatter plot is an X-Y diagram that shows the relationship between two variables. It plots the data points against a horizontal and a vertical axis. Its goal is to show how strongly one variable is associated with the other.

3. What kind of data is ideal for scatter plots?

A scatter chart works best when comparing a large number of data points without regard to time. It is highly effective at showing the relationship between two variables (represented by the x and y axes), such as a person's weight and height.

4. What is the limitation of a scatter plot?

A scatter plot cannot tell you the precise degree of association between the variables, because it does not measure the relationship quantitatively. It only gives a visual impression of how one variable changes with the other.

5. How many variables are shown in a scatter plot?

The association between two numerical variables measured for the same individuals is displayed in a scatterplot. One variable's values are displayed on the horizontal axis, and the other variable's values are displayed on the vertical axis. On the graph, each individual in the data is represented by a point.

Updated on: 27-Feb-2024
