# What are interval-scaled variables?

Interval-scaled variables are continuous measurements on a roughly linear scale. Examples include weight and height, latitude and longitude coordinates (e.g., when clustering homes), and weather temperature. The measurement unit used can influence the clustering analysis.

For instance, changing measurement units from meters to inches for height, or from kilograms to pounds for weight, can lead to a very different clustering structure. In general, expressing a variable in smaller units leads to a larger range for that variable, and therefore a larger effect on the resulting clustering structure.
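The effect of the measurement unit can be illustrated with a minimal sketch. The three records below are made-up values, and `euclidean` and `nearest_neighbor` are hypothetical helper names chosen for this example only.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def nearest_neighbor(name, objects):
    """Return the name of the object closest to objects[name]."""
    others = (other for other in objects if other != name)
    return min(others, key=lambda other: euclidean(objects[name], objects[other]))

# Made-up records: (height in meters, weight in kilograms).
people_m = {"A": (1.60, 60.0), "B": (1.90, 61.0), "C": (1.62, 66.0)}

# The same people with height expressed in centimeters instead.
people_cm = {name: (h * 100, w) for name, (h, w) in people_m.items()}

print(nearest_neighbor("A", people_m))   # height in meters
print(nearest_neighbor("A", people_cm))  # height in centimeters
```

With height in meters, the height differences are numerically tiny, so weight dominates and A is closest to B. With height in centimeters, the 30 cm gap between A and B dominates and A is closest to C, even though the people themselves have not changed.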

To help avoid dependence on the choice of measurement units, the data should be standardized. Standardizing measurements attempts to give all variables an equal weight. This is especially useful when there is no prior knowledge of the data. In some applications, however, users may intentionally want to give more weight to a certain set of variables than to others. For instance, when clustering basketball player candidates, one may prefer to give more weight to the variable height.

To standardize measurements, one choice is to convert the original measurements to unitless variables. Given measurements for a variable f, this can be performed as follows −

Calculate the mean absolute deviation, s_{f} −

$$\mathrm{s_{f}\:=\:\frac{1}{n}(|x_{1f}-m_{f}|+|x_{2f}-m_{f}|+\cdot\cdot\cdot+|x_{nf}-m_{f}|)}$$

where x_{1f} … x_{nf} are n measurements of f, and m_{f} is the mean value of f, that is, $\mathrm{m_{f}\:=\:\frac{1}{n}(x_{1f}+x_{2f}+\cdot\cdot\cdot+x_{nf})}$

Calculate the standardized measurement, or z-score −

$$\mathrm{z_{if}\:=\:\frac{x_{if}-m_{f}}{s_{f}}}$$
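The two steps above can be sketched in plain Python. The function names `mean_absolute_deviation` and `standardize` and the sample heights are assumptions made for illustration, not part of any particular library.

```python
def mean_absolute_deviation(values):
    """s_f: the average absolute deviation of the values from their mean m_f."""
    n = len(values)
    m_f = sum(values) / n
    return sum(abs(x - m_f) for x in values) / n

def standardize(values):
    """Return the z-scores z_if = (x_if - m_f) / s_f for one variable f."""
    m_f = sum(values) / len(values)
    s_f = mean_absolute_deviation(values)
    if s_f == 0:
        # All measurements are identical, so the z-score is undefined.
        raise ValueError("cannot standardize a constant variable")
    return [(x - m_f) / s_f for x in values]

# Made-up heights, first in centimeters and then in meters.
heights_cm = [160.0, 170.0, 180.0, 190.0]
heights_m = [h / 100 for h in heights_cm]

print(standardize(heights_cm))
print(standardize(heights_m))
```

Both calls give the same z-scores up to floating-point rounding, which is the point of the transformation: the standardized values are unitless and no longer depend on whether height was recorded in centimeters or meters.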

The mean absolute deviation, s_{f}, is more robust to outliers than the standard deviation, $\mathrm{\sigma_{f}}$. When computing the mean absolute deviation, the deviations from the mean $\mathrm{(|x_{if}-m_{f}|)}$ are not squared, so the effect of outliers is somewhat reduced.

There are more robust measures of dispersion, such as the median absolute deviation. The benefit of using the mean absolute deviation, however, is that the z-scores of outliers do not become too small; therefore, the outliers remain detectable.
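A small numeric comparison may make this concrete. The data below are made up, and the use of the population standard deviation (`statistics.pstdev`) for $\mathrm{\sigma_{f}}$ is an assumption, since the text does not say which variant is meant.

```python
import statistics

def mean_absolute_deviation(values):
    """s_f: the average absolute deviation of the values from their mean."""
    m_f = sum(values) / len(values)
    return sum(abs(x - m_f) for x in values) / len(values)

# Made-up measurements with one obvious outlier at the end.
values = [10.0, 11.0, 12.0, 13.0, 100.0]
m_f = sum(values) / len(values)

s_f = mean_absolute_deviation(values)   # deviations are not squared
sigma_f = statistics.pstdev(values)     # deviations are squared (assumed population form)

# z-score of the outlier under each choice of spread measure.
z_with_s = (values[-1] - m_f) / s_f
z_with_sigma = (values[-1] - m_f) / sigma_f

print(s_f, sigma_f)
print(z_with_s, z_with_sigma)
```

Because squaring inflates the outlier's contribution, the standard deviation comes out larger than the mean absolute deviation on this data, and the outlier's z-score is correspondingly smaller when dividing by it. Dividing by s_{f} leaves the outlier standing out more clearly.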

Standardization may or may not be helpful in a specific application. The choice of whether and how to perform standardization should therefore be left to the user. After standardization, or without standardization in certain applications, the dissimilarity (or similarity) between the objects described by interval-scaled variables is generally computed based on the distance between each pair of objects.

The most popular distance measure is Euclidean distance, which is defined as

$$\mathrm{d(i, j)=\sqrt{(x_{i1}-x_{j1})^2+(x_{i2}-x_{j2})^2+\cdot\cdot\cdot+(x_{in}-x_{jn})^2}}$$

where i = (x_{i1}, x_{i2}, … x_{in}) and j = (x_{j1}, x_{j2}, … x_{jn}) are two n-dimensional data objects. Another well-known metric is Manhattan (or city block) distance, described as

$$\mathrm{d(i, j)=|x_{i1}-x_{j1}|+|x_{i2}-x_{j2}|+\cdot\cdot\cdot+|x_{in}-x_{jn}|}$$
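Both measures translate directly into code. The sketch below uses hypothetical function names and two made-up 2-dimensional objects; it assumes both objects have the same number of dimensions.

```python
import math

def euclidean_distance(i, j):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((x_i - x_j) ** 2 for x_i, x_j in zip(i, j)))

def manhattan_distance(i, j):
    """Sum of the absolute coordinate differences (city block distance)."""
    return sum(abs(x_i - x_j) for x_i, x_j in zip(i, j))

# Two made-up objects described by two interval-scaled variables.
i = (1.0, 2.0)
j = (4.0, 6.0)

print(euclidean_distance(i, j))  # 5.0
print(manhattan_distance(i, j))  # 7.0
```

The coordinate differences here are 3 and 4, so the Euclidean distance is the straight-line length 5, while the Manhattan distance adds the two differences to give 7.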

Both the Euclidean distance and Manhattan distance satisfy the following mathematical requirements of a distance function −

d(i, j) ≥ 0: Distance is a nonnegative number.

d(i, i) = 0: The distance of an object to itself is 0.

d(i, j) = d(j, i): Distance is a symmetric function.

d(i, j) ≤ d(i, h)+d(h, j): Going directly from object i to object j in space is no more than making a detour over any other object h (triangular inequality).
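The four requirements can be spot-checked numerically on a handful of points. This is only a sanity check on made-up data, not a proof, and `satisfies_distance_requirements` is a hypothetical helper written for this illustration.

```python
import itertools
import math

def euclidean_distance(i, j):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(i, j)))

def manhattan_distance(i, j):
    return sum(abs(a - b) for a, b in zip(i, j))

def satisfies_distance_requirements(distance, points, tol=1e-9):
    """Check the four requirements on every combination of the given points."""
    for i in points:
        if abs(distance(i, i)) > tol:                      # d(i, i) = 0
            return False
    for i, j in itertools.product(points, repeat=2):
        if distance(i, j) < 0:                             # d(i, j) >= 0
            return False
        if abs(distance(i, j) - distance(j, i)) > tol:     # symmetry
            return False
    for i, h, j in itertools.product(points, repeat=3):
        # Triangular inequality, with a small tolerance for rounding error.
        if distance(i, j) > distance(i, h) + distance(h, j) + tol:
            return False
    return True

# Made-up 2-dimensional objects.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 4.0)]

print(satisfies_distance_requirements(euclidean_distance, points))
print(satisfies_distance_requirements(manhattan_distance, points))
```

Passing any two-argument function as `distance` makes the same check reusable for other candidate measures.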
