What is Discovery-driven exploration?

Discovery-driven exploration is a data cube exploration approach. In discovery-driven exploration, precomputed measures indicating data exceptions are used to guide the user in the data analysis process, at all levels of aggregation. These measures are referred to as exception indicators.

Intuitively, an exception is a data cube cell value that is significantly different from the value anticipated, based on a statistical model. The model considers variations and patterns in the measure value across all of the dimensions to which a cell belongs.

For instance, if the analysis of item-sales data reveals an increase in sales in December in comparison to the other months, this may look like an exception in the time dimension. However, it is not an exception if the item dimension is considered, since there is a similar increase in sales for other items during December.
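The December example can be sketched with a small item-by-month table. The sketch below assumes a simple additive model (row mean plus column mean minus grand mean) as the source of the anticipated value. The text does not say which statistical model is used, so the model, the function names, and the sales figures here are purely illustrative.

```python
from statistics import mean

def additive_expected(table):
    """Anticipated value of each cell under an assumed additive model:
    row mean + column mean - grand mean."""
    row_means = [mean(row) for row in table]
    col_means = [mean(col) for col in zip(*table)]
    grand = mean(v for row in table for v in row)
    return [[r + c - grand for c in col_means] for r in row_means]

def residuals(table):
    """Actual value minus anticipated value, cell by cell."""
    expected = additive_expected(table)
    return [[a - e for a, e in zip(row, exp_row)]
            for row, exp_row in zip(table, expected)]

def largest_residual(table):
    """Return ((row, col), residual) for the cell that deviates most."""
    res = residuals(table)
    cells = [(abs(v), i, j) for i, row in enumerate(res) for j, v in enumerate(row)]
    _, i, j = max(cells)
    return (i, j), res[i][j]

# Rows are items, columns are months (Oct, Nov, Dec). Made-up numbers.
seasonal = [
    [100, 100, 150],   # every item rises by the same amount in December
    [200, 200, 250],
    [300, 300, 350],
]
single_spike = [
    [100, 100, 150],   # only the first item rises in December
    [200, 200, 200],
    [300, 300, 300],
]

print(residuals(seasonal))             # residuals are (numerically) zero
print(largest_residual(single_spike))  # points at the first item's December cell
```

When every item rises in December, the month effect absorbs the increase and no cell stands out. When only one item rises, that cell's residual is the largest in the table, which is the sense in which it would be flagged as an exception.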

The model considers exceptions hidden at all aggregated group-by’s of a data cube. Visual cues such as background color are used to reflect the degree of exception of each cell, based on the precomputed exception indicators.

Three measures are used as exception indicators to help identify data anomalies. These measures indicate the degree of surprise that the quantity in a cell holds, with respect to its expected value. The measures are computed and associated with every cell, for all levels of aggregation. They are as follows −

SelfExp − This denotes the degree of surprise of the cell value, relative to other cells at the same level of aggregation.

InExp − This denotes the degree of surprise somewhere beneath the cell, if we were to drill down from it.

PathExp − This denotes the degree of surprise for every drill-down path from the cell.
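To make the relationship between the three indicators concrete, here is a minimal sketch. It assumes that SelfExp has already been computed for every cell, that InExp is the largest SelfExp among the cells beneath a cell, and that PathExp for a dimension is the largest SelfExp or InExp among the cells reached by drilling down along that dimension. These definitions, the cell encoding (a tuple with "*" for an aggregated dimension, one level per dimension), and the scores are assumptions for illustration, not the exact formulas of any particular system.

```python
ALL = "*"  # marks a dimension that is fully aggregated in a cell

def is_descendant(cell, ancestor):
    """True if `cell` lies strictly beneath `ancestor` in the cube lattice."""
    if cell == ancestor:
        return False
    return all(a == ALL or a == c for c, a in zip(cell, ancestor))

def in_exp(cell, self_exp):
    """Largest SelfExp found anywhere beneath `cell` (0.0 if nothing is beneath)."""
    below = [score for c, score in self_exp.items() if is_descendant(c, cell)]
    return max(below, default=0.0)

def path_exp(cell, self_exp, dim):
    """Largest surprise reachable by drilling down from `cell` along dimension `dim`."""
    if cell[dim] != ALL:
        raise ValueError("dimension is already at its detailed level")
    best = 0.0
    for child, score in self_exp.items():
        same_elsewhere = all(i == dim or child[i] == cell[i] for i in range(len(cell)))
        if child[dim] != ALL and same_elsewhere:
            best = max(best, score, in_exp(child, self_exp))
    return best

# Hypothetical precomputed SelfExp scores for cells of an (item, month) cube.
self_exp_values = {
    ("*", "*"): 0.2,
    ("tv", "*"): 0.5, ("radio", "*"): 0.1,
    ("*", "nov"): 0.3, ("*", "dec"): 0.4,
    ("tv", "nov"): 0.2, ("tv", "dec"): 3.1,
    ("radio", "nov"): 0.1, ("radio", "dec"): 0.6,
}

print(in_exp(("*", "*"), self_exp_values))         # the top cell hides a large exception
print(path_exp(("tv", "*"), self_exp_values, 1))   # drilling into months for "tv"
print(path_exp(("radio", "*"), self_exp_values, 1))
```

In this toy cube the top-level cell has a low SelfExp but a high InExp, which is the situation where the indicators are most useful: the aggregate looks ordinary, yet something unusual sits beneath it.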

For example, suppose that you would like to analyze the monthly sales at AllElectronics as a percentage difference from the previous month. The dimensions involved are item, time, and region.
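The measure in this example, the percentage difference from the previous month, can be computed as in the short sketch below. The function name and the sales figures are hypothetical, and returning None when the previous month had no sales is an assumed convention.

```python
def pct_change_from_previous(monthly_sales):
    """Percentage difference of each month's sales from the month before it.

    The first month has no predecessor, so the result is one element shorter
    than the input. A previous month with zero sales yields None.
    """
    changes = []
    for prev, cur in zip(monthly_sales, monthly_sales[1:]):
        if prev == 0:
            changes.append(None)
        else:
            changes.append((cur - prev) * 100.0 / prev)
    return changes

# Made-up sales totals for three consecutive months.
print(pct_change_from_previous([200, 220, 198]))  # [10.0, -10.0]
```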

To view the exception indicators, you would click on a button marked highlight exceptions on the screen. This translates the SelfExp and InExp values into visual cues, displayed with each cell. The background color of every cell is based on its SelfExp value.

Furthermore, a box is drawn around every cell, where the thickness and color of the box are a function of its InExp value. Thick boxes indicate high InExp values. In both cases, the darker the color, the greater the degree of exception.
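One possible way to turn the two indicators into display attributes is sketched below. The number of shade levels, the cap on the indicator value, and the output field names are all assumptions chosen for illustration; an actual tool may scale colors and borders differently.

```python
def visual_cue(self_exp, in_exp, max_exp=4.0, levels=4):
    """Map SelfExp to a background shade and InExp to a box shade and thickness.

    Shades run from 0 (lightest) to `levels` (darkest). Indicator values are
    clipped to the range [0, max_exp] before scaling.
    """
    def shade(value):
        clipped = min(max(value, 0.0), max_exp)
        return int(clipped / max_exp * levels)

    box_shade = shade(in_exp)
    return {
        "background_shade": shade(self_exp),  # driven by SelfExp
        "box_shade": box_shade,               # driven by InExp
        "box_thickness": 1 + box_shade,       # thicker box for higher InExp
    }

# A cell that looks ordinary itself but hides a strong exception beneath it.
print(visual_cue(self_exp=0.5, in_exp=3.1))
```

A cell like the one in the usage line would get a light background and a thick, dark box, signalling that drilling down from it is worthwhile.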

A drill-down along item results in a cube slice displaying the sales over time for each item. At this point, you are presented with many different sales values to analyze. By clicking on the highlight exceptions button, the visual cues are displayed, bringing focus to the exceptions.