Database Articles
What is AOI?
AOI stands for Attribute-Oriented Induction. The attribute-oriented induction approach to concept description was first proposed in 1989, a few years before the introduction of the data cube approach. The data cube approach is essentially based on materialized views of the data, which typically have been precomputed in a data warehouse; in general, it performs off-line aggregation before an OLAP or data mining query is submitted for processing. The attribute-oriented induction approach, in contrast, is a query-oriented, generalization-based, on-line data analysis technique. The general idea of attribute-oriented induction is to first collect the task-relevant data using a database query and then ...
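The core AOI loop can be sketched in a few lines: collect the task-relevant tuples, replace low-level attribute values with higher-level concepts from a concept hierarchy, and merge identical generalized tuples while accumulating a count. The hierarchy and data below are hypothetical.

```python
from collections import Counter

# Hypothetical concept hierarchy for a "major" attribute.
MAJOR_HIERARCHY = {"physics": "science", "biology": "science",
                   "music": "arts", "painting": "arts"}

def generalize(tuples):
    """Climb the concept hierarchy one level, then merge identical tuples."""
    generalized = [(MAJOR_HIERARCHY.get(major, major),) for (major,) in tuples]
    return Counter(generalized)  # merged generalized tuples with a count measure

data = [("physics",), ("biology",), ("music",), ("physics",)]
print(generalize(data))  # "science" covers 3 tuples, "arts" covers 1
```

In a full AOI implementation the same pass also drops attributes with too many distinct values and repeats generalization until the relation is small enough to present.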
What are the methods for Data Generalization and Concept Description?
Data generalization summarizes data by replacing relatively low-level values (such as numeric values for an attribute age) with higher-level concepts (such as young, middle-aged, and senior). Given the large amount of data stored in databases, it is useful to be able to describe concepts in concise and succinct terms at generalized (rather than low) levels of abstraction. Allowing data sets to be generalized at multiple levels of abstraction helps users examine the general behavior of the data. Given the AllElectronics database, for instance, rather than examining individual customer transactions, sales managers may prefer to view the data generalized ...
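The age example above can be made concrete with a small mapping function; the cut-points (30 and 60) are illustrative assumptions, not values from the source.

```python
def age_concept(age):
    """Map a numeric age to a higher-level concept (illustrative cut-points)."""
    if age < 30:
        return "young"
    elif age < 60:
        return "middle-aged"
    return "senior"

ages = [23, 41, 67, 35]
print([age_concept(a) for a in ages])  # ['young', 'middle-aged', 'senior', 'middle-aged']
```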
What are the types of constraints in multidimensional gradient analysis?
The curse of dimensionality and the need for understandable results pose serious challenges for finding an efficient and scalable solution to the cubegrade problem. A constrained but still interesting version of the cubegrade problem, called constrained multidimensional gradient analysis, reduces the search space and derives interesting results. There are the following types of constraints − Significance constraint − This ensures that we examine only the cells that have a certain "statistical significance" in the data, such as containing at least a specified number of base cells or at least a certain total sales. In the data ...
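A significance constraint is easy to picture as a filter over cube cells; the sketch below keeps only cells backed by enough base cells and enough total sales. The thresholds and cell layout are hypothetical.

```python
# Hypothetical significance thresholds.
MIN_SUPPORT = 10     # at least this many base cells behind the aggregate
MIN_SALES = 500.0    # at least this total sales

def significant(cells):
    """Keep only cells that pass both significance constraints."""
    return [c for c in cells
            if c["count"] >= MIN_SUPPORT and c["sales"] >= MIN_SALES]

cells = [{"dims": ("2023", "NY"), "count": 25, "sales": 900.0},
         {"dims": ("2023", "LA"), "count": 4,  "sales": 950.0},
         {"dims": ("2024", "NY"), "count": 40, "sales": 120.0}]
print(significant(cells))  # only the first cell passes both constraints
```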
How are the exception values computed?
Three measures are used as exception indicators to help identify data anomalies. These measures denote the degree of surprise that the quantity in a cell holds, with respect to its expected value. The measures are computed and associated with every cell, for all levels of aggregation. They are the SelfExp, InExp, and PathExp measures, which are based on statistical methods for table analysis. Whether a cell value is treated as an exception depends on how much it differs from its expected value, where the expected value is determined with a statistical model. The difference between a given cell value and its ...
What is Discovery-driven exploration?
Discovery-driven exploration is one such cube exploration approach. In discovery-driven exploration, precomputed measures indicating data exceptions are used to guide the user in the data analysis process, at all levels of aggregation. We refer to these measures as exception indicators. Intuitively, an exception is a data cube cell value that is significantly different from the anticipated value, based on a statistical model. The model considers variations and patterns in the measure value across all of the dimensions to which a cell applies. For instance, if the analysis of item-sales data reveals an increase in sales in December in comparison with other months, ...
How are measures computed in data mining?
Measures can be organized into three categories − distributive, algebraic, and holistic − depending on the kind of aggregate function used. Distributive − An aggregate function is distributive if it can be computed in a distributed manner as follows. Suppose the data are partitioned into n sets. The function is applied to each partition, resulting in n aggregate values. If the result derived by applying the function to the n aggregate values is the same as that derived by applying the function to the entire data set (without partitioning), the function can be computed in a distributed manner. For instance, count() can ...
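The distributive property described above can be demonstrated directly: counting each partition and summing the partial counts gives the same result as counting the whole, unpartitioned set.

```python
data = list(range(100))

# Partition the data into n = 4 sets.
partitions = [data[i::4] for i in range(4)]

# Apply count() to each partition, then aggregate the partial results.
partial_counts = [len(p) for p in partitions]
distributed_count = sum(partial_counts)

# Distributive: distributed computation equals the direct computation.
print(distributed_count == len(data))  # True
```

By contrast, a holistic measure such as median() cannot be combined this way: the median of partition medians is not, in general, the median of the whole set.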
What is Entropy-Based Discretization?
Entropy-based discretization is a supervised, top-down splitting approach. It exploits class distribution information in its computation and determination of split-points (data values for partitioning an attribute range). To discretize a numeric attribute, A, the method selects the value of A that has the minimum entropy as a split-point, and recursively partitions the resulting intervals to arrive at a hierarchical discretization. Such discretization forms a concept hierarchy for A. Let D consist of data tuples described by a set of attributes and a class-label attribute. The class-label attribute provides the class information per tuple. The basic approach for the entropy-based discretization of ...
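One step of this procedure can be sketched as follows: evaluate each candidate boundary between consecutive sorted values and keep the one that minimizes the size-weighted class entropy of the two resulting intervals. The data are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(values, labels):
    """Return the split-point minimizing the weighted entropy of the halves."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        split = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint boundary
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        info = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if best is None or info < best[0]:
            best = (info, split)
    return best[1]

# Ages with class labels: the clean class boundary lies between 35 and 60.
print(best_split([22, 28, 35, 60, 65, 70], ["no", "no", "no", "yes", "yes", "yes"]))
# 47.5
```

Recursing on each interval (with a stopping criterion such as the MDL principle) yields the hierarchical discretization described above.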
How can this technique be useful for data reduction if the wavelet transformed data are of the same length as the original data?
The utility lies in the fact that the wavelet-transformed data can be truncated. A compressed approximation of the data can be retained by storing only a small fraction of the strongest wavelet coefficients. For instance, all wavelet coefficients larger than some user-specified threshold can be retained, while all other coefficients are set to 0. The resulting data representation is very sparse, so operations that can take advantage of data sparsity are computationally very fast if performed in wavelet space. The technique also works to remove noise without smoothing out the main features of the data, making it effective ...
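The point can be shown with a one-level Haar transform written in plain Python: the transformed vector has exactly the same length as the input, but after thresholding, most coefficients become 0 and only the retained ones need to be stored. The signal and threshold are illustrative.

```python
def haar_step(x):
    """One level of the (unnormalized) Haar transform: averages then details."""
    avgs = [(a + b) / 2 for a, b in zip(x[::2], x[1::2])]
    diffs = [(a - b) / 2 for a, b in zip(x[::2], x[1::2])]
    return avgs + diffs

def threshold(coeffs, t):
    """Keep only coefficients with magnitude above t; zero out the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]

signal = [8.0, 6.0, 2.0, 3.0, 9.0, 9.0, 7.0, 5.0]
coeffs = haar_step(signal)
sparse = threshold(coeffs, 0.6)
print(len(coeffs) == len(signal))  # True: same length as the original
print(sparse)                      # small detail coefficients are now zeros
```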
How can we find a good subset of the original attributes?
Attribute subset selection reduces the data set size by removing irrelevant or redundant attributes (or dimensions). The goal of attribute subset selection is to find a minimum set of attributes such that the resulting probability distribution of the data classes is as close as possible to the original distribution obtained using all attributes. For n attributes, there are 2^n possible subsets. An exhaustive search for the optimal subset of attributes can be prohibitively expensive, especially as n and the number of data classes increase. Hence, heuristic methods that explore a reduced search space are commonly used for attribute subset selection. These methods ...
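One standard heuristic is greedy stepwise forward selection: instead of scanning all 2^n subsets, repeatedly add the single attribute that most improves a scoring function. The scoring function below is a hypothetical stand-in for something like cross-validated accuracy.

```python
def forward_select(attributes, score, k):
    """Greedily grow a subset of size k, adding the best attribute each round."""
    selected = []
    while len(selected) < k:
        candidates = [a for a in attributes if a not in selected]
        best = max(candidates, key=lambda a: score(selected + [a]))
        selected.append(best)
    return selected

# Hypothetical additive score: "age" and "income" are the informative attributes.
useful = {"age": 0.4, "income": 0.3, "zip": 0.01, "id": 0.0}
score = lambda subset: sum(useful[a] for a in subset)

print(forward_select(["id", "age", "zip", "income"], score, 2))
# ['age', 'income']
```

Backward elimination works symmetrically, starting from the full attribute set and removing the least useful attribute each round.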
What is trend analysis?
Trend analysis refers to techniques for extracting a model of behavior in a time series that may be partly or entirely hidden by noise. Trend-analysis methods have been widely used for detecting outbreaks and unexpected increases or decreases in disease occurrence, monitoring the trends of diseases, and evaluating the effectiveness of disease control programs and health care policies. Various techniques can be used to detect trends in time series. Smoothing is an approach used to remove the non-systematic behaviors found in time series. Smoothing usually takes the form ...
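The most common form of smoothing is a simple moving average, sketched below on a short series where a rising trend is obscured by alternating noise.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive values (simple smoothing)."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

# A rising trend with alternating noise on top.
series = [1, 3, 2, 4, 3, 5, 4, 6]
print(moving_average(series, 3))
# [2.0, 3.0, 3.0, 4.0, 4.0, 5.0] -- the underlying upward trend is now visible
```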