Data Structure Articles - Page 143 of 164
The major components of a data warehouse are as follows −
Data Sources − A data source is an electronic repository of records that contains data of interest for management use or analytics. Examples include mainframe databases (e.g. IBM DB2, ISAM, Adabas, Teradata), client-server databases (e.g. Teradata, IBM DB2, Oracle Database, Informix, Microsoft SQL Server), PC databases (e.g. Microsoft Access, Alpha Five), spreadsheets (e.g. Microsoft Excel), and any other electronic store of data.
Data Warehouse − The data warehouse is normally a relational database. It should be organized to hold data in a structure that best supports not only query and ... Read More
Data warehousing is a technique used mainly to collect and manage data from various sources in order to give the business meaningful insight. A data warehouse is specifically designed to support management decisions.
In simple terms, a data warehouse is a database that is maintained separately from an organization’s operational databases. Data warehouse systems enable the integration of several application systems. They support information processing by providing a solid platform of consolidated, historical data for analysis.
Data warehouse queries are complex because they involve computing over large groups of data at summarized levels. It can require the use ... Read More
Data integration is the process of merging data from several disparate sources. It must deal with issues such as data redundancy, inconsistency, and duplication. In data mining, data integration is a preprocessing method that merges data from multiple heterogeneous sources into a coherent data store, providing a unified view of the data.
Data integration is especially important in the healthcare industry. Integrated data from several patient records and clinics helps clinicians identify medical disorders and diseases by combining information from several systems into a single view of beneficial information from which useful ... Read More
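The merging step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two clinic sources, their field names (`patient_id`, `pid`, `full_name`, `blood_type`), and the `integrate` helper are all hypothetical and do not come from the article.

```python
# Hypothetical records from two sources that use different field names.
clinic_a = [
    {"patient_id": 1, "name": "Ann Lee", "dob": "1980-02-01"},
    {"patient_id": 2, "name": "Raj Patel", "dob": "1975-07-19"},
]
clinic_b = [
    {"pid": 2, "full_name": "Raj Patel", "blood_type": "O+"},
    {"pid": 3, "full_name": "Mia Chen", "blood_type": "A-"},
]


def integrate(source_a, source_b):
    """Merge both sources into one unified record per patient id."""
    unified = {}
    for rec in source_a:
        unified[rec["patient_id"]] = {
            "id": rec["patient_id"],
            "name": rec["name"],
            "dob": rec["dob"],
            "blood_type": None,  # unknown until another source supplies it
        }
    for rec in source_b:
        # Reuse the existing entry when the patient is already known,
        # so the same person is not stored twice (redundancy handling).
        entry = unified.setdefault(
            rec["pid"],
            {"id": rec["pid"], "name": rec["full_name"], "dob": None, "blood_type": None},
        )
        entry["blood_type"] = rec["blood_type"]
    return [unified[key] for key in sorted(unified)]


patients = integrate(clinic_a, clinic_b)
for patient in patients:
    print(patient)
```

Real integration also has to resolve conflicting values and mismatched identifiers, which this sketch does not attempt.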
Defining concept hierarchies for numerical attributes is complex and laborious because of the wide diversity of possible data ranges and the frequent updates of data values. The methods of concept hierarchy generation for numeric data are as follows −
Binning − Binning is a top-down splitting technique based on a specified number of bins. Binning methods are used as discretization methods for numerosity reduction and concept hierarchy generation. They can be applied recursively to the resulting partitions to build concept hierarchies. Binning does not use class information and is, therefore, an unsupervised discretization technique. It ... Read More
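As a rough illustration of binning, the sketch below splits a list of numbers into equal-width bins. Equal-width partitioning is one common variant chosen here as an assumption. The function name `equal_width_bins` and the sample prices are made up for this example.

```python
def equal_width_bins(values, k):
    """Partition values into k bins of equal width (unsupervised, top-down)."""
    lo, hi = min(values), max(values)
    bins = [[] for _ in range(k)]
    if hi == lo:
        # All values are identical, so they all fall into the first bin.
        bins[0].extend(values)
        return bins
    width = (hi - lo) / k
    for v in values:
        # The maximum value would land in bin k, so clamp it to the last bin.
        idx = min(int((v - lo) / width), k - 1)
        bins[idx].append(v)
    return bins


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
level_1 = equal_width_bins(prices, 3)
print(level_1)

# Applying the same function to one partition gives the next hierarchy level.
level_2 = equal_width_bins(level_1[2], 2)
print(level_2)
```

Each call uses only the values themselves, never a class label, which is why the technique counts as unsupervised.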
Data discretization techniques reduce the number of values of a continuous attribute by dividing the range of the attribute into intervals. Interval labels can then be used to replace the actual data values. Replacing many values of a continuous attribute with a small number of interval labels reduces and simplifies the original data.
This leads to a concise, easy-to-use, knowledge-level representation of mining results. Discretization techniques can be categorized by how the discretization is performed, such as whether it uses class information or which direction it proceeds in (i.e., top-down vs. bottom-up). If ... Read More
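The replacement of raw values by interval labels can be sketched as follows. The cut points, the labels (`youth`, `adult`, `senior`), and the `discretize` helper are assumptions made for illustration only. `bisect_right` from the standard library finds the interval each value belongs to.

```python
from bisect import bisect_right


def discretize(values, cut_points, labels):
    """Replace each value with the label of the interval it falls into.

    cut_points must be sorted; interval i covers [cut_points[i-1], cut_points[i]).
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return [labels[bisect_right(cut_points, v)] for v in values]


ages = [3, 17, 18, 42, 65, 80]
labeled = discretize(ages, cut_points=[18, 65], labels=["youth", "adult", "senior"])
print(labeled)
```

Six distinct ages collapse to three labels here, which is the reduction the passage describes.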
Dimensionality Reduction
In dimensionality reduction, data encoding or transformations are applied to obtain a reduced or “compressed” representation of the original data. If the original data can be reconstructed from the compressed data without any loss of information, the data reduction is called lossless. If the reconstructed data is only an approximation of the original data, the data reduction is called lossy.
The discrete wavelet transform (DWT) is closely related to the discrete Fourier transform (DFT), a signal processing technique involving sines and cosines. In general, the DWT achieves better lossy compression. That is, if the same number of coefficients is retained for a DWT ... Read More
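To make the lossless and lossy cases concrete, here is a minimal sketch of a Haar wavelet transform, one of the simplest DWTs. It is an assumption-laden illustration: it uses unnormalized pairwise averages and differences, requires a power-of-two input length, and selects coefficients by raw magnitude, which is a simplification. The function names and the sample signal are invented for this example.

```python
def haar_forward(signal):
    """Full Haar decomposition via pairwise averages and differences."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(x) for x in signal]
    length = n
    while length > 1:
        half = length // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:length] = averages + details
        length = half
    return coeffs


def haar_inverse(coeffs):
    """Rebuild the signal from Haar coefficients produced by haar_forward."""
    out = list(coeffs)
    length = 1
    while length < len(out):
        merged = []
        for avg, det in zip(out[:length], out[length:2 * length]):
            merged += [avg + det, avg - det]
        out[:2 * length] = merged
        length *= 2
    return out


def keep_largest(coeffs, m):
    """Zero out all but the m largest-magnitude coefficients (lossy step)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:m])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


signal = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_forward(signal)
exact = haar_inverse(coeffs)                    # lossless: all coefficients kept
approx = haar_inverse(keep_largest(coeffs, 4))  # lossy: half the coefficients kept
print(coeffs, exact, approx, sep="\n")
```

Keeping every coefficient reproduces the signal exactly, while keeping only four of the eight yields an approximation of it.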
In numerosity reduction, the data volume is reduced by choosing an alternative, smaller form of data representation. These techniques may be parametric or non-parametric. In parametric methods, a model is used to estimate the data, so that only the model parameters need to be stored instead of the actual data; log-linear models are an example. Non-parametric methods store a reduced representation of the data and include histograms, clustering, and sampling.
The techniques of numerosity reduction are as follows −
Regression and Log-Linear Models − These models can be used to approximate the given data. In ... Read More
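The parametric idea, storing model parameters instead of the data, can be sketched with a simple least-squares line. The data points and the `fit_line` helper are assumptions for illustration. A real dataset would rarely fit a line this cleanly.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

# Only these two numbers need to be stored, not the five (x, y) pairs.
slope, intercept = fit_line(xs, ys)
estimated = [slope * x + intercept for x in xs]
print(slope, intercept, estimated)
```

The stored parameters give estimates of the original values, so the saving in volume comes at the cost of any deviation from the fitted line.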
Attribute subset selection reduces the data set size by removing irrelevant or redundant attributes (or dimensions). It aims to find a minimum set of attributes such that the resulting probability distribution of the data classes is as close as possible to the original distribution obtained using all attributes. Data mining on a reduced set of attributes has an extra benefit. It reduces the number of attributes appearing in the discovered patterns, making the patterns easier to understand.
For n attributes, there are 2^n possible subsets. An exhaustive search for the optimal subset of attributes can be prohibitively expensive, ... Read More
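The 2^n growth is easy to check by enumerating every subset of a small attribute list. The attribute names and the `all_subsets` helper below are hypothetical and serve only to illustrate the count.

```python
from itertools import combinations


def all_subsets(attributes):
    """Return every subset of the attribute list, from empty to full."""
    return [
        subset
        for size in range(len(attributes) + 1)
        for subset in combinations(attributes, size)
    ]


attributes = ["age", "income", "zip_code"]
subsets = all_subsets(attributes)
print(len(subsets), subsets)

# The count doubles with each extra attribute, so exhaustive search
# quickly becomes impractical.
for n in (10, 20, 30):
    print(n, 2 ** n)
```

With 30 attributes there are already more than a billion candidate subsets to evaluate.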
Data mining is applied to selected data in a large database. When data analysis and mining are performed on a huge amount of data, processing takes a very long time, which makes it impractical or infeasible. To reduce the processing time, data reduction techniques are used to obtain a reduced representation of the dataset that is much smaller in volume yet maintains the integrity of the original data. Reducing the data improves the efficiency of the data mining process while producing the same, or nearly the same, analytical results.
Data reduction aims to define ... Read More
In data transformation, the data are transformed or consolidated into forms suitable for mining. Data transformation can involve the following −
Smoothing − Smoothing removes noise from the data. Such methods include binning, regression, and clustering.
Aggregation − In aggregation, summary or aggregation operations are applied to the data. For example, daily sales data may be aggregated to compute monthly and annual totals. This step is generally used in building a data cube for analysis of the data at multiple granularities.
Generalization − In generalization, low-level or “primitive” (raw) data are replaced by higher-level concepts ... Read More
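The aggregation example above (daily sales rolled up to monthly and annual totals) might look like the following sketch. The sales figures, the ISO date strings, and the `aggregate` helper are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical daily sales as (ISO date, amount) pairs.
daily_sales = [
    ("2023-01-05", 120.0),
    ("2023-01-20", 80.0),
    ("2023-02-03", 200.0),
    ("2024-01-10", 50.0),
]


def aggregate(sales, key_length):
    """Sum amounts grouped by the first key_length characters of the date.

    key_length=7 groups by "YYYY-MM" (monthly); key_length=4 by "YYYY" (annual).
    """
    totals = defaultdict(float)
    for date, amount in sales:
        totals[date[:key_length]] += amount
    return dict(totals)


monthly = aggregate(daily_sales, 7)
annual = aggregate(daily_sales, 4)
print(monthly)
print(annual)
```

The same daily records produce two summaries at different granularities, which is the kind of roll-up a data cube supports.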