How can this technique be useful for data reduction if the wavelet transformed data are of the same length as the original data?


The utility lies in the fact that the wavelet transformed data can be truncated. A compressed approximation of the data can be retained by storing only a small fraction of the strongest wavelet coefficients. For instance, all wavelet coefficients larger than some user-defined threshold can be retained, and all other coefficients are set to 0.
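
The thresholding step can be sketched in a few lines of Python. This is only an illustration: the function names, the sample coefficient values, and the choice to compare magnitudes (absolute values) against the threshold are assumptions, not something the text prescribes.

```python
def threshold_coefficients(coeffs, threshold):
    """Zero every coefficient whose magnitude does not exceed `threshold`."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]


def to_sparse(coeffs):
    """Store only the nonzero coefficients as {position: value}."""
    return {i: c for i, c in enumerate(coeffs) if c != 0.0}


# Made-up coefficient vector, for illustration only.
coeffs = [3.0, 8.0, 1.0, 0.0, 0.0, -1.4, -1.4, 0.2]

kept = threshold_coefficients(coeffs, threshold=1.2)
sparse = to_sparse(kept)

print(kept)    # same length as the input, but mostly zeros
print(sparse)  # only the surviving coefficients need to be stored
```

The transformed vector keeps its original length, but only the surviving positions and values have to be stored, which is where the reduction comes from.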

The resulting data representation is very sparse, so operations that can take advantage of data sparsity are computationally very fast if performed in wavelet space. The technique also works to remove noise without smoothing out the main features of the data, making it effective for data cleaning as well. Given a set of coefficients, an approximation of the original data can be constructed by applying the inverse of the DWT used.
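
As a rough sketch of the reconstruction step, the code below inverts a Haar transform. Several things here are assumptions made for the example: the Haar wavelet with 1/sqrt(2) weights, the coefficient layout (the two final smooth values first, then the detail sets from coarsest to finest), the function name `haar_inverse`, and the sample coefficients.

```python
import math


def haar_inverse(coeffs):
    """Invert an orthonormal Haar pyramid.

    Assumed layout: two smooth values, then detail sets of
    length 2, 4, 8, ... (coarsest to finest).
    """
    root2 = math.sqrt(2.0)
    x = list(coeffs[:2])
    pos = 2
    while pos < len(coeffs):
        detail = coeffs[pos:pos + len(x)]
        pos += len(x)
        rebuilt = []
        for s, d in zip(x, detail):
            # Undo the pairwise sum and difference of one iteration.
            rebuilt.append((s + d) / root2)
            rebuilt.append((s - d) / root2)
        x = rebuilt
    return x


root2 = math.sqrt(2.0)
# Hypothetical coefficients of a length-8 vector, before and after
# one small coefficient (the 1.0) has been set to 0.
full = [3.0, 8.0, 1.0, 0.0, 0.0, -root2, -root2, 0.0]
truncated = [3.0, 8.0, 0.0, 0.0, 0.0, -root2, -root2, 0.0]

exact = haar_inverse(full)        # recovers the original data
approx = haar_inverse(truncated)  # a close approximation of it
print(exact)
print(approx)
```

Dropping the small coefficient changes only the first half of the reconstructed vector slightly, while the larger features in the second half are preserved.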

The DWT is closely related to the discrete Fourier transform (DFT), a signal processing technique involving sines and cosines. In general, however, the DWT achieves better lossy compression. If the same number of coefficients is retained for a DWT and a DFT of a given data vector, the DWT version will provide a more accurate approximation of the original data.

Hence, for an equivalent approximation, the DWT requires less space than the DFT. Unlike the DFT, wavelets are well localized in space, which contributes to the conservation of local detail. There is only one DFT, yet there are several families of DWTs.
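
The comparison can be tried on a small example. The sketch below keeps the `k` largest coefficients of a Haar DWT and of a DFT for the same vector and measures the squared reconstruction error. Everything specific here is an assumption for illustration: the Haar wavelet with 1/sqrt(2) weights, the helper names, the value of `k`, and the test signal. A step signal is used because it is localized, which favours wavelets; a smooth periodic signal such as a pure sinusoid can favour the DFT instead.

```python
import cmath
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """Haar pyramid down to sets of length 2; len(x) must be a power of 2."""
    x = list(x)
    details = []
    while len(x) > 2:
        half = len(x) // 2
        smooth = [(x[2 * i] + x[2 * i + 1]) / SQRT2 for i in range(half)]
        detail = [(x[2 * i] - x[2 * i + 1]) / SQRT2 for i in range(half)]
        details = detail + details  # coarser detail sets go in front
        x = smooth
    return x + details


def haar_inverse(c):
    """Inverse of haar_forward for the same coefficient layout."""
    x = list(c[:2])
    pos = 2
    while pos < len(c):
        detail = c[pos:pos + len(x)]
        pos += len(x)
        x = [v for s, d in zip(x, detail)
             for v in ((s + d) / SQRT2, (s - d) / SQRT2)]
    return x


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(X):
    """Inverse DFT, keeping only the real part of each value."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]


def keep_largest(coeffs, k):
    """Keep the k coefficients of largest magnitude; zero the rest."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0 for i, c in enumerate(coeffs)]


def squared_error(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


signal = [0, 0, 0, 0, 5, 5, 5, 5]  # hypothetical localized step
k = 2
haar_err = squared_error(signal, haar_inverse(keep_largest(haar_forward(signal), k)))
dft_err = squared_error(signal, idft_real(keep_largest(dft(signal), k)))
print("Haar error:", haar_err)
print("DFT error: ", dft_err)
```

For this particular signal the Haar reconstruction is essentially exact with two coefficients, while the DFT reconstruction still has a noticeable error. The result depends on the signal and should not be read as a general proof.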

Popular wavelet transforms include the Haar-2, Daubechies-4, and Daubechies-6 transforms. The general procedure for applying a discrete wavelet transform uses a hierarchical pyramid algorithm that halves the data at each iteration, resulting in fast computational speed. The method is as follows:

The length, L, of the input data vector must be an integer power of 2. This condition can be met by padding the data vector with zeros as necessary (L ≥ n, where n is the number of original data values).
Each transform involves applying two functions. The first applies some data smoothing, such as a sum or weighted average. The second performs a weighted difference, which acts to bring out the detailed features of the data.
The two functions are applied to pairs of data points in the input vector X, that is, to all pairs of measurements (x_{2i}, x_{2i+1}). This results in two data sets of length L/2. In general, these represent a smoothed or low-frequency version of the input data and its high-frequency content, respectively.
The two functions are recursively applied to the data sets obtained in the previous iteration, until the resulting data sets are of length 2.
Selected values from the data sets obtained in the above iterations are designated the wavelet coefficients of the transformed data.
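
The steps above can be sketched in Python for the Haar wavelet. This is a minimal illustration under stated assumptions, not a reference implementation: the smoothing function is taken to be a pairwise sum and the difference function a pairwise difference, both scaled by 1/sqrt(2); the name `haar_dwt`, the output layout (final smooth values first, then detail sets from coarsest to finest), and the sample data are choices made for the example.

```python
import math


def haar_dwt(data):
    """Haar DWT by the pyramid algorithm, stopping at sets of length 2."""
    # Step 1: pad with zeros so the length L is a power of 2 (L >= n).
    n = len(data)
    L = 1
    while L < n:
        L *= 2
    x = list(data) + [0.0] * (L - n)

    root2 = math.sqrt(2.0)
    details = []
    # Step 4: repeat on the smoothed set until the sets have length 2.
    while len(x) > 2:
        half = len(x) // 2
        # Steps 2 and 3: apply both functions to every pair (x[2i], x[2i+1]).
        smooth = [(x[2 * i] + x[2 * i + 1]) / root2 for i in range(half)]
        detail = [(x[2 * i] - x[2 * i + 1]) / root2 for i in range(half)]
        details = detail + details  # coarser detail sets go in front
        x = smooth
    # Step 5: the final smooth values plus all detail values are the coefficients.
    return x + details


sample = [2, 2, 0, 2, 3, 5, 4, 4]  # made-up data
coeffs = haar_dwt(sample)
print(coeffs)
```

Each pass works on half as many values as the one before it, so the total work is proportional to L. Note that an input that already has length 2 is returned unchanged by this sketch, because the loop stops at sets of length 2.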

Ginni

Updated on: 16-Feb-2022
