What are the types of mining sequence data?

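A sequence is an ordered list of events. Sequences can be divided into three groups, based on the characteristics of the events they describe: time-series data, symbolic sequence data, and biological sequences. The main mining tasks for these groups are as follows −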

Similarity Search in Time-Series Data

A time-series data set consists of sequences of numeric values obtained from repeated measurements over time. The values are generally measured at equal time intervals (such as each minute, hour, or day).

Time-series databases are popular in many applications, including stock market analysis, economic and sales forecasting, budgetary analysis, utility studies, inventory studies, yield projections, workload projections, and process and quality control. They are also useful for studying natural phenomena, scientific and engineering experiments, and medical treatments.

Regression and Trend Analysis in Time-Series Data

Regression analysis of time-series data has been studied extensively in statistics and signal analysis. Trend analysis builds an integrated model using the following four major components or movements to characterize time-series data −

Trend or long-term movements − These indicate the general direction in which a time-series graph moves over time. A trend curve can be found with methods such as a weighted moving average or least squares.

Cyclic movements − These are the long-term oscillations about a trend line or curve.

Seasonal variations − These are nearly identical patterns that a time series appears to follow during corresponding seasons of successive years, such as holiday shopping seasons. For effective trend analysis, the data often need to be “deseasonalized” based on a seasonal index computed by autocorrelation.

Random movements − These characterize sporadic changes due to chance events, such as labor disputes or announced personnel changes within organizations.
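The trend and seasonal components described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names are hypothetical, the weights (1, 4, 1) and the sample series are made up for the example, and the seasonal index here is a simple ratio of each season's average to the overall average, a simplification swapped in for the autocorrelation-based index mentioned above.

```python
def weighted_moving_average(values, weights):
    """Smooth a series with a weighted moving average.

    Returns one value per full window, so the result is shorter than
    the input by len(weights) - 1 points.
    """
    k = len(weights)
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, values[i:i + k])) / total
        for i in range(len(values) - k + 1)
    ]


def least_squares_line(values):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def seasonal_indices(values, period):
    """Assumed simplification: average of each season divided by the overall average."""
    overall = sum(values) / len(values)
    indices = []
    for season in range(period):
        season_values = values[season::period]
        indices.append((sum(season_values) / len(season_values)) / overall)
    return indices


def deseasonalize(values, period):
    """Divide every observation by the index of the season it falls in."""
    indices = seasonal_indices(values, period)
    return [v / indices[i % period] for i, v in enumerate(values)]


# Made-up series for illustration only.
series = [3, 7, 2, 0, 4, 5, 9, 7, 2]
smoothed = weighted_moving_average(series, [1, 4, 1])
intercept, slope = least_squares_line(series)
```

The moving average gives a smoothed curve that follows local changes, while the least squares line summarizes the overall direction in two numbers. Which one is more useful depends on how regular the long-term movement is.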

Sequential Pattern Mining in Symbolic Sequences

A symbolic sequence consists of an ordered set of elements or events, recorded with or without a concrete notion of time. Many applications involve symbolic sequence data, such as customer shopping sequences, web click streams, program execution sequences, biological sequences, and sequences of events in science and engineering and in natural and social developments.
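To make the idea of a pattern in symbolic sequences concrete, the sketch below counts how often an ordered pattern occurs in a small set of click streams. It assumes the common definition in which a pattern matches a sequence when its items appear in the same order, with gaps allowed, and in which support is the fraction of sequences that contain the pattern. The function names and the sample data are hypothetical and serve only as an illustration.

```python
def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def support(pattern, database):
    """Fraction of sequences in `database` that contain `pattern`."""
    hits = sum(1 for sequence in database if is_subsequence(pattern, sequence))
    return hits / len(database)


# Made-up web click streams, one list per user session.
click_streams = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart", "checkout"],
    ["search", "product", "home"],
    ["home", "search", "cart"],
]

home_then_cart = support(["home", "cart"], click_streams)
search_then_product = support(["search", "product"], click_streams)
```

A pattern whose support reaches a user-chosen threshold is usually called frequent. Practical sequential pattern mining algorithms avoid testing every candidate pattern this way, but the support measure they compute is the same.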

Because biological sequences carry complex semantic meaning and pose many challenging research problems, most investigations of them are conducted in the field of bioinformatics.

Alignment of Biological Sequences

Biological sequences generally refer to sequences of nucleotides or amino acids. Biological sequence analysis compares, aligns, indexes, and analyzes biological sequences, and therefore plays an essential role in bioinformatics and modern biology.

Sequence alignment is based on the fact that all living organisms are related by evolution. This implies that the nucleotide (DNA, RNA) and protein sequences of species that are closer to each other in evolution should exhibit more similarities. An alignment is the process of lining up sequences to achieve a maximal identity level, which also expresses the degree of similarity between the sequences.
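As an illustration of lining up two sequences, the sketch below uses the classic dynamic-programming approach to global pairwise alignment (the Needleman-Wunsch algorithm). The scoring values (+1 for a match, -1 for a mismatch, -1 for a gap) and the function names are assumptions chosen for the example. Real tools use substitution matrices and more refined gap penalties.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Align strings `a` and `b` globally; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def identity(aligned_a, aligned_b):
    """Fraction of alignment columns in which both sequences hold the same symbol."""
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return matches / len(aligned_a)


# Made-up DNA fragments for illustration only.
aligned_a, aligned_b, best_score = global_alignment("ACGT", "AGT")
```

The identity level of the resulting alignment gives a rough measure of how similar the two sequences are. Note that several alignments can share the same optimal score, and the traceback above returns just one of them.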

Updated on: 18-Feb-2022
