What is Multi-relational Data Mining?


Multi-relational data mining (MRDM) methods search for patterns that involve multiple tables (relations) from a relational database. Each table or relation represents an entity or a relationship, described by a set of attributes. Links between relations show the relationships between them.

One way to apply traditional data mining methods (which assume that the data reside in a single table) is propositionalization, which converts multi-relational data into a single flat relation using joins and aggregations.

This can lead to the generation of a huge, undesirable “universal relation” (involving all of the attributes). Furthermore, it can result in the loss of information, including essential semantic information represented by the links in the database design.
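As a rough illustration of propositionalization, the sketch below flattens two linked tables into one row per target tuple using a join and aggregations. The `loan` and `payment` tables, their columns, and the sample values are assumptions invented for this example, not part of any real dataset.

```python
import sqlite3

# Hypothetical two-table database: loan is the target relation, and
# payment links to it through the foreign key loan_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, amount REAL, label TEXT);
    CREATE TABLE payment (
        payment_id INTEGER PRIMARY KEY,
        loan_id INTEGER REFERENCES loan(loan_id),
        paid REAL
    );
    INSERT INTO loan VALUES (1, 1000.0, 'positive'), (2, 5000.0, 'negative');
    INSERT INTO payment VALUES (10, 1, 400.0), (11, 1, 600.0), (12, 2, 250.0);
""")

# Propositionalization: join the relations, then aggregate so that each
# target tuple becomes exactly one row of a flat table.
flat = conn.execute("""
    SELECT l.loan_id, l.amount, l.label,
           COUNT(p.payment_id) AS n_payments,
           COALESCE(SUM(p.paid), 0.0) AS total_paid
    FROM loan AS l
    LEFT JOIN payment AS p ON p.loan_id = l.loan_id
    GROUP BY l.loan_id
    ORDER BY l.loan_id
""").fetchall()

for row in flat:
    print(row)
conn.close()
```

Note that the flat table keeps only the count and sum of payments. The individual payment tuples, and the link structure that connected them to each loan, are no longer visible, which is the kind of information loss described above.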

Multi-relational data mining aims to discover knowledge directly from relational data. There are different multi-relational data mining tasks, such as multi-relational classification, clustering, and frequent pattern mining.

Multi-relational classification aims to build a classification model that utilizes information in different relations. Multi-relational clustering aims to group tuples into clusters using their attributes as well as tuples related to them in different relations. Multi-relational frequent pattern mining aims at finding patterns involving interconnected items in different relations. We first use multi-relational classification as an example to illustrate the purpose and procedure of multi-relational data mining.

In a database for multi-relational classification, there is one target relation, Rt, whose tuples are known as target tuples and are associated with class labels. The other relations are nontarget relations. Each relation can have one primary key (which uniquely identifies tuples in the relation) and several foreign keys (where a primary key in one relation can be linked to a foreign key in another).
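The layout described above can be sketched with two small record types. This is only an illustration: the `Loan` and `Account` relations, their attributes, and the helper `account_of` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:          # nontarget relation
    account_id: int     # primary key
    frequency: str


@dataclass(frozen=True)
class Loan:             # target relation Rt
    loan_id: int        # primary key
    account_id: int     # foreign key -> Account.account_id
    amount: float
    label: str          # class label of the target tuple


# Index the nontarget relation by its primary key.
accounts = {a.account_id: a for a in [Account(100, "monthly"), Account(101, "weekly")]}
loans = [Loan(1, 100, 1000.0, "positive"), Loan(2, 101, 5000.0, "negative")]


def account_of(loan: Loan) -> Account:
    """Follow the foreign key from a target tuple to its Account tuple."""
    return accounts[loan.account_id]


for loan in loans:
    print(loan.loan_id, loan.label, account_of(loan).frequency)
```

Only `Loan` carries a class label. Features from `Account` become available to a classifier by following the key link rather than by copying them into the target relation.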

If we consider a two-class problem, we can select one class as the positive class and the other as the negative class. The key task in building an accurate multi-relational classifier is to find relevant features in different relations that help to distinguish positive from negative target tuples.

The most popular form of hypothesis for multi-relational classification is a set of rules. Each rule is a list (logical conjunction) of predicates, associated with a class label. A predicate is a constraint on an attribute in a relation, and is often defined based on a certain join path from the target relation. A target tuple satisfies a rule if and only if it satisfies every predicate of the rule.
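A minimal sketch of rule satisfaction follows, assuming a hypothetical target relation `loans` and a nontarget relation `accounts` reached through the join path `loan.account_id -> account.account_id`. The predicates and the sample data are invented for illustration.

```python
# Hypothetical data: loans is the target relation; accounts is reached
# through the join path loan.account_id -> account.account_id.
accounts = {100: {"frequency": "monthly"}, 101: {"frequency": "weekly"}}
loans = [
    {"loan_id": 1, "account_id": 100, "duration": 12},
    {"loan_id": 2, "account_id": 101, "duration": 12},
    {"loan_id": 3, "account_id": 100, "duration": 36},
]


# Each predicate is a constraint on one attribute, possibly via a join path.
def short_duration(loan):
    # Constraint on an attribute of the target relation itself.
    return loan["duration"] <= 12


def monthly_account(loan):
    # Constraint on an attribute of Account, reached via the foreign key.
    return accounts[loan["account_id"]]["frequency"] == "monthly"


rule = {"predicates": [short_duration, monthly_account], "label": "positive"}


def satisfies(loan, rule):
    """A target tuple satisfies a rule iff it satisfies every predicate."""
    return all(pred(loan) for pred in rule["predicates"])


matched = [loan["loan_id"] for loan in loans if satisfies(loan, rule)]
print(matched)
```

Here only loan 1 satisfies the rule: loan 2 fails the account predicate and loan 3 fails the duration predicate, so neither would receive the rule's class label.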

Published on 25-Nov-2021 09:32:15