
Thursday, August 13, 2009

Data Mining as a part of Knowledge Discovery Process

Knowledge discovery in databases typically encompasses more than data mining. The knowledge discovery process comprises six phases: data selection, data cleansing, enrichment, data transformation or encoding, data mining, and the reporting and display of the discovered information.
Data mining addresses inductive knowledge. Knowledge can be represented in an unstructured form, such as rules or propositional logic, or in a structured form, such as decision trees, semantic networks, or neural networks. The knowledge discovered during data mining can be described in five ways:
- Association rules : These rules correlate the presence of one set of items with a range of values for another set of variables. An association rule has the form X => Y, where X = {x1, x2, ..., xn} and Y = {y1, y2, ..., ym} are sets of items, with xi and yj being distinct items for all i and j.
- Classification hierarchies : The goal is to work from an existing set of events or transactions to create a hierarchy of classes.
- Sequential patterns : A sequence of actions or events is sought. Detection of sequential patterns is equivalent to detecting association among events with certain temporal relationships.
- Patterns within time series : Similarities can be detected between positions of a time series.
- Categorization and segmentation : A given population of events or items can be partitioned or segmented into sets of similar elements.

For most applications, the desired knowledge is a combination of the above types.
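
To make the association rule form X => Y concrete, here is a minimal Python sketch. It scores a rule with support and confidence, two standard measures that this post does not define, so treat them as an assumption brought in for illustration. The function names and the tiny transaction list are hypothetical.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Among transactions containing X, the fraction that also contain Y."""
    support_x = support(x, transactions)
    if support_x == 0:
        return 0.0  # X never occurs, so the rule X => Y has no evidence
    return support(set(x) | set(y), transactions) / support_x


# Hypothetical market-basket data, invented for this example.
transactions = [frozenset(t) for t in (
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "juice"},
)]

# Rule {milk} => {bread}
rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In this toy data, milk and bread appear together in 2 of 4 transactions, and 2 of the 3 transactions containing milk also contain bread. A real miner would search over many candidate itemsets instead of scoring one hand-picked rule.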
