Efficient data mining from large text databases

Hiroki Arimura Hiroshi Sakamoto Setsuo Arikawa

pp. 123-139

Abstract

In this paper, we consider the problem of discovering a simple class of combinatorial patterns from a large collection of unstructured text data. As a data mining framework, we adopt optimized pattern discovery, in which a mining algorithm finds the patterns that optimize a given statistical measure within a class of hypothesis patterns on a given data set. We present efficient algorithms for classes of proximity word association patterns and report experiments on keyword discovery from Web data.
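
To make the optimized pattern discovery setting concrete, the following is a minimal brute-force sketch, not the efficient algorithms of the chapter. It rests on assumptions that are not taken from the abstract: a pattern is an ordered pair of words in which the second word occurs within d positions after the first, the data set is split into positive and negative documents, and the statistical measure is the number of misclassified documents. The names matches and best_pattern are hypothetical.

```python
from itertools import permutations


def matches(words, first, second, d):
    """Return True if `second` occurs within d positions after some `first`."""
    positions = [i for i, w in enumerate(words) if w == first]
    return any(second in words[i + 1 : i + 1 + d] for i in positions)


def best_pattern(positive, negative, d):
    """Exhaustively search ordered word pairs for the fewest misclassifications.

    Assumed measure: a positive document the pattern misses, or a negative
    document the pattern matches, counts as one error.
    """
    pos = [doc.lower().split() for doc in positive]
    neg = [doc.lower().split() for doc in negative]
    # Candidate words are drawn from the positive documents only (an assumption).
    vocab = sorted({w for doc in pos for w in doc})

    best, best_errors = None, len(pos) + len(neg) + 1
    for first, second in permutations(vocab, 2):
        missed = sum(not matches(doc, first, second, d) for doc in pos)
        false_hits = sum(matches(doc, first, second, d) for doc in neg)
        errors = missed + false_hits
        if errors < best_errors:  # strict comparison keeps the first optimum
            best, best_errors = (first, second), errors
    return best, best_errors


if __name__ == "__main__":
    positive = ["data mining from text", "efficient data mining"]
    negative = ["mining data slowly", "text retrieval"]
    print(best_pattern(positive, negative, d=1))
```

The search above enumerates every ordered pair of vocabulary words and rescans every document for each pair, so its cost grows quadratically with the vocabulary size. That cost is what motivates looking for efficient algorithms on large text collections.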

Publication details

Published in:

Arikawa Setsuo, Shinohara Ayumi (2002) Progress in discovery science: final report of the Japanese discovery science project. Dordrecht, Springer.

Pages: 123-139

DOI: 10.1007/3-540-45884-0_6

Full citation:

Arimura Hiroki, Sakamoto Hiroshi, Arikawa Setsuo (2002) "Efficient data mining from large text databases", In: S. Arikawa & A. Shinohara (eds.), Progress in discovery science, Dordrecht, Springer, 123–139.