Efficient data mining from large text databases
pp. 123-139
Abstract
In this paper, we consider the problem of discovering a simple class of combinatorial patterns from a large collection of unstructured text data. As our data mining framework, we adopt optimized pattern discovery, in which a mining algorithm finds the patterns that optimize a given statistical measure within a class of hypothesis patterns on a given data set. We present efficient algorithms for classes of proximity word association patterns and report experiments on keyword discovery from Web data.
Publication details
Published in:
Arikawa Setsuo, Shinohara Ayumi (2002) Progress in discovery science: final report of the Japanese discovery science project. Dordrecht, Springer.
Pages: 123-139
Full citation:
Arimura Hiroki, Sakamoto Hiroshi, Arikawa Setsuo (2002) "Efficient data mining from large text databases", In: S. Arikawa & A. Shinohara (eds.), Progress in discovery science, Dordrecht, Springer, 123–139.