Big data poses challenges for storing, processing, and managing data, as well as for analyzing and using data for business purposes. Applying data mining methods to big data is a further challenge because of the huge data volumes, the variety of information, and the dynamics of the sources. Various applications have been developed in this area, but their successful use depends on understanding many specific parameters. In this paper we present several opportunities for using the data mining techniques provided by the analytical engine of the Oracle RDBMS over data stored in the Hadoop Distributed File System (HDFS). Some experimental results are presented and discussed.
© 2017 Author(s).