Big data describes the large volume of structured and unstructured data that a business generates on a daily basis; what matters is not the sheer amount of data but the information that can be extracted from it. Such data is huge in size and keeps growing over time. It is generated in many fields, such as stock exchanges, jet engines, and social media sites. The data is searched for specific information, and the process of finding specific information in big data is known as data mining, which can be carried out with multiple methods. The most common and efficient way to do this is with the Python language, which offers multiple libraries for data mining. This paper discusses the Orange, matminer, and scikit-learn modules of Python and provides an overview of them together with a discussion of data mining techniques.
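As a rough illustration of what data mining with one of these libraries can look like, the following minimal sketch uses scikit-learn to train a classifier and measure how well it predicts unseen records. The dataset (the small Iris sample bundled with scikit-learn), the decision tree model, the split ratio, and the helper name `mine_iris` are all assumptions chosen for brevity; they are not taken from the paper and do not represent a big data workload.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def mine_iris(random_state: int = 0) -> float:
    """Train a small decision tree and return its held-out accuracy.

    Hypothetical helper for illustration only; not from the paper.
    """
    # Load a small bundled dataset (assumption: stands in for real data).
    X, y = load_iris(return_X_y=True)

    # Hold out 30% of the records to evaluate the mined model.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=random_state, stratify=y
    )

    # A shallow tree keeps the learned rules easy to inspect.
    model = DecisionTreeClassifier(max_depth=3, random_state=random_state)
    model.fit(X_train, y_train)

    # Fraction of held-out records whose class was predicted correctly.
    return accuracy_score(y_test, model.predict(X_test))


accuracy = mine_iris()
print(f"held-out accuracy: {accuracy:.2f}")
```

The same load, split, fit, and evaluate pattern applies to other scikit-learn estimators; only the model class and its parameters would change.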
