Data mining techniques have been applied to data collected from a 222 kWp CdTe (cadmium telluride) photovoltaic (PV) generator to predict faults or special conditions that occur due to shadows, bad weather, soiling, and technical faults. Five types of errors have been distinguished, and their impact on the PV system performance has been evaluated. To date, this computing approach has required the simultaneous measurement of environmental attributes collected by an array of sensors. This study presents a model to assess the state of the PV generator and an algorithm that classifies its state without measuring ambient conditions. The results of the 222 kWp CdTe PV case study show how computational learning algorithms can be applied to improve the management and performance of photovoltaic generators, and they underline environmental parameters as key attributes for finding faults during PV operation. Although applying this method requires computational effort, the result is an easy-to-implement decision tree that can be installed on a small device.
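As a rough illustration of why a trained decision tree suits a small device, the sketch below evaluates a hand-written tree using only electrical measurements. It is a minimal sketch under stated assumptions: the feature names (`power_ratio`, `voltage_ratio`), the thresholds, and the state labels are hypothetical placeholders, not the attributes, split values, or five error types obtained in the case study.

```python
# Illustrative only: feature names, thresholds, and state labels are
# hypothetical placeholders, not values from the 222 kWp case study.
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Leaf:
    state: str  # class label returned when this leaf is reached


@dataclass(frozen=True)
class Node:
    feature: str      # name of the measurement tested at this node
    threshold: float  # split value
    low: "Tree"       # branch taken when value <= threshold
    high: "Tree"      # branch taken when value > threshold


Tree = Union[Leaf, Node]

# A hand-written tree standing in for one learned offline from plant data.
TREE: Tree = Node(
    "power_ratio", 0.10,
    Leaf("technical_fault"),
    Node(
        "power_ratio", 0.80,
        Node(
            "voltage_ratio", 0.90,
            Leaf("shading"),
            Leaf("soiling_or_bad_weather"),
        ),
        Leaf("normal"),
    ),
)


def classify(tree: Tree, sample: Dict[str, float]) -> str:
    """Walk from the root to a leaf, comparing one feature per node."""
    while isinstance(tree, Node):
        if sample[tree.feature] <= tree.threshold:
            tree = tree.low
        else:
            tree = tree.high
    return tree.state


if __name__ == "__main__":
    reading = {"power_ratio": 0.50, "voltage_ratio": 0.70}
    print(classify(TREE, reading))
```

Evaluating such a tree takes only a handful of comparisons per reading, which is consistent with the claim that the computational effort lies in training rather than in deployment.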
