The first, and one of the most important, tasks in building a predictive system is analyzing the original data set. The next is processing the data in line with the purpose and objectives of the system. These two tasks often receive too little attention, and as a consequence the developer does not obtain the desired result from the finished system. This paper discusses methods and algorithms for data analysis and processing that can help in building a predictive system. The data set from the machine learning competition "ASHRAE - Great Energy Predictor III" is used as input data. The analysis phase focuses on constructing graphs and drawing conclusions from them: the paper shows how to use graphical information to find anomalies in the data and eliminate them, and gives examples of graphs that provide information useful for building machine learning models. The data processing phase describes the transformations performed on the data, covering both those that lead to positive results and those that lead to negative results.
