Datasets specialized in educational lectures are urgently needed, and this need motivated the construction of the new datasets presented here. Such datasets require an accurate classification algorithm, since automatic classification minimizes the workload of classifying each file manually and individually. In the present paper, the authors conduct an empirical deep learning study, specifically using a convolutional neural network, on three new datasets of educational lectures (PDF, Word, and PowerPoint). The three datasets are built from real educational lecture resources collected from various document projects at different universities and institutions. The architecture is applied to the task of text classification in the document domain, using datasets obtained from a variety of projects on actual document cases. The first aim of the study is to test the performance of each dataset (PDF, Word, and PowerPoint) using four machine learning classification algorithms (Bayes Net, Random Forest, Random Committee, and OneR). The second aim is to evaluate the efficiency of the deep learning approach on the classification tasks and then compare it with that of the traditional machine learning classification methods.
Two main classification techniques are used to maximize the benefits of the classification process. The first is the deep learning algorithm, which classifies files with an accuracy between 95% and 96% on the three new datasets. The second is a set of standard machine learning algorithms (OneR, Random Forest, Bayes Net, and Random Committee). These algorithms reach an accuracy of 91% on the PDF dataset using Random Forest and Random Committee, 46% on the Word dataset using Random Committee, and 77% on the PowerPoint dataset using Random Forest. We therefore choose the deep learning algorithm, because it gives higher accuracy than the machine learning algorithms.
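The comparison above can be made concrete with a little arithmetic. The following is a minimal sketch, not the authors' evaluation code: it only tabulates the accuracies quoted in this abstract and computes the margin of the deep learning result over each traditional result. The names `DEEP_LEARNING_LOWER_BOUND`, `reported_traditional`, and `accuracy_gaps` are hypothetical, and using the lower end of the reported 95-96% range for every dataset is an assumption made to keep the margins conservative.

```python
# Accuracies (percent) copied from the abstract. The deep learning result is
# reported only as a 95-96% range across the three datasets, so this sketch
# assumes the lower bound (95%) for each dataset.
DEEP_LEARNING_LOWER_BOUND = 95.0

# Traditional machine learning results as reported per dataset:
# dataset -> (algorithm(s) named in the abstract, accuracy in percent).
reported_traditional = {
    "PDF": ("Random Forest / Random Committee", 91.0),
    "Word": ("Random Committee", 46.0),
    "PowerPoint": ("Random Forest", 77.0),
}


def accuracy_gaps(baselines, deep_accuracy):
    """Return the deep learning margin in percentage points per dataset."""
    return {name: deep_accuracy - acc for name, (_, acc) in baselines.items()}


gaps = accuracy_gaps(reported_traditional, DEEP_LEARNING_LOWER_BOUND)
for dataset, gap in gaps.items():
    algorithm, accuracy = reported_traditional[dataset]
    print(
        f"{dataset}: {algorithm} {accuracy:.0f}% "
        f"vs deep learning >= 95% (+{gap:.0f} points)"
    )
```

Under these assumptions the margin is smallest on the PDF dataset and largest on the Word dataset, which is consistent with the abstract's choice of the deep learning approach.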
