Cars are the preferred vehicle for comfortable travel, especially on long trips. Many people want to buy a new car, but new tax policies raise the price through additional charges, so most buyers turn to second-hand cars, which are more affordable. Online portals have also made buying and selling used cars much easier. Machine learning algorithms play a vital role here in predicting the right price for a given car. In this paper, Multiple Linear Regression, KNN, Random Forest, Gradient Boosting and XGBoost models are developed and their accuracy is compared. Among these, XGBoost gives the highest R² score: 88% on the training data and 87% on the test data. In the existing system, all null values are dropped during pre-processing, and a label encoder is used to convert categorical data to numerical form, although label encoding is suited to ordinal data. In the proposed system, two datasets are used to analyze the effect of imputation: all null values are imputed in one dataset, and the results are compared with those from a second, non-imputed dataset. For categorical data conversion, one-hot encoding is used to represent feature availability. Finally, the results are analyzed and the pros and cons of each technique are discussed.
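The pre-processing choices contrasted above (dropping versus imputing null values, and label encoding versus one-hot encoding) can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy data, the column names (`fuel`, `mileage_kmpl`, `price`), the helper functions, and the choice of median/mode imputation are all assumptions made for illustration, since the abstract does not state which imputation strategy or features were used.

```python
import pandas as pd

# Hypothetical toy listings; column names and values are illustrative only.
cars = pd.DataFrame({
    "fuel": ["petrol", "diesel", None, "cng", "diesel"],
    "mileage_kmpl": [18.0, None, 21.5, 25.0, 19.5],
    "price": [4.5, 6.0, 3.8, 3.2, 5.5],
})


def drop_nulls(df):
    """Baseline pre-processing: discard every row that has a missing value."""
    return df.dropna().reset_index(drop=True)


def impute_nulls(df):
    """Assumed alternative: median for the numeric column, mode for the categorical one."""
    out = df.copy()
    out["mileage_kmpl"] = out["mileage_kmpl"].fillna(out["mileage_kmpl"].median())
    out["fuel"] = out["fuel"].fillna(out["fuel"].mode().iloc[0])
    return out


def label_encode(df, column):
    """Map categories to integer codes; this implies an order (cng < diesel < petrol)."""
    out = df.copy()
    out[column] = out[column].astype("category").cat.codes
    return out


def one_hot_encode(df, column):
    """Create one 0/1 indicator column per category, with no implied order."""
    return pd.get_dummies(df, columns=[column], dtype=int)


dropped = drop_nulls(cars)               # keeps only the fully observed rows
imputed = impute_nulls(cars)             # keeps every row
encoded = one_hot_encode(imputed, "fuel")
print(len(dropped), len(imputed))
print(encoded.columns.tolist())
```

In this sketch, dropping nulls discards two of the five listings while imputation keeps all of them, which is the trade-off the two-dataset comparison examines. The label-encoded `fuel` column ranks the fuel types numerically even though they have no natural order, whereas the one-hot version only records which fuel type is present.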
