Atmospheric air pollution is a major concern for human health, the environment and the climate, and is a topical subject in environmental research. The major air pollutants include PM10, PM2.5, SO2 and ozone (O3), among others. Each geographical region has its own specific sources of pollution, as well as conditions that cause harmful emissions to remain in the air for longer periods of time. Bulgaria is a member state of the European Union in which poor air quality is persistently reported, in breach of the statutory limits set by European legislation. Significant exceedances of the regulatory limits have been measured for many of the major air pollutants. This paper focuses on the application and comparison of two machine learning methods, boosted trees and regularized regression, to investigate the influence of meteorological, atmospheric and other factors on air quality based on empirical data. Hybrid models have also been built and tested. The modeling process uses data on daily average ground-level ozone (O3) and PM10 in the city of Ruse, Bulgaria, measured at licensed automated stations under the control of the European Environment Agency in Bulgaria. As a result, validated models were obtained with strong goodness-of-fit statistics, such as the coefficient of determination and the root mean square error. The best selected models show very good agreement with the measured data, on the order of 80-90%. Short-term forecasts of future pollution have been made. Hybrid models combining boosted trees and regularized regression are slightly preferable to those generated by the individual methods.
