The aim of this research is to investigate machine learning techniques for emotion recognition in audio such as human speech and music. Emotion recognition technology has a history of commercial application: it is used to enhance personalized music recommendations, detect mood-based ambience, and interpret human and human-machine interactions in settings such as job interviews, caller-agent calls, streaming video, and music platforms such as Spotify. Moreover, improving these algorithms could significantly benefit individuals on the autism spectrum by providing accurate and practical support in interpreting emotional cues. In this study, we employed a combination of techniques to build a machine learning approach to emotion recognition. By adjusting the extracted audio features and the train/test data splits, we aimed to identify and strengthen the relationships between perceived emotion and the underlying audio features. This approach seeks to improve the accuracy and applicability of emotion recognition systems, contributing to the ongoing development and adoption of this technology across domains.
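This passage does not name the specific features or classifier used, so the following is only a minimal sketch of the kind of feature-extraction and train/test pipeline described, assuming MFCC features (via librosa) and an SVM classifier (via scikit-learn). The synthetic tone "clips", the emotion labels, and the `extract_features` helper are all hypothetical stand-ins, not the method of this study.

```python
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def extract_features(y, sr, n_mfcc=13):
    """Summarize a clip as its mean MFCC vector over all frames."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # fixed-length vector per clip

# Stand-in data: synthetic tones in place of real labeled speech/music clips.
sr = 22050
rng = np.random.default_rng(0)
clips, labels = [], []
for label, freq in [("happy", 440.0), ("sad", 220.0)]:
    for _ in range(20):
        t = np.linspace(0, 1.0, sr, endpoint=False)
        tone = np.sin(2 * np.pi * freq * t) + 0.05 * rng.standard_normal(sr)
        clips.append(tone.astype(np.float32))
        labels.append(label)

# Split into train and test sets, then fit and evaluate the classifier.
X = np.stack([extract_features(y, sr) for y in clips])
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels)

clf = SVC(kernel="rbf").fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

In practice, adjusting which features are extracted (MFCCs, chroma, spectral contrast, etc.) and how the data is split is exactly the kind of tuning the study describes for strengthening the link between audio features and perceived emotion.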