Automatic speech recognition (ASR) systems are in great demand for customer service applications. With advanced interactive voice response systems, humans have more opportunities to hold dialogues with computers. Existing dialogue systems, however, process only linguistic information and ignore paralinguistic information; consequently, a computer obtains less information during a human-computer dialogue than a human does during a human-human dialogue. This report describes a method for estimating the degree of a speaker's anger from the acoustic features and linguistic expressions of utterances in natural dialogue. To record utterances expressing users' internal anger, we designed pseudo-dialogues intended to induce irritation arising from discontent with the ASR system's performance, and to induce exasperation toward the operator while the user makes a complaint. A five-point subjective evaluation was conducted to assign each utterance a score serving as a ground-truth measure of anger; as a result, an emotional speech corpus was produced. We examine the acoustic features and the linguistic expressions of each utterance with reference to these anger scores, and then conduct experiments to automatically estimate the degree of anger using parameters selected from those features.