In the past decade, several metrics that reflect the variability of vocalic and consonantal intervals in speech have been used to quantify the impressionistic division of languages into stress‐ and syllable‐timed. Although all such metrics successfully separate prototypical languages, such as stress‐timed English and syllable‐timed Spanish, their results for other languages are less clear. The problem is related to the limited datasets used, which consist of either a small number of sentences per language elicited from several speakers, or longer stretches of speech elicited from one speaker per language. Here we elicited short sentences, story reading and spontaneous speech from several speakers of four languages: stress‐timed English, syllable‐timed Spanish, Korean (a hitherto unclassified language), and Greek, which has been shown to resist classification. Our results show that different metrics yield different classifications for some languages, like Greek, while scores for the same language differ depending on speaking style. Taken together, these results cast doubt on the robustness and usefulness of the popular metrics. They suggest that alternative ways of conceiving of speech rhythm, which do not rely exclusively on timing but also take relative prominence into account, may ultimately be more successful in explaining it.