The most frequent opaque idiomatic expressions in English news

Author: Wenhua Hsu (I-Shou University, Taiwan)
Speaker: Wenhua Hsu
Topic: Language, community, ethnicity
The (SCOPUS / ISI) SOAS GLOCAL CALA 2019 General Session


There is little dispute that idioms form an important part of the lexicon of news texts. As with vocabulary learning in general, the sheer number of idioms can be daunting for EFL learners. Although idiom dictionaries enhance our understanding of the idiomatic expressions used by native speakers, they may include a substantial number of rarely used idioms. This issue can be addressed through a corpus-based study of idioms used in news texts, since current-event news reports are close to daily life in many areas. This paper aimed to establish a pedagogically useful list of the most frequent idioms and idiomatic expressions for EFL learners who need to read English news as voluminous reading material. It dealt with a particular non-compositional subset of formulaic language, namely opaque idiomatic expressions, because they may give EFL learners a false sense of comprehension. A non-compositional expression is a semantically opaque multiword unit whose overall meaning cannot easily be derived from the meanings of its component words. Lexical coverage (the percentage of words known in a text) may be overestimated when a non-compositional multiword expression is made up of known words but its meaning as a whole is unknown to learners (for example, "as" and "of" versus "as of"; "carrot", "and" and "stick" versus "carrot and stick"). The research began by compiling a search list of idioms from five online English idiom dictionaries. Repeated idioms and phrases were removed in Excel, and a total of 8,211 idiomatic expressions were identified. This raw list then served as the basis for the corpus search: the researcher searched the 6-billion-word NOW (News on the Web) corpus for the 8,211 items, using the search tool provided on the NOW website. The cutoff point was set at two tokens per million words, which corresponds to what Moon (1998) classified as the lowest band of medium-frequency idioms. Based on frequency and non-compositionality, a total of 675 idiomatic expressions of two to six words were selected.
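The list-building and cutoff steps described above can be illustrated with a short sketch. It is not the author's procedure (the study used Excel and the search tool on the NOW website); the function names and sample counts are hypothetical, and the corpus size is taken as exactly 6 billion words for simplicity, which makes a cutoff of two tokens per million equal to 12,000 raw occurrences.

```python
def normalize(expr):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(expr.lower().split())


def merge_lists(dictionaries):
    """Merge several idiom lists, dropping repeats and keeping first-seen order."""
    seen = set()
    merged = []
    for entries in dictionaries:
        for expr in entries:
            key = normalize(expr)
            if key and key not in seen:
                seen.add(key)
                merged.append(key)
    return merged


def per_million(raw_count, corpus_size):
    """Convert a raw frequency into tokens per million running words."""
    return raw_count * 1_000_000 / corpus_size


def above_cutoff(counts, corpus_size, cutoff=2.0):
    """Keep expressions whose normalized frequency reaches the cutoff."""
    return [e for e, c in counts.items() if per_million(c, corpus_size) >= cutoff]


# Hypothetical input: two small dictionary lists with overlapping entries.
search_list = merge_lists([
    ["As of", "carrot and stick"],
    ["as  of", "Carrot and Stick", "in the wake of"],
])

# Hypothetical raw counts, assuming a corpus of exactly 6 billion words.
NOW_SIZE = 6_000_000_000
frequent = above_cutoff({"as of": 15_000, "rare one": 11_999}, NOW_SIZE)
```

Frequency alone is only the first filter in the study; the judgment of non-compositionality is a separate step that this sketch does not attempt to model.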
To verify whether the 675 expressions merit pedagogical attention, they were tested on the VOA (Voice of America) news corpus of 6 million words, which was intended for voluminous reading. Results show that they accounted for 5.08% of the running words in the VOA news corpus. It is hoped that knowledge of the most frequent opaque idiomatic expressions can help close the gap in lexical coverage that individual words fail to account for.
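One way to picture the coverage figure is as the share of running words that fall inside a listed expression. The sketch below is a minimal illustration under that assumption; the abstract does not state how the 5.08% was computed, and the tokenizer, the longest-match rule and the function names are all hypothetical choices.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def expression_coverage(text, expressions):
    """Return the percentage of running words that lie inside a listed expression."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    targets = {tuple(tokenize(e)) for e in expressions}
    targets.discard(())
    # Try longer expressions first so overlapping candidates resolve to the longest.
    lengths = sorted({len(t) for t in targets}, reverse=True)
    covered = 0
    i = 0
    while i < len(tokens):
        for n in lengths:
            if i + n <= len(tokens) and tuple(tokens[i:i + n]) in targets:
                covered += n
                i += n
                break
        else:
            i += 1
    return 100.0 * covered / len(tokens)


# Two of four running words belong to the expression "as of".
share = expression_coverage("As of right now", ["as of"])
print(share)  # 50.0
```

Counting every word of a matched expression as covered reflects the point made above: a learner who knows "as" and "of" separately may still not know "as of", so those running words are only truly covered once the expression itself is known.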

Keywords: Idioms, Language education, Voice of America, News