This post is funnier than you’re expecting.
I’ve done plenty of writing about words on this website. Time to break things down to the more granular level of letters. (Then maybe, one day, I’ll break things down to the individual shapes within the letters. “Letters with at least one straight vertical line are 3.7x more popular than letters with one to two curves.”)
A kindred spirit named Peter Norvig analyzed Google Books to compile a veritable smorgasbord of statistics on the frequency of words and letters in the English language. He wound up analyzing 743.8 billion words using 3.56 trillion letters, you know, just in case someone might dare to throw up the “small sample size” alert at only 3.4 trillion letters.
He broke down his results into several different charts, but the one I’m focusing on today is the overall frequency of letters. That means it’s every usage of each letter, regardless of its position in a word.
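If you want to try this kind of count at home on something smaller than Google Books, here is a minimal Python sketch of the idea. To be clear, this is not Norvig's code; the function names and the sample sentence are made up for illustration. It tallies every A-Z letter regardless of case or position in a word, then converts each letter's share into the "1 in N letters" form used in the list below.

```python
from collections import Counter


def letter_frequencies(text):
    """Return each A-Z letter's share of all letters in text.

    Case and position within a word are ignored; digits, spaces,
    punctuation, and accented characters are skipped.
    """
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    total = len(letters)
    return {letter: count / total for letter, count in counts.items()}


def one_in_n(share):
    """Turn a share such as 0.125 into the N of 'about 1 in N letters'."""
    return round(1 / share)


# Hypothetical sample text; a real run would read in a much larger corpus.
sample = "The quick brown fox jumps over the lazy dog"
freqs = letter_frequencies(sample)

# Print the five least-used letters in the sample, rarest first.
for letter, share in sorted(freqs.items(), key=lambda kv: kv[1])[:5]:
    print(f"{letter}: about 1 in {one_in_n(share)} letters")
```

A one-sentence pangram obviously won't reproduce the numbers below; the shares only start to look like English once the text is long enough.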
Unsurprisingly — especially for anyone who’s watched Wheel of Fortune — E is the most frequently used letter. Approximately one out of every eight letters is an E.
But I’m looking at the other end of the scale. Here are the 11 least-used letters in English, three or maybe four of which are decently surprising.
1 | Z, ~1/1111 letters used
Apparently Dr. Seuss books weren’t included in the study. That guy is Z ca-razy. Like Michael Scott pulling out an imaginary gun in every improv scene, Dr. Seuss busts out a made-up word featuring two to 17 Zs at every possible opportunity.
2 | Q, ~1/833
For what it’s worth, Norvig found that Q can exist without U. While QU is the most common pairing with Q, at least one book included a word that paired Q with virtually every other letter. There are only six exceptions; no book has a word containing JQ, QG, QK, QY, QZ or WQ.
3 | J, ~1/625
J is the only letter that doesn’t appear on the periodic table. Ooh, someone should figure out how frequently letters appear on the periodic table and then compare that to the frequency in other words. I would do it myself but I’m quite busy. With… I don’t know… let’s say work and learning swing dancing.
4 | X, ~1/435
You’d think X might be higher, and it might be if this were only taking into account how frequently each letter starts a word. There are precious few words that start with X. It’s why all of my son’s alphabet books have to use cheats, like “eXtinct” for A-Z animals. Or “xylophone” in basically every other situation, regardless of how apropos it is. (“I mean, I guess a book of America From A-Z could include xylophone. People here… um… play them sometimes?”)
5 | K, ~1/185
I guess the study didn’t look at too many comedy books. Everyone knows K is the funniest letter.
6 | V, ~1/95
V really gets a boost from all the usage of “I’ve.” Never doubt the power of drafting off people’s — especially writers’ — egocentrism.
7 | B, ~1/68
My first surprise of the list. I expect to see the letters from the end of the alphabet on this list. They’re the bad kids sitting in the back of the bus, throwing paper airplanes or doing whatever the bad kids do today. Meth? Probably meth. But seeing B, the second letter of the alphabet, in this ignoble spot? Blasphemy.
8 | Y, ~1/60
Y is my half surprise. Perhaps if the study were more skewed toward current (not, as I assume, primarily older, public domain) works, modern AdverbMania would’ve given Y the boost it so clearly, undisputedly deserves.
9 | W, ~1/60
If I were to anthropomorphize letters, W would sit around all day bitter at U, bitching about how it’s literally bringing twice as much to the table but getting a fraction of the usage. That’s the attitude I assume Boutros Boutros-Ghali had whenever he was around someone whose name was only one Boutros. Wow. This paragraph on W really went off the rails.
10 | G, ~1/53
I’ll call G my second surprise. I mean, it’s worth two points in Scrabble. That’s lower than four letters that aren’t on this bottom 11 list (C, M, F, H) and tied with D, which also isn’t on this list. If we can’t trust Scrabble, who can we trust? Plus every “-ing” word has a G. And it still only landed here? Won’t somebody think of the gerunds?
11 | P, ~1/47
For the final surprise, I wouldn’t have guessed P would be in the bottom half of letter usage. Although on that note, here’s a random “Everyone Hates P” trivia fact: Believe it or not, there are only three U.S. state names that contain the letter P. I’d tell you which ones but that’s no fun.