small words are hard to define -- 3/31/17
Today's selection -- from Word by Word by Kory Stamper. The smaller the word, the harder it is to define. The author, an editor of dictionaries, describes the difficulty involved in defining short words:
"We were working on revising the Collegiate [Dictionary] for its eleventh edition, and we had just finished the letter S. ... I signed out the next batch in T and grabbed the galleys for that batch along with the boxes -- two boxes! -- of citations for the batch. While flipping through the galley pages, I realized that my batch -- the entire thing -- was just one word: 'take.' Hmm, I thought, that's curious. Lexicography, like most professions, offers its devotees some benchmarks by which you can measure your sad little existence, and one is the size of the words you are allowed to handle.
"Most people assume that long words or rare words are the hardest to define because they are often the hardest to spell, say, and remember. The truth is, those are usually a snap. 'Schadenfreude' may be difficult to spell, but it's a cinch to define, because all the uses of it are very, very semantically and syntactically clear. It's always a noun, and it's often glossed because even though it's now an English word, it's one of those delectable German compounds we love to slurp into English.
Title page of the 1828 first edition of the American Dictionary of the English Language, featuring an engraving of Webster
"Generally speaking, and as mentioned earlier, the smaller and more commonly used the word is, the more difficult it is to define. Words like 'but,' 'as,' and 'for' have plenty of uses that are syntactically similar but not identical. Verbs like 'go' and 'do' and 'make' (and, yes, 'take') don't just have semantically oozy uses that require careful definition, but semantically drippy uses as well. 'Let's do dinner' and 'let's do laundry' are identical syntactically but feature very different semantic meanings of 'do.' And how do you describe what the word 'how' is doing in this sentence?
"It's not just semantic fiddliness that causes lexicographical pain. Some words, like 'the' and 'a,' are so small that we barely think of them as words. Most of the publicly available databases that we use for citational spackling don't even index some of these words, let alone let you search for them -- for entirely practical reasons. A search for 'the' in our in-house citation database returns over one million hits, which sends the lexicographer into fits of audible swearing, then weeping.
"To keep the lexicographers from crying and disturbing the people around them, sometimes these small words are pulled from the regular batches and are given to more senior editors for handling. They require the balance of concision, grammatical prowess, speed, and fortitude usually found in wiser and more experienced editors.
"I didn't know any of that at the time, of course, because I was not a wise or more experienced editor. I was hapless and dumb, but dutifully so: grabbing a fistful of index cards from one of the two boxes, I began sorting the cards into piles by part of speech. This is the first job you must do as a lexicographer dealing with paper, because those citations aren't sorted for you. I figured that 'take' wasn't going to be too terrible in this respect: there's just a verb and a noun to contend with. When those piles were two and a half inches high and began cascading onto my desk, I decided to dump the rest of the citations into my pencil drawer and stack my citations in the now-empty boxes.
"Sorting citations by their part of speech is usually simple. Most words entered in the dictionary only have one part of speech, and if they have more than one, the parts of speech are usually easy to distinguish between -- the noun 'blemish' and the verb 'blemish,' for example, or the noun 'courtesy' and the adjective 'courtesy.' By the time you've hit T on a major dictionary overhaul like a new edition of the Collegiate, you can sort citations by part of speech in your sleep. For a normal-sized word like 'blemish,' it's a matter of minutes.
"Five hours in, I had finished sorting the first box of citations for 'take.'"