The Misspelled Universe

How many times have you typed something into Google only to be asked, “Did you mean: …”? Next time you reach this page, stay a little longer and look at the pages that Google did find. This is your gateway to the parallel universe of misspelled words. Well, let me correct myself: these “misspelled” words may belong to a different language altogether, or they might even be rarely used, genuine English words that closely resemble heavily used ones.

An entire gamut of information is being denied to us because of spelling errors. And to deride all of these misspellings as mere errors is to ignore the small minority of people who deliberately misspell words to make their pages less publicly accessible.

Then there are people who exploit misspellings to make their living, e.g., people who search auction sites like eBay for misspelled (or mislabeled) items that are hence, hopefully, underbid. (eBay now offers a spell-check utility, but a surprising number of people still refuse to use it.)
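For the curious, here is a rough sketch of how a bargain hunter might come up with misspelled search terms to try. It is only an illustration: the function name `misspellings` and the three kinds of typo it covers (a dropped letter, a doubled letter, two neighbouring letters swapped) are my own assumptions, not a description of how any actual eBay tool works.

```python
def misspellings(word):
    """Return a sorted list of strings one simple typo away from `word`.

    Hypothetical helper for illustration; covers only three typo types.
    """
    variants = set()
    for i in range(len(word)):
        # Dropped letter: "camera" -> "camra"
        variants.add(word[:i] + word[i + 1:])
        # Doubled letter: "camera" -> "cammera"
        variants.add(word[:i] + word[i] + word[i:])
    for i in range(len(word) - 1):
        # Swapped neighbours: "camera" -> "camear"
        variants.add(word[:i] + word[i + 1] + word[i] + word[i + 2:])
    # Swapping two identical letters gives the original word back; drop it.
    variants.discard(word)
    return sorted(variants)


if __name__ == "__main__":
    for variant in misspellings("camera"):
        print(variant)
```

Each variant could then be typed into the auction site's search box by hand. Real typos are messier than this (wrong keys, phonetic spellings), so treat the list as a starting point rather than a complete one.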

Excepting eBay entrepreneurs, one thing that is clear is that we are “losing” this increasingly vast pool of information that contains misspelled “keywords” (the words we type into a search engine). There is an argument to be made that sources with misspelled words may themselves be of poor quality, and hence we needn’t worry about the “lost” information. Arguably, the frequency of misspelled words in a peer-reviewed journal is much lower than in, say, my blog ;) The normative question is: does that rightly consign my blog to obscurity?

Internet search is a classic case of finding a needle in a haystack, and search algorithms are built to dispense with as much “clutter” (hay) as fast as possible, leaving a very small minority of websites that are deemed to have genuine value. What we are seeing are two trends implicit in Google’s search algorithm: most of our search needs are about “popular” items (given a higher rank by Google), and it is progressively harder to find “unpopular” sources. On the face of it, the trend is innocuous and even sensible, but the wider ramifications include information hegemony.

Let us turn the discussion around to sites that use syntactically correct but meaningless verbiage that includes common search terms (a sentence like “Indeed, a blind crenelation blasphemously a player inside the stictomys. For example, a whopper behind a ferrocyanide indicates that the saccharinity behind a casino tropez another euphausiacea from another modem.”). People also “Google bomb” (mass posting on blogs/lists to associate a search phrase with a web address). Some sites have, in fact, automated this by writing programs that go to different blogs/lists and post entries/comments like “poker chips poker – [web address].” This problem is much worse, as it is making it progressively harder for us to find “genuine” (or the most popular/reliable) information. So will there be too much seemingly reliable unreliable information, or will we miss a lot of seemingly unreliable reliable information? Chances are that both will happen.
