Thanks @jeremy!
Some more fun results.
Given that the misspellings phenomenon looks like it’s driven by differences in the distributions in the source data, I wondered if you might be able to do other transformations. In particular, could you create a transformation vector to turn my turgid businesslike writing into flowery, poetic prose. Someone responded to Jeremy’s tweet wondering if you could build a transformation to make you sound smarter. Well, in a round about sort of way, it looks like you can.
Those who use long tricky words do tend always to use them, whereas people like me with a more limited vocabulary don’t often use ‘resplendent’ or ‘supercilious’ in our writing. If this rarefied vocabulary only crops up frequently in particular sources, then it will probably be in a different bit of the vector space.
So, I’ve tried building a transformation (a pretentiousness transformation?) by taking the average between pairs of words that mean roughly the same thing, but one of them is souped-up. For example,
('impoverished', 'poor'),
('diffident', 'shy'),
('congenial', 'agreeable'),
('droll', 'witty'),
('valiant', 'brave'),
('servile', 'dutiful')
And so on. The resulting transformation vector shares characteristic spikes in 2 dimensions with the spelling vector, but not in the way you might expect. The ‘souping-up’ direction is the same as the misspelling direction. I think this is because the ‘best spelled’ end of the spelling direction is based on business news sources, which tend to stick to fairly straightforward language.
To generate interesting ‘souped-up’ alternatives for words I’ve applied the transformation (after chopping off these spikes it shares with the misspelling vector, otherwise we start creating misspellings), taken the nearest neighbours, and then removed from this set the nearest neighbours of the un-transformed query word. This just gives you better results. For example, if we use the word ‘give’,
- the nearest neighbours are ‘giving’, ‘make’, ‘let’, ‘want’, ‘you’, ‘get’, ‘take’, ‘bring’, ‘need’;
- the transformed neighbours are ‘furnish’, ‘sufficient’, ‘compel’, ‘provide’, ‘bestow’, ‘obtain’, ‘afforded’, ‘confer’, ‘proffer’
It doesn’t always work, often giving you rubbish, but here are few examples of the options this method gives you:
query | fancy sounding alternatives |
---|---|
smelly | fusty, foul-smelling, odiferous, odorous, malodorous, fetid, putrid, reeking, mildewy |
funny | comical, droll, hilariously, humourous, ironical, irreverent, sardonic, witty |
fat | adipose, carbohydrate, cholesterol, corpulent, rotund, triglyceride |
clever | adroit, artful, astute, cleverness, deft, erudite, shrewd |
book | monograph, scholarly, treatise, tome, textbook, two-volume, nonfiction |
loud | boisterous, cacophonous, clamorous, deafening,ear-splitting,resounded,sonorous,strident |