Wednesday, October 12, 2016


Sometimes it is just really nice to remember that there is progress in the description of (some) languages, even though it might take decades.

From Lindblom, Gerhard. (1926) Notes on Kamba Grammar (Archives d’Études Orientales 10). Uppsala: Appelbergs Boktryckeri, page 34:

"It was thought, for a long time, that East-African dialects did not have any musical accent, any fixed melody bound to the word as such, as in Swedish, Lituanian, Chinese and other languages. The researches of Meinhof and others have proved the existence of a musical accent in Kishambala and Kinyamwezi. I cannot fully ascertain, whether such an accent exists in Kikamba, but it is certain that there are in the language several words absolutely identical as to the constituent sounds, which in pronunciation are strictly distinguished by the natives. They have often laughed, when I have said the word for 'rust' instead of that for 'guinea-fowl' (see ex. below). As far as I can hear, there is no difference of stress in the following pairs of words, so I suppose there must be some difference of pitch that I have not been able to catch. I give the examples for further research."

From Roberts-Kohno, Rosalind Ruth. (2000) Kikamba Phonology and Morphology. The Ohio State University doctoral dissertation, page 191:

"Kikamba is a Bantu language with four tones: Super-Low (SL), Low (L), High (H), and Super-High (SH)."

Tuesday, September 20, 2016

The Glamour of Grammar

A quotation from Lyle Campbell's Historical Linguistics (1998:5) on the shared etymology of the words glamour and grammar, in honour of today's xkcd cartoon:

'Glamour is a changed form of the word grammar, originally in use in Scots English; it meant 'magic, enchantment, spell', found especially in the phrase 'to cast the glamour over one'.  It did not acquire the sense of 'a magical or fictitious beauty or alluring charm' until the mid-1800s.  Grammar has its own interesting history...In Classical Latin, grammatica meant the study of literature broadly.  In the Middle Ages, it came to mean chiefly the study or knowledge of Latin and hence came also to be synonymous with learning in general, the knowledge peculiar to the learned class.  Since this was popularly believed to include also magic and astrology, French grammaire came to be used sometimes for the name of these occult 'sciences'.'

Monday, September 19, 2016

Sound-meaning associations across more than 4000 languages

My friend Damián ‘blasé Damián’ Blasi published a paper in Proceedings of the National Academy of Sciences this week, with Søren Wichmann, Harald Hammarström, Peter Stadler and Morten Christiansen, available here.

The paper was on how there are common sounds that languages across the world tend to use in particular words.  An example is the word ‘nose’, which tends to contain the nasal consonant 'n' more than expected by chance.  The word for 'horn' often has a 'k', reminiscent of the 'bouba/kiki' effect, where people who are asked to use the labels 'bouba' and 'kiki' to describe the two shapes below will tend to use 'kiki' for the jagged shape on the left.

The word for 'small' often has an 'i', as if the lips are being placed together to indicate size in the same way that finger tips can:

The word for 'breast' often has an 'm', the reason for which I will not attempt to speculate on.  People have noticed that words for 'mother' also often have an 'm', although that hypothesis could not be tested here because 'mother' is not in the word list they used.

Stranger associations include 'dog' often having an 's', 'fish' having an 'a', 'star' having a 'z', and 'name' having an 'i'.  The authors also find negative associations, such as 'dog' not tending to use the sound 't'.

Why do these associations occur?  One reason may be that words are often deliberately sound-symbolic.  People are good at tasks such as the bouba/kiki task above, without necessarily being able to articulate what the connection is between shapes and sounds.  Languages can exploit these unconscious connections in forming words, such as English words with fl ('fly', 'flutter', 'fleeting') that have something to do with motion, or gl ('glimmer', 'glitter', 'gleam') to do with light.  Languages can have whole classes of ideophones, words that resemble onomatopoeia ('bang', 'whoosh') but which can be very specific and go beyond sound and describe other senses, such as the Ngbaka Gaya loɓoto-loɓoto 'large animals plodding through mud', Korean 초롱초롱 chorong-chorong 'eyes sparkling', or Siwu nyɛ̃kɛ̃nyɛ̃kɛ̃ 'intensely sweet' (see this review).

A surprisingly diverse set of meanings can therefore be depicted with sounds, in ways that make a certain intuitive sense.  People can guess the meaning of sound-symbolic terms to some extent, even without knowing the language that they are from.  Other studies have found that people are good at guessing the meaning of words in a foreign language to some extent, for instance which out of the Hungarian words kicsi and nagy means 'big' or 'small', suggesting that even basic, conventionalised vocabulary can retain a sound-symbolic component, conceivably because they are descended from words that were once ideophones.

The intuition that some sounds are more likely to be found in certain words is therefore well-known, but it has not been tested across a large language sample with careful statistical controls before.  The predecessor to this paper was a paper by Wichmann, Holman and Brown (2010) which tested this for the first time, but whose statistical methods are rather strange (including controlling for relatedness by attempting reconstructions of words in proto-languages).  

By contrast, the methods of this new paper are very clear and seem sound.  Besides controlling for language families, which I will return to, the authors tested each association in six different areas of the world independently: Africa, Eurasia, Papua New Guinea and the Pacific, and (indigenous languages in) Australia, North America and South America.  They only report associations which are positive in at least three independent areas.
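The replication rule in the last sentence above is easy to state as code. This is only a minimal sketch of the filter as I understand it from the paper's description, not the authors' implementation; the function name and the per-area outcomes in `example` are invented for illustration.

```python
# The six macro-areas listed in the post.
MACRO_AREAS = (
    "Africa", "Eurasia", "Papua New Guinea and the Pacific",
    "Australia", "North America", "South America",
)

def robust_associations(positive_areas, min_areas=3):
    """Keep (concept, phoneme) pairs that tested positive in at least
    min_areas distinct macro-areas."""
    return sorted(
        pair for pair, areas in positive_areas.items()
        if len(set(areas) & set(MACRO_AREAS)) >= min_areas
    )

# Hypothetical per-area test outcomes (assumed data, not from the paper).
example = {
    ("nose", "n"): ["Africa", "Eurasia", "North America", "South America"],
    ("dog", "s"): ["Eurasia", "Australia"],
}
print(robust_associations(example))
```

Requiring agreement across areas that are tested separately is what makes a single large family or contact zone less able to produce a reported association on its own.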

Because they didn't know in advance what sounds would be associated with which meanings, they tested every possible association in the data set.  This is a type of multiple testing, and so you can get some associations by accident (such as the number of drownings in pools correlating with the number of films Nicolas Cage appeared in each year).  The authors use a correction for this, which Damián once explained to me in its general form: a data set contains many correlations of varying p-values, some accidentally below 0.05 (i.e. spurious correlations), but many others above 0.05 and going as high as 1.  In a completely random data set, a histogram of the p-values of all correlations looks like this, where the number of 'significant' correlations with a p-value below 0.05 isn't actually any higher than you'd expect for any other interval, suggesting that the p-values below 0.05 are due to chance:

By contrast, in a non-random data-set where 20 variables are correlated with one particular variable, there are many more correlations with p-value below 0.05 than in other intervals, as shown by the spike on the left:

You can test every correlation in the data-set and find out what the expected number of false positives is (i.e. the number of p-values that fall in any particular interval).  You can then choose a threshold such as p<0.0001, below which the number of false positives is going to be small, say 5% of the correlations that you report (as the authors do in this paper).
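The logic of the p-value histogram and the threshold choice can be sketched in a few lines of Python. This is a generic illustration of false discovery rate reasoning under my own simplifying assumptions (test statistics are simulated as standard normal draws, and null p-values are taken to be uniform), not the correction the authors actually ran; all function names are made up for the example.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def p_value_histogram(p_values, n_bins=20):
    """Count p-values in n_bins equal-width intervals on [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts

def estimated_fdr(p_values, threshold):
    """Expected false positives (m * threshold if null p-values are uniform)
    divided by the number of tests reported at that threshold."""
    reported = sum(1 for p in p_values if p <= threshold)
    if reported == 0:
        return 0.0
    return min(1.0, len(p_values) * threshold / reported)

def pick_threshold(p_values, candidates, target_fdr=0.05):
    """Most lenient candidate threshold whose estimated FDR is within target."""
    ok = [t for t in candidates if estimated_fdr(p_values, t) <= target_fdr]
    return max(ok) if ok else None

def simulate(n_null, n_true, effect, seed=0):
    """Simulated p-values: n_null tests with no effect, n_true with a shift."""
    rng = random.Random(seed)
    zs = [rng.gauss(0, 1) for _ in range(n_null)]
    zs += [rng.gauss(effect, 1) for _ in range(n_true)]
    return [two_sided_p(z) for z in zs]

random_only = simulate(2000, 0, 0.0)   # roughly flat histogram
with_signal = simulate(2000, 20, 5.0)  # extra mass in the leftmost bin
print(p_value_histogram(random_only))
print(p_value_histogram(with_signal))
print(pick_threshold(with_signal, [0.0001, 0.001, 0.01, 0.05]))
```

The point of the sketch is only that the excess of small p-values over the flat baseline is what tells you how many of the reported associations are likely to be real.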

Finally, they control for word length and the rate at which a phoneme is expected to appear in other words of the word list.  They find the frequency with which a phoneme is found in a particular word using a genealogically balanced average (i.e. treating each family as one datapoint), and compare it with the frequency with which the phoneme appears in other words in the word list.  The ratio of the two is in some cases high, if there is an association of that phoneme with a particular concept, and the significance of that association can be computed by comparing this ratio with the ratio obtained by selecting random words from the same languages.  Word length needs to be controlled for as well, because words differ in how long they are on average ('I', 'me' and 'water' are the shortest words in the list cross-linguistically, and 'star' and 'knee' are the longest).  They correct for this by doing the above test but just comparing words of the same length; and then also performing a test with simulated words of the same length, rejecting any associations between a phoneme and a concept which came out as strongly in the simulated data as in the real data.
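To make the genealogically balanced ratio concrete, here is a small sketch on toy data. It is my own reading of the description above, not the authors' code: the data layout, the function names and the word forms are all invented, and a plain substring check stands in for proper phoneme matching on ASJP transcriptions.

```python
from collections import defaultdict

def family_balanced_rate(languages, phoneme, concepts):
    """Mean over families of the within-family share of word forms
    (for the given concepts) that contain the phoneme."""
    per_family = defaultdict(lambda: [0, 0])  # family -> [hits, forms seen]
    for lang in languages:
        for concept in concepts:
            form = lang["words"].get(concept)
            if form is None:
                continue
            tally = per_family[lang["family"]]
            tally[0] += phoneme in form  # crude substring match (assumption)
            tally[1] += 1
    rates = [hits / seen for hits, seen in per_family.values() if seen]
    return sum(rates) / len(rates) if rates else 0.0

def association_ratio(languages, phoneme, concept):
    """Balanced rate in the target concept over the rate in all other concepts."""
    others = {c for lang in languages for c in lang["words"]} - {concept}
    target = family_balanced_rate(languages, phoneme, [concept])
    baseline = family_balanced_rate(languages, phoneme, sorted(others))
    return target / baseline if baseline else float("inf")

# Invented toy data: family A has three languages, family B has one.
toy = [
    {"family": "A", "words": {"nose": "nas", "water": "aka"}},
    {"family": "A", "words": {"nose": "neb", "water": "wan"}},
    {"family": "A", "words": {"nose": "nos", "water": "uda"}},
    {"family": "B", "words": {"nose": "pim", "water": "tuk"}},
]
print(family_balanced_rate(toy, "n", ["nose"]))
print(association_ratio(toy, "n", "nose"))
```

In the toy data three of the four languages have 'n' in 'nose', but because they all sit in family A, the balanced rate counts them as one data point alongside family B.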

I have two minor criticisms of their method of controlling for language relatedness.  An association between a phoneme and a meaning can be inflated by families with a lot of languages such as Indo-European, and the authors deal with this problem by treating each known language family (or isolate) as just one data point, by effectively taking the mean value for that family: for instance, about 82% of the Indo-European languages have 'n' in the word for 'nose' by my count, so Indo-European as a whole gets a value of 0.82.  However, this assumes that Indo-European has a completely flat structure, ignoring the fact that languages belong to sub-groups within Indo-European such as Germanic, Romance and so on.  A single branch of the family with a lot of languages can inflate the value, meaning that they are not controlling for non-independence at lower levels in the family.  

A simple correction for this (one that the authors must have considered but for some reason rejected) is to take average values weighted by branch: for example each node in the family tree gets an average value such as 0.94 in Germanic, 0.82 in Romance and so on, and these are averaged to produce a value at the root, the phylogenetic mean.  This can be estimated using the 'ace' function in the R package 'ape', in which Indo-European gets a much higher value of 0.94.
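The difference between a flat family average and an average weighted by branch can be shown with a nested list standing in for a family tree. This is a deliberately simple sketch of the averaging idea only; it is not what the 'ace' function computes, and the tree and tip values below are made up rather than taken from Indo-European data.

```python
def leaves(tree):
    """Yield tip values (0 or 1) from a tree written as nested lists."""
    if isinstance(tree, (int, float)):
        yield tree
    else:
        for child in tree:
            yield from leaves(child)

def flat_mean(tree):
    """Average over all languages, ignoring sub-grouping."""
    tips = list(leaves(tree))
    return sum(tips) / len(tips)

def nested_mean(tree):
    """Average in which each node's value is the mean of its children,
    so a large branch counts no more than a small sister branch."""
    if isinstance(tree, (int, float)):
        return float(tree)
    values = [nested_mean(child) for child in tree]
    return sum(values) / len(values)

# Hypothetical family: one large branch where every language has the
# phoneme (1), one mixed branch, and one single-language branch without it.
family = [[1, 1, 1, 1, 1, 1], [1, 0], [0]]
print(flat_mean(family))
print(nested_mean(family))
```

Here the large branch pulls the flat mean up, while the nested mean treats the three branches as equals; with real data the shift can of course go in either direction.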

The second problem is that slower-changing words are more likely to exhibit sound-meaning correspondences by chance.  Some words change very slowly, such as pronouns and numerals.  In those slow-changing words, there is going to be a higher probability of a particular phoneme being found associated with that meaning, simply because those forms are more similar across languages within a family.  It is still unlikely that many spurious correlations will arise in these words, given that each family is one data point, and the association needs to be in three independent areas; but it may perhaps influence some of the stranger sound-meaning associations that they find, such as 'I' having palatal nasals, or 'one' having 'n' or 't'.

A possible improvement to their method is to not take mean values, such as how many languages use 'n' in the word for 'nose', but to use family trees to reconstruct whether languages in the past used 'n' in the word for 'nose' and how they changed over time.  As a crude example, here is a plot of a set of Eurasian families (Indo-European, Dravidian, Uralic, Turkic, Mongolic and Tungusic) in a family tree, using the Glottolog classification and then randomly made into a binary tree with branch-lengths each of 1.  Tips with a yellow dot beside them are languages which have 'n' in the word for 'nose'.  You can then use maximum likelihood reconstruction in the R package 'ape' to reconstruct the probability of each ancestral node having 'n' in the word for 'nose', plotted here below by using blue circles whose sizes are proportional to these probabilities.  For example, Proto-Indo-European is reconstructed as having 'n' in 'nose' with a probability of 0.99, in agreement at least with linguists' reconstructions (Mallory and Adams 2006:175).

You can then calculate the rates of gaining and losing 'n' in 'nose' over time.  The rate of gaining 'n' is 0.17, which is higher than for other phonemes such as 0.05 for 't'.  The idea is that 'n' is gained frequently, suggesting that there is something that causes that sound to be associated with that meaning.  By contrast, the association between 'n' and 'one' seems to be more explicable simply by how slow-changing words for 'one' are, as suggested by the plot below.  The rate of gaining 'n' in 'one' is much lower, 0.02, with languages that have it tending to be retaining this association rather than innovating it (again Proto-Indo-European is reconstructed as having it with 0.99 probability).  Similar results for the rates of change are obtained when language families are not assumed to be related but when this is done just within Indo-European.
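For readers who want to see what estimating gain and loss rates on a tree involves, here is a stripped-down sketch in Python rather than R. It is not the 'ape' analysis described above: it assumes a two-state model (phoneme absent or present), unit branch lengths as in my crude trees, a flat prior at the root, and a coarse grid search in place of a proper optimiser. The tree encoding and all names are invented for the example.

```python
import math

def transition_probs(gain, loss, t=1.0):
    """2x2 transition matrix of a two-state continuous-time Markov chain.
    State 0 = phoneme absent, state 1 = phoneme present."""
    total = gain + loss
    decay = math.exp(-total * t)
    return [
        [(loss + gain * decay) / total, gain * (1 - decay) / total],
        [loss * (1 - decay) / total, (gain + loss * decay) / total],
    ]

def partials(node, P):
    """Felsenstein pruning: likelihood of the tips below node, for each
    state at node. A tip is 0 or 1; an internal node is a tuple of children."""
    if isinstance(node, int):
        return [1.0 - node, float(node)]
    result = [1.0, 1.0]
    for child in node:
        below = partials(child, P)
        for state in (0, 1):
            result[state] *= P[state][0] * below[0] + P[state][1] * below[1]
    return result

def log_likelihood(tree, gain, loss, prior=(0.5, 0.5)):
    root = partials(tree, transition_probs(gain, loss))
    return math.log(prior[0] * root[0] + prior[1] * root[1])

def root_probability_present(tree, gain, loss, prior=(0.5, 0.5)):
    """Posterior probability that the root had the phoneme."""
    root = partials(tree, transition_probs(gain, loss))
    present = prior[1] * root[1]
    return present / (prior[0] * root[0] + present)

def fit_rates(tree, grid):
    """Grid-search maximum likelihood estimate of (gain, loss)."""
    return max(
        ((g, l) for g in grid for l in grid),
        key=lambda rates: log_likelihood(tree, rates[0], rates[1]),
    )

# Hypothetical four-language tree in which every language has the phoneme.
tree = ((1, 1), (1, 1))
gain, loss = fit_rates(tree, [0.01, 0.1, 1.0])
print(gain, loss, root_probability_present(tree, gain, loss))
```

A real analysis would use the full tree, a finer search over rates, and some check of how sensitive the estimates are to the random binarisation and the branch lengths.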

The authors may have decided not to use phylogenetic methods because of the apparent crudity of the assumptions that you have to make, in the absence of well-resolved binary trees with branch lengths for most families.  But even the crudest solution to that problem, such as my analysis here in R using maximum likelihood with branch lengths of 1 and randomly binarised trees, is a more accurate model than theirs, which effectively assumes entirely flat trees with no internal structure, branch lengths of 1, and no difference between innovation and retention.  The use of phylogenetic methods makes those assumptions more explicit, not more crude.

There are other minor problems, but these are much harder to solve and topics of ongoing research.  For instance, borrowing between languages can create an association between a phoneme and a meaning within a particular macro-area.  The authors explore this possibility by testing how well nearest neighbouring languages predict the presence of particular associations.  Some associations such as using 's' in the word for 'dog' are indeed more likely to occur in a language if it has an unrelated language nearby which also has 's' in the word for 'dog', suggesting that this particular association may have been inflated by borrowing between some language families.  There is finally the possibility that some language families are related to each other, and hence that some form-meaning associations are inherited from ancestral languages.  This has been argued in particular for pronouns, which often use similar sounds across Eurasia in particular.  While it is difficult to correct for this yet without proper cognate-coding between languages in the ASJP (a task I describe here), the authors explore this by comparing the distribution of sound-meaning associations with similarity of word forms overall (as cognates are expected to be more similar than non-cognates).  They also point out that the opposite point holds, that the existence of sound-meaning correspondences casts doubt on some claims about 'ultra-conserved' words.

Despite these possible issues, it is unlikely that many of the correlations they report are spurious, and the results are interesting both in the hypotheses that they confirm (small and 'i', for example), and in the unexpected associations that they discover. There has been plenty of positive coverage of the paper in the media, with particularly good write-ups in The Economist here, The Guardian here, and the Scientific American here.  Science Daily had the witty headline 'A nose by any other name would sound the same', which was also used by the Washington Post and other venues.  The Telegraph and The Sun opted for incoherent clickbait such as 'Humans may speak a universal language, scientists say'. 

While the research does not exactly point to a universal language, it does show that humans are good at perceiving links between form and meaning and use these in language far more than previously thought.  It is not necessarily true that these usages are conscious, and speakers may not be aware that there is anything potentially sound-symbolic about the 'n' in 'nose', for example.  One might speculate that sound-meaning associations are in some cases relics of more consciously sound-symbolic terms such as ideophones.  Another possibility is that words which carry a particular suitable phoneme may be more easily learnt, or somehow preferred in everyday language use, even if this behaviour is unconscious.  Blasi et al.'s paper raises the intriguing question of how associations such as 'n' in 'nose' have come about historically and what it is about speakers' behaviour that favours them.  The other exciting contribution of their paper is a set of sound-meaning correspondences across languages that are as evocative as they are hard to explain - 's' in 'dog', 'z' in 'star', 't' in 'stone', 'l' in 'leaf'.

A further implication of the paper that the authors themselves will perhaps not endorse (or other colleagues of mine who study sound-symbolism such as Dingemanse et al. in this review) is that humans may be good at perceiving links such as between 'k' and jagged shapes because of natural selection for the ability to perceive cross-modal mappings, specifically in the context of acquiring language.  I am not arguing that specific associations such as 'nose' and 'n' are innate, but a general ability to perceive associations across modalities is likely to be innate and have been selected for in the context of acquiring language.  People vary in their ability to guess the meanings of ideophones or words in another language, and there is evidence that this ability is linked with synaesthesia, the condition of having associations across senses such as sounds with colours.  Synaesthesia often runs in families (e.g. this study), giving one example of the way that the ability to make cross-modal mappings with sounds could be subject to genetic variation.  

The fact that competent speakers are good at tasks such as the bouba/kiki task or guessing the meanings of foreign words and ideophones suggests that the genetic variants underpinning these abilities must have become common, perhaps under the influence of selection.  If there was any selection pressure on these abilities in the past, it may have been less on the ability to remember vocabulary (although Imai and Kita (2014) argue that sound-symbolism does help infants learn words) than on understanding the concept of spoken communication at all.  Hominids have clearly had some form of communication from 2.5 million years ago which likely used at least manual gesture, as evidenced by tool traditions which require a certain amount of active teaching, leading us to have an enriched ability to understand the communicative intention of gestures compared with other apes.

The transition to using speech may have similarly created a selection pressure for an instinctive understanding of how sounds can convey meaning.  If so, sound-meaning associations are evidence of shared psychological biases that originally allowed the evolution of spoken language.

External image sources: mouth gesture, bouba/kiki, Magritte

Monday, September 5, 2016

Diachronic and functional explanations of typological universals @ SLE2016

The last two days I attended the workshop "Diachronic and functional explanations of typological universals" at the 49th Annual Meeting of the Societas Linguistica Europaea in Naples, Italy. This theme session was organized by Natalia Levshina, Ilja Seržant, Susanne Michaelis and Karsten Schmidtke-Bode; the description can be found here (pages 409-410) and the introduction can be found here.

The purpose of the workshop was to draw attention to the importance of diachronic explanations for typological universals. Typological universals were also the topic of Jeremy's last post. They are properties found in (nearly) all languages, or if not all, enough languages to deserve an explanation. Jeremy discusses word order universals, an example being that if a language has verb-object word order, it is also likely to have prepositions, i.e. to have adposition-noun order (see Matthew Dryer's map in the World Atlas of Language Structures Online). Many explanations proposed for universals have been 'functional': verb-object and adposition-noun, for instance, would pattern together because they branch in the same direction, and this is easier to process than structures which do not match in this regard (Dryer 1992). There are many more examples of studies that explain typological correlations in terms of ease of processing, comprehension, production, and (first language and second language) learnability.

As the conveners of the workshop note, an alternative view on explaining universals comes from the history (diachrony) of languages. This is also the point of Jeremy's post: showing that at least some word order universals can be explained through highly common grammaticalization pathways rather than the pull of processing or learnability constraints.

It is really unfortunate that for some time now, linguistic typology and historical linguistics have been conceived of as quite different and separate enterprises. Shibatani and Bynon (1995: 20-21) describe how this disunion came about when transitions between the famous morphological "stages" of isolation, agglutination, and fusion proved unfalsifiable in the sixties. Later on, Givón and Greenberg would pave the road towards an integration of typology and historical linguistics. This integration has not fully been achieved, however. Handbooks on linguistic typology often include a chapter on language change, but they don't give language change the central role it should have in order to advance the field.

During the Naples workshop, it became clear that all participants (mostly typologists looking at various morpho-syntactic features) feel the need for further development of historical explanations of typological patterns and universals. This includes Sonia Cristofaro, first and foremost, who has published extensively on this topic. See also her interview with Martin Haspelmath.

Other participants with a historical outlook were Eugen Hill, who described a diachronic rather than a functional explanation of Watkins' law; Eitan Grossman, who presented multiple grammaticalization pathways through which agent nominalizers (such as '-er' in 'kill-er') come about; and Michael Cysouw, who proposed an extension of Maslova (2002) in order to study transition probabilities between states, rather than frequencies of states.

Other participants emphasised the role of functional explanations. First and foremost, this includes Susanne Michaelis and Martin Haspelmath, the latter of whom argues for a functional-adaptive constraint that guides diachronic change in general; during this workshop they jointly proposed this constraint to explain the difference in length between independent personal pronouns (mine as in 'the book is mine') and dependent personal pronouns (my as in 'my book'). (See also Martin Haspelmath's post on this topic on his blog Diversity linguistics comment).

Other participants with a functional outlook were Ilja Seržant, who defended a functional account of the rise of differential object marking in Old Russian; Anita Slonimska and Seán Roberts (who won first prize for best PhD presentation!), who showed that question words are similar in order to facilitate early speech-act recognition; and Paul Widmer et al., who present a phylogenetic comparative study of the stability of recursion of the Indo-European noun phrase, a result that can only be explained through a neurophysiological preference for recursion.

Then there was a set of participants who proposed links between history and functional explanations: Olga Fischer, who discusses the role of analogy in grammaticalization; Borja Herce Calleja, who shows that past and future time deictic adverbials (like 'ago' (past) and 'in X years time' (future)) have different diachronic sources, which could be explained by the cognitive and experiential gap between past and future; and my own talk with Andreea Calude, which considered diachronic and functional explanations of atom-base order of numerals in Indo-European.

The most interesting work, at least in my opinion, was presented by Balthasar Bickel and Damián Blasi, Natalia Levshina, and Karsten Schmidtke-Bode. These three presentations all attempted to explain typological patterns through synthesis not only of typological and diachronic data, but data from experiments, corpus studies, gesture studies, and information structure. Balthasar Bickel and Damián Blasi presented an account of the preference for unmarked initial noun phrases to be either S ("subject") or A ("agent") arguments using findings from the neurophysiology of language processing, phylogenetics, and gesture studies. Natalia Levshina explained typological distributions of lexical, morphological, and analytic causatives through economy principles, using evidence from a corpus study, a typological study, a language learning experiment, consideration of diachronic pathways, and frequency effects. Karsten Schmidtke-Bode presented findings on diachronic pathways, iconicity, information structure and corpora to explain the placement of S ("subject") and A ("agent") complements.

These three talks were the most impressive to me, as the best bet we have in explaining typological patterns is to incorporate as many different types of data as we can get. This was summed up rather nicely in Balthasar Bickel and Damián Blasi's conclusion. In my wording, they state that in order to explain diachronic universals, we need:
 - a theory of common evolutionary pathways (grammaticalization)
 - a theory of aspects that constrain random evolution: i.e. 'functional' needs from communication, processing, production, learnability, etc.

Hence, the two themes of the workshop are really two sides of the same coin. This is not a new conclusion, see for instance Jadranka Gvozdanović who writes in 1997 "a theory of language history is explanatorily adequate to the extend that it is able to correlate attested language data with types of language activity whose consequences they are" (Gvozdanović 1997). However, we may now have tools (demonstrated by various participants of the workshop) as well as ever increasing data sets that can be used to provide a holistic account of language universals. The role of grammaticalization in this endeavor is central, as Jeremy noted earlier, and very interesting in light of Shibatani and Bynon's description of the state of the art in 1995: "But we do not now see grammaticalization as a mechanism which propels entire languages from one type to another" (Shibatani and Bynon 1995: 21). This has obviously changed, and so much the better for our understanding of typological and historical patterns.


Dryer, Matthew S. 1992. The Greenbergian word order correlations. Language, 68(1). 81-138.

Gvozdanović, Jadranka. 1997. Introduction. In Gvozdanović, J. (ed.), Language change and functional explanations, 1-8. Berlin: Mouton de Gruyter.

Maslova, Elena. 2002. Distributional universals and the rate of type shifts: towards a dynamic approach to "probability sampling". Lecture given at the 3rd Winter Typological School, Moscow. Available online at

Shibatani, Masayoshi, & Bynon, Theodora. 1995. Approaches to language typology: A conspectus. In Shibatani, M. & Bynon, T. (eds.), Approaches to language typology, 1-26. Oxford: Oxford University Press.

Saturday, August 20, 2016

Publish your research Open Access - for you, me and everyone!

Thinking of turning that MA thesis into an article, putting together an edited volume from a workshop or finally writing that big book on discourse particles that is going to solve everything? Why not consider Open Access (OA) publishing instead of the traditional publishing houses and journals? With OA, people have an easier time actually reading your work, and you won't be feeding money into a potentially shady system that exploits academics as editors and reviewers for free, and then makes the same community pay to read the products. Furthermore, by selecting OA options that have a good reputation, you're not in danger of the standing of your work being lowered.

There are several traditional publishing venues that exploit the benevolence of the academic community by not paying reviewers and/or editors, not paying authors and then expecting university libraries to pay expensive rates when buying the published research - research that taxpayers somewhere have probably already funded.

Universities like Harvard are already actively encouraging their researchers to choose OA; the Max Planck Society in Germany is also an enthusiastic supporter and co-founder of the OA movement; and CERN are either publishing their research themselves (they already have a rigorous review process in-house) or at other OA venues. After all, why would these research institutions like to pay for things several times? They're often already providing the reviewing and editing themselves, so why mix in a middle man who charges you money for honestly not much added value?

It can be hard for a junior researcher to take the step to publishing OA: the ranking of a certain OA journal might not be good enough yet, or no-one might have heard of or trust the venue. That's why it's important to have a set of trusted, high-quality venues where more senior and established researchers are already publishing and where you can too!  We've compiled a helpful list of venues where you can publish your work. Go here to see it! If you want to know more about different kinds of OA (Gold, Platinum, Green, Blue etc), click here.

And before you ask, OA does not mean a lack of reviewers, editors, proof-readers etc. There are other funding schemes that can provide pay for such work, or sometimes academics provide it for free.

Finally, it may be that you're at a financially stable institution with a library that can afford to pay these fees, but do consider the fact that not everyone else is in such a privileged position.

Thank you for your time, go OA!

P.S. Naturally there are non-evil publishing places that are non-OA.

Friday, August 19, 2016

Some language universals are historical accidents

There are surprisingly few properties that all languages share.  Pretty much every attempt at articulating a genuine language universal tends to have at least one exception, as documented in Evans and Levinson's article 'The Myth of Language Universals'.  However, there are non-trivial properties that are found in if not literally all languages, enough of them and across multiple language families and independent areas of the world, that they demand an explanation.  

An example is the fact that languages have predictable word orders.  Languages differ in whether they allow the verb to come before or after the object (English has it before, Japanese after).  They also differ in whether they have prepositions (such as English ‘on the table’) or postpositions (such as Japanese テーブルの上に teburu no ue ni ‘the table on’).  If a language has the verb before the object then it tends to have prepositions rather than postpositions, as in English; if the verb is after the object, it is a good bet that the language will have postpositions rather than prepositions (these rules hold for 926/981 languages in WALS, not controlling for relatedness).  The orderings of different elements in a sentence such as the noun and adjective, noun and possessive, and so on, are to some extent free to vary among languages, but again tend to fall into correlating types.

Why should knowing the word order of one category in a language help predict the orderings of other categories?  Many people have taken facts such as this as evidence that language is shaped by principles of harmony across grammatical categories, or evidence of Universal Grammar.  Another possible explanation is that languages which have similar word orders for different grammatical categories are somehow easier to learn, or easier to use. 

However, I would argue that at least some of these patterns are not evidence of our psychological preferences, but are accidental consequences of language history.  This post is a response to a couple of blog posts written by Martin Haspelmath this week on Diversity Linguistics Comment (here, and here with Sonia Cristofaro), which argued that historical explanations of universals still need to invoke constraints on language change, and that 'the changes are often adaptive, and that in such cases a precise understanding of the diachronic mechanisms is not necessary (though of course still desirable).'  I disagree, in the particular case of word order correlations, and I will argue specifically in this post that word order correlations are a consequence of grammaticalization.  In arguing this I'm building on work by Aristar (1991) and Givon (1976), but developing the argument to refute Haspelmath's points.  I will present the background of the argument first and then discuss his points in more detail.

Grammaticalization is the process by which new grammatical categories can be formed from other (often lexical) categories. For example, Mandarin Chinese has a class of words which might be called prepositions if they were in a European language, but which really have their historical roots in verbs.  An example is 從 cóng which in modern Mandarin is a preposition meaning ‘from’ but which in Classical Chinese was a verb meaning ‘to follow’, as these two sentences illustrate.  

我 從    倫敦       來
I  from London  come
‘I come from London’

天下                         之                   民       從       之
under-heaven  POSSESSIVE  people  follow  him
‘Everyone in the world follows him’  (孟子萬章上 Mengzi Wanzhang Shang: from the 古代漢語辭典 Gudai Hanyu Cidian)

The word 從 cóng has changed its meaning from ‘follow’ to a more abstract spatial meaning ‘from’.  It has also lost its ability to be used as a full verb, requiring another verb such as ‘come’ in the sentence, just as English requires a verb in the sentence ‘I come from London’ (*‘I from London’ and its equivalent *我從倫敦 are ungrammatical).  Other Chinese prepositions, such as 跟 gēn ‘with’, also have a verbal origin, while many preposition-like words such as 給 gěi 'for' and 在 zài 'in/at' retain verbal meanings ('give' and 'to be present') and verbal syntax (such as being able to be used as the sole verb in the sentence and to take aspect marking).

Why is this relevant to word order universals?  Because if so-called prepositions in Chinese were once historically verbs which have since lost their verbal uses, this can explain why the two grammatical classes have the same word ordering: they were once the same category, and they simply haven’t changed their word orders since then.  Since the verb precedes the object in Chinese, as in the Classical Chinese sentence given above (從之 cóng zhī ‘follow him’), the preposition in modern Chinese also precedes the object (從倫敦 cóng Lúndūn ‘from London’).

It is incorrect to say that Chinese has prepositions and verb-object order because this is a combination that is easy to process or learn, or because the categories are in some other sense ‘harmonious’.  The real explanation is that verbs and prepositions in Chinese have a common ancestor, and have simply preserved their word orders since then.  This is a subtle variant of Galton’s problem, by which the historical non-independence of data points can create correlations that are not causal.  Most examples of this are from relatedness of whole languages or cultures, for example the spurious correlation between chocolate consumption and Nobel Prize winners, or the way that a maths teacher at my school was exercised by the fact that in many languages ‘eight’ and ‘night’ often rhyme or are similar (e.g. German acht and Nacht, French huit and nuit, Italian otto and notte).  There is nothing mystical here: the words in these languages descend from common Proto-Indo-European roots *okto and *nekwt, which happen to be similar.

Just as languages and cultures can be related, individual words in a language can be related, such as prepositions and verbs, and hence share properties such as their word order.  It turns out that the process of grammatical categories developing from other categories is extremely common and attested in every language family and part of the world (e.g. Heine and Kuteva 2008).  Verbs can change into adpositions as in the Chinese example above (also found in languages such as Thai and Japanese), while nouns also often change into adpositions (as in many Niger-Congo languages such as Dagaare, where adpositions are all also body parts such as zu 'on/head').  Other word order correlations can be explained in a similar way, such as the relationship between adjective-noun and genitive-noun order, and even verb-object order and genitive-noun order (because of grammaticalizations such as me-le e kpe dzi 'I am on his seeing' as a way of expressing 'I am seeing him' in Ewe, Claudi 1994).  I give further examples in different languages in a short article I wrote for Evolang (2012), in which I make the point that these processes should be considered a serious confound to an explanation which tries to claim that there is a causal link between word orders across grammatical categories.

But can't both explanations be correct?  This is the response I hear from every linguist that I've described this argument to, including Balthasar Bickel, Morten Christiansen, Simon Kirby and now also Martin Haspelmath in his blog post when he says '...while everyone agrees that common paths of change (or common sources) have an important role to play in our understanding of language structure, I would argue that the changes are often result-oriented, and that in such cases a precise understanding of the diachronic mechanisms is not necessary (though of course still desirable).'  In short, does grammaticalization happen in order to create correlated word orders?  

No.  Objections like that are missing the point about non-independence.  Grammaticalization happens, causing two grammatical constructions to exist where there was previously one. These two constructions are likely to have the same word order, on the reasonable assumption that constructions are more likely than not to keep the same word order over time (an assumption also vindicated by work by Dunn et al. described below).  You have to control for this common ancestry if you wish to claim that the correlation in word orders across constructions is causal.  It is as if people wanted to claim that there was a deeper ecological reason why chimpanzees and humans share 98.8% of their DNA, rather than just the primary historical reason which is that they have a common ancestor.
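The non-independence point can be illustrated with a toy simulation.  This is a minimal sketch under assumptions that are mine rather than anything measured: each hypothetical language picks verb-object or object-verb order at random, its adpositions either inherit that order (as they would if they grammaticalized from verbs) or are assigned independently, and each order then has a small, arbitrary chance of flipping (`flip_prob`).

```python
import random

def share_harmonic(inherit, n_languages=1000, flip_prob=0.05, seed=0):
    """Return the share of simulated languages whose verb and adposition orders match."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_languages):
        verb_first = rng.random() < 0.5              # True = verb-object order
        if inherit:
            adposition_first = verb_first            # adposition descends from the verb
        else:
            adposition_first = rng.random() < 0.5    # no historical link at all
        # After the split, each order changes independently with a small probability.
        if rng.random() < flip_prob:
            verb_first = not verb_first
        if rng.random() < flip_prob:
            adposition_first = not adposition_first
        matches += verb_first == adposition_first
    return matches / n_languages

with_inheritance = share_harmonic(inherit=True)
without_inheritance = share_harmonic(inherit=False)
print(with_inheritance, without_inheritance)
```

Under these assumptions the inherited case should come out at roughly 90% matching orders (in expectation, 0.95² + 0.05² ≈ 0.905) and the unlinked case at roughly 50%, even though nothing in the simulation makes matching orders easier to process or learn.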

There are interesting ways that both explanations could be true, if this non-independence of constructions is successfully controlled for, but evidence for this is surprisingly elusive.  One possibility is that only some kinds of grammaticalization happen, namely the types which produce word orders that are easy to process.  Haspelmath makes this suggestion in his post: 'I certainly think that studying the diachronic mechanisms is interesting, and I also agree that the kind of source of a change may determine parts of the outcome, but to the extent that the outcomes are universal tendencies, I would simply deny the relevance of fully understanding the mechanisms. In many of the cases that Cristofaro discusses, I feel that the “pull force” of the preferred outcome may well have played a role in the change, though I do not see how I could show this, or how one could show that it did not play a role.'  

I agree that this might be possible ("the “pull force” of the preferred outcome may well have played a role in the change"), but there is currently no evidence for this.  The main way to test it would be to compile a database of grammaticalizations across languages and to see whether certain grammaticalizations happen only in certain languages: for example, do postpositions only develop from nouns in a genitive construction (the table's head -> the table on) if the language also places the verb after the object?  It is easy to find exceptions to that such as Dagaare, which has verb-object order but has postpositions because those postpositions develop from nouns, and it has genitive-noun order.  In a large database, there may be all sorts of interesting constraints on what grammaticalizations can occur, as well as geographical patterns, and it may of course turn out that word order is one constraining factor, but currently this hypothesis is unsubstantiated.

Another way that word orders might be shown to be causally related to each other is if a change in one word order can be shown to be correlated with a change in another word order in the history of a language, or in its descendants.  For example, if a language has verb-object order and prepositions but then changes to having object-verb order and postpositions, then this suggests that the two word orders are functionally linked (if this event takes place after any grammaticalization linking these verbs and postpositions).  The only solid statistical test of this so far has been an article by Dunn, Greenhill, Levinson and Gray in Nature (2011).  They tested the way that four language families have developed (Bantu, Austronesian, Indo-European and Uto-Aztecan) and tested models of word order change using a Bayesian phylogenetic method for analysing correlated evolution.  What they found was that some word orders do indeed change together.  For example, the order of verb and object seems to change simultaneously with the order of adposition and noun in Indo-European, as shown in the tree reproduced from their paper below (red square = prepositions, blue square = postpositions, red circle = verb-object, blue circle = object-verb, black = both):

A model in which these two word orders are dependent is preferred over a model in which they are independent with a Bayes factor of above 5, a conventional threshold for significance.  This seems to vindicate the idea that adpositions and verb-object order are functionally linked in Indo-European.  This also holds up in Austronesian.  It does not hold up in the smaller and younger families Uto-Aztecan and Bantu, but that may be because of the low statistical power of this test when applied to small language families.
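For readers unfamiliar with Bayes factors, here is a rough sketch of the comparison.  The log marginal likelihoods below are made-up placeholders, not values from Dunn et al., and I am assuming the common convention of also reporting twice the difference in log marginal likelihoods; the paper's exact procedure may differ.

```python
import math

# Hypothetical log marginal likelihoods for the two models (placeholders only).
log_ml_dependent = -100.0     # the two word orders evolve together
log_ml_independent = -103.0   # the two word orders evolve separately

log_difference = log_ml_dependent - log_ml_independent
bayes_factor = math.exp(log_difference)   # ratio of the two marginal likelihoods
log_bayes_factor = 2 * log_difference     # "2 ln BF" scale (assumed convention)

print(f"BF = {bayes_factor:.1f}, 2 ln BF = {log_bayes_factor:.1f}")
```

A value above the chosen threshold favours the dependent model, while a value near zero on the log scale means the data do not discriminate between the two.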

Is this convincing evidence that there is a functional relationship between the two word orders after all, after factoring out grammaticalization?  It would be, except that language contact is not controlled for in this case.  What could be happening is that some Indo-European languages in India have different word orders because of the languages that they are near, such as Dravidian languages, which also have object-verb order and postpositions.  A similar point could be made about the Austronesian languages that undergo word order change, which are found in a single group of Western Oceanic languages on the coast of New Guinea, which is otherwise dominated by languages with object-verb order and postpositions.

An interesting result of their paper is that word orders are very stable, staying the same over tens of thousands of years of evolutionary time (i.e. summing the time over multiple branches of the families), supporting the assumption that I described above that word orders tend to stay the same.  The main result of this test has been that language families differ in which word order dependencies they show, and many of them are likely to reflect events of word order change due to language contact, but the test has also been acknowledged by Russell Gray and others as having low statistical power, and hence not conclusive evidence either for or against there being genuine functional links between word orders.  A promising approach in the future is to apply the same phylogenetic test to the entire world, attempting to use it on a global phylogeny - a world tree of languages that does not have to be completely accurate, but simply has to incorporate known information about language relatedness, and perhaps some geographically plausible macro-families to control for linguistic areas where languages have shared grammatical properties across families (such as Southeast Asia or Africa).

Another intriguing line of inquiry is to work out what particular predictions a theory of processing or learnability would make about word order patterns across languages, and whether these predictions are in fact different from an explanation that invokes grammaticalization.  Hawkins (2004) is an example, which shows that there are word order patterns such as 'If a prepositional language preposes genitives, then it also preposes adjectives', and argues that these are predicted on the basis of the relative length of constituents such as possessive phrases and adjectives.  These particular rules worked in Hawkins's sample of 61 languages, but fail on larger databases such as WALS (38 out of 50 languages contradict the rule just given, for example).
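Checking a rule of this kind against a database amounts to counting, among the languages that satisfy the 'if' part, how many fail the 'then' part.  The sketch below uses four invented languages and my own field names purely for illustration; it is not WALS data or Hawkins's procedure.

```python
# Hypothetical toy records; real work would load these from a database such as WALS.
languages = [
    {"name": "A", "prepositions": True,  "genitive_first": True,  "adjective_first": True},
    {"name": "B", "prepositions": True,  "genitive_first": True,  "adjective_first": False},
    {"name": "C", "prepositions": True,  "genitive_first": False, "adjective_first": False},
    {"name": "D", "prepositions": False, "genitive_first": True,  "adjective_first": True},
]

def check_implication(records):
    """Count the languages the rule applies to and those that contradict it."""
    # The rule only makes a claim about prepositional languages that prepose genitives.
    applicable = [r for r in records if r["prepositions"] and r["genitive_first"]]
    # A contradiction is an applicable language that does not prepose adjectives.
    exceptions = [r for r in applicable if not r["adjective_first"]]
    return len(applicable), len(exceptions)

applicable, exceptions = check_implication(languages)
print(f"{exceptions} of {applicable} applicable languages contradict the rule")
```

With the WALS figures quoted above, the same count gives 38 exceptions out of 50 applicable languages, or 76%.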

Whether or not these attempts to demonstrate the role of processing are successful, a large part of the story of why word order universals exist is the evolution of grammar.  When we try to explain why adpositions correlate in their ordering with other categories, we should remember to ask why languages have a separate grammatical category of adpositions at all.  Why does grammaticalization happen, forming a distinct class of adpositions, rather than languages just expressing spatial relations with nouns and verbs?  Why are English prepositions such as for, to, on and so on etymologically obscure, whereas in some languages such as Dagaare and Chinese many adpositions are homophonous with verbs and nouns, to the point that it is doubtful that these 'adpositions' really constitute a separate class (as opposed to a sub-class of verbs, and relational nouns)?  One possibility is that we store individual constructions rather than words, and these constructions once individually stored can end up being transmitted as independent units between speakers.  To take a hypothetical example, in a language which uses body-part terms to convey spatial meanings such as saying table's head to mean 'on the table', the particular use of head as a spatial word may be stored separately from the body-part use of 'head'.  Once that happens, it is possible for the body-part sense of head to be lost in a community of speakers and just the spatial sense retained (for example, the English front derives from the Latin frons 'forehead').

This process often creates a chain of intermediate cases between nouns and adpositions, such as in Tibetan, where some adpositions require genitive marking such as mdun 'front' ('the house's front'), while others used genitive marking in the Classical language but no longer allow it (nang 'inside') (DeLancey 1996:58-59).  There are similarly ambiguous cases in English where words such as regarding can be both a verb form and a preposition.  It is worth asking whether any language has ever developed adpositions any other way: it is hard to imagine a language inventing adpositions from scratch (Edward de Bono managed to get the phrase 'lateral thinking' to catch on among English speakers, but not his invented preposition po), and they instead catch on better if they are extended uses of already existing words, such as English regarding. It is likely that most words began as extensions of other words, more generally, rather than invented out of nothing, perhaps with some exceptions such as 'Quidditch', or ideophones.  It is possible that some languages may actually invent adpositions, such as sign languages (which can use iconic signs for 'up' and 'down' for example), but if the hypothesis is correct that this is not normally what happens in spoken languages, then the historical default ought to be that adpositions share the same syntactic properties, including their word order, as other categories. The real thing to explain is not why they correlate in their word orders with other categories, but why it is ever the case that they do not correlate.   

As an analogy, some languages have unusual non-correlations of word orders across constructions, such as German which has verb-object order in main clauses and object-verb order in subordinate clauses, or Egyptian Arabic* in which numerals precede the noun except for the numbers 'one' and 'two', which follow it.  It is true that most languages in the world have a 'correlation' between the ordering of the number 'one' and the ordering of other numerals, to the point of making this another word order universal: but a functional explanation for this fact ('A language is easier to learn if the word order is the same for all numerals') would be banal, and would miss the fact that the historical default in most languages has been for the orderings to be correlated, simply because 'one' is normally treated as a member of the same grammatical class as other numerals.  I'm arguing that the correlation between adpositions and verb-object ordering is also likely to be a historical default due to grammaticalization, rather than a situation which languages converge on for reasons of processing or learnability.

I see word order correlations, in short, mostly as an unintended consequence of the way that grammatical categories evolved in most languages, not as an adaptive solution to processing or language acquisition.  Martin Haspelmath seems to disagree with the spirit of this type of historical argument in his blog post, however, which states (to repeat): 'Quite a few people have argued in recent times that typological distributions should be explained with reference to diachronic change...but I would argue that the changes are often adaptive and result-oriented, and that in such cases a precise understanding of the diachronic mechanisms is not necessary (though of course still desirable).'  My main point in this post is that I disagree with both parts - that an understanding of the history of languages is unnecessary to understand word order correlations (it is in fact the main story behind them), and that these changes are adaptive and result-oriented (there is little evidence so far that these grammaticalizations are geared towards producing harmonic word orders).

He has some more specific objections to historical arguments, reproduced below:

"(A) Recurrent paths of change cannot explain universal tendencies; universal tendencies can only be explained by constraints on possible changes (mutational constraints).

(B) Diverse convergent changes cannot be explained without reference to preferred results.

(C) If observed universal tendencies are plausible adaptations to language users’ needs, there is no need to justify the functional explanation in diachronic terms."

Objection (A) I disagree slightly with, because common pathways of change are enough to be a serious confound to functional explanations of language universals, as I have tried to argue. How common is 'common'?  In an ideal world, common enough that, for example, the number of languages predicted to have word order correlations is about 926/981, simply using a statistical model that assumes grammaticalization, inheritance of word order in language families, and language contact.  I have only listed some examples in this post, but their existence in multiple families and parts of the world, coupled with the stability of word orders in families, is enough to make the relatedness of constructions an important confound.  I acknowledge that an actual quantitative test is needed of whether they are common enough to explain the entire distribution of word orders, which would rely on a database of grammaticalizations - if there is enough data on grammaticalization to ever be able to test this.

Haspelmath is sceptical of 'common pathways of change', viewing these as unfalsifiable, and asks instead for stronger constraints: 'In syntax, one might explain adposition-noun order correlations on the basis of the source constraint that adpositions only ever arise from possessed nouns in adpossessive constructions, or from verbs in transitive constructions (Aristar 1991).'  In this post, I suggested a strong constraint, namely that new words normally develop from already existing words and are rarely invented from scratch.  Adpositions are therefore likely to develop from words that include, but are probably not limited to, nouns and verbs.  The question of what particular grammaticalizations can occur and why some are especially common is of course an interesting subject, but secondary to the main argument of this post, namely that the very existence of these processes is a serious confound to functional explanations of universals.

Point (B) effectively asks why languages converge on patterns such as word order correlations when they take different historical paths, such as Chinese grammaticalizing verbs to prepositions, while Thai grammaticalized (in some cases) possessive nouns.  Isn't it a coincidence when both processes converge on the same result, both verb-object languages having prepositions?  Well, in both cases the expected outcome of grammaticalization was prepositions, simply based on the ordering of their source constructions (verb-object, and noun-genitive), so there isn't anything to explain.  There are also plenty of counter-examples, such as Dagaare mentioned earlier, which takes the same path as Thai and ends up with non-correlating word orders (verb-object order but having postpositions), because the postpositions come from possessed nouns with a genitive-noun ordering.  Again, a database of grammaticalization would tell us how common these exceptions are; if they turn out to be rarer than expected - for example, if there really is a tendency for verb-object languages not to evolve postpositions even when they have the genitive-noun order - then Haspelmath's point (B) may be vindicated.

Finally, point (C) is the main one that I disagree with, as I stated above (the history of these categories is all-important, and grammaticalization does not seem to happen with the goal of creating word order correlations).  I should add that I am only talking about word order, and may agree with Haspelmath's points in explaining other common linguistic patterns.  I am also not denying the relevance of processing to understanding why some word order combinations may be favoured over others, which can be illustrated with sentences such as 'The woman sitting next to Steven Pinker's pants are just like mine' (Pinker 1994) (illustrating the problem of a language having genitive-noun order and noun-relative clause order).

Why am I writing about a relatively minor set of disagreements on a niche question?  For me, this subject is interesting because it is about a subtle variant of Galton's problem and the possibility of erroneously inferring causation from correlation, but also because it encompasses three of the greatest discoveries of modern linguistics.  One of them is the discovery of word order universals themselves, the unexpected set of rules which allow one to make predictions about word orders in every part of the world from Europe to the Amazon and New Guinea, with deep implications for the way that grammatical rules are represented in the mind.  Word order universals were first elucidated by Joseph Greenberg (1963) and substantiated for over 600 languages (now over 1500) by Matthew Dryer (1992).  I sometimes wonder why this discovery was not reported in Nature at the time, given that Dunn et al.'s later article on attempting to refute word order universals was published there.  It is an intriguing linguistic fact that has been written about in popular accounts of language such as Pinker's The Language Instinct but which has not yet received a fully satisfactory explanation and awaits further statistical tests, such as a large-scale phylogenetic analysis.  Such tests require knowledge of how languages are related to each other, touching on the second 'great discovery' that I would suggest that linguists have made, the way that we can study the history of large, ancient families such as Indo-European and Austronesian (and perhaps soon even larger macro-families).

The third great discovery, though less well-known, is grammaticalization, 'the best-kept secret of modern linguistics' (Tomasello 2005).  Languages are systems of complex grammatical categories and sometimes perverse syntactic rules.  How did all that get here?  Who 'invented' Latin verb endings, or English prepositions?  The most satisfying answer that we have is that grammatical words and morphemes tend to develop from already existing elements, and develop their grammatical meanings gradually.  The English morpheme -ing for example is claimed to have begun as an ending denoting nouns to do with people such as cyning 'king' and Iduming 'Edomite', and was then extended to be used on verbs as a nominalizer (playing tennis is fun) and then as a marker of continuous aspect (I am playing tennis) (Deutscher 2008).  The change from nominalizers to verb endings is mirrored across several language families (see here), and the origin of nominalizers in some languages can be traced back further to noun endings or even full nouns (such as 化 huà 'change' in Mandarin being used as a nominalizer in 現代化 xiàndàihuà 'modernization', or sa in Tibetan coming from a noun meaning 'ground, place').  This shows in principle how complex grammar does not need to be invented, but can develop by gradual changes from simple elements such as concrete nouns.

In some cases these links are directly attested in languages with a long written record, such as Chinese.  In other cases they are inferred from polysemies or by comparison with related languages.  These links differ in how plausible or substantiated they are, and this work therefore needs some attempt at quantification, for example in objectively assessing similarity between forms, or counting instances of known semantic shifts across languages.  Above all, attested grammaticalizations need to be gathered into a database in order to test relationships with other properties such as word order.  Heine and Kuteva's The Genesis of Grammar (2008) and a well-written popular account The Unfolding of Language (Deutscher 2005) are overviews of grammaticalizations that have been documented across languages, including from nouns to adjectives, case markers, adpositions, adverbs, and complementisers; from verbs to aspect markers, case markers, adpositions, complementisers, demonstratives, and negative markers; from demonstratives to definite articles, relative clause markers, and pronouns; and from pronouns to agreement and voice markers.   

These pathways of change by which new categories can be created are the fullest account of the evolution of language that we currently have, a fraction of which are summarised in a tree below from Heine and Kuteva (2008:111).  They help us make sense of the inherent fuzziness of closely related categories, and also the formal similarities between them, including correlations in their word orders.  Word order universals may turn out to have been shaped in part by other factors such as processing and learnability, but they also tell the story of a linguistic equivalent of the Tree of Life, the history of grammatical categories.

(References: see this bibliography.  *Correction: it was pointed out to me that the Thai example that I cited from memory and without a source was wrong, which I've now replaced with the example of Egyptian Arabic from WALS.)