I Have Been Saying This for Years. This Proves It!
It happened the way the best discoveries usually do – sideways, by accident, while looking for something else.
I was following a thread of research this morning, one thing leading to another the way it does, and suddenly I was reading about a man named Victor Grauer. An ethnomusicologist from Pittsburgh. A composer, filmmaker, researcher, and quiet obsessive who spent decades building a scientific case that most of the academic world chose to ignore.
I had to stop reading and sit with it for a moment.
Because what he was saying – what he had spent his career saying – was the foundation underneath everything I have been building with Musicably.
Not similar. Not related. The same thing. Just arriving from a completely different direction.
For decades, the “Drumming First” theory dominated the discussion about the origin of music, because archaeology is biased toward things that don’t rot. We found bone flutes from 60,000 years ago and assumed the story started there. We couldn’t find a “fossilized song,” so we assumed rhythm and tools came first.
We are actually standing at the forefront of a major “paradigm shift” in musicology.
Before Language, There Was Song
Let me start somewhere that might surprise you.
Before your ancestors could say “I love you,” they could sing it.
Before they could argue, negotiate, tell stories, or ask questions, they could hum together, call and respond, interlock their voices into something larger than any one of them could make alone.
Music did not come after language. Music came first.
That is not a poetic idea. It is the central argument of Steven Mithen, a cognitive archaeologist at the University of Reading, in his book The Singing Neanderthals (2005). Mithen spent years pulling together evidence from neuroscience, archaeology, developmental psychology, and evolutionary biology to answer one question: why do all humans, in every culture ever studied, make music?
His answer is as simple as it is staggering. The propensity for music is encoded in the human genome. It predates language by a very long stretch of evolutionary time. Before our ancestors had words, they had something Mithen calls “Hmmmmm” – a form of communication that was holistic, musical, multi-modal, and emotionally honest in a way that words, with all their power, never quite managed to replicate.
Mithen’s latest work, The Language Puzzle (2024), doubles down on the idea that the “musical” brain is the ancestral brain. He suggests that the reason we are so moved by music today is that it taps into the cognitive system that existed for millions of years before we ever had a word for “fear” or “joy.” We didn’t learn to be musical; we were musical for 5.8 million years, and we’ve only been “talking” for a tiny fraction of that time.
Think about that word honest for a moment.
Mithen argues that musical vocalisation carried emotional truth more reliably than language does. You could not easily fake a song the way you can fake a sentence. The feeling was in the sound itself.
And here is what stayed with me most, reading his work: the mental state produced by this pre-linguistic musical communication sounds, according to one reviewer, “remarkably like the state mystics of every religion describe when they meditate – timeless and wordless.”
Timeless and wordless.
The space that music opens, before thought arrives to name it and categorise it – that is not a side effect. According to evolutionary science, it is the original state. The one that was there before everything else.
The Ancient Technology for Overcoming Fear
So if music came first, what was it actually for?
This is where Joseph Jordania comes in.
Jordania is an Australian-Georgian ethnomusicologist at the University of Melbourne, winner of the Fumio Koizumi Prize – one of the most prestigious awards in ethnomusicology – and one of the world’s leading researchers into the origins of choral singing. He has conducted fieldwork in Georgia, Corsica, Sardinia, Bulgaria, Japan, Tibet, Argentina, and Brazil. He has spent his life following one question: why do humans sing together?
His answer is both ancient and immediately practical.
Our earliest ancestors sang together to survive. Literally. Group singing – loud, interlocked, communal – was a strategy for making a small band of vulnerable hominids appear larger, more powerful, and more threatening to predators. You sang together to stay alive. And what happened neurologically when you sang together was not incidental. It was the point.
Communal singing put people into an altered state of consciousness. Fear dissolved. The boundary between “me” and “us” softened. The group acted as one.
Now I want you to hold that idea alongside something I have been observing for years in my work with people who tell me they “can’t do music.”
The thing that stops them is almost never ability. It is fear. Fear of judgment. Fear of looking foolish. Fear of not being good enough. The performance frame – the invisible wall that divides every room into performers and audience – has made fear the automatic companion of music-making.
But Jordania is telling us something remarkable. Before the performance frame existed, before anyone thought to divide the room, the original function of communal music was the biological opposite of fear. It was the ancient technology for removing it.
What we have done, somehow, over centuries of professionalisation and commodification, is take the thing that was designed to dissolve fear and rebuild it as a machine for producing fear instead.
That is not a small thing. That is one of the more extraordinary reversals in human cultural history.
The Man Who Mapped the Musical Genome
And now back to Victor Grauer – the man I found this morning by accident. (Here is his blog, where he is “Contemplating the history of music from the year 000,001.”)
In the 1960s, as a young graduate student at Wesleyan University, Grauer began working with the legendary folklorist and musicologist Alan Lomax. Together they developed a system called Cantometrics – a method for coding and comparing traditional music from cultures all over the world. They were building, in effect, a global map of how humans actually make music when nobody is telling them what music should be.
What the data eventually revealed was both unexpected and completely logical.
The oldest traceable musical tradition we can identify – the music of the Pygmy peoples of Central Africa and the San Bushmen of Southern Africa, communities whose genetic lineage connects them directly to the earliest branches of the human family tree – is built around communal, polyphonic, interlocked vocal music. Complex harmonies. Multiple voices weaving together without a conductor, without a score, without anyone designated as the soloist and everyone else as the audience.
Everyone sings.
Nobody watches.
Grauer spent decades building the case that this is not just one musical tradition among many. It is the original one. The template. The sound that was already ancient when our ancestors began their long migration out of Africa, and whose echoes can be traced – if you know what to listen for – in musical cultures around the world.
He tried to find a publisher for this work. Nobody wanted it. He eventually published it himself, first as a blog, then on Amazon. The academic world, which had largely turned away from exactly this kind of large-scale comparative thinking, paid it limited attention.
One musicologist, one man in Pittsburgh, building the scientific map of humanity’s original music. Working largely alone. Largely unrecognised.
I found him by accident.
What This Means for Us
In 1997, the popular cognitive psychologist Steven Pinker claimed that music is “auditory cheesecake” – a pleasant byproduct of language but evolutionarily useless – and this was the dominant view for a long time. It frustrated musicologists, including myself, because it reduced a universal human behavior to a mere accident.
Mithen, Jordania, and Grauer were the ones who finally stood up and said, “No, the cheesecake theory is wrong.” They argued that music isn’t a topping; it’s the yeast – the thing that allowed the “bread” of human society to rise in the first place.
Three researchers, three different disciplines, three different methodologies.
One conclusion, arriving from each direction like three rivers joining.
Music is not a talent some people have and others don’t. It is not entertainment. It is not a leisure activity. It is a biological inheritance, older than language, encoded in your genome, shaped by hundreds of thousands of years of evolution into something your body already knows how to do.
The oldest music we can trace was made by everyone together. No auditions. No wrong notes. No performers and no audience.
And here is what I find most quietly astonishing about all of this. Mithen points out that the individual human being develops exactly the same way. Before a baby has language, it has music. The first communication between a mother and an infant is not verbal – it is rhythmic, melodic, emotionally resonant. Musical. Every single one of us arrived in the world already knowing this.
We did not lose the ability.
We were taught to doubt it.
Hmmmmm.
That is Mithen’s word for the pre-linguistic state – the musical communication that preceded everything else. But I want to offer it to you as something else as well. As an invitation.
You do not need to perform. You do not need to be trained. You do not need to be brave in any dramatic sense.
You just need to remember that the sound was there before the silence was taught to you.
Start anywhere. Your body already knows where.
In the photo – Baka Pygmies; Author – Jordi Zaragoza Angles & Angels Ferrer


