How to Hear Production in Recorded Music

Often, when I'm talking to friends about music, say the latest release of an artist we both like, I will step mid-conversation over an invisible threshold of intelligibility - notably, I’ll start spouting random production jargon about how ‘track 3 is overly compressed’, or ‘yeah I really like that stereo phasing of the guitars across the bridge’… I jest, mostly. However, I always find it unusual - or at least a little alien - that most people hear music as a single whole, without tuning into specific instruments, qualities, or production choices in a song. I’m almost jealous in a way - I can listen critically, or emotionally, but that’s about it. I’m curious what it’s like to hear music without the brain starting its own sonic dissection during the process…

So, this week, I have decided I'm going to ruin music for the rest of you - primarily as a way of getting equilibrium. Notably, I want to discuss how you might approach listening to a song critically - from a production perspective - so hopefully this works as a little toolkit for listening, or at least, a new way to think about what you hear. I promise I’m fun in the pub, I do talk about things other than music… sometimes. Honest. 

A lot of key terms and jargon to make your head hurt…

So - in my thinking, I realise I too have had to re-evaluate what I consider to be the ‘production’ of a song, as well as what level (or levels) of complexity it can be broken down to. My first thought is how to assess music on an ‘immediate’ level - notably, what is the first thing your ears notice when you hear a song, and where is your attention drawn?

And a little side note: I will refer to songs by their ‘mix’. This simply means the choices that a producer and sound engineer (or one person in both roles) have made in order to represent the song to your ears. Some key terms:

Panning in the Stereo Field - Unless listening on surround formats, you will notice (hopefully) that songs are designed for your two ears. Notably, there are left and right ‘channels’ that dictate the width and distribution of instruments across a ‘stereo field’ - which really just means your left and right ears, and the disparities between them. Hard panning means that a source sits only on the left or the right, whereas ‘mono’ or the ‘phantom centre’ refers to things being perceived centrally, in front of you. That happens when the signals you hear on the left and right are at the same level, which causes your brain to interpret the sound as being ‘in front’ - the most common example being vocals.

Example: I find the best way to think of this is to imagine a stage, say a rock band. In this instance you can visualise the placement of sounds - therefore, you would expect that the vocals would come from the centre, if the singer were in front of you, and the guitarist(s) if standing to the left/right, would come from their respective positions. In a recording studio we tend to ‘widen’ things a little, such as making drums pan so that fills sound exciting, but in its most basic form you can think of panning this way.
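If it helps to see the stage picture as numbers, here is a minimal Python sketch of panning. It assumes a ‘constant-power’ pan law, which is one common convention and my own choice for illustration, and the function name pan_gains is made up for this example. The thing to notice is that a centred source ends up at the same level in both channels, which is what your brain reads as ‘in front’.

```python
import math

def pan_gains(position):
    """Return (left, right) gains for a pan position.

    position runs from -1.0 (hard left) through 0.0 (centre) to 1.0 (hard right).
    Assumes a constant-power pan law; real mixing desks and DAWs vary.
    """
    angle = (position + 1.0) * math.pi / 4.0  # map -1..1 onto 0..90 degrees
    return math.cos(angle), math.sin(angle)

# A hypothetical three-piece stage: guitar left, vocal centre, guitar right.
for name, position in [("guitar (left)", -1.0), ("vocal (centre)", 0.0), ("guitar (right)", 1.0)]:
    left, right = pan_gains(position)
    print(f"{name}: L={left:.2f} R={right:.2f}")
```

With this law the centred vocal comes out at roughly 0.71 in each channel, while a hard-panned guitar is at full level in one channel and silent in the other.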

Balance - Put simply, this refers to the volumes of instruments/sources in a mix and, more importantly, their relationship to one another. The intended goal is that a well-balanced mix should make all musical elements sound cohesive and ‘as one’ as a performance. Just as the name suggests.

Example: Imagine a stage with Dave Grohl playing drums on your left, and then Stuart Little singing ‘Blackbird’ on your right. Silly example, but what would you expect? Well, unless Stuart is amplified (which is a solution), you’re probably going to be overwhelmed by the drumkit. Consequently, when mixing, we tend to try and balance a mix by ensuring that the central performance (usually vocals) remains above other instruments for the sake of intelligibility.

The Frequency Spectrum (and the Spectral Balance) - To take this one step further, the overall balance can also be assessed by the distribution (and relation) of specific frequency ranges between recorded sound sources.

Example: This one gets harder to explain, but back to the stage. If you were expecting a mouse to sing, do you think it would sound like Barry White? It could, if it had laryngitis maybe, but it wouldn’t be expected. So, if put on the spot, you’d expect a mouse to sing in a way that’s ‘squeaky’, and if so, what kind of instrument would you compare that to? Well, my thinking would be a brass instrument, but a high-pitched one. Maybe a piccolo trumpet could work? A traditional example in an orchestral setting would be the use of woodwind, fluttering, to represent birdsong. How about a whale? Maybe something low and commanding, like a tuba? I promise I haven’t lost it - but what I'm trying to get at is an exercise of associating known instruments with animals for their timbral qualities. Obviously a tuba and a trumpet will likely work well together because one is very bass oriented, and the other is trebly - or more simply, one is ‘low’ and the other ‘high’. However, issues arise when instruments begin to overlap in qualities - a common case is a guitar and a vocal. There is the chance that a lot of the notes being played (and sung) could overlap - so from a spectral point of view, we do not want to pan those to be in the same place, and if we do we may have to adjust their volumes to make them sound ‘balanced’. From there, we may employ EQ - equalisation - to adjust the frequency content of each sound. The spectral balance, put simply, is an attempt to make all frequency ranges/content of instruments sound cohesive together - they don’t necessarily have to be all the same, you may want a song to sound trebly overall, but you don’t want those higher frequencies to hurt your ears.

Wet and Dry - Typically, when discussing effects (primarily reverb), the terms ‘wet’ and ‘dry’ are used. If a sound is ‘dry’, this usually implies it sounds close to the source as recorded - tight, in your face, and without much sense of the space it was recorded in. ‘Wet’ sounds are those which give much more information about the space or dimension they may have been recorded in. Think of it this way - if you clap in a cathedral, there will be a sense of ‘tail’ as the initial clap begins to excite the space and reflect back to you - the sound is extended. In contrast, if you clap inside a library, you’ll hear few reflections of your clap across the room - you may also be asked to leave.
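For the curious, the clap can be sketched in a few lines of Python. This is a deliberately crude toy, not how a real reverb works: the ‘tail’ here is just a handful of quieter, delayed copies of the clap, and the names echo_tail and wet_dry_mix, along with every number used, are my own assumptions for illustration. What it does show is the wet/dry idea itself - one control blending the untouched sound with its reflections.

```python
def echo_tail(dry, delay, decay, repeats):
    """Crude stand-in for a room: delayed copies of the dry signal, each quieter."""
    tail = [0.0] * (len(dry) + delay * repeats)
    for k in range(1, repeats + 1):
        gain = decay ** k  # each reflection is quieter than the last
        for i, sample in enumerate(dry):
            tail[i + k * delay] += gain * sample
    return tail

def wet_dry_mix(dry, wet, mix):
    """Blend the two signals: mix=0.0 is fully dry, mix=1.0 is fully wet."""
    padded_dry = list(dry) + [0.0] * (len(wet) - len(dry))
    return [(1.0 - mix) * d + mix * w for d, w in zip(padded_dry, wet)]

clap = [1.0]  # a single-sample 'clap'
tail = echo_tail(clap, delay=2, decay=0.5, repeats=3)

library = wet_dry_mix(clap, tail, mix=0.0)    # dry: just the clap
cathedral = wet_dry_mix(clap, tail, mix=0.5)  # wetter: clap plus a fading tail
print(library)
print(cathedral)
```

In the ‘library’ version only the first sample is non-zero, whereas the ‘cathedral’ version carries on after the clap with progressively quieter values.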

For further reading about the frequency spectrum - if you can bear it - I recommend reading about ‘Fletcher-Munson’ curves, as these represent the ear’s own response across the audible frequency range (20 - 20,000 Hz). From there, you may consider the ways that frequency correlates directly with musical pitches (notes), and beyond this - how these frequencies also embody the timbral characteristics of instruments. For example, the ear is most responsive at about 3.5 - 3.6 kHz, around the pitch of a baby’s cry. Therefore, for most instruments, 3 kHz becomes a range that tends to affect your impression of ‘clarity’ when amplified or attenuated. If you remove 3 kHz from a vocal, it might suddenly take a step backward and become less intelligible.
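To make the link between frequency and pitch a little more concrete, here is a small Python sketch that names the nearest note for a given frequency. It assumes standard twelve-tone equal temperament with A4 tuned to 440 Hz, which is a convention I am bringing in rather than something stated above, and nearest_note is a made-up helper name.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(frequency_hz):
    """Name the nearest equal-tempered note, assuming A4 = 440 Hz."""
    # Each semitone multiplies frequency by 2 ** (1/12); MIDI note 69 is A4.
    midi_number = round(69 + 12 * math.log2(frequency_hz / 440.0))
    name = NOTE_NAMES[midi_number % 12]
    octave = midi_number // 12 - 1
    return f"{name}{octave}"

for frequency in [261.63, 440.0, 3500.0]:
    print(f"{frequency} Hz is closest to {nearest_note(frequency)}")
```

Under those assumptions, 3.5 kHz lands nearest to A7, close to the very top of a piano keyboard, which gives a sense of how high the ear’s most sensitive region sits relative to the notes most instruments actually play.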

The Basics - Bright/Dark, Loud/Quiet, Spacious or Intimate?

OK - if you’re still with me, thank you, that previous paragraph is dense but necessary! You don’t have to understand it all immediately, this section will now start to apply some of these principles but in a way that should hopefully make more sense.

I think an easy first step for listening to a song is to have these three questions in mind: Is the song bright or dark sounding? Does the song sound louder, or perceivably ‘bigger’, in certain parts? And what kind of space does this song take place in?

So question one - bright or dark? I don’t mean emotional response here, but instead whether the overall mix of a song is biased towards a trebly sound, or a more bassy sound. Neither is ‘correct’, and this is often a choice that is informed both during a mix, but also by instrumentation.

Here are two examples within Wilco’s discography: ‘Can’t Stand It’ and ‘Jesus, Etc’.

So do you notice which might be considered ‘bright’ and which might be considered ‘dark’?

‘Can’t Stand It’ is, to my ears, a little too bright! From the first few seconds you’re quickly made aware that the drum cymbals, snare, and electric guitars are very ‘forward’ sounding. From a mixing perspective, a choice has been made to make these elements stand out clearly and grab your ears. Notice then, that the vocals are in fact quite subdued - not bright. Instead, the vocals have a warmer, rounded quality that is achieved through a little distortion. It’s what you might call ‘Lo-Fi’. So why? Well, beyond creative decisions, it makes sense that the vocal should not be competing with these brighter elements - nor should the guitars or drums be competing with the vocal. In this instance, a balance is achieved by letting the vocals take most of the midrange.

The context behind this song specifically is that it was chosen to be a ‘single’ to sell the album, given the label were less than convinced that the album would have a commercial hit. Consequently, this song was selected and recorded to a click track for the sake of adding samples (ear candy as such), then mixed differently for radio play. So why the focus on ‘bright’ elements? Well, in this case, I would assume it sounds more intelligible when it’s blaring out of a radio in a cafe. It has to fight whatever environment it’s in, or any old sound system it’s being played on - it’s not for people with a HIFI system at home.

So, ‘Jesus, Etc’, you may notice - I hope - is quite the opposite. It’s much ‘warmer’ sounding, and quite intimate. Most of the instruments are voiced in lower frequency ranges (or pitches), which allows Jeff Tweedy’s vocals to sit well above and remain clear and intelligible. Importantly, this distribution of frequencies should never be an issue if a composition is arranged correctly. For orchestras this is fine, and has long been a tradition. For pop there is more room for contrasting elements - which is not necessarily a bad thing, but a challenge arises because of it. Things need to be ‘tucked into place’ to fit the bigger picture. In this instance, I think a lot of the ‘dark’ qualities of this song are caused by the instruments rather than mixing decisions - the drumkit is played gently, focusing primarily on the kick and snare, and opting for light touches across the cymbals. If you are a drummer, you’d be aware that cymbals are in fact advertised as being ‘bright’ or ‘dark’, with the latter often being associated with jazz for their ‘warmer’ quality.

Next question - which is ‘loud’, which is ‘quiet’?

Well, ‘Can’t Stand It’ is full of energy from the get-go, and so you'd perceive that song as energetic and loud overall. The intention is for verses to be driving, and slightly quieter, and for the choruses to explode and provide maximum excitement - perhaps with simpler riffs or melodic phrases. The intention being - ‘let’s make it memorable’. Nothing wrong with that, it’s what makes things catchy on first listen.

In contrast, ‘Jesus, Etc’ maintains a consistent sense of dynamics throughout - the chorus is understated, and arrives in its own quiet way - matching the energy of the song. The main change of focus here is the use of a descending melody via the bass, and a looser legato rhythm achieved through the use of ride cymbal (whereas the hat is slightly more ‘accented’ for the verses, maintaining a sharper pace). So, this song is more ‘quiet’, it’s intimate and asks for you to listen more closely. 

And finally, what space does this song exist in?

I’ll start here by considering some spaces that you might use to answer this question, ranging from: inside your brain, in a living room, in a lounge, in a studio, in a club, in a stadium, in an arena or cathedral, in space. Further contributions welcome.

Well, to my ears - ‘Can’t Stand It’ is somewhere between the stage and a studio space. The verses are tight and focused, emulating a reasonably small studio space, whereas the choruses stretch out more to offer the impression of a stage, or something acoustically bigger.

‘Jesus, Etc’ sounds much drier (less reverb). My ears pick up on the strings as being the main element that tells you about the song’s ‘space’. They sound about ten metres away or so, with some reflections, so I’d say this song could fit anywhere between a living room and a small studio.

The two examples are perhaps less useful here, but my further comparison for very dry might be Beck’s ‘Paper Tiger’ - a song which is ‘in your face’, gaining dimension from the orchestra during the chorus - a technique used to add greater impact between verse and chorus. As for ‘wet’, maybe Pearl Jam’s ‘Even Flow’, which aligns more closely with an arena sound, or, spacier still, ‘Walking On the Moon’ (The Police), need I say more?

If you want to take that one step further, Slowdive’s ‘Souvlaki Space Station’ uses reverb as its own instrument - turning guitars into long, repeating drones. Hence the name of the genre - shoegaze - taken from the tendency for the guitarists to be staring down at their FX pedals all the time…

In Application - How Would You Represent a Recording?

So, with the onslaught of jargon - I think it’s worth considering how these things apply at the mix stage, or more importantly - how would you choose to represent sounds?

I think often people’s taste in genres and decades can inform a lot of their mixing decisions. For example:

  • The 1950s are defined by technological limitations, so many of those performances can sound distant, or overly warm (think crooners), or fixed to the space they were recorded in.

  • The 1960s offer more variety, with the introduction of ‘pop’ music and greater technological experimentation - think the psychedelia of the late 60s, as well as the punchy bass of Motown.

  • 70s, well then you’re onto rock bands, folk revival - I always think of very dead drum sounds and the introduction of more experimental mixing decisions - by the end of that decade digital processing was in its infancy.

  • And, if the 70s were quite punchy and mid-forward, the 80s are then referred to as the era where bass went away. It marked the introduction of synthesisers and early sampling, introducing a new era of pop music and pushing rock out of the mainstream.

  • Then the 90s, well that’s a mish mash of everything, my primary thought is the advent of Trip-Hop and sampling - suddenly music was ‘dark’ again because the samplers could only emulate certain frequency ranges… you’d lose the treble.

  • 2000s on, boybands and ‘bloke’ bands? And I guess anything else really! 

  • 2010s, more of the same… but super compressed breathy pop? 

  • 2020s, not sure yet. I think Pandora’s box is open at this stage for sounding like whenever, whatever, and wherever. How the cynic speaks!

When mixing, I like to think of it as part jigsaw - that is the technical side: balance, EQ, and panning - and, beyond that, I wonder if mixing then becomes the way you choose to represent the ‘subject’ or elements of a song. Put visually, do you want it to be impressionist? Or photorealistic? Often it comes down to the emotional centre of the song, and remaining true to that. A less hyped, dry production can be fantastic if you want the song to sound direct - whereas an upbeat, bubbly pop song might ask for you to get more creative with your mixing - that is, how do you keep and maintain a level of excitement throughout?

From there - what are the key elements of the song? Is it the vocal, is it then the guitar, and do you want more kick than bass, or vice versa?

One that surprised me recently was Raye’s ‘Where is my Husband’ - it’s been a huge hit, but it has no ‘lead vocal’. Instead it’s comprised of two separate vocal performances that are panned to the left and the right. The result is that you get a chorus of two voices, overlapping a little, and creating a sense of ‘movement’ between your ears. Cool huh?

I think often, whether we are aware of it or not, our response to music is indeed determined by whether it sounds ‘HIFI’ or not. It’s that level of polish, or intimacy, or being able to sonically ‘zoom’ into a sound that can enhance a song’s long term effect. I am always pleased to hear new elements in songs that I have never heard before - e.g. Radiohead’s ‘Reckoner’, at 3 minutes, has ‘In Rainbows’ sung as background vocals… it only took me a few years to notice. Did you ever notice the drums are looped on that song?

Here’s an example of, to my ears, a perfect ‘HIFI’ production - Zero 7’s ‘Destiny’. It sounds intimate, everything is clear, and there’s so much bass! And then, notice how in the chorus suddenly everything gets wider? The backing vocals introduce a new space in the left and right, hugging your ears. Also - yes, that is Sia, this was her first band I believe.

And finally, how different can two mixes sound - well, there’s a lot of room. There is no ‘perfect’ mix as it’s always subjective, but here’s an example of one song that has two different mixes by two different engineers.

And sorry in advance, it is U2. Long after their musical shelf life. If you’re a fan great, don’t hurt me. I’m sorry.

See which you prefer:

The first is the mix used for the album release - featuring a spoken word intro, mixed by Steve Lillywhite. The second is Nigel Godrich’s remix for the song’s distribution as a single. Both mix engineers are highly regarded for their craft, but both mixes are very different!

To my ear, Nigel Godrich’s mix is much more daring - there are some unusual and bold choices made, such as bringing the background vocals well above the drums during the chorus. But the overall mix feels more cohesive - the vocals are much more intelligible, and the guitars feel wide and strident. So for Nigel, the most important leading elements are Bono’s vocals and The Edge’s guitar parts. Now that I think of it - I can’t name the bassist or the drummer for U2. So, maybe that follows.

The original album mix, by Lillywhite, seems thinner - I dislike the drum cymbals, and the song feels as though it lacks some weight. The vocals too are ‘swallowed’. The guitars? To my ear, less wide, and perhaps slightly more subdued again. Of course the song still works - but does it sound as polished? Perhaps not. I wonder if it’s a hesitancy to dig into the source material and experiment with the overall bigger picture of the mix jigsaw.

So do you agree? It’s subjective, of course, because it’s art. But, does the difference change how you respond to the song? I reckon so.
