The idea of immersive mixing is not new. Yet the concept of adapting it to achieve emotional storytelling and audience control can still be pushed forward.

With the developments happening in the world of cinema and home audio, bringing the spectator to engage with the story is now a reality.

The idea is to propose various mixing and audio processing techniques, together with their emotional impact, so that the artistic intention of the creator is bridged to the viewer. Rather than having an audience that sees a movie, it is time to have an audience that experiences the story.

The discussion will span from the art of psychological placement of sounds to designed and interpreted spaces that replicate a human emotion when replayed.


A lot has been written about the techniques and methods employed in creating a stereo mix or a mix for music. There is very little literature about the actual process of a film mix or an immersive mix. One reason is probably that this is a relatively new format and its artistic and engineering capabilities are still being explored.

In this paper, I will attempt to deconstruct the emotional aspect of storytelling and show how it can be applied to an immersive format.

Finding the right balance between sounds, so that the listener becomes a part of the visual storytelling happening on screen, has always been a great exploration.

To this end, I would say that there are three perspectives here too: first person, second person and third person. The balance between these perspectives, and the weaving from one to another, brings what I would like to call Emotional Dynamics.

The biggest advantage that immersive mixing has brought to the forefront is the ability to enable these very changes through the positioning of sound.


It is very important to understand that a sound has very little emotional relation or meaning to the person experiencing it unless there is a context with which it can be associated.

For example, the sound of a crow in a movie may not indicate much and could be taken as part of an ambience, but when it is shown in the context of, say, witchcraft, the meaning associated with it changes.

On the other hand, there are sounds that are conditioned into us. An example is the sound of a wolf howling. This creates an eerie atmosphere only because the sound has been used multiple times for this very purpose.

P.N. Juslin [1] has presented a study on how the brain creates an emotional response to sound through six psychological mechanisms:

  • Brain Stem Reflex: Importance or urgency is implied by an acoustic characteristic of the sound, such as loudness.

  • Evaluative Conditioning: Anchoring an emotional event to a stimulus, such as using a particular score or sound for an emotion in a film.

  • Emotional Contagion: Causing the listener to reciprocate or mimic the emotion created with the sound.

  • Visual Imagery: Sound can create a visual image or idea in relation to the emotion presented.

  • Episodic Memory: Triggering an event or memory by sound.

  • Expectancy: Conditioning the listener to a sequence or frequency of sounds that is expected for a particular emotion. When this is broken, tension can be created.

Ekman and Kajastila [2] have presented a study on the influence of sound source direction and width on the perception of scariness. They demonstrated the effect of manipulating spatial width and position to fine-tune a sound and its emotional impact.

On the basis of these studies, it is clear that there are directional influences that can affect our perception of the emotional value of a sound.


The emotional response we have to a sound depends a great deal on its position and, more importantly, its proximity.

I often ask students and professionals which they perceive as more intimidating: the sound of a twig breaking or a tiger growling. Usually the response is the twig break. This is because of the amount of imagination we put into the context of the twig break and the reason we create for it based on the environment we are in at that moment. We can identify a tiger growl. But since the twig break can have multiple causes, it evokes more fear.

The second question I posed was where the sound would be scarier: in front or in the surrounds. The majority answered the surrounds. Again, the reason I would give for this is based on episodic memory, because a sound without a known source is more intimidating than one whose source we know.