Close your eyes and listen to the sounds around you. Perhaps there’s a bird chirping in the distance, a plane flying overhead, or a clock ticking. With your eyes shut, how do you know where those sounds are coming from? Are they to your left or right; are they near or far? Your ability to make these determinations is based on physics. Sound coming from your right arrives at your right ear louder and earlier than it arrives at your left ear, and your brain quickly recognizes these differences, which in turn creates the perception that the sound is indeed coming from your right. An object that’s far away is quieter not just in its overall volume, but also duller in tone, because some frequencies of sound don’t pass through air as easily as others; the quality of the sound therefore changes with its distance from you.
Audio engineers use Volume and Panning controls, among others, to re-create this sense of space when you listen to pre-recorded material through a set of speakers. Music engineers use these controls to paint the sonic picture they want the listener to perceive - for example, is the guitarist standing on the left or right side of the stage? With audio for film and TV, there’s already an image presented on the screen, so the engineer uses the same controls to create an audible image that matches the visual one - it would be very distracting to see a man talking on the right side of the screen while his voice comes from the left. Panning therefore has to be applied so that the position of the sound matches the visual image. Video games are much like film and TV in that there’s an image on screen; the major difference is that with games there’s no way to predict exactly what that image will be, because it is dictated by how the player plays the game. This all but eliminates the ability for game sound designers to use Volume and Panning controls in a conventional way. To accommodate the unpredictability of a game, audio engines such as Wwise use dedicated systems that allow the game itself to automatically control how the audio is mixed in real time. This process is most often called Positioning and Spatialization of sound.
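To make the panning idea concrete, here is a minimal sketch of a constant-power pan law - a common way a mixer can derive left and right gains from a single pan position. The function name and value ranges are illustrative assumptions, not any engine’s actual API:

```python
import math

def constant_power_pan(pan: float) -> tuple[float, float]:
    """Left/right gains for a mono source.

    pan ranges from -1.0 (hard left) through 0.0 (center) to +1.0 (hard right).
    A constant-power law keeps perceived loudness roughly steady as the
    source sweeps across the stereo field.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # map [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)
```

At center (pan = 0.0) both channels receive a gain of about 0.707 (-3 dB), so the summed acoustic power matches that of a hard-panned signal.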
As you might know, most common game engines offer basic tools for the Positioning and Spatialization of sound within the game. Many engines have adopted a concept originating in the 90s called 2D/3D Positioning. By 2D Positioning, these engines mean that the sound is played in its original channel configuration as a sort of overlay to the scene, independent of the movement of the Player Character or the Scene Camera. Music is a good example: it is most commonly played back in stereo, hence through a left and a right channel. 3D Positioning, on the other hand, means that there is a Game Object or Actor that the sound originates from - compare it to a loudspeaker or an Emitter within the scene. It is therefore possible to measure the distance between this Audio Source, or Emitter, and a so-called Audio Listener, positioned either on the Player Character or on the Scene Camera. This distance and direction are then used to modify the sound as heard by the Audio Listener. The following two examples show the basic 2D/3D Positioning concept in a modern video game.
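The distance measurement described above can be sketched in a few lines. The function names and the linear falloff curve below are illustrative assumptions, not any engine’s actual API; real engines typically offer several falloff shapes, and Wwise, for example, lets you edit the attenuation curves directly:

```python
import math

def emitter_distance(emitter_pos, listener_pos):
    """Euclidean distance between an Audio Source and an Audio Listener,
    given as (x, y, z) coordinate tuples."""
    return math.dist(emitter_pos, listener_pos)

def linear_attenuation(d, min_dist=1.0, max_dist=50.0):
    """Full volume inside min_dist, fading linearly to silence at max_dist."""
    if d <= min_dist:
        return 1.0
    if d >= max_dist:
        return 0.0
    return 1.0 - (d - min_dist) / (max_dist - min_dist)
```

Each frame, the engine re-measures the distance and re-applies the curve, which is what makes the mix follow the player automatically.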
In this example, the woman talking has an Audio Source attached to her Game Object, and her voice line originates from that very Game Object. The Listener could be placed in two ways. Either the Scene Camera has an Audio Listener attached to it, so that as the player moves the position and rotation of the Scene Camera, the direction of the sound changes accordingly; or the Listener sits on the Player Character while inheriting the Scene Camera’s rotation.
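Either way, what the engine ultimately needs is the direction of the Audio Source relative to the Listener’s orientation. Below is a minimal sketch of how that direction could be derived from the Listener’s forward vector; the helper name is hypothetical and, for simplicity, it works in the horizontal plane only:

```python
import math

def relative_azimuth(listener_pos, listener_forward, emitter_pos):
    """Signed angle (degrees) of the emitter around the listener.

    0 = straight ahead, positive = to the right, negative = to the left.
    Positions are (x, z) pairs in the horizontal plane; listener_forward
    must be a normalized direction vector.
    """
    dx = emitter_pos[0] - listener_pos[0]
    dz = emitter_pos[1] - listener_pos[1]
    fx, fz = listener_forward
    # atan2(cross, dot) gives the signed angle between the two vectors.
    return math.degrees(math.atan2(dx * fz - dz * fx, dx * fx + dz * fz))
```

If the Listener inherits the camera’s rotation, turning the camera changes listener_forward, and the same stationary emitter pans across the speakers accordingly.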
This next example further shows the Audio Source and Audio Listener relationship. A Game Object that is far away is quieter in volume and softer in tone. This changes back to normal as the distance between the Audio Source and the Listener decreases. Also, when the Game Object is far away you can hear natural reverberation and subtle echoes; as the Game Object comes closer, the sound becomes much drier and the reverberation fades to the point of being unnoticeable.
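These distance effects - duller tone and more reverb when far, bright and dry when near - are typically driven by simple distance-indexed curves. The sketch below is purely illustrative: the frequency range and the linear curve shapes are assumptions, not values from any particular engine (in Wwise, the equivalent curves live in an Attenuation ShareSet):

```python
def distance_to_tone_and_reverb(d, max_dist=50.0):
    """Map emitter distance to a low-pass cutoff and a reverb send level.

    Far sources lose high frequencies (air absorption) and gain reverb;
    near sources stay bright and dry.
    """
    t = min(max(d / max_dist, 0.0), 1.0)          # normalized distance, 0..1
    cutoff_hz = 20000.0 - t * (20000.0 - 1000.0)  # 20 kHz near -> 1 kHz far
    reverb_send = t                               # dry near -> wet far
    return cutoff_hz, reverb_send
```

Driving a low-pass filter and an auxiliary reverb send from these two values reproduces the near/far impression described above.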
Audiokinetic Wwise has moved on from this old concept and offers a much more comprehensive and granular toolset for the Positioning needs of modern game audio. It still relies on Audio Sources and Audio Listeners within the scene, but the transition from 2D to 3D is much more fluid. Hence, you will no longer find the classical terms 2D and 3D Positioning as checkboxes for Objects in Wwise, but rather the terms Speaker Panning and Listener Relative. Let’s look at this feature set and try it out on a few examples in the Platformer Game.