Description
Tested versions
Reproducible in 4.4
System information
Godot v4.4.stable (77dcf97) - NixOS #1-NixOS SMP PREEMPT_DYNAMIC Thu Sep 12 09:13:13 UTC 2024 on X11 - X11 display driver, Multi-window, 1 monitor - OpenGL 3 (Compatibility) - Mesa Intel(R) Graphics (ADL GT2) - 12th Gen Intel(R) Core(TM) i5-1240P (16 threads)
Issue description
The purpose of this issue is to draw together the observations of a bug found here #103655 (comment) and argue about what the solution is, of which this PR #103926 is but one possibility.
By orbiting an AudioStreamPlayer3D around an AudioListener3D in the horizontal plane and plotting the amplitude of the AudioEffectCaptured left and right channels on a polar graph (see the project below), one gets the following graphs:
Panning Strength=1 | Panning Strength=0.3 |
---|---|
![]() | ![]() |
This is definitely wrong. Aside from being weird, this diagram contradicts the documentation under panning strength:
A value of 0.0 disables stereo panning entirely, leaving only volume attenuation in place. A value of 1.0 completely mutes one of the channels if the sound is located exactly to the left (or right) of the listener.
The default value of 0.5 is tuned for headphones. When using speakers, you may find lower values to sound better as speakers have a lower stereo separation compared to headphones.
Clearly, reducing the panning strength is increasing the stereo separation, especially around the two misplaced nodes at 45 degrees to the backward facing vector.
Now, I have identified the source of the bug (an algorithm designed for 7.1 surround sound is being used for stereo sound with the wrong speaker positions), but what is the correct answer?
Let's check what happens on other platforms.
Web Audio (W3C)
This platform is fully documented because it is a standard. It is reasonable to expect that its designers wanted it to conform to mainstream conventions.
The specification for PannerNode "equalpower" Panning is described thus:
This is a simple and relatively inexpensive algorithm which provides basic, but reasonable results. It is used for the PannerNode when the panningModel attribute is set to "equalpower", in which case the elevation value is ignored.
In this case the azimuth is the angle relative to the direction the listener is facing. Anything outside of the -90 to 90 degree region is behind the listener and is mirrored to the front:
if (azimuth < -90)
  azimuth = -180 - azimuth;
else if (azimuth > 90)
  azimuth = 180 - azimuth;
if (azimuth <= 0)
  x = (azimuth + 90) / 90;
else
  x = azimuth / 90;
gainL = cos(x * Math.PI / 2);
gainR = sin(x * Math.PI / 2);
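For a mono source, my reading of the same section of the spec is that the pan position after the mirroring step is x = (azimuth + 90) / 180, and that the two-branch x belongs to the stereo-input case. Below is a minimal C++ sketch under that assumption. The `StereoGains` struct and the `equal_power_gains` name are made up for illustration and are not part of any API.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type, for illustration only.
struct StereoGains {
    double left;
    double right;
};

// Equal-power gains for a mono source.
// azimuth_deg: 0 = straight ahead, negative = left, positive = right.
StereoGains equal_power_gains(double azimuth_deg) {
    assert(std::isfinite(azimuth_deg));
    const double pi = std::acos(-1.0);
    // Clamp to [-180, 180], then mirror anything behind the listener to the front.
    double azimuth = std::fmin(180.0, std::fmax(-180.0, azimuth_deg));
    if (azimuth < -90.0) {
        azimuth = -180.0 - azimuth;
    } else if (azimuth > 90.0) {
        azimuth = 180.0 - azimuth;
    }
    // Map [-90, 90] onto x in [0, 1] (mono-input mapping, assumed as described above).
    const double x = (azimuth + 90.0) / 180.0;
    return {std::cos(x * pi / 2.0), std::sin(x * pi / 2.0)};
}
```

With this mapping a source straight ahead gets equal gain in both channels, and a source at -90 degrees gets full gain on the left and none on the right.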
Unity
I have not found the explicit documentation for this, so instead I collected the data of the sound intensity in each ear as the camera rotates around in the horizontal plane and looks up and looks down.
Horizontal Camera Vector | Volume intensity polar plot |
---|---|
![]() |
![]() |
The orange and blue fuzzy lines are the measured data. The red and green lines are the polar plots of cos(t/2) and sin(t/2), the same as the implementation I found in the UE code. It's a close match, although I have not been able to account for the discrepancy.
Unreal Engine
The public Spatialization Overview documentation for Unreal Engine describes panning as "the oldest and simplest way to simulate spatialization".
The computation is diagrammed for four speakers, but when there are two speakers they sit 180 degrees apart on opposite sides of the head. The panning value for the left speaker is X/180, where X is the horizontal angle in degrees between the vector to the sound source and the vector that is perpendicular to your left ear. The panning value for the right speaker is then 1 - X/180, which is consistent.
To convert this value to "equal power panning", such that the sum of the squares of the volume/gain in each ear is constant (because kinetic energy is proportional to the square of the velocity), they offer the choice of a "square-root panning law" or the "cosine panning law".
Their "cosine panning law", where left_volume=cos(X) and right_volume=sin(X), agrees with the documented Web Audio behavior and the observed Unity behavior. Confirmation that the elevation angle to the sound source is ignored is provided by the section titled "The Problem of Vector Flipping".
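To make the difference between the panning laws concrete, here is a rough C++ sketch of all three. It is not taken from the UE source. It assumes t is the horizontal angle in degrees (0 to 180) between the source direction and the axis pointing out of one ear (the "near" ear), and every name in it (`PanGains`, `pan_fraction`, `linear_pan`, `sqrt_pan`, `cosine_pan`, `power`) is made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: gain for the ear the angle is measured from, and for the other ear.
struct PanGains {
    double near_ear;
    double far_ear;
};

// Fraction of the way from the near ear (0) to the far ear (1).
double pan_fraction(double t_deg) {
    assert(t_deg >= 0.0 && t_deg <= 180.0);
    return t_deg / 180.0;
}

// Plain linear panning: gains sum to 1, but total power dips in the middle.
PanGains linear_pan(double t_deg) {
    const double p = pan_fraction(t_deg);
    return {1.0 - p, p};
}

// Square-root law: squares of the gains sum to 1.
PanGains sqrt_pan(double t_deg) {
    const double p = pan_fraction(t_deg);
    return {std::sqrt(1.0 - p), std::sqrt(p)};
}

// Cosine law: gains are cos(t/2) and sin(t/2), so the squares also sum to 1.
PanGains cosine_pan(double t_deg) {
    const double pi = std::acos(-1.0);
    const double p = pan_fraction(t_deg);
    return {std::cos(p * pi / 2.0), std::sin(p * pi / 2.0)};
}

// Total power delivered to both ears.
double power(const PanGains &g) {
    return g.near_ear * g.near_ear + g.far_ear * g.far_ear;
}
```

For a source straight ahead (t = 90), linear panning delivers half the power of a source at either ear, while both equal-power laws keep the total at 1.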
Proposed implementation for Godot
There are very good reasons not to depart from the default implementation given by these three established platforms in the stereo case. This includes their choice to ignore the elevation value, which is not how I would do it.
However, we can use some school mathematics to implement this calculation as follows without the use of trig functions:
// inputs
Vector3 to_source;
Vector3 listener_up; // unit vector, eg (0,1,0)
Vector3 listener_ear; // unit vector, eg (1,0,0)
// calculation
// project onto the listener's horizontal plane, which discards the elevation
Vector3 to_source_projected = to_source - listener_up*(listener_up.dot(to_source));
float cos_azimuth = listener_ear.dot(to_source_projected)/to_source_projected.length();
float cos_azimuth_by_two = sqrt((cos_azimuth + 1)/2); // using cos(t) = 2*cos(t/2)^2 - 1
return cos_azimuth_by_two;
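For anyone who wants to try this outside the engine, here is a self-contained sketch of the same calculation. It uses a stand-in `Vec3` struct rather than Godot's Vector3, and the names `EarGains` and `pan_gains` are hypothetical. Two things are my additions rather than part of the proposal: the opposite-ear gain via sin(t/2) = sqrt((1 - cos(t))/2), and a guard that falls back to equal gains when the source is directly above or below the listener, where the projection has zero length.

```cpp
#include <cassert>
#include <cmath>

// Stand-in vector type for illustration (not Godot's Vector3).
struct Vec3 {
    double x, y, z;
};

double dot(const Vec3 &a, const Vec3 &b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Gain for the ear that `ear` points out of, and for the opposite ear.
struct EarGains {
    double ear_side;
    double opposite_side;
};

EarGains pan_gains(const Vec3 &to_source, const Vec3 &up, const Vec3 &ear) {
    // Both listener axes are assumed to be unit length.
    assert(std::fabs(dot(up, up) - 1.0) < 1e-6);
    assert(std::fabs(dot(ear, ear) - 1.0) < 1e-6);
    // Project the source direction onto the horizontal plane (drops elevation).
    const double h = dot(up, to_source);
    const Vec3 projected = {to_source.x - up.x * h, to_source.y - up.y * h, to_source.z - up.z * h};
    const double len = std::sqrt(dot(projected, projected));
    if (len < 1e-9) {
        // Assumed fallback: source directly above or below, so pan to the center.
        return {std::sqrt(0.5), std::sqrt(0.5)};
    }
    // cos(t), clamped to guard against rounding slightly outside [-1, 1].
    const double c = std::fmax(-1.0, std::fmin(1.0, dot(ear, projected) / len));
    // Half-angle identities: cos(t/2) = sqrt((1 + c)/2), sin(t/2) = sqrt((1 - c)/2).
    return {std::sqrt((1.0 + c) / 2.0), std::sqrt((1.0 - c) / 2.0)};
}
```

Because t/2 stays within 0 to 90 degrees, both square roots take the positive branch and no trig functions are needed.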
More advanced implementations
All these platforms have the facility for a more advanced HRTF (Head-Related Transfer Function) implementation, even the Web Audio platform. However, UE and Unity do it with a plugin extension, and so should we, since it would otherwise involve some code bloat for a feature that is predominantly required for VR.
Unity seems to implement the panning_strength feature by blending this stereo calculation with the mono (same) value using a lerp, which is a good answer.
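One way to read that blend, as a sketch only (this is neither Unity's nor Godot's actual code): lerp each channel's gain between a shared mono gain and the fully panned gain. The `StereoGains` struct and `apply_panning_strength` name are hypothetical, and the mono gain of sqrt(0.5) is my assumption, chosen so that total power is 1 when the strength is 0.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type, for illustration only.
struct StereoGains {
    double left;
    double right;
};

// Blend fully panned gains toward a centered (mono) gain.
// strength = 0 gives identical channels, strength = 1 gives the panned gains unchanged.
StereoGains apply_panning_strength(StereoGains panned, double strength) {
    assert(strength >= 0.0 && strength <= 1.0);
    const double mono = std::sqrt(0.5); // assumed mono gain, equal power in both ears
    return {mono + (panned.left - mono) * strength,
            mono + (panned.right - mono) * strength};
}
```

This matches the documented endpoints: 0.0 leaves no stereo panning, and 1.0 mutes one channel when the panned gains are (1, 0). A linear blend of gains does not keep total power exactly constant at intermediate strengths, so that would need checking against the measured Unity curves.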
Let's just fix this by implementing it in the industry standard way so that everything else that builds upon it works.
Steps to reproduce
See below
Minimal reproduction project (MRP)