Voice Chat Spatialization

I would like to extend the existing voice chat implementation available through OnlineSubsystemNull in order to spatialize the source of the chat (e.g. link it to the position of specific actors in the game).
This is currently not supported. I’m trying to modify the engine source code to achieve this, but I would need some guidance.

In particular:

  • How do I access the audio components
    generated for voice chat?
  • Which files are a good starting point to understand how this works?
    VoiceInterfaceImpl.cpp?
  • Where can I find an example of playing spatialized sounds?
  • What could be done to expose these voice streams through blueprints?

Thanks in advance for any help!

Any feedback?

Hi Devel,

This sounds like a great feature! …and pretty involved!

So, unfortunately, UE4 audio has quite different implementations per platform, which means spatialization is done differently on each one – however, you may be able to avoid doing too much in that domain.

For PC, in the current head revision of UE4 (and upcoming 4.8), for spatialization code, you can look into XAudio2Source.cpp and FXAudio2SoundSource::GetMonoChannelVolumes() to see how we build speaker maps for mono sources based on the listener and sound emitter positions. It’s a matter of tracing up from there to see how the sound’s position is created and updated.
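Roughly, the idea is something like this (a hand-wavy sketch of the concept, not the actual engine code; EmitterLocation and ListenerLocation are stand-in inputs):

// Conceptual sketch of what GetMonoChannelVolumes() does, not engine code.
// Assumes the listener faces +X; the real code transforms into listener space.
FVector Direction = (EmitterLocation - ListenerLocation).GetSafeNormal();

// Azimuth of the emitter in the listener's horizontal plane.
float Azimuth = FMath::Atan2(Direction.Y, Direction.X);

// Equal-power pan between the front left/right speakers.
float Pan = FMath::Sin(Azimuth);                    // -1 = left, +1 = right
float LeftGain = FMath::Sqrt(0.5f * (1.0f - Pan));
float RightGain = FMath::Sqrt(0.5f * (1.0f + Pan));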

I think the VOIP stuff is implemented as a “procedural” data source (i.e. it’s not read from a file but retrieved from network packets). Audio is fed into the audio system from USoundWaveStreaming::GeneratePCMData(), and the incoming network packets of audio are queued up via USoundWaveStreaming::QueueAudio().
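For instance, the hand-off looks something like this (a minimal sketch; the handler name is made up, but QueueAudio() is the entry point I mean):

// Sketch: feeding incoming VOIP packets into the procedural sound wave.
// OnVoicePacketReceived is a hypothetical handler, not engine API.
void OnVoicePacketReceived(USoundWaveStreaming* VoiceWave, const uint8* Data, int32 Size)
{
    // Queued bytes are pulled later by the mixer via GeneratePCMData().
    VoiceWave->QueueAudio(Data, Size);
}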

As for exposing the VOIP stuff to Blueprints, you’d probably need to make a new subclass of USoundBase so that it can be used inside a UAudioComponent. If you search the code you’ll see examples of classes which subclass USoundBase (USoundWave, UDialogueSoundWaveProxy, USoundMod, USoundCue). You might want to call it USoundWaveVoip or something. I believe once you do this, there might not be too much more work to get the thing spatialized, since the sound’s location (via FActiveSound) is obtained through UAudioComponent in OnUpdateTransform. Doing this sounds like it might be easy, but my guess is there’ll be some gotchas you run into.
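A minimal sketch of what that declaration might look like (deriving from the streaming wave rather than USoundBase directly; the exact virtuals vary by engine version, so verify before relying on this):

// Hypothetical USoundWaveVoip declaration, not engine code.
UCLASS()
class USoundWaveVoip : public USoundWaveStreaming
{
    GENERATED_BODY()

public:
    // Hand queued network voice bytes to the mixer on demand.
    virtual int32 GeneratePCMData(uint8* PCMData, const int32 SamplesNeeded) override;
};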

As for adding new voice capture code to various platforms, you’ll have to implement the IVoiceCapture interface, of which there are two current implementations (PC and Mac) in VoiceCaptureWindows.cpp and VoiceModuleMac.cpp.
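From memory, the interface looks roughly like this (check VoiceCapture.h for the exact signatures in your engine version):

// Approximate shape of IVoiceCapture; see VoiceCapture.h for the real thing.
class IVoiceCapture
{
public:
    virtual bool Init(int32 SampleRate, int32 NumChannels) = 0;
    virtual void Shutdown() = 0;
    virtual bool Start() = 0;
    virtual void Stop() = 0;
    virtual bool IsCapturing() = 0;
    // Poll for captured audio, then copy it out when available.
    virtual EVoiceCaptureState::Type GetCaptureState(uint32& OutAvailableVoiceData) const = 0;
    virtual EVoiceCaptureState::Type GetVoiceData(uint8* OutVoiceBuffer, uint32 InVoiceBufferSize, uint32& OutAvailableVoiceData) = 0;
};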

Sorry for the delay in replying, hope the above is helpful to you!

Aaron

Thank you Aaron, that’s a lot of information! I will look into it and try to understand how all of this works. Does XAudio2 also support Mac/Android, or is it Windows-only?

From 4.8 the Oculus Audio SDK should also be included in the engine, providing a cross-platform solution. Would this be useful for this purpose?

No, XAudio2 is only for PC/Xbox One. That’s part of what makes doing this potentially tricky – almost every platform has a different audio engine. On Android we use the OpenSL ES API, and on Mac we use CoreAudio.

Also, we only implemented the PC version of the Oculus SDK as an XAPO plugin. The reason we only did this for PC is that our GDC VR demos were on PC, and we’d have to write a different plugin for every platform. We have some longer-term plans for a better multi-platform solution for audio, but that’s a ways off.

And, in case you didn’t know, the Oculus Audio SDK just implements an HRTF-based spatialization algorithm for more realistic headphone spatialization. If you did the above (implement a VOIP USoundBase) as a mono sound source, it shouldn’t be too difficult to get the audio stream to also go through the Oculus HRTF spatialization (on PC).

Here are links to audio API documentation for the platforms you seem interested in:

PC: XAudio2; XAPO plugins for XAudio2
Mac: CoreAudio
Android: OpenSL ES

Ok great, thank you! For a cross-platform solution based on the Oculus Audio SDK I was also trying to use the FMOD plugin; I’m not sure it’s suitable for a procedural data source, though.

Just curious, did you manage to get any of this working? I am starting down that path as well, and at the moment I’m really only interested in PC.

I’ve given up for the moment. I’m interested specifically in spatialization on the Gear VR, but the Oculus Audio SDK is still poorly supported on Android. Please let me know if you achieve results on this topic anyway!

Hey guys, let me know if you’ve made any progress too. I’ve recently started back in on building a multi-platform mixer after the 4.11 and Paragon work. This should be way easier to do once we have our own mixer.

Hi, I am having an issue with USoundWaveProcedural on the Android platform. It seems like the call

result = (*SL_PlayerBufferQueue)->Enqueue(SL_PlayerBufferQueue, AudioBuffers[0].AudioData, AudioBuffers[0].AudioDataSize);

returns an invalid parameter error. Do you have any idea how to fix it? Thanks.

Hi Aaron, I recently started working on this a bit before seeing this thread. What I have done so far is create an actor with a replicated UniqueNetId and an audio component, plus a function that sets the UniqueNetId from an APlayerController. I do that in PostLogin after spawning, and then attach the actor near the head of the pawn/character.

I then added a method to the online session’s Voice interface that maps a UniqueNetId to an existing Audio component. I then wired the same thing into Voice Engine and made sure to use a lock on the map there.

I’m only dealing with the existing voice engines, Null and Steam.

Right now I’m just wiring up the component initialization and dealing with getting the right sample rate. I added a SetupAudioComponentForVoice function based on CreateVoiceAudioComponent in OnlineSubsystemUtils; it basically just calls SetSound on an existing component with a USoundWaveProcedural created in the same way, roughly as sketched below.
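It looks roughly like this (a sketch of my helper; the name and defaults are mine, mirroring what CreateVoiceAudioComponent does):

// My helper, modeled on CreateVoiceAudioComponent in OnlineSubsystemUtils;
// it reuses an existing component instead of spawning a new one.
void SetupAudioComponentForVoice(UAudioComponent* AudioComponent, int32 SampleRate)
{
    USoundWaveProcedural* SoundStreaming = NewObject<USoundWaveProcedural>();
    SoundStreaming->SampleRate = SampleRate;
    SoundStreaming->NumChannels = 1;
    SoundStreaming->Duration = INDEFINITELY_LOOPING_DURATION;
    SoundStreaming->SoundGroup = SOUNDGROUP_Voice;
    SoundStreaming->bLooping = false;

    AudioComponent->SetSound(SoundStreaming);
}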

Am I on the right track? The main things I’m worried about are thread safety of the audio component and garbage collection (I pass the audio component as a TWeakObjectPtr and check it before use, as in the sketch below). I assumed I would get whatever spatialization is in the engine for free with this method (just by setting AudioComponent->bAllowSpatialization on my game thread before handing it over to the VOIP subsystem).
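For the GC concern, the check looks like this (a sketch; the member and function names are mine):

// The voice engine only holds a weak reference, so a component that has
// been garbage collected is detected before use.
TWeakObjectPtr<UAudioComponent> WeakVoiceComponent;

void SubmitVoiceBuffer(const uint8* Data, int32 Size)
{
    if (UAudioComponent* AC = WeakVoiceComponent.Get())
    {
        if (USoundWaveProcedural* Wave = Cast<USoundWaveProcedural>(AC->Sound))
        {
            Wave->QueueAudio(Data, Size);
        }
    }
}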

Anyway, after reading this thread I started to think my approach might not work. Any feedback?

Thanks

So I’ve implemented this. I changed approach a bit: the audio is now sent via a TQueue to the game thread, which polls it on tick. It all works and I get sound, but spatialization doesn’t work. Is there anything special needed to use it from C++?
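The hand-off between threads looks roughly like this (a sketch; the class and names are placeholders):

// Voice engine thread enqueues raw PCM; the game thread drains it on tick
// into the procedural wave. Single producer, single consumer.
class FVoiceDataRelay
{
public:
    // Called from the voice engine / network thread.
    void PushPacket(const uint8* Data, int32 Size)
    {
        PendingVoiceData.Enqueue(TArray<uint8>(Data, Size));
    }

    // Called on the game thread, e.g. from TickComponent().
    void DrainTo(USoundWaveProcedural* Wave)
    {
        TArray<uint8> Packet;
        while (PendingVoiceData.Dequeue(Packet))
        {
            Wave->QueueAudio(Packet.GetData(), Packet.Num());
        }
    }

private:
    TQueue<TArray<uint8>, EQueueMode::Spsc> PendingVoiceData;
};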

Here’s my component’s constructor:

bWantsBeginPlay = true;
PrimaryComponentTick.bCanEverTick = true;

// ...

bReplicates = true;

// Audio component that plays the incoming voice data.
AudioComponent = CreateDefaultSubobject<UAudioComponent>(TEXT("Voice"));
AudioComponent->SetupAttachment(this);
AudioComponent->bAutoActivate = false;
AudioComponent->bAllowSpatialization = true;
AudioComponent->AttenuationSettings = SoundAttenuation;

// Procedural wave the dequeued VOIP data is fed into.
// Note: 11025 Hz is the standard rate here; 11050 may be a typo, and this
// should match what the voice capture actually delivers.
USoundWaveProcedural* SoundStreaming = NewObject<USoundWaveProcedural>();
SoundStreaming->SampleRate = 11050;

SoundStreaming->NumChannels = 1;
SoundStreaming->Duration = INDEFINITELY_LOOPING_DURATION;
SoundStreaming->SoundGroup = SOUNDGROUP_Voice;
SoundStreaming->bLooping = false;
SoundStreaming->OnSoundWaveProceduralUnderflow.BindUObject(this, &USpatializedOnlineVoice::handle_underflow);
AudioComponent->SetSound(SoundStreaming);

AudioComponent->Priority = Priority;

SoundAttenuation is just defined as a UPROPERTY like this, and I set the value in the editor:

UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Audio")
USoundAttenuation* SoundAttenuation;

I tested the same attenuation asset on some looping fire audio and it works fine, but it doesn’t work here. Is it incompatible with USoundWaveProcedural, or did I just initialize something wrong?

Have you stepped through to the lower levels of the code where spatialization actually occurs? On PC, that would be FXAudio2SoundSource::Update().

Hi muchcharles, did you get spatialization to work? I would be very interested in this.

I got this working. I create and attach the audio component in my character’s constructor. Then in BeginPlay() I create the SoundWave and set the audio component’s sound to the created SoundWave. I also store a pointer to the audio component in my character’s PlayerState. Then in the voice engine, I can grab the audio component from the PlayerState with a virtual function I added to APlayerState.

Character.cpp:

AMyCharacter::AMyCharacter(const FObjectInitializer& ObjectInitializer)
{
    // Create and attach the VOIP component so it moves with the character.
    VoiceAudioComponent = CreateDefaultSubobject<UAudioComponent>(TEXT("VoIPAudioComponent"));
    VoiceAudioComponent->SetupAttachment(RootComponent);
    VoiceAudioComponent->bIsUISound = false;
    VoiceAudioComponent->bAllowSpatialization = true;
    VoiceAudioComponent->SetVolumeMultiplier(1.5f);
}

void AMyCharacter::BeginPlay()
{
    Super::BeginPlay();

    if (AMyPlayerState* PS = Cast<AMyPlayerState>(GetPlayerState()))
    {
        // Procedural wave the voice engine will queue PCM data into.
        USoundWaveProcedural* SoundStreaming = NewObject<USoundWaveProcedural>();
        SoundStreaming->SampleRate = 16000;
        SoundStreaming->NumChannels = 1;
        SoundStreaming->Duration = INDEFINITELY_LOOPING_DURATION;
        SoundStreaming->SoundGroup = SOUNDGROUP_Voice;
        SoundStreaming->bLooping = false;

        VoiceAudioComponent->SetSound(SoundStreaming);

        // Expose the component so the voice engine can find it later.
        PS->VoiceAudioComponent = VoiceAudioComponent;
    }
}

VoiceEngineSteam.cpp:

uint32 FVoiceEngineSteam::SubmitRemoteVoiceData(const FUniqueNetId& RemoteTalkerId, uint8* Data, uint32* Size)
{
    ...
    // Find the talker's player state and reuse its override component if
    // one was registered; otherwise fall back to a freshly created one.
    QueuedData->AudioComponent = [&]()
    {
        UWorld* World = GetWorldForOnline(SteamSubsystem->GetInstanceName());
        if (World && World->GameState)
        {
            for (const auto& PlayerState : World->GameState->PlayerArray)
            {
                if (*PlayerState->UniqueId == RemoteTalkerId)
                {
                    if (UAudioComponent* OverrideAudioComponent = PlayerState->GetOverrideVoiceAudioComponent())
                    {
                        // Keep the procedural wave's rate in sync with Steam's capture rate.
                        if (USoundWaveProcedural* VoIPSound = Cast<USoundWaveProcedural>(OverrideAudioComponent->Sound))
                        {
                            VoIPSound->SampleRate = SteamUserPtr->GetVoiceOptimalSampleRate();
                        }

                        return OverrideAudioComponent;
                    }
                    else
                    {
                        break;
                    }
                }
            }
        }

        return CreateVoiceAudioComponent(SteamUserPtr->GetVoiceOptimalSampleRate());
    }();
    ...
}

Edit:
We ran into another issue with attenuated VoIP. With how the engine handles sounds, if a sound has a volume of 0, it doesn’t play it. So if someone was too far away to hear a player talking, the VoIP data was being cached and queued for playback once that player came back into attenuation range; you could end up hearing what someone said minutes ago. We got around this by using a custom attenuation distance curve with a minimum volume of 0.01 instead of 0 (keys (0, 1) and (1, 0.01)).
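If you’d rather build that curve in code than in the editor, it would be something like this (a sketch; we set ours up in the editor, and the field names here follow the 4.x attenuation settings, so double-check them against your version):

// Custom (0, 1) -> (1, 0.01) distance curve so volume never reaches zero.
USoundAttenuation* VoiceAttenuation = NewObject<USoundAttenuation>();
VoiceAttenuation->Attenuation.DistanceAlgorithm = ATTENUATION_Custom;

FRichCurve* Curve = VoiceAttenuation->Attenuation.CustomAttenuationCurve.GetRichCurve();
Curve->AddKey(0.0f, 1.0f);   // full volume at the talker
Curve->AddKey(1.0f, 0.01f);  // floor of 0.01 so the sound keeps playing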

Cool solution if it works. I am going to try it out some time soon.

Is this already optimized with regard to network relevancy and network performance?

If you always have a sound volume of at least 0.01, theoretically every player will hear what you say (even if you are kilometers away and they can’t actually hear it acoustically due to the low volume).
So the sound always gets broadcast over the network whether or not anyone hears it, and for 64 players or more that can end up littering the network traffic, can’t it? Maybe I am completely wrong about that… please correct me if I am.

Anyway, great approach. Have a nice day.

Do I need to include something in the character class to use USoundWaveProcedural*? I am getting the error ‘identifier "USoundWaveProcedural" is undefined’.

Yes.
#include "Classes/Sound/SoundWaveProcedural.h"

You are correct. You could add additional checks for distance and decide if you should play the sound or not.
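For example, something along these lines (a rough sketch; the function name and threshold are made up, and it would live wherever you submit or relay the voice data):

// Skip relaying voice for talkers beyond audible range instead of relying
// solely on the nonzero volume floor. Names and threshold are hypothetical.
bool ShouldRelayVoice(const FVector& TalkerLocation, const FVector& ListenerLocation)
{
    const float MaxAudibleDistance = 5000.0f; // match the attenuation radius
    return FVector::DistSquared(TalkerLocation, ListenerLocation)
        <= FMath::Square(MaxAudibleDistance);
}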

Sorry to bother you again but is there a way to do this when using the binary version of the engine (no source build)?