mapping SPEAKER_FRONT_LEFT by default when single channel instead of SPEAKER_FRONT_CENTER #87

Open
rubeniskov opened this issue Nov 9, 2024 · 2 comments

@rubeniskov

When the generated waveform is played back in certain audio players, the sound is routed only to the left speaker. This seems to happen when the player relies on the speaker channel mask embedded in the audio file: although the file is mono, its single channel is mapped to the left speaker alone instead of being played through both speakers, so the right speaker stays silent.

ffprobe .\musicgpt-generated.wav
ffprobe version 7.1-full_build-www.gyan.dev Copyright (c) 2007-2024 the FFmpeg developers
  built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.100 / 61. 19.100
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
Input #0, wav, from '.\musicgpt-generated.wav':
  Duration: 00:00:09.94, bitrate: 1024 kb/s
  Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 32000 Hz, 1 channels (FL), flt, 1024 kb/s
diff --git "a/.\\ffprobe-fl.txt" "b/.\\ffprobe-mono.txt"
index 3bdcb7a..a191f71 100644
--- "a/.\\ffprobe-fl.txt"
+++ "b/.\\ffprobe-mono.txt"
@@ -1,4 +1,4 @@
-ffprobe .\musicgpt-generated.wav
+ffprobe .\output_mono.wav
 ffprobe version 7.1-full_build-www.gyan.dev Copyright (c) 2007-2024 the FFmpeg developers
   built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
   configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libaribcaption --enable-libdav1d --enable-libdavs2 --enable-libopenjpeg --enable-libquirc --enable-libuavs3d --enable-libxevd --enable-libzvbi --enable-libqrencode --enable-librav1e --enable-libsvtav1 --enable-libvvenc --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxeve --enable-libxvid --enable-libaom --enable-libjxl --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libcodec2 --enable-libilbc --enable-libgsm --enable-liblc3 --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
@@ -10,6 +10,8 @@ ffprobe version 7.1-full_build-www.gyan.dev Copyright (c) 2007-2024 the FFmpeg d
   libswscale      8.  3.100 /  8.  3.100
   libswresample   5.  3.100 /  5.  3.100
   libpostproc    58.  3.100 / 58.  3.100
-Input #0, wav, from '.\musicgpt-generated.wav':
+Input #0, wav, from '.\output_mono.wav':
+  Metadata:
+    encoder         : Lavf61.7.100
   Duration: 00:00:09.94, bitrate: 1024 kb/s
-  Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 32000 Hz, 1 channels (FL), flt, 1024 kb/s
\ No newline at end of file
+  Stream #0:0: Audio: pcm_f32le ([3][0][0][0] / 0x0003), 32000 Hz, 1 channels, flt, 1024 kb/s
\ No newline at end of file
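
The `(FL)` annotation in the first ffprobe output suggests that the file carries an explicit speaker mask. As a minimal sketch for checking this without ffprobe, the function below reads `dwChannelMask` from the body of a `fmt ` chunk, assuming the file uses the `WAVE_FORMAT_EXTENSIBLE` layout (format tag `0xFFFE`, mask at byte offset 20 of the chunk body). The function name `channel_mask_from_fmt` is hypothetical and is not part of hound or FFmpeg; locating the `fmt ` chunk inside the RIFF file is left out.

```rust
/// Format tag that marks a `fmt ` chunk as WAVE_FORMAT_EXTENSIBLE.
const WAVE_FORMAT_EXTENSIBLE: u16 = 0xFFFE;

/// Hypothetical helper: returns `dwChannelMask` from the body of a `fmt `
/// chunk (the bytes after the 8-byte chunk header).
///
/// Returns `None` when the body is too short to contain the mask or when the
/// format tag is not WAVE_FORMAT_EXTENSIBLE (plain PCM headers have no mask).
fn channel_mask_from_fmt(fmt: &[u8]) -> Option<u32> {
    // Layout of the body: wFormatTag (2), nChannels (2), nSamplesPerSec (4),
    // nAvgBytesPerSec (4), nBlockAlign (2), wBitsPerSample (2), cbSize (2),
    // wValidBitsPerSample (2), dwChannelMask (4), SubFormat (16).
    if fmt.len() < 24 {
        return None;
    }
    let format_tag = u16::from_le_bytes([fmt[0], fmt[1]]);
    if format_tag != WAVE_FORMAT_EXTENSIBLE {
        return None;
    }
    Some(u32::from_le_bytes([fmt[20], fmt[21], fmt[22], fmt[23]]))
}
```

Under these assumptions, a mask of `0x1` would correspond to SPEAKER_FRONT_LEFT and `0x4` to SPEAKER_FRONT_CENTER.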

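The problem seems to come from how the default channel mask is derived from the channel count: the function below sets the lowest `channels` bits, so a single channel gets the mask 0x1 (SPEAKER_FRONT_LEFT). For a single channel the mask should instead be 0x4 (SPEAKER_FRONT_CENTER).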

hound/src/write.rs

Lines 124 to 149 in b5b6fbd

/// Generates a bitmask with `channels` ones in the least significant bits.
///
/// According to the [spec](https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ksmedia/ns-ksmedia-waveformatextensible#remarks),
/// if `channels` is greater than the number of bits in the channel mask, 18 non-reserved bits,
/// extra channels are not assigned to any physical speaker location. In this scenario, this
/// function will return a filled channel mask.
fn channel_mask(channels: u16) -> u32 {
    // clamp to 0-18 to stay within reserved bits
    (0..channels.clamp(0, 18) as u32).map(|c| 1 << c).fold(0, |a, c| a | c)
}

#[test]
fn verify_channel_mask() {
    assert_eq!(channel_mask(0), 0);
    assert_eq!(channel_mask(1), 1);
    assert_eq!(channel_mask(2), 3);
    assert_eq!(channel_mask(3), 7);
    assert_eq!(channel_mask(4), 0xF);
    assert_eq!(channel_mask(8), 0xFF);
    assert_eq!(channel_mask(16), 0xFFFF);
    // expect channels >= 18 to yield the same mask
    assert_eq!(channel_mask(18), 0x3FFFF);
    assert_eq!(channel_mask(32), 0x3FFFF);
    assert_eq!(channel_mask(64), 0x3FFFF);
    assert_eq!(channel_mask(129), 0x3FFFF);
}
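
One possible shape of a fix, as a rough sketch only: special-case the mono count so it maps to SPEAKER_FRONT_CENTER (0x4) and keep the existing behaviour for every other channel count. The name `channel_mask_mono_center` and the `SPEAKER_FRONT_CENTER` constant are introduced here for illustration and are not part of hound; whether the maintainers would prefer this or a different default is an open question.

```rust
/// Bit for the front-center speaker in a WAVEFORMATEXTENSIBLE channel mask.
const SPEAKER_FRONT_CENTER: u32 = 0x4;

/// Hypothetical variant of `channel_mask` that maps a single channel to the
/// front-center speaker instead of front-left.
///
/// All other channel counts keep the original behaviour: the lowest
/// `channels` bits are set, capped at the 18 non-reserved bits.
fn channel_mask_mono_center(channels: u16) -> u32 {
    match channels {
        // Mono: front center, so players that honour the mask use both speakers.
        1 => SPEAKER_FRONT_CENTER,
        // Everything else: fill the lowest `n` bits, at most 18 of them.
        n => (0..u32::from(n.min(18))).fold(0, |mask, bit| mask | (1u32 << bit)),
    }
}
```

With this variant, the existing test would need `assert_eq!(channel_mask(1), 1)` changed to expect `4`; the other assertions would hold unchanged.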


@rubeniskov rubeniskov changed the title mapping SPEAKER_FRONT_LEFT speaker by default when single channel instead of SPEAKER_FRONT_CENTER mapping SPEAKER_FRONT_LEFT by default when single channel instead of SPEAKER_FRONT_CENTER Nov 9, 2024
@rubeniskov
Author

gabotechs/MusicGPT#21

@rubeniskov
Author

#88
