Add parse_wav function for WAV header metadata extraction by mthrok · Pull Request #1242 · facebookresearch/spdl

mthrok · 2025-12-11T23:51:55Z

Audio processing workflows often need to inspect WAV file metadata (sample rate, channels, bit depth) without loading the entire audio data into memory. This is particularly useful for validation, preprocessing decisions, or building dataset catalogs where full audio decoding would be unnecessarily expensive.

This adds a new parse_wav() function that efficiently extracts WAV header information without decoding audio samples. The function returns a strongly-typed WAVHeader TypedDict containing all standard WAV metadata fields (audio_format, num_channels, sample_rate, byte_rate, block_align, bits_per_sample, data_size).

The implementation includes comprehensive test coverage (13 test cases) validating all header fields across different audio configurations, error handling for invalid data, and consistency checks against the existing load_wav function.
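For readers unfamiliar with the RIFF layout, here is a minimal sketch of how header-only extraction can work. It is not the code from this PR: the names `WAVHeaderSketch` and `parse_wav_header_sketch` are hypothetical, the input is assumed to be an in-memory `bytes` object, and only the field names are taken from the description above. The sketch walks RIFF chunks, reads the 16-byte core of the `fmt ` chunk, and stops at the `data` chunk header without touching any samples.

```python
import struct
from typing import TypedDict


class WAVHeaderSketch(TypedDict):
    # Hypothetical stand-in for the WAVHeader TypedDict named in the description.
    audio_format: int
    num_channels: int
    sample_rate: int
    byte_rate: int
    block_align: int
    bits_per_sample: int
    data_size: int


def parse_wav_header_sketch(data: bytes) -> WAVHeaderSketch:
    """Read WAV metadata from RIFF chunk headers without decoding samples."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")

    fmt = None
    data_size = None
    offset = 12  # first chunk starts right after the 12-byte RIFF header
    while offset + 8 <= len(data):
        # Every chunk starts with a 4-byte ID and a little-endian 32-bit size.
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, offset)
        body = offset + 8
        if chunk_id == b"fmt ":
            if chunk_size < 16 or body + 16 > len(data):
                raise ValueError("truncated fmt chunk")
            # format tag, channels, sample rate, byte rate, block align, bit depth
            fmt = struct.unpack_from("<HHIIHH", data, body)
        elif chunk_id == b"data":
            # Only the declared size is needed; the samples are never read.
            data_size = chunk_size
            break
        # Chunk bodies are padded to an even number of bytes.
        offset = body + chunk_size + (chunk_size & 1)

    if fmt is None or data_size is None:
        raise ValueError("missing fmt or data chunk")

    return WAVHeaderSketch(
        audio_format=fmt[0],
        num_channels=fmt[1],
        sample_rate=fmt[2],
        byte_rate=fmt[3],
        block_align=fmt[4],
        bits_per_sample=fmt[5],
        data_size=data_size,
    )
```

A consistency check in the spirit of the tests described above could write a file with a known configuration (for example with the standard-library `wave` module) and compare the parsed fields against it; the PR itself compares against `load_wav`.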

meta-codesync · 2025-12-11T23:56:13Z

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D89001254. (Because this pull request was imported automatically, there will not be any future comments.)

meta-cla Bot added the CLA Signed label Dec 11, 2025

mthrok force-pushed the parse_wav branch 4 times, most recently from 05b21de to 86ccaf4, December 12, 2025 15:34

mthrok force-pushed the parse_wav branch from 6f15bb7 to 3099e69, December 12, 2025 23:58

mthrok marked this pull request as ready for review December 13, 2025 10:38

meta-codesync Bot merged commit 54c3b3a into main Dec 13, 2025
205 of 207 checks passed

mthrok deleted the parse_wav branch December 13, 2025 11:19
