Table A1. Examples of decoded outputs mapped to voicing categories for selected minimal pairs

Target pair    Mapped as voiceless            Mapped as voiced
pan/ban        pan, pat, pain                 ban, band, better, bender
tan/Dan        tan, ten, 10, tender, time     Dan, down, dandelion
cam/GAM        cam, KM, camp, can             GAM, game, gambler
Note. The table shows representative Whisper outputs that were assigned to the voiceless or voiced category according to the identity of the initial consonant after normalization.
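The mapping rule described in the note can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-pair onset sets, and the one-entry numeral table are all hypothetical, and the sketch assumes that normalization amounts to lowercasing, spelling out numerals, and stripping non-letters. It also uses the first orthographic letter as a stand-in for the initial consonant, which would misclassify words such as "cent" for the cam/GAM pair.

```python
import re

# Hypothetical lookup: initial letters counted as voiceless / voiced for each target pair.
PAIR_ONSETS = {
    "pan/ban": ({"p"}, {"b"}),
    "tan/Dan": ({"t"}, {"d"}),
    "cam/GAM": ({"c", "k"}, {"g"}),
}

# Hypothetical and deliberately tiny numeral table; a real pipeline would need a fuller one.
NUMERALS = {"10": "ten"}


def normalize(output: str) -> str:
    """Lowercase, spell out known numerals, and drop non-letter characters."""
    text = output.strip().lower()
    text = NUMERALS.get(text, text)
    return re.sub(r"[^a-z]", "", text)


def map_voicing(pair: str, output: str) -> str:
    """Return 'voiceless', 'voiced', or 'unmapped' for one decoded output."""
    voiceless, voiced = PAIR_ONSETS[pair]
    word = normalize(output)
    if not word:
        return "unmapped"
    if word[0] in voiceless:
        return "voiceless"
    if word[0] in voiced:
        return "voiced"
    return "unmapped"
```

Under these assumptions, `map_voicing("tan/Dan", "10")` falls in the voiceless category because "10" is first normalized to "ten", and `map_voicing("cam/GAM", "KM")` does so because "km" begins with one of the letters listed for that pair. Outputs whose initial letter matches neither member of the pair are left unmapped.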