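Sometimes we need to convert a raw PCM audio stream into a playable format. For example, in AI voice broadcast scenarios, the speech stream generated and returned by the algorithm has to be converted to a format such as WAV or MP3 before the application can play it.
Let's look at how to do this audio format conversion on the server side.
WAV format
Converting to WAV is relatively simple, because a WAV file is just a 44-byte header followed by the PCM data. Let's go straight to the code.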
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public static byte[] convertPcmToWav(byte[] pcmData, int sampleRate, int channels, int bitDepth) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    // RIFF header
    writeString(baos, "RIFF");
    writeInt(baos, 36 + pcmData.length); // Chunk size (total file size minus 8 bytes)
    writeString(baos, "WAVE");
    // fmt chunk
    writeString(baos, "fmt ");
    writeInt(baos, 16); // Subchunk1Size (16 for PCM)
    writeShort(baos, (short) 1); // Audio Format (1 = PCM)
    writeShort(baos, (short) channels);
    writeInt(baos, sampleRate);
    writeInt(baos, sampleRate * channels * (bitDepth / 8)); // Byte Rate
    writeShort(baos, (short) (channels * (bitDepth / 8))); // Block Align
    writeShort(baos, (short) bitDepth);
    // data chunk
    writeString(baos, "data");
    writeInt(baos, pcmData.length);
    baos.write(pcmData);
    return baos.toByteArray();
}

// Writes a four-character chunk tag; ASCII is set explicitly so the platform default charset cannot alter it
private static void writeString(ByteArrayOutputStream baos, String s) throws IOException {
    for (byte b : s.getBytes(StandardCharsets.US_ASCII)) {
        baos.write(b);
    }
}

// Writes a 32-bit integer in little-endian byte order, as the WAV format requires
private static void writeInt(ByteArrayOutputStream baos, int i) throws IOException {
    baos.write(i & 0xFF);
    baos.write((i >> 8) & 0xFF);
    baos.write((i >> 16) & 0xFF);
    baos.write((i >> 24) & 0xFF);
}

// Writes a 16-bit integer in little-endian byte order
private static void writeShort(ByteArrayOutputStream baos, short s) throws IOException {
    baos.write(s & 0xFF);
    baos.write((s >> 8) & 0xFF);
}
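pcmData is the input byte array. The remaining parameters have to be confirmed with whoever provides the data. For example, with a 24 kHz sample rate, 16-bit samples, and the default single channel (mono), the call looks like this:
byte[] wavData = convertPcmToWav(pcmData, 24000, 1, 16);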
The result can then be pushed to the front end, over WebSocket for example, to play the audio.
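To make the header layout and the parameter arithmetic above easier to check, here is a minimal sketch that builds the same 44-byte header with `ByteBuffer` in little-endian order instead of hand-written shifts. The class name `WavHeaderSketch` and its methods are hypothetical names introduced for illustration; they are not part of the original code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds the 44-byte PCM WAV header with ByteBuffer. */
final class WavHeaderSketch {

    static final int HEADER_SIZE = 44;

    /** Returns only the header; the caller appends the PCM payload. */
    static byte[] build(int dataLength, int sampleRate, int channels, int bitDepth) {
        int blockAlign = channels * (bitDepth / 8);   // bytes per sample frame
        int byteRate = sampleRate * blockAlign;       // bytes per second

        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36 + dataLength);                  // RIFF chunk size = file size - 8
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                               // fmt chunk size for PCM
        buf.putShort((short) 1);                      // audio format 1 = PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitDepth);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(dataLength);                       // data chunk size
        return buf.array();
    }

    /** Playback duration in seconds implied by a raw PCM byte count. */
    static double durationSeconds(int dataLength, int sampleRate, int channels, int bitDepth) {
        return (double) dataLength / (sampleRate * channels * (bitDepth / 8));
    }
}
```

With the 24 kHz, mono, 16-bit parameters used above, the byte rate is 24000 * 1 * 2 = 48000, so 48000 bytes of PCM correspond to one second of audio. If the computed duration is far from what you expect, the parameters probably do not match what the data provider actually sends.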
MP3 format
MP3 is more involved, so we rely on the ffmpeg tool library.
First, add the dependencies:
<!-- FFmpeg dependency -->
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>ffmpeg</artifactId>
    <version>${ffmpeg.version}-${javacv.version}</version>
</dependency>
<!-- FFmpeg dependency for the Linux platform -->
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>ffmpeg</artifactId>
    <version>${ffmpeg.version}-${javacv.version}</version>
    <!-- The classifier selects the platform build of FFmpeg. FFmpeg is written in C, so its compiled binaries differ from platform to platform. -->
    <classifier>linux-x86_64</classifier>
</dependency>
Then let's see how to use it:
import java.nio.file.Files;
import java.nio.file.Path;

public static byte[] convertPcmToMp3(byte[] pcmData, int sampleRate, int channels, int bitsPerSample) throws Exception {
    // Create temporary files for the raw PCM input and the MP3 output
    Path tempPcmPath = Files.createTempFile("temp_pcm", ".raw");
    Path tempMp3Path = Files.createTempFile("temp_mp3", ".mp3");
    try {
        Files.write(tempPcmPath, pcmData);
        // Construct command line arguments (this assumes an ffmpeg executable is on the PATH)
        String[] cmd = {
            "ffmpeg",
            "-y", // Overwrite output files without asking
            "-f", "s" + bitsPerSample + "le", // Raw sample format, signed little-endian (e.g. s16le)
            "-ar", String.valueOf(sampleRate), // Sample rate
            "-ac", String.valueOf(channels), // Number of audio channels
            "-i", tempPcmPath.toString(),
            "-b:a", "128k", // Bitrate
            tempMp3Path.toString()
        };
        // Execute the FFmpeg command. Its console output is discarded (Java 9+),
        // because an unread, full output pipe would block the process.
        ProcessBuilder processBuilder = new ProcessBuilder(cmd);
        processBuilder.redirectErrorStream(true);
        processBuilder.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        Process process = processBuilder.start();
        // Wait for the process to complete
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new RuntimeException("FFmpeg conversion failed with exit code " + exitCode);
        }
        // Read the MP3 data into a byte array
        return Files.readAllBytes(tempMp3Path);
    } finally {
        // Delete temporary files, even when the conversion fails
        Files.deleteIfExists(tempPcmPath);
        Files.deleteIfExists(tempMp3Path);
    }
}
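Usage is similar to the WAV version. This approach relies on temporary files, though, which I am not fond of. There seems to be a way to convert directly without them, but I have not tried it.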
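One possible shape for the direct conversion mentioned above is to stream the data through ffmpeg's standard input and output, using its `pipe:0` and `pipe:1` pseudo-files, so that no temporary files are needed. What follows is only a sketch under assumptions: it expects an `ffmpeg` executable on the PATH and Java 9 or later, and the class name `PcmPipeSketch` and its methods are hypothetical names introduced for illustration. Test it against your own ffmpeg build before relying on it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

/** Hypothetical sketch: PCM to MP3 through ffmpeg's stdin/stdout, without temp files. */
final class PcmPipeSketch {

    /** Builds the ffmpeg arguments; "pipe:0" is stdin and "pipe:1" is stdout. */
    static List<String> buildCommand(int sampleRate, int channels, int bitsPerSample) {
        return List.of(
                "ffmpeg",                          // assumed to be on the PATH
                "-f", "s" + bitsPerSample + "le",  // raw input format, e.g. s16le
                "-ar", String.valueOf(sampleRate),
                "-ac", String.valueOf(channels),
                "-i", "pipe:0",
                "-b:a", "128k",
                "-f", "mp3",                       // no file extension to infer the format from
                "pipe:1");
    }

    static byte[] convert(byte[] pcmData, int sampleRate, int channels, int bitsPerSample)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(sampleRate, channels, bitsPerSample));
        pb.redirectError(ProcessBuilder.Redirect.DISCARD); // drop ffmpeg's log output
        Process process = pb.start();

        // Feed stdin on a separate thread so that reading stdout below cannot deadlock.
        Thread writer = new Thread(() -> {
            try (OutputStream stdin = process.getOutputStream()) {
                stdin.write(pcmData);
            } catch (IOException ignored) {
                // ffmpeg exited early; the exit code check below reports the failure
            }
        });
        writer.start();

        byte[] mp3Data;
        try (InputStream stdout = process.getInputStream()) {
            mp3Data = stdout.readAllBytes();
        }
        writer.join();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("ffmpeg exited with code " + exitCode);
        }
        return mp3Data;
    }
}
```

A call would mirror the file-based version, for instance `PcmPipeSketch.convert(pcmData, 24000, 1, 16)`. The writer thread is the important design choice: writing all the input before reading any output can stall once the operating system's pipe buffers fill up.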
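In general, these two common conversions are enough for most needs. ffmpeg is in fact a powerful tool that can convert video as well as audio. For example, the AI virtual digital humans that are popular right now may need its video features. If you are interested, it is worth learning.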