String to .wav in Node
The title 'String to .wav in Node' is a bit techie. In normal-people language it means 'Synthesize text to speech in Node using Azure'. Hmmm, boring text. Let's get this done.
Prerequisites
- Node installed (nodejs.org)
- Azure subscription (azure.microsoft.com)
Create a resource
To synthesize text into speech, a Cognitive Services Speech Service resource is required. In the following steps we'll create such a resource. I'll use Microsoft's recommended naming and tagging conventions.
Resource group
The first step is to create a new Azure resource group named rg-ttsnode-demo, located in West Europe.
az group create --name rg-ttsnode-demo --location westeurope
Cognitive Services Speech Services
The second step is to create a new Cognitive Services Speech Service resource named cog-ttsnode-demo. This example uses the Free tier. Take note of the location 'westeurope'; we need it in a later step.
az cognitiveservices account create --name cog-ttsnode-demo --resource-group rg-ttsnode-demo --location westeurope --kind SpeechServices --sku F0
Get the resource key
Use the following command to display the subscription keys of our Cognitive Services Speech Service resource.
az cognitiveservices account keys list --name cog-ttsnode-demo --resource-group rg-ttsnode-demo
The output looks something like the example below. Take note of 'key1'; we need it in a later step.
{
  "key1": "f32f95d207514d22933841ee9670444e",
  "key2": "12065ae5b39a4f16b99b83657dffc60e"
}
The keys shown above cannot be used in your application; that resource has already been deleted. 🤡
The Node script
Create a new folder called ttsnode-demo. With a command-line tool, step into that folder. To initialize a new npm package, run the command below to create a package.json file. The file will be created in our ttsnode-demo folder.
npm init
Then install the Speech SDK by calling the install command.
npm install microsoft-cognitiveservices-speech-sdk
Create a file called index.js and open it with your favorite text editor. Add the following line at the top. This makes it possible to access the Speech SDK.
const sdk = require("microsoft-cognitiveservices-speech-sdk");
Next we add two constants: one for our subscription key and one for the region where our resource is located. You probably didn't note either value when I said to, but you can find them in the steps where you created the Cognitive Services Speech Service resource.
const subscriptionKey = "f32f95d207514d22933841ee9670444e";
const serviceRegion = "westeurope";
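Hardcoding the key is fine for a throwaway demo, but you may prefer to keep it out of the file. Below is a minimal sketch, assuming you export both values as environment variables. The names SPEECH_KEY and SPEECH_REGION and the helper readSpeechSettings are my own picks for illustration, not something the Speech SDK looks for.

```javascript
// Hypothetical helper: reads the key and region from environment variables.
// SPEECH_KEY and SPEECH_REGION are assumed names, not an SDK convention.
function readSpeechSettings(env = process.env) {
  const subscriptionKey = env.SPEECH_KEY;
  const serviceRegion = env.SPEECH_REGION;
  if (!subscriptionKey || !serviceRegion) {
    throw new Error("Set SPEECH_KEY and SPEECH_REGION before running the script.");
  }
  return { subscriptionKey, serviceRegion };
}

// Example with an explicit object instead of the real environment.
const settings = readSpeechSettings({
  SPEECH_KEY: "your-key1-value",
  SPEECH_REGION: "westeurope"
});
console.log(settings.serviceRegion);
```

In index.js you could then call readSpeechSettings() without arguments and use the two returned values in place of the hardcoded constants.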
Create another constant that will hold our speech configuration.
const speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, serviceRegion);
The real work is done in the following function. It uses the Speech SDK and our constants to synthesize text into speech. Create a function that takes three parameters. The first one (text) holds the text to synthesize. The second one (filename) contains the filename used for writing our .wav output file. The last one (callback) is our callback function, called when the synthesis is completed.
function toSpeech(text, filename, callback) {}
In this function we set up the output file and the SpeechSynthesizer by adding these two lines to it.
var audioConfig = sdk.AudioConfig.fromAudioFileOutput(filename);
var synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);
Next add the synthesizer.speakTextAsync(...) call below the previous lines. In the next step I'll explain the result => {} and error => {} parameters.
synthesizer.speakTextAsync(text, result => {}, error => {});
The result parameter contains the outcome of the synthesis: the final product or, when there was a problem, some information about that problem. Replace result => {} with the code below. Remember to call synthesizer.close() before using the generated file.
result => {
  if (result.reason !== sdk.ResultReason.SynthesizingAudioCompleted) {
    synthesizer.close();
    console.error(`Failed with ${JSON.stringify(result)}`);
    callback();
    // Stop here, otherwise close() and callback() would run a second time.
    return;
  }
  synthesizer.close();
  callback(filename);
}
As you can see, when calling the callback(...) function, I only supply the filename when the synthesis succeeds. Otherwise it will be undefined.
The error parameter holds the error information when something went wrong. Apply the same technique as before: replace error => {} with the code below.
error => {
  synthesizer.close();
  console.error(`Failed with error ${error}`);
  callback();
}
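Both handlers end by calling callback, so it is worth making sure it can never fire twice for one synthesis. Below is a minimal sketch of such a guard. The helper name once is hypothetical and not part of the Speech SDK, and the recording callback is only a stand-in so the snippet runs on its own.

```javascript
// Hypothetical helper: wraps a callback so only the first call gets through.
function once(callback) {
  let called = false;
  return function (...args) {
    if (called) {
      return;
    }
    called = true;
    callback(...args);
  };
}

// Stand-in callback that records what it receives.
const received = [];
const done = once(filename => received.push(filename));

done("output.wav"); // first call is forwarded
done();             // ignored, the callback already ran
console.log(received.length);
```

Inside toSpeech you could start with callback = once(callback); and leave the two handlers as they are.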
At the end of the file we set up the required variables.
var text = "Hello this is a text.";
var filename = `${__dirname}/output.wav`;
var callback = function (filename) {
  if (filename) {
    console.log(`File ${filename} has been created.`);
  } else {
    console.error('There\'s no output.');
  }
};
After setup we simply call our toSpeech(text, filename, callback) function.
toSpeech(text, filename, callback);
Save the file and run it with the following command.
node index.js
An output.wav file will be created in your script directory. That's all it takes to convert a string to .wav in Node.
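If you prefer promises or async/await over callbacks, the callback style used here wraps easily in a Promise. Below is a minimal sketch under a few assumptions: withPromise, fakeToSpeech and toSpeechAsync are hypothetical names of mine, and fakeToSpeech only mimics the callback contract of toSpeech (filename on success, nothing on failure) so the snippet runs without an Azure resource.

```javascript
// Hypothetical helper: turns a callback-style function with the shape
// fn(text, filename, callback) into one that returns a Promise.
function withPromise(fn) {
  return function (text, filename) {
    return new Promise((resolve, reject) => {
      fn(text, filename, result => {
        if (result) {
          resolve(result);
        } else {
          reject(new Error("Synthesis produced no output."));
        }
      });
    });
  };
}

// Stand-in for toSpeech: "succeeds" for non-empty text, "fails" otherwise.
function fakeToSpeech(text, filename, callback) {
  if (text) {
    callback(filename);
  } else {
    callback();
  }
}

const toSpeechAsync = withPromise(fakeToSpeech);

toSpeechAsync("Hello this is a text.", "output.wav")
  .then(file => console.log(`File ${file} has been created.`))
  .catch(error => console.error(error.message));
```

In index.js you would pass the real toSpeech to withPromise instead of the stand-in.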
Completed example
The completed index.js file should look like this.
const sdk = require("microsoft-cognitiveservices-speech-sdk");
const subscriptionKey = "f32f95d207514d22933841ee9670444e";
const serviceRegion = "westeurope";
const speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, serviceRegion);
function toSpeech(text, filename, callback) {
  var audioConfig = sdk.AudioConfig.fromAudioFileOutput(filename);
  var synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);
  synthesizer.speakTextAsync(text, result => {
    if (result.reason !== sdk.ResultReason.SynthesizingAudioCompleted) {
      synthesizer.close();
      console.error(`Failed with ${JSON.stringify(result)}`);
      callback();
      // Stop here, otherwise close() and callback() would run a second time.
      return;
    }
    synthesizer.close();
    callback(filename);
  }, error => {
    synthesizer.close();
    console.error(`Failed with error ${error}`);
    callback();
  });
}
var text = "Hello this is a text.";
var filename = `${__dirname}/output.wav`;
var callback = function (filename) {
  if (filename) {
    console.log(`File ${filename} has been created.`);
  } else {
    console.error('There\'s no output.');
  }
};
toSpeech(text, filename, callback);