Yandex Speech Kit client




yask

Tools for working with Yandex Speech Kit, the speech synthesis and recognition service (see https://cloud.yandex.ru/docs/speechkit/), from the Go programming language. The package synthesizes speech from text and recognizes text from an audio stream.

Before you start, register at https://cloud.yandex.ru/ to get an API key and a folder identifier (see https://cloud.yandex.ru/docs).

Audio stream formats

Speech synthesis from text

Running the example produces a file in WAV format, ready for playback in any audio player. The default sample rate is 8000 Hz.

package main

import (
	"log"
	"os"

	"github.com/fcg-xvii/go-tools/speech/yask"
)

func main() {
	yaFolderID := "b1g..."                          // Yandex folder ID
	yaAPIKey := "AQVNy..."                          // Yandex API key
	text := "Hi! This is a test of speech synthesis" // text for synthesis

	// init config for synthesis (the lpcm format is set by default)
	config := yask.TTSDefaultConfigText(yaFolderID, yaAPIKey, text)

	// The config language defaults to Russian. For English,
	// set both the language and a matching voice.
	config.Lang = yask.LangEN
	config.Voice = yask.VoiceNick

	// speech synthesis
	r, err := yask.TextToSpeech(config)
	if err != nil {
		log.Println(err)
		return
	}

	// open the file that will receive the result
	f, err := os.OpenFile("tts.wav", os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0644)
	if err != nil {
		log.Println(err)
		return
	}
	defer f.Close()

	// encode the lpcm stream to WAV: config.Rate sample rate, 16 bits per sample, 1 channel
	if err := yask.EncodePCMToWav(r, f, config.Rate, 16, 1); err != nil {
		log.Println(err)
		return
	}
}

Speech recognition to text

Example of recognizing a short audio clip. The example uses a WAV file, which works with the configuration format value lpcm.

package main

import (
	"log"
	"os"

	"github.com/fcg-xvii/go-tools/speech/yask"
)

func main() {
	yaFolderID := "b1g4..."    // Yandex folder ID
	yaAPIKey := "AQVNyr..."    // Yandex API key
	dataFileName := "data.wav" // audio file in WAV format to recognize

	// open the audio file
	f, err := os.Open(dataFileName)
	if err != nil {
		log.Println(err)
		return
	}
	defer f.Close()

	// init config for recognition
	config := yask.STTConfigDefault(yaFolderID, yaAPIKey, f)

	// set the English language
	config.Lang = yask.LangEN

	// recognize speech to text
	text, err := yask.SpeechToTextShort(config)
	if err != nil {
		log.Println(err)
		return
	}

	log.Println(text)
}

License

The MIT License (MIT), see LICENSE.