STT + Cloud Function

This implementation is not suitable for long files, because a Google Cloud Function times out after 1 to 9 minutes, depending on its configured timeout.

For shorter files, you can upload your media to Google Cloud Storage, run Google Cloud STT inside a Cloud Function, and save the result to Firestore.
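As a rough sketch of the middle step, the request passed to STT can be assembled from the bucket name and the file's path in storage. The helper name `buildRecognizeRequest`, its defaults, and the example bucket and path are hypothetical, chosen only for illustration. The `config` and `audio.uri` fields follow the request shape used further down.

```javascript
// Hypothetical helper: builds an STT request for a file stored in Cloud Storage.
function buildRecognizeRequest(bucket, storagePath, options = {}) {
  if (!bucket || !storagePath) {
    throw new Error("bucket and storagePath are required");
  }
  // Strip leading slashes so the URI has exactly one "/" after the bucket name.
  const objectPath = String(storagePath).replace(/^\/+/, "");
  return {
    config: {
      // Defaults are assumptions; pass the real values for your audio.
      encoding: options.encoding || "LINEAR16",
      sampleRateHertz: options.sampleRateHertz || 16000,
      languageCode: options.languageCode || "en-US",
    },
    audio: { uri: `gs://${bucket}/${objectPath}` },
  };
}

const request = buildRecognizeRequest("my-bucket", "uploads/audio.raw");
console.log(request.audio.uri); // prints "gs://my-bucket/uploads/audio.raw"
```

Keeping this step in a pure function makes it possible to check the URI and config without calling the STT service.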

Firestore events

// Firestore-triggered Cloud Function that calls the Google Cloud STT SDK.
// Returns null when the new document is not an upload.
const functions = require("firebase-functions");
const admin = require("firebase-admin");
const speech = require("@google-cloud/speech");

admin.initializeApp();

exports.createTranscript = functions.firestore
  // Placeholder path: replace with the collection this function should watch
  .document("<collection>/{docId}")
  .onCreate(async (change, context) => {
    // Get an object representing the document
    const newValue = change.data();
    // For now only run STT for uploaded files,
    // e.g. if running from STT then don't run this
    if (newValue.transcriptionType === "upload") {
      // Access a particular field as you would any JS property
      const storageRef = newValue.storageRef;
      console.log("storageRef", storageRef);
      const storage = admin.storage();
      const bucket = storage.appInternal.options.storageBucket;
      // e.g. const gcsUri = 'gs://my-bucket/audio.raw';
      const gcsUri = `gs://${bucket}/${storageRef}`;
      console.log("gcsUri", gcsUri);
      //////// STT ////////
      // Creates a client for STT
      const client = new speech.SpeechClient();
      const encoding = "MP3"; // Encoding of the audio file, e.g. LINEAR16
      const sampleRateHertz = 48000; // e.g. 16000
      // const languageCode = newValue.language.value;
      const languageCode = "en-US"; // BCP-47 language code, e.g. en-US
      const config = {
        encoding: encoding,
        sampleRateHertz: sampleRateHertz,
        languageCode: languageCode
      };
      const audio = {
        uri: gcsUri
      };
      const request = {
        config: config,
        audio: audio
      };
      // Detects speech in the audio file. This creates a recognition job that you
      // can wait for now, or get its result later.
      const [operation] = await client.longRunningRecognize(request);
      // Get a Promise representation of the final result of the job
      const [response] = await operation.promise();
      // TODO: Convert to DPE
      // TODO: Save DPE json format to firebase as collection
      // get text - tmp
      const transcription = response.results
        .map(result => result.alternatives[0].transcript)
        .join("\n");
      console.log(`Transcription: ${transcription}`);
      // Then return a promise of a set operation to update the document
      return change.ref.set(
        {
          // TODO: change transcription to transcript
          transcription: transcription
        },
        { merge: true }
      );
    } else {
      return null;
    }
  });
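
The temporary text extraction in the function above assumes every result carries at least one alternative. A slightly more defensive sketch follows, using a hypothetical helper `extractTranscript` and a made-up sample response. It assumes the `results[].alternatives[].transcript` shape that the function reads.

```javascript
// Hypothetical helper: joins the top alternative of each result into one string.
function extractTranscript(response) {
  const results = (response && response.results) || [];
  return results
    .filter(result => {
      // Keep only results whose first alternative has a transcript string.
      const top = result.alternatives && result.alternatives[0];
      return Boolean(top) && typeof top.transcript === "string";
    })
    .map(result => result.alternatives[0].transcript.trim())
    .join("\n");
}

// Sample response, made up for illustration only.
const sampleResponse = {
  results: [
    { alternatives: [{ transcript: "hello world", confidence: 0.9 }] },
    { alternatives: [] },
    { alternatives: [{ transcript: " second segment" }] },
  ],
};
console.log(extractTranscript(sampleResponse));
```

Results without alternatives are skipped rather than throwing, so one empty segment does not fail the whole function run.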


Supercharging Firebase with Google Cloud Platform - Google I/O 2016

A Serverless Audio Transcription Pipeline - audio-2-text

Transcribing videos

SpeechClient - SDK

REST Resource: operations



Method: speech.recognize