Firebase+React_Notes

STT + Cloud Function


Last updated 5 years ago


This implementation is not suitable for long files, as a Google Cloud Function times out after 1 to 9 minutes, depending on its configuration.

For shorter files, however, you can upload your media to Google Cloud Storage, run Google Cloud STT inside a Cloud Function, and save the result to Firestore.
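
As a rough sketch of that flow, the snippet below shows the shape of the Firestore document that would trigger transcription, and how its storage path maps to the `gs://` URI that STT reads. The field names `transcriptionType` and `storageRef` and the `transcripts` collection come from the function on this page; the helper names `buildTranscriptDoc` and `toGcsUri`, the file path, and the bucket name are hypothetical.

```javascript
// Hypothetical helper: the document a client would add to the
// "transcripts" collection after uploading a file to Cloud Storage.
function buildTranscriptDoc(storagePath) {
  return {
    transcriptionType: "upload", // the function only runs STT for uploads
    storageRef: storagePath // path of the media file inside the bucket
  };
}

// Hypothetical helper: STT reads the file via a gs:// URI, not a download URL.
function toGcsUri(bucketName, storagePath) {
  // Drop any leading slashes so the URI has a single separator
  const cleanPath = storagePath.replace(/^\/+/, "");
  return `gs://${bucketName}/${cleanPath}`;
}

// Illustrative values only
const doc = buildTranscriptDoc("uploads/interview.mp3");
console.log(toGcsUri("my-project.appspot.com", doc.storageRef));
```

Keeping the storage path (rather than a download URL) in the document means the function can rebuild the `gs://` URI from the default bucket name on its own.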

const functions = require("firebase-functions");
const admin = require("firebase-admin");
const speech = require("@google-cloud/speech");

// Initialise the Admin SDK once (skipped if already done elsewhere in index.js)
if (!admin.apps.length) {
  admin.initializeApp();
}

// Cloud Function that calls the STT SDK when a transcript document is created.
// Returns a promise that writes the transcription, or null if STT is skipped.
exports.createTranscript = functions.firestore
  .document("transcripts/{transcriptId}")
  .onCreate(async (snap, context) => {
    // onCreate receives a DocumentSnapshot of the new document
    const newValue = snap.data();
    // for now only run STT for uploaded files
    // e.g. if running from STT then don't run this
    if (newValue.transcriptionType === "upload") {
      // access a particular field as you would any JS property
      const storageRef = newValue.storageRef;
      console.log("storageRef", storageRef);
      // https://firebase.google.com/docs/storage/admin/start
      const storage = admin.storage();
      // https://github.com/firebase/firebase-tools/issues/1573#issuecomment-517000981
      // Name of the project's default bucket (avoids the private `appInternal` property)
      const bucket = storage.bucket().name;
      // e.g. const gcsUri = 'gs://my-bucket/audio.raw';
      const gcsUri = `gs://${bucket}/${storageRef}`;
      console.log("gcsUri", gcsUri);
      //////// STT ////////
      // Creates a client for STT
      const client = new speech.SpeechClient();
      // Encoding of the audio file, e.g. LINEAR16. Enum names are upper case;
      // MP3 support depends on the STT API version in use.
      const encoding = "MP3";
      // Sample rate must be a number, e.g. 16000
      const sampleRateHertz = 48000;
      // const languageCode = newValue.language.value;
      const languageCode = "en-US"; // BCP-47 language code, e.g. en-US

      const config = {
        encoding: encoding,
        sampleRateHertz: sampleRateHertz,
        languageCode: languageCode
      };

      const audio = {
        uri: gcsUri
      };

      const request = {
        config: config,
        audio: audio
      };

      // Detects speech in the audio file. This creates a recognition job that you
      // can wait for now, or get its result later.
      const [operation] = await client.longRunningRecognize(request);
      // Get a Promise representation of the final result of the job
      const [response] = await operation.promise();
      // TODO: Convert to DPE
      // TODO: Save DPE json format to firebase as collection

      // get text - tmp
      const transcription = response.results
        .map(result => result.alternatives[0].transcript)
        .join("\n");
      console.log(`Transcription: ${transcription}`);

      // Then return a promise of a set operation to update the document
      return snap.ref.set(
        {
          // TODO: change transcription to transcript
          transcription: transcription
        },
        { merge: true }
      );
    } else {
      return null;
    }
  });
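
The text extraction step in the function above assumes every result carries at least one alternative. A slightly more defensive version is sketched below; `extractTranscript` is a hypothetical helper name, the response shape (`results[].alternatives[].transcript`) is the one the function above already reads, and the mock response is made up for illustration.

```javascript
// Hypothetical helper: join the top alternative of each result,
// skipping results that have no alternatives.
function extractTranscript(response) {
  const results = (response && response.results) || [];
  return results
    .filter(result => result.alternatives && result.alternatives.length > 0)
    .map(result => result.alternatives[0].transcript)
    .join("\n");
}

// Mock response for illustration only, not real STT output
const mockResponse = {
  results: [
    { alternatives: [{ transcript: "hello world" }] },
    { alternatives: [] },
    { alternatives: [{ transcript: "second segment" }] }
  ]
};
console.log(extractTranscript(mockResponse));
```

Pulling this step out into a plain function also makes it possible to unit test it without calling STT.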

links


Google Cloud Functions time out after 1 to 9 minutes
Firestore events
https://github.com/GoogleCloudPlatform/nodejs-docs-samples/tree/master/functions/speech-to-speech
Supercharging Firebase with Google Cloud Platform - Google I/O 2016
A Serverless Audio Transcription Pipeline
audio-2-text
Transcribing videos
SpeechClient - SDK
REST Resource: operations
LongRunningRecognizeMetadata
LongRunningRecognizeResponse
Method: speech.recognize
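
On the timeout mentioned in the first link: 1st gen Cloud Functions default to 60 seconds and can be raised to at most 540 seconds (9 minutes) with `runWith`. A minimal sketch, assuming the `firebase-functions` v1 API; `withLongTimeout` is a hypothetical wrapper that takes the `functions` module as an argument so the snippet runs without the SDK installed, and the memory value is only an example.

```javascript
// Runtime options for a 1st gen function: 540 s is the maximum timeout.
const runtimeOpts = {
  timeoutSeconds: 540,
  memory: "1GB" // example value, adjust to the workload
};

// Hypothetical wrapper: `functions` is the required "firebase-functions" module.
function withLongTimeout(functions) {
  return functions.runWith(runtimeOpts);
}

// In index.js this would be used roughly as:
//   const functions = require("firebase-functions");
//   exports.createTranscript = withLongTimeout(functions).firestore
//     .document("transcripts/{transcriptId}")
//     .onCreate(async (snap, context) => { /* ... */ });
```

Even at the maximum, the function has to finish waiting for the recognition job within 9 minutes, so this only extends the range of "shorter files" rather than removing the limit.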