textav-event-2017

PopUp Archive & Audiosear.ch


Last updated 6 years ago

Notes

  • Make audio searchable!

  • PopUp Archive came first, and now Audiosear.ch

    People come to them with their own content, and that is what PopUp Archive is used for. You can download files, get SRT (if WebVTT), and, if the content is public, it is backed up at the Internet Archive.

    Audiosear.ch is a full-text podcast search API. It continuously imports and transcribes large numbers of podcasts, makes the results available via the API, and extracts entities, keywords, and topic clusters.

  • Topic clustering via machine learning -- podcasts are put into "unsupervised training buckets"

    Cultural analysis -- what do people on the internet think about this?

  • Thinking about features as "Will this bring more people to the podcast table?"

  • Searching within the text to find exactly where a keyword is mentioned

  • Experimental Twitter videos – clips from podcasts with transcripts (because Twitter mutes video by default)

  • Categorize podcasts by language type / topics (e.g. colloquial, politics, NSFW, relationships) and plot on scatter charts

  • API:

    https://twitter.com/audiosearchfm/status/856615409922957312
    https://www.audiosear.ch/docs
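
The note above about searching within the text to find exactly where a keyword is mentioned can be illustrated with a small sketch over an SRT transcript, the format mentioned for PopUp Archive downloads. This does not use the Audiosear.ch API and is not how the service is implemented. The names `Cue`, `parse_srt`, and `find_keyword` are hypothetical, and the sample transcript is made up for illustration.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: float  # seconds from the start of the audio
    end: float
    text: str


# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,500".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt: str) -> List[Cue]:
    """Parse SRT text into cues; blocks without a timing line are skipped."""
    cues = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the spoken text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues


def find_keyword(cues: List[Cue], keyword: str) -> List[Cue]:
    """Return cues containing the keyword as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [cue for cue in cues if pattern.search(cue.text)]


# Made-up sample transcript (an assumption, not real podcast data).
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:04,500
Welcome to the show about audio archives.

2
00:00:04,500 --> 00:00:09,250
Today we talk about making podcasts searchable.

3
00:01:02,000 --> 00:01:05,000
Podcasts are hard to search without transcripts.
"""

hits = find_keyword(parse_srt(SAMPLE_SRT), "podcasts")
for cue in hits:
    print(f"{cue.start:.2f}s  {cue.text}")
```

Each hit carries the start time of its cue, so a player could seek straight to the point where the keyword is spoken. A production search service would more likely index word-level timings rather than scan cues linearly.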