textav-event-2017

Palestinian Remix



Notes:

  • Project: 20 documentaries in 4 languages (English, Arabic, Hebrew, Bosnian), built on Hyperaud.io technology.

  • When people say that they have transcripts they may not have them in the format that you need.

  • Transcripts arrived as Word documents, laid out in tables. Built a tool with a vertical timeline, where each paragraph has a tab for each language.

  • Helped by QCRI, who use Kaldi and have a very good Arabic language model.

  • To deal with hardcoded subtitles: duplicate the video and draw it to a JavaScript canvas, compare it frame by frame to detect changes in the subtitles, then send the changed frames to Tesseract server-side to get the English text back (with bad spelling, because this is hard to do).

  • “One night I managed to write something terrifying like this …”
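
The frame-by-frame matching for hardcoded subtitles can be sketched roughly as follows. This is not the code shown in the talk; it is a minimal illustration that assumes subtitles sit in a band at the bottom of the frame and that frames are available as RGBA byte arrays (the shape of `ImageData.data` returned by a canvas `getImageData()` call). The function names, `bandTop`, and the thresholds are all hypothetical.

```javascript
// Hypothetical helper: fraction of pixels in the subtitle band that differ
// between two frames. `prev` and `curr` are RGBA byte arrays laid out like
// ImageData.data (4 bytes per pixel, row by row).
function subtitleChangeRatio(prev, curr, width, height, bandTop, threshold = 40) {
  let changed = 0;
  let total = 0;
  // Only scan rows from bandTop down to the bottom of the frame.
  for (let y = bandTop; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      // Sum of absolute per-channel differences; alpha is ignored.
      const diff =
        Math.abs(prev[i] - curr[i]) +
        Math.abs(prev[i + 1] - curr[i + 1]) +
        Math.abs(prev[i + 2] - curr[i + 2]);
      if (diff > threshold) changed++;
      total++;
    }
  }
  return total === 0 ? 0 : changed / total;
}

// Hypothetical decision rule: treat the subtitle as changed when more than
// `minRatio` of the band's pixels differ (assumed value, needs tuning).
function subtitleChanged(prev, curr, width, height, bandTop, minRatio = 0.02) {
  return subtitleChangeRatio(prev, curr, width, height, bandTop) > minRatio;
}
```

In a browser, the two arrays would typically come from drawing the duplicated video onto a canvas with `drawImage()` and reading the pixels back with `getImageData()`; only frames where `subtitleChanged` returns true would then be sent on for OCR.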

http://PalestineRemix.com