
Module: captions composer (with text pre-segmentation) [Open Source]


Last updated 6 years ago

The idea: given a transcription JSON (for simplicity's sake, assume its simplest form, an array of word objects with start and end timecode attributes), how do you automatically create an srt file whose lines are well segmented and appropriately timed?
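As a rough illustration of the idea, here is a minimal sketch in Python, not the module itself. It assumes each word object has `text`, `start` and `end` keys with times in seconds; those key names, the function names, and the limits `MAX_CHARS_PER_LINE` and `MAX_LINES_PER_CUE` are assumptions chosen for the example and are not taken from any of the broadcaster specifications. It fills lines greedily and does no linguistic pre-segmentation.

```python
from typing import Dict, List

# Assumed limits for illustration only; real values should come from the
# relevant broadcaster specification.
MAX_CHARS_PER_LINE = 37
MAX_LINES_PER_CUE = 2

Word = Dict[str, object]  # hypothetical shape: {"text": str, "start": float, "end": float}


def to_srt_timecode(seconds: float) -> str:
    """Format seconds as an SRT timecode, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segment_words(words: List[Word]) -> List[List[List[Word]]]:
    """Greedily group words into cues; each cue is a list of lines of words."""
    cues: List[List[List[Word]]] = []
    lines: List[List[Word]] = []
    for word in words:
        candidate = " ".join(str(w["text"]) for w in (lines[-1] + [word])) if lines else ""
        if lines and len(candidate) <= MAX_CHARS_PER_LINE:
            lines[-1].append(word)      # word fits on the current line
        elif len(lines) < MAX_LINES_PER_CUE:
            lines.append([word])        # start a new line in the same cue
        else:
            cues.append(lines)          # cue is full, start a new one
            lines = [[word]]
    if lines:
        cues.append(lines)
    return cues


def compose_srt(words: List[Word]) -> str:
    """Build SRT text: a cue starts at its first word and ends at its last."""
    blocks = []
    for index, cue in enumerate(segment_words(words), start=1):
        start = to_srt_timecode(float(cue[0][0]["start"]))
        end = to_srt_timecode(float(cue[-1][-1]["end"]))
        text = "\n".join(" ".join(str(w["text"]) for w in line) for line in cue)
        blocks.append(f"{index}\n{start} --> {end}\n{text}")
    return "\n\n".join(blocks) + "\n" if blocks else ""


if __name__ == "__main__":
    sample = [
        {"text": "Hello", "start": 0.0, "end": 0.5},
        {"text": "world", "start": 0.6, "end": 1.25},
    ]
    print(compose_srt(sample))
```

A fuller version would also break lines at punctuation and clause boundaries and enforce minimum and maximum cue durations, which is where the text pre-segmentation and the specifications listed below would come in.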

Could refactor this: on npm

Example

or might be that using might be enough?

BBC Specifications

C4 Specifications

Aljazeera Specifications

Labels

Node module, Extract from BBC Subtlelizer,

https://github.com/pietrop/srtParserComposer
concept map of module
Aeneas as described in this card
https://bbc.github.io/subtitle-guidelines/
BBC guidelines online_sub_editorial_guidelines_vs1_1.pdf
http://www.bbc.co.uk/guidelines/futuremedia/accessibility/subtitling.shtml
C4 specification pdf
Link to trello card: Module: captions composer (with text pre-segmentation) [Open Source]