Start:
I think what we would do is download the audio as an mp3 file, translate it into German, match the scrubbing time up with the original video, and then edit the transcript alongside the video.
This would be the ideal way of doing it, eventually.
But for NOW: download the mp3, put it into German, and then edit the words directly in a txt file, or whatever file type is used for captions.
I wonder if there are different German dialects it can pick up on. Would Swiss German be closer, or Austrian German, etc.?
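A rough sketch of what the caption file could look like, assuming we go with SubRip (.srt), which is one common plain-text caption format. The notes above don't settle on a format, so that choice is a guess, and the names `format_timestamp` and `build_srt` are made up for illustration. Each segment is a (start seconds, end seconds, text) tuple, and the result can be saved and edited by hand in any text editor.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(segments):
    """Build SRT text from (start, end, text) tuples.

    Assumption: segments are already in playback order and the times
    are in seconds relative to the start of the video.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical segments, just to show the shape of the output.
    example = [(0.0, 2.5, "Hallo zusammen"), (2.5, 5.0, "Willkommen zum Video")]
    with open("captions.srt", "w", encoding="utf-8") as handle:
        handle.write(build_srt(example))
```

Keeping the times in the file means the hand-edited words stay tied to the scrubbing position, which should help later when lining things up with the video.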
Or I can download the mp4, upload it myself and then translate from there.
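If we go the mp4 route, pulling the audio out could look something like the sketch below. It assumes ffmpeg is installed and on the PATH, and the helper names `build_extract_command` and `extract_audio` are made up here. The sample rate and mono options are there because, as far as I know, DeepSpeech models usually want 16 kHz mono WAV input, but that should be checked against the repo before relying on it.

```python
import subprocess


def build_extract_command(video_path, audio_path, sample_rate=None, mono=False):
    """Build an ffmpeg command that copies only the audio track to audio_path.

    The output format is picked by ffmpeg from the file extension
    (for example .mp3 or .wav).
    """
    cmd = ["ffmpeg", "-y", "-i", video_path, "-vn"]  # -vn drops the video stream
    if sample_rate is not None:
        cmd += ["-ar", str(sample_rate)]  # resample the audio, e.g. 16000
    if mono:
        cmd += ["-ac", "1"]  # downmix to a single channel
    cmd.append(audio_path)
    return cmd


def extract_audio(video_path, audio_path, sample_rate=None, mono=False):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(
        build_extract_command(video_path, audio_path, sample_rate, mono),
        check=True,
    )


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    extract_audio("video.mp4", "audio.wav", sample_rate=16000, mono=True)
```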
I think we are going to use this: https://github.com/AASHISHAG/deepspeech-german