@ChristianAlexander
Created April 12, 2024 23:03
Podcast Transcription LiveBook

Podcast Transcription

Mix.install([
  {:req, "~> 0.4.14"},
  {:fast_rss, "~> 0.5.0"},
  {:bumblebee, "~> 0.5.3"},
  {:exla, "~> 0.7.1"},
  {:kino, "~> 0.12.3"}
])

Obtain Episodes

rss_feed_url = "https://feeds.fireside.fm/elixiroutlaws/rss"

%{body: rss_body} = Req.get!(rss_feed_url)

{:ok, rss_feed} = FastRSS.parse_rss(rss_body)

# Grab the fields we care about
episodes =
  Enum.map(rss_feed["items"], fn item ->
    %{
      title: item["title"],
      url: item["enclosure"]["url"]
    }
  end)

# For demonstration, limit the number of episodes to download and process
episode_limit = 2

# Establish a temporary directory to store downloaded podcast episodes
download_directory = Path.join(System.tmp_dir!(), "podcast-downloads")
File.mkdir_p!(download_directory)

episodes =
  episodes
  |> Enum.take(episode_limit)
  |> Enum.map(fn episode ->
    filename = URI.parse(episode.url) |> Map.fetch!(:path) |> Path.basename()
    out_path = Path.join(download_directory, filename)

    Req.get!(url: episode.url, into: File.stream!(out_path))

    Map.put(episode, :local_path, out_path)
  end)
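
Re-evaluating the download cell fetches every episode again. A minimal sketch of one way to skip files that are already on disk follows; the `DownloadCache` module and `fetch_once/2` are hypothetical names that are not part of the original notebook, and the check only looks at whether the file exists and is non-empty, not whether it is complete.

```elixir
defmodule DownloadCache do
  # Hypothetical helper: call `fetch` only when `path` is missing or empty.
  # `fetch` receives the path and is expected to write the file there.
  def fetch_once(path, fetch) when is_function(fetch, 1) do
    case File.stat(path) do
      {:ok, %File.Stat{size: size}} when size > 0 ->
        # A non-empty file is already present, so skip the download
        {:cached, path}

      _ ->
        fetch.(path)
        {:downloaded, path}
    end
  end
end
```

In the download cell, the `Req.get!` call could then be wrapped as `DownloadCache.fetch_once(out_path, fn path -> Req.get!(url: episode.url, into: File.stream!(path)) end)`.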

Transcribe

# Download and initialize Whisper model
# Note that other models may have higher accuracy at a cost of slower runtime
{:ok, whisper} = Bumblebee.load_model({:hf, "openai/whisper-tiny"})
{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
{:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})

serving =
  Bumblebee.Audio.speech_to_text_whisper(whisper, featurizer, tokenizer, generation_config,
    defn_options: [compiler: EXLA],
    chunk_num_seconds: 30,
    timestamps: :segments
  )

# Add Homebrew path, necessary for Mac ffmpeg
os_path = System.get_env("PATH")
homebrew_bin_path = "/opt/homebrew/bin"

if :os.type() == {:unix, :darwin} and not String.contains?(os_path, homebrew_bin_path) do
  System.put_env("PATH", os_path <> ":" <> homebrew_bin_path)
end

episodes =
  Enum.map(episodes, fn episode ->
    start_time = DateTime.utc_now()
    transcription_output = Nx.Serving.run(serving, {:file, episode.local_path})
    end_time = DateTime.utc_now()

    Map.merge(episode, %{
      transcription: transcription_output.chunks,
      transcription_processing_seconds: DateTime.diff(end_time, start_time)
    })
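  end)

# Ratio of audio duration to wall-clock transcription time (higher is faster)
calculate_transcription_speed_ratio = fn episode ->
  audio_length =
    episode.transcription
    |> Enum.map(fn chunk -> chunk.end_timestamp_seconds end)
    |> Enum.max()

  # DateTime.diff/2 returns whole seconds; clamp to 1 to avoid dividing by zero
  audio_length / max(episode.transcription_processing_seconds, 1)
end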

chunk_to_markdown = fn chunk ->
  "- #{chunk.start_timestamp_seconds}: #{chunk.text}"
end

episode_to_markdown = fn episode ->
  speed_ratio = Float.round(calculate_transcription_speed_ratio.(episode), 2)

  """
  # #{episode.title}

  Transcribed by Whisper at #{speed_ratio}x speed.

  ## Transcript

  #{Enum.map_join(episode.transcription, "\n", chunk_to_markdown)}
  """
end

Kino.Markdown.new(episode_to_markdown.(Enum.at(episodes, 0)))
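
The transcript lines print the raw `start_timestamp_seconds` value, which is hard to scan for long episodes. As a sketch, assuming the timestamps are non-negative numbers of seconds, a small formatter could render them as HH:MM:SS. `TimestampFormat` and `hms/1` are hypothetical names introduced here for illustration.

```elixir
defmodule TimestampFormat do
  # Hypothetical helper: turn a non-negative number of seconds into "HH:MM:SS".
  # Fractional seconds are truncated, not rounded.
  def hms(seconds) when is_number(seconds) and seconds >= 0 do
    total = trunc(seconds)
    hours = div(total, 3600)
    minutes = div(rem(total, 3600), 60)
    secs = rem(total, 60)

    # Zero-pad each part to two digits and join with colons
    Enum.map_join([hours, minutes, secs], ":", fn part ->
      part |> Integer.to_string() |> String.pad_leading(2, "0")
    end)
  end
end
```

With this in place, `chunk_to_markdown` could interpolate `TimestampFormat.hms(chunk.start_timestamp_seconds)` instead of the raw value.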
@chanphiromsok

ffmpeg not found in PATH error:

episodes =
  Enum.map(episodes, fn episode ->
    start_time = DateTime.utc_now()
    transcription_output = Nx.Serving.run(serving, {:file, episode.local_path})
    end_time = DateTime.utc_now()

    Map.merge(episode, %{
      transcription: transcription_output.chunks,
      transcription_processing_seconds: DateTime.diff(end_time, start_time)
    })
  end)
** (ArgumentError) invalid input, ffmpeg not found in PATH
(bumblebee 0.5.3) lib/bumblebee/shared.ex:177: Bumblebee.Shared.validate_serving_input!/2
(bumblebee 0.5.3) lib/bumblebee/audio/speech_to_text_whisper.ex:83: anonymous fn/7 in Bumblebee.Audio.SpeechToTextWhisper.speech_to_text_whisper/5
(nx 0.7.2) lib/nx/serving.ex:1748: anonymous fn/3 in Nx.Serving.handle_preprocessing/2
(telemetry 1.2.1) /Users/rom/Library/Caches/mix/installs/elixir-1.17.1-erts-15.0/02cf9f08996480d4c771b84b14076a25/deps/telemetry/src/telemetry.erl:321: :telemetry.span/3
(nx 0.7.2) lib/nx/serving.ex:683: Nx.Serving.run/2
#cell:7rgsw657jlp2qapa:4: (file)
#cell:7rgsw657jlp2qapa:2: (file)

@ChristianAlexander
Author

@chanphiromsok, make sure to install ffmpeg and have its path in your PATH environment variable.

If on macOS, run brew install ffmpeg. On other platforms, follow the instructions here: https://ffmpeg.org/download.html
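
To surface this problem before the transcription cell runs, the notebook could check for the executable up front. The following is a minimal sketch using `System.find_executable/1` from the Elixir standard library; the `Preflight` module and `require_executable!/1` are hypothetical names that do not appear in the notebook.

```elixir
defmodule Preflight do
  # Hypothetical helper: return the absolute path of `name`,
  # or raise with the PATH that was searched.
  def require_executable!(name) do
    case System.find_executable(name) do
      nil ->
        searched = System.get_env("PATH", "")
        raise "#{name} not found in PATH (searched: #{searched})"

      path ->
        path
    end
  end
end

# In the notebook, before calling Nx.Serving.run/2:
# Preflight.require_executable!("ffmpeg")
```

Placing the call after the Homebrew PATH adjustment means the check sees the same PATH that the transcription step will use.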

@chanphiromsok

Thank you

@eyadhif

eyadhif commented Jul 30, 2024

└─ lib/axon/defn.ex:1: Axon.Defn (module)

Generated axon app
==> nimble_pool
Compiling 2 files (.ex)
Generated nimble_pool app
==> elixir_make
Compiling 8 files (.ex)
Generated elixir_make app
==> xla
Compiling 2 files (.ex)
Generated xla app
==> exla
could not compile dependency :exla, "mix compile" failed. Errors may have been logged above. You can recompile this dependency with "mix deps.compile exla --force", update it with "mix deps.update exla" or clean it with "mix deps.clean exla"
** (RuntimeError) none of the precompiled archives matches your target
Expected:
* xla_extension-x86_64-windows-cpu.tar.gz
Found:
* xla_extension-aarch64-darwin-cpu.tar.gz
* xla_extension-aarch64-linux-gnu-cpu.tar.gz
* xla_extension-aarch64-linux-gnu-cuda118.tar.gz
* xla_extension-aarch64-linux-gnu-cuda120.tar.gz
* xla_extension-x86_64-darwin-cpu.tar.gz
* xla_extension-x86_64-linux-gnu-cpu.tar.gz
* xla_extension-x86_64-linux-gnu-cuda118.tar.gz
* xla_extension-x86_64-linux-gnu-cuda120.tar.gz
* xla_extension-x86_64-linux-gnu-tpu.tar.gz

You can compile XLA locally by setting an environment variable: XLA_BUILD=true
(xla 0.6.0) lib/xla.ex:201: XLA.download_matching!/1
(xla 0.6.0) lib/xla.ex:33: XLA.archive_path!/0
c:/Users/ahmed/AppData/Local/mix/Cache/installs/elixir-1.17.2-erts-15.0.1/02cf9f08996480d4c771b84b14076a25/deps/exla/mix.exs:113: EXLA.MixProject.extract_xla/1
(mix 1.17.2) lib/mix/task.ex:574: Mix.Task.run_alias/6
(mix 1.17.2) lib/mix/tasks/compile.all.ex:108: Mix.Tasks.Compile.All.run_compiler/2
(mix 1.17.2) lib/mix/tasks/compile.all.ex:88: Mix.Tasks.Compile.All.compile/4
(mix 1.17.2) lib/mix/tasks/compile.all.ex:62: Mix.Tasks.Compile.All.run/1
Hey, nice work :D Can you help me with this, please?

@ChristianAlexander
Author

@eyadhif, it looks like the Elixir XLA package doesn't release pre-built bundles for Windows anymore. If you run LiveBook with the XLA_BUILD environment variable set to true, you might be able to build and install it locally. Otherwise, you might be able to install XLA version 0.1.0 (the last version they shipped with Windows support).
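
One way to pass that variable from inside the notebook, instead of setting it before launching LiveBook, is the `:system_env` option of `Mix.install/2`. This is only a sketch: it assumes the machine has the toolchain needed to compile XLA from source (see the xla package README), and it has not been verified on Windows.

```elixir
# Same dependencies as the notebook's setup cell; XLA_BUILD=true asks the
# xla package to compile locally instead of downloading a precompiled archive.
Mix.install(
  [
    {:req, "~> 0.4.14"},
    {:fast_rss, "~> 0.5.0"},
    {:bumblebee, "~> 0.5.3"},
    {:exla, "~> 0.7.1"},
    {:kino, "~> 0.12.3"}
  ],
  system_env: [{"XLA_BUILD", "true"}]
)
```

The system environment is part of the `Mix.install/2` cache key, so changing it triggers a fresh install and compile, which can take a long time when XLA is built from source.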

@eyadhif

eyadhif commented Aug 13, 2024

Thank you very much @ChristianAlexander

@eyadhif

eyadhif commented Aug 13, 2024

It actually worked!!

@eyadhif

eyadhif commented Aug 13, 2024

Do you think using XLA requires more precompilation of the model than using TensorFlow? Is that why it takes more time and space?
