@eliduke
Last active December 23, 2015 10:23
Problems with custom LocalResource class

Lawson,

re: http://viget.com/extend/make-remote-files-local-with-ruby-tempfile

I have created my own This American Life archive (http://thisamericanlife.co), and I am using your AWESOME LocalResource class to import the remote jpg and mp3 associated with each episode into an S3 bucket. It worked well for months, and then it just stopped working the other day. Part of the problem may have been that they changed the download link, but I got that sorted and kept getting errors anyway. After going down a big rabbit hole, here I am creating this gist in the hope that some of this makes sense and you can help steer me in the right direction. So, here's my breakdown...

To start off, I am requiring both open-uri and httparty in my application.rb. Also, I pulled LocalResource into its own class (lib/local_resource.rb):

class LocalResource
  
  attr_reader :uri

  def initialize(uri)
    @uri = uri
  end

  def file
    @file ||= Tempfile.new(tmp_filename, tmp_folder, encoding: encoding).tap do |f|
      io.rewind
      f.write(io.read)
      f.close
    end
  end

  def io
    @io ||= uri.open
  end

  def encoding
    io.rewind
    io.read.encoding
  end

  def tmp_filename
    [
      Pathname.new(uri.path).basename,
      Pathname.new(uri.path).extname
    ]
  end

  def tmp_folder
    Rails.root.join('tmp')
  end
end

I am still using your "local resources from url" method:

def local_resource_from_url(url)
  LocalResource.new(URI.parse(url))
end

And here is my bloated import method:

def import

  if new_episode?

    episode = this_week

    doc = Nokogiri::HTML(open("http://www.thisamericanlife.org/radio-archives/episode/#{episode}")).css("div#content")

    number = doc.css("h1.node-title").text.split(":").first.to_i
    title = doc.css("h1.node-title").text.split(":").last.strip
    description = doc.css("div.description").text.strip
    date = Date.parse(doc.css("div.date").text).strftime("%F")

    image = doc.css("div.image img").attribute('src')
    podcast = "http://www.podtrac.com/pts/redirect.mp3/podcast.thisamericanlife.org/podcast/#{episode}.mp3"

    begin
      local_podcast = local_resource_from_url(podcast)
      local_copy_of_podcast = local_podcast.file

      local_image = local_resource_from_url(image)
      local_copy_of_image = local_image.file

      s3 = AWS::S3.new
      bucket = s3.buckets["#{ENV['S3_BUCKET_NAME']}"]

      if !bucket.objects["podcasts/#{episode}.mp3"].exists?
        bucket.objects["podcasts/#{episode}.mp3"].write(:file => local_copy_of_podcast.path, :acl => :public_read)
      end

      if !bucket.objects["images/#{episode}.jpg"].exists?
        bucket.objects["images/#{episode}.jpg"].write(:file => local_copy_of_image.path, :acl => :public_read)
      end

    ensure
      local_copy_of_podcast.close
      local_copy_of_podcast.unlink
      local_copy_of_image.close
      local_copy_of_image.unlink
    end

    Podcast.create!(number: number, title: title, description: description, date: date)

    redirect_to root_path, notice: "New Episode Imported! :)"

  else
    redirect_to root_path, notice: "No New Episodes. :("
  end

end

All of this worked for quite some time, but then I got this error:

undefined method `close' for nil:NilClass

ensure
  local_copy_of_podcast.close
  local_copy_of_podcast.unlink
  local_copy_of_image.close
  local_copy_of_image.unlink
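
A note on this first error: in Ruby, an exception raised inside an ensure block replaces the one that was already in flight. So if the download fails before a tempfile is assigned, calling close on nil hides the real failure. The following is a minimal sketch of that behavior, not the original code; failing_download, unguarded, and guarded are hypothetical names used only for illustration.

```ruby
# Stand-in for LocalResource#file failing before a tempfile exists
# (hypothetical; the real failure in the import may be something else).
def failing_download
  raise IOError, "download failed"
end

# Mirrors the shape of the original ensure block.
def unguarded
  file = nil
  begin
    file = failing_download
  ensure
    file.close # file is still nil here, so NoMethodError replaces the IOError
  end
end

# Same flow, but cleanup is skipped when no file was ever created.
def guarded
  file = nil
  begin
    file = failing_download
  ensure
    file&.close  # safe navigation: does nothing when file is nil
    file&.unlink
  end
end
```

With the guarded form, the original exception reaches the caller, which makes the underlying problem visible without removing the ensure block.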

I removed the "begin / ensure / end" part of the import method, ran it again, and got this:

unexpected prefix: #<Pathname:552.mp3>

def file
  @file ||= Tempfile.new(tmp_filename, tmp_folder, encoding: encoding).tap do |f|
  io.rewind
  f.write(io.read)
  f.close

That seemed like a weird error because, as I understand it, tmp_filename is just trying to grab the file name and the file type and build an array like ['522','.mp3']. But when I looked into the Pathname#basename method, it returns the entire thing, "522.mp3", wrapped in a Pathname object. Keep in mind, this HAD been working for MONTHS. Either way, I decided that my next move was to forget about tmp_filename altogether, and I just hardcoded a file name:
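
To make the Pathname behavior concrete, here is a small sketch using only the standard library. The path is an example shaped like the episode URL path, not a value taken from the app.

```ruby
require "pathname"

# Example path only; the real value comes from uri.path.
path = Pathname.new("/podcast/552.mp3")

base = path.basename      # a Pathname, and it still includes the extension
ext  = path.extname       # a plain String holding the extension
stem = path.basename(ext) # a Pathname with the given suffix stripped

puts base.class # Pathname
puts base       # 552.mp3
puts ext        # .mp3
puts stem       # 552
```

So the array built by tmp_filename holds a Pathname first and a String second, rather than two Strings.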

@file ||= Tempfile.new(['522', '.mp3'], tmp_folder, encoding: encoding)

I ran import again and got this:

private method `open' called for #<URI::Generic:0x007ff28e8914e8>

def io
  @io ||= uri.open
end
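
One possible cause, offered as a guess rather than a diagnosis: open-uri adds a public open method to URI::HTTP, URI::HTTPS, and URI::FTP, but not to URI::Generic. A URL without a scheme, such as a protocol-relative image src, parses to URI::Generic, where open falls through to the private Kernel#open. The sketch below shows the difference and one way to normalize such a URL. The example.com URLs, the absolute_uri helper, and its default base are assumptions for illustration.

```ruby
require "uri"
require "open-uri"

with_scheme    = URI.parse("http://example.com/images/552.jpg")
without_scheme = URI.parse("//example.com/images/552.jpg")

puts with_scheme.class    # URI::HTTP
puts without_scheme.class # URI::Generic

# respond_to? ignores private methods, so this separates the two cases.
puts with_scheme.respond_to?(:open)    # true
puts without_scheme.respond_to?(:open) # false

# Hypothetical helper: resolve a possibly scheme-less URL against a base
# so that the result carries a scheme. The default base is an assumption.
def absolute_uri(src, base = "http://www.thisamericanlife.org/")
  URI.join(base, src.to_s) # to_s so non-String inputs are stringified first
end

puts absolute_uri("//example.com/images/552.jpg").class # URI::HTTP
```

If this is what is happening, local_resource_from_url could call a helper like absolute_uri instead of URI.parse.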

And this is where I stop because I can't figure out what to do next. Does any of this make sense? If you've made it this far and things are still totally out there and you want to take a look at the actual source files, have at it!

https://github.com/eliduke/thisamericanlife.co

Thanks ahead of time!

Eli

@fran-worley

I just had the same thing happen to me. I have no idea exactly when it stopped working, but it appears that Pathname#basename returns a Pathname object rather than a string, and it doesn't strip the extension.
I changed the tmp_filename method to the following and it seems to be working:

def tmp_filename
    array = Pathname.new(uri.path).basename.to_s.split(".")
    array[1] = ".#{array[1]}"
    array
end

There are definitely nicer ways of doing this, but essentially you need to end up with an array in the following format:

["file_name_no_extension",".extension"]
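
One way to build that array without splitting on ".", sketched under the assumption that the input is a URL path string: Pathname#extname gives the extension, and Pathname#basename accepts a suffix to strip. The name tmp_filename_for is hypothetical; inside LocalResource the same body would use uri.path.

```ruby
require "pathname"

# Hypothetical standalone variant of tmp_filename.
# Returns [name_without_extension, extension] as two plain Strings.
def tmp_filename_for(path)
  pathname = Pathname.new(path)
  ext = pathname.extname
  [pathname.basename(ext).to_s, ext]
end

p tmp_filename_for("/podcast/552.mp3")        # ["552", ".mp3"]
p tmp_filename_for("/images/cover.final.jpg") # ["cover.final", ".jpg"]
p tmp_filename_for("/podcast/552")            # ["552", ""]
```

Unlike the split-based version, this keeps names with more than one dot intact and copes with paths that have no extension.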
