@jage
Last active August 14, 2018 17:56
source "https://rubygems.org/"
gem "mastodon-api", git: "https://github.com/tootsuite/mastodon-api.git"
gem "nokogiri"
gem "twingly-url"
GIT
  remote: https://github.com/tootsuite/mastodon-api.git
  revision: 189deb8219ae1ce7c34386d9ad1ca7e4a5fec62c
  specs:
    mastodon-api (1.2.0)
      addressable (~> 2.5)
      buftok
      http (~> 3.0)
      oj (~> 3.3)

GEM
  remote: https://rubygems.org/
  specs:
    addressable (2.5.2)
      public_suffix (>= 2.0.2, < 4.0)
    buftok (0.2.0)
    domain_name (0.5.20180417)
      unf (>= 0.0.5, < 1.0.0)
    http (3.3.0)
      addressable (~> 2.3)
      http-cookie (~> 1.0)
      http-form_data (~> 2.0)
      http_parser.rb (~> 0.6.0)
    http-cookie (1.0.3)
      domain_name (~> 0.5)
    http-form_data (2.1.1)
    http_parser.rb (0.6.0)
    mini_portile2 (2.3.0)
    nokogiri (1.8.4)
      mini_portile2 (~> 2.3.0)
    oj (3.6.5)
    public_suffix (3.0.2)
    twingly-url (5.1.1)
      addressable (~> 2.5.2)
      public_suffix (~> 3.0.1)
    unf (0.1.4)
      unf_ext
    unf_ext (0.0.7.5)

PLATFORMS
  ruby

DEPENDENCIES
  mastodon-api!
  nokogiri
  twingly-url

BUNDLED WITH
   1.16.2
require "mastodon"
require "nokogiri"
require "twingly/url/utilities"

# Reduce the input to its plain text, then return the valid URLs found in it.
def extract_urls(html_or_text)
  text = Nokogiri::HTML(html_or_text).text
  Twingly::URL::Utilities.extract_valid_urls(text)
end

client = Mastodon::Streaming::Client.new(
  base_url: "https://mastodon.social",
  bearer_token: ENV.fetch("BEARER_TOKEN")
)

# Follow the public timeline and print every URL found in a status
# or in the bio (note) of the account that posted it.
client.stream("public") do |obj|
  # Skip events that are not statuses.
  next unless obj.is_a?(Mastodon::Status)

  urls = []
  urls.concat(extract_urls(obj.account.note))
  urls.concat(extract_urls(obj.content))

  urls.uniq.each do |url|
    puts url
  end
end