@yasith
Created December 16, 2012 16:48
Python Scraper
import requests
import re
from bs4 import BeautifulSoup
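# Fragment from inside a method: `self` must provide getUrl() and stores results in _stops.
page = requests.get(self.getUrl()).text
# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(page, 'html.parser')
# Match anchors whose href contains 'nextPassingTimes'.
regex = re.compile('nextPassingTimes.*')
links = soup.find_all('a', {'href': regex})
self._stops = []
for link in links:
    url = str(link['href'])
    # The first child of the anchor is taken as the stop name.
    name = str(link.contents[0])
    self._stops.append({'url': url, 'name': name})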
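
The fragment above depends on a surrounding class and a live page, so it cannot be run on its own. As a rough, self-contained sketch of the same link extraction, the block below swaps BeautifulSoup for the standard library's html.parser so that it runs without third-party packages. The sample markup, the StopLinkParser class, and the extract_stops function are all hypothetical names made up for illustration; the real page structure is not shown in the gist. One difference to be aware of: this sketch joins all text inside the anchor, whereas the original takes only the anchor's first child.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample markup; the real page is not part of the gist.
SAMPLE_HTML = """
<ul>
  <li><a href="nextPassingTimes?stop=1">Main St</a></li>
  <li><a href="about.html">About</a></li>
  <li><a href="nextPassingTimes?stop=2">Oak Ave</a></li>
</ul>
"""

# Same pattern as the gist. It is applied with search(), so it matches
# anywhere in the href rather than only at the start.
HREF_PATTERN = re.compile('nextPassingTimes.*')


class StopLinkParser(HTMLParser):
    """Collects {'url', 'name'} dicts for anchors whose href matches HREF_PATTERN."""

    def __init__(self):
        super().__init__()
        self.stops = []
        self._href = None  # href of the matching <a> currently open, if any
        self._text = []    # text chunks seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href and HREF_PATTERN.search(href):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            name = ''.join(self._text).strip()
            self.stops.append({'url': self._href, 'name': name})
            self._href = None


def extract_stops(html):
    """Return the list of matching stop links found in an HTML string."""
    parser = StopLinkParser()
    parser.feed(html)
    parser.close()
    return parser.stops


stops = extract_stops(SAMPLE_HTML)
for stop in stops:
    print(stop['name'], stop['url'])
```

To use it against a real page, the string passed to extract_stops would be the text of the HTTP response rather than SAMPLE_HTML.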