@adam-davis
Created March 2, 2012 18:15
Grabbing and printing places in OpenBlock
scraper_text = "A professor from White Hall will be at the Student Center"
grab_results = place_grabber(scraper_text)
print grab_results
[(18, 32, 'Student Center'), (37, 47, 'White Hall')]
place_grabber = places.place_grabber()
grab_results = place_grabber(scraper_text)
print grab_results
[(18, 32, 'Student Center')]
print grab_results[0][2]
Student Center
You can also iterate through the list of results with a for ... in loop.
scraper_text = "Today at White Hall a clown was badly injured in a pie eating contest. We sent our expert hip-hop analyst Miles Johnson to the Student Center to report on these shocking events that happened at White Hall."
grab_results = place_grabber(scraper_text)
for left_substring_index, right_substring_index, place in grab_results:
...     print place
#output
White Hall
Student Center
White Hall
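To experiment with the result format without an OpenBlock install, the following is a minimal stand-in sketch. It is not how ebdata's place_grabber works internally; `toy_place_grabber` and `KNOWN_PLACES` are hypothetical names introduced here for illustration, and the example text is made up. It assumes, as the sample outputs in this gist suggest, that each result is a (start, end, name) tuple whose offsets slice the original string. It uses the print() function rather than the Python 2 print statement seen elsewhere in this gist.

```python
import re

# Hypothetical list of known place names. In OpenBlock the names would come
# from the Place table instead of a hard-coded list.
KNOWN_PLACES = ["White Hall", "Student Center"]


def toy_place_grabber(text, names=KNOWN_PLACES):
    """Return (start, end, name) tuples for every known name found in text."""
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


scraper_text = "Today at White Hall a clown was injured. More from the Student Center."
for start, end, place in toy_place_grabber(scraper_text):
    # The offsets slice the original string back to the matched name.
    assert scraper_text[start:end] == place
    print(start, end, place)
```

Unpacking all three values in the for statement keeps the offsets available when they are needed, for example to highlight the match in the original text.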
from ebpub.streets.models import Place, PlaceType, PlaceSynonym
#including the places module for grabbers
from ebdata.nlp import places
from ebpub.db.models import NewsItem
newsItem = NewsItem()
scraper_text = "Where: Kent State Student Center"
# here I'm using a place_grabber; location_grabber can also be used to search for
# locations in a string
place_grabber = places.place_grabber()
grab_results = place_grabber(scraper_text)
print grab_results
[(18, 32, 'Student Center')]
newsItem.location = Place.objects.get(pretty_name=grab_results[0][2]).location
print newsItem.location
POINT (-81.3431381031799958 41.1477285979820024)
scraper_text = "Today at White Hall a clown was badly injured in a pie eating contest.\
We sent our expert hip-hop analyst Miles Johnson to the Student Center to report on\
these sinister happening that without a doubt occurred at White Hall."
grab_results = place_grabber(scraper_text)
print grab_results
[(9, 19, 'White Hall'), (129, 143, 'Student Center'), (196, 206, 'White Hall')]
# checking the size of grab_results
# if there's more than one match we'll keep a running total of how many times
# each match occurred
if len(grab_results) > 1:
    results = {}
    # in this loop, l, r, and place refer to the three members of each tuple in grab_results
    for l, r, place in grab_results:
        if place in results:
            results[place] += 1
        else:
            results[place] = 1
    print results
{'Student Center': 1, 'White Hall': 2}
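The same tally can be sketched with collections.Counter from the standard library, which also makes it easy to pick the place mentioned most often. This is an alternative to the dict loop, not part of OpenBlock; the grab_results list is copied from the output shown earlier so the snippet runs on its own, and it uses the print() function rather than the Python 2 print statement.

```python
from collections import Counter

# Result list copied from the place_grabber output shown earlier.
grab_results = [(9, 19, 'White Hall'), (129, 143, 'Student Center'), (196, 206, 'White Hall')]

# Count how often each place name occurs, ignoring the offsets.
counts = Counter(place for _start, _end, place in grab_results)

# most_common(1) returns a one-element list holding the (name, count) pair
# with the highest count.
best_place, best_count = counts.most_common(1)[0]
print(best_place, best_count)
```

The most frequent name could then be passed to Place.objects.get(pretty_name=...) as in the earlier snippet, assuming a Place with that name exists.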