Created
March 2, 2012 18:15
Grabbing and printing places in OpenBlock
scraper_text = "A professor from White Hall will be at the Student Center"
grab_results = place_grabber(scraper_text)
print grab_results
[(18, 32, 'Student Center'), (37, 47, 'White Hall')]
place_grabber = places.place_grabber()
grab_results = place_grabber(scraper_text)
print grab_results
[(18, 32, 'Student Center')]
print grab_results[0][2]
Student Center
You can also iterate through the results with a for ... in loop.
scraper_text = "Today at White Hall a clown was badly injured in a pie eating contest. We sent our expert hip-hop analyst Miles Johnson to the Student Center to report on these shocking events that happened at White Hall."
grab_results = place_grabber(scraper_text)
for left_substring_index, right_substring_index, place in grab_results:
    print place
# output
White Hall
Student Center
White Hall
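Each result is a (start, end, name) tuple, and the two indexes slice the place name back out of the original string. The sketch below illustrates that without an OpenBlock install. `fake_place_grabber` and `KNOWN_PLACES` are hypothetical stand-ins written for this example, not part of `ebdata.nlp.places`, and the regex matching is an assumption rather than how the real grabber works.

```python
import re

# Hypothetical stand-in for OpenBlock's place_grabber: it matches a fixed
# list of names with a regex. The real grabber is not reproduced here.
KNOWN_PLACES = ["White Hall", "Student Center"]
_pattern = re.compile("|".join(re.escape(name) for name in KNOWN_PLACES))


def fake_place_grabber(text):
    """Return (start, end, name) tuples in order of appearance."""
    return [(m.start(), m.end(), m.group(0)) for m in _pattern.finditer(text)]


scraper_text = "Where: Kent State Student Center"
grab_results = fake_place_grabber(scraper_text)

for start, end, place in grab_results:
    # The offsets slice the matched name back out of the source string.
    assert scraper_text[start:end] == place
    print(start, end, place)
```

The offsets are handy when you want to highlight or link the matched name in the scraped text instead of only printing it.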
from ebpub.streets.models import Place, PlaceType, PlaceSynonym
# import the places module, which provides the grabbers
from ebdata.nlp import places
from ebpub.db.models import NewsItem
newsItem = NewsItem()
scraper_text = "Where: Kent State Student Center"
# a place_grabber is used here; location_grabber can also be used to search for
# locations in a string
place_grabber = places.place_grabber()
grab_results = place_grabber(scraper_text)
print grab_results
[(18, 32, 'Student Center')]
# look up the Place whose pretty_name matches the first result and reuse its location
newsItem.location = Place.objects.get(pretty_name=grab_results[0][2]).location
print newsItem.location
POINT (-81.3431381031799958 41.1477285979820024)
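The lookup above assumes at least one place was grabbed and that a matching Place row exists. Otherwise `grab_results[0]` raises IndexError, and Django's `get()` raises `Place.DoesNotExist`. Here is a minimal sketch of a guarded version. `PLACE_LOCATIONS` and `location_for` are hypothetical names, and the dictionary only stands in for the Place table so the example runs without a database.

```python
# Hypothetical stand-in for the Place table: pretty_name -> WKT point.
PLACE_LOCATIONS = {
    "Student Center": "POINT (-81.3431381031799958 41.1477285979820024)",
}


def location_for(grab_results, locations=PLACE_LOCATIONS):
    """Return the location of the first recognised place, or None."""
    for _start, _end, name in grab_results:
        if name in locations:
            return locations[name]
    # Nothing was grabbed, or no grabbed name has a known location.
    return None


print(location_for([(18, 32, "Student Center")]))
print(location_for([]))
```

In a real scraper the dictionary lookup would presumably be the `Place.objects.get(pretty_name=...)` call wrapped in `try`/`except Place.DoesNotExist`, leaving `newsItem.location` unset when the function returns None.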
scraper_text = "Today at White Hall a clown was badly injured in a pie eating contest.\
We sent our expert hip-hop analyst Miles Johnson to the Student Center to report on\
these sinister happening that without a doubt occurred at White Hall."
grab_results = place_grabber(scraper_text)
print grab_results
[(9, 19, 'White Hall'), (129, 143, 'Student Center'), (196, 206, 'White Hall')]
# check the size of grab_results
# if there is more than one match, keep a running total of how many times
# each place occurred
if len(grab_results) > 1:
    results = {}
    # in this loop, l, r, and place are the three members of each tuple in grab_results
    for l, r, place in grab_results:
        if place in results:
            results[place] += 1
        else:
            results[place] = 1
    print results
{'Student Center': 1, 'White Hall': 2}
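The running total can also be kept with `collections.Counter` from the standard library (Python 2.7 and later), which additionally gives you the most frequently mentioned place. This is an alternative sketch, not part of the original gist. The `grab_results` list is copied from the output above so the example runs without OpenBlock.

```python
from collections import Counter

# Results copied from the example output above.
grab_results = [
    (9, 19, "White Hall"),
    (129, 143, "Student Center"),
    (196, 206, "White Hall"),
]

# Count how many times each place name appears, ignoring the offsets.
counts = Counter(place for _start, _end, place in grab_results)

# most_common(1) returns a one-item list of (name, count) pairs.
top_place, top_count = counts.most_common(1)[0]
print(top_place, top_count)
```

Picking the most common name is one plausible way to choose a single location for a news item that mentions several places.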