@enderandpeter
Last active September 26, 2016 23:12
A spirited but perhaps ultimately futile attempt to programmatically extract the street number, direction, and name from thousands of non-standardized addresses entered into the Murney Business Dashboard, using one address as an example.
<?php
$subject = 'LOT 3 PHS 2 S OZ NATURAL FALLS DR';
$directionals = array('north', 'east', 'south', 'west');
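// The \b after the directional keeps a single letter from being split off a street name (e.g. "E" from "Elm").
$pattern = "/(?P<number>^(?:(?:L(?:ot)?|PHS\w*)?(?:\d+(?:th)?(?:\s*\(\d+\))?)?(?:[[:^alnum:]]+)?(?:L(?:ot)?|PHS\w*)?(?:\d+(?:th)?(?:\s*\(\d+\))?)?)+)\s+(?P<direction>(?:north|east|south|west|n|e|s|w)\b)?(?P<street>.*)/i";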
$result = preg_match($pattern, $subject, $matches);
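// preg_match() returns 1 on a match, 0 on no match, and false on error.
// Setting the flag first means it is always defined, even on the error path.
$matches_found = ($result === 1);
if($result === false){
echo preg_last_error();
}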
?>
<?php
if($matches_found){
echo 'Matches Found';?>
<pre> <?php echo htmlspecialchars(print_r($matches, true)); ?> </pre>
<?php
$street = $matches['street'];
$number = $matches['number'];
$direction = $matches['direction'];
loadModel( 'agentLifecycle/myBusinessDashboard/MyBusinessDashboardTransactions' );
$showHidden = true;
$allTransactions = MyBusinessDashboardTransactions::GetAll($showHidden);
$abbreviation_map = array(
'AG' => 'Ash Grove',
'AU' => 'Aurora',
'BF' => 'Battlefield',
'BG' => 'Billings',
'BD' => "Bois D'Arc",
'BV' => 'Bolivar',
'BL' => 'Bolivar',
'BR' => 'Branson',
'BT' => 'Brighton',
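'BL' => 'Brookline', // Duplicate key: PHP keeps the last value, so this overwrites 'BL' => 'Bolivar' above.
'BF' => 'Brookline', // Duplicate key: overwrites 'BF' => 'Battlefield' above. Not sure why this is the case, but hey.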
'BN' => 'Bruner',
'CF' => 'Cape Fair',
'CA' => 'Carthage',
'CG' => 'Carthage',
'CV' => 'Clever',
'CW' => 'Conway',
'CR' => 'Crane',
'FG' => 'Fair Grove',
'FD' => 'Fordland',
'FH' => 'Fremont Hills',
'HV' => 'Highlandville',
'HS' => 'Hollister',
'KC' => 'Kimberling City',
'LB' => 'Louisburg',
'MF' => 'Marshfield',
'MV' => 'Mt Vernon',
'NX' => 'Nixa',
'OZ' => 'Ozark',
'PH' => 'Pleasant Hope',
'RS' => 'Reed Spring',
'RP' => 'Republic',
'RV' => 'Rogersville',
'SB' => 'Saddlebrook',
'SY' => 'Seymour',
'SP' => 'Sparta',
'SK' => 'Spokane',
'ST' => 'Stockton',
'SF' => 'Strafford',
'WG' => 'Walnut Grove',
'WD' => 'Willard',
);
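// Split on runs of whitespace and drop empty parts, so the first element is always a real token.
// (The empty() check would also have discarded a literal "0" and could leave index 0 undefined.)
$street_array = preg_split('/\s+/', $street, -1, PREG_SPLIT_NO_EMPTY);
// The pattern is case-insensitive but the map keys are uppercase, so normalize before the lookup.
$first_street_part = isset($street_array[0]) ? strtoupper($street_array[0]) : '';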
if(isset($abbreviation_map[$first_street_part])){
// The first part of the street name is a city abbreviation
array_shift($street_array);
}
$street_name = implode(' ', $street_array);
?>
<ul id="street_data">
<li style="font-size: 2em">Address: <?php echo htmlspecialchars($subject); ?></li>
<li>Street Number: <?php echo htmlspecialchars($number); ?></li>
<li>Street Direction: <?php echo htmlspecialchars($direction); ?></li>
<li>Street Name: <?php echo htmlspecialchars($street_name); ?></li>
</ul>
<?php
} else {
echo 'Matches <strong>Not</strong> Found';
}
?>
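
The steps in the script (match a lot/phase/number prefix, pick off an optional directional, then drop a leading city abbreviation from the street) could be packaged into one reusable function, roughly as sketched below. This is only a sketch under stated assumptions: `parse_address` is a hypothetical name that does not appear in the gist, and the number pattern is a deliberately simplified stand-in for the original expression, so it will not accept every format the original does.

```php
<?php
// Hypothetical helper, not part of the original gist.
// Returns an array with 'number', 'direction' and 'street', or null when nothing matches.
function parse_address($subject, array $city_abbreviations)
{
    // Build north|east|south|west|n|e|s|w from the long directionals.
    $directionals = array('north', 'east', 'south', 'west');
    $initials = array_map(function ($d) { return $d[0]; }, $directionals);
    $direction = implode('|', array_merge($directionals, $initials));

    // Simplified assumption: a unit is "LOT 3", "L 3", "PHS 2", "12th" or "104".
    $unit = '(?:(?:lot|l|phs\w*)\s*\d+|\d+(?:th)?)';
    $pattern = '/^(?P<number>' . $unit . '(?:[^[:alnum:]]+' . $unit . ')*)\s+'
             . '(?:(?P<direction>' . $direction . ')\b\s*)?(?P<street>.*)$/i';

    if (preg_match($pattern, $subject, $m) !== 1) {
        return null; // no match, or a PCRE error
    }

    // Tokenize the street and drop a leading city abbreviation such as "OZ".
    $parts = preg_split('/\s+/', $m['street'], -1, PREG_SPLIT_NO_EMPTY);
    if ($parts && isset($city_abbreviations[strtoupper($parts[0])])) {
        array_shift($parts);
    }

    return array(
        'number'    => $m['number'],
        'direction' => strtoupper($m['direction']),
        'street'    => implode(' ', $parts),
    );
}

print_r(parse_address('LOT 3 PHS 2 S OZ NATURAL FALLS DR', array('OZ' => 'Ozark')));
```

For the example address this yields the number `LOT 3 PHS 2`, the direction `S`, and the street `NATURAL FALLS DR`. The `\b` after the directional is what keeps `104 Elm St` from being read as direction `E` plus street `lm St`. Two-letter street prefixes that happen to collide with a city code (for example `ST`) would still be stripped, which is one reason the approach stays a heuristic.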