5) Pedestrian Flow over Multiple Pairs of Origins and Destinations.
Two main advantages of using UNA in Python are automation and scalability. In this section, we’ll see how to automate flow estimation over multiple pairs of origins and destinations, a task that takes a lot of time in GUI software. We will create a pairing table: a table that defines which origin, destination, and parameters are used to generate the flow for each pair.
[1]:
import madina as md
import madina.una.tools as una
import pandas as pd
import os
import multiprocessing
from pathlib import Path
We will be using the sample data from Somerville, MA provided in the folder Cities/Somerville/Data. The data files must all be in the same CRS, and all must be clipped to the same analysis area. We want to store all the results in a folder called Cities/Somerville/Simulations/Baseline.
[2]:
data_folder = os.path.join("Cities", "Somerville", "Data")
output_folder = os.path.join("Cities", "Somerville", "Simulations", "Baseline")
Path(output_folder).mkdir(parents=True, exist_ok=True) ## create the output folder if it doesn't exist
Then, we identify the pairs of origins and destinations we want to simulate. We assign each pair a Flow_Name: this name is used to store the corresponding flow as a column/attribute in the network file. We then name the Origin_File and Origin_Name for each origin in the pair, and specify an Origin_Weight: an attribute/column in the origin layer. If we don’t want to assign a weight for a given origin, we use the keyword Count to indicate that all origins weigh the same. We do the same for destinations, where we specify Destination_File, Destination_Name and Destination_Weight.
[3]:
pairing_table = pd.DataFrame(
    {
        'Flow_Name': ['Bus_Subway', 'Homes_Subway', 'Jobs_Subway', 'Amenities_Amenities', 'CensusBlock_Parks', 'Institutions_Subway'],
        'Origin_File': ['bus.geojson', 'homes.geojson', 'jobs.geojson', 'amenities.geojson', 'CensusBlock.geojson', 'institutions.geojson'],
        'Origin_Name': ['Bus', 'Homes', 'Jobs', 'Amenities', 'CensusBlock', 'Institutions'],
        'Origin_Weight': ['LineCount', 'RES_2020B', 'EMPNUM', 'Count', 'POP20', 'Count'],
        'Destination_File': ['subway.geojson', 'subway.geojson', 'subway.geojson', 'amenities.geojson', 'parks.geojson', 'subway.geojson'],
        'Destination_Name': ['Subway', 'Subway', 'Subway', 'Amenities', 'Parks', 'Subway'],
        'Destination_Weight': ['Count', 'Count', 'Count', 'Count', 'Count', 'Count']
    }
)
pairing_table
[3]:
| | Flow_Name | Origin_File | Origin_Name | Origin_Weight | Destination_File | Destination_Name | Destination_Weight |
|---|---|---|---|---|---|---|---|
| 0 | Bus_Subway | bus.geojson | Bus | LineCount | subway.geojson | Subway | Count |
| 1 | Homes_Subway | homes.geojson | Homes | RES_2020B | subway.geojson | Subway | Count |
| 2 | Jobs_Subway | jobs.geojson | Jobs | EMPNUM | subway.geojson | Subway | Count |
| 3 | Amenities_Amenities | amenities.geojson | Amenities | Count | amenities.geojson | Amenities | Count |
| 4 | CensusBlock_Parks | CensusBlock.geojson | CensusBlock | POP20 | parks.geojson | Parks | Count |
| 5 | Institutions_Subway | institutions.geojson | Institutions | Count | subway.geojson | Subway | Count |
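Because each Flow_Name becomes a column on the network layer, a repeated name would presumably overwrite an earlier result. A minimal sketch of a sanity check one might run before simulating is shown below. The names `check_pairing_table` and `REQUIRED_COLUMNS` are hypothetical helpers written for illustration and are not part of madina; the toy table only mimics the structure of the pairing table above.

```python
import pandas as pd

# Columns introduced so far in this section (hypothetical constant, not part of madina).
REQUIRED_COLUMNS = [
    'Flow_Name', 'Origin_File', 'Origin_Name', 'Origin_Weight',
    'Destination_File', 'Destination_Name', 'Destination_Weight',
]

def check_pairing_table(table):
    """Return a list of readable problems; an empty list means nothing was flagged."""
    problems = []
    # Report any expected column that is absent from the table.
    missing = [c for c in REQUIRED_COLUMNS if c not in table.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
    # Report flow names that appear more than once.
    if 'Flow_Name' in table.columns:
        repeated = table.loc[table['Flow_Name'].duplicated(), 'Flow_Name'].unique().tolist()
        if repeated:
            problems.append(f"duplicate flow names: {repeated}")
    return problems

# Toy table with a repeated flow name, to show what the check reports.
toy_table = pd.DataFrame({
    'Flow_Name': ['Bus_Subway', 'Bus_Subway'],
    'Origin_File': ['bus.geojson', 'bus.geojson'],
    'Origin_Name': ['Bus', 'Bus'],
    'Origin_Weight': ['LineCount', 'Count'],
    'Destination_File': ['subway.geojson', 'subway.geojson'],
    'Destination_Name': ['Subway', 'Subway'],
    'Destination_Weight': ['Count', 'Count'],
})
print(check_pairing_table(toy_table))
```

An empty list from `check_pairing_table(pairing_table)` would suggest the table is at least structurally ready; it does not verify that the weight columns actually exist in the data files.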
All OD pair flows should be estimated using the same network. Each pair could potentially use a different network cost if, for example, we wanted to account for different perceived distances for different demographics or trip types. In this example, we will use the same network and perceived distance for all pairs. In our pairing table, we specify a Network_File and a Network_Cost: a column that contains numerical values.
[4]:
pairing_table['Network_File'] = 'network.geojson'
pairing_table['Network_Cost'] = 'PercLen'
pairing_table
[4]:
| | Flow_Name | Origin_File | Origin_Name | Origin_Weight | Destination_File | Destination_Name | Destination_Weight | Network_File | Network_Cost |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Bus_Subway | bus.geojson | Bus | LineCount | subway.geojson | Subway | Count | network.geojson | PercLen |
| 1 | Homes_Subway | homes.geojson | Homes | RES_2020B | subway.geojson | Subway | Count | network.geojson | PercLen |
| 2 | Jobs_Subway | jobs.geojson | Jobs | EMPNUM | subway.geojson | Subway | Count | network.geojson | PercLen |
| 3 | Amenities_Amenities | amenities.geojson | Amenities | Count | amenities.geojson | Amenities | Count | network.geojson | PercLen |
| 4 | CensusBlock_Parks | CensusBlock.geojson | CensusBlock | POP20 | parks.geojson | Parks | Count | network.geojson | PercLen |
| 5 | Institutions_Subway | institutions.geojson | Institutions | Count | subway.geojson | Subway | Count | network.geojson | PercLen |
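The simulation loop later in this section rebuilds the street network whenever Network_Cost differs from the previous row. If pairs do use different costs, grouping rows with the same cost should therefore keep rebuilds to a minimum. The sketch below illustrates that reasoning on plain lists; `count_network_rebuilds` is a hypothetical helper, and `PercLen_Seniors` is a made-up column name used only for illustration.

```python
def count_network_rebuilds(costs):
    """Count rebuilds: one for the first row, plus one each time the cost changes from the previous row."""
    rebuilds = 0
    previous = None
    for position, cost in enumerate(costs):
        if position == 0 or cost != previous:
            rebuilds += 1
        previous = cost
    return rebuilds

# Made-up ordering where two costs alternate from row to row.
interleaved = ['PercLen', 'PercLen_Seniors', 'PercLen', 'PercLen_Seniors']
# Sorting places rows with the same cost next to each other.
grouped = sorted(interleaved)

print(count_network_rebuilds(interleaved))
print(count_network_rebuilds(grouped))
```

With a pandas table, `pairing_table.sort_values('Network_Cost', kind='stable').reset_index(drop=True)` would give the same grouping. Resetting the index matters because the loop compares each row with the row at the previous index.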
After specifying origins, destinations and the network, we need to specify parameters for the betweenness function. In many cases, you want to use the same parameters for all pairs:
[5]:
pairing_table['Radius'] = 800
pairing_table['Detour'] = 1.15
pairing_table['Decay'] = True
pairing_table['Decay_Mode'] = 'exponent'
pairing_table['Beta'] = 0.001
pairing_table['Closest_destination'] = False
pairing_table['Elastic_Weights'] = False
pairing_table['KNN_Weight'] = None
pairing_table['Plateau'] = None
pairing_table['Turns'] = False
pairing_table['Turn_Threshold'] = None
pairing_table['Turn_Penalty'] = None
pairing_table
[5]:
| | Flow_Name | Origin_File | Origin_Name | Origin_Weight | Destination_File | Destination_Name | Destination_Weight | Network_File | Network_Cost | Radius | ... | Decay | Decay_Mode | Beta | Closest_destination | Elastic_Weights | KNN_Weight | Plateau | Turns | Turn_Threshold | Turn_Penalty |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Bus_Subway | bus.geojson | Bus | LineCount | subway.geojson | Subway | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 1 | Homes_Subway | homes.geojson | Homes | RES_2020B | subway.geojson | Subway | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 2 | Jobs_Subway | jobs.geojson | Jobs | EMPNUM | subway.geojson | Subway | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 3 | Amenities_Amenities | amenities.geojson | Amenities | Count | amenities.geojson | Amenities | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 4 | CensusBlock_Parks | CensusBlock.geojson | CensusBlock | POP20 | parks.geojson | Parks | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 5 | Institutions_Subway | institutions.geojson | Institutions | Count | subway.geojson | Subway | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
6 rows × 21 columns
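To get a feel for the Beta value, the sketch below evaluates an exponential distance decay at a few distances. It assumes that the 'exponent' mode discounts a trip by exp(-Beta × distance), which is the common gravity-style formulation; that assumption and the helper name `exponential_decay` are not taken from the madina source, so check the library documentation before relying on the exact numbers.

```python
import math

def exponential_decay(distance, beta=0.001):
    """Assumed decay factor exp(-beta * distance); distance is in the units of the network cost."""
    return math.exp(-beta * distance)

# Decay factor at the origin, at half the search radius, and at the full radius of 800.
for distance in (0, 400, 800):
    print(distance, round(exponential_decay(distance), 3))
```

Under this assumption, a trip of 400 keeps roughly two thirds of its weight and a trip of 800 a little under half, so Beta = 0.001 is a fairly gentle decay within the chosen search radius.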
If we need to set parameters for a specific pair, for instance, to set the search radius for the pair “Bus_Subway” to 400 instead of 800:
[6]:
pairing_table.at[0, 'Radius'] = 400
pairing_table
[6]:
| | Flow_Name | Origin_File | Origin_Name | Origin_Weight | Destination_File | Destination_Name | Destination_Weight | Network_File | Network_Cost | Radius | ... | Decay | Decay_Mode | Beta | Closest_destination | Elastic_Weights | KNN_Weight | Plateau | Turns | Turn_Threshold | Turn_Penalty |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Bus_Subway | bus.geojson | Bus | LineCount | subway.geojson | Subway | Count | network.geojson | PercLen | 400 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 1 | Homes_Subway | homes.geojson | Homes | RES_2020B | subway.geojson | Subway | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 2 | Jobs_Subway | jobs.geojson | Jobs | EMPNUM | subway.geojson | Subway | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 3 | Amenities_Amenities | amenities.geojson | Amenities | Count | amenities.geojson | Amenities | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 4 | CensusBlock_Parks | CensusBlock.geojson | CensusBlock | POP20 | parks.geojson | Parks | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
| 5 | Institutions_Subway | institutions.geojson | Institutions | Count | subway.geojson | Subway | Count | network.geojson | PercLen | 800 | ... | True | exponent | 0.001 | False | False | None | None | False | None | None |
6 rows × 21 columns
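Row positions can shift if the table is reordered or filtered, so it can be safer to select a pair by its Flow_Name rather than by its row number. A small sketch using a boolean mask with `DataFrame.loc`, shown on a two-row stand-in table (`demo_table` is a hypothetical name) rather than the full pairing table:

```python
import pandas as pd

# Two-row stand-in for the pairing table, keeping only the columns needed here.
demo_table = pd.DataFrame({
    'Flow_Name': ['Bus_Subway', 'Homes_Subway'],
    'Radius': [800, 800],
})

# Select the row by flow name and update its search radius.
demo_table.loc[demo_table['Flow_Name'] == 'Bus_Subway', 'Radius'] = 400
print(demo_table)
```

The same mask can update several columns at once by passing a list of column names and a matching list of values.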
Once the input data is prepared and the pairing table is complete, we have everything we need to calculate the betweenness flow for the six pairs of origins and destinations with their parameters.
[7]:
# Creating a Zonal object to be used as a workspace.
somerville = md.Zonal()

# Loading the network file, as specified in the first pair.
somerville.load_layer(
    name='streets',
    source=os.path.join(data_folder, pairing_table.at[0, "Network_File"])
)

# Looping over the rows of the pairing table.
for pairing_idx, pairing in pairing_table.iterrows():
    if (pairing_idx == 0) or (pairing_table.at[pairing_idx, 'Network_Cost'] != pairing_table.at[pairing_idx-1, 'Network_Cost']):
        # Setting up a street network if this is the first pairing, or if the network weight changed from the previous pairing.
        somerville.create_street_network(
            source_layer='streets',
            node_snapping_tolerance=0.00001,
            weight_attribute=None if pairing['Network_Cost'] == "Geometric" else pairing['Network_Cost'],  # set weight attribute to None if the keyword "Geometric" was used, otherwise, use the provided attribute.
        )
    else:
        # If not creating a new network, clear nodes from the existing one.
        somerville.clear_nodes()

    # If a turn penalty is applied, update turn parameters in case they change across pairs.
    if pairing['Turns']:
        somerville.set_turn_parameters(
            turn_penalty_amount=pairing['Turn_Penalty'],
            turn_threshold_degree=pairing['Turn_Threshold'],
        )

    # Loading layers, if they're not already loaded.
    if pairing["Origin_Name"] not in somerville.layers:
        somerville.load_layer(
            name=pairing["Origin_Name"],
            source=os.path.join(data_folder, pairing["Origin_File"])
        )
    if pairing["Destination_Name"] not in somerville.layers:
        somerville.load_layer(
            name=pairing["Destination_Name"],
            source=os.path.join(data_folder, pairing["Destination_File"])
        )

    # Inserting origin and destination nodes.
    somerville.insert_node(
        layer_name=pairing['Origin_Name'],
        label='origin',
        weight_attribute=pairing['Origin_Weight'] if pairing['Origin_Weight'] != "Count" else None
    )
    somerville.insert_node(
        layer_name=pairing['Destination_Name'],
        label='destination',
        weight_attribute=pairing['Destination_Weight'] if pairing['Destination_Weight'] != "Count" else None
    )

    # Create a network graph.
    somerville.create_graph()

    # Run the betweenness tool by passing arguments from the current pair in the loop.
    una.betweenness(
        zonal=somerville,
        search_radius=pairing['Radius'],
        detour_ratio=pairing['Detour'],
        decay=False if pairing['Elastic_Weights'] else pairing['Decay'],  # elastic weight already reduces origin weight factoring in decay. if this pairing uses elastic weights, don't apply decay
        decay_method=pairing['Decay_Mode'],
        beta=pairing['Beta'],
        num_cores=multiprocessing.cpu_count(),  # uses the maximum available number of cores.
        closest_destination=pairing['Closest_destination'],
        elastic_weight=pairing['Elastic_Weights'],
        knn_weight=pairing['KNN_Weight'],
        knn_plateau=pairing['Plateau'],
        turn_penalty=pairing['Turns'],
        save_betweenness_as=pairing['Flow_Name'],
        save_reach_as='reach_'+pairing['Flow_Name'],
        save_gravity_as='gravity_'+pairing['Flow_Name'],
        save_elastic_weight_as='elastic_weight_'+pairing['Flow_Name'] if pairing['Elastic_Weights'] else None,  # saving elastic weights if they were used.
    )
    print ("-----------------------Done generating flow: ", pairing['Flow_Name'], "-----------------------")

# Saving origin results (reach and gravity are saved to the origin layer).
for origin_layer in pairing_table['Origin_Name'].unique():
    somerville[origin_layer].gdf.to_file(os.path.join(output_folder, origin_layer+'.geojson'), driver='GeoJSON', engine='pyogrio')

# Saving network results; betweenness flows are saved to the network layer.
# Network_File already ends in '.geojson', so no extra extension is appended.
somerville['streets'].gdf.to_file(os.path.join(output_folder, pairing_table.at[0, "Network_File"]), driver='GeoJSON', engine='pyogrio')
print("-----------------------Done Generating all flows-----------------------")
-----------------------Done generating flow: Bus_Subway -----------------------
-----------------------Done generating flow: Homes_Subway -----------------------
-----------------------Done generating flow: Jobs_Subway -----------------------
-----------------------Done generating flow: Amenities_Amenities -----------------------
-----------------------Done generating flow: CensusBlock_Parks -----------------------
-----------------------Done generating flow: Institutions_Subway -----------------------
-----------------------Done Generating all flows-----------------------
Once the loop above is done, it will have generated betweenness flows for all six specified pairs. These flows are useful for recognizing critical paths between a given pair of origins and destinations, or could be used to train a pedestrian flow prediction model if pedestrian counts are available for training.
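As a rough illustration of the last point, the sketch below fits an ordinary least-squares model that relates per-pair flows to observed counts. All numbers are made up for illustration and stand in for flow columns read from the saved network layer and for real pedestrian counts; the variable names are hypothetical, and a real prediction model would need far more segments and proper validation.

```python
import numpy as np

# Made-up flow values for four street segments (columns: Bus_Subway, Homes_Subway).
flows = np.array([
    [10.0, 5.0],
    [0.0, 1.0],
    [2.5, 0.0],
    [4.0, 4.0],
])

# Made-up observed pedestrian counts for the same four segments.
observed_counts = np.array([35.0, 3.0, 5.0, 20.0])

# Ordinary least squares: find coefficients so that flows @ coefficients approximates observed_counts.
coefficients, *_ = np.linalg.lstsq(flows, observed_counts, rcond=None)

# Apply the fitted coefficients to get predicted counts per segment.
predicted_counts = flows @ coefficients
print(coefficients)
print(predicted_counts)
```

Each coefficient can be read as how strongly that pair's flow contributes to the observed counts, which is one simple way to weigh the pairs against each other.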