Data integration – on the edge

Snap to it

One of the challenges of integrating data across boundaries is edge matching, or in this case, line matching. Our data sharing group took the first step toward solving a seamless roads dataset by agreeing on “snap points”, where each agency would snap their roads at the boundary line with the next. Theoretically, if two people snapped the lines in their individual datasets to the same point, the data would be lined up and ready for integration. In the image, the bright green points are the snap points. At this scale, everything looks nice and happy.

snap points
Cross-jurisdictional roads data with snap points

 

unsnapped line
Unsnapped line near snap point

Zooming in, problems emerge. How to address this without manually examining every point in an eight-county area of over 8500 square miles? We know that looks are deceiving at large scales in ArcMap: on screen the line(s) might appear snapped when not, and vice versa.

Goal

The goal was provide an automated tool to check each snap point to make sure the nearby line endpoints are coincident. Hopefully, this tool can be run once by the person integrating the data. If problems are found, the source agency can be notified and encouraged to fix their data. Once fixed, the data should (again in theory) stay fixed.

I tried many things

Topology? I don’t know how to set up a topology rule which states that every point must be covered by at least two endpoints (not just one).

Intersect? This had promise – every point should be intersected by two lines. But wait, what if a line goes through instead of snapping at the endpoint? Errors I could see on the screen continued to be missed, causing me to mistrust the results. After many, many iterations of Select by Location, I cast my net a little further. I found disjoint.

Disjoint

Disjoint is a geometry method that lets us discover if two geometries share any points in common. To use it, you must acquire the geometry objects of the snap point and the road(s) endpoints near it. The overall process is to iterate through each snap point in the layer, select any road features within a search distance, and iterate through those road features to see if their endpoints are both disjoint (unsnapped) from that point. If any of the points have roads near them but the ends are not snapped to them, they are added to a list and at the end a new feature class is created consisting of the point(s). Then the new feature class can be added to a map to inspect the errors.

The code is set up for a script tool. The parameters are the snap points layer, the roads layer, an output feature class to hold any disjoint points, and the search distance. I used 15 feet successfully, but this value will vary depending on the characteristics of the data.

For details, see the comments within the code.

import arcpy
arcpy.env.overwriteOutput = True
snapPoints = arcpy.GetParameterAsText(0)
roadLines = arcpy.GetParameterAsText(1)
outFc = arcpy.GetParameterAsText(2)
searchDistance = arcpy.GetParameterAsText(3)
try:
# empty list which will contain the ids of the snap points which
# are disjoint (don't share geometry) with one or more nearby lines
ids = []
# get the name of the object id/oid field
oidField = arcpy.Describe(snapPoints).OIDFieldName
# make first feature layer from snap points
spLyr = "spLyr"
arcpy.MakeFeatureLayer_management(snapPoints,spLyr)
# make second feature layer from snap points
spLyr2 = "spLyr2"
arcpy.MakeFeatureLayer_management(snapPoints,spLyr2)
# make road feature layer from integrated roads
rdLyr = "rdLyr"
arcpy.MakeFeatureLayer_management(roadLines,rdLyr)
# iterate through snap points
with arcpy.da.SearchCursor(snapPoints,["OID@","SHAPE@"]) as cursor:
for row in cursor:
# set the expression to use for selecting the current snap point feature using its oid
exp = "{0} = {1}".format(oidField,row[0])
# create a geometry object from the shape of the snap point
ptGeom = row[1]
# select the current snap point in the cursor by id
currentSnap = arcpy.SelectLayerByAttribute_management(spLyr,"NEW_SELECTION",exp)
# make a feature layer from the current snap point
currentLyr = "currentLyr"
arcpy.MakeFeatureLayer_management(currentSnap,currentLyr)
# select the roads from the roads feature layer which are within search distance
# of the current snap point (feature layer)
nearbyRoads = arcpy.SelectLayerByLocation_management(rdLyr,"WITHIN_A_DISTANCE",currentLyr,searchDistance,"NEW_SELECTION")
count = int(arcpy.GetCount_management(nearbyRoads)[0])
if count > 0:
# create a temporary geometry object of the selected line features
lnGeom = arcpy.CopyFeatures_management(nearbyRoads,arcpy.Geometry())
# iterate through line features and check if they are "disjoint" from the current
# snap point geometry object
for ln in lnGeom:
# get the endpoints of the current line
fpLn = ln.firstPoint
lpLn = ln.lastPoint
if ptGeom.disjoint(fpLn) and ptGeom.disjoint(lpLn):
# if both endpoints are disjoint(true), add this snap feature's id
# to the list of unsnapped
ids.append(row[0])
# add the current snap point to the second feature layer
# in preparation for exporting to new feature class of unsnapped points
arcpy.SelectLayerByAttribute_management(spLyr2,"ADD_TO_SELECTION",exp)
arcpy.AddMessage("Unsnapped: " + str(len(ids)))
if len(ids) != 0:
arcpy.CopyFeatures_management(spLyr2,outFc)
arcpy.AddMessage("Script finished. To view unsnapped, add {0} to map.".format(outFc))
else:
arcpy.AddMessage("Script finished. No unsnapped segments.")
del row, cursor, currentLyr, rdLyr, spLyr, spLyr2
except Exception as e:
arcpy.AddError("Error: " + str(e))
arcpy.AddError(arcpy.GetMessages())

ArcGIS Desktop Documentation