This page was recovered from my old site and is, unfortunately, missing some graphical elements. The text has been untouched.
Violent crime clustering came up as a possible analysis topic as I completed a recent paper using the SaTScan spatial statistic. I realized that this method could be applied to more than epidemiological research and wanted to give it a shot to see the results. If you’d like to skip ahead to the web map, click the link below (but I wish you’d stay and read the geeky stuff):
Spatial clustering is the phenomenon wherein “…near things are more related than distant things” (Tobler, 1970). Practically speaking, it shows up in the natural world quite often: patches of flowers on the side of a hill could be a form of spatial clustering. Spatial clustering is a form of spatial autocorrelation because things which are proximate tend to share the same properties; they tend to be correlated with one another. Understanding spatial autocorrelation in geographic data is important because it:
…could reveal information about the underlying geographical process that generates the spatial pattern, which can further aid the comprehension of underlying geographical process and its relationship with the phenomenon under investigation. — Dr. Yongmei Lu
In his article, Dr. Lu gives an excellent treatment of the hot spot analysis used later in this study. In short, Dr. Lu refers to a hot spot as an area with “unusually high occurrence of point incidents.” At the risk of over-quoting Dr. Lu, I’ll bring up a few more points he makes regarding a few of the weaknesses of hot spot analyses:
On the other hand, high concentrations of point occurrence exist in the distribution no matter being recognized or not. Second, it is a relative rather than absolute concept. Areas with relatively higher concentration of incidents than immediate surroundings could probably show as hot spot, even though its absolute concentration may not be that high compared to entire area. Also, it is hard to identify clear boundary for hot spot due to the continuity of point distribution. Third, hot spots should be comparable in terms of “degree of hot” both within study and across studies under certain circumstances. There are needs to evaluate the intensity of hot spots’ both in research and practice. Lastly and very important, population at risk is an important factor to be considered for hot spot analysis. As discussed in previous section, point distribution is impacted by and therefore reflects its underlying process. — Dr. Yongmei Lu, ibid.
I obtained data containing the locations of schools in Texas from the Texas Education Agency, crime data from the Fort Worth Police Department (FWPD), census data at the census tract level from the United States Census Bureau (warning: link initiates a download), and a mask layer from U.S. county data obtained from Texas Parks and Wildlife. Finally, I created a file geodatabase from the shape data, which may be downloaded here.
(FGDB SHA256: e38150f06ed39de74e1d605b24d376fecf7a0a8937676186fba85aad2f1203a1)
To understand the differences between crimes committed near schools vs. those committed away from schools, a 500 meter (1640.42 feet) buffer was created around each school. I selected each crime that fell within the buffers and calculated the percentage of each crime out of the total. The results are shown in the following table.
Near schools, vandalism, burglary, “all other offenses,” motor vehicle theft, drug violations and public intoxication each made up a larger share of crimes than they did at locations more than 500 meters from schools.
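The buffer comparison can be sketched in a few lines of plain Python. This is only an illustration, not the workflow I actually ran: it assumes coordinates are already in a projected system measured in meters, treats "percentage out of the total" as the share within each group (near vs. far), and uses made-up schools and crimes. The function names are hypothetical.

```python
from collections import Counter
from math import hypot

BUFFER_M = 500.0  # buffer radius described in the text

def near_any_school(x, y, schools, radius=BUFFER_M):
    """True if the point lies within `radius` meters of at least one school."""
    return any(hypot(x - sx, y - sy) <= radius for sx, sy in schools)

def offense_shares(crimes, schools, radius=BUFFER_M):
    """Return (near, far) dicts mapping offense type to its percentage share."""
    near, far = Counter(), Counter()
    for x, y, offense in crimes:
        group = near if near_any_school(x, y, schools, radius) else far
        group[offense] += 1

    def to_pct(counts):
        total = sum(counts.values())
        return {k: 100.0 * v / total for k, v in counts.items()} if total else {}

    return to_pct(near), to_pct(far)

# Hypothetical toy data: (x, y) in meters, plus an offense label.
schools = [(0.0, 0.0), (3000.0, 0.0)]
crimes = [
    (100.0, 50.0, "Vandalism"),
    (-300.0, 200.0, "Burglary"),
    (3100.0, 100.0, "Vandalism"),
    (1500.0, 0.0, "Burglary"),
    (1500.0, 900.0, "Aggravated Assault"),
]

near_pct, far_pct = offense_shares(crimes, schools)
print("near schools:", near_pct)
print("away from schools:", far_pct)
```

A real run would swap the brute-force distance check for a GIS buffer and spatial join, but the bookkeeping is the same.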
Violent crime – which are considered?
For the next analyses, I selected only violent crime from the dataset, which included such crimes as:
- Aggravated Assault,
and several more.
In this study, a hot spot signifies an area where high or low values cluster spatially. The statistic is the Getis-Ord Gi* and is calculated:
where P is census tract population for feature j, n is the total number of features, and w_i,j is the spatial weight between features i and j. The inverse of a tract’s population gives the weighting scheme used in the analysis. This prevents the results from simply being a population clustering map. When the weighting is not applied, the clustered areas closely mirror the highly populated areas.
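The formula image did not survive the site recovery. For reference, the standard form of the Gi* statistic, as given in Esri's Hot Spot Analysis documentation, is written out below. It uses x_j for the attribute value at feature j, which is an assumption about how the symbols map onto the description above; how the inverse-population weighting was folded in is not recoverable from the text and is not shown.

```latex
G_i^{*} =
  \frac{\sum_{j=1}^{n} w_{i,j}\, x_j \;-\; \bar{X} \sum_{j=1}^{n} w_{i,j}}
       {S \sqrt{\dfrac{n \sum_{j=1}^{n} w_{i,j}^{2} - \left(\sum_{j=1}^{n} w_{i,j}\right)^{2}}{n-1}}},
\qquad
\bar{X} = \frac{1}{n}\sum_{j=1}^{n} x_j,
\qquad
S = \sqrt{\frac{1}{n}\sum_{j=1}^{n} x_j^{2} - \bar{X}^{2}}
```

The result is a z-score: large positive values mark clusters of high values, and large negative values mark clusters of low values.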
The results of this analysis highlight areas where there is significant spatial clustering of violent crime. The analysis is purely spatial and does not show any trends in crime rate.
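To make the hot spot statistic a bit more tangible, here is a minimal pure-Python sketch of the standard Gi* z-score (the form given in Esri's Hot Spot Analysis documentation). It is not the tool I ran: the five-feature "study area", the binary neighbor weights, and the function name are all made up for illustration, and no population weighting is applied.

```python
from math import sqrt

def getis_ord_gi_star(values, weights):
    """Gi* z-score for every feature.

    values  -- attribute value x_j for each of the n features
    weights -- n x n matrix, weights[i][j] = spatial weight between i and j
               (Gi* includes the feature itself, so weights[i][i] is nonzero)
    """
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * xj for wij, xj in zip(w, values)) - mean * sum_w
        denominator = s * sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores

# Hypothetical toy data: five features in a row, high values at one end.
values = [10, 10, 1, 1, 1]
# Binary weights: each feature is a neighbor of itself and adjacent features.
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(5)] for i in range(5)]

z = getis_ord_gi_star(values, weights)
print([round(score, 2) for score in z])
```

In this toy case the first feature comes out at about 2.0 (a hot spot) and the last at about -1.33 (leaning cold), which is the kind of contrast the map layer colors.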
SaTScan is a statistical package that uses the spatial scan statistic. In short, the method works by passing a moving window across the study area, comparing traits within the moving window against those outside of it. The size of the moving window is varied, and the process is repeated across 999 Monte Carlo replications to assess statistical significance. When a group of census tracts within the moving window exhibits a trend that is changing faster than expected, and when that difference is statistically significant, it is marked as a cluster. This method is vastly different from typical crime maps, which usually show crime density.
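The moving-window idea is easier to see in code. The sketch below is not SaTScan and not the trend-based model used for the map: it is a stripped-down, purely spatial Poisson scan (Kulldorff's likelihood ratio) on made-up data, meant only to show the two moving parts, the growing window and the Monte Carlo replications. All names, the tie-breaking rule, and the 50% population cap are assumptions of the sketch.

```python
import random
from math import log

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases."""
    if c <= e:
        return 0.0  # this sketch only looks for high-rate clusters
    inside = c * log(c / e)
    outside = (total - c) * log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside

def candidate_windows(coords, pops, max_share=0.5):
    """Grow a window around each tract, nearest neighbors first, up to a population cap."""
    limit = max_share * sum(pops)
    windows = set()
    for cx, cy in coords:
        # Ties in distance are broken by index, a simplification of a true circle.
        order = sorted(range(len(coords)),
                       key=lambda j: ((coords[j][0] - cx) ** 2 + (coords[j][1] - cy) ** 2, j))
        members, pop = [], 0
        for j in order:
            pop += pops[j]
            if pop > limit:
                break
            members.append(j)
            windows.add(frozenset(members))
    return windows

def best_window(cases, pops, windows):
    """Return the window with the largest likelihood ratio and that ratio."""
    total, total_pop = sum(cases), sum(pops)
    best, best_llr = None, 0.0
    for w in windows:
        c = sum(cases[j] for j in w)
        e = total * sum(pops[j] for j in w) / total_pop
        llr = poisson_llr(c, e, total)
        if llr > best_llr:
            best, best_llr = w, llr
    return best, best_llr

def scan(cases, pops, coords, n_sim=999, seed=0):
    """Most likely cluster plus a Monte Carlo p-value."""
    rng = random.Random(seed)
    windows = candidate_windows(coords, pops)
    cluster, observed = best_window(cases, pops, windows)
    total, exceed = sum(cases), 0
    for _ in range(n_sim):
        # Null hypothesis: cases fall on tracts in proportion to population.
        sim = [0] * len(cases)
        for j in rng.choices(range(len(cases)), weights=pops, k=total):
            sim[j] += 1
        if best_window(sim, pops, windows)[1] >= observed:
            exceed += 1
    return cluster, observed, (exceed + 1) / (n_sim + 1)

# Hypothetical toy data: nine equal-population tracts in a row, two with many cases.
coords = [(float(i), 0.0) for i in range(9)]
pops = [1000] * 9
cases = [2, 2, 2, 2, 30, 30, 2, 2, 2]

cluster, llr, p_value = scan(cases, pops, coords)
print(sorted(cluster), round(llr, 2), p_value)
```

The p-value is the rank of the observed maximum among the simulated maxima, which is why the number of replications is usually something like 999: it gives round p-values such as 0.001.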
A cluster, in this analysis, does not imply that an area has a violent crime problem. In this context, a cluster is a group of census tracts whose crime rates are either increasing or decreasing faster than expected for Fort Worth as a whole. Some clusters may have an increasing violent crime problem, and thus would be shown as clusters with a positive trend on the map. Conversely, some clusters may be greatly improving their crime rates. They would be shown as clusters with a negative trend on the map. Positive clusters call for further research to decide what needs to be done to improve the crime situation. Negative clusters should be noted, and those areas’ crime policies examined to decide what should be emulated.
This analysis has shown that there are several clusters, both spatial and spatio-temporal, of violent crime in Fort Worth, Texas, between 2004 and 2016.
Several limitations to the analyses exist. The first is that there are a few census tracts with unexpectedly large rate increases. The census tract between Old Decatur Road and Saginaw Boulevard, for example, was found to have a violent crime rate increasing by 91.04/100,000 annually. If you enable both the Spatial Hotspots and the Space-Time layers, you will see that there were two violent crimes committed on the western edge of the census tract. The large increase arises because there were no crimes between 2004 and 2014, and then five violent crimes occurred at decreasing intervals. The figure below shows this pattern. There is no value for 6/3/2011 because there was no violent crime recorded in the census tract prior to 4/1/2014.
The second issue is what is known as the small numbers problem. The reliability of statistics calculated for areas with small populations is questionable because when the denominator is small, the rate’s variance is high. The following quote from BioMedware is a good example:
To make this more concrete consider a simple example. Suppose a superfund site has been emitting known carcinogens for childhood leukemia into the ground water, and that a small community adjacent to the site relies on groundwater as its drinking water source. The rate for childhood leukemia in that town is quite high – 2-3 times the state average – but the town is small, and because of this the variance in the town childhood leukemia rate is high and is not statistically different from the state average.
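A quick back-of-the-envelope calculation shows why small denominators are a problem. The sketch below uses a rough normal approximation to the Poisson distribution for the interval, and the populations and case counts are made up; it is an illustration of the small numbers problem, not part of the analysis itself.

```python
from math import sqrt

def rate_with_interval(cases, population, per=100_000, z=1.96):
    """Rate per `per` people with a rough 95% interval (normal approximation)."""
    rate = cases / population * per
    half_width = z * sqrt(cases) / population * per
    return rate, max(rate - half_width, 0.0), rate + half_width

# Hypothetical areas with the same underlying rate but very different sizes.
small = rate_with_interval(3, 1_000)
large = rate_with_interval(3_000, 1_000_000)

for label, (rate, low, high) in (("small", small), ("large", large)):
    print(f"{label}: {rate:.1f} per 100,000 (roughly {low:.1f} to {high:.1f})")
```

Both areas have a rate of about 300 per 100,000, but the interval for the small area is so wide that it tells you very little, while the interval for the large area is tight.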
However, because this was merely an informal exploratory analysis, and because the results of the analysis largely show what I and others see in reality, I see no reason to question the overall findings.
This analysis was done purely for my enjoyment, and I hope it is beneficial to the community.