Using Geographic Information Systems and Cluster Analysis for Drawing Comparative Geographic Areas in Quasi-Experimental Designs: A Police Patrol Example (Concept Paper)

Tory Caeti, University of North Texas

In any experimental design, it is inherently preferable to draw experimental and control groups randomly. Examples from the criminal justice literature include the Kansas City Preventive Patrol Experiment, the Minneapolis Domestic Violence Experiment, and the San Diego Field Interrogation Experiment. However, random assignment is typically not an option when the experimental units have been selected a priori. In this research, the Houston Police Department selected seven police beats in the city for additional police resources based on total Part I crimes, excluding shoplifting (to avoid selecting beats with shopping malls). The additional resources would take the form of extra patrol units in the seven targeted beats, freed from calls for service and directed to deter and prevent serious crime in the beat. In short, the department was concerned with targeting those beats most in need of additional police resources, not with "maintaining experimental integrity," as it attempted to reduce serious crime in its most troubled neighborhoods. Therefore, random assignment in the research was not possible. The targeted beats were spread throughout the city and reflected varying socio-demographic populations. Selection of comparison beats for a quasi-experimental design presents unique difficulties. Typically, when the unit of analysis is the individual, researchers try to match individuals on as many characteristics as possible to increase comparability and reduce error. When the unit of analysis is a geographic area, researchers typically use relatively few matching variables for a variety of reasons. Chief among these reasons are lack of information, difficulty in matching what information is available to the geographic units under study, and the simple need to proceed expeditiously with the research.
The advent of Geographic Information Systems (GIS) alleviates most of the problems mentioned, along with a host of other problems associated with finding information about a geographic area. The choice of comparison groups in a quasi-experiment is typically based on matching variables or on some other logical comparison point. The researcher selects comparison groups based on characteristics that both groups possess equally. In some research, the choice has been based on the research question being asked. For example, if a group of prisoners is subjected to an innovative treatment, their success is gauged by comparison with a control group of similar prisoners who did not receive the treatment. The same logic applies to geographic areas that receive some treatment when we wish to gauge the effectiveness of the treatment. The choice of comparison geographic entities, whether cities, neighborhoods, or countries, is based on similarity. The conclusions drawn from a quasi-experiment must account for and be made in reference to the comparability of the experimental and comparison groups. If the two groups differ substantially, the validity of the comparisons and the final conclusions can be called into question. There is a need to reduce the amount of error introduced in the matching process. One way to accomplish this is to match the experimental and control groups on as many comparable qualities as possible. The greater the number of similarities among comparison variables, the smaller the chance that some rival causal factor or extraneous variable is responsible for any observed changes. In other words, the more sources of error that can be limited or controlled in the matching process, the more likely it is that we can dismiss rival causal factors in the analysis and conclude that the treatment was responsible for significant changes, if any are found.
The goal of reducing as much error as possible is the justification for using GIS and cluster analysis to select the matching beats for comparison in this analysis. GIS allows an accurate measurement of socio-demographic variables within a fixed area. As such, a vast array of census variables could potentially be used to compare the experimental beats with all the other beats in the city in order to select a closely matching comparison beat for each. However, when a large number of variables have been selected, simple side-by-side comparison becomes impractical because of the complexity and sheer number of variables. Cluster analysis allows comparison across a large number of variables without introducing researcher bias into the selection process.
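
The matching logic described above can be illustrated with a minimal sketch. It is not the procedure used in the Houston research, whose variables and clustering method are not specified here. The sketch assumes three invented census-style variables, six hypothetical beats, z-score standardization, average-linkage agglomerative clustering, and a fixed number of clusters (k = 3). Each targeted beat is then paired with the nearest untargeted beat in its own cluster. All beat IDs, values, and function names are hypothetical.

```python
import math
from statistics import mean, pstdev

# Hypothetical beat profiles: every ID, variable, and value below is invented
# for illustration and is not taken from the Houston data.
# Columns: median income ($1000s), % renter-occupied, density (1000s per sq mi)
BEATS = {
    "A1": [22.0, 71.0, 9.5],
    "A2": [24.0, 68.0, 9.0],
    "B1": [55.0, 30.0, 3.2],
    "B2": [58.0, 28.0, 3.0],
    "C1": [38.0, 50.0, 6.1],
    "C2": [36.0, 52.0, 6.4],
}
TARGETS = {"A1", "C1"}  # hypothetical beats chosen a priori for extra patrol


def standardize(profiles):
    """Convert each variable to z-scores so no single scale dominates distance."""
    ids = list(profiles)
    columns = list(zip(*(profiles[i] for i in ids)))
    # Fall back to 1.0 when a variable has zero variance to avoid dividing by zero.
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    return {i: [(x - m) / s for x, (m, s) in zip(profiles[i], stats)] for i in ids}


def average_linkage(c1, c2, z):
    """Mean pairwise Euclidean distance between the members of two clusters."""
    return mean(math.dist(z[a], z[b]) for a in c1 for b in c2)


def cluster(z, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[i] for i in z]
    while len(clusters) > k:
        pairs = [
            (average_linkage(clusters[i], clusters[j], z), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)  # i < j, so deleting index j leaves index i valid
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def match_comparisons(profiles, targets, k):
    """Pair each targeted beat with the nearest untargeted beat in its cluster."""
    z = standardize(profiles)
    matches = {}
    for members in cluster(z, k):
        candidates = [b for b in members if b not in targets]
        for t in (b for b in members if b in targets):
            # None signals that the cluster holds no untargeted beat to compare with.
            matches[t] = min(
                candidates, key=lambda c: math.dist(z[t], z[c]), default=None
            )
    return matches


matches = match_comparisons(BEATS, TARGETS, k=3)
print(matches)
```

With these invented values, beat A1 is paired with A2 and beat C1 with C2. The number of clusters, the linkage rule, and the distance measure are still analyst choices, so a sketch like this reduces subjective, case-by-case selection of comparison beats without removing judgment from the design altogether.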


Updated 05/20/2006