Find Similar Locations

The Find Similar Locations tool measures how similar candidate locations are to one or more reference locations, based on the fields you specify. The resulting layer contains the candidate locations ranked by their similarity to the reference locations.

Analysis using GeoAnalytics Tools

Analysis using GeoAnalytics Tools is run using distributed processing across multiple ArcGIS GeoAnalytics Server machines and cores. GeoAnalytics Tools and standard feature analysis tools in ArcGIS Enterprise have different parameters and capabilities. To learn more about the standard tool Find Similar Locations, see Find Similar Locations tool help. To learn more about the differences between the tools, see Feature analysis tool differences.

Workflow diagram

Find Similar Locations workflow diagram

Examples

  • Which of your stores are most similar to your top performers with regard to customer profiles?

  • Based on characteristics of the villages hardest hit by the disease, which other villages are high risk?

  • A town's after-school fitness program was extremely successful. Promoters want to find other towns with similar characteristics to serve as candidates for program expansion.

  • A crime analyst wants to search the database of all crimes to see if a recent crime might be part of a larger pattern or trend.

  • A human resources manager may want to justify company salary ranges. Once she identifies cities that are similar in terms of size, cost of living, and amenities, she can examine salary ranges for positions of interest and determine if they are in line with company salaries.

Usage notes

A table or point, line, or area features can be used as input.

The reference locations can be all of the features in the input layer or a selection of them. A selection can be made interactively using the Select button or by a filter using the Query button. Multiple features can be selected or unselected using the Select button and a mouse click. Only one query can be used to make a selection on the reference layer.

An input candidate layer is required. The features in the candidate layer will be ranked by similarity to the reference locations.

Ranked similarity is based on the fields specified in the Base similarity on parameter. More than one field can be specified. Only numeric fields with names matching the reference layer can be selected. Features with the lowest rank number have the highest similarity to the reference layer.

By default, all of the features in the candidate locations layer will be ranked from most to least similar. The Show me parameter can be used to specify the number of features you want returned.
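The ranking behavior described above can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: the function name `rank_candidates`, the `show_me` argument (named after the Show me parameter), and the example scores are all hypothetical, and the sketch assumes a dissimilarity score (lower means more similar) has already been computed for each candidate.

```python
def rank_candidates(scores, show_me=None):
    """Rank candidates from most to least similar.

    scores: dict mapping a candidate name to a dissimilarity score
            (hypothetical input; lower means more similar).
    show_me: optional number of top-ranked candidates to return.
    """
    # Sort by score so the most similar candidate comes first.
    ordered = sorted(scores.items(), key=lambda item: item[1])
    # Rank numbers start at 1; the lowest rank is the most similar.
    ranked = [(rank, name) for rank, (name, _) in enumerate(ordered, start=1)]
    return ranked if show_me is None else ranked[:show_me]


# Hypothetical scores for three candidate stores.
example_scores = {"store_a": 3.0, "store_b": 0.5, "store_c": 1.2}
all_ranked = rank_candidates(example_scores)
top_two = rank_candidates(example_scores, show_me=2)
```

Here `all_ranked` lists every candidate, while `top_two` keeps only the two most similar ones.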

The Determine the most and least similar using parameter allows you to specify how features are matched. You may select field values or field profiles.

  • For field values, the most similar candidates will have the smallest sum of squared differences across all the fields you Base similarity on; all values are standardized before differences are calculated.
  • For field profiles, the cosine similarity is measured. Cosine similarity looks for the same relationships among standardized attribute values rather than trying to match magnitudes. Suppose there are three fields to Base similarity on called A1, A2, and A3, where A2 is twice as large as A1 and A3 is almost equal to A2. For the field profiles method, the tool looks for candidates with those same attribute relationships: A2 twice as large as A1, and A3 almost equal to A2. Because this method looks at relationships between attributes, you must specify a minimum of two fields to Base similarity on. You might use the cosine similarity method (field profiles) to find places like Los Angeles, but at a smaller scale overall, for example, when you are interested in the profile of population, number of cars, and number of residents younger than 20. The cosine similarity index ranges from 1.0 (perfect similarity) to -1.0 (perfect dissimilarity). The cosine similarity index is written to the simindex (Cosine similarity) field of the output features.
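The two matching methods can be compared with a minimal Python sketch. It is not the tool's code. It assumes the attribute values have already been standardized (the source does not say how the tool standardizes them), and the function names and example vectors are hypothetical.

```python
import math


def sum_squared_differences(reference, candidate):
    """Field values method: a smaller result means a closer match."""
    return sum((r - c) ** 2 for r, c in zip(reference, candidate))


def cosine_similarity(reference, candidate):
    """Field profiles method: 1.0 is most similar, -1.0 is least similar."""
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm = math.sqrt(sum(r * r for r in reference)) * math.sqrt(
        sum(c * c for c in candidate)
    )
    # Assumption: treat an all-zero vector as having no measurable profile.
    return dot / norm if norm else 0.0


# Hypothetical, already standardized values for fields A1, A2, A3.
reference = [1.0, 2.0, 2.1]      # A2 is twice A1, A3 is almost equal to A2
same_profile = [2.0, 4.0, 4.2]   # same relationships, larger magnitudes
near_values = [1.1, 1.9, 2.0]    # magnitudes close to the reference

profile_score = cosine_similarity(reference, same_profile)
value_score_far = sum_squared_differences(reference, same_profile)
value_score_near = sum_squared_differences(reference, near_values)
```

In this sketch `same_profile` scores essentially 1.0 on cosine similarity even though its magnitudes are double the reference, while the field values measure prefers `near_values` because its sum of squared differences is much smaller.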

All of the fields used for matching are written to the output. The Choose fields to add to results parameter allows you to specify additional fields to add to the output table, if desired. By default, all fields are added.

Limitations

  • The reference layer and candidate layer must have at least one numeric field with a matching name.
  • When using the field profiles method the reference layer and candidate layer must have at least two numeric fields with a matching name.
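These two requirements can be checked before running an analysis. The sketch below is illustrative only: it assumes each layer's schema is available as a plain dictionary mapping field name to a type label, and the function names, type labels, and method strings are hypothetical rather than part of any ArcGIS API.

```python
# Hypothetical type labels that count as numeric in this sketch.
NUMERIC_TYPES = {"int", "float"}


def shared_numeric_fields(reference_schema, candidate_schema):
    """Return field names that are numeric in both layers, sorted by name."""
    return sorted(
        name
        for name, field_type in reference_schema.items()
        if field_type in NUMERIC_TYPES
        and candidate_schema.get(name) in NUMERIC_TYPES
    )


def meets_field_requirement(method, shared_fields):
    """Field profiles needs at least two matching fields; field values needs one."""
    minimum = 2 if method == "field profiles" else 1
    return len(shared_fields) >= minimum


# Hypothetical schemas for a stores layer and a parcels layer.
stores = {"median_income": "float", "population": "int", "name": "str"}
parcels = {"median_income": "float", "zoning": "str"}

shared = shared_numeric_fields(stores, parcels)
```

With these example schemas only `median_income` is shared, so the field values method could run but the field profiles method could not.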

How Find Similar Locations works

To use Find Similar Locations, you provide the reference locations, the candidate search locations, and the fields representing the criteria you want to match. The layer you select for analysis should contain your reference or benchmark locations. For example, your reference locations might be a layer containing your top performing stores or the villages hardest hit by a disease. You then specify the layer containing your candidate search locations. This might be all of your stores or all other villages. Finally, you identify one or more fields to use for measuring similarity. Find Similar Locations will then rank all of the candidate search locations by how closely they match your reference locations across all of the fields you have selected.

In some cases, your analysis layer will contain both the reference locations and the candidate search locations. You may have a single layer containing all of your stores, for example, and want to rank them from most to least similar to your top performing store. Use your stores layer as both your analysis layer and your candidate search layer. You must then identify, using one of the selection tools, which store is your top performer. You can select your reference locations interactively or by building a query. Alternatively, create a copy of the stores layer so there are two versions in the table of contents. Click the filter button under the first copy and define a Filter to select your top performer. Then click the filter button under the second copy and define a Filter to select the candidate search locations (which may be all of the stores except your top performer). The first layer is your analysis layer (click Perform Analysis below the layer or the Analysis button at the top of your map, and navigate to Find Similar Locations by expanding the Find Locations category). Specify the second layer for the Search for similar locations in parameter. These are your candidate search locations.

In other cases, you will have separate reference and candidate search layers. You may have a stores layer that includes your top performer with fields describing the store customer base (fields such as median income and marital status for example) and a second layer of candidate parcels from which you will determine the best location to build a new store. In this case, if the reference locations layer includes more than just your reference locations, you must first identify the reference locations using one of the selection tools described above. If your layer only includes your reference locations (your top performing store, for example), you do not need to make a selection. You would specify your parcels layer for the candidate search locations (parameter two). If both the parcels and your top performing store have fields describing the customer base, you can run Find Similar Locations to identify candidate parcels with demographic characteristics most like the customers for your best performing store.

If there is more than one reference location, similarity will be based on averages for the fields you specify. For example, if there are two reference locations and you are interested in matching population, the tool will look for candidate search locations with populations that are most like the average population of the two reference locations. If the values for the reference locations are 100 and 102, for example, the tool will look for candidate search locations with populations near 101. Consequently, you will want to use reference locations that have similar values for the fields you specify. If, for example, the population value for one reference location is 100 and the other is 100,000, the tool will look for candidate search locations with population values near the average of those two values: 50,050. Notice that this averaged value is nothing like the population of either reference location.
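The averaging arithmetic in this paragraph can be reproduced with a short Python sketch. The function name `reference_target` and the row layout (one dictionary per reference location) are hypothetical and only illustrate the calculation described above.

```python
from statistics import mean


def reference_target(reference_rows, fields):
    """Average each field across all reference locations."""
    return {field: mean(row[field] for row in reference_rows) for field in fields}


# Two reference locations with similar populations.
similar = reference_target(
    [{"population": 100}, {"population": 102}], ["population"]
)

# Two reference locations with very different populations.
dissimilar = reference_target(
    [{"population": 100}, {"population": 100_000}], ["population"]
)
```

In the first case the target population is 101, close to both reference locations. In the second it is 50,050, which resembles neither of them.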

Similar tools

Use Find Similar Locations to measure the similarity of locations in a candidate layer to those in a reference layer. Other tools may be useful in solving similar but slightly different problems.

Map Viewer analysis tools

If you would like to find similar locations using the standard analysis tools, see Find Similar Locations.

If you are trying to select existing locations with a query, use the standard tool Find Existing Locations.

If you are trying to use a query to create new features, use the standard tool Derive New Locations.

ArcGIS Desktop analysis tools

The GeoAnalytics Tools version of Find Similar Locations is available in ArcGIS Pro.