Where to catch 'em all?
A geographic analysis of Pokémon Go locations

Levente Juhász and Hartwig Hochmair
University of Florida

Wageningen, The Netherlands, May 9, 2017

Pokémon Go

Augmented reality smartphone app

Released in July 2016
To catch virtual characters through the camera
Players need to navigate to certain points on a map


PokéStops are associated with landmarks
Players need to visit PokéStops periodically
PokéStops are user generated

Pokémon Gym

Gyms are also landmarks that can be "owned" by a team
Gyms are defended by Pokémon
Gyms are where battles happen

Spawnpoint (Pokémon location)

Pokémon can pop up for 30 minutes
So players can catch them with a Pokéball


Insane popularity

Pokémon Go players everywhere
Media attention - bad
Media attention - good
Monetization by third parties

Motivation +1

Can I get my hands on the data?

Research questions


Identify the effects of socio-economic factors on the number of PokéStops in a US census block group


For PokéStops, gyms and spawnpoints, quantify the point distributions across different land use categories


For PokéStops and gym locations, compare their spatial clustering patterns


Determine the quality of the "PokéStop nearby" attribute in Yelp

Data collection

Programmatically reproduce the app's behavior
To scan custom geographic areas
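
The slides do not describe how a custom area was covered, so the following is only a minimal sketch of one plausible approach: tiling a bounding box with evenly spaced scan centers. The function name `scan_grid`, the 500 m spacing, and the bounding box are all hypothetical, and the step that would actually query the game service is left out because the source gives no details about it.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # rough approximation, adequate for a sketch


def scan_grid(south, west, north, east, step_m):
    """Return (lat, lon) scan centers spaced about step_m metres apart."""
    dlat = step_m / METERS_PER_DEGREE_LAT
    # Longitude degrees shrink with latitude, so scale by cos(mid latitude).
    mid_lat = math.radians((south + north) / 2)
    dlon = step_m / (METERS_PER_DEGREE_LAT * math.cos(mid_lat))
    n_rows = int((north - south) / dlat) + 1
    n_cols = int((east - west) / dlon) + 1
    return [
        (south + i * dlat, west + j * dlon)
        for i in range(n_rows)
        for j in range(n_cols)
    ]


# Hypothetical bounding box, chosen only for illustration.
centers = scan_grid(29.60, -82.40, 29.70, -82.30, step_m=500)
print(len(centers), "scan centers")
```

Each center would then be visited by whatever client reproduces the app's requests.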
Study areas


RQ1: Relationship between socio-economic factors and PokéStop counts

PokéStop counts aggregated on US Census block groups

Negative binomial regression model

  • Manual stepwise approach
  • Akaike information criterion
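
As an illustration of how AIC can rank candidate count models, here is a minimal sketch that assumes the common NB2 parameterization (variance = mu + alpha * mu^2). The counts, fitted means, and dispersion value are hypothetical, and the dispersion is held fixed rather than estimated; this is not the model fitted in the study.

```python
import math


def nb2_logpmf(y, mu, alpha):
    """Log-probability of count y under NB2 with mean mu and dispersion alpha."""
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical PokéStop counts for six block groups.
counts = [0, 2, 1, 7, 0, 3]
alpha = 0.5  # hypothetical dispersion, fixed for simplicity

# Two hypothetical candidates: (fitted means, number of parameters).
mean_count = sum(counts) / len(counts)
candidates = {
    "intercept only": ([mean_count] * len(counts), 2),
    "one predictor": ([0.6, 2.1, 1.2, 6.0, 0.5, 2.6], 3),
}

for name, (mus, k) in candidates.items():
    ll = sum(nb2_logpmf(y, mu, alpha) for y, mu in zip(counts, mus))
    print(f"{name}: AIC = {aic(ll, k):.2f}")
```

In a manual stepwise procedure, a predictor would be kept only if adding it lowers the AIC.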

Positive association between PokéStop counts and

Negative association between PokéStop counts and

RQ2: Point distributions across land use categories

Compare observed counts to expected counts under complete spatial randomness on different land use categories

expected count = number of points × proportion of area
Chi-squared test of independence
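
The expected-count formula and the chi-squared statistic can be sketched as follows. The land use categories, counts, and area shares are hypothetical, and the sketch stops at the Pearson statistic and its degrees of freedom; looking up the p-value is left out.

```python
# Hypothetical inputs: observed point counts and each category's share of the area.
observed = {"park": 30, "residential": 50, "commercial": 20}
area_share = {"park": 0.1, "residential": 0.7, "commercial": 0.2}


def expected_counts(observed, area_share):
    """Expected count per category under complete spatial randomness."""
    n_points = sum(observed.values())
    return {k: n_points * area_share[k] for k in observed}


def chi_squared(observed, expected):
    """Pearson chi-squared statistic comparing observed and expected counts."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)


expected = expected_counts(observed, area_share)
stat = chi_squared(observed, expected)
dof = len(observed) - 1
print(expected, round(stat, 2), dof)
```

A category with more points than its area share predicts (here, the hypothetical "park") contributes most of the statistic.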

Observed counts differ significantly from expected counts
Relative count index

RQ3: Spatial clustering of PokéStops and gyms

Are they similarly clustered?
Step 1: are they closer than would be expected under complete spatial randomness (CSR)?

Compute two sets of nearest neighbor distances:

  • Between PokéStops and gyms
  • Between random points and gyms
PokéStop - Gym: mean=230m; median=216m
Random points - Gym: mean=301m; median=261m
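
The comparison above can be sketched with a brute-force nearest neighbor search. This assumes projected coordinates in metres and measures, for each PokéStop or random point, the distance to its nearest gym; the slides do not state the direction of the measurement, and all coordinates below are hypothetical.

```python
import math
import random
import statistics


def nearest_distances(sources, targets):
    """Distance from each source point to its nearest target point."""
    return [min(math.dist(s, t) for t in targets) for s in sources]


def random_points(n, xmin, ymin, xmax, ymax, seed=0):
    """Uniform random points in a rectangle, standing in for the CSR baseline."""
    rng = random.Random(seed)
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]


# Hypothetical coordinates in metres.
gyms = [(100, 100), (400, 250), (800, 700)]
pokestops = [(120, 90), (380, 300), (650, 640), (820, 710)]
randoms = random_points(len(pokestops), 0, 0, 1000, 1000)

for label, pts in (("PokéStop - Gym", pokestops), ("Random points - Gym", randoms)):
    d = nearest_distances(pts, gyms)
    print(f"{label}: mean={statistics.mean(d):.0f}m; median={statistics.median(d):.0f}m")
```

For real data sets a spatial index would replace the nested loop, but the quantity computed is the same.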
Step 2: similarity of clustering

Cross K-function

Significance testing with Monte-Carlo simulation
Random labelling approach
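
A minimal sketch of the idea, assuming a naive cross K estimator without edge correction and a simulation envelope built from the minimum and maximum of the simulated values. The coordinates, the 100 m radius, and the 99 simulations are hypothetical choices, not settings reported in the slides.

```python
import math
import random


def cross_k(points_a, points_b, r, area):
    """Naive cross K estimate at distance r (no edge correction)."""
    pairs = sum(1 for a in points_a for b in points_b if math.dist(a, b) <= r)
    return area * pairs / (len(points_a) * len(points_b))


def random_labelling_envelope(points_a, points_b, r, area, n_sim=99, seed=0):
    """Min and max cross K when type labels are shuffled over fixed locations."""
    rng = random.Random(seed)
    pooled = list(points_a) + list(points_b)
    n_a = len(points_a)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(pooled)  # locations stay put; only the labels change
        sims.append(cross_k(pooled[:n_a], pooled[n_a:], r, area))
    return min(sims), max(sims)


# Hypothetical coordinates in metres within a 1 km x 1 km window.
pokestops = [(120, 90), (380, 300), (650, 640), (820, 710), (500, 500)]
gyms = [(100, 100), (400, 250), (800, 700)]
area = 1_000_000

observed = cross_k(pokestops, gyms, r=100, area=area)
low, high = random_labelling_envelope(pokestops, gyms, r=100, area=area)
print(observed, low, high)
```

An observed value that stays inside the envelope gives no evidence that the two point types are arranged differently at that distance.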
PokéStops and gyms are similarly clustered in most study sites!

RQ4: User tagging of Yelp businesses

"PokéStop nearby" attribute

1,400 businesses tagged
Out of 21,600
Are tagged Yelp businesses closer to PokéStops than regular businesses?

Compute two sets of nearest neighbor distances:

  • Between tagged Yelp businesses and PokéStops
  • Between all Yelp businesses and PokéStops
Tagged Yelp business - PokéStop: mean=53m; median=38m
Yelp business - PokéStop: mean=103m; median=58m
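
If business and PokéStop locations are available as latitude/longitude pairs (an assumption; the slides do not state the coordinate system), distances in metres can be obtained with the haversine formula. The sketch below uses a spherical Earth with a mean radius of 6,371 km and hypothetical coordinates.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_pokestop_m(business, pokestops):
    """Distance from one business to its nearest PokéStop."""
    return min(haversine_m(*business, *p) for p in pokestops)


# Hypothetical (lat, lon) pairs.
businesses = [(29.6516, -82.3248), (29.6490, -82.3430)]
pokestops = [(29.6520, -82.3250), (29.6436, -82.3549)]

distances = [nearest_pokestop_m(b, pokestops) for b in businesses]
print([round(d) for d in distances])
```

Running this once for the tagged businesses and once for all businesses would give the two distance sets being compared.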
Yelp Tagging intensity

Metropolitan: 10 - 15%
Suburban: 1 - 5%
Rural: -


We can trick a closed service and scrape data out of it!
Disadvantaged areas in terms of PokéStop density exist

Negative relationship between PokéStop counts and:

  • % of African-American population
  • % of Hispanic population
Pokémon point counts are positively associated with:

  • Parks
  • Universities/Colleges
  • Business and touristic activities
Strong geographic bias between urban and rural areas

PokéStops and gyms tend to cluster similarly

VGI (Yelp) based on Pokémon tends to be accurate

For the next iteration of location-based games, developers should address the strong geographic biases

Thank you!