Where to catch 'em all? A geographic analysis of Pokémon Go locations
Levente Juhász and Hartwig Hochmair University of Florida
Wageningen, The Netherlands, May 9, 2017
Pokémon Go
Augmented reality smartphone app
Released in July 2016
To catch virtual characters through the camera
Players need to navigate to certain points on a map
PokéStop
PokéStops are associated with landmarks
Players need to visit PokéStops periodically
PokéStops are user generated
Pokémon Gym
Gyms are also landmarks that can be "owned" by a team
Gyms are defended by Pokémons
Gyms are where battles happen
Spawnpoint (Pokémon location)
Pokémons can pop up for 30 minutes
So players can catch them with a Pokéball
Motivation
Insane popularity
Pokémon Go players everywhere
Media attention - bad
Media attention - good
Monetization by third parties
Motivation +1
Can I get my hands on the data?
Research questions
RQ1:
Identify the effects of socio-economic factors on the number of PokéStops in a US census block group
RQ2:
For PokéStops, gyms and spawnpoints, quantify the point distributions across different land use categories
RQ3:
For PokéStops and gym locations, compare their spatial clustering patterns
RQ2: Point distributions across land use categories
Compare observed counts to expected counts under complete spatial randomness on different land use categories
expected count = # of points x proportion of area
Chi-squared test of independence
Observed counts differ significantly from expected counts
Relative count index
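A minimal sketch of the RQ2 comparison, under stated assumptions: the land use categories, area shares and counts below are hypothetical, not study data. The statistic is the Pearson chi-squared sum over observed and expected counts, and the relative count index is assumed here to be observed divided by expected, since the slides do not spell out its definition.

```python
# Hypothetical inputs for illustration only (not taken from the study).
area_share = {"residential": 0.50, "commercial": 0.20, "park": 0.10, "other": 0.20}
observed = {"residential": 30, "commercial": 40, "park": 20, "other": 10}


def expected_counts(observed, area_share):
    """Expected count = number of points x proportion of area."""
    n_points = sum(observed.values())
    return {k: n_points * area_share[k] for k in observed}


def chi_squared(observed, expected):
    """Pearson chi-squared statistic over all categories."""
    return sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)


def relative_count(observed, expected):
    """Assumed definition: observed / expected (>1 means over-represented)."""
    return {k: observed[k] / expected[k] for k in observed}


expected = expected_counts(observed, area_share)
stat = chi_squared(observed, expected)
ratios = relative_count(observed, expected)

# Critical value of the chi-squared distribution, df = 3, alpha = 0.05.
CRITICAL_DF3_05 = 7.815
print(stat, stat > CRITICAL_DF3_05)
print(ratios)
```

With these made-up numbers, a ratio above 1 marks a land use category that holds more points than its area share would suggest under complete spatial randomness.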
RQ3: Spatial clustering of PokéStops and gyms
Are they similarly clustered?
Step 1: are they closer than would be expected under CSR (complete spatial randomness)?
Compute two sets of nearest neighbor distances:
Between PokéStops and gyms
Between random points and gyms
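The two distance sets could be computed roughly as follows. This is a sketch, assuming projected coordinates in metres, distances measured from each PokéStop (or random point) to its nearest gym, and random points drawn uniformly in a rectangular study area; the coordinates and the helper name `nearest_distances` are hypothetical.

```python
import math
import random
from statistics import mean, median


def nearest_distances(sources, targets):
    """Distance from each source point to its nearest target point."""
    return [
        min(math.hypot(sx - tx, sy - ty) for tx, ty in targets)
        for sx, sy in sources
    ]


# Hypothetical coordinates in metres (not study data).
gyms = [(0.0, 0.0), (1000.0, 0.0), (500.0, 800.0)]
pokestops = [(100.0, 0.0), (900.0, 300.0), (450.0, 700.0)]

# Random points under CSR in an assumed 1000 m x 1000 m study area.
rng = random.Random(42)
random_points = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(500)]

stop_to_gym = nearest_distances(pokestops, gyms)
random_to_gym = nearest_distances(random_points, gyms)

print("PokéStop - Gym:", mean(stop_to_gym), median(stop_to_gym))
print("Random points - Gym:", mean(random_to_gym), median(random_to_gym))
```

The brute-force search is fine for small samples; a spatial index would be the usual choice for city-scale point sets.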
PokéStop - Gym: mean=230m; median=216m
Random points - Gym: mean=301m; median=261m
Step 2: similarity of clustering
Cross K-function
Significance testing with Monte-Carlo simulation
Random labelling approach
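A rough sketch of the cross K-function with a random labelling test. It assumes a naive estimator without edge correction and a min/max simulation envelope; the function names, parameters and point coordinates are hypothetical, and the study may have used a different estimator or software.

```python
import math
import random


def cross_k(points_a, points_b, r, area):
    """Naive cross K estimate at distance r (no edge correction)."""
    pairs = sum(
        1
        for ax, ay in points_a
        for bx, by in points_b
        if math.hypot(ax - bx, ay - by) <= r
    )
    return area * pairs / (len(points_a) * len(points_b))


def random_labelling_envelope(points_a, points_b, r, area, n_sim=99, seed=0):
    """Shuffle the type labels over the pooled points and return the
    minimum and maximum simulated cross K values."""
    rng = random.Random(seed)
    pooled = list(points_a) + list(points_b)
    n_a = len(points_a)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(pooled)
        sims.append(cross_k(pooled[:n_a], pooled[n_a:], r, area))
    return min(sims), max(sims)


# Hypothetical points in a 1000 m x 1000 m study area (not study data).
rng = random.Random(1)
stops = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(60)]
gyms = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(20)]

observed_k = cross_k(stops, gyms, r=200, area=1000 * 1000)
low, high = random_labelling_envelope(stops, gyms, r=200, area=1000 * 1000)
print(observed_k, low, high)
```

An observed value inside the envelope is consistent with random labelling, that is, with the two point types sharing the same clustering pattern at that distance.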
Cross K-function
PokéStops and gyms are similarly clustered in most study sites!
Outliers
RQ4: User tagging of Yelp businesses
"PokéStop nearby" attribute
1,400 businesses tagged
Out of 21,600
Are tagged Yelp businesses closer to PokéStops than regular businesses?
Compute two sets of nearest neighbor distances:
Between tagged Yelp businesses and PokéStops
Between all Yelp businesses and PokéStops
Tagged Yelp business - PokéStop: mean=53m; median=38m
Yelp business - PokéStop: mean=103m; median=58m
Yelp tagging intensity
Metropolitan:
10 - 15%
Suburban:
1 - 5%
Rural:
-
Summary
We can trick a closed service and scrape data out of it!
Disadvantaged areas in terms of PokéStop density exist
Negative relationship between PokéStop counts and: