Analyses of swisscom data

Grid preparation

Ancillary data

Municipality boundaries

st_gg21 <- st_read("data-raw/swisstopo/swissboundaries3d_2021-07_2056_5728/SHAPEFILE_LV95_LN02/swissBOUNDARIES3D_1_3_TLM_HOHEITSGEBIET.shp",
                   as_tibble = TRUE) %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  # move to LV03
  st_transform(21781) %>%
  # filter(! BFS_NUMMER %in% c(2391, 5391, 5394)) %>% 
  # exclude lakes
  # filter(OBJEKTART != "Kantonsgebiet") %>% 
  # exclude FL & enclaves
  filter(ICC == "CH") %>% 
  select(BFS_NUMMER, NAME, KANTONSNUM, GEM_TEIL) %>% 
  rename(GMDNR = BFS_NUMMER,
         GMDNAME = NAME,
         KTNR = KANTONSNUM) %>% 
  arrange(GMDNR)

write_rds(st_gg21, "data/swisstopo/st_gg21.Rds")

City quartiers

Data from BfS.

st_qg17 <- st_read("data-raw/BfS/ag-b-00.03-95-qg17/shp/quart17.shp",
                   as_tibble = TRUE) %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  st_transform(21781) %>%
  select(-OBJECTID_1, -Flaeche) %>% 
  rename(GMDNR = GMDE,
         QNAME = NAME,
         QNR = NR) %>% 
  left_join(st_gg21 %>% 
              st_drop_geometry() %>% 
              select(GMDNR, GMDNAME)) %>% 
  relocate(geometry, .after = last_col())

write_rds(st_qg17, "data/swisstopo/st_qg17.Rds")

Example of Bern

STATPOP offset

Could be used to define offset for st_make_grid

statpop_20 <- read_delim("data-raw/BfS/ag-b-00.03-vz2020statpop/STATPOP2020.zip", 
                         delim = ";", escape_double = FALSE, trim_ws = TRUE)[] %>% 
  mutate_all(as.integer)

# statpop_20 %>% slice(1:10) %>% View()

write_rds(statpop_20, "data/BfS/statpop_20.Rds")

offset_bfs <- read_rds("data/BfS/statpop_20.Rds") %>% 
  summarise(min_x = min(X_KOORD),
            min_y = min(Y_KOORD)) %>% 
  st_as_sf(coords = c("min_x", "min_y"), 
           crs = 21781,
           remove = FALSE)

swisscom offset

Using two communities that are furthest away in southerly and westerly directions:

Chansy to define x
- Postal code 1284
- SFOS number 6611

Chiasso to define y
- Postal code 6830
- SFOS number 5250

Tile definitions were pulled from API using query_swisscom_heatmaps_api.py.

Points of grid were defined using lower left corner coordinates.

WGS84 coordinates were transformed to LV03.

chansy_grid <- fromJSON("data/swisscom/chansy_grid.json")
chansy_grid <- flatten(chansy_grid$tiles) %>% 
  as_tibble()

chansy_grid_sf <- chansy_grid %>% 
  st_as_sf(coords = c("ll.x", "ll.y"), 
           crs = 4326,
           remove = FALSE) %>% 
  st_transform(21781) %>% 
  mutate(x = st_coordinates(.)[, 1],
         y = st_coordinates(.)[, 2])

Chansy

Chiasso

Coordinates of lower left corner were then obtained by getting minimum values

offset_swisscom <- 
  bind_rows(
    
    chansy_grid_sf %>% 
      st_drop_geometry() %>%
      summarise(min_x = min(x),
                min_y = min(y)) ,
    
    chiasso_grid_sf %>% 
      st_drop_geometry() %>%
      summarise(min_x = min(x),
                min_y = min(y)) 
  ) %>%
  summarise(min_x = min(min_x),
            min_y = min(min_y)) %>% 
  # needs rounding - transfrom error?
  mutate(min_y = round(min_y)) %>% 
  st_as_sf(coords = c("min_x", "min_y"), 
           crs = 21781,
           remove = FALSE)

Note small difference from BfS derived minimums (in black)!

Commuter flows

Data

2018 commuter flows from mobility microcensus.
Testing only on areas of canton Bern.

commuters <- read_delim("data-raw/BfS/ts-x-11.04.04.05-2018.csv", 
                        delim = ";", escape_double = FALSE, 
                        col_types = cols(REF_YEAR = col_integer(),
                                         GEO_CANT_RESID = col_integer(),
                                         GEO_COMM_RESID = col_integer(), 
                                         GEO_CANT_WORK = col_integer(),
                                         GEO_COMM_WORK = col_integer(), 
                                         VALUE = col_integer()), 
                        trim_ws = TRUE) %>% 
  janitor::clean_names() %>% 
  filter(perspective == "R") %>% 
  select(-perspective, -ref_year) %>% 
  filter(geo_cant_resid == 2) %>% 
  select(-geo_cant_resid) %>% 
  arrange(geo_comm_resid, desc(value))

There are cross cantonal flows still here!

geo_cant_work <integer> 
# total N=10093 valid N=10093 mean=7.46 sd=14.33

Value |    N | Raw % | Valid % | Cum. %
---------------------------------------
    1 |  313 |  3.10 |    3.10 |   3.10
    2 | 6862 | 67.99 |   67.99 |  71.09
    3 |  254 |  2.52 |    2.52 |  73.61
    5 |   12 |  0.12 |    0.12 |  73.72
    6 |   22 |  0.22 |    0.22 |  73.94
    7 |   11 |  0.11 |    0.11 |  74.05
    8 |    2 |  0.02 |    0.02 |  74.07
    9 |  117 |  1.16 |    1.16 |  75.23
   10 |  344 |  3.41 |    3.41 |  78.64
   11 |  825 |  8.17 |    8.17 |  86.81
   12 |   60 |  0.59 |    0.59 |  87.41
   13 |   90 |  0.89 |    0.89 |  88.30
   14 |    2 |  0.02 |    0.02 |  88.32
   17 |    6 |  0.06 |    0.06 |  88.38
   19 |  285 |  2.82 |    2.82 |  91.20
   20 |    8 |  0.08 |    0.08 |  91.28
   22 |  152 |  1.51 |    1.51 |  92.79
   23 |   31 |  0.31 |    0.31 |  93.09
   24 |  222 |  2.20 |    2.20 |  95.29
   25 |   30 |  0.30 |    0.30 |  95.59
   26 |  104 |  1.03 |    1.03 |  96.62
   77 |  341 |  3.38 |    3.38 | 100.00
 <NA> |    0 |  0.00 |    <NA> |   <NA>

Aggregating by community of origin

geo_comm_resid_agg <- 
  
  left_join(
    
    commuters %>% 
      group_by(geo_comm_resid) %>% 
      summarise(workers_all = sum(value)) %>% 
      ungroup() %>% 
      select(geo_comm_resid, workers_all),
    
    commuters %>% 
      filter(geo_comm_resid == geo_comm_work) %>% 
      rename(workers_stay = value) %>% 
      ungroup() %>% 
      select(geo_comm_resid, workers_stay)
    
  ) %>% 
  
  left_join(
    
    commuters %>% 
      filter(geo_comm_resid != geo_comm_work) %>% 
      group_by(geo_comm_resid) %>% 
      summarise(workers_leave = sum(value)) %>% 
      ungroup() %>% 
      select(geo_comm_resid, workers_leave) 
  ) %>% 
  mutate(percent_leave = (workers_leave / workers_all) * 100)

Map of areas with high outflows

Bernese suburbs examples

Frauenkappelen

commuters %>% 
  filter(geo_comm_resid == 663) %>% 
  datatable()

Bremgarten

commuters %>% 
  filter(geo_comm_resid == 353) %>% 
  datatable()

Stettlen

commuters %>% 
  filter(geo_comm_resid == 358) %>% 
  datatable()

Grid

Generate whole country

100m grid, with lower left corner defined using swisscom derived offset.

Also, adding ID for each cell (based on row number).

country <- st_gg21 %>% 
  st_make_grid(cellsize = 100, 
               offset = c(offset_swisscom$min_x, offset_swisscom$min_y),
               square = TRUE) %>% 
  st_sf() %>%
  st_cast("POLYGON") %>% 
  mutate(ID1 = row_number()) %>% 
  relocate(ID1)

# st_write(country, "data/grid/country.gpkg")
write_rds(country, "data/grid/country.Rds")

Generate cities

Clipped / selected using swisstopo municipality data.

Added second, city specific ID and area.

Function with clipping

# why st_intersection 
# https://stackoverflow.com/questions/62442150/why-use-st-intersection-rather-than-st-intersects

city_grid_clip <- function(municipality){
  
  city <- country %>% 
    st_intersection(st_gg21 %>% 
                      filter(GMDNR == municipality))  %>%
    mutate(ID2 = row_number(),
           area = st_area(.)) %>% 
    relocate(ID1, ID2, area)
  
  return(city)
}

Function without clipping

# solution from
# https://stackoverflow.com/questions/57014381/how-to-filter-an-r-simple-features-collection-using-sf-methods-like-st-intersect

city_grid <- function(municipality){
  
  city <- country %>% 
    filter(st_intersects(geometry, 
                         st_gg21 %>% filter(GMDNR == municipality), 
                         sparse = FALSE)) %>%
    mutate(ID2 = row_number()) %>% 
    relocate(ID1, ID2)
  
  return(city)
}

# or even simpler, without dplyr
# could be generalized with arguments to provide full grid and GMDE shape

city_grid <- function(municipality){
  
  city <- country[st_gg21 %>% filter(GMDNR == municipality), ] %>%
    mutate(ID2 = row_number()) %>% 
    relocate(ID1, ID2)
  return(city)
  
}

Zuirich

# zurich_clip <- city_grid_clip(261)
zurich <- city_grid(261)

Geneva

# geneva_clip <- city_grid_clip(6621)
geneva <- city_grid(6621)

Basel

# basel_clip <- city_grid_clip(2701)
basel <- city_grid(2701)

Lausanne

# lausanne_clip <- city_grid_clip(5586)
lausanne <- city_grid(5586)

Bern

bern_clip <- city_grid_clip(351)
bern <- city_grid(351)

Offset

Using two communities to get min x and y coordinates: Chansy & Chiasso

# chansy_clip <- city_grid_clip(6611)
# st_write(chansy_clip, "data/grid/chansy_clip.gpkg")
# write_rds(chansy_clip, "data/grid/chansy_clip.Rds")

chansy <- city_grid(6611)
# st_write(chansy, "data/grid/chansy.gpkg")
write_rds(chansy, "data/grid/chansy.Rds")

# chiasso_clip <- city_grid_clip(5250)
# st_write(chiasso_clip, "data/grid/chiasso_clip.gpkg")
# write_rds(chiasso_clip, "data/grid/chiasso_clip.Rds")

chiasso <- city_grid(5250)
# st_write(chiasso, "data/grid/chiasso.gpkg")
write_rds(chiasso, "data/grid/chiasso.Rds")

Example

Municipality Bern coverage:

Clipped grid:

Vs. unclipped one:

Note, when using clipped grid, there are few examples of grid cells with very small area.
These could be excluded if no population was found there?

       ID1  ID2            area GMDNR GMDNAME KTNR GEM_TEIL
1  4259841  401 0.1571798 [m^2]   351    Bern    2        0
2  4332927 2876 0.1766709 [m^2]   351    Bern    2        0
3  4423647 5188 0.2123485 [m^2]   351    Bern    2        0
4  4339904 3098 0.2957572 [m^2]   351    Bern    2        0
5  4273711  702 0.7733263 [m^2]   351    Bern    2        0
6  4245805  139 1.0971039 [m^2]   351    Bern    2        0
7  4245786  123 1.1456480 [m^2]   351    Bern    2        0
8  4420099 5082 1.9921709 [m^2]   351    Bern    2        0
9  4291217 1374 2.4771009 [m^2]   351    Bern    2        0
10 4273760  735 3.2587531 [m^2]   351    Bern    2        0

Area estimates

Bern - hourly

Number of tiles for area of Bern

(tiles <- nrow(bern))

[1] 5487

Number of requests needed

(requests <- ceiling(tiles / 100))

[1] 55

Price for complete grid

(price <- requests * 0.01)

[1] 0.55

Price for 24h

(price_24h <- price * 24)

[1] 13.2

Price for 4w hourly

(price_4w <- price_24h * 7 * 4)

[1] 369.6

Price for year hourly

(price_year <- price_24h * 365)

[1] 4818

Price for 4w daily

(price_4w_d <- price * 7 * 4)

[1] 15.4

Price for year daily

(price_year_d <- price * 365)

[1] 200.75

5 cities - hourly

Number of tiles for area of Bern

(tiles <- nrow(zurich) +
   nrow(geneva) +
   nrow(basel) +
   nrow(lausanne) +
   nrow(bern))

[1] 24116

Number of requests needed

(requests <- ceiling(tiles / 100))

[1] 242

Price for complete grid

(price <- requests * 0.01)

[1] 2.42

Price for 24h

(price_24h <- price * 24)

[1] 58.08

Price for 4w hourly

(price_4w <- price_24h * 7 * 4)

[1] 1626.24

Price for year hourly

(price_year <- price_24h * 365)

[1] 21199.2

Price for 4w daily

(price_4w_d <- price * 7 * 4)

[1] 67.76

Price for year daily

(price_year_d <- price * 365)

[1] 883.3

Environment

Analyses were conducted using the R Statistical language (version 4.2.0; R Core Team, 2022) on Windows 10 x64 (build 18363), using the packages ggplot2 (version 3.3.6; Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.), jsonlite (version 1.8.0; Jeroen Ooms, 2014), sf (version 1.0.7; Pebesma, 2018. Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal 10, 1), pacman (version 0.5.1; Rinker et al., 2017), tidyverse (version 1.3.1; Wickham et al., 2019), dplyr (version 1.0.9; NA), DT (version 0.23; NA), forcats (version 0.5.1; NA), magrittr (version 2.0.3; NA), purrr (version 0.3.4; NA), readr (version 2.1.2; NA), scales (version 1.2.0; NA), stringr (version 1.4.0; NA), tibble (version 3.1.7; NA), tidyr (version 1.2.0; NA) and tmap (version 3.3.3; NA).

References

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
Jeroen Ooms (2014). The jsonlite Package: A Practical and Consistent Mapping Between JSON Data and R Objects. arXiv:1403.2805 [stat.CO] URL https://arxiv.org/abs/1403.2805.
Pebesma, E., 2018. Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal 10 (1), 439-446, https://doi.org/10.32614/RJ-2018-009
R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Rinker, T. W. & Kurkiewicz, D. (2017). pacman: Package Management for R. version 0.5.0. Buffalo, New York. http://github.com/trinker/pacman
Wickham et al., (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA