2020 excess mortality & voting patterns in CH

Geodata preparation

BfS geodata 2022

Generalisierte Gemeindegrenzen: Geodaten.

Using version from the 08.02.2022. Adding two additional mutations to match the voting dataset.

download.file(url = "https://www.bfs.admin.ch/bfsstatic/dam/assets/21224783/master",
              destfile = "data-raw/BfS/ag-b-00.03-875-gg22.zip",
              method = "curl")

unzip("data-raw/BfS/ag-b-00.03-875-gg22.zip", exdir = "data-raw/BfS/ag-b-00.03-875-gg22")

unlink("data-raw/BfS/ag-b-00.03-875-gg22.zip")
unlink("data-raw/BfS/ag-b-00.03-875-gg22/txt", recursive = TRUE)
unlink("data-raw/BfS/ag-b-00.03-875-gg22/kmz", recursive = TRUE)
unlink("data-raw/BfS/ag-b-00.03-875-gg22/ggg_2022_LV95/ggg_2022_LV95.gdb", recursive = TRUE)
gg_raw <-
  st_read("data-raw/BfS/ag-b-00.03-875-gg22/ggg_2022_LV95/shp/g1g22.shp",
          as_tibble = TRUE) %>%
  st_set_crs(2056) %>%
  st_zm(drop = TRUE, what = "ZM")
Reading layer `g1g22' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data-raw\BfS\ag-b-00.03-875-gg22\ggg_2022_LV95\shp\g1g22.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 2151 features and 19 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2485424 ymin: 1075268 xmax: 2833837 ymax: 1295937
Projected CRS: CH1903+ / LV95
gg <- 
  st_read("data/BfS/ag-b-00.03-875-gg22/ggg_2022_LV95/shp/g1g22.shp",
          as_tibble = TRUE) %>% 
  st_set_crs(2056) %>% 
  st_zm(drop = TRUE, what = "ZM")
Reading layer `g1g22' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data\BfS\ag-b-00.03-875-gg22\ggg_2022_LV95\shp\g1g22.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 2148 features and 19 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2485424 ymin: 1075268 xmax: 2833837 ymax: 1295937
Projected CRS: CH1903+ / LV95
kt <- 
  st_read("data-raw/BfS/ag-b-00.03-875-gg22/ggg_2022_LV95/shp/g1k22.shp",
          as_tibble = TRUE) %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  st_transform(2056)
Reading layer `g1k22' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data-raw\BfS\ag-b-00.03-875-gg22\ggg_2022_LV95\shp\g1k22.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 26 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2485424 ymin: 1075268 xmax: 2833837 ymax: 1295937
Projected CRS: CH1903+ / LV95 + LN02 height
se <- 
  st_read("data-raw/BfS/ag-b-00.03-875-gg22/ggg_2022_LV95/shp/g1s22.shp",
          as_tibble = TRUE) %>% 
  filter(GMDNAME != "Lago di Como") %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  st_transform(2056)
Reading layer `g1s22' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data-raw\BfS\ag-b-00.03-875-gg22\ggg_2022_LV95\shp\g1s22.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 23 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2500002 ymin: 1064106 xmax: 2774175 ymax: 1297483
Projected CRS: CH1903+ / LV95 + LN02 height

Canton

Municipality

Alternative lakes dataset

Coming from Kartengeometrien ThemaKart - Set «PRO» 2019 available from BfS.

se_alt <- 
  st_union(
    st_read("data-raw/BfS/KM04-00-c-suis-2022-q/00_TOPO/K4_seenyyyymmdd/k4seenyyyymmdd11_ch2007Poly.shp",
            as_tibble = TRUE),
    st_read("data-raw/BfS/KM04-00-c-suis-2022-q/00_TOPO/K4_seenyyyymmdd/k4seenyyyymmdd22_ch2007Poly.shp",
            as_tibble = TRUE) )%>% 
  st_transform(2056)
Reading layer `k4seenyyyymmdd11_ch2007Poly' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data-raw\BfS\KM04-00-c-suis-2022-q\00_TOPO\K4_seenyyyymmdd\k4seenyyyymmdd11_ch2007Poly.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 23 features and 2 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 2500375 ymin: 1064133 xmax: 2774275 ymax: 1297573
Projected CRS: CH1903+ / LV95
Reading layer `k4seenyyyymmdd22_ch2007Poly' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data-raw\BfS\KM04-00-c-suis-2022-q\00_TOPO\K4_seenyyyymmdd\k4seenyyyymmdd22_ch2007Poly.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 14 features and 2 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 2558647 ymin: 1089103 xmax: 2805261 ymax: 1246795
Projected CRS: CH1903+ / LV95

That adds few smaller lakes (in blue):

swisstopo geodata 2022

Boundaries

swissBOUNDARIES3D data available from here.

download.file(url = "https://data.geo.admin.ch/ch.swisstopo.swissboundaries3d/swissboundaries3d_2022-01/swissboundaries3d_2022-01_2056_5728.shp.zip",
              destfile = "data-raw/swisstopo/swissboundaries3d_2022-01_2056_5728.shp.zip",
              method = "curl")

unzip("data-raw/swisstopo/swissboundaries3d_2022-01_2056_5728.shp.zip", 
      exdir = "data-raw/swisstopo/swissboundaries3d_2022-01_2056_5728")

unlink("data-raw/swisstopo/swissboundaries3d_2021-01_2056_5728.shp.zip")
# note the source (not raw) to handle #28
st_gg <- 
  st_read("data/swisstopo/swissboundaries3d_2022-01_2056_5728/SHAPEFILE_LV95_LN02/swissBOUNDARIES3D_1_3_TLM_HOHEITSGEBIET.shp",
          as_tibble = TRUE) %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  st_transform(2056) 
Reading layer `swissBOUNDARIES3D_1_3_TLM_HOHEITSGEBIET' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data\swisstopo\swissboundaries3d_2022-01_2056_5728\SHAPEFILE_LV95_LN02\swissBOUNDARIES3D_1_3_TLM_HOHEITSGEBIET.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 2286 features and 23 fields
Geometry type: POLYGON
Dimension:     XYZ
Bounding box:  xmin: 2485410 ymin: 1075268 xmax: 2833858 ymax: 1295934
z_range:       zmin: 193.51 zmax: 4613.686
Projected CRS: CH1903+ / LV95 + LN02 height
st_kt <- 
  st_read("data-raw/swisstopo/swissboundaries3d_2022-01_2056_5728/SHAPEFILE_LV95_LN02/swissBOUNDARIES3D_1_3_TLM_KANTONSGEBIET.shp",
          as_tibble = TRUE) %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  st_transform(2056)
Reading layer `swissBOUNDARIES3D_1_3_TLM_KANTONSGEBIET' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data-raw\swisstopo\swissboundaries3d_2022-01_2056_5728\SHAPEFILE_LV95_LN02\swissBOUNDARIES3D_1_3_TLM_KANTONSGEBIET.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 50 features and 20 fields
Geometry type: POLYGON
Dimension:     XYZ
Bounding box:  xmin: 2485410 ymin: 1075268 xmax: 2833858 ymax: 1295934
z_range:       zmin: 193.51 zmax: 4613.686
Projected CRS: CH1903+ / LV95 + LN02 height

Lakes

Additional dataset of lakes was derived from swissTLM3D_TLM_STEHENDES_GEWAESSER.shp layer from swissTLM3D dataset.

# convert lines to polygons, calculate area

# testing is good here:
# "NAME" = 'Bodensee | Lac de Constance | Lago di Costanza | Lai da Constanza'

st_se <- 
  st_read("data-raw/swisstopo/swisstlm3d_2022-03_2056_5728/TLM_GEWAESSER/swissTLM3D_TLM_STEHENDES_GEWAESSER.shp",
          as_tibble = TRUE) %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  filter(OBJEKTART == "See") %>% 
  group_by(GEW_LAUF_U) %>% 
  summarise(geometry = st_combine(geometry),
            NAME = first(NAME),
            OBJEKTART = first(OBJEKTART)) %>%
  st_cast("MULTILINESTRING") %>% 
  st_polygonize() %>%
  mutate(area = st_area(.)) %>% 
  ungroup() %>% 
  relocate(geometry, .after = last_col())

st_see22_insel <- 
  st_read("data-raw/swisstopo/swisstlm3d_2022-03_2056_5728/TLM_GEWAESSER/swissTLM3D_TLM_STEHENDES_GEWAESSER.shp",
          as_tibble = TRUE) %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  filter(OBJEKTART == "Seeinsel") %>% 
  group_by(GEW_LAUF_U) %>% 
  summarise(geometry = st_combine(geometry),
            NAME = first(NAME),
            OBJEKTART = first(OBJEKTART)) %>%
  st_cast("MULTILINESTRING") %>% 
  st_polygonize() %>% 
  mutate(area = st_area(.)) %>% 
  ungroup() %>% 
  relocate(geometry, .after = last_col())

st_se <- 
  st_difference(st_se, st_union(st_see22_insel)) %>% 
  st_collection_extract("POLYGON") %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  st_transform(2056)

unique(st_geometry_type(st_se))

frq(st_is_valid(st_se))

summary(st_se$area)

plot(st_se[, 2])

rm(st_see22_insel); gc()

Combined

Lakes dataset (selecting only 322 largest features from ThemaKart dataset overlap!) was then used to erase area of municipalities that extends over lakes. Issue is described in #48.

st_gg <- 
  st_difference(st_gg, st_union(st_se_sel)) %>% 
  # small slivers of lakes left
  filter(OBJEKTART != "Kantonsgebiet") %>% 
  st_collection_extract("POLYGON") %>% 
  st_zm(drop = TRUE, what = "ZM") %>% 
  st_transform(2056) # %>% 
# group_by(BFS_NUMMER) %<>% 
# summarise(BFS_NUMMER = first(BFS_NUMMER),
#           NAME = first(NAME),
#           GEM_TEIL = max(GEM_TEIL)) %>%
# st_cast("MULTIPOLYGON") 

Important differences

Resolution

swisstopo files (in blue below) are larger in size (and slower to plot) but offer better resolution and will be used for certain spatial operations (like linking to SEP) whereas BfS data (in red) are smaller in size and faster to plot. Example of municipality Bern:

Multipart features

BfS originally stores data as MULTIPOLYGON. In such case municipality split over several disconnected areas is stored as single record. Example of municipality Monthey which consists of four separate areas:

The same municipality in swisstopo data is stored as set of MULTIPOLYGON types and is represented by four separate records instead of one.

That might have some implications in terms of spatial operations (maybe contiguity too?). Data can easily be converted between types if needed.

Municipality boundaries preparations

Disconnected boundaries - Lake Lugano

Two communities in TI Bissone & Melide are disconnected by Lake Lugano in BfS municipality data:

No connectivity

In reality communities are indeed connected:

IRL

That has been manually corrected in GIS and now looks like:

Connected

Importance of this issue is covered here.

Disconnected boundaries - Lake Zurich

Freienbach & Rapperswil-Jona are disconnected on the map of boundaries:

No connectivity

… but connected IRL:

Connectivity

That has been fixed manually in GIS.

Spurious connection

Community of Twann-Tüscherz (in red) is connected to Erlach via small part of the St. Peter’s Island.

St. Peter’s Island

This connection has been removed.

Non-residential municipalities

Excluding municipalities without residents:

  • Staatswald Galm
  • Comunanza Cadenazzo/Monteceneri
  • Comunanza Capriasca/Lugano

We should not expect any cases there!

ID variables for INLA

gg %<>% 
  mutate(id_space = as.integer(as.factor(GMDNR))) %>% 
  relocate(id_space)

Labour market areas 2018

Data sources

More info here.

raum <- read_xlsx("data-raw/BfS/Raumgliederungen.xlsx", 
                  skip = 1) %>% 
  remove_empty(c("rows", "cols")) %>% clean_names() %>% 
  filter(! is.na(bfs_gde_nummer)) %>% 
  select(-bezirksname) %>% 
  rename(GMDNR = bfs_gde_nummer,
         GMDNAME = gemeindename,
         KTNR = kantons_nummer,
         KTNAME = kanton,
         ARGRNR = arbeitsmarktgrossregionen_2018,
         ARNR = arbeitsmarktregionen_2018) %>% 
  select(GMDNR, GMDNAME,
         KTNR, KTNAME, 
         ARGRNR, ARNR) %>% 
  mutate(id_kt = as.integer(as.factor(KTNR))) %>% 
  relocate(id_kt, .after = KTNR) %>% 
  left_join(
    read_xlsx("data-raw/BfS/Raumgliederungen.xlsx", 
              skip = 1, sheet = "CH1+CL_GBAE+2018.0") %>% 
      rename(
        ARGRNR = Code,
        ARGRNAME = Label)
  ) %>% 
  left_join(
    read_xlsx("data-raw/BfS/Raumgliederungen.xlsx", 
              skip = 1, sheet = "CH1+CL_BAE+2018.0") %>% 
      rename(
        ARNR = Code,
        ARNAME = Label) %>% 
      # weird hyphen in some neames?
      mutate(ARNAME = str_replace(ARNAME, fixed("–"), "-"))
  ) %>% 
  relocate(ARGRNAME, .after = ARGRNR) %>% 
  relocate(ARNAME, .after = ARNR) %>% 
  mutate(border = if_else(ARNAME %in% c("Schaffhausen", 
                                        "Delémont", 
                                        "Porrentruy", 
                                        "Saignelégier-Le Noirmont", 
                                        "La Chaux-de-Fonds",
                                        "Le Chenit", 
                                        "Locarno", 
                                        "Lugano") | 
                            ARGRNAME %in% c("Region Basel", "Region Genf"), 
                          1, 0)) 

Border regions

Defined on the basis of this map. Linked by name of the region since some codes seemed to be out of date? (ie. Basel).

Border regions

Number of municipalities that are located within border labour market areas:

border <numeric> 
# total N=2145 valid N=2145 mean=0.20 sd=0.40

Value |    N | Raw % | Valid % | Cum. %
---------------------------------------
    0 | 1725 | 80.42 |   80.42 |  80.42
    1 |  420 | 19.58 |   19.58 | 100.00
 <NA> |    0 |  0.00 |    <NA> |   <NA>

Typology applied to municipalities:

List of municipalities:

gg %<>% left_join(raum %>% select(GMDNR, KTNR, id_kt, KTNAME, 
                                  ARGRNR, ARGRNAME, ARNR, ARNAME,
                                  border)) %>% 
  relocate(geometry, .after = last_col()) 

write_rds(gg, "data/BfS/gg.Rds")

st_write(gg, "data/geo/BfS.gpkg", layer = "st_gg", append = FALSE)
Deleting layer `st_gg' failed
Writing layer `st_gg' to data source `data/geo/BfS.gpkg' using driver `GPKG'
Writing 2145 features with 11 fields and geometry type Multi Polygon.
st_gg %<>% left_join(raum %>% select(GMDNR, KTNAME, 
                                     ARGRNR, ARGRNAME, ARNR, ARNAME,
                                     border)) %>% 
  relocate(geometry, .after = last_col())

write_rds(st_gg, "data/swisstopo/st_gg.Rds")

st_write(st_gg, "data/geo/swisstopo.gpkg", layer = "st_gg", append = FALSE)
Deleting layer `st_gg' failed
Writing layer `st_gg' to data source `data/geo/swisstopo.gpkg' using driver `GPKG'
Writing 2239 features with 10 fields and geometry type Multi Polygon.

Spatial connectivity of municipalities

Based on queen connectivity between the areas. Uses ‘normal’ municipality boundaries and connects them irrespectively of mountains.

Weights

gg_wm_q <- poly2nb(gg, queen = TRUE, row.names = gg$id_space)

gg_wm_q
Neighbour list object:
Number of regions: 2145 
Number of nonzero links: 12186 
Percentage nonzero weights: 0.264854 
Average number of links: 5.681119 
summary(gg_wm_q)
Neighbour list object:
Number of regions: 2145 
Number of nonzero links: 12186 
Percentage nonzero weights: 0.264854 
Average number of links: 5.681119 
Link number distribution:

  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19 
 14  43 187 374 498 409 274 162  95  45  18   7   4   6   2   4   1   1   1 
14 least connected regions:
757 759 820 896 911 992 1002 1118 1155 1408 1558 1582 2072 2127 with 1 link
1 most connected region:
1704 with 19 links
write_rds(gg_wm_q, "data/nb/gg_wm_q.Rds")
nb2INLA("data/nb/gg_wm_q.adj", gg_wm_q)

Region with one neighbour

2271 Meyriez

Region with nineteen neighbours (max case)

5586 Lausanne

Multipart features

Example of 418 Oberburg. Location:

Thanks to using multipart features, connectivity is captured correctly:

Connectivity over Zurichsee

1322 Freienbach fixed above:

Connectivity over Lake Lugano

All together:

tg3o simplification

So called ‘productive areas’, also coming from BfS’ ThemaKart product.

Roughly coincides with removing areas of above 2000m asl.

tg3o <- 
  st_read("data/BfS/KM04-00-c-suis-2022-q/01_INST/Vegetationsfläche_vf/K4_polg20220101_vf/K4polg20220101vf_ch2007Poly.shp",
          as_tibble = TRUE) %>% 
  st_transform(2056) %>% 
  rename(GMDNR = id,
         GMDNAME = name)

Note: the three discontinued municipality borders highlighted above have been also corrected in this dataset.

Dataset consists of 2,145 communities. The state is from 2022-01-01.

Spatial connectivity of tg3o boundaries

Based on queen connectivity between the productive areas. Uses ‘tg3o’ simplified municipality boundaries so it breaks some connections in the mountainous regions. Few disconnected areas are connected back, mostly when there is a train tunnel or a year round road pass connection.

Weights

# observe different location (folder with `nb` extension)
# this is dataset that 'forces' connectivity in certain areas
tg3o_nb <- 
  st_read("data/BfS/KM04-00-c-suis-2022-q/01_INST/Vegetationsfläche_vf/K4_polg20220101_vf_nb/K4polg20220101vf_ch2007Poly.shp",
          as_tibble = TRUE) %>% 
  st_transform(2056) %>% 
  rename(GMDNR = id,
         GMDNAME = name)
Reading layer `K4polg20220101vf_ch2007Poly' from data source 
  `C:\projects\ISPM_excess-mortality-voting\data\BfS\KM04-00-c-suis-2022-q\01_INST\Vegetationsfläche_vf\K4_polg20220101_vf_nb\K4polg20220101vf_ch2007Poly.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 2145 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2485203 ymin: 1075283 xmax: 2833175 ymax: 1295864
Projected CRS: CH1903+ / LV95
tg3o_wm_q <- poly2nb(tg3o_nb, queen = TRUE, row.names = tg3o$GMDNR)

tg3o_wm_q
Neighbour list object:
Number of regions: 2145 
Number of nonzero links: 11296 
Percentage nonzero weights: 0.2455105 
Average number of links: 5.2662 
summary(tg3o_wm_q)
Neighbour list object:
Number of regions: 2145 
Number of nonzero links: 11296 
Percentage nonzero weights: 0.2455105 
Average number of links: 5.2662 
Link number distribution:

  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  18  19 
 48 121 234 387 452 389 233 149  73  27  13   5   3   5   1   3   1   1 
48 least connected regions:
286 298 406 412 416 590 591 598 622 631 710 757 759 820 896 911 992 1002 1115 1117 1118 1123 1138 1144 1154 1168 1171 1174 1181 1407 1499 1502 1513 1557 1580 1585 1586 1591 1903 1909 1920 1923 1950 1954 1964 2018 2070 2125 with 1 link
1 most connected region:
1702 with 19 links
write_rds(tg3o_wm_q, "data/nb/tg3o_wm_q.Rds")
nb2INLA("data/nb/tg3o_wm_q.adj", tg3o_wm_q) 

All together

Original connectivity, notice disconnected ‘islands’ of polygons:

Forcing extra connections

Clearer when only the differences are shown

List of ‘extra’, ‘forced’ connections:

  • Lötschberg Tunnel connecting Kandersteg and Ferden
  • Lötschberg Base Tunnel connecting Frutigen and Raron
  • road connection between Ried-Brieg & Simplon

Computing Environment

Analyses were conducted using the R Statistical language (version 4.2.1; R Core Team, 2022) on Windows 10 x64 (build 18363), using the packages magrittr (version 2.0.3; Bache S, Wickham H, 2022), spData (version 2.2.0; Bivand R et al., 2022), spdep (version 1.2.7; Bivand R, Wong DWS, 2018), janitor (version 2.1.0; Firke S, 2021), lubridate (version 1.8.0; Grolemund G, Wickham H, 2011), purrr (version 0.3.5; Henry L, Wickham H, 2022), tibble (version 3.1.8; Müller K, Wickham H, 2022), sf (version 1.0.8; Pebesma E, 2018), sp (version 1.5.0; Pebesma EJ, Bivand RS, 2005), pacman (version 0.5.1; Rinker TW, Kurkiewicz D, 2018), tmap (version 3.3.3; Tennekes M, 2018), ggplot2 (version 3.3.6; Wickham H, 2016), forcats (version 0.5.2; Wickham H, 2022), stringr (version 1.4.1; Wickham H, 2022), tidyverse (version 1.3.2; Wickham H et al., 2019), readxl (version 1.4.1; Wickham H, Bryan J, 2022), dplyr (version 1.0.10; Wickham H et al., 2022), tidyr (version 1.2.1; Wickham H, Girlich M, 2022), readr (version 2.1.3; Wickham H et al., 2022), scales (version 1.2.1; Wickham H, Seidel D, 2022) and DT (version 0.26; Xie Y et al., 2022).

References