The daily_internet_users data set is created by this code, which is not evaluated in this vignette:
library(dplyr)  # provides %>%, select() and all_of()
isoc_r_iuse_i <- eurostat::get_eurostat("isoc_r_iuse_i",
                                        time_format = "num")
daily_internet_users <- isoc_r_iuse_i %>%
  dplyr::filter(unit == "PC_IND",          # percentage of individuals
                indic_is == "I_IDAY") %>%  # daily internet users
  select(all_of(c("geo", "time", "values")))
Simply downloading regional statistics from the Eurostat data warehouse and placing the observations on the same map does not work. If you look into the data, you will see that the geo codes of French, Lithuanian or Hungarian regions, to name a few, do not match between the years 2012 and 2018.
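A quick way to see such a mismatch, before turning to the package helpers, is to compare the sets of geo codes observed in each year. The following is a minimal base R sketch under stated assumptions: the toy data frame is hypothetical and only reuses a few codes and values from the tables shown later in this vignette, and the names toy, only_2012 and only_2018 are introduced here for illustration.

```r
# Toy data (illustrative only): a handful of geo codes and values taken from
# the tables in this vignette, not the full Eurostat download.
toy <- data.frame(
  geo    = c("FRB0", "FRC1", "HU10", "LT00"),
  time   = c(2018, 2018, 2012, 2012),
  values = c(70, 71, 68, 49),
  stringsAsFactors = FALSE
)

# Distinct geo codes observed in each year
codes_2012 <- unique(toy$geo[toy$time == 2012])
codes_2018 <- unique(toy$geo[toy$time == 2018])

# Codes that appear in only one of the two years cannot be joined by geo
only_2012 <- setdiff(codes_2012, codes_2018)
only_2018 <- setdiff(codes_2018, codes_2012)

print(only_2012)
print(only_2018)
```

Any code that ends up in only_2012 or only_2018 would be silently dropped, or left as a missing value, if the two years were joined by geo alone.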
Let’s have a look at these countries in two years, using the helper function get_country_code. The year 2012 is coded with the NUTS2010 typology and the year 2018 with the NUTS2016 typology. We then use the validate_nuts_regions function twice: first with the default NUTS2016 definition (which is currently valid in the European Union), and then with the obsolete NUTS2010 definition.
test <- daily_internet_users %>%
  mutate(country_code = get_country_code(geo = .data$geo)) %>%
  dplyr::filter(time %in% c(2012, 2018),
                country_code %in% c("FR", "HU", "LT")) %>%
  mutate(time = paste0("Y", time)) %>%
  pivot_wider(names_from = "time", values_from = "values") %>%
  validate_nuts_regions() %>%              # default: the currently valid NUTS2016
  validate_nuts_regions(nuts_year = 2010)  # the obsolete NUTS2010
The following NUTS region codes are not valid in the 2010 definition. These sub-national divisions were defined in 2013 or 2016. Some of these regions kept their boundaries but received new codes after France changed its administrative divisions, so some of the seemingly missing 2012 data can be found under different codes.
geo | country_code | Y2018 | Y2012 | orig_typology | valid_2016 | typology | valid_2010 |
---|---|---|---|---|---|---|---|
FRB | FR | 70 | NA | nuts_level_1 | TRUE | NA | FALSE |
FRB0 | FR | 70 | NA | nuts_level_2 | TRUE | NA | FALSE |
FRC | FR | 72 | NA | nuts_level_1 | TRUE | NA | FALSE |
FRC1 | FR | 71 | NA | nuts_level_2 | TRUE | NA | FALSE |
FRC2 | FR | 74 | NA | nuts_level_2 | TRUE | NA | FALSE |
FRD | FR | 79 | NA | nuts_level_1 | TRUE | NA | FALSE |
FRD1 | FR | 78 | NA | nuts_level_2 | TRUE | NA | FALSE |
FRD2 | FR | 80 | NA | nuts_level_2 | TRUE | NA | FALSE |
FRE | FR | 75 | NA | nuts_level_1 | TRUE | NA | FALSE |
FRE1 | FR | 73 | NA | nuts_level_2 | TRUE | NA | FALSE |
Two regions are not valid in the 2016 definition because their boundaries changed: Vilnius and Budapest, two large cities, were detached from the larger regional units that contained them.
knitr::kable(
test [ ! test$valid_2016, ]
)
geo | country_code | Y2018 | Y2012 | orig_typology | valid_2016 | typology | valid_2010 |
---|---|---|---|---|---|---|---|
HU10 | HU | NA | 68 | NA | FALSE | nuts_level_2 | TRUE |
LT00 | LT | NA | 49 | NA | FALSE | nuts_level_2 | TRUE |
In the case of Budapest and Central Hungary in particular, comparable data can be produced for both boundary definitions, because the boundary change was simple: Budapest was removed from Central Hungary. In Lithuania, the change was no more complex, but unfortunately it cut through a far less frequently used typology level, NUTS3. While the change is simple, the replacement data is usually not published.
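To illustrate why a simple split like the Hungarian one is recoverable, the sketch below recombines the two successor regions into the old unit with a population-weighted mean. It is only a sketch under stated assumptions: it assumes that HU11 (Budapest) and HU12 (Pest) are the NUTS2016 successors of HU10, that a suitable reference population is available as weights, and all shares and population figures are invented for the example rather than taken from Eurostat.

```r
# Hypothetical 2018 inputs for the two NUTS2016 regions assumed to replace HU10.
# All numbers below are made up for illustration.
parts <- data.frame(
  geo        = c("HU11", "HU12"),
  share      = c(80, 70),      # invented percentage of daily internet users
  population = c(1500, 1000),  # invented reference population (thousands)
  stringsAsFactors = FALSE
)

# A percentage for the combined area is the population-weighted mean of its parts:
# (80 * 1500 + 70 * 1000) / (1500 + 1000)
hu10_2018 <- weighted.mean(parts$share, w = parts$population)

print(hu10_2018)
```

An unweighted mean would only be correct if both parts had the same reference population, which is why the weights matter here. For the Lithuanian case the same arithmetic would need NUTS3-level inputs, which, as noted above, are usually not published.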
Eurostat data: cite Eurostat.
Administrative boundaries: cite EuroGeographics.
For the main developer and contributors, see the package homepage.
This work can be freely used, modified and distributed under the GPL-3 license:
citation("regions")
#>
#> To cite package 'regions' in publications use:
#>
#> Antal D (2021). _regions: Processing Regional Statistics_. R package
#> version 0.1.8, <https://regions.dataobservatory.eu/>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {regions: Processing Regional Statistics},
#> author = {Daniel Antal},
#> year = {2021},
#> note = {R package version 0.1.8},
#> url = {https://regions.dataobservatory.eu/},
#> }
For contact information, see the package homepage.