Title: | Download Data from the FAOSTAT Database |
---|---|
Description: | Download Data from the FAOSTAT Database of the Food and Agriculture Organization (FAO) of the United Nations. A list of functions to download statistics from FAOSTAT (database of the FAO <https://www.fao.org/faostat/>) and WDI (database of the World Bank <https://data.worldbank.org/>), and to perform some harmonization operations. |
Authors: | Michael C. J., Markus Gesmann, Filippo Gheri, Paul Rougieux <[email protected]>, Sebastian Campbell |
Maintainer: | Paul Rougieux <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.4.1 |
Built: | 2025-02-02 05:41:32 UTC |
Source: | https://gitlab.com/paulrougieux/faostatpackage |
Useful links:
Report bugs at https://gitlab.com/paulrougieux/faostatpackage/-/issues
The function takes a relational data frame and computes the aggregation based on the relation specified.
Aggregation( data, aggVar, weightVar = rep(NA, length(aggVar)), year = "Year", relationDF = FAOcountryProfile[, c("FAOST_CODE", "M49_FAOST_CODE")], aggMethod = rep("sum", length(aggVar)), applyRules = TRUE, keepUnspecified = TRUE, unspecifiedCode = 0, thresholdProp = rep(0.65, length(aggVar)) )
data |
The data frame containing the country level data. |
aggVar |
The vector of names of the variables to be aggregated. |
weightVar |
The vector of names of the variables to be used as weighting when the aggregation method is weighted. |
year |
The column containing the time information. |
relationDF |
A relational data frame which specifies the territory and the mother country. At least one column must have a corresponding variable name in the dataset. |
aggMethod |
Can be a single method for all data or a vector specifying a different method for each variable. The method can be "sum", "mean" or "weighted.mean". |
applyRules |
Logical, specifies whether the
|
keepUnspecified |
Whether countries with an unspecified region should be aggregated into an "Unspecified" group or simply dropped. Defaults to creating the new group. |
unspecifiedCode |
The output code of the unspecified group. |
thresholdProp |
The vector of the missing threshold for the aggregation rule to be applied. The default is set to only compute aggregation if there are more than 65 percent of data available (0.65). |
The lengths of aggVar, aggMethod, weightVar and thresholdProp must be the same.
Aggregation should not be computed if insufficient countries have reported data. This corresponds to the argument thresholdProp, which specifies the proportion of countries that must report data (both for the variable to be aggregated and the weighting variable).
## example.df = data.frame(FAOST_CODE = rep(c(1, 2, 3), 2),
##                         Year = rep(c(2010, 2011), c(3, 3)),
##                         value = rep(c(1, 2, 3), 2),
##                         weight = rep(c(0.3, 0.7, 1), 2))
## Let's aggregate country 1 and 2 into one country and keep country
## 3 separate.
## relation.df = data.frame(FAOST_CODE = 1:3, NEW_CODE = c(1, 1, 2))
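The threshold rule described above can be illustrated with a minimal base R sketch. It does not call Aggregation() and is not the package's implementation: the helper name aggregate_with_threshold and the toy data are hypothetical, and it assumes the rule means "return NA unless the proportion of reporting countries is greater than the threshold".

```r
# Hypothetical helper, not part of FAOSTAT: sum the values of a group only
# when the share of non-missing observations exceeds `threshold`.
aggregate_with_threshold <- function(values, threshold = 0.65) {
  reported <- mean(!is.na(values))  # proportion of countries reporting
  if (reported > threshold) sum(values, na.rm = TRUE) else NA_real_
}

# Toy data: four countries in one region, two years.
toy <- data.frame(
  Year  = rep(c(2010, 2011), each = 4),
  value = c(1, 2, 3, NA,   # 2010: 75% reported, the aggregate is computed
            1, NA, NA, 4)  # 2011: 50% reported, the aggregate is NA
)

regional <- tapply(toy$value, toy$Year, aggregate_with_threshold)
print(regional)
```

With the default threshold of 0.65, only the 2010 aggregate is computed in this toy case; lowering the threshold to 0.4 would also produce a value for 2011.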
Columns from FAOSTAT frequently have parentheses and other non-alphanumeric characters. This suite of functions seeks to give control over these names for easier data analysis.
change_case( old_names, new_case = c("make.names", "unsanitised", "unsanitized", "snake_case"), ... )
old_names |
character. Vector of the names to be changed |
new_case |
character. Choice of new names:
|
... |
extra arguments to pass to sanitisation function (only works for make.names) |
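As a rough illustration of what such name cleaning involves, the sketch below uses only base R. It does not call change_case(); the to_snake_case helper is hypothetical and the column names are merely chosen to resemble FAOSTAT output.

```r
# Column names resembling those found in FAOSTAT downloads.
old_names <- c("Area Code (M49)", "Item Code (CPC)", "Year", "Value")

# Option 1: base R make.names() replaces invalid characters with dots.
syntactic <- make.names(old_names)

# Option 2: hypothetical snake_case helper (an assumption, not the package's code).
to_snake_case <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)  # runs of non-alphanumerics become "_"
  gsub("^_+|_+$", "", x)           # trim leading and trailing underscores
}
snake <- to_snake_case(old_names)

print(syntactic)
print(snake)
```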
A function for constructing year to year change
chConstruct( data, origVar, country = "FAOST_CODE", year = "Year", newVarName = NA, n = 1 )
data |
The data frame containing the data |
origVar |
The variable in which the year to year change is to be calculated |
country |
The column representing the index of country. |
year |
The column representing the index of year. |
newVarName |
The name assigned to the new variable, if missing then .CH will be appended. |
n |
The period for the change rate to be calculated. |
A data frame containing the computed year to year change rate.
The function only works for FAOST_CODE. If the country coding system is not in FAOST_CODE then use the translateCountryCode function to translate it.
check_country_overlap( var, year = "Year", data, type = c("overlap", "multiChina"), take = c("simpleCheck", "takeNew", "takeOld", "complete") )
var |
The variable to be checked. |
year |
The column which indexes the time. |
data |
The data frame. |
type |
The type of check. |
take |
The type of check/replacement to be done when type equals "overlap". |
test.df = data.frame(FAOST_CODE = rep(c(51,167,199), each = 3),
                     Year = rep(c(1990:1992), 3),
                     Value = c(c(3,4,4), c(2,2,2), c(1,2,NA)))
check_country_overlap(var = "Value", data = test.df, type = "overlap", take = "simpleCheck")
check_country_overlap(var = "Value", data = test.df, type = "overlap", take = "takeNew")
check_country_overlap(var = "Value", data = test.df, type = "overlap", take = "takeOld")
check_country_overlap(var = "Value", data = test.df, type = "overlap", take = "complete")
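To make the idea of an overlap concrete, here is a minimal base R sketch that finds the years in which both an old country and its successors report a value. It is not the package's algorithm: find_overlap_years is a hypothetical helper, and treating code 51 as the old country and codes 167 and 199 as the new ones is an assumption made for illustration.

```r
# Toy data reusing the codes from the example above.
test.df <- data.frame(
  FAOST_CODE = rep(c(51, 167, 199), each = 3),
  Year = rep(1990:1992, 3),
  Value = c(3, 4, 4, 2, 2, 2, 1, 2, NA)
)

# Hypothetical helper: years in which both the old country and at least one
# new country report a non-missing value.
find_overlap_years <- function(data, old, new, var, year = "Year") {
  has_value <- !is.na(data[[var]])
  old_years <- data[[year]][has_value & data$FAOST_CODE %in% old]
  new_years <- data[[year]][has_value & data$FAOST_CODE %in% new]
  sort(intersect(old_years, new_years))
}

overlap_years <- find_overlap_years(test.df, old = 51, new = c(167, 199),
                                    var = "Value")
print(overlap_years)
```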
Function for generating the n-period absolute change
chgr(x, n = 1)
x |
The time series for the change to be calculated. |
n |
The period for the growth to be calculated over. |
In order to ensure the change calculated is reliable, the following rules are applied.
50% of the data must be present.
The length of the time series must be greater than n
Otherwise the growth will not be computed.
The n-period change of the time series.
test.ts = abs(rnorm(100))
chgr(test.ts, 1)
chgr(test.ts, 3)
chgr(test.ts, 10)
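The rules above can be sketched in a few lines of base R. This is not the source of chgr(): abs_change is a hypothetical helper, and reading the rules as "return only NA when fewer than half of the values are present or the series is not longer than n" is an assumption.

```r
# Hypothetical helper: n-period absolute change x[t] - x[t - n].
# Returns all NA when fewer than half of the values are present or when the
# series is not longer than n (assumed reading of the rules above).
abs_change <- function(x, n = 1) {
  len <- length(x)
  if (len <= n || mean(!is.na(x)) < 0.5) {
    return(rep(NA_real_, len))
  }
  c(rep(NA_real_, n), x[(n + 1):len] - x[1:(len - n)])
}

x <- c(10, 12, 15, 19, 24)
print(abs_change(x, 1))
print(abs_change(x, 2))
print(abs_change(x, 5))
```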
This function should only be used when performing aggregations.
CHMT(var, data, year = "Year")
var |
The variables that need to be sanitized. |
data |
The data frame which contains the data |
year |
The column which corresponds to the year. |
We decided to use the smaller subsets at the regional level because a weighting variable may not exist for other variables in the larger subsets.
The function only works for FAOST_CODE. If the country coding system is not in FAOST_CODE then use the translateCountryCode function to translate it.
A function used to construct new variables from existing variables.
constructSYB( data, origVar1, origVar2, newVarName = NA, constructType = c("share", "growth", "change", "index"), grFreq = 1, grType = c("ls", "geo"), baseYear = 2000 )
data |
The data frame containing the raw variable |
origVar1 |
The variable name to be used in construction, refer to Details for more information and usage. |
origVar2 |
The variable name to be used in construction, refer to Details for more information and usage. |
newVarName |
The name assigned to the new variable, if missing then .SC/.SH/.GR/.CH will be appended depending on the type of construction |
constructType |
The type of construction, refer to Details for more information. |
grFreq |
The frequency for the growth rate to be computed. |
grType |
The method for the growth to be calculated, currently supports least squares and geometric. |
baseYear |
The base year to be used for constructing index. |
Currently two types of construction are supported, either share or growth rate computation.
Share can be a share of total or share of another variable depending on whether an additional variable is supplied or not.
A data frame containing both the original data frame and the processed data and also a list indicating whether the construction passed or failed.
get_faostat_bulk() loads the given data set code and returns a data frame.
download_faostat_bulk() loads data from the given url and saves it to a compressed zip file.
read_faostat_bulk() reads the compressed .csv .zip file into a data frame. More precisely, it unzips the archive and reads the main csv file within the archive. The main file has the same name as the name of the archive. Note: the zip archive might also contain metadata files about Flags and Symbols.
In general you should load the data with the function get_faostat_bulk() and a dataset code. The other functions are lower level functions that you can use as an alternative. You can also explore the datasets and find their download URLs on the FAOSTAT website. Explore the website to find the data you are interested in: https://www.fao.org/faostat/en/#data
Copy a "bulk download" url; for example, they are located in the right menu on the "crops" page https://www.fao.org/faostat/en/#data/QC
Note that faostat bulk files with names ending with "normalized" are in long format, with a year column instead of one column for each year. The long format is preferable for data analysis and this is the format returned by the get_faostat_bulk() function.
download_faostat_bulk(url_bulk, data_folder = ".")
read_faostat_bulk(zip_file_name, encoding = "latin1", rename_element = TRUE)
get_faostat_bulk(code, data_folder = tempdir(), subset = "All Data Normalized")
read_bulk_metadata(dataset_code)
url_bulk |
character url of the faostat bulk zip file to download |
data_folder |
character path of the local folder where to download the data |
zip_file_name |
character name of the zip file to read |
encoding |
parameter passed to 'read.csv'. |
rename_element |
boolean Rename the element column to snake case. To facilitate the use of elements as column names later when the data frame gets reshaped to a wider format. Replace non alphanumeric characters by underscores. |
code |
character. Dataset code |
subset |
character. Use |
dataset_code |
character. Dataset code |
data frame of FAOSTAT data
Paul Rougieux
## Not run: 
# Create a folder to store the data
data_folder <- "data_raw"
dir.create(data_folder)

# Load crop production data
crop_production <- get_faostat_bulk(code = "QCL", data_folder = data_folder)

# Cache the file i.e. save the data frame in the serialized RDS format for faster load time later.
saveRDS(crop_production, "data_raw/crop_production_e_all_data.rds")
# Now you can load your local version of the data from the RDS file
crop_production <- readRDS("data_raw/crop_production_e_all_data.rds")

# Use the lower level functions to download zip files,
# then read the zip files in separate function calls.
# In this example, to avoid a warning about "examples lines wider than 100 characters"
# the url is split in two parts: a common part 'url_bulk_site' and a .zip file name part.
# In practice you can enter the full url directly as the `url_bulk` argument.
# Notice also that I have chosen to load global data in long format (normalized).
url_bulk_site <- "https://fenixservices.fao.org/faostat/static/bulkdownloads"
url_crops <- file.path(url_bulk_site, "crop_production_E_All_Data_(Normalized).zip")
url_forestry <- file.path(url_bulk_site, "Forestry_E_All_Data_(Normalized).zip")

# Download the files
download_faostat_bulk(url_bulk = url_forestry, data_folder = data_folder)
download_faostat_bulk(url_bulk = url_crops, data_folder = data_folder)

# Read the files and assign them to data frames
crop_production <- read_faostat_bulk("data_raw/crop_production_E_All_Data_(Normalized).zip")
forestry <- read_faostat_bulk("data_raw/Forestry_E_All_Data_(Normalized).zip")

# Save the data frame in the serialized RDS format for fast reuse later.
saveRDS(crop_production, "data_raw/crop_production_e_all_data.rds")
saveRDS(forestry,"data_raw/forestry_e_all_data.rds")
## End(Not run)
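The caching step in the example above can be wrapped in a small helper so that the download only happens once. The following is a sketch in base R, not a function provided by the package: load_cached and slow_loader are hypothetical names, and the loader here is a stand-in that returns a toy data frame instead of contacting FAOSTAT.

```r
# Hypothetical helper: return the cached RDS file if it exists, otherwise
# call `loader()` (e.g. a download function), cache the result and return it.
load_cached <- function(rds_path, loader) {
  if (file.exists(rds_path)) {
    return(readRDS(rds_path))
  }
  data <- loader()
  saveRDS(data, rds_path)
  data
}

# Stand-in for a slow download; counts how often it is really called.
n_calls <- 0
slow_loader <- function() {
  n_calls <<- n_calls + 1
  data.frame(area = c("A", "B"), year = 2020, value = c(1.5, 2.5))
}

cache_file <- tempfile(fileext = ".rds")
first  <- load_cached(cache_file, slow_loader)  # runs the loader
second <- load_cached(cache_file, slow_loader)  # reads the cache
print(n_calls)
```

In practice the loader could be an anonymous function wrapping get_faostat_bulk(), with the RDS file stored in the data folder rather than in a temporary file.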
A data frame is chosen over a list solely for the purpose of transition to ggplot2.
ebind(territory = NULL, subregion = NULL, region = NULL, world = NULL)
territory |
The data frame which contains the territory/country level data |
subregion |
The subregion aggregate |
region |
The macro region aggregate |
world |
The world aggregate |
A table containing the relationship between the domain, element, item codes for downloading data from the FAOSTAT API.
Region profile containing the codes, names and regional classifications of countries.
This function can be useful when a dataset provided does not have a country code available.
fillCountryCode(country, data, outCode = "FAOST_CODE")
country |
The column name of the data which contains the country name |
data |
The data frame to be matched |
outCode |
The output country code system, defaults to the FAO standard. |
Function for generating the n-period rolling geometric growth rate.
geogr(x, n = 1)
x |
The time series for the growth rate to be calculated. |
n |
The period for the growth to be calculated over. |
In order to ensure the growth rate calculated is reliable, the following rules are applied.
50% of the data must be present.
The length of the time series must be greater than n
Otherwise the growth will not be computed.
The n-period geometric growth rate of the time series.
test.ts = abs(rnorm(100))
geogr(test.ts, 1)
geogr(test.ts, 3)
geogr(test.ts, 10)
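For readers unfamiliar with the calculation, an n-period geometric growth rate is commonly defined as (x[t] / x[t - n])^(1 / n) - 1. The base R sketch below follows that definition and the two reliability rules listed above. It is not the source of geogr(): geo_growth is a hypothetical helper, and expressing the result in percent is an assumption.

```r
# Hypothetical helper: n-period geometric growth rate in percent,
# ((x[t] / x[t - n])^(1 / n) - 1) * 100. Scaling to percent is an assumption.
geo_growth <- function(x, n = 1) {
  len <- length(x)
  if (len <= n || mean(!is.na(x)) < 0.5) {
    return(rep(NA_real_, len))
  }
  ratio <- x[(n + 1):len] / x[1:(len - n)]
  c(rep(NA_real_, n), (ratio^(1 / n) - 1) * 100)
}

# A series growing by exactly 10% per period.
x <- 100 * 1.1^(0:5)
print(geo_growth(x, 1))
print(geo_growth(x, 3))
```

Because the toy series grows at a constant rate, the 1-period and 3-period rates agree wherever both are defined.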
A function to extract data from the World Bank API
Please refer to https://data.worldbank.org/ for any differences between the country code systems. Further details on World Bank classification and methodology are available on that website.
getWDI( indicator = "SP.POP.TOTL", name = NULL, startDate = 1960, endDate = format(Sys.Date(), "%Y"), printURL = FALSE, outputFormat = "wide" )
indicator |
The World Bank official indicator name. |
name |
The new name to be used in the column. |
startDate |
The start date for the data to begin |
endDate |
The end date. |
printURL |
Whether the url link for the data should be printed |
outputFormat |
The format of the data, can be 'long' or 'wide'. |
Sometime after 2016, there was a change in the API according to https://datahelpdesk.worldbank.org/knowledgebase/articles/889392-about-the-indicators-api-documentation: "Version 2 (V2) of the Indicators API has been released and replaces V1 of the API. V1 API calls will no longer be supported. To use the V2 API, you must place v2 in the call."
Original (2011) source by Markus Gesmann: https://lamages.blogspot.it/2011/09/setting-initial-view-of-motion-chart-in.html Also available at https://www.magesblog.com/post/2011-09-25-accessing-and-plotting-world-bank-data/
A data frame containing the desired World Bank Indicator
See also the WDI package https://cran.r-project.org/package=WDI for an implementation with many more features.
## pop.df = getWDI()
A function to extract the definition and the meta data from the World Bank API
getWDImetaData( indicator, printMetaData = FALSE, saveMetaData = FALSE, saveName = "worldBankMetaData" )
indicator |
The World Bank official indicator name. |
printMetaData |
logical, print out the meta data information |
saveMetaData |
logical, whether meta data should be saved as a local csv file. |
saveName |
The name of the file for the meta data to save to. |
## pop.df = getWDImetaData("SP.POP.TOTL",
##                         printMetaData = TRUE, saveMetaData = TRUE)
The function downloads data from the World Bank API.
getWDItoSYB( indicator = "SP.POP.0014.TO.ZS", name = NULL, startDate = 1960, endDate = format(Sys.Date(), "%Y"), printURL = FALSE, getMetaData = TRUE, printMetaData = FALSE, saveMetaData = FALSE, outputFormat = c("wide", "long") )
indicator |
The World Bank official indicator name. |
name |
The new name to be used in the column. |
startDate |
The start date for the data to begin |
endDate |
The end date. |
printURL |
Whether the url link for the data should be printed |
getMetaData |
Whether the data definition and the meta data should be downloaded as well. |
printMetaData |
logical, print out the meta data information |
saveMetaData |
logical, whether meta data should be saved as a local csv file |
outputFormat |
The format of the data, can be 'long' or 'wide'. |
A list containing the following elements
The country level data
The aggregates provided by the World Bank
The metaData associated with the data
The status of the download, whether success/failed
## pop.df = getWDItoSYB(name = "total_population",
##                      indicator = "SP.POP.TOTL")
A function for constructing growth rate variables.
grConstruct(data, origVar, newVarName = NA, type = c("geo", "ls", "ch"), n = 1)
data |
The data frame containing the data |
origVar |
The variable in which the growth is to be calculated |
newVarName |
The name assigned to the new variable, if missing then .SC/.SH/.GR will be appended depending on the type of construction. |
type |
The type of growth rate, can be least squares or geometric |
n |
The period for the growth rate to be calculated (Refer to the lsgr or the geogr functions.) |
A data frame containing the computed growth rate.
test.df2 = data.frame(FAOST_CODE = rep(c(1, 5000), each = 5),
                      Year = rep(1990:1994, 2),
                      a = rep(1:5, 2), b = rep(1:5, 2))
grConstruct(test.df2, origVar = "a", type = "geo", n = 1)
grConstruct(test.df2, origVar = "a", type = "geo", n = 3)
grConstruct(test.df2, origVar = "a", type = "geo", n = 5)
A function for constructing indices
indConstruct(data, origVar, newVarName = NA, baseYear = 2000)
data |
The data frame containing the data |
origVar |
The variable for which the index is to be computed |
newVarName |
The name assigned to the new variable, if missing then .SC/.SH/.GR/.CH/.IND will be appended depending on the type of construction. |
baseYear |
The year which will serve as the base |
The index
test.df = data.frame(FAOST_CODE = rep(1, 100), Year = 1901:2000, test = 1:100)
indConstruct(test.df, origVar = "test", baseYear = 1950)
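An index of this kind is typically the ratio of each value to the value in the base year. The sketch below shows that calculation in base R on the same toy data. It is not the source of indConstruct(): make_index is a hypothetical helper, and scaling so that the base year equals 100 is an assumption.

```r
test.df <- data.frame(FAOST_CODE = rep(1, 100), Year = 1901:2000, test = 1:100)

# Hypothetical helper: divide each value by the value of the same country in
# the base year and scale so that the base year equals 100 (an assumption).
make_index <- function(data, var, base_year,
                       country = "FAOST_CODE", year = "Year") {
  base <- data[data[[year]] == base_year, c(country, var)]
  names(base)[2] <- "base_value"
  out <- merge(data, base, by = country, all.x = TRUE)
  out$index <- out[[var]] / out$base_value * 100
  out[order(out[[country]], out[[year]]), c(country, year, var, "index")]
}

indexed <- make_index(test.df, var = "test", base_year = 1950)
print(indexed[indexed$Year %in% c(1925, 1950, 2000), ])
```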
Function for generating the n-period rolling least squares growth rate.
lsgr(x, n = 1)
x |
The time series for the growth rate to be calculated |
n |
The period for the growth to be calculated over. |
Missing values are omitted in the regression. (Will need to check this.)
WONTFIX (Michael): There is still some error associated with this function, will need to investigate further. Will need a rule for this, when the fluctuation is large and data are sufficient then take the lsgr, otherwise the geogr.
In order to ensure the growth rate calculated is reliable, the following rules are applied.
50% of the data must be present.
The length of the time series must be greater than n.
Otherwise the growth will not be computed.
The n-period least squares growth rate of the time series
test.ts = abs(rnorm(100))
lsgr(test.ts, 1)
lsgr(test.ts, 3)
lsgr(test.ts, 10)
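A least squares growth rate is commonly obtained by regressing the logarithm of the series on time and converting the slope b into a rate, exp(b) - 1. The base R sketch below applies that definition to a whole series rather than to a rolling n-period window. It is not the source of lsgr(): ls_growth is a hypothetical helper, and both the percent scaling and the handling of missing values are assumptions.

```r
# Hypothetical helper: least squares growth rate over a whole series.
# Fits log(x) = a + b * time by ordinary least squares and returns
# (exp(b) - 1) * 100; missing values are dropped before fitting.
ls_growth <- function(x) {
  time_index <- seq_along(x)
  keep <- !is.na(x)
  if (sum(keep) < 2) {
    return(NA_real_)
  }
  log_x <- log(x[keep])
  time_kept <- time_index[keep]
  fit <- lm(log_x ~ time_kept)
  unname((exp(coef(fit)[2]) - 1) * 100)
}

# A series growing by exactly 5% per period, with one missing value.
x <- 100 * 1.05^(0:9)
x[4] <- NA
print(ls_growth(x))
```

Unlike the geometric rate, which only uses the two end points of each window, this estimate uses every available observation, which is why it is less sensitive to an unusual first or last value.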
This function searches for a supported country code system and translates the data to allow for a join.
mergeSYB(x, y, outCode = "FAOST_CODE", all = TRUE, ...)
x |
data frames, or objects to be coerced to one. |
y |
data frames, or objects to be coerced to one. |
outCode |
The country code system to be used to join the different sources. |
all |
Same as the merge function, defaults to an outer join. |
... |
Arguments to be passed on to the merge function. |
The names of the data to be merged have to be the same as the FAOcountryProfile code names.
This function checks whether there is overlap between the transitional countries.
overlap(old, new, var, year = "Year", data, take)
old |
The FAOST_CODE of the old countries |
new |
The FAOST_CODE of the new countries |
var |
The variable to be checked |
year |
The column which indexes the time. |
data |
The data frame |
take |
The type of check/replacement to be done. |
A function to print standardised formatted labels without having messy codes in the functions.
printLab(label, span = FALSE, width = getOption("width"))
label |
The label to be printed |
span |
Whether the dash should span the whole width of the screen (80 characters) |
width |
The width of the screen. |
The formatted print
Lists the dimensions of a dataset including ids and labels. These can be used to query dataset dimension names and the codes therein. They can also be used to access groups, flags, units and the glossary
read_dataset_dimension(dataset_code)
read_dimension_metadata(dataset_code, dimension_code)
dataset_code |
character. Dataset as obtained from the code column of search_dataset |
dimension_code |
character. Dimensions as obtained from |
Uses the same functionality as the web interface to pull data from the FAOSTAT API. Contains most of its parameters. Currently only works for datasets that have area, item, element and year. Values for Chinese countries are not yet deduplicated.
read_fao( area_codes, element_codes, item_codes, year_codes, area_format = c("M49", "FAO", "ISO2", "ISO3"), item_format = c("CPC", "FAO"), dataset = "RL", metadata_cols = c("codes", "units", "flags", "notes"), clean_format = c("make.names", "unsanitised", "unsanitized", "snake_case"), include_na = FALSE, language = c("en", "fr", "es") )
getFAO( area_codes, element_codes, item_codes, year_codes, area_format = c("M49", "FAO", "ISO2", "ISO3"), item_format = c("CPC", "FAO"), dataset = "RL", metadata_cols = c("codes", "units", "flags", "notes"), clean_format = c("make.names", "unsanitised", "unsanitized", "snake_case"), include_na = FALSE, language = c("en", "fr", "es") )
area_codes |
character. FAOSTAT area codes |
element_codes |
character. FAOSTAT element codes |
item_codes |
character. FAOSTAT item codes |
year_codes |
character. Vector of desired years |
area_format |
character. Desired area code type in output (input still needs to use FAOSTAT codes) |
item_format |
character. Item code |
dataset |
character. FAO dataset desired, e.g. RL, FBS |
metadata_cols |
character. Metadata columns to include in output |
clean_format |
character/function. Whether to clean columns. Either one of the formats described in [change_case] or a formatting function |
include_na |
logical. Whether to include NAs for combinations of dimensions with no data |
language |
character. 2 letter language code for output labels |
data.frame in long format (wide not yet supported). Contains attributes for the URL and parameters used.
## Not run: 
# Get data for Cropland (6620) Area (5110) in Antigua and Barbuda (8) in 2017
df = read_fao(area_codes = "8", element_codes = "5110", item_codes = "6620",
              year_codes = "2017")
# Load cropland area for a range of years
df = read_fao(area_codes = "106", element_codes = "5110", item_codes = "6620",
              year_codes = 2010:2020)
# Find which country codes are available
metadata_area <- read_dimension_metadata("RL", "area")
# Find which items are available
metadata_item <- read_dimension_metadata("RL", "item")
# Find which elements are available
metadata_element <- read_dimension_metadata("RL", "element")
## End(Not run)
The function standardizes the data to the desired unit when the multiplier vector is supplied. For example per 1000 people is scaled to per person by supplying a multiplier of 1000.
scaleUnit(df, multiplier)
df |
The data frame containing the data to be scaled |
multiplier |
The named vector with the multiplier to be scaled. The name is mandatory in order for the function to identify the variable in the data frame. A data.frame can also be supplied with the first column being the name and the second being the numeric multiplier. |
## Create the data frame
test.df = data.frame(FAOST_CODE = 1:5, Year = 1995:1999, var1 = 1:5, var2 = 5:1)

## Create the named vector for scaling
multiplier = c(1, 10)
names(multiplier) = c("var1", "var2")

## Scale the data
scaleUnit(test.df, multiplier = multiplier)
Get full list of datasets from the FAOSTAT database with the Code, dataset name and updates.
search_dataset(dataset_code, dataset_label, latest = TRUE, reset_cache = FALSE)
FAOsearch(dataset_code, dataset_label, latest = TRUE, reset_cache = FALSE)
dataset_code |
character. Code of desired datasets, listed as 'code' in output. |
dataset_label |
character. Name of the datasets, listed as 'label' in the output data frame. Can take regular expressions. |
latest |
logical. Sort list by latest updates |
reset_cache |
logical. By default, data is saved after a first run and reused. Setting this to TRUE causes the function to pull data from FAO again. |
A data.frame with the columns: code, label, date_update, note_update, release_current, state_current, year_current, release_next, state_next, year_next
## Not run: 
# Find information about all datasets
fao_metadata <- search_dataset()
# Find information about forestry datasets
search_dataset(dataset_code="FO")
# Find information about datasets whose titles contain the word "Flows"
search_dataset(dataset_label="Flows")
## End(Not run)
A function for constructing the share of a variable of an aggregated variable.
shConstruct(data, totVar, shareVar, newVarName = NA)
data |
The data frame containing both the share variable and the aggregated variable |
totVar |
The aggregated variable. |
shareVar |
The subset of the aggregated variable, i.e. the variable to be divided by totVar. |
newVarName |
The name assigned to the new variable, if missing then .SC/.SH/.GR will be appended depending on the type of construction |
The share of a variable can be a share of the world total (if no additional variable is supplied) or a share of another variable (for example per capita, if population is supplied).
A data frame with the new constructed variable
## Total variables provided, scale by totVar
test.df = data.frame(FAOST_CODE = 1, Year = 1990:1994, a = 1:5, b = 1:5)
shConstruct(data = test.df, totVar = "a", shareVar = "b")

## Total variables not provided, scale by world aggregate.
test.df2 = data.frame(FAOST_CODE = rep(c(1, 5000), each = 5),
                      Year = rep(1990:1994, 2),
                      a = rep(1:5, 2), b = rep(1:5, 2))
shConstruct(data = test.df2, totVar = NA, shareVar = "b")
The function translates any country code scheme to another if both are among the types present in the FAO API. If you require other codes or conversion of country names to codes, consider the countrycode package.
translate_countrycodes( data, from = c("FAO", "M49", "ISO2", "ISO3"), to = c("M49", "FAO", "ISO2", "ISO3", "name"), oldCode, reset_cache = FALSE )
translateCountryCode( data, from = c("FAO", "M49", "ISO2", "ISO3"), to = c("M49", "FAO", "ISO2", "ISO3", "name"), oldCode, reset_cache = FALSE )
data |
The data frame |
from |
The name of the old coding system |
to |
The name of the new coding system |
oldCode |
The column name of the old country coding scheme |
reset_cache |
logical. Whether to pull data from FAOSTAT directly instead of caching |
This function translates numbers to character names or vice versa.
translateUnit(vec)
vec |
The vector containing name or number to be translated |
## Create numeric vector
myUnit = c(1000, 1e6, 1000, 1e9, 1e9, 1e12)

## Translate numeric to character
myUnit2 = translateUnit(myUnit)
myUnit2

## Now translate back
translateUnit(myUnit2)
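The two-way translation can be pictured as a lookup in a named vector. The base R sketch below shows one way to do this without calling translateUnit(). The translate_unit helper is hypothetical, and the labels in unit_lookup are assumed for illustration; the names actually returned by the package may differ.

```r
# Assumed lookup table; the labels used by the package may differ.
unit_lookup <- c(thousand = 1e3, million = 1e6, billion = 1e9, trillion = 1e12)

# Hypothetical helper: numbers become names, names become numbers.
translate_unit <- function(vec) {
  if (is.numeric(vec)) {
    names(unit_lookup)[match(vec, unit_lookup)]
  } else {
    unname(unit_lookup[as.character(vec)])
  }
}

my_unit <- c(1000, 1e6, 1000, 1e9, 1e9, 1e12)
my_unit2 <- translate_unit(my_unit)
print(my_unit2)
print(translate_unit(my_unit2))
```

Values that are not in the lookup table come back as NA in either direction, which makes unmatched units easy to spot.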