UC02: Deforestation in Acre, Brazil

Querying and visualizing deforestation data using PRODES as source

Objective

In this use case we shift our focus from forest cover to deforestation. We query annual deforestation data for Acre, Brazil, from 2010 to 2019, using PRODES, the official Brazilian deforestation monitoring system operated by INPE.

This example also demonstrates an important aspect of the GJD API: filtering by data source, since multiple sources may report on the same topic for the same jurisdiction.

Prerequisites

Make sure you have completed the Get Started and API Authentication sections before running this notebook.

Step 1: Load Libraries

library(httr2)


library(jsonlite)
library(dplyr)
library(ggplot2)
library(DT)


Step 2: Understand the Query Parameters

For this use case:

Parameter	Value	Meaning
`ID_Topic`	`11`	Deforestation
`ID_Countries`	`["BR"]`	Brazil
`ID_Jurisdictions`	`["BR-AC"]`	Acre
`ID_years`	`["2010", ..., "2019"]`	Years 2010–2019
`ID_sources`	`[12]`	PRODES Amazonia, Brazil

Why filter by source?

The GJD API may return data from multiple sources for the same topic and jurisdiction. For example, deforestation in Acre is reported by PRODES, Hansen/UMD/Google, and MapBiomas, each using a different methodology and yielding different values. Specifying ID_sources=[12] restricts the response to PRODES, so every value in the series comes from a single methodology.
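The array-valued parameters in the table are sent as JSON-encoded strings. As a minimal sketch, and assuming the API accepts any valid compact JSON array in these fields, `jsonlite::toJSON()` can build them from ordinary R vectors instead of hand-written strings. The helper name `to_json_array` is hypothetical and is not part of the GJD API or of any package.

```r
library(jsonlite)

# Hypothetical helper: encode an R vector as a compact JSON array string.
to_json_array <- function(x) {
  as.character(toJSON(x))
}

countries     <- to_json_array("BR")                      # '["BR"]'
jurisdictions <- to_json_array("BR-AC")                   # '["BR-AC"]'
years_json    <- to_json_array(as.character(2010:2019))   # years as quoted strings
source_ids    <- to_json_array(12)                        # '[12]'
no_filter     <- to_json_array(character(0))              # '[]', an empty array

cat(countries, jurisdictions, source_ids, no_filter, "\n")
```

Converting the years with `as.character()` keeps them quoted, matching the `["2010", ..., "2019"]` form shown in the table.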

Step 3: Build and Send the Request

base_url <- "https://api.greenjurisdictions.org/api/v1/dataPlaces/false/true/false"

years <- paste0('"', 2010:2019, '"', collapse = ",")

query_params <- list(
  ID_Topic         = 11,
  ID_Countries     = '["BR"]',
  ID_Jurisdictions = '["BR-AC"]',
  ID_Municipalities = '[]',
  ID_years         = paste0("[", years, "]"),
  ID_sources       = '[12]'
)

response <- request(base_url) |>
  req_url_query(!!!query_params) |>
  req_headers(
    "X-API-TOKEN"  = Sys.getenv("GJD_API_KEY"),
    "Accept"       = "application/json",
    "Content-Type" = "application/json"
  ) |>
  req_perform()

cat("HTTP status:", resp_status(response), "\n")

HTTP status: 200
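The request above assumes that the GJD_API_KEY environment variable is set. The sketch below wraps the same httr2 calls in a helper that stops early when the key is missing and adds a timeout and retries. The name `build_gjd_request` is hypothetical, and the timeout and retry values are arbitrary assumptions rather than GJD API requirements.

```r
library(httr2)

# Hypothetical helper: assemble (but do not send) a GJD API request.
build_gjd_request <- function(url, params, api_key = Sys.getenv("GJD_API_KEY")) {
  # Fail early with a readable message instead of an authentication error.
  if (!nzchar(api_key)) {
    stop("GJD_API_KEY is not set; see the API Authentication section.")
  }
  request(url) |>
    req_url_query(!!!params) |>
    req_headers(
      "X-API-TOKEN" = api_key,
      "Accept"      = "application/json"
    ) |>
    req_timeout(30) |>         # assumed limit, in seconds
    req_retry(max_tries = 3)   # retry responses httr2 treats as transient
}

req <- build_gjd_request(
  "https://api.greenjurisdictions.org/api/v1/dataPlaces/false/true/false",
  list(ID_Topic = 11, ID_sources = "[12]"),
  api_key = "dummy-key"  # placeholder so the sketch runs without a real key
)
```

Calling `req_perform(req)` would send the request, and `req_dry_run(req)` prints it without contacting the server. By default `req_perform()` raises an R error for HTTP 4xx and 5xx responses, so a failed call stops the notebook rather than passing an error body to the parsing step.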

Step 4: Parse the Response

body <- resp_body_json(response, simplifyVector = TRUE)

cat("Message:", body$message, "\n")

Message: Resources retrieved successfully.

cat("Total records:", body$data$total_data, "\n")

Total records: 10

df <- body$data$data |>
  as_tibble() |>
  mutate(
    value = as.numeric(value),
    year  = as.integer(year)
  ) |>
  select(year, jurisdiction, value, unit, source) |>
  arrange(year)

glimpse(df)

Rows: 10
Columns: 5
$ year         <int> 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019
$ jurisdiction <chr> "Acre", "Acre", "Acre", "Acre", "Acre", "Acre", "Acre", "…
$ value        <dbl> 25900, 28000, 30500, 22100, 30900, 26400, 37200, 25700, 4…
$ unit         <chr> "hectares", "hectares", "hectares", "hectares", "hectares…
$ source       <chr> "PRODES Amazonia, Brazil", "PRODES Amazonia, Brazil", "PR…
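Before plotting, it can help to confirm that the parsed table matches what the API reported. The sketch below shows one possible check, run against a small mock that mimics the fields used in this step. The function `check_gjd_table` is hypothetical, the mock holds only the first two values from the output above, and the string-typed fields are an assumption based on the `as.numeric()` and `as.integer()` conversions.

```r
library(dplyr)

# Hypothetical check: compare a parsed table against the API's record count
# and confirm that exactly one source is present.
check_gjd_table <- function(df, expected_n) {
  stopifnot(
    nrow(df) == expected_n,       # every reported record was parsed
    !anyNA(df$value),             # numeric conversion produced no NAs
    n_distinct(df$source) == 1,   # the source filter took effect
    !any(duplicated(df$year))     # one record per year
  )
  invisible(df)
}

# Mock of the response structure (assumed: values arrive as strings).
mock_body <- list(data = list(
  total_data = 2,
  data = data.frame(
    year   = c("2010", "2011"),
    value  = c("25900", "28000"),
    source = "PRODES Amazonia, Brazil"
  )
))

df_mock <- mock_body$data$data |>
  as_tibble() |>
  mutate(value = as.numeric(value), year = as.integer(year))

check_gjd_table(df_mock, mock_body$data$total_data)
```

With the real response, the equivalent call would be `check_gjd_table(df, body$data$total_data)`.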

Step 5: Interactive Table

df |>
  datatable(
    caption  = "Annual deforestation in Acre, Brazil (2010–2019) — PRODES",
    colnames = c("Year", "Jurisdiction", "Deforestation", "Unit", "Source"),
    options  = list(
      pageLength = 10,
      dom        = "tip"
    ),
    rownames = FALSE
  ) |>
  formatRound("value", digits = 0, mark = ",")

Step 6: Bar Chart

ggplot(df, aes(x = factor(year), y = value / 1e3)) +
  geom_col(fill = "#c0392b", width = 0.7) +
  geom_text(
    aes(label = scales::comma(value, accuracy = 1)),
    vjust = -0.5, size = 3, color = "#17252a"
  ) +
  scale_y_continuous(
    labels = scales::comma_format(suffix = "K"),
    expand = expansion(mult = c(0, 0.12))
  ) +
  labs(
    title    = "Annual Deforestation in Acre, Brazil (2010–2019)",
    subtitle = "Source: PRODES Amazonia (INPE) — via GJD API",
    x        = "Year",
    y        = "Deforestation (thousand hectares)"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title       = element_text(face = "bold"),
    panel.grid.major.x = element_blank()
  )

Step 7: Exploring Multiple Sources

To illustrate why source filtering matters, let’s make the same request without the ID_sources filter and see what we get.

query_all_sources <- list(
  ID_Topic         = 11,
  ID_Countries     = '["BR"]',
  ID_Jurisdictions = '["BR-AC"]',
  ID_Municipalities = '[]',
  ID_years         = paste0("[", years, "]")
)

response_all <- request(base_url) |>
  req_url_query(!!!query_all_sources) |>
  req_headers(
    "X-API-TOKEN"  = Sys.getenv("GJD_API_KEY"),
    "Accept"       = "application/json",
    "Content-Type" = "application/json"
  ) |>
  req_perform()

body_all <- resp_body_json(response_all, simplifyVector = TRUE)

df_all <- body_all$data$data |>
  as_tibble() |>
  mutate(
    value = as.numeric(value),
    year  = as.integer(year)
  ) |>
  select(year, jurisdiction, value, unit, source) |>
  arrange(year, source)

cat("Records with all sources:", nrow(df_all), "\n")

Records with all sources: 30

cat("Sources found:", paste(unique(df_all$source), collapse = ", "), "\n")

Sources found: Hansen et al./UMD/Google/USGS/NASA, post-processing by EII, MapBiomas - Brasil, PRODES Amazonia, Brazil
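A year-by-year comparison is only meaningful if every source covers the same years. The sketch below summarises coverage per source with dplyr. It runs on a mock that holds only the year and source columns, with the source labels taken from the output above. The mock assumes complete coverage, which is consistent with 30 records from three sources over ten years but is not confirmed by the output.

```r
library(dplyr)

source_labels <- c(
  "PRODES Amazonia, Brazil",
  "Hansen et al./UMD/Google/USGS/NASA, post-processing by EII",
  "MapBiomas - Brasil"
)

# Mock with the same shape as df_all, minus the value columns (assumption:
# every source reports every year).
df_mock <- expand.grid(
  year = 2010:2019,
  source = source_labels,
  stringsAsFactors = FALSE
) |>
  as_tibble()

# Number of records and year range per source.
coverage <- df_mock |>
  group_by(source) |>
  summarise(
    n_records  = n(),
    first_year = min(year),
    last_year  = max(year),
    .groups    = "drop"
  )

print(coverage)
```

Replacing `df_mock` with `df_all` applies the same summary to the real response. A source with fewer records than the others would show up here before it appears as a missing bar in a chart.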

Now let’s visualize how the three sources compare:

ggplot(df_all, aes(x = factor(year), y = value / 1e3, fill = source)) +
  geom_col(position = "dodge", width = 0.7) +
  scale_fill_manual(values = c(
    "PRODES Amazonia, Brazil" = "#c0392b",
    "Hansen et al./UMD/Google/USGS/NASA, post-processing by EII" = "#2980b9",
    "MapBiomas - Brasil" = "#27ae60"
  )) +
  scale_y_continuous(
    labels = scales::comma_format(suffix = "K"),
    expand = expansion(mult = c(0, 0.08))
  ) +
  labs(
    title    = "Deforestation in Acre: Comparing Data Sources (2010–2019)",
    subtitle = "Three sources report different values for the same jurisdiction and topic",
    x        = "Year",
    y        = "Deforestation (thousand hectares)",
    fill     = "Data Source"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title       = element_text(face = "bold"),
    panel.grid.major.x = element_blank(),
    legend.position  = "bottom",
    legend.direction = "vertical"
  )

Key Takeaway

Different data sources use different methodologies (satellite imagery resolution, classification algorithms, reference periods). Always be explicit about which source you are using in your analysis and document your choice.

Summary

In this use case we learned how to:

1. Query deforestation data (Topic 11) from the GJD API.
2. Filter by a specific data source (PRODES) to ensure consistency.
3. Visualize annual deforestation as a bar chart with ggplot2.
4. Compare multiple data sources for the same jurisdiction to understand methodological differences.

Next Steps

Combine deforestation and forest cover data to calculate deforestation rates.
Compare deforestation across multiple jurisdictions in the Brazilian Amazon.
Explore how deforestation trends relate to specific policy interventions.
