7  Discussion of processing choices

8 Discussion of possible issues with the original study

Here we list a series of data processings that raised questions during the re-implementation.

8.1 Erratic data for yearloss

Code
library(aws.s3)
library(tidyverse)
library(terra)
library(sf)

# Load both versions of the data
lossyear_wolf <- s3read_using(FUN = rast, object = 
                    "Replication_wolf/data_input/lossyear.tif",
                    bucket = "fbedecarrats", opts = list("region" = ""))
lossyear_new <- s3read_using(FUN = rast, object = 
                    "Replication_wolf/data_input/lossyear_manual_reduce.tif",
                    bucket = "fbedecarrats", opts = list("region" = ""))
# Generate statistics comparing both

Re-run the analysis using manual reduction

Code
library(raster)
library(rgdal)
library(gdalUtils)
library(aws.s3)
library(tidyverse)


if (!stringr::str_detect(getwd(), "PA_matching")) {
  setwd("PA_matching")
}
if (!stringr::str_ends(getwd(), "replication_report")) {
  setwd("replication_report")
}

dir.create("new_input")
dir.create("new_processed")
dir.create("new_processed/rasters")

save_from_s3(from = "Replication_wolf/data_processed/rasters/elev.tif",
             to = "new_input/elev.tif")
save_from_s3(from = "Replication_wolf/data_input/lossyear_clean_4326.tif",
             to = "new_input/lossyear_clean_4326.tif")

rasterOptions(maxmemory = 1e+09)
rasterOptions(chunksize = 1e+09)

elev <- raster("new_input/elev.tif")

gdalwarp(srcfile = "new_input/lossyear_clean_4326.tif",
         "new_input/lossyear_proj.vrt",
         tr = c(1000,1000), of="VRT",
         t_srs = "+proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 +datum=WGS84 +ellps=WGS84 +units=m +no_defs",
         te = extent(elev)[c(1,3,2,4)])

gdal_translate("new_input/lossyear_proj.vrt",
               "new_processed/rasters/lossyear.tif",
               of="GTiff",
               co=c("COMPRESS=LZW","BIGTIFF=YES"))

put_to_s3(from = "new_processed/rasters/lossyear.tif",
          to = "Replication_wolf/data_processed/rasters/lossyear_redo.tif")

The original paper reports:

The establishment of these PAs was associated with a moderate increase in the deforestation rate (average = 0.19%; s.e.m. = 0.02%), whereas control areas saw a larger increase in the deforestation rate (average = 0.61%; s.e.m. = 0.02%) over the same time span (Extended Data Fig. 3).

Whereas we find that the establishment of PAs is associated with a lower increase (average = 0,08%; sem = 0.02%), whereas control see a larger increase of deforestation, but also in a smaller scale than reported in the previous study (0.38%; sem: 0.2%)

> # paper text
> round(100*mean(df$PA_loss_before), 3)
[1] 0.465
> round(100*mean(df$PA_loss_after), 3)
[1] 0.541
> P = df$PA_loss_after - df$PA_loss_before
> C = df$Control_loss_after - df$Control_loss_before
> round(100*mean(P), 3)
[1] 0.076
> round(100*mean(C), 3)
[1] 0.382
> 100*sd(P)/sqrt(length(P))
[1] 0.02009433
> 100*sd(C)/sqrt(length(C))
[1] 0.01534645
> round(100*mean(df$Control_loss_before), 3)
[1] 0.797
> round(100*mean(df$Control_loss_after), 3)
[1] 1.179
> round(100*tapply(df_small$value, list(df_small$variable, df_small$Continent), mean),3)
                South America Oceania North America Europe  Asia Africa
Control areas           0.184   0.164         0.004  0.778 0.334  0.986
Protected areas         0.109   0.247        -0.089  0.199 0.087  0.459
                Lower GDP Higher GDP
Control areas       0.330      0.393
Protected areas     0.142      0.063
Code
save_from_s3(from = "Replication_wolf/data_processed/data_matched_10_class.csv",
             to = "data_processed/data_matched_10_class.csv")
save_from_s3(from = "Replication_wolf/data_processed/data_matched.csv",
             to = "data_processed/data_matched.csv")

Execute julia, execute merge results.

8.2 Year of deforestation corresponds to mode for coarsened polygon

8.2.1 Difference in computation

The authors describe the corresponding part of their data preparation as “We used the following layers from version 1.6 of Hansen et al.2: tree cover in the year 2000 (percentage of each pixel); forest loss between 2001 and 2018 (binary raster) forest gain between 2000 and 2012 (also binary); and primary year associated with forest loss event” (Wolf et al. 2021, suppl. mat. p. 2). In fact, the way this pimary year associated with forest loss event is computationnaly processed as follow.

Code
lossyear = lossyear.updateMask(lossyear.neq(0)).reduceResolution(reducer=ee.Reducer.mode(maxRaw=20), bestEffort=True)

This means that while aggregating several (~1000) pixels, the authors retained only one year, corresponding to the year in which the most of occurrence of deforestation occurred among all the aggregated polygons.

This might alter the calculation, potentially masking the first years of deforestation and therefore delaying the impact of conservation (ie. deforestation that started before PA creation only appears after PA creation).

8.2.2 Incidence on results

We try to follow the same data process, but changing this by calculating forest loss event for each year.

[to be done]

8.3 Coastal protected areas are removed

8.3.1 Issue in computation

Wolf et al. (2021, suppl. mat. p.2) indicate that they excluded PAs that were exclusiverly marine, but this operation was coded as follows.

Code
main_PAs = PA_df[(PA_df.MARINE .== 0) .& (PA_df.STATUS .!= "Proposed") .& (PA_df.GIS_AREA .>= 1),:]

The MARINE variable in WDPA is coded as follows:

“0 (predominantly or entirely terrestrial), 1 (Coastal: marine and terrestrial), and 2 (predominantly or entirely marine). The value ‘1’ is only used for polygons.” (WDPA Manual version 1.6, 2019).

This means that in practice, coastal areas have been excluded from the analysis. This can be problematic, considering for instance that coastal areas can expand far inland, and that they include among other biomes the mangroves, that were one of the focus biome for the analysis.

8.3.2 Incidence on results

We try to compute the same analysis, but including the areas that were both terrestial or coastal.

8.4 Reprojection after slope calculation and resolution reduction?

The Google earth engine documentation warns against reprojecting after slope calculation or resolution reduction. It could be due to some internal processing contraints but my understanding is that it is a contraint from the area calculation.

8.5 No use of certain matching covariates in the impact estimation model

cf. doubly robust matching, cf. Ho et al. 2007

8.6 Problems with lossyear