Silvilaser 2019

Poster Presentations

Re-thinking accuracy assessment standards for LiDAR models predicting forest variables

The accuracy assessed for model-assisted LiDAR predictions of forest variables usually lacks an external validation carried out with an independent dataset. The reason is not only the high cost of field data acquisition, but also the fact that dividing the available data into two separate groups, one for training and another for validation, inherently leads to a loss of statistical power. Consequently, cross-validation methods are typically used in this context to calculate and report mean absolute differences, root mean squared errors and coefficients of determination between observed and predicted values. In this research we questioned the sufficiency of these common procedures, suspecting that they may conceal cases in which the models are in fact overfitted to the sample. To demonstrate this, we employed a reductio ad absurdum argument, showing clearly unreliable models (including an unrealistic number of predictors) which could nonetheless be taken for good in light of these common accuracy assessment metrics. We then investigated the value of adding to the assessment a measure designed to avoid overfitting, which consisted in limiting the inflation of the sum of squares observed in the cross-validation relative to that of the overall model fit. Results showed that overfitting can generally be avoided without greatly compromising model precision, effectively yielding more reliable maps at the high resolution delivered by LiDAR. The harmful effects of overfitting lie not just in a lack of model generality, but also in a tendency to predict towards the average (i.e., to systematically underestimate high values and overestimate low ones). This tendency is difficult to detect because it still yields an unbiased mean estimate, yet it renders maps with many false predictions at the pixel level. Further research should focus on how these overfitting effects can be prevented using different cross-validation settings. While we focused our research on the prediction of quantitative variables, we suspect that classification models may be equally affected by overfitting, and we therefore recommend similar research on the typical measures employed in the evaluation of contingency tables, such as the kappa coefficient.
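The overfitting safeguard described above can be illustrated with a short simulation. The sketch below is not the authors' code; the k-fold setup, variable names, and the 1.25 flagging threshold are illustrative assumptions. It fits ordinary least squares models to pure noise and compares the cross-validated sum of squared errors (SSE_cv) against the sum of squares of the overall fit (SSE_fit): a large inflation ratio SSE_cv / SSE_fit exposes the absurd 30-predictor model even though its fit statistics look excellent.

    import numpy as np

    rng = np.random.default_rng(42)

    def kfold_sse_inflation(X, y, k=10):
        """Fit OLS, then compare overall-fit SSE with k-fold cross-validated SSE."""
        n = len(y)
        Xd = np.column_stack([np.ones(n), X])          # design matrix with intercept
        beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)  # overall model fit
        sse_fit = np.sum((y - Xd @ beta) ** 2)

        idx = rng.permutation(n)                       # shuffle plots into k folds
        y_cv = np.empty(n)
        for fold in np.array_split(idx, k):
            train = np.setdiff1d(idx, fold)            # refit without the held-out fold
            b, *_ = np.linalg.lstsq(Xd[train], y[train], rcond=None)
            y_cv[fold] = Xd[fold] @ b                  # predict the held-out plots
        sse_cv = np.sum((y - y_cv) ** 2)
        return sse_cv / sse_fit                        # ratio near 1 means little overfit

    # Simulated field data: 50 plots whose response is unrelated to the predictors.
    n = 50
    y = rng.normal(20, 5, n)
    X_small = rng.normal(size=(n, 2))    # parsimonious model
    X_big = rng.normal(size=(n, 30))     # absurdly many predictors, as in the abstract

    for name, X in [("2 predictors", X_small), ("30 predictors", X_big)]:
        ratio = kfold_sse_inflation(X, y)
        flag = "possible overfit" if ratio > 1.25 else "ok"  # hypothetical threshold
        print(f"{name}: SSE_cv / SSE_fit = {ratio:.2f} ({flag})")

A parsimonious model typically keeps the ratio close to one, whereas the 30-predictor noise model fits the sample almost perfectly and therefore inflates the ratio severely; capping this inflation is one simple way to operationalise the safeguard proposed above.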

Rubén Valbuena
School of Natural Sciences, Bangor University
United Kingdom

Ana Hernando
College of Forestry and Natural Environment, Research Group SILVANET, Universidad Politecnica de Madrid
Spain

José Antonio Manzanera
College of Forestry and Natural Environment, Research Group SILVANET, Universidad Politecnica de Madrid
Spain

Eric Bastos Görgens
Department of Forestry, Universidade Federal dos Vales do Jequitinhonha e Mucuri
Brazil

Danilo Roberti Alves de Almeida
Department of Forest Sciences, University of São Paulo, Luiz de Queiroz College of Agriculture
Brazil

Carlos Alberto Silva
Biosciences Laboratory, NASA Goddard Space Flight Center
United States

Francisco Mauro
College of Forestry, Oregon State University
United States

Antonio García-Abril
College of Forestry and Natural Environment, Research Group SILVANET, Universidad Politecnica de Madrid
Spain

David Anthony Coomes
Department of Plant Sciences, Forest Ecology and Conservation, University of Cambridge
United Kingdom
