Silvilaser 2019 - Poster Presentations
Regression and machine learning models for estimating stand volume of eucalypt clonal and tropical highly diverse forests from LiDAR metrics
Point clouds from aerial Light Detection and Ranging (LiDAR) constitute key data for broad-scale forest inventories and structural information mapping. However, the choice of the best model for estimating such forest attributes still depends on sample design and target. Here we present two case studies that tested regression (parametric) and machine learning (non-parametric) models for estimating stand volume with an area-based approach, using the same LiDAR point cloud: one in Eucalyptus spp. clonal forests (case study 1) and one in Atlantic secondary forests (case study 2) in southeastern Brazil. The point cloud density is 5 points m⁻². After normalization of the point cloud, several LiDAR metrics were computed for 33 plots (about 300 m² each) in case study 1 and for 29 plots (500 m² each) in case study 2. The plot data were randomly split into training (70%) and validation (30%) samples. In the first approach, the metric most strongly correlated with stand volume, according to Spearman’s rank correlation coefficient (ρ), was selected as the predictor for fitting candidate regression equations. In the second approach, we ran Recursive Feature Elimination (RFE) on the set of LiDAR metrics (plus forest stand age in case study 1) to select the variables that improved the accuracy of the Random Forest (RF) model. The correlation coefficient (r) and the root mean square error (RMSE) were calculated to assess model performance on unseen data (the validation samples). Elevation percentiles were the metrics most strongly correlated (ρ) with volume in both forests. RFE selected four (case study 1) and eleven (case study 2) variables for modeling volume with RF. Exponential equations (EE) gave the best fit to the data in both case studies. On the validation samples, the fitted EE and the trained RF yielded r of 0.86 and 0.97, and RMSE of 22.51% and 13.11%, respectively, for the eucalypt clonal forests.
For the Atlantic secondary forests, EE and RF yielded r of 0.89 and 0.95, and RMSE of 17.38% and 13.86%, respectively. Even though the forests in case study 2 are more complex than those in case study 1, the accuracies were similar between the two. Although the machine learning algorithm modeled the volume of both forests more accurately than the exponential equations, the simplicity of parametric regression models highlights their potential for operational broad-scale forest inventory and volume mapping from a small amount of field data.
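The parametric branch of the workflow described above (random 70/30 split, metric selection by Spearman's ρ, exponential fit, validation with r and RMSE) can be illustrated with a minimal sketch. Everything specific in it is an assumption rather than the authors' implementation: the data are synthetic, the metric names (`p90`, `p50`, `cover`) are hypothetical, the exponential equation is fitted by ordinary least squares on log-transformed volume, and RMSE is expressed as a percentage of the mean observed volume. The abstract does not state how the equation was fitted or how relative RMSE was defined.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho, computed as Pearson's r on the ranks."""
    return pearson(ranks(x), ranks(y))


def fit_exponential(x, y):
    """Fit V = a * exp(b * x) by least squares on ln(V) (assumed fitting method)."""
    log_y = [math.log(v) for v in y]
    n = len(x)
    mx, my = sum(x) / n, sum(log_y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, log_y))
         / sum((xi - mx) ** 2 for xi in x))
    a = math.exp(my - b * mx)
    return a, b


def rmse_percent(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return 100.0 * math.sqrt(mse) / (sum(observed) / len(observed))


# Synthetic stand-in for plot-level data; not the study's measurements.
rng = random.Random(42)
n_plots = 33
p90 = [rng.uniform(10.0, 30.0) for _ in range(n_plots)]
metrics = {
    "p90": p90,                                            # hypothetical metric
    "p50": [0.7 * h + rng.gauss(0.0, 1.5) for h in p90],   # hypothetical metric
    "cover": [rng.uniform(0.6, 1.0) for _ in range(n_plots)],
}
volume = [20.0 * math.exp(0.1 * h + rng.gauss(0.0, 0.1)) for h in p90]

# Random 70/30 split into training and validation plots.
idx = list(range(n_plots))
rng.shuffle(idx)
n_train = round(0.7 * n_plots)
train, valid = idx[:n_train], idx[n_train:]

# Choose the metric with the strongest rank correlation on the training plots.
v_train = [volume[i] for i in train]
best = max(
    metrics,
    key=lambda m: abs(spearman([metrics[m][i] for i in train], v_train)),
)

# Fit the exponential equation on training plots, then score on validation plots.
a, b = fit_exponential([metrics[best][i] for i in train], v_train)
v_valid = [volume[i] for i in valid]
predicted = [a * math.exp(b * metrics[best][i]) for i in valid]
r = pearson(v_valid, predicted)
rmse_pct = rmse_percent(v_valid, predicted)

print(f"metric={best}  a={a:.3f}  b={b:.4f}  r={r:.3f}  RMSE={rmse_pct:.2f}%")
```

The log-linear fit is used here only because it needs no external libraries; a nonlinear least-squares fit on the original volume scale would be an equally plausible reading of "exponential equation" and can give different coefficients.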