YU Cuilin, ZHONG Zixuan, PANG Hongyi, DING Yusheng, LAI Tao, HUANG Haifeng, WANG Qingsong. Vegetation Height Prediction Dataset Oriented to Mountainous Forest Areas[J]. Journal of Electronics & Information Technology. doi: 10.11999/JEIT250941

Vegetation Height Prediction Dataset Oriented to Mountainous Forest Areas

doi: 10.11999/JEIT250941 cstr: 32379.14.JEIT250941
Funds:  The National Natural Science Foundation of China (62273365), Xiaomi Young Talents Program
  • Received Date: 2025-09-22
  • Accepted Date: 2025-11-03
  • Rev Recd Date: 2025-10-11
  • Available Online: 2025-11-12
Abstract

Objective: Vegetation height is a key ecological parameter that reflects forest vertical structure, biomass, ecosystem functions, and biodiversity. Existing open-source vegetation height datasets are often sparse, unstable, and poorly suited to mountainous forest regions, which limits their utility for large-scale modeling. This study constructs the Vegetation Height Prediction Dataset (VHP-Dataset) to provide a standardized large-scale training resource that integrates multi-source remote sensing features and supports supervised learning tasks for vegetation height estimation.

Methods: The VHP-Dataset is constructed by integrating Landsat 8 multispectral imagery, the digital elevation model AW3D30 (ALOS World 3D, 30 m), land cover data CGLS-LC100 (Copernicus Global Land Service, Land Cover 100 m), and tree canopy cover data GFCC30TC (Global Forest Canopy Cover 30 m Tree Canopy). Canopy height from GEDI L2A (Global Ecosystem Dynamics Investigation, Level 2A) footprints is used as the target variable. A total of 18 input features are extracted, covering spatial location, spectral reflectance, topographic structure, vegetation indices, and vegetation cover information (Table 4, Fig. 4). For model validation, five representative approaches are applied: Extremely Randomized Trees (ExtraTree), Random Forest (RF), Artificial Neural Network (ANN), Broad Learning System (BLS), and Transformer. Model performance is assessed using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Standard Deviation (SD), and Coefficient of Determination (R²).

Results and Discussions: The experimental results show that the VHP-Dataset supports stable vegetation height prediction across regions and terrain conditions, which reflects its scientific validity and practical applicability. Model comparisons indicate that ExtraTree achieves the best performance in most regions, and Transformer performs well in specific areas, which confirms that the dataset is compatible with different approaches (Table 6). Stratified analyses show that prediction errors increase under high canopy cover and steep slope conditions, and predictions remain more stable at higher elevations (Figs. 6–9). These findings indicate that the dataset captures the effects of complex topography and canopy structure on model accuracy. Feature importance analysis shows that spatial location, topographic factors, and canopy cover indices are the primary drivers of prediction accuracy, while spectral and land cover information provide complementary contributions (Fig. 10).

Conclusions: The results show that the VHP-Dataset supports vegetation height prediction across regions and terrain types, which reflects its scientific validity and applicability. The dataset enables robust predictions with traditional machine learning methods such as tree-based models, and it also provides a foundation for deep learning approaches such as Transformers, which reflects broad methodological compatibility. Stratified analyses based on vegetation cover and terrain show the effects of complex canopy structures and topographic factors on prediction accuracy, and feature importance analysis identifies spatial location, topographic attributes, and canopy cover indices as the primary drivers. Overall, the VHP-Dataset fills the gap in large-scale high-quality datasets for vegetation height prediction in mountainous forests and provides a standardized benchmark for cross-regional model evaluation and comparison. This offers value for research on vegetation height prediction and forest ecosystem monitoring.
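The four evaluation metrics named in the abstract, and the idea of stratifying errors by slope or canopy cover, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `regression_metrics` and `stratified_rmse`, the toy values, and the 15-degree slope threshold are hypothetical, and SD is assumed here to mean the standard deviation of the prediction residuals, which the abstract does not specify.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, SD of residuals, and R^2 for height predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_pred - y_true
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    mae = float(np.mean(np.abs(resid)))
    # Assumption: SD is the (population) standard deviation of the residuals.
    sd = float(np.std(resid))
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"RMSE": rmse, "MAE": mae, "SD": sd, "R2": r2}


def stratified_rmse(y_true, y_pred, strata_values, bin_edges):
    """Compute RMSE separately for each bin of an auxiliary variable
    (for example slope in degrees or canopy cover in percent)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # np.digitize assigns bin index 0 to values below the first edge.
    bins = np.digitize(np.asarray(strata_values, dtype=float), bin_edges)
    result = {}
    for b in np.unique(bins):
        mask = bins == b
        result[int(b)] = float(np.sqrt(np.mean((y_pred[mask] - y_true[mask]) ** 2)))
    return result


# Toy example with made-up canopy heights (m) and slopes (degrees).
heights_true = [10.0, 20.0, 30.0, 40.0]
heights_pred = [12.0, 18.0, 33.0, 39.0]
slopes = [5.0, 25.0, 8.0, 40.0]

metrics = regression_metrics(heights_true, heights_pred)
by_slope = stratified_rmse(heights_true, heights_pred, slopes, bin_edges=[15.0])
print(metrics)
print(by_slope)
```

Reporting the per-bin RMSE next to the overall metrics is one simple way to see whether errors grow with slope or canopy cover, as the stratified analyses describe.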

    Figures(10)  / Tables(6)
