Abstract
Urban tree species provide various essential ecosystem services in cities, such as regulating urban temperatures, reducing noise, capturing carbon, and mitigating the urban heat island effect. The quality of these services is influenced by species diversity, tree health, and the distribution and the composition of trees. Traditionally, data on urban trees has been collected through field surveys and manual interpretation of remote sensing images. In this study, we evaluated the effectiveness of multispectral airborne laser scanning (ALS) data in classifying 24 common urban roadside tree species in Espoo, Finland. Tree crown structure information, intensity features, and spectral data were used for classification. Eight different machine learning algorithms were tested, with the extra trees (ET) algorithm performing the best, achieving an overall accuracy of 71.7% using multispectral LiDAR data. This result highlights that integrating structural and spectral information within a single framework can improve the classification accuracy. Future research will focus on identifying the most important features for species classification and developing algorithms with greater efficiency and accuracy.
Today, approximately 56% of the world's population—4.4 billion people—live in cities. Urban trees play a significant role in mitigating global climate chang
Airborne laser scanning (ALS) is effective for extracting biophysical variables and revising forest inventory maps. The successful use of ALS data has been demonstrated for various applications. For example, ALS has been used to estimate tree heigh
Previous studies have also revealed that combining multispectral information with 3D ALS data can improve the accuracy of tree extraction and tree species classification, as we can take advantage of both datasets. However, challenging factors limit the effective operational use of the fused dataset
Given the limitations of traditional optical remote sensing in capturing three-dimensional forest structures, it is essential to explore the potential of multispectral laser scanning for urban tree inventories, particularly for species classification. This study aims to assess the feasibility of using multispectral ALS data for urban tree species classification and to analyze the information content of features derived from point clouds and intensity data.
The MLS datasets used in this study were acquired in a suburban area in Espoolahti, southern Finland (60°9′18″N, 24°38′24″E), in the southern boreal forest zone. We selected 822 trees in this area as our field dataset. The land area is approximately 5 k
The points were updated through visual interpretation of Titan data and open datasets from the City of Espoo, the National Land Survey of Finland, Google Maps, and Google Street View. Field checks validated the analysis and resolved uncertainties. The reference points' attributes included species, geographic location, living conditions, tree height, and planting date for each tree.

Fig. 1 Map of the study area and tree samples in the research area.
Multispectral Optech Titan data (Teledyne Optech, Toronto, ON, Canada) for the study area were collected in May and June 2016 in collaboration with TerraTec Oy (Helsinki, Finland) from a 650 m flight height. The data acquisition was carried out using a fixed-wing aircraft flying at a constant altitude. The sensor comprises three Titan channels: green (532 nm), near-infrared (1 064 nm), and shortwave infrared (1 550 nm). Each channel provided separate point clouds. In our preprocessed dataset, the point densities over land areas were approximately 9 points/m² for Channel 1, 9 points/m² for Channel 2, and 8 points/m² for Channel 3.
TerraScan (TerraSolid Oy, Helsinki, Finland) was used to preprocess the ALS data and differentiate between ground and nonground points using a standardized procedure. This procedure involved removing noise, such as points detected below the ground level or above the canopy. Subsequently, the point clouds were height-normalized. Ground elevation was subtracted from the point cloud height measurements using a digital terrain model created from the classified ground points of the three channels to eliminate potential discrepancies.
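The height-normalization step can be illustrated with a minimal sketch. It is not TerraScan's procedure, which the text does not detail: the regular-grid DTM stored as a dictionary, the nearest-cell lookup, the function name, and the 60 m canopy ceiling are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres


def normalize_heights(points: List[Point],
                      dtm: Dict[Tuple[int, int], float],
                      cell_size: float = 1.0,
                      max_height: float = 60.0) -> List[Point]:
    """Subtract ground elevation from z and drop obvious noise points.

    `dtm` maps a (column, row) grid cell to ground elevation; this layout
    and the `max_height` ceiling are hypothetical choices for the sketch.
    """
    normalized = []
    for x, y, z in points:
        cell = (int(x // cell_size), int(y // cell_size))
        if cell not in dtm:
            continue  # no ground estimate available for this cell
        h = z - dtm[cell]  # height above ground
        if h < 0.0 or h > max_height:
            continue  # below ground or implausibly far above the canopy
        normalized.append((x, y, h))
    return normalized
```

A point at elevation 12.0 m over a cell with ground elevation 10.0 m is kept with a normalized height of 2.0 m, while points below the terrain model or far above any plausible canopy are discarded.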
Radiometric calibration of ALS intensity is crucial to ensure successful classification. Therefore, in this study, we implemented relative radiometric calibration. We observed that the intensity values were higher in the middle of the flight path compared to other areas and decreased with scanning height. A range correction was applied to mitigate such effects.
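$$I_{\mathrm{c}} = I_{\mathrm{o}} \left( \frac{R}{R_{\mathrm{ref}}} \right)^{2} \tag{1}$$
where $I_{\mathrm{c}}$ is the modified intensity, $I_{\mathrm{o}}$ is the original intensity, $R$ is the distance from the LiDAR to the point, and $R_{\mathrm{ref}}$ is the flying altitude (650 m).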
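A range correction of this kind is commonly written as I_c = I_o (R / R_ref)^a. The sketch below assumes the inverse-square case (a = 2) and uses the 650 m flying altitude as the reference range; the exponent and the function name are assumptions, not details confirmed by the text.

```python
def range_correct(intensity: float, rng: float,
                  ref_range: float = 650.0, exponent: float = 2.0) -> float:
    """Scale a raw intensity to the reference range.

    Implements I_c = I_o * (R / R_ref) ** exponent; exponent = 2 is the
    assumed inverse-square case, and ref_range is the flying altitude.
    """
    if rng <= 0 or ref_range <= 0:
        raise ValueError("ranges must be positive")
    return intensity * (rng / ref_range) ** exponent
```

Under these assumptions, a return measured at the reference range is left unchanged, and a return measured at twice the reference range is scaled up by a factor of four.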
Individual trees were detected using a minimum curvature-based algorithm, which started with creating a canopy height model (CHM). Based on the field-measured coordinates of each tree, we set the potential crown area within 5
In this experiment, the features were divided into two main types: intensity features and geometric features. The maximum height (Hmax) of each tree was taken as the height of the highest point in each tree segment.
We also derived 137 features for each channel from the multispectral ALS data.
Feature | Definition |
---|---|
Single-channel Intensity (SCI) features | |
Imax | Maximum intensity |
Imin | Minimum intensity |
Imean | Mean intensity |
Istd | The standard deviation of intensity |
Icov | Coefficient of variation (i.e., relative standard deviation) of intensity |
Isk | Skewness of intensity |
Irange | Range of intensity |
Ikut | Kurtosis of intensity |
I5 to I95 | Percentiles of intensity values of points above the ground threshold from 5% to 95% in 5% increments |
Multi-channel Intensity (MCI) features | |
Ratios of intensity features in each channel | |
Green normalized differential vegetation index (gNDVI) | |
Green simple ratio vegetation index (gSR) | |
Geometric features | |
Hmax | Maximum of the heights of all points |
Hmean | Arithmetic mean of the height of all points above 1 m threshold |
Hstd | Standard deviation of height of all points above 1 m threshold |
Hrange | Range of normalized height of all points above 1 m threshold |
P | Penetration as a ratio between the number of returns below 1 m and total returns |
CA | Crown area as the area of the convex hull in 2D |
CV | Crown volume as the convex hull in 3D |
CD | Crown diameter calculated from crown area considering crown as a circle |
HP10 to HP90 | Height percentiles of the points above 1 m, from 10% to 90% in 10% increments |
D1 to D10 | Di = Ni/Ntotal, where i = 1 to 10, Ni is the number of points within the ith layer when the tree height is divided into 10 intervals starting from 1 m, and Ntotal is the total number of points |
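Several of the features in the table can be sketched in plain Python. This is only an illustration under stated assumptions: population (not sample) formulas for the standard deviation, skewness, and kurtosis; linear interpolation for percentiles; the standard forms gNDVI = (NIR - green) / (NIR + green) and gSR = NIR / green; and a segment that contains at least one point above the 1 m threshold. All function names are hypothetical.

```python
import math
from statistics import fmean, pstdev


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def intensity_features(intensities):
    """Single-channel intensity (SCI) statistics for one tree segment."""
    mean, std = fmean(intensities), pstdev(intensities)

    def moment(k):  # k-th central moment (population form, assumed)
        return fmean((v - mean) ** k for v in intensities)

    feats = {
        "Imax": max(intensities), "Imin": min(intensities),
        "Imean": mean, "Istd": std,
        "Icov": std / mean if mean else float("nan"),
        "Irange": max(intensities) - min(intensities),
        "Isk": moment(3) / std ** 3 if std else 0.0,
        "Ikut": moment(4) / std ** 4 if std else 0.0,
    }
    for q in range(5, 100, 5):  # I5 ... I95
        feats[f"I{q}"] = percentile(intensities, q)
    return feats


def height_features(heights, threshold=1.0, n_layers=10):
    """Geometric features from the normalized point heights of one tree."""
    above = [h for h in heights if h > threshold]
    h_max = max(heights)
    feats = {
        "Hmax": h_max,
        "Hmean": fmean(above), "Hstd": pstdev(above),
        "Hrange": max(above) - min(above),
        # Penetration: share of returns at or below the threshold.
        "P": (len(heights) - len(above)) / len(heights),
    }
    for q in range(10, 100, 10):  # HP10 ... HP90
        feats[f"HP{q}"] = percentile(above, q)
    # D1 ... D10: share of all points in each equal layer from threshold to Hmax.
    layer = (h_max - threshold) / n_layers
    counts = [0] * n_layers
    for h in above:
        counts[min(int((h - threshold) / layer), n_layers - 1)] += 1
    for i, c in enumerate(counts, start=1):
        feats[f"D{i}"] = c / len(heights)
    return feats


def crown_diameter(crown_area):
    """CD: diameter of a circle with the same area as the crown."""
    return 2.0 * math.sqrt(crown_area / math.pi)


def gndvi(nir, green):
    """Green NDVI from NIR and green intensities (assumed standard form)."""
    return (nir - green) / (nir + green)


def gsr(nir, green):
    """Green simple ratio, NIR / green (assumed standard form)."""
    return nir / green
```

The crown area and volume themselves would come from 2D and 3D convex hulls of the segment's points, which are omitted here to keep the sketch dependency-free.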
In this study, we compared eight machine learning algorithms for tree species classification: extra trees (ET), random forest (RF), k-nearest neighbours (KNN), logistic regression (LR), linear discriminant analysis (LDA), classification and regression tree (CART), naive Bayes (NB), and support vector machine (SVM). For correctly detected trees, species were predicted by models built with each of the eight algorithms, using tree features as predictors and tree species as the response.
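A comparison of this kind could be set up with scikit-learn as sketched below. The data are a synthetic stand-in generated with `make_classification`, not the study's feature table, and the hyperparameters, feature scaling, and 5-fold stratified cross-validation are assumptions; the text does not state how the models were configured or validated.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the per-tree feature table (NOT the study data).
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=4, random_state=0)

# Distance- and margin-based models get feature scaling; tree models do not need it.
models = {
    "ET": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "LDA": LinearDiscriminantAnalysis(),
    "CART": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
          for name, model in models.items()}

# Print the algorithms from highest to lowest mean cross-validated accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Stratified k-fold splitting requires at least k samples per class, so classes represented by only one or two trees would need separate handling (for example merging or exclusion) before a scheme like this could be applied.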
As presented in

Fig. 2 Titan intensity image of the study area in Espoolahti (Red: Channel 1; Green: Channel 2; Blue: Channel 3).

Fig. 3 Comparison of classification accuracy for the 24 tree species across the eight algorithms (ET, RF, KNN, LR, LDA, CART, NB, SVM).
The confusion matrix analysis reveals a model that performs well for most classes but struggles with a few, particularly Quercus and Sorbus according to
Tree species | Index | Number of trees |
---|---|---|
Pinta-ala | 1 | 2 |
Abies | 2 | 13 |
Acer | 3 | 249 |
Alnus | 4 | 5 |
Betula | 5 | 26 |
Fallopia | 6 | 1 |
Fraxinus | 7 | 2 |
Juglans | 8 | 5 |
Larix | 9 | 11 |
Malus | 10 | 8 |
Picea | 11 | 15 |
Pinus | 12 | 84 |
Populus | 13 | 16 |
Prunus | 14 | 10 |
Quercus | 15 | 23 |
Ribes | 16 | 5 |
Salix | 17 | 4 |
Sambucus | 18 | 1 |
Sorbus | 19 | 84 |
Syringa | 20 | 1 |
Taxus | 21 | 4 |
Thuja | 22 | 2 |
Tilia | 23 | 88 |
Ulmus | 24 | 163 |

Fig. 4 The confusion matrix of classification with geometric and intensity features for each species.
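Overall accuracy and per-class accuracy can both be read off a confusion matrix, as the sketch below shows. The 3-class matrix is a toy example with made-up counts, not a result from this study, and rows are assumed to hold the reference labels.

```python
def overall_accuracy(cm):
    """Share of correctly classified samples: diagonal sum / total."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_recall(cm):
    """Producer's accuracy per class; rows are assumed to be reference labels."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(cm)]


# Toy 3-class matrix (illustrative numbers only).
cm = [[50, 3, 2],
      [4, 10, 6],
      [1, 2, 2]]

oa = overall_accuracy(cm)
recall = per_class_recall(cm)
```

In this toy case the overall accuracy is 0.775 even though the two smaller classes reach recalls of only 0.5 and 0.4, which illustrates how a dominant class can mask weak performance on rare ones.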
We also investigated which input features and channels are most relevant for tree species classification based on the measure provided by the RF algorithm for assessing feature importance. If a feature influences the prediction, permuting its values should affect the model error. If a feature is not influential, then permuting its values should have little or no effect on the model error.
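The permutation idea can be sketched without any particular library. The nearest-centroid classifier and the two-feature toy data below are stand-ins chosen for brevity, not the RF model or the features used in the study; only the shuffle-and-rescore logic is the point of the example.

```python
import random


def nearest_centroid_fit(X, y):
    """Tiny stand-in classifier: one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict(centroids, row):
    """Return the class whose centroid is closest (squared Euclidean distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))


def accuracy(centroids, X, y):
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: feature 0 separates the two classes, feature 1 is pure noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(200)]
X = [[label * 5.0 + data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for label in y]
model = nearest_centroid_fit(X, y)
imp = permutation_importance(model, X, y)
```

Shuffling the informative feature should cause a large accuracy drop, whereas shuffling the noise feature should leave the accuracy essentially unchanged.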
Cases | Top 3 features |
---|---|
All features |
|
Using all three channels together improved the overall accuracy by roughly 5% to 10% in relative terms (3.4 to 6.9 percentage points) compared with any single channel. This supports our hypothesis about the ability of mALS features in classification. For example, an overall accuracy of 71.7% was obtained with the all-channel multispectral LiDAR data, whereas accuracies of 65.7%, 68.3%, and 64.8% were achieved when using only Channel 1, Channel 2, and Channel 3, respectively. Our findings demonstrate the advantage of combining multichannel features over single-channel data in classifying urban trees. However, the sample size per species was highly uneven (from 1 to 249 trees), which likely reduced the classification accuracy to some extent. Addressing this limitation with a larger and more representative sample will be a key focus of future research.
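The size of the multichannel gain can be checked directly from the four accuracies reported above. The short calculation below distinguishes the absolute gain in percentage points from the relative gain in percent; the variable names are illustrative.

```python
# Overall accuracies reported in the text (percent).
all_channels = 71.7
single = {"Channel 1": 65.7, "Channel 2": 68.3, "Channel 3": 64.8}

# Absolute gain in percentage points.
gains = {name: round(all_channels - acc, 1) for name, acc in single.items()}

# Relative gain as a percentage of the single-channel accuracy.
relative = {name: round(100 * (all_channels - acc) / acc, 1)
            for name, acc in single.items()}
```

The absolute gains are 6.0, 3.4, and 6.9 percentage points for Channels 1 to 3, corresponding to relative gains of 9.1%, 5.0%, and 10.6%.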
In this study, eight machine learning algorithms were evaluated for their classification performance, each demonstrating distinct strengths and limitations. The selection of an appropriate classification algorithm depends on the specific characteristics of the dataset, including its size, dimensionality, and the underlying relationship between features and class labels. Extra trees (ET) and random forest (RF) proved effective in our study due to their ability to handle large, high-dimensional datasets and their robustness against overfitting, which suited the conditions of our dataset. Naive Bayes (NB) was efficient and scalable, especially for high-dimensional data, but its assumption of feature independence limited its applicability where features are highly correlated.
It is also important to note that overall accuracy (OA) is influenced by factors such as species composition, stand structure, age, and the methods used to select the best features, all of which vary among studies. In this research, the intensity of laser returns was only relatively calibrated (range-corrected); absolute radiometric calibration was not performed. Future studies could investigate whether absolutely calibrated intensity affects the classification results. In the present study, the use of MCI features helped mitigate potential variations in intensity.
In conclusion, this study assessed the ability of mALS data, compared with single-channel ALS (SCI-Ch) data, to characterize tree species in urban areas. Our classification results indicate that mALS data provided more accurate results than single-channel ALS data for urban tree species classification.
References
Schneider A, Friedl M A, Potere D. Mapping global urban areas using MODIS 500-m data: New methods and datasets based on ‘urban ecoregions’ [J]. Remote Sensing of Environment, 2010, 114(8): 1733-1746. DOI: 10.1016/j.rse.2010.03.003
Lee J H, Bang K W. Characterization of urban stormwater runoff [J]. Water Research, 2000, 34(6): 1773-1780. DOI: 10.1016/s0043-1354(99)00325-5
Escobedo F J, Nowak D J. Spatial heterogeneity and air pollution removal by an urban forest [J]. Landscape and Urban Planning, 2009, 90(3-4): 102-110. DOI: 10.1016/j.landurbplan.2008.10.021
Nowak D, Crane D, Stevens J, et al. A ground-based method of assessing urban forest structure and ecosystem services [J]. Arboriculture & Urban Forestry, 2008, 34(6): 347-358. DOI: 10.48044/jauf.2008.048
Lovell J L, Jupp D L B, Culvenor D S, et al. Using airborne and ground-based ranging lidar to measure canopy structure in Australian forests [J]. Canadian Journal of Remote Sensing, 2003, 29(5): 607-622. DOI: 10.5589/m03-026
Næsset E, Økland T. Estimating tree height and tree crown properties using airborne scanning laser in a boreal nature reserve [J]. Remote Sensing of Environment, 2002, 79(1): 105-115. DOI: 10.1016/s0034-4257(01)00243-7
Clark M L, Clark D B, Roberts D A. Small-footprint lidar estimation of sub-canopy elevation and tree height in a tropical rain forest landscape [J]. Remote Sensing of Environment, 2004, 91(1): 68-89. DOI: 10.1016/j.rse.2004.02.008
Holmgren J, Persson Å. Identifying species of individual trees using airborne laser scanner [J]. Remote Sensing of Environment, 2004, 90(4): 415-423. DOI: 10.1016/s0034-4257(03)00140-8
Brandtberg T. Classifying individual tree species under leaf-off and leaf-on conditions using airborne lidar [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2007, 61(5): 325-340. DOI: 10.1016/j.isprsjprs.2006.10.006
Lindberg E, Eysn L, Hollaus M, et al. Delineation of tree crowns and tree species classification from full-waveform airborne laser scanning data using 3-D ellipsoidal clustering [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2014, 7(7): 3174-3181. DOI: 10.1109/jstars.2014.2331276
Hyyppa J, Kelle O, Lehikoinen M, et al. A segmentation-based method to retrieve stem volume estimates from 3-D tree height models produced by laser scanners [J]. IEEE Transactions on Geoscience and Remote Sensing, 2001, 39(5): 969-975. DOI: 10.1109/36.921414
Ahmed R, Siqueira P, Hensley S. A study of forest biomass estimates from lidar in the northern temperate forests of New England [J]. Remote Sensing of Environment, 2013, 130: 121-135. DOI: 10.1016/j.rse.2012.11.015
Hollaus M, Wagner W, Maier B, et al. Airborne laser scanning of forest stem volume in a mountainous environment [J]. Sensors, 2007, 7(8): 1559-1577. DOI: 10.3390/s7081559
Yu X, Hyyppä J, Kaartinen H, et al. Obtaining plotwise mean height and volume growth in boreal forests using multi‐temporal laser surveys and various change detection techniques [J]. International Journal of Remote Sensing, 2008, 29(5): 1367-1386. DOI: 10.1080/01431160701736356
Yu X, Hyyppä J, Kukko A, et al. Change detection techniques for canopy height growth measurements using airborne laser scanner data [J]. Photogrammetric Engineering & Remote Sensing, 2006, 72(12): 1339-1348. DOI: 10.14358/pers.72.12.1339