The construction of the model for damage level assessment of critical infrastructure objects
Category | Manufacturing and technology |
Type | article |
Language | English |
Date added | 12.12.2024 |
File size | 7.2 M |
Sakovych Bohdan Pavlovych, postgraduate student of the Software Tools and Technologies Department, Kherson National Technical University, Khmelnytskyi
Abstract
This paper introduces a novel model for assessing the extent of damage to critical infrastructure, which can aid authorities in object inspection and in creating a restoration plan. The model is based on a regional convolutional neural network architecture that can identify and classify different types of damage, including those caused by natural disasters, accidents, or deliberate warfare attacks, such as shelling and bombardment. The transfer-learning technique is utilised to improve the model's accuracy and speed. The developed model classifies the objects in the image as intact (0), of low damage (1), of moderate damage (2), of high damage (3), or demolished (4). It can help authorised persons assess the damage and its level and create a plan for the restoration of a particular object. In addition, the key areas of research that need to be addressed to improve the accuracy and effectiveness of the R-CNN-based model for damage level assessment have been identified. These areas include the development of better data sources and the integration of other types of frameworks to provide the most accurate assessment of infrastructure objects' impairment. The current model version requires additional training and fine-tuning to achieve its peak classification accuracy. While the initial results are promising, there are several avenues for improvement. Incorporating focal cross-entropy loss or similar techniques (such as contrastive learning) in future model versions is expected to enhance accuracy, reduce loss, and address bias. Additionally, exploring alternative architectures, such as SSD or YOLO, could lead to further improvements. Switching to other platforms like PyTorch and Detectron2 may also enhance performance compared to the currently utilised TensorFlow. Furthermore, an alternative approach could expand the predictive capabilities of the model beyond damage level assessment. For instance, it could pinpoint precise damage locations. Creating highly specialised models for specific object types might also boost performance while minimising loss.
Keywords: Mask R-CNN, CNN, image classification, damage level assessment, critical infrastructure object.
Sakovych Bohdan Pavlovych, postgraduate student of the Department of Software Tools and Technologies, Kherson National Technical University, Khmelnytskyi
THE CONSTRUCTION OF A MODEL FOR DAMAGE LEVEL ASSESSMENT OF CRITICAL INFRASTRUCTURE OBJECTS
Abstract (Ukrainian version, translated). The article presents a new, up-to-date model for assessing the level of damage to critical infrastructure objects, which can help in inspecting objects and creating a restoration plan for the authorities. The model is based on a regional convolutional neural network architecture that can identify and classify different types of damage, including those caused by natural disasters, accidents, or deliberate military attacks such as shelling and bombardment. Transfer learning is used to improve the accuracy and speed of the model. The developed model classifies the objects in an image as intact (0), slightly damaged (1), moderately damaged (2), highly damaged (3), and destroyed (4). It can help authorised persons assess the damage and create a restoration plan for a particular object. Key research directions that need to be addressed to improve the accuracy and effectiveness of the R-CNN-based model for damage level assessment are identified. These directions include developing better data sources and integrating other types of frameworks to provide the most accurate assessment of damage to critical infrastructure objects. The current version of the model requires additional training and refinement to reach maximum classification accuracy. Although the initial results are reasonably good, future versions of the model are expected to incorporate focal cross-entropy loss or similar methods, such as contrastive learning, to increase accuracy, reduce loss, and remove bias. In addition, exploring alternative architectures such as SSD or YOLO may lead to further improvements. Switching to platforms such as PyTorch and Detectron2 may also improve performance compared with TensorFlow. Finally, an alternative approach could extend the predictive capabilities of the model beyond damage level assessment. For example, creating separate highly specialised models for specific object types could improve performance while minimising loss.
Keywords: Mask R-CNN, convolutional neural network, image classification, damage level assessment, critical infrastructure objects.
Problem statement and relevance to scientific objectives
This work proposes a model for assessing the extent of damage to critical infrastructure objects and facilities. In light of ongoing warfare, it is crucial to promptly assess the damage to an infrastructure object, determine the level of damage, assess the risks and develop a recovery plan. Critical infrastructure objects, such as bridges, railroads, power plants, dams, airports and seaports, are the fundamental components that provide a foundation for the country's transportation, energy, and communication sectors, and their proper functioning is vital to maintaining the stability of the nation's economy and welfare. Unfortunately, these structures are highly susceptible to damage and destruction due to both human activity (cyberattacks, shelling, malfeasance, malfunction) and natural disasters (floods, fires, lightning strikes). If such damage cannot be prevented, it should be properly assessed and classified.
Given their importance, it is crucial to promptly and accurately assess the level of damage to provide effective maintenance and disaster response planning [1, 11]. Traditional infrastructure inspection methods, often manual and visual, face limitations in speed, accuracy, and scalability. Alongside other damage assessment methods and techniques [5, 11], an alternative tool for estimating damage levels, especially in cases of severe damage or inaccessible areas, is image processing with convolutional neural networks (CNNs). Their deep learning algorithms [6] can extract image features, making them well suited to image classification and object segmentation tasks.
The utilisation of CNNs to examine damage data derived from images represents a groundbreaking advancement within the realm of disaster response and recovery. Their ability to extract intricate features from images permits an unprecedented level of accuracy in tasks such as image classification and segmentation [4, 9]. CNNs play a pivotal role in identifying damaged objects, which is an imperative step in assessing the scope of damage subsequent to a disaster. For instance, they can assist in identifying damaged objects within satellite imagery, a critical component of the damage assessment process. This process is focal to disaster response and recovery efforts, as it enables the prioritisation of resources and the formulation of repair plans. As a rule, CNNs are trained using labelled image datasets and are capable of accurately and efficiently identifying patterns indicative of damage [8].
Fig. 1 The diagram illustrates the workflow of a convolutional neural network (CNN). It starts with an input image, applies convolutional and pooling layers to extract and down-sample features, and ends with a fully connected layer that makes the final prediction. The arrows indicate the direction of data flow.
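The workflow in Fig. 1 can be illustrated with a minimal, framework-free sketch. It is only an illustration under stated assumptions: the toy image, the edge-detecting kernel, and the dense-layer weights are hypothetical values chosen for readability, and the function names are not part of the described model.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation of a single-channel image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    return [[max(fmap[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


def dense(features, weights, bias):
    """Fully connected layer: one output score per weight row."""
    return [sum(f * w for f, w in zip(features, row)) + b
            for row, b in zip(weights, bias)]


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]            # responds to a left-to-right increase
fmap = relu(conv2d(image, kernel))     # convolution + activation, 4x4
pooled = max_pool(fmap)                # down-sampling, 2x2
flat = [v for row in pooled for v in row]
scores = dense(flat, [[1, 1, 1, 1], [-1, -1, -1, -1]], [0.0, 0.0])
```

In a trained network the kernel and dense weights are learned from labelled images rather than set by hand.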
Convolutional neural networks provide numerous benefits in contrast to traditional techniques. CNNs possess the capability to rapidly and precisely process substantial volumes of data, rendering them optimal for the examination of the extensive datasets created during the course of disaster response. Furthermore, they can effectively extract intricate features from images, thereby enabling precise damage analysis, even in scenarios involving severe damage or areas that are not easily accessible. Being trained using historical image data [9], they acquire knowledge from previous disasters and enhance their performance over time. Additionally, the networks can be seamlessly integrated with other technologies, such as unmanned aerial vehicles (UAVs) and drones, to collect images and data from areas that are otherwise inaccessible [19].
Nonetheless, the use of CNNs for damage assessment poses certain challenges, particularly concerning the quality and availability of data. CNNs, like some other machine learning models, necessitate extensive amounts of labelled data in order to train and optimise their performance. However, it is crucial to note that following a disaster, data may be restricted or of subpar quality, which can ultimately impact the accuracy of the model.
Another challenge involves the requirement for expertise in both deep learning and damage assessment. The construction of a CNN model necessitates proficiency in the domains of deep learning, computer vision, and image processing. Locating individuals proficient in both areas can be arduous, and assembling a team with the requisite expertise can be a time-consuming process. This difficulty can be mitigated with the transfer learning technique [13], which allows the new model to start from a network that has already been trained on a vast number of images and then be fine-tuned for the specific task.
In this research, the Mask R-CNN (MR-CNN) variation [14] of CNN is utilised. MR-CNN not only identifies objects within images but also generates precise segmentation masks for each instance [14]. This advancement is particularly beneficial when assessing the level of damage in infrastructure images. Let's review some other recent works related to this topic.
The purpose of the research
A unique model is proposed, built using a regional convolutional neural network with a mask layer and trained on images of numerous critical infrastructure objects of Ukraine. The proposed model can be utilised for assessing the damage level of critical infrastructure objects and for supporting risk-informed decisions in the post-crisis management cycle: it assists with object assessment and recovery from post-event damage and helps decision-makers make informed decisions in each scenario.
Analysis of related research works
There are numerous research works related to the assessment of damage to critical infrastructure objects (CIO). They usually consider damage to certain types of CIOs and from certain types of threats. Lin et al. in [2] discussed the importance of rapid assessment of building damage in earthquake-stricken areas for emergency response. The development of remote sensing technology has helped to provide reliable and accurate assessments of building damage over large areas following disasters. The authors propose a data transfer algorithm to evaluate the impact of a single historical training sample on model performance. Favourable samples are then selected to transfer knowledge from the historical data to facilitate the calibration of the new model. The results show that the data transfer algorithm proposed in this work significantly improves the reliability of the building damage assessment model by filtering samples from the historical data that are suitable for the new task. The performance of the model built based on the data transfer method on the test set of the new earthquake task is approximately 8% higher in overall accuracy compared to the model trained directly with the new earthquake samples when the training data for the new task is only 10% of the historical data.
Some works propose utilising CNNs to assess damage. Cha et al. in [3] examine the use of deep learning techniques to detect cracks in structures. The scientists propose a deep learning framework based on a convolutional neural network and a naive Bayes data fusion scheme, called NB-CNN, to analyse individual video frames for crack detection, while a novel data fusion scheme is proposed to aggregate the information extracted from each video frame to improve the overall performance. The paper also discusses the limitations of traditional damage detection methods, such as installing numerous sensors and integrating data from distributed sources. The authors suggest that vision-based methods using image processing techniques (IPTs) have been proposed to address these complexities. However, edge detection is an ill-posed problem, as the results are significantly affected by noise, mainly from lighting and distortion, and there are no optimal solutions. An effective way to overcome these problems is to implement denoising techniques. The authors conclude that machine learning algorithms (MLAs) are more adaptable to real-world situations, and several research groups have proposed techniques that can detect structural defects using this method.
Gulgec et al. [7] proposed the utilisation of CNN for structural damage detection. The scientists present CNNs as a solution to the problem of accurately detecting defects that affect the performance of structures, which has become computationally challenging due to long-term data collection from dense sensor arrays. In their study, the researchers used a Python library named Theano with a graphics processing unit (GPU) to achieve higher performance for data-intensive computations. They evaluated the accuracy and sensitivity of their proposed technique using a cracked steel gusset joint model with multiplicative noise. During training, strain distributions generated from different crack and loading scenarios were adopted, and during testing, previously unseen damage setups were introduced into the simulations. Based on their results, the authors concluded that their proposed technique achieves high accuracy, robustness and computational efficiency for damage diagnosis. Overall, this paper presents an interesting approach to structural damage detection by means of CNNs. The authors provide a detailed description of their methodology and demonstrate its effectiveness through simulations.
The scientists Chen Xiong, Jie Zheng, Liangjin Xu, Chengyu Cen, Ruihao Zheng and Yi Li presented a multiple-input convolutional neural network (MI-CNN) [20] model for the seismic damage assessment of regional buildings. Their study focused on predicting seismic damage for multi-story RC frame buildings using a nonlinear multi-degree-of-freedom (MDOF) shear model. The MDOF shear model captures the nonlinear performance of RC frame buildings using a tri-linear backbone curve model. The authors proposed the MI-CNN (Multi-Input) model which is identified as a useful tool for fast seismic damage assessment of regional buildings, aiding post-disaster emergency response and rapid disaster relief efforts. The model includes a CNN-based ground motion feature extraction, processing of building attribute data and PGA, and parameter integration for damage prediction. The model's computation efficiency is significantly better than the nonlinear time history analysis of the MDOF shear model, with a speedup ratio of 340 on a laptop platform. This demonstrates improved computational efficiency in seismic damage assessment.
Scientists F. Zhao and C. Zhang have performed research on "Building Damage Evaluation from Satellite Imagery using Deep Learning" [23] where they propose a two-stage deep learning model for evaluating the damage level of buildings after natural disasters using satellite images. The model consists of a Mask R-CNN-based building feature extractor and a Siamese-based semantic segmentation model. The model significantly improves over the baseline methods on the xBD satellite imagery dataset. They utilised the experimental results on the xBD satellite imagery dataset, containing pre- and post-disaster images from 19 disaster events. The paper uses the F1-score and the mean intersection over union (mIoU) as the evaluation metrics. The paper shows that the proposed model outperforms the baseline model by 16 times and the Mask R-CNN framework by 80% in terms of F1-score. The paper also shows that the proposed model achieves higher mIoU than the existing methods. The paper provides some qualitative examples of the model output and discusses the limitations and future work of the research.
The research entitled “Deep Learning-Based Crack Detection Using Mask R-CNN Technique” [24] by Chengjun Tan, Nasim Uddin, and Yahya M. Mohammed explores the application of a sophisticated deep learning algorithm, Mask R-CNN, for the automatic detection of structural cracks. This is of paramount importance as the presence of cracks in civil infrastructures such as bridges and buildings can lead to a reduction in local stiffness and material discontinuities, thereby posing a potential threat to public safety. The researchers discovered that the crack detection system they developed was highly effective and efficient in automatically segmenting a diverse range of crack images. Their model also demonstrated the capability to process video data, suggesting its potential for real-time, on-site detection and shape delineation of existing cracks. This capability for early detection could facilitate the implementation of preventative measures, thereby averting substantial damage or structural failure. Now, let's analyse another scientific work.
The most recent article reviewed here, “A Novel Improved Mask R-CNN for Multiple Targets Detection in the Indoor Complex Scenes” [25], is by Zongmin Liu, Jirui Wang, Jie Li, Pengda Liu, and Kai Ren, who are affiliated with Chongqing Technology and Business University and Zhejiang University. The main goal of the article is to propose a novel improved Mask R-CNN method for multiple target detection in indoor complex scenes, which is a challenging task for service robots that need to perceive and grasp objects in complex environments. The authors claim that their method improves the accuracy and anti-interference ability of the detection and segmentation results compared to other methods by integrating the Convolutional Block Attention Module (CBAM) into Mask R-CNN, which enhances the feature representation through channel and spatial attention mechanisms. The scientists consider the influence of different backgrounds, distances, angles and interference factors on the detection performance and design corresponding experiments to verify their method. They establish a comprehensive evaluation system based on the loss function and Mean Average Precision (mAP) to measure the detection and identification effects of their model. Next, let's outline the methods of this research.
Methods of conducting research
To tackle this problem, a regional convolutional neural network (R-CNN) with a mask layer (Mask R-CNN) was chosen for image analysis and inference. It is a large network based on convolutional neural networks that processes images and extracts features from designated regions of interest (RoI), to which a predicted mask is applied.
A regional convolutional neural network with object mask detection, known as Mask R-CNN, is utilised for automated damage detection and assessment. This approach transforms the process of identifying and characterising damage across various infrastructure components. Here, a transfer learning method is used for faster and enhanced results: MR-CNN is pre-trained on a large, diverse dataset named COCO (Common Objects in Context) [22], which provides a robust foundation for feature extraction. COCO is a large-scale dataset of 328 thousand images created in 2014 for object detection, segmentation, and labelling. Its 2.5 million labelled instances are spread across 80 object categories. The dataset was created through extensive crowdsourcing using new user interfaces.
The transfer learning approach [13] offers several benefits:
- Reduced training time: leveraging pre-trained weights significantly accelerates model convergence and requires less training data than training from scratch;
- Better performance: fine-tuning the pre-trained model on domain-specific data often leads to superior performance compared to generic object detection models.
Mask R-CNN (MR-CNN) segments instances by predicting a binary mask for every detected object through a mask branch. The mask branch uses a fully convolutional network (FCN) [17] that generates a mask of size $m \times m$ by taking the region of interest (RoI) features from the Faster R-CNN backbone network [16], which, in turn, derives from Fast R-CNN [15] and introduces a region proposal network (RPN). MR-CNN is trained using a multi-task loss function that includes both bounding box regression and mask prediction losses. The bounding box regression loss used in Faster R-CNN is retained, while the mask prediction loss comprises a per-pixel sigmoid cross-entropy loss [12]. MR-CNN outperforms previous approaches by a significant margin on the COCO [12] instance segmentation benchmark and can segment objects in real time, making it suitable for various real-world use cases.
The main contributions of MR-CNN are its straightforward and efficient architecture for instance segmentation, which extends the Faster R-CNN object detection framework, and its mask branch that predicts a binary mask for every detected object using a fully convolutional network (FCN) [17]. Additionally, a multi-task loss function trains the network for bounding box regression and mask prediction.
Mask R-CNN is composed of various networks and subnetworks, including the Feature Pyramid Network (FPN) [18] and the Region Proposal Network (RPN) [16]. FPN is designed to address the issue of detecting objects of different scales. It does this by creating a pyramid of features with multiple levels, each level corresponding to a different scale. The lower levels of the pyramid are responsible for detecting smaller objects with high resolution but low semantics, while the higher levels deal with larger objects with low resolution but high semantics. FPN maintains strong semantics at all levels by adding high-level features to lower levels.
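The top-down merging that gives every pyramid level strong semantics can be sketched as follows. This is a simplified illustration, assuming single-channel maps whose size halves at each level; the 1x1 lateral convolutions and 3x3 smoothing convolutions of a full FPN are deliberately omitted, and all names and values are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour upsampling of a 2-D map by a factor of 2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def add_maps(a, b):
    """Element-wise sum of two maps of identical size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down(laterals):
    """Merge maps ordered fine -> coarse; returns merged maps in the same order.

    The coarsest map is kept as is; every finer map receives the upsampled
    merged map of the level above it (high-level features added to lower levels).
    """
    merged = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        merged.append(add_maps(lateral, upsample2x(merged[-1])))
    return merged[::-1]


# Hypothetical backbone outputs: 4x4 (fine), 2x2, 1x1 (coarse).
c2 = [[1] * 4 for _ in range(4)]
c3 = [[1, 2], [3, 4]]
c4 = [[10]]
p2, p3, p4 = top_down([c2, c3, c4])
```

After merging, the finest map `p2` carries information that originated in the coarsest map `c4`, which is the property the paragraph above describes.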
With a rich multi-scale feature map from FPN, it is possible to identify where the objects are. Here, RPN slides a small network over the feature map and, at each sliding window location, generates multiple region proposals, each with an objectness score (OS) [16]. These proposals are then used by the rest of the MR-CNN model to detect and segment objects.
Here are the defined working steps of a Mask R-CNN-based model for damage assessment of critical infrastructure objects with a damage score ranging from 0 (intact) to 4 (destroyed):
Step 1: Feature extraction and object detection.
- Image pre-processing: Applying image normalisation and resizing to a standard size.
- Feature extraction: Extracting meaningful features from the input image.
Step 2: Damage classification and segmentation.
- RoI Pooling: Extracting features from the candidate bounding boxes using RoI pooling into fixed-size representations.
- Mask branching: Predicting a binary mask for each candidate bounding box using a separate Mask R-CNN branch implemented as a CNN.
- Damage classification branch: Concatenating the RoI pooled features and the mask features and feeding them into a fully connected network to classify the level of damage (0-4), and then calculating the damage level for each instance mask $M_i$ using a fusion of segmentation mask features and spatial information.
Step 3: Loss calculation.
- RPN Loss: Combination of localisation loss (smooth L1 loss [10]) and classification loss (cross-entropy loss) to identify object proposals;
- Mask Loss: Binary cross-entropy loss [9] for each pixel in the predicted mask to measure agreement with the ground truth mask:
$$L_{mask} = -\left[ y \log p + (1 - y) \log (1 - p) \right],$$
where $y$ is the ground truth label (0 or 1) and $p$ is the predicted probability;
- Damage Classification Loss: Cross-entropy loss [12] between the predicted damage score (0-4) and the ground truth damage score for each object.
Total loss $L_T$ of the model is calculated as follows:
$$L_T = \lambda_{rpn} L_{rpn} + \lambda_{mask} L_{mask} + \lambda_{damage} L_{damage},$$
where $\lambda_{rpn}$, $\lambda_{mask}$ and $\lambda_{damage}$ are hyperparameters that are used to balance the importance of each loss component in the overall loss function.
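The loss components named in Step 3 can be sketched in plain Python. This is a minimal illustration rather than the training code of the described model: the function names, the `beta` parameter of the smooth L1 loss, the clipping constant `eps`, and the default weights of 1.0 are assumptions made for the example.

```python
import math


def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 loss over box coordinates (quadratic near zero, linear beyond beta)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean per-pixel binary cross-entropy between predicted mask probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)   # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def cross_entropy(class_probs, true_class, eps=1e-7):
    """Cross-entropy for one object given softmax probabilities over damage levels 0-4."""
    return -math.log(max(class_probs[true_class], eps))


def total_loss(l_rpn, l_mask, l_damage, w_rpn=1.0, w_mask=1.0, w_damage=1.0):
    """Weighted sum of the three components; the weights play the role of the balancing hyperparameters."""
    return w_rpn * l_rpn + w_mask * l_mask + w_damage * l_damage
```

Raising one weight relative to the others makes the optimiser prioritise that component, which is how the balancing hyperparameters are typically tuned.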
In this paper, a Mask R-CNN-based approach is suggested for evaluating damage to critical infrastructure objects (CIO) caused by anthropogenic factors, including impairment and destruction. The model takes a set of images $I = \{I_1, I_2, \ldots, I_n\}$ as input and produces a damage level score reflecting the extent of damage inflicted on the object. The model is composed of three primary components: a feature extraction module that extracts characteristics from both images using a shared CNN, a feature fusion module that combines the attributes from both images using a fully connected layer, and a regression module that forecasts the damage score using another fully connected layer. MR-CNN is trained on a unique dataset of different infrastructure objects collected from various sources and classified with damage scores from 0 to 4, where 0 is intact, 1 is low damage, 2 is moderate damage, 3 is severe damage and 4 is a destroyed object.
The labelled regions of a single damaged CIO are analysed in numerous convolutional layers (C).
Fig. 2 The core structure of R-CNN with mask detection: the image is passed through convolutional layers (C) being simultaneously transferred into the pooling layers (P), then into the Regional Proposal Network (RPN), where they are processed into binary class and bounding box data, assembled in the region of interest and afterwards forwarded to CNN to detect class, generate bounding box (bbox), and apply a mask.
To measure the agreement between a prediction and the ground truth, the IoU (Intersection over Union) method is utilised [27], which is the ratio of the overlapping area of two bounding boxes to the area of their union. The overlap between the ground-truth bounding box ($B_{gt}$) and the predicted bounding box ($B_{pr}$) forms the numerator, and the union of the two boxes forms the denominator:
$$IoU = \frac{|B_{gt} \cap B_{pr}|}{|B_{gt} \cup B_{pr}|}$$
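A minimal sketch of the IoU computation for axis-aligned boxes is given below. The `(x1, y1, x2, y2)` corner format and the function name are assumptions made for the example; other box conventions need the coordinates reordered.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For two 2x2 boxes offset by one unit in each direction, the overlap is 1 and the union is 7, so the IoU is 1/7.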
The COCO evaluation protocol employs 10 Intersection over Union (IoU) thresholds, spanning from 0.5 to 0.95 in steps of 0.05, to compute the mean Average Precision (mAP) for each category. One commonly used threshold in practice is 0.5. This signifies that a predicted box is considered a true positive detection only if it has an IoU of at least 0.5 with a ground truth box.
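The effect of the threshold choice on a single detection can be sketched as follows. The helper name `true_positive_flags` and the example IoU value are hypothetical; the full mAP computation (ranking detections by confidence and integrating precision over recall) is not shown.

```python
# Ten thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * k, 2) for k in range(10)]


def true_positive_flags(iou_value, thresholds=IOU_THRESHOLDS):
    """For one matched detection, report at which thresholds it counts as a true positive."""
    return {t: iou_value >= t for t in thresholds}


flags = true_positive_flags(0.72)   # hypothetical IoU of one predicted box
hits = sum(flags.values())          # number of thresholds the detection passes
```

A detection with a moderate overlap passes the lenient thresholds but fails the strict ones, which is why averaging over all ten thresholds rewards precise localisation.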
To facilitate the training process, the transfer-learning technique [13] is used, which significantly reduces the training time and improves the accuracy of the model. The model has been fine-tuned on over 200 images, which are divided into training, validation, and testing sets. The unique training dataset comprises 160 images with different damage levels ranging from 0 to 4, which have been validated and tested. First, the images were collected and prepared, and then annotated using the VIA Image Annotator [21] (Figure 3).
Fig. 3 The process of annotating (labelling) gathered image data of the destroyed bridge as of the last level (4). Narodychi, Zhytomyr region, July 2022. Photo: Suspilne Zhytomyr.
After all the training and validation data had been labelled, the training of the model commenced. For this purpose, an NVIDIA RTX 4060 GPU with 8 GB of dedicated memory (VRAM) is utilised. Initially, training was attempted on an Intel Core i5 CPU; however, this was a slow and tedious process, about three times slower than on a GPU. The mask branch has outlined the detected object (in this case, an undamaged bridge) and issued the correct damage level (Fig. 4, 5).
Fig. 4 The generated mask of zero damage of an intact temporary bridge over the Ingulets' River near the settlement of Velyka Oleksandrivka of the Beryslav district. Photo: Local roads of the Kherson region.
Fig. 5 The mask and bounding box of an intact hospital. Photo: Ukrainian Healthcare Center.
In order to evaluate the effectiveness of the trained model, the appropriate metrics such as mean average precision (mAP) are utilised. It is recommended to use a separate validation set or perform cross-validation to determine the model's accuracy and adjust hyperparameters if necessary. Once the model's proficiency has been determined, it can be used to identify damaged or destroyed critical infrastructure objects (CIO) in new, unseen images within the testing dataset. The model will detect objects, create bounding boxes, and generate masks for the identified damaged (or intact) objects.
A foundation for a new inference that links damage levels to specific configurations has been established. Level 0 indicates that the objects are intact and have no damage to their structural elements. Level 1 stands for CIO that have cracks and minor deformations. Level 2 represents some localised damage and broken components. Level 3 identifies extreme damage and fragmentation, while level 4 denotes completely destroyed objects.
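The five-level scale can be captured in a small lookup structure, sketched below. The class name, member names, and the `level_from_scores` helper are hypothetical; the helper simply picks the level with the highest predicted score, which is one plausible way to turn classifier output into a level.

```python
from enum import IntEnum


class DamageLevel(IntEnum):
    INTACT = 0
    LOW = 1
    MODERATE = 2
    HIGH = 3
    DESTROYED = 4


DESCRIPTION = {
    DamageLevel.INTACT: "no damage to structural elements",
    DamageLevel.LOW: "cracks and minor deformations",
    DamageLevel.MODERATE: "localised damage and broken components",
    DamageLevel.HIGH: "extreme damage and fragmentation",
    DamageLevel.DESTROYED: "object completely destroyed",
}


def level_from_scores(scores):
    """Map five per-class scores to the damage level with the highest score."""
    if len(scores) != len(DamageLevel):
        raise ValueError("expected one score per damage level")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return DamageLevel(best)


predicted = level_from_scores([0.05, 0.10, 0.10, 0.15, 0.60])
```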
Fig. 6 The anchors indicate the damaged object: positive anchors before refinement (dashed) and after (solid)--Kherson International Airport. Photo: Serhii Nuzhnenko.
Now back to RPN anchors. The Region Proposal Network (RPN) runs a lightweight binary classifier on a set of boxes (anchors) $A = \{A_1, A_2, \ldots, A_m\}$ over the image and returns object/non-object scores. Anchors with a high object score (positive anchors) are passed on to the second stage for classification.
It is often the case that even positive anchors do not completely cover the damaged objects. Therefore, the RPN also regresses a refinement (a delta in position and size) to be applied to the anchors to move them to the proper boundaries of the object and to resize them (Figure 6).
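The refinement step can be sketched as applying a four-value delta to an anchor. The sketch assumes the parameterisation of Faster R-CNN [16], in which the centre shift is expressed as a fraction of the anchor size and the size change as a logarithm of the scale factor; the `(x1, y1, x2, y2)` box format, the `(dx, dy, dw, dh)` ordering, and the function name are assumptions of this example, and specific implementations may order the values differently.

```python
import math


def apply_delta(anchor, delta):
    """Shift and resize an anchor (x1, y1, x2, y2) by a delta (dx, dy, dw, dh)."""
    x1, y1, x2, y2 = anchor
    dx, dy, dw, dh = delta

    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h

    # Centre moves by a fraction of the anchor size.
    cx += dx * w
    cy += dy * h
    # Width and height are scaled by exp(delta), so a zero delta leaves them unchanged.
    w *= math.exp(dw)
    h *= math.exp(dh)

    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


shifted = apply_delta((0, 0, 10, 10), (0.1, 0.2, 0.0, 0.0))
widened = apply_delta((0, 0, 10, 10), (0.0, 0.0, math.log(2), 0.0))
```

Here `shifted` moves the box by one unit in x and two in y, and `widened` doubles the width around the same centre.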
A grid of anchors covering the entire image at different scales is used for target generation, and the intersection over union (IoU) of each anchor with the reference objects is then calculated. Positive anchors are those with an IoU of 0.7 or more with any reference object, while negative anchors are those whose IoU with every reference object is below 0.3. Anchors whose highest IoU with any reference object is at least 0.3 but less than 0.7 are considered neutral and are not used in training. To train the RPN regressor, the offset and size change required for a positive anchor to fully cover the ground truth object are also calculated.
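The labelling rule with the 0.7 and 0.3 thresholds can be sketched as follows. The label values 1, 0, and -1 for positive, neutral, and negative anchors, the `(x1, y1, x2, y2)` box format, and the function names are conventions assumed for this example.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return 1 (positive), -1 (negative) or 0 (neutral, ignored in training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_thr:
        return 1
    if best < neg_thr:
        return -1
    return 0


gt_boxes = [(0, 0, 10, 10)]            # hypothetical reference object
labels = [label_anchor(a, gt_boxes) for a in [
    (0, 0, 10, 10),                    # IoU 1.0 -> positive
    (0, 0, 10, 5),                     # IoU 0.5 -> neutral
    (20, 20, 30, 30),                  # IoU 0.0 -> negative
]]
```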
To facilitate this alignment in FPN architecture, Ren et al. [16] recommend a specific sorting strategy:
- Pyramid level sorting: prioritising sorting by pyramid level (P1,P2,..., Pn) to simplify level-based anchor separation.
- Feature map sequence sorting: sorting anchors based on the feature map processing sequence, typically top-left to bottom-right within each level.
- Aspect ratio sorting: Choose a consistent order for anchors with different aspect ratios within each feature map cell. Matching the order of ratios passed to the function is recommended.
For high-resolution feature maps, where significant anchor overlap can occur, generating anchors for every other cell can significantly reduce computational load while maintaining adequate coverage. Thus, by associating damage levels with specific anchor groups and utilising the ordered anchoring approach, the model can effectively locate and predict damage based on its severity and extent within the image.
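The ordering described above (pyramid level first, then feature-map cell from top-left to bottom-right, then aspect ratio) and the option of skipping cells can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the `anchor_stride` parameter, the width/height ratio convention and the example scales and strides are hypothetical and do not reproduce the model's actual configuration.

```python
import numpy as np

def generate_level_anchors(scale, ratios, feature_shape, feature_stride, anchor_stride=1):
    """Anchors for one pyramid level as an (N, 4) array of (y1, x1, y2, x2).

    Rows are ordered cell by cell (row-major), then by aspect ratio.
    anchor_stride=2 generates anchors for every other cell only.
    """
    ratios = np.asarray(ratios, dtype=np.float64)  # assumed width / height
    heights = scale / np.sqrt(ratios)
    widths = scale * np.sqrt(ratios)
    # Anchor centres in image pixels.
    ys = np.arange(0, feature_shape[0], anchor_stride) * feature_stride
    xs = np.arange(0, feature_shape[1], anchor_stride) * feature_stride
    cy, cx = np.meshgrid(ys, xs, indexing="ij")
    cy = cy.reshape(-1, 1)  # (cells, 1), broadcast against (ratios,)
    cx = cx.reshape(-1, 1)
    boxes = np.stack(
        [cy - heights / 2, cx - widths / 2, cy + heights / 2, cx + widths / 2],
        axis=-1,
    )  # (cells, ratios, 4)
    return boxes.reshape(-1, 4)

def generate_pyramid_anchors(scales, ratios, feature_shapes, feature_strides, anchor_stride=1):
    """Concatenate per-level anchors in pyramid-level order."""
    levels = [
        generate_level_anchors(scale, ratios, shape, stride, anchor_stride)
        for scale, shape, stride in zip(scales, feature_shapes, feature_strides)
    ]
    return np.concatenate(levels, axis=0)

# Hypothetical two-level pyramid.
dense = generate_pyramid_anchors((32, 64), [0.5, 1, 2], [(4, 4), (2, 2)], [4, 8])
sparse = generate_pyramid_anchors((32, 64), [0.5, 1, 2], [(4, 4), (2, 2)], [4, 8], anchor_stride=2)
print(dense.shape, sparse.shape)  # (60, 4) (15, 4)
```

In this toy configuration, skipping every other cell cuts the anchor count from 60 to 15, which shows where the computational saving comes from.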
Research results and discussion
The model predicts both the bounding box, which indicates the location of the damaged area, and the level of damage. However, it is important to note that the model is still in its beta version and requires further improvement. Figure 7 shows the detection of the damage and its level, and Figure 8 compares the masks for different damage levels.
Hence, although the detection accuracy needs improvement, the model correctly localises the damaged area. The mask layer in MR-CNN delineates the contours of the detected object.
Fig. 7 The test image of the destroyed bridge on the road Kalynivka - Snihurivka with the bounding box of the last level of damage with 74% accuracy. Photo: Agency for the Restoration and Development of Infrastructure of Ukraine.
Below in the image, there is a transparent mask that helps to differentiate the object from the grey background.
Fig. 8 Coloured masks applied to intact (level 0) and fully destroyed grain warehouses. One of them was destroyed by a drone strike on a seaport in Ukraine's Odesa region. Photo: Press Service of the Operational Command South of the Ukrainian Armed Forces/Handout via Reuters.
Fig. 9 The predicted transparent mask highlights the destroyed critical infrastructure object. Kherson International Airport. Photo: Serhii Nuzhnenko.
Performance evaluation
Now it's time to inspect the performance of the model. Let's start with RPN predictions. Here is a visualisation of the distribution of the changes in the coordinates of bounding boxes predicted by a Region Proposal Network (RPN) in an image, also known as the deltas of the bounding boxes. The x-axis represents the change in the value (delta) for a specific coordinate or dimension of the bounding box. The y-axis represents the number of bounding boxes that have that particular change in value.
There are four subplots, one for each of the four values that define a bounding box:
- dy: change in y-coordinate of the top-left corner;
- dx: change in x-coordinate of the top-left corner;
- dw: change in width;
- dh: change in height.
Fig. 10 The deltas of image bounding boxes: the x-axis shows the change in value (delta) for a given coordinate (dimension) of the bounding box; the y-axis shows the number of bounding boxes with that change in value.
The shape of the histograms can reveal important information about the quality of the RPN's predictions. For example, a narrow peak centred around zero would indicate that the RPN is making small, accurate adjustments to the bounding boxes. A wider distribution would indicate that the RPN is making larger, more uncertain adjustments.
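A numerical summary can complement the visual reading of the histograms. The sketch below assumes the deltas are available as an (N, 4) array with columns ordered (dy, dx, dw, dh), as in the list above. The function name `summarise_deltas` and the sample values are hypothetical.

```python
import numpy as np

DELTA_NAMES = ("dy", "dx", "dw", "dh")  # assumed column order

def summarise_deltas(deltas, bins=10):
    """Return mean, standard deviation and histogram counts per delta column."""
    deltas = np.asarray(deltas, dtype=np.float64)
    summary = {}
    for col, name in enumerate(DELTA_NAMES):
        counts, edges = np.histogram(deltas[:, col], bins=bins)
        summary[name] = {
            "mean": float(deltas[:, col].mean()),
            "std": float(deltas[:, col].std()),
            "counts": counts,
            "edges": edges,
        }
    return summary

# Hypothetical deltas for three anchors, for illustration only.
sample = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.1, -0.1, 0.2, -0.2],
    [-0.1, 0.1, -0.2, 0.2],
])
stats = summarise_deltas(sample, bins=5)
for name in DELTA_NAMES:
    print(name, round(stats[name]["mean"], 3), round(stats[name]["std"], 3))
```

A mean close to zero with a small standard deviation corresponds to the narrow peak described above, while a large standard deviation corresponds to the wider distribution.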
Next are the RPN coordinates. The first plot uses the first and second columns of the data (indexed as 0 and 1), and the second plot uses the third and fourth columns (indexed as 2 and 3). They represent the (y1, x1) and (y2, x2) coordinates of the bounding boxes proposed by the RPN [16].
Fig. 11 The scatter plots delineate the spatial distribution of proposed regions in the image space. The first plot represents the top-left corners while the second plot shows the bottom-right corners.
The scatter plots labelled “y1, x1” and “y2, x2” represent the spatial distribution of the proposed regions (bounding boxes) in the image space. Each blue dot represents a proposed region. The first plot (y1, x1) demonstrates the distribution of the top-left corners of the proposed regions, while the second plot (y2, x2) depicts the distribution of the bottom-right corners. The clustering or pattern formation of data points towards the centre in the second plot indicates that the RPN is proposing more regions in the central area of the image.
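The mAP (Mean Average Precision) [26, 27] of the image is calculated as follows:
$$\mathrm{mAP} = \frac{1}{N_c} \sum_{i=1}^{N_c} AP_i$$
where N_c is the number of classes, i is the class id, and AP_i is the average precision for class i.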
In that case, the peak mAP for an IoU value of 0.5 is 0.433, which is a promising result. The mAR (Mean Average Recall) [27] is calculated analogously and amounts to 0.225 (the results vary depending on the image or frame being processed). Finally, let's calculate the F1 score [28] of the model, which is the harmonic mean of precision P and recall R:
$$F_1 = \frac{2 \cdot P \cdot R}{P + R}$$
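The metrics discussed above can be sketched in a few lines of Python. The helper names and the per-class values are hypothetical and serve only to show the arithmetic. Whether the reported F1 score was derived from the mean precision and recall in exactly this way is an assumption.

```python
def mean_over_classes(per_class_values):
    """Average a per-class metric (e.g. AP or AR) over all classes."""
    values = list(per_class_values)
    if not values:
        raise ValueError("at least one class value is required")
    return sum(values) / len(values)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-class values for the five damage levels (0-4).
ap_per_class = [0.5, 0.25, 0.25, 0.5, 0.5]
ar_per_class = [0.25, 0.25, 0.0, 0.25, 0.5]

m_ap = mean_over_classes(ap_per_class)
m_ar = mean_over_classes(ar_per_class)
print(m_ap, m_ar, f1_score(m_ap, m_ar))
```

Because the harmonic mean is dominated by the smaller of its two arguments, a low recall pulls the F1 score down even when precision is comparatively high.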
According to the obtained results, the F1 score equals 0.21 on average, so there is room for improvement. The total loss of the model is calculated as follows:
The lowest loss score that the model has currently achieved is about 1.44 (0.8 on average), which can be reduced by further fine-tuning. The loss graph is displayed below:
Fig. 12 The loss of the trained model using ResNet101: the blue lines stand for a training loss and the red and pink ones stand for a validation loss
Prospects for future research
The created model requires additional training and fine-tuning to reach its peak classification accuracy; however, the results are promising. Future work should focus on improving the detection accuracy, perhaps by incorporating additional training data or refining the model architecture. Furthermore, the model would benefit from a more sophisticated postprocessing step to refine the mask layer. This could involve using morphological operations to smooth the mask or employing a contour detection algorithm to better capture the shape of the object. To improve accuracy, reduce loss, and mitigate class imbalance and bias, the focal cross-entropy loss [29] or similar methods, such as contrastive learning [30, 31], could be implemented in future versions of the model. Additionally, the model can be trained with another architecture, such as SSD (Single Shot MultiBox Detector), SAM (Segment Anything Model), or YOLO (You Only Look Once), including YOLO-NAS and YOLOv9 (the latest version at the time of writing) [32-34], among others. The model can be further improved by moving to another platform, such as PyTorch with the Detectron2 framework, instead of TensorFlow 2.
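For reference, the focal loss of Lin et al. [29] scales the cross-entropy of each sample by (1 - p_t)^gamma, where p_t is the probability assigned to the true class, so that well-classified examples contribute less. The NumPy sketch below is illustrative only: the function name, the default gamma = 2 and the optional per-class `alpha` weights are assumptions, and it is not the loss used by the current model.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0, alpha=None):
    """Multi-class focal loss averaged over samples.

    probs:  (N, C) predicted class probabilities (each row sums to 1)
    labels: (N,) integer class ids, e.g. damage levels 0-4
    alpha:  optional (C,) per-class weights
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels)
    # Probability assigned to the true class, clipped to avoid log(0).
    p_t = np.clip(probs[np.arange(len(labels)), labels], 1e-12, 1.0)
    weight = (1.0 - p_t) ** gamma  # down-weights easy examples
    if alpha is not None:
        weight = weight * np.asarray(alpha, dtype=np.float64)[labels]
    return float(np.mean(-weight * np.log(p_t)))

# Hypothetical predictions for two samples and two classes.
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([0, 1])
cross_entropy = focal_loss(probs, labels, gamma=0.0)  # gamma = 0 gives plain cross-entropy
focal = focal_loss(probs, labels, gamma=2.0)
print(cross_entropy > focal)  # True
```

With gamma set to 0 the expression reduces to ordinary cross-entropy, which makes it straightforward to compare the two losses on the same data.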
Finally, an entirely different approach [35] could change the predictive purpose of the model, e.g. displaying not only the damage level but also pinpointing its precise location. Moreover, it would be advantageous to create a separate model for the assessment of a single type of object, to improve performance while reducing loss. The model can be exported to the ONNX or TensorRT [36, 37] format to integrate and interact seamlessly with other models and devices. Further research in this field is therefore planned.
Conclusions
A new model based on an improved Mask Regional Convolutional Neural Network (MR-CNN) is proposed. The model can identify the level of damage to critical infrastructure by analysing uploaded images.
Overall, the use of CNNs for damage assessment represents a significant advancement in disaster response and recovery efforts. Their ability to extract complex features from images enables accurate damage assessment, even in cases of severe damage or inaccessible areas. While there are challenges associated with using CNNs, their advantages make them an essential tool for emergency responders and damage assessment professionals. As technology continues to evolve, CNNs will likely play an increasingly crucial role in disaster response and recovery efforts in the not-so-distant future.
References:
1. United Nations Development Programme. (2022). Ukraine: Machine learning algorithms and big data scans used to identify war-damaged infrastructure. Retrieved from https://www.undp.org/blog/ukraine-machine-learning-algorithms-and-big-data-scans-used-identify-war-damaged-infrastructure
2. Lin, Q., Ci, T., Wang, L., Mondal, S. K., Yin, H., & Wang, Y. (2022). Transfer Learning for Improving Seismic Building Damage Assessment. Remote Sensing, 14(1), 201. https://doi.org/10.3390/rs14010201
3. Cha, Y.-J., Choi, W., & Buyukozturk, O. (2017). Deep Learning-Based Crack Damage Detection Using Convolutional Neural Networks. Computer-Aided Civil and Infrastructure Engineering, 32(1), 1-18. https://doi.org/10.1111/mice.12239
4. Lv, Q., Zhang, S., & Wang, Y. (2022). Deep Learning Model of Image Classification Using Machine Learning. International Journal of Distributed Sensor Networks, 1-10. https://doi.org/10.1155/2022/3351256
5. Wu, H., & Zhou, Z. (2021). Using Convolution Neural Network for Defective Image Classification of Industrial Components. Mobile Information Systems, Article ID 9092589. https://doi.org/10.1155/2021/9092589
6. LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444. https://doi.org/10.1038/nature14539
7. Gulgec, N. S., Takac, M., & Pakzad, S. N. (2017). Structural damage detection using convolutional neural networks. In Model Validation and Uncertainty Quantification, Volume 3: Proceedings of the 35th IMAC, A Conference and Exposition on Structural Dynamics 2017 (Vol. 3, pp. 331-337). Springer International Publishing. https://doi.org/10.1007/978-3-319-54858-6_33
8. Nex, F., Duarte, D., Tonolo, F. G., & Kerle, N. (2019). Structural Building Damage Detection with Deep Learning: Assessment of a State-of-the-Art CNN in Operational Conditions. Remote Sensing, 11(23), 2765. https://doi.org/10.3390/rs11232765
9. Guo, C., Chen, X., Chen, Y., & Yu, C. (2022). Multi-Stage Attentive Network for Motion Deblurring via Binary Cross-Entropy Loss. Entropy, 24(10), 1414. https://doi.org/10.3390/e24101414
10. Sutanto, A. R., & Kang, D.-K. (2020). A Novel Diminish Smooth L1 Loss Model with Generative Adversarial Network. In IHCI 2020: Intelligent Human Computer Interaction (pp. 361-368). https://doi.org/10.1007/978-3-030-68449-5_36
11. Galera-Zarco, C., & Floros, G. (2023). A deep learning approach to improve built asset operations and disaster management in critical events: an integrative simulation model for quicker decision making. Annals of Operations Research. https://doi.org/10.1007/s10479-023-05247-z
12. Wang, Q., Ma, Y., Zhao, K., & Tian, Y. (2022). A Comprehensive Survey of Loss Functions in Machine Learning. Annals of Data Science, 9, 187-212. https://doi.org/10.1007/s40745-020-00253-5
13. Ribani, R., & Marengoni, M. (2019). A Survey of Transfer Learning for Convolutional Neural Networks. In 2019 32nd SIBGRAPI Conference on Graphics, Patterns and Images Tutorials (SIBGRAPI-T) (pp. 47-57). https://doi.org/10.1109/SIBGRAPI-T.2019.00010
14. He, K., Gkioxari, G., Dollar, P., & Girshick, R. (2017). Mask R-CNN. In 2017 IEEE International Conference on Computer Vision (ICCV) (pp. 2980-2988). https://doi.org/10.1109/ICCV.2017.322
15. Girshick, R. (2015). Fast R-CNN. In 2015 IEEE International Conference on Computer Vision (ICCV) (pp. 1440-1448). https://doi.org/10.1109/ICCV.2015.169
16. Ren, S., He, K., Girshick, R., & Sun, J. (2017). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6), 1137-1149. https://doi.org/10.1109/TPAMI.2016.2577031
17. Dai, J., Li, Y., He, K., & Sun, J. (2016). R-FCN: Object Detection via Region-based Fully Convolutional Networks. NIPS 2016 (pp. 379-387). https://doi.org/10.48550/arXiv.1605.06409
18. Lin, T. Y., Dollar, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017). Feature Pyramid Networks for Object Detection. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 936-944). https://doi.org/10.1109/CVPR.2017.106
19. Li, Q., Zhang, J., & Jiang, H. (2019). Incorporating multi-source remote sensing in the detection of earthquake-damaged buildings based on logistic regression modelling. Nat. Hazards Earth Syst. Sci. Discuss. https://doi.org/10.5194/nhess-2019-20
20. Xiong, J., Zheng, L., Xu, C., Cen, R., Zheng, Y., & Li, Y. (2021). Multiple-Input Convolutional Neural Network Model for Large-Scale Seismic Damage Assessment of Reinforced Concrete Frame Buildings. Applied Sciences, 11(17), 8258. https://doi.org/10.3390/app11178258
21. Dutta, A., & Zisserman, A. (2019). The VIA annotation software for images, audio and video. In Proceedings of the 27th ACM International Conference on Multimedia (MM '19) (pp. 2276-2279). https://doi.org/10.1145/3343031.3350535
22. Borji, A. (2022). Complementary datasets to COCO for object detection. Retrieved from https://doi.org/10.48550/arXiv.2206.11473
23. Zhao, F., & Zhang, C. (2020). Building Damage Evaluation from Satellite Imagery using Deep Learning. In IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI) (pp. 82-89). https://doi.org/10.1109/IRI49571.2020.00020
24. Tan, C., Uddin, N., & Mohammed, Y. M. (n.d.). Deep Learning-Based Crack Detection Using Mask R-CNN Technique. In 9th International Conference on Structural Health Monitoring of Intelligent Infrastructure. Retrieved from https://par.nsf.gov/biblio/10147424
25. Liu, Z., Wang, J., Li, J., Liu, P., & Ren, K. (2023). A Novel Improved Mask RCNN for Multiple Targets Detection in the Indoor Complex Scenes. ArXiv. Retrieved from https://doi.org/10.48550/arXiv.2302.05293
26. Henderson, P., & Ferrari, V. (2017). End-to-End Training of Object Class Detectors for Mean Average Precision. In Computer Vision - ACCV 2016. Lecture Notes in Computer Science (Vol. 10115, pp. 331-337). Springer, Cham. https://doi.org/10.1007/978-3-319-54193-8_13
27. Gyu, Z. (2018). An Introduction to Evaluation Metrics for Object Detection. Zenggyu's Blog. Retrieved from https://blog.zenggyu.com/posts/en/2018-12-16-an-introduction-to-evaluation-metrics-for-object-detection/index.html
28. Hand, D. J., Christen, P., & Kirielle, N. (2021). F*: an interpretable transformation of the F-measure. Machine Learning, 110, 451-456. https://doi.org/10.1007/s10994-021-05964-1
29. Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollar, P. (2017). Focal Loss for Dense Object Detection. In 2017 IEEE International Conference on Computer Vision (ICCV) (pp. 2999-3007). https://doi.org/10.1109/ICCV.2017.324
30. Marrakchi, Y., Makansi, O., & Brox, T. (2019). Structural Building Damage Detection with Deep Learning: Assessment of a State-of-the-Art CNN in Operational Conditions. Remote Sensing, 11(23), 2765. https://doi.org/10.3390/rs11232765
31. Xu, S., & Lan, S. (2022). A Comprehensive Survey of Loss Functions in Machine Learning. Annals of Data Science, 9, 187-212. https://doi.org/10.1007/s40745-020-00253-5
32. Pham, S. V. H., & Nguyen, K. V. T. (2023). Productivity Assessment of the Yolo V5 Model in Detecting Road Surface Damages. Applied Sciences, 13, 12445. https://doi.org/10.3390/app132212445
33. Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, Real-Time Object Detection. ArXiv:1506.02640 [cs.CV]. Retrieved from https://doi.org/10.48550/arXiv.1506.02640
34. Osco, L. P., Wu, Q., de Lemos, E. L., Gonçalves, W. N., Ramos, A. P. M., Li, J., Marcato, J. (2023). The Segment Anything Model (SAM) for remote sensing applications: From zero to one shot. International Journal of Applied Earth Observation and Geoinformation, 124, 103540. https://doi.org/10.1016/j.jag.2023.103540
35. Abedi, M., Shayanfar, J., & Al-Jabri, K. (2023). Infrastructure Damage Assessment via Machine Learning Approaches: a systematic review. Asian Journal of Civil Engineering, 24, 3823-3852. https://doi.org/10.1007/s42107-023-00748-5
36. Jin, T., Bercea, G.-T., Le, T. D., Chen, T., Su, G., Imai, H., Negishi, Y., Leu, A., O'Brien, K., Kawachiya, K., & Eichenberger, A. E. (2020). Compiling ONNX Neural Network Models Using MLIR. arXiv preprint arXiv:2008.08272. https://doi.org/10.48550/arXiv.2008.08272
37. Zhou, Y., & Yang, K. (2022). Exploring TensorRT to Improve Real-Time Inference for Deep Learning. In 2022 IEEE 24th International Conference on High Performance Computing & Communications; 8th International Conference on Data Science & Systems, 2011-2018. https://doi.org/10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00299
Література:
1. United Nations Development Programme. Ukraine: Machine learning algorithms and big data scans used to identify war-damaged infrastructure [Електронний ресурс] / United Nations Development Programme. 2022. - Режим доступу: https://www.undp.org/blog/ukraine- machine-learning-algorithms-and-big-data-scans-used-identify-war-damaged-infrastructure.
2. Lin Q., Ci T., Wang L., Mondal S. K., Yin H., Wang Y. Transfer Learning for Improving Seismic Building Damage Assessment // Remote Sensing. - 2022. - Т. 14, № 1. - С. 201. - DOI: 10.3390/rs14010201.
3. Cha Y.-J., Choi W., Buyukozturk O. Deep Learning-Based Crack Damage Detection Using Convolutional Neural Networks // Computer-Aided Civil and Infrastructure Engineering. - 2017. - Т. 32, № 1. - С. 1-18. - DOI: 10.1111/mice.12239.
4. Lv Q., Zhang S., Wang Y. Deep Learning Model of Image Classification Using Machine Learning // International Journal of Distributed Sensor Networks. - 2022. - С. 1-10. - DOI: 10.1155/2022/3351256.
5. Wu H., Zhou Z. Using Convolution Neural Network for Defective Image Classification of Industrial Components // Mobile Information Systems. - 2021. - Article ID 9092589. - DOI: 10.1155/2021/9092589.
6. LeCun Y., Bengio Y., Hinton G. Deep learning // Nature. - 2015. - Т. 521, № 7553. - С. 436-444. - DOI: 10.1038/nature14539.
7. Gulgec N. S., Takac M., Pakzad S. N. Structural damage detection using convolutional neural networks. / N. S. Gulgec, M. Takac, S. N. Pakzad // Model Validation and Uncertainty Quantification, Volume 3: Proceedings of the 35th IMAC, A Conference and Exposition on Structural Dynamics 2017. - Т. 3. - С. 331-337. - Springer International Publishing. - DOI: 10.1007/978-3-319-54858-6_33.