1. Introduction
With the rapid development of remote-sensing technology, researchers can quickly obtain large-scale, high-resolution planetary images for space-energy exploration [1] and geological research [2]. The Moon is the natural body closest to the Earth and is well studied compared with other celestial bodies. By detecting rilles and impact craters on the Moon, which have long been a hot research topic [3], we are more likely to find space energy.
The surface of the Moon is marked by dark, channel-like features known as rilles [4]. According to [5], the Moon is rich in energy resources such as helium-3, an efficient, clean, safe, and cheap nuclear fusion fuel. Helium-3 is far more abundant on the Moon than on the Earth and has been shown to be plentiful in lunar rilles [6]. The study of impact craters can be used to deduce geological age, to explore for water ice [7], to select landing sites for lunar rovers [8], to support autonomous navigation [9], and to assist other tasks for deep-space probes [10].
To explore and discover space energy sources and to study geological evolution, we must first detect impact craters and rilles on the Moon’s surface. In the past, such detection was usually performed visually and manually. However, this approach is time-consuming and inaccurate, which motivates the development of automated detection tools, particularly deep-learning methods, to detect impact craters and rilles on the Moon.
For example, the Selenographia published by Hevelius [11] mapped the lunar impact craters, and Pike [12] obtained 484 impact craters and their scale parameters from Apollo stereo image data. However, with the acquisition of higher-precision data, the low accuracy and efficiency of manual recognition became more prominent. Traditional recognition methods based on the Hough transform [13], feature matching [14], image transformation and segmentation [15], and quadratic curve fitting [16] achieve higher accuracy and more pronounced effects, but their calculation processes are complicated and time-consuming, and further improvement is needed for global impact crater detection. In recent years, with the advent of AlexNet [17] and the rapid development of graphics processing unit (GPU) hardware, more researchers began to use convolutional neural networks (CNNs) in deep learning to solve object detection and semantic segmentation tasks in computer vision. Silburt [18] applied the U-Net framework to detect impact craters over the whole Moon and obtained relatively accurate detection results. DeLatte [19] built a Crater U-Net framework using Keras to enlarge the database and strengthen recognition, although the detection accuracy still needs improvement. Yang [20] proposed a method based on Chang’E-1 and Chang’E-2 digital elevation model (DEM) and digital orthophoto map (DOM) data and the R-FCN object detection framework, detecting a total of 109,956 new impact craters; on this basis, the paper also analyzed geological ages. In addition, Jia [21] proposed the SCNeSt architecture with multi-path representation and channel-wise self-calibrated convolution, providing higher detection and estimation accuracy for small impact craters. Yang [22] designed HRFPNet with a feature aggregation module (FAM) to extract the context information of small craters while preserving their features in deep convolutional layers, and they built a Mars daytime crater detection (MDCD) dataset consisting of 500 images of 729 × 729 pixels containing some twelve thousand craters. The development history of impact crater detection is shown in Figure 1.
In general, most existing methods were not developed for space-energy discovery: they detect only craters, not lunar rilles, which are directly related to lunar energy discoveries. The high-resolution network (HRNet) model proposed by Chen [23] is a deep model for human pose estimation, but its accuracy rate (83.7%) and recall rate (53.8%) are low, resulting in poor recognition performance. To discover space energy more accurately and efficiently, we designed a new deep-learning method, the global-local high-resolution network (GL-HRNet), to explore craters and rilles simultaneously. GL-HRNet maintains high-resolution feature maps throughout the process and obtains more accurate spatial information. Its multi-scale fusion strategy also yields richer high-resolution representations and makes the predicted lunar features more accurate.
In summary, the contributions of this work are:
To further promote lunar energy discovery, we propose a new deep-learning approach that automatically identifies craters and rilles simultaneously.
We propose a new semantic segmentation method, GL-HRNet, which is superior to GLNet- and HRNet-based network structures in terms of segmentation accuracy and mean absolute error and can be easily used in other similar tasks.
We also find distinctive features in the density distribution of craters across the Moon. For relatively small craters (1–5 km in diameter), the density of impact craters differs markedly between the lunar maria and the highlands, and small craters on the lunar maria are deeper than those on the highlands.
The rest of this article is organized as follows. Section 2 introduces the proposed method: the global branch (HRNet) and local branch (ResUNet) of GL-HRNet and the branch aggregation algorithm. Section 3 introduces the experimental data, evaluation indexes, and experimental conditions. Section 4 evaluates lunar rilles and impact craters and compares the proposed network with other existing networks. Finally, Section 5 gives our conclusions and puts forward some directions for future work.
2. Materials and Methods
By combining deep learning and transfer learning, we propose GL-HRNet for lunar energy detection, based on GLNet [24], HRNet [25], and UNet [26], as shown in Figure 2. Firstly, projection, downsampling, and random clipping were carried out on the different remote-sensing data to split the large, complex data into manageable patches. Secondly, in the global branch of GL-HRNet, the ResNet [27] in GLNet was replaced with HRNet, and HRNet together with a feature pyramid network (FPN) [28] was taken as the backbone network. In this way, rich multi-scale information of craters and rilles was integrated while high-resolution feature maps were maintained. In the local branch, ResUNet and FPN were used as the backbone network, and the local network was trained independently without the original feature-sharing strategy, to eliminate the uncertainty caused by inadequately learned feature maps in the global branch. Finally, a primary loss function and auxiliary loss functions were used to push the segmentation maps of the global and local branches closer to the corresponding manual annotations, and the prediction map was the output.
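As a concrete illustration of the random-clipping step in this pre-processing pipeline, the following Python sketch cuts fixed-size patches from a large mosaic; the function name, patch size, and patch count are illustrative assumptions, not the paper’s actual settings:

```python
import numpy as np

def random_patches(image, patch, n, rng=None):
    """Randomly clip n square patch x patch windows from a large image."""
    rng = np.random.default_rng(rng)
    h, w = image.shape[:2]
    out = []
    for _ in range(n):
        r = rng.integers(0, h - patch + 1)   # top-left row of the window
        c = rng.integers(0, w - patch + 1)   # top-left column of the window
        out.append(image[r:r + patch, c:c + patch])
    return np.stack(out)
```

Applied to a projected and downsampled DEM/DOM mosaic, this turns one unwieldy image into a batch of training patches.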
2.1. Global Branch of GL-HRNet
Inspired by HRNet [25], in this paper HRNet and FPN were used as the backbone to replace GLNet’s ResNet. HRNet connects subnets from high resolution to low resolution in parallel and repeatedly fuses multi-resolution features to generate reliable high-resolution representations. Compared with the original ResNet in GLNet, the resolution of the feature map is improved and the global context information is enriched, while more detailed information is retained, improving segmentation performance. Its structure is shown in Figure 3.
Parallel multi-resolution subnets are constructed through parallel connections from high-resolution to low-resolution subnets. Each subnet contains multiple convolutional sequences, and a down-sampling layer between adjacent subnets halves the resolution of the feature maps. The high-resolution subnet forms the first stage, and subnets of progressively lower resolution are gradually added to form the later stages, so that multiple resolution subnets run in parallel. The parallel subnets of each stage keep all the resolutions of the previous stage and add one lower resolution. The network structure consists of four parallel subnets, as shown in Figure 4.
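The stage-wise growth of the parallel resolutions can be written out explicitly. Assuming an illustrative base resolution of 256 (not taken from the paper), each new stage keeps every earlier resolution and appends one halved resolution:

```python
def stage_resolutions(base=256, stages=4):
    """Resolutions of the parallel subnets at each stage: stage n runs n
    subnets, keeping all previous resolutions and adding one halved one."""
    return [[base // 2 ** i for i in range(n)] for n in range(1, stages + 1)]

print(stage_resolutions())
# [[256], [256, 128], [256, 128, 64], [256, 128, 64, 32]]
```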
The global branch introduces exchange units across the parallel subnets so that each subnet repeatedly receives information from the other subnets. An example of an exchange unit is shown in Figure 4b, where stage 3 is divided into multiple exchange blocks. Each exchange block consists of three parallel convolutional units and one exchange unit. The specific implementation of aggregating feature information of different resolutions through exchange units in GL-HRNet is shown in Figure 5.
The exchange unit takes s response maps {X1, X2, …, Xs} as input; each output is aggregated from all input response maps, and the corresponding outputs are {Y1, Y2, …, Ys}, where Yi and Xi have the same resolution and dimension. The expression from input to output is Yk = Σ(i=1..s) a(Xi, k).
Each cross-stage exchange unit has an additional output Ys+1, and a(Xi, k) denotes changing the resolution of the input Xi from level i to level k, realized by down-sampling or up-sampling. The exchange unit of the GL-HRNet global branch uses 3 × 3 convolution with a stride of 2 for down-sampling, while up-sampling is realized by bilinear interpolation. If i = k, then a(Xi, k) is the identity mapping, a(Xi, k) = Xi.
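The aggregation Yk = Σi a(Xi, k) can be sketched in NumPy. To stay self-contained, this sketch substitutes 2 × 2 average pooling for the stride-2 3 × 3 convolutions and pixel repetition for bilinear interpolation; only the resolution bookkeeping matches the description above:

```python
import numpy as np

def resample(x, i, k):
    """a(X_i, k): bring a map from resolution level i to level k.
    Down-sampling halves the size per step; up-sampling doubles it."""
    for _ in range(k - i):                      # k > i: down-sample
        h, w = x.shape
        x = x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    for _ in range(i - k):                      # k < i: up-sample
        x = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
    return x                                    # i == k: identity

def exchange_unit(xs):
    """Y_k = sum_i a(X_i, k): every output aggregates all s inputs."""
    s = len(xs)
    return [sum(resample(xs[i], i, k) for i in range(s)) for k in range(s)]

# three parallel branches at 16 x 16, 8 x 8, and 4 x 4
ys = exchange_unit([np.ones((16, 16)), np.ones((8, 8)), np.ones((4, 4))])
print([y.shape for y in ys])   # each Y_k keeps the resolution of X_k
```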
2.2. Local Branch of GL-HRNet
Inspired by UNet [26] and ResNet [27], ResUNet is used as the backbone network of the local branch. Unlike the original GLNet, the proposed network does not deeply share feature maps between the branches: although the global branch can supplement the context information that the local branch lacks, inadequately learned global feature maps can confuse the learning of the local feature maps. Therefore, independent training is adopted to improve the segmentation effect of the local branch. The network structure of the local branch is shown in Figure 6.
The FPN ResUNet network has an asymmetric, end-to-end structure. There are 44 convolutional layers in the network, with four down-sampling and four up-sampling operations in total. There is no fully connected layer, and the output features of each layer carry enhanced semantic information. The encoder part of the network is similar to ResNet18; it reduces the spatial dimension of the input image through convolution, pooling, and other operations to extract high-level features.
After each feature-extraction step with two residual structures, 2 × 2 max pooling was used to reduce the spatial dimension, filtering out some unimportant high-frequency information and shrinking the feature map. To mitigate vanishing gradients, the ReLU function was used for all activation functions of the model. Each residual structure contains a batch-normalization (BN) layer, which normalizes each batch of features along the encoding path so that the distribution at each level is relatively stable; this largely improves the robustness of the model, accelerates convergence, and increases model capacity.
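The encoder step just described (two residual structures followed by 2 × 2 max pooling, with BN and ReLU throughout) can be sketched functionally in NumPy. Here `conv` is a placeholder for any shape-preserving convolution, and the inference-style BN is an assumption made to keep the sketch self-contained:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def batch_norm(x, eps=1e-5):
    """Normalize each channel over the batch (inference-style sketch)."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def residual_block(x, conv):
    """y = ReLU(x + F(x)) with F = conv -> BN -> ReLU -> conv -> BN."""
    f = batch_norm(conv(relu(batch_norm(conv(x)))))
    return relu(x + f)

def encoder_step(x, conv):
    """Two residual structures, then 2 x 2 max pooling to halve H and W."""
    x = residual_block(residual_block(x, conv), conv)
    n, c, h, w = x.shape
    return x.reshape(n, c, h // 2, 2, w // 2, 2).max(axis=(3, 5))
```

The identity skip connection in `residual_block` is what keeps gradients flowing through the deep encoder.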
2.3. Branch Ensemble
Feature maps extracted from the local and global branches can be divided into L layers, represented by XL,i and XG,i, respectively, where i ∈ {1, …, L} and L = 4. The feature maps of the last layer are concatenated along the channel dimension, and the final segmentation map SAGG is obtained through the aggregation layer fAGG. In addition to the primary loss function on SAGG, two auxiliary loss functions were adopted in this paper to push the segmentation maps of the global branch, SG,L, and of the local branch, SL,L, closer to the corresponding manually annotated results (ground truth, GT). This operation also makes the training process more stable. As shown in Figure 7, the aggregation layer between the two branches, fAGG, is composed of a 3 × 3 convolution and realizes the ensemble of the feature maps of the two branches.
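A minimal sketch of this training objective, using pixel-wise binary cross-entropy as a stand-in for the segmentation loss (the paper does not specify the loss form or auxiliary weighting here, so `bce` and `aux_weight` are assumptions):

```python
import numpy as np

def bce(pred, gt, eps=1e-7):
    """Pixel-wise binary cross-entropy between a probability map and GT."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return -np.mean(gt * np.log(pred) + (1.0 - gt) * np.log(1.0 - pred))

def gl_hrnet_loss(s_agg, s_global, s_local, gt, aux_weight=1.0):
    """Primary loss on the aggregated map S_AGG plus two auxiliary
    losses on the global-branch and local-branch segmentation maps."""
    return bce(s_agg, gt) + aux_weight * (bce(s_global, gt) + bce(s_local, gt))
```

During training, all three maps are pushed toward the same ground truth, which is what stabilizes the two-branch optimization.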
5. Conclusions
In this paper, based on the combination of deep learning and transfer learning, a lunar feature detection method (GL-HRNet) combining high-resolution features with an improved GLNet was used to further promote lunar energy discovery and geological research. In this method, the ResNet in GLNet is replaced with HRNet, and HRNet and FPN are used as the backbone network, which integrates rich multi-scale information of craters and rilles while maintaining high-resolution feature maps. In the local branch, ResUNet and FPN are used as the backbone network and trained independently. The primary loss function and auxiliary loss functions are aggregated to bring the segmentation outputs of the global and local branches closer to the corresponding manual labeling results. Compared with different CNN network structures, the GL-HRNet model achieves higher accuracy (88.7 ± 8.9%) and recall (80.1 ± 2.7%) and smaller latitude and longitude errors. In addition, the model identifies Martian impact craters well, discovers more new impact craters, and is more robust in detecting lunar rilles. Finally, by analyzing the density distribution of lunar impact craters with diameters below 5 km, we found that the density of small impact craters is notably high in the North Pole region and in a local area of the lunar highlands (5°–85°E, 25°–50°S), while the density of impact craters in the Orientale Basin does not differ significantly from that in the surrounding areas.