• Advanced Photonics Nexus
  • Vol. 5, Issue 2, 026007 (2026)
Futong Zhang1,2,3, Kai Song1,2,3, Yaoxing Bian1,2,3,*, Shijun Zhao1,2,3, Hongrui Liu1,2,3, Hongda Ge1,2,3, Lei Han1,2,3, Yichen Yu1,2,3, Weifeng Zhang1,2,3, Dong Wang1,2,3, and Liantuan Xiao1,2,3,4,*
Author Affiliations
  • 1Taiyuan University of Technology, College of Physics and Optoelectronics Engineering, Taiyuan, China
  • 2Taiyuan University of Technology, Shanxi Key Laboratory of Precision Measurement Physics, Taiyuan, China
  • 3Taiyuan University of Technology, Key Laboratory of Advanced Transducers and Intelligent Control System, Ministry of Education, and Shanxi Province, Taiyuan, China
  • 4Shanxi University, Institute of Laser Spectroscopy, State Key Laboratory of Quantum Optics Technologies and Devices, Taiyuan, China
  • *Corresponding author: bianyaoxing@tyut.edu.cn,xlt@sxu.edu.cn
  • DOI: 10.1117/1.APN.5.2.026007
    Futong Zhang, Kai Song, Yaoxing Bian, Shijun Zhao, Hongrui Liu, Hongda Ge, Lei Han, Yichen Yu, Weifeng Zhang, Dong Wang, Liantuan Xiao, "Robust photon-level single-pixel imaging through diverse scattering media," Adv. Photon. Nexus 5, 026007 (2026)

    Abstract

    Imaging through scattering media holds significant application potential in remote sensing, biomedical diagnostics, and industrial detection. However, conventional imaging systems fail to maintain robustness and generalization across diverse environments. Here, we demonstrate a photon-level single-pixel imaging system that exploits data-domain alignment to overcome this limitation. By coupling a physical preprocessing module with a deep neural network, the system translates scattering-induced degradations from different scattering media into a unified data domain, preserving the essential structure of the optical information. Under natural fog and rain conditions, the proposed method clearly reconstructs the fine target details at a distance of 150 m, demonstrating strong robustness across the scattering medium. With 0.088 photons per pattern per pixel in a single measurement, the 256 × 256 dynamic imaging is well reconstructed. These results establish a generalizable framework for photon-level imaging in diverse scattering media and highlight its promise for robust optical imaging under extreme atmospheric conditions.

    © The Authors. Published by SPIE and CLP under a Creative Commons Attribution 4.0 International License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

    1 Introduction

    Imaging through scattering media plays a critical role in applications such as underwater rescue, autonomous driving, and biomedical imaging [1–4]. Multiple scattering induced by turbid media not only severely attenuates the intensity of ballistic photons but also introduces complex noise components. Moreover, the noise characteristics vary significantly across different types of scattering media [5,6]. Existing techniques for improving imaging quality through scattering media can generally be categorized into three classes: physical methods, computational methods, and hybrid approaches that combine both. Physical methods aim to increase the proportion of ballistic photons during image acquisition to suppress scattering effects. Examples include time-of-flight discrimination between ballistic and multiple-scattered photons, polarization-based optical gating, and the use of narrowband illumination in wavelength-sensitive media for selective suppression [7–9]. These methods usually rely on specialized hardware and perform effectively when imaging through static scattering media. Computational methods, by contrast, typically operate as post-processing strategies that exploit image priors, statistical models, or filtering algorithms to extract target structures and improve image contrast. Although computational methods offer high processing speed, they generally require raw data with relatively high signal-to-noise ratios [10–14]. Hybrid approaches can, to some extent, leverage the strengths of both strategies, yet they still struggle to maintain satisfactory image quality under diverse scattering conditions [15,16].

    Single-pixel imaging (SPI), owing to its unique imaging mechanism and broad spectral response, offers an effective balance between physical and computational methods [17,18]. The single-point detection scheme, combined with efficient post-processing algorithms, enables SPI to suppress scattering-induced noise and enhance image quality when imaging through turbid media [19]. Photon-level SPI is a key advance over traditional SPI for extremely low-light scenarios [20–24]. By encoding the target scene through a spatial light modulator and detecting the corresponding returns with a high-sensitivity single-photon detector, it enables reliable image acquisition at the photon level (an average of roughly one photon per pixel or fewer) [25–27]. Benefiting from a variety of advanced physical imaging schemes, the detection sensitivity in imaging through scattering media has been significantly enhanced [28,29]. Although these strategies markedly enhance sensitivity, they also amplify the effects of multiple scattering and environmental noise, causing reconstructed images to be overwhelmed by noise. Deep learning has demonstrated remarkable capabilities in image denoising and enhancement [30–34]. Existing deep-learning-based approaches perform well under specific scattering scenarios, such as underwater imaging at different turbidity levels, where they achieve strong adaptive denoising [29]. However, their robustness and generalization are markedly limited in diverse scattering scenarios. Recently, comprehensive compensation of real-world degradations has significantly improved the robustness and imaging quality of SPI [35]. Therefore, it is necessary to construct a physical degradation model for diverse scattering environments to achieve robust photon-level imaging through diverse scattering media.

    In this work, a photon-level single-pixel imaging technique through scattering media based on data-domain alignment is proposed. This approach significantly suppresses the noise introduced by multiple scattering and improves the robustness of cross-media imaging. A preprocessing module performs an initial normalization on reconstructed images obtained through different scattering media, minimizing the data-domain gap among various types of degraded images. The normalized images are then processed by the proposed histogram prior compensation network (HPCnet), which incorporates histogram-based priors to compensate for medium-specific degradation while preserving physical consistency. As a result, the system can recover fine structural details under various real scattering media and shows strong generalization capability. The results demonstrate that the method can achieve dynamic imaging at a frame rate of 15 frames per second (fps) with 0.088 photons per pattern per pixel. The design is compatible with both active and passive illumination in SPI, ensuring flexible deployment in a wide range of scattering scenarios. When integrated with multimodal sensors such as automotive radar and light detection and ranging systems, it can further enhance the robustness of environmental perception. This fusion can reduce the impact of weather variations on imaging results and provide reliable support for intelligent driving in complex environments, as shown in Fig. 1(a). The proposed technique is expected to become an important approach for addressing perception challenges in adverse weather conditions for intelligent driving systems.

    Figure 1. Principle of deep photon-level single-pixel imaging through scattering media. (a) Schematic of scattering-media imaging scenarios for intelligent driving. (b) Workflow of the photon-level single-pixel imaging process through scattering media. (c) TVAL reconstruction results under various scattering noise conditions (original), preprocessing results using the PODC algorithm, and enhancement results obtained with U-Net and HPCnet, respectively. U-Net is trained on the original image dataset, whereas U-Net+ refers to the direct use of PODC preprocessing together with U-Net. This configuration is employed to evaluate the effectiveness of the PODC module when combined with a standard network architecture. ki indicates the dominant scattering degradation feature: k1 corresponds to attenuation-dominated degradation, k2 to particle-noise-dominated degradation, and k3 to distortion-dominated degradation. (d) Peak signal-to-noise ratio (PSNR) and (e) multiscale structural similarity index measure (MS-SSIM) of reconstructed images under various scattering noise conditions using different reconstruction methods (Video 1, MP4, 392 KB [URL: https://doi.org/10.1117/1.APN.5.2.026007.s1]).

    2 Methods

    The proposed photon-level single-pixel imaging process through scattering media is as follows. First, image data are acquired using photon-level single-pixel imaging through various scattering media. Then, a deep neural network based on data-domain alignment is applied to enhance the image quality, as illustrated in Fig. 1(b).

    The basic principle of photon-level SPI is to encode the incident light field using a spatial light modulator and to record the corresponding light intensity with a single-pixel detector. The target image is then recovered through an inverse reconstruction algorithm [36]:

    $$I = Af + n,$$

    where $I$ is the measurement vector, $A$ is the measurement matrix composed of patterns, $f$ is the target, and $n$ is the noise. When imaging through a scattering medium, the structural-detection-based SPI process can be expressed as

    $$I = A(Hf) + n,$$

    where $H$ is the scattering transmission matrix. After acquiring a complete set of measurements, image reconstruction is carried out using sparsity constraints and optimization algorithms, such as total variation regularization with the alternating direction method of multipliers (TVAL) [37]:

    $$\hat{f} = \arg\min_{f} \|A(Hf) - I\|_2^2 + \mathrm{TV}(Hf),$$

    where TV denotes the regularization term, and $\hat{f}$ represents the reconstructed original image. In strong scattering environments, the original images reconstructed by photon-level SPI usually suffer from low contrast, blurred edges, and weakened structures, with target details often submerged in background noise [38]. If deep learning methods are applied directly for enhancement, the robustness of the neural network decreases due to variations in the scattering medium. To address this issue, domain alignment of the original image data is required prior to network processing, which ensures consistent feature representation across different scattering conditions.
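
The measurement model above can be illustrated with a small numerical sketch. The example simulates I = A(Hf) + n with random binary patterns and optional Poisson photon noise, then inverts the noise-free case by ordinary least squares. The pattern type, the 8 × 8 image size, the identity choice for H, the photon budget, and the function names are all assumptions made for illustration. In particular, least squares is only a stand-in here and is not the TVAL reconstruction used in this work.

```python
import numpy as np

def simulate_spi(A, f, H=None, photons_per_measurement=None, rng=None):
    """Return single-pixel measurements I = A(Hf) + n.

    H=None treats the medium as an identity (assumption); when
    photons_per_measurement is given, Poisson shot noise is applied.
    """
    f = np.asarray(f, dtype=float).ravel()
    Hf = f if H is None else H @ f
    clean = A @ Hf
    if photons_per_measurement is None:
        return clean
    rng = np.random.default_rng() if rng is None else rng
    # Rescale so the mean measurement equals the requested photon count.
    scale = photons_per_measurement / clean.mean()
    return rng.poisson(clean * scale) / scale

def reconstruct_least_squares(A, I):
    """Least-squares estimate of f (a stand-in, not TVAL)."""
    f_hat, *_ = np.linalg.lstsq(A, I, rcond=None)
    return f_hat

rng = np.random.default_rng(0)
n_side = 8                                    # hypothetical image size
f_true = np.zeros((n_side, n_side))
f_true[2:6, 3:5] = 1.0                        # simple bar target
# Hypothetical random binary patterns, 4x oversampled for a stable inverse.
A = rng.integers(0, 2, size=(4 * n_side**2, n_side**2)).astype(float)

I_clean = simulate_spi(A, f_true)
f_rec = reconstruct_least_squares(A, I_clean).reshape(n_side, n_side)
I_noisy = simulate_spi(A, f_true, photons_per_measurement=5.0, rng=rng)
```

In the photon-starved, undersampled regime considered in this work the system is far from overdetermined, which is why a regularized solver such as TVAL is needed in place of plain least squares.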

    A prior-oriented denoising compensation (PODC) algorithm is introduced for domain normalization in our imaging system. By performing statistical normalization on the local neighborhood of each pixel in the original image, the pixel distribution is mapped to a unified prior distribution, thereby enhancing the representation of structural features. The proposed PODC not only improves the clarity of image details but also aligns degraded images obtained under different scattering media into the same data domain [39]. This alignment facilitates more effective feature extraction by the neural network and improves its robustness. Specifically, for any pixel $(i,j)$ in the image, the local mean $\mu_N(i,j)$ and the local standard deviation $\sigma_N(i,j)$ are computed within its neighborhood [40]:

    $$\mu_N(i,j) = \frac{1}{|N|} \sum_{(p,q) \in N(i,j)} I_{p,q}, \qquad \sigma_N(i,j) = \sqrt{\frac{1}{|N|} \sum_{(p,q) \in N(i,j)} \left(I_{p,q} - \mu_N(i,j)\right)^2},$$

    where $I_{p,q}$ denotes the image intensity at pixel position $(p,q)$, and $N(i,j)$ represents the neighborhood centered at pixel $(i,j)$.

    Based on the local mean and local standard deviation, each local region of the image is normalized as

    $$I'_{i,j} = \frac{I_{i,j} - \mu_N(i,j)}{\sigma_N(i,j) + \epsilon} \cdot \sigma + \mu,$$

    where $\sigma$ is the scaling coefficient applied after local normalization, which compresses the locally normalized pixel values into the desired contrast range and thereby adjusts the overall contrast of the image; $\mu$ is an additive term applied after scaling to adjust the global brightness of the image; and $\epsilon$ is a small constant introduced to prevent division by zero.

    To further improve the stability of domain alignment and suppress the enhancement of abnormal pixels, an optional nonlinear control operation is incorporated into PODC:

    $$I''_{i,j} = \tanh(\alpha \cdot I'_{i,j}),$$

    where $I'_{i,j}$ is the locally normalized intensity and the nonlinear control parameter $\alpha$ sets the strength of the compression map, which adjusts the degree to which the histogram is compressed. PODC provides standardized image data; its pseudocode and the optimal parameter choices for different scattering conditions are given in Supplementary Algorithm S1, and a parameter-sensitivity analysis is provided in Fig. S1 in the Supplementary Material.
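
The local normalization and the optional tanh compression described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the pseudocode of Supplementary Algorithm S1: the function name `podc_normalize`, the square window of odd size, the reflect padding at the borders, and the default values of `window`, `sigma`, `mu`, `eps`, and `alpha` are all hypothetical choices.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def podc_normalize(img, window=7, sigma=0.25, mu=0.5, eps=1e-6, alpha=None):
    """Local mean/std normalization followed by optional tanh compression.

    window : odd side length of the square neighborhood N(i, j) (assumed)
    sigma  : scaling coefficient applied after normalization
    mu     : additive brightness term
    alpha  : nonlinear control parameter; None skips the tanh step
    """
    img = np.asarray(img, dtype=float)
    pad = window // 2
    # Reflect padding is an assumption about border handling.
    padded = np.pad(img, pad, mode="reflect")
    patches = sliding_window_view(padded, (window, window))
    local_mean = patches.mean(axis=(-2, -1))
    local_std = patches.std(axis=(-2, -1))    # population std, i.e. 1/|N|
    out = (img - local_mean) / (local_std + eps) * sigma + mu
    if alpha is not None:
        out = np.tanh(alpha * out)
    return out

rng = np.random.default_rng(1)
degraded = rng.random((32, 32))               # stand-in for a TVAL image
aligned = podc_normalize(degraded, alpha=1.0)
```

Because every neighborhood is mapped to the same target contrast and brightness, images with very different global statistics end up with similar histograms, which is the sense in which the step aligns data domains.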

    However, relying on statistical normalization alone is insufficient to fully recover the texture details of the image. To better suppress the noise induced by multiple scattering and to improve the quality of the reconstructed images, a preprocessing-aware module is incorporated into the U-Net. Based on this modification, a specialized image enhancement network, termed HPCnet, is designed to adapt to the aligned data in Fig. 1(b). The computational efficiency of different networks is compared in Table S2 in the Supplementary Material [41,42]. Specifically, a lightweight feature extraction module first processes the PODC-pretreated reconstruction to obtain initial features. These features are then passed into the backbone network for hierarchical encoding and decoding. The backbone adopts a symmetric encoder–decoder architecture, consisting of three downsampling modules and three upsampling modules. The numbers of channels in the convolutional layers are 64, 128, and 256, respectively. Skip connections between the encoder and decoder are used to preserve feature information at different scales.

    The training dataset consists of both experimental and simulated data. The experimental dataset contains 5,000 images acquired through scattering media with different concentrations, including fat emulsion, sediment suspension, and fog. The simulated dataset is generated from the experimental images by applying rotation, scaling, and a small amount of synthetic noise, with 20 simulated images produced from each experimental image. In total, the final dataset contains more than 100,000 images. This large-scale dataset ensures sufficient diversity for training and improves the generalization capability of the proposed network. Furthermore, the robustness of HPCnet arises from two key design aspects. First, the PODC preprocessing substantially reduces the domain discrepancy across different scattering media, guiding the network to learn structure-dependent representations rather than medium-specific degradation patterns, thereby improving cross-media generalization [43]. Second, the training dataset incorporates a physics-informed scattering degradation model, providing physically meaningful supervisory signals that enhance the interpretability and physical consistency of the learned features [44].

    The MS-SSIM is adopted as the loss function, which is defined as [45]

    $$\mathrm{MS\_SSIM}(x,y) = [l_M(x,y)]^{\alpha_M} \cdot \prod_{j=1}^{M} [c_j(x,y)]^{\beta_j} \cdot [s_j(x,y)]^{\gamma_j},$$

    where $l_M(x,y)$ denotes the luminance comparison function at the $M$-th scale, $c_j(x,y)$ denotes the contrast comparison function at the $j$-th scale, and $s_j(x,y)$ denotes the structure comparison function at the $j$-th scale. The terms $\alpha_M$, $\beta_j$, and $\gamma_j$ are the corresponding weighting exponents for each component.

    Because the MS-SSIM does not exceed 1, the final loss function $L$ is defined as

    $$L = 1 - \mathrm{MS\_SSIM}(I_{\mathrm{out}}, I_{\mathrm{gt}}),$$

    where $I_{\mathrm{out}}$ denotes the network output reconstructed from the data-domain aligned input, and $I_{\mathrm{gt}}$ is the corresponding ground-truth image.
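
To make the loss concrete, the following is a simplified NumPy sketch of MS-SSIM and the loss L = 1 − MS_SSIM. It rests on several assumptions that are not taken from the text: a uniform 7 × 7 window instead of a Gaussian one, 2 × 2 average pooling between scales, the contrast and structure terms merged into one factor (which corresponds to equal exponents β_j = γ_j), and the five-scale exponents commonly used with MS-SSIM. The exponents actually used for training are not stated here, and a training implementation would need a differentiable framework.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _local_mean(img, win):
    return sliding_window_view(img, (win, win)).mean(axis=(-2, -1))

def _ssim_terms(x, y, win, data_range):
    """Mean luminance term l and merged contrast-structure term c*s."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = _local_mean(x, win), _local_mean(y, win)
    var_x = _local_mean(x * x, win) - mu_x ** 2
    var_y = _local_mean(y * y, win) - mu_y ** 2
    cov = _local_mean(x * y, win) - mu_x * mu_y
    lum = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    cs = (2 * cov + c2) / (var_x + var_y + c2)
    return lum.mean(), cs.mean()

def _downsample(img):
    """2x2 average pooling (assumed inter-scale filter)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def ms_ssim(x, y, weights=(0.0448, 0.2856, 0.3001, 0.2363, 0.1333),
            win=7, data_range=1.0):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    value = 1.0
    for j, w in enumerate(weights):
        lum, cs = _ssim_terms(x, y, win, data_range)
        value *= max(cs, 0.0) ** w            # clamp to avoid NaN powers
        if j == len(weights) - 1:
            value *= lum ** w                 # luminance only at scale M
        else:
            x, y = _downsample(x), _downsample(y)
    return value

def ms_ssim_loss(output, target):
    return 1.0 - ms_ssim(output, target)
```

With five scales and a 7 × 7 window, inputs need to be at least 112 pixels on each side so that the coarsest scale still contains one full window.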

    3 Results

    To quantitatively evaluate the robustness and generalization ability of the proposed imaging system under different scattering conditions, simulation tests are conducted. A series of synthetic scattering degradation models is constructed, including fog and turbid water models (see Fig. S2 in the Supplementary Material) [46,47]. The parameter ki = 0 denotes no degradation, whereas ki = 1 represents complete degradation where the target information is indistinguishable. A degradation type is considered dominant when its proportion exceeds 0.5, whereas the proportions of other noise types remain below 0.5. The scattering medium degradation model is applied to the resolution chart at varying proportions, and then the original images are obtained by TVAL, as shown in the first column of Fig. 1(c).
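
The dominance rule stated above can be written as a short helper. This is only a sketch of the stated criterion: the dictionary representation and the function name `dominant_degradation` are hypothetical, and the degradation models themselves (Fig. S2) are not reproduced.

```python
def dominant_degradation(proportions, threshold=0.5):
    """Return the name of the dominant degradation type, or None.

    proportions maps a degradation name (e.g. "k1") to a value in [0, 1].
    A type is dominant when it exceeds the threshold while every other
    type stays below it.
    """
    above = [name for name, k in proportions.items() if k > threshold]
    if len(above) != 1:
        return None
    others_below = all(
        k < threshold for name, k in proportions.items() if name != above[0]
    )
    return above[0] if others_below else None

example = dominant_degradation({"k1": 0.7, "k2": 0.2, "k3": 0.1})
```

Under this reading, a mixture in which no type exceeds 0.5, or in which two types do, has no dominant degradation.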

    The degraded images are then enhanced using the proposed PODC algorithm, U-Net, and the designed HPCnet, as shown in the second to fifth columns of Fig. 1(c). The training sets of PODC and U-Net are k1-dominant original images, whereas the training sets of U-Net+ and HPCnet are k1-dominant images preprocessed by PODC. First, the enhancement results of different algorithms and neural networks on the k1-dominated original images are compared. It can be observed that U-Net+ achieves significantly better enhancement than U-Net, effectively reconstructing the digits disturbed by noise, which can be attributed to the PODC preprocessing. Furthermore, the proposed HPCnet achieves markedly better enhancement than U-Net+, fully reconstructing the detailed digit information. This result demonstrates that the proposed network offers superior adaptability.

    To further verify the robustness of the proposed imaging system through different scattering media, enhancement performance is compared for k2- and k3-dominated images. The enhancement performance of both U-Net and U-Net+ decreases when the dominant noise type changes. Residual noise is still present in the reconstructed images, and the digits are not clearly recovered. In contrast, the results produced by HPCnet show clearly reconstructed digits while maintaining a clean background. Furthermore, to quantitatively demonstrate the high robustness of the proposed imaging system, the PSNR and MS-SSIM of the reconstructed images are calculated in Figs. 1(d) and 1(e). As the noise type changes, the PSNR of the images enhanced by HPCnet remains above 19.01 dB, whereas the MS-SSIM remains above 0.9. These results demonstrate that the proposed imaging system can effectively suppress noise under different scattering conditions and achieve high robustness in cross-media image reconstruction.
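
For reference, PSNR values such as those quoted above follow from the mean squared error between a reconstruction and its ground truth. The sketch below uses the standard definition; the assumption that images are normalized to a data range of 1.0 is ours, since the normalization used for the reported numbers is not stated in the text.

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images with the given range."""
    reference = np.asarray(reference, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")                   # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# A uniform error of 0.1 on a unit-range image gives an MSE of 0.01.
example_psnr = psnr(np.zeros((8, 8)), np.full((8, 8), 0.1))
```

On this scale, a PSNR near 19 dB corresponds to a root-mean-square error of roughly 11% of the data range.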

    To verify the feasibility of the imaging system, an experimental setup is constructed as shown in Fig. S3 in the Supplementary Material. A 532 nm laser (MGL-III-532 nm) is used as the illumination source. The signal light reflected from the target passes through the scattering medium and is collected by an imaging lens, which projects it onto a digital micromirror device (DMD, UPOLabs HDSLM136D70-DDR). The modulated signal is then detected by a single-photon avalanche diode (SPAD, Siminics SPD500), and the ground truth images of targets are obtained by a CCD. Because quasi-static and dynamic scattering media impose markedly different effects on light propagation, the imaging system requires different parameter settings for sampling rate, frame rate, and illumination conditions under different scattering media. To accommodate imaging across multiple scattering environments and dynamic target scenarios, we systematically analyze the influence of pattern playback frame rate and illumination intensity on the quality of reconstructed images at different sampling rates.

    Figure 2(a) shows the effect of the playback frame rate on the quality of the reconstructed image under sampling rates ranging from 1.0% to 4.0%. Due to the playback frame rate limitation of the DMD, the maximum achievable imaging frame rates corresponding to sampling rates of 1.0%, 2.0%, 3.0%, and 4.0% are 30.4, 15.2, 10.1, and 7.6 fps, respectively. As the playback frame rate increases, the image quality decreases across all sampling rates. This degradation mainly results from the shorter display time of each pattern, which reduces the number of photons captured by the SPAD. The corresponding reconstructed images are shown in Fig. S4 in the Supplementary Material. At the same frame rate, increasing the sampling rate does not yield higher-quality reconstructed images. This is because once the imaging quality reaches saturation, additional patterns only introduce more environmental noise.
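
The inverse relation between sampling rate and maximum imaging frame rate can be checked with a short calculation. The sketch assumes a 256 × 256 image, one displayed pattern per measurement, and a DMD pattern rate of 20 kHz. The pattern rate is a hypothetical round number chosen for illustration and is not stated in the text; with it, the computed frame rates land close to the reported 30.4, 15.2, 10.1, and 7.6 fps.

```python
def imaging_frame_rate(pattern_rate_hz, sampling_rate, n_pixels=256 * 256):
    """Frames per second when each frame needs sampling_rate * n_pixels patterns."""
    n_patterns = round(sampling_rate * n_pixels)
    return pattern_rate_hz / n_patterns

ASSUMED_PATTERN_RATE_HZ = 20_000.0            # hypothetical DMD playback rate

frame_rates = {
    s: imaging_frame_rate(ASSUMED_PATTERN_RATE_HZ, s)
    for s in (0.01, 0.02, 0.03, 0.04)
}
for s, fps in frame_rates.items():
    print(f"sampling rate {s:.0%}: {fps:.1f} fps")
```

Doubling the sampling rate doubles the number of patterns per frame, so at a fixed pattern rate the frame rate halves, and the per-pattern exposure time (hence the photon count per pattern) is fixed by the playback rate alone.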

    Figure 2. Robustness verification of the proposed system under different scattering coefficients. (a) Effect of the imaging frame rate on the MS-SSIM of original images at different sampling rates. (b) Effect of the number of photons per pattern per pixel on the MS-SSIM of original images at different sampling rates. (c) Reconstructed images obtained using SCU-Net, DPIR, U-Net+, and HPCnet under different concentrations of fat emulsion (corresponding to different scattering coefficients). The “MS-SSIM/PSNR” values are indicated below the corresponding images.

    Figure 2(b) shows the effect of photon number on the MS-SSIM of reconstructed images at sampling rates ranging from 1.0% to 4.0%. When the photon number per pattern per pixel is sufficiently large, the quality difference among reconstructed images at different sampling rates becomes negligible. As the photon number decreases, the reconstruction quality at low sampling rates deteriorates rapidly. In contrast, reconstructed images at higher sampling rates exhibit a clear advantage. The corresponding reconstructed images are shown in Fig. S5 in the Supplementary Material. These experimental results show that in low-light environments, excessively low sampling rates result in the loss of fine structural details. Conversely, extremely high sampling rates limit the achievable imaging frame rate, which becomes insufficient for imaging through dynamic scattering media or dynamic targets. Therefore, a trade-off strategy balancing frame rate and sampling rate is adopted in the following experiments. For quasi-static scattering media, a higher sampling rate of 4.0% is employed to ensure the reconstruction of target details under low-light conditions. For dynamic scattering media (fog environment), a 2.0% sampling rate is employed for static targets, while a 1.0% sampling rate is employed for dynamic targets.

    To verify the effectiveness of the designed PODC, comparative experiments are conducted using different preprocessing algorithms. Diluted fat emulsions of varying concentrations are used as scattering media [48]. The optical thickness and scattering coefficients of the mixed solutions are controlled by adding different volumes of standard fat emulsion into a fixed amount of water, as shown in Table S1 in the Supplementary Material. Figure S8 in the Supplementary Material shows the reconstruction results using Gamma, Retinex, and PODC under different scattering coefficients [49–51]. Although the PODC algorithm enhances image quality, residual noise remains noticeable in certain regions of the reconstructed images under strong scattering conditions. To further suppress environmental noise, the PODC-preprocessed images are fed into SCU-Net, DPIR, U-Net+, and the designed HPCnet [52,53]. As the scattering coefficient increases, traditional neural network models exhibit pronounced blurring and loss of structural details. In contrast, HPCnet effectively removes background noise and consistently produces high-quality images across different scattering coefficients, maintaining the PSNR above 19.19 dB and the MS-SSIM above 0.848. We also compare the computational cost of HPCnet and these networks in Supplementary Note 6 in the Supplementary Material. The results demonstrate that HPCnet can achieve optimal imaging performance with relatively low model complexity. Moreover, the reconstructed images under different experimental conditions are similarly enhanced using HPCnet, as shown in Figs. S6 and S7 in the Supplementary Material. The results demonstrate that the proposed imaging system effectively adapts to variations in sampling rate, frame rate, and photon number.

    To further validate the robustness of the proposed imaging system under different scattering conditions, imaging experiments are performed using a resolution chart placed behind fat emulsion, fog, and sediment suspension. Figure 3(a) compares the reconstructed results of TVAL, SCU-Net, DPIR, U-Net+, and HPCnet. With fat emulsion as the medium, the original images exhibit extremely low contrast and severely degraded edges. SCU-Net, DPIR, and U-Net+ all improve the overall contrast of the reconstructed image to a certain extent, but their image quality is still significantly worse than that of HPCnet. Substantial speckle noise remains in their reconstructed images, and the recovery of fine structures is limited. The corresponding PSNR and MS-SSIM of the reconstructed images are listed in Table S3 in the Supplementary Material. When the scattering environment changes to fog and sediment suspension, the robustness of these comparison methods drops significantly: the reconstructed images are seriously degraded, and some target structures are difficult to identify. In contrast, the proposed method exhibits stable and consistent imaging performance under the different scattering media. It effectively suppresses the speckle noise introduced by strong scattering and maintains clear structural recovery when imaging across scattering environments, reflecting better robustness and generalization performance. The histogram distributions for the various scattering media are shown in Fig. S9 in the Supplementary Material, illustrating the adaptability of the proposed method across different scattering media [48].

    Figure 3. Robustness verification of the proposed system across different scattering media. (a) Imaging results of resolution charts through fat emulsion, fog, and sediment suspension (abbreviated as Sed. Susp.) using TVAL, SCU-Net, DPIR, U-Net+, and HPCnet. (b) Imaging results of speed-limit signs using TVAL, SCU-Net, DPIR, U-Net+, and HPCnet. (c) Cross-sectional profiles along the dashed lines of the reconstructed images. (d) PSNR and (e) MS-SSIM of reconstructed speed-limit signs obtained by TVAL, SCU-Net, DPIR, U-Net+, and HPCnet.

    To validate the practical applicability of the proposed imaging system, long-distance outdoor experiments are conducted using an active illumination scheme. A traffic sign positioned 70 m away is selected as the target, and a fog environment is created by placing a humidifier in front of the imaging system, as shown in Fig. S10(a) in the Supplementary Material. Figure 3(b) shows the reconstructed images of the speed-limit sign obtained using TVAL, SCU-Net, DPIR, U-Net+, and HPCnet. In the dense fog environment, the original images exhibit severe noise contamination. SCU-Net improves image contrast but retains substantial noise. Images reconstructed by DPIR exhibit poor contrast, with severe smearing between target edges and the background, resulting in unclear boundaries. Although U-Net+ significantly improves image contrast, it also loses structural detail. In contrast, our imaging system reconstructs high-quality images of the speed-limit sign with clear edge details. Additional imaging results of traffic signs are provided in Fig. S10(b) in the Supplementary Material. Furthermore, Fig. 3(c) shows the intensity profiles along the central cross-sections of the reconstructed images. Compared with other networks, the proposed system produces steeper edge transitions and smoother background regions, demonstrating its dual advantage in edge enhancement and background suppression. In addition, the PSNR and MS-SSIM of the reconstructed images are shown in Figs. 3(d) and 3(e). The proposed system achieves a maximum PSNR of 23.06 dB and an MS-SSIM of 0.83, both significantly higher than those of conventional methods.

    To verify the imaging performance of the proposed system under extreme weather conditions, passive imaging of a clock located 150 m away is performed under natural illumination, as shown in Fig. 4. Figure 4(a) shows the real testing environment and the corresponding photograph of the passive imaging system. Figure 4(b) shows the reconstructed images obtained using TVAL, SCU-Net, DPIR, U-Net+, and the proposed system under different weather conditions, including heavy rain and dense fog. Due to the nonuniformity of natural illumination, the original images reconstructed directly with TVAL contain significant noise, with target details nearly submerged and difficult to recover. Under daytime heavy rain, the digits on the clock face reconstructed by SCU-Net, DPIR, and U-Net+ appear largely blurred. Under daytime dense fog, SCU-Net, DPIR, and U-Net+ all fail to fully recover the submerged edge structures of the target. Moreover, in nighttime heavy rain, the background intensity of the reconstructed images increases, leading to further degradation of image contrast. The reconstructions by SCU-Net, DPIR, and U-Net+ also exhibit more severe smearing artifacts. In contrast, benefiting from the designed preprocessing module and deep enhancement mechanism, the proposed system successfully reconstructs the complete contour and fine details of the clock under all weather conditions, demonstrating high robustness for imaging through atmospheric environments.

    Figure 4. Robustness verification of the proposed system under different extreme weather conditions. (a) Passive imaging environment under natural illumination. (i) and (ii) Photograph and schematic of the imaging setup. (b) Reconstructed images of a clock located 150 m away under different weather conditions using TVAL, SCU-Net, DPIR, U-Net+, and the proposed system. (c) Extracted frames over time from the imaging of a flashing traffic light located 70 m away in a fog environment using the proposed system.

    Furthermore, to evaluate the reconstruction performance of the proposed imaging system for dynamic targets, passive imaging experiments on a traffic light located 70 m away are conducted in a fog environment. Figure 4(c) shows frames extracted from the recorded video, and the corresponding dynamic traffic light video is provided in Supplementary Movie 1. At 0.088 photons per pattern per pixel, the proposed imaging scheme consistently identifies the luminous regions and reconstructs their shapes across multiple time frames, achieving an imaging frame rate of 15 fps. These results demonstrate that our system exhibits high robustness and applicability for both static and dynamic targets under natural heavy rain and dense fog, highlighting its potential for practical implementation.
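
To put the photon budget and frame rate quoted above into perspective, the short Python sketch below converts them into per-pattern and per-second quantities. It assumes that "photons per pattern per pixel" means the detected photon count per pattern divided by the number of image pixels, and it uses a hypothetical sampling rate of 5%, which is not stated in this passage; the function names are likewise illustrative.

```python
def photons_per_pattern(photons_per_pattern_per_pixel, width, height):
    """Mean detected photons per pattern, assuming the per-pixel figure is
    the per-pattern photon count divided by the number of image pixels."""
    return photons_per_pattern_per_pixel * width * height


def patterns_per_second(frame_rate_fps, sampling_rate, width, height):
    """Pattern rate needed to sustain a frame rate at a given sampling rate,
    where sampling_rate = patterns per frame / number of image pixels."""
    patterns_per_frame = sampling_rate * width * height
    return frame_rate_fps * patterns_per_frame


if __name__ == "__main__":
    # Values from the text: 0.088 photons per pattern per pixel, 256 x 256, 15 fps.
    print(photons_per_pattern(0.088, 256, 256))  # about 5767 photons per pattern
    # The 5% sampling rate is a hypothetical value chosen for illustration only.
    print(patterns_per_second(15, 0.05, 256, 256))  # about 49152 patterns per second
```

Under these assumptions, each pattern carries on the order of a few thousand detected photons in total, even though the average per pixel is well below one.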

    4 Conclusion

    In summary, a photon-level single-pixel imaging technology through scattering media based on data-domain alignment is proposed, which effectively addresses the severe image degradation encountered by conventional imaging systems under diverse scattering conditions. Comprehensive experiments demonstrate that the technology exhibits high robustness and stability across diverse scattering environments, offering clear advantages in structural detail preservation and contrast enhancement. The effects of imaging frame rate and photon number on image quality under different sampling rates are systematically analyzed, leading to the optimization of system parameters for imaging through various scattering media. Then, based on a comparison of imaging performance using various preprocessing algorithms under different scattering coefficients, the effectiveness and necessity of the designed PODC module are verified. Next, image reconstruction of a speed-limit sign located 70 m away is performed under fog conditions. Unlike conventional enhancement networks, which are often overfitted to a single scattering scenario, HPCnet explicitly models cross-domain adaptation, enabling robust generalization across diverse scattering environments. Under natural fog and rain, the proposed method clearly reconstructs the details of the clock tower at a distance of 150 m, demonstrating strong cross-media generalization and practical applicability. Remarkably, at 0.088 photons per pattern per pixel, the proposed system also achieves dynamic imaging of a traffic signal at a frame rate of 15 fps. These results highlight the promise of the proposed technique for practical applications such as traffic monitoring, long-range security surveillance, underwater detection, and biomedical imaging, providing a feasible pathway for reliable target perception in low-light and strongly scattering environments [54-57].

    Acknowledgments

    This work was supported by the National Natural Science Foundation of China (Grant Nos. 62305239, U23A20380, 62127817, and 6191101445), the Science and Technology Major Special Project of Shanxi Province (Grant No. 202201010101005), the National Key Research and Development Program of China (Grant No. 2022YFA1404201), and the Fundamental Research Program of Shanxi Province (Grant No. 202203021222133).

    Liantuan Xiao is a professor and PhD supervisor at the College of Physics and Optoelectronic Engineering, Taiyuan University of Technology, and a distinguished professor under the Changjiang Scholars Program (Ministry of Education, China). He received his BS degree (1989), MS degree (1997), and PhD (2001) in physics from Shanxi University. His research focuses on precision measurement physics and single-photon communication and imaging. He has published over 200 papers, including in Nature Physics, Nature Communications, and Physical Review Letters.

    Biographies of the other authors are not available.

    References

    [1] S. Kang et al. Tracing multiple scattering trajectories for deep optical imaging in scattering media. Nat. Commun., 14, 6871(2023).

    [2] R. Liu et al. Scanning-driven photon-counting 3D imaging through scattering media via asynchronous polarization modulation. Laser Photonics Rev., 18, 2300916(2024).

    [3] S. Yan et al. Image reconstruction through a nonlinear scattering medium via deep learning. Photonics Res., 12, 2047-2055(2024).

    [4] X. Chang et al. Pixel super-resolved lensless on-chip sensor with scattering multiplexing. ACS Photonics, 10, 2323-2331(2023).

    [5] S. G. Narasimhan, S. K. Nayar. Vision and the atmosphere. Int. J. Comput. Vis., 48, 233-254(2002).

    [6] D. Akkaynak, T. Treibitz. Sea-thru: a method for removing water from underwater images, 1682-1691(2019).

    [7] J. F. de Boer et al. Polarization sensitive optical coherence tomography – a review [Invited]. Biomed. Opt. Express, 8, 1838(2017).

    [8] L. Zhu et al. Color imaging through scattering media based on phase retrieval with triple correlation. Opt. Laser Eng., 124, 105796(2020).

    [9] X. Ren et al. Single-shot full-Stokes imaging through scattering media. Optica, 12, 1560-1568(2025).

    [10] K. He et al. Single image haze removal using dark channel prior, 1956-1963(2009).

    [11] J. Y. Chiang, Y.-C. Chen. Underwater image enhancement by wavelength compensation and dehazing. IEEE Trans. Image Process., 21, 1756-1769(2012).

    [12] K. Zhang et al. Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process., 26, 3142-3155(2017).

    [13] J. Wang et al. Fast non-local algorithm for image denoising, 1429-1432(2006).

    [14] Y. Xiao et al. High-resolution ghost imaging through complex scattering media via a temporal correction. Opt. Lett., 47, 3692(2022).

    [15] M. Xiang et al. High-resolution imaging through dynamic scattering media: iterative optimization based on polarimetric characteristics. Sci. China Technol. Sci., 68, 1100405(2024).

    [16] G. Ma et al. Foveated imaging through scattering medium with LG-basis transmission matrix. Opt. Laser Eng., 159, 107199(2022).

    [17] X. Zhang et al. VGenNet: variable generative prior enhanced single pixel imaging. ACS Photonics, 10, 2363-2373(2023).

    [18] W. Zian et al. High-resolution dual-polarization single-pixel imaging through dynamic and complex scattering media using random-frequency-encoded time sequences. Photonics Res., 13, B22-B28(2025).

    [19] L. Zhou et al. High-resolution self-corrected single-pixel imaging through dynamic and complex scattering media. Opt. Express, 31, 23027(2023).

    [20] A. McCarthy et al. High-resolution long-distance depth imaging LiDAR with ultra-low timing jitter superconducting nanowire single-photon detectors. Optica, 12, 168-177(2025).

    [21] H. Liu et al. Single-photon single-pixel dual-wavelength imaging via frequency spectral harmonics extraction strategy. Opt. Express, 33, 1636(2025).

    [22] M. Shcherbatenko et al. Single-pixel camera with a large-area microstrip superconducting single photon detector on a multimode fiber. Appl. Phys. Lett., 118, 1103(2021).

    [23] Y. Wang et al. Mid-infrared single-pixel imaging at the single-photon level. Nat. Commun., 14, 1073(2023).

    [24] K. Song et al. Photon-level single-pixel 3D tomography with masked attention network. Opt. Express, 32, 4387(2024).

    [25] S. Wu et al. Photon-level single-pixel imaging of dynamic features in frequency domain. Chin. Opt. Lett., 23, 081101(2025).

    [26] M. Shangguan et al. Time-multiplexing single-photon imaging lidar with single-pixel detector. Appl. Phys. Lett., 124, 1104(2024).

    [27] X. Liu et al. Photon-limited single-pixel imaging. Opt. Express, 28, 8132(2020).

    [28] L. Pan et al. Single photon single pixel imaging into thick scattering medium. Opt. Express, 31, 13943(2023).

    [29] F. Zeng et al. Photon-level single-pixel wavefront imaging through turbid underwater environment. APL Photonics, 10, 0805(2025).

    [30] L. Bian et al. High-resolution single-photon imaging with physics-informed deep learning. Nat. Commun., 14, 5902(2023).

    [31] M. J. Fanous, G. Popescu. GANscan: continuous scanning microscopy using deep learning deblurring. Light Sci. Appl., 11, 265(2022).

    [32] F. Li et al. Unsupervised learning enabled label-free single-pixel imaging for resilient information transmission through unknown dynamic scattering media. Opto-Electron. Adv., 8, 250013-250011(2025).

    [33] D. Lu et al. Customizing the field of view for imaging through scattering media. Adv. Photon. Nexus, 4, 066004(2025).

    [34] X. Luo et al. Revolutionizing optical imaging: computational imaging via deep learning. Photonics Insights, 4, R03(2025).

    [35] Z. Liu et al. Comprehensive compensation of real-world degradations for robust single-pixel imaging. Light Sci. Appl., 14, 365(2025).

    [36] K. Song et al. Advances and challenges of single-pixel imaging based on deep learning. Laser Photonics Rev., 19, 2401397(2024).

    [37] D. J. Starling et al. Compressive sensing spectroscopy with a single pixel camera. Appl. Opt., 55, 5198(2016).

    [38] W. Wan et al. Demonstration of asynchronous computational ghost imaging through strong scattering media. Opt. Laser Technol., 154, 108346(2022).

    [39] P. Hernández-Cámara et al. Neural networks with divisive normalization for image segmentation. Pattern Recognit. Lett., 173, 64-71(2023).

    [40] A. Ortiz et al. Local context normalization: revisiting local normalization, 11276-11285(2020).

    [41] O. Ronneberger et al. U-Net: convolutional networks for biomedical image segmentation. Lect. Notes Comput. Sci., 9351, 234-241(2015).

    [42] X. Yang et al. Underwater ghost imaging based on generative adversarial networks with high imaging quality. Opt. Express, 29, 28388(2021).

    [43] S. Zhu et al. Imaging through unknown scattering media based on physics-informed learning. Photonics Res., 9, B210-B219(2021).

    [44] F. Wang et al. Deep learning for computational imaging: from data-driven to physics-enhanced approaches. Adv. Photonics, 7, 054002(2025).

    [45] Z. Wang et al. Multiscale structural similarity for image quality assessment, 1398-1402(2003).

    [46] D. Ngo et al. Single-image visibility restoration: a machine learning approach and its 4K-capable hardware accelerator. Sensors, 20, 5795(2020).

    [47] A. d’Arco et al. Physics-based neural network for non-invasive control of coherent light in scattering media. Opt. Express, 30, 30845-30856(2022).

    [48] H. Liu et al. Learning-based real-time imaging through dynamic scattering media. Light Sci. Appl., 13, 194(2024).

    [49] J. J. Jeon, I. K. Eom. Low-light image enhancement using inverted image normalized by atmospheric light. Signal Process., 196, 108523(2022).

    [50] X. Li et al. Pixel-wise gamma correction mapping for low-light image enhancement. IEEE Trans. Circuits Syst. Video Technol., 34, 681-694(2024).

    [51] A. Hore, D. Ziou. Image quality metrics: PSNR vs. SSIM(2010).

    [52] K. Zhang et al. Practical blind image denoising via Swin-Conv-UNet and data synthesis. Mach. Intell. Res., 20, 822-836(2023).

    [53] K. Zhang et al. Plug-and-play image restoration with deep denoiser prior. IEEE Trans. Pattern Anal. Mach. Intell., 44, 6360-6376(2022).

    [54] D. Xu et al. Intelligent photonics: a disruptive technology to shape the present and redefine the future. Engineering, 46, 186-213(2025).

    [55] M. Qiao et al. Harnessing forward scattering effect for high dynamic imaging. PhotoniX, 6, 41(2025).

    [56] J. Bertolotti, O. Katz. Imaging in complex media. Nat. Phys., 18, 1008-1017(2022).

    [57] G. Satat et al. Towards photography through realistic fog, 1-10(2018).
