Deep learning-based fringe projection profilometry (FPP) shows potential for challenging three-dimensional (3D) reconstruction of objects with dynamic motion, complex surfaces, and extreme environments. However, previous deep learning-based methods are all supervised, which require large training datasets and are difficult to apply to scenes that differ from the training data. In this paper, we propose a new geometric constraint-based phase unwrapping (GCPU) method that enables an untrained deep learning-based FPP for the first time. An untrained convolutional neural network is designed to achieve correct phase unwrapping through an optimization of the network parameter space. The loss function of the optimization is constructed by following the 3D, structural, and phase consistency. The designed untrained network directly outputs the desired fringe order from the inputted phase and fringe background. The experiments verify that the proposed GCPU method provides higher robustness than the traditional GCPU methods, thus resulting in accurate 3D reconstruction for objects with a complex surface. Unlike the commonly used temporal phase unwrapping, the proposed GCPU method does not require additional fringe patterns and can therefore also be used for dynamic 3D measurement.

Fringe projection profilometry (FPP) has been widely used in high-precision three-dimensional (3D) measurements.1–3 FPP usually requires at least three phase-shifted fringe patterns to calculate the desired phase by using the phase-shifting algorithm.4 The calculated phase is discontinuous and wrapped in the range of (−π, π].5–7 Temporal phase unwrapping (TPU) is often preferred, which unwraps the phase by using gray-code,8 multi-frequency,9 or phase-code patterns.10 These additional patterns obviously reduce the image acquisition speed and thus the 3D measurement speed.11 The best way is to unwrap the phase without using any additional patterns, e.g., spatial phase unwrapping (SPU)12 and geometric constraint-based phase unwrapping (GCPU).13 SPU unwraps the phase following a path guided by a parameter map14 and often fails for a complex surface because of local error propagation.15 GCPU unwraps the phase with the assistance of geometric constraints provided by an additional camera, e.g., the wrapped phase, the epipolar geometry, the measurement volume, and the phase monotonicity.16

In GCPU, two cameras are selected to construct a stereo vision system, which determines the fringe order based on the image correspondence established between them.13 For each pixel in the first camera, the 3D data can be calculated from the absolute phases retrieved from the candidate fringe orders and then transformed into the second camera to construct coarse image correspondences. The unique fringe order can be determined by refining the coarse correspondences based on the feature similarity.17,18 Both low accuracy of the system calibration and unsatisfactory features will result in wrong fringe orders.16,19 The measurement volume constraint is used to improve GCPU, but it requires selecting complex empirical parameters20 and a restricted measurement system,21 especially for dense fringe patterns.22 Deep learning has recently been introduced to GCPU for flexible phase unwrapping owing to its data-driven nature and its ability to extract high-level features, which can directly map the captured fringe patterns to the desired fringe order without complex parameter selection and system restriction.23

Traditional deep learning-based GCPU methods construct a supervised neural network for the nonlinear mapping between the inputted patterns and the ground truth.23 Supervised deep learning requires a large number of samples and ground truths for the network training, which is very time-consuming and especially difficult for some special cases,24,25 e.g., live rabbit hearts26 and robotic flapping wings.27 In addition, the trained neural network is difficult to apply to scenes that are different from the training scene.28–31 In contrast, untrained deep learning optimizes the network parameter space by minimizing a loss function constructed using the Huygens–Fresnel principle,32 the physical process of computational ghost imaging,33 or the first Born and multi-slice scattering models.34 Because the selected physical model conforms to the physical mechanism, the optimized network parameter space can adapt to different scenes, since it does not rely on a trained model as supervised deep learning does.35,36 Thus, untrained deep learning is becoming increasingly important and useful for phase imaging,32 quantitative phase microscopy,37 snapshot compressive imaging,36 compressive lensless photography,35 computational ghost imaging,33 and diffraction tomography.34

In this paper, we propose a new untrained deep learning-based GCPU (UGCPU) that can transform the calculated phase and fringe background into the desired fringe order. The proposed UGCPU can achieve reliable phase unwrapping for FPP under different scenes, which enables an untrained deep learning-based FPP for the first time. The provided experiments verify that UGCPU outperforms the previous GCPU methods with higher robustness and works well even for objects with complex surfaces and dynamic motion.

The rest of this paper is organized as follows: Sec. II illustrates the principle. Section III introduces the proposed UGCPU. Section IV provides experimental results. Section V concludes this paper.

In FPP, a set of phase-shifted sinusoidal fringe patterns is first projected by the projector and then captured by the camera.38,39 The intensity of these captured fringe patterns can be expressed as

In(x, y) = a(x, y) + b(x, y)cos[φ(x, y) + δn],  n = 1, 2, …, N,    (1)

where (x, y) denotes the image coordinate; N denotes the number of phase steps; a, b, and φ are the fringe background, amplitude, and phase, respectively; and δn is the phase shift amount. The phase shifts are designed as

δn = 2π(n − 1)/N,  n = 1, 2, …, N.    (2)

In practice, the fringe amplitude is often used to remove invalid pixels around shadows and discontinuities,15 which can be calculated as40 

b(x, y) = (2/N)√{[∑n=1…N In(x, y)sin δn]² + [∑n=1…N In(x, y)cos δn]²}.    (3)

The desired phase can be calculated by using a least-squares algorithm41 as follows:

φ(x, y) = −arctan{[∑n=1…N In(x, y)sin δn] / [∑n=1…N In(x, y)cos δn]}.    (4)

Because the calculated phase is wrapped in the range of (−π, π], a phase unwrapping process is required to determine the fringe order K for each fringe period. The absolute phase can be obtained as42 

Φ(x, y) = φ(x, y) + 2πK(x, y).    (5)

For simplicity, we omit the notation (x, y) hereafter.
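For concreteness, the following minimal NumPy sketch implements Eqs. (2)–(5), assuming the N captured patterns are stacked in an array of shape (N, H, W); the amplitude threshold used to mask invalid pixels is an illustrative choice, not a value from the paper.

```python
import numpy as np

def wrapped_phase(I, amp_threshold=5.0):
    """Fringe amplitude, wrapped phase, and validity mask from N phase-shifted patterns."""
    N = I.shape[0]
    delta = 2 * np.pi * np.arange(N) / N          # phase shifts, Eq. (2)
    s = np.tensordot(np.sin(delta), I, axes=1)    # sum_n I_n sin(delta_n)
    c = np.tensordot(np.cos(delta), I, axes=1)    # sum_n I_n cos(delta_n)
    b = (2.0 / N) * np.sqrt(s**2 + c**2)          # fringe amplitude, Eq. (3)
    phi = -np.arctan2(s, c)                       # wrapped phase in (-pi, pi], Eq. (4)
    valid = b > amp_threshold                     # mask pixels around shadows/discontinuities
    return phi, b, valid

def absolute_phase(phi, K):
    """Absolute phase from wrapped phase and fringe order, Eq. (5)."""
    return phi + 2 * np.pi * K
```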

Unlike the temporal phase unwrapping,43 GCPU obtains the absolute phase by adding another camera to replace the additional fringe patterns, i.e., the GCPU-based FPP system requires one projector and two cameras. In particular, GCPU first constructs the coarse image correspondence by using the candidate fringe orders21 and then determines the unique fringe order with the assistance of the feature similarity, e.g., the intensity and phase distribution.44 

The typical GCPU-based FPP system is illustrated in Fig. 1, which includes two cameras c1 and c2 and a projector p. In the world coordinate system,45 a point w is selected from the surface of the object, which reflects the projected light onto the image point oc1 of c1. The wrapped phase of oc1 can be calculated from the captured patterns using Eq. (4). There can be f candidate fringe orders for oc1 when the fringe frequency is f. By reusing Eq. (5) f times, f absolute phases are calculated, so that oc1 corresponds to f projector coordinates,46 i.e., ojp, j = 1, 2, …, f. When combining oc1 with one projector coordinate ojp, a 3D coordinate is calculated from the system parameters of camera c1 and projector p, which can then be transformed into the camera coordinate of c2 by using the system parameters of camera c2 and projector p.20,47 By repeating the above process f times, each pixel of c1 corresponds to f pixels of c2, i.e., ojc2, j = 1, 2, …, f. Then, by comparing the wrapped phase or grayscale difference between oc1 and ojc2, j = 1, 2, …, f, the most similar pixel oc2 is determined, and the unique fringe order corresponding to oc2 can be selected.
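A sketch of this candidate-selection step is given below. The helper routines triangulate_c1_p and project_to_c2 are placeholders for the calibrated c1–projector triangulation and the projection into c2; they depend on the system calibration and are not specified in the text, and the wrapped-phase similarity is only one of the possible feature comparisons.

```python
import numpy as np

def select_fringe_order(u1, v1, phi_c1, phi_c2, f, period_px,
                        triangulate_c1_p, project_to_c2):
    """Determine the fringe order of pixel (column u1, row v1) in c1 by geometric constraint."""
    best_j, best_diff = 0, np.inf
    for j in range(f):                                   # f candidate fringe orders
        Phi = phi_c1[v1, u1] + 2 * np.pi * j             # candidate absolute phase, Eq. (5)
        x_p = Phi * period_px / (2 * np.pi)              # corresponding projector column
        X = triangulate_c1_p(u1, v1, x_p)                # 3D point from c1 and projector p
        u2, v2 = project_to_c2(X)                        # candidate pixel o_j^c2 in camera c2
        d = phi_c1[v1, u1] - phi_c2[int(round(v2)), int(round(u2))]
        diff = np.abs(np.angle(np.exp(1j * d)))          # wrapped-phase similarity
        if diff < best_diff:                             # keep the most similar candidate
            best_j, best_diff = j, diff
    return best_j
```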

FIG. 1. Schematic diagram of the traditional FPP.

Unlike supervised deep learning, untrained deep learning can be applied to different scenes without advance training.37 By constructing the loss function based on a general physical model that is independent of the scene,35 the network parameters (i.e., weights, biases, and convolutional kernels) can be optimized for each input through iterative learning.48 Here, an untrained deep learning-based GCPU method is proposed, which obtains the desired fringe order by constructing an untrained convolutional neural network (UCNNet). The constructed UCNNet directly outputs the fringe order from the calculated phase and fringe background.

The flowchart of UGCPU is provided in Fig. 2. For simplicity, the fringe order K, calculated phase φ, and fringe background a with the superscript c1 or c2 refer to the first or second camera, respectively. Thus, φc1, ac1, φc2, and ac2 are selected as the input. The two sets of data provided by c1 and c2 are inputted into UCNNet through two weight-sharing pipelines, respectively. The two pipelines have the same network parameter space.49 UCNNet directly outputs the fringe orders Kc1 and Kc2, respectively. The parameter space of UCNNet is initialized and iteratively optimized to obtain accurate Kc1 and Kc2 by minimizing a loss function constructed by following the 3D, structural, and phase consistency, which can be calculated using the inputted data and outputted fringe orders and is denoted as Loss1, Loss2, and Loss3, respectively. Here, the optimization process is defined as arg minθ Loss(θ, φc1, ac1, φc2, ac2, Kc1, Kc2), where Loss represents the loss function and θ denotes the network parameter space. For simplicity, we omit the arguments (θ, φc1, ac1, φc2, ac2, Kc1, Kc2) hereafter. After the network space optimization, a median filter is introduced to remove impulse noise caused by the discrete sampling and the error of UCNNet,50 and the desired absolute phases Φc1 and Φc2 can be retrieved from Kc1 and Kc2, respectively.
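The per-scene optimization can be sketched as follows in PyTorch. Here loss_3d, loss_ssim, and loss_phase stand for Loss1–Loss3 of Eqs. (6)–(8), whose full implementations depend on the calibrated system; the input tensors are assumed to have shape (1, 1, H, W), and the epoch count is only illustrative (see Fig. 8 for the actual convergence behavior). The weights (1, 20, 5) are the empirical values given later in this section.

```python
import torch

def optimize_ucnnet(ucnnet, phi_c1, a_c1, phi_c2, a_c2,
                    loss_3d, loss_ssim, loss_phase,
                    epochs=400, lr=1e-3, weights=(1.0, 20.0, 5.0)):
    """Per-scene optimization of the UCNNet parameter space (no ground truth needed)."""
    opt = torch.optim.Adam(ucnnet.parameters(), lr=lr)   # Adam optimizer, as in Sec. IV
    x1 = torch.cat([phi_c1, a_c1], dim=1)                # input of the c1 pipeline
    x2 = torch.cat([phi_c2, a_c2], dim=1)                # input of the c2 pipeline
    l1, l2, l3 = weights                                 # lambda_1, lambda_2, lambda_3
    for _ in range(epochs):
        K1 = ucnnet(x1)                                  # weight sharing: the same network
        K2 = ucnnet(x2)                                  # processes both camera inputs
        loss = (l1 * loss_3d(phi_c1, K1, phi_c2, K2)
                + l2 * loss_ssim(a_c1, K1, a_c2, K2)
                + l3 * loss_phase(phi_c1, K1, phi_c2, K2))
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():                                # final fringe-order maps K^c1 and K^c2,
        return ucnnet(x1), ucnnet(x2)                    # to be rounded and median filtered
```

After the optimization, the returned fringe-order maps are rounded to integers and median filtered before Eq. (5) is applied, as described above.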

FIG. 2. Flowchart of the proposed UGCPU.

Inspired by the efficient residual factorized network,51,52 a U-shaped structure is selected to construct UCNNet. One pipeline of the constructed UCNNet is illustrated in Fig. 3. The resolution of the inputted patterns is W × H. Both down-sampling and up-sampling of the input are used to increase the network efficiency and data mining ability.53 The network structure includes the convolution, downsampler block, non-bottleneck-1D block, upsampler block, and transposed convolution.52,53 In addition, an activation module is added to restrict the range of fringe orders by linearly mapping the inputted data.
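A simplified PyTorch sketch of one such pipeline is given below; the channel counts, block depth, and the use of plain strided convolutions in place of the original downsampler blocks are illustrative assumptions rather than the exact UCNNet configuration.

```python
import torch
import torch.nn as nn

class NonBottleneck1D(nn.Module):
    """ERFNet-style factorized residual block built from 3x1 and 1x3 convolutions."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, (3, 1), padding=(1, 0)), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, (1, 3), padding=(0, 1)), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, (3, 1), padding=(1, 0)), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, (1, 3), padding=(0, 1)), nn.BatchNorm2d(ch))

    def forward(self, x):
        return torch.relu(x + self.body(x))               # residual connection

class UCNNetSketch(nn.Module):
    """U-shaped pipeline: down-sampling, factorized residual blocks, up-sampling."""
    def __init__(self, in_ch=2, base=16, n_blocks=4):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(in_ch, base, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(base, 2 * base, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.blocks = nn.Sequential(*[NonBottleneck1D(2 * base) for _ in range(n_blocks)])
        self.up = nn.Sequential(
            nn.ConvTranspose2d(2 * base, base, 2, stride=2), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base, 1, 2, stride=2))      # one fringe-order map as output

    def forward(self, x):                                  # x: (B, 2, H, W) = (phase, background)
        return self.up(self.blocks(self.down(x)))
```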

FIG. 3. Network structure of UCNNet.

Taking the first camera as an example, the depth of field of the GCPU-based FPP system can first be pre-determined,21 and a finite range of candidate fringe orders, [Kminc1, Kmaxc1], can be selected for each pixel.20 Then, the activation module linearly maps the inputted data into [Kminc1, Kmaxc1]. The activation module processes the inputted data of c2 in the same way.
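One plausible implementation of this activation module, assuming a per-image min–max rescaling of the raw network output into the candidate range, is sketched below; the exact linearization used in UCNNet may differ.

```python
import torch

def activation_module(raw, k_min, k_max, eps=1e-8):
    """Linearly rescale the raw UCNNet output into the candidate fringe-order range
    [k_min, k_max] determined by the pre-measured depth of field (a sketch)."""
    lo = raw.amin(dim=(-2, -1), keepdim=True)            # per-image minimum
    hi = raw.amax(dim=(-2, -1), keepdim=True)            # per-image maximum
    unit = (raw - lo) / (hi - lo + eps)                  # normalize to [0, 1]
    return k_min + (k_max - k_min) * unit                # map linearly to [K_min, K_max]
```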

As aforementioned, the loss function of UCNNet is constructed by following the 3D, structural, and phase consistency. Thus, the loss function consists of three types of losses, i.e., 3D, structural, and phase. The structure chart of the loss function is provided in Fig. 4, and the three types of losses are illustrated below.

FIG. 4. Structure chart of the loss function.

The 3D consistency requires that the 3D data reconstructed by one camera and the projector be the same as the 3D data reconstructed by the two cameras. As shown in Fig. 1, the 3D coordinate of oc1 can be obtained from φc1 and Kc1 when combining oc1 with the corresponding projector pixel ojp. In each epoch of the iterative optimization, this 3D coordinate is updated and transformed into a coordinate oc2 in the perspective of c2 by using the principle of GCPU illustrated in Sec. II. In the same coordinate system of c1, the 3D coordinate of oc1 can also be calculated by combining oc1 with the corresponding pixel of c2. This corresponding pixel is denoted as oc2′, which is also updated in each epoch and determined by searching for the coordinate with the same absolute phase on the epipolar line by using the phase matching method.54–58 When the above two sets of 3D data are the same, the coordinates oc2 and oc2′ should be consistent. Thus, taking the horizontal axes of oc2 and oc2′, the 3D consistency-based loss function can be described as

Loss1 = (1/M)∑m=1…M |locmc2 − locmc2′|,    (6)

where M denotes the number of pixels; the subscript m represents the mth pixel; the values of locmc2 and locmc2′ corresponding to oc1 are locc2(oc1) = xc2 and locc2′(oc1) = xc2′, respectively; and loc(·) gives the horizontal axis of a pixel. It should be noted that invalid pixels are neglected according to the fringe amplitude. Because the 3D coordinate calculated by the two cameras is related to the absolute phase of all pixels on the epipolar line, the 3D consistency constrains the outputted fringe order of the entire image.
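Given the two horizontal-coordinate maps produced in each epoch (one via the projector route and one via the stereo phase-matching route), the core of Loss1 can be sketched as the masked mean absolute difference below; the differentiable computation of the two coordinate maps themselves depends on the calibration and is not shown.

```python
import torch

def loss_3d_consistency(x_c2, x_c2_prime, valid):
    """3D-consistency loss, Eq. (6): mean horizontal-coordinate difference in camera c2
    between the projector-route correspondence and the stereo-route correspondence,
    evaluated over valid (well-modulated) pixels only."""
    diff = (x_c2 - x_c2_prime).abs() * valid
    return diff.sum() / valid.sum().clamp(min=1)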

The structural consistency constrains oc1 and oc2 to have the same structural similarity (SSIM).59 The structural consistency-based loss function can be described as

Loss2 = (1/M)∑m=1…M [1 − SSIM(amc1, amc2)],    (7)

where the value of ac2 corresponding to oc1 is ac2(oc1) = ac2(oc2). Because the value of SSIM is related to the grayscale of all pixels in a local window,59 the structural consistency constrains the outputted fringe order over local areas of the entire image.
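A compact sketch of Loss2 follows, assuming the c2 fringe background has already been resampled at the pixels matched to c1, that both backgrounds are (B, 1, H, W) tensors normalized to [0, 1], and that the usual SSIM window and constants are used (these defaults are assumptions, not values from the paper).

```python
import torch.nn.functional as F

def loss_structural(a_c1, a_c2_matched, window=11, c1=0.01**2, c2=0.03**2):
    """Structural-consistency loss, Eq. (7): one minus the mean local SSIM between the
    c1 fringe background and the matched c2 fringe background."""
    pad = window // 2
    mu1 = F.avg_pool2d(a_c1, window, 1, pad)                               # local means
    mu2 = F.avg_pool2d(a_c2_matched, window, 1, pad)
    var1 = F.avg_pool2d(a_c1 * a_c1, window, 1, pad) - mu1 ** 2            # local variances
    var2 = F.avg_pool2d(a_c2_matched * a_c2_matched, window, 1, pad) - mu2 ** 2
    cov = F.avg_pool2d(a_c1 * a_c2_matched, window, 1, pad) - mu1 * mu2    # local covariance
    ssim = ((2 * mu1 * mu2 + c1) * (2 * cov + c2)) / \
           ((mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2))
    return 1.0 - ssim.mean()
```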

The phase consistency constrains oc1 and oc2 to have the same calculated phase. The phase consistency-based loss function can be described as

Loss3 = (1/M)∑m=1…M min(|φmc1 − φmc2|, 2π − |φmc1 − φmc2|),    (8)

where the value of φc2 corresponding to oc1 is φc2(oc1) = φc2(oc2). The difference between φc1 and φc2 is limited to the range of [0, π] due to the 2π period of the wrapped phase. The phase consistency constrains the outputted fringe order pixel by pixel.
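A sketch of Loss3 with the difference wrapped into [0, π] is given below, again assuming that φc2 has been resampled at the pixels matched to c1.

```python
import math
import torch

def loss_phase_consistency(phi_c1, phi_c2_matched):
    """Phase-consistency loss, Eq. (8): mean wrapped phase difference, limited to [0, pi]."""
    diff = torch.remainder(phi_c1 - phi_c2_matched, 2 * math.pi)   # map into [0, 2*pi)
    diff = torch.minimum(diff, 2 * math.pi - diff)                 # wrapped distance in [0, pi]
    return diff.mean()
```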

Because the two cameras should have the same loss function, the loss function of UCNNet is defined as

Loss = λ1Loss1 + λ2Loss2 + λ3Loss3,    (9)

where λ1, λ2, and λ3 are the weights of Loss1, Loss2, and Loss3, respectively. According to our experimental results, the empirical values of these weights are λ1 = 1, λ2 = 20, and λ3 = 5.

It should be noted that the inputted data captured by the two cameras are not exactly the same due to the non-common field of view, and the calibration accuracy also influences the above consistencies. The optimized fringe orders will have a slight deviation, which should be less than 0.5 to avoid wrong phase unwrapping.

The GCPU-based FPP system includes a TI DLP6500 projector with a resolution of 1920 × 1080, two Basler acA640-750μm cameras with a resolution of 640 × 480, and two matched lenses with a focal length of 8 mm. The measured objects are placed in front of the system at a distance of around 1 m.

To verify the proposed UGCPU, a dataset containing 1025 scenes is constructed, with 960 simple scenes of toys with smooth white surfaces and 65 complex scenes containing objects with complex surfaces. For each scene, six sets of three-step phase-shifted patterns combined with gray code-based patterns are projected and then captured.60 It should be noted that these additional gray code-based patterns are used only to obtain the ideal absolute phase and the resulting ground-truth 3D data for verification. To be comprehensive, different fringe periods, i.e., 25, 30, 35, 40, 50, and 60 pixels, are selected, and the corresponding ranges of candidate fringe orders contain about 12, 10, 8, 7, 6, and 5 candidates according to the measurement system, respectively. For each period, the number of gray code-based patterns is determined according to the projector resolution.60 Both the traditional GCPU (TGCPU)22 and the supervised deep learning-based GCPU (SGCPU)23 are used for comparison.

UCNNet is implemented in Python using the PyTorch framework on a PC with an Intel Core i9-7900X central processing unit (CPU) (3.30 GHz), 32 GB of RAM, and the GeForce GTX Titan RTX (NVIDIA). The constructed UCNNet uses the adaptive moment estimation optimizer61 to optimize the parameter space with a batch size of 1 and a learning rate of 0.001. The experimental results of UGCPU are available as “Dataset” in “Data Availability.”

From the constructed dataset, four scenes, namely, a simple white toy, a complex hand gesture, a surgical mask, and a power strip, are selected to verify UGCPU. The fringe period is 30 pixels. The experimental results of the simple toy and the three complex objects are shown in Figs. 5 and 6, respectively.

FIG. 5. Experimental results for simple white toys: (a) inputted data, (b) outputted fringe orders, (c) desired absolute phases, and (d) 3D data.

FIG. 6. Experimental results for complex objects: (a) hand gesture, (b) surgical mask, and (c) power strip.

Figure 5(a) shows the inputted φc1, ac1, φc2, and ac2. After the iterative optimization, UCNNet directly outputs Kc1 and Kc2, as shown in Fig. 5(b). By reusing Eq. (5) twice, the two absolute phases, Φc1 and Φc2, are retrieved from Kc1 and Kc2, respectively. The retrieved absolute phases shown in Fig. 5(c) are as smooth and correct as the ideal absolute phase obtained by the robust gray code-based method. When combined with the system parameters, the desired 3D shapes of c1 and c2 can be reconstructed from Φc1 and Φc2, respectively. The 3D shape of c1 is selected and shown in Fig. 5(d). The resulting 3D shape is smooth and does not contain sparkles, wrinkles, or serious height jumps caused by unwrapping errors.50 Thus, UGCPU achieves correct phase unwrapping. Figures 6(a)–6(c) illustrate the results of the hand gesture, surgical mask, and power strip, respectively. The left, middle, and right columns show the inputted data, the outputted fringe orders, and the reconstructed 3D shapes, respectively. The proposed UGCPU also achieves correct phase unwrapping and results in accurate 3D reconstruction even for complex scenes.

UGCPU is also tested in dynamic scenes with a fringe period of 60 pixels. A moving hand and two moving white toys are measured, as shown in Figs. 7(a) and 7(b), respectively. The first and second rows show the UCNNet inputs φc1 and ac1 and the 3D results of UGCPU, respectively. For the moving hand and toys, 212 and 86 scenes are successively captured and reconstructed, as shown in Multimedia views 1 and 2, respectively. During the movement of the objects, UGCPU achieves correct and robust phase unwrapping and thus reconstructs accurate 3D shapes for dynamic scenes.

FIG. 7. Experimental results for the dynamic measurement: (a) hand gestures and (b) white toys. Multimedia views: (a) https://doi.org/10.1063/5.0069386.1 and (b) https://doi.org/10.1063/5.0069386.2

We emphasize that UGCPU takes more computational time than the previous GCPUs because the random initialization of the network parameter space requires more epochs for the optimization. A fine-tuning strategy is selected to accelerate the computation by replacing the random initialization with a pre-trained model initialization.62 The pre-trained model is the network parameter space of UCNNet pre-optimized using other scenes, which provides a more reasonable initialization for new scenes. For 20 white toy scenes with a fringe period of 60 pixels, the loss curves with random initialization and pre-trained model initialization are plotted against the epoch in Figs. 8(a) and 8(b), respectively. The former requires 57 min with 400 epochs, whereas the latter reduces the computational time to 43 s with 5 epochs by using 80 white toy scenes to obtain the pre-trained model. Each scene then takes only around 2 s for the 3D reconstruction, which is acceptable for practical measurement.
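The fine-tuning strategy amounts to loading a pre-optimized parameter space before the per-scene optimization. A sketch is given below; the checkpoint path is illustrative, and the optimize_ucnnet routine is the earlier sketch from this section.

```python
import torch

def optimize_with_pretrained(ucnnet, scene_inputs, losses, ckpt="ucnnet_pretrained.pth"):
    """Replace random initialization with a parameter space pre-optimized on other scenes,
    so that only a few epochs are needed for a new scene (path and epoch count illustrative)."""
    ucnnet.load_state_dict(torch.load(ckpt, map_location="cpu"))   # pre-trained initialization
    phi_c1, a_c1, phi_c2, a_c2 = scene_inputs
    loss_3d, loss_ssim, loss_phase = losses
    return optimize_ucnnet(ucnnet, phi_c1, a_c1, phi_c2, a_c2,
                           loss_3d, loss_ssim, loss_phase, epochs=5)
```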

FIG. 8. Loss curve of UGCPU: (a) optimization with random initialization and (b) optimization with pre-trained model initialization.

The proposed untrained deep learning-based, supervised deep learning-based, and traditional GCPUs are compared using 800 simple scenes of white toys selected from the constructed dataset. Specifically, 500, 150, and 150 scenes are selected as the training, validation, and testing sets of SGCPU, respectively. The testing scenes are also used to optimize UCNNet and to test TGCPU. For each scene, the retrieved absolute phase is subtracted from the ideal absolute phase obtained by the gray code-based method. An incorrect unwrapping rate (IUR) is defined as the ratio of the number of pixels containing unwrapping errors to the total number of valid pixels.15 For clarity, the IUR is divided into three ranges: (0%, 0.1%), (0.1%, 1%), and (1%, 100%). The number of scenes in each range is calculated and provided in Table I.
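The IUR can be computed as sketched below; counting a pixel as an unwrapping error when its retrieved absolute phase deviates from the gray code-based ideal phase by more than half a fringe period (π in phase, i.e., a wrong fringe order) is our interpretation of "containing unwrapping errors".

```python
import numpy as np

def incorrect_unwrapping_rate(phi_retrieved, phi_ideal, valid):
    """IUR in percent: share of valid pixels whose absolute phase is off by a wrong
    fringe order (difference larger than pi) with respect to the ideal phase."""
    wrong = np.abs(phi_retrieved - phi_ideal) > np.pi
    return 100.0 * np.count_nonzero(wrong & valid) / np.count_nonzero(valid)
```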

TABLE I. Incorrect unwrapping rate of UGCPU, SGCPU, and TGCPU: number of testing scenes in each IUR range for each fringe period.

Method   IUR range      Period (pixel)
                        60    50    40    35    30    25
UGCPU    (0%, 0.1%)     74    64    29    22     9     0
         (0.1%, 1%)     76    86   112   117   109    93
         (1%, 100%)      0     0     9    11    32    57
SGCPU    (0%, 0.1%)     75    58    46    42    23     1
         (0.1%, 1%)     38    41    39    30    48    46
         (1%, 100%)     37    51    65    78    79   103
TGCPU    (0%, 0.1%)      0     0     0     0     0     0
         (0.1%, 1%)     22    18     9     7     1     0
         (1%, 100%)    128   132   141   143   149   150

When a scene is in the range of (0%, 0.1%), the number of incorrect pixels is around 100, which can be neglected in practice considering the whole size of the captured patterns.11 When a scene is in the range of (0.1%, 1%), the number of incorrect pixels is acceptable because these pixels are discretely distributed and their unwrapping errors can be removed through a filtering operation.63 However, for a scene with an incorrect rate beyond 1%, the number of incorrect pixels is non-negligible, and it is difficult to remove these unwrapping errors because the incorrect pixels are often connected or close to each other. For clarity, the number of scenes with an IUR beyond 1% is plotted in Fig. 9.

FIG. 9. Number of scenes with the IUR beyond 1%.

For the 150 testing scenes, the IUR of UGCPU is less than 1% when a relatively large fringe period, e.g., 60 or 50 pixels, is selected. UGCPU with a large fringe period can thus achieve robust and correct phase unwrapping under different scenes. In contrast, both SGCPU and TGCPU generate non-negligible unwrapping errors even when a large fringe period is selected, e.g., 37 and 128 of their scenes with a fringe period of 60 pixels have an IUR larger than 1%. From Table I and Fig. 9, the number of wrong scenes increases when the fringe period is reduced because the denser fringes and the larger number of candidate fringe orders increase the difficulty of the parameter optimization, network learning, and feature similarity comparison for UGCPU, SGCPU, and TGCPU, respectively. However, UGCPU still clearly performs better than TGCPU and SGCPU, with fewer wrong scenes for each fringe period.

For clarity, three scenes with a fringe period of 30 pixels are reconstructed and shown in Figs. 10(a)–10(c), respectively. The first and second rows of Fig. 10 show the inputted data and the 3D data calculated from the ground truth, UGCPU, SGCPU, and TGCPU, respectively. As shown in Fig. 10(a), UGCPU and SGCPU obtain the same smooth 3D data as the ground truth. TGCPU contains small and scattered error pixels due to the fixed weights of the feature similarities and the relatively large number of candidate fringe orders. As shown in Figs. 10(b) and 10(c), similar results are obtained for TGCPU. SGCPU generates a large number of error pixels because the trained model is not suitable for the testing scenes. For UGCPU, the areas containing error pixels are marked by white boxes. UGCPU may fail in some small areas located in the non-common field of view or where the perspective difference between the two cameras is large. It should be noted that, for scenes with a relatively large number of error pixels, UGCPU can eliminate these scenes by checking the 3D loss value; for example, the 3D loss value is larger than 3 for such scenes with a fringe period of 30 pixels.

FIG. 10. 3D shape reconstructed by selected GCPUs: (a)–(c) results of three different scenes of simple white toys.

First, the performance of the proposed UGCPU and SGCPU is compared by selecting 160 simple scenes of white toys and 65 complex scenes with a fringe period of 60 pixels from the constructed dataset. SGCPU is trained on 120 simple scenes of white toys, validated on 40 simple scenes of white toys, and tested on the 65 complex scenes. The desired results are obtained for the validation scenes. The proposed UGCPU does not require advance training and only uses the 65 testing scenes to optimize the parameter space. Three complex scenes, the hand gesture, surgical mask, and power strip, are reconstructed and shown in Figs. 11(a)–11(c), respectively. The first, second, third, and fourth columns of Fig. 11 show the captured scene represented by φc1 and ac1 and the 3D data of the ground truth, UGCPU, and SGCPU, respectively. Compared with the smooth ground-truth 3D data, SGCPU results in 3D shapes containing a large number of height jumps. These height jumps are related to unwrapping errors with wrong fringe orders, and the IUR of almost all scenes is obviously larger than 1%. UGCPU results in 3D shapes as smooth and correct as the ground-truth shapes.

FIG. 11. Comparison between the proposed UGCPU and SGCPU for different scenes: (a) hand, (b) surgical mask, and (c) power strip.

Second, we compare UGCPU and SGCPU under limited scenes. From the constructed dataset, 160 simple white toy scenes with a fringe period of 60 pixels are selected. In detail, 20, 20, and 120 scenes are used to train, validate, and test SGCPU, respectively. UGCPU only selects 20 scenes from the testing scenes to optimize the parameter space. Three different scenes are reconstructed and shown in Figs. 12(a)–12(c), respectively. When such a small number of training scenes is used, the IUR of three-quarters of the scenes is larger than 1% for SGCPU, and serious height jumps are caused by wrong fringe orders. UGCPU still performs robustly and obtains smooth 3D data because it does not rely on advance training.

FIG. 12. Comparison between the proposed UGCPU and SGCPU under limited scenes: (a)–(c) results of three different scenes of simple white toys.

In this paper, we introduce untrained deep learning to the commonly used FPP by proposing a new untrained geometric constraint-based phase unwrapping method. The proposed UGCPU is more robust than TGCPU and SGCPU, especially when a relatively large fringe period is selected. Although UGCPU requires a relatively complex optimization process, its computational time can be reduced to an acceptable level for practical applications. To the best of our knowledge, this is the first untrained deep learning-based work for FPP. UGCPU does not require additional fringe patterns for the phase unwrapping and can thus be used for dynamic and high-speed 3D measurement. Most importantly, UGCPU does not require advance training and can therefore be applied to various scenes. In the future, we will further introduce untrained deep learning to the phase calculation of FPP.

This work was sponsored by the National Natural Science Foundation of China (Grant Nos. 61727802 and 62031018) and the Jiangsu Provincial Key Research and Development Program (Grant No. BE2018126).

The authors have no conflicts to disclose.

The data that support the findings of this study are openly available in figshare at https://figshare.com/articles/dataset/Dataset_zip/16438272.64 

1. Z. Wu, C. Zuo, W. Guo, T. Tao, and Q. Zhang, "High-speed three-dimensional shape measurement based on cyclic complementary gray-code light," Opt. Express 27, 1283–1297 (2019).
2. D. Zheng, Q. Kemao, J. Han, J. Wang, H. Yu, and L. Bai, "High-speed phase-shifting profilometry under fluorescent light," Opt. Lasers Eng. 128, 106033 (2020).
3. S. Zhang and S.-T. Yau, "High dynamic range scanning technique," Opt. Eng. 48, 033604 (2009).
4. Y. Ding, J. Xi, Y. Yu, and J. Chicharo, "Recovering the absolute phase maps of two fringe patterns with selected frequencies," Opt. Lett. 36, 2518–2520 (2011).
5. C. J. Waddington and J. D. Kofman, "Modified sinusoidal fringe-pattern projection for variable illuminance in phase-shifting three-dimensional surface-shape metrology," Opt. Eng. 53, 084109 (2014).
6. L. Lu, Z. Jia, W. Pan, Q. Zhang, M. Zhang, and J. Xi, "Automated reconstruction of multiple objects with individual movement based on PSP," Opt. Express 28, 28600–28611 (2020).
7. X. He, D. Zheng, Q. Kemao, and G. Christopoulos, "Quaternary gray-code phase unwrapping for binary fringe projection profilometry," Opt. Lasers Eng. 121, 358–368 (2019).
8. G. Sansoni, M. Carocci, and R. Rodella, "Three-dimensional vision based on a combination of gray-code and phase-shift light projection: Analysis and compensation of the systematic errors," Appl. Opt. 38, 6565–6573 (1999).
9. C. Lin, D. Zheng, Q. Kemao, J. Han, and L. Bai, "Spatial pattern-shifting method for complete two-wavelength fringe projection profilometry," Opt. Lett. 45, 3115–3118 (2020).
10. Y. Wang and S. Zhang, "Novel phase-coding method for absolute phase retrieval," Opt. Lett. 37, 2067–2069 (2012).
11. D. Zheng, Q. Kemao, F. Da, and H. S. Seah, "Ternary gray code-based phase unwrapping for 3D measurement using binary patterns with projector defocusing," Appl. Opt. 56, 3660–3665 (2017).
12. C. Quan, C. J. Tay, L. Chen, and Y. Fu, "Spatial-fringe-modulation-based quality map for phase unwrapping," Appl. Opt. 42, 7060–7065 (2003).
13. X. Liu and J. Kofman, "High-frequency background modulation fringe patterns based on a fringe-wavelength geometry-constraint model for 3D surface-shape measurement," Opt. Express 25, 16618–16628 (2017).
14. X. Su and W. Chen, "Reliability-guided phase unwrapping algorithm: A review," Opt. Lasers Eng. 42, 245–261 (2004).
15. C. Zuo, S. Feng, L. Huang, T. Tao, W. Yin, and Q. Chen, "Phase shifting algorithms for fringe projection profilometry: A review," Opt. Lasers Eng. 109, 23–59 (2018).
16. S. Zhang, "Absolute phase retrieval methods for digital fringe projection profilometry: A review," Opt. Lasers Eng. 107, 28–37 (2018).
17. T. Weise, B. Leibe, and L. Van Gool, "Fast 3D scanning with automatic motion compensation," in 2007 IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2007), pp. 1–8.
18. C. Bräuer-Burchardt, C. Munkelt, M. Heinze, P. Kühmstedt, and G. Notni, "Using geometric constraints to solve the point correspondence problem in fringe projection based 3D measuring systems," in International Conference on Image Analysis and Processing (Springer, 2011), pp. 265–274.
19. C. Bräuer-Burchardt, P. Kühmstedt, and G. Notni, "Phase unwrapping using geometric constraints for high-speed fringe projection based 3D measurements," Proc. SPIE 8789, 878906 (2013).
20. T. Tao, Q. Chen, S. Feng, J. Qian, Y. Hu, L. Huang, and C. Zuo, "High-speed real-time 3D shape measurement based on adaptive depth constraint," Opt. Express 26, 22440–22456 (2018).
21. X. Liu and J. Kofman, "Real-time 3D surface-shape measurement using background-modulated modified Fourier transform profilometry with geometry-constraint," Opt. Lasers Eng. 115, 217–224 (2019).
22. K. Zhong, Z. Li, Y. Shi, C. Wang, and Y. Lei, "Fast phase measurement profilometry for arbitrary shape objects without phase unwrapping," Opt. Lasers Eng. 51, 1213–1222 (2013).
23. J. Qian, S. Feng, T. Tao, Y. Hu, Y. Li, Q. Chen, and C. Zuo, "Deep-learning-enabled geometric constraints and phase unwrapping for single-shot absolute 3D shape measurement," APL Photonics 5, 046105 (2020).
24. H. Yu, X. Chen, Z. Zhang, C. Zuo, Y. Zhang, D. Zheng, and J. Han, "Dynamic 3-D measurement based on fringe-to-fringe transformation using deep learning," Opt. Express 28, 9405–9418 (2020).
25. H. Nguyen, N. Dunne, H. Li, Y. Wang, and Z. Wang, "Real-time 3D shape measurement using 3LCD projection and deep machine learning," Appl. Opt. 58, 7100–7109 (2019).
26. Y. Wang, J. I. Laughner, I. R. Efimov, and S. Zhang, "3D absolute shape measurement of live rabbit hearts with a superfast two-frequency phase-shifting technique," Opt. Express 21, 5822–5832 (2013).
27. B. Li and S. Zhang, "Novel method for measuring a dense 3D strain map of robotic flapping wings," Meas. Sci. Technol. 29, 045402 (2018).
28. H. Yu, D. Zheng, J. Fu, Y. Zhang, C. Zuo, and J. Han, "Deep learning-based fringe modulation-enhancing method for accurate fringe projection profilometry," Opt. Express 28, 21692–21703 (2020).
29. F. Tan, H. Zhu, Z. Cui, S. Zhu, M. Pollefeys, and P. Tan, "Self-supervised human depth estimation from monocular videos," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (IEEE, 2020), pp. 650–659.
30. Y. Zheng, S. Wang, Q. Li, and B. Li, "Fringe projection profilometry by conducting deep learning from its digital twin," Opt. Express 28, 36568–36583 (2020).
31. S. Li, F. He, B. Du, L. Zhang, Y. Xu, and D. Tao, "Fast spatio-temporal residual network for video super-resolution," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (IEEE, 2019), pp. 10522–10531.
32. F. Wang, Y. Bian, H. Wang, M. Lyu, G. Pedrini, W. Osten, G. Barbastathis, and G. Situ, "Phase imaging with an untrained neural network," Light: Sci. Appl. 9, 77 (2020).
33. S. Liu, X. Meng, Y. Yin, H. Wu, and W. Jiang, "Computational ghost imaging based on an untrained neural network," Opt. Lasers Eng. 147, 106744 (2021).
34. K. C. Zhou and R. Horstmeyer, "Diffraction tomography with a deep image prior," Opt. Express 28, 12872–12896 (2020).
35. K. Monakhova, V. Tran, G. Kuo, and L. Waller, "Untrained networks for compressive lensless photography," Opt. Express 29, 20913–20929 (2021).
36. M. Qiao, X. Liu, and X. Yuan, "Snapshot temporal compressive microscopy using an iterative algorithm with untrained neural networks," Opt. Lett. 46, 1888–1891 (2021).
37. E. Bostan, R. Heckel, M. Chen, M. Kellman, and L. Waller, "Deep phase decoder: Self-calibrating phase microscopy with an untrained deep neural network," Optica 7, 559–562 (2020).
38. S. Zhang, "Recent progresses on real-time 3D shape measurement using digital fringe projection techniques," Opt. Lasers Eng. 48, 149–158 (2010).
39. D. Zheng, F. Da, Q. Kemao, and H. S. Seah, "Phase error analysis and compensation for phase shifting profilometry with projector defocusing," Appl. Opt. 55, 5721–5728 (2016).
40. W. Yin, S. Feng, T. Tao, L. Huang, M. Trusiak, Q. Chen, and C. Zuo, "High-speed 3D shape measurement using the optimized composite fringe patterns and stereo-assisted structured light system," Opt. Express 27, 2411–2431 (2019).
41. C. Zuo, Q. Chen, G. Gu, S. Feng, and F. Feng, "High-speed three-dimensional profilometry for multiple objects with complex shapes," Opt. Express 20, 19493–19510 (2012).
42. C. Zuo, L. Huang, M. Zhang, Q. Chen, and A. Asundi, "Temporal phase unwrapping algorithms for fringe projection profilometry: A comparative review," Opt. Lasers Eng. 85, 84–103 (2016).
43. Y. Wang, C. Wang, J. Cai, D. Xi, X. Chen, and Y. Wang, "Improved spatial-shifting two-wavelength algorithm for 3D shape measurement with a look-up table," Appl. Opt. 60, 4878–4884 (2021).
44. Z. Li, K. Zhong, Y. F. Li, X. Zhou, and Y. Shi, "Multiview phase shifting: A full-resolution and high-speed 3D measurement framework for arbitrary shape dynamic objects," Opt. Lett. 38, 1389–1391 (2013).
45. Z. Zhang, "A flexible new technique for camera calibration," IEEE Trans. Pattern Anal. Mach. Intell. 22, 1330–1334 (2000).
46. K. Liu, Y. Wang, D. L. Lau, Q. Hao, and L. G. Hassebrook, "Dual-frequency pattern scheme for high-speed 3-D shape measurement," Opt. Express 18, 5229–5244 (2010).
47. Y. Wang, V. Suresh, and B. Li, "Motion-induced error reduction for binary defocusing profilometry via additional temporal sampling," Opt. Express 27, 23948–23958 (2019).
48. S. Feng, Q. Chen, G. Gu, T. Tao, L. Zhang, Y. Hu, W. Yin, and C. Zuo, "Fringe pattern analysis using deep learning," Adv. Photonics 1, 025001 (2019).
49. X. Chen, Q. Wang, J. Ge, Y. Zhang, and J. Han, "Non-destructive hand vein measurement with self-supervised binocular network," Measurement 173, 108621 (2021).
50. D. Zheng, F. Da, Q. Kemao, and H. S. Seah, "Phase-shifting profilometry combined with gray-code patterns projection: Unwrapping error removal by an adaptive median filter," Opt. Express 25, 4700–4713 (2017).
51. E. Romera, J. M. Alvarez, L. M. Bergasa, and R. Arroyo, "Efficient ConvNet for real-time semantic segmentation," in 2017 IEEE Intelligent Vehicles Symposium (IV) (IEEE, 2017), pp. 1789–1794.
52. E. Romera, J. M. Alvarez, L. M. Bergasa, and R. Arroyo, "ERFNet: Efficient residual factorized ConvNet for real-time semantic segmentation," IEEE Trans. Intell. Transp. Syst. 19, 263–272 (2017).
53. O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015), pp. 234–241.
54. Y. Hu, Q. Chen, S. Feng, T. Tao, A. Asundi, and C. Zuo, "A new microscopic telecentric stereo vision system-calibration, rectification, and three-dimensional reconstruction," Opt. Lasers Eng. 113, 14–22 (2019).
55. K. Song, S. Hu, X. Wen, and Y. Yan, "Fast 3D shape measurement using Fourier transform profilometry without phase unwrapping," Opt. Lasers Eng. 84, 74–81 (2016).
56. R. R. Garcia and A. Zakhor, "Consistent stereo-assisted absolute phase unwrapping methods for structured light systems," IEEE J. Sel. Top. Signal Process. 6, 411–424 (2012).
57. R. Ren, P. Wang, D. Zhou, and C. Sun, "Face 3D measurement by phase matching with infrared grating projection," Proc. SPIE 11439, 114390P (2020).
58. H. Zhao, Z. Wang, H. Jiang, Y. Xu, and C. Dong, "Calibration for stereo vision system based on phase matching and bundle adjustment algorithm," Opt. Lasers Eng. 68, 203–213 (2015).
59. H. Hirschmuller, "Accurate and efficient stereo processing by semi-global matching and mutual information," in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) (IEEE, 2005), Vol. 2, pp. 807–814.
60. D. Zheng and F. Da, "Self-correction phase unwrapping method based on gray-code light," Opt. Lasers Eng. 50, 1130–1139 (2012).
61. D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv:1412.6980 (2014).
62. A. Kumar, J. Kim, D. Lyndon, M. Fulham, and D. Feng, "An ensemble of fine-tuned convolutional neural networks for medical image classification," IEEE J. Biomed. Health Inf. 21, 31–40 (2016).
63. R. B. Rusu, Z. C. Marton, N. Blodow, M. Dolha, and M. Beetz, "Towards 3D point cloud based object maps for household environments," Rob. Auton. Syst. 56, 927–941 (2008).
64. H. Yu, B. Han, L. Bai, D. Zheng, and J. Han, "Datasets used in untrained deep learning-based fringe projection profilometry," Dataset (2021), https://figshare.com/articles/dataset/Dataset_zip/16438272.