Passive localization and tracking of a mobile emitter, and joint learning of its reverberant threedimensional (3D) acoustic environment, where critical structural features are unknown, is a key open problem. Unaccountedfor occluders are potentially present, so that the emitter can lose lineofsight to the receivers, and can only be observed through its reflected raypaths. The locations of reflective boundaries must therefore be jointly estimated with the emitter's position. A multistage global optimization and tracking architecture is developed to solve this problem with a relatively unconstrained model. Each stage of this architecture establishes domain knowledge such as synchronization and initial environment estimation, which are inputs for the following stages of more refined algorithms. This approach is generalizable to different physical scales and modalities and improves on methods that do not exploit the motion of the emitter. In one stage of this architecture, particle swarm optimization is used to simultaneously estimate the environment and the emitter location. In another stage, a Hough transforminspired boundary localization algorithm is extended to 3D settings, to establish an initial estimate of the environment. The performance of this holistic approach is analyzed and its reliability is demonstrated in a reverberant watertank testbed, which models the shallowwater underwater acoustic setting.
I. INTRODUCTION
Passive underwater acoustic localization in shallowwater settings is one of the most important and challenging localization problems (Fischell , 2019; Waterston , 2019). The emitter to be localized could be a surface vessel or underwater vehicle, and the receivers could be surreptitiously placed for passively monitoring a body of water. Receivers are typically omnidirectional tethered hydrophones, but could also be placed on a set of mobile underwater vehicles, as illustrated in Fig. 1. When the environment is accurately known, stateoftheart matchedfield processing (MFP) methods can be used for localization (Baggeroer , 1993). The standard MFP framework does not account for the presence of manmade features such as pier pilings or anchored vessels, or unsurveyed reefs (Chitre and Pelekanakis, 2014). Depending on their positions, these large and potentially unknown objects can act as additional reflective boundaries, or as occluders that block the lineofsight (LOS) arrivals to receivers. MFP is sensitive to model mismatch (Le Gall , 2016; Xu , 2006) and its performance could therefore deteriorate due to such objects, motivating the formulation of a localization framework that incorporates their presence. A representative scenario is presented in Fig. 2, where the goal is to estimate the unknown source position at each transmission point along a trajectory (red circles), in the presence of a large occluder, in a scaled model of the shallowwater setting. This is an instance of an open problem that also arises for indoor acoustic or radio frequency localization (Dokmanic , 2015), where the nonlineofsight (NLOS) arrivals due to multipath reflections must be judiciously exploited rather than mitigated, with a robust passive localization method.
Without restrictions on the localization setting, such problems are illposed and admit infinitely many solutions. The static nature of the environment, and the spatial continuity in the trajectory of a moving emitter, are key to unambiguous localization. Some of the important aspects of the setting are modeled as in Table I; the limited information here leaves some key parameters undetermined so that it is natural to adopt a multistage approach to first estimate them. Accordingly, we deploy a series of localization algorithms sequentially, as depicted in Fig. 3, using an architecture that we call passive endtoend localization (PEEL). Each stage of PEEL produces approximate solutions to different aspects of the localization problem, which then become inputs for the following stages. We call this an endtoend method because it starts from received waveforms that do not have a common clock with the emitter, and delivers a joint estimate of the full emitter trajectory and the environment. We ultimately demonstrate that PEEL achieves strong performance, even when there is modeling mismatch that violates the assumptions in Table I.
Problem feature .  Modeling assumptions . 

Speed of sound, v_{s}  Known, and constant within the environment 
Environment  Static; only the emitter position is changing 
Reflectors, $ \eta i$  Planar, known number, unknown position and orientation 
Transmissions  Periodic transmissions of deterministic pulses, with known period 
Receivers, $ p r , j$  At least 4 receivers deployed, with known positions 
Problem feature .  Modeling assumptions . 

Speed of sound, v_{s}  Known, and constant within the environment 
Environment  Static; only the emitter position is changing 
Reflectors, $ \eta i$  Planar, known number, unknown position and orientation 
Transmissions  Periodic transmissions of deterministic pulses, with known period 
Receivers, $ p r , j$  At least 4 receivers deployed, with known positions 
While there is currently no endtoend passive localization method such as the one we propose, similar frameworks have been designed for different settings. Joint acoustic tracking and environment learning problems can be solved using simultaneous localization and mapping (SLAM) methods (Li and Krolik, 2013; Ribeiro , 2010) in active settings, or when the emitter and receivers form a network (Naseri and Koivunen, 2016). Similarly, while there are timedifferenceofarrival (TDOA) and timeofarrival (TOA) methods for localization (Scheuing and Yang, 2008; Venkateswaran and Madhow, 2012; AlamedaPineda and Horaud, 2014), they do not account for unknown occluders as PEEL does.
Our main contribution is an architecture for joint localization and environment learning. Among other tools, PEEL makes use of:

A global optimization method for joint boundary estimation and localization, which does not need to be tuned to a specific problem.

The mapping of unknown occluders after tracking, followed by iterations with a refined optimization metric, which has not been implemented for passive localization.

Closedform derivations for some key steps of a boundary localization technique (Naseri and Koivunen, 2016), that improve computational efficiency and accuracy.
The paper is structured as follows. In Sec. II, we present the signal model and geometric considerations used for localization. We present the PEEL method in Sec. III. The results that are achieved in the scale watertank environment of Fig. 2 are given in Sec. IV. In Sec. V, we conclude with potential avenues for future research.
In our mathematical notation throughout the paper, lowercase bold variables such as p are vectors, and uppercase calligraphic characters such as $P$ denote sets.
II. LOCALIZATION PRINCIPLE: EXPLOITING VIRTUAL RECEIVERS
In this section, we introduce the notation and signal model, and present key geometrical considerations that are necessary to frame the localization problem. We specifically focus on how multipath arrivals can be modeled and used to improve localization performance.
Triangulation is a standard localization principle that models the emitter as the point of intersection of circles of equal timeofflight (TOF) from the emitter to each receiver. Since these TOF arcs do not perfectly intersect in practice (e.g., due to noise), an estimation method has to be used; often, this method has to simultaneously solve a data association problem (Too , 2017). When the locations of reflectors are known, multipath can be leveraged for better accuracy. Reflections can be interpreted as producing ‘virtual receivers’ as illustrated in Fig. 4, which corresponds to the structure of our watertank setting.
We model s(t) as pulsed and periodic, which limits our estimation scenarios to such manmade sources. This model allows us to record s(t) and to infer some key parameters: the approximate (truncated) channel impulse response length T_{c} (Sun and Wang, 2018), and the time period T_{p} between emissions. We require s(t) to meet the following constraints in order to have accurate estimates $ { \tau \u0302 i , j}$ of the $ { \tau i , j}$. First, we define T_{s} as the temporal duration of s(t), so that s(t) is strictly zero outside $ 0 \u2264 t \u2264 T s$. Next, we define T_{A} to be the full width at half maximum (FWHM) of the autocorrelation function of s(t), which is the time resolution of the matchedfiltered emitted signal. Finally, we define T_{m} as the minimum time difference between the main multipath arrivals. The PEEL operation modes are then as follows:

If T_{s} < T_{m}, we have a pulse that is shorter than the multipath separation, as is the case with the Gaussian pulse in our experiments. Here, we do not need to know s(t) for PEEL; T_{s} is short enough that the multipath arrivals are roughly separable.

If T_{s} > T_{m} and T_{A} < T_{m}, then after matchedfiltering, the multipath peaks are roughly separable. Now, we do in fact need to know s(t) for this matchedfiltering step, so that we are able to identify the distinct arrivals and thereby solve for the environment.
We also need T_{p} > T_{c} to avoid different periodic emissions being observed within the same r(t).^{1} With these introductory developments, we are ready to present an architecture for reliable localization in the presence of such errors.
III. ENDTOEND LOCALIZATION AND STRUCTURE LEARNING
We now present our proposed joint localization and structure learning method, PEEL. First, we use TDOA and TOA localization for initialization; then obtain an initial boundary localization estimate; and finally track a moving emitter as we simultaneously learn the environment, as summarized in Fig. 5.
A. Stage 1: Coarse localization of the emitter
We first describe a standard TDOA localization algorithm (which does not require knowledge of $ { \eta i}$) for synchronization with the emitter (Korhonen, 2008), then a TOA algorithm that leverages multipath from known boundaries $ { \eta i}$ after synchronization (Ribeiro , 2010; Brutti , 2010). Both algorithms use a gridsearch for the emitter to obtain $ p \u0302 e$, where a set $P$ of grid points $ p cand$ is defined over the target area.^{2} Each $ p cand \u2208 P$ corresponds to a potential emitter location, with an associated set of $ { \tau i , j , cand}$ to the receivers $ p r , j$ and the corresponding virtual receivers $ p i , j$. Thus, each $ \tau i , j , cand$ is a function of $ p r , j$ and of $ \eta i$. The TDOA and TOA stages are used to initialize more refined techniques.
The heuristic metric in Eq. (6) is maximized when each $ \tau i , j$ corresponds to a peak location; such locations are the intersections of TOF arcs, as seen in Fig. 6(a). When the $ { \eta i}$ are known, and therefore the virtual receivers are labeled correctly, $ p \u0302 e$ is obtained accurately.
TDOA and TOA localization are more robust if the search space $P$ can be limited to the vicinity of the correct location, $ p e$. Consider the results in Fig. 6(a); although $ p e$ is estimated accurately, there are many local maxima in the search space. These can lead to large errors as in Fig. 6(b). Confining $P$ to the vicinity of $ p e$ avoids such ambiguities. We summarize our TDOA and TOA localization algorithms in Algorithms 1 and 2.
Require: Unoccluded LOS arrivals at each receiver. 
Input: $ P , \u2009 { r j ( t ) } j = 1 M , \u2009 { p r , j } j = 1 M $ 
for $ p cand \u2208 P $ do 
Calculate $ C TDOA$ [Eq. (4)] 
end 
Get $ p \u0302 e , TDOA$[Eq. (5)] 
Result: $ p \u0302 e , TDOA $ 
Require: Time synchronization after applying Algorithm 1; number of boundaries N is known.^{4} 
Input: $ P , \u2009 { r j ( t ) } j = 1 M , \u2009 { p r , j } j = 1 M , \u2009 { \eta i } i = 1 N $ 
for $ p cand \u2208 P $ do 
Calculate $ C TOA$ [Eq. (6)] 
end 
Get $ p \u0302 e , TOA$ [Eq. (7)] 
Result: $ p \u0302 e , TOA $ 
Require: Time synchronization after applying Algorithm 1; number of boundaries N is known.^{4} 
Input: $ P , \u2009 { r j ( t ) } j = 1 M , \u2009 { p r , j } j = 1 M , \u2009 { \eta i } i = 1 N $ 
for $ p cand \u2208 P $ do 
Calculate $ C TOA$ [Eq. (6)] 
end 
Get $ p \u0302 e , TOA$ [Eq. (7)] 
Result: $ p \u0302 e , TOA $ 
One advantage of Algorithm 2 is that it is a heuristic weighting of the arrivals. A localization method that is solely based on $ { \tau i , j}$ will assign equal importance to each arrival so that there will be no automatic compensation for faulty TOF estimates. A TOF estimate is likelier to have a high error if the corresponding arrival is weak. In such cases, its contribution to $ C TOA$ in Algorithm 2 will be smaller, and its impact on $ p \u0302 e , TOA$ in Eq. (7) will be reduced.
Algorithm 2 also allows us to compare different hypothesized environments. Given candidate boundaries $ { \eta i} cand$, we can run Algorithm 2 for each $ { \eta i} cand$, and pick the maximum metric solution as our joint estimate of $ p \u0302 e , TOA$ and $ { \eta i}$. $ C TOA$ is therefore a heuristic measure of the estimate's goodness. Recall that the $ { \eta i}$ cannot generally be assumed known; thus, we next propose a boundary localization algorithm to produce a suitable set of $ { \eta i} cand$.
B. Stage 2: Preliminary boundary localization
After TDOA and TOA localization, we have an initial estimate of the emitter's position and coarse synchronization. To jointly estimate the emitter and the environment in a computationally feasible manner, we need a good initial estimate of reflective boundary positions, which ultimately yields a tight search space for $ { \eta i}$. We assume that a set of unlabeled $ { \tau \u0302 i , j}$ has been estimated (Demirli and Saniie, 2001), though some of these estimated arrival times might have large errors. We now develop a Hough transforminspired method to estimate $ { \eta i}$ which is robust to errors in $ { \tau \u0302 i , j}$, building on (Naseri and Koivunen, 2016).
As some background context, Euclidean distance matrices (EDM) (Dokmanic , 2015) and convex optimization methods (Naseri , 2014) have been used to solve for virtual emitter positions and thereby obtain boundary positions. These methods can suffer from large errors if an arrival $ \tau \u0302 i , j$ has been labeled incorrectly, and their performances are initializationsensitive. An alternative approach relies on the fact that in two dimensions, these $ \tau \u0302 i , j$ each define an ellipse whose foci are $ p e$ and $ p r$, as in Fig. 7(a); in three dimesions, they define spheroids. In Naseri and Koivunen (2016), boundaries are observed to be common tangents to such ellipses; by fitting tangent planes to these ellipses, we can estimate the boundaries while avoiding the errorprone initialization and labeling procedure in multipath environments, such as in Fig. 7(b). It is convenient to define a given $ \eta i$ by the spherical coordinates of a vector from the origin (arbitrarily selected) that is orthogonal to the boundary plane. Thus, each boundary is specified by a range ρ, azimuth θ, and elevation $\varphi $, where these local spherical coordinates correspond to a standard eastnorthup Cartesian coordinate system.
The transform for boundary localization in Naseri and Koivunen (2016), which for convenience we refer to as common tangents to spheroids (COTANS), is derived as follows. A boundary defined by ρ, θ, and $\varphi $ can be conceptualized as a point $ ( \rho , \theta , \varphi )$ in a COTANS transform domain; working out the $ ( \rho , \theta , \varphi )$ expression of a plane is to take the plane's COTANS transform (Borrmann , 2011). When the space $ \rho \xd7 \theta \xd7 \varphi $ is discretized as a 3mode tensor, each hypothetical tangent has a finite set of $ ( \rho , \theta , \varphi )$ on which it may lie. Incrementing this “accumulator” tensor over every potential $ ( \rho , \theta , \varphi )$ entry yields a COTANSdomain image [as in Fig. 9(a)], with maxima at the true boundaries in the noiseless case.
In practice, we first generate a set of points $ { ( \rho , \theta , \varphi )}$ by defining an azimuthelevation grid, and obtaining the COTANS transforms for each of the corresponding tangents. The resulting points $ { ( \rho , \theta , \varphi )} COTANS$ are then rounded to a desired accuracy.^{5} We define a tensor over ρ, θ, and $\varphi $ with this resolution, for each rounded $ { ( \rho , \theta , \varphi )} COTANS$ point, we increment the corresponding cell in the tensor by 1. We sum the tensors for each NLOS arrival at each receiver, to obtain a final tensor of superimposed COTANS curves such as in Fig. 9(a). Each ellipse contributes a curve; observe that the curves do not intersect at a single point because of errors in $ { \tau \u0302 i , j}$ and in the measured $ p e$ and $ p r$, so we have multiple maxima. We therefore use a smoothing filter to replace the tensor values with local averages.^{6} Then, we extract the locations of as many maxima as there are boundaries as in Fig. 9(b), where we set the neighborhood of each maximum in the tensor to zero, to avoid picking the same boundary multiple times.^{7} This method is summarized in Algorithm 3.
Require: Time synchronization from Algorithm 2; number of boundaries N is known. 
Input: $ { \tau \u0302 i , j } , \u2009 { p r , j } j = 1 M , \u2009 p \u0302 e , TDOA $ 
for $ \tau \u0302 i , j \u2208 { \tau \u0302 i , j } $ do 
Calculate COTANS transform for spheroid, add to accumulator ( Appendix A) 
end 
Result: Apply smoothing to COTANS accumulator, find its N maxima as $ { \eta \u0302 i}$. 
Require: Time synchronization from Algorithm 2; number of boundaries N is known. 
Input: $ { \tau \u0302 i , j } , \u2009 { p r , j } j = 1 M , \u2009 p \u0302 e , TDOA $ 
for $ \tau \u0302 i , j \u2208 { \tau \u0302 i , j } $ do 
Calculate COTANS transform for spheroid, add to accumulator ( Appendix A) 
end 
Result: Apply smoothing to COTANS accumulator, find its N maxima as $ { \eta \u0302 i}$. 
Modeling the environment as consisting of planar boundaries makes environment learning more computationally tractable. Note that the virtual receiver model is not limited to planar boundaries: the planar boundary estimates can be tangents to a nonplanar boundary. In this case, however, the boundary estimates will be inaccurate for localization purposes when the emitter moves to a different position. Therefore, rather than using the initial COTANS boundary estimates for the entire emitter trajectory, we need evolving boundary estimates over time, utilizing the global optimization method which we now discuss.
C. Stage 3: Joint boundary estimation and emitter localization
At this stage, we have initial estimates of both the emitter position and the boundaries, allowing for simultaneous optimization over $ p e$ and $ { \eta \u0302 i}$. We assume candidate boundaries $ { \eta i} cand$ that are in the vicinity of the $ { \eta \u0302 i}$, and then carry out TOA localization (Algorithm 2) for each $ { \eta i} cand$. These $ { \eta i} cand$ are drawn from a search space $ \rho \u2208 [ \rho i \u2212 \rho 0 , \rho i + \rho 0 ] , \u2009 \theta \u2208 [ \theta i \u2212 \theta 0 , \theta i + \theta 0 ] , \u2009 \varphi \u2208 [ \varphi i \u2212 \varphi 0 , \varphi i + \varphi 0 ]$, where ρ_{i}, θ_{i}, and $ \varphi i$ are estimated from Algorithm 3, and ρ_{0}, θ_{0}, and $ \varphi 0$ are error margins obtained as the typical maximum parameter estimation errors from COTANS, for a given setting. The $ { \eta \u0302 i}$ associated with the highest maximum metric $ C TOA$ [Eq. (6)] yields the most likely $ p e$ and $ { \eta i}$. Interlacing this algorithm with tracking over the emitter positions, we discover and map the occluders in the environment, and refine our estimates for joint localization and environment learning.
In this simultaneous optimization problem with measurement errors, it is advantageous to maintain multiple evolving estimates of the emitter and boundary locations, for which we use particle swarm optimization (PSO) (Clerc, 2010). In this global optimization algorithm, each $ { \eta i} cand$ is termed a “particle.” Each particle is initialized at a random location and given an initial “velocity” in the search space. Algorithm 2 is carried out for each particle, and the maximum $ C TOA$ and corresponding $ p \u0302 e , cand$ is recorded. Particle velocities are modified at each PSO iteration towards the global maximum metric particle (with the maximum $ C TOA$), and also to the maximum metric particle in a random “neighborhood” of particles. Thus, particles explore the region around the current best approximation.
PSO was used in PEEL because its metaparameters do not need to be adapted to a specific physical setting and because it can be run without reinitialization during tracking. There are alternative global optimization algorithms that can also be used in PEEL, however, and it would be advantageous in an application setting to finetune the metaparameters of PSO or another algorithm for that particular environment.
In order to be able to accommodate for occlusion, we rely on a tracking procedure over the periodic transmissions of the moving emitter. Kalman tracking with a constantacceleration model, for example, can be applied. The $ p \u0302 e$ obtained using PSO is used as the measurement vector to update the tracking. Critically, the gridsearch is confined to a sufficiently small area $P$ around the resulting prediction. This way, even if the emitter is temporarily occluded and the underlying $ C TOA$ metric is erroneous, $ p \u0302 e$ will still be close to the actual $ p e$. Thus, if the emitter enters an area that is unfavorable for localization, PEEL can potentially mitigate the resulting estimation errors. Our algorithm for PSO localization and boundary estimation is given in Fig. 10, and our overall approach is summarized in Algorithm 4.
Input: $ { r j ( t ) } j = 1 M , \u2009 { p r , j } j = 1 M , \u2009 { \eta i } cand $ 
Require: Moving emitter. 
for each emitter position in trajectory do 
for prespecified number of PSO iterations do 
Perform PSO, get $ p \u0302 e , PSO , \u2009 { \eta \u0302 i}$ (Fig. 10, Appendix B) 
end for 
Use $ p \u0302 e , PSO$ to update Kalman tracking, get new $P$ for next iteration. 
end for 
Result: $ p \u0302 e , PSO , \u2009 { \eta \u0302 i}$ 
Input: $ { r j ( t ) } j = 1 M , \u2009 { p r , j } j = 1 M , \u2009 { \eta i } cand $ 
Require: Moving emitter. 
for each emitter position in trajectory do 
for prespecified number of PSO iterations do 
Perform PSO, get $ p \u0302 e , PSO , \u2009 { \eta \u0302 i}$ (Fig. 10, Appendix B) 
end for 
Use $ p \u0302 e , PSO$ to update Kalman tracking, get new $P$ for next iteration. 
end for 
Result: $ p \u0302 e , PSO , \u2009 { \eta \u0302 i}$ 
From the results of Algorithm 4, we can determine whether each LOS path to receivers has been attenuated beyond expectation. Occlusion can be determined by assuming a $ 1 / R \u0302$ attenuation factor in the received signal magnitude, where $ R \u0302$ is a given estimated LOS range. Unoccluded arrivals allow estimation of the emitted power, by scaling the corresponding LOS magnitudes by $ R \u0302$. If the received LOS magnitude is substantially below the magnitude predicted by geometric spreading, we conclude that an occluder is present.^{8} Volumetric intersection methods can then be used to map the occluders (Srinivasan , 1990).
After mapping the occluders, we rerun Algorithm 4. We now know which LOS arrivals are blocked by the occluder, and we change the additive metric $ C TOA$ by leaving blocked paths out of the sum in Eq. (6). Our flexible cost function thereby allows us to compensate for model mismatch due to occlusion, and yields increasingly improved performance.
IV. EXPERIMENTAL RESULTS
We now evaluate our PEEL method for its passive localization and environment learning capabilities, seeking to demonstrate and understand its performance and robustness. For this purpose, we use the convenient setting of the watertank depicted in Fig. 2. We first test PEEL on real data obtained from the watertank; then add synthetic noise to this data to analyze PEEL's performance compared to other localization methods and finally conduct trials in a simulation setting with a wider range of localization scenarios.
For our PEEL implementation, we use 128 particles and 5000 PSO iterations for each emitter position. At each PSO iteration, we use a grid $P$ of 0.5 mm resolution and sides of 6 mm centered on the expected emitter location. We perform ten experiments for each localization scenario and obtain the root mean square (rms) localization error. The rms error for the first pass of PEEL is 3.9 mm, as seen in Fig. 11(a). To put this into perspective, the average distance of the emitter from the receivers for this trajectory is 93.5 mm, and the rms error in this scenario is 4.2% of the mean distance to the emitter. The final boundary estimates in Fig. 11(a) are obtained using the full estimated trajectory, which leads to a more accurate solution compared to the results obtained from using just one emitter position. Running PEEL's second pass, we get the result in Fig. 11(b), where the rms error is reduced to 2.7 mm (2.9% of the mean distance). The occluder is reconstructed with sufficient accuracy to improve our localization result. To understand how much the unknown occluder's presence affects the result, we experimented with the same scenario but with no occluder, obtaining an rms error of 1.5 mm (1.6% of the mean distance), which is within our margin of measurement accuracy. In these trials, occlusion and interference due to reverberation are the key error factors; the SNR of unoccluded LOS arrivals is > 30 dB.
If synchronization and an initial boundary estimate are given to us, singlereceiver localization using Algorithm 4 becomes feasible. In an experiment with a single emitter at the origin and no occluder, we get 2.1 mm of rms error (2.2% of the mean distance). This strong performance is surprisingly only slightly worse than when we use four receivers.
In order to analyze the performances of the localization algorithms that we have examined (TOA localization with known boundaries, and PSO), we carried out localization trials with synthetic Gaussian noise added to experimental signals. For the first position in the trajectory, where there is LOS to every receiver, the same noise power in units of dBm was added to each recorded signal. In Fig. 12(a), we compare PSO's performance for different numbers of PSO iterations, vs that of TOA localization with known boundaries. Because PSO optimizes for the boundary locations as well, its high runtime led to us conduct 150 trials for each noise level, while 5000 trials were carried out for TOA. There is a performance gap between PSO and the “ground truth” TOA result, which is smaller for higher noise levels. Longer runtimes improve PSO's performance and help close this gap.
We next evaluate the performance of environment learning by adding synthetic Gaussian noise to the signals measured at the last position in the trajectory (where two LOS arrivals are occluded) and obtaining the rms range error from the origin to all the boundaries, averaged over trials and the number of boundaries. We conducted 100 such trials for each noise level and for three scenarios: in the absence of the occluding cylinder as measured in a separate set of experiments, with the occluding cylinder present but not accounted for, and with the occluded arrivals removed in the revised metric (modeling the PEEL second pass). The results are given in Fig. 12(b), where we observe a gap in performance caused by the occlusion, beyond the performance deterioration caused by a lower effective SNR. Boundary estimation is improved by compensating for the occluder, though this improvement is less dramatic than the performance gains in emitter localization. Adding a noise power greater than 20 dBm causes much larger errors in the localization and boundary estimation tasks in this setting, as the noise peaks have much greater magnitude than the NLOS arrivals.
To test PEEL's performance on a wider range of experimental geometries, we generated synthetic data that modeled the watertank setting with randomized receiver locations. We implemented a simulation setting as per Deane (1994) that modeled a reverberant environment with multibounce arrivals, producing synthetic received signals with the mirror image method. In these trials, the emitter's trajectory, the boundary locations, the occluder, and the position of the receiver at the origin were kept constant, and PEEL was run for 10 different deployments of the other three receivers. These receivers were distributed uniformly over the region bounded by the positions of receivers in the real experimental setting. We used the same Gaussian pulse that was used in the watertank experiment, though the only distortion that these pulses underwent in simulation was attenuation. Ultimately, we obtained an rms error of 2.6 mm over ten trials, which was close to the watertank results.
Overall, we observed reliable performance across all of the trials for this particular demonstration. Once the unknown occluder is compensatedfor, we only observe several millimeters of error, which is reasonable since there are already millimeterscale errors in the assumed receiver positions. These results are summarized in Table II.
Scenario .  rms Error (mm) . 

Four receivers, no occluder  1.5 
Single receiver, no occluder  2.1 
Four receivers, occluder, firstpass  3.9 
Four receivers, occluder, final result  2.7 
Simulation, random receivers  2.6 
Scenario .  rms Error (mm) . 

Four receivers, no occluder  1.5 
Single receiver, no occluder  2.1 
Four receivers, occluder, firstpass  3.9 
Four receivers, occluder, final result  2.7 
Simulation, random receivers  2.6 
V. CONCLUDING REMARKS
In this paper, we developed a method for localization and environment learning, which was tested in a threedimensional (3D) reverberant underwater environment with occluding objects. Our PEEL method performed iteratively refined localization, with each solution yielding results that served as assumed knowledge for the following more sophisticated algorithm stages. This multistage approach handled a complex problem where successively establishing synchronization, boundary estimation, emitter tracking, and occluder estimation, made solving the overall task computationally feasible. Applying these stages separately is suboptimal as compared to a hypothetical joint optimization over all of the aspects of the problem. However, PEEL breaks down the localization task into subproblems with wellposed solutions. We ultimately achieved accurate simulation and experimental results in endtoend localization.
The trajectory to which PEEL was applied resulted in a difficult localization problem which nevertheless proved to be solvable. There are many adversarial scenarios, however, that can defeat our method. The emitter may move behind an occluder and come to a halt, or it may take a sharp turn which throws off the tracking algorithm. One way to cope with such scenarios would be to run several instances of PEEL in parallel. A discrepancy between their results would indicate catastrophic errors. This parallelization strategy for PEEL is also important for choosing the number of boundaries, N. If one is not confident about the assumption on N, then running PEEL in parallel with different values of N becomes advantageous. Choosing the best N is an important open problem for PEEL.
Note that we do not require a complete deduction of the occluders for improved localization performance. In this passive setting, a complete occluder mapping requires the emitter to move in an arc around them, as is the case in tomography. It is not guaranteed that occluders can be mapped since the emitter's path is not under our control. As long as occlusion is detected, potential errors from summing over nonexistent arrivals are avoided.
When deploying PEEL in an open water setting, there are improvements that must be made to cope with this more difficult environment. To use as general a model as possible, we have used the simplest versions of each algorithm, such as GCCPHAT for TDOA, basic PSO for optimization, and a constant acceleration model for Kalman tracking. Each of these algorithms can be enhanced if more domain knowledge is available. Applying an algorithm that tracks the multipath arrivals (Fleury , 1999; Salmi , 2008), for example, can help to further refine the boundary estimates. The underwater environment's bathymetric profile or the presence of occluding objects may already be known, and must then be taken into account. PEEL can also be applied to tasks such as indoor (acoustic or RF) localization, which features easier channels, but cluttered environments with different challenges.
PEEL currently incorporates global optimization over the emitter position and the locations of planar boundaries. A more sophisticated environmental model would allow it to work robustly with the complex nonplanar boundaries in the reallife shallowwater environment. In the literature, global optimization with simulated annealing and genetic algorithms have been applied to MFP (Hermand, 1999) for geoacoustic inversion, using known emitters. Optimization over not just the locations of their boundaries but their physical properties, while more computationally expensive, could lead to improved results in field deployments.
There are broader questions in TOA localization that pertain to PEEL. Currently, PEEL exploits the emission of deterministic, pulsed signals. In practice, other types of emissions may arise, as with communications sources or vessel signatures. It would be challenging to establish time synchronization with such signals. Although we have experimented with performance in increasing noise, it would also be of interest to obtain a full bandwidth/noise characterization over different environments and signals. Finally, COTANS boundary estimation may be amenable to a machine learning method, to automate the smoothing and peak extraction tasks. COTANS transform results are twodimensional (2D) or 3D images, so we can train a neural network on such data for boundary estimation in future experiments.
ACKNOWLEDGMENTS
This work was supported, in part, by ONR under Grant Nos. N000141912661, N000141912662, and N000141912665, and NSF under Grant No. CCF1816209. The authors also thank Dr. Jim Preisig of JP Analytics LLC for valuable discussions and feedback on the work.
APPENDIX A: DERIVATION OF THE BOUNDARY LOCALIZATION APPROACH
1. COTANS transform for tangent lines to ellipses in 2D
We present the derivation of the COTANS transform of tangent lines to an ellipse in 2D. First, we derive the tangent line to a standard ellipse centered around the origin, with a closedform description in terms of the normal vector to the tangent. We then consider this ellipse to be rotated and translated to its actual position in 2D and modify the normalvector description of the corresponding rotated and translated tangent. The final parametrization of the tangent in terms of its normal vector is $ ( \rho COTANS , \theta COTANS )$; this is a point in $ { ( \rho , \theta )}$space, where ρ is the length of the vector and θ is its direction as measured with respect to the xaxis. This description is by definition the COTANS transform of the tangent line.
Now, we obtain the COTANS transform of this tangent line, by parametrizing it in terms of its Euclidean distance ρ from the origin and the azimuth θ of the normal line to the tangent going through the origin. Thus, a tangent to the ellipse at $ ( a \u2009 cos \u2009 \alpha , b \u2009 sin \u2009 \alpha )$ will be expressed as a point $ ( \rho \alpha , \theta \alpha )$ in the COTANS domain. One derivation of this COTANS transform result is as follows. Our method will obtain $ \rho ( \theta )$, the distance to the origin of a given tangent plane as a function of azimuth.
Equation (A6) allows us to obtain the set of points $ { ( \rho , \theta )}$ in the COTANS domain for a standard ellipse, but we wish to obtain such points for a general ellipse whose foci are $ p e$ and $ p r$. We are given that the LOS distance between the emitter and receiver is $ d LOS$ and that the distance of the NLOS reflection from a given boundary is $ d NLOS$. Our method is to first obtain $ ( \rho , \theta )$ for a standard ellipse with foci at $ p r \u2032 = [ \u2212 d LOS / 2 \u2003 0 ] T$ and $ p e \u2032 = [ d LOS / 2 \u2003 0 ] T$, with axes $ a = d NLOS / 2 , \u2009 b = d NLOS 2 \u2212 d LOS 2 / 2$. One example of such an ellipse is given in Fig. 8(a). The ellipse corresponding to the true localization scenario is obtained by rotating this standard ellipse as in Fig. 8(b), and then translating it to its correct position as in Fig. 8(c). The $ ( \rho , \theta )$ of the tangent line must be correspondingly transformed to match the new position of the rotated and translated tangent that yields the desired $ ( \rho COTANS , \theta COTANS )$.
To modify $ ( \rho , \theta )$ as per Fig. 8, first consider the azimuth of the vector $ p e \u2212 p r$ as $ \theta rot$. To align the standard ellipse with the target ellipse, we replace each $ ( \rho , \theta )$ with $ ( \rho , ( \theta + \theta rot ) \u2009 mod \u2009 2 \pi )$, which we term $ ( \rho , \theta \u2033 )$. This rotation leaves the ρvalue of each tangent line unchanged, and only affects the azimuth, while the new foci are at $ p e \u2033$ and $ p r \u2033$. Next, we calculate a translation vector $ p trans = p r \u2212 p r \u2033$, which would be added to any point on the rotated standard ellipse to obtain the target ellipse. To obtain the resulting $ ( \rho COTANS , \theta COTANS )$ pairs, we first calculate the dot product $ \rho proj = p trans \xb7 \rho \u0302$, where $ \rho \u0302 = [ cos \u2009 \theta \u2033 \u2009 \u2009 \u2009 sin \u2009 \theta \u2033 ] T$ is the unit vector pointing towards the tangent line. Thus, we project the translation vector $ p trans$ onto $ \rho \u0302$. If $ \rho proj \u2265 0$, then we replace $ ( \rho , \theta \u2033 )$ with $ ( \rho + \rho proj , \theta \u2033 )$. If $ \rho proj < 0$, then we replace ρ with $  \rho \u2212  \rho proj  $. If $  \rho proj  < \rho $ and $ \rho proj < 0$, then we do not modify the azimuth $ \theta \u2033$; else, we replace $ \theta \u2033$ with $ ( \theta \u2033 + \pi ) \u2009 mod \u2009 2 \pi $. Carrying out these operations for each of the starting $ { ( \rho , \theta )}$ points, we obtain a final transformed set of points as $ { ( \rho COTANS , \theta COTANS )}$.
2. COTANS transform for tangent planes to spheroids in 3D
Using Eq. (A12), we can take the COTANS transform of tangent planes to a standard spheroid with a = b. We can rotate this spheroid and translate it, and correspondingly transform the points $ ( \rho , \theta , \varphi )$ that define the tangent planes. Our operations are similar to those in the 2D case. We first define our standard spheroid to have foci at $ p r \u2032 = [ 0 \u2003 0 \u2212 d LOS / 2 ] T$ and $ p e \u2032 = [ 0 \u2003 0 \u2003 d LOS / 2 ] T$, with axes $ a = b = d NLOS 2 \u2212 d LOS 2 / 2$ and $ c = d NLOS / 2$. We calculate the azimuth and elevation of the vector $ p e \u2212 p r$, and call these angles $ \theta rot$ and $ \varphi rot$, respectively. We rotate the spheroid by $ \varphi rot$ with the yaxis as the axis of rotation (thereby obtaining the correct elevation), then rotate the result by $ \theta rot$ with the zaxis as the axis of rotation (thereby obtaining the correct azimuth), and finally, we translate it to its correct position.
APPENDIX B: PSO IMPLEMENTATION
In practice, if T_{p} > T_{c}, then T_{p} can be estimated, but this is an environmentspecific problem outside the scope of our method, as is an estimation of T_{m}.
In our scale watertank setting, the regularlyspaced gridsquares have length 1 mm.
More sophisticated methods for extracting the LOS peak could also be used (Guvenc and Sahinoglu, 2005).
Algorithm 2 does not require leveraging all of the boundaries to yield an accurate result. The additional NLOS arrivals provide spatial diversity that helps to increase localization accuracy and mitigate occlusion.
We use a resolution of 0.1 mm for ρ and 0.1° for θ and $\varphi $ in our experiments.
We heuristically use a uniform averaging filter that is 3 mm wide in ρ, and 3° wide in θ and $\varphi $.
Given a maximum of the accumulator result at some $ ( \rho max , \theta max , \varphi max )$, we zero out the θ and $\varphi $ within $ \xb1 15 \xb0$ of the maximum, and all ρ values within this sector.
70% below the predicted magnitude, in our implementation.