When using ultrasound imaging of the tongue for speech recording/research, submental transducer stabilization is required to prevent the ultrasound transducer from translating or rotating in relation to the tongue. An iterative prototype of a lightweight three-dimensional-printable wearable ultrasound transducer stabilization system that allows flexible jaw motion and free head movement is presented. The system is completely non-metallic, eliminating interference with co-recorded signals, thus permitting co-collection and co-registration with articulometry systems. A motion study of the final version demonstrates that transducer rotation is limited to 1.25° and translation to 2.5 mm—well within accepted tolerances.

Ultrasound imaging of the tongue uses a transducer to simultaneously send and receive ultra-high-frequency sound waves. The transducer is placed submentally (under the chin in between the bony frame of the mandible), and the sound waves reflect back when they encounter differences in fluid density. This property of ultrasound allows imperfect imaging of tongue muscle structure and relatively clear images of the tongue surface (Gick et al., 2005; Hedrick et al., 1995; Stone, 2005). These images can be used to track tongue motion using optical flow (Barbosa and Vatikiotis-Bateson, 2015), tongue surface shape using automatic (Articulate Instruments Ltd., 2012) or manual (Tiede and Whalen, 2015) tracing, tongue landmark identification (Ménard et al., 2012), or relating tongue configuration to acoustics (Carignan, 2018). Interpreting these data often requires stabilization of the ultrasound transducer in relation to the head or lower jaw.

In addition, like all medical imaging except for tagged magnetic resonance imaging (MRI), ultrasound cannot be used to track tongue flesh points. It is therefore often ideal to stabilize the ultrasound transducer in a way that simultaneously allows flesh-point tracking. Here we present a non-metallic ultrasound transducer stabilization system, designed through iterative evaluation and testing, that is suitable both for transducer stabilization and for co-collection with articulometry data (as well as electromyography and electro-glottal data). Our system is also lightweight compared to previous metallic systems, at 200–240 g fully assembled, allows free head movement, and is comfortable for up to 1.5 h based on participant self-reports. The transducer stabilization system also allows participants to play several mouth-operated musical instruments, such as trumpets and trombones, in order to record instrumental tongue articulations (Heyne, 2016; Heyne and Derrick, 2016).

Researchers began with hand-held ultrasound transducer stabilization (Heyne et al., 2018), as well as having participants sit and rest their head against a back wall, greatly reducing head motion (Gick et al., 2005). These two ultrasound stabilization techniques are still suitable for shape comparisons and related diagnostics, such as tongue shape in rhotic production (Heyne et al., 2018), or tongue motion direction in flap production (Derrick and Gick, 2011). Both, however, restrict head motion and do not allow estimation of the motion variance of the ultrasound transducer relative to the head, so both carry unknown risks of measurement error.

Researchers therefore quickly came to prefer systems such as the Head And Transducer Stabilizer system (HATS) (Stone and Davis, 1995) with video calibration, and the Haskins Optically Corrected Ultrasound System (HOCUS) (Whalen et al., 2005), with optical tracking for automated calibration of ultrasound image position.

HOCUS is still used and works well. HOCUS testing has also established the tolerances that, if met, can allow for ultrasound-only data collection, that is, without optical co-tracking. For two-dimensional (2D) mid-sagittal data collection, lateral motion (translation along the sagittal plane) of less than 2–4 mm, yaw less than 5°, and pitch or roll less than 5°–7°, eliminates most of the risk of blurred or inaccurate midsagittal measurement (Whalen et al., 2005). HOCUS (or any head-correction by tracking systems) cannot correct (recover) yaw, roll, and lateral sliding in 2D ultrasound, so these systems now flag data outside the acceptable motion range.
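The tolerance check described above can be sketched in a few lines of Python. This is only an illustration: the class and function names are hypothetical, and the default limits are an assumption that takes the conservative (lower) end of each range quoted from Whalen et al. (2005).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerances:
    # Assumed limits: the lower end of each range quoted in the text.
    lateral_mm: float = 2.0
    yaw_deg: float = 5.0
    pitch_deg: float = 5.0
    roll_deg: float = 5.0


def flag_sample(lateral_mm, yaw_deg, pitch_deg, roll_deg, tol=Tolerances()):
    """Return the names of the motion components at or beyond tolerance."""
    measured = {
        "lateral": (lateral_mm, tol.lateral_mm),
        "yaw": (yaw_deg, tol.yaw_deg),
        "pitch": (pitch_deg, tol.pitch_deg),
        "roll": (roll_deg, tol.roll_deg),
    }
    # The text asks for motion "less than" each limit, so equality is flagged.
    return [name for name, (value, limit) in measured.items()
            if abs(value) >= limit]
```

A sample with small residual motion returns an empty list, while `flag_sample(3.0, 6.0, 0.0, 0.0)` reports the lateral and yaw components as out of range.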

The Articulate Instruments metallic transducer stabilization system (Scobbie et al., 2008) was one of the first systems to use HOCUS tolerances to eliminate the need for secondary tracking. The system had as much as 1.8 mm of translation and 1.3° of rotation (95% confidence interval), well within the HOCUS-established limits. However, the system is metallic, and so can interfere with co-collection of articulometry data. It also has many points of manual adjustment, which can allow data collection errors due to small positioning differences. It is also somewhat uncomfortable to wear, so it can only be used for about 30–45 min at a time, and it was tested for only 20 min of runtime in Scobbie et al. (2008).

Our goal was to design a system stable enough to work without point tracking, but flexible enough to allow data co-collection with multiple other systems. As noted, ultrasound and optical tracking systems have worked well together for a long time (Whalen et al., 2005). However, optical tracking systems do not allow for internal flesh-point tracking. Articulometry does, but it depends upon a stable alternating-current magnetic field, which can be disrupted by metal objects such as the metallic Articulate Instruments device. Ideally, an ultrasound transducer stabilization system should be made entirely of non-metallic materials.

In 2012, our research team began developing a non-metallic transducer stabilizer. Our first prototype was tested during a large-scale study of American English speech recorded at different speech rates (Derrick and Gick, 2012). Analysis of head motion residual effects on the ultrasound transducer position showed that our prototype transducer stabilization system kept translation within a 2–3 mm range, and rotation within a 1.2°–2.5° range at a 95% confidence interval over 45 min of ultrasound and articulometry data recording. The most important (lateral) motion was only 2.15 mm at a 95% confidence interval, which meant most of the time the system functioned within an acceptable range of motion (Derrick et al., 2015).

The first version of our transducer stabilization system has been used extensively in research, including collection of data on New Zealand English (NZE) rhotics (Heyne et al., 2018), monophthong vowels (Heyne et al., 2015), syllable utterance rate (Derrick, 2011), laterals in NZE (Strycharczuk et al., 2018) and Australian English (Ying et al., 2016), nasal vowels in Southern French (Carignan, 2017), and teasing apart the respective contribution of the oral and nasal cavities to vowel acoustics (Carignan, 2018). It has also been used for Australian English coronal consonants (Derrick et al., 2014), Iwaidja lenition (Mailhammer et al., 2015), consonant gemination in Moroccan Arabic (Frej et al., 2017), and a study of the relationship between trombone playing and native language, focusing on English and Tongan (Heyne and Derrick, 2015a,b).

There were nonetheless certain flaws with the first prototype which became obvious to us and the other research teams who used it. The most significant was that the ultrasound transducer was held within the system using foam and elastic bands, which required careful physical alignment of the transducer, and could allow for untrackable position shift during long experiments. The risk was increased during transitions between participants as the transducer gel was cleaned off of the transducer and transducer clip. We therefore undertook to improve upon it in a second iteration.

We decided to design a system for which all the customized rigid components could be printed on a three-dimensional (3D) printer. We developed and iteratively refined the requisite set of 3D printed parts and configuration.

Figure 1 shows the system's printable materials for the transducer stabilization portion of the system in assembled form. The extra printable parts needed to fit it to the participant's head and adjust it (bolt, adjuster, head cap) are shown in Fig. 2.

Fig. 1.

(Color online) Schematic of assembled non-metallic ultrasound transducer stabilization system, not including the headpiece and bolt adjusters.

Fig. 2.

(Color online) Schematic of headpiece and bolt adjusters.


The design simultaneously allows rigidity of the clip holder that holds the ultrasound transducer and maintains it in position, and flexibility of the stabilization system to changes in muscle compression of the tongue (e.g., during swallowing). Since the triangular electromagnetic articulometry (EMA) plate holding the EMA reference sensors is attached to the pivoting clip holder itself, any minor adjustments of the angle of the clip holder may be optionally tracked with articulometry data. This allowed us to test the stability of this second 3D printable system in a replication of the tests used for the first system (Derrick et al., 2015).

Data for evaluation of the current version of the transducer stabilization system were collected from a single-participant replication of those initial tests (see Derrick et al., 2015). The participant was a native German speaker fluently speaking English, who reported having normal hearing. The participant produced 30 min of read speech, which provided concurrent data on ultrasound motion relative to head position. The methods used here are almost identical to those used to evaluate that first iteration (see Derrick et al., 2015), also involving co-collected ultrasound and EMA data, but using the current transducer stabilization system.

A General Electric (GE, Boston, MA) 8C-RS ultrasound transducer was connected to a GE LOGIQ e (version 11) portable ultrasound machine. The ultrasound was connected to an Epiphan (Ottawa, Ontario, Canada) VGA2USB Pro frame grabber plugged into a USB port of a MacBook Pro (Apple, Cupertino, CA), late 2013 model, with a 2.6 GHz quad-core i7 and 16 GB of RAM. A Sennheiser (Wedemark, Germany) MKH 416 microphone was plugged into a Sound Devices (Reedsburg, WI) USB Pre 2, which was itself plugged into the other USB port of the MacBook Pro.

The 8C-RS ultrasound transducer was held with the transducer clip (8) and fitted into the transducer stabilization system (4) as seen in Fig. 1. The transducer stabilization system was strapped onto the head of the participant with elastic bands attached between each of the four mount points on the base (1), and the corresponding mount point on the head cap in Fig. 2, allowing the ultrasound transducer to rest submentally and adjacent to the thyroid notch of the larynx mid-sagittally. The transducer stabilization system base was adjusted to the participant's jaw angle by rotating the two jaw flaps (2) slightly up and down as needed.

The MKH 416 microphone was mounted on a Manfrotto 244 variable grip arm and placed to the side of the mouth, 5–10 cm away from the participant and outside the range of the EMA field projector. Our participant was seated in a non-metallic chair with his head positioned next to the NDI Wave EMA field projector. NDI Wave sensors were attached to the ultrasound stabilization system on the points of the front triangle (7). NDI Wave sensors were also taped to the participant's nasion, left mastoid, and right mastoid, and glued to the midsagittal line of the tongue tip, tongue dorsum as far back as comfortable for the participant, and mid-way on the tongue blade. Sensors were also glued to the gum just under the inner lower left incisor, and the midsagittal line of the upper and lower lip next to the vermilion border.

For this experiment, the participant was seated and presented with auditory reiterant speech at rates ranging from 3 to 7 syllables/s, and then asked to read sentences at the speech rate of the reiterant speech. Sentences and speech rates were randomized and presented on a computer screen 1 m away from the participant's seated position. Each block took about 175 s to read, and 10 blocks of data were collected, representing about 30 min of data over about 45 min of experiment time. EMA data were collected on the NDI Wave system, connected to a PC. Ultrasound data were also co-collected on the same machine using FFmpeg, running an x264 (H.264) encoder with the yuv420p pixel format for the video and pulse-code modulation (PCM s16le) for the audio.
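As a rough illustration of the encoding settings named above, the following Python sketch assembles an FFmpeg argument list. The capture-device options are not given in the text and are therefore left out; the function name, the file names, and the choice of a Matroska (`.mkv`) container, which accepts PCM audio alongside H.264 video, are all assumptions of this sketch.

```python
def build_encode_command(source, destination):
    """Assemble an FFmpeg command using the codec settings named in the text.

    `source` and `destination` are hypothetical file names; a live capture
    would need device-specific input options that are not stated here.
    """
    return [
        "ffmpeg",
        "-i", source,             # input file or stream
        "-c:v", "libx264",        # H.264 video through the x264 encoder
        "-pix_fmt", "yuv420p",    # pixel format named in the text
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM audio
        destination,
    ]
```

The returned list can be passed to `subprocess.run` on a machine with FFmpeg installed; building the list separately keeps the settings easy to inspect and log.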

The occlusal (bite) plane of our participant was recorded in order to provide a baseline for his head position. The bite plane was recorded by having him hold a protractor between his teeth. The protractor carried three EMA sensors, one taped at each corner of a triangle, which is the minimum number necessary to record position and orientation accurately. The triangle of the occlusal protractor was translated and rotated onto an ideal projection, and used to calculate a head rotation matrix via the nasion/mastoid triangle. This matrix was then used to transform the positions of all the sensors to an ideal head position.
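The idea of expressing every sensor in a head-fixed coordinate system can be illustrated with a short NumPy sketch. It does not reproduce the exact procedure used in this study: it simply builds an orthonormal frame from the nasion and mastoid sensors, and the axis convention (forward, left, up), the origin at the mastoid midpoint, and the function names are all assumptions.

```python
import numpy as np


def head_frame(nasion, left_mastoid, right_mastoid):
    """Return (origin, R) for a head-fixed frame built from three sensors.

    Rows of R are the unit axes (forward, left, up) in field coordinates.
    """
    nasion, left, right = (
        np.asarray(p, dtype=float)
        for p in (nasion, left_mastoid, right_mastoid)
    )
    origin = (left + right) / 2.0                 # midpoint of the mastoids
    leftward = left - right
    leftward /= np.linalg.norm(leftward)
    forward = nasion - origin
    forward -= forward.dot(leftward) * leftward   # orthogonal to leftward
    forward /= np.linalg.norm(forward)
    up = np.cross(forward, leftward)              # right-handed completion
    return origin, np.vstack([forward, leftward, up])


def to_head_coordinates(points, origin, rotation):
    """Express field-space points (N x 3) in the head-fixed frame."""
    return (np.asarray(points, dtype=float) - origin) @ rotation.T
```

Because the frame moves with the head, a sensor that is rigidly attached to the head keeps the same head-frame coordinates however the head is rotated or translated within the field, which is the property that head correction relies on.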

In order to compare the angle of the ultrasound transducer with respect to the head rotation angle, we carefully measured the distance of the three sensors on the EMA sensor attachment [the three corners of (7)] to the transducer center. Because they were further from the center of rotation of the head, the measurement points were displaced considerably more than the ultrasound transducer center. Therefore, the rotation matrix of the sensor triangle was used to rotate and move the ideal transducer center to the actual location of the transducer relative to head position. The X (sagittal), Y (coronal), and Z (transverse) displacements of the transducer center were thereby obtained for each NDI Wave recording sample in the experiment. The rotation matrix was also used to obtain the pitch, yaw, and roll for each sample.
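One common way to carry out this kind of computation is a least-squares rigid fit (the Kabsch algorithm) between the ideal and observed sensor triangles, followed by extraction of Euler angles. The sketch below is offered under that assumption and is not necessarily the procedure used here; the function names, the Z-Y-X (yaw, pitch, roll) angle convention, and the mapping of angles to axes are all hypothetical.

```python
import numpy as np


def rotation_zyx(yaw, pitch, roll):
    """Rotation matrix Rz(yaw) @ Ry(pitch) @ Rx(roll); angles in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return rz @ ry @ rx


def rigid_fit(reference, observed):
    """Kabsch fit: rotation R and translation t with observed ~ R @ p + t."""
    reference = np.asarray(reference, dtype=float)
    observed = np.asarray(observed, dtype=float)
    ref_c, obs_c = reference.mean(axis=0), observed.mean(axis=0)
    H = (reference - ref_c).T @ (observed - obs_c)   # 3 x 3 covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))           # guard against reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, obs_c - R @ ref_c


def yaw_pitch_roll_deg(R):
    """Angles in degrees, assuming R = Rz(yaw) @ Ry(pitch) @ Rx(roll)."""
    yaw = np.arctan2(R[1, 0], R[0, 0])
    pitch = -np.arcsin(np.clip(R[2, 0], -1.0, 1.0))
    roll = np.arctan2(R[2, 1], R[2, 2])
    return tuple(np.degrees([yaw, pitch, roll]))
```

Given the fitted `R` and `t`, the transducer center for a sample is `R @ ideal_center + t`, where `ideal_center` is the measured offset of the transducer center from the sensor triangle in the ideal pose.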

The results of the test, as seen in Table 1, show that the second iteration of our transducer stabilization system performs at least as well as the first iteration. Table 1 shows the residual motion within each 3-min block, as well as the residual motion spanning the entire experiment. The experiment average therefore shows the added effect of long-term transducer stabilizer drift.

Table 1.

Rotation and displacement for 3D-printable ultrasound transducer stabilization.


                  Angle (degrees)          Displacement (mm)
                  roll   pitch  yaw        X      Y      Z      Total
Block       SD    0.58   0.36   0.22       1.25   0.63   0.85   1.09
            95%   1.14   0.71   0.43       2.45   1.23   1.67   2.14
Experiment  SD    0.63   0.55   0.36       1.63   1.06   0.89   1.10
            95%   1.23   1.08   0.71       3.19   2.08   1.74   2.16
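The 95% rows in Table 1 appear to be consistent with 1.96 times the corresponding standard deviation, which is the usual normal-approximation multiplier. That relationship is inferred from the numbers and is not stated in the text, so the following Python check should be read as a sanity test of the table and not as a description of how the values were computed.

```python
# Values copied from Table 1, in column order:
# roll, pitch, yaw (degrees), then X, Y, Z, Total (mm).
SD = {
    "block": [0.58, 0.36, 0.22, 1.25, 0.63, 0.85, 1.09],
    "experiment": [0.63, 0.55, 0.36, 1.63, 1.06, 0.89, 1.10],
}
P95 = {
    "block": [1.14, 0.71, 0.43, 2.45, 1.23, 1.67, 2.14],
    "experiment": [1.23, 1.08, 0.71, 3.19, 2.08, 1.74, 2.16],
}


def max_gap(row, z=1.96):
    """Largest absolute difference between z * SD and the tabulated 95% value."""
    return max(abs(z * sd - p95) for sd, p95 in zip(SD[row], P95[row]))


for row in SD:
    # Every gap is within the rounding error of two-decimal table entries.
    print(row, round(max_gap(row), 4))
```

Under this reading, the gap never exceeds the half-unit rounding error of a two-decimal table entry in either row.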

Here we presented iterative development and evaluation of a stable, non-metallic ultrasound transducer stabilization system. This second version, like the first one, reduces ultrasound motion to well within the thresholds identified in the HOCUS experiments (Whalen et al., 2005). While this stabilization system still requires the researcher to look at the position of the transducer stabilizer on the participant's head, there are no longer hidden risks of misalignment that would not show up in photographs of the setup. Nevertheless, it is important to examine and photograph participants' heads, and examine the ultrasound tongue image for double-lines (see Stone, 2005), which indicate the ultrasound transducer is not set along the mid-sagittal plane. The researcher must also either measure the occlusal plane of a participant to orient the tongue, or select a relatively stable shared tongue position in order to compare data across speakers. The risks may not always be completely eliminated, but they are now visible, and can therefore be taken into account in interpreting results.

The second iteration of our system offers the added benefit of allowing other labs to build and assemble the transducer stabilizer without our intervention. Reproduction of our system requires a 3D printer, 3D printing material, a sewing kit, box cutter, foam, elastic straps, strap adjusters, shock-cord, and Velcro.

Within our stereolithography (STL) sets we provide clips to hold the GE LOGIQ e 8C-RS transducer, as well as the Telemed 10 and 20 mm transducers used with the Articulate Instruments Echo B. These clips have been created from 3D laser scans of the respective transducers. Each clip is formed from the negative of the corresponding scan, so that, when attached to the transducer, it grips the transducer body in a way that is customized to its unique shape. Similar 3D scans of other ultrasound transducer heads will provide the meshes needed to modify pre-existing clips. Alternatively, we provide a more general-purpose clip (in four different widths: 20, 25, 30, and 35 mm), which can be used with foam padding to hold an ultrasound transducer in place; however, due to the physical tolerance of foam, this general-purpose solution will allow for minor degrees of movement that will not occur with the 3D-scan-customized transducer clips.

These STL parts were designed for dual-extrusion 3D printers. For the substance of the parts, we recommend polylactic acid for testing the 3D printer or for testing modified designs, and acrylonitrile butadiene styrene (ABS) for prolonged use. For the support structure, we recommend a water-soluble product like polyvinyl alcohol (PVA). If you use a single-extrusion printer, you will need to generate a hand-breakable supporting structure to print both the transducer base and the clip holder.

During construction, a foam base should be cut into a shape that fits on the ultrasound transducer base and/or on the narrow base extender, depending on the desired configuration; 3D printable stencils are provided for cutting foam to match both the wide and narrow configurations. These two configurations allow the system to work with a wide range of head sizes. Our supplementary materials¹ provide detailed construction instructions, parts lists, and the STL files (default unit: 1 mm) needed to print the pieces. We tested our instructions through the construction of two extra copies of our system, one at the University of Canterbury, printed at the advanced design and manufacture lab, and the second at Western Sydney University's MARCS Institute. Pictures of the one printed at the University of Canterbury are shown in Fig. 3.

Fig. 3.

(Color online) Photo of the headset in place. Left = front, middle = side, right = isometric view.


The provision of these instructions and 3D printing materials will allow many more research labs to build and use this design, add clips for many more ultrasound transducers, and support a research culture in which the design can be customized for specific purposes and otherwise improved for decades to come.

This research was funded in part by a New Zealand MARSDEN fast-start grant “Saving energy vs making yourself understood during speech production” to D.D. It was also funded in part through Grant No. NIH DC-002717 to Haskins Laboratories. Special thanks to Scott Lloyd for help in the design and for the construction of the first iteration of the non-metallic ultrasound transducer stabilization system. The MARCS Institute also provided personnel and technical support and equipment for development of the second version, with thanks to Ben Binyamin for remote testing of the print and assembly. Thanks also to the University of Canterbury's advanced design and manufacture lab for printing a second copy of the transducer stabilization system. The 3D printed parts were produced on their Stratasys Elite 3D printer using ABS as the primary material supported by a PVA scaffold structure. Thanks to David Read, the lab tech, who was most helpful in producing these parts. Special thanks to Bob Haywood for consultations on the 3D models and their limitations with regard to consumer-grade printers.

¹ See the supplementary material at https://doi.org/10.1121/1.5066350E-JASMAN-144-505811 for detailed instructions for printing and assembling this ultrasound transducer stabilizer. The material also contains all of the stereolithography (STL) files needed to print components for the stabilizer. The STL headpiece and base include the digital object identifier (DOI) for this article, allowing ease of citation.

1. Articulate Instruments Ltd. (2012). Articulate Assistant Advanced Ultrasound Module User Manual, revision 2.14 (Articulate Instruments Ltd., Edinburgh, UK).
2. Barbosa, A. V., and Vatikiotis-Bateson, E. (2015). “Optical flow analysis for measuring tongue-motion,” J. Acoust. Soc. Am. 136(4), 2105.
3. Carignan, C. (2017). “Covariation of nasalization, tongue height, and breathiness in the realization of F1 of Southern French nasal vowels,” J. Phonetics 63, 87–105.
4. Carignan, C. (2018). “Using ultrasound and nasalance to separate oral and nasal contributions to formant frequencies of nasalized vowels,” J. Acoust. Soc. Am. 143(5), 2588–2601.
5. Derrick, D. (2011). “Syllable iterance rate influences categorical variation of English flaps during normal speech,” in New Zealand Institute of Language, Brain and Behaviour Annual Workshop (NZILBB), Christchurch, New Zealand.
6. Derrick, D., Best, C. T., and Fiasson, R. (2015). “Non-metallic ultrasound probe holder for co-collection and co-registration with EMA,” in Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS), pp. 1–5.
7. Derrick, D., Fiasson, R., and Best, C. T. (2014). “Coordination of tongue tip and body in place differences among English coronal obstruents,” in Proceedings of the 10th International Seminar on Speech Production (ISSP), Cologne, Germany.
8. Derrick, D., and Gick, B. (2011). “Individual variation in English flaps and taps: A case of categorical phonetics,” Can. J. Linguist. 56(3), 307–319.
9. Derrick, D., and Gick, B. (2012). “Speech rate influences categorical variation of English flaps and taps during normal speech,” J. Acoust. Soc. Am. 131(4), 3345.
10. Frej, M. Y., Carignan, C., and Best, C. T. (2017). “Acoustics and articulation of medial versus final coronal stop gemination contrasts in Moroccan Arabic,” in Proceedings from INTERSPEECH 2017, pp. 201–214.
11. Gick, B., Bird, S., and Wilson, I. (2005). “Techniques for field application of lingual ultrasound imaging,” Clin. Linguist. Phonetics 19(6–7), 503–514.
12. Hedrick, W. R., Hykes, D. L., and Starchman, D., eds. (1995). Ultrasound Physics and Instrumentation, 3rd ed. (Mosby, St. Louis, Missouri).
13. Heyne, M. (2016). “The influence of first language on playing brass instruments: An ultrasound study of Tongan and New Zealand trombonists,” Ph.D. thesis, University of Canterbury.
14. Heyne, M., and Derrick, D. (2015a). “The influence of tongue position on trombone sound: A likely area of language influence,” in Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS), pp. 1–5.
15. Heyne, M., and Derrick, D. (2015b). “Using a radial ultrasound transducer's virtual origin to compute midsagittal smoothing splines in polar coordinates,” J. Acoust. Soc. Am. 138(6), EL509–EL514.
16. Heyne, M., and Derrick, D. (2016). “Visualization techniques for empirical brass instrument research,” Int. Trumpet Guild 140(4), 6–14.
17. Heyne, M., Derrick, D., and Hay, J. (2015). “An ultrasound study of monophthongs in New Zealand English,” in a talk given at The 46th Annual Conference of the Australian Linguistics Society, Western Sydney University, Parramatta.
18. Heyne, M., Wang, X., Derrick, D., Dorreen, K., and Watson, K. (2018). “The articulation of /ɹ/ in New Zealand English,” J. Intl. Phonet. Assoc. 1–23.
19. Mailhammer, R., Harvey, M., Agostini, T., and Shaw, J. A. (2015). “Bolstering phonological fieldwork with ultrasound: Lenition and approximants in Iwaidja,” in Ultrafest VII.
20. Ménard, L., Aubin, J., Thibeault, M., and Richard, G. (2012). “Measuring tongue shapes and positions with ultrasound imaging: A validation experiment using an articulatory model,” Folia Phoniatrica et Logopaedica 64, 64–72.
21. Scobbie, J. M., Wrench, A. A., and van der Linden, M. L. (2008). “Head-probe stabilisation in ultrasound tongue imaging using a headset to permit natural head movement,” in the 8th International Seminar on Speech Production (ISSP 2008), pp. 373–376.
22. Stone, M. (2005). “A guide to analysing tongue motion from ultrasound images,” Clin. Linguist. Phonetics 19(6–7), 455–501.
23. Stone, M., and Davis, E. P. (1995). “A head and transducer support (hats) system for use in ultrasound imaging of the tongue during speech,” J. Acoust. Soc. Am. 98, 3107–3112.
24. Strycharczuk, P., Derrick, D., and Shaw, J. (2018). “How vocalic is vocalised /l/? Evidence from lateralization,” presented at the 16th Conference on Laboratory Phonology (LabPhon16), Lisbon, Portugal.
25. Tiede, M., and Whalen, D. H. (2015). “Getcontours: An interactive tongue surface extraction tool,” in Proceedings of Ultrafest VII, Hong Kong, 2 pp.
26. Whalen, D. H., Iskarous, K., Tiede, M. K., Ostry, D. J., Lehnert-LeHouilier, H., Vatikiotis-Bateson, E., and Hailey, D. S. (2005). “The Haskins Optically Corrected Ultrasound System (HOCUS),” J. Speech, Lang., Hear. Res. 48(3), 543–553.
27. Ying, J., Shaw, J. A., Best, C., Proctor, M., Derrick, D., and Carignan, C. (2016). “Articulation and representation of laterals in Australian-accented English,” in The 15th Conference on Laboratory Phonology, Ithaca, NY.
