The purpose of these acoustical patent reviews is to provide enough information for a Journal reader to decide whether to seek more information from the patent itself. Any opinions expressed here are those of the reviewers as individuals and are not legal opinions. Printed copies of United States Patents may be ordered at $3.00 each from the Commissioner of Patents and Trademarks, Washington, DC 20231. Patents are available via the Internet at http://www.uspto.gov.

  • GEORGE L. AUGSPURGER, Perception, Incorporated, Box 39536, Los Angeles, California 90039

  • ERIC E. UNGAR, Acentech, Incorporated, 33 Moulton Street, Cambridge, Massachusetts 02138

David John Palmer and Michael Ian Palmer, assignors to Wing Acoustics Limited

24 October 2017; filed 14 September 2016

This patent is very long, very ambitious, and very strange. According to the Claims, the invention is an audio transducer, earphone, or loudspeaker, having a rigid diaphragm suspended from one or more discrete hinges. The patent argues that this arrangement can achieve better performance than conventional half-roll or corrugated suspensions. The heart of the invention is the hinge design, which is quite sophisticated and would probably cost more to fabricate than a typical small loudspeaker. The illustration is a section view of a headphone in which the entire diaphragm assembly is supported by a single central hinge. Apparently, this is all meant to be taken seriously.—GLA

Chao Jiang et al., assignors to GOERTEK INC.

10 October 2017; filed 31 May 2013

The illustration is a section view through this planar diaphragm loudspeaker design. Rectangular diaphragm 1 is driven by multiple moving coil motors 5. The basic arrangement is not new, and the only novel feature appears to be the configuration of sound holes 7 and 8. Their combined area is specified as 20% to 60% of the diaphragm area.—GLA

Armin Schober et al., assignors to TDK Corporation

3 October 2017; filed 9 May 2012

According to this patent, it is difficult to maintain tight production tolerances in the fabrication of tiny microelectromechanical (MEMS) microphones used in cellular phones. To compensate for variations between units, a method is set forth in which the sensitivity of such a microphone can be adjusted after it is installed. Moreover, long-term stability can be maintained by calibrating the microphone each time the device is turned on. The calibration routine is carried out by using electrical signals only; no external sound source is required.—GLA

Peter John Frith et al., assignors to ESS Technology, Inc.

17 October 2017; filed 22 October 2016

This class-D power amplifier uses an unusual configuration of charge pumps to allow output voltage swings greater than the supply voltage. A full-bridge output stage is augmented by two additional half-bridge stages that include charge pumps. The patent explains that, unlike prior art, the charge pumps are switched on only during momentary peaks. Thus, relatively small boosting capacitors can be used.—GLA

Kim Spetzler Berthelsen et al., assignors to Analog Devices Global

7 November 2017; filed 12 December 2014

As can be seen in the diagram, this method of limiting loudspeaker cone excursion first splits the signal into high frequency and low frequency bands, then re-combines them after processing. The low frequency signal is subjected to “smart” limiting, whereas the high frequency signal is not modified.—GLA
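The split, limit, and recombine structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the one-pole crossover, the `alpha` coefficient, the `ceiling` value, and the hard clip standing in for the "smart" limiter are all hypothetical choices, not details taken from the patent.

```python
def split_bands(samples, alpha=0.1):
    """Split a signal with a one-pole low-pass (assumed crossover).

    The high band is the complementary residual, so low + high
    reconstructs the input when no limiting is applied.
    """
    low, high, state = [], [], 0.0
    for s in samples:
        state += alpha * (s - state)   # one-pole low-pass update
        low.append(state)
        high.append(s - state)         # whatever the low-pass left behind
    return low, high


def limit(band, ceiling):
    """Hard clip, used here as a stand-in for the patent's limiter."""
    return [max(-ceiling, min(ceiling, s)) for s in band]


def process(samples, ceiling=0.5, alpha=0.1):
    """Limit only the low band, pass the high band unmodified, recombine."""
    low, high = split_bands(samples, alpha)
    return [l + h for l, h in zip(limit(low, ceiling), high)]
```

Because cone excursion is dominated by low frequencies, limiting only the low band leaves the high-frequency content untouched, which is the point of the band split.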

Jean-Francois Rey, assignor to Alcatel Lucent

10 October 2017; filed 23 May 2007

A telephone conference setup may have to deal with several different kinds of terminals, including smartphones and laptop computers. When multiple microphones and multiple loudspeakers are used in the same room, echo canceling and suppression of acoustic feedback can become major problems. The conference bridge server described in this patent allows the participants at a given location to use their laptop computers as microphones but listen to a common conference speakerphone.—GLA

Alexey Leonidovich Ushakov, Moscow, Russian Federation

7 November 2017; filed 23 August 2016

This yoke-type headset assembly fits snugly and includes a noise reduction microphone array. “The microphones are used for determination of correlated and non-correlated components of audio signals. The correlated components are treated as a noise signal and the non-correlated components are a target signal.”—GLA

Zhiqiang Zhang and Jianhui Jiang, assignors to HUAWEI DEVICE CO., LTD.

10 October 2017; filed 21 April 2015

Is there really a pent-up demand for high-quality reception of 5.1 surround sound via tablet computers and smartphones? For computer gaming enthusiasts, the answer may be yes. This patent describes a surround sound transmission system that is compatible with existing two-channel audio. The method includes determining whether the headphone outlet is connected to a conventional headset or a multichannel headset and then processing the audio signal accordingly.—GLA

Fredrik Henn et al., assignors to Dolby International AB

17 October 2017; filed 14 March 2017

This Dolby patent describes an interesting mono-to-stereo encoding scheme; “…the invention bridges the gap between simple pseudo-stereo methods and current methods of true stereo coding by using a new form of parametric stereo coding.” The new feature is a stereo width parameter that is added to the mono sum of the original signal before transmission. The patent is clearly written and explains numerous variations of the basic invention.—GLA
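As a rough illustration of sending a mono sum together with a width parameter, the sketch below encodes one frame of a stereo signal. The function name `encode_frame` and the definition of width as the side-signal share of total energy are assumptions made for this example; the patent's actual parameter definition may differ.

```python
def encode_frame(left, right):
    """Return (mono_sum, width) for one frame of stereo samples.

    width is an assumed measure: 0.0 for identical channels,
    1.0 when the channels are equal and opposite.
    """
    mid = [(l + r) / 2 for l, r in zip(left, right)]    # mono sum
    side = [(l - r) / 2 for l, r in zip(left, right)]   # difference signal
    e_mid = sum(m * m for m in mid)
    e_side = sum(s * s for s in side)
    total = e_mid + e_side
    width = e_side / total if total > 0 else 0.0
    return mid, width
```

Only `mid` and the single `width` value per frame would need to be transmitted; a decoder would then have to synthesize a stereo image of that width from the mono signal, which is not attempted here.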

Mikko T. Tammi and Miikka T. Vilermo, assignors to Nokia Technologies Oy

17 October 2017; filed 31 March 2015

This is a continuation of earlier Nokia patent No. 9,055,371. Both patents make use of microphone array processing to provide enhanced stereo recording from a smartphone. Using modern digital technology, signals from three or more closely spaced microphones can be processed to create an accurate two-channel stereo recording. In this case, however, a third virtual microphone is created to provide information about the dominant source location. According to the patent, the resulting enhanced stereo recording can be accurately converted to other formats, including binaural headphone listening and multi-channel surround sound.—GLA

Fredrik Henn et al., assignors to Dolby International AB

24 October 2017; filed 14 March 2017

The system described in this patent is almost a twin of patent No. 9,792,919 reviewed above. It features an encoded stereo balance parameter rather than a stereo width parameter.—GLA

Alexander Adami et al., assignors to Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.; Technische Universitaet Ilmenau

31 October 2017; filed 15 May 2015

The loudspeaker locations in a home surround-sound installation may differ considerably from the preferred locations. Prior art includes several methods for processing individual channels to create virtual sound images at the correct locations. The method described here first combines adjacent channels into segments for further processing, which includes decomposing each segment into direct and ambience components.—GLA

Abdessattar Abdelkefi et al., assignors to SAMSUNG ELECTRONICS CO., LTD.; VIRGINIA TECH INTELLECTUAL PROPERTIES, INC.

10 October 2017; filed 5 January 2015

This patent pertains to a means for obtaining electrical energy from vibrations using small piezoelectric, electrodynamic, or electrostatic elements attached to a vibrating surface. Most of the exemplary embodiments shown in the patent document are in essence two-degree-of-freedom dynamic systems, optimized for power generation in selected frequency ranges.—EEU
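For readers unfamiliar with two-degree-of-freedom systems, the sketch below computes the two undamped natural frequencies of a simple spring-mass chain. The layout (base, spring `k1`, mass `m1`, spring `k2`, mass `m2`) and all parameter names are assumptions chosen for illustration; the patent's embodiments are not necessarily arranged this way, and damping and the electrical load are ignored.

```python
import math


def natural_frequencies(m1, m2, k1, k2):
    """Return the two undamped natural frequencies in rad/s, lowest first.

    Assumed layout: base -- k1 -- m1 -- k2 -- m2.
    """
    # Characteristic equation in w = omega**2:
    #   m1*m2*w**2 - (m1*k2 + m2*(k1 + k2))*w + k1*k2 = 0
    a = m1 * m2
    b = -(m1 * k2 + m2 * (k1 + k2))
    c = k1 * k2
    root = math.sqrt(b * b - 4 * a * c)
    w_low = (-b - root) / (2 * a)
    w_high = (-b + root) / (2 * a)
    return math.sqrt(w_low), math.sqrt(w_high)
```

Choosing the masses and stiffnesses places these two resonances, which is one way such a device could be tuned toward the frequency ranges where the host surface vibrates most.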

Takanori Sato, assignor to IDEAL BRAIN CO., LTD.

3 October 2017; filed 25 September 2014

Window panels are mounted in frames which are resiliently connected to a building's structural walls. The window panels' assemblies work as dynamic absorbers to reduce the vibrations of the building due to earthquakes, wind, and other horizontally acting disturbances.—EEU

Xuedong Chen et al., assignors to Huazhong University of Science and Technology

3 October 2017; filed 11 November 2016

This patent describes an isolation system for enabling precision machining and the like in the presence of small disturbances. The system consists of a payload-carrying platform that is connected to a base plate via six angled isolator struts as shown in the attached figure. Some of the struts are passive (spring and damper) isolators and some are active isolators. The latter include piezoelectric sensors and actuators and employ controllers to provide signals to the actuators according to an algorithm that reduces the transmitted vibrations along three orthogonal axes and rotations about these axes.—EEU

Matthew Sharifi and Jakob Nicolaus Foerster, assignors to Google Inc.

24 October 2017; filed 28 January 2016

The patent tells how to match your computer's speech output to your ability to understand what it is trying to say. In other words, the patent is about dumbing down your machine so you can understand it. The synthesis system would include various vocabularies, grammars, and language models. In addition to selecting the most appropriate language, the system would also be able to choose vocabulary items or grammatical constructions suitable for a particular user. The examples mainly describe short, telegraphic phrases, rather than complete sentences.—DLR

Susumu Takatsuka, assignor to Sony Mobile Communications Inc.

7 November 2017; filed 25 March 2009

The patent addresses one of the most basic issues in speech synthesis, that is, the divide between “cut and paste” systems, in which fragments of natural speech are joined together, versus synthesis-by-rule, in which the entire speech waveform is constructed from some form of abstract description of the phonemes and how to put them together. More precisely, the issue is addressed, but not solved. The method described seems to settle on additional information being gathered to enhance the playback of edited natural speech fragments.—DLR

Seok Jin Hong et al., assignors to Samsung Electronics Co., Ltd.

3 October 2017; filed 5 February 2015

This speech recognition engine, intended for tablets, cell phones, and so forth, is based on a method of mapping phonemes into an n-dimensional space. Words are encoded by this phonetic mapping system into a multi-dimensional representation, which is then passed to a deep learning network for identification.—DLR

Hoshik Lee, assignor to Samsung Electronics Co., Ltd.

3 October 2017; filed 31 March 2015

This speech recognition system is designed for the situation where multiple speakers may be present in a room, and some speech in the room may not be intended as commands or otherwise to be recognized. In general, such conflicts are resolved based on time of arrival of various utterances, from which a command priority is established.—DLR

Michael Patrick Johnson et al., assignors to Google Inc.

3 October 2017; filed 17 August 2015

This speech recognition system uses vibration sensors, in addition to audio channel decoding, to make decisions about which persons are speaking, which sounds need to be decoded, and finally, which are to be acted on. In particular, the system is designed to take into account whether a user is wearing a head-mounted video device. A number of styles of head-mounted devices are considered.—DLR

Ciprian I. Chelba et al., assignors to Google Inc.

10 October 2017; filed 2 May 2013

The patent discusses in great detail how “language sequences” are to be handled. The idea is that it may be possible to quickly recognize a number of frequently occurring word sequences, allowing a rapid response when one such sequence is found. If no such common word sequence is identified, the system reverts to a second analysis method, which may still encounter commonly occurring items, though not embedded in a common phrase.—DLR

Michael R. Longé et al., assignors to Nuance Communications, Inc.

10 October 2017; filed 14 November 2013

The patent describes a speech recognition engine designed to be used together with other information gathering devices, such as keyboard, mouse, tablet, etc. The advent of speech recognition was often hailed as a way of freeing the user from the need for these devices. Things did not always work out that way. This patent deals with methods for integrating speech input with information from other devices in order to provide a more accurate overall system.—DLR

Craig L. Reding and Suzi Levas, assignors to Google Inc.

10 October 2017; filed 10 August 2016

The patent deals with the establishment of a centralized speech recognition facility intended to service users over telephone or internet connections. The recognition technologies discussed appear to be widely used. The patent discusses protocols, command sets, voice dialing, response times, and so forth.—DLR

Shuji Miyasaka and Kazutaka Abe, assignors to SOCIONEXT INC.

17 October 2017; filed 25 June 2015

This patent is based on the seemingly simple task of deciding when to speak and when to listen. This is in the context of a speech recognition system set up to control an audio device, typically a TV set. A microphone is also available for collecting speech from a user. The system must at all times decide whether to play sounds from the TV, whether to echo spoken commands, or whether to just shut up and listen. Sounds easy.—DLR

Hassan Sawaf et al., assignors to eBay Inc.

24 October 2017; filed 26 October 2009

The patented language translation system, referred to as a hybrid, combines multiple methods of translation. Intended for monitoring media items, such as news broadcasts, the system would be able to integrate information from many sources. To do this, the system combines so-called “rule-based” systems, using syntactic and semantic linguistic models, with statistical approaches, using either word searches or deep network analysis of word patterns.—DLR

Steven C. Dzik and Guy A. Story, Jr., assignors to Audible, Inc.

24 October 2017; filed 3 August 2015

The patent describes several methods for comparing audio and text information, such as while playing an audio book and simultaneously displaying the written form. Using text analysis together with word recognition, the goal is to be able to display an item related to the audio stream, without requiring the user to operate the display to keep the items in sync.—DLR

Jay Gainsboro et al., assignors to Securus Technologies, Inc.

24 October 2017; filed 7 September 2011

The patent mainly deals with monitoring data, including speech such as phone calls, from criminal networks and from past and present prison populations, with the aim of detecting scams, escape plans, imposters, etc. A system is described by which an investigator's personal ID can be encoded into a phone call to provide legal evidence of monitoring activities.—DLR

Tak M. Ko and Dragan Zigic, assignors to Change Healthcare LLC

31 October 2017; filed 29 June 2012

The patent describes methods for transcribing spoken material and how to annotate the written material to allow simultaneous display of the written material to match an ongoing audio stream.—DLR

Fadi Biadsy and Diamantino Antonio Caseiro, assignors to Google Inc.

31 October 2017; filed 8 April 2015

Although it is not clear from the patent title, this patent is concerned with the use and maintenance of language models used in speech recognition. Specifically, the patent is concerned with the question of exactly what happens if a particular word has never been seen in a particular context. Cases are described in which such a non-occurring word can, in fact, end up with a higher probability of fitting the context than another word which had occurred in the context, but was considered very unlikely.—DLR
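The effect described above can arise in a conventional backoff language model. The sketch below uses absolute discounting with backoff to unigram frequencies; this particular model, the `discount` value, and the toy counts in the usage note are assumptions for illustration and are not taken from the patent.

```python
def backoff_prob(word, bigram_counts, unigram_counts, discount=0.5):
    """P(word | context) under absolute discounting with unigram backoff.

    bigram_counts: counts of words seen after the context.
    unigram_counts: overall word counts (the assumed backoff distribution).
    """
    total = sum(bigram_counts.values())
    if word in bigram_counts:
        # Seen in this context: discounted relative frequency.
        return (bigram_counts[word] - discount) / total
    # Probability mass freed by discounting every seen word type.
    reserved = discount * len(bigram_counts) / total
    uni_total = sum(unigram_counts.values())
    # Share of unigram mass belonging to words unseen in this context.
    unseen_mass = sum(
        c for w, c in unigram_counts.items() if w not in bigram_counts
    ) / uni_total
    return reserved * (unigram_counts[word] / uni_total) / unseen_mass
```

With hypothetical counts `{"a": 9, "quark": 1}` after some context and overall counts `{"the": 50, "of": 30, "a": 19, "quark": 1}`, the word "quark", seen once in the context, gets a probability of about 0.05, while "the", never seen in the context, gets about 0.0625 because it is so frequent overall.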

Sung Joo Lee et al., assignors to Electronics and Telecommunications Research Institute

31 October 2017; filed 12 February 2016

The patent describes a very specific application of a deep neural network system to the issue of speech recognition. More specifically, a system called a “context-dependent deep neural network hidden Markov model algorithm” is described. The patent describes a way of improving the accuracy of such a system. The patent text deals almost entirely with details of the Mel-frequency hidden Markov model tuning and rarely mentions the deep neural network.—DLR

Gakuto Kurata et al., assignors to International Business Machines Corporation

7 November 2017; filed 23 September 2015

The patent addresses issues in training a continuous speech recognizer from multiple speakers, particularly in the context of a telephone call center, when a recording of the ongoing speech activity may include cases of overlapped speech from multiple speakers. The training methods presented are mainly concerned with additional training using speech models containing examples of simultaneous speech.—DLR

Chang-Heon Lee and KyuSeop Bang, assignors to HYUNDAI MOTOR COMPANY

7 November 2017; filed 5 December 2014

To my knowledge, there has not yet been an automobile accident in which the proximate cause was a misrecognition of a speech utterance spoken by some person in the vehicle. But that event surely will happen. This patent does not directly address the question of automobile control but does deal with issues of recognition accuracy and how that may be improved. The primary approach is the presentation of a number of strategies for re-examining a received utterance with various different recognition tools, thus not requiring the speaker to repeat what was said. Various strategies, some involving further interaction with vehicle occupants, are discussed.—DLR

Charles Corfield, assignor to nVoq Incorporated

7 November 2017; filed 4 March 2015

The patent discusses methods and issues involved in changing the language models used by a recognizer during system use, avoiding the necessity of shutting down the system to switch to a different language model and/or speaker profile.—DLR

Chao Li and Zhijian Wang, assignors to Baidu Online Network Technology (Beijing) Co., Ltd.

17 October 2017; filed 23 December 2015

The patent describes a particular protocol that would be followed in the process of verifying the speaker's identity. Intended for use with a mobile device, the system would allow users to make payments by voice command. The system would first present a prompt item, such as, in this case, one or more Chinese characters. Reminiscent of a “captcha” type of test, the user would be expected to respond in a particular way. The system would compare the response to training items previously provided.—DLR