The purpose of these acoustical patent reviews is to provide enough information for a Journal reader to decide whether to seek more information from the patent itself. Any opinions expressed here are those of the reviewers as individuals and are not legal opinions. Printed copies of United States Patents may be ordered at $3.00 each from the Commissioner of Patents and Trademarks, Washington, DC 20231. Patents are available via the Internet at http://www.uspto.gov.
Reviewers for this issue:
GEORGE L. AUGSPURGER, Perception, Incorporated, Box 39536, Los Angeles, California 90039
ERIC E. UNGAR, Acentech, Incorporated, 33 Moulton Street, Cambridge, Massachusetts 02138
9,826,307: 43.38.Hz MICROPHONE ARRAY INCLUDING AT LEAST THREE MICROPHONE UNITS
Hiroshi Kubota and Tomohiro Jonan, assignors to TOA CORPORATION
21 November 2017; filed 11 June 2013
The steerable microphone array described in this patent might be used in a smartphone for improved pickup during hands-free operation. Although elaborate instructions are given for microphone placement, it is difficult to distinguish these from established beam-forming design. A more interesting feature is the ability to estimate both the talker's angular position and distance. If the talker is in the near field of the array, individual delays are adjusted to compensate for the resulting errors.—GLA
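The near-field correction mentioned in this review can be pictured with a minimal sketch. It is not taken from the patent; the function names, the two-dimensional geometry, and the 343 m/s sound speed are assumptions made for illustration. The first function uses exact source-to-microphone distances, the second uses the usual plane-wave (far-field) approximation.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def near_field_delays(mic_positions, source_position):
    """Per-microphone delays (seconds) that time-align a point source.

    Exact distances are used, so the result remains valid when the talker
    is close to the array. The farthest microphone gets zero delay.
    """
    distances = [math.dist(m, source_position) for m in mic_positions]
    farthest = max(distances)
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]


def far_field_delays(mic_positions, angle_rad):
    """Plane-wave approximation for a 2-D array: only the arrival angle matters."""
    ux, uy = math.cos(angle_rad), math.sin(angle_rad)
    # Projection of each microphone onto the unit vector pointing at the source.
    projections = [x * ux + y * uy for x, y in mic_positions]
    # A larger projection means the wavefront arrives earlier, so more delay is needed.
    smallest = min(projections)
    return [(p - smallest) / SPEED_OF_SOUND for p in projections]
```

For a talker directly in front of a three-microphone line array, the far-field delays are essentially zero, while the near-field delays give the center microphone a small extra delay. The difference between the two sets is the error that a distance estimate allows the array to remove.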
9,820,050: 43.38.Ja BALANCED PUSH-PULL LOUDSPEAKER DEVICE, A CONTROL METHOD THEREOF, AND AN AUDIO PROCESSING CIRCUIT
Chia-Yu Wu et al., assignors to AMTRAN TECHNOLOGY CO., LTD.
14 November 2017; filed 25 January 2017
This loudspeaker design is a good example of wishful wizardry in full bloom. Two loudspeakers are used to reproduce two-channel stereo. The speakers are mounted in a small closed box, sharing the same back chamber. Signal processing circuitry sends high frequencies from left and right channels to the left and right loudspeakers, but low frequencies are combined into a common low frequency channel. This is all well established prior art, but a novel feature is added: an inverter is included in one low frequency drive signal. Thus, at low frequencies, the speakers are driven “push-pull,” allowing the back chamber volume to be reduced to almost nothing. Of course, the resulting low frequency output is also reduced to almost nothing. That side effect is not mentioned in the patent.—GLA
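The reviewer's point about lost bass can be checked with a rough calculation. The sketch below is an idealization, not a model of the patented device: it treats the two drivers as point sources in free space, ignores the enclosure, and assumes a sound speed of 343 m/s. The function name and the example spacing are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def push_pull_loss_db(frequency_hz, spacing_m):
    """Level of an out-of-phase source pair relative to the same pair in phase (dB).

    Evaluated in the far field on the line through both sources, which is the
    direction in which the out-of-phase pair radiates most strongly.
    Requires frequency_hz > 0 and a spacing well below half a wavelength.
    """
    half_kd = math.pi * frequency_hz * spacing_m / SPEED_OF_SOUND
    out_of_phase = 2.0 * abs(math.sin(half_kd))  # |1 - exp(-j k d)|
    in_phase = 2.0 * abs(math.cos(half_kd))      # |1 + exp(-j k d)|
    return 20.0 * math.log10(out_of_phase / in_phase)
```

With drivers 0.15 m apart at 60 Hz, `push_pull_loss_db(60.0, 0.15)` comes out a little more than 20 dB below the in-phase case, and the loss grows as frequency falls.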
9,813,836: 43.38.Si PROVIDING VOICES OF PEOPLE IN A TELEPHONE CALL TO EACH OTHER IN A COMPUTER-GENERATED SPACE
Glen A. Norris, Tokyo, Japan and Philip Scott Lyren, Hong Kong, China
7 November 2017; filed 11 August 2017
This patent is concerned mainly with wireless telephone communication between two people, but the patent also covers audio conference setups with several participants. For a two-person telephone conversation, at least one person must wear a headset equipped with a head-tracking device. Using known techniques for binaural signal processing, the incoming mono signal is placed at a virtual location at least one meter from the listener's head—an empty chair perhaps. Moreover, the image remains localized at the chair even if the listener moves. To most people the idea probably seems frivolous, but it could be made to work.—GLA
9,819,805: 43.38.Si COMMUNICATION DEVICE WITH ECHO SUPPRESSION
Svend Feldt et al., assignors to SENNHEISER COMMUNICATIONS A/S
14 November 2017; filed 3 July 2014
This Sennheiser patent is concerned mainly with speakerphones, but can be applied to other situations where acoustic feedback may be a problem—a karaoke system, for example. A method of echo suppression is described which is based on earlier U.S. Patent #3,622,714. In both patents, a pair of complementary comb filters is used to process the transmitting (loudspeaker) and receiving (microphone) signals, frame by frame. Such filtering must be done very carefully to maintain acceptable audio quality. Several improvements are described here that can minimize audible artifacts.—GLA
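The idea of complementary comb filtering can be illustrated with a minimal sketch. It assumes ideal on/off gains per frequency bin and hypothetical function names; neither patent's actual filter shapes or frame processing is reproduced here.

```python
def complementary_comb_masks(num_bins, tooth_width):
    """Return (loudspeaker_mask, microphone_mask) as lists of per-bin gains.

    Bins are grouped into teeth of `tooth_width` bins. Even-numbered teeth pass
    in the loudspeaker path and odd-numbered teeth pass in the microphone path.
    """
    loudspeaker = [
        1.0 if (k // tooth_width) % 2 == 0 else 0.0 for k in range(num_bins)
    ]
    microphone = [1.0 - g for g in loudspeaker]
    return loudspeaker, microphone


def apply_mask(spectrum, mask):
    """Scale each frequency bin of one frame by its mask gain."""
    return [x * g for x, g in zip(spectrum, mask)]
```

Because the product of the two masks is zero in every bin, sound leaving the loudspeaker cannot return through the microphone path at any frequency. The cost is that each path loses half its spectrum, which is why careful design is needed to keep the result listenable.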
9,820,029: 43.38.Si DEVICE-ADAPTABLE AUDIO HEADSET
Jens Kristian Poulsen, assignor to BLACKBERRY LIMITED
14 November 2017; filed 17 April 2015
According to this patent, some portable audio devices are intended to work with “smart” headsets in which coded audio signals are used to perform various functions, such as “volume up,” “volume down,” etc. Other audio devices rely on passive signaling for similar purposes. A universal headset is described that, when plugged in, can determine the type of device in use and respond appropriately.—GLA
9,832,304: 43.38.Si MEDIA DELIVERY PLATFORM
John Mikkelsen and Robert Freidson, assignors to SKKY, LLC
28 November 2017; filed 1 April 2015
The invention described here is a smartphone optimized to receive and play audio or digital clips with or without an internet connection. Extensive prior art exists—the patent document includes more than six pages of citations—making it difficult to create anything sufficiently novel to justify a patent. The invention is a collection of specific requirements including multiple digital signal processors, a receiver configured to receive an orthogonal frequency-division multiplex data transmission from one or more servers, provisions for storage and playback, etc., etc.—GLA
9,832,588: 43.38.Si PROVIDING A SOUND LOCALIZATION POINT IN EMPTY SPACE FOR A VOICE DURING AN ELECTRONIC CALL
Glen A. Norris, Tokyo, Japan and Philip Scott Lyren, Hong Kong, China
28 November 2017; filed 11 August 2017
This patent is a nit-picky successor to U.S. Patent #9,813,836 reviewed above. In fact, it is the twelfth in a series of continuations and resulting patents. In this case the patent suggests a number of ways in which a phantom location for the voice of an incoming caller can be automatically assigned.—GLA
9,826,325: 43.38.Tj SYSTEM FOR NETWORKED ROUTING OF AUDIO IN A LIVE SOUND SYSTEM
Adam Holladay et al., assignors to HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
21 November 2017; filed 7 December 2011
Computer engineers confidently predict that artificial intelligence will eventually achieve self-awareness, and the invention described in this patent may be a harbinger of things to come. It is a sound system that designs itself. A modern sound system for a touring show actually consists of two or three interconnected systems that must be reconfigured for each venue. The patent explains how various methods, calculations, graphical interfaces, and test procedures can be performed automatically.—GLA
9,820,072: 43.38.Vk PRODUCING A MULTICHANNEL SOUND FROM STEREO AUDIO SIGNALS
Martin Mieth and Udo Zölzer, assignors to HELMUT-SCHMIDT-UNIVERSITÄT UNIVERSITÄT DER BUNDESWEHR HAMBURG; HAMBURG INNOVATION GMBH
14 November 2017; filed 29 August 2013
At least a half-dozen methods have been proposed for converting standard two-channel stereo into multi-channel surround sound. The method described here is anything but simple, but it seems to be well thought out, and the patent provides a thorough explanation of the process. The two input channels are first compared on the basis of relative power, autocorrelation, and cross correlation. That information is used to derive partial similarity functions, then calculate first and second panning coefficients. The process is continued as needed to create signals for all front and surround channels.—GLA
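The first analysis step can be sketched generically as follows. This is one common way to turn channel powers and cross correlation into similarity measures, offered as an assumption; the patent's own definitions and its panning coefficients may differ. All names are hypothetical.

```python
def channel_similarity(left, right):
    """Compare one frame of left and right samples.

    Returns the overall similarity (1 for identical channels, 0 for
    uncorrelated ones) and the two partial similarities, each normalized
    by the power of a single channel.
    """
    power_left = sum(x * x for x in left)    # zero-lag autocorrelation
    power_right = sum(y * y for y in right)  # zero-lag autocorrelation
    cross = abs(sum(x * y for x, y in zip(left, right)))
    eps = 1e-12  # guards against division by zero on silent frames
    return {
        "similarity": 2.0 * cross / (power_left + power_right + eps),
        "partial_left": cross / (power_left + eps),
        "partial_right": cross / (power_right + eps),
    }
```

For a source panned toward the left, `partial_left` falls below 1 while `partial_right` rises above 1, so the difference between the two indicates the panning direction and the overall similarity indicates how far from center the source sits.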
9,786,832: 43.40.-r ENERGY HARVESTER
Abdessattar Abdelkefi et al., assignors to SAMSUNG ELECTRONICS CO., LTD.; VIRGINIA TECH INTELLECTUAL PROPERTIES, INC.
10 October 2017; filed 5 January 2015
This patent pertains to a means for obtaining electrical energy from vibrations by use of small piezoelectric, electrodynamic, or electrostatic elements attached to a vibrating surface. Most of the exemplary embodiments shown in the patent document are in essence two-degree-of-freedom dynamic systems that are optimized for power generation in selected frequency ranges.—EEU
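The tuning of a two-degree-of-freedom system can be illustrated with a short calculation. The sketch assumes an undamped chain (base, spring k1, mass m1, spring k2, mass m2) and hypothetical names; the patent's embodiments include transducer coupling and damping that are ignored here. The natural frequencies follow from the characteristic equation m1·m2·ω⁴ − [m1·k2 + m2·(k1 + k2)]·ω² + k1·k2 = 0.

```python
import math


def two_dof_natural_frequencies_hz(m1, m2, k1, k2):
    """Return the two undamped natural frequencies (Hz), lower one first.

    Masses are in kg and stiffnesses in N/m. All four values must be positive.
    """
    a = m1 * m2
    b = -(m1 * k2 + m2 * (k1 + k2))
    c = k1 * k2
    root = math.sqrt(b * b - 4.0 * a * c)
    omega_squared = ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a))
    return tuple(math.sqrt(w) / (2.0 * math.pi) for w in omega_squared)
```

Choosing the masses and stiffnesses moves the two resonances, which is how such a harvester can be made to respond strongly in a selected frequency range.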
9,777,474: 43.40.Tm VIBRATION CONTROL WALL STRUCTURE
Takanori Sato, assignor to IDEAL BRAIN CO., LTD.
3 October 2017; filed 25 September 2014
Window panels are mounted in frames which are resiliently connected to a building's structural walls. The window panel assemblies work as dynamic absorbers to reduce the vibrations of the building due to earthquakes, wind, and other horizontally acting disturbances.—EEU
9,810,917: 43.40.Tm PASSIVE DAMPING FOR OPTICAL IMAGE STABILIZATION
Aurelien R. Hubert and Douglas S. Brodie, assignors to Apple, Inc.
7 November 2017; filed 2 June 2014
Vibrations may be induced in an optical system (such as a camera) as components are made to move relative to each other, for example due to automatic focus adjustment. These vibrations are attenuated by the inclusion of a viscoelastic material, such as silicone gel, between selected moving and stationary parts of the system.—EEU
9,777,793: 43.40.Vn SIX-DEGREE-OF-FREEDOM MICRO VIBRATION SUPPRESSION PLATFORM AND CONTROL METHOD THEREOF
Xuedong Chen et al., assignors to Huazhong University of Science and Technology
3 October 2017; filed 11 November 2016
This patent describes an isolation system for enabling precision machining and the like in the presence of small disturbances. The system consists of a payload-carrying platform that is connected to a base plate via six angled isolator struts as shown in the attached figure. Some of the struts are passive (spring and damper) isolators and some are active isolators. The latter include piezoelectric sensors and actuators and employ controllers to provide signals to the actuators according to an algorithm that reduces the transmitted vibrations along three orthogonal axes and rotations about these axes.—EEU
9,820,065: 43.58.Ry METHOD AND APPARATUS TO EVALUATE AUDIO EQUIPMENT FOR DYNAMIC DISTORTIONS AND OR DIFFERENTIAL PHASE AND OR FREQUENCY MODULATION EFFECTS
Ronald Quan, Cupertino, CA
14 November 2017; filed 4 January 2017
This patent is the culmination of eight earlier patents, each a continuation of its predecessor. Unusual tests for audio equipment are described. For example, the inventor reports that intermodulation (“cross modulation”) distortion can be interpreted and measured as wow and flutter. Other sections of the patent explain how certain TV test methods can be adapted for audio frequencies. Practical applications are debatable, but those involved with audio testing will find the patent interesting.—GLA
9,837,084: 43.72.Ja STREAMING ENCODER, PROSODY INFORMATION ENCODING DEVICE, PROSODY-ANALYZING DEVICE, AND DEVICE AND METHOD FOR SPEECH SYNTHESIZING
Sin-Horng Chen et al., assignors to NATIONAL CHIAO TUNG UNIVERSITY
5 December 2017; filed 30 January 2014
A difficult aspect of speech synthesis is the issue of producing a natural-sounding pitch contour. This task is more difficult for tone languages, such as Chinese, in which the pitch contour must carry both phonemic and expressive signals. This patent deals primarily with low-level issues, such as the implementation of pitch effects in particular phonetic segments, but also discusses models for applying prosodic features at multiple levels.—DLR
9,818,398: 43.72.Ne DETECTING POTENTIAL SIGNIFICANT ERRORS IN SPEECH RECOGNITION RESULTS
William F. Ganong, III et al., assignors to Nuance Communications, Inc.
14 November 2017; filed 15 May 2015
The primary message of this patent is that redundancy helps in decoding information. In speech recognition, additional information is often available, possibly in various forms, that could help in understanding what was spoken. For example, in medical cases, a reader checking the recognizer results may not be familiar with a particular drug, but other persons can be consulted to supply such information. Additional recordings may also be available, allowing additional recognition passes.—DLR
9,818,399: 43.72.Ne PERFORMING SPEECH RECOGNITION OVER A NETWORK AND USING SPEECH RECOGNITION RESULTS BASED ON DETERMINING THAT A NETWORK CONNECTION EXISTS
Craig L. Reding and Suzi Levas, assignors to Google, Inc.
14 November 2017; filed 23 May 2016
The patent describes several situations in which network facilities may be called into play to help recognize an utterance spoken at a local terminal. The speech signal will first be converted into a more compact and parametric form for further recognition. This may be done at the local level, but will typically be done at a higher network level. These results may then be sent to one or more stations which are able to apply different recognition models, apply various single or multi-speaker analyses, and consult multiple dictionaries or databases to help with the recognition.—DLR
9,818,400: 43.72.Ne METHOD AND APPARATUS FOR DISCOVERING TRENDING TERMS IN SPEECH REQUESTS
Matthias Paulik et al., assignors to Apple, Inc.
14 November 2017; filed 28 August 2015
The patented system to assist in speech recognition would monitor various on-line sources for current trends in speech patterns that could be applied to improve recognition results. Sources monitored could include news sources, social networks, and perhaps search terms. An example cited in the patent would be the use of persons' names to refer to hurricanes. During the time such a storm is active, it may be referred to by name without mentioning that the topic is the storm. Little is said about the dictionary systems by which such information could be made useful.—DLR
9,818,401: 43.72.Ne SYSTEMS AND METHODS FOR ADAPTIVE PROPER NAME ENTITY RECOGNITION AND UNDERSTANDING
Harry William Printz, assignor to Promptu Systems Corporation
14 November 2017; filed 19 September 2016
Persons' names, place names, and business names are often missing from the dictionaries used by speech recognition systems. The patent describes a variety of special processes and techniques for decoding such utterances and gives considerable detail on the processing methods that can be applied to isolate and decode such special cases. The patent is unusual in this reader's experience in that it includes a glossary of the special terms used.—DLR
9,818,403: 43.72.Ne SPEECH RECOGNITION METHOD AND SPEECH RECOGNITION DEVICE
Tsuyoki Nishikawa, assignor to PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
14 November 2017; filed 27 October 2015
The patent describes a system for speech recognition with the goal of operating one or more pieces of equipment. The emphasis is on using methods of sound source localization in order to isolate and verify the particular speech source. Various methods of beam forming and echo cancellation are discussed in detail.—DLR
9,824,684: 43.72.Ne PREDICTION-BASED SEQUENCE RECOGNITION
Dong Yu et al., assignors to MICROSOFT TECHNOLOGY LICENSING, LLC
21 November 2017; filed 22 December 2014
The patent discusses speech recognition as an example of a more general issue of sequence recognition, that is, dealing with information that arrives in a time sequence. One element in such recognition tasks is the application of neural network architectures, and deep networks in particular. One aspect of such networks is described as a “bottleneck” layer, which serves to reduce the dimensionality of the information flow through the system.—DLR
9,824,687: 43.72.Ne SYSTEM AND TERMINAL FOR PRESENTING RECOMMENDED UTTERANCE CANDIDATES
Komei Sugiura et al., assignors to National Institute of Information and Communications Technology
21 November 2017; filed 1 July 2013
The goal of this speech recognition/synthesis system is to determine, as quickly as possible, the topic of the conversation under consideration. As the system narrows the range of possible topics at hand, it is able to improve the recognition result for additional utterances. The user is guided in this process by hints or clues provided by the system, based on the currently estimated most-likely topics. The assumption made here is that providing such hints will lead to a more successful interaction, including translating the ongoing dialogue to another language.—DLR
9,824,688: 43.72.Ne METHOD FOR CONTROLLING SPEECH-RECOGNITION TEXT-GENERATION SYSTEM AND METHOD FOR CONTROLLING MOBILE TERMINAL
Kazuki Funase and Atsushi Sakaguchi, assignors to PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
21 November 2017; filed 6 July 2015
The patented speech processing system would be used to capture speech phrases uttered during a business meeting and provide techniques for editing the captured phrases into a suitable record of the minutes of the meeting. Editing is supported by visual signals, such as shading or coloring of selected portions of the text, and by movements of the hand-held device, which allow various portions of the text to be saved or otherwise edited.—DLR
9,830,318: 43.72.Ne SIMULTANEOUS TRANSLATION OF OPEN DOMAIN LECTURES AND SPEECHES
Alexander Waibel, assignor to Facebook, Inc.
28 November 2017; filed 22 November 2016
This speech recognition and translation system is intended for the translation of a talk or presentation to a small group of listeners familiar with the subject matter. It is expected that a certain amount of audience interaction could be used by the recognition system to improve the overall ability of the system to correctly identify and translate utterances made during the speaker's presentation. Preparatory material, such as an announced meeting topic, may be used to improve system performance.—DLR
9,830,905: 43.72.Ne SYSTEMS AND METHODS FOR FEATURE EXTRACTION
Wenliang Lu and Dipanjan Sen, assignors to QUALCOMM Incorporated
28 November 2017; filed 24 June 2014
The patent provides a detailed description of a speech recognition system based on a cochlear processing model. The speech waveform is separated into time and place signals. Methods are described for isolating near-field versus background signals. Feature extraction is described for identifying voiced and voiceless segments and for identifying vowel and consonant elements in the speech signal. Aspects of speech signal processing in the human brain are applied throughout to evaluate and guide operation of the recognition system.—DLR
9,830,906: 43.72.Ne SPEECH RECOGNITION CONTROL DEVICE
Takashi Inose and Shinobu Nakamura, assignors to Kojima Industries Corporation
28 November 2017; filed 8 April 2014
This speech recognition system is intended for use in an automobile and is designed to detect and isolate utterances spoken by persons seated in various positions in the vehicle. Multiple microphones would pick up audio signals with different time delays, allowing the separation of speech streams spoken simultaneously by multiple vehicle occupants. As speech signals from various positions are isolated, a priority system can be applied to determine which user's speech will be recognized for controlling certain vehicle conditions, such as temperature control settings.—DLR
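Time-delay differences between microphones are commonly estimated by cross correlation. The sketch below shows that generic technique only, with hypothetical names; the patent's own separation method is not described in enough detail here to reproduce.

```python
def estimate_delay_samples(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    A positive result means the sound reached the second microphone later
    than the first. Dividing by the sample rate converts the lag to seconds.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(reference):
            m = n + lag
            if 0 <= m < len(delayed):
                score += x * delayed[m]  # cross correlation at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Each seat in the vehicle produces a characteristic set of lags across the microphone pairs, so an estimated lag can be matched against the seat positions to decide who is speaking.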
9,837,070: 43.72.Ne VERIFICATION OF MAPPINGS BETWEEN PHONEME SEQUENCES AND WORDS
Fuchun Peng et al., assignors to Google, Inc.
5 December 2017; filed 21 February 2014
Phoneme or word sequences produced by different speech recognizers are compared to identify and correct errors in the recognition. The patent simply states that one transcription would be different from another. It is not clear in what way the recognizers differ so as to result in alternate analyses. One statement says only that “the second transcription of the utterance may be obtained from a second speech engine.”—DLR
9,837,092: 43.72.Ne CLASSIFICATION BETWEEN TIME-DOMAIN CODING AND FREQUENCY DOMAIN CODING
Yang Gao, assignor to HUAWEI TECHNOLOGIES CO., LTD.
5 December 2017; filed 11 May 2017
The patent presents a detailed discussion of the various aspects of speech analysis, covering many of the widely used techniques of linear prediction analysis and spectral encoding methods. It contrasts time-domain and frequency-domain coding and presents what is said to be an improved time-domain encoder.—DLR