Head-Related Transfer Function and Virtual Auditory Display

Retail Price: $99.95

Direct Price: $89.95

Second Edition
By Bosun Xie
Hardcover, 7×10, 504 pages
ISBN: 978-1-60427-070-9
June 2013


A Title in J. Ross Publishing’s Acoustics: Information and Communication Series
Series Editor: Dr. Ning Xiang, Rensselaer Polytechnic Institute

Contains a foreword by Jens Blauert

This book systematically details the basic principles and applications of head-related transfer function (HRTF) and virtual auditory display (VAD), and reviews the latest developments in the field, especially those from the author’s own state-of-the-art research group. Head-Related Transfer Function and Virtual Auditory Display covers binaural hearing and the basic principles, experimental measurements, computation, physical characteristics analyses, filter design, and customization of HRTFs. It also details the principles and applications of VADs, including headphone and loudspeaker-based binaural reproduction, virtual reproduction of stereophonic and multi-channel surround sound, binaural room simulation, rendering systems for dynamic and real-time virtual auditory environments, psychoacoustic evaluation and validation of VADs, and a variety of applications of VADs. This guide provides all the necessary knowledge and latest results for researchers, graduate students, and engineers who work in the field of HRTF and VAD.

Key Features

  • Discusses and summarizes the basic principles and applications of head-related transfer functions and virtual auditory display
  • Reviews the frontiers of HRTF and VAD research and the latest approaches to modeling, calculation, and rendering/display
  • Applications from this research can be found in engineering, communication, multimedia, consumer electronic products, and entertainment
  • More than 600 references are listed, representing the main body of the literature in this field
  • WAV™ includes errata pages, available in the Web Added Value™ Download Resource Center

About the author(s)

Bosun Xie received a bachelor’s degree in physics in 1982 and a Master of Science degree in acoustics from South China University of Technology in 1987. In 1998, he received a Doctor of Science degree in acoustics from Tongji University. Since 1982, he has worked at the South China University of Technology, where he is currently the director of and a professor at the Acoustic Lab, School of Science. Xie is also a member of the State Key Lab of Subtropical Building Science. His research interests include binaural hearing, spatial sound, acoustic signal processing, and room acoustics; he has published over 150 scientific papers and owns 5 patents in these fields. Professor Xie is a member of the Audio Engineering Society (AES), a vice-chairman of the China Audio Engineering Society, and a committee member of the Acoustical Society of China.

Table of Contents

Chapter 1: Spatial Hearing and Virtual Auditory Display 
1.1. Spatial Coordinate Systems
1.2. The Auditory System and Auditory Filter
       1.2.1. The Auditory System and its Function
       1.2.2. The Critical Band and Auditory Filter
1.3. Spatial Hearing
1.4. Localization Cues for a Single Sound Source
       1.4.1. Interaural Time Difference
       1.4.2. Interaural Level Difference
       1.4.3. Cone of Confusion and Head Movement
       1.4.4. Spectral Cue
       1.4.5. Discussion on Directional Localization Cues
       1.4.6. Auditory Distance Perception
1.5. Head-Related Transfer Functions
1.6. Summing Localization and Spatial Hearing with Multiple Sources
       1.6.1. Summing Localization of Two Sound Sources and the Stereophonic Law of Sine
       1.6.2. Summing Localization Law of More Than Two Sound Sources
       1.6.3. Time Difference between Sound Sources and the Precedence Effect
       1.6.4. Cocktail Party Effect
1.7. Room Acoustics and Spatial Hearing
       1.7.1. Sound Fields in Enclosed Spaces
       1.7.2. Spatial Hearing in Enclosed Spaces
1.8. Binaural Recording and Virtual Auditory Display
       1.8.1. Artificial Head Models
       1.8.2. Binaural Recording and Playback System
       1.8.3. Virtual Auditory Display
       1.8.4. Comparison with Multi-channel Surround Sound
1.9. Summary

Chapter 2: HRTF Measurements
2.1. Transfer Function of a Linear-time-invariant (LTI) System and its Measurement Principle
       2.1.1. Continuous-Time LTI System
       2.1.2. Discrete-Time LTI System 
       2.1.3. Excitation Signals
2.2. Principle and Design of HRTF Measurements
       2.2.1. Overview
       2.2.2. Subjects in HRTF Measurements
       2.2.3. Measuring Point and Microphone Position
       2.2.4. Measuring Circumstances and Mechanical Devices
       2.2.5. Loudspeaker and Amplifier
       2.2.6. Signal Generation and Processing
       2.2.7. HRTF Equalization 
       2.2.8. Example of HRTF Measurement 
       2.2.9. Evaluation of Quality and Errors in HRTF Measurements
2.3. Far-field HRTF Databases
2.4. Some Specific Measurement Methods and Near-field HRTF Measurements
       2.4.1. Some Specific HRTF Measurement Methods
       2.4.2. Near-field HRTF Measurement
2.5. Summary

Chapter 3: Primary Features of HRTFs
3.1. Time- and Frequency-domain Features of HRTFs
       3.1.1. Time-domain Features of Head-related Impulse Responses (HRIRs) 
       3.1.2. Frequency-domain Features of HRTFs
       3.1.3. Minimum-phase Characteristics of HRTFs
3.2. Interaural Time Difference (ITD) Analysis
       3.2.1. Methods for Evaluating ITD
       3.2.2. Calculation Results for ITD
3.3. Interaural Level Difference (ILD) Analysis
3.4. Spectral Features of HRTFs
       3.4.1. Pinna-related Spectral Notches
       3.4.2. Torso-related Spectral Cues
3.5. Spatial Symmetry in HRTFs
       3.5.1. Front-back Symmetry 
       3.5.2. Left-right Symmetry
       3.5.3. Symmetry of ITD
3.6. Near-field HRTFs and Distance Perception Cues
3.7. HRTFs and Other Issues Related to Binaural Hearing
3.8. Summary

Chapter 4: Calculation of HRTFs 
4.1. Spherical Head Model for HRTF Calculation
       4.1.1. Determining Far-field HRTFs and their Characteristics on the Basis of a Spherical Head Model 
       4.1.2. Analysis of Interaural Localization Cues
       4.1.3. Influence of Ear Location 
       4.1.4. Effect of Source Distance
       4.1.5. Further Discussion on the Spherical Head Model
4.2. Snowman Model for HRTF Calculation
       4.2.1. Basic Concept of the Snowman Model
       4.2.2. Results for the HRTFs of the Snowman Model
4.3. Numerical Methods for HRTF Calculation 
       4.3.1. Boundary Element Method (BEM) for Acoustic Problems
       4.3.2. Calculation of HRTFs by BEM
       4.3.3. Results for BEM-based HRTF Calculation
       4.3.4. Simplification of Head Shape
       4.3.5. Other Numerical Methods for HRTF Calculation 
4.4. Summary

Chapter 5: HRTF Filter Models and Implementation
5.1. Error Criteria for HRTF Approximation
5.2. HRTF Filter Design: Model and Considerations
       5.2.1. Filter Model for Discrete-time Linear-time-invariant (LTI) System
       5.2.2. Basic Principles and Model Selection in HRTF Filter Design
       5.2.3. Length and Simplification of Head-related Impulse Responses (HRIRs) 
       5.2.4. HRTF Filter Design Incorporating Auditory Properties 
5.3. Methods for HRTF Filter Design
       5.3.1. Finite Impulse Response (FIR) Representation
       5.3.2. Infinite Impulse Response (IIR) Representation by Conventional Methods
       5.3.3. Balanced Model Truncation for IIR Filter
       5.3.4. HRTF Filter Design Using the Logarithmic Error Criterion
       5.3.5. Common-acoustical-pole and Zero Model of HRTFs
       5.3.6. Comparison of Results of HRTF Filter Design 
5.4. Structure and Implementation of HRTF Filter
5.5. Frequency-warped Filter for HRTFs
       5.5.1. Frequency Warping
       5.5.2. Frequency-warped Filter for HRTFs
5.6. Summary

Chapter 6: Spatial Interpolation and Decomposition of HRTFs 
6.1. Directional Interpolation of HRTFs
       6.1.1. Basic Concept of HRTF Directional Interpolation
       6.1.2. Some Common Schemes for HRTF Directional Interpolation
       6.1.3. Performance Analysis of HRTF Directional Interpolation
       6.1.4. Problems and Improvements of HRTF Directional Interpolation
6.2. Spectral Shape Basis Function Decomposition of HRTFs 
       6.2.1. Basic Concept of Spectral Shape Basis Function Decomposition
       6.2.2. Principal Components Analysis (PCA) of HRTFs 
       6.2.3. Discussion of Applying PCA to HRTFs 
       6.2.4. PCA Results for HRTFs
       6.2.5. Directional Interpolation under PCA Decomposition of HRTFs
       6.2.6. Subset Selection of HRTFs
6.3. Spatial Basis Function Decomposition of HRTFs 
       6.3.1. Basic Concept of Spatial Basis Function Decomposition
       6.3.2. Azimuthal Fourier Analysis and Sampling Theorem of HRTFs
       6.3.3. Analysis of Required Azimuthal Measurements of HRTFs
       6.3.4. Spherical Harmonic Function Decomposition of HRTFs 
       6.3.5. Spatial Principal Components Analysis and Recovery of HRTFs from a Small Set of Measurements
6.4. HRTF Spatial Interpolation and Signal Mixing for Multi-channel Surround Sound
       6.4.1. Signal Mixing for Multi-channel Surround Sound
       6.4.2. Pairwise Signal Mixing
       6.4.3. Sound Field Signal Mixing
       6.4.4. Further Discussion on Multi-channel Sound Reproduction
6.5. Simplification of Signal Processing for Binaural Virtual Source Synthesis
       6.5.1. Virtual Loudspeaker-based Algorithms
       6.5.2. Basis Function Decomposition-based Algorithms
6.6. Beamforming Model for Synthesizing Binaural Signals and HRTFs
       6.6.1. Spherical Microphone Array for Synthesizing Binaural Signals 
       6.6.2. Other Array Beamforming Models for Synthesizing Binaural Signals and HRTFs 
6.7. Summary

Chapter 7: Customization of Individualized HRTFs
7.1. Anthropometric Measurements and their Correlation with Localization Cues
       7.1.1. Anthropometric Measurements
       7.1.2. Correlations among Anthropometric Parameters and HRTFs or Localization Cues
7.2. Individualized Interaural Time Difference (ITD) Model and Customization
       7.2.1. Extension of the Spherical Head ITD Model
       7.2.2. ITD Model Based on Azimuthal Fourier Analysis
7.3. Anthropometry-based Customization of HRTFs
       7.3.1. Anthropometry Matching Method
       7.3.2. Frequency Scaling Method
       7.3.3. Anthropometry-based Linear Regression Method
7.4. Subjective Selection-based HRTF Customization
7.5. Notes on Individualized HRTF Customization
7.6. Structural Model of HRTFs
       7.6.1. Basic Idea and Components of the Structural Model
       7.6.2. Discussion and Improvements of the Structural Model
7.7. Summary

Chapter 8: Binaural Reproduction through Headphones
8.1. Equalization of the Characteristics of Headphone-to-Ear Canal Transmission
       8.1.1. Principle of Headphone Equalization 
       8.1.2. Free-field and Diffuse-field Equalization
8.2. Repeatability and Individuality of Headphone-to-ear-canal Transfer Functions (HpTFs) 
       8.2.1. Repeatability of HpTF Measurement
       8.2.2. Individuality of HpTFs
8.3. Directional Error in Headphone Reproduction
8.4. Externalization and Control of Perceived Virtual Source Distance in Headphone Reproduction
       8.4.1. In-head Localization and Externalization
       8.4.2. Control of Perceived Virtual Source Distance in Headphone Reproduction
8.5. Summary

Chapter 9: Binaural Reproduction through Loudspeakers
9.1. Basic Principle of Binaural Reproduction through Loudspeakers 
       9.1.1. Binaural Reproduction through a Pair of Frontal Loudspeakers
       9.1.2. General Theory for Binaural Reproduction through Loudspeakers 
9.2. Head Rotation and Loudspeaker Reproduction 
       9.2.1. Virtual Source Distribution in Two-Front Loudspeaker Reproduction
       9.2.2. Transaural Synthesis for Four-Loudspeaker Reproduction 
       9.2.3. Analysis of Dynamic Localization Cues in Loudspeaker Reproduction
       9.2.4. Stability of the Perceived Virtual Source Azimuth against Head Rotation
9.3. Head Translation and Stability of Virtual Sources in Loudspeaker Reproduction 
       9.3.1. Preliminary Analysis of Head Translation and Stability
       9.3.2. Stereo Dipole 
       9.3.3. Quantitative Analysis of Stability against Head Translation
       9.3.4. Linear System Theory for the Stability of Crosstalk Cancellation
9.4. Effects of Mismatched HRTFs and Loudspeaker Pairs
       9.4.1. Effect of Mismatched HRTFs 
       9.4.2. Effect of Mismatched Loudspeaker Pairs
9.5. Coloration and Timbre Equalization in Loudspeaker Reproduction
       9.5.1. Coloration and Timbre Equalization Algorithms
       9.5.2. Analysis of Timbre Equalization Algorithms 
9.6. Some Issues on Signal Processing in Loudspeaker Reproduction
       9.6.1. Causality and Stability of a Crosstalk Canceller
       9.6.2. Basic Implementation Methods for Signal Processing in Loudspeaker Reproduction 
       9.6.3. Other Implementation Methods for Signal Processing in Loudspeaker Reproduction 
       9.6.4. Bandlimited Implementation of Crosstalk Cancellation
9.7. Some Approximate Methods for Solving the Crosstalk Cancellation Matrix
       9.7.1. Cost Function Method for Solving the Crosstalk Cancellation Matrix
       9.7.2. Adaptive Inverse Filter Scheme for Crosstalk Cancellation
9.8. Summary

Chapter 10: Virtual Reproduction of Stereophonic and Multi-channel Surround Sound 
10.1 Binaural Reproduction of Stereophonic and Multi-channel Surround Sound through Headphones
       10.1.1 Binaural Reproduction of Stereophonic Sound through Headphones
       10.1.2 Basic Algorithm for Headphone-based Binaural Reproduction of 5.1 Channel Surround Sound 
       10.1.3 Improved Algorithm for Binaural Reproduction of 5.1 Channel Surround Sound through Headphones
       10.1.4 Notes on Binaural Reproduction of Multi-channel Surround Sound
10.2 Algorithms for Correcting Non-standard Stereophonic Loudspeaker Configurations
10.3 Stereophonic Enhancement Algorithms
10.4 Virtual Reproduction of Multi-channel Surround Sound through Loudspeakers
       10.4.1 Virtual Reproduction of 5.1 Channel Surround Sound
       10.4.2 Improvement of Virtual 5.1 Channel Surround Sound Reproduction through Stereophonic Loudspeakers
       10.4.3 Virtual 5.1 Channel Surround Sound Reproduction through More than Two Loudspeakers
       10.4.4 Notes on Virtual Surround Sound
10.5 Summary

Chapter 11: Binaural Room Modeling
11.1 Physics-based Methods for Room Acoustics and Binaural Room Impulse Response (BRIR) Modeling
       11.1.1 BRIR and Room Acoustics Modeling
       11.1.2 Image-source Methods for Room Acoustics Modeling
       11.1.3 Ray-tracing Methods for Room Acoustics Modeling
       11.1.4 Other Methods for Room Acoustics Modeling
       11.1.5 Source Directivity and Air Absorption 
       11.1.6 Calculation of Binaural Room Impulse Responses
11.2 Artificial Delay and Reverberation Algorithms
       11.2.1 Artificial Delay and Discrete Reflection Modeling
       11.2.2 Late Reflection Modeling and Plain Reverberation Algorithm
       11.2.3 Improvements on Reverberation Algorithm
       11.2.4 Application of Delay and Reverberation Algorithms to Virtual Auditory Environments 
11.3 Summary

Chapter 12: Rendering System for Dynamic and Real-time Virtual Auditory Environments (VAEs) 
12.1 Basic Structure of Dynamic VAE Systems 
12.2 Simulation of Dynamic Auditory Information
       12.2.1 Head Tracking and Simulation of Dynamic Auditory Information
       12.2.2 Dynamic Information in Free-field Virtual Source Synthesis
       12.2.3 Dynamic Information in Room Reflection Modeling
       12.2.4 Dynamic Behaviors in Real-time Rendering Systems
       12.2.5 Dynamic Crosstalk Cancellation in Loudspeaker Reproduction
12.3 Simulation of Moving Virtual Sources
12.4 Some Examples of Dynamic VAE Systems
12.5 Summary

Chapter 13: Psychoacoustic Evaluation and Validation of Virtual Auditory Displays (VADs) 
13.1 Experimental Conditions for the Psychoacoustic Evaluation of VADs 
13.2 Evaluation by Auditory Comparison and Discrimination Experiment
       13.2.1 Auditory Comparison and Discrimination Experiment
       13.2.2 Results of Auditory Discrimination Experiments
13.3 Virtual Source Localization Experiment
       13.3.1 Basic Methods for Virtual Source Localization Experiments 
       13.3.2 Preliminary Analysis of the Results of Virtual Source Localization Experiments
       13.3.3 Results of Virtual Source Localization Experiments
13.4 Quantitative Evaluation Methods for Subjective Attributes 
13.5 Further Statistical Analysis of Psychoacoustic Experimental Results
       13.5.1 Statistical Analysis Methods
       13.5.2 Statistical Analysis Results
13.6 Binaural Auditory Model and Objective Evaluation of VADs
13.7 Summary 

Chapter 14: Applications of Virtual Auditory Displays (VADs) 
14.1 VADs in Scientific Research Experiments
14.2 Applications of Binaural Auralization
       14.2.1 Application of Binaural Auralization in Room Acoustics
       14.2.2 Existing Problems in Room Acoustic Binaural Auralization
       14.2.3 Other Applications of Binaural Auralization
14.3 Applications in Sound Reproduction and Program Recording
14.4 Applications in Virtual Reality, Communication, and Multimedia
       14.4.1 Applications in Virtual Reality
       14.4.2 Applications in Communication
       14.4.3 Applications in Multimedia and Mobile Products
14.5 Applications in Clinical Auditory Evaluations
14.6 Summary 

Appendix A: Spherical Harmonic Functions

Appendix B: Multipole Re-expansions for Calculating the HRTFs of the Snowman Model




“I find the book excellent and, as indicated by Professor Jens Blauert (whom I have known since 1986 and whose opinions I value and trust), it is extremely thorough, multifaceted, detailed, mathematically deep, yet clear — I agree. I give it an extremely positive review. The summary concluding each chapter or section is particularly valuable, an excellent idea and feature….I give the book top rank.”
— Wade R. Bray, HEAD acoustics, Inc.

“This book provides a thorough and comprehensive analysis of issues relevant to HRTFs. Many of them have not been covered in such a detailed, clear way in any other book. Professor Xie has devoted a considerable amount of time and effort creating a work that appeals to readers’ demands. The international acoustical community will be pleased to have access to its important scientific and technological content, now available with this English translation.”
Jens Blauert, internationally renowned expert in spatial hearing