Abbreviations: BOLD, blood oxygen level dependent; EEG, electroencephalography; EPI, echo planar imaging; fMRI, functional magnetic resonance imaging; FOV, field of view; FWHM, full width at half maximum; GLM, general linear model; HL, hearing level; HRF, hemodynamic response function; MEG, magnetoencephalography; MNI, Montreal Neurological Institute; MPRAGE, magnetization-prepared rapid gradient-echo sequence; PET, positron emission tomography; ROI, region of interest; SQUID, superconducting quantum interference device; SL, sensation level; SPL, sound pressure level; TA, time of acquisition; TE, echo time; TR, time of repetition
Devices that use ultrasound for a variety of applications are widespread in engineering and healthcare, as well as in other aspects of everyday life. Whether by intention or as a side effect, many of these devices are sources of airborne ultrasound. Potentially high sound pressure levels (SPLs) of airborne ultrasound can be produced by, for instance, ultrasound cleaning and welding machines, thereby exposing their operators to such high ultrasound SPLs. Animal repellents installed in gardens, on balconies or even in public places produce airborne ultrasound, and many public spaces are exposed to ultrasound with undeclared SPLs, causing some humans to experience this as a disturbance. Because of the short wavelengths involved, standing wave effects frequently occur. The near-field region around an ultrasound source, which exhibits a spatially strongly varying sound pressure, can extend far enough to include the area where microphones or other measurement instruments are typically located. Both effects complicate practical sound-field measurement and noise assessment.
There are numerous indicators that airborne ultrasound events may influence human beings, and that some human beings can still perceive sound at frequencies above 16 kHz. However, at present, the precise mechanisms of sound perception at these frequencies are not well understood; this lack of knowledge is reflected in the status of existing regulations and standards (and in the fact that, in some cases, such regulations and standards do not exist). The few existing governmental guidelines for ultrasonic exposure mainly refer to the same very limited literature and knowledge base, usually assessing 1/3 octave band exposure limits of about 110 dB or 115 dB re 20 μPa SPL (note that all SPLs given in dB in this study refer to 20 μPa sound pressure) for ultrasonic frequencies, for example, at workplaces. This situation is aggravated by an inadequate measurement infrastructure and the lack of a metrological basis, with the result that even the determination of SPL values (a simple and common technical procedure within the audible frequency range) poses difficulties in the ultrasonic frequency range. As a result of this unsatisfactory situation, many workplaces cannot be adequately assessed, complaints made by persons exposed to ultrasound cannot be properly evaluated and manufacturers do not have clear guidelines for constructing noise protection enclosures for ultrasound machines, which can lead to health risks being underestimated or exaggerated.
A primary standard for the calibration of microphones has recently been established, allowing the first traceable measurements of airborne ultrasound SPLs to be taken for customary sources in a laboratory environment. This represents an important first step toward an improvement of the present situation. However, the long-term goal is to protect workers and the general public from annoying or even hazardous ultrasound exposure while simultaneously protecting manufacturers and innovators from unjustified or unnecessarily restrictive regulatory measures. This can only be achieved if all aspects of airborne ultrasound are investigated and, ultimately, understood. To do so, all the elements in the chain of transmission – from sources and propagation to sound fields in the air and the hearing and perception of ultrasound by humans – have to be included in such investigations.
This study focuses on the last element of this transmission chain: Its aim is to improve the understanding of the perception mechanisms of airborne ultrasound by the hearing system of human beings. To this end, a combination of audiological methods and brain imaging was used. An attempt was made to gain new insight into the physiological and cognitive mechanisms of ultrasound perception.
The first basic element of an audiological investigation is the determination of hearing thresholds. Little data is available for the ultrasonic frequency range; especially above 20 kHz, very few measurements exist for airborne transmission into the hearing system or for the stimulation of the eardrum. Grzesik and Pluta  investigated 189 subjects at frequencies starting in the audible range from 500 Hz up to 20 kHz as a reference for the investigation of the impact of sound on workers exposed to ultrasound at their workplaces. Henry and Fast  tested 78 subjects aged 18–24 with pure tones between 2 and 24 kHz. In a German study, Herbertz  presented tones of up to 40 kHz, which appears to be the highest frequency applied so far for a study of airborne ultrasound transmission into the ear. All threshold data from the studies indicated were obtained using very different stimuli and under very different measurement conditions, limiting the comparability of the data.
An important step in the perception mechanisms of sound is the necessary activation of the brain when the sound is consciously perceived by the subject (regardless of which physiological pathway is involved) and processed in the hearing system. Functional brain imaging can reveal the areas of and threshold levels for activation, forming an important cornerstone for understanding such perception mechanisms. Fujioka et al.  used magnetoencephalography (MEG) to investigate brain activation in response to airborne ultrasound of up to 40 kHz. They were unable to find any response between 20 kHz and 40 kHz. In contrast to these findings, Hosoi et al.  measured N1m brain activity components for tone bursts of up to 40 kHz using MEG, although their stimuli were presented via bone conduction. Oohashi et al.  successfully demonstrated the influence of ultrasonic frequencies on hearing by using music with extremely high-frequency spectral components as a stimulus under the application of electroencephalography (EEG) and positron emission tomography (PET). The study presented here takes a different approach by combining audiological methods  and, for the first time, two objective assessments of neural effects, in particular MEG and functional magnetic resonance imaging (fMRI), in order to investigate the perception of airborne ultrasound by humans.
Methods and instrumentation
In order to generate controlled high-frequency and ultrasonic stimuli for both audiological and objective brain-activity measurements, a new acoustic source developed in-house was used. This ultrasound source was compatible with the demanding environmental conditions present in fMRI and MEG devices by reducing the amount of metal and minimizing electromagnetic interference with the measurement devices. The source was based on acoustic transmission via a tube, similar to commercial insert earphones such as the Etymotic ER-30 (Etymotic Research, Elk Grove Village, IL, USA). The transducer used for this source was a Kemo L010 piezoelectric loudspeaker (Kemo-Electronic GmbH, Geestland, Germany) without any ferromagnetic parts. This transducer allowed the stimuli to be generated close to the ear without disturbing the imaging sensors of either the MRI or the MEG system. The hermetically sealed loudspeaker was mounted on an acoustic funnel that had a linearly decreasing inner diameter and was connected to the ear via a silicone tube (length: 330 mm, inner diameter: 5 mm) and via the ER3-14A audiometric ear tip (Etymotic Research, Elk Grove Village, IL, USA) (see Figure 1). The silicone tube was connected to the ear tip via a three-dimensional (3D)-printed T-piece (A3 in Figures 1 and 2). The T-piece’s middle tube was designed to fit tightly around a 1/8-inch pressure microphone (Brüel & Kjær 4138, Nærum, Denmark, without a safety grid), mounted on the Brüel & Kjær UA0036 microphone adapter (Brüel & Kjær, Nærum, Denmark) (A2 in Figure 2) for calibration purposes. For easy handling, calibration and sound pressure measurements were performed at the reference plane (depicted in Figure 1), taking into account the subsequent individual acoustic impedance, which was mainly determined by the individual ear canal geometry of the subject. Part B of Figure 2 shows the individual calibration procedure. 
During the listening tests, the microphone was replaced with a non-ferromagnetic dummy (A1 in parts A and C of Figure 2).
Digitally synthesized tone bursts between 14 kHz and 24.2 kHz with a total duration of 1400 ms were used as stimuli. The bursts consisted of three ramped sinusoidal tones with a duration of 400 ms each, separated by a pause of 100 ms. On- and offset ramps were added using the Hanning window function with a duration of 20 ms. Scharf and Miśkiewicz et al. showed that loudness adaptation (a decrease in the loudness of continuous sound stimulation during a prolonged exposure time) near the threshold increases with increasing frequency, especially for frequencies above 14 kHz. Wynne et al. introduced low-frequency amplitude modulation to reduce this effect. Following these suggestions, amplitude modulation was introduced to the applied stimuli by the already mentioned pause of 100 ms between the sinusoidal tones, resulting in sidebands with a 2-Hz line spacing pattern around the carrier frequency at the main peak in the frequency spectrum of the overall tone burst. The objective in doing so was to allow the stimuli to be heard as intermittent tones. For the fMRI measurements, however, a sparse-sampling technique was applied; for this reason, pure tones with a duration of 3 s were presented instead.
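The burst structure described above can be illustrated with a minimal sketch. It assumes that the 20-ms Hann ramps lie inside each 400-ms tone and that the 96-kHz sampling rate of the soundcard is used for synthesis; the function names and the unit amplitude are hypothetical, and the original stimuli were generated with MATLAB code that is not reproduced here.

```python
import numpy as np

FS = 96_000  # Hz; soundcard sampling rate given in the text


def hann_ramped_tone(freq_hz, dur_s=0.4, ramp_s=0.02, fs=FS):
    """Sinusoid with raised-cosine (Hann) on- and offset ramps.

    Assumption: the ramps are part of the 400-ms tone duration.
    """
    n = int(round(dur_s * fs))
    t = np.arange(n) / fs
    tone = np.sin(2 * np.pi * freq_hz * t)
    n_ramp = int(round(ramp_s * fs))
    # Rising half of a Hann window, from 0 towards 1
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    envelope = np.ones(n)
    envelope[:n_ramp] = ramp
    envelope[-n_ramp:] = ramp[::-1]
    return tone * envelope


def tone_burst(freq_hz, n_tones=3, tone_s=0.4, pause_s=0.1, fs=FS):
    """Three ramped tones separated by 100-ms pauses (1400 ms in total)."""
    tone = hann_ramped_tone(freq_hz, tone_s, fs=fs)
    pause = np.zeros(int(round(pause_s * fs)))
    parts = []
    for i in range(n_tones):
        parts.append(tone)
        if i < n_tones - 1:
            parts.append(pause)
    return np.concatenate(parts)


burst = tone_burst(19_100.0)
duration_ms = 1000 * len(burst) / FS
```

With these settings the tone-plus-pause pattern repeats every 500 ms, which corresponds to the 2-Hz sideband spacing mentioned in the text.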
Signals were digitally generated using a MATLAB code and converted by a 24-bit computer soundcard (RME Fireface UC, Audio AG, Haimhausen, Germany) into an analog signal with a sampling rate of 96 kHz, enhanced by an amplifier (BAA 120 BEAK, BEAK Electronic Engineering, Frankenblick, Germany) and presented monaurally by means of the ultrasound source described in Section “Ultrasound Source”. To minimize distortions from the soundcard (subharmonics and intermodulation products) at conventional audio frequencies, and to protect the test subjects from accidentally applied sounds in the frequency range (f<16 kHz) to which the human ear is more sensitive, an active digital high-pass filter (Stanford Research Systems, Model SR 650, Sunnyvale, CA, USA) with a cut-off frequency fc=20 kHz and a 115 dB/octave roll-off was used between the computer soundcard and the amplifier. In addition, a second (passive, analog) high-pass filter (fc=20 kHz, 12 dB/octave) was installed downstream from the amplifier. It was confirmed that there were no subharmonic distortions with amplitudes above the standardized insert-earphone hearing threshold . Table 1 shows the stimulus frequencies chosen for the exploitation of resonance enhancements in the insert earphone sound source in order to achieve the necessary SPL. Furthermore, in order to keep residual subharmonic distortions at conventional audio frequencies below the lowest hearing threshold in the subject group, the maximum SPL was limited individually for each stimulus frequency (Table 1).
Twenty-six test subjects (13 female and 13 male) 19–33 years of age (mean age: 24.2 years) participated in the hearing threshold measurements. All subjects had normal or corrected-to-normal vision and were otologically normal; this factor was assessed by means of the ISO 389-9  questionnaire filled out by all participants, which is common practice for preparing hearing experiments. No subject had a history of neurological, major medical or psychiatric disorders. All participants in the fMRI experiments were right-handed, as assessed using the Edinburgh Handedness Questionnaire . All test subjects took part on the basis of informed consent. The study was conducted in accordance with the Declaration of Helsinki and with the approval of the ethics committee of the German Psychological Society under vote No. SK 012014.
A subset of these test subjects formed the groups of persons who took part in the MEG and fMRI measurements following their performance in the audiological characterization. Those who showed the lowest hearing thresholds to ultrasound took part in the brain imaging part of the study. The nine participants in the MEG experiments were a subset of the 13 participants measured by fMRI (see below). This subset was a chance result: some participants had to be excluded from the MEG measurement as their weakly magnetic dental work (orthosis) saturated the MEG recording unit.
For all test subjects, individual hearing thresholds were known for the setting of stimulus levels related to these values as a reference. These settings are often referred to as sensation levels (SLs). During all the measurements, verbal checks were carried out as to whether subjects perceived at least parts of the stimuli. The 14-kHz stimulus was audible at all times and stimuli with higher frequencies were often perceived, but the subjects were not invited to identify each stimulus they perceived. The exposure was blind in the sense that the order of presentation of stimuli or silence periods was randomized.
Determination of hearing thresholds
All subjects received written instructions prior to the listening tests. The hearing thresholds were determined monaurally (left ear) by means of the source described in Section “Ultrasound Source”. The experiment was controlled by a computer using the MATLAB-based “psylab” software framework . The experimental paradigm was an “unforced weighted up-down” adaptive procedure as described by Kaernbach . Each trial consisted of a pair of time intervals, which were denoted as A and B and separated by a pause of 200 ms. During the acoustic presentation of these intervals, the signal being presented was indicated on a computer display screen. One of the intervals comprised the acoustic test signal, whereas the other comprised silence. The task of the subject was to indicate via a keyboard or a computer mouse whether interval A or B contained the test signal, or whether she/he was not sure. The subject had unlimited time to answer and was given visual feedback on the correctness of her/his response, after which the next trial began. The allocation of the test signal to the two intervals, A and B, was randomized for every trial.
Hearing thresholds were determined in an ascending order of frequency beginning with 14 kHz. Measurements were aborted when the maximum SPL (Table 1) was reached or the “I don’t know” button was pressed 5 times in a row, indicating that no hearing sensation existed. In this latter case, no further threshold measurements at higher frequencies were performed, assuming an increasing threshold with increasing frequency. In fact, single tests with nine subjects did not indicate a hearing sensation at higher frequencies, thus reinforcing this assumption of a monotonic threshold increase. After the main experiment, the subjects were asked to describe their perception and to report any abnormalities or discomfort. This was done in a relaxed, informal conversation (free interview) without a questionnaire.
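The adaptive track and its abort rules can be sketched as follows. This is an illustration only: the 2-dB step size, the 75 % target and the trial limit are assumptions that are not stated in the text, and the rule for the "unsure" response (the expected step of a random guess between two intervals) is one common reading of the unforced weighted up-down idea rather than the documented psylab configuration. The function names are hypothetical.

```python
def next_level(level_db, response, step_db=2.0, target=0.75):
    """One update of a weighted up-down track with an 'unsure' option.

    Assumed rule: down one step after a correct answer, up by
    target / (1 - target) steps after a wrong answer, and up by the
    mean of both after 'unsure' (expected outcome of a 50 % guess).
    """
    up_factor = target / (1.0 - target)  # 3 steps for a 75 % target
    if response == "correct":
        return level_db - step_db
    if response == "wrong":
        return level_db + up_factor * step_db
    if response == "unsure":
        return level_db + 0.5 * (up_factor - 1.0) * step_db
    raise ValueError(f"unknown response: {response!r}")


def run_track(respond, start_db, max_db, n_trials=50, max_unsure=5):
    """Run a track until it completes or one of the abort rules applies.

    `respond` is a callable mapping the current level (dB SPL) to
    'correct', 'wrong' or 'unsure'. The track stops when the level would
    exceed `max_db` or after `max_unsure` consecutive 'unsure' answers.
    """
    level, unsure_run, history = start_db, 0, []
    for _ in range(n_trials):
        if level > max_db:
            return history, "max_spl_reached"
        response = respond(level)
        history.append((level, response))
        unsure_run = unsure_run + 1 if response == "unsure" else 0
        if unsure_run >= max_unsure:
            return history, "no_sensation"
        level = next_level(level, response)
    return history, "completed"


# A listener who never hears the tone ends the track after five answers
history, status = run_track(lambda level: "unsure", start_db=40.0, max_db=110.0)
```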
Magnetoencephalography (MEG) study
MEG measurements were carried out inside a magnetically shielded room (Type Ak3b, Vacuumschmelze GmbH & Co. KG, Hanau, Germany). The signals were recorded by a commercial 125 gradiometer-channel helmet MEG system (Type MEGVision, Yokogawa/Ricoh, Kanazawa, Japan https://www.yokogawa.com/me/). Five marker coils inside the MEG device and attached to distinctive spots on the head (nasion, left and right preauricular points, two points on forehead) and an ultrasound spatial sampling device (3DSpace, Zebris Medical, Isny im Allgäu, Germany) were used to align the MEG data with subsequent anatomical T1-weighted MRI scans. Estimating the coordinate transformation between the MEG and MRI data allowed a volume conduction model to be generated. The non-magnetic ultrasound source was situated inside the Ak3b and connected via a sound tube and an ear tip to the subject’s left ear. The right ear was closed off by means of an earplug. The stimuli at frequencies of 16.9 kHz, 19.1 kHz, 20.7 kHz and 24.2 kHz, being identical to the stimuli used in the hearing threshold experiments, were presented at two different SPLs: namely, at 2 dB below the individual hearing threshold of the subject (−2 dB SL) and at 5 dB above this hearing threshold (5 dB SL). A reference stimulus with a frequency of 14 kHz at 20 dB above the individual hearing threshold (20 dB SL) was used to compare the brain responses evoked by the ultrasound with a well-known brain response in the audible frequency range. This procedure was used as a test to determine whether the acoustic stimulation was in general able to produce a detectable brain response. The different sound stimuli were presented in random order with a total measurement time of 40 min. This led to 75 epochs, including 200 ms before the onset of the stimulus and 800 ms after the onset of the stimulus, for averaging at each stimulus frequency and loudness setting in MEG. 
To reject movement artifacts, epochs were discarded if the recorded signal changed by more than 10 pT or showed more than 30 zero crossings. After the experiment, all the test subjects were asked whether they had heard the ultrasound or not.
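The rejection criterion can be illustrated with a short sketch. It assumes that "changed by more than 10 pT" refers to the peak-to-peak range of a single-channel epoch and that zero crossings are counted as sign changes of the signal; both readings, as well as the function name, are assumptions and not the authors' FieldTrip code.

```python
import numpy as np


def keep_epoch(epoch_t, max_change_t=10e-12, max_zero_crossings=30):
    """Return True if a single-channel epoch (in tesla) passes both checks.

    Assumptions: 'change' is the peak-to-peak range of the epoch and a
    zero crossing is a sign change between consecutive non-zero samples.
    """
    change = epoch_t.max() - epoch_t.min()
    signs = np.sign(epoch_t)
    signs = signs[signs != 0]  # ignore samples that are exactly zero
    crossings = np.count_nonzero(np.diff(signs) != 0)
    return bool(change <= max_change_t and crossings <= max_zero_crossings)


# Synthetic 1-s epochs sampled at 1 kHz (illustrative values only)
t = np.arange(1000) / 1000.0
clean = 1e-12 * np.sin(2 * np.pi * 5 * t)    # 2 pT peak-to-peak, few crossings
jump = clean + 20e-12 * (t > 0.5)            # 20-pT step, e.g. a movement
noisy = 1e-12 * np.sin(2 * np.pi * 40 * t)   # many zero crossings

flags = [keep_epoch(clean), keep_epoch(jump), keep_epoch(noisy)]
```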
The data of the MEG recording was processed in MATLAB™ via the FieldTrip toolbox ( and references within the toolbox) using a code made for this purpose. A specific set of non-operational channels (between 7 and 18) was excluded before analysis; an epoch averaging was carried out on the basis of the trigger input. For source reconstruction, a non-linear dipole fit technique was applied to estimate the source position for every stimulus within the volume conduction model. The difference between the magnetic signal measured and the magnetic signal calculated was minimized within a window of 20 ms centered at a latency of 100 ms by means of the Levenberg-Marquardt algorithm. For forward modeling, the individual anatomic MRI data sets were segmented into the scalp, the skull and the internal tissue. These segments defined a three-shell biomagnetic volume conductor  with homogeneous conductivity within each shell, but with conduction ratios of 1:1/80:1 between the scalp, the skull and the tissue. Then, brain activity was modeled by two moving equivalent electric current dipoles representing the two auditory cortices. This model made it possible to identify focal neuronal currents, but in this study, it could not be used to assign these currents to an anatomic region. Major components of this approach are implemented as code in the FieldTrip toolbox .
Functional magnetic resonance imaging (fMRI) study
As with the MEG measurement, the SPL exposure for the fMRI study was set at each frequency relative to the individual hearing threshold for each subject. In the MRI scanner, subjects were instructed to listen passively to the tones presented. They were also instructed that, in some intervals, no tones would be audible. The stimuli were pure-sine tones with frequencies of 16.9 kHz, 19.1 kHz, 20.7 kHz and 24.2 kHz, presented at both −2 dB SL and 5 dB SL (only the combination 24.2 kHz, −2 dB SL was omitted for technical reasons). As in the MEG experiment, the subject’s left ear was stimulated and the right ear was closed off by means of an earplug. In addition to this series, a reference stimulation with a 14-kHz tone presented at a 20 dB SL was performed. As in the MEG case, this was done both to test the equipment and to generate a reference signal that the ultrasound experiments could be compared to. Each trial consisted of the presentation of one tone for a duration of 3 s. Stimulus presentations started 3 s after the time of repetition (TR) onset; that is, the scanner acquired an image for 2 s, after which, following a delay of 1 s of silence, the next tone was presented. The task consisted of 280 trials, including 40 null events in which no tone was presented. All the trials were distributed across four separate echo planar imaging (EPI) sequences. After each EPI sequence, participants were asked two questions in order to assess the subjective hearing sensation during the ultrasound stimulation (1. “Did you hear the ultrasound?” 2. “Were you able to discriminate between different tones during stimulation?”). The sequence of stimuli was randomized, and the transition probabilities were accounted for. To ensure that the participants were exposed to a minimum of scanner-induced background noise, the cryo-cooler compression pump system was switched off for the entire duration of the fMRI measurement.
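The sparse-sampling timing can be made explicit with a small sketch that uses only the values stated in the text (TR of 8 s, 2 s of acquisition, tone onset 3 s after TR onset, tone duration of 3 s, 280 trials). The function names are hypothetical, and the total duration ignores discarded volumes and the breaks between the four EPI sequences.

```python
TR_S = 8.0           # repetition time
TA_S = 2.0           # acquisition time at the start of each TR
TONE_ONSET_S = 3.0   # tone onset relative to TR onset
TONE_S = 3.0         # tone duration


def trial_timeline(trial_index):
    """Start and end times (s) of acquisition and tone for one trial."""
    t0 = trial_index * TR_S
    tone_start = t0 + TONE_ONSET_S
    return {
        "acquisition": (t0, t0 + TA_S),
        "tone": (tone_start, tone_start + TONE_S),
        "next_acquisition": t0 + TR_S,
    }


def total_task_duration_s(n_trials):
    """Pure scanning time for n_trials, without breaks between sequences."""
    return n_trials * TR_S


first = trial_timeline(0)
# Silence between the end of the tone and the next acquisition
gap_after_tone_s = first["next_acquisition"] - first["tone"][1]
task_s = total_task_duration_s(280)
```

Under these values each tone ends 2 s before the following acquisition starts, so the tone is never presented during gradient noise.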
Images were collected on a 3T Verio MRI scanner system (Siemens Medical Systems, Erlangen, Germany) using a 12-channel head coil. First, high-resolution anatomical images were acquired using a 3D T1-weighted magnetization-prepared gradient-echo sequence [repetition time: 2300 ms; echo time (TE): 3.03 ms; flip angle: 9°; matrix: 256×256×192; and voxel size 1×1×1 mm3]. Whole-brain functional images were collected on the same scanner using a T2*-weighted EPI sequence sensitive to blood oxygen level-dependent (BOLD) contrast using sparse sampling (TR=8000 ms; time of acquisition=2000 ms; TE=30 ms; image matrix: 64×64 voxels; field of view=192 mm; flip angle: 80°; slice thickness: 2.7 mm; 36 near-axial slices; aligned with the anterior commissure/posterior commissure line).
The fMRI data was analyzed using the SPM8 software (Wellcome Department of Cognitive Neurology, London, UK). The first four volumes of all EPI series were excluded from the analysis in order to allow the magnetization to reach a dynamic equilibrium. Data processing started with slice time correction and realignment of the EPI sequence data sets. A mean image for all EPI sequence volumes was created, to which individual volumes were spatially realigned by means of rigid body transformations. The structural image was co-registered with the mean image of the EPI sequence series, after which it was normalized to the Montreal Neurological Institute (MNI) template for random effects analysis. The normalization parameters were then applied to the EPI sequence images to ensure an anatomically informed normalization took place. A commonly applied filter of 8 mm FWHM (full width at half maximum) was used. Low-frequency drifts in the time domain were removed by modeling the time series for each voxel according to a set of discrete cosine functions, to which a cut-off of 128 s was applied. The statistical analyses were performed using the general linear model (GLM). Each trial tone frequency was modeled as a separate regressor. These vectors were convolved with a canonical hemodynamic response function (HRF) and its temporal derivatives to form regressors in a design matrix. Furthermore, six movement regressors were entered into the GLM. The parameters of the resulting GLM were estimated and used to form contrasts. The resulting contrast image was then entered into one-sample t-tests at the second (between-subject) level. Beta values were extracted in the active regions and in an anatomically defined region of interest (ROI) in the bilateral primary auditory cortex as defined in the SPM Anatomy toolbox  from each contrast between a single tone related to the null event.
Subjective hearing thresholds
The resulting threshold values showed a large spread across subjects, and the number of subjects who were unable to determine any threshold at all increased with increasing frequency. The average threshold was calculated as the median over all available individual hearing thresholds and is shown in Table 1 and in Figures 3 and 4. The average hearing threshold for a pure tone of 14 kHz was 32.9 dB re 20 μPa (dB SPL). At the highest stimulus frequency (24.2 kHz), an average SPL of 110 dB SPL was required to trigger an auditory sensation. It should be mentioned that, out of 26 test subjects, only three were able to perceive a tone at this frequency. The range between the minimum and the maximum hearing threshold values across subjects was around 50–70 dB for frequencies below 20.7 kHz, as can also be seen in Figure 4. For higher (ultrasonic) frequencies, this spread became smaller. Despite the similar slope trend with increasing frequency, the average hearing threshold values determined in this study are characterized by an overall offset in the range from 8 to 20 dB in comparison to literature data (Figure 3). It is assumed that the differences are caused by the fact that this study involves the use of an insert earphone instead of free-field conditions, as well as by the fact that the calibration process differed from that used by Ashihara  and Henry and Fast .
Looking at the median and the minimum threshold curve in Figure 3, it is obvious that the threshold increases at a rate of around 50 dB per 1/3 octave for frequencies up to 20 kHz. Above 20 kHz, the slope decreases and flattens out. Owing to the limited applied SPL, only the most sensitive subjects (in terms of hearing) were included in the determination of the hearing threshold for f>20 kHz.
For further investigation, the individual and the average (median) threshold data set of the four most sensitive subjects are shown in Figure 5. The threshold was reduced by about 10 dB at 14 kHz, about 20 dB at 15.75 and 16.9 kHz, and by 10 dB at 19.1 and 20.7 kHz in comparison to the data set for all subjects in Figure 3. The average threshold across the four subjects was 24.5 dB SPL at 14 kHz and 110 dB SPL at 23.75 kHz. The frequency dependence of the slope also differs slightly in comparison to the average data set in Figure 3. The average threshold curve (from the data evaluation of the four most sensitive subjects) agrees well with the standardized threshold values  and literature data, as shown in Figure 5 (gray lines). Looking at the individual data sets also reveals a pronounced spread of the threshold data across the four subjects, especially in the range of 14<f<21.5 kHz. The highest variation, of around 50 dB between the individual threshold data sets, appears at 16.95 kHz. At the lowest applied frequency (14 kHz), and for f>21.5 kHz, the variation is comparably small.
None of the subjects reported perceiving the stimuli below their individual hearing threshold. The stimuli with SPLs above their individual hearing thresholds and frequencies between 16.95 kHz and 20.7 kHz were recognized by nine subjects, but only three of them reported hearing the 24.2 kHz stimulus. The other four of the 13 subjects heard only the reference tone at 14 kHz but were unsure about hearing any other stimuli.
All subjects showed an auditory-evoked response after the reference stimulus was presented at 14 kHz at a level of 20 dB SL. Figure 6A depicts these responses in detail on the basis of two participants, showing the M100 (N1m, brain activity around 100 ms after stimulus, cf. Hari et al.) at around 115 ms and a smaller 50 ms wave in the butterfly plots incorporating channels in the region of the temporal cortex. From the corresponding field maps, a clear dipolar activation could be found, which alternates in polarity over the time course from 45 ms until 145 ms after the onset of the stimulus.
Taking the clear auditory-evoked response visible in Figure 6A at 14 kHz into consideration as a reference for the position and approximate field strength of the well-known M100 activity, the responses from the stimulations at ultrasound frequencies were inspected. A grand average was calculated using the channel with the maximum signal in the positive field peak of the M100 (channel is indicated by the black cross in the M100 map in Figure 6A). These grand average auditory-evoked responses are shown for all conditions in Figure 6B for the stimuli above the hearing threshold (left panel) and below the hearing threshold (right panel). Note that the reference grand average response at 14 kHz and a stimulus level of 20 dB SL is also shown in the left panel. To check for the M100 auditory response, signal energies taken from two time windows were statistically compared. The first window was a 100-ms window before the stimulus (the baseline window), and the second was the window from 50 to 150 ms, i.e. centered around the M100 response. These tests were negative in all cases except for 14 kHz. In addition to the statistical test, the magnetic field maps up to 800 ms after the onset of the stimulus were scanned visually for a dipolar pattern typical of focal brain activations. Again, only the 14-kHz response showed a dipolar map.
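The comparison of the two time windows can be sketched as follows. The text does not specify the statistical test or the definition of signal energy; the sketch assumes the energy to be the sum of squared field values within a window and stops before the test itself. The synthetic response and all names are illustrative.

```python
import numpy as np


def window_energy(signal, times_ms, start_ms, stop_ms):
    """Sum of squared samples with start_ms <= t < stop_ms (assumed definition)."""
    mask = (times_ms >= start_ms) & (times_ms < stop_ms)
    return float(np.sum(signal[mask] ** 2))


def m100_energies(signal, times_ms):
    """Energies in the baseline (-100..0 ms) and M100 (50..150 ms) windows."""
    baseline = window_energy(signal, times_ms, -100.0, 0.0)
    response = window_energy(signal, times_ms, 50.0, 150.0)
    return baseline, response


# Epoch from 200 ms before to 800 ms after stimulus onset, 1-ms sampling
times_ms = np.arange(-200.0, 800.0, 1.0)
# Synthetic evoked field: 100-fT deflection centered at 100 ms (illustrative)
evoked = 100e-15 * np.exp(-0.5 * ((times_ms - 100.0) / 15.0) ** 2)

baseline, response = m100_energies(evoked, times_ms)
```

The pairs of energies obtained per subject and condition would then enter a paired comparison of the analyst's choice.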
Functional magnetic resonance imaging study
To investigate potential auditory cortex activity under all conditions separately, beta values were extracted from the bilateral cluster, and t-tests were computed comparing the signal to zero. As depicted in Figure 7, significant activity was found only for the 14-kHz reference stimulus (p<0.05). None of the actual trial tone stimuli resulted in significant auditory cortex activation. These results are particularly surprising as, according to the verbal reports taken after each EPI sequence, all 13 participants perceived at least a large part of the ultrasound stimuli and were also able to discriminate different tones. Although the ability to differentiate various pitches was not systematically investigated, a hearing sensation was clearly present. The bilateral primary auditory cortex ROIs were localized in the medial temporal lobe. Figure 8 shows the chosen areas marked in red in a T1-weighted anatomical image.
Discussion and conclusion
In this study, the perception of sound and the activation of the auditory cortex by means of sounds at high or ultrasound frequencies were investigated using audiological methods and brain imaging for an objective evaluation and understanding of the more subjective perception. Hearing thresholds were determined up to a frequency of 24.2 kHz with a group of test subjects. Later, a subgroup of these subjects was studied using MEG and fMRI to identify brain activity in response to acoustic stimuli with a defined SPL in relation to their individual hearing threshold. No such activity indicating sound processing in the brain could be found for tones with a frequency higher than 14 kHz when applying the experimental equipment and the methods used in this study.
During the determination of the hearing thresholds, data were obtained up to a frequency of 24.2 kHz. At this frequency, only three of the initial 26 test subjects were able to hear a tone for a threshold determination. For technical reasons, the sound source could only deliver a limited SPL, in particular because the loudspeaker had to be operated within the linear range to avoid intermodulation. The decreasing number of test subjects with increasing frequency inevitably introduces a bias in the determination of the threshold data at higher frequencies, as subjects with poor hearing are selectively eliminated. This is a common problem in the ultrasound range. It induces a tendency to determine lower threshold data, which could be erroneously interpreted as a decrease in the slope of the threshold versus frequency (cf. Ashihara and Henry and Fast). Studies on hearing thresholds for bone conduction stimulation, however, support the conclusion drawn from the results at hand that the slope significantly decreases at frequencies higher than 22 kHz, even for airborne ultrasound excitation.
At frequencies between 14 and 20.7 kHz, the difference between the minimum and the maximum threshold values obtained in the test subject group ranges from 33 to more than 70 dB. This range significantly exceeds the variation in thresholds during determination at audible frequencies, although a comparison is difficult as the data at audible frequencies were obtained with a larger number of test persons. This and the already-discussed finding that a small but non-zero number of particular individuals were able to hear at the highest frequencies could be an indication that particularly sensitive persons (in terms of hearing) exist. This is relevant for the development of a strategy for the future determination of exposure limit values. In such a process, the confidence intervals of threshold data need to be carefully considered. Additional safety margins should be defined for protecting particularly sensitive persons; such margins could be deduced, for example, from the top percentile of threshold data. This is especially true for children, adolescents and young adults, as they have an increased hearing ability at higher frequencies in general.
After the threshold measurement cycles, the test subjects were asked to characterize their hearing sensation (if they had one). Although no quantitative measure was used, almost all of the test subjects described the hearing sensation as displeasing. From this, it can be concluded that, at ultrasound frequencies, the range of comfortable hearing is extremely narrow; if an ultrasound tone is heard, it is immediately perceived as unpleasant (see also Leighton, who came to a similar conclusion). The consequence for a future noise reduction or safety strategy could be to define the hearing threshold as the absolute upper limit of exposure at ultrasound frequencies in order to avoid a hearing sensation altogether.
In the following discussion, we use the term auditory-evoked response synonymously with “brain activation”, as some studies estimate brain currents or metabolic activation. This work presents auditory-evoked responses as the primary signal, free from any modeling-based errors. Fujioka et al. could not find any brain activation in response to airborne ultrasound up to 40 kHz, which is consistent with the results of this study: no significant brain activation could be identified for frequencies higher than 14 kHz, regardless of whether MEG or fMRI was used. Fujioka et al. applied a fixed SPL of 60 dB and reported that no test subject was able to hear a tone with frequencies higher than 20 kHz. Their experimental conditions thus differed completely from those of this study, which limits the validity of a comparison.
In general, it is possible to detect physiological brain responses using stimulus SPLs near the hearing threshold. Auditory-evoked potentials are routinely used for hearing threshold detection and newborn hearing screening. Although it is common to apply signals with SPLs above the behavioral threshold and to apply a correction factor, evoked potentials can be detected even for very weak stimuli of about 5–10 dB above the threshold. This implies that, under the measurement conditions of this study, the detection of a brain response was possible in principle. MEG measures the magnetic field response as a direct brain signal and therefore has the potential to reflect the hearing threshold as its detection limit. Lütkenhöner and Klein were able to detect a hearing threshold at 1 kHz, and Stufflebeam et al. could detect brain signals down to 5 dB SL. As the BOLD method detects a metabolic response rather than a direct brain signal, it is not obvious that fMRI delivers valuable results at stimulus levels near the hearing threshold. Langers et al., however, obtained brain activation signals evoked by acoustic stimuli with levels down to below 10 dB SL. Other studies found a slightly higher detection limit. The interpolation of the data published by Röhl and Uppenkamp suggests that the limit of the hemodynamic response can be found at SPLs congruent with the hearing threshold of the test subjects. In all of the brain activity measurements mentioned, the sensitivity limit depends strongly on many parameters and measurement conditions and can be compared between studies only to a very small extent. This is particularly true for an estimation of the sensitivity and the signal-to-noise ratio of the MEG or fMRI brain response measurements. Thus, no clear assessment is possible as to whether a limited sensitivity or a low signal-to-noise ratio is the reason for the missing response in this study.
The fact, however, that two methodologically very different modalities that are sensitive to distinctly different levels of brain activity both fail to detect a (statistically significant) signal is an important finding for the investigation of the perception of airborne ultrasound.
The SPLs of the ultrasonic stimuli presented during both the MEG and the fMRI sessions were set to 5 dB SL. At the reference frequency of 14 kHz (see Figures 6 and 7), 20 dB was chosen, which allowed a measurement at a low but significant level of loudness to take place. Because of ethical and technical constraints, the SPL at ultrasound frequencies could not be increased above 125 dB, and loudness scaling was not carried out for each test person. Thus, the loudness at 14 kHz could not be compared with or transferred to higher frequency situations. As the intervals between the equal loudness contours decrease dramatically with increasing frequency, and because the test persons reported a clear hearing sensation, the SPLs at ultrasound frequencies were fixed at 5 dB SL. This value was smaller than the variations between individuals that occurred during the threshold determination but was greater than or equal to the expected variations for one individual.
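The relation between sensation level and presentation level can be made explicit with a short sketch: a level given in dB SL is added to the individual hearing threshold in dB SPL, and the result is checked against the 125 dB upper limit mentioned above. The function name `stimulus_spl` and the example thresholds are hypothetical and serve only as an illustration.

```python
MAX_SPL_DB = 125.0  # upper SPL limit stated in the text (dB re 20 µPa)

def stimulus_spl(threshold_spl_db, sensation_level_db=5.0, max_spl_db=MAX_SPL_DB):
    """Convert a sensation level into a presentation SPL for one subject.

    dB SL is defined relative to the individual hearing threshold, so the
    presentation level is threshold + sensation level. Returns None when
    the required SPL would exceed the permitted maximum.
    """
    spl = threshold_spl_db + sensation_level_db
    return spl if spl <= max_spl_db else None

# Hypothetical individual thresholds (dB SPL) at one ultrasound frequency.
for threshold in (100.0, 118.0, 122.0):
    level = stimulus_spl(threshold)
    if level is None:
        print(f"threshold {threshold:.0f} dB SPL: 5 dB SL exceeds the limit")
    else:
        print(f"threshold {threshold:.0f} dB SPL: present at {level:.0f} dB SPL")
```

Under these assumptions, a subject with a threshold above 120 dB SPL could not be stimulated at 5 dB SL without exceeding the limit.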
In these experiments, the stimuli were presented via an ear tip using air-conducted sound propagation. By contrast, Nakagawa and Nakagawa and Hosoi et al. presented stimuli via bone conduction. They detected N1m brain activity components for tone bursts of up to 40 kHz using MEG. It remains unclear whether excitation via bone conduction is more efficient for the stimulation of tone perception at ultrasound frequencies. Hearing thresholds have been obtained up to much higher frequencies than in the air-conducted case, which can be interpreted as support for this hypothesis. Because of the completely different acoustic excitation mechanism, a comparison between the stimulus strength of bone-conducted and air-conducted stimuli is not possible. In future experiments, a direct comparison between the two stimulation modes using the hearing threshold as a reference should be carried out.
The results of this study may serve as a source of guidance for the development of future safety strategies. However, owing to the limited data set, the results can be interpreted only as an initial indication that perception of airborne ultrasound by the auditory system is limited at the SLs used. Another conclusion can be drawn from the discomfort that the test subjects reported after the hearing experiments. To prevent discomfort as a fundamental impact on humans, it may be preferable to avoid the perception of airborne ultrasound altogether. These conclusions are only a source of preliminary input in the ongoing debate, and further research on other aspects will be necessary in the future.
Financial support from the European Metrology Research Programme (EMRP, “Health” program, grant no. HLT01 EARS) is gratefully acknowledged. The EMRP is jointly funded by the EMRP participating countries within EURAMET and the European Union.
Grzesik J, Pluta E. High-frequency-noise-induced hearing loss: a field study on the role of intensity level and accumulated noise dose. Int Arch Occup Environ Health 1986;57:127–36.
Ueda M, Ota A, Takahashi H. Investigation on high-frequency noise in public space. We tried noise abatement measures for displeasure people. 11th International Congress on Noise as a Public Health Problem ICBEN 2014:11. http://www.icben.org/2014/papers/Team4/4_16%20MariUeda_1.pdf
Leighton TG. Comment on ‘Are some people suffering as a result of increasing mass exposure of the public to ultrasound in air?’ Proc Math Phys Eng Sci 2017;473:20160828.
Lawton B. Exposure limits for airborne sound of very high frequency and ultrasonic frequency. Institute of Sound and Vibration, University of Southampton-Tech report 2013;ISVR 334.
Kling C, Koch C, Kühler R. Measurement and assessment of airborne ultrasound noise. 22nd International Conference on Sound and Vibration, Florence, IT, 2015.
Barrera-Figueroa S, Torras-Rosell A, Jacobsen F. Extending the frequency range of free-field reciprocity calibration of measurement microphones to frequencies up to 150 kHz. INTER-NOISE and NOISE-CON-Congress and Conference Proceedings 2013;247:6029–37.
Herbertz J. Untersuchungen zur Frequenz- und Altersabhängigkeit des menschlichen Hörvermögens. Fortschritte in der Akustik-DAGA 1984;10:683–6.
Oohashi T, Nishina E, Honda M, Yonekura Y, Fuwamoto Y, Kawai N, et al. Inaudible high-frequency sounds affect brain activity: hypersonic effect. J Neurophysiology 2000;83:3548–58.
Scharf B. Loudness adaptation. In: Tobias JV, Schubert ED, editors. Hearing research and theory. New York: Academic, 1983;2:1–56.
DIN ISO 389-2: Acoustics – reference zero for the calibration of audiometric equipment – part 2: reference equivalent threshold sound pressure levels for pure tones and insert earphones (ISO 389-2:1994).
DIN ISO 389-9: Acoustics – reference zero for the calibration of audiometric equipment – part 9: preferred test conditions for the determination of reference hearing threshold levels (ISO 389-9:2009).
Hansen M. Lehre und Ausbildung in Psychoakustik mit psylab: Freie Software für psychoakustische Experimente, Teaching and training in psycho-acoustics with psylab: free software for psychoacoustic experiments. DAGA2006-Fortschritte in der Akustik DAGA 2006;32:591–2.
Ahmed HO, Dennis JH, Badran O, Ismail M, Ballal SG, Ashoor A, et al. High-frequency (10–18 kHz) hearing thresholds: reliability, and effects of age and occupational noise exposure. Occup Med 2001;51:245–58.
Lee J, Dhar S, Abel R, Banakis R, Grolley E, Lee J, et al. Behavioral hearing thresholds between 0.125 and 20 kHz using depth-compensated ear simulator calibration. Ear Hear 2012;33:315–29.
Oostenveld R, Fries P, Maris E, Schoffelen JM. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput Intel Neurosc 2011;2011:1–9.
Nolte G. The magnetic lead field theorem in the quasi-static approximation and its use for magnetoencephalography forward calculation in realistic volume conductors. Phys Med Biol 2003;48:3637–52.
Morosan P, Rademacher J, Schleicher A, Amunts K, Schormann T, Zilles K. Human primary auditory cortex: cytoarchitectonic subdivisions and mapping into a spatial reference system. Neuroimage 2001;13:684–701.
DIN ISO 389-7: Acoustics – reference zero for the calibration of audiometric equipment – part 7: reference threshold of hearing under free-field and diffuse-field listening conditions (ISO 389-7:2005).
Hari R, Aittoniemi K, Järvinen ML, Katila T, Varpula T. Auditory evoked transient and sustained magnetic fields of the human brain: localization of neural generators. Exp Brain Res 1980;40:237–40.
Fedtke T, Richter U. Reference zero for the calibration of air-conduction audiometric equipment using ‘tone bursts’ as test signals. Int J Audiol 2007;46:1–10.
Elberling C, Callø J, Don M. Evaluating auditory brainstem responses to different chirp stimuli at three levels of stimulation. J Acoust Soc Am 2010;128:215.
Rosner T, Kandzia F, Oswald JA, Janssen T. Hearing threshold estimation using concurrent measurement of distortion product otoacoustic emissions and auditory steady-state responses. J Acoust Soc Am 2011;129:840–51.
Ferm I, Lightfoot G, Stevens J. Comparison of ABR response amplitude, test time, and estimation of hearing threshold using frequency specific chirp and tone pip stimuli in newborns. Int J Audiol 2013;52:419–23.
Picton TW. Human auditory evoked potentials, 1st ed. San Diego: Plural Publishing; 2011.
Lasota KJ, Ulmer JL, Firszt JB, Biswal BB, Daniels DL, Prost RW. Intensity-dependent activation of the primary auditory cortex in functional magnetic resonance imaging. J Comput Assist Tomo 2003;27:213–8.
Nakagawa S, Nakagawa A. Perception mechanisms of bone-conducted ultrasound assessed by electrophysiological measurements in humans. IEEE/ICME International Conference on Complex Medical Engineering – CME 2009;1–5.
About the article
Published Online: 2019-01-18
Research funding: Part of this work was funded by EURAMET and the EU.
Conflict of interest: Authors state no conflict of interest.
Informed consent: All test subjects took part on the basis of informed consent.
Ethical approval: Ethical approval was given by the ethics committee of the German Psychological Society under vote No. SK 012014.
Citation Information: Biomedical Engineering / Biomedizinische Technik, 20180048, ISSN (Online) 1862-278X, ISSN (Print) 0013-5585, DOI: https://doi.org/10.1515/bmt-2018-0048.
©2019 Christian Koch et al., published by De Gruyter, Berlin/Boston. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. BY-NC-ND 4.0