Friday, 1 January 2016

Listeners’ understanding of comprehensible speech in a non-native language

It can be argued that the majority of speech comprehension studies have focussed solely on localising anatomical brain areas, while few studies have gone beyond listeners’ perception of speech in their first language to consider listeners’ understanding of comprehensible speech in a non-native language (Inui et al., 1998; Kim, Relkin, Lee, & Hirsch, 1997; Nakai et al., 1999; Perani et al., 1996). The neuroimaging study by Nakai et al. (1999) investigated Japanese speakers’ listening comprehension of their native language (Japanese), a comprehensible non-native language (English) and a non-comprehensible non-native language (Hungarian). Nakai et al. (1999) were particularly interested in detecting distinct activations of separate language regions that respond to the processing of comprehensible and non-comprehensible languages. In contrast to prior research (e.g. Perani et al., 1996), they found no extensive responses of the inferior frontal gyrus (IFG) and angular gyrus when listeners passively listened to their native language. Nonetheless, similar to previous investigations (Mazoyer et al., 1993; Perani et al., 1996), Nakai et al. (1999) found that both comprehensible and non-comprehensible languages elicited activations in the posterior part of the superior temporal gyrus (STG). In line with prior research showing that the IFG is strongly activated during the passive listening of words (Mazoyer et al., 1993; Perani et al., 1996) and during the perception of syntactically complex speech (Inui et al., 1998), Nakai et al. (1999) observed the IFG to respond to the comprehensible languages, Japanese and English. These languages were also reported to activate the supplementary motor area (SMA) and the pre-motor area (PMA), indicating a role for these regions in perceived comprehensibility (Nakai et al., 1999). Finally, all languages were observed to elicit responses in the transverse temporal gyri and the primary auditory cortex (PAC) (Nakai et al., 1999).

However, the task in Nakai et al.’s (1999) study was a passive listening task and did not measure the neural correlates of participants’ active comprehension of the speech material. Additionally, their experiment dealt with sentence comprehension and did not consider listening comprehension at the word level. Moreover, they did not aim to reveal the possible linguistic benefit of a particular speech modification for listeners with varying levels of proficiency within one language (Nakai et al., 1999). Furthermore, measurements were taken from only four participants. Similar to Nakai et al. (1999), the functional magnetic resonance imaging (fMRI) study by Inui et al. (1998) investigated Japanese listeners’ speech comprehension. However, they also did not examine comprehension at the word level, as their speech material consisted of sentences only (Inui et al., 1998).


It can therefore be said that no neuroimaging study to date has investigated the neural basis of native and non-native listeners’ comprehension of speech that includes a speech modification such as vowel space expansion, by using an active listening comprehension task with word stimuli produced in a naturalistic setting with a communicative purpose. Moreover, no neuroimaging study has examined the neural mechanism of the possible perceptual and cognitive advantage that vowel space expansion might provide to listeners. Vowel space expansion arises from changes in the first two formants, F1 and F2, and has been shown by prior behavioural studies to enhance listeners’ perception of speech (Ferguson & Kewley-Port, 2007; Uther, Knoll, & Burnham, 2007). This kind of speech modification has been found to yield a large speech intelligibility benefit for native speakers and for early learners of a second language (L2), and a small intelligibility benefit for late L2 learners (Bradlow & Bent, 2002); however, this has not been investigated by a neuroimaging study either.
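The notion of vowel space expansion can be illustrated numerically: a common behavioural measure is the area of the polygon spanned by the corner vowels in the F1–F2 plane, computed with the shoelace formula. A minimal sketch follows; the formant values are illustrative placeholders, not data from the studies cited above.

```python
# Sketch: quantifying vowel space expansion from the first two formants.
# The formant values (Hz) below are hypothetical examples, not measured data.

def polygon_area(points):
    """Shoelace formula: area of a polygon given (F2, F1) vertices."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Corner vowels /i/, /a/, /u/ as (F2, F1) pairs in Hz.
conversational = [(2200, 300), (1300, 750), (900, 350)]  # typical speech
clear          = [(2500, 270), (1350, 850), (750, 320)]  # hyperarticulated

print(polygon_area(conversational))
print(polygon_area(clear))  # larger area -> expanded vowel space
```

The expanded F1 and F2 ranges of the clear-speech vowels yield a larger polygon area, which is the sense in which such speech is said to have an "expanded" vowel space.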

Spoken word comprehension


In contrast to speech intelligibility, speech comprehension entails numerous cognitive activities, such as the integration of the physical speech signal over time and the access to and selection of appropriate semantic representations, using decision strategies to operate on semantic information (Davis & Johnsrude, 2003). Most studies on speech comprehension have shown that widely distributed systems in both hemispheres are involved in speech comprehension (Benson et al., 2001; Binder et al., 1997; Chee, O'Craven, Bergida, Rosen, & Savoy, 1999; Demonet et al., 1992; Demonet, Price, Wise, & Frackowiak, 1994a; Nakai et al., 1999; Newman, Pancheva, Ozawa, Neville, & Ullman, 2001; Scott, Leff, & Wise, 2003; Spitsyna, Warren, Scott, Turkheimer, & Wise, 2006; Visser, Jefferies, & Lambon Ralph, 2010). Many investigations have been carried out to uncover the neural basis of speech comprehension (Crinion & Price, 2005; Humphries, Willard, Buchsbaum, & Hickok, 2001; Obleser et al., 2007a; Obleser, Eisner, & Kotz, 2008; Peelle et al., 2010). However, most of these studies focused on the comprehension of spoken sentences.

Neuroimaging studies that focussed on the phonological and semantic processing of words demonstrated that, together with the left parietal angular gyri, the left middle and inferior temporal gyri are involved in semantic processing, while the left posterior inferior frontal gyrus and the supramarginal gyri of the parietal lobe enable listeners to phonologically resolve the sound information of words (Demonet et al., 1992, 1994a). The observation that the structures relevant for semantically processing auditory words are identical to those important for semantic operations on visually presented words indicated a semantic system for words irrespective of presentation mode (Vandenberghe, Price, Wise, Josephs, & Frackowiak, 1996). This amodal processing of semantic information, through which broadly distributed temporal, frontal and parietal regions are engaged during the comprehension of both auditory and visual words, has received further support (Chee et al., 1999; Newman et al., 2001). Specifically, right frontal and temporal areas have been related to speech comprehension as well (Newman et al., 2001).

Other speech comprehension studies showed a stronger involvement of the prefrontal and angular gyri when more effort was required to retrieve semantic associations, for example when sentences in a foreign language were presented to listeners (Nakai et al., 1999). While the determination of word meaning through semantic context has been observed to elicit activation in the left superior frontal gyrus (Scott et al., 2003), tasks in which listeners were required to pay particular attention during speech comprehension were reported to activate dorsal posterior frontal regions (Giraud et al., 1994). Moreover, the posterior middle temporal region, which was shown to respond to semantic requirements, was reported to become more active as executive demands intensified (Whitney, Jefferies, & Kircher, 2011). Although the ventral inferior frontal cortex has been related to planned semantic operations (Adams & Janata, 2002; Badre, Poldrack, Pare-Blagoev, Insler, & Wagner, 2005), it, like the angular gyri and the left fusiform gyrus, responded to elevated difficulty in accessing semantic knowledge when speech was presented in both visual and auditory modes (Adams & Janata, 2002; Rodd, Davis, & Johnsrude, 2005; Schmithorst, Holland, & Plante, 2006; Spitsyna et al., 2006). The angular gyri have been related to top-down processing in predicting semantic information and to the retrieval and combination of concepts (Binder, Desai, Graves, & Conant, 2009; Brownsett & Wise, 2010; Obleser & Kotz, 2010).

More recently, the broadly distributed system of semantic representation was suggested to include the posterior temporoparietal cortex, the precuneus and the left angular gyrus in the parietal areas, the middle and superior frontal gyri and the left frontal pars orbitalis, as well as the posterior inferior temporal gyrus, the middle temporal gyrus and the anterior temporal fusiform gyrus (Rogalsky, Matchin, & Hickok, 2008; Visser et al., 2010; Visser & Lambon Ralph, 2011). Prior research on speech comprehension has reported a considerable overlap between parts of the structures that are significant for speech articulation, such as the pars opercularis and pars triangularis of the IFG and the inferior and lateral areas of the right cerebellar cortex, and those structures that are essential for speech comprehension (Papathanassiou et al., 2000). This observation has been suggested to reflect activities shared by both the articulation and the comprehension of speech, including, for instance, articulatory strategies, short-term auditory memory and semantic processing (Bookheimer, 2002; Papathanassiou et al., 2000; Wise et al., 2001).