A SOCIOPHONETIC INVESTIGATION OF STANDARD BRITISH 
ENGLISH CONNECTED SPEECH PROCESSES  
IN NIGERIAN ENGLISH 
 
BY 
 
OLANRELE ROTIMI OLADIPUPO 
(MATRIC NO: 134631) 
 
B.A. (Ed) English (OAU), MPA (LASU), M.A. English Language (Ibadan) 
 
 
A Thesis in the Department of English  
Submitted to the Faculty of Arts  
in partial fulfilment of the requirements for the Degree of  
DOCTOR OF PHILOSOPHY 
of the  
UNIVERSITY OF IBADAN 
 
 
 
 
 
 
JUNE, 2014 
 
 
ABSTRACT 
 
Connected speech processes (CSPs) account for sound modifications and
simplifications in speech, while sociophonetics emphasises correlation between speech
forms and social factors. Existing studies merely identified some CSPs that
characterise Nigerian English (NE); studies that measure speakers' proximity to
Standard British English (SBE) connected speech, especially in relation to social
variation, are scarce. This study, therefore, investigated the incidence of assimilation,
elision and liaison processes of SBE connected speech in NE with consideration for
the region, gender and age of speakers. This is with a view to determining the level of
NE speakers' approximation to or deviation from SBE.

The study adopted generative phonology to explain NE speakers' application of or
deviation from the SBE rules, and variability concept to show the correlation between
CSPs and social factors. The participants, who ranged between ages 18-65, were 180
male and 180 female NE speakers with a minimum of 2-3 years post-secondary
education. They were drawn, through stratified and purposive techniques, from four
regions in Nigeria: north (120), west (80), east (80) and south-south (80). All
participants produced semi-spontaneous speeches (SSS), containing 31 utterances and
a short passage, into digital recording devices and filled 360 copies of a structured
questionnaire. Two educated native speakers served as control. The recordings were
transcribed and the scores analysed, using percentages, MANOVA and Bonferroni's
Post-hoc test. Portions of the SSS of eight participants (representing the social
variables) and one native speaker were analysed acoustically, using PRAAT speech
analyser (version 5120).

The overall incidence of the CSPs (assimilation, elision and liaison) of SBE for all
categories of participants indicated 43.2% approximation and 56.8% deviation.
However, incidence of each process varied. Three assimilation variants - regressive
devoicing (99.2%), progressive devoicing (65.1%) and nasal assimilation (63.5%) -
showed significant approximation to SBE, while four variants - progressive voicing
(21.2%), voiceless alveolar stop assimilation (47.6%), voiced alveolar stop
assimilation (3.2%) and yod coalescence (6.2%) - deviated significantly. Consonant
elision, in all contexts, occurred significantly (61.5%), while the incidence of liaison -
linking /r/ (8.1%) and intrusive /r/ (2.9%) - was extremely low. The speech waveforms,
formants structure and voicing bars on the participants' spectrograms, in most cases,
displayed considerable deviation from SBE. In terms of social variation, the combined
dependent variable (assimilation, elision and liaison) was significantly affected by
gender (Pillai's Trace=0.07, F(3,342)=8.12, p<0.05, η²=0.07) and region (Pillai's
Trace=0.11, F(9,1032)=4.29, p<0.05, η²=0.04), but not by age or their interactions.
Gender had significant effect on elision (F(1,344)=22.21; p<0.01, η²=0.06): males had
higher mean performance (M=9.91; SD=2.84) than females (M=8.55; SD=2.58).
Region was found to be significant in liaison (F(3,344)=8.14; p<0.01, η²=0.07): Eastern
participants (M=1.38; SD=1.44) had the highest mean score, followed by South-South
(M=1.10; SD=1.22), Western (M=1.05; SD=1.16) and Northern participants
(M=0.57; SD=0.94). The Bonferroni's Post-hoc results indicated that only Eastern and
Northern participants differed significantly from each other.
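
The effect sizes reported above can be cross-checked from the F ratios and Pillai's Trace values alone. The sketch below is illustrative and not part of the original analysis: it assumes the η² values are partial eta squared as conventionally reported by statistics packages, computed as F·df1/(F·df1 + df2) for the univariate tests and as Pillai's Trace divided by s for the multivariate tests, where s is the smaller of the number of dependent variables (three here) and the effect's degrees of freedom. The function and variable names are hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from a univariate F test (assumed convention)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def pillai_eta_squared(pillai_trace, s):
    """Multivariate partial eta squared as Pillai's Trace divided by s (assumed convention)."""
    return pillai_trace / s


# Univariate effects, using the F values and degrees of freedom quoted in the abstract
gender_elision = partial_eta_squared(22.21, 1, 344)
region_liaison = partial_eta_squared(8.14, 3, 344)

# Multivariate effects: s = min(number of dependent variables, effect df)
gender_multivariate = pillai_eta_squared(0.07, min(3, 1))
region_multivariate = pillai_eta_squared(0.11, min(3, 3))

for label, value in [
    ("gender on elision", gender_elision),
    ("region on liaison", region_liaison),
    ("gender, combined DV", gender_multivariate),
    ("region, combined DV", region_multivariate),
]:
    print(f"{label}: {value:.2f}")
```

Under these assumptions, the rounded values agree with the effect sizes quoted in the abstract (0.06, 0.07, 0.07 and 0.04 respectively).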
 
Nigerian English speakers' mastery of Standard British English connected speech 
processes, irrespective of gender and regional variation, manifested, overall, more 
deviation from than approximation to SBE. This suggests Nigerian English speakers' 
relatively low level of competence in Standard British English connected speech 
processes, and has implications for intelligibility.  
 
Key words:  Connected speech processes, Elision, Assimilation, Nigerian English, 
Standard British English 
    
Word count: 494 
 
 
 
 
 
DEDICATION 
 
 
 
This work is, to the glory of God, dedicated to 
 
The first fruit of my 9 years of marriage 
The reward of my trust in God 
The proof of God’s faithfulness: 
Excellence, Ifeoluwase, Iyanuoluwa, Isaac, Oluwadara, Oladipupo. 
 
 
ACKNOWLEDGMENTS 
 
First and foremost, I appreciate God, the Yea and Amen, who began and 
ensured the successful completion of this programme. He is the only One who could 
have made it possible, despite the barrage of challenges on my path. After my initial 
attempt to study for higher degrees was cut short by financial handicap in 1999, I 
thought all hope of achieving my heartfelt dream had been shattered. However, the 
God of all flesh stepped in and turned the tide in my favour. My Maker, Daddy, and 
Life Guard and Guide, to You I give all my praise. 
 Special thanks to my supervisor, Dr Adenike Akinjobi, for her keen interest in 
me, my family and study; and for constantly nudging me towards success. Ma, I can 
never forget your usual prophetic salutation: ‘regards to everyone at home’, though 
you knew there was no one else but my wife at the time. You are more than a 
supervisor! A mother, a counsellor and a mentor you are. I also appreciate all my 
lecturers whose erudition contributed to making a scholar of me. Of note in this respect 
is Dr Akin Odebunmi, an erudite scholar, who first exposed us to the fundamentals of 
seminar presentation cum paper writing at the Master’s class and practically guided me 
through the rigour of abstract presentation to the PG school. I cannot forget Professor 
Omobowale, Professor Obododinma Oha, Dr Ayo Ogunsiji, Dr Moses Alo, Dr Dele 
Adeyanju, Dr Bukola Sunday, among others, who all impacted me in one way or
another.
In the same vein, I wish to extend my appreciation to my offshore teachers and 
mentors whose scholarship rubbed off on me: Professor Augustine Simo Bobda of 
Ecole Normale Superieure, University of Yaounde I, Cameroon; Professor Francis 
Nolan, University of Cambridge; Professor Paul Kerswill and Dr Kelvin Watson, both 
of Lancaster University, United Kingdom. Not only did they provide insight into my 
work via electronic communications, they also loaded me with expensive and 
invaluable reading materials, which practically became the compass with which I 
navigated through this research endeavour. 
I cannot but recognise and be grateful to a legion of people, whose sacrificial 
roles at every phase, in no small measure, made this little contribution to knowledge 
worthwhile. The first people on the list are Dr Demola Lewis and my friend, Late Dr 
Akpoghene Onorievarie Ilolo, of blessed memory, who both introduced me to the 
rudiments of acoustic analysis. Dr Ilolo not only hosted me at Oghara but also made 
her students available for data gathering and practically joined me in recording them. 
May her soul rest in peace! Also worthy of note in this regard is Mr Babatunde 
Adetula who introduced me to the crème de la crème of NNPC, Lagos for data 
gathering and loaded me, free of charge, with imported materials on acoustic analysis 
on one of his trips abroad.  May my God reward you abundantly.  
Also involved in this success story are Patience Ashe, John Bamigbose, Akpan 
Anthony, Mr & Mrs Femi Afolabi, Dr (Mrs) Marion Adebiyi, Ose Amagbewon, 
Osegie Douglas, James Uba, Dr Ikenna Kamalu, Mrs Bukky Olaoye, Akeem Oyedele, 
Mr Ofila, Simon Osunsade, Mr. Harrison Ighodaro, Titus Aidelokhai, Mr Meikudi, Mr 
Sulaiman, Pastor & Dcns. Imoroa, Pastor Emmanuel Igeleko, Pastor Lanre Akinyo, 
Tope Oladiran, Dr Leke Adetula, Dr Peter Amosun, Mrs Odega, Dr Osisanwo, Mr 
Charles Nwagwo (the Faculty Officer, Arts) and a host of others who all assisted in the 
course of data gathering, data analysis and in every other way possible. May you never 
walk alone. 
My heartfelt gratitude goes to my bosses and colleagues in Bells University of 
Technology, Ota, especially at the Centre for Foundation Education (CENFED), who 
provided me with an enabling environment to undertake this research and constantly 
nudged me on through their words of advice and prayers. Professor Isaac Adeyemi (the 
Vice-Chancellor), Professor Fawole, Dr Bayo Adebowale (Director, CENFED), Dr 
Francis Cheo (Dean Students’ Affairs), Mr. Adeniji, Barrister Abel Ogungbenro, Dr 
Esther Areola, Malachy Igwilo, Mrs Israel Ohiseghame and Mrs Kemi Alelamole; you 
are all wonderful people! 
I must also say thank you to my amiable shepherds, Pastor and Pastor (Mrs) 
Andrew Folorunso and all ministers and entire members of Full Gospel Sanctuary Int’l 
Ministries, Lagos for their spiritual and moral support. Ditto my friends: Barrister and 
Dr. Kayode Folorunso, Pastor and Dcns. Tunde Oladimeji, Julianah Akindele and Don 
Utulu. This is the fruit of your labour. 
My wife, Dr Olufunke Oyejoke, you are the best thing that ever happened to 
me! Your love, support, encouragement and understanding are unquantifiable. 
Oftentimes, I ‘abandoned’ you for days because I must visit Ibadan to meet my 
supervisor, travel several miles in search of participants or stay awake several nights in 
order to put my thoughts together. Yet, you never took offence. On many occasions, 
you became my research assistant, moving from one place to another in search of 
participants. From the depth of my heart I say, thank you. Together we shall reach the 
acme of our dream in Jesus’ name. To my son, Excellence the Great, thank you for 
bearing with Dad’s constant absence from home and regular vigils since you were in 
the womb. 
Finally, my regards to my relatives and in-laws, especially my mother, Mrs 
Comfort Oyeronke; my siblings, Boda Kunle, Aunty Nike, Lara and Wale; my brother-
in-law and his wife, Mr and Mrs Adeniyi Ojeniyi; and my father-in-law, Pa Benjamin 
Ojeniyi; for their understanding and concerns. I also want to appreciate Mr and Mrs 
Oladiran and their kids for providing me free accommodation and feeding each time I 
stayed back in Ibadan.  
To others too numerous to mention: Thank you all. 
 
 
CERTIFICATION 
 
I certify that this work was carried out by Mr. O. R. Oladipupo in the Department of 
English, University of Ibadan. 
 
 
 
       _________________________   
Supervisor 
Adenike Akinjobi 
B.A. (Ed) English (Ilorin), M.A., English, Ph.D Linguistics (Ibadan)  
Reader, Department of English 
University of Ibadan, Nigeria. 
 
 
TABLE OF CONTENTS 
 
Title Page          i 
Abstract          ii 
Dedication          iv 
Acknowledgments         v 
Certification           viii 
Table of Contents         ix 
List of Tables          xiii 
List of Figures          xiv 
Symbols and Abbreviations        xvii 
 
CHAPTER ONE:         1 
1.0  Background to the Study      1 
1.1  English Language in Nigeria: Historical Background  1 
1.2  The Nigerian Linguistic Situation     3 
1.3  New (Non-Native) Englishes      4 
1.3.1  Nigerian English       8 
1.4  Connected Speech Processes      10 
1.5   Phonological Processes in some indigenous Nigerian Languages 11 
1.5.1   Assimilation        11 
1.5.1.1  Vowel -Vowel Assimilation      11 
1.5.1.2  Consonant-Consonant Assimilation     12 
1.5.1.3  Consonant-Vowel Assimilation     13 
1.5.2   Elision         14 
1.5.2.1  Vowel elision        14 
1.5.2.2  Consonant elision       15 
1.5.3   Epenthesis (Insertion)       15 
1.6  Statement of the Problem      16 
1.7  Aim & Objectives       17 
1.8  Research Questions       18 
1.9  Research Methodology      18 
1.9.1  The Participants       18 
1.9.2  Research Instruments       19 
1.9.3  Data Gathering Procedure      19 
1.9.4  Data Analysis        19 
1.10  Scope of the study       20 
1.11  Significance of the Study      21 
1.12  Limitations and Constraints      22 
 
CHAPTER TWO: REVIEW OF LITERATURE     23 
2.0   Introduction        23 
2.1  Connected Speech Processes      23 
2.2   Connected Speech Processes in Standard British English  25 
2.2.1  Reduction        26 
2.2.2   Variation of the Word’s Accentual Pattern    29 
2.2.3  Assimilation        30 
2.2.3.1  Contiguous/Contact and Distant Assimilation   32 
2.2.3.2  Regressive, Progressive and Coalescent Assimilation  32 
2.2.3.3  Assimilation of Voice, Place and Manner    35 
2.2.3.4  Partial and Total Assimilation      37 
2.2.3.5  Historical Assimilation      37 
2.2.4  Elision         38 
2.2.5  Liaison        41 
2.3  Review of Related Literature on Connected Speech Processes 43 
2.4  Sociophonetics       47 
2.4.1  Levels of Sociophonetic Variation     50 
2.4.1.1  Segmental Variation       50 
2.4.1.2  Suprasegmental Variation      51 
2.4.1.3  Subsegmental Variation      52 
2.5  Review of Related Literature on Sociophonetic Variation  52 
2.6  Nigerian English: An Overview of the Literature   58 
2.6.1  Nigerian English: Variety Differentiation    58 
2.7  Received Pronunciation/Standard British English   64 
2.8  Acoustic Phonetics       66 
 
CHAPTER THREE: THEORETICAL FRAMEWORK    67 
3.0   Introduction         67 
3.1  Generative Phonology       67 
3.1.1  Phonological Rules        69 
3.1.2  Formalisation of Rules       73 
3.1.3  The Distinctive Feature Theory     74 
3.1.4  Phonological Boundary       77 
3.1.5  Critique of Generative Phonology     79 
3.2  Variability Concept       80 
3.2.1  Social Variables       84 
3.2.1.1  Age          84 
3.2.1.2  Gender        86 
3.2.1.3  Ethnicity        88 
 
CHAPTER FOUR: PILOT STUDY      91 
4.0  Introduction        91 
4.1  Statistical Analysis       91 
4.1.1  Voicing Assimilation       92 
4.1.2  Yod Coalescence       93 
4.1.2.1  The contextual/boundary distribution of yod coalescence  95 
4.1.3  Elision         96 
4.1.4  Liaison        97 
4.1.4.1  Linguistic correlates of linking /r/     98 
4.1.5  Summary of Performance      99 
4.1.6  Sociophonetic variation of Connected Speech Processes  100 
4.1.6.1   T-test Analysis for Gender      101 
4.1.6.2   T-test Analysis for Age      102 
4.2   Summary, Conclusion and Further Studies    103 
 
CHAPTER FIVE: DATA ANALYSIS, FINDINGS AND DISCUSSION 106 
5.0  Introduction        106 
5.1  Statistical analysis        107 
5.1.1  Assimilation in Nigerian English     107 
5.1.1.1  Assimilation of Voice       107 
5.1.1.2  Assimilation of Place        114 
5.1.2  Elision in Nigerian English      120 
5.1.3.1  Liaison in Nigerian English      125 
5.1.3.2  Linguistic Correlates of r-liaison in NE    130 
5.1.4  Summary of Performance      132 
5.1.5  Sociophonetic variation of connected speech processes in NE 135 
5.1.5.1  Introduction         135 
5.1.5.2  Analysis         135 
5.1.5.3  Summary         141 
5.1.5.3.1 Region         142 
5.1.5.3.2 Gender        142 
5.1.5.3.3 Age         143 
5.2  Acoustic Analysis       144 
5.2.1  Acoustic Analysis of  He’s a nice boy    145 
5.2.2  Acoustic Analysis of Ten pounds     150 
5.2.3  Acoustic Analysis of He won't do it       156 
5.2.4  Acoustic Analysis of I met Peter at the station     161 
 
CHAPTER SIX: SUMMARY, CONCLUSION AND  
RECOMMENDATIONS      167 
6.0.  Introduction        167 
6.1  Summary of Findings       167 
6.2  Conclusions        171 
6.3  Recommendations and further studies    173 
 
References          175 
Appendices          193
        
 
 
  
 
 
LIST OF TABLES 
Title                   Page 
Table 1.1  Palatalisation process in Hausa     14 
Table 2.1  Strong and weak forms      27 
Table 4.1  Frequency and percentage scores for voicing assimilation  93 
Table 4.2  Frequency and percentage scores for yod coalescence   94 
Table 4.3  Percentage scores for coalesced /ʃ, ʒ, ʧ, ʤ/ variants   95 
Table 4.4  Frequency and percentage scores for elision    96 
Table 4.5  Frequency and percentage scores for linking /-r/   98 
Table 4.6  Linking (r) according to the grammatical category of the  
surrounding words       99 
Table 4.7  Summary of CSPs of SBE in EYE data    100 
Table 4.8 Gender mean scores for assimilation, elision and liaison  101 
Table 4.9 Results of T-test analysis for gender     102 
Table 4.10 Age mean scores for assimilation, elision and liaison  102 
Table 4.11 Results of T-test analysis for age     103 
Table 5.1  Frequency and percentage scores for assimilation of voice  
variants        109 
Table 5.2  Frequency and percentage scores for peculiar assimilatory  
processes in NE       112 
Table 5.3  Frequency and percentage scores for place assimilation variants 115 
Table 5.4  Frequency and percentage scores for yod reduction strategies 120  
Table 5.5  Frequency and percentage scores for Elision variants  122 
Table 5.6  Frequency and percentage scores for r-liaison   127 
Table 5.7  Frequency and percentage scores for smoothing   130 
Table 5.8   Linking /r/ according to linguistic contexts    131 
Table 5.9  Summary of CSPs of SBE in the Nigerian English data  133 
Table 5.10  Pearson correlation coefficients     136 
Table 5.11   Box's test of equality of covariance matrices    136 
Table 5.12  MANOVA summary table for Multivariate tests   137 
Table 5.13  Tests of between participants effects     138 
Table 5.14  Table of descriptive statistics of mean scores in elision  139 
Table 5.15  Table of descriptive statistics of mean scores in liaison  140 
Table 5.16   Table of multiple comparisons: Post hoc test    141 
 
 
LIST OF FIGURES 
 
Title                   Page 
Figure 1.1 Kachru’s Model of concentric circles     6 
Figure 3.1  Levels of representation      68 
Figure 3.2 A generative model of grammar     70 
Figure 4.1 Percentage chart for coalesced /ʃ, ʒ, ʧ, ʤ/ variants   95 
Figure 4.2  Pie chart showing percentage summary of CSPs of SBE  
in EYE data        100 
Figure 5.1   Percentage voicing assimilation score differences for SBE  
and NE speakers       112 
Figure 5.2   Percentage (%) place assimilation score differences for SBE  
and NE speakers       119 
Figure 5.3 Percentage elision score differences for SBE and NE speakers 123 
Figure 5.4 Percentage elision and non-elision scores for NE speakers  124 
Figure 5.5     Percentage r-liaison and r-suppression scores for NE speakers 129 
Figure 5.6  Percentage linking /r/ scores for lexical and function words  131 
Figure 5.7  Overall percentage CSPs scores for SBE and NE speakers  134 
Figure 5.8   Overall percentage scores of NE approximation to and  
deviation from SBE       134 
 
Figure 5.9  The textgrid of He’s a nice boy as produced by the control  145 
Figure 5.10  The textgrid of He’s a nice boy as produced by a young  
female speaker of English from Western Nigeria   145 
Figure 5.11  The textgrid of He’s a nice boy as produced by an adult  
male speaker of English from Western Nigeria   146 
Figure 5.12 The textgrid of He’s a nice boy as produced by a young  
female speaker of English from Eastern Nigeria   146 
Figure 5.13 The textgrid of He’s a nice boy as produced by an adult  
male speaker of English from Eastern Nigeria   147 
Figure 5.14 The textgrid of He’s a nice boy as produced by an adult  
female speaker of English from Northern Nigeria   147 
Figure 5.15 The textgrid of He’s a nice boy as produced by a young  
male speaker of English from Northern Nigeria   148 
Figure 5.16 The textgrid of He’s a nice boy as produced by an adult  
female speaker of English from the South-South region   148 
Figure 5.17 The textgrid of He’s a nice boy as produced by a young  
male speaker of English from the South-South region  149 
Figure 5.18 The textgrid of Ten pounds as produced by the control  150 
Figure 5.19 The textgrid of Ten pounds as produced by a young  
female speaker of English from Western Nigeria   151 
Figure 5.20  The textgrid of Ten pounds as produced by an adult male  
speaker of English from Western Nigeria    151 
Figure 5.21 The textgrid of Ten pounds as produced by a young female  
speaker of English from Eastern Nigeria    152 
Figure 5.22  The textgrid of Ten pounds as produced by an adult male  
speaker of English from Eastern Nigeria     152 
Figure 5.23  The textgrid of Ten pounds as produced by an adult female  
speaker of English from Northern Nigeria    153 
Figure 5.24  The textgrid of Ten pounds as produced by a young male  
speaker of English from Northern Nigeria    153 
Figure 5.25 The textgrid of Ten pounds as produced by an adult female  
speaker of English from the South-South region of Nigeria  154 
Figure 5.26  The textgrid of Ten pounds as produced by a young male  
speaker of English from the South-South region of Nigeria  154 
Figure 5.27 The textgrid of He won't do it as produced by the control  156 
Figure 5.28  The textgrid of He won't do it as produced by a young female   
speaker of English from Western Nigeria    156 
Figure 5.29  The textgrid of He won't do it as produced by an adult male  
speaker of English from Western Nigeria    157 
Figure 5.30  The textgrid of He won't do it as produced by a young female  
speaker of English from Eastern Nigeria    157 
Figure 5.31  The textgrid of He won't do it as produced by an adult male  
speaker of English from Eastern Nigeria    158 
Figure 5.32  The textgrid of He won't do it as produced by an adult female  
speaker of English from Northern Nigeria    158 
Figure 5.33  The textgrid of He won't do it as produced by a young male  
speaker of English from Northern Nigeria    159 
Figure 5.34  The textgrid of He won't do it as produced by an adult female  
speaker  of English from the South-South region of Nigeria  159 
Figure 5.35  The textgrid of He won't do it as produced by a young male  
speaker of English from the South-South region of Nigeria  160 
Figure 5.36 The textgrid of I met Peter at the station as produced by the  
control         161 
Figure 5.37  The textgrid of I met Peter at the station as produced by a  
young female speaker of English from Western Nigeria  161 
Figure 5.38  The textgrid of I met Peter at the station as produced by an   
adult male speaker of English from Western Nigeria   162 
Figure 5.39  The textgrid of I met Peter at the station as produced by a  
young female speaker of English from Eastern Nigeria  162 
Figure 5.40  The textgrid of I met Peter at the station as produced by an   
adult male speaker of English from Eastern Nigeria   163 
Figure 5.41  The textgrid of I met Peter at the station as produced by an  
adult female speaker of English from Northern Nigeria  163 
Figure 5.42  The textgrid of I met Peter at the station as produced by a  
young male speaker of English from Northern Nigeria  164 
Figure 5.43  The textgrid of I met Peter at the station as produced by an  
adult female speaker of English from the South-South region 164 
Figure 5.44  The textgrid of I met Peter at the station as produced by a    
young male speaker of English from the South-South region 165 
 
 
 
SYMBOLS AND ABBREVIATIONS 
 
 
Phonetic Symbols Used 
Vowels 
SBE   NE   keywords   
/i:/  /i/   feed     
/ɪ/     sit    
/e/  /ɛ/   bed     
/æ/  /a/   cat   
/ɑ:/     cart 
/ɒ/  /ɔ/   hot   
/ɔ:/     port   
/ʊ/  /u/   good   
/u:/     two   
/ʌ/  /ɔ/   cut   
/ɜ:/  /ɛ/   bird   
/ə/  /a/   about   
/eɪ/  /e/   pay   
/aɪ/  /ai/   time     
/ɔɪ/  /ɔi/   boy   
/əʊ/  /o/   go   
/aʊ/  /ao/   out   
/ɪə/  /ia/   cheer   
/eə/  /ɛa/   air   
/ʊə/  /uɔ/    poor 
 
Consonants 
SBE   NE   keywords 
/p/    /p/     pet    
/b/  /b/   big    
/t/  /t/   tea    
/d/  /d/   dip    
/k/  /k/   come    
/g/  /g/   get   
/f/  /f/, /v/   fall, voice     
/v/       
/θ/   /t/, /s/   think    
/ð/  /d/, /z/   this   
/s/   /s/   see    
/z/    /z/, /s/   zoo   
/h/   /h/, ø   hat   
/ʃ/    /ʃ/, /s/     ship   
/ʒ/    /ʃ/     genre   
/ʧ/   /ʧ/, /ʃ/     chick   
/ʤ/    /ʤ/     joy   
/m/   /m/   man   
/n/   /n/   know   
/ŋ/    /ŋ/      bang   
/l/    /l/     lame   
/r/   /r/   rat   
/w/    /w/     win   
/j/    /j/     yes   
 
Other symbols 
/ /  phonemic/phonological representation 
[ ]  phonetic representation 
+  has the feature of 
-  lacks the feature of 
→  becomes 
/  in the environment of 
−−  position of the affected sound 
[ ]  square brackets (enclose distinctive features) 
{ }  braces (enclose morphemes) 
(   )  optional element 
A→B/C− A becomes B after C 
A→B/−C A becomes B before C 
A→B/C−D A becomes B in between C and D 
α  alpha (variable value) 
ϐ  beta 
Ф  bilabial fricative (substitution for /p/ or /f/ in Hausa) 
/ɫ/   dark or velarized /l/  
 
 ̥  devoicing 
ʷ  labialisation 
́  high tone 
 ̀  low tone 
ø  zero or null element 
<  Less than 
>  Greater than 
$     syllable boundary  
 =    prefix-stem boundary e.g. pre = side 
 +    general morpheme boundary e.g. electric + ity  
 #   word internal boundary (boundary between a base and a neutral suffix, 
e.g. advertise#d, dog#s) 
##    full word boundary  
//     phrase boundary, pause  
H0: u1 = u2 null hypothesis 
HA: u1 ≠ u2 alternative hypothesis 
 
Abbreviations 
Cj  cluster involving a consonant and a following /j/ 
V  vowel 
C  Consonant 
W/B  Word Boundary 
M/B  Morpheme Boundary 
RD  Regressive Devoicing 
PV  Progressive Voicing 
PD  Progressive Devoicing 
VLASA  Voiceless Alveolar Stop Assimilation 
VASA   Voiced Alveolar Stop Assimilation  
NA   Nasal Assimilation 
YC   Yod Coalescence 
YM  Young Male 
AM  Adult Male 
YF  Young Female 
AF  Adult Female 
ANOVA Analysis of Variance 
MANOVA Multivariate Analysis of Variance 
IV  Independent Variable 
DV  Dependent Variable 
RP  Received Pronunciation 
SBE  Standard British English 
GA  General American 
NE  Nigerian English 
EYE  Educated Yoruba English 
 
IVE  Institutionalised Varieties of English  
NNE  Non-Native Englishes 
NNIVE Non-Native Institutionalized Varieties of the English Language  
WE  World Englishes 
CSPs  Connected Speech Processes  
ESNE   Educated Spoken Nigerian English 
 
 
 
 
 
 
 
 
 
 
  
 
 
 
 
 
 
 
 
  
 
 
 
 
 
CHAPTER 1 
 
1.0 Background to the study 
 
1.1 English language in Nigeria: historical background 
Nigeria, as a political unit, evolved following the sharing of African territories 
by the colonial powers in the 19th century. Present-day Nigeria is, therefore, an 
amalgam of several ancient kingdoms of diverse cultures that existed long before the 
arrival of British imperialists (Attah, 1987). During the 19th century, the abolition of 
the slave trade provided an opportunity for the expansion of trade in agricultural 
produce from Africa to Europe, particularly palm oil from the West African coastal 
areas. The territory of Lagos, a centre for the expansion of British trade, missions, 
and political influence, became a British colony in 1861, and in the 1880s British 
control was extended to the Lagos hinterland, the Niger Delta, and Benin. 
The end of the 19th century further witnessed Britain's aggressive military 
expansion in the north of the country, which resulted in the declaration of northern 
Nigeria as a protectorate in 1900, later followed by the birth of the Southern 
Protectorate in 1906. Finally, in 1914, both Protectorates and the Colony of Lagos 
were merged into a single territory called 'Nigeria'. 
 The earliest history of the English language in Nigeria dates back to the end of 
the 15th century when the Portuguese arrived in Benin to trade in pepper and slaves on 
the Nigerian coastal area. The contact thereby established with the natives resulted in a 
form of Portuguese-based Pidgin, mainly used for inter-ethnic communication and 
considered as the predecessor of present-day Nigerian Pidgin English. The Nigerian 
Pidgin word, 'sabi', for instance, is traceable to the Portuguese word, 'sabeir', which 
means 'to know' (Osa, 1986). Beginning from the mid-16th century, the British took 
over as major trading partners, and Portuguese-based Pidgin was then replaced with 
English-based Pidgin.  
About the middle of the 19th century, Christian missionaries pioneered 
institutionalised Western education in Nigeria, and for about four decades thereafter, 
they were in charge of language education in the country (Taiwo, 1980; Fafunwa, 
1974). Thus, the period (1842–82) witnessed intensive missionary activities and 
expansion, consequent upon which the first missionary stations were established in 
Badagry (near Lagos in the South West) and Calabar (in the South East) in 1843 and 
1846 respectively (Awonusi, 2008). The missionaries were, however, not allowed to 
settle in the Islamic North of the country for religious reasons.  
Having realised the need to train their converts to read the English Bible, the 
missionaries established schools, which exposed the natives to the English language. 
According to Adetugbo (1979), English dominated the curriculum under various 
sub-heads such as reading, writing, dictation, composition and grammar. Fafunwa (1974) 
also notes that the missionaries used the English language as the language of instruction 
in schools, being the only language they understood; and the parents were not averse to it 
in any way, as they wanted their children to learn and use English, which had come to 
be regarded as the language of commerce, civilization and Christianity. Thus, Christian 
education in Nigeria became a potent tool for spreading a type of Standard English 
(Ogu, 1992).  
The British colonial government's involvement in education in the country 
began to be felt in the 1880s (Awobuluyi, 1996). This was necessitated by the 
manpower needs of the colonial administration. For instance, literate Nigerians were 
needed to work as teachers, interpreters and clerks for schools, local native courts and 
the trading companies. Beginning from 1882, therefore, the colonial government 
promulgated various guidelines and ordinances to emphasize the learning of the 
English language.  
First, English was declared the language of instruction in schools. Second, a 
pass in English language became a pre-requisite for certification which, invariably, 
presupposed that only those who passed and could speak English had access to job 
opportunities. Finally, effective learning and teaching of the English language became 
one of the conditions the government spelt out for release of grants to schools (Ogu, 
1992). These efforts encouraged the spread of English in Nigeria. In the long run, 
however, the missionary schools were unable to meet the demand for educated 
Nigerians, and the colonial government began to establish state schools, especially in 
the northern part of the country where Christian education was not embraced as a 
result of the influence of the Islamic religion.  
 
 1.2 The Nigerian linguistic situation  
The linguistic situation of Nigeria is a complex one. This relates to the fact that 
Nigeria is a country with an estimated population of over 140 million people 
(according to the 2006 population census) with numerous languages and diverse 
geo-tribal groups. Nigeria has within her territory three of the four phyla into which 
African languages are classified. These are the Nilo-Saharan phylum with 3 members 
(e.g. Kanuri); the Afro-Asiatic with 103 (e.g. Hausa, Tera, Ngizim, Kaekeri, Angas, 
Mwaghavul, Bole, Bachama, Bade, Teshenawa, Kubi, etc.); and the 
Niger-Kordofanian with 286 (e.g. Bariba, Birom, Busa, Chamba, Bini, Urhobo, Efik/Ibibio, 
Fulani, Idoma, Igbo, Ijo, Jukun, Kambari, Nupe, Tiv, Vere, Yoruba, etc.). The Khoisan 
is the only phylum not present in Nigeria (Hansford et al., 1976; Yusuf, 2010). The 
corollary of this ethno-linguistic diversity in Nigeria, therefore, is pervasive 
bilingualism, multilingualism, code mixing, interference and other effects of language 
contact. 
The actual number of languages spoken in Nigeria has been a subject of 
controversy. Tiffen (1968) puts it at 150, Hansford et al. (1976) identify as many as 394 
indigenous languages, Crozier and Blench (1992) propose 440, while Bamgbose 
(1971) and Adegbija (1998) suggest about 400 languages. The current estimate, 
according to Lewis et al.'s (2013) Ethnologue data, is 522 living languages. These 
languages, according to Awonusi (2007), are of unequal social, official and 
educational statuses. In order to appropriately capture this pluralistic tendency and 
imbalance therefore, scholars have devised a number of parameters for classifying 
them. Awonusi (2007) catalogues some of these parameters, as listed by other scholars 
and himself, based on origin, nativity and size (e.g. exoglossic, non-exoglossic, major, 
small languages, etc.); population, spread and related sociolinguistic indices (e.g. 
decamillionaire, millionaire and minor languages) and constitutional legitimacy (e.g. 
official, national, etc.).  
In view of this linguistic multiplicity, it has become difficult to adopt a 
particular indigenous language as the national language, as any language so chosen 
will be unacceptable to other ethno-linguistic entities. English has, therefore, benefited 
from this rivalry, assuming the position of a national and official language in Nigeria. 
It cuts across ethnic boundaries, functioning as the lingua franca for Nigerians of 
diverse linguistic backgrounds and as the bridge between the different languages. It is, 
as such, seen as a symbol of national unity, a force binding all the different ethnic 
groups in the country together (Ogunsiji, 2004; Salami, 2001; Awonusi, 2004a).  
Although the three major Nigerian languages, Hausa, Igbo and Yoruba, are 
constitutionally recognized as national languages alongside English (Federal Republic 
of Nigeria, 1999), none of them is able to match the hegemonic status of English in the 
Nigerian society. In Awonusi's (2007:3) view, “It (English) is unarguably more 
widespread than others, attracts higher prestige among the elite and may be described 
as super-exoglossic in the face of other foreign or exoglossic languages”. In Nigeria, 
English functions as the language of inter-ethnic communication, formal education, 
governmental administration, commerce and industry, of international communication, 
the media and national integration (Ogu, 1992; Akindele & Adegbite, 1999; Ogunsiji, 
2004).  
Nigerian Pidgin is another important and useful language in Nigeria, which 
transcends regional, ethnic and social boundaries. It is used primarily as a language of 
wider communication and lingua franca by a majority of Nigerians (though restricted 
to informal situations), and as a mother tongue for a population of about one million, 
especially in the South-South geo-political zone of the country (Simpson and Oyetade, 
2007). According to Ihemere (2006), an estimated number of over 75 million people 
speak Nigerian Pidgin as a second language, while about 3 to 5 million speakers use it 
as a native language. In Faraclas's (1996) estimate, it is spoken by more than 40 
million people as an L2 and more than 1 million as an L1. Although it does not have a 
standard or acceptable codified form yet, it features on television, on the radio, and in 
certain forms of literature.  
 
1.3 New (non-native) Englishes 
English is a member of the Germanic branch of the Indo-European family of 
languages, which comprises most of the present-day European languages. It is native 
to the United Kingdom, the United States, Australia, New Zealand and parts of Canada. 
However, in the light of present realities, English is no longer the exclusive property of 
the native English speakers (Graddol, 1997). As a matter of fact, by Crystal's (2003) 
calculation, non-native speakers have already outnumbered native speakers by a ratio 
of 3:1. Different peoples of the world now lay claim to the language, which spread into 
most parts of the world as a result of the growth and expansion of the British Empire 
through colonialism and the Industrial Revolution between the 18th and 20th centuries, 
coupled with the United States' military, political, economic, technological and 
cultural prowess since the late 19th century. 
Apparently in fulfilment of John Adams's 18th-century prophecy (cited in 
Kachru, 1996:138) that "English will be the most respectable language in the world 
and the most universally read and spoken in the next century, if not before the close of 
this", English has incontrovertibly become the most widely used language in the world today. It 
has often been referred to as a world language, the lingua franca of the modern era 
(Graddol, 1997). According to the British Council's (1995) English 2000 project:  
English has official or special status in at least seventy-five 
countries with a total population of over two billion. English is 
spoken as a native language by around 375 million and as a 
second language by around 375 million speakers in the 
world…Around 750 million people are believed to speak 
English as a foreign language. One out of four of the world's 
population speaks English to some level of competence… 
English is the main language of books, newspapers, airports 
and air-traffic control, international business and academic 
conferences, science, technology, diplomacy, sport, tourism, 
international competitions, pop music and advertising.  
 
In addition to the above, English is an official language of the United Nations and 
many other international organisations, including the International Olympic 
Committee. It is also listed as the official or co-official language of over 45 countries 
and is spoken extensively in other countries where it has no official status.  
Kachru, an ardent apostle of Institutionalised Englishes, in this regard, presents 
what he refers to as the three Concentric Circles of English (see Fig. 1.1) to capture 
the spread and diffusion of English. Explicating the model, Kachru says: 
 
The Inner Circle represents the traditional bases of English, 
dominated by the "mother tongue" varieties of the language. In 
the Outer Circle, English has been institutionalised as an 
additional language... and the Expanding Circle includes the 
rest of the world. In this [Expanding] Circle, English is used as 
the primary foreign language (1997: 214). 
 
Fig. 1.1 Kachru's Model of Concentric Circles  
(Source: Kachru, 1985) 
 
Bhatt (2001:530) further elucidates the model as follows: 
• The inner circle refers to the traditional bases of English, where it is the primary 
language, with an estimated 320-380 million speakers (Crystal, 2003).  
• The outer circle represents the spread of English in non-native contexts, where it 
has been institutionalized as an additional language, with an estimated 150-300 
million speakers.  
• The expanding circle, with a steady increase in the number of speakers and 
functional domains, includes nations where English is used primarily as a foreign 
language, with an estimated 100-1000 million speakers (Crystal, 2003). 
 
The corollary of the spread of English, therefore, is the birth of Institutionalised 
Varieties of English (IVE), also referred to as 'Non-native Englishes' or 'World 
Englishes', used in diverse sociolinguistic contexts. As the English language continues 
its spread and dominance, it keeps absorbing aspects of cultures worldwide. Its long-term 
use by non-native speakers thus subjects it to structural changes (Muhlhausler, 
1979). This trend is predicated upon Goodman's (1964) observation that any language 
removed from its native environment is likely to undergo severe changes in direct 
proportion to the degree of its psychological and sociological separation from its native 
speakers. Scholars are now agreed that there is not one English language anymore; 
rather, there are many (McArthur, 1998), which represent diverse linguistic, cultural, 
and ideological voices. Bhatt (2001:534) puts the phenomenon this way: 
As the English language spread, through linguistic imperialism 
and linguistic pragmatism, to non-native contexts and came 
into close, protracted contact with genetically and culturally 
unrelated languages, it went through a process of linguistic 
experimentation and nativization by the people who adopted it 
for use in different functional domains, such as education, 
administration, and high society (cf. Kachru 1992a). Non-
native English speakers thus created new, cultural-sensitive and 
socially appropriate meanings-expressions of the bilingual's 
creativity by altering and manipulating the structure and 
functions of English in its new ecology. As a result, English 
underwent a process of acculturation in order to compete in 
local linguistic markets that were hitherto dominated by 
indigenous languages. Given the linguistic and cultural 
pluralism in Africa and South Asia, linguistic innovations, 
creativity, and emerging literary traditions in English in these 
countries were immediately accepted. 
 
In view of the fact that language reacts and adapts to, and reflects, the local ideas, 
attitudes and experiences of the new linguistic environments in which it finds itself (Banjo, 1975), it 
was not a problem, then, for English to become acculturated, nativised and indigenised 
as it came in contact with diverse languages and unfamiliar sociocultural contexts: in 
Asia with Indo-Aryan and Dravidian languages, in Africa with languages of the Niger-Congo 
family, and in Southeast Asia with Altaic languages (Kachru, 1996; Bhatt, 
2001). This is what has resulted in the emergence of regional-contact varieties of 
English, e.g., Indian English, Malaysian English, Singaporean English, Philippine 
English, Nigerian English, Ghanaian English, etc., which have developed nativized discourse 
and style types and functionally determined sub-languages (registers), and are used as 
a linguistic vehicle for creative writing in various genres (Kachru, 1986). This is the 
type of variety Achebe (1966:22) refers to as “a new English, still in full communion 
with its ancestral home but altered to suit its new African surroundings”.  
Kachru (1986:19) classifies the prominent features of the non-native 
institutionalized varieties of the English language (NNIVE) that have evolved as 
follows: 
a) An extended range of uses in the sociolinguistic context  
b) An ongoing process of nativisation of the registers and styles 
c) A body of nativised EL literature with formal and contextual characteristics 
marking it as localized. 
 
1.3.1 Nigerian English 
 
It is against this backdrop that recognition is now given to the existence of Nigerian 
English as one of the Non-native Institutionalized Varieties of the English language 
(NNIVE), which Alo (2005) defines as:  
A domesticated variety of English, functioning within the 
Nigerian linguistic and socio-cultural setting as a second 
language (ESL). It manifests the linguistic (phonological, 
syntactic, semantic, pragmatic and socio-cultural) 
characteristics of the Nigerian environment (social and 
physical).  
 
Although the reality of Nigerian English is no longer a subject of controversy, 
its concept is still beclouded with theoretical issues of definition, characterization, 
identification, standardization, classification, norm and intelligibility. As Jowitt 
(1991:29) puts it, “Of course, 'the accepted norms of usage' is precisely what is at 
issue”. In this regard, various attempts have been made by scholars to describe the 
character of Nigerian English in sociological and linguistic terms with a view to 
codifying Standard Nigerian English. This has culminated in an avalanche of theses 
and learned articles, describing it in its inclusive (variety differentiation) and exclusive 
(standard variety) forms (Banjo, 1995). 
 So far, the journey towards the characterization and standardization of Nigerian English 
has not been a smooth one. Although it has been a subject of rigorous research in the 
last few decades, moving from one facet of analysis to another, and has produced 
volumes of studies, the feat of codification or standardization is yet to be achieved. This is 
because it has been an arduous task to agree on what constitutes errors (random 
variation) and accepted usage (non-random variation).   
 According to Banjo (1996), the initial drive was towards error analysis. Some 
studies (e.g. Tomori, 1967) devoted to this research effort categorised and quantified 
deviations from British norms as errors which Nigerian English users must be 
encouraged to eradicate. Attempts to eliminate the errors identified led to the 
contrastive analysis approach (see Afolayan, 1968 and Banjo, 1969), which Banjo 
(1995) claims was meant to predict the probable difficulties that may be faced by 
Nigerian learners of English as a result of earlier exposure to the mother tongue, or to 
explain errors made in the course of learning the target language. 
However, it was soon discovered that error analysis would be much more 
relevant in an environment where English is used as a foreign language rather than in a 
second language situation, where there is a natural tendency to appropriate English to 
suit the sociolinguistic norms and realities of the host community. Banjo (1996:73) 
puts it this way: 
It soon became clear that it was inappropriate to adopt the same 
attitude to all non-mother tongue users of English, if a clear 
distinction was to be made between the users as a second 
language and one as a foreign language…While any mother 
tongue English community could legitimately provide a 
standard for the learners of English as a foreign language 
(depending on the purpose of their learning the language), the 
immediate standard, for the learner as a second language, must 
be provided immediately from within the learning community 
itself. In other words, while all deviations in the former may 
legitimately be regarded as errors, some deviations in the latter 
must be regarded as part of the local norms.       
 
Thus, it was not long before attention was shifted from error analysis and 
contrastive analysis to variety differentiation, which is considered more appropriate for 
the second language situation. However, according to Banjo (1995, 1996), this did not 
amount to a total adoption of errors as legitimate variants. Rather, any departure from 
the norms of the L2 standard variety was to be considered an error. In view of this 
paradigm shift, scholars (e.g. Brosnahan, 1958; Banjo, 1971, 1993; Adesanoye, 1973, 
1980; Adekunle, 1979; Bamgbose, 1982; Jibril, 1982, 1986; Jowitt, 1991) attempted to 
capture a variety typology of Nigerian English, using such criteria as education, 
occupation, ethno-linguistic consideration, mother-tongue transfer and social 
acceptability and international intelligibility, with a view to establishing the Standard 
Nigerian English variety.   
Coupled with these are many other articles (Adekunle, 1974; Adetugbo, 
1977, 1987; Adeniran, 1979; Bamgbose, 1982; Obilade, 1984; Odumu, 1984; 
Afolayan, 1987; Igboanusi, 2001; Adegbija, 1989, 2004) which provide insights into 
core linguistic characterization of Nigerian English on phonological, lexico-semantic, 
idiomatic, syntactic and pragmatic levels. Meanwhile, work on codification and 
standardization of Nigerian English is still ongoing. As a matter of fact, the theme of 
the 27th Annual Conference of the Nigeria English Studies Association (NESA) held 
at Covenant University, Ota, from November 2nd to 5th, 2010 was in this direction; it 
was tagged, 'Towards the codification of Nigerian English'. 
Therefore, in the absence of an acknowledged Standard Nigerian English on 
which this study may be based, we shall confine our research effort to the 
educated variety of Nigerian English. This presupposes that we are concerned with 
speakers who have been exposed to the learning of English within the four walls of Nigerian 
schools up to, at least, the post-secondary level, who use the language for daily 
communication, academic activities and official purposes, and who have achieved a level of 
mastery considered to be socially acceptable and internationally intelligible.  
  
1.4 Connected speech processes 
Except for a specific purpose, natural speech is not usually produced with a gap 
between every word, but with one sound slurring into another. Thus, when sounds 
occur close to each other within a word, or at morpheme or word boundaries, various 
phonetic alterations and phonemic modifications, occasioned by the phonological 
environment of the phonemes or the speaker's articulatory mechanisms, do occur 
(Cruttenden, 2001). There is, therefore, a wide difference between isolated words and 
the same words occurring in connected speech. The phenomena that account for such 
sound alterations and modifications are technically termed connected speech processes 
(henceforth CSPs). These are processes such as assimilation, elision, reduction in weak 
syllables, lenition, liaison, epenthesis, etc. Also included are rhythm and prosodic 
phenomena such as intonation and stress. 
 Typical phonological processes which cause sound modification in speech are 
language universal. This implies that they are “available to all languages, though not 
necessarily used by all” (Chomsky and Halle, 1968:178). According to Oyebade 
(1998:56), they are “motivated by the need to maintain euphony in a language or to 
rectify violations of well-formedness constraints in the production of an utterance”. 
However, it has been observed that some of them are also language or dialect-specific: 
each language or dialect dictates which process to permit or prohibit and to what extent 
(Dressler & Wodak, 1982; Kerswill, 1985, 1987; Nolan & Kerswill, 1990; Roach & 
Widdowson, 2001).  
For instance, French permits the kind of regressive assimilation of voice in 
which a word-final voiceless consonant usually becomes voiced if followed by a 
voiced sound, e.g. /avek/ becomes [aveg] in the phrase “avec vous”: [aveg vu]. On the 
other hand, Standard British English does not allow this type of regressive voicing 
assimilation. What is rather commonly acceptable is devoicing whereby a word-final 
voiced consonant becomes voiceless when followed by a word beginning with a 
voiceless sound, e.g., “I have to” is pronounced as [aɪ hæf tu:], not as [aɪ hæv tu:]; “nice 
voice” as [naɪs vɔɪs], not as [naɪz vɔɪs]. 
Even within the same language, CSPs may vary from one variety or accent to 
another. In this regard, Kerswill (1987) points out how CSPs in Durham English are 
significantly different from those of RP. According to him, Durham English permits 
the regressive voicing assimilation similar to what obtains in French, whereby the 
phrase “this village” is realized as [ðɪz vɪlɪʤ] rather than [ðɪs vɪlɪʤ] as in RP. 
Conversely, it is uncommon to find, in Durham English, cases of regressive 
assimilation of place whereby there is a loss of word-final alveolar sound as in RP, e.g. 
“had been”, usually pronounced as [hæbi:n] in RP, is most likely to be realized as 
[hædbi:n] in Durham English.  
 
1.5 Phonological processes in some indigenous Nigerian languages  
In view of the divergence of phonological processes of languages, this section 
examines the operation of some of these processes in some indigenous Nigerian 
languages, so as to establish their peculiar manifestations in these languages vis-a-vis 
the English language. This will afford us the opportunity of effectively appraising the 
performance of NE speakers in the CSPs under consideration. 
 
1.5.1  Assimilation 
 Assimilation has been described as the process by which one sound influences another in the 
same neighbourhood so that the two become alike. A vowel may assimilate another vowel, or a 
consonant may influence another consonant. Also, a vowel may acquire the features of a 
contiguous consonant and vice versa. Depending on the language, this process may be 
regressive/anticipatory (where the first segment changes to become like the second 
one) or progressive/perseverative (where the second segment takes on the features of 
the first segment). Different types of assimilation known to some of these indigenous 
languages are discussed in the following sections. 
 
1.5.1.1  Vowel-vowel assimilation  
  This is an assimilatory process in which a vowel takes on the features of 
another vowel in a contiguous environment. In connected speech, for instance, when a 
word precedes another word that begins with a vowel, assimilation usually occurs 
between the last vowel of the preceding word and the initial vowel of the second word. 
This may either be regressive or progressive. The process is exemplified below with 
Yoruba, Igbo, Urhobo and Ikhin languages. 
 
Regressive:  
 
Yoruba (Source: Orie and Pulleyblank, 2002).  
 
(i) ọmọ ẹran /ɔmɔ εrã/ → [ɔmεεrã]  'goat-kid; son of a bitch' 
(ii) ará ọ̀run /ará ɔrũ/ → [arɔɔrũ]  'citizen of heaven; masquerader' 
 
Igbo (Source: Yusuf, 2010:183) 
(i) nwá + ọma → ŋwọ̃ọ̃mã 'good child' 
(ii) úmù + áká → úmààká 'children' 
 
Urhobo (Source: Yusuf, 2010:294) 
(i) èsíó + èsíó → /èsíésíó/ → [èsjéèsjó] 'continuous pulling' 
(ii) èfá + èfá → /ὲfέὲfá/ → [ὲfέὲfá]  'continuous flogging' 
 
Ikhin, a language in Edo State (Source: Yusuf, 2010:49) 
(i) okpa #  okpa → okpookpa 'one by one' 
(ii) eva  #  eva → eveeva  'two by two' 
 
Progressive: 
 
Yoruba (Source: Bamgbose, 1965). 
(i) ará ìlú → aráàlú  'townsman' 
(ii) ilé isé → iléesé  'office' 
 
Igbo (Source: Yusuf, 2010:183) 
(i) ɔ  bù yá → ɔ  bù yá  ɔ  ɔ̀  yá  'it's that'   
(ii) yá bù → yá bù  yá à  'that is...' 
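
The regressive and progressive patterns exemplified above can be restated procedurally. The following Python sketch is offered purely as an illustration and is not drawn from the sources cited; the function name `assimilate`, the vowel inventory and the decision to leave tone and nasalisation marks on the vowel that already carries them are all assumptions made for the sketch.

```python
import unicodedata

# Base vowel letters recognised by this sketch (an assumed, partial inventory;
# both the IPA and the Greek epsilon are listed because sources vary).
VOWELS = set("aeiouɔɛε")


def _split(word):
    """Decompose a word into [base letter, combining marks] units."""
    units = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and units:
            units[-1][1] += ch          # attach tone/nasal mark to its base
        else:
            units.append([ch, ""])
    return units


def _join(units):
    """Recompose units into a normalised (NFC) string."""
    return unicodedata.normalize("NFC", "".join(b + m for b, m in units))


def assimilate(word1, word2, direction):
    """Copy vowel quality across a word boundary.

    'regressive': the final vowel of word1 takes the quality of the
    initial vowel of word2. 'progressive': the reverse.
    Only the base letter changes; diacritics stay where they were.
    """
    u1, u2 = _split(word1), _split(word2)
    if not u1 or not u2 or u1[-1][0] not in VOWELS or u2[0][0] not in VOWELS:
        return word1 + " " + word2      # no vowel-vowel contact
    if direction == "regressive":
        u1[-1][0] = u2[0][0]
    elif direction == "progressive":
        u2[0][0] = u1[-1][0]
    else:
        raise ValueError("direction must be 'regressive' or 'progressive'")
    return _join(u1 + u2)
```

Under these assumptions, `assimilate("okpa", "okpa", "regressive")` yields the Ikhin form okpookpa, and `assimilate("ará", "ìlú", "progressive")` yields the Yoruba form aráàlú. The direction is passed in by the caller because, as the examples show, it depends on the language and the construction.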

1.5.1.2  Consonant-consonant assimilation 
 This type of assimilation occurs when a consonant changes to become like 
another consonant in a neighbouring environment. Typical of this is homorganic 
assimilation whereby a nasal consonant becomes assimilated to the place of 
articulation of the consonant it precedes whether in the same or following word e.g. 
 
Hausa (Source: Yusuf, 2010:141) 
(i) [m] before bilabial  /gídán bàlá/ [gídám bàlá] 'Bala's house' 
(ii) [ŋ] before velars  /ango/  [aŋgo]  'groom' 
 
Igbo (Source: Carnochan, 1948:423; Yusuf, 2010:184) 
 
(i) [m]    before bilabial    [ɔ bhaara ya mbha] 'He rebuked him' 
(ii) [n]     before alveolars    /ńdù/ [ndu]  'life' 
(iii) [ŋ]     before velars   /nga/ [ŋga]  'prison' 
 
Yoruba (Source: Owolabi, 2011:217) 
(i) [m] before bilabial, e.g. Ó wà ní bodè → Ó wà ń bodè [o wa m bode] 'He is at the gate' 
(ii) [ɱ] before labio-dental: Ó kọ mi ní fonọlójì → Ó kọ mi ń fonọlójì [o ko mi ɱ fonoloji] 'He taught me phonology' 
(iii) [n] before alveolars: Ó dúró ní títì → Ó dúró ń títì [o duro n titi] 'He stood in the street' 
(iv) [ŋ] before velars: Ó bú mi ní kọ̀rọ̀ → Ó bú mi ń kọ̀rọ̀ [o bu mi ŋ koro] 'He disparaged me behind my back' 
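
The homorganic nasal pattern illustrated in the Hausa, Igbo and Yoruba examples can be sketched as a simple lookup: the nasal takes its place of articulation from the consonant that follows it. The Python sketch below is illustrative only; the place table is deliberately partial, and the names `PLACE`, `NASAL`, `homorganic_nasal` and `assimilate_final_nasal` are assumptions of the sketch rather than part of any cited analysis. Tone marks are omitted from the inputs.

```python
# Illustrative place-of-articulation table; not an exhaustive inventory.
PLACE = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "z": "alveolar",
    "k": "velar", "g": "velar",
}

# Nasal allophone selected for each place of articulation.
NASAL = {
    "bilabial": "m",
    "labiodental": "ɱ",
    "alveolar": "n",
    "velar": "ŋ",
}


def homorganic_nasal(following_word):
    """Return the nasal that agrees in place with the next word's first consonant."""
    place = PLACE.get(following_word[:1].lower())
    # Fall back to plain [n] when the following segment is not in the table.
    return NASAL.get(place, "n")


def assimilate_final_nasal(word, following_word):
    """Replace a word-final 'n' with the homorganic nasal before the next word."""
    if word.endswith("n"):
        word = word[:-1] + homorganic_nasal(following_word)
    return word + " " + following_word
```

On these assumptions, `homorganic_nasal("bode")` returns [m], `homorganic_nasal("fonoloji")` returns [ɱ], and `assimilate_final_nasal("gidan", "bala")` returns "gidam bala", in line with the examples above.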
 
1.5.1.3  Consonant-vowel assimilation 
 This is a process whereby the features of a vowel are spread on a contiguous 
consonant as secondary articulation. Typical processes of this type are labialisation (in 
which lip rounding feature of a vowel is superimposed on an adjacent consonant) and 
palatalisation (whereby the tongue position of a front vowel is extended onto an 
adjacent consonant).  
 In Hausa, simple or plain velars [k, ƙ, g] may be labialised in the environment 
of a back vowel [u, o], i.e. when they are placed immediately before a back vowel. The 
following examples cited by Sani (1989:30) illustrate this process: 
(i) mako [makʷo]  'a week'   
(ii) mugu [mugʷu]  'a wicked man' 
(The simple velars are actually pronounced with rounded lips). 
This process is also found in Ebira as exemplified below: 
(i) tu εvụ → tʷẹvụ 'to beat a goat' 
(ii) dụ àzà → dʷàzà 'to chase people' 
(Source: Adive, 1985:56, cited in Yusuf, 2010:52). 
 Palatalisation is another common consonant-vowel assimilation process in 
Hausa; the alveolars 's', 'z', 't' and 'd' are commonly palatalised to 'sh', 'j', 'c' and 'j' 
respectively when they precede the front vowels 'i' and 'e' (Yusuf, 2010:141; Sani, 
1989:30), e.g.  
Table 1.1 Palatalisation process in Hausa 
 
Singular Noun     Root   Plural Suffix Added   Implication   Effect of Palatalisation 
ƙasá 'country'    ƙas-   -aCe (=ase)           kasase*       kásàshé 'countries' 
buta 'kettle'     but-   -oCi (=oti)           butoti*       butoci 'kettles' 
gída 'house'      gid-   -aCe (=ade)           gidade*       gídàjé 'houses' 
maza 'males'      maz-   -aCe (=aze)           mazaze*       mazaje 'husbands' 
(Source: Sani, 1989:30). 
 
This feature is equally found in Ebira as exemplified below: 
(i) si  ezí → ʃezi  'to look for children' 
(ii) zi  ẹ̀va → ʒẹva 'to hurt the oracle' 
(Source: Adive, 1985:56, cited in Yusuf, 2010:51). 
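
The Hausa palatalisation pattern in Table 1.1 can be stated as a rewrite rule: an alveolar 's', 'z', 't' or 'd' is replaced by its palatal counterpart when the next segment is a front vowel. The Python sketch below is only an illustration of that rule as described above; the function name `palatalise` is hypothetical, tone marks are omitted, and the sketch makes no claim about exceptions or about the Ebira data, where vowel deletion is also involved.

```python
# Alveolar -> palatal correspondences as given in the text (Hausa orthography).
PALATAL = {"s": "sh", "z": "j", "t": "c", "d": "j"}
FRONT_VOWELS = {"i", "e"}


def palatalise(form):
    """Palatalise every alveolar that immediately precedes a front vowel."""
    out = []
    for i, ch in enumerate(form):
        nxt = form[i + 1] if i + 1 < len(form) else ""
        if ch in PALATAL and nxt in FRONT_VOWELS:
            out.append(PALATAL[ch])     # e.g. s -> sh before e
        else:
            out.append(ch)
    return "".join(out)
```

Applied to the starred intermediate forms in the table, `palatalise("kasase")` gives kasashe, `palatalise("butoti")` gives butoci, `palatalise("gidade")` gives gidaje and `palatalise("mazaze")` gives mazaje.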
 
1.5.2  Elision  
 Elision is concerned with the loss of a phoneme under some language-specific 
conditions. It affects vowels and consonants alike.   
 
1.5.2.1  Vowel elision 
Vowel elision is a process where a vowel which is normally pronounced in 
slow speech or in a word uttered in isolation is elided in connected speech. Vowel 
elision has been shown to be one of the means of resolving vowel hiatus (a sequence 
of vowels across a syllable boundary), which many languages prohibit (Orie and 
Pulleyblank, 2002). In instances of such vowel sequences, one of the two adjacent 
vowels is deleted. The following instances are taken from Yoruba, Igbo and Urhobo. 
 
Yoruba (Source: Orie and Pulleyblank, 2002): 
 
(i) owó ki  owó   →   owók-ówó  →   owókówó  'any money at all/bad money'  
money any money  
 
(ii) aya     ọba →   aya-ba   → ayaba 'queen' 
wife   king 
 
 
 
Igbo: 
 
(i) uzọ amaka → uz-amaka  → uzamaka  'road is good'  
 
(ii) ije ọma   → ij-ọma  → ijoma  'safe journey' 
  
Urhobo (Source: Yusuf, 2010:289): 
(i) dὲ + úkó  →  d- úkó  →  [duko]  
buy cup      'buy a cup'  
 
(ii) ɔ̀gɔ  + óbiébì → ɔ̀g- óbiébì → [ɔgobiebi] 
bottle black     'a black bottle' 
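
The hiatus-resolution pattern in the examples above can be sketched as follows. This Python fragment is illustrative only: the function name `elide` and its `delete` parameter are assumptions, the inputs are toneless transcriptions in which each vowel is a single character, and the choice of which vowel is lost is left to the caller because, as the Yoruba examples show, it varies with the language and the construction.

```python
# Vowel letters recognised by this sketch (an assumed, partial inventory).
VOWELS = set("aeiouɔɛε")


def elide(word1, word2, delete="first"):
    """Join two words, deleting one vowel where they meet in hiatus.

    delete="first"  drops the final vowel of word1;
    delete="second" drops the initial vowel of word2.
    """
    if not word1 or not word2:
        return word1 + word2
    if word1[-1] not in VOWELS or word2[0] not in VOWELS:
        return word1 + " " + word2      # no hiatus, nothing to resolve
    if delete == "first":
        return word1[:-1] + word2
    if delete == "second":
        return word1 + word2[1:]
    raise ValueError("delete must be 'first' or 'second'")
```

With tone marks and subdots stripped, `elide("uzo", "amaka")` returns uzamaka (first vowel lost), while `elide("aya", "oba", delete="second")` returns ayaba (second vowel lost).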
 
1.5.2.2  Consonant elision 
 
Consonant elision is concerned with the deletion of adjacent consonants. 
Akinlabi (2004:466-477) discusses the three most common and most predictable contexts 
of occurrence of this process in Yoruba. The first context describes deletion which 
occurs when two contiguous syllables contain similar consonants. In such a situation, 
the first of the two similar consonants is deleted and the vowels are assimilated, e.g.  
(a) egúngún → eégún  (masquerade) 
(b) òtítọ → òótọ (truth) 
The second context concerns glides /w/ and /y/ which may be deleted between two 
vowels when followed by back vowels /u, o, ɔ/ and front vowels /i, e, ε/ respectively, 
e.g. 
(a) àwùjọ → àùjọ (assembly of persons) 
(b) adìyẹ → adìẹ (chicken) 
The third context is r-deletion. This may occur when /r/ occurs between two identical 
vowels or /r/ is preceded or followed by a high vowel, e.g. 
(a) wèrèpè → wèèpè (nettle) 
(b) òrìsà → òòsà (god) 
In the same vein, Yusuf (2010:48) cites the following examples of consonant elision 
from Ebira: 
(i) awuru → aaru 'gown' 
(ii) avaba → aaba 'all' 
 
1.5.3  Epenthesis (Insertion) 
Epenthesis is a phonological process which involves the insertion of an extra 
segment into an utterance in order to break up a cluster of consonants not permitted 
by a language or to prevent a closed syllable from ending a word. As Oyebade (1998) 
observes, epenthesis is commonly employed in many African languages to break up 
consonant clusters of loan words for smooth production. The following instances are 
cited by Yusuf (2010): 
Loan words   Ebira      Edo        Yorùbá 
bread        iburedi    eburedi    búrẹdì 
belt         ibeliiti   ebeliiti   bẹlíìtì 
comb         ikoomu     ekoomu     kóòmù 
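
The repair strategy visible in these loan words (a vowel inserted inside a consonant cluster and after a word-final consonant) can be sketched in Python as below. The sketch is a simplification offered for illustration: the function name `epenthesise` is hypothetical, the inputs are rough consonant-vowel skeletons of the English words rather than phonetic transcriptions, and the choice of 'u' after a labial and 'i' elsewhere is an assumption read off the forms in the table. Vowel length, tone and the initial i-/e- of the Ebira and Edo forms are not modelled.

```python
VOWELS = set("aeiou")
LABIALS = set("bmpfw")      # assumed trigger set for the rounded epenthetic vowel


def epenthesise(word):
    """Insert a vowel after any consonant not already followed by a vowel."""
    out = []
    for i, ch in enumerate(word):
        out.append(ch)
        if ch in VOWELS:
            continue
        nxt = word[i + 1] if i + 1 < len(word) else None
        # A consonant before another consonant, or at the end of the word,
        # would create a cluster or a closed syllable: repair it.
        if nxt is None or nxt not in VOWELS:
            out.append("u" if ch in LABIALS else "i")
    return "".join(out)
```

Under these assumptions, `epenthesise("bred")` returns buredi, `epenthesise("belt")` returns beliti and `epenthesise("kom")` returns komu, which correspond to the segmental shape of the Yoruba forms once length and tone are set aside.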
 
From the foregoing discussion on phonological processes in indigenous Nigerian 
languages, it is obvious that the operational mechanisms of these processes differ from 
one language to another, though the processes themselves are language universal. This informs the need to 
investigate how Nigerian speakers of English react to these processes in Standard 
British English connected speech, given that they have already formed a speaking 
pattern in their indigenous languages.  
 
1.6 Statement of the problem 
A large volume of research has concentrated on characterising Nigerian 
English sound segments (e.g. Adetugbo, 1977; Ekong, 1978; Jibril, 1986; Aladeyomi, 
2002; Aladeyomi and Adetunde, 2007; Soneye, 2008) and suprasegmental features 
(e.g. Amayo, 1981; Atoye, 1991, 2005a; Udofot, 1997, 2004; Akinjobi, 2004; Gut, 
2001; Jowitt, 2000; Olaniyi, 2007; Oladipupo, 2008) with particular reference to how 
they deviate from or approximate to Standard British English. In contrast, such 
elaborate attention has not been paid to the sub-segmental (also contextual) features of 
connected speech (the effects of adjacent sounds on each other in a stream of 
connected speech). Yet human speech sounds are not so discrete, the prevalence of 
segmental and suprasegmental description notwithstanding. As a matter of fact, a 
segmental phonetic transcription is widely considered an abstract imposition on 
speech; sound segments actually behave in different ways in connected speech.  
Indeed, if the question of intelligibility between native and non-native 
speakers is to be adequately addressed, there is a need to redirect the focus of 
phonological inquiry to connected speech processes. This is because it is at the level of 
connected speech that the typical difference between native and non-native English 
accents is most pronounced and intelligibility is most impaired (Laver, 1968; 
Gimson, 1980; Katalin and Szilárd, 2006).  
Meanwhile, the few existing studies (e.g. Laver, 1968; Jibril, 1982; Joshua, 2009) 
in this domain have been confined to mere identification of the processes that 
characterise Nigerian English both within words and across word boundaries; studies 
that give priority to Nigerian English speakers' proximity to Standard British English 
(SBE) connected speech are scarce. In view of this, it becomes pertinent to pay more 
attention to the sub-segmental domain of Nigerian English phonology, particularly 
in relation to speakers' proximity to Standard British English.  
Besides, little attempt has been made by scholars to examine the social 
differentiation of Nigerian English speakers in terms of connected speech processes, as 
proposed by this study. The only study we are aware of is Jibril (1982), whose 
preoccupation, however, was with regional variation only. This study, therefore, 
investigates the incidence of assimilation, elision and liaison processes of SBE 
connected speech (across word and morpheme boundaries) in NE, in relation to the 
region, gender and age of speakers. This is with a view to determining the level of NE 
speakers' approximation to or deviation from SBE connected speech and unravelling their 
social variation. The variationist perspective of this study is necessitated by Kerswill's 
(1985, 1987) observation that connected speech processes may be socially 
differentiated in a speech community depending on regional affiliation, age, sex and 
socio-economic class of speakers, and may be adopted or avoided by members of a 
particular sociolinguistic group. This is an aspect of phonological inquiry which, 
according to Huber and Brato (2008), is under-researched in the L2 varieties of 
English; but, in our view, may turn out to be an essential component in the description 
and codification of Nigerian English.  
 
1.7 Aim and objectives 
 
There is, no doubt, a marked difference between Standard British English and 
Nigerian English, not only in isolated sound segments, but also at the level of 
connected speech (where contiguous sounds slur into one another and are thereby 
modified or simplified). The aim of this study, therefore, is to investigate the incidence 
of certain Standard British English processes (assimilation, elision and liaison) in the 
connected speech of Nigerian English speakers, differentiated by region, gender and 
age. The study shall achieve the following objectives: 
 
(i) ascertain the incidence of assimilation, elision and liaison processes of SBE
connected speech in Nigerian English;
(ii) determine the extent to which NE speakers approximate to or deviate from
Standard British English connected speech;
(iii) discover connected speech processes typical of Nigerian English, if any;
(iv) examine the social variation of assimilation, elision and liaison in Nigerian
English in terms of the region, gender and age of speakers; and
(v) identify possible factors that motivate participants' performance.
 
1.8 Research questions 
The resolution of the stated objectives shall be guided by the following research 
questions: 
(i) are there incidences of assimilation, elision and liaison processes of SBE 
connected speech in Nigerian English?  
(ii) to what extent do Nigerian English speakers approximate to or deviate from the 
Standard British English connected speech processes? 
(iii) are there typical Nigerian English CSPs?  
(iv)  are assimilation, elision and liaison socially differentiated in Nigerian English 
in terms of the region, gender and age of speakers? 
(v) what are the possible motivations for participants' performance?
 
1.9 Research methodology 
Insights from Phonetics/Phonology and Sociolinguistics as well as various 
statistical tools were employed to address the issues raised in this study. The analyses 
covered both auditory and acoustic phenomena. 
 
1.9.1 The participants 
 
The participants in the study were 180 males and 180 females aged between 18 and
65, born and educated in Nigeria, with a minimum of 2-3 years post-secondary
education. They were drawn, through stratified and purposive techniques, from four
regions in Nigeria: North (120), West (80), East (80) and South-South (80) (see
Appendix A). For the purpose of data gathering and variational analyses, participants
from each region were sub-divided into four social categories (according to age and
gender): Young Male, Adult Male, Young Female and Adult Female. Each
category comprised 90 participants (30 from the North, 20 from the West, 20 from the
East and 20 from the South-South region), making three hundred and sixty (360)
participants altogether (Appendix A). Two educated native speakers served as control.
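
The sampling design just described can be checked with a short sketch. The per-category regional quotas and the category labels are taken from the text above; the function and variable names are hypothetical and serve only to illustrate the arithmetic.

```python
# Illustrative check of the sampling design (names are hypothetical).
# Quotas per social category, as stated in the text.
REGION_QUOTA_PER_CATEGORY = {"North": 30, "West": 20, "East": 20, "South-South": 20}
CATEGORIES = ["Young Male", "Adult Male", "Young Female", "Adult Female"]


def build_design():
    """Return {(category, region): number of participants} for every cell."""
    return {
        (category, region): quota
        for category in CATEGORIES
        for region, quota in REGION_QUOTA_PER_CATEGORY.items()
    }


def totals_by(design, index):
    """Sum cell sizes by category (index 0) or by region (index 1)."""
    totals = {}
    for key, n in design.items():
        totals[key[index]] = totals.get(key[index], 0) + n
    return totals


design = build_design()
category_totals = totals_by(design, 0)
region_totals = totals_by(design, 1)
grand_total = sum(design.values())
# " Male" (with the leading space) matches the two male categories only.
male_total = sum(n for (category, _), n in design.items() if category.endswith(" Male"))
```

Under these quotas, each of the four categories sums to 90, the regional totals come to 120, 80, 80 and 80, and the grand total is 360, with 180 males and 180 females.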
 
1.9.2 Research instruments 
The research instruments used for investigating these phenomena were a speech
elicitation procedure and a structured questionnaire. For speech elicitation, the
Semi-Spontaneous Speech (SSS) style was used. The data, adapted from Gimson
(1980) and Dziubalska (1990), comprised thirty-one utterances (Appendix B, Test 1)
and a short passage (Appendix B, Test 2), containing various CSP sites. The
questionnaires were used to elicit information on the personal, educational, regional,
linguistic and socio-economic backgrounds of the participants, which was required for
the sociophonetic analysis of the data (Appendices C and D).
1.9.3 Data gathering procedure 
The participants and the control were guided to produce Test 1, which
comprised thirty-one utterances, into digital recording devices. In order to ensure
approximation to natural speech, corresponding questions were constructed to guide
the production of each item. Based on these, the researcher engaged each person in a
question-and-answer session in a manner that resembled casual conversation. The
participants were also instructed to read Test 2, which was a short passage on a car sale,
as naturally as possible, as though they were negotiating. Their initial
attempts were recorded and then played back to verify whether the conversations
sounded casual and natural enough. The final recordings were made once the
researcher was satisfied with their performances.
  
1.9.4 Data analysis 
 
Two major levels of analysis were adopted in the work. First, the recordings
were played back, and instances of assimilation, elision and liaison identified
at different boundaries in the data were transcribed perceptually and analysed
statistically, using percentages, Multivariate Analysis of Variance (MANOVA) and
Bonferroni's Post-hoc test.
An appropriate (SBE) variant in each context was allotted one (1) mark, while
a zero mark was recorded for each inappropriate (non-SBE) variant. The total
scores for all participants in each variant were converted to percentages, the higher
percentage taken as the norm. The percentage scores were then represented graphically
and the findings examined against Standard English phonological rules, as provided in
generative phonology, to ascertain Nigerian English speakers' application of or
deviation from the rules. In order to test for significant differences between the social
categories of speakers in their application of Standard British English CSPs,
participants' scores were subjected to Multivariate Analysis of Variance (MANOVA)
and Bonferroni's Post-hoc test.
Second, portions of the semi-spontaneous speech data produced by eight (8)
Nigerian participants (representing the four regions and the social categories) were
analysed acoustically with a view to corroborating the findings obtained through
statistical analysis. The same two levels of analysis were also used to analyse the
control's production of the data.
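
The scoring step of the first level of analysis can be sketched as follows. The marks (1 for an SBE variant, 0 for a non-SBE variant) and the rule that the higher percentage is taken as the norm come from the description above; the participant labels, contexts and marks in the example are invented for illustration, and the handling of an exact 50% tie is an assumption, since the text does not address it.

```python
# Minimal sketch of the scoring procedure; all data below are invented.

def percentage_sbe(scores):
    """scores: {participant: {context: 1 or 0}} -> {context: % of SBE variants}."""
    totals, counts = {}, {}
    for marks in scores.values():
        for context, mark in marks.items():
            totals[context] = totals.get(context, 0) + mark
            counts[context] = counts.get(context, 0) + 1
    return {context: 100.0 * totals[context] / counts[context] for context in totals}


def norm(percent_sbe):
    """The variant with the higher percentage is taken as the norm."""
    if percent_sbe > 50:
        return "SBE variant"
    if percent_sbe < 50:
        return "non-SBE variant"
    return "no clear norm"  # assumption: ties are not discussed in the text


# Hypothetical marks for four participants in two contexts.
example_scores = {
    "P1": {"ten bikes": 1, "would you": 0},
    "P2": {"ten bikes": 0, "would you": 0},
    "P3": {"ten bikes": 1, "would you": 1},
    "P4": {"ten bikes": 1, "would you": 0},
}
percentages = percentage_sbe(example_scores)
norms = {context: norm(p) for context, p in percentages.items()}
```

The significance testing (MANOVA and the Bonferroni post-hoc test) is not sketched here; it would operate on the per-participant scores grouped by region, gender and age.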
 
1.10 Scope of the study 
This study is a hybrid of two distinct linguistic fields: Sociolinguistics and
Phonetics. Therefore, insights, methodologies and analytical tools from both fields
were employed. Since variability in speech is known to be a function of different
factors, such as aerodynamic operations, language-specific variation and social factors,
this study, although it emphasised socially conditioned features of SBE connected
speech in Nigerian English, also sought explanations for Nigerian English connected
speech behaviour from other sources (e.g. phonological naturalness, mother tongue
influence). This is because CSPs, according to Nolan & Kerswill (1990), are actually a
function of different phenomena.
 Furthermore, connected speech processes are of many types, e.g. assimilation,
reduction, elision, lenition, liaison, epenthesis, /l/ vocalization, glottalisation, /l/
darkening, juncture, etc. It is not our intention in this study to examine all the possible
connected speech processes of SBE in Nigerian English, given the constraints of time
and space. The study was instead restricted to variants of the assimilation, elision and
r-liaison processes commonly employed in SBE. The decision to limit the study to these
features was informed by two factors. First, concentrating on a few features afforded us
the opportunity of conducting an in-depth investigation into each process. Second, the
CSPs under consideration form the major and commonest subsegmental features of
connected speech in SBE (Cruttenden, 2001).
 In choosing the participants, the pluralistic nature of the indigenous languages
in Nigeria was taken into consideration. The selection was representative of four
regions in Nigeria, so delimited for the purpose of this study: North (comprising
Hausa, Fula, Kanuri and a few other minority languages spoken in the region), East
(Igbo), West (Yoruba) and South-South (comprising Edo, Esan, Izon, Annang,
Urhobo, Ibibio, etc.). This, it was believed, would make it possible to capture Nigerian
speakers of English of different linguistic backgrounds, since it would
be an arduous task to select participants from all the available language groups in
Nigeria, where over five hundred languages are spoken (Ethnologue,
2013).
 Furthermore, following the sociophonetic approach employed in this study, the
social variation analysis was restricted to region, gender and age. The analysis would
have been too cumbersome had we decided to examine ethnicity or
language groups, rather than region, as we would have had very many language groups
to contend with. The same reason accounts for the exclusion of the variable of
socio-economic class.
 
1.11 Significance of the study 
The primary preoccupation of scholars of Nigerian English today is the 
characterisation and eventual codification of this variety of English. So far, concerted 
efforts have been made by various scholars in this direction at all linguistic levels:
lexis, syntax, phonology, pragmatics, etc. At the phonological level, researchers have 
explored extensively, though not exhaustively, the segmental and suprasegmental 
features of Nigerian English. The sub-segmental domain, which deals with the effects 
of adjacent sounds on each other in a stream of connected speech, has, however, not 
been given elaborate attention.   
  Therefore, the study will, without doubt, contribute immensely to the 
description and possible codification of Nigerian English, as it aims to identify the 
connected speech features observed in Nigerian English, pointing out areas of 
convergence and divergence between SBE and NE and providing useful explanations 
for their occurrence or otherwise. Furthermore, the sociophonetic approach employed 
in the study will reveal the social distribution and differentiation of the CSPs of 
Nigerian speakers of English, on the basis of which valid judgment can be made with 
regard to who uses what CSPs in Nigerian English.  
 More importantly, it will provide the basis for comparing Nigerian English with
the Standard British accent and thereby portray Nigerian English as a distinct variety
of World Englishes. Pedagogically, the study will be of immense value to language
planners and teachers, as well as Nigerian learners of English, since it seeks to provide
phonological explanations for the marked difference between Nigerian English and
native English speech, and to unravel possible intelligibility problems.
  
1.12 Limitations and constraints 
In view of time and space and, more importantly, the need to keep the
analysis manageable, the study is limited to just three features of connected speech:
assimilation, elision and liaison. For the same reason, only semi-spontaneous speech
data was collected; natural speech data was excluded.
Also, considering the large population of respondents involved in this study, it 
was not possible, in all cases, to conduct the recording sessions in a quiet venue, since 
the participants had to be consulted in their offices, institutions and open places. This, 
in a way, affected some of the recordings, as background noise was created. However, 
we were able to get a good number of clear recordings used for the analyses. 
Finally, the researcher was constrained by a number of factors during the data
gathering period. At one point, it became very difficult to reach some of the target
population for a number of reasons. First was the insurgency by the 'Boko Haram' sect
in the northern part of the country, which restricted the researcher's access to that area
for some time. Second was the variational nature of the research, which required data to
be collected from different categories of people. Certain sets of participants were
difficult to reach; for example, it took time and energy to gain access to Northern
women for religious reasons.
 
 
 
 
 
CHAPTER 2 
 
REVIEW OF LITERATURE 
 
2.0  Introduction 
 This chapter discusses the major concepts of this study, that is, connected
speech processes and sociophonetics. It also reviews various scholarly contributions to
these concepts, as well as the notion of 'Nigerian English', within whose purview this
research is being carried out.
 
2.1 Connected speech processes 
During speech, words are not usually spoken in isolation but in a flowing and
continuous stream. Thus, the distinctness of sounds implied by phonemic transcription is
obviously non-existent, even in carefully spoken citation forms. As Pike (1948) opines,
sounds tend to slur into one another. This implies that segments are capable of being
influenced and modified in varying degrees by other adjacent sounds in connected
speech, especially at morpheme or word boundaries (Nolan and Kerswill, 1990; Roach
and Widdowson, 2001). Nolan and Kerswill (1990:295), in this regard, assert:
 
The physical activity of speech is continuous rather than 
discrete. Successive phonetic events blend into each other so 
that the segment boundaries implied by the transcription are 
often not evident, and the realizations of a given phonetic 
category may range along a continuum of fine allophonic 
variation according to phonetic environment. 
 
The modifications that occur to sound segments in connected speech involve
phonemic alterations or simple allophonic realisations in which the less important
consonants, vowels, or syllables in words are altered or removed, contiguous sounds
come to resemble each other, or a sound is inserted. Sometimes, the change may be so
complex that it does not even reflect the sounds' properties. To buttress this claim,
Nolan and Kerswill (1990) provide the example of an utterance: I don’t suppose you
could make it for five, transcribed phonemically as
/ai dǝunt sǝpǝuz ju: kʊd meik it fɔ: faiv/; but
which becomes: [nspeuӡxebme:xif̩faiv] when rendered in fluent and fast speech 
through the processes of reduction, lenition, assimilation and deletion. 
This range of phenomena by which the "explicit, dictionary-type forms of sounds 
are converted to the phonetic properties of fluent speech by a variety of reduction and 
simplification processes” (Nolan and Kerswill, 1990:296) is what is technically 
referred to as connected speech processes (CSPs). Among these cross-word processes 
are assimilation, reduction, elision (deletion), lenition, liaison (linking), epenthesis 
(insertion), /l/ vocalization, glottalisation, /l/ darkening, juncture, etc. 
The occurrence of CSPs has been traced to a number of sources. One is
articulatory economy, whereby speakers attempt to apply less articulatory effort in the
pronunciation of contiguous sounds in connected speech, with a view to reducing the
number, or the extent, of the movements and adjustments of the speech organs
(Abercrombie, 1967; Foulkes, 2006). Scholars of this theoretical persuasion who
studied the effect of speaking rate on articulation (e.g. Gay, 1968; Crystal and House,
1988a,b; Perkell, Zandipour, Matthies and Lane, 2002) have shown that a faster rate of
speaking usually leads to articulations of shorter duration, increased overlap, and
greater articulatory undershoot (Foulkes, 2006).
However, Ohala (1983) reasons otherwise. He is of the view that changes in
speaking rate cannot affect all sounds equally, since the degrees of
inertia and the speed of movement of the articulators are not the same. As far as he is
concerned, CSPs are a result of limitations of the speech mechanism and/or the
operation of aerodynamic principles in the vocal tract. That is, they are products of
variation in the structures of the vocal tract. He cites the example of stops, which
usually change to affricates in the environment of close vowels or /j/ (e.g. the
pronunciation of tune as [ʧun] in some varieties of British English). According to him,
the change is not occasioned by articulatory change but is due to the aerodynamics of
the vocal tract setting. Foulkes (2006:3), in this regard, also opines:
 
Speech is largely dependent on the physical properties of the 
vocal-auditory channel, and, of course, no two human beings 
share exactly the same physical characteristics. Differences in 
spoken forms may therefore emanate from physical differences 
in each link in the chain. Furthermore, these physical 
differences are not only to be found across speakers: 
individuals are also subject to long- or short-term physical 
changes in the vocal tract and auditory system, which in turn 
may yield long- or short term effects on their speech or 
hearing. 
 
Again, this view of the mechanical determination of CSPs has been shown to be
inadequate. CSPs, as has been discovered, differ from one language, dialect or
individual to another (Lindblom, 1963; Byrd, 1994; Laver, 1994), whereas the innate
constraints of the vocal tract are universal (Foulkes, 2006). Laver (1994), for instance,
reveals that regressive voicing assimilation is not observed in RP pronunciation,
whereas it is found in some Scottish accents (e.g. the medial consonant cluster in
birthday may be pronounced [-ðd-]). It appears then that CSPs are determined by
language-specific rules which seem to dictate what particular processes are to be
allowed in a particular language or dialect (Kerswill, 1987; Nolan & Kerswill, 1990;
Lindblom, 1963; Byrd, 1994; Laver, 1994). These processes, thus, form part of the
phonological knowledge internalised by the speakers of a language.
Against this backdrop, Nolan & Kerswill (1990) conclude that CSPs are
actually a function of different phenomena.
 
2.2  Connected speech processes in Standard British English  
Speech is not just sounds in isolation, but a flow of sounds based on a system
through which phonemes are connected, grouped and modified in a certain manner.
Native speakers of English, in particular, do not pronounce words with gaps but join
them together in a stream of sounds, as a result of which they are able to speak quickly
and fluently. In the course of speaking, therefore, single words, which ordinarily are
pronounced clearly in isolation, undergo a number of context-induced phonetic
modifications, especially at word boundaries.
According to Gimson (1980), the word, just like the phoneme, is an abstracted 
linguistic unit when considered from the perspective of its actual phonetic realisations 
under the influence of adjacent sounds or stress or rhythmic pattern. This is because 
the pronunciation of a word in connected speech is subject to the influence of other 
adjacent sounds or of the stress or rhythmic group of which it forms part. The 
modification, according to him, may affect the whole word (e.g. weak forms or word 
stress patterns), or the segment appearing at the word boundary (e.g. junctural 
assimilation, elision, and liaison forms). It follows from the foregoing, therefore, that 
there are two subgroups of connected speech processes in SBE. The first comprises 
suprasegmental features of stress, rhythm and intonation, as well as vowel reduction, 
which characterise larger strings like syllables or utterances; while the second
subgroup belongs to the subsegmental domain, which deals with the effects of
adjacent sounds (vowels and consonants) on each other in a stream of speech (Foulkes
and Docherty, 2006; Katalin and Szilárd, 2006).
This section reviews connected speech processes of SBE from both 
perspectives, but pays more attention to the subsegmental subgroup which is the 
concern of this study. 
 
2.2.1  Reduction 
Reduction, according to Bald (1990:317), is “a process in which a form or set
of forms undergoes changes with respect to certain phonetic features”. An instance of
this feature in English is vowel reduction, the process by which full vowels are
replaced by the weak or reduced vowels /ǝ/, /ɪ/ and /ʊ/ in unstressed syllables. It is a
principal means by which syllables can be squeezed. Gimson (1980) opines that a
common phenomenon in the various stages of the evolution of English is for unstressed
syllables to undergo a process of gradation, which may involve the complete
disappearance of phonemes or the obscuration of vowels. In content words, unstressed
vowels normally weaken to /ə, ɪ/ and, less often, /ʊ/, or are sometimes deleted
completely. The following are instances of weakening in English:
/ɒ/ → [ə] pilot /ˈpailət/
/ɜ/ → [ə] survive /səˈvaiv/
/ʌ/ → [ə] surplus /ˈsɜ:pləs/
/eɪ/ → [ɪ] village /ˈvɪlɪʤ/
/e/ → [ɪ] challenge /ˈʧælɪnʤ/
In the same vein, unstressed function or grammatical words usually show 
reduction of the length of sounds, obscuration of vowels towards / ə, ɪ, ʊ /, and the 
elision of vowels and consonants in connected speech in SBE, except when used for 
special emphasis. Most function words, therefore, commonly have varied 
pronunciations depending on whether they are strong or weak. Katalin and Szilárd 
(2006), in this regard, opine that as many as 95% of the occurrences of function words 
in native English speech are weak. A situation whereby only strong forms are used in 
speech is usually considered typical of foreigners; such pronunciations normally sound 
unnatural and foreign to native speakers of English. The same source provides a list of 
such function words (determiners, pronouns, prepositions, conjunctions and
auxiliaries) with their strong and varied weak forms as follows:
 
Table 2.1 Strong and weak forms 
 
No. | Word | Strong form | Example (strong) | Weak form(s) | Example (weak)
1 | the | ðiː | It's not "a" cat, it's "the" cat! | /ðǝ/, /ðɪ/ | the /ðǝ/ dog, the /ðɪ/ end
2 | a, an | eɪ, æn | | ǝ, (ǝn) | a dog, an end
3 | some | sʌm | I'll get you some. | s(ǝ)m | I'll get you some apples.
4 | his | hɪz | It's his car, not mine. | (h)ɪz | what's-his-name
5 | your = you're | jɔː(r), jʊǝ(r) | Is this YOUR CV? | jǝ(r) | Mind your head!
6 | (s)he, we, you | hiː, ʃiː; wiː; juː | All I want is YOU. | (h)ɪ, ʃɪ; wɪ; jʊ | I'll get you some apples.
7 | him | hɪm | Whom do you love: him or her? | (h)ɪm | I love him.
8 | her | hɜː(r) | Whom do you love: him or her? | (h)ǝ(r), ɜ:(r) | I love her.
9 | their, them | ðeǝ(r); ðem | It wasn't US, it was THEM. | ð(e)ǝ(r); ð(e)m | Do you hate them?
10 | us | ʌs | | ǝs | one of us is crying
11 | there | ðeǝ(r) | There you are! | ðǝ(r) | There’s a book on the table
12 | at | æt | What's he getting at? | ǝt | Look at me
13 | for | fɔː(r) | It's just what I long for. | f(ǝ), fr, f | Stay for a week /steɪ frǝ wiːk/
14 | from | frɒm | Where are you from? | frǝm | He's from Barcelona.
15 | of | ɒv | It's love I've a lot of. | ǝv | one of us
16 | to | tuː | Who did you give it to? | tǝ, tʊ | to /tǝ/ me, to /tʊ/ Ann
17 | than | ðæn | "Than" is spelt with an "a" not an "e". | ðǝn | even better than the real thing
18 | and | ænd | "And" is a conjunction. | (ǝ)n(d) | Twist and shout!
19 | but | bʌt | Don't say "but"! | bǝt | sad but true
20 | that | ðæt | What's that? | ðǝt | the book that we bought
21 | or | ɔ:(r) | To be or not to be? | ǝ(r) | sooner or later
22 | as | æz | as and when | ǝz | as good as it gets
23 | have, has, had | hæv; hæz; hæd | Have you seen her? Had I known him earlier...! | (h)ǝv, v; (h)ǝz, z, s; hǝd, d | You've got to know. She's got it. It's been a year. You'd better stop!
24 | can, could | kæn; kʊd | Can you dance? Yes, you could. | k(ǝ)n; kǝd | I can see. You could be mine.
25 | will, would | wɪl; wʊd | Will Susan be there? Would you like it? | (w)(ǝ)l; (w)(ǝ)d | Susan will be at home. I'd rather sail away.
26 | shall, should | ʃæl; ʃʊd | Shall I open the window? | ʃǝl; ʃǝd | I think you should work harder
27 | must | mʌst | You MUST hold on! | mǝs(t) | I must go now.
28 | do, does | duː; dʌs | How do you do? Yes, she does! | dʊ, d(ǝ); d(ǝ)s | How do you do? What does he do?
29 | am, are, was, were | æm; aː(r); wɒz; wɜː(r) | I AM hungry! He said he wasn't sleepy but he was! | (ǝ)m; ǝ(r); wǝz; wǝ(r) | I'm hungry. They were all drinking in the pub.
30 | been | biːn | Where have you been? | bɪn | I've been busy all day.
 
(Source: Katalin and Szilárd, 2006:103-107) 
 
The weak forms of the function words, according to the above source, normally occur
within the sentence, e.g.
It's time to /tǝ/ go on
and at the beginning of the sentence (with the exception of auxiliaries), e.g.
To /tʊ/ err is human.
The strong forms, on the other hand, are used at the end of the sentence, e.g.
I can do it if you want me to /tu:/
or within the sentence for purposes of emphasis or contrast, i.e.
(i) when the word is contrasted or co-ordinated with another one, e.g.
Both of them can /ˈkæn/, but only Jack will /ˈwɪl/, answer this question, or
It's at /ˈæt/ the corner, not on /ˈɒn/ the corner
(ii) when it is cited or quoted, e.g.
Don't say 'but'! /ˈbʌt/
(iii) or when the word is emphasised, e.g.
You must /ˈmʌst/ hold on! or
He does /ˈdʌs/ do the homework regularly!
The strong form is also used when a preposition is followed by a pronoun at the end of
a sentence, e.g.
I'm looking at you /ˈæt ju:/.
However, there are certain exceptions to these rules. The strong forms of object
pronouns are not normally used even at the end of the sentence, e.g.
Have you seen them? /ðm/
Again, the negative form of auxiliary verbs is never weakened, e.g.
I can't /ˈkænt/ (or cannot /ˈkænɒt/) dance
and, usually, though not always, auxiliaries occur strong at the beginning of the
sentence, e.g. Can /ˈkæn/ you dance?
Finally, some function words have strong forms only, e.g. auxiliaries (did, may, might,
need), prepositions (in, off, on, up), conjunctions (though, when), pronouns (that, these,
those, who), and the negative particle not; they cannot be weakened.
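
The distribution rules summarised above can be expressed as a small decision procedure. The following is only a rough sketch: the function name, its parameters and the three-way position label are hypothetical, the word 'that' is left out of the strong-only set because only its pronoun use is strong-only, and the tendency for sentence-initial auxiliaries to be strong ("usually, though not always") is treated here as categorical.

```python
# Rough sketch of the strong/weak form rules; names are hypothetical.

# Function words listed in the text as having strong forms only.
# 'that' is omitted: only the pronoun is strong-only, not the conjunction.
STRONG_ONLY = {
    "did", "may", "might", "need",      # auxiliaries
    "in", "off", "on", "up",            # prepositions
    "though", "when",                   # conjunctions
    "these", "those", "who",            # pronouns
    "not",                              # negative particle
}


def choose_form(word, word_class, position, emphasised=False,
                negative=False, before_final_pronoun=False):
    """Return 'strong' or 'weak' for a function word.

    position: 'initial', 'medial' or 'final' within the sentence.
    emphasised: covers contrast, co-ordination, citation and emphasis.
    """
    if word in STRONG_ONLY:
        return "strong"
    if emphasised:
        return "strong"
    if word_class == "auxiliary" and negative:
        return "strong"  # negative auxiliaries are never weakened
    if word_class == "preposition" and before_final_pronoun:
        return "strong"  # e.g. "I'm looking at you"
    if position == "final":
        # Object pronouns stay weak even sentence-finally.
        return "weak" if word_class == "object pronoun" else "strong"
    if position == "initial" and word_class == "auxiliary":
        return "strong"  # simplification of "usually, though not always"
    return "weak"
```

For instance, `choose_form("to", "preposition", "final")` gives "strong" (as in I can do it if you want me to), while `choose_form("them", "object pronoun", "final")` gives "weak" (as in Have you seen them?).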
 
2.2.2   Variation of the word’s accentual pattern (stress) 
 
According to Gimson (1980:285), the accentual (stress) patterns of words may behave
differently in connected speech. Generally, the accentual (rhythmic) pattern of a word
remains constant, notwithstanding the environment. In connected speech, although a
word may lose the nuclear pitch change which it has in isolation, the position of
primary and secondary accents is not changed, e.g.
beˋhind;
ˌget beˋhind me;
beˈhind the ˋbookˌcase.

ˋwindˌscreen;
ˋwindˌscreen ˌwiper;
the ˈwindˌscreen was ˋsmashed;
he ˌbought a ˈnew ˋwindˌscreen.

ˋyesterday;
I ˌsaw him ˋyesterday;
ˈyesterday ˋmorning.

ˋpost ˌoffice;
ˈpost ˌoffice ˴clerk;
ˈnear the ˋpost ˌoffice.
 
However, when a simple or compound word pattern consists in isolation of a
primary accent (stress) preceded by a secondary accent, the primary accent may, in
order to avoid a stress clash, be thrown back to the syllable carrying the secondary
accent in isolation if a strong accent follows closely in connected speech, e.g.
ˈthirˋteen, but ˈthirˌteen ˋshillings
ˈWestˋminster, but ˈWestˌminster ˋAbbey
ˈfull ˋgrown, but a ˈfull ˌgrown ˋman
ˈafterˋnoon, but ˈafterˌnoon ˋtea
Also, when a strongly stressed syllable closely precedes, the potential
pitch-prominent secondary accent may be reduced to one of quality, quantity or rhythm,
without pitch-prominence, e.g.
ˈeight ˌthirˋteen; ˈnear ˌWestˋminster; ˈnot ˌfull ˋgrown; ˈFriday ˌafterˋnoon
Moreover, when the primary accent is shifted back because a strong
accent follows, the secondary accent which falls on the syllable having primary
accent in isolation frequently has no pitch-prominence, and may, if the quality of the
syllable permits, receive no accentual prominence of any kind, e.g. ˈWestˌminster
ˋAbbey or ˈWestminster ˋAbbey.
 
2.2.3  Assimilation 
Assimilation, a process whereby two adjacent sounds become phonetically 
similar, has been extensively described by various scholars. Ladefoged (1993) refers to 
it as a process whereby a sound changes into another under the influence of a 
contiguous sound. To Roach (2000), it is the realisation of a sound in a different way 
as a result of being adjacent to some other phoneme belonging to a neighbouring word. 
Katamba (1989), in like manner, opines that assimilation involves modifying a sound
with a view to making it more similar to some other sound in the environment. In
Crystal's (1991:28) view, assimilation is “the influence exercised by one sound
segment upon the ARTICULATION of another, so that the sounds become more alike,
or identical”. Against this backdrop, assimilation can generally be described as a
process by which a phoneme (sound segment) is modified to resemble a contiguous
one within a word or at word boundary in a string of sounds. For example, the word
'this' has the sound /s/ at the end if it is pronounced in isolation, but when followed by
a word beginning with /ʃ/, as in 'shop', it often changes in rapid speech to /ʃ/, giving the
pronunciation /ðɪʃʃɒp/.
Gimson (1980), in this regard, argues that the actual phonetic output of a 
phoneme, and by extension, a word depends on the context, and so, attention must be 
paid to the mutual influence of contiguous sounds on each other when describing 
speech. In other words, phonetic continuity and merging of qualities as well as 
tendency toward assimilation of phonemes must be considered principal factors in 
connected speech. 
Assimilation types have been described by scholars, using various parameters. 
Skandera and Burleigh (2005:90), in particular, identified four categorisations based 
on:  
• the distance between the two sounds involved: contiguous/contact and
non-contiguous/distant assimilation
• the direction of the influence exerted: regressive, progressive and coalescent
assimilation
• the particular distinctive feature affected: assimilation of voice, place and
manner
• the degree to which one sound assimilates to another: partial and total
assimilation
 
Simo-Bobda and Mbangwana (1993), Roach (2000) and Abercrombie (1967) also
list classifications akin to those identified above. Besides, Abercrombie (1967) and
Simo-Bobda and Mbangwana (1993) add historical, contextual
(juxtapositional), ordinary and similitude assimilation. These assimilation types are,
however, not straitjacketed; they overlap. For instance, assimilation of voice can be
regressive or progressive.
 
 
2.2.3.1  Contiguous/Contact and distant assimilation 
Contiguous (also contact, contextual or juxtapositional) assimilation is a
process whereby the pronunciation of a segment is altered under the influence of an
adjacent sound, especially at a word boundary. Abercrombie (1967:133) describes it as
“changes in pronunciation which take place under certain circumstances at the ends
and the beginnings of words (changes at word 'boundaries', that is to say) when these
words occur in connected speech, or in compounds”. A relevant example cited by this
source is the phrase 'is she'. In isolation, the two words are normally pronounced /ɪz/
and /ʃi:/ respectively; but in connected speech, the phrase becomes /ɪʒ ʃi:/. The two
words are juxtaposed, and 'is' now has a pronunciation different from the one it has
when said in isolation. In terms of directionality, contiguous assimilation may be
regressive (anticipatory), progressive (perseveratory) or coalescent.
Non-contiguous or distant assimilation, on the other hand, relates to
modification involving two sounds which are further apart. An example cited by
Skandera and Burleigh (2005:90) is the idiom turn up trumps [tɜ:m əp trʌmps], where
the /n/ in turn sometimes changes to /m/ under the influence of the bilabial sounds /p/ of
up and /m/ of trumps. This assimilation type, however, rarely occurs in English and is
considered more or less a slip of the tongue.
 
2.2.3.2  Regressive, progressive and coalescent assimilation 
This category of assimilation relates to the direction of the influence the sounds 
exert on each other. Regressive (Anticipatory) assimilation is the type of assimilation 
in which a sound is modified to become more like the phoneme following it or, put in 
another way, whereby a sound influences the preceding one. This is the most common 
type of assimilation in SBE, which applies to place or manner of articulation and state 
of the glottis. For example, in ten bikes [tem baɪks], alveolar /n/ becomes bilabial /m/ 
under the influence of the following bilabial /b/. Similar examples cited by Weisser 
(2005) are: 
[wʌm mæn] (one man) for [wʌn mæn] alveolar nasal → bilabial nasal 
[hæʃ ʃi:] (has she) for [hæz ʃi:] alveolar fricative → palato-alveolar fricative 
[ha:b bæk] (hard back) for [ha:d bæk] alveolar plosive → bilabial plosive 
[gʊb baɪ] (good bye) for [gʊd baɪ] alveolar plosive → bilabial plosive 
[gʊb pɔɪnt] (good point) for [gʊd pɔɪnt] alveolar plosive → bilabial plosive 
[gʊn naɪt] (good night) for [gʊd naɪt] alveolar plosive → alveolar nasal 
[tem pɔɪnts] (ten points) for [ten pɔɪnts] alveolar nasal → bilabial nasal 
[θɪŋ kəʊt] (thin coat) for [θɪn kəʊt] alveolar nasal → velar nasal 
 
Progressive (perseveratory) assimilation is one in which the preceding 
phoneme influences the subsequent one within a word or at word boundary. For 
example, in lunch score /lʌnʧ ʃkɔ:/, alveolar /s/ becomes palato-alveolar /ʃ/ under the 
influence of the preceding palato-alveolar /ʧ/. Weisser (2005) further cites the 
following cases where progressive assimilation occurs with high frequency function 
words, generally determiners that start with a weak fricative /ð/, e.g. 
[ɪnnə]    (in the)  for [ɪnðə]  dental fricative → alveolar nasal 
[ɪnnækkeɪs]  (in that case) for [ɪnðætkeɪs] dental fricative → alveolar nasal 
[ɪnnɪsweɪ]  (in this way) for [ɪnðɪsweɪ] dental fricative → alveolar nasal 
[ɒnnætdeɪ]  (on that day) for [ɒnðætdeɪ] dental fricative → alveolar nasal 
[dæmməm]  (damn them) for [dæmðəm] dental fricative → bilabial nasal 
[hu:zzæt]   (who's that?) for [hu:zðæt] dental fricative → alveolar fricative 
[spɒttəm]  (spot them) for [spɒtðəm] dental fricative → alveolar plosive 
He is, however, of the opinion that this type of assimilation is less common than 
regressive assimilation in SBE.  
In coalescent assimilation, a sequence of two sounds merges or coalesces to 
produce an entirely new one. For example, /d/ and /j/ of would you? /wʊd ju:/ 
commonly coalesce into /ʤ/ (/wʊʤʊ/) in SBE. This assimilation process has been 
given different names and described in various ways. Gimson (1980) and Cruttenden 
(2001) simply call it coalescence, and apply it to instances of /sj/, /zj/, /tj/ and /dj/ 
becoming /ʃ/, /ʒ/, /ʧ/ and /ʤ/ respectively within a word or at word boundary, e.g. 
/sj/ -   in case you need it  [in keiʃʊ  ni:d ɪt]; miss you [miʃu:] 
/zj/ -   has your letter come? [hæʒɔ: letə kʌm]; sees you [si:ʒu:] 
/tj/ -  what you want [wɒtʃʊ wɒnt]; not yet [nɒʧet] 
/dj/ -  would you? [wʊʤʊ]; mind you [mainʤʊ] 
Roach (2000) and Shockey (2003) refer to it as palatalisation. A recent coinage 
adopted for it by Wells (1982, 1994, 2000) is yod coalescence. He, however, limits the 
phenomenon to the environments where /t/ + /j/ and /d/ + /j/ coalesce to become /ʧ/ and 
/ʤ/ respectively.  
The term yod refers to the tenth letter of the Hebrew alphabet, represented as 
palatal approximant /j/ in the phonetic alphabet of English and several other Indo-
European languages. In English, it behaves in different ways in a Cj (consonant + /j/) 
sequence. First, it may be sounded (this is called yod-presence) as in few /fju:/, new 
/nju:/, beauty /bju:ti/, accuse /əkju:z/, pew /pju:/; second, it may be deleted (yod 
dropping) as in chew /ʧu:/, rude /ru:d/, choose /ʧu:z/, lunatic /lu:nətɪk/, lucid /lu:sɪd/; 
and lastly, it may coalesce with alveolar sounds /s, z, t, d/ to evolve an entirely new 
sound (yod coalescence). 
Yod coalescence or coalescent assimilation, according to Hannisdal (2006), is, 
therefore, a subcategory of place assimilation whereby the palatal approximant /j/ 
(yod) fuses or coalesces with a preceding alveolar consonant /s, z, t, d/, either within a 
word or across word boundary, to become palato-alveolar /ʃ, ʒ, ʧ, ʤ/ respectively, as in 
issue /ɪsju:/ becoming /ɪʃu:/, measure becoming /meʒə/, educate /edjʊkeɪt/ becoming 
/eʤʊkeɪt/, soldier becoming /səʊlʤə/, and miss you /mɪs ju:/ becoming /mɪʃu:/. It has 
been described as a process of simplification, a device by which consonant clusters are 
simplified in order to achieve, or at least approach, the preferred CV structure 
(Hannisdal, 2006; Lutz, 1991). 
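The coalescence described above behaves like a rewrite rule over a sequence of phonemes. The following Python fragment is only an illustrative sketch, assuming that a transcription is supplied as a list of phoneme symbols with word boundaries ignored; the names `COALESCENCE` and `yod_coalesce` are hypothetical and are not drawn from any of the works cited.

```python
# Alveolar consonant -> palato-alveolar produced when it fuses with a following /j/.
COALESCENCE = {"s": "ʃ", "z": "ʒ", "t": "ʧ", "d": "ʤ"}

def yod_coalesce(phonemes):
    """Fuse every /s, z, t, d/ + /j/ sequence into a single palato-alveolar."""
    out = []
    i = 0
    while i < len(phonemes):
        current = phonemes[i]
        following = phonemes[i + 1] if i + 1 < len(phonemes) else None
        if current in COALESCENCE and following == "j":
            out.append(COALESCENCE[current])
            i += 2  # the yod has merged with the alveolar, so skip it
        else:
            out.append(current)
            i += 1
    return out

# miss you /mɪs ju:/ treated as one phoneme string
print("".join(yod_coalesce(["m", "ɪ", "s", "j", "u:"])))
```

The sketch applies the rule wherever its context is met; it does not model the stylistic and stress-related restrictions on coalescence discussed in the literature.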
Diachronically, yod coalescence dates back to the seventeenth and eighteenth 
centuries, when unstressed sequences of /tj/, /dj/, /sj/ and /zj/ coalesced following 
borrowings from French (Gimson, 1980), thereby yielding, for instance, the following: 
/sj/ - /ʃ/  ocean, special, issue. 
/zj/ - /ʒ/  occasion, measure, treasure. 
/tj/ - /ʧ/  nature, virtue, picture. 
/dj/ - /ʤ/ grandeur, gradual, educate 
Hannisdal (2006) lists three possible positions where yod coalescence can occur in RP: 
(i) across word-boundaries, e.g. did you? /diʤu /, won’t you? /wəunʧu /; 
(ii) in unstressed syllables within a word, as in education /ˎeʤu:`keiʃn/, statue 
/`stæʧu:/; and less frequently, 
(iii) word-internally before a stressed vowel (/u:/ and /uə/), e.g. Tuesday /`ʧu:zdei/, 
reduce /rɪ`ʤu:s/. 
 According to Cruttenden (2001), yod coalescence is common in fluent 
colloquial speech and well within the boundaries of RP across word-boundaries and in 
unstressed syllables within a word. It is, however, not yet fully acceptable within RP in 
stressed syllables within a word, although there is evidence of change in this direction 
(Wells, 1994; Taylor, 1998; Altendorf, 2003). 
  
2.2.3.3  Assimilation of voice, place and manner 
Assimilation of voice is a process whereby contiguous consonants tend to be 
either all voiced or all voiceless depending on the state of the glottis. Unlike French 
which favours regressive voicing assimilation,  what is permitted in SBE is devoicing. 
This is a process whereby a voiceless consonant affects a voiced one, irrespective of 
the relative order of the two (Katalin and Szilárd, 2006). Thus, devoicing can either be 
regressive or progressive.  
In regressive (anticipatory) devoicing, a voiced sound is modified to become 
more like the voiceless one following it; for example, I have to go is pronounced as [aɪ 
hæftə gəʊ], not as [aɪ hævtə gəʊ]. According to Katalin and Szilárd (2006), this type of 
assimilation is common with fricatives and affricates and is thus referred to as 
'fricative devoicing' by some writers. Gimson (1980), in this regard, points out some 
instances of voicing assimilation at word boundary, in which final voiced fricatives 
followed by a word-initial voiceless consonant may be realised as the corresponding 
voiceless fricative if the two words are closely linked, e.g. 
/ð/ → /θ/ in with thanks, breathe slowly, with some. 
/z/ → /s/ in these socks, he was sent, we chose six, He’s seen it. 
/v/ → /f/ in of course, we’ve found it, they’ve come. 
/ʤ/ → /ʧ/ in Goodge street, bridge score. 
Progressive devoicing follows the same pattern; rather than a voiceless sound 
becoming voiced, a voiced consonant is devoiced to reflect the voicing status of a 
voiceless sound that precedes it at word or morpheme boundary. For example, catch 
Bill and black dog will be pronounced [kʰæʧ b̥ɪɫ], [blæk d̥ɒg] rather than [kʰæʤ bɪɫ], 
[blæg dɒg] (Gimson, 1980; Katalin and Szilárd, 2006).  
However, progressive voicing assimilation is also possible in SBE, especially 
in the following instances: 
• the plural morpheme {s}, as in dogs [dɒgz] (where voiceless /s/ changes to 
voiced /z/ under the influence of voiced /g/),  
• the reduced form of the third person singular form of be, e.g. he’s [hɪz],  
• the possessive marker, e.g. John’s [dʒɒnz]; and  
• the past tense {ed}-form, e.g. carved [kɑ:vd].  
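The progressive voicing agreement listed above can be illustrated with a small sketch. It assumes that stems are given as lists of phoneme symbols, and it deliberately leaves out stems ending in a sibilant or in /t, d/, which take an extra vowel before the suffix; the names `VOICELESS` and `attach_suffix` are hypothetical.

```python
# Voiceless consonants of English; every other stem-final sound is treated as voiced.
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "ʧ", "h"}

# Each suffix has a voiceless and a voiced realisation.
SUFFIX_FORMS = {"s": ("s", "z"), "ed": ("t", "d")}

def attach_suffix(stem, suffix):
    """Attach {s} or {ed}, agreeing in voicing with the stem-final sound."""
    voiceless_form, voiced_form = SUFFIX_FORMS[suffix]
    chosen = voiceless_form if stem[-1] in VOICELESS else voiced_form
    return list(stem) + [chosen]

print("".join(attach_suffix(["d", "ɒ", "g"], "s")))    # dogs
print("".join(attach_suffix(["k", "ɑ:", "v"], "ed")))  # carved
```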
 
Assimilation of place is concerned with changes in the place of articulation of a 
segment (usually a consonant) which in SBE are usually regressive or coalescent 
(Roach, 2000; Gimson 1980). For instance, if a word-final alveolar consonant such as 
/t, d, n/ is followed by a word-initial consonant with a different place of articulation, 
the word-final alveolar consonant is likely to take on the place of articulation feature of 
the following consonant. Thus, if the word 'meat' /mi:t/ is followed by 'pie' /pai/ it 
may become /mi:p/; that is, [mi:p pai]: /t/ changes to /p/ before /p/. The following 
cases are cited by Roach (2000): 
(i) before a bilabial consonant, /t/ will become /p/, as in 
that person  /ðæp pɜ:sn̩/ 
light blue  /laɪp blu:/ 
(ii) before a dental consonant, /t/ will change to a dental plosive /t̪/, as in 
that thing  /ðæt̪ θɪŋ/ 
get those  /get̪ ðəʊz/ 
cut through  /kʌt̪ θru:/ 
(iii) before a velar consonant, /t/ will become /k/, as in 
that case   /ðæk keɪs/ 
bright colour  /braɪk kʌlə/ 
quite good  /kwaɪk gʊd/ 
(iv) /s/ becomes /ʃ/ and /z/ becomes /ʒ/ when followed by /ʃ/ or /j/, as in 
this shoe   /ðɪʃ ʃu:/ 
those years  /ðǝʊʒ jɪǝz/ 
 Gimson (1980) is of the opinion that alveolars are readily prone to such 
assimilation because of their relatively high word-final occurrence. He further provides 
instances of such modifications at word boundaries involving the place of articulation 
where word final /t, d, n, s, z/ usually assimilate to the place of the following 
word-initial consonants as follows: 
/t/ → /p/ before /p, b, m/, e.g. that pen, that boy, that man /ðæp pen/, etc. 
/t/ → /k/ before /k, g/, e.g. that cup, that girl /ðæk kʌp/, etc. 
/d/ → /b/ before /p, b, m/, e.g. good pen, good boy, good man /gʊb pen/, etc. 
/d/ → /g/ before /k, g/, e.g. good concert, good girl, /ˈgʊg ˋkɒnsət/, etc. 
/n/ → /m/ before /p, b, m/, e.g. ten players, ten boys, ten men /tem pleiəz/, etc. 
/n/ → /ŋ/ before /k, g/, e.g. ten cups, ten girls /teŋ kʌps/, etc. 
/s/ → /ʃ/ before /ʃ, j/, e.g. this shop, this year /ðiʃ ʃɒp, ðiʃ jɜ:/. 
/z/ → /ʒ/ before /ʃ, j/, or /ʃ/ (changes to fortis) before /ʃ/, e.g. 
those young men /ðəuʒ jʌŋ men/, has she? /hæʒi/ or /hæʃ ʃi/, 
/ð/ → /z/ or /s/ before /s, z/, e.g. I loathe singing /ai ləuz siŋiŋ/ 
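The regressive place assimilations listed by Gimson can be read as a lookup from a word-final alveolar and the following word-initial consonant to an assimilated form. The fragment below is a minimal sketch of that reading, assuming words are given as lists of phoneme symbols; it covers only the /t, d, n, s, z/ rules (not the /ð/ case or the fortis variant of /z/), and the names `RULES` and `assimilate_place` are hypothetical.

```python
BILABIALS = {"p", "b", "m"}
VELARS = {"k", "g"}
PALATAL_TRIGGERS = {"ʃ", "j"}

# word-final alveolar -> list of (triggering initial consonants, assimilated form)
RULES = {
    "t": [(BILABIALS, "p"), (VELARS, "k")],
    "d": [(BILABIALS, "b"), (VELARS, "g")],
    "n": [(BILABIALS, "m"), (VELARS, "ŋ")],
    "s": [(PALATAL_TRIGGERS, "ʃ")],
    "z": [(PALATAL_TRIGGERS, "ʒ")],
}

def assimilate_place(word1, word2):
    """Return word1 with its final alveolar assimilated to the first sound of word2."""
    if not word1 or not word2:
        return list(word1)
    final, initial = word1[-1], word2[0]
    for triggers, replacement in RULES.get(final, []):
        if initial in triggers:
            return list(word1[:-1]) + [replacement]
    return list(word1)  # no rule applies, so the word is unchanged

print("".join(assimilate_place(["t", "e", "n"], ["k", "ʌ", "p", "s"])))  # ten (cups)
```

Because assimilation is optional and gradient in real speech, a rule table of this kind describes only where the process may apply, not how often it does.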
 
Assimilation of manner refers to changes in the manner of articulation of a 
particular sound to become similar in manner to a contiguous sound. Roach (2000) is 
of the view that clear instances of this assimilation type are rare in English and are 
only typical of the most rapid and casual speech. An example cited is a rapid 
pronunciation of 'that side' /ðæt saɪd/ and 'good night' /gʊd naɪt/ as [ðæs saɪd] and 
[gʊn naɪt] respectively. As in place assimilation, this is also usually regressive except 
in the case of a word-initial /ð/ following a plosive or nasal at the end of a preceding 
word, e.g. “in the” /ɪn ðə/ → /ɪn̪n̪ə/; “get them” /get ðəm/ → /get̪t̪əm/, akin to instances 
cited under progressive assimilation above. 
 
2.2.3.4  Partial and total assimilation 
In partial assimilation, the contiguous sounds involved differ from each other in 
at least one of the distinctive features. For example, the assimilated /b/ of good pen 
[gʊb pen] has the same place and manner of articulation as the following /p/ of pen but 
differs in terms of voicing. On the other hand, the two sounds involved in total 
assimilation are completely alike. For instance, the /t/ and /d/ of that cup [ðæk kʌp] 
and good girl [gʊg gɜ:l] respectively take on all the features of the /k/ and /g/ they precede.  
 
2.2.3.5  Historical assimilation       
Historical assimilation relates to assimilation that has developed in the process 
of evolution of a language, in which case a word known to be pronounced in a 
particular way took on a new pronunciation which finally became the accepted norm in 
that language (Simo-Bobda and Mbangwana, 1993). For instance, Abercrombie 
(1967:138-139) cites the example of the English word 'orchard', which is claimed to 
have been a compound word: ort + yard. Over the years, it underwent coalescent assimilation 
which changed the middle sounds /tj/ to [tʃ] as it is today. The same goes for the word 
nature which is now pronounced [neitʃə], and immediate which used to be pronounced 
with [dʒ] in the middle; it is now commonly pronounced [dj]. 
 
2.2.4  Elision 
Elision is the omission of one or more sounds (a vowel, a consonant, or a whole 
syllable) in a word or at word boundaries in rapid connected speech, in order to 
maximise articulatory flow. Jackson (1982:32) refers to it as a process “involving the 
complete disappearance of a phoneme from a phonetic environment”. When 
there is a cluster of two or more consonants word-internally, some of the consonants 
usually get elided, e.g. han(d)kerchief, Chris(t)mas and gran(d)mother. The same 
process (also referred to as cluster simplification) occurs across word boundaries, e.g. 
Sain(t) Paul   firs(t) knight  nex(t) day 
I don'(t) know   sen(d) Jim   rock an(d) roll 
Guns an(d) Roses  fin(d) me   pos(t)man 
This is because, most often, sounds that ordinarily are enunciated in isolated 
words or slow, careful speech get elided in rapid, casual speech. For example, the 
English sentence: She looked particularly interesting in slow, careful speech or citation 
form, will normally be pronounced as: /ʃi lʊkt pətɪkjələli ɪntərəstɪn/ (with 27 
phonemes). In rapid conversational speech, however, it might be reduced to: /ʃi lʊk 
pətɪkli ɪntrstɪn/ (with 20 phonemes) leaving out seven sounds. 
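The phoneme counts in this example can be checked mechanically. The short sketch below simply splits the two transcriptions given above into individual phoneme symbols and counts them; the variable names are illustrative only.

```python
# Each transcription is written with one space between phoneme symbols.
citation = "ʃ i l ʊ k t p ə t ɪ k j ə l ə l i ɪ n t ə r ə s t ɪ n".split()
rapid = "ʃ i l ʊ k p ə t ɪ k l i ɪ n t r s t ɪ n".split()

# Number of sounds lost between the careful and the rapid rendering.
elided = len(citation) - len(rapid)

print(len(citation), len(rapid), elided)  # 27 20 7
```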
 Simo Bobda and Mbangwana (1993) identify two types of elision: historical 
and contextual elision. According to them, historical elision relates to a given sound or 
sequence of sounds that has disappeared in the course of the evolution of a language, 
so that it is no longer pronounced in the contemporary form of the language. Such 
cases of elision are already established in the language, though the old spelling may 
still be retained. For example, in: 
 cupboard  /kʌbəd/ 
 talk   /tɔ:k/ 
 evening  /i:vnɪŋ/ 
 history  /hɪstrɪ/ 
phonemes /p/, /l/, /ɪ/, and /ə/ respectively are no longer pronounced in Modern English. 
They also add that silent letters in contemporary English words are clear cases of 
historical elision. 
Contextual (juxtapositional) elision, on the other hand, is concerned with cases 
of sounds that exist in a word said in isolation but are omitted in the environment of 
another word in rapid colloquial speech. For example: 
 [əgʊdil] a good deal   for [əgʊd dil] 
 [gɪvɪm] give him   for [gɪv hɪm] 
 [lɑ:staɪm] last time   for [lɑ:st taɪm] 
 [teɪkeǝ] take care  for [teɪk keǝ] 
 [blaɪnmæn] blind man  for [blaɪnd mæn] 
 [leðǝm] let them  for [let ðǝm] 
 [fɜ:sθɪŋ] first thing  for [fɜ:st θɪŋ] 
 [ǝkɔ:s] of course  for [ǝv kɔ:s] 
 [kɔ:tɪm] caught him  for [kɔ:t hɪm] 
Gimson (1980) further highlights some other instances of contextual elision 
which affect vowels and consonants. According to him, vowels are usually elided in 
the following cases: 
(i) Initial schwa /ə/ usually gets elided when followed by a continuant and preceded 
by a word final consonant, e.g. 
not alone /nɒtḷ ləun/   get another /getṇ nʌðə/, 
run along /rʌnḷ lɒŋ/   he was annoyed /hi wəzṇ nɔid/. 
 
(ii) Word initial /ə/ may coalesce with the appropriate preceding vowel, e.g. 
go away /gɜ: wei/   try again /tra: gən/ 
 
(iii) /ə/ may be elided when final /ə/ occurs with following linking /r/ and word initial 
vowel, e.g. 
after a while /a:ftrə wail/   as a matter of fact /əz  ə mætrəv fækt/ 
father and son /fa:ðrən sʌn/ over and above / ˈəuvrən  ə ˴bʌv/. 
Also, consonant elision can take place in the following situations: 
(i) if a word-final cluster of a continuant consonant followed by /t/ or /d/ (e.g. /-st, -ft, -ʃt, 
-nd, -ld, -zd, -ðd, -vd/) is followed by a word with an initial consonant, e.g. 
nex(t) day   race(d) back   las(t) chance 
lef(t) turn  sof(t) centres   lef(t) wheel 
mash(ed) potatoes finish(ed) now  push(ed) them 
ben(d) back  tinne(d) meat   sen(d) round 
hol(d) tight   ol(d) man   col(d) lunch 
gaze(d) past   cause(d) losses  raise(d) gently 
loathe(d) beer  move(d) back   love(d) flowers, 
 
(ii) when the word following a word-final cluster of a plosive or affricate and /t/ or 
/d/ (e.g. /-pt, -kt, -ʧt, -bd, -gd, -ʤd/) has an initial consonant. This 
may result in loss of the final alveolar stop in the cluster, e.g. 
help(ed) me   stopp(ed) speaking  jump(ed) well 
thank(ed) me   look(ed) fine   pick(ed) one 
fetch(ed) me   reach(ed) Rome patch(ed) throat 
robb(ed) both   rubb(ed) gently  grabb(ed) them 
lagg(ed) behind dragg(ed) down begg(ed) one 
change(d) colour  urge(d) them   arrange(d) roses 
 
(iii) final /t, d/ followed by a word beginning with /j/ usually coalesce with /j/, i.e. /tʃ/ 
or /dʒ/, e.g. 
help(ed) you   like(d) you   los(t) you 
lef(t) you   grabb(ed) you  tol(d) you 
 
(iv) the /t/ of the negative /-nt/ is often elided, particularly in disyllables, before a 
following consonant, e.g. 
you mustn’t lose it /jʊ mʌsn lu:z it/   doesn’t she know? /dʌzn ʃi nəu/ 
or before a vowel, e.g. 
wouldn’t he come? /wudn ɪ kʌm/  you mustn’t over-eat /ju mʌsn əuvər i:t/ 
 
(v)  clusters of word final /t/ and word initial /t/ or /d/ are sometimes simplified, e.g. 
I’ve got to go /aiv gɒtə gəu/   what do you want? /wɒdə ju wɒnt/ 
and less commonly /d/ before /t/ or /d/, e.g. 
we could try /wi kə trai/   they should do it /ðəi ʃə du: ɪt/. 
 
(vi) one of a boundary cluster of two consonants sometimes undergoes elision, though 
this is usually considered vulgar, e.g. 
he went away /hi wen əwei/   I want to come /ai wɒnə kʌm/ 
give me a cake /gɪ mɪ ə keɪk/   let me come in /lemɪ kʌm ɪn/ 
get me some paper /gemɪ sm peɪpə/,  I’m going to /aɪm gənə, aɪŋənə, aɪŋnə/ 
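Conditions (i) and (ii) above, the loss of a cluster-final /t/ or /d/ before a consonant-initial word, can be sketched as follows. This is an illustration only: it assumes words are lists of phoneme symbols, restricts itself to the clusters listed in (i) and (ii), and treats a following /j/ as the coalescence case in (iii) rather than as elision. The names `ELIDABLE` and `elide_final_stop` are hypothetical.

```python
VOWELS = {
    "i:", "ɪ", "e", "æ", "ɑ:", "ɒ", "ɔ:", "ʊ", "u:", "ʌ", "ɜ:", "ə",
    "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə",
}

# final stop -> consonants that may precede it in an elidable cluster
ELIDABLE = {
    "t": {"s", "f", "ʃ", "p", "k", "ʧ"},
    "d": {"n", "l", "z", "ð", "v", "b", "g", "ʤ"},
}

def elide_final_stop(word1, word2):
    """Drop a cluster-final /t/ or /d/ of word1 before a consonant-initial word2."""
    if len(word1) < 2 or not word2:
        return list(word1)
    final, before, initial = word1[-1], word1[-2], word2[0]
    in_cluster = before in ELIDABLE.get(final, set())
    consonant_follows = initial not in VOWELS and initial != "j"
    if in_cluster and consonant_follows:
        return list(word1[:-1])
    return list(word1)

print("".join(elide_final_stop(["n", "e", "k", "s", "t"], ["d", "eɪ"])))  # nex(t) day
```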
 
 
2.2.5   Liaison 
Liaison, a French word meaning 'connection' or 'link', is defined by Crystal 
(2003: 269) as “transition between sounds, where a sound is introduced at the end of a 
word if the following syllable has no onset”. Another name Roach (2000) gives to it is 
linking, which he describes as a process by which words following each other in 
connected speech are linked together in special ways. According to Kenworthy 
(1987:136), liaison refers to “smooth link between a final consonant in one word and 
an initial vowel in the next word”. Words can be linked in Standard English through 
the following means (Katalin and Szilárd, 2006; Simo Bobda and Mbangwana, 1993; 
Roach, 2000): 
1. r-liaison (linking and intrusive /r/), e.g. far off [fɑ:rɒf], idea of [aɪdɪǝrǝv] 
2. j-liaison (after /i:/ or /ɪ/), e.g. me and you [mi:ʲənjʊ], my own [maɪʲǝʊn] 
3. w-liaison (after /u:/ or /ʊ/), e.g. you and me [ju:ʷənmi], allow us [ǝlaʊʷǝs] 
4. consonant-vowel liaison (carry over of a word-final consonant to a word 
beginning with a vowel in a stressed syllable), e.g. first of all [fɜ:stəvɔ:l], not at 
all [nɒtətɔ:l] 
The most prominent of these linking processes is r-liaison (otherwise called r-sandhi 
by Wells (1982)) which comprises linking and intrusive /r/. Both concepts involve the 
insertion of /r/ in between two adjacent vowels at word boundary to maximise 
articulatory ease.  
Linking /r/, according to Skandera and Burleigh (2005:58), refers to 'a link 
between words through the articulation of a normally unarticulated word-final /r/, 
which is articulated only when preceded by a vowel in the same word, and followed by 
an initial vowel in the next word.' In r-less or non-rhotic accents (e.g. SBE), /r/ is 
dropped when it is followed by a consonant or a pause but pronounced when followed 
by a vowel. This phenomenon, known as /r/-dropping, dates back to the 18th century 
or thereabouts, when /r/ was dropped before a consonant and in absolute final word 
position (Simo Bobda, 1994). However, in connected speech, when an orthographic 
word-final r or re is followed by another word beginning with a vowel, /r/ may be 
retained; that is, pronounced for the sake of euphony, e.g. far off [fɑ:r ɒf], wear out [weǝr 
aʊt], car owner [kɑ:r əʊnǝ], more and more [mɔ:r ǝn mɔ:], fire extinguisher [faɪǝr 
ɪkstɪŋgwɪʃǝ], my father and mother [maɪ fa:ðər ən mʌðə], the weather ought [ðə weðər 
ɔ:t], here and there [hiər ən  ðeə], the door opened  [ðə  dɔ:r əʊpənd] (Gimson, 1980; 
Simo Bobda and Mbangwana, 1993; Simo-Bobda, 1994; Hannisdal, 2006). 
Sometimes, however, /r/ may also be used to link two contiguous vowels at 
word boundary, even when a final r is absent from the orthography of the first word. A 
phonetic /r/ that occurs in such an unhistorical environment is referred to as intrusive 
/r/ (Hannisdal, 2006; Roach, 2000; Simo Bobda and Mbangwana, 1993; Gimson, 
1980). Intrusive /r/, therefore, is a process whereby an unetymological /r/ is inserted to 
remove a hiatus between two consecutive vowels belonging to different words 
(Skandera and Burleigh, 2005), e.g. media event [mi:diər ivent], Anna and I [ænǝr ǝnd 
aɪ], Africa or Asia [æfrɪkǝr ɔr eiʃɪǝ], drama and music [drɑ:mǝr ǝn mju:zɪk], law and 
order [lɔ:r ǝnd ɔ:dǝ], awe-inspiring [ɔ:r ɪnspaɪǝrɪŋ]. 
Wells (1994) claims that intrusive /r/ is an attempt to extend the linking /r/ 
principle to cases which are phonetically identical but differ historically and 
orthographically. To him, “intrusive /r/ arises essentially from the natural tendency to 
give identical treatment to words with identical endings” (Wells, 1982:223). He further 
opines that both liaison features are very common with native RP speakers, and are 
regarded as important characteristic features of connected speech found in RP. However, 
linking /r/ is generally thought to be more frequent, correct and desirable in 
mainstream RP, while intrusive /r/ is less common and stigmatised on the grounds that 
there is nothing in the spelling to justify its use (Crystal, 1992). Gimson (1980), in this 
regard, claims that some native speakers consider intrusive /r/ incorrect or 
substandard, and as such avoid its use. Instead, they employ a vowel glide or glottal 
stop /ʔ/ to fill vowel hiatus in connected speech, e.g. the door opened /ðə dɔ: əʊpənd/ 
or /ðə dɔ: ʔəʊpənd/. However, resistance to and disapproval of intrusive /r/ by 
language purists notwithstanding, it “is undoubtedly widespread” (Roach 2000:144) 
and “very prevalent in RP” (Wells 1994: 202). 
Apart from the two linking devices described above, semi-vowels /j/ or /w/ are 
other possible linking devices that may be used between two vowels at word boundary 
for hiatus-filling (Simo Bobda and Mbangwana, 1993; Katalin and Szilárd, 2006). If 
the first vowel is high and front, e.g. /i:/ or /ɪ/, the yod /j/ may be used e.g. 
me and you  /mi:ʲənjʊ/ 
the answer  /ðiʲænsǝ/ 
to be or not to be /tǝbiʲɔnɒtǝbi/ 
petty Agnes  /pɪtɪʲægnɪs/ 
my own  /maɪʲǝʊn/ 
 
On the other hand, /w/ may be inserted if the first word ends in /u:/ or /ʊ/, e.g. 
you and me   /ju:ʷənmi/ 
you all   /ju:ʷɔ:l/ 
to answer  /tʊʷænsǝ/ 
allow us  /ǝlaʊʷǝs/ 
All these linking devices serve the same purpose of filling a hiatus (break in 
pronunciation between two vowels that are next to each other in consecutive syllables 
without an intervening consonant), so as to facilitate the smooth transition between the 
two contiguous vowels (Katalin and Szilárd, 2006; Hannisdal, 2006; Skandera and 
Burleigh, 2005).  
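The choice among the hiatus-filling devices described in this section can be summarised in a small decision procedure. The sketch below is a simplification under stated assumptions: words are lists of phoneme symbols, the link is chosen from the quality of the final vowel of the first word alone, and no distinction is made between linking and intrusive /r/ (which depends on spelling and history rather than on sound). The names `VOWELS` and `hiatus_link` are hypothetical.

```python
VOWELS = {
    "i:", "ɪ", "e", "æ", "ɑ:", "ɒ", "ɔ:", "ʊ", "u:", "ʌ", "ɜ:", "ə",
    "eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə",
}

def hiatus_link(word1, word2):
    """Return the linking sound between two words, or None if there is no hiatus."""
    if not word1 or not word2:
        return None
    last, first = word1[-1], word2[0]
    if last not in VOWELS or first not in VOWELS:
        return None  # a consonant intervenes, so there is nothing to fill
    quality = last.rstrip(":")[-1]  # closing quality, e.g. "ɪ" for the diphthong "aɪ"
    if quality in ("i", "ɪ"):
        return "j"
    if quality in ("u", "ʊ"):
        return "w"
    return "r"  # remaining cases such as ə, ɑ:, ɔ: and centring diphthongs

print(hiatus_link(["m", "aɪ"], ["əʊ", "n"]))  # my own
```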
 
2.3 Review of related literature on connected speech processes 
A good number of studies have been conducted on connected speech processes 
in English. This section reviews some of them. 
Wright & Kerswill (1989), in their paper, “Electropalatography in the analysis 
of connected speech processes”, primarily examine the perceptual correlates of the 
articulatory gradualness of a connected speech process: the assimilation of a final 
alveolar to a following velar or bilabial, and report that the assimilatory process is 
gradual in articulatory terms, and not discrete, as assumed in most phonological 
theories. In the experiment set up, phonetically trained listeners were asked to: 
(a)  identify a word followed in a carrier phrase by a velar or a bilabial as having an 
(underlying) final alveolar or a final velar or bilabial, and 
(b)   characterize the degree to which words identified as having an alveolar are 
assimilated to the following velar or bilabial. 
The findings reveal that: 
(1)  there is no discrete perceptual boundary between the various types of 
articulation (including the underlying velars/bilabials) presented on the tape;  
(2)  there is some evidence that assimilations may never be 'complete', but may 
show a residual tongue body configuration characteristic of an alveolar, even 
when there is no discernible (either articulatorily or auditorily) alveolar gesture. 
 
Kerswill (1991) investigates the social and linguistic factors influencing 
connected speech in Cambridge English from acoustic and articulatory perspectives, 
using twenty-six (26) local Cambridge speakers. The CSPs examined were /l/ 
vocalization, glottalisation and yod-coalescence. The study sets out to investigate: 
(i) the structural linguistic factors influencing CSPs 
(ii) the nature of interactions between CSPs 
(iii) how CSPs diffuse through the linguistic system, and 
(iv) the perceived status of CSPs on socially sensitive features. 
The results of the analysis of a range of conversational and constructed recordings of 
the participants show that: 
(i) CSPs are variably influenced by structural linguistic factors as they are 
principally determined by segmental phonetic context and are sensitive to word 
boundaries and speech rate; 
(ii) socially-sensitive CSPs interact variably with phonetically-conditioned CSPs; 
(iii) increase of CSPs is partially influenced by stylistic factors; and lastly, 
(iv)  CSPs are perceptually significant in the social judgment of speakers and their 
speech. 
 
Nguyen and Ingram (2004) report the findings of a corpus-based descriptive 
analysis of the most prevalent transfer effects and connected speech processes as 
produced by Vietnamese speakers of English, compared with native speakers of 
English. A discriminant analysis is also reported, using the most typical phonetic and 
prosodic processes, in order to examine how well the two speaker groups can be 
discriminated and whether an Australian Vietnamese female speaker who has a native-
like accent is classified into the native or Vietnamese speaker group. 
The results of the analysis show that Vietnamese speakers' English is distinct 
from native speakers' in many phonetic and prosodic processes. In spite of an 
advanced level of English proficiency with a highly proficient global accent and with 
phonetic and articulatory knowledge of English sounds, many Vietnamese speakers of 
English could not articulate the connected speech and assimilation processes which 
characterize native speakers' spontaneous natural speech. However, the Vietnamese 
female speaker who had grown up in Australia was classified into the native speaker group 
by the discriminant function and her speech was free of many phonetic and prosodic 
transfer effects. The fact that the other Vietnamese speakers of English were still 
influenced by transfer effects from their mother tongue underscores the importance of 
exposure to the second language environment for the reduction of foreign 
accent. 
All the studies above are concerned with English as spoken by native speakers 
or second language users elsewhere. The rest of the review, therefore, concentrates on 
related studies carried out in this domain in Nigerian English. 
 Laver's (1968) article, “Assimilation in Educated Nigerian English”, was arguably 
the pioneering study on connected speech processes in Nigeria, though restricted to 
assimilation. Using educated Nigerian speakers of English from diverse mother 
tongues including Yoruba, Efik, Etsako, Emai, Bini and Otwo as participants, he 
discovers: 
• a tendency for regressive assimilation 
• absence of progressive assimilation of voice 
• extensive cases of assimilation of place 
• that assimilation does not involve manner of articulation alone 
• that Nigerian English allows regressive voicing assimilation while RP does not. 
However, his claim that Nigerian “mother-tongues had no apparent effect on 
the type of assimilations used in English, nor any major effect on the occurrence of 
assimilations in particular phrases” (158) is weak. This is because there is no review of 
any of the indigenous languages used in the work to justify his claim. Besides, this 
position was contested by Jibril (1982) who observes that make them [meg dem] and 
black bird [blagbe:d], which form two of Laver's three instances of regressive voicing 
assimilation, are apparently Efik influenced. According to him, /k/ and other plosives 
undergo voicing between two voiced segments in Efik. Furthermore, Laver's study is 
restricted to assimilation and the population is limited to just six language groups in 
Southern Nigeria. This justifies the need for the present research which studies 
assimilation, elision and liaison features of connected speech across diverse language 
groups and social categories in Nigeria. Besides, an attempt is made not only to identify 
the incidence of SBE connected speech processes in Nigerian English, but also to 
determine speakers' proximity to SBE.  
Jibril (1982), in his study of “Phonological variation in Nigerian English,” 
examines, in passing, consonant assimilation as well as vowel and consonant deletion. 
He discovers from the corpus that: 
• only nasals undergo assimilation of place in Nigerian English, e.g. government 
council [gʌvməŋ kausl], man power [mampa:wa:]. 
• cases of assimilation of manner that affect alveolars are regressive and involve the 
change of /d/ and /n/ to liquids, e.g. would like [wul laik], don’t like [dol laik]. 
• regressive assimilation of voice affects final plosives only, which become devoiced 
or voiced before a word beginning with voiceless or voiced consonant as the case 
may be, e.g. with the [wid di], twelve thousand [twep θauzn]. 
• using vowel epenthesis to resolve consonant clusters does not occur in the speech 
of most Nigerian speakers of English except in just a few cases involving Hausa and 
Igbo speakers; 
• consonant deletion is common in Nigerian English in fast speech or in a bid to 
reduce consonant clusters. 
Without doubt, Jibril's study is ground-breaking. It provides a 
sociophonological insight into spoken Nigerian English and identifies a number of 
phonological processes of Nigerian English. It, indeed, provides great impetus to this 
work. However, it is not without its limitations. First, the study is not a comprehensive 
research on connected speech processes in Nigerian English which is the major 
preoccupation of this work. Second, the population sample used was too small to 
support a valid judgment on Nigerian English. Moreover, participants 
were restricted to the three major languages in Nigeria, without consideration for many 
other small language groups. It is yet to be seen, then, how his study can truly be 
representative of Nigerian speakers of English. Besides, his division of Nigerian 
English into Northern and Southern accents is unrealistic in view of the various 
languages within each region. Lastly, one wonders how varied Jibril's phonological 
variation is, considering the fact that participants were mainly elite.  
In the light of these, the present study becomes germane. It examines various 
assimilatory, elision and liaison processes of SBE connected speech in NE, using a 
larger population sample from various large and small language groups in four regions 
of Nigeria and different sociolinguistic groups delineated by gender and age. Finally, it 
seeks to reveal the proximity of Nigerian speakers of English to SBE in terms of 
assimilation, elision and liaison, which Jibril's study did not take into cognizance. 
Josiah (2009) focuses on assimilatory processes. His dissertation titled: "A 
synchronic analysis of assimilatory processes in educated Nigerian spoken English" is 
an attempt to identify various assimilatory processes that characterise educated spoken 
Nigerian English (ESNE); find out whether the processes inhibit or facilitate 
intelligibility; discover the predictability of the processes in ESNE and find out any 
similarities or differences between SBE and NE. Using a sample of one hundred final 
year university students from nineteen linguistic groups in Nigeria, he examined 
various aspects of assimilatory processes from perceptual and acoustic dimensions. He 
discovers, among other things, that some of the assimilatory processes that characterise 
ESNE, e.g. nazalisation, devoicing of final segments and regressive assimilation are 
predictable; assimilatory processes induced by articulatory factors hardly inhibit 
national, and sometimes, international intelligibility; ESNE speech exhibits more 
assimilatory features than SBE and a number of assimilatory processes observed in 
ESNE are markedly different from those of SBE. The study concludes that ESNE 
phonology is markedly different from that of SBE, and therefore, requires an 
endonormative rather than exonormative model as long as it facilitates effective 
national and international interaction. 
The study, without doubt, provides an illuminating and in-depth investigation of
assimilatory processes in Nigerian English, showing the peculiarity of ESNE and its
distinctiveness from SBE. It, however, differs from this study in that its preoccupation
was restricted to assimilatory processes. Besides, its target was not assimilatory
processes in connected speech (across word boundaries) but within words; and there was
no attempt to measure the proximity of ESNE to SBE, though it was claimed that a
great many of the assimilatory processes observed in ESNE were markedly different from
those of SBE. These limitations form the basis for this study, which is concerned with
a quantitative investigation of a number of processes that define SBE connected speech 
in the speech of NE speakers and attempts to determine the extent to which NE 
approximates to or deviates from SBE in these processes. 
 
2.4 Sociophonetics 
The term 'sociophonetics', a blend of sociolinguistics and phonetics, was first
used by Deshaies-Lafontaine (1974). It is a research field that is concerned with
studies that employ both sociolinguistic and phonetic methods; that is, work at the
intersection of sociolinguistics and phonetics. It attempts to demystify Generative
Phonology's preoccupation with the analysis of the linguistic knowledge of the “ideal
speaker-listener, in a completely homogeneous community” (Chomsky, 1965:3) with no
consideration for variation that exists between speakers of a language.
While sociolinguistics deals with all aspects of language variation,
sociophonetics studies only socially conditioned phonetic variation in speech, that is, variation that
correlates with social factors like a speaker's gender, age or social class (Honey, 1997;
Foulkes and Docherty, 2006). As the label of an eclectic field, the term is widely used among
phoneticians to refer to descriptive accounts of variation in speech in different dialects,
speech styles or speaker groups (Foulkes, 2006; Esling, 1991; Henton and Bladon,
1988); and is employed among sociolinguists to refer to phonetically inclined
variationist studies, pioneered by Labov, which emphasise the interrelationship between
speech form and social factors such as speaking style and the background or
characteristics of the speaker, and explain how linguistic change originates and is
transmitted (Labov, 1994, 2001).
Sociophonetic research is predicated upon the fact that language varies, and
that variation is most pronounced at the level of phonetics. For instance, it is
well established that individuals pronounce sounds differently from one another, and that it
is difficult to find two identical voices or even two identical utterances by the
same speaker. Thus, it has been established by scholars that speech production can
vary according to speakers' social background; that is, their gender, age, socio-economic
status and ethnicity (Labov, 1966; Trudgill, 1974; Guy, 1981; Horvath, 1985),
as well as their group affiliations and social networks (e.g. Milroy, 1987; Eckert, 2000).
Sociophonetic variation, then, represents a pattern of behaviour learned by speakers 
through the experience of using language in social interaction. 
This methodological inclination has given rise to insightful discoveries, one of
which is the apparent-time hypothesis. It predicts the stability of individuals' phonological
systems and accents throughout their adulthood, so that any observed
differences between younger and older speakers recorded at the same time are
generally regarded as changes in progress (Hay and Drager, 2007). This hypothesis has, in
turn, contributed immensely to the study of language change.
Another determining factor of variation, which has also become a focus of
sociophonetic research, is communicative context, which encompasses the linguistic style
or register of speech, the social context, the topic of discussion, the addressee and the
intention of the speaker. It is believed that speech may be varied or adjusted by
speakers at any point in time according to any of these factors. According to Foulkes
(2006:19):
Phonetic forms may be controlled in line with the style or 
register of speech; they may be tailored according to the 
relationship between the speaker and listener; they may be 
designed to provide coherence to a discourse; they may be 
linked to changes in the ambient physical conditions of the 
context; and they may be affected by temporary external 
influences such as alcohol or consciously adopted disguise. 
 
A number of studies have, therefore, been conducted along this line of thought
to examine variation in speakers' style of speaking that correlates with changes in the
speech setting and in the composition of the audience (e.g. Labov, 1972; Bell, 1984,
2001; Hay et al., 1999). For instance, it has been reported that more standard forms are
often used by speakers (particularly women) in more formal styles of speech, e.g. the
production, in formal styles, of post-vocalic [ɹ] in New York (Labov, 1966), and [h] in
British English (Trudgill, 1974).
Bell's audience design theory also lends credence to this. It states that 'style
derives its meaning from the association of linguistic features with particular social
groups' (Bell, 2001:142). This implies that speakers' style is determined by and
adjusted towards the speech style of their audience. Bell (1984) further observes that
interlocutors often express solidarity with or distance from each other's linguistic
patterns in a communicative context. For instance, in field research conducted
in Norwich, England, Trudgill (1986) discovered that his own use of glottal stops
for /t/ correlated with that of his interviewees. Hay et al. (1999) also report how the
ethnicity of the referee influenced phonetic variants in the speech of the television
presenter, Oprah. Variation, as such, is seen as a function of the relationship between
the interlocutors.
Along this line, Lindblom (1990) also opines that speakers tailor their speech
along a 'hyper-hypo' continuum in line with the perceived interactional needs of the
interlocutor. In a context that requires listener-oriented speech, like giving clear
instructions or speaking in noisy environments, a speaker is likely to employ more
elaborated (hyper) articulations; whereas a greater degree of under-articulation (hypo-speech)
may be used in interactions such as narrative, which is more speaker-oriented.
Further studies in this direction have also found that the speech of adults tends
to be modified during conversation with children. Foulkes, Docherty and Watt (2005),
in this regard, report that sociolinguistic variables in child-directed speech may pattern
differently from those in inter-adult speech, and may be influenced by the age and
gender of the child. Research on bilinguals also supports this line of thought, in that
patterns of interference between languages depend upon the language mode being used
(Grosjean, 1998). When speaking to a monolingual, a bilingual is likely to use just one
language, and as such interference between the speaker's two languages will be
reduced. However, code-switching is likely when such a bilingual converses
with other bilinguals, and features from one language are more likely to be found in the other.
Finally, in addition to audience-induced variation, phonological choices are
also made by speakers for pragmatic and discourse functions. For instance, turn taking
may be signalled by using fully released non-glottalised voiceless stops in Tyneside
English (Local, Kelly and Wells, 1986; Docherty, Foulkes, Milroy, Milroy and
Walshaw, 1997), and by intonational patterns in other dialects, e.g. as a cue to turn-endings
in London Jamaican English (Local, Wells and Sebba, 1985) and, through the use of a
high rising tone, as a turn-holding mechanism in Australia (Guy, Horvath, Vonwiller,
Disley and Rogers, 1986) and New Zealand (Britain, 1992; Warren and Britain, 2000).
Similarly, aspiration of voiceless stops in English (/p/, /t/, /k/) has been shown to be a 
discourse marker, indicating turn-finality (Local, 2003). 
 
2.4.1 Levels of sociophonetic variation 
Socially conditioned variation in speech has been examined at different levels
of phonetics and phonology: segmental, suprasegmental and subsegmental. A few of the
studies conducted along these lines are reviewed below.
 
2.4.1.1 Segmental variation 
Research on sociophonetic variation overwhelmingly favours
segmental categories. Foulkes and Docherty (2006) discuss four main types of
segmental variation proposed by Wells (1982) to describe variation in
accents of English. The first type is systemic variation, which relates to differences in
the composition of the phoneme inventory between dialects of British English. For
example, the phonemes /x/ and /ʍ/ are found in Scottish dialects but are absent in most
other British accents. Socially, the sounds also mark age differentiation in the dialect
of Glasgow, where they are more widespread among older speakers than younger speakers.
Besides, /x/ is seen to be used more by middle-class children than working-class
children (Lawson and Stuart-Smith, 1999).
The phonotactic distribution of phonemes is Wells's second category. This is
illustrated by the contextual distribution of /r/, which divides accents into
rhotic and non-rhotic. While in rhotic accents /r/ occurs in all contexts, non-rhotic accents
(comprising most accents of England and Received Pronunciation) only permit /r/ in
prevocalic positions. This distinction does not apply to regional variation only; it also
indicates social categories such as social class. For instance, Labov (1966) reports how
the production of [ɹ] in New York City correlates with socio-economic level, whereby
higher social classes use [ɹ] more than lower social classes. However, the opposite is the
case in England: a high rate of [ɹ] production reflects low social status (Wells, 1982).
The third category, lexical distribution of phonemes, describes regional, social
and stylistic variation in accents arising from the use of different phonemes in a particular word.
For example, in the words path, class and Iraq, the short vowel /æ/ is used
in the north of England while the southern accents favour the long vowel /ɑ:/. The last
category is allophonic realization. Foulkes & Docherty (2006) exemplify this
with their research on the English of Newcastle upon Tyne, where they found that
speakers from the area used a particularly distinctive realization of the stops /p, t, k/ in
word-medial inter-sonorant contexts, such as happy, water, baker, bottle, button and
metro.

2.4.1.2 Suprasegmental variation
Some studies have also been conducted to capture regional and social speech 
variation at the suprasegmental level of intonation (e.g. Cruttenden, 1997; Knowles, 
1978; Local, Kelly & Wells, 1986; Warren & Britain, 2000), pitch accent realization 
(Grabe, Post, Nolan & Farrar, 2000), tonal alignment (Nolan & Farrar, 1999), voice 
quality and vocal setting (Henton & Bladon, 1988; Stuart-Smith, 1999), rhythm
(Deterding, 2001; Low, Grabe & Nolan, 2000) and stress placement (Wells, 1995). 
The works cited above on rhythm, for instance, reveal that Singaporean English is 
more syllable-timed than British English. Rhythm is also shown to differ across 
dialects of English (Grabe and Low, 2002) and serves as a marker of ethnicity, e.g. 
Latino identity in the US (Carter, 2005) and Maori ethnicity in New Zealand (Szakay, 
2006). 
Attention has also been paid to the phonetic properties of intonation tunes
across speakers. It has been shown that intonation contours mark regional and social
differences (Warren, 2005). For instance, in most accents of English, declarative
statements take falling tunes. However, the accents of Newcastle, Liverpool and a large
part of Ireland favour rising or high level contours in the same position (Cruttenden,
1997; Douglas-Cowie, Cowie and Rahilly, 1995). Socially, it is equally observed that
the use of rising tunes in declarative statements is becoming characteristic
of many English dialects and is particularly associated with young speakers. In the
USA, Australia and New Zealand, it is associated with lower-class and/or female speech
(Arvaniti & Garding, 2005; Guy, Horvath, Vonwiller, Disley & Rogers, 1986;
Cruttenden, 1995), whereas it marks the speech of the upwardly mobile in England. It
has also been shown that rising tunes are commonly used in some speech styles where
they play diverse discourse roles, such as acting as a turn-holding mechanism in
narratives.
 
2.4.1.3 Subsegmental variation 
The accessibility of instrumental techniques has made it possible to extend 
sociophonetic research to subsegmental categories (Foulkes and Docherty, 2006). 
Studies in this direction examine the effects of adjacent sounds on each other in a 
stream of connected speech, in terms of the relative duration, strength or temporal 
coordination of articulatory gestures. Some of the studies conducted along this line 
include Fourakis & Port (1986), Kerswill (1987), Kerswill & Wright (1990), Di Paolo 
& Faber (1990), Thomas (2000) and Scobbie (2005). In the study on 'the description
of connected speech processes in Cambridge English' conducted by Nolan & Kerswill
(1990), for example, electropalatographic data showed a continuum in the degree of
assimilation. Some tokens revealed complete assimilation (e.g. [gri:m
pɑ:k] for green park), some showed none at all, while others had partial assimilation
involving an incomplete alveolar gesture. It was also discovered that children from the
lower-status school produced more assimilated forms than
children from the higher-status schools. Docherty & Foulkes (1999, 2005), from their
work on stops in Newcastle English, also discovered variation in intervocalic and
prepausal /t/ in Newcastle and Derby, depending on a speaker's social group.
 
2.5 Review of related literature on sociophonetic variation 
According to Barnes (2005), there has been an upsurge in the drive to integrate 
sociolinguistics and phonetics into a single discipline in recent times. Consequently, 
sociophonetics has become an eclectic field covering variations in speech perception 
(Clopper & Pisoni, 2005; Thomas, 2002; Barnes, 2005), linguistic and sociolinguistic 
theories (Nagy and Reynolds, 1997), first and second language acquisition (Khattab, 
2002; Lively, Logan & Pisoni, 1993) and forensic and speaker identification (Hoequist
& Nolan, 1991; Nolan, 1997). An attempt is made here to review a few such
sociophonetic studies in the fields listed above. However, the majority of them relate to
native-speaker settings. Only a few studies on spoken Nigerian English have
explored this research dimension, and their emphasis has almost always been on the level of
education and ethnicity of speakers. This, perhaps, is because of the assumption that
the sociolinguistic variables of gender, age and social class scarcely affect speakers'
pronunciation of English in an L2 setting (Ngefac, 2003; Bobda, 1994).
Clopper and Pisoni (2004), using acoustic and perceptual analysis techniques,
conducted a sociophonetic study of speech perception, focusing on listeners' identification of
the dialect of speakers. Their participants were sixty-six young, white male
talkers between the ages of 20 and 29 from six regions of the United States: New
England, North, North Midland, South Midland, South, and West (11 from each
region). They read ten sentences. The acoustic analysis identified several phonetic
features that can be used to distinguish different dialects while the perceptual analysis 
investigated how well listeners could distinguish speakers from different parts of the 
United States and what features they relied on. The recordings of the sentences 
produced by the sixty-six talkers were played back to twenty-three Indiana University 
undergraduates who served as listeners for the study. They were asked to categorise 
talkers into one of six geographical dialect regions. 
Results showed that listeners were able to reliably categorize talkers using three 
broad dialect clusters (New England, South, North/West), but that they had more 
difficulty categorizing them into six smaller regions. Multiple regression analyses on 
the acoustic measures, the actual dialect affiliation of the talkers, and the 
categorization responses revealed that the listeners in this study made use of several 
reliable acoustic–phonetic properties of the dialects in categorizing the talkers. 
Altogether, the results of these two analyses confirmed that listeners had
knowledge of phonological differences between dialects and could use this knowledge to
categorize talkers by dialect.
“Listener expectations and the perception of Scottish English /ʉ/: a
sociophonetic investigation” is the title of Barnes' (2005) sociophonetic speech
perception experiment, conducted with Edinburgh listeners. Two male speakers, a
middle-class Glaswegian and a working-class Edinburgh native, whose parents were
also natives of their respective cities of origin, were asked to complete a questionnaire
that assessed their socio-economic backgrounds and to read six sentences containing a
word with the short /ʉ/ vowel. Seventeen listeners (8 females and 9 males) participated in the
experiment. They were all native Scottish English speakers between the ages of 19 and 
33. They were asked to listen to recorded sentences and decide whether or not the 
synthesized vowel following each sentence matched the vowel (in a target word) 
produced by the speaker. The listeners were divided into two groups: Group 1 was told 
that the speaker was from Edinburgh, Group 2 was told he was from Glasgow. Both 
groups actually heard the same speaker, who was native to Edinburgh. The response 
patterns of the two groups were analyzed to see if there were any significant 
differences in vowel choices based on the social information given about the speaker. 
The results, however, were inconclusive. 
Khattab (1999) investigates the speech production of two English-Arabic
bilingual Lebanese boys, born and raised in Leeds, aged six and nine respectively.
Using auditory and acoustic techniques, the author examined the participants' glottal stop
production in English and Arabic, taking into consideration the corresponding phonemic or
sociophonetic roles of the glottal stop in each language. This was with a view to
establishing a relationship between the children's production of English and social
variables existing in their environment. The study specifically sought to examine whether the
participants had incorporated the glottal stop in English as a sociolinguistic variant of
/t/ and the frequency and environments of its use compared to a supralaryngeal stop.
It also examined whether they used this variant in their production of Arabic /t/s in
environments comparable to glottalling environments in English. Besides, an auditory
analysis of the English vowel system developed by the participants, with emphasis on
'accent-revealing' vowels specific to the Leeds accent, was also carried out.
The investigation of the participants' glottal stop production in each language
suggests that they were aware of the different roles the glottal stop [ʔ] plays in each
language and of the appropriate phonological contexts for its occurrence. However,
analysis of the frequency of glottal stop [ʔ] realisations in English, along with analysis
of other variables known to have marked local variants in the participants' community,
revealed that the bilinguals' sociolinguistic performance does not follow the patterns
expected of children of their age: the amount of glottalling expected in Leeds English
does not seem to have influenced their production. This seems to suggest that their
Arabic background has hindered their early acquisition of local variants specific to the
community. Finally, an auditory analysis of the participants' production of six English
vowels shows that only a few of their realisations
correspond to those found in the Leeds accent. It was concluded that the phenomenon
is partly related to the bilinguals' sociolinguistic background, and partly to
sociolinguistic changes affecting the whole community.
Marsden (2006) conducted “a sociophonetic study of labiodental /r/ in Leeds”.
Drawing on the social network model, the author attempted to track the increasing
spread of [ʋ], which had previously been considered a feature of flawed or affected speech, in the
city of Leeds. The data for the study was collected from 18 speakers across a large
geographical area of Leeds. They were divided into six cells, three speakers per cell,
with equal numbers of males and females across three age groups: 15–30, 31–50 and
51+. The study data comprised sociolinguistic interviews covering the informants'
everyday work and social life, and wordlist readings of 54 words. Thirty-nine of these
words contained /r/, with 13 tokens of /r/ in each of three word positions: word-initial
(e.g. rope, run), intervocalic (e.g. porridge, surround) and in word-initial consonant
clusters (e.g. fruit, broke). To make up the remainder of the 54 words, 15 distracter
items with no /r/ were mixed in with the 39 /r/-words.
The auditory analysis of the data reveals the distribution of labiodental [ʋ] 
variants along an age-related pattern. Some younger speakers used the innovative 
variant while older speakers maintained the standard variant in the majority of cases. 
The social networks of speakers revealed that speakers who used [ʋ] appeared to have 
relatively diverse social network contacts rather than strong ties within a particular 
close-knit local network. Speakers with relatively tight local network ties tended to 
maintain [ɹ]. These findings, the author claims, somewhat confirm the northward 
advancement of a labiodental variant of /r/ since its identification as a dialect feature in 
the southeast of England (Wells, 1982). 
Moreover, the age-related use of [ʋ] suggested a gradual shift from the 
traditional alveolar approximant in the city of Leeds similar to that identified in other 
areas. Finally, the social network findings suggested that linguistic variants were 
diffused by speakers with weak ties to diverse networks which afford them contact 
across a wide socio-geographical range. Conversely, speakers with strong, close-knit 
networks were unlikely to adopt linguistic innovations due to norm-enforcing 
linguistic loyalties that facilitate social group integration. Strong social networks 
therefore do not provide a suitable environment for linguistic innovation and change. 
Przedlacka's (1999) dissertation is titled “Estuary English: a sociophonetic
study”. It was a study of the phonetic nature of a presumed variety of
Southern British English known as 'Estuary English'. The fieldwork was carried out
within the Labovian framework. The data were collected in four counties
(Buckinghamshire, Kent, Essex and Surrey) using a word elicitation task with sixteen
teenage speakers. Fourteen sociophonetic variables were investigated in the study,
focusing on differences between the counties, male and female speakers and two social
classes.
It was revealed that, given the extent of geographical variation, the accents
spoken in the area were not homogeneous. Some observed features include the
fronting of the vowels of the GOOSE and STRUT lexical sets, and syllable non-initial t-glottaling,
which were more common amongst female speakers. Contrary to expectation, the
teenage speech of the Home Counties revealed the use of the th-fronting variant, especially
amongst male speakers. Generally, social class turned out not to be a good indicator of
change, as little difference was found between the classes.
Docherty, Hay and Walker's (2006) article, “Sociophonetic patterning of
phrase-final /t/ in New Zealand English”, analyses the realization of phrase-final /t/ in
a corpus of young New Zealand English speakers. The data for the study was drawn
from the Canterbury Corpus, part of the ONZE archive in the linguistics department at
the University of Canterbury, which contains over 400 interviews conducted by
students enrolled in a third-year 'New Zealand English' course. The corpus comprised
informal sociolinguistic interviews and a standard New Zealand English word list, and was
broadly stratified by social class into 'Professional' and 'Non-professional' speakers
based on both educational and occupational criteria. The speakers in the Corpus were
born between 1926 and 1985. The 'younger' group was selected from the corpus for
analysis.
The analysis covers a total of 1,057 tokens from 60 speakers: 15 young
professional females (FP), 15 young non-professional females (FN), 15 young
professional males (MP) and 15 young non-professional males (MN). The focus of the
research was on phrase-final /t/, with phrase-finality defined as the end of an intonation
phrase. Using a combination of auditory and acoustic methods, the analyses reveal four
primary variants of /t/: canonical, spirantised, affricated and unreleased stop. The
results, thus, show that unreleased glottalised variants are much more prevalent than
was earlier reported. It was also discovered that young female speakers produce
significantly fewer unreleased tokens than their male counterparts, at least in phrase-
final position. 
Rajend (2010) examines the degree of sociolinguistic change in the English of 
young middle-class South Africans of different ethnic backgrounds in relation to new 
post-apartheid opportunities and friendships. Using forty-eight speakers analysed 
within the Labovian tradition in relation to the GOOSE vowel (long /u/ or /u:/), the paper
examines the present disposition of young people of the major ethnic groups, Black,
Coloured and Indian, to the prestige White middle-class norms: whether they are
adopting or adapting them or resisting change.
The results of over 4,000 tokens analysed acoustically using PRAAT and 
compared via vowel normalisation procedures showed that middle-class speakers of 
the three ethnicities were fronting the vowel, but in different ways. Fronting was most
prominent amongst Black speakers, while Coloured and Indian females showed greater
resistance. Overall, the Black females approximated most closely to the norms of the 
White reference group of their gender. 
Among the few sociophonological studies of Nigerian English is Ojareche's
(2009) work, titled 'A sociophonological analysis of Nigerian male and female
television newscasters' speech'. The study attempts to investigate variation in the spoken
English performance of Nigerian newscasters in stress and intonation on the basis of
gender and ethnicity. The data was sourced from the newscasts of Nigerian Television
Authority Network newscasters of Hausa, Igbo and Yoruba origin and a few other
minority groups. Sentences extracted from each subject's newscasts were analysed
statistically and through acoustic means. The study came to the conclusion that there
was gender balance in television newscasting, as there was no significant difference
between the pronunciation of male and female newscasters. On the other hand, mother
tongue influence was evident in the newscast of each subject of study.
Sogunro (2012) is 'A socio-phonological analysis of Hausa English (HE), Igbo
English (IE) and Yoruba English (YE) varieties in Nigeria'. The work is an attempt to
empirically describe variations in Nigerian English accents on the basis of ethnicity
and gender, and to assess Jibril's (1982) claim of convergence to Yoruba English. The
respondents were 30 male and 30 female undergraduates of Nigerian universities,
representing the three major ethnic groups: Hausa, Igbo and Yoruba. A recording of 11
preselected sounds /v, z, θ, ð, ʧ, ʃ, ɜ:, e, ə, ʌ/ read by the respondents, and their casual
conversations were made. Each of them also filled in a questionnaire. The data was 
analysed using percentages, t-test and ANOVA.  
It was revealed, among other things, that ethnicity had a significant effect on 8 out of
11 sounds; a closer relationship was shown between HE and YE than between IE and YE; [u] or
[o] were used by HE and IE respondents, while YE respondents favoured [ɔ]; high tone
endings were found in HE and YE, while IE respondents used low tone endings; and no
significant difference was found between the sexes in the three ethnolects. The study
concludes that since ethnicity is the major factor of variation in HE, IE, and YE
accents, Nigerian English accents are best categorised on the basis of ethnicity rather
than region, as Jibril postulated.
Neither of these Nigerian studies addressed subsegmental features
(the domain of CSPs). They were limited to aspects of Nigerian English phonology which have
already been extensively studied (segmental and suprasegmental). Besides, participants
were restricted to speakers of the three major Nigerian languages (Hausa, Igbo and Yoruba). These
limitations are addressed in the present study.
 
2.6 Nigerian English: an overview of the literature 
Having long resolved the controversies over the reality or otherwise of
Nigerian English, scholars have, over the years, sought to identify,
characterise and classify Standard Nigerian English and to establish its norm. This, according
to Kujore (1985), is imperative in order to have “a common point of reference to which 
learners and users may turn for normative guidance”. While some studies have 
concentrated on variety differentiation, others were devoted to identifying the character 
and functions of Nigerian English at lexico-semantic, syntactic, phonological, 
idiomatic and pragmatic levels. The next section reviews some of the studies on 
varieties of Nigerian English. 
 
2.6.1 Nigerian English: variety differentiation 
In view of the obvious fact that Nigerian English is heterogeneous, a number of 
studies have, particularly, been conducted to examine the varieties of Nigerian English 
with a view to establishing what should be accepted as Standard Nigerian English.  
Brosnahan (1958) was the first to categorise the varieties of English in Nigeria,
using formal educational attainment as a criterion. He classified spoken Nigerian English
into four levels according to quality of education. The first variety, pidgin, is spoken 
by people without formal education; variety two is spoken by primary school 
certificate holders and is used by most Nigerian speakers; variety three, which,
according to him, is marked by greater fluency and elaborate vocabulary, is peculiar to
secondary school leavers while the fourth variety, adjudged to be close to Standard 
British English, is associated with speakers who have acquired university or higher 
education. 
Despite its pioneering status, Brosnahan’s typology has been strongly criticised by
scholars on many grounds. While Salami (2001) invalidates it on account of
the absence of empirical data to back it up, Banjo (1971) considers it too simplistic.
It has also been argued that the typology lacks currency and has lost touch
with the sociolinguistic realities of the English spoken in Nigeria today. His claims that
Variety II (spoken by primary school certificate holders) is used by most Nigerian
speakers, and that Variety III (spoken by secondary school leavers) is marked by greater
fluency and elaborate vocabulary, are no longer tenable, as a casual
observer of spoken English in Nigeria today would recognise. The average primary or
secondary school leaver in Nigeria today can hardly communicate in good English.
Udofot (2004) particularly objected to Brosnahan’s claim that Variety I of
his classification is Pidgin English and is a language of the uneducated. According to
her, scholars never considered Nigerian Pidgin a variety of Nigerian English but a
contact language which evolved as a result of trading activities on the coast between
Nigerian and European traders in the 19th century and later grew with urbanisation and
became important in some towns. She further states that Pidgin is used nowadays,
though in informal contexts, by educated Nigerians (secondary and university students,
as well as the elite). Therefore, the idea of classifying it as a variety of Nigerian
English is not tenable.
Besides, Brosnahan’s (1958) description was rigidly tied to levels of
educational attainment. It has, however, been shown that level of formal education
alone does not necessarily determine competence in spoken English. Jowitt (1991), in
this regard, argues that there are other factors like exposure to English at home, innate
ability and intelligence, amongst others, which could influence one’s degree of
proficiency in English. For instance, a speaker who is still in primary school may be 
more proficient in English than a school certificate holder because he grew up in an 
environment where English was used as a medium of communication. It is unrealistic, 
then, to equate proficiency in English with level of formal education alone. It is against 
this backdrop that the present study takes into consideration other social factors (age,
gender and region of speaker) that are likely to influence performance in spoken 
English. 
Notwithstanding the series of criticisms that trailed Brosnahan’s classification,
it remains the pioneering study on variety differentiation of Nigerian English, thereby
providing a platform for further studies. Besides, it has shown that level of education is
a fundamental (though not sole) parameter for classifying Nigerian English varieties.
This is the trend that later studies of variety differentiation in Nigerian English have
followed: educated speakers have been the subjects of categorisation. This, also, is the
position taken in this study. Our participants are educated speakers of English with a
minimum of 2-3 years post-secondary education.
Banjo (1971, 1993) proposes a typology of Nigerian English based on 
linguistic features as well as the extent of mother tongue transfers and of 
approximation to a world standard, with level of education being a factor but not the 
sole determinant. He identifies four varieties of spoken Nigerian English plotted on 
points on a cline. Variety I demonstrates excessive mother tongue transfers especially 
at the phonological and syntactic levels. This variety, which is more or less described 
as ‘broken English’, is said to be spoken by semi-illiterate Nigerians, who only ‘picked
up the language as a result of the exigencies of their occupations’ (Banjo, 1996:75). It
is considered socially unacceptable and internationally unintelligible. 
Variety II is associated with speakers who are exposed to formal learning of 
English either at the primary or secondary level. Most Nigerian bilingual speakers fall
into this category. Though features of variety I speakers are somewhat exhibited in 
their speech, they demonstrate more extensive vocabulary with fewer syntactic 
deviations and are able to make more phonemic distinctions than variety I speakers. 
This variety is considered intelligible and acceptable locally but lacks international 
intelligibility. 
Variety III, according to Banjo (1996:78), ‘represents the acrolectal use of
English in Nigeria’. Within this category are speakers who attained a level of mastery
by exposure to a standard variety of the language through education and other factors
like family background and quality of instruction. It is characterised by minimal
syntactic errors, and the phonology has RP deep structure but Nigerian phonetic
features capable of revealing the speaker’s origin. It is considered both locally and
internationally intelligible. Banjo further argues that it is inappropriate to equate it with 
years or levels of education because speakers attain this level at different periods:
some in secondary school, others after university education.
Variety IV is close to the Standard British accent and is considered the spoken
form of those that have been exposed to native speaker’s English. Although this
variety is internationally intelligible, it is socially unacceptable in Nigeria for sounding 
foreign and affected. Thus, Varieties I and IV were rejected by Banjo because they 
lack international intelligibility and social acceptability respectively. Variety II was 
equally discarded on the basis of being internationally unintelligible. Variety III was 
eventually accepted as the endonormative model (home grown) for being locally and 
internationally intelligible. 
 Banjo’s classification has been applauded by many scholars. Udofot (2004:94)
considers it as ‘a more realistic classification which is close to the present-day
realities’. Eka (1985:16) opines that ‘the realistic nature of Banjo’s article is the basis
for its popularity and prestige’. It has, thus, been described as the platform on which
further research efforts are placed. The present study takes a cue from Banjo's work
because it is a variationist study which seeks to determine the extent of Nigerian
English speakers' approximation to or deviation from the Standard British English model,
which was also Banjo's target.
However, Banjo’s effort has not been without criticism. The inclusion of
Variety IV in his analysis has been criticised by Bamgbose (1982). He proposes
exclusion of Banjo’s Variety IV on the basis that it represents a category of speakers
who did not get exposed to English under the same circumstances as Varieties I-III
speakers. They learnt the language in the native speaker’s setting and were once
monolingual speakers. Therefore, they cannot be regarded as Nigerian English
speakers in the real sense. Udofot (2004) also discards Banjo’s Variety III and
recommends her Variety II as the model, being the educated variety taught in schools.
She describes Banjo’s Variety III as “an ideal which most educated Nigerians hardly
ever attain except those who have had specialised training in the phonology of 
English” (2004:11). In her opinion, therefore, it will be difficult to get experts who 
speak it.  
Jibril (1982), in a bid to describe the standard variety of Nigerian spoken 
English, approached his study from the geo-tribal perspective, using the recorded 
speech of forty-five Nigerian speakers of English of Hausa, Igbo and Yoruba origins 
on Nigerian Television Authority (NTA). This approach was informed by his view that 
“Nigerian English is not a single variety of English but a conglomeration of many 
varieties which relate to one another in sufficient respects to qualify for a common 
cover term” (Jibril, 1982:5). In view of this, he examined Nigerian English on different 
levels of variation: geographical, ethnic, social and linguistic.
He identifies two broad diatopic subvarieties: Hausa English and Southern
English, and subsequently divides Southern English into Igbo and Yoruba English.
Using proximity to or distance from RP as a criterion, he further identifies Basic Hausa 
English, Sophisticated Hausa English, Basic Southern English and Sophisticated 
Southern English. In view of perceived similarity between Basic Hausa and Basic 
Southern English, he again proposes Southern-influenced Hausa variety. He, thus, 
claims that standard Nigerian spoken English will be marked by Northern and 
Southern accents, in which case it has to be a combination of Sophisticated Hausa and 
Sophisticated Southern Varieties. 
Jibril’s study did introduce a new dimension (the geo-tribal perspective) to the
study of Nigerian English. As rightly noted, this approach cannot be ignored in a
multilingual setting like Nigeria where spoken English is bound to be influenced by
local languages. However, Jibril has been criticised on the grounds that he lumped
Igbo and Yoruba English sub-varieties together as Southern English, when it is
obvious that both varieties differ in many respects. Also, Jibril’s study was limited to
a few elite participants (45) drawn from the three major ethnic groups in Nigeria.
Besides, he only investigated, in passing, cases of consonant assimilation and other 
syllable structure processes. The present study, however, uses a larger population 
sample involving diverse sociolinguistic groups from both large and small language 
groups in four regions in Nigeria to assess speakers' proximity to SBE in connected 
speech processes. 
 Jowitt (1991), in his book, Nigerian English Usage: an Introduction, proposes
the concept of ‘Popular Nigerian English’ (PNE) in lieu of ‘Nigerian English’. He
claims that “the usage of every Nigerian is a mixture of Standard forms and Popular
Nigerian English forms, which are in turn composed of errors and variants” (1991:47).
Based on levels of educational attainment, he establishes a scale of three levels: V1,
V2, V3, akin to Banjo’s (1971) varieties, which are subsumed under the broad concept:
‘PNE’. These levels represent Primary VI certificate holders, WASC holders and
university graduates respectively and range from forms heavily influenced by mother
tongue transfer to near-Standard English forms. Towards the extreme end of
the cline is ‘Near-Standard Nigerian English,’ which he considers as the emerging
Standard Nigerian English. This, however, has been considered a serious deficiency in
Jowitt’s work, in that he equates Popular Nigerian English with the Standard and sees
Standard Nigerian English as the ideal Nigerian speakers have to strive to attain.
Nevertheless, Jowitt’s contribution to varieties differentiation and
characterisation of Nigerian English has been illuminating. It provides a
comprehensive description of both segmental and suprasegmental features of Nigerian
English from a geo-tribal perspective using Hausa, Igbo and Yoruba sub-varieties of
Nigerian English; and compares them with RP features. His concept of Popular 
Nigerian English is particularly instructive in the sense that it introduces a new 
perspective to the view of Nigerian English.  
Udofot (2004), in her work, “Varieties of Spoken Nigerian English”, reviews 
previous attempts at varieties differentiation in Nigerian spoken English by scholars 
like Brosnahan (1958) and Banjo (1971), with a view to describing the features of 
spoken English in contemporary Nigeria and re-classifying the varieties of Nigerian 
English. Her participants were sixty Nigerians of diverse educational, linguistic and 
socio-economic backgrounds who have learnt English in Nigeria and were taught by 
Nigerian teachers; and a British native speaker of English whom she used as control. 
She attempts to identify segmental and non-segmental characteristic features of 
Nigerian English pronunciation across diatopic varieties, and thereby establish the 
Standard variety of Nigerian English. 
She grouped her participants, aged between seventeen and sixty years, with 
educational qualifications ranging from the Junior School Certificate to the Doctor of 
Philosophy, into three groups of twenty as follows: 
Group One: participants who have studied English for 9-12 years from primary to 
secondary school. 
Group Two: participants who have studied English for twelve to fourteen years; they 
have completed or are about to complete tertiary education.
Group Three: participants who have learnt English for over fifteen years and have 
undergone further training in English pronunciation. 
They all spoke on the same topic: “The high cost of living in Nigeria” for three 
minutes and were also asked to read a common passage. The recorded data was 
analysed perceptually and statistically with the aid of the Wilcoxon Matched Pairs 
Signed Rank Test and the Analysis of Variance (ANOVA). 
She thereby reclassifies the varieties of spoken Nigerian English into three. 
Variety One, which is also referred to as Non-Standard Variety, has as its exponents 
primary and some secondary school certificate holders, some second year university 
undergraduates, holders of NCE certificates and primary school teachers. Variety Two, 
also called Standard Variety, is composed of third and final year undergraduates, 
university graduates, university and college lecturers, other professionals, secondary 
school teachers of English and HND holders. The third variety, which is the
Sophisticated Variety, comprises university lecturers in English and Linguistics,
graduates of English and the Humanities and those who have lived abroad. She
recommends her Variety Two as the Standard variety which, according to her, is the
variety taught in schools and spoken by most educated Nigerians at the moment. This
is different from Jowitt’s (1991) PNE and Banjo’s (1971) Variety III, which she
described as an improbable ideal for most educated Nigerian speakers.
An important contribution of this work is that it reveals the gross inadequacy of 
educational attainment as the sole measure of proficiency in spoken English in Nigeria. 
It was discovered from her data that some Master’s degree holders in English were
exponents of her Variety One. Moreover, her choice of participants, which took
cognizance of their age, tribe and educational background, is particularly relevant to
the present study. However, Udofot’s study excluded subsegmental features from its
analysis, which makes the present work relevant and necessary.
The present study, therefore, is a scholarly attempt aimed at contributing to the 
body of research on Nigerian English. It seeks to examine the incidence of SBE 
connected speech processes in Nigerian English and thereby determine the level of NE 
speakers' approximation to or deviation from the SBE norms. This will afford us the 
opportunity to adequately characterise Nigerian English, sifting errors from variation, 
and improve pedagogy. 
 
2.7 Received Pronunciation/Standard British English 
Received Pronunciation (RP), otherwise referred to as The Queen’s English,
Public School Accent, Oxford English, BBC English, Standard English, etc. is the 
British English accent which is customarily regarded as a prestige variety and as a 
pronunciation model in the teaching of English as a foreign language. It is the accent 
of the Court, the upper classes and the educated; the accent used by presenters and 
newsreaders in the BBC and an accent that conceals the regional background of the 
speaker (not confined to any locality, geographical area or region).  
RP evolved in the British society as a result of the rise of accent as a social 
signifier (accent became synonymous with prestige) as well as the need to standardise 
the spoken language (Hannisdal, 2006). By and large, the educated speech of London 
(the capital and the surrounding areas) and pronunciation of the upper social class 
emerged as the high-status variant. RP accent was further given a boost with the advent 
of sound broadcasting in 1922 and television in the 1930s. Only RP speakers were 
used as announcers and newsreaders by the BBC; a fact which further associated RP 
with social prestige, high status and intellectual competence.
Hannisdal (2006) is, however, of the view that RP no longer enjoys its previous 
towering status due to increasing democratisation of the British society whereby non-
RP accents are now permitted in many contexts where RP previously held sway. She 
further observes that RP is somewhat detested in many contexts, because it connotes 
social exclusiveness and pretension. Hughes and Trudgill (1996:9) put it this way: 
 
It is sometimes said that nowadays there is not the same 
pressure as there once was to modify one’s speech in the
direction of RP. Reference is made to the fact that announcers 
with non-RP accents are now to be heard on the BBC, that 
important posts in industry and the civil service are held by 
non-RP speakers, and that some younger RP speakers have 
adopted, more or less deliberately, features of regional 
pronunciation. 
 
The implication of this is that because of its exclusiveness, RP does not 
represent the accent of the majority of British English speakers. Besides, RP is no 
longer a homogeneous dialect, as it consists of at least three subtypes: conservative RP,
common among the older generation and some professions or social groups; general RP,
which is the pronunciation adopted by the BBC; and advanced RP, typical of young 
people (Gimson, 1980).  
Therefore, in order to avoid the controversy of which subtype to choose as a 
model and the difficulty of searching for its speakers as control, this study rather 
adopts the term Standard British English (which is more encompassing and represents 
the speech of a majority of British native speakers) as a model. Speech samples were 
collected from two educated native Britons who had lived in Britain for more than 50 
years. They were used as control to confirm the features of connected speech
processes of SBE attested in the literature. 
 
2.8 Acoustic Phonetics 
Acoustic phonetics deals with the physical properties (properties of the 
soundwaves) of speech sounds; that is, how sounds are physically transmitted and 
acoustically measured. It is the structure of soundwaves (waves of fluctuating air 
pressure) that distinguishes one sound from another. These waves can be acoustically 
measured in terms of frequency (corresponding to the pitch of a sound), intensity 
(corresponding to loudness) and quality (vowel, consonant, voicing, manner, etc.). For
example, the higher the frequency (measured in Hertz), the higher the sound is in 
pitch; the more extreme the fluctuations in pressure, the greater the amplitude/intensity 
of the wave (measured in decibels), and the louder the sound (Kirchner, n.d.). 
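The relation between pressure fluctuation and level in decibels can be made concrete with a short calculation. The sketch below uses the standard definition of sound pressure level, L = 20 log10(p / p0), with the conventional reference pressure of 20 micropascals in air; the formula and the function name spl_db are supplied here for illustration and are not taken from Kirchner (n.d.) or from the procedures of this study.

```python
import math

# Conventional reference pressure for sound in air (20 micropascals).
# This value is an assumption of the illustration, not a figure from the thesis.
P_REF = 20e-6


def spl_db(pressure_rms_pa):
    """Return the sound pressure level in dB for an RMS pressure in pascals."""
    return 20 * math.log10(pressure_rms_pa / P_REF)


# Larger pressure fluctuations give a higher level, hence a louder sound.
quiet = spl_db(0.002)
loud = spl_db(0.02)
print(round(quiet, 1), round(loud, 1))
```

On this scale, a tenfold increase in pressure amplitude raises the level by 20 dB, while doubling it adds about 6 dB.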
This study employs a spectrographic instrument (a device that translates a sound
into a visual representation of its component frequencies) to examine such connected 
speech features as voicing assimilation, boundary consonant elision and linking /r/ in 
the speech of select Nigerian speakers of English, with a view to corroborating 
findings obtained through perceptual means.  
 
 
 
 
 
 
 
 
 
 
CHAPTER 3 
 
THEORETICAL FRAMEWORK 
 
3.0    Introduction  
 This is a study at the borderline of two distinct fields: Sociolinguistics and
Phonetics. It is expedient, therefore, to approach it using relevant theoretical insights
from these fields of study. As a study principally concerned with speech, particularly
phonological processes, it is premised on Noam Chomsky’s Generative Phonology.
Specifically, aspects of the theory which are relevant to this work are discussed. And,
as a study in language variation, various theoretical developments in Labov's
variability concept are examined.
3.1 Generative Phonology  
 Generative Phonology (GP), an offshoot of a wider theory of language,
Transformational Generative Theory, began to unfold with Chomsky’s publication of
The Logical Structure of Linguistic Theory in 1955 and Halle’s publication of
Phonology in Generative Grammar in 1962, and was later elaborated in Chomsky & 
Halle's (1968) The Sound Pattern of English (SPE). It developed as a result of apparent 
discontent with certain tenets of classical (taxonomic) phonology which was prevalent 
in North America in the 1940s and 1950s, and rose to serve as an alternative to it. 
 Classical Phonology was concerned with inventories of elements and a 
classificatory or taxonomic approach to linguistic analysis (Clark and Yallop, 1995). 
Utterances were basically segmented into their constituent phonemes with a view to 
discovering the phonemes which occur in different languages (Schane, 1973). The 
theory distinguished between phonemes (level of contrast or opposition or the 
phonemic) and allophones (level of pronunciation or the phonetic). For example, in 
English, the aspirated [pʰ] of pin and the unaspirated [p] of spin are allophones of the
same phoneme /p/ and are in complementary distribution. However, according to 
Kenstowicz (1994), the derivation of allophones from the phoneme is not determined 
by phonological processes; allophones are rather in a correspondence relation; the 
distribution of the elements composing each level (phonemic and phonetic) is stated by 
phonotactics, e.g. [pʰ] occurs at the onset of a stressed syllable while [p] occurs
elsewhere.  
 Consequently, the concepts of phoneme and allophones were fraught with 
certain phonological inadequacies. First is the observation that [pʰ] and [t] as well as
[h] and [ŋ] also stand in complementary distribution but are not derived from a single
phoneme (Kenstowicz, 1994). Second is the case of phonemic representations of the
stem final consonant of electric /elektrik/ and electricity /elektrisiti/ which contain /k/ 
and /s/ respectively, cited by Schane (1973). Schane notes that in phonemic theory, it is
possible to say that [pʰ] and [p] are variants or allophones of a phoneme /p/. In this
instance, however, /k/ and /s/ cannot be considered as such. This is because both are
regarded as independent phonemes of English. Therefore, while the phonemicists were
able to provide an explanation for [pʰ] and [p], little or no relationship was established
between /k/ and /s/. 
 These puzzles, amongst others, motivated the generativists to re-orientate the 
focus of phonological descriptions. Apparently convinced that a case such as electric
vs electricity above cannot be explained phonemically (is neither phonemic nor 
phonetic), they jettisoned the phonemic level of representation and postulated an 
underlying representation (also known as systematic phonemic, abstract or deep forms) 
converted by phonological rules into systematic phonetic or surface forms (Clark and 
Yallop, 1995; Simo Bobda, 1994). The Underlying level of Representation (UR) is the 
phonemic level which is the dictionary representation of words, while the phonetic 
level is the actual level at which real sounds are produced. At the UR level, the word 
pin will be transcribed /pɪn/ between slashes and converted to its phonetic 
representation [pʰɪn] at the phonetic level by the phonological (aspiration) rule. This is
illustrated as follows: 
(Input) Phonemic (underlying) Level of Representation: /pɪn/
        ↓
P-Rules (Aspiration Rule)
        ↓
(Output) Phonetic (surface) Level of Representation: [pʰɪn]
Fig. 3.1 Levels of Representation
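The two levels in Fig. 3.1 can be sketched as a small derivation in Python. This is only an illustrative toy under stated assumptions: the function names aspirate and derive are hypothetical, and the rule context is simplified to word-initial position rather than the onset of a stressed syllable.

```python
# Toy derivation from an underlying form to a surface form with one P-rule.
# Assumption: aspiration is applied only to a word-initial voiceless stop,
# which simplifies the "onset of a stressed syllable" context.
VOICELESS_STOPS = {"p", "t", "k"}


def aspirate(segments):
    """Aspiration rule: add aspiration to a word-initial voiceless stop."""
    out = list(segments)
    if out and out[0] in VOICELESS_STOPS:
        out[0] = out[0] + "ʰ"
    return out


def derive(underlying):
    """Map an underlying representation (list of segments) to a surface string."""
    surface = aspirate(underlying)
    return "[" + "".join(surface) + "]"


print(derive(["p", "ɪ", "n"]))       # pin: the stop is word-initial
print(derive(["s", "p", "ɪ", "n"]))  # spin: the stop follows /s/
```

The underlying form of pin and spin contains the same /p/; only the rule decides whether the surface segment is aspirated.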
This allows phonological rules and principles to be more transparently and 
economically stated with a view to eliminating redundancy from phonological 
analyses. Harrington (2004) opines, in this regard:  
 
In the Sound Pattern of English, one of the main aims is to 
factor out many more redundancies from the words' 
phonological representations and to fill in these redundancies 
by rule. This in turn results in a representation which is a good 
deal more abstract than the phonemic forms... Furthermore, 
these highly abstract representations are presumed to form part 
of the talker's knowledge of the language.  
 
To generativists, then, the /k/ of electric and the /s/ of electricity are, at the underlying
representation, manifestations of a single segment K, thereby yielding /elektriK/ and
/elektriK + iti/. After rules (e.g. velar softening) are applied, /elektriK + iti/
becomes [elektrisiti].
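The electric/electricity alternation can be sketched in the same spirit. The code below is a minimal illustration, assuming a string representation in which K stands for the abstract segment and + for a morpheme boundary; the function names and the simplified context (K before a boundary followed by i) are assumptions of the sketch and not the SPE formulation.

```python
def velar_softening(form):
    """K -> s before a morpheme boundary followed by i (simplified context)."""
    return form.replace("K+i", "s+i")


def default_k(form):
    """Any K not softened surfaces as k."""
    return form.replace("K", "k")


def surface(underlying):
    """Apply the two rules in order and erase morpheme boundaries."""
    form = velar_softening(underlying)
    form = default_k(form)
    return "[" + form.replace("+", "") + "]"


print(surface("elektriK"))      # electric
print(surface("elektriK+iti"))  # electricity
```

A single underlying segment thus accounts for both surface consonants, which the phonemic analysis had to treat as unrelated.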
 
3.1.1 Phonological rules  
Generative Phonology canvasses a phonological description that dispenses with the
analytical procedures of segmentation and classification and is instead based on the
formulation of a set of rules which constitute the phonological component of a
grammar (Chomsky, 1964; Clark and Yallop, 1995). The focus of the
transformational-generative theory from which it evolves is a linguistic description
capable of constructing a grammar that would generate linguistic forms. In order to
yield the phonological component of such a grammar, therefore, the theory proposes that
underlying forms of the language must be converted into surface representations by the
application of a set of phonological rules (Clark and Yallop, 1995). In this regard, Clark
and Yallop (1995:139) explain further:
 
The very term ‘generative’ draws on a mathematical concept of
definition by the application of rules or operations. Thus, in
generative linguistics, a set of rules may be said to ‘define’ a
language by generating all and only the correct possibilities... 
The rule is therefore powerful, in the sense that it generates an 
infinite number of possibilities, but also restrictive, in the sense 
that it generates only sequences of the language and not 
impermissible sequences like [aa], [m] or [aaammm]. 
 
 SPE, which elaborates the import of the theory, considers a grammar as a system
of rules that relates sound and meaning, comprising several components including a
semantic component and a phonological component by which grammatical structures
are converted to their phonetic representations by the application of rules.
 
Base (syntactic rules)
        ↓
Deep structures → Semantic component → Semantic representation
        ↓
Transformational syntactic rules
        ↓
Surface structures
        ↓
Phonological rules
        ↓
Phonetic representation

Fig. 3.2 A generative model of grammar
(Source: Clark and Yallop, 1995:402)
 
As rightly pointed out, the notion of phonological rules is an important concept
employed in GP to map underlying representations onto phonetic representations.
For Fromkin and Rodman (1993:241), ‘phonological rules relate the minimally
specified phonemic representation to the phonetic representation and are part of a
speaker’s knowledge of the language’. In other words, GP attempts to assign, as
correctly as possible, phonetic representation to utterances by means of ‘generated’
rules in such a way as to reflect the ideal speaker’s internalised (intuitive) grammar.
Its basic premises are that phonological structure reflects the linguistic competence of
the individual native speaker to compute a phonetic representation for the potentially
infinite number of sentences generated by the syntactic component of the grammar and
that this competence can be investigated in a serious scientific fashion (Kenstowicz,
1994). Harrington (2004) puts it this way:
 
There are phonological rules that link these often highly 
abstract underlying forms to the phonetic forms ... because 
otherwise we cannot explain how underlying forms are related 
to pronunciation (this is exactly parallel to our earlier 
phonemic/phonetic distinction: once we represent words 
phonemically, we have to have rules that fill in the redundant 
or predictable aspects of pronunciation like aspiration; the 
difference in the Generative Phonology model is that the 
underlying forms that are being proposed are more abstract 
than phonemic forms- resulting in many more rules to explain 
the predictable and redundant aspects of pronunciation- and 
they lay much greater emphasis on the claim that these 
underlying forms are in some sense 'psychologically real' i.e. 
part of the talker's linguistic competence). 
 
These rules, according to Simo Bobda (1994), capture morphophonemic 
alternations in a way that many irregularities or seemingly inexplicable grammatical 
puzzles can be unravelled by establishing the right ordering of rules and the different 
rule interactions. He cites the example of decade which, ordinarily, violates the rule
converting /k/ to [s] before non-low front vowels; the form is regular if that rule is
ordered before the rule changing certain monophthongs to diphthongs. Another instance
is the divergent phonological behaviour of the underlined parts of (a) gymnasium [eiz] and
(b) potassium [æs] which, he says, can be traced to rule ordering. In (b) [æ] remains
lax because two consonants follow it (laxing rule) unlike one in (a); /s/ remains
voiceless because s-Voicing applies before Cluster Simplification.
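The effect of rule ordering in the potassium case can be illustrated with a toy derivation. The sketch below is a simplification under stated assumptions: the underlying strings potæssium and gimnasium are hypothetical stand-ins for the real underlying forms, s-Voicing is reduced to "s becomes z between vowels" and Cluster Simplification to "ss becomes s"; neither is Simo Bobda's exact formulation.

```python
import re

# Vowel symbols used by the toy forms below (an assumption of this sketch).
VOWELS = "aeiouæ"


def s_voicing(form):
    """s -> z between two vowels (simplified context)."""
    return re.sub(rf"(?<=[{VOWELS}])s(?=[{VOWELS}])", "z", form)


def cluster_simplification(form):
    """ss -> s (simplified degemination)."""
    return form.replace("ss", "s")


def derive(underlying, rules):
    """Apply the rules to the underlying form in the order given."""
    form = underlying
    for rule in rules:
        form = rule(form)
    return form


attested_order = [s_voicing, cluster_simplification]
reversed_order = [cluster_simplification, s_voicing]

print(derive("potæssium", attested_order))  # voicing finds no single s to apply to
print(derive("potæssium", reversed_order))  # simplification feeds voicing
print(derive("gimnasium", attested_order))  # a single intervocalic s is voiced
```

With s-Voicing ordered first, the double consonant blocks voicing and the later simplification leaves a voiceless [s]; the reverse order would wrongly predict a voiced consonant in potassium.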
SPE proposes over forty such rules operating on underlying representations
(URs) to yield the surface forms of both segmental and suprasegmental features (Simo
Bobda, 1994; Clark and Yallop, 1995). The following are some of these rules in SBE
which are particularly relevant to this study:
 
(a) Assimilation of voice rules 
 
(i) Regressive Devoicing:
 [-sonorant] → [-voice] / ___ ## [-sonorant, -voice]

(The first obstruent takes on the voiceless feature of the second obstruent)
(Source: Adapted from Schane, 1973:68) 
e.g. I have to go is pronounced [aɪ 'hæftə 'gəʊ]: /v/ of have becomes /f/, losing its 
voicing under the influence of the following voiceless /t/ of to. 
  
 
 
(ii) Progressive Voicing:
 [+ant, +cor, +str] → [α voice] / [α voice] ___

(s alternates between /s/ and /z/ according to whether the preceding non-sibilant
segment is voiceless or voiced, e.g. stops – [stɒps]; sobs – [sɒbz]).
 
(Source: Simo Bobda, 1994:56)  
 
(iii) Progressive Devoicing:

 [-son] → [-voice] / [-voice] ## ___

(A word-initial obstruent becomes voiceless after a word-final voiceless consonant, e.g.
nice boy - [naɪs b̥ɔɪ])
 
 (b) Place assimilation rules 
 
 (i) [alveolar, stop, -voice] → [α place] / ___ ## [α place, stop]

(The voiceless alveolar stop /t/ assimilates in place of articulation to the following
bilabial or velar stop /p, k/, e.g. met Peter and that case).

(ii) [alveolar, stop, +voice] → [α place] / ___ ## [α place, stop]

(The voiced alveolar stop /d/ assimilates in place of articulation to the following velar
or bilabial stop /g, b/, e.g. good girl and good bye).

(iii) [alveolar, nasal] → [α place] / ___ ## [α place, stop]

(The alveolar nasal /n/ assimilates in place of articulation to the following bilabial
stops /b, p/, e.g. ten boys and ten pounds).
(Adapted from: Wells, 1982:61)  
 
 
(c) Palatalisation/Yod coalescence rule 
[-son, +cor, +ant] → [-ant, +strd] / ___ ## [-con, -syl, -back] [-con, -stress]
(/t, d, s, z/ are converted into [ʧ, ʤ, ʃ, ʒ] respectively, before the palatal glide /j/ at the
word boundary followed by an unstressed vowel, e.g. did you? [dɪʤʊ], won’t you?
[wəʊnʧʊ]).
(Adapted from: Simo Bobda, 1994:65)  
 
(d) R-insertion/Linking-r rule 
ø → r / V ___ (##) V
(/r/ is inserted between a vowel and a following vowel, with or without an intervening
word boundary, e.g. here and there /hɪər ən ðeə/)
(Source: Wells 1982 cited in Simo Bobda 1994:67). 
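Several of the boundary rules listed in (a) to (d) can be sketched as small functions over a pair of adjacent words. This is an illustrative toy, not the analytical procedure of this study: each word is a string in which one character stands for one segment, stress is ignored (so the unstressed-vowel condition on yod coalescence is dropped), and all table and function names are hypothetical.

```python
# Segment tables (assumptions of the sketch; one character = one segment).
VOICELESS_OBSTRUENTS = {"p", "t", "k", "f", "s", "ʃ", "ʧ", "θ"}
DEVOICE = {"v": "f", "z": "s", "b": "p", "d": "t", "g": "k",
           "ð": "θ", "ʒ": "ʃ", "ʤ": "ʧ"}
PLACE = {"p": "bilabial", "b": "bilabial", "k": "velar", "g": "velar"}
ALVEOLAR_TO = {
    ("t", "bilabial"): "p", ("t", "velar"): "k",
    ("d", "bilabial"): "b", ("d", "velar"): "g",
    ("n", "bilabial"): "m",
}
COALESCE = {"t": "ʧ", "d": "ʤ", "s": "ʃ", "z": "ʒ"}
VOWELS = set("aeiouæɒɔəɪʊʌɜɑ")


def regressive_devoicing(w1, w2):
    """A word-final voiced obstruent devoices before a voiceless obstruent."""
    if w1[-1] in DEVOICE and w2[0] in VOICELESS_OBSTRUENTS:
        w1 = w1[:-1] + DEVOICE[w1[-1]]
    return w1, w2


def place_assimilation(w1, w2):
    """Word-final /t, d, n/ take the place of a following bilabial or velar stop."""
    key = (w1[-1], PLACE.get(w2[0]))
    if key in ALVEOLAR_TO:
        w1 = w1[:-1] + ALVEOLAR_TO[key]
    return w1, w2


def yod_coalescence(w1, w2):
    """Word-final /t, d, s, z/ merge with a following /j/ (stress ignored)."""
    if w1[-1] in COALESCE and w2[0] == "j":
        w1 = w1[:-1] + COALESCE[w1[-1]]
        w2 = w2[1:]
    return w1, w2


def linking_r(w1, w2):
    """Insert r between a word-final vowel and a word-initial vowel."""
    if w1[-1] in VOWELS and w2[0] in VOWELS:
        w1 = w1 + "r"
    return w1, w2


print(regressive_devoicing("hæv", "tə"))  # have to
print(place_assimilation("ten", "bɔɪz"))  # ten boys
print(yod_coalescence("dɪd", "jʊ"))       # did you
print(linking_r("hɪə", "ən"))             # here and
```

Each function returns the pair unchanged when its context is not met, which mirrors the way a rule applies only in the environment stated after the slash.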
 
As observed from above, phonological rules delete, insert, change segments or 
change the features of segments and are expressed through the process of rule 
formalization. It is clear from the foregoing, therefore, that the goal of GP is not just to 
offer observational adequacy (ability of a grammar to correctly state that certain forms
are observed while other forms are not) which was common to other models of
phonology, but to achieve descriptive adequacy (ability to, in addition to transcribing 
the data, account for the knowledge- linguistic competence- of the native speaker) 
(Hyman, 1975).  
 
3.1.2 Formalisation of rules  
In order to express clearly and explicitly the native speaker's internalised 
knowledge of his language, therefore, phonological rules must state the class of sounds 
affected by the rule, the context or phonemic environment of the relevant sounds and 
the resultant phonetic change (Fromkin and Rodman, 1993). The vowel nasalisation rule,
for instance, will be stated as follows:
Nasalise (phonetic change) vowels and diphthongs (affected class of sounds) 
before nasals (context or phonemic environment) 
The above stated rule, however, can be formulated and formalised in an economical, 
maximally simple, clear and unambiguous manner, using distinctive features matrices 
and mathematical/scientific notations as done in the previous section. These feature
notations 'provide a way to express the generalisations of the language that may be
obscured otherwise' (Fromkin and Rodman, 1993:243). In this regard, the vowel
nasalisation rule above may be expressed as:
 [-consonant] → [+nasal] / − [+nasal]   
 Some of the notations used in phonological analysis to formalise phonological
rules, as employed in this study, are stated below with their interpretations.
/ /  phonemic/phonological representation 
[ ]  phonetic representation 
+  has the feature of 
-  lacks the feature of
→  becomes 
/  in the environment of 
−−  position of the affected sound 
A→B/C− A becomes B after C 
A→B/−C A becomes B before C 
A→B/C−D A becomes B between C and D
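
To make the notation concrete, the schema A→B/C−D can be read as a procedure. The following Python sketch is an illustration only: the function name, the segment sets and the use of a combining tilde to mark nasalisation are assumptions made for this example, not part of the cited sources.

```python
def apply_rule(segments, affected, change, left=None, right=None):
    """Apply A -> B / C ---- D to a list of segment symbols.

    affected: set of symbols that may undergo the rule (A)
    change:   function returning the output symbol (B)
    left:     set of symbols allowed before the target (C), or None
    right:    set of symbols allowed after the target (D), or None
    Contexts are checked against the input string, so the rule applies
    simultaneously to every position that meets its description.
    """
    output = []
    for i, seg in enumerate(segments):
        left_ok = left is None or (i > 0 and segments[i - 1] in left)
        right_ok = right is None or (
            i + 1 < len(segments) and segments[i + 1] in right
        )
        if seg in affected and left_ok and right_ok:
            output.append(change(seg))
        else:
            output.append(seg)
    return output


# Hypothetical segment classes for the vowel nasalisation rule.
VOWELS = {"æ", "ɪ", "ɛ"}
NASALS = {"m", "n", "ŋ"}


def nasalise(vowel):
    # U+0303 is the combining tilde used to mark nasalisation in IPA.
    return vowel + "\u0303"


# [-consonant] -> [+nasal] / ---- [+nasal], applied to "ban"
print(apply_rule(["b", "æ", "n"], VOWELS, nasalise, right=NASALS))
```

Leaving `left` or `right` as None corresponds to the one-sided schemas A→B/C− and A→B/−C listed above.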
 
3.1.3 The Distinctive Feature theory 
As noted in the foregoing section, binary distinctive features, used to
distinguish the affected class of sounds unambiguously, are also involved in the
formalisation of phonological rules. The origin of Distinctive Feature theory can be traced to the
Prague School’s idea of phonological oppositions as championed by Trubetzkoy
(1939). Distinctive features (DFs) were, however, first formalised by Roman Jakobson
in 1941 and further elaborated by Chomsky and Halle in The Sound Pattern of English
(1968). Jakobson and associates (Jakobson, Fant and Halle, 1952; Jakobson and Halle,
1956) devised sets of distinctive features which provided a background for Chomsky
and Halle’s set. According to Mannell (2008), Jakobson's original formulation of
distinctive feature theory was based on the following ideas:
1. All features are privative (i.e. binary). This means that a phoneme either has the 
feature, e.g. [+VOICE] or it doesn't have the feature, e.g. [-VOICE] 
2.  There is a difference between PHONETIC and PHONOLOGICAL FEATURES 
 Distinctive Features are Phonological Features. 
 Phonetic Features are surface realisations of underlying Phonological
Features.
 A phonological feature may be realised by more than one phonetic feature, 
e.g. [flat] is realised by labialisation, velarisation and pharyngealisation 
3.  A small set of features is able to differentiate between the phonemes of any 
single language 
4.  Distinctive features may be defined in terms of articulatory or acoustic features, 
but Jakobson's features are primarily based on acoustic descriptions. 
According to Atoye (2005b), the idea of Distinctive Features (DFs) actually
began as a protest against taxonomic phonology’s notion of the phoneme as the
smallest contrastive linguistic unit that cannot be subdivided. Generative phonology,
for instance, avoids the term ‘phoneme’ and, instead, refers to it as a sound segment. As
far as this school of thought is concerned, a sound segment (phoneme) is divisible; that
is, it is not the smallest unit of utterance but is actually made up of smaller linguistic
units called features. Thus, a sound segment has a bundle of features capable of
differentiating it from another. The features are distinctive, contrastive or significant
for meaning. Botha (1973:215) explains this as follows:
 
A phonetic segment is not an un-analysable whole, but has an 
internal structure. Its internal make-up is specified in terms of 
distinctive features. The distinctive features occurring in 
phonetic representations have a phonetic function and are 
called phonetic features. Examples of such phonetic features 
are ‘consonantal’, ‘anterior’, ‘coronal’, ‘voiced’, ‘nasal’, all of
which have a positive value in the phonetic segment 
traditionally indicated by the symbol /n/. 
 
Thus, DFs establish the phonetic characteristics and natural classes of sounds 
using binary notation [+, -] where [+] indicates presence of specified features and [-] 
indicates their absence. For example, the sound segments /b/ and /p/ in bee and pea can 
be distinguished as follows: 
/b/    /p/  
+cons    +cons 
+stop    +stop 
+labial    +labial 
+voice    -voice 
 
Unlike in phonemic phonology, where the difference between the minimal pair
bee /bi:/ and pea /pi:/ results from the substitution of /p/ for /b/, in GP the difference
is accounted for only in terms of the distinctive feature [+voice] or [-voice].
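
The feature bundles for /b/ and /p/ can be represented as simple mappings, which makes the single-feature contrast explicit. The sketch below is illustrative only; the entries for /d/ and /t/ and both function names are assumptions added here to show how a natural class can be picked out, and they use the same informal feature labels as the display above rather than the full SPE set.

```python
# Feature bundles for /b/ and /p/ as given in the text.
B = {"cons": "+", "stop": "+", "labial": "+", "voice": "+"}
P = {"cons": "+", "stop": "+", "labial": "+", "voice": "-"}

# Hypothetical extra segments, added only to illustrate natural classes.
INVENTORY = {
    "b": B,
    "p": P,
    "d": {"cons": "+", "stop": "+", "labial": "-", "voice": "+"},
    "t": {"cons": "+", "stop": "+", "labial": "-", "voice": "-"},
}


def differing_features(seg1, seg2):
    """Return the features whose values distinguish two segments."""
    return sorted(f for f in seg1 if seg1[f] != seg2.get(f))


def natural_class(inventory, spec):
    """Return the segments that match every feature value in spec."""
    return sorted(
        symbol
        for symbol, features in inventory.items()
        if all(features.get(f) == value for f, value in spec.items())
    )


print(differing_features(B, P))                  # the bee/pea contrast
print(natural_class(INVENTORY, {"voice": "-"}))  # voiceless segments
```

A class such as the voiceless segments is selected with one feature value, whereas listing its members requires naming each segment; this is the economy that the feature notation is meant to capture.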
In Kenstowicz’s (1994) view, when phonological segments are represented as
feature matrices this way, sound change can be formalized as the modification of a 
feature coefficient. Thus, features provide a measure of phonetic distance and allow a 
formal study of natural classes in which the plausibility of a rule is reflected in the 
relative simplicity of its statement.  
In view of this, DF theory is generally regarded as a useful tool in explaining
sound patterns; it particularly offers explanations, in the form of phonological rules,
for sound changes observed in natural speech, and more readily permits generalised
statements within and between languages than do phoneme- and allophone-based
descriptions, e.g. the alternation of the negative prefix between [ɪn-], [ɪm-] and [ɪŋ-] in
intolerant, impossible and incorrect respectively, in which the alveolar nasal /n/
assimilates in place of articulation to the following stop (Atoye, 2005b; Simo Bobda,
1994; Crystal, 1987). Following Chomsky and Halle’s (1968) distinctive features, this
rule of Homorganic Assimilation can be captured as:
 
 [+nas, +ant, +cor] → [αant, βcor] / ---- [-cont, αant, βcor]
 
As seen above, DFs make it possible to formulate a nasal assimilation rule that 
captures different alternations of the alveolar nasal /n/; the use of phonetic symbols 
would require the formulation of three different nasal assimilation rules. This therefore 
establishes the argument that distinctive features allow the possibility of formulating 
phonological rules using a considerably smaller number of units than the phonemes of 
a language, allow natural classes of phonemes to be established with a minimal 
number of features, are economical, maximally simple, clear and unambiguous.  
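
The role of the variables α and β can be shown with a short Python sketch in which the nasal copies the [ant] and [cor] values of the following stop. The feature values follow the usual SPE classification of labials, alveolars and velars; the table names, the function name and the broad transcriptions in the example are assumptions made for illustration.

```python
# [cont], [ant] and [cor] values for the stops, after the SPE classification:
# labials are [+ant, -cor], alveolars [+ant, +cor], velars [-ant, -cor].
STOPS = {
    "p": {"cont": "-", "ant": "+", "cor": "-"},
    "b": {"cont": "-", "ant": "+", "cor": "-"},
    "t": {"cont": "-", "ant": "+", "cor": "+"},
    "d": {"cont": "-", "ant": "+", "cor": "+"},
    "k": {"cont": "-", "ant": "-", "cor": "-"},
    "g": {"cont": "-", "ant": "-", "cor": "-"},
}

# Nasals indexed by their (ant, cor) values.
NASAL_BY_PLACE = {("+", "-"): "m", ("+", "+"): "n", ("-", "-"): "ŋ"}


def homorganic_assimilation(segments):
    """Apply [+nas, +ant, +cor] -> [αant, βcor] / ---- [-cont, αant, βcor]."""
    output = list(segments)
    for i in range(len(output) - 1):
        following = STOPS.get(output[i + 1])
        if output[i] == "n" and following is not None:
            # α and β are bound to the values found on the following stop.
            alpha, beta = following["ant"], following["cor"]
            output[i] = NASAL_BY_PLACE[(alpha, beta)]
    return output


# in + possible: /n/ surfaces as [m] before the labial stop
print(homorganic_assimilation(["ɪ", "n", "p", "ɒ", "s"]))
```

One function covers all three alternants because the output is computed from the context rather than listed, which is the point made about the single feature-based rule.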
 However, a major problem facing DF theory, as observed by Atoye (2005b), is
disagreement amongst scholars on an acceptable set of Distinctive Features. Consequently,
the number and types of DFs vary from one scholar to another, from one
book to another and even in the work of the same scholar from time to time. For instance, while
Jakobson, Fant and Halle (1952) propose twelve sets of features, Jakobson and Halle
(1956) propose fifteen. Chomsky and Halle (1968) came up with twenty-four, and
Ladefoged (1971) twenty-six. The following are the sets of DFs postulated by 
Chomsky and Halle (1968) as highlighted by Schane (1973:26-33): 
1. The major class features- syllabic, sonorant and consonantal. 
2. Manner features- continuant, delayed release, strident, nasal and lateral. 
3. Place of articulation features- anterior, coronal. 
4. Body of tongue features- high, low, back, and lip shape feature: round. 
5. Subsidiary Features- tense, voiced, aspirated and glottalised. 
6. Prosodic features- stress and long. 
 
3.1.4 Phonological boundary  
Germane to the application of phonological processes and rules is boundary 
delineation. In stating and applying many phonological rules, grammatical boundaries 
are particularly taken into consideration. The literature, thus, contains various 
grammatical boundaries used differently from one author to another. Some of these, 
according to Simo Bobda (1994), are: 
 $    syllable boundary  
 =   prefix-stem boundary e.g. pre = side 
 +   general morpheme boundary e.g. electric + ity
 #   word internal boundary (boundary between a base and a neutral suffix)
e.g. advertise#d, dog#s
##   full word boundary  
//    phrase boundary  
Of these, Hyman (1975:196) considers ## (full word boundary), # (word internal 
boundary) and + (morpheme boundary) as the major phonological boundaries.  
According to him, these boundaries are of different strengths arranged in a 
linear order on the scale of 0-3; that is, from the weakest to the strongest as follows: 
Ø    +    #    ##
0    1    2    3
This implies that + is the weakest and ## the strongest. Depending on strength, they 
can either inhibit or condition the application of a phonological process. A boundary is 
said to be strong if it is harder to penetrate. That means ##, being the strongest, has the 
greatest propensity for blocking the application of a rule. Simo Bobda (1994:85), in 
this regard, cites the following instances where the full word boundary blocks nasal 
assimilation: 
 A    B 
  Impair [impɛə] vs ten pence [tɛn pɛns] 
  Rancour [ræŋkə] vs rain coat  [reɪn kəut] 
  Anger [æŋgə]  vs Dan Garvey [dæn gɑvɪ] 
Nasal assimilation takes place in A, but is blocked in B (across the word boundary). He
notes, however, that assimilation is possible at this boundary in fast or lazy speech.
 In the same way, certain phonological processes only take place at a particular 
boundary but not at another. Consider, for instance, the following rule deleting the /g/ 
of /ng/ sequences, which is triggered by a boundary:
 
g      →         Ø / ŋ ____  #  which yields the forms: 
 
/bring ## her/   →  [briŋər]    (full word boundary)
/sing # er/      →  [siŋər]     (internal word boundary)
/lɔng + er/      →  [lɔŋgər]    (morpheme boundary)
/finger/         →  [fiŋgər]    (no boundary)
 
(Source: Hyman 1975:197). 
 
In the derivations above, ## and # condition the deletion of the /g/ of bring her and 
singer respectively, whereas at the + and Ø boundaries the /g/ of longer and finger 
remains.  
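
The boundary-sensitivity of this rule can be sketched in Python by treating boundary symbols as tokens in the string. This is an illustration under stated assumptions: the forms are taken at an intermediate stage where the nasal has already become [ŋ] and the /h/ of her has already been dropped, and the function names are hypothetical.

```python
# Boundaries that satisfy the "#" in the rule: ## contains #.
DELETING_BOUNDARIES = {"#", "##"}
ALL_BOUNDARIES = {"+", "#", "##"}


def delete_g(tokens):
    """Apply g -> Ø / ŋ ---- # to a list of segments and boundary symbols."""
    output = []
    for i, token in enumerate(tokens):
        after_velar_nasal = i > 0 and tokens[i - 1] == "ŋ"
        before_boundary = (
            i + 1 < len(tokens) and tokens[i + 1] in DELETING_BOUNDARIES
        )
        if token == "g" and after_velar_nasal and before_boundary:
            continue  # the /g/ is deleted
        output.append(token)
    return output


def surface(tokens):
    """Erase boundary symbols to give the surface string."""
    return "".join(t for t in tokens if t not in ALL_BOUNDARIES)


singer = ["s", "i", "ŋ", "g", "#", "ə", "r"]
longer = ["l", "ɔ", "ŋ", "g", "+", "ə", "r"]
print(surface(delete_g(singer)), surface(delete_g(longer)))
```

Because + is not in the set of deleting boundaries, longer keeps its /g/, just as finger does when no boundary is present at all.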
It is obvious from the above explanation that the application of phonological
processes depends on particular types of boundaries. It is, however, still possible to
make a phonological rule apply at a boundary other than its own. To achieve this, Lars
(1984) suggests what is called boundary-demotion, a process by which a boundary
acquires a lower status in order to allow a process. He cites the example of the
application of nasal assimilation [ŋk] in one can (across word boundary). To permit 
this, he argues that #one#can has been changed to #one+can#; that is,  # is demoted to 
+  in order that the rule may apply as in  # in + come #.  
As noted in chapter two, words are spoken in a fluent and
continuous stream in connected speech so that the segment boundaries implied by 
phonetic transcriptions are often not evident. This makes it possible for segments to be 
influenced and modified in varying degrees by other adjacent sounds at designated 
morpheme and word boundaries. The implication of this is that phonological rules are
applicable at any of the boundaries since there is the possibility of boundary erasure or
neutralisation in fast speech.
 
3.1.5 Critique of Generative Phonology 
There is no doubt that generative phonology brought with it a novel
and ground-breaking perspective on phonological inquiry; however, it has also
faced severe criticism and numerous unresolved problems and
research questions. As Goldsmith and Laks (2000:5) put it, “the development of
Generative phonology (and generative grammar more generally) was born of a
disciplinary rupture, and brought with it rifts in the field”. Foley (1977), in particular, is
of the view that GP is not a theory of phonology but is merely transformational
phonetics. Sampson (1980), for his part, queries the reality of many underlying
representations posited in SPE and accuses Chomsky and Halle of “overestimating the
ordinary man’s knowledge of his language” (203).
Specifically, GP has been seriously criticised on the grounds of its excessive 
abstractness. Kiparsky (1968), in this regard, doubts the possibility of a learner being 
able to formulate varied phonological rules and representations to explain phonological 
processes in the absence of knowledge of their historical antecedents. He suggested 
that abstract representations are motivated by alternations and that grammars change to 
states in which the underlying representations can be induced by rules that state 
generalizations over the surface phonetic representation. 
Kisseberth (1970) expresses the view that the emphasis of SPE model on 
formal connections among rules makes it difficult to express the functional unity 
among diverse processes. Citing various rules in the phonology of Yawelmani which 
bar the occurrence of successive consonants at the surface level, he posits that it is 
problematic to formalize the notion of rules applying or blocking a particular process 
to satisfy a constraint.  
Stampe (1972) emphasized the importance of substantive rather than formal 
considerations in shaping phonological structure. He draws a sharp distinction between 
natural processes and more phonetically arbitrary rules like SPE's Vowel Shift and 
Velar Softening that state generalizations over limited sets of lexically related words. 
In his view, phonological processes are what the child brings to the language while 
phonological rules are what the language's vocabulary brings to the child. His theory, 
Natural Phonology, prefers a natural or phonetic explanation of phonological
phenomena to GP's excessive formalism.
Finally, with its emphasis on rules of sound change, it has been argued that the 
SPE model has little to say about phonotactics: static constraints on word shape that are
unsuited to rules of sound change and seem best treated as conditions on representation 
that outputs must respect. Kenstowicz & Kisseberth (1976) call attention to the 
problem that constraints on lexical shape are often duplicated by rules of sound change 
that can be thought of as bringing the representation in line with the constraint.  
This series of criticisms brought about several phonological thoughts such as
Natural Generative Phonology (Hooper, 1973; Vennemann, 1974), Natural Phonology 
(Stampe, 1972; Donegan, 1978; Donegan and Stampe, 1979), Lexical phonology 
(Kiparsky, 1982; Mohanan, 1982; Pulleyblank, 1986; Strauss, 1982), Metrical 
phonology (Liberman, 1975; Liberman and Prince, 1977; Hayes, 1980), 
Autosegmental phonology (Goldsmith, 1974; 1976) and Optimality theory (Prince and 
Smolensky, 1991; 1993). They all claimed to offer alternatives to GP. However, as 
Goldsmith and Laks (2000) opine, a majority of these theories were conducted and 
published within the framework of generative phonology, and their criticisms of it 
were expressed in terms of SPE. Besides, none of them is without its weaknesses, 
though each portrays an attempt to rectify certain inadequacies observed in Generative
Phonology. 
Generative phonology, therefore, in many respects remains illuminating and 
relevant to phonological enquiry in both first and second language situations. Through 
phonological rules, GP provides adequate explanations for phonological alternations of 
the ideal native speaker as well as regular and predictable deviations of the second 
language speaker; hence, the choice of the theory as the basis for analysis of data on 
features of Standard British English connected speech in Nigerian English. However, 
in view of the fact that we are dealing with variable data from a second language
situation which a single theory may not adequately account for, explanations shall be 
sought from other relevant theories to fill the gap. 
 
3.2 Variability concept 
Variation in language was never considered important or consequential by 
major linguistic schools like Saussurean theory, American and Prague School 
Structuralism and Chomskyan theory, in particular. As a matter of fact, these theories 
treated language as a strictly invariant entity and dismissed any perceived variability as 
unstructured and never worth studying. The emphasis of the Chomskyan Transformational
Generative linguistic model, for instance, is on the ideal speaker-hearer of a language.
Both individual and social variation in language were considered part of performance,
which was outside the purview of the linguist.
However, with the advent of the variability concept, championed by Labov
(1963, 1966) and expounded by Dillard (1968) and Baratz (1969) amongst others,
emphasis shifted to structured variability in language. This empirical research
method is premised on Bloomfield’s (1927) structuralist view that it is
impossible to distinguish between ‘good’ and ‘bad’ speech. He showed that although
‘literate’ and ‘illiterate’ English were considered in public opinion as ‘good’ and ‘bad’
respectively, empirically, neither of them is disadvantaged with respect to the other.
According to Dittmar (1976:104), therefore, the variability concept seeks to:
 
Explain how and in what function language systems are 
divided (regional, social, functional language varieties), how 
speech realisations are evaluated (privileged, stigmatised status 
of speech forms) and how they change on the basis of 
evaluations (revaluation vs devaluation of standards, dialects, 
speech behaviour of minority groups). The descriptions also 
have to explain to what extent language systems interfere with 
one another on the phonological, syntactic and semantic levels, 
how they are acquired, conserved and modified on these levels 
and, finally on the basis of what relationships they co-exist or 
come into social conflict. The aim of research into speech 
variation is thus to describe and explain the entire social 
network of speech practice and the complex competence that 
speakers have at their disposal for communication, in 
correlation with social norms and parameters. 
 
The variability concept is built on the notion that language is inherently variable at
different structural levels of phonology, morphology and syntax and that it is a
generally recognised fact that no two utterances of the same word by the same speaker 
are ever exactly alike. The same language varies from speaker to speaker or from 
community to community (Milroy and Milroy, 1997).  
In order to demonstrate the perceived co-variation between linguistic and
social categories, variationists usually employ quantification as an essential
methodological tool. In this regard, a linguistic variable, e.g. a sound segment such as
/a:/ whose pronunciation is observed to vary in a particular speech community is 
selected and occurrences of its variants in the speech of different speaker groups are 
quantified. This methodological tool makes it possible to make objective and accurate 
judgments about fine grained differences between individuals and groups of speakers 
in a speech community. Speaker variables commonly used for this purpose are socio-
economic class, age of speaker, sex (gender) of speaker, ethnic group of speaker and 
social network. 
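
The quantification step described above can be sketched as a simple tally of variants per speaker group. The following Python example is illustrative only: the group labels, variant labels and token counts are invented for the demonstration and are not findings of this or any cited study, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def variant_percentages(tokens):
    """Return the percentage of each variant within each speaker group.

    tokens: iterable of (group, variant) pairs, one pair per occurrence
    of the linguistic variable in the recordings.
    """
    counts = defaultdict(Counter)
    for group, variant in tokens:
        counts[group][variant] += 1
    return {
        group: {
            variant: round(100 * n / sum(tally.values()), 1)
            for variant, n in tally.items()
        }
        for group, tally in counts.items()
    }


# Invented tokens, for illustration only.
observations = (
    [("female", "assimilated")] * 3
    + [("female", "unassimilated")] * 1
    + [("male", "assimilated")] * 1
    + [("male", "unassimilated")] * 3
)
print(variant_percentages(observations))
```

Percentages of this kind are the raw material for the group comparisons that variationist studies then test statistically.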
Labov’s work provides the framework for studies in this direction. Many of the
methods he advanced are still employed in sociolinguistics to date. He introduced the
sociolinguistic interview, designed to elicit different speech styles within a single
interview and a stratification of phonological variables according to sex/gender, age,
socioeconomic status and situational context (Wodak and Benke, 1997). Labov’s
contribution to linguistic variation emphasises the twin principles that language is
essentially variable and that the variation is principled and should be the subject of
attention of linguistic theory.
In his 1966 doctoral thesis, “The Social Stratification of English in New York
City”, he was particularly interested in the correlation between linguistic and
sociological variables of social class membership in New York City. Amongst other
things, he investigated the /r/ variable. Using a stratified random sampling method,
Labov stratified his participants into ethnic groups (New Yorkers, Italians, Jews, 
Blacks) and social class (Upper Middle, Lower Middle, Upper and Lower Working 
Class) with the indicators of income, occupation and level of education. He 
interviewed his participants and recorded formal and informal speech, reading aloud 
from a text, and reading a series of minimal word-pairs, covering the range from least 
to most relaxed speech. In the long run, he quantified the mean scores for each social 
group in each style and discovered a correlation between the phonological and 
sociological variables.  
Labov’s theoretical and methodological ideas have been replicated by many
researchers worldwide (e.g. Trudgill, 1974; Cheshire, 1982; Modaressi, 1978; Romaine,
1978), while others have approached his pioneering effort from different social
dimensions. Milroy’s (1987, 2002) studies, for example, focused on the correlation of
linguistic and social networks (the dimension of solidarity at the level of the individual
and his or her everyday contact) rather than Labov’s socio-economic class. According
to the proponent of this approach, “an individual’s social network is straightforwardly
the aggregate of relationships contracted with others, and social network analysis 
examines the differing structures and properties of these relationships.” (Milroy 
2002:549). This approach queries Labov’s (1966) view of a city as a single speech
community; its principal consideration is the internal variation within a particular 
group (the working class) and not the language community as a whole. In this 
direction, their variable is based on the informal social relationship contracted by 
individual speakers with others, and not on comparisons between groups of speakers 
(Milroy and Milroy, 1997). This affords the sociolinguist the opportunity to 
painstakingly investigate and discover the impact of communal social cohesion on the 
speech patterns of individual members. This research drive, thus, serves as a tool for 
investigating the relationship existing between patterns of language maintenance and 
patterns of language change. It seeks to explain, for instance, why stigmatized, non-
standard and low-status speech forms are retained even in the face of intimidation and 
strong pressure from the standard form.   
The Milroys’ research efforts have shown that language correlates with social
network. Findings of the observed linguistic variables revealed a connection between 
specific language behaviour and certain peculiarities of the network. It was also 
established that within the network (the working class), considerable variation exists 
between individuals, between different speech-styles, between men and women, and 
between older and younger speakers (Milroy, 1981). The social network analysis approach
has been applied in both urban and rural monolingual settings (e.g. Bortoni-Ricardo, 
1985; Schmidt, 1985; Lippi-Green, 1989; Edwards, 1986) and in bilingualism, 
language contact and language shift situations (e.g. Gal, 1979; Li, Milroy and Pong, 
1992; Li, 1994). 
However, other variationists like Giles and Smith (1979), Le Page and
Tabouret-Keller (1985) and Bell (1984, 2001) are more interested in socio-situational
variation. This line of research is premised on the fact that variation in speech may
be a function of linguistic style or register of speech. Speech, therefore, may be varied
according to the setting (formal or informal) and the composition of the audience (age,
sex, socio-economic status, and regional background of speaker and addressee, and the
degree of intimacy between the participants in the speech event). Giles and Smith
(1979), in this regard, propounded accommodation theory with a view to exploring
how speakers modify their accent in response to the speech of their interlocutors based
on situational factors. Also, Le Page and Tabouret-Keller’s (1985) acts of identity model
shows how speakers’ linguistic behaviour is motivated by the wish to resemble as
closely as possible that of the group with which they wish to identify. It reveals how
speakers adjust their behaviour as well as accent to suit the perceived norms of a
community with a view to identifying with the community. Bell’s (1984) audience
design theory is another important contribution to this research drive. It holds that
‘style derives its meaning from the association of linguistic features with particular
social groups’ (Bell, 2001:142). In that sense, speakers design their style primarily for
their audience.
Of the various frameworks of variationist studies that have emerged
from Labov’s pioneering effort, this study is premised on the Labovian model.
However, many scholars have expressed scepticism about the applicability of the
Labovian model in a multilingual environment like Nigeria. This is because Labov's
studies were restricted to native-speaker settings where most speakers are
monolingual and differing levels of proficiency in the language are not an issue.
Besides, the kind of elaborate social class system upon which his studies were based is
non-existent in Nigeria. Nevertheless, aspects of the model and its methods are very
relevant to the Nigerian linguistic setting.
First, his ethnic/regional approach to variation is appropriate in Nigeria, which
is made up of different linguistic/ethnic groups and genetic make-ups. This is because
speech production can vary according to the ethnicity/region of speakers (Labov, 1966;
Trudgill, 1974; Guy, 1981; Horvath, 1985). Second, Labov (1963, 1966, 1990, 1991) has
shown that speakers’ gender and age, besides socio-economic status, are also key
factors of speech variation. This study, therefore, attempts to correlate the
phonetic/phonological variables of assimilation, elision and liaison with the speaker
variables of age, gender and region, with a view to establishing any perceived
co-variation.
  
3.2.1 Social variables 
 
3.2.1.1  Age 
Age as a sociolinguistic variable focuses on socially-oriented linguistic change
(variation) in age stratification as well as the nature and social status of age and aging,
rather than its biological status which, in any case, also influences phonological
variation, particularly between adults’ and children’s speech, as a result of anatomical and
physiological differences. Eckert (1997), in this regard, discusses how three key life
stages- childhood, adolescence and adulthood- affect linguistic patterns in various 
manners. 
First, it has been shown that, in childhood, children tend to adhere to the speech
patterns characteristic of women’s speech, which suggests the linguistic influence of
mothers or caregivers on children (Labov, 1990; Foulkes, et al., 1999). This implies,
therefore, that children are able to acquire speech habits undergoing sound changes in
their community as propagated by their mothers or caregivers.
In adolescence, peer-group influence, to a large extent, further shapes language
use. In a bid to conform to peer-group norms, adolescents take up new phonological
patterns apparently different from the variety acquired through their parents or
caregivers. In this regard, Kerswill (1996) reveals how at the pre-adolescent stage
(ages 6-12), there is the onset of a shift from parent-oriented to peer-oriented language
use: children’s speech becomes more and more like that of their peers. Kerswill and
Williams (2000) demonstrate this with the situation in Milton Keynes (an English new
town). Children in Milton Keynes grow up amidst various dialects brought into the
new town by different immigrants. However, by age 12, these varieties have given
way to a more homogeneous local accent as a result of social pressure on adolescents to
accommodate to their peers. Some studies (e.g. Wolfram, 1969; Macaulay, 1977;
Eckert, 1988) particularly show a low correlation between adolescents' speech and the
socio-economic status of their parents, in favour of high conformity to the adolescents’ age
group.
Eckert (1997) further describes adolescence as a period of identity
construction and argues that the activities that occur at this stage involve linguistic 
innovation. It is a period when social use of vernacular is encouraged and linguistic 
change is advanced. According to Kerswill (1996:198), “adolescents are clearly 
significant bearers of change; their networks allow them to have wider contacts than 
younger children, and their desire for a distinct social identity means that they are 
willing to modify their speech”. 
Adulthood is usually assumed to be a period in which the phonological
system is stable and fixed. Studies (e.g. Labov, 1966; Trudgill, 1974; Horvath, 1985) have
shown that adults are generally more conservative in their use of variables than
younger age groups, a fact attributed to the demand for use of standard language in the
workplace and social networks (Sankoff and Laberge, 1978; Nichols, 1983). Nichols
(1983), for example, studied African-American women and found the linguistic
imperatives of the workplace to be an important determinant of patterns of variation.
However, some studies in real time have recently begun to examine the
possibility of language change ongoing in adulthood. Coupland (1980) and Mees and
Collins (1999) reveal how such factors as the social ambition of an individual may
actuate possible changes in their choice of sociolinguistic variants. In a real time study
of four Cardiff women, Mees and Collins (1999) investigated the use of glottal variants
of /t/, which are uncharacteristic of the Cardiff local variety of English. They showed that
the use of glottal variants was more evident in the speech of speakers who desired to
move out of Cardiff, while those who preferred to stay used the variants less. In the
same vein, Harrington, et al. (2000) studied changes in Queen Elizabeth II’s
production of vowels and found a shift in her pronunciation from a
stereotyped upper class RP towards a more mainstream variety of RP.
 
3.2.1.2  Gender 
Gender is related to the biological sex of speakers, either male or female. As
a sociolinguistic variable, however, gender is considered the social interpretation of
sex in terms of roles, norms and expectations that apply to men and women (Eckert,
1989; Cheshire, 2002; Hannisdal, 2006). Considerable evidence exists on the
differences between the language of males and females. On average, men and
women tend to use slightly different language styles.
Over the years, research into gender-specific variation has undergone
various phases and yielded many different claims. Some earlier studies actually
viewed the speech behaviour of women in terms of a deficiency model. In this sense, the
language of men was considered stronger, more prestigious, and more desirable than
that of women, which was regarded as subordinate and deficient (Lakoff, 1975; Wodak
and Benke, 1997).
In another perspective, gender-related studies divided women’s and men’s styles
of speech into a good-bad dichotomy (Trömel-Plötz, 1984), thereby over-generalising
the strengths of women's styles. For instance, the female style was considered
cooperative and that of males competitive. However, these studies did not take into
consideration intra-gender differences; they associated the sexes with the respective
gender, and relied on a unitary model of gender. Another phase of linguistic gender
studies that was to follow concentrated on studying fine-grained differences in the
speech behaviour of men and women, thereby leading to a situational ranking of both
sexes. Studies in this phase paid adequate attention to gender category and reflected
the power structures of society in gender description, discarding the deficit theory for
the dominance theory of gender (O'Barr & Atkins, 1980). Deuchar (1990), in this regard,
claims that females use the standard language as a means of improving their inferior
position in a male-dominated world; the weaker a woman’s position, the more she is
forced to be polite.
Gender socialization was the focus of the next phase of gender research. 
Studies in this approach, otherwise called difference theory, emphasized differences in 
subcultures and socialization processes, rather than context-specific power relations. 
The direction of gender studies in this tradition focused on unraveling questions such
as whether men interrupt women more often than women interrupt men, whether men dominate
topics of conversation, whether women are hypercorrect, and whether women
use more standard language than men. Tannen (1991), in this regard, argued that men
have a report style, aiming to communicate factual information, whereas women have 
a rapport style, more concerned with building and maintaining relationships. 
Labov’s early work in the 1960s signalled the beginning of gender-specific
variation studies. His (1963, 1966) studies emphasized the relevance of sex/gender as a 
sociolinguistic variable. He expressed the notion that women of all classes and ages 
employ more standard linguistic variants than men. This is what Hudson (1996) refers 
to as the Sex/Prestige Pattern. Labov (1990) believes that women are more sensitive to 
the incoming prestige forms than men in language change from above, and that men, 
most often, lead in language change from below. In this regard, women have been 
found to prefer fully articulated forms to forms that show the effects of casual speech 
processes, in view of their adherence to correctness; phonetically explicit forms are 
considered more correct than reduced forms (Zue and Laferriere, 1979; Hannisdal, 
2006). Labov (1991), however, explains that women’s sensitivity to prestige forms is a
function of their influence and position in a given society. In societies where women 
are given little or no freedom of speech (e.g. Iran and parts of India), women use more 
non-standard forms than men.  
Trudgill (1972, 1983) also patterned his gender-specific variation study on
Labov’s (1966) work. He, however, went a step further to provide sociological causes of
the perceived gender-specific differences in language variation. Trudgill (1972) relates
these differences to the insecurity of women’s position in society and the need for them
to use language to secure and signal their social status. He also links this to the fact
that men’s worth is appraised by their work, while women’s assessment is based on
their appearance, which includes language.
Milroy and Milroy's (1980, 1981, 1987) network approach concentrates on the
internal variation within a group (the working class [WC]) rather than the language
community as a whole. This has established significant variation between men’s
and women’s speech within a particular social network. It also provides an opportunity for
associating particular linguistic patterns with specific peculiarities of the network
structures. Like the previous scholars, the Milroys establish the general tendency for
women to use more standard forms than men, and for men to use more non-standard
forms, and further provide explanations for this. According to Milroy (1981), the use of
non-standard forms or vernacular speech by men results from the more rigid group
pressure to which they are subjected, while women’s speech is determined by the
linguistic freedom tolerated by their local peer group.
However, some other studies have considered the idea of treating gender as an 
independent variable in measuring language use and behavior as inadequate. For 
instance, Eckert (1989) found that the relation between gender and linguistic variation 
has been inadequately established because men and women within the same society 
come to experience life, culture, and society differently. Gal (1995:173) observes that 
variationists have overlooked “the cultural constructions of language, gender and
power,” which influence men’s and women’s language behaviour. Cheshire (2002:428)
also points out that the “empirical basis” for the sociolinguistic variation of gender,
which puts women ahead of men in the use of standard language, has come to be
questioned in recent years. Eckert and McConnell-Ginet (2003) have shown a shift in
the perception of gender in recent gender studies. Instead of being simply viewed as an
inactive form of identity that people possess, gender is now treated as an active and
salient social category that can influence speakers’ language use and behaviour in a
variety of ways. 
 
3.2.1.3  Ethnicity 
Ethnicity is a concept that describes regional or geographical identification or 
groupings of people on the basis of common genealogy and ancestry. Ethnic groups
primarily share cultural and linguistic traits as well as a group history. To a great 
extent, then, language is an important marker of ethnicity. If one considers Milroy and 
Milroy's (1997) assertion that the same language varies from speaker to speaker or
from community to community, it will not be difficult to agree that the regional or ethnic
leanings of speakers contribute largely to variation in speech. This view is succinctly
expressed by Bailey and Robinson (1973):
 
Because the forces of standardization have not yet completely levelled 
the individuality resulting from genetic make-up and rearing, removed 
the human impulse to gather in manageably small groups, or erased the 
cultural differences that distinguish group from group or nation from 
nation, language must be as various as the groups who use it and the 
activities they engage in.  
 
Studies (e.g. Spencer, 1971; Tiffen, 1974; Platt et al., 1984; Akere, 1978) have 
particularly shown mother tongue influence as one of the major features of non-native 
Englishes. This is understandable considering the fact that English coexists with 
various indigenous languages and different ethnic groups who use them in the non-
native setting. Therefore, the English language as used in these different ethnic 
communities is bound to exhibit influences or interference features from the ethnic 
languages of its users. As a matter of fact, it has often been said that there are as many 
geographical varieties of English in Nigeria as there are local languages spoken 
(Banjo, 1979; Adegbija, 1988; Bamgbose, 2004). Strevens (1965:113), for example, 
opines that “One would expect a description of the pronunciations of English which 
may be heard in West Africa to bear a close relationship to description of the phonetic 
characteristics of the language spoken as a mother tongue by various groups of 
people”. In the light of this, speakers from the same ethnic group have been shown to
demonstrate homogeneous features in their pronunciation of English words. Thus, the
spoken English of Yoruba, Igbo, Hausa/Fulani, Edo, Tiv and Ibibio speakers in Nigeria
tends to mirror the phonetic features of each of these indigenous languages. For
example, while Hausa speakers insert vowels to split consonant clusters, as in
[rezigineʃən] for [rezɪgneɪʃən], Yoruba speakers nasalise English vowels preceded by
nasals, e.g. [mɔ̃niŋ] for [mɔ:nɪŋ] (Bamgbose, 1971; Simo Bobda, 1994).
Beyond ethnicity, however, language has been proved to vary according to 
region, nation and wider geographical areas. In this regard, Jibril (1979:43) asserts that 
“Members of several ethnic groups residing in adjacent parts of the country share 
many characteristics in their spoken English with one another”. As far as Mbassi-
Manga (1973) is concerned, there exist as many varieties of English as there are countries
which use it as a second language. There abound in the literature features of non-native
English which cut across national boundaries. Simo Bobda (1995) specifically
shows such common traits between Cameroon English and Nigerian English. It is
against this background that we can talk of Yoruba, Igbo, Hausa/Fulani varieties of
English, Southern or Northern Nigerian English (as described by Jibril (1982)), Nigerian
English, Ghanaian English, West African English, East African English, African
English, South Asian English, etc.
 In view of this, this study examines variation in the connected speech processes of
Nigerian speakers of English in terms of their regional groupings and contiguity:
North (comprising Hausa and a few other Northern language groups), West
(comprising the Yoruba ethnic group), East (comprising the Igbo ethnic group) and South-
South (comprising Edo, Ibibio, Urhobo and a few other language groups in the South-
South region).
 
 
 
CHAPTER 4 
 
PILOT STUDY 
 
4.0 Introduction 
This chapter reports the findings of a pilot study conducted to investigate 
Standard British English connected speech processes (CSPs) in Nigerian English, 
using speakers of Educated Yoruba English (EYE) (one of the sub-varieties of 
Nigerian English selected for the larger study). The Yoruba are one of the major
ethnic nationalities found in the west of the country. The pilot study was necessitated
by the need to validate the research instruments to be used for the main study and
to ascertain whether the phenomenon is researchable.
One hundred and twenty EYE speakers born, raised and educated in
Yorubaland, with a minimum of two to three years of post-secondary education, served as
participants. They comprised 60 males and 60 females aged between 18 and 65. For the
purpose of data gathering and analysis, participants were further grouped into four
social categories, namely Young Male, Young Female, Adult Male and Adult Female.
Each category comprised 30 participants. The data for the study comprised 22
utterance items and a short dialogue (containing assimilation, elision and liaison sites),
which the participants were guided to produce into digital recording devices. The
perceptually transcribed data were analysed statistically, using percentages and
Student's t-test.
 
4.1 Statistical analysis 
Voicing assimilation and yod coalescence were examined under assimilation; /t/
deletion, at both morpheme and word boundaries, was investigated under elision; and
r-liaison was tested under liaison. All processes found in the data were subjected to
statistical analysis. In each boundary context, there were different variants of
pronunciation; each appropriate (SBE) variant was allotted one (1) mark, while zero
was recorded for each inappropriate (non-SBE) variant. The total marks
for all participants in each variant were converted to percentages, the higher (or 
highest) percentage taken as the norm.  
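
The scoring scheme described above can be illustrated with a minimal Python sketch. It is not part of the original analysis: the function name variant_percentages is hypothetical, and the token counts used are those reported for Context 2 in Table 4.1.

```python
def variant_percentages(counts):
    """Convert token counts per variant to percentages and pick the norm.

    counts maps a variant label to the number of tokens (marks) recorded
    for it. The norm is the variant with the highest count, following the
    rule that the higher (or highest) percentage is taken as the norm.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tokens recorded")
    percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}
    norm = max(counts, key=counts.get)
    return percentages, norm


# Token counts for progressive voicing (PV) and its absence (N/PV),
# as reported for Context 2 in Table 4.1 (120 expected realisations).
counts = {"PV": 34, "N/PV": 86}
percentages, norm = variant_percentages(counts)
print(percentages, norm)
```

In this sketch the non-SBE variant emerges as the norm for Context 2 because it accounts for the larger share of the 120 tokens.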
In order to test for significant differences between the social categories in their
application of Standard British English CSPs, their scores were subjected to Student's
t-test, an inferential statistical test used to uncover the effect of a categorical
independent variable (e.g. age) on a quantitative dependent variable (e.g. elision). It
determines whether there is a statistically significant difference between the means of
two unrelated groups, based on the hypotheses formulated. The null hypothesis is
that the population means of the two unrelated groups are equal (H0: μ1 = μ2), while
the alternative hypothesis is that the population means are not equal (HA: μ1 ≠ μ2). A
p-value below 0.05 indicates significantly different group means, while a p-value above
0.05 indicates that the difference is not significant.
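
As an illustration of the test statistic only, the following Python sketch computes t and its degrees of freedom for two unrelated groups. It is a sketch under stated assumptions rather than the procedure actually run for this study: the function name two_sample_t and the two lists of scores are hypothetical, and the p-value is not computed here, since that requires the t distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b, equal_var=True):
    """Return (t, df) for two independent samples.

    equal_var=True gives the pooled-variance Student's t-test;
    equal_var=False gives the Welch variant for unequal variances.
    """
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)  # sample variances (n - 1 denominator)
    if equal_var:
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        q1, q2 = v1 / n1, v2 / n2
        se = sqrt(q1 + q2)
        # Welch-Satterthwaite approximation of the degrees of freedom
        df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df


# Hypothetical per-speaker scores for two groups (illustration only,
# not data from this study).
group_one = [8, 9, 7, 10, 8, 9]
group_two = [7, 8, 6, 7, 9, 7]
print(two_sample_t(group_one, group_two))
print(two_sample_t(group_one, group_two, equal_var=False))
```

With equal_var=True the degrees of freedom are n1 + n2 - 2 (118 for two groups of 60 speakers); with equal_var=False they are generally fractional. The null hypothesis is rejected at the 0.05 level when the p-value associated with t and df falls below 0.05.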
 
4.1.1 Voicing assimilation 
Voicing assimilation is a process whereby contiguous consonants tend to be 
either all voiced or all voiceless depending on the state of the glottis. This section 
examined the types of voicing assimilation observed in the speech of EYE speakers in 
three different contexts and compared their performance with what obtains in SBE. 
Altogether, nine items were extracted from the data (five in context 1, one in context 2
and three in context 3) as follows:
1. A word-final voiced obstruent followed by a word-initial voiceless obstruent at 
word boundary, e.g. chose six, have to, live show, of course and we’ve planned. 
2. The reduced form of the third person singular form of be, e.g. dog’s mine.  
   
3. A word-initial voiced obstruent preceded by a word-final voiceless obstruent at 
word boundary, e.g. black dress, nice boy, ice blue.  
 
Table 4.1 reveals the frequency and percentage scores for voicing assimilation 
processes in the three boundary contexts identified above. 600 realisations were 
expected in the first context, 120 in context 2 and 360 in context 3.  
 
 
 
 
Table 4.1  Frequency and percentage scores for voicing assimilation 
 
Process                  Context                  Variant   Tokens   % Score
Regressive Devoicing     1. e.g. [hæv̥ tǝ]         RD        600      100
                                                  N/RD      0        0
                                                  Total     600      100
Progressive Voicing      2. e.g. [dɒgz]           PV        34       28.3
                                                  N/PV      86       71.7
                                                  Total     120      100
Progressive Devoicing    3. e.g. [naɪs b̥ɔɪ]       PD        255      70.8
                                                  N/PD      105      29.2
                                                  Total     360      100
 
Key: RD- Regressive Devoicing; PV- Progressive Voicing; PD- Progressive Devoicing; N/RD- Non-
Regressive Devoicing; N/PV- Non-Progressive Voicing; N/PD- Non-Progressive Devoicing 
 
Table 4.1 above shows that in Context 1 (where a voiced segment precedes a
voiceless one at word boundary, e.g. I have to go), regressive devoicing (RD) was
observed in all 600 instances, e.g. [hav̥ tu]. This suggests that EYE speakers
conformed to what obtains in SBE, where regressive devoicing is usually observed in
such a context (Roach, 2000; Katalin and Szilárd, 2006). In Context 2 (which relates to
the reduced form of the third person singular of the verb be, e.g. the dog’s mine), only 34
(28.3%) cases of progressive voicing (PV) assimilation [dɔgz] were recorded,
compared to 86 (71.7%) of devoicing [dɔgz̥]. This indicates that most EYE
speakers failed to employ progressive voicing assimilation, a trend which shows
marked deviation from Standard British English. In Context 3 (where a voiced
segment is preceded by a voiceless one at word boundary, e.g. ice blue), EYE speakers
approximated to SBE with 255 (70.8%) cases of progressive devoicing, as in [ais b̥lu:].
The remaining 105 (29.2%) tokens showed absence of progressive devoicing.
  
4.1.2 Yod coalescence 
Yod coalescence is a sub-category of place assimilation whereby the alveolar
sounds /s, z, t, d/ coalesce or fuse with a following palatal /j/, either within a word or
across word boundary, to become the palato-alveolars /ʃ, ʒ, ʧ, ʤ/ respectively, as in issue
/ɪsju:/ becoming /ɪʃu:/ and miss you /mɪs ju:/ becoming /mɪʃu:/. The present study
examines yod coalescence across word boundary; that is, in connected speech. Twelve
items from the data, three for each context, were used to verify the disposition of EYE 
speakers to this SBE cross-word process, e.g. 
1. /s+j/: /s/ is followed by the palatal glide /j/ at word boundary, e.g. miss your, 
in case you, and God bless you.  
2. /z+j/: /z/ is followed by the palatal glide /j/ at word boundary, e.g. has your, 
those young, and amaze you. 
3. /t+j/: /t/ is followed by the palatal glide /j/ at word boundary, e.g. cost you, 
what you, and that you. 
4. /d+j/: /d/ is followed by the palatal glide /j/ at word boundary, e.g. would 
you?, do you think? and could you? 
 
Table 4.2 shows the frequencies and percentages for yod coalescence in the four cross-
word boundary contexts above: /s+j/, /z+j/, /t+j/ and /d+j/. Each of the contexts has two 
different realisations, representing uncoalesced (/sj/, /zj/, /tj/ and /dj/) and coalesced 
(/ʃ/, /ʒ/, /ʧ/, and /ʤ/) forms respectively. In each context, 360 realisations were 
expected, making 1,440 tokens altogether. 
 
Table 4.2 Frequency and percentage scores for yod coalescence. 
 
Process: Yod coalescence

Context                    Variant   Tokens   % Score
1. /sj/→/ʃ/ [mɪʃɔ:]        YC        15       4.2
                           YR        345      95.8
                           Total     360      100
2. /zj/→/ʒ/ [ǝmeɪʒʊ]       YC        15       4.2
                           YR        345      95.8
                           Total     360      100
3. /tj/→/ʧ/ [wɒʧʊ]         YC        11       3.1
                           YR        349      96.9
                           Total     360      100
4. /dj/→/ʤ/ [kʊʤʊ]         YC        30       8.3
                           YR        330      91.7
                           Total     360      100
Grand Total                YC        71       4.9
                           YR        1369     95.1
                           Total     1440     100

Key: YC- Yod coalescence; YR- Yod retention
 
In /s+j/ context, only 15 instances (4.2%) of the coalesced variant /ʃ/ were
recorded, e.g. [miʃɔ], whereas there were 345 (95.8%) realisations of yod retention
[mis jɔ]. In /z+j/ context, EYE speakers, again, articulated just 15 (4.2%) cases of
coalesced /ʒ/, e.g. [ameʒu], while yod was retained in 345 (95.8%) instances [amez ju].
The third context, /t+j/, shows 11 (3.1%) instances of yod coalescence /ʧ/, as in
[wɔʧu], and 349 (96.9%) cases of yod retention [wɔt ju]. The realisations of yod in
/d+j/ environment reveal 30 (8.3%) cases of coalesced /ʤ/ [diʤu] and 330 (91.7%)
instances of the uncoalesced variant [did ju]. Overall, only 71 tokens, representing 4.9%,
of yod coalescence were articulated by EYE speakers, while yod was retained in 1,369
cases, constituting 95.1%. This suggests that EYE speakers deviated markedly
from SBE as far as yod coalescence is concerned.
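
The token arithmetic behind Table 4.2 can be retraced with a short Python sketch. It is offered only as a check on the reported figures and is not part of the original analysis; the variable names are hypothetical, while the counts are those given in the table.

```python
SPEAKERS = 120          # participants in the pilot study
ITEMS_PER_CONTEXT = 3   # utterance items per boundary context

# Coalesced (YC) token counts per context, as reported in Table 4.2.
coalesced = {"/s+j/": 15, "/z+j/": 15, "/t+j/": 11, "/d+j/": 30}

# Expected realisations: every speaker produces every item once.
expected_per_context = ITEMS_PER_CONTEXT * SPEAKERS
expected_total = expected_per_context * len(coalesced)

# Percentage of coalesced tokens in each context and overall.
rates = {ctx: round(100 * n / expected_per_context, 1) for ctx, n in coalesced.items()}
overall = round(100 * sum(coalesced.values()) / expected_total, 1)

# Context in which coalescence is most frequent.
top_context = max(rates, key=rates.get)

print(expected_per_context, expected_total)
print(rates, overall, top_context)
```

Because every context has the same number of expected tokens, the overall rate here equals the mean of the four context rates before rounding.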
 
4.1.2.1  The contextual/boundary distribution of yod coalescence
As earlier noted, there are four boundary contexts with potential yod 
coalescence, namely: /s+j/, /z+j/, /t+j/ and /d+j/. This section, therefore, compares the 
scores for all coalesced variants, i.e. / ʃ, ʒ, ʧ, ʤ/ with a view to finding out the 
boundary environment where yod coalescence is the most pronounced in the speech of 
EYE speakers. Table 4.3 shows the overall percentage scores for all participants in 
each variant. 
 
Table 4.3 Percentage scores for coalesced /ʃ, ʒ, ʧ, ʤ/ variants. 
 
Variants   %
/ʃ/        4.2
/ʒ/        4.2
/ʧ/        3.1
/ʤ/        8.3
 
Table 4.3 shows that the variant /ʤ/ has the highest percentage score (8.3), about
twice that of any other variant. This implies that the participants’ use of yod
coalescence is most evident in the environment of /d+j/, e.g. could you? [kʊʤu],
would you? [wʊʤu], do you think? [ʤʊ] (this is also reflected in Figure 4.1).
 
[Figure: bar chart of % scores for the coalesced variants: /ʃ/ 4.17, /ʒ/ 4.17, /ʧ/ 3.06, /ʤ/ 8.33]

Fig. 4.1 Percentage chart for coalesced /ʃ, ʒ, ʧ, ʤ/ variants.
 
4.1.3 Elision 
In rapid casual speech, sounds that ordinarily are enunciated in isolated words 
or slow, careful speech get elided for euphonic effect. Specifically, when there is a 
cluster of two or more consonants word-internally or across word boundaries, some of 
the consonants usually get elided, e.g. han(d)kerchief, Chris(t)mas, nex(t) day and 
fin(d) me. This has also been described as a process of cluster simplification. We 
examine here, using ten utterance items extracted from the data (seven in context 1 and 
three in context 2), the extent to which Educated Yoruba English speakers approximate 
to SBE in consonant elision at word boundary, e.g. 
1. Word-final /t/ before another consonant at word boundary, e.g. doesn’t she, 
won’t do it, kept quiet, exact colour, test drive, don’t buy it, you mustn’t do it.
2.  Morpheme-final /t/ before another consonant at word boundary, e.g. jumped 
well, equipped with, fixed price. 
 
 Table 4.4 reveals the frequency and percentage scores for elision in these two 
contexts. Each is composed of two variants: Ø and /-t/ which represent elision and 
non-elision respectively. 840 responses were expected from context 1 and 360 from 
context 2. In all, there are 1200 realisations.  
 
Table 4.4 Frequency and percentage scores for elision. 
 
Process             Context             Variant   Tokens   % Score
t-deletion (W/F)    1. exact colour     Ø         398      47.4
                                        /-t/      442      52.6
                                        Total     840      100
t-deletion (M/F)    2. a fixed price    Ø         143      39.7
                                        /-t/      217      60.3
                                        Total     360      100
Grand Total                             Ø         541      45.1
                                        /-t/      659      54.9
                                        Total     1200     100
  
Key: W/F: Word Final; M/F: Morpheme Final 
 
Table 4.4 above illustrates the patterns of cluster simplification by consonant 
elision in two different contexts in the connected speech of Educated Yoruba English 
speakers. Three hundred and ninety-eight (398) cases of elision, representing 47.4%,
were observed in context 1 (where word-final /t/ is followed by a word-initial
consonant), e.g. [egzɑ˺ kɔlɔ], against 442 (52.6%) occurrences of the non-elision
variant. This shows that /t/ was retained in slightly more than half of the tokens in this
position and elided in slightly fewer than half.
In the second context (where morpheme-final /t/ is followed by a word-initial
consonant), there were 143 instances of elision, e.g. [fi˺ s˺ prais], translating to 39.7%.
On the other hand, 217 (60.3%) realisations of non-elision were observed. This
shows that there were even fewer cases of elision in this context. Overall,
many educated Yoruba speakers failed to elide /t/ in both contexts, which implies that
EYE speakers deviated from what obtains in SBE. This is at variance with
Simo Bobda’s (2007:417) observation that “simplification of word final consonant
clusters... is a principal feature of African English accents”, and Jibril’s (1982) view
that consonant deletion is common in Nigerian English in fast speech or in a bid to
reduce consonant clusters. However, the performance may be explained on the grounds
that the data for this study were elicited from the participants in a way that only
resembled casual speech.
 But comparing both elision contexts, we cannot but agree with Simo Bobda
(2007:418) that “a final alveolar stop preceded by a morpheme boundary is more
resistant to deletion than one which is not”. This is because there were fewer
cases of deletion in context 2 than in context 1 (39.7% against 47.4%).
 
4.1.4 Liaison 
Liaison, according to Crystal (2003:269), is a “transition between sounds, 
where a sound is introduced at the end of a word if the following syllable has no 
onset”. Typical of this process are linking and intrusive /r/. This study examines
linking /r/.
Table 4.5 shows the frequency and percentage scores for linking /r/ across
word boundary in the following utterance items extracted from the data: Peter at, more
of him, after a while, their action, inquire about, colour of, for all, there are, over eat,
power-assisted steering. There are two variants: /r/ and Ø, representing linking /r/ and
/r/ suppression respectively. Altogether, 1200 realisations were expected.
 
 
 
 
Table 4.5 Frequency and percentage scores for linking /r/. 
 
Process: Linking /r/
Context: e.g. more of [mɔ:r əv]

Variant          Tokens   % Score
r-liaison        89       7.4
r-suppression    1111     92.6
Total            1200     100
 
 
Table 4.5 above reveals the incidence of linking /r/ in the connected speech of EYE
speakers. Of the 1200 anticipated cases of linking /r/, only 89 instances, representing
7.4%, were recorded. On the other hand, there were 1111 instances (92.6%) of
/r/ suppression. It is obvious from the results that most EYE speakers did not make use
of linking /r/, which implies that this process is not a regular connected speech feature
of the variety. One factor that probably accounted for the suppression of /r/ is the low
level of awareness of this feature in Nigeria; it is not often heard
except, sometimes, during newscasts. Besides, many EYE speakers who are
possibly aware of it tend to avoid using it because it sounds foreign and
affected.
 
 
4.1.4.1 Linguistic correlates of linking /r/ 
The linguistic factors that constrained the use of linking /r/ were investigated. It 
was discovered that linking /r/ occurred more frequently between short grammatical 
words, e.g. there are, more of, after a while, etc. and rarely between lexical words like 
over eat, power assisted; or a combination of lexical and grammatical words, e.g. Peter 
at, inquire about, their action and colour of (see Table 4.6). 
 
 
 
 
 
 
 
Table 4.6     Linking /r/ according to the grammatical category of the surrounding  
                    words 
 
Process: Linking /r/

Word category        Tokens of occurrence   % Score
Lexical words        4                      4.5
Grammatical words    85                     95.5
Total                89                     100
 
 
As shown in Table 4.6 above, only 4 instances, representing 4.5% of the
realised linking /r/ variant, occurred when one or both of the adjacent words were lexical
words, while 85 (95.5%) were recorded when /r/ appeared between two
grammatical (function) words. This implies that the few instances of linking /r/
found in EYE occurred predominantly between grammatical items.
 
4.1.5 Summary of performance 
This section presents the overall performance of EYE speakers in all the
processes examined, with a view to determining their proximity to SBE. Table
4.7 summarises the EYE speakers’ performance in all the processes investigated.
Since the processes contained different numbers of items, the
total score for each was converted to a percentage. All the calculated percentages were
then summed up and used to arrive at the overall percentage.
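
The aggregation just described can be sketched in Python as follows. This is an illustrative reconstruction, not the original computation: the function name overall_approximation is hypothetical, and the percentage scores are those reported for the individual processes in Tables 4.1, 4.2, 4.4 and 4.5.

```python
# Percentage of SBE variants realised in each process
# (figures taken from Tables 4.1, 4.2, 4.4 and 4.5).
process_scores = {
    "Regressive devoicing": 100.0,
    "Progressive voicing": 28.3,
    "Progressive devoicing": 70.8,
    "Yod coalescence": 4.9,
    "/t/-deletion (word boundary)": 47.4,
    "/t/-deletion (morpheme boundary)": 39.7,
    "Linking /r/": 7.4,
}


def overall_approximation(scores):
    """Sum the per-process percentages and express the sum as a share of
    the maximum obtainable (100 per process)."""
    total = sum(scores.values())
    maximum = 100.0 * len(scores)
    approximation = round(100 * total / maximum, 1)
    deviation = round(100 - approximation, 1)
    return round(total, 1), maximum, approximation, deviation


print(overall_approximation(process_scores))
```

Summing percentages in this way gives each process equal weight, irrespective of how many tokens it contributed to the data.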
 
 
 
 
 
 
 
 
 
 
 
 
 
Table 4.7 Summary of CSPs of SBE in EYE data 
 
Process        Variant                               EYE (%)
Assimilation   Regressive Devoicing                  100
               Progressive Voicing                   28.3
               Progressive Devoicing                 70.8
               Yod Coalescence                       4.9
Elision        /t/-deletion (word boundary)          47.4
               /t/-deletion (morpheme boundary)      39.7
Liaison        Linking /r/                           7.4
Grand Total (% tokens of occurrence)                 298.5
Grand Total (% tokens expected)                      700
Overall % (Approximation)                            42.6
Overall % (Deviation)                                57.4


Fig. 4.2 Pie chart showing percentage summary of CSPs of SBE in EYE data.

Table 4.7 (corroborated by Fig. 4.2) above shows that, with an overall score of
42.6% approximation to and 57.4% deviation from SBE connected speech processes,
EYE speakers deviated considerably from SBE. This finding points to a marked
difference between SBE and NE.

4.1.6  Sociophonetic variation of connected speech processes
This section examines the social differentiation of these processes in EYE
under the broad categories of assimilation, elision and liaison (voicing assimilation and
yod coalescence were collapsed under assimilation). This was done in relation to the
variables of gender (male and female) and age (young and adult), using inferential
statistics (Student's t-test). The following research questions were addressed:
(a) Is there a significant difference between male and female speakers' articulation 
of assimilation, elision and liaison processes of SBE connected speech? 
(b) Is there a significant difference between young and adult speakers' articulation 
of assimilation, elision and liaison processes of SBE connected speech? 
The differences between the mean scores were determined at the significance level of
0.05.
 
4.1.6.1  T-test analysis for gender 
 In response to research question (a), the effects of gender on assimilation, elision
and liaison processes were examined. Table 4.8 below shows the mean scores for each
speaker group in each process, while Table 4.9 reveals their significance levels.
 
Table 4.8 Gender mean scores for assimilation, elision and liaison. 
 
 
 The performance of male and female speakers in assimilation and liaison
showed little or no difference between the two speaker groups. They both had the
same mean score (8.0) in assimilation, while females scored slightly higher (0.75) than
males (0.73) in liaison. In elision, male speakers recorded a higher mean score (5.01)
than female speakers (4.0). Not surprisingly, the t-test results (Table 4.9) established no
significant difference between the genders in assimilation (t(118) = 0.000; p = 1.000) and
liaison (t(118) = -.115; p = 0.909), but showed that males' mean score in elision was
significantly higher than females’ (t(112.723) = 6.142; p = 0.028).
 
 
 
 
 
 
Table 4.9 Results of T-test analysis for gender 
 
 
 
 The above finding suggests gender variation in elision in the connected speech 
of EYE speakers; male participants elided significantly more than females. This 
variation may be ascribed to speech casualness and sloppiness of the male folk 
compared to the females' thorough, careful and formal speech as established in the 
literature. 
 
4.1.6.2  T-test analysis for age 
 The effects of age on assimilation, elision and liaison processes were examined
with a view to answering research question (b). The mean scores for each speaker group are
displayed in Table 4.10 below.
 
Table 4.10 Age mean scores for assimilation, elision and liaison
 
 
  
Table 4.10 shows that young speakers (with a mean score of 8.50) assimilated more
than adult speakers (7.50), but had lower mean scores in elision (4.27) and liaison (0.55)
compared to the adults' mean scores of 4.75 and 0.93 in elision and liaison
respectively. The t-test results (Table 4.11) established significant differences between
these sets of mean scores. Young speakers scored significantly higher than adult
speakers in assimilation (t(114.275) = 4.958; p = 0.000), while adult speakers' mean
scores were significantly higher than young speakers' in elision (t(105.568) = -2.614; p = 
0.010) and liaison (t(101.075) = -2.715; p < 0.008). The results are summarized in Table 
4.11. 
 
Table 4.11 Results of T-test analysis for age 
  
 These results, again, establish variation in the connected speech of young and 
adult EYE speakers. Assimilation seems to be more prevalent among the young 
speakers, while the adults, surprisingly, elided more than the young. The significant 
score difference between the age groups in liaison seems to suggest a diachronic shift 
in the awareness and use of r-liaison in EYE, which may mean that linking /r/ is being 
erased from the accent of the young due to absence of awareness.    
 
4.2  Summary, conclusion and further studies 
 This pilot study set out to investigate certain features of Standard British English
connected speech (voicing assimilation, yod coalescence, elision and liaison) in 
Educated Yoruba English from a sociophonetic perspective, as a prelude to a larger 
study on Nigerian English. The findings revealed the extent of approximation of EYE 
speakers to the Standard British English connected speech processes, as well as the 
social differentiation of these features according to gender and age.  
 Overall, the occurrence of the CSPs of Standard British English in EYE speech
revealed 42.6% approximation and 57.4% deviation. First, it was observed that EYE
speakers approximated to SBE in only a few connected speech processes, while they
deviated in varying degrees in others. For instance, they approximated to the SBE
norms in regressive devoicing, which occurred in 100% of cases in the environment of a
voiced segment preceding a voiceless one at word boundary (e.g. I have to go [hav̥ tu]),
and in progressive devoicing, found at the boundary environment where a voiceless
segment precedes a voiced one (e.g. nice boy [nais b̥ɔi]), where EYE participants
scored 70.8%.
On the other hand, they deviated from Standard British English CSPs in
progressive voicing, yod coalescence, elision and liaison. Progressive voicing, found in
the reduced form of the third person singular form of the verb be (e.g. The dog’s mine
[dɔgz main]), had a very low occurrence (28.3%); most participants realized [z] as [s]
or [z̥] in this position. Incidence of yod coalescence was very low in all cross-
word boundary environments where yod should have coalesced with /s, z, t, d/ (e.g.
miss your [mɪʃɔ]); EYE speakers employed this feature in just 4.9% of cases. In the same
vein, at both boundary contexts where elision was tested, EYE participants
elided /t/ in fewer than half of the cases, scoring 47.4% in the environment where
word-final /t/ is followed by a word-initial consonant and 39.7% where morpheme-final
/t/ is followed by a word-initial consonant. With an overall score of 45.1%, EYE speakers
deviated from what obtains in SBE. Lastly, linking /r/ was barely attested: EYE speakers
recorded only 7.4% of the expected tokens. The few cases observed occurred between
grammatical items like there are, more of you and after a while. It is obvious from
the results that this process is not a regular connected speech feature of EYE.
 In terms of social variation, gender variation was attested in elision, with male
participants eliding significantly more than females. This, possibly, is a corollary of
male speakers' casualness and sloppiness in speech compared to the females' thorough,
careful and formal speech, as advanced in sociolinguistic research (e.g.
Labov, 1963, 1966; Hudson, 1996). Significant differences were also found between
young and adult EYE speakers. Assimilation seems to be more prevalent among the
young speakers, while the adults, surprisingly, elided more than the young. Use of
linking /r/ was more common among adult speakers than among young speakers, which seems to
suggest a diachronic shift in the awareness of r-liaison in EYE; lexicalised linking /r/ is
probably being erased from the accent of the young due to lack of awareness.
In conclusion, this pilot study has served its purpose in that it has revealed
EYE speakers' proximity to Standard British English connected speech and the
social variation of the processes. The findings demonstrated speakers’
low level of competence in the use of SBE connected speech processes and their
respective variation. Besides, the pilot study has been used to validate the research
instruments and, thereby, confirm the possibility of expanding the scope of the 
research. In view of this, we shall investigate more processes of Standard British 
English connected speech amongst speakers of English from four regions of Nigeria 
(East, North, South-South and West), in order to ascertain their occurrence in Nigerian 
English and determine the proximity of NE to SBE connected speech. Apart from 
using statistical tools, we shall conduct acoustic analysis on portions of the data to be 
collected in order to corroborate findings. 
 
 
 
 
 
 
 
 
 
 
CHAPTER 5 
 
DATA ANALYSIS, FINDINGS AND DISCUSSION 
 
5.0 Introduction  
 
The data used for this study were semi-spontaneous speeches, comprising 
thirty-one utterance items and a short passage which contained various CSP sites (see
Appendix B). The data were produced into digital recording devices by 360 Nigerian 
speakers of English. The participants, who ranged between ages 18-65, were 180 males 
and 180 females with a minimum of 2-3 years post-secondary education. They were 
drawn, through stratified and purposive techniques, from four regions in Nigeria: 
North (120 participants), West (80 participants), East (80 participants) and South-
South (80 participants) (see Appendix A). For the purpose of data gathering and
variational analyses, participants were sub-divided into four social categories: Young 
Male, Adult Male, Young Female and Adult Female. Each category comprised 90 
participants (30 from the North, 20 from the West, 20 from the East and 20 from the 
South-South region), making three hundred and sixty (360) participants altogether. 
Two educated native speakers, who served as control, also produced the same 
utterances.  
Two major levels of analyses were adopted in the work. First, the recordings 
were played back and instances of assimilation, elision and liaison features identified 
at different boundary contexts in the data were transcribed perceptually and analysed 
statistically, using Percentages, Multivariate Analysis of Variance (MANOVA) and 
Bonferroni's Post-hoc Test. The findings were subjected to Standard English 
phonological rules, as provided in Generative Phonology, to ascertain Nigerian English 
speakers' application of or deviation from the rules. Second, portions of the semi-
spontaneous speech data produced by eight Nigerian participants (representing the four 
regions and the social categories) were analysed acoustically with a view to 
corroborating the findings obtained through statistical analysis. The same two levels
were also used to analyse the control’s production of the data.
 
5.1 Statistical analysis  
Specifically, assimilation, elision and liaison processes which, according to 
Cruttenden (2001), are the most common features of Standard British English (SBE) 
connected speech were investigated in the data at different boundary contexts. Under 
assimilation, variants of assimilation of voice and assimilation of place were 
investigated; boundary consonant elision strategies were considered under elision; 
while linking-r and intrusive-r were the subject of inquiry in liaison.  
In each boundary context, there were different variants of pronunciation; an
appropriate (SBE) variant for each context was allotted one (1) mark, while zero was
recorded for each non-SBE variant. The total scores for all participants in each
variant were converted to percentages, and the variant with the higher percentage was
taken as the norm. In order to test for significant differences between the social
categories in their production of Standard British English CSPs, their scores were
subjected to Multivariate Analysis of Variance (MANOVA) and Bonferroni's Post-hoc
Test, using the IBM SPSS Statistics 20 package.
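
The scoring scheme described above can be illustrated with a short worked example. The Python sketch below is illustrative only: the function name `percentage_scores` and the list-of-marks representation are assumptions introduced here and are not part of the SPSS workflow used in the study. It applies the one-mark/zero-mark scheme to the regressive devoicing counts reported in Table 5.1 (2,143 SBE variants out of 2,160 tokens).

```python
def percentage_scores(marks):
    """Summarise 0/1 marks (1 = SBE variant, 0 = non-SBE variant) for one context."""
    total = len(marks)
    sbe_tokens = sum(marks)
    non_sbe_tokens = total - sbe_tokens
    return {
        "total": total,
        "sbe_tokens": sbe_tokens,
        "non_sbe_tokens": non_sbe_tokens,
        "sbe_pct": round(100 * sbe_tokens / total, 1),
        "non_sbe_pct": round(100 * non_sbe_tokens / total, 1),
    }


# Hypothetical mark sheet rebuilt from the counts in Table 5.1 (context 1)
marks = [1] * 2143 + [0] * 17
scores = percentage_scores(marks)
print(scores["sbe_pct"], scores["non_sbe_pct"])  # 99.2 0.8
```

The variant with the higher percentage (here the SBE variant) would then be read as the norm for that context.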
 
5.1.1 Assimilation in NE 
The subject of assimilation was investigated from the perspective of voice and 
place assimilation. This is because, unlike assimilation of manner, they are more 
prevalent amongst native speakers. Besides, it is easier to capture other categories of 
assimilation (e.g. regressive, progressive or coalescent) under this classification. 
 
5.1.1.1 Assimilation of voice  
Assimilation of voice is a process whereby, in SBE, contiguous consonants 
tend to be either all voiced or all voiceless depending on the state of the glottis. This 
section examined the occurrence of this assimilation process in the NE data at three 
different boundary contexts and compared the performance with what obtains in 
Standard British English (as represented by the control). Altogether, thirteen (13) items 
were extracted from the data (six items in context 1, three in context 2 and four in 
context 3) as follows: 
1. A word-final voiced obstruent followed by a word-initial voiceless obstruent at 
word boundary, e.g. chose six, have to, live show, of course, we’ve planned and 
five pounds. 
 
2. The reduced form of the third person singular of verb be preceded by a voiced
segment, e.g. she's, he’s, dog’s mine.
   
3. A word-initial voiced obstruent preceded by a word-final voiceless obstruent at 
word boundary, e.g. black dress, half-done, nice boy, ice blue.  
In each context, it was to be determined if assimilation took place, what type was
observed and the extent to which speakers approximated to SBE. Table 5.1 shows the
frequency and percentage scores of the participants’ and the control's productions at
these three boundary contexts. There were 2,160 total tokens in context 1; 1,080 in
context 2 and 1,440 in context 3. Altogether, 4,680 tokens were produced by the 360
participants.
 
 
 
 
 
 
 
 
 
 
 
Table 5.1 Frequency and percentage scores for assimilation of voice variants 
 
Context (example)           Process                 SBE (control)   NE: SBE variant     NE: non-SBE variant   NE total tokens
1. have to [hæv̥ tǝ]         Regressive Devoicing    12/12 (100%)    RD 2,143 (99.2%)    N/RD 17 (0.8%)        2,160 (6 x 360)
2. dog’s [dɒgz]             Progressive Voicing     6/6 (100%)      PV 229 (21.2%)      N/PV 851 (78.8%)      1,080 (3 x 360)
3. nice boy [naɪs b̥ɔɪ]      Progressive Devoicing   8/8 (100%)      PD 937 (65.1%)      N/PD 503 (34.9%)      1,440 (4 x 360)

Key: RD: Regressive Devoicing; PV: Progressive Voicing; PD: Progressive Devoicing;
N/RD: Non Regressive Devoicing; N/PV: Non Progressive Voicing; N/PD: Non Progressive Devoicing
 
Table 5.1, corroborated by Fig. 5.1 below, shows that in context 1 (in which a
word-final voiced obstruent is followed by a word-initial voiceless obstruent at word
boundary), regressive devoicing, e.g. [ʧǝʊz̥ sɪks, hæv̥ tǝ], which is the common and
acceptable feature in SBE connected speech, was overwhelmingly produced by the
control and the NE speakers (12 tokens, representing 100% for the control and 2,143
tokens, representing 99.2% for NE speakers). This suggests that participants
conformed significantly to the SBE regressive devoicing rule schematised as:

 [- sonorant] → [- voice] / ____ ## [- sonorant, - voice]

(the first obstruent takes on the voiceless feature as is found in the second obstruent)
which is succinctly expressed in the sample derivation of have to shown below:
                        SBE          NE
Input                   hæv ## tǝ    hav ## tu
Regressive Devoicing    hæv̥ tǝ       hav̥ tu
Output                  [hæv̥tǝ]      [hav̥tu]
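
The derivation above can be mimicked with a minimal Python sketch. It is an illustration under stated assumptions rather than the formalism used in the study: words are represented as lists of segments, the two obstruent sets are an illustrative subset, and the function name `regressive_devoicing` is hypothetical. Devoicing is marked with the IPA combining ring below.

```python
# Illustrative (not exhaustive) obstruent classes; assumptions for this sketch
VOICED_OBSTRUENTS = {"b", "d", "g", "v", "z", "ð", "ʒ", "ʤ"}
VOICELESS_OBSTRUENTS = {"p", "t", "k", "f", "s", "θ", "ʃ", "ʧ"}
RING = "\u0325"  # combining ring below, the IPA devoicing diacritic


def regressive_devoicing(word1, word2):
    """Devoice the final obstruent of word1 when word2 starts with a voiceless obstruent."""
    if word1[-1] in VOICED_OBSTRUENTS and word2[0] in VOICELESS_OBSTRUENTS:
        word1 = word1[:-1] + [word1[-1] + RING]
    return word1 + word2


sbe_output = regressive_devoicing(["h", "æ", "v"], ["t", "ǝ"])
ne_output = regressive_devoicing(["h", "a", "v"], ["t", "u"])
print("".join(sbe_output), "".join(ne_output))
```

Because the rule is conditioned only on the voicing of the two boundary segments, the SBE and NE inputs undergo the same change despite their different vowels.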
 
However, participants’ performance appeared to have been motivated by
phonological naturalness and mother tongue influence. In the first instance, this
assimilatory process fits the natural rule of assimilation which Hyman (1975:171)
says “can be attributed to either articulatory or acoustic assimilations or
simplifications” or what Abercrombie (1967:135) refers to as “economy of effort in the
utterance of a sequence of words” (ease of articulation). Such natural features are
phonetically motivated, common and usually attested in different languages. This is 
because speakers will, generally, opt for easier and more natural sounds (e.g. 
devoicing) in the course of speaking (Schane, 1973). This explains why it was possible 
for most participants to devoice the preceding voiced segment in anticipation of the 
following more natural voiceless sound in each instance examined.  
Considered from the mother tongue perspective, the process was easier for 
most NE speakers, perhaps, because the voiced fricative sounds /v/ and /z/ involved in 
this assimilatory process at word boundary are not available in the phonemic 
inventories of a number of Nigerian languages. For instance, while languages like 
Yoruba, Efik and Itsekiri lack /v/ and /z/, Hausa and Tiv do not have /z/ (Dunstan, 
1969). The possible implication of this, therefore, is that some speakers of these 
languages would have to substitute the sounds in question with their voiceless 
counterparts which are available in their languages. This is what James (1980) refers to 
as positive transfer. 
In context 2 (where the reduced form of the third person singular of verb be ‘is’
is preceded by a voiced segment), the analysis of the NE data revealed a marked
deviation from SBE. While the control articulated progressive
voicing (word final /s/ becoming voiced [z] after a voiced segment) in 100% of cases, e.g. [hɪz,
dɒgz] as in he’s a nice boy and the dog’s mine, the tokens of progressive voicing used
by NE speakers were few (229 instances out of 1,080 sites, constituting
21.2%). The low occurrence of progressive voicing in context 2 implies that Nigerian
English speakers deviated considerably from the Standard British English progressive
voicing assimilation rule. This may not be divorced from phonological naturalness and
mother tongue transfer (in view of the challenge the phoneme /z/ poses to speakers of
certain language groups in Nigeria) as earlier noted.
In context 3, where a word-initial voiced obstruent is preceded by a word-final
voiceless obstruent at word boundary, progressive devoicing, e.g. [haf d̥ͻn, nais b̥ɔi]
half done and nice boy, was substantial in the speech of NE speakers with 937
occurrences, translating to 65.1%. The same trend, though with a higher figure (100%),
was observed in the control’s production. This suggests that NE speakers closely
approximated to the SBE progressive devoicing rule formulated as:

 [- son] → [- voice] / [- son, - voice] ## ____

(a voiced obstruent is devoiced after a voiceless obstruent at word boundary) and
expressed in the sample derivation of nice boy as follows:
                         SBE             NE
Input                    naɪs ## bɔɪ     nais ## bɔi
Progressive Devoicing    naɪs ## b̥ɔɪ     nais ## b̥ɔi
Output                   [naɪsb̥ɔɪ]       [naisb̥ɔi]
Key: RD: Regressive Devoicing; PV: Progressive Voicing; PD: Progressive Devoicing;
SBE: Standard British English; NE: Nigerian English

Fig. 5.1  Percentage voicing assimilation score differences for SBE and NE speakers.
 
Besides the connected speech processes of SBE identified under the category 
above, NE speakers also employed peculiar CSPs which are not attested in SBE, 
especially where they could not articulate the SBE forms substantially. Table 5.2 
below details some of these processes. 
 
Table 5.2 Frequency and percentage scores for typical assimilatory processes in NE 
 
Processes              Final Devoicing   Regressive Voicing   Consonant substitution
Tokens of occurrence   851/1080          439/1440             18/1800
% Score                78.8              30.5                 1
 
According to Table 5.2, NE speakers predominantly produced final devoicing
in lieu of progressive voicing, scoring 851 tokens (78.8%) out of 1,080. The /z/ of He’s
was devoiced to [s] or [z̥], e.g. [his], while dog’s became [dͻgz̥] or, in certain
instances, [dͻks]. Final devoicing, a process whereby final obstruents are devoiced in
absolute and non-absolute word final position (Simo Bobda, 1994), has been reported
to be a typical feature of Nigerian and neighbouring West African Englishes (Tiffen,
1974; Bamgbose, 1982; Adetugbo, 1977; Jibril, 1982; Awonusi, 1987, 2004b; Simo
Bobda, 1994). Josiah (2009) opines, with regard to this process, that many educated
Nigerians realize word final /z/ as devoiced [z̥] or [s] except in contexts where /z/ is
found intervocalically. Laver (1968) claims categorically that there is absence of
progressive voicing assimilation in educated Nigerian English. Simo Bobda (2007)
also asserts, in this regard, that unlike RP, which has archiphoneme /Z/ for
morpheme {s} that may undergo devoicing by the voicing assimilation rule, NE has
archiphoneme /S/ which, on most occasions, remains unchanged at the surface level.
Final devoicing in NE, then, may be a product of what Aitchison (1981:32) 
referred to as "the general and inevitable weakness of articulation of sounds at the end 
of words", which is a function of naturalness in phonology by which speakers tend to 
employ features that require less articulatory effort and are attested in many languages 
(Hyman, 1975; Simo Bobda, 1994). Schane (1973:116) states, in this regard, that “a 
rule that makes obstruents voiceless in word final position is more normal than one 
voicing them in that environment.” This assimilation process (final devoicing rule) is 
captured as: 
 [- son] → [- voice] / [+ voice] # ____
 
(obstruents are devoiced in final position) and expressed in the sample derivation of 
dog’s as follows: 
   SBE   NE 
Input   dɒg#Z    dͻg#Z 
Progressive Voicing dɒg#z   ------------ 
Final Devoicing ----------  dͻgz̥ / dͻks 
Output   [dɒgz]   [dͻgz̥] / [dͻks] 
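
The contrast between the two derivations of dog’s can be sketched in Python as two competing treatments of the suffix. This is a simplified illustration: the segment-list representation, the `VOICELESS` set and the function names `sbe_suffix` and `ne_suffix` are assumptions made for the sketch, and only the [z̥] outcome of NE final devoicing is modelled (the [dͻks] variant is not).

```python
VOICELESS = {"p", "t", "k", "f", "s", "θ", "ʃ", "ʧ"}  # illustrative subset
RING = "\u0325"  # combining ring below, the IPA devoicing diacritic


def sbe_suffix(stem):
    """SBE: the suffix agrees in voicing with the preceding segment."""
    return stem + (["s"] if stem[-1] in VOICELESS else ["z"])


def ne_suffix(stem):
    """NE pattern described above: the suffix surfaces devoiced even after a voiced segment."""
    return stem + ["z" + RING]


sbe_form = sbe_suffix(["d", "ɒ", "g"])
ne_form = ne_suffix(["d", "ɔ", "g"])
print("".join(sbe_form), "".join(ne_form))
```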
 
Another CSP somewhat attested in the NE data, which was not articulated by
the control, is regressive voicing assimilation whereby ice blue and black dress were
articulated as [aiz blu] and [blag drɛs] respectively. There were 439 instances of this
process, representing 30.5%, in context 3 in lieu of the progressive devoicing of
SBE. Although Laver (1968) had claimed categorically that NE allows regressive
voicing assimilation, this finding does not support the prevalence of this feature in
Nigerian English, as only a minority of speakers produced it.
Apart from the two processes discussed above, there were 18 (1%) cases of
consonant substitution, e.g. [hap dͻn, hap tu, ͻp kͻs, wip pland, faif fauns] for half
done, have to, of course, we’ve planned and five pounds, produced by 12 participants
from the North (7 Hausa, 2 Fulani, 2 Jenjo and 1 Eggon speakers). This is a clear case
of mother tongue influence peculiar to participants from the northern part of the
country, where /p/ is substituted for /f/ and vice versa, evidently due to the influence
of Hausa, which is more or less a lingua franca in that region. It is on record that the
articulation of /p/ and /f/ poses difficulty to Hausa speakers who, according to Jowitt
(1991), frequently realize /p/ as [f] and /f/ as [p] since [p], [f] and [ɸ] are allophones of
/p/ or /f/ in Hausa.
 
5.1.1.2 Assimilation of place  
Assimilation of place is concerned with changes in the place of articulation of a 
segment (usually a consonant) at word boundary. In SBE, these changes are usually 
regressive (e.g. meat pie [mi:p pai])  or coalescent (e.g. what you [wɒʧʊ]). The concern 
of this section, therefore, is to verify the occurrence and direction of these two types of 
place assimilation in the connected speech of Nigerian English speakers, with a view 
to establishing the extent of their approximation to or deviation from SBE.  
The junctural sites where these assimilation types were found in the data
comprised word-final alveolar /t, d, n/ preceding word-initial bilabial or velar stop
consonants /b, p, k, g/; and word-final /s, z, t, d/ preceding word-initial palatal glide /j/.
Eleven such items extracted from the data were grouped into four contexts as follows:
1. The voiceless alveolar stop /t/ followed by a voiceless bilabial or velar stop /p, 
k/ at word boundary, e.g. met Peter and that case.  
2. The voiced alveolar stop /d/ followed by a voiced bilabial or velar stop /b, g/ at 
word boundary, e.g. good bye and good girl.  
3. The alveolar nasal /n/ followed by bilabial stops /b, p/ or velar stop /k/ at word
boundary, e.g. ten boys, ten pounds and in case.
4. /t, d, s and z/ followed by the palatal glide /j/ at word boundary, e.g. miss your,
those young men, what you want, could you.
 
The frequency and percentage scores for variants produced by the participants and the 
control group in these boundary contexts are presented in Table 5.3 below. Altogether, 
there were 3,960 tokens of occurrence: contexts 1 and 2 have 720 tokens each; context 
3: 1080; context 4: 1440.  
 
Table 5.3 Frequency and percentage scores for variants of place assimilation 
 
Context (example)           Process                          SBE (control)   NE: SBE variant      NE: non-SBE variant    NE total tokens
1. met Peter [mep pɪtǝ]     Voiceless Alveolar Stop Assm.    4/4 (100%)      VLASA 343 (47.6%)    N/VLASA 377 (52.4%)    720 (2 x 360)
2. good girl [gʊg gɜ:l]     Voiced Alveolar Stop Assm.       4/4 (100%)      VASA 23 (3.2%)       N/VASA 697 (96.8%)     720 (2 x 360)
3. ten boys [tem bͻɪz]      Nasal Assimilation               4/4 (100%)      NA 686 (63.5%)       N/NA 394 (36.5%)       1,080 (3 x 360)
4. could you [kʊʤʊ]         Yod Coalescence                  7/8 (87.5%)     YC 89 (6.2%)         YR 1,351 (93.8%)       1,440 (4 x 360)

Key: VLASA: Voiceless Alveolar Stop Assimilation; VASA: Voiced Alveolar Stop Assimilation; NA: Nasal Assimilation; YC: Yod Coalescence;
N/VLASA: Non Voiceless Alveolar Stop Assimilation; N/VASA: Non Voiced Alveolar Stop Assimilation; N/NA: Non Nasal Assimilation; YR: Yod
Retention.
 
In Context 1, where voiceless alveolar stop /t/ is followed by voiceless bilabial
and velar stops /p, k/ at word boundary, e.g. met Peter and that case, NE speakers
produced fewer instances of voiceless alveolar stop assimilation, e.g. [mɛp pita] and [dak
kes], compared to the control group. While the control produced 100% cases of such
assimilation, NE participants scored 343 tokens out of 720 expected, translating to
47.6%. On the other hand, they produced 377 tokens (52.4%) of the unassimilated variant.
This suggests a relative departure from the SBE voiceless alveolar stop assimilation
rule schematised as:

 [alveolar, stop, - voice] → [α place] / ____ ## [α place, stop]

(The voiceless alveolar stop /t/ assimilates in place of articulation to the following
bilabial or velar stop /p, k/.)
Sample Derivation: met Peter [mep pɪtǝ] 
 
    SBE   NE 
Input    met ## pɪtǝ      mɛt ## pita      
Vl. Alveolar Stop Assm. mep pɪtǝ      ---------------   
Cluster Simplification  mepɪtǝ      ---------------   
Output    [mepɪtǝ]      [mɛt pita]      
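
The two steps of the SBE derivation (place assimilation followed by cluster simplification) can be sketched in Python as follows. The segment-list representation, the lookup table `T_BEFORE` and the function name `t_assimilation` are assumptions made for illustration; the sketch covers only /t/ before /p, k/, as in context 1.

```python
# Surface form of word-final /t/ before each triggering stop (illustrative)
T_BEFORE = {"p": "p", "k": "k"}


def t_assimilation(word1, word2, simplify=True):
    """Assimilate final /t/ to a following /p/ or /k/; optionally reduce the geminate."""
    if word1[-1] == "t" and word2[0] in T_BEFORE:
        word1 = word1[:-1] + [T_BEFORE[word2[0]]]  # place assimilation
        if simplify:
            word1 = word1[:-1]  # cluster simplification: one stop survives
    return word1 + word2


sbe_met_peter = t_assimilation(["m", "e", "t"], ["p", "ɪ", "t", "ǝ"])
unsimplified = t_assimilation(["m", "e", "t"], ["p", "ɪ", "t", "ǝ"], simplify=False)
print("".join(sbe_met_peter), "".join(unsimplified))  # mepɪtǝ meppɪtǝ
```

The NE column of the derivation corresponds to applying neither step, which leaves the input segments unchanged.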
 
In the second context, involving assimilation of voiced alveolar stop /d/ to
bilabial or velar stop /b, g/, e.g. good bye, good girl, the percentage scores reveal an
extremely low incidence of voiced stop assimilation [gʊg gɜ:l] in the NE data
compared to the control group. Only 23 tokens, amounting to 3.2%, were produced by
NE speakers, while the control group got 100%. On the other hand, the unassimilated
variant occurred in 697 cases, representing 96.8%. This, again, reveals a near-complete
deviation from the SBE voiced alveolar stop assimilation rule stated as:

 [alveolar, stop, + voice] → [α place] / ____ ## [α place, stop]

(The voiced alveolar stop /d/ assimilates in place of articulation to the following
bilabial or velar stop /g, b/.)
Sample Derivation: good girl [gʊg gɜ:l] 
 
    SBE   NE 
Input    gʊd ## gɜ:l   gud ## gɛl 
Vd. Alveolar Stop Assm. gʊg gɜ:l     --------------- 
Cluster Simplification  gʊgɜ:l       ---------------      
Output    [gʊgɜ:l]   [gud gɛl] 
 
In context 3, where the alveolar nasal /n/ is followed by bilabial stops /b, p/ or
velar stop /k/ at word boundary, e.g. ten boys, ten pounds, in case, NE speakers
produced a significant incidence of nasal assimilation [tɛm bͻis, tɛm paunds] relatively
close to the control’s. They scored 63.5% (686 tokens), while the control got 100%.
Absence of nasal assimilation was observed in 394 cases, representing 36.5%. This
implies that participants substantially conformed to the SBE nasal assimilation rule
expressed as:

 [alveolar, nasal] → [α place] / ____ ## [α place, stop]

(The alveolar nasal /n/ assimilates to the place of articulation of a following bilabial or
velar stop)
Sample Derivation: ten boys [tem bͻɪz] 
 
    SBE   NE 
Input    ten ## bͻɪZ    tɛn ## bͻi # Z   
Final Devoicing  --------------    tɛn ## bͻis   
Nasal Assimilation  tem bͻɪz    tɛm bͻis   
Output    [tembͻɪz]    [tɛmbͻis]   
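
The rule ordering in the derivation above, in which NE applies final devoicing before nasal assimilation while SBE applies nasal assimilation alone, can be sketched in Python. The names `final_devoicing`, `nasal_assimilation` and `derive`, the lookup table and the segment-list representation are assumptions made for this illustration.

```python
# Nasal that surfaces before each triggering stop (illustrative)
NASAL_BEFORE = {"p": "m", "b": "m", "k": "ŋ", "g": "ŋ"}


def final_devoicing(word):
    """NE step: word-final /z/ surfaces as [s]."""
    return word[:-1] + ["s"] if word[-1] == "z" else word


def nasal_assimilation(word1, word2):
    """Final /n/ takes the place of a following bilabial or velar stop."""
    if word1[-1] == "n" and word2[0] in NASAL_BEFORE:
        return word1[:-1] + [NASAL_BEFORE[word2[0]]]
    return word1


def derive(word1, word2, variety):
    """Apply the rules in the order shown in the sample derivation."""
    if variety == "NE":
        word2 = final_devoicing(word2)
    word1 = nasal_assimilation(word1, word2)
    return "".join(word1 + word2)


print(derive(["t", "e", "n"], ["b", "ɔ", "ɪ", "z"], "SBE"))  # tembɔɪz
print(derive(["t", "ɛ", "n"], ["b", "ɔ", "i", "z"], "NE"))   # tɛmbɔis
```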
 
 Context four is a case of yod coalescence. In SBE, /s, z, t and d/ tend to coalesce
with yod /j/ as in miss your, those young men, what you want and could you to become
[mɪʃə, ðǝʊʒʌŋ men, wɒʧʊ, kʊʤʊ] respectively in rapid speech. However, as can be
seen in Table 5.3 (corroborated by Fig. 5.2 below), the occurrence of yod coalescence
amongst NE speakers was abysmally low. Only 89 tokens (6.2%) of appropriate
yod coalescence were produced, compared to 87.5% for the control. On the
other hand, 1,351 (93.8%) tokens of the uncoalesced variant (yod retention) were
articulated. This suggests a significant deviation from the SBE yod coalescence rule
formalised as:

 [- son, + cor, + ant] → [- ant, + strd] / ____ ## [- con, - syl, - back] [- con, - stress]

(/t, d, s, z/ are converted into [ʧ, ʤ, ʃ, ʒ] respectively, before the palatal glide /j/ at
word boundary.)
 
Sample Derivation: miss your 
 
    SBE  NE    
Input    mɪs ## jə   mis ## jɔ     
Palatalisation   mɪʃ jə    ---------------  
Yod/Glide Deletion  mɪʃə    ---------------     
Output    [mɪʃə]  [mis jɔ]  
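
The SBE derivation above combines palatalisation with yod deletion. A minimal Python sketch of the two steps is given below; the mapping `PALATALISED`, the function name `yod_coalescence` and the segment-list representation are assumptions made for illustration, and stress is not modelled.

```python
# Palato-alveolar outcome for each alveolar obstruent before /j/
PALATALISED = {"t": "ʧ", "d": "ʤ", "s": "ʃ", "z": "ʒ"}


def yod_coalescence(word1, word2):
    """Palatalise final /t, d, s, z/ before word-initial /j/, then delete the glide."""
    if word1[-1] in PALATALISED and word2[0] == "j":
        word1 = word1[:-1] + [PALATALISED[word1[-1]]]  # palatalisation
        word2 = word2[1:]                               # yod/glide deletion
    return word1 + word2


miss_your = yod_coalescence(["m", "ɪ", "s"], ["j", "ə"])
could_you = yod_coalescence(["k", "ʊ", "d"], ["j", "ʊ"])
print("".join(miss_your), "".join(could_you))  # mɪʃə kʊʤʊ
```

Yod retention, the majority NE variant, corresponds to skipping both steps and leaving the input unchanged.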
 
Certain explanations germane to place assimilation in Nigerian English can be 
deduced from the above analyses. In the first instance, the findings suggest that only 
nasal assimilation approximated to SBE connected speech processes which, to a large 
extent, supports Jibril’s (1982) claim that assimilation of place in NE is confined to
nasals only. This may be explained by the fact that homorganic nasal assimilation is a 
common phonological process in most Nigerian indigenous languages (Yusuf, 2010). 
As a matter of fact, it is the principal consonant-consonant assimilation process: most 
other cases of assimilation affect contiguous vowels or consonants and vowels (cf. 
section 1.5.1.2).  
Conversely, participants showed various degrees of resistance to assimilation in 
other place assimilation processes: voiced alveolar stop assimilation and yod
coalescence were least articulated compared to voiceless alveolar stop assimilation. 
This is not surprising, considering the fact that assimilation is often triggered when 
speech is spoken fast and sounds are linked with each other without junctures between 
them. Nigerian English speakers, however, are known to usually pick and choose their 
words and, in the process, keep words separate. The corollary of this, therefore, is their 
inability to produce the assimilatory processes commonly found in SBE connected 
speech. 
 
Key: VLASA: Voiceless Alveolar Stop Assimilation; VASA: Voiced Alveolar Stop Assimilation;
NA: Nasal Assimilation; YC: Yod Coalescence; SBE: Standard British English; NE: Nigerian English

Fig. 5.2  Percentage (%) place assimilation score differences for SBE and NE speakers.
 
The performance of NE speakers in yod coalescence, in particular, showed that
this phenomenon, which is becoming widespread in SBE (Cruttenden, 2001), is still
alien to Nigerian English users. Participants’ performance was a far cry from what
obtains in SBE as demonstrated by the control’s score (87.5%). As shown in Table 5.4
below, speakers employed various yod cluster reduction strategies to simplify the
yod phenomenon in the data. The first strategy used was deletion in which the final /t/
and /d/ of the first of the adjoining words were deleted in order to avoid their fusion
with /j/, e.g. [wͻ ju] what you and [ku ju] could you. There were 71 (9.9%) cases of
such deletion. This deletion rule can be captured as:

 /t, d/ → Ø / ____ ## /j/

Sample Derivation: could you
 
    NE  
Input    kud ## ju   
Cluster Simplification  ku ju   
Output    [kuju] 
 Another yod simplification strategy observed is /t/-voicing (what Awonusi
(1985) referred to as the Nigeria /t/-tapping), which is “the realisation of intervocalic
/t/ as a voiced tap rather than a fortis plosive” (Hannisdal, 2006:4). This was produced
(5.8%) in lieu of yod coalescence in what you [wͻt̬u] by 1 Yoruba speaker, 4
participants from the South-South region (Ogoja, Ogoni, Ibibio and Ijaw), one Igbo
and 14 speakers from the North (Buram, Ngas, Tambul, Tarok, Fulani, Phyem, Jenjo,
Igala, Kikaku, Tiv, Challa and Hausa speakers). This realisation can be explained in
two ways. First, it may signify the infiltration of GA (General American) into Nigerian
English as reported by Awonusi (2004b), especially if found to be predominant
amongst young speakers who are known to be linguistic innovators and agents of
language change. Alternatively, it might have been motivated by inherent
articulatory constraints, whereby, in keeping with the “principle of least effort” (Wells
1982: 94), speakers produce utterances with minimum articulatory effort. In this
regard, sounding voiceless [t] between the vowel [ͻ] and [jʊ], which are all voiced sounds,
requires the vocal cords to be turned off and on again, whereas it is a lot easier to
maintain voicing throughout the articulation. The latter explanation seems to
be the more plausible reason for the articulation of /t/-voicing in the data, as the feature was
not found to be peculiar to young speakers.
It is, therefore, plausible to state that apart from the tendency to retain yod at
word boundary, Nigerian English speakers also employed the yod cluster
reduction strategies explicated above to resolve the yod cluster phenomenon.
 
Table 5.4 Frequency and percentage scores for yod reduction strategies  
 
Processes              t/d deletion    t-voicing
Tokens of occurrence   71/720          21/360
% Score                9.9             5.8
 
 
 
5.1.2 Elision in Nigerian English 
In rapid casual speech, sounds which ordinarily are enunciated in isolated 
words or slow, careful speech get elided for euphonic effect; that is, in order to 
maximize smooth pronunciation. Specifically, when there is a cluster of two or more 
consonants word-internally or across morpheme or word boundaries, some of the 
consonants usually get elided, e.g. han(d)kerchief, Chris(t)mas, nex(t) day and fin(d) 
me. This occurs either because of fast speech or for consonant cluster simplification 
purposes.  
 In this section, we examined the application of this SBE feature of connected 
speech in junctural environments in NE, using fifteen (15) items extracted from the 
semi-spontaneous speech data. The purpose was to establish the extent to, and the 
pattern by, which NE speakers elide consonants at morpheme and word boundaries in 
connected speech, compared to what obtains in SBE. The junctural items extracted 
were grouped into four (4) contexts as follows:
1.  Word-final /t/ before another consonant at word boundary, e.g. doesn’t she, 
won’t do it, kept quiet, exact colour, test drive, don't buy it. 
2.  Morpheme-final /t/ before another consonant at word boundary, e.g. jumped 
well, equipped with, fixed price. 
3. Word-final /d/ before another consonant at word boundary, e.g. found five, old
man, cold launch. 
4. Morpheme-final /d/ before another consonant at word boundary, e.g. seemed 
glad, robbed both, advertised car. 
 
The frequency and percentage scores for elision produced by the participants 
and the control in each of these junctural contexts are presented in Table 5.5 below. In 
all, there were 5,400 tokens of occurrence: 2,160 in context 1 and 1,080 tokens in
each of contexts 2, 3 and 4.
Table 5.5  Frequency and percentage scores for elision variants 
 
Process            Context (example)               SBE (control)    NE: Elision      NE: N/E          NE total tokens
t-deletion (WF)    exact colour [ɪgzæ˺kɒlə]        12/12 (100%)     1,359 (62.9%)    801 (37.1%)      2,160 (6 x 360)
t-deletion (MF)    jumped well [ʤʌmp˺ wel]         5/6 (83.3%)      614 (56.9%)      466 (43.1%)      1,080 (3 x 360)
d-deletion (WF)    found five [faʊn˺ faɪv]         5/6 (83.3%)      688 (63.7%)      392 (36.3%)      1,080 (3 x 360)
d-deletion (MF)    robbed both [rɒb˺ bəʊθ]         6/6 (100%)       661 (61.2%)      419 (38.8%)      1,080 (3 x 360)

Grand Total        Elision:      SBE 28 (93.3%)    NE 3,322 (61.5%)
                   Non-Elision:  SBE 2 (6.7%)      NE 2,078 (38.5%)

Key: WF: Word Final; MF: Morpheme Final; N/E: Non-Elision
 
The analysis in Table 5.5 above (corroborated by Fig. 5.3 below) shows that in 
context 1 (word-final /t/ before another consonant at word boundary), NE speakers 
realised 1,359 (62.9%) significant tokens of /t/ elision e.g. [egzak˺ kͻlͻ, don˺ bai], 
while they failed to elide /t/ in 801 cases (37.08%), e.g. [tɛst draiv, kɛpt kwaiet]. This 
suggests that they closely approximated to the SBE form (represented by the control‟s 
score of 100%). In Context 2 (morpheme-final /t/ before another consonant at word 
boundary), the incidence of elision produced by NE speakers was less than what 
obtained in the first context. It was 614 (56.9%) tokens of elision, e.g. [ʤͻmp˺ wel, 
fis˺ prais] and 466 (43.2%) instances of non-elision, e.g. [ʤͻmpd wel, fiksd prais]. 
This performance, however, compared substantially to the control‟s percentage of 
83.33%.   
In the third context (word-final /d/ before another consonant at word 
boundary), NE participants’ performance, again, approximated to the control’s score of 
83.3%. They produced 688 (63.7%) tokens of elision, e.g. [faun˺ faiv, ol˺ man], while 
they failed to elide /d/ in just 392 (36.3%) instances. Context 4 (morpheme-final /d/ 
before another consonant at word boundary) also revealed a marked preference for /d/ 
elision in NE. Participants recorded 661 (61.2%) incidences of /d/ elision, e.g. [rͻb˺ 
boθ, ɑdvɑtais˺ car], compared to the 100% performance of the control. They failed to 
elide /d/ in the same position in 419 cases, representing 38.8%.  
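
The percentage scores reported for the four elision contexts can be re-derived from the token counts. The following is a minimal sketch in Python, not part of the original analysis; the dictionary and function names are hypothetical, and only the NE token counts quoted above are taken from the study.

```python
# NE token counts per context as (elided, not elided), as quoted in the text.
NE_ELISION_COUNTS = {
    "t-deletion (WF)": (1359, 801),
    "t-deletion (MF)": (614, 466),
    "d-deletion (WF)": (688, 392),
    "d-deletion (MF)": (661, 419),
}


def percentage(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def summarise(counts):
    """Map each context to (total tokens, % elision, % non-elision)."""
    summary = {}
    for context, (elided, retained) in counts.items():
        total = elided + retained
        summary[context] = (
            total,
            percentage(elided, total),
            percentage(retained, total),
        )
    return summary


summary = summarise(NE_ELISION_COUNTS)

# Grand totals across the four contexts.
grand_elided = sum(elided for elided, _ in NE_ELISION_COUNTS.values())
grand_total = sum(elided + retained for elided, retained in NE_ELISION_COUNTS.values())
overall_elision = percentage(grand_elided, grand_total)

for context, (total, pct_elided, pct_retained) in summary.items():
    print(f"{context}: {total} tokens, {pct_elided}% elision, {pct_retained}% non-elision")
print(f"Overall: {grand_elided}/{grand_total} = {overall_elision}% elision")
```

Rounding to one decimal place follows the precision used in the table of elision scores.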
 
 
 
Fig. 5.3  Percentage (%) elision score differences for SBE and NE speakers. 
  
 
 
The overall percentage scores for elision and non-elision variants in all contexts 
(as reflected in Table 5.5 and represented in Fig. 5.4) show that out of a total of 
5,400 realisations, there were 3,322 (61.5%) incidences of elision, while 2,078 (38.5%) 
tokens were recorded for non-elision. This performance suggests that consonant elision 
is prevalent in Nigerian English in a manner that closely approximates to the SBE rule, 
schematised as:  

    /t, d/ → Ø / ___ {##, #} C

/t, d/ is deleted before a consonant at word or morpheme boundary. 
 
Sample Derivation: test drive 
 
    SBE   NE 
Input    test ## draɪv   tɛst ## draiv  
Cluster Simplification  tes draɪv   tɛs draiv 
Output    [tesdraɪv]   [tɛsdraiv] 
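
The derivation above can be sketched as a small Python function. This is an illustration only, under stated assumptions: the vowel inventory and the function names are hypothetical, words are given as simplified transcription strings, and only the word-boundary case of the rule is modelled (morpheme boundaries inside a word are not).

```python
# Hypothetical vowel inventory for the simplified transcriptions used here;
# every other symbol is treated as a consonant.
VOWELS = set("aeiouɛɪʊəɒʌæɔ")


def starts_with_consonant(word):
    """Return True if the word is non-empty and its first segment is not a vowel."""
    return bool(word) and word[0] not in VOWELS


def delete_boundary_td(words):
    """Delete word-final /t/ or /d/ when the next word begins with a consonant."""
    output = []
    for index, word in enumerate(words):
        following = words[index + 1] if index + 1 < len(words) else ""
        if word.endswith(("t", "d")) and starts_with_consonant(following):
            word = word[:-1]  # t/d -> Ø before ## C
        output.append(word)
    return output


sbe_output = delete_boundary_td(["test", "draɪv"])
ne_output = delete_boundary_td(["tɛst", "draiv"])
print(sbe_output, ne_output)
```

The final word of a phrase is never altered, because no consonant follows it; a vowel-initial following word likewise blocks deletion.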
 
 
Fig. 5.4  Percentage (%) elision and non-elision scores for NE speakers. 
 
These findings have demonstrated that NE, like many other varieties of English 
(including SBE), shows a tendency to elide consonants at word boundary. We, 
therefore, agree with Bailey’s (1973:181) assertion that “all speakers of English delete 
/t/ or /d/ in the heaviest environment”.  
However, unlike in SBE, where as much as a whole syllable may be deleted, 
especially in a weak position in connected speech (Kerswill, 1985; Nolan and Kerswill, 
1990; Wells, 2000), elision in NE predominantly affects consonant(s) in the coda 
position of the first of two adjoining words. The frequency of consonant elision in this 
phonological context is probably made possible by the fact that the coda is said to be 
weaker than the onset position (Hooper, 1976 cited in Jibril, 1982). 
Beyond this, however, the preponderance of boundary elision in Nigerian 
English is best explained as a consonant cluster simplification strategy, rather than an 
output of fast speech. This is because most participants elided the sounds in question 
even when they did not speak fast. Most Nigerian languages have a more natural 
syllable structure, CV or VCV (Hyman, 1975); the complex consonant clusters of SBE 
are rare in them and therefore pose problems for many NE speakers, especially in 
connected speech. To resolve this difficulty, consonant clusters are often simplified 
by vowel epenthesis or by consonant deletion (Simo-Bobda, 2004; 2007). Simo-Bobda 
(2007) refers to this simplification strategy (consonant deletion in particular) as a 
major feature of African English accents. This is, therefore, another instance of 
naturalness in phonology, as explained earlier, since, according to Hyman (1975:162), 
“Consonant deletion processes are widespread in languages”.  
 
 
5.1.3.  Liaison in Nigerian English 
There are several ways by which contiguous vowels at word or morpheme 
boundary are linked together in SBE. This could be through r-liaison, a semi-vowel, 
etc. The commonest of these categories is r-liaison, comprising linking /r/, e.g. car 
owner /ka:r əʊnǝ/ and intrusive /r/, e.g. media event /mi:dɪər ɪvent/. This section, 
therefore, examines the occurrence of r-liaison (linking and intrusive /r/) in the speech 
of Nigerian English speakers and discusses the findings in the light of what obtains in 
SBE. Fourteen boundary items with potential r-liaison were extracted from the data: 
11 items were used to test linking /r/ and 3 to test intrusive /r/.   
 
Context  items 
 
Linking /r/: Peter at, more of him, after a while, their action, wore a black dress, 
inquire about, colour of, for all, there are, over eat, power-assisted 
steering.  
Intrusive /r/: law and order, idea of it, media event.  
 
 
 
Table 5.6 shows the frequency and percentage scores for variants produced by 
Nigerian participants and the control group in each of these boundary contexts.  
Altogether, 5,040 tokens were expected: 3,960 in context 1 and 1,080 in context 2. 
 
 
 
 
Table 5.6 Frequency and percentage scores for r-liaison.

Process         Example            SBE r-liaison   NE r-liaison   NE r-suppression   NE tokens
Linking /r/     [æftrə waɪl]       22/22 (100%)    319 (8.1%)     3,641 (91.9%)      3,960 (11 x 360)
Intrusive /r/   [mi:dɪər ɪvent]    6/6 (100%)      31 (2.9%)      1,049 (97.1%)      1,080 (3 x 360)
Grand Total                        28 (100%)       350 (6.9%)     4,690 (93.1%)      5,040

SBE r-suppression (Grand Total): 0 (0%)
 
Table 5.6 above reveals a very low occurrence of r-liaison in the NE data. Of 
the total 3,960 anticipated tokens of linking /r/ in context 1, only 319 incidences, 
representing a negligible 8.1%, were recorded, e.g. [mͻr ͻf, aftar e wail], whereas 
there were 3,641 tokens of r-suppression, constituting 91.9%, e.g. [pitɑ ɑt, deǝ ɑkʃͻn]. 
By contrast, the control group’s score of 100% represents an overwhelming tendency 
for linking /r/ in SBE. In context 2, the rate of intrusive /r/, e.g. [aidiar ͻf], is much 
lower for NE speakers: there were just 31 instances (2.9%) against 1,049 cases (97.1%) 
of r-suppression, e.g. [midia ivent]. This is a far cry from what obtains 
in SBE, as depicted by the control’s 100% intrusive /r/ usage. It is obvious from the 
results (as portrayed in Fig. 5.5) that the incidence of r-liaison is very low in NE, 
unlike in SBE where it is much more prevalent. Nigerian English speakers rarely used 
linking and intrusive /r/, producing only 350 incidences (6.9%) in both 
contexts out of a total of 5,040 tokens. This suggests that Nigerian English 
speakers deviated significantly from the /r/-insertion (linking /r/) rule, which is a 
regular feature of SBE connected speech, captured as: 
    Ø → r / V ___ ## V
(/r/ is inserted between a vowel and a following vowel at word boundary).  
The NE rule is rather formulated as: 
    r → Ø / V ___ ## V
(Orthographic r is deleted between a vowel and a following vowel at word boundary) 
Sample Derivation: more of 

               SBE            NE 
Input          more of        more of 
               mͻ: ## əv      mͻ ## ͻf   
R-Insertion    mͻ:r əv        ---------- 
Output         [mͻ:rəv]       [mͻ ͻf] 
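
The SBE insertion rule can likewise be sketched in Python. The sketch below is illustrative and rests on assumptions not made in the study: the rule is restricted to a hypothetical set of non-high vowels (so that phrases such as see it do not trigger insertion), length marks are ignored when locating the final segment, and all names are invented for the example. The NE pattern of r-suppression corresponds simply to leaving the input unchanged.

```python
# Assumed set of word-final vowels after which r-liaison may apply,
# and an assumed overall vowel inventory for the simplified transcriptions.
LIAISON_VOWELS = set("əɔɑa")
ALL_VOWELS = LIAISON_VOWELS | set("eiouɛɪʊɒʌæ")


def insert_r(words):
    """Insert r at a word boundary between a liaison vowel and a following vowel."""
    output = []
    for index, word in enumerate(words):
        following = words[index + 1] if index + 1 < len(words) else ""
        final_segment = word.rstrip(":ː")[-1:]  # ignore trailing length marks
        if final_segment in LIAISON_VOWELS and following[:1] in ALL_VOWELS:
            word = word + "r"  # Ø -> r / V ___ ## V
        output.append(word)
    return output


linking = insert_r(["ka:", "əʊnə"])       # car owner
intrusive = insert_r(["mi:dɪə", "ɪvent"])  # media event
print(linking, intrusive)
```

Because the function works on transcriptions rather than spelling, it does not distinguish linking /r/ from intrusive /r/; that distinction depends on orthography.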
 
A number of factors account for the low usage of /r/ liaison in the data. First, 
using linking /r/ across a word boundary requires the two adjacent words, e.g. Peter 
and at, to be linked with each other. This, however, is an arduous task for most 
Nigerian speakers of English who, like many L2 speakers, normally keep orthographic 
words separate in connected speech and thereby pronounce every sound as distinctly 
as possible (Simo Bobda, 1994; Bamgbose, 2004). Second, the feature was not 
encouraged by the syllable-timed rhythm of NE, whereby each syllable tends to occur 
at regular time intervals. It was, however, possible for the control to produce it because 
their isochronous rhythm requires all the unstressed syllables (e.g. -ter at the [tə ət ðə]) 
after the stressed syllable (Pe- [pɪ]) to be pronounced swiftly, taking the same amount 
of time as the single stressed syllable.  
 Finally, the level of awareness of this feature is very low in Nigeria. It is 
not a feature that is often heard, except occasionally in the media from 
newscasters, presenters and announcers who try as much as possible to approximate to 
native English speech in order to appeal to their international audience. Besides, 
many NE speakers who are possibly aware of it tend to avoid it in casual speech, 
because such a speech feature makes them sound foreign and affected and often elicits 
a negative attitude from people.  
  
 
 
Fig. 5.5    Percentage (%) r-liaison and r-suppression scores for NE speakers 
 
However, it was observed that linking /r/ was more prevalent than intrusive /r/ 
in the data. A score of 8.06% was recorded for linking /r/ against 2.87% for intrusive 
/r/. The very low occurrence of intrusive /r/ in NE is not surprising, considering 
the fact that pronunciation of English words in Nigeria is, to a large extent, 
orthographic or spelling-induced (Akinjobi, 2013); since r is not present in the 
orthography of the affected junctural words, one would be asking too much to expect 
/r/ to show up in those environments. This finding is consistent with Awonusi’s 
(2004b:16) claim that “intrusive /r/ is ... practically non-existent in NEA”. 
Apart from suppressing /r/, a few NE speakers also produced smoothing to 
resolve the linking /r/ phenomenon (see Table 5.7). Smoothing, otherwise known as 
levelling, is a subtype of compression whereby “a prevocalic diphthong loses its 
second element and is reduced to a monophthong” (Hannisdal, 2006:116). For 
example, /aɪə/ and /aʊə/ of fire and power may become [aə] or [a:] either within a word 
or in connected speech (Gimson, 1980; Wells, 1982; 2000). In the data, /eə##ə/ and 
/eə##æ/ of there are and their action were respectively smoothed to [eə] and [a:] as in 
[deə] and [da:kʃɔn] in 40 instances, constituting 2.8%.  
Jibril (1982) had earlier attempted to explain this phenomenon, which he 
referred to as a diphthong monophthongisation process (akin to smoothing) in NE, in 
terms of mother tongue influence. According to him, there is a tendency in Igbo and 
Yoruba for the first of two vowels in sequence at a word boundary to undergo 
regressive assimilation and be deleted outright, as in uzo amaka 'road is good' becoming 
[uzamaka] and fe oko 'get a husband' becoming [fo̩  ko̩] respectively. This trend, he 
opines, may sometimes influence the reduction of English diphthongs to monophthongs 
by speakers of these languages, who perceive them as a vowel sequence.  
However, in view of the fact that this process cut across different language 
groups in the data (11 speakers from the East, 10 from the West, 12 from South-South 
and 7 from the North), there is, obviously, more to smoothing than mother tongue 
influence. It can also be viewed as a reduction process (a junctural simplification 
strategy) for minimising articulatory effort. This feature is said to be common in rapid 
or casual speech in RP and in many dialects of English (Cruttenden, 2001:139; Wells, 
1982:286, 2000:165; Katalin and Szilárd, 2006).  
 
Table 5.7 Frequency and percentage scores for smoothing. 

Process     Tokens of occurrence   % Score
Smoothing   40/1440                2.8
 
 
5.1.3.2  Linguistic correlates of r-liaison in NE 
This section examines the linguistic environments that constrained the use of 
linking /r/ in the data. The analysis shows that linking /r/ occurred more frequently 
between short grammatical words, e.g. there are, more of, after a while, for all; but 
rarely between lexical (including a combination of lexical and grammatical) words like 
over eat, power assisted, Peter at, inquire about, their action, colour of, and wore a. 
 
 
Table 5.8  Linking /r/ according to linguistic contexts 

Process: Linking /r/     Lexical words   Grammatical words   Total
Tokens of occurrence     17              302                 319
% Score                  5.3             94.7                100
 
 
Fig. 5.6 Percentage linking /r/ scores for lexical and function words. 
 
 
As shown in Table 5.8 (corroborated by Fig. 5.6) above, in an environment 
where both or one of the adjacent words are lexical, only 17 instances of linking /r/, 
representing 5.3%, occurred out of the total 319 linking /r/ tokens. On the other hand, 
302 (94.7%) cases of linking /r/ were recorded when /r/ appeared between two 
grammatical words, which means that about 95% of the cases of linking /r/ occurred 
between function words. This, therefore, suggests that linking /r/ is used largely 
between grammatical items in NE. This, however, is not categorical, as it was 
discovered that linking /r/ was used only in such grammatical phrases as there are, 
more of you and after a while, which have become somewhat lexicalised because 
participants have heard them most often. Not surprisingly, it was not 
replicated in other environments such as for all, inquire about, their action, colour of, 
wore a, etc., where it was required. 
 This explains the claim made by Awonusi (2004b:216) that ‘NEA operates the 
linking /r/ rule in a manner consistent with RP in such phrases like for a while, here 
and there and after all’. This position is also consistent with Hannisdal’s (2006) 
finding that, in RP, linking /r/ occurs most frequently between short, often 
grammatical, words, e.g. there are, here is, where a, or a, are also, your own, etc. 
 
5.1.4 Summary of Performance 
Having examined the incidence of SBE assimilatory, elision and liaison 
processes in the connected speech of Nigerian English speakers, we found it germane 
to present the overall performance of participants vis-a-vis what obtains in Standard 
British English, as represented by the control. Thus, Table 5.9 shows a summary of the 
NE participants’ performance in all variants of the processes investigated compared to 
the control’s. In view of the fact that each of the processes contained a different 
number of realisations, the total score for each was converted to a percentage. All the 
calculated percentages were then summed up and used to arrive at the overall score. 
 
 
 
Table 5.9 Summary of CSPs of SBE in the Nigerian English data

Process        Variant                                SBE (%)   NE (%)
Assimilation   Regressive Devoicing                   100       99.2
               Progressive Voicing                    100       21.2
               Progressive Devoicing                  100       65.1
               Voiceless Stop Assimilation            100       47.6
               Voiced Stop Assimilation               100       3.2
               Nasal Assimilation                     100       63.5
               Yod Coalescence                        87.5      6.2
Elision        /t/-deletion (word boundary)           100       62.9
               /t/-deletion (morpheme boundary)       83.3      56.9
               /d/-deletion (word boundary)           83.3      63.7
               /d/-deletion (morpheme boundary)       100       61.2
Liaison        Linking /r/                            100       8.1
               Intrusive /r/                          100       2.9
Grand Total (% tokens of occurrence)                  1254.1    561.7
Grand Total (% tokens expected)                       1300      1300
Overall % (Approximation)                             96.5      43.2
Overall % (Deviation)                                 3.5       56.8
 
Table 5.9 (represented graphically in Fig 5.7 below) shows that the control, 
representing the Standard British English accent, had an overall percentage score of 
96.5%, while Nigerian English speakers scored 43.2%. Thus, NE speakers had an 
overall approximation of 43.2% and an overall deviation of 56.8% (Fig. 5.8). This 
suggests that Nigerian English speakers exhibited, overall, more deviation from, than 
approximation to, Standard British English connected speech processes. 
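
The overall scores follow from summing the thirteen variant percentages and dividing by the 1,300 percentage points expected. A minimal Python sketch of that calculation is given below; the function and variable names are hypothetical, while the percentages are those reported in Table 5.9.

```python
# Percentage scores for the 13 variants, in the order listed in Table 5.9.
SBE_SCORES = [100, 100, 100, 100, 100, 100, 87.5, 100, 83.3, 83.3, 100, 100, 100]
NE_SCORES = [99.2, 21.2, 65.1, 47.6, 3.2, 63.5, 6.2, 62.9, 56.9, 63.7, 61.2, 8.1, 2.9]


def overall_scores(percentages):
    """Return (approximation %, deviation %) from a list of variant percentages."""
    obtained = sum(percentages)
    expected = 100 * len(percentages)  # 1,300 for thirteen variants
    approximation = round(100 * obtained / expected, 1)
    deviation = round(100 - approximation, 1)
    return approximation, deviation


sbe_overall = overall_scores(SBE_SCORES)
ne_overall = overall_scores(NE_SCORES)
print(sbe_overall, ne_overall)
```

Note that this procedure gives every variant equal weight, regardless of how many tokens each contributed.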
 
 
 
Fig. 5.7 Overall percentage CSPs scores of SBE and NE speakers. 
 
 
 
Fig. 5.8  Overall percentage scores of NE approximation to and deviation from SBE. 
 
 
 
 
 
 
5.1.5  Sociophonetic variation of connected speech processes 
This section examines the social differentiation of the three CSPs (assimilation, 
elision and liaison) under consideration, in relation to the variables of region, gender 
and age, using inferential statistics: Factorial MANOVA (Multivariate Analysis of 
Variance) and Bonferroni's post hoc test. In order to arrive at valid and accurate 
statistical outputs and also to make the analysis manageable, all variants identified 
under each category of CSPs were collapsed and treated together. For example, all 
assimilation variants made up assimilation, all elision sites were combined under 
elision, while liaison comprised linking and intrusive /r/.  
 
5.1.5.1  Introduction  
 A 4 x 2 x 2 between-participants Multivariate Analysis of Variance 
(MANOVA) was performed on three dependent variables: assimilation, elision and 
liaison. The independent variables were region (East, North, South-South and West), 
gender (male and female) and age (young and adult). The following research questions 
(culled from the major research questions guiding this study) were addressed: 
(a) Are there significant mean differences in the combined DV of the CSPs 
(assimilation, elision and liaison) on the basis of region, gender and age? 
(b) Are there significant mean differences in individual DVs (assimilation, elision 
and liaison) among different regions? If so, which regions differ? 
(c) Are there significant mean differences in individual DVs (assimilation, elision 
and liaison) between male and female participants? 
(d) Are there significant mean differences in individual DVs (assimilation, elision 
and liaison) between young and adult participants? 
 
5.1.5.2  Analysis  
First, the linearity of the three DVs was tested using the Pearson Product-Moment 
Correlation Coefficient. The result shows that the three DVs (assimilation, elision and 
liaison) are linearly related. The correlation coefficients are low (r < 0.80) but 
statistically significant (see Table 5.10). This suggests that we can make use of MANOVA. 
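
For readers unfamiliar with the coefficient, the following Python sketch shows how a Pearson product-moment correlation is computed for two sets of scores. It is a generic illustration, not the procedure run in the study's statistical package; the function name is hypothetical and the scores in the usage lines are invented purely for demonstration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson product-moment correlation between two score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariation / sqrt(spread_x * spread_y)


# Invented scores for five hypothetical participants (not data from the study).
assimilation_scores = [10, 12, 9, 14, 11]
liaison_scores = [1, 2, 0, 3, 1]
print(round(pearson_r(assimilation_scores, liaison_scores), 3))
```

A coefficient near 0 indicates little linear relationship, while values approaching 1 or -1 indicate a strong one.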
 
 
 
 
 
 
 
 
Table 5.10 Pearson correlation coefficients 

                                     Assimilation   Elision   Liaison 
Assimilation   Pearson Correlation   1              .108*     .491* 
               Sig. (1-tailed)                      .020      .016 
               N                     360            360       360 
Elision        Pearson Correlation   .108*          1         .032* 
               Sig. (1-tailed)       .020                     .030 
               N                     360            360       360 
Liaison        Pearson Correlation   .491*          .032*     1 
               Sig. (1-tailed)       .016           .030  
               N                     360            360       360 
*. Correlation is significant at the 0.05 level (1-tailed). 
 
 
However, the result of Box's M test (Table 5.11), conducted to evaluate the 
assumption of homogeneity of variance-covariance matrices, shows that the test is 
significant (which means that the covariance matrices are significantly different across 
levels of the IVs). This indicates a somewhat increased possibility of Type I error; 
but with a high power to detect the main effect (0.998 and 0.991; see Table 5.12), the 
error can be catered for. Besides, Pillai’s Trace multivariate test, acknowledged for its 
robustness to violations of assumptions, shall be reported.  
 
Table 5.11  Box's test of equality of covariance matrices 

Box's M   138.824 
F         1.455 
df1       90 
df2       79208.252 
Sig.      .003 

Tests the null hypothesis that the observed covariance matrices of the 
dependent variables are equal across groups. 
  
Multivariate analysis of variance was used to test research question (a) at p < 0.05. The 
result of the multivariate test is presented in table 5.12. 
 
 
 
 
 
 
 
 
 
Table 5.12  MANOVA summary table for Multivariate tests 
 
 
 
MANOVA results as presented in Table 5.12 show that region significantly 
affected the combined DV of the CSPs: Pillai's Trace = 0.11, F (9, 1032) = 4.29, 
p < 0.05, η² = 0.04. This implies that there were significant mean differences in the 
combined DV of the CSPs across regions. The multivariate effect size was, however, 
small (3.6%).  
 The results further indicate a significant gender effect on the combined DV of 
the CSPs, Pillai's Trace = 0.07, F (3, 342) = 8.12, p < 0.05, η² = 0.07. This again 
implies that there was a significant mean difference in the combined DV of the CSPs 
between male and female participants. The multivariate effect size was, however, also 
small (6.7%). 
 However, age did not significantly affect the combined DV of the CSPs, Pillai's 
Trace = 0.02, F (3, 342) = 2.19, p > 0.05, η² = 0.02.  The multivariate effect size was 
small (1.9%).  
 
 
 Since significant multivariate main effects have been obtained, it 
is customary to go ahead and examine the univariate F tests of each DV with a view to 
identifying which of the DVs were significantly affected by the IVs. However, the 
experiment-wise alpha protection provided by the overall F test does not extend to the 
univariate tests. In order to neutralise the inflated error rate that could arise from 
multiple ANOVAs, therefore, a Bonferroni-type adjustment is normally employed. This 
requires setting a more stringent alpha level for the test of each DV so that the error 
rate for the set of DVs does not exceed some critical value. In doing this, the overall 
α-level for the analyses is divided by the number of DVs (Adegoke, 2012). 
 In this regard, the earlier experiment-wise alpha level of 0.05 was divided by 9 
(the number of tests to be performed) to get an acceptable significance level for each 
of the 9 tests. The alpha level was, therefore, set to p < 0.006 (that is, 0.05/9). Thus, 
research questions (b), (c) and (d) were tested at the p < 0.006 (approximated to 0.01) 
significance level. 
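
The adjustment amounts to a single division. The short Python sketch below reproduces it under the assumption that the nine tests correspond to the three DVs crossed with the three IVs; the function and constant names are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Return the per-test alpha level for a Bonferroni-type adjustment."""
    if n_tests < 1:
        raise ValueError("at least one test is required")
    return family_alpha / n_tests


N_DVS = 3  # assimilation, elision, liaison
N_IVS = 3  # region, gender, age
n_tests = N_DVS * N_IVS  # assumed source of the nine univariate tests

adjusted_alpha = bonferroni_alpha(0.05, n_tests)
print(n_tests, round(adjusted_alpha, 3))
```

Dividing 0.05 by 9 gives approximately 0.0056, which rounds to the 0.006 threshold stated above.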
 
Table 5.13   Tests of between participants effects 
 
 
 
 
 From Table 5.13, we found that it is only the mean scores in liaison that 
differed significantly across different regions, F (3, 344) = 8.14; p < 0.01, η² = 0.07. 
This suggests that there was a significant univariate main effect of region on liaison. 
The effect size is small (6.6%). On the other hand, there was no significant univariate 
main effect of region on assimilation, F (3, 344) = 2.05; p > 0.01, η² = 0.02 and elision, 
F (3, 344) = 2.48; p > 0.01, η² = 0.02. 
 In the same vein, Table 5.13 shows that mean scores in elision differed 
significantly between male and female participants, F (1, 344) = 22.21; p < 0.01, η² = 
0.06.  This suggests a significant gender effect on elision. The effect size is small 
(6.1%). However, there was no significant univariate main effect of gender on 
assimilation, F (1, 344) = 0.03; p > 0.01, η² = 0.00 and liaison, F (1, 344) = 1.54; 
p > 0.01, η² = 0.00. 
 Finally, no significant univariate main effect of age was found in assimilation, 
F (1, 344) = 2.61; p > 0.01, η² = 0.01; elision, F (1, 344) = 1.59; p > 0.01, η² = 0.01 and 
liaison, F (1, 344) = 1.78; p > 0.01, η² = 0.01. 
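
The effect sizes reported with the univariate tests can be checked against the F ratios and degrees of freedom. The sketch below assumes that the reported η² is partial eta squared, which can be recovered from F through the identity η²p = (F × df effect) / (F × df effect + df error); the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Univariate results quoted in the text: (F, df effect, df error).
region_on_liaison = partial_eta_squared(8.14, 3, 344)
gender_on_elision = partial_eta_squared(22.21, 1, 344)
age_on_assimilation = partial_eta_squared(2.61, 1, 344)

print(round(region_on_liaison, 3))    # effect of region on liaison
print(round(gender_on_elision, 3))    # effect of gender on elision
print(round(age_on_assimilation, 3))  # effect of age on assimilation
```

Under this assumption the recovered values agree with the percentages given in the text (6.6% for region on liaison and 6.1% for gender on elision).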
 
Table 5.14   Table of descriptive statistics of mean scores in elision 
 
 
 
 Table 5.14 shows participants' performance levels in elision clearly, 
especially the significant difference observed between male and female 
participants. The table shows that male participants had a higher adjusted mean score 
(M = 9.91; SD = 2.84) in elision than females (M = 8.55; SD = 2.58). 
 In the same vein, Table 5.15 reveals participants' performance levels in liaison. 
The Eastern participants had the highest mean score (M = 1.38; SD = 1.44), followed 
by South-South (M = 1.10; SD = 1.22), Western (M = 1.05; SD = 1.16) and Northern 
participants (M = 0.57; SD = 0.94). 
 
Table 5.15   Table of descriptive statistics of mean scores in liaison 
 
 
 However, because the IV (region) has more than two levels, it became 
necessary to examine the post hoc test for liaison in order to show where the regional 
differences lie; that is, which regions are significantly different from one another.  
 
 
 
 
Table 5.16    Table of multiple comparisons: Post hoc test 
Multiple Comparisons (Bonferroni) 
Dependent Variable: Liaison 

                               Mean Difference   Std.             95% Confidence Interval
(I) REGION     (J) REGION      (I-J)             Error     Sig.   Lower Bound   Upper Bound
NORTH          EAST            -.8083*           .17046    .000   -1.2607       -.3560 
               SOUTH-SOUTH     -.5333*           .17046    .011   -.9857        -.0810 
               WEST            -.4833*           .17046    .029   -.9357        -.0310 
EAST           NORTH           .8083*            .17046    .000   .3560         1.2607 
               SOUTH-SOUTH     .2750             .18673    .850   -.2205        .7705 
               WEST            .3250             .18673    .496   -.1705        .8205 
SOUTH-SOUTH    NORTH           .5333*            .17046    .011   .0810         .9857 
               EAST            -.2750            .18673    .850   -.7705        .2205 
               WEST            .0500             .18673    1.000  -.4455        .5455 
WEST           NORTH           .4833*            .17046    .029   .0310         .9357 
               EAST            -.3250            .18673    .496   -.8205        .1705 
               SOUTH-SOUTH     -.0500            .18673    1.000  -.5455        .4455 
Based on observed means. 
 The error term is Mean Square (Error) = 1.395. 
*. The mean difference is significant at the .01 level. 
 
 Recall that the alpha level of 0.05 had been adjusted to p < 0.01 (based on the 
number of tests performed earlier) to get an acceptable significance level to protect 
against inflated alpha error. Looking at the pairwise tests comparing liaison by region 
in Table 5.16, therefore, only the East and the North are significantly different from 
each other in liaison. This suggests a convergence of sorts among three regions: East, 
South-South and West. 
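
The reading of the pairwise comparisons can be made explicit by filtering the significance values in Table 5.16 against the chosen alpha level. The Python sketch below is illustrative; the dictionary and function names are hypothetical, and each regional pair is listed once.

```python
# Significance values for each regional pair in liaison, from Table 5.16.
PAIRWISE_SIG = {
    ("NORTH", "EAST"): 0.000,
    ("NORTH", "SOUTH-SOUTH"): 0.011,
    ("NORTH", "WEST"): 0.029,
    ("EAST", "SOUTH-SOUTH"): 0.850,
    ("EAST", "WEST"): 0.496,
    ("SOUTH-SOUTH", "WEST"): 1.000,
}


def significant_pairs(sig_values, alpha):
    """Return the sorted list of pairs whose significance value is below alpha."""
    return sorted(pair for pair, sig in sig_values.items() if sig < alpha)


at_adjusted_alpha = significant_pairs(PAIRWISE_SIG, 0.01)
at_unadjusted_alpha = significant_pairs(PAIRWISE_SIG, 0.05)
print(at_adjusted_alpha)
print(at_unadjusted_alpha)
```

At the adjusted level only the North-East pair qualifies, whereas at the unadjusted 0.05 level the North would differ from all three other regions.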
 
 
5.1.5.3  Summary  
 The variational analysis of NE speakers' disposition to assimilation, elision and 
liaison processes of SBE connected speech, based on region, gender and age, has 
shown that NE connected speech displayed very little variation. In most 
instances, participants' performance cut across the speaker groups studied, indicating a 
uniform tendency for or departure from a particular SBE process.  
 
 
 
5.1.5.3.1 Region 
 The study revealed a regional pattern of usage which does not support 
theoretical claims in the literature about the heterogeneity and diversity of Nigerian 
English. Since it is believed that Nigerian English is, theoretically, as varied as the 
number of indigenous languages spoken within the country's borders (Banjo, 1979; 
Adegbija, 1988), one would have expected a pattern of connected speech that fully 
justifies this claim. Surprisingly, however, assimilation and elision did not differ 
significantly on the basis of region, which means that there was no regional variation 
in the use of these processes.  
 However, this does not, in any way, suggest that the NE speakers represent a 
homogeneous speech community; neither does it imply that they have essentially the 
same norms. Rather, it is a demonstration of certain shared phonetically motivated 
patterns of usage or speaking habits. A lack of regional variation in elision, for example, 
may be traced to phonological naturalness and mother tongue influence. NE speakers, 
regardless of their regional or ethnic leaning, will generally employ sounds which are 
more natural (easier to articulate). As earlier argued in this chapter, the complex 
consonant clusters of SBE pose problems for many NE speakers; there is, therefore, a 
high tendency for simplification of consonant clusters by consonant deletion in 
connected speech, irrespective of the region or tribe of speakers. Simo-Bobda (2007), 
specifically, refers to this trend as a major feature of African English accents, not 
peculiar to Nigerian English. Non-regional variability in assimilation, on the other 
hand, is a possible reflection of the general tendency of NE speakers to keep words 
apart in connected speech. 
 The only regional contrast found was in liaison. Even this was not in any way 
categorical, because the difference lies between the East (with the highest mean score) 
and the North (with the lowest mean score) only, and the effect size was very small: 
just 6.6% (see Table 5.13). Thus, there was a convergence of sorts amongst speakers 
from Western, South-South and Eastern regions. Besides, the overall score in r-liaison 
was generally low. This considerably low percentage score of 6.9% (see Table 5.6) is a 
wide departure from the SBE norm; the paucity of r-liaison was simply most obvious 
among northern participants. 
 
 
 
5.1.5.3.2 Gender 
 Results of gender variation somewhat followed the trend found in region. The 
findings did not sufficiently demonstrate variation in the speech patterns of male and 
female speakers as established in numerous sociolinguistic studies, especially in 
assimilation and liaison. It is generally believed that female speakers possess higher 
usage levels for the more conservative or prestigious speech variants and lower levels 
for those at the progressive or vernacular end. Against this background, one would have 
expected female speakers to perform better than their male counterparts in 
liaison, it being a prestige variant. This was, however, not the case, as no gender 
variation existed in the process. This again reveals a generally low usage of this SBE 
feature among Nigerian speakers, and demonstrates an equal status for liaison, 
irrespective of gender. 
 The hypothesis was, however, partly confirmed in elision, where a significant 
difference was found between male and female speakers (the effect size is, again, very 
small: 6.1%; Table 5.13). The gender difference, however, can be treated more as a 
matter of phonological explicitness than of prestige (though social prestige may not be 
totally ruled out). Elision is a phonetically motivated process that is characteristic of 
connected speech, in that it enhances the ease of articulation (Hannisdal, 2006). That 
males elided significantly more than females suggests that men are more receptive to 
natural phonological processes and tend to be articulatorily more economical than women, 
who are considered more careful and formal in speech (Labov, 1963, 1966; Hudson, 
1996). There is, however, a certain correlation between articulatorily motivated 
processes and social prestige. Phonetic explicitness is often linked with correctness and 
high-status varieties, while phonetic reduction or simplification is associated with 
sloppiness, casualness or vernacular speech. Thus, the finding somewhat depicts the 
sex/prestige pattern to the extent that males’ higher rate of elision is considered 
a reflection of their casualness and less prestigious speech compared to that of 
women.   
 
5.1.5.3.3 Age 
In regard to age variation, the study did not find any significant difference in the 
speech patterns of young and adult speakers in assimilation, elision or liaison. This, 
again, either demonstrates Nigerian English speakers' generally low competence in 
connected speech processes of SBE, regardless of age group, or shows that they are 
motivated by similar phonological tendencies in their articulation of Standard British 
English connected speech processes. 
 
5.2 Acoustic analysis 
This section examines the acoustic properties of portions of the semi-spontaneous 
speeches produced by 8 participants (representing the social variables of 
region, gender and age) and one native speaker. The purpose was to measure 
participants' level of approximation to or deviation from the Standard British English 
CSPs (as represented by the control) and to corroborate the findings obtained through 
statistical (perceptual) analyses. The instrumental analysis was conducted using Praat 
(version 5.1.20), developed by Paul Boersma and David Weenink of the University of 
Amsterdam. The software displays such acoustic properties as speech waveforms, 
spectrograms, fundamental frequency, formant structure, voice bars and pitch curves, 
amongst others. Some of these acoustic tools were used to determine the pitch 
of utterances and to identify voiced, devoiced or voiceless segments, as well as 
articulated, elided or assimilated sound segments.  
The four portions of the semi-spontaneous speech data extracted for instrumental 
analysis to test the CSPs examined earlier were: He’s a nice boy (assimilation of voice), 
Ten pounds (place assimilation), He won’(t) do it (elision) and I met Peter at the 
station (liaison). Each item was segmented into interval tiers (sentence/phrase and 
transcription). In each category, the textgrid for the control’s utterance is displayed 
against those of Nigerian English speakers, as shown below, so as to reveal the 
differences or similarities between the two groups. 
 
 
 
 
 
 
 
 
 
 
 
 
 
5.2.1 Acoustic analysis of He’s a nice boy. 
 
Control 
 
 
Fig. 5.9 The textgrid of He’s a nice boy as produced by the control. 
 
 
S1- West: YF 
 
 
 
Fig. 5.10 The textgrid of He’s a nice boy as produced by a young female speaker of 
English from Western Nigeria. 
 
 
 
 S2- West: AM 
 
 
 
Fig. 5.11 The textgrid of He’s a nice boy as produced by an adult male speaker of 
English from Western Nigeria. 
 
 
 
S3- East: YF 
 
 
Fig. 5.12 The textgrid of He’s a nice boy as produced by a young female speaker of 
English from Eastern Nigeria. 
 
S4- East: AM 
 
 
 
Fig. 5.13 The textgrid of He’s a nice boy as produced by an adult male speaker of 
English from Eastern Nigeria. 
 
 
S5- North: AF 
  
 
 
Fig. 5.14 The textgrid of He’s a nice boy as produced by an adult female speaker of  
English from Northern Nigeria. 
 
S6- North: YM 
 
 
 
Fig. 5.15 The textgrid of He’s a nice boy as produced by a young male speaker of 
English from Northern Nigeria. 
 
 
S7- South-South: AF 
 
 
 
Fig. 5.16 The textgrid of He’s a nice boy as produced by an adult female speaker of 
English from the South-South region of Nigeria. 
 
S8- South-South: YM 
 
 
 
Fig. 5.17 The textgrid of He’s a nice boy as produced by a young male speaker of 
English from the South-South region of Nigeria. 
 
The above textgrids display the speech waveforms, spectrograms, duration and 
pitch curves of the utterance He’s a nice boy as produced by the control and eight 
Nigerian speakers of English. The aim was to determine the proportion of participants 
that observed progressive voicing assimilation by producing the s of He’s as voiced [z] 
(and not as [s]), as obtains in SBE and as demonstrated by the control.  
 Acoustically, a voiced fricative is identified by a band of vertical striations (a 
voice bar) at the base of the spectrogram and comparatively regular vocal cord pulses 
on the waveform, while a voiceless fricative is characterised by small irregular 
fluctuations of air pressure on the waveform, the absence of a voice bar and a break of 
the pitch curve on the spectrogram (Kirchner, n.d.; Ladefoged, 1993:186-187). 
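The contrast between regular vocal cord pulses and irregular fluctuations can be illustrated computationally. What follows is only a minimal sketch in Python and was not part of the analysis, which relied on visual inspection of the Praat displays. It assumes that a short stretch of the waveform is available as a list of samples, and it treats a segment as voiced when its normalised autocorrelation shows a strong peak at a lag within a plausible pitch range. The function names, the 75-400 Hz range and the 0.5 threshold are all illustrative assumptions.

```python
import math


def voicing_strength(frame, sample_rate, f0_min=75.0, f0_max=400.0):
    """Highest normalised autocorrelation at lags matching plausible pitch
    periods. The f0_min..f0_max search range is an assumed default."""
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        head = frame[:-lag]   # samples x[n]
        tail = frame[lag:]    # samples x[n + lag]
        norm = math.sqrt(sum(x * x for x in head) * sum(y * y for y in tail))
        if norm == 0.0:
            continue          # silent stretch: no evidence of periodicity
        r = sum(x * y for x, y in zip(head, tail)) / norm
        best = max(best, r)
    return best


def is_voiced(frame, sample_rate, threshold=0.5):
    """Treat the frame as voiced if it is strongly periodic.
    The threshold is an illustrative assumption, not a published value."""
    return voicing_strength(frame, sample_rate) >= threshold
```

A periodic signal (comparable to regular vocal cord pulses) yields a strength close to 1, whereas aperiodic friction noise yields a value close to 0, which parallels the visual distinction between [z] and [s] described above.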
Through these acoustic cues, we were able to ascertain that the control 
produced [z] in He’s (progressive voicing), as indicated by the voice bar which appears 
at the lower part of the spectrogram (see top of the yellow spot) and the regular vocal 
cord pulses on the waveform (see the pink spot) in Fig. 5.9. By the same cues, 
only S3 (a young female NE speaker from the East) and S7 (an adult female NE 
speaker from South-South) produced [z] in He’s, indicating progressive voicing. The six 
other participants (S1, S2, S4, S5, S6 and S8) produced [s]. Their textgrids reveal 
the absence of a voice bar and a break of the pitch curve at the point where [s] is 
produced on the spectrogram.   
This implies that only 25% (two out of eight) of the NE speakers were able to 
articulate progressive voicing as obtains in SBE. This agrees with the initial perceptual 
finding that Nigerian English speakers deviate significantly from Standard British 
English in progressive voicing, where participants recorded 21.2%. 
 
5.2.2 Acoustic analysis of Ten pounds 
 
Control 
 
 
 
Fig. 5.18 The textgrid of Ten pounds as produced by the control. 
 
 
 
 
 
 
 
 
 
 
 
S1- West: YF 
 
 
Fig. 5.19 The textgrid of Ten pounds as produced by a young female speaker of 
English from Western Nigeria. 
 
 
 
S2- West: AM 
 
 
 
Fig. 5.20  The textgrid of Ten pounds as produced by an adult male speaker of 
English from Western Nigeria. 
 
 
S3- East: YF 
 
 
 
Fig. 5.21  The textgrid of Ten pounds as produced by a young female speaker of 
English from Eastern Nigeria. 
 
 
S4- East: AM 
 
 
Fig. 5.22  The textgrid of Ten pounds as produced by an adult male speaker of 
English from Eastern Nigeria. 
 
 
S5- North: AF 
 
 
Fig. 5.23  The textgrid of Ten pounds as produced by an adult female speaker of 
English from Northern Nigeria. 
 
 
S6- North: YM 
 
 
 
Fig. 5.24  The textgrid of Ten pounds as produced by a young male speaker of 
English from Northern Nigeria. 
 
 
S7- South-South:  AF 
 
 
 
Fig. 5.25 The textgrid of Ten pounds as produced by an adult female speaker of 
English from the South-South region of Nigeria. 
 
 
S8- South-South:  YM 
 
 
Fig. 5.26 The textgrid of Ten pounds as produced by a young male speaker of 
English from the South-South region of Nigeria. 
 
The above textgrids, which display the speech waveform, spectrogram, 
duration, frequencies, formant structure and formant tracks of each slice of the 
utterance Ten pounds as produced by the control and eight Nigerian speakers of 
English, were examined to determine whether or not alveolar nasal /n/ became 
assimilated to bilabial nasal /m/ (nasal assimilation) in ten pounds, as expected in SBE 
connected speech. 
First, it is evident from the spectrograms that each respondent articulated a nasal 
consonant, as shown by the formants (the dark horizontal bars on the pink spot where 
[n] or [m] was articulated) and the formant tracks (the red horizontal lines). According to 
Ladefoged (1993, 2003), nasal consonants generally have formant structures similar to 
but often fainter than those of vowels. This is because nasals have lower 
amplitude than vowels. The first formant usually lies at the base line of the 
spectrogram. In this case, therefore, each of the nasal sounds displays a very low first 
formant. For the control, it is at about 267.1 Hz, while those of the Nigerian speakers 
fall between 246.4 Hz and 547.9 Hz.  
 However, in order to determine the nasal consonant produced, we examined the 
pattern of formant transitions that characterise the vowel that precedes each nasal 
consonant. Nasals are usually distinguished from each other by the different formant 
transitions (movement of the formants) occurring at the end of the vowel that precedes 
or at the start of the vowel that follows the nasal consonant (Ladefoged, 1993; Kirchner, 
n.d.). According to Ladefoged (1993:201), bilabial nasal /m/ is distinguished by a 
downward movement of the second and third formants before it, while velar nasal /ŋ/ is 
characterised by the coming together of the second and third formants before it. 
Alveolar nasal /n/, on the other hand, is identified by 'comparatively small movement 
of the formant'.  
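The three transition patterns attributed to Ladefoged (1993:201) can be expressed as a simple decision rule. The following is a minimal sketch in Python, offered for illustration only; the analysis in this section was carried out by visual inspection of the spectrograms. It assumes that F2 and F3 have been measured (in Hz) at the start and at the end of the vowel transition. The function name and the 100 Hz cut-off for a "comparatively small" movement are hypothetical and are not taken from the source.

```python
def classify_nasal_place(f2_start, f2_end, f3_start, f3_end, small_hz=100.0):
    """Guess the place of a nasal from the F2/F3 movement in the preceding
    vowel. small_hz is an assumed cut-off for 'small movement'."""
    f2_change = f2_end - f2_start
    f3_change = f3_end - f3_start

    # Alveolar /n/: comparatively small movement of the formants.
    if abs(f2_change) < small_hz and abs(f3_change) < small_hz:
        return "alveolar"
    # Bilabial /m/: downward movement of both F2 and F3.
    if f2_change < 0 and f3_change < 0:
        return "bilabial"
    # Velar /ng/: F2 and F3 come together (F2 rises while F3 falls).
    if f2_change > 0 and f3_change < 0:
        return "velar"
    return "indeterminate"
```

With invented measurements, a transition in which F2 falls from 1800 Hz to 1500 Hz and F3 from 2600 Hz to 2300 Hz would be labelled bilabial, which corresponds to the assimilated [m] pattern sought in ten pounds.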
 In the control's spectrogram, the second and third formants (depicted by the 
formant transition of vowel /e/ of ten as well as the formant tracks) show a 
downward movement before the following nasal consonant, indicating that [tem 
paʊns] was articulated. This implies that assimilation of [n] to [m] occurred in the 
control's speech, as earlier established in the perceptual analysis. The same trend was 
also observed in the spectrograms of four Nigerian participants (S2, S3, S4 and S5), 
which implies that these participants assimilated [n] to [m], while the remaining four 
participants (S1, S6, S7 and S8) did not. This, to some extent, corroborates the finding 
of the perceptual analysis, which showed that NE speakers closely approximate to the 
SBE norm in nasal assimilation. 
 
5.2.3 Acoustic analysis of He won't do it   
 
Control 
 
 
Fig. 5.27 The textgrid of He won't do it as produced by the control. 
 
 
S1- West: YF 
 
 
 
Fig. 5.28  The textgrid of He won't do it as produced by a young female speaker of 
English from Western Nigeria. 
 
S2- West: AM  
 
 
 
Fig. 5.29 The textgrid of He won't do it as produced by an adult male speaker of 
English from Western Nigeria. 
 
 
 
S3- East: YF 
 
 
 
Fig. 5.30  The textgrid of He won't do it as produced by a young female speaker of 
English from Eastern Nigeria. 
 
S4- East: AM 
 
 
 
Fig. 5.31  The textgrid of He won't do it as produced by an adult male speaker of 
English from Eastern Nigeria. 
 
 
S5- North: AF 
 
 
 
Fig. 5.32 The textgrid of He won't do it as produced by an adult female speaker of 
English from Northern Nigeria. 
 
 
S6- North: YM 
 
 
 
Fig. 5.33  The textgrid of He won't do it as produced by a young male speaker of 
English from Northern Nigeria. 
 
 
 
S7- South-South: AF 
 
 
 
Fig. 5.34  The textgrid of He won't do it as produced by an adult female speaker of 
English from the South-South region of Nigeria. 
 
S8- South-South: YM 
 
 
 
Fig. 5.35 The textgrid of He won't do it as produced by a young male speaker of 
English from the South-South region of Nigeria. 
 
The above textgrids of the utterance He won't do it exemplify the boundary 
consonant elision process in SBE and NE. From the formant structure and the pitch 
curves on the spectrograms of the control and the participants, it is evident that t was 
elided in all cases in won't. First, formants and vertical striations (representing voicing) 
are visible on the spot where won’t was articulated (highlighted pink), which indicates 
that only approximant /w/, vowel /ʊ/ and nasal /n/ were produced (these acoustic 
features are uncharacteristic of /t/, which is voiceless and is usually represented by a 
burst of noise). Second, in most instances, the pitch curve stretches from [ɪ] of he to [ɪ] 
of it without a break, which implies that no voiceless segment was produced in 
between (recall that a voiceless segment breaks the pitch curve). In the three instances 
(the control, S1 and S5) where there is a break in the pitch curve, the break occurs on 
the spot where [d] was produced and not on [t]. This suggests that it was a 
devoiced [d̥] rather than a voiced [d] that was articulated. 
This, therefore, confirms the initial finding that NE speakers approximate to the 
SBE connected speech processes in regard to elision and, at the same time, establishes 
the predominance of elision as a consonant cluster simplification strategy in Nigerian 
English, as discovered in the previous analysis.  
 
5.2.4 Acoustic analysis of I met Peter at the station   
 
Control 
 
 
 
Fig. 5.36  The textgrid of I met Peter at the station as produced by the control. 
 
 
S1- West:  YF 
 
 
 
Fig. 5.37  The textgrid of I met Peter at the station as produced by a young female 
speaker of English from Western Nigeria. 
 
S2- West: AM 
 
 
 
Fig. 5.38 The textgrid of I met Peter at the station as produced by an adult male 
speaker of English from Western Nigeria. 
 
 
 
S3- East: YF 
 
 
 
Fig. 5.39  The textgrid of I met Peter at the station as produced by a young female 
speaker of English from Eastern Nigeria. 
 
S4- East:  AM 
 
 
 
Fig. 5.40 The textgrid of I met Peter at the station as produced by an adult male 
speaker of English from Eastern Nigeria. 
 
 
S5- North: AF 
 
 
 
Fig. 5.41  The textgrid of I met Peter at the station as produced by an adult female 
speaker of English from Northern Nigeria. 
 
 
S6- North: YM 
 
 
 
Fig. 5.42 The textgrid of I met Peter at the station as produced by a young male 
speaker of English from Northern Nigeria. 
 
 
S7- South-South: AF 
 
 
 
Fig. 5.43  The textgrid of I met Peter at the station as produced by an adult female 
speaker of English from the South-South region of Nigeria. 
 
S8- South-South:  YM 
 
 
 
Fig. 5.44  The textgrid of I met Peter at the station as produced by a young male 
speaker of English from the South-South region of Nigeria. 
 
 The textgrids above illustrate the dispositions of SBE and NE speakers to r-
liaison. The aim was to determine whether /r/ was used to link the vowel of the second 
syllable of Peter [ə] with the vowel of at [ə]. Acoustically, /r/ is distinguished mainly 
by a decrease in the frequency of F3; that is, the lowering of the higher formants: the 
third and even the fourth (Ladefoged 1993, 2003). Ladefoged (2003:149) specifically 
claims that “variations in the frequency of F3 indicate the degree of r-coloring: the 
lower the F3, the greater the degree of rhoticity”. From the spectrogram, therefore, it is 
evident that the control used r-liaison, as demonstrated by the lowering of the third 
formant at the spot where /r/ appears on the spectrogram (highlighted pink). F3 
descends to 2100 Hz for /r/ between the last syllable of Peter and at, as indicated by the 
red arrow. 
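The F3 criterion lends itself to a simple numerical check. The following minimal sketch in Python is illustrative only and was not used in the study. It assumes that F3 has been sampled (in Hz) at several points across the boundary between Peter and at, and it reports r-colouring when the lowest F3 value falls well below the typical (median) F3 of the stretch. The function name, the 300 Hz minimum drop and the sample values other than the 2100 Hz minimum are assumptions.

```python
import statistics


def has_r_colouring(f3_track_hz, min_drop_hz=300.0):
    """Report whether F3 dips markedly below its median value.
    min_drop_hz is an assumed threshold, not a published criterion."""
    baseline = statistics.median(f3_track_hz)
    dip = baseline - min(f3_track_hz)
    return dip >= min_drop_hz


# Hypothetical tracks: one dipping to 2100 Hz (as reported for the control),
# and one that stays high throughout (no lowering of F3).
dipping_track = [2700, 2650, 2400, 2100, 2350, 2600, 2700]
flat_track = [2700, 2680, 2710, 2690, 2700]
```

Under these assumptions the first track would count as r-coloured and the second would not, which mirrors the contrast between the control and the Nigerian participants on this item.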
 On the other hand, none of the Nigerian participants used r-liaison, as can be 
seen on the spectrograms. In the first instance, the higher formants for the boundary 
segments (highlighted pink) are not shown to be lowered; most of them are close to the 
upper limit of the spectrogram. Besides, the linking context was resolved either by 
lengthening the vowel of the second syllable of Peter or by pausing between Peter and 
at. For example, while S2, S3, S4 and S6 paused in between the boundary vowels, S1, 
S5, S7 and S8 lengthened the vowel of the second syllable. This finding, again, supports 
the initial submission that linking /r/ is not a typical connected speech feature of 
Nigerian English, as speakers failed to approximate to SBE. 
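The outcomes of the four acoustic analyses in this section can be tallied as percentages of the eight Nigerian participants. The following minimal sketch in Python is offered for illustration and was not part of the original analysis. The speaker sets are taken from the descriptions in Sections 5.2.1 to 5.2.4, while the variable and function names are illustrative.

```python
SPEAKERS = ["S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8"]

# Speakers whose textgrids showed the SBE pattern for each test item.
SBE_PATTERN = {
    "progressive voicing (He's a nice boy)": {"S3", "S7"},
    "nasal assimilation (Ten pounds)": {"S2", "S3", "S4", "S5"},
    "t-elision (He won't do it)": set(SPEAKERS),
    "r-liaison (I met Peter at the station)": set(),
}


def approximation_rate(process):
    """Percentage of the eight speakers who produced the SBE pattern."""
    return 100.0 * len(SBE_PATTERN[process]) / len(SPEAKERS)


for process in SBE_PATTERN:
    print(f"{process}: {approximation_rate(process):.1f}%")
```

The resulting figures are 25% for progressive voicing, 50% for nasal assimilation, 100% for elision and 0% for r-liaison.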
 
 
 
 
 
 
 
CHAPTER 6 
 
SUMMARY, CONCLUSION AND RECOMMENDATIONS  
 
6.0. Introduction 
 This study set out to investigate assimilation, elision and liaison processes of 
Standard British English (SBE) connected speech in NE, in relation to the region, 
gender and age of speakers. This was with a view to establishing the extent to which 
NE speakers approximate to or deviate from the connected speech processes of 
Standard British English. The findings are to afford us the opportunity to appropriately 
describe the subsegmental domain of Nigerian English and unearth various 
sociophonetic influences on it.  
 Using the semi-spontaneous speeches of 360 participants, drawn from four 
regions and two social categories (gender and age) in Nigeria, we carried out both 
statistical and acoustic analyses in an attempt to answer the following research 
questions: 
(i) are there incidences of assimilation, elision and liaison processes of SBE 
connected speech in Nigerian English?  
(ii) to what extent do Nigerian English speakers approximate to or deviate from the 
Standard British English connected speech processes? 
(iii) are there typical Nigerian English CSPs?  
(iv) are assimilation, elision and liaison socially differentiated in Nigerian English 
in terms of the region, gender and age of speakers? 
(v) what are the possible motivations for participants' performance? 
 
6.1 Summary of Findings 
 Consequently, the following discoveries were made about connected speech 
processes in Nigerian English. 
 In regard to the question of whether there are incidences of assimilation, elision 
and liaison processes of SBE connected speech in Nigerian English (research question 
i), some of the processes that characterise the native speakers' connected speech were 
found in the Nigerian English data, but in varying degrees. Some of them were 
predominant or substantial, while others were minor. The first category comprised 
regressive devoicing, progressive devoicing, nasal assimilation and consonant elision. 
These CSPs were found to be prevalent in the data and cut across ethnic and social 
considerations. Included in the second category are SBE processes that were attested to 
a lesser degree in the data; these are progressive voicing, alveolar stop assimilation, 
yod coalescence, t-voicing, smoothing, linking and intrusive /r/. 
 Therefore, in response to the question on the extent of NE speakers' 
approximation to or deviation from the SBE connected speech processes (research 
question ii), the overall incidence of the CSPs (assimilation, elision and liaison) of 
Standard British English indicated 43.2% approximation to and 56.8% deviation from 
SBE. Considering each process, we found that NE speakers demonstrated significant 
approximation to SBE in three assimilation variants (regressive devoicing, progressive 
devoicing and nasal assimilation), while they deviated in four others (progressive 
voicing, voiceless alveolar stop assimilation, voiced alveolar stop assimilation and yod 
coalescence). Consonant elision, in all contexts, occurred significantly, while the 
incidence of liaison (linking and intrusive /r/) was extremely low. 
The statistical analysis of the above findings showed a preponderance (99.2%) 
of regressive devoicing at word boundary where a voiced obstruent precedes a 
voiceless one, e.g. [ʧoz̥ siks] we chose six player, [hav̥ tu ] I have to go, etc. In each 
case, the preceding segment was devoiced in anticipation of the following voiceless 
sound. Progressive devoicing was significantly produced (65.1%) at word boundary 
where a voiceless segment precedes a voiced one, e.g. [haf d̥ͻn] the job was half done, 
[naɪs b̥oy ] nice boy, etc. In each item, the initial segment of the second of the two 
boundary words was, in most cases, affected by the voicelessness of the last consonant 
of the first. Nasal assimilation was also predominant (63.5%) at word boundary where 
/n/ precedes bilabial stops /b, p/ or velar stop /k/, as in [tɛm bͻis] ten boys, [tɛm pauns] 
ten pounds and [iŋ kes] in case. On the other hand, progressive voicing deviated 
significantly (21.2%) at word-final position where the reduced form of verb be is 
preceded by a voiced segment, e.g. [hiz] he's, [dɔgz] dog's. In the same vein, voiceless 
alveolar stop assimilation was produced less (47.6%) in an environment where 
voiceless alveolar stop /t/ precedes a bilabial or velar stop /p, k/, as in [mɛp pɪta] I met 
Peter, [dak kes] that case. At the word boundary where voiced alveolar stop /d/ 
precedes a bilabial or velar stop /g, b/, e.g. [gug gɛl] good girl, the percentage score for 
the NE speakers revealed extremely low incidence of voiced stop assimilation (3.2%). 
Also, yod coalescence was barely articulated (6.2%) at four boundary environments 
where /s, z, t, d/ fuse with yod /j/ to produce [ʃ, ʒ, tʃ, ʤ], e.g. [miʃɔ:] miss your train, 
[ðoʒɔŋg] those young men, [wɔʧu] what you want, [kuʤu] could you, etc.  
Nigerian English speakers demonstrated a propensity for consonant elision, 
with 61.5% performance, at different boundary contexts (especially in boundary 
consonant clusters involving alveolar plosives /t, d/), e.g. [dɔsn˺ ʃi] doesn't she, [ezɑk˺ 
kͻlͻ] exact colour, [ʤͻmp˺ wel] jumped well, [fiks˺ prais] fixed price, [rͻb˺ boθ] 
robbed both banks, etc.  Conversely, liaison (linking /r/ and intrusive /r/) was less 
prevalent among the participants. Linking /r/ was slightly employed by participants in 
8.1% instances, especially in-between short grammatical words like [mͻr ɔf] more of, 
[aftar e wail] after a while, [ðeər a] there are, etc.; while instances of intrusive /r/ were 
extremely low (2.9%).  
The statistical findings were further corroborated by the results obtained 
through the acoustic analysis, which revealed that participants, in most cases, 
deviated considerably from the SBE norms. As shown by the speech waveforms, 
formant structure, voice bar and pitch curves on their textgrids, only 25% of the NE 
speakers were able to articulate progressive voicing as obtains in SBE, 50% produced 
nasal assimilation, while none of the speakers used r-liaison. The spectrograms, 
however, showed that t was elided by the eight participants in He won’t do it. All these 
corroborated the findings of the statistical analysis.  
As regards the question of whether there are peculiar NE connected speech 
processes in the data (research question iii), few CSPs were found to be typical of the 
NE variety, which were not attested in the SBE data. These are final devoicing, 
regressive voicing and consonant substitution. At word-final position where the 
reduced form of verb be is preceded by a voiced segment, final devoicing, e.g.  [ʃis] 
she’s a good girl, [di dͻks main] the dog’s mine, was significantly produced (78.8%); 
while regressive voicing, whereby a voiceless segment preceding a voiced one at word 
boundary becomes voiced e.g. [aiz blu] ice blue, [blag drɛs] black dress, showed 
30.5% tokens of occurrence. Consonant substitution, which occurred in junctural 
contexts involving /p/ or /f/, is more or less an idiosyncratic deviation from SBE with 
considerably low incidence (1%). It was attested in the speech of few participants from 
Northern Nigeria where /p/ was sometimes substituted for /f/ or vice versa due to the 
influence of Hausa (a lingua franca in the region) which regards [p], [f] and [Ф] as 
allophones of /p/ or /f/ (Jowitt, 1991). This process was evident in the following 
alternations: [hɑp tu] have to, [faif fauns, faɪp pauns] five pounds, [ͻp kͻs] of course, 
[laip ʃo] live show, [hɑp dͻn] half done, etc. 
On the issue of the social variation of assimilation, elision and liaison in 
Nigerian English in relation to region, gender and age (research question iv), it was 
discovered that region and gender of speakers were significant in only a few processes, 
while age was inconsequential. Regional contrast was found in liaison: speakers from 
the East performed significantly better than those from the North; at the same time, 
there was a convergence of sorts across three regions: West, South-South and East. 
Gender also had a significant effect on elision: male speakers' performance was 
significantly better than female speakers'.  
In regard to the factors responsible for participants' performance (research 
question v), it was observed that participants' approximation to SBE in certain 
processes was principally motivated by articulatory (phonetic) factors and mother 
tongue transfer rather than adequate knowledge of English phonological rules. 
Participants were able to approximate to Standard British English CSPs in processes 
that are more natural (require less articulatory effort), common and attested in many 
languages, or where the sound segments in question are easily accessible in their 
indigenous languages. This informed their better performance in processes that involve 
devoicing, homorganic nasal assimilation and deletion. They could not, however, 
replicate this feat in voicing, yod coalescence and r-liaison processes, which require 
more articulatory energy.  
In terms of social factors, gender variation in elision can be associated with 
articulatory economy as well as the sloppy, casual and less prestigious speech habits of 
male speakers, compared to their female counterparts' formal and more refined speech. In 
the literature, phonetic explicitness is often linked with correctness and high-status 
varieties, while phonetic reduction or simplification is associated with sloppiness, 
casualness or vernacular speech, for which men's speech is known. On the other hand, 
the regional variation in liaison between Eastern and Northern speakers does not 
necessarily give one region a significant social advantage over the other, as speakers 
from all regions recorded extremely low performance in this process. Rather, it 
suggests that the absence of r-liaison in Nigerian English is most obvious in the North.  
 
6.2 Conclusions 
The implications of the foregoing, therefore, are as follows: 
1. Nigerian English speakers' use of Standard British English connected speech 
processes manifested, overall, more deviation from, than approximation to, 
SBE. This suggests NE speakers' relatively low level of competence in 
Standard British English connected speech processes, and establishes the 
marked difference between the CSPs of the two varieties. Unlike in SBE where 
the occurrence of CSPs is widespread (Laver, 1968), these processes are not so 
prevalent in the speech of NE speakers. Where they occur at all, they are 
largely influenced by mother tongue transfer and articulatory exigencies: the 
need to employ natural features that require less articulatory effort and are 
attested in or common to many languages (Hyman, 1975; Simo Bobda, 1994).  
Besides, in SBE, CSPs do not affect sound segments only, but also a 
whole word, syllable and sometimes a phrase (Kerswill, 1985; Nolan and 
Kerswill, 1990; Wells, 2000); whereas, they are almost always restricted to 
word or morpheme boundaries in NE. These deviations stem from the fact that 
NE speakers, like many other L2 speakers, do not speak English like native 
speakers who are fond of speaking fast, with sounds (and by implication 
words) slurring into each other. Rather, they have a tendency to keep words 
apart during speech.  
This can be explained in terms of a number of factors. One is the 
syllable-timed rhythm of the phonology of the indigenous Nigerian languages, 
in which each syllable is given equal weight (no syllable is more important 
than the other) and pronounced with equal emphasis. Most Nigerian speakers, 
therefore, negatively transfer this innate speech habit to the phonology of 
English. The second factor is the bookish nature of the speech habit of Nigerian 
English speakers. Being L2 learners, Nigerians neither have the intuition of the 
native speaker nor acquire English in the native speakers' environment. English 
is learnt, principally, in a formal classroom setting where pronunciation 
teaching is based on isolated words and not on utterances (Laver, 1968; 
Gimson, 1980). Most Nigerian speakers, therefore, speak just the way they 
read, putting emphasis on each word.  
 
2. In a bid to resolve the difficulty posed by certain SBE connected speech rules, 
Nigerian English speakers have developed certain characteristic patterns of 
CSPs that deviate from the SBE norm. Some of them are predominant or 
substantial processes employed by a majority of speakers and can be regarded 
as Standard Nigerian English (e.g. final devoicing); while others are 
idiosyncratic deviations or low-level processes with regional colouration (e.g. 
consonant substitution). This is due to the fact that certain SBE generative rules 
do not apply in Nigerian English and, at the same time, there are peculiarly 
Nigerian English rules, which do not operate in SBE. 
 
3. Deviation from the SBE norm, therefore, may have implications for 
intelligibility. NE speakers may end up producing 'un-English' CSPs which are 
unintelligible to native speakers, particularly if such deviation is far removed 
from the SBE norm. Gimson (1980:313) advises in this regard that an L2 
learner must avoid “assimilatory habits which are characteristic of his own 
native language but not of English”. Beyond appropriate articulation of 
connected speech of SBE, however, NE speakers are likely to find it difficult to 
understand or decode the speech of a native speaker which is highly reduced, 
simplified or fused. This is an aspect which Gimson (1980), again, considers 
even more important than the acquisition of productive skills. 
 
4. The observation that CSPs may be socially differentiated in a community, 
depending on the regional affiliation, age, sex and socio-economic class of 
speakers (Kerswill, 1985; 1987) is not fully supported in this study. This is 
because only a little variation was observed in the data. Regional variation was 
found in liaison between Eastern and Northern participants, while males 
performed significantly better than females in elision process; the effect sizes 
of both levels of variation were, however, very small. The implication of this is 
that there is more convergence than divergence in these aspects of CSPs of 
Nigerian English speakers regardless of region, gender or age. In view of this, 
we cannot but agree with Laver (1968) that variation in speech in Nigerian 
English seems to be confined to certain sound segments and particular 
intonation patterns.  
 
5. Nigerian English is, definitely, ripe for standardisation and codification. At the 
subsegmental domain and in other domains earlier examined by scholars, there 
are features that are shared with Standard English, those which are peculiarly 
Nigerian and those that are alien to the NE variety. It becomes pertinent, 
therefore, to collate these features with a view to sieving genuine variations 
from errors and delimiting Standard Nigerian English pronunciation from 
regional varieties and non-standard forms which may impair intelligibility.  
 
6.3 Recommendations and further studies 
This study has attempted to contribute to scholarly efforts to characterise, 
standardise and codify Nigerian English, especially at the subsegmental level. It 
identified, categorised and provided nomenclatures (where none existed) for a number 
of connected speech processes found in Nigerian English vis-a-vis what obtains in 
Standard British English. Of paramount significance is the sociophonetic approach 
employed to unearth these processes, which threw more light on the 
variability of their usage in Nigeria. In view of the above discoveries, therefore, it 
becomes clearer that, as much as this study confirms the reality and distinctness of 
the Nigerian English variety and the imperativeness of its codification, there is a need to 
raise the standard of spoken English in Nigeria, not only so that it is intelligible to native 
speakers and speakers of other varieties, but also to ensure that Nigerian speakers are 
able to understand native speakers' speech. This is because deviations that are too 
distant from SBE may impair intelligibility at both production and perception levels.  
Therefore, pedagogical efforts should be made to target for correction those 
features that are heavily induced by negative mother tongue transfer and spelling-cued 
mispronunciation, which may widen the intelligibility gap between the native and non-
native varieties. Besides, pronunciation teaching in Nigerian schools should no longer 
be based, primarily, on segmental features; emphasis should be placed on how 
differently these seemingly discrete sounds behave in the stream of connected speech. 
Learners should therefore be exposed to the communicative use of English. 
This study, apart from providing a descriptive analysis for the sub-segmental 
features of Nigerian English, will serve as a planning platform for language planners, 
on the basis of which a standard that will be acceptable for teaching-learning processes 
can be established. It will also be of immense value to scholars, students and all who 
crave good spoken English. 
This is, no doubt, one of the studies that have attempted to shift focus from 
segmental analysis to the description of sentence features. It is, however, restricted to 
only a few connected speech processes, owing to space constraints. Further inquiries should, 
therefore, be extended to other connected speech processes, particularly, using natural 
speech data.  
 
 
 
REFERENCES 
 
Abercrombie, D. 1967. Elements of general phonetics. Edinburgh: Edinburgh 
University Press. 
Achebe, C. 1966. Things fall apart. London: Heinemann. 
 
Adegbija, E. 1989. Lexico-semantic variation in Nigerian English. World Englishes 
8.2:165-177. 
 
__________ 1998. Nigerian English: towards a standard variety.  A keynote address 
presented at the 5th Conference of the International Association of World 
Englishes (IAWE) held at the University of Illinois at Urbana-Champaign, 
USA, November 5-7. 1-26. 
 
__________ 2004.  The domestication of English in Nigeria. The domestication of 
English in Nigeria. V. Awonusi and E. Babalola. Eds. Lagos: University of 
Lagos Press. 
 
Adegoke, B. A. 2012. Multivariate statistical methods for behavioural and social 
sciences research. Ibadan: Power Press and Publishers. 
 
Adekunle, M. 1974. The standard Nigerian English. Journal of the Nigeria English 
Studies Association (JNESA) 6.1. 
 
__________ 1979. Non-random variation in Nigerian English. Varieties and functions 
of English in Nigeria. E. Ubahakwe. Ed. Ibadan: African University Press. 
 
Adeniran, A. 1979. Nigerian elite English as a model of Nigerian English. Varieties 
and functions of English in Nigeria. E. Ubahakwe. Ed. Ibadan: African 
University Press. 
 
Adesanoye, F. A. 1973. A study of varieties of written English in Nigeria. PhD. 
Thesis. University of Ibadan. 
 
_______________ 1980. Patterns of deviation in written English. Journal of Language 
Arts and Communication. 1.1:53-55. 
 
Adetugbo, A.   1977. Nigerian English: fact or fiction. Lagos Notes and Records 6. 
 
_______________1979. Appropriateness and Nigerian English. Varieties and 
functions of English in Nigeria. E. Ubahakwe. Ed. Ibadan: African University 
Press. 137-166 
 
_______________ 1987. Nigerian English phonology: is there any standard?  Lagos 
Review of English Studies ix: 64-84. 
Afolayan, A. 1968. The linguistic problem of Yoruba learners and users of English in 
Nigeria. PhD. Thesis. London. 
Aitchison, J. 1981. Language change: progress or decay? London: Fontana Press. 
__________ 1987. English as a second language: a variety or a myth? Journal of 
English as a second language 1. 
Akere, F. 1978. Socio-cultural constraints and the emergence of a standard Nigerian 
English. Anthropological Linguistics 20. 9:407-421.   
 
 
Akindele, F. and Adegbite, W. 1999. The sociology and politics of English in Nigeria: 
an Introduction. Ile-Ife: Debiyi-Iwa Publishers. 
Akinjobi, A.A. 2004. A phonological investigation of vowel weakening and unstressed 
syllable obscuration in educated Yoruba English. PhD. Thesis. Department of 
Linguistics. University of Ibadan. xxiv+300. 
  
__________ 2013. Spelling cued mispronunciation in Nigerian English. Papers in
English and linguistics. R. Atoye. Ed. Ile-Ife: Linguistic Association of
Nigeria. 18-30.
 
Akinlabi, A. 2004. Yoruba sound system. Understanding Yoruba Life and Culture. N.
Lawal, M. Sadiku and A. Dopamu. Eds. Trenton: Africa World Press. 453-468.
Retrieved March 10, 2011, from
http://www.rci.rutgers.edu/~akinlabi/Yoruba-Sound-System.pdf.
Aladeyomi, S. 2002. An evaluation of the spoken English performance of Nigerian
television newscasters. PhD. Thesis. Dept. of Communication and Language
Arts. University of Ibadan. xvii+330. 
__________  and Adetunde, A.K. 2007. Errors of segmental phonemes in the spoken 
English of Nigerian Television Newscasters. The Social Sciences 2.3:302-306. 
Retrieved January 5, 2011, from
http://www.medwelljournals.com/archivedetails.php?jid=1818-5800&issueno=7.
Alo, M. 2005. Revisiting issues in English use and usage in Nigeria: implications for 
the EGL classroom. Journal of the Nigeria English Studies Association 11.1: 
114-130.  
Altendorf, U. 2003. Estuary English: levelling at the interface of RP and south-eastern 
British English. Tübingen: Narr. 
Amayo, A. 1981. Tone in Nigerian English. Papers from the Sixteenth Regional 
Meeting of the Chicago Linguistic Society. J. Kreiman and A. Ojeda. Eds. 
Chicago Illinois: University Press. 
 
Arvaniti, A., & Garding, G. 2005. Dialectal variation in the rising accents of American 
English. Laboratory Phonology 8. C. T. Best; L. Goldstein & D. H. Whalen. 
Eds. Berlin: Mouton de Gruyter. 
Atoye, R. 1991. Word-stress in Nigerian English. World Englishes 10.1:1-6.
_______ 2005a. Deviant word stress in Nigerian English: its implications for
Nigerian's English oral literacy. Paper presented at the 22nd Annual
Conference of Nigeria English Studies Association (NESA). Ile-Ife: Obafemi
Awolowo University.
_______ 2005b. A set of distinctive features for English sound segments. Papers in
English and linguistics. R. Atoye. Ed. Ile-Ife: Linguistic Association of
Nigeria. 41-53.
Attah, M.O. 1987. The national language problem in Nigeria. Canadian Journal of 
African Studies 21.3: 393-401 
Awobuluyi, O. 1996. Language education in Nigeria: theory, policy and practice. 
Retrieved April 14, 2009, from
yeyeolade.wordpress.com/2009/05/15/language-education-i.
 
 
 
Awonusi, V.O. 1985. Sociolinguistic variation in Nigerian English. PhD. Thesis. 
University of London.
_____________ 1987. The identification of standards within institutionalised non-
native Englishes: the Nigerian experience. Lagos Review of English Studies 
IX: 47-63. 
_____________ 2004a. Cycles of linguistic history: the development of English in 
Nigeria. Nigerian English: influences and characteristics. A. Dadzie and V. 
Awonusi. Eds. Lagos: Concept Publication. 46-66. 
______________ 2004b. Some characteristics of Nigerian English phonology.
Nigerian English: influences and characteristics. A. Dadzie and V. Awonusi. 
Eds. Lagos: Concept Publication. 203-241. 
______________ 2007. Linguistic hegemony and the plight of minority languages in
Nigeria. Retrieved February 5, 2009, from
www.reseau-amerique-latine.fr/ceisal-bruxelles/ESE/ESE-7-AWONUSI.pdf.
______________ 2008. Promoting Christian education to fight corruption. Nigerian
Tribune. Retrieved October 12, 2009, from
http://www.tribune.com.ng/news2008/index.php/en/.
Bailey, C.J.N. 1973. The patterning of language variation. Varieties of present-day 
English. R. W. Bailey and J. L. Robinson. Eds. New York: Macmillan. 156-
189. 
 
Bailey, R. W. and Robinson, J. L. 1973. Varieties of present-day English. New York: 
Macmillan.  
Bald, W. D. 1990. An example of phonological reduction in English. Studies in the 
pronunciation of English: a commemorative volume in honour of A.C. 
Gimson. S. Ramsaran. Ed. London: Routledge. 317-322.
Bamgbose, A. 1965. Assimilation and contraction in Yoruba. Journal of African 
Languages 2: 21-27. 
____________  1971. The English language in Nigeria. The English Language in West 
Africa. J. Spencer. Ed. London: Longman.   
____________ 1982. Standard Nigerian English: issues of identification. The other 
tongue: English across cultures. B. Kachru. Ed. Urbana: University of Illinois 
Press. 
_____________ 2004. Problems of standardisation and Nigerian English phonology.
Nigerian English: influences and characteristics. A. Dadzie and V. Awonusi. 
Eds. Lagos: Concept Publication. 179-199. 
Banjo, A. 1969. A contrastive study of aspects of the syntactic and lexical rules of 
English and Yoruba. PhD. Thesis. University of Ibadan. 
________ 1971. Towards a definition of standard spoken Nigerian English. Annales
de l'Université d'Abidjan 24-28.
________ 1975. Varieties and standardization: the case of English in Nigeria. WAMCA 
Paper. 
________ 1979. Beyond intelligibility in Nigerian English. Varieties and functions
of English in Nigeria. E. Ubahakwe. Ed. Ibadan: African Universities Press. 7–13.
 
 
________ 1993. An endonormative model for the teaching of the English Language in 
Nigeria. International Journal of Applied Linguistics 3.2 
________ 1995. On codifying Nigerian English: the research so far. New Englishes: a 
West African perspective. A. Bamgbose, A. Banjo and A. Thomas. Eds. 
Ibadan: Mosuro Publishers. 203-231.  
________ 1996.  Making a Virtue of Necessity: an overview of the English Language 
in Nigeria. Ibadan: Ibadan University Press. 
Baratz, J. C. 1969. A bi-dialectal task for determining language proficiency in 
economically disadvantaged Negro children. Child Development 40.3:889-
901. 
Barnes, K. 2005. Listener expectations and the perception of Scottish English /u/: a 
sociophonetic investigation. MSc. Project. The University of Edinburgh.      
vii + 51. 
Bell, A. 1984. Language style as audience design. Language in Society 13:145-204. 
______ 2001. Back in style: reworking audience design. Style and sociolinguistic 
variation. P. Eckert and J.R. Rickford. Eds. Cambridge: Cambridge University Press.
139-169. 
Bhatt, R. M. 2001. World Englishes. Annual Review of Anthropology 30:527-550.  
Bloomfield, L. 1927. Literate and illiterate speech. American Speech 2.10: 432-439.
Bortoni-Ricardo, S. M. 1985. The urbanization of rural dialect speakers. A 
sociolinguistic study in Brazil. Cambridge: Cambridge University Press. x + 
265 
Botha, R.P. 1973. The phonological component of a generative grammar. Phonology: 
selected readings. E.C. Fudge. Ed. Penguin Books. 213-31. 
Britain, D. 1992. Linguistic change in intonation: the use of high rising terminals in 
New Zealand English. Language Variation and Change 4. 77-103. 
British Council. 1995. English 2000. Retrieved March 20, 2008, from
https://www.britcoun.org/english/enge2000.htm.
Brosnahan, L. F. 1958. English in southern Nigeria. English Studies 39: 97-110.
Byrd, D. 1994. Relations of sex and dialect to reduction. Speech Communication 
15:39-54. 
Carnochan, J. 1948.  A Study in the Phonology of an Igbo Speaker. Bulletin of the 
School of Oriental and African Studies, University of London 12.2: 417-426. 
Retrieved July 2, 2009, from http://www.jstor.org/stable/608757. 
Carter, P.M. 2005. Prosodic variation in SLA: rhythm in an urban North Carolina 
Hispanic community. University of Pennsylvania Working Papers in Linguistics
11.2:59–71.
Cheshire, J. 1982. Variation in an English dialect: a sociolinguistic study. Cambridge: 
Cambridge University Press.
_________  2002. Sex and gender in variational research. The handbook of language 
variation and change. J. K. Chambers, P. Trudgill and N. Schilling-Estes.
Eds. MA: Blackwell Publishing. 423–443.  
 
 
 
Chomsky, N. 1955. The Logical Structure of Linguistic Theory. New York: Plenum. 
__________  1964. Current issues in linguistic theory. The Hague: Mouton.
__________  1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press. 
__________  and Halle, M. 1968. The sound pattern of English (SPE). New York:
Harper and Row. 
Clark, J. and Yallop, C. 1995. An introduction to phonetics and phonology. 2nd ed. 
Oxford: Blackwell 
Clopper, C. and Pisoni, D. B. 2004. Some acoustic cues for the perceptual 
categorization of American English regional dialects. Journal of Phonetics 
32:111–140. 
Coupland, N. 1980. Style-shifting in a Cardiff work-setting. Language in Society 9:   
1-12. 
Crozier, D.H. and Blench, R.M. Eds. 1992. Index of Nigerian languages. 2nd ed. 
Dallas: Summer Institute of Linguistics. 
Cruttenden, A. 1995. Rises in English. Studies in general and English phonetics: 
essays in honour of Professor J. D. O’Connor. J.W. Lewis. Ed. London: 
Routledge. 155–173.  
_____________ 1997. Intonation. 2nd ed. Cambridge: Cambridge University Press.
_____________ 2001. Gimson’s pronunciation of English. 6th ed. London: Arnold. 
Crystal, D. 1987. The Cambridge Encyclopedia of language. New York: Cambridge 
University Press. 
_________ 1991. A dictionary of linguistics and phonetics. Oxford: Basil Blackwell. 
_________  1992. The changing English language – fiction and fact. Thirty years of
linguistic evolution: studies in honour of René Dirven on the occasion of his 60th
birthday. M. Pütz. Ed. Philadelphia: Benjamins. 119-130.
_________ 2003. English as a global language. 2nd ed. UK: Cambridge University 
Press.  
Crystal, T.H. and House, A.S. 1988a. The duration of American-English vowels: an
overview. Journal of Phonetics 16: 263-284.
__________  and ________ 1988b. The duration of American-English stop consonants:
an overview. Journal of Phonetics 16:285-294. 
Deshaies-Lafontaine, D. 1974. A socio-phonetic study of a Québec French
community: Trois-Rivières. Unpublished PhD. thesis, University College
London.
Deterding, D. 2001. The measurement of rhythm: a comparison of Singapore and 
British English. Journal of Phonetics 29:317–330. 
Deuchar, M. 1990. A pragmatic account of women's use of standard speech. Women in
their speech communities: new perspectives on language and sex. J. Coates
and D. Cameron. Eds. New York: Longman. 27-32.
Di Paolo, M. and Faber, A. 1990. Phonation differences and the phonetic content of 
the tense-lax contrast in Utah English. Language Variation and Change 
2:155–204. 
 
 
Dillard, J. L. 1968. Non standard Negro dialect: convergence or divergence? The 
Florida FL Reporter 6.2: 9-12. 
Dittmar, N. 1976. Sociolinguistics. A critical survey of theory and application. 
London: Edward Arnold.
Docherty, G. J., and Foulkes, P. 1999. Newcastle upon Tyne and Derby: Instrumental 
phonetics and variationist studies. Urban voices. P. Foulkes and G. J. 
Docherty. Eds.  London: Arnold. 47–71.  
_____________  and  ________  2005. Glottal variants of /t/ in the Tyneside variety of 
English. A figure of speech: a festschrift for John Laver. W. J. Hardcastle & J. 
Mackenzie Beck. Eds. London: Lawrence Erlbaum. 173-197. 
____________, _________, Milroy, J., Milroy, L. and Walshaw, D. 1997. Descriptive 
adequacy in phonology: a variationist perspective. Journal of Linguistics 33:275–310.  
_____________, Hay, J. and Walker, A. 2006. Proceedings of the 11th Australian
International Conference on Speech Science & Technology. 6-8th December 2006. P.
Warren and C. I. Watson. Eds. University of Auckland, New Zealand.
Donegan, P. J. 1978. On the natural phonology of vowels. PhD. dissertation, Ohio 
State University. Ohio State University working papers in linguistics 23. New 
York: Garland. 
 _________  and Stampe, D. 1979. The study of natural phonology. Current 
approaches to phonological theory. D. Dinnsen. Ed. Bloomington: Indiana 
University Press. 126-173. 
Douglas-Cowie, E., Cowie, R., and Rahilly, J. 1995. The social distribution of 
intonation patterns in Belfast. Studies in general and English phonetics: 
essays in honour of Professor J. D. O’Connor. J. Windsor Lewis. Ed. 
London: Routledge. 180–186.
Dressler, W. U. and Wodak, R. 1982. Sociophonological methods in the study of 
sociolinguistic variation in Viennese German. Language in Society 11:339–
370. 
Dunstan, E. Ed. 1969. Twelve Nigerian languages. London: Longman.
Dziubalska-Kołaczyk, K. 1990. Phonostylistics and second language acquisition.
Papers and studies in contrastive linguistics. J. Fisiak. Ed. 25:71-83.
Eckert, P. 1988. Sound change and adolescent social structure. Language in Society 
17:183-207. 
________ 1989. The whole woman: sex and gender differences in variation. Language 
Variation and Change 1: 245-268. 
________ 1997. Age as a sociolinguistic variable. The handbook of sociolinguistics. F. 
Coulmas. Ed. Oxford: Blackwell. 151-167.
________ 2000. Linguistic variation as social practice. Oxford: Blackwell. 
________ and McConnell-Ginet, S. 2003. Language and gender. Cambridge:
Cambridge University Press. 
Edwards, V. 1986. Language in a Black community. Avon: Multilingual Matters.
 
 
Eka, D. 1985. A phonological study of standard Nigerian English. PhD. Thesis. 
Ahmadu Bello University, Zaria. 
Ekong, P. A. 1978. On describing the vowel system of a standard variety of Nigerian 
spoken English. M. A. Thesis, University of Ibadan. 
 
Esling, J. H. 1991. Sociophonetic variation in Vancouver. English around the world. J. 
Cheshire. Ed. Cambridge: Cambridge University Press. 123-133. 
Fafunwa, A.B. 1974. History of education in Nigeria. London: George Allen & 
Unwin.  
Faraclas, N. 1996. Nigerian Pidgin. London: Routledge. 
Federal Republic of Nigeria. 1977. The national policy on education. Lagos: Federal 
Press. 
_______________________. 1999. The constitution of the Federal Republic of 
Nigeria. Lagos: Federal Ministry of Information Printing Division. Chap. V, 
Section 55: A 908. 
Foley,  J. 1977. Foundations of theoretical phonology. Cambridge: Cambridge 
University Press. 
Foulkes, P. 2006. Phonological variation: a global perspective. Handbook of English 
Linguistics. B. Aarts and A.M.S. McMahon. Eds. Oxford: Blackwell. 625-
669.  
________ and Docherty G. J. 2006. The social life of phonetics and phonology. 
Journal of Phonetics 34.4:409-438.  
________, _________ and Watt, D. 1999. Tracking the emergence of structured 
variation. Leeds working papers in Linguistics and Phonetics 7:1–25. 
________, __________ and ________ 2005. Phonological variation in child directed 
speech.  Language 81. 
Fourakis, M. and Port, R. 1986. Stop epenthesis in English. Journal of Phonetics 
14:197–221. 
Fromkin, V. and Rodman, R. 1993. An introduction to language. 5th ed. Orlando:
Harcourt Brace Jovanovich. 
Gal, S. 1979. Language shift: Social determinants of linguistic change in bilingual 
Austria. New York: Academic Press. 
_____ 1995. Language, gender, and power: an anthropological review. Gender 
articulated: language and the socially constructed self. K. Hall, and M. 
Bucholtz. Eds. New York: Routledge. 169–182.
Gay, T. 1968. Effect of speaking rate on diphthong formant movements. Journal of the 
Acoustical Society of America 44:1570-1575. 
 
Giles, H., & Smith, P. 1979. Accommodation theory: optimal levels of convergence.
Language and social psychology. H. Giles & R. St. Clair. Eds. Oxford: Basil 
Blackwell. 45-65. 
Gimson, A. C. 1980. An introduction to the pronunciation of English. 3rd ed. London: 
Edward Arnold. 
 
 
Goldsmith, J. A. 1974. An autosegmental typology of tone: and how Japanese fits in. 
Proceedings from the Fifth Regional Meeting of the North East Linguistics 
Society (NELS 5). E. Kaisse and J. Hankamer. Eds. Cambridge: Harvard
University Linguistics Department. 172-182. 
 
_____________ 1976. Autosegmental phonology. Ph.D. Thesis. MIT.  
_____________ and Laks, B. 2000. Generative Phonology: its origins, its principles
and its successors. The Cambridge history of linguistics. L. Waugh and E.J.
Joseph. Eds. Retrieved February 10, 2011, from
http://www.hum.unchicago.edu/~jegoldsm/papers/Generative Phonology.Pdf.
Goodman, M. 1964. A comparative study of creole French dialects. The Hague: 
Mouton.  
Grabe, E. and Low, E.L. 2002. Durational variability in speech and the rhythm class
hypothesis. Papers in Laboratory Phonology 7. Mouton.
_______, Post, B., Nolan, F. J., and Farrar, K. 2000. Pitch accent realization in four
varieties of British English. Journal of Phonetics 28:161–185.
Graddol, D. 1997. The future of English. The British Council. Retrieved Sep. 5, 2008 
from http://www.britishcouncil.org/de/learning-elt-future.pdf.  
Grieve, D. G. 1966. English language examining. Lagos: West African Examinations 
Council. 
Grosjean, F. 1998. Studying bilinguals: methodological and conceptual issues. 
Bilingualism: Language and Cognition 1.131-149. 
Gut, U. 2001. Prosodic aspects of Nigerian English. Retrieved Sep. 10, 2007, from  
http://www.spectrum.uni-bielefeld.de/TAPS/Gut.pdf. 
Guy, G. 1981. Syntactic and phonetic variation in Carioca Portuguese. PhD. thesis. The
University of Pennsylvania. 
_____, Horvath, B., Vonwiller, J., Disley, E., & Rogers, I. 1986. An intonational 
change in progress in Australian English. Language in Society 7.23–51. 
Halle, M. 1962. Phonology in generative grammar. Word 18.1/2: 54-72. 
Hannisdal, B. R. 2006. Variability and change in Received Pronunciation: a study of 
six phonological variables in the speech of television newsreaders. PhD. 
Thesis. Department of English. University of Bergen. Retrieved August 20, 
2010, from https://www.bora.uib.no/bitstream/1956/2335/1/ Dr.Avh. Bente  
Hannisdal.pdf.  
Hansford, K., Bendor-Samuel, J. and Stanford, R. 1976. An index of Nigerian 
languages. Ghana: Summer Institute of Linguistics. 
Harrington, J. 2004. Generative phonology and underlying representations. Retrieved
May 7, 2009, from
http://www.linq.mq.edu.au/sspeech/phonetics/phonology/generative/index.html.
__________, Palethorpe, S. and Watson, C. I. 2000. Does the Queen speak the Queen's 
English? Nature 408: 927-928.  
Hay, J., Jannedy, S. and Mendoza-Denton, N. 1999. Oprah and /ay/: lexical frequency,
referee design and style. Proceedings of the 14th International Congress of
Phonetic Sciences. 1389-1392.
 
 
______ and Drager, K. 2007. Sociophonetics. The Annual Review of Anthropology
36:89–103. Retrieved September 9, 2009, from https://www arjournals.annual 
reviews. 
Hayes, B. 1980. A metrical theory of stress rules. PhD. Thesis, MIT.  
Henton, C. and Bladon, A. 1988. Creak as a sociophonetic marker. Language, speech 
and mind: studies in honor of Victoria A. Fromkin. L. Hyman and C.N. Li. 
Eds. London: Routledge. 3-29.
Hoequist, C. and Nolan, F. J. 1991. On an application of phonological knowledge in 
automatic speech recognition. Computer Speech and Language 5:133-153. 
Honey, J. 1997. Sociophonology. The handbook of sociolinguistics. F. Coulmas. Ed.
Oxford: Blackwell. 92-106.
Honikman, B. 1964. Articulatory settings. In honour of Daniel Jones. D. Abercrombie, 
et al. Eds. London: Longman. 
Hooper, J. B. 1973. Aspects of natural generative phonology. PhD. Thesis, UCLA. 
Horvath, B. M. 1985. Variation in Australian English: the sociolects of Sydney.
Cambridge: Cambridge University Press. 
Huber, M. and Brato, T. 2008. The emergence of social varieties in Ghanaian English.
Łódź. Retrieved October 10, 2010, from
http://phoneticsofenglish.wordpress.com/2008/12/07/accents-2008-ii-international-conference-on-native-and-non-native-accents-of-english.
Hughes, A. and Trudgill, P. 1996. English accents and dialects. 3rd ed. London: 
Arnold. 
Hyman, L. M. 1975. Phonology: theory and analysis. New York: Holt, Rinehart and 
Winston. 
Igboanusi, V. 2001. A dictionary of Nigerian English. Ibadan: Sambooks Publishers. 
Ihemere, K. U. 2006. A basic description and analytic treatment of noun clauses in 
Nigerian Pidgin. Nordic Journal of African Studies 15.3:296–313. 
Jackson, A. 1982. Analysing English. Oxford: Oxford University Press. 
Jakobson, R. 1941. Kindersprache, Aphasie und allgemeine Lautgesetze. Uppsala
Universitets Årsskrift 9: 1–83.
___________  and Halle, M. 1956. Fundamentals of language. The Hague: Mouton.
___________, Fant, C.G.M. and Halle, M. 1952. Preliminaries to speech analysis: the
distinctive features and their correlates. Cambridge, Mass.: MIT Press. (MIT
Acoustics Laboratory Technical Report 13.) 
James, C. 1980. Contrastive analysis. London: Longman. 
Jibril, M. 1979. Regional variation in Nigerian spoken English. Varieties and
functions of English in Nigeria. E. Ubahakwe. Ed. Ibadan: African Universities
Press. 43-53.
 
________ 1982. Phonological variation in Nigerian English. PhD. Thesis. University of
Lancaster. 
 
 
_______ 1986. Sociolinguistic variation in Nigerian English. English World-Wide 
7:147-174. 
Josiah, U. E. 2009. A synchronic analysis of assimilatory processes in Educated 
Nigerian spoken English. PhD. Thesis. Department of English. University of 
Ilorin. i-xviii+321. 
Jowitt, D. 1991. Nigerian English usage: an introduction. Ikeja: Longman. 
_______ 2000. Patterns of Nigerian English intonation. English World-Wide 21.1:
63–80.
Kachru, B. B. 1985. Standards, codification and sociolinguistic realism: The English 
language in the outer circle. English in the world: teaching and learning the 
language and literatures. R. Quirk and H. G. Widdowson. Eds. Cambridge: 
Cambridge University Press and the British Council. 11-30.
___________ 1986. The alchemy of English: the spread, functions and models of non-
native Englishes. London: Pergamon 
___________ 1992. The second diaspora of English. English in its social context:
essays in historical sociolinguistics. T. W. Machan and C. T. Scott. Eds. Oxford:
Oxford University Press. 230-52.
___________ 1996. World Englishes: agony and ecstasy. Journal of Aesthetic 
Education 30.2. Illinois: University of Illinois Press. 
___________ 1997. World Englishes 2000: resources for research and teaching. World 
Englishes 2000. L. Smith. Ed. Honolulu: University of Hawaii Press. 209-
251.  
Katalin B. B. and Szilárd S. Eds. 2006. The Pronunciation of English. Budapest, 
Múzeum: Bölcsész Konzorcium.  
Katamba, F. 1989. An introduction to phonology. New York: Longman. 
Kenstowicz, M. 1994. Phonology in generative grammar. Cambridge, MA: Blackwell 
Publishers.  
____________ and Kisseberth, C. 1976. Topics in phonological
theory. New York: Academic Press.
Kerswill, P. 1985. A sociophonetic study of connected speech processes in Cambridge 
English: an outline and some data. Cambridge Papers in Phonetics and 
Experimental Linguistics 4. 
_________ 1987. Levels of linguistic variation in Durham. Journal of Linguistics 
23:25-49. 
__________ 1991. The interaction of sociophonetic features and connected speech 
processes. Non-Technical Summary (Research summary), ESRC End of 
Award Report, R000231056. Swindon: ESRC 
__________ 1996. Children, adolescents and language change. Language Variation 
and Change 8: 177-202.  
___________ and Williams, A. 2000. Creating a new town koine: children and 
language change in Milton Keynes. Language in Society 29: 65-115. 
 
 
___________ and Wright, S. 1990. On the limits of auditory transcription: a socio-
phonetic perspective. Language Variation and Change 2: 255–275. 
Kenworthy, J. 1987. Teaching English Pronunciation. London: Longman 
Khattab, G. 1999. A socio-phonetic study of English-Arabic bilingual children. Leeds 
Working Papers in Linguistics and Phonetics 7: 79-94. 
 
_________ 2002. /l/ production in English–Arabic bilingual speakers. International 
Journal of Bilingualism 6:335–353. 
Kiparsky, P. 1968. How abstract is phonology? Bloomington: Indiana University 
Linguistics Club.  
__________  1982. From cyclic phonology to lexical phonology. The structure of 
phonological representations. H. van der Hulst and N. Smith. Eds. Dordrecht: 
Foris. 1:131-175.  
 
Kirchner, R. n.d. Phonetics and phonology: understanding the sounds of speech.
Retrieved January 12, 2013, from http://www.ualberta.ca/~kirchner-on-
phonology.pdf 
 
Kisseberth, C. 1970. On the functional unity of phonological rules. Linguistic Inquiry 
1: 291-306. 
Knowles, G. O. 1978. The nature of phonological variables in Scouse. Sociolinguistic 
patterns in British English. P. Trudgill. Ed. London: Arnold. 80–90 
Kujore, O. 1985. English usage: some notable Nigerian variations. Ibadan: Evans 
Brothers Ltd. 
Labov, W. 1963. The social motivation of a sound change. Word 19:273–309.
_______ 1966. The social stratification of English in New York City. Washington,
D.C.: Center for Applied Linguistics.
_______  1972. Sociolinguistic patterns. Oxford: Blackwell. 
_______ 1990. The intersection of sex and social class in the course of linguistic 
change. Language Variation and Change 2:205-54.
_______ 1991. Sociolinguistic patterns. Philadelphia: University of Pennsylvania 
Press. 
_______  1994. Principles of linguistic change. Vol.1: Internal Factors. Oxford: 
Blackwell. 
________ 2001. Principles of linguistic change. Vol.2: Social Factors. Oxford: 
Blackwell. 
Ladefoged, P. 1971. Preliminaries to linguistic phonetics. Chicago: University of 
Chicago Press. 
______________ 1993. A course in phonetics. 3rd ed. Orlando: Harcourt Brace.
______________ 2003. Phonetic data analysis: an introduction to fieldwork and
instrumental techniques. Oxford: Blackwell. 
Lakoff, R. 1975. Language and woman's place. New York: Harper and Row.
Lass, R. 1984. Phonology: an introduction to basic concepts. Cambridge: Cambridge
University Press.
 
 
Laver, J. 1968. Assimilation in educated Nigerian English. ELT Journal 22.2: 156-160. 
_______1994. Principles of phonetics. Cambridge: Cambridge University Press. 
Lawson, E. and Stuart-Smith, J. 1999. A sociophonetic investigation of the 'Scottish'
consonants (/x/ and /f/) in the speech of Glaswegian children. Proceedings of 
the 14th International Congress of Phonetic Sciences. Berkeley: University of 
California. 2541–2544. 
Le Page, R. B. and Tabouret-Keller, A. 1985. Acts of identity: Creole-based 
approaches to language and ethnicity. Cambridge & New York: Cambridge 
University Press. 
Lewis, M. P., Simons, G. F. and Fennig, C. D. Eds. 2013. Ethnologue: languages of the
world, seventeenth edition. Retrieved September 23, 2013, from 
http://www.ethnologue.com. 
Li, W. 1994. Three generations, two languages, one family. Clevedon: Multilingual 
Matters.
_____, Milroy, L. and Pong, S. 1992. A two-step sociolinguistic analysis of code-
switching and language choice: the example of a bilingual Chinese
community in Britain. International Journal of Applied Linguistics 2:63-86. 
Liberman, M. 1975. The intonation system of English. PhD. Thesis. MIT. 
___________  and Prince, A. 1977. On stress and linguistic rhythm. Linguistic Inquiry
8: 249-336.
Lindblom, B. 1963. Spectrographic study of vowel reduction. Journal of the 
Acoustical Society of America 35:1173-1181. 
____________ 1990. Explaining phonetic variation: a sketch of the H and H theory. 
Speech production and speech modelling. W.J. Hardcastle and A.Marchal. 
Eds. Amsterdam: Kluwer. 403-439. 
Lippi-Green, R. L. 1989. Social network integration and language change in progress 
in a rural alpine village. Language in Society 18:213-34. 
Lively, S. E., Logan, J. S. and Pisoni, D. B. 1993. Training Japanese listeners to 
identify English /r/ and /l/: the role of phonetic environment and talker 
variability in learning new perceptual categories. Journal of the Acoustical 
Society of America 94: 1242–1255. 
Local, J. 2003. Variable domains and variable relevance: interpreting phonetic 
exponents. Journal of Phonetics 31.321-339. 
_______, Wells, W.H.G. and Sebba, M. 1985. Phonology for conversation: phonetic 
aspects of turn delimitation in London Jamaican. Journal of Pragmatics 9: 
309-330. 
_______, Kelly, J. and Wells, W. H. G. 1986. Towards a phonology of conversation: 
turn taking in Tyneside. Journal of Linguistics 22: 411–437. 
Low, E. L., Grabe, E. and Nolan, F. J. 2000. Quantitative characterizations of speech 
rhythm: syllable-timing in Singapore English. Language and Speech 43: 377–
402. 
Lutz, A. 1991. Phonotaktisch gesteuerte Konsonantenveränderungen in der 
Geschichte des Englischen. Tübingen: Niemeyer. 
 
 
Macaulay, R. 1977. Language, social class and education: a Glasgow study. 
Edinburgh: University of Edinburgh Press. 
Mannell, R. 2008. Distinctive features. Retrieved May 7, 2009, from
http://clas.mq.edu.au/speech/phonetics/phonology/features/distinctive_features.pdf.
Marsden, S. 2006. A sociophonetic study of labiodental /r/ in Leeds. Leeds Working
Papers in Linguistics and Phonetics 11. Retrieved January 12, 2011, from
http://www.leeds.ac.uk/linguistics/WPL/WP2006/7.pdf.
Mbassi-Manga, F. 1973. English in Cameroon: a study of historical contacts, patterns 
of usage and current trends. PhD. Thesis. Leeds University. 
McArthur, T. 1998. The English languages. Cambridge: Cambridge University Press. 
Mees, I.M. and Collins, B. 1999. Cardiff: a real-time study of glottalisation. Urban 
voices: accent studies in the British Isles. P. Foulkes and G. J. Docherty. Eds. 
London: Arnold. 185-202. 
Milroy, J. 1980. Language and social networks. Oxford: Basil Blackwell.  
_______  1981. Regional accents of English. Belfast: Blackstaff. 
_______  1987. Language and social networks 2nd ed. Oxford: Blackwell. 
_______  2002. Social networks. The handbook of Language variation and change. J. 
K. Chambers, P. Trudgill and N.Schilling-Estes. Eds. MA: Blackwell. 549–
572. 
_______ and Milroy, L. 1997. Varieties and variation. The handbook of 
sociolinguistics. F. Coulmas. Ed. Oxford: Blackwell. 47-64.
Modaressi, Y. 1978. A sociolinguistic analysis of modern Persian. PhD. Thesis.
University of Kansas.
Mohanan, K. P. 1982. Lexical phonology. PhD. Thesis. MIT.  
Muhlhausler, P. 1979. Structural expansion and the process of creolization. St. Thomas 
Conference on Creole Studies. Manuscript.  
Nagy, N. and Reynolds, B. 1997. Optimality Theory and variable word-final deletion 
in Faetar. Language Variation and Change 9:37–55. 
Ngefac, A. 2003. Extra-linguistic correlates of Cameroon English. PhD. Thesis.
University of Yaounde I.
Nguyen, T. A. T. and Ingram, J. 2004. A corpus-based analysis of transfer effects and 
connected speech processes in Vietnamese English. Proceedings of the 10th 
Australian International Conference on Speech Science & Technology 
Macquarie University, Sydney, Australian Speech Science & Technology 
Association Inc. 
Nichols, P. C. 1983. Linguistic options and choices for black women in the rural south. 
Language, gender and society. B. Thorne, C. Kramarae and N. Henley. Eds. 
Cambridge, MA: Newbury House. 54-68. 
Nolan, F. J. 1997. Speaker recognition and forensic phonetics. The handbook of 
phonetic sciences. W. J. Hardcastle and J. Laver. Eds. Oxford: Blackwell. 
744–767. 
 
 
__________  and Farrar, K. J. 1999. Timing of f0 peaks and peak lag. Proceedings of 
the 14th International Congress of Phonetic Sciences. Berkeley: University of 
California. 961–964.
_________  and Kerswill, P. E. 1990. The description of connected speech processes. 
Studies in the pronunciation of English: a commemorative volume in honour 
of A. C. Gimson. S. Ramsaran. Ed. London: Routledge. 295–316.  
O'Barr, W. and Atkins, B. 1980. Women's language or powerless language? Women
and language in literature and society. S. McConnell-Ginet, R. Borker and N. 
Furman. Eds. New York: Praeger. 93-110.  
Obilade, T. 1984. On the nativization of the English Language in Nigeria. 
Anthropological Linguistics 26.2:170-185. Retrieved May 7, 2009, from 
http://www.jstor.org/stable/30027502. 
Odumuh, A. E. 1984. Aspects of the syntax of ‘educated’ Nigerian English. Journal of 
Nigerian English Studies Association (JNESA) 9:68-78. 
Ogu, J. N. 1992. A historical survey of English and the Nigerian situation. Lagos: 
Kraft Books Ltd. 
Ogunsiji, A. 2004. Status, features and functions of English in Nigeria and their 
implications for EL2 teaching/learning. Language and discourse in society. A. 
Oyeleye. Ed. Ibadan: Hope Publications. 
Ohala, J. J. 1983. The origin of sound patterns in vocal tract constraints. The 
production of speech. P. F. MacNeilage. Ed. New York: Springer-Verlag. 
189-216. 
Ojareche, R. A. 2009. A sociophonological analysis of Nigerian male and female 
television newscasters’ speech. MA Project. Dept. of English. University of 
Ibadan. 
 
Ola Orie, O. and Pulleyblank, D. 2002. Yoruba vowel elision: minimality effects. 
Natural Language & Linguistic Theory 20.1:101-156. Retrieved June 3, 2009, 
from http://www.jstor.org/stable/4048051. 
Oladipupo, O. R. 2008. The English intonation patterns of selected reporters on 
television stations in Lagos State Nigeria. MA Project. Dept. of English. 
University of Ibadan. xxiii + 118. 
Olaniyi, O. K. 2007. Approximation to Standard British English: Nigerian English 
Suprasegmentals. Reinventing the English language in Nigeria in the context 
of globalization and decolonization. W. Adegbite and B. Olajide. Eds. 
109-123. 
 
Osa, O. 1986. English in Nigeria: 1914-1985. The English Journal 75.3:38-40. 
Retrieved May 7, 2009, from http://www.jstor.org/stable/818853. 
Oyebade, F. 1998. A course in phonology. Ijebu-Ode: Shebiotimo Publications. 
Paus, C. 1997. A comparison of the sociolinguistic conditioning of phonetic reduction 
in Moscow Russian and Northwestern Mexican Spanish. PhD. Thesis. 
University of Southern California, Los Angeles, CA. 
Perkell, J. S., Zandipour, M., Matthies, M. L. and Lane, H. 2002. Economy of effort 
in different speaking conditions I: a preliminary study of intersubject 
differences and modeling issues. Journal of the Acoustical Society of America 
112:1627–1641. 
Pike, K. 1948. Problems in the teaching of practical phonemics. Language Learning 
1.2: 3-8. 
Platt, J., Weber, H. and Ho, M. 1984. The new Englishes. London: Routledge & 
Kegan Paul. 
Prince, A. and Smolensky, P. 1991. Notes on connectionism and harmony theory in 
linguistics. MSc. Project. 
________ and __________ 1993. Optimality theory: constraint interaction in 
generative grammar. Report no. RuCCS-TR-2. New Brunswick, NJ: Rutgers 
University Center for Cognitive Science. 
Przedlacka, J. 1999. Estuary English: a sociophonetic study. PhD. Thesis. University 
of Warsaw. 
Pulleyblank, D. 1986. Tone in lexical phonology. Dordrecht: D. Reidel. 
Rajend, M. 2010. Socio-phonetics and social change: deracialisation of the GOOSE 
vowel in South African English. Journal of Sociolinguistics 14.1: 3-33. 
Roach, P. 1992. Introducing phonetics. London: Penguin English. 
_______ 2000. English phonetics and phonology: a practical course. 3rd ed. 
Cambridge: Cambridge University Press. 
_______  and Widdowson, H. G. 2001. Phonetics. Oxford: Oxford University Press. 
Romaine, S. 1978. Postvocalic /r/ in Scottish English: sound change in progress? 
Sociolinguistic patterns in British English. P. Trudgill. Ed. London: Edward 
Arnold. 144–56. 
Salami, A. 1968. Defining a standard Nigerian English. JNESA 2: 99-106. 
Salami, O. 2001. Nigerian English: a short introduction. The English compendium 1 & 
2. Lagos: Department of English Lagos State University. 
Sampson, G. 1980. Schools of linguistics: competition and evolution. London: 
Hutchinson. 
Sani, M.A.Z. 1989. An introductory phonology of Hausa. Kano: Triumph Publishing. 
Sankoff, D. and Laberge, S. 1978. The linguistic market and the statistical explanation 
of variability. Linguistic variation: models and methods. D. Sankoff. Ed. New 
York: Academic Press. 239-50. 
Schane, S. A. 1973. Generative phonology. New Jersey: Prentice Hall. 
Schmidt, A. 1985. Young people’s Dyirbal: an example of language death from 
Australia. Cambridge and New York: CUP. 
Scobbie, J. 2005. Flexibility in the face of incompatible English VOT systems. 
Laboratory Phonology 8. C. T. Best, L. Goldstein and D. H. Whalen. Eds. 
Berlin: Mouton de Gruyter. 
Shockey, L. 2003. Sound patterns of spoken English. Oxford: Blackwell. 
Simo Bobda, A. 1994. Aspects of Cameroon English phonology. Bern: Peter Lang. 
____________  1995. The phonologies of Nigerian English and Cameroon English. 
New Englishes: a West African perspective. A. Bamgbose, A. Banjo and A. 
Thomas. Eds. Ibadan: Mosuro Publishers. 248-268. 
____________  2007. Some segmental rules in Nigerian English phonology. English 
World-Wide 28.4:279-310. 
____________ and Mbangwana, P. 1993. An introduction to spoken English. 
University of Lagos Press.  
Simpson, A. and Oyètádé, B. A. 2007. Nigeria: ethno-linguistic competition in the 
giant of Africa.  Language and national identity in Africa. A. Simpson. Ed. 
Oxford: Oxford University Press. 172-198.  
Skandera, P. and Burleigh, P. 2005. A manual of English phonetics and phonology. 
Tübingen: Gunter Narr Verlag. 
Sogunro, B. O. 2012. A sociophonological analysis of Hausa English, Igbo English 
and Yoruba English varieties in Nigeria. PhD. Thesis. Department of 
Linguistics and African Languages. University of Ibadan. i-xx+198. 
 
Soneye, T. 2008. CH digraph in English: patterns and propositions for ESL pedagogy. 
Papers in English and linguistics. R. Atoye. Ed. Ile-Ife: The Linguistic 
Association. 
 
Spencer, J. 1971. Ed. The English Language in West Africa. London: Longman. 
Stampe, D. 1972. How I spent my summer vacation (A dissertation on natural 
phonology). PhD. Thesis. University of Chicago.  
Strauss, S. 1982. Lexicalist phonology of English and German. Dordrecht: Foris. 
Strevens, P. D. 1965. Pronunciation of English in West Africa. Papers in Language 
and Language Teaching. Oxford: Oxford University Press. 113-124. 
Stuart-Smith, J. 1999. Glasgow: accent and voice quality. Urban voices. P. Foulkes 
and G. J. Docherty. Eds. London: Arnold. 203–222.  
Szakay, A. 2006. Rhythm and pitch as markers of ethnicity in New Zealand English. 
Proceedings of the 11th Australian International Conference on Speech Science & 
Technology, Auckland. Auckland: University of Auckland Press. 421-426.  
Taiwo, C. O. 1980. The Nigeria education system. Lagos: Thomas Nelson. 
Tamuno, T. 1979. Opening address to the ninth conference of the Nigerian English 
Studies Association. Varieties and functions of English in Nigeria. E. 
Ubahakwe. Ed. Ibadan: African University Press. 
Tannen, D. 1991. You just don’t understand: women and men in conversation. 
London: Virago.  
Taylor, J. 1998. What, pray, is happening to dear old RP? La Linguistique 34.2: 141-
151. 
Thomas, E. R. 2000. Spectral differences in /ai/ offsets conditioned by voicing of the 
following consonant. Journal of Phonetics 28:1–26. 
Tiffen, B. W. 1974. The intelligibility of Nigerian English. PhD. Thesis. University of 
London. 
___________ 1968. Language education in Commonwealth Africa. Language in 
education. J. Dakin. Ed. London: Oxford University Press.  
Tomori, S. H. O. 1967. A study in the syntactic structures of the written English of 
British and Nigerian grammar school pupils. PhD. Thesis. University of 
Ibadan. 
Trömel-Plötz, S. 1984. Gewalt durch Sprache. Frankfurt am Main: Fischer. 
Trubetzkoy, N. S. 1939. Grundzüge der Phonologie. Göttingen: Vandenhoeck and 
Ruprecht. 
Trudgill, P. 1972. Sex, covert prestige and linguistic change in the urban British 
English of Norwich. Language in Society 1:179-196.  
_________ 1974. The Sociolinguistic differentiation of English in Norwich. 
Cambridge: Cambridge University Press. 
_________ 1983. On dialect: social and geographical perspectives. Oxford: Basil 
Blackwell. 
________  1986. Dialects in contact. Oxford: Blackwell. 
Udofot, I. 1997. The rhythm of spoken Nigerian English. PhD. Thesis. University of 
Uyo. 
_______   2004. Varieties of spoken Nigerian English. The domestication of English in 
Nigeria. V. Awonusi and E. Babalola. Eds. Lagos: University of Lagos Press. 
93–113.  
Vennemann, T. 1974. Phonological concreteness in natural generative grammar. 
Toward tomorrow’s linguistics. R. Shuy and C. J. Bailey. Eds. Washington, 
DC: Georgetown University Press. 
Warren, P. 2005. Patterns of late rising in New Zealand English: intonational variation 
or intonational change? Language Variation and Change 17.2:209–30. 
________ and Britain, D. 2000. Intonation and prosody in New Zealand English. New 
Zealand English. A. Bell and K. Kuiper. Eds. Amsterdam: John 
Benjamins. 146–172.  
Weisser, M. 2005. Assimilation. Retrieved October 2, 2009, from http://www.phil.tu-
chemtz.de. 
Wells, J. C. 1982. Accents of English. Cambridge: Cambridge University Press. 
_________ 1994. The Cockneyfication of R.P.? Stockholm studies in English. G. 
Melchers and N.-L. Johannesson. Eds. Stockholm: Almqvist & Wiksell. 84: 
198-205. 
__________ 1995. Age grading in English pronunciation preferences. Proceedings of 
the 13th international congress of phonetic sciences. Stockholm: University 
of Stockholm. 696–699.  
_________  2000. Longman pronunciation dictionary. New ed. London: Longman. 
Wodak, R. and Benke, G. 1997. Gender as a sociolinguistic variable: New perspectives 
on variation studies. The handbook of sociolinguistics. F. Coulmas. Ed. 
Oxford: Blackwell. 127-150. 
Wolfram, W. 1969. A sociolinguistic description of Detroit Negro speech. Washington, 
DC: Center for Applied Linguistics. 
Wright, S. and Kerswill, P. 1989. Electropalatography in the analysis of connected 
speech processes. Clinical Linguistics & Phonetics 3:49–57. 
Yusuf, O. 2010. Basic linguistics for Nigerian languages. Ijebu-Ode: Shebiotimo 
Publications. 
Zue, V. and Laferriere, M. 1979. Acoustic study of medial /t, d/ in American English. 
Journal of the Acoustical Society of America 66.4: 1039-1050. 
 
 
APPENDICES 
 
Appendix A: Distribution of participants by social variables 
 
Region        Age     Gender    No
East          Young   Male      20
East          Adult   Male      20
East          Young   Female    20
East          Adult   Female    20
                                80

North         Young   Male      30
North         Adult   Male      30
North         Young   Female    30
North         Adult   Female    30
                                120

South-South   Young   Male      20
South-South   Adult   Male      20
South-South   Young   Female    20
South-South   Adult   Female    20
                                80

West          Young   Male      20
West          Adult   Male      20
West          Young   Female    20
West          Adult   Female    20
                                80

Grand Total                     360
 
 
 
 
 
 
 
Appendix B.   The Semi-Spontaneous Speech Data 
 
TEST 1 
 
1) I’ve met Peter at the station 
2) There are ten boys 
3) She’s a good girl 
4) You will miss your train 
5) Has your letter come? 
6) Those young men 
7) What you need is a good job. 
8) Would you leave here? 
9) Doesn’t she know her teacher? 
10) He won’t do it 
11) No, he kept quiet 
12) You mustn’t over-eat 
13) I found five 
14) No, he is an old man 
15) That was cold lunch 
16) No, he seemed glad 
17) No, but they robbed both banks 
18) You jumped well 
19) I want more of Him 
20) I met him after a while 
21) Their action is wrong 
22) They maintain law and order 
23) Know what? I don’t have an Idea of it. 
24) I was at a media event 
25) We chose six players. 
26) Yea! I have to go 
27) Oh! It was a live show 
28) He’s a nice boy 
29) The dog’s mine 
30) She wore a black dress 
31) The job was half-done 
 
TEST 2 
 
A.  Good morning. I’d like to inquire about the advertised car 
B.  Yes, we have the car here. Its features will amaze you 
A.  Is the information about it valid? 
B.  Yes, of course. It is equipped with power-assisted steering, which I suppose, is 
the most important piece of information that you need 
A.  Well, obviously, but...do you think it is really ice blue with darker blue inside?  
B.  Oh... yes, this is the exact colour of the car.  
A.  All right, then. Can I arrange a test drive for tomorrow?  
B.  Y..es, you can have it tomorrow... It’ll cost you ten pounds in case you don’t buy 
it 
A.  Ten pounds!! Could you rather make it five pounds? 
B.  Sorry, madam, we have a fixed price for all customers.  
A.  Well...in that case, I’ll be there tomorrow. Goodbye.  
B.  Goodbye and God bless you.    
 
 
 
 
Appendix C. Questionnaire for adult speakers 
 
Thank you very much for voluntarily participating in this research data gathering exercise to investigate the use of 
English in Nigeria. Please fill the questionnaire carefully. The information being gathered is purely for a research 
purpose and your responses shall be treated without reference to your name or personality. 
1.    Personal information 
Sex: M  F 
 Age Grade: 16-30   31-49   50+ 
 Tribe____________________ 
 Place of birth? _________________________________  
Did you grow up in your region of origin (where your indigenous language is spoken)?  Yes    No 
If no, where did you grow up? ___________________ 
Have you spent a greater part of your life in your region of origin? Yes    No 
If no, where? ________________________ 
Have you ever lived in Britain, America, Canada or any other country where English is spoken as a first 
language?  Yes  No 
If yes, where?________________________ For how long? ______________________ 
2.    Educational Background 
Highest Qualification/Level of education     
SSCE       NCE /OND           Undergraduate  HND/B.Sc/B.A MA/M.Sc/Ph.D 
Course of Study________________________________ 
What is the nature/status of the schools you attended or are attending?  
Private Primary         Public Primary           Others______________ 
Private Secondary              Public Secondary                    Others______________ 
Private University              Public University           Others______________ 
Were you taught/are you being taught by teachers who are native speakers of English?    
Yes        No 
Were you exposed to diction (pronunciation) at any level of your education?  
Yes   No 
If yes, at what level? ____________________________________________________ 
 
3.    Linguistic background 
 Parental linguistic Background 
Father:  literate  illiterate  
Mother: literate  illiterate 
Major language spoken by father ____________________________ 
Major language spoken by mother _____________________________ 
 
 
 
 
 
QUESTIONNAIRE ON THE USE OF ENGLISH IN NIGERIA 
 
 
Language of parental instruction as a child 
English Yes  No   
Any other language(s) ________________________________________ 
First language spoken as a child  
English Yes  No 
Any other language(s) ________________________________________ 
4.    Socio-economic background (respond as applicable to you) 
What do you do for a living (Your Profession)? _______________________________ 
 How long have you been working? ______________What is your Position at work?___________________ 
 What part of this city do you live in?_________________________________________ 
 Rank yourself/your family economic status along any of the following levels: 
 Low   Middle Class   High 
 Do you have access to DSTV or any other cable television at home? Yes  No 
 What is your favourite channel? ____________________________________________________________ 
Have you ever travelled out of Nigeria to other countries before? 
 Yes  No 
 If yes, where______________, ______________, ______________, ______________, _______________
       
How many times?_______________________________________________________ 
 
Thank you very much. 
 
 
 
Appendix D. Questionnaire for young speakers 
 
Thank you very much for voluntarily participating in this research data gathering exercise to investigate the use of 
English in Nigeria. Please fill the questionnaire. The information being gathered is purely for a research purpose 
and your responses shall be treated without reference to your name or personality. 
1.    Personal information 
Sex: M  F 
 Age Grade: 16-30   31-49   50+ 
 Tribe____________________ 
 Where were you born? _________________________________  
Did you grow up in your region of origin (where your indigenous language is spoken)?  Yes    No 
If no, where did you grow up? ___________________ 
Have you spent a greater part of your life in your region of origin? Yes    No 
If no, where? ________________________ 
Have you ever lived in Britain, America, Canada or any other country where English is spoken as a first 
language?  Yes  No 
If yes, where?________________________ For how long? ______________________ 
2.    Educational Background 
Highest Qualification/Level of Education     
SSCE       NCE /OND           Undergraduate  HND/B.Sc/B.A MA/M.Sc/Ph.D 
Course of Study________________________________ 
What is the nature/status of the schools you attended or are attending?  
Private Nursery/Primary        Public Primary   Others______________ 
Private Secondary              Public Secondary                     Others______________ 
Private University              Public University            Others______________ 
Were you taught/are you being taught by teachers who are native speakers of English?    
Yes        No 
Were you exposed to diction (pronunciation) at any level of your education?  
Yes   No 
If yes, at what level? ____________________________________________________ 
 
3.    Linguistic background 
 Parental linguistic Background 
Father:  literate  illiterate  
Mother: literate  illiterate 
Major language spoken by father ____________________________ 
Major language spoken by mother _____________________________ 
 
 
 
 
 
 
Language of parental instruction as a child 
English Yes  No   
Any other language(s) ________________________________________ 
First language spoken as a child  
English Yes  No 
Any other language(s) ________________________________________ 
4.    Socio-economic background (respond as applicable to you) 
What are your parents’ careers (Professions)?   
 Father______________________________  Mother___________________________ 
 What do you do for a living? _______________________________ 
 How long have you been working? ______________What is your Position at work?___________________ 
 What area of this city do you live in?_________________________________________ 
 Rank yourself/your family economic status along any of the following levels: 
 Low   Middle Class   High 
 Do you have access to DSTV or any other cable television at home? Yes  No 
 What is your favourite channel? ____________________________________________________________ 
Have you ever travelled out of Nigeria on holidays to other countries before? 
 Yes  No 
 If yes, where______________, ______________, ______________, ______________, ____________ 
  
How many times?_______________________________________________________ 
 
Thank you very much. 
 
 