ABOUT THE BOOK

Probability theory has a very long history dating back to the seventeenth century. It is a well-established branch of mathematics that has applications in every area of human discipline and daily experience.

This is an introductory textbook dealing with probability and stochastic processes. It is designed for undergraduate and postgraduate students in Statistics, Mathematics, the Physical and Social Sciences, Engineering and Computer Science. It presents a thorough treatment of probability and stochastic ideas and methods necessary for a firm understanding of the subject. The text can be used in a variety of course lengths, levels, and areas of emphasis.

The material is divided into three parts. The first part covers basic probability topics for undergraduate students. The second part covers advanced probability topics that are of interest to postgraduate students, while the third part deals with topics in stochastic processes that are taught at both undergraduate and postgraduate levels.

Very little statistical background is assumed in order to obtain full benefit from the use of the text. Also, numerous examples and practice questions are included to aid understanding of all the subject areas covered by the book.

The publication of this book is a demonstration of our commitment to the provision of relevant and current materials for Statistics students in higher institutions of learning.

FASCO PUBLISHERS
ISBN 978-978-52890-0-8

INTRODUCTION TO PROBABILITY AND STOCHASTIC PROCESSES (WITH APPLICATIONS)

(c) 2014 by Shittu, Olanrewaju I., Otekunrin, Oluwaseun O., Udomboso, Christopher G., Adepoju, Kazeem A.

First Published: May, 2014

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage or retrieval system, without permission in writing from the publisher.

Fasco Publishers, 67 Gbadebo Street, Mokola, Ibadan.
Tel: 08032934309, 08051036056

ISBN 978-978-52890-0-8

Printed in Ibadan by Fasco Printing Works, Ibadan.

FOREWORD

This impressive book by Shittu O. I., Otekunrin O. A., Udomboso C. G., and Adepoju K. A. (of the Department of Statistics, University of Ibadan, Nigeria) encompasses the essence of probability and stochastic processes under a common shade. The authors are to be commended for their lucid presentation as well as their broad coverage of the subject matter.

A special feature of the book is that the exercises and the text form an integrated pattern. These exercises are designed to encourage the student to reread the text, practise them and become thoroughly familiar with the techniques described. This will help in impressing on the student the methods and logic of establishing the techniques.

Students involved in statistics-oriented disciplines, and professional statisticians in a wide variety of fields, will find this book a highly useful volume for study and application. This is a scholarly undertaking by the four authors, and I have a full appreciation for the job nicely done.

G. N. Amahia (Ph.D)
Professor of Statistics
University of Ibadan
PREFACE

Probability theory has a very long history dating back to the seventeenth century. It is a well-established branch of mathematics that has applications in every area of human discipline and daily experiences.

This is an introductory textbook dealing with probability and stochastic processes. It is designed for undergraduate and postgraduate students in Statistics, Mathematics, the physical and social sciences, engineering and computer science. It presents a thorough treatment of probability and stochastic ideas and methods necessary for a firm understanding of the subject. The text can be used in a variety of course lengths, levels, and areas of emphasis.

The material is divided into three parts. The first part covers basic probability topics for undergraduate students. The second part covers advanced probability topics that are of interest to postgraduate students, while the third part deals with topics in stochastic processes that are taught at both undergraduate and postgraduate levels.

Very little statistical background is assumed in order to obtain full benefits from the use of the text. Also, numerous examples and practice questions are included to aid understanding of all the subject areas covered by the book.

The publication of this book is a demonstration of the authors' commitment to the provision of relevant and current materials for Statistics students in higher institutions of learning.

This text, which cannot be said to be exhaustive, was developed from years of learning and teaching of probability and stochastic processes. While we claim responsibility for any errors that could have been made inadvertently in this first edition, we welcome comments and objective criticisms from the users of this book.

TABLE OF CONTENTS

Foreword
Preface
Contents

Chapter 1 - The Mathematics of Choice
1.0 Introduction
1.1 Fundamental Principle of Counting
1.2 Permutation
1.3 Combination
1.4 Stirling Numbers of the Second Kind
1.5 Allocation and Matching Problems

Chapter 2 - Elements of Probability
2.1 Introduction
2.2 Definition of Terms and Concepts
2.3 The Approaches to the Definition of Probability
2.4 Probability of an Event
2.5 Consequences of Probability Axioms
2.6 Rules of Probability
2.7 Venn Diagrams
2.8 The Principle of Inclusion and Exclusion
2.9 Conditional Probability and Independence
2.10 Statistical Independence

Chapter 3 - Conditional Probability and Bayes Theorem
3.1 Conditional Probability
3.2 Independence
3.3 Bayes Theorem
3.4 Total Probability
Chapter 4 - Fundamentals of Probability Functions
4.1 Introduction
4.2 Probability Density Function (pdf)
4.3 Distribution Function
4.4 Distribution Function for Discrete Random Variables
4.5 Joint Distribution Function
4.5.1 Conditional Distribution of Jointly Distributed Random Variables
4.6 Independence of Functions of Random Variables
4.7 Functions of Random Variables

Chapter 5 - Some Discrete Probability Distributions
5.1 Bernoulli Trials and Binomial Distribution
5.2 Binomial Distribution
5.3 Poisson Distribution
5.4 Properties of a Poisson Experiment
5.5 Mean and Variance of a Poisson Distribution
5.6 The Poisson Distribution as an Approximation to the Binomial Distribution
5.7 Hypergeometric Distribution
5.8 Mean and Variance of Hypergeometric Distribution
5.9 Binomial Distribution as an Approximation to the Hypergeometric Distribution
5.10 Negative Binomial and Geometric Distributions
5.11 Negative Binomial Distribution
5.12 Geometric Distribution
5.13 Multinomial Distribution

Chapter 6 - Some Continuous Probability Distributions
6.0 Introduction
6.1 Normal Distribution
6.2 Exponential Distribution
6.3 Gamma Distribution
6.4 Pareto Distribution
6.5 Maxwell Distribution

Chapter 7 - Probability Generating Functions (PGF)
7.1 Introduction
7.2 Properties of PGF
7.3 Probability Generating Function Approach for Deriving Means and Variances of Some Discrete Distributions
7.4 Binomial Distribution

Chapter 8 - Moment Generating Functions
8.1 Moment Generating Function
8.2 m.g.f. for Bivariate Distribution
8.3 Obtaining Moments from m.g.f.

Chapter 9 - Characteristic Functions
9.1 The Characteristic Function (c.f.)
9.2 Exponential Distribution
9.3 Gamma Distribution
9.4 Characteristic Function of the Sum of Independent Random Variables
9.5 Some Special Probability Distributions
9.6 The Inversion Formula

Chapter 10 - Measurable Functions
10.1 Some Definitions
10.2 Obtaining Countable Classes of Disjoint Sets
10.3 Abstract Model for Probability of an Event
10.4 Axiom for Finite Probability Space
10.5 The Halley-De Moivre Theorem
10.6 Probability Space
10.7 Sigma-Field (σ-Field)
10.8 Borel Field
10.9 Random Variable in Measure Space

Chapter 11 - Limit Theorems and Law of Large Numbers
11.1 Introduction
11.2 Concept of Limit
11.3 Markov's Inequality
11.4 Bienayme-Chebyshev's Inequality
11.5 Convergence of Random Variables
11.6 Laws of Large Numbers

Chapter 12 - Principles of Convergence and Central Limit Theorem
12.1 Introduction
12.2 Convergence of Random Variables
12.3 Cauchy-Schwarz Inequality
12.4 Borel-Cantelli Lemma
12.5 The Central Limit Theorem
12.6 The Central Limit Theorem
12.7 Strong Law of Large Numbers for Independent Random Variables
12.8 Bolzano-Cauchy Criterion for Convergence
12.9 First Borel-Cantelli Lemma
12.10 Second Borel-Cantelli Lemma
12.11 The Zero-One Law
12.12 Limit Theorems for Sums of Independent R.V.'s; Lindeberg-Levy Theorem

Chapter 13 - Introduction to Brownian Motion
13.1 Brownian Motion (Wiener Process)
13.2 Brownian Process
13.3 Multinomial Distribution and Gaussian Process
13.4 Properties of a Brownian Motion (B.M.)

Chapter 16 - Introduction to Stochastic Processes
16.1 Basic Concepts
16.2 Discrete-Time Markov Chains
16.3 Classification of General Stochastic Processes
16.4 Classical Types of Stochastic Processes
16.5 Markov Processes

Chapter 17 - Generating Functions and Markov Chains
17.1 Introduction
17.2 Basic Definitions and Tail Probabilities
17.3 Moment-Generating Function
17.4 Convolutions
17.5 Compound Distributions
17.6 Markov Chain
17.7 Stationarity Assumption
17.8 Absorbing Markov Chain

Chapter 18 - Equilibrium (Steady State) and Passage Time Probabilities
18.1 Introduction
18.2 Graph of Marginal Distribution
18.3 Stationary Distribution
18.4 First-Passage and First-Return Probabilities
18.5 Distribution of Number of Steps for First Passage
18.6 First Return (Recurrence)

Chapter 19 - Chapman-Kolmogorov Equations and Classification of States
19.1 Introduction
19.2 Classification of States
19.3 Discrete Time Process
19.5 Continuous Time Process
19.6 The Exponential Process

Chapter 20 - Introduction to the Theory of Games and Queuing Models
20.1 Games Theory
20.2 Gambler's Ruin
20.3 Queuing Theory
20.4 The Basic Queuing Process
20.5 Poisson Process and Exponential Distribution
20.6 Classification of Queuing Systems
20.7 Poisson Queues

PART ONE

CHAPTER 1
THE MATHEMATICS OF CHOICE

1.0 Introduction
Many real-life situations require enumerating the number of possible ways of taking a number of decisions out of many available ones, the number of ways an event can occur, or the number of possible outcomes of an experiment. All of these require the act of counting, choosing, arranging, or a combination of the three. It is therefore apt to introduce the reader first to some basic principles of counting.

1.1 Fundamental Principle of Counting
If one experiment can result in n possible outcomes and a second experiment can result in k possible outcomes, then nk is the total number of possible outcomes from the two experiments.

More generally, consider a finite sequence of decisions, and suppose the number of choices for each individual decision is independent of the decisions made previously in the sequence. Then the number of ways to make the whole sequence of decisions is the product of these numbers of choices. In particular, when n distinct items are arranged one after another, the product is n × (n − 1) × ... × 1 = n!.

Example 1: The number of four-letter words that can be formed by rearranging the letters in the word PLAN is 4! = 24:

PLAN PLNA PALN PANL PNLA PNAL
LPAN LPNA LAPN LANP LNPA LNAP
APLN APNL ALPN ALNP ANPL ANLP
NPLA NPAL NLPA NLAP NAPL NALP
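The product rule and Example 1 are easy to check mechanically. Below is a minimal Python sketch (ours, not the book's) that enumerates the rearrangements of PLAN:

```python
from itertools import permutations

# Four decisions with 4, 3, 2 and 1 choices: the product rule gives
# 4 x 3 x 2 x 1 = 4! = 24 distinct four-letter words.
words = {''.join(p) for p in permutations("PLAN")}
print(len(words))  # 24
```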
1.1.2 The Second Counting Principle (The Principle of Inclusion and Exclusion)
If a set is the disjoint union of two (or more) subsets, then the number of elements in the set is the sum of the numbers of elements in the subsets, i.e.

n(A ∪ B) = n(A) + n(B),

implying that |A ∪ B| = |A| + |B| if A and B are disjoint.

Theorem 1: |A ∪ B| < |A| + |B| if A and B are not disjoint.
This is because |A| + |B| counts every element of A ∩ B twice. Let us illustrate this with the following example.

Example 2: If A = {2, 3, 4, 5, 6}, |A| = 5 and B = {3, 4, 5, 6, 7}, |B| = 5, then |A| + |B| = 10. But A ∪ B = {2, 3, 4, 5, 6, 7}, so |A ∪ B| = 6. Since A and B are not disjoint, |A ∪ B| < |A| + |B|.

Compensating for this double counting yields the formula

|A ∪ B| = |A| + |B| − |A ∩ B| ............ eqn (1)

From our example, A ∩ B = {3, 4, 5, 6}, so |A ∩ B| = 4 and |A ∪ B| = 5 + 5 − 4 = 6, thus verifying equation (1).

Theorem 2: |A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C| for three sets A, B and C.

Proof: We know from equation (1) above that |A ∪ B| = |A| + |B| − |A ∩ B|. Then, for 3 sets,

|A ∪ B ∪ C| = |A ∪ [B ∪ C]| = |A| + |B ∪ C| − |A ∩ [B ∪ C]|

Applying equation (1) to |B ∪ C| gives

|A ∪ B ∪ C| = |A| + [|B| + |C| − |B ∩ C|] − |A ∩ [B ∪ C]| ............ eqn (2)

Because A ∩ [B ∪ C] = (A ∩ B) ∪ (A ∩ C), we can apply equation (1) again to obtain

|A ∩ [B ∪ C]| = |A ∩ B| + |A ∩ C| − |A ∩ B ∩ C| ............ eqn (3)

Finally, a combination of equations (2) and (3) yields

|A ∪ B ∪ C| = [|A| + |B| + |C|] − [|A ∩ B| + |A ∩ C| + |B ∩ C|] + |A ∩ B ∩ C| ............ eqn (4)

thus proving Theorem 2. From this derivation, we notice that an element of A ∩ B ∩ C is counted 7 times in equation (4): the first 3 times with a plus sign, then 3 times with a minus sign, and then once more with a plus sign.

Example 1.1: If A = {1, 2, 3, 4}, B = {3, 4, 5, 6} and C = {2, 4, 6, 7}, then A ∪ B ∪ C = {1, 2, 3, 4, 5, 6, 7}, so |A ∪ B ∪ C| = 7 ............ (a)
Here |A| = |B| = |C| = 4, so |A| + |B| + |C| = 12. Also A ∩ B = {3, 4}, A ∩ C = {2, 4} and B ∩ C = {4, 6}, so |A ∩ B| + |A ∩ C| + |B ∩ C| = 6; and A ∩ B ∩ C = {4}, so |A ∩ B ∩ C| = 1. Therefore |A ∪ B ∪ C| = 12 − 6 + 1 = 7 ............ (b)
Thus (a) = (b), confirming Theorem 2.

Generally, the Principle of Inclusion and Exclusion (PIE) states that if A₁, A₂, ..., Aₙ are finite sets, the cardinality of their union is

|A₁ ∪ A₂ ∪ ... ∪ Aₙ| = Σᵢ |Aᵢ| − Σ_{i<j} |Aᵢ ∩ Aⱼ| + Σ_{i<j<k} |Aᵢ ∩ Aⱼ ∩ Aₖ| − ... + (−1)ⁿ⁺¹ |A₁ ∩ A₂ ∩ ... ∩ Aₙ|

Letting S = A₁ ∪ A₂ ∪ ... ∪ Aₙ and Aᵢᶜ = S \ Aᵢ, the PIE can also be expressed in complement form as

|A₁ᶜ ∩ A₂ᶜ ∩ ... ∩ Aₙᶜ| = |S| − Σᵢ |Aᵢ| + Σ_{i<j} |Aᵢ ∩ Aⱼ| − ... + (−1)ⁿ |A₁ ∩ ... ∩ Aₙ|

Example: Consider arrangements of the letters of the word ARRANGE (seven letters, including two A's and two R's). (In what follows, P(n, r) = n!/(n − r)! denotes the number of permutations of n items taken r at a time.)
(i) Without restriction, the number of arrangements is 7!/(2! 2!) = 1260 ways. When the two R's occur together (treating RR as a single letter), the number of arrangements is 6!/2! = 360 ways; so the number of arrangements in which the two R's do not occur together is 1260 − 360 = 900 ways. Hence
P(two R's do not occur together) = 900/1260 = 0.714
(ii) If the two R's and the two A's both occur together, we have the five units (A, A), (R, R), N, G, E, i.e. 5! = 120 ways.

(C) Permutation when two things are not to occur together: Procedure
(a) Find the number of permutations without restriction.
(b) Find the number of permutations in which the two things occur together.
(c) The difference between (a) and (b) gives the number of arrangements in which the two things do not occur together.

Example 1.11: In how many ways can 10 different books be arranged on a shelf if two particular books are not to stand together?
Solution: Without restriction there are 10! arrangements; with the two particular books together there are 2! × 9! arrangements. The required number is 10! − 2! × 9! = 2,903,040 ways.

(D) When the number of items not occurring together is more than two:
Some kind of logic would have to be applied here. It is better illustrated with an example.

Example 1.13: In how many ways can 5 blue cars and 4 red cars be arranged in a straight car park if no two red cars are to stand together?
Solution: First, the 5 blue cars are positioned as indicated below:

X B X B X B X B X B X

The blue cars can be arranged in 5! ways. There are now 6 vacant positions (marked X), and the 4 red cars can be arranged in them in P(6, 4) = 360 ways. The required number of ways of parking the 5 blue cars and 4 red cars is 5! × P(6, 4) = 120 × 360 = 43,200 ways.
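Example 1.13 can also be verified by brute force. The sketch below (an illustration we added, treating the cars as distinguishable) checks the gap method against direct enumeration:

```python
from itertools import permutations
from math import factorial, perm

cars = ["B1", "B2", "B3", "B4", "B5", "R1", "R2", "R3", "R4"]
no_adjacent_reds = sum(
    1
    for row in permutations(cars)
    if not any(a[0] == "R" and b[0] == "R" for a, b in zip(row, row[1:]))
)
print(no_adjacent_reds)           # 43200 by direct enumeration
print(factorial(5) * perm(6, 4))  # 43200 by the gap method: 5! x P(6, 4)
```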
(E) When Items are repeated: ( d ) Suppose we have to form numbers which begin with 1 and end with 3. Here The number o f permutation of n different items taking r at a time, when each the first and the last places are fixed. item may occur an number of times is nr . Then, the remaining 3 digits can be filled in. Example: 1.14: A die is rolled 4 times what is the sample space. 1 2 4 5 3 Solution: 1 2 5 4 3 A die has six faces, hence may occur in 6 ways. 1 4 2 5 3 The sample space is 1 4 5 2 3 = 6 ways 64 = 1296 1 5 2 4 3 (F) Formation of numbers with digits: 1 5 4 2 3 The idea of permutation can be applied in the formation of numbers with digits. This (e) Suppose we have to form a number where 1 or 3 is in the beginning or the end. is particularly useful in a raffle draw. Let us illustrate with a simple case. Then the two digits can be arranged among themselves in 2! ways. Hence total number o f arrangement will be P3X 2 = 12 ways. Example 1.15: Suppose the five digits 1, 2. 3, 4. 5 are given. To find the total number (0 Suppose we have to form numbers greater than 30,000. Here there should be 3 of numbers which can be formed under different conditions. or 4 or 5 in ten thousand’s place which can be filled in 3 ways. (a) Without restriction = P5 = 5! = 120 ways. The remaining 4 digits tilled in 4! ways. (b) Suppose 5 always occur in the tenth place. Now the tenth place is fixed, then Therefore, we have, i.e. the remaining four places can be fitted with four digits as l \ = 4! = 24 ways. i.e. 3 1 2 4 5 1 2 3 4 5 2 1 3 5 4 3 2 1 4 5 etc 1 2 4 5 3 2 1 4 5 3 i.e.. total number o f numbers 3 X P4 1 3 2 5 4 2 3 1 5 4 x 2 = 2 4 ways = 3 X 2 4 = 72 1 3 4 5 2 2 3 4 5 1 Example 1.16: How many numbers can be formed with digits 1.2. 4, 0, 5 when any 1 4 3 5 2 2 4 1 5 3 is not repeated in any number? 1 4 2 5 3 2 4 3 5 1 Solution: There are 5 digits in all including zero. The number of single digit numbers is Px. The number of two digit number is P2. Out of this, some have zero in the tenth (c) Suppose we have to form a number divisible by 2. Then the unit's place must be occupied by 2 or 4 which can be arranged in 2 ways. 13 12 UNIVERSITY OF IBADAN LIBRARY Example 1.18: Suppose the letters of the word STAPLER is given to form place and so reduces to one digit number. Hence the number of two digit numbers is words. P2 - P\- Similarly, the nutrber of three digit number is P3 - P2. (a) If there is no restriction, the number of words is The total number o f numbers is P7 = 7! = 5040 words. Px + (P2 ~ Pi) + t o “ Pz) + t o - Ps) + t o ~ PJ (b) Suppose all words to be formed begins with S. The remaining 6 places can be 4 + 16 + 48 + 96 + 96 filled in 6! = 720. 260 numbers. (c) Suppose all words to be formed begins with S or ends with E. The two positions can be filled in P2 = 2 ways. The other 6 digits can be filled in Example 1.17: P6 = 6! = 70 ways. (i) Find the sum of all the numbers that can be formed with digits 1, 3, 4, 7, 5, 9 Hence total number of words is 2 x 120 = 240 words. taking all at a time. (d) If all words formed must begin with S and end with E. The two places are now (ii) Find the probability o f having a number with 3 in the tenth place. fixed. Then the remaining 5 places can be filled in 5! = 120 ways. Hence, 120 words are formed. Solution: (e) Suppose two vowels A and E are to stand together. Regard A and E as one (i) We need to consider when each digit occupy a particular place. The number of a, E, STPLR permutation when 1 is in the unit place is Ps = 5! = 120. 
Example 1.16: How many numbers can be formed with the digits 1, 2, 4, 0, 5 if no digit is repeated in any number?
Solution: There are 5 digits in all, including zero. The number of one-digit numbers is 4 (zero itself is excluded). The number of two-digit strings is P(5, 2); out of these, some have zero in the leading place and so reduce to one-digit numbers, hence the number of two-digit numbers is P(5, 2) − P(4, 1) = 16. Similarly, the number of three-digit numbers is P(5, 3) − P(4, 2) = 48, of four-digit numbers P(5, 4) − P(4, 3) = 96, and of five-digit numbers P(5, 5) − P(4, 4) = 96. The total number of numbers is therefore

4 + 16 + 48 + 96 + 96 = 260 numbers.

Example 1.17:
(i) Find the sum of all the numbers that can be formed with the digits 1, 3, 4, 7, 5, 9, taking all at a time.
(ii) Find the probability of forming a number with 3 in the tens place.
Solution:
(i) We consider how often each digit occupies a particular place. The number of permutations with 1 in the units place is P(5, 5) = 5! = 120; likewise, the number of permutations with any given digit in the units place is 120. Hence the sum of the units digits over all the numbers is

120(1 + 3 + 4 + 5 + 7 + 9) = 120(29) = 3480 × 1.

Similarly, the sum contributed by the tens place is 3480 × 10 = 34,800, and so on for each place. In the same manner, the sum of all the numbers is

3480 × (100,000 + 10,000 + 1,000 + 100 + 10 + 1) = 3480 × 111,111 = 386,666,280.

(ii) The number of numbers taking all digits at a time without restriction is P(6, 6) = 6! = 720. The number of numbers with 3 in the tens place is 5! = 120. Hence

Pr(a number with 3 in the tens place) = 120/720 = 0.1667.
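Both parts of Example 1.17 can be checked by enumerating the 720 numbers (a sketch we added):

```python
from itertools import permutations

nums = [int(''.join(p)) for p in permutations("134759")]
print(sum(nums))  # 386666280 = 3480 x 111111
tens_is_3 = sum(1 for n in nums if (n // 10) % 10 == 3)
print(tens_is_3 / len(nums))  # 120/720 = 0.1667
```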
(G) Formation of words with letters:
This is similar to what we illustrated in the formation of numbers with digits.

Example 1.18: Suppose the letters of the word STAPLER are given to form words.
(a) If there is no restriction, the number of words is P(7, 7) = 7! = 5040 words.
(b) Suppose all words formed must begin with S. The remaining 6 places can be filled in 6! = 720 ways.
(c) Suppose all words formed must have S and E at the two ends (in either order). The two end positions can be filled by S and E in 2! = 2 ways, and the other 5 letters can be arranged in 5! = 120 ways. Hence the total number of words is 2 × 120 = 240 words.
(d) If all words formed must begin with S and end with E, the two end places are fixed, and the remaining 5 places can be filled in 5! = 120 ways. Hence 120 words are formed.
(e) Suppose the two vowels A and E are to stand together. Regard A and E as one unit; together with S, T, P, L, R this gives 6 units, which can be arranged among themselves in 6! = 720 ways, and the two vowels can be arranged within their unit in 2! = 2 ways. Hence the total number of words is 2 × 720 = 1440 words.
(f) Suppose three particular letters are to occupy the three even places. The first such letter can be placed in 3 ways, the second in 2 ways and the third in 1 way, a total of 3 × 2 × 1 = 6 ways. The remaining 4 letters can then be arranged in 4! = 24 ways. Hence the total number of words is 6 × 24 = 144.

(H) Ordered arrangement of items round a circle:
Things can be arranged round a circle in (i) a clockwise and (ii) an anticlockwise direction.
(i) The number of arrangements of n items when the direction (clockwise or anticlockwise) is specified is (n − 1)!. This is because one of the items can be used as a fixed starting point.
(ii) When the direction of arrangement is not specified, the number is ½(n − 1)!.
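Both circular formulas can be verified by brute force for small n. In the sketch below (circular_count is our helper name, not the book's), each seating is canonicalised by its rotations, and optionally its reflections:

```python
from itertools import permutations

def circular_count(n, directed=True):
    """Count circular arrangements of n labelled items, identifying
    rotations (and reflections too, when direction is unspecified)."""
    seen = set()
    for p in permutations(range(n)):
        variants = [p[i:] + p[:i] for i in range(n)]
        if not directed:
            variants += [tuple(reversed(v)) for v in variants]
        seen.add(min(variants))
    return len(seen)

print(circular_count(7))                  # (n - 1)! = 720
print(circular_count(7, directed=False))  # (n - 1)!/2 = 360
```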
2160 = 1 6 x 2 7 x 5 Hence the required number is = 4,084,080 ways. = 24x 33x 5 1 But (h) Permutation and Combination Occurring Simultaneously 24 can be formed in 5 ways. Some problems require the application of the permutation and combination 33 can be formed in 4 ways. approaches simultaneously. We shall give a theory which may be proved. 51 can be formed in 2 ways. Hence the total number o f factors are 5 x 4 x 2 = 4 0 . I heorem: If there are in different things o f one kid, n different thins o f the 2nd kind and k different things of the 3rd kind. The number of permutation which can he formed (D) When Sharing (Dividing) n items into different groups: containing ro f the first..? of the second and/ of the third is A number o f items can shared among a group of people equally or in given (?) * O x (j) * (r *s + D | proportion. Example 1.30: How many ways can 5 boys and 4 girls selected from among 12 boys (i) If n = p + q + r and p = q = r. . M* and 9 girls be arranged on a bench? Then the number of ways of sharing n things equally is Solution: 5 boys are selected from 12 in ways. (ii) [ fn = p + q + r and p * q =£ r, then the number of ways o f sharing n things proportionally is ^ 4 girls are selected from 9 in Q ways. Example 1.28: (a) In how many ways can a deck of 52 cards be shared among 4 players equally? 21 20 UNIVERSITY OF IBADAN LIBRARY but the 9 people can be arranged among themselves in ap9 — 9! ways (F) Combination with repetition Os2) ( ) = 3.62* 10'° Sometimes we are interested in the number of combinations of items when The required number is each o f the items may be repeated. Given n items, the number of combinations 4 91 taking r at a lime then repetitions are allowed is denoted by nHr where Example 1.31: „_ H (rn +=r - l()"( n ++ r’-- 2-) . .1.('n) =+ r-- m. ending 2 in a tie. Show that the number of ways this can happen is (®) (^ ) = Also, SCm, 1) = 1 = S(m, m). This is because there is just one way to partition 0! {1,2,3, ...m} into a single block and 31312! ~) Find n and r such that the following equation is true {lj U {2} U {3} U .....U {m} is the unique unordered way o f expressing {'1,2,3, ...m} as the disjoint union o fm nonempty subsets. 1.5.1 Stirling’s Identity: For any two rpositive integers m and r. 1.5 Stirling Numbers of the Second Kind r! 5(m, r) = ^ ( - l ) r+cC(r, t) Definition 4: Let 5 be a set. A partition o f S is an ordered collection o f pairwise, c=i disjoint, nonempty subsets o f 5 whose union is all of S. The subsets of a partition are Therefore 5 (m ,r) = ^ £ t= i ( —1 )7 +£C(r, t) t ,n called blocks. Example 1.34: 5 (4 ,1 ) = C( 1 ,1)14 = 1 For S = /lj U A2 U A3 U ... U Ak to be a partition of S: 5(4,2) = ^ l - C ( 2 , l ) l 4 + C (2 ,2 )24] i. Ai n Aj = 0 whenever i ^ j ii. A j * 0 . \ < j < k = ^ [ -2 + 16] = 7 Two partitions are equal if and only if they have the same blocks. For instance. {1} U {2,3},{1} U {3,2}, {2,3} U {1} and {3,2} U {1} are 4 different 5 (4 .3) = i [ C ( 3 , l ) l 4 - C(3, 2)24 + C(3,3)34] 6 looking ways of writing the same two-block partition of S = {1,2, 3} = 7613 - 4 8 + 81] = 6 24 25 UNIVERSITY OF IBADAN LIBRARY Solution: This is 5(m, 1) + 5(m, 2) + ••• + 5(m ,n). This is the same as finding the 5(4.4) = - ^ [—C (4,1)14 + C(4,2)24 - C (4.3)34 + C(4 ,4)44] number o f ways in which {1,2,.... m] can be partitioned into n or fewer blocks since it is no longer a requirement that no urn be left empty. 
= ^ - [ - 4 + 96 - 324 + 256] = 1 Example 7: The number o f ways to distribute four labelled balls among two unlabelled urns is 5 (4 ,1 ) + 5(4 ,2) = 1 + 7 = 8 i. e. 1.5.2 Application of Stirling’s num ber o f the second kind to distribution of 5 (4 ,1) = {1,2,3,4}&{ } objects into urns 5(4 ,2 ) = {1 }&{2,3,4},{2}&{1,3,4}. We are interested in the question "In how many different ways can m balls be {3)&{1.2,4},{4}&{1.2. 3}.{1,2}&{3,4}.{1,3}&{:i,4}, {1,4}&{2,3} distributed among n urns?” We are going to answer this question by considering Variation 3: In how many ways can m labelled balls be distributed among n labelled whether the balls and urns are labelled or not and whether a particular urn can be left urns? This is nm. empty? Example 1.36: Five labelled balls can be distributed among 3 labelled urns in We will consider 4 variations: 3s = 243 ways. Variation 1: In how many ways can m labelled balls be distributed among n unlabelled urns if no urn is left empty? This is the same as “In how many ways can Variation 4: In how many ways can m labelled balls be distributed among n labelled urns if no urn is left empty? This is ?t! 5(m, n). the set {1, 2, 3, ...m ] be partitioned into n blocks. This is 5(m, n). Example 1.35: In how many ways can 4 labelled balls be distributed among 2 There are 5 (m ,n ) ways to distribute m labelled balls among n unlabelled urns using variation 1. After the distribution o f the balls, there are n! ways to label the urns. By unlabelled urns if no urn is left empty? the fundamental principle of counting, the answer is n\S(m , n). Solution:5(4, 2) = 7 that is if the balls are labelled 1, 2 ,3 ,4 then the 7 possibilities Example 9: In how many ways can 5 labelled balls be distributed among 3 labelled are urns if no urn is left empty? {1}&{2,3,4} Solution:3! 5(5 ,3) {2}&{1,3.4} Example: Suppose that a secretary prepares 5 letters and 5 envelopes to send to 5 {3}&{1,2,4} different people. If the letters were randomly stuffed into the envelopes, a match {4}&{1,2,3} occurs if a letter is inserted in the proper envelope. {1,2}&{3,4} (i) In how many ways can the letters be stuffed into the envelopes so that no letter {1,3}&{2,4} falls into the proper envelope? {1,4}&{2,3} (ii) What is the probability that none of the letters is placed in the right envelope? Because the urns are unlabelled, (iii) What is the probability that at least one of the letters is placed in the right {2}&{1,3,4} = {1,3,4}&{2) etc. envelope? Variation 2: In how many ways can m labelled balls be distributed among n (iv) What is the probability that exactly 3 o f the letters were placed in the right unlabelled urns? envelope? 26 27 UNIVERSITY OF IBADAN LIBRARY letter has only one matching envelop). Then it is possible to determine the probability that the secretary sniffs the letters randomly into right envelops. Solution: The total number of derangements for the 5 letters is (i) 1.6.1 Derangements D5 = 5! i _ i + 1 _ 1 + ! _ ! Definition 1: A derangement o f (1. 2..... n) is a permutation //, /'?,..... , /„ of (1 .2 ...... n) 1! + 2! 3! + 4! 5! J such that it* 1, iyt 2......./>t«. = 120 [1 - 1 + 0.5 + 0.1667 + 0.0417 + 0.00833] Thus, a derangement of (1. 2,...,n) is a permutation /'/, i2.... z'„ of (1, 2,....n) in which = 120(0.71673) no integer is in its natural position: /'/^ 1, ij# 2, . ., i„*n. =86.008 ways Denote by D„ the number of derangement of (1, 2....... 
n) (ii) Probability that none o f the letters is placed in the right envelope is given as Consider the following example for illustration: d l _ 1 _ }_ + _1_ __1_ + J_ _ J_ 5! 1! + 2! 3! 4! 5! Example 1: At a party, 10 gentlemen check their hats. In how many ways can their = 0.716 hats be returned so that no gentleman gets the hat with which he arrived? (iii) The probability that at least one o f theletters is placed in the right envelope is This problem consists o f an n-element set X in which each element has a 1 Prob [None of the letters is placed in the right envelope] specified location. We are required/asked to find the number of permutations of the = 1 - (0 .7 1 6 ) set X in which no element is in its specified location. = 0.2833 Here, the set X is the set of 10 hats and the specified location of a hat is (the probability that exactly 3 of the letters were placed in the right envelope is (iv) The head of) the gentlemen to which it belongs. given by Let us take X to be the set {1,2........ ,10} in which the location of each of the integers is that specified by its p, ositi1 on in the sequence 1 ,2 ..........10.( A ' / 2! 3! (A -A )! Theorem: For n > I, Dn = n! 1 ----- h —1 ----1+ N\ 1! 2! 3! + ( - ! ) "n-l Proof: Let S be the set o f all n! permutations of (1. 2.........n). For j = 1, 2,..., n. let p, % - 3 ) ! l - l + 91 be the property that in a permutation, j is in its natural position. Thus, the permutation 5! //, hi...... in of (1 ,2 ........ ,n) has property pj provided z} = j. A permutation of (1, 2,...n) = 0.083 is a derangement if and only if it has none of the properties pi, pi.......pn- Let Aj denote the set o f permutations of (1, 2..... n) with property pj. (j = 1,2, 1.6 Allocation and M atching Problems n). The derangements of (1.2..... n) are those permutations in A,1' n A\ n ..... n A*. Introduction Matching and allocation are some of the classic problems in probability theory. I his Thus, Dn = | A\ n A, n ........n A,1, | problems dated back to the early 18th century has many variations. There are many The PIE is used to evaluate D„ as follows: ways to describe the problem. One such description is the example o f matching letters with envelops. Suppose there are '/letters with //matching envelops (assume that each 29 28 UNIVERSITY OF IBADAN LIBRARY The permutation in A| are of the form 1. A . . w h e r e /„ is a permutation of rhus, —j- is the probability that it is a derangement if we select a permutation o f (1, (2..........n). Thus, |A11 = (n - 1)! And more generally for |A,| = (n - 1)! for j = 1, 2, 2..... n) at random. ........ n. The permutations in A |n A2 are of the form 1, 2, 13. ...in where /j ....... i„ is a 1.6.2 The Matching Problem permutation o f (3........... n). Thus. | A |n A2| = (n - 2)! Suppose that an absent minded secretary prepares n letters and envelopes to send to n different people. If the letters were randomly stuffed into the envelopes, a Generally. |A jn A,| = (n -2 ) ! for any 2 combinations (i .j) of (1 .2 ...... n). match occurs if a letter is inserted in the proper envelope. For any integer k, with 1 < k < n, the permutations in A in A2n .....r v \k are of the form 1.2 .......k, /'k 1 in- where /'*-/......./„ is a permutation of (k+1......... n). Thus, Example 2: Suppose that each of jV men in a room throws his shirt into the centre of |A in A2n....r»Ak| = (n - k)!. the room. The shirts are first mixed up and then each man randomly selects a shirt. Generally. |A i,n Ai2 n . . . .n A ik| = (n - k)! for any k-combination (/|. /2,..... 
/k) of (1. (1) What is the probability that none of the men selects his own shirt? 2... n): (2) What is the probability that at least one of the men selects his own shirt? k - combinations of (1. 2.......n), applying the inclusion- (3) What is the probability that exactly k of the men select their own shirt?Since there are Solution: exclusion principle, we obtain: 1. From our discussion on derangement, the probability that none of D„ = n\ - (« -!)! + +HrC (« -« ) ! the men selects his own shirt is----- 3 (" - 2 ,!- ( ; ) (" - 3>i+..... PN , _ 1 + I _ i +n\ n\ . n\ JV! 1! 2! 3! •+ (-D " —JV!= / ; ! ----------- - + — + •+ (-D" - 1! 2! 3! n\ 2. The probability that at least one of the men selects his own shirt is . r. 1 1 1 •+ (-D ' 1 - Prob [None selects his own shirt] [ 1! 2! 3! n\ 1 bus. from example 1 above. 1 -1 + - - - + , ( - 0 * 2! 3! JV! 1 1 1 1 1 D,o = 10! 1 1 1 1 + 1 1! ' 2! 3! 4! 5! 6! 7! 8! 9! H- I - - + - ........... - 1- ^ . 2! 3! JV! You should be able to supply the final answer for Dm 1 I + 1 ( - 0 * Note: (i) The series expansion for e '] = 1 - ^1! -t- 2! - 73!j + 4! 2! 3!.................... /V! 3. The probability that exactly k o f the men select their own shirt is as follows: (iij— is the ratio of the number of derangement of (1, 2.......n) to the total First fix attention on a particular set o f k men. The number o f ways in which this and n\ number of permutations o f (1 .2 ..... n). only this k men can select their own shirt is equal to the number of ways in which the other N-k men can select among their shirts in such a way that none of them selects his own shirt. 31 30 UNIVERSITY OF IBADAN LIBRARY The probability that none o f the N-K men, (selecting among their shirts), selects his Solution: 1. 6 men. 6 women divided into 2 groups own shirt is 1 - 1 + — - — + (i) two groups o f 6persons each It follows that the number of ways in which the set o f men selecting their own shirts corresponds to the set of k men under consideration is (-I)"'* (N-K)! 2! 3! + ..........+ { N - K ) \ Also, as there are possible selections of a group of K men, it follows that there 14.4375 924 are /VN = 0.0156 ( N - K ) \ K 2! 3! ( N - K ) \ (ii) 6 ! 6! ways in which exactly K of the men select their own shirts. X All males and all females 2S3!3! 233!3! The probability required is thus 12 ! , , . N-K 2 6 3! 3!f N ( N - K ) \ 1 — 1 H--- i -----i- -------(----i-r----*---- , K 2! 3! (N -K )\ N\ ■_2363!!3_!/y (2-5)2 6.25 26162!6 14.4375 14.43 = 0.43292! 3! (N - K ) \ ! ! K\ Example 4: e~ This result is approximately — , for large N. k = 0,1............... (a) State the principle o f inclusion and exclusion. K\ (b) Suppose 15% of apple and 10 consignments were toxic. If the consignment Example 3: Suppose there are a group o f six men and six women. They are to be paired in groups consists o f 60% apple and 40% mango, what is the probability that a fruit selected at random is toxic? of 2 for the purpose of determining roommates. (i) What is the probability that both groups will have the same number of Solutions: male and female. (ii) What is the probability that there are no male and female as (b) 15% of apple are toxic, 10% of mangoes are toxic Consignment: 60% apple, 40% mango roommates? 
Let F represent fruit; A: apple, M: mango Let T represent toxic fruit 32 33 UNIVERSITY OF IBADAN LIBRARY 5880 (i) P(T) = P(A\T)P(A) + P(M\T)P(M) " 282475247 = 0.15(0.6) + 0.10(0.4) = 7.369 x 10~14 = 0.09 + 0.04 = 0.13 Example 6: (ii) PWT) = ^ Suppose that each of the 10 men in a room throws in their cap into the center of the 0.09 room to be picked by 10 ladies in the annual marriage fixing ceremony. What is the 013 probability that = 0.0117 (i) No lady picks the cap o f the man o f her •:hoice. Example 5: (ii) At least one lady picks the cap o f the man of her choice. 3.(a) Give the Stirling’s identity. (iii) Exactly 7 ladies could not pick the cap of men of their choice. (b)(i) In how many ways can 10 labelled balls be distributed among 7 labelled urns (ii) What is the probability ii' the urns are unlabeled and non of them is lell empty. Solution: 1 lOmen and 10 ladiesSolution: ^n!= fI l - 11! + -2! - -3! + -4! - - + —10!J](a). Stirling Identityr (i) Pr (No lady picked a cap) = [1 - 1 + 0.5 - 0.1667 + 0.0417 - 0.0083 +s(m-r)=^ I (- )r+,C)tM 0.0014 - 0.0002 T 0.000 - 0.000 + 0.000] f=l = 0.3679 wherem and r are positive integers (ii) Pr (at least one lady picked a cap) = 1 - Pr (No lady picked a cap) b(i)7n = 10 labelled balls = 1 - 0.3679 n = 7labelled balls = 0.6321 Number of ways is n m = 7 10 (iii) n - kwhere n = 10 , k = 7 = 282,475,249ways 1 0 - 7 = 3 (uses the principle o f inclusion and exclusion) b(ii) 5 (10 ,7 ) = P(k) “ (* )< " " V ' l 1 ~1! + 2! “ 3! + * ( n -/0 l] £ [ ( j ) 110( - 1 )8 + Q 210( - 1 )9 + ( 3) 3‘°(-l)> ° + Q 4 10(—l )11 + Q 510( -1 )12 + Q 6‘°(—l ) 13 + Q 710(—l) 14] 7! 7! 29635200 1 - 1 + 0 .5 -0 .1 6 6 7 “ 7! 7! = 5880ways 0.333 5040 Therefore, P r[5 (10 ,7 )] = -S~ ^ - 34 35 UNIVERSITY OF IBADAN LIBRARY = 0.00006 (10) fen children are to uc grouped into two clubs in such a way that five will = 6.61 x 10"5 belong to each club. If in watch club a secretary and a president is to selected, Therefore Pr(exactly 7 ladies could not pick the cap o f men of their choice) is 1 — ^ in how many ways can this be done? i.el - 6 .6 6 x 10”s = 0.9993 (11) A shelf contains Chemistry, Mathematics and Economic text books. In how many ways can S books be selected? (12) Show that: Practice Questions a. nP (n - l ,r ) = P(n,r + 1) (1) Show that ( ” ) = (n " r ) b. P{n + l ,r ) = rP(n ,r - 1) -F P(n,r) (2) If Cn_4 = 15; find n. 13. In how many ways can lour elements be chosen from a ten-element set: (3) An examination question is divided into three sections A, B. C with 3. 4 and 5 a. with replacement if order matters? question respectively. A student is required to answer t questions each from. b. with replacement if order does not matter? Sections A and B and 3 from Section C. In how many ways can he write the c. without replacement if order does not matter? d. without replacement if order matters? examination? 3. In how many ways can six balls be distributed among four urns i f : (4) In how many ways can he solve one or more question in Section C. a. the urns are labelled but the balls are not? (5) If the paper is one o f the professional examination papers where candidates are b. the balls are labelled but the urns are not? required to attempt as many questions as possible, find the total number of c. both balls and urns are labelled? ways a candidate can write the examination if must attempt at least one d. neither balls nor urns are labelled? question? 14. Show that Ds = 44 (6) In how many ways can a person purchase two or more items out o f 5? 15. 
Seven gentlemen check their hats at a party. How many different ways can (7) A nursery school pupil learning simple arithmetic is given 5 counters with their hats be returned so that: digits 2. 1,3. 0, 4. 5 to form numbers. Find the probability that the pupil is a) no gentleman receives his own hat? about to form a b) at least one gentleman receives his own hat? (a(i)) 3 digit number c) at least two gentlemen receive their own hat? (ii) a number greater than 400.000 (b) Using all the digits except 0. how many numbers can be formed and what is their sum? (8) How many ways can the letters o f the sentence “Daddy did a deadly deed” be formed? (9) A boy found a keylock for which the combination was unknown, but correct combination is a four digit number d l( cl2, d3l d4, where d,, t = 1,2, 3,4 is selected from 1, 2, 3, 4. 5, 6. 7, 8. How many different lock combinations arc possible results in such keylock? 36 37 UNIVERSITY OF IBADAN LIBRARY CHAPTER 2 E L E M E N T S O F P R O B A B IL IT Y at a time; draw of two cards from a deck one after the other; a random selection of a ball from a box and examine the colour. 2.1 Introduction The definition o f probability is as varied as the values of any random variable. Its (c) An outcome: This is a possible result o f a trial or an experiment. In a toss of definition depends on the extent to level one is knowledgeable o f the use and power two coins, an outcome could be any one of HH, HT, TH, TT. The possible outcomes in a throw of a die are, 1, 2, 3, 4, 5, 6. o f probability concept. Probability can be defined as a measure of uncertainty concerning a phenomenon. It (d) Sample Space: Is the totalily o f all possible outcomes o f an experiment. It is a can also be defined as a real value that measures the degree o f belief one has in the set o f all finite or countably infinite number o f elementary outcomes occurrence of a specified event. Probability is also described as the study of random ex,e 2, - ,enIt is usually represented byS = [e1 ,e2, ... ,e n} phenomena. Most phenomena studies in the Physical Science. Biological Sciences. The sample space in a toss o f a coin and a die is represented by Engineering and even Social Sciences are looked at not only from deterministic but H1H 2H 3H 4H 5H 6H also from a random point of view. Therefore the theory o f probability has as its T IT 2T 3T 4T 5T 6T central feature, the concept of a repeatable random experiment, the outcome of which 1 2 3 4 5 6 is uncertain. To the Statistician, probability remains the vehicle that enables him use information in i.e. S = [IH, 2H ,3H , 5//, 17\ 27, 3 7 ,4 7 ,57\ 67} the sample to make inferences or describe a population from which the sample was The sample space when a die is thrown twice is obtained. I'hus the study o f probability prepares a strong background for reliable S = {11 ,1 ,2 ,1 ,3 ,1 ,4 , 1, 5,1, 6 ,1 ,2 ,2 2 ..... 66} statistical inference. No wonder Professor Sir John Kingman remarked in a review (c) An Event: Is a subset o f a sample space. Lecture in 1984 on the 150th anniversary o f founding of the Royal Statistical Society It consists of one or more possible outcomes of an experiment. It is usually that “the theory of Probability lies at the root o f all statistical theory”. denoted by capital letters A, B, C, D, .... 
It should be noted that a subset in a given set could consist o f all the possible outcomes or none o f the outcomes of Definition the given set.2.2 of Terms and Concepts Before we define probability as a concept, it is necessary to review the definition of e.g. When a die is tossed once, we define. Set some probability terms that shall be employed in our discussions. A = {s e t o f even number} = [2,4,6} (a) A Trial: Is any process or an act which generate a number o f outcome which B ={s-et o f prim e num ber} ={1,3,5} can not be predicted a priori. A trial usually results into only one of the C = {s e t o f num ber g rea te r than 7} ={0} possible outcomes e.g., A toss of a coin once, will lead to either a Had (IT) or a (f) Mutually exclusive events: Two events A and B are said to be mutually tail (T) turning up. The selection o f a card from a deck o f well shuffled cards exclusive, if the occurrence o f A prevents the occurrence of B. This implies result in one of the cards being drawn. that the two events can not occur together i.e. A n B= e.g. the occurrence o f H (b) A Random Experiment: Is any operation which when repeated generates a prevent the occurrence o f 7 in a toss o f a coin. number o f outcomes which cannot be predetermined, e.g. A toss of two coins (g) Mutually Exhaustive Events: Events Av Az, A3, A4, ... ,A n are said to be mutually exhaustive if they constitute the sample space, i.e. 38 39 UNIVERSITY OF IBADAN LIBRARY number of outcomes for an experiment. There is no requirement that the experiment 2i= 1> s . be performed bpef,o .r.e the probability is determined, i.e.Number o f outcom es in fa vo u r o f A _ } Total num ber o f outcom es fo r experim ent N However, some events could be both mutually exclusive and exhaustive. This implies Where N is the total number o f possible outcomes that they are disjointed and yet their sum is equal to the sample space. This would be ThusProbability is a measure o f likelihood that a specific event will occur. illustrated later in (1.8). It should be noted that the last two probability terms are Example 2.3.1: Find the probability o f obtaining any number in a simple thrown of a associated with one experiment only. die. (h) Independent Events: Two events A and B are said to be independent if the Solution: The experiment has six outcomes 1, 2, 3, 4, 5, 6. occurrence of A does not affect B. This implies that the two events can occur together, e.g. the event o f an event number and a Tail in a throw of a coin and P (a number) -------------- --------------- = -Total num ber o f outcom es 6 a die at once. (i) Sure/Certain Event: The sample space S is the only sure event. The Example 2.3.2: Find the probability o f obtaining an event number in one roll of a die. Solution: Let A be the event o f an even number, probability of a certain event E is one (P{E) = 1) 4 = {2, 4, 6}; n (A) = 3 (j) Impossible Event: This is the complement of the sure event. It is an empty 5 = {1,2,3,4,5,6}; n (S) = 6 set 0. p r ^ \ Number o f outcomes included in A _ 3 _ ^ ^ Total num ber o f outcom es 6 2.3 The Approaches to the definition of Probability This approach to the definition o f probability only holds for finite sample space where The three conceptual approaches to the definition of probability (1) the classical elementary events are equally likely. However this assumption is not always true in approach, (2) the relative frequency approach and (3) the axiomatic approach, (4) the real life as all events are not equally likely. 
After all we are not equally endowed. subjective approach. These three concepts are explained as follows: (b) Frequency or ‘aposteriori’ probabilityApproach: This method defines probability as an idealization of the proportion o f times that a certain event will occur (a) Classical or ‘a priori’ Approach in repeated trials o f an experiment under the same condition. Thus, in an experiment If there are nnumber of exhaustive, mutually exclusive and equally likely cases of an is repeated /V times and n(A ), is the number o f times that A occur, then the relative event and suppose that nA of them are favourable to the happenings of an event A frequency is under the given set of conditions, then (A) = ^ . An example is the toss of a die n(/l) N once. The six possible outcomes are 1,2,3,4,5,6. The probability of occurrence of a 2 But relative frequencies are not probabilities but approximate probabilities. If the is -. The probability is ‘a priori’, that is it can be determined before carrying out the 6 experiment is repeated indefinitely, the relative frequency will approach the actual or experiment. theoretical probability. This method assumes that the elementary outcomes of an experiment are equally n(A) likely. It defines the probability of an elementary event e{ as 1 divided by the total P(A) = limn—ca N 40 41 UNIVERSITY OF IBADAN LIBRARY However, there is a requirement that the experiment be performed before the real world occurring at random is then determined satisfying certain properties (called probability is determined. Hence, the probability is determined aposteriori. It should axioms). he noted that some events in real life cannot be repeated before the probability is determined. Even if it can be determined the limit may not converge. 2.4 Probability of an event Example 2.3: Fifty o f the 800 cars that enters the University o f Ibadan on a If A is an event from an experiment E with sample space the real valued function graduation day are found to be Jeep. Assuming different cars comes into the campus P(A)\s called the probability of A which satisfy the following axioms: randomly, what is the probability that the next car is a Jeep? (1) 0 < P(A) < 1 for every event A (2) PCS) = 1 Solution: Let N be the total number of cars and n be the total number o f Lexus. Then N=800, n=50 (3) P(A, U A2 U ...) = P (/la) + P(A2)+... CO Using the relative frequency concept of probability, the probability that the next car being a Lexus is 1 = 1 P (Lexus) = £ = -51 = 0.0625 for every finite or infinite sequence o f disjoint event Av A2 ... (c) Subjective Probability: is the probability assigned to an event based on 2.5 Consequences of Probability' Axioms subjective judgement, experience, information and believe. Such probabilities Theorem I assigned arbitrarily are usually influenced by the biases and experience of the (a) If .-I is a given event and Ac is the compliment o f A. then P (AC) = 1 - P(A). person assigning it. Proof: A U Ac = S For instance the probability of the following events are subjective: P(A + Ac) = P(S) = 1 by axiom (2) 1. The probability that Jude, who is taking statistics in the second .-. P(A) + P(AC) = 1/1 and Ac are mutually exclusive semester will score seven points in the course. = P(AC) = l - P ( A ) . 2. The probability that a particular Football Club win the maiden match with another club. (b) Theorem II: 3. The probability that Ade will win the case he has filed against his Given that cj) c S, then P(A) = 0 landlord. 
Proof: Since subjective probabilities is based on the individual’s own judgement, it is rarely S U 0 = S. used in practice as it lacks the theoretical backing. P(S U 0 ) = S = 1 by axiom (2) (d) Axiomatic or theoretical Approach: To circumvent the difficulties posed by P(S) + P (0 ) = 1 since P(S) = 1 the earlier approaches to the definition of probability and based on the study of 1 + P(0) = 1 random of random phenomena, researchers have developed a mathematical = P(0) = 0. expression of certain aspects of the real world. The probability of a certain part of the 42 43 UNIVERSITY OF IB DAN LIBRARY 2.6 Rules ol Probability Theorem 1: Let 5 be a sample space and P (.) be a probability function on S : then the l licorem 5: Commutative laws: probability that the event A does not happen is 1 - P(/l) i.e. P(A') = 1 - P(A). / l u f i = f lu / l / l n B = 8 n / i Proof: Theorem 6: Associative laws: From definition. /I n /!' = 0 ; / l u /l' = 5 A U (B U C) = (A U B) U C P(/l U A') = P(S) A n ( B n C) = (A n f l ) n c P(/l U A') = P(5) = 1 P(/l U A') = P(i4) + P(A') = 1 Theorem 7: Distributive laws: P (/l') = 1 - P(/|) /i n (B u c ) = (/i n B) u (a u c ) A u (B n c ) = (a u B) n (a u c) Theorem 2: Let S be a sample space with probability function P ( . ); then 0 < P(/l) < 1 lor any event A in S. (A')' = A Proof: A' = S \ A My property (1). P(/1) > 0 We need to show that P (/l) < 1 Thus I roni theorem ( I ». P (/l) -f P(/T) = 1 A n S = A Mut P(A') > 0 A u S = S So. P(A) = 1 - P(A') < 1 A n 0 = 0 /l U 0 = /I fheorem 3: Let S be a sample space with a probability function P ( .). If 0 is the Also impossible e\ent. then P (0 ) = 0. i4 n /T = 0 Proof: Observe that 0 = S' /I u A' = 5 from property (3). we get P(5 U S') = P(S) + P(S') A n /l = A P(S) + P(0) A u A = A Mut S U S' = 6* and P(S) = 1 Therefore P (0 ) = 0 Theorem 11: De Morgan's laws: (A U B)' = A ' r \B ' Theorem 4: If >1l and /12 are subsets o f S such that Ax c A2. thenP(/li) ^ P (/l2)- (A n B)' = A ' V B ' Theorem 12: A - B = A n B' = A \ B P(A \ B ) = P(/l n B') = P(A) - P(A n B) 44 45 UNIVERSITY OF IBADAN LIBRARY Solving Problems using Venn diagrams Theorem 13: Example 1: In a sample o f 1000 foodstuff stores taken at an Ibadan market, the P(A U B) = P(A) + P(B) - P{A n B) following facts emerged: if A and B are disjoint, that is P(A n B) = 0, 200 of them slock rice, 240 stock beans, 250 slock gaari, 64 stock both beans and rice. then P{A U B) = P (/l) + P (5 ) 97 stock both rice and gaari, while 60 stock beans and gaari. If 430 do not stock rice. do not stock beans and do not stock gaari, how many of the stores stock rice, beans Theorem 14: and gaari? P(0) = 0 Solution: Theorem 15: Multiplicative law of Probability If there are two events A and B, probabilities o f their happening being P (/l) and P (P ) respectively, then the probability P(AB) of the simultaneous occurrence o f the events A and B is equal to the probability o f A multiplied by the conditional probability of B(i. e. the probability o f B when A has occurred) or the probability of B multiplied by the conditional probability o f A i.e.P(AB) = P(/1)P(P /A ) = P(B)P(A/B 2.7 Venn Diagrams A set is a collection of objects, which can be distinguished from each other. The objects comprising the set are called the elements o f the set and they may be finite or Let: R represent rice stores infinite in number. Venn diagrams are diagrammatical representation of sets. 
2.8 The Principle of Inclusion and Exclusion
2.8.1 The Second Counting Principle
If a set is the disjoint union of two (or more) subsets, then the number of elements in the set is the sum of the numbers of elements in the subsets, i.e. n(A ∪ B) = n(A) + n(B), implying that |A ∪ B| = |A| + |B| if A and B are disjoint.

Theorem 1: |A ∪ B| ≤ |A| + |B| if A and B are not disjoint. This is because |A| + |B| counts every element of A ∩ B twice. Let us illustrate this with the following example.

Example 2: If A = {2, 3, 4, 5, 6}, |A| = 5, and B = {3, 4, 5, 6, 7}, |B| = 5, then |A| + |B| = 10, while A ∪ B = {2, 3, 4, 5, 6, 7}, so |A ∪ B| = 6. Since A and B are not disjoint, |A ∪ B| < |A| + |B|. Compensating for this double counting yields the formula

|A ∪ B| = |A| + |B| − |A ∩ B| ............ eqn (1)

From our example, A ∩ B = {3, 4, 5, 6}, |A ∩ B| = 4, and |A ∪ B| = 5 + 5 − 4 = 6, thus verifying equation (1).

Theorem 2: |A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C| for three sets A, B and C.
Proof: We know from equation (1) above that |A ∪ B| = |A| + |B| − |A ∩ B|. Then, for 3 sets,
|A ∪ B ∪ C| = |A ∪ [B ∪ C]| = |A| + |B ∪ C| − |A ∩ [B ∪ C]| = |A| + [|B| + |C| − |B ∩ C|] − |A ∩ [B ∪ C]| ............ eqn (2)
Because A ∩ [B ∪ C] = (A ∩ B) ∪ (A ∩ C), we can apply equation (1) again to obtain
|A ∩ [B ∪ C]| = |A ∩ B| + |A ∩ C| − |A ∩ B ∩ C| ............ eqn (3)
Finally, a combination of equations (2) and (3) yields
|A ∪ B ∪ C| = [|A| + |B| + |C|] − [|A ∩ B| + |A ∩ C| + |B ∩ C|] + |A ∩ B ∩ C| ............ eqn (4)
thus proving Theorem 2. From this derivation, we notice that an element of A ∩ B ∩ C is counted 7 times in equation (4): the first 3 times with a plus sign, then 3 times with a minus sign, and then once more with a plus sign.

Example 3: If A = {1, 2, 3, 4}, B = {3, 4, 5, 6}, C = {2, 4, 6, 7}, then A ∪ B ∪ C = {1, 2, 3, 4, 5, 6, 7}, so
|A ∪ B ∪ C| = 7 .............. (a)
Also |A| = |B| = |C| = 4, so |A| + |B| + |C| = 12. Next, A ∩ B = {3, 4}, A ∩ C = {2, 4}, B ∩ C = {4, 6}; in this example |A ∩ B| = |A ∩ C| = |B ∩ C| = 2, so that |A ∩ B| + |A ∩ C| + |B ∩ C| = 6, while A ∩ B ∩ C = {4} and |A ∩ B ∩ C| = 1. Therefore
|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C| = 12 − 6 + 1 = 7 .............. (b)
Thus (a) = (b), establishing Theorem 2.

Generally, the Principle of Inclusion and Exclusion (PIE) states that if A1, A2, ..., An are finite sets, the cardinality of their union is
|A1 ∪ A2 ∪ ... ∪ An| = Σ_i |Ai| − Σ_{1≤i<j≤n} |Ai ∩ Aj| + Σ_{1≤i<j<k≤n} |Ai ∩ Aj ∩ Ak| − ... + (−1)^(n+1) |A1 ∩ A2 ∩ ... ∩ An|

Proof: On the left is the number of elements in the union of the n sets. On the right, we first count the elements in each of the sets separately and add them up. If the sets Ai are not disjoint, the elements that belong to at least two of the sets, i.e. to the intersections Ai ∩ Aj, are counted more than once. We wish to consider every such intersection, but each only once; since Ai ∩ Aj = Aj ∩ Ai, we should consider only pairs (Ai, Aj) with i < j. When we subtract the sum of the numbers of elements in such pairwise intersections, some elements may have been subtracted more than once: those are the elements that belong to at least three of the sets Ai. We therefore add the sum of the elements of the intersections taken three at a time. (Note: the condition i < j < k ensures that every intersection is counted only once.) The process continues, with sums alternately added and subtracted, until we come to the last term, which is the intersection of all the sets Ai, thus proving the theorem.

Let S = A1 ∪ A2 ∪ ... ∪ An and Ai^c = S \ Ai; then the PIE can also be expressed as
|A1^c ∩ A2^c ∩ ... ∩ An^c| = |S| − Σ_i |Ai| + Σ_{1≤i<j≤n} |Ai ∩ Aj| − Σ_{1≤i<j<k≤n} |Ai ∩ Aj ∩ Ak| + ... + (−1)^n |A1 ∩ ... ∩ An|
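Example 3 and the general PIE statement can be verified directly with Python sets; the helper below is a sketch written for this illustration.

```python
from itertools import combinations

A, B, C = {1, 2, 3, 4}, {3, 4, 5, 6}, {2, 4, 6, 7}
lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(A & C) - len(B & C)
       + len(A & B & C))
print(lhs, rhs)   # 7 7

def union_size_by_pie(sets):
    # General PIE: alternating sum over all non-empty intersections.
    total = 0
    for k in range(1, len(sets) + 1):
        for combo in combinations(sets, k):
            total += (-1) ** (k + 1) * len(set.intersection(*combo))
    return total

print(union_size_by_pie([A, B, C]), len(A | B | C))   # 7 7
```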
Example 4: Let A be the subset of the first 700 numbers S = {1, 2, ..., 700} that are divisible by 7. Find the number of elements of S that are not divisible by 7.
Solution: A = {7, 14, 21, 28, 35, 42, 49, ..., 700}, so |A| = 100 and |A'| = |S| − |A| = 700 − 100 = 600.

Example 5: Find the number of integers from 1 to 1000 that are not divisible by 5, 6 or 8.
Solution: Let A1, A2, A3 be the subsets consisting of those integers that are divisible by 5, 6 and 8 respectively. The number we are interested in is
|A1^c ∩ A2^c ∩ A3^c| = 1000 − |A1| − |A2| − |A3| + |A1 ∩ A2| + |A1 ∩ A3| + |A2 ∩ A3| − |A1 ∩ A2 ∩ A3|
Now |A1| = ⌊1000/5⌋ = 200, |A2| = ⌊1000/6⌋ = 166, |A3| = ⌊1000/8⌋ = 125.
Note: the results for |A1|, |A2| and |A3| were obtained using the round-down notation ⌊ ⌋, which involves dropping the fractional part.
To compute the number in a 2- and 3-set intersection, we use the least common multiple (LCM); i.e.
|A1 ∩ A2| = ⌊1000/30⌋ = 33, |A1 ∩ A3| = ⌊1000/40⌋ = 25, |A2 ∩ A3| = ⌊1000/24⌋ = 41 and |A1 ∩ A2 ∩ A3| = ⌊1000/120⌋ = 8.
Thus |A1^c ∩ A2^c ∩ A3^c| = 1000 − 200 − 166 − 125 + 33 + 25 + 41 − 8 = 600.

The Addition Rule: If A1 and A2 are any two events of an experiment with sample space S, then we have the addition rule
P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2)
Proof: In a Venn diagram (Fig. 1.1), A1 ∪ A2 splits into the disjoint pieces A1 and A2 ∩ A1^c, so that
P(A1 ∪ A2) = P(A1) + P(A2 ∩ A1^c); but P(A2 ∩ A1^c) = P(A2) − P(A2 ∩ A1).
Therefore P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2), the addition rule.
However, if A1 and A2 have no point in common, that is, when A1 and A2 are mutually exclusive, then P(A1 ∩ A2) = 0 since A1 ∩ A2 = ∅, and we have the special addition rule
P(A1 ∪ A2) = P(A1) + P(A2)
Using the same procedure, for any three events A, B and C,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)

Example: A coin is tossed three times. What is the probability of getting (i) 1 head, (ii) 2 heads, (iii) at least 2 heads?
Solution: Let H and T represent head and tail respectively, and let the sample space be
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
(i) P(1 head): the favourable outcomes are {HTT, THT, TTH}, so P(1 head) = 3/8.
(ii) P(2 heads): the favourable outcomes are {HHT, HTH, THH}, so P(2 heads) = 3/8.
(iii) P(at least 2 heads) = P(2 heads) + P(3 heads) = 3/8 + 1/8 = 4/8 = 0.5.
Note: the events of 2 heads and 3 heads are mutually exclusive.

Example: A bag contains 8 black balls, 3 red balls, 4 green balls and 5 yellow balls, all of the same size. If a ball is drawn at random from the bag, what is the probability that the ball is (i) black, (ii) either yellow or green, (iii) not black, (iv) neither black nor green, (v) black and yellow?
Solution: Let B, R, G and Y represent the events of a black, red, green and yellow ball respectively. The total number of balls is 20.
(i) P(B) = n(B)/n(S) = 8/20 = 0.4
(ii) P(Y ∪ G) = P(Y) + P(G) = 5/20 + 4/20 = 9/20 = 0.45 (since only one ball is drawn, P(Y ∩ G) = 0)
(iii) P(B^c) = 1 − P(B) = 1 − 0.4 = 0.6
(iv) P[(B ∪ G)^c] = 1 − P(B ∪ G) = 1 − [P(B) + P(G)] = 1 − 12/20 = 8/20 = 0.4.
Alternatively, P(neither black nor green) = P(yellow or red) = P(Y) + P(R) = (5 + 3)/20 = 0.4.
(v) P(B ∩ Y) = 0; see the note in (ii) above.
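The coin-tossing probabilities above can be confirmed by enumerating the eight equally likely outcomes; this short sketch uses exact fractions.

```python
from itertools import product
from fractions import Fraction

space = list(product("HT", repeat=3))      # 8 equally likely outcomes
prob = lambda event: Fraction(sum(event(o) for o in space), len(space))

print(prob(lambda o: o.count("H") == 1))   # 3/8
print(prob(lambda o: o.count("H") == 2))   # 3/8
print(prob(lambda o: o.count("H") >= 2))   # 1/2
```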
Example: A survey of 500 students taking one or more courses in Algebra, Physics and Statistics during one semester revealed the following numbers of students in the indicated subjects:
Algebra 186; Physics 295; Statistics 329; Algebra and Physics 83; Physics and Statistics 217; Algebra and Statistics 63.
A student is selected at random. What is the probability that he takes
(i) all three subjects
(ii) Statistics but not Physics
(iii) Statistics but neither Physics nor Algebra
(iv) Statistics and Algebra but not Physics
(v) Algebra or Physics?

Solution: Let A, P and S denote the events of a student taking Algebra, Physics and Statistics respectively. Presenting the information in a Venn diagram and using the addition rule, we can find the number of students that take all three subjects:
n(A ∪ P ∪ S) = n(A) + n(P) + n(S) − n(A ∩ P) − n(P ∩ S) − n(A ∩ S) + n(A ∩ P ∩ S)
500 = 186 + 295 + 329 − 83 − 217 − 63 + n(A ∩ P ∩ S)
n(A ∩ P ∩ S) = 53
From this,
n(A ∩ S ∩ P^c) = n(A ∩ S) − n(A ∩ P ∩ S) = 63 − 53 = 10
n(P ∩ S ∩ A^c) = n(P ∩ S) − n(A ∩ P ∩ S) = 217 − 53 = 164
n(A ∩ P ∩ S^c) = n(A ∩ P) − n(A ∩ P ∩ S) = 83 − 53 = 30
(i) P(all three subjects) = 53/500 = 0.106
(ii) P(Statistics but not Physics) = P(S ∩ P^c) = P(S) − P(S ∩ P) = 329/500 − 217/500 = 112/500 = 0.224
(iii) P(Statistics but neither Physics nor Algebra) = [n(S) − n(P ∩ S) − n(A ∩ S) + n(A ∩ P ∩ S)]/500 = (329 − 217 − 63 + 53)/500 = 102/500 = 0.204
(iv) P(Statistics and Algebra but not Physics) = [n(A ∩ S) − n(A ∩ P ∩ S)]/500 = 10/500 = 0.02
(v) P(Algebra or Physics) = P(A ∪ P) = P(A) + P(P) − P(A ∩ P) = 186/500 + 295/500 − 83/500 = 398/500 = 0.796

2.9 Conditional Probability and Independence
If A and B are any two events, the conditional probability of A given B is the probability that event A will occur given that event B has already occurred. This is equivalent to the probability of events A and B occurring simultaneously, divided by the probability of event B:
P(A/B) = P(A ∩ B)/P(B), provided P(B) ≠ 0,
so that P(A ∩ B) = P(B)P(A/B) = P(A)P(B/A).
In general,
P(A1 ∩ A2 ∩ ... ∩ An) = P(A1)P(A2/A1)P(A3/A1 ∩ A2) ... P(An/A1 ∩ ... ∩ A(n−1))

Example: Three cards are drawn in succession, without replacement, from an ordinary deck. What is the probability that all three are aces? Let A1, A2, A3 denote the events that the 1st, 2nd and 3rd cards are aces. Then
P(A1 ∩ A2 ∩ A3) = P(A1).P(A2/A1).P(A3/A1 ∩ A2) = (4/52) × (3/51) × (2/50) = 24/132600 = 0.00018
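The multiplication rule in the card example is easy to check exactly with rational arithmetic:

```python
from fractions import Fraction

# P(A1) * P(A2|A1) * P(A3|A1 n A2) for three aces drawn without replacement
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
print(p, float(p))   # 1/5525 ~ 0.000181
```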
Example: A bag contains 10 white balls and 15 black balls. Two balls are drawn in succession (a) with replacement, (b) without replacement. What is the probability that
(i) the first ball is black and the second white
(ii) both are black
(iii) both are of the same colour
(iv) both are of different colours
(v) the second is black given that the first is white?
Solution: Let B and W denote black and white balls respectively.
(a) With replacement:
(i) P(B ∩ W) = P(B).P(W) = (15/25) × (10/25) = 0.24
(ii) P(B1 ∩ B2) = P(B) × P(B) = (15/25)^2 = 0.36
(iii) P(both black or both white) = P(B1 ∩ B2) + P(W1 ∩ W2) = 0.36 + (10/25)^2 = 0.36 + 0.16 = 0.52
(iv) P(both of different colours) = P(B)P(W) + P(W)P(B) = 0.24 + 0.24 = 0.48
(v) P(B/W) = P(B) = 15/25 = 0.6
From the last result, we see that the two events are independent; hence P(B/W) = P(B), because the drawing is with replacement.
(b) Without replacement:
(i) P(B ∩ W) = P(B).P(W/B) = (15/25) × (10/24) = 0.25
(ii) P(B1 ∩ B2) = P(B1).P(B2/B1) = (15/25) × (14/24) = 0.35
(iii) P(both black or both white) = P(B1)P(B2/B1) + P(W1)P(W2/W1) = (15/25)(14/24) + (10/25)(9/24) = 0.35 + 0.15 = 0.50
(iv) P(both of different colours) = P(B)P(W/B) + P(W)P(B/W) = (15/25)(10/24) + (10/25)(15/24) = 0.25 + 0.25 = 0.50
(v) P(B/W) = 15/24 = 0.625

2.10 Statistical Independence
Two events A and B are said to be independent if the probability that B occurs is not influenced by whether A has occurred or not, i.e. P(B) = P(B/A). Hence events A and B are independent if
P(A ∩ B) = P(A).P(B)
Three events A, B and C are said to be mutually independent if
(i) they are pairwise independent, i.e. P(A ∩ B) = P(A).P(B), P(A ∩ C) = P(A).P(C) and P(B ∩ C) = P(B).P(C); and
(ii) P(A ∩ B ∩ C) = P(A).P(B).P(C).
It should be noted that mutually exclusive events are not independent, as the occurrence of one rules out the possibility of the other, i.e. P(A/B) = P(B/A) = 0.

Example: What is the chance of getting two sixes in two rollings of a single die?
Solution: P(six on 1st roll) = 1/6 and P(six on 2nd roll) = 1/6. Since the two events are independent, P(six on 1st and 2nd rolls) = (1/6) × (1/6) = 1/36.

Example: A and B play 12 games of Ayo (a Yoruba traditional game); A wins 6, B wins 4 and two are drawn. They agree to play three games more. Find the probability that:
(i) A wins all the three games
(ii) two games end in a tie
(iii) A and B win alternately
(iv) B wins at least one game.
Solution: Let A and B represent the events of A and B winning a game, and let D denote the event of a tie. From the record, P(A) = 6/12 = 1/2, P(B) = 4/12 = 1/3 and P(D) = 2/12 = 1/6.
(i) P(A wins all three) = (1/2) × (1/2) × (1/2) = 1/8
(ii) P(2 games end in ties) = P(D, D, D^c) + P(D^c, D, D) + P(D, D^c, D) = (1/6 × 1/6 × 5/6) + (5/6 × 1/6 × 1/6) + (1/6 × 5/6 × 1/6) = 5/72
(iii) A and B win alternately in two mutually exclusive ways, ABA or BAB:
P = P(A)P(B)P(A) + P(B)P(A)P(B) = (1/2 × 1/3 × 1/2) + (1/3 × 1/2 × 1/3) = 1/12 + 1/18 = 5/36
(iv) P(B wins at least one game) = 1 − P(B wins no game) = 1 − (2/3)^3 = 1 − 8/27 = 19/27

Example: An unbiased die is rolled n times.
(i) Determine the probability that at least one six is observed in the n trials.
(ii) Calculate the value of n if this probability is to be approximately 1/2.
Solution: P(a six in a throw) = 1/6; P(no six in a throw) = 5/6.
(i) P(at least 1 six in n trials) = 1 − P(no six in n trials) = 1 − (5/6)^n
(ii) If the probability is 1/2, then (5/6)^n = 1/2, so n log(5/6) = log(1/2) and
n = log(1/2)/log(5/6) ≈ 3.8,
so about 4 trials are required.
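Part (ii) of the last example can also be solved numerically, as a cross-check on the logarithmic formula:

```python
import math

# Exact ratio log(1/2)/log(5/6), and the smallest integer n achieving
# P(at least one six) >= 1/2.
print(round(math.log(0.5) / math.log(5 / 6), 3))   # 3.802

n = 1
while 1 - (5 / 6) ** n < 0.5:
    n += 1
print(n, round(1 - (5 / 6) ** n, 4))               # 4 0.5177
```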
Example: Determine the probability of each of the following events:
(a) a king or an ace or the jack of clubs or the queen of diamonds appears in drawing a single card from a well shuffled ordinary deck of cards;
(b) the sum of 8 appears in a single toss of a pair of fair dice;
(c) a 7 or 11 comes up in a single toss of a pair of fair dice.
Solution:
(a) P(king) = 4/52; P(an ace) = 4/52; P(jack of clubs) = 1/52; P(queen of diamonds) = 1/52.
P(a king, an ace, the jack of clubs or the queen of diamonds) = 4/52 + 4/52 + 1/52 + 1/52 = 10/52 = 5/26
(b) The sums for the two dice are:

Die 1 \ Die 2:   1    2    3    4    5    6
      1          2    3    4    5    6    7
      2          3    4    5    6    7    8
      3          4    5    6    7    8    9
      4          5    6    7    8    9   10
      5          6    7    8    9   10   11
      6          7    8    9   10   11   12

Five of the 36 equally likely outcomes give a sum of 8, so P(sum = 8) = 5/36.
(c) P(7) = 6/36 and P(11) = 2/36, so P(7 or 11) = 6/36 + 2/36 = 8/36 = 2/9.

Example: A pair of fair coins is tossed once. Let A be the event of a head on the first coin, B the event of a head on the second coin, and C the event of exactly one head. Are the events A, B and C mutually independent?
Solution: S = {HH, HT, TH, TT}; A = {HH, HT}, B = {HH, TH}, C = {HT, TH};
A ∩ B = {HH}, A ∩ C = {HT}, B ∩ C = {TH}, A ∩ B ∩ C = ∅.
P(A) = P(B) = P(C) = 2/4 = 0.5
P(A ∩ B) = 1/4 = P(A).P(B); P(B ∩ C) = 1/4 = P(B).P(C); P(A ∩ C) = 1/4 = P(A).P(C);
but P(A ∩ B ∩ C) = 0 ≠ P(A).P(B).P(C) = 1/8.
Hence the events A, B and C are pairwise independent but not mutually independent.

Example: An urn contains p white and q black balls, and a second urn contains c white and d black balls. A ball is drawn at random from the first urn and put into the second. Then a ball is drawn from the second urn. Find the probability that this ball is white.
Solution: This is a conditional probability problem. The total number of balls in the 1st urn is (p + q); the total number of balls in the 2nd urn after the first draw is c + d + 1.
P(white ball from the 2nd urn) = P(W1)P(W2/W1) + P(B1)P(W2/B1)
= [p/(p + q)] × [(c + 1)/(c + d + 1)] + [q/(p + q)] × [c/(c + d + 1)]
= [c(p + q) + p] / [(c + d + 1)(p + q)]

CHAPTER 3
CONDITIONAL PROBABILITY AND BAYES' THEOREM

3.1 Conditional Probability
Suppose A and B are any two events such that A is the prior event and B is the posterior event. There is the possibility that there are points of intersection between the two events, such that the occurrence of one is conditioned on the other. Thus we give the following definition.

Definition 1: Let A and B be two events in the sample space S with given probability space [S, A, P(.)], where P(.) is a real valued function. The conditional probability of event A given that the event B has occurred, denoted by P(A/B), is defined by
P(A/B) = P(A ∩ B)/P(B), P(B) > 0, which implies that P(A ∩ B) = P(A/B).P(B).
Also P(B/A) = P(A ∩ B)/P(A), P(A) > 0, which implies that P(A ∩ B) = P(B/A).P(A).

Example 1: Two students are chosen at random from a class consisting of 18 boys and 12 girls. What is the probability that the two students selected are (a) both boys, (b) both girls, (c) of the same sex, (d) a boy and a girl?
Solution: Let B1 be the event that the first student selected is a boy and B2 the event that the second student selected is a boy; define G1 and G2 similarly for girls.
(i) B1 ∩ B2 is the event that the two students selected are both boys:
P(B1 ∩ B2) = P(B1).P(B2/B1) = (18/30) × (17/29) = 51/145
(ii) G1 ∩ G2 is the event that the two students selected are both girls:
P(G1 ∩ G2) = P(G1).P(G2/G1) = (12/30) × (11/29) = 132/870 = 22/145
(iii) B1B2 ∪ G1G2 is the event that both students selected are of the same sex. Since B1B2 and G1G2 are mutually exclusive,
P(B1B2 ∪ G1G2) = P(B1B2) + P(G1G2) = 51/145 + 22/145 = 73/145
(iv) B1G2 ∪ G1B2 is the event that the two students selected are a boy and a girl:
P(B1G2 ∪ G1B2) = P(B1).P(G2/B1) + P(G1).P(B2/G1) = (18/30 × 12/29) + (12/30 × 18/29) = 72/145
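Example 1 admits a quick exact check; note that the same-sex and mixed probabilities must sum to 1.

```python
from fractions import Fraction as F

both_boys  = F(18, 30) * F(17, 29)
both_girls = F(12, 30) * F(11, 29)
same_sex   = both_boys + both_girls
mixed      = F(18, 30) * F(12, 29) + F(12, 30) * F(18, 29)

print(both_boys, both_girls, same_sex, mixed)  # 51/145 22/145 73/145 72/145
print(same_sex + mixed == 1)                   # True
```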
Example 2: A boy has 10 identical marbles in a container, consisting of 6 red and 4 blue marbles. He draws two marbles at random, one after the other, from the container without replacement. Find the probability that:
(a) the first draw is red while the second is blue
(b) both draws are of the same colour
(c) both draws are of different colours.
Solution:
(a) Let R1 be the event that the first draw is red and B2 the event that the second draw is blue. The event R1 ∩ B2 is the event that the first draw is red while the second is blue:
P(R1 ∩ B2) = P(R1).P(B2/R1) = (6/10) × (4/9) = 4/15
(b) Let R1, R2 be the events that the first and second draws are red, and B1, B2 the events that the first and second draws are blue. Since R1R2 and B1B2 are mutually exclusive,
P(R1R2 ∪ B1B2) = P(R1R2) + P(B1B2)
P(R1R2) = P(R1).P(R2/R1) = (6/10) × (5/9) = 5/15
P(B1B2) = P(B1).P(B2/B1) = (4/10) × (3/9) = 2/15
Therefore P(R1R2 ∪ B1B2) = 5/15 + 2/15 = 7/15
(c) P(both of different colours) = 1 − P(both of the same colour) = 1 − 7/15 = 8/15

3.2 Independence
Recall that P(A/B) = P(A ∩ B)/P(B), P(B) > 0.
Definition 2: Two events A and B are said to be stochastically or statistically independent if and only if any one of the following conditions is satisfied:
(i) P(A ∩ B) = P(A)P(B)
(ii) P(A/B) = P(A) if P(B) > 0
(iii) P(B/A) = P(B) if P(A) > 0
It is easily shown that (i) implies (ii), (ii) implies (iii) and (iii) implies (i). See Post-test (2).
Therefore P(A ∩ B) = P(A/B)P(B) = P(B/A)P(A) if P(A) and P(B) are non-zero. This implies that one of the events is independent of the other. In fact,
P(A/B) = P(A ∩ B)/P(B) = P(B/A)P(A)/P(B) = P(B)P(A)/P(B) = P(A).
So, if P(A), P(B) > 0 and one of the events is independent of the other, then the second event is also independent of the first. Thus independence is a symmetric relation.
Remark: Two mutually exclusive events A and B are independent if and only if P(A)P(B) = 0, which is true if and only if either P(A) or P(B) = 0. Also, if P(A) ≠ 0 and P(B) ≠ 0, then A and B independent implies that they are not mutually exclusive.

Definition 3: Events A1, A2, ..., An from A in the probability space [S, A, P(.)] are said to be completely independent if and only if
(i) P(Ai ∩ Aj) = P(Ai)P(Aj) for i ≠ j
(ii) P(Ai ∩ Aj ∩ Ak) = P(Ai)P(Aj)P(Ak) for i ≠ j, j ≠ k, i ≠ k
(iii) P(A1 ∩ A2 ∩ ... ∩ An) = Π(i=1 to n) P(Ai)
Note: (i) These events are said to be pairwise independent if P(Ai ∩ Aj) = P(Ai)P(Aj) for all i ≠ j.
(ii) Pairwise independence does not imply independence.
(iii) A and B mutually exclusive implies that they are not independent.

Example 3: Suppose two dice are tossed. Let A denote the event of an odd total, B the event of an ace on the first die, and C the event of a total of seven.
(i) Are A and B independent?
(ii) Are A and C independent?
(iii) Are B and C independent?
Solution:
P(A/B) = 1/2 = P(A), so A and B are independent.
P(A/C) = 1 ≠ P(A) = 1/2, so A is not independent of C.
P(C/B) = 1/6 = P(C), so B and C are independent.

Example 4: Let A1 denote the event of an odd face on the first die, A2 the event of an odd face on the second die, and A3 the event of an odd total in the random experiment consisting of tossing two dice. Then
P(A1)P(A2) = (1/2) × (1/2) = 1/4 = P(A1 ∩ A2)
P(A1)P(A3) = (1/2) × (1/2) = 1/4 = P(A3/A1)P(A1) = P(A1 ∩ A3)
P(A2)P(A3) = 1/4 = P(A2 ∩ A3)
Therefore A1, A2 and A3 are pairwise independent. But
P(A1 ∩ A2 ∩ A3) = 0 ≠ 1/8 = P(A1)P(A2)P(A3),
since the total cannot be odd when both faces are odd. So A1, A2 and A3 are not independent.
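Example 4 can be verified by enumerating all 36 outcomes of the two dice:

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))
P = lambda ev: Fraction(sum(ev(d1, d2) for d1, d2 in space), 36)

A1 = lambda d1, d2: d1 % 2 == 1            # odd face on first die
A2 = lambda d1, d2: d2 % 2 == 1            # odd face on second die
A3 = lambda d1, d2: (d1 + d2) % 2 == 1     # odd total

pair   = P(lambda a, b: A1(a, b) and A2(a, b))
triple = P(lambda a, b: A1(a, b) and A2(a, b) and A3(a, b))
print(pair == P(A1) * P(A2))               # True: pairwise independent
print(triple, P(A1) * P(A2) * P(A3))       # 0 vs 1/8: not mutually independent
```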
3.3 Bayes Theorem
By definition of conditional probability, we have
P(A/B) = P(A ∩ B)/P(B) and P(B/A) = P(B ∩ A)/P(A).
This implies that P(A ∩ B) = P(B ∩ A) = P(B/A)P(A).
Therefore
P(A/B) = P(B/A)P(A)/P(B)
The above is known as Bayes theorem.

3.4 Total Probability Rule and Bayes' Theorem
If there are two or more events where one is the prior and the other the posterior event, it is often desirable to determine the probability that a particular event has occurred given that the other event has previously occurred. Even though this kind of problem can be solved by merely applying the addition and multiplication rules, a much more compact procedure has been developed, called Bayes' theorem.

Bayes' Theorem: Let a sample space S of an experiment be partitioned into n mutually exclusive and exhaustive events A1, A2, ..., An, and let B be an arbitrary event that occurs after the experiment has been performed, such that P(Ai) ≠ 0, i = 1, 2, ..., n. Then
P(B) = Σ(i=1 to n) P(Ai)P(B/Ai) and P(Ai/B) = P(Ai)P(B/Ai) / Σ(j=1 to n) P(Aj)P(B/Aj)

Proof: Let the events Ai and B be depicted as in Fig. 1.3, where the Ai partition S and B cuts across them. Since
P(B/Ai) = P(Ai ∩ B)/P(Ai),
we have P(Ai ∩ B) = P(Ai)P(B/Ai) ............ (1)
Similarly, P(Ai ∩ B) = P(B)P(Ai/B) ............ (2)
But the total probability is
P(B) = P(A1 ∩ B) + P(A2 ∩ B) + ... + P(An ∩ B) ............ (3)
Using (1) in (3),
P(B) = P(A1)P(B/A1) + P(A2)P(B/A2) + ... + P(An)P(B/An) = Σ(i=1 to n) P(Ai)P(B/Ai)
Using (3) in (2), we have the Bayes formula defined as
P(Ai/B) = P(Ai)P(B/Ai) / Σ(j=1 to n) P(Aj)P(B/Aj)

Example 1: The contents of 3 identical baskets Bi (i = 1, 2, 3) are:
B1: 4 apples and 1 orange
B2: 1 apple and 4 oranges
B3: 2 apples and 3 oranges
A basket is selected at random and from it a fruit is picked. The fruit picked turns out to be an apple on inspection. What is the probability that it came from the first basket?
Solution: Let E be the event of picking an apple. Using the table below:

State of Nature   P(Bi)   P(E/Bi)   P(Bi)P(E/Bi)   P(Bi/E)
B1 (4A, 1O)       1/3     4/5       4/15           4/7
B2 (1A, 4O)       1/3     1/5       1/15           1/7
B3 (2A, 3O)       1/3     2/5       2/15           2/7
Total             1                 7/15           1

The required probability is
P(B1/E) = P(B1)P(E/B1) / Σ P(Bi)P(E/Bi) = (1/3 × 4/5)/(7/15) = (4/15)/(7/15) = 4/7

Example 2: In a certain town there are only two brands of hamburgers available, Brand A and Brand B. It is known that people who eat Brand A hamburgers have a 30% probability of suffering stomach pain, and those who eat Brand B hamburgers have a 25% probability of suffering stomach pain. Twice as many people eat Brand B as Brand A hamburgers; however, no one eats both varieties. Suppose one day you meet someone suffering from stomach pain who has just eaten a hamburger. What is the probability that they have eaten Brand A, and what is the probability that they have eaten Brand B?
Solution: Let A denote people who have eaten a Brand A hamburger, B people who have eaten a Brand B hamburger, and C people who are suffering stomach pains. We are given that
P(A) = 1/3, P(B) = 2/3, P(C/A) = 0.3, P(C/B) = 0.25.
By Bayes' theorem,
P(A/C) = P(A)P(C/A) / [P(A)P(C/A) + P(B)P(C/B)] = (1/3 × 0.3)/[(1/3 × 0.3) + (2/3 × 0.25)] = 0.1/(0.1 + 1/6) = 0.375
P(B/C) = 1 − P(A/C) = 0.625
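Example 2 can be written as a small reusable posterior computation over a partition; the priors and likelihoods below are those given in the text.

```python
priors      = {"A": 1/3, "B": 2/3}      # twice as many eat Brand B
likelihoods = {"A": 0.30, "B": 0.25}    # P(stomach pain | brand)

evidence = sum(priors[b] * likelihoods[b] for b in priors)
posterior = {b: priors[b] * likelihoods[b] / evidence for b in priors}
print(posterior)   # A: 0.375, B: 0.625 (up to float rounding)
```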
If/11,/12, and A3 be any three events, prove that = 0.3364 + 0.3124 + 0.116 3 = 0.765 Here we need to calculate the probability that he goes to each of the locations without P M , + /l2 -M 3) = £ p M i) - Y , p M l + A2 + /13) 1=1 i=j catching fish It is important to note that addition theorem can be validly applied only when P(F') the mutually exclusive events belong to the same set. P i s W / s ) _ M - 1 = n?R6 2. A newspaper vendor sells three papers: the Times, the Punch and the Commet. p(f') - ^ 70 customers bought the Times. 60 the Punch and 50 the Commet on a Similarly. particular day. 17 bought Times and the Punch and 15 the Punch and the Commet and 16 the Commet and the Time while 3 customers bought all three P ( P / n = « ^ = # = L 0.429 papers. Every customer bought at least one type of paper. Using Venn diagram or otherwise; find; P(L /F ' ) = nL)Pl' ' /R) = ? = L 0.286 1 ' 20 (i) how many customers patronized the newsagent on that particular day? So it is mostly likely that he has been to the river. (ii) how many customers bought a single paper? (d) Let Si. Sj denote the event that the first and second fishermen goes to the sea (iii) how many customers bought Times but not Commet? respectively, and define R/, R:. L /, ^similarly. (iv) how many customers bought the Punch or Commet. but not the Times? .the probability that they meet on a given Saturday (assuming independence) is 3. A random sample of 60 candidates who sat for Part I and II of an examination P(SX n S 2) + P{RX n R2) + P{LX n L2) in 1984 is taken. The table below' shows the number of candidates who passed = -1 x 1- x1- x1- x1 - x1 - or failed each part of the examination. 2 3 4 3 4 - 3 Part I = rs = 0.33 Part 11 Pass Pass Fail Total Probability that they fail to meet on a Saturday is Fail 20 35 ( i - j K = 0-666 Total 24 60 1 he probability that they fail to meet on three consecutive Saturdays is i) copy and complete the table ( 1 - j ) 3 = ^ = 0-296 ii) if a candidate is chosen at random from the sample, use the table to l lie probability that they meet at least once in three weekends is 74 75 UNIVERSITY OF IBADAN LIBRARY find ihc probability that the candidate: a) passed part II component B3 will fail with probability 0.6. Also, if component B| fails, the b) passed parts 1 and 11 device will shut off with probability 0.2 ; if component EL fails, the device will c) passed part II but failed part I. shut off with probability 0.5. if component B3 fails, the device will shut off iii) if a candidate is chosen at random from the subgroup o f those who failed with probability 0.1. The device suddenly shuts off, what is the probability Part I, find the probability that the candidate passed Part II. that the shut off was caused by the failure of component B|. 4. Given that: 9. Stores X, Y. Z sell brands A. B and C of men’s shirts. A customer buys 50% (i) P(AnB) = P(A)P(B) of his shirts at X. 20% at Y and 30% at Z. Store X sells 25% brand A. 40% (ii) P(A/B) = P(A) if P(B) > 0 brand B and 25% brand C'. Store Y sells 40% brand A, and 20% brand B and (iii) P(B/A) = P(B) if P(A) >0 30% brand C. Store Z sells 20% Show that (i) implies (ii). (ii) implies (iii) and (iii) implies (i) 5. Consider the experiment of tossing 2coins. Let the sample space S = {(H,H), (H.T), (T.I-I). (TGI)! and assume that each point is equally likely. Find: i ) the probability of two heads given a head on the first coin ii) the probability of two heads given at least one head. 6. Given that two dice are tossed. 
What is the probability that their sum will be 6 given that one face shows 2? 7. A certain brand of compact disc (CD) player has an unreliable integrated circuit [/C]. which fails to function on 1% of the models as soon as the player is connected. On 20% of these occasions, the light displays fail and the buttons fail to respond, so that it appears exactly the same as if the power connection is faulty. No other component failure causes that symptom. However, 2% of people who buy the CD player fail to fit the plug correctly, in such a way that they also experience a complete loss of power. A customer rings the supplier of the CD players saying that the light displays and buttons are not functioning on the CD. What is the probability that the fault is due to the IC failing as opposed to the poorly fitted plug? 8. An electronic has 3 components and the failure of any one of them may or may not cause the device to shut off automatically. Furthermore, these failures are the only possible causes for a shut-off and the probability that two of the components will fail simultaneously is negligible. At any time, component B| will fail with probability 0.1, component B? will fail with probability 0.3 and 76 77 UNIVERSITY OF IBADAN LI RARY CHAPTER4 t i \ ) for every interval fa, b ] FUNDAMENTALS OF PROBABILITY FUNCTIONS P ( a < X < b } = { * f (x) dx Then X is said to be a continuous random variable with pdf 4.1 Introduction 1 lowever. f (i) and (ii) above holds and A random variable X is a real valued function that assigns values to every elementary 00 outcomes of an experiment. Let E be an experiment, with elementary outcomes (iii) ^ / ( x o = 1' and el , e2, e3, e4, .........in the sample space S, thenS = (el l e2, e3l e4............. }. i =co A .random variable X can take values 1,2,3 ,4 ,........ for finite or countable infinite (iv) for all i, i = 1 , ci + 1 ,... ,b s . t . elementary event. b An event may consist o f one or more elementary events, for example: P(a < X < b) = A = {ev e3, ek+1: e,eS} 1=1 B = {} a null set Then X is said to be discrete random variable with probability mass function (pm 0 f(Xi) C = { !\x) - J /(X) dx Independent events: Two events A and B are independent if the occurrence o f A has no influence on the occurrence o f B and vise versa, Where /Jxj is the pdfof the random variable X and F{x) is the distribution function, then i.e P(AHB) = P(A) .P(B) F[t) = /(Oancl Independent Random Variables The random variable X and Y are said to be independent if for any two set of real numbers if for all A and B. - /= (« ) ]= /io P{X < a ,y < b} = P{X < a,)P{Y < b) P ( A r \ B = P(A). P(B) Consider a continuous random variable X defined on an interval (0. a]. Let x be a point on [0. a) i.e. a value o f x. 4.2 Probability Density Function (pdf) P(>a) = Pr{x0 < X < x 0 + xa] Suppose X is a random variable and 3 a function / w such that It follows that (•) /(*) ^ 0 p(2xa) = Pr{xo < x < x 0 + 2*a} (ii) / u ) has at most a finite number o f discontinuity in every finite interval on the = Pr{x0 < X < xQ + x] + Pr[xQ + a: < X < x 0 T 2xa] real line '-P {x ) + P (x) (*»») C mf(x)d x = 1 =2 P iX) 78 79 UNIVERSITY OF IBADAN LIBRARY 11 follows that P( n x ) = n P w X and Y will be dependent if Z is the number o f successes in the n + m trials i.e. If (0 < x < a) and we consider/5̂ ) to be contiunuious at x = 0, then it is Z = X + Y KmPw = Pm = 0 It follows from the above that Example Pr(x = x0) = 0 f o r a n y x 0. 
Consider a continuous random variable X defined on an interval [0, a], and let x0 be a point of [0, a]. Writing P(x) = Pr{x0 < X ≤ x0 + x} for the probability of an interval of length x, it follows that
P(2x) = Pr{x0 < X ≤ x0 + 2x} = Pr{x0 < X ≤ x0 + x} + Pr{x0 + x < X ≤ x0 + 2x} = P(x) + P(x) = 2P(x),
and in general P(nx) = nP(x). If we take P(x) to be continuous at x = 0, then letting x → 0 gives lim P(x) = P(0) = 0. It follows from the above that
Pr(X = x0) = 0 for any x0.
Thus, for a continuous random variable, we define a probability density function (pdf) f(x) such that
Pr{a < X ≤ b} = ∫(a to b) f(x)dx for all real values a and b.
This can be rewritten as Pr{a < X ≤ a + h} = h f(a) + o(h), or Pr{x < X ≤ x + dx} = f(x)dx.
From the above, we can deduce the following:
(i) f(x) ≥ 0
(ii) Pr{a < X ≤ b} = ∫(a to b) f(x)dx
(iii) ∫(−∞ to ∞) f(x)dx = 1 = Pr{−∞ < X < ∞}
(iv) 0 ≤ Pr{a < X ≤ b} ≤ 1

In terms of the joint distribution function, the distribution of independent random variables X and Y factorizes as F(a, b) = F_X(a)F_Y(b) for all a, b.

Example: Suppose that n + m independent trials have a common probability of success p. If X is the number of successes in the first n trials and Y the number of successes in the final m trials, show that X and Y are independent.
Solution: Since the two groups of trials do not overlap and the trials are independent,
P(X = x, Y = y) = nCx p^x (1 − p)^(n−x) . mCy p^y (1 − p)^(m−y) = P(X = x)P(Y = y), 0 ≤ x ≤ n, 0 ≤ y ≤ m,
so X and Y are independent. On the other hand, X and Z, the number of successes in all n + m trials, i.e. Z = X + Y, are dependent.

Example: If X and Y are independent binomial random variables with respective parameters (n, p) and (m, p), calculate the distribution of X + Y.
Solution:
P(X + Y = k) = Σ(i=0 to k) P(X = i, Y = k − i) = Σ(i=0 to k) P(X = i)P(Y = k − i)
= Σ(i=0 to k) nCi p^i q^(n−i) . mC(k−i) p^(k−i) q^(m−k+i)
= p^k q^(n+m−k) Σ(i=0 to k) nCi . mC(k−i)
= (n+m)Ck p^k q^(n+m−k), where q = 1 − p,
using the convention that rCj = 0 when j > r, together with Vandermonde's identity Σ_i nCi . mC(k−i) = (n+m)Ck. Thus X + Y is binomial with parameters (n + m, p).
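The convolution result can be confirmed numerically; the parameter values below are an arbitrary illustrative choice.

```python
from math import comb

def binom_pmf(n, p, x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, m, p = 4, 6, 0.3
for k in range(n + m + 1):
    conv = sum(binom_pmf(n, p, i) * binom_pmf(m, p, k - i)
               for i in range(k + 1) if i <= n and k - i <= m)
    assert abs(conv - binom_pmf(n + m, p, k)) < 1e-12
print("convolution of b(n,p) and b(m,p) matches b(n+m,p)")
```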
F ( - 00) = 0 = tim F[y) = 1 “ F(x) F(i * ) = 1 = fim F(y) Hazard function is a related quantity defined by ~ lim F[s) => lim/7 from the left For a discrete random variable X. the equivalence o f pdf is probability mass Uni /•[,, => lim/7 from the left (pmf) defined as P(x) = P(X = x,) 4.3.1 Distribution Function for Discrete Random Variables Let us define the distribution function for the discrete random variable as t t F(t, = £ P(X = x t ) = £ Pr(X = xt ) /•,' (.v)= /*(.\’ < .v). then. t=l C=1 P(.x < .V < .v1 )= 7>(.V < -v‘) - P(X < x) Example: which tends to zero as .v t .v1 and F * is continuous from the left. Let X be the number of success in single trial of an experiment with constant f {x ' )h /*'i (.v + 0) is the limit from the right probability P. When the trial is repeated n times then l'{x )- F (v - 0) is the limit from the left P{X = x) = Q ) pxqn~x where q = 1 — P It is known that f\.{x) for discrete random variable increases by jumps, and is called n t the step-function. F ,(.q 1=1 1=1 =(P + <0 " = 1 Recall £(X ) = np, Var (X) = npq = a2 -X| x, x. 85 84 UNIVERSITY OF IBADAN LIBRARY K .vain pic: Consider a random variable ..V with distribution function given by 0: x < 0 («) 4 ' - = > < ) = 4 v = X ) - 4 < > 1 0 < * < 1 (iii) Pr(.V ~-\)=Pr(x* - l ) - Pr(.V < 1) = 1 .1 = 0 4 ’ 1/ . / 4 ’ 1 < x < 2 M Fr(.V = 2)= Pr(A" £ 2 )- /> (^ < 2 ) = ̂ + i j - ^ i + l j 1/ - / y 2 < x < 3 = 16 - 12 =1: x > 3 ~ ( i) Sketch the distribution function and hence or otherwise (v) P(X > 2)J(X > l) = = = “ ^ > 1 ) \-F(\) 3/ 9 (//) ( 'alculate Pr|,V = j/y | (vi) F ( i ) - F ( 2) = \ ~ y ?i= y } (iii) Calculate Pr{.V - lj («) f (2-) -F(v ) = / 4 - / 4 = 0 (iv) Calculate Pr{A' = 2} (v) Calculate the conditional probability that X is greater than 2. given that X is (vii) f ( i ) - 4 o ' ) = % - o = X greater than (iv) l - F ( r ) = l - % = % (vi) Pr[2< X <3}; (vii) P r { \ < X < l } (x) I - f (3')=1-1=0. (viii) l*rJO < X < ij ( a ) Pr {X>2} ; 0 P r k > 3 } 4.4 Jointly Distributed random variables Solution If the occurrence of event X that affects event Y we require the concept of conditional probability. The conditional probability distribution function of X given Y for discrete random variable is given by: P(X/Y) = P(y)> 0 P(Y/X) = P(X) PW > 0 While for continuous random variable: f (x /Y )= 1 fylY) r ( r / n - j & § g p 86 8 7 UNIVERSITY OF IBADAN LIBRARY Definition: Lei (£2,e, P ) be a probability spaceandlet B be an event with P(/l) > O.Then the conditional probability of B given A is defined by P(B/A) = CO P(A) > 0 /*(*) = j f(x,y)dy X, Y, continues P{A) But P{AB) = P(B/A) P(A) = P(A/B) P(S) Recall the Baye’s theorem Px0 0 = ^ P{X = XJ = ^ pij X, Y discrete P(Bk/A) = ZHs l P(A/BiP(.Bi) j the m.d.f. for random variable Y is Two random variable. X and Y are jointly and continuously distributed if there exist a fy(y) = X, Y, continues function /(^d e fin ed for all real x and y and a two dimensional plane C such that: J J Py (y) = Y,jP{Y = y i )X,Y discreteP{{x,y)e C} = x ,y e C f {x,y)dx dy Example: The joint d.f. at X and Y is given by 2 e~xe~2y {P{X = x t, y = y{] = pu > 0 0 < x < CO ZXpij = 1 Compute (i) P{x < l ,y < l / 2} The function / (x y)is called the joint of X and Y. Satisfying the following conditions (ii) P{x < y) (0 /o ,y) > = 1, V x . y e C ( i t t ) R ( X < a ) Example 2: (u) ^ = 1, for X, Y discrete Given f {xy) = j 2(x + y - 3xy2) (0 elsewhere 0 < y < 1 ' 0 < x < 1 x y f x J y Find (0 Pr{0 < X < 3/ 4}(tv) P[X/Y < l / ^f (x ,y) = 1, f o r X, Y contiunes For discrete random variable. 
4.4 Jointly Distributed Random Variables
If the occurrence of an event X affects an event Y, we require the concept of conditional probability. The conditional probability distribution function of X given Y, for discrete random variables, is given by
P(X/Y) = P(X ∩ Y)/P(Y), P(Y) > 0; P(Y/X) = P(X ∩ Y)/P(X), P(X) > 0,
while for continuous random variables
f(x/y) = f(x, y)/f_Y(y); f(y/x) = f(x, y)/f_X(x).

Definition: Let (Ω, e, P) be a probability space and let B be an event with P(A) > 0. Then the conditional probability of B given A is defined by
P(B/A) = P(AB)/P(A), P(A) > 0.
But P(AB) = P(B/A)P(A) = P(A/B)P(B). Recall the Bayes theorem:
P(Bk/A) = P(A/Bk)P(Bk) / Σ_i P(A/Bi)P(Bi)

Two random variables X and Y are jointly and continuously distributed if there exists a function f(x, y), defined for all real x and y, such that for every two-dimensional set C in the plane
P{(x, y) ∈ C} = ∫∫_C f(x, y)dx dy.
The function f(x, y) is called the joint pdf of X and Y, satisfying the following conditions:
(i) f(x, y) ≥ 0 for all x, y, with ∫_x ∫_y f(x, y)dx dy = 1, for X, Y continuous;
(ii) P{X = xi, Y = yj} = pij ≥ 0, with Σ_i Σ_j pij = 1, for X, Y discrete.
The joint distribution function of X and Y is given by
F(x, y) = ∫∫ f(u, v)du dv for the continuous case, and F(x, y) = Σ Σ pij for the discrete case,
and the marginal distributions are defined as
f_X(x) = ∫ f(x, y)dy, f_Y(y) = ∫ f(x, y)dx (X, Y continuous);
p_X(x) = Σ_j P(X = xi, Y = yj) = Σ_j pij, p_Y(y) = Σ_i pij (X, Y discrete).

Example: The joint density function of X and Y is given by
f(x, y) = 2e^(−x)e^(−2y), 0 < x < ∞, 0 < y < ∞.
Compute (i) P{X < 1, Y < 1/2}, (ii) P{X < Y}, (iii) P{X < a}.

Example 2: Given f(x, y) = 2(x + y − 3xy^2), 0 < x < 1, 0 < y < 1, and 0 elsewhere, find
(i) Pr{0 < X < 3/4}
(ii) Pr{1/10 < Y < 3/4}
(iii) P[X/Y < 1/2]
(iv) Pr[X < 3/4 / Y < 1/2]

4.5.1 Conditional Distribution of Jointly Distributed Random Variables
The conditional density of X given Y = y is
f(x/y) = f(x, y)/f_Y(y), f_Y(y) > 0,
where the marginal density f_Y(y) is obtained by integrating the joint density over x, and the marginal distribution function for X is defined analogously from f_X.
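Taking the Example 2 density exactly as printed, a quick numerical check shows it integrates to 1 over the unit square, and that its X-marginal comes out constant; the midpoint integrator is our own sketch.

```python
def dbl_mid(f, n=400):
    # Midpoint rule on the unit square.
    h = 1 / n
    return sum(f((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

f = lambda x, y: 2 * (x + y - 3 * x * y * y)
print(round(dbl_mid(f), 6))    # ~1.0: total probability

# Marginal f_X(x) = integral over y of f(x, y) dy = 2(x + 1/2 - x) = 1
fx = lambda x, n=10_000: sum(f(x, (j + 0.5) / n) for j in range(n)) / n
print(round(fx(0.25), 4), round(fx(0.75), 4))   # ~1.0 ~1.0
```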
I f A = A, X A 2, A(a ) = j p f A w J d p , (vV|) Proof: = jp,(Aw,)d/i2(w2) Let X have the p.d.f f (x) and y has p.d.f h[y\ = p (a , ) p (a 2) pdf of 2? . p(\z < z}) = P{X Y < Z) the joint pdf of X and Y is f ( x ) h(y) since ,Y Fubini’s theorem gives condition under which it is possible to compute double and fare stochastically independent. integral using iterated integrals. It allows the order of integration to be changed in iterated integrals. •.c(z)=££7MMv)- =i»rn dxTheorem Suppose A and B are complete measure spaces. Suppose f sy)‘s A X B measurable if Since < G(z)< I, by Fubini’s theorem I.YS Then Jf (x ,y)d{x ,y )=\ J/(.t,y) / ( v) / 'Wl*U A - {(.v, t- x, ),0 < .v, < x , 0 < x, < =c}unto the space = Y j / (v) - x l S'nce y = z - -x > *, < l}. I he inverse transformations are given by •v, = .»i v.; x i = y t >2 = .»',(> - ) :i) Applying Binominal expansion, we have dxi _ dx= v,; i =_ V. dy2 g( * S -v!(?-.r)! d\\ 5v, = - y , = — (A >Af d>\ dy. 2! V ' (2A f ~ P(2A) 2! 94 95 UNIVERSITY OF IBADAN LIBRARY A . , . I ~ A x ,) fU s )—A .i.) 1 he Jacobian o f the transformation By stochastic independence, since the marginal pdfs civ, dx2 Vj V, /(■*,) = /('■ ,)./= I. 2.....ndv, dv: f a l-.v , - y The random variables are said to be a random sample of size n from a distribution fa dy2 which has pdf / ( v).dX 3 -CV, - V, V,) Exercise>f2>\ l et .V, mill X. he two stochastically independent random variables with p.d.f . J a 0. since v, is not ilentically zero —— c II < x. ■;) = 1, 0 < v = < l interest Therefore, there is need to obtain the distribution of the later variable. It should noted that I bus. given the pdf or c.d.f. of therandom variableX. the pdf or c.d.f. of ), has the gamma pdf with parameter a = 2./? ■= 1 anotherrandom variable Y may be obtained as a function of X. >. has the uniform distribution over (0.1) There are two given major technique to achieve this. They are CDF technique and the \, mill W have exponential distribution with parameter 1 . Frans formation technique. Definition (lor more than two variable) 4.7.1 The C DF Technique I el A',,A'.......V,,// mutually stochastically independent random variable s. each of Given the CDFofY (Fr (A)with some function of interest (say) Y = g(x) is of which has the same p.d.f f(x) which may or may not be known. Then interest. 97 96 UNIVERSITY OF IBADAN LIBRARY I he idea is to express the CDF of Y in terms of the distribution of X. Define set Example 2: Ay = {X/g(x) < y) It follows that [{Y < y}J and X 6 Ay \ l.el/*(x) = Lx 0 < a: < 1 i.e. Fy(y) = Pr(g(x) < y) and y = 3a: + 1 In the continuous case find the distribution of g{y) Solution Myl = /'7 ,W dx y = 3Ar + y = ^ X = ^ = FX(x2) - f x M Pyiy) - P(Y < y ) and p.d.f of fy(y) = df dy Fy(y) = P[ 3 * + l 2e 2xdx Let X be a continuous random variable with pdf / (V) > 0 for a < X < b and y Iny Fy(y) = e ~ 2x r?(x). If there is a one-to-one transformation from A = {x/ fy (x) > 0} on to II |n |K//y(y) > 0) with inverse transformation. == e1 lny~2 + 1 1 < y < CO X = w(y) if the derivative d/ dy w(y) exist, then- y~ 2, /y (y ) = / > ( y ) | ^ | y e / i /y(y) = ^ /v(y) a Where lIt—/y II is the Jacobian of the transformation Y could be monotone increasing or decreasing fx(y) = Hx(y) |^ ;| 9 S 99 UNIVERSITY OF IBADAN LIBRARY Example 3: Using the last example Example f (x) = Zx 0| < X < 1, and y - 3x l c l /U ) - 5 7 ( V 2). * = -2 ,-1 ,0 .1 .2 . 
Find the distribution of Y = |X| clx _ 1 Solution cly 3 3(y) = 2 (t 1) * j = |C y -D , 1 < y < 4 / y ( l ) = « - l ) + / i ( l ) = £ + i = Example 3: / , ( 2) = A ( - 2) + A ( 2) = i r o 0 < 30 < CO 4 Given = j1Q9 elsewhere 77 y = 0 determine the pdi of y — X~ Solution f (x) = 2x e ~*2 Exercise y = X2 => X = -y1 /X/i 1. Let X have a Poisson distribution with p.d.f f(xy = - ~ a a *dy 1 x! x == 0, 1, 2,... sOO = fx(y) \~\ I .et Y = 4X, derive the pdfof Y. 2. A random variable X has pdf = 2 y ' / i e - y * V 2 y ' 1/2 f(x) = 1 0 < X < 1 = e~y 0 < y < co Find the pdf of Y = - 2 In X 3. If the random variableX~ N (0,1), find the pdfof Y ■ A'2 4.7.3 Transformation that are not one-to-one 4. Use the transformation method to solve the problem i If 5 0 ) is not one-to-one over A = [x/fx(x) >0}; then thee is no unique solution to example 1.. j4X' 0 < X < 1 equation y = g(x). It is usually possible to partition A into disjoint subsets 10 elsewhere Ay ,Al , A3 ... such that f.i(x) is one-to-one over each Aj Use the C 'L)F technique to derive the pdfof f y ( y ) = ^ / x ( ‘V/(y)) (i) Y - X \ (ii) w = ex(iii) Z = InX (iv)p =(,Y -0.5)2 i In the above example i. e. fy {y) = ^ fx (xy)where the sum is over Xysuch that /tt(xy) = y P{x = 2 / y = 3) = - ^ 1 17 P{Y= 3 ) — - v 6 100 101 UNIVERSITY OF IBADAN LIBRARY CHAPTER 5 5.2 Binomial Distribution SOM E DISCRETE PROBABILITY DISTRIBUTIONS In Bernoulli distribution, there is just one trial that can result in either success or failure. But, in Binomial distribution, we have repeated and independent trials of an experiment with two outcomes resulting in either success or failure, yes or no etc. 5.0 Introduction The probability of exactly .t successes in /; repeated trials is given by: In this chapter, we will be studying some discrete probability distributions with a view to obtaining their men and variances. p'q"~ ; x = 0, 1, 2, .... ...., n \-x) 5.1. Bernoulli Random Variable 0 elsewhere A random variable X, that assumes only the value 0 or 1 is known as a Bernoulli where p is the probability of success random variable. The values 0. or I can be interpreted as events of failure and success q = 1 -p is the probability of failure respectively in an experiment usually referred to as Bernoulli trial. x is the number of successes in repealed trials. Definition 1: A random variable X is defined to have a Bernoulli distribution if the f(.v) is the probability density function (p.d.f). discrete density function of X is given by (p * (l - p )1_*forx = 0 or 1 )I ~ - M —v . ,• \ 5.2.1 Properties of Binomial distribution / M = = p*( 1 - p)1 */{o, 1}0 ) (i) It has n independent trials 0 otherwise J (ii) It has constant probability of success p and probability of failure Where the p satisfies 0 < p < l . l — pis usually denoted by q q = 1 “ P- Theorem 1:If X has a Bernoulli distribution, then (iii) There is assigned probability to non-occurrence of events. H(X) = p. Vcir(X) = pq (iv) Each trial can result in one of only two possible outcomes called success or failure. Proof: E(X) =0.q + l .p = p 5.2.2 Mean and Variance of a Binomial Distribution Var{X) = E(X2) - ( E[X])2 f(x) - p'q"-' x = 0, 1, 2, ......... ,n = 0 2.q + l 2. p - p z = pq Bernoulli distribution is a special type of discrete distribution sometimes referred to as (i) Mean: Indie tor function. This implies that for a given arbitrary probability E(X) = z * f {x) space[S, A, P(. 
)], let A belong to A , define the random variable Xto be the indicator function of A; that is x(w) = then A has a Bernoulli distribution with parameter p = P[X = 1| = P[A], //! (/7—at)! p qjc! 102 103 UNIVERSITY OF IBADAN LIBRARY - z.r=0 * ------------------ P q £ [* (* - ! ) ] = « ( /i- l) /r(//-.r)!,v(,v-l)! /. £(JTJ) = 4 V (^ '-1 ) ] + E(X) (» -!)! - • S : (ii-x)l(x-l)! p' p x l q"~’ = n(n-l)p2 + np n-1 /. V(X) = E { X 1) - [E{X)]2 - np ("-D I frl («-x)!(x-l)! p q = n(n-l) p2 + np - n2p2 = n2p2 - np2 + np - n2p2 Let s = .r - 1, .v = s + 1 = np - np2 = np £ (""I)!L -------- xr~, q = np (1 -p)j.o (/»-.s-l)!jr! p = npq '/ l - P Remark: The binomial distribution reduces to the Bernoulli distribution when n = 1 ~ "p z*=o P'q'-’-'S Example 1: = np (p +q).nn‘- l =_ np It is known that screw produced by a certain company will be defective with (ii) Variance: probability 0.02 independently of each other. The company sells the screws in Var(X) = E(X2) - [E(X)]2 packages of 10 and offer a money back guarantee that at most 1 of 10 screws is E[X2] = E[X(X-1)] + E(X) defective. What proportion of packages sold must the company replace? Solutio n Let X be the number of defective screws thus n = 1=, p = 0.02 // - z * (" “ !) Pr (at most one defective) = 1 - P(X = 0) - P(X = 1) /7! /> (^> i) = i - m ' < i ) = ^ x ( x - l ) — t p V " ' 10N (/j - x)x! = 1 (0.2 )u(0.8)' (0.2)' (0.8)9 0 n(n-\)(n - 2 ) ! = 2 > ( * - l ) p„ 2 p_ t=2 q n -.T What is the final answer?(»i - x)!x(x - l)(x - 2)! Example 2: = n ( n - \ ) p 2 T ( « - 2)! „.t=2 n - v £ (« -* ) ! (* -2)! Z7 7 A communication system consist of// components each of which will, independently function with probability p. The total system will be able to operate effectively it at Let s - x-2 x - s + 2 least one-half of its components function. = « (« - D^ 2 (if - 2)! For what value of p is the 7-components system were likely to operate more U (#i - j2 - 2)!s! p q effectively than a 5 components system.- //(// - 1) p 2 £ ( a - > .v P’q"-’- 2 Solution A 7-component system will be effective 104 105 UNIVERSITY OF IBADAN LIBRARY 5.3 Poisson Distribution If P(E 7 > 3) = P(E = 4) + P(E = 5) + (P(E = 6) + P(E = 7). = 1 - P(E < 3) = 1 - P(E = 0) - P(E = 1) - P(E = 2) - P(E = 3) When n becomes large and p is fairly small, the use of the binomial distribution in calculating the various probabilities becomes cumbersome. To overcome this P V + p ' qs + 3 ^ , + p ’ problem, we use another probability function which approximates the binomial distribution. This probability function is known as the Poisson probability function A 5- component will be effective if which we shall be considering in this lecture. P (£ s > 2) = P{E = 3) + P{E = 4) + P(E = 5) A random variable closely related to the binomial random variable is one whose P 'q 2 + P'q' + P S possible values 0, 1,2, 3,.....represent the number of occurrences of some outcomes not in a given number of trials but in a given period of time or region of space. This The 7-component will be better if variable is called the Poisson variable. P(E1 > 3) > P(E5 > 2); for q = 1 - p. Complete this 5.4 Properties of a Poisson Experiment Try for 5 and 3. A Poisson experiment is a statistical experiment that has the following properties: 1. The experiment results in outcomes that can be classified as success or failures. Example 3: 2. The average number of success(A) that occur in a specified region is known. For what value of K will p (x = K/ P) /( X = K - 1) .b e ^8r eatter or less than 1 if X is a 3. 
The probability that a success will occur is proportional to the size of the b (n, p) and 0P(X = k - 1) iff The probability distribution of a Poisson random variable is called a Poisson (n - k + l)P>A:(l - P ) distribution. i.e.K <(n + l)P Given the mean number of successes Athat occur in a specified region, the probability This implies that for the binomial distribution b (n, p), as k goes from 0 to n, P (x=k) density function (pdf) of Poisson distribution is given by first increases monotonically and then decreases monotonically, reaching its largest e-*U') value when k is the maximum. P(x: A) = x\ 106 107 UNIVERSITY OF IBADAN LIBRARY where .t is the actual number of successes that result from the experiment. = t A ~ np (// is the total number of observation in the experiment and p is the probability « = | x ( x - 1)! of success). = a y Z Z fZ Note that mean A and variance are equal i.e. A = mean = variance. Also A is the (x -])! parameter of the distribution, with e= 2. 71828 L e ts= .r- 1 Some examples of random variables that obey the Poisson probability law are: 1. The number of customers entering a post office on a given day = * t .0 si 2. The number of misprints on a page (or a group of pages) of a book. = A 3. The number of packages of instant noodles sold in a particular store on a given day. (ii) Variance Var(X) = E(X2) - [£(.r)f Identities: E(x2) = E[x(x-1)]+E(x) E[x(x-l)]= £ t ( x - l ) e ’xA* ,v=o • x\ — 1, + A, H--A--2- - 1- -A--'- K... r. _ --xX 2Ki rt-v.mn'Vxo- 6 S f f c . b n x ; o r i i c r i r o b : 6 j O' 7 •' ' . . m + n dtnobrm nw sv ; U 38)f 6 fo siqrrtpa . A wfLeh arc Cr ■ b) ?(X - 4) - Iu 1 2 ... ml II - ").QC. t-933 >. ■ 4 4 \ mo! (jn - ;;)i.v! C A 6 J. - s ni+it Example 2 :. c As part of a health survey, a researcher decides to investigate prevalence of cholera in S sub-urban areas but of a city’s 28 sub -urban areas. If 6 cf the sub-urban areas have a x m(-m - 1)! c very high prevalence rate, what is the probability that none of them wh! be included in I (m-A-)!.v(.r-l)! r - x the researcher’s sample? c Solution: . rmy ?■ \ Retail that we have f i x ) =- i~ ; .r-x:- as the p. d. f for hypergeometric distribution c- r awt* nt+n Here, x = 0, n = 22, m + n = 28 and m = 6 . - c M = * - 1 = s + 1 ,.i> —1----- this implies _ ~ (W -J-l)!*! C 12 113 UNIVERSITY OF IBADAN LIBRARY m(m - 1) ^ ( * - 2)! . c = —m+n . c c lll+rt S~> / J^ i r-j-l W i-2 (m -x ) !(x -2)! let s = x-2 x = s + 2 m = —mn+n . C m(m - 1) ( m - 2)! C r—I Hl+hf' L—i (m - i1 - 2)!j ! f'J“2 m ( m - l ) x m m+n ̂ *Ca "Cr , 2 m + n\ (m + n - 1)1 {m + n - r)\r\ [(m + n - l ) - ( r - l ) ) ( r - 1)! m(m - 1) b+m. 2m+n /-* m r m + n! (in + n -l)! m(m - 1) _ (m + n -r )!r! (m + n - r ) ! ( r - l ) ! m + n\ (m + n - 2)! (m + n - r ) ! r ! [(m + n - 2 ) - ( r - 2 ) ) ( r — 2)! Simplifying gives m ( m - l ) ( m + n - r) !r! (m + / i - 2 ) ! mr E(x) = (m + n)! (m + n - r ) \ ( r - 2)! m + n (ii) Variance: m(m - l ) r ( r - l ) ( r - 2 ) 1 {m + n - 2)! (m + n){m + n - l ) (m + n - 2 ) ! (m + n - 2 ) ! £ (x 3) = e \x (x — X) + x] rm(m - l)(r - 1) = £[x(x-l)] + E{x) (m + n)(m + n - 1) E[x(x -1)] = £ x ( x - l ) / ( x ) E(x') = £[x(x-l)] + E{x) = Z ' IHrc-. nc,. x(x ~ ^ —in1 m ( m - l)r(r- 1) r mm + n (m + n ) ( m + n - 1) m + n c. Therefore, K(.v) = E(x2) - [£(.t)]3 Continuing gives m(m — 1) r(i— 1) rm ( rm (m + n){m + n - 1) m + n \ in + n v - , 14 (/i i -mx(m) ! x- l()x(-m1 ) -( x 2-)!2 ) ! "CW - . 
t Simplify this last expression to obtain rm ' m + n - r (Post-Test 2) = 2 ^ u - i ) ------------- ------ m + n —m+n ) , m + n - ] i=2 c Note: If the sampling was with replacement, r and p = would be the appropriate binomial parameter and its respective variance would be r ( 1 ----— ). ,s ^ (w -x )!(x -2)! m +n V m + n / = m (m -l) 2 ̂ ------------ ^ T 114 115 UNIVERSITY OF IBADAN LIBRARY M [ " ' ml ft! (r-x)! The binomial variance is slightly greater than the hypergeometric variance because of LJ [ r -x ; the factor ( ~ “ ) >n the hypergeometric variance. m + n (m + /Q! r (m + n -r)!rl Asm + n becomes very large compared to r, the hypergeo metric distribution tends to the binomial distribution. ml ______ n\______ ( m + n - r ) l r \ ( m - x ) l x \ { n - r + ;c)!(r — jc)! (m- \ n)\ 5.9 Binomial Distribution as an approximation to the Hypergeometric _ r 'j w!w! (m -f n - r )i Distribution Kx ) ( m - x ) \ ( n - r + x ) \ ( m /?)! Suppose the p.d.f of a hypergeometric distribution is given by _ f r ̂ ni(in - 1)...... (/>? - x +1) (m - cc)! n(n - l)....(n - r + x + l)(;t - r x)! VV (m-x)\(n-r-Tx)l(m + n).... {(m + n ) - (/•-!)) f lw(/w-l)..... jm -(x - l)} n ( n - l ) ......[ n - ( r - x - 1)] _ v -V _______________________________________ (m + + n) - (r - 1)] then, we have the following theorem. Divide through by m+n Theorem: Let m, n ->ao and suppose that m x - \ 1 f\ m" 1 ( - _ 1 ) f ® /--X-:.] +n) \m+n m+n J rn +■ n m+n J) * f[ jfj"+ n ))(y nin+n .'jj+n, \m+n i j +n —in + n = rm, , - + p , o < p < \ m+n m+n r - 1 ) n .m+n in+n m+n) r - x then PxqH~\ x = 0,1,2....... /- in + n r ) f m V m i }) \ m + n ) \ m + n m + n ) f\ m m x - n + n m + n J Proof: ( " ) > ) r - x 1 We have [ n, m + n ) . m + n m + n J f\ m n+ n m \ ( r - 1) 'm) n ) 1... ( m + n) ... m ( m + Since------ P, h „e nc,e --------n--- => 1 m + n in -h II l r ; I here fore UNIVERSITY OF IBADAN ilg 1L &U_IBRARY P(r) = prob o f (k - 1) successes in (x + k - 1) trials lim x prob o f (x + k)th success »»■*-«—»»> l X = r + k - l Ck_lPk- 1qT.p m + n = r + k - r = 0, 1 , 2 ....................... eqn. (1 ) P'c,' Pk(fe + r - l)(/c + r - 2) .... [k + r + 1 - ( r + 1)] , = ------------------------------ H------------------------------ 9r m n Pk(k + r — l)(/c + r - 2) ....(/c + l)fc , r - x = -----------------------ri---------------------- ’ This result implies that we can approximate the probabilities —y by m + n r = Pk( - i y - k c rqr -v P by setting p = mf provided m, n are large. This is true for all x = 0, 1, =m n No -tek cthTaPt:k( - q ) r................. eqn. (2) 2, ..... r. (i) r + k — lck_l Pkqr> r = 0, 1,2 ........ If m, n, are large, approximate the hypergeometric distribution by an appropriate binomial distribution. If the need arises, we may also go a step further in = r + k - l CrPkqr. r = 0,1,2......... approximating the binomial distribution by the appropriate Poisson distribution. (ii) 2 Z P ( r ) = P k Z?=0- k Cr( - q r 5.10 Negative Binomial and Geometric Distributions = Pk[ l - q ] ' k Negative binomial and Geometric distributions are two families o f discrete = pkp~k = 1 distributions that are very important in Statistics. The Geometric distribution is so Equations (1) and (2) for k > 0 are known as negative binomial distribution. named because the values of the Geometric density are the terms of a geometric series while the Negative binomial distribution is sometimes also referred to as the Pascal’s 5.11.1 Mean and Variance of the Negative Binomial Distribution distribution. 
(i) Mean Recall that the moment generating function (MGF) of a random variable A', M(t) = E(etx), using the moment generating function approach, therefore, from equation ( 1), the 5.11 Negative Binomial Distribution MGF of/? is Consider a succession of Bernoulli trials, let P(r) denote the probability that exactlyr + k (k > 0), trials are needed to produce k successes. This will so happen M(t) = E(etr) = Y JC° 0etr(r + r ~ ^ P* * (-* « ')’■ « r ) ={op(L 7 l l ' r = 1 -2-3 ............... (2) = p ‘ ( i - 9ctr * Now, M'(f) = k qecPk(1 - qe1)- *-1 5.12.1 Mean and Variance of a Geometric Distribution kq Consider equation (2) E(R) = = 0 = — V E(R)2 = M"(t) (i) Mean = k qetPk(1 - gt?1)- *-1 + (fc + l)g ecP*(l - q e T ^ k q e 1 Complete the solution using V(/?) = E(/?)2 — (E(/?))2 (see Post-test 4) E ( R ) = Z ” / p ( i “ p)r_1 etl - p = q 5.12 Geometric Distribution * E(R) = Y ” r p ( q y - ' If in equation (1), we put k = 1, we have r + k - l Ck_lPkqr v-*°° d = rc0P qr = q r p , r = 0,1 , 2, .... d v 100 and q = 1 - p, we have geometric distribution. - * * i l j * r The following describes the Geometric distribution. Consider a sequence of Bernoulli trials with probability p of success. This sequence is = p Zdq7 ^ + g2 + ) = q' p = 0 - pY = Pp (1 ~ q ) 2 Generally, the p.d.f, f(r) =■ P[R = r] of R is given by ( ( l - q + q)\ f ( .r) = (1 - pYp. r = p \ o T q y ~ ) = 0, 1 , 2 ...... / ( f ( r ) = qrp, r = 0,1,2 ............ ( 1) E(/?) = - V 120 121 UNIVERSITY OF IBADAN LIBRARY (ii) Variance Thus, the mean and variance of this form geometric distribution are ^ and I r ­ £(«2) = Y " r 2 p ( l - p)r_1 respectively. 4—'r= l Example 1: A fair die is cast on successive independent trials until second six is = Y °° r 2 p ( r observed. What is the probability of observing exactly 10 non-sixes before the second *—Jr=l “ d six is cast.Z Solution: This is a negative binomial distribution problem. So,d v ' ,co p k ( \ - p Y r = 0 ,1 ,2 ...... U - l , d = P j j ( q + 2 q 2 + 3 q 3 + - ' ) ClO + 2-1^ Therefore, we have 0.049 d = p - q ( l + 2q + 3q2 + 4q3 + - ) Example 2: Recall that 1 + 2x + 3x2 + 4x3 + ••• = Team A plays team B in a seven game with series. That is the series is over (1-X)2 when either of the teams wins four games. For each game, p(A wins) = 0.6 and the Therefore, we have p —dq q (V•■( l--fl)2/ games are assumed to be independent. What is the probability that the series will end ( l - q m ) + 2 q ( l - q ) in exactly six games. = P (1 ~ q Y Solution: [Cl — qr)][(l - q + 2q)] The game will end is either A or B wins the game series. = P (1 - q )4 p(game ends) = p (A wins series in 6 games) + p (B wins series in 6 games) r n1 + q = p l((1T-3 9)3 ^ ] ( 0 .6 ) ‘ (0.4)’ + [^ 0 .4 ) ‘ (0.4)2 r 1 + 9 = 0.207 + 0.092 P i ( l - 9 ) : = 0.299 Note: that (P)2 J p( A wins series in 6 games) = p [A looses 2 games before 4 wins] f(-p2) = 1̂ ] since q = 1 - p = P(Y = 2) Therefore,!/(/?) = £(/?)2 - (f(/? )): 2 - p (p):Hi)' = 0.207 1 ~ P Example 3: r,2 In a sequence of independent rolls of a fair die; 122 1 2 3 UNIVERSITY OF IBADAN LIBRARY • Each trial has a discrete number of possible outcomes i. Whai is the probability that the first four is observed in the sixth trial. 
• The probability that a particular outcome will occur is constant for any Solution: This is geometric distribution problem given trial P(R = 5) = j = 0.067 where R denotes the number of non-fours before the • The trials are independent A multinomial distribution is the probability distribution of outcomes from a occurrence of the first four. multinomial experiment. i. What is the probability that at least six trials are required to observe a four. Definition: Suppose a multinomial experiment consists of n trials, and each trial can Solution: P[* > 5] = \ - P[R £ 4] result in any of k possible outcomes £1,£ 2^ 3. .....»£*• Suppose, also, that each = !-[/>[/? = 0']+ P[R = l]+ P[R = 2]]-t- P[R = 3+/>[/? = 4 j possible outcome can occur with probabilities pa, p2, P3, ..... , pk . Then, the probability p that Ej occurs nx times, E2 occurs n2 times,...... , and Ek occurs nk times is P = [(n ^ ln fc !)] fr1"1 P2"2 .....Pk"*] where n = na + n2 + n 3 + - 4- nk Example 1: A bowl consists of 2 red marbles, 3 green marbles and 5 blue marbles. 4 marbles are randomly selected from the bowl with replacement. What is the probability of selecting 2 green marbles and 2 blue marbles? Solution: The experiment consists of 4 trials, so n = 4. The 4 trials produce 0 red marbles, 2 green marbles and 2 blue marbles; so nred ~ 0 , Kgreen ~ 2 # W-blue ~ 2 7776 On any particular trial, the probability of drawing a red, green or blue marble is 0.2, 0.3 and 0.5 respectively. Complete the solution Using the multinomial formula, we have = f._____2!_____ [Pi "1 Pz"2.....Pic"*] 5.13 Multinomial Distribution l(na!n2! .....nk!). We know from binomial distribution that each trial of a binomial experiment can result in two and only two possible outcomes. In the multinomial experiment, however, each trial can have two or more possible outcomes. So, a binomial '(oT^Tii)] [(0-2)°(0-3)zCo.5)z) experiment is a special case of a multinomial experiment. Therefore p = 0.135. A multinomial experiment is a statistical experiment that has the following properties: • The experiment consists of n repeated trials 1 2 5 1 2 4 UNIVERSITY OF IBADAN LIBRARY E xam ple 2: Suppose a card is drawn randomly from an ordinary deck ofplaying'cards and then C H A P T E R 6 put back in the deck. This exercise is repeated five times. What is the probability of SO M E C O N TIN U O U S PR O B A B IL IT Y D ISTRIBU TIO N S drawing 1 spade, 1 heart, 1 diamond and 2 clubs? 6.0 Introduction Solution: Having studied some discrete probability distributions in the last chapter, this chapter The experiment consists of 5 trials, n=5 now deals with the study of some commonly used continuous probability The 5 trials produce 1 spade, 1 heart, 1 diamond and 2 clubs; so rij = 1 ,n 2 = 1 ,n 3 = distributions. 1 , n4 = 2 On any particular trial, the probability of drawing a spade, heart, diamond or club is 6.1 Normal Distribution A random variable X is said to have come from the normal distribution if its 0.25, 0.25, 0.25 and 0.25 respectively. Thus, p1 = 0.25, p2 = 0.25, p3 = 0.25, p4 = probability density function (pdf) f i x ) i s define as: 0.25 Using the multinomial formula, we have / w = 1 a M 2.-co .* < o o V2^ v ; P = [(nj n,r.'....nt l)]^ ’,lp a ’" .....P*"*] With p > 0 and a 2 > 0 The mean and variance of the normal distribution can be obtained as follows: [(1! i n ! 2!)] [(° z5)1(0.25)1(0.25)1(O.25)2] E(x2) = f xr f{x)dx p = 0.05859 J-CO 1 - _i2 /x — i#\ 2P. Practice Questions - i V27T(7Z V ° 1. Suppose that a fair die is rolled 9 times. 
Find the probability that 1 appears 3 dx times, 2 and 3 twice each, 4 and 5 once each. 2. In a city on a particular night, television channels 4, 3 and 1 have the rt£ 1 following audiences: channel 4 has 25 percent of the viewing audience, Let Z = —a ' d x a channel 3 has 20 percent of the viewing audience and channel 1 has 50 percent X = H + 8 Z of the viewing audience. Find the probability that among ten television E (x7) = — — f (p + aZ)r e J 2odZ viewers randomly chosen in that city on that particular night, 4 will be a s 2n J-oo watching channel 4, 3 will be watching channel 3 and 1 will be watching 1 f " _z* = ~ ^ j= J (p + crZ)r e 2 dZ channel 1. When r = 1 1 r z2 E(X) = — = (p + aZ) e z dZ \ 2u J-m 126 127 UNIVERSITY OF IBADAN LIBRARY i r r _£i r - Z e ~ d Z key property o f being memoryless. In addition to being used for the analysis of = v f ? l " L e z2l + a L Poisson processes, it is found in various other contexts.r* i * r 1 JL = H ~T=e 2 +cr Z - — e * dZ J - m V 2 n J . co \ f 2 n The exponential distribution is not the same as the class of exponential families of distributions, which is a large class of probability distributions that includes the Recall that ~ e 2 is a standardized normal distribution with 0 and variance 1. exponential distribution as the baseline distribution Therefore E(X) = /i( l) + o-(O) A random variable X is said to have an exponential distribution if is probability density function is defined as Since - j=e~~ = 1 and therefore f i x )’ = X e-^ .X > 0 •• E(X) = n Its corresponding moment about the origin is derived using r* 1 z2 E{Z) = Z - = e ~ T d Z ■'—co yj2 n 4 = E(xr) = C x ' m d xJ — 00 To obtain the variance, set r to 2 in equation ( 1 ) and use = /* xrXe~Xxdx Var(X) = E(X2) — [£,(A')]2, we proceed as follows = A r ^ e - ^ d x J — CO E(X2) = ~ [ Qr -f c r Z ye '^ d Z 1 f 0v0 2n J— co z2 dyLety = Xx, — = e • dx = - = (p2 + 2/i(JZ + cr2Z2) e ~ d Z V27T J dx = y andx = j—co r 1 r® 1 z2= ^ T = e ^2 dz + 2 nd r Ze —i dZ + o2 \ —= Z 2e z dZ = /^2J C- ol )o V+ 2 2tT/icr(0) + cr2 ( l ) J-,,V2d J-my/2n = ^ i o dx = (Sdy VarQ0 = EV(2)-[E tX )]2 = ^2 “ O l1")\2 E(,Xr) = Y ^ f “(fiy)T+‘" le-rt)dy = F - ( l ) 2 1 V E(.Xr) = I / - yr*a~2e-yp iy 7 ! When r = 3 _ T 4_ ( 4 - l ) j 6 % A3 A3 A3 Recall from Gamma function that and similarly with r = 4 Ta = e~xx ~1 dx, then i _ r5 _ (5 “ D i _ 2A o r " 4 A4 A4 “ A4 E{Xr) = — T(r 4- a) This gives the rth moment about the origin from which the first four moments can be derived. 6.3 Gamma Distribution When r = 1, we have line gamma distribution is a two-parameter family of continuous probability distributions. The common exponential distribution and chi-squared distribution are £ t f ) = ^ r ( r + a ) special cases of the gamma distribution. £ (* ) = ^ « ra In each of these three forms, both parameters are positive real numbers. The parameterization with k and G appears to be more common in econometrics and = ccp certain other applied fields, where e.g. the gamma distribution is frequently used to When r = 2 model waiting times. For instance, in life testing, the waiting time until death is a E{X„2 ) = JP-2T 2 + a random variable that is frequently modeled with a gamma distribution. 
Ta A continuous random variable X is said to have a Gamma distribution if its probability density function is defined as follow /?2(1 + a ) r ( l + a) _ X ra f M = -‘ ^ p , x > o .c c > o ,p > o /?2(1 + a )a ra = To 6.3.1 Moments of Gamma Distribution = a ( l + a )/?2 x re ~ i x a_1 Therefore, we obtain the variance of X using the fact that dx Vcc Pa Var(X) = E(X2) - [E(X)]2 130 131 UNIVERSITY OF IBADAN LIBRARY = a( 1 + a )/?2 - (a/?)2 = a /?2 + a 2/?2 - a 2/?2 1/a rp O = a /?2 When r = 3 £ (* 3) = ^ra T(3 + a ) and finally = f a l " l “ ery"~I ^ with r = 4, we have Since Vx = / “ e5'}/®-1 dy as before. £ (**) = F r r (4 + a) Then, 6.3.2 Moment Generation Function of Gamma Distribution Mx( t ) = (1 - / ? t ) - ‘ The moment generating function of a random variable X distributed as Gamma i.e. Differentiating the above and setting t to zero, we obtain the first four X~GA(aP) is derived as follows: moments about the origin as follows Mx(t) = E(etx) = f elIf { x ) d x Mxl (t) = ap a - p t r a- 1 j — 00 E(X) = M](0) = ap /"or — h < t < h M]1 = - a p 2( - a - 1)(1 - p t y a~2 etx e Pxa~l M,(0 _= Jo dx E(X2) = AfJKO) = a 2p 2 + ap2 Ta p° Var(X) = E(X2) - [E(X))2 = a 2p 2 + a p 2 - (ap)2 1 C e t x e ~ * x a- X --------- -̂----------------- dx = a p 2 ra p a Va P a The characteristics function, the second characteristic function and the cumulate generating function can be obtained respectively as x(t) = - a log(l - pi t)and i: dx Kx{t)= - a \ o g { \ - pt)r a p 1 6.3.3 Maximum Likelihood Estimation of parameter of the Gamma . 1 rv■>(*' ‘) x a dx Distribution EccPa Jo Let X\,Xi,.. .,Xn be a random sample of size n taking from Gamma distribution, the 1 r« iF)̂ dx likelihood function is= r a P a J0 Py * * i-pt 132 133 UNIVERSITY OF IBADAN LIBRARY L = £-(*+i)+i (Ta)np an ' K(3K I00 k FTTT+T The corresponding log-likelihood function is \p n Z”= i xt LogL = ----- - — + ( a - 1 )^ l o g x t - n logr(tr) - a n log/? X~K 0 0-K(3K — . n H K \ P Differentiating this with respect to a and /? we have dlogL 1 ra l , o P K loo _ -0/?* p K— = ^y\ o g x i - n — -nlog(3 X K \ P 0* p K i=i 3/o^Z, _ sr=i^/ an J pci where p , x > 0 It is interesting to show that , r -K / ” /■(*) dx = 1 , this is as follows = KpK r - K \ P U' = ^ — oo r—K ■ p dx dx /2y)‘ dx = (y[2y )a = -y[?=2y dy j IcJye dy When r = 1 2V2a3 f ” I -v , ^ 7 f / 0 >*' * I m - T T r H _2_ 3 __2_1 1 J R V2 ~ JR 2 V2 2§ar2 Since P^ = Vir, we have ~V^T 2ia 2?2a 2 1 1 V Z ~ ~ j T F i 2 r 2 = 1 « E D - 2 When r = 2 This affirms that Maxwell distribution is a true pdf. ?1+in2 3 2 £ « 2) = Vtt P2 + 2 6.5.1 Moments of the Maxwell Distribution 4a2 5 /•on = ( xvw dx 2 J — CO i«h3 J<> 00 2 2T+r- ^ 2~+ra 3 + r e - y E(XT 2 ?1 y z1 dy V dx ■ f n J r . 1 1 1f c o 2 1+J y 1+ J ax2+re 2a7 dx > T - 2 zI y 2I r e - y dy'0J0 Using the notates earlier, we have 139 UNIVERSITY OF IBADAN LIBRARY 25a32} m n = Since fa = (a - l ) 3 5 _ n 2 2z a3 22a 3 V7T 7T 21+5a4 3 4 fc’O T = “ V T ”1 2 + 2 23 a4T j = ~ V T ~ ' From which the first four moments can be derived 3 1 1 _ 3 1 8a4- r - 8a4- - T-Sci- nce Ir -5 = -3r3- 2 2 _ 22 22 2 2 2 2 f 2 _ 4 f 2 8a4- - - r- s m ~ r n = 15 a4 The third and fourth moments about the mean, i.e. 
n3 and ^4 can then be obtained as ypn 4 = 3a2 Var (X) = E(X2) - [E(X)]2 /16 / ̂ \ 2 * = 2a T " 5 J i = 3 a > - f e l = a4( 1 5 - - Finally, the coefficient of Skewness and Kurtosis is thus: n 5, = ~ 0.48569 When r = 3 (3-i)1 5v - ^ t- 3 ~ 0.10818 141 UNIVERSITY OF IBADAN LIBRARY dG{t) G \t) dt C H A P T E R 7 = £ * / " '/>(*) PR O B A B IL IT Y G E N E R A T IN G FU N CTIO N S (P G F ) = Y j x Pw => G(i>= 7.1 Introduction 4. The variance of X is given by: The probability generating function (PGF) for a discrete random variable is a power series representation (the generating function) of the probability mass function of a Var [X] = g1i,"+ g ;1)-[g ;„]2 random variable X. Proof:G;„ = Y x P<*> PGFs are often employed for their succinct description of the sequence of probability X P[X = /'] and to make available the well-developed theory o f power series with non- negative coefficients. Gtn = I ( x 2-x ) i> , , / '-2 Definition 1: The probability generating function (PGF) of a random variable X is defined as: G,(t) = E[t ‘ ) = J X - I.t f />[* = *] o,„ = where: (7,,,"= E(x!) - E(x) Gx{l) is defined only when X take values in the non-negative integers P(X=x) is the probability mass function of X. G,., = E(x2) - g ;„ The notation Gx is usually used to emphasize the dependence on X. V(x) = E(X2) - [E [X )Y r ~t 7.2 Properties of PGF V(x) = E(X2) 1. The probability mass function of X is recovered by taking derivatives of G. ■ M But IE(x2) = < P(k) = P(X = k)= GlK)( 0) K\ Therefore Var [XJ - g„i + g ;„-[ 2 . If X and Y have identical PGFs, then they are identically distributed, i.e. if there are two random variables X and Y and Gx = Gy, then fx = fy. 3. The expectation of X is given by E(X) = G ‘(1) Proof: G(t) = F.(t*) = £ C P ( .r ) 143 142 UNIVERSITY OF IBADAN LIBRARY 7.3 Probability Generating Functions x is the random variable 1. Bernoulli Distribution (i) Mean: The probability density function (pdf) of a Bernoulli distribution is given by G(/) = E [0 ?[X =x]=P'q'-J = £ f P [ X = x] (i) Mean: »»» Gx(t) = E[t*] = £ j t 'P[X = x] = t°p°ql4) + t ’p 'q1' 1 G*(t) ^ q + pt Gl)=P G;n = P = E(x) (ii) Variance: G " « - ^ a a = [p /+ 0 -p )]" dt1 Therefore, G '(t) — p G(t) - \pt+q\" G"(t) = 0 G ' ( . ) = ^ 0 But Var(X) = E(X2) - [E(X) ] 2 dt ~ n \p t+ q Y And Var is the probability of success ■- np - np' ij the probability of failure 144 145 UNIVERSITY OF IBADAN LIBRARY = np(l-p) C H A P T E R 8 Var(x) = npq M O M E N T G E N E R A T IN G FU N C TIO N S 3. Poisson Distribution 8.1 Moment Generating Function (i) Mean : The moment generating (m.g.f) is one which generates integral moments when these G(t) = E (tx) moments exists. e~AX* (i) For the univariate random variable X, the mgf is given by x\ «,(< )= ,-0 , Where t is a dummy variable (ii) For the bivariate case -'e have corresponding = G(t) = Where tKand t2 are dummies and the random variables X x, X 2 are jointly G‘(t) = Xe~x^ distributed. G'(t) = /Uf'lw (iii) In general for multivariate case, we have ~Xe° = (<„/„-(.)= E{e... ......... -■) G11 (t) = X.Xe-*'* The moment generating function Mx(t) of a random variable X is defined for all real = X2e~i+il values of t by G "(l) = Mx(l) = l£{etx)X2e~i + i = X 1 (ii) Variance : ( YixetXP{x)'> i f X is discrete Var (X) = GII(1) + [GI(1) - [G'(1)]2] [ / ^ e txf(X) d x ; i f X is continous = G"(1) + G '(1 )-[G i(1)]2 Mx(t) is m.g.f. because all the moments of X can be obtained by successively differentlaity Mx(t) and then evaluating the result at t = 0. 
= X2 + X - X2 = X Example: I f / (x) = X = 1 ,2 ,3 ,4 M M = Z U i e ‘*fw = -4 e l + -4 e 2t + 4- e 3c + -4e 4t If Aj and Xx have the same pdf and Y = X2 + X2 My(t) = £ [e t(*i+*2)] 146 147 UNIVERSITY OF IBADAN LIBRARY CU II = E(e“ ‘.e“ *) « , « = [M ,(0]2 For the discrete distribution. If X has a pdf / (x) with support {a1( a2, ...) then = i6 e » + i16. e3 .+ ±16 e« + i16. e« 1i6. c« + i16 e« :1l6 e « + i16 eet « ,C 0 = X e" dw Example: R i y = /(•.)«“ ■+ /(a1)«“ , + - Let Y be a discrete random variable with pdf x — 0,1,2, Hence, the c.d.f. at effll is = P(X = aj). Thus, the probability of any value X say a f is the coefficient of e tVl. Example: Let the moments of v. be defined by E{Xy = 0.8, r = 1 ,2 ,3,...) y=o yi Then oo r oo ^ Mx(t) = M(o) + ^ 0.8 (—) = 1 + 0.8 0.8 w = r et>'(Aet)>' r = l r = l y=o yi = 0.2 + 0.8 ^ 0 . 8 ^ ?? r = 0 = 0.2eot + 0.8e“ y=0 Thus, P • X = 0) = 0.2, P(X = l+ = 0.8 = = p A ( e f - l ) m; ( 0 = ^ e ( O v ( ^ f)y , t a « r) _ ^ f ( A 0 2 Smce = Z “ 7 T - Ae = — = 1 + 1 T — y=o = *[£«(*“ )] = E[Xetx) MJ(t) = Ae(exp{(A(ef - 1)} Since the interchange at the differentiation and Expectation operator is allowed, we m;(0) = a can assume that; My(t) = (Aec)2exp{(Ael - 1)} + Aec exp{A(ef - 1)} =A2 4- A V'ar(y) = m; ( 0) - (Afy(O)]2 => A2 + A - A2 = A for discrete case Obtain the l/a r(r) given that Yar(x) = M;(0)-fM.;(0)]2 >49 UNIVERSITY OF IBADAN LIBRARY = 0 x ( O - 0 y ( O Also, the M.g.f. ol a random variable uniquely determines the distribution. • " / » * ] - / s « B,/r» * ‘ Example: for continuous case If X and Y are independent random variable with parameters (n,p) and (m,p) Example 3: respectively. What is the distribution of X + Y. From an Exponential Distribution Mx(t) = (Pet + (Pef + £?)n+7n = J etxXe~^ dx Example o Calculate the distribution of X + Y when X and Y are independent. Poisson random variable with means Aj and A2 respectively. Solution = aJ dx = SdyMA O E(e“ ) = J e a f a d x d x oo w _ylMAO = e" t+— e 2 fidy / 1 -X *z£ \7 / (pVTn= e e b / dx' f i n d 2 15 3 152 UNIVERSITY OF IBADAN LIBRARY 1 00 00 = e*„ t+£2£ i f —1— e _zi dy tt VH + r t At r r r 12 J 0V2tt M*(£) = (2ton/2u i v 2 / • • • / exp [ - 5 " The function in the integral is a standardized normal distribution. Therefore. -A t) ] dxa ...dx„ If we lety = x - y - At Mx(t) = e^£+ 2 since e c » y +. -1 t.1x A, „t 0r0 «f i , . (27r)n/2|A|V2_. CO -C O f —l = e ~ y ^ dy = 1 —J (p V 2n By examining 00 • Lei X~Nn(jin, A), then the moment generating function of X is given as Mx(t) = e*V + J ... J e ~ 2yi/>~ly dy 1 ...dyn , Proof — 00 —00 We know from Alternate Integral that = (2tt)t1/2|A|1/2 i.e. Ankens inTeyrat CO 0 0 j ... J e 2*lAXdxi. . .dxn = (27T)n/2|>4| 2 Mx( t ) = ------- 77 r - ( 2tr) / 2|A|2(27r)n/2|A|=-00 -00 Where A = I = variance - covariance matrix OO 0 0 M(x) = e ‘*y + ; t xAt = (2 7 T ) 'I |/ i r i J ... | e x p [ ^ - i ( x - / i ) 2i4_1( x - / i ) ] dx1 ...dxn -00 -00 8.2 Bivariate Distribution If wc Icl Let A' and Y be jointly distributed as L = tx " ( x — a/)1/! 1( x - / i ) , then simplifying this we have = exp {- (*+;>)} L = - ̂ ( x - yt - At)x(x - y - At) + tV Obtain the joint m.g.f. Solution: M, ,.(/,,/,)= ) 154 155 UNIVERSITY OF IBADAN LIBRARY = \ \ e e ^ * r)dydx. = npq = JJe-4'-',)-r' (/,./, at 1 = 0 , then divide by i[r). Unlike the ordinary expectation, the characteristic function always exists. 
(ii) For the bivariate case This is because EE\(Xx r Hi )= ,{r1+ I), f ' r ’jv Vo*1.0y01 | a (00) Examples: The characteristic function for the binomial distribution is given by 1 rrr,0,0) 9.2 Exponential Distribution The p.d.f is given as X < X < 00 = ( / v + 9 )' The C.F. is oo Exercise eitx-e~X/°d7 Obtain the joint characteristic function for the following: (*) /(x ,y ) = exp{-(jc+ v)| (//) f ( x , v ) - — expj - —(r" f y ’\ 2 n [ 2 dx- } { • * * > (/«) P{x,v) = = 6 ~ \ e - 0~l -it 158 159 UNIVERSITY OF IB :----- --- ' "• ̂ -f -ADAN LIBRARY = (n - l)(n - 2) ...3.3 f ( l ) and f ( l ) = / 0° V xdx = 1 (1-iflt) 01 (t) = +£0(1 - i0 t)~2 The Characteristic Function of the Gamma distribution is obtained as: m1 = 0 1(O) = 7 = 0 0(0 = E(eicx) = / e itxf (x )d x 0*(t) = 2i29 2( l - i0t)~3 0 n (O) = 2 9 2i2. Aeitx-Ax (Ax)K- 1 / e^.A-^CA x)*"1m 2 = 0 l l (O) = 2 6 2 I W r(/c) Var(r) = m2 - tn\ = i l ; oox it-ic - a - io dxr(k)Jo A = 2d2 - 92 r oo e - ( A - “ ) x A k x k _ I , = e 2 = Jo ------ roo------ dx 0 (0 = ' W" ‘t)x <** 9.3 Gamma Distribution A random variable is said to have a gamma distribution with parameters (t, A), A > Using Laplace transformation, we have 0 and t > 0 its density function is given by Ak x > 0 (A-it)kr(*) Ak(A4)kfix) - K ) 0 x < 0 0 1(t) = iU fc( A - tO _k" 1 i * AkA-k_1 k where mi = 0 l (O) = i ~ A m2 = 0 1J(O) = ++ l )£AkA-k-2T(t) = j e~y y l l dy o integration by parts yields Var(x)- =G m)2f ke\ 2 k fk \ 2+ » ~ ( l ) = - e - v 1 | o + J e_y(t _ 1)y t _ fc ” A2 = ( t — 1) J0°° e~yy l~2dy = ( t - l ) r ( t - ' l ) . If X is a random variable of the discrete type [i.e.x = 0,1,2,...] with probability If follows that function. P(X = x,) = P(X), then the characteristics function of X is define by T(n) = (n - l)r(n - 1) «K0 = £ (e ltx) = I * P fce‘tXk ....................... (1) = (n — l)(n - 2)f(n — 2) If X is a random variable of the continuous type with pdf/(x) 160 161 UNIVERSITY OF IB T!. . - ;------>—>ADAN LIBRARY M © "g II 1 then 0 (t) = E(eitx) = f*™ f w e ltxdx (2) Example: The moments of a characteristics function can be obtained by continuous differentiation of the function (discrete or continuous) r time and dividing the result since [eicx] = 1 and T.k Pk = 1 or f ^ f a d x = 1 by ir then /_+rA x )fe£trl dx = 1 i.e. pr = ; r th moment The summation in (1) and the integral in (2) are absolutely and uniforming converged. Thus, = 1st moment Thu, the characteristic function 0 ( t) is a continuous function for every value oft. Second moment p2 = 1 ; P3 = ■ /3 ; Properties Since 0 r (t) = irx rf(X)eltxdx ( 0 0(0) = E(e°) = E( 1) = 1 and 0 r (t) = Z k irx rP(Xk)e itXk (ii) [0 (0 1 = |£ (e ‘“ ) | S £-|e‘“ |) = 1 Example 1: Hence, |0 (t) | < 1 (iii) 0 (—t) = E[e~ltx) = E(_Cost X - i S i n t x ) = E(Cost x ) — iE(Sint x) = iPeicp = i2Pelt 0 ( - t ) = E (e itx) = E(Cost X + i Sin t x) = E(Cost x) - 0"(O) = i2P iE (Sintx') E(x2) = $ r = P thus, 0 (—t) = 0 ( t) ; a conjugate to 0 (0 E(X2) - (E(X))2 p2 = Var(x) = P - P2 All c.f. must satisfy the above condition. = P ( 1 - P ) Example: = pq Let X be a random variable from the Bermoulli distribution. 
Obtain the Characteristic function Example 2: Suppose X is from a Poisson distribution the characteristics function is given by Solution 0(t) = V co *—l k=0 , tCX/lxe •* = Z i = o * = 0.1 0 W = I x!x=0 _ gl'tOpOgl _|_ eiC(l)p1q° = q + P e il it-*x - o o Y ^ ' ) = 1 - P + Peu = e L ~ * - x=0 = 1 + P (e<£ - l) = e-*e* 'U =e*<-e‘l -V [Ae“ + l] 0'(O = - t e t2/2; 0'(O = 0 0*(o) = A?[A + 1] e W = ^ = £ = o cr2 = M2 = £ (* 2) - £(JQ2 m 2 = E(x2) = * M = t 2e - t2/ 2 - e- t2/ , = A(A + 1) - A2 = A i2 Example 3: The characteristics function, and moments of the standard normal £ - 1 = 1 “ i2 distribution is given as: Var(X) =1 m 20-= - =m1l 00 0 (0 = j eitrf(x)dx Exercise: where -00 Obtain m 3 and 77i4whatare your observation(m2,7n3)7n5)ableto equal to zero 00 For Binonial distribution = 0 (0 = | e*txe~* dx — 00 00 1 /■ ” _ f x 2 - l t x \ , 0 (0 = ̂ e ltxPx ( 1 - />)»-* = F ^ J - e (— J dx x=o x By completing the square in the experiment = £x (* ) (P )"-* = {Peu + q)n = J L . J e - U l z ± ) \ M ! . dx ■ F 2 n J \ 2 ) 2 Xt) = n(Pei t+q)n- 1iPelt 00 1 f - i ( x - i t \ 2 - t 2/ 0 '( 0) = inp = f 2 n J e , (— ) e ' 2dx ma = — x 0 0 Z(O = e ' * * - ' ) The characteristics function is defined as e [A,e‘c+Aze lC-/ti-A2J 0t(O — eitx° 0 1 (0 = a , - A2e(A>e Oi.e. non-negative Var (x) = £[(* - £ (x )]2 = 0 Iff. P[X - E(X) = 0] = 1 or Exercise Obtain the mean and variance function for each of the following: P[X = £(*)] = 1 Thus, we find that the random variable X has a one-point distribution. (/) r f ' = e x p j i> / - - i r V j (/7) f f 1 = a [ a - i t Y 9.5.2 Two-Point Distribution (ui) ( p t h ) = - a . r f a A random variable X has a two-point distribution of there exist two values xx and x2 set. 9.6 The Inversion Formula P(X = xa) = P, P{X = x2) = l - P (0 < P < 1) The characteristic function corresponds to a family at distribution which is obtained If we put x, = 1 and x 2 = 0 we have by adding an arbitrary constant to a d.f. o f a random variable. The inversion formula P{X = 1) = P, and P(X = 0) = 1 - P is a tool that can be used to get back the original distribution function on the entire Then the above qualities as a zero-one distribution. real line if the characteristic function is known. A very good example of a zero-one distribution is the Bernoulli Distribution 0(0 = Pelt l + (1 - P)eu 0 Theorem = Peil + (1 - P) Let F{x)and tf>{'] be the cumulative distribution and the characteristic function = 1 + P [eu - l ) of A' respectively, then for given real numbers a and b, the inversion formula is 0'(O = P defined as 0 (0 = P F(ll] - F, , - P.im — f ------- 4 i ]dt 0'"(O = P ,n| f — 2 tt l it For every K Proof mk = P c Pi = Par(x) = m2 - m2 ‘ 2n J it = P - Pz = P ( 1 - P ) First we need to show that | 1 < b - a hence bounded. Sgn P = 0 it F = 0 Now it is possible to apply the Fubini's theorem to Ic as -1 it P< 1 - J r e""* -* Corollary: (Modern Probability Theory, (1985) 2/r it i e‘udF{l)dr, (a_e"(r '’•can be written as Proof: Car/(x- c) + iSin t ( x - a ) - Cost(x-b)+ iS in t(x-b ) If Fand F 1 are the two d.f.s. corresponding to a given characteristic function then from the above theorem. . j _ _J_ j j Cos-/(x - a) +iSin /(x - a)-Cost (x -b )+ iS in l(x - a ) d t ) ^ ^ , - ^ , = ^ , - ^ 1 (*><■) multiply numerator and denominator by i At all the common points of F and F 1. I r 2i ( Cost (x - a)+ i Sin t(x - a ) - Cost (x - b) + / Sint(x - a)dt) \ , Allowing b to vary for fixed a = 2tt}x i [ it y " K.\ - = = a constant asC ->x> Bui F ^ - F ( + qo) — 0. 
Allowing b to increase infinitely through continuously points of F and F 1. This implies that F ^ - F{u) - 0 and hence continuity points of both. 170 171 UNIVERSITY OF IBADAN LIBRARY \ e‘' (g + p e ^ 2*-J, ---k- + —k = 0.; xx b i = ̂ Z [ J - p V _/2 J C a y /(x -y ) - /S w /(x -y )* ■ - Jo-dFu) + j + j W „ , + J - ^ , But -* -II -* -« _ rS & i/^ -y )* r C<»/(x-y) I (( *x--y/ )) 0 J F(xT-y)) = /r -0 = (̂o+Ol “ ^u-0) + I f a ) ~ V . ) + ^ [̂ (o*0) — ̂ *(o-0) ■ - S " /> V " '* r A j . “ ^*) “ (̂U) rim II' (i and /) arc points o f continuity of F . = iz / > V ' *a\J, Example: /> V " If CO / / ■ ^ ( f - x ) f i r - * Given ^J'1 = ( > > (/■ -x j /* w li 1 7 3 172 UNIVERSITY OF IBADAN LIBRARY C H A P T E R 10 IN T R O D U C T IO N T O M E A SU R E T H E O R Y 10.1 Introduction Probability theory is a part of mathematics which is useful in discovering the regular features of random events or phenomenon. In probability theory, the sigma algebra (which we shall define later) often represents the set of available information about a phenomenon. A function (or a function of a random variable) is measurable if and only if it represents an outcome that is knowable based on the available information about the experiment, the event to which it belongs and the probability function. For us to understand how a probability measure can be obtained, let us develop an abstract model for the probability of an event particularly for infinite sample space fl from a specified experiment. 10.2 Abstract Model for Probability of an Event I.etfl be the sample space such that H = {w* i = 1,2, 3 ,........ } w, are called indecomposable outcome or simple events. The is a decomposable or compound events, that is Ex = {wj i = 1,2,3 } The elementary definition of probability is PART TWO r , ( r \ _ No o f fa v o u r a b le cases^ ' T o ta l n u m b e r o f ca ffles ...................... ' ' Since events are subset of H , it follows that the union and intersection of a finite number of events and the compliments are also events. (1) For the model of mirror reality, the operation above can be represented by A, B, A U B,A n B, A , B . That is all statements about events can be written in terms of u,.n. (2) A random for defining probability in term of weights is to allow for the fact, that some events are more likely to occur than others. The weight of a set is just the sum of the weights associated to each point in the set. 174 175 UNIVERSITY OF IBADAN LIBR RY Let ft be sure event, the impossible event will be (p. Let A be a non-empty class of subset of ft called events. Let P(be the probability) be a real-valued function defined Any collection of events is a class of events. Classes will be denoted by A, B, etc. on A. Such that P(E)denote the probability of event E. The pair A, P is called the probability field and the triplet (ft, A, P) is called the Example probability space. Let ft be the real lineR containing all the real points w. i. e. ft = {w: — oo < w < oojalso let 10.4 Axiom for Finite Probability Space A - {w : we(—co, a)} and (i) If Ej 6 A for i = 1,2,..., n then B = (w:we(c,d)} Define: n n (0 A n B ; (ii) A u B\ (iii)Ac and Bc and give your assumptions | J e< G A and f~ | Et G A (iv) Show that the compliment of an interval need not be an interval. <=i i«i 0 0 If E G A .then E' 6 A Solution (ill) If E e A . then P(E) > 0, also P(ft) = 1 A r\B =

d) The number of possible outcomes of an experiment (E) may be finite ot infinite. Ac n B = B i f a < c < d Let w denote a sample point (an outcome) from the experiment. Ac U B = Ac i f a < c < d Let ft denote the totality of outcomes of E i.e. ft = {w1( w2, ...} BCAc i f a < c < d Let event A={w: w< eft}be a subset of ft. e.g (i) B={Wj - oo < w < co); all values on IRL On your own. define the above if c < a < d ■ or i f c < d < a (ii) C={wi: a < w < b}-, all values in the range (a,b) Sequences and Limits (iii) D={w,: w0); a singleton. A sequence of sets is an ordered arrangement of sets in order of magnitude (iv) E={w,: Wj, w2, }; a doubleton. .Monotone increasing sequence: A sequence of { sets {/ln} is said to be monotone (v) F={w: iv. = 0); an empty set. ( increasing if An Q An+, for each An. The class of all subsets of f t is called the power set of ft such that if f t contains n If the sequence {/!„} n = 1,2,... is monotone increasing (non-decreasing) if for every points, there are 2n subset of ft. Thus, if f t is finite, the number of all possible subset is also finite. n, wchave An+1 3 An Then the limit of (/ln) is the 3mm of the sequence i.e. The power set of f t when ft = {w,. w2, w2. w4) => 24 = 16 176 177 UNIVERSITY OF I ADAN LIBRARY (ii) The limit of {An} is said to exist if limAn = lim An = A, A = Y An = lim An (iii) If {A^} is not monotone and A exists then An —» A i.e. An converges to A. nZ_ilj n—oo (iv) Even if 1im An does not exist, limAn and lim An will always exist. or ■ n a OO = Example:( J1' 4* = An; U a ‘ = a i. e. An T Ak k = l Consider the sequence {4n} where = {w: iv belonging to all Ak except Av ... A„ = w: 0 < iv < b + ̂ ^ " /n ; (b > 1) CO Sup = i > Does the series {An} converge?. k = T For any arbitrary monotone increasing sequence {An), the limit is OO OO C = linAn = li sup Ak = |~= l| Ak k[=Jn Ak Solutionk fiv: 0 < w < b + —\ ,‘i f n is even, Monotone decreasing sequence: A sequence of sets {An) is said to be monotone Let Cn = * nJ [w: 0 < w < b + ( 7 (n + x)) j ; i f n i s odd decreasing if An+1 Q An for each An. If {An) n: 1, 2 ,...) of events, is monotone decreasing (non - increasing) and for every limAn = {w: 0 < iv < b) n we have An rj An/+,',1 , then the limit is the product of event [An) i.e. Similarly,A — Final An — limn_=oo ‘A4n or [w: 0 < w < b - (Vn)]l t f n s = {w: iv belonging to at least one o f An, An_ i ...)n oo { [iv: 0 < w < b - (V (n + X) ) ] ; i f n is evenr- 1i i e A n l A limA„ = {0 < w < b]k= \ k Therefore, lim An * limA„ For any arbitrary monotone decreasing sequence {An},the limit is Hence. {/!„} does not converge OO In f Exercise: k=n If An = A: n = 1,3,5,... = B:n = 2,4, 6, ... Limits: B - UmAn = lim inf Ak = k - l k = i Show that lim An = A u B, limAn = A n B Note that When docs lim An exist. (0 linAn £ linAn 178 179 UNIVERSITY OF IBADAN LIBRARY Exercise: Corollary: Examine the following for convergence, if convergent, derive the limit; p | /l, = i4x + A\A 2 + ACXAC2A^ + - W ^ - = (0 ,V 2 n ) .^ " « = [ - l . V(2ntI)] £=1 If tb) An = | the s e to f rational in ( l - 1/^n + ^ 1 +- */n)j co (c) An = 2-1/n, 2 + 2/n), n is odd. W 6 P j Ait then w belongs to some /lf i=i Thus w may belong to Ax or Acx or Acx or A2 or A \A \ i.e. w G Ak for some k. 10.2 Obtaining Countable Class of Disjoint =» iv £ U ?.i establishes equivalent of both sides of (*) Lemma 1.1: Given a class = 1,2..... 
n}of n sets there exists a class {/?,-, i = 1, 2.....n) of disjoint sets such that U”=i At = Ef=1 Bt 10.4.1 Definition: Additive Set Function Proof: By induction A set function /u is said to be additive if V A,B,sJ. = 2 Note Then = ( U ^ / l i ) U /lm+1 • • Once the value + oc,-oojs not allowed i.e. *-co • If all the values of

«?’ 'Sl> vv . a -ADAN LIBRARY For every decreasing sequence {En} 1 Theorem s.t. n0. The probability function Plmi is a set function that has r - additive property and hence A set function is said to be continuous if it is continuous from above and below. is a measurement space. Theorem Let cp be finitely additive and continuous from below, then f i is - additive. Krample: Let (O..F) be a measurable space on which a sequence of probability t measure Pi,Pl ,...Pu... defines a set function. Proof < j Show that P [ E )d . / ^ J— /^,(£)is an additive set function. Given a sequence of disjoint sets{£„}, then 2" ., Solution It is required to show that Let N be a finite number, since (p is finite additive, then (0 0 < /? .,< l (ii) Plm, is counrably additive and is a measure V.ns| J » “ l (iii) Prove that P(f2)=l »N» NSi i -VX sn -fl I“ iX . . (y) t̂£) = ~ ̂ (E) + ^ r P:(£) + JT PJ(M + ” Let Sv = En be an increasing sequence but -l P ,m 2 0. - L p 2lE)> 0 ... l»«l (p\?im S y J= Cim tp(Sn) and + 2- y - i - . s - 2 _ = _ l » i n r2 n “lfl * i - r i _ y 2 ■ # ■ ) 0 .,< I (ii) 1.x! 1/i, i be a sequence of disjoint set, it is required to prove By finite additively •/ «* iv .y| , » ._. \_ > 'i» )1 nsI ns| from i .1’ c (pis r - additive. I S3 1 8 2 UNIVERSITY OF IBADAN LIBRARY * 1 ® Proof: (Using mathematical induction on n) 4 1=1 / = nm| Uu-i * / for n = l: P {E,) = ) for n = 2: P (E , [ J E 2) = P f c ) + P (E 2 ) - P ( E xV \E2) = t ± t r M nm | ^ 1=1 The result is true for n = 2 Since each of /*„ is a measure and 0 < P(t) < 1 » = 3: />(£, U £ 2 U E, )= />(£,) + P(E2) + P(E,) - P(E, R E2) - P(E, R £ , ) - P(£2 R £ 3) * +p(E ln £ , n E J) ■■■-I 1 1=1 - Z*-i ' W L, 1«-i ?z !J = 2 > U ) * i i-i Assuming it is true for n and also tme for n = m -1, we have - Z1-1 ^ ) p (£ ,U £! u ...U £.,.i) = / :( u £. 1 = Z />(£ . ) - i > f e n £ y)+ I f e n ^ n s y )V i - i / <■ ! i< /< y< o«+ i i s i < y < i & n + i ^.)is countably additive. t ( - i r !p (£ ,n £ 3n ...£„ .,) (Sii) = 1 ^ . ( 9 ) « = m : i { y £ , l = / ( [ j £ , U £ . ] - i p ( £ , n £ y n £ „ ) + x ^ n ^ n ^ n * . , ) V i - i y V i - i ) i s < s y s * s o » - i =*±=l z± (1) 1 r £(£, n e 2 n ... n ) 1 1 1 . Assuming it is true for it = m, we need to prove that the theorem is true for n = m +1 2 4 . 8 - >1 • S in 6 S K =.— = - ^ _ = l • -1- ^ : m Le tE = [ jE n then /-i. 10.5 The Halley-De-Moivre Theorem ^ Q e ,.J = P ( E U E .J Theorem: Let { f jb e a class of events each of which belongs to a r - field 91, and each of which may or may not occur. Then = ! > ( £ , ) - Z p ( E ,n £ , ) + ( - i r + z ^ M ^ n £,)+ ... P\at least one o f the event E\ occur} «■ I ISi IS/< j ( £ . ) - Z ^ . n £ j + Z r a . n £ / n £ ‘ +( - T l« ’(n £ ,) This implies that the result is true for all positive integers n. V i = | / » « l IS li j& n IS 1< 7< *S » i 1 8 4 1 8 5 UNIVERSITY OF IBADAN LIBRARY Erwuupie: Sotuttrn: Lc-i 11 be events which belongs to a r - field % shew that^e probability (j) Let l:\ denote the event that ihe i h letter and envelope •march that exactly K events occurred out of n is given by ir./! Where S, * fl Ea f | ... D £» ) r y~ r3 X4-■ l- r 1 Since e 'x = 1 - x + :----— + ----- ... .. 2! 31 • 4! From Halley-De Moivie theorem )E.] = Si - S 2-+Si - . . . + ( - l} - 'S ll {i.e.Pmi?(q/'l or 2 o r3 or...or N match) envelope ■ <«i J If k = 0, no event occurred: .v..p(|Ji:, j - i -er'1 = 0.63212 V<»i .> -“(A*, U£, U...U £ j - 1 ■- / ’(ft'U E, U -U £n) . = 0.6 = 1- S’. r V o . t f - S . , ( n ) Takirfe limit as N —>x> , . 
- P{nOn o f the events occurred) _ l r 1 . ______ j i” L ___ 4I- Example 2: . . . 2! 3! 4! Suppose /r letter and corresponding envelopes are typed by a typist. Suppose 1 + 1 1 1 1_____ .1. further that the messenger, who is in a hurry to leave for the post office, randomly 2! 3! 4!■ft insert letters into envelopes, thinking erroneously that all the.letters were identical. Finch envelope contains one letter, which i.e. equally likely to be any one of the p 10.6.2 Countable Probability Space letters. : Sometimes it is impossible for all the sample points in a fi to be equally likely. Hence, each P, is viewed as unit probability mass among the sample points following li) • Calculate the probability that at least one of the letters is inserted into a certain rule or law. This law is sometimes referred to as probability distribution. its correspondence envelope. (it) Find the limit of this probability as N -» oo Example: Fora geometric distribution Suppose Q = {0.1. 2. ...}and / » , , * ( \ - 6 ) 0 \ x = 0. 1,2...... (0 < ^ < 1) m 187 U '—NIVERSITY OF IBADAN LIBRARY Then Pt - P[x) > 0, £ / | v) = J ,2) From (1) let \ = {{a}, fb. c}, fd}, fi, 0} and P{a} = P{d} = V 4 c) = Pffi} = 1, P {0} = 0. If ACO^then P(a] = Y j P[a ' The triplet (fi, L P) is not a probability space since ̂do not form a field. xO.4 Poisson Distribution Exercise C-et 0\ x = 0, 1, 2,... (2a) Is f = {/Ft, £2» ■*•.£*} afield. (b) Hence or otherwise obtain all the elements of the a - field of t. T^en P(s\ is a Poisson distribution and X is a Poisson random variable (3) Consider the sample space F = {0{W!, w2}, {w3, w4}, fi} Definition 3: \ f A = = {w3#w4} A class of sets A is called a field or a - field if and only if the following conditions Show that F is a field; • hold true. 1. If E, 6 A, then U"=i Ei 6 A Exercise: 2. If E 6 A, then E' G A Let Er,Ei, '..,En. denote an infinite sequence of events in a — field A. From the above, it follows that Define 3. If Ej 6 A implies U"=i Ei G A , " W . . Example h m=n . • OO A = {fi, 0 }is a field.C B„=IB == [A, A }is a field m - n{A, H, 0} is no t a field, since A g C (a) Prove that BnCEnAn V-n . G= (A, B . A B . A U B.A U B , a U B ,A U B , A B ,A B , A B , A B , fi,0 } is a field. (b) Show that {/4n} is monotone decreasing. The class of all subset of a given set fi is a field. (c.):show that {fln} is monotone increasing. Example 2: 10.7 Sigma Field (o - Field) (1) Let fi = fa, b, c, d} and 5 = {{a}, fb, c, d}, fi, 0} A non-empty class of sets which is closed under complementation and countable i-e. ? a field? Yes P(a) = ^ , P(b, c, d) = ^ unions (or countable intersection) is called a field. //ps-CQ, l;, P) is a probability space. Note: Yes, since ̂forms a field. • A field containing an infinite number of sets may not be a 0 - field. 1 8 9 1 8 8 UNIVERSITY OF IBADAN LIBRARY M I Into section oi an arbitrary number of a — fields is a o - field. 10.8.1 Borcl Set 1 0 .8 B o r c l H e l d Borel field and Borel sets play a very important role in the study of probability. Hus is a subset of the real line. Let C be a class of all intervals of the term Monotone field: A field A is said to be a monotone field if it is closed under (--oo, .v).* G IR as subset of the real line !RL Also let ( 0 = Tl be the minimal field monotone operations, i.e. if lim An e IF whenever {IF} is a monotone sequence of set F. generated by d. i.c. Ane F./t,, T A => A e F f hen 'ft contains the intervals of the form [x, oo) (i.e. compliments of (—«>, a), it also A„e ¥.An l A => A e F contains the intervals. 
Theorem: A a - field is a monotone field and conversely. Proof: Let A be a a — field and Ane A. If An T A, then A = U„ An\s a countable (-<»,a | = n ( - 00, a + “ ). by coutable intersection union sets of A\. Hence A e A. Similarly, if An l A.A = C\nAn is a countable , ' (a.oo) = (-oo, a |“ by complimentation (a.b) = (—co,/;) n (a, oo),a < b intersection of sets of A. (a,b\,[a.b).etc fo r a.b G K. ••• A e A\, hence, A is a monotone field. Conversely, let A be I enuna n n a monotone field and let Ax, A2 .... be sets belonging to A. 1 ct be the class of ail intervals of the term ({,b),(a > b)a,b e IK but arbitrary. Then ( J Ak and j "~j Ak belong to A\ since A is a field. k=l k=l Then a ( t \ = V). Proof: By (*) (overleaf) a.b.cty for all a, b. Hence, These are monotone sequences whose limits ( J Ak and p |A k must belong to A. By definition of minimal field. a ( t x) c i k= 1 k=1 fo prove inclusion Thus A is a a - field. Let x e (a.b) then. U“ i( - n . x ) e o ( e x),V 10.9Kandom Variable in Measure Space l et ft be the sample space with sample points w. Interest is usually in the value n*) => (-oo.Jf) a (et) ^ x associated w ith w. l' c rr(/',) as defined in the last example. (a) Point function: function on the space ft to a space ft assigns to each point If is also possible to prove that die Bore! field is the minimal field containing any one w e ft a unique point in ft denoted by X(W). Thus X(VV) is the image of the argument w of the following- under A' i.e. value of X at w f t —— » Q' e, = {(—oo, x |.x 6 IR} i/rnuw n r u n g r f ;i = ((a.ftl.fl < b.a.b e r- ’ The set Q* = |X(lv): we ft| which is a subset of Q’ is called the strict range of X. f , - ([a. b I, a < b . a . b < ■'} If i f £1" => X is a mapping from ft to ft. C,, - {|a.b),a < b.a.b e IK) The symbol X(vv), etc will be used to denote functions even though they denote — 11 a c o ) , v t- IK) . t c . values of functions. : mi 191 UNIVERSITY OF IBADAN LIBRARY Kxample 1: X_1((w}) = {{w c“ft}: X(vv) = w1) Let n = |0, ±1, ±2,... |; ft' = (0,1,2,... | Note that for a point w' e ft12 one or more than one points in ft whose image under ft = 10,1,4.9...,]; i f X {w) = w2 X is w l. Let /!' c ft1. The set of all point for which X(W)e /?1 is called the inverse of Thus .Vis0 a mappin0g of ft into and onto Q*...i - 1 i /Sunder X denoted by X-1 (H1)- % ,i- 2 2 With every point function X, we associate a set function A”-1 whose domain is a class ± 3 3 s(j of subsets of ft and whose range is a class'^ (say) of subset ft. Then, X-1 is called the ‘inverse function' (or mapping) of X. X ( B ) = |X(w):w E B \ . B C a iv, = w2 => X(wl) = X ^jone - to — one X ~ l ( y ) = \ B ( B ■): B e y I In this case X(wl) * X(w2) A w, = w2 X -I( f t) = [w:X(w)c f t| = f t X(w) = w2is not 1 — 1 function Lemma: Inverse mapping preserves all set relations. Since w, = 4-2, w2 = — 2 have the same image Proof: Let W c C c ft’, then X(w i) = 4 = X(iv2) X '(« ) = \w:X{w)e B \ c \w.X{w)eC\ = X ~ \C ) If ft is the real line (-co < w < co) and ft = (0 < w < co)then X(w) = exp(w) is a (d) Indicator Function 1-1 onto function from ft to ft and 1-1 from ft to ft. If the range space is & or its A real valued function lA defined on ft as subset, the function is said to he a .numerical’ or ‘real-valued’ function. xa _ (= 1 if w e A (b) Set Function ' " A (= 0 if w e.Ac II the arguments of a function are sets of a certain class, then we have a set is called an indicator function (characteristic function by some authors). The strict function. 
Suppose Zl- E A\, we associate a value p(/l), (say) then n is a set function. f.i range /,,is /„(ft) = (/„(w): weft} = {0,1}. If B is a set function and B c R, the range may represent entity such as weight, length, measure, etc. space then lA l(B).= . ifB does not contain '0' or T The interval (a,b) may be associated with b - a\ f(a , b) U f (c .d ) = (b - a) + = A, if U contain 'T but not '0' (d - c), etc. - A(\ if 13 contain '0' but no t'T I wo real valued function X and Ton ft are said to be equal iff X(iv) = Y(w) V- w e ft. = ft, if 13 contain both '0' but not'T i. c. X = Y Thus IZl(B) = { lA < lB denoted byX~‘({w}). The A r = 13 <--» l,\ = 1 .m UNIVERSITY OF IBADAN LIBRARY /„ = /2(/t) = l A(.A).!n = 1 Proof: (ii) l(A(:) = 1 - [(A ); l(B—A) = 1(B) - >A o n »i 00 l*i.B ...........................................cA m u . u = f^ j^1 = I > j=i t=i (=i = min (Iai*-—Un) 0 V)*(AuR) ~ U +n *B ~ IA- ' b “ maX Oa- ' b)*(A+B) = *A + 1 [i=Ji A i = ^i= l l A i ^ l A i A Ai + ^i= i 1A i ^ * A i A Aj + A k Let Bk c fl ,then w eX_1(n Bk) <=> X(W)€ n Bk => P | B.elB <=> X(W)€Bk Ft 1Thus, B is closed under countable intersection. Hence, IBis a o - field. <=> w €X_1(Bk) f1t « w e n X_1(Bk) 10.9.1 /(A) as a Measurable Function Hence, X"'1 Since lA '(B) = { weX-'(B) .■.X~'(BC) = (X -‘( S ))c 10.9.3 Function of Function Clearly ( f t ) = [w:X(Ml)cft') ^ f t If X is a function from ftto fl and X is a function from fl to fl , then the function X~'( X~l (a (f)) c A 10.9.5 Random Variable (Economic Definition) => A Suppose ft be a sample space. Let A\ao — fieldof events associated with a certain fixed experiment. Any real value A\ - measurable function defined on ft is called a 10.9.6 Vector Random Variable random variable. Thus,’X is a random variable iffB~x, the a - field induced by X is Suppose w eft, the associate X(w) = (X\wy, Kfvv)) a point in the 2-dimensional contained in A\. huclidian R2. The Z define a function from ftto R2. Consider the class of 6 of all Suppose we define two non-negative functions rectangles bounded by the lines xx = a.x = b,y = c,y = d,a < b,c < d arbitrary, X(w) = *(w)» '^X(w) ^ 0 flic minimal o - field containing f in Borel field (332)in R2. = 0, it X(W) 0 /. is called a 2-dimcntional random variable ifZ "’(332) c A\.Z~l(®2) «s a a - field and induced by X. X(W) ~ (̂w)> < 0 Illustration = 0 if X(wj > 0 QO • The above are respectively called the positive and negative parts of X. Then S„ = £ X t, E(Sn) = nA.a(Sn) = VTil A' ’ and X are Borel function of X and will be random variable if X is a random i- I variable Note: flic moment generating function of Znis given as 1 11 These functions play an important role in the theory of integration of M _ „CO , (VJ probability function. (2) To show whether a function is a random variable, it is not necessary to determine whether X~l (B)e A\ for every B in 33. It is sufficient to verity X ~ l(f) c A where C is any class of subsets of R given in sub interval on log/W/(t) = —t'fnA - nA - e page 8. = - t V 5 * - .U ( i - U + j s + s 5 + 5 ^ + - 1 ) 1 % 197 UNIVERSITY OF IBADAN LIBRARY lini log » 1 , *,2 CHAPTER 11 l|-«CX» ' * /.(£) = —2 => My{t) - e 2 LIMIT T HEOREMS AND LAW OK LARGE NUMBERS = m g f o f A/(0,1) Problem 11.1 Introduction Suppose that S„ has the binomial distribution b(n,p)- show that distribution The law of large numbers is concerned with the conditions under which the average of %n----------- » N (0. 1) a sequence or random variable converges (in some sense) to the expected average as the sample si/.e increases. 
Theorem: Let Yn, n > 1 be a sequence of real converging to Y0 Then the sequence r,+y2 y,+rz+r:, y1»y2+-.y,i x ' 2 ' i n 11.2 Concept of Limit s Also converges to Y0 However, the inverse is not true. Let .v„ be a point in some intervals oflhe real line '.H. Let / be a function which is Proof: delined at every point of / except possibly at .v„. The limit of the function as x Let > 0, we find n, s.t.n > n => (V̂ + ••• /„) — K„| < £ approaches v0 is /, written as Since Yn -» K0 3 no s. t. |Tn - K0| < e/ 2 K > 1 irn /,\, = L ov /( f| > L as x —> x l ind > n0 s. t. — K0| < e/ 2 for convinence If for any positive number X (no matter how small) there is some 8 greater than zero We claim that n > n, => - '"0| < £ such that Then iriii ,/ I _ |(yt+yo)+-t(yno+»b) |/ ( i Z.| < £■„ for all 0 < |.v~.v(1| < S1 it K°l “ I n |(yl+yll)+- + (y,,o+yo) + (y„oM+y0)+--My,t + y0) I rom the above definition, the number e > 0 is first given, then we try to find a it! n number d > 0 which satisfy the definition. Example 1: Prove that < in i (3.r — 4) = 14 n 0 n V \Yi- Y 0\ n i—i n0 + 1 Z_i Solution: Given A > 0 , find 8 > 0 [depending on 1 .1 s.t. 0 < jv - 6) < d. we have 1=1 i=n0+l \ f - 14| < e * * / 2 + '-T *el 2 -> |3.\ 4 14| |3 ( .v - 6 | 3j.v 6| < 38 S £/ 2 + e/ 2 N o te th at | \ 6 j < A 0} < -â Example 2: x + 1 ProofProve that ( Suppose X is continuous with density function '-*2 3*+ 4 10 Solution: Given £0 > 0, we went to final 0 < |x - 2| < £, we have < = [ V w * + « £ V ( x) a > £ x f ( x)dx / ( x) ^ _ * + l 3 x + 2 x - 2 • 8 J ' 10 3*+ 4 10 ~ 10(3* + 4) 10(3x + 4) ' 10(3x + 4) > j~af(x)dx If x is sufficiently near to 2 so that = o ^ f ( x)dx 3.r + 4 > 10, thus 1 <1 > aP(X £ a) 10(3*+ 4) 10 /. aP[X > a) < E (X ) => P(X £ a) < -̂ a Thus l / M — I 10(3.r-4) 100 The above is for a single variable X. Suppose we have a sequence of variable 8 = 100 £ {X„}, n = l,2,...n, then we have the Markov’s inequality for a sequence of {A',,}as Theorem: Let / be the constant function defined by = C where C is a constant P X . tint f ( x ) - C Proof: Given s>0, find 8 > 0 such that 0< |* - t7|< 5 =>|y|T)-Cj<£- 11.4 Bienayme-Chebyshev’s Inequality Theorem: If A is a random variable with mean n and variance a 2, then for any value The distribution of certain statistics of interest are too complicated to derive for e> 0 : differing sample sizes. In many cases, limiting distributions can be obtained as an approximation to the exact distribution, when the number of observation N is large. O" Thus, most important theoretical results in probability theory are limit theorems. Proof: Let consider some useful limit theorem. Since ( x - r t is a non-negative random variable applying the Markov’s inequality 11.3 Markov’s Inequality with a = k \ we have P{{X - /j )> K }< —-— &,then the above (*) is equivalent to Theorem: If X is a random variable that takes only non-negative values then for any value u > 0 2 0 1 200 UNIVERSITY OF IBADAN LIBRARY lim P{jXM| >oo, it does The above inequalities are important in that they enable us: (i) derive bounds on probability when only the mean, or mean and variance of not follow that for every e: > 0. we can find a finite n0 such that for all n > n.'; the the probability distribution are known. relations \X,\ " ( i - / > r (i) Suppose it is known that the number of eggs sold in a poultry farm in a month is a random variable with mean 75 crates. To show that lim pfx„|>e}= 0Hf* * ’ ' (ii) What is the probability that the sales for next month is greater than 100 crates. 
Solution: (iii) Ilf the variance of the sales for the month is >5, determine the bounds on By Chebyshev’s inequality we have the probability that sales in the coming month will be between 50 and 100 crates. E(Xh)- n.p\ Var (x)= or v n Jnpq Solution: Let X be the number of eggs sold in a month But Chebyshev’s inequality states a = r (i) by Markov’s inequality ■Jn a 2 / P(X>Vto)< — = - P \X - p \ > e } < ^ - fo re > 0. v 7 100 4 (ii) by Chebyshev’s inequality a V , or P \X \> K a < ^ r - 4 - — 7 />).*'-7 5 |> 25 = — 1 n| 1 k2a 2 nK: ' 1 1 252 25 p \X n\ > k o } < \ p \ x - 75| < 25} > 1 — — = — A 1 1 J 25 25 Lotting 11 — - wc have So the probability of sales of eggs for this month is at least —24 Definition: The sequence {Xn} of a random variable is said to be stochastically P\X |> e}< -^ - = n e convergent to zero if for every e> 0 the relation 2 0 2 203 UNIVERSITY OF IBADAN LIBRARY PrjX . - n p \ > e } < & 11.6.1 Weak Law of Large Number (WLLN) Let Xx,X2, ... be a sequence of iid random variable’s each having finite mean E(Xi) = Chebyshev’s Inequality H. Then for any G > 0 This theorem is often used as a statistical tool in proving important results in statistics. > 6j _ 0 as n > co For example: — p lfVar(x) = 0 prove that 1'his implies that Xn -* fi PX = B(x)=\ Proof: suppose the random variable has a finite variance a 2 Proof by Chebyshev’s inequality, for any 0 > 1. £ ^ y ar f Xi+X2..Xn\ _ — o It follows from the Chebyshev’s inequality that as n —» oo and using the continuity property of probability and Chebyshev inequality. Thus, as as n -» oo n-*oo (. X\ "h X2 ...xnlim P ( -A * > € } = ” P (E2. {| 1 - " i > ^ } = 0 x n -* n => p[x * n\ = o This implies strong convergence (Strong convergence) Convergence Almost Surely X n is said to converge to X almost surely, almost certainly or almost strongly 11.5 Convergence of Random Variables Convergence in law denoted by denoted by X „ —— if Xn(w) —> ^ (M.,for al w except for those belonging to a L Xn -» X if at every continually of X through distribution function F of null set N. Limn_ 0o /y,(x) ^{x) Thus X„ -^ ± -> X iff X„(w) Ar„,.) < oo Where Pn(x) denotes the distribution function of Xn Thus, the set of convergence of {Xn) has probability unity. Lemma: 11.6 Laws of Large Number X n———>X iff as n ->co This refers to the weak or strong convergence of sample mean X # = %i t0 a corresponding population mean (/j ). -> 0 V an integer 2 0 4 205 UNIVERSITY OF IBADAN LIBRARY Proof: Theorem 2:IfX„ —-~>Cimplies that Fn(x) -» 0 fo rx 1 forx>Cand Now AyOv)-* .^(w), if for arbitrary r > i , there exist some conversely. /?„(»•. r)s.l.V K >n„ (w,r), \X „ (w ) -X (w ] < / r Proof: If X n ——» C, Fn (x) -> F(x) where F(x) is the d.f. of the degenerate random Moreover X m — > X imp lies that P[Xn - X] = 0 variable which takes a constant value C since Using de-Morgan rules, fO, x < C • ,-'1 {l, x > X [ » : \ X . ( w ) - X M \ > y $ \ = 0 r n Conversely, let Fn(x) —> F(x) as defined above. i.e. for each r Then PrjjT ,-C |S;e] = P[A', £ C+s]+Pr[X„ S C] [ w . \ x A » ) - x ( .w ] z . y r\\± co. & t f u k - * l * X l ] = 0 Hence A',,-- Suppose X n's are discrete random variables taking • values Replacing the above by the complimenting condition we have o ,i ,2,...s.t.p (x „= ;)= /> ,.if p„->p, a s n - » 0and S takes value /{ n k - * i < x ] ) - >i / with probability Pt (i = 0, 1, 2 ,...) and hence = 1 ■ then Note: Lemma 1:A sequence of random variable’s converges a.s. to a random variable iff the .ICT sequence converges mutually almost surely. 
So that X a converges to X in distribution. Lemma 2:If X n — — >A\then there exist a subsequence{Xnk}of {Xn}which Examplc:Let A^be a binomial random variable with index hand parameter converges a.s. to X. P„ s.t.as n -> co, "Pn -> X > 0 and finite. Then we can verify Convergence in Distribution (K = 0, 1, 2,...) If F„[x). is the d.f. of a .random variableX,, and F(x) the d.f of random variable The binomial random variable leads to Poisson random variable with parameter X in X . then }Xn} is said to converge in distribution or in law or weakly. It is denoted as distribution as n-*co. X" ——-> X, Fn -> F weakly or Fn(x) -> F(x). Theorem 1: If X„ —— >X , then F„ F(x), x e C(f) 206 207 UNIVERSITY OF IBADAN LIBRARY , , k - n p ...... Convergence in r,h Mean Now let y p = " r - ■ ...... .(**) A sequence of random variables is said to converge to X in the rlh mean, denoted by Jnpq Since n and k are large, we can expressed the factorials in (*) above by means f the Xu ~±~>Xif H\XU - X\r -> 0 as n -> « . Stirlings formula/approximation as For r = 2, it is called the convergence quadratic mean or mean square. e*"1 For r = I. it is called convergence in the first mean. Lemma: If X „ ~ r~> X => E\Xn\ -> E\Xf b(k, m, p )— -------------- Proof: For (r < l)put(Xn - X ) and X for X in tyheincqulaity p k q " ' k e e 4 ' f. r s £ K - * r + 4 * r (2^ (2n Y n k*y> (n-k )r k*Yi Interchanging X n and X in inequality and combining, we have » ( nq E \ X l - E \ x \ < E \ X a - X \ (2n Y ^ k ( n - k ) U A n - k Thus X n — -r-* X => E\X ,\ => E\X\ Where 0 -O lH) - 0 (k)- 0 ( n - k) . I UsingM< l[ I +i +_l_ Then /*{.¥„ - X\ >e} < from ( . . ) the above can be rewritten in the form Substituting for k : . X m- ~ * X as n .-**>. 1 1 Lemma 1ltf|<---- + —+ P The binominal distribution b (k ; n. p) approaches the normal distribution as n <- oc 1 1 12n 1 + X k K q q ‘ XtV / " ‘IJ i.e. /?(k; n ,p )~ - = T ‘---- e ^ If we assume t,h a.t Vr« ->v nu aass nn -> °°, then 8 —>0 and e* —> 1 ■ J i n npq Proof Let A(k;n, p) = n\ K l(n-K ) p _ k qn-K for large values of n. The above represent P[SK - K ) where S t is a random variable which denote the number of successes in n . Bernoulli trials with probability success for success in each which can be approximated by trial. 11’we let n >xandkeep /’ fixed then p \S n - np\> np) ->0 ¥ e > 0 by the law of J — for large n Jnpq large number. Accordingly \K -np\jn -» 0. 209 208 UNIVERSITY OF IBADAN LIBRARY nq 11.6.2 Criterion for Convergence in ProbabilityTo estimate the quantity ̂̂ J |̂ — U- • « The following lemma gives the necessary and sufficient condition for convergence in T'aking logarithm of the above gives probability. Lemma « ' o e[ y ' y ( n - K ) iog { y n _ k) Which can be rewritten in the form x - W E ( £ i ) ^ 0 a s n * |x„|1 X >0 iff E 0 as n -> co. - t i p \ + xk ~ log \ n p 1 + * J — U W JPJ f Proof nq 1 +■**,/— log IXI ’ • |x Inp I+X* J ~ For any X , the r.v. is bounded by unity. Taking g(x) = rj-77 1 for € > 0i nP J \V»\ Upon substitution for K. Since x /̂7 ^ is small, we cam expand the Logarithmic function in power series. J _ W _ 1 — 1- < e [ - 1 ^ ) /li+KlJ 1+e * i ,+w j/ r+e Using the Taylor expansion then a reminder From RHS E M log(] + x) = x - y + ^ - ; (o< |e3|< x ) b + w j (* *) above becomes From LHS p \x ,\ >e]-» 0 => j 0 - | * ; + C X V i ' But K is a non -negptive r.v. 
Theorem (normal approximation to the binomial): The binomial distribution b(k; n, p) approaches the normal density as n \to \infty, i.e.
b(k; n, p) \sim \frac{1}{\sqrt{2\pi npq}} e^{-(k - np)^2 / (2npq)}.
Proof (sketch): Let
b(k; n, p) = \frac{n!}{k!(n-k)!} p^k q^{n-k}, \qquad q = 1 - p, \quad (*)
which represents P(S_n = k), where S_n is a random variable denoting the number of successes in n Bernoulli trials with probability p of success in each trial. If we let n \to \infty and keep p fixed, then P(|S_n - np| > \varepsilon np) \to 0 for all \varepsilon > 0 by the law of large numbers; accordingly |k - np|/n \to 0. Now let
x_k = \frac{k - np}{\sqrt{npq}}. \quad (**)
Since n and k are large, we can express the factorials in (*) by means of Stirling's formula/approximation,
n! = \sqrt{2\pi n}\, n^n e^{-n} e^{\theta(n)}, \qquad |\theta(n)| \le \frac{1}{12n},
which gives
b(k; n, p) = \sqrt{\frac{n}{2\pi k(n-k)}} \left(\frac{np}{k}\right)^k \left(\frac{nq}{n-k}\right)^{n-k} e^{\delta}, \qquad \delta = \theta(n) - \theta(k) - \theta(n-k).
Substituting for k from (**),
\frac{k}{np} = 1 + x_k\sqrt{\frac{q}{np}}, \qquad \frac{n-k}{nq} = 1 - x_k\sqrt{\frac{p}{nq}}.
If we assume x_k/\sqrt{n} \to 0 as n \to \infty, then \delta \to 0 and e^{\delta} \to 1, and
\sqrt{\frac{n}{2\pi k(n-k)}} \sim \frac{1}{\sqrt{2\pi npq}} \quad \text{for large } n. \quad (i)
To estimate the remaining factor, take logarithms:
\log\left[\left(\frac{np}{k}\right)^k \left(\frac{nq}{n-k}\right)^{n-k}\right] = -k\log\left(1 + x_k\sqrt{\frac{q}{np}}\right) - (n-k)\log\left(1 - x_k\sqrt{\frac{p}{nq}}\right).
Since x_k\sqrt{q/np} is small, we can expand the logarithmic function in a power series, using the Taylor expansion with remainder
\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots.
Collecting terms, the expression reduces to -\frac{1}{2}x_k^2 + C x_k^3/\sqrt{n}, where C is a constant. If we assume x_k^3/\sqrt{n} \to 0, this can be approximated by -\frac{1}{2}x_k^2; hence the factor is asymptotic to e^{-x_k^2/2}. \quad (ii)
Gathering the estimates (i) and (ii) above, we have
b(k; n, p) \sim \frac{1}{\sqrt{2\pi npq}} e^{-x_k^2/2},
the normal approximation to the binomial distribution.

11.6.2 Criterion for Convergence in Probability
The following lemma gives a necessary and sufficient condition for convergence in probability.
Lemma: X_n \xrightarrow{P} 0 iff
E\left(\frac{|X_n|}{1 + |X_n|}\right) \to 0 \quad \text{as } n \to \infty.
Proof: For any X_n, the random variable |X_n|/(1 + |X_n|) is bounded by unity; take g(x) = \frac{x}{1+x}, which is increasing, and fix \varepsilon > 0.
From one side, splitting the expectation over \{|X_n| \le \varepsilon\} and its complement,
E\left(\frac{|X_n|}{1 + |X_n|}\right) \le \frac{\varepsilon}{1 + \varepsilon} + P(|X_n| > \varepsilon),
so P(|X_n| > \varepsilon) \to 0 implies the expectation tends to zero.
From the other side, since |X_n|/(1 + |X_n|) is a non-negative random variable,
\frac{\varepsilon}{1 + \varepsilon} P(|X_n| > \varepsilon) \le E\left(\frac{|X_n|}{1 + |X_n|}\right),
so the expectation tending to zero implies P(|X_n| > \varepsilon) \to 0 as n \to \infty.

11.6.3 De Moivre-Laplace Limit Theorem
If S_n is the number of occurrences of an event in n independent Bernoulli trials, with probability p of success in each trial, then
\frac{S_n - np}{\sqrt{npq}} \xrightarrow{L} N(0, 1).

11.6.4 The Weak Law of Large Numbers
Let X_1, X_2, \ldots, X_n be a sequence of independent and identically distributed random variables each having mean E(X_i) = \mu and finite variance \sigma^2. Then for any \varepsilon > 0,
P(|\bar{X} - \mu| > \varepsilon) \to 0 \quad \text{as } n \to \infty.
Proof: E(\bar{X}) = \mu and Var(\bar{X}) = \sigma^2/n. From Chebyshev's inequality we have
P(|\bar{X} - \mu| > \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2},
so \lim_{n\to\infty} P(|\bar{X} - \mu| > \varepsilon) = 0.
This theorem was first proved by Jacob Bernoulli.

11.6.5 Bernoulli's Law of Large Numbers
Let \{Y_n\} be a sequence of binomial random variables with pmf
\binom{n}{r} p^r (1 - p)^{n-r}, \qquad 0 < p < 1, \quad r = 0, 1, 2, \ldots, n.
Further, let X_n = \frac{Y_n}{n} - p. Then the sequence of random variables \{X_n\} is stochastically convergent to 0; for any \varepsilon > 0, \lim_{n\to\infty} P(|X_n| > \varepsilon) = 0.
Proof: We have E(X_n) = 0 and Var(X_n) = pq/n. Now, using Chebyshev's inequality,
P(|X_n| \ge \varepsilon) \le \frac{pq}{n\varepsilon^2},
and it follows that \lim_{n\to\infty} P(|X_n| > \varepsilon) = 0 as n \to \infty.

11.6.6 Strong Law of Large Numbers (SLLN)
This refers to the strong convergence of the sample mean to the population mean, i.e. \bar{X}_n \xrightarrow{a.s.} E(X_i) = \mu; that is,
\lim_{n\to\infty} P\left(\sup_{m \ge n} |\bar{X}_m - \mu| > \varepsilon\right) = 0, \qquad \text{or} \qquad P\left(\lim_{n\to\infty} \bar{X}_n = \mu\right) = 1.
Note that the SLLN holds iff the population mean exists.

Theorem: Let X_1, X_2, \ldots, X_n be a sequence of independent and identically distributed random variables each having a finite mean \mu = E(X_i). Then, with probability 1,
\frac{X_1 + X_2 + \cdots + X_n}{n} \to \mu \quad \text{as } n \to \infty,
or P\left(\lim_{n\to\infty} (X_1 + X_2 + \cdots + X_n)/n = \mu\right) = 1.

Theorem: Let \{X_k\}, k = 1, 2, \ldots, be an arbitrary sequence of random variables with variances \sigma_k^2 and first moments M_k. If the Markov condition
\lim_{n\to\infty} \frac{1}{n^2} \sum_{k=1}^{n} \sigma_k^2 = 0
is satisfied, then the sequence \{\bar{X}_n - \bar{M}_n\} is stochastically convergent to zero.
Proof: Suppose the X_k are pairwise uncorrelated. Consider
Y_n = \frac{X_1 + X_2 + \cdots + X_n}{n}.
We have
E(Y_n) = \frac{1}{n}\sum_{k=1}^{n} M_k,
and, since the X_k are pairwise uncorrelated,
Var(Y_n) = \frac{1}{n^2}\sum_{k=1}^{n} \sigma_k^2.
If \lim_{n\to\infty} \frac{1}{n^2}\sum_{k=1}^{n} \sigma_k^2 = 0, then by Chebyshev's inequality it follows that
\lim_{n\to\infty} P(|Y_n - E(Y_n)| > \varepsilon) = 0.
Thus the sequence \{\bar{X}_n - \bar{M}_n\} is stochastically convergent to zero.
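A numerical check of the normal approximation derived at the start of this section (a sketch; the values n = 400, p = 0.3 are assumptions chosen only for illustration):

```python
import numpy as np
from scipy import stats

# Normal approximation to the binomial: b(k; n, p) ~ N(np, npq) density.
n, p = 400, 0.3          # assumed values for illustration
q = 1 - p
k = np.arange(80, 160)
exact = stats.binom.pmf(k, n, p)
approx = np.exp(-(k - n*p)**2 / (2*n*p*q)) / np.sqrt(2*np.pi*n*p*q)
print("max absolute error:", np.abs(exact - approx).max())
```

The error is already small at n = 400 and decreases further as n grows, in line with the De Moivre-Laplace theorem.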
CHAPTER 12
PRINCIPLES OF CONVERGENCE AND CENTRAL LIMIT THEOREM

12.1 Introduction
The central limit theorem is concerned with determining conditions under which the sum of a large number of random variables has a probability distribution that is approximately normal.

12.2 Convergence of Random Variables
A sequence of random variables \{X_n\} is said to converge to a random variable X if \{X_n(w)\} converges to X(w) for all w \in \Omega; \{X_n\} is then said to converge to X everywhere. If X_n(w) converges to X(w) only for w \in C, C \subset \Omega, then C is called the set of convergence of \{X_n\}. If C \in A, then \lim X_n is a random variable. Clearly, C is the set of all w \in \Omega at which, whatever be \varepsilon > 0, |X_n(w) - X(w)| < \varepsilon for all n greater than some N_0(w) sufficiently large; symbolically,
C = \{w : X_n(w) \to X(w)\} = \bigcap_{\varepsilon > 0} \bigcup_{n \ge 1} \bigcap_{m \ge n} \{w : |X_m(w) - X(w)| < \varepsilon\}.
Equivalently, "for every \varepsilon > 0" may be replaced by "for every \frac{1}{k}, k = 1, 2, \ldots". Since C is obtained by countable operations on measurable sets, C is measurable, i.e. C \in A.

Limit of a constant: if f(x) = C for all x, then |f(x) - C| = |C - C| = 0 < \varepsilon for every \varepsilon > 0; hence |f(x) - C| < \varepsilon for all x. This tells us that the limit of a constant is that constant.
Remark: If f has the limit L as x \to a, then f is said to converge to L; if C is the limit of f as x \to a, then f is said to converge to the constant C, written f(x) \to C as x \to a. Note that the constant L or C can also be a random variable.

Convergence in Probability
A sequence of random variables \{X_n\} is said to converge to X in probability, denoted by X_n \xrightarrow{P} X, if for every \varepsilon > 0,
P(|X_n - X| \ge \varepsilon) \to 0 \quad \text{as } n \to \infty;
equivalently, if for every \varepsilon > 0, P(|X_n - X| < \varepsilon) \to 1 as n \to \infty.
Note: This concept plays an important role in statistics, e.g. in the consistency of estimators and in the weak laws of large numbers.

Equivalent random variables: Two random variables X and X' are said to be equivalent if X = X' a.s. [almost surely].
Lemma: X_n \xrightarrow{P} X and X_n \xrightarrow{P} X' imply that X and X' are equivalent. This lemma shows that a sequence of random variables cannot converge in probability to two essentially different random variables.
Lemma: X_n \xrightarrow{P} 0 if E|X_n| \to 0. Replacing X_n by (X_n - X) we have: E|X_n - X| \to 0 implies X_n \xrightarrow{P} X. Note that X_n \xrightarrow{P} X iff X_n - X \xrightarrow{P} 0. This lemma provides a sufficient condition for convergence in probability; its proof follows from Markov's inequality.

Theorem: Let X be a k-dimensional random vector and g \ge 0 a real-valued (measurable) function defined on \mathbb{R}^k, so that g(X) is a random variable, and let C > 0. Then
P[g(X) \ge C] \le \frac{E[g(X)]}{C}.
Proof: Assume X is continuous with pdf f. Then
E[g(X)] = \int g(x_1, \ldots, x_k) f(x_1, \ldots, x_k)\, dx_1 \cdots dx_k
= \int_A g f\, dx + \int_{A^c} g f\, dx, \qquad \text{where } A = \{x : g(x) \ge C\},
\ge \int_A g f\, dx \ge C \int_A f\, dx = C\, P[g(X) \in A] = C\, P[g(X) \ge C],
so that P[g(X) \ge C] \le E[g(X)]/C. If X is of discrete type, the proof is entirely analogous.

Special Case I: Let X be a random variable and take g(x) = |x - \mu|^r, r > 0. Then
P[|X - \mu| \ge C] \le \frac{E|X - \mu|^r}{C^r}.
The above is known as Markov's inequality.

Special Case II: If r in the above is replaced by 2 (i.e. r = 2) we have
P[|X - \mu| \ge C] \le \frac{E(X - \mu)^2}{C^2} = \frac{\sigma^2}{C^2},
which is Chebyshev's inequality.

Remark: Let X be a random variable with mean \mu and variance \sigma^2 = 0. Then the above gives P[|X - \mu| \ge C] = 0 for every C > 0. This implies that P(X = \mu) = 1.

12.3 Cauchy-Schwarz Inequality
Let X and Y be two random variables with means \mu_1, \mu_2 and positive variances \sigma_1^2 and \sigma_2^2 respectively. Then
-\sigma_1\sigma_2 \le E[(X - \mu_1)(Y - \mu_2)] \le \sigma_1\sigma_2,
and E[(X - \mu_1)(Y - \mu_2)] = \pm\sigma_1\sigma_2 iff P(Y = aX + b) = 1 for some constants a, b.
Proof: Let X_1 = (X - \mu_1)/\sigma_1 and Y_1 = (Y - \mu_2)/\sigma_2. Then X_1 and Y_1 are standardized variables, and |E(X_1 Y_1)| \le 1 follows from E(X_1 \pm Y_1)^2 \ge 0.
Note: A more familiar form of the Cauchy-Schwarz inequality is
[E(XY)]^2 \le E(X^2)E(Y^2).

12.4 Borel-Cantelli Lemma
In the study of sequences of events A_1, A_2, \ldots with p_k = P(A_k), a significant role is played by the Borel-Cantelli lemma:
(i) If the series \sum p_k converges, then with probability 1 only a finite number of the events A_k occur.
(ii) If the events are (completely) independent and the series diverges, then with probability 1 an infinite number of the events A_k occur.
Or, equivalently:
Theorem: Let \{A_n\}, n = 1, 2, \ldots, be a sequence of events and let P(A_n) denote the probability of the event A_n, where 0 \le P(A_n) \le 1.
(i) If \sum_{n=1}^{\infty} P(A_n) < \infty, then with probability one only a finite number of the events A_n occur.
Proof: Let A = \limsup_n A_n = \bigcap_{r=1}^{\infty}\bigcup_{n \ge r} A_n. Then A \subset \bigcup_{n \ge r} A_n for every r, so that
P(A) \le P\left(\bigcup_{n \ge r} A_n\right) \le \sum_{n \ge r} P(A_n) \to 0 \quad \text{as } r \to \infty,
since \sum P(A_n) < \infty. Hence P(A) = 0; only finitely many A_n occur.

(ii) If the events \{A_n\}, n = 1, 2, \ldots, are independent and \sum_{n=1}^{\infty} P(A_n) = \infty, then with probability one an infinite number of the events A_n occur.
Proof: A^c = \bigcup_{r=1}^{\infty}\bigcap_{n \ge r} A_n^c. In view of the independence of the A_n,
P\left(\bigcap_{n \ge r} A_n^c\right) = \prod_{n \ge r} (1 - P(A_n)) \le \prod_{n \ge r} e^{-P(A_n)} = \exp\left(-\sum_{n \ge r} P(A_n)\right) \to 0,
i.e. the infinite product on the right-hand side is divergent to zero because \sum_{n \ge r} P(A_n) = \infty. Hence P(A^c) = 0 and P(A) = 1.

12.6 The Central Limit Theorem
In simple language, the theorem states that the sum of a large number of independent random variables has a distribution that is approximately normal. It provides a simple method for computing approximate probabilities for sums of independent random variables, and explains the fact that many natural populations are normally distributed.
Let \{X_n, n \ge 1\} be a sequence of random variables. Define S_n = X_1 + X_2 + \cdots + X_n, let \sigma(S_n) be the standard deviation of S_n, and set
Z_n = \frac{S_n - E(S_n)}{\sigma(S_n)}.
Then Z_n converges in distribution to N(0, 1).

Example: Suppose the X's above are i.i.d., each with the Poisson distribution with parameter \lambda. Show that the SLLN holds. (This is an example of the SLLN.)

12.6.1 Central Limit Theorem for Independent Random Variables
Let X_1, X_2, \ldots be a sequence of independent random variables having means \mu_i = E(X_i) and variances \sigma_i^2 = Var(X_i). If
(a) the X_i are uniformly bounded, that is, for some M, P(|X_i| \le M) = 1 for all i, and
(b) \sum_{i=1}^{\infty} \sigma_i^2 = \infty,
then
P\left\{\frac{\sum_{i=1}^{n}(X_i - \mu_i)}{\sqrt{\sum_{i=1}^{n}\sigma_i^2}} \le a\right\} \to \Phi(a) \quad \text{as } n \to \infty.

Kolmogorov's Inequality
Let X_1, X_2, \ldots, X_n be independent random variables with E(X_i) = 0, Var(X_i) = \sigma_i^2. Then for any a > 0,
P\left\{\max_{1 \le k \le n}\left|\sum_{i=1}^{k} X_i\right| > a\right\} \le \frac{\sum_{i=1}^{n}\sigma_i^2}{a^2}.

Kronecker's Lemma (Proposition)
If x_1, x_2, \ldots are real numbers such that \sum_{i=1}^{\infty} \frac{x_i}{i} converges, then
\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} x_i = 0.

Note: It can be observed that Kolmogorov's inequality is a generalization of Chebyshev's inequality. If X has mean \mu and variance \sigma^2, then by letting n = 1 in Kolmogorov's inequality we obtain
P(|X - \mu| > a) \le \frac{\sigma^2}{a^2} \quad (\text{which is Chebyshev's inequality}).
For independent X_1, X_2, \ldots, X_n with E(X_i) = 0 and variances \sigma_i^2, Chebyshev's inequality yields
P\{|X_1 + \cdots + X_n| > a\} \le \frac{\sum_{i=1}^{n}\sigma_i^2}{a^2};
Kolmogorov's inequality gives the same bound for the probability of the larger set of variables. Kolmogorov's theorem is used as a basis for the proof of the strong law of large numbers in the case where the random variables are assumed to be independent but not necessarily identically distributed.

12.7 Strong Law of Large Numbers for Independent Random Variables
Let X_1, X_2, \ldots be independent random variables with E(X_i) = 0, Var(X_i) = \sigma_i^2 < \infty. If \sum_{i=1}^{\infty} \frac{\sigma_i^2}{i^2} < \infty, then with probability 1,
\frac{X_1 + X_2 + \cdots + X_n}{n} \to 0 \quad \text{as } n \to \infty.
Proof (of the strong law of large numbers for independent random variables): We will show that, with probability 1, \sum_{i=1}^{\infty} X_i/i converges; this follows by applying Kolmogorov's inequality to the partial sums of the X_i/i, whose variances \sum_i \sigma_i^2/i^2 are summable by hypothesis. By Kronecker's proposition we then have that, with probability 1,
\frac{1}{n}\sum_{i=1}^{n} X_i \to 0 \quad \text{as } n \to \infty,
which is equivalent to
P\left(\lim_{n\to\infty} \frac{X_1 + \cdots + X_n}{n} = 0\right) = 1.
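The CLT statement above can be explored by simulation. A minimal sketch, assuming exponential summands with mean 1 and variance 1 (any square-integrable distribution would do):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Z_n = (S_n - E S_n) / sd(S_n) should be approximately N(0, 1) for large n.
n, reps = 2_000, 50_000
s = rng.exponential(1.0, size=(reps, n)).sum(axis=1)
z = (s - n * 1.0) / np.sqrt(n * 1.0)      # mean 1, variance 1 per summand
print("P(Z <= 1.96) empirical:", np.mean(z <= 1.96))
print("Phi(1.96)             :", stats.norm.cdf(1.96))
```

The empirical distribution function of Z_n matches the standard normal cdf closely, even though the summands themselves are highly skewed.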
Definition 1:
Two sequences \{X_n(w)\}, \{Y_n(w)\} of random variables are said to be "tail equivalent" if, for almost all w \in \Omega, X_n(w) = Y_n(w) for all but a finite number of terms n.

Lemma: If \sum_{n} P\{w : X_n(w) \ne Y_n(w)\} < \infty, then \{X_n\} and \{Y_n\} are tail equivalent; that is, P\{w : X_n(w) \ne Y_n(w) \text{ infinitely often}\} = 0.
Proof: Let E_n = \{w : X_n(w) \ne Y_n(w)\}. Since \sum P(E_n) converges, the first Borel-Cantelli lemma gives
P\{\limsup_n E_n\} = 0, \quad \text{i.e.} \quad P\{E_n \text{ occurs infinitely often}\} = 0.
Equivalently, using de Morgan's rules, P\{\liminf_n E_n^c\} = 1, so
P\{X_n(w) = Y_n(w) \text{ for all } n \text{ except a finite number}\} = 1,
and \{X_n\} and \{Y_n\} are tail equivalent.

Definition 2:
A sequence \{Y_n\} is said to be a truncation of the sequence \{X_n\} at \{a_n\}, where \{a_n\} is a sequence of positive real numbers, if
Y_n = X_n \quad \text{whenever} \quad |X_n| \le a_n \ (\text{i.e. } -a_n \le X_n \le a_n);
we "cut off" \{X_n\} at \{a_n\} in order to obtain \{Y_n\}.

Lemma: Let the sequence \{Y_n\} be a truncation of the sequence \{X_n\} at the sequence \{a_n\}. If
\sum_{n} P\{|X_n| > a_n\} < \infty,
then \{X_n\} and \{Y_n\} are tail equivalent, and in particular Y_n(w) - X_n(w) \to 0 as n \to \infty for almost all w.

Example: Let E = \{w : Y_n(w) - X_n(w) \to 0 \text{ as } n \to \infty\} with P(E) = 1, and let A = \{X_n = Y_n \text{ for all } n \text{ except finitely many}\} with P(A) = 1. If w \in E and w \in A, then w \in E \cap A = B, and P(B) = 1, since B is the intersection of two events of probability one. Thus Y_n(w) \to X_n(w) as n \to \infty, where Y and X are defined on the same sample space.
Hence, they are tail equivalent.

Corollary to the Second Borel-Cantelli Lemma: If the X_n are independent and X_n \to 0 (a.s.), then
\sum_{n} P[|X_n| > C] < \infty \quad \text{whatever be } C > 0, \text{ finite}.
Proof: If the X_n's are independent random variables, the events A_n = \{|X_n| > C\} are independent. Since X_n \to 0 a.s., we have P(\limsup_n A_n) = 0. If \sum P(A_n) were infinite, the second Borel-Cantelli lemma would give P(\limsup_n A_n) = 1, a contradiction; hence \sum_n P[|X_n| > C] < \infty.
Note: The converse of the Borel-Cantelli lemma is not true if the A_n's are not independent.

12.8 Bolzano-Cauchy Criterion for Convergence
Lemma: Let C be a fixed real number. If |C| \le K\varepsilon for some K > 0 and every \varepsilon > 0, it follows that C = 0.
Proof: Suppose not; then C \ne 0. Since \varepsilon is chosen arbitrarily, put \varepsilon = \frac{|C|}{2K} > 0 (K is given). Then
|C| \le K\varepsilon = K \cdot \frac{|C|}{2K} = \frac{|C|}{2},
which is clearly a contradiction except for |C| = 0.

12.9 First Borel-Cantelli Lemma
Theorem: Let \{E_n\} be a sequence of events, each of which is a subset of \Omega, such that E_n \in F, where F is a \sigma-field of sub-events of \Omega defined on the probability space (\Omega, F, P). Then
\sum_{n=1}^{\infty} P(E_n) < \infty \implies P\{\limsup_n E_n\} = 0,
i.e. only finitely many E_n occur. [If \{E_n\} is a sequence of events, we are often interested in how many of the events occurred.]
Proof: By the Bolzano-Cauchy criterion for convergence, since \sum P(E_n) < \infty, given any \varepsilon > 0 there exists N_0(\varepsilon) such that for all N \ge N_0,
\sum_{n \ge N} P(E_n) < \varepsilon.
Now \limsup_n E_n \subset \bigcup_{n \ge N} E_n, so
P\{\limsup_n E_n\} \le \sum_{n \ge N} P(E_n) < \varepsilon,
where \varepsilon can be taken arbitrarily close to zero. Hence P\{\limsup_n E_n\} = 0.
Note: The first Borel-Cantelli lemma does not require independence of the events E_n.

12.10 Second Borel-Cantelli Lemma
Let \{E_n\} be a sequence of independent events on the same probability space (\Omega, F, P), and let E = \limsup_n E_n. Then
\sum_{n=1}^{\infty} P(E_n) = \infty \implies P(E_n \text{ occur infinitely often}) = 1, \quad \text{i.e.} \quad P\{\limsup_n E_n\} = 1.
Proof: Recall that
\limsup_n E_n = \bigcap_{m=1}^{\infty}\bigcup_{n \ge m} E_n; \qquad [\limsup_n E_n]^c = \liminf_n E_n^c = \bigcup_{m=1}^{\infty}\bigcap_{n \ge m} E_n^c.
Since the E_n are independent, the E_n^c are independent too. For any N > 0 and every K > N,
P\left(\bigcap_{n=N}^{K} E_n^c\right) = \prod_{n=N}^{K}(1 - P(E_n)) \le \prod_{n=N}^{K} e^{-P(E_n)} = \exp\left(-\sum_{n=N}^{K} P(E_n)\right)
by the exponential property 1 - x \le e^{-x}. As K \to \infty, \sum_{n=N}^{K} P(E_n) \to \infty, so
\lim_{K\to\infty} \exp\left(-\sum_{n=N}^{K} P(E_n)\right) = 0.
Hence 1 - P(E) = 0, i.e. P(E) = 1.

Combining the two lemmas: whenever E_1, E_2, \ldots, E_n, \ldots are independent,
P\{\limsup_n E_n\} = 0 \text{ or } 1 \quad \text{according as} \quad \sum P(E_n) < \infty \text{ or } = \infty.

12.11 The Zero-One Law
Theorem: Let A_1, A_2, \ldots be events and let A be the smallest \sigma-field containing each of these events. Suppose E is an event in A with the property that, for any integers j_1, j_2, \ldots, j_k, the events E and A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_k} are independent. Then P(E) is either 0 or 1.
Proof: By the independence of E and A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_k},
\int_E I_{A_{j_1} \cap \cdots \cap A_{j_k}}\, dP = P(A_{j_1} \cap \cdots \cap A_{j_k} \cap E) = P(A_{j_1} \cap \cdots \cap A_{j_k})P(E).
Since (\Omega, A, P) is a complete probability space, this extends to
P(A \cap E) = P(A)P(E) \quad \text{for all } A \in A, \text{ in particular } A = E.
Therefore P(E) = \{P(E)\}^2, so P(E) = 0 or 1.

Completeness: A measure space (\Omega, A, P) is said to be complete if A contains all subsets of sets of measure zero.
Note: (i) A non-empty event with zero probability is negligible. (ii) Every subset of a negligible event has zero probability.
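The dichotomy of Sections 12.9-12.10 shows up clearly in a small simulation. A sketch with independent events; the choices p_n = 1/n^2 (summable) and p_n = 1/n (non-summable) are made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_occurrences(p_fn, n_events=5000, reps=200):
    """Average number of independent events A_n, with P(A_n) = p_fn(n), that occur."""
    n = np.arange(1, n_events + 1)
    hits = rng.random((reps, n_events)) < p_fn(n)
    return hits.sum(axis=1).mean()

# sum 1/n^2 < infinity: only finitely many events occur (1st BC lemma);
# the average count stays near pi^2/6 no matter how many events we add.
print("p_n = 1/n^2:", mean_occurrences(lambda n: 1.0 / n**2))

# sum 1/n = infinity: infinitely many occur w.p. 1 (2nd BC lemma);
# the average count grows like log(n_events) without bound.
print("p_n = 1/n  :", mean_occurrences(lambda n: 1.0 / n))
```

Increasing `n_events` leaves the first count essentially unchanged while the second keeps growing, which is the finite/infinite occurrence dichotomy of the two lemmas.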

Solution:
(i) Since P_n(E) \le 1 and 0 \le P_n(E), we have
0 \le P(E) = \sum_{n=1}^{\infty} \frac{1}{2^n} P_n(E) \le \sum_{n=1}^{\infty} \frac{1}{2^n} = 1.
(ii) Countable additivity of each P_n carries over to the sum term by term, so P is a measure.
(iii) P(\Omega) = \sum_{n=1}^{\infty} \frac{1}{2^n} P_n(\Omega) = \sum_{n=1}^{\infty} \frac{1}{2^n}(1) = 1.

Exercise 2: Let X have the uniform distribution (X \sim U(0, 1)) and consider the sequence of events \{A_n\}, where A_n = \{w : X(w) < \frac{1}{n}\}. Are the \{A_n\} independent?
Solution: f(x) = 1 for 0 < x < 1, so P(A_n) = \frac{1}{n} and
\sum_n P(A_n) = \sum_n \frac{1}{n} = \infty \quad (\text{the harmonic series diverges}).
But A_n \supset A_{n+1} \supset A_{n+2} \supset \cdots, so
P\{\limsup_n A_n\} = P\left(\bigcap_n A_n\right) = \lim_n P(A_n) = 0.
Thus \limsup_n A_n \ne \liminf_n A_n would be impossible here; rather, clearly the above violates the second Borel-Cantelli lemma, as the sequence \{A_n\} of events is overlapping and therefore not independent.

Exercises on limits of events:
(1) Given a probability space (\Omega, A, P) and a sequence \{E_n, n = 1, 2, \ldots\} of events with E_n \subset \Omega and E_n \in F for all n, prove that
(i) \liminf_n E_n \subset \limsup_n E_n;
(ii) P(\liminf_n E_n) \le \liminf_n P(E_n).
(2) Let (\Omega, F) be a measurable space on which a sequence of probability measures P_n is defined, and let the set function P be
P(E) = \sum_{n=1}^{\infty} \frac{1}{2^n} P_n(E).
(i) Show that 0 \le P \le 1; (ii) show that P is countably additive and is therefore a measure; (iii) prove that P(\Omega) = 1. (The solution is given above.)

Lindeberg's Theorem (The Conditions of Lindeberg's Theorem)
Let
X_{11}, X_{12}, \ldots, X_{1k_1}; \quad X_{21}, X_{22}, \ldots, X_{2k_2}; \quad \ldots; \quad X_{n1}, X_{n2}, \ldots, X_{nk_n}; \ldots
be a rectangular array of random variables satisfying the following conditions:
1. For each n \ge 1, X_{n1}, X_{n2}, \ldots, X_{nk_n} are independent;
2. E(X_{nk}) = 0 and Var(X_{nk}) = \tau_{nk}^2, \quad 0 < \tau_{nk}^2 < \infty;
3. B_n^2 = \tau_{n1}^2 + \tau_{n2}^2 + \cdots + \tau_{nk_n}^2, \quad \text{with } B_n^2 > 0;
4. For every \varepsilon > 0,
\frac{1}{B_n^2} \sum_{k=1}^{k_n} \int_{|x| \ge \varepsilon B_n} x^2\, dF_{nk}(x) \to 0 \quad \text{as } n \to \infty.
Then
\frac{X_{n1} + X_{n2} + \cdots + X_{nk_n}}{B_n} \xrightarrow{L} N(0, 1).

12.12 Limit Theorems for Sums of Independent Random Variables
Lindeberg-Levy Theorem: Let X_1, X_2, \ldots be a sequence of i.i.d. random variables, each with mean 0 and variance \sigma^2. Let S_n = X_1 + X_2 + \cdots + X_n and let N be a random variable with the standard normal distribution. Then
P\left(\frac{S_n}{\sigma\sqrt{n}} \le x\right) \to \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\, dt.
The above statement is basic to the central limit theorem.
Proof: Consider the array X_{nk} = X_k/(\sigma\sqrt{n}). Conditions 1, 2 and 3 of Lindeberg's theorem are satisfied; we only need to verify condition 4. Let \varepsilon > 0 and B_n^2 = n\sigma^2; then, since the X's are i.i.d.,
\frac{1}{n\sigma^2}\sum_{k=1}^{n}\int_{|x| \ge \varepsilon\sigma\sqrt{n}} x^2\, dF(x) = \frac{1}{\sigma^2}\int_{|x| \ge \varepsilon\sigma\sqrt{n}} x^2\, dF(x).
Now let A_n = \{w : |X_1(w)| \ge \varepsilon\sigma\sqrt{n}\}; then A_n \downarrow \emptyset as n \to \infty, and since Var(X_1) < \infty,
\lim_{n\to\infty}\int_{A_n} X_1^2\, dP = 0.
This verifies the fourth Lindeberg condition, so by Lindeberg's theorem
\frac{X_1 + X_2 + \cdots + X_n}{\sigma\sqrt{n}} \to N(0, 1) \quad \text{in distribution}.

Lyapunov's Theorem
Let X_k be a sequence of independent random variables with means a_k and variances b_k^2, and set B_n^2 = \sum_{k=1}^{n} b_k^2. If a positive number \delta can be found such that, as n \to \infty,
\frac{1}{B_n^{2+\delta}}\sum_{k=1}^{n} E|X_k - a_k|^{2+\delta} \to 0,
then
\frac{\sum_{k=1}^{n}(X_k - a_k)}{B_n} \xrightarrow{L} N(0, 1).
Proof: The random variables defined above satisfy conditions 1, 2 and 3 of Lindeberg's theorem. We now need to show that the Lyapunov condition implies condition 4 of Lindeberg's theorem. This follows from the inequality
\frac{1}{B_n^2}\sum_{k=1}^{n}\int_{|x - a_k| \ge \varepsilon B_n} (x - a_k)^2\, dF_k(x) \le \frac{1}{\varepsilon^{\delta} B_n^{2+\delta}}\sum_{k=1}^{n} E|X_k - a_k|^{2+\delta} \to 0,
since on the domain of integration |x - a_k|^{\delta}/(\varepsilon B_n)^{\delta} \ge 1.
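Lyapunov's condition is easy to evaluate for concrete summands. A minimal sketch, assuming i.i.d. Uniform(-1, 1) summands and δ = 1 (both choices made only for illustration; then E|X|^3 = 1/4 and Var(X) = 1/3):

```python
import numpy as np

# Lyapunov ratio  sum_k E|X_k|^3 / B_n^3  for iid Uniform(-1, 1), delta = 1.
E_abs3 = 1/4          # E|X|^3 for Uniform(-1, 1)
var = 1/3             # Var(X) for Uniform(-1, 1)
for n in (10, 1_000, 100_000):
    Bn = np.sqrt(n * var)
    print(f"n={n:7d}  Lyapunov ratio = {n * E_abs3 / Bn**3:.6f}")
```

The ratio decays like n^{-1/2}, so the Lyapunov (and hence Lindeberg) condition holds and the normalized sums are asymptotically N(0, 1).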
Thus Brownian motion is a kind of manipulation very simple. stochastic process. If A is any (/jx»)symmctric matrix, consider the quadratic form Any continuous time stochastic process {#(/):/£0} describing the = A X macroscopic feature of a random walk should have the following properties: n n (i) For all time 0 * x 0} has almost surely continuous paths. Let V = A'1, then V is also positive-definite and symmetric. (iv) It follows from the CI.T that these feature implies the existence of and a matrix le '.R ^ su ch that for every t > Q and h > 0 , the increment is Definition 1: A collection (X„ Kjwhich has the joint density. multivariate normally distributed with mean h/.i and covariance matrix /?XXr Any process {.V,} with the above feature seem be represented by (2x)% m(def.V)T & i = A .i + M + £ / W J'” t>0 is said to have the multinomial distribution A'fO.K) 237 236 UNIVERSITY OF IBADAN LIBRARY 13.4 Properties of a Brownian motion (B. M) Definition 2: If //,, are finite real numbers then X = The following are the properties of a Brownian motion. C^i+M ^2 + /^2» -> ^ n + A. )joint p.d.f 1. The Brownian motion is a Gaussian process with autocovariance function. V { s , t ) = E ( x , . X , ) (27r)/^-(det. P ) ^ exp ~ (±J Y~'(x~m)} a°d l *s sa>d to have the multinomial = min (5,/) distribution n (̂ j, v ) 2. The autocovariance function P(.v,r) = min (i-,/) Definition 3: Letr be any set (usually a subset of the real axis). For every t fe r re t i.c. symmetric for r = (0,co) A ^ b e a random variable defined on a probability space (Q, A,P). Then the family 3. Let A ^be a B.M. process and define A'(.s,f)= Xlt) - X {s), the increment ,w): re r} at random variables is called a Stochastic process. process on the interval' (s,t\ Then A(j ,/) ~ A^(0,/-i) Definition 4: Let V(s,t) = e \ X ^ - ' / u, \ x ^ - J} be the autocovariance function at 4. Given the Brownian motion process M M for all relevant values of t and s and pt = E ] x ^ \ p s = ZsjAQ,)] £ ( / ) = 4 K /+A M d f ) Definition 5: A stochastic process Af(r,w) with the property that all its. finite­ = 3/r 5. The Brownian motion process is continuous everywhere but is nowhere dimensional distribution are multinomial and E(X,}=0, differentiable. E ( X „ X , ) = V { s , t ) Where K(v ) is a positive-definite function on r , is called a .Gaussian Process with Definition 7: Let T be any set (usually infinite) and possibly uncountable) and let autocovariancc function P{-, •): TXT -> 9? be a function with the two properties. Remark: (ii) for any finite subset }er and any real numbers Z2,...,Zn not • Two Gaussian processes with the same autocovariance function have all zero the same finite-dimensional distribution • The most important example of a Gaussian process is the Weiner (or I.-1 2/-iM ' , < > , .* ,> 0 Brownian motion) process. then P(v ) is called a positive-definite function on T Lemma: Definition 6: A Gaussian process is said to be a wiener (Brownian) process P(/,./,) - min (r,,r,) is a positive definite function on r i f (/') r = (0, ao) (H) -T1(1) =- 0 and (iii) = m in(j,/) 238 1I UNIVERSITY OF IBADAN LIBRARY Proof: (/) Clearly V(/,, /,) = V (/,, /,) (/'/') I f 0 < I, < /, then z * . + ('z -O + (/> -':)[ Z * , z * > +(/«/■I Z<-i /E1 ^ i > t e *, = Z<-i Z,-»i min 0 / . t e • Z f c ' . - i J /E-I* / „ i ‘-i ,-i Where /„ - ()./, > (1 <./.) Clearly the last expression in the “curly bracket] is a positive number. •S/rar min(/i./()-/. /dr / = j and J(.v,/) = min (.y./) is positive definite. 
13.4 Properties of a Brownian Motion (B.M.)
The following are the properties of a Brownian motion:
1. The Brownian motion is a Gaussian process with autocovariance function
V(s, t) = E(X_s X_t) = \min(s, t).
2. The autocovariance function V(s, t) = \min(s, t) is symmetric, for \Gamma = (0, \infty).
3. Let X_{(t)} be a B.M. process and define X(s, t) = X_{(t)} - X_{(s)}, the increment of the process on the interval (s, t]. Then X(s, t) \sim N(0, t - s).
4. Given the Brownian motion process X_{(t)}, for all relevant values of t,
E(X_t^4) = 3t^2.
5. The Brownian motion process is continuous everywhere but is nowhere differentiable.

Definition 2: If \mu_1, \ldots, \mu_n are finite real numbers, then X = (X_1 + \mu_1, X_2 + \mu_2, \ldots, X_n + \mu_n) with joint p.d.f.
f(x) = \frac{1}{(2\pi)^{n/2}(\det V)^{1/2}} \exp\left\{-\frac{1}{2}(x - \mu)^T V^{-1}(x - \mu)\right\}
is said to have the multinormal distribution N(\mu, V).

Definition 3: Let \Gamma be any set (usually a subset of the real axis). For every t \in \Gamma, let X_{(t)} be a random variable defined on a probability space (\Omega, A, P). Then the family \{X_{(t,w)} : t \in \Gamma\} of random variables is called a stochastic process.

Definition 4: Let V(s, t) = E[(X_{(s)} - \mu_s)(X_{(t)} - \mu_t)] be the autocovariance function of X_{(t)}, for all relevant values of t and s, where \mu_t = E[X_{(t)}] and \mu_s = E[X_{(s)}].

Definition 5: A stochastic process X_{(t,w)} with the property that all its finite-dimensional distributions are multinormal and
E(X_t) = 0, \qquad E(X_s X_t) = V(s, t),
where V(\cdot, \cdot) is a positive-definite function on \Gamma, is called a Gaussian process with autocovariance function V(\cdot, \cdot).
Remark:
- Two Gaussian processes with the same autocovariance function have all the same finite-dimensional distributions.
- The most important example of a Gaussian process is the Wiener (or Brownian motion) process.

Definition 6: A Gaussian process is said to be a Wiener (Brownian) process if (i) \Gamma = (0, \infty), (ii) E(X_t) = 0, and (iii) V(s, t) = \min(s, t).

Definition 7: Let \Gamma be any set (usually infinite, and possibly uncountable) and let V(\cdot, \cdot) : \Gamma \times \Gamma \to \mathbb{R} be a function with the two properties:
(i) V(t_i, t_j) = V(t_j, t_i);
(ii) for any finite subset \{t_1, \ldots, t_n\} \subset \Gamma and any real numbers z_1, z_2, \ldots, z_n, not all zero,
\sum_{i=1}^{n}\sum_{j=1}^{n} z_i z_j V(t_i, t_j) \ge 0.
Then V(\cdot, \cdot) is called a positive-definite function on \Gamma.

Lemma: V(t_i, t_j) = \min(t_i, t_j) is a positive-definite function on \Gamma = (0, \infty).
Proof: (i) Clearly V(t_i, t_j) = V(t_j, t_i). (ii) If 0 = t_0 < t_1 < \cdots < t_n, then
\sum_{i=1}^{n}\sum_{j=1}^{n} z_i z_j \min(t_i, t_j) = \sum_{k=1}^{n} (t_k - t_{k-1})\left(\sum_{i=k}^{n} z_i\right)^2,
and the expression in curly brackets is a non-negative number, since each term is a positive multiple of a square. Since, by symmetry, we may interchange i and j to cover the cases in which t_j < t_i, V(s, t) = \min(s, t) is positive definite.

Theorem: Let X_{(t)} be a Wiener process and let X(s, t) = X_{(t)} - X_{(s)} denote the increment of the process on an interval (s, t]. Then
(i) X(s, t) \sim N(0, t - s);
(ii) if (s_1, t_1] and (s_2, t_2] are disjoint intervals, then X(s_1, t_1) and X(s_2, t_2) are stochastically independent.
Proof: (i) X_{(t)} is Gaussian; therefore the joint distribution of X_{(s)} and X_{(t)} is multinormal, and so X(s, t) = X_{(t)} - X_{(s)} \sim N(0, \tau^2), where
\tau^2 = Var(X(s, t)) = E[X_{(t)}^2 - 2X_{(s)}X_{(t)} + X_{(s)}^2] = V(t, t) - 2V(s, t) + V(s, s) = t - 2s + s = t - s.
(ii) For disjoint intervals with t_1 \le s_2,
Cov(X(s_1, t_1), X(s_2, t_2)) = V(t_1, t_2) - V(t_1, s_2) - V(s_1, t_2) + V(s_1, s_2) = t_1 - t_1 - s_1 + s_1 = 0.
Since Cov(X(s_1, t_1), X(s_2, t_2)) = 0 and the increments are jointly Gaussian, there is stochastic independence.

Exercise 1: For any real set of numbers C_1, C_2, \ldots, C_n and real-valued random variables \{X_t\}, show that \sum_i \sum_j C_i C_j E[(X_{t_i} - \mu_{t_i})(X_{t_j} - \mu_{t_j})] is positive semi-definite.
Hint for solution: Let Y_i = X_{t_i} - \mu_{t_i}; then E(Y_i) = 0 and
Var\left(\sum_{i} C_i Y_i\right) = \sum_{i}\sum_{j} C_i C_j E(Y_i Y_j) \ge 0.

Example 2: Calculate the autocovariance function V(s, t) of the Gauss-Markov process.
Hint for solution: Assuming E(Y_t) = 0 and X_{(t)} = e^{-\lambda t} Y_{(e^{2\lambda t})} with Y a Wiener process,
V(s, t) = E[(e^{-\lambda t}Y_{(e^{2\lambda t})})(e^{-\lambda s}Y_{(e^{2\lambda s})})] = e^{-\lambda(t+s)}\min(e^{2\lambda t}, e^{2\lambda s})
= e^{-\lambda(t-s)} \ \text{for } t > s; \quad e^{-\lambda(s-t)} \ \text{for } t < s; \quad 1 \ \text{for } t = s;
i.e. V(s, t) = e^{-\lambda|t-s|}.

Exercise 4: Let X_{(t)} be a Brownian motion process and let Y_{(t)} = X_{(t)} - tX_{(1)}, 0 \le t \le 1. Find the autocovariance function V_Y(s, t).
Hint for solution: Note that E(Y_{(t)}) = 0, and
V_Y(s, t) = E[(X_{(s)} - sX_{(1)})(X_{(t)} - tX_{(1)})]
= E[X_{(s)}X_{(t)} - tX_{(s)}X_{(1)} - sX_{(1)}X_{(t)} + tsX_{(1)}^2]
= \min(s, t) - t\min(1, s) - s\min(1, t) + ts\min(1, 1)
= s - st - ts + st = s(1 - t) \quad \text{for } s < t,
and in general V_Y(s, t) = \min(s, t) - st for all s, t \in [0, 1].
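A Monte Carlo check of property 1 above, V(s, t) = min(s, t) (a sketch; the grid points s = 0.3 and t = 0.7 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)

# Estimate E(B_s * B_t) for standard Brownian motion by simulation.
n_steps, n_paths = 1_000, 20_000
dt = 1.0 / n_steps
B = rng.normal(0, np.sqrt(dt), size=(n_paths, n_steps)).cumsum(axis=1)

s_idx, t_idx = 299, 699                    # grid points s = 0.3, t = 0.7
cov = np.mean(B[:, s_idx] * B[:, t_idx])   # means are zero, so this is Cov
print("empirical E(B_s B_t):", round(cov, 3), "   min(s, t) = 0.3")
```

The empirical product moment sits close to min(s, t) = 0.3, as the autocovariance formula predicts.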
PART THREE

CHAPTER 14
INTRODUCTION TO STOCHASTIC PROCESSES

14.1 Basic Concepts
Researchers in science, engineering, computing, business studies and economics quite often need to model real-world situations using stochastic models in order to understand, analyze, and make inferences about real-world random phenomena. Finding a model usually begins with fitting some existing simple stochastic process to the observed data to see if this process is an adequate approximation to the real-world situation.

Stochastic models are used in several fields of research. Some models used in the engineering sciences are models of traffic flow, queuing models, reliability models, and spatial and spatial-temporal models. In the computer sciences, queuing theory is used in performance models to compare the performance of different computer systems.

Learning stochastic processes requires a good knowledge of probability theory, advanced calculus, matrix algebra and a general level of mathematical maturity. Nowadays, however, less probability theory, calculus, matrix algebra and differential equations are taught in undergraduate courses. This makes it a little bit difficult to teach stochastic processes to undergraduate students.

The mathematical techniques and the numerical computations used in stochastic models are not very simple. In an introductory course, the hope is to teach students a small number of stochastic models effectively, to enable them to start thinking about the applications of stochastic processes in their area of research. This small number of stochastic models forms the core topics to be taught in an introductory course on stochastic processes directed to researchers in the physical sciences, engineering, operational research and computing science. These researchers have a stronger background in mathematics and probability than researchers in the biological sciences.

Definition
A stochastic process is any process that evolves with time. A few examples are data on weather, stock market indices, air-pollution data, demographic data, and political tracking polls. These also have in common that successive observations are typically not independent; such a collection of observations is called a stochastic process. Therefore, a stochastic process is a collection of random variables that take values in a set S, the state space. The collection is indexed by another set T, the index set.

The two most common index sets are the natural numbers T = \{0, 1, 2, \ldots\} and the nonnegative real numbers, which usually represent discrete time and continuous time, respectively. The first index set thus gives a sequence of random variables (X_0, X_1, X_2, \ldots) and the second a collection of random variables \{X_{(t)}, t \ge 0\}, one random variable for each time t. In general, the index set does not have to describe time but is also commonly used to describe spatial location. The state space can be finite, countably infinite, or uncountable, depending on the application.

14.1.1 Applications of Stochastic Processes
The following are some areas of application of stochastic processes:
(i) Marketing: to study customer or consumer buying behaviour and forecast.
(ii) Finance: to study customers' accounts receivable behaviour and forecast.
(iii) Personnel: to study and determine the manpower requirements of an organization.
(iv) Production: to study and evaluate alternative maintenance policies, inventory, and so on, in industries.
(v) Transport: to effectively control flow and congestion in the transport industry.

14.2 Discrete-Time Markov Chains
You are playing a lotto, in each round betting N1 on odd. You start with N10 and after each round record your new fortune. Suppose that the first five rounds give the sequence loss, loss, win, win, win, which gives the sequence of fortunes 9, 8, 9, 10, 11, and that you wish to find the distribution of your fortune after the next round, given this information. Your fortune will be N12 if you win, which has probability \frac{18}{38}, and N10 if you lose, with probability \frac{20}{38}. One thing we realize is that this depends only on the fact that the current fortune is N11, and not on the values prior to that.
In general, if your fortunes in the first n rounds are the random variables X_1, \ldots, X_n, the conditional distribution of X_{n+1} given X_1, \ldots, X_n depends only on X_n. This is a fundamental property, and we state the following general condition:
P\{X_{n+1} = j \mid X_1 = i_1, \ldots, X_{n-1} = i_{n-1}, X_n = i\} = P\{X_{n+1} = j \mid X_n = i\}
for all i, j \in S.

14.2.1 The Transition Matrix
In changing from one state to another in any Markov system, a measure of probability is always attached. It is the collection of all such probabilistic measures, arranged in rows and columns, that is called the transition matrix; its rows satisfy \sum_j p_{ij} = 1 for each i \in S. For a transition matrix, a 2-level change of state will produce a 2 by 2 matrix, a 3-level change produces a 3 by 3 matrix, and so on.

14.3 Classification of General Stochastic Processes
The main elements distinguishing stochastic processes are the nature of the state space, the index parameter T, and the dependence relations among the random variables X_t.

14.3.1 State Space S
This is the space in which the possible values of each X_t lie.

14.4 Classical Types of Stochastic Processes
We now describe, first briefly and then in detail, some of the classical types of stochastic processes characterized by different dependence relationships among the X_t. Unless otherwise stated, we take T = [0, \infty) and assume the random variables X_t are real-valued.
Define the following: p r{« < x , ^ tyx ,i x,>Ks = x^ - x ,„ = "„) = H x„> tn, t A) where ) £ |a < £ < b} (a) Slate Space (5) (b) Index Set (7 ) 14.5.1 Martingales (c) Renewal Process Let (.V,) be a real-valued stochastic process with discrete or count parameter set. We say that (A',) is a Martingale if. for all t, and if for any < /, e (X1i1.,|X 1I r/,.... Xln =o„) = c for all values of ai, a2, ... a„. 14.5.2 Renewal Process A renewal process is a sequence Tk of independent and identically distributed (i . i .d ) positive random variables, repressing the lifetimes of some “units”. The first unit is placed at time zero; it falls at lime /', and is immediately replaced a new unit which then fails at time 7', + 7'2and so on. the motivating the name “renewal process”. The time of the nth renewal is S„ - 7] + 7', t-... + Tn. A renewal counting process N, counts the number of renewals in the interval [o.tj. formally .V, = n for Sn < ( < Sn,t, n = 0 ,1.2 .... Remarks: I lie Poisson process with parameter A is a renewal counting process for which the unit lifetimes have exponential distribution with common parameter A Other examples such as Poisson process, birth and death processes and Branching Process v\ ill he considered in small details. UNIVERSITY OF IBADAN LIBRARY C H A P T E R 15 P {x < i) = 1 - q , G E N E R A T IN G FU N C TIO N S A N D M A R K O V C H A IN S So that the probability generating function follows p(x) = Zi=oPi * ' = I: (X1) 15.1 Introduction Also for the joint probability, we have the generating function as Generating function is of central importance in the handling of stochastic processes involving integral-valued random variables not only in theoretical analysis that also in Q CO = £«=0 Qi practical appreciations. Stochastic process involves all process dealing with We can see that (?(*) is not the same as P(x) individuals’ populations, which may be biological organisms, radioactive atoms, or telephone calls. Q(x) do not in general constitute probability distribution despite the fact the 15.2 Basic Definitions and Tail Probabilities coefficients are probabilities. Suppose we have a sequence of real numbers a 0, a a...... Involving the doming Note that variable x, we may define a formula sothatP(i) = 1, A (x ) = Go*0 + a ^x1 + a2x 2 + ••• = £?= Qaix i and /P (x ) /< ^T /p .xV If the series converges in some real inference - x 0 < x < x0, then the function A (x) is known as the generating functions of the sequence { a j. We may also see this as a < ^ Pj. if / x / < 1 transformation that carries the sequence unit the function A(x). If the sequence {a,} is < 1 bounded, then a comparison with the geometric series shows that A(x) converge at This means that P(x) is absolutely convergent at least for /x /< 1. But for Q(x), all least for f x f x j . coefficients are less than unity, this making Q(x) to converge absolutely at least in the II the following restriction is introduced open interval / x /< 1 . n Converting P(x) andQ(x), we have t'=0 ( l -x )Q O O = l - P O ) Then the corresponding function A(x) is viewed as a probability-generating function. Specifically, consider the probability distribution given by which is easily seen when the coefficient of both sides are compared, H x = i) = Pi for the mean and variance of p,-. we have Where X is an integral valued random variable assuming the values 0,1,2 .... U = /•(*) = £ ip, =p<( 1) Consequently, we define the tail probabilities as i = 0 P{x > i} = q, = q‘ =r=i r! function and continues fcWx-DO- 2) .... 
(x - r + 1)] = £ (i - lXi - 2).... (i - r + l)Pi the characteristics function exist always both for discrete function. = p « ( 1) = 0 ,( 0 = £ > * / ■ « t=i From these result, several other generating function could be obtain such as the and moment generating function, characteristics function, cumulative generating function. 30 x = j e ltx f ( x )d x 15.3 Moment-Generating Function — 0 3 This is define as where the Fourier transform o f / (x ) is A1x(t) = E(eCx) 30 for X discrete witth probability p,-, we have / m = T j W o d w - 0 0 A range simpler generating function is that of the cumulants. When the natural Mx(t) = 'Yj e tipi = P (e f) logarithm of either the mgfo r the c f is generated, it results into the cumulant- generating function, which is simpler to handle than the former two. for X continues with frequency function f (a>u ) , we have This is given by Kx(t) = logMx(t) Mx(t) = J f{u )d u — 00 obtaining the Taylor series expansion of My(t) r! we have whore /fr is the rth cumulant. M(t) = 1 + Zr=i V } tv In handling discrete variables, the functional moment generating-function is also r! useful, which is defined as where is the rth moment assume the original. Because of the limitation of the moment generation function ( in that it does not Q(a) = P ( l + y ) = e[Cl + y)i] always exist) the characteristics function become appropriate which is define by = 1 ! Ir=lUr!(r)yr 0 (t) = E{eitx) flic Taylor expansion is similar where uir) is the rth factorial moment about the origin. 256 257 UNIVERSITY OF IBADAN LIBRARY 15.4 Convolutions Jusl as the case of two sequences, several sequences can also be combining together. Let there be two non-negative independent integral-valued random variables X, Ywith The generating function of the convolution is simply the product of the individual generating functions. That is. if we have the sequence {a;) * {£, ) * {c,} * {d,) * .... the p d f P(x = 0 = a, generating function becomes /l(x) B(x) C(x) D (x ).... and Given the sum of several independent random variables, P(y = /) = bj the probability of the joint event (x = y, - j ) is given as aibj. Syi = Xj + X i + x ? + ••• + X n Where Xk have a common probability distribution given by p,-, with pgfP(x), then the Let there be a new random variable S = x 4- y the event (s = k) is made up of the pg/ol'5,, is {(P(x)}71. Further, the distribution of 5„ is given by a sequence of mutually exclusive events (X = 0, Y = k), ( X = 1 ,Y = k - 1 ) ,. . , (X = k,Y = 0) probabilities which is the n-fold c o(pn.v)o •lu.t.i..o..n* {opf .){ p=*} with r if its written as {pi) * ipi) Given the distribution of 5 as Pis = k ) = ck 15.5 Compound Distributions Suppose the number of random variables contributing to the sum is itself a random then it can be shown that Ck = a0bk + a-i bk. 1 + — + arb0 variable. Thai is When two sequence of numbers which may not be probabilities are compounded, then SN = + x2 + — + *n it is called a convolution which ca{nC kb}e =re p{reks}e nted generally as wherea * [bk] P{xk = i} = f i ' Given the following general functions p{N = n} = g lx > » (* ) -2 5 o « i* <’| P{Sn = /) = /i,. and the corresponding p d f be given as C(x) = l i .o Q x 'J F W - £ f i * ‘ ^ we can then write C(x) = A (x)B(x) Q ( * ) = l 9 n * " this is because, multiplying the two series A{x) and 5(x), and given the coefficients n (x ) = Z /ijX '. ol'x* as ck. 
15.4 Convolutions
Let there be two non-negative independent integral-valued random variables X, Y with pmfs
P(X = i) = a_i \qquad \text{and} \qquad P(Y = j) = b_j.
The probability of the joint event (X = i, Y = j) is given by a_i b_j. Let there be a new random variable S = X + Y; the event (S = k) is made up of the mutually exclusive events
(X = 0, Y = k), \ (X = 1, Y = k - 1), \ \ldots, \ (X = k, Y = 0).
Writing the distribution of S as P(S = k) = c_k, it can be shown that
c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0.
When two sequences of numbers, which may not be probabilities, are combined in this way, the result is called a convolution, which can be represented generally as
\{c_k\} = \{a_k\} * \{b_k\}.
Given the general generating functions
A(x) = \sum_{i=0}^{\infty} a_i x^i, \qquad B(x) = \sum_{j=0}^{\infty} b_j x^j, \qquad C(x) = \sum_{k=0}^{\infty} c_k x^k,
we can then write
C(x) = A(x)B(x).
This is because, on multiplying the two series A(x) and B(x), the coefficient of x^k is precisely c_k. When considering probability distribution functions: the probability generating function of the sum S of two independent non-negative integral-valued random variables X and Y is simply the product of the latter's probability generating functions.

Just as in the case of two sequences, several sequences can also be combined together. The generating function of the convolution is simply the product of the individual generating functions; that is, if we have the sequence \{a_i\} * \{b_i\} * \{c_i\} * \{d_i\} * \cdots, the generating function becomes A(x)B(x)C(x)D(x)\cdots.
Given the sum of several independent random variables,
S_n = X_1 + X_2 + X_3 + \cdots + X_n,
where the X_k have a common probability distribution given by \{p_i\} with pgf P(x), the pgf of S_n is \{P(x)\}^n. Further, the distribution of S_n is given by a sequence of probabilities which is the n-fold convolution of \{p_i\} with itself, written \{p_i\}^{*n}.

15.5 Compound Distributions
Suppose the number of random variables contributing to the sum is itself a random variable; that is,
S_N = X_1 + X_2 + \cdots + X_N,
where
P\{X_k = i\} = f_i, \qquad P\{N = n\} = g_n, \qquad P\{S_N = l\} = h_l,
and the corresponding generating functions are
F(x) = \sum_i f_i x^i, \qquad G(x) = \sum_n g_n x^n, \qquad H(x) = \sum_l h_l x^l.
Simple probability considerations show that we can write the probability distribution of S_N as
h_l = P\{S_N = l\} = \sum_n P\{N = n\} P\{S_n = l \mid N = n\}.
For fixed n, the distribution of S_n is the n-fold convolution of \{f_i\} with itself, that is \{f_i\}^{*n}. Thus
\sum_l P\{S_n = l \mid N = n\} x^l = \{F(x)\}^n,
and the probability generating function H(x) can be expressed as
H(x) = \sum_l h_l x^l = \sum_n g_n \sum_l P\{S_n = l \mid N = n\} x^l = \sum_n g_n \{F(x)\}^n = G(F(x)).
This gives a functionally simple form for the pgf of the compound distribution \{h_l\} of the sum S_N.

15.6 Markov Chain
It would be of interest to define the joint probability of an entire experiment. This is, in general, a very complicated or intricate problem. Early in the 20th century, a Russian mathematician, A.A. Markov, provided a simplification of the problem by making the assumption that the outcome of a trial X_t depends on the outcome of the immediately preceding trial X_{t-1} (and on it only) and affects X_{t+1} (the next trial) only. The resulting process is known as a Markov chain.

15.6.1 Transition Probability
If a_i denotes the state of the process X_t and a_j (j not necessarily equal to i) denotes the state of the process X_{t+1}, then there is a probability of going from a_i to a_j, denoted by p_{ij}, defined as
p_{ij} = P(X_{t+1} = a_j \mid X_t = a_i).
For any given t, p_{ij} is the probability of transition to a_j given that the process was in state a_i.

15.6.2 Transition Diagram
A transition diagram is a graphical representation of the process, with arrows from each state to indicate the possible directions of movement, together with the corresponding transition probabilities against the arrows.

Example 15.1
Consider a process with three possible states a_1, a_2 and a_3, and let p_{ij}, i = 1, 2, 3, j = 1, 2, 3, denote the transition probabilities from one state to the other. [The transition diagram has a node for each state and a labelled arrow for each possible transition.] The diagram represents a square matrix
P = (p_{ij}), \qquad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, n.

15.6.3 Transition Matrix
To every transition diagram there corresponds a transition matrix, and vice versa. For Example 15.1, the transition matrix is
P = \begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix}.
In general,
P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & & & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{pmatrix}, \qquad \sum_{j=1}^{n} p_{ij} = 1 \ \text{for each } i.
This is a one-step transition matrix; for every given i, \{p_{ij}\} indicates the branch probabilities in a tree diagram.
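The identity H(x) = G(F(x)) can be checked by simulation. A sketch assuming N ~ Poisson(3) and Bernoulli(0.4) summands (assumed values for illustration); here G(x) = exp(3(x-1)) and F(x) = 0.6 + 0.4x, so H(x) = exp(1.2(x-1)) and S_N is Poisson(1.2), the familiar Poisson thinning result:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Compound sum S_N = X_1 + ... + X_N, N ~ Poisson(3), X_k ~ Bernoulli(0.4).
N = rng.poisson(3.0, size=200_000)
S = rng.binomial(N, 0.4)            # sum of N Bernoulli(0.4) variables
k = np.arange(8)
emp = np.array([(S == i).mean() for i in k])
print(np.round(emp, 4))                       # simulated distribution of S_N
print(np.round(stats.poisson.pmf(k, 1.2), 4)) # Poisson(1.2) pmf from G(F(x))
```

The simulated distribution of S_N matches the pmf read off the composed generating function, confirming the compound-distribution formula.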
In this section, we consider a stochastic process \{X_n, n = 0, 1, 2, \ldots\} that takes on a finite or countable number of possible values. Unless otherwise mentioned, this set of possible values of the process will be denoted by the set of non-negative integers \{0, 1, 2, \ldots\}. If X_n = i, the process is said to be in state i at time n. We suppose that whenever the process is in state i, there is a fixed probability p_{ij} that it will next be in state j. That is, we suppose that
P\{X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_1 = i_1, X_0 = i_0\} = p_{ij}
for all states i_0, i_1, \ldots, i_{n-1}, i, j and all n \ge 0. Such a stochastic process is known as a Markov chain. The value p_{ij} represents the probability that the process will, when in state i, next make a transition into state j. Since probabilities are non-negative, and since the process must make a transition into some state, we have
p_{ij} \ge 0, \quad i, j \ge 0; \qquad \sum_{j=0}^{\infty} p_{ij} = 1, \quad i = 0, 1, \ldots
Let P denote the matrix of one-step transition probabilities p_{ij}.

Example 15.2 (Forecasting the Weather)
Suppose that the chance of rain tomorrow depends on previous weather conditions only through whether or not it is raining today, and not on past weather conditions. Suppose also that if it rains today, then it will rain tomorrow with probability \alpha; and if it does not rain today, then it will rain tomorrow with probability \beta.
If we say that the process is in state 0 when it rains and state 1 when it does not rain, then the preceding is a two-state Markov chain whose transition probabilities are given by
P = \begin{pmatrix} \alpha & 1 - \alpha \\ \beta & 1 - \beta \end{pmatrix}.

Example 15.3
Suppose that company XYZ has three departments a_1, a_2 and a_3. Employees are liable to be transferred to another department at the end of the year as follows:
(i) A man who is in a_1 must be transferred, and only to a_2.
(ii) A man who is in a_2 cannot be transferred to a_1, but may remain in a_2 or be transferred to a_3 with equal probability.
(iii) A man who is in a_3 cannot be transferred to a_2, but either remains in a_3 with probability 2/3 or is transferred to a_1 with probability 1/3.
Draw a one-step transition diagram and write down the transition matrix. From (i)-(iii) the transition matrix is
P = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1/2 & 1/2 \\ 1/3 & 0 & 2/3 \end{pmatrix}.

First Problem: Suppose the process starts in state i; what is the probability that after n steps it will be in state j?
Consider a process with only three states a_1, a_2 and a_3. What is the probability that after two steps the process will be in state j, for j = 1, 2, 3, given that the initial state of the process is i, for i = 1, 2, 3? By assuming that i = 1, we obtain a probability tree for the process; summing over the intermediate state,
P\{X_2 = a_1 \mid X_0 = a_1\} = p_{11}p_{11} + p_{12}p_{21} + p_{13}p_{31} = p_{11}^{(2)},
P\{X_2 = a_2 \mid X_0 = a_1\} = p_{11}p_{12} + p_{12}p_{22} + p_{13}p_{32} = p_{12}^{(2)},
P\{X_2 = a_3 \mid X_0 = a_1\} = p_{11}p_{13} + p_{12}p_{23} + p_{13}p_{33} = p_{13}^{(2)}.
Assuming i = 2 and i = 3 in turn gives the corresponding p_{2j}^{(2)} and p_{3j}^{(2)}, so that the matrix of two-step probabilities is
P^{(2)} = \begin{pmatrix}
p_{11}p_{11} + p_{12}p_{21} + p_{13}p_{31} & p_{11}p_{12} + p_{12}p_{22} + p_{13}p_{32} & p_{11}p_{13} + p_{12}p_{23} + p_{13}p_{33} \\
p_{21}p_{11} + p_{22}p_{21} + p_{23}p_{31} & p_{21}p_{12} + p_{22}p_{22} + p_{23}p_{32} & p_{21}p_{13} + p_{22}p_{23} + p_{23}p_{33} \\
p_{31}p_{11} + p_{32}p_{21} + p_{33}p_{31} & p_{31}p_{12} + p_{32}p_{22} + p_{33}p_{32} & p_{31}p_{13} + p_{32}p_{23} + p_{33}p_{33}
\end{pmatrix} = P^2.
It can be seen that P^{(n)} = P^n.

Example 15.4
Use a probability tree to find P^{(3)} in Example 15.3. Following each three-step path through the tree and adding the probabilities of the paths ending in each state (for instance, starting from a_1: P\{X_3 = a_1 \mid X_0 = a_1\} = 1 \cdot \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{6}, P\{X_3 = a_2 \mid X_0 = a_1\} = 1 \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}, and P\{X_3 = a_3 \mid X_0 = a_1\} = 1 \cdot \frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} \cdot \frac{2}{3} = \frac{7}{12}), we obtain
P^{(3)} = \begin{pmatrix} 1/6 & 1/4 & 7/12 \\ 7/36 & 7/24 & 37/72 \\ 4/27 & 7/18 & 25/54 \end{pmatrix}.
Note that P^{(n)} = P^{(n-1)}P: at n = 1, P^{(1)} = P^{(0)}P; at n = 2, P^{(2)} = P^{(1)}P = P^{(0)}P^2; at n = 3, P^{(3)} = P^{(2)}P = P^{(0)}P^3. This implies that
p_{ij}^{(n)} = \sum_k p_{ik}^{(n-1)} p_{kj}.
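The three-step matrix of Example 15.4 can be verified by matrix multiplication; a sketch:

```python
import numpy as np

# n-step transition matrices for the chain of Example 15.3.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.5, 0.5],
              [1/3, 0.0, 2/3]])

print(np.linalg.matrix_power(P, 3))
# [[0.1667 0.25   0.5833]   = [1/6,  1/4,  7/12 ]
#  [0.1944 0.2917 0.5139]   = [7/36, 7/24, 37/72]
#  [0.1481 0.3889 0.4630]]  = [4/27, 7/18, 25/54]
```

Cubing the one-step matrix reproduces exactly the entries obtained from the probability tree, which is the point of the identity P^{(n)} = P^n.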
Definition
Let \{X_n, n = 0, 1, 2, \ldots\} denote a sequence of real-valued random variables indexed by n. The value of X_n for given n is the state of the process at the nth step, and P\{X_n = j \mid X_{n-1} = i\} is a one-step transition probability. The index n denotes something close to time. The Markov assumption is that X_n depends on X_{n-1} and not on X_{n-2}, \ldots, X_0; that is,
P\{X_n = j_n \mid X_{n-1} = j_{n-1}, X_{n-2} = j_{n-2}, \ldots, X_0 = j_0\} = P\{X_n = j_n \mid X_{n-1} = j_{n-1}\}.
The conditional distribution of X_n given the whole past history of the process must equal the conditional distribution of X_n given X_{n-1} alone.
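For a two-state chain such as that of Example 15.2, the n-step probabilities are computed by matrix powers; the numerical values α = 0.7 and β = 0.4 below are assumptions made only for illustration:

```python
import numpy as np

alpha, beta = 0.7, 0.4                 # assumed rain probabilities
P = np.array([[alpha, 1 - alpha],
              [beta,  1 - beta]])

# The rows of P^n converge to the stationary vector
# (beta, 1 - alpha) / (1 - alpha + beta).
print(np.linalg.matrix_power(P, 50))
print(np.array([beta, 1 - alpha]) / (1 - alpha + beta))
```

After a few dozen steps both rows of P^n agree with the stationary vector, so the long-run chance of rain no longer depends on today's weather.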
However, we can transform p!j) = 1 for j = i ilus model into a Markov chain by saying that the state at any time is determined by the weather conditions during both that day and the previous day. 271 UNIVERSITY OF IBADAN LIBRARY P i j is the probability of the first event, and that of the second is P i( i ' = 0 for J f t Pi"1 = P{xn = j / x o = 0 by definition T,kPikakj = Y,kP{xn = j , x n. i = k / x 0 = i} marginal from joint = k ,x0 = i}P{xn. j = /c/x0 = i) Consider a process with the following three states; a1( a2 a3l where afis an absorbing state, and others are transient. ^ P A n =7 An -1 = W A n - l = kAo = 0 k , — P22a2l = Yk j v* ~ ' )pKi 15.8 Absorbing Markov Chain A stale in a Markov chain is absorbing if it is impossible to move out of that state. That is, the process stays there. A Markov chain is absorbing if it can’t least one absorbing state. That is, Pjj = 1.0 — P 23a 3 l A state in a Markov chain is transient or non-absorbing if it is possible to get out of that stale. That is Pjj =£ 1.0 for state j. 15.8.1 Probability of a Markov Process ending in a Given Absorbing State This depend on the given in that state. Let atj denote the probability that an absorbing chain will be absorbed in state if it states in the non-absorbing state a,. Method 1 Then There arc two possibilities, either the first transition is to state ay (in which case the a U = p i x < = a/Ao = ) chain is immediately absorbed) or the first transition is to some transient or non­ absorbing state ak ,k * j, and then the process immediately enters states a, f r o m a k. These arc two mutual exclusive events. 272 273 UNIVERSITY OF IBADAN LIBRARY Bv substitution As an example consider the following transition matrix lor absorbing Markov chain a2t ~ P21 + P a i + P 3O31 with four states. Note that an absorbing state is indicated by probability l2 2 2 2 v 4 v 4 V 2 0 p = V 3 v 3 0 v 3 ciij is a one-linear equation in several unknowns. Construct a corresponding linear 0 0 1 0 equation by using each o f the other transit state as initial state. 0 0 0 1 In the given example, a2l is a linear equation in two unknowns. Note that Ptj is Note that the absorbing state are a3 and a4. obtained from the given one step transition matrix. The onlyunknown are akJ-, all k =£ Suppose that we want a13, that is the probability starting from a. will get absorbed in j slate a3 . In other word, we want the probability that the chain will enter a3 from a1. The corresponding a21is given by l Then aiywill give us a33 = Pl.3 + P n a13 + Pl2a23 a23 = P23 + P23a13 + P22a23 Substitute forpiy. noting that akj is unknown. ~ P?.2a21 ars = V 2 + V 4 a i3 + V 4 fl23 a23 — 0 + 1/g «13 + V 3 a23 ~ P33°31 Solving the simultaneous equation, we obtain « i3 = 4/ 5 and a23 = 2/ 5 The matrix becomes | fl13 a l4| I °23 Alternatively. l hen Naive all values of’equation ii for all ^ j t sue ii as a2 1 and u?1) simullnneouslv ! :>> rr Vi, + 2.H Piltakj 274 27? UNIVERSITY OF IBADAN LIBRARY Wo can write this matrix form. Let A denotes the matrix of aiy R denotes the matrix oI'Pij. Q denotes «3 fl2 The matrix o f pik. That is A = (af/) = ia kj} -s x r r R = (Pa) s x r 1 0 0 0 = i P i k ) S X 5 Then aijcan be written as 0 1 0 A = R + QA fl4 Where r = number o f absorbing states v 2 0 V * s = number o f transit states Step I: Arrange the rows and columns of the one-step transition matrix in which a 0 v 3 V s way that the absorbing states appear first in the rows and first in the columns. 
V Step 2: partition the new one-step transition matrix as follows r - r ~ \ Step 4: Find I-Q and hence ( / - ( ? ) 1 absorbing states 0 transient states 5 ■< R 3/ 4 " 'A( /-< ? ) = J - v 3 2/3 hrx r )> 0 ( rx s )> N(sxr)> Q (sxs ) u - «?i = (3A ) (2/ 3) - (V 4) (V 3) = ( 5/ i 2) Step 3: Solve for A. the matrix of the unknown, as follow [2/ 3 V 3 (/ -Q )A = R C0f(/- c ) = k 3/ J A = ( / -< ? ) - '/* cofT(/ - Q) - Adj(l - (?) = 2/3 v 3 Since (l-Q) is non-singular and so has an inverse. (/ - (?)_1 is known as the V3 3A fundamental matrix. , Ad](I - Q) 2/ 3 v 4 l or example, the above 4 x 4 matrix incan be rearranged as follows: since V - Q r = Je t( , _ Q) 12AV3 V4 277 UNIVERSITY OF IBADAN LIBRARY r8/ s 3/ s i i 4/s 9/5j Therefore /! = ( / - Q)_1R = L4/s 9/ 5J ° V 3. % V 5 7s 3/sJ 15.8.2 The Expected Number of Times a Markov Process will be in each Possible Starting Transient (Absorbing) State Lei N = (liij) where Uij is the number of times the chain is in transient state a;- given the initial state is at. Lot n,7 denote the mean number of time that the chain is in transient state a, . Let N denote the matrix of n (y, which is a square matrix since i and j range over the transient stateds. Consider the state at time 1. That is, the first time interval is spent in state a, (a( is Consider a chain with the three states in (a), a a, a2, a3 where aj is the absorbing transient state). If i =£ j and the transition probability pik given the probability that the state. Assume that the initial state process is a2. process will be in aK from at . Then nij = T.kPiknkj nii = Piknki = 1 = dii + 'LkPiknki Which is combined into n i i = d U + Y j P ikU ki ' = l> f° r ‘ “ j l< = 0, for i j 279 278 UNIVERSITY OF IBADAN LIBRARY i---- o r—i LD co L̂T) <» In matrix form this can be written as Recall djj = Vij + I,kPikakj d 1 d2 • dn dx d2 . dn r " N d, 1 0 0 r ^ \ dx i 0 0d2 0 1 0 d2 0 1 0 dn 0 0 1 dn 0 0 1 J J 15.8.3 The Length of Time (Expected Number of Steps) Required before Absorbtion Occurs For any given initial transient state a, the expected number of step required before {p*} = Q absorption is given by the elements of the rector M = N N = 1 + QN t = Z ” " i n = (/ - o r 1 1 = (/ - 0. or ZjP j P i 0.5 0.0 0.75 P i Pi = 0.5 0.5 0.25 Pi 0.5 0.0. Pi. Pk = I , PiPuc Pi. 0.0 A probability distribution which satisfies pk is called invariant or stationary = 0.5//! + 0.75//3 distribution (lor a given Markov Chain). In this case row ofP(n) is the probability //2 = 0.5//, + 0.5//2 + 0.25//3 vector// = (/tl t //2, •••)• Hence, given Pi = 0 .5 //2 pin) _ p ln-l/p Thus nl i—m on P(n) = Um P^ -^ P Pi ~ -SP\ 71 —• co P\Pi PlP2 and PlP2 — Pi Pi \P] P2 = 2a<3 Substituting we have 4 This can be written as Pi = 3^1 P = pP IJv imposing the normalizing condition on the sum ut we obtain or P, + P2 ■+ " 1 p T = Pr pT 4 2 Pi "F -j « i + jr/M I lien:Ibr.: 2X8 2 6 9 UNIVERSITY OF IBADAN LIBRARY 1 D H R This means therefore that 4 2 D S 2 v 2 0 Pi = and n3 = - P = H v 4 v 2 V Thus n = (jiiPiPz) = (V 3 4/ g 2/ 9) R 0 v 2 V This gives a sample method of obtainingP(7l) than raising Pto power n. V J Findlimn_co P(n) and give all possible interpretation of the result. Interpretation: can be interpreted as follows: 16.4 First-Passage and First-Return Probabilities 1. Probability of a distant state: if a point in time Is fixed in the distant future , //y We shall approach this topic by way o f asking certain questions. is the probability that the process will be as state j. Q l : What is the probability that in a process stating from a(. 
Interpretation: the limiting probability mu_j can be interpreted as follows:
1. Probability of a distant state: if a point in time is fixed in the distant future, mu_j is the probability that the process will be in state j.
2. As a time average: if the process is operated for a long time, mu_j is the fraction of time that the process will be in state j.
3. As a fraction of processes: if many identical processes are operated simultaneously, mu_j is the fraction of the processes that can be found in state j after a long time.
4. Reciprocal of mean number of transitions: mu_j is the reciprocal of the mean number of transitions between recurrences of state j, that is, of the average number of steps before a return to state j.

Example 16.2
An individual of unknown genetic character is crossed with a hybrid. The offspring is again crossed with a hybrid, and so on. The states are dominant (D), hybrid (H) and recessive (R). The transition probabilities are

          D     H     R
    D  | 1/2   1/2    0  |
    H  | 1/4   1/2   1/4 |
    R  |  0    1/2   1/2 |

Find lim_{n -> infinity} P^{(n)} and give all possible interpretations of the result.

16.4 First-Passage and First-Return Probabilities
We shall approach this topic by way of asking certain questions.
Q1: What is the probability that, in a process starting from a_i, the first entry to a_j occurs at the nth step?
Q2: What is the number of steps, n, required to reach state a_j for the first time?

For Q1, consider the function p_{ij}^{(n)}, which is the probability that the process will enter state j at the nth step given that it is in state i at the initial step. That is,

    p_{ij}^{(n)} = P{X_n = j | X_0 = i}.

(a) In this case the process could enter state a_j earlier, at some step k, 1 <= k <= n - 1.
(b) After that it could either stay in a_j or change to another state and then return to a_j.

For Q2, the probability f_{ij}^{(n)} that the process will reach state a_j for the first time at the nth step, given that it started from a_i, is called the first-passage probability and is defined as

    f_{ij}^{(n)} = P{X_n = j, X_{n-1} =/= j, X_{n-2} =/= j, ..., X_1 =/= j | X_0 = i}.

Definition: First-Passage Probability
This is the probability that the process is in state a_j at time n and not before, given that it was in state a_i at time 0. It implies that n steps are required to reach state a_j for the first time given that the process starts from state a_i.

Clearly f_{ij}^{(0)} = 0, since the process is still at a_i, and

    f_{ij}^{(1)} = p_{ij},  the one-step transition probability, i =/= j.

Also, decomposing p_{ij}^{(n)} over the step of first entry into a_j,

    p_{ij}^{(n)} = \sum_{k=1}^{n} f_{ij}^{(k)} p_{jj}^{(n-k)},

so that, since the p_{ij}^{(n)} are known,

    f_{ij}^{(n)} = p_{ij}^{(n)} - \sum_{k=1}^{n-1} f_{ij}^{(k)} p_{jj}^{(n-k)}.

For example, consider the problem of departmental transfer in chapter 17, with

    P = | 0.50  0.50  0.00 |
        | 0.00  0.50  0.50 |
        | 0.75  0.25  0.00 |

Then

    P^{(2)} = | 0.250  0.500  0.250 |
              | 0.375  0.375  0.250 |
              | 0.375  0.500  0.125 |

and, using f_{ij}^{(2)} = p_{ij}^{(2)} - f_{ij}^{(1)} p_{jj},

    f^{(2)} = | 0.250  0.500  0.250 |   | 0.250  0.250  0 |   | 0.000  0.250  0.250 |
              | 0.375  0.375  0.250 | - | 0.000  0.250  0 | = | 0.375  0.125  0.250 |
              | 0.375  0.500  0.125 |   | 0.375  0.125  0 |   | 0.000  0.375  0.125 |

(i) f_{ij}^{(n)} therefore describes the number of steps to get from i to j (the first passage); that is, the number of steps required to reach a_j for the first time.
(ii) The number of steps required to get from i to j is a random variable N_{ij}, with

    P{N_{ij} = n} = f_{ij}^{(n)}.
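The recursion for f_{ij}^{(n)} is easy to mechanize. Below is a minimal Python sketch (not from the text; it assumes numpy) that applies it to the departmental-transfer matrix above and reproduces the f^{(2)} just computed.

```python
import numpy as np

# One-step matrix of the departmental-transfer chain used above
P = np.array([[0.50, 0.50, 0.00],
              [0.00, 0.50, 0.50],
              [0.75, 0.25, 0.00]])

def first_passage(P, n_max):
    """Return F[n] with F[n][i, j] = f_ij^(n), via
    f_ij^(n) = p_ij^(n) - sum_{k=1}^{n-1} f_ij^(k) * p_jj^(n-k)."""
    Pn = {n: np.linalg.matrix_power(P, n) for n in range(n_max + 1)}
    F = {1: P.copy()}
    for n in range(2, n_max + 1):
        F[n] = Pn[n].copy()
        for k in range(1, n):
            # np.diag picks out p_jj^(n-k); broadcasting applies it per column j
            F[n] -= F[k] * np.diag(Pn[n - k])
    return F

F = first_passage(P, 2)
print(F[2])   # e.g. f_12^(2) = p_12^(2) - p_12 * p_22 = 0.5 - 0.25 = 0.25
```

Running it prints the matrix [[0, 0.25, 0.25], [0.375, 0.125, 0.25], [0, 0.375, 0.125]], matching the hand computation.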
16.6 First Return (Recurrence)
(i) If j = i, f_{ii}^{(n)} gives the probability of the first return to state a_i; for example, the probability that the person transferred from department a_i will return to a_i for the first time at time n. Corresponding to the first-passage probability we have

    f_{ii}^{(n)} = P{X_n = i, X_{n-1} =/= i, ..., X_1 =/= i | X_0 = i},

with

    f_{ii}^{(0)} = P{X_0 = i | X_0 = i} = 1.

(ii) Then N_{ii} is a random variable whose value is the recurrence time of state a_i.
(iii) Since {f_{ij}^{(n)}} for fixed i, j gives the distribution of N_{ij}, the mean first-passage time from a_i to a_j, denoted by m_{ij}, is given by

    m_{ij} = E(N_{ij}) = \sum_{n=1}^{infinity} n f_{ij}^{(n)}.

(iv) When j = i, m_{ii} is the mean first-recurrence time.

16.6.1 Calculation of m_{ij}
(1) The formula above would require the complete first-passage time distribution for a solution to be obtained.
(2) A simplification of the problem is obtained by conditioning the formula for m_{ij} on the state at step 1, that is, on one value of k at a time.
(3) Given that the process is in state a_i at time 0, either the next state is a_j, in which case N_{ij} = 1, or it is some other state a_k, after which the process must still enter a_j; in that case the passage time is 1 + N_{kj}, with mean 1 + m_{kj}.

(i) Thus

    m_{ij} = p_{ij} + \sum_{k =/= j} (1 + m_{kj}) p_{ik}
           = p_{ij} + \sum_{k =/= j} p_{ik} + \sum_{k =/= j} p_{ik} m_{kj}
           = \sum_{all k} p_{ik} + \sum_{k =/= j} p_{ik} m_{kj}
           = 1 + \sum_{k =/= j} p_{ik} m_{kj},

since \sum_k p_{ik} = 1. This expresses m_{ij} as a linear function of the m_{kj} as the unknowns.
(ii) By using the same relation for the other m_{kj}'s, a complete set of linear equations (equal in number to the unknowns) can be written down.
(iii) A solution of the linear equations gives the mean first-passage time from any state into state j.
(iv) Mean first-recurrence times are obtained in the same way.

Example 16.3
Consider the three-department job assignment. How many assignments will occur, on the average, before a man who is first assigned to a_1 (engineering) will be assigned to a_3 (sales)? That is, what is m_13?

Solution to Example 16.3
Using the formula for m_{ij},

    m_13 = 1 + p_11 m_13 + p_12 m_23.

There are two unknowns, hence we form a similar equation for m_23:

    m_23 = 1 + p_21 m_13 + p_22 m_23.

Now recall that

    P = | 0.50  0.50  0.00 |
        | 0.00  0.50  0.50 |
        | 0.75  0.25  0.00 |

By substitution we obtain

    m_13 = 1 + 0.5 m_13 + 0.5 m_23
    m_23 = 1 + 0.5 m_23.

Solving the simultaneous equations, we find that m_13 = 4 and m_23 = 2.
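The same conditioning argument can be solved in one shot as a linear system: deleting row and column j of P leaves a matrix Q of transitions that avoid state j, and the mean first-passage times satisfy (I - Q)m = 1. Here is a minimal sketch (not from the text; it assumes numpy) that reproduces Example 16.3.

```python
import numpy as np

P = np.array([[0.50, 0.50, 0.00],
              [0.00, 0.50, 0.50],
              [0.75, 0.25, 0.00]])

def mean_first_passage(P, j):
    """Solve m_ij = 1 + sum_{k != j} p_ik m_kj for all i != j."""
    idx = [i for i in range(P.shape[0]) if i != j]
    Q = P[np.ix_(idx, idx)]          # transitions that avoid state j
    m = np.linalg.solve(np.eye(len(idx)) - Q, np.ones(len(idx)))
    return dict(zip(idx, m))

print(mean_first_passage(P, 2))      # {0: 4.0, 1: 2.0}, i.e. m_13 = 4, m_23 = 2
```

The output {0: 4.0, 1: 2.0} matches m_13 = 4 and m_23 = 2 found above.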
Practice Questions
1. Define the term "steady-state probability".
2. Write an expression for a limiting distribution.
3. Solve completely the problem in Example 5.1, and draw all the graphs.
4. Use matrix multiplication and limiting probabilities to solve Problem 5.2.
5. In the post-test in lecture four, obtain the stationary probabilities.
6. Define and write an expression for
   (a) first-passage probability;
   (b) first-return probability.
7. Using the post-test of lecture four, find the mean first-passage time from state 5 to state 4 by making state 4 absorbing. (This has nothing to do with states {1, 2}.) (The University of Sydney, 2009)
8. The transition matrix P of a Markov chain X = (X_n : n >= 0) is

          1     2     3     4     5
    1  |  0     0     0     0     1  |
    2  |  0     0    1/3   1/2   1/6 |
    3  |  0     0     1     0     0  |
    4  |  0    1/3    0    1/6   1/2 |
    5  | 1/2    0     0     0    1/2 |

   (a) Specify the classes of this chain and determine whether they are transient, null recurrent or positive recurrent.
   (b) Find all stationary distributions for this chain.
   (c) Find the mean recurrence time m_jj for all positive recurrent states.
   (The University of Sydney, 2010)

CHAPTER 19
CHAPMAN-KOLMOGOROV EQUATIONS AND CLASSIFICATION OF STATES

17.1 Introduction
The nth-step transition probability P_{ij}^n is the probability that a process in state i will be in state j after n additional transitions; that is,

    P_{ij}^n = P{X_{n+m} = j | X_m = i},  n >= 0, i, j >= 0.

The Chapman-Kolmogorov equations provide a method for computing these n-step transition probabilities. These equations are

    P_{ij}^{n+m} = \sum_{k=0}^{infinity} P_{ik}^n P_{kj}^m  for all n, m >= 0, all i, j,

and are established by conditioning on the state at time n (the proof is given in Section 17.1.1 below). If we let P^{(n)} denote the matrix of n-step transition probabilities P_{ij}^n, then it can be asserted that

    P^{(n+m)} = P^{(n)} . P^{(m)},

where the dot represents matrix multiplication. Hence,

    P^{(n)} = P . P^{(n-1)} = P . P . P^{(n-2)} = ... = P^n,

and thus P^{(n)} may be calculated by multiplying the matrix P by itself n times.

State j is said to be accessible from state i if P_{ij}^n > 0 for some n >= 0. Two states i and j that are accessible to each other are said to communicate, and we write i <-> j. Note that P_{ik}^n P_{kj}^m represents the probability that, starting in i, the process will go to state j in n + m transitions through a path which takes it into k at the nth transition.

17.1.1 Proof of the C-K Equations
Summing over all intermediate states k yields the probability that the process will be in state j after n + m transitions:

    P_{ij}^{n+m} = P{X_{n+m} = j | X_0 = i}
                 = \sum_k P{X_{n+m} = j, X_n = k | X_0 = i}
                 = \sum_k P{X_{n+m} = j | X_n = k, X_0 = i} P{X_n = k | X_0 = i}
                 = \sum_k P_{ik}^n P_{kj}^m.

By induction, P^{(n)} = P^{(n-1)} . P = P^n; that is, the n-step transition matrix may be obtained by multiplying the matrix P by itself n times.

Example 17.1
Consider the example in which the weather is considered as a two-state Markov chain. If alpha = 0.7 and beta = 0.4, calculate the probability that it will rain four days from today given that it is raining today.

Solution
The one-step transition probability matrix is given by

    P = | 0.7  0.3 |
        | 0.4  0.6 |

Hence,

    P^{(2)} = P^2 = | 0.61  0.39 |        P^{(4)} = (P^{(2)})^2 = | 0.5749  0.4251 |
                    | 0.52  0.48 |,                               | 0.5668  0.4332 |

Hence the required probability P_{00}^4 equals 0.5749.

Example 17.2
Consider Example 2.4. Given that it rained on Monday and Tuesday, what is the probability that it will rain on Thursday?

Solution
The two-step transition matrix is given by

    P^{(2)} = P^2 = | 0.7  0    0.3  0   |^2   | 0.49  0.12  0.21  0.18 |
                    | 0.5  0    0.5  0   |   = | 0.35  0.20  0.15  0.30 |
                    | 0    0.4  0    0.6 |     | 0.20  0.12  0.20  0.48 |
                    | 0    0.2  0    0.8 |     | 0.10  0.16  0.10  0.64 |

Since rain on Thursday is equivalent to the process being in either state 0 or state 1 on Thursday, the required probability is given by P_{00}^2 + P_{01}^2 = 0.49 + 0.12 = 0.61.
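Both examples amount to taking matrix powers, which is a one-liner in practice. The following sketch (not from the text; it assumes numpy) reproduces the two answers above.

```python
import numpy as np

# Example 17.1: two-state weather chain
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
print(np.linalg.matrix_power(P, 4))    # [[0.5749, 0.4251], [0.5668, 0.4332]]

# Example 17.2: four-state chain; "rain on Thursday" = state 0 or 1 on Thursday
P4 = np.array([[0.7, 0.0, 0.3, 0.0],
               [0.5, 0.0, 0.5, 0.0],
               [0.0, 0.4, 0.0, 0.6],
               [0.0, 0.2, 0.0, 0.8]])
P4_2 = P4 @ P4
print(P4_2[0, 0] + P4_2[0, 1])         # 0.61
```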
17.2 Classification of States
In order to analyse precisely the asymptotic behaviour of the Markov chain process, we need to introduce some principles of classifying the states of a Markov chain.

Proposition: Communication satisfies
(i) i <-> i;
(ii) if i <-> j, then j <-> i;
(iii) if i <-> j and j <-> k, then i <-> k.

Proof: The first two parts follow trivially from the definition of communication. To prove (iii), suppose that i <-> j and j <-> k; then there exist m, n such that P_{ij}^m > 0 and P_{jk}^n > 0. Hence,

    P_{ik}^{m+n} = \sum_r P_{ir}^m P_{rk}^n >= P_{ij}^m P_{jk}^n > 0.

Similarly, we may show there exists an s for which P_{ki}^s > 0.

Two states that communicate are said to be in the same class, and by the proposition any two classes are either disjoint or identical. We say that the Markov chain is irreducible if there is only one class, that is, if all states communicate with each other.

State i is said to have period d if P_{ii}^n = 0 whenever n is not divisible by d, and d is the greatest integer with this property. (If P_{ii}^n = 0 for all n > 0, then define the period of i to be infinite.) A state with period 1 is said to be aperiodic. Let d(i) denote the period of i; it can be shown that periodicity is a class property.

17.2.2 Recurrent (or Persistent) State
A state i in S is said to be recurrent if Pr(T_i < infinity) = 1, where T_i denotes the time of the first return to state i.

Exercise: Suppose that tomorrow's weather depends on the weather conditions for the last two days, as follows: if it was sunny both today and yesterday, then it will be sunny tomorrow with probability 0.6; if it was cloudy today but sunny yesterday, then it will be sunny tomorrow with probability 0.4; if it was cloudy for the last two days, then it will be sunny tomorrow with probability 0.1.

Definitely, the model above is not a Markov chain. However, such a model can be transformed into a Markov chain.
(a) Transform this into a Markov chain.
(b) Obtain the transition probability matrix.
(c) Find the stationary distribution of this Markov chain.

Solution
(a) Suppose we say that the state at any time is determined by the weather conditions during both that day and the previous day. We say the process is in:
State (S, S) if it was sunny both today and yesterday;
State (S, C) if it was sunny yesterday but cloudy today;
State (C, S) if it was cloudy yesterday but sunny today;
State (C, C) if it was cloudy both today and yesterday.

A related exercise asks: (b) obtain the transition matrix, and (c) find the stationary distribution in terms of p and q, where p + q = 1. Its solution begins: the states are (2, 0), (1, 0), (1, 1) and (0, 1), and the first two rows of the transition matrix are (q, p, 0, 0) and (0, 0, q, p).

The Poisson process satisfies the differential-difference equations

    p_n'(t) = lambda p_{n-1}(t) - lambda p_n(t),  n = 1, 2, ...,
    p_0'(t) = -lambda p_0(t),

so that log p_0(t) = -lambda t and p_0(t) = e^{-lambda t}.

Writing D for the differential operator, p_n'(t) = lambda p_{n-1}(t) - lambda p_n(t) can be rewritten as

    (D + lambda) p_n(t) = lambda p_{n-1}(t),  n > 0.

Notice that, since (D + lambda)[e^{-lambda t} t^{r+1}/(r+1)!] = e^{-lambda t} t^r / r!,

    (1/(D + lambda)) [lambda^j e^{-lambda t} t^r / r!] = lambda^j e^{-lambda t} t^{r+1} / (r+1)!.

At n = 1: (D + lambda) p_1(t) = lambda p_0(t) = lambda e^{-lambda t}, so (taking r = 0, j = 1)

    p_1(t) = lambda t e^{-lambda t} = (lambda t) e^{-lambda t}.

At n = 2: (D + lambda) p_2(t) = lambda p_1(t) = lambda^2 t e^{-lambda t}, so (r = 1, j = 2)

    p_2(t) = lambda^2 t^2 e^{-lambda t} / 2! = (lambda t)^2 e^{-lambda t} / 2!.

At n = 3 the same step gives p_3(t) = (lambda t)^3 e^{-lambda t} / 3!. In general,

    p_n(t) = (lambda t)^n e^{-lambda t} / n!,  n = 0, 1, 2, ....

If we fix t, then lambda t is a fixed parameter and the set p_0(t), p_1(t), ... gives the probability distribution of the process at the fixed time t, which is a Poisson distribution. In terms of counts of events, the above result shows that the number of events occurring in a fixed time interval t is distributed as Poisson with parameter lambda t. Also, since the mean of the Poisson distribution is equal to the parameter lambda t, lambda t can be interpreted as the expected number of events that can occur in time t; the quantity lambda is the average or mean rate of occurrence of E.
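The closed form p_n(t) = (lambda t)^n e^{-lambda t}/n! can be checked by simulation. The sketch below (not from the text; it assumes numpy, and the rate lambda = 2 and horizon t = 3 are arbitrary illustrative choices) builds the counting process from exponential inter-arrival times and compares the empirical distribution of the count with the Poisson formula.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(1)
lam, t, reps = 2.0, 3.0, 20_000

# Count arrivals in [0, t] by accumulating Exp(lambda) inter-arrival times
counts = np.empty(reps, dtype=int)
for r in range(reps):
    total, n = 0.0, 0
    while True:
        total += rng.exponential(1 / lam)
        if total > t:
            break
        n += 1
    counts[r] = n

# Compare with p_n(t) = (lam*t)^n e^(-lam*t) / n!
for n in range(6):
    theory = (lam * t) ** n * exp(-lam * t) / factorial(n)
    print(n, round((counts == n).mean(), 4), round(theory, 4))
```

The empirical frequencies agree with the Poisson probabilities, illustrating the equivalence between exponential inter-arrival times and Poisson counts.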
Suppose that a continuous-time Markov chain enters state i at some time, say time 0, and suppose that the process does not leave state i (that is, a transition does not occur) during the next s time units. What is the probability that the process will not leave state i during the following t time units? To answer this, note that as the process is in state i at time s, it follows, by the Markovian property, that the probability it remains in that state during the interval [s, s + t] is just the (unconditional) probability that it stays in state i for at least t time units. That is, if we let T_i denote the amount of time that the process stays in state i before making a transition into a different state, then

    P{T_i > s + t | T_i > s} = P{T_i > t}

for all s, t >= 0. Hence, the random variable T_i is memoryless and must thus be exponentially distributed.

17.5 Continuous Time Process
A continuous-time Markov chain is a stochastic process having the Markovian property that the conditional distribution of the future state at time t + s, given the present state at t and all past states, depends only on the present state and is independent of the past.

The above gives us a way of constructing a continuous-time Markov chain, namely, it is a stochastic process having the properties that each time it enters state i:
(i) the amount of time it spends in that state before making a transition into a different state is exponentially distributed with rate, say, v_i; and
(ii) when the process leaves state i, it will next enter state j with some probability, call it p_{ij}, where \sum_{j =/= i} p_{ij} = 1.

A state i for which v_i = infinity is called an instantaneous state, since when entered it is instantaneously left. Whereas such states are theoretically possible, we shall assume throughout that 0 <= v_i < infinity for all i. (If v_i = 0, then state i is called absorbing, since once entered it is never left.) Hence, for our purposes, a continuous-time Markov chain is a stochastic process that moves from state to state in accordance with a (discrete-time) Markov chain, but is such that the amount of time it spends in each state, before proceeding to the next state, is exponentially distributed.

17.5.1 Definition and Properties
Consider a continuous-time stochastic process {X(t), t >= 0} taking on values in the set of non-negative integers. In analogy with the definition of a discrete-time Markov chain given earlier, we say that {X(t), t >= 0} is a continuous-time Markov chain if for all s, t >= 0 and non-negative integers i, j, x(u), 0 <= u < s,

    P{X(t + s) = j | X(s) = i, X(u) = x(u), 0 <= u < s} = P{X(t + s) = j | X(s) = i}.

If, in addition, P{X(t + s) = j | X(s) = i} is independent of s, then the continuous-time Markov chain is said to have stationary or homogeneous transition probabilities. All Markov chains we consider will be assumed to have stationary transition probabilities.

In addition, the amount of time the process spends in state i and the next state visited must be independent random variables. For if the next state visited were dependent on T_i, then information as to how long the process has already been in state i would be relevant to the prediction of the next state, and this would contradict the Markovian assumption.
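The two-ingredient construction just described (exponential holding times with rates v_i, jump probabilities p_{ij}) translates directly into a simulator. Below is a minimal sketch (not from the text; the three-state rates and jump matrix are hypothetical illustrative values) that estimates the long-run fraction of time spent in each state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state chain: holding rates v_i and jump probabilities p_ij
v = np.array([1.0, 2.0, 0.5])
p = np.array([[0.0, 0.7, 0.3],
              [0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0]])   # p_ii = 0, rows sum to 1

def simulate(t_end, state=0):
    """Each visit to state i lasts Exp(v_i); the next state is drawn from p_i."""
    t, occupancy = 0.0, np.zeros(3)
    while t < t_end:
        stay = rng.exponential(1 / v[state])
        occupancy[state] += min(stay, t_end - t)
        t += stay
        state = rng.choice(3, p=p[state])
    return occupancy / t_end

print(simulate(50_000))   # long-run fraction of time in each state
```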
We shall assume from now on that all Markov chains considered state of the process at time n is the state of the process at the immediately preceding are regular. time. Let qtj be defined by 2. A Markov process is said to be time-homogeneous or stationary if Qij = VtPij, V i * j P{X(t2) = y /^ ( t1) = i) = P iX iti -h ) = j /X ( 0) = i)V i and j, tj < t 2 Since v, is the rate at which the process leaves state i and p,y is the probability that it In words, the process is stationary or time homogeneous if the conditional probability then goes to j , it follows that qtj is the rate when in state i that the process makes a in (2) depends only on the time interval between the events considered, rather on the transition into state j \ and in fact we call qtj the transition rate from / to j. absolute time. Note that ‘time-homogeneous’ and ‘stationary’ denote sameness in Let us denote by Pij(t) the probability that a Markov chain, presently in state i, will be lime. We can also know that a stationary Markov process is defined completely by the in state j after an additional time t transitional probability function which we defined as P j M = = M m = '} PijCO = p M O = ;'/* ( 0) = i } 17.6 The Exponential Process The fundamental equation for stationary Markov process is Chapman-Kolmogorov Let us consider a finite state but continuous time process. Let X (t) denote a random equation for p,y(t + r). By definition, variable. The value of X (t) at fixed t is the state of the process at time t. A time dependent process is the set (/(t)fo r given t > 0. X fo)depends on ^ > t0, fjtl(l + r) = P[XU + t) = j/X (0 ) = 0 and not on t2* ><£2- The process is continuos if t can take value on the t-axis. = ^ P[X(t + r) = j ,X ( t) = k /X {0) = 0 Marginal from joint Definition k Using Markov assumption - ^ P{X(l + t) = j /X { t) = l(,X(0) = i) P{X(t) = k,X (0) = 0 k 312 313 UNIVERSITY OF IBADAN LIBRARY But P{X{t) = k ,X( 0) = /} = P{X(t) = k / X (0 ) = flP fvff)) = i} Under the assumption that py(t) is a continuous function oft, we can express Therefore, Py(At)by the use ofMaclaurin series. Pij(t + r) = P{X(t + t ) = j /X( t ) = k,X( 0) = 0 W O = k / X (0) = i} k P u m = p u ( o ) + + i p ^ ' t o x ^ o 3 + - This is because PfA'(0) = t} = 1 Thus = PiyCO) + Pi'yC0)^t + OĈ Jt)2 ^ P « t + r) = j /X ( t ) = = k /X (0) = /) k LetpijiP) = Ay By the stationary assumption in (2) Py(At) = Pi/ (o) + Ay At -I- o(At)2 for i * j Plj(t + r) = £ P{X(t) = y/X(0) = 4} P{X(t) = k/XQS) = £} Py(At) = Ay A t + o(At)2 for i = ; k = Pkj(j)Ptktf) (By definition) • k Also, let p'y(0) = Xjj This is the general form of Chapman-Kolmogorov equation. Py(At) = 1 + Py y At + o(At)2 A specified form of this is: = 1 + Xjj At + o(At)2 P i j ( t + A t ) = Y P i k ( t ) p k j ( A t ) k The above is forward Chapman-Kolmogrov equation. Sincep-y(O) = 0 /o r i * j is a minimum, Ay is positive. Also, since Py(0) For i = j is a maximum, Ay is non-positive. The forward Chapman-Kolmogrov equation is given as We can unite the forward Chapman-Kolmogorov. 
A specified form of this is

    p_{ij}(t + Delta t) = \sum_k p_{ik}(t) p_{kj}(Delta t).

We expect the following to hold:
(i) 0 <= p_{ij}(t) <= 1 for all t;
(ii) p_{ij}(0) = P{X(0) = j | X(0) = i} = 1 for i = j, and 0 for i =/= j;
(iii) \sum_j p_{ij}(t) = 1 for any given i.

Under the assumption that p_{ij}(t) is a continuous function of t, we can express p_{kj}(Delta t) by the use of a Maclaurin series:

    p_{kj}(Delta t) = p_{kj}(0) + p'_{kj}(0) Delta t + (1/2) p''_{kj}(0)(Delta t)^2 + ...
                    = p_{kj}(0) + p'_{kj}(0) Delta t + O((Delta t)^2).

Let p'_{ij}(0) = lambda_{ij}. Then

    p_{ij}(Delta t) = lambda_{ij} Delta t + o(Delta t)      for i =/= j,
    p_{ii}(Delta t) = 1 + lambda_{ii} Delta t + o(Delta t).

Since p_{ij}(0) = 0 for i =/= j is a minimum, lambda_{ij} is non-negative; also, since p_{ii}(0) = 1 is a maximum, lambda_{ii} is non-positive.

We can now write the forward Chapman-Kolmogorov equation:

    p_{ij}(t + Delta t) = \sum_k p_{ik}(t) p_{kj}(Delta t)
                        = p_{ij}(t) p_{jj}(Delta t) + \sum_{k =/= j} p_{ik}(t) p_{kj}(Delta t)
                        = p_{ij}(t)[1 + lambda_{jj} Delta t + o(Delta t)] + \sum_{k =/= j} p_{ik}(t)[lambda_{kj} Delta t + o(Delta t)],

so that

    [p_{ij}(t + Delta t) - p_{ij}(t)] / Delta t = p_{ij}(t) lambda_{jj} + \sum_{k =/= j} p_{ik}(t) lambda_{kj} + o(Delta t)/Delta t.

Taking the limit as Delta t -> 0,

    dp_{ij}(t)/dt = \sum_k p_{ik}(t) lambda_{kj}.

In matrix form, with Lambda = (lambda_{ij}) and P(t) = (p_{ij}(t)),

    dP(t)/dt = P(t) Lambda.

(A numerical sketch of this forward equation appears at the end of this chapter.) But \sum_j p_{ij}(t) = 1, so differentiating, \sum_j p'_{ij}(t) = 0; at t = 0 this gives \sum_j p'_{ij}(0) = 0, that is,

    \sum_j lambda_{ij} = lambda_{ii} + \sum_{j =/= i} lambda_{ij} = 0.

Thus, since every off-diagonal element of Lambda is non-negative, the diagonal element lambda_{ii} must be equal in magnitude and opposite in sign to the sum of the other elements in the same row. lambda_{ij} is called the transition rate from i to j for i =/= j, and can be interpreted as the parameter of a negative exponential distribution: for each lambda_{ij}, the exponential distribution gives the distribution of the time spent in state i, given that j is the next state. Thus, if T_{ij} is the random variable with density

    f(t) = lambda_{ij} e^{-lambda_{ij} t},  t > 0,

its mean is

    E(T_{ij}) = \int_0^infinity t f(t) dt = \int_0^infinity lambda_{ij} t e^{-lambda_{ij} t} dt = (1/lambda_{ij}) Gamma(2) = 1/lambda_{ij},  since Gamma(2) = 1.

Suppose that we have the likelihood of a sample t_1, ..., t_n from this distribution:

    L = \prod_{i=1}^n lambda e^{-lambda t_i},  log L = n log lambda - lambda \sum t_i,

    d log L / d lambda = n/lambda - \sum t_i = 0,

which gives lambda-hat = n / \sum t_i = 1/t-bar. So lambda_{ij} can be estimated as the inverse of a sample mean.

Practice Questions
1. Consider a two-state process such as the operation of a loom for weaving cloth. The two states for the loom are 0 (the loom is shut off and the operator is repairing it) and 1 (the loom is operating and the operator is idle). Consider the operating and repair times as continuous. Assume that the constant of proportionality is 3 for the repair transition and 2 for the breakdown transition. Find the probability distributions of the repair and the operating times.
2. Obtain the general form of the Chapman-Kolmogorov (C-K) equation.
3. Show that the transition rate lambda_{ij} from i to j, for all i =/= j, can be estimated as the reciprocal of a sample mean.
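As promised above, the forward equation dP(t)/dt = P(t) Lambda can be checked numerically. The sketch below (not from the text; it assumes numpy and uses the rate matrix built in the previous sketch, which is hypothetical) integrates the equation by small Euler steps from P(0) = I and confirms that P(t) stays stochastic and that p_{ij}(Delta t) is approximately lambda_{ij} Delta t for small Delta t.

```python
import numpy as np

# Hypothetical rate matrix: off-diagonal lambda_ij >= 0, rows sum to 0
L = np.array([[-1.0, 0.7, 0.3],
              [ 1.0, -2.0, 1.0],
              [ 0.5, 0.0, -0.5]])

def transition_matrix(L, t, steps=100_000):
    """Integrate the forward equation dP/dt = P L with P(0) = I (Euler)."""
    P = np.eye(L.shape[0])
    h = t / steps
    for _ in range(steps):
        P = P + h * (P @ L)
    return P

P1 = transition_matrix(L, 1.0)
print(P1)
print(P1.sum(axis=1))                     # each row still sums to 1

Ph = transition_matrix(L, 0.01, steps=1_000)
print((Ph - np.eye(3)) / 0.01)            # ~ L: p_ij(dt) = lambda_ij dt + o(dt)
```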
CHAPTER 20
INTRODUCTION TO THE THEORY OF GAMES AND QUEUING MODELS

18.1 Games Theory
Games theory is a branch of stochastic processes that can be applied to situations such as business, stock trading, politics, and so on, where the person involved can be referred to as a player or simply a gambler.

18.2 Gambler's Ruin
Consider a gambler who plays a game of chance against an adversary. Suppose that at the start of the game the gambler deposits an amount Z in naira; the adversary deposits N - Z in naira, where N is the cumulated initial capital. The rules of the game are:
1. If the gambler wins a game he takes N1 from the adversary, and loses the same to the adversary otherwise.
2. The game terminates if either player loses all his deposit. When the gambler loses all his deposit, he is said to be ruined.
3. No game is jumped.

We can put the money on a number scale:

    1, 2, ..., Z, ..., N

The gain or loss is represented by movement along the scale: the gambler's gain is represented by movement to the right, and his loss by movement to the left. No point on the scale is jumped, and movement in either direction on the scale is by pure chance. The movement along the scale can thus be seen as that of a particle that moves at random forward and backward; because of this the process is known as a random walk. The points on the scale represent the states of the process. Movement to a point on the scale depends only on the point the gambler (or adversary) currently occupies; the process is therefore a Markov chain.

Let p denote the probability of winning a game, and q = 1 - p the probability of moving to the left of Z, that is, of losing a game (by the gambler). Let the points on the scale be denoted by Z_0, Z_1, ..., Z_N, and let q_Z denote the probability of the gambler's ultimate ruin starting from Z. Conditioning on the outcome of the first game,

    q_Z = p q_{Z+1} + q q_{Z-1}.

To unify these equations we define the boundary conditions

    q_0 = 1,  q_N = 0.

Since p + q = 1, the recurrence can also be written as p(q_Z - q_{Z+1}) = q(q_{Z-1} - q_Z), so that the successive differences satisfy q_Z - q_{Z+1} = (q/p)(q_{Z-1} - q_Z) and can be summed telescopically. Alternatively, for p =/= q the general solution can be written as

    q_Z = A + B (q/p)^Z.

The boundary conditions give A + B = 1 (at Z = 0) and A + B(q/p)^N = 0 (at Z = N). Solving this system of equations, we obtain

    q_Z = [(q/p)^Z - (q/p)^N] / [1 - (q/p)^N],  p =/= q.

When p = q = 1/2, the general solution is q_Z = A + BZ. Under the boundary conditions q_0 = 1 and q_N = 0, we have A = 1 and B = -1/N; thus

    q_Z = 1 - Z/N.

18.2.2 Gambler's Expected Gain (G)
The possible values are a gain of N - Z with probability 1 - q_Z, and a loss of Z with probability q_Z. The expected gain is

    E(G) = (Combined capital)(Probability of gain) - (Initial capital)
         = (N - Z)(1 - q_Z) - Z q_Z = N(1 - q_Z) - Z.

Substituting q_Z = 1 - Z/N (the fair case p = q),

    E(G) = N(Z/N) - Z = 0;

that is, a fair game has zero expected gain.

18.2.3 Expected Duration of the Game
Assume that the expected duration of the game has a known value D_Z when the gambler starts from Z. If the first trial is a win, the expected remaining duration is D_{Z+1}; if a loss, it is D_{Z-1}. Hence

    D_Z = p D_{Z+1} + q D_{Z-1} + 1,

with the required boundary conditions D_0 = 0 and D_N = 0. For p =/= q a particular solution is Z/(q - p), and the complete solution is

    D_Z = Z/(q - p) + A + B(q/p)^Z.

The boundary conditions give A + B = 0 at Z = 0, and at Z = N,

    N/(q - p) + A + B(q/p)^N = 0,

so that B = (N/(q - p)) / [1 - (q/p)^N] and A = -B. Substituting for A and B, we have

    D_Z = Z/(q - p) - (N/(q - p)) [1 - (q/p)^Z] / [1 - (q/p)^N].

(For p = q = 1/2 the corresponding result is D_Z = Z(N - Z).)
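The ruin probability and expected duration derived above are easy to check by direct computation and by Monte Carlo. The sketch below (not from the text; it assumes numpy, and the values Z = 3, N = 10, p = 0.45 are arbitrary illustrative choices) does both.

```python
import numpy as np

def ruin_probability(Z, N, p):
    """q_Z: probability the gambler (capital Z, total capital N) is ruined."""
    q = 1 - p
    if p == q:
        return 1 - Z / N
    r = q / p
    return (r**Z - r**N) / (1 - r**N)

def expected_duration(Z, N, p):
    q = 1 - p
    if p == q:
        return Z * (N - Z)
    r = q / p
    return Z / (q - p) - (N / (q - p)) * (1 - r**Z) / (1 - r**N)

# Monte Carlo check of q_Z
rng = np.random.default_rng(7)
Z, N, p, reps = 3, 10, 0.45, 50_000
ruined = 0
for _ in range(reps):
    z = Z
    while 0 < z < N:
        z += 1 if rng.random() < p else -1   # one game: win +1, lose -1
    ruined += (z == 0)
print(ruined / reps, ruin_probability(Z, N, p))
print(expected_duration(Z, N, p))
```

The simulated ruin frequency agrees with the closed-form q_Z to within sampling error.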
18.3 Queuing Theory
The principal pioneer of queuing systems was A. K. Erlang, who began in 1908 to study problems of telephone congestion for the Copenhagen Telephone Company. He was concerned with problems such as the following: a manually operated telephone exchange has a limited number (one or more) of operators; when a subscriber attempts to make a call, the subscriber must wait if all the operators are already busy making connections for other subscribers. It is of interest to study the waiting times of subscribers, e.g. the average waiting time and the chance that a subscriber will obtain service immediately without waiting, and to examine how much the waiting times will be affected if the number of operators or other conditions are changed in any way. If there are more operators, or if service can be speeded up, subscribers will be pleased because waiting will be reduced, but the improved facility will become more expensive to maintain; therefore, a reasonable balance must be struck.

18.3.1 Applications of Queuing Theory
When persons or things needing the services of a facility arrive at a service channel or counter faster than the facility can serve them all at once, a queue or waiting line is formed. Examples of this include:
(i) cars arriving at a fuel station waiting to be served;
(ii) persons at a bus station waiting to be checked in;
(iii) books arriving at a librarian's desk;
(iv) patients waiting to see a doctor or community health dispenser;
(v) customers arriving at a departmental store (supermarket);
(vi) clients waiting to see the Customer Service Executive or Officer.

Queuing theory is applied in every field of human endeavour, because there is no perfect service or treatment that can be meted out. Below are some of the fields of application:
(i) Business - banks, supermarkets, booking offices, and so on.
(ii) Industries - servicing of automatic machines, production lines, storage, and so on.
(iii) Engineering - telephony, communication networks, electronic computers, and so on.
(iv) Transportation - airports, harbours, railways, traffic operations in cities, postal services, and so on.
(v) Others - elevators, restaurants, barber shops, and so on.

18.3.2 Concept and Definition
Queuing theory is concerned with the design and planning of service facilities to meet a randomly fluctuating demand for service, in order to minimize congestion and maintain an economic balance between service cost and waiting cost. The cost here refers to time.

A queuing system is composed of customers arriving at a service channel to be attended to by one or more of the service attendants. If a customer is not served immediately he may decide to wait; in the process, however, a few customers may leave the line if they cannot wait. At the end of the process, served customers leave the system.

[Figure: A queuing system. Arriving customers join the queue, are served, and leave; discouraged customers may leave without service. (After Swarup et al. 1978, p. 505.)]

18.3.3 Components of the Queue System
A queue situation can be divided into five elements:
(i) Arrival mode
(ii) Service mechanism
(iii) Service channels
(iv) System capacity
(v) Queue discipline

(i) Arrival Mode - this refers to the rate at which customers arrive at a service centre and the statistical law which governs the pattern of arrivals. Certain definitions pertain to the arrival of customers:
bulk or batch arrival: more than one arrival allowed to enter the system simultaneously;
balk: a customer deciding not to enter a queue because it is long or lengthy;
renege: a customer leaving a queue due to impatience;
jockey: a customer jostling among parallel queues;
stationary: an arrival pattern which does not change with time;
transient: a time-dependent arrival process.
A Poisson arrival mode is denoted by M.

Symbols and Notations
We shall employ the following symbols and notations in this lecture:
n     = number of customers in the system, both waiting and in service
lambda = average number of customers arriving per unit of time
mu    = average number of customers being served per unit of time
rho   = lambda/mu, the traffic intensity
c     = number of parallel service channels (servers)
E(n)  = average number of customers in the system, both waiting and in service
E(m)  = average number of customers waiting in the queue
E(v)  = average waiting time of a customer in the system, both waiting and in service
E(w)  = average waiting time of a customer in the queue
P_n(t) = probability that there are n customers in the system at time t, both waiting and in service
P_n   = time-independent probability that there are n customers in the system, both waiting and in service

(ii) Service Mechanism - this refers to the number of service points that are available and the duration of service. If the service points or servers were infinite, service would be instantaneous and no queue would result; with finitely many service points, a queue is inevitable. Customers can be served according to a specific order, possibly in batches of fixed or variable size; such a system is called a bulk service system.

(iii) Service Channels - where there is more than one channel of service, the arrangement of service may be in parallel or in series, or a combination of both, depending on the system design.

(iv) System Capacity - most queuing systems are limited in the size of their waiting rooms. This puts a limit on the number of customers that can be admitted to the waiting line at any given time. Such situations give rise to finite source queues, and result in forced balking.

(v) Queue Discipline - this is the method of customer selection for service when a queue has been formed. The different forms of discipline include:
(a) First Come, First Served (FCFS), also called First In, First Out (FIFO)
(b) First In, Last Out (FILO)
(c) Last In, First Out (LIFO)
(d) First In, First Out with Priority (FIFOP)
(e) Selection for Service In Random Order (SIRO)

18.4 The Basic Queuing Process
The statistical pattern by which customers arrive over a period of time must be specified. It is usually assumed that they are generated according to a Poisson process; that is, the number of customers who arrive up to any specific time has a Poisson distribution. The Poisson assumption means that the probability of occurrence of an arrival is independent of what has occurred in preceding observations. It specifies the number of arrivals per unit time, lambda (the mean arrival rate), while 1/lambda is the length of the interval between two consecutive arrivals; this time between two consecutive arrivals is referred to as the "inter-arrival time."

The mean service rate mu is the number of customers served per unit time, while the average service time 1/mu is the time units per customer. The service time delivered is given by an exponential distribution, where the servicing of a customer takes place between times t and t + Delta t.

18.5 Poisson Process and Exponential Distribution
In queuing theory, the arrival rate and the service rate follow Poisson distributions. It should be noted that the number of occurrences in some time interval is a Poisson random variate, while the time between successive occurrences has an exponential distribution; the two descriptions are equivalent.
18.5.1 Axioms of the Poisson Process
Given an arrival process {N(t), t >= 0}, where N(t) denotes the total number of arrivals up to time t, with N(0) = 0, an arrival process characterized by the following assumptions (axioms) can be described as a Poisson process:

AXIOM 1 - The numbers of arrivals in non-overlapping intervals are statistically independent; that is, the process has independent increments.

AXIOM 2 - The probability of more than one arrival between time t and time t + Delta t is o(Delta t); that is, the probability of two or more arrivals during the small time interval Delta t is negligible. This implies that

    p_0(Delta t) + p_1(Delta t) + o(Delta t) = 1.

AXIOM 3 - The probability that an arrival occurs between time t and time t + Delta t is lambda Delta t + o(Delta t). This implies that

    p_1(Delta t) = lambda Delta t + o(Delta t),

where lambda, a constant, is independent of N(t), Delta t is an incremental element, and o(Delta t) represents terms such that

    lim_{Delta t -> 0} o(Delta t)/Delta t = 0.

18.6 Classification of Queuing Systems
Queuing systems, generally, may be completely specified in the following symbolic form:

    (a|b|c) : (d|e)

Description:
first symbol (a)  - type of distribution of inter-arrival times
second symbol (b) - type of distribution of inter-service times
third symbol (c)  - number of servers
fourth symbol (d) - system capacity
fifth symbol (e)  - queue discipline

For the first and second symbols, the following letters may be used:
M   = Poisson arrival or departure distribution
E_k = Erlangian or gamma inter-arrival or service distribution
GI  = general input distribution
G   = general service time distribution

An example of a queue system is (M|E_k|C) : (N|SIRO).

Queuing systems are classified into
(i) Poisson queues
(ii) non-Poisson queues

Definitions
Transient state: when a queuing system has its operating characteristics (e.g. input, output, mean queue length, etc.) dependent upon time, it is said to be in a transient state.
Steady state: a queue system is in steady state when its operating characteristics are independent of time. If P_n(t) is the probability that there are n customers in the system at time t, then in the steady-state case

    lim_{t -> infinity} P_n(t) = P_n  (independent of t).
If Then, we assume that n > 1 (having arrivals and service independent of each other), it can be easily seen that lira P|' (' + ^ ' = - U + m ) P J ' ) + 0 ) + V>,„(/) + o(At)A t a n d Pn (t + At) = P„(t). P(no arrivals in At). P{no service completions in At) lim - ^ , ( / ) + / iP ,( 0 + o(A t) +Pn (0 - P{one arrival in At).P (one service in At) Ai—*0 A/ T^n + l (t). P(one service com pleted in At). P(_no arrivals in At) +Pn-i( t) .P (o n e arriva l in A t).P (no service completions in At) + o(At) So that we have n > 1 - P j n = P ( t ) = A A + p ) P jn + p P J l ) + J.P-i(t) n > I dt and 338 339 UNIVERSITY OF IBADAN LIBRARY In general we have ~dt p„w = K ( ' ) = - K ( ' ) + mP M P. = 1 - 1 P, Vn P. The above are known as difference equations in n and t. The steady-state solutions Proof for Pn in the system at an arbitrary point of time is obtained by taking the limit as By mathematical condition, we have / —► oG.. rp,, *\ n >1 r n ~ - pr n- \ » If the steady-state exists (/l < /j, as t - » co), then P P P„(t) —> P„ and Pn (/) —> 0 as t —»oo A + p ' - i P P \ P I f A = p there exist no queue X -'+ fiX 1 A" If — > I we have an explosive state P ’ Using the condition of steady state,we have = - Pr 0 = -(A + ^ + / / P „ , + ^ _ , ; n > 1 9 . Using the boundary condition; Z ^ . = 1. then 6.5 becomes and 0 = -AP0 + fjPy >=Z P* = ^ o Z n-o l/'J Using iterate procedure we have = P., Sum o f geometric series where — < 1 'pi ^ p' ii P 1 - P P ' A > p2 = f x + » V - a k = p J p = /> V - p J f - 1 This implies that p ,= p ; p l / ' J 1 - P Resulting in the steady-state P„ = p" (1 — p \ p < 1 and n >0 341 340 UNIVERSITY OF IBADAN LIBRARY This is the probability distribution of queue length. (jii) Average queue length Characteristics of Model 1 £ ( » . ) = I » p.; (i) Probability of queue size greater or equal to n. where m = n -1 (that is number of customers in queue />(* „ ) = ! > , . = z o - p ) p i minus customer in service) A' *n K = - i ) P „ = i > r , , - l P , t=n r = 0 - p ) p " t p ‘ ~ - 2 K - =( n-0 > p . - i i i = 0j ^ v = . * 7 ^ — [ i - ( i - p ) ] 1-P 1- P (ii) Average number of customers in the system ■ r b - ’ . P 2 E ( n ) = Y j n PB = £ n (l-p )p " 1 - P n=0 0) = ^ m- ■ V ' P(m > 0 ££de 1 P p - /1 = p ( i - p ) f S p - , Since /? < 1 P ( P - * ) d e ^ = P ( ' - p ) LO-p)-J This is because P (m > 0) = P(n > l) = ]jT P(1 - /}, - Px • L"»u P (v) The fluctuation (variance) of queue length 1 - / 9 ^ - / l T ( * ) = I« = o [ n - i r (n ) ] 2P„ = E^=on2P n - [ ^ ( n )]2 By algebraic transformations, P(n) = ( l - p ) £ £ - [ £ f _ P ( 1 -P )2 342 343 UNIVERSITY OF IBADAN LIBRARY V'.-(O) :P(w = 0) = (n-W = P (No customer on the systemn upon arrival) Example 18.1 To find y/uin for / > 0,' we suppose there be n customers in the system upon arrival, A TV repairman finds that the time spent on his jobs has an exponential distribution l or a customer to go into service at time between 0 mid t, it means all the customers with mean 30 minutes. If he repairs sets in the order in which they come in and if the must have been served at time t. arrival of sets is approximately Poisson with an average rate of 10 per day. Therefore, (i) What is the repairman’s expected idle time each day? (ii) How many jobs are ahead o f the average set just brought in? t//i (,)=: p [(n -1) customers are served at timet) P [one customer being served in timedt] Solution {idt 2 = — = — . 
setsperhour 8 ' 4 The waiting time w is therefore w < t] // = ^>V60 = 2 setsperfhour (i) The probability o f no unit in the queue is asZ»t’i^ J o^ - ( / ) + V'-C«) Po n 8 8 Hence the idle time for repairman in 8 hour days = - , 0 = 3 hours „ z l ( « - i ; 8 i = (l - p ) p - //t (l - /o)dt + (l - p) E(n) = - V j o b s o(ii) 2 - /V 4 3 - \ - pe I> 0 18.7.2 Waiting Time Distribution for Model 1 The distribution of waiting time in queue is Waiting time is mostly a continuous random variable and there is a non-zero I " P / = 0 probability of delay being zero. Denote time spent in queue by w. Let (/„.(/) be the ^ {,) = 1 - / » / > o cumulative probability distribution so that from a complex randomness of the Poisson, we have 345 344 UNIVERSITY OF IBADAN LIBRARY Characteristics of Waiting Time Distribution for Model 1 ( i) Average waiting time of a customer (in the queue) (iv) Average waiting time that a customer spends in the system including service X E(v) = | t.\//(wl w > 0)ir £ ( h -) = 0 o u> = f tpp{\ - p) o 0 I * = -------[ x e's dx, for (// - X) = x P - K 1 P _ A p - X p ( l~ p ) p (p ~ X ) Relation between Average Queue Length and Average Waiting Time (ii) Average waiting time of an arrival that has to want (Little’s Formula) E (w /w > 0)= A2 p[w> 0) E(m) p(p-x) E(w) = A £ (v )= —!— p ( p - * ) \ / p p(p ~ x) P - X 1 It can be seen that E(n) = A E(v), E(n) = X E(w) and E(v) = E(w) + — P ~ X P Example 18.2 We note that P(w > 0) = 1 - P(w = 0)= 1 - (l - p ) = p Amvals at a telephone both are considered to be Poisson with an average time of 10 minutes between one arrival and the next. The length of a phone call is assumed to be (iii) For the busy period distribution, suppose v is the random variable denoting the distributed exponentially with mean 3 minutes. total time that a customer had to spend in the system including service. This makes (i) What is the probability that a person arriving at the booth will have to wait? the cumulative density function to be (ii) The telephone department will initial a record booth when convinced that an • arrival would expect waiting far at least 3 minutes for phone. By how much v{w /w > 0) = — U ; where ^(w ) = [ipw (/)] P[w > 0) at should the flow of arrivals increase in order to justify a record booth. A / l P ) / l / 'J t .> 0 347 346 UNIVERSITY OF IBADAN LIBRARY Solution E(v) = \ E(n) = — We are given A u-A. This result applies to the FIFO SIRO and LIFO cases. These three queue discipline ^ = K) = 0, 1®Person Per minute sonly differ in the distribution of waiting time when the probabilities of along and and short waiting times change depending upon the discipline used. When the waiting time distribution is not required, the symbol GD(general discipline) can be used to p = ̂ = 0.33 person per minute represents the three queue disciplines above. (/) P(w > o ) = l - / >u = l - f l - — l P ) 18.7.4 Modellll (M |M |l):(A f|F /FO ) _ A _ 0.01 There is a deviation from the previous model 1 (especially 1) because the number of ~ M ~ 0.33 customers is now finite (W). As long as n < N, the difference equated o f model = 0.33 remains valid for this model. If the system is in state Ew, then the probability of an (ii) The installation of record booth will be justified if the arrival rate is greater arrival into the system is zero. than the waiting time. Then the length of queue will go on increasing. 
Thus, the additional difference equation for n = N becomes Now, E(w) = , ^— r = 3 p „ { t + a/ ) = p H ( i ) [i - M 'l+ ^ v - i ( 0 - M i - H + < > (a 0 MKM-A) A1 resulting in the differential-difference equation. 0.33 (0.33-A1) Where E(w) = 3 and A = A'(w) for record booth. On simplification this yields 4at p n( / ) = - / / P n C 0 + ^ ^ - , « ) A1 = 0.16. hence the arrival rate should become 0.16 person per minute to justifies the and gives the resultant steady state difference equation record booth. 0 = - / i Pn + A P n, ( O 18.7.3 ModelII(Af |M |1): (oo|S //?0) Given the interval 1 < n < N -1 , the complete set of steady-state difference equations This model is similar to model 1. The only difference is in the service discipline. The for this model is as follows. first follow the FIFO rule, while this follows the SJRO rule. We recall that the /^ ,= A P 0 derivation of Pn for model I does not depend on any specific queue discipline, it may pP.,.i = (A + ai)P, - A P„ , then be concluded that for the SIRO rule case, we must have. p„ =( ] - p) p " , n >0 P*3,. = A*\ , The average number of customer in the systemv£(n) remains the same irrespective of cases, FIFO or SIRO. Provided P„ remains unchanged, £ (n ) remain the same in all queue discipline, thus 348 349 UNIVERSITY OF IBADAN LIBRARY \s in model I, by iterative procedure, the first two difference equations are 0 -/g )p " P« = ( j j P „ '.n < N -l 1 „ N * I P * 1 ; 0 < n < N n (he same manner, the value of Pn holds for the last difference equation if n = N. Thus, we have N + 1 (p = 0 = p" P0; n < N Note that the steady-state solution exists even for p > 1. Intuitively, there is sense in Using the boundary condition, we can obtain the value of P0. this since the process is prevented from blowing up by the maximum limit. N Thus, given N ->■ co, the steady-state solution results in Boundary condition is ^ P = P, n=0 P„ = (l - p)p" n< co Thus Which is the same as that in model 1. 1 = ^. 2 > n i - p > Characteristics of Model III { i - (i) Average number of customers in the system is given by p /> (N + j) E(n) = Y inPil =P„YJnp" n “II n “0 Thus, t!>dt dp \ - p N+l 1 - p * « P0 = P'>P Tdp L i1 * -P N + \ ( I - P ) : Hence p [l-(N + l ) p N + N p H*'\ ( M O V ) 350 351 UNIVERSITY OF IBADAN LIBRARY (ii) Average queue length We know that Pn = p(> e", thus £("0 = Z l ( ” _1) Pn = E (n) ~d=Y .P'■ n= I (fl) P, =(0.53) (0.5) = 0.27 = £ ( « ) - ( ! -P „ ) P2 = (0.53) (0:5)2 = 0.13 P3 = (0.53) (0.5)3 = 0.07 (b) E(n) = 1(0.27)+ 2(0.12)+ 3(0.07) = 0.74 _ p 2f l - A T p " - , + ( A f - l ) /) K l V p ) ( i > ) Hence, the coverage number of trains in the queue is 0.74, and each train takes on an (iii) Average waiting time. average 'A (0.085) hours for getting service. As the arrival of new train expects to Using Little’s formula: find on average of 0.74 trains in the system before it. E(w) = (0.74) (0.085) hours E{v) = ~ ^ where A1 is the mean rate of customers entering the system and is equal A = 0.0629 hours or 38 minutes to a ( i -/> ,.) 18.7.5 Model IV (Birth- Death Process) Thus, E(w) = E(y) - — = P X Assume the system to be in date En, the probability of a birth occurring in a small Example 18.3 time interval At is considered as AnAt + o(At); and that of the death is considered as At a railway station, only one train is handled at a time. The railway yard is sufficient finAt + o(At),n > 1. The system being in En at time t means it will remain in En at only for two trains to wait while the other is given signal to leave the station. 
18.7.4 Model III (M|M|1) : (N|FIFO)
There is a deviation from Model I because the number of customers in the system is now limited to N. As long as n < N, the difference equations of Model I remain valid for this model; if the system is in state E_N, the probability of an arrival into the system is zero.

Thus, the additional difference equation for n = N becomes

    P_N(t + Delta t) = P_N(t)[1 - mu Delta t] + P_{N-1}(t) lambda Delta t [1 - mu Delta t] + o(Delta t),

resulting in the differential-difference equation

    dP_N(t)/dt = -mu P_N(t) + lambda P_{N-1}(t),

and the resultant steady-state difference equation

    0 = -mu P_N + lambda P_{N-1}.

Given the interval 1 <= n <= N - 1, the complete set of steady-state difference equations for this model is as follows:

    mu P_1 = lambda P_0,
    mu P_{n+1} = (lambda + mu) P_n - lambda P_{n-1},  1 <= n <= N - 1,
    mu P_N = lambda P_{N-1}.

As in Model I, the iterative procedure on the first two difference equations gives

    P_n = rho^n P_0,  n <= N,

and in the same manner the value of P_n holds for the last difference equation when n = N. Using the boundary condition \sum_{n=0}^{N} P_n = 1,

    1 = P_0 \sum_{n=0}^{N} rho^n = P_0 (1 - rho^{N+1})/(1 - rho),  rho =/= 1,

hence

    P_0 = (1 - rho)/(1 - rho^{N+1}),  rho =/= 1;   P_0 = 1/(N + 1),  rho = 1,

and

    P_n = rho^n (1 - rho)/(1 - rho^{N+1}),  0 <= n <= N.

Note that the steady-state solution exists even for rho > 1. Intuitively this makes sense, since the process is prevented from blowing up by the maximum limit N. Given N -> infinity, the steady-state solution becomes P_n = (1 - rho) rho^n, n < infinity, which is the same as that in Model I.

Characteristics of Model III
(i) The average number of customers in the system is given by

    E(n) = \sum_{n=0}^{N} n P_n = P_0 \sum_{n=0}^{N} n rho^n = P_0 rho (d/d rho) \sum_{n=0}^{N} rho^n = P_0 rho (d/d rho)[(1 - rho^{N+1})/(1 - rho)]
         = rho [1 - (N + 1) rho^N + N rho^{N+1}] / [(1 - rho)(1 - rho^{N+1})].

(ii) Average queue length:

    E(m) = \sum_{n=1}^{N} (n - 1) P_n = E(n) - (1 - P_0).

(iii) Average waiting time, using Little's formula:

    E(v) = E(n)/lambda_1,  where lambda_1 = lambda(1 - P_N) is the mean rate of customers actually entering the system,

and E(w) = E(v) - 1/mu.

Example 18.3
At a railway station, only one train is handled at a time. The railway yard is sufficient only for two trains to wait while the other is given the signal to leave the station. Trains arrive at the station at an average rate of 6 per hour, and the railway station can handle them at an average of 12 per hour. Assuming Poisson arrivals and exponential service distribution:
(a) Find the steady-state probabilities for the various numbers of trains in the system.
(b) Also find the average waiting time of a new train coming into the yard.

Solution
lambda = 6, mu = 12, rho = 6/12 = 0.5, and the system capacity is N = 3.

(a) The probability of no train in the system (both waiting and in service) is

    P_0 = (1 - rho)/(1 - rho^{N+1}) = (1 - 0.5)/(1 - (0.5)^4) = 0.53.

We know that P_n = P_0 rho^n; thus

    P_1 = (0.53)(0.5) = 0.27,  P_2 = (0.53)(0.5)^2 = 0.13,  P_3 = (0.53)(0.5)^3 = 0.07.

(b) E(n) = 1(0.27) + 2(0.13) + 3(0.07) = 0.74.

Hence the average number of trains in the system is 0.74, and each train takes on average 1/12 (0.085) hours of service. As a newly arriving train expects to find on average 0.74 trains in the system before it,

    E(w) = (0.74)(0.085) hours = 0.0629 hours, or about 3.8 minutes.
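The finite-capacity formulas are a few lines of code; the sketch below (not from the text) reproduces Example 18.3. The exact mean 0.733 rounds to the text's 0.74, which was computed from the rounded P_n values.

```python
def mm1N_probs(lam, mu, N):
    """Steady-state probabilities of the (M|M|1):(N|FIFO) model."""
    rho = lam / mu
    P0 = 1 / (N + 1) if rho == 1 else (1 - rho) / (1 - rho**(N + 1))
    return [P0 * rho**n for n in range(N + 1)]

# Example 18.3: lambda = 6, mu = 12 trains/hour, capacity N = 3
P = mm1N_probs(6, 12, 3)
print([round(p, 2) for p in P])          # [0.53, 0.27, 0.13, 0.07]
E_n = sum(n * p for n, p in enumerate(P))
print(E_n)                               # ~0.733 trains on average
print(E_n / 12 * 60)                     # ~3.7 minutes expected wait
```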
18.7.5 Model IV (Birth-Death Process)
Assume the system to be in state E_n. The probability of a birth occurring in a small time interval Delta t is considered as lambda_n Delta t + o(Delta t), and that of a death as mu_n Delta t + o(Delta t), n >= 1. The system will be in E_n at time t + Delta t provided there is no birth and no death (or one birth and one death), or the system might have been in E_{n-1} and had a birth, or in E_{n+1} and had a death. Thus

    P_n(t + Delta t) = P_n(t)(1 - lambda_n Delta t - o(Delta t))(1 - mu_n Delta t - o(Delta t))
                     + P_{n+1}(t)(mu_{n+1} Delta t + o(Delta t))(1 - lambda_{n+1} Delta t - o(Delta t))
                     + P_{n-1}(t)(lambda_{n-1} Delta t + o(Delta t))(1 - mu_{n-1} Delta t - o(Delta t)) + o(Delta t),  n >= 1,

    P_0(t + Delta t) = P_0(t)(1 - lambda_0 Delta t - o(Delta t)) + P_1(t)(mu_1 Delta t + o(Delta t)) + o(Delta t),  n = 0.

Dividing by Delta t and taking the limit as Delta t -> 0, the differential-difference equations result:

    dP_n(t)/dt = -(lambda_n + mu_n) P_n(t) + mu_{n+1} P_{n+1}(t) + lambda_{n-1} P_{n-1}(t),  n >= 1,
    dP_0(t)/dt = -lambda_0 P_0(t) + mu_1 P_1(t).

Since in the steady state P_n(t) is independent of time, lim_{t -> infinity} P_n'(t) = 0 and the differential-difference equations reduce to

    0 = -(lambda_n + mu_n) P_n + mu_{n+1} P_{n+1} + lambda_{n-1} P_{n-1},  n >= 1,
    0 = -lambda_0 P_0 + mu_1 P_1.

Then P_1 = (lambda_0/mu_1) P_0, and

    P_2 = (lambda_1 lambda_0)/(mu_2 mu_1) P_0,  P_3 = (lambda_2 lambda_1 lambda_0)/(mu_3 mu_2 mu_1) P_0,

so that in general

    P_n = (lambda_{n-1} lambda_{n-2} ... lambda_0)/(mu_n mu_{n-1} ... mu_1) P_0 = P_0 \prod_{k=1}^{n} (lambda_{k-1}/mu_k).

By mathematical induction, one can prove that this formula is correct. Making use of the boundary condition \sum_{n=0}^{infinity} P_n = 1, we obtain

    P_0 = [1 + \sum_{n=1}^{infinity} \prod_{k=1}^{n} (lambda_{k-1}/mu_k)]^{-1}.

I. When lambda_n = lambda and mu_n = mu for all n, then P_0 = 1 - rho and

    P_n = rho^n (1 - rho),  for n >= 0  (the same as Model I).

II. When lambda_n = lambda/(n + 1) for n >= 0 and mu_n = mu for n >= 1, then

    P_0 = [1 + \sum_{n=1}^{infinity} rho^n/n!]^{-1} = [1 + rho + rho^2/2! + rho^3/3! + ...]^{-1} = e^{-rho},

thus

    P_n = (rho^n/n!) e^{-rho}  for n >= 0.

Here we can see that P_n follows the Poisson distribution with rho = lambda/mu; whether rho > 1 or rho < 1, P_0 remains finite.

III. When lambda_n = lambda for n >= 0 and mu_n = n mu for n >= 1, then

    P_0 = [1 + \sum_{n=1}^{infinity} lambda^n/(n! mu^n)]^{-1} = e^{-rho},  and  P_n = (rho^n/n!) e^{-rho}  for n >= 0.

Here the service rate increases with increase in queue length; hence this is known as the queuing problem with an infinite number of channels, (M|M|infinity) : (infinity|FIFO).

Example 18.4
Problems arrive at a computing centre in Poisson fashion at an average rate of five per day. The rules of the computing centre are that any man waiting to get his problem solved must aid the man whose problem is being solved. If the time to solve a problem with one man has an exponential distribution with mean time of 1/3 day, and if the average solving time is inversely proportional to the number of people working on the problem, approximate the expected time in the centre for a person entering the line.

Solution
lambda = 5 problems per day, mu = 3 problems per day. It is given that the service rate increases with the number of persons, so mu_n = n mu where there are n persons, and

    E(n) = \sum_{n=0}^{infinity} n P_n = \sum_{n=0}^{infinity} n (rho^n/n!) e^{-rho} = e^{-rho} rho e^{rho} = rho = 5/3 persons.

The average solving time, being inversely proportional to the number of people working on the problem, is 1/mu = 1/3 day per problem. The expected time for a person entering the line is therefore

    E(T) = E(n)/lambda = (5/3)/5 = 1/3 day, or 8 hours.

(A numerical sketch of these birth-death formulas follows the practice questions below.)

Practice Questions
1. Derive, using both methods, the probability that a gambler will be ruined given that his initial capital is Z.
2. Show that the gambler's expected gain is given as N(1 - q_Z) - Z.
3. Under what condition can the expected gain be zero?
4. Company A enters into a project deal with another company B. A's initial deposit is N5m, while B's initial deposit is N4m. For every success, A gains N1m from B; otherwise it loses the same to B. If the probability of success is 0.7, what is the probability of losing the entire deal?
5. A gambler's initial fortune is i. On each play of the game the gambler wins 1 with probability p, or loses 1 with probability 1 - p. He or she continues playing until he/she is n ahead (that is, the fortune is i + n), or loses by m. Here 0 <= i - m and i + n <= N. What is the probability that the gambler quits as a winner?
6. Given an initial capital Z, show that the expected duration of the game is

    D_Z = Z/(q - p) - (N/(q - p)) [1 - (q/p)^Z] / [1 - (q/p)^N].

7. Describe Model I of the M|M|1 queue discipline, and show that
   (a) the average number of customers in the system is given as lambda/(mu - lambda);
   (b) the average queue length is given as lambda^2/[mu(mu - lambda)].
8. In the M|M|1 system of a queuing process, show that
   (a) the steady-state probability of Model I is P_n = rho^n (1 - rho), where rho < 1 and n >= 0;
   (b) the waiting distribution is given as

       psi_w(t) = 1 - rho,  t = 0;   psi_w(t) = 1 - rho e^{-mu(1 - rho)t},  t > 0.

9. SAO Supermarket has one cashier at its counter. The service discipline of the cashier is FIFO. It is observed that the supermarket has 18 arrivals on average every 10 minutes, while the cashier can serve 12 customers in 6 minutes. If the distributions of arrivals and service times are Poisson and exponential respectively, calculate:
   (a) the traffic intensity, and interpret the figure obtained;
   (b) the average number of customers in the system;
   (c) the average queue length;
   (d) the average time a customer spends in the system;
   (e) the average time a customer waits before being served.
10. Customers arrive at an ATM where there is room for three customers to wait in line. Customers arrive alone with probability 1/2 and in pairs with probability 1/2 (but only one can be served at a time). If both cannot join, they both leave. Call a completed service or an arrival an "event", and let the state be the number of customers in the system (in service and waiting) immediately after an event. Suppose that an event is equally likely to be an arrival or a completed service.
    (a) State the transition graph and transition matrix and find the stationary distribution.
    (b) If a customer arrives, what is the probability that he finds the system empty? Full?
    (c) If the system is empty, the time until it is empty again is called a "busy period". During a busy period, what is the expected number of times that the system is full?
    (d) Show that a limit distribution is a stationary distribution.
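As noted in Example 18.4, here is the promised sketch (not from the text; it assumes numpy is not even needed, only the standard library) of the general birth-death steady state: it builds P_n from arbitrary rate sequences and reproduces the Poisson form of case III and the Example 18.4 mean.

```python
from math import exp

def birth_death_probs(lam_seq, mu_seq, n_max):
    """P_n = (lam_0 ... lam_{n-1}) / (mu_1 ... mu_n) * P_0, normalized,
    with the state space truncated at a cap n_max (an assumption)."""
    terms = [1.0]
    for n in range(1, n_max + 1):
        terms.append(terms[-1] * lam_seq(n - 1) / mu_seq(n))
    P0 = 1 / sum(terms)
    return [P0 * t for t in terms]

# Example 18.4: lambda_n = 5 per day, mu_n = 3n (service speeds up with helpers)
P = birth_death_probs(lambda n: 5.0, lambda n: 3.0 * n, 40)
rho = 5 / 3
print(sum(n * p for n, p in enumerate(P)))   # E(n) ~ rho = 5/3 persons
print(P[0], exp(-rho))                       # P_0 = e^(-rho): the Poisson form
```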
REFERENCES

Alawode, O. A. and Shittu, O. I. (2011). Probability III. Unpublished lecture notes developed for the Ibadan Distance Learning Programme.
Amahia, G. N. (2007). STA 211 - Probability II. Ibadan Distance Learning Centre Series, Distance Learning Centre, University of Ibadan.
Arnold, B. C. (1983). Pareto Distributions. International Co-operative Publishing House. ISBN 0-89974-012-X.
Attenborough, M. (2003). Mathematics for Electrical Engineering and Computing. Newnes, Elsevier, Linacre House, Jordan Hill, Oxford.
Bhat, B. R. (1985). Modern Probability Theory: An Introductory Textbook. 2nd Edition. Wiley Eastern Ltd., Bombay.
Bhat, U. N. (1971). Elements of Applied Stochastic Processes. John Wiley & Sons Ltd., London.
Brualdi, R. A. (1999). Introductory Combinatorics. Pearson Education Asia Limited and China Machine Press.
Bunneheka, B. M. S. G. and Ekanayake, G. E. M. U. P. D. (2009). A new point estimator for the median of the gamma distribution. Viyodaya Journal of Science, 14: 95-103.
Dan Musa (2010). http://probabilityandstats.wordpress.com/2010/02/18/the-matching-problem/
Encyclopaedia of Physics, 2nd Edition (1991). Lerner, R. G. and Trigg, G. L. (eds). VHC Publishers. ISBN (Verlagsgesellschaft) 3-527-26954-1; ISBN (VHC Inc.) 0-89573-752-3.
Feller, W. (1970). An Introduction to Probability Theory and its Applications. 3rd Edition. John Wiley and Sons Inc., New York.
Fisher Snedecor (1938). The true characteristic function of the F distribution. Biometrika, 69, 261-264.
Fisz, M. (1963). Probability Theory and Mathematical Statistics. John Wiley and Sons, New York.
Grimaldi, R. P. (1999). Discrete and Combinatorial Mathematics. Pearson Addison-Wesley.
Gupta, B. D. (2001). Mathematical Physics. Vikas Publishing House PVT Ltd.
Hogg, R. V. and Craig, A. T. (1970). Introduction to Mathematical Statistics. Macmillan Publishing Co., New York.
Ilori, S. A. and Ajayi, O. O. (2000). Algebra. University Mathematics Series. Y-Books (a division of Associated Book Makers Nig. Ltd.).
Johnson, N. L., Kotz, S. and Balakrishnan, N. (1994). Continuous Univariate Distributions, Vol. 1. Wiley Series in Probability and Statistics.
Krishnamoorthy, K. (2006). Handbook of Statistical Distributions with Applications. Chapman & Hall/CRC.
Lord, N. (July 2010). Binomial averages when the mean is an integer. The Mathematical Gazette, 94, 331-332.
Mandl, F. (2008). Statistical Physics. 2nd Edition. Manchester Physics Series, John Wiley & Sons. ISBN 9780471915331.
Maxwell, J. C. (1860). Illustrations of the dynamical theory of gases. Philosophical Magazine, 19, 19-32 and 20, 21-37.
Merris, R. (2003). Combinatorics. Wiley-Interscience.
Mood, A. M., Graybill, F. A. and Boes, D. C. (1963). Introduction to the Theory of Statistics. McGraw-Hill Books Coy.
Mood, A. M., Graybill, F. A. and Boes, D. C. (1974). Introduction to the Theory of Statistics. McGraw-Hill Inc.
Morters, P. and Peres, Y. (2008). Brownian Motion. Unpublished lecture notes on the web.
Nadarajah, S. and Kotz, S. (2006). The beta-exponential distribution. Reliability Engineering and System Safety, 91(1): 689-697.
Neumann, P. (1966). Uber den Median der Binomial- und Poissonverteilung. Wissenschaftliche Zeitschrift der Technischen Universitat Dresden (in German), 19: 29-33.
Odeyinka, J. A. and Oseni, B. A. (2008). Basic Tools in Statistical Theory. Highland Publishers.
Olofsson, P. (2005). Probability, Statistics and Stochastic Processes. John Wiley and Sons, U.S.A.
Olubusoye, O. E. (2000). Unpublished lecture notes on STA 311.
Rogers, L. C. G. and Williams, D. (1994). Diffusions, Markov Processes and Martingales, Vol. 1: Foundations. 2nd Edition. Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons Ltd., Chichester.
Rogers, L. C. G. and Williams, D. (2000). Diffusions, Markov Processes and Martingales, Vol. 2: Ito Calculus. Cambridge Mathematical Library, Cambridge University Press, Cambridge. Reprint of the second (1994) edition.
Rosen, K. H., Michaels, J. G., Gross, J. L., Grossman, J. W. and Shier, D. R. (2000). Handbook of Discrete and Combinatorial Mathematics. CRC Press.
Ross, S. (1988). A First Course in Probability. Macmillan Publishing Coy., New York.
Ross, S. M. (2009). Introduction to Probability and Statistics for Engineers and Scientists. 4th Edition. Associated Press, p. 267. ISBN 978-0-12-370483-2.
Roussas, G. G. (1973). A First Course in Mathematical Statistics. Addison-Wesley Publishing Company.
Shangodoyin, D. K., Olubusoye, O. E., Shittu, O. I. and Adepoju, A. A. (2002). Statistical Theory and Methods. Joytal Printing Press. ISBN 978-2906-23-9.
Shittu, O. I. (2011). Unpublished notes on Probability II.
Swarup, K., Gupta, P. K. and Mohan, M. (1978). Operations Research. 2nd Edition (reprinted). Sultan Chand and Sons, New Delhi.
Udofia, G. (1997). Lecture Series on Stochastic Processes. University of Uyo, Uyo, Nigeria. (Unpublished)
Udomboso, C. G. and Shittu, O. I. (2011). Stochastic Processes. Lecture notes developed for the Ibadan Distance Learning Programme.
Ugbebor, O. O. (2010). Unpublished lecture notes on Probability, Departments of Mathematics and Statistics, University of Ibadan.
Ugbebor, O. O. and Bassey, U. K. (2003). Mathematics for Users. University Mathematics Series. Y-Books (a division of Associated Book Makers Nig. Ltd.).
Wadsworth, G. P. (1960). Introduction to Probability and Random Variables. McGraw-Hill, New York, p. 52.
www.wikipedia.org
Young, G. A. and Smith, R. L. (2005). Essentials of Statistical Inference. Cambridge University Press.
Young, H. D. and Freedman, R. A. (2008). University Physics - With Modern Physics. 12th Edition. Addison-Wesley (Pearson International). ISBN 978-0-321-50130-1.