Probability and Stochastic Processes: A Friendly Introduction for Electrical and Computer Engineers, Second Edition. Roy D. Yates and David J. Goodman.


However, our knowledge changes if we learn that it was raining an hour ago in Phoenix. This knowledge would cause us to assign a higher probability to the truth of the statement, "It is raining now in Phoenix." Both views are useful when we apply probability theory to practical problems. While the structure of the subject conforms to principles of pure logic, the terminology is not entirely abstract. The point of view is different from the one we took when we started studying physics.

There we said that if we do the same thing in the same way over and over again — send a space shuttle into orbit, for example — the result will always be the same. To predict the result, we have to take account of all relevant facts.

In this case, repetitions of the same procedure yield different results. While each outcome may be unpredictable, there are consistent patterns to be observed when we repeat the procedure a large number of times.

Understanding these patterns helps engineers establish test procedures to ensure that a factory meets quality objectives. In this repeatable procedure (making and testing a chip) with unpredictable outcomes (the quality of individual chips), the probability is a number between 0 and 1 that states the proportion of times we expect a certain thing to happen, such as the proportion of chips that pass a test.

As an introduction to probability and stochastic processes, this book serves three purposes: To exhibit the logic of the subject, we show clearly in the text three categories of theoretical material: These three axioms are the foundation on which the entire subject rests. Each theorem would be accompanied by a complete proof. While rigorous, this approach would completely fail to meet our second aim of conveying the intuition necessary to work on practical problems. To address this goal, we augment the purely mathematical material with a large number of examples of practical phenomena that can be analyzed by means of probability theory.

We also include brief quizzes that you should try to solve as you read the book. Each one will help you decide whether you have grasped the material presented just before the quiz. The problems at the end of each chapter give you more practice applying the material introduced in the chapter. Some of them take you more deeply into the subject than the examples and quizzes do. Most people who study probability have already encountered set theory and are familiar with such terms as set, element, union, intersection, and complement. For them, the following paragraphs will review material already learned and introduce the notation and terminology we use here.

A set is a collection of things. We use capital letters to denote sets.

The things that together make up the set are elements. When we use mathematical notation to refer to set elements, we usually use small letters.

Thus we can have a set A with elements x, y, and z. One way is simply to name the elements: A = {x, y, z}. In addition to set inclusion, we also have the notion of a subset, which describes a relationship between two sets.

This is the mathematical way of stating that A and B are identical if and only if every element of A is an element of B and every element of B is an element of A. The universal set is the set of all things that we could possibly consider in a given context. In any study, all set operations relate to the universal set for that study. The members of the universal set include all of the elements of all of the sets in the study.

We will use the letter S to denote the universal set. The null set, which is also important, may seem like it is not a set at all. It is customary to refer to Venn diagrams to display relationships among sets. By convention, the region enclosed by the large rectangle is the universal set S. Closed surfaces within this rectangle denote sets. There are three operations for doing this: Union and intersection combine two existing sets to produce a third set.

The complement operation forms a new set from one existing set. Another notation for intersection is AB. The complement of a set A, denoted by A^c, is the set of all elements in S that are not in A. The difference A − B is the set of elements of A that are not in B; it is a combination of intersection and complement (A − B = A ∩ B^c). In working with probability we will frequently refer to two important properties of collections of sets. A collection of sets A_1, ..., A_n is mutually exclusive if and only if A_i ∩ A_j = ∅ for every i ≠ j. A collection of sets A_1, ..., A_n is collectively exhaustive if and only if A_1 ∪ A_2 ∪ · · · ∪ A_n = S. As we see in the following theorem, this can be complicated to show.
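The set operations and the two collection properties just described can be checked mechanically. In the following sketch, the universal set S and the collection A1, A2, A3 are illustrative choices, not from the text:

```python
# Checking the two collection properties above with Python sets. The universal
# set S and the collection A1, A2, A3 are illustrative choices.
S = {1, 2, 3, 4, 5, 6}
A1, A2, A3 = {1, 2}, {3, 4}, {5, 6}

union = A1 | A2           # A1 ∪ A2
intersection = A1 & A2    # A1 ∩ A2, also written A1A2
complement = S - A1       # A1^c: the elements of S not in A1

def mutually_exclusive(sets):
    """True if every pair of distinct sets has an empty intersection."""
    sets = list(sets)
    return all(not (sets[i] & sets[j])
               for i in range(len(sets)) for j in range(i + 1, len(sets)))

def collectively_exhaustive(sets, S):
    """True if the union of the sets is the whole universal set."""
    u = set()
    for A in sets:
        u |= A
    return u == S

print(mutually_exclusive([A1, A2, A3]))          # True
print(collectively_exhaustive([A1, A2, A3], S))  # True
```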

Theorem 1. Proof There are two parts to the proof.

Quiz 1. In addition, each slice may have mushrooms (M) or onions (O) as described by the Venn diagram at right.

Probability is a number that describes a set. The higher the number, the more probability there is. In this sense probability is like a quantity that measures a physical phenomenon; for example, a weight or a temperature. However, it is not necessary to think about probability in physical terms.

Fortunately for engineers, the language of probability including the word probability itself makes us think of things that we experience. The basic model is a repeatable experiment. An experiment consists of a procedure and observations.

There is uncertainty in what will be observed; otherwise, performing the experiment would be unnecessary. Some examples of experiments include

1. Flip a coin. Did it land with heads or tails facing up?
2. Walk to a bus stop. How long do you wait for the arrival of a bus?
3. Give a lecture. How many students are seated in the fourth row?
4. Transmit one of a collection of waveforms over a channel. What waveform arrives at the receiver?
5. Transmit one of a collection of waveforms over a channel. Which waveform does the receiver identify as the transmitted waveform?

For the most part, we will analyze models of actual physical experiments. We create models because real experiments generally are too complicated to analyze. Is it rush hour? Some drivers drive faster than others. Consequently, it is necessary to study a model of the experiment that captures the important part of the actual physical experiment. Since we will focus on the model of the experiment almost exclusively, we often will use the word experiment to refer to the model of an experiment.

Example 1. Flip a coin and let it land on a table. Observe which side head or tail faces you after the coin lands. Heads and tails are equally likely. As we have said, an experiment consists of both a procedure and observations. It is important to understand that two experiments with the same procedure but with different observations are different experiments. For example, consider these two experiments: Observe the sequence of heads and tails.

Observe the number of heads. These two experiments have the same procedure: They are different experiments because they require different observations. We will describe models of experiments in terms of a set of possible experimental outcomes. In the context of probability, we give precise meaning to the word outcome. In probability terms, we call this universal set the sample space. The requirement that outcomes be mutually exclusive says that if one outcome occurs, then no other outcome also occurs.

For the set of outcomes to be collectively exhaustive, every outcome of the experiment must be in the sample space. In common speech, an event is just something that occurs. In an experiment, we may say that an event occurs when a certain phenomenon is observed.

That is, for each outcome, either the particular event occurs or it does not. Table 1. All of this may seem so simple that it is boring. A probability problem arises from some practical situation that can be modeled as an experiment. Getting this right is a big step toward solving the problem. Each subset of S is an event. An outcome x is a nonnegative real number. A short-circuit tester has a red light to indicate that there is a short circuit and a green light to indicate that there is no short circuit.

Consider an experiment consisting of a sequence of three tests. In each test the observation is the color of the light that is on at the end of a test. An outcome of the experiment is a sequence of red r and green g lights. We denote the event that light n was red or green by R n or G n. We can also denote an outcome as an intersection of events R i and G j. In Example 1. An event space and a sample space have a lot in common. The members of both are mutually exclusive and collectively exhaustive.

The members of a sample space are outcomes. By contrast, the members of an event space are events. The event space is a set of events sets , while the sample space is a set of outcomes elements. Usually, a member of an event space contains many outcomes. Consider a simple example: Examine the coins in order penny, then nickel, then dime, then quarter and observe whether each coin shows a head h or a tail t.

What is the sample space? How many elements are in the sample space? The sample space consists of 16 four-letter words, with each letter either h or t. For example, the outcome tthh refers to the penny and the nickel showing tails and the dime and quarter showing heads. There are 16 members of the sample space. Continuing Example 1. Each B i is an event containing one or more outcomes. Its members are mutually exclusive and collectively exhaustive.
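The four-coin sample space above is small enough to enumerate directly; a quick sketch:

```python
from itertools import product

# Enumerating the sample space of the four-coin experiment: one letter per
# coin (penny, nickel, dime, quarter), each either h or t.
sample_space = [''.join(w) for w in product('ht', repeat=4)]
print(len(sample_space))       # 16
print('tthh' in sample_space)  # True: the outcome described in the text
```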

The experiment in Example 1. Mathematically, however, it is equivalent to many real engineering problems. For example, observe a pair of modems transmitting four bits from one computer to another. For each bit, observe whether the receiving modem detects the bit correctly c , or makes an error e.

Or, test four integrated circuits. For each one, observe whether the circuit is acceptable a , or a reject r.

In all of these examples, the sample space contains 16 four-letter words formed with an alphabet containing two letters. The concept of an event space is useful because it allows us to express any event as a union of mutually exclusive events. We will observe in the next section that the entire theory of probability is based on unions of mutually exclusive events. The following theorem shows how to use an event space to represent an event as a union of mutually exclusive events.
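The idea of expressing an event as a union of mutually exclusive events can be illustrated for the four-coin experiment. In this sketch the event space {B0, ..., B4} groups outcomes by the number of heads, and the event A is taken, for illustration, to be "the penny shows heads":

```python
from itertools import product

# Illustrating the theorem above for the four-coin experiment. The event space
# {B0, ..., B4} groups outcomes by the number of heads; the event A ("the
# penny shows heads") is an illustrative choice. A is recovered as the union
# of the mutually exclusive pieces A ∩ Bi.
S = {''.join(w) for w in product('ht', repeat=4)}
B = {i: {s for s in S if s.count('h') == i} for i in range(5)}
A = {s for s in S if s[0] == 'h'}

pieces = [A & B[i] for i in range(5)]   # the sets A ∩ Bi
assert set().union(*pieces) == A        # their union is exactly A
print(sorted(len(p) for p in pieces))   # [0, 1, 1, 3, 3]
```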

Figure 1.

Many practical problems use the mathematical technique contained in the theorem. Classify each one as a voice call (v) if someone is speaking, or a data call (d) if the call is carrying a modem or fax signal. Your observation is a sequence of three letters (each letter is either v or d). For example, two voice calls followed by one data call corresponds to vvd. Write the elements of the following sets: This leads to a set-theory representation with a sample space (universal set S), outcomes s that are elements of S, and events A that are sets of elements.

To complete the model, we assign a probability P[A] to every event, A, in the sample space. With respect to our physical idea of the experiment, the probability of an event is the proportion of the time that event is observed in a large number of runs of the experiment. This is the relative frequency notion of probability. Mathematically, this is expressed in the following axioms. Axiom 3: For any countable collection A_1, A_2, ... of mutually exclusive events, P[A_1 ∪ A_2 ∪ · · ·] = P[A_1] + P[A_2] + · · ·. We will build our entire theory of probability on these three axioms.

Axioms 1 and 2 simply establish a probability as a number between 0 and 1. Axiom 3 states that the probability of the union of mutually exclusive events is the sum of the individual probabilities. We will use this axiom over and over in developing the theory of probability and in solving problems. In fact, it is really all we have to work with.
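The three axioms can be checked numerically for any finite model. In this sketch the outcome probabilities are illustrative choices, not from the text:

```python
from itertools import combinations

# Numerical check of the three axioms for a small model; the outcome
# probabilities below are illustrative choices, not from the text.
p = {'a': 0.2, 'b': 0.3, 'c': 0.5}   # P[{s}] for each outcome s
S = set(p)

def P(event):
    """Probability of an event (a subset of S): sum of outcome probabilities."""
    return sum(p[s] for s in event)

# Axiom 1: P[A] >= 0 for every event A.
assert all(P(A) >= 0 for r in range(len(S) + 1) for A in combinations(S, r))
# Axiom 2: P[S] = 1.
assert abs(P(S) - 1.0) < 1e-12
# Axiom 3 (two disjoint events): P[A ∪ B] = P[A] + P[B].
A, B = {'a'}, {'b', 'c'}
assert not (A & B) and abs(P(A | B) - (P(A) + P(B))) < 1e-12
print("all three axioms hold for this model")
```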

Everything else follows from Axiom 3. To use Axiom 3 to solve a practical problem, we refer to Theorem 1. A useful extension of Axiom 3 applies to the union of two disjoint events. Although it may appear that Theorem 1. In fact, a simple proof of Theorem 1. If you are curious, Problem 1. It is a simple matter to extend Theorem 1. In Chapter 7, we show that the probability measure established by the axioms corresponds to the idea of relative frequency. The correspondence refers to a sequential experiment consisting of n repetitions of the basic experiment. We refer to each repetition of the experiment as a trial. In these n trials, N_A(n) is the number of times that event A occurs.

Theorem 7. Another consequence of the axioms can be expressed as the following theorem: Proof Each outcome s_i is an event (a set with the single element s_i). Applying Theorem 1. The expression in the square brackets is an event. Within the context of one experiment, P[A] can be viewed as a function that transforms event A to a number between 0 and 1.

In these experiments we say that the n outcomes are equally likely. What is the probability of each outcome? Find the probabilities of the events: A score of 90 to 100 is an A, 80 to 89 is a B, 70 to 79 is a C, 60 to 69 is a D, and below 60 is a failing grade of F.
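For equally likely outcomes, each of the n outcomes has probability 1/n, so an event containing k outcomes has probability k/n. A minimal sketch with a fair six-sided die (the die is an illustrative choice):

```python
from fractions import Fraction

# Equally likely outcomes: each of the n outcomes has probability 1/n, so an
# event containing k outcomes has probability k/n. Illustration: a fair die.
outcomes = set(range(1, 7))
P_outcome = Fraction(1, len(outcomes))   # 1/6 for each outcome
even = {2, 4, 6}
P_even = len(even) * P_outcome           # P[even] = 3/6
print(P_even)   # 1/2
```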

While we do not supply the proofs, we suggest that students prove at least some of these theorems in order to gain experience working with the axioms. The following useful theorem refers to an event space B_1, B_2, .... Proof The proof follows directly from Theorem 1. In this table, the rows and columns each represent an event space.

This method is shown in the following example. It also observes whether calls carry voice v , data d , or fax f. This model implies an experiment in which the procedure is to monitor a call and the observation consists of the type of call, v, d, or f , and the length, l or b.

In this problem, each call is classified in two ways: by length (L or B) and by type (V, D, or F). The sample space can be represented by a table in which the rows and columns are labeled by events and the intersection of each row and column event contains a single outcome.

The corresponding table entry is the probability of that outcome. In this case, the table has columns labeled V, D, F and rows labeled L and B. Thus we can apply Theorem 1. Classify the call as a voice call V if someone is speaking, or a data call D if the call is carrying a modem or fax signal. Classify the call as long L if the call lasts for more than three minutes; otherwise classify the call as brief B.

Based on data collected by the telephone company, we use the following probability model: Find the following probabilities: Sometimes, we refer to P[A] as the a priori probability, or the prior probability, of A. Rather than the outcome s_i itself, we obtain information that the outcome is in some set B. That is, we learn that some event B has occurred, where B consists of several outcomes. The notation for this new probability is P[A|B]. The circuits come from a high-quality production line.

Therefore the prior probability P[A] is very low. In advance, we are pretty certain that the second circuit will be accepted. However, some wafers become contaminated by dust, and these wafers have a high proportion of defective chips. In this case, it is illogical to speak of the probability of A given that B occurs. Note that P[A|B] is a respectable probability measure relative to a sample space that consists of all the outcomes in B.

This means that P[A|B] has properties corresponding to the three axioms of probability. Axiom 1: P[A|B] ≥ 0. Axiom 2: P[B|B] = 1. Axiom 3: for any countable collection of mutually exclusive events A_1, A_2, ..., P[A_1 ∪ A_2 ∪ · · · |B] = P[A_1|B] + P[A_2|B] + · · ·. We saw in Example 1. What is the conditional probability that the bottom card is the ace of clubs given that the bottom card is a black card?

The sample space consists of the 52 cards that can appear on the bottom of the deck. Let A denote the event that the bottom card is the ace of clubs. Let B be the event that the bottom card is a black card. Since 26 of the 52 cards are black and the ace of clubs is one of them, P[A|B] = 1/26. Let X_1 and X_2 denote the number of dots that appear on die 1 and die 2, respectively.

What is P[A]? What is P[B]? What is P[A|B]? We begin by observing that the sample space has 16 elements corresponding to the four possible values of X_1 and the same four values of X_2. We draw the sample space as a set of black circles in a two-dimensional diagram, in which the axes represent the events X_1 and X_2. Each outcome is a pair of values (X_1, X_2). The rectangle represents A. The triangle represents B. It contains six outcomes.

Law of Total Probability
In many situations, we begin with information about conditional probabilities.
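The conditional probability for the two four-sided dice can be computed by counting outcomes. The specific events below are assumptions for illustration, since the text's definitions of A and B are not fully reproduced: A = {X1 ≥ 2}, a rectangle in the diagram, and B = {X2 > X1}, a triangle containing six outcomes.

```python
from fractions import Fraction
from itertools import product

# Conditional probability by counting, for two fair four-sided dice. The 16
# outcomes (X1, X2) are equally likely. The events A and B are illustrative
# assumptions: A = {X1 >= 2} (a rectangle), B = {X2 > X1} (a triangle with
# six outcomes).
S = list(product(range(1, 5), repeat=2))
A = {s for s in S if s[0] >= 2}
B = {s for s in S if s[1] > s[0]}

P = lambda E: Fraction(len(E), len(S))
P_A_given_B = Fraction(len(A & B), len(B))   # P[A|B] = P[AB]/P[B]
print(P(A), P(B), P_A_given_B)               # 3/4 3/8 1/2
```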

Using these conditional probabilities, we would like to calculate unconditional probabilities. The law of total probability shows us how to do this.

Proof This follows from Theorem 1. The usefulness of the result can be seen in the next example. Each hour, machine B_1 produces resistors, B_2 produces resistors, and B_3 produces resistors. All of the resistors are mixed together at random in one bin and packed for shipment. What is the probability that the company ships a resistor that is within 50 Ω of the nominal value?
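The law of total probability can be sketched for the resistor example. The production shares P[Bi] and per-machine acceptance probabilities P[A|Bi] below are assumed numbers for illustration, since the text's values are truncated:

```python
# Law of total probability for the resistor example. The production shares
# P[Bi] and acceptance probabilities P[A|Bi] below are assumed values for
# illustration, since the text's numbers are truncated.
prior  = {'B1': 0.3, 'B2': 0.4, 'B3': 0.3}   # P[Bi]: share of output
accept = {'B1': 0.8, 'B2': 0.9, 'B3': 0.6}   # P[A|Bi]

# P[A] = sum over the event space of P[A|Bi] P[Bi]
P_A = sum(accept[b] * prior[b] for b in prior)
print(P_A)
```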

To do so we have the following formula: P[B|A] = P[A|B]P[B]/P[A]. This is Bayes' theorem; it has a name because it is extremely useful for making inferences about phenomena that cannot be observed directly. For each possible state, B_i, we know the prior probability P[B_i] and P[A|B_i], the probability that an event A occurs (the resistor meets a quality criterion) if B_i is the actual state. Now we observe the actual event (either the resistor passes or fails a test), and we ask about the thing we are interested in (the machines that might have produced the resistor).

In performing the calculations, we use the law of total probability to calculate the denominator in Theorem 1. What is the probability that an acceptable resistor comes from machine B_3? Your observation is a sequence of three letters (each one is either v or d). For example, three voice calls corresponds to vvv.
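Bayes' theorem then answers the resistor question: given an acceptable resistor (event A), which machine produced it? The numbers below are the same assumed illustration values, not from the text:

```python
# Bayes' theorem for the resistor question: given that a resistor is
# acceptable (event A), which machine produced it? The numbers are assumed
# illustration values, since the text's values are truncated.
prior  = {'B1': 0.3, 'B2': 0.4, 'B3': 0.3}   # P[Bi]
accept = {'B1': 0.8, 'B2': 0.9, 'B3': 0.6}   # P[A|Bi]

P_A = sum(accept[b] * prior[b] for b in prior)               # total probability
posterior = {b: accept[b] * prior[b] / P_A for b in prior}   # P[Bi|A]
assert abs(sum(posterior.values()) - 1.0) < 1e-12
print(round(posterior['B3'], 4))
```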

The outcomes vvv and ddd have probability 0. Count the number of voice calls N V in the three calls you have observed. Describe in words and also calculate the following probabilities: P[ A] describes our prior knowledge before the experiment is performed that the outcome is included in event A.

The fact that the outcome is in B is partial information about the experiment.

It is in this sense that the events are independent. Problem 1. The logic behind this conclusion is that if learning that event B occurs does not alter the probability of event A, then learning that B does not occur also should not alter the probability of A. Keep in mind that independent and disjoint are not synonyms. In some contexts these words can have similar meanings, but this is not the case in probability.

In most situations independent events are not disjoint! When we have to calculate probabilities, knowledge that events A and B are disjoint is very helpful. Axiom 3 enables us to add their probabilities to obtain the probability of the union. Knowledge that events C and D are independent is also very useful. Are the events R_2 that the second light was red and G_2 that the second light was green independent?

Are the events R_1 and R_2 independent? That is, R_2 and G_2 must be disjoint because the second light cannot be both red and green. Learning whether or not the event G_2 (second light green) occurs drastically affects our knowledge of whether or not the event R_2 occurs. In this example we have analyzed a probability model to determine whether two events are independent.

In many practical applications we reason in the opposite direction. We then use this knowledge to build a probability model for the experiment. A mechanical test determines whether pins have the correct spacing, and an electrical test checks the relationship of outputs to inputs.

We assume that electrical failures and mechanical failures occur independently. Our information about circuit production tells us that mechanical failures occur with probability 0. What is the probability model of an experiment that consists of testing an integrated circuit and observing the results of the mechanical and electrical tests?

To build the probability model, we note that the sample space contains four outcomes: Let M and E denote the events that the mechanical and electrical tests are acceptable.
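A probability model built from independence can be sketched as follows. The two test probabilities are assumptions for illustration, since the text's failure probabilities are truncated:

```python
# Joint probability model from independence. M = mechanical test acceptable,
# E = electrical test acceptable; the two probabilities below are assumed
# values, since the text's failure probabilities are truncated. Independence
# gives P[M ∩ E] = P[M]P[E], and similarly for the other three outcomes.
p_M, p_E = 0.95, 0.80
joint = {
    ('M', 'E'):   p_M * p_E,
    ('M', 'Ec'):  p_M * (1 - p_E),
    ('Mc', 'E'):  (1 - p_M) * p_E,
    ('Mc', 'Ec'): (1 - p_M) * (1 - p_E),
}
assert abs(sum(joint.values()) - 1.0) < 1e-12  # the four outcomes exhaust S
print(joint[('M', 'E')])   # probability both tests are acceptable
```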

Often we consider larger sets of independent events. For more than two events to be independent, the probability model has to meet a set of conditions. On the other hand, if we know that a set is independent, it is a simple matter to determine the probability of the intersection of any subset of the events.

Just multiply the probabilities of the events in the subset. Your observation is a sequence of two letters either v or d. For example, two voice calls corresponds to vv. The two calls are independent and the probability that any one of them is a voice call is 0. Denote the identity of call i by C i. Count the number of voice calls in the two calls you have observed.

N V is the number of voice calls. Determine whether the following pairs of events are independent: The procedure followed for each subexperiment may depend on the results of the previous subexperiments. To do so, we assemble the outcomes of each subexperiment into sets in an 1.

Each branch leads to a node. The labels of the branches of the second subexperiment are the conditional probabilities of the events in the second subexperiment.

We continue the procedure taking the remaining subexperiments in order. Each leaf corresponds to an outcome of the entire sequential experiment. The probability of each outcome is the product of the probabilities and conditional probabilities on the path from the root to the leaf. We usually label each leaf with a name for the event and the probability of the event.

The experiment of testing a resistor can be viewed as a two-step procedure. First we identify which machine B 1 , B 2 , or B 3 produced the resistor. Sketch a sequential tree for this experiment. What is the probability of choosing a resistor from machine B 2 that is not acceptable? This two-step procedure corresponds to the tree shown in Figure 1.
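The two-step resistor tree can be sketched with nested branch probabilities (numbers assumed for illustration, since the text's values are truncated); each leaf probability is the product of the branch probabilities on the path from the root:

```python
# Tree sketch for the two-step resistor experiment: first the machine, then
# whether the resistor is acceptable (A) or not (N). All numbers are assumed
# illustration values. Each leaf probability is the product of the branch
# probabilities on the path from the root.
tree = {
    'B1': (0.3, {'A': 0.8, 'N': 0.2}),
    'B2': (0.4, {'A': 0.9, 'N': 0.1}),
    'B3': (0.3, {'A': 0.6, 'N': 0.4}),
}
leaves = {(b, x): p_b * p_x
          for b, (p_b, branches) in tree.items()
          for x, p_x in branches.items()}
assert abs(sum(leaves.values()) - 1.0) < 1e-12   # leaf probabilities add to 1
print(leaves[('B2', 'N')])   # P[B2 N] = P[B2] P[N|B2]
```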

We observe in this example a general property of all tree diagrams that represent sequential experiments. The probabilities on the branches leaving any node add up to 1. This is a consequence of the law of total probability and the property of conditional probabilities. (Footnote: Unlike biological trees, which grow from the ground up, probability trees usually grow from left to right. Some of them have their roots on top and leaves on the bottom.) Moreover, Axiom 2 implies that the probabilities of all of the leaves add up to 1. In particular, the timing was designed so that with probability 0. Also, what is P[W], the probability that you wait for at least one light? With the ace worth 1 point, you draw cards until your total is 3 or more. You win if your total is 3. What is P[W], the probability that you win? Let C_i denote the event that card C is the i-th card drawn.

For example, 3_2 is the event that the 3 was the second card drawn. Coin 1 is biased. Let C_i denote the event that coin i is picked. Given that the outcome is a tail, what is the probability P[C_1|T] that you picked the biased coin? First, we construct the sample tree. Consequently, the system will page a phone up to three times before giving up.

If a single paging attempt succeeds with probability 0. What is the probability that we draw no queens? In theory, we can draw a sample space tree for the seven cards drawn. However, the resulting tree is so large that this is impractical. In fact, you may wonder if 134 million is even approximately the number of such combinations. To solve this problem, we need to develop procedures that permit us to count how many seven-card combinations there are and how many of them do not have a queen.
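The seven-card counts can be computed directly with binomial coefficients; this sketch also gives the probability of drawing no queens (all seven cards must come from the 48 non-queens):

```python
from math import comb

# Counting for the seven-card question: the number of 7-card combinations
# from a 52-card deck, and the probability that none of the 7 cards is a
# queen (the 7 cards must all come from the 48 non-queens).
total = comb(52, 7)        # 133784560, roughly 134 million
no_queens = comb(48, 7)
print(total, no_queens / total)
```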

The results we will derive all follow from the fundamental principle of counting. This principle is easily demonstrated by a few examples. Generally, if an experiment E has k subexperiments E_1, ..., E_k, where E_i has n_i outcomes, then E has n_1 n_2 · · · n_k outcomes. The outcome of the experiment is an ordered sequence of the 52 cards of the deck. How many possible outcomes are there? The procedure consists of 52 subexperiments.

In each one the observation is the identity of one card. How many outcomes are there? In general, an ordered sequence of k distinguishable objects is called a k-permutation. We will use the notation (n)_k to denote the number of possible k-permutations of n distinguishable objects; (n)_k = n(n − 1) · · · (n − k + 1) = n!/(n − k)!.
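The number of k-permutations, n!/(n − k)!, is available directly as math.perm; a quick check against the card-deck examples above:

```python
from math import factorial, perm

# The number of k-permutations of n distinguishable objects:
# (n)_k = n(n-1)...(n-k+1) = n!/(n-k)!; math.perm computes it directly.
n, k = 52, 7
assert perm(n, k) == factorial(n) // factorial(n - k)
print(perm(n, k))                      # ordered 7-card sequences from a deck
print(perm(52, 52) == factorial(52))   # True: orderings of the whole deck
```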

Sampling without Replacement

Choosing objects from a collection is also called sampling, and the chosen objects are known as a sample. In particular, once we choose an object for a k-permutation, we remove the object from the collection and we cannot choose it again. Consequently, this is also called sampling without replacement. When an object can be chosen repeatedly, we have sampling with replacement, which we examine in the next subsection.

When we choose a k-permutation, different outcomes are distinguished by the order in which we choose objects. However, in many practical problems, the order in which the objects are chosen makes no difference. For example, in many card games, only the set of cards received by a player is of interest.

The order in which they arrive is irrelevant. What we are doing is picking a subset of the collection of objects. Each subset is called a k-combination. To count the k-combinations, consider a two-step procedure: choose a k-combination out of the n objects, then choose a k-permutation of the k objects in the k-combination. By Theorem 1. For each choice of starting lineup, the manager must submit to the umpire a batting order for the 9 starters.
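The number of k-combinations of n objects is the binomial coefficient n!/(k!(n - k)!). As a sketch, assuming a seven-card hand as in the earlier discussion, the no-queen probability is the ratio of the number of seven-card hands drawn entirely from the 48 non-queens to the number of all seven-card hands:

```python
from math import comb

# Seven-card hands are unordered, so we count k-combinations with comb(n, k).
total_hands = comb(52, 7)     # all seven-card hands
no_queen_hands = comb(48, 7)  # hands drawn entirely from the 48 non-queens

p_no_queens = no_queen_hands / total_hands
print(total_hands)            # 133784560
print(round(p_no_queens, 4))  # 0.5504
```

Note how the count of roughly 134 million hands makes the tree-drawing approach impractical, while the combination formula settles the question in one line.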

What is the probability of getting a hand without any queens? Sampling with Replacement Now we address sampling with replacement. In this case, each object can be chosen repeatedly because a selected object is replaced by a duplicate. Let xy denote the outcome that card type x is used in slot A and card type y is used in slot B.

The fact that Example 1. Since we were sampling with replacement, there were always three possible outcomes for each of the subexperiments to choose a PCMCIA card. Hence, by the fundamental theorem of counting, Example 1. This result generalizes naturally when we want to choose with replacement a sample of n objects out of a collection of m distinguishable objects.

The experiment consists of a sequence of n identical subexperiments. Sampling with replacement ensures that in each subexperiment, there are m possible outcomes. Hence there are m^n ways to choose with replacement a sample of n objects. Sampling with replacement also arises when we perform n repetitions of an identical subexperiment.
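A minimal sketch of this count:

```python
def with_replacement_count(m, n):
    """Number of length-n outcome sequences when each of n identical
    subexperiments has m possible outcomes (sampling with replacement)."""
    return m ** n

# A pass/fail test (m = 2 outcomes) repeated on four microprocessors:
print(with_replacement_count(2, 4))  # 16
```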

Each subexperiment has the same sample space S. Using x_i to denote the outcome of the ith subexperiment, the result for n repetitions of the subexperiment is a sequence x_1, ..., x_n. Each microprocessor is tested to determine whether it runs reliably at an acceptable clock speed. In testing four microprocessors, the observation sequence x_1 x_2 x_3 x_4 is one of 16 possible outcomes. Note that we can think of the observation sequence x_1, ..., x_4 as a single outcome of the combined experiment. For sequences of identical subexperiments, we can formulate the following restatement of Theorem 1.

A grade of s_j indicates that the microprocessor will function reliably at a maximum clock rate of s_j megahertz (MHz). In testing 10 microprocessors, we use x_i to denote the grade of the ith microprocessor tested.

A more challenging problem is to calculate the number of observation sequences such that each subexperiment outcome appears a certain number of times. There are exactly 10 such words. Writing down all 10 sequences of Example 1. Each sequence is uniquely determined by the placement of the ones. The details can be found in the proof of the following theorem, in which subexperiment i of the counting procedure labels n_i slots with the outcome s_i. We start with a simple subexperiment in which there are two outcomes: a success and a failure. The results of all trials of the subexperiment are mutually independent.
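A short enumeration confirms the count of 10: the length-5 binary words with exactly two ones are counted by the binomial coefficient C(5, 2).

```python
from itertools import product
from math import comb

# Enumerate every length-5 binary word and keep those with exactly two ones.
words = [w for w in product('01', repeat=5) if w.count('1') == 2]
print(len(words))  # 10
# Each word is fixed by where its ones go, so the binomial coefficient
# counts them directly.
print(comb(5, 2))  # 10
```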

An outcome of the complete experiment is a sequence of successes and failures denoted by a sequence of ones and zeroes. From Theorem 1. The second formula in this theorem is the result of multiplying the probability of n_0 failures in n trials by the number of outcomes with n_0 failures. If we randomly test resistors, what is the probability of T_i, the event that i resistors test acceptable?

Testing each resistor is an independent trial with a success occurring when a resistor is acceptable. This shows that although we might expect the number acceptable to be close to 78, that does not mean that the probability of exactly 78 acceptable is high.
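A sketch of the binomial calculation, with assumed parameters: 100 resistors, each acceptable with probability 0.78 (these numbers are illustrative, chosen to match the "close to 78" remark, since the text's values are elided here).

```python
from math import comb

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumed parameters for illustration: 100 resistors, each acceptable
# with probability 0.78.
p78 = binomial_pmf(100, 78, 0.78)
print(p78)  # roughly 0.096: even the most likely single count is unlikely
```

This makes the point in the text concrete: the expected count is near 78, yet P[T_78] itself is small because the probability mass is spread over many nearby counts.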

The receiver detects the correct information if three or more binary symbols are received correctly. On each trial, a success occurs when a binary symbol is received correctly. The error event E occurs when the number of successes is strictly less than three. Now suppose we perform n independent repetitions of a subexperiment for which there are m possible outcomes for any subexperiment. That is, the sample space for each subexperiment is {s_0, ..., s_{m-1}}. An outcome of the experiment consists of a sequence of n subexperiment outcomes.

In the probability tree of the experiment, each node has m branches and branch i has probability p_i. The probability of an experimental outcome is just the product of the branch probabilities encountered on a path from the root of the tree to the leaf representing the outcome. For example, the experimental outcome s_2 s_0 s_3 s_2 s_4 occurs with probability p_2 p_0 p_3 p_2 p_4. Let S_{v,f,m} denote the event that we observe v voice calls, f fax calls, and m modem calls out of the calls observed.
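A sketch of the branch-probability product, with assumed branch probabilities (any five nonnegative values summing to 1 would do; these are not from the text):

```python
from math import prod

# Assumed branch probabilities for a subexperiment with five outcomes
# s0, ..., s4; they must sum to 1.
p = [0.1, 0.2, 0.3, 0.25, 0.15]

# The outcome s2 s0 s3 s2 s4 from the text occurs with probability
# p2 * p0 * p3 * p2 * p4: the product of branch probabilities on its path.
sequence = [2, 0, 3, 2, 4]
prob = prod(p[i] for i in sequence)
print(prob)
```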

Let S_{25,25,25,25} denote the event that we observe exactly 25 microprocessors of each grade. The packet has been coded in such a way that if three or fewer bits are received in error, then those bits can be corrected.

If more than three bits are received in error, then the packet is decoded with errors. What is P[C]? The operation consists of n components and each component succeeds with probability p, independent of any other component.

Let W_i denote the event that component i succeeds. As depicted in Figure 1. The operation succeeds if all of its components succeed. Therefore, the complete operation fails if any component program fails. Whenever the operation consists of k components in series, we need all k components to succeed in order to have a successful operation.

The operation succeeds if any component works. This operation occurs when we introduce redundancy to promote reliability. In a redundant system, such as a space shuttle, there are n computers on board so that the shuttle can continue to function as long as at least one computer operates successfully. Draw a diagram of the operation 1. On the left is the original operation. On the right is the equivalent operation with each pair of series components replaced with an equivalent component.

A diagram of the operation is shown in Figure 1. The entire operation then consists of W_5 and W_6 in parallel, which is also shown in Figure 1. Similarly, when the components are in parallel, calculating directly the probability that the device succeeds is hard. It is easier to calculate the probability that every component fails, which is the complement of the event that the device succeeds. This is how we calculated the probability that the parallel device works.
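The two rules can be sketched directly: a series operation succeeds with the product of the component success probabilities, and a parallel operation fails only if every component fails.

```python
def series_success(probs):
    """A series operation succeeds only if every component succeeds."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def parallel_success(probs):
    """A parallel operation succeeds unless every component fails."""
    all_fail = 1.0
    for p in probs:
        all_fail *= (1 - p)
    return 1 - all_fail

# Illustrative component success probability (assumed, not from the text):
p = 0.9
print(series_success([p, p]))    # ~0.81: series needs both to work
print(parallel_success([p, p]))  # ~0.99: parallel needs only one
```

Larger series/parallel networks reduce step by step the same way the text describes: replace each series or parallel group with a single equivalent component and repeat.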

The device is designed with redundancy so that it works even if one of its chips is defective. Each chip contains n transistors and functions properly if all of its transistors work. A transistor works with probability p independent of any other transistor. What is the probability P[C] that a chip works? What is the probability P[M] that the memory module works? You can use this text to learn probability without Matlab.

Nevertheless, Matlab provides a convenient programming environment for solving probability problems and for building models of probabilistic systems. Each problem also has a label that reflects our estimate of its degree of difficulty. Skiers will recognize the following symbols. Every ski area emphasizes that these designations are relative to the trails at that area. Similarly, the difficulty of our problems is relative to the other problems in this text.

Libraries and bookstores contain an endless collection of textbooks at all levels covering the topics presented in this textbook. We know of two in comic book format [GS93, Pos01].

The reference list on page is a brief sampling of books that can add breadth or depth to the material in this text. Most books on probability, statistics, stochastic processes, and random signal processing contain expositions of the basic principles of probability and random variables, covered in Chapters 1—4.

In advanced texts, these expositions serve mainly to establish notation for more specialized topics. It presents probability as a branch of number theory. It presents the concepts of probability from a historical perspective, focusing on the lives and contributions of mathematicians and others who stimulated major advances in probability and statistics and their application in various fields including psychology, economics, government policy, and risk management.

The summaries at the end of Chapters 5–12 refer to books that supplement the specialized material in those chapters. We are grateful for assistance and suggestions from many sources including our students at Rutgers and Polytechnic Universities, instructors who adopted the first edition, reviewers, and the Wiley team. At Wiley, we are pleased to acknowledge the continuous encouragement and enthusiasm of our executive editor, Bill Zobrist, and the highly skilled support of marketing manager Jennifer Powers, Senior Production Editor Ken Santor, and Cover Designer Dawn Stanley.

Unique among our teaching assistants, Dave Famolari took the course as an undergraduate. Later as a teaching assistant, he did an excellent job writing homework solutions with a tutorial flavor. The first edition also benefited from reviews and suggestions conveyed to the publisher by D. Finally, we acknowledge with respect and gratitude the inspiration and guidance of our teachers and mentors who conveyed to us when we were students the importance and elegance of probability theory.

A lot of students find it hard to do well in this course. We think there are a few reasons for this difficulty. One reason is that some people find the concepts hard to use and understand. Many of them are successful in other courses but find the ideas of probability difficult to grasp.

Usually these students recognize that learning probability theory is a struggle, and most of them work hard enough to do well. However, they find themselves putting in more effort than in other courses to achieve similar results.

Other people have the opposite problem. The work looks easy to them, and they under- stand everything they hear in class and read in the book.

There are good reasons for assuming the subject is easy. There are very few basic concepts to absorb. The terminology, like the word probability itself, in most cases consists of familiar words. With a few exceptions, the mathematical manipulations are not complex. You can go a long way solving problems with a four-function calculator. For many people, this apparent simplicity is dangerously misleading because it is very tricky to apply the math to specific problems. A few of you will see things clearly enough to do everything right the first time.

However, most people who do well in probability need to practice with a lot of examples to get comfortable with the work and to really understand what the subject is about. Most of the work in this course is that way, and the only way to do well is to practice a lot. Taking the midterm and final is similar to running a five-mile race.

Most people can do it in a respectable time, provided they train for it.

So, our advice to students is, if this looks really weird to you, keep working at it. You will probably catch on. It may be harder than you think. The theoretical material covered in this book has helped both of us devise new communication techniques and improve the operation of practical systems.

We hope you find the subject intrinsically interesting. If you master the basic ideas, you will have many opportunities to apply them in other courses and throughout your career. We have worked hard to produce a text that will be useful to a large population of students and instructors. We welcome comments, criticism, and suggestions. Feel free to send us e-mail at ryates winlab. In addition, the Website, http: Now you can begin.

The title of this book is Probability and Stochastic Processes. We say and hear and read the word probability and its relatives possible, probable, probably in many contexts. Within the realm of applied mathematics, the meaning of probability is a question that has occupied mathematicians, philosophers, scientists, and social scientists for hundreds of years.

Everyone accepts that the probability of an event is a number between 0 and 1. Some people interpret probability as a physical property like mass or volume or temperature that can be measured. This is tempting when we talk about the probability that a coin flip will come up heads. This probability is closely related to the nature of the coin. Fiddling around with the coin can alter the probability of heads.

Another interpretation of probability relates to the knowledge that we have about some- thing. We might assign a low probability to the truth of the statement, It is raining now in Phoenix, Arizona , because we know that Phoenix is in the desert. However, our knowledge changes if we learn that it was raining an hour ago in Phoenix. This knowledge would cause us to assign a higher probability to the truth of the statement, It is raining now in Phoenix.

Both views are useful when we apply probability theory to practical problems. Whichever view we take, we will rely on the abstract mathematics of probability, which consists of definitions, axioms, and inferences theorems that follow from the axioms.

While the structure of the subject conforms to principles of pure logic, the terminology is not entirely abstract. Instead, it reflects the practical origins of probability theory, which was developed to describe phenomena that cannot be predicted with certainty. The point of view is different from the one we took when we started studying physics.

There we said that if we do the same thing in the same way over and over again — send a space shuttle into orbit, for example — the result will always be the same.

To predict the result, we have to take account of all relevant facts. In this case, repetitions of the same procedure yield different results.

While each outcome may be unpredictable, there are consistent patterns to be observed when we repeat the procedure a large number of times. Understanding these patterns helps engineers establish test procedures to ensure that a factory meets quality objectives.

In this repeatable procedure making and testing a chip with unpredictable outcomes the quality of individual chips , the probability is a number between 0 and 1 that states the proportion of times we expect a certain thing to happen, such as the proportion of chips that pass a test.

To exhibit the logic of the subject, we show clearly in the text three categories of theoretical material: definitions, axioms, and theorems. Definitions establish the logic of probability theory, while axioms are facts that we accept without proof.

Theorems are consequences that follow logically from definitions and axioms. Each theorem has a proof that refers to definitions, axioms, and other theorems. Although there are dozens of definitions and theorems, there are only three axioms of probability theory. These three axioms are the foundation on which the entire subject rests. To meet our goal of presenting the logic of the subject, we could set out the material as dozens of definitions followed by three axioms followed by dozens of theorems.

Each theorem would be accompanied by a complete proof. While rigorous, this approach would completely fail to meet our second aim of conveying the intuition necessary to work on practical problems. To address this goal, we augment the purely mathematical material with a large number of examples of practical phenomena that can be analyzed by means of probability theory.

We also interleave definitions and theorems, presenting some theorems with complete proofs, others with partial proofs, and omitting some proofs altogether. We find that most engineering students study probability with the aim of using it to solve practical problems, and we cater mostly to this goal.

We also encourage students to take an interest in the logic of the subject — it is very elegant — and we feel that the material presented will be sufficient to enable these students to fill in the gaps we have left in the proofs.

Therefore, as you read this book you will find a progression of definitions, axioms, theorems, more definitions, and more theorems, all interleaved with examples and comments designed to contribute to your understanding of the theory. We also include brief quizzes that you should try to solve as you read the book.

Each one will help you decide whether you have grasped the material presented just before the quiz. The problems at the end of each chapter give you more practice applying the material introduced in the chapter. They vary considerably in their level of difficulty.

Some of them take you more deeply into the subject than the examples and quizzes do. The mathematical basis of probability is the theory of sets. Most people who study proba- bility have already encountered set theory and are familiar with such terms as set, element,. For them, the following paragraphs will review ma- terial already learned and introduce the notation and terminology we use here.

For people who have no prior acquaintance with sets, this material introduces basic definitions and the properties of sets that are important in the study of probability.

A set is a collection of things. We use capital letters to denote sets.