SystemOrganization addCategory: #Statistics! Stream subclass: #ProbabilityDistribution instanceVariableNames: '' classVariableNames: 'RandomGenerator' poolDictionaries: '' category: 'Statistics'! ProbabilityDistribution subclass: #ContinuousProbability instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !ContinuousProbability methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! distribution: aCollection "This is a slow and dirty trapezoidal integration to determine the area under the probability function curve y=density (x) for x in the specified collection. The method assumes that the collection contains numerically-ordered elements." | t aStream x1 x2 y1 y2 | t := 0.0. aStream := ReadStream on: aCollection. x2 := aStream next. y2 := self density: x2. [ x1 := x2. x2 := aStream next ] whileTrue: [ y1 := y2. y2 := self density: x2. t := t + ((x2 - x1) * (y2 + y1)) ]. ^ t * 0.5! ! !ContinuousProbability methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean self subclassResponsibility! ! !ContinuousProbability methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance self subclassResponsibility! ! ContinuousProbability subclass: #Exponential instanceVariableNames: 'mu' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Exponential class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! mean: p ^ self parameter: 1.0 / p! ! !Exponential class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! parameter: p ^ p > 0.0 ifTrue: [ self new setParameter: p ] ifFalse: [ self error: 'The probability parameter must be greater than 0.0' ]! ! !Exponential methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x ^ x > 0.0 ifTrue: [ mu * (mu * x) negated exp ] ifFalse: [ 0.0 ]! ! !Exponential methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:35'! distribution: anInterval ^ anInterval last <= 0.0 ifTrue: [ 0.0 ] ifFalse: [ 1.0 - (mu * anInterval last) negated exp - (anInterval first > 0.0 ifTrue: [ self distribution: (0.0 to: anInterval first) ] ifFalse: [ 0.0 ]) ]! ! !Exponential methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! inverseDistribution: x ^ x ln negated / mu! ! !Exponential methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean ^ 1.0 / mu! ! !Exponential methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! setParameter: p mu := p! ! !Exponential methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance ^ 1.0 / (mu * mu)! ! Exponential subclass: #Gamma instanceVariableNames: 'n' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Gamma class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! events: k mean: p | events | events := k truncated. ^ events > 0 ifTrue: [ (self parameter: events / p) setEvents: events ] ifFalse: [ self error: 'the number of events must be greater than 0' ]! ! !Gamma methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x | t | ^ x > 0.0 ifTrue: [ t := mu * x. (mu raisedTo: n) / (self gamma: n) * (x raisedTo: n - 1) * t negated exp ] ifFalse: [ 0.0 ]! ! !Gamma methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! gamma: anInteger | t | t := anInteger - 1.0. ^ self computeSample: t outOf: t! ! !Gamma methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean ^ super mean * n! ! !Gamma methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! setEvents: events n := events! ! !Gamma methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance ^ super variance * n! ! ContinuousProbability subclass: #Normal instanceVariableNames: 'mu sigma' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Normal commentStamp: '' prior: 0! How long before a success occurs or how many events occur in a certain time interval?! !Normal class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! mean: a deviation: b ^ b > 0.0 ifTrue: [ self new setMean: a standardDeviation: b ] ifFalse: [ self error: 'standard deviation must be greater than 0.0' ]! ! !Normal methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x | twoPi t | twoPi := 2 * 3.1415926536. t := (x - mu) / sigma. ^ (-0.5 * t squared) exp / (sigma * twoPi sqrt)! ! !Normal methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean ^ mu! ! !Normal methodsFor: 'random sampling' stamp: 'lr 12/27/2009 10:34'! next | v1 v2 s rand u | rand := Uniform from: -1.0 to: 1.0. [ v1 := rand next. v2 := rand next. s := v1 squared + v2 squared. s >= 1 ] whileTrue. u := (-2.0 * s ln / s) sqrt. ^ mu + (sigma * v1 * u)! ! !Normal methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! setMean: m standardDeviation: s mu := m. sigma := s! ! !Normal methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance ^ sigma squared! ! ContinuousProbability subclass: #Uniform instanceVariableNames: 'startNumber stopNumber' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Uniform class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! from: begin to: end ^ begin > end ifTrue: [ self error: 'illegal interval' ] ifFalse: [ self new setStart: begin toEnd: end ]! ! !Uniform methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x ^ (x between: startNumber and: stopNumber) ifTrue: [ 1.0 / (stopNumber - startNumber) ] ifFalse: [ 0 ]! ! !Uniform methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! inverseDistribution: x "x is a random number between 0 and 1" ^ startNumber + (x * (stopNumber - startNumber))! ! !Uniform methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean ^ (startNumber + stopNumber) / 2! ! !Uniform methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! setStart: begin toEnd: end startNumber := begin. stopNumber := end! ! !Uniform methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance ^ (stopNumber - stopNumber) squared / 12! ! ProbabilityDistribution subclass: #DiscreteProbability instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! DiscreteProbability subclass: #Bernoulli instanceVariableNames: 'prob' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Bernoulli commentStamp: '' prior: 0! Does an event occur? density function answers the probability of occurrence of one of two events! !Bernoulli class methodsFor: 'examples' stamp: 'lr 12/27/2009 10:34'! cardgame "is the first draw of a card an ace?" "does a car arrive in the next second?" "will a machine break down today?" (self parameter: 4 / 52) next! ! !Bernoulli class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! parameter: aNumber ^ (aNumber between: 0.0 and: 1.0) ifTrue: [ self new setParameter: aNumber ] ifFalse: [ self error: 'The probability must be between 0.0 and 1.0' ]! ! !Bernoulli methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x x = 1 ifTrue: [ ^ prob ]. x = 0 ifTrue: [ ^ 1.0 - prob ]. self error: ' outcomes of a Bernoulli can only be 1 or 0'! ! !Bernoulli methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! inverseDistribution: x ^ x <= prob ifTrue: [ 1 ] ifFalse: [ 0 ]! ! !Bernoulli methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean ^ prob! ! !Bernoulli methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! setParameter: aNumber prob := aNumber! ! !Bernoulli methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance ^ prob * (1.0 - prob)! ! Bernoulli subclass: #Binomial instanceVariableNames: 'n' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Binomial commentStamp: '' prior: 0! how many successes occurred in N trials? density funciton answers what is the probability that x successes will occur in the next N trials? i.e., N repeated Bernoulli trials! !Binomial class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! events: n mean: m ^ n truncated <= 0 ifTrue: [ self error: 'number of events must be > 0' ] ifFalse: [ self new events: n mean: m ]! ! !Binomial class methodsFor: 'examples' stamp: 'lr 12/27/2009 10:34'! flippingCoins1 "Did I get heads?" (Bernoulli parameter: 0.5) next! ! !Binomial class methodsFor: 'examples' stamp: 'lr 12/27/2009 10:34'! flippingCoins2 "How many heads did I get in 5 trials?" (self events: 5 mean: 2.5) next! ! !Binomial methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x ^ (x between: 0 and: n) ifTrue: [ (self computeSample: x outOf: n) / (self computeSample: x outOf: x) * (prob raisedTo: x) * (1.0 - prob raisedTo: n - x) ] ifFalse: [ 0.0 ]! ! !Binomial methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! events: nn mean: m n := nn truncated. self setParameter: m / n! ! !Binomial methodsFor: 'random sampling' stamp: 'lr 12/27/2009 10:34'! next | t | t := 0. n timesRepeat: [ t := t + super next ]. ^ t! ! Bernoulli subclass: #Geometric instanceVariableNames: '' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Geometric commentStamp: '' prior: 0! How many repeated, independent Bernoulli trials are needed before the first success is obtained? e.g., how many seconds before the next car arrives (as versus how many cars arrive in the next 20 sec as in a binomial question)! !Geometric class methodsFor: 'examples' stamp: 'lr 12/27/2009 10:34'! carsArriving | sample | "two cars arrive every minute" sample := self mean: 60 / 2. "what is the probability that it will take 30 sec before the next car arrives?" sample density: 30. "Did the next car arrive in 30 to 40 seconds?" sample distribution: (30 to: 40)! ! !Geometric class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:41'! mean: m ^ self parameter: 1 / m! ! !Geometric methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x ^ x > 0 ifTrue: [ prob * (1.0 - prob raisedTo: x - 1) ] ifFalse: [ 0.0 ]! ! !Geometric methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! inverseDistribution: x ^ (x ln / (1.0 - prob) ln) ceiling! ! !Geometric methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean ^ 1.0 / prob! ! !Geometric methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance ^ (1.0 - prob) / prob squared! ! !DiscreteProbability methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! distribution: aCollection "Answer the sum of the discrete values of the density function for each element in the collection." | t | t := 0.0. aCollection do: [ :i | t := t + (self density: i) ]. ^ t! ! DiscreteProbability subclass: #Poisson instanceVariableNames: 'mu' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !Poisson commentStamp: '' prior: 0! how many events occur in a unit time? used for sampling potential demands by customers for service The Poisson is typically the rate at which the service is provided. density function determines the probability that, in a unit interval, x events will occur. ! !Poisson class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! mean: p ^ p > 0.0 ifTrue: [ self new setMean: p ] ifFalse: [ self error: 'mean must be greater than 0.0' ]! ! !Poisson methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x ^ x >= 0 ifTrue: [ (mu raisedTo: x) * mu negated exp / x factorial ] ifFalse: [ 0.0 ]! ! !Poisson methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! mean ^ mu! ! !Poisson methodsFor: 'random sampling' stamp: 'lr 12/27/2009 10:34'! next | p n q | p := mu negated exp. n := 0. q := 1.0. [ q := q * RandomGenerator next. q >= p ] whileTrue: [ n := n + 1 ]. ^ n! ! !Poisson methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! setMean: p mu := p! ! !Poisson methodsFor: 'accessing' stamp: 'lr 12/27/2009 10:34'! variance ^ mu! ! DiscreteProbability subclass: #SampleSpace instanceVariableNames: 'data' classVariableNames: '' poolDictionaries: '' category: 'Statistics'! !SampleSpace class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! data: aCollection ^ self new setData: aCollection! ! !SampleSpace class methodsFor: 'examples' stamp: 'lr 12/27/2009 10:34'! heights | heights | heights := self data: #(60 60 60 62 62 64 64 64 64 66 66 66 68 68 68 68 68 70 70 70). "what is the probability of randomly selecting a student with height 64?" heights density: 64. "what is the probability of randomly selecting a student whose height is between 60 and 64?" heights distribution: (60 to: 64 by: 2)! ! !SampleSpace methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x "x must be in the sample space; the probability must sum over all occurrences of x in the sample space." ^ (data includes: x) ifTrue: [ (data occurrencesOf: x) / data size ] ifFalse: [ 0 ]! ! !SampleSpace methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! inverseDistribution: x ^ data at: (x * data size) truncated + 1! ! !SampleSpace methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! setData: aCollection data := aCollection! ! !ProbabilityDistribution class methodsFor: 'class initialization' stamp: 'lr 12/27/2009 10:34'! initialize "Uniformly distributed random numbers in the range [o,1]." RandomGenerator := Random new! ! !ProbabilityDistribution class methodsFor: 'instance creation' stamp: 'lr 12/27/2009 10:34'! new ^ self basicNew! ! !ProbabilityDistribution methodsFor: 'random sampling' stamp: 'lr 12/27/2009 10:34'! atEnd ^ false! ! !ProbabilityDistribution methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! computeSample: m outOf: n ^ m > n ifTrue: [ 0.0 ] ifFalse: [ n factorial / (n - m) factorial ]! ! !ProbabilityDistribution methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! density: x self subclassResponsibility! ! !ProbabilityDistribution methodsFor: 'probability functions' stamp: 'lr 12/27/2009 10:34'! distribution: aCollection self subclassResponsibility! ! !ProbabilityDistribution methodsFor: 'private' stamp: 'lr 12/27/2009 10:34'! inverseDistribution: x self subclassResponsibility! ! !ProbabilityDistribution methodsFor: 'random sampling' stamp: 'lr 12/27/2009 10:34'! next "This is a general random number generation method for any probability law; use the (0,1) uniformly distributed random varible U as the value of the law's distribution function. Obtain the next random value and then solve for the inverse. The inverse solution is defined by the subclass." ^ self inverseDistribution: RandomGenerator next! ! ProbabilityDistribution initialize!