UNIVERSITY OF PERNAMBUCO
BAYESIAN BIOSTATISTICS
INSTRUCTORS:
LUIS E. NIETO-BARAJAS & LYN PIZANO-RODRIGUEZ
EMAILS: lnieto@itam.mx & lpizano@yahoo.com
URL: http://allman.rhon.itam.mx/~lnieto
BAYESIAN BIOSTATISTICS, BY: LUIS E. NIETO & LYN PIZANO
II Encontro de Pós-Graduação e Pesquisa da Universidade de Pernambuco, Recife-PE, 8-10 de Outubro de 2008
BAYESIAN BIOSTATISTICS
Ø DEFINITIONS:
o Biostatistics (Wikipedia).
Is the application of statistics to a wide range of topics in biology. The
science of biostatistics encompasses the design of biological experiments,
especially in medicine and agriculture; the collection, summarization, and
analysis of data from those experiments; and the interpretation of, and
inference from, the results.
o Bayesian inference (Wikipedia).
Is statistical inference in which evidence or observations are used to update
or to newly infer the probability that a hypothesis may be true.
Ø OUTLINE:
1. Introduction
2. Exploratory Data Analysis
3. Probability Theory
4. Decision Theory
5. Bayesian inference
6. Priors
7. Clinical trial design
8. Hierarchical models
Appendix
Ø REFERENCES:
o Spiegelhalter, D. J., Abrams, K. R. and Myles, J. P. (2004). Bayesian
Approaches to Clinical Trials and Health-Care Evaluation. Wiley:
Chichester.
o Bernardo, J. M. (1981). Bioestadística: Una perspectiva Bayesiana. Vicens
Vives: Barcelona. (http://www.uv.es/bernardo/Bioestadistica.pdf)
o Bernardo, J. M. and Smith, A. F. M. (2000). Bayesian Theory. Wiley: New
York.
Ø SOFTWARE:
To understand the Bayesian concepts:
1) First Bayes (http://www.shef.ac.uk/~st1ao/1b.html)
To make statistical analyses:
2) WinBUGS (http://www.mrc-bsu.cam.ac.uk/bugs/)
3) Splus
4) R (http://www.r-project.org/)
1. Introduction
Ø The OBJECTIVE of Statistics, and in particular of Bayesian Statistics, is to
provide a methodology to adequately analyze the available information
(data analysis or descriptive statistics) and to decide in a reasonable way
the best way to proceed (decision theory or inferential statistics).
Ø DIAGRAM of Statistics:
Ø INFERENCE: The process of learning population characteristics through a
subset of the population called a sample. There are different ways to make
inference:
Assumption \ Approach Classic Bayesian
Parametric √√√ √√
Non parametric √√ √
[Diagram of Statistics: Population → Sample (Sampling); Sample → Population (Inference); stages: Data analysis, Decision making]
Ø Some basic definitions:
o Element or individual: Object (person, item, animal, plant, etc.) whose
properties are to be analyzed.
o Population: A collection of individuals or objects.
o Sample: A subset of the population.
o Parameter: A numerical value summarizing all the data in an entire
population.
o Statistic: A numerical value summarizing the sample data.
Ø VARIABLE: Characteristic or feature to be measured in an individual.
Ø How to get a representative sample?
o Through random selection with a probability scheme. (Randomized trials!)
o The selection is made with replacement, or without replacement if the
population is large ⇒ to induce independence.
Types of variables:
  Numeric: Continuous, Discrete
  Categorical: Ordinal, Nominal
2. Exploratory data analysis
Ø Assume that the data have already been collected. Let X1,X2,...,Xn be a
sample of size n of observations from the variable of interest X, where each
Xi represents the characteristic of interest for individual i.
Ø Exploratory techniques are divided into two groups:
1) Graphic techniques, and
2) Descriptive measures.
Ø Study: Measure the survival times of 100 terminal cancer patients who
were given supplemental ascorbate (Vitamin C) as part of their routine
management and 1000 matched controls (similar patients who have
received the same treatment except for the ascorbate).
Objective: Determine whether supplemental ascorbate prolongs the
survival times of patients with terminal human cancer.
Variables: Cancer type, Sex, Age (years), Survival times (days) for both
cases and controls.
Graphic techniques:
Ø CATEGORICAL VARIABLES:
o Barplot: displays the frequencies or relative frequencies of each category.
o Piechart: Displays the relative frequencies of a categorical variable as the
size of a piece of pie.
[Figure: Barplot for Sex; x-axis: Sex (Female, Male); y-axis: relative frequency, 0.0 to 0.5; bar labels 0.53 and 0.47]
[Figure: Piechart for Type (in percentage); categories: Stomach, Bronchus, Colon, Rectum, Ovary, Breast, Bladder, Kidney, Gall_Bladder, Esophagus, Reticulum, Prostate, Uterus, Brain, Pancreas, CLL]
Ø NUMERIC VARIABLES:
o Stem and leaf plot: Shows the shape of the distribution of observed values
in a vertical position.
o Frequency distribution: Contains frequencies and relative frequencies,
both absolute and cumulative.
o Histogram: Graphical representation of the relative frequencies.
Stem and leaf plot for Age
Decimal point is 1 place to the right of the colon
3 : 89
4 : 344
4 : 688999
5 : 01122333344
5 : 5566666777788999
6 : 00122233344
6 : 556667778888888999999
7 : 000000111233444444
7 : 5566677779
8 : 0
8 :
9 : 3
Frequencies distribution for Age
class  lower.l  upper.l  freq  rel.freq  cum.freq  rel.cum.freq
[1,]        35       40     2      0.02         2          0.02
[2,]        40       45     3      0.03         5          0.05
[3,]        45       50     7      0.07        12          0.12
[4,]        50       55    12      0.12        24          0.24
[5,]        55       60    16      0.16        40          0.40
[6,]        60       65    11      0.11        51          0.51
[7,]        65       70    25      0.25        76          0.76
[8,]        70       75    14      0.14        90          0.90
[9,]        75       80     9      0.09        99          0.99
[10,]       80       85     0      0.00        99          0.99
[11,]       85       90     0      0.00        99          0.99
[12,]       90       95     1      0.01       100          1
[Figure: Histogram for Age; x-axis: Age, 40 to 90; y-axis: 0.0 to 0.05]
o In the case of survival data, because of the presence of censored
observations, it is better to produce Kaplan-Meier plots: This graph
consists of plotting the empirical probability of dying (or presenting the
event of interest) after time “t”.
[Figure: Kaplan-Meier plots for Cases and Controls; x-axis: Time (days), 0 to 4000; y-axis: Survival probability, 0.0 to 1.0]
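The Kaplan-Meier idea can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `kaplan_meier` and the small dataset are hypothetical (they are not the ascorbate study data), and the estimator used is the standard product-limit form, in which the survival probability is multiplied by (1 - deaths / number at risk) at each observed event time.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  : observed follow-up times
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical survival times in days (NOT the study data).
times = [5, 8, 8, 12, 20, 20, 31]
events = [1, 1, 0, 1, 1, 1, 0]
km = kaplan_meier(times, events)
for t, s in km:
    print(f"t = {t:3d} days, S(t) = {s:.3f}")
```

Censored observations do not produce a step in the curve, but they do reduce the number of subjects at risk for later event times.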
Numerical descriptive measures:
Ø Numerical measures could be either of central tendency, position, or
dispersion.
Ø CENTRAL TENDENCY MEASURES: These measures locate the central part of
the values of a variable. The three most important are:
o Mean: is the arithmetic average of the observations.
$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$ = Sample mean
The mean is not a good central measure when the distribution of the
data is skewed.
o Median: is the observation that lies just at the middle of a dataset after
being ordered.
l = n×0.5+0.5 = position of the median
m = X(l) = median (observation X that lies at position l
after ordering the data).
The median is a good indicator of central tendency when the
distribution of the data is skewed.
o Mode: is the observation that occurs the most frequently in a dataset. If
this value is unique we say that the frequencies distribution is unimodal.
Ø POSITION OR LOCATION MEASURES: These are called quantiles or
percentiles. For p∈(0,1), the pth percentile is the observation that divides
the dataset such that p×100% of the observations are smaller and (1-
p)×100% are larger. The most common percentiles are:
o Quartiles: are observations that divide a dataset in 4 parts of equal
number of observations.
Q1 = X(n×0.25+0.5) = First quartile
Q2 = X(n×0.50+0.5) = Second quartile
Q3 = X(n×0.75+0.5) = Third quartile
Ø DISPERSION MEASURES: These are measures of the variability
(concentration, dispersion) of a dataset. The most common measures are:
o Range: is the simplest measure and indicates the spread between the
smallest and the largest observations.
R = Maximum – minimum = range
o Interquartile range: is the distance between the first and the third
quartiles.
IQR = Q3 – Q1 = interquartile range
o Variance: is the average of squared deviations of each observation from the
mean.
$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)$ = sample variance
The square root of the variance is called standard deviation, i.e.,
$S = \sqrt{S^2}$ = sample standard deviation
o Variation coefficient: measures the relative dispersion of a dataset with
respect to the location.
$cv = S / \bar{X}$ = sample variation coefficient
This measure is useful to compare the variability of two datasets
because it does not depend on the measuring scale.
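The descriptive measures above can be sketched in Python as follows. The dataset is hypothetical, and so are the helper names `order_stat` and `describe`. The slides give the position rule n×p+0.5 but do not say what to do when that position is not an integer; linear interpolation between the two neighbouring order statistics is an assumption made here.

```python
import math


def order_stat(sorted_x, pos):
    """Value at 1-based position `pos` of the ordered data.

    ASSUMPTION: when `pos` is fractional, interpolate linearly between
    the two neighbouring order statistics.
    """
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_x[lo - 1] * (1 - frac) + sorted_x[hi - 1] * frac


def describe(x):
    """Central tendency, position and dispersion measures of a sample."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    xs = sorted(x)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs) / (n - 1)   # sample variance S^2
    sd = math.sqrt(var)                                # standard deviation S
    q1, median, q3 = (order_stat(xs, n * p + 0.5) for p in (0.25, 0.50, 0.75))
    return {
        "mean": mean, "median": median, "q1": q1, "q3": q3,
        "range": xs[-1] - xs[0], "iqr": q3 - q1,
        "var": var, "sd": sd, "cv": sd / mean,
    }


# Hypothetical data, only for illustration.
stats = describe([2, 4, 4, 5, 7, 9, 10, 15])
for name, value in stats.items():
    print(f"{name:>6}: {value:.3f}")
```

Because the variation coefficient divides by the mean, it is only meaningful for variables measured on a ratio scale with a nonzero mean.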
Descriptive measures for Age
Min.  1st Qu.  Median  Mean  3rd Qu.  Max.
  38       56      65  63.2       70    93
Descriptive measures for Survival times
Variable    n    events  mean  se(mean)  median
Cases       100      92   781     112       331
Controls    100     100   360      32.2     269
Ø BOXPLOT DIAGRAMS: The boxplot summarizes the most important
descriptive measures. It also allows us to assess symmetry and the presence
of outliers. This diagram is also useful to compare different variables.
[Figure: Boxplots for Cases and Controls; vertical axis from 0 to 4000]
3. Probability Theory
Ø Probability theory acts as a bridge between descriptive statistics and
inferential statistics.
Ø Informally: Probability is a quantification or measure of the uncertainty
associated to the occurrence of an event.
Ø Formally: Probability is a function that satisfies 3 axioms:
1. $P(A) \geq 0$ for every event A
2. $P(\Omega) = 1$, where Ω contains all possibilities
3. $P(A \cup B) = P(A) + P(B)$, if $A \cap B = \emptyset$
Ø Although there is only one mathematical definition of probability, there are
several ways of assigning a probability: classic, frequentist and subjective.
For example, if somebody says that the probability of a coin coming up
heads is ½, how did he get this number?
[Diagram: Data analysis (Descriptive S.) → Probability → Decision theory (Inferential S.)]
Ø Something important about probabilities is that the quantification of
uncertainty is subject to change according to the conditions or the
knowledge we have about the event ⇒ conditional probabilities.
o Example: Consider the experiment of tossing two fair coins, and let A, B
and C be three events such that
A=two heads
B=the first coin is heads
C=at least one of the coins is heads
P(A)=1/4
P(A given that we know B)=1/2
P(A given that we know C)=1/3
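The three probabilities in the coin example can be checked by enumerating the four equally likely outcomes. The sketch below is only an illustration; the helper `prob` and the lambda names are hypothetical, and conditioning is done by restricting the sample space to the outcomes where the known event holds.

```python
from fractions import Fraction
from itertools import product

# Sample space of two fair coins: 4 equally likely outcomes.
omega = list(product("HT", repeat=2))


def prob(event, given=lambda w: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [w for w in omega if given(w)]
    favourable = [w for w in conditioned if event(w)]
    return Fraction(len(favourable), len(conditioned))


A = lambda w: w == ("H", "H")   # two heads
B = lambda w: w[0] == "H"       # the first coin is heads
C = lambda w: "H" in w          # at least one of the coins is heads

p_A = prob(A)
p_A_given_B = prob(A, B)
p_A_given_C = prob(A, C)
print(p_A, p_A_given_B, p_A_given_C)
```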
Ø CONDITIONAL PROBABILITY: Once we know that event B has occurred we
are interested in the probability of A. This is obtained by,
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, if $P(B) \neq 0$
o Comment: Broadly speaking, all probabilities are conditional probabilities:
$P(A \mid H)$.
Ø From the definition of conditional probability we can derive two important
results: the marginalization rule and Bayes' theorem.
o Result 1: Marginalization rule.
$P(A) = P(A \mid B) P(B) + P(A \mid B^c) P(B^c)$,
where $B^c$ = not B.
o Example: Prognosis.
We wish to determine the probability of survival (up to a specified point in
time) following a particular cancer diagnosis, given that it depends on the
stage of disease at diagnosis among other factors. Let
A=surviving
B=cancer was diagnosed at an early stage
Bc= cancer was NOT diagnosed at an early stage
Computing P(A) directly may be difficult, but we can obtain it by using the
marginalization rule.
Suppose patients with early stage disease have good prognosis, say
$P(A \mid B) = 0.80$, but for late stage it is poor, say $P(A \mid B^c) = 0.20$. We also
know that the majority of all diagnoses are early stage, that is, $P(B) = 0.90$,
and therefore $P(B^c) = 0.10$. Then the marginal probability of surviving is:
$P(A) = 0.80 \times 0.90 + 0.20 \times 0.10 = 0.74$
o Result 2: Bayes Theorem.
$P(B \mid A) = \frac{P(A \mid B)}{P(A)} \times P(B)$
This theorem tells us formally the learning process: $P(B) \rightarrow P(B \mid A)$.
o Example: Prognosis (cont…)
The probability that the disease was diagnosed at an early stage can be
updated if we know that the patient has survived. A priori we knew that
P(B)=0.90. Now suppose that we find out that the patient survived, that is,
we know A. Then a revised probability of an early diagnosis is:
$P(B \mid A) = \frac{0.80}{0.74} \times 0.90 = 0.97$
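The prognosis example combines the marginalization rule and Bayes' theorem, and the arithmetic can be reproduced with a short Python sketch. The variable names are illustrative only; the four input probabilities are the ones stated in the example.

```python
# Inputs stated in the prognosis example.
p_A_given_B = 0.80    # P(survive | early stage)
p_A_given_Bc = 0.20   # P(survive | late stage)
p_B = 0.90            # P(early stage)
p_Bc = 1 - p_B        # P(late stage)

# Marginalization rule: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
p_A = p_A_given_B * p_B + p_A_given_Bc * p_Bc

# Bayes' theorem: P(B|A) = P(A|B)P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A

print(f"P(A)   = {p_A:.2f}")
print(f"P(B|A) = {p_B_given_A:.2f}")
```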
Ø ODDS AND LOG-ODDS: An alternative way of reporting a probability.
Instead of quantifying the uncertainty in the [0,1] scale, we can do it in the
[0,∞) scale:
$O = \frac{p}{1-p}$
and
$p = \frac{O}{1+O}$.
The natural logarithm of the odds is called logit,
$\operatorname{logit}(p) = \log\left(\frac{p}{1-p}\right) \in (-\infty, \infty)$.
o Example: a probability of 0.20 (20% chance) corresponds to odds of
O=0.20/0.80=0.25 or, in betting language, “4 to 1 against”. Conversely,
betting odds of “7 to 4 against” correspond to O=4/7 or a probability of
p=4/11=0.36.
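The conversions between probability, odds and logit can be sketched as three small Python functions. The function names are hypothetical; the two checks reproduce the numbers of the example above.

```python
import math


def odds(p):
    """Odds O = p / (1 - p) for a probability p in [0, 1)."""
    return p / (1 - p)


def prob_from_odds(o):
    """Probability p = O / (1 + O) for odds O in [0, inf)."""
    return o / (1 + o)


def logit(p):
    """Natural logarithm of the odds, defined for p in (0, 1)."""
    return math.log(odds(p))


o_20 = odds(0.20)                 # odds for a 20% chance
p_7_to_4 = prob_from_odds(4 / 7)  # betting odds of "7 to 4 against"
print(f"odds(0.20) = {o_20:.2f}")
print(f"p for 7 to 4 against = {p_7_to_4:.2f}")
print(f"logit(0.20) = {logit(0.20):.3f}")
```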
o Bayes Theorem for odds: The learning mechanism given by the Bayes
Theorem can also be written in terms of odds:
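$\frac{P(B \mid A)}{P(B^c \mid A)} = \frac{P(A \mid B)}{P(A \mid B^c)} \times \frac{P(B)}{P(B^c)}$.
Or equivalently,
$\frac{P(B \mid A)}{1 - P(B \mid A)} = \frac{P(A \mid B)}{P(A \mid B^c)} \times \frac{P(B)}{1 - P(B)}$.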
This form of the Bayes Theorem allows us to update $P(B)$ into $P(B \mid A)$
without calculating $P(A)$.
o Example: Prognosis (cont…)
The initial odds for an early stage diagnosis are:
$P(B) / P(B^c) = 0.90 / 0.10 = 9$,
the ratio $P(A \mid B) / P(A \mid B^c) = 0.80 / 0.20 = 4$, therefore the updated odds are
$P(B \mid A) / P(B^c \mid A) = 4 \times 9 = 36$.
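As a quick check, the odds form of the update can be reproduced in Python and converted back to a probability. The variable names are illustrative; note that P(A) is never computed.

```python
prior_odds = 0.90 / 0.10          # P(B) / P(B^c)
likelihood_ratio = 0.80 / 0.20    # P(A|B) / P(A|B^c)

# Posterior odds = likelihood ratio x prior odds
posterior_odds = likelihood_ratio * prior_odds

# Back to the probability scale: p = O / (1 + O)
posterior_prob = posterior_odds / (1 + posterior_odds)

print(f"posterior odds = {posterior_odds:.0f}")
print(f"P(B|A)         = {posterior_prob:.2f}")
```

The resulting probability agrees with the value 0.97 obtained directly from Bayes' theorem.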
Ø BAYESIAN THEORY is based on the subjective interpretation of probability
and has its roots in Bayes Theorem.
4. Decision theory
Ø Statistical Inference is a way of making decisions. Classical methods of
inference ignore important aspects of the decision-making process;
however, Bayesian methods of inference do take them into account.
Ø What is a decision problem? We face a decision problem when we have to
select from two or more ways of proceeding.
Ø MAKING DECISIONS is a fundamental aspect in the life of a professional
person, for instance, a physician must make decisions constantly in an
environment with uncertainty, decisions about the best treatment for a
patient, etc.
Ø DECISION THEORY proposes a method of making decisions based on some
basic principles about the coherent choice among alternative options.
Ø ELEMENTS OF A DECISION PROBLEM under uncertainty:
A decision problem is defined by the quadruplet (D, E, C, ≤), where:
q D : Space of decisions. Set of possible alternatives; it has to be exhaustive
(contains all possibilities) and exclusive (choosing one element in D
excludes the choice of any other).
$D = \{d_1, d_2, \ldots, d_k\}$.
q E : Space of uncertain events. Contains uncertain events relevant to the
decision problem.
$E_i = \{E_{i1}, E_{i2}, \ldots, E_{im_i}\}$, i = 1,2,…,k.
q C : Space of consequences. Set of all possible consequences; it describes
the consequences of choosing a decision.
$C = \{c_1, c_2, \ldots, c_k\}$.
q ≤ : Preference relation among different options. It is defined in such a way
that d1≤d2 if d2 is preferred over d1.
• REMARK: For the moment we will consider discrete spaces (decisions,
events and consequences), although the theory is also applied to continuous
spaces.
Ø DECISION TREE (under uncertainty):
[Diagram: decision tree. Each decision d1, …, di, …, dk leads to an uncertainty node with events Ei1, Ei2, …, Eimi and corresponding consequences ci1, ci2, …, cimi.]
There is not full information about the consequences of making a decision.
Ø Example: Decision problem.
A physician needs to decide whether to carry out surgery on a person he
believes has a malignant tumor or to treat with chemotherapy. If the patient
has a benign tumor, the life expectancy is 20 years. If he has a
malignant tumor, undergoes surgery, and survives, he is given 10 years of
life; whereas if he has a malignant tumor and does not undergo surgery, he
is only given 2 years of life.
$D = \{d_1, d_2\}$, where d1 = surgery, d2 = therapy
$E = \{E_{11}, E_{12}, E_{13}, E_{21}, E_{22}\}$, where E11 = survival / malignant tumor, E12 = survival /
benign tumor, E13 = dead, E21 = malignant tumor, E22 = benign tumor
$C = \{c_{11}, c_{12}, c_{13}, c_{21}, c_{22}\}$, where c11=10, c12=20, c13=0, c21=2, c22=20
[Diagram: decision tree with a decision node and uncertainty (random) nodes. Surgery → Surv → M.Tum (10 yrs.) or B.Tum (20 yrs.); Surgery → Dead (0 yrs.). Therapy → M.Tum (2 yrs.) or B.Tum (20 yrs.).]
Ø In practice, most decision problems have a much more complex structure.
For instance, one may have to decide whether or not to carry out an
experiment, and if one does the experiment, make another decision
according to the result of the experiment. (Sequential decision problems).
Ø Frequently, the set of uncertain events is the same for all decisions, that is,
$E_i = \{E_{i1}, E_{i2}, \ldots, E_{im_i}\} = \{E_1, E_2, \ldots, E_m\} = E$, for all i. In this case, the
problem can be represented as:
      E1   ...  Ej   ...  Em
d1    c11  ...  c1j  ...  c1m
...
di    ci1  ...  cij  ...  cim
...
dk    ck1  ...  ckj  ...  ckm
Ø The OBJECTIVE of a decision problem under uncertainty is then to make the
best decision di from the set D without knowing which of the events Eij
from Ei will occur.
Ø Although the events that form each Ei are uncertain, in the sense that we do
not know which of them will occur, in general, we have an idea of the
probability of each of them. For instance,
[Figure: 25 years. Which is more probable: live 10 yrs. more, die in 1 month, or reach 90 yrs.?]
Ø Sometimes it is difficult to order our preferences among all possible
different consequences. It might be simpler to assign a utility measure to
each of the consequences and then order them according to their utility.
[Figure: Consequences: Earn much money & have little available time; Earn little money & have much available time; Earn regular money & have regular available time]
Ø QUANTIFICATION of uncertain events and of consequences.
q The information that the decision maker has about the possible occurrence
of the events can be quantified through a probability function on the space
E.
q In the same way, it is possible to quantify the preferences of the decision
maker among different consequences through a utility function in such a
way that $c_{ij} \leq c_{i'j'} \Leftrightarrow u(c_{ij}) \leq u(c_{i'j'})$.
Ø Alternatively, it is possible to represent the decision tree as follows:
[Diagram: decision tree in which each decision d1, …, di, …, dk leads to events Eij with probabilities P(Eij|di) and utilities u(cij), j = 1, …, mi]
Ø How to make the best decision?
If in some way we were able to make the uncertainty disappear, we could
order our preferences according to the utility of each decision. Then the
best decision would be the one that has the maximum utility.
[Figure labels: Uncertainty, Decider, Go away]
Ø STRATEGIES: There are 4 strategies or criteria proposed in the literature to
eliminate the uncertainty and make decisions: optimistic, pessimistic or
minimax, conditional or most probable, and expected utility.
q Whichever strategy one takes, the best option is the one that maximizes the
tree “without uncertainty.”
Ø AXIOMS OF COHERENCE. These are a series of principles that establish the
conditions for making coherent decisions and that clarify the possible
ambiguity in the process of making a decision. There are four axioms of
coherence:
1. COMPARABILITY. This axiom establishes that we should at least be able
to express preferences between two different options.
2. TRANSITIVITY. This axiom establishes that preferences must be
transitive to avoid contradictions.
3. SUBSTITUTION AND DOMINATION. This axiom establishes that there are
equivalent options and there are also options dominated by others.
4. REFERENCE EVENTS. This axiom establishes that to be able to make
reasonable decisions, it is necessary to measure the information and the
preferences of the decision maker in a quantitative form.
Ø IMPLICATIONS: As a consequence of the axioms, if we want to make
coherent decisions, the way of making a decision is as follows:
1) Assign a utility u(c) for all c in C.
2) Assign a probability P(E) for all E in E.
3) Select the (optimal) option that maximizes the expected utility.
o Theorem: Bayesian decision criteria.
The expected utility of the option $d_i = \{c_{ij} \mid E_{ij},\ j = 1, \ldots, m_i\}$ is defined as:
$u(d_i) = \sum_{j=1}^{m_i} u(c_{ij}) \, P(E_{ij} \mid d_i)$.
Then the optimal decision is $d^*$ such that $u(d^*) = \max_i u(d_i)$.
Ø Example. Decision problem (cont…).
Assume that the prior beliefs of the physician are that a patient survives
the surgery 90% of the time and that 60% of the tumors are malignant.
We assume that the outcome of the surgery is independent of the condition of
the tumor. Furthermore, assuming that the utility is proportional to the
years of life, then the decision problem is re-written as
[Diagram: decision tree with probabilities. Surgery → Surv (0.9) → M.Tum (0.6): 10 yrs., B.Tum (0.4): 20 yrs.; Surgery → Dead (0.1): 0 yrs. Therapy → M.Tum (0.6): 2 yrs., B.Tum (0.4): 20 yrs.]
Then, the expected utility of each option becomes
$u(d_1) = 10(0.9)(0.6) + 20(0.9)(0.4) + 0(0.1) = 12.6$, and
$u(d_2) = 2(0.6) + 20(0.4) = 9.2$
Therefore, the option that maximizes the expected utility is d1, that is, the
optimal decision is to carry out surgery.
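The expected-utility calculation for the surgery example can be sketched in Python. The dictionary layout and the function name `expected_utility` are illustrative choices; the probabilities and utilities (years of life) are those of the example, with the branch probabilities under surgery obtained by multiplying the survival and tumor-type probabilities, which relies on the independence assumption stated there.

```python
# Each option maps to a list of (probability of event, utility in years of life).
options = {
    "surgery": [
        (0.9 * 0.6, 10),  # survives surgery, malignant tumor
        (0.9 * 0.4, 20),  # survives surgery, benign tumor
        (0.1, 0),         # dies in surgery
    ],
    "therapy": [
        (0.6, 2),         # malignant tumor
        (0.4, 20),        # benign tumor
    ],
}


def expected_utility(branches):
    """Sum of utility x probability over the uncertain events of one option."""
    return sum(p * u for p, u in branches)


eu = {d: expected_utility(branches) for d, branches in options.items()}
best = max(eu, key=eu.get)  # option with the largest expected utility

for d, value in eu.items():
    print(f"u({d}) = {value:.1f}")
print("optimal decision:", best)
```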
Ø FINAL COMMENT:
The more we know about the uncertain events,
the better the decision made is.
How do we reduce uncertainty about E?
Obtaining additional information (Z) about the events E. We then update
our knowledge by using the Bayes Theorem, that is,
$P(E) \rightarrow P(E \mid Z)$ (Bayes Theo.)
In this case we have two situations:
1) Initial situation (a-priori):
$P(E_{ij})$, $u(c_{ij})$, $u(d_i) = \sum_j u(c_{ij}) P(E_{ij})$ (initial expected utility)
2) Final situation (a-posteriori):
$P(E_{ij} \mid Z)$, $u(c_{ij})$, $u(d_i \mid Z) = \sum_j u(c_{ij}) P(E_{ij} \mid Z)$ (final expected utility)
In any case, the option that maximizes the expected utility is the optimal
decision.
5. Bayesian inference
Ø Let X = random variable of interest
(e.g. response to a drug or the survival time of patients).
Ø The behavior of X depends, in a parametric world, on the value of some
unknown quantities θ called parameters.
⇒ $f(x \mid \theta)$
denotes the density function of X that depends on θ.
Ø INFERENCE PROBLEM. Let $\mathcal{F} = \{f(x \mid \theta),\ \theta \in \Theta\}$ be a parametric family indexed
by the parameter θ∈Θ. Let X1,...,Xn be a random sample (r. s.) of
observations from $f(x \mid \theta) \in \mathcal{F}$. The inference problem consists of estimating
the true value of the parameter θ.
o In a Bayesian perspective, the inference problem can be seen as a decision
problem with the following elements:
D = space of decisions (in point estimation, D = Θ)
E = Θ (parameter space)
C = $\{(d, \theta) : d \in D,\ \theta \in \Theta\}$
≤ : will be represented by a utility function or a loss function.
Ø The sample gives additional information about the uncertain events θ ∈ Θ.
The problem consists of how to update the information.
Ø If the coherence axioms are accepted, the decision maker is capable of
quantifying his or her knowledge about the uncertain events through a
probability function. We then define,
$f(\theta)$ the prior distribution (or a-priori). Quantifies the initial
knowledge about θ.
$f(x \mid \theta)$ sample information generating process. Gives additional
information about θ.
$f(x \mid \theta)$ the likelihood function. Contains all information about θ given
by the sample $X = (X_1, \ldots, X_n)$: $f(x \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$
q All this information about θ is combined to obtain a final knowledge or a-
posteriori after having observed the sample. The way to do it is by means
of Bayes Theorem:
$f(\theta \mid x) = \frac{f(x \mid \theta) f(\theta)}{f(x)}$,
where $f(x) = \int_\Theta f(x \mid \theta) f(\theta) \, d\theta$ or $\sum_\theta f(x \mid \theta) f(\theta)$.
As $f(\theta \mid x)$ is a function of θ, then we can write
$f(\theta \mid x) \propto f(x \mid \theta) f(\theta)$
Finally,
$f(\theta \mid x)$ the posterior distribution (or a-posteriori). Summarizes all
available knowledge about θ (prior + sample).
Ø Example: Tumor response.
X=tumor response under a therapy
$x = 1$ if positive response, $x = 0$ otherwise
⇒ $f(x \mid \theta) = \text{Ber}(x \mid \theta)$,
where θ=probability of response.
The prior beliefs of the experts are that the probability of response (θ) for
this new therapy is well represented by
$f(\theta) = \text{Beta}(\theta \mid 3, 3)$
[Figure: Prior density; x-axis: theta, 0.0 to 1.0]
After testing the therapy on n=10 patients, only 2 of them responded
positively, which gives us the likelihood
$f(x \mid \theta) = \theta^2 (1-\theta)^8$
[Figure: Prior and Likelihood; x-axis: theta, 0.0 to 1.0]
Combining the prior with the additional information given by the
likelihood, we get a posterior knowledge about θ given by
$f(\theta \mid x) = \text{Beta}(\theta \mid 5, 11)$
[Figure: Prior, Likelihood and Posterior; x-axis: theta, 0.0 to 1.0]
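The step from the Beta(3,3) prior to the Beta(5,11) posterior can be written out explicitly. The derivation below is a short worked sketch; the symbols a, b (prior parameters) and s (number of positive responses) are notation introduced here for illustration.

```latex
\begin{align*}
f(\theta) &= \text{Beta}(\theta \mid 3, 3) \propto \theta^{3-1}(1-\theta)^{3-1}
           = \theta^{2}(1-\theta)^{2}, \\
f(x \mid \theta) &= \theta^{2}(1-\theta)^{8}, \\
f(\theta \mid x) &\propto f(x \mid \theta)\, f(\theta)
           = \theta^{4}(1-\theta)^{10}
           = \theta^{5-1}(1-\theta)^{11-1}
           \;\Rightarrow\; f(\theta \mid x) = \text{Beta}(\theta \mid 5, 11).
\end{align*}
% In general, a Beta(a, b) prior combined with s positive responses
% out of n Bernoulli trials gives a Beta(a + s, b + n - s) posterior.
```

Here a = b = 3, s = 2 and n = 10, so the posterior parameters are 3 + 2 = 5 and 3 + 8 = 11.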
Ø REMARK: As θ is a random quantity, since we are uncertain about the true
value of θ, the density function $f(x \mid \theta)$ that generates relevant information
about θ is actually a conditional density. Moreover, as θ is unknown, f(x|θ)
cannot be used to describe the behavior of the r.v. X.
Ø PREDICTIVE DISTRIBUTION: The predictive distribution is the marginal
density function f(x) that allows us to determine which values of the
random variable are more probable.
1) Prior predictive distribution. Using the prior $f(\theta)$ and marginalizing
$f(x) = \int f(x \mid \theta) f(\theta) \, d\theta$ or $f(x) = \sum_\theta f(x \mid \theta) f(\theta)$
2) Posterior predictive distribution. Using the posterior $f(\theta \mid x)$ and
marginalizing
$f(x_F \mid x) = \int f(x_F \mid \theta) f(\theta \mid x) \, d\theta$ or $f(x_F \mid x) = \sum_\theta f(x_F \mid \theta) f(\theta \mid x)$
Ø Example: Tumor response (cont…).
Our idea is to determine the probability of response for a set of m=10 new
patients, say $Y = \sum_{i=1}^{10} X_i \sim \text{Bin}(10, \theta)$, ← θ unknown.
The posterior knowledge we have about θ is that $f(\theta \mid x) = \text{Beta}(\theta \mid 5, 11)$. One
alternative to determine the value of θ is to take the average (posterior
mean), that is,
$\hat{\theta} = E(\theta \mid x) = 0.31$ ⇒ $Y \approx \text{Bin}(10, 0.31)$ ← plug-in
However, this procedure does not take into account the uncertainty around
θ. So the correct answer will be given by the posterior predictive
distribution which takes the form
$f(y \mid x) = \text{BeBin}(y \mid 10, 5, 11)$ ← Beta−Binomial
[Figure: Predictive distribution of Y; panels: Binomial distribution, Beta-Binomial distribution, and both overlaid]
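The difference between the plug-in Binomial and the Beta-Binomial posterior predictive can be illustrated numerically. The sketch below assumes the usual Beta-Binomial probability function, C(m, y) B(y + a, m − y + b) / B(a, b), with m = 10 new patients and posterior parameters a = 5, b = 11; the function names are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(y, m, a, b):
    """P(Y = y) when Y | theta ~ Bin(m, theta) and theta ~ Beta(a, b)."""
    return comb(m, y) * exp(log_beta(y + a, m - y + b) - log_beta(a, b))


def binomial_pmf(y, m, theta):
    """P(Y = y) when Y ~ Bin(m, theta) with theta fixed."""
    return comb(m, y) * theta**y * (1 - theta) ** (m - y)


def mean_and_var(pmf):
    """Mean and variance of a distribution on 0, 1, ..., len(pmf) - 1."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var


m, a, b = 10, 5, 11
theta_hat = a / (a + b)  # posterior mean used by the plug-in approach

bb = [beta_binomial_pmf(y, m, a, b) for y in range(m + 1)]
bi = [binomial_pmf(y, m, theta_hat) for y in range(m + 1)]

bb_mean, bb_var = mean_and_var(bb)
bi_mean, bi_var = mean_and_var(bi)
print(f"Beta-Binomial: mean = {bb_mean:.3f}, variance = {bb_var:.3f}")
print(f"Binomial     : mean = {bi_mean:.3f}, variance = {bi_var:.3f}")
```

Both distributions share the same mean, but the Beta-Binomial has the larger variance because it also carries the remaining uncertainty about θ.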
Ø MODEL COMPARISON. $M_1: f_1(x, \theta_1)$ vs. $M_2: f_2(x, \theta_2)$
We can naturally solve this problem by considering a decision problem but
alternatively, we can compute a Bayes factor (likelihood ratio)
$B = \frac{f_1(x)}{f_2(x)}$,
where $f_j(x) = \int f_j(x \mid \theta_j) \, \pi(\theta_j) \, d\theta_j$, j=1,2.
If B is large (>10) ⇒ data supports M1
If B is small (<1/10) ⇒ data supports M2
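A Bayes factor can be computed in closed form when both models use a Bernoulli likelihood with a Beta prior. The sketch below is a hypothetical illustration, not an analysis from the slides: it takes data with 2 positive responses in 10 trials and compares M1, with a Beta(3,3) prior, against M2, with a Uniform(0,1) = Beta(1,1) prior. It uses the fact that the marginal likelihood of such a sample is B(a + s, b + n − s) / B(a, b).

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def marginal_likelihood(s, n, a, b):
    """f(x) for a Bernoulli sample with s successes in n trials
    under a Beta(a, b) prior on the success probability."""
    return exp(log_beta(a + s, b + n - s) - log_beta(a, b))


s, n = 2, 10                             # hypothetical data
f1 = marginal_likelihood(s, n, 3, 3)     # M1: theta ~ Beta(3, 3)
f2 = marginal_likelihood(s, n, 1, 1)     # M2: theta ~ Uniform(0, 1)
B = f1 / f2

print(f"f1(x) = {f1:.6f}")
print(f"f2(x) = {f2:.6f}")
print(f"Bayes factor B = {B:.3f}")
```

A value of B close to 1 means that the data do not clearly favour either model.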
Ø SUMMARY: Bayesian analysis.
$f(x \mid \theta)$ and $f(\theta)$ are probability distributions that define a joint model
$f(x, \theta)$, which implies a posterior $f(\theta \mid x)$ and a predictive (marginal) $f(x)$.
o Posterior probabilities are over parameter space Θ, e. g.
§ $P(\theta > 0.15 \mid x)$
§ $P(\theta_1 > \theta_2 + \delta \mid x)$, etc.
6. Priors
Ø There exist several classes of prior distributions. In terms of the amount of
information they carry, we classify them as informative and
noninformative.
Ø INFORMATIVE PRIOR DISTRIBUTIONS: These are prior distributions that
contain relevant information about the occurrence of the uncertain events
θ. There are two kinds:
o Subjective prior: probability model reflecting (personal) judgement
about uncertain events (parameter values).
o Historical prior: (from related studies) judgement about uncertain
events (parameter values) informed by related earlier studies. We can
achieve a historical prior by:
§ Discount with inflated variance, or
§ Use only a fraction of the data set.
Ø Example: Amount of tyrosine.
The consequences of certain treatment can be determined by the amount of
tyrosine (θ) in the urine. The prior information about this quantity in
patients shows that it is around 39mg./24hrs. and that the percentage of
times this quantity exceeds 49mg./24hrs. is 25%.
According to this information, “it can be implied” that the normal
distribution models “reasonably well” this behavior, so
θ ∼ ( )2,N τµ ,
where µ=E(θ)=mean and τ2=Var(θ)=variance. Moreover,
How?
( ) 25.03949
ZP49P =
τ−
>=>θ ⇒ τ−= 3949Z 25.0 ,
as Z0.25 = 0.675 (from tables) ⇒ τ = 100.675
Therefore, θ ∼ N(39, 219.47).
theta
Den
sity
0 20 40 60 80
0.0
0.00
50.
010
0.01
50.
020
0.02
5
Informative prior for Tyrosine
Amount of tyrosine (θ) around 39 µ=39
P(θ > 49) = 0.25 (given µ=39)
τ=14.81
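The elicitation of τ in the tyrosine example can be checked numerically. This is a small sketch using `qnorm` in place of the printed tables; because `qnorm(0.75)` is about 0.6745 rather than the rounded 0.675, it returns τ ≈ 14.83 and τ² ≈ 219.8, slightly different from the 14.81 and 219.47 quoted in the notes.

```r
mu  <- 39                 # prior mean stated in the example
z   <- qnorm(0.75)        # upper 25% point of N(0, 1), about 0.6745
tau <- (49 - mu) / z      # solves P(theta > 49) = 0.25 for tau
c(tau = tau, tau2 = tau^2)

# Check: the implied prior puts probability 0.25 above 49
p_check <- pnorm(49, mean = mu, sd = tau, lower.tail = FALSE)
p_check
```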
Ø Example: Amount of tyrosine (cont...)
Suppose now that there exist only 3 possible values (categories) for the
amount of tyrosine: θ1 = low, θ2 = medium, & θ3 = high. Assume even
further that θ2 is three times as frequent as θ1 and that θ3 is twice as
frequent as θ1. We can specify the prior distribution for the amount of
tyrosine by,
letting $p_i = P(\theta_i)$, i = 1,2,3. Then,
$p_2 = 3p_1$ and $p_3 = 2p_1$. Moreover, $p_1 + p_2 + p_3 = 1$
⇒ $p_1 + 3p_1 + 2p_1 = 1$ ⇔ $6p_1 = 1$ ∴ $p_1 = 1/6$, $p_2 = 1/2$ and $p_3 = 1/3$
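The same arithmetic amounts to normalizing relative frequencies. A one-line sketch (the names `low`, `medium` and `high` are just labels for θ1, θ2 and θ3):

```r
# Relative frequencies with theta1 (low) as the reference category
w <- c(low = 1, medium = 3, high = 2)
p <- w / sum(w)   # prior probabilities p1, p2, p3
p
```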
Ø NONINFORMATIVE PRIOR DISTRIBUTIONS: These are prior distributions that
do not give us any relevant information about the occurrence of the
uncertain events θ. There are several criteria to define a noninformative
prior:
1) Principle of insufficient reason: According to this principle, in the
absence of evidence to the contrary, all possibilities have the same prior
probability.
⇒ Uniform priors
2) Invariant prior distribution: Jeffreys (1946) proposed a noninformative
prior distribution that is invariant under re-parameterizations.
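$\pi(\theta) \propto \left[\det I(\theta)\right]^{1/2}$, θ∈Θ,
where $I(\theta) = -E_{X \mid \theta}\left[\dfrac{\partial^2 \log f(X \mid \theta)}{\partial\theta\, \partial\theta'}\right]$ is Fisher’s information matrix.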
o Example: Let X be a r.v. with conditional distribution given θ,
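$f(x \mid \theta) = \mathrm{Ber}(x \mid \theta)$, i.e., $f(x \mid \theta) = \theta^{x}(1-\theta)^{1-x} I_{\{0,1\}}(x)$, θ∈(0,1). Then,
$\pi(\theta) = \mathrm{Beta}(\theta \mid 1/2, 1/2)$ ← Jeffreys prior
$\pi(\theta) = \mathrm{Uniform}(\theta \mid 0, 1)$ ← Insufficient reason prior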
[Figure: "Noninformative priors", densities of the Jeffreys and Uniform priors for θ on (0, 1).]
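For completeness, the Jeffreys prior quoted for the Bernoulli model follows from the definition of Fisher's information in a few lines. The derivation below is a standard calculation written out as a sketch; it uses only the Bernoulli density and the fact that E(X | θ) = θ.

```latex
\begin{align*}
\log f(x \mid \theta) &= x \log\theta + (1 - x)\log(1 - \theta), \\
-\frac{\partial^2 \log f(x \mid \theta)}{\partial\theta^2}
  &= \frac{x}{\theta^2} + \frac{1 - x}{(1 - \theta)^2}, \\
I(\theta) &= \frac{E(X \mid \theta)}{\theta^2} + \frac{1 - E(X \mid \theta)}{(1 - \theta)^2}
           = \frac{1}{\theta} + \frac{1}{1 - \theta}
           = \frac{1}{\theta(1 - \theta)}, \\
\pi(\theta) &\propto I(\theta)^{1/2} = \theta^{-1/2}(1 - \theta)^{-1/2}.
\end{align*}
```

The last expression is the kernel of a Beta(1/2, 1/2) density, which is why the Jeffreys prior puts more mass near 0 and 1 than the uniform prior does.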
Ø COMMENTS on noninformative priors:
o Useful for data analysis
o Impractical for design problems: need to consider inference before
recording data
o Solution: Consider two different priors:
§ design prior (optimistic, informative) vs.
§ analysis prior (skeptical, vague)
optimistic = investigator, drug developer
skeptical = regulator, decision maker
Ø CONJUGATE PRIORS: A prior distribution such that the posterior $\pi(\theta \mid x)$ and the
prior $\pi(\theta)$ belong to the same family (i.e., have the same form but with
updated parameters).
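o Example: Let X1,X2,...,Xn be a r.s. from $f(x \mid \theta) = \mathrm{Ber}(x \mid \theta)$.
Prior: $\pi(\theta) = \mathrm{Beta}(\theta \mid a, b) = \dfrac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, \theta^{a-1}(1-\theta)^{b-1} I_{(0,1)}(\theta)$
Likelihood: $f(x \mid \theta) = \theta^{\sum x_i}(1-\theta)^{n-\sum x_i} \prod_{i=1}^{n} I_{\{0,1\}}(x_i)$
Posterior: $f(\theta \mid x) = \mathrm{Beta}(\theta \mid a_1, b_1) = \dfrac{\Gamma(a_1+b_1)}{\Gamma(a_1)\Gamma(b_1)}\, \theta^{a_1-1}(1-\theta)^{b_1-1} I_{(0,1)}(\theta)$
where $a_1 = a + \sum x_i$ and $b_1 = b + n - \sum x_i$.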
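The conjugate update is a two-line computation. The sketch below uses an illustrative helper `beta_update` and a hypothetical sample with 2 successes in 10 trials; the prior Beta(3, 3) is chosen so that the result matches the Beta(3, 3) prior and Beta(5, 11) posterior that appear in the R plotting commands of the Appendix.

```r
# Beta-Bernoulli conjugate update: returns the posterior parameters (a1, b1)
beta_update <- function(x, a, b) {
  c(a1 = a + sum(x), b1 = b + length(x) - sum(x))
}

x    <- c(1, 0, 0, 1, 0, 0, 0, 0, 0, 0)   # hypothetical sample: 2 successes in 10
post <- beta_update(x, a = 3, b = 3)
post                                      # a1 = 5, b1 = 11

post_mean <- unname(post["a1"] / sum(post))   # posterior mean a1 / (a1 + b1)
post_mean
```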
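o Example: Let X1,X2,...,Xn be a r.s. from $f(x \mid \theta) = N(x \mid \mu, \sigma^2)$.
$\pi(\mu) = N(\mu \mid \mu_0, \sigma_0^2)$ is the conjugate prior for µ, and
$\pi(\sigma^{-2}) = \mathrm{Gamma}(\sigma^{-2} \mid a_0, b_0)$ is the conjugate prior for $\sigma^{-2}$.
If $\sigma_0^2 \to \infty$ then $\pi(\mu) \propto$ constant ← improper noninformative prior
If $a_0 \approx 0$ and $b_0 \approx 0$ then $\pi(\sigma^{-2}) = \mathrm{Gamma}(\sigma^{-2} \mid 0.001, 0.001)$ ← vague prior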
o More examples of conjugate families can be found in the list of formulas.
http://www.uv.es/~bernardo/FormulBT.pdf
7. Clinical trial design
Ø “The main objective of almost all trials on human subjects is (or should be)
a decision concerning the treatment of patients in the future”.
Ø The most commonly performed clinical trials evaluate new drugs, medical
devices (like a new catheter), biologics, psychological therapies, or other
interventions.
Ø Clinical trials may be required before the national regulatory authority will
approve marketing of the drug or device, or a new dose of the drug, for use
on patients.
Ø For drug development trials (pharmaceutical industry) we have several
phases:
1) Phase I studies: deal with identifying a safe dose (the maximum tolerable
dose without toxicities), usually in healthy volunteers.
2) Phase II studies: concerned with finding an effective dose.
3) Phase III studies: intended to prove treatment benefit over an
appropriate control.
4) Phase IV studies: monitor the use and possible side-effects of a drug in
routine use.
Ø The objective of the trial is usually specified as statistical hypotheses
about clinically meaningful events, that is, in terms of the parameters of the
model.
For instance:
H0: treatment equivalence
H1: superiority of new treatment
(H2: inferiority of new treatment)
Ø Example: Lung cancer trial.
The physicians would be willing to use the new treatment routinely only if
it confers at least a 13.5% improvement in two-year survival (from a baseline
of 15%), and unwilling if it confers less than an 11% improvement.
Thus, if T = time to death (in years) and θ = P(T > 2), the thresholds are
0.15 + 0.11 = 0.26 and 0.15 + 0.135 = 0.285, so
H0: θ ∈ [0.26, 0.285] ← range of equivalence
H1: θ > 0.285
H2: θ < 0.26
Ø Example: Tumor response.
Stopping criteria based on posterior probabilities.
Let $Y_i \in \{0, 1\}$ be the tumor response under the new therapy for patient i, with
$\theta = P(Y_i = 1)$.
Suppose that the standard of care is $\theta_0 = 15\%$, and that the range of
equivalence is [10%, 20%]; therefore
H0: θ ∈ [0.10, 0.20]
H1: θ > 0.20
H2: θ < 0.10
Let $y_1, y_2, \ldots, y_n$ be the responses of n patients; then we need to evaluate
two situations:
1) Stop and recommend the experimental therapy if
$P(\theta \in H_1 \mid y_1, \ldots, y_n) = P(\theta > 0.2 \mid y_1, \ldots, y_n) > \pi_1$
2) Stop and abandon the experimental therapy if
$P(\theta \in H_2 \mid y_1, \ldots, y_n) = P(\theta < 0.1 \mid y_1, \ldots, y_n) > \pi_2$ or $n \ge n_{\max}$
$\pi_1$ and $\pi_2$ are set close to one and are called design parameters.
These can be tuned to achieve desired frequentist properties.
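The two stopping rules can be evaluated in closed form if we assume a conjugate Beta(a, b) prior for θ. The notes do not fix a prior here, so the Beta(1, 1) default, the design parameters π1 = π2 = 0.95 and n_max = 40 below are illustrative assumptions, and `stopping_decision` is a hypothetical helper name.

```r
# Decision after observing the 0/1 responses y, assuming theta ~ Beta(a, b) a priori
stopping_decision <- function(y, a = 1, b = 1, pi1 = 0.95, pi2 = 0.95, n_max = 40) {
  n  <- length(y)
  a1 <- a + sum(y)                                  # conjugate Beta update
  b1 <- b + n - sum(y)
  p_h1 <- pbeta(0.20, a1, b1, lower.tail = FALSE)   # P(theta > 0.2 | y)
  p_h2 <- pbeta(0.10, a1, b1)                       # P(theta < 0.1 | y)
  if (p_h1 > pi1) return("recommend")
  if (p_h2 > pi2 || n >= n_max) return("abandon")
  "continue"
}

d1 <- stopping_decision(c(1, 1, 0, 1, 1, 0, 1, 1, 0, 1))   # 7 responses in 10
d2 <- stopping_decision(rep(0, 30))                        # no responses in 30
d3 <- stopping_decision(c(0, 1, 0, 0, 0))                  # 1 response in 5
c(d1, d2, d3)
```

With these settings the three hypothetical histories lead to recommending, abandoning, and continuing accrual, respectively.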
Ø FREQUENTIST OPERATING CHARACTERISTICS of a design.
Type-I error:
$\alpha = P(\text{stop and recommend} \mid \theta_0)$.
Comments:
o Typically analytically intractable
o Requires (independent) Monte Carlo simulation
Ø Example: How to compute type−I error.
1. Fix $\theta = \theta_0$
2. Simulate a possible history of the trial
a. Simulate $y_i \sim f(y_i \mid \theta)$, $i = 1, \ldots, n$
b. Evaluate posterior probabilities; e.g. $P(\theta > 0.2 \mid y_1, \ldots, y_n)$
c. Evaluate stopping rules if applicable
d. Upon completion of the trial, record the final decision
3. Repeat the trial simulation M times
4. Record the number of trials $M_R$ that end with the final decision of
rejecting H0 and report
$\alpha \approx M_R / M$
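The simulation steps above can be sketched in R for the tumor response design. Several ingredients are assumptions added for illustration and are not specified in the notes: a Beta(1, 1) prior, a first look after `n_min = 10` patients followed by a look after every patient, `n_max = 40`, and π1 = π2 = 0.95. The function name `simulate_trial` is likewise hypothetical.

```r
set.seed(2008)

# Simulates one trial under a fixed theta and returns TRUE if the final
# decision is "stop and recommend", FALSE otherwise (abandon or n_max reached).
simulate_trial <- function(theta, n_min = 10, n_max = 40, a = 1, b = 1,
                           pi1 = 0.95, pi2 = 0.95) {
  y <- rbinom(n_max, 1, theta)            # step 2a: responses of up to n_max patients
  s <- cumsum(y)                          # running number of responses
  for (n in n_min:n_max) {
    a1 <- a + s[n]                        # step 2b: Beta posterior after n patients
    b1 <- b + n - s[n]
    if (pbeta(0.20, a1, b1, lower.tail = FALSE) > pi1) return(TRUE)   # step 2c
    if (pbeta(0.10, a1, b1) > pi2) return(FALSE)
  }
  FALSE                                   # step 2d: n_max reached, abandon
}

theta0    <- 0.15                                      # step 1
M         <- 2000                                      # step 3
M_R       <- sum(replicate(M, simulate_trial(theta0))) # step 4
alpha_hat <- M_R / M
alpha_hat
```

Repeating the simulation with other values of θ gives the rest of the operating characteristics (for example, the probability of recommending when θ is well above 0.20), and π1, π2, n_min and n_max can then be tuned until the estimated error rates are acceptable.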
Ø DECISION THEORETIC DESIGN.
This is a design based entirely on the decision theory framework, that is,
we need a space of decisions, uncertain events, consequences and
quantifications: a utility and a probability model.
o Space of decisions: $d \in D$, where for example,
Ex 1: choice of the next dose, $d_t = z_{i+1}$
Ex 2: stopping decision, $d_t \in \{0, 1, 2\}$,
where 0=stop and abandon, 1=phase III, 2=continue accrual
o Probability model: Quantification of the uncertainty of all unknown
quantities: parameters θ, historical data y0, data y and latent data $\tilde{y}$.
Ex: Dose/Response problem, $f(y_i \mid \theta, z_i) = N(y_i \mid g(\theta, z_i), \sigma^2)$, $i = 1, \ldots, n$ and
prior $f(\theta, \sigma^2)$, with mean response $g(\theta, z)$ at dose z.
o Utility function: $u(d, \theta)$, the worth of a decision d under uncertain event θ.
Ex 1: Precision of the dose effect θ, i.e., $1 / \mathrm{Var}(\theta \mid z_t)$
Ex 2: Sampling cost + reward of success,
$U(d_t, \theta) = \begin{cases} 0, & \text{if } d_t = 0 \\ c \cdot m + C \cdot P(S), & \text{if } d_t = 1 \\ c \cdot n + U(d^*_{t+1}, \theta), & \text{if } d_t = 2 \end{cases}$
where m=phase III sample size
S=significant phase II trial
n=cohort size
$d^*_{t+1}$=optimal decision at time t+1
8. Hierarchical models
Ø Bayesian hierarchical models simplify the simultaneous estimation of
several parameters θi of the same type, with two objectives:
1) Borrow strength to improve precision in the estimation of the parameters
2) Allow uncertainty to be introduced into the estimates
Ø In general, we can borrow strength across multiple related
§ Studies
§ Subpopulations
§ Current and historical studies
§ Diseases
§ Etc…
Ø Consider multiple studies (sub-populations):
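$y_1 = (y_{11}, \ldots, y_{1n_1})$, $y_2 = (y_{21}, \ldots, y_{2n_2})$, … , $y_k = (y_{k1}, \ldots, y_{kn_k})$
The hierarchical model can be summarized in the following diagram:
[Diagram: hyper-parameter ϕ at the top, linked to the parameters θ1, θ2, …, θk, each of which is linked to its own observations Y1, Y2, …, Yk.]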
Ø FORMALLY, the hierarchical model can be specified as:
1) Parameters and sub-models for each study:
$f(y_1 \mid \theta_1)$, $f(y_2 \mid \theta_2)$, …
2) Borrow strength across studies by combining parameters at the prior
level:
$f(\theta_1 \mid \varphi)$, $f(\theta_2 \mid \varphi)$, …
together with
$f(\varphi)$
§ Note: The prior on θj could include regression on study-specific
covariates:
$f(\theta_j \mid z_j, \varphi)$
Ø There are two alternatives to the hierarchical models:
1. Weaker dependence. Assuming independent studies: separate ϕj’s
$\theta_j \sim f(\theta_j \mid \varphi_j)$ and $f(\varphi_j)$
2. Stronger dependence. Assuming exchangeable patients (pooling):
common θ
$y_i \sim f(y_i \mid \theta)$
o Remark: The hierarchical model is a compromise between 1 and 2.
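The compromise can be seen numerically in the simplest normal-normal case. The sketch below assumes, purely for illustration, that the hyper-parameters (µ, τ) and the sampling standard deviation σ are known and that only the study means are observed; all numerical values are hypothetical. Under those assumptions the posterior mean of each θj is a weighted average of the study's own mean (separate analysis) and the common mean µ (complete pooling).

```r
set.seed(42)

k     <- 5
n_j   <- c(5, 10, 20, 40, 80)     # hypothetical study sizes
mu    <- 0                        # assumed known hyper-parameters
tau   <- 1
sigma <- 2                        # assumed known sampling standard deviation

theta <- rnorm(k, mu, tau)                    # theta_j | phi ~ N(mu, tau^2)
ybar  <- rnorm(k, theta, sigma / sqrt(n_j))   # study means given theta_j

# Weight given to the common mean mu: larger for small, noisy studies
w <- (sigma^2 / n_j) / (sigma^2 / n_j + tau^2)

theta_hier <- w * mu + (1 - w) * ybar         # posterior means E(theta_j | ybar_j)
round(cbind(n_j, ybar, w, theta_hier), 3)
```

Small studies are pulled strongly towards µ, while large studies keep estimates close to their own means, which is the "borrowing strength" described above. In a full hierarchical analysis µ and τ would also receive priors and be estimated from the data.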
Ø The main application of hierarchical models has been to carry out
META−ANALYSES (quantitative synthesis of multiple studies).
Exchangeability
Ø Example: EFM: meta−analysis of trials with rare events. (Spiegelhalter et
al. 2004, p. 275)
EFM: Electronic fetal heart rate monitoring in labor.
Aim: Early detection of altered heart rate pattern and hence a potential
benefit in perinatal mortality.
Number of studies: 9 randomized trials
Outcome: Perinatal mortality measured as odds ratio in deaths per 1000
births (comparing EFM vs. control).
Statistical models:
Let $\theta_j = \log(\mathrm{OR}_j)$ and $y_j$ the observed log(OR) in study j; then
$y_j \sim N(\theta_j, \sigma^2)$
a) Approximate normal likelihood + fixed effects:
$\theta_j \sim N(\mu, \tau^2)$
b) Approximate normal likelihood + random effects:
$\theta_j \sim N(\mu, \tau^2)$, $\mu \sim$ Uniform, $\tau \sim$ Uniform
Let $r_{tk}$ be the observed deaths in the treatment group,
$r_{ck}$ the observed deaths in the control group, and
$n_{tk}$ and $n_{ck}$ the total numbers of patients in each group.
Then
$r_{tk} \sim \mathrm{Bin}(n_{tk}, p_{tk})$ and $r_{ck} \sim \mathrm{Bin}(n_{ck}, p_{ck})$
with
$\mathrm{logit}(p_{tk}) = \phi_k + \theta_k$ and $\mathrm{logit}(p_{ck}) = \phi_k$
c) Binomial likelihood + random effects (uniform risks)
$\theta_j \sim N(\mu, \tau^2)$, $p_{ck} \sim$ Uniform
d) Binomial likelihood + random effects (uniform logits)
$\theta_j \sim N(\mu, \tau^2)$, $\phi_k = \mathrm{logit}(p_{ck}) \sim$ Uniform
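To make the "approximate normal likelihood" concrete, the sketch below computes an observed log odds ratio and its usual large-sample variance (the sum of the reciprocals of the four cell counts) for a single trial. The counts are hypothetical and are not the EFM data; the variance formula is a standard approximation that is not stated in the notes.

```r
# Hypothetical counts for one trial (not the EFM data)
r_t <- 3; n_t <- 1000   # deaths and births in the treatment (EFM) group
r_c <- 6; n_c <- 1000   # deaths and births in the control group

# Observed log odds ratio
y <- log((r_t / (n_t - r_t)) / (r_c / (n_c - r_c)))

# Large-sample approximation to its variance
s2 <- 1 / r_t + 1 / (n_t - r_t) + 1 / r_c + 1 / (n_c - r_c)

c(logOR = y, se = sqrt(s2))
```

With rare events the counts are small, so the standard error is large and the approximation is rough (and undefined when a cell count is zero). That is one motivation for the exact binomial likelihoods in models c) and d).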
Appendix (Computational Aspects)
Ø There are several packages available for producing statistical analyses
(Bayesian or Frequentist).
Ø Depending on the type of analysis and the technique, the choice of one
package or another could make our life easier (or more miserable).
Ø In general, we can classify the packages in two types:
1. Windows−based packages: (windows = menus)
o Simple to use: Follow the menus
o Little or no freedom: The types of analysis are constrained to the
available routines
o Examples: Statgraphics, Minitab, SPSS, etc…
2. Program−based packages:
o More complicated to use: Need to write your own code (not from
scratch, there are usually lots of available commands)
o Much freedom: The types of analysis are open to the imagination or
needs of the researcher
o Examples: Splus, R, Matlab, WinBUGS, etc…
Ø For descriptive statistics (exploratory data analysis) we recommend the use
of a windows-based package
Ø For a more complete inferential analysis (probability model involved) we
recommend a windows-based package, if the routine is available, otherwise
we will require a program-based package.
Ø For a Bayesian analysis we will necessarily require a program-based
package, for example WinBUGS.
Ø R (OR SPLUS) ↔ WINBUGS:
o These two packages share a similar syntax, although the type of
analysis they produce is different
o Both packages are freely available
o Owing to the flexibility of writing your own code, R has become very
popular among applied statisticians. Nowadays there are plenty of “R
packages” (routines) freely available for most statistical applications
(Frequentist or Bayesian)
o Some Bayesian books provide WinBUGS code for their
examples. Unfortunately, our reference book (Spiegelhalter et al. 2004)
does not. However, another book that does provide the code is:
§ Congdon, P. (2001). Bayesian Statistical Modelling, Wiley:
Chichester. (ftp://www.wiley.co.uk/pub/books/congdon)
#R commands for some graphs
#An electronic version of this file can be found at
#http://allman.rhon.itam.mx/~lnieto/index_archivos/CommandsR.txt

#Reading data sets
a331<-read.table("http://allman.rhon.itam.mx/~lnieto/index_archivos/A331a.txt",row.names=1)
#Assigning names to the columns (variables)
dimnames(a331)[[2]]<-c("Type","Sex","Age","TimeCases","TimeControls","CID")
#Attach the database to the search path
attach(a331)

#Barplot for Sex
barplot(table(Sex)/dim(a331)[1],names.arg=c("Female","Male"),xlab="Sex",
  legend.text=c("0.47","0.53"),col=2:3)
title("Barplot for Sex")

#Pie chart for Type
pie(table(Type),labels=c("Stomach","Bronchus","Colon","Rectum",
  "Ovary","Breast","Bladder","Kidney","Gall_Bladder","Esophagus",
  "Reticulum","Prostate","Uterus","Brain","Pancreas","CLL"),col=2:17)
title("Piechart for Type (in percentage)")
legend(-1.2,0.8,paste(table(Type)),fill=2:17)

#Stem and leaf chart for Age
stem(Age)

#Frequency distribution for Age
age.h<-hist(Age,plot=FALSE)
n<-length(age.h$breaks)
#Sample size taken from the counts (it was hard-coded as 100)
n.obs<-sum(age.h$counts)
age.h1<-cbind(age.h$breaks[1:(n-1)],age.h$breaks[2:n],age.h$counts,
  age.h$counts/n.obs,cumsum(age.h$counts),cumsum(age.h$counts/n.obs))
dimnames(age.h1)[[2]]<-c("lower.l","upper.l","freq","rel.freq","cum.freq","rel.cum.freq")
age.h1

#Histogram for Age
hist(Age,probability=TRUE)

#Kaplan-Meier plots
library(survival)
#Current versions of survfit() require a formula; ~1 fits a single curve
plot(survfit(Surv(TimeCases,CID)~1),conf.int=FALSE,xlab="Time (days)",ylab="Survival probability")
#Controls are treated as uncensored (status equal to 1 for everyone)
lines(survfit(Surv(TimeControls,rep(1,dim(a331)[1]))~1),lty=3)
legend(2000,0.9,c("Cases","Controls"),lty=c(1,3))
title("Kaplan-Meier plots for Cases and Controls")

#Summary statistics
summary(Age)
survfit(Surv(TimeCases,CID)~1,conf.int=0.95)
survfit(Surv(TimeControls,CID)~1,conf.int=0.95)
#Boxplots for Cases and Controls
boxplot(TimeCases,TimeControls,names=c("Cases","Controls"))

#Prior for theta
u<-seq(0,1,.01)
plot(u,dbeta(u,3,3),type="l",xlab="theta",ylab="",ylim=c(0,3.4))
legend(0.6,3.1,"Prior",lty=1)

#Prior + Likelihood
plot(u,dbeta(u,3,3),type="l",xlab="theta",ylab="",ylim=c(0,3.4))
lines(u,dbeta(u,3,9),lty=3)
legend(0.6,3.1,c("Prior","Likelihood"),lty=c(1,3))

#Prior + Likelihood + Posterior
plot(u,dbeta(u,3,3),type="l",xlab="theta",ylab="",ylim=c(0,3.4))
lines(u,dbeta(u,3,9),lty=3)
lines(u,dbeta(u,5,11),lty=4)
legend(0.6,3.1,c("Prior","Likelihood","Posterior"),lty=c(1,3,4))

#Defining new function: Beta-Binomial density
#(the braces are needed so that all three statements belong to the function)
dbebin<-function(x, n = 1, a = 1, b = 1) {
  y <- gamma(a + b)/gamma(a)/gamma(b)/gamma(a + b + n)
  y <- y * choose(n, x) * gamma(a + x) * gamma(b + n - x)
  y
}

#Plot of conditional & predictive densities (same graph)
fx<-cbind(dbinom(0:10,10,0.31),dbebin(0:10,10,5,11))
barplot(t(fx),xlab="Y",col=2:3,beside=TRUE,legend.text=c("Binomial","Beta-Binomial"))
title("Predictive distribution")

#Plot of conditional & predictive densities (different graphs)
par(mfrow=c(2,1))
barplot(t(fx[,1]),xlab="Y",col=2,ylim=c(0,0.25))
title("Binomial distribution")
barplot(t(fx[,2]),xlab="Y",col=3,ylim=c(0,0.25))
title("Beta-Binomial distribution")

#Two noninformative prior densities
x<-seq(0,1,0.01)
n<-length(x)
#Drop the end points, where the Jeffreys density is infinite
x<-x[-c(1,n)]
par(mfrow=c(1,1))
plot(x,dbeta(x,0.5,0.5),type="l",xlab="theta",ylab="Density")
lines(x,dunif(x),lty=3)
legend(0.4,3,legend=c("Jeffreys","Uniform"),lty=c(1,3))
title("Noninformative priors")

#Informative prior for Tyrosine
x<-seq(0,80,.01)
plot(x,dnorm(x,39,sqrt(219.47)),type="l",xlab="theta",ylab="Density")
title("Informative prior for Tyrosine")