Slides Casarin Monte Carlo



    Contents

1 A Matlab Primer
  1.1 Programming Languages
  1.2 Fourth Generation Languages (4GPL)
  1.3 Matlab
      1.3.1 Operators
      1.3.2 Logical Operators
      1.3.3 Creating Matrices
      1.3.4 Matrix Description
      1.3.5 Other Functions
      1.3.6 Loops and If Statements
      1.3.7 Procedures
  1.4 Examples
      1.4.1 Input, Output and Graphics
      1.4.2 Ordinary Least Square
      1.4.3 A Bayesian Linear Regression Model
  1.5 From Matlab to Scilab and R

2 Monte Carlo Integration
  2.1 Integration
  2.2 A Monte Carlo Estimator
  2.3 Asymptotic Properties
  2.4 Optimal Number of MC Samples
  2.5 Appendix - Matlab Code

3 Importance Sampling
  3.1 Importance Sampling
  3.2 Properties of the IS Estimators
  3.3 Generating Student-t Variables


    Chapter 1

    A Matlab Primer

    Aim

    Learn some basic facts in Matlab programming

    Contents

    1. Programming Languages

    2. Fourth Generation Languages (4GPL)

    3. Matlab

    4. Examples

5. From Matlab to Scilab and R

    1.1 Programming Languages

If you need to carry out an econometric analysis, before starting to write code you may want to have a look at the following link

http://www.feweb.vu.nl/econometriclinks/software.html

where many of the most widely used econometric software packages and their contributed libraries are linked.


In the following we report a brief description of the software packages listed at the econometriclinks webpage maintained by the Royal Economic Society:

A+, ACML, ADMB, AIMMS, ALOGIT, Alyuda, AMOS, AMPL, APL, Apophenia, Arc, AREMOS, AutoBox, Autometrics, AutoSignal

B34S, BACC, BATS, BETA, BIOGEME, BMDP, Brodgar, BUGS, BV4

BACC: Bayesian Analysis, Computation and Communication. Free high-quality generic software developed for different operating systems (Windows, Unix) and different front-ends. Specific model procedures as well. Supported by the US NSF. Developed by Bill McCausland under the supervision of John Geweke.

    BUGS: Bayesian inference Using Gibbs Sampling (MCMC: Markov Chain

    Monte Carlo)

C(++), CART, Census X12, Caterpillar-SSA, CPLEX, ConfortS, CVar

DataDesk, Dataplore, Dataplot, DATAVIEW, DEA-Solver, DEMETRA, Draco, DYALOG, DYNARE

DYNARE: A Program for the Resolution and Simulation of Dynamic Models with Forward Variables Through the Use of a Relaxation Algorithm. Computes k-th order approximations of dynamic stochastic general equilibrium (DSGE) models. Also allows Bayesian estimation of DSGEs.

    EasyFit, EasyReg, EcoWin, ECTS, EQS, Eviews, Excel, EXPO

    FAME, ForecastPro, Fortran, FreeFore, FSQP

GAMS, GARCH, GAUSS, GAUSSX, GiveWin, Gempack, GeoDa, Genstat, GLIM, GLIMMIX, GQOPT, graphpad, Gnuplot, GSL, GRETL

    GAMS: Generic Algebraic Modeling System for large scale optimization

    problems.

  • 8/10/2019 Slides Casarin Monte carlo

    7/46

    1.1. PROGRAMMING LANGUAGES 3

GAUSS: is a programming language designed to operate with and on matrices. It is a general purpose tool. As such, it is a long way from more specialised econometric packages. On a spectrum which runs from the computer language C at one end to, say, the menu-driven econometric program EViews at the other, GAUSS is very much at the programming end.

GRETL: is a cross-platform software package for econometric analysis, written in the C programming language. It is free, open-source software.

    HLM

    ICRFS-Plus, ILOG, IDAMS, IMSL, INSTAT, ITSM J, JMP, JMulti, JStatCom, JWAVE

    KNITRO

MacAnova, Maple, Mendeley, MARS, Mathcad, Mathematica, MathPlayer, MathML, MathType, MATLAB, Matrixer, M@ximize, MetrixND,

    MHTS, Microfit, MiKTeX, Minitab, MINOS, MIXOR, MLE, MLwiN,

    Modeleasy, ModelQED, Modler, MOSEK, Mplus, Modula, MuPAD,

    Mx.

MATLAB: It is a high-level language and more specifically a 4GPL (such as SAS, SPSS, Stata, GAUSS) which allows matrix manipulations for numerical computing.

NAG Mark 22 Numerical Libraries (2009), Genstat, MLP (ML estimation)

Octave, O-Matrix, Omegahat, OpenDX, Ox, OxEdit, OxGauss, OxMetrics

Octave: a high-level language, primarily intended for numerical computations. It provides a convenient command line interface for solving linear and nonlinear problems numerically, and for performing other numerical experiments using a language that is mostly compatible with Matlab. It may also be used as a batch-oriented language.

    Ox: is an object-oriented matrix programming language for statistics and

    econometrics developed by Jurgen Doornik

PASS, PASW, PcFiml, PcGets, PcGive, PcNaive, Python

Python: Free open-source dynamic object-oriented programming language that can be used for many kinds of software development. It offers strong support for integration with other languages and tools, and comes with extensive standard libraries.

R, RATS, REG-X, ReSampling Stats, Rlab, Rlab+

R: is a 4GPL; it is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

    RATS: developed by Estima, RATS (Regression Analysis of Time Series) is

    an econometrics and time-series analysis software package.

S+, SAS, SCA, Scilab, SciPy, SciViews, Sciword, SCP, Shazam, Sigmaplot, SIMSTAT, SOLAS, SOL, Soritec, SpaceStat, SQlite, SPAD, Speakeasy, IBM SPSS, SsfPack, STAMP, Stata, StatCrunch, Statgraphics, Statistica, Stat/Transfer, StatsDirect, STL, Statview, SUDAAN, SVAR, SYSTAT

SAS: is a 4GPL which allows the user to define a sequence of operations (statistical analysis and data management) to be performed on data.

Scilab: is a free and open-source 4GPL for numerical computation, similar to Matlab.

TSM, TISEAN, TRAMO/SEATS, TSP, TVAR

TRAMO/SEATS:

    UNISTAT, VassarStats, ViSta


Web Decomp, WebStat, WEKA, WinIDAMS, WINKS, Windows KWIKSTAT, XploRe, Winsolve, X-12-ARIMA, XLisp-Stat, Xtremes, X(G)PL

    1.2 Fourth Generation Languages (4GPL)

Each step in the development of computer languages has aimed to reduce the amount of time required to write programs and to reduce the amount of skill required to write them.

In the 1GPL the programs are written in binary code and can access binary digits. Writing programs in a 1GPL is a very skilled job and it is very time consuming to test and debug programs.

In the 2GPL, the programs are written in symbolic assembly code, they access bytes and are slightly less time demanding.

In the 3GPL, the programs are written in a high-level language (e.g. COBOL, Pascal, C, Fortran, etc.), they can access records, and programming requires less time and skill.

In the 4GPL, the programs perform Boolean operations on (mathematical) sets and require even less time and skill. A well-known example of a 4GPL is SQL.

Scilab, Matlab, Gauss and R (see

http://www.scilab.org/
http://www.mathworks.it/
http://www.aptech.com/
http://www.r-project.org/

) are 4GPLs and have some common features. They are a long way from more specialised econometric packages, are not menu-driven programs (such as EViews) and are very much at the programming end. Thus all of them require a certain degree of familiarity with programming methods and structures.


Another common feature is that they are extremely powerful for matrix manipulation and in this sense they are more useful for economists than 3GPL programming languages (such as C or Fortran), where the basic data units are all scalars. At the same time they are very flexible and allow more expert users to interface with procedures written in other languages such as C, C++, or Fortran.

An important feature of Scilab and R is that the source code of their libraries is available, which is not generally the case for Matlab and Gauss. Finally, note that Matlab, Gauss and R have a lot of proprietary and contributed libraries oriented to statistics and econometrics.

    1.3 Matlab

    1.3.1 Operators

Select a submatrix from a matrix: x( startrow : endrow, startcolumn : endcolumn )
Transposition operator: '
Matrix operators: + - * / \ ^
Element-by-element operators: .* ./ .\ .^
Concatenating operators: [leftmatrix, rightmatrix]  [uppermatrix; bottommatrix]
Relational operators: < <= > >= == ~=
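As a quick illustration (the matrices below are arbitrary), the operators listed above can be used as follows:

A = [1 2; 3 4];
B = [5 6; 7 8];
A(1:2, 2)        % submatrix: rows 1 to 2 of column 2
A'               % transposition
A*B              % matrix product
A.*B             % element-by-element product
[A, B]           % horizontal concatenation
[A; B]           % vertical concatenation
A >= 2           % relational operator, applied element by element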


y = ceil( x );
y = floor( x );
y = reshape( x, r, c );
Kronecker product: kron( x, y )
y = trimr( x, t, b );

    1.3.6 Loops and If Statements

for i=start:step:stop;

    ...

    end;

    while logical expression;

    ...

    end;

    if logical expression 1;

...
elseif logical expression 2;

    ...

    else;

    ...

    end;

Example of a do loop with a counter:

i=1;
while (i<=100);

...

i=i+1;

end;
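A complete toy example combining the for loop and the if statement (the threshold 0.5 is arbitrary):

x = rand(100,1);
count = 0;
for i=1:100
    if x(i) > 0.5
        count = count + 1;   % count the draws above the threshold
    end
end
count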


    end;

    end;

    %*************************************************

    % Some Pictures...

    %*************************************************

    % figure(1) to have distinct graphs

    figure(1);

title('Time series data');
ylabel('Data');
xlabel('Time');
plot(xx,yy);
figure(2);
title('Time-varying log-volatility');
a=plot(xx,s,'color',[1 0 0]); %[red green blue] the rgb convention
axis([1 n min(s) max(s)]); % Set tics
figure(3);
title('Dummy');
plot(xx,d,'color',[1 0 0]); %[red green blue] the rgb convention
axis([1 n -0.1 1.1]); % Set tics
%*************************************************
% All charts in one picture...
%*************************************************
figure(4);
subplot(3,1,1);
title('Time series data');
ylabel('Data');
xlabel('Time');
plot(xx,yy);
subplot(3,1,2);
title('Time-varying log-volatility');
plot(xx,s,'color',[1 0 0]); %[red green blue] the rgb convention
axis([1 n min(s) max(s)]); % Set tics
subplot(3,1,3);
title('Dummy');
plot(xx,d,'color',[1 0 0]); %[red green blue] the rgb convention
axis([1 n -0.1 1.1]); % Set tics
%*************************************************
% histogram
%*************************************************
figure(5);
hist(yy,50);
%*************************************************
% Save the results in an output file
%*************************************************
fid = fopen('C:/Dottorato/Teaching/SummerSchoolBertinoro/...
TutorialAntonietta/TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/OutPound.txt', 'w');
fprintf(fid, '%5.2f\n', yy);
fclose(fid);


    %*************************************************

    1.4.2 Ordinary Least Square

    We learn how to use structures in Matlab

    function results=ols(y,x)

    % PURPOSE: least-squares regression

    %---------------------------------------------------

    % USAGE: results = ols(y,x)

% where: y = dependent variable vector (nobs x 1)
% x = independent variables matrix (nobs x nvar)
%---------------------------------------------------
% RETURNS: a structure
% results.meth = 'ols'
% results.beta = bhat
% results.tstat = t-stats
% results.yhat = yhat
% results.resid = residuals
% results.sige = e'*e/(n-k)
% results.rsqr = rsquared

    % results.rbar = rbar-squared

    % results.dw = Durbin-Watson Statistic

    % results.nobs = nobs

    % results.nvar = nvars

    % results.y = y data vector

Check for the correct number of input arguments and whether the number of rows of x is equal to the number of rows of y.

if (nargin ~= 2); error('Wrong # of arguments to ols');
else
[nobs nvar] = size(x); [nobs2 ndep] = size(y);
if (nobs ~= nobs2); error('x and y must have same # obs in ols'); end;

    end;

    k=nvar;

Evaluate all the statistics that are usually involved in an OLS estimation.

results.y = y;
results.nobs = nobs;
results.nvar = nvar;
xpxi = (x'*x)\eye(k);
results.beta = xpxi*(x'*y);
results.yhat = x*results.beta;
results.resid = y - results.yhat;
sigu = results.resid'*results.resid;
results.sige = sigu/(nobs-nvar);
tmp = (results.sige)*(diag(xpxi));
results.tstat = results.beta./(sqrt(tmp));


    ym = y - mean(y);

rsqr1 = sigu; rsqr2 = ym'*ym;
results.rsqr = 1.0 - rsqr1/rsqr2; % r-squared
rsqr1 = rsqr1/(nobs-nvar);
rsqr2 = rsqr2/(nobs-1.0);
results.rbar = 1 - (rsqr1/rsqr2); % rbar-squared
ediff = results.resid(2:nobs) - results.resid(1:nobs-1);
results.dw = (ediff'*ediff)/sigu; % durbin-watson

    end;

We save the code as the function ols.m and run the following simulation example.

    nob=100;

    x1=ones(nob,1);

x2=randn(nob,1).*((1:nob)'/10);
x=[x1 x2];
sig=2;
y=x*[10; 0.9]+sig*randn(nob,1);

    res=ols(y,x);

    res.beta

    %%

    figure(1)

    plot([res.yhat y]);

figure(2)
plot(res.resid);

    1.4.3 A Bayesian Linear Regression Model

Let $y \in \mathbb{R}^n$, $X \in \mathbb{R}^{n \times k}$ and $\beta \in \mathbb{R}^k$. Consider the simple regression model

(1.1) $y = X\beta + \varepsilon$

(1.2) $\varepsilon \sim \mathcal{N}_n(0_n, \sigma^2 I_n)$

with the following prior specification

(1.3) $R\beta \sim \mathcal{N}(r, T)$

or equivalently

(1.4) $Q\beta \sim \mathcal{N}(q, I_k)$

where $Q'Q = T^{-1}$ and $q = Qr$.
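As a minimal sketch (not the notes' own estimation code), the Theil-Goldberger (mixed) estimate of $\beta$ under this prior can be computed directly when $R = I_k$ and $\sigma^2$ is treated as known; the simulated data and the prior values r and T below are illustrative choices.

% Hedged sketch: Theil-Goldberger (mixed) estimate of beta for model (1.1)-(1.4),
% taking R = I_k and treating sigma^2 as known. Prior values are illustrative.
nob = 100;
x   = [ones(nob,1), randn(nob,1)];
sig = 2;
y   = x*[10; 0.9] + sig*randn(nob,1);   % simulated data
r   = [9; 1];                           % prior mean
T   = 4*eye(2);                         % prior covariance
Q   = chol(inv(T));                     % so that Q'*Q = inv(T), as in (1.4)
q   = Q*r;
bhat = (x'*x/sig^2 + Q'*Q) \ (x'*y/sig^2 + Q'*q);   % mixed / posterior mean estimate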


    0.9610 11.2966 0

    0.9200 11.2740 0

    Theil-Goldberger estimates

    1.0037

    0.9569

    0.9198

We now apply the inference procedure to a financial dataset. We consider monthly data on the short-term interest rate (the three-month Treasury Bill rate) and on the AAA corporate bond yield in the USA. As Treasury Bill notes and AAA bonds are both low-risk securities, one could expect that there is a relationship between their interest rates. We consider data from January 1950 to December 1999.

Let $y_i$ be the monthly change in the Treasury Bill rate and $z_i$ the monthly change in the AAA bond rate. We fit to this set of data the heteroscedastic model presented above with

$y_i = \beta_1 + \beta_2 z_i + \varepsilon_i$

which corresponds to setting $x_i = (1, z_i)$ and $\beta = (\beta_1, \beta_2)'$ in the multivariate regression model given above. The results of the estimation procedure are

Gibbs sampling estimates

Coefficient   t-statistic   t-probability
0.0053        0.7805        0.2177
0.2751        19.8628       0

Theil-Goldberger estimates

0.0057
0.2747



Figure 1.1: Actual and fitted data (top) and residuals (bottom) using the Bayesian estimates of the linear regression model.

The estimates of $\sigma^2$ are 0.0283 for the Gibbs sampler and 0.0282 for the Theil-Goldberger procedure.

The actual and fitted data and the residuals are given in Fig. 1.1. The plot of the residuals shows that in the second half of the sample (say after 1975) the variance is underestimated. More precisely, one should account in the model for the time variation in the variance of the data. This calls for heteroscedastic linear regression models (see Chapter ??) or for nonlinear models such as stochastic volatility models (see Chapters ?? and ??).



    //*************************************************

// Some Pictures...
//*************************************************

    // figure(1) to have distinct graphs

    figure(1);

    title("Time series data");

    ylabel("Data");

    xlabel("Time");

    plot(xx,yy);

    figure(2);

    title("Time-varying log-volatility");

plot(xx,s,'color',[1 0 0]); //[red green blue] the rgb convention

    a=gca();

    a.data_bounds=[1,min(s);n,max(s)];// Set tics

    figure(3);

    title("Dummy");

plot(xx,d,'color',[1 0 0]); //[red green blue] the rgb convention

    a=gca();

    a.data_bounds=[1,-0.1;n,1.1];// Set tics

    //*************************************************

// All charts in one picture...

    //*************************************************

    figure(4);

    subplot(3,1,1);

    title("Time series data");

    ylabel("Data");

    xlabel("Time");

    plot(xx,yy);

    subplot(3,1,2);

    title("Time-varying log-volatility");

plot(xx,s,'color',[1 0 0]); //[red green blue] the rgb convention

    a=gca();

    a.data_bounds=[1,min(s);n,max(s)];// Set tics

    subplot(3,1,3);

    title("Dummy");plot(xx,d,color,[1 0 0]); //[red green blue] the rgb convention

    a=gca();

    a.data_bounds=[1,-0.1;n,1.1];// Set tics

    //*************************************************

    // histogram


    //*************************************************

figure(5);
histplot(100,yy);

    //*************************************************

    // Save the results in a ouput file

    //*************************************************

fprintfMat("C:/Dottorato/Teaching/SummerSchoolBertinoro/TutorialAntonietta/...
TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/OutPound.txt", yy, "%5.2f");

    // attention this overwrites the existing file

    R

    #*************************************************

    # basic in I/O, graphical, statistical procedures

    #*************************************************

    # Load UK/EU exchange rate data

    yy=scan("C:/Dottorato/Teaching/SummerSchoolBertinoro/TutorialAntonietta/...

    TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/pound.txt",sep="\t",skip=0,na.strings=".")

    dim(yy)=c(1006,1);

    #*************************************************

    n=dim(yy); # evaluate the number of rows #

    n=n[1];

    xx=(1:n);

#*************************************************
# for endfor if end

    # (1) Evaluate sequentially the variance

    # (2) Built a dummy variable, based on the value

    # of the variance estimated recursively

    #*************************************************

    wn=10; # set the value of a variable#

    s=array(0,n); # define a n-dim null vector #

    d=array(0,n);

    for (j in ((wn+1):n)){

    s[j]=var(yy[(j-wn+1):j]);

    if (s[j]>0.45){

    d[j]=1;

    }

    }

    #*************************************************

    # Some Pictures...

    #*************************************************

    # figure(1) to have distinct graphs


dev.new();
plot(xx,yy,main="Time series data",xlab="Time",ylab="Data",type="l");

    dev.new();

    plot(xx,s,main="Time-varying log-volatility",xlab="Time",ylab="Data",type="l");

    #[red green blue] the rgb convention

    dev.new();

    plot(xx,d,main="Dummy",xlab="Time",ylab="Data",type="l");

    #[red green blue] the rgb convention

    #*************************************************

# All charts in one picture...

    #*************************************************

par(mfrow=c(3,1),pin=c(5,1.5));
plot(xx,yy,main="Time series data",xlab="Time",ylab="Data",type="l");

    plot(xx,s,main="Time-varying log-volatility",xlab="Time",ylab="Data",type="l");

    #[red green blue] the rgb convention

    plot(xx,d,main="Dummy",xlab="Time",ylab="Data",type="l");

    #[red green blue] the rgb convention

    #*************************************************

    # histogram

    #*************************************************

    dev.new();

    hist(yy,50);

    #*************************************************

    # Save the results in a ouput file

    #*************************************************

    save(yy, file = "C:/Dottorato/Teaching/SummerSchoolBertinoro/TutorialAntonietta/...

    TutorialRobAnt/AllLab/MatlabCode/ChapterMatlab/OutPound.txt");


    Chapter 2

    Monte Carlo Integration

    Aim

Apply basic Monte Carlo principles to solve some basic integration problems. Discuss the choice of the number of samples in a Monte Carlo estimation.

    Contents

    1. Integration

    2. A Monte Carlo Estimator

    3. Asymptotic Properties

    4. Optimal Number of MC Samples

    5. Appendix - Matlab Code

2.1 Integration

Our aim is to approximate the integral

(2.1) $\theta(f) = \int_0^1 f(x)\,dx$


for the following integrand functions f:

1. $f(x) = x$

2. $f(x) = x^2$

3. $f(x) = \cos(\pi x)$

We apply a Monte Carlo approach and re-write the integration problem in statistical terms as follows

(2.2) $\int_0^1 f(x)\,dx = \int_{-\infty}^{+\infty} f(x)\,\mathbb{I}_{[0,1]}(x)\,dx = \mathbb{E}(f(X))$

where $\mathbb{I}_A(x)$ is the indicator function, which equals 1 if $x \in A$ and 0 otherwise, and $X \sim \mathcal{U}_{[0,1]}$ is a random variable with a standard uniform distribution.

    2.2 A Monte Carlo Estimator

Let $X_1, \ldots, X_n$ be a set of $n$ i.i.d. samples from a uniform distribution. The integral $\theta = \mathbb{E}(f(X))$ is approximated as follows

(2.3) $\hat\theta_n = \frac{1}{n} \sum_{i=1}^{n} f(X_i)$

which is called a Monte Carlo estimator of $\mathbb{E}(f(X))$.

The results of the Monte Carlo estimates for different sample sizes $n = 1, \ldots, 50$ and different integrand functions $f$ are given in Fig. 2.1.

Find the mean and the variance of the estimator and give a Monte Carlo approximation for the expression of the variance.


The variance of the Monte Carlo estimator is $\mathbb{V}(\hat\theta_n) = \sigma^2(f)/n$, where

$\sigma^2(f) = \mathbb{V}(f(X_1)) = \int_{-\infty}^{+\infty} (f(x) - \theta)^2\, \mathbb{I}_{[0,1]}(x)\,dx$

For the different $f$ we find the analytical solution of the integral $\theta(f)$ (see also the horizontal dotted lines in Fig. 2.1):

1. For $f(x) = x$

(2.6) $\mathbb{E}(f(X_1)) = \int_0^1 x\,dx = \left[\tfrac{1}{2}x^2\right]_0^1 = 1/2$

2. For $f(x) = x^2$

(2.7) $\mathbb{E}(f(X_1)) = \int_0^1 x^2\,dx = \left[\tfrac{1}{3}x^3\right]_0^1 = 1/3$

3. For $f(x) = \cos(\pi x)$

(2.8) $\mathbb{E}(f(X_1)) = \int_0^1 \cos(\pi x)\,dx = \left[\tfrac{1}{\pi}\sin(\pi x)\right]_0^1 = 0$

    2.3 Asymptotic Properties

Under the i.i.d. and finite variance assumptions we have

(2.9) $\hat\theta_n \xrightarrow{a.s.} \theta$

(2.10) $\sqrt{n}\,(\hat\theta_n - \theta) \xrightarrow{D} \mathcal{N}(0, \sigma^2(f))$

For the different $f$ we have


1. For $f(x) = x$

$\mathbb{V}(f(X_1)) = \mathbb{E}(f(X_1)^2) - (\mathbb{E}(f(X_1)))^2 = \int_0^1 x^2\,dx - \left(\int_0^1 x\,dx\right)^2 = 1/3 - 1/4 = 1/12$

2. For $f(x) = x^2$

$\mathbb{V}(f(X_1)) = 1/5 - 1/9 = 4/45$

3. For $f(x) = \cos(\pi x)$

$\mathbb{V}(f(X_1)) = 1/2 - 0 = 1/2$
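For instance, the second and third values follow from the raw moments of the standard uniform distribution:

$\mathbb{E}(f(X_1)^2) = \int_0^1 x^4\,dx = \tfrac{1}{5}, \qquad (\mathbb{E}(f(X_1)))^2 = \left(\tfrac{1}{3}\right)^2 = \tfrac{1}{9}, \qquad \mathbb{V}(f(X_1)) = \tfrac{1}{5} - \tfrac{1}{9} = \tfrac{4}{45}$

for $f(x) = x^2$, while $\int_0^1 \cos^2(\pi x)\,dx = \tfrac{1}{2}$ together with $\mathbb{E}(\cos(\pi X_1)) = 0$ gives $\mathbb{V}(f(X_1)) = \tfrac{1}{2}$ for $f(x) = \cos(\pi x)$.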

When the variance $\mathbb{V}(f(X_1))$ is unknown one can use the Monte Carlo estimator

(2.11) $\hat\sigma^2_n(f) = \frac{1}{n-1} \sum_{i=1}^{n} \big(f(X_i) - \hat\theta_n\big)^2$

The empirical approximations of the asymptotic variances are given in Fig. 2.2.

Figure 2.2: Monte Carlo variance estimates $\hat\sigma^2_n(f)$ (solid lines) for different sample sizes $n = 1, \ldots, 50$ and the true value $\sigma^2(f)$ (horizontal dotted lines).

Exercise: use the asymptotic distribution and the approximation of the asymptotic variance to find the 5% confidence intervals of the MC estimator of $\theta$.

    2.4 Optimal Number of MC Samples

It is possible to use the asymptotic properties of a MC estimator to find the optimal number $n$ of samples that are necessary to reach a given accuracy level $\varepsilon$, for a given confidence level $1-\alpha$, in the Monte Carlo estimation of $\theta$. The asymptotic results allow us to find $n$ such that

(2.12) $\Pr\left(|\hat\theta_n - \theta| \le \varepsilon\right) \approx 2\,\Phi\!\left(\varepsilon\sqrt{n/\sigma^2(f)}\right) - 1 = 1 - \alpha$

that is

(2.13) $x_\alpha = \varepsilon\sqrt{n/\sigma^2(f)}$, i.e. $n = \dfrac{x_\alpha^2\,\sigma^2(f)}{\varepsilon^2}$


where $x_\alpha = \Phi^{-1}(1 - \alpha/2)$, with $\Phi^{-1}$ the inverse cumulative distribution function of a standard normal.

When the variance $\sigma^2(f)$ is unknown one can use the Monte Carlo estimator $\hat\sigma^2_n(f)$ and then apply a similar asymptotic argument. In this case the optimal number of simulations should satisfy the following relationship

(2.14) $\hat\sigma^2_n(f) \le \dfrac{n\,\varepsilon^2}{x_\alpha^2}$

One can check the condition iteratively:

1. Start with $n_1$ MC samples $X_1, \ldots, X_{n_1}$.

2. If $\hat\sigma^2_{n_1}(f) \le n_1\,\varepsilon^2 / x_\alpha^2$ then stop, otherwise

3. evaluate $k_1 = \lfloor x_\alpha^2\,\hat\sigma^2_{n_1}(f)/\varepsilon^2 \rfloor - n_1$ and generate $k_1$ additional samples $X_{n_1+1}, \ldots, X_{n_1+k_1}$ ($\lfloor x \rfloor$ indicates the integer part of $x$).

Exercise: write a Matlab code for computing the optimal number of samples that are needed to estimate $\theta(f)$ for the different integrand functions $f$ given in Section 1 and for the accuracy level $\varepsilon = 0.001$.
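One possible sketch of this iterative rule in Matlab (the integrand, the initial sample size and the confidence level are illustrative choices; norminv is from the Statistics Toolbox):

% Iteratively increase the number of MC samples until condition (2.14) holds
% (sketch; integrand, n1 and alpha are illustrative choices).
f     = @(x) x.^2;               % one of the integrands of Section 1
eps0  = 0.001;                   % accuracy level epsilon
alpha = 0.05;                    % so that 1-alpha = 0.95
xa    = norminv(1 - alpha/2);    % x_alpha = Phi^{-1}(1 - alpha/2)
n     = 100;                     % initial sample size n_1
x     = rand(n,1);               % uniform samples
while var(f(x)) > n*eps0^2/xa^2
    k = max(floor(xa^2*var(f(x))/eps0^2) - n, 1);   % additional samples needed
    x = [x; rand(k,1)];
    n = n + k;
end
fprintf('n = %d, MC estimate = %f\n', n, mean(f(x)));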

2.5 Appendix - Matlab Code

% Uniform Random Number
% Monte Carlo method as an approximated integration technique
% integrate f(x) on the [0,1] interval
% solution: 1/2, 1/3, and 0
clc;
n=50;
x=rand(n,1);
gav=zeros(n,3);
gavvar=NaN(n,3);
gav(1,1)=x(1,1);
gav(1,2)=x(1,1)^2;
gav(1,3)=cos(pi*x(1,1));
for i=2:n
gav(i,1)=sum(x(1:i))/i;
gav(i,2)=sum(x(1:i).^2)/i;
gav(i,3)=sum(cos(pi*x(1:i)))/i;
gavvar(i,1)=var(x(1:i));


gavvar(i,2)=var(x(1:i).^2);
gavvar(i,3)=var(cos(pi*x(1:i)));
end
%
%
%%%%%%%%% Graphics (mean) %%%%%%%%%%
figure(1);
subplot(3,1,1);
plot(gav(:,1));
line((1:n),ones(n,1)/2,'color','red');
legend('Empirical Average','Theoretical Mean',...
'Location','NorthEastOutside');
title('f(x)=x');
%
subplot(3,1,2);
plot(gav(:,2));
line((1:n),ones(n,1)/3,'color','red');
legend('Empirical Average','Theoretical Mean',...
'Location','NorthEastOutside');
title('f(x)=x^2');
%
subplot(3,1,3);
plot(gav(:,3));
line((1:n),ones(n,1)*0,'color','red');
legend('Empirical Average','Theoretical Mean',...
'Location','NorthEastOutside');
title('f(x)=cos(\pi x)');

To export a picture to a .eps file one can use

%%%%%%%%% Export a picture %%%%%%%%%%%%%
dire='C:\Dottorato\Teaching\SummerSchoolBertinoro';
figu='\TutorialAntonietta\TutorialRobAnt\Figure\';
figname=strvcat([strcat(dire,figu,'MC1.eps')]);
print(gcf,'-depsc2', figname);
%
%%%%%%%%% Graphics (variance) %%%%%%%%%%
figure(2);
subplot(3,1,1);
plot(gavvar(:,1));
line((1:n),ones(n,1)/12,'color','red');
legend('Empirical Variance','Theoretical Variance',...
'Location','NorthEastOutside');
title('f(x)=x');
%
subplot(3,1,2);
plot(gavvar(:,2));
line((1:n),ones(n,1)*4/45,'color','red');
legend('Empirical Variance','Theoretical Variance',...
'Location','NorthEastOutside');
title('f(x)=x^2');
%
subplot(3,1,3);
plot(gavvar(:,3));
line((1:n),ones(n,1)*1/2,'color','red');
legend('Empirical Variance','Theoretical Variance',...


'Location','NorthEastOutside');
title('f(x)=cos(\pi x)');


    Chapter 3

    Importance Sampling

    Aim

Define and apply the importance sampling method and study its properties.

    Contents

    1. Importance Sampling (IS)

    2. Properties of the IS Estimators

    3. Generating Student-t Variables

    3.1 Importance Sampling

Let $\pi$ be a probability density function, $f$ a measurable function and

(3.1) $\theta = \mathbb{E}_\pi(f(X)) = \int f(x)\,\pi(x)\,dx$

the integral of interest.

    In importance sampling (see Section 3.3 in Robert and Casella (2004)) a

    distribution g (called importance distribution or instrumental distribution)


is used to apply a change of measure

(3.2) $\theta = \int \frac{\pi(x)}{g(x)}\, f(x)\, g(x)\,dx$

The resulting integral is then evaluated numerically by using an i.i.d. sample $X_1, \ldots, X_n$ from $g$

(3.3) $\hat\theta^{IS}_n = \frac{1}{n} \sum_{i=1}^{n} w(X_i)\, f(X_i)$

where

$w(X_i) = \dfrac{\pi(X_i)}{g(X_i)}, \quad i = 1, \ldots, n$

are called importance weights.
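A minimal Matlab illustration of (3.3), separate from the full experiment reported later in the chapter (the target, the proposal and the test function below are chosen only to give a known answer, $\mathbb{E}_\pi(X^2) = \nu/(\nu-2) = 1.2$ for $\pi = T(12,0,1)$):

% Minimal importance sampling sketch (illustrative choices of pi, g and f):
% estimate E_pi(X^2) for pi = T(12,0,1) using a T(7,0,1) proposal g.
n  = 50000;
x  = random('t', 7, n, 1);                 % i.i.d. sample from g
w  = pdf('t', x, 12) ./ pdf('t', x, 7);    % importance weights pi/g
theta_IS = mean(w .* x.^2)                 % IS estimator (3.3); true value 1.2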

    3.2 Properties of the IS Estimators

The Monte Carlo estimator $\hat\theta^{IS}_n$ of $\theta$ is unbiased:

$\mathbb{E}_g(\hat\theta^{IS}_n) = \int \frac{1}{n}\sum_{i=1}^{n} w(x_i) f(x_i) \prod_{i=1}^{n} g(x_i)\,dx_i = \int \frac{\pi(x_1)}{g(x_1)}\, f(x_1)\, g(x_1)\,dx_1 = \int f(x_1)\,\pi(x_1)\,dx_1 = \theta$

and it converges almost surely to $\theta$ under the assumption $\mathrm{supp}\,g \supseteq \mathrm{supp}\,\pi$.

Nevertheless, the existence of the variance and of a limiting distribution is not guaranteed. We shall notice that $\mathbb{V}_g(\hat\theta^{IS}_n) \le \mathbb{E}_g\big((\hat\theta^{IS}_n)^2\big)$, thus the condition we need to check is the existence of an upper bound for the second-order moment.

3.3 Generating Student-t Variables

Figure 3.1: Importance sampling weights for the proposal distributions $T(\nu^*, 0, 1)$, $\mathcal{N}(0, \nu/(\nu-2))$ and $C(0, 1)$.

    where< 0 and cumulative distribution function

    F(x) = 1

    x

    1

    (1 + ((u )/)2)du

    = 1

    2+

    1

    arctan

    x IR(x)

    The inverse c.d.f. method can be applied in order to generate from the

    Cauchy. IfX=F1(U), where U U[0,1], then X C(, ).
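Inverting $F$ gives the explicit generator (a standard identity) used in the Matlab code below:

$X = F^{-1}(U) = \mu + \sigma\,\tan\!\big(\pi\,(U - \tfrac{1}{2})\big), \qquad U \sim \mathcal{U}_{[0,1]},$

which for $\mu = 0$ and $\sigma = 1$ is exactly the line x3=tan((rand(1,1)-0.5)*pi) appearing in the code.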

From the results in Fig. 3.1 one can see that the importance weights for the Student-t and Cauchy proposals are stable, while the importance weights associated with the normal proposal exhibit some large jumps. For all the functions, the results in Fig. 3.2 show that the normal proposal produces jumps in the progressive averages (green lines) that are due to the unbounded variance of the estimator. However, for the first function the normal proposal behaves quite well when compared with the Cauchy and Student-t proposals. For the


second and third functions the Cauchy proposal seems to converge faster than the Student-t. In all the pictures we also plot (black lines) the approximation obtained with an exact simulation from a Student-t with $\nu = 12$.

Exercise - Use repeated Monte Carlo experiments to find the distribution of the estimator $\hat\theta^{IS}_n(f)$. Plot the 95% and 5% quantiles and the mean of the estimator for $n = 1, \ldots, 50000$.

    The Matlab code is

%%%%%%% Importance weight for T(nustar,0,1)
function w=w1(x,nu,nustar)
w=pdf('t',x,nu)/pdf('t',x,nustar);
end
%
%%%%%%% Importance weight for N(0,nu/(nu-2))
function w=w2(x,nu)
w=pdf('t',x,nu)/pdf('normal',x,0,sqrt(nu/(nu-2)));
end
%
%%%%%%% Importance weight for C(0,1)
function w=w3(x,nu)
w=pdf('t',x,nu)/pdfcauchy(x,0,1);
end
%
clc;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Importance sampling
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
nu=12;
nustar=7;
nIS=50000;
mu1IS=zeros(nIS,4);
mu2IS=zeros(nIS,4);
mu3IS=zeros(nIS,4);
%
mu1IScum=zeros(nIS,4);
mu2IScum=zeros(nIS,4);
mu3IScum=zeros(nIS,4);
%
wIS=zeros(nIS,3);
for i=1:nIS
% Proposal 1
x1=random('t',nustar);
% Proposal 2
x2=random('normal',0,sqrt(nu/(nu-2)));
% Proposal 3
x3=tan((rand(1,1)-0.5)*pi);
%x3=random('normal',0,1)/random('normal',0,1);
% Exact


plot((1:nIS),wIS(:,3));
legend('Cauchy','Location','NorthEast');
set(gca,'FontSize',fs);
figure(2)
plot((1:nIS),mu1IScum(:,1:3));
hold on;
plot((1:nIS),mu1IScum(:,4),'-k');
hold off;
legend('Student-t','Normal','Cauchy','Exact','Location','NorthEast');
ylim([0.00001 0.00015]);
set(gca,'FontSize',fs);
figure(3)
plot((1:nIS),mu2IScum(:,1:3));
hold on;
plot((1:nIS),mu2IScum(:,4),'-k');
hold off;
legend('Student-t','Normal','Cauchy','Exact','Location','NorthEast');
ylim([1 1.4]);
set(gca,'FontSize',fs);
figure(4)
plot((1:nIS),mu3IScum(:,1:3));
hold on;
plot((1:nIS),mu3IScum(:,4),'-k');
hold off;
legend('Student-t','Normal','Cauchy','Exact','Location','NorthEast');
ylim([3 9]);
set(gca,'FontSize',fs);

    This code calls the following function defined by the user

%%%%%%% Cauchy probability density function
function f=pdfcauchy(x,a,b)
f=1/(pi*b*(1+((x-a)/b)^2));
end
%


Figure 3.2: Charts one to three: IS for the different functions $f$. In each chart, the IS estimators for the different proposals (colored lines) and the Monte Carlo estimator with exact simulation from the $T(12, 0, 1)$ (black lines).


    Exercise

    Importance Sampling

Consider a Student-t distribution $T(\nu, \mu, \sigma^2)$ with density

(3.6) $\pi(x) = \dfrac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)\,\sqrt{\nu\pi}\,\sigma}\left(1 + \dfrac{(x-\mu)^2}{\nu\sigma^2}\right)^{-(\nu+1)/2} \mathbb{I}_{\mathbb{R}}(x)$

w.l.o.g. take $\mu = 0$, $\sigma = 1$ and $\nu = 12$.

Study the performance of the importance sampling estimator $\hat\theta^{IS}_n$ of

(3.7) $\theta = \mathbb{E}_\pi(f(X)) = \int f(x)\,\pi(x)\,dx = \int \dfrac{\pi(x)}{g(x)}\, f(x)\, g(x)\,dx$

when the following instrumental distributions $g(x)$ are used

1. $T(\nu^*, 0, 1)$ with $\nu^* < \nu$ (e.g. $\nu^* = 7$)

2. $\mathcal{N}(0, \nu/(\nu-2))$

3. $C(0, 1)$

for the following test functions

1. $f(x) = \left(\dfrac{\sin(x)}{x}\right)^{5} \mathbb{I}_{(2.1,+\infty)}(x)$


2. $f(x) = \sqrt{\left|\dfrac{x}{1-x}\right|}$

3. $f(x) = \dfrac{x^5}{1 + (x-3)^2}\, \mathbb{I}_{[0,+\infty)}(x)$

    Metropolis-Hastings

Write a M.-H. algorithm to generate $n = 500$ random samples from a zero-mean and independent bivariate normal distribution $\mathcal{N}_2(0_2, I_2)$, with covariance matrix $I_2$ and mean $0_2 = (0, 0)'$. Use alternatively independent and random-walk proposals with variance-covariance matrix $\tau^2 I_2$ (try different values of $\tau^2$).
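One possible sketch of the random-walk variant in Matlab ($\tau^2 = 0.5$ and the starting value are illustrative choices; the independent-proposal variant is analogous):

% Random-walk Metropolis-Hastings targeting N_2(0, I_2) (illustrative sketch)
n    = 500;
tau2 = 0.5;                        % proposal variance tau^2
X    = zeros(n,2);
x    = [0 0];                      % starting value
for i = 1:n
    xp = x + sqrt(tau2)*randn(1,2);            % random-walk proposal
    logacc = -0.5*(xp*xp') + 0.5*(x*x');       % log acceptance ratio for N_2(0,I_2)
    if log(rand) < logacc
        x = xp;                                % accept the proposed value
    end
    X(i,:) = x;                                % store the current state
end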