Revision: 1.34 of Date: 1998/07/02 17:16:33
Please note that this is an imperfect automatic translation of the file. The math typesetting is most likely clobbered.
We consider the problem of estimating an unknown distribution function F in the presence of censoring under the conditions that a parametric model is believed to hold approximately. We use a Bayesian approach, in which the prior on F is a mixture of Dirichlet distributions. A hyperparameter of the prior determines the extent to which this prior concentrates its mass around the parametric family. A Gibbs sampling algorithm to estimate the posterior distributions of the parameters of interest is reviewed. An importance sampling scheme enables us to use the output of the Gibbs sampler to very quickly recalculate the posterior when we change the hyperparameters of the prior. The calculations can be done sufficiently fast to enable the dynamic display of the changing posterior as the prior hyperparameters are varied.This paper provides a literate program completely documenting the code for performing the dynamic graphics.
We begin with our usual copyright.
<Copyright>= (U->) ;;; ;;; $Revision: 1.34 $ of $Date: 1998/07/02 17:16:33 $ ;;; ;;; Copyright (C) 1994, 1995, 1998. Doss and Narasimhan ;;; ;;; Hani J. Doss (doss@stat.ohio-state.edu) and ;;; B. Narasimhan (naras@stat.stanford.edu) ;;; ;;; This program is free software; you can redistribute it and/or modify ;;; it under the terms of the GNU General Public License as published by ;;; the Free Software Foundation; either version 2 of the License, or ;;; (at your option) any later version. ;;; ;;; This program is distributed in the hope that it will be useful, ;;; but WITHOUT ANY WARRANTY; without even the implied warranty of ;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ;;; GNU General Public License for more details. ;;; ;;; You should have received a copy of the GNU General Public License ;;; along with this program; if not, write to the Free Software ;;; Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. ;;;
Definescopyright
(links are to index).
This document is a literate program implementing the theory described in our paper [cite doss:nara:1998b]. A literate program is a program written in a style that makes it easy for humans to read, understand and modify. For more information on Literate Programming, see [cite knut:1992]. A quicker introduction is available on the World Wide Web; see [cite lee:1994]. This document uses the Noweb [cite rams:1994], [cite rams:1994a], [cite rams:1995] literate programming tools. Although it is not required, we recommend that any serious user of this software have the Noweb tools installed. Noweb tools, besides being free, are extremely easy to install and require no special expertise other than basic knowledge about how TeX or LaTeX work. Having Noweb allows one to take full advantage of our software---syntax errors will be minimized and the code we have written can be reused with the user's modifications spliced into place automatically by Noweb.
We expect any serious user of our software to read the original paper [cite doss:nara:1998b], a copy of which is included in the software distribution.
This document is available in three forms: Postscript, PDF and HTML. All versions are accessible from the web pages of the authors.
We wish to remark that the software only does sensitivity analysis. No general facility is provided for generating observations from Markov chains. Indeed, since the range of models for which MCMC methods are applicable is large and such methods most likely involve problem-specific issues, it is our opinion that building such a supertool, if it is at all possible, is a non-trivial task. However, the Fortran program used in generating the output for our example is included along with this software and can be used for models similar to ours. Of course, any appropriate method may be used to generate the samples as long as the output is available in a form usable by our software. The requirements on the data that can be used with our software are spelt out below.
Corresponding to each Markov chain output, there must be two files with the extensions .in (input file) and .out (output file). For example, mc1.in and mc1.out.
The input file must have the following structure. The first four items in the file can be anything, string or number, either on a single line or any conceivable combination of lines. The next three items must be the shape of the Gamma distribution on , the scale of the Gamma distribution on ---the parametrization for shape a and scale b is proportional to ---and . The next three values values following these quantities can be anything, but the one following it should be the number of data points, or sets. Nothing else is read from the input file.
The output file must have the following structure for each data point generated by the Markov chain. The value of theta must be followed by the number of distinct values of the data points, which must be followed by a frequency table of the actual data value and the corresponding frequency. The layout of the values on lines does not matter as long as at least a single white space delimits values. If this structure is violated, errors will result. A peek at the data files included with this software will help the reader.
A note on performance. The calculations involved in reweighting are non-trivial and require a reasonably powerful computer for smooth performance. The efficiency can be improved by dynamically loading C programs that compute various quantities. The version of software described here does so by default. An older version that does not use dynamic loading which is available upon request from the authors. Without dynamic loading, the performance is very bad indeed.
Dynamic libraries for Windows and Macintosh are provided. Suggestions for various Unix platforms are also provided. Section [->] has more details.
It is assumed that a proper installation of Lisp-Stat described in [cite tier:1990] is available. The version number on Lisp-Stat should be 3.52.0 or higher since dynamic loading uses the new shared library mechanism.
Some additions planned for the future are listed in Section [->].
The software consists of the following components.
<*>= <Copyright> (require "utility") (require "call-by-reference") (defpackage "BSA" (:use "XLISP" "USER" "UTILITY" "CALL-BY-REFERENCE")) (in-package "BSA") (import '(user::rseq user::standard-deviation)) <The Master Prototype> <The Slave Prototype> (export '(master-proto))
The master prototype master-proto inherits from dialog-proto
of Lisp-Stat
and contains a number of slots. A rich set of methods facilitate
interaction with the master.
<The Master Prototype>= (<-U) (defproto master-proto '(identifier number-of-markov-chains number-of-points number-of-data-values number-of-months data-file-names summary-data indicator-counts log-constants-of-proportionality slaves hyperparameters-used-in-markov-chains initially-specified-hyperparameter-values current-hyperparameter-values hyperparameter-names log-mixture-density importance-weights hyperparameter-ranges hyperparameter-sliders work-space shared-library density-abscissae density-ordinates expectation-abscissae expectation-ordinates standard-deviation-ordinates statistics statistics-print-formats statistics-labels number-of-slider-stops superimpose timing timing-button lazy) () dialog-proto "The Master prototype. Creates and manipulates a harem of slaves.") <Methods for Master Prototype> <Defaults for Master>
Definescurrent-hyperparameter-values
,data-file-names
,density-abscissae
,density-ordinates
,expectation-abscissae
,expectation-ordinates
,hyperparameter-names
,hyperparameter-ranges
,hyperparameter-sliders
,hyperparameters-used-in-markov-chains
,identifier
,importance-weights
,indicator-counts
,initially-specified-hyperparameter-values
,lazy
,log-constants-of-proportionality
,log-mixture-density
,master-proto
,number-of-data-values
,number-of-markov-chains
,number-of-months
,number-of-points
,number-of-slider-stops
,shared-library
,slaves
,standard-deviation-ordinates
,statistics
,statistics-labels
,statistics-print-formats
,summary-data
,superimpose
,timing
,work-space
(links are to index).
Throughout, we shall use m for the number of Markov chains, and n for the number of data points in the output of each Markov chain. Note that arrays are indexed from 0 so that index i refers to the (i+1)-th position.
Here is a description of all the slots in master-proto.
.run
to the
string. In addition, a file with the extension .lsp
is also
created so that invoking xlispstat
on that file with
automatically recreate everything without annoying questions.
*default-number-of-months*
.
number-of-months
. If this value
is 61, say, then one would be able to calculate the law of F(t)
for t ranging from 0 to 60 months.
*default-number-of-plot-stops*
.
*default-number-of-plot-stops*
.
*default-number-of-months*
(minus 1, as we start at 0).
*default-number-of-months*
(minus 1, as we start at 0).
Effective Sample Size = m * n / (1 + cv(W)^2),where cv(W) is the coefficient of variation of the importance weights W. See, for example, [cite kong:liu:wong:1994].
The methods for master-proto
can be broken down as follows.
<Methods for Master Prototype>= (<-U) <The Master :identifier Method> <The Master :number-of-markov-chains Method> <The Master :number-of-points Method> <The Master :number-of-months Method> <The Master :number-of-data-values Method> <The Master :summary-data Method> <The Master :indicator-counts Method> <The Master :log-constants-of-proportionality Method> <The Master :slaves Method> <The Master :initially-specified-hyperparameter-values Method> <The Master :hyperparameter-names Method> <The Master :current-hyperparameter-values Method> <The Master :number-of-hyperparameters Method> <The Master :hyperparameters-used-in-markov-chains Method> <The Master :hyperparameter-ranges Method> <The Master :hyperparameter-sliders Method> <The Master :log-mixture-density Method> <The Master :importance-weights Method> <The Master :loglik Method> <The Master :compute-log-hmix Method> <The Master :calc-weights Method> <The Master :isnew Method> <The Master :graphical-interface Method> <The Master :process-run-file Method> <The Master :process-frequency-table Method> <The Master :create-run-file Method> <The Master :synchronize Method> <The Master :consolidate-computation Method> <The Master :reset Method> <The Master :effective-sample-size Method> <The Master :print-all-statistics Method> <The Master :labelled-hyperparameter-values Method> <The Master :statistics Method> <The Master :statistics-print-formats Method> <The Master :statistics-labels Method> <The Master :superimpose Method> <The Master :toggle-timing Method> <The Master :close Method>
Some of these methods are mere accessor and modifier methods for the slots and we can get them easily out of the way.
<The Master :identifier Method>= (<-U) (defmeth master-proto :identifier (&optional name) "Method args: (&optional name) Sets or retrieves the identifier slot." (if name (setf (slot-value 'identifier) name) (slot-value 'identifier)))
Defines:identifier
(links are to index).
<The Master :number-of-markov-chains Method>= (<-U) (defmeth master-proto :number-of-markov-chains () "Method args: () Returns m, the number of Markov chains used." (slot-value 'number-of-markov-chains))
Defines:number-of-markov-chains
(links are to index).
<The Master :number-of-points Method>= (<-U) (defmeth master-proto :number-of-points () "Method args: () Returns n, the number of data points." (slot-value 'number-of-points))
Defines:number-of-points
(links are to index).
<The Master :number-of-months Method>= (<-U) (defmeth master-proto :number-of-months () "Method args: () Returns n, the number of months for which F(t) is computed." (slot-value 'number-of-months))
Defines:number-of-months
(links are to index).
<The Master :number-of-data-values Method>= (<-U) (defmeth master-proto :number-of-data-values () "Method args: () Returns n, the number of data values, or sets." (slot-value 'number-of-data-values))
Defines:number-of-data-values
(links are to index).
<The Master :summary-data Method>= (<-U) (defmeth master-proto :summary-data () "Method args: () Returns the summary data containing pairs theta and number of distinct data points. A 2d array of size mn by 2." (slot-value 'summary-data))
Defines:summary-data
(links are to index).
<The Master :indicator-counts Method>= (<-U) (defmeth master-proto :indicator-counts () "Method args: () Returns the data values a 2d array of size mn by number-of-months." (slot-value 'indicator-counts))
Defines:indicator-counts
(links are to index).
<The Master :log-constants-of-proportionality Method>= (<-U) (defmeth master-proto :log-constants-of-proportionality () "Method args: () Returns a list of values of the log constants of proportionality for the posteriors for each Markov chain." (slot-value 'log-constants-of-proportionality))
Defines:log-constants-of-proportionality
(links are to index).
<The Master :slaves Method>= (<-U) (defmeth master-proto :slaves () "Method args: () Retrieves the slaves." (slot-value 'slaves))
Defines:slaves
(links are to index).
<The Master :initially-specified-hyperparameter-values Method>= (<-U) (defmeth master-proto :initially-specified-hyperparameter-values () "Method args: () Returns a list of initially specified values of the hyper parameters. Used mainly for resetting the hyper parameter values before dynamic exploration." (slot-value 'initially-specified-hyperparameter-values))
Defines:initially-specified-hyperparameter-values
(links are to index).
<The Master :hyperparameter-names Method>= (<-U) (defmeth master-proto :hyperparameter-names (&optional index) "Method args: (index) Returns a string identifying the hyperparameter index or the whole slot." (if index (select (slot-value 'hyperparameter-names) index) (slot-value 'hyperparameter-names)))
Defines:hyperparameter-names
(links are to index).
<The Master :current-hyperparameter-values Method>= (<-U) (defmeth master-proto :current-hyperparameter-values (&optional i x) "Method args: (&optional i x) Returns a list of hyper-parameter values. If i is given and a sequence, then all values in the list are set. The dimensions must match. If i is not a sequence, returns the i-th value. If x is specified, sets the i-th hyperparameter value to x. Note that x is ignored if i is a sequence." (if x (if (slot-value 'timing) (time (progn (setf (elt (slot-value 'current-hyperparameter-values) i) x) (send self :synchronize))) (progn (setf (elt (slot-value 'current-hyperparameter-values) i) x) (send self :synchronize))) (if i (if (sequencep i) (progn (setf (slot-value 'lazy) t) (dotimes (j (send self :number-of-hyperparameters)) (send (select (slot-value 'hyperparameter-sliders) j) :value (select i j))) (setf (slot-value 'lazy) nil) (send self :synchronize)) (elt (slot-value 'current-hyperparameter-values) i)) (slot-value 'current-hyperparameter-values))))
Defines:current-hyperparameter-values
(links are to index).
<The Master :number-of-hyperparameters Method>= (<-U) (defmeth master-proto :number-of-hyperparameters () "Method args: () Returns the number of hyperparameters." (length (slot-value 'current-hyperparameter-values)))
Defines:number-of-hyperparameters
(links are to index).
<The Master :hyperparameters-used-in-markov-chains Method>= (<-U) (defmeth master-proto :hyperparameters-used-in-markov-chains () "Method args: () Returns the hyperparameter-values used in running the Markov chains. 2D array of size number-of-markov-chains by 3." (slot-value 'hyperparameters-used-in-markov-chains))
Defines:hyperparameters-used-in-markov-chains
(links are to index).
<The Master :hyperparameter-ranges Method>= (<-U) (defmeth master-proto :hyperparameter-ranges (&optional ranges) "Method args: (&optional ranges) Returns the ranges within which the hyperparameters will be varied. List of lists. This is calculated as the maximum and minimum of all the values used in the Markov chains if not given." (if ranges (setf (slot-value 'hyperparameter-ranges) ranges) (if (slot-value 'hyperparameter-ranges) (slot-value 'hyperparameter-ranges) (let* ((values (column-list (slot-value 'hyperparameters-used-in-markov-chains))) (max (mapcar #'max values)) (min (mapcar #'min values))) (append (mapcar #'(lambda (x y) (list x y)) min max) (list (list 0 (1- (slot-value 'number-of-months)))))))))
Defines:hyperparameter-ranges
(links are to index).
<The Master :hyperparameter-sliders Method>= (<-U) (defmeth master-proto :hyperparameter-sliders () "Method args: () Returns the sliders associated with the hyperparameters." (slot-value 'hyperparameter-sliders))
Defines:hyperparameter-sliders
(links are to index).
<The Master :log-mixture-density Method>= (<-U) (defmeth master-proto :log-mixture-density () "Method args: () Returns the log of the mixture density at each of the data points. An array of nm values." (slot-value 'log-mixture-density))
Defines:log-mixture-density
(links are to index).
<The Master :importance-weights Method>= (<-U) (defmeth master-proto :importance-weights () "Method args: () Retrieves the importance sampling weights. An array of size number-of-markov-chains by number-of-points." (slot-value 'importance-weights))
Defines:importance-weights
(links are to index).
<The Master :loglik Method>= (<-U) (defmeth master-proto :loglik (n) "Returns the log-quasi-likelihood as a function of n. The dimension of n should be one less than the number of Markov chains." (let ((result (slot-value 'work-space))) (call-by-reference-oldcfun "logLikelihood" (slot-value 'shared-library) (coerce n '(vector c-double)) result) (aref result 0)))
Defines:loglik
(links are to index).
<The Master :compute-log-hmix Method>= (<-U) (defmeth master-proto :compute-log-hmix () "Method args: () Computes the log of the mixture density at the various data points." (call-by-reference-oldcfun "computeLogHmix" (slot-value 'shared-library)))
Definescompute-log-hmix
(links are to index).
<The Master :calc-weights Method>= (<-U) (defmeth master-proto :calc-weights () "Method args: () Calculates the importance weights." (call-by-reference-oldcfun "calcWeights" (slot-value 'shared-library)))
Defines:calc-weights
(links are to index).
The :isnew
method for master-proto
is a bit involved.
<The Master :isnew Method>= (<-U) (defmeth master-proto :isnew (&key identifier? number-of-markov-chains? number-of-points? file-names? log-constants?) "Method args: (&key identifier? number-of-markov-chains? number-of-points file-names? log-constants?) Please refer to the literate program and the paper for a thorough discussion." <Set up slot values for master object> (send self :calc-weights) <Create slaves of master object> <Set up dialog and wait for user input> <Some final touches>)
Defines:isnew
(links are to index).
The :synchronize
message at the end of this method forces the
computations anyway, so we avoid the computations now. In other words,
initially, we want to be lazy in performing intensive
computations.
<Set up slot values for master object>= (<-U) [D->] (setf (slot-value 'work-space) (make-array 1 :initial-element 0.0 :element-type 'c-double)) (setf (slot-value 'lazy) t)
We need to determine m, the number of chains and n, the number of
data points to be used. If the argument identifier?
is given, then
all the necessary inputs can be read from the ``run'' file. Otherwise,
we need to prompt the user for everything.
<Set up slot values for master object>+= (<-U) [<-D->] (setf (slot-value 'hyperparameter-names) (list "M(R)" "Theta-Shape" "Theta-Scale" "Time (months)")) (setf (slot-value 'shared-library) (shlib::shlib-open *default-shared-lib-name*)) (setf (slot-value 'statistics) (make-array 3 :initial-element 0.0 :element-type 'c-double)) (setf (slot-value 'statistics-print-formats) *default-statistics-print-formats*) (setf (slot-value 'statistics-labels) *default-statistics-labels*) (if identifier? (progn (setf (slot-value 'identifier) identifier?) (send self :process-run-file :number-of-points? number-of-points?)) (send self :graphical-interface :identifier? identifier? :number-of-markov-chains? number-of-markov-chains? :number-of-points? number-of-points? :file-names? file-names? :log-constants? log-constants?))
Time to get information on how the exploration is to proceed.
<Set up slot values for master object>+= (<-U) [<-D->] <Set up Hyperparameter-ranges and stops> (setf (slot-value 'current-hyperparameter-values) (copy-vector (send self :initially-specified-hyperparameter-values)))
This code chunk sets up a large dialog box for all the various quantities. In most cases, the user will just continue without change.
<Set up Hyperparameter-ranges and stops>= (<-U) (let* ((dialog-items ()) (result nil) (ranges (select (send self :hyperparameter-ranges) (iseq 3))) (init-vals (if identifier? (slot-value 'initially-specified-hyperparameter-values) (mapcar #'(lambda(x) (* 0.5 (sum x))) ranges))) (stop-vals (if identifier? (slot-value 'number-of-slider-stops) (mapcar #'(lambda(x) (1+ (- (second x) (first x)))) ranges))) (hypernames (select (slot-value 'hyperparameter-names) (iseq 3))) (prompt-item (send text-item-proto :new (format nil "You can change the interval between which the~%~ hyperparameters may be varied and the number~%~ of stops in the interval (end points included).~%~ Please note that NO ERROR CHECKING is done!~%~%~ The default settings computed from the data files~%~ are shown below. Please click OK when done."))) (col-1 (send text-item-proto :new "Hyperparameter")) (col-2 (send text-item-proto :new "Minimum")) (col-3 (send text-item-proto :new "Maximum")) (col-4 (send text-item-proto :new "Stops")) (col-5 (send text-item-proto :new "Initial")) (headings (list col-1 col-2 col-3 col-4 col-5)) (widths (map-elements #'send headings :width)) (hnames (mapcar #'(lambda(x) (send text-item-proto :new x)) hypernames)) (mins (mapcar #'(lambda(x) (send edit-text-item-proto :new (format nil "~g" (select x 0)) :text-length 10)) ranges)) (maxs (mapcar #'(lambda(x) (send edit-text-item-proto :new (format nil "~g" (select x 1)) :text-length 10)) ranges)) (stops (mapcar #'(lambda(x) (send edit-text-item-proto :new (format nil "~g" x) :text-length 10)) stop-vals)) (inits (mapcar #'(lambda(x) (send edit-text-item-proto :new (format nil "~g" x) :text-length 10)) init-vals)) (abort (send modal-button-proto :new "Abort" :action #'top-level)) (dialog nil) (ok (send modal-button-proto :new "OK" :action #'(lambda() (list (map-elements #'send mins :text) (map-elements #'send maxs :text) (map-elements #'send stops :text) (map-elements #'send inits :text)))))) (dotimes (i (length hnames)) (send (select hnames i) :width (select widths 0)) (send (select mins i) :width (select widths 1)) (send (select maxs i) :width (select widths 2)) (send (select stops i) :width (select widths 3)) (send (select inits i) :width (select widths 4))) (setf dialog-items (list prompt-item headings (list (mapcar #'(lambda (x y z w u) (list x y z w u)) hnames mins maxs stops inits)) (list ok abort))) (setf dialog (send modal-dialog-proto :new dialog-items)) (send dialog :title "Hyperparameter Ranges and Stops") (setf result (mapcar #'convert-to-numbers (send dialog :modal-dialog))) (send self :hyperparameter-ranges (append (mapcar #'(lambda(x y) (list x y)) (select result 0) (select result 1)) (list (list 0 (1- (slot-value 'number-of-months)))))) (setf (slot-value 'initially-specified-hyperparameter-values) (make-array 4 :element-type 'c-double :initial-contents (append (select result 3) (list (truncate (* 0.5 (slot-value 'number-of-months))))))) (setf (slot-value 'number-of-slider-stops) (append (select result 2) (list (slot-value 'number-of-months)))))
Now, make the initial arrays for calling the C functions.
<Set up slot values for master object>+= (<-U) [<-D->] (setf (slot-value 'density-abscissae) (make-array *default-number-of-plot-stops* :initial-contents (rseq 0 1 *default-number-of-plot-stops*) :element-type 'c-double)) (setf (slot-value 'density-ordinates) (make-array *default-number-of-plot-stops* :initial-element 0 :element-type 'c-double)) (let ((number-of-stops (send self :number-of-months))) (setf (slot-value 'expectation-abscissae) (make-array number-of-stops :initial-contents (iseq 0 (1- number-of-stops)) :element-type 'c-double)) (setf (slot-value 'expectation-ordinates) (make-array number-of-stops :initial-element 0 :element-type 'c-double)) (setf (slot-value 'standard-deviation-ordinates) (make-array number-of-stops :initial-element 0 :element-type 'c-double)))
The current values of the hyperparameters must be those specified initially.
<Set up slot values for master object>+= (<-U) [<-D->] (let ((tmp-array (send self :initially-specified-hyperparameter-values))) (setf (slot-value 'current-hyperparameter-values) (make-array (length tmp-array) :initial-contents tmp-array :element-type 'c-double)))
We set up the number of points to be used in Reverse Logistic Regression if necessary. After that, all the data areas will have been set up and so we pass addresses of the data-arrays to the C routines.
<Set up slot values for master object>+= (<-U) [<-D->] (let* ((n (slot-value 'number-of-points)) (nc (min *default-number-of-points* n))) (unless (or log-constants? identifier? (= (slot-value 'number-of-markov-chains) 1)) (setf nc (select (get-tested-value-dialog (format nil "How many points should I use from each chain for~%~ estimating the constants of proportionality?~%~ Unless you are prepared to wait for a long time, you~%~ should go with the default or less!~%") :initial nc :test #'(lambda(x) (and (numberp x) (> x 0) (<= x n))) :error-message "Invalid entry. Please try again!") 0))) <Pass data array addresses to C Routines>)
<Pass data array addresses to C Routines>= (<-U) (call-by-reference-oldcfun "initializeAddress" (slot-value 'shared-library) (slot-value 'summary-data) (slot-value 'hyperparameters-used-in-markov-chains) (slot-value 'current-hyperparameter-values) (slot-value 'importance-Weights) (slot-value 'log-mixture-density) (slot-value 'log-constants-of-proportionality) (slot-value 'number-of-markov-chains) (slot-value 'number-of-points) (slot-value 'number-of-months) (slot-value 'number-of-data-values) (slot-value 'indicator-counts) (slot-value 'statistics) nc)
If the constants of proportionality have to be estimates, this is the time to do it after duly notifying the user.
<Set up slot values for master object>+= (<-U) [<-D] (unless (or log-constants? identifier?) (unless (= (slot-value 'number-of-markov-chains) 1) (message-dialog "The Constants of proportionality will now be~%~ be estimated by Reverse Logistic Regression.~%~ This will take a while and only happens once.~%~%~ You will see the results of iterations on the~%~ Console as the maximization is done.~%~% Click OK to continue.") <Perform Reverse Logistic Regression>)) (unless identifier? (send self :compute-log-hmix) (send self :create-run-file))
<Perform Reverse Logistic Regression>= (<-U) (flet ((loglik (n) (send self :loglik n))) (setf initial-guess (nelmeadmax #'loglik (slot-value 'log-constants-of-proportionality) :epsilon *default-maximization-tolerance* :count-limit 50000)) (dotimes (i (length initial-guess)) (setf (aref (slot-value 'log-constants-of-proportionality) i) (select initial-guess i))))
We are ready to create the slave objects, a plot of the density of F(t) as well as a plot of E(S(t)) for various values of t.
<Create slaves of master object>= (<-U) (setf (slot-value 'slaves) (list (send slave-proto :new self) (send slave-proto :new self :survival-plot t))) (send (second (slot-value 'slaves)) :title (strcat "Plot of E(S(t))-" (slot-value 'identifier)))
The sliders for controlling the hyperparameters need to be set up.
There are three dialog items that correspond to every hyperparameter:
a text-item
where the hyperparameter name will be displayed, a
value-text-item
where the value of the hyperparameter will be
displayed and underneath the first two, a slider showing the slider
stop. Figure [->] shows a typical slider. For nice looks,
the width of the hyperparameter name string and the value string
should add up to the width of the slider.
A typical slider in a dialog [*]
<Set up dialog and wait for user input>= (<-U) (let* ((hyperparameter-labels (send self :hyperparameter-names)) (hyperparameter-ranges (send self :hyperparameter-ranges)) (triples <Make triples for sliders>) (slider-and-dialog-items <Make sliders with triples>) (dialog-items (select slider-and-dialog-items 0)) (sliders-alone (select slider-and-dialog-items 1)) (reset-button (send button-item-proto :new "Reset" :action #'(lambda() (send self :reset)))) (timing-button (send button-item-proto :new "Timing:OFF" :action #'(lambda() (send self :toggle-timing)))) (stats-button (send button-item-proto :new "Statistics" :action #'(lambda() (send self :print-all-statistics))))) (setf (slot-value 'hyperparameter-sliders) sliders-alone) (setf (slot-value 'timing-button) timing-button) (call-next-method (list (list reset-button stats-button timing-button) (list dialog-items))))
Let us first create the triples we need for the hyperparameter sliders
for use with the function make-sliders
. The triples are (a) the
label for the hyperparameter, (b) the range within which the
hyperparameter will be varied, and (c) the action that is to be taken
when the slider is pressed as a function.
<Make triples for sliders>= (<-U) (mapcar #'(lambda(x y z) (list x y #'(lambda(w) (send self :current-hyperparameter-values z w)))) hyperparameter-labels hyperparameter-ranges (iseq (send self :number-of-hyperparameters)))
We are now ready to create the sliders and associated text and value
items. The function make-sliders
returns a multiple-values
list
of slider-items
using triples
with a specified
layout along with a list of the appropriate scroll-item
s. A
slider-item
is a list of two elements, the first element being a
list containing a text-item
and a value-item
and the second is
a scroll-item
.
<Make sliders with triples>= (<-U) (multiple-value-list (make-sliders triples :layout (list (list t) (list t t) (list t)) :formats *default-hyperparameter-print-format* :no-of-slider-stops (slot-value 'number-of-slider-stops)))
Finally, we create a decent title, become alert once again as opposed to being lazy and then synchronize everything and show those windows.
<Some final touches>= (<-U) (send self :title (strcat (send self :identifier) "-Master")) (let ((hd (send self :number-of-hyperparameters)) (hyper-sliders (slot-value 'hyperparameter-sliders)) (hyper-vals (send self :current-hyperparameter-values))) (setf hyper-vals (coerce hyper-vals 'list)) (send (select hyper-sliders 1) :display-value) (dotimes (l hd) (unless (= l 1) (send (select hyper-sliders l) :value (select hyper-vals l))))) (setf (slot-value 'lazy) nil) (send self :synchronize)
In handling the ``run'' file, note the assumptions made about the file structure and how some irrelevant headings are expected and skipped.
<The Master :process-run-file Method>= (<-U) (defmeth master-proto :process-run-file (&key number-of-points?) "Method args: (&key number-of-points?) Processes a run file to get all inputs." (format t "~%Processing Run file ... ") (let ((fh (open (strcat (slot-value 'identifier) ".run") :direction :input)) (file-names nil) (alpha-shape-scale nil) (constants nil) (m nil) (n nil) (mn nil) (index 0) (indicator-counts nil) (summary-data nil) (log-hmix nil)) <Read number of Markov chains> <Read number of points> (setf mn (* m n)) <Read number of sets> <Read number of months> (setf (slot-value 'importance-weights) (make-array mn :initial-element 0.0 :element-type 'c-double)) <Process the table containing info on Markov chains> <Process Hyperparameter Ranges etc.> <Read summary data, log mixture density, indicator counts> (setf (slot-value 'indicator-counts) indicator-counts) (setf (slot-value 'summary-data) summary-data) (setf (slot-value 'log-mixture-density) log-hmix) (close fh)) (format t "done~%"))
<Read number of Markov chains>= (<-U) ;; Ignore the information in first line---just some info. (read fh nil) ;; Ignore the heading "Number of Markov chains" (read fh nil) (setf m (read fh nil)) (setf (slot-value 'number-of-markov-chains) m)
<Read number of points>= (<-U) ;; Ignore the heading "Number of Points" (read fh nil) (setf n (read fh nil)) (if number-of-points? (setf n number-of-points?)) (setf (slot-value 'number-of-points) n)
<Read number of sets>= (<-U) ;; Ignore the heading "Number of Data Values" (read fh nil) (setf (slot-value 'number-of-data-values) (read fh nil))
<Read number of months>= (<-U) ;; Ignore the heading "Number of Months" (read fh nil) (setf (slot-value 'number-of-months) (read fh nil))
<Process Hyperparameter Ranges etc.>= (<-U) ;; Ignore the leading line (read fh nil) (let ((range nil) (r (iseq 3)) (s (iseq 3)) (i (iseq 3))) (dotimes (j 3) (read fh nil) (read fh nil) (setf range (cons (read fh nil) nil)) (read fh nil) (setf range (cons (read fh nil) range)) (setf (select r j) (reverse range)) (read fh nil) (setf (select i j) (read fh nil)) (read fh nil) (setf (select s j) (read fh nil))) (send self :hyperparameter-ranges (append r (list (list 0 (1- (slot-value 'number-of-months)))))) (setf (slot-value 'number-of-slider-stops) (append s (list (slot-value 'number-of-months)))) (setf (slot-value 'initially-specified-hyperparameter-values) (append i (list (* 0.5 (slot-value 'number-of-months))))))
<Process the table containing info on Markov chains>= (<-U) ;; Now read the table of file name, alpha, ;; shape, scale and log constant after ignoring the heading. (read fh nil) (dotimes (i m) (setf file-names (cons (read fh nil) file-names)) (setf alpha-shape-scale (cons (read fh nil) alpha-shape-scale)) (setf alpha-shape-scale (cons (read fh nil) alpha-shape-scale)) (setf alpha-shape-scale (cons (read fh nil) alpha-shape-scale)) (setf constants (cons (read fh nil) constants))) (setf (slot-value 'data-file-names) (reverse file-names)) (setf (slot-value 'log-constants-of-proportionality) (if (= m 1) (make-array 1 :initial-element 0 :element-type 'c-double) (make-array (1- m) :initial-contents (rest (reverse constants)) :element-type 'c-double))) (setf (slot-value 'hyperparameters-used-in-markov-chains) (make-array (list m 3) :initial-contents (reverse alpha-shape-scale) :element-type 'c-double))
<Read summary data, log mixture density, indicator counts>= (<-U) ;; Ignore heading. (read fh nil) (setf indicator-counts (make-array (list mn (slot-value 'number-of-months)) :element-type 'c-long)) (setf summary-data (make-array (list mn 2) :initial-element 0.0 :element-type 'c-double)) (setf log-hmix (make-array mn :initial-element 0.0 :element-type 'c-double)) (dotimes (j n) (dotimes (k m) (setf index (+ (* n k) j)) (setf (aref summary-data index 0) (read fh nil)) (setf (aref summary-data index 1) (read fh nil)) (setf (aref log-hmix index) (read fh nil)) (dotimes (l (slot-value 'number-of-months)) (setf (aref indicator-counts index l) (truncate (read fh nil))))))
Defines:process-run-file
(links are to index).
<The Master :process-frequency-table Method>= (<-U) (defmeth master-proto :process-frequency-table (fh nd ind) "Args: (fh nd ind) This is a convenience method used in reading the data files. It reads the frequency table of size nd from fh, the file stream, and fills the indicator-counts slot at index ind." (let ((x-vals (make-array nd)) (freq (make-array nd)) (sort-index nil) (indicator-counts (slot-value 'indicator-counts)) (number-of-months (slot-value 'number-of-months)) (j 0) (k 0) (freq-sum 0) (smallest-month 0)) <Read frequency table and sort values> (dotimes (i nd) (setf smallest-month <Find smallest month not less than i-th ordered x-val>) (when smallest-month <Fill table entries with current frequency sum> <Bump starting index for next stretch> <Update frequency sum>)) <Handle situation when x-values run out before month values>))
Defines:process-frequency-table
(links are to index).
<Read frequency table and sort values>= (<-U) (dotimes (l nd) (setf (aref x-vals l) (read fh nil)) (setf (aref freq l) (read fh nil))) (setf sort-index (make-array nd :initial-contents (order x-vals)))
In the code snippet below, nil
is returned if no such month is
found.
<Find smallest month not less than i-th ordered x-val>= (<-U) (loop (if (>= j number-of-months) (return nil)) (if (>= j (elt x-vals (elt sort-index i))) (return j)) (setf j (1+ j)))
<Fill table entries with current frequency sum>= (<-U) (dotimes (x (- smallest-month k)) (setf (aref indicator-counts ind (+ x k)) freq-sum))
<Bump starting index for next stretch>= (<-U) (setf k smallest-month)
<Update frequency sum>= (<-U) (setf freq-sum (+ freq-sum (elt freq (elt sort-index i))))
<Handle situation when x-values run out before month values>= (<-U) (dotimes (x (- number-of-months k)) (setf (aref indicator-counts ind (+ x k)) freq-sum))
<The Master :graphical-interface Method>= (<-U) (defmeth master-proto :graphical-interface (&key identifier? number-of-markov-chains? number-of-points? file-names? log-constants?) "Method args: (&key identifier? number-of-chains? number-of-points file-names? log-constants?) Sets up a graphical interface for some inputs if not specified." (let* ((id (if identifier? identifier? <Get an identifier>)) (m (if number-of-markov-chains? number-of-markov-chains? <Get the number of Markov chains>)) (file-names (if file-names? file-names? <Get data file names>)) (n (if number-of-points? number-of-points? <Get the number of points>)) (initial-guess (if log-constants? log-constants? (if (> m 1) <Get initial guess>)))) (setf (slot-value 'identifier) id) (setf (slot-value 'data-file-names) file-names) (setf (slot-value 'number-of-markov-chains) m) (setf (slot-value 'number-of-points) n) (setf (slot-value 'number-of-months) *default-number-of-months*) (setf (slot-value 'indicator-counts) (make-array (list (* m n) (slot-value 'number-of-months)) :element-type 'c-long)) (setf (slot-value 'importance-weights) (make-array (* m n) :initial-element 0 :element-type 'c-double)) (setf (slot-value 'log-mixture-density) (make-array (* m n) :initial-element 0 :element-type 'c-double)) (setf (slot-value 'log-constants-of-proportionality) (if (= m 1) (make-array 1 :initial-element 0 :element-type 'c-double) (make-array (1- m) :initial-contents initial-guess :element-type 'c-double))) (let* ((alphas nil) (shapes nil) (scales nil) (summary-data (make-array (list (* m n) 2) :initial-element 0 :element-type 'c-double ))) <Read in data files and set up data>)))
<Get an identifier>= (<-U) (get-nonempty-string-dialog (format nil "Please enter a unique short descriptive name~%~ for this exploration.~%~ Ex: CancerData") :initial "BreastCancer")
<Get the number of Markov chains>= (<-U) (select (get-tested-value-dialog (format nil "How many Markov chains for ~a?" id) :initial *default-number-of-markov-chains* :test #'(lambda(x) (and (numberp x) (> x 0))) :error-message "No. of Markov chains must be >= 1!") 0)
<Get data file names>= (<-U) (let* ((dialog-items ()) (filenames nil) (prompt-item (send text-item-proto :new (format nil "Please enter names of all data files without~%~ the extensions for run ~a and click OK.~%" id))) (col-1 (send text-item-proto :new "MC number")) (col-2 (send text-item-proto :new " Data Filename ")) (headings (list col-1 col-2)) (width-a (send col-1 :width)) (width-b (send col-2 :width)) (widths (map-elements #'send headings :width)) (file-name-items (map-elements #'send edit-text-item-proto :new (repeat "" m) :text-length 30)) (mc-numbers (mapcar #'(lambda(x) (send text-item-proto :new (format nil "~5d" x))) (1+ (iseq m)))) (abort (send modal-button-proto :new "Abort" :action #'top-level)) (dialog nil) (ok (send modal-button-proto :new "OK" :action #'(lambda() (map-elements #'send file-name-items :text))))) (dolist (x mc-numbers) (send x :width width-a)) (dolist (x file-name-items) (send x :width width-b)) (setf dialog-items (list prompt-item headings (list (mapcar #'(lambda (x y) (list x y)) mc-numbers file-name-items)) (list ok abort))) (loop (setf dialog (send modal-dialog-proto :new dialog-items)) (setf file-names (send dialog :modal-dialog)) (if (or (some-files-dont-exist (map-elements #'strcat file-names ".in")) (some-files-dont-exist (map-elements #'strcat file-names ".out"))) (message-dialog "Some files don't exist. Please try again!") (return file-names))))
<Get the number of points>= (<-U) (select (get-tested-value-dialog (format nil "How many points to use for ~a in reweighting?" id) :initial *default-number-of-points* :test #'(lambda(x) (and (numberp x) (> x 0))) :error-message "No. of points must be > 0!") 0)
<Get initial guess>= (<-U) (select (get-nonnil-value-dialog (format nil "Enter an initial guess for estimating the ~% ~ constants in the format shown below.~% ~ Dimension should be ~d!~%" (1- m)) :initial (1+ (iseq (1- m))) :test #'(lambda(x) (let ((val (select x 0))) (and val (listp val) (= (length val (1- m)))))) :error-message "Improper guess!") 0)
To process the data, we use both the input and output file used in running the Markov chains.
<Read in data files and set up data>= (<-U) [D->] (dotimes (j m) ;;; First process the input file parameters (let ((fh (open (strcat (select file-names j) ".in") :direction :input))) ;; discard the first four values (dotimes (i 4) (read fh nil)) (setf shapes (cons (read fh nil) shapes)) (setf scales (cons (read fh nil) scales)) (setf alphas (cons (read fh nil) alphas)) ;; discard next three values (warmup, iterations and gap) ;; Assumes all input files are consistent. (read fh nil) (read fh nil) (read fh nil) (setf (slot-value 'number-of-data-values) (read fh nil)) (close fh)) ;;; Now process the corresponding output file. (let ((fh (open (strcat (select file-names j) ".out") :direction :input)) (ind (* j n)) (nd nil)) (dotimes (i n) (setf (aref summary-data ind 1) (read fh nil)) (setf nd (read fh nil)) (setf (aref summary-data ind 0) nd) (send self :process-frequency-table fh nd ind) (setf ind (1+ ind))) (close fh)))
Now that all the Markov chain files have been processed, the slot-values can be set up.
<Read in data files and set up data>+= (<-U) [<-D] (setf (slot-value 'hyperparameters-used-in-markov-chains) (make-array (list m 3) :initial-contents (bind-columns (reverse alphas) (reverse shapes) (reverse scales)) :element-type 'c-double)) (setf (slot-value 'summary-data) summary-data)
<The Master :create-run-file Method>= (<-U) (defmeth master-proto :create-run-file () "Method args: () Creates a data and a lisp file for subsequent runs." (format t "~%Creating Run file for subsequent runs ... ") (let ((mn nil) (index 0) (m (slot-value 'number-of-markov-chains)) (n (slot-value 'number-of-points)) (p (slot-value 'number-of-data-values)) (file-names (slot-value 'data-file-names)) (summary-data (slot-value 'summary-data)) (indicator-counts (slot-value 'indicator-counts)) (hypers-used (slot-value 'hyperparameters-used-in-markov-chains)) (log-constants (slot-value 'log-constants-of-proportionality)) (log-hmix (slot-value 'log-mixture-density)) (fh (open (strcat (slot-value 'identifier) ".run") :direction :output))) (setf mn (* m n)) <Write number of Markov chains, etc.> <Write table of info on Markov chains> <Write info on hyperparameters> <Write summary data, log mixture density, indicator counts> (close fh)) <Create lisp file for subsequent runs> (format t "done~%") <Show informative message>)
<Write number of Markov chains, etc.>= (<-U) (format fh "~s~%~%" "Automatically generated file. Do not edit unless~ you know what you are doing!") (format fh "~s ~g~%" "Number of Markov Chains" m) (format fh "~s ~g~%" "Number of Points" n) (format fh "~s ~g~%" "Number of Data Values" (slot-value 'number-of-data-values)) (format fh "~s ~g~%" "Number of Months used" (slot-value 'number-of-months))
<Write info on hyperparameters>= (<-U) (let ((r (send self :hyperparameter-ranges)) (s (slot-value 'number-of-slider-stops)) (i (slot-value 'initially-specified-hyperparameter-values))) (format fh "~%~s~%" "Hyperparameter Exploration Range, Initial Value, stops, etc.") (format fh "~%~s ~s ~g ~s ~g ~s ~g ~s ~g~%" "M(R)" "Min" (select (select r 0) 0) "Max" (select (select r 0) 1) "Initial" (select i 0) "Stops" (select s 0)) (format fh "~%~s ~s ~g ~s ~g ~s ~g ~s ~g~%" "Theta-Shape" "Min" (select (select r 1) 0) "Max" (select (select r 1) 1) "Initial" (select i 1) "Stops" (select s 1)) (format fh "~%~s ~s ~g ~s ~g ~s ~g ~s ~g~%" "Theta-Scale" "Min" (select (select r 2) 0) "Max" (select (select r 2) 1) "Initial" (select i 2) "Stops" (select s 2)))
<Write table of info on Markov chains>= (<-U) (format fh "~%~%~s~%~%" "Table of file name, alpha, shape, scale, log constants") (dotimes (j m) (format fh "~s ~g ~g ~g ~g~%" (select file-names j) (aref hypers-used j 0) (aref hypers-used j 1) (aref hypers-used j 2) (if (= j 0) -1 (elt log-constants (1- j)))))
<Write summary data, log mixture density, indicator counts>= (<-U) (format fh "~%~%~s~%~%" "Table of no of distinct values, theta, log-hmix, Delta-X-t.") (dotimes (j n) (dotimes (k m) (setf index (+ (* k n) j)) (format fh "~g ~g " (aref summary-data index 0) (aref summary-data index 1)) (format fh "~g " (aref log-hmix index)) (dotimes (l (slot-value 'number-of-months)) (format fh "~d " (aref indicator-counts index l))) (format fh "~%")))
<Create lisp file for subsequent runs>= (<-U) (let* ((id (slot-value 'identifier)) (fh (open (strcat id ".lsp") :direction :output))) (format fh ";;;Automatically generated file. Do not edit~ unless you know what you are doing.~%") (format fh "(require ~s)~%" "bsa") (format fh "(use-package ~s)~%" "BSA") (format fh "(defvar ~a-master (send master-proto :new :identifier? ~s))~%" id id) (close fh))
<Show informative message>= (<-U) (let ((id (slot-value 'identifier))) (message-dialog (format nil "For your information: Two files~%~ ~a.run and ~a.lsp~%were created.~%~ To repeat this run quickly the next time around~%~ you only need to load the file ~a.lsp into~%~ xlispstat." id id id)))
The :synchronize
method is responsible for synchronizing the
slaves so that the density estimates they display is for the current
values of the hyperparameters. Thus, if any hyperparameter value is
changed, the :synchronize
method should be invoked.
<The Master :synchronize Method>= (<-U) (defmeth master-proto :synchronize () "Method args: () Synchronizes all slaves." (when (not (slot-value 'lazy)) (send self :consolidate-computation) (let* ((t-format (select *default-hyperparameter-print-format* 3)) (slave1 (first (slot-value 'slaves))) (slave2 (second (slot-value 'slaves))) (t-value (send self :current-hyperparameter-values 3))) (send slave1 :title (format nil "Law of F(~v,vf)-~a" (first t-format) (second t-format) t-value (slot-value 'identifier))) (setf (select (slot-value 'statistics-labels) 0) (format nil "E(S(~g))" t-value)) (setf (select (slot-value 'statistics-labels) 1) (format nil "SD(S(~g))" t-value)) (send slave1 :start-buffering) (send slave1 :clear-lines :draw nil) (send slave1 :add-lines (slot-value 'density-abscissae) (slot-value 'density-ordinates)) (send slave1 :adjust-to-data) (send slave1 :buffer-to-screen) (send slave2 :start-buffering) (send slave2 :clear-lines :draw nil) (send slave2 :add-lines (slot-value 'expectation-abscissae) (slot-value 'expectation-ordinates)) (send slave2 :adjust-to-data) (send slave2 :buffer-to-screen))))
Defines:synchronize
(links are to index).
In this method we compute all the relevant statistics such as E(S(t)) and sigma_S(t) for various values of t, as well as the effective sample size. This method was added to consolidate all computations for acceptable performance and makes several other methods above superfluous.
<The Master :consolidate-computation Method>= (<-U) (defmeth master-proto :consolidate-computation () "Method args: () Consolidates all computations for sake of speed. This method recalculates the importance weights, computes relevant statistics and saves them in the slots." (call-by-reference-oldcfun "consolidateComputation" (slot-value 'shared-library) *default-number-of-plot-stops* (slot-value 'statistics) (slot-value 'density-abscissae) (slot-value 'density-ordinates) (slot-value 'expectation-abscissae) (slot-value 'expectation-ordinates) (slot-value 'standard-deviation-ordinates)))
Defines:consolidate-computation
(links are to index).
<The Master :reset Method>= (<-U) (defmeth master-proto :reset () "Method args: () Resets the state of all objects." (send self :current-hyperparameter-values (send self :initially-specified-hyperparameter-values)))
Defines:reset
(links are to index).
The actual computation of the effective sample size is now done in the C code that computes the importance weights (see section [->]). We note that that the formula for computing effective sample size was originally used in estimating standard deviations in stratified sampling. It can be used to give ballpark figures, but should not be taken too literally.
<The Master :effective-sample-size Method>= (<-U) (defmeth master-proto :effective-sample-size () (select (slot-value 'statistics) 2))
Defines:effective-sample-size
(links are to index).
<The Master :print-all-statistics Method>= (<-U) (defmeth master-proto :print-all-statistics () (let* ((hyperparameter-names (slot-value 'hyperparameter-names)) (hyper-strings (send self :labelled-hyperparameter-values)) (max-name-len (max (mapcar #'length hyperparameter-names)))) (format t "~%~%*** Statistics for ~a ***~%~%" (send self :identifier)) (format t "Hyperparameter Settings:~%" ) (dolist (x hyper-strings) (format t " ~a~%" x)) (format t "Total Sample Size = ~5d.~%" (* (slot-value 'number-of-markov-chains) (slot-value 'number-of-points))) (format t "Effective Sample Size = ~5d.~%" (select (slot-value 'statistics) 2))) (dolist (x (slot-value 'slaves)) (send x :print-summary)) (format t "~%*** End of Statistics ***~%~%"))
Defines:print-all-statistics
(links are to index).
<The Master :labelled-hyperparameter-values Method>= (<-U) (defmeth master-proto :labelled-hyperparameter-values () (let* ((hyper-names (slot-value 'hyperparameter-names)) (max-name-len (max (mapcar #'length hyper-names)))) (mapcar #'(lambda(a b) (let ((y (select *default-hyperparameter-print-format* a))) (if (listp y) (format nil "~va = ~v,vf" max-name-len b (first y) (second y) (send self :current-hyperparameter-values a)) (format nil (strcat "~va = " y) max-name-len b (send self :current-hyperparameter-values a))))) (iseq (send self :number-of-hyperparameters)) hyper-names)))
Defines:labelled-hyperparameter-values
(links are to index).
The :statistics method for the master returns the value of
the slot statistics
.
<The Master :statistics Method>= (<-U) (defmeth master-proto :statistics () "Method args: () Returns the slot value statistics." (slot-value 'statistics))
Defines:statistics
(links are to index).
The :statistics-print-formats method for the master returns
the value of the slot statistics-print-formats
.
<The Master :statistics-print-formats Method>= (<-U) (defmeth master-proto :statistics-print-formats () "Method args: () Returns the slot value statistics-print-formats." (slot-value 'statistics-print-formats))
Defines:statistics-print-formats
(links are to index).
The :statistics-labels method for the master returns the
value of the slot statistics-labels
.
<The Master :statistics-labels Method>= (<-U) (defmeth master-proto :statistics-labels () "Method args: () Returns the slot value statistics-labels." (slot-value 'statistics-labels))
Defines:statistics-labels
(links are to index).
This method toggles timing on and off.
<The Master :toggle-timing Method>= (<-U) (defmeth master-proto :toggle-timing () "Method args: () Toggles timing on and off." (send self :hide-window) (setf (slot-value 'timing) (not (slot-value 'timing))) (if (slot-value 'timing) (send (slot-value 'timing-button) :slot-value 'text "Timing:ON") (send (slot-value 'timing-button) :slot-value 'text "Timing:OFF")) (send self :show-window))
Defines:toggle-timing
(links are to index).
This method returns the value of the superimpose
slot.
<The Master :superimpose Method>= (<-U) (defmeth master-proto :superimpose () "Method args: () Returns the value of slot superimpose." (slot-value 'superimpose))
Defines:superimpose
(links are to index).
The :close method for the master must close the slaves window that are active. Finally, it must commit hara-kiri.
<The Master :close Method>= (<-U) (defmeth master-proto :close () "Method args: () Kills all subordinate slave and commits suicide." (dolist (x (slot-value 'slaves)) (send x :remove)) (call-next-method))
Defines:close
(links are to index).
We shall use the prefix ``BSA'' for Bayesian Sensitivity Analysis.
<Defaults for Master>= (<-U) (defvar *default-master-object-prefix* "BSA-") (defvar *default-number-of-markov-chains* 8) (defvar *default-number-of-points* 50) (defvar *default-number-of-plot-stops* 51) (defvar *default-no-of-slider-stops* '(64 51 121 61)) (defvar *default-hyperparameter-print-format* '((7 2) (7 2) (7 2) (7 2))) (defvar *default-number-of-months* 61) (defvar *default-maximization-tolerance* 1e-5) (defvar *default-shared-lib-name* (if (member ':msdos xlisp::*features*) "win/bsa.dll" "./libbsa@SHLIB_SUFFIX@"))
Defines*default-hyperparameter-print-format*
,*default-master-object-prefix*
,*default-maximization-tolerance*
,*default-no-of-slider-stops*
,*default-number-of-markov-chains*
,*default-number-of-months*
,*default-number-of-points*
,*default-shared-lib-name
(links are to index).
The Slave prototype inherits from scatterplot-proto
.
<The Slave Prototype>= (<-U) (defproto slave-proto '(master survival-plot color-index) () scatterplot-proto "The Slave prototype. Master is its master upon whom the slave relies for all data. Survival-plot is nil or t indicating what is to be plotted. Color-index is used in superimposition.") <Methods for Slave Prototype> <Defaults for Slave>
Definesslave-proto
(links are to index).
The methods for slave-proto
follow.
<Methods for Slave Prototype>= (<-U) <The Slave :isnew Method> <The Slave :redraw-background Method> <The Slave :redraw-statistics Method> <The Slave :print-summary Method> <The Slave :close Method>
<The Slave :isnew Method>= (<-U) (defmeth slave-proto :isnew (master &key (go-away t) (survival-plot nil)) "Method args: master &rest args Creates a new instance of the slave-proto object." (setf (slot-value 'master) master) (setf (slot-value 'survival-plot) survival-plot) (setf (slot-value 'color-index) 0) (call-next-method 2 :go-away go-away :draw nil) (if (not (slot-value 'survival-plot)) (let ((skip (+ (send self :text-ascent) (send self :text-descent))) (len (length (send master :statistics-print-formats)))) (send self :margin 0 (+ (* len skip) (send self :text-descent)) 0 0 :draw nil))) (send self :redraw))
Defines:isnew
(links are to index).
<The Slave :redraw-background Method>= (<-U) (defmeth slave-proto :redraw-background () "Method args: () Redraws the background of the screen" (send self :start-buffering) (call-next-method) (unless (slot-value 'survival-plot) (send self :redraw-statistics)) (send self :buffer-to-screen))
Defines:redraw-background
(links are to index).
<The Slave :redraw-statistics Method>= (<-U) (defmeth slave-proto :redraw-statistics () "Method args: () Redraws the statistics on the screen" (let* ((master (slot-value 'master)) (ascent (send self :text-ascent)) (descent (send self :text-descent)) (skip (+ ascent descent)) (em (send self :text-width "m")) (en (send self :text-width "n")) (canvas-width (send self :canvas-width)) (y 0) (stats-labels (send master :statistics-labels)) (name-width (max (mapcar #'(lambda(x) (send self :text-width x)) stats-labels))) (value-strings (mapcar #'(lambda(x y) (format nil x y)) (send master :statistics-print-formats) (coerce (send master :statistics) 'list))) (value-width (max (mapcar #'(lambda(x) (send self :text-width x)) value-strings))) (x2 (- canvas-width value-width em)) (x1 (- x2 name-width em))) (dotimes (i (length stats-labels)) (let ((sl (select stats-labels i)) (sv (select value-strings i))) (setf y (+ y skip)) (send self :draw-text sl x1 y 0 0) (send self :draw-text sv x2 y 0 0)))))
Defines:redraw-statistics
(links are to index).
This method basically calculates the mean and variance of beta distributions.
<The Slave :print-summary Method>= (<-U) (defmeth slave-proto :print-summary () "Method args: () Prints the mean and variance of the Law of F(t)." (if (slot-value 'survival-plot) (let* ((master (slot-value 'master)) (x (send master :slot-value 'expectation-abscissae)) (y (send master :slot-value 'expectation-ordinates)) (stddevs (send master :slot-value 'standard-deviation-ordinates))) (format t "~%Table of E(S(t))~%") (format t "------------------------------------~%") (format t "Time t (months) E(S(t)) SD(S(t))~%") (format t "------------------------------------~%") (dotimes (i (length x)) (format t "~14d ~5,3f ~5,3f~%" (aref x i) (aref y i) (aref stddevs i)))) (let* ((master (slot-value 'master)) (stats (send master :slot-value 'statistics)) (stat-print-formats (send master :slot-value 'statistics-print-formats)) (hypers (send master :current-hyperparameter-values)) (alpha (aref hypers 0)) (t-value-index (truncate (aref hypers 3)))) (format t (strcat "Mean of S(~d): " (first stat-print-formats) "~%") t-value-index (elt stats 0)) (format t (strcat "Std. Dev. of S(~d): " (second stat-print-formats) "~%") t-value-index (elt stats 1)))))
Defines:print-summary
(links are to index).
<The Slave :close Method>= (<-U) (defmeth slave-proto :close () (ok-or-cancel-dialog "Please use the master to quit"))
<Defaults for Slave>= (<-U) (defvar *default-slave-plot-size* '(300 265)) (defvar *default-slave-plot-stops* 51) (defvar *default-statistics-labels* '("E(S(t))" "SD(S(t))" "Eff. Sample Size")) (defvar *default-statistics-print-formats* '("~5,3f" "~5,3f" "~5,0f"))
Defines*default-slave-plot-size*
,*default-slave-plot-stops*
,*default-statistics-labels*
,*default-statistics-print-formats*
(links are to index).
In designing the C programs, we assume one thing---that no compaction is done during the lifetime of the master object. This allows us to refrain from passing pointers to data arrays each time a routine is called. Instead, all required pointers could be passed once when the master object is created and they remain fixed for the life of the master. Current versions of Lisp-Stat don't do memory compaction or move objects around.
Here's our copyright for C programs.
<C Copyright>= (U->) /** *** $Revision: 1.34 $ of $Date: 1998/07/02 17:16:33 $ *** *** Copyright (C) 1994, 1995, 1998. Doss and Narasimhan *** *** Hani J. Doss (doss@stat.ohio-state.edu) and *** B. Narasimhan (naras@stat.stanford.edu) *** *** This program is free software; you can redistribute it and/or modify *** it under the terms of the GNU General Public License as published by *** the Free Software Foundation; either version 2 of the License, or *** (at your option) any later version. *** *** This program is distributed in the hope that it will be useful, *** but WITHOUT ANY WARRANTY; without even the implied warranty of *** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *** GNU General Public License for more details. *** *** You should have received a copy of the GNU General Public License *** along with this program; if not, write to the Free Software *** Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. **/
<C Routines>= <C Copyright> <Header files> <Beta functions> <Global Variables> <Address Initialization Routine> <The Density Routine> <The LogProbability Routine> <The LogLikelihood Routine> <The hthetaOverHmix Routine> <The compute-log-hmix Routine> <The calcWeights Routine> <The computeStatistics Method> <The computeLawOfFOfT Method> <The computeMeanOfFBarOfT Method> <The consolidateComputation Routine>
Here are the headers files we will need. Note the declaration for the
prototype of the gamma
function that comes with
Lisp-Stat. This function is not declared static
in the
Lisp-Stat source and is therefore available to us.
<Header files>= (<-U) #include <stdio.h> #include <stdlib.h> #include <math.h>
We need two other functions, betdens
and logbeta
are indeed
declared static
in the Lisp-Stat source. Therefore, a
copy of these routines is needed. In Lisp-Stat
, the lnGamma
function is known as mygamma
.
<Beta functions>= (<-U) extern double mygamma(double a); static double logbeta (double a, double b) { static double da = 0.0, db = 0.0, lbeta = 0.0; if (a != da || b != db) { /* cache most recent call */ da = a; db = b; lbeta = mygamma(da) + mygamma(db) - mygamma(da + db); } return(lbeta); } static double betadens (double x, double a, double b) { if (x <= 0.0 || x >= 1.0) return 0.0; return exp(log(x) * (a - 1) + log(1 - x) * (b - 1) - logbeta(a, b)); }
Definesbetadens
,logbeta
,mygamma
(links are to index).
We begin with some global variables for all these C routines. Note
that these are declared static
.
<Global Variables>= (<-U) static double *summaryData, *hyperValuesUsedInMarkovChains, *currentHyperValues, *importanceWeights, *logMixtureDensity, *logConstantsOfProportionality; static double *statistics; static int numberOfMarkovChains, numberOfPoints, numberOfPointsForConstants, numberOfDataValues, numberOfMonths, numberOfDataValues, *indicatorCounts; #define LOW_LOG_PROBABILITY -1e10;
DefinescurrentHyperValues
,hyperValuesUsedinMarkovChains
,importanceWeights
,indicatorCounts
,logConstantsOfProportionality
,logMixtureDensity
,LOW_LOG_PROBABILITY
,numberOfDataValues
,numberOfMarkovChains
,numberOfMonths
,numberOfPoints
,numberOfPointsForConstants
,statistics
,summaryData
(links are to index).
This initialization routine gathers addresses and stores them a static storage area. On the Windows platform, we also need to export the routines defined here.
<Address Initialization Routine>= (<-U) #ifdef _Windows __declspec(dllexport) void initializeAddress (double *a, double *b, double *c, double *d, double *e, double *ff, int *g, int *h, int *i, int *j, int *k, double *l, int *m); #endif /* * Initialization routine, must be called as soon as the slots * are created in the master :isnew method. */ void initializeAddress (argSummaryData, argHyperValuesUsedinMarkovChains, argCurrentHyperValues, argImportanceWeights, argLogMixtureDensity, argLogConstantsOfProportionality, argNumberOfMarkovChains, argNumberOfPoints, argNumberOfMonths, argNumberOfDataValues, argIndicatorCounts, argStatistics, argNumberOfPointsForConstants) double *argSummaryData, *argHyperValuesUsedinMarkovChains, *argCurrentHyperValues, *argImportanceWeights, *argLogMixtureDensity, *argLogConstantsOfProportionality; double *argStatistics; int *argNumberOfMarkovChains, *argNumberOfPoints, *argNumberOfMonths, *argNumberOfDataValues, *argIndicatorCounts, *argNumberOfPointsForConstants; { summaryData = argSummaryData; hyperValuesUsedInMarkovChains = argHyperValuesUsedinMarkovChains; currentHyperValues = argCurrentHyperValues; importanceWeights = argImportanceWeights; logMixtureDensity = argLogMixtureDensity; logConstantsOfProportionality = argLogConstantsOfProportionality; indicatorCounts = argIndicatorCounts; numberOfMarkovChains = *argNumberOfMarkovChains; numberOfPoints = *argNumberOfPoints; numberOfMonths = *argNumberOfMonths; numberOfDataValues = *argNumberOfDataValues; statistics = argStatistics; numberOfPointsForConstants = *argNumberOfPointsForConstants; }
DefinesinitializeAddress
(links are to index).
:f
Method
This routine calculates the j-th posterior density at a point. See
:f
method for master-proto
. The formulas used for this and the
next three methods all follow the development in [cite geye:1994].
<The Density Routine>= (<-U) double f (argJ, argEta, argIndex) int argJ, argIndex; double *argEta; { double nj, numberOfDistinctXS = summaryData[2 * argIndex], theta = summaryData[2 * argIndex + 1], alpha, shape, scale; int index = argJ * 3; if (argJ == 0) nj = -1.0; else nj = argEta[argJ - 1]; alpha = hyperValuesUsedInMarkovChains[index]; shape = hyperValuesUsedInMarkovChains[index + 1]; scale = hyperValuesUsedInMarkovChains[index + 2]; return exp(numberOfDistinctXS * log(alpha) + shape * log(scale) + (shape - 1.0) * log(theta) - scale * theta - mygamma(shape) + nj); }
Definesf
(links are to index).
:logp
Method
This routine calculates the probability that a given point comes
from a particular Markov chain. See :logp
method for
master-proto
.
<The LogProbability Routine>= (<-U) double logp (argJ, argEta, argIndex) int argJ, argIndex; double *argEta; { double tmp, tmp1, sum = 0; int k; for (k = 0; k < numberOfMarkovChains; k++) { tmp = f(k, argEta, argIndex); if (k == argJ) tmp1 = tmp; sum += tmp; } if (tmp1 > 0.0) return log(tmp1 / sum); else return LOW_LOG_PROBABILITY; }
Defineslogp
(links are to index).
:loglik
Method
The routine that calculates the log-likelihood. See :loglik
method
for master-proto
.
<The LogLikelihood Routine>= (<-U) #ifdef _Windows __declspec(dllexport) void logLikelihood (double *a, double *b); #endif void logLikelihood (argEta, result) double *argEta, *result; { double sum = 0.0; int i, j, index = 0; for (j = 0; j < numberOfMarkovChains; j++) { for (i = 0; i < numberOfPointsForConstants; i++) { sum += logp(j, argEta, index); index++; } } *result = sum; }
Definesloglik
(links are to index).
:htheta-over-hmix
Method
See :htheta-over-hmix
method for master-proto
.
<The hthetaOverHmix Routine>= (<-U) double hthetaOverHmix (argIndex) int argIndex; { double numberOfDistinctXS, theta, alpha, shape, scale; numberOfDistinctXS = summaryData[2 * argIndex]; theta = summaryData[2 * argIndex + 1]; alpha = currentHyperValues[0]; shape = currentHyperValues[1]; scale = currentHyperValues[2]; return exp(numberOfDistinctXS * log(alpha) + shape * log(scale) + (shape - 1.0) * log(theta) - scale * theta - mygamma(shape) - logMixtureDensity[argIndex]); }
DefineshthetaOverHmix
(links are to index).
:compute-log-hmix
Method
See :compute-log-hmix
method for master-proto
.
<The compute-log-hmix Routine>= (<-U) #ifdef _Windows __declspec(dllexport) void computeLogHmix (); #endif void computeLogHmix () { int i, k, mn; double sum; mn = numberOfMarkovChains * numberOfPoints; for (i = 0; i < mn; i++) { sum = 0.0; for (k = 0; k < numberOfMarkovChains; k++) { sum += f(k, logConstantsOfProportionality, i); } logMixtureDensity[i] = log(sum); } }
DefinescomputeLogHmix
(links are to index).
:calc-weights
Method
See :calc-weights
method for master-proto
.
<The calcWeights Routine>= (<-U) #ifdef _Windows __declspec(dllexport) void calcWeights (); #endif void calcWeights () { int i, j, index = 0, sampleSize; double tmp, sum = 0.0, meanWeight, sumCenteredWeightsSquared; for (j = 0; j < numberOfMarkovChains; j++) { for (i = 0; i < numberOfPoints; i++) { tmp = hthetaOverHmix(index); importanceWeights[index] = tmp; sum += tmp; index++; } } index = 0; sampleSize = numberOfMarkovChains * numberOfPoints; meanWeight = 1.0 / sampleSize; sumCenteredWeightsSquared = 0.0; for (j = 0; j < numberOfMarkovChains; j++) { for (i = 0; i < numberOfPoints; i++) { importanceWeights[index] /= sum; tmp = importanceWeights[index] - meanWeight; sumCenteredWeightsSquared += tmp * tmp; index++; } } tmp = sumCenteredWeightsSquared / (sampleSize - 1); statistics[2] = floor(sampleSize / (1 + sampleSize * sampleSize * tmp)); }
DefinescalcWeights
(links are to index).
:compute-statistics
Method
See :compute-statistics
method for slave-proto
. In computing
the mean and variance of the mixture of Beta densities, we have to use
the facts E(X) = E(E(X|W)) and V(X) = E(V(X|W)) + V(E(X|W)). In
our case, F(t) is a Beta random variable X_i with probability
w_i, the normalized importance weight.
<The computeStatistics Method>= (<-U) #ifdef _Windows __declspec(dllexport) void computeStatistics (double *a); #endif void computeStatistics (argStatistics) double *argStatistics; { double alpha = currentHyperValues[0], expectation = 0.0, expectationOfConditionalSquaredMean = 0.0, expectationOfVariance = 0.0, tmp, sumCenteredWeightsSquared = 0, betaMean, betaVariance, betaParam1, betaParam2; int i, j, index = 0, tValueIndex = (int) currentHyperValues[3]; <Compute Mean and Variance in C> argStatistics[0] = 1.0 - expectation; argStatistics[1] = sqrt(expectationOfConditionalSquaredMean - expectation * expectation); }
DefinescomputeStatistics
(links are to index).
<Compute Mean and Variance in C>= (<-U U->) for (j = 0; j < numberOfMarkovChains; j ++) { for (i = 0; i < numberOfPoints; i++) { betaParam1 = <C Expression for first beta parameter>; betaParam2 = <C Expression for second beta parameter>; betaMean = betaParam1 / (betaParam1 + betaParam2); tmp = importanceWeights[index] * betaMean; betaVariance = betaMean * betaParam2 / ((betaParam1 + betaParam2) * (betaParam1 + betaParam2 + 1)); expectation += tmp; expectationOfVariance += importanceWeights[index] * betaVariance; expectationOfConditionalSquaredMean += tmp * betaMean; index++; } }
:compute-law-of-f-of-t
Method
See :compute-law-of-f-of-t
method for slave-proto
. Note the
difference in that some parameters need to be passed in this C routine
unlike the Lisp routine.
This method basically calculates for equally spaced values of u in [0,1] the weighted sum
where denotes the beta density with parameters a and b evaluated at u.
<The computeLawOfFOfT Method>= (<-U) #ifdef _Windows __declspec(dllexport) void computeLawOfFOfT (int *n, double *a, double *b); #endif void computeLawOfFOfT (int *n, double *x, double *y) { double alpha, sum, betaParam1, betaParam2; int i, j, k, index, tValueIndex; alpha = currentHyperValues[0]; tValueIndex = (int) currentHyperValues[3]; for (k = 0; k < *n; k++) { sum = 0.0; index = 0; for (j = 0; j < numberOfMarkovChains; j++) { for (i = 0; i < numberOfPoints; i++) { betaParam1 = <C Expression for first beta parameter>; betaParam2 = <C Expression for second beta parameter>; sum += importanceWeights[index] * betadens(x[k], betaParam1, betaParam2); index++; } } y[k] = sum; } }
DefinescomputeLawOfFOfT
(links are to index).
Here are the formulas for the two beta density parameters. The first one is of course
with the understanding that indicates degree of concentration around the exponential family for theta.
The expression for the first beta parameter follows.
<C Expression for first beta parameter>= (<-U <-U) (indicatorCounts[index * numberOfMonths + tValueIndex] + alpha * (1 - exp(-tValueIndex * summaryData[2*index + 1])))
The second one is
which in C is given by the snippet below.
<C Expression for second beta parameter>= (<-U <-U) (alpha + numberOfDataValues - betaParam1)
:compute-mean-of-fbar-of-t
Method
See :compute-mean-of-fbar-of-t
method for slave-proto
. This
method basically calculates for several values of t, the quantity
E(S(t)). The values of t for which this is done is the same as the
list of values used in the slider for t.
<The computeMeanOfFBarOfT Method>= (<-U) #ifdef _Windows __declspec(dllexport) void computeMeanOfFBarOfT (int *n, double *a, double *b, double *c); #endif void computeMeanOfFBarOfT (n, x, y, standardDeviations) double *x, *y, *standardDeviations; int *n; { double alpha = currentHyperValues[0], expectation = 0.0, expectationOfConditionalSquaredMean = 0.0, expectationOfVariance = 0.0, tmp, sumCenteredWeightsSquared = 0, betaMean, betaVariance, betaParam1, betaParam2; int i, j, k, index = 0, tValueIndex; y[0] = 1.0; standardDeviations[0] = 0.0; for (k = 1; k < *n; k++) { index = 0; expectation = 0.0; expectationOfConditionalSquaredMean = 0.0; expectationOfVariance = 0.0; tValueIndex = (int) (x[k]); <Compute Mean and Variance in C> y[k] = 1 - expectation; standardDeviations[k] = sqrt(expectationOfConditionalSquaredMean - expectation * expectation); } }
DefinescomputeMeanOfFBarOfT
(links are to index).
:consolidate-computation
Method in CThis method consolidates all computation by first computing importance weights, then computing all the statistics in one shot.
<The consolidateComputation Routine>= (<-U) #ifdef _Windows __declspec(dllexport) void consolidateComputation (int *n, double *a, double *b, double *c, double *d, double *e, double *ff); #endif void consolidateComputation (n, argStatistics, argDensityAbscissae, argDensityOrdinates, argExpectationAbscissae, argExpectationOrdinates, argStandardDeviationOrdinates) double *argStatistics, *argDensityAbscissae, *argDensityOrdinates, *argExpectationAbscissae, *argExpectationOrdinates, *argStandardDeviationOrdinates; int *n; { int k; calcWeights(); computeLawOfFOfT(n, argDensityAbscissae, argDensityOrdinates); computeMeanOfFBarOfT(&numberOfMonths, argExpectationAbscissae, argExpectationOrdinates, argStandardDeviationOrdinates); k = (int) currentHyperValues[3]; statistics[0] = argExpectationOrdinates[k]; statistics[1] = argStandardDeviationOrdinates[k]; }
DefinesconsolidateComputation
(links are to index).
README
file.
<README file>= Copyright (C) Hani J. Doss and B. Narasimhan -------------------------------------------- This file provides information on installing and using BSA (Bayesian Sensitivity Analysis) software. We expect any serious user of the software to read our paper that has now appeared in @InBook{doss:nara:1998b, author = {Hani J. Doss and B. Narasimhan}, title = {Practical Nonparametric and Semiparametric Bayesian Statistics}, chapter = {Dynamic Display of Changing Posterior in Bayesian Survival Analysis}, publisher = {Springer}, year = {1998}, editor = {Dipak Dey, Peter M\"uller, Debajyoti Sinha}, number = {133}, series = {Lecture Notes in Statistics}, pages = {63--87} } We have included a copy of this paper (ddg.ps.gz) in the subdirectory ddg. Requirements: A) XLisp-Stat 3.53-5 or higher. Freely available from ftp://ftp.stat.umn.edu/pub/xlispstat/current B) Windows or Unix. Windows includes 95 and NT. (Mac version is in development.) C) On Unix, you will also need a C compiler. Step 1. ------ You probably received the entire package as a compressed archive named bsa.tgz. On Unix, the contents of the archive may be extracted into a directory called bsa by executing the commands: gunzip -c bsa.tgz | tar -xvf - If you are using GNU tar, this can be done in one shot via: tar -xvzf bsa.tgz To extract the files on Windows, you need to use an extractor like WinZip. See http://www.winzip.com for more details. Step 2 ------ On Windows, you skip this step. Macs are not supported yet because the authors don't know how to create a dynamic shared library. On Unix, you need to configure the software to your environment. Change to the bsa directory and type ./configure make This will compile the lisp files and create a shared library for your platform. Step 3 ------ Start using the program. In Unix, the command xlispstat BreastCancerRadiationOnly or xlispstat BreastCancerRadiationChemo will fire up the readymade examples. On Windows, you fire up Lisp-Stat and load the files BreastCancerRadiationOnly.lsp or BreastCancerRadiationChemo.lsp to proceed. If you are inclined to use commands instead of mouse-clicks, you can send ``messages'' to the master object in the listener window as shown in the following examples: (send BreastCancerRadiationOnly-master :current-hyperparameter-values '(1 4 150 12)) (send BreastCancerRadiationOnly-master :print-all-statistics) (send BreastCancerRadiationOnly-master :slot-value 'bsa::importance-weights) Note how the master-objects names are related to the lisp file names and how internal slot-names have to be prefixed by the package name. For dealing with a new problem, we provide a few points regarding the software. A number of inputs are required for running the program. These are discussed in detail in the literate program (bsa.ps) under the section titled ``Introduction.'' For convenience we repeat the details here. This excerpted part is indented two spaces for easy reference. First note that the software only does sensitivity analysis. No general facility is provided for generating observations from Markov chains. Indeed, since the range of models for which MCMC methods are applicable are large and such methods most likely involve problem-specific issues, it is our opinion that building such a supertool, if it is at all possible, is a non-trivial task. However, the Fortran program used in generating the output for our example is included along with this software and can be used for models similar to ours. Of course, any appropriate method may be used to generate the samples as long as the output is available in a form usable by our software. The requirements on the data that can be used with our software are spelt out below. Corresponding to each Markov chain output, there must be two files with the extensions ".in" (input file) and ".out" (output file). For example, "mc1.in" and "mc1.out". The input file must have the following structure. The first four items in the file can be anything, string or number, either on a single line or any conceivable combination of lines. The next three items *must* be the shape of the Gamma distribution on theta, the scale of the Gamma distribution on theta---the parametrization for shape a and scale b is proportional to x^{a-1} exp(-bx)---and M(R). The next three values values following these quantities can be anything, but the one following it should be the number of data points, that is, the number of sets or intervals. In the Fortran program we use -99 is used to denote infinity. Nothing else is read from the input file. The output file must have the following structure for each data point generated by the Markov chain. The value of theta must be followed by the number of distinct values of the data points, which must be followed by a frequency table of the actual data value and the corresponding frequency. The layout of the values on lines does not matter as long as at least a single white space delimits values. If this structure is violated, errors will result. A peek at the data files included with this software will help the reader. It is assumed that a proper installation of XLisp-Stat is available. For a new problem, you probably have several Markov chain output files although even one should work. (In the latter case, reweighting reduces to simple Importance Sampling.) a) It is best to create a new directory for your problem and have your data files there. For example, the directory "BreastCancer" contains relevant data files for our Breast Cancer data. b) The only files you actually need to run the program are: 1) Either one of bsa.fsl or bsa.lsp 2) Either one of utility.fsl or utility.lsp 3) Either one of call-by-reference.fsl or call-by-reference.lsp and 4) the shared library libbsa.so or libbsa.sl as the case may be. On Windows, instead of the shared library, we need the whole subdirectory "win". Copy these files/directories to where you have the data files and work there. 5) The file new-problem.lsp. Invoke Lisp-Stat and load the file named "new-problem.lsp". The first time (and first time only), the following inputs will be needed. Inputs ------ 1) An indentifier for uniquely identifying the run. Use a meaningful name here. Let us assume this is BreastCancer (the default) in the discussion below. 2) The number of Markov chain outputs that you want to use for reweighting. Must be >= 1, with 1 denoting straight Importance Sampling. 3) The names of the files containing output from Markov chains, *without the extensions*. The software will automatically tag on the extensions .in and .out when looking for files. 4) The number of points per chain to use in the dynamic reweighting. Thus if you specify 50 and have 8 chains, then 50 points from each of the eight chains (= 400) will be used. 5) An initial guess for maximizing the log-quasilikelihood which will provide an estimate of the constants of proportionality. 6) The range between which you want to vary the hyperparameters. If you use only one chain, then you *must* specify the range. Otherwise, the range will be a single point. If you specified many chains, the default settings for each hyperparameter will be the minimum and maximum values from values used in all Markov chains. The number of stops should be an odd number if you want to hit the middle of the interval. 7) The number of points to use in estimating the constants of proportionality. If you use all of the data, the estimation can take a while. It is almost always better to go with the default or less. (If you are really interested in using more points, then start off with 10, and use the estimates thus obtained to start your larger optimization. This will save you a lot of time.) Once you specify this, the maximization will take place. This is a good point to go refill your coffee cup. After the estimation, two files are created so that you are not bombarded with questions in subsequent explorations. For example, BreastCancer.lsp (and) BreastCancer.run For repeating the exploration next time, you only need to load the file BreastCancer.lsp into XLisp-Stat. This will bypass all the inputs we discussed above except for the question about ranges. The file BreastCancer.run contains pre-processed information for faster loading and will be used when BreastCancer.lsp is loaded. All files are text files and can be viewed with a text viewer. Examples for Breast Cancer Data ------------------------------- The two files BreastCancerRadiationOnly.lsp (and) BreastCancerRadiationChemo.lsp and the corresponding run files are provided for experimentation. These exist in the main directory "bsa" itself and concern the dataset on two treatments described in the paper Dynamic Display of Changing Posterior in Bayesian Survival Analysis by Hani J. Doss and B. Narasimhan By default, they use 50 points each and 8 Markov chains. We wish to note that an earlier version of the software was used to produce the results in the paper and subsequently a bug was found. This does not change any of the conclusions of the paper but the numbers shown in table 1.1 in the paper are off from the actual values obtained using the software. A replacement is provided in the file newtable.tex and shows that the agreement between estimates obtained by reweighting and those obtained by actual runs of Markov chains is, if anything, better that what the original table indicated. Just load the lisp files into XLisp-Stat to do the dynamic exploration. If you have a sufficiently fast machine, you can use more points. To completely reproduce our work from scratch, you need to use the data files in the subdirectory "BreastCancer". The authors may be contacted via email at: doss@stat.ohio-state.edu (Hani J. Doss) naras@stat.Stanford.EDU (B. Narasimhan) Note on the program itself -------------------------- The programs are written in a literate style using the Noweb literate programming tools. We provide two utility packages that one can use independently of the program: utility.lsp and call-by reference.lsp. The former contains functions we have found useful in writing Lispstat programs; the latter implements a call-by-reference glue between Lispstat and C. Enjoy!
:hyperparameter-ranges
or follow it.
hyperparameters-list
is a list
of hyperparameter points (that is, four-tuples), and master-object
is the master object for the problem.
(dolist (hyperparameters hyperparameters-list) (send master-object :current-hyperparameter-values hyperparameters))
number-of-months
. This has to be done
carefully since it plays a role in many places.
This research was supported by Air Force Office of Scientific Research Grant F49620-94-1-0028.
We are indebted to Fred W. Huffer for the Fortran program dirichlet
that comes with this software. All we did was modify it slightly to
suit our data.
This list is generated automatically. The numeral is that of the first definition of the chunk.
Here is a list of the identifiers used, and where they appear. Underlined entries indicate the place of definition. This index is generated automatically.
[1] Hani J. Doss and B. Narasimhan. Bayesian Poisson regression: Sensitivity analysis through dynamic graphics. Technical report, Penn State Erie, The Behrend College, 1994.
[2] Hani J. Doss and B. Narasimhan. Dynamic display of changing posterior in Bayesian survival analysis. In Practical Nonparametric and Semiparametric Bayesian Statistics, D. Dey, P. Müller, and D. Sinha, eds, 1998. Springer-Verlag, N. Y.
[3] C. J. Geyer. Estimating normalizing constants and reweighting mixtures in Markov chain Monte Carlo. Technical report, School of Statistics, University of Minnesota, 1994.
[4] Donald E. Knuth. Literate Programming. Center for Study of Language and Information, Stanford University, 1992.
[5] A. Kong, J. S. Liu, and W. H. Wong. Sequential imputations and Bayesian missing data problems. J. Amer. Statist. Assoc., 1994.
[6] Christopher Lee. Literate programming---propaganda and tools. World Wide Web, 1994. Available from http://www.ius.cs.cmu.edu/help/Programming/literate.html.
[7] Norman Ramsey. Literate programming simplified. IEEE Software, 1994.
[8] Norman Ramsey. The Noweb Hacker's Guide, 1994. Manual for Noweb.
[9] Norman Ramsey. The noweb home page. World Wide Web, 1995. On the Web at http://www.cs.virginia.edu/ nr/noweb/.
[10] Luke Tierney. LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics. John Wiley &Sons, 1990.