STATISTICS - THEORY AND PRACTICE

May 30-31, 2008

 

Title: Order Thresholding

 Mike Akritas, Penn State University

Abstract: When testing against a high-dimensional alternative, omnibus tests designed to detect any departure from the null hypothesis have low power. Neyman's (1937) truncation idea, though motivated by a different type of problem, served as the springboard for the related approaches of adaptive truncation, and soft and hard thresholding (cf. Donoho and Johnstone, 1994; Fan and Lin, 1998; Spokoiny, 1996). This talk presents a new thresholding method, called order thresholding, based on L-statistics. Numerical comparisons with existing thresholding methods and an extension to testing problems in high-dimensional factorial designs are presented.
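
For orientation, the hard and soft thresholding rules mentioned in the abstract can be sketched as follows. This is only an illustration of those two existing rules, not of the order thresholding method presented in the talk; the function names and the example threshold are assumptions made for this sketch.

```python
import math


def hard_threshold(values, t):
    """Keep entries whose magnitude exceeds t; set the rest to zero."""
    return [x if abs(x) > t else 0.0 for x in values]


def soft_threshold(values, t):
    """Shrink every entry toward zero by t; entries within t of zero become zero."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in values]


# Hypothetical standardized statistics and threshold, chosen for illustration.
z = [0.5, -2.0, 3.0]
print(hard_threshold(z, 1.0))
print(soft_threshold(z, 1.0))
```

Hard thresholding leaves the surviving entries unchanged, while soft thresholding also shrinks them by the threshold.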

 

Title: Negative Dependence and the Simes Inequality

 Henry W. Block, Thomas H. Savits and Jie Wang, University of Pittsburgh

Abstract: It is shown that the Simes Inequality is reversed for a broad class of negatively dependent distributions. This resolves a conjecture of Sarkar (Ann. Stat. 26, 1998, 494-504).
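
As background, the Simes procedure combines n p-values through their order statistics: the global null is rejected at level alpha when p_(i) <= i * alpha / n for some i. A minimal sketch of the equivalent combined p-value is given below; the function name is an assumption made for this illustration, and the sketch does not address the negative dependence setting studied in the talk.

```python
def simes_pvalue(pvalues):
    """Simes combined p-value: the minimum over i of n * p_(i) / i."""
    n = len(pvalues)
    ordered = sorted(pvalues)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


# Hypothetical p-values, chosen for illustration.
print(simes_pvalue([0.01, 0.04, 0.03]))
```

Rejecting when this combined value is at most alpha is the same as applying the order-statistic rule above.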

 

Title: Rich Transformation Models and More.

 Kjell Doksum, University of Wisconsin

Abstract: Regression transformation models are models where an increasing transformation of the response satisfies a linear model in terms of increasing transformations of covariates. Richard Johnson and his collaborators have proposed such models. These will be discussed along with other proposals such as the Box-Cox transformation model. Properties of statistical procedures will be considered for two cases: The case where the transformation is known and the case where the transformation is unknown and needs to be estimated. Conditions under which properties of statistical procedures are asymptotically the same for these two cases will be presented.
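
The Box-Cox transformation mentioned above is the best known example of a known increasing transformation of a positive response. A minimal sketch follows; the function name is an assumption made for this illustration, and estimation of the transformation parameter is not shown.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("the Box-Cox transform requires y > 0")
    if lam == 0:
        # The limit of (y**lam - 1) / lam as lam tends to zero.
        return math.log(y)
    return (y ** lam - 1.0) / lam


print(box_cox(4.0, 0.5))
print(box_cox(4.0, 0))
```

For every fixed parameter value the transform is increasing in y, which is the property the transformation models in the abstract rely on.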

 

Title: Finite Markov Chain Imbedding and Its Application To Matching Probability Between Two Markov Dependent DNA Sequences

 James C. Fu, University of Manitoba

Abstract: In this talk, the finite Markov chain imbedding technique for studying the distributions of runs and patterns in a sequence of multi-state trials will be introduced. The method is extended to study the distribution of the longest match between two Markov dependent DNA sequences. The method is simple, efficient and accurate. To illustrate the method, numerical results on matching probabilities for both i.i.d. and Markov dependent sequences will be presented.
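
The imbedding idea can be illustrated on a much simpler special case than the DNA matching problem of the talk: the probability that a run of k consecutive successes occurs in n i.i.d. Bernoulli trials. The sketch below tracks the current run length as the state of a Markov chain with an absorbing state k; the function name and the two-state setting are assumptions made for this illustration.

```python
def prob_success_run(n, k, p):
    """Probability of at least one run of k successes in n Bernoulli(p) trials.

    State j < k is the length of the current success run; state k is absorbing
    and means a run of length k has already occurred.
    """
    state = [1.0] + [0.0] * k  # start with run length 0
    for _ in range(n):
        new = [0.0] * (k + 1)
        for j in range(k):
            new[0] += state[j] * (1.0 - p)  # a failure resets the run
            new[j + 1] += state[j] * p      # a success extends the run
        new[k] += state[k]                  # the absorbing state keeps its mass
        state = new
    return state[k]


print(prob_success_run(3, 2, 0.5))
```

The waiting-time distribution is read off from the absorbing state, so the cost is linear in n for a fixed pattern length.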

 

Title: Regression Model Checking with Long Memory Design and Errors

 Hira L. Koul, Michigan State University

Abstract: This talk will discuss a test for fitting a parametric regression model with long memory (LM) Gaussian design and nonparametric heteroscedastic LM moving average errors. The asymptotic null distribution of this test will be discussed in some detail. The proposed test is illustrated by fitting a linear regression model between two currency exchange rate data sets that exhibit LM.

 

Title: Some statistical applications in the financial services industry

 Wenqing Lu, HSBC

Abstract: With rich credit bureau data, financial service providers rely on statistical analysis and modeling techniques to make decisions on direct marketing, pricing, and risk management. This talk reviews some common statistical methods used to improve decision making in the financial industry. The nature of the economic cycle can make statistical prediction very challenging, as demonstrated in the current credit crisis. The presenter has work experience at Fair Isaac & Co, Washington Mutual, and HSBC.

 

Title: Block sampler for univariate and multivariate asymmetric stochastic volatility models

 Yasuhiro Omori, University of Tokyo

Abstract: We discuss an efficient Markov chain Monte Carlo algorithm for the univariate and multivariate asymmetric stochastic volatility (ASV) models where there exists a correlation between today's return and tomorrow's volatility. The state vector is divided into several blocks where each block consists of many state variables. For each block, corresponding disturbances are sampled simultaneously from their conditional posterior distribution. The algorithm is based on the multivariate normal approximation of the conditional posterior density and exploits a conventional simulation smoother for a linear and Gaussian state space model. The algorithm is applied to the univariate ASV models, multivariate factor ASV models, and multivariate ASV models.

 

Title: Revisiting Local Asymptotic Normality (LAN) and Passing on to Local Asymptotic Mixed Normality (LAMN)

 George G. Roussas, University of California, Davis

Abstract: The standard set-up of Locally Asymptotically Normal families of probability measures is revisited, and the basic asymptotic results are reviewed. By way of application, some statistical examples are considered in the framework of hypothesis testing and asymptotic efficiency of estimates. Some generalizations are also mentioned. Certain families of probability measures do not enjoy the property of being Locally Asymptotically Normal, but rather are what has been termed Locally Asymptotically Mixed Normal. Such families are briefly considered, and some general results are indicated.

 

Title: Confidence Regions for Parameters of Linear Models

 Andrew L. Rukhin, National Institute of Standards and Technology and University of Maryland at Baltimore County

Abstract: A method is suggested for constructing a conservative confidence region for the parameters of a general linear model on the basis of a linear estimator. In meta-analytical applications, when the results of independent but heterogeneous studies are to be combined, this region can be employed with little to no knowledge of error variances. The required optimization problem is formulated and some properties of its solution are described. The motivating example is a study in which several laboratories performed measurements via different techniques of gold vapor pressure as a function of the absolute temperature.

 

Title: System Signatures in Dynamic Reliability Settings

 Francisco J. Samaniego, University of California, Davis

Abstract: The concept of the 'signature' of an engineered system is described, and some basic representation and preservation theorems concerning signatures are reviewed. We then define the dynamic signature of a system, conditioned on the events that the system is working at the inspection time t and that exactly i components have failed by time t, assuming these events are compatible. Applications of dynamic signatures to nonparametric modeling in reliability and to the engineering practice of "burn in" are treated in some detail. This work is joint with N. Balakrishnan and J. Navarro.
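
As a reminder of the static notion, the signature of a system with n components having i.i.d. lifetimes is the vector whose i-th entry is the probability that the i-th component failure causes system failure. The sketch below computes it by enumerating failure orderings; the function names and the example structure (component 0 in series with components 1 and 2 in parallel) are assumptions made for this illustration, and the dynamic signature of the talk is not computed here.

```python
from fractions import Fraction
from itertools import permutations


def system_signature(n, works):
    """Signature of an n-component system.

    `works` maps the set of functioning components to True or False and is
    assumed to return False for the empty set.
    """
    counts = [0] * n
    total = 0
    for order in permutations(range(n)):  # equally likely failure orderings
        alive = set(range(n))
        for i, component in enumerate(order):
            alive.discard(component)
            if not works(alive):
                counts[i] += 1  # the (i + 1)-th failure brought the system down
                break
        total += 1
    return [Fraction(c, total) for c in counts]


def series_parallel(alive):
    """Hypothetical structure: component 0 in series with (1 parallel 2)."""
    return 0 in alive and (1 in alive or 2 in alive)


print(system_signature(3, series_parallel))
```

Enumeration is feasible only for small systems, since the number of orderings grows as n factorial.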

 

Title: Rates of convergence for estimators of convolutions of densities

 Anton Schick, SUNY Binghamton

Abstract: The goal of this talk is to give an overview of various types of convergence results for estimating the convolution of a density with itself. The estimator of this convolution is a local U-statistic based on a random sample from the base density. The surprising fact is that under rather mild assumptions on the base density this estimator has a convergence rate of the order root-n, point-wise and in various norms, and (functional) central limit theorems can be proved in the corresponding normed spaces. Integrability conditions on the base density are key to these results. A violation of these conditions results in slower rates of convergence. The behavior of the local U-statistic is now similar to that of kernel estimators with the customary trade-off between bias and variance terms. These slower rates of convergence, however, are still faster than the optimal rates of convergence for kernel estimators based on a sample from the convolution.
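
One common form of such a local U-statistic averages a kernel, evaluated at x - X_i - X_j, over all ordered pairs of distinct observations. The sketch below assumes a Gaussian kernel and a user-supplied bandwidth; both choices, and the function names, are assumptions made for this illustration and are not taken from the talk.

```python
import math


def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def convolution_estimate(x, sample, bandwidth):
    """Estimate the density of X1 + X2 at x from an i.i.d. sample of X."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:  # use pairs of distinct observations only
                total += gaussian_kernel((x - sample[i] - sample[j]) / bandwidth)
    return total / (n * (n - 1) * bandwidth)


# Hypothetical data, chosen for illustration.
print(convolution_estimate(0.0, [-0.3, 0.1, 0.4, -0.2], 0.5))
```

Because the average runs over pairs rather than single observations, the cost of a direct evaluation is quadratic in the sample size.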

 

Title: Intrinsic Aging and Classes of Nonparametric Distributions

 Moshe Shaked (with Rhonda Righter and J. George Shanthikumar), University of Arizona

Abstract: A general framework is developed for understanding the nonparametric (aging) properties of nonnegative random variables. For this purpose, aging properties of various residual and conditional lifetimes are considered. The notion of intrinsic aging is also used, and the aging properties of the intrinsic life and the actual life are related. Some new concepts of aging are introduced as a result of the general setup. Several recent results in the literature are special cases of the general results.

 

Title: A Bivariate Generalized von Mises Distribution with Applications to Circular Genomes

 Grace S. Shieh, Academia Sinica, TAIWAN

Abstract: Recent studies show that gene order is extensively conserved between closely related species, but rapidly becomes less conserved among more distantly related species. Furthermore, this trend is likely to be universal in prokaryotes (Tamames, 2001; Wolf, 2001). Therefore, gene order conservation is a valid phylogenetic measure (Bentley and Parkhill, 2004), and we propose to infer evolutionary independence and distance of any pair of circular genomes, which constitute 476 (each having a single chromosome) of the 566 prokaryotic genomes (NCBI, Aug. 7, 2007), by comparing their distributions of orthologs. Since the distributions of orthologs in a given pair of genomes are often either multi-modal or asymmetric, a bivariate distribution with generalized von Mises distributions (BGVM) is proposed to model a pair of circular genomes. Some distributional properties of BGVM are addressed. Maximum likelihood estimation and a likelihood ratio test for independence are developed. The procedures are applied to three pairs of circular genomes to infer their evolutionary independence; after the independence hypothesis has been rejected, their evolutionary distances are also estimated. These results are consistent with those based on different types of data or different methods. Future work on a new measure of association between paired circular genomes will also be discussed.

 

Title: On Competing Risks and Degradation Processes: Modeling and Inferential Issues

 Nozer Singpurwalla, George Washington University

 

Title: The Five Most Consequential Ideas in the History of Statistics

 Stephen M. Stigler, University of Chicago (formerly University of Wisconsin-Madison)

Abstract: Modern statistics invariably and understandably focuses upon the latest in technique and technology. The history of statistics permits a broader view and can identify ideas that not only have driven development but also retain contemporary relevance. The five most influential of these are identified. What are they? Surely, you may say, they would include Bayes’s Theorem? But no. Nor does the list include important ideas such as cross-validation or the bootstrap, rank or robust statistics, simulation or loglinear models. What then would be on the list? You may be surprised.

 

Title: Model estimation, checking and evaluation via prediction

 L.J. Wei, Harvard University

Abstract: Recent technological advances in obtaining bio- and genetic markers have drastically enhanced the knowledge of certain disease processes and the potential for accurately predicting patients' clinical outcomes. Traditional statistical methods for so-called individualized/personalized medicine with such markers are derived under a rather strong assumption, namely that one can accurately identify the true model (at least for the large sample case) relating the predictors to their corresponding clinical phenotype variable(s). In practice, however, it is difficult, if not impossible, even to identify the class of models that contains the true one. Therefore, it is interesting and important to investigate whether the standard statistical methods for model estimation, evaluation and comparison can be modified when the fitted model may not be correctly specified. In this talk, we discuss new procedures for predicting future observations and for evaluating and comparing prediction rules. One key feature of the proposals is that their validity does not require the assumption that the fitted models are correct. Moreover, the new proposal provides a reliability measure of the estimated prediction precision, an important component for model evaluation and checking. The new methods are illustrated with examples with continuous, binary and censored responses.