Seminars & Invited Talks

September 23, 2011
12 PM
One Capitol Square
Room 305

Lynne T. Penberthy, Ph.D.
Associate Professor
Department of Internal Medicine
Virginia Commonwealth University
VCU Health Systems
School of Medicine
Richmond, VA

Massey Cancer Center Cancer Research, Informatics and Services (CRIS) Core

The Cancer Research, Informatics and Services (CRIS) Core is a multi-level system of data sources and sophisticated software applications and analytic services. The CRIS Core provides integrated patient level source data and datasets, analytic and reporting services, support and training to clinical research staff and investigators performing clinical trials, and performs innovative informatics research. In this seminar, the CRIS Core technologies and capabilities will be described. Collaborative opportunities between the CRIS Core, Biostatistics Shared Resource, and others will also be highlighted.

September 30, 2011
12 PM
Sanger Hall
Room 2-020

Paul A. Harris, Ph.D.
Director, Office of Research Informatics Operations
Associate Professor
Department of Biomedical Informatics
Department of Biomedical Engineering
Vanderbilt University
Nashville, TN

Research Electronic Data Capture (REDCap) - Planning, Collecting and Managing Data for Clinical and Translational Research

REDCap (Research Electronic Data Capture) is a software application and workflow methodology designed to collect and manage data for research studies. REDCap study databases are secure, web-based applications that are easy to create, launch and manage on a project-by-project basis. REDCap uses a study-specific data dictionary to eliminate all programming requirements for the creation of electronic case report forms and participant survey instruments for individual studies, making it extremely fast to develop and launch for a study of any size. Vanderbilt developed and launched REDCap in 2004 and began sharing the software with other academic and non-profit institutions in 2005 at no cost under a unique consortium dissemination model. This consortium now consists of 271 active partners across six continents serving 30,860 end-users (www.project-redcap.org).

This presentation will begin by introducing a series of 'best practices' for consideration when planning any research data collection strategy. REDCap will then be introduced as a practical software platform for actual implementation. Finally, a discussion of the REDCap consortium will include advanced topics related to reuse of standardized validated instruments and methods for establishing and managing data in multi-center studies.

October 28, 2011
12 PM
One Capitol Square
Room 305

Krista Y. Christensen, M.P.H, Ph.D.
Epidemiologist
U.S. Environmental Protection Agency
National Center for Environmental Assessment-Washington
Washington, DC

Assessing the Health Impact of Environmental Chemicals

Risk assessment is an important component of chemical evaluation and the regulatory process. Risk assessment involves estimation of both the nature and probability of adverse health outcomes that may occur from exposure to a given substance. At the U.S. Environmental Protection Agency, a 4-step process is used to assess risk, involving: hazard identification, dose response assessment, exposure assessment, and risk characterization.

Cumulative risk assessment considers multiple substances or factors together, rather than one at a time. There are methodological and statistical challenges to cumulative risk assessment, such as correlation among exposures and high dimensional data. One approach to address these concerns utilizes non-linear optimization, which combines information about many exposures into a single weighted sum. An example using NHANES data on liver function will be used to outline this approach.

November 11, 2011
12 PM
One Capitol Square
Room 305

Gang Zheng, Ph.D.
Mathematical Statistician
Office of Biostatistics Research
National Heart, Lung and Blood Institute
Bethesda, MD

Joint Analysis of Genetic Association with Binary and Quantitative Traits with Outcome Dependent Sampling

We study the analysis of a joint association of a genetic marker with both binary (case-control) and quantitative (continuous) traits, where the quantitative trait values are only available for the cases due to outcome-dependent sampling (e.g., data sharing). Data sharing has become common in genetic association studies, under which a phenotype of interest is not measured for some subgroup. The trend test (or Pearson’s test) and the F test are often used to analyze the binary and quantitative traits, respectively. Due to the outcome-dependent sampling, the usual F test can only be applied using the subgroup with the observed quantitative traits. We propose a modified F test by also incorporating the genotype frequencies of the subgroup whose traits are not observed. Further, a combination of this modified F test and Pearson’s test is proposed by Fisher’s combination of their p-values as a joint analysis. Due to the correlation of the two analyses, we propose to use a Gamma (scaled chi-squared) distribution to fit the asymptotic null distribution for the joint analysis. The proposed modified F test and the joint analysis can also be applied to test single trait association (either binary or quantitative trait). Through simulations, we identify the situations under which the proposed tests are more powerful than the existing ones. Application to a real dataset of rheumatoid arthritis is presented.
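
As background to the joint analysis described in this abstract, the following is a minimal sketch of Fisher's combination for two p-values under the textbook assumption that the two tests are independent. It does not reproduce the Gamma approximation the talk proposes for correlated tests, and the function name is illustrative rather than taken from the speaker's work.

```python
import math


def fisher_combined_pvalue(p1, p2):
    """Combine two p-values with Fisher's method, ASSUMING independence.

    The statistic T = -2 * (ln p1 + ln p2) is chi-squared with 4 degrees
    of freedom under the null, whose upper tail has the closed form
    exp(-T / 2) * (1 + T / 2).
    """
    for p in (p1, p2):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * (math.log(p1) + math.log(p2))
    combined = math.exp(-statistic / 2.0) * (1.0 + statistic / 2.0)
    return statistic, combined


if __name__ == "__main__":
    t, p = fisher_combined_pvalue(0.05, 0.05)
    print(f"T = {t:.4f}, combined p = {p:.4f}")
```

When the two tests are correlated, as in the abstract, this chi-squared reference distribution is no longer valid, which is the motivation given for fitting a Gamma distribution instead.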

November 18, 2011
12 PM
One Capitol Square
Room 305

Karl E. Peace, Ph.D.
Professor, Georgia Cancer Coalition Distinguished Cancer Scholar
Director of the Center for Biostatistics
Georgia Southern University
Jiann-Ping Hsu College of Public Health
Statesboro, GA
and
Adjunct Professor, Department of Biostatistics
Virginia Commonwealth University
Richmond, VA

The Importance of Numbers (of Patients) in Clinical Trials

Appropriate attention to the number of patients needed to answer an important research or medical question is often not given at the design stage of many clinical trials. Numerous reports may be found in the medical literature of clinical trials with low power (i.e., small numbers of patients). When comparing treatments in trials with small numbers or low power, confidence intervals are likely to cover 0 and to be relatively wide. Hence, such trials are unlikely to provide a conclusion of either treatment similarity or treatment superiority, leading one to question the ethics and logic of conducting such trials and to recommend that a discussion of power be included in the informed consent document.

Positive Control Trials with insufficient power can also lead to decisions based upon false positive results. In fact, trials with low power, of drugs whose intrinsic efficacy is borderline, lead to high false positive results. This will be illustrated by focusing on publications in Cancer of comparative trials where the goal was to assess whether a new candidate therapy was superior to an existing therapy.

December 9, 2011
12 PM
One Capitol Square
Room 305

Sally Hunsberger, Ph.D.
National Cancer Institute, Biometric Research Branch

A Finite Mixture Survival Model to Characterize Risk Groups of Neuroblastoma

Neuroblastoma is a childhood cancer with patients experiencing heterogeneous survival outcomes despite aggressive treatment. Disease outcomes range from early death to spontaneous regression of the tumor followed by cure. Due to this heterogeneity, it is of interest to identify patients with similar types of neuroblastoma so that specific types of treatment can be developed. Oncologists are especially interested in identifying patients who will be cured so that the minimum amount of a potentially toxic treatment can be given to this group of patients. We analyze a large cohort of neuroblastoma patients and develop a finite mixture model that uses covariates to predict the probability of being in a cure group or other (one or more) risk groups. A prediction method is developed that uses the estimated probabilities to assign a patient to different risk groups. The robustness of the model and the prediction method is examined via simulation by looking at misclassification rates under mis-specified models.

February 3, 2012
12 PM
One Capitol Square
Room 305

Richard A. Rode, Ph.D.
Research Fellow
Statistics at Abbott Laboratories
Chicago, IL

Clinical Drug Development: The Use of Surrogate Markers vs. Clinical Endpoints

Clinical drug development for diseases requiring the chronic administration of treatment frequently relies on changes in either biomarkers or surrogate markers to initially inform the regulatory review process and consequently medical practice. However, longer-term effects of treatment on patient safety may not be known at the time of marketing approval (authorization) and are reported years later following completion of a clinical outcome(s) trial. Examples of this apparent discrepancy have recently been reported in several different disease areas (e.g., obesity, diabetes). Therefore, the intent of this seminar is to review statistical elements included in the following FDA Guidance for Industry: 1) Developing Products for Weight Management and 2) Diabetes Mellitus - Evaluating Cardiovascular Risk in New Antidiabetic Therapies to Treat Type 2 Diabetes.

February 24, 2012
12 PM
One Capitol Square
Room 305

William Anderson, Ph.D.
Assistant Professor of Mathematics and Statistics
Department of Mathematics and Computer Science
University of Richmond
Richmond, VA

An Intensity Based Corporate Default Model with Frailty and Contagion Effects

The current and ongoing financial crisis underscores the need for default models which include common unobserved covariates (a form of frailty), as well as account for correlation and feedback effects (contagion). We introduce and estimate a new point process model that incorporates these crucial factors. The model is calibrated to US corporate default data spanning the years 1970 to 2008. Several nested default models are also considered, and goodness-of-fit tests suggest the necessity of the inclusion of frailty and contagion in default models. Further statistical tests suggest the model replicates both the level and time-series variations of default rates for in-sample and out-of-sample periods. Our model and findings have important implications in risk management and in analysis of credit derivatives.

March 23, 2012
12 PM
One Capitol Square
Room 305

Vernon M. Chinchilli, Ph.D.
Distinguished Professor and Chair
Department of Public Health Sciences
Penn State Hershey College of Medicine
Hershey, PA

Matrix-based Concordance Correlation Coefficient for Repeated Measures

In many clinical studies, Lin’s concordance correlation coefficient (CCC) is a common tool to assess the agreement of a continuous response measured by two raters or methods. However, the need for measures of agreement may arise for more complex situations, such as when the responses are measured on more than one occasion by each rater or method. In this work, we propose a new CCC in the presence of repeated measurements, called the matrix-based concordance correlation coefficient (MCCC) based on a matrix norm that possesses the properties needed to characterize the level of agreement between two p × 1 vectors of random variables. It can be shown that the MCCC reduces to Lin’s CCC when p = 1. For inference, we propose an estimator for the MCCC based on U-statistics. Furthermore, we derive the asymptotic distribution of the estimator of the MCCC, which is proven to be normal. The simulation studies confirm that overall in terms of accuracy, precision, and coverage probability, the estimator of the MCCC works very well in general cases especially when n is greater than 40. Finally, we use real data from an Asthma Clinical Research Network (ACRN) study and the Penn State Young Women’s Health Study for demonstration.
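
For readers unfamiliar with the single-occasion case (p = 1) that the MCCC is said to reduce to, here is a minimal sketch of Lin's CCC computed from paired readings. It uses the 1/n moment estimators and does not implement the matrix-based MCCC or its U-statistic estimator; the function name and sample values are illustrative.

```python
from statistics import fmean


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired readings.

    CCC = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2),
    with variances and covariance computed using a 1/n divisor.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    rater_a = [1.0, 2.0, 3.0]
    rater_b = [2.0, 3.0, 4.0]  # perfectly correlated but shifted by 1
    print(lin_ccc(rater_a, rater_b))
```

The shifted example shows why agreement differs from correlation: the two raters are perfectly correlated, yet the constant offset pulls the CCC well below 1.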

March 30, 2012
12 PM
One Capitol Square
Room 305

Donald A. Berry, Ph.D.
Professor
Department of Biostatistics
The University of Texas MD Anderson Cancer Center
Houston, TX

Statistical Leadership and Keys to Becoming a Successful Biostatistical Consultant

Statistics is a fantastic profession. Consulting can be fun and also rewarding. We get to deal with a variety of types of people and disciplines. We get to deal with a variety of problems, most of which have interesting twists and turns. Consulting can be challenging and also frustrating. How does one interact with a collaborator who doesn't seem to be able to follow the simplest of arguments? I'll give examples. As regards becoming a successful biostatistician, I'll consider various measures of success and suggest paths to achieve them. I hope that there will be many questions and lots of discussion.

April 6, 2012
12 PM
One Capitol Square
Room 305

Todd Coffey, Ph.D.
Senior Statistical Scientist II
Seattle Genetics, Inc.
Bothell, WA

Nonclinical Statistics: Past, Present, Future and the Changing Role of the Statistician

Nonclinical statistics is the application of statistical theory, methods, and thinking to medical research outside the sphere of clinical trials. For many years the nonclinical statistician has been primarily a consultant, working as a specialized expert to solve challenging problems with small groups of scientists. While statisticians continue in that role at many large companies, there has been a subtle yet detectable shift in the perception and expectations for nonclinical statisticians. This shift can be traced to many factors, including the development of the internet, excellent and ubiquitous statistical software programs, changes in business practices, and abundant statistics education in many disciplines. In this talk I will provide examples of the interesting problems nonclinical statisticians tackle, present some unsolved questions, describe how job responsibilities are changing, and provide predictions on what additional skills will be required of statisticians in the future.

April 13, 2012
12 PM
One Capitol Square
Room 305

Jack Kalbfleisch, Ph.D.
Professor
Department of Biostatistics
School of Public Health
University of Michigan
Ann Arbor, MI

Matching in Cluster Randomized Trials

Cluster randomized trials with relatively few clusters have been widely used in recent years for evaluation of healthcare strategies. The balance match weighted (BMW) design, introduced in Xu and Kalbfleisch (Biometrics, 2010), uses propensity scores and applies the optimal full matching with constraints technique to a prospective randomized design. This is done with the general aim of minimizing the mean squared error (MSE) of the treatment effect estimator. We review this work and consider extensions to clinical trials that involve more than two treatment arms where multiple treatment options need to be evaluated. Simulations are used to assess the approach and to make comparisons with other approaches in the literature. The method is illustrated in a study on an educational intervention on the treatment of ischemic stroke in emergency hospitals.

April 20, 2012
12 PM
One Capitol Square
Room 305

Daniel Schaid, Ph.D.
Professor of Biostatistics
Mayo Clinic
Rochester, MN

Statistical Challenges for Testing Disease Associations with Rare Genetic Variants

Genome-wide association studies have demonstrated many common genetic variants associated with a wide variety of common diseases; they have also suggested that rare variants are likely to play a significant role in disease etiology. Yet, rare variants pose significant statistical challenges. Modeling the effects of single variants is unreliable due to sparse data and weak power. A variety of statistical approaches to test disease associations with rare variants will be reviewed, with emphasis on two general strategies: a “burden” test and kernel regression. The popular “burden” test, based on a weighted sum of the variants within a gene, provides reasonable power when all rare variants in a gene have the same direction of effect on a trait. In contrast, if variants are a mixture of neutral, risk, and protective effects, then the alternative C(alpha) test (for case-control data), a test for binomial over-dispersion, can have greater power. The C(alpha) test is a special case of kernel regression which can be viewed as a variance components approach. These two strategies can be viewed as first-moment and second-moment tests, respectively, which are nearly orthogonal. The power of each strategy depends on the mixture of risk and protective effects. These different strategies can be combined into a joint simultaneous test of first and second moments, based on generalized linear models. This provides a flexible way to incorporate covariates, and to extend to other types of traits (e.g., quantitative). By using a simultaneous test of both moments, robustness can be gained to compensate for lack of knowledge about the balance of protective, neutral, and risk variants. Theory and simulations will be presented to illustrate the power of different strategies to test for associations of rare variants with traits.

May 4, 2012
12 PM
One Capitol Square
Room 305

Heather J. Hoffman, Ph.D.
Assistant Professor
Department of Epidemiology and Biostatistics
School of Public Health and Health Services
The George Washington University
Washington, DC

The Provision of Services and Care for HIV-Exposed Infants: A Comparison of Maternal and Child Health (MCH) Clinic and HIV Comprehensive Care Clinic (CCC) models

Prevention of Mother-to-Child transmission of HIV programs require follow-up of HIV-exposed infants (HEI) for infant feeding support, prophylactic medicines, and HIV diagnosis for 18 months. Retention and receipt of HIV services are challenging in resource-limited settings. We conducted an observational prospective cohort study to compare infant follow-up results when HEI services were provided within Maternal and Child Health (MCH) clinics or in specialized HIV Comprehensive Care Clinics (CCC). From April 2008 to April 2009, we enrolled 363 HEI (184 in CCC; 179 in MCH) at 6-8 weeks of age in two purposively selected District hospitals in Kenya with similar characteristics but different models of service delivery. In the CCC model, HEI received immunization and growth monitoring in MCH but cotrimoxazole (CTX) prophylaxis and infant HIV testing in the CCC. In the MCH model, all services were provided in the MCH. Data were collected at enrollment, 14 weeks, 6, 9, and 12 months. Poisson regression with robust error variance estimation was used to examine the relationship between total number of study follow-up visits per infant and model of service adjusting for significant covariates and to test for significant differences in rates of service uptake at the study visits. Generalized estimating equations for binary data were used to test for significant differences in attendance at each follow-up visit between the models of service. We found infants in MCH were 1.14, 1.42, 1.95, and 1.29 times more likely to attend 14-week, 6, 9, and 12-month postnatal visits, respectively, and 2.24 (95% CI: 1.57, 3.18) times more likely to attend all four visits. Infants in MCH were 1.33 (95% CI: 1.10, 1.62) times more likely to have HIV antibody testing at 1 year than those in CCC, but there were no differences for PCR or CTX initiation at 6 weeks. We concluded that HIV services integrated in MCH in one district hospital led to better follow-up of HEI than in a similar hospital with services provided separately in a CCC.

May 7, 2012
12 PM
One Capitol Square
Room 305

Xu Zhang, Ph.D.
Assistant Professor
Department of Mathematics and Statistics
Georgia State University
Atlanta, GA

A Proportional Hazards Regression Model for the Subdistribution with Right-censored and Left-truncated Competing Risks Data

With competing risks failure time data, one often needs to assess the covariate effects on the cumulative incidence probabilities. Fine and Gray proposed a proportional hazards regression model to directly model the subdistribution of a competing risk. They developed the estimating procedure for right-censored competing risks data, based on the inverse probability of censoring weighting. Right-censored and left-truncated competing risks data sometimes occur in biomedical research. In this paper, we study the proportional hazards regression model for the subdistribution of a competing risk with right-censored and left-truncated data. We adopt a new weighting technique to estimate the parameters in this model. We have derived the large sample properties of the proposed estimators. To illustrate the application of the new method, we analyze the failure time data for children with acute leukemia. In this example, the failure times for children who had bone marrow transplants were left truncated.

May 11, 2012
12 PM
One Capitol Square
Room 305

William L. Anderson, Ph.D.
Assistant Professor
Department of Mathematics and Computer Science
University of Richmond
Richmond, VA

The Connections between Hierarchical Regression, Numerical Analysis and Splines

The various forms of regression are a dominant feature of modern data analysis. This is hardly surprising since the basic premises of regression are well understood in many different areas of research, and basic regression analysis is a standard component in many statistical software packages. However, researchers do not have to venture very far in their applications of regression analysis to run into trouble from a computational and modeling point of view. This is especially apparent when modeling longitudinal or repeated measures data using classical regression. Hierarchical models or mixed models overcome many of the limitations of classical regression and are well suited to handle longitudinal data. We will explore the deep connections between the seemingly orthogonal topics of numerical analysis, hierarchical models and splines (semiparametric regression). The intuitive concepts of these topics are introduced via the Donohue and Levitt abortion-crime data set. This talk is aimed at graduate students and practitioners.

May 22, 2012
12 PM
One Capitol Square
Room 305

Robert J. Carrico
Ph.D. Candidate
Department of Biostatistics
Virginia Commonwealth University
Richmond, VA

Unbiased Estimation for the Contextual Effect of Duration of Adolescent Height Growth on Adulthood Obesity and Health Outcomes via Hierarchical Linear and Nonlinear Models

This dissertation has multiple aims in studying hierarchical linear models in biomedical data analysis. In Chapter 1, the novel idea of studying the durations of adolescent growth spurts as a predictor of adulthood obesity is defined, established, and illustrated. The concept of contextual effects modeling is introduced in this first section as we study the secular trend of adulthood obesity and how this trend is mitigated by the durations of individual adolescent growth spurts and the secular average length of adolescent growth spurts. It is found that individuals with longer periods of fast height growth in adolescence are more prone to having favorable BMI profiles in adulthood.

In Chapter 2, we study the estimation of contextual effects in a hierarchical generalized linear model (HGLM). We simulate data and study the effects of using the higher level group sample mean as the estimate for the true mean versus using an Empirical Bayes (EB) approach (Shin and Raudenbush 2010). We study this comparison for logistic, probit, log-linear, ordinal and nominal regression models. We find that in general the EB estimate yields a parameter estimate much closer to the true value, except for cases with very small variability in the upper level, where the situation is more complicated and there is likely no need for contextual effects analysis.

In Chapter 3, the HGLM studies are made clearer with large-scale simulations. These large scale simulations are shown for logistic regression and probit regression models for binary outcome data. With repetition we are able to establish coverage percentages of the confidence intervals of the true contextual effect. Coverage percentages show the percentage of simulations that have confidence intervals containing the true parameter values. Results confirm observations from the preliminary simulations in the previous section of this paper, and an accompanying example of adulthood hypertension shows how these results can be used in an application.
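
The notion of a coverage percentage described in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the dissertation's HGLM simulation: it assumes a normal mean with known standard deviation and a 95% z-interval, and every name and default value is hypothetical.

```python
import math
import random


def estimate_coverage(true_mean=0.0, sigma=1.0, n=30, reps=2000, seed=1):
    """Fraction of simulated 95% z-intervals that contain the true mean.

    Each replicate draws n normal observations with known sigma, forms
    the interval sample_mean +/- z * sigma / sqrt(n), and checks whether
    the true mean falls inside it.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(f"Estimated coverage: {estimate_coverage():.3f}")
```

With a correctly specified interval the estimate should sit near the nominal 0.95; coverage well below that would indicate a biased estimator or understated standard error, which is the kind of comparison the abstract describes.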