ASReml tutorial: C1 Variance structures. Arthur Gilmour

Overview
Traditional variance models assume independent effects: $\sigma^2 I$
General variance structures
 - Unstructured: every variance and covariance is a separate parameter
 - Structured: variances and covariances are functions of parameters
Spatial models
 - correlation based on distance
 - parameterized in terms of correlation and variance
Compound variance structures
 - formed as a direct product

General variance structures
Unstructured (US) is parameterised directly as variances and covariances. The matrix is symmetric, so the parameters are given as the lower triangle, row-wise:
$$\begin{matrix} V_{11} & & \\ V_{21} & V_{22} & \\ V_{31} & V_{32} & V_{33} \end{matrix}$$

Reduced parameterization
 - Diagonal (DIAG) has zero covariances
 - Factor Analytic (FACV, XFA): $\Sigma = \Lambda\Lambda' + \Psi$
 - Cholesky (CHOLn, CHOLnC): $\Sigma = LDL'$ where $L$ is unit lower triangular
 - Antedependence (ANTEn): $\Sigma^{-1} = UDU'$ where $U$ is unit upper triangular

Reduced parameterization
The aim in using alternate forms is to
 - accommodate the variance heterogeneity adequately while minimising the number of parameters
 - force a positive definite structure
ANTE (a generalization of AR) is suited to ordered levels (e.g. times).
CHOL, XFA and FACV are suited to unordered levels (e.g. sites, traits).

General variance structures
DIAG
 - off-diagonal elements are zero
CHOLi
 - $\Sigma = LDL'$
 - $L$ is a unit lower triangular matrix with i off-diagonal bands
 - $D$ is a diagonal matrix of conditional variances

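CHOL1 of order 4
For example, in CHOL1
$$L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ a & 1 & 0 & 0 \\ 0 & b & 1 & 0 \\ 0 & 0 & c & 1 \end{pmatrix}, \qquad D = \mathrm{diag}(A, B, C, D)$$
so that
$$\Sigma = \begin{pmatrix} A & aA & 0 & 0 \\ aA & aAa + B & bB & 0 \\ 0 & bB & bBb + C & cC \\ 0 & 0 & cC & cCc + D \end{pmatrix}$$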
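The CHOL1 product above can be checked numerically. The following is a minimal sketch in plain Python; the values of a, b, c and A, B, C, D are hypothetical, chosen only to illustrate that a single band in L gives a tridiagonal Σ.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


# Hypothetical parameter values (not taken from the tutorial).
a, b, c = 0.5, 0.4, 0.3          # the single off-diagonal band of L
A, B, C, D = 2.0, 1.5, 1.0, 0.5  # conditional variances on the diagonal of D

L = [[1, 0, 0, 0],
     [a, 1, 0, 0],
     [0, b, 1, 0],
     [0, 0, c, 1]]
Dm = [[A, 0, 0, 0],
      [0, B, 0, 0],
      [0, 0, C, 0],
      [0, 0, 0, D]]

# Sigma = L D L'
sigma = matmul(matmul(L, Dm), transpose(L))
for row in sigma:
    print(["%.3f" % v for v in row])
```

Elements more than one step from the diagonal come out as zero, and each diagonal element is the conditional variance plus a term carried over from the previous level.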

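CHOL1C of order 4
For example, in CHOL1C
$$L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ a & 1 & 0 & 0 \\ b & 0 & 1 & 0 \\ c & 0 & 0 & 1 \end{pmatrix}, \qquad D = \mathrm{diag}(A, B, C, D)$$
so that
$$\Sigma = \begin{pmatrix} A & aA & bA & cA \\ aA & aAa + B & bAa & cAa \\ bA & bAa & bAb + C & bAc \\ cA & aAc & cAb & cAc + D \end{pmatrix}$$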

Antedependence is a generalized form of Autoregressive
ANTEi
 - $\Sigma^{-1} = UDU'$
 - $U$ is a unit upper triangular matrix with i off-diagonal bands
 - $D$ is a diagonal matrix of conditional inverse variances
Since the parameterization of CHOL and ANTE is obscure, you may supply an unstructured matrix as starting values and ASReml will factorize it.

Factor Analytic
Correlation form: FAi
 - $\Sigma = D(LL' + E)D$
 - Parameters are the elements of the $p \times i$ matrix $L$ and $\mathrm{diag}(\Sigma) = DD$; $E$ is defined such that $\mathrm{diag}(LL' + E)$ is the identity.
Variance form: FACVi
 - $\Sigma = \Lambda\Lambda' + \Psi$
 - Parameters are $\Lambda = DL$ and $\Psi = DED$
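The two factor analytic forms describe the same matrix. The sketch below, with one factor and hypothetical loadings and variances, builds Σ both ways and compares them; it illustrates the algebra only and does not reproduce ASReml's internals.

```python
import math

# Hypothetical one-factor example with p = 3 levels.
load = [0.8, 0.6, 0.5]   # L: loadings on the correlation scale (p x 1)
var = [4.0, 1.0, 2.25]   # diag(Sigma) = DD
p = len(load)

d = [math.sqrt(v) for v in var]        # diagonal of D
e = [1.0 - l * l for l in load]        # E chosen so that diag(LL' + E) = 1

# Correlation form: Sigma = D (LL' + E) D
sigma_fa = [[d[i] * (load[i] * load[j] + (e[i] if i == j else 0.0)) * d[j]
             for j in range(p)] for i in range(p)]

# Variance form: Lambda = D L, Psi = D E D, Sigma = Lambda Lambda' + Psi
lam = [d[i] * load[i] for i in range(p)]
psi = [d[i] * e[i] * d[i] for i in range(p)]
sigma_facv = [[lam[i] * lam[j] + (psi[i] if i == j else 0.0)
               for j in range(p)] for i in range(p)]

for row_fa, row_facv in zip(sigma_fa, sigma_facv):
    print(["%.4f" % v for v in row_fa], ["%.4f" % v for v in row_facv])
```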

Extended Factor Analytic
 - Same parameterization as FACV, but with the parameters in the order diag($\Psi$) then vec($\Lambda$)
 - Elements of $\Psi$ may be zero (making $\Sigma$ singular)
 - Requires use of the xfa(t, i) model term, which inserts i columns of zeros into the design matrix corresponding to the i factors
 - Much faster than FAi and FACVi when there are more than 10 levels in the term

Extended Factor Analytic
    ... xfa(trait,1).dam ...
    xfa(trait,1).dam 2
    xfa(trait,1) 0 XFA1 2*0 1.1 0.9
    dam

    Covariance/Variance/Correlation Matrix
    1.550  1.000  1.000
    1.437  1.332  1.000
    1.245  1.154  1.000
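One way to read the printed matrix, offered here as an interpretation rather than as documented behaviour, is that the last row holds the two loadings on the single factor and that both specific variances stayed at zero. Under that assumption the trait covariances are just products of the loadings, which a few lines of Python can confirm to the printed precision.

```python
# Assumption: the last row of the printed matrix (1.245, 1.154) holds the
# loadings of the two traits on the factor, and Psi = 0 for both traits.
lam = [1.245, 1.154]
psi = [0.0, 0.0]

# Sigma = Lambda Lambda' + Psi
sigma = [[lam[i] * lam[j] + (psi[i] if i == j else 0.0) for j in range(2)]
         for i in range(2)]

# With Psi = 0 the correlation between the traits is exactly 1 (Sigma is singular).
corr = sigma[0][1] / (sigma[0][0] * sigma[1][1]) ** 0.5

print("variances  %.3f %.3f" % (sigma[0][0], sigma[1][1]))
print("covariance %.3f" % sigma[0][1])
print("correlation %.3f" % corr)
```

The values agree with the 1.550, 1.437 and 1.332 shown in the output, and the unit correlation matches the 1.000 entries above the diagonal.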

Other structures
 - US: unstructured
 - OWNi: user supplies a program to calculate G and the derivatives of G
 - AINV: use the fixed relationship matrix
 - GIVi: use a user-defined fixed relationship matrix (see .giv, .grm)

Spatial structures
 - ID: identity
 - CORU: uniform correlation
 - AR1: correlations $1, \rho, \rho^2, \rho^3, \rho^4, \rho^5, \ldots$ at successive lags
 - AR2, MA1, MA2, ARMA, SAR1, SAR2, CORU, CORB, CORG
 - EXP, GAU, IEXP, AEXP, IGAU, AGAU, IEUC, LVR, ISP, SPH, MAT: based on one or two dimensional distance
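As a small illustration of the AR1 pattern listed above, the sketch below builds the correlation matrix for six equally spaced levels. The value of ρ is hypothetical.

```python
def ar1_corr(n, rho):
    """Return the n x n AR1 correlation matrix with entries rho**|i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


rho = 0.5  # hypothetical autocorrelation
C = ar1_corr(6, rho)

# First row: 1, rho, rho^2, ..., rho^5
print(C[0])
```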

Variances
 - Equal variance correlation: append V to the code, e.g. AR1V, CORUV
 - Unequal (heterogeneous) variance correlation: append H to the code, e.g. AR1H, CORUH
If $D$ is the diagonal matrix of variances and $C$ is a correlation matrix, then $\Sigma = D^{0.5} C D^{0.5}$.
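The scaling of a correlation matrix by variances can be sketched as follows, using a uniform correlation matrix with equal variances (as in CORUV) and with unequal variances (as in CORUH). The correlation and the variances are hypothetical, and the helper names are made up for this example.

```python
import math


def coru_corr(n, rho):
    """Uniform correlation matrix: 1 on the diagonal, rho elsewhere."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def scale_by_variances(C, variances):
    """Return Sigma = D^0.5 C D^0.5 with D = diag(variances)."""
    sd = [math.sqrt(v) for v in variances]
    n = len(sd)
    return [[sd[i] * C[i][j] * sd[j] for j in range(n)] for i in range(n)]


rho = 0.3  # hypothetical correlation
C = coru_corr(3, rho)

sigma_v = scale_by_variances(C, [2.0, 2.0, 2.0])  # equal variances
sigma_h = scale_by_variances(C, [4.0, 1.0, 9.0])  # heterogeneous variances

for row in sigma_h:
    print(["%.3f" % v for v in row])
```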

ASReml tutorial: C2 Spatial Analysis. Arthur Gilmour

Two basic kinds
Regular grid, e.g. field trial
 - interest is in adjusting for other effects
Irregular grid, e.g. survey
 - interest is in modelling the spatial pattern
 - kriging
ASReml is regularly used for the former; capability for the latter is being developed.

Single field trial
Slate Hall Farm - Barley 1976
 - Balanced incomplete block design
 - 25 varieties, 6 replicates
 - layout 10 rows by 15 columns
BIB model
 - fixed: treatments
 - random: rep block
Spatial model
 - Autoregressive error model $R = \Sigma_R \otimes \Sigma_C$
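The direct product in the spatial model can be sketched for the 10 by 15 layout. This example assumes AR1 correlation in both directions, plots ordered with columns varying fastest within rows, and hypothetical values for the two autocorrelations and the variance; `plot_index` is a helper invented for the illustration.

```python
def ar1_corr(n, rho):
    """n x n AR1 correlation matrix with entries rho**|i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kron(X, Y):
    """Kronecker (direct) product of two matrices stored as lists of rows."""
    return [[x * y for x in xrow for y in yrow] for xrow in X for yrow in Y]


n_rows, n_cols = 10, 15      # field layout from the slide
rho_r, rho_c = 0.5, 0.7      # hypothetical row and column autocorrelations
sigma2 = 1.0                 # hypothetical error variance

# R = sigma2 * (Sigma_R kron Sigma_C): 150 x 150
R = [[sigma2 * v for v in row]
     for row in kron(ar1_corr(n_rows, rho_r), ar1_corr(n_cols, rho_c))]


def plot_index(row, col):
    """Position of plot (row, col) when columns vary fastest within rows."""
    return row * n_cols + col


# Correlation between plots 2 rows and 3 columns apart: rho_r**2 * rho_c**3
print(R[plot_index(0, 0)][plot_index(2, 3)])
```

Only two correlation parameters and one variance describe all 150 x 150 elements of R, which is why the separable form is economical.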

Slate Hall base
    Slate Hall 1976 Cereal trial
     rep 6
     latrow 30
     latcol 30
     fldrow 10
     fldcol 15
     variety 25
     yield !/100
    shf.dat !dopart $1 !DISPLAY 15 !SPATIAL !TWOWAY

Slate Hall - Design based
    !part 1 RCB Analysis
    yield ~ mu var !r rep
    !part 2 # BIB analysis
    yield ~ mu var !r rep latrow latcol

Slate Hall - Model based
    !part 3 # Fitting AR1.AR1
    yield ~ mu var
    predict var
    1 2
    fldrow fldrow AR1 .1
    fldcol fldcol AR1 .1

Slate Hall - Model + Design
    !PART 4 # Fitting AR1.AR1
    yield ~ mu var !r rep latrow latcol
    predict var
    1 2
    fldrow fldrow AR1 .1
    fldcol fldcol AR1 .1

Slate Hall - summary
    Model           LogL       Variance parameters
    RCB             -167.694   2
    BIB design      -132.134   4
    Spatial model   -124.676   3
    BIB+Spatial     -124.312   6
The spatial correlation model fits better than the BIB model.
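One common way to compare fits like these is an information criterion. The sketch below computes AIC = -2 LogL + 2k for the four models, and the likelihood ratio statistic for adding the design terms to the spatial model. It assumes the last column of the summary is the number of variance parameters k, and relies on all four models having the same fixed effects so that the REML log-likelihoods are comparable.

```python
# (LogL, number of variance parameters) as read from the summary table.
fits = {
    "RCB": (-167.694, 2),
    "BIB design": (-132.134, 4),
    "Spatial model": (-124.676, 3),
    "BIB+Spatial": (-124.312, 6),
}

# AIC = -2 LogL + 2k; smaller is better.
aic = {name: -2.0 * logl + 2 * k for name, (logl, k) in fits.items()}
best = min(aic, key=aic.get)

# Likelihood ratio statistic for the nested pair Spatial vs BIB+Spatial.
lrt = 2.0 * (fits["BIB+Spatial"][0] - fits["Spatial model"][0])
extra_params = fits["BIB+Spatial"][1] - fits["Spatial model"][1]

for name in fits:
    print("%-14s AIC = %.3f" % (name, aic[name]))
print("lowest AIC:", best)
print("LRT = %.3f on %d extra parameters" % (lrt, extra_params))
```

The statistic for the three extra design components is small, which is in line with the slide's conclusion; because variance components are tested on the boundary of the parameter space, the usual chi-square reference distribution is only approximate here.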

Spatial components
    Source            Model  terms  Gamma       Component     Comp/SE  % C
    rep                   6      6  .2003E-05   .724166E-05      0.00  0 B
    latrow               30     30  .6327E-01   .228684          0.71  0 P
    latcol               30     30  .1608E-03   .581362E-03      0.00  0 P
    Variance            150    125  1.000       3.61464          4.28  0 P
    Residual AutoR       10         .4652       .465209          4.85  0 U
    Residual AutoR       15         .6741       .674095          8.76  0 U

Variogram
[Figure: variogram of residuals, Slate Hall 1976 Cereal trial; axes: outer displacement, inner displacement]

Residual to plan
[Figure: field plot of residuals, Slate Hall 1976 Cereal trial; Range: 4.80 5.37]

row/column
[Figure: residuals against row and column position, Slate Hall 1976 Cereal trial; Range: 4.80 5.37]

Spatial analysis in forest genetic trials
The layout is typically not a complete rectangle
 - add missing values to complete the pattern
 - use map points (if < 5000 trees)
With a tree model, a nugget variance must be included
 - either the nugget is the residual and the spatial model is in G
 - or the spatial model is the residual and the nugget is in G
The spatial model is typically superior to the design model for growth/production traits
 - less so for disease and conformation traits

ASReml tutorial: C3 MultiEnvironment Trials. Arthur Gilmour

Multi environment trial
In early-generation cereal breeding, several trials are run with one or two replicates of the test lines, plus 20 percent check lines for error estimation.
More power comes from fitting the line effects as correlated effects across sites.

MET in ASReml
    Three Multi Environment Trial
     seq
     col 15    # Actually 12 12 and 15 respectively
     row 34    # Actually 34 34 and 28 respectively
     chks 7    # Check 7 is the test lines
     test 336  # coded 0 for check lines
     geno 337
     yld !*.01
     site 3
    met.dat !section site

Spatial models
    yld ~ site chk.site !r at(site,3).row .02,
          at(site).col .90 .40 .036 site.test
    site 2 1
    12 col AR1 .1271 !S2=2.19
    34 row AR1 .751
    12 col AR1 .25 !S2=0.84
    34 row AR1 .56
    15 col ID !S2=0.19
    28 row AR1 .38

Model genetic variation
    site.test 2
    site 0 FA1 .5 .5 .5 .1 .1 .1
    test

Components
    Source               Model  terms  Component      Comp/SE  % C
    Residual              1236   1213
    at(site,01).col         15     15  0.323302E-05      0.00  0
    at(site,02).col         15     15  0.142114          1.32  0
    at(site,03).col         15     15  0.446791E-01      1.77  0
    at(site,3).row          34     34  0.241380E-01      2.80  0
    Variance[ 1]           408      0  2.60271           5.18  0
    Residual AR=AutoR       12         0.407051          4.45  0
    Residual AR=AutoR       34         0.882580         33.50  0
    Variance[ 2]           408      0  1.00339           8.29  0
    Residual AR=AutoR       12         0.282407          4.84  0
    Residual AR=AutoR       34         0.580701         11.37  0
    Variance[ 3]           420      0  0.105411          5.59  0
    Residual AR=AutoR       28         0.687455         10.14  0

Factor Analytic
    site.test FA D(L  1 1  0.518516       5.35  0
    site.test FA D(L  1 2  1.13028        2.18  0
    site.test FA D(L  1 3  0.735010       6.04  0
    site.test FA D(L  0 1  0.991585       7.99  0
    site.test FA D(L  0 2  0.731805E-01   1.07  0
    site.test FA D(L  0 3  0.121810       7.17  0

    Covariance/Variance/Correlation FA D(LL'+E)D
    0.9916      0.5865      0.3811
    0.1579      0.7308E-01  0.8313
    0.1325      0.7844E-01  0.1218
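The printed matrix can be approximately rebuilt from the six parameters. The sketch below assumes that the rows labelled 1 1 to 1 3 are the loadings L for the three sites and that the rows labelled 0 1 to 0 3 are the site variances diag(Σ) = DD; this reading is inferred from the numbers, not stated on the slide.

```python
import math

# Assumed reading of the output: loadings (rows "1 i") and variances (rows "0 i").
loadings = [0.518516, 1.13028, 0.735010]
variances = [0.991585, 0.0731805, 0.121810]
p = len(loadings)

# Correlation form: off-diagonal correlations are products of loadings,
# and E makes the diagonal of LL' + E equal to 1.
corr = [[1.0 if i == j else loadings[i] * loadings[j] for j in range(p)]
        for i in range(p)]

# Sigma = D (LL' + E) D with D = diag(sqrt(variances)).
sd = [math.sqrt(v) for v in variances]
cov = [[sd[i] * corr[i][j] * sd[j] for j in range(p)] for i in range(p)]

print("correlations: %.4f %.4f %.4f" % (corr[0][1], corr[0][2], corr[1][2]))
print("covariances:  %.4f %.4f %.4f" % (cov[1][0], cov[2][0], cov[2][1]))
```

The rebuilt values agree with the printed correlations (above the diagonal) and covariances (below the diagonal) to about three decimal places.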

ASReml tutorial: C4 Repeated Measures. Arthur Gilmour

Main approaches
General variance structure (multivariate approach)
 - Unstructured, Autoregressive, Exponential
 - regular measurements
Regression approach
 - Longitudinal model
 - Random regression
 - irregular measurements

Multivariate approach
 - Suited when most animals have most measures
 - Repeats are at significant standard times, say WWT, 200dayWT, 400dayWT, 600dayWT
 - Discuss

Multivariate
    WWT WT200 WT400 WT600 ~ Trait Tr.sex,
        !r Tr.animal !f Tr.cohort
    1 2 1
    0
    Trait 0 US 10*0
    Tr.animal 2
    Tr 0 US 10*0
    animal 0 AINV

Random Regression
Appropriate when
 - there is considerable unbalance in times of measurement
 - there are varying numbers of measurements
 - all animals have multiple measures
Concept: a regression for each individual, consisting of an overall response pattern (fixed) plus an individual (random) adjustment.

RR principles
This is a reduced parameterization model which must be well formulated
 - mean profile of higher order than random profile
 - random profile generally low order
Usually formulated as a polynomial, but could be a low order spline.

RR Example
    !WORK 150
    Random regression analysis of emd
     animal !p
     sire 89 !I
     dam 1052 !I
     year 2 !I !V21=V4 !==2 !*-365
     flock 5
     sex 2 !A
     aod
     tobr 3 !I
     dob !-14800 !+v21
     age wt fat emd
    sdf01a.ped !skip 1
    sdfwfml.csv !skip 1 !MAXIT 20 !DDF !FCON !MVremove !DOPART $1

RR Model
    !PART 1 # Linear RR
    emd ~ mu age year wt sex sex.wt flock,
          tobr aod dob year.dob year.age,
          year.sex year.flock year.tobr,
          sex.dob tobr.dob,
          !r !{ animal animal.age !},
          !{ ide(animal) ide(animal).age !},
          at(year,1,2).spl(age,20)

RR G structure
    0 0 2
    animal 2
    2 0 US !GP   # Intercept and slope
    1.3 0.01 0.01
    animal 0 AINV
    ide(animal) 2   # Intercept and slope
    2 0 US !GP
    1.6 0.01 0.03
    ide(animal)

Fitting PART 1
 - Fixed terms year.age, year.sex and year.tobr are not significant (NS)
 - But retain year.age because of the year.spl terms
 - The variance of ide(animal).age is at the boundary
 - LogL after dropping 3 interactions was -726.867

Quadratic RR
    !PART 2 # Quadratic RR using pol
    emd ~ mu age year wt sex sex.wt,
          flock tobr aod dob year.dob,
          year.flock sex.dob tobr.dob,
          year.age,
          !r pol(age,2).animal,
          pol(age,1).ide(animal),
          at(year,1,2).spl(age,20)
    0 0 2
    ...

PART 2 G structures
    0 0 2
    pol(age,2).animal 2
    3 0 US
    1.6 .6 .6 .3 .3 .3
    animal 0 AINV
    pol(age,1).ide(animal) 2 0 US
    2.1 .6 1.3
    ide(animal)

PART 2
 - LogL is -643.67, so there is significant quadratic curvature
 - Initial values were obtained by ignoring the G structure in an initial run

Spline curvature
    !part 3
    !SPLINE spl(age,3) 4 0 6
    emd ~ mu age year wt sex sex.wt,
          flock tobr aod dob year.dob,
          year.age year.sex year.flock,
          year.tobr sex.dob tobr.dob,
          !r !{ animal animal.age,
          animal.spl(age,3) !},
          !{ ide(animal) ide(animal).age,
          ide(animal).spl(age,3) !},
          at(year,1,2).spl(age,20)
    0 0 2

Simpler
    !PART 4
    emd ~ mu age year wt sex sex.wt flock,
          tobr aod dob year.dob year.age,
          year.flock year.tobr sex.dob tobr.dob,
          !r pol(age,2).animal ide(animal),
          at(year,1,2).spl(age,20)
    0 0 1
    pol(age,2).animal 2
    3 0 US
    1.6

Interpretation
 - The .res file has the pol() coefficients, say $T$
 - Form $TGT'$ to get the full matrix of variances (all times)
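The TGT' calculation can be sketched in a few lines. In this illustration T holds raw quadratic polynomial covariates (1, t, t²) at three scaled ages and G is a hypothetical 3 x 3 covariance matrix of the random regression coefficients; in practice T would be the pol() coefficients taken from the .res file and G the fitted matrix.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


# Hypothetical scaled ages; each row of T is (1, t, t^2).
ages = [-1.0, 0.0, 1.0]
T = [[1.0, t, t * t] for t in ages]

# Hypothetical covariance matrix of (intercept, linear, quadratic) coefficients.
G = [[1.6, 0.6, 0.3],
     [0.6, 0.6, 0.3],
     [0.3, 0.3, 0.3]]

# V = T G T': variances on the diagonal, covariances between ages elsewhere.
V = matmul(matmul(T, G), transpose(T))

for age, row in zip(ages, V):
    print("age %5.1f" % age, ["%.3f" % v for v in row])
```

Adding more rows to T gives variances and covariances at any set of ages, not only those at which measurements were taken.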