Cal State Northridge
Ψ427
Andrew Ainsworth PhD

Statistics AGAIN?
  What do we want to do with statistics?
    Organize and describe patterns in data
      Taking incomprehensible data and converting it to:
        Tables that summarize the data
        Graphs
    Extract (i.e. INFER) meaning from data
      Infer POPULATION values from SAMPLES
      Hypothesis Testing: Groups
      Hypothesis Testing: Relation/Prediction

Psy 427 - Cal State Northridge

Descriptives
  Disorganized Data
    Comedy 7     Suspense 8   Comedy 7     Suspense 7
    Drama 8      Horror 7     Drama 5      Comedy 6
    Horror 8     Comedy 5     Drama 3      Drama 3
    Suspense 7   Horror 8     Comedy 6     Suspense 6
    Horror 8     Comedy 6     Drama 7      Horror 9
    Drama 5      Horror 9     Drama 6      Suspense 4
    Drama 5      Horror 7     Suspense 3   Suspense 4
    Horror 7     Suspense 5   Horror 10    Suspense 5
    Horror 9     Suspense 6   Comedy 6     Drama 8
    Comedy 7     Comedy 5     Comedy 4     Drama 4
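As a hedged illustration of what "organize and describe" means for the ratings listed above, the sketch below regroups them by genre and averages each group. The choice of Python and the names `ratings` and `genre_means` are assumptions made for illustration; they are not part of the original slides.

```python
from statistics import mean

# Ratings from the "Disorganized Data" slide, regrouped by genre.
ratings = {
    "Comedy":   [7, 7, 6, 5, 6, 6, 6, 7, 5, 4],
    "Drama":    [8, 5, 3, 3, 7, 5, 6, 5, 8, 4],
    "Horror":   [7, 8, 8, 8, 9, 9, 7, 7, 10, 9],
    "Suspense": [8, 7, 7, 6, 4, 3, 4, 5, 5, 6],
}

# One average per genre, rounded to one decimal place.
genre_means = {genre: round(mean(scores), 1) for genre, scores in ratings.items()}

for genre, average in genre_means.items():
    print(f"{genre:<10}{average}")
```

Forty unordered pairs reduce to four numbers, one per genre, which is the kind of summary table the descriptive slides go on to show.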
Descriptives
  Reducing and Describing Data
    Genre      Average Rating
    Comedy     5.9
    Drama      5.4
    Horror     8.2
    Suspense   5.5

Descriptives
  Displaying Data
    [Bar chart: "Rating of Movie Genre Enjoyment"; x-axis: Genre (Comedy, Drama, Horror, Suspense); y-axis: Average Rating, 0 to 9]

Inferential
  Inferential statistics:
    Is a set of procedures to infer information about a population based upon characteristics from samples.
    Samples are taken from Populations
    Sample Statistics are used to infer population parameters
Inferential
  Population: the complete set of people, animals, events or objects that share a common characteristic
  A sample is some subset or subsets, selected from the population.
    representative
    simple random sample

Population and Sample
  Definition
    Population: The group (people, things, animals, etc.) you are intending to measure or study; they share some common characteristic
    Sample: A subset of the population; used as a representative of the population
  Size
    Population: Large to Theoretically Infinite
    Sample: Substantially Smaller than the population (e.g. 1 to (population - 1))
  Descriptive Characteristics
    Population: Parameters
    Sample: Statistics
  Symbols
    Population: Greek
    Sample: Latin
  Mean
    Population: µ
    Sample: $\bar{X}$
  Standard Deviation
    Population: σ
    Sample: s or SD

Inferential
  Does the number of hours students study per day affect the grade they are likely to receive in statistics (Ψ320)?
    [Chart: GPA (y-axis, 2.9 to 3.7) by hours of study per day (x-axis): 1 hr per day (n=15), 3 hrs per day (n=15), 5 hrs per day (n=15); data labels 3.2, 3.6, 3.7]
Inferential
  Sometimes manipulation is not possible
  Is prediction possible?
  Can a relationship be established?
    E.g., the number of cigarettes smoked and the likelihood of getting lung cancer,
    The level of child abuse in the home and the severity of later psychiatric problems.
    Use of the death penalty and the level of crime.

Inferential
  Measured constructs can be assessed for co-relation (where the coefficient of correlation varies between -1 and +1)
    [Number line: -1, 0, 1]
  Regression analysis can be used to assess whether a measured construct predicts the values on another measured construct (or multiple) (e.g., the level of crime given the level of death penalty usage).

Measurement
  Statistical analyses depend upon the measurement characteristics of the data.
  Measurement is a process of assigning numbers to constructs following a set of rules.
  We normally measure variables into one of four different levels of measurement:
    Nominal
    Ordinal
    Interval
    Ratio
Ordinal Measurement
  Where Numbers Represent Relative Size Only
  Contains 2 pieces of information
    [Figure: objects B, C, D arranged along a SIZE axis]

Interval Measurement:
  Where Equal Differences Between Numbers Represent Equal Differences in Size
    [Figure: objects B, C, D arranged along a SIZE axis]
    Numbers representing size: B = 1, C = 2, D = 3
    Diff in numbers: 2 - 1 = 1; 3 - 2 = 1
    Diff in size: Size C - Size B = Size X; Size D - Size C = Size X
Measurement
  Ratio Scale Measurement
    In ratio scale measurement there are four kinds of information conveyed by the numbers assigned to represent a variable:
      Everything Interval Measurement Contains
      Plus a meaningful 0-point and therefore meaningful ratios among measurements.

[Figure: True Zero point]
Measurement
  Ratio Scale Measurement
    If we have a true ratio scale, where 0 represents a complete absence of the variable in question, then we can form a meaningful ratio among the scale values such as:
      $\frac{4}{2} = 2$
    However, if 0 is not a true absence of the variable, then the ratio 4/2 = 2 is not meaningful.

Percentiles and Percentile Ranks
  A percentile is the score at which a specified percentage of scores in a distribution fall below
    To say a score of 53 is in the 75th percentile is to say that 75% of all scores are less than 53
  The percentile rank of a score indicates the percentage of scores in the distribution that fall at or below that score.
    Thus, for example, to say that the percentile rank of 53 is 75 is to say that 75% of the scores on the exam are at or below 53.

Percentile
  Scores which divide distributions into specific proportions
    Percentiles = hundredths: P1, P2, P3, ..., P97, P98, P99
    Quartiles = quarters: Q1, Q2, Q3
    Deciles = tenths: D1, D2, D3, D4, D5, D6, D7, D8, D9
  Percentiles are the SCORES
Percentile Rank
  What percent of the scores fall below a particular score?
    $PR = \frac{(\text{Rank} - .5)}{N} \times 100$
  Percentile Ranks are the Ranks, not the scores

Example: Percentile Rank
  Ranking with no ties: just number them
    Score: 1  3  4  5  6  7  8  10
    Rank:  1  2  3  4  5  6  7  8
  Ranking with ties: assign midpoint to ties
    Score: 1  3  4  6    6    8  8  8
    Rank:  1  2  3  4.5  4.5  7  7  7

Steps to Calculating Percentile Ranks
  Columns: Data; Step 1: Order; Step 2: Number; Step 3: Assign Midpoint to Ties; Step 4: Percentile Rank (Apply Formula)

    Data   Order   Number   Midpoint   Percentile Rank
    9      1       1        1          2.381
    5      2       2        2          7.143
    2      3       3        4          16.667
    3      3       4        4          16.667
    3      3       5        4          16.667
    4      4       6        7          30.952
    8      4       7        7          30.952
    9      4       8        7          30.952
    1      5       9        10         45.238
    7      5       10       10         45.238
    4      5       11       10         45.238
    8      6       12       12         54.762
    3      7       13       14         64.286
    7      7       14       14         64.286
    6      7       15       14         64.286
    5      8       16       17.5       80.952
    7      8       17       17.5       80.952
    4      8       18       17.5       80.952
    5      8       19       17.5       80.952
    8      9       20       20.5       95.238
    8      9       21       20.5       95.238

  Example (score of 3):
    $PR_{3} = \frac{(\text{Rank}_{3} - .5)}{N} \times 100 = \frac{(4 - .5)}{21} \times 100 = 16.667$
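The four steps above (order, number, assign midpoints to ties, apply the formula) can be sketched in a few lines. This is a minimal illustration, not the instructor's own procedure; the function names `midpoint_ranks` and `percentile_rank` are hypothetical.

```python
from collections import defaultdict

def midpoint_ranks(scores):
    """Map each distinct score to its rank; tied scores share the midpoint of their positions."""
    positions = defaultdict(list)
    for position, score in enumerate(sorted(scores), start=1):
        positions[score].append(position)
    return {score: sum(p) / len(p) for score, p in positions.items()}

def percentile_rank(scores, score):
    """PR = (Rank - .5) / N * 100, using the midpoint rank for tied scores."""
    rank = midpoint_ranks(scores)[score]
    return (rank - 0.5) / len(scores) * 100

# The 21 raw scores from the "Data" column of the worked table.
data = [9, 5, 2, 3, 3, 4, 8, 9, 1, 7, 4, 8, 3, 7, 6, 5, 7, 4, 5, 8, 8]

print(round(percentile_rank(data, 3), 3))  # the three 3s occupy places 3, 4, 5, so the rank is 4
```

Calling `percentile_rank(data, 8)` follows the same path with a midpoint rank of 17.5, matching the 80.952 entries in the table.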
Percentile
  $X_P = (p)(n + 1)$
  Where $X_P$ is the place (position in the ordered scores) of the score at the desired percentile, p is the desired percentile (a number between 0 and 1) and n is the number of scores
  If the number is an integer, then the desired percentile is the score in that place
  If the number is not an integer, then you can either round or interpolate; for this class we'll just round (round up when p is below .50 and down when p is above .50)

Percentile
  Apply the formula $X_P = (p)(n + 1)$
    1. You'll get a number like 7.5 (think of it as place1.proportion)
    2. Start with the value indicated by place1 (e.g. for 7.5, start with the value in the 7th place)
    3. Find place2, which is the next highest place number (e.g. the 8th place), and subtract the value in place1 from the value in place2; this is distance1
    4. Multiply the proportion number by the distance1 value; this is distance2
    5. Add distance2 to the value in place1; that is the interpolated value

Example: Percentile
  Example 1: 25th percentile: {1, 4, 9, 16, 25, 36, 49, 64, 81}
    $X_{25}$ = (.25)(9 + 1) = 2.5
    place1 = 2, proportion = .5
    Value in place1 = 4
    Value in place2 = 9
    distance1 = 9 - 4 = 5
    distance2 = 5 * .5 = 2.5
    Interpolated value = 4 + 2.5 = 6.5
    6.5 is the 25th percentile
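The interpolation steps can be written out directly. The sketch below follows the place1/place2/distance1/distance2 wording of the slide; the function name `percentile` and the clamping at the two ends of the data are assumptions added for illustration, since the slides do not say what to do when the place falls outside the ordered scores.

```python
import math

def percentile(scores, p):
    """Score at percentile p (between 0 and 1), found at place (p)(n + 1) with interpolation."""
    ordered = sorted(scores)
    n = len(ordered)
    place = p * (n + 1)
    place1 = math.floor(place)       # whole-number part of the place
    proportion = place - place1      # fractional part of the place
    # Assumption (not from the slides): clamp when the place falls outside 1..n.
    if place1 < 1:
        return ordered[0]
    if place1 >= n:
        return ordered[-1]
    value1 = ordered[place1 - 1]     # value in place1 (places count from 1)
    value2 = ordered[place1]         # value in place2, the next highest place
    distance1 = value2 - value1
    distance2 = proportion * distance1
    return value1 + distance2

squares = [1, 4, 9, 16, 25, 36, 49, 64, 81]
print(percentile(squares, 0.25))  # place 2.5, halfway between 4 and 9
```

When the place is a whole number the proportion is 0, so the function simply returns the score in that place, which agrees with the integer rule stated above.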
Example: Percentile
  Example 2: 75th percentile: {1, 4, 9, 16, 25, 36, 49, 64, 81}
    $X_{75}$ = (.75)(9 + 1) = 7.5
    place1 = 7, proportion = .5
    Value in place1 = 49
    Value in place2 = 64
    distance1 = 64 - 49 = 15
    distance2 = 15 * .5 = 7.5
    Interpolated value = 49 + 7.5 = 56.5
    56.5 is the 75th percentile

Quartiles
  To calculate quartiles you simply find the scores that correspond to the 25th, 50th and 75th percentiles.
    $Q_1 = P_{25}$, $Q_2 = P_{50}$, $Q_3 = P_{75}$

Reducing Distributions
  Regardless of numbers of scores, distributions can be described with three pieces of info:
    Central Tendency
    Variability
    Shape (Normal, Skewed, etc.)
Measures of Central Tendency
    Measure   Definition           Level of Measurement     Disadvantage
    Mode      Most frequent value  nom., ord., int./rat.    Crude
    Median    Middle value         ord., int./rat.          Only two points contribute
    Mean      Arithmetic average   int./rat.                Affected by skew

The Mean
  Only used for interval & ratio data.
    $\text{Mean} = M = \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
  Major advantages:
    The sample value is a very good estimate of the population value.

Reducing Distributions
  Regardless of numbers of scores, distributions can be described with three pieces of info:
    Central Tendency
    Variability
    Shape (Normal, Skewed, etc.)
How do scores spread out?
  Variability
    Tells us how far scores spread out
    Tells us the degree to which scores deviate from the central tendency

How are these different?
  [Figure: two distributions, each labeled Mean = 10]

Measure of Variability
    Measure                      Definition                                               Related to:
    Range                        Largest - Smallest                                       Mode
    Interquartile Range          $X_{75} - X_{25}$                                        Median
    Semi-Interquartile Range     $(X_{75} - X_{25})/2$                                    Median
    Average Absolute Deviation   $\frac{\sum |X_i - \bar{X}|}{N}$                         Mean
    Variance                     $\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}$         Mean
    Standard Deviation           $\sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}}$  Mean
The Range
  The simplest measure of variability
    Range (R) = $X_{highest} - X_{lowest}$
  Advantage
    Easy to Calculate
  Disadvantages
    Like the Median, only dependent on two scores, so it is unstable
      {0, 8, 9, 9, 11, 53} Range = 53
      {0, 8, 9, 9, 11, 11} Range = 11
    Does not reflect all scores

Variability: IQR
  Interquartile Range = $P_{75} - P_{25}$ or $Q_3 - Q_1$
  This helps to get a range that is not influenced by the extreme high and low scores
  Where the range is the spread across 100% of the scores, the IQR is the spread across the middle 50%

Variability: SIQR
  Semi-interquartile range = $(P_{75} - P_{25})/2$ or $(Q_3 - Q_1)/2$
    IQR/2
  This is half the spread of the middle 50% of the data
  The average distance of Q1 and Q3 from the median
  Better for skewed data
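To make the three range-based measures concrete, here is a small sketch that computes the range for the two data sets on the Range slide, and the IQR and SIQR for the squares data set used in the percentile examples. The helper names are hypothetical, and the quartiles are located with the (p)(n + 1) place rule from the percentile slides; other quartile conventions exist and can give slightly different values.

```python
import math

def percentile(scores, p):
    """Interpolated score at place (p)(n + 1); clamping at the ends is an added assumption."""
    ordered = sorted(scores)
    n = len(ordered)
    place = p * (n + 1)
    place1 = math.floor(place)
    if place1 < 1:
        return ordered[0]
    if place1 >= n:
        return ordered[-1]
    value1, value2 = ordered[place1 - 1], ordered[place1]
    return value1 + (place - place1) * (value2 - value1)

def value_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

def iqr(scores):
    """Interquartile range = P75 - P25."""
    return percentile(scores, 0.75) - percentile(scores, 0.25)

def siqr(scores):
    """Semi-interquartile range = IQR / 2."""
    return iqr(scores) / 2

print(value_range([0, 8, 9, 9, 11, 53]))  # one extreme score sets the range
print(value_range([0, 8, 9, 9, 11, 11]))

squares = [1, 4, 9, 16, 25, 36, 49, 64, 81]
print(iqr(squares), siqr(squares))
```

Changing a single score from 53 to 11 moves the range from 53 to 11, which is the instability the slide warns about.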
Variability: SIQR
  Semi-Interquartile range
    [Figure: two distributions, each marked with Q1, Q2, Q3]

Variance
  The average squared distance of each score from the mean
  Also known as the mean square
  Variance of a sample: $s^2$
  Variance of a population: $\sigma^2$

Variance
  When calculated for a sample
    $s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1}$
  When calculated for the entire population
    $\sigma^2 = \frac{\sum (X_i - \bar{X})^2}{N}$
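The two variance formulas differ only in the divisor, which a short sketch makes visible. The scores are borrowed from the Range slide purely as an example, and the function names are hypothetical.

```python
def sum_of_squares(scores):
    """Sum of squared deviations of each score from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def sample_variance(scores):
    """s^2: sum of squares divided by N - 1."""
    return sum_of_squares(scores) / (len(scores) - 1)

def population_variance(scores):
    """sigma^2: sum of squares divided by N."""
    return sum_of_squares(scores) / len(scores)

scores = [0, 8, 9, 9, 11, 11]  # mean is 8, sum of squares is 84
print(sample_variance(scores))      # 84 / 5
print(population_variance(scores))  # 84 / 6
```

For the same scores the sample formula always gives the larger value, because the same sum of squares is divided by a smaller number.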
Standard Deviation
  Variance is in squared units
  What about regular old units?
  Standard Deviation = Square root of the variance
    $s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N - 1}}$

Standard Deviation
  Uses measure of central tendency (i.e. mean)
  Uses all data points
  Has a special relationship with the normal curve
  Can be used in further calculations
  Standard Deviation of Sample = SD or s
  Standard Deviation of Population = σ

Why N-1?
  When using a sample (which we always do) we want a statistic that is the best estimate of the parameter
    $E\left[\frac{\sum (X_i - \bar{X})^2}{N - 1}\right] = \sigma^2$
    $E\left[\sqrt{\frac{\sum (X_i - \bar{X})^2}{N - 1}}\right] = \sigma$
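One way to see why N - 1 is used for the variance is to enumerate every possible sample from a tiny population and average the resulting estimates. The sketch below is an illustration under stated assumptions: the population {1, 2, 3}, samples of size 2 drawn with replacement, and all names are hypothetical choices, not taken from the slides. Exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from itertools import product

def variance(values, divisor):
    """Sum of squared deviations from the mean, divided by the given divisor."""
    m = Fraction(sum(values), len(values))
    return sum((x - m) ** 2 for x in values) / divisor

population = [1, 2, 3]
n = 2  # sample size

# Population variance (divide by N): the parameter being estimated.
sigma_squared = variance(population, len(population))

# Every ordered sample of size n drawn with replacement (9 samples in total).
samples = list(product(population, repeat=n))

# Average of the sample variances under each divisor.
average_with_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)
average_with_n = sum(variance(s, n) for s in samples) / len(samples)

print(sigma_squared, average_with_n_minus_1, average_with_n)
```

Across all samples, dividing by N - 1 averages out to the population variance exactly, while dividing by N comes out too small. This sketch only checks the variance; it makes no claim about the square-root (standard deviation) version.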
Degrees of Freedom
  Usually referred to as df
  Number of observations minus the number of restrictions
    __ + __ + __ + __ = 10    4 free spaces
    2 + __ + __ + __ = 10     3 free spaces
    2 + 4 + __ + __ = 10      2 free spaces
    2 + 4 + 3 + __ = 10       Last space is not free!! Only 3 dfs.

Reducing Distributions
  Regardless of numbers of scores, distributions can be described with three pieces of info:
    Central Tendency
    Variability
    Shape (Normal, Skewed, etc.)

Terms that Describe Distributions
    Term                  Features
    "Symmetric"           left side is mirror image of right side
    "Positively skewed"   right tail is longer than the left
    "Negatively skewed"   left tail is longer than the right
    "Unimodal"            one highest point
    "Bimodal"             two high points
    "Normal"              unimodal, symmetric, asymptotic
  [Example column: one figure per term]
Normal Distribution
  [Plot: density f(x), 0 to 0.025, against x from 20 to 180]
  Example: The Mean = 100 and the Standard Deviation = 20

Normal Distribution (Characteristics)
  Horizontal Axis = possible X values
  Vertical Axis = density (i.e. f(x), related to probability or proportion)
  Defined as
    $f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(X - \mu)^2}{2\sigma^2}}$
    $f(X) = \frac{1}{(s)\sqrt{2 \times (3.14159265)}} \times (2.71828183)^{-\frac{(X - \bar{X})^2}{2s^2}}$
  The distribution relies on only the mean and s
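The density formula can be evaluated directly, which shows that the mean and standard deviation are the only inputs it needs. This is a minimal sketch; the function name `normal_density` is hypothetical, and the values 100 and 20 come from the example on the Normal Distribution slide.

```python
import math

def normal_density(x, mean, sd):
    """f(X) = 1 / (sd * sqrt(2*pi)) * e ** (-(X - mean)**2 / (2 * sd**2))."""
    coefficient = 1 / (sd * math.sqrt(2 * math.pi))
    exponent = -((x - mean) ** 2) / (2 * sd ** 2)
    return coefficient * math.exp(exponent)

# Height of the curve at the mean and one SD either side, for mean = 100, sd = 20.
for x in (80, 100, 120):
    print(x, round(normal_density(x, 100, 20), 5))
```

The curve peaks at the mean, at a height just under 0.02 for this example, and the heights at 80 and 120 are equal because the curve is symmetric.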
Normal Distribution (Characteristics)
  Bell shaped, symmetrical, unimodal
  Mean, median, mode all equal
  No real distribution is perfectly normal
  But, many distributions are approximately normal, so normal curve statistics apply
  Normal curve statistics underlie procedures in most inferential statistics.

Normal Distribution
  [Plot: f(x) with the x-axis marked at µ - 4sd, µ - 3sd, µ - 2sd, µ - 1sd, µ, µ + 1sd, µ + 2sd, µ + 3sd, µ + 4sd]

The standard normal distribution
  A normal distribution with the added properties that the mean = 0 and the s = 1
  Converting a distribution into a standard normal means converting raw scores into Z-scores
Z-Score Formula
  Raw score to Z-score
    $Z = \frac{X - \bar{X}}{s} = \frac{\text{score} - \text{mean}}{\text{standard deviation}}$
  Z-score to Raw score
    $X = Z(s) + \bar{X}$

Properties of Z-Scores
  A Z-score indicates how many SDs a score falls above or below the mean.
  Positive z-scores are above the mean.
  Negative z-scores are below the mean.
  Area under the curve corresponds to probability
  Z is continuous, so probability can only be computed for a range of values

Properties of Z-Scores
  Most z-scores fall between -3 and +3 because scores beyond 3 sd from the mean are very rare
  Z-scores are standardized scores, which allows for easy comparison of distributions
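Both directions of the Z-score formula are one-liners. The sketch below uses a mean of 100 and a standard deviation of 20, borrowed from the normal distribution example, as assumed inputs; the function names are hypothetical.

```python
def z_score(x, mean, sd):
    """Raw score to Z-score: Z = (X - mean) / sd."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """Z-score to raw score: X = Z * sd + mean."""
    return z * sd + mean

mean, sd = 100, 20
print(z_score(130, mean, sd))    # 30 points above the mean, in SD units
print(raw_score(-2, mean, sd))   # the score two SDs below the mean
```

Because the two functions are inverses, converting a raw score to Z and back returns the original score.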
The standard normal distribution
  Rough estimates of the SND (i.e. Z-scores):
    [Figure: standard normal curve with rough area estimates]

Have/Need Chart
  When rough estimating isn't enough
    Raw Score to Z-score: $Z = \frac{X - \bar{X}}{s}$
    Z-score to Area under Distribution: Z-table
    Area under Distribution to Z-score: Z-table
    Z-score to Raw Score: $X = Z(s) + \bar{X}$

What about negative Z values?
  Since the normal curve is symmetric, areas beyond, between, and below positive z scores are identical to areas beyond, between, and below negative z scores.
  There is no such thing as negative area!
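A printed Z-table lists areas under the standard normal curve; the same areas can be computed. The sketch below uses `NormalDist` from Python's standard `statistics` module (available since Python 3.8) as a stand-in for the table. The helper names `area_below`, `area_beyond`, and `area_between` are hypothetical, and "beyond" is taken here to mean the upper tail above z.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def area_below(z):
    """Proportion of the distribution at or below z."""
    return standard_normal.cdf(z)

def area_beyond(z):
    """Proportion of the distribution above z (upper tail)."""
    return 1 - standard_normal.cdf(z)

def area_between(z_low, z_high):
    """Proportion of the distribution between two z values."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Symmetry: the area below -1 matches the area beyond +1.
print(round(area_below(-1.0), 4), round(area_beyond(1.0), 4))
print(round(area_between(-1.0, 1.0), 4))
```

A negative z never produces a negative area; it only changes which side of the mean the area sits on.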
Norms and Norm-Referenced Tests
  Norm: statistical representations of a population (e.g. mean, median).
  Norm-referenced test (NRT): compares an individual's results on the test with the pre-established norm
  Made to compare test-takers to each other
  I.e., the Normal Curve

Norms and Norm-Referenced Tests
  Normally, rather than testing an entire population, the norms are inferred from a representative sample or group (inferential stats revisited).
  Norms allow for a better understanding of how an individual's scores compare with the group with which they are being compared
  Examples: WAIS, SAT, MMPI, Graduate Record Examination (GRE)

Criterion-Referenced Tests
  Criterion-referenced tests (CRTs): intended to measure how well a person has mastered a specific knowledge set or skill
  Cutscore: point at which an examinee passes if their score exceeds that point; can be decided by a panel or by a single instructor
  Criterion: the domain in which the test is designed to assess