Shining a light on NOIR - Rethinking Scales of Measurement

Chris Brunsdon
National Centre for Geocomputation, Maynooth University
Contact: christopher.brunsdon@nuim.ie
Sep 2017 (ECTQG@York)
Chris Brunsdon NOIR (1 of 51)

Stevens's Original Scales

NOMINAL
  Basic empirical operations: determination of equality
  Mathematical group structure: permutation group, $x' = f(x)$, where $f$ permutes the $x$ values
  Permissible statistics: number of cases, mode
ORDINAL
  Basic empirical operations: determination of greater or less
  Mathematical group structure: isotonic group, $x' = f(x)$, where $f$ is any monotonic increasing function
  Permissible statistics: median, percentiles
INTERVAL
  Basic empirical operations: determination of equality of intervals or differences
  Mathematical group structure: general linear group, $x' = ax + b$
  Permissible statistics: mean, standard deviation, rank order correlation
RATIO
  Basic empirical operations: determination of equality of ratios
  Mathematical group structure: similarity group, $x' = ax$
  Permissible statistics: coefficient of variation

Properties accumulate as the table is descended (N-O-I-R): restrictive in column 3, expansive in columns 2 and 4.
See Stevens, Stanley Smith. 1946. On the Theory of Scales of Measurement. Science 103: 677-680.

A permissible operator view

$x \circ y = z \Rightarrow f(x) \circ f(y) = f(z)$

Scale      Permissible operators ($\circ$)
NOMINAL    $=$, $\neq$
ORDINAL    $>$, $<$, $\geq$, $\leq$
INTERVAL   $+$, $-$
RATIO      $\times$, $\div$

A Possible Nested Arrangement?

[Diagram: Nominal, Ordinal, Interval and Ratio nested by strength of precision]

- Bad things come in threes... if you count them in threes.
- Scales of measurement are nested... if you only look at the nesting scales.
- Is the list universal?
- If not, what is missing? Or is there anything else that slots in?

Slotting In - Splitting Ordinal

[Diagram: Nominal, Graded, Rank, Interval, Ratio]

ORDINAL splits into GRADED and RANK:
- Graded membership, e.g. High, Medium, Low.
- Rank: position in a race etc.
- In one respect they are the same, i.e. $>$, $<$ etc. are valid.
- But ranks are also unique for each observation: no ties (mostly!).
- Rank-based statistics are now meaningful for comparisons.

Aside: Why are we doing this anyway?

- The driving force arguably comes from Measurement Theory.
- An aspect of scientific thought, e.g. Krantz, Luce, Suppes, and Tversky: Foundations of Measurement, vols I-III.
- The chosen scale of measurement influences what kinds of analysis are meaningful.
- Stevens uses the term "permissible".
- The main idea is that results of analyses should be invariant if the data are transformed by a permissible function $f$.
- If $X$ is the data then $A(X) = A(f(X))$, or possibly $f(A(X)) = A(f(X))$.

E.g. Temperature

Data: peak daily temperature, week 1 of July (°C)

Year   Day 1  Day 2  Day 3  Day 4  Day 5  Day 6  Day 7
2016   20.5   24.5   26.0   22.0   20.1   18.2   19.1
2017   22.4   20.1   18.7   19.2   19.1   20.3   22.7

Units  t-test p-value ($H_0: \mu_a = \mu_b$)  Mean (2016)  Mean (2017)  % difference in means
°C     0.38                                   21.5 °C      20.4 °C      -5.3%
°F     0.38                                   70.7 °F      68.6 °F      -2.9%

NB. 21.5 °C = 70.7 °F and 20.4 °C = 68.6 °F
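The invariance in the table above can be checked with a small sketch. It assumes a pooled-variance two-sample t-statistic (the slide does not say which t-test variant was used) and uses only the Python standard library, so it compares t-statistics rather than p-values; the function and variable names are illustrative only.

```python
import math
from statistics import mean, variance

# Peak daily temperatures from the slide, in degrees Celsius.
TEMPS_2016_C = [20.5, 24.5, 26.0, 22.0, 20.1, 18.2, 19.1]
TEMPS_2017_C = [22.4, 20.1, 18.7, 19.2, 19.1, 20.3, 22.7]


def pooled_t(a, b):
    """Two-sample t-statistic with pooled variance (an assumed variant)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))


def to_fahrenheit(celsius):
    """Permissible interval-scale transform x' = ax + b with a = 1.8, b = 32."""
    return [1.8 * c + 32.0 for c in celsius]


def percent_change(a, b):
    """Percentage change from mean(a) to mean(b)."""
    return 100.0 * (mean(b) - mean(a)) / mean(a)


t_celsius = pooled_t(TEMPS_2016_C, TEMPS_2017_C)
t_fahrenheit = pooled_t(to_fahrenheit(TEMPS_2016_C), to_fahrenheit(TEMPS_2017_C))
pct_celsius = percent_change(TEMPS_2016_C, TEMPS_2017_C)
pct_fahrenheit = percent_change(to_fahrenheit(TEMPS_2016_C), to_fahrenheit(TEMPS_2017_C))

print(f"t (C) = {t_celsius:.4f}, t (F) = {t_fahrenheit:.4f}")
print(f"% change (C) = {pct_celsius:.1f}, % change (F) = {pct_fahrenheit:.1f}")
```

The t-statistic (and hence the p-value) is unchanged by the unit conversion, whereas the percentage change in means is not, because a ratio of means is not invariant when the zero point moves.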

This Gives Rise to a New Scale

Example - Running shoes
Is there a difference in the rate I run for shoes A and B?

Shoes  Variable  Units   Measurements
A      Pace      min/km  6.45  6.44   5.80   5.93  6.08  6.37  6.64  6.30  6.61  6.06
A      Speed     km/hr   9.30  9.32  10.34  10.12  9.87  9.42  9.04  9.53  9.08  9.90
B      Pace      min/km  6.47  6.42   6.45   6.47  6.37  6.68  7.00  6.73  6.17  6.36
B      Speed     km/hr   9.28  9.35   9.30   9.28  9.42  8.98  8.58  8.92  9.72  9.43

- Should I choose pace or speed to test?
- Both are measures of rapidity.
- There is no obvious reason to favour one over the other.

Looking at the test(s)

t-tests using both variables:

Variable  t-statistic  D.F.  p-value
Speed      2.122       18    0.0480
Pace      -2.100       18    0.0501

Wilcoxon rank sum tests using both variables (these replace values by ranks, demoting the precision of the information):

Variable  W-statistic  p-value
Speed     75           0.0630
Pace      25           0.0630

Is there a way to carry out a consistent test without loss of information and power?

The log interval scale

- Also proposed by Stevens.
- Essentially $\log(x)$ is interval scale, not $x$.
- Group structure is $f(x) = a x^{b}$; $f$ is a permissible transform.
- Note that $\text{pace} = 60 \times \text{speed}^{-1}$, so it fits this structure.
- So $\log(\text{pace})$ and $\log(\text{speed})$ are interval data, and the t-test is permissible.

t-tests using both variables logged:

Variable  t-statistic  D.F.  p-value
Speed      2.112       18    0.0489
Pace      -2.112       18    0.0489

- Introducing this level of measurement leads to a better approach.
- Note that it implies the initial measurements are only meaningful for $x > 0$.
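A minimal sketch of why logging helps, assuming a pooled-variance t-statistic and recomputing pace as 60 / speed from the rounded speeds on the earlier slide (so the figures may differ slightly from the slide's, which presumably used unrounded data). All names are illustrative.

```python
import math
from statistics import mean, variance

# Speeds (km/hr) for shoes A and B, as printed on the slide.
SPEED_A = [9.30, 9.32, 10.34, 10.12, 9.87, 9.42, 9.04, 9.53, 9.08, 9.90]
SPEED_B = [9.28, 9.35, 9.30, 9.28, 9.42, 8.98, 8.58, 8.92, 9.72, 9.43]


def pooled_t(a, b):
    """Two-sample t-statistic with pooled variance (an assumed variant)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))


def to_pace(speeds):
    """pace (min/km) = 60 * speed^-1, a log-interval transform a * x**b."""
    return [60.0 / s for s in speeds]


def logged(values):
    return [math.log(v) for v in values]


t_speed = pooled_t(SPEED_A, SPEED_B)
t_pace = pooled_t(to_pace(SPEED_A), to_pace(SPEED_B))
t_log_speed = pooled_t(logged(SPEED_A), logged(SPEED_B))
t_log_pace = pooled_t(logged(to_pace(SPEED_A)), logged(to_pace(SPEED_B)))

print(f"raw:    t(speed) = {t_speed:.4f}, t(pace) = {t_pace:.4f}")
print(f"logged: t(speed) = {t_log_speed:.4f}, t(pace) = {t_log_pace:.4f}")
```

Because log(pace) = log(60) - log(speed) is an affine function of log(speed) with slope -1, the logged t-statistics are exact negatives of each other, while the raw ones differ in magnitude.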

This brings focus to constrained measurements

- For log interval measurements we have $x > 0$.
- Other constrained measurement levels exist:
  - probabilities, $p \in [0, 1]$ (constrained in both directions);
  - counts, where $n$ must be a non-negative integer, $n \in \mathbb{Z}^{+}$.
- Here the only permissible transform is $f(x) = x$, the identity function.
- This is the absolute scale.

Augmenting NOIR

[Diagram: Nominal, Graded, Rank, Interval, Log Interval, Ratio, Absolute]

- The hierarchical structure is gone.
- A further thought for analysis: the output statistic may be at a different level of measurement from the data.
- So p-values (absolute) must be equal under permissible input transforms.
- But means are measured at the same level as the input data, so can be equivalent under permissible interval or ratio transforms.

Measurement Scales for Statistics or Tests

Statistic                 Level of measurement
Mean                      Ratio or Interval
Quantiles                 Rank, Graded, Interval, Log Interval, Ratio, Absolute
Standard deviation        Ratio?
p-value                   Absolute
Posterior probabilities   Absolute

The Cyclic Measurement Scale

[Diagram: Nominal, Graded, Rank, Cyclic, Interval, Log Interval, Ratio, Absolute]

- Cyclic: angles, times of day, times of year.
- The difference between e.g. 359° and 357° is the same as that between 359° and 1°.
- They have a well-formed notion of $=$, $\neq$, $+$, $-$, $\times$, $\div$ but not of $>$, $<$, $\geq$, $\leq$.
- So in terms of NOIR they have some characteristics of Interval and Ratio data but not those of Ordinal.

Cyclic Measurement Scales - Defining Difference

[Figure: circle of circumference $C$ marked at 0 (or $C$), $\tfrac{1}{4}C$, $\tfrac{1}{2}C$ and $\tfrac{3}{4}C$, with arcs $\delta_1$, $\delta_2$ and $\delta_1 + \delta_2$]

- Difference is not exactly the same for cyclic data.
- Mean and circular variance are also defined differently, but are permissible.
- Quantiles are not well defined; occasionally the mean is also undefined.
- The median is defined, but not in terms of order; it is also sometimes undefined...
- ... the latter if the locations on the circle have their centre of gravity at the centre of the circle.
- Statistical tests also exist, e.g. for comparing two samples.
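One common way to define a cyclic difference is the shorter of the two arcs between the values. The sketch below assumes that definition; the function name and the `period` parameter are illustrative, not taken from the slides.

```python
def cyclic_difference(a, b, period=360.0):
    """Shortest distance between a and b on a cycle of the given period."""
    d = abs(a - b) % period
    return min(d, period - d)


# Angles in degrees: 359 vs 357 and 359 vs 1 are equally far apart.
print(cyclic_difference(359.0, 357.0))
print(cyclic_difference(359.0, 1.0))

# Times of day in hours: 23:30 and 00:30 are one hour apart.
print(cyclic_difference(23.5, 0.5, period=24.0))
```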

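Circular Mean and Standard Deviation

[Figure: circle marked at 0 (or $C$), $\tfrac{1}{4}C$, $\tfrac{1}{2}C$ and $\tfrac{3}{4}C$]

Circular mean: $\bar{x} = \operatorname{atan2}\left( \sum_i \sin(x_i), \; \sum_i \cos(x_i) \right)$

Circular SD: $\nu = \sqrt{ -\ln\left[ \left( \frac{1}{n} \sum_i \sin(x_i) \right)^2 + \left( \frac{1}{n} \sum_i \cos(x_i) \right)^2 \right] }$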
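A minimal sketch of these two statistics for angles in radians. It assumes the usual definition of the circular standard deviation, the square root of minus the log of the squared mean resultant length, and returns `None` when the centre of gravity sits (numerically) at the centre of the circle; that convention and the tolerance are illustrative choices.

```python
import math

_TOLERANCE = 1e-24  # assumed threshold for "resultant length is zero"


def _mean_resultant_squared(angles):
    n = len(angles)
    s_bar = sum(math.sin(a) for a in angles) / n
    c_bar = sum(math.cos(a) for a in angles) / n
    return s_bar, c_bar, s_bar ** 2 + c_bar ** 2


def circular_mean(angles):
    """Mean direction in radians, or None if it is undefined."""
    s_bar, c_bar, r2 = _mean_resultant_squared(angles)
    if r2 < _TOLERANCE:
        return None
    return math.atan2(s_bar, c_bar)


def circular_sd(angles):
    """Circular standard deviation in radians, or None if undefined."""
    _, _, r2 = _mean_resultant_squared(angles)
    if r2 < _TOLERANCE:
        return None
    # Clamp at 1 so rounding error cannot produce a negative radicand.
    return math.sqrt(-math.log(min(r2, 1.0)))


bearings = [math.radians(d) for d in (350.0, 10.0)]
print(math.degrees(circular_mean(bearings)))  # close to 0, not 180
print(math.degrees(circular_sd(bearings)))
```

The arithmetic mean of 350° and 10° is 180°, which points the wrong way; the circular mean is close to 0°.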

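Circular Median

[Figure: circle marked at 0 (or $C$), $\tfrac{1}{4}C$, $\tfrac{1}{2}C$ and $\tfrac{3}{4}C$]

Circular median, if working in radians:

$\operatorname{argmin}_{\psi} \left\{ \frac{1}{n} \sum_{j=1}^{n} \left( \pi - \left| \pi - \left| \theta_j - \psi \right| \right| \right) \right\}$

- $\psi$ is any angle for which half of the data points lie in $[\psi, \psi + \pi)$ and the majority of points are nearer to $\psi$ than to $\psi + \pi$.
- $\psi$ may not be unique...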
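The minimisation can be sketched by brute force. As a simplifying assumption, the candidates for $\psi$ are restricted to the observed angles themselves, which is adequate for illustration but is not a general solver; ties are broken by taking the first minimiser. Names are illustrative.

```python
import math


def arc_distance(theta, psi):
    """pi - |pi - |theta - psi||, with the inner difference reduced mod 2*pi."""
    d = abs(theta - psi) % (2.0 * math.pi)
    return math.pi - abs(math.pi - d)


def circular_median(angles):
    """Observed angle minimising the mean arc distance to all observations."""
    def mean_distance(psi):
        return sum(arc_distance(t, psi) for t in angles) / len(angles)

    return min(angles, key=mean_distance)


bearings_deg = [350.0, 355.0, 5.0, 10.0, 20.0]
bearings = [math.radians(d) for d in bearings_deg]
print(math.degrees(circular_median(bearings)))
```

For these five bearings the result is the 5° observation, even though sorting the raw numbers would put 20° in the middle.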

Example - Adding spatial weighting to circular means

[Figure panels: Moving Window Mean Directions; Streamlines]

NOAA wind direction data

Proposed Alternative Lists of Levels

Tukey and Mosteller
1 Names
2 Grades (e.g. freshmen, sophomores etc.)
3 Counted fractions bound by 0 and 1
4 Counts (non-negative integers)
5 Amounts (non-negative real numbers)
6 Balances (any real number)

Chrisman
1 Nominal
2 Graded membership
3 Ordinal
4 Interval
5 Log-Interval
6 Extensive Ratio
7 Cyclical Ratio
8 Derived Ratio
9 Counts
10 Absolute

Some Further Extensions

Increased dimension
- Initially to 2D.
- Similar to direction: no $\geq$, $>$, $\leq$, $<$.
- Obviously important for geographers!

Constraints
- E.g. values must be positive (in $\mathbb{R}^{+}$), or in $[0, 1]$, or an integer (in $\mathbb{Z}^{+}$).
- Already there implicitly in Mosteller and Tukey or Chrisman.
- Look into this in a multidimensional context.

Partially ordered sets

2D Point Data

- E.g. locations of people sitting in Gordon Square (near UCL).
- 2D measurements are integral: e.g. an easting on its own means little.
- The group structure is the set of Euclidean transforms, combinations of:
  - Scaling
  - Rotation
  - Translation

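2D Mean (Mean Centre)

$\bar{\mathbf{x}} = \operatorname{argmin}_{\mathbf{x}} \left[ \sum_{i=1}^{n} \| \mathbf{x} - \mathbf{x}_i \|^2 \right]$

- So $\bar{\mathbf{x}}$ minimises the summed squared distance to the data points.
- Associated measure of spread: $D_s = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \| \bar{\mathbf{x}} - \mathbf{x}_i \|^2 }$
- Standard distance: the root mean squared distance from $\bar{\mathbf{x}}$ to the data points.
- Both are consistent under Euclidean transforms.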
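A small sketch of the mean centre and standard distance, with a check that they behave consistently under a rotation plus translation. The point set and the helper `rotate_translate` are invented for illustration.

```python
import math


def mean_centre(points):
    """Coordinate-wise mean, which minimises the summed squared distance."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Root mean squared distance from the mean centre to the points."""
    cx, cy = mean_centre(points)
    total = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(total / len(points))


def rotate_translate(points, angle, dx, dy):
    """Hypothetical helper: rotate about the origin, then shift by (dx, dy)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]  # made-up points
moved = rotate_translate(square, math.radians(30.0), 5.0, -3.0)

print(mean_centre(square), standard_distance(square))
print(mean_centre(moved), standard_distance(moved))
```

The mean centre of the moved points is the moved mean centre, and the standard distance is unchanged.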

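2D Median (Median Centre)

$\tilde{\mathbf{x}} = \operatorname{argmin}_{\mathbf{x}} \left[ \sum_{i=1}^{n} \| \mathbf{x} - \mathbf{x}_i \| \right]$

- So $\tilde{\mathbf{x}}$ minimises the summed absolute distance to the data points.
- Associated measure of spread: $D_m = \operatorname{median}\left( \| \tilde{\mathbf{x}} - \mathbf{x}_i \| \right)$
- Median distance: the median distance from $\tilde{\mathbf{x}}$ to the data points.
- Both are also consistent under Euclidean transforms.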
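The median centre has no closed form in general. One standard numerical approach, not named on the slide, is Weiszfeld's iteration; the sketch below assumes it, starts from the mean centre, and simply skips any data point that coincides with the current estimate (a known simplification). The sample points are invented.

```python
import math
from statistics import median


def median_centre(points, iterations=200, eps=1e-12):
    """Approximate the point minimising summed distance (Weiszfeld's iteration)."""
    n = len(points)
    x = sum(px for px, _ in points) / n
    y = sum(py for _, py in points) / n
    for _ in range(iterations):
        wsum = xs = ys = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < eps:
                continue  # simplification: ignore points at the estimate
            w = 1.0 / d
            wsum += w
            xs += w * px
            ys += w * py
        if wsum == 0.0:
            break
        x, y = xs / wsum, ys / wsum
    return x, y


def median_distance(points):
    """Median of the distances from the median centre to the points."""
    cx, cy = median_centre(points)
    return median(math.hypot(px - cx, py - cy) for px, py in points)


# Made-up points: three near the origin and one distant outlier.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(median_centre(pts))    # near (0.5, 0.5); the mean centre is (2.75, 2.75)
print(median_distance(pts))
```

These four points form a convex quadrilateral, for which the minimiser is the intersection of the diagonals, (0.5, 0.5); the outlier drags the mean centre but barely affects the median centre.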

Thoughts on Medians

- They can be defined even for scales of measurement without $\geq$, $>$, $\leq$, $<$ operators...
- ... on the basis of distance.
- This also implies measures of spread...
- ... based on this distance.
- Generally (i.e. it needs proving!) if the level of measurement can be ordered, the median corresponds to the 50th percentile.
- But it doesn't need this to be defined!

Compositional Data

[Figure: Household Type by Dáil Constituency. Proportions of each household type (House/Bungalow, Flat/Apartment, Not Stated, Bed Sit, Caravan/Mobile home) for each constituency, from Limerick County to Dublin Central]

- There is a set of proportions for each constituency, and they add up to one.
- That is, the sums down the columns all add up to one, and all values must be greater than or equal to zero.

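Permissible Transforms

- Arguably none.
- Measurements are $\mathbf{p} = (p_1, p_2, \ldots, p_m)$ such that $p_1, p_2, \ldots, p_m \geq 0$ and $\sum_{j=1}^{m} p_j = 1$.
- Any translation, rotation, scaling etc. would violate these constraints.
- In this sense, they are similar to a multidimensional absolute level.
- Note that the dimension of $\mathbf{p}$ is $m - 1$.
- The multidimensional mean and median, as for 2D data, still make sense.
- If $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ meet the constraints, so do their mean and median centres.
- Weighted versions are usually more useful:

$\bar{\mathbf{x}} = \operatorname{argmin}_{\mathbf{x}} \left[ \sum_{i=1}^{n} w_i \| \mathbf{x} - \mathbf{x}_i \|^2 \right]$

$\tilde{\mathbf{x}} = \operatorname{argmin}_{\mathbf{x}} \left[ \sum_{i=1}^{n} w_i \| \mathbf{x} - \mathbf{x}_i \| \right]$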

Means, Medians etc.

Statistic  Weighted  House/Bungalow  Flat/Apartment  Bed-Sit  Caravan/Mobile  Not Stated
Median     N         0.829           0.142           0.007    0.003           0.019
Median     Y         0.827           0.144           0.008    0.003           0.019
Mean       N         0.871           0.106           0.003    0.003           0.017
Mean       Y         0.870           0.107           0.003    0.003           0.017

Inhomogeneity of Distance?

- There is more potential for large distances further away from the constraints.
- Is a transformation onto an $(m - 1)$-dimensional unconstrained space useful?

[Figure: scatter plot of Other against House/Bungalow, both axes running from 0.0 to 1.0]

Proposed Approach (Aitchison)

Isometric Log Ratios
- Firstly transform to $\left( \log\frac{p_1}{g(\mathbf{p})}, \log\frac{p_2}{g(\mathbf{p})}, \ldots, \log\frac{p_m}{g(\mathbf{p})} \right)$, where $g(\mathbf{p})$ is the geometric mean of the components of $\mathbf{p}$.
- Then express these as coordinates with respect to an $(m - 1)$-dimensional orthogonal basis.
- Euclidean distances in this space correspond to an alternative measure of distance for $(p_1, p_2, \ldots, p_m)$ proposed by Aitchison.
- An inverse transform exists.
- Can compute the mean and median on a distance basis in the transformed space.
- Then transform back to composition space.
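The transform can be sketched as follows. The slides do not say which orthonormal basis was used, so this example assumes one particular Helmert-style basis; any other orthonormal basis would give the same distances. Zero components are not handled, since their logarithm is undefined. The two compositions are the unweighted median and mean rows from the earlier summary table.

```python
import math


def clr(p):
    """Centred log-ratio: log(p_j / g(p)), with g the geometric mean."""
    logs = [math.log(v) for v in p]
    centre = sum(logs) / len(logs)
    return [v - centre for v in logs]


def ilr(p):
    """Coordinates of clr(p) in an assumed Helmert-style orthonormal basis."""
    logs = [math.log(v) for v in p]
    coords = []
    for k in range(1, len(p)):
        mean_first_k = sum(logs[:k]) / k
        coords.append(math.sqrt(k / (k + 1)) * (mean_first_k - logs[k]))
    return coords


def ilr_inverse(z):
    """Map ilr coordinates back to a composition that sums to one."""
    m = len(z) + 1
    c = [0.0] * m  # rebuilt clr coordinates
    for k in range(1, m):
        coef = math.sqrt(k / (k + 1))
        for j in range(k):
            c[j] += z[k - 1] * coef / k
        c[k] -= z[k - 1] * coef
    expd = [math.exp(v) for v in c]
    total = sum(expd)
    return [v / total for v in expd]


def aitchison_distance(p, q):
    """Euclidean distance between the ilr coordinates of p and q."""
    return math.dist(ilr(p), ilr(q))


p = [0.829, 0.142, 0.007, 0.003, 0.019]
q = [0.871, 0.106, 0.003, 0.003, 0.017]
print(ilr(p))
print(ilr_inverse(ilr(p)))
print(aitchison_distance(p, q))
```

A mean or median centre would then be computed on the `ilr` coordinates and passed through `ilr_inverse` to return to composition space.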

The transformed composition data

[Figure: scatter-plot matrix of the four transformed variables, var 1 to var 4]

The recomputed summary statistics

Statistic  Weighted  House/Bungalow  Flat/Apartment  Bed-Sit  Caravan/Mobile  Not Stated
Median     N         0.884           0.094           0.002    0.002           0.018
Median     Y         0.882           0.095           0.002    0.002           0.018
Mean       N         0.900           0.079           0.002    0.002           0.017
Mean       Y         0.898           0.081           0.002    0.002           0.017

Discussion

- The ilr transform is rather like log.
- The transformed data are on a multidimensional measurement scale.
- Permitted transforms are Euclidean: rotation, translation.
- Similar to the earlier 2D case.

Directional Data Revisited

- Also interpretable as 2D data? $\mathbf{x} = (x, y)$
- The constraint is $x^2 + y^2 = 1$.
- There is also a log connection:
  - the complex representation is $e^{i\theta}$;
  - the log of this is $i\theta$;
  - this can work as an interval scale, although the inverse transform is cyclic: $e^{i\theta} = e^{i(\theta + 2k\pi)}$ if $k \in \mathbb{Z}$.

Partially Ordered Sets - The Basics

A partially ordered set has some pairs of members for which $\preceq$ holds, but not all pairs.

Properties of $\preceq$ and friends:
1 $a \preceq a$ (Reflexivity)
2 If $a \preceq b$ and $b \preceq a$ then $a = b$ (Antisymmetry)
3 If $a \preceq b$ and $b \preceq c$ then $a \preceq c$ (Transitivity)
4 $a \prec b$ implies $a \preceq b$ but $a \neq b$
5 $b \succeq a$ means the same as $a \preceq b$

Take-home: $\preceq$ etc. work like comparison operators such as $<$, but only on some pairs of objects...

An Example Data Set

Data description:

Indicator  Name        Description
$s_1$      Income      Per capita income (1974)
$s_2$      Illiteracy  Illiteracy (1970, percent of popn.)
$s_3$      LifeExp     Life expectancy in years (1969-71)
$s_4$      Murder      Murder and non-negligent manslaughter rate per 100,000 popn. (1976)
$s_5$      HSGrad      Percent high-school graduates (1970)

Table: US Well-being variables by State

US states data as a poset

Definition of $\preceq$ etc. here: for US states $a$ and $b$, $a \preceq b$ if and only if

$s_{1a} \leq s_{1b}$ and $s_{2a} \geq s_{2b}$ and $s_{3a} \leq s_{3b}$ and $s_{4a} \geq s_{4b}$ and $s_{5a} \leq s_{5b}$

- $a \prec b$ implies state $b$ is doing better than state $a$ on all indicators.
- States are no longer fully rankable, but some still precede others.
- This only requires consensus on the sign of each variable, not on the weighting.
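A sketch of this dominance relation, assuming that higher is better for income, life expectancy and high-school graduation, and lower is better for illiteracy and murder rate. The three regions and their values are invented purely for illustration; they are not the real state data.

```python
# Direction of each indicator: +1 if higher is better, -1 if lower is better.
# Order: Income, Illiteracy, LifeExp, Murder, HSGrad (signs are an assumption).
SIGNS = (+1, -1, +1, -1, +1)


def precedes(a, b, signs=SIGNS):
    """True if b does at least as well as a on every indicator."""
    return all(s * x <= s * y for s, x, y in zip(signs, a, b))


def comparable(a, b, signs=SIGNS):
    """True if the two indicator vectors are ordered one way or the other."""
    return precedes(a, b, signs) or precedes(b, a, signs)


# Hypothetical regions with made-up indicator values.
regions = {
    "P": (4000, 2.0, 69.0, 10.0, 45.0),
    "Q": (4500, 1.5, 70.5, 8.0, 50.0),
    "R": (5000, 1.8, 70.0, 9.0, 55.0),
}

for first in regions:
    for second in regions:
        if first < second:
            print(first, second, comparable(regions[first], regions[second]))
```

Here P precedes both Q and R, but Q and R are not comparable: R has the higher income while Q has the lower illiteracy.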

Visualising the US poset:

[Figure: Hasse Diagram (Peeled Minimal Elements) of the US states]

Some terminology

- A chain $C \subseteq P$ is a set such that all $a, b$ in $C$ are comparable. Note that a chain is therefore an ordered set.
- A chain is maximal if no other chain $C'$ exists such that $C \subset C'$.
- The depth of a poset $\{P, \preceq\}$ is the length of its longest chain.
- An antichain $A \subseteq P$ is a set such that no distinct $a, b$ in $A$ are comparable.
- An antichain is maximal if no other antichain $A'$ exists such that $A \subset A'$.
- An element $a \in P$ is a maximal element if there is no element $b \in P$ such that $a \prec b$.
- The maximal element set is the set of all such elements.
- Similarly for minimal elements.
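These definitions can be illustrated on a small textbook poset rather than the states data: the divisors of 12 ordered by divisibility. The sketch assumes chain length is counted in elements; all function names are illustrative.

```python
from functools import lru_cache

ELEMENTS = (1, 2, 3, 4, 6, 12)


def precedes(a, b):
    """a precedes b when a divides b."""
    return b % a == 0


def strictly_precedes(a, b):
    return a != b and precedes(a, b)


def maximal_elements(elements):
    """Elements with nothing strictly above them."""
    return [a for a in elements
            if not any(strictly_precedes(a, b) for b in elements)]


def minimal_elements(elements):
    """Elements with nothing strictly below them."""
    return [a for a in elements
            if not any(strictly_precedes(b, a) for b in elements)]


def depth(elements):
    """Number of elements in the longest chain."""
    @lru_cache(maxsize=None)
    def longest_from(a):
        above = [b for b in elements if strictly_precedes(a, b)]
        return 1 + max((longest_from(b) for b in above), default=0)

    return max(longest_from(a) for a in elements)


def is_antichain(subset):
    """True if no two distinct members are comparable."""
    return not any(strictly_precedes(a, b) for a in subset for b in subset)


print(maximal_elements(ELEMENTS), minimal_elements(ELEMENTS))
print(depth(ELEMENTS))
print(is_antichain((4, 6)), is_antichain((2, 4)))
```

One longest chain is 1, 2, 4, 12, so the depth is 4; 4 and 6 form an antichain because neither divides the other.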

Geographical Hasse Diagram is Revealing...

[Figure: Hasse Diagram (Based on Geographical Location)]

In general, states in the north west tend to enjoy a better state of well-being (at least on the basis of this index)...

Some chains of well-being

[Figure: State-focused Relationship Maps for Florida, California, Alabama, Texas, New York and Vermont. Legend (chain order): 4 Downstream, 3 Downstream, 2 Downstream, 1 Downstream, Self, 1 Upstream, 2 Upstream, 3 Upstream, 4 Upstream, Not in chain]

Minimal and Maximal Elements and the Maximal Antichain

[Figure: Significant Set Maps. Panels: Maximal Antichain, Minimal Elements, Maximal Elements. Legend: Member, Not Member]

Do these sets cluster?...

It looks like they do, at least here:

                      Minimal Elements  Maximal Elements  Maximal Antichain
Join Count statistic  5.043             4.076             2.817
p-value               0.000             0.000             0.002

Tobler revisited? Not everything is comparable to everything else, but near things are less likely to be comparable than distant things.

Are Measurement Theory and Stevens's Scales Helpful Anyway?

- The idea is not without its critics.
- The original simple idea would be helpful if comprehensive.
- But it isn't! Especially for geographers...
- Rather like "i before e except after c":
- "My neighbour is agreeing to reimburse the concierge with madeira and caffeine."
- So many contradictions, hardly a structural rule...

Against Proscription

- The previous points were a critique of Stevens's particular categorisation...
- ... but not of measurement theory per se.
- Are there times when it makes sense to use an analysis technique that isn't permissible?
- A lot of non-parametric statistical methods do this.

Counterexamples

Spearman's Rank Correlation Coefficient
- Equivalent to Pearson's coefficient applied to ranks...
- ... calculating means and variances of ranks - NOT PERMISSIBLE!

Wilcoxon Rank Sums test
- ADDING RANKS - NOT PERMISSIBLE!

Tensions between Measurement Theory and Statistical Models

- Doesn't statistical modelling make this redundant?
- Choosing a log interval scale might imply t-tests on logged data.
- But so would a log-normal model.
- Indeed, although taking logs in the running example ensures an invariant p-value...
- ... it would be numerically incorrect if the model assumption were not true.
- Also it is quite possible to derive the distribution of a sum of ranks...
- ... even though measurement theory says this is meaningless!

But in some ways not meaningless...

- A higher average ordinal score does imply more high-ranking scores.
- It's just that differences don't make sense: 4.5 > 4.2, but 4.5 to 4.2 is not the same as 3.5 to 3.2 or 1.9 to 1.6...
- Similarly, ID numbers may be thought of as nominal.
- BUT if allocated in sequence they may be a proxy for ordinal time.
- Floor in an apartment block is ordinal, but could be ratio if all floors are the same height.
- It depends on context as well as measurement level.

Some quotes...

"Permission is not required in data analysis. If a mathematician gives or withholds permission..., he (sic) may be accessory to helping the practitioner escape the reality of defining the research problem."
Guttman, 1977

"Experience has shown in a wide range of situations that the application of proscribed statistics to data can yield results that are scientifically meaningful, useful in making decisions, and valuable as a basis for further research."
Velleman and Wilkinson, 1993

Perhaps if not axiomatic, still sometimes helpful?

- Ultimately we need to think about the research questions themselves.
- Not the scale of measurement of the data used to investigate them.
- Scales of measurement can occasionally provide useful guidelines, though.
- Thus although means of Likert scales can be compared, they do not convey the full richness of interval or ratio means.
- Possibly the idea of casting, as in the C programming language, is useful.
- x = (float) i or n = (int) y convert data of one type to another...
- ... but sometimes with a loss of detail, or of future flexibility.

Occasionally the idea of Measurement Scale is Food For Thought

- Spearman's Rank Correlation Coefficient is flawed in measurement scale terms...
- ... but Kendall's $\tau$ coefficient isn't.
- Kendall: suppose we have two variables for each case $i$: $x_i$ and $y_i$.
- If we choose two cases at random and label them $i$ and $j$ so that $x_i > x_j$, let $p = \Pr(y_i > y_j)$; then $\tau = 2p - 1$ (in the absence of ties).
- It only uses $>$, no means etc., and is therefore fine for any ordered levels.
- A local statistic can be made out of it if a location $l$ and radius $r$ are associated with $p$, with the further condition that observations $i$ and $j$ are within a distance $r$ of $l$.
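A minimal sketch of Kendall's coefficient that uses only order comparisons. It counts concordant and discordant pairs directly (the tau-a form, which is an assumption; tied pairs count as neither), and $p$ is taken to be the proportion of concordant pairs. The sample values are made up.

```python
from itertools import combinations


def _order(a, b):
    """+1 if a > b, -1 if a < b, 0 if tied; uses only comparisons."""
    return (a > b) - (a < b)


def pair_counts(x, y):
    """Return (concordant, discordant, total) over all pairs of cases."""
    concordant = discordant = total = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        total += 1
        agreement = _order(xi, xj) * _order(yi, yj)
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
    return concordant, discordant, total


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant, discordant, total = pair_counts(x, y)
    return (concordant - discordant) / total


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
concordant, _, total = pair_counts(x, y)
p = concordant / total

print(kendall_tau(x, y), 2 * p - 1)
print(kendall_tau(x, [v ** 3 for v in y]))  # unchanged by a monotone transform
```

Because only the ordering of values enters the calculation, any monotonic increasing transform of either variable leaves the result unchanged.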

Final Thoughts

- NOIR was a useful starting point, but actually there is a lot more going on (a suggested revised diagram accompanied this slide).
- I haven't covered all possible measurement scales here.
- Perhaps an axiomatic approach is unhelpful.
- But viewed as one way to assess analysis it has some uses...
- But perhaps we need to move beyond NOIR as quantitative and theoretical geographers.