Open Journal of Statistics, 2012, 2, 369-382
http://dx.doi.org/10.4236/ojs.2012.24045 Published Online October 2012 (http://www.scirp.org/journal/ojs)

Minimum Penalized Hellinger Distance for Model Selection in Small Samples

Papa Ngom*, Bertrand Ntep
Laboratoire de Mathématiques et Applications (LMA), Université Cheikh Anta Diop, Dakar-Fann, Senegal
Email: *papago@ucad.edu.sn, tepoo@yahoo.fr

Received May 7, ; revised June 5, ; accepted July ,

ABSTRACT

In statistical modeling, the Akaike information criterion (AIC) is a widely known and extensively used tool for model choice. The φ-divergence test statistic is a more recently developed tool for statistical model selection. The popularity of divergence criteria is, however, tempered by their known lack of robustness in small samples. In this paper, penalized minimum Hellinger distance type statistics are considered and some of their properties are established. The limit laws of the estimates and test statistics are given under both the null and the alternative hypotheses, and approximations of the power functions are deduced. A model selection criterion relative to these divergence measures is developed for parametric inference. Our interest is in the problem of choosing between two models by testing with informational type statistics when independent samples are drawn from a discrete population. We discuss the asymptotic properties and the performance of the new test procedures and investigate their small sample behavior.

Keywords: Generalized Information; Estimation; Hypothesis Test; Monte Carlo Simulation

1. Introduction

Comprehensive surveys of Pearson chi-square type statistics have been provided by many authors, such as Cochran [1], Watson [2] and Moore [3,4], in particular on quadratic forms in the cell frequencies. Recently, Andrews [5] has extended the Pearson chi-square testing method to non-dynamic parametric models, i.e., to models with covariates. Because Pearson chi-square statistics provide natural measures of the discrepancy between the observed data and a specific parametric model, they have also been used for discriminating among competing models. Such a situation is frequent in the social sciences, where many competing models are proposed to fit a given sample. A well known difficulty is that each chi-square statistic tends to become large, without an increase in its degrees of freedom, as the sample size increases. As a consequence, goodness-of-fit tests based on Pearson type
chi-square statistics will generally reject the correct specification of every competing model. To circumvent this difficulty, a popular method for model selection, similar to the use of the Akaike [6] Information Criterion (AIC), consists in considering that the lower the chi-square statistic, the better the model. This selection rule, however, does not take into account the random variations inherent in the values of the statistics. We propose here a procedure that takes into account the stochastic nature of these differences so as to assess their significance.

The main purpose of this paper is to address this issue. We propose some convenient asymptotically standard normal tests for model selection based on φ-divergence type statistics. Following Vuong [7,8], the procedures considered here test the null hypothesis that the competing models are equally close to the data generating process (DGP) versus the alternative hypothesis that one model is closer to the DGP, where the closeness of a model is measured according to the discrepancy implicit in the φ-divergence type statistic used. Thus the outcomes of our tests provide information on the strength of the statistical evidence for the choice of a model based on its goodness-of-fit (see Ngom [9]; Diédhiou and Ngom [10]). The model selection approach proposed here differs from those of Cox [11] and Akaike [12] for non-nested hypotheses. The difference is that the present approach is based on the discrepancy implicit in the divergence type statistics used, while these other approaches, like Vuong's [7] tests for model selection, rely on the Kullback-Leibler [13] information criterion (KLIC).

Beran [14] showed that by using the minimum Hellinger distance estimator, one can simultaneously obtain asymptotic efficiency and robustness properties in the presence of outliers.

*Corresponding author.

Copyright SciRes.

The works of Simpson [15] and
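Lindsay [16] have shown that, in hypothesis testing, robust alternatives to the likelihood ratio test can be generated by using the Hellinger distance. We consider a general class of estimators that is very broad and contains most of the estimators currently used in practice when forming divergence type statistics. This covers the cases studied in Harris and Basu [17]; Basu et al. [18]; Basu and Basu [19], where the penalized Hellinger distance is used.

The remainder of this paper is organized as follows. Section 2 introduces the basic notation and definitions. Section 3 gives a short overview of divergence measures. Section 4 investigates the asymptotic distribution of the penalized Hellinger distance. In Section 5, some applications for testing hypotheses are proposed. Section 6 presents some simulation results. Section 7 concludes the paper.

2. Definitions and Notation

In this section, we briefly present the basic assumptions on the model and parameter estimators, and we define our generalized divergence type statistics. We consider a discrete statistical model, i.e. $X_1, X_2, \ldots, X_n$ is an independent random sample from a discrete population with support $\mathcal{X} = \{1, \ldots, m\}$. Let $P = (p_1, \ldots, p_m)^T$ be a probability vector, i.e. $P \in \Omega_m$, where $\Omega_m$ is the simplex of probability $m$-vectors,

$$\Omega_m = \Big\{(p_1, p_2, \ldots, p_m)^T;\ p_i \ge 0,\ i = 1, \ldots, m,\ \sum_{i=1}^m p_i = 1\Big\}.$$

We consider a parametric model

$$\mathcal{P} = \big\{P_\theta = (p_1(\theta), \ldots, p_m(\theta))^T : \theta \in \Theta\big\},$$

which may or may not contain the true distribution $P$, where $\Theta$ is a compact subset of $k$-dimensional Euclidean space (with $k < m$). If $\mathcal{P}$ contains $P$, then there exists a $\theta_0 \in \Theta$ such that $P_{\theta_0} = P$, and the model $\mathcal{P}$ is said to be correctly specified.

We are interested in testing $H_0: P \in \mathcal{P}$ (with true parameter $\theta_0$) versus $H_1: P \in \Omega_m - \mathcal{P}$.

By $\|\cdot\|$ we denote the usual Euclidean norm, and we interpret probability distributions on $\mathcal{X}$ as row vectors from $\mathbb{R}^m$. For simplicity we restrict ourselves to unknown true parameters $\theta_0$ satisfying the classical regularity conditions given by Birch [20]:

1) The true $\theta_0$ is an interior point of $\Theta$ and $p_i(\theta_0) > 0$ for $i = 1, \ldots, m$. Thus $P_{\theta_0} = (p_1(\theta_0), \ldots, p_m(\theta_0))$ is an interior point of the set $\Omega_m$.

2) The mapping $P: \Theta \to \Omega_m$ is totally differentiable at $\theta_0$, so that the partial derivatives of $p_i$ with respect to each $\theta_j$ exist at $\theta_0$ and $p_i(\theta)$ has a linear approximation at $\theta_0$ given by

$$p_i(\theta) = p_i(\theta_0) + \sum_{j=1}^{k} (\theta_j - \theta_{0j}) \frac{\partial p_i(\theta_0)}{\partial \theta_j} + o(\|\theta - \theta_0\|),$$

where $o(\|\theta - \theta_0\|)$ denotes a function verifying $\lim_{\theta \to \theta_0} o(\|\theta - \theta_0\|)/\|\theta - \theta_0\| = 0$.

3) The Jacobian matrix $J(\theta_0) = \big(\partial p_i(\theta_0)/\partial \theta_j\big)_{1 \le i \le m,\ 1 \le j \le k}$ is of full rank (i.e. of rank $k$, with $k < m$).

4) The inverse mapping $P^{-1}: \mathcal{P} \to \Theta$ is continuous at $P_{\theta_0}$.

5) The mapping $P: \Theta \to \Omega_m$ is continuous at every point $\theta \in \Theta$.

Under the hypothesis that $P \in \mathcal{P}$, there exists an unknown parameter $\theta_0$ such that $P = P_{\theta_0}$, and the problem of point estimation appears in a natural way. Let $n$ be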
the sample size. We can estimate the distribution $P_{\theta_0} = (p_1(\theta_0), \ldots, p_m(\theta_0))^T$ by the vector of observed frequencies $\hat{P} = (\hat{p}_1, \ldots, \hat{p}_m)$ on $\mathcal{X}$, i.e. by a measurable mapping $\mathcal{X}^n \to \Omega_m$. This nonparametric estimator $\hat{P} = (\hat{p}_1, \ldots, \hat{p}_m)$ is defined by

$$\hat{p}_i = \frac{N_i}{n}, \qquad N_i = \sum_{j=1}^{n} T_i(X_j), \qquad \text{where } T_i(X_j) = \begin{cases} 1 & \text{if } X_j = i, \\ 0 & \text{otherwise.} \end{cases} \qquad (2.1)$$

We can now define the class of φ-divergence type statistics considered in this paper.

3. A Brief Review of φ-Divergences

Many different measures quantifying the degree of discrimination between two probability distributions have been studied in the past. They are frequently called distance measures, although some of them are not strictly metrics. They have been applied in different areas, such as medical image registration (Josien P. W. Pluim [21]), classification and retrieval, among others. This class of distances is referred to in the literature as the class of φ-, f- or g-divergences (Csiszár []; Vajda [22]; Morales et al. [23]) or as the class of disparities (Lindsay [16]). Divergence measures play an important role in statistical theory, especially in large sample theories of estimation and testing. Later, many papers appeared in the literature in which divergence or entropy type measures of information are used in testing statistical hypotheses. Among others we refer to Read and Cressie [24], Zografos et al. [25], Salicrú et al. [26], Bar-Hen and Daudin [27], Menéndez et al. [28], Pardo et al. [29] and the references therein. A measure of discrimination between two probability distributions, called φ-divergence, was
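introduced by Csiszár [30]. Recently, Broniatowski et al. [31] presented a new dual representation for divergences. Their aim was to introduce estimation and test procedures through divergence optimization for discrete or continuous parametric models. For the problem where independent samples are drawn from two different discrete populations, Basu et al. [32] developed some tests based on the Hellinger distance and penalized versions of it.

Consider two populations $X$ and $Y$ which, according to some classification criteria, can be grouped into $m$ classes (species) $x_1, x_2, \ldots, x_m$ and $y_1, y_2, \ldots, y_m$ with probabilities $P = (p_1, p_2, \ldots, p_m)$ and $Q = (q_1, q_2, \ldots, q_m)$ respectively. Then

$$D_\varphi(P, Q) = \sum_{i=1}^{m} q_i\, \varphi\Big(\frac{p_i}{q_i}\Big) \qquad (3.1)$$

is the φ-divergence between $P$ and $Q$ (see Csiszár [30]) for every $\varphi$ in the set $\Phi$ of real convex functions defined on $[0, \infty[$. The function $\varphi(t)$ is assumed to verify the following regularity condition: $\varphi: [0, +\infty[ \to \mathbb{R} \cup \{\infty\}$ is convex and continuous, where $0\,\varphi(0/0) = 0$ and $0\,\varphi(p/0) = p \lim_{u \to \infty} \varphi(u)/u$. Its restriction to $]0, +\infty[$ is finite and twice continuously differentiable in a neighborhood of $u = 1$, with $\varphi(1) = \varphi'(1) = 0$ and $\varphi''(1) = 1$ (cf. Liese and Vajda [33]).

We shall also be interested in parametric estimators

$$\hat{Q} = Q(\hat{\theta}) \qquad (3.3)$$

of $Q$, which can be obtained by means of various point estimators $\hat{\theta}: \mathcal{X}^n \to \Theta$ of the unknown parameter $\theta$.

It is convenient to measure the difference between observed and expected frequencies. A minimum divergence estimator of $\theta$ is a minimizer of $D_\varphi(\hat{P}, P_\theta)$, where $\hat{P}$ is a nonparametric distribution estimate. In our case, where the data come from a discrete distribution, the empirical distribution defined in (2.1) can be used.

In particular, if we take

$$\varphi(x) = 2 - 4\sqrt{x} + 2x = 2(1 - \sqrt{x})^2 \qquad (3.2)$$

we get the Hellinger distance between the distributions $\hat{P}$ and $P_\theta$, given by

$$D_H(\hat{P}, P_\theta) = 2 \sum_{i=1}^{m} \big(\hat{p}_i^{1/2} - p_i(\theta)^{1/2}\big)^2. \qquad (3.4)$$

Liese and Vajda [33], Lindsay [16] and Morales et al. [23] introduced the so-called minimum φ-divergence estimate, defined by

$$\hat{\theta}_\varphi = \arg\min_{\theta \in \Theta} D_\varphi(\hat{P}, P_\theta), \qquad (3.5)$$

$$D_\varphi(\hat{P}, P_{\hat{\theta}_\varphi}) = \min_{\theta \in \Theta} D_\varphi(\hat{P}, P_\theta). \qquad (3.6)$$

Remark 3.1. The class of estimates (3.5) contains the maximum likelihood estimator (MLE). In particular, if we take $\varphi(x) = -\log x + x - 1$, we get

$$\hat{\theta}_{KL_m} = \arg\min_{\theta \in \Theta} KL_m(\hat{P}, P_\theta) = \arg\min_{\theta \in \Theta} \Big(-\sum_{i=1}^{m} \hat{p}_i \log p_i(\theta)\Big) = \text{MLE},$$

where $KL_m$ is the modified Kullback-Leibler divergence.

Beran [14] first pointed out that the minimum Hellinger distance estimator (MHDE) of $\theta$, defined by

$$\hat{\theta}_H = \arg\min_{\theta \in \Theta} D_H(\hat{P}, P_\theta),$$

has robustness properties. Further results were given by Tamura and Boos [34], Simpson [15], and Basu et al. [35], to which we refer for more details on this method of estimation. Simpson, however, noted that the small sample performance of the Hellinger deviance test in some discrete models such as the Poisson is somewhat unsatisfactory, in the sense that the test requires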
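a very large sample size for the chi-square approximation to be useful (Simpson [15], Table 3). In order to avoid this problem, one possibility is to use the penalized Hellinger distance (see Harris and Basu [36]; Basu, Basu and Basu [19]; Basu et al. [32]). The penalized Hellinger distance family between the probability vectors $\hat{P}$ and $P_\theta$ is defined by

$$D_H^h(\hat{P}, P_\theta) = 2\Big[\sum_{i \in \varpi} \big(\hat{p}_i^{1/2} - p_i(\theta)^{1/2}\big)^2 + h \sum_{i \in \varpi^c} p_i(\theta)\Big], \qquad (3.7)$$

where $h$ is a real positive number, with

$$\varpi = \{i : \hat{p}_i \neq 0\} \quad \text{and} \quad \varpi^c = \{i : \hat{p}_i = 0\}. \qquad (3.8)$$

Note that when $h = 1$, this generates the ordinary Hellinger distance (Simpson [15]). Hence, with (3.7), the estimator can be written as follows:

$$\hat{\theta}_{PH} = \arg\min_{\theta \in \Theta} D_H^h(\hat{P}, P_\theta). \qquad (3.9)$$

One of the suggestions to use the penalized Hellinger distance is motivated by the fact that a suitable choice of $h$ may lead to an estimate more robust than the MLE.

A model selection criterion can be designed to estimate an expected overall discrepancy, a quantity which reflects the degree of similarity between a fitted approximating model and the generating or true model. Estimation of Kullback's information (see Kullback-Leibler [13]) is the key to deriving the Akaike Information Criterion AIC (Akaike [6]).

Motivated by the above developments, we propose, by analogy with the approach introduced by Vuong [7,8], a new information criterion relating to the φ-divergences. In our test, the null hypothesis is that the competing models are equally close to the data generating process (DGP), where the closeness of a model is measured according to the discrepancy implicit in the penalized Hellinger divergence.

4. Asymptotic Distribution of the Penalized Hellinger Distance

Hereafter, we focus on asymptotic results. We assume that the true parameter $\theta_0$ and the mapping $P: \Theta \to \Omega_m$ satisfy conditions 1-6 of Birch [20].

We consider the $m$-vector $P_\theta = (p_1(\theta), \ldots, p_m(\theta))^T$, the $m \times k$ Jacobian matrix $J_\theta = (J_{jl}(\theta))_{j = 1, \ldots, m;\ l = 1, \ldots, k}$ with $J_{jl}(\theta) = \partial p_j(\theta)/\partial \theta_l$, the $m \times k$ matrix $D_\theta = \mathrm{diag}(P_\theta^{-1/2})\, J_\theta$ and the $k \times k$ Fisher information matrix

$$I_\theta = \Big(\sum_{j=1}^{m} \frac{1}{p_j(\theta)} \frac{\partial p_j(\theta)}{\partial \theta_r} \frac{\partial p_j(\theta)}{\partial \theta_s}\Big)_{r, s = 1, \ldots, k} = D_\theta^T D_\theta,$$

where $\mathrm{diag}(P_\theta^{-1/2}) = \mathrm{diag}\big(1/\sqrt{p_1(\theta)}, \ldots, 1/\sqrt{p_m(\theta)}\big)$.

The above defined matrices are considered at the points $\theta \in \Theta$ where the derivatives exist and all the coordinates $p_j(\theta)$ are positive.

The stochastic convergences of random vectors $X_n$ to a random vector $X$ are denoted by $X_n \xrightarrow{P} X$ and $X_n \xrightarrow{L} X$ (convergence in probability and in law, respectively). Instead of $c_n \|X_n\| \xrightarrow{P} 0$ for a sequence of positive numbers $c_n$, we can write $\|X_n\| = o_p(c_n^{-1})$. This relation means that $\lim_{n \to \infty} P(c_n \|X_n\| > x) = 0$ for every $x > 0$.

An estimator $\hat{\theta}$ of $\theta_0$ is consistent if, for every $\theta_0 \in \Theta$, the random vector $(p_1(\hat{\theta}), \ldots, p_m(\hat{\theta}))$ tends in probability to $(p_1(\theta_0), \ldots, p_m(\theta_0))$, i.e. if $\lim_{n \to \infty} P(\|P_{\hat{\theta}} - P_{\theta_0}\| > x) = 0$ for all $x > 0$.

We need the following result to prove Theorem 4.3.

Proposition 4.1 (Mandal et al. [37]). Let $\varphi \in \Phi$, let $p: \Theta \to \Omega_m$ be twice continuously differentiable in a neighborhood of $\theta_0$ and assume that conditions 1-5 of Section 2 hold. Suppose that $I_{\theta_0}$ is the $k \times k$ Fisher information matrix and that $\hat{\theta}_{PH}$ satisfies (3.7). Then the limiting distribution of $\sqrt{n}(\hat{\theta}_{PH} - \theta_0)$ as $n \to +\infty$ is $N(0, I_{\theta_0}^{-1})$.

Lemma 4.2. We have

$$\sqrt{n}(\hat{P} - P_{\theta_0}) \xrightarrow{L} N(0, \Sigma_{\theta_0}),$$

where $\hat{P} = (N_1/n, \ldots, N_m/n)$ is the estimator of $P_{\theta_0} = (p_1, \ldots, p_m)$ defined in (2.1), with $\Sigma_{\theta_0} = \mathrm{diag}(P_{\theta_0}) - P_{\theta_0} P_{\theta_0}^T$.

Proof. Denote

$$V_n = \frac{1}{\sqrt{n}}\big(N_1 - n p_1, \ldots, N_m - n p_m\big), \qquad N_j = \sum_{i=1}^{n} T_j(X_i),$$

where $T_j(X_i) = 1$ if $X_i = j$ and $0$ otherwise. Applying the Central Limit Theorem we have

$$\frac{1}{\sqrt{n}}\big(N_1 - n p_1, \ldots, N_m - n p_m\big) \xrightarrow{L} N(0, \Sigma_{\theta_0}), \qquad (4.1)$$

where $\Sigma_{\theta_0} = \mathrm{diag}(P_{\theta_0}) - P_{\theta_0} P_{\theta_0}^T$.

For simplicity, we write $D_H^h(\hat{P}, P_{\hat{\theta}})$ instead of $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$.

Theorem 4.3. Under the assumptions of Proposition 4.1, we have

$$\sqrt{n}\begin{pmatrix} \hat{P} - P_{\theta_0} \\ P_{\hat{\theta}_{PH}} - P_{\theta_0} \end{pmatrix} \xrightarrow{L} N(0, \Lambda), \qquad \Lambda = \begin{pmatrix} \Sigma_{\theta_0} & \Sigma_{\theta_0} M_{\theta_0}^T \\ M_{\theta_0} \Sigma_{\theta_0} & M_{\theta_0} \Sigma_{\theta_0} M_{\theta_0}^T \end{pmatrix},$$

where $M_{\theta_0} = J_{\theta_0} I_{\theta_0}^{-1} D_{\theta_0}^T\, \mathrm{diag}(P_{\theta_0}^{-1/2})$.

Proof. A first order Taylor expansion gives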
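$$P_{\hat{\theta}_{PH}} = P_{\theta_0} + J_{\theta_0}(\hat{\theta}_{PH} - \theta_0) + o(\|\hat{\theta}_{PH} - \theta_0\|). \qquad (4.2)$$

In the same way as in Morales et al. [23], it can be established that

$$\hat{\theta}_{PH} - \theta_0 = I_{\theta_0}^{-1} D_{\theta_0}^T\, \mathrm{diag}(P_{\theta_0}^{-1/2})(\hat{P} - P_{\theta_0}) + o(\|\hat{P} - P_{\theta_0}\|). \qquad (4.3)$$

From (4.2) and (4.3) we obtain

$$P_{\hat{\theta}_{PH}} - P_{\theta_0} = J_{\theta_0} I_{\theta_0}^{-1} D_{\theta_0}^T\, \mathrm{diag}(P_{\theta_0}^{-1/2})(\hat{P} - P_{\theta_0}) + o(\|\hat{P} - P_{\theta_0}\|);$$

therefore the random vectors

$$\begin{pmatrix} \hat{P} - P_{\theta_0} \\ P_{\hat{\theta}_{PH}} - P_{\theta_0} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} I \\ M_{\theta_0} \end{pmatrix}(\hat{P} - P_{\theta_0}),$$

where $I$ is the unity matrix, have the same asymptotic distribution. Furthermore, it is clear (applying the CLT) that $\sqrt{n}(\hat{P} - P_{\theta_0}) \xrightarrow{L} N(0, \Sigma_{\theta_0})$, which implies

$$\sqrt{n}\begin{pmatrix} I \\ M_{\theta_0} \end{pmatrix}(\hat{P} - P_{\theta_0}) \xrightarrow{L} N\Big(0, \begin{pmatrix} I \\ M_{\theta_0} \end{pmatrix} \Sigma_{\theta_0} \begin{pmatrix} I & M_{\theta_0}^T \end{pmatrix}\Big);$$

therefore, we get

$$\sqrt{n}\begin{pmatrix} \hat{P} - P_{\theta_0} \\ P_{\hat{\theta}_{PH}} - P_{\theta_0} \end{pmatrix} \xrightarrow{L} N\Big(0, \begin{pmatrix} \Sigma_{\theta_0} & \Sigma_{\theta_0} M_{\theta_0}^T \\ M_{\theta_0} \Sigma_{\theta_0} & M_{\theta_0} \Sigma_{\theta_0} M_{\theta_0}^T \end{pmatrix}\Big). \qquad (4.4)$$

The case which is of interest to us here is to test the hypothesis $H_0: P \in \mathcal{P}$. Our proposal is based on the following penalized divergence test statistic, $2n D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$, where $\hat{P}$ and $\hat{\theta}_{PH}$ have been introduced in Theorem 4.3 and (3.7) respectively.

Using arguments similar to those developed by Basu [17], under the assumptions of Theorem 4.3 and the hypothesis $H_0: P = P_\theta$, the asymptotic distribution of $2n D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$ when $h = 1$ is a chi-square with $m - k - 1$ degrees of freedom. Since the other members of the penalized Hellinger distance tests differ from the ordinary Hellinger distance test only at the empty cells, they too have the same asymptotic distribution.

Consider now the case when the model is wrong, i.e. $H_1: P \neq P_\theta$. We introduce the following regularity assumptions.

(A1) There exists $\theta_1 = \arg\inf_{\theta \in \Theta} D_H^h(P, P_\theta)$ such that $\hat{\theta}_{PH} \to \theta_1$ almost surely when $n \to +\infty$.

(A2) There exists $\Lambda^*$, with

$$\Lambda^* = \begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{pmatrix} \quad \text{and} \quad \Lambda_{12} = \Lambda_{21}^T,$$

such that

$$\sqrt{n}\begin{pmatrix} \hat{P} - P \\ P_{\hat{\theta}_{PH}} - P_{\theta_1} \end{pmatrix} \xrightarrow{L} N(0, \Lambda^*).$$

Theorem 4.4. Under $H_1: P \neq P_\theta$, and assuming that conditions (A1) and (A2) hold, we have

$$\sqrt{n}\big(D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) - D_H^h(P, P_{\theta_1})\big) \xrightarrow{L} N(0, \Gamma),$$

where

$$\Gamma = H^T \Lambda_{11} H + H^T \Lambda_{12} J + J^T \Lambda_{21} H + J^T \Lambda_{22} J, \qquad (4.5)$$

with $H$ the vector of partial derivatives of $D_H^h(p, q)$ with respect to $p_1, \ldots, p_m$ and $J$ the vector of its partial derivatives with respect to $q_1, \ldots, q_m$, both evaluated at $(p, q) = (P, P_{\theta_1})$.

Proof. A first order Taylor expansion gives

$$D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) = D_H^h(P, P_{\theta_1}) + H^T(\hat{P} - P) + J^T(P_{\hat{\theta}_{PH}} - P_{\theta_1}) + o_p(n^{-1/2}). \qquad (4.6)$$

From assumptions (A1) and (A2), the result follows.

5. Applications for Testing Hypotheses

The estimate $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$ can be used to perform statistical tests.

5.1. Test of Goodness-of-Fit

For completeness, we look at $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$ in the usual way, i.e. as a goodness-of-fit statistic. Recall that here $\hat{\theta}_{PH}$ is the minimum penalized Hellinger distance estimator of $\theta$. Since $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$ is a consistent estimator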
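of $D_H^h(P, P_{\theta_1})$, the null hypothesis when using the statistic $2n D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$ is

$$H_0: D_H^h(P, P_{\theta_1}) = 0, \quad \text{or equivalently,} \quad H_0: P = P_{\theta_1}.$$

Hence, if $H_0$ is rejected, one can infer that the parametric model is misspecified. Since $D_H^h(P, P_\theta)$ is non-negative and takes the value zero only when $P = P_\theta$, the tests are defined through the critical region

$$C = \big\{2n D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) > q_{\alpha,k}\big\},$$

where $q_{\alpha,k}$ is the $(1 - \alpha)$-quantile of the $\chi^2$-distribution with $m - k - 1$ degrees of freedom.

Remark 5.1. Theorem 4.4 can be used to give the following approximation to the power of the test of $H_0: D_H^h(P, P_{\theta_1}) = 0$. The approximated power function is

$$\beta_n(P) = P\big(2n D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) > q_{\alpha,k}\big) \approx 1 - F_n\Big(\frac{\sqrt{n}}{\sqrt{\Gamma}}\Big(\frac{q_{\alpha,k}}{2n} - D_H^h(P, P_{\theta_1})\Big)\Big), \qquad (5.7)$$

where $q_{\alpha,k}$ is the $(1 - \alpha)$-quantile of the $\chi^2$-distribution with $m - k - 1$ degrees of freedom and $F_n$ is a sequence of distribution functions tending uniformly to the standard normal distribution function $\Phi(x)$. Note that if $D_H^h(P, P_{\theta_1}) \neq 0$, then for any fixed size $\alpha$ the probability of rejecting $H_0: D_H^h(P, P_{\theta_1}) = 0$ with the rejection rule $2n D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) > q_{\alpha,k}$ tends to one as $n \to \infty$.

Obtaining the approximate sample size $n$ guaranteeing a power $\beta$ for a given alternative $P$ is an interesting application of Formula (5.7). If we wish the power to be equal to $\beta^*$, we must solve the equation

$$\beta^* = 1 - \Phi\Big(\frac{\sqrt{n}}{\sqrt{\Gamma}}\Big(\frac{q_{\alpha,k}}{2n} - D_H^h(P, P_{\theta_1})\Big)\Big).$$

Writing $D = D_H^h(P, P_{\theta_1})$, it is not difficult to check that the sample size $n^*$ is a solution of the quadratic equation

$$n^2 D^2 - n(a + b) + \frac{q_{\alpha,k}^2}{4} = 0.$$

The solution is given by

$$n^* = \frac{a + b \pm \sqrt{a(a + 2b)}}{2 D^2},$$

with $a = \Gamma\big(\Phi^{-1}(1 - \beta^*)\big)^2$ and $b = q_{\alpha,k} D$, where the larger root applies when $\beta^* > 1/2$ (then $\Phi^{-1}(1 - \beta^*) < 0$, which forces $nD > q_{\alpha,k}/2$). The required size is $n_0 = [n^*] + 1$, where $[\cdot]$ denotes the integer part.

5.2. Test for Model Selection

As mentioned above, when one chooses a particular φ-divergence type statistic $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$, with $\hat{\theta}_{PH}$ the corresponding minimum penalized Hellinger distance estimator of $\theta$, one actually evaluates the goodness-of-fit of the parametric model $\mathcal{P}$ according to the discrepancy $D_H^h(P, P_\theta)$ between the true distribution $P$ and the specified model $\mathcal{P}$. Thus it is natural to define the best model among a collection of competing models to be the model that is closest to the true distribution according to the discrepancy $D_H^h(P, \cdot)$.

In this paper we consider the problem of selecting between two models. Let $\mathcal{G} = \{G_\beta = G(\cdot\,; \beta)\}$ be another model, where $\beta$ ranges over a $q$-dimensional parameter space. In a similar way, we can define the minimum penalized Hellinger distance estimator $\hat{\beta}_{PH}$ of $\beta$ and the corresponding discrepancy $D_H^h(P, G_\beta)$ for the model $\mathcal{G}$.

Our special interest is the situation in which a researcher has two competing parametric models $\mathcal{P}$ and $\mathcal{G}$ and wishes to select the better of the two based on the discrimination statistics between the observations and the models $\mathcal{P}$ and $\mathcal{G}$, defined respectively by $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$ and $D_H^h(\hat{P}, G_{\hat{\beta}_{PH}})$.

Let the two competing parametric models $\mathcal{P}$ and $\mathcal{G}$ be given, with the discrepancy $D_H^h(P, \cdot)$.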
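Definition 5.2.
$H_0^{eq}: D_H^h(P, P_{\theta_1}) = D_H^h(P, G_{\beta_1})$ means that the two models are equivalent;
$H_{\mathcal{P}}: D_H^h(P, P_{\theta_1}) < D_H^h(P, G_{\beta_1})$ means that $\mathcal{P}$ is better than $\mathcal{G}$;
$H_{\mathcal{G}}: D_H^h(P, P_{\theta_1}) > D_H^h(P, G_{\beta_1})$ means that $\mathcal{P}$ is worse than $\mathcal{G}$.

Remark 5.3. 1) The definition does not require that the same divergence type statistic be used in forming $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}})$ and $D_H^h(\hat{P}, G_{\hat{\beta}_{PH}})$. Choosing different discrepancies for evaluating competing models is, however, hardly justified.
2) The definition does not require that either of the competing models be correctly specified. On the other hand, a correctly specified model must be at least as good as any other model.

The indicator $D_H^h(P, P_{\theta_1}) - D_H^h(P, G_{\beta_1})$ is unknown, but from the previous section it can be estimated by the difference

$$D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) - D_H^h(\hat{P}, G_{\hat{\beta}_{PH}}).$$

This difference converges to zero under the null hypothesis $H_0^{eq}$, but converges to a strictly negative or positive constant when $H_{\mathcal{P}}$ or $H_{\mathcal{G}}$ holds. These properties justify the use of $D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) - D_H^h(\hat{P}, G_{\hat{\beta}_{PH}})$ as a model selection indicator and the common procedure of selecting the model with the highest goodness-of-fit.

As argued in the introduction, however, it is important to take into account the random nature of this difference so as to assess its significance. To do so we consider the asymptotic distribution of $\sqrt{n}\big(D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) - D_H^h(\hat{P}, G_{\hat{\beta}_{PH}})\big)$ under $H_0^{eq}$.

Our major task is to propose some tests for model selection, i.e. for the null hypothesis $H_0^{eq}$ against the alternative $H_{\mathcal{P}}$ or $H_{\mathcal{G}}$. We use the next lemma, with $\hat{\theta}_{PH}$ and $\hat{\beta}_{PH}$ the corresponding minimum penalized Hellinger distance estimators of $\theta$ and $\beta$. Using $\hat{P}$ and the quantities defined earlier, we consider for each model a vector $K = (k_1, \ldots, k_m)^T$, where $k_i$ is the partial derivative of $D_H^h(p, q)$ with respect to $p_i$, and a vector $Q = (q_1, \ldots, q_m)^T$, where $q_i$ is its partial derivative with respect to the $i$-th coordinate of the second argument, for $i = 1, \ldots, m$. We write $K_\theta, Q_\theta$ when these are evaluated at $(P, P_{\theta_1})$ and $K_\beta, Q_\beta$ when they are evaluated at $(P, G_{\beta_1})$.

Lemma 5.4. Under the assumptions of Theorem 4.4, we have
(i) for the model $\mathcal{P}$:

$$D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) = D_H^h(P, P_{\theta_1}) + K_\theta^T(\hat{P} - P) + Q_\theta^T(P_{\hat{\theta}_{PH}} - P_{\theta_1}) + o_p(n^{-1/2});$$

(ii) for the model $\mathcal{G}$:

$$D_H^h(\hat{P}, G_{\hat{\beta}_{PH}}) = D_H^h(P, G_{\beta_1}) + K_\beta^T(\hat{P} - P) + Q_\beta^T(G_{\hat{\beta}_{PH}} - G_{\beta_1}) + o_p(n^{-1/2}).$$

Proof. The results follow from a first order Taylor expansion.

We define

$$\Gamma^2 = (K_\theta - K_\beta;\ Q_\theta - Q_\beta)^T\, \Lambda^*\, (K_\theta - K_\beta;\ Q_\theta - Q_\beta),$$

which is the variance of the linear form with coefficients $(K_\theta - K_\beta;\ Q_\theta - Q_\beta)$. Since $K_\theta$, $K_\beta$, $Q_\theta$, $Q_\beta$ and $\Lambda^*$ are consistently estimated by their sample analogues $\hat{K}_\theta$, $\hat{K}_\beta$, $\hat{Q}_\theta$, $\hat{Q}_\beta$ and $\hat{\Lambda}^*$, $\Gamma^2$ is consistently estimated by

$$\hat{\Gamma}^2 = (\hat{K}_\theta - \hat{K}_\beta;\ \hat{Q}_\theta - \hat{Q}_\beta)^T\, \hat{\Lambda}^*\, (\hat{K}_\theta - \hat{K}_\beta;\ \hat{Q}_\theta - \hat{Q}_\beta).$$

Next we define the model selection statistic and give its asymptotic distribution under the null and alternative hypotheses. Let

$$HI^h = \frac{\sqrt{n}}{\hat{\Gamma}}\big(D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) - D_H^h(\hat{P}, G_{\hat{\beta}_{PH}})\big),$$

where $HI^h$ stands for the penalized Hellinger Indicator. The following theorem provides the limit distribution of $HI^h$ under the null and alternative hypotheses.

Theorem 5.5. Under the assumptions of Theorem 4.4, suppose that $\Gamma \neq 0$. Then
1) under the null hypothesis $H_0^{eq}$, $HI^h \xrightarrow{L} N(0, 1)$;
2) under the hypothesis $H_{\mathcal{P}}$, $HI^h \to -\infty$ in probability;
3) under the hypothesis $H_{\mathcal{G}}$, $HI^h \to +\infty$ in probability.

Proof. From Lemma 5.4, it follows that

$$D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) - D_H^h(\hat{P}, G_{\hat{\beta}_{PH}}) = D_H^h(P, P_{\theta_1}) - D_H^h(P, G_{\beta_1}) + (K_\theta - K_\beta)^T(\hat{P} - P) + Q_\theta^T(P_{\hat{\theta}_{PH}} - P_{\theta_1}) - Q_\beta^T(G_{\hat{\beta}_{PH}} - G_{\beta_1}) + o_p(n^{-1/2}).$$

Under $H_0^{eq}$, with $P_{\theta_1} = G_{\beta_1}$ and $P_{\hat{\theta}_{PH}} = G_{\hat{\beta}_{PH}}$, we get

$$D_H^h(\hat{P}, P_{\hat{\theta}_{PH}}) - D_H^h(\hat{P}, G_{\hat{\beta}_{PH}}) = (K_\theta - K_\beta;\ Q_\theta - Q_\beta)^T \begin{pmatrix} \hat{P} - P \\ P_{\hat{\theta}_{PH}} - P_{\theta_1} \end{pmatrix} + o_p(n^{-1/2}).$$

Finally, applying the Central Limit Theorem and assumptions (A1) and (A2), we immediately obtain $HI^h \xrightarrow{L} N(0, 1)$.

6. Computational Results

6.1. Example

To illustrate the model selection procedure discussed in the preceding section, we consider an example. We need to define the competing models, the estimation method used for each competing model and the penalized Hellinger type statistic used to measure the departure of each proposed parametric model from the true data generating process. For our competing models, we consider the problem of choosing between the family of Poisson distributions and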
the family of Geometric distributions. The Poisson distribution $P(\lambda)$ is parameterized by $\lambda$ and has density

$$f(x, \lambda) = \frac{\exp(-\lambda)\,\lambda^x}{x!} \quad \text{for } x \in \mathbb{N},$$

and zero otherwise. The Geometric distribution $G(p)$ is parameterized by $p$ and has density

$$g(x, p) = p\,(1 - p)^{x - 1} \quad \text{for } x \in \mathbb{N}^*,$$

and zero otherwise. We use the minimum penalized Hellinger distance statistic to evaluate the discrepancy of the proposed model from the true data generating process. We partition the real line into $m$ intervals $\{[C_{i-1}, C_i[,\ i = 1, \ldots, m\}$, where $C_0 = -\infty$ and $C_m = +\infty$. The choice of the cells is discussed below. The corresponding minimum penalized Hellinger distance estimators of $\lambda$ and $p$ are

$$\hat{\lambda}_{PH} = \arg\min_{\lambda} D_H^h(\hat{P}, P_\lambda) = \arg\inf_{\lambda}\ 2\Big[\sum_{i \in \varpi}\big(\hat{p}_i^{1/2} - p_i(\lambda)^{1/2}\big)^2 + h \sum_{i \in \varpi^c} p_i(\lambda)\Big],$$

$$\hat{p}_{PH} = \arg\min_{p} D_H^h(\hat{P}, G_p) = \arg\inf_{p}\ 2\Big[\sum_{i \in \varpi}\big(\hat{p}_i^{1/2} - p_i(p)^{1/2}\big)^2 + h \sum_{i \in \varpi^c} p_i(p)\Big],$$

where $p_i(\lambda)$ and $p_i(p)$ are the probabilities of the cells $[C_{i-1}, C_i[$ under the Poisson and Geometric distributions respectively.

We consider various sets of experiments in which data are generated from a mixture of a Poisson and a Geometric distribution. These two distributions are calibrated so that their two means are close (4 and 5 respectively). Hence the DGP (Data Generating Process) is generated from $M(\pi)$ with the density

$$m(\pi) = \pi\, \mathrm{Pois}(4) + (1 - \pi)\, \mathrm{Geom}(0.2),$$

where $\pi \in [0, 1]$ is a value specific to each set of experiments. In each set of experiments several random samples are drawn from this mixture of distributions. The sample size varies up to 300, and each sample size is replicated a fixed number of times. In each set of experiments, we choose two values of the parameter, $h = 1$ and $h = 1/2$, where $h = 1$ corresponds to the classical Hellinger distance. The aim is to compare the accuracy of the model selection depending on the parameter setting chosen.

In order to obtain a good fit by the proposed method, we note that, for the chosen parameters of these two distributions, most of the mass is concentrated on small values. Therefore, the chosen partition has eight cells, defined by $[C_{i-1}, C_i[ = [i - 1, i[$ for $i = 1, \ldots, 7$, with $[C_7, C_8[ = [7, +\infty[$ representing the last cell.

We choose different values of $\pi$, which are 0.00, 0.25, 0.535, 0.75 and 1.00. Although our proposed model selection procedure does not require that the data generating process belong to either of the competing models, we consider the two limiting cases $\pi = 1$ and $\pi = 0$, for they correspond to the correctly specified cases. To investigate the case where both competing models are misspecified but not at equal distance from the DGP, we consider the cases $\pi = 0.25$ and $\pi = 0.75$. With $\pi = 0.25$ the DGP is a Geometric distribution slightly contaminated by a Poisson distribution; with $\pi = 0.75$ it is a Poisson distribution slightly contaminated by a Geometric distribution. In the last case, $\pi = 0.535$ is the value for which the Poisson family $D_H^h(P, P_\lambda)$ and the Geometric family $D_H^h(P, G_p)$ are approximately at equal distance from the mixture $m(\pi)$ according to the penalized Hellinger distance with the above cells. Thus this set of experiments corresponds approximately to the null hypothesis of our proposed model selection test.

The results of our different sets of experiments are presented in Tables 1-5. The first half of each table gives the average values of the minimum penalized Hellinger distance estimators $\hat{\lambda}_{PH}$ and $\hat{p}_{PH}$, of the penalized Hellinger goodness-of-fit statistics $2n D_H^h(\hat{P}, P_{\hat{\lambda}})$ and $2n D_H^h(\hat{P}, G_{\hat{p}})$, and of the Hellinger indicator statistics $HI^h$. The values in parentheses are standard errors. The second half of each table gives, in percentage, the number of times our proposed model selection procedure based on $HI^h$ favors the Poisson model, favors the Geometric model, or is indecisive. The tests are conducted at the 5% nominal significance level. In the first two sets of experiments ($\pi = 1$ and $\pi = 0$), where one model is correctly specified, we use the labels correct, incorrect and indecisive when a choice is made.

The first halves of Tables 1-5 confirm our asymptotic results. They all show that the minimum penalized Hellinger estimators $\hat{\lambda}_{PH}$ and $\hat{p}_{PH}$ converge to their pseudo-true values in the misspecified cases and to their true values in the correctly specified cases as the sample size increases. With respect to our $HI^h$, it diverges to $-\infty$ or $+\infty$ at the approximate rate of $\sqrt{n}$, except in Table 5. In the latter case the $HI^h$ statistic converges, as expected, to zero, which is the mean of the asymptotic $N(0, 1)$ distribution under our null hypothesis of equivalence.

With the exception of Tables 1 and 2, we observed a large percentage of incorrect decisions. This is because both models are now incorrectly specified. In contrast, turning to the second halves of Tables 1 and 2, we first note that the percentage of correct choices using the $HI^h$ statistic steadily increases and ultimately converges to 1, and the Hellinger
Table 1. DGP = Pois(4)
3 4 5 3 (3) 95(3) 97() 5() () 395(4) 49(4) 45(3) 45(8) 45(3) D (Pois) = 33(7) 8(5) 59(3) 4(3) 37() = / 96(4) 64(3) 48() 34() 3() D (Geom) = 39(8) 348() 8(9) 8() 7(5) = 78(7) 6(8) 4(6) 36(6) 3(3) = Correct Indecisive Incorrect 367(4) 43(69) 434(38) 483(5) 497(8) 77% 3% 87% 3% = 36(33) 398(48) 373(9) 46(35) 45(87) Correct Indecisive Incorrect 7% 3% 79% % 9% 8% 83% 7% 96% 4% 86% 7% 93% 7%

Table 2. DGP = Geom(0.2)
3 4 5 3 96(4) 3(3) 3() 3() () 39() 46(89) 49(67) 49(58) 435(34) D (Pois) = 356(4) 39() 7(9) 53(8) 44(7) = 5 8() 73(7) 54(7) 46(7) 37() D (Geom) = 5(6) 89(5) 53(3) 39() 33() = 3(4) 67(3) 44() 35() 7(98) = 88(43) 56(37) 3(5) 334(4) 34(3) Correct 36% 6% 77% 84% 9% Indecisive 64% 38% 3% 6% 8% Incorrect = 7(7) 6(5) 76(96) 3(65) 49(3) Correct Indecisive Incorrect 36% 64% 6% 38% 77% 3% 84% 6% 9% 8%

Table 3. DGP = 0.75 Geom(0.2) + 0.25 Pois(4)
3 4 5 3 3(3) 97() 8(8) (5) () 46(7) 39(55) 48(55) 397(43) 4() D (Pois) = 546(3) 47() 4(9) 4(8) 367(6) = 344(7) 34(5) 3(5) 3(5) 34(3) D (Geom) = 5(6) 89(5) 53(3) 39() 33() = 367(6) 43(53) 434(47) 483(7) 537() = () 8(89) 8() 37(99) 3(84) Geom 3% 4% 5% 64% 8% Indecisive 77% 6% 5% 36% 9% Pois = 84(9) 83(7) 845(6) 967(5) 3(78) Geom Indecisive Pois 7% 8% 3% 5% 83% % 9% 89% % % 77% % 33% 66% %
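The quantities reported in these tables rest on two computations: the penalized Hellinger distance between cell frequencies and model cell probabilities, and its minimization over the model parameter. The following is a minimal sketch of both, not the authors' code. It assumes eight cells (counts 0 to 6 plus a pooled tail), a Geometric law on 1, 2, ..., and the mixture parameters Pois(4) and Geom(0.2) as read from the table captions; all function names are hypothetical, and a plain grid search stands in for whatever optimizer was actually used. It reports only the raw difference of the two fitted distances, not the studentized indicator, which would also require an estimate of the variance term.

```python
import math
import random

N_CELLS = 8  # assumed partition: counts 0..6 plus one tail cell for counts >= 7


def poisson_cells(lam):
    """Cell probabilities under Poisson(lam): P(X = 0..6) and P(X >= 7)."""
    probs = [math.exp(-lam) * lam ** x / math.factorial(x) for x in range(N_CELLS - 1)]
    return probs + [max(0.0, 1.0 - sum(probs))]


def geometric_cells(p):
    """Cell probabilities under a geometric law on 1, 2, ... (count 0 has mass 0)."""
    probs = [0.0] + [p * (1.0 - p) ** (x - 1) for x in range(1, N_CELLS - 1)]
    return probs + [max(0.0, 1.0 - sum(probs))]


def penalized_hellinger(p_hat, p_model, h=0.5):
    """2 * [squared root-differences on occupied cells + h * model mass on empty cells]."""
    occupied = sum((math.sqrt(a) - math.sqrt(b)) ** 2
                   for a, b in zip(p_hat, p_model) if a > 0)
    empty = sum(b for a, b in zip(p_hat, p_model) if a == 0)
    return 2.0 * (occupied + h * empty)


def fit_by_grid(p_hat, cell_fn, lo, hi, h=0.5, steps=2000):
    """Minimize the penalized distance over an evenly spaced parameter grid."""
    best = (float("inf"), lo)
    for k in range(steps + 1):
        param = lo + (hi - lo) * k / steps
        best = min(best, (penalized_hellinger(p_hat, cell_fn(param), h), param))
    value, param = best
    return param, value


def sample_mixture(n, pi, lam=4.0, p=0.2, seed=0):
    """Draw n counts from pi * Poisson(lam) + (1 - pi) * Geometric(p)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < pi:
            # Knuth's multiplication method for one Poisson draw
            k, prod, limit = 0, rng.random(), math.exp(-lam)
            while prod > limit:
                k += 1
                prod *= rng.random()
            draws.append(k)
        else:
            # inversion for one geometric draw on 1, 2, ...
            draws.append(int(math.log(1.0 - rng.random()) / math.log(1.0 - p)) + 1)
    return draws


def cell_frequencies(draws):
    """Empirical cell frequencies, pooling every count >= 7 into the last cell."""
    counts = [0] * N_CELLS
    for x in draws:
        counts[min(x, N_CELLS - 1)] += 1
    return [c / len(draws) for c in counts]


if __name__ == "__main__":
    p_hat = cell_frequencies(sample_mixture(50, pi=0.75))
    for h in (1.0, 0.5):
        lam_hat, d_pois = fit_by_grid(p_hat, poisson_cells, 0.5, 10.0, h)
        p_geo, d_geom = fit_by_grid(p_hat, geometric_cells, 0.01, 0.99, h)
        # a negative difference means the Poisson fit is closer to the data
        print(h, round(lam_hat, 3), round(p_geo, 3), round(d_pois - d_geom, 4))
```

With `h = 1.0` the function reduces to the ordinary Hellinger distance, so the two settings differ only through the model mass placed on empty cells, which is why the comparison matters most for small samples.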
Table 4. DGP = 0.75 Pois(4) + 0.25 Geom(0.2)
3 4 5 3 3(3) (3) () 6() 3() 4(43) 49(3) 397(8) 4(6) 49(7) D (Pois) = 779(45) 634(3) 65(8) 57(4) 5() = 443(4) 473() 5() 5(8) 483(4) D (Geom) = 55(35) 87(5) 53(3) 39() 33() = 64(5) 66(5) 7(4) 69(3) 63() = Geom Indecisive Pois 4(7) 44() 49(8) 77() 89(9) 38% 6% 37% 63% 3% 68% 7% 83% % 79% = 8(37) 37(33) 3(6) 66(8) 83(6) Geom Indecisive Pois 48% 5% 45% 55% 46% 54% 3% 7% 4% 76%

Table 5. DGP = 0.535 Pois(4) + 0.465 Geom(0.2)
3 4 5 3 96(6) 4(5) (3) 3(7) 4() 3968(6) 396(46) 398(374) 43(39) 4() D (Pois) = 869(63) 6(46) 58(36) 55(38) 3(5) = 633(3) 49(8) 369(7) 3(6) 4(7) D (Geom) = 867(5) 68(37) 553(3) 495(6) 37() = 57() () 63() 87(9) 37() = 79(4) 38(5) 8(99) 334() 44(67) Geom 3% 4% 5% % 3% Indecisive 9% 9% 93% 88% 88% Pois 5% 4% % % % = 86(4) 48(64) 378(9) 45(86) 67(73) Geom Indecisive Pois 5% 9% 3% 6% 9% 4% 4% 95% % 9% 9% % % 88% %

Figure 1. Histogram of DGP = Pois(4) with n = 50.
Figure 2. Comparative barplot of $HI^h$ depending on n.
Figure 3. Histogram of DGP = Geom(0.2) with n = 50.
Figure 4. Comparative barplot of $HI^h$ depending on n.
Figure 5. Histogram of DGP = 0.75 Geom + 0.25 Pois with n = 50.

The preceding comments on the second halves of Tables 1 and 2 also apply to the second halves of Tables 3 and 4. In all of Tables 1-4, the results confirm that, in small samples, the model selection procedure based on the penalized Hellinger statistic test ($h = 1/2$) dominates, in percentage of correct decisions, the procedure based on the classical Hellinger statistic test ($h = 1$). Table 5 also confirms our asymptotic results: as the sample size increases, the percentage of rejection of both models converges as it should.

In Figures 1, 3, 5, 7 and 9 we plot the histograms of the datasets and overlay the curves of the Geometric and Poisson distributions. When the DGP is correctly specified (Figure 1), the Poisson distribution has a reasonable chance of being distinguished from the Geometric distribution. Similarly, in Figure 3, as can be seen, the Geometric distribution closely approximates the data sets. In Figures 5 and 7 the two distributions are close, but the Geometric (Figure 5) and the Poisson (Figure 7) distributions do appear to be much closer to the data sets. When $\pi = 0.535$ (Figure 9), the Poisson and Geometric distributions are similar, while being slightly symmetrical about the axis that passes through the mode of the data distribution. This follows from the fact that these two distributions are equidistant from the DGP and would be difficult to distinguish from data in practice.

The preceding results in the tables and Theorem 5.5 confirm, in Figures 2, 4, 6 and 8, that the Hellinger indicator for the model selection procedure based on the penalized Hellinger divergence statistic with $h = 0.5$ (light bars) dominates the procedure obtained with $h = 1$ (dark bars), which corresponds to the ordinary Hellinger distance. As expected, our statistic $HI^h$ diverges to $-\infty$ (Figures 2 and 8) and to $+\infty$ (Figures 4 and 6) more rapidly when we use the penalized Hellinger distance test than when we use the classical Hellinger distance test. Figure 10 allows a comparison with the asymptotic $N(0, 1)$ approximation under our null hypothesis of equivalence. The indicator $HI^{1/2}$, based on the penalized Hellinger distance, is closer to the mean of $N(0, 1)$ than is the indicator $HI^{1}$.
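The three outcomes reported in the tables (one model favored, the other favored, or indecisive) follow from comparing the indicator with standard normal critical values. The sketch below shows one way to write that rule. It assumes a two-sided test at level alpha with the sign convention of Theorem 5.5 (a large negative value favors the first model, here taken to be the Poisson family); the function name and the returned labels are hypothetical.

```python
from statistics import NormalDist


def select_model(hi_value, alpha=0.05):
    """Three-way decision from a standardized indicator (assumed two-sided rule)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # roughly 1.96 when alpha = 0.05
    if hi_value < -z:
        return "Poisson"      # first model significantly closer to the data
    if hi_value > z:
        return "Geometric"    # second model significantly closer to the data
    return "indecisive"       # equivalence of the two models is not rejected
```

Under the null hypothesis of equivalence the indicator is approximately standard normal, so this rule should return "indecisive" in roughly a fraction 1 - alpha of large samples.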
Figure 6. Comparative barplot of $HI^h$ depending on n.
Figure 7. Histogram of DGP = 0.25 Geom + 0.75 Pois with n = 50.
Figure 8. Comparative barplot of $HI^h$.
Figure 9. Histogram of DGP = 0.465 Geom + 0.535 Pois with n = 50.
Figure 10. Comparative barplot of $HI^h$ depending on n.

7. Conclusion

In this paper we investigated the problem of model selection using divergence type statistics. Specifically, we proposed some asymptotically standard normal and chi-square tests for model selection based on divergence type statistics that use the corresponding minimum penalized Hellinger estimator. Our tests are based on testing whether the competing models are equally close to the true distribution against the alternative hypotheses that one model is closer than the other, where the closeness of a model is measured according to the discrepancy implicit in the divergence type statistics used. The penalized Hellinger divergence criterion outperforms classical criteria for model selection based on the ordinary Hellinger distance, especially in small samples; the difference is
expected to be minimal for large sample sizes.

Our work can be extended in several directions. One extension is to use random instead of fixed cells. Random cells arise when the boundaries of each cell $c_i$ depend on some unknown parameter vector, which is estimated. For various examples, see e.g. Andrews [37]. For instance, with appropriate random cells, the asymptotic distribution of a Pearson type statistic may become independent of the true parameter $\theta_0$ under correct specification. In view of this latter result, it is expected that our model selection test based on penalized Hellinger divergence measures will remain asymptotically normally or chi-square distributed.

8. Acknowledgements

This research was supported, in part, by grants from AIMS (African Institute for Mathematical Sciences), 6 Melrose Road, Muizenberg-Cape Town 7945, South Africa.

REFERENCES

[1] W. G. Cochran, "The χ² Test of Goodness of Fit," The Annals of Mathematical Statistics, Vol. 3, No. 3, 1952, pp. 35-345. doi:4/aos/777938
[2] G. S. Watson, "On the Construction of Significance Tests on the Circle and the Sphere," Biometrika, Vol. 43, No. 3-4, 1956, pp. 344-35. doi:37/3393
[3] D. S. Moore, "Chi-Square Tests in Studies in Statistics," 1978.
[4] D. S. Moore, "Tests of Chi-Squared Type in Goodness of Fit Techniques," 1986.
[5] D. W. K. Andrews, "Chi-Square Diagnostic Tests for Econometric Models: Theory," Econometrica, Vol. 56, No. 6, 1988, pp. 49-453. doi:37/935
[6] H. Akaike, "Information Theory and Extension of the Likelihood Ratio Principle," Proceedings of the Second International Symposium of Information Theory, 1973, pp. 57-8.
[7] Q. H. Vuong, "Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses," Econometrica, Vol. 57, No. , 1989, pp. 57-36. doi:37/9557
[8] Q. H. Vuong and W. Wang, "Minimum Chi-Square Estimation and Tests for Model Selection," Journal of Econometrics, Vol. 57, No. -, 1993, pp. 4-68. doi:6/34-476(93)94-d
[9] P. Ngom, "Selected Estimated Models with φ-Divergence Statistics," Global Journal of Pure and Applied Mathematics, Vol. 3, No. , 2007, pp. 47-6.
[10] A. Diédhiou and P. Ngom, "Cutoff Time Based on Generalized Divergence Measure," Statistics and Probability Letters, Vol. 79, No. , 2009, pp. 343-35. doi:6/spl96
[11] D. R. Cox, "Tests of Separate Families of Hypotheses," Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Los Angeles, -3 June 96, pp. 5-3.
[12] H. Akaike, "A New Look at
the Statistical Model Identification," IEEE Transaction on Information Theory, Vol. 9, No. 6, 1974, pp. 76-73.
[13] S. Kullback and R. A. Leibler, "On Information and Sufficiency," The Annals of Mathematical Statistics, Vol. , No. , 1951, pp. 79-86. doi:4/aos/7779694
[14] R. J. Beran, "Minimum Hellinger Distance Estimates for Parametric Models," The Annals of Mathematical Statistics, Vol. 5, No. 3, 1977, pp. 445-463.
[15] D. G. Simpson, "Hellinger Deviance Test: Efficiency, Breakdown Points and Examples," Journal of American Statistical Association, Vol. 84, No. 45, 1989, pp. 7-3. doi:8/6459989478744
[16] B. G. Lindsay, "Efficiency versus Robustness: The Case for Minimum Hellinger Distance and Related Methods," Annals of Statistics, Vol. , No. , 1994, pp. 8-4. doi:4/aos/76355
[17] A. Basu and B. G. Lindsay, "Minimum Disparity Estimation for Continuous Models: Efficiency, Distributions and Robustness," The Annals of Mathematical Statistics, Vol. 46, No. 4, 1994, pp. 683-75. doi:7/bf773476
[18] A. Basu, I. R. Harris and S. Basu, "Tests of Hypotheses in Discrete Models Based on the Penalized Hellinger Distance," Statistics and Probability Letters, Vol. 7, No. 4, 1996, pp. 367-373. doi:6/67-75(95)-8
[19] A. Basu and S. Basu, "Penalized Minimum Disparity Methods for Multinomial Models," Statistica Sinica, Vol. 8, 1998, pp. 84-86.
[20] M. W. Birch, "The Detection of Partial Association, II: The General Case," Journal of the Royal Statistical Society, Vol. 7, No. , 1965, pp. -4.
[21] J. P. W. Pluim, J. B. A. Maintz and M. A. Viergever, "f-Information Measures to Medical Image Registration," IEEE Transactions on Medical Imaging, Vol. 3, No. , 2004, pp. 58-56. doi:9/mi483687
[22] I. Vajda, "Theory of Statistical Evidence and Information," Kluwer Academic Publishers, Dordrecht, 1989.
[23] D. Morales, L. Pardo and I. Vajda, "Asymptotic Divergence of Estimates of Discrete Distribution," Journal of Statistical Planning and Inference, Vol. 483, No. 3, 1995, pp. 347-369. doi:6/378-3758(95)3-y
[24] N. Cressie and T. R. C. Read, "Multinomial Goodness of Fit Test," Journal of the Royal Statistical Society, Vol. 463, No. 3, 1984, pp. 44-464.
[25] K. Zografos and K. Ferentinos, "Divergence Statistics Sampling Properties and Multinomial Goodness of Fit and Divergence Tests," Communications in Statistics: Theory and Methods, Vol. 9, No. 5, 1990, pp. 785-8. doi:8/36998839
[26] M. Salicrú, D. Morales, M. L. Menéndez, et al., "On the Applications of Divergence Type Measures in Testing Statistical Hypotheses,"
Journal of Multivariate Analysis, Vol. 5, No. , 1994, pp. 37-39. doi:6/va99468
[27] A. Bar-Hen and J. J. Daudin, "Generalisation of the Mahalanobis Distance in the Mixed Case," Journal of Multivariate Analysis, Vol. 53, No. , 1995, pp. 33-34. doi:6/va9954
[28] L. Pardo, D. Morales, M. Salicrú and M. L. Menéndez, "Generalized Divergences Measures: Amount of Information, Asymptotic-Distribution and Its Applications to Test Statistical Hypotheses," Information Sciences, Vol. 84, No. 3-4, 1995, pp. 8-98.
[29] M. L. Menéndez, L. Pardo, M. Salicrú and D. Morales, "Divergence Measures, Based on Entropy Functions and Statistical Inference," Sankhyā: The Indian Journal of Statistics, Vol. 57, No. 3, 1995, pp. 35-337.
[30] I. Csiszár, "Information-Type Measure of Difference of Probability Distribution and Indirect Observations," Studia Scientiarum Mathematicarum Hungarica, Vol. , 1967, pp. 99-38.
[31] M. Broniatowski and A. Toma, "Dual Divergence Estimators and Tests: Robustness Results," Journal of Multivariate Analysis, Vol. , No. , , pp. -36.
[32] A. Basu, A. Mandal and L. Pardo, "Hypothesis Testing for Two Discrete Populations Based on the Hellinger Distance," Statistics and Probability Letters, Vol. 8, No. 3-4, , pp. 6-4. doi:6/spl98
[33] F. Liese and I. Vajda, "Convex Statistical Distances," Vol. 95 of Teubner-Texte zur Mathematik, 1987.
[34] R. Tamura and D. D. Boos, "Minimum Hellinger Distance Estimation for Multivariate Location and Covariance," Journal of American Statistical Association, Vol. 8, No. 333, 1989, pp. 3-9.
[35] A. Basu, S. Sarkar and A. N. Vidyashankar, "Minimum Negative Exponential Disparity Estimation in Parametric Models," Journal of Statistical Planning and Inference, Vol. 58, No. , 1997, pp. 349-37. doi:6/s378-3758(96)78-x
[36] I. R. Harris and A. Basu, "Hellinger Distance as Penalized Loglikelihood," Communications in Statistics: Theory and Methods, Vol. , No. 3, 1994, pp. 637-646. doi:8/369988384
[37] A. Mandal, R. K. Patra and A. Basu, "Minimum Hellinger Distance Estimation with Inlier Modification," Sankhya, Vol. 7, 2008, pp. 3-3.