*************************************
* Quantitative Methods II
* Lab session, 15-Nov-2005
* USING CLARIFY
*************************************

/* In this session we saw how to use CLARIFY with logit and mlogit */

. use "D:\Mis documentos\Docencia\Clases\AnEmpirico\NES92_clean.dta", clear

. ** This is a post-election survey of the 1992 US presidential election,
. ** contested by Bush senior, Clinton, and Perot

. desc

Contains data from D:\Mis documentos\Docencia\Clases\AnEmpirico\NES92_clean.dta
  obs:           750
 vars:            22                          24 Nov 2004 00:59
 size:        69,000 (93.4% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
vote            float  %9.0g                  Vote 92: Bush, Clinton, Perot
bushapp         float  %9.0g                  Bush Approval, 1992
bplace          float  %9.0g                  Bush Lib/Con
cplace          float  %9.0g                  Clinton Lib/Con
pplace          float  %9.0g                  Perot Lib/Con
distbush        float  %9.0g                  R-Bush Lib/Con Dist
distperot       float  %9.0g                  R-Perot Lib/Con Dist
oppmilitary     float  %9.0g                  Opposition to Use of Military F
warok           float  %9.0g                  Gulf War Worth Cost
education       float  %9.0g                  Years of School
govemployee     float  %9.0g                  Government Employee
union           float  %9.0g                  Union Household
nonwhite        float  %9.0g                  Nonwhite
place           float  %9.0g                  R-Lib/Con
distclinton     float  %9.0g                  R-Clinton Lib/Con Dist
badecon         float  %9.0g                  Economy WORSE?
partyID         float  %9.0g                  PartyID
income          float  %9.0g                  FamilyIncome, $1000
r1              float  %9.0g                  Pr(v2==0)
r2              float  %9.0g                  Pr(v2==1)
r3              float  %9.0g                  Pr(v2==2)
r4              float  %9.0g                  Pr(v2==3)
-------------------------------------------------------------------------------
Sorted by:

. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        vote |       601    .8169717    .7186409          0          2
     bushapp |       727    1.250344    1.078488          0          3
      bplace |       678    5.150442    1.404075          1          7
      cplace |       666    3.096096    1.340996          1          7
      pplace |       587    4.381601    1.753846          1          7
-------------+--------------------------------------------------------
    distbush |       575    2.106087    1.560917          0          6
   distperot |       508    2.062992    1.414203          0          5
 oppmilitary |       742    2.963612    .8279998          1          5
       warok |       713    .5834502     .493333          0          1
   education |       745     13.5651    2.574976          2         17
-------------+--------------------------------------------------------
 govemployee |       750        .136    .3430173          0          1
       union |       750    .1653333     .371729          0          1
    nonwhite |       750    .1346667     .341595          0          1
       place |       599    4.245409    1.730566          1          7
 distclinton |       560    2.067857    1.528162          0          6
-------------+--------------------------------------------------------
     badecon |       740    4.014865    .9061782          1          5
     partyID |       742   -.1091644    1.999378         -3          3
      income |       695    41.92734    30.33958        1.5        140
          r1 |       745    .2722288    .0458215   .1040918   .3370141
          r2 |       745    .3117232    .0111072    .228696   .3205123
-------------+--------------------------------------------------------
          r3 |       745    .3454362    .0352465   .2958629   .4669528
          r4 |       745    .0706118    .0211353   .0466106   .2002594

** What is CLARIFY?

. ado desc clarify

------------------------------------------------------------------------------------------
[8] package clarify from http://gking.harvard.edu/clarify
------------------------------------------------------------------------------------------

TITLE
      Clarify: Software for Interpreting and Presenting Statistical Results

DESCRIPTION/AUTHOR(S)
      Version 2.1 (January 5, 2003)

      Michael Tomz, Stanford University
      Jason Wittenberg, University of Wisconsin, Madison
      Gary King, Harvard University

      Clarify is a program that uses Monte Carlo simulation to convert the raw
      output of statistical procedures into results that are of direct interest
      to researchers. The program, designed for use with the Stata statistics
      package, can help researchers in three ways.
      (1) It can extract new quantities of interest from standard statistical
      models, thereby enriching the substance of social science research.
      (2) It can assess the uncertainty surrounding any quantity of interest, so
      it should improve the candor and realism of statistical discourse.
      (3) It can convert raw parameter estimates into results that anyone,
      regardless of statistical training, can understand. Thus, it should be
      useful to those who want to convey their results to a broader audience.

      Note: Site administrators installing Clarify for general use should type
            net set ado SITE
      before the net install command.

      Support: email clarify@latte.harvard.edu
------------------------------------------------------------------------------------------

** 1. ESTIMATING AN MLOGIT MODEL with CLARIFY:

. estsimp mlogit vote bplace distclinton badecon partyID educ income, basecategory(0) nolog

Multinomial logistic regression                   Number of obs   =        449
                                                  LR chi2(12)     =     351.61
                                                  Prob > chi2     =     0.0000
Log likelihood = -295.18913                       Pseudo R2       =     0.3733

------------------------------------------------------------------------------
        vote |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
      bplace |   .1444014   .1383184     1.04   0.296    -.1266976    .4155005
 distclinton |  -.9192259   .1459223    -6.30   0.000    -1.205228   -.6332235
     badecon |   .4975386   .1956128     2.54   0.011     .1141445    .8809328
     partyID |     -1.016   .1111729    -9.14   0.000    -1.233894   -.7981047
   education |  -.0359952   .0832538    -0.43   0.665    -.1991696    .1271792
      income |   .0085228   .0055849     1.53   0.127    -.0024234    .0194691
       _cons |  -.6330693   1.552966    -0.41   0.684    -3.676826    2.410687
-------------+----------------------------------------------------------------
2            |
      bplace |  -.0349927   .1217286    -0.29   0.774    -.2735763    .2035909
 distclinton |  -.3648462   .1129703    -3.23   0.001    -.5862639   -.1434285
     badecon |   .2198126   .1642349     1.34   0.181    -.1020818     .541707
     partyID |  -.3510387   .0952625    -3.68   0.000    -.5377497   -.1643276
   education |  -.1207075   .0719538    -1.68   0.093    -.2617344    .0203193
      income |   .0041594   .0048097     0.86   0.387    -.0052673    .0135862
       _cons |   1.603889   1.289002     1.24   0.213     -.922508    4.130285
------------------------------------------------------------------------------
(Outcome vote==0 is the comparison group)

Simulating main parameters.  Please wait....
Note: Clarify is expanding your dataset from 750 observations to 1000
observations in order to accommodate the simulations.  This will append
missing values to the bottom of your original dataset.
% of simulations completed: 7% 14% 21% 28% 35% 42% 50% 57% 64% 71% 78% 85% 92% 100%

Number of simulations  : 1000
Names of new variables : b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14

** The new variables b1...b14 hold the 14 simulated parameters (betas):
** 7 coefficients in each of the 2 equations.

** 2. SETTING THE SCENARIO OF INTEREST:

. setx mean    // Sets every independent variable at its mean

** 3. SIMULATING THE QUANTITIES OF INTEREST, in this case the probabilities of voting
** for each candidate, with everything held at its mean:

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .3462096     .0355934     .2822846    .4208964
                Pr(vote=1) |   .3260898     .0368523     .2535851    .4038918
                Pr(vote=2) |   .3277006     .0332986     .2632911    .3933733

** Note that for an average (typical) respondent the vote probabilities are
** very similar across the three candidates: their confidence intervals overlap.

** What would the vote probabilities be for a typical respondent with the maximum
** education?

. setx educ max    // Sets educ at its maximum (17) and leaves everything else at its mean

** Let's look at the scenario to be simulated:

. setx

You have set the following values for the explanatory variables:
------------------------------------
 Variable |     Value    Description
----------+-------------------------
  badecon |  3.991091    mean
   bplace |  5.336303    mean
 distcl~n |  2.153675    mean
 educat~n |        17    max
   income |  46.67372    mean
  partyID |  .0846325    mean
------------------------------------

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .3921437     .0556051     .2888138    .5073365
                Pr(vote=1) |   .3380416     .0557387     .2355007    .4518999
                Pr(vote=2) |   .2698147     .0459481     .1854675    .3637835

** The probability of a Bush vote is slightly higher, but the confidence
** interval is still wide in every category.

** Now, someone who thinks the economy is doing very badly (badecon = 5), who places
** Bush at 1 on the Lib/Con scale (bplace = 1), who is ideologically far from
** Clinton (distclinton = 4), and who has the minimum education:

. setx badecon 5 bplace 1 distcl 4 educ min

. setx

You have set the following values for the explanatory variables:
------------------------------------
 Variable |     Value    Description
----------+-------------------------
  badecon |         5    5
   bplace |         1    1
 distcl~n |         4    4
 educat~n |         6    min
   income |  46.67372    mean
  partyID |  .0846325    mean
------------------------------------

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .3328591     .1469903     .1021935    .6560302
                Pr(vote=1) |   .0764298     .0556076     .0141419    .2105235
                Pr(vote=2) |   .5907111     .1495008     .2801442    .8459563

** This individual would vote for Perot rather than Bush... but
** note that the Bush and Perot CIs still overlap.

** And how do the vote probabilities change when every variable goes from its
** minimum to its maximum?

. setx min

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .0512365     .0540368     .0056483    .1913643
                Pr(vote=1) |   .6131154     .1973702     .1992366    .9311015
                Pr(vote=2) |   .3356481      .182923     .0567608    .7286881

** You vote for Clinton...

. setx max

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .8826424     .0649765     .7220204    .9651069
                Pr(vote=1) |   .0073966     .0068616     .0009729    .0266758
                Pr(vote=2) |    .109961     .0630299     .0309213    .2680381

** You vote for Bush...

. setx mean

. setx

You have set the following values for the explanatory variables:
------------------------------------
 Variable |     Value    Description
----------+-------------------------
  badecon |  3.991091    mean
   bplace |  5.336303    mean
 distcl~n |  2.153675    mean
 educat~n |  14.30735    mean
   income |  46.67372    mean
  partyID |  .0846325    mean
------------------------------------

** Computing the FIRST DIFFERENCES in the probability of voting for CLINTON
** when education goes from MIN to MAX and everything else is at its mean:

. simqi, fd(prval(1)) changex(educ min max)    // Remember that category 1 is a Clinton vote

First Difference: educ min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 1) |   .0699361     .1534429    -.2677334    .3396356

** The probability rises by about 7 percentage points... but the standard error
** is large and the 95% interval includes zero.

** And for Bush:

. simqi, fd(prval(0)) changex(educ min max)

First Difference: educ min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 0) |   .1769127     .1344398    -.1416432    .4053011

** ...the probability rises by about 17.7 percentage points... but with a wide margin of error.
** Education seems to raise the Bush vote more than the Clinton vote,
** but with standard errors this large (both intervals include zero) the prediction is not precise.

** And what happens if both education and income go from min to max?

. simqi, fd(prval(0)) changex(educ min max income min max)

First Difference: educ min max income min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 0) |     .00086     .1530943    -.3207762    .2894045

** The probability of voting for Bush barely changes...

** And if only income changes?

. simqi, fd(prval(0)) changex(income min max)

First Difference: income min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 0) |  -.1846292     .1246933     -.403498    .0838569

. simqi, fd(prval(1)) changex(income min max)

First Difference: income min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 1) |   .1863197      .149482    -.1019115    .4813845

** Bush loses votes and Clinton gains them, but with large standard errors.

** As you can see, the results do not change much when we play with EDUC and INCOME,
** because neither variable was significant in the original estimation.
** If we played with BADECON or PARTYID, the results would change significantly.
** The moral, then: first estimate a model with significant results,
** and then simulate interesting scenarios that significantly change the probability
** of observing a given event...
------------------------------------------------------------------------------------------
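
** SKETCH (not part of the original session; no output is shown because these
** commands were not run here). Following the closing remark, one could repeat the
** first differences with BADECON and PARTYID, both significant in equation 1 of
** the mlogit. This assumes the estsimp simulations (b1...b14) are still in memory.
** It uses only the commands and options already seen in this log (setx, simqi,
** fd, prval, changex), written with the comma that Stata expects before options.

setx mean                                       // everything else held at its mean
simqi, fd(prval(0)) changex(badecon min max)    // change in Pr(Bush), badecon min -> max
simqi, fd(prval(1)) changex(badecon min max)    // change in Pr(Clinton), badecon min -> max
simqi, fd(prval(0)) changex(partyID min max)    // change in Pr(Bush), partyID min -> max
simqi, fd(prval(1)) changex(partyID min max)    // change in Pr(Clinton), partyID min -> max

** If the resulting 95% intervals exclude zero, the scenario shifts the vote
** probabilities by an amount the data can distinguish from no change at all,
** unlike the EDUC and INCOME first differences above.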