*************************************
* Quantitative Methods II
* Lab session, 15 Nov 2005
* USING CLARIFY
*************************************

/*
In this session we saw how to use CLARIFY with logit and mlogit
*/


. use "D:\Mis documentos\Docencia\Clases\AnEmpirico\NES92_clean.dta", clear

. ** This is a post-election survey from the 1992 US presidential election,
. ** contested by George H. W. Bush, Clinton, and Perot

. desc

Contains data from D:\Mis documentos\Docencia\Clases\AnEmpirico\NES92_clean.dta
  obs:           750                          
 vars:            22                          24 Nov 2004 00:59
 size:        69,000 (93.4% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
vote            float  %9.0g                  Vote 92: Bush, Clinton, Perot
bushapp         float  %9.0g                  Bush Approval, 1992
bplace          float  %9.0g                  Bush Lib/Con
cplace          float  %9.0g                  Clinton Lib/Con
pplace          float  %9.0g                  Perot Lib/Con
distbush        float  %9.0g                  R-Bush Lib/Con Dist
distperot       float  %9.0g                  R-Perot Lib/Con Dist
oppmilitary     float  %9.0g                  Opposition to Use of Military F
warok           float  %9.0g                  Gulf War Worth Cost
education       float  %9.0g                  Years of School
govemployee     float  %9.0g                  Government Employee
union           float  %9.0g                  Union Household
nonwhite        float  %9.0g                  Nonwhite
place           float  %9.0g                  R-Lib/Con
distclinton     float  %9.0g                  R-Clinton Lib/Con Dist
badecon         float  %9.0g                  Economy WORSE?
partyID         float  %9.0g                  PartyID
income          float  %9.0g                  FamilyIncome, $1000
r1              float  %9.0g                  Pr(v2==0)
r2              float  %9.0g                  Pr(v2==1)
r3              float  %9.0g                  Pr(v2==2)
r4              float  %9.0g                  Pr(v2==3)
-------------------------------------------------------------------------------
Sorted by:  

. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        vote |       601    .8169717    .7186409          0          2
     bushapp |       727    1.250344    1.078488          0          3
      bplace |       678    5.150442    1.404075          1          7
      cplace |       666    3.096096    1.340996          1          7
      pplace |       587    4.381601    1.753846          1          7
-------------+--------------------------------------------------------
    distbush |       575    2.106087    1.560917          0          6
   distperot |       508    2.062992    1.414203          0          5
 oppmilitary |       742    2.963612    .8279998          1          5
       warok |       713    .5834502     .493333          0          1
   education |       745     13.5651    2.574976          2         17
-------------+--------------------------------------------------------
 govemployee |       750        .136    .3430173          0          1
       union |       750    .1653333     .371729          0          1
    nonwhite |       750    .1346667     .341595          0          1
       place |       599    4.245409    1.730566          1          7
 distclinton |       560    2.067857    1.528162          0          6
-------------+--------------------------------------------------------
     badecon |       740    4.014865    .9061782          1          5
     partyID |       742   -.1091644    1.999378         -3          3
      income |       695    41.92734    30.33958        1.5        140
          r1 |       745    .2722288    .0458215   .1040918   .3370141
          r2 |       745    .3117232    .0111072    .228696   .3205123
-------------+--------------------------------------------------------
          r3 |       745    .3454362    .0352465   .2958629   .4669528
          r4 |       745    .0706118    .0211353   .0466106   .2002594


** What is CLARIFY?

. ado desc clarify

------------------------------------------------------------------------------------------
[8] package clarify from http://gking.harvard.edu/clarify
------------------------------------------------------------------------------------------

TITLE
      Clarify: Software for Interpreting and Presenting Statistical Results

DESCRIPTION/AUTHOR(S)
      
      Version 2.1 (January 5, 2003)
      
      Michael Tomz, Stanford University
      Jason Wittenberg, University of Wisconsin, Madison
      Gary King, Harvard University
      
      Clarify is a program that uses Monte Carlo simulation to convert the
      raw output of statistical procedures into results that are of direct
      interest to researchers.  The program, designed for use with the Stata
      statistics package, can help researchers in three ways.
      
      (1) It can extract new quantities of interest from standard statistical
      models, thereby enriching the substance of social science research.
      (2) It can assess the uncertainty surrounding any quantity of interest,
      so it should improve the candor and realism of statistical discourse.
      (3) It can convert raw parameter estimates into results that anyone,
      regardless of statistical training, can understand. Thus, it should be
      useful to those who want to convey their results to a broader audience.
      
      Note: Site administrators installing Clarify for general use should
            type net set ado SITE before the net install command.
      
      Support: email clarify@latte.harvard.edu
      
------------------------------------------------------------------------------------------


** 1.  ESTIMATING AN MLOGIT MODEL with CLARIFY:
 
. estsimp mlogit vote bplace distclinton badecon partyID educ income, basecategory(0) nolog

Multinomial logistic regression                   Number of obs   =        449
                                                  LR chi2(12)     =     351.61
                                                  Prob > chi2     =     0.0000
Log likelihood = -295.18913                       Pseudo R2       =     0.3733

------------------------------------------------------------------------------
        vote |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
      bplace |   .1444014   .1383184     1.04   0.296    -.1266976    .4155005
 distclinton |  -.9192259   .1459223    -6.30   0.000    -1.205228   -.6332235
     badecon |   .4975386   .1956128     2.54   0.011     .1141445    .8809328
     partyID |     -1.016   .1111729    -9.14   0.000    -1.233894   -.7981047
   education |  -.0359952   .0832538    -0.43   0.665    -.1991696    .1271792
      income |   .0085228   .0055849     1.53   0.127    -.0024234    .0194691
       _cons |  -.6330693   1.552966    -0.41   0.684    -3.676826    2.410687
-------------+----------------------------------------------------------------
2            |
      bplace |  -.0349927   .1217286    -0.29   0.774    -.2735763    .2035909
 distclinton |  -.3648462   .1129703    -3.23   0.001    -.5862639   -.1434285
     badecon |   .2198126   .1642349     1.34   0.181    -.1020818     .541707
     partyID |  -.3510387   .0952625    -3.68   0.000    -.5377497   -.1643276
   education |  -.1207075   .0719538    -1.68   0.093    -.2617344    .0203193
      income |   .0041594   .0048097     0.86   0.387    -.0052673    .0135862
       _cons |   1.603889   1.289002     1.24   0.213     -.922508    4.130285
------------------------------------------------------------------------------
(Outcome vote==0 is the comparison group)

Simulating main parameters.  Please wait....

Note: Clarify is expanding your dataset from 750 observations to 1000
observations in order to accommodate the simulations.  This will append
missing values to the bottom of your original dataset.

% of simulations completed: 7% 14% 21% 28% 35% 42% 50% 57% 64% 71% 78% 85% 92% 100% 

Number of simulations  : 1000
Names of new variables : b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14

** The new variables b1...b14 hold the simulated draws of the 14 parameters (betas):
** 7 per equation, 1000 draws each
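
** A minimal sketch, not part of the original session: estsimp draws random numbers,
** so the simulated parameters change slightly from run to run. Setting a seed first
** should make them reproducible. The seed value is arbitrary (an assumption), and the
** options sims(), genname() and dropsims are taken from the Clarify help file; check
** -help estsimp- on your installation before relying on them.

```stata
set seed 20051115      // arbitrary seed (assumption); any fixed number works
** dropsims removes the b1...b14 created by the earlier estsimp call
estsimp mlogit vote bplace distclinton badecon partyID educ income, ///
    basecategory(0) nolog sims(1000) genname(b) dropsims
** The means of the draws should be close to the coefficients in the table above
summarize b1-b14
```

** If the draws follow the order of the coefficient table, b1-b7 would belong to the
** Clinton equation and b8-b14 to the Perot equation; treat that ordering as an
** assumption and verify it against the summarize output.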

** 2. SETTING THE SCENARIO OF INTEREST:

. setx mean     // Sets every independent variable to its mean

** 3. SIMULATING THE QUANTITIES OF INTEREST: here, the probability of voting for
** each candidate, with every covariate held at its mean:

. simqi          

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .3462096     .0355934     .2822846    .4208964
                Pr(vote=1) |   .3260898     .0368523     .2535851    .4038918
                Pr(vote=2) |   .3277006     .0332986     .2632911    .3933733

** Note that for an average (typical) respondent the vote probabilities are very
** similar across the three candidates: all are close to one third and their
** confidence intervals overlap.
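
** As a rough check on what simqi reports, the same probabilities can be approximated
** by hand. This sketch plugs the point estimates from the mlogit table and the
** estimation-sample means (the values listed by setx further below) into the
** multinomial logit formula with Bush (0) as the base category:
**     Pr(vote=j) = exp(xb_j) / (1 + exp(xb_1) + exp(xb_2)),   Pr(vote=0) = 1 / (same denominator)
** It ignores estimation uncertainty, so it should land close to the simulated means
** without matching them exactly. The scalar names are hypothetical.

```stata
** Linear predictor for Clinton (outcome 1) at the means
scalar xb1 = .1444014*5.336303 - .9192259*2.153675 + .4975386*3.991091 ///
           - 1.016*.0846325 - .0359952*14.30735 + .0085228*46.67372 - .6330693
** Linear predictor for Perot (outcome 2) at the means
scalar xb2 = -.0349927*5.336303 - .3648462*2.153675 + .2198126*3.991091 ///
           - .3510387*.0846325 - .1207075*14.30735 + .0041594*46.67372 + 1.603889
scalar den = 1 + exp(xb1) + exp(xb2)
display "Pr(vote=0), Bush    = " 1/den
display "Pr(vote=1), Clinton = " exp(xb1)/den
display "Pr(vote=2), Perot   = " exp(xb2)/den
```

** Worked by hand, the three values come out at roughly 0.35, 0.33 and 0.33, in line
** with the simqi means above.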

** What would the vote probabilities be for a typical respondent with the maximum
** level of education?

. setx educ max    // Sets educ to its maximum (17) and leaves everything else at its mean

** Let's check the scenario to be simulated:

. setx

You have set the following values for the explanatory variables:

------------------------------------
 Variable |    Value     Description
----------+-------------------------
  badecon |   3.991091       mean   
   bplace |   5.336303       mean   
 distcl~n |   2.153675       mean   
 educat~n |         17       max    
   income |   46.67372       mean   
  partyID |   .0846325       mean   
------------------------------------

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .3921437     .0556051     .2888138    .5073365
                Pr(vote=1) |   .3380416     .0557387     .2355007    .4518999
                Pr(vote=2) |   .2698147     .0459481     .1854675    .3637835

** The probability of voting for Bush is slightly higher, but the confidence
** intervals are still wide for every category
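
** A sketch, not run in the original session: instead of comparing only the mean and
** the maximum, education can be stepped through its whole estimation-sample range
** (6 to 17, per the setx listings in this log) to see how Pr(vote=0) moves. The loop
** variable name is arbitrary.

```stata
setx mean
forvalues e = 6/17 {
    display as text "education = `e'"
    setx educ `e'          // everything else stays at its mean
    simqi, prval(0)        // report only the probability of a Bush vote
}
```

** The loop ends with educ at 17 and the other variables at their means, which is the
** same scenario that was in place before it.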


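** Now consider someone who thinks the economy is doing very badly (badecon = 5), places
** Bush at the lowest value of the lib/con scale (bplace = 1; note that bplace is Bush's
** perceived placement, not the respondent's distance from him), is ideologically far
** from Clinton (distclinton = 4), and has the minimum level of education: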

. setx badecon 5 bplace 1 distcl 4 educ min 

. setx

You have set the following values for the explanatory variables:

------------------------------------
 Variable |    Value     Description
----------+-------------------------
  badecon |          5        5     
   bplace |          1        1     
 distcl~n |          4        4     
 educat~n |          6       min    
   income |   46.67372       mean   
  partyID |   .0846325       mean   
------------------------------------


. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .3328591     .1469903     .1021935    .6560302
                Pr(vote=1) |   .0764298     .0556076     .0141419    .2105235
                Pr(vote=2) |   .5907111     .1495008     .2801442    .8459563

** This individual would be more likely to vote for Perot than for Bush... but
** note that their confidence intervals still overlap.
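
** Overlapping intervals are only a rough guide here, because the probabilities for
** Bush and Perot are computed from the same simulated draws. A more direct comparison,
** sketched below and not run in the original session, saves the simulated
** probabilities and counts the draws in which Perot comes out ahead. The new variable
** names are hypothetical, and passing two names to genpr() alongside prval(0 2) follows
** the Clarify help file; verify with -help simqi-.

```stata
setx mean
setx badecon 5 bplace 1 distclinton 4 educ min
** Save the simulated probabilities for Bush (0) and Perot (2)
simqi, prval(0 2) genpr(pr_bush pr_perot)
** Indicator: in this draw, is Perot more likely than Bush?
generate byte perot_ahead = pr_perot > pr_bush if pr_perot < . & pr_bush < .
summarize perot_ahead     // the mean is the share of draws with Pr(Perot) > Pr(Bush)
```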

** And how do the vote probabilities change when every variable moves from its minimum to its maximum?

. setx min

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .0512365     .0540368     .0056483    .1913643
                Pr(vote=1) |   .6131154     .1973702     .1992366    .9311015
                Pr(vote=2) |   .3356481      .182923     .0567608    .7286881

** The most likely vote is for Clinton...

. setx max

. simqi

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
                Pr(vote=0) |   .8826424     .0649765     .7220204    .9651069
                Pr(vote=1) |   .0073966     .0068616     .0009729    .0266758
                Pr(vote=2) |    .109961     .0630299     .0309213    .2680381

** The most likely vote is for Bush...

. setx mean
. setx

You have set the following values for the explanatory variables:

------------------------------------
 Variable |    Value     Description
----------+-------------------------
  badecon |   3.991091       mean   
   bplace |   5.336303       mean   
 distcl~n |   2.153675       mean   
 educat~n |   14.30735       mean   
   income |   46.67372       mean   
  partyID |   .0846325       mean   
------------------------------------

** Computing the FIRST DIFFERENCES in the probability of voting for CLINTON
** when education moves from MIN to MAX and everything else is at its mean:

. simqi, fd(prval(1)) changex(educ min max)   // Recall that category 1 is a vote for Clinton

First Difference: educ min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 1) |   .0699361     .1534429    -.2677334    .3396356

** The probability rises by about 7 percentage points... but the standard error is
** large and the confidence interval includes zero.

** And for Bush:

. simqi, fd(prval(0)) changex(educ min max)

First Difference: educ min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 0) |   .1769127     .1344398    -.1416432    .4053011

** ...the probability rises by 17.7 percentage points... but with a wide margin of error.
** Education appears to raise the Bush vote more than the Clinton vote,
** but with standard errors this large (both intervals include zero) the prediction is not precise.
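
** A sketch, not run in the original session: a move from the minimum to the maximum
** is an extreme counterfactual. changex() also accepts percentiles, so the same first
** differences can be computed for a more typical shift, from the 25th to the 75th
** percentile of education. The level(90) line is an optional variation that asks for
** 90% intervals instead of the default 95%.

```stata
setx mean
simqi, fd(prval(1)) changex(educ p25 p75)             // Clinton
simqi, fd(prval(0)) changex(educ p25 p75)             // Bush
simqi, fd(prval(0)) changex(educ p25 p75) level(90)   // Bush, 90% interval
```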


** And what if both education and income move from min to max?

. simqi, fd(prval(0)) changex(educ min max income min max)

First Difference: educ min max income min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 0) |     .00086     .1530943    -.3207762    .2894045

** the probability of voting for Bush is practically unchanged...


** And if only income changes?

. simqi, fd(prval(0)) changex(income min max)

First Difference: income min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 0) |  -.1846292     .1246933     -.403498    .0838569

. simqi, fd(prval(1)) changex(income min max)

First Difference: income min max

      Quantity of Interest |     Mean       Std. Err.    [95% Conf. Interval]
---------------------------+--------------------------------------------------
             dPr(vote = 1) |   .1863197      .149482    -.1019115    .4813845

** Bush loses votes and Clinton gains them, but with large standard errors

** As you can see, the changes are not statistically distinguishable from zero when we
** vary EDUC and INCOME, since neither variable was significant in the original estimation.

** If we varied BADECON or PARTYID instead, the results would change significantly.
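
** A sketch of that exercise, not run in the original session, so no output is shown.
** In the mlogit table, partyID has a coefficient of -1.016 (z = -9.14) in the Clinton
** equation, and badecon has positive coefficients in both non-Bush equations, so one
** would expect a large drop in Pr(Clinton) as partyID rises and a drop in Pr(Bush) as
** badecon rises. That expectation is an inference from the coefficients, not a
** simulated result.

```stata
setx mean
simqi, fd(prval(1)) changex(partyID min max)   // Clinton, partyID from lowest to highest
simqi, fd(prval(0)) changex(badecon min max)   // Bush, badecon from lowest to highest
```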

** The moral, then: first estimate a model with significant results,
** and then simulate interesting scenarios that significantly change the probability
** of observing a given event...

------------------------------------------------------------------------------------------