-------------------------------------------------------------------------------
       log:  C:\Stata8\lab10nov.smcl
  log type:  smcl
 opened on:  10 Nov 2004, 10:12:22

. use "C:\Documents and Settings\computob1\Escritorio\Wage2.dta", clear

. describe

Contains data from C:\Documents and Settings\computob1\Escritorio\Wage2.dta
  obs:           935
 vars:            17                          14 Apr 1999 13:41
 size:        24,310 (97.7% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
wage            int    %9.0g                  monthly earnings
hours           byte   %9.0g                  average weekly hours
IQ              int    %9.0g                  IQ score
KWW             byte   %9.0g                  knowledge of world work score
educ            byte   %9.0g                  years of education
exper           byte   %9.0g                  years of work experience
tenure          byte   %9.0g                  years with current employer
age             byte   %9.0g                  age in years
married         byte   %9.0g                  =1 if married
black           byte   %9.0g                  =1 if black
south           byte   %9.0g                  =1 if live in south
urban           byte   %9.0g                  =1 if live in SMSA
sibs            byte   %9.0g                  number of siblings
brthord         byte   %9.0g                  birth order
meduc           byte   %9.0g                  mother's education
feduc           byte   %9.0g                  father's education
lwage           float  %9.0g                  natural log of wage
-------------------------------------------------------------------------------
Sorted by:

** This is a dataset for explaining monthly wages (wage)

. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |       935    957.9455    404.3608        115       3078
       hours |       935    43.92941    7.224256         20         80
          IQ |       935    101.2824    15.05264         50        145
         KWW |       935    35.74439    7.638788         12         56
        educ |       935    13.46845    2.196654          9         18
-------------+--------------------------------------------------------
       exper |       935    11.56364    4.374586          1         23
      tenure |       935    7.234225    5.075206          0         22
         age |       935    33.08021    3.107803         28         38
     married |       935    .8930481    .3092174          0          1
       black |       935    .1283422    .3346495          0          1
-------------+--------------------------------------------------------
       south |       935    .3411765    .4743582          0          1
       urban |       935    .7176471    .4503851          0          1
        sibs |       935    2.941176    2.306254          0         14
     brthord |       852    2.276995    1.595613          1         10
       meduc |       857    10.68261    2.849756          0         18
-------------+--------------------------------------------------------
       feduc |       741    10.21727      3.3007          0         18
       lwage |       935    6.779004    .4211439   4.744932   8.032035

* Looking at the IQ variable in detail:

. summ IQ, detail

                          IQ score
-------------------------------------------------------------
      Percentiles      Smallest
 1%           64             50
 5%           74             54
10%           82             55       Obs                 935
25%           92             59       Sum of Wgt.         935

50%          102                      Mean           101.2824
                        Largest       Std. Dev.      15.05264
75%          112            134
90%          120            134       Variance       226.5819
95%          125            137       Skewness      -.3404246
99%          132            145       Kurtosis       2.977035

** Kurtosis is very close to 3 and the skewness is small: IQ is an
** "almost normal" variable.
** A variable whose kurtosis is far from 3 is considered NOT NORMAL.
** Let's look at it graphically:

. hist IQ
(bin=29, start=50, width=3.2758621)

** Now let's look at the distribution of wages:

. hist wage
(bin=29, start=115, width=102.17241)

. summ wage, detail

                      monthly earnings
-------------------------------------------------------------
      Percentiles      Smallest
 1%          325            115
 5%          433            200
10%          500            233       Obs                 935
25%          668            260       Sum of Wgt.         935

50%          905                      Mean           957.9455
                        Largest       Std. Dev.      404.3608
75%         1160           2668
90%         1444           2771       Variance       163507.7
95%         1699           3078       Skewness       1.199259
99%         2308           3078       Kurtosis       5.696661

* Wages are neither symmetrically nor normally distributed:
* they are skewed to the right, with kurtosis well above 3.

** The SKTEST command tests the null hypothesis of normality.

. help sktest

. sktest IQ wage

                   Skewness/Kurtosis tests for Normality
                                                 ------- joint ------
    Variable |  Pr(Skewness)   Pr(Kurtosis)  adj chi2(2)    Prob>chi2
-------------+-------------------------------------------------------
          IQ |      0.000         0.977        15.60         0.0004
        wage |      0.000         0.000            .         0.0000

** The null hypothesis of SKTEST is that the variable is NORMAL...
** We reject Ho of normality for both IQ and WAGE
** (for IQ the rejection comes from the skewness, not the kurtosis).
** Note: other normality tests exist.
** The vast majority of variables observed in small samples are not normal.
** But many regression models are justified either by asymptotic properties
** or by appeal to the Central Limit Theorem.

** Wage regressions with robust standard errors

. reg wage educ exper IQ, robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  3,   931) =   56.74
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1620
                                                       Root MSE      =  370.76

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   58.10383   7.427005     7.82   0.000     43.52822    72.67944
       exper |   17.41712   3.103811     5.61   0.000     11.32584     23.5084
          IQ |   5.068828   .8971595     5.65   0.000     3.308139    6.829517
       _cons |  -539.4111   114.9665    -4.69   0.000    -765.0346   -313.7876
------------------------------------------------------------------------------

** Education, experience and IQ are all associated with higher wages...
** Problem: people with higher IQ probably study more and, in turn,
** accumulate more work experience: a possible multicollinearity problem.

. summ educ exper

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        educ |       935    13.46845    2.196654          9         18
       exper |       935    11.56364    4.374586          1         23

. corr IQ educ exper
(obs=935)

             |       IQ     educ    exper
-------------+---------------------------
          IQ |   1.0000
        educ |   0.5157   1.0000
       exper |  -0.2249  -0.4556   1.0000

* Higher IQ goes with more education, but not with more experience
* (that correlation is negative)...
* Nor does more education go with more experience...

** Stata can create variables holding the PREDICTION from each estimated model,
** for OLS as well as for almost any other method.
* This command generates the "predicted values" (fitted values) of the last
* model in memory:

. predict yhat, xb

** Let's see how close the predictions are to the dependent variable:

. summ yhat wage

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        yhat |       935    957.9455    162.7443   487.1726   1421.321
        wage |       935    957.9455    404.3608        115       3078

* Note that the prediction has exactly the same mean as the dependent variable.
* This is no accident: OLS with a constant forces the residuals to average zero...
* However, YHAT is less dispersed because the model could not explain ALL
* of the variation in wages.
* The difference between WAGE and YHAT is the (predicted) RESIDUAL.

** Adding one more variable: MARRIED

. reg wage educ exper IQ married, robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  4,   930) =   49.88
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1812
                                                       Root MSE      =  366.68

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |    58.6964   7.266891     8.08   0.000       44.435    72.95781
       exper |   16.12464   3.091988     5.21   0.000     10.05656    22.19273
          IQ |    4.99469   .8816971     5.66   0.000     3.264344    6.725037
     married |   182.3343   35.64261     5.12   0.000     112.3851    252.2836
       _cons |  -687.7709   116.9661    -5.88   0.000    -917.3191   -458.2228
------------------------------------------------------------------------------

** Married people also earn more!

** INTERACTION EFFECTS involving CONTINUOUS variables
** Testing the hypothesis that the more education, the larger the premium to
** experience (or, conversely, the more experience, the larger the premium to
** education), and also whether the more education, the larger the marriage premium.
* We generate two interaction variables (note: this time they are not dummy
* variables):

. generate edex=educ*exper

. generate edumarr= educ*married

. summ educ exper marr edex edumarr

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        educ |       935    13.46845    2.196654          9         18
       exper |       935    11.56364    4.374586          1         23
     married |       935    .8930481    .3092174          0          1
        edex |       935    151.3711    50.94142         11        280
     edumarr |       935    11.98824    4.637949          0         18

* Base model:

. reg wage educ exper IQ married , robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  4,   930) =   49.88
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1812
                                                       Root MSE      =  366.68

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |    58.6964   7.266891     8.08   0.000       44.435    72.95781
       exper |   16.12464   3.091988     5.21   0.000     10.05656    22.19273
          IQ |    4.99469   .8816971     5.66   0.000     3.264344    6.725037
     married |   182.3343   35.64261     5.12   0.000     112.3851    252.2836
       _cons |  -687.7709   116.9661    -5.88   0.000    -917.3191   -458.2228
------------------------------------------------------------------------------

* Model with EDEX:

. reg wage educ exper IQ married edex, robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  5,   929) =   41.57
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1865
                                                       Root MSE      =   365.7

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   22.37458   17.36982     1.29   0.198    -11.71405     56.4632
       exper |  -28.95696   18.81561    -1.54   0.124    -65.88299    7.969069
          IQ |   4.849108   .8780889     5.52   0.000     3.125841    6.572376
     married |   183.4278   35.55097     5.16   0.000     113.6583    253.1973
        edex |   3.498451   1.486958     2.35   0.019     .5802644    6.416638
       _cons |   -193.061   235.2749    -0.82   0.412     -654.793    268.6709
------------------------------------------------------------------------------

** Education interacted with experience does carry a premium,
* but now EDUC and EXPER are no longer significant on their own, possibly
* because of the collinearity introduced by the interaction.

* Model with EDUC*MARRIED and EDUC*EXPER:

. reg wage educ exper IQ married edex edumarr, robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  6,   928) =   34.84
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1883
                                                       Root MSE      =  365.48

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   1.878909   21.29179     0.09   0.930    -39.90673    43.66454
       exper |  -25.71994   19.08832    -1.35   0.178    -63.18121    11.74134
          IQ |   4.909364   .8818003     5.57   0.000      3.17881    6.639918
     married |  -172.5088    233.314    -0.74   0.460     -630.393    285.3755
        edex |   3.264269   1.507271     2.17   0.031     .3062146    6.222323
     edumarr |   25.77711   17.21611     1.50   0.135    -8.009917    59.56414
       _cons |   83.74403   282.8882     0.30   0.767    -471.4308    638.9189
------------------------------------------------------------------------------

* The EDUC*MARRIED interaction is not significant, and in fact it wipes out the
* significance of MARRIED on its own and weakens EDUC even further.

. corr educ exper edex
(obs=935)

             |     educ    exper     edex
-------------+---------------------------
        educ |   1.0000
       exper |  -0.4556   1.0000
        edex |  -0.0500   0.8997   1.0000

* The correlation between EDEX and EXPER is very high (0.90)... better to drop
* EDEX from the model.
* Model without the EDEX interaction, keeping EDUMARR:

. reg wage educ exper IQ married edumarr, robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  5,   929) =   40.27
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1838
                                                       Root MSE      =  366.29

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   31.81828   16.50628     1.93   0.054    -.5756377    64.21219
       exper |   16.38175   3.090189     5.30   0.000     10.31719    22.44631
          IQ |   5.053906   .8841648     5.72   0.000     3.318714    6.789098
     married |  -234.8565   229.1449    -1.02   0.306    -684.5581    214.8452
     edumarr |   30.21939   16.89404     1.79   0.074    -2.935511    63.37429
       _cons |  -324.4406   223.6765    -1.45   0.147    -763.4103    114.5291
------------------------------------------------------------------------------

** Now EXPER is significant again and EDUC is marginally so (p = 0.054), and the
** marriage premium appears to rise with education (EDUMARR, p = 0.074).
** Being married by itself does not yield higher wages.
** However, this model is not much better than the base model...
** (compare their R2: 0.1838 vs. 0.1812)

** Using XI to create an interaction between BLACK and MARRIED:

. xi: reg wage educ exper i.black*i.married IQ, robust
i.black           _Iblack_0-1         (naturally coded; _Iblack_0 omitted)
i.married         _Imarried_0-1       (naturally coded; _Imarried_0 omitted)
i.black*i.mar~d   _IblaXmar_#_#       (coded as above)

Regression with robust standard errors                 Number of obs =     935
                                                       F(  6,   928) =   38.06
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1895
                                                       Root MSE      =  365.21

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   58.95939   7.255151     8.13   0.000     44.72098    73.19779
       exper |   15.93347   3.088756     5.16   0.000     9.871713    21.99522
   _Iblack_1 |   -126.759   68.90127    -1.84   0.066    -261.9793    8.461399
 _Imarried_1 |    173.652   40.64807     4.27   0.000      93.8792    253.4248
_IblaXmar_~1 |   8.057439   74.96207     0.11   0.914    -139.0574    155.1723
          IQ |   3.926311   .9307907     4.22   0.000     2.099612     5.75301
       _cons |   -557.751    125.614    -4.44   0.000    -804.2714   -311.2306
------------------------------------------------------------------------------

** Being married carries a premium; being Black carries a penalty (p = 0.066).
** But the interaction is not significant: married Black workers get no extra
** premium that would offset that penalty.

** MAKING PREDICTIONS
** Let's go back to a basic model:

. reg wage educ exper IQ, robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  3,   931) =   56.74
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1620
                                                       Root MSE      =  370.76

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   58.10383   7.427005     7.82   0.000     43.52822    72.67944
       exper |   17.41712   3.103811     5.61   0.000     11.32584     23.5084
          IQ |   5.068828   .8971595     5.65   0.000     3.308139    6.829517
       _cons |  -539.4111   114.9665    -4.69   0.000    -765.0346   -313.7876
------------------------------------------------------------------------------

* What happens if you have 10 years of education and 5 of experience?
* The answer comes from a linear combination of the regression coefficients.

. help lincom

. lincom 10*educ + 5*exper

 ( 1)  10 educ + 5 exper = 0

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   668.1239   82.35958     8.11   0.000      506.492    829.7558
------------------------------------------------------------------------------

** The combined contribution of 10 years of education and 5 of experience
** is 668 dollars.
** Note: this combination leaves out _cons and IQ, so it is not yet a full
** predicted wage.

** Answering the same question for a married person
** First we add MARRIED to the regression:

. reg wage educ exper IQ married , robust

Regression with robust standard errors                 Number of obs =     935
                                                       F(  4,   930) =   49.88
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1812
                                                       Root MSE      =  366.68

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |    58.6964   7.266891     8.08   0.000       44.435    72.95781
       exper |   16.12464   3.091988     5.21   0.000     10.05656    22.19273
          IQ |    4.99469   .8816971     5.66   0.000     3.264344    6.725037
     married |   182.3343   35.64261     5.12   0.000     112.3851    252.2836
       _cons |  -687.7709   116.9661    -5.88   0.000    -917.3191   -458.2228
------------------------------------------------------------------------------

. lincom 10*educ + 5*exper + married

 ( 1)  10 educ + 5 exper + married = 0

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   849.9216   88.99743     9.55   0.000     675.2625    1024.581
------------------------------------------------------------------------------

** The combination is now 849.92 dollars, higher than before
** (again excluding _cons and IQ).

** HYPOTHESIS TESTS FOR ONE OR MORE VARIABLES:

. ** F tests

. test educ               //Ho: EDUC is insignificant

 ( 1)  educ = 0

       F(  1,   930) =   65.24    // An F test with one restriction is equivalent to a t test (F = t^2)
            Prob > F =    0.0000  // Ho is rejected

. test educ = 70          //Ho: the return to one year of EDUC is 70 dollars

 ( 1)  educ = 70

       F(  1,   930) =    2.42
            Prob > F =    0.1202  // Ho cannot be rejected

. test educ = 80          //Ho: the return to one year of EDUC is 80 dollars

 ( 1)  educ = 80

       F(  1,   930) =    8.59
            Prob > F =    0.0035  // Ho is rejected at the 1% level

** An F test can also test the JOINT significance of a GROUP of variables...
** (joint significance test)

. reg wage educ exper IQ married tenure urban black meduc feduc, robust

Regression with robust standard errors                 Number of obs =     722
                                                       F(  9,   712) =   26.48
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2331
                                                       Root MSE      =  359.63

------------------------------------------------------------------------------
             |               Robust
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   46.63542   8.036525     5.80   0.000      30.8573    62.41354
       exper |   15.64324   3.883169     4.03   0.000     8.019413    23.26707
          IQ |   3.744727   1.036072     3.61   0.000     1.710605    5.778849
     married |   183.5051   38.20636     4.80   0.000     108.4945    258.5157
      tenure |   4.699898     2.8558     1.65   0.100    -.9068986    10.30669
       urban |   175.0415    28.3991     6.16   0.000     119.2856    230.7975
       black |  -90.86056   41.42059    -2.19   0.029    -172.1817   -9.539458
       meduc |   7.249112   5.040555     1.44   0.151    -2.647017    17.14524
       feduc |   8.545835   4.884047     1.75   0.081    -1.043021    18.13469
       _cons |   -700.613   138.5112    -5.06   0.000    -972.5522   -428.6739
------------------------------------------------------------------------------

* Note: the sample falls to 722 observations because MEDUC and FEDUC have
* missing values.
* In this model FEDUC and MEDUC (parents' education) are at best weakly
* significant individually.
* Can we reject that the two variables are JOINTLY INSIGNIFICANT
* (not jointly significant)?

. test feduc meduc        // Ho: both variables are insignificant

 ( 1)  feduc = 0
 ( 2)  meduc = 0

       F(  2,   712) =    4.62    // Now the F test has TWO restrictions
            Prob > F =    0.0101  // Ho is rejected

. test meduc

 ( 1)  meduc = 0

       F(  1,   712) =    2.07
            Prob > F =    0.1508

** ...However, we cannot reject that MEDUC is insignificant on its own.

. test tenure meduc       // Ho: both variables are insignificant

 ( 1)  tenure = 0
 ( 2)  meduc = 0

       F(  2,   712) =    2.23
            Prob > F =    0.1078  // Ho is not rejected

** We CANNOT reject the null hypothesis that TENURE and MEDUC are JOINTLY
** insignificant...

. test tenure meduc black // Ho: all three variables are insignificant

 ( 1)  tenure = 0
 ( 2)  meduc = 0
 ( 3)  black = 0

       F(  3,   712) =    3.78
            Prob > F =    0.0104  // Ho is rejected; note that BLACK is significant
                                  // on its own in this model (p = 0.029)

. log close
       log:  C:\Stata8\lab10nov.smcl
  log type:  smcl
 closed on:  10 Nov 2004, 10:57:28
-------------------------------------------------------------------------------
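
The arithmetic behind the lincom and test results in this log can be re-checked outside Stata. What follows is a minimal Python sketch and not part of the original session: it copies the coefficients and the robust standard error of educ from the regression tables above. The helper names (linear_combination, wald_f) and the IQ value of 100 are assumptions introduced only for illustration.

```python
# Coefficients copied from the regression tables in the log.
BASE = {"educ": 58.10383, "exper": 17.41712, "IQ": 5.068828, "_cons": -539.4111}
WITH_MARRIED = {
    "educ": 58.6964,
    "exper": 16.12464,
    "IQ": 4.99469,
    "married": 182.3343,
    "_cons": -687.7709,
}
# Robust standard error of educ in the model that includes married.
SE_EDUC_WITH_MARRIED = 7.266891


def linear_combination(coefs, weights):
    """Return sum(weight * coefficient): the point estimate that lincom reports."""
    return sum(coefs[name] * weight for name, weight in weights.items())


def wald_f(coef, se, hypothesized=0.0):
    """F statistic for one restriction coef = hypothesized (the squared t ratio)."""
    return ((coef - hypothesized) / se) ** 2


# The two lincom point estimates from the log (no constant, no IQ term).
combo_base = linear_combination(BASE, {"educ": 10, "exper": 5})
combo_married = linear_combination(
    WITH_MARRIED, {"educ": 10, "exper": 5, "married": 1}
)

# A full fitted wage also needs _cons and a value for IQ.
# ASSUMPTION: IQ = 100 is an illustrative value, not taken from the log.
fitted_base = linear_combination(
    BASE, {"_cons": 1, "educ": 10, "exper": 5, "IQ": 100}
)

# Single-restriction F statistics for Ho: educ = 0, 70 and 80.
f_stats = {
    h: wald_f(WITH_MARRIED["educ"], SE_EDUC_WITH_MARRIED, h) for h in (0, 70, 80)
}

if __name__ == "__main__":
    print(f"10*educ + 5*exper             : {combo_base:.4f}")
    print(f"10*educ + 5*exper + married   : {combo_married:.4f}")
    print(f"fitted wage, assumed IQ = 100 : {fitted_base:.4f}")
    for h, f in f_stats.items():
        print(f"F for Ho: educ = {h:<3}: {f:.2f}")
```

Run as a script, this prints the two lincom point estimates (which should agree with the log up to rounding of the displayed coefficients), a fitted wage that also includes _cons and the assumed IQ, and the three single-restriction F statistics. The standard errors and confidence intervals reported by lincom cannot be rebuilt this way, because they depend on the full coefficient covariance matrix, which the log does not show.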