-------------------------------------------------------------------------------
       log:  C:\DATA\lab30ago.smcl
  log type:  smcl
 opened on:  30 Aug 2005, 15:33:17

********************
In this session we covered:
- Descriptive statistics, overall and by group: summarize, inspect, hist, box, bysort
- Comparison of means: ttest
- Exploring the structure of the data: correlate, tabulate
- Simple regressions
********************

. use "C:\Documents and Settings\salon.CUOTAS\Escritorio\Wage1.dta", clear

. desc

Contains data from C:\Documents and Settings\salon.CUOTAS\Escritorio\Wage1.dta
  obs:           526
 vars:            24                          16 Sep 1996 15:52
 size:        18,936 (98.2% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
wage            float  %8.2g                  average hourly earnings
educ            byte   %8.0g                  years of education
exper           byte   %8.0g                  years potential experience
tenure          byte   %8.0g                  years with current employer
nonwhite        byte   %8.0g                  =1 if nonwhite
female          byte   %8.0g                  =1 if female
married         byte   %8.0g                  =1 if married
numdep          byte   %8.0g                  number of dependents
smsa            byte   %8.0g                  =1 if live in SMSA
northcen        byte   %8.0g                  =1 if live in north central U.S
south           byte   %8.0g                  =1 if live in southern region
west            byte   %8.0g                  =1 if live in western region
construc        byte   %8.0g                  =1 if work in construc. indus.
ndurman         byte   %8.0g                  =1 if in nondur. manuf. indus.
trcommpu        byte   %8.0g                  =1 if in trans, commun, pub ut
trade           byte   %8.0g                  =1 if in wholesale or retail
services        byte   %8.0g                  =1 if in services indus.
profserv        byte   %8.0g                  =1 if in prof. serv. indus.
profocc         byte   %8.0g                  =1 if in profess. occupation
clerocc         byte   %8.0g                  =1 if in clerical occupation
servocc         byte   %8.0g                  =1 if in service occupation
lwage           float  %9.0g                  log(wage)
expersq         int    %9.0g                  exper^2
tenursq         int    %9.0g                  tenure^2
-------------------------------------------------------------------------------
Sorted by:

*** DESCRIPTIVE STATISTICS

. summ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |       526    5.896103    3.693086        .53      24.98
        educ |       526    12.56274    2.769022          0         18
       exper |       526    17.01711    13.57216          1         51
      tenure |       526    5.104563    7.224462          0         44
    nonwhite |       526    .1026616    .3038053          0          1
-------------+--------------------------------------------------------
      female |       526    .4790875     .500038          0          1
     married |       526     .608365    .4885804          0          1
      numdep |       526    1.043726    1.261891          0          6
        smsa |       526    .7224335    .4482246          0          1
    northcen |       526    .2509506    .4339728          0          1
-------------+--------------------------------------------------------
       south |       526    .3555133    .4791242          0          1
        west |       526    .1692015    .3752867          0          1
    construc |       526    .0456274    .2088743          0          1
     ndurman |       526    .1140684     .318197          0          1
    trcommpu |       526    .0437262      .20468          0          1
-------------+--------------------------------------------------------
       trade |       526    .2870722    .4528262          0          1
    services |       526    .1007605    .3012978          0          1
    profserv |       526    .2585551    .4382574          0          1
     profocc |       526    .3669202    .4824233          0          1
     clerocc |       526    .1673004    .3735991          0          1
-------------+--------------------------------------------------------
     servocc |       526    .1406844    .3480267          0          1
       lwage |       526    1.623268    .5315382  -.6348783   3.218076
     expersq |       526    473.4354    616.0448          1       2601
     tenursq |       526    78.15019    199.4347          0       1936

** Exploring histograms of the dependent variable WAGE:

. hist wage
(bin=22, start=.52999997, width=1.1113636)

. hist lwage
(bin=22, start=-.63487834, width=.17513427)

** Descriptive statistics by group using the IF condition:

. summ wage if nonwhite==1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |        54    5.475926    3.155425       1.96         15

** Descriptive statistics by group using BYSORT:

. bysort female: summ wage educ exper tenure nonwhite

_______________________________________________________________________________
-> female = 0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |       274    7.099489    4.160858        1.5      24.98
        educ |       274    12.78832    3.002882          2         18
       exper |       274    17.55839    13.49991          1         51
      tenure |       274    6.474453    8.369297          0         44
    nonwhite |       274    .1058394    .3081949          0          1

_______________________________________________________________________________
-> female = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |       252    4.587659    2.529363        .53      21.63
        educ |       252    12.31746    2.472642          0         18
       exper |       252    16.42857    13.65274          1         50
      tenure |       252    3.615079    5.357968          0         34
    nonwhite |       252    .0992063    .2995338          0          1

** Box plots for groups of observations

. graph box wage, over(female)

. graph box lwage, over(female)

** More detailed statistics

. by female: summ lwage wage, detail

_______________________________________________________________________________
-> female = 0

                          log(wage)
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .5596158       .4054651
 5%     1.071584       .5128236
10%     1.098612       .5596158       Obs                 274
25%     1.413423       .6931472       Sum of Wgt.         274

50%     1.791759                      Mean            1.81357
                        Largest       Std. Dev.      .5348107
75%     2.171337       3.084659
90%     2.525729       3.100092       Variance       .2860225
95%     2.733068       3.129389       Skewness       .2109881
99%     3.100092       3.218076       Kurtosis       2.677425

                   average hourly earnings
-------------------------------------------------------------
      Percentiles      Smallest
 1%         1.75            1.5
 5%         2.92           1.67
10%            3           1.75       Obs                 274
25%         4.11              2       Sum of Wgt.         274

50%            6                      Mean           7.099489
                        Largest       Std. Dev.      4.160858
75%         8.77          21.86
90%         12.5           22.2       Variance       17.31274
95%        15.38          22.86       Skewness       1.575794
99%         22.2          24.98       Kurtosis       5.828708

_______________________________________________________________________________
-> female = 1

                          log(wage)
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .4054651      -.6348783
 5%     .8329091       .3576744
10%     1.064711       .4054651       Obs                 252
25%     1.098612         .48858       Sum of Wgt.         252

50%     1.321756                      Mean           1.416353
                        Largest       Std. Dev.      .4442354
75%     1.711936       2.679651
90%     2.014903        2.70805       Variance        .197345
95%     2.197225       2.890372       Skewness       .3874667
99%      2.70805       3.074081       Kurtosis       5.331479

                   average hourly earnings
-------------------------------------------------------------
      Percentiles      Smallest
 1%          1.5            .53
 5%          2.3           1.43
10%          2.9            1.5       Obs                 252
25%            3           1.63       Sum of Wgt.         252

50%         3.75                      Mean           4.587659
                        Largest       Std. Dev.      2.529363
75%         5.54          14.58
90%          7.5             15       Variance       6.397678
95%            9             18       Skewness       2.818724
99%           15          21.63       Kurtosis       15.02116

** Using prefix and suffix wildcards (*)

. summ e*

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        educ |       526    12.56274    2.769022          0         18
       exper |       526    17.01711    13.57216          1         51
     expersq |       526    473.4354    616.0448          1       2601

. summ *sq

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     expersq |       526    473.4354    616.0448          1       2601
     tenursq |       526    78.15019    199.4347          0       1936

** Inspecting any variable; useful for detecting typos

. inspect wage

wage:  average hourly earnings           Number of Observations
------------------------------                              Non-
                                          Total   Integers   Integers
|  #                    Negative              -          -          -
|  #                    Zero                  -          -          -
|  #                    Positive            526        109        417
|  #   #                                  -----      -----      -----
|  #   #                Total               526        109        417
|  #   #   .   .   .    Missing               -
+----------------------                   -----
.53               24.98                     526
  (More than 99 unique values)

** COMPARISON OF MEANS BETWEEN GROUPS:

. by female: summ lwage wage

_______________________________________________________________________________
-> female = 0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       lwage |       274     1.81357    .5348107   .4054651   3.218076
        wage |       274    7.099489    4.160858        1.5      24.98

_______________________________________________________________________________
-> female = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       lwage |       252    1.416353    .4442354  -.6348783   3.074081
        wage |       252    4.587659    2.529363        .53      21.63

** Is the difference in average wages between men and women
** statistically significant?

. ttest wage, by(female)

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     274    7.099489    .2513666    4.160858    6.604626    7.594352
       1 |     252    4.587659    .1593349    2.529363    4.273855    4.901462
---------+--------------------------------------------------------------------
combined |     526    5.896103    .1610262    3.693086    5.579768    6.212437
---------+--------------------------------------------------------------------
    diff |             2.51183    .3034092                1.915782    3.107878
------------------------------------------------------------------------------
Degrees of freedom: 524

                  Ho: mean(0) - mean(1) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       t =   8.2787                t =   8.2787              t =   8.2787
   P < t =   1.0000          P > |t| =   0.0000          P > t =   0.0000

** Ho is rejected against the two-sided alternative (Ha: diff != 0) and against
** Ha: diff > 0, so the difference is statistically significant...

** Whites vs. nonwhites...

. ttest wage, by(nonwhite)

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     472    5.944174    .1725901    3.749617    5.605032    6.283316
       1 |      54    5.475926    .4293989    3.155425    4.614661    6.337191
---------+--------------------------------------------------------------------
combined |     526    5.896103    .1610262    3.693086    5.579768    6.212437
---------+--------------------------------------------------------------------
    diff |            .4682478    .5306473               -.5742097    1.510705
------------------------------------------------------------------------------
Degrees of freedom: 524

                  Ho: mean(0) - mean(1) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       t =   0.8824                t =   0.8824              t =   0.8824
   P < t =   0.8110          P > |t| =   0.3780          P > t =   0.1890

** Married vs. unmarried...

. ttest wage, by(married)

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     206    4.843884    .2014351    2.891138    4.446733    5.241034
       1 |     320    6.573469    .2229044    3.987435     6.13492    7.012017
---------+--------------------------------------------------------------------
combined |     526    5.896103    .1610262    3.693086    5.579768    6.212437
---------+--------------------------------------------------------------------
    diff |           -1.729585    .3214475               -2.361069   -1.098101
------------------------------------------------------------------------------
Degrees of freedom: 524

                  Ho: mean(0) - mean(1) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       t =  -5.3806                t =  -5.3806              t =  -5.3806
   P < t =   0.0000          P > |t| =   0.0000          P > t =   1.0000

** Construction vs. non-construction...

. ttest wage, by(construc)

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       0 |     502    5.892849    .1656738     3.71198    5.567348     6.21835
       1 |      24    5.964167    .6824956    3.343532    4.552317    7.376016
---------+--------------------------------------------------------------------
combined |     526    5.896103    .1610262    3.693086    5.579768    6.212437
---------+--------------------------------------------------------------------
    diff |            -.071318    .7723876               -1.588675    1.446039
------------------------------------------------------------------------------
Degrees of freedom: 524

                  Ho: mean(0) - mean(1) = diff = 0

     Ha: diff < 0               Ha: diff != 0              Ha: diff > 0
       t =  -0.0923                t =  -0.0923              t =  -0.0923
   P < t =   0.4632          P > |t| =   0.9265          P > t =   0.5368

*** REGRESSIONS TO EXPLAIN HOURLY WAGES (WAGE)

. correlate educ exper tenure
(obs=526)

             |     educ    exper   tenure
-------------+---------------------------
        educ |   1.0000
       exper |  -0.2995   1.0000
      tenure |  -0.0562   0.4993   1.0000

. reg wage educ exper tenure

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  3,   522) =   76.87
       Model |   2194.1116     3  731.370532           Prob > F      =  0.0000
    Residual |  4966.30269   522  9.51398984           R-squared     =  0.3064
-------------+------------------------------           Adj R-squared =  0.3024
       Total |  7160.41429   525  13.6388844           Root MSE      =  3.0845

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5989651   .0512835    11.68   0.000     .4982176    .6997126
       exper |   .0223395   .0120568     1.85   0.064    -.0013464    .0460254
      tenure |   .1692687   .0216446     7.82   0.000     .1267474    .2117899
       _cons |  -2.872735   .7289643    -3.94   0.000    -4.304799   -1.440671
------------------------------------------------------------------------------

** Storing the predicted (fitted) values and the estimated residuals:

. predict yhat, xb

. predict error, resid

** The mean of the fitted values equals the sample mean of wage // The residuals have mean zero:

. summ wage yhat error

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |       526    5.896103    3.693086        .53      24.98
        yhat |       526    5.896103    2.044324  -1.934475   13.00783
       error |       526    1.90e-09     3.07565  -7.606771    14.6536

** Adding further variables
** Note the impact on adjusted R-squared, coefficients, and significance

. reg wage educ exper tenure female

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  4,   521) =   74.40
       Model |  2603.10658     4  650.776644           Prob > F      =  0.0000
    Residual |  4557.30771   521   8.7472317           R-squared     =  0.3635
-------------+------------------------------           Adj R-squared =  0.3587
       Total |  7160.41429   525  13.6388844           Root MSE      =  2.9576

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5715048   .0493373    11.58   0.000     .4745803    .6684293
       exper |   .0253959   .0115694     2.20   0.029     .0026674    .0481243
      tenure |   .1410051   .0211617     6.66   0.000     .0994323    .1825778
      female |  -1.810852   .2648252    -6.84   0.000    -2.331109   -1.290596
       _cons |  -1.567939   .7245511    -2.16   0.031    -2.991339    -.144538
------------------------------------------------------------------------------

. reg wage educ exper tenure female nonwhite

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  5,   520) =   59.43
       Model |  2603.75212     5  520.750424           Prob > F      =  0.0000
    Residual |  4556.66217   520  8.76281186           R-squared     =  0.3636
-------------+------------------------------           Adj R-squared =  0.3575
       Total |  7160.41429   525  13.6388844           Root MSE      =  2.9602

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5703422   .0495667    11.51   0.000     .4729667    .6677177
       exper |    .025343   .0115814     2.19   0.029      .002591    .0480951
      tenure |   .1410697   .0211819     6.66   0.000     .0994572    .1826823
      female |  -1.812043   .2650973    -6.84   0.000    -2.332836    -1.29125
    nonwhite |   -.115874   .4269179    -0.27   0.786    -.9545699    .7228218
       _cons |  -1.540298   .7323115    -2.10   0.036    -2.978951   -.1016455
------------------------------------------------------------------------------

** Which matters more: being married or the number of dependents?

. corr numdep married
(obs=526)

             |   numdep  married
-------------+------------------
      numdep |   1.0000
     married |   0.1545   1.0000

. tab numdep married

 number of |     =1 if married
dependents |         0          1 |     Total
-----------+----------------------+----------
         0 |       123        129 |       252
         1 |        36         69 |       105
         2 |        27         72 |        99
         3 |         9         36 |        45
         4 |         7          9 |        16
         5 |         4          3 |         7
         6 |         0          2 |         2
-----------+----------------------+----------
     Total |       206        320 |       526

** Including both at the same time:

. reg wage educ exper tenure female married numdep

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  6,   519) =   50.87
       Model |  2651.76049     6  441.960082           Prob > F      =  0.0000
    Residual |   4508.6538   519  8.68719421           R-squared     =  0.3703
-------------+------------------------------           Adj R-squared =  0.3631
       Total |  7160.41429   525  13.6388844           Root MSE      =  2.9474

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5752695   .0519569    11.07   0.000     .4731978    .6773413
       exper |   .0218868   .0122498     1.79   0.075    -.0021785     .045952
      tenure |    .138238   .0211235     6.54   0.000       .09674    .1797361
      female |  -1.757463   .2665637    -6.59   0.000     -2.28114   -1.233787
     married |   .4655339   .2942895     1.58   0.114    -.1126111    1.043679
      numdep |   .1443296   .1084827     1.33   0.184    -.0687895    .3574488
       _cons |  -2.000828    .777664    -2.57   0.010    -3.528584   -.4730719
------------------------------------------------------------------------------

** Including one at a time (the first regression below also leaves out exper):

. reg wage educ tenure female married

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  4,   521) =   74.95
       Model |  2615.25659     4  653.814148           Prob > F      =  0.0000
    Residual |   4545.1577   521  8.72391113           R-squared     =  0.3652
-------------+------------------------------           Adj R-squared =  0.3604
       Total |  7160.41429   525  13.6388844           Root MSE      =  2.9536

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5293624    .046981    11.27   0.000     .4370668     .621658
      tenure |   .1541769   .0187096     8.24   0.000     .1174214    .1909324
      female |  -1.710499   .2661125    -6.43   0.000    -2.233285   -1.187714
     married |   .6852168   .2746587     2.49   0.013     .1456422    1.224791
       _cons |  -1.138527   .6551687    -1.74   0.083    -2.425624    .1485698
------------------------------------------------------------------------------

. reg wage educ exper tenure female numdep

      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  5,   520) =   60.37
       Model |  2630.02183     5  526.004367           Prob > F      =  0.0000
    Residual |  4530.39246   520  8.71229319           R-squared     =  0.3673
-------------+------------------------------           Adj R-squared =  0.3612
       Total |  7160.41429   525  13.6388844           Root MSE      =  2.9517

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5932586   .0507704    11.69   0.000     .4935184    .6929988
       exper |   .0280002    .011641     2.41   0.017      .005131    .0508693
      tenure |   .1398283     .02113     6.62   0.000     .0983176    .1813389
      female |  -1.816541   .2643156    -6.87   0.000    -2.335798   -1.297283
      numdep |   .1854007    .105482     1.76   0.079    -.0218225     .392624
       _cons |  -2.070319   .7775431    -2.66   0.008    -3.597831   -.5428076
------------------------------------------------------------------------------

. log close
       log:  C:\DATA\lab30ago.smcl
  log type:  smcl
 closed on:  30 Aug 2005, 17:00:19
-------------------------------------------------------------------------------
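
A minimal sketch, not part of the logged session, of how the first two-sample t test (wage by female) can be cross-checked from the group summary statistics alone. It assumes Stata's immediate command ttesti and the scalar/display calculator; the scalar names (s2p, se, tstat) are hypothetical, and the six inputs are copied from the "bysort female: summ" output, so the results should agree with the logged values only up to rounding.

```stata
* Two-sample t test with equal variances, computed from summary statistics.
* Arguments: n1 mean1 sd1 n2 mean2 sd2 (female = 0 first, then female = 1).
ttesti 274 7.099489 4.160858 252 4.587659 2.529363

* The same statistic by hand (scalar names are illustrative assumptions).
* Pooled variance: weighted average of the two group variances.
scalar s2p = (273*4.160858^2 + 251*2.529363^2) / 524

* Standard error of the difference in means under equal variances.
scalar se = sqrt(s2p * (1/274 + 1/252))

* t statistic for Ho: mean(0) - mean(1) = 0, with 524 degrees of freedom.
scalar tstat = (7.099489 - 4.587659) / se
display "SE = " se "   t = " tstat

* Two-sided p-value: twice the upper-tail probability of |t|.
display "P > |t| = " 2*ttail(524, abs(tstat))
```

Under the equal-variance assumption the standard error is the pooled standard deviation times sqrt(1/n1 + 1/n2), which is why the hand calculation and ttesti should both land close to the Std. Err. and t reported on the diff row of the logged table.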