﻿* Encoding: UTF-8.
***Syntax for data preparation of the ESS data of 2008 to 2012.
***Date of syntax: 20.09.2016, by Verena Ortmanns.
***Refers to the article "Can we assess representativeness of cross-national surveys using the education variable?"
***by Verena Ortmanns and Silke L. Schneider, published in Survey Research Methods.


*For the ESS 2008.
*Restriction of the age group to those aged 25 to 64.
select if (agea ge 25 and agea le 64).

*Show distribution of the education variable.
freq edulvla.

*Recode the education variable.
miss val edulvla ().
*Copy edulvla, then collapse the non-substantive codes (55, 77, 88, 99) into 9.
compute ISCED97_5 = edulvla.
recode ISCED97_5 (1=1) (2=2) (3=3) (4=4) (5=5) (55, 77, 88, 99 = 9).

variable labels ISCED97_5 "5-level version of ISCED 1997".
value labels ISCED97_5
1 "ISCED 0-1"
2 "ISCED 2"
3 "ISCED 3"
4 "ISCED 4"
5 "ISCED 5-6"
9 "Missing".

Format ISCED97_5 (F1.0).
Missing values ISCED97_5 (9).

freq ISCED97_5.
crosstab edulvla by ISCED97_5.

*Use the design weight.
weight by dweight.


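*Split file by country variable and calculate the distribution of the ISCED97_5 variable.
*SPLIT FILE expects the cases to be sorted by the split variable, so sort first.
sort cases by cntry.
split file separate by cntry.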
freq ISCED97_5.
*These distributions can be found in the provided Excel table.



*For the ESS 2010 and 2012.
*Restriction of the age group to those aged 25 to 64.
select if (agea ge 25 and agea le 64).

*Show distribution of the education variable.
freq edulvlb.

*Recode the education variable.
miss val edulvlb ().
*Collapse the detailed edulvlb codes into five levels; values not listed are copied unchanged.
recode edulvlb
  (0, 113, 129 = 1)
  (212, 213, 221, 222, 223, 229 = 2)
  (311, 312, 313, 321, 322, 323 = 3)
  (412, 413, 421, 422, 423 = 4)
  (510, 520, 610, 620, 710, 720, 800 = 5)
  (5555, 7777, 8888, 9999 = 9)
  (else = copy)
  into ISCED97_5.

variable labels ISCED97_5 "5-level version of ISCED 1997".
value labels ISCED97_5
1 "ISCED 0-1"
2 "ISCED 2"
3 "ISCED 3"
4 "ISCED 4"
5 "ISCED 5-6"
9 "Missing".

Format ISCED97_5 (F1.0).
Missing values ISCED97_5 (9).

freq ISCED97_5.
crosstab edulvlb by ISCED97_5.

*Use the design weight.
weight by dweight.

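*Split file by country variable and calculate the distribution of the ISCED97_5 variable.
*SPLIT FILE expects the cases to be sorted by the split variable, so sort first.
sort cases by cntry.
split file separate by cntry.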
freq ISCED97_5.
*These distributions can be found in the provided Excel table.

