TEDS Data Dictionary

Derived Variables in the 14 Year Dataset

This page gives a listing of derived variables in the 14 Year dataset, in alphabetical order of variable name. For each variable, a short written description is followed by the SPSS syntax (in a box) that was used to derive the variable.

The variables from the twin web tests, twin booklet and teacher questionnaire were derived prior to double entering the dataset. Hence the variables used in the syntax on this page (having names ending in "1") were later used to make the corresponding co-twin variables (with names ending in "2").

This page does not include descriptions of background variables that are derived from other sources and that are included in the 14 Year dataset. For information about such variables, see pages describing background variables, exclusions and scrambled IDs.

List of variables described on this page

Click on a variable name in the table below to go to the description on this page. Alternatively, scroll down and find variables in alphabetical order.

Twin web tests
Adjusted web test scores:
nrvtota1/2, nsctota1/2, nvctota1/2
Mean test item answer time:
nrvatm1/2, nscatm1/2, nvcatm1/2
Total time to complete the entire battery:
ntottm1/2
Cognitive composite
ncg1/2
Achievement composites:
From teacher ratings (cohort 1 only):
nt2a1/2, nt3a1/2
From end-of-KS3 ratings (all cohorts):
npks3t2a1/2, npks3t3a1/2, npks3tall1/2
SLQ language codes
Inference of language for cohort 1 teacher assessment:
nslla1r1/2

Behaviour scales
AQ total:
npaqt1/2
Conners total:
ncconnt1/2, npconnt1/2, ntconnt1/2
Conners hyperactivity-impulsivity:
ncconhit1/2, npconhit1/2, ntconhit1/2
Conners inattention:
ncconint1/2, npconint1/2, ntconint1/2
APSD total:
npapsdt1/2, ntapsdt1/2
APSD callous-unemotional:
npcalt1/2, ntcalt1/2
APSD narcissism:
npnart1/2, ntnart1/2
APSD impulsivity:
npimpt1/2, ntimpt1/2

Background variables
Data flags:
n14year, ncwdata1/2
Twin ages:
ncwage1/2, ntqage1/2, ncqage1/2, npqage, npslage, npslks3age
LLC dates and ages:
ncqLLCage1/2, ncqLLCdate1/2, ncwLLCage1/2, ncwLLCdate1/2, npqLLCage, npqLLCdate, npslks3LLCage, npslks3LLCdate, npslLLCage, npslLLCdate, ntqLLCage1/2, ntqLLCdate1/2

Environment and other scales
BMI:
ncbmi1/2
Puberty scale:
npubtot1/2
Menstruation onset age:
ncpub5age1/2
Chaos at home:
ncchato1/2, npchatot
Parental discipline negative:
ncdisnegt1/2, npdisnegt1/2
Parental discipline positive:
ncdispost1/2, npdispost1/2
Parental discipline avoidance:
ncdisavot1/2, npdisavot1/2
Parental negative feelings:
ncparnegt1/2, npparnegt1/2
Parental positive feelings:
ncparpost1/2, npparpost1/2
Parent feelings total:
ncpart1/2, nppart1/2
Victimization physical:
ncvicph1/2, npvicph1/2, ntvicph1/2
Victimization verbal:
ncvicve1/2, npvicve1/2, ntvicve1/2
Victimization social manipulation:
ncvicso1/2, npvicso1/2, ntvicso1/2
Victimization property:
ncvicpr1/2, npvicpr1/2, ntvicpr1/2

Definitions of derived variables

Listed alphabetically

n14year

Data flag, to show presence of any data (from parent, twin or teacher) for a twin pair in the 14 Year study.
Coded as 1=yes (data present).
The syntax below is executed on the full dataset, after merging together all data sources.

* Only retain twin pairs for whom data exists.
* (parent qnr, parent slq, web consent, teacher or twin qnr for either twin).
FILTER OFF.
USE ALL.
SELECT IF (npqdata = 1 | nslqdata = 1 | ncwcons = 1 | ncqdata1 = 1 | ncqdata2 = 1
 | ntqdata1 = 1 | ntqdata2 = 1).
EXECUTE.

* add a flag variable to show there is some 14yr data present.
COMPUTE n14year = 1.
EXECUTE.

ncbmi1/2

Twin body mass index (BMI), measured in units of kilograms per square metre. Derived from height and weight item variables. The syntax also shows how BMI, heights and weights are cleaned to remove extreme outliers.

* Height variable is in centimetres.
* We want BMI in units of kilograms per square metre.
* So include a scaling factor of 10000 in the BMI calculation.
* and round to the nearest 0.1.
COMPUTE ncbmi1 = RND(((10000 * ncwtkg1) / (nchtcm1 * nchtcm1)), 0.1).  
EXECUTE.

* Get rid of outliers in heights, weights and bmis.
* Recode anomalies to missing on the assumption that they are errors.
* Cut-offs are based on close examination of distributions.
* Heights of 130cm or less, or 201cm or more, are set to missing.
RECODE
  nchtcm1 (Lowest thru 130=SYSMIS) (201 thru Highest=SYSMIS)  .
EXECUTE .
* Weights of 29.9kg or less, or 94kg or more, are set to missing.
RECODE
  ncwtkg1 (Lowest thru 29.9=SYSMIS)  (94 thru Highest=SYSMIS)  .
EXECUTE .
* BMI values below 12, or of 35 or more, are set to missing.
RECODE
  ncbmi1 (Lowest thru 11.9999=SYSMIS) (35 thru Highest=SYSMIS)  .
EXECUTE .

* get rid of bmi values for the removed heights and weights.
DO IF (SYSMIS(nchtcm1) | SYSMIS(ncwtkg1)).
  RECODE ncbmi1 (ELSE=SYSMIS).
END IF.
EXECUTE.
* and the other way around, to remove data for dodgy BMIs.
DO IF (SYSMIS(ncbmi1)).
  RECODE nchtcm1 ncwtkg1 (ELSE=SYSMIS).
END IF.
EXECUTE.

* Round weights to integer kg now BMI calculated.
COMPUTE ncwtkg1 = RND(ncwtkg1).
EXECUTE.
FORMATS ncwtkg1 (F2.0).
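
As a rough illustration of the arithmetic and cleaning rules above, the following Python sketch (not part of the TEDS syntax) computes BMI from height in centimetres and weight in kilograms, then applies the same cut-offs. The function name `clean_bmi` is hypothetical, `None` stands in for SPSS system-missing, and Python's `round` may differ from SPSS `RND` on exact ties.

```python
from typing import Optional, Tuple

def clean_bmi(height_cm: Optional[float], weight_kg: Optional[float]
              ) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (height, weight, bmi) after applying the cut-offs in the syntax above."""
    bmi = None
    if height_cm is not None and weight_kg is not None:
        # The factor 10000 converts square centimetres to square metres.
        bmi = round(10000 * weight_kg / (height_cm * height_cm), 1)
    # Same limits as the RECODE commands: values at or beyond them become missing.
    if height_cm is not None and (height_cm <= 130 or height_cm >= 201):
        height_cm = None
    if weight_kg is not None and (weight_kg <= 29.9 or weight_kg >= 94):
        weight_kg = None
    if bmi is not None and (bmi <= 11.9999 or bmi >= 35):
        bmi = None
    # If any one of the three values is missing, all three are set to missing.
    if height_cm is None or weight_kg is None or bmi is None:
        return None, None, None
    # Weight is rounded to whole kilograms only after BMI has been computed.
    return height_cm, float(round(weight_kg)), bmi
```

For example, a height of 160 cm and a weight of 50 kg give a BMI of 19.5, while an out-of-range height removes the weight and BMI as well.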

ncchato1/2, npchatot

Chaos at home scale.
Derived from chaos items in the child (nc) and parent (np) booklets.
Each scale is derived from all the available chaos items (5 for parents, 6 for children).
Each item is coded 0/1/2 so the scale values range from 0 to 10 for parents, and 0 to 12 for children.
The parent version is specific to the home/family, not to individual twins.

* Chaos scale uses all available items: 5 for parent, 6 for child.
* Require at least 3 items to be non-missing.
COMPUTE npchatot = 5 * MEAN.3(npcha1xr, npcha2x, npcha3xr, npcha4x, npcha5xr).
COMPUTE ncchato1 = 6 * MEAN.3(nccha1r1, nccha21, nccha31, nccha4r1, nccha51, nccha6r1).
EXECUTE.
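
The `n * MEAN.k(...)` pattern used here, and in most of the scales on this page, gives a prorated total: the mean of the non-missing items is scaled up to the full number of items, provided that at least k items are present. A minimal Python sketch of that logic is shown below; it is an illustration rather than the TEDS syntax, the function name `prorated_total` is hypothetical, and `None` stands in for a missing item.

```python
from typing import Optional, Sequence

def prorated_total(items: Sequence[Optional[float]], min_valid: int) -> Optional[float]:
    """Number of items times the mean of the non-missing items.

    Returns None (missing) if fewer than min_valid items are present,
    mirroring the behaviour of n * MEAN.k(...) in SPSS.
    """
    valid = [x for x in items if x is not None]
    if len(valid) < min_valid:
        return None
    return len(items) * sum(valid) / len(valid)
```

With five parent chaos items coded 2, 1, missing, 0, 2, the four valid items have a mean of 1.25, so the prorated total is 6.25.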

ncconhit1/2, npconhit1/2, ntconhit1/2

Conners hyperactivity-impulsivity scale.
Derived from Conners items in the child (nc), parent (np) and teacher (nt) booklets.
Each scale is derived from 9 Conners items (each item having values 0/1/2/3), giving a total score between 0 and 27.

* Conners Hyperactivity-Impulsivity scale from 9 items.
* Require at least 5 items to be non-missing.
COMPUTE npconhit1 = 9 * MEAN.5(npcon011, npcon051, npcon081, npcon101,
 npcon111, npcon131, npcon141, npcon161, npcon181).
COMPUTE ncconhit1 = 9 * MEAN.5(nccon011, nccon051, nccon081, nccon101,
 nccon111, nccon131, nccon141, nccon161, nccon181).
COMPUTE ntconhit1 = 9 * MEAN.5(ntcon011, ntcon051, ntcon081, ntcon101,
 ntcon111, ntcon131, ntcon141, ntcon161, ntcon181).
EXECUTE.

ncconint1/2, npconint1/2, ntconint1/2

Conners inattention scale.
Derived from Conners items in the child (nc), parent (np) and teacher (nt) booklets.
Each scale is derived from 9 Conners items (each item having values 0/1/2/3), giving a total score between 0 and 27.

* Conners Inattention scale from 9 items.
* Require at least 5 items to be non-missing.
COMPUTE npconint1 = 9 * MEAN.5(npcon021, npcon031, npcon041,
 npcon061, npcon071, npcon091, npcon121, npcon151, npcon171).
COMPUTE ncconint1 = 9 * MEAN.5(nccon021, nccon031, nccon041,
 nccon061, nccon071, nccon091, nccon121, nccon151, nccon171).
COMPUTE ntconint1 = 9 * MEAN.5(ntcon021, ntcon031, ntcon041,
 ntcon061, ntcon071, ntcon091, ntcon121, ntcon151, ntcon171).
EXECUTE.

ncconnt1/2, npconnt1/2, ntconnt1/2

Conners total scale.
Derived from Conners items in the child (nc), parent (np) and teacher (nt) booklets.
Each scale is derived from all 18 Conners items (each item having values 0/1/2/3), giving a total score between 0 and 54.

* Conners total scale from all 18 items.
* Require at least 9 items to be non-missing.
COMPUTE npconnt1 = 18 * MEAN.9(npcon011, npcon021, npcon031, npcon041, npcon051,
 npcon061, npcon071, npcon081, npcon091, npcon101, npcon111, npcon121, npcon131,
 npcon141, npcon151, npcon161, npcon171, npcon181).
COMPUTE ncconnt1 = 18 * MEAN.9(nccon011, nccon021, nccon031, nccon041, nccon051,
 nccon061, nccon071, nccon081, nccon091, nccon101, nccon111, nccon121, nccon131,
 nccon141, nccon151, nccon161, nccon171, nccon181).
COMPUTE ntconnt1 = 18 * MEAN.9(ntcon011, ntcon021, ntcon031, ntcon041, ntcon051,
 ntcon061, ntcon071, ntcon081, ntcon091, ntcon101, ntcon111, ntcon121, ntcon131,
 ntcon141, ntcon151, ntcon161, ntcon171, ntcon181).
EXECUTE.

ncdisavot1/2, ncdisnegt1/2, ncdispost1/2, npdisavot1/2, npdisnegt1/2, npdispost1/2

Parental discipline subscales.
Subscales for negative/harsh discipline (disnegt), positive discipline (dispost) and avoidance (disavot).
Derived from the 6 discipline items in the child (nc) and parent (np) booklets.
Each subscale is derived from 2 items; each item is coded 0/1/2 so the scale values range from 0 to 4.

* Parental Discipline.
* Parent and child versions, from questionnaires.
* Negative parental discipline subscale from 2 items.
* Require at least 1 item to be non-missing.
COMPUTE npdisnegt1 = 2 * MEAN.1(npdis011, npdis021).
COMPUTE ncdisnegt1 = 2 * MEAN.1(ncdis11, ncdis21).
* Positive discipline subscale, 2 items.
COMPUTE npdispost1 = 2 * MEAN.1(npdis031, npdis041).
COMPUTE ncdispost1 = 2 * MEAN.1(ncdis31, ncdis41).
* Avoiding discipline subscale, 2 items.
COMPUTE npdisavot1 = 2 * MEAN.1(npdis051, npdis061).
COMPUTE ncdisavot1 = 2 * MEAN.1(ncdis51, ncdis61).
EXECUTE.

ncg1/2

General cognitive ability scale (g).
Derived from adjusted Vocab and Ravens test scores, which are described elsewhere on this page.

* These variables must be standardised on the non-excluded sample.
* So apply a filter before doing anything else.
USE ALL.
COMPUTE filter_$=(exclude1 = 0).
VARIABLE LABELS filter_$ 'exclude1 = 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMATS filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* First standardise the cognitive web test scores.
* Note: use the adjusted Vocab and Ravens scores, not the raw scores.
DESCRIPTIVES
  VARIABLES= nrvtota1 nvctota1  /SAVE .

* Compute G as the mean of the two standardised test scores.
COMPUTE g1 = MEAN.2(znrvtota1, znvctota1) .
EXECUTE .

* standardise the new variable.
DESCRIPTIVES
  VARIABLES= g1 (ncg1) /SAVE .

* All the cognitive variables have now been computed on the filtered sample.
* They will have missing values for the exclusions (exclude1 = 1).
* The filter can now be removed.
FILTER OFF.
USE ALL.
EXECUTE .
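
The sequence above (standardise each test score on the non-excluded sample, average the two z-scores, then standardise the result) can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, `None` stands in for missing values, and the sample standard deviation (n - 1 denominator) is assumed to match the z-scores saved by DESCRIPTIVES.

```python
from statistics import mean, stdev
from typing import List, Optional

def zscores(values: List[Optional[float]]) -> List[Optional[float]]:
    """Standardise the non-missing values using the sample SD."""
    present = [v for v in values if v is not None]
    m, s = mean(present), stdev(present)
    return [None if v is None else (v - m) / s for v in values]

def composite_g(ravens: List[Optional[float]], vocab: List[Optional[float]],
                excluded: List[bool]) -> List[Optional[float]]:
    """Composite of two test scores, standardised on the non-excluded sample."""
    # Excluded twins are treated as missing before anything is standardised.
    rv = [None if ex else v for v, ex in zip(ravens, excluded)]
    vc = [None if ex else v for v, ex in zip(vocab, excluded)]
    zr, zv = zscores(rv), zscores(vc)
    # As with MEAN.2, both standardised scores must be present.
    raw = [None if a is None or b is None else (a + b) / 2 for a, b in zip(zr, zv)]
    # Standardise the composite itself, so it has mean 0 and SD 1.
    return zscores(raw)
```

Excluded twins, and twins missing either test score, end up with a missing composite.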

ncparnegt1/2, ncparpost1/2, ncpart1/2, npparnegt1/2, npparpost1/2, nppart1/2

Negative (parnegt) and positive (parpost) parental feelings subscales, and overall total scales (part).
Derived from the 7 parental feelings items in the child (nc) and parent (np) booklets.
Each item is coded 0/1/2 so the scale values range from 0 to 2 times the number of items included.

* Negative parental feelings scale from 4 items.
* Require at least 2 items to be non-missing.
COMPUTE npparnegt1 = 4 * MEAN.2(nppar011, nppar041, nppar051, nppar071).
COMPUTE ncparnegt1 = 4 * MEAN.2(ncpar11, ncpar41, ncpar51, ncpar71).
EXECUTE.

* Positive parental feelings scale from 3 items.
* Require at least 2 items to be non-missing.
COMPUTE npparpost1 = 3 * MEAN.2(nppar021, nppar031, nppar061).
COMPUTE ncparpost1 = 3 * MEAN.2(ncpar21, ncpar31, ncpar61).
EXECUTE.

* Overall total scale derived from all 7 items.
* coded in the negative feelings direction, so reverse each positive item.
* (coded 0/1/2) by subtracting it from 2.
* Require at least 4 items to be non-missing.
COMPUTE nppart1 = 7 * MEAN.4(nppar011, (2 - nppar021), (2 - nppar031), 
    nppar041, nppar051, (2 - nppar061), nppar071).
COMPUTE ncpart1 = 7 * MEAN.4(ncpar11, (2 - ncpar21), (2 - ncpar31), 
    ncpar41, ncpar51, (2 - ncpar61), ncpar71).
EXECUTE.
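
To illustrate the reversal used in the total scale, the Python sketch below (not the TEDS syntax) reverses the three positive items by subtracting them from 2 and then prorates the total over all 7 items. The function name and the use of `None` for missing items are assumptions made for the example; items are passed in questionnaire order 1 to 7.

```python
from typing import Optional, Sequence

# Positions (0-based) of the positive items 2, 3 and 6, which are reversed.
POSITIVE_POSITIONS = (1, 2, 5)

def feelings_total(items: Sequence[Optional[int]], min_valid: int = 4) -> Optional[float]:
    """Total in the negative direction from 7 items coded 0/1/2."""
    valid = []
    for position, value in enumerate(items):
        if value is None:
            continue  # missing items are skipped, as in MEAN.4
        valid.append(2 - value if position in POSITIVE_POSITIONS else value)
    if len(valid) < min_valid:
        return None
    return len(items) * sum(valid) / len(valid)
```

A twin with the maximum on every negative item and zero on every positive item scores 14; the opposite pattern scores 0.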

ncpub5age1/2

Age at reported onset of menstruation, coded in ordinal year groups (9 = under 10 years, then 10, 11, 12, 13 and 14 years). The age is first derived as an integer number of months, from raw item variables reporting the year (ncpub5y) and month (ncpub5m) of onset of menstruation and from the twin birth date (aonsdob), and is then recoded into year groups. These raw date variables are not retained in the dataset for reasons of potential identifiability.

* Convert raw year + month of first menstruation into an age variable.
* This avoids retention of dates that could aid identifiability.
* Get year and month of twin birth.
COMPUTE aonsby = XDATE.YEAR(aonsdob).
COMPUTE aonsbm = XDATE.MONTH(aonsdob).
EXECUTE.
* First derive age in integer months.
* If menstruation onset month is missing, just use the year.
IF (SYSMIS(ncpub5m1)) ncpub5months = ((ncpub5y1 - aonsby) * 12).
EXECUTE.
* otherwise include months in calculation.
IF (~SYSMIS(ncpub5m1)) ncpub5months = ((12 * (ncpub5y1 - aonsby)) + (ncpub5m1 - aonsbm)).
EXECUTE.
* Now recode age in months into ordinal age groups based on years.
* after inspecting the distribution: under 10 (coded 9), 10, 11, 12, 13, 14.
* (ages above 14 do not occur in this dataset).
RECODE ncpub5months (LOWEST THRU 119=9) (120 THRU 131=10)
 (132 THRU 143=11) (144 THRU 155=12) (156 THRU 167=13) (168 THRU HIGHEST = 14)
INTO ncpub5age1.
EXECUTE.
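
The two steps above (age in months, then recoding into year groups) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the TEDS syntax: the function name is hypothetical and `None` stands in for a missing year or month.

```python
from typing import Optional

def onset_age_group(birth_year: int, birth_month: int,
                    onset_year: Optional[int], onset_month: Optional[int]) -> Optional[int]:
    """Age group (9 to 14) at onset, from year/month of onset and of birth."""
    if onset_year is None:
        return None
    # If the onset month is missing, only the year difference is used.
    months = 12 * (onset_year - birth_year)
    if onset_month is not None:
        months += onset_month - birth_month
    # Same bands as the RECODE: up to 119 months is 9, 168 months or more is 14.
    if months <= 119:
        return 9
    if months >= 168:
        return 14
    return months // 12
```

For a twin born in March 1995 with onset reported as September 2007, the age is 150 months, which falls in the 12-year group.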

ncqage1/2

Age of twin (in decimal years) when the twin booklet was returned.
Derived from the return date of the child booklet (ncrdate1/2); this return of data may have been either on paper or on line. The variable aonsdob is the twin birth date. These date variables are not retained in the dataset.

* Parent and twin questionnaires were returned on paper in cohorts 1/2.
* but on line for cohorts 3/4 (a single variable is now used for paper and on line return dates).
* Age when child questionnaires returned.
COMPUTE ncqage1 = RND(DATEDIFF(ncrdate1, aonsdob, 'days') / 365.25, 0.1) .
COMPUTE ncqage2 = RND(DATEDIFF(ncrdate2, aonsdob, 'days') / 365.25, 0.1) .
EXECUTE.
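
The same age calculation is used for the other decimal-year age variables on this page: the number of days between the birth date and the activity date is divided by 365.25 and rounded to one decimal place. A small Python sketch, with a hypothetical function name and made-up example dates, is given below; Python's `round` may differ from SPSS `RND` on exact ties.

```python
from datetime import date

def decimal_age(event: date, birth: date) -> float:
    """Age in decimal years at the event date, rounded to 0.1."""
    days = (event - birth).days
    return round(days / 365.25, 1)
```

For a birth date of 15 January 1995 and a return date of 15 July 2009, the difference is 5295 days, giving an age of 14.5 years.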

ncqLLCage1/2, ncqLLCdate1/2, ncwLLCage1/2, ncwLLCdate1/2, npqLLCage, npqLLCdate, npslks3LLCage, npslks3LLCdate, npslLLCage, npslLLCdate, ntqLLCage1/2, ntqLLCdate1/2

Age and date variables derived for use in datasets in the LLC TRE (but not to be used in other datasets).
Ages and dates are derived for the parent questionnaire ('npq'), the child questionnaire ('ncq'), the child web activities ('ncw'), the teacher questionnaire ('ntq'), the parent SLQ questionnaire ('npsl') and estimates for when end-of-KS3 assessments were made ('npslks3').
The LLC date variables contain only the month and year, not the day, as a means of reducing identifiability. The date variables are strings formatted as 'yyyy-mm'. These LLC dates are designed to enable the TEDS measures to be placed in a time sequence with NHS medical diagnosis dates in the data in the TRE.
The LLC age variables are integers measuring the number of months between birth and the given TEDS activity, consistent with the matching LLC date variables.
Variable aonsdob is the twin birth date - the raw date variables are not retained in the dataset.

* First extract year and month as temp variables, from birth date and activity dates.
COMPUTE birthyear = XDATE.YEAR(aonsdob).
COMPUTE birthmonth = XDATE.MONTH(aonsdob).
COMPUTE ncwyear1 = XDATE.YEAR(ncwstdt1).
COMPUTE ncwmonth1 = XDATE.MONTH(ncwstdt1).
COMPUTE ncwyear2 = XDATE.YEAR(ncwstdt2).
COMPUTE ncwmonth2 = XDATE.MONTH(ncwstdt2).
COMPUTE ntqyear1 = XDATE.YEAR(ntqdate1).
COMPUTE ntqmonth1 = XDATE.MONTH(ntqdate1).
COMPUTE ntqyear2 = XDATE.YEAR(ntqdate2).
COMPUTE ntqmonth2 = XDATE.MONTH(ntqdate2).
COMPUTE ncqyear1 = XDATE.YEAR(ncrdate1).
COMPUTE ncqmonth1 = XDATE.MONTH(ncrdate1).
COMPUTE ncqyear2 = XDATE.YEAR(ncrdate2).
COMPUTE ncqmonth2 = XDATE.MONTH(ncrdate2).
COMPUTE npqyear = XDATE.YEAR(nprdate).
COMPUTE npqmonth = XDATE.MONTH(nprdate).
COMPUTE npslyear = XDATE.YEAR(nslrdate).
COMPUTE npslmonth = XDATE.MONTH(nslrdate).
EXECUTE.

* The agreed LLC date format is a string yyyy-mm (nominal by default for strings).
* adding '0' where necessary for two-digit months.
STRING ncwLLCdate1 ncwLLCdate2 ntqLLCdate1 ntqLLCdate2 ncqLLCdate1 ncqLLCdate2 npqLLCdate npslLLCdate npslks3LLCdate (A7).
IF (ncwmonth1 < 10) ncwLLCdate1 = CONCAT(STRING(ncwyear1, F4), '-0', STRING(ncwmonth1, F1)).
IF (ncwmonth1 >= 10) ncwLLCdate1 = CONCAT(STRING(ncwyear1, F4), '-', STRING(ncwmonth1, F2)).
IF (ncwmonth2 < 10) ncwLLCdate2 = CONCAT(STRING(ncwyear2, F4), '-0', STRING(ncwmonth2, F1)).
IF (ncwmonth2 >= 10) ncwLLCdate2 = CONCAT(STRING(ncwyear2, F4), '-', STRING(ncwmonth2, F2)).
IF (ntqmonth1 < 10) ntqLLCdate1 = CONCAT(STRING(ntqyear1, F4), '-0', STRING(ntqmonth1, F1)).
IF (ntqmonth1 >= 10) ntqLLCdate1 = CONCAT(STRING(ntqyear1, F4), '-', STRING(ntqmonth1, F2)).
IF (ntqmonth2 < 10) ntqLLCdate2 = CONCAT(STRING(ntqyear2, F4), '-0', STRING(ntqmonth2, F1)).
IF (ntqmonth2 >= 10) ntqLLCdate2 = CONCAT(STRING(ntqyear2, F4), '-', STRING(ntqmonth2, F2)).
IF (ncqmonth1 < 10) ncqLLCdate1 = CONCAT(STRING(ncqyear1, F4), '-0', STRING(ncqmonth1, F1)).
IF (ncqmonth1 >= 10) ncqLLCdate1 = CONCAT(STRING(ncqyear1, F4), '-', STRING(ncqmonth1, F2)).
IF (ncqmonth2 < 10) ncqLLCdate2 = CONCAT(STRING(ncqyear2, F4), '-0', STRING(ncqmonth2, F1)).
IF (ncqmonth2 >= 10) ncqLLCdate2 = CONCAT(STRING(ncqyear2, F4), '-', STRING(ncqmonth2, F2)).
IF (npqmonth < 10) npqLLCdate = CONCAT(STRING(npqyear, F4), '-0', STRING(npqmonth, F1)).
IF (npqmonth >= 10) npqLLCdate = CONCAT(STRING(npqyear, F4), '-', STRING(npqmonth, F2)).
IF (npslmonth < 10) npslLLCdate = CONCAT(STRING(npslyear, F4), '-0', STRING(npslmonth, F1)).
IF (npslmonth >= 10) npslLLCdate = CONCAT(STRING(npslyear, F4), '-', STRING(npslmonth, F2)).
EXECUTE.

* The agreed LLC age variable is in integer months.
* and it must agree with the birth and booklet year/month variables that will be available in the LLC.
COMPUTE ncwLLCage1 = (ncwmonth1 + (ncwyear1 * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ncwLLCage2 = (ncwmonth2 + (ncwyear2 * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ntqLLCage1 = (ntqmonth1 + (ntqyear1 * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ntqLLCage2 = (ntqmonth2 + (ntqyear2 * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ncqLLCage1 = (ncqmonth1 + (ncqyear1 * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ncqLLCage2 = (ncqmonth2 + (ncqyear2 * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE npqLLCage = (npqmonth + (npqyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE npslLLCage = (npslmonth + (npslyear * 12)) - (birthmonth + (birthyear * 12)).
EXECUTE.

* Estimated age/date when KS3 assessments were made (reported by parents in SLQ).
* Assuming that KS3 assessments were made in June at the end of the school year.
* in which twins reached the age of 14 years.
* (ignoring exceptions, which we have no knowledge of).
* Cohort 1: KS3 assessments should have been done in June 2008.
* Hence exam year is assumed to be 2007 + cohort.
* Only compute if SLQ data are present.
IF (nslqdata = 1) npslks3LLCage = (6 + ((2007 + cohort) * 12)) - (birthmonth + (birthyear * 12)).
* The LLC date is then inferred (or guessed) based purely on the cohort.
IF (nslqdata = 1) npslks3LLCdate = CONCAT(STRING((2007 + cohort), F4), '-06').
EXECUTE.
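
As a sketch of the LLC derivations above (not the TEDS syntax itself), the Python functions below build the 'yyyy-mm' string and the age in whole months from the year and month components only, so the day of the month never enters the calculation. The function names and the example dates used alongside them are hypothetical.

```python
from datetime import date
from typing import Tuple

def llc_date_and_age(event: date, birth: date) -> Tuple[str, int]:
    """Return the 'yyyy-mm' date string and the age in integer months."""
    llc_date = f"{event.year:04d}-{event.month:02d}"
    llc_age = (event.month + 12 * event.year) - (birth.month + 12 * birth.year)
    return llc_date, llc_age

def ks3_llc_date_and_age(cohort: int, birth: date) -> Tuple[str, int]:
    """Estimated KS3 assessment date and age, assuming June of year 2007 + cohort."""
    exam_year = 2007 + cohort
    llc_age = (6 + 12 * exam_year) - (birth.month + 12 * birth.year)
    return f"{exam_year:04d}-06", llc_age
```

Because both the date and the age use month resolution, the age in months can be reproduced exactly from the birth and activity year/month values available in the LLC.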

ncvicph1/2, ncvicpr1/2, ncvicso1/2, ncvicve1/2, npvicph1/2, npvicpr1/2, npvicso1/2, npvicve1/2, ntvicph1/2, ntvicpr1/2, ntvicso1/2, ntvicve1/2

Victimisation scales: physical (ph), property (pr), social manipulation (so) and verbal (ve).
Derived from victimisation items in the child (nc), parent (np) and teacher (nt) booklets.
Each scale is derived from 4 victimisation items (each item having values 0/1/2), giving scale values between 0 and 8.
Responses of 3 (don't know) have been set to missing values in the SPSS variables, hence these values are treated as missing in the computation.

* Items now have 3=don't know set to missing values.
* therefore no need to recode these before summing to make scales.

* Victimization Physical scale from 4 items.
* Require at least 2 items to be non-missing.
COMPUTE npvicph1 = 4 * MEAN.2(npvic011, npvic051, npvic091, npvic131).
COMPUTE ncvicph1 = 4 * MEAN.2(ncvic011, ncvic051, ncvic091, ncvic131).
COMPUTE ntvicph1 = 4 * MEAN.2(ntvic011, ntvic051, ntvic091, ntvic131).
EXECUTE.

* Victimization Verbal scale from 4 items.
* Require at least 2 items to be non-missing.
COMPUTE npvicve1 = 4 * MEAN.2(npvic031, npvic071, npvic111, npvic151).
COMPUTE ncvicve1 = 4 * MEAN.2(ncvic031, ncvic071, ncvic111, ncvic151).
COMPUTE ntvicve1 = 4 * MEAN.2(ntvic031, ntvic071, ntvic111, ntvic151).
EXECUTE.

* Victimization Social Manipulation scale from 4 items.
* Require at least 2 items to be non-missing.
COMPUTE npvicso1 = 4 * MEAN.2(npvic021, npvic061, npvic101, npvic141).
COMPUTE ncvicso1 = 4 * MEAN.2(ncvic021, ncvic061, ncvic101, ncvic141).
COMPUTE ntvicso1 = 4 * MEAN.2(ntvic021, ntvic061, ntvic101, ntvic141).
EXECUTE.

* Victimization Property scale from 4 items.
* Require at least 2 items to be non-missing.
COMPUTE npvicpr1 = 4 * MEAN.2(npvic041, npvic081, npvic121, npvic161).
COMPUTE ncvicpr1 = 4 * MEAN.2(ncvic041, ncvic081, ncvic121, ncvic161).
COMPUTE ntvicpr1 = 4 * MEAN.2(ntvic041, ntvic081, ntvic121, ntvic161).
EXECUTE.

ncwage1/2

Age of twin (in decimal years) when the web tests were started.
Derived from the date when the twin started the web activities (ncwstdt1/2). The variable aonsdob is the twin birth date. These date variables are not retained in the dataset.

* Web test age - compute age when web tests started by twin.
COMPUTE ncwage1 = RND(DATEDIFF(ncwstdt1, aonsdob, 'days') / 365.25, 0.1) .
COMPUTE ncwage2 = RND(DATEDIFF(ncwstdt2, aonsdob, 'days') / 365.25, 0.1) .
EXECUTE.

ncwdata1/2

Data flag to show the presence or absence of meaningful twin web data (1=yes, 0=no).
Computed from item variables flagging the presence of data for the three web tests (vocabulary, ravens, science).

COMPUTE ncwdata1 = 0.
EXECUTE.
IF (SUM(nvcdata1, nrvdata1, nscdata1) > 0) ncwdata1 = 1.
EXECUTE.

ncwLLCage1/2, ncwLLCdate1/2

See ncqLLCage1/2, etc above.

npapsdt1/2, ntapsdt1/2

APSD total scale.
Derived from APSD items in the parent (np) and teacher (nt) booklets.
Each scale is derived from all 20 APSD items (each item having values 0/1/2), giving a total score between 0 and 40.

* APSD total scale from all 20 items, reversed where necessary.
* Require at least 10 of the items to be non-missing.
COMPUTE npapsdt1 = 20 * MEAN.10(npaps011, npaps021, npap03r1, npaps041,
 npaps051, npaps061, npap07r1, npaps081, npaps091, npaps101, npaps111,
 npap12r1, npaps131, npaps141, npaps151, npaps161, npaps171, npap18r1,
 npaps191, npap20r1).
COMPUTE ntapsdt1 = 20 * MEAN.10(ntaps011, ntaps021, ntap03r1, ntaps041,
 ntaps051, ntaps061, ntap07r1, ntaps081, ntaps091, ntaps101, ntaps111,
 ntap12r1, ntaps131, ntaps141, ntaps151, ntaps161, ntaps171, ntap18r1,
 ntaps191, ntap20r1).
EXECUTE.

npaqt1/2

AQ total from all 38 parent-reported items.
Use reverse-coded items where appropriate.
Require at least 19 of the 38 items to be non-missing.

* AQ total score from all 38 items, reversed where necessary.
* Require at least half of the items to be non-missing.
COMPUTE npaqt1 = 38 * MEAN.19(npaq01r1, npaq021, npaq03r1, npaq041, npaq051, npaq061, 
  npaq071, npaq08r1, npaq091, npaq10r1, npaq11r1, npaq121, npaq131, npaq14r1,
  npaq151, npaq161, npaq171, npaq181, npaq191, npaq201, npaq211, npaq22r1,
  npaq231, npaq24r1, npaq25r1, npaq26r1, npaq27r1, npaq28r1, npaq291, npaq30r1, 
  npaq311, npaq321, npaq331, npaq341, npaq351, npaq36r1, npaq37r1, npaq38r1).
EXECUTE.

npcalt1/2, ntcalt1/2

APSD callous-unemotional scale.
Derived from APSD items in the parent (np) and teacher (nt) booklets.
Each scale is derived from 6 APSD items (each item having values 0/1/2), giving scale values between 0 and 12.

* APSD Callous-Unemotional scale from 6 items.
* Require at least 3 items to be non-missing.
COMPUTE npcalt1 = 6 * MEAN.3(npap03r1, npap07r1, npap12r1, npap18r1, npaps191, npap20r1).
COMPUTE ntcalt1 = 6 * MEAN.3(ntap03r1, ntap07r1, ntap12r1, ntap18r1, ntaps191, ntap20r1).
EXECUTE.

npimpt1/2, ntimpt1/2

APSD impulsivity scale.
Derived from APSD items in the parent (np) and teacher (nt) booklets.
Each scale is derived from 5 APSD items (each item having values 0/1/2), giving scale values between 0 and 10.

* APSD Impulsivity scale from 5 items.
* Require at least 3 items to be non-missing.
COMPUTE npimpt1 = 5 * MEAN.3(npaps011, npaps041, npaps091, npaps131, npaps171).
COMPUTE ntimpt1 = 5 * MEAN.3(ntaps011, ntaps041, ntaps091, ntaps131, ntaps171).
EXECUTE.

npks3t2a1/2, npks3t3a1/2, npks3tall1/2

Academic achievement scales, derived from end-of-KS3 ratings that were reported to families and then reported by parents to TEDS in the SLQ. There are three scales: a two-subject mean npks3t2a1/2 (English and maths), a three-subject mean npks3t3a1/2 (English, maths and science), and an all-subject mean npks3tall1/2. The latter may include up to 11 subjects, depending on which subjects were taken by twins and reported by parents (English, maths, science, geography, history, DT, ICT, art, music, PE and a foreign language). The scales are derived as means of subject NC levels, hence all three scales have the same range of values (NC levels 1 to 9).

* Combined subject means from parent-reported end of KS3 school ratings.
* 2-subject (English, maths) and 3-subject (English, maths, science).
* means similar to those used at other ages.
COMPUTE npks3t2a1 = MEAN.2(nslengt1, nslmatt1).
COMPUTE npks3t3a1 = MEAN.3(nslengt1, nslmatt1, nslscit1).
* Add a third scale, using the mean from all 11 subjects, where reported.
* Using only the first foreign language (some twins reported 2).
* and requiring at least 6 to be non-missing.
COMPUTE npks3tall1 = MEAN.6(nslengt1, nslmatt1, nslscit1, nsldtt1, 
  nslgeot1, nslhist1, nslictt1, nslartt1, nslmust1, nslpet1, nslla1t1).
EXECUTE.

npnart1/2, ntnart1/2

APSD narcissism scale.
Derived from APSD items in the parent (np) and teacher (nt) booklets.
Each scale is derived from 7 APSD items (each item having values 0/1/2), giving scale values between 0 and 14.

* APSD Narcissism scale from 7 items.
* Require at least 4 items to be non-missing.
COMPUTE npnart1 = 7 * MEAN.4(npaps051, npaps081, npaps101, npaps111, npaps141, npaps151, npaps161).
COMPUTE ntnart1 = 7 * MEAN.4(ntaps051, ntaps081, ntaps101, ntaps111, ntaps141, ntaps151, ntaps161).
EXECUTE.

npparnegt1/2, npparpost1/2, nppart1/2

See ncparnegt1/2, etc above.

npqage

Age of twin (in decimal years) when the parent booklet was returned.
Derived from the return date of the parent booklet (nprdate); this return of data may have been either on paper or on line. The variable aonsdob is the twin birth date. These date variables are not retained in the dataset.

* Parent and twin questionnaires - returned on paper in cohorts 1/2.
* but on line for cohorts 3/4 (a single variable is now used for paper and on line return dates).
* Age when parent questionnaire returned.
COMPUTE npqage = RND(DATEDIFF(nprdate, aonsdob, 'days') / 365.25, 0.1) .
EXECUTE.

npqLLCage, npqLLCdate

See ncqLLCage1/2, etc above.

npslage

Age of twin (in decimal years) when the parent SLQ questionnaire was returned.
Derived from the return date of the parent SLQ questionnaire (nslrdate); this return of data may have been either on paper or on line. The variable aonsdob is the twin birth date. These date variables are not retained in the dataset.

* Age when SLQ questionnaire returned.
COMPUTE npslage = RND(DATEDIFF(nslrdate, aonsdob, 'days') / 365.25, 0.1) .
EXECUTE.

npslks3age

Estimated age of twin (in decimal years) when the end-of-KS3 teacher assessment was made, as reported in the parent SLQ questionnaire.
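Derived from the school cohort (which determines the year of assessment) and the year and month of birth, using assumptions stated in the syntax comments. The year and month of birth (birthyear, birthmonth) are temporary variables extracted from the twin birth date aonsdob, as shown in the syntax for the LLC variables above; the birth date is not retained in the dataset.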

* Estimated age when KS3 assessments were made (part of SLQ).
* Assuming that KS3 assessments were made in June at the end of the school year.
* in which twins reached the age of 14 years.
* (ignoring exceptions which we have no knowledge of).
* Cohort 1: KS3 assessments should have been done in June 2008.
* Hence exam year is assumed to be 2007 + cohort.
* Estimate age only to the nearest month, by subtraction.
* and only compute the age if exam data are present.
IF (nslqdata = 1) npslks3age = RND(((2007 + cohort) + (6 / 12)) - (birthyear + (birthmonth / 12)), 0.1).
EXECUTE.
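
The estimate above amounts to subtracting the birth year and month (as a decimal year) from June of the assumed exam year. A minimal Python sketch, with a hypothetical function name and with the SLQ data check left out, is:

```python
def ks3_decimal_age(cohort: int, birth_year: int, birth_month: int) -> float:
    """Estimated age in decimal years in June of the year 2007 + cohort."""
    exam_time = (2007 + cohort) + 6 / 12
    birth_time = birth_year + birth_month / 12
    return round(exam_time - birth_time, 1)
```

For a cohort 1 twin born in February 1994, this gives an estimated age of 14.3 years.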

npslks3LLCage, npslks3LLCdate

See ncqLLCage1/2, etc above.

npslLLCage, npslLLCdate

See ncqLLCage1/2, etc above.

npubtot1/2

Pubertal scale, derived from items in the twin questionnaire. The same scale is used for both boys and girls, using the three items common to both sexes (1, 2 and 3) together with sex-specific items (4 and 5 for girls, 6 and 7 for boys). All items are re-scaled to the range 1 to 4, so that a mean can be applied.

* First recode into temporary variables.
* Recode values 0-3 into values 1-4, and 4 (not sure) into missing.
RECODE ncpub11 ncpub21 ncpub31 ncpub41 ncpub61 ncpub71
 (0=1) (1=2) (2=3) (3=4) (ELSE=SYSMIS)
INTO pub11 pub21 pub31 pub41 pub61 pub71.
EXECUTE.
* Item 5 only has values 0 and 1 - recode these to 1 and 4.
* so the range of values is comparable with the other items.
RECODE ncpub51 
 (0=1) (1=4) (ELSE=SYSMIS)
INTO pub51 .
EXECUTE.
* Items 1-3 are common to both sexes, but items 4-5 are for girls.
* and items 6-7 are for boys.
* Require a minimum of only one item to be non-missing for the scale.
DO IF (sex1 = 0).
  COMPUTE npubtot1 = MEAN(pub11, pub21, pub31, pub41, pub51).
ELSE IF (sex1 = 1).
  COMPUTE npubtot1 = MEAN(pub11, pub21, pub31, pub61, pub71).
END IF.
EXECUTE.
* Note that the DO IF condition uses sex1 (admin data) not ncsex1 (puberty qnr).
* so in rare cases where these disagree, the sex-specific items answered by the twin are not used.
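
The recoding and sex-specific item selection above can be sketched in Python as follows. This is an illustration only: the function name and the dictionary input are hypothetical, `None` stands in for missing, and sex is taken to be coded 0 for girls and 1 for boys, as implied by the items used in each branch of the syntax.

```python
from typing import Dict, Optional

def puberty_scale(items: Dict[int, Optional[int]], sex: Optional[int]) -> Optional[float]:
    """Mean of the re-scaled items (range 1 to 4) relevant to the twin's sex.

    items maps the item number (1 to 7) to its raw code.
    """
    def rescale(item_no: int, raw: Optional[int]) -> Optional[int]:
        if raw is None:
            return None
        if item_no == 5:
            # Item 5 is coded 0/1 and is stretched to 1/4.
            return {0: 1, 1: 4}.get(raw)
        # Raw codes 0-3 become 1-4; code 4 (not sure) becomes missing.
        return {0: 1, 1: 2, 2: 3, 3: 4}.get(raw)

    if sex == 0:
        wanted = (1, 2, 3, 4, 5)
    elif sex == 1:
        wanted = (1, 2, 3, 6, 7)
    else:
        return None
    valid = [v for v in (rescale(n, items.get(n)) for n in wanted) if v is not None]
    # A single non-missing item is enough to compute the scale.
    return sum(valid) / len(valid) if valid else None
```

Items belonging to the other sex are simply ignored, so a scale value is still computed from items 1 to 3 where these are present.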

npvicph1/2, npvicpr1/2, npvicso1/2, npvicve1/2

See ncvicph1/2, etc above.

nrvatm1/2

Mean item answer time (in seconds) for the twin Ravens web test.

COMPUTE nrvatm1 = MEAN(nrc01at1,nrc03at1,nrc05at1,nrc07at1,nrc09at1,nrc11at1,
  nrd01at1,nrd03at1,nrd05at1,nrd07at1,nrd09at1,nrd11at1,nre01at1, 
  nre03at1,nre05at1,nre07at1,nre09at1,nre11at1,nrf01at1,nrf02at1,
  nrf03at1,nrf04at1,nrf05at1,nrf06at1,nrf07at1,nrf08at1,nrf09at1,
  nrf10at1,nrf11at1,nrf12at1).
EXECUTE.

nrvtota1/2

Adjusted total score for the Ravens web test.
Scores are adjusted for twins who discontinued at any point in the test, by replacing missing item scores with 'chance' item scores: the mean score from a random answer to the item. Discontinued items have previously been identified and given item response values of -2.

* Ravens test.
* Compute the adjusted item scores in new temporary variables.
* Discontinue only applies after the first three items in each subtest.
* so do not compute adjusted scores for the first three in each subtest.
RECODE nrc07sc1 nrc09sc1 nrc11sc1 nrd07sc1 nrd09sc1 nrd11sc1
 nre07sc1 nre09sc1 nre11sc1 nrf04sc1 nrf05sc1 nrf06sc1
 nrf07sc1 nrf08sc1 nrf09sc1 nrf10sc1 nrf11sc1 nrf12sc1
 (ELSE=COPY)
 INTO nrc07sc1a nrc09sc1a nrc11sc1a nrd07sc1a nrd09sc1a nrd11sc1a 
 nre07sc1a nre09sc1a nre11sc1a nrf04sc1a nrf05sc1a nrf06sc1a 
 nrf07sc1a nrf08sc1a nrf09sc1a nrf10sc1a nrf11sc1a nrf12sc1a .
EXECUTE.
* Replace missing/zero item scores (due to discontinue, responses coded -2).
* with chance scores - these are always 1/8=0.125 as there are always 8 possible responses.
IF (nrc07an1 = -2) nrc07sc1a = 0.125.
IF (nrc09an1 = -2) nrc09sc1a = 0.125.
IF (nrc11an1 = -2) nrc11sc1a = 0.125.
IF (nrd07an1 = -2) nrd07sc1a = 0.125.
IF (nrd09an1 = -2) nrd09sc1a = 0.125.
IF (nrd11an1 = -2) nrd11sc1a = 0.125.
IF (nre07an1 = -2) nre07sc1a = 0.125.
IF (nre09an1 = -2) nre09sc1a = 0.125.
IF (nre11an1 = -2) nre11sc1a = 0.125.
IF (nrf04an1 = -2) nrf04sc1a = 0.125.
IF (nrf05an1 = -2) nrf05sc1a = 0.125.
IF (nrf06an1 = -2) nrf06sc1a = 0.125.
IF (nrf07an1 = -2) nrf07sc1a = 0.125.
IF (nrf08an1 = -2) nrf08sc1a = 0.125.
IF (nrf09an1 = -2) nrf09sc1a = 0.125.
IF (nrf10an1 = -2) nrf10sc1a = 0.125.
IF (nrf11an1 = -2) nrf11sc1a = 0.125.
IF (nrf12an1 = -2) nrf12sc1a = 0.125.
EXECUTE.

* The adjusted test score is now the sum of these adjusted item scores.
* plus the unadjusted item scores for the first three items in each subtest.
* Only compute the adjusted test score for twins with Ravens data.
DO IF (nrvstat1 = 2).
 COMPUTE nrvtota1 = SUM(nrc01sc1, nrc03sc1, nrc05sc1, nrc07sc1a, nrc09sc1a, nrc11sc1a,
 nrd01sc1, nrd03sc1, nrd05sc1, nrd07sc1a, nrd09sc1a, nrd11sc1a, 
 nre01sc1, nre03sc1, nre05sc1, nre07sc1a, nre09sc1a, nre11sc1a, 
 nrf01sc1, nrf02sc1, nrf03sc1, nrf04sc1a, nrf05sc1a, nrf06sc1a, 
 nrf07sc1a, nrf08sc1a, nrf09sc1a, nrf10sc1a, nrf11sc1a, nrf12sc1a).
END IF.
EXECUTE.
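
For readers working outside SPSS, the adjustment above can be sketched roughly in Python. This is an illustration only and is not part of the TEDS derivation: the function name, the dictionary layout and the short item labels are hypothetical, and the check on nrvstat1 is omitted.

```python
DISCONTINUED = -2      # item response code given to discontinued items
CHANCE_SCORE = 1 / 8   # eight possible responses per item, so 0.125


def adjusted_total(item_scores, item_answers, never_adjusted):
    """Sum item scores, using the chance score for discontinued items.

    item_scores and item_answers are dicts keyed by a (hypothetical) item label.
    never_adjusted holds the labels of the first three items in each subtest,
    whose scores are always used unadjusted.
    """
    total = 0.0
    for item, score in item_scores.items():
        if item not in never_adjusted and item_answers.get(item) == DISCONTINUED:
            total += CHANCE_SCORE
        elif score is not None:
            # Missing scores are skipped, as SUM does in the SPSS syntax.
            total += score
    return total


# Hypothetical twin: two correct answers, one wrong, then two discontinued items.
scores = {"c01": 1, "c03": 1, "c05": 0, "c07": 0, "c09": 0}
answers = {"c01": 3, "c03": 5, "c05": 2, "c07": -2, "c09": -2}
example_total = adjusted_total(scores, answers, never_adjusted={"c01", "c03", "c05"})
```

In this made-up example the two discontinued items each contribute 0.125, so the adjusted total is 2.25 rather than the unadjusted 2.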

nscatm1/2

Mean item answer time (in seconds) for the twin Science web test.

COMPUTE nscatm1 = MEAN(nsc01at1,nsc02Aat1,nsc02Bat1,nsc03at1,nsc04at1,
 nsc05Aat1,nsc05Bat1,nsc06at1,nsc07at1,nsc08at1,
 nsc09Aat1,nsc09Bat1,nsc09Cat1,nsc09Dat1,nsc10Aat1,
 nsc10Bat1,nsc10Cat1,nsc11Aat1,nsc11Bat1,nsc11Cat1,
 nsc11Dat1,nsc12at1,nsc13at1,nsc14Aat1,nsc14Bat1,
 nsc14Cat1,nsc15at1,nsc16at1,nsc17Aat1,nsc17Bat1,
 nsc18Aat1,nsc18Bat1,nsc18Cat1,nsc19at1,nsc20at1,
 nsc21at1,nsc22at1,nsc23at1,nsc24at1).

nsctota1/2

Adjusted total score for the Science web test.
Scores are adjusted for twins who discontinued at any point in the test, by replacing missing item scores with 'chance' item scores: the mean score from a random answer to the item. Note that different items have different response formats so the chance score varies from item to item. Discontinued items have previously been identified and given item response values of -2.

* Science test.
* Compute the adjusted item scores in new temporary variables.
* Copy existing item scores, but discontinued items (answers coded -2, number or string).
* will take chance scores - the latter vary from item to item due to different response formats.
* Twins can only discontinue after 5 wrong items, so do not recode the first 5 items.
* and do not recode item 9D: requires keyboard input such that chance score is effectively zero.
RECODE nsc05Asc1 nsc05Bsc1 nsc06sc1 nsc07sc1 nsc08sc1 
 nsc09Asc1 nsc09Bsc1 nsc09Csc1 nsc10Asc1 nsc10Bsc1 nsc10Csc1 nsc11Asc1 nsc11Bsc1 nsc11Csc1 
 nsc11Dsc1 nsc12sc1 nsc13sc1 nsc14Asc1 nsc14Bsc1 nsc14Csc1 nsc15sc1 nsc16sc1 nsc17Asc1 nsc17Bsc1 
 nsc18Asc1 nsc18Bsc1 nsc18Csc1 nsc19sc1 nsc20sc1 nsc21sc1 nsc22sc1 nsc23sc1 nsc24sc1
 (ELSE=COPY)
INTO nsc05Asc1a nsc05Bsc1a nsc06sc1a nsc07sc1a nsc08sc1a 
 nsc09Asc1a nsc09Bsc1a nsc09Csc1a nsc10Asc1a nsc10Bsc1a nsc10Csc1a nsc11Asc1a nsc11Bsc1a nsc11Csc1a 
 nsc11Dsc1a nsc12sc1a nsc13sc1a nsc14Asc1a nsc14Bsc1a nsc14Csc1a nsc15sc1a nsc16sc1a nsc17Asc1a nsc17Bsc1a 
 nsc18Asc1a nsc18Bsc1a nsc18Csc1a nsc19sc1a nsc20sc1a nsc21sc1a nsc22sc1a nsc23sc1a nsc24sc1a .
EXECUTE.
* Chance score is 0.25 where twins have to click one of four possible responses.
* and so on, varying from item to item.
IF (nsc05Aan1 = -2) nsc05Asc1a = 0.2.
IF (nsc05Ban1 = -2) nsc05Bsc1a = 0.25.
IF (nsc06an1 = '-2') nsc06sc1a = 1.
IF (nsc07an1 = -2) nsc07sc1a = 0.2.
IF (nsc08an1 = -2) nsc08sc1a = 0.25.
IF (nsc09Aan1 = -2) nsc09Asc1a = 0.25.
IF (nsc09Ban1 = -2) nsc09Bsc1a = 0.25.
IF (nsc09Can1 = -2) nsc09Csc1a = 0.25.
IF (nsc10Aan1 = '-2') nsc10Asc1a = 1.333.
IF (nsc10Ban1 = -2) nsc10Bsc1a = 0.25.
IF (nsc10Can1 = -2) nsc10Csc1a = 0.25.
IF (nsc11Aan1 = -2) nsc11Asc1a = 0.25.
IF (nsc11Ban1 = -2) nsc11Bsc1a = 0.25.
IF (nsc11Can1 = -2) nsc11Csc1a = 0.25.
IF (nsc11Dan1 = -2) nsc11Dsc1a = 0.25.
IF (nsc12an1 = -2) nsc12sc1a = 0.2.
IF (nsc13an1 = -2) nsc13sc1a = 0.25.
IF (nsc14Aan1 = -2) nsc14Asc1a = 0.143.
IF (nsc14Ban1 = -2) nsc14Bsc1a = 0.143.
IF (nsc14Can1 = -2) nsc14Csc1a = 0.25.
IF (nsc15an1 = '-2') nsc15sc1a = 0.012.
IF (nsc16an1 = -2) nsc16sc1a = 0.25.
IF (nsc17Aan1 = -2) nsc17Asc1a = 0.25.
IF (nsc17Ban1 = '-2') nsc17Bsc1a = 1.333.
IF (nsc18Aan1 = -2) nsc18Asc1a = 0.25.
IF (nsc18Ban1 = '-2') nsc18Bsc1a = 0.167.
IF (nsc18Can1 = -2) nsc18Csc1a = 0.25.
IF (nsc19an1 = '-2') nsc19sc1a = 0.042.
IF (nsc20an1 = '-2') nsc20sc1a = 0.012.
IF (nsc21an1 = -2) nsc21sc1a = 0.25.
IF (nsc22an1 = -2) nsc22sc1a = 0.25.
IF (nsc23an1 = -2) nsc23sc1a = 0.2.
IF (nsc24an1 = '-2') nsc24sc1a = 0.1.
EXECUTE.

* The adjusted test score is now the sum of these adjusted item scores.
* plus the unadjusted item scores for the first five items and item 9D.
* Only compute the adjusted test score for twins with valid Science test data.
DO IF (nscstat1 = 2).
 COMPUTE nsctota1 = SUM(nsc01sc1, nsc02Asc1, nsc02Bsc1, nsc03sc1, nsc04sc1,
 nsc05Asc1a, nsc05Bsc1a, nsc06sc1a, nsc07sc1a, nsc08sc1a, nsc09Asc1a, nsc09Bsc1a,
 nsc09Csc1a, nsc09Dsc1, nsc10Asc1a, nsc10Bsc1a, nsc10Csc1a, nsc11Asc1a,
 nsc11Bsc1a, nsc11Csc1a, nsc11Dsc1a, nsc12sc1a, nsc13sc1a, nsc14Asc1a,
 nsc14Bsc1a, nsc14Csc1a, nsc15sc1a, nsc16sc1a, nsc17Asc1a, nsc17Bsc1a, 
 nsc18Asc1a, nsc18Bsc1a, nsc18Csc1a, nsc19sc1a, nsc20sc1a, nsc21sc1a,
 nsc22sc1a, nsc23sc1a, nsc24sc1a).
END IF.
EXECUTE.
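
The Science adjustment has two features not present in the other tests: the chance score differs by item, and the discontinue code may be stored as a number or as a string. A rough Python sketch of that logic follows. It is not the TEDS derivation: the function names are hypothetical, and only a few of the chance scores from the syntax above are reproduced.

```python
# A few chance scores copied from the syntax above (not the full list).
CHANCE_SCORES = {"nsc05A": 0.2, "nsc05B": 0.25, "nsc06": 1.0, "nsc14A": 0.143}


def is_discontinued(answer):
    """True if the answer holds the discontinue code -2, as a number or a string."""
    try:
        return float(answer) == -2.0
    except (TypeError, ValueError):
        return False


def adjusted_item_score(item, score, answer):
    """Return the chance score for a discontinued, adjustable item, else the score."""
    if item in CHANCE_SCORES and is_discontinued(answer):
        return CHANCE_SCORES[item]
    return score
```

Items absent from the lookup, such as the first five items and item 9D, keep their original scores even when the answer is coded -2.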

nslla1r1/2

Inference of the foreign language assessed by teachers in cohort 1.
Note that in cohort 1, teachers were asked to give ratings for a foreign language (at most one per twin), but they were not explicitly asked to state which language was assessed. In the syntax below, the assessed language is inferred from other related items in the teacher questionnaire.

* Cohort 1 were not asked which foreign language was assessed.
* but in some cases the language can be inferred from the data.
* Start by checking criteria for useable data (these criteria must apply to both twins in each pair).
* using temporary variables that will be dropped at the end of this script.

* Condition A: English must be used all or most of the time (coded 1) at home, for both twins.
COMPUTE conda = 0.
EXECUTE.
IF (nslhleng1 = 1 & nslhleng2 = 1 & nslhlengu1 = 1 & nslhlengu2 = 1) conda = 1.
EXECUTE.

* Condition B: English (coded 1) must be the main language used at school, for both twins.
COMPUTE condb = 0.
EXECUTE.
IF (nslscla1 = 1 & nslscla2 = 1) condb = 1.
EXECUTE.

* Condition C: for each twin, there must be just one foreign language studied at school.
* including Welsh (code 2) but not English (code 1).
* Need to check each of 4 responses for each twin, then count the foreign languages.
* Non-English languages have codes greater than 1 in the language name variables.
COUNT schoolforeignlangs1 = nslls1n1 nslls2n1 nslls3n1 nslls4n1 (2 thru highest).
COUNT schoolforeignlangs2 = nslls1n2 nslls2n2 nslls3n2 nslls4n2 (2 thru highest).
EXECUTE.
* Now condition C is met if both twins have exactly one foreign language taught at school.
COMPUTE condc = 0.
EXECUTE.
IF (schoolforeignlangs1 = 1 & schoolforeignlangs2 = 1) condc = 1.
EXECUTE.

* Now compute coded twin-specific variables to detect whether the twins have SLQ data.
* and are in cohort 1, and whether the three conditions are met.
* and whether language assessment data have been recorded.
DO IF (nslqdata = 1 & cohort > 1).
 * has SLQ data but not in cohort 1: code 0.
 COMPUTE nslla1r1 = 0.
 COMPUTE nslla1r2 = 0.
ELSE IF (nslqdata = 1 & cohort = 1 & (conda = 0 | condb = 0 | condc = 0)).
 * has SLQ data, in cohort 1, but does not meet all three conditions: code 1.
 COMPUTE nslla1r1 = 1.
 COMPUTE nslla1r2 = 1.
ELSE IF (nslqdata = 1 & cohort = 1 & conda = 1 & condb = 1 & condc = 1).
 * has SLQ data, in cohort 1, and meets all three conditions.
 * code 2 if foreign language TA level is missing.
 IF (SYSMIS(nslla1t1)) nslla1r1 = 2.
 IF (SYSMIS(nslla1t2)) nslla1r2 = 2.
 * code 3 if language TA is present and language is already specified.
 IF (nslla1t1 >= 0 & ~SYSMIS(nslla1n1)) nslla1r1 = 3.
 IF (nslla1t2 >= 0 & ~SYSMIS(nslla1n2)) nslla1r2 = 3.
 * code 4 if language TA is present but language name is missing.
 * (these are the cases where we can recode the missing language name).
 IF (nslla1t1 >= 0 & SYSMIS(nslla1n1)) nslla1r1 = 4.
 IF (nslla1t2 >= 0 & SYSMIS(nslla1n2)) nslla1r2 = 4.
END IF.
EXECUTE.

* Finally, for useable cases (coded 4) copy the taught language to the assessed language.
* as long as the taught language is not English (coded 1).
* and is not one of the rare cases of Asian (7), classics (8) or 'other' (9).
* In other words, it should be one of the common languages coded 2-6.
DO IF (nslla1r1 = 4).
 IF (RANGE(nslls1n1,2,6)) nslla1n1 = nslls1n1.
 IF (RANGE(nslls2n1,2,6)) nslla1n1 = nslls2n1.
 IF (RANGE(nslls3n1,2,6)) nslla1n1 = nslls3n1.
 IF (RANGE(nslls4n1,2,6)) nslla1n1 = nslls4n1.
END IF.
DO IF (nslla1r2 = 4).
 IF (RANGE(nslls1n2,2,6)) nslla1n2 = nslls1n2.
 IF (RANGE(nslls2n2,2,6)) nslla1n2 = nslls2n2.
 IF (RANGE(nslls3n2,2,6)) nslla1n2 = nslls3n2.
 IF (RANGE(nslls4n2,2,6)) nslla1n2 = nslls4n2.
END IF.
EXECUTE.
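
The coding scheme above can be summarised in a short Python sketch. This is an informal restatement under stated assumptions, not the TEDS derivation: all function and argument names are hypothetical, missing values are represented as None, and conditions A, B and C are passed in as pair-level true/false values.

```python
def count_foreign(taught_codes):
    """Count taught languages with codes of 2 or more (1 is English)."""
    return sum(1 for code in taught_codes if code is not None and code >= 2)


def reason_code(has_slq, cohort, cond_a, cond_b, cond_c, ta_level, assessed_name):
    """Return the 0-4 code described above, or None where no code applies."""
    if not has_slq:
        return None
    if cohort > 1:
        return 0                      # SLQ data, but not in cohort 1
    if cohort != 1:
        return None
    if not (cond_a and cond_b and cond_c):
        return 1                      # cohort 1, conditions not all met
    if ta_level is None:
        return 2                      # no foreign language TA level
    if ta_level >= 0:
        # 3: language already named; 4: name missing, so it can be inferred
        return 3 if assessed_name is not None else 4
    return None


def inferred_language(taught_codes):
    """Return the taught language coded 2-6, or None if there is none."""
    result = None
    for code in taught_codes:
        if code is not None and 2 <= code <= 6:
            result = code
    return result
```

Because condition C requires exactly one taught foreign language per twin, at most one of the four taught-language responses can supply the inferred value for a case coded 4.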

nt2a1/2, nt3a1/2

Academic achievement scales, derived from teacher ratings.
Note that teacher data were only collected in cohort 1.
There are two overall means: a 2-subject mean (nt2a1/2: English and maths) and a 3-subject mean (nt3a1/2: English, maths and science). Each is derived as a mean of teacher NC level ratings, and hence has the same value range (3-8) as the English, maths and science ratings themselves. Each mean is only computed where ratings are present for all of its component subjects.

* Overall 2-subject and 3-subject means.
* from teacher NC ratings.
COMPUTE nt2a1 = MEAN.2(nteng1, ntmat1).
COMPUTE nt3a1 = MEAN.3(nteng1, ntmat1, ntsci1).
EXECUTE.
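
In SPSS, the suffix in MEAN.2 and MEAN.3 sets the minimum number of valid arguments needed before a mean is returned. A minimal Python sketch of that behaviour is given below. The function name is hypothetical and missing ratings are represented as None.

```python
def mean_min_valid(values, min_valid):
    """Mean of the non-missing values, or None if fewer than min_valid are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical teacher NC levels for English, maths and science.
two_subject = mean_min_valid([5, 6], 2)            # both ratings present
three_subject = mean_min_valid([5, 6, None], 3)    # science missing
```

Here the 2-subject mean is 5.5, while the 3-subject mean is missing because only two of the three ratings are present.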

ntapsdt1/2

See npapsdt1/2, ntapsdt1/2 above.

ntcalt1/2

See npcalt1/2, ntcalt1/2 above.

ntconhit1/2

See ncconhit1/2, etc above.

ntconint1/2

See ncconint1/2, etc above.

ntconnt1/2

See ncconnt1/2, npconnt1/2, ntconnt1/2 above.

ntimpt1/2

See npimpt1/2, ntimpt1/2 above.

ntnart1/2

See npnart1/2, ntnart1/2 above.

ntottm1/2

Total time taken (in minutes) to complete all 3 web tests. Derived from item variables containing the time (in seconds) spent on each web test.

* Compute total web test time (in minutes).
* but only for twins who have finished all tests.
COMPUTE ntottm1 = (1/60) * SUM.3(nvctime1,nrvtime1,nsctime1).
EXECUTE.
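
As a rough illustration in Python, the total is only produced when all three test times are present, mirroring the SUM.3 requirement in the syntax above. The function name is hypothetical and missing times are represented as None.

```python
def total_minutes(times_in_seconds):
    """Total of three test times in minutes, or None if any time is missing."""
    if len(times_in_seconds) != 3 or any(t is None for t in times_in_seconds):
        return None
    return sum(times_in_seconds) / 60


# Hypothetical times (in seconds) for the three web tests.
example_minutes = total_minutes([600, 900, 1200])
```

With these made-up times the total is 2700 seconds, or 45 minutes.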

ntqage1/2

Age of twin (in decimal years) when the teacher questionnaire was returned.
Derived from the return date of the teacher questionnaire (ntqdate1/2); the questionnaire may have been returned either on paper or online. The variable aonsdob is the twin birth date. These date variables are not retained in the dataset.

* Age when teacher questionnaire returned (on paper or online).
COMPUTE ntqage1 = RND(DATEDIFF(ntqdate1, aonsdob, 'days') / 365.25, 0.1) .
COMPUTE ntqage2 = RND(DATEDIFF(ntqdate2, aonsdob, 'days') / 365.25, 0.1) .
EXECUTE.
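
The same age calculation can be sketched in Python as the number of days between the two dates, divided by 365.25 and rounded to one decimal place. The function name and the example dates are hypothetical. Python's round may treat exact ties differently from SPSS RND, so values lying exactly halfway between two tenths could differ.

```python
from datetime import date


def age_in_decimal_years(event_date, birth_date):
    """Age at event_date in years, to one decimal place, or None if a date is missing."""
    if event_date is None or birth_date is None:
        return None
    days = (event_date - birth_date).days
    return round(days / 365.25, 1)


# Hypothetical twin born 1 January 1994, questionnaire returned 1 July 2008.
example_age = age_in_decimal_years(date(2008, 7, 1), date(1994, 1, 1))
```

For these made-up dates the interval is 5295 days, giving an age of 14.5 years.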

ntqLLCage1/2, ntqLLCdate1/2

See ncqLLCage1/2, etc above.

ntvicph1/2, ntvicpr1/2, ntvicso1/2, ntvicve1/2

See ncvicph1/2, etc above.

nvcatm1/2

Mean item answer time (in seconds) for the twin Vocabulary web test.

COMPUTE nvcatm1 = MEAN(nvc01at1,nvc02at1,nvc03at1,nvc04at1,nvc05at1,nvc06at1,nvc07at1,nvc08at1,
 nvc09at1,nvc10at1,nvc11at1,nvc12at1,nvc13at1,nvc14at1,nvc15at1,nvc16at1,
 nvc17at1,nvc18at1,nvc19at1,nvc20at1,nvc21at1,nvc22at1,nvc23at1,nvc24at1,
 nvc25at1,nvc26at1,nvc27at1).

nvctota1/2

Adjusted total score for the Vocabulary web test.
Scores are adjusted for twins who discontinued at any point in the test, by replacing missing item scores with 'chance' item scores: the mean score from a random answer to the item. Discontinued items have previously been identified and recoded with item response value -2.

* Vocabulary test.
* Compute the adjusted item scores in new temporary variables.
* Copy existing item scores, but replace missing item scores (due to discontinue).
* with chance scores - the latter vary between items due to different scoring rules.
* Twins can only discontinue after 5 wrong items, so do not recode the first 5 items.
RECODE nvc06sc1 nvc07sc1 nvc08sc1 nvc09sc1 nvc10sc1 nvc11sc1 nvc12sc1
 nvc13sc1 nvc14sc1 nvc15sc1 nvc16sc1 nvc17sc1 nvc18sc1 nvc19sc1 nvc20sc1
 nvc21sc1 nvc22sc1 nvc23sc1 nvc24sc1 nvc25sc1 nvc26sc1 nvc27sc1
 (ELSE=COPY)
INTO nvc06sc1a nvc07sc1a nvc08sc1a nvc09sc1a nvc10sc1a nvc11sc1a nvc12sc1a 
 nvc13sc1a nvc14sc1a nvc15sc1a nvc16sc1a nvc17sc1a nvc18sc1a nvc19sc1a nvc20sc1a 
 nvc21sc1a nvc22sc1a nvc23sc1a nvc24sc1a nvc25sc1a nvc26sc1a nvc27sc1a .
EXECUTE.
* Where discontinued (response=-2), replace score of 0 with chance score.
* Most items have 4 possible responses, with one 2-point and one 1-point correct response.
* hence chance score is (2+1)/4=0.75 .
* Items 2 and 10 have only 3 possible responses, with just one correct 2-point response: chance score is 2/3=0.667 .
* Items 1, 6 and 9 have 4 possible responses, with just one correct 2-point response: chance score is 2/4=0.5 .
IF (nvc06an1 = -2) nvc06sc1a = 0.5.
IF (nvc07an1 = -2) nvc07sc1a = 0.75.
IF (nvc08an1 = -2) nvc08sc1a = 0.75.
IF (nvc09an1 = -2) nvc09sc1a = 0.5.
IF (nvc10an1 = -2) nvc10sc1a = 0.667.
IF (nvc11an1 = -2) nvc11sc1a = 0.75.
IF (nvc12an1 = -2) nvc12sc1a = 0.75.
IF (nvc13an1 = -2) nvc13sc1a = 0.75.
IF (nvc14an1 = -2) nvc14sc1a = 0.75.
IF (nvc15an1 = -2) nvc15sc1a = 0.75.
IF (nvc16an1 = -2) nvc16sc1a = 0.75.
IF (nvc17an1 = -2) nvc17sc1a = 0.75.
IF (nvc18an1 = -2) nvc18sc1a = 0.75.
IF (nvc19an1 = -2) nvc19sc1a = 0.75.
IF (nvc20an1 = -2) nvc20sc1a = 0.75.
IF (nvc21an1 = -2) nvc21sc1a = 0.75.
IF (nvc22an1 = -2) nvc22sc1a = 0.75.
IF (nvc23an1 = -2) nvc23sc1a = 0.75.
IF (nvc24an1 = -2) nvc24sc1a = 0.75.
IF (nvc25an1 = -2) nvc25sc1a = 0.75.
IF (nvc26an1 = -2) nvc26sc1a = 0.75.
IF (nvc27an1 = -2) nvc27sc1a = 0.75.
EXECUTE.

* The adjusted test score is now the sum of these adjusted item scores.
* plus the unadjusted item scores for the first five items.
* Only compute the adjusted test score for twins with valid test data.
DO IF (nvcstat1 = 2).
 COMPUTE nvctota1 = SUM(nvc01sc1, nvc02sc1, nvc03sc1, nvc04sc1,
 nvc05sc1, nvc06sc1a, nvc07sc1a, nvc08sc1a, nvc09sc1a, nvc10sc1a,
 nvc11sc1a, nvc12sc1a, nvc13sc1a, nvc14sc1a, nvc15sc1a, nvc16sc1a, 
 nvc17sc1a, nvc18sc1a, nvc19sc1a, nvc20sc1a, nvc21sc1a, nvc22sc1a,
 nvc23sc1a, nvc24sc1a, nvc25sc1a, nvc26sc1a, nvc27sc1a).
END IF.
EXECUTE.
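
The three chance scores used above follow from the expected score when one response is chosen at random with equal probability. The arithmetic can be checked with a short Python sketch. The function name is hypothetical, and each list gives the points awarded for each possible response to an item.

```python
from fractions import Fraction


def chance_score(points_per_response):
    """Expected score from choosing one response uniformly at random."""
    return Fraction(sum(points_per_response), len(points_per_response))


most_items = chance_score([2, 1, 0, 0])      # one 2-point and one 1-point response
three_option = chance_score([2, 0, 0])       # items 2 and 10
single_correct = chance_score([2, 0, 0, 0])  # items 1, 6 and 9
```

These evaluate to 3/4, 2/3 and 1/2, matching the values 0.75, 0.667 and 0.5 in the syntax.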