TEDS Data Dictionary

Derived Variables in the 4 Year Dataset

This page gives a listing of derived variables in the 4 Year dataset, in alphabetical order of variable name. For each variable, a short written description is followed (in a box) by the SPSS syntax that was used to derive the variable.

This page does not include descriptions of background variables that are derived from other sources and that are included in the 4 Year dataset. For information about such variables, see pages describing background variables, exclusions and scrambled IDs.

Most of the twin-specific variables were derived prior to double entering the dataset. Hence the variable names used in the syntax often lack the endings (1 or 2) used for the final double entered variables.

Definitions of derived variables

Listed alphabetically

dadparc1/2

Total score for the four parent-administered Parca tests in the child booklet: Odd-one-out, Drawing, Draw-a-man, Puzzle.
Computed as a simple sum with unequal weightings for the tests, and re-scaled so the range of values is 0 to 1.

* Total score for the 4 parent-administered tests.
* requiring at least 3 of the 4 to be non-missing.
* The total is scaled from 0 to 1.
* Divide total by 46 if all 4 are non-missing.
COMPUTE dadparc = SUM.4(doddt, ddrawt, dpuzt, dmant) / 46.
EXECUTE.
* Divide total by 34 if dpuzt or dmant is missing.
* or divide total by 37 if ddrawt is missing.
* or divide total by 33 if doddt is missing.
IF (SYSMIS(doddt)) dadparc = SUM.3(ddrawt, dpuzt, dmant) / 33.
IF (SYSMIS(ddrawt)) dadparc = SUM.3(doddt, dpuzt, dmant) / 37.
IF (SYSMIS(dpuzt)) dadparc = SUM.3(doddt, ddrawt, dmant) / 34.
IF (SYSMIS(dmant)) dadparc = SUM.3(doddt, ddrawt, dpuzt) / 34.
EXECUTE.
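
The pro-rating in the syntax above can be sketched in Python for readers who do not use SPSS. This is an illustrative sketch only, not part of the TEDS syntax: the function name is hypothetical, None stands in for a system-missing value, and the per-test maximum scores are inferred from the divisors in the syntax (for example 46 - 33 = 13 for doddt) rather than stated in the source.

```python
# Maximum score per test, inferred from the divisors in the SPSS syntax
# (e.g. 46 - 33 = 13 for doddt). These maxima are an inference, not source facts.
MAX_SCORES = {"doddt": 13, "ddrawt": 9, "dpuzt": 12, "dmant": 12}


def dadparc(scores, min_valid=3):
    """Return the total scaled to 0-1, or None if fewer than min_valid tests are present.

    scores maps each test name to its total, with None meaning missing.
    """
    present = {name: value for name, value in scores.items() if value is not None}
    if len(present) < min_valid:
        return None
    # Divide by the maximum attainable on the tests that are actually present.
    return sum(present.values()) / sum(MAX_SCORES[name] for name in present)
```

With ddrawt missing, for example, the divisor becomes 13 + 12 + 12 = 37, which matches the corresponding IF statement in the syntax.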
dadparn1/2

Standardised mean score for three of the four parent-administered Parca tests in the child booklet. (The variable dadparn is used to derive cognitive composites drawg and dscnv, which are described elsewhere on this page.)
Dadparn is derived as a mean of the standardised scores, giving equal weighting to each test. The variables are standardised on the non-excluded sample, defined by variable exclude1.

* Filter out all the standard exclusions for the twin.
* (medical, perinatal, unknown sex/zyg, missing 1st Contact).
* This will affect all standardised variables derived below.
USE ALL.
COMPUTE filter_$=(exclude1 = 0).
VARIABLE LABEL filter_$ 'exclude1 = 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* Standardised mean for the parent-administered tests.
* Note this includes just three tests (not draw-a-man).
* First standardise the test scores.
DESCRIPTIVES VARIABLES=doddt (zdoddt) ddrawt (zddrawt)
 dpuzt (zdpuzt) /SAVE.
* Find the mean - require at least two to be non-missing.
COMPUTE adparn = MEAN.2(zdoddt, zddrawt, zdpuzt).
EXECUTE.
* Now standardise this mean.
DESCRIPTIVES VARIABLES=adparn (dadparn) /SAVE.

* Remove filter: no longer needed now standardisations completed.
FILTER OFF.
USE ALL.
EXECUTE.
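
The three steps above (standardise each test on the non-excluded sample, average with a minimum number of valid scores, then standardise the mean) can be sketched in Python as follows. This is a hedged illustration, not the TEDS implementation: the function names are hypothetical, None stands in for system-missing, and the sketch assumes that cases outside the filter receive no standardised value.

```python
from statistics import mean, stdev


def zscores(values, selected):
    """Z-scores using the mean and sample SD (n - 1) of selected, non-missing cases.

    Unselected or missing cases get None (an assumption of this sketch).
    """
    used = [v for v, s in zip(values, selected) if s and v is not None]
    m, sd = mean(used), stdev(used)
    return [(v - m) / sd if s and v is not None else None
            for v, s in zip(values, selected)]


def mean_min(values, min_valid):
    """Mean of the non-missing values, or None if fewer than min_valid are present."""
    present = [v for v in values if v is not None]
    return mean(present) if len(present) >= min_valid else None


def standardised_mean(tests, selected, min_valid=2):
    """tests is a list of equal-length score lists, one list per test."""
    z = [zscores(t, selected) for t in tests]
    raw = [mean_min(row, min_valid) for row in zip(*z)]
    # Standardise the mean itself so the result has mean 0 and SD 1.
    return zscores(raw, selected)
```

The mean_min helper plays the role of MEAN.2 in the syntax: a case with only one valid test score receives no composite.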
dagemdex, dagestex

Exclusion variables (coded 1=exclude, 0=not excluded) based on twin age criteria. The criteria are strict for variable dagestex, and moderate for variable dagemdex. See comments in syntax below for detailed exclusion criteria.
Derived from double-entered twin age variables dtwbage1/2 which are described elsewhere on this page.

* Moderate version: each twin age no greater than 54/12 (4 years 6 months).
* and no less than 46/12 (3 years 10 months).
* Applies only to twin measures in the twin booklet (not parent booklet age).
COMPUTE dagemdex = 0.
EXECUTE.
IF (dtwbage1 > (54/12) | dtwbage2 > (54/12)) dagemdex = 1.
IF (dtwbage1 < (46/12) | dtwbage2 < (46/12)) dagemdex = 1.
EXECUTE.

* Strict version: each twin age no greater than 51/12 (4 years 3 months).
* and no less than 46/12 (3 years 10 months).
* and age differences no more than 2 months.
COMPUTE dagestex = 0.
EXECUTE.
IF (dtwbage1 > (51/12) | dtwbage2 > (51/12)) dagestex = 1.
IF (dtwbage1 < (46/12) | dtwbage2 < (46/12)) dagestex = 1.
EXECUTE.
IF (dtwbagediff > (2/12)) dagestex = 1.
EXECUTE.
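
As a cross-check of the criteria above, the two flags can be expressed as a small Python function. This is a sketch under stated assumptions: the function and helper names are hypothetical, ages are in years, None stands in for system-missing, and a missing age never triggers an exclusion on its own (which is how the IF statements above behave, because the flags start at 0).

```python
def _gt(value, limit):
    """True only if value is present and greater than limit."""
    return value is not None and value > limit


def _lt(value, limit):
    """True only if value is present and less than limit."""
    return value is not None and value < limit


def age_exclusions(age1, age2, age_diff):
    """Return (dagemdex, dagestex) as 0/1 flags for a twin pair."""
    too_young = _lt(age1, 46 / 12) or _lt(age2, 46 / 12)
    moderate = int(too_young or _gt(age1, 54 / 12) or _gt(age2, 54 / 12))
    strict = int(too_young or _gt(age1, 51 / 12) or _gt(age2, 51 / 12)
                 or _gt(age_diff, 2 / 12))
    return moderate, strict
```

For instance, a pair aged 4.4 years with no age difference passes the moderate criteria but fails the strict ones.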
dalg2zy, dalgzyg

Zygosity (1=MZ, 2=DZ, 5=indeterminate, 99=inconsistent data) derived from the 4 year parent booklet data using the zygosity algorithm.
Variable dalg2zy derives purely from the booklet data, while dalgzyg also takes into account knowledge of twin sexes.
Computed from derived variable dtempzyg, which is described elsewhere on this page, from booklet item variables, and from twin sexes sex1/2 (from admin data).

* First check whether twins have different sexes.
COMPUTE sexdif = ABS(sex1 - sex2).
EXECUTE.

* Now zygosity from algorithm can be derived.
* using the computed value of dtempzyg.
* Start with default value of 5 (indeterminate).
COMPUTE dalgzyg = 5.
EXECUTE.
* Now use the difference score: 0.64 or less means MZ.
* 0.70 or more means DZ.
IF (dtempzyg <= 0.64) dalgzyg = 1.
IF (dtempzyg >= 0.70) dalgzyg = 2.
EXECUTE.
* Now over-rule the score and conclude DZ if there are clear differences.
* in eye colour, hair shade or hair texture, or if they look very different. 
IF (dzyhairs = 3 | dzyhairt = 3 | dzyeyes = 3 | dzypeas = 3) dalgzyg = 2.
EXECUTE.
* Also over-rule score if alike as two peas in a pod (conclude MZ).
IF (dzypeas = 1) dalgzyg = 1.
EXECUTE.
* But if the latter clashes with clear differences in hair/eyes.
* then the result is inconsistent (value 99).
IF ((dzypeas = 1) & (dzyhairs = 3 | dzyhairt = 3 | dzyeyes = 3)) dalgzyg = 99.
EXECUTE.
* Copy the result into a second variable representing the derived.
* zygosity without reference to information about twin sexes.
* This will be used for admin purposes, to track changes in estimated.
* zygosity for pairs where the twin sexes are updated.
COMPUTE dalg2zy = dalgzyg.
EXECUTE.
* Finally, for dalgzyg but not dalg2zy, over-rule all other data.
* if twins have opposite sexes (DZ) or sexes are unknown.
IF (sexdif = 1) dalgzyg = 2.
IF (SYSMIS(sexdif)) dalgzyg = 5.
EXECUTE.
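
The order in which the rules above over-rule each other is easier to follow as a single function. The Python sketch below mirrors that order; it is illustrative only. The function name is hypothetical, None stands in for system-missing, and sexes are assumed to be coded as two consecutive integers so that a difference of 1 means opposite sexes.

```python
def _eq(value, code):
    """True only if value is present and equal to code."""
    return value is not None and value == code


def zygosity(dtempzyg, dzyhairs, dzyhairt, dzyeyes, dzypeas, sex1, sex2):
    """Return (dalg2zy, dalgzyg): 1=MZ, 2=DZ, 5=indeterminate, 99=inconsistent."""
    zyg = 5  # default: indeterminate
    if dtempzyg is not None:
        if dtempzyg <= 0.64:
            zyg = 1
        if dtempzyg >= 0.70:
            zyg = 2
    clear_diff = any(_eq(x, 3) for x in (dzyhairs, dzyhairt, dzyeyes))
    if clear_diff or _eq(dzypeas, 3):
        zyg = 2  # clear physical differences: DZ
    if _eq(dzypeas, 1):
        zyg = 1  # alike as two peas in a pod: MZ
    if _eq(dzypeas, 1) and clear_diff:
        zyg = 99  # the two over-rules clash
    dalg2zy = zyg  # version that ignores twin sexes
    if sex1 is None or sex2 is None:
        zyg = 5
    elif abs(sex1 - sex2) == 1:
        zyg = 2
    return dalg2zy, zyg
```

A difference score between 0.64 and 0.70 with no decisive booklet items leaves both variables at 5.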
danxocbt1/2, danxshyt1/2, danxfeart1/2, danxnafft1/2, danxncogt1/2, danxt1/2

ARBQ anxiety scales.
danxocbt1/2: OCB (obsessive-compulsive behaviours).
danxshyt1/2: shyness, also called social anxiety or behavioural inhibition.
danxfeart1/2: fear.
danxnafft1/2: negative affect.
danxncogt1/2: negative cognition.
danxt1/2: overall total anxiety scale.
These ARBQ scales are comparable with those used at other ages (3, 7, 9, 16), subject to many variations in the items that were included at each age. As at later ages, the ARBQ scales are derived from SDQ emotion items in addition to the available ARBQ measure items. At age 4, we have 12 ARBQ items plus 4 SDQ emotion items, which is more than at age 3 but fewer than at ages 7/9/16. This makes it possible to derive all five subscales used at later ages, but with somewhat reduced numbers of contributing items. Two Behar emotion items have also been retained at age 4 from age 3; these two items are used in the total anxiety scale but not in the subscales. Additional details are given in the syntax below.
Published papers are broadly in agreement about the items used to make the subscales. At all ages where the items are present, the following decisions have been made about the use of the items:
(a) anx09, "twitches", and sdqemo1, "headaches", are not used in any of the subscales, in agreement with at least two papers. One paper used anx09 in OCB and another used it in negative affect, while one paper used sdqemo1 in fear.
(b) anx05, "asks for reassurance", is used in the negative cognition subscale, as in at least two papers. It was used in OCB by another paper.
(c) anx06, "doing things over and over", is used in the OCB subscale, in agreement with at least two papers. It was used in negative affect in another paper.

* ARBQ (anxiety) scales.
* Comparable with ARBQ scales at ages 3, 7, 9 and 16.
* Note differences from age 3 which has fewer ARBQ items.
* (also Behar items in place of some SDQ items).
* and differences from ages 7/9/16 where there are more ARBQ items.

* Shyness (social anxiety) subscale, from the same 3 items as at age 3.
* (at later ages, anx23 is dropped and more items are added).
COMPUTE danxshyt = 3 * MEAN.2(danx02, danx07, danx23r).

* Obsessive-compulsive behaviour subscale, from 4 items.
* (same as at age 7; items not present at age 3).
COMPUTE danxocbt = 4 * MEAN.2(danx04, danx06, danx14, danx21).

* Fear subscale, from 3 items, adding SDQ item sdqemo5.
* to the 2 items used at age 3.
* (after age 4, anx22 is dropped and more items are added).
COMPUTE danxfeart = 3 * MEAN.2(danx20, danx22, dsdqemo5).

* Negative cognition subscale, from just 2 items, one of which is SDQ.
* (items not present at age 3; more items were added from age 7).
COMPUTE danxncogt = 2 * MEAN.1(danx05, dsdqemo2).

* Negative affect subscale, from 2 items.
* (items not present at age 3; more items were added from age 7).
COMPUTE danxnafft = 2 * MEAN.1(danx16, dsdqemo3).

* Total anxiety scale, from all 12 ARBQ items.
* plus the 4 SDQ emotion items and 2 Behar emotion items.
COMPUTE danxt = 18 * MEAN.9(danx02, danx04, danx05, danx06, danx07, danx09, 
  danx14, danx16, danx22, danx20, danx21, danx23r,
  dsdqemo1, dsdqemo2, dsdqemo3, dsdqemo5, dbeh33, dbeh42).
EXECUTE.
* This includes the 14 items from the subscales above.
* plus anx09 (twitches) and sdqemo1 (headaches) that are not used.
* in any subscales, plus the two behar emotion items carried.
* over from age 3 and also not used in any subscales.
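
Each scale above follows the same pattern: take the mean of the non-missing items, require a minimum number of them, and multiply by the number of items so that the result is on the scale of a full total. A minimal Python sketch of that pattern, with a hypothetical function name and None standing in for system-missing:

```python
def prorated_total(items, min_valid):
    """Mean of non-missing items times the item count, like k * MEAN.n(...) in SPSS.

    Returns None if fewer than min_valid items are present.
    """
    present = [v for v in items if v is not None]
    if len(present) < min_valid:
        return None
    return len(items) * sum(present) / len(present)
```

For a 3-item subscale with item scores 1 and 2 and one missing item, the result is 3 * 1.5 = 4.5, whereas a plain sum would give 3.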
dcont1/2, demot1/2, dhypt1/2, dprot1/2

Overall behaviour scales derived from SDQ, Behar and additional hyperactivity items.
dcont1/2: Conduct scale, from 5 SDQ and 5 Behar items.
dhypt1/2: Hyperactivity scale, from 5 SDQ, 2 Behar and 3 additional hyperactivity items.
dprot1/2: Prosocial scale, from 5 SDQ and 3 Behar items.
demot1/2: Emotion scale, from 4 SDQ and 2 Behar items.
These scales are similar to, and comparable with, the Behar scales used at ages 2 and 3, although at age 4 many Behar items have been replaced with SDQ items.
See comments in syntax below for further details. At least half of the component items are required to be non-missing for each scale.

* Overall scales, including SDQ + Behar + Hyperactivity.
* In other words, all related, correlating items for hyperactivity.
*  conduct, emotion and prosocial.
* Note that there is no overall peer problems scale here because.
*  the only available items are SDQ, already scaled above.
* These scales are similar to equivalent Behar scales used at ages 2 and 3.
* except that, at age 4, SDQ items have replaced many of the Behar items.
* along with minor differences between ages 2 and 3.
* Hyperactivity: 10 items.
* (5 SDQ, 2 Behar and 3 additional hyperactivity items).
COMPUTE dhypt = 10 * MEAN.5(dsdqhyp1, dsdqhyp2, dsdqhyp3, dsdqhyp4r, dsdqhyp5r, 
  dhyp1, dhyp2, dhyp3, dbeh30, dbeh37).
EXECUTE.
* Conduct: 10 items.
* (5 SDQ, 5 Behar).
COMPUTE dcont = 10 * MEAN.5(dsdqcon1, dsdqcon2r, dsdqcon3, dsdqcon4, dsdqcon5, 
 dbeh05, dbeh12, dbeh35, dbeh38, dbeh40).
EXECUTE.
* Prosocial: 8 items.
* (5 SDQ, 3 Behar).
COMPUTE dprot = 8 * MEAN.4(dsdqpro1, dsdqpro2, dsdqpro3, dsdqpro4, dsdqpro5,
  dbeh01, dbeh41, dbeh43).
EXECUTE.
* Emotion: 6 items.
* (4 SDQ, 2 Behar).
COMPUTE demot = 6 * MEAN.3(dsdqemo1, dsdqemo2, dsdqemo3, dsdqemo5, 
 dbeh33, dbeh42).
EXECUTE.

* Note that some Behar items (dbeh06, 16, 24, 26, 27, 28, 34) have not been included.
*  in any scales above because they show little if any correlation.
*  or do not clearly belong to the given traits (conduct, emotion, etc).
*  These same Behar items are not used in scales at ages 2 and 3.
demot1/2

See the combined entry for dcont1/2, demot1/2, dhypt1/2 and dprot1/2 above.

dbpamus1/2, dbpangr1/2, dbpaway1/2, dbpclos1/2, dbpfrus1/2, dbphapp1/2, dbpimpa1/2, ddiasko1/2, ddiexpl1/2, ddifirm1/2, ddijoke1/2, ddishou1/2, ddismak1/2, dgboard1/2, dgbooks1/2, dgmessy1/2, dgmusi1/2, dgphys1/2, dgpuzz1/2, dgtapes1/2, dgzoo1/2, dtbook1/2, dteat1/2, dtloca1/2, dtpron1/2, dtrhym1/2, dtsent1/2, dttalk1/2, dtword1/2

Standardised twin-specific versions of items in the parent booklet, originally coded as elder twin values and younger twin differences. See comments in the syntax below for details of how the variables are derived.
These twin-specific items are used to derive scales (described elsewhere on this page) for parental feelings, discipline, twin talk and twin games.

* Make twin-specific variables from difference variables.
* first standardise all the raw items.
DESCRIPTIVES VARIABLES= dtrhym dtrhymd dtpron dtprond dtsent dtsentd
 dtword dtwordd dtloca dtlocad dtbook dtbookd dttalk dttalkd dteat dteatd
 dgmessy dgmessyd dgpuzz dgpuzzd dgmusi dgmusid dgtapes dgtapesd
 dgbooks dgbooksd dgzoo dgzood dgphys dgphysd dgboard dgboardd
 ddismak ddismakd ddishou ddishoud ddiexpl ddiexpld ddifirm ddifirmd
 ddijoke ddijoked ddiasko ddiaskod dbpimpa dbpimpad dbphapp dbphappd
 dbpamus dbpamusd dbpaway dbpawayd dbpangr dbpangrd dbpclos dbpclosd dbpfrus dbpfrusd
 /SAVE.
* The standardised elder twin item is the twin 1 variable - rename.
RENAME VARIABLES (Zdtrhym Zdtpron Zdtsent Zdtword Zdtloca Zdtbook
 Zdttalk Zdteat Zdgmessy Zdgpuzz Zdgmusi Zdgtapes Zdgbooks Zdgzoo
 Zdgphys Zdgboard Zddismak Zddishou Zddiexpl Zddifirm Zddijoke
 Zddiasko Zdbpimpa Zdbphapp Zdbpamus Zdbpaway Zdbpangr Zdbpclos Zdbpfrus
 = dtrhym1 dtpron1 dtsent1 dtword1 dtloca1 dtbook1 
 dttalk1 dteat1 dgmessy1 dgpuzz1 dgmusi1 dgtapes1 dgbooks1 dgzoo1 
 dgphys1 dgboard1 ddismak1 ddishou1 ddiexpl1 ddifirm1 ddijoke1 
 ddiasko1 dbpimpa1 dbphapp1 dbpamus1 dbpaway1 dbpangr1 dbpclos1 dbpfrus1).
* set variable widths and variable levels (treat as scales).
FORMATS dtrhym1 dtpron1 dtsent1 dtword1 dtloca1 dtbook1 
 dttalk1 dteat1 dgmessy1 dgpuzz1 dgmusi1 dgtapes1 dgbooks1 dgzoo1 
 dgphys1 dgboard1 ddismak1 ddishou1 ddiexpl1 ddifirm1 ddijoke1 
 ddiasko1 dbpimpa1 dbphapp1 dbpamus1 dbpaway1 dbpangr1 dbpclos1 dbpfrus1 (F4.3).
VARIABLE LEVEL dtrhym1 dtpron1 dtsent1 dtword1 dtloca1 dtbook1 
 dttalk1 dteat1 dgmessy1 dgpuzz1 dgmusi1 dgtapes1 dgbooks1 dgzoo1 
 dgphys1 dgboard1 ddismak1 ddishou1 ddiexpl1 ddifirm1 ddijoke1 
 ddiasko1 dbpimpa1 dbphapp1 dbpamus1 dbpaway1 dbpangr1 dbpclos1 dbpfrus1 (SCALE).
* for twin 2, subtract the standardised difference item.
* from the standardised twin 1 item.
COMPUTE dtrhym2X = dtrhym1 - Zdtrhymd.
COMPUTE dtpron2X = dtpron1 - Zdtprond.
COMPUTE dtsent2X = dtsent1 - Zdtsentd.
COMPUTE dtword2X = dtword1 - Zdtwordd.
COMPUTE dtloca2X = dtloca1 - Zdtlocad.
COMPUTE dtbook2X = dtbook1 - Zdtbookd.
COMPUTE dttalk2X = dttalk1 - Zdttalkd.
COMPUTE dteat2X = dteat1 - Zdteatd.
COMPUTE dgmessy2X = dgmessy1 - Zdgmessyd.
COMPUTE dgpuzz2X = dgpuzz1 - Zdgpuzzd.
COMPUTE dgmusi2X = dgmusi1 - Zdgmusid.
COMPUTE dgtapes2X = dgtapes1 - Zdgtapesd.
COMPUTE dgbooks2X = dgbooks1 - Zdgbooksd.
COMPUTE dgzoo2X = dgzoo1 - Zdgzood.
COMPUTE dgphys2X = dgphys1 - Zdgphysd.
COMPUTE dgboard2X = dgboard1 - Zdgboardd.
COMPUTE ddismak2X = ddismak1 - Zddismakd.
COMPUTE ddishou2X = ddishou1 - Zddishoud.
COMPUTE ddiexpl2X = ddiexpl1 - Zddiexpld.
COMPUTE ddifirm2X = ddifirm1 - Zddifirmd.
COMPUTE ddijoke2X = ddijoke1 - Zddijoked.
COMPUTE ddiasko2X = ddiasko1 - Zddiaskod.
COMPUTE dbpimpa2X = dbpimpa1 - Zdbpimpad.
COMPUTE dbphapp2X = dbphapp1 - Zdbphappd.
COMPUTE dbpamus2X = dbpamus1 - Zdbpamusd.
COMPUTE dbpaway2X = dbpaway1 - Zdbpawayd.
COMPUTE dbpangr2X = dbpangr1 - Zdbpangrd.
COMPUTE dbpclos2X = dbpclos1 - Zdbpclosd.
COMPUTE dbpfrus2X = dbpfrus1 - Zdbpfrusd.
EXECUTE.
* standardise these differences to make the twin 2 items.
DESCRIPTIVES VARIABLES= dtrhym2X (dtrhym2) dtpron2X (dtpron2)
 dtsent2X (dtsent2) dtword2X (dtword2) dtloca2X (dtloca2) dtbook2X (dtbook2)
 dttalk2X (dttalk2) dteat2X (dteat2) dgmessy2X (dgmessy2) dgpuzz2X (dgpuzz2)
 dgmusi2X (dgmusi2) dgtapes2X (dgtapes2) dgbooks2X (dgbooks2)
 dgzoo2X (dgzoo2) dgphys2X (dgphys2) dgboard2X (dgboard2)
 ddismak2X (ddismak2) ddishou2X (ddishou2) ddiexpl2X (ddiexpl2)
 ddifirm2X (ddifirm2) ddijoke2X (ddijoke2) ddiasko2X (ddiasko2)
 dbpimpa2X (dbpimpa2) dbphapp2X (dbphapp2) dbpamus2X (dbpamus2)
 dbpaway2X (dbpaway2) dbpangr2X (dbpangr2) dbpclos2X (dbpclos2)
 dbpfrus2X (dbpfrus2) /SAVE.
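
The derivation above, for a single item, amounts to: standardise the elder twin item, standardise the difference item, subtract one from the other, and standardise the result. The Python sketch below illustrates this for one item. It is not the TEDS code: the function names are hypothetical, None stands in for system-missing, and any sample filter in force is ignored.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardise using the mean and sample SD (n - 1) of the non-missing values."""
    present = [v for v in values if v is not None]
    m, sd = mean(present), stdev(present)
    return [None if v is None else (v - m) / sd for v in values]


def twin_specific(elder, diff):
    """Return (twin 1 item, twin 2 item) from elder-twin and difference items."""
    twin1 = zscore(elder)
    zdiff = zscore(diff)
    # Twin 2 before re-standardising: standardised twin 1 minus standardised difference.
    raw2 = [None if a is None or d is None else a - d
            for a, d in zip(twin1, zdiff)]
    return twin1, zscore(raw2)
```

A case with a missing elder twin value has no twin 2 value either, because the subtraction cannot be evaluated.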
dcbmi1/2

BMI (body mass index), in units of kilograms per square metre, for each twin.
Derived from heights (in centimetres) and weights (in kilograms) which are item variables from the child booklet, and which are cleaned (removing extreme outliers) before deriving BMI.

* BMI for child.
COMPUTE dcbmi = RND(((10000 * dcwtkg) / (dchtcm * dchtcm)), 0.1).
EXECUTE.
dchatot, dyhfac1z, dyhfac2z

Standardised composites for home chaos, based on the "Your Home" items in the parent booklet. See comments in the syntax below for differences between these composites.
All composites are standardised on the non-excluded twin sample, filtered using derived variable dexclude (also described elsewhere on this page).

* Chaos.
* These items are per-family, not twin specific.
* First standardise the items.
DESCRIPTIVES VARIABLES= dyhbed (zdyhbed) dyhhear (zdyhhear)
 dyhzoo (zdyhzoo) dyhontop (zdyhontop)
 dyhtv (zdyhtv) dyhcalm (zdyhcalm) /SAVE.
* 1st scale (chaotic): 3 items.
COMPUTE dyhfac1 = MEAN.2(zdyhhear, zdyhzoo, zdyhtv).
EXECUTE.
* 2nd scale (calm): 3 items.
COMPUTE dyhfac2 = MEAN.2(zdyhbed, zdyhontop, zdyhcalm).
EXECUTE.
* standardise the scales.
DESCRIPTIVES VARIABLES= dyhfac1 (dyhfac1z) dyhfac2 (dyhfac2z) /SAVE.

* overall chaos scale from the subscales above.
* subtract calm scale from chaotic scale.
COMPUTE chatot = dyhfac1z - dyhfac2z.
EXECUTE.
* standardise.
DESCRIPTIVES VARIABLES= chatot (dchatot) /SAVE.
dcont1/2

See the combined entry for dcont1/2, demot1/2, dhypt1/2 and dprot1/2 above.

ddiasko1/2, ddiexpl1/2, ddifirm1/2, ddijoke1/2, ddishou1/2, ddismak1/2

See dbpamus1/2, etc above.

ddisneg1/2, ddispos1/2, ddisavo1/2, ddis1/2

Standardised composites for parental discipline, derived from the 6 items in the parent booklet. The composites are for negative/harsh discipline (ddisneg, 2 items), positive discipline (ddispos, 2 items) and avoidance (ddisavo, 2 items); and an overall 'total' composite (ddis) derived from the three others. Each composite is standardised so that it has mean zero and standard deviation one for the available sample.
The composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page.

* Discipline.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale: negative/harsh discipline (smack and shout items).
COMPUTE ddisneg1x = MEAN.1(ddismak1, ddishou1).
COMPUTE ddisneg2x = MEAN.1(ddismak2, ddishou2).
EXECUTE.
* 2nd scale: positive discipline (explain and firm items).
COMPUTE ddispos1x = MEAN.1(ddiexpl1, ddifirm1).
COMPUTE ddispos2x = MEAN.1(ddiexpl2, ddifirm2).
EXECUTE.
* 3rd scale: avoidance (ask and joke items).
COMPUTE ddisavo1x = MEAN.1(ddijoke1, ddiasko1).
COMPUTE ddisavo2x = MEAN.1(ddijoke2, ddiasko2).
EXECUTE.
* standardise the scales.
DESCRIPTIVES VARIABLES= ddisneg1x (ddisneg1) ddisneg2x (ddisneg2)
 ddispos1x (ddispos1) ddispos2x (ddispos2)
 ddisavo1x (ddisavo1) ddisavo2x (ddisavo2) /SAVE.
 
* overall parental discipline scale, from subscales above.
* add smack/shout and joke/ask, and subtract explain/firm.
COMPUTE dis1 = ddisneg1 + ddisavo1 - ddispos1.
COMPUTE dis2 = ddisneg2 + ddisavo2 - ddispos2.
EXECUTE.
* standardise.
DESCRIPTIVES VARIABLES= dis1 (ddis1) dis2 (ddis2) /SAVE.
ddrawt1/2

Total score for the Parca Drawing test, derived as the sum of the 6 item scores.
Items 1-3 have scores 0/1, while items 4-6 have scores 0/1/2, hence the total score has range 0-9.

* Drawing test total score (0-9): sum of scores for all 6 items.
COMPUTE ddrawt = SUM(dpd01, dpd02, dpd03, dpd04, dpd05, dpd06).
EXECUTE.
derisk1/2, deriskf

Standardised environment risk composites: deriskf is a family-specific composite, while derisk1/2 is a twin-specific composite.
Each composite is derived as a mean of several other standardised measures, equally weighted. These measures include 4 year derived variables dlifeev, dmdtot, dchatot, ddis1/2 and dpar1/2, all of which are described elsewhere on this page; and 1st Contact variables ases_r, amedtot and atwmed1/2 (see 1st Contact derived variables page).
All variables are standardised.

* Environment risk composites.
* Family version and twin-specific version.

* standardise component variables (if not already standardised).
DESCRIPTIVES VARIABLES= dlifeev (zdlifeev) dmdtot (zdmdtot) /SAVE .

* Family-specific composite is a mean of 5 standardised family variables.
* from 1st Contact and age 4.
COMPUTE eriskf = MEAN.3(ases_r, amedtot, dchatot, zdmdtot, zdlifeev).
EXECUTE.

* Twin-specific composite is a mean of the same 5 family variables.
* plus three twin-specific variables from 1st Contact and age 4.
COMPUTE erisk1 = MEAN.5(ases_r, amedtot, dchatot, zdmdtot, zdlifeev,
 ddis1, dpar1, atwmed1).
COMPUTE erisk2 = MEAN.5(ases_r, amedtot, dchatot, zdmdtot, zdlifeev,
 ddis2, dpar2, atwmed2).
EXECUTE.

* standardise the new composites.
DESCRIPTIVES VARIABLES= eriskf (deriskf) erisk1 (derisk1) erisk2 (derisk2) /SAVE .
dexcludm, dexcluds

Twin pair exclusion variables (coded 1=exclude, 0=not excluded) incorporating standard analysis exclusions plus twin age exclusions. Variable dexcluds incorporates strict age exclusions, while variable dexcludm incorporates moderate age exclusion criteria.
Derived from the standard exclusion variables exclude1/2 and from age exclusion variables dagemdex and dagestex, which are described elsewhere on this page.

COMPUTE dexcludm = 0.
COMPUTE dexcluds = 0.
EXECUTE.
IF (dagemdex = 1 | exclude1 = 1 | exclude2 = 1) dexcludm = 1.
IF (dagestex = 1 | exclude1 = 1 | exclude2 = 1) dexcluds = 1.
EXECUTE.
dfbmcat, dmbmcat

BMI categories (dfbmcat for father, dmbmcat for mother), ordinally coded 1=underweight, 2=normal, 3=overweight, 4=obese. Derived from BMI variables dfbmi and dmbmi, which are described elsewhere on this page.

* Ordinal parent categories for underweight, normal, overweight and obese.
* These replace numerous old category variables containing.
* essentially the same information.
* Use ranges 0-20=under, 20-25=normal, 25-30=over, 30 or more=obese.
RECODE dmbmi dfbmi
 (SYSMIS=SYSMIS) (LOWEST THRU 19.999=1) (20 THRU 24.999=2)
 (25 THRU 29.999=3) (30 THRU HIGHEST=4)
INTO dmbmcat dfbmcat.
EXECUTE.
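
The banding above can be sketched in Python as follows. The function name is hypothetical and None stands in for system-missing. The sketch uses half-open intervals (for example 20 up to but not including 25), which gives the same result as the RECODE for BMI values already rounded to one decimal place.

```python
def bmi_category(bmi):
    """Return 1=underweight, 2=normal, 3=overweight, 4=obese, or None if missing."""
    if bmi is None:
        return None
    if bmi < 20:
        return 1
    if bmi < 25:
        return 2
    if bmi < 30:
        return 3
    return 4
```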
dfbmi, dmbmi

BMI for the mother (dmbmi) and father (dfbmi), in units of kilograms per square metre. Derived from parent heights and weights, which were recorded in the twin booklets, hence duplicated in the raw data. For the dataset, the raw heights and weights are cleaned and converted into a single measurement for each parent, as shown in the syntax below.

* Use differences between 2 twin booklets to identify and remove further anomalies.
* where only one of two measurements is in central 'normal' range.
* on the assumption that the more extreme measurement was a mistake.
* Mother heights: central range 155 to 175 cm.
IF (RANGE(dmhtcm1, 155, 175) & ~RANGE(dmhtcm2, 155, 175)) dmhtcm2 = $SYSMIS.
IF (~RANGE(dmhtcm1, 155, 175) & RANGE(dmhtcm2, 155, 175)) dmhtcm1 = $SYSMIS.
EXECUTE.
* Father heights: central range 166 to 190 cm.
IF (RANGE(dfhtcm1, 166, 190) & ~RANGE(dfhtcm2, 166, 190)) dfhtcm2 = $SYSMIS.
IF (~RANGE(dfhtcm1, 166, 190) & RANGE(dfhtcm2, 166, 190)) dfhtcm1 = $SYSMIS.
EXECUTE.
* Mother weights: central range 50 to 80 kg.
IF (RANGE(dmwtkg1, 50, 80) & ~RANGE(dmwtkg2, 50, 80)) dmwtkg2 = $SYSMIS.
IF (~RANGE(dmwtkg1, 50, 80) & RANGE(dmwtkg2, 50, 80)) dmwtkg1 = $SYSMIS.
EXECUTE.
* Father weights: central range 65 to 100 kg.
IF (RANGE(dfwtkg1, 65, 100) & ~RANGE(dfwtkg2, 65, 100)) dfwtkg2 = $SYSMIS.
IF (~RANGE(dfwtkg1, 65, 100) & RANGE(dfwtkg2, 65, 100)) dfwtkg1 = $SYSMIS.
EXECUTE.

* now check remaining differences from the two twin booklets (temp variables).
COMPUTE mumheightdiff = ABS(dmhtcm1 - dmhtcm2).
COMPUTE dadheightdiff = ABS(dfhtcm1 - dfhtcm2).
COMPUTE mumweightdiff = ABS(dmwtkg1 - dmwtkg2).
COMPUTE dadweightdiff = ABS(dfwtkg1 - dfwtkg2).
EXECUTE.
* Where a difference remains large, both are in the central 'normal' range.
* so we have to treat both as unreliable and recode to missing.
* Heights: remove if difference is greater than 6cm (over 2 inches).
DO IF (mumheightdiff > 6).
 RECODE dmhtcm1 dmhtcm2 (ELSE=SYSMIS).
END IF.
DO IF (dadheightdiff > 6).
 RECODE dfhtcm1 dfhtcm2 (ELSE=SYSMIS).
END IF.
EXECUTE.
* Weights: remove if difference is greater than 3.2kg (over half a stone).
DO IF (mumweightdiff > 3.2).
 RECODE dmwtkg1 dmwtkg2 (ELSE=SYSMIS).
END IF.
DO IF (dadweightdiff > 3.2).
 RECODE dfwtkg1 dfwtkg2 (ELSE=SYSMIS).
END IF.
EXECUTE.

* Now the measurements are cleaned up, for each parent.
* use the mean height and weight so we can drop duplicated measurements.
COMPUTE dmhtcm = RND(MEAN(dmhtcm1, dmhtcm2)).
COMPUTE dfhtcm = RND(MEAN(dfhtcm1, dfhtcm2)).
COMPUTE dmwtkg = RND(MEAN(dmwtkg1, dmwtkg2), 0.1).
COMPUTE dfwtkg = RND(MEAN(dfwtkg1, dfwtkg2), 0.1).
EXECUTE.

* Now derive mother and father BMI in standard units of kg per square metre.
COMPUTE dmbmi = RND(((10000 * dmwtkg) / (dmhtcm * dmhtcm)), 0.1).
COMPUTE dfbmi = RND(((10000 * dfwtkg) / (dfhtcm * dfhtcm)), 0.1).
EXECUTE.
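
The cleaning rules above apply the same logic to each duplicated measurement, so they can be summarised in one helper. The Python sketch below is illustrative only: the function names are hypothetical and None stands in for system-missing. Python's round() resolves exact ties differently from SPSS RND, so results may differ for values that fall exactly on a rounding boundary.

```python
def reconcile(m1, m2, lo, hi, max_diff):
    """Clean a measurement recorded twice and return the mean of what survives."""
    in1 = m1 is not None and lo <= m1 <= hi
    in2 = m2 is not None and lo <= m2 <= hi
    # If only one value is in the central range, drop the other as a likely mistake.
    if in1 and m2 is not None and not in2:
        m2 = None
    elif in2 and m1 is not None and not in1:
        m1 = None
    # If both values remain but disagree too much, treat both as unreliable.
    if m1 is not None and m2 is not None and abs(m1 - m2) > max_diff:
        return None
    present = [v for v in (m1, m2) if v is not None]
    return sum(present) / len(present) if present else None


def parent_bmi(ht1, ht2, wt1, wt2, ht_range, wt_range):
    """BMI in kg per square metre from duplicated heights (cm) and weights (kg)."""
    height = reconcile(ht1, ht2, *ht_range, max_diff=6)
    weight = reconcile(wt1, wt2, *wt_range, max_diff=3.2)
    if height is None or weight is None:
        return None
    height, weight = round(height), round(weight, 1)
    return round(10000 * weight / height ** 2, 1)
```

For a mother, the ranges from the syntax would be passed as ht_range=(155, 175) and wt_range=(50, 80).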
dfemale1/2

Female gender role scale, derived from 12 items in the "Your Active Child" section of the child booklet.
Derived as the mean of 12 item variables, all having values 1-5, and re-scaled to give a "total" with range 12-60.
At least half the items are required to be non-missing.

COMPUTE dfemale = 12 * MEAN.6(dac01, dac02, dac03, dac06,
 dac09, dac11, dac13, dac15, dac18, dac22, dac23, dac24).
EXECUTE.
dgboard1/2, dgbooks1/2, dgmessy1/2, dgmusi1/2, dgphys1/2, dgpuzz1/2, dgtapes1/2, dgzoo1/2

See dbpamus1/2, etc above.

dgender1/2

Overall gender role scale.
Computed from derived variables dmale and dfemale (described elsewhere on this page), and re-scaled to range 0-100.

* overall gender composite is difference of male and female.
* transformed so range is 0 to 100.
* (constant term now changed from 1.1 to 1.095 to avoid negatives).
COMPUTE dgender = 48.25 + (1.095 * (dmale - dfemale)).
EXECUTE.
dgfac11/2, dgfac21/2, dggfac1z, dggfac2z

Standardised composites for twin games, based on the "Your Twins' Play" items in the parent booklet. See comments in the syntax below for differences between these composites.
The twin-specific composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page. All composites are standardised on the non-excluded twin sample, filtered using derived variable dexclude (also described elsewhere on this page).

* Play materials and toys.
* -----------------------.
* These items are per-family, not twin specific.
* The items have already been reverse-coded from raw data.
* so are coded in the right direction  (higher=toys more available).
* Standardise the items.
DESCRIPTIVES VARIABLES=dgdough dgtricyc dgoutdor dgmobile dgcoord /SAVE.
* 1st scale: 2 items relating to outdoor toys.
COMPUTE dggfac1 = MEAN.1(Zdgtricyc, Zdgoutdor).
EXECUTE.
* 2nd scale: 3 other items.
COMPUTE dggfac2 = MEAN.2(Zdgdough, Zdgmobile, Zdgcoord).
EXECUTE.
* standardise the scales.
DESCRIPTIVES VARIABLES= dggfac1 (dggfac1z) dggfac2 (dggfac2z) /SAVE.

* Twin games.
* ----------.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale: books, puzzles and tapes.
COMPUTE dgfac11x = MEAN.2(dgbooks1, dgpuzz1, dgtapes1).
COMPUTE dgfac12x = MEAN.2(dgbooks2, dgpuzz2, dgtapes2).
EXECUTE.
* 2nd scale: physical and messy games, music, board games.
COMPUTE dgfac21x = MEAN.3(dgmessy1, dgphys1, dgmusi1, dgboard1).
COMPUTE dgfac22x = MEAN.3(dgmessy2, dgphys2, dgmusi2, dgboard2).
EXECUTE.
* Note that dgzoo1/2 is not used in either scale.
* standardise the new scales.
DESCRIPTIVES VARIABLES= dgfac11x (dgfac11) dgfac12x (dgfac12)
 dgfac21x (dgfac21) dgfac22x (dgfac22) /SAVE.
dgramma1/2

Grammar composite scale, with ordinal values 0/1/2.
Computed from various dsay and dsayl item variables.

* Grammar composite: essentially ordinal categories coded 0/1/2.
* constructed from the dsay and dsayl items.
* 0 unless talking in long sentences.
IF (dsay01 < 6) dgramma = 0.
EXECUTE.
* 1 if talking in long sentences but not using both '-est' words and 'but'.
IF (dsay01 = 6 & (dsayl07 = 0 | dsayl13 = 0)) dgramma = 1.
EXECUTE.
* 2 if talking in long sentences and using both '-est' words and 'but'.
IF (dsay01 = 6 & dsayl07 = 1 & dsayl13 = 1) dgramma = 2.
EXECUTE.
* Note that dgramma is missing if dsay01 is missing.
* or if dsay01=6 and dsayl07 and dsayl13 are both missing.
* or if dsay01=6 and either dsayl07 or dsayl13 equals 1.
* and the other is missing.
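
The missing-value cases listed in the comments above follow from how SPSS evaluates conditions that involve missing values. The Python sketch below reproduces the same outcomes explicitly. The function name is hypothetical and None stands in for system-missing.

```python
def dgramma(dsay01, dsayl07, dsayl13):
    """Return the 0/1/2 grammar category, or None where the SPSS result is missing."""
    if dsay01 is None:
        return None
    if dsay01 < 6:
        return 0  # not talking in long sentences
    if dsay01 == 6:
        # One item equal to 0 is enough, even if the other is missing.
        if dsayl07 == 0 or dsayl13 == 0:
            return 1
        # Both items must be present and equal to 1.
        if dsayl07 == 1 and dsayl13 == 1:
            return 2
    return None
```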
dhand1/2, dhands1/2

Handedness variables, coded 1=left, 2=right, 3=mixed. Variable dhands uses stricter criteria than variable dhand. Both variables are derived from the handedness items in the Parca Drawing test.
Temporary variable nhanded is not retained in the dataset.

* first count how many handedness items are non-missing.
COUNT nhanded = dpd04h dpd05h dpd06h (1 THRU 3).
EXECUTE.
* Compute nominal 1=left, 2=right, 3=mixed handedness variables.
* Stringent (dhands) and less stringent (dhand) versions.
* Set all initially to default value of 3 (mixed).
* Require at least 2 non-missing for less stringent version.
IF (nhanded >= 2) dhand = 3.
* Require all three to be non-missing for stringent version.
IF (nhanded = 3) dhands = 3.
EXECUTE.

* Less stringent handedness: 3 items must show same handedness.
* if all present, but allow one item to be missing.
DO IF (nhanded >= 2).
 IF ((dpd04h = 1 | SYSMIS(dpd04h)) & (dpd05h = 1 | SYSMIS(dpd05h))
  & (dpd06h = 1 | SYSMIS(dpd06h))) dhand = 1.
 IF ((dpd04h = 2 | SYSMIS(dpd04h)) & (dpd05h = 2 | SYSMIS(dpd05h))
  & (dpd06h = 2 | SYSMIS(dpd06h))) dhand = 2.
END IF.
EXECUTE.

* More stringent handedness: 3 items must show same handedness with none missing.
IF (dpd04h = 1 & dpd05h = 1 & dpd06h = 1) dhands = 1.
IF (dpd04h = 2 & dpd05h = 2 & dpd06h = 2) dhands = 2.
EXECUTE.
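
The difference between the two versions can be seen more directly in the Python sketch below. It is a hedged illustration: the function name is hypothetical, None stands in for system-missing, and each item is assumed to take only the values 1, 2 or 3 when present.

```python
def handedness(items):
    """Return (dhand, dhands) from the three handedness items.

    Codes: 1=left, 2=right, 3=mixed; None where too few items are present.
    """
    n_valid = sum(v is not None for v in items)
    dhand = dhands = None
    if n_valid >= 2:
        dhand = 3  # default: mixed
        for code in (1, 2):
            # Every present item must agree; one missing item is tolerated.
            if all(v is None or v == code for v in items):
                dhand = code
    if n_valid == 3:
        dhands = 3  # default: mixed
        for code in (1, 2):
            if all(v == code for v in items):
                dhands = code
    return dhand, dhands
```

Two right-handed items with the third missing give dhand = 2 but leave dhands missing.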
dhypt1/2

See the combined entry for dcont1/2, demot1/2, dhypt1/2 and dprot1/2 above.

dleft1/2, dright1/2, dtoleft1/2, dtorigh1/2

Variables for left- and right-handedness, coded 1=yes, 0=no. Variables dtoleft and dtorigh use stricter criteria than variables dleft and dright respectively.
Derived from the handedness items in the Parca Drawing test.

* These are simple nominal 1=yes 0=no variables to show left or right handedness.
* Stringent (dtoleft, dtorigh) and less stringent (dleft, dright) versions.
* First count how many handedness items are non-missing.
COUNT nhanded = dpd04h dpd05h dpd06h (1 THRU 3).
EXECUTE.
* Set all initially to default value of 0.
* Require at least 2 non-missing for less stringent version.
DO IF (nhanded >= 2).
 COMPUTE dleft = 0.
 COMPUTE dright = 0.
END IF.
* Require all three to be non-missing for stringent version.
DO IF (nhanded = 3).
 COMPUTE dtoleft = 0.
 COMPUTE dtorigh = 0.
END IF.
EXECUTE.

* less stringent version: at least 2 of the three items show relevant handedness.
IF ((dpd04h=1 & dpd05h=1) | (dpd04h=1 & dpd06h=1) | (dpd05h=1 & dpd06h=1)) dleft = 1 .
IF ((dpd04h=2 & dpd05h=2) | (dpd04h=2 & dpd06h=2) | (dpd05h=2 & dpd06h=2)) dright = 1 .
EXECUTE.

* more stringent version: all three items must show handedness.
IF ((dpd04h=1) & (dpd05h=1) & (dpd06h=1)) dtoleft = 1.
IF ((dpd04h=2) & (dpd05h=2) & (dpd06h=2)) dtorigh = 1.
EXECUTE.
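
These flags use a majority rule for the less stringent versions, which differs from the dhand rule: two left-handed items and one right-handed item give dleft = 1. A Python sketch of the logic, with a hypothetical function name and None standing in for system-missing:

```python
def left_right_flags(items):
    """Return (dleft, dright, dtoleft, dtorigh) from the three handedness items."""
    n_valid = sum(v is not None for v in items)
    n_left = sum(v == 1 for v in items)
    n_right = sum(v == 2 for v in items)
    dleft = dright = dtoleft = dtorigh = None
    if n_valid >= 2:
        # Less stringent: at least two items show the relevant hand.
        dleft, dright = int(n_left >= 2), int(n_right >= 2)
    if n_valid == 3:
        # More stringent: all three items show the relevant hand.
        dtoleft, dtorigh = int(n_left == 3), int(n_right == 3)
    return dleft, dright, dtoleft, dtorigh
```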
dlifeev

Life events composite, with ordinal values 0 to 5 denoting the number of significant life events experienced by the family at age 4. The five life events are:
apparent change of marital status of respondent (by comparison of derived variable dmstatus with equivalent 3 Year variable cmstatus);
serious illness experienced by any family member (from items dillfam and dhosp1/2);
change in job for the respondent (item djobch);
change in job for the respondent's partner (item djobchp);
and arrival of a new child in the family (item dsibnew).

* Life events composite.
* First make a flag variable (1Y 0N) to show assumed changes in marital status.
* since 3 Year, by comparing values of cmstatus and dmstatus.
* Use simple rule that status is unchanged if cmstatus equals dmstatus.
* and status is changed if not equal (status change missing if either variable is missing).
* We don't know who respondent is at each age, but assume it is the same.
IF (cmstatus = dmstatus) mstch34 = 0.
IF (cmstatus ~= dmstatus) mstch34 = 1.
EXECUTE.

* Now another flag variable for serious illness in family.
* (mother or father or sibling or either twin at 4).
IF ((dillfam = 1) | (dhosp1 = 1) | (dhosp2 = 1)) illfam4 = 1 .
IF ((dillfam = 0) & (dhosp1 = 0) & (dhosp2 = 0)) illfam4 = 0 .
EXECUTE .

* Recode partner job change to 1Y 0N (recode no partner to missing).
RECODE djobchp (0=0) (1=1) (2=SYSMIS) (SYSMIS=SYSMIS)
INTO jobchp4.
EXECUTE.

* Now make life events composite as the sum of 5 flag variables.
* (the 3 above plus respondent job change and new sibling in family).
* Require at least 3 of the components to be non-missing.
COMPUTE dlifeev = SUM.3(mstch34, illfam4, jobchp4, djobch, dsibnew).
EXECUTE.
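
The composite above relies on the SPSS SUM.3 convention: the sum is taken over non-missing components, and the result is missing unless at least 3 components are present. A minimal Python sketch of the same steps follows. The function names are hypothetical, None stands in for system-missing, and this is an illustration rather than the code used to derive dlifeev.

```python
from typing import List, Optional


def sum_min_valid(values: List[Optional[int]], min_valid: int) -> Optional[int]:
    """Sum the non-missing values, or return None if too few are present."""
    valid = [v for v in values if v is not None]
    return sum(valid) if len(valid) >= min_valid else None


def life_events(cmstatus, dmstatus, dillfam, dhosp1, dhosp2,
                djobchp, djobch, dsibnew) -> Optional[int]:
    # Marital status change: missing if either status is missing.
    if cmstatus is None or dmstatus is None:
        mstch34 = None
    else:
        mstch34 = int(cmstatus != dmstatus)

    # Family illness: 1 if any item is 1, 0 only if all three items are 0.
    illness_items = (dillfam, dhosp1, dhosp2)
    if any(v == 1 for v in illness_items):
        illfam4 = 1
    elif all(v == 0 for v in illness_items):
        illfam4 = 0
    else:
        illfam4 = None

    # Partner job change: code 2 (no partner) is treated as missing.
    jobchp4 = djobchp if djobchp in (0, 1) else None

    return sum_min_valid([mstch34, illfam4, jobchp4, djobch, dsibnew], 3)
```

With a change of marital status, a hospitalised twin, no partner, a job change for the respondent and no new sibling, four components are non-missing and the composite is 3.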
dllang1/2, dllpic1/2, dllslow1/2, dlltalk1/2, dllvoc1/2

Low-language flag variables, coded 1=yes (low) 0=no.
dlltalk1/2 flags speech problems.
dllvoc1/2 flags poor vocabulary from the MCDI Vocabulary score.
dllpic1/2 flags poor vocabulary from the Picture Vocabulary score.
dllslow1/2 flags concerns about slow language development.
These are computed from items dsay01, dsayc01 and dsayc1a, and from derived variables dpictot and dvocab.
dllang1/2 is the overall Low Language category, and is based on the other four flag variables.

* Talking: low if not talking, or if talking in phrases.
* of no more than three words (derived from dsay01).
RECODE dsay01 (1=1) (3=1) (4=1) (2=0) (5=0) (6=0) 
INTO dlltalk.
EXECUTE.
* Vocabulary (from dvocab): low if total score less than 22.
RECODE dvocab (LOWEST THRU 21=1) (22 THRU HIGHEST=0)
INTO dllvoc.
EXECUTE.
* or if child is not talking intelligibly yet (from dsay01).
* (in which case dvocab is missing).
IF (ANY(dsay01,1,2)) dllvoc = 1.
EXECUTE.
* Picture vocabulary: low if total is less than 5 (from dpictot).
RECODE dpictot (LOWEST THRU 4=1) (5 THRU HIGHEST=0)
INTO dllpic.
EXECUTE.
* Slow development: parent has concerns (dsayc01=1).
* about slow language development (dsayc1a=1).
IF (dsayc01 = 1 & dsayc1a = 1) dllslow = 1.
IF (dsayc01 = 0) dllslow = 0.
IF (dsayc01 = 1 & dsayc1a = 0) dllslow = 0.
IF (dsayc01 = 1 & SYSMIS(dsayc1a)) dllslow = 0.
EXECUTE.
* Overall (dllang) categorise as low language.
* if low on any two (or more) of the four criteria.
IF (SUM(dlltalk, dllvoc, dllpic, dllslow) <= 1) dllang = 0.
IF (SUM(dlltalk, dllvoc, dllpic, dllslow) > 1) dllang = 1.
EXECUTE.
* or if low just on talking (dlltalk=1).
* or if low just on MCDI vocab (dllvoc=1).
IF (dlltalk = 1) dllang = 1.
IF (dllvoc = 1) dllang = 1.
EXECUTE.
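
The rule for the overall dllang flag can be sketched in Python as follows. This is a hedged illustration, not the derivation code: the function name low_language is hypothetical and None stands in for system-missing. It assumes the SPSS behaviour that SUM of entirely missing arguments is itself missing, in which case no category is assigned.

```python
from typing import Optional


def low_language(dlltalk: Optional[int], dllvoc: Optional[int],
                 dllpic: Optional[int], dllslow: Optional[int]) -> Optional[int]:
    """Overall low-language flag from the four component flags."""
    valid = [f for f in (dlltalk, dllvoc, dllpic, dllslow) if f is not None]
    if not valid:
        return None  # all four flags missing: dllang stays missing
    # Low if two or more of the available flags are raised.
    result = 1 if sum(valid) > 1 else 0
    # Low on talking alone, or on MCDI vocabulary alone, is also sufficient.
    if dlltalk == 1 or dllvoc == 1:
        result = 1
    return result
```

In effect, a twin is flagged if either dlltalk or dllvoc is raised, or if dllpic and dllslow are both raised.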
dmale1/2

Male gender role scale, derived from 12 items in the "Your Active Child" section of the child booklet.
Derived as the mean of 12 item variables, all having values 1-5, and re-scaled to give a "total" with range 12-60.
At least half the items are required to be non-missing.

COMPUTE dmale = 12 * MEAN.6(dac04, dac05, dac07, dac08, dac10,
 dac12, dac14, dac16, dac17, dac19, dac20, dac21).
EXECUTE.
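
The mean-then-rescale pattern used for this scale (and for other scales on this page) can be illustrated with a short Python sketch. The function name scaled_mean is hypothetical and None stands in for system-missing. It is intended only to show why twins with a few missing items still receive a score on the full 12-60 range.

```python
from typing import Optional, Sequence


def scaled_mean(items: Sequence[Optional[float]], min_valid: int) -> Optional[float]:
    """Mean of the non-missing items, multiplied by the total number of items.

    Returns None when fewer than min_valid items are non-missing,
    mirroring the SPSS form n_items * MEAN.min_valid(...).
    """
    valid = [v for v in items if v is not None]
    if len(valid) < min_valid:
        return None
    return len(items) * sum(valid) / len(valid)
```

For example, a twin answering 5 on six items and skipping the other six still scores 60, while a twin with only five answered items has a missing score.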
dmant1/2

Total score for the Parca Draw-a-Man test, derived as the sum of the 12 item scores.
Each item has scores 0/1, hence the total score has range 0-12.
Variable dmant exists as an item in the raw data, but it contains some anomalous totals, hence it is re-computed as a derived variable.

COMPUTE dmant = SUM.12(dmanh, dmane, dmann, dmanm, dmanea, dmanha,
 dmanb, dmana, dmanl, dmanhan, dmanf, dmanc).
EXECUTE.
dmbmcat

See dfbmcat, dmbmcat above.

dmbmi

See dfbmi, dmbmi above.

dmdtot

Maternal depression total scale. Derived from the 10 maternal depression items, all having values 0-3 (already reversed in the item coding where appropriate). The scale is derived as a mean, requiring at least half the items to be non-missing, then scaled up to represent a total with range 0 to 30.

COMPUTE dmdtot = 10 * MEAN.5(dmdlaugh, dmdenjoy, dmdblame, dmdanx,
 dmdscare, dmdontop, dmdunhap, dmdsad, dmdcry, dmdharm).
EXECUTE.
dmstatus

Marital status categories.
Computed from item variables for the 8 raw marital status item tick-boxes (coded 1=yes, 0=no) in the parent booklet. Recoded to missing if more than one box was ticked.

IF (AN01a = 1) dmstatus = 1.
IF (AN01b = 1) dmstatus = 2.
IF (AN01c = 1) dmstatus = 3.
IF (AN01d = 1) dmstatus = 4.
IF (AN01e = 1) dmstatus = 5.
IF (AN01f = 1) dmstatus = 6.
IF (AN01g = 1) dmstatus = 7.
IF (AN01h = 1) dmstatus = 8.
EXECUTE.
* recode to missing if more than one box ticked.
DO IF (SUM(AN01a, AN01b, AN01c, AN01d, AN01e, AN01f, AN01g, AN01h) > 1).
 RECODE dmstatus (ELSE=SYSMIS).
END IF.
EXECUTE.
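
The tick-box rule can be sketched in Python as below. This is an illustration that follows the written description (it counts all eight tick-boxes when checking for multiple ticks); the function name marital_status is hypothetical, and a missing tick-box value (None) is treated here as not ticked, which is an assumption.

```python
from typing import Optional, Sequence


def marital_status(boxes: Sequence[Optional[int]]) -> Optional[int]:
    """Return category 1-8 for the single ticked box, else None.

    boxes holds the eight tick-box values in order (a to h), coded 1=yes, 0=no.
    """
    ticked = [position + 1 for position, value in enumerate(boxes) if value == 1]
    # Exactly one tick gives a category; no tick or several ticks give missing.
    return ticked[0] if len(ticked) == 1 else None
```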
dodd01s1/2 through to dodd15s1/2

Item scores (1 if correct, 0 if not) for the Parca Odd One Out test items.
Derived from the respective raw item response variables (coded 1-3 for valid responses, denoting picture selected, or 5 for invalid responses).

* items with 1 as the correct response.
RECODE ODD02 ODD11 ODD12
 (1=1) (2=0) (3=0) (5=0) (SYSMIS=SYSMIS)
INTO dodd02s dodd11s dodd12s .
EXECUTE.
* items with 2 as the correct response.
RECODE ODD01 ODD04 ODD06 ODD08 ODD10 ODD13 ODD15
 (1=0) (2=1) (3=0) (5=0) (SYSMIS=SYSMIS)
INTO dodd01s dodd04s dodd06s dodd08s dodd10s dodd13s dodd15s .
EXECUTE.
* items with 3 as the correct response.
RECODE ODD03 ODD05 ODD07 ODD09 ODD14
 (1=0) (2=0) (3=1) (5=0) (SYSMIS=SYSMIS)
INTO dodd03s dodd05s dodd07s dodd09s dodd14s .
EXECUTE.
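
The three RECODE commands above amount to scoring each response against an answer key. A minimal Python sketch is given here for illustration; the names ODD_KEY and score_item are hypothetical, the key itself is read directly from the RECODE lists, and None stands in for system-missing.

```python
from typing import Optional

# Correct picture for each Odd One Out item, as listed in the RECODE commands.
ODD_KEY = {
    1: 2, 2: 1, 3: 3, 4: 2, 5: 3,
    6: 2, 7: 3, 8: 2, 9: 3, 10: 2,
    11: 1, 12: 1, 13: 2, 14: 3, 15: 2,
}


def score_item(response: Optional[int], correct: int) -> Optional[int]:
    """Score one item: 1 if correct, 0 if wrong or invalid (5), None if missing."""
    # Codes outside 1-3 and 5 are not covered by the RECODE, so stay missing.
    if response not in (1, 2, 3, 5):
        return None
    return int(response == correct)
```

For example, score_item(2, ODD_KEY[1]) scores item 1 as correct, and an invalid response (code 5) scores 0 on any item.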
doddt1/2

Total score for the Parca Odd One Out test, derived as the sum of the item scores for the first 13 items. All items have scores 0/1, hence the total score has range 0-13.

COMPUTE doddt = SUM(dodd01s, dodd02s, dodd03s, dodd04s, dodd05s,
 dodd06s, dodd07s, dodd08s, dodd09s, dodd10s, dodd11s, dodd12s, dodd13s).
EXECUTE.
dparca1/2

Standardised Parca total score, derived as the mean of the standardised parent-reported total (dreparc) and the standardised parent-administered total (dadparc), both of which are derived variables described elsewhere on this page.
Standardisations are carried out on the non-excluded twin sample.

* Filter out all the standard exclusions for the twin.
* (medical, perinatal, unknown sex/zyg, missing 1st Contact).
* This will affect all standardised variables derived below.
USE ALL.
COMPUTE filter_$=(exclude1 = 0).
VARIABLE LABEL filter_$ 'exclude1 = 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* Standardised Parca composite.
* (parent administered and reported).
* First standardise the two total scores.
DESCRIPTIVES VARIABLES=dreparc (zreparc) dadparc (zadparc) /SAVE.
* Now compute the mean, requiring just one to be non-missing.
COMPUTE parca = MEAN.1(zreparc, zadparc).
EXECUTE.
* Standardise this mean.
DESCRIPTIVES VARIABLES=parca (dparca) /SAVE.

* Remove filter: no longer needed now standardisations completed.
FILTER OFF.
USE ALL.
EXECUTE.
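
The standardise, average, re-standardise sequence above can be sketched in Python. This is a rough illustration under stated assumptions, not the derivation code: the function names are hypothetical, None stands in for system-missing, the sample standard deviation (n - 1 denominator) is used, and excluded twins are assumed to be left with missing standardised values.

```python
import statistics
from typing import List, Optional, Sequence


def zscores(values: Sequence[Optional[float]],
            include: Sequence[bool]) -> List[Optional[float]]:
    """Standardise using the mean and SD of the included, non-missing cases."""
    sample = [v for v, keep in zip(values, include) if keep and v is not None]
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    return [(v - mean) / sd if keep and v is not None else None
            for v, keep in zip(values, include)]


def parca_composite(dreparc: Sequence[Optional[float]],
                    dadparc: Sequence[Optional[float]],
                    include: Sequence[bool]) -> List[Optional[float]]:
    zreparc = zscores(dreparc, include)
    zadparc = zscores(dadparc, include)
    raw: List[Optional[float]] = []
    for a, b in zip(zreparc, zadparc):
        valid = [v for v in (a, b) if v is not None]
        # Like MEAN.1: only one of the two totals needs to be non-missing.
        raw.append(sum(valid) / len(valid) if valid else None)
    # Standardise the mean again so the composite has mean 0 and SD 1.
    return zscores(raw, include)
```

The second standardisation is needed because the mean of two z-scores generally has a standard deviation below one.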
dparneg1/2, dparpos1/2, dpar1/2

Standardised composites for parental feelings, based on the 7 "Being a Parent" items in the parent booklet. There are composites for negative feelings (dparneg, 4 items) and positive feelings (dparpos, 3 items), and an overall total composite (dpar).
The composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page. All composites are standardised so that they have mean zero and standard deviation one for the available twin sample.

* Parental feelings.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale (negative feelings): 4 items.
COMPUTE dparneg1x = MEAN.3(dbpangr1, dbpfrus1, dbpimpa1, dbpaway1).
COMPUTE dparneg2x = MEAN.3(dbpangr2, dbpfrus2, dbpimpa2, dbpaway2).
EXECUTE.
* 2nd scale (positive feelings): 3 items.
COMPUTE dparpos1x = MEAN.2(dbphapp1, dbpamus1, dbpclos1).
COMPUTE dparpos2x = MEAN.2(dbphapp2, dbpamus2, dbpclos2).
EXECUTE.
* standardise the scales.
DESCRIPTIVES VARIABLES= dparneg1x (dparneg1) dparneg2x (dparneg2)
 dparpos1x (dparpos1) dparpos2x (dparpos2) /SAVE.
 
* overall parental feelings scale, from subscales above.
* subtract positivity scale from negativity scale.
COMPUTE par1 = dparneg1 - dparpos1.
COMPUTE par2 = dparneg2 - dparpos2.
EXECUTE.
* standardise.
DESCRIPTIVES VARIABLES= par1 (dpar1) par2 (dpar2) /SAVE.
dpbage

Twin age (in years) on completion of the parent booklet. The parent booklet date is the same for both twins, hence only one age variable is required.
The age is computed from the corresponding raw date variable (dpbdate). The variable aonsdob is the twins' birth date. All date variables are dropped from the dataset after deriving ages.

COMPUTE dpbdate = DATE.DMY(parent_dd, parent_mm, parent_yyyy).
EXECUTE.
* if parent booklet date missing, use the twin booklet date.
IF (SYSMIS(dpbdate)) dpbdate = dtwbdate.
EXECUTE.
* fill in any remaining missing values with the admin return date.
IF (SYSMIS(dpbdate)) dpbdate = ReturnedDate.
EXECUTE.
* use date to derive age.
COMPUTE dpbage =  RND(((DATEDIFF(dpbdate,aonsdob,"days")) / 365.25), 0.1) .
EXECUTE.
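
The date fallback and age calculation above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical and None stands in for a missing date. Python's round may treat exact halves differently from the SPSS RND function, so results could in principle differ in rare borderline cases.

```python
from datetime import date
from typing import Optional


def first_available(*dates: Optional[date]) -> Optional[date]:
    """Return the first non-missing date, in order of preference."""
    for d in dates:
        if d is not None:
            return d
    return None


def age_years(activity_date: date, birth_date: date) -> float:
    """Age in years, to one decimal place, from the difference in days."""
    days = (activity_date - birth_date).days
    return round(days / 365.25, 1)
```

For instance, first_available(parent_date, twin_date, returned_date) reproduces the order of preference in the syntax (the three argument names are placeholders), and a booklet completed on 1 July 2000 for twins born on 1 January 1996 gives an age of 4.5 years.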
dpbLLCage, dpbLLCdate, dtwbLLCage1/2, dtwbLLCdate1/2

Age and date variables derived for use in datasets in the LLC TRE (but not to be used in other datasets).
Ages and dates are derived for the parent booklet ('dpb') and twin booklet ('dtwb').
The LLC date variables contain only the month and year, not the day, as a means of reducing identifiability. The date variables are strings formatted as 'yyyy-mm'. These LLC dates are designed to enable the TEDS measures to be placed in a time sequence with NHS medical diagnosis dates in the data in the TRE.
The LLC age variables are integers measuring the number of months between birth and the given TEDS activity, consistent with the matching LLC date variables.
Variable aonsdob is the twin birth date - the raw date variables are not retained in the dataset.

* First extract year and month as temp variables, from birth date and activity dates.
COMPUTE birthyear = XDATE.YEAR(aonsdob).
COMPUTE birthmonth = XDATE.MONTH(aonsdob).
COMPUTE dtwbyear = XDATE.YEAR(dtwbdate).
COMPUTE dtwbmonth = XDATE.MONTH(dtwbdate).
COMPUTE dpbyear = XDATE.YEAR(dpbdate).
COMPUTE dpbmonth = XDATE.MONTH(dpbdate).
EXECUTE.

* The agreed LLC date format is a string yyyy-mm (nominal by default for strings).
* adding '0' where necessary for two-digit months.
STRING dtwbLLCdate dpbLLCdate (A7).
IF (dtwbmonth < 10) dtwbLLCdate = CONCAT(STRING(dtwbyear, F4), '-0', STRING(dtwbmonth, F1)).
IF (dtwbmonth >= 10) dtwbLLCdate = CONCAT(STRING(dtwbyear, F4), '-', STRING(dtwbmonth, F2)).
IF (dpbmonth < 10) dpbLLCdate = CONCAT(STRING(dpbyear, F4), '-0', STRING(dpbmonth, F1)).
IF (dpbmonth >= 10) dpbLLCdate = CONCAT(STRING(dpbyear, F4), '-', STRING(dpbmonth, F2)).
EXECUTE.

* The agreed LLC age variable is in integer months.
* and it must agree with the birth and booklet year/month variables that will be available in the LLC.
COMPUTE dtwbLLCage = (dtwbmonth + (dtwbyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE dpbLLCage = (dpbmonth + (dpbyear * 12)) - (birthmonth + (birthyear * 12)).
EXECUTE.
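
The LLC date string and month-based age can be illustrated with a short Python sketch. The function names llc_date and llc_age_months are hypothetical, and the sketch is offered only as an aid to understanding the syntax above.

```python
from datetime import date


def llc_date(d: date) -> str:
    """Format a date as 'yyyy-mm', zero-padding single-digit months."""
    return f"{d.year:04d}-{d.month:02d}"


def llc_age_months(activity_date: date, birth_date: date) -> int:
    """Whole months between birth and activity, ignoring the day of the month."""
    return (activity_date.year * 12 + activity_date.month) - (
        birth_date.year * 12 + birth_date.month
    )
```

Because the day is ignored, twins born on 28 February 1995 with a booklet dated 1 March 1999 have an LLC age of 49 months, even though only 48 full calendar months have elapsed.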
dpic01s1/2 through to dpic08s1/2

Item scores (1 if correct, 0 if not) for the Picture Vocabulary test items.
Derived from the respective raw item response variables (coded 1-4 for valid responses, denoting picture selected, or 5 for invalid responses).

* items with 1 as the correct response.
RECODE PIC02 PIC06
 (1=1) (2=0) (3=0) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO dpic02s dpic06s .
EXECUTE.
* items with 2 as the correct response.
RECODE PIC01
 (1=0) (2=1) (3=0) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO dpic01s .
EXECUTE.
* items with 3 as the correct response.
RECODE PIC03 PIC04 PIC08
 (1=0) (2=0) (3=1) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO dpic03s dpic04s dpic08s .
EXECUTE.
* items with 4 as the correct response.
RECODE PIC05 PIC07
 (1=0) (2=0) (3=0) (4=1) (5=0) (SYSMIS=SYSMIS)
INTO dpic05s dpic07s .
EXECUTE.
dpictot1/2

Total score for the Picture Vocabulary test, derived as the sum of the item scores for all 8 items. All items have scores 0/1, hence the total score has range 0-8.

COMPUTE dpictot = SUM(dpic01s, dpic02s, dpic03s,
 dpic04s, dpic05s, dpic06s, dpic07s, dpic08s).
EXECUTE.
dprot1/2

See dbeht1/2, etc above.

dpuz01s1/2 through to dpuz12s1/2

Item scores (1 if correct, 0 if not) for the Parca Puzzle test items.
Derived from the respective raw item response variables (coded 1-4 for valid responses, denoting picture selected, or 5 for invalid responses).

* items with 1 as the correct response.
RECODE PUZ06 PUZ10 PUZ12
 (1=1) (2=0) (3=0) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO dpuz06s dpuz10s dpuz12s .
EXECUTE.
* items with 2 as the correct response.
RECODE PUZ03 PUZ05
 (1=0) (2=1) (3=0) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO dpuz03s dpuz05s .
EXECUTE.
* items with 3 as the correct response.
RECODE PUZ01 PUZ04 PUZ09
 (1=0) (2=0) (3=1) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO dpuz01s dpuz04s dpuz09s .
EXECUTE.
* items with 4 as the correct response.
RECODE PUZ02 PUZ07 PUZ08 PUZ11
 (1=0) (2=0) (3=0) (4=1) (5=0) (SYSMIS=SYSMIS)
INTO dpuz02s dpuz07s dpuz08s dpuz11s .
EXECUTE.
dpuzt1/2

Total score for the Parca Puzzle test, derived as the sum of the item scores for all 12 items. All items have scores 0/1, hence the total score has range 0-12.

COMPUTE dpuzt = SUM(dpuz01s, dpuz02s, dpuz03s, dpuz04s, dpuz05s,
 dpuz06s, dpuz07s, dpuz08s, dpuz09s, dpuz10s, dpuz11s, dpuz12s).
EXECUTE.
drawg1/2, dscnv1/2, dscv1/2

Standardised cognitive ability composites. Variable drawg represents general cognitive ability ('g'), dscv represents verbal ability, and dscnv represents non-verbal ability.
These composites are computed from other derived variables (dtvoc, dgramma, dreparc, dadparn) which are all described elsewhere on this page.
All standardisations are done on the non-excluded twin sample, using variable exclude1.

* Filter out all the standard exclusions.
* (medical, perinatal, unknown sex/zyg, missing 1st Contact).
* This will affect all standardised variables derived below.
USE ALL.
COMPUTE filter_$=(exclude1 = 0).
VARIABLE LABEL filter_$ 'exclude1 = 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* General cognitive ability composites.
* These combine the parent-report and parent-administered Parca.
* (non-verbal ability) with the vocab and grammar (verbal ability).

* First we need standardised (transformed) vocab and grammar scores.
* and parent-reported Parca total.
DESCRIPTIVES VARIABLES= dtvoc (zdtvoc) dgramma (zgramma)
 dreparc (zreparc) /SAVE.

* Derive composites as means from standardised scores.
* requiring all of them to be non-missing.

* Verbal ability composite: from vocab and grammar.
COMPUTE scv = MEAN.2(zdtvoc, zgramma).
EXECUTE.
* Non-verbal ability composite: from two Parca scores.
COMPUTE scnv = MEAN.2(dadparn, zreparc).
EXECUTE.
* General cognitive ability ('g'): from all four scores.
COMPUTE rawg = MEAN.4(zdtvoc, zgramma, dadparn, zreparc).
EXECUTE.
* Now standardise them.
DESCRIPTIVES VARIABLES=scv (dscv) scnv (dscnv) rawg (drawg) /SAVE.
* Remove filter: no longer needed now standardisations completed.
FILTER OFF.
USE ALL.
EXECUTE.
dreparc1/2

Total score for the parent-reported Parca measure. Derived as the sum of the item scores for items 1-11 and 12b. Each item has score 0 or 1, hence the total score has range 0-12.

* Parent-reported Parca total score (0-12).
* Sum of first 11 items plus just one part of item 12.
COMPUTE dreparc = SUM(dpr01, dpr02, dpr03, dpr04, dpr05,
 dpr06, dpr07, dpr08, dpr09, dpr10, dpr11, dpr12b).
EXECUTE.
dright1/2

See dleft1/2, dright1/2, dtoleft1/2, dtorigh1/2 above.

dscnv1/2, dscv1/2

See drawg1/2, dscnv1/2, dscv1/2 above.

dsdqbeht1/2, dsdqemot1/2, dsdqcont1/2, dsdqhypt1/2, dsdqpert1/2, dsdqprot1/2

SDQ subscales. At age 4, 24 of the 25 SDQ items were collected.
dsdqemot1/2: emotional problems (previously called anxiety), from 4 items.
dsdqcont1/2: conduct problems, from 5 items.
dsdqhypt1/2: hyperactivity, from 5 items.
dsdqpert1/2: peer problems, from 5 items.
dsdqprot1/2: prosocial behaviour, from 5 items.
dsdqbeht1/2: total behaviour problems, from the 19 emotion, conduct, hyperactivity and peer problem items (but not the prosocial items).
Each scale represents a "total", having a range of values that would be possible from summing the items. Each scale is derived as a mean and then scaled up, hence providing equivalent scale values for twins having missing values for one or more of the items. At least half of the component items are required to be non-missing in each case.

* SDQ emotional symptoms (previously called Anxiety), from 4 items.
COMPUTE dsdqemot = 4 * MEAN.2(dsdqemo1, dsdqemo2, dsdqemo3, dsdqemo5).
EXECUTE.
* SDQ conduct, from 5 items.
COMPUTE dsdqcont = 5 * MEAN.3(dsdqcon1, dsdqcon2r, dsdqcon3, dsdqcon4, dsdqcon5).
EXECUTE.
* SDQ hyperactivity, from 5 items.
COMPUTE dsdqhypt = 5 * MEAN.3(dsdqhyp1, dsdqhyp2, dsdqhyp3, dsdqhyp4r, dsdqhyp5r).
EXECUTE.
* SDQ peer problems, from 5 items.
COMPUTE dsdqpert = 5 * MEAN.3(dsdqper1, dsdqper2r, dsdqper3r, dsdqper4, dsdqper5).
EXECUTE.
* SDQ prosocial, from 5 items.
COMPUTE dsdqprot = 5 * MEAN.3(dsdqpro1, dsdqpro2, dsdqpro3, dsdqpro4, dsdqpro5).
EXECUTE.
* SDQ total behaviour problems, from 19 items.
COMPUTE dsdqbeht = 19 * MEAN.10(dsdqemo1, dsdqemo2, dsdqemo3, dsdqemo5,
 dsdqcon1, dsdqcon2r, dsdqcon3, dsdqcon4, dsdqcon5,
 dsdqhyp1, dsdqhyp2, dsdqhyp3, dsdqhyp4r, dsdqhyp5r,
 dsdqper1, dsdqper2r, dsdqper3r, dsdqper4, dsdqper5).
EXECUTE.
dtbook1/2, dteat1/2, dtloca1/2, dtpron1/2, dtrhym1/2, dtsent1/2, dttalk1/2, dtword1/2

See dbpamus1/2, etc above.

dtempzyg

Zygosity difference score, used in the zygosity algorithm.
The score is scaled to have values between 0 and 1, with higher values representing greater differences between the twins.
Computed from 20 different item and derived variables relating to twin differences. All contributing variables are ordinal, with higher values for greater differences. See comments in syntax for details of the method of computation.
Temporary variables sumzyg, zyg1/2/3/4 and zygtot are not retained in the dataset.

* Compute difference sum, from ordinal variables with higher values = more different.
COMPUTE sumzyg = SUM(dzyprof, dzyyou, dzyhairs, dzyhairt, dzyeyes, dzyears,
 dzyteet2, dzyold, dzyphot2, dzyfac, dzytyp, dzymistp, dzymists, dzymistr,
 dzymistb, dzymistf, dzymistc, dzymistm, dzytoget, dzypeas).
EXECUTE.
* Determine maximum possible score, depending on number of non-missing.
* responses in the above variables (total is 54 if none missing).
COUNT zyg1 = dzyfac dzytyp (0 thru 1).
COUNT zyg2 = dzyprof dzyyou dzyteet2 (1 thru 2).
COUNT zyg3 = dzyhairs dzyhairt dzyeyes dzyears dzyold dzyphot2 dzymistp
 dzymists dzymistr dzymistb dzymistf dzymistc dzymistm dzypeas (1 thru 3).
COUNT zyg4 = dzytoget (1 thru 4).
EXECUTE.
COMPUTE zygtot = SUM(zyg1, (2 * zyg2), (3* zyg3), (4 * zyg4)).
EXECUTE.
* Can now re-scale difference score to range 0-1.
* requiring at least half the data to be non-missing.
* (total possible score must be 27 or higher).
IF (zygtot >= 27) dtempzyg = sumzyg / zygtot.
EXECUTE.
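
The re-scaling method can be sketched in Python as below. This is an illustrative re-expression, not the derivation code: the names RANGES and zygosity_difference are hypothetical, the valid ranges are taken from the COUNT commands above, and None (or an absent key) stands in for system-missing.

```python
from typing import Dict, Optional, Tuple

# Valid (lowest, highest) code for each contributing variable.
RANGES: Dict[str, Tuple[int, int]] = {}
for name in ("dzyfac", "dzytyp"):
    RANGES[name] = (0, 1)
for name in ("dzyprof", "dzyyou", "dzyteet2"):
    RANGES[name] = (1, 2)
for name in ("dzyhairs", "dzyhairt", "dzyeyes", "dzyears", "dzyold",
             "dzyphot2", "dzymistp", "dzymists", "dzymistr", "dzymistb",
             "dzymistf", "dzymistc", "dzymistm", "dzypeas"):
    RANGES[name] = (1, 3)
RANGES["dzytoget"] = (1, 4)


def zygosity_difference(responses: Dict[str, Optional[int]]) -> Optional[float]:
    """Sum of responses divided by the maximum possible for the items answered."""
    total = 0
    max_possible = 0
    for name, (lowest, highest) in RANGES.items():
        value = responses.get(name)
        if value is None:
            continue
        total += value
        if lowest <= value <= highest:
            max_possible += highest
    # Require at least half of the full maximum (54) to be available.
    if max_possible < 27:
        return None
    return total / max_possible
```

With all 20 variables at their highest values the score is 54 / 54 = 1; with fewer answered items, the denominator shrinks accordingly.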
dtfac11/2, dtfac21/2

Standardised composites for twin talk, based on the "How you talk to your twins" items in the parent booklet. See comments in the syntax below for differences between these composites.
The composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page. All composites are standardised on the non-excluded twin sample, filtered using derived variable dexclude (also described elsewhere on this page).

* Talking to twins.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale: 3 items.
COMPUTE dtfac11x = MEAN.2(dtpron1, dtsent1, dtword1).
COMPUTE dtfac12x = MEAN.2(dtpron2, dtsent2, dtword2).
EXECUTE.
* 2nd scale: 4 items.
COMPUTE dtfac21x = MEAN.3(dtrhym1, dtbook1, dtloca1, dttalk1).
COMPUTE dtfac22x = MEAN.3(dtrhym2, dtbook2, dtloca2, dttalk2).
EXECUTE.
* Note that dteat1/2 is not used in either scale.
* standardise the scales.
DESCRIPTIVES VARIABLES= dtfac11x (dtfac11) dtfac12x (dtfac12)
 dtfac21x (dtfac21) dtfac22x (dtfac22) /SAVE.
dtoleft1/2, dtorigh1/2

See dleft1/2, dright1/2, dtoleft1/2, dtorigh1/2 above.

dtwbage1/2

Twin age (in years) on completion of the child booklet.
The age is computed from the corresponding date variable (dtwbdate). The variable aonsdob is the twins' birth date. All date variables are dropped from the dataset after deriving ages.

* convert all dd/mm/yyyy integer values into raw dates.
COMPUTE dtwbdate = DATE.DMY(booklet_dd, booklet_mm, booklet_yyyy).
EXECUTE.
* if twin booklet date missing, use parent booklet date.
IF (SYSMIS(dtwbdate)) dtwbdate = dpbdate.
EXECUTE.
* fill in any remaining missing values with the admin return date.
IF (SYSMIS(dtwbdate)) dtwbdate = ReturnedDate.
EXECUTE.
* use date to derive twin age.
COMPUTE dtwbage = RND(((DATEDIFF(dtwbdate,aonsdob,"days")) / 365.25), 0.1) .
EXECUTE.
dtwbLLCage1/2, dtwbLLCdate1/2

See dpbLLCage, etc above.

dtwbagediff

Twin pair age difference (in years) for completion of the twin booklets.
Computed from double-entered derived age variables dtwbage1/2, which are described elsewhere on this page.

COMPUTE dtwbagediff = RND(ABS(dtwbage1 - dtwbage2), 0.1).
EXECUTE.
dtvoc1/2

Transformed version of dvocab1/2.
Computed from derived variable dvocab1/2, described elsewhere on this page. The transformation is designed to reduce skewness and improve the approximation to a normal distribution, while retaining the same range of values.

* Transformed vocabulary total.
* using reflect square-root function to reduce skew.
* while retaining range of 0 to 48.
COMPUTE dtvoc = 48 * (1 - ((SQRT(49 - dvocab) - 1) / (SQRT(49) - 1)) ).
EXECUTE.
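
A worked illustration of the transformation may help show how it retains the 0 to 48 range. The Python function below (hypothetical name transform_vocab, with None for system-missing) applies the same formula as the syntax above.

```python
import math
from typing import Optional


def transform_vocab(dvocab: Optional[float]) -> Optional[float]:
    """Reflected square-root transform of the vocabulary total (range 0-48)."""
    if dvocab is None:
        return None
    # Reflect (49 - score), take the square root, then map back onto 0-48.
    reflected = math.sqrt(49 - dvocab)
    return 48 * (1 - (reflected - 1) / (math.sqrt(49) - 1))
```

The end points are unchanged (0 maps to 0 and 48 maps to 48), while a mid-range score of 24 maps to 16, so scores near the top of the scale are spread out relative to those lower down.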
dvocab1/2

MCDI Vocabulary total score. Derived as the sum of all 48 item scores. Each item has scores 0/1, hence the total score has range 0-48.
The total score is recoded to missing (as are all the item scores) if the data suggest that the entire measure has been skipped, resulting in defaults of zero for all the items - see comments in syntax for method.

* Note that all vocab items are already set.
* to missing if dsay01a=1 (child is not yet talking).
* MCDI vocabulary total score - sum of 48 item scores.
COMPUTE dvocab = SUM(dvc01, dvc02, dvc03, dvc04, dvc05, dvc06,
 dvc07, dvc08, dvc09, dvc10, dvc11, dvc12, dvc13, dvc14, dvc15,
 dvc16, dvc17, dvc18, dvc19, dvc20, dvc21, dvc22, dvc23, dvc24,
 dvc25, dvc26, dvc27, dvc28, dvc29, dvc30, dvc31, dvc32, dvc33,
 dvc34, dvc35, dvc36, dvc37, dvc38, dvc39, dvc40, dvc41, dvc42,
 dvc43, dvc44, dvc45, dvc46, dvc47, dvc48).
EXECUTE.
* There are some zero vocab scores that seem anomalous.
* and may have arisen from default zero item scores if missing.
* Recode items and total to missing if dsay01-03 are missing.
* and vocab total is zero (assuming that the whole page was skipped).
DO IF (dvocab = 0 & SYSMIS(dsay01) & SYSMIS(dsay02) & SYSMIS(dsay03)).
 RECODE dvocab dvc01 dvc02 dvc03 dvc04 dvc05 dvc06 dvc07 dvc08 dvc09 dvc10 dvc11 dvc12 dvc13 dvc14 dvc15 dvc16 
    dvc17 dvc18 dvc19 dvc20 dvc21 dvc22 dvc23 dvc24 dvc25 dvc26 dvc27 dvc28 dvc29 dvc30 dvc31 dvc32 
    dvc33 dvc34 dvc35 dvc36 dvc37 dvc38 dvc39 dvc40 dvc41 dvc42 dvc43 dvc44 dvc45 dvc46 dvc47 dvc48
 (0=SYSMIS).
END IF.
EXECUTE.
* Likewise if dsay01=5 or 6 (child is talking in sentences).
* and vocab total is zero, then assume vocab items were skipped.
DO IF (dvocab = 0 & ANY(dsay01,5,6)).
 RECODE dvocab dvc01 dvc02 dvc03 dvc04 dvc05 dvc06 dvc07 dvc08 dvc09 dvc10 dvc11 dvc12 dvc13 dvc14 dvc15 dvc16 
    dvc17 dvc18 dvc19 dvc20 dvc21 dvc22 dvc23 dvc24 dvc25 dvc26 dvc27 dvc28 dvc29 dvc30 dvc31 dvc32 
    dvc33 dvc34 dvc35 dvc36 dvc37 dvc38 dvc39 dvc40 dvc41 dvc42 dvc43 dvc44 dvc45 dvc46 dvc47 dvc48
 (0=SYSMIS).
END IF.
EXECUTE.
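
The treatment of anomalous zero totals can be sketched in Python as follows. This is an illustration only: the function name vocab_total is hypothetical, None stands in for system-missing, and the sketch returns just the total (the syntax above also recodes the zero item scores to missing).

```python
from typing import Optional, Sequence


def vocab_total(items: Sequence[Optional[int]], dsay01: Optional[int],
                dsay02: Optional[int], dsay03: Optional[int]) -> Optional[int]:
    """Sum of the 48 item scores, set to missing when a zero looks like a skip."""
    valid = [v for v in items if v is not None]
    total = sum(valid) if valid else None

    # A zero total is treated as a skipped measure if the talking items
    # are all missing, or if the child is reported to talk in sentences.
    page_skipped = dsay01 is None and dsay02 is None and dsay03 is None
    talks_in_sentences = dsay01 in (5, 6)
    if total == 0 and (page_skipped or talks_in_sentences):
        return None
    return total
```

A zero total is therefore retained only where the talking items give no reason to doubt it.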
dyhfac1z, dyhfac2z

See dchatot, dyhfac1z, dyhfac2z above.