TEDS Data Dictionary

Derived Variables in the 3 Year Dataset

This page gives a listing of derived variables in the 3 Year dataset, in alphabetical order of variable name. For each variable, a short written description is followed (in a box) by the SPSS syntax that was used to derive the variable.

This page does not include descriptions of background variables that are derived from other sources and that are included in the 3 Year dataset. For information about such variables, see pages describing background variables, exclusions and scrambled IDs.

Most of the twin-specific variables were derived prior to double entering the dataset. Hence the variable names used in the syntax often lack the endings (1 or 2) used for the final double entered variables.


Definitions of derived variables

Listed alphabetically

cadparc1/2

Total score for the three parent-administered Parca tests in the child booklet, computed as a simple sum with unequal weightings for the tests, and re-scaled so the range of values is 0 to 1.

* Total score for the 3 parent-administered tests.
* requiring at least 2 of the 3 to be non-missing.
* The total is scaled from 0 to 1.
* Divide total by 37 if all 3 are non-missing.
COMPUTE cadparc = SUM.3(coddt, cdrawt, cmatcht) / 37.
EXECUTE.
* Divide total by 23 if coddt or cmatcht is missing.
* or divide total by 28 if cdrawt is missing.
IF (SYSMIS(coddt)) cadparc = SUM.2(cdrawt, cmatcht) / 23.
IF (SYSMIS(cmatcht)) cadparc = SUM.2(cdrawt, coddt) / 23.
IF (SYSMIS(cdrawt)) cadparc = SUM.2(coddt, cmatcht) / 28.
EXECUTE.
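
The scaling above can be sketched outside SPSS as follows. This Python fragment is an illustration only and is not part of the TEDS syntax: the function name and the use of None for a missing value are assumptions, and the maximum of 14 for the Odd One Out total is inferred from the divisors (37, 23, 28) rather than stated on this page.

```python
# Maximum test totals. Drawing (9) and Matching (14) are documented on this
# page; the Odd One Out maximum (14) is inferred from the divisors 37/23/28.
MAX_SCORE = {"coddt": 14, "cdrawt": 9, "cmatcht": 14}


def cadparc(coddt, cdrawt, cmatcht):
    """Scaled Parca total (0 to 1); None stands for a missing value."""
    scores = {"coddt": coddt, "cdrawt": cdrawt, "cmatcht": cmatcht}
    present = {name: v for name, v in scores.items() if v is not None}
    if len(present) < 2:
        return None  # fewer than 2 of the 3 tests available
    # Divide by the maximum attainable on the tests that are present.
    return sum(present.values()) / sum(MAX_SCORE[name] for name in present)
```

For example, a twin scoring 7 on Odd One Out and 7 on Matching, with no Drawing score, would get 14 / 28 = 0.5.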
cadparn1/2

Standardised mean score for the three parent-administered Parca tests in the child booklet. Derived as a mean of the standardised scores, giving equal weighting to each test. The variables are standardised on the non-excluded twin sample, defined by variable exclude1.

* Filter out all the standard exclusions for the twin.
* (medical, perinatal, unknown sex/zyg, missing 1st Contact).
* This will affect all standardised variables derived below.
USE ALL.
COMPUTE filter_$=(exclude1 = 0).
VARIABLE LABEL filter_$ 'exclude1 = 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* Standardised mean for the parent-administered tests.
* First standardise the test scores.
DESCRIPTIVES VARIABLES=coddt (zcoddt) cdrawt (zcdrawt)
 cmatcht (zcmatcht) /SAVE.
* Find the mean - require at least two to be non-missing.
COMPUTE adparn = MEAN.2(zcoddt, zcdrawt, zcmatcht).
EXECUTE.
* Now standardise this mean.
DESCRIPTIVES VARIABLES=adparn (cadparn) /SAVE.

* Remove filter: no longer needed now standardisations completed.
FILTER OFF.
USE ALL.
EXECUTE.
cagemdex, cagestex

Exclusion variables (coded 1=exclude, 0=not excluded) based on twin age criteria. The criteria are strict for variable cagestex, and moderate for variable cagemdex. See comments in syntax below for detailed exclusion criteria.
Derived from double-entered twin age variables cadage1/2 and crepage1/2 which are described elsewhere on this page.

* Moderate version: each twin age no greater than 42/12 (3 years 6 months).
* and no less than 34/12 (2 years 10 months).
* Applies only to twin measures in the twin booklet (not parent booklet age).
COMPUTE cagemdex = 0.
EXECUTE.
IF (ctwbage1 > (42/12) | ctwbage2 > (42/12)) cagemdex = 1.
IF (ctwbage1 < (34/12) | ctwbage2 < (34/12)) cagemdex = 1.
EXECUTE.
* Strict version: each twin age no greater than 39/12 (3 years 3 months).
* and no less than 34/12 (2 years 10 months).
* and age differences no more than 2 months.
COMPUTE cagestex = 0.
EXECUTE.
IF (ctwbage1 > (39/12) | ctwbage2 > (39/12)) cagestex = 1.
IF (ctwbage1 < (34/12) | ctwbage2 < (34/12)) cagestex = 1.
EXECUTE.
IF (ctwbagediff > (2/12)) cagestex = 1.
EXECUTE.
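
As an illustration of the two sets of age criteria, the following Python sketch applies the same thresholds to a single twin pair. It is not part of the TEDS syntax; the function name is hypothetical, ages are assumed to be in decimal years, and None stands for a missing value (which, as in the SPSS logic, never triggers an exclusion on its own).

```python
def age_exclusions(age1, age2, age_diff):
    """Return (cagemdex, cagestex) for one twin pair; ages in years."""
    lower = 34 / 12  # 2 years 10 months
    ages = [a for a in (age1, age2) if a is not None]
    # Moderate: exclude if either twin is over 3y6m or under 2y10m.
    moderate = int(any(a > 42 / 12 or a < lower for a in ages))
    # Strict: exclude if either twin is over 3y3m or under 2y10m ...
    strict = int(any(a > 39 / 12 or a < lower for a in ages))
    # ... or if the twins' ages differ by more than 2 months.
    if age_diff is not None and age_diff > 2 / 12:
        strict = 1
    return moderate, strict
```

A pair aged 3.4 and 3.0 years passes the moderate criteria but fails the strict ones, so the result is (0, 1).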
calg2zy, calgzyg

Zygosity (1=MZ, 2=DZ, 5=indeterminate, 99=inconsistent data) derived from the 3 year parent booklet data using the zygosity algorithm.
Variable calg2zy derives purely from the booklet data, while calgzyg also takes into account knowledge of twin sexes.
Computed from derived variable ctempzyg, which is described elsewhere on this page, from booklet item variables, and from twin sexes sex1/2 (from admin data).

* First check whether twins have different sexes.
COMPUTE sexdif = ABS(sex1 - sex2).
EXECUTE.
* Now zygosity from algorithm can be derived.
* Start with default value of 5 (indeterminate).
COMPUTE calgzyg = 5.
EXECUTE.
* Now use the difference score: 0.64 or less means MZ.
* 0.70 or more means DZ.
IF (ctempzyg <= 0.64) calgzyg = 1.
IF (ctempzyg >= 0.70) calgzyg = 2.
EXECUTE.
* Now over-rule the score and conclude DZ if there are clear differences.
* in eye colour, hair shade or hair texture, or if they look very different. 
IF (czyhairs = 3 | czyhairt = 3 | czyeyes = 3 | czypeas = 3) calgzyg = 2.
EXECUTE.
* Also over-rule score if alike as two peas in a pod (conclude MZ).
IF (czypeas = 1) calgzyg = 1.
EXECUTE.
* But if the latter clashes with clear differences in hair/eyes.
* then the result is inconsistent (value 99).
IF ((czypeas = 1) & (czyhairs = 3 | czyhairt = 3 | czyeyes = 3)) calgzyg = 99.
EXECUTE.
* Copy the result into a second variable representing the derived.
* zygosity without reference to information about twin sexes.
* This will be used for admin purposes, to track changes in estimated.
* zygosity for pairs where the twin sexes are updated.
COMPUTE calg2zy = calgzyg.
EXECUTE.
* Finally, for calgzyg but not calg2zy, over-rule all other data.
* if twins have opposite sexes (DZ) or sexes are unknown.
IF (sexdif = 1) calgzyg = 2.
IF (SYSMIS(sexdif)) calgzyg = 5.
EXECUTE.
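
The order in which the rules over-rule each other may be easier to follow in a procedural form. The Python sketch below mirrors the steps of the syntax above for one twin pair; it is an illustration only, with a hypothetical function name and None standing for a missing value. As in the syntax, a sex difference of exactly 1 is taken to mean opposite sexes.

```python
def zygosity(ctempzyg, czyhairs, czyhairt, czyeyes, czypeas, sex1, sex2):
    """Return (calg2zy, calgzyg): 1=MZ, 2=DZ, 5=indeterminate, 99=inconsistent."""
    calgzyg = 5  # default: indeterminate
    if ctempzyg is not None:
        if ctempzyg <= 0.64:
            calgzyg = 1
        if ctempzyg >= 0.70:
            calgzyg = 2
    # Clear differences in hair shade, hair texture or eye colour.
    clear_diff = 3 in (czyhairs, czyhairt, czyeyes)
    if clear_diff or czypeas == 3:
        calgzyg = 2
    if czypeas == 1:  # alike as two peas in a pod
        calgzyg = 1
    if czypeas == 1 and clear_diff:
        calgzyg = 99
    calg2zy = calgzyg  # version that ignores twin sexes
    if sex1 is None or sex2 is None:
        calgzyg = 5
    elif abs(sex1 - sex2) == 1:
        calgzyg = 2
    return calg2zy, calgzyg
```

For a pair with a low difference score but opposite sexes, calg2zy stays at 1 (MZ) while calgzyg becomes 2 (DZ), which is the discrepancy that calg2zy is kept to track.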
canxshyt1/2, canxfeart1/2, canxt1/2

ARBQ scales.
canxshyt1/2: shyness or social anxiety subscale.
canxfeart1/2: fear subscale.
canxt1/2: total anxiety scale.
These ARBQ scales are comparable with those used at other ages (4, 7, 9, 16), but are reduced because relatively few items are available at age 3 (only 8 of the 20+ ARBQ items used at later ages, and none of the SDQ emotion items). Here, at age 3, two of the ARBQ subscales are derived from 5 of the 8 recognised ARBQ measure items. All 8 items are used in the total anxiety scale, along with the 6 Behar emotion items. The Behar emotion items are not sufficient for use in the subscales. Further details are explained in comments in the syntax below.

* ARBQ scales.
* Similar to those derived at age 4.
*  except that there are fewer ARBQ items at age 3.
* The OCB, negative affect and negative cognition subscales used at later ages.
*  cannot be derived at age 3 because there are too few items.

* Shyness or social anxiety.
* Use the same 3 items as at age 4.
* (cbeh39 is the same item as danx23).
COMPUTE canxshyt = 3 * MEAN.2(canx02, canx07, cbeh39r).

* Fear.
* 2 of the 3 items used at age 4.
COMPUTE canxfeart = 2 * MEAN.1(canx20, canx22).

* Overall total anxiety scale.
* from all 14 available anxiety items at age 3, namely.
*  all the ARBQ and Behar emotion items.
COMPUTE canxt = 14 * MEAN.7(canx02, canx07, canx16, canx20, canx22, 
  cbeh10, cbeh14, cbeh15, cbeh20, cbeh22, cbeh33, cbeh39r, cbeh42, cbeh44).
EXECUTE.
* this includes the 5 items from the subscales above.
* plus 1 OCB item and 1 negative affect item (no subscales possible).
* plus 7 items that are not used in any ARBQ subscales at any ages.
* (6 Behar items; and the 'twitches' item).
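
The pattern n * MEAN.k(...) used here, and in several other scales on this page, takes the mean of the non-missing items provided at least k of them are present, then multiplies by the number of items n so that the result is on the scale of a total. A minimal Python sketch of that behaviour follows; the helper name and the example item values are hypothetical, and None stands for a missing value.

```python
def scaled_mean(items, min_valid):
    """Mean of the non-missing items, scaled up to a total over all items."""
    valid = [x for x in items if x is not None]
    if len(valid) < min_valid:
        return None  # too few items answered: the scale is missing
    return len(items) * sum(valid) / len(valid)
```

With item values of 2, 1 and missing, and a minimum of 2 valid items, the result is 3 * 1.5 = 4.5: the missing item is effectively replaced by the mean of the other two.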
ccont1/2, cemot1/2, chypt1/2, cprot1/2

Overall behaviour scales derived from Behar behaviour items (not just those comparable with SDQ) along with additional Hyperactivity items.
ccont1/2: Conduct scale, from 9 Behar items.
cemot1/2: Emotion scale, from 6 Behar items.
chypt1/2: Hyperactivity scale, from 5 Behar and 3 additional hyperactivity items.
cprot1/2: Prosocial scale, from 10 Behar items.
These are very similar to equivalent Behar scales used at age 2, except that (a) an extra emotion item was added at age 3, and (b) 3 extra hyperactivity items were added at age 3.
See comments in syntax below for further details. At least half of the component items are required to be non-missing for each scale.

* Overall Behar scales, using all suitable correlating items.
*  for conduct, emotion, hyperactivity and prosocial.
*  and not just those that are similar to SDQ items.
* These are similar to, and comparable with, the overall Behar scales used at ages 2 and 4.
*  although there are a few more items at age 3 than at age 2.
*  and SDQ items replace some Behar items at age 4.
* Conduct problems: 9 Behar items.
COMPUTE ccont = 9 * MEAN.5(cbeh05, cbeh07, cbeh12, cbeh17, 
   cbeh23, cbeh29, cbeh35, cbeh38, cbeh40).
* Emotion problems: 6 Behar items.
COMPUTE cemot = 6 * MEAN.3(cbeh10, cbeh14, cbeh20, 
   cbeh33, cbeh42, cbeh44).
* Hyperactivity: 5 Behar items + 3 additional Hyperactivity items.
COMPUTE chypt = 8 * MEAN.4(cbeh02, cbeh04, cbeh19, cbeh30, cbeh37, 
   chyp1, chyp2, chyp3).
* Prosocial: 10 Behar items.
COMPUTE cprot = 10 * MEAN.5(cbeh01, cbeh03, cbeh09, cbeh13, cbeh18, 
   cbeh21, cbeh25, cbeh36, cbeh41, cbeh43).
EXECUTE.
* (no Peer problems scale here - there are no additional items.
*   beyond those used in the SDQ-comparable scale).

* Note that, as at age 2, some Behar items (cbeh06, 15, 16, 22, 24, 26, 27, 28, 32, 34, 39).
*  have not been included in any Behar scales because they show little if any.
*  correlation or because they do not clearly belong to the scaled traits.
*  (conduct, emotion, hyperactivity, prosocial, peer).
* Of these, cbeh15, 22, 39 are used in ARBQ scales described elsewhere on this page.
cbpamus1/2, cbpangr1/2, cbpaway1/2, cbpclos1/2, cbpfrus1/2, cbphapp1/2, cbpimpa1/2, cdiasko1/2, cdiexpl1/2, cdifirm1/2, cdijoke1/2, cdishou1/2, cdismak1/2, cgboard1/2, cgbooks1/2, cgmessy1/2, cgmusi1/2, cgphys1/2, cgpuzz1/2, cgtapes1/2, cgzoo1/2, ctbook1/2, cteat1/2, ctloca1/2, ctpron1/2, ctrhym1/2, ctsent1/2, cttalk1/2, ctword1/2

Standardised twin-specific versions of items in the parent booklet, originally coded as elder twin values and younger twin differences. See comments in the syntax below for details of how the variables are derived.
These twin-specific items are used to derive scales (described elsewhere on this page) for parental feelings, discipline, twin talk and twin games.

* Make twin-specific variables from difference variables.
* first standardise all the raw items.
DESCRIPTIVES VARIABLES= ctrhym ctrhymd ctpron ctprond ctsent ctsentd
 ctword ctwordd ctloca ctlocad ctbook ctbookd cttalk cttalkd cteat cteatd
 cgmessy cgmessyd cgpuzz cgpuzzd cgmusi cgmusid cgtapes cgtapesd
 cgbooks cgbooksd cgzoo cgzood cgphys cgphysd cgboard cgboardd
 cdismak cdismakd cdishou cdishoud cdiexpl cdiexpld cdifirm cdifirmd
 cdijoke cdijoked cdiasko cdiaskod cbpimpa cbpimpad cbphapp cbphappd
 cbpamus cbpamusd cbpaway cbpawayd cbpangr cbpangrd cbpclos cbpclosd cbpfrus cbpfrusd
 /SAVE.
* The standardised elder twin item is the twin 1 variable - rename.
RENAME VARIABLES (Zctrhym Zctpron Zctsent Zctword Zctloca Zctbook
 Zcttalk Zcteat Zcgmessy Zcgpuzz Zcgmusi Zcgtapes Zcgbooks Zcgzoo
 Zcgphys Zcgboard Zcdismak Zcdishou Zcdiexpl Zcdifirm Zcdijoke
 Zcdiasko Zcbpimpa Zcbphapp Zcbpamus Zcbpaway Zcbpangr Zcbpclos Zcbpfrus
 = ctrhym1 ctpron1 ctsent1 ctword1 ctloca1 ctbook1 
 cttalk1 cteat1 cgmessy1 cgpuzz1 cgmusi1 cgtapes1 cgbooks1 cgzoo1 
 cgphys1 cgboard1 cdismak1 cdishou1 cdiexpl1 cdifirm1 cdijoke1 
 cdiasko1 cbpimpa1 cbphapp1 cbpamus1 cbpaway1 cbpangr1 cbpclos1 cbpfrus1).
* for twin 2, subtract the standardised difference item.
* from the standardised twin 1 item.
COMPUTE ctrhym2X = ctrhym1 - Zctrhymd.
COMPUTE ctpron2X = ctpron1 - Zctprond.
COMPUTE ctsent2X = ctsent1 - Zctsentd.
COMPUTE ctword2X = ctword1 - Zctwordd.
COMPUTE ctloca2X = ctloca1 - Zctlocad.
COMPUTE ctbook2X = ctbook1 - Zctbookd.
COMPUTE cttalk2X = cttalk1 - Zcttalkd.
COMPUTE cteat2X = cteat1 - Zcteatd.
COMPUTE cgmessy2X = cgmessy1 - Zcgmessyd.
COMPUTE cgpuzz2X = cgpuzz1 - Zcgpuzzd.
COMPUTE cgmusi2X = cgmusi1 - Zcgmusid.
COMPUTE cgtapes2X = cgtapes1 - Zcgtapesd.
COMPUTE cgbooks2X = cgbooks1 - Zcgbooksd.
COMPUTE cgzoo2X = cgzoo1 - Zcgzood.
COMPUTE cgphys2X = cgphys1 - Zcgphysd.
COMPUTE cgboard2X = cgboard1 - Zcgboardd.
COMPUTE cdismak2X = cdismak1 - Zcdismakd.
COMPUTE cdishou2X = cdishou1 - Zcdishoud.
COMPUTE cdiexpl2X = cdiexpl1 - Zcdiexpld.
COMPUTE cdifirm2X = cdifirm1 - Zcdifirmd.
COMPUTE cdijoke2X = cdijoke1 - Zcdijoked.
COMPUTE cdiasko2X = cdiasko1 - Zcdiaskod.
COMPUTE cbpimpa2X = cbpimpa1 - Zcbpimpad.
COMPUTE cbphapp2X = cbphapp1 - Zcbphappd.
COMPUTE cbpamus2X = cbpamus1 - Zcbpamusd.
COMPUTE cbpaway2X = cbpaway1 - Zcbpawayd.
COMPUTE cbpangr2X = cbpangr1 - Zcbpangrd.
COMPUTE cbpclos2X = cbpclos1 - Zcbpclosd.
COMPUTE cbpfrus2X = cbpfrus1 - Zcbpfrusd.
EXECUTE.
* standardise these differences to make the twin 2 items.
DESCRIPTIVES VARIABLES= ctrhym2X (ctrhym2) ctpron2X (ctpron2)
 ctsent2X (ctsent2) ctword2X (ctword2) ctloca2X (ctloca2) ctbook2X (ctbook2)
 cttalk2X (cttalk2) cteat2X (cteat2) cgmessy2X (cgmessy2) cgpuzz2X (cgpuzz2)
 cgmusi2X (cgmusi2) cgtapes2X (cgtapes2) cgbooks2X (cgbooks2)
 cgzoo2X (cgzoo2) cgphys2X (cgphys2) cgboard2X (cgboard2)
 cdismak2X (cdismak2) cdishou2X (cdishou2) cdiexpl2X (cdiexpl2)
 cdifirm2X (cdifirm2) cdijoke2X (cdijoke2) cdiasko2X (cdiasko2)
 cbpimpa2X (cbpimpa2) cbphapp2X (cbphapp2) cbpamus2X (cbpamus2)
 cbpaway2X (cbpaway2) cbpangr2X (cbpangr2) cbpclos2X (cbpclos2)
 cbpfrus2X (cbpfrus2) /SAVE.
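
The three-step derivation above (standardise, subtract, standardise again) can be sketched for a single item as follows. This is an illustration in Python rather than the TEDS syntax: the function names are hypothetical, the data are assumed to be complete (no missing values), and the sample standard deviation is used, as SPSS DESCRIPTIVES does when saving z-scores.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardise a list of values to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def twin_items(elder, diff):
    """Return standardised (twin 1, twin 2) items from raw elder-twin
    values and raw younger-twin difference values, one entry per family."""
    twin1 = zscores(elder)  # standardised elder twin item
    zdiff = zscores(diff)   # standardised difference item
    # Twin 2: twin 1 minus the difference, then standardised again.
    twin2 = zscores([a - d for a, d in zip(twin1, zdiff)])
    return twin1, twin2
```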
ccbmi1/2

BMI (body mass index), in units of kilograms per square metre, for each twin.
Derived from heights (in centimetres) and weights (in kilograms) which are item variables from the child booklet, and which are cleaned (removing extreme outliers) before deriving BMI.

* BMI for child.
COMPUTE ccbmi = RND(((10000 * ccwtkg) / (cchtcm * cchtcm)), 0.1).
EXECUTE.
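
The factor of 10000 converts the height from centimetres to metres before squaring. A small Python sketch of the same calculation is given below as an illustration; the function name is hypothetical, None stands for a missing value, and Python's round may treat exact ties differently from SPSS RND.

```python
def bmi(weight_kg, height_cm):
    """BMI in kg per square metre, rounded to 1 decimal place."""
    if weight_kg is None or height_cm is None:
        return None
    # 10000 / cm^2 is the same as 1 / m^2.
    return round(10000 * weight_kg / (height_cm * height_cm), 1)
```

For example, a child weighing 15 kg with a height of 95 cm has a BMI of 150000 / 9025, or about 16.6.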
cchatot, cyhfac1z, cyhfac2z

Standardised composites for home chaos, based on the "Your Home" items in the parent booklet. See comments in the syntax below for differences between these composites.
All composites are standardised on the non-excluded twin sample, filtered using derived variable cexclude (also described elsewhere on this page).

* Chaos.
* -----.
* These items are per-family, not twin specific.
* First standardise the items.
DESCRIPTIVES VARIABLES= cyhbed (zcyhbed) cyhhear (zcyhhear)
 cyhzoo (zcyhzoo) cyhontop (zcyhontop)
 cyhtv (zcyhtv) cyhcalm (zcyhcalm) /SAVE.
* 1st scale (chaotic): 3 items.
COMPUTE cyhfac1 = MEAN.2(zcyhhear, zcyhzoo, zcyhtv).
EXECUTE.
* 2nd scale (calm): 3 items.
COMPUTE cyhfac2 = MEAN.2(zcyhbed, zcyhontop, zcyhcalm).
EXECUTE.
* standardise the scales.
DESCRIPTIVES VARIABLES= cyhfac1 (cyhfac1z) cyhfac2 (cyhfac2z) /SAVE.

* overall chaos scale from the subscales above.
* subtract calm scale from chaotic scale.
COMPUTE chatot = cyhfac1z - cyhfac2z.
EXECUTE.
* standardise.
DESCRIPTIVES VARIABLES= chatot (cchatot) /SAVE.
ccomplx1/2

Total score for sentence complexity, derived as the sum of item scores for items 1-12. The total score is set to zero if item 0 indicates that the child is not yet combining words.
Each item has scores 0/1, hence the sentence complexity total score has range 0-12.

* Sentence complexity total: sum of item scores 1 to 12.
* Any missing items are treated like zero scores.
COMPUTE ccomplx = SUM(cs01s, cs02s, cs03s, cs04s, cs05s,
 cs06s, cs07s, cs08s, cs09s, cs10s, cs11s, cs12s).
EXECUTE.
* Assign score of 0 if child is not yet combining words.
* (note that if cs00s=0 then items 1 to 12 will be missing).
IF (cs00s = 0) ccomplx = 0.
EXECUTE.
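
The treatment of missing items can be illustrated with the Python sketch below, which is not part of the TEDS syntax (the function name is hypothetical and None stands for a missing value). As with the SPSS SUM function without a minimum count, the total is missing only when every item is missing.

```python
def ccomplx(cs00s, items):
    """Sentence complexity total from item 0 and the 12 item scores."""
    valid = [x for x in items if x is not None]
    # Missing items count as zero, unless all 12 are missing.
    total = sum(valid) if valid else None
    if cs00s == 0:
        total = 0  # child is not yet combining words
    return total
```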
ccont1/2

See ccont1/2, cemot1/2, chypt1/2, cprot1/2 above.

cdiasko1/2, cdiexpl1/2, cdifirm1/2, cdijoke1/2, cdishou1/2, cdismak1/2

See cbpamus1/2, etc above.

cdisneg1/2, cdispos1/2, cdisavo1/2, cdis1/2

Standardised composites for parental discipline, based on the "Discipline" items in the parent booklet. The composites are for harsh/negative discipline (cdisneg, 2 items), positive discipline (cdispos, 2 items), avoidance (cdisavo, 2 items) and an overall total scale (cdis). This overall composite is derived from the 3 subscales, hence indirectly from all 6 items, with higher values for more negative/harsh and avoidance discipline and less positive discipline.
The composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page.

* Discipline.
* ----------.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale: negative/harsh discipline (smack and shout).
COMPUTE cdisneg1x = MEAN.1(cdismak1, cdishou1).
COMPUTE cdisneg2x = MEAN.1(cdismak2, cdishou2).
EXECUTE.
* 2nd scale: positive discipline (explain and firm).
COMPUTE cdispos1x = MEAN.1(cdiexpl1, cdifirm1).
COMPUTE cdispos2x = MEAN.1(cdiexpl2, cdifirm2).
EXECUTE.
* 3rd scale: avoidance (ask and joke).
COMPUTE cdisavo1x = MEAN.1(cdijoke1, cdiasko1).
COMPUTE cdisavo2x = MEAN.1(cdijoke2, cdiasko2).
EXECUTE.
* standardise the scales.
DESCRIPTIVES VARIABLES= cdisneg1x (cdisneg1) cdisneg2x (cdisneg2)
 cdispos1x (cdispos1) cdispos2x (cdispos2)
 cdisavo1x (cdisavo1) cdisavo2x (cdisavo2) /SAVE.
 
* overall parental discipline scale, from subscales above.
* add smack/shout and joke/ask (which correlate positively).
* and subtract explain/firm (which correlates negatively with the others).
COMPUTE dis1 = cdisneg1 + cdisavo1 - cdispos1.
COMPUTE dis2 = cdisneg2 + cdisavo2 - cdispos2.
EXECUTE.
* standardise.
DESCRIPTIVES VARIABLES= dis1 (cdis1) dis2 (cdis2) /SAVE.
cdrawt1/2

Total score for the Parca Drawing test, derived as the sum of the 6 item scores.
Items 1-3 have scores 0/1, while items 4-6 have scores 0/1/2, hence the total score has range 0-9.

* Drawing test total score (0-9): sum of scores for all 6 items.
COMPUTE cdrawt = SUM(cpd01, cpd02, cpd03, cpd04, cpd05, cpd06).
EXECUTE.
cemot1/2

See ccont1/2, cemot1/2, chypt1/2, cprot1/2 above.

cerisk1/2, ceriskf

Standardised environment risk composites: ceriskf is a family-specific composite, while cerisk1/2 is a twin-specific composite.
Each composite is derived as a mean of several other standardised measures, equally weighted. These measures include 3 year derived variables clifeev, cmdtot, cchatot, cdis1/2 and cpar1/2, all of which are described elsewhere on this page; and 1st Contact variables ases, amedtot and atwmed1/2 (see 1st Contact derived variables page).
All variables are standardised.

* Environment risk composites.
* Family version and twin-specific version.

* standardise component variables (if not already standardised).
DESCRIPTIVES VARIABLES= clifeev (zclifeev) cmdtot (zcmdtot) /SAVE .
* reverse SES so higher values indicate more risk (already standardised).
COMPUTE ases_r = -1 * ases.
EXECUTE.

* Family-specific composite is a mean of 5 standardised family variables.
* from 1st Contact and age 3.
COMPUTE eriskf = MEAN.3(ases_r, amedtot, cchatot, zcmdtot, zclifeev).
EXECUTE.
* Twin-specific composite is a mean of the same 5 family variables.
* plus three twin-specific variables from 1st Contact and age 3.
COMPUTE erisk1 = MEAN.5(ases_r, amedtot, cchatot, zcmdtot, zclifeev,
 cdis1, cpar1, atwmed1).
COMPUTE erisk2 = MEAN.5(ases_r, amedtot, cchatot, zcmdtot, zclifeev,
 cdis2, cpar2, atwmed2).
EXECUTE.

* standardise the new composites.
DESCRIPTIVES VARIABLES= eriskf (ceriskf) erisk1 (cerisk1) erisk2 (cerisk2) /SAVE .
cexcludm, cexcluds

Twin pair exclusion variables (coded 1=exclude, 0=not excluded) incorporating standard analysis exclusions plus twin age exclusions. Variable cexcluds incorporates strict age exclusions, while variable cexcludm incorporates moderate age exclusion criteria.
Derived from the standard exclusion variables exclude1/2 and from age exclusion variables cagemdex and cagestex, which are described elsewhere on this page.

COMPUTE cexcludm = 0.
COMPUTE cexcluds = 0.
EXECUTE.
IF (cagemdex = 1 | exclude1 = 1 | exclude2 = 1) cexcludm = 1.
IF (cagestex = 1 | exclude1 = 1 | exclude2 = 1) cexcluds = 1.
EXECUTE.
cfbmcat, cmbmcat

BMI categories (cfbmcat for father, cmbmcat for mother), ordinally coded 1=underweight, 2=normal, 3=overweight, 4=obese. Derived from BMI variables cfbmi and cmbmi, which are described elsewhere on this page.

* Ordinal parent categories for underweight, normal, overweight and obese.
* These replace numerous old category variables containing.
* essentially the same information.
* Use ranges 0-20=under, 20-25=normal, 25-30=over, 30 or more=obese.
RECODE cmbmi cfbmi
 (SYSMIS=SYSMIS) (LOWEST THRU 19.999=1) (20 THRU 24.999=2)
 (25 THRU 29.999=3) (30 THRU HIGHEST=4)
INTO cmbmcat cfbmcat.
EXECUTE.
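
The category boundaries can be expressed compactly as below. This Python sketch is an illustration only (hypothetical function name, None for a missing value); it assumes, as in the dataset, that BMI has already been rounded to 1 decimal place, so that no value falls between 19.999 and 20.

```python
def bmi_category(bmi):
    """1=underweight, 2=normal, 3=overweight, 4=obese; None if missing."""
    if bmi is None:
        return None
    if bmi < 20:
        return 1
    if bmi < 25:
        return 2
    if bmi < 30:
        return 3
    return 4
```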
cfbmi, cmbmi

BMI for the mother (cmbmi) and father (cfbmi), in units of kilograms per square metre. Derived from parent heights and weights, which were recorded in the twin booklets, hence duplicated in the raw data. For the dataset, the raw heights and weights are cleaned and converted into a single measurement for each parent, as shown in the syntax below.

* Use differences between 2 twin booklets to identify and remove further anomalies.
* where only one of two measurements is in central 'normal' range.
* on the assumption that the more extreme measurement was a mistake.
* Mother heights: central range 155 to 175 cm.
IF (RANGE(cmhtcm1, 155, 175) & ~RANGE(cmhtcm2, 155, 175)) cmhtcm2 = $SYSMIS.
IF (~RANGE(cmhtcm1, 155, 175) & RANGE(cmhtcm2, 155, 175)) cmhtcm1 = $SYSMIS.
EXECUTE.
* Father heights: central range 166 to 190 cm.
IF (RANGE(cfhtcm1, 166, 190) & ~RANGE(cfhtcm2, 166, 190)) cfhtcm2 = $SYSMIS.
IF (~RANGE(cfhtcm1, 166, 190) & RANGE(cfhtcm2, 166, 190)) cfhtcm1 = $SYSMIS.
EXECUTE.
* Mother weights: central range 50 to 80 kg.
IF (RANGE(cmwtkg1, 50, 80) & ~RANGE(cmwtkg2, 50, 80)) cmwtkg2 = $SYSMIS.
IF (~RANGE(cmwtkg1, 50, 80) & RANGE(cmwtkg2, 50, 80)) cmwtkg1 = $SYSMIS.
EXECUTE.
* Father weights: central range 65 to 100 kg.
IF (RANGE(cfwtkg1, 65, 100) & ~RANGE(cfwtkg2, 65, 100)) cfwtkg2 = $SYSMIS.
IF (~RANGE(cfwtkg1, 65, 100) & RANGE(cfwtkg2, 65, 100)) cfwtkg1 = $SYSMIS.
EXECUTE.

* now check remaining differences from the two twin booklets.
COMPUTE mumheightdiff = ABS(cmhtcm1 - cmhtcm2).
COMPUTE dadheightdiff = ABS(cfhtcm1 - cfhtcm2).
COMPUTE mumweightdiff = ABS(cmwtkg1 - cmwtkg2).
COMPUTE dadweightdiff = ABS(cfwtkg1 - cfwtkg2).
EXECUTE.
* Where a difference remains large, both are in the central 'normal' range.
* so we have to treat both as unreliable and recode to missing.
* Heights: remove if difference is greater than 6cm (over 2 inches).
DO IF (mumheightdiff > 6).
 RECODE cmhtcm1 cmhtcm2 (ELSE=SYSMIS).
END IF.
DO IF (dadheightdiff > 6).
 RECODE cfhtcm1 cfhtcm2 (ELSE=SYSMIS).
END IF.
EXECUTE.
* Weights: remove if difference is greater than 3.2kg (over half a stone).
DO IF (mumweightdiff > 3.2).
 RECODE cmwtkg1 cmwtkg2 (ELSE=SYSMIS).
END IF.
DO IF (dadweightdiff > 3.2).
 RECODE cfwtkg1 cfwtkg2 (ELSE=SYSMIS).
END IF.
* Now the measurements are cleaned up, for each parent.
* use the mean height and weight so we can drop duplicated measurements.
COMPUTE cmhtcm = RND(MEAN(cmhtcm1, cmhtcm2)).
COMPUTE cfhtcm = RND(MEAN(cfhtcm1, cfhtcm2)).
COMPUTE cmwtkg = RND(MEAN(cmwtkg1, cmwtkg2), 0.1).
COMPUTE cfwtkg = RND(MEAN(cfwtkg1, cfwtkg2), 0.1).
EXECUTE.

* Now derive mother and father BMI in standard units of kg per square metre.
COMPUTE cmbmi = RND(((10000 * cmwtkg) / (cmhtcm * cmhtcm)), 0.1).
COMPUTE cfbmi = RND(((10000 * cfwtkg) / (cfhtcm * cfhtcm)), 0.1).
EXECUTE.
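
The cleaning rules for a duplicated parent measurement can be summarised in one function, sketched here in Python as an illustration rather than as the TEDS syntax. The function name and arguments are hypothetical, None stands for a missing value, and the final rounding step is omitted.

```python
def clean_pair(a, b, lo, hi, max_diff):
    """Combine two reports of the same measurement into one value."""
    def in_range(v):
        return lo <= v <= hi

    if a is not None and b is not None:
        if in_range(a) and not in_range(b):
            b = None  # assume the more extreme report is a mistake
        elif in_range(b) and not in_range(a):
            a = None
        elif abs(a - b) > max_diff:
            a = b = None  # cannot tell which is right: drop both
    valid = [v for v in (a, b) if v is not None]
    # Mean of whatever remains; missing if nothing remains.
    return sum(valid) / len(valid) if valid else None
```

For mother's height, for instance, clean_pair(165, 120, 155, 175, 6) keeps 165 and discards 120, whereas clean_pair(160, 170, 155, 175, 6) returns a missing value because both reports are plausible but differ by more than 6 cm.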
cfemale1/2, cgender1/2, cmale1/2

Gender role scales, derived from items in the "Your Active Child" section of the child booklet.
Each variable cfemale1/2 (female gender roles) and cmale1/2 (male gender roles) is derived as the mean of 12 item variables, all having values 1-5, re-scaled to give a "total" with range 12-60. Variable cgender1/2 (overall gender role scale) is then derived from the other two scales, and re-scaled to a range of approximately 0-100.

* male and female role scales are computed as means of relevant items.
* (requiring at least half to be non-missing).
* then re-scaled as a sum.
COMPUTE cmale = 12 * MEAN.6(cac04, cac05, cac07, cac08, cac10,
 cac12, cac14, cac16, cac17, cac19, cac20, cac21).
COMPUTE cfemale = 12 * MEAN.6(cac01, cac02, cac03, cac06,
 cac09, cac11, cac13, cac15, cac18, cac22, cac23, cac24).
EXECUTE.

* overall gender composite is difference of male and female.
* transformed so range is approximately 0 to 100.
COMPUTE cgender = 48.25 + (1.1 * (cmale - cfemale)).
EXECUTE.
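
Given the item range of 1-5 stated above, cmale and cfemale each lie between 12 and 60, so their difference lies between -48 and 48. The Python sketch below (an illustration with a hypothetical function name, None for a missing value) applies the same transformation and shows that the theoretical extremes fall slightly outside 0 to 100.

```python
def cgender(cmale, cfemale):
    """Overall gender role composite from the male and female scales."""
    if cmale is None or cfemale is None:
        return None
    return 48.25 + 1.1 * (cmale - cfemale)
```

Equal male and female scores give 48.25; the extremes are 48.25 + 52.8 = 101.05 and 48.25 - 52.8 = -4.55.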
cfht1/2

See ccht1/2, cfht1/2, cmht1/2 above.

cfwt1/2

See ccwt1/2, cfwt1/2, cmwt1/2 above.

cgboard1/2, cgbooks1/2, cgmessy1/2, cgmusi1/2, cgphys1/2, cgpuzz1/2, cgtapes1/2, cgzoo1/2

See cbpamus1/2, etc above.

cgender1/2

See cfemale1/2, cgender1/2, cmale1/2 above.

cgfac11/2, cgfac21/2, cggfac1z, cggfac2z

Standardised composites for twin games, based on the "Your Twins' Play" items in the parent booklet. See comments in the syntax below for differences between these composites.
The twin-specific composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page. All composites are standardised on the non-excluded twin sample, filtered using derived variable cexclude (also described elsewhere on this page).

* Play materials and toys.
* -----------------------.
* These items are per-family, not twin specific.
* First reverse the items.
RECODE cgdough cgtricyc cgoutdor cgmobile cgcoord
 (1=3) (2=2) (3=1) (SYSMIS=SYSMIS)
INTO cgdoughr cgtricycr cgoutdorr cgmobiler cgcoordr.
EXECUTE.
* Now standardise them.
DESCRIPTIVES VARIABLES=cgdoughr (zcgdoughr) cgtricycr (zcgtricycr)
 cgoutdorr (zcgoutdorr) cgmobiler (zcgmobiler) cgcoordr (zcgcoordr) /SAVE.
* 1st scale: 2 items relating to outdoor toys.
COMPUTE cggfac1 = MEAN.1(zcgtricycr, zcgoutdorr).
EXECUTE.
* 2nd scale: 3 other items.
COMPUTE cggfac2 = MEAN.2(zcgdoughr, zcgmobiler, zcgcoordr).
EXECUTE.
* standardise them.
DESCRIPTIVES VARIABLES= cggfac1 (cggfac1z) cggfac2 (cggfac2z) /SAVE.

* Twin games.
* ----------.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale: books, puzzles and tapes.
COMPUTE cgfac11x = MEAN.2(cgbooks1, cgpuzz1, cgtapes1).
COMPUTE cgfac12x = MEAN.2(cgbooks2, cgpuzz2, cgtapes2).
EXECUTE.
* 2nd scale: physical and messy games, music, board games.
COMPUTE cgfac21x = MEAN.3(cgmessy1, cgphys1, cgmusi1, cgboard1).
COMPUTE cgfac22x = MEAN.3(cgmessy2, cgphys2, cgmusi2, cgboard2).
EXECUTE.
* Note that cgzoo1/2 is not used in either scale.
* standardise the new scales.
DESCRIPTIVES VARIABLES= cgfac11x (cgfac11) cgfac12x (cgfac12)
 cgfac21x (cgfac21) cgfac22x (cgfac22) /SAVE.
cgmessy1/2, cgmusi1/2, cgphys1/2, cgpuzz1/2

See cbpamus1/2, etc above.

cgramma1/2

Composite Grammar score, with ordinal values 0 to 4. Derived from sentence complexity total score ccomplx (described elsewhere on this page) and from individual sentence complexity item variables.

* Grammar composite.
* This is essentially an ordinal version of sentence complexity.
* 0 if not combining words yet.
IF (cs00s = 0) cgramma = 0.
EXECUTE.
* 1 if combining words, but zero or missing score on sentence complexity.
IF (cs00s = 1 & (ccomplx = 0 | SYSMIS(ccomplx))) cgramma = 1.
EXECUTE.
* 2 if no score for items 7/8/11 but overall non-zero total score.
* (include 0 in sum with items 7/8/11 in case these items missing).
IF (cs00s = 1 & (SUM(0, cs07s, cs08s, cs11s) = 0) & ccomplx > 0) cgramma = 2.
EXECUTE.
* 3 if non-zero score for items 7/8/11, but less than full total score.
IF (cs00s = 1 & (SUM(cs07s, cs08s, cs11s) > 0) & ccomplx < 12) cgramma = 3.
EXECUTE.
* 4 if full score of 12.
IF (cs00s = 1 & ccomplx = 12) cgramma = 4.
EXECUTE.
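
The five levels can be read off more directly from a procedural version. The following Python sketch is an illustration of the rules above, not the TEDS syntax; the function name is hypothetical and None stands for a missing value.

```python
def cgramma(cs00s, ccomplx, cs07s, cs08s, cs11s):
    """Ordinal grammar composite (0 to 4); None if item 0 is missing."""
    if cs00s == 0:
        return 0  # not yet combining words
    if cs00s != 1:
        return None
    if ccomplx is None or ccomplx == 0:
        return 1  # combining words, but no sentence complexity score
    if ccomplx == 12:
        return 4  # full score
    # Missing key items count as zero here.
    key = sum(x for x in (cs07s, cs08s, cs11s) if x is not None)
    return 3 if key > 0 else 2
```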
cgtapes1/2, cgzoo1/2

See cbpamus1/2, etc above.

chand1/2, chands1/2

Handedness variables, coded 1=left, 2=right, 3=mixed. Variable chands uses stricter criteria than variable chand. Both variables are derived from the handedness items in the Drawing test.

* Handedness variables derived by Corina Greven.
* These are nominal 1=left, 2=right, 3=mixed handedness variables.
* Stringent (chands) and less stringent (chand) versions.

* first count how many handedness items are non-missing.
COUNT nhanded = cpd04h cpd05h cpd06h (1 THRU 3).
EXECUTE.

* Set all initially to default value of 3 (mixed).
* Require at least 2 non-missing for less stringent version.
IF (nhanded >= 2) chand = 3.
* Require all three to be non-missing for stringent version.
IF (nhanded = 3) chands = 3.
EXECUTE.

* Less stringent handedness: all non-missing items must show same handedness (one of the 3 can be missing).
DO IF (nhanded >= 2).
 IF ((cpd04h = 1 | SYSMIS(cpd04h)) & (cpd05h = 1 | SYSMIS(cpd05h))
  & (cpd06h = 1 | SYSMIS(cpd06h))) chand = 1.
 IF ((cpd04h = 2 | SYSMIS(cpd04h)) & (cpd05h = 2 | SYSMIS(cpd05h))
  & (cpd06h = 2 | SYSMIS(cpd06h))) chand = 2.
END IF.
EXECUTE.

* More stringent handedness: 3 items must show same handedness with none missing.
IF (cpd04h = 1 & cpd05h = 1 & cpd06h = 1) chands = 1.
IF (cpd04h = 2 & cpd05h = 2 & cpd06h = 2) chands = 2.
EXECUTE.
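
The difference between the two versions is only in how many items must be present, as the Python sketch below illustrates. It is not part of the TEDS syntax: the function name is hypothetical, None stands for a missing value, and the three items are assumed to take only the values 1 to 3 when present.

```python
def handedness(items):
    """Return (chand, chands) from the three handedness item codes."""
    valid = [h for h in items if h is not None]
    chand = None
    if len(valid) >= 2:
        if all(h == 1 for h in valid):
            chand = 1  # left
        elif all(h == 2 for h in valid):
            chand = 2  # right
        else:
            chand = 3  # mixed
    # The stringent version is the same result, but only when all 3 are present.
    chands = chand if len(valid) == 3 else None
    return chand, chands
```

For example, items of 1, 1 and missing give chand = 1 but leave chands missing.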
chypt1/2

See ccont1/2, cemot1/2, chypt1/2, cprot1/2 above.

cleft1/2, cright1/2, ctoleft1/2, ctorigh1/2

Variables for left- and right-handedness, coded 1=yes, 0=no. Variables ctoleft and ctorigh use stricter criteria than variables cleft and cright respectively.
Derived from the handedness items in the Drawing test.

* Left and right handed variables derived by Angelica Ronald.
* These are simple nominal 1=yes 0=no variables to show left or right handedness.
* Stringent (ctoleft, ctorigh) and less stringent (cleft1/2, cright1/2) versions.

* first count how many handedness items are non-missing.
COUNT nhanded = cpd04h cpd05h cpd06h (1 THRU 3).
EXECUTE.
* Set all initially to default value of 0.
* Require at least 2 non-missing for less stringent version.
DO IF (nhanded >= 2).
 COMPUTE cleft = 0.
 COMPUTE cright = 0.
END IF.
* Require all three to be non-missing for stringent version.
DO IF (nhanded = 3).
 COMPUTE ctoleft = 0.
 COMPUTE ctorigh = 0.
END IF.
EXECUTE.

* less stringent version: at least 2 of the three items show relevant handedness.
IF ((cpd04h=1 & cpd05h=1) | (cpd04h=1 & cpd06h=1) | (cpd05h=1 & cpd06h=1)) cleft = 1 .
IF ((cpd04h=2 & cpd05h=2) | (cpd04h=2 & cpd06h=2) | (cpd05h=2 & cpd06h=2)) cright = 1 .
EXECUTE.

* more stringent version: all three items must show handedness.
IF ((cpd04h=1) & (cpd05h=1) & (cpd06h=1)) ctoleft = 1.
IF ((cpd04h=2) & (cpd05h=2) & (cpd06h=2)) ctorigh = 1.
EXECUTE.
clifeev

Life events composite, with ordinal values 0 to 5 denoting the number of significant life events experienced by the family at age 3. The five life events are: apparent change of marital status of respondent (by comparison of item variable cmstatus with 1st Contact derived variable aadults); serious illness experienced by any family member (from items cillfam and chosp1/2); changes in job for respondent and for partner (items cjobch and cjobchp respectively); and arrival of a new child in the family (item csibnew).

* Life events composite.
* First make a flag variable (1Y 0N) to show changes in marital status.
* since 1st Contact, by comparing values of aadults and cmstatus.
* (We don't know who respondent is at 3, so if in doubt assume unchanged status).
* aadults=1: parents are natural mother and father (married or cohabiting).
IF (aadults = 1 & ANY(cmstatus,1,3)) mstch13 = 0.
IF (aadults = 1 & ANY(cmstatus,2,4,5,6,7,8)) mstch13 = 1.
* aadults=2/4: natural mother or father with someone else (married or cohabiting).
IF (ANY(aadults,2,4) & ANY(cmstatus,2,4)) mstch13 = 0.
IF (ANY(aadults,2,4) & ANY(cmstatus,1,3,5,6,7,8)) mstch13 = 1.
* aadults=3/5/7/10/11: single parent of some sort.
IF (ANY(aadults,3,5,7,10,11) & ANY(cmstatus,5,6,7,8)) mstch13 = 0.
IF (ANY(aadults,3,5,7,10,11) & ANY(cmstatus,1,2,3,4)) mstch13 = 1.
* aadults=9: cohabiting adults of other or unknown status.
IF (aadults = 9 & ANY(cmstatus,1,2,3,4)) mstch13 = 0.
IF (aadults = 9 & ANY(cmstatus,5,6,7,8)) mstch13 = 1.
EXECUTE.
* aadults=6 doesn't occur in these data so ignore.
* aadults=8/12 are essentially unknowns, so leave mstch13 missing.

* Now another flag variable for serious illness in family.
* (mother or father or sibling or either twin at 3).
IF ((cillfam = 1) | (chosp1 = 1) | (chosp2 = 1)) illfam3 = 1 .
IF ((cillfam = 0) & (chosp1 = 0) & (chosp2 = 0)) illfam3 = 0 .
EXECUTE .

* Recode partner job change to 1Y 0N (recode no partner to missing).
RECODE cjobchp (0=0) (1=1) (2=SYSMIS) (SYSMIS=SYSMIS)
INTO jobchp3.
EXECUTE.

* Now make life events composite as the sum of 5 flag variables.
* (the 3 above plus respondent job change and new sibling in family).
* Require at least 3 of the components to be non-missing.
COMPUTE clifeev = SUM.3(mstch13, illfam3, jobchp3, cjobch, csibnew).
EXECUTE.
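
For readers less familiar with SPSS, the behaviour of SUM.3 in the final step can be illustrated outside SPSS. The following is a minimal Python sketch, not the syntax used to derive the variable; the function names are hypothetical and missing values are represented by None.

```python
from typing import Optional, Sequence


def sum_min_valid(values: Sequence[Optional[float]], min_valid: int) -> Optional[float]:
    """Mimic SPSS SUM.n: add up the non-missing values, or return None
    (missing) if fewer than min_valid of them are non-missing."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid)


def life_events(mstch13, illfam3, jobchp3, cjobch, csibnew):
    """Life events composite: sum of five 0/1 flags, needing at least 3 present."""
    return sum_min_valid([mstch13, illfam3, jobchp3, cjobch, csibnew], 3)


print(life_events(1, 0, None, 1, None))      # three flags present
print(life_events(1, None, None, None, 0))   # only two flags present
```

Under this rule a family with one or two missing flags still receives a score, but the highest score it can reach is lower than 5.
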
cmale1/2

See cfemale1/2, cgender1/2, cmale1/2 above.

cmatcht1/2

Total score for the Parca Matching test, derived as the sum of the item scores for the first 14 items. All items have scores 0/1, hence the total score has range 0-14.

* Matching test total score (0-14): sum of scores for items 1-14.
COMPUTE cmatcht = SUM(cpm01s, cpm02s, cpm03s, cpm04s, cpm05s,
 cpm06s, cpm07s, cpm08s, cpm09s, cpm10s, cpm11s, cpm12s, cpm13s, cpm14s).
EXECUTE.
cmbmcat

See cfbmcat, cmbmcat above.

cmbmi

See cfbmi, cmbmi above.

cmdtot

Maternal depression total scale. Derived from the 10 maternal depression items, all having values 0-3 (already reversed in the item coding where appropriate). The scale is derived as a mean, requiring at least half the items to be non-missing, then scaled up to represent a total with range 0 to 30.

* Total scale from all 10 items.
* Require at least half of them to be non-missing.
COMPUTE cmdtot = 10 * MEAN.5(cmdlaugh, cmdenjoy, cmdblame, cmdanx,
 cmdscare, cmdontop, cmdunhap, cmdsad, cmdcry, cmdharm).
EXECUTE.
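
The "mean scaled up to a total" approach can be sketched in Python as follows. This is an illustration only, assuming missing items are represented by None; the function name scaled_mean is hypothetical.

```python
from typing import Optional, Sequence


def scaled_mean(items: Sequence[Optional[float]], min_valid: int, scale: float) -> Optional[float]:
    """Mimic scale * MEAN.n(...): mean of the non-missing items multiplied
    by scale, or None if fewer than min_valid items are non-missing."""
    valid = [v for v in items if v is not None]
    if len(valid) < min_valid:
        return None
    return scale * sum(valid) / len(valid)


# Ten items scored 0-3, five of them missing: the mean of the five
# answered items (2.0) is scaled up to a total on the 0-30 range.
answers = [2, 2, 2, 2, 2, None, None, None, None, None]
print(scaled_mean(answers, min_valid=5, scale=10))
```

Scaling the mean in this way is equivalent to replacing each missing item with the mean of the items that were answered.
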
codd01s1/2 through to codd16s1/2

Item scores (1 if correct, 0 if not) for the Parca Odd One Out test items. Derived from the respective raw item response variables (coded 1-3 for valid responses, denoting picture selected, or 5 for invalid responses).

* derive scores for odd-one-out test items.
* items with 1 as the correct response.
RECODE ODD03 ODD06 ODD08 ODD09 
 (1=1) (2=0) (3=0) (5=0) (SYSMIS=SYSMIS)
INTO codd03s codd06s codd08s codd09s .
EXECUTE.
* items with 2 as the correct response.
RECODE ODD01 ODD02 ODD07 ODD10 ODD13
 (1=0) (2=1) (3=0) (5=0) (SYSMIS=SYSMIS)
INTO codd01s codd02s codd07s codd10s codd13s.
EXECUTE.
* items with 3 as the correct response.
RECODE ODD04 ODD05 ODD11 ODD12 ODD14 ODD15 ODD16
 (1=0) (2=0) (3=1) (5=0) (SYSMIS=SYSMIS)
INTO codd04s codd05s codd11s codd12s codd14s codd15s codd16s .
EXECUTE.
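
The three RECODE commands above amount to scoring each response against an answer key. A minimal Python sketch of the same logic is shown here; the key is read directly from the syntax above, while the names ODD_KEY and score_odd_item are hypothetical. Responses outside the listed codes are assumed to be left missing, as is a missing response.

```python
from typing import Optional

# Correct picture for each of the 16 Odd One Out items, as implied by the RECODEs.
ODD_KEY = {
    1: 2, 2: 2, 3: 1, 4: 3, 5: 3, 6: 1, 7: 2, 8: 1,
    9: 1, 10: 2, 11: 3, 12: 3, 13: 2, 14: 3, 15: 3, 16: 3,
}


def score_odd_item(item: int, response: Optional[int]) -> Optional[int]:
    """Return 1 for a correct response, 0 for an incorrect or invalid (5)
    response, and None when the response is missing or not a listed code."""
    if response not in (1, 2, 3, 5):
        return None
    return 1 if response == ODD_KEY[item] else 0


print(score_odd_item(3, 1))   # correct picture chosen
print(score_odd_item(3, 5))   # invalid response scores zero
```
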
coddt1/2

Total score for the Parca Odd One Out test, derived as the sum of the item scores for the first 14 items. All items have scores 0/1, hence the total score has range 0-14.

* Odd-one-out test total score (0-14): sum of scores for items 1-14.
COMPUTE coddt = SUM(codd01s, codd02s, codd03s, codd04s, codd05s, codd06s,
 codd07s, codd08s, codd09s, codd10s, codd11s, codd12s, codd13s, codd14s).
EXECUTE.
cparca1/2

Standardised Parca total score, derived as the mean of the standardised parent-reported total (creparc) and the standardised parent-administered total (cadparc), both of which are derived variables described elsewhere on this page.
Standardisations are carried out on the non-excluded twin sample.

* Filter out all the standard exclusions for the twin.
* (medical, perinatal, unknown sex/zyg, missing 1st Contact).
* This will affect all standardised variables derived below.
USE ALL.
COMPUTE filter_$=(exclude1 = 0).
VARIABLE LABEL filter_$ 'exclude1 = 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* Standardised Parca composite.
* (parent administered and reported).
* First standardise the two total scores.
DESCRIPTIVES VARIABLES=creparc (zreparc) cadparc (zadparc) /SAVE.
* Now compute the mean, requiring just one to be non-missing.
COMPUTE parca = MEAN.1(zreparc, zadparc).
EXECUTE.
* Standardise this mean.
DESCRIPTIVES VARIABLES=parca (cparca) /SAVE.

* Remove filter: no longer needed now standardisations completed.
FILTER OFF.
USE ALL.
EXECUTE.
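
The sequence of steps (standardise each total on the non-excluded sample, average, then standardise again) can be sketched in Python as below. This is an illustration under stated assumptions rather than a reproduction of the SPSS run: the function names are hypothetical, missing values are None, the sample standard deviation is used, and excluded twins are assumed to be left with missing standardised scores.

```python
from statistics import mean, stdev
from typing import List, Optional, Sequence


def zscores(values: Sequence[Optional[float]], include: Sequence[bool]) -> List[Optional[float]]:
    """Standardise using the mean and sample SD of the included cases only.
    Assumption: cases that are not included are returned as None."""
    reference = [v for v, keep in zip(values, include) if keep and v is not None]
    centre, spread = mean(reference), stdev(reference)
    return [
        (v - centre) / spread if keep and v is not None else None
        for v, keep in zip(values, include)
    ]


def mean_min_valid(values: Sequence[Optional[float]], min_valid: int) -> Optional[float]:
    """Mimic SPSS MEAN.n: mean of non-missing values if at least min_valid exist."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


def parca_composite(creparc, cadparc, exclude1):
    """Standardise both totals, average them (one is enough), standardise again."""
    include = [flag == 0 for flag in exclude1]
    zreparc = zscores(creparc, include)
    zadparc = zscores(cadparc, include)
    parca = [mean_min_valid([a, b], 1) for a, b in zip(zreparc, zadparc)]
    return zscores(parca, include)
```

Because only one of the two standardised totals is required, a twin with just the parent-reported or just the parent-administered score still receives a composite value.
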
cparneg1/2, cparpos1/2, cpar1/2

Standardised composites for parental feelings, based on the "Being a Parent" items in the parent booklet. The composites are for negative feelings (cparneg, 4 items), positive feelings (cparpos, 3 items) and an overall total (cpar, derived from the other two hence using all 7 items). The overall total is coded in the direction of higher values for more negative and less positive feelings.
The composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page.

* Parental feelings.
* -----------------.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale (negative feelings): 4 items.
COMPUTE cparneg1x = MEAN.3(cbpangr1, cbpfrus1, cbpimpa1, cbpaway1).
COMPUTE cparneg2x = MEAN.3(cbpangr2, cbpfrus2, cbpimpa2, cbpaway2).
EXECUTE.
* 2nd scale (positive feelings): 3 items.
COMPUTE cparpos1x = MEAN.2(cbphapp1, cbpamus1, cbpclos1).
COMPUTE cparpos2x = MEAN.2(cbphapp2, cbpamus2, cbpclos2).
EXECUTE.
* standardise the scales.
DESCRIPTIVES VARIABLES= cparneg1x (cparneg1) cparneg2x (cparneg2)
 cparpos1x (cparpos1) cparpos2x (cparpos2) /SAVE.
 
* overall parental feelings scale, from subscales above.
* subtract positivity scale from negativity scale.
* so the overall scale has higher values for more negative feelings.
COMPUTE par1 = cparneg1 - cparpos1.
COMPUTE par2 = cparneg2 - cparpos2.
EXECUTE.
* standardise.
DESCRIPTIVES VARIABLES= par1 (cpar1) par2 (cpar2) /SAVE.
cpbage, ctwbage1/2

Twin ages (in years) on completion of the parent booklet (cpbage) and the twin booklet (ctwbage1/2). The parent booklet date is the same for both twins, hence only one age variable is required for the parent booklet.
Each age is computed from the corresponding temporary derived date variable (cpbdate, ctwbdate) which in turn is derived from raw date variables in the booklet. The variable aonsdob is the twins' birth date, and is a temporary variable from admin data. These date variables are not retained in the dataset.

* Date variables.
* convert raw dd/mm/yyyy integer values into dates.
COMPUTE cpbdate = DATE.DMY(parent_dd, parent_mm, parent_yy).
COMPUTE ctwbdate = DATE.DMY(booklet_dd, booklet_mm, booklet_yyyy).
EXECUTE.
* if parent booklet date is missing, substitute the twin booklet date.
IF (SYSMIS(cpbdate)) cpbdate = ctwbdate.
EXECUTE.
* conversely, if twin booklet date is missing, use parent booklet date.
IF (SYSMIS(ctwbdate)) ctwbdate = cpbdate.
EXECUTE.
* fill in any remaining missing values with the admin return date.
IF (SYSMIS(cpbdate)) cpbdate = ReturnedDate.
IF (SYSMIS(ctwbdate)) ctwbdate = ReturnedDate.
EXECUTE.

* Age variables.
* Compute twin ages (in decimal years) when each booklet was completed.
* Twin booklet.
COMPUTE ctwbage = RND(((DATEDIFF(ctwbdate,aonsdob,"days")) / 365.25), 0.1) .
EXECUTE.
* Parent booklet.
COMPUTE cpbage =  RND(((DATEDIFF(cpbdate,aonsdob,"days")) / 365.25), 0.1) .
EXECUTE.
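
The date fallback order and the age calculation can be illustrated with a short Python sketch. The function names are hypothetical and missing dates are represented by None; Python's round may differ from SPSS RND in rare exact-tie cases.

```python
from datetime import date
from typing import Optional, Tuple


def first_known(*candidates: Optional[date]) -> Optional[date]:
    """Return the first non-missing date, or None if all are missing."""
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return None


def booklet_dates(parent: Optional[date], twin: Optional[date],
                  returned: Optional[date]) -> Tuple[Optional[date], Optional[date]]:
    """Each booklet date falls back to the other booklet, then the return date."""
    cpbdate = first_known(parent, twin, returned)
    ctwbdate = first_known(twin, parent, returned)
    return cpbdate, ctwbdate


def age_years(event: Optional[date], birth: Optional[date]) -> Optional[float]:
    """Age in decimal years: days elapsed divided by 365.25, to 1 decimal place."""
    if event is None or birth is None:
        return None
    return round((event - birth).days / 365.25, 1)


print(age_years(date(1998, 3, 15), date(1995, 1, 10)))
```
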
cpbLLCage, cpbLLCdate, ctwbLLCage1/2, ctwbLLCdate1/2

Age and date variables derived for use in datasets in the LLC TRE (but not to be used in other datasets).
Ages and dates are derived for the parent booklet ('cpb') and twin booklet ('ctwb').
The LLC date variables contain only the month and year, not the day, as a means of reducing identifiability. The date variables are strings formatted as 'yyyy-mm'. These LLC dates are designed to enable the TEDS measures to be placed in a time sequence with NHS medical diagnosis dates in the data in the TRE.
The LLC age variables are integers measuring the number of months between birth and the given TEDS activity, consistent with the matching LLC date variables.
Variable aonsdob is the twin birth date - the raw date variables are not retained in the dataset.

* First extract year and month as temp variables, from birth date and activity dates.
COMPUTE birthyear = XDATE.YEAR(aonsdob).
COMPUTE birthmonth = XDATE.MONTH(aonsdob).
COMPUTE ctwbyear = XDATE.YEAR(ctwbdate).
COMPUTE ctwbmonth = XDATE.MONTH(ctwbdate).
COMPUTE cpbyear = XDATE.YEAR(cpbdate).
COMPUTE cpbmonth = XDATE.MONTH(cpbdate).
EXECUTE.

* The agreed LLC date format is a string yyyy-mm (nominal by default for strings).
* adding a leading '0' where necessary to give two-digit months.
STRING ctwbLLCdate cpbLLCdate (A7).
IF (ctwbmonth < 10) ctwbLLCdate = CONCAT(STRING(ctwbyear, F4), '-0', STRING(ctwbmonth, F1)).
IF (ctwbmonth >= 10) ctwbLLCdate = CONCAT(STRING(ctwbyear, F4), '-', STRING(ctwbmonth, F2)).
IF (cpbmonth < 10) cpbLLCdate = CONCAT(STRING(cpbyear, F4), '-0', STRING(cpbmonth, F1)).
IF (cpbmonth >= 10) cpbLLCdate = CONCAT(STRING(cpbyear, F4), '-', STRING(cpbmonth, F2)).
EXECUTE.

* The agreed LLC age variable is in integer months.
* and it must agree with the birth and booklet year/month variables that will be available in the LLC.
COMPUTE ctwbLLCage = (ctwbmonth + (ctwbyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE cpbLLCage = (cpbmonth + (cpbyear * 12)) - (birthmonth + (birthyear * 12)).
EXECUTE.
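
As an illustration of the LLC date format and the age in months, the same calculations can be written in Python as below. The function names are hypothetical; this is a sketch of the logic, not the syntax used to create the variables.

```python
from datetime import date


def llc_date(d: date) -> str:
    """Format a date as 'yyyy-mm', dropping the day of the month."""
    return f"{d.year:04d}-{d.month:02d}"


def llc_age_months(event: date, birth: date) -> int:
    """Whole months between the birth month and the activity month.
    The day of the month is ignored in both dates."""
    return (event.year * 12 + event.month) - (birth.year * 12 + birth.month)


print(llc_date(date(1998, 3, 15)))
print(llc_age_months(date(1998, 3, 15), date(1995, 1, 10)))
```

Because the day is ignored, a birth on 31 January and a booklet dated 1 February give an age of 1 month, which keeps the age consistent with the month-level dates available in the LLC.
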
cpm01s1/2 through to cpm16s1/2

Item scores (1 if correct, 0 if not) for the Parca Matching test items. Derived from the respective raw item response variables (coded 1-4 for valid responses, denoting picture selected, or 5 for invalid responses).

* derive scores for match test items.
* items with 1 as the correct response.
RECODE PM09
 (1=1) (2=0) (3=0) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO cpm09s .
EXECUTE.
* items with 2 as the correct response.
RECODE PM01 PM03 PM04 PM05 PM11 PM12 PM15
 (1=0) (2=1) (3=0) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO cpm01s cpm03s cpm04s cpm05s cpm11s cpm12s cpm15s .
EXECUTE.
* items with 3 as the correct response.
RECODE PM02 PM07 PM10 PM16
 (1=0) (2=0) (3=1) (4=0) (5=0) (SYSMIS=SYSMIS)
INTO cpm02s cpm07s cpm10s cpm16s .
EXECUTE.
* items with 4 as the correct response.
RECODE PM06 PM08 PM13 PM14
 (1=0) (2=0) (3=0) (4=1) (5=0) (SYSMIS=SYSMIS)
INTO cpm06s cpm08s cpm13s cpm14s .
EXECUTE.
cprot1/2

See cbeht1/2, etc above.

crawg1/2, cscnv1/2, cscv1/2

Standardised cognitive ability composites. Variable crawg represents general cognitive ability ('g'), cscv represents verbal ability, and cscnv represents non-verbal ability.
These composites are computed from other derived variables (ctvoc, cgramma, creparc, cadparn) which are all described elsewhere on this page.
All standardisations are done on the non-excluded twin sample, using variable exclude1.

* Filter out all the standard exclusions for the twin.
* (medical, perinatal, unknown sex/zyg, missing 1st Contact).
* This will affect all standardised variables derived below.
USE ALL.
COMPUTE filter_$=(exclude1 = 0).
VARIABLE LABEL filter_$ 'exclude1 = 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

* General cognitive ability composites.
* ------------------------------------.
* These combine the parent-report and parent-administered Parca.
* (non-verbal ability) with the vocab and grammar (verbal ability).

* First standardise all components (if not already standardised).
DESCRIPTIVES VARIABLES= ctvoc (zctvoc) cgramma (zgramma) creparc (zreparc) /SAVE.

* Derive composites as means from standardised scores.
* requiring all of them to be non-missing.

* Verbal ability composite: from vocab and grammar.
COMPUTE scv = MEAN.2(zctvoc, zgramma).
EXECUTE.
* Non-verbal ability composite: from two Parca scores.
COMPUTE scnv = MEAN.2(cadparn, zreparc).
EXECUTE.
* General cognitive ability ('g'): from all four scores.
COMPUTE rawg = MEAN.4(zctvoc, zgramma, cadparn, zreparc).
EXECUTE.
* Now standardise them.
DESCRIPTIVES VARIABLES=scv (cscv) scnv (cscnv) rawg (crawg) /SAVE.

* Remove filter: no longer needed now standardisations completed.
FILTER OFF.
USE ALL.
EXECUTE.
creparc1/2

Total score for the parent-reported Parca measure. Derived as the sum of all the item scores except for items 13 and 20. Each item has score 0 or 1, hence the total score has range 0-22.

* Parent-reported Parca total score.
* Sum of all items except 13 and 20 (which are not cognitive).
COMPUTE creparc = SUM(cpr01, cpr02, cpr03, cpr04, cpr05,
 cpr06, cpr07, cpr08, cpr09, cpr10, cpr11, cpr12, cpr14,
 cpr15, cpr16, cpr17, cpr18, cpr19, cpr21, cpr22, cpr23, cpr24).
EXECUTE.
crepdiff

Twin pair age difference (in years) for completion of the parent-reported part of each of the twin booklets.
Computed from double-entered derived age variables crepage1/2, which are described elsewhere on this page.

* Difference in ages between pair of twins when.
* parent-reported sections of booklets completed.
COMPUTE crepdiff = ABS(crepage1 - crepage2).
EXECUTE.
cright1/2

See cleft1/2, cright1/2, ctoleft1/2, ctorigh1/2 above.

cscnv1/2, cscv1/2

See crawg1/2, cscnv1/2, cscv1/2 above.

csdqcbeht1/2, csdqccont1/2, csdqcemot1/2, csdqchypt1/2, csdqcpert1/2, csdqcprot1/2

SDQ-comparable subscales derived from Behar behaviour items.
Each is derived from items that are very similar to SDQ items used at later ages, to enable longitudinal comparisons of near-equivalent scales. These age 3 scales are the same as at age 2, except that an additional emotion item has been added at age 3.
csdqccont1/2: conduct problems subscale, from 4 items.
csdqcemot1/2: emotional problems subscale, from 3 items.
csdqchypt1/2: hyperactivity subscale, from 3 items.
csdqcpert1/2: peer problems subscale, from 3 items.
csdqcprot1/2: prosocial subscale, from 5 items.
csdqcbeht1/2: total behaviour problems subscale, from the 13 conduct, emotion, hyperactivity and peer items.
See comments in syntax below for more detailed descriptions.

* SDQ-comparable Behar scales.
* (same as used at age 2, except that another emotion item is added at age 3).
* Derived from Behar items that are very similar to equivalent SDQ items.
*  used at ages 4 and later, to enable longitudinal comparisons.
* Omit age 3 Behar items that are dissimilar to the SDQ.
*  even if they correlate well (used in overall scales instead).
* Conduct subscale from 4 items.
*  Note that cbeh07 (fights) and cbeh29 (bullies) correlate highly.
*  with each other and equate to a single SDQ item.
*  so only use one of them (cbeh29) here so they do not skew the scale.
COMPUTE csdqccont = 4 * MEAN.2(cbeh12, cbeh17, cbeh23, cbeh29).
* Emotion subscale from 3 items.
COMPUTE csdqcemot = 3 * MEAN.2(cbeh10, cbeh14, cbeh44).
* Hyperactivity subscale from 3 items.
COMPUTE csdqchypt = 3 * MEAN.2(cbeh02, cbeh04, cbeh19).
* Peer problem subscale from 3 items.
COMPUTE csdqcpert = 3 * MEAN.3(cbeh08, cbeh11, cbeh31r).
* Prosocial subscale from 5 components.
*  Note that cbeh03 and cbeh18 are in fact identical to SDQ items.
*  Note also that cbeh13, 21 and 25 (helping children who are hurt, ill, upset).
*  correlate highly with each other and equate to a single SDQ item.
*  so only use one of them (cbeh13) here so they do not skew the scale.
COMPUTE csdqcprot = 5 * MEAN.3(cbeh03, cbeh09, cbeh13, cbeh18, cbeh36).
* Total behaviour problems, from 13 conduct/emotion/hyperactivity/peer items.
COMPUTE csdqcbeht = 13 * MEAN.7(cbeh02, cbeh04, cbeh07, cbeh08, 
  cbeh10, cbeh11, cbeh14, cbeh17, cbeh19, cbeh23, cbeh29, cbeh31r, cbeh44).
EXECUTE.
ctbook1/2, cteat1/2, ctloca1/2, ctpron1/2, ctrhym1/2, ctsent1/2, cttalk1/2, ctword1/2

See cbpamus1/2, etc above.

ctempzyg

Zygosity difference score, used in the zygosity algorithm.
The score is scaled to have values between 0 and 1, with higher values representing greater differences between the twins.
Computed from 20 different item and derived variables relating to twin differences. All contributing variables are ordinal, with higher values for greater differences. See comments in syntax for details of the method of computation.

* Compute difference sum, from ordinal variables with higher values = more different.
COMPUTE sumzyg = SUM(czyprof, czyyou, czyhairs, czyhairt, czyeyes, czyears,
 czyteet2, czyold, czyphot2, czyfac, czytyp, czymistp, czymists, czymistr,
 czymistb, czymistf, czymistc, czymistm, czytoget, czypeas).
EXECUTE.
* Determine maximum possible score, depending on number of non-missing.
* responses in the above variables (total is 54 if none missing).
COUNT zyg1 = czyfac czytyp (0 thru 1).
COUNT zyg2 = czyprof czyyou czyteet2 (1 thru 2).
COUNT zyg3 = czyhairs czyhairt czyeyes czyears czyold czyphot2 czymistp
 czymists czymistr czymistb czymistf czymistc czymistm czypeas (1 thru 3).
COUNT zyg4 = czytoget (1 thru 4).
EXECUTE.
COMPUTE zygtot = SUM(zyg1, (2 * zyg2), (3* zyg3), (4 * zyg4)).
EXECUTE.
* Can now re-scale difference score to range 0-1.
* requiring at least half the data to be non-missing.
* (total possible score must be 27 or higher).
NUMERIC ctempzyg (F4.3).
IF (zygtot >= 27) ctempzyg = sumzyg / zygtot.
EXECUTE.
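
The re-scaling method can be sketched in Python as follows. The valid ranges are taken from the COUNT commands above; the dictionary RANGES and the function zygosity_difference are hypothetical names, and missing responses are represented by None or by absent keys.

```python
from typing import Dict, Optional, Tuple

# Valid (lowest, highest) value for each contributing variable.
RANGES: Dict[str, Tuple[int, int]] = {}
RANGES.update({name: (0, 1) for name in ("czyfac", "czytyp")})
RANGES.update({name: (1, 2) for name in ("czyprof", "czyyou", "czyteet2")})
RANGES.update({name: (1, 3) for name in (
    "czyhairs", "czyhairt", "czyeyes", "czyears", "czyold", "czyphot2",
    "czymistp", "czymists", "czymistr", "czymistb", "czymistf", "czymistc",
    "czymistm", "czypeas")})
RANGES["czytoget"] = (1, 4)


def zygosity_difference(responses: Dict[str, Optional[int]]) -> Optional[float]:
    """Sum of answered items divided by the maximum possible for those items,
    or None if the maximum possible is below 27 (half of the full 54)."""
    answered = {n: responses[n] for n in RANGES if responses.get(n) is not None}
    if not answered:
        return None
    sumzyg = sum(answered.values())
    zygtot = sum(RANGES[n][1] for n, v in answered.items()
                 if RANGES[n][0] <= v <= RANGES[n][1])
    if zygtot < 27:
        return None
    return sumzyg / zygtot
```
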
ctfac11/2, ctfac21/2

Standardised composites for twin talk, based on the "How you talk to your twins" items in the parent booklet. See comments in the syntax below for differences between these composites.
The composites are computed from derived twin-specific versions of the items, which are described elsewhere on this page. All composites are standardised on the non-excluded twin sample, filtered using derived variable cexclude (also described elsewhere on this page).

* Talking to twins.
* ----------------.
* Use twin-specific versions of items, as already derived.
* Note these items are already standardised and double entered.
* 1st scale: 3 items.
COMPUTE ctfac11x = MEAN.2(ctpron1, ctsent1, ctword1).
COMPUTE ctfac12x = MEAN.2(ctpron2, ctsent2, ctword2).
EXECUTE.
* 2nd scale: 4 items.
COMPUTE ctfac21x = MEAN.3(ctrhym1, ctbook1, ctloca1, cttalk1).
COMPUTE ctfac22x = MEAN.3(ctrhym2, ctbook2, ctloca2, cttalk2).
EXECUTE.
* Note that cteat1/2 is not used in either scale.
* standardise the scales.
DESCRIPTIVES VARIABLES= ctfac11x (ctfac11) ctfac12x (ctfac12)
 ctfac21x (ctfac21) ctfac22x (ctfac22) /SAVE.
ctloca1/2

See cbpamus1/2, etc above.

ctoleft1/2, ctorigh1/2

See cleft1/2, cright1/2, ctoleft1/2, ctorigh1/2 above.

ctpron1/2, ctrhym1/2, ctsent1/2, cttalk1/2

See cbpamus1/2, etc above.

ctvoc1/2

Transformed version of the vocabulary total score (cvocab, described elsewhere on this page).

* Transformed vocabulary total.
* using reflect square-root function to reduce skew.
* while retaining range of 0 to 100.
COMPUTE ctvoc = 100 * (1 - ((SQRT(101 - cvocab) - 1) / (SQRT(101) - 1)) ).
EXECUTE.
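
To show how the reflected square-root transformation behaves at the ends and the middle of the range, here is a minimal Python sketch of the same formula. The function name transform_vocab is hypothetical.

```python
import math
from typing import Optional


def transform_vocab(cvocab: Optional[float]) -> Optional[float]:
    """Reflect the score (101 - cvocab), take the square root, then reflect
    and re-scale the result so that 0 maps to 0 and 100 maps to 100."""
    if cvocab is None:
        return None
    return 100 * (1 - (math.sqrt(101 - cvocab) - 1) / (math.sqrt(101) - 1))


for raw in (0, 50, 100):
    print(raw, round(transform_vocab(raw), 1))
```

The end points are unchanged, while a raw score of 50 maps to roughly 32: scores are spread out towards the top of the range and compressed towards the bottom.
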
ctword1/2

See cbpamus1/2, etc above.

ctwbage1/2

See cpbage, ctwbage1/2 above.

ctwbagediff

Twin pair age difference (in years) for completion of the twin booklets.
Computed from double-entered derived age variables ctwbage1/2, which are described elsewhere on this page.

* Difference in ages between pair of twins when booklets completed.
COMPUTE ctwbagediff = RND(ABS(ctwbage1 - ctwbage2), 0.1).
EXECUTE.
ctwbLLCage1/2, ctwbLLCdate1/2

See cpbLLCage, etc above.

cuse1/2

Word Use total score. Derived as the sum of all 12 item scores. Each item has scores 0/1, hence the total score has range 0-12.

* Word use total: sum of all 12 item scores.
* Any missing items are treated like zero scores.
COMPUTE cuse = SUM(cwu01, cwu02, cwu03, cwu04, cwu05,
 cwu06, cwu07, cwu08, cwu09, cwu10, cwu11, cwu12).
EXECUTE.
cvocab1/2

Vocabulary total score. Derived as the sum of all 100 item scores. Each item has scores 0/1, hence the total score has range 0-100.
The total score is recoded to missing (as are all the item scores) if the data suggest that the entire measure has been skipped, resulting in defaults of zero for all the items - see comments in syntax for method.

* Vocabulary total score: total from items 1 to 100.
* Any missing items are treated like zero scores.
COMPUTE cvocab = SUM(cvc001, cvc002, cvc003, cvc004, cvc005, cvc006, cvc007, cvc008, cvc009, cvc010, cvc011, cvc012, cvc013, 
 cvc014, cvc015, cvc016, cvc017, cvc018, cvc019, cvc020, cvc021, cvc022, cvc023, cvc024, cvc025, cvc026, cvc027, 
 cvc028, cvc029, cvc030, cvc031, cvc032, cvc033, cvc034, cvc035, cvc036, cvc037, cvc038, cvc039, cvc040, cvc041, 
 cvc042, cvc043, cvc044, cvc045, cvc046, cvc047, cvc048, cvc049, cvc050, cvc051, cvc052, cvc053, cvc054, cvc055, 
 cvc056, cvc057, cvc058, cvc059, cvc060, cvc061, cvc062, cvc063, cvc064, cvc065, cvc066, cvc067, cvc068, cvc069, 
 cvc070, cvc071, cvc072, cvc073, cvc074, cvc075, cvc076, cvc077, cvc078, cvc079, cvc080, cvc081, cvc082, cvc083, 
 cvc084, cvc085, cvc086, cvc087, cvc088, cvc089, cvc090, cvc091, cvc092, cvc093, cvc094, cvc095, cvc096, cvc097, 
 cvc098, cvc099, cvc100).
EXECUTE.

* Default for vocab items is 0, even if missing (entire page skipped).
* so cvocab defaults to 0 which may be anomalous.
* There are 234 twins having cvocab=0 in the raw data.
* Most of these appear not to be anomalous: 189 have cuse=0 and 225 have ccomplx=0.
* There are no such cases where Word Use and Sentence Complexity measures (on next booklet page).
* are entirely missing.
* For the 9 cases with cvocab = 0 and both ccomplx > 0 and cuse > 0 .
* recode the vocab items and score to missing on the assumption that they have functioning.
* word and sentence use so the zero vocabulary score could well be an anomaly.
DO IF (cuse > 0 & ccomplx > 0 & cvocab = 0).
 RECODE cvc001 cvc002 cvc003 cvc004 cvc005 cvc006 cvc007 cvc008 cvc009 cvc010 cvc011 cvc012 cvc013 cvc014 
    cvc015 cvc016 cvc017 cvc018 cvc019 cvc020 cvc021 cvc022 cvc023 cvc024 cvc025 cvc026 cvc027 cvc028 
    cvc029 cvc030 cvc031 cvc032 cvc033 cvc034 cvc035 cvc036 cvc037 cvc038 cvc039 cvc040 cvc041 cvc042 
    cvc043 cvc044 cvc045 cvc046 cvc047 cvc048 cvc049 cvc050 cvc051 cvc052 cvc053 cvc054 cvc055 cvc056 
    cvc057 cvc058 cvc059 cvc060 cvc061 cvc062 cvc063 cvc064 cvc065 cvc066 cvc067 cvc068 cvc069 cvc070 
    cvc071 cvc072 cvc073 cvc074 cvc075 cvc076 cvc077 cvc078 cvc079 cvc080 cvc081 cvc082 cvc083 cvc084 
    cvc085 cvc086 cvc087 cvc088 cvc089 cvc090 cvc091 cvc092 cvc093 cvc094 cvc095 cvc096 cvc097 cvc098 
    cvc099 cvc100 cvocab (ELSE=SYSMIS).
END IF.
EXECUTE.
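
The rule for treating a zero vocabulary total as anomalous can be summarised in a short Python sketch. The function name vocab_total is hypothetical, missing values are represented by None, and only the total score is returned (the item scores, which are also recoded in the syntax above, are not modelled).

```python
from typing import Optional, Sequence


def vocab_total(items: Sequence[Optional[int]], cuse: Optional[int],
                ccomplx: Optional[int]) -> Optional[int]:
    """Sum the item scores, treating missing items as zero. Return None
    when the total is 0 but both Word Use and Sentence Complexity are
    above 0, since the zero is then likely to be a skipped page."""
    valid = [v for v in items if v is not None]
    if not valid:
        return None
    total = sum(valid)
    has_language = cuse is not None and ccomplx is not None and cuse > 0 and ccomplx > 0
    if total == 0 and has_language:
        return None
    return total


print(vocab_total([0] * 100, cuse=3, ccomplx=2))   # treated as missing
print(vocab_total([0] * 100, cuse=0, ccomplx=2))   # kept as a genuine zero
```
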
cyhfac1z, cyhfac2z

See cchatot, cyhfac1z, cyhfac2z above.