TEDS Data Dictionary

Derived Variables in the 21 Year Dataset

This page gives a listing of derived variables in the 21 Year dataset, in alphabetical order of variable name. For each variable, a short written description is followed by the SPSS syntax (in a box) that was used to derive the variable.

This page does not include descriptions of background variables that are derived from other sources and that are included in the 21 Year dataset. For information about such variables, see pages describing background variables, exclusions and scrambled IDs.

Most of the twin-specific variables were derived prior to double entering the dataset. Hence the variable names used in the syntax often lack the endings (1 or 2) used for the final double entered variables.

Variable name prefixes indicate the studies from which they were derived:

u1p: TEDS21 phase 1 parent questionnaire
u1c: TEDS21 phase 1 twin questionnaire
u2c: TEDS21 phase 2 twin questionnaire
ucg: G-game twin tests
ucv1: Covid phase 1 twin questionnaire
ucv2: Covid phase 2 twin questionnaire
ucv3: Covid phase 3 twin questionnaire
ucv4: Covid phase 4 twin questionnaire

Definitions of derived variables

Listed alphabetically

u1cactvm1/2, ucv1actvm1/2, ucv2actvm1/2, ucv3actvm1/2, ucv4actvm1/2

Physical Activity scale, derived from all 3 items of the measure in the TEDS21 phase 1 twin questionnaire (u1c) and the covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) twin questionnaires.
This scale is derived as a weighted mean, with weightings 3 for item 1 (strenuous activity), 2 for item 2 (moderate activity) and 1 for item 3 (mild activity). Each item has integer response values 1-5, and the weighted sum is divided by the sum of the weightings (6) so that the scale also has values from 1 to 5. All three items are required to be non-missing for this scale to be computed, because any missing item would complicate the calculation of weightings.

* Compute as a weighted mean, with weightings 3, 2 and 1 respectively.
* for items 1 (strenuous), 2 (moderate) and 3 (mild).
* To keep things simple, require all three items to be non-missing.
COMPUTE u1cactvm = (SUM.3((3 * u1cactv1), (2 * u1cactv2), u1cactv3)) / 6.
COMPUTE ucv1actvm = (SUM.3((3 * ucv1actv1), (2 * ucv1actv2), ucv1actv3)) / 6.
COMPUTE ucv2actvm = (SUM.3((3 * ucv2actv1), (2 * ucv2actv2), ucv2actv3)) / 6.
COMPUTE ucv3actvm = (SUM.3((3 * ucv3actv1), (2 * ucv3actv2), ucv3actv3)) / 6.
COMPUTE ucv4actvm = (SUM.3((3 * ucv4actv1), (2 * ucv4actv2), ucv4actv3)) / 6.
EXECUTE.
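
As an illustration only, the weighted mean can be sketched in Python (which is not the language used to build the dataset). The function name and the use of None to stand for a missing response are assumptions made for this sketch.

```python
def activity_scale(strenuous, moderate, mild):
    """Weighted mean of three 1-5 items with weightings 3, 2 and 1.

    Returns None if any item is missing (None), mirroring SUM.3 in the
    SPSS syntax, which requires all three arguments to be valid.
    """
    items = (strenuous, moderate, mild)
    if any(value is None for value in items):
        return None
    # Dividing by the sum of the weightings (6) keeps the result in the 1-5 range.
    return (3 * strenuous + 2 * moderate + 1 * mild) / 6


lowest = activity_scale(1, 1, 1)         # all items at the minimum response
highest = activity_scale(5, 5, 5)        # all items at the maximum response
incomplete = activity_scale(4, None, 2)  # one item missing, so no scale value
```

Because the weightings sum to 6, responses of all 1s or all 5s map back to 1 and 5 respectively.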

u1cage1/2, u2cage1/2, u1page

Age of twin (in decimal years) when various TEDS21 data components were completed (online) or returned (on paper):
u1page: phase 1 parent questionnaire;
u1cage1/2: phase 1 twin questionnaire;
u2cage1/2: phase 2 twin questionnaire.
Derived from item variables representing relevant dates (start dates for electronic data, or logged return dates for paper questionnaires). Variable aonsdob is the twin birth date. These date variables are not retained in the dataset.

* For TEDS21, we need the best estimate of date according to return method.
* For web/app users use the start dates.
IF (ANY(u1psource, 1, 2, 3)) u1pdate = u1pstart.
IF (ANY(u1csource, 1, 2, 3)) u1cdate = u1cstart.
IF (ANY(u2csource, 1, 2, 3)) u2cdate = u2cstart.
EXECUTE.
* For paper users use the return date.
IF (u1psource = 4) u1pdate = u1prdate.
IF (u1csource = 4) u1cdate = u1crdate1.
IF (u2csource = 4) u2cdate = u2crdate1.
EXECUTE.

* Now derive the ages.
COMPUTE u1page = RND((DATEDIFF(u1pdate, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE u1cage = RND((DATEDIFF(u1cdate, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE u2cage = RND((DATEDIFF(u2cdate, aonsdob, "days")) / 365.25, 0.1) .
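
The two steps above (choose the best date, then convert to decimal years) might be sketched in Python as follows. The function names are hypothetical, None stands for a missing value, and the source codes 1-3 (web/app) and 4 (paper) are taken from the syntax above. Python's round() may differ from SPSS RND in rare exact ties, so this is an approximation rather than a replica.

```python
from datetime import date


def best_date(source, start_date, return_date):
    """Pick the start date for web/app data (sources 1-3) or the
    logged return date for paper data (source 4)."""
    if source in (1, 2, 3):
        return start_date
    if source == 4:
        return return_date
    return None


def decimal_age(birth_date, activity_date):
    """Age in years to one decimal place, using 365.25 days per year."""
    if birth_date is None or activity_date is None:
        return None
    days = (activity_date - birth_date).days
    return round(days / 365.25, 1)


# Hypothetical twin born 1 January 1995, web questionnaire started 1 July 2016.
when = best_date(1, date(2016, 7, 1), None)
age = decimal_age(date(1995, 1, 1), when)
```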

u1cbaqphym1/2, u1cbaqverm1/2, u1cbaqangm1/2, u1cbaqm1/2

BAQ Aggression subscales, and an overall scale, derived from items of the measure in the phase 1 twin questionnaire. The subscales are for physical aggression (u1cbaqphym), verbal aggression (u1cbaqverm) and anger (u1cbaqangm) while the overall scale is u1cbaqm.
Each subscale is a mean of either two or three of the items, while the overall scale is a mean of all 8 items, in each case requiring at least half of the items to be non-missing. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Physical aggression: items 1-3.
COMPUTE u1cbaqphym = MEAN.2(u1cbaq1, u1cbaq2, u1cbaq3).
* Verbal aggression: items 4-6.
COMPUTE u1cbaqverm = MEAN.2(u1cbaq4, u1cbaq5, u1cbaq6).
* Anger: items 7-8.
COMPUTE u1cbaqangm = MEAN.1(u1cbaq7, u1cbaq8).
* Total scale: all 8 items.
COMPUTE u1cbaqm = MEAN.4(u1cbaq1, u1cbaq2, u1cbaq3,
 u1cbaq4, u1cbaq5, u1cbaq6, u1cbaq7, u1cbaq8).
EXECUTE.
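
Many scales on this page use the SPSS MEAN.n function, which returns the mean of the non-missing arguments provided that at least n of them are non-missing. A minimal Python sketch of that behaviour, applied to the BAQ scales, is shown here. The helper name mean_n, the use of None for missing values and the example responses are all assumptions made for illustration.

```python
def mean_n(values, min_valid):
    """Mean of the non-missing values, or None if fewer than
    min_valid values are present (analogous to SPSS MEAN.n)."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


# Hypothetical responses to the 8 BAQ items (None = missing).
baq = [2, 3, None, 1, 1, 2, 4, None]

physical = mean_n(baq[0:3], 2)  # items 1-3, at least 2 required
verbal = mean_n(baq[3:6], 2)    # items 4-6, at least 2 required
anger = mean_n(baq[6:8], 1)     # items 7-8, at least 1 required
overall = mean_n(baq, 4)        # all 8 items, at least 4 required
```

In this example the anger subscale is still computed from the single available item, because one of two items meets the "at least half" rule.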

u1cbmi1/2

Twin BMI, measured in kilograms per square metre. Derived from twin heights and weights (item variables).

* Height variable is in centimetres.
* We want BMI in units of kilograms per square metre.
* So include a scaling factor of 10000 in the BMI calculation.
COMPUTE u1cbmi = 10000 * u1cwtkg / (u1chtcm * u1chtcm).
EXECUTE.
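
The scaling factor arises because height is recorded in centimetres: dividing by (height in cm / 100) squared is the same as multiplying by 10000. A short Python sketch, with a hypothetical function name and None for missing values, shows the equivalence. The guard against non-positive heights is an addition for this sketch and is not part of the syntax above.

```python
def bmi(weight_kg, height_cm):
    """BMI in kilograms per square metre from weight (kg) and height (cm)."""
    if weight_kg is None or height_cm is None or height_cm <= 0:
        return None
    return 10000 * weight_kg / (height_cm * height_cm)


# Hypothetical twin: 70 kg and 175 cm.
example = bmi(70, 175)
# The same calculation with height converted to metres first.
check = 70 / (1.75 ** 2)
```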

u1cbsaem1/2, u1cbsagm1/2

BSA Environment (u1cbsaem) and Government (u1cbsagm) scales, derived from items of the BSA measure(s) in the phase 1 twin questionnaire.
The scales are means of 6 and 5 items respectively, which are all the items included in the questionnaire. Some of the items are reversed for the Environment scale. At least 3 items are required to be non-missing for each scale to be computed. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Attitudes to environment: mean from all 6 items (some reversed).
COMPUTE u1cbsaem = MEAN.3(u1cbsae1, u1cbsae2r, u1cbsae3r, u1cbsae4r, u1cbsae5r, u1cbsae6r).
* Attitudes to government: mean from all 5 items.
COMPUTE u1cbsagm = MEAN.3(u1cbsag1, u1cbsag2, u1cbsag3, u1cbsag4, u1cbsag5).
EXECUTE.

u1cchaost1/2

Chaos total scale, from the twin phase 1 questionnaire, derived from all 6 available items of the Chaos measure (reversed where necessary).
Each item has integer response values 0-2, hence the scale has a range of values from 0 to 12 because it is computed as the mean multiplied by the number of component items (6). At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cchaost = 6 * MEAN.3(u1cchaos1r, u1cchaos2, u1cchaos3, u1cchaos4r, u1cchaos5, u1cchaos6r).
EXECUTE.
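
The 'total' scales on this page are computed as a mean multiplied by the number of items, so a missing item is effectively replaced by the mean of the items that were answered. A Python sketch of this approach follows. The function name, the use of None for missing values and the example responses are assumptions for illustration only.

```python
def prorated_total(values, min_valid):
    """Mean of the non-missing items multiplied by the total number
    of items, or None if fewer than min_valid items are present."""
    valid = [v for v in values if v is not None]
    if len(valid) < min_valid:
        return None
    return len(values) * sum(valid) / len(valid)


# Hypothetical Chaos responses (0-2, already reversed where necessary).
complete = prorated_total([2, 1, 0, 0, 2, 1], 3)
one_missing = prorated_total([2, 1, None, 0, 2, 1], 3)
too_sparse = prorated_total([2, None, None, None, None, 1], 3)
```

One consequence is that a total scale can take non-integer values when some items are missing, even though every item response is an integer.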

u1ccommm1/2, ucv1commm1/2, ucv2commm1/2, ucv3commm1/2

Community scale, derived from all 5 items (reversed where necessary) of the measure in the TEDS21 phase 1 twin questionnaire (u1c) and in the covid twin phase 1 (ucv1) and phase 2 (ucv2) and phase 3 (ucv3) questionnaires. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1ccommm = MEAN.3(u1ccomm1, u1ccomm2r, u1ccomm3, u1ccomm4r, u1ccomm5).
COMPUTE ucv1commm = MEAN.3(ucv1comm1, ucv1comm2r, ucv1comm3, ucv1comm4r, ucv1comm5).
COMPUTE ucv2commm = MEAN.3(ucv2comm1, ucv2comm2r, ucv2comm3, ucv2comm4r, ucv2comm5).
COMPUTE ucv3commm = MEAN.3(ucv3comm1, ucv3comm2r, ucv3comm3, ucv3comm4r, ucv3comm5).
EXECUTE.

u1cdadrm1/2, u1cmumrm1/2, u1ctwnrm1/2

Scales for relationships with twin (u1ctwnrm), mother (u1cmumrm) and father (u1cdadrm), derived from items for these three closely-related measures in the phase 1 twin questionnaire. Each scale is a mean of all 5 items in the respective measure, requiring at least half the items to be non-missing. In the mother and father scales, item 5 is reversed in the syntax. Every item has response values 1-5, hence each scale has the same range as it is computed as a mean.

* Twin relationships.
COMPUTE u1ctwnrm = MEAN.3(u1ctwnr1, u1ctwnr2, u1ctwnr3, u1ctwnr4, u1ctwnr5).
* Mother relationships: reverse the fifth item.
COMPUTE u1cmumrm = MEAN.3(u1cmumr1, u1cmumr2, u1cmumr3, u1cmumr4, (6 - u1cmumr5)).
* Father relationships: reverse the fifth item.
COMPUTE u1cdadrm = MEAN.3(u1cdadr1, u1cdadr2, u1cdadr3, u1cdadr4, (6 - u1cdadr5)).
EXECUTE.

u1cdevmob1/2, u1cdevwdth1/2

Device categories used for the TEDS21 twin phase 1 questionnaire, if completed electronically.
u1cdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
u1cdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web/app server, that are not retained in the dataset. Mobile devices could be categorised from both app and web data, using different methods. Screen sizes were only recorded in the web data, not the app data.

* Mobile devices: CMS data.
* ------------------------.
* The raw CMS variable PlatformType has values 'Web' or 'Mobile'.
* so we can assume 'Mobile' value refers to mobile devices.
* while 'Web' in most cases probably means a web browser used on a laptop/desktop.
RECODE PlatformType ('Web'=0) ('Mobile'=1)
INTO u1cdevmob.

* Mobile devices: web backup.
* --------------------------.
* Use substrings of the consenttechuseragent string to categorise broad device types.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(consenttechuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(consenttechuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(consenttechuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
* (in rare cases of Windows phones with Android installed, this supersedes 'Windows' above).
IF (CHAR.INDEX(consenttechuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(consenttechuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows').
IF (CHAR.INDEX(consenttechuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(consenttechuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* 'Mobile' is another common substring, but it is not needed here because it always.
* indicates mobile phones that are already categorised by other substrings above.

* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO u1cdevmob.
EXECUTE.

* Screen width: web backup only.
* ----------------------------.
* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (consenttechscrwidth >= consenttechscrheight) screenwidth = consenttechscrwidth.
IF (consenttechscrwidth < consenttechscrheight) screenwidth = consenttechscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO u1cdevwdth.
EXECUTE.
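
The web-data logic above relies on later IF commands overwriting earlier ones, so that (for example) 'Android' or 'Tablet' takes precedence over 'Windows'. A Python sketch of the same ordering is given here. The function names and the example user-agent strings are hypothetical, None stands for a missing value, and screen dimensions are assumed to be whole numbers of pixels.

```python
# Substrings in the order tested by the syntax; later matches overwrite earlier ones.
DEVICE_SUBSTRINGS = [
    ("Windows", 1), ("Macintosh", 2), ("X11", 3),  # laptops or desktops
    ("iPhone", 4), ("Android", 5),                 # mobile phones
    ("iPad", 6), ("Tablet", 7),                    # tablets
]


def device_type(user_agent):
    """Return the device type code (1-7), or None if nothing matches."""
    code = None
    for substring, value in DEVICE_SUBSTRINGS:
        if substring in user_agent:  # case-sensitive, like CHAR.INDEX
            code = value
    return code


def is_mobile(code):
    """Flag mobile devices: codes 1-3 give 0, codes 4-7 give 1."""
    if code is None:
        return None
    return 1 if code >= 4 else 0


def width_category(width, height):
    """Categorise the larger screen dimension: 1=small, 2=medium, 3=large."""
    if width is None or height is None:
        return None
    widest = max(width, height)
    if widest <= 767:
        return 1
    if widest <= 1199:
        return 2
    return 3


# Hypothetical user-agent strings, shortened for illustration.
desktop = is_mobile(device_type("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
phone = is_mobile(device_type("Mozilla/5.0 (Linux; Android 10; Mobile)"))
```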

u1cduration1/2

Duration of the phase 1 twin questionnaire, measured in decimal minutes, derived as the difference between the start date-time and the end date-time. The same method is used to derive this variable in the CMS and backup data, but the duration cannot be measured for paper booklet data. The start and end date-time variables are not retained in the dataset.

* Derive a duration variable, as the difference between the start and end date-times.
* This will not necessarily match the total time variable derived above.
* because the 9 questionnaire sections may have been completed at different dates or times.
* with pauses in between; and the overall duration will include consent, address check, etc.
* Only derive for cases where the whole thing was finished.
* Derive as number of seconds divided by 60 to get decimal minutes.
IF (u1cstat = 2) u1cduration = DATEDIFF(u1cend, u1cstart, 'seconds') / 60.
EXECUTE.
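
As a rough Python equivalent of the calculation above, the duration is the elapsed time in seconds divided by 60, and is only computed when the status variable shows that the questionnaire was finished (value 2 in the syntax). The function name and the example timestamps are hypothetical, and None stands for a missing value.

```python
from datetime import datetime


def duration_minutes(status, start, end):
    """Elapsed time between start and end in decimal minutes,
    computed only for finished questionnaires (status 2)."""
    if status != 2 or start is None or end is None:
        return None
    return (end - start).total_seconds() / 60


# Hypothetical finished questionnaire taking 42 minutes 30 seconds.
start = datetime(2017, 3, 1, 10, 0, 0)
end = datetime(2017, 3, 1, 10, 42, 30)
finished = duration_minutes(2, start, end)
unfinished = duration_minutes(1, start, end)
```

Because this is a simple difference between two timestamps, any pauses between sections are included in the result.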

u1ceatsbint1/2, u1ceatsbodt1/2

Eating disorder symptoms subscales.
Derived from 11 of the 12 items in the Eating Disorder Symptoms measure in the twin phase 1 questionnaire (item 11 is not included in either scale). The subscales are for binge-eating-related symptoms (u1ceatsbint) and for preoccupation with body image (u1ceatsbodt), based on 3 and 8 items respectively.
Each item has integer response values 0-5, hence each subscale has a range of values from 0 to (5 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Binge-eating symptoms: items 1, 5, 10.
COMPUTE u1ceatsbint = 3 * MEAN.2(u1ceats01, u1ceats05, u1ceats10).
* Preoccupation with bodily size: all other items except 11.
COMPUTE u1ceatsbodt = 8 * MEAN.4(u1ceats02, u1ceats03,
 u1ceats04, u1ceats06, u1ceats07, u1ceats08, u1ceats09, u1ceats12).
EXECUTE.

u1cedat

Standardised educational attainment composite for twins, derived from variables in the phase 1 twin SES questionnaire. The two components are u1chqualp (the probable highest level of qualification after current study, which is a derived variable described elsewhere on this page) and u1cdegr1 (the degree classification for those twins who have already graduated). Each component, and the final composite, is standardised by cohort to eliminate significant cohort differences. Coding is such that higher values indicate higher SES. The derivation is explained by comments in the syntax. Standardisation of u1cdegr1 before taking the mean effectively gives twins who are still taking degrees the same level as those who have completed degrees and achieved the mean classification. Note that u1cdegr1 (after standardisation) is given half-weighting in the mean; as a result, twins who have completed degrees with the lowest classifications remain at or above the levels of twins who have taken A-levels but not degrees, approximately preserving an appropriate rank ordering of qualifications.

* Derive twin SES purely based on educational level.
* All components show significant cohort effects.
* so start by standardising within each cohort.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES=u1chqualp u1cdegr1
  /SAVE.

SPLIT FILE OFF.

* Twin Education composite is a mean of 2 standardised components, unequally weighted.
* (1) probable educational level as derived above and (2) degree classification.
* which is given half weighting; this weighting is designed to retain a higher level.
* for those with low-classification degrees than for those with A-levels.
* Note also that degree classification is missing for over half the twins, namely.
* those without degrees and those still studying towards degrees.
COMPUTE twineduses = MEAN(Zu1chqualp, (Zu1cdegr1 / 2)).
EXECUTE.

* Re-standardise the new composite to correct the SD to 1.
* and to ensure cohort differences are ironed out in the mean.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES= twineduses (u1cedat)
  /SAVE.

SPLIT FILE OFF.
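
The three stages above (standardise within cohort, take a mean with degree classification at half weight, then re-standardise within cohort) could be sketched in Python roughly as follows. This is not the code used to build the dataset. The function names and the example data are hypothetical, None stands for a missing value, and the sketch assumes that the saved z-scores are based on the sample standard deviation.

```python
from statistics import mean, stdev


def zscores_by_group(values, groups):
    """Standardise values within each group; missing values stay None."""
    result = [None] * len(values)
    for group in set(groups):
        idx = [i for i, (v, g) in enumerate(zip(values, groups))
               if g == group and v is not None]
        if len(idx) < 2:
            continue  # cannot estimate a standard deviation
        present = [values[i] for i in idx]
        m, s = mean(present), stdev(present)
        if s == 0:
            continue  # no variation within this group
        for i in idx:
            result[i] = (values[i] - m) / s
    return result


def education_composite(hqualp, degr, cohort):
    """Cohort-standardised mean of z(hqualp) and half of z(degr)."""
    z_qual = zscores_by_group(hqualp, cohort)
    z_degr = zscores_by_group(degr, cohort)
    raw = []
    for q, d in zip(z_qual, z_degr):
        half_d = None if d is None else d / 2
        # Like MEAN with no suffix: use whichever components are present.
        parts = [x for x in (q, half_d) if x is not None]
        raw.append(mean(parts) if parts else None)
    return zscores_by_group(raw, cohort)


# Hypothetical data for two cohorts of four twins each.
cohort = [1, 1, 1, 1, 2, 2, 2, 2]
hqualp = [5, 7, 9, 6, 4, 8, 9, 7]
degr = [None, 2, 4, None, None, 3, 4, 1]
composite = education_composite(hqualp, degr, cohort)
```

For twins with no degree classification the composite is based on the qualification level alone, which is how the syntax handles the missing component.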

u1cfconm1/2

Future Consequences scale, derived from all 4 items of the measure in the phase 1 twin questionnaire. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cfconm = MEAN.2(u1cfcon1, u1cfcon2, u1cfcon3, u1cfcon4).
EXECUTE.

u1cfinam1/2

CLAS Financial Wellbeing scale, derived from all 5 items of the measure in the phase 1 twin questionnaire (reversed where necessary). Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cfinam = MEAN.3(u1cfina1, u1cfina2, u1cfina3, u1cfina4, u1cfina5r).
EXECUTE.

u1cfprdm1/2

Financial Products familiarity scale, derived from all 13 items of the measure in the phase 1 twin questionnaire. Each item has integer response values 0-4, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cfprdm = MEAN.7(u1cfprd01, u1cfprd02, u1cfprd03,
 u1cfprd04, u1cfprd05, u1cfprd06, u1cfprd07, u1cfprd08, u1cfprd09, 
 u1cfprd10, u1cfprd11, u1cfprd12, u1cfprd13).
EXECUTE.

u1cgoalfult1/2, u1cgoalrelt1/2, ucv1goalfult1/2, ucv1goalrelt1/2, ucv2goalfult1/2, ucv2goalrelt1/2, ucv3goalfult1/2, ucv3goalrelt1/2, ucv4goalfult1/2, ucv4goalrelt1/2

Goals subscales, from the twin TEDS21 phase 1 questionnaire (u1c) and from the twin covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. The measure was the same in all questionnaires. The relationships (goalrelt) and fulfilment (goalfult) subscales are derived from 5 and 4 items respectively. Each item has integer response values 1-5, and each subscale is a 'total' computed by multiplying the mean by the number of component items, hence the subscales have ranges of 5 to 25 and 4 to 20 respectively. At least half the items are required to be non-missing for each subscale to be computed.

* Relationships subscale (5 items).
COMPUTE u1cgoalrelt = 5 * MEAN.3(u1cgoal1, u1cgoal3, u1cgoal4, u1cgoal5, u1cgoal8).
COMPUTE ucv1goalrelt = 5 * MEAN.3(ucv1goal1, ucv1goal3, ucv1goal4, ucv1goal5, ucv1goal8).
COMPUTE ucv2goalrelt = 5 * MEAN.3(ucv2goal1, ucv2goal3, ucv2goal4, ucv2goal5, ucv2goal8).
COMPUTE ucv3goalrelt = 5 * MEAN.3(ucv3goal1, ucv3goal3, ucv3goal4, ucv3goal5, ucv3goal8).
COMPUTE ucv4goalrelt = 5 * MEAN.3(ucv4goal1, ucv4goal3, ucv4goal4, ucv4goal5, ucv4goal8).
* Fulfilment subscale (4 items).
COMPUTE u1cgoalfult = 4 * MEAN.2(u1cgoal2, u1cgoal6, u1cgoal7, u1cgoal9).
COMPUTE ucv1goalfult = 4 * MEAN.2(ucv1goal2, ucv1goal6, ucv1goal7, ucv1goal9).
COMPUTE ucv2goalfult = 4 * MEAN.2(ucv2goal2, ucv2goal6, ucv2goal7, ucv2goal9).
COMPUTE ucv3goalfult = 4 * MEAN.2(ucv3goal2, ucv3goal6, ucv3goal7, ucv3goal9).
COMPUTE ucv4goalfult = 4 * MEAN.2(ucv4goal2, ucv4goal6, ucv4goal7, ucv4goal9).
EXECUTE.

u1chqualp1/2

Probable highest level of educational qualification for twins, once current studies (if any) have been completed. Based on two variables from the phase 1 twin questionnaire: (1) u1chqual, the current highest qualification level; and (2) u1ccqual, the qualification level towards which twins are currently studying (if applicable). Both items are ordinal with the same 1-11 coding, and the derived variable has the same coding. Comments in the syntax below explain the derivation.

* Enhance the 'highest educational qualification level' variable.
* to get a better measure of the 'probable' highest educational level.
* for those who are still studying.

* Default value is current highest level.
COMPUTE u1chqualp = u1chqual.
EXECUTE.
* If missing (generally because twin specified 'other' or 'overseas' qualifications).
* and if currently studying, substitute the value of u1ccqual if present.
IF (SYSMIS(u1chqual) & ~SYSMIS(u1ccqual)) u1chqualp = u1ccqual.
EXECUTE.
* If currently studying towards a higher level than is currently held.
* (if u1ccqual > u1chqual) then use the higher level.
* There are a few apparently unrealistic jumps in level here, but they are very few.
* in number and it's simplest to take both responses at face value.
IF (u1ccqual > u1chqual) u1chqualp = u1ccqual.
EXECUTE.
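
The three rules in the syntax above can be summarised in a short Python sketch. The function name is hypothetical and None stands for a missing value. Both arguments use the same ordinal 1-11 coding as the questionnaire items.

```python
def probable_highest_qual(hqual, cqual):
    """Probable highest qualification level once current study is complete.

    hqual: current highest qualification level (or None)
    cqual: level currently being studied towards (or None)
    """
    # Default value is the current highest level.
    probable = hqual
    # If that is missing, substitute the level being studied for, if present.
    if hqual is None and cqual is not None:
        probable = cqual
    # If studying towards a higher level than currently held, use the higher level.
    if hqual is not None and cqual is not None and cqual > hqual:
        probable = cqual
    return probable


not_studying = probable_highest_qual(6, None)
studying_higher = probable_highest_qual(6, 9)
studying_lower = probable_highest_qual(9, 6)
```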

u1cLLCage1/2, u1cLLCdate1/2, u1pLLCage1/2, u1pLLCdate1/2, u2cLLCage1/2, u2cLLCdate1/2, ucgLLCage1/2, ucgLLCdate1/2, ucv1LLCage1/2, ucv1LLCdate1/2, ucv2LLCage1/2, ucv2LLCdate1/2, ucv3LLCage1/2, ucv3LLCdate1/2, ucv4LLCage1/2, ucv4LLCdate1/2

Age and date variables derived for use in datasets in the LLC TRE (but not to be used in other datasets).
Ages and dates are derived for TEDS21 phase 1 parent ('u1p') and twin ('u1c') and phase 2 twin ('u2c'); the g-game tests ('ucg'), and covid questionnaire phases 1, 2, 3 and 4 ('ucv1', 'ucv2', 'ucv3', 'ucv4' respectively).
The LLC date variables contain only the month and year, not the day, as a means of reducing identifiability. The date variables are strings formatted as 'yyyy-mm'. These LLC dates are designed to enable the TEDS measures to be placed in a time sequence with NHS medical diagnosis dates in the data in the TRE.
The LLC age variables are integers measuring the number of months between birth and the given TEDS activity, consistent with the matching LLC date variables.
Variable aonsdob is the twin birth date - the raw date variables are not retained in the dataset.

* For TEDS21, we first need the best estimate of date according to return method.
NUMERIC u1pdate u1cdate u2cdate (EDATE11).
* For web/app users use the start dates.
IF (ANY(u1psource, 1, 2, 3)) u1pdate = u1pstart.
IF (ANY(u1csource, 1, 2, 3)) u1cdate = u1cstart.
IF (ANY(u2csource, 1, 2, 3)) u2cdate = u2cstart.
EXECUTE.
* For paper users use the return date.
IF (u1psource = 4) u1pdate = u1prdate.
IF (u1csource = 4) u1cdate = u1crdate1.
IF (u2csource = 4) u2cdate = u2crdate1.
EXECUTE.

* Now extract year and month as temp variables, from birth date and activity dates.
COMPUTE birthyear = XDATE.YEAR(aonsdob).
COMPUTE birthmonth = XDATE.MONTH(aonsdob).
COMPUTE u1pyear = XDATE.YEAR(u1pdate).
COMPUTE u1pmonth = XDATE.MONTH(u1pdate).
COMPUTE u1cyear = XDATE.YEAR(u1cdate).
COMPUTE u1cmonth = XDATE.MONTH(u1cdate).
COMPUTE u2cyear = XDATE.YEAR(u2cdate).
COMPUTE u2cmonth = XDATE.MONTH(u2cdate).
COMPUTE ucgyear = XDATE.YEAR(ucgconstdt).
COMPUTE ucgmonth = XDATE.MONTH(ucgconstdt).
COMPUTE ucv1year = XDATE.YEAR(ucv1constdt).
COMPUTE ucv1month = XDATE.MONTH(ucv1constdt).
COMPUTE ucv2year = XDATE.YEAR(ucv2constdt).
COMPUTE ucv2month = XDATE.MONTH(ucv2constdt).
COMPUTE ucv3year = XDATE.YEAR(ucv3constdt).
COMPUTE ucv3month = XDATE.MONTH(ucv3constdt).
COMPUTE ucv4year = XDATE.YEAR(ucv4constdt).
COMPUTE ucv4month = XDATE.MONTH(ucv4constdt).
EXECUTE.

* The agreed LLC date format is a string yyyy-mm (nominal by default for strings).
* adding '0' where necessary for two-digit months.
STRING u1pLLCdate u1cLLCdate u2cLLCdate ucgLLCdate ucv1LLCdate ucv2LLCdate ucv3LLCdate ucv4LLCdate (A7).
IF (u1pmonth < 10) u1pLLCdate = CONCAT(STRING(u1pyear, F4), '-0', STRING(u1pmonth, F1)).
IF (u1pmonth >= 10) u1pLLCdate = CONCAT(STRING(u1pyear, F4), '-', STRING(u1pmonth, F2)).
IF (u1cmonth < 10) u1cLLCdate = CONCAT(STRING(u1cyear, F4), '-0', STRING(u1cmonth, F1)).
IF (u1cmonth >= 10) u1cLLCdate = CONCAT(STRING(u1cyear, F4), '-', STRING(u1cmonth, F2)).
IF (u2cmonth < 10) u2cLLCdate = CONCAT(STRING(u2cyear, F4), '-0', STRING(u2cmonth, F1)).
IF (u2cmonth >= 10) u2cLLCdate = CONCAT(STRING(u2cyear, F4), '-', STRING(u2cmonth, F2)).
IF (ucgmonth < 10) ucgLLCdate = CONCAT(STRING(ucgyear, F4), '-0', STRING(ucgmonth, F1)).
IF (ucgmonth >= 10) ucgLLCdate = CONCAT(STRING(ucgyear, F4), '-', STRING(ucgmonth, F2)).
IF (ucv1month < 10) ucv1LLCdate = CONCAT(STRING(ucv1year, F4), '-0', STRING(ucv1month, F1)).
IF (ucv1month >= 10) ucv1LLCdate = CONCAT(STRING(ucv1year, F4), '-', STRING(ucv1month, F2)).
IF (ucv2month < 10) ucv2LLCdate = CONCAT(STRING(ucv2year, F4), '-0', STRING(ucv2month, F1)).
IF (ucv2month >= 10) ucv2LLCdate = CONCAT(STRING(ucv2year, F4), '-', STRING(ucv2month, F2)).
IF (ucv3month < 10) ucv3LLCdate = CONCAT(STRING(ucv3year, F4), '-0', STRING(ucv3month, F1)).
IF (ucv3month >= 10) ucv3LLCdate = CONCAT(STRING(ucv3year, F4), '-', STRING(ucv3month, F2)).
IF (ucv4month < 10) ucv4LLCdate = CONCAT(STRING(ucv4year, F4), '-0', STRING(ucv4month, F1)).
IF (ucv4month >= 10) ucv4LLCdate = CONCAT(STRING(ucv4year, F4), '-', STRING(ucv4month, F2)).
EXECUTE.

* The agreed LLC age variable is in integer months.
* and it must agree with the birth and booklet year/month variables that will be available in the LLC.
COMPUTE u1pLLCage = (u1pmonth + (u1pyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE u1cLLCage = (u1cmonth + (u1cyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE u2cLLCage = (u2cmonth + (u2cyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucgLLCage = (ucgmonth + (ucgyear * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv1LLCage = (ucv1month + (ucv1year * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv2LLCage = (ucv2month + (ucv2year * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv3LLCage = (ucv3month + (ucv3year * 12)) - (birthmonth + (birthyear * 12)).
COMPUTE ucv4LLCage = (ucv4month + (ucv4year * 12)) - (birthmonth + (birthyear * 12)).
EXECUTE.
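
The LLC date and age derivations use only the year and month of each date, which a short Python sketch may make clearer. The function names and the example dates are hypothetical, and None stands for a missing value.

```python
from datetime import date


def llc_date(activity_date):
    """Format a date as a 'yyyy-mm' string, dropping the day."""
    if activity_date is None:
        return None
    return f"{activity_date.year:04d}-{activity_date.month:02d}"


def llc_age_months(birth_date, activity_date):
    """Whole months between the birth month and the activity month."""
    if birth_date is None or activity_date is None:
        return None
    return ((activity_date.month + activity_date.year * 12)
            - (birth_date.month + birth_date.year * 12))


# Hypothetical twin born 20 November 1995, activity dated 3 February 2017.
birth = date(1995, 11, 20)
activity = date(2017, 2, 3)
label = llc_date(activity)
months = llc_age_months(birth, activity)
```

Because the day of the month is ignored, the age counts calendar-month boundaries rather than completed months. In this example the result is 255 months even though the twin would not complete the 255th month until 20 February 2017.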

u1cmarrhopm1/2, u1cmarrworm1/2, u1cmarrm1/2

Marriage Attitudes subscales, and an overall scale, derived from items of the measure in the phase 1 twin questionnaire.
The subscales are for hopeful attitudes (u1cmarrhopm), and worries (u1cmarrworm) while the overall scale is u1cmarrm.
Each subscale is derived from 4 items, while the overall scale is a mean of all 8 items (with the 'worries' items reversed). For each scale, at least half the component items are required to be non-missing. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Marriage hope scale: items 1,3,6,7.
COMPUTE u1cmarrhopm = MEAN.2(u1cmarr1, u1cmarr3, u1cmarr6, u1cmarr7).
* Marriage worry scale: items 2,4,5,8.
COMPUTE u1cmarrworm = MEAN.2(u1cmarr2, u1cmarr4, u1cmarr5, u1cmarr8).
* Overall scale: all 8 items, with items 2/4/5/8 reversed.
COMPUTE u1cmarrm = MEAN.4(u1cmarr1, u1cmarr2r, u1cmarr3,
 u1cmarr4r, u1cmarr5r, u1cmarr6, u1cmarr7, u1cmarr8r).
EXECUTE.

u1cmeduphot1/2, u1cmeduvidt1/2, u1cmedusoct1/2

Media use subscales.
Derived from the 11 items of this measure in the twin phase 1 questionnaire. The subscales are for mobile phone use (u1cmeduphot), use of video (u1cmeduvidt) and for social media use (u1cmedusoct), based on 5, 2 and 4 items respectively.
Each item has integer response values 0-5, hence each subscale has a range of values from 0 to (5 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Phone use total: items 1-5.
COMPUTE u1cmeduphot = 5 * MEAN.3(u1cmedu01, u1cmedu02, u1cmedu03, u1cmedu04, u1cmedu05).
* Video total: items 6-7.
COMPUTE u1cmeduvidt = 2 * MEAN.2(u1cmedu06, u1cmedu07).
* Social media total: items 8-11.
COMPUTE u1cmedusoct = 4 * MEAN.2(u1cmedu08, u1cmedu09, u1cmedu10, u1cmedu11).
EXECUTE.

u1cmfqt1/2, u2cmfqt1/2, ucv1mfqt1/2, ucv2mfqt1/2, ucv3mfqt1/2, ucv4mfqt1/2

MFQ total scale, from the TEDS21 twin phase 1 (u1c), TEDS21 twin phase 2 (u2c) and covid twin phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. Derived from all 8 available items of the MFQ measure (identical measure used in all questionnaires).
Each item has integer response values 0-2, hence each scale has a range of values from 0 to 16 because it is computed as the mean multiplied by the number of component items (8). At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cmfqt = 8 * MEAN.4(u1cmfq1, u1cmfq2,
 u1cmfq3, u1cmfq4, u1cmfq5, u1cmfq6, u1cmfq7, u1cmfq8).
COMPUTE u2cmfqt = 8 * MEAN.4(u2cmfq1, u2cmfq2,
 u2cmfq3, u2cmfq4, u2cmfq5, u2cmfq6, u2cmfq7, u2cmfq8).
COMPUTE ucv1mfqt = 8 * MEAN.4(ucv1mfq1, ucv1mfq2,
 ucv1mfq3, ucv1mfq4, ucv1mfq5, ucv1mfq6, ucv1mfq7, ucv1mfq8).
COMPUTE ucv2mfqt = 8 * MEAN.4(ucv2mfq1, ucv2mfq2,
 ucv2mfq3, ucv2mfq4, ucv2mfq5, ucv2mfq6, ucv2mfq7, ucv2mfq8).
COMPUTE ucv3mfqt = 8 * MEAN.4(ucv3mfq1, ucv3mfq2,
 ucv3mfq3, ucv3mfq4, ucv3mfq5, ucv3mfq6, ucv3mfq7, ucv3mfq8).
COMPUTE ucv4mfqt = 8 * MEAN.4(ucv4mfq1, ucv4mfq2,
 ucv4mfq3, ucv4mfq4, ucv4mfq5, ucv4mfq6, ucv4mfq7, ucv4mfq8).
EXECUTE.

u1cmonam1/2

Money Attitudes scale, derived from all 6 items of the measure in the phase 1 twin questionnaire (reversed where necessary). Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cmonam = MEAN.3(u1cmona1r, u1cmona2r, u1cmona3, u1cmona4, u1cmona5, u1cmona6r).
EXECUTE.

u1cmumrm1/2

See u1cdadrm1/2, etc above.

u1cobult1/2

Online bullying total scale, from the twin phase 1 questionnaire, derived from all 4 available items of the measure.
Each item has integer response values 0-2, hence the scale has a range of values from 0 to 8 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cobult = 4 * MEAN.2(u1cobul1, u1cobul2, u1cobul3, u1cobul4).
EXECUTE.

u1cparvm1/2, ucv1parvm1/2, ucv2parvm1/2, ucv3parvm1/2, ucv4parvm1/2

Partner Violence scale, derived from all 6 items of the measure in the TEDS21 phase 1 twin questionnaire (u1c) and in the covid twin phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cparvm = MEAN.3(u1cparv1, u1cparv2, u1cparv3, u1cparv4, u1cparv5, u1cparv6).
COMPUTE ucv1parvm = MEAN.3(ucv1parv1, ucv1parv2, ucv1parv3, ucv1parv4, ucv1parv5, ucv1parv6).
COMPUTE ucv2parvm = MEAN.3(ucv2parv1, ucv2parv2, ucv2parv3, ucv2parv4, ucv2parv5, ucv2parv6).
COMPUTE ucv3parvm = MEAN.3(ucv3parv1, ucv3parv2, ucv3parv3, ucv3parv4, ucv3parv5, ucv3parv6).
COMPUTE ucv4parvm = MEAN.3(ucv4parv1, ucv4parv2, ucv4parv3, ucv4parv4, ucv4parv5, ucv4parv6).
EXECUTE.

u1cpeerprem1/2, u1cpeerrism1/2, u1cpeerm1/2

Peer Pressure subscales, and an overall scale, derived from items of the measure in the phase 1 twin questionnaire.
The subscales are for submission to peer pressure (u1cpeerprem), and engagement in risky activities (u1cpeerrism) while the overall scale is u1cpeerm.
The subscales are derived from 3 and 4 items respectively, while the overall scale is a mean of all 7 items, in each case requiring at least half of the items to be non-missing. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Subscale: submission to peer pressure (3 items).
COMPUTE u1cpeerprem = MEAN.2(u1cpeer1, u1cpeer2, u1cpeer4).
* Subscale: risky activities (4 items).
COMPUTE u1cpeerrism = MEAN.2(u1cpeer3, u1cpeer5, u1cpeer6, u1cpeer7).
* Overall: 7 items.
COMPUTE u1cpeerm = MEAN.4(u1cpeer1, u1cpeer2, u1cpeer3, u1cpeer4, u1cpeer5, u1cpeer6, u1cpeer7).
EXECUTE.

u1cpersneum1/2, u1cpersextm1/2, u1cpersopem1/2, u1cpersagrm1/2, u1cpersconm1/2

Personality subscales, derived from items of the Big 5 Personality measure in the phase 1 twin questionnaire. Each subscale is a mean of 6 of the 30 items, requiring at least 3 of them to be non-missing. Each item has response values 1-5, hence each subscale has the same range as it is computed as a mean.

* Big 5 personality.
* Self-rated: 5 subscales, each based on 6 items.
* Neuroticism.
COMPUTE u1cpersneum = MEAN.3(u1cpers01, u1cpers02, u1cpers03, u1cpers04, u1cpers05, u1cpers06).
* Extraversion.
COMPUTE u1cpersextm = MEAN.3(u1cpers07, u1cpers08, u1cpers09, u1cpers10, u1cpers11, u1cpers12).
* Openness.
COMPUTE u1cpersopem = MEAN.3(u1cpers13, u1cpers14, u1cpers15, u1cpers16, u1cpers17, u1cpers18).
* Agreeableness.
COMPUTE u1cpersagrm = MEAN.3(u1cpers19, u1cpers20, u1cpers21, u1cpers22, u1cpers23, u1cpers24).
* Conscientiousness.
COMPUTE u1cpersconm = MEAN.3(u1cpers25, u1cpers26, u1cpers27, u1cpers28, u1cpers29, u1cpers30).
EXECUTE.

u1cpilm1/2, ucv1pilm1/2, ucv2pilm1/2, ucv3pilm1/2, ucv4pilm1/2

Purpose in Life scale, derived from all 5 items of the measure in the phase 1 twin questionnaire and in the covid questionnaires (the same measure was used in each case). Each item has integer response values 1-5, hence the scale has the same range as it is computed as a mean. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cpilm = MEAN.3(u1cpil1, u1cpil2, u1cpil3, u1cpil4, u1cpil5).
COMPUTE ucv1pilm = MEAN.3(ucv1pil1, ucv1pil2, ucv1pil3, ucv1pil4, ucv1pil5).
COMPUTE ucv2pilm = MEAN.3(ucv2pil1, ucv2pil2, ucv2pil3, ucv2pil4, ucv2pil5).
COMPUTE ucv3pilm = MEAN.3(ucv3pil1, ucv3pil2, ucv3pil3, ucv3pil4, ucv3pil5).
COMPUTE ucv4pilm = MEAN.3(ucv4pil1, ucv4pil2, ucv4pil3, ucv4pil4, ucv4pil5).
EXECUTE.

u1cprobt1/2

Problematic internet use total scale, from the twin phase 1 questionnaire, derived from all 6 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 24 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cprobt = 6 * MEAN.3(u1cprob1, u1cprob2, u1cprob3, u1cprob4, u1cprob5, u1cprob6).
EXECUTE.
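
Computing a total as "the mean multiplied by the number of items" effectively prorates the total when some items are missing. A minimal Python sketch of that calculation is given below; the function name prorated_total, the use of None for missing values and the example responses are assumptions made for illustration, not part of the TEDS syntax.

```python
def prorated_total(values, min_valid):
    """Number of items times the mean of the non-missing items.

    Returns None when fewer than min_valid items are present.
    """
    present = [v for v in values if v is not None]
    if len(present) < min_valid:
        return None
    return len(values) * sum(present) / len(present)


# Six hypothetical responses (0-4) with one item missing.
print(prorated_total([2, 3, None, 1, 4, 0], 3))   # 12.0

# All items at the maximum response give the top of the 0-24 range.
print(prorated_total([4, 4, 4, 4, 4, 4], 3))      # 24.0
```

With no missing items the result is simply the sum of the items; with missing items, the mean of the answered items stands in for each missing one.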

u1cq1quan1/2, u1cq2quan1/2, u1cq3quan1/2, u1cq4quan1/2, u1cq5quan1/2, u1cq6quan1/2, u1cq7quan1/2, u1cq8quan1/2, u1cq9quan1/2, u1cquan1/2

Variables showing the number of compulsory questions answered in each of the 9 sections (u1cqXquan) and overall (u1cquan) in the phase 1 twin questionnaire. The same method was used in each of the 3 main questionnaire versions (CMS, backup, paper), using the same subset of item variables in each case. The counts ignore optional questions and those questions that might be skipped because of branching rules.

* Count the number of items completed, not including items that could have been skipped.
* due to branching (if the questionnaire was completed electronically).
* Divide the twin questionnaire into the same 9 sections/themes.
* as used in the CMS/web versions.
COUNT u1cq1quan = 
 u1cpers01 u1cpers02 u1cpers03 u1cpers04 u1cpers05 u1cpers06 u1cpers07 u1cpers08 u1cpers09 u1cpers10
 u1cpers11 u1cpers12 u1cpers13 u1cpers14 u1cpers15 u1cpers16 u1cpers17 u1cpers18 u1cpers19 u1cpers20
 u1cpers21 u1cpers22 u1cpers23 u1cpers24 u1cpers25 u1cpers26 u1cpers27 u1cpers28 u1cpers29 u1cpers30
 u1cself1 u1cself2 u1cself3 u1cself4 u1cself5 u1cself6
 u1cfcon1 u1cfcon2 u1cfcon3 u1cfconqc u1cfcon4
 u1cmanx
 u1crsk1 u1crsk2 u1crsk3 u1crsk4 u1crsk5 u1crsk6
 u1cbaq1 u1cbaq2 u1cbaq3 u1cbaq4 u1cbaq5 u1cbaq6 u1cbaq7 u1cbaq8 (0 THRU HIGHEST).
COUNT u1cq2quan = 
 u1cgoal1 u1cgoal2 u1cgoal3 u1cgoal4 u1cgoal5 u1cgoal6 u1cgoal7 u1cgoal8 u1cgoalqc u1cgoal9
 u1cpil1 u1cpil2 u1cpil3 u1cpil4 u1cpil5
 u1cbsae1 u1cbsae2 u1cbsae3 u1cbsae4 u1cbsae5 u1cbsae6
 u1cbsag1 u1cbsag2 u1cbsag3 u1cbsag4 u1cbsag5
 u1cpolv1 u1cpolv2
 u1cpher01 u1cpher02 u1cpher03 u1cpher04 u1cpher05 u1cpher06 u1cpher07
 u1cpher08 u1cpher09 u1cpher10 u1cpher11 u1cpher12 u1cpher13 u1cpher14 (0 THRU HIGHEST).
COUNT u1cq3quan =
 u1csdq01 u1csdq02 u1csdq03 u1csdq04 u1csdq05 u1csdq06 u1csdq07 u1csdq08 u1csdqqc u1csdq09
 u1csdq10 u1csdq11 u1csdq12 u1csdq13 u1csdq14 u1csdq15 u1csdq16 u1csdq17 u1csdq18 u1csdq19
 u1csdq20 u1csdq21 u1csdq22 u1csdq23 u1csdq24 u1csdq25
 u1cvoln1 u1cvoln2 u1cvoln3 u1cvoln4 u1cvoln5
 u1cmfq1 u1cmfq2 u1cmfq3 u1cmfq4 u1cmfq5 u1cmfq6 u1cmfq7 u1cmfqqc u1cmfq8
 u1cpeer1 u1cpeerqc u1cpeer2 u1cpeer3 u1cpeer4 u1cpeer5 u1cpeer6 u1cpeer7
 u1crelg1 u1crelg2 u1crelg3 u1crelg4 u1crelg5 (0 THRU HIGHEST).
COUNT u1cq4quan = u1crelst u1csexor
 u1crela2 u1crela4 u1crela5 u1crela6
 u1cmarr1 u1cmarr2 u1cmarr3 u1cmarr4 u1cmarr5 u1cmarr6 u1cmarr7 u1cmarr8
 u1csexb1 u1csexb7
 u1cparv1 u1cparv2 u1cparv3 u1cparv4 u1cparv5 u1cparv6 (0 THRU HIGHEST).
COUNT u1cq5quan = 
 u1ctwnr1 u1ctwnr2 u1ctwnr3 u1ctwnr4 u1ctwnr5
 u1cmumr1 u1cmumr2 u1cmumr3 u1cmumr4 u1cmumr5
 u1cdadr1 u1cdadr2 u1cdadr3 u1cdadr4 u1cdadr5
 u1ccomm1 u1ccomm2 u1ccomm3 u1ccomm4 u1ccomm5
 u1cchaos1 u1cchaos2 u1cchaos3 u1cchaos4 u1cchaos5 u1cchaos6
 u1cchild u1ctchild u1cpreg u1ctpreg (0 THRU HIGHEST).
COUNT u1cq6quan = u1cdiet u1callg1
 u1creap01 u1creap02 u1creap03 u1creap05 u1creap06 u1creap07
 u1creap08 u1creap09 u1creapqc u1creap10 u1creap11 u1creap12 u1cantib
 u1crand1 u1crand2 u1crand3 u1crand4 u1crand5 u1cpaink
 u1chtcm u1cwtkg
 u1chosp1
 u1ceats01 u1ceats02 u1ceats03 u1ceats04 u1ceats05 u1ceats06
 u1ceats07 u1ceats08 u1ceats09 u1ceats10 u1ceats11 u1ceats12
 u1cactv1 u1cactv2 u1cactv3 u1cathl
 u1ceatd1 u1ceatd2 u1ceatd3
 u1cslfh01 u1cslfh02 (0 THRU HIGHEST).
COUNT u1cq7quan = u1clivs1
 u1chqualc u1cstatus u1cothinc u1cequalc u1cstex01 u1cbenf1 (0 THRU HIGHEST).
COUNT u1cq8quan = 
 u1cfina1 u1cfina2 u1cfina3 u1cfina4 u1cfina5
 u1cfprd01 u1cfprd02 u1cfprd03 u1cfprd04 u1cfprd05 u1cfprd06 u1cfprd07
 u1cfprd08 u1cfprd09 u1cfprd10 u1cfprd11 u1cfprd12 u1cfprdqc u1cfprd13
 u1cmona1 u1cmona2 u1cmona3 u1cmona4 u1cmona5 u1cmona6 (0 THRU HIGHEST).
COUNT u1cq9quan = 
 u1cmedu01 u1cmedu02 u1cmedu03 u1cmedu04 u1cmedu05 u1cmedu06 u1cmedu07 u1cmedu08 u1cmedu09 u1cmedu10 u1cmedu11
 u1cprob1 u1cprobqc u1cprob2 u1cprob3 u1cprob4 u1cprob5 u1cprob6
 u1codat1 u1cobul1 u1cobul2 u1cobul3 u1cobul4 (0 THRU HIGHEST).
EXECUTE.

* Overall total count of (non-branched) questions answered.
COMPUTE u1cquan = SUM(u1cq1quan, u1cq2quan, u1cq3quan, u1cq4quan,
 u1cq5quan, u1cq6quan, u1cq7quan, u1cq8quan, u1cq9quan).
EXECUTE.
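
The counting logic above can be sketched in Python as follows: an item counts as answered if it has a value of 0 or higher, and the overall count is the sum of the section counts. This is an illustration only; the function name count_answered, the section labels and the example responses are hypothetical, and None is used here to stand in for a missing response.

```python
def count_answered(responses):
    """Count responses with a value of 0 or higher; None (missing) is not counted."""
    return sum(1 for value in responses if value is not None and value >= 0)


# Hypothetical responses for two short sections of one twin's questionnaire.
sections = {
    "section1": [3, 0, None, 5],
    "section2": [None, None, 2],
}

section_counts = {name: count_answered(values) for name, values in sections.items()}
overall_count = sum(section_counts.values())

print(section_counts)   # {'section1': 3, 'section2': 1}
print(overall_count)    # 4
```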

u1cq1stat1/2, u1cq2stat1/2, u1cq3stat1/2, u1cq4stat1/2, u1cq5stat1/2, u1cq6stat1/2, u1cq7stat1/2, u1cq8stat1/2, u1cq9stat1/2, u1cstat1/2

Status variables for each of the 9 sections (u1cqXstat) and overall (u1cstat) in the phase 1 twin questionnaire. Each is initially coded 0=not started, 1=started but not finished, 2=finished. During data cleaning, each may be recoded from 2=finished to 4=excluded (random responder), although this recoding is not shown in the syntax below.
Different methods are used to calculate the status variables in the three versions (CMS, backup, paper). The CMS and backup versions had different types of raw status variables that could be used:
CMS: the raw data contain the variable QnrCompletion, with values 0 to 100, indicating the status of the questionnaire as a whole.
Backup: the raw data contain a status variable for each section, coded 0=not completed, 1=completed (data were only submitted at the end of each section, hence partially-completed sections are not present in the raw backup data).
Paper: the status variables are based purely on the counts of the questions answered in each section (these are derived variables described elsewhere on this page).
Note that the sections and their component questions were presented in strict sequence in the electronic versions, so if a section was left unfinished then subsequent sections would necessarily be unstarted. In the paper version, participants could leave questions or whole sections unanswered in a wholly arbitrary way, but an attempt has been made to compute the status variables in a roughly equivalent way.

* CMS raw data.
* ------------.
* Derive a status variable for each section, coded 0=not started 1=started 2=finished.
* Some questions in CMS twin questionnaire are optional, so may be missing.
* therefore need careful approach to determine whether finished.
COMPUTE u1cq1stat = 0.
COMPUTE u1cq2stat = 0.
COMPUTE u1cq3stat = 0.
COMPUTE u1cq4stat = 0.
COMPUTE u1cq5stat = 0.
COMPUTE u1cq6stat = 0.
COMPUTE u1cq7stat = 0.
COMPUTE u1cq8stat = 0.
COMPUTE u1cq9stat = 0.
COMPUTE u1cstat = 0.
EXECUTE.
* If QnrCompletion=100, all must have been finished.
DO IF (QnrCompletion = 100).
 RECODE u1cq1stat u1cq2stat u1cq3stat u1cq4stat
 u1cq5stat u1cq6stat u1cq7stat u1cq8stat u1cq9stat u1cstat (0=2).
END IF.
EXECUTE.
* 1st section: assume complete if all 56 items answered OR if some of 2nd section done.
* and assume partially complete if 1-55 answered AND nothing in 2nd section.
IF (u1cq1stat = 0 & (u1cq1quan = 56 | u1cq2quan > 0)) u1cq1stat = 2.
IF (u1cq1stat = 0 & (RANGE(u1cq1quan,1,55) & u1cq2quan = 0)) u1cq1stat = 1.
EXECUTE.
* 2nd section (42 items, all compulsory).
IF (u1cq2stat = 0 & (u1cq2quan = 42 | u1cq3quan > 0)) u1cq2stat = 2.
IF (u1cq2stat = 0 & (RANGE(u1cq2quan,1,41) & u1cq3quan = 0)) u1cq2stat = 1.
EXECUTE.
* 3rd section (53 items, a few of which are optional).
IF (u1cq3stat = 0 & (u1cq3quan = 53 | u1cq4quan > 0)) u1cq3stat = 2.
IF (u1cq3stat = 0 & (RANGE(u1cq3quan,1,52) & u1cq4quan = 0)) u1cq3stat = 1.
EXECUTE.
* 4th section (22 items, some of which are optional).
IF (u1cq4stat = 0 & (u1cq4quan = 22 | u1cq5quan > 0)) u1cq4stat = 2.
IF (u1cq4stat = 0 & (RANGE(u1cq4quan,1,21) & u1cq5quan = 0)) u1cq4stat = 1.
EXECUTE.
* 5th section (30 items, some of which are optional).
IF (u1cq5stat = 0 & (u1cq5quan = 30 | u1cq6quan > 0)) u1cq5stat = 2.
IF (u1cq5stat = 0 & (RANGE(u1cq5quan,1,29) & u1cq6quan = 0)) u1cq5stat = 1.
EXECUTE.
* 6th section (45 items, all compulsory).
IF (u1cq6stat = 0 & (u1cq6quan = 45 | u1cq7quan > 0)) u1cq6stat = 2.
IF (u1cq6stat = 0 & (RANGE(u1cq6quan,1,44) & u1cq7quan = 0)) u1cq6stat = 1.
EXECUTE.
* 7th section (only 7 non-branched items, all compulsory).
IF (u1cq7stat = 0 & (u1cq7quan = 7 | u1cq8quan > 0)) u1cq7stat = 2.
IF (u1cq7stat = 0 & (RANGE(u1cq7quan,1,6) & u1cq8quan = 0)) u1cq7stat = 1.
EXECUTE.
* 8th section (25 items, all compulsory).
IF (u1cq8stat = 0 & (u1cq8quan = 25 | u1cq9quan > 0)) u1cq8stat = 2.
IF (u1cq8stat = 0 & (RANGE(u1cq8quan,1,24) & u1cq9quan = 0)) u1cq8stat = 1.
EXECUTE.
* 9th section is last and completed if QnrCompletion=100 as above.
* hence treat as unfinished if 1 or more items were done and QnrCompletion < 100.
IF (u1cq9stat = 0 & u1cq9quan > 0 & QnrCompletion < 100) u1cq9stat = 1.
EXECUTE.

* Overall status is unfinished if 1st section started but overall unfinished.
IF (u1cstat = 0 & u1cq1stat > 0 & QnrCompletion < 100) u1cstat = 1.
EXECUTE.
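
The rule applied to each of sections 1 to 8 in the CMS data can be summarised in a short Python sketch. This is a simplified illustration, not the TEDS derivation itself: the function name cms_section_status and its parameter names are hypothetical. Section 9 is not covered, because it has no following section and relies on QnrCompletion alone.

```python
def cms_section_status(n_answered, n_items, next_section_count, qnr_completion):
    """Return 0 (not started), 1 (started but not finished) or 2 (finished).

    n_answered:         compulsory items answered in this section
    n_items:            number of compulsory items in this section
    next_section_count: compulsory items answered in the following section
    qnr_completion:     overall completion value, 0 to 100
    """
    if qnr_completion == 100:
        return 2  # whole questionnaire finished, so every section is finished
    if n_answered == n_items or next_section_count > 0:
        return 2  # all items answered, or the twin moved on to the next section
    if 1 <= n_answered < n_items and next_section_count == 0:
        return 1  # some items answered and nothing in the next section
    return 0


print(cms_section_status(56, 56, 0, 60))   # 2
print(cms_section_status(40, 56, 5, 60))   # 2
print(cms_section_status(40, 56, 0, 60))   # 1
print(cms_section_status(0, 56, 0, 0))     # 0
```

Because sections were presented in strict sequence, any response in the following section is taken as evidence that the current section was finished, even if some optional items were left blank.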

* Backup raw data.
* ---------------.
* Derive a status variable for each section, coded 0=not started 1=started 2=finished.
* In this backup version, each section is either not started or finished.
* because data are only submitted at the end of the section.
* Can therefore conveniently recode the existing status flags (0/1) from the admin file.
* regardless of the number of items completed.
RECODE personality__status thoughtsandattitudestest__status aboutyou__status
 loveandrelationships__status homefamilya__status health__status
 educationemploymentandtraining__status finances__status OnlineBehaviourb__status
 (0=0) (1=2)
INTO u1cq1stat u1cq2stat u1cq3stat u1cq4stat
 u1cq5stat u1cq6stat u1cq7stat u1cq8stat u1cq9stat.
EXECUTE.

* Overall status may be unfinished if fewer than 9 sections completed.
IF (u1cq1stat = 0) u1cstat = 0.
IF (u1cq1stat > 0 & u1cq9stat < 2) u1cstat = 1.
IF (u1cq9stat = 2) u1cstat = 2.
EXECUTE.

* Paper raw data.
* --------------.
* Derive a status variable for each section, coded 0=not started 1=started 2=finished.
* In the CMS/web, the purpose is to identify sections started but not finished.
* In this paper version, unlike electronic versions, no questions can be made compulsory.
* as a condition of continuing; any questions can easily be skipped mid-questionnaire.
* hence the status variable has a slightly different meaning in the paper version.
* Therefore treat each section as finished if at least about two thirds of the compulsory non-branched questions were answered.
RECODE u1cq1quan (0=0) (1 THRU 36=1) (37 THRU 56=2) INTO u1cq1stat.
RECODE u1cq2quan (0=0) (1 THRU 27=1) (28 THRU 42=2) INTO u1cq2stat.
RECODE u1cq3quan (0=0) (1 THRU 34=1) (35 THRU 53=2) INTO u1cq3stat.
RECODE u1cq4quan (0=0) (1 THRU 13=1) (14 THRU 22=2) INTO u1cq4stat.
RECODE u1cq5quan (0=0) (1 THRU 19=1) (20 THRU 30=2) INTO u1cq5stat.
RECODE u1cq6quan (0=0) (1 THRU 29=1) (30 THRU 45=2) INTO u1cq6stat.
RECODE u1cq7quan (0=0) (1 THRU 3=1) (4 THRU 7=2) INTO u1cq7stat.
RECODE u1cq8quan (0=0) (1 THRU 15=1) (16 THRU 25=2) INTO u1cq8stat.
RECODE u1cq9quan (0=0) (1 THRU 14=1) (15 THRU 23=2) INTO u1cq9stat.
EXECUTE.

* Now the overall status is finished if all 9 sections are finished by the above definition.
IF (SUM(u1cq1stat, u1cq2stat, u1cq3stat, u1cq4stat, u1cq5stat, u1cq6stat,
   u1cq7stat, u1cq8stat, u1cq9stat) = 0) u1cstat = 0.
IF (SUM(u1cq1stat, u1cq2stat, u1cq3stat, u1cq4stat, u1cq5stat, u1cq6stat,
   u1cq7stat, u1cq8stat, u1cq9stat) = 18) u1cstat = 2.
IF (RANGE(SUM(u1cq1stat, u1cq2stat, u1cq3stat, u1cq4stat, u1cq5stat, u1cq6stat,
   u1cq7stat, u1cq8stat, u1cq9stat), 1, 17)) u1cstat = 1.
EXECUTE.
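
The cut-offs in the paper-version RECODE commands above appear to equal two thirds of each section's item count, rounded down. That is an observation from the numbers rather than a documented rule, and the Python sketch below (with hypothetical names SECTION_ITEMS, paper_section_status and paper_overall_status) simply reproduces the cut-offs and the overall status logic under that assumption.

```python
# Number of compulsory, non-branched items per section, as used in the syntax above.
SECTION_ITEMS = {1: 56, 2: 42, 3: 53, 4: 22, 5: 30, 6: 45, 7: 7, 8: 25, 9: 23}

# Assumed rule: a section is "finished" at two thirds of its items, rounded down.
THRESHOLDS = {section: (2 * n) // 3 for section, n in SECTION_ITEMS.items()}


def paper_section_status(n_answered, section):
    """Return 0 (not started), 1 (started) or 2 (finished) for one paper section."""
    if n_answered == 0:
        return 0
    return 2 if n_answered >= THRESHOLDS[section] else 1


def paper_overall_status(section_statuses):
    """Overall status: 0 if nothing started, 2 if every section finished, else 1."""
    total = sum(section_statuses)
    if total == 0:
        return 0
    if total == 2 * len(section_statuses):
        return 2
    return 1


print(THRESHOLDS)
# {1: 37, 2: 28, 3: 35, 4: 14, 5: 20, 6: 30, 7: 4, 8: 16, 9: 15}
print(paper_section_status(36, 1), paper_section_status(37, 1))   # 1 2
print(paper_overall_status([2, 2, 2, 2, 2, 2, 2, 2, 1]))          # 1
```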

u1cq1time1/2, u1cq2time1/2, u1cq3time1/2, u1cq4time1/2, u1cq5time1/2, u1cq6time1/2, u1cq7time1/2, u1cq8time1/2, u1cq9time1/2, u1ctime1/2

Time taken to complete each section (u1cqXtime) and the whole (u1ctime) of the phase 1 twin questionnaire. The times are measured in decimal minutes.
These variables are derived in different ways in the CMS raw data (as a sum of item times) and the backup raw data (as a difference between start and end date-times). The variables cannot be derived for paper booklet data. Note that the item times used in the CMS derivation, and the section start/end times used in the backup derivation, are not retained in the dataset because they are not present in other versions.

* Backup raw data.
* ---------------.
* In the backup data, there are no item times but there is a date-time.
* for the start and end of each section - use these to derive the duration in minutes.
* Should be roughly compatible with the times in the CMS version, which are.
* derived as sums of item times.
NUMERIC u1cq1time u1cq2time u1cq3time u1cq4time u1cq5time u1cq6time u1cq7time u1cq8time u1cq9time u1ctime (F4.1).
* Only derive if the relevant section was completed.
IF (u1cq1stat = 2) u1cq1time = DATEDIFF(personality__subTime, personality__genTime, 'seconds') / 60.
IF (u1cq2stat = 2) u1cq2time = DATEDIFF(thoughtsandattitudestest__subTime, thoughtsandattitudestest__genTime, 'seconds') / 60.
IF (u1cq3stat = 2) u1cq3time = DATEDIFF(aboutyou__subTime, aboutyou__genTime, 'seconds') / 60.
IF (u1cq4stat = 2) u1cq4time = DATEDIFF(loveandrelationships__subTime, loveandrelationships__genTime, 'seconds') / 60.
IF (u1cq5stat = 2) u1cq5time = DATEDIFF(homefamilya__subTime, homefamilya__genTime, 'seconds') / 60.
IF (u1cq6stat = 2) u1cq6time = DATEDIFF(health__subTime, health__genTime, 'seconds') / 60.
IF (u1cq7stat = 2) u1cq7time = DATEDIFF(educationemploymentandtraining__subTime, educationemploymentandtraining__genTime, 'seconds') / 60.
IF (u1cq8stat = 2) u1cq8time = DATEDIFF(finances__subTime, finances__genTime, 'seconds') / 60.
IF (u1cq9stat = 2) u1cq9time = DATEDIFF(OnlineBehaviourb__subTime, OnlineBehaviourb__genTime, 'seconds') / 60.
EXECUTE.
* Sum for total time if all finished.
IF (u1cstat = 2) u1ctime = SUM(u1cq1time, u1cq2time, u1cq3time,
 u1cq4time, u1cq5time, u1cq6time, u1cq7time, u1cq8time, u1cq9time).
EXECUTE.
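
As an illustration of the backup-data calculation, the following Python sketch derives a section time in decimal minutes from a start and an end date-time, and only for a finished section. The function name section_duration_minutes and the example date-times are hypothetical.

```python
from datetime import datetime


def section_duration_minutes(generated_at, submitted_at, status):
    """Minutes between section start and submission, or None if not finished."""
    if status != 2:
        return None
    return (submitted_at - generated_at).total_seconds() / 60


# Hypothetical start and submission times for one section.
start = datetime(2018, 6, 1, 10, 0, 0)
end = datetime(2018, 6, 1, 10, 7, 30)

print(section_duration_minutes(start, end, 2))   # 7.5
print(section_duration_minutes(start, end, 0))   # None
```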

* CMS raw data.
* ------------.
* Sum the item times for each questionnaire section/theme (including all CoTEDS item times).
* and divide by 60000 to get a time in minutes.
* Only derive times for completed sections.
IF (u1cq1stat = 2) u1cq1time = SUM(u1cpers01rt, u1cpers02rt, 
 u1cpers03rt, u1cpers04rt, u1cpers05rt, u1cpers06rt, u1cpers07rt, u1cpers08rt, u1cpers09rt, u1cpers10rt, 
 u1cpers11rt, u1cpers12rt, u1cpers13rt, u1cpers14rt, u1cpers15rt, u1cpers16rt, u1cpers17rt, u1cpers18rt, u1cpers19rt, u1cpers20rt, 
 u1cpers21rt, u1cpers22rt, u1cpers23rt, u1cpers24rt, u1cpers25rt, u1cpers26rt, u1cpers27rt, u1cpers28rt, u1cpers29rt, u1cpers30rt, 
 u1cself1rrt, u1cself2rt, u1cself3rt, u1cself4rt, u1cself5rt, u1cself6rt, 
 u1cfcon1rt, u1cfcon2rt, u1cfcon3rt, u1cfconqcrt, u1cfcon4rt, u1cmanxrt, 
 u1crsk1rt, u1crsk2rt, u1crsk3rt, u1crsk4rt, u1crsk5rt, u1crsk6rt, 
 u1cbaq1rt, u1cbaq2rt, u1cbaq3rt, u1cbaq4rt, u1cbaq5rt, u1cbaq6rt, u1cbaq7rt, u1cbaq8rt) / 60000.
IF (u1cq2stat = 2) u1cq2time = SUM(u1cgoal1rt, u1cgoal2rt, 
 u1cgoal3rt, u1cgoal4rt, u1cgoal5rt, u1cgoal6rt, u1cgoal7rt, u1cgoal8rt, u1cgoalqcrt, u1cgoal9rt, 
 u1cpil1rt, u1cpil2rt, u1cpil3rt, u1cpil4rt, u1cpil5rt, 
 u1cbsae1rt, u1cbsae2rrt, u1cbsae3rrt, u1cbsae4rrt, u1cbsae5rrt, u1cbsae6rrt, 
 u1cbsag1rt, u1cbsag2rt, u1cbsag3rt, u1cbsag4rt, u1cbsag5rt, 
 u1cpolv1rt, u1cpolv2rt, 
 u1cpher01rt, u1cpher02rt, u1cpher03rt, u1cpher04rt, u1cpher05rt, u1cpher06rt, u1cpher07rt, 
 u1cpher08rt, u1cpher09rt, u1cpher10rt, u1cpher11rt, u1cpher12rt, u1cpher13rt, u1cpher14rt) / 60000.
IF (u1cq3stat = 2) u1cq3time = SUM(u1csdq01rt, u1csdq02rt, 
 u1csdq03rt, u1csdq04rt, u1csdq05rt, u1csdq06rt, u1csdq07rrt, u1csdq08rt, u1csdqqcrt, u1csdq09rt, 
 u1csdq10rt, u1csdq11rrt, u1csdq12rt, u1csdq13rt, u1csdq14rrt, u1csdq15rt, u1csdq16rt, u1csdq17rt, u1csdq18rt, u1csdq19rt, 
 u1csdq20rrt, u1csdq21rrt, u1csdq22rt, u1csdq23rt, u1csdq24rt, u1csdq25rrt, 
 u1cvoln1rt, u1cvoln2rt, u1cvoln3rt, u1cvoln4rt, u1cvoln5rt, 
 u1cmfq1rt, u1cmfq2rt, u1cmfq3rt, u1cmfq4rt, u1cmfq5rt, u1cmfq6rt, u1cmfq7rt, u1cmfqqcrt, u1cmfq8rt, 
 u1cpeer1rt, u1cpeerqcrt, u1cpeer2rt, u1cpeer3rt, u1cpeer4rt, u1cpeer5rt, u1cpeer6rt, u1cpeer7rt, 
 u1crelg1rt, u1crelg2rt, u1crelg3rt, u1crelg4rt, u1crelg5rt) / 60000.
IF (u1cq4stat = 2) u1cq4time = SUM(u1crelstrt, u1csexorrt, 
 u1crela1rt, u1crela2rt, u1crela3rt, u1crela4rt, u1crela5rt, u1crela6rt, 
 u1cmarr1rt, u1cmarr2rrt, u1cmarr3rt, u1cmarr4rrt, u1cmarr5rrt, u1cmarr6rt, u1cmarr7rt, u1cmarr8rrt, 
 u1csexb1rt, u1csexb2rt, u1csexb3rt, u1csexb4rt, u1csexb5rt, u1csexb6rt, u1csexb7rt, 
 u1cparv1rt, u1cparv2rt, u1cparv3rt, u1cparv4rt, u1cparv5rt, u1cparv6rt) / 60000.
IF (u1cq5stat = 2) u1cq5time = SUM(u1cfambrrt, u1ctwnr1rt, u1ctwnr2rt, u1ctwnr3rt, u1ctwnr4rt, u1ctwnr5rt, 
 u1cmumr1rt, u1cmumr2rt, u1cmumr3rt, u1cmumr4rt, u1cmumr5rt, 
 u1cdadr1rt, u1cdadr2rt, u1cdadr3rt, u1cdadr4rt, u1cdadr5rt, 
 u1ccomm1rt, u1ccomm2rrt, u1ccomm3rt, u1ccomm4rrt, u1ccomm5rt, 
 u1cchaos1rt, u1cchaos2rt, u1cchaos3rt, u1cchaos4rt, u1cchaos5rt, u1cchaos6rt, 
 u1ccoteds01rt, u1ccoteds02rt, u1ccoteds03rt, u1ccoteds04rt, u1ccoteds05rt, u1ccoteds06rt, u1ccoteds07rt, u1ccoteds08rt, 
 u1ccoteds09rt, u1ccoteds10rt, u1ccoteds11rt, u1ccoteds12rt, u1ccoteds13rt, u1ccoteds14rt, u1ccoteds15rt, u1ccoteds16rt, 
 u1ccoteds17rt, u1ccoteds18rt, u1ccoteds19rt, u1ccoteds20rt, u1ccoteds21rt, u1ccoteds22rt, u1ccoteds23rt, u1ccoteds24rt, 
 u1ccoteds25rt, u1ccoteds26rt, u1ccoteds27rt, u1ccoteds28rt, u1ccoteds29rt, u1ccoteds30rt, u1ccoteds31rt, u1ccoteds32rt, 
 u1ccoteds33rt, u1ccoteds34rt, u1ccoteds35rt, u1ccoteds36rt, u1ccoteds37rt, u1ccoteds38rt, u1ccoteds39rt, u1ccoteds40rt) / 60000.
IF (u1cq6stat = 2) u1cq6time = SUM(u1cdietrt, u1callg1rt, u1callg2rt, 
 u1creap01rt, u1creap02rt, u1creap03rt, u1creap04rrt, u1creap05rt, u1creap06rrt, 
 u1creap07rrt, u1creap08rt, u1creap09rt, u1creapqcrt, u1creap10rrt, u1creap11rrt, u1creap12rrt, 
 u1cantibrrt, u1crand1rt, u1crand2rt, u1crand3rrt, u1crand4rt, u1crand5rrt, 
 u1cpainkrrt, u1chtrt, u1cwtrt, u1chosp1rt, u1chosp2rt, 
 u1ceats01rt, u1ceats02rt, u1ceats03rt, u1ceats04rt, u1ceats05rt, u1ceats06rt, 
 u1ceats07rt, u1ceats08rt, u1ceats09rt, u1ceats10rt, u1ceats11rt, u1ceats12rt, 
 u1cactv1rt, u1cactv2rt, u1cactv3rt, u1cathlrt, 
 u1ceatd1rt, u1ceatd2rt, u1ceatd3rt, 
 u1cslfh01rt, u1cslfh02rt, u1cslfh03rt, u1cslfh04rt, u1cslfh05rt, u1cslfh06rt, u1cslfh07rt, 
 u1cslfh08rt, u1cslfh09rt, u1cslfh10rt, u1cslfh11rt, u1cslfh12rt, u1cslfh13rt) / 60000.
IF (u1cq7stat = 2) u1cq7time = SUM(u1clivs1rt, u1cses01rt, u1cses02rt, 
 u1cses03rt, u1cses04rt, u1cses05rt, u1cses06rt, u1cses07rt, u1cses08rt, u1cses09rt, u1cses10rt, u1clivs2rt, 
 u1cstex01rt, u1cstex02rt, u1cstex03rt, u1cstex04rt, u1cstex05rt, u1cstex06rt, u1cstex07rt, u1cstex08rt, u1cstex09rt, u1cstex10rt, 
 u1cstex11rt, u1cstex12rt, u1cstex13rt, u1cstex14rt, u1cstex15rt, u1cstex16rt, u1cstex17rt, u1cstex18rt, u1cstex19rt, 
 u1cdegr1rt, u1cdegr2rt, u1cbenf1rt, u1cbenf2rt) / 60000.
IF (u1cq8stat = 2) u1cq8time = SUM(u1cfina1rt, u1cfina2rt, u1cfina3rt, u1cfina4rt, u1cfina5rrt, 
 u1cfprd01rt, u1cfprd02rt, u1cfprd03rt, u1cfprd04rt, u1cfprd05rt, u1cfprd06rt, u1cfprd07rt, 
 u1cfprd08rt, u1cfprd09rt, u1cfprd10rt, u1cfprd11rt, u1cfprd12rt, u1cfprdqcrt, u1cfprd13rt, 
 u1cmona1rrt, u1cmona2rrt, u1cmona3rt, u1cmona4rt, u1cmona5rt, u1cmona6rrt) / 60000.
IF (u1cq9stat = 2) u1cq9time = SUM(u1cmedu01rt, u1cmedu02rt, u1cmedu03rt, 
 u1cmedu04rt, u1cmedu05rt, u1cmedu06rt, u1cmedu07rt, u1cmedu08rt, u1cmedu09rt, u1cmedu10rt, u1cmedu11rt, 
 u1cprob1rt, u1cprobqcrt, u1cprob2rt, u1cprob3rt, u1cprob4rt, u1cprob5rt, u1cprob6rt, 
 u1codat1rt, u1codat2rt, u1cobul1rt, u1cobul2rt, u1cobul3rt, u1cobul4rt) / 60000.
EXECUTE.
* Sum for total time if all finished.
IF (u1cstat = 2) u1ctime = SUM(u1cq1time, u1cq2time, u1cq3time,
 u1cq4time, u1cq5time, u1cq6time, u1cq7time, u1cq8time, u1cq9time).
EXECUTE.
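
The CMS calculation can be illustrated with a short Python sketch. It assumes that the item times are recorded in milliseconds, which the division by 60000 suggests but which is not stated explicitly here; the function name section_minutes_from_item_times and the example times are hypothetical. Missing item times (None) are skipped, in the same way that SUM ignores missing values.

```python
def section_minutes_from_item_times(item_times_ms, status):
    """Sum the non-missing item times (assumed milliseconds) and convert to minutes.

    Returns None unless the section is finished (status 2) and at least
    one item time is present.
    """
    if status != 2:
        return None
    present = [t for t in item_times_ms if t is not None]
    if not present:
        return None
    return sum(present) / 60000


# Hypothetical item times for a short section: 30s, 45s, one missing, 15s.
print(section_minutes_from_item_times([30000, 45000, None, 15000], 2))   # 1.5
print(section_minutes_from_item_times([30000, 45000, None, 15000], 1))   # None
```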

u1crandm1/2

RAND general health scale, derived from all 5 items of the measure in the phase 1 twin questionnaire. Items 3 and 5 are reversed for the scale. Each item has integer response values 1-5, so the scale, being computed as a mean, has the same 1-5 range. At least 3 of the 5 items are required to be non-missing for the scale to be computed.

COMPUTE u1crandm = MEAN.3(u1crand1, u1crand2, u1crand3r, u1crand4, u1crand5r).
EXECUTE.

u1creaphlt1/2, u1creapunt1/2, u1creapt1/2

REAP eating habits scales: two subscales and a total scale.
Derived from items of the REAP measure in the twin phase 1 questionnaire.
The subscales are for eating healthy foods (u1creaphlt) and for eating unhealthy foods (u1creapunt), based on 6 items each. The total scale (u1creapt) is derived from all 12 items of the measure.
Each item has integer response values 0-4, hence each scale has a range of values from 0 to (4 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Eating healthy items (1,2,3,5,8,9).
COMPUTE u1creaphlt = 6 * MEAN.3(u1creap01, u1creap02, u1creap03, u1creap05, u1creap08, u1creap09).
* Unhealthy items (4,6,7,10,11,12).
COMPUTE u1creapunt = 6 * MEAN.3(u1creap04, u1creap06, u1creap07, u1creap10, u1creap11, u1creap12).
* Total: all items, using reversed versions for unhealthy items.
COMPUTE u1creapt = 12 * MEAN.6(u1creap01, u1creap02, u1creap03,
 u1creap04r, u1creap05, u1creap06r, u1creap07r, u1creap08, u1creap09,
 u1creap10r, u1creap11r, u1creap12r).
EXECUTE.

u1crelam1/2, ucv1relam1/2, ucv2relam1/2, ucv3relam1/2, ucv4relam1/2

CLAS Love and Relationships scale, derived from items 4-6 of the measure in the TEDS21 phase 1 twin (u1c) questionnaire (items 1-3 are not suitable for scaling), and from the same three items as repeated in the covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) twin questionnaires. Each item has integer response values 1-5, so the scale, being computed as a mean, has the same 1-5 range. At least 2 of the 3 items are required to be non-missing for the scale to be computed.

COMPUTE u1crelam = MEAN.2(u1crela4, u1crela5, u1crela6).
COMPUTE ucv1relam = MEAN.2(ucv1rela1, ucv1rela2, ucv1rela3).
COMPUTE ucv2relam = MEAN.2(ucv2rela1, ucv2rela2, ucv2rela3).
COMPUTE ucv3relam = MEAN.2(ucv3rela1, ucv3rela2, ucv3rela3).
COMPUTE ucv4relam = MEAN.2(ucv4rela1, ucv4rela2, ucv4rela3).
EXECUTE.

u1crelgt1/2

Religiosity total scale, from the twin phase 1 questionnaire, derived from all 5 available items of the measure.
Items 1-4 have integer response values 0-5. Item 5 has response values 1-5, and for scaling purposes these are adjusted to the 0-5 range to match the other items. Hence the total scale has a range of values from 0 to 25 because it is computed as a mean of 0-5 variables then multiplied by 5. At least half the items are required to be non-missing for the scale to be computed.

* Items 1-4 have 6 responses coded 0-5, while item 5 has 5 responses coded 1-5.
* To make scale, re-scale item 5 to values 0-5 as for the other items.
*  by subtracting 1 then multiplying by 5/4.
NUMERIC u1crelgt (F4.2).
VARIABLE LEVEL u1crelgt (SCALE).
COMPUTE u1crelgt = 5 * MEAN.3(u1crelg1, u1crelg2, u1crelg3, u1crelg4, ((u1crelg5 - 1) * 1.25)).
EXECUTE.
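
The re-scaling of item 5 can be checked with a small worked example in Python. The sketch below is illustrative only: the function names rescale_item5 and religiosity_total are hypothetical, and None stands in for a missing response. Subtracting 1 and multiplying by 1.25 maps the responses 1, 2, 3, 4, 5 onto 0, 1.25, 2.5, 3.75, 5.

```python
def rescale_item5(value):
    """Map a 1-5 response onto the 0-5 range used by items 1-4."""
    return None if value is None else (value - 1) * 1.25


def religiosity_total(items_1_to_4, item5):
    """5 times the mean of the non-missing items, requiring at least 3 of the 5."""
    values = list(items_1_to_4) + [rescale_item5(item5)]
    present = [v for v in values if v is not None]
    if len(present) < 3:
        return None
    return 5 * sum(present) / len(present)


print([rescale_item5(v) for v in (1, 2, 3, 4, 5)])   # [0.0, 1.25, 2.5, 3.75, 5.0]
print(religiosity_total([0, 0, 0, 0], 1))            # 0.0
print(religiosity_total([5, 5, 5, 5], 5))            # 25.0
```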

u1crskt1/2, u1prskt1/2

Risk Taking total scales, from the same measure in the twin phase 1 (u1crsk) and parent phase 1 (u1prsk) questionnaires. In each case, the scale is derived from all 6 item variables. Each item has integer response values 0-4, hence each scale has a range of values from 0 to 24. At least half the items are required to be non-missing for each scale to be computed.

COMPUTE u1crskt = 6 * MEAN.3(u1crsk1, u1crsk2, u1crsk3, u1crsk4, u1crsk5, u1crsk6).
COMPUTE u1prskt1 = 6 * MEAN.3(u1prsk11, u1prsk21, u1prsk31, u1prsk41, u1prsk51, u1prsk61).
EXECUTE.

u1csdqemot1/2, u1csdqpert1/2, u1csdqhypt1/2, u1csdqcont1/2, u1csdqprot1/2, u1csdqbeht1/2, u1psdqemot1/2, u1psdqpert1/2, u1psdqhypt1/2, u1psdqcont1/2, u1psdqprot1/2, u1psdqbeht1/2, ucv1sdqemot1/2, ucv1sdqpert1/2, ucv1sdqhypt1/2, ucv1sdqcont1/2, ucv1sdqprot1/2, ucv1sdqbeht1/2, ucv2sdqemot1/2, ucv2sdqpert1/2, ucv2sdqhypt1/2, ucv2sdqcont1/2, ucv2sdqprot1/2, ucv2sdqbeht1/2, ucv3sdqemot1/2, ucv3sdqpert1/2, ucv3sdqhypt1/2, ucv3sdqcont1/2, ucv3sdqprot1/2, ucv3sdqbeht1/2, ucv4sdqemot1/2, ucv4sdqpert1/2, ucv4sdqhypt1/2, ucv4sdqcont1/2, ucv4sdqprot1/2, ucv4sdqbeht1/2

SDQ scales, from the TEDS21 phase 1 twin (u1c) and parent (u1p) questionnaires and from the covid twin phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires. The items of the measure were the same in all questionnaires, except that the wording was modified for the parent-reported version.
Total behaviour problems scale plus five subscales, derived from items (reversed where necessary) of the SDQ measure.
Each item has integer response values 0-2, hence each scale has a range of values from 0 to (2 * number of items) because it is computed as the mean multiplied by the number of component items. At least half the component items are required to be non-missing for each scale to be computed.
Note that the SDQ Emotion scale was previously referred to as Anxiety.

* Emotional symptoms (previously called Anxiety).
COMPUTE u1psdqemot1 = 5 * MEAN.3(u1psdqemo11, u1psdqemo21, u1psdqemo31, u1psdqemo41, u1psdqemo51).
COMPUTE u1csdqemot = 5 * MEAN.3(u1csdqemo1, u1csdqemo2, u1csdqemo3, u1csdqemo4, u1csdqemo5).
COMPUTE ucv1sdqemot = 5 * MEAN.3(ucv1sdqemo1, ucv1sdqemo2, ucv1sdqemo3, ucv1sdqemo4, ucv1sdqemo5).
COMPUTE ucv2sdqemot = 5 * MEAN.3(ucv2sdqemo1, ucv2sdqemo2, ucv2sdqemo3, ucv2sdqemo4, ucv2sdqemo5).
COMPUTE ucv3sdqemot = 5 * MEAN.3(ucv3sdqemo1, ucv3sdqemo2, ucv3sdqemo3, ucv3sdqemo4, ucv3sdqemo5).
COMPUTE ucv4sdqemot = 5 * MEAN.3(ucv4sdqemo1, ucv4sdqemo2, ucv4sdqemo3, ucv4sdqemo4, ucv4sdqemo5).
EXECUTE.
* Peer problems.
COMPUTE u1psdqpert1 = 5 * MEAN.3(u1psdqper11, u1psdqper2r1, u1psdqper3r1, u1psdqper41, u1psdqper51).
COMPUTE u1csdqpert = 5 * MEAN.3(u1csdqper1, u1csdqper2r, u1csdqper3r, u1csdqper4, u1csdqper5).
COMPUTE ucv1sdqpert = 5 * MEAN.3(ucv1sdqper1, ucv1sdqper2r, ucv1sdqper3r, ucv1sdqper4, ucv1sdqper5).
COMPUTE ucv2sdqpert = 5 * MEAN.3(ucv2sdqper1, ucv2sdqper2r, ucv2sdqper3r, ucv2sdqper4, ucv2sdqper5).
COMPUTE ucv3sdqpert = 5 * MEAN.3(ucv3sdqper1, ucv3sdqper2r, ucv3sdqper3r, ucv3sdqper4, ucv3sdqper5).
COMPUTE ucv4sdqpert = 5 * MEAN.3(ucv4sdqper1, ucv4sdqper2r, ucv4sdqper3r, ucv4sdqper4, ucv4sdqper5).
EXECUTE.
* Hyperactivity.
COMPUTE u1psdqhypt1 = 5 * MEAN.3(u1psdqhyp11, u1psdqhyp21, u1psdqhyp31, u1psdqhyp4r1, u1psdqhyp5r1).
COMPUTE u1csdqhypt = 5 * MEAN.3(u1csdqhyp1, u1csdqhyp2, u1csdqhyp3, u1csdqhyp4r, u1csdqhyp5r).
COMPUTE ucv1sdqhypt = 5 * MEAN.3(ucv1sdqhyp1, ucv1sdqhyp2, ucv1sdqhyp3, ucv1sdqhyp4r, ucv1sdqhyp5r).
COMPUTE ucv2sdqhypt = 5 * MEAN.3(ucv2sdqhyp1, ucv2sdqhyp2, ucv2sdqhyp3, ucv2sdqhyp4r, ucv2sdqhyp5r).
COMPUTE ucv3sdqhypt = 5 * MEAN.3(ucv3sdqhyp1, ucv3sdqhyp2, ucv3sdqhyp3, ucv3sdqhyp4r, ucv3sdqhyp5r).
COMPUTE ucv4sdqhypt = 5 * MEAN.3(ucv4sdqhyp1, ucv4sdqhyp2, ucv4sdqhyp3, ucv4sdqhyp4r, ucv4sdqhyp5r).
EXECUTE.
* Conduct.
COMPUTE u1psdqcont1 = 5 * MEAN.3(u1psdqcon11, u1psdqcon2r1, u1psdqcon31, u1psdqcon41, u1psdqcon51).
COMPUTE u1csdqcont = 5 * MEAN.3(u1csdqcon1, u1csdqcon2r, u1csdqcon3, u1csdqcon4, u1csdqcon5).
COMPUTE ucv1sdqcont = 5 * MEAN.3(ucv1sdqcon1, ucv1sdqcon2r, ucv1sdqcon3, ucv1sdqcon4, ucv1sdqcon5).
COMPUTE ucv2sdqcont = 5 * MEAN.3(ucv2sdqcon1, ucv2sdqcon2r, ucv2sdqcon3, ucv2sdqcon4, ucv2sdqcon5).
COMPUTE ucv3sdqcont = 5 * MEAN.3(ucv3sdqcon1, ucv3sdqcon2r, ucv3sdqcon3, ucv3sdqcon4, ucv3sdqcon5).
COMPUTE ucv4sdqcont = 5 * MEAN.3(ucv4sdqcon1, ucv4sdqcon2r, ucv4sdqcon3, ucv4sdqcon4, ucv4sdqcon5).
EXECUTE.
* Prosocial.
COMPUTE u1psdqprot1 = 5 * MEAN.3(u1psdqpro11, u1psdqpro21, u1psdqpro31, u1psdqpro41, u1psdqpro51).
COMPUTE u1csdqprot = 5 * MEAN.3(u1csdqpro1, u1csdqpro2, u1csdqpro3, u1csdqpro4, u1csdqpro5).
COMPUTE ucv1sdqprot = 5 * MEAN.3(ucv1sdqpro1, ucv1sdqpro2, ucv1sdqpro3, ucv1sdqpro4, ucv1sdqpro5).
COMPUTE ucv2sdqprot = 5 * MEAN.3(ucv2sdqpro1, ucv2sdqpro2, ucv2sdqpro3, ucv2sdqpro4, ucv2sdqpro5).
COMPUTE ucv3sdqprot = 5 * MEAN.3(ucv3sdqpro1, ucv3sdqpro2, ucv3sdqpro3, ucv3sdqpro4, ucv3sdqpro5).
COMPUTE ucv4sdqprot = 5 * MEAN.3(ucv4sdqpro1, ucv4sdqpro2, ucv4sdqpro3, ucv4sdqpro4, ucv4sdqpro5).
EXECUTE.
* Behaviour problems total - all items except prosocial.
COMPUTE u1psdqbeht1 = 20 * MEAN.10(u1psdqhyp11, u1psdqemo11, u1psdqcon11, u1psdqper11,
 u1psdqcon2r1, u1psdqemo21, u1psdqhyp21, u1psdqper2r1, u1psdqcon31, u1psdqemo31, u1psdqper3r1, 
 u1psdqhyp31, u1psdqemo41, u1psdqcon41, u1psdqper41, u1psdqhyp4r1, u1psdqcon51, u1psdqper51, 
 u1psdqemo51, u1psdqhyp5r1).
COMPUTE u1csdqbeht = 20 * MEAN.10(u1csdqhyp1, u1csdqemo1, u1csdqcon1, u1csdqper1, 
 u1csdqcon2r, u1csdqemo2, u1csdqhyp2, u1csdqper2r, u1csdqcon3, u1csdqemo3, u1csdqper3r, 
 u1csdqhyp3, u1csdqemo4, u1csdqcon4, u1csdqper4, u1csdqhyp4r, u1csdqcon5, u1csdqper5,
 u1csdqemo5, u1csdqhyp5r).
COMPUTE ucv1sdqbeht = 20 * MEAN.10(ucv1sdqhyp1, ucv1sdqemo1, ucv1sdqcon1, ucv1sdqper1, 
 ucv1sdqcon2r, ucv1sdqemo2, ucv1sdqhyp2, ucv1sdqper2r, ucv1sdqcon3, ucv1sdqemo3, ucv1sdqper3r, 
 ucv1sdqhyp3, ucv1sdqemo4, ucv1sdqcon4, ucv1sdqper4, ucv1sdqhyp4r, ucv1sdqcon5, ucv1sdqper5,
 ucv1sdqemo5, ucv1sdqhyp5r).
COMPUTE ucv2sdqbeht = 20 * MEAN.10(ucv2sdqhyp1, ucv2sdqemo1, ucv2sdqcon1, ucv2sdqper1, 
 ucv2sdqcon2r, ucv2sdqemo2, ucv2sdqhyp2, ucv2sdqper2r, ucv2sdqcon3, ucv2sdqemo3, ucv2sdqper3r, 
 ucv2sdqhyp3, ucv2sdqemo4, ucv2sdqcon4, ucv2sdqper4, ucv2sdqhyp4r, ucv2sdqcon5, ucv2sdqper5,
 ucv2sdqemo5, ucv2sdqhyp5r).
COMPUTE ucv3sdqbeht = 20 * MEAN.10(ucv3sdqhyp1, ucv3sdqemo1, ucv3sdqcon1, ucv3sdqper1, 
 ucv3sdqcon2r, ucv3sdqemo2, ucv3sdqhyp2, ucv3sdqper2r, ucv3sdqcon3, ucv3sdqemo3, ucv3sdqper3r, 
 ucv3sdqhyp3, ucv3sdqemo4, ucv3sdqcon4, ucv3sdqper4, ucv3sdqhyp4r, ucv3sdqcon5, ucv3sdqper5,
 ucv3sdqemo5, ucv3sdqhyp5r).
COMPUTE ucv4sdqbeht = 20 * MEAN.10(ucv4sdqhyp1, ucv4sdqemo1, ucv4sdqcon1, ucv4sdqper1, 
 ucv4sdqcon2r, ucv4sdqemo2, ucv4sdqhyp2, ucv4sdqper2r, ucv4sdqcon3, ucv4sdqemo3, ucv4sdqper3r, 
 ucv4sdqhyp3, ucv4sdqemo4, ucv4sdqcon4, ucv4sdqper4, ucv4sdqhyp4r, ucv4sdqcon5, ucv4sdqper5,
 ucv4sdqemo5, ucv4sdqhyp5r).
EXECUTE.
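
The structure of the SDQ scales (five subscales of 5 items each, plus a behaviour problems total over the 20 non-prosocial items) can be sketched in Python as below. This is an illustration under stated assumptions, not the TEDS derivation: the function names scale_total and sdq_scales are hypothetical, the inputs are assumed to be already reversed where necessary, and None stands in for a missing response.

```python
def scale_total(values, min_valid):
    """Number of items times the mean of non-missing items, or None if too few."""
    present = [v for v in values if v is not None]
    if len(present) < min_valid:
        return None
    return len(values) * sum(present) / len(present)


def sdq_scales(emotion, peer, hyperactivity, conduct, prosocial):
    """Each argument is a list of 5 item responses (0-2), reversed where needed."""
    problem_items = emotion + peer + hyperactivity + conduct
    return {
        "emotion": scale_total(emotion, 3),
        "peer": scale_total(peer, 3),
        "hyperactivity": scale_total(hyperactivity, 3),
        "conduct": scale_total(conduct, 3),
        "prosocial": scale_total(prosocial, 3),
        # Behaviour problems total: all items except prosocial, at least 10 of 20.
        "behaviour_total": scale_total(problem_items, 10),
    }


# Hypothetical responses for one twin, with no missing items.
scores = sdq_scales(
    emotion=[1, 2, 0, 1, 1],
    peer=[0, 0, 1, 1, 0],
    hyperactivity=[2, 2, 1, 1, 2],
    conduct=[0, 0, 0, 0, 1],
    prosocial=[2, 2, 2, 1, 2],
)
print(scores["emotion"], scores["prosocial"], scores["behaviour_total"])   # 5.0 9.0 16.0
```

With no missing items, the behaviour problems total equals the sum of the four problem subscales (here 5 + 2 + 8 + 1 = 16).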

u1cselft1/2

Self Control total scale, from the twin phase 1 questionnaire, derived from all 6 available items of the Self Control measure (reversed where necessary).
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 24 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cselft = 6 * MEAN.3(u1cself1r, u1cself2, u1cself3, u1cself4, u1cself5, u1cself6).
EXECUTE.

u1csexbriskt1/2

Risky sexual behaviour total scale, from the twin phase 1 questionnaire, derived from 5 of the 7 items of the measure.
Details of the derivation are explained in the syntax below. Briefly, each of items 2, 3, 4 and 6 is recoded to a 0-4 range, reversing where necessary such that 0=no risk and 4=highest risk. The scale is computed as the mean multiplied by the number of component items, hence it has a range 0-16. At least half the items are required to be non-missing for the scale to be computed. For twins who responded 0=no in screening item 1, a zero score is given.

* Risky sexual behaviour.
* Based on items 1, 2, 3, 4 and 6 of the sexual behaviour measure.
* Omit item 5 (other contraceptives) because it correlates negatively with item 4 and not at all with the others.
* Omit item 7 (HIV) as it has an odd negative correlation with item 1 and does not correlate with the others.
* Recode each of items 2, 3, 4 and 6 to a scale of 0-4, with 0=no risk and 4=highest risk.
* and in each case recoded from missing to 0 (no risk) if the item 1 response was 'no' (never had sex).

* Item 2 (raw coding 1-7): lower age=higher risk, so reverse the coding and convert to 0-4 scale.
* Highest age range response is 17 or older, corresponding to lowest risk level (recoded to 0).
COMPUTE u1csexb2R = (7 - u1csexb2) * 4 / 6.
EXECUTE.
* Item 3, number of sexual partners (raw coding 1-5) is coded in the right direction.
* rescale to 1-4 (1=1 person up to 4=15+ people).
* reserving 0 for the no-risk cases of sexual intercourse with 0 people.
RECODE u1csexb3
 (1=1) (2=1.75) (3=2.5) (4=3.25) (5=4)
INTO u1csexb3R.
EXECUTE.
* Item 4 (use of condoms) has coding 0-4 but needs reversing.
* (assume 'always' response is equivalent to zero risk).
RECODE u1csexb4
 (0=4) (1=3) (2=2) (3=1) (4=0)
INTO u1csexb4R.
EXECUTE.
* Item 6 needs no recoding: 0-4 with 0=no risk, but create a new variable to allow recoding from item 1.
COMPUTE u1csexb6R = u1csexb6.
EXECUTE.
* The 4 variables above are all missing if item 1 has response 0 (never had sex).
* so in this case recode them all to 0=no risk.
DO IF (u1csexb1 = 0).
 RECODE u1csexb2R u1csexb3R u1csexb4R u1csexb6R (SYSMIS=0).
END IF.
EXECUTE.
* Now create total scale from these four recoded variables.
COMPUTE u1csexbriskt = 4 * MEAN.2(u1csexb2R, u1csexb3R, u1csexb4R, u1csexb6R).
EXECUTE.
* drop temporary variables u1csexb2R u1csexb3R u1csexb4R u1csexb6R at the end of this script.
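
The recoding steps in the syntax above can be summarised in a short Python sketch. It is an illustration under stated assumptions, not the TEDS derivation itself: the function name `risk_total` is hypothetical, and missing responses are represented by `None`.

```python
def risk_total(item1, item2, item3, item4, item6):
    """Return the 0-16 risky sexual behaviour total, or None if not computable.

    Arguments are raw responses (None = missing), coded as described in the
    syntax comments: item1 0/1 screening, item2 1-7, item3 1-5, item4 0-4,
    item6 0-4.
    """
    item3_map = {1: 1, 2: 1.75, 3: 2.5, 4: 3.25, 5: 4}
    recoded = [
        None if item2 is None else (7 - item2) * 4 / 6,  # reverse, rescale to 0-4
        None if item3 is None else item3_map[item3],     # rescale to 1-4
        None if item4 is None else 4 - item4,            # reverse 0-4
        item6,                                           # already 0-4
    ]
    if item1 == 0:
        # Screening response 'no': treat missing items as 0 = no risk.
        recoded = [0 if x is None else x for x in recoded]
    present = [x for x in recoded if x is not None]
    if len(present) < 2:          # equivalent of MEAN.2 over four items
        return None
    return 4 * sum(present) / len(present)
```

For example, `risk_total(0, None, None, None, None)` gives 0, because the screening response converts all four missing items to zero before the mean is taken.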

u1csexorn1/2

Recoded, ordinal sexual orientation item, with integer values from 1 (always attracted to the opposite sex) up to 5 (always attracted to the same sex). This version of the item applies to both male and female twins. The raw item (u1csexor) was coded according to specific sexes, e.g. 1=always attracted to males. See comments in the syntax below for further details.

* Sexual orientation.
* Responses to this item were in terms of sex (male or female).
* Recode ordinally for twins of either sex into same-sex, opposite-sex, etc.
* always opposite sex.
IF (sex1 = 0 & u1csexor = 1) u1csexorn = 1.
IF (sex1 = 1 & u1csexor = 5) u1csexorn = 1.
* mostly opposite sex.
IF (sex1 = 0 & u1csexor = 2) u1csexorn = 2.
IF (sex1 = 1 & u1csexor = 4) u1csexorn = 2.
* equally same and opposite sex.
IF (u1csexor = 3) u1csexorn = 3.
* mostly same sex.
IF (sex1 = 0 & u1csexor = 4) u1csexorn = 4.
IF (sex1 = 1 & u1csexor = 2) u1csexorn = 4.
* always same sex.
IF (sex1 = 0 & u1csexor = 5) u1csexorn = 5.
IF (sex1 = 1 & u1csexor = 1) u1csexorn = 5.
EXECUTE.
* leave missing for raw responses of 6=little or no attraction.
* and 7=unsure, to preserve ordinal character of this derived variable.
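
The nine IF statements above amount to keeping the raw code for twins with sex1 = 0 and reversing it (6 minus the raw code) for twins with sex1 = 1. A minimal Python sketch of the same mapping follows; the function name `recode_orientation` is hypothetical and `None` stands for a missing value.

```python
def recode_orientation(sex, raw):
    """Map the raw item (1-7) to the ordinal 1-5 version, or None.

    sex is coded 0/1 as in sex1; raw is the u1csexor response.
    Raw responses 6 and 7 are deliberately left missing.
    """
    if raw == 3:
        # Equal attraction does not depend on the twin's sex.
        return 3
    if sex in (0, 1) and raw in (1, 2, 4, 5):
        return raw if sex == 0 else 6 - raw
    return None
```

As in the syntax, a raw response of 3 is recoded even when sex is missing, because that IF statement does not test sex1.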

u1cstat1/2

See u1cq1stat1/2, etc above.

u1cstexlikt1/2, u1cstexdevt1/2

Student Experiences subscales.
Derived from items 8 to 19 of the measure in the twin phase 1 questionnaire (items 1-7 are not suitable for scaling). The subscales are for liking of university (u1cstexlikt) and for personal and intellectual development (u1cstexdevt), based on 4 and 8 items respectively.
Each item has integer response values 0-4, hence each subscale has a range of values from 0 to (4 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Liking university subscale: items 8-11.
COMPUTE u1cstexlikt = 4 * MEAN.2(u1cstex08, u1cstex09, u1cstex10, u1cstex11).
* Personal and intellectual development: items 12-19.
COMPUTE u1cstexdevt = 8 * MEAN.4(u1cstex12, u1cstex13, u1cstex14,
 u1cstex15, u1cstex16, u1cstex17, u1cstex18, u1cstex19).
EXECUTE.

u1ctime1/2

See u1cq1time1/2, etc above.

u1ctwnrm1/2

See u1cdadrm1/2, etc above.

u1cvolnt1/2, ucv1volnt1/2, ucv2volnt1/2, ucv3volnt1/2

Volunteering total scale, from the TEDS21 twin phase 1 questionnaire (u1c) and the covid twin phase 1 (ucv1), phase 2 (ucv2) and phase 3 (ucv3) questionnaires. Derived from all available items of the measure (5 items in TEDS21, and a different set of 3 items in covid).
Each item has integer response values 0-4, hence each scale has a range of values from 0 to (4 x number of items) because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u1cvolnt = 5 * MEAN.3(u1cvoln1, u1cvoln2, u1cvoln3, u1cvoln4, u1cvoln5).
COMPUTE ucv1volnt = 3 * MEAN.2(ucv1voln1, ucv1voln2, ucv1voln3).
COMPUTE ucv2volnt = 3 * MEAN.2(ucv2voln1, ucv2voln2, ucv2voln3).
COMPUTE ucv3volnt = 3 * MEAN.2(ucv3voln1, ucv3voln2, ucv3voln3).
EXECUTE.

u1page

See u1cage1/2, etc above.

u1pconimpt1/2, u1pconinat1/2, u1pcont1/2

Conners ADHD scales.
Derived from items of the Conners measure in the parent phase 1 questionnaire.
The measure has a total scale (18 items) plus two subscales (9 items each).
Each item has integer response values 0-3, hence each scale has a range of values from 0 to (3 * number of component items) as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for each scale to be computed.

* Impulsivity (9 items).
COMPUTE u1pconimpt1 = 9 * MEAN.5(u1pcon101, u1pcon111, u1pcon121,
 u1pcon131, u1pcon141, u1pcon151, u1pcon161, u1pcon171, u1pcon181).
* Inattention (9 items).
COMPUTE u1pconinat1 = 9 * MEAN.5(u1pcon011, u1pcon021, u1pcon031,
 u1pcon041, u1pcon051, u1pcon061, u1pcon071, u1pcon081, u1pcon091).
* Total (all 18 items).
COMPUTE u1pcont1 = 18 * MEAN.9(u1pcon011, u1pcon021, u1pcon031,
 u1pcon041, u1pcon051, u1pcon061, u1pcon071, u1pcon081, u1pcon091,
 u1pcon101, u1pcon111, u1pcon121, u1pcon131, u1pcon141, u1pcon151,
 u1pcon161, u1pcon171, u1pcon181).
EXECUTE.

u1pdevmob, u1pdevwdth

Device categories used for the TEDS21 parent phase 1 questionnaire, if completed electronically.
u1pdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
u1pdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web/app server, that are not retained in the dataset. Mobile devices could be categorised from both app and web data, using different methods. Screen sizes were only recorded in the web data, not the app data.

* Mobile devices: CMS data.
* ------------------------.
* The raw CMS variable PlatformType has values 'Web' or 'Mobile'.
* so we can assume 'Mobile' value refers to mobile devices.
* while 'Web' in most cases probably means a web browser used on a laptop/desktop.
RECODE PlatformType ('Web'=0) ('Mobile'=1)
INTO u1pdevmob.

* Mobile devices: web backup.
* --------------------------.
* Use substrings of the consenttechuseragent string to categorise broad device types.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
* (in rare cases of Windows phones with Android installed, this supersedes 'Windows' above).
IF (CHAR.INDEX(u1pconsenttechuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows').
IF (CHAR.INDEX(u1pconsenttechuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(u1pconsenttechuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* 'Mobile' is another common substring, but it always indicates mobile phones.
* that are already categorised by other substrings above.

* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO u1pdevmob.
EXECUTE.

* Screen width: web backup only.
* ----------------------------.
* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (u1pconsenttechscrwidth >= u1pconsenttechscrheight) screenwidth = u1pconsenttechscrwidth.
IF (u1pconsenttechscrwidth < u1pconsenttechscrheight) screenwidth = u1pconsenttechscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO u1pdevwdth.
EXECUTE.
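
The web backup logic above can be illustrated with a small Python sketch. It is a hedged illustration rather than the code used by TEDS: the function names are hypothetical, the user-agent strings in the usage note are invented examples, and screen dimensions are assumed to be whole numbers of pixels.

```python
# Substrings checked in the same order as the IF statements in the syntax,
# so that a later match overrules an earlier one.
DEVICE_TOKENS = ["Windows", "Macintosh", "X11", "iPhone", "Android", "iPad", "Tablet"]


def is_mobile(user_agent):
    """Return 1 for phones/tablets, 0 for laptops/desktops, None if unmatched."""
    device_type = None
    for code, token in enumerate(DEVICE_TOKENS, start=1):
        if token in user_agent:      # case-sensitive, like CHAR.INDEX
            device_type = code
    if device_type is None:
        return None
    return 0 if device_type <= 3 else 1


def width_category(screen_width, screen_height):
    """Return 1=small, 2=medium, 3=large from the larger screen dimension."""
    widest = max(screen_width, screen_height)
    if widest <= 767:
        return 1
    if widest <= 1199:
        return 2
    return 3
```

For instance, a user-agent string containing both 'Windows' and 'Tablet' is flagged as mobile, because the later 'Tablet' match overrules the earlier 'Windows' match.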

u1pduration

Duration of the phase 1 parent questionnaire, measured in decimal minutes, derived as the difference between the start date-time and the end date-time. The same method is used to derive this variable in the CMS and backup data, but the duration cannot be measured for paper booklet data. The start and end date-time variables are not retained in the dataset.

* Derive a duration variable, as the difference between the start and end date-times.
* This will not necessarily match the total time variable (derived elsewhere on this page).
* because the 4 questionnaire sections may have been completed at different dates or times.
* with pauses in between; and the overall duration will include consent, address check, etc.
* Only derive for cases where the whole thing was finished.
* Derive as number of seconds divided by 60 to get decimal minutes.
IF (u1pstat = 2) u1pduration = DATEDIFF(u1pend, u1pstart, 'seconds') / 60.
EXECUTE.
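
As a rough illustration of the calculation above, the same duration can be obtained in Python from two date-time values. The function name `duration_minutes` is hypothetical, and the sketch assumes the timestamps are recorded to whole seconds and that a status of 2 means finished, as described on this page.

```python
from datetime import datetime


def duration_minutes(status, start, end):
    """Decimal minutes between start and end, only for finished cases."""
    if status != 2 or start is None or end is None:
        return None
    return (end - start).total_seconds() / 60


# Hypothetical example: started at 10:00:00 and finished at 10:12:30.
example = duration_minutes(2, datetime(2018, 6, 1, 10, 0, 0),
                           datetime(2018, 6, 1, 10, 12, 30))
```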

u1pLLCage1/2, u1pLLCdate1/2

See u1cLLCage1/2, etc above.

u1pparm1/2, u1pparnegm1/2, u1pparposm1/2

Parental feelings scales, derived from the 6 items of the measure in the phase 1 parent questionnaire.
The subscales are for negative feelings (u1pparnegm), and positive feelings (u1pparposm). The overall scale (u1pparm) is derived from both negative and positive feelings with coding in the same direction as negative feelings.
Each subscale is derived from 3 items, of which at least 2 are required to be non-missing. The overall scale is derived from all 6 items, with the positive items reversed (see the syntax below), and at least 4 of the 6 items are required to be non-missing.
Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Parental feelings.
* TEDS21 Phase 1, parent-rated, 6 items.
* Two subscales, for positive and negative feelings, 3 items each.
* and an overall scale from all 6 items.
* Negative feelings: items 1, 4, 6.
COMPUTE u1pparnegm1 = MEAN.2(u1ppar11, u1ppar41, u1ppar61).
* Positive feelings: items 2, 3, 5.
COMPUTE u1pparposm1 = MEAN.2(u1ppar21, u1ppar31, u1ppar51).
EXECUTE.
* Overall scale, coded in the negative feelings direction.
* derived from all 6 items, reversing each positive item.
* by subtracting it from 6 (because coded 1/2/3/4/5).
COMPUTE u1pparm1 = MEAN.4(u1ppar11, (6 - u1ppar21), (6 - u1ppar31), 
    u1ppar41, (6 - u1ppar51), u1ppar61).
EXECUTE.
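
A minimal Python sketch of the three parental feelings scales is given below, to show how the reversal of the positive items and the missing-data thresholds interact. The names `mean_min` and `parental_feelings` are hypothetical, and `None` is used for missing responses.

```python
def mean_min(values, min_valid):
    """Mean of non-missing values, or None if fewer than min_valid are present."""
    present = [v for v in values if v is not None]
    if len(present) < min_valid:
        return None
    return sum(present) / len(present)


def parental_feelings(items):
    """items: dict mapping item number (1-6) to a 1-5 response or None.

    Returns (negative, positive, overall) scale values.
    """
    def reverse(v):
        # Reverse a 1-5 response; missing stays missing.
        return None if v is None else 6 - v

    negative = mean_min([items[1], items[4], items[6]], 2)
    positive = mean_min([items[2], items[3], items[5]], 2)
    overall = mean_min([items[1], reverse(items[2]), reverse(items[3]),
                        items[4], reverse(items[5]), items[6]], 4)
    return negative, positive, overall
```

Note that a subscale can be present while the overall scale is missing, for example when only two negative items and one positive item were answered.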

u1pq1quan, u1pq2quan, u1pq3quan, u1pq4quan, u1pquan

Numbers of questions answered in each section (u1pqXquan) and overall (u1pquan) in the phase 1 parent questionnaire. The same method is used to count items in each version (CMS, backup, paper). Note that most questions in the parent questionnaire were optional (they had a 'prefer not to answer' option in the electronic versions, and could be skipped arbitrarily in the paper version). The counts include all items except for those that could have been skipped due to branching rules in the electronic versions. Note for the CoTEDS measure, the syntax counts just 3 raw items before they were restructured and renamed.

* Count items within each of the 4 sections as originally defined.
* in the CMS version and also used in the backup version.
* Omit items that could be skipped because of branching.
COUNT u1pq1quan = u1pliv1x u1pses1mw u1pses2fw u1pses3mq u1pses4fq u1pses5in
 u1ppar11 u1ppar12 u1ppar21 u1ppar22 u1ppar31 u1ppar32
 u1ppar41 u1ppar42 u1ppar51 u1ppar52 u1ppar61 u1ppar62
 u1prsk11 u1prsk12 u1prsk21 u1prsk22 u1prsk31 u1prsk32
 u1prsk41 u1prsk42 u1prsk51 u1prsk52 u1prsk61 u1prsk62
 (0 THRU HIGHEST).
COUNT u1pq2quan = u1pcon011 u1pcon012 u1pcon021 u1pcon022 u1pcon031 u1pcon032
 u1pcon041 u1pcon042 u1pcon051 u1pcon052 u1pcon061 u1pcon062
 u1pcon071 u1pcon072 u1pcon081 u1pcon082 u1pcon091 u1pcon092 
 u1pcon101 u1pcon102 u1pcon111 u1pcon112 u1pcon121 u1pcon122
 u1pcon131 u1pcon132 u1pcon141 u1pcon142 u1pcon151 u1pcon152
 u1pcon161 u1pcon162 u1pcon171 u1pcon172 u1pcon181 u1pcon182
 (0 THRU HIGHEST).
COUNT u1pq3quan = u1psdq011 u1psdq012 u1psdq021 u1psdq022 u1psdq031 u1psdq032 u1psdq041 u1psdq042
 u1psdq051 u1psdq052 u1psdq061 u1psdq062 u1psdq071 u1psdq072 u1psdq081 u1psdq082
 u1psdq091 u1psdq092 u1psdq101 u1psdq102 u1psdq111 u1psdq112 u1psdq121 u1psdq122
 u1psdq131 u1psdq132 u1psdq141 u1psdq142 u1psdq151 u1psdq152 u1psdq161 u1psdq162
 u1psdq171 u1psdq172 u1psdq181 u1psdq182 u1psdq191 u1psdq192 u1psdq201 u1psdq202
 u1psdq211 u1psdq212 u1psdq221 u1psdq222 u1psdq231 u1psdq232 u1psdq241 u1psdq242
 u1psdq251 u1psdq252 (0 THRU HIGHEST).
COUNT u1pq4quan = u1psan011 u1psan012 u1psan021 u1psan022 u1psan031 u1psan032
 u1psan041 u1psan042 u1psan051 u1psan052 u1psan061 u1psan062
 u1psan071 u1psan072 u1psan081 u1psan082 u1psan091 u1psan092 
 u1psan101 u1psan102
 u1pcoteds01 u1pcoteds29 u1pcoteds30
 (0 THRU HIGHEST).
EXECUTE.

* Overall total count.
COMPUTE u1pquan = SUM(u1pq1quan, u1pq2quan, u1pq3quan, u1pq4quan).
EXECUTE.

u1pq1stat, u1pq2stat, u1pq3stat, u1pq4stat, u1pstat

Status variables for each of the 4 sections (u1pqXstat) and overall (u1pstat) in the phase 1 parent questionnaire. Each is coded 0=not started, 1=started but not finished, 2=finished.
Different methods are used to calculate the status variables in the three versions (CMS, backup, paper). The CMS and backup versions had different types of raw status variables that could be used. The CMS raw data contains variable QnrCompletion, with values 0 to 100, indicating the status of the questionnaire as a whole. The backup raw data contains a status variable for each section, coded 0=not completed, 1=completed (data were only submitted at the end of each section, hence partially-completed sections are not present in the raw backup data). For the paper version, the status variables are based purely on the counts of the questions answered in each section (these are derived variables described elsewhere on this page). Note that the sections and their component questions were presented in strict sequence in the electronic versions, so if a section was left unfinished then subsequent sections would necessarily be unstarted. In the paper version, participants could of course leave questions or whole sections unanswered in a wholly arbitrary way.

* CMS raw data.
* ------------.
* Derive a status variable for each section, coded 0=not started 1=started 2=finished.
* Most questions in CMS parent questionnaire are optional, so may be missing.
* therefore need careful approach to determine whether finished.
COMPUTE u1pq1stat = 0.
COMPUTE u1pq2stat = 0.
COMPUTE u1pq3stat = 0.
COMPUTE u1pq4stat = 0.
COMPUTE u1pstat = 0.
EXECUTE.
* If QnrCompletion=100, all must have been finished.
DO IF (QnrCompletion = 100).
 RECODE u1pq1stat u1pq2stat u1pq3stat u1pq4stat u1pstat (0=2).
END IF.
EXECUTE.
* 1st section: assume complete if all 30 items answered or if some of 2nd section done.
* and assume partially complete if 1-29 answered and nothing in 2nd section.
IF (u1pq1stat = 0 & (u1pq1quan = 30 | u1pq2quan > 0)) u1pq1stat = 2.
IF (u1pq1stat = 0 & (RANGE(u1pq1quan,1,29) & u1pq2quan = 0)) u1pq1stat = 1.
EXECUTE.
* 2nd section similarly (36 items of Conners).
IF (u1pq2stat = 0 & (u1pq2quan = 36 | u1pq3quan > 0)) u1pq2stat = 2.
IF (u1pq2stat = 0 & (RANGE(u1pq2quan,1,35) & u1pq3quan = 0)) u1pq2stat = 1.
EXECUTE.
* 3rd section similarly (50 items of SDQ).
IF (u1pq3stat = 0 & (u1pq3quan = 50 | u1pq4quan > 0)) u1pq3stat = 2.
IF (u1pq3stat = 0 & (RANGE(u1pq3quan,1,49) & u1pq4quan = 0)) u1pq3stat = 1.
EXECUTE.
* 4th section is last and completed if QnrCompletion=100 as above.
* hence treat as unfinished if 1 or more item done and QnrCompletion < 100.
IF (u1pq4stat = 0 & u1pq4quan > 0 & QnrCompletion < 100) u1pq4stat = 1.
EXECUTE.

* Overall status is unfinished if 1st section started but overall unfinished.
IF (u1pstat = 0 & u1pq1stat > 0 & QnrCompletion < 100) u1pstat = 1.
EXECUTE.

* Backup raw data.
* ---------------.
* Derive a status variable for each section, coded 0=not started 1=started 2=finished.
* In this backup version, each section is either not started or finished.
* because data are only submitted at the end of the section.
* Can therefore conveniently recode the existing status flags (0/1) from the admin file.
* regardless of the number of items completed.
RECODE Parentquestionnaire1__status Parentquestionnaire2__status
 Parentquestionnaire3__status Parentquestionnaire4__status
 (0=0) (1=2)
INTO u1pq1stat u1pq2stat u1pq3stat u1pq4stat.
EXECUTE.
* Overall status may be unfinished if fewer than 4 sections completed.
IF (u1pq1stat = 0) u1pstat = 0.
IF (u1pq1stat > 0 & u1pq4stat < 2) u1pstat = 1.
IF (u1pq4stat = 2) u1pstat = 2.
EXECUTE.

* Paper raw data.
* --------------.
### not yet processed ###
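
The CMS rules above can be condensed into a short Python sketch, shown here only to make the decision logic easier to follow. The function name `cms_status` is hypothetical, and the sketch assumes that QnrCompletion is a non-missing number between 0 and 100 and that the item counts are non-negative integers.

```python
def cms_status(qnr_completion, quan):
    """Derive section and overall status from CMS data.

    quan: list of four counts of items answered in sections 1-4.
    Returns (list of four section statuses, overall status), each coded
    0=not started, 1=started but not finished, 2=finished.
    """
    section_sizes = [30, 36, 50]      # items in sections 1-3, as in the syntax
    if qnr_completion == 100:
        return [2, 2, 2, 2], 2
    status = [0, 0, 0, 0]
    for i, size in enumerate(section_sizes):
        if quan[i] == size or quan[i + 1] > 0:
            status[i] = 2             # all answered, or next section started
        elif 1 <= quan[i] <= size - 1:
            status[i] = 1             # partly answered, next section untouched
    if quan[3] > 0:
        status[3] = 1                 # last section started but not completed
    overall = 1 if status[0] > 0 else 0
    return status, overall
```

For example, `cms_status(40, [28, 10, 0, 0])` treats section 1 as finished (because section 2 was started), section 2 as started, and the questionnaire as a whole as started but not finished.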

u1pq1time, u1pq2time, u1pq3time, u1pq4time, u1ptime

Time taken to complete each section (u1pqXtime) and the whole (u1ptime) of the phase 1 parent questionnaire. The times are measured in decimal minutes.
These variables are derived in different ways in the CMS raw data (as a sum of item times) and the backup raw data (as a difference between start and end date-times). The variables cannot be derived for paper booklet data. Note that the item times used in the CMS derivation, and the section start/end times used in the backup derivation, are not retained in the dataset because they are not present in other versions.

* Backup raw data.
* ---------------.
* In the backup data, there are no item times but there is a date-time.
* for the start and end of each section - use these to derive the duration in minutes.
* Should be roughly compatible with the times in the CMS version, which are.
* derived as sums of item times.
* Only derive if the relevant section was completed.
IF (u1pq1stat = 2) u1pq1time = DATEDIFF(Parentquestionnaire1__subTime, Parentquestionnaire1__genTime, 'seconds') / 60.
IF (u1pq2stat = 2) u1pq2time = DATEDIFF(Parentquestionnaire2__subTime, Parentquestionnaire2__genTime, 'seconds') / 60.
IF (u1pq3stat = 2) u1pq3time = DATEDIFF(Parentquestionnaire3__subTime, Parentquestionnaire3__genTime, 'seconds') / 60.
IF (u1pq4stat = 2) u1pq4time = DATEDIFF(Parentquestionnaire4__subTime, Parentquestionnaire4__genTime, 'seconds') / 60.
EXECUTE.
* Sum for total time if all finished.
IF (u1pstat = 2) u1ptime = SUM(u1pq1time, u1pq2time, u1pq3time, u1pq4time).
EXECUTE.

* CMS raw data.
* ------------.
* Sum the item times for each questionnaire section, including CoTEDS.
* and divide by 60000 to get a time in minutes.
* Only derive times for completed sections.
IF (u1pq1stat = 2) u1pq1time = SUM(u1pliv1rt, u1pliv2rt1, u1pliv2rt2,
 u1pses1rt, u1pses2rt, u1pses3rt, u1pses4rt, u1pses5rt, 
 u1ppar1rt1, u1ppar1rt2, u1ppar2rt1, u1ppar2rt2, u1ppar3rt1, u1ppar3rt2, u1ppar4rt1, u1ppar4rt2, 
 u1ppar5rt1, u1ppar5rt2, u1ppar6rt1, u1ppar6rt2, u1prsk1rt1, u1prsk1rt2, u1prsk2rt1, u1prsk2rt2, 
 u1prsk3rt1, u1prsk3rt2, u1prsk4rt1, u1prsk4rt2, u1prsk5rt1, u1prsk5rt2, u1prsk6rt1, u1prsk6rt2) / 60000.
IF (u1pq2stat = 2) u1pq2time = SUM(u1pcon01rt1, u1pcon01rt2,
 u1pcon02rt1, u1pcon02rt2, u1pcon03rt1, u1pcon03rt2, 
 u1pcon04rt1, u1pcon04rt2, u1pcon05rt1, u1pcon05rt2, u1pcon06rt1, u1pcon06rt2, 
 u1pcon07rt1, u1pcon07rt2, u1pcon08rt1, u1pcon08rt2, u1pcon09rt1, u1pcon09rt2, 
 u1pcon10rt1, u1pcon10rt2, u1pcon11rt1, u1pcon11rt2, u1pcon12rt1, u1pcon12rt2, 
 u1pcon13rt1, u1pcon13rt2, u1pcon14rt1, u1pcon14rt2, u1pcon15rt1, u1pcon15rt2, 
 u1pcon16rt1, u1pcon16rt2, u1pcon17rt1, u1pcon17rt2, u1pcon18rt1, u1pcon18rt2) / 60000.
IF (u1pq3stat = 2) u1pq3time = SUM(u1psdq01rt1, u1psdq01rt2,
 u1psdq02rt1, u1psdq02rt2, u1psdq03rt1, u1psdq03rt2, u1psdq04rt1, u1psdq04rt2, 
 u1psdq05rt1, u1psdq05rt2, u1psdq06rt1, u1psdq06rt2, u1psdq07rt1, u1psdq07rt2, u1psdq08rt1, u1psdq08rt2, 
 u1psdq09rt1, u1psdq09rt2, u1psdq10rt1, u1psdq10rt2, u1psdq11rt1, u1psdq11rt2, u1psdq12rt1, u1psdq12rt2, 
 u1psdq13rt1, u1psdq13rt2, u1psdq14rt1, u1psdq14rt2, u1psdq15rt1, u1psdq15rt2, u1psdq16rt1, u1psdq16rt2, 
 u1psdq17rt1, u1psdq17rt2, u1psdq18rt1, u1psdq18rt2, u1psdq19rt1, u1psdq19rt2, u1psdq20rt1, u1psdq20rt2, 
 u1psdq21rt1, u1psdq21rt2, u1psdq22rt1, u1psdq22rt2, u1psdq23rt1, u1psdq23rt2, u1psdq24rt1, u1psdq24rt2, 
 u1psdq25rt1, u1psdq25rt2) / 60000.
IF (u1pq4stat = 2) u1pq4time = SUM(u1psan01rt1, u1psan01rt2, 
 u1psan02rt1, u1psan02rt2, u1psan03rt1, u1psan03rt2, u1psan04rt1, u1psan04rt2, 
 u1psan05rt1, u1psan05rt2, u1psan06rt1, u1psan06rt2, u1psan07rt1, u1psan07rt2, u1psan08rt1, u1psan08rt2, 
 u1psan09rt1, u1psan09rt2, u1psan10rt1, u1psan10rt2, 
 u1pcoteds01rt, u1pcoteds02rt, u1pcoteds03rt, u1pcoteds04rt, u1pcoteds05rt, u1pcoteds06rt, u1pcoteds07rt, 
 u1pcoteds08rt, u1pcoteds09rt, u1pcoteds10rt, u1pcoteds11rt, u1pcoteds12rt, u1pcoteds13rt, u1pcoteds14rt, 
 u1pcoteds15rt, u1pcoteds16rt, u1pcoteds17rt, u1pcoteds18rt, u1pcoteds19rt, u1pcoteds20rt, u1pcoteds21rt, 
 u1pcoteds22rt, u1pcoteds23rt, u1pcoteds24rt, u1pcoteds25rt, u1pcoteds26rt, u1pcoteds27rt, u1pcoteds28rt, 
 u1pcoteds29rt, u1pcoteds30rt, u1pcoteds31rt, u1pcoteds32rt) / 60000 .
EXECUTE.
* Sum for total time if all finished.
IF (u1pstat = 2) u1ptime = SUM(u1pq1time, u1pq2time, u1pq3time, u1pq4time).
EXECUTE.

u1prskt1/2

See u1crskt1/2, u1prskt1/2 above.

u1psant1/2

SANS total scale, from all 10 items in the phase 1 parent questionnaire. Each item has integer response values 0-4, hence the scale has a range of values from 0 to 40 as it is a 'total' computed by multiplying the mean by the number of items. At least half the component items are required to be non-missing for the scale to be computed.

COMPUTE u1psant1 = 10 * MEAN.5(u1psan011, u1psan021, u1psan031,
 u1psan041, u1psan051, u1psan061, u1psan071, u1psan081, u1psan091, u1psan101).
EXECUTE.

u1psdqemot1/2, u1psdqpert1/2, u1psdqhypt1/2, u1psdqcont1/2, u1psdqprot1/2, u1psdqbeht1/2

See u1csdqemot1/2, etc above.

u1pses

SES composite for parents, derived from 5 ordinal items in the phase 1 parent SES questionnaire. Each component, and the final composite, is standardised by cohort to eliminate significant cohort differences. Coding is such that higher values indicate higher SES. The derivation is explained by comments in the syntax.

* Derive parent SES composite from 5 components (household income, mother/father SOC and education).
* All components show significant cohort effects.
* so start by standardising them within each cohort.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES= u1pmosoc u1pfasoc u1pmohqual u1pfahqual u1pses5in
  /SAVE.
  
SPLIT FILE OFF.

* Reverse the parent SOC scores so high values = high SES.
COMPUTE Zu1pmosocR = -1 * Zu1pmosoc.
COMPUTE Zu1pfasocR = -1 * Zu1pfasoc.
EXECUTE.

* Parent SES composite is an equally-weighted mean of the 5 standardised components.
* with a requirement that at least 2 are non-missing.
COMPUTE parentses = MEAN.2(Zu1pmohqual, Zu1pfahqual, Zu1pses5in, Zu1pmosocR, Zu1pfasocR).
EXECUTE.

* Re-standardise the new composites to correct the SD to 1.
* and to correct any cohort differences that may have reappeared in the mean.
SORT CASES  BY ucohort.
SPLIT FILE SEPARATE BY ucohort.

DESCRIPTIVES VARIABLES= parentses (u1pses) 
  /SAVE.

SPLIT FILE OFF.
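
The sequence of steps above (standardise within cohort, reverse the SOC components, average, then re-standardise within cohort) can be sketched in Python using only the standard library. This is an illustrative sketch with hypothetical function and argument names, not the TEDS derivation; it assumes the sample standard deviation (n - 1 denominator) is used when standardising and represents missing values with `None`.

```python
from statistics import mean, stdev


def zscores_by_group(values, groups):
    """Standardise values within each group; missing values stay None."""
    out = [None] * len(values)
    for g in set(groups):
        idx = [i for i, v in enumerate(values) if groups[i] == g and v is not None]
        if len(idx) < 2:
            continue                  # cannot estimate an SD from one value
        m = mean(values[i] for i in idx)
        s = stdev(values[i] for i in idx)
        if s == 0:
            continue                  # no variation within this group
        for i in idx:
            out[i] = (values[i] - m) / s
    return out


def ses_composite(components, reversed_names, cohorts, min_valid=2):
    """components: dict of name -> list of values (one per family)."""
    z = []
    for name, vals in components.items():
        zs = zscores_by_group(vals, cohorts)
        if name in reversed_names:    # e.g. SOC codes, where low = high SES
            zs = [None if v is None else -v for v in zs]
        z.append(zs)
    raw = []
    for row in zip(*z):
        present = [v for v in row if v is not None]
        raw.append(mean(present) if len(present) >= min_valid else None)
    return zscores_by_group(raw, cohorts)   # re-standardise the composite
```

Re-standardising at the end matters because the mean of several z-scores generally has a standard deviation below 1.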

u2calco031/2, u2calco051/2, u2calcoaudit1/2, ucv1alco21/2, ucv2alco21/2, ucv3alco21/2, ucv4alco21/2

AUDIT scale for alcohol use, with associated recoded items measuring alcohol units.
u2calcoaudit is the alcohol use AUDIT total scale, derived from items 4-13 of the TEDS21 phase 2 twin questionnaire measure. This scale is designed to match the published AUDIT scale as closely as possible.
u2calco03, u2calco05, ucv1alco2, ucv2alco2, ucv3alco2 and ucv4alco2 are estimates of the total number of alcohol units consumed, derived from respective 4-part questions in the TEDS21 phase 2 questionnaire (u2c, questions 3 and 5) and in the Covid twin questionnaires (question 2, ucv1 is phase 1, ucv2 is phase 2, ucv3 is phase 3 and ucv4 is phase 4). Note that only one of these units variables (u2calco05) is subsequently used in the derivation of the AUDIT scale.
For the AUDIT scale, the four parts of item 5 are first combined and recoded to values 0-4, as shown in the syntax below, to match the coding of the other nine items. The scale is then derived as the mean multiplied by the number of items (10), resulting in a range of values from 0 to 40. The questionnaire includes a screening item (item 1); if the response was 0=no, the scale is assigned a value of zero (otherwise it would be missing).
In deriving the mean, at least half the items are required to be non-missing for the scale to be computed.

* First convert the four parts of TEDS21 item 5 (and Covid item 2) into approximate numbers of units.
* by recoding response codes to the mid-range point of the number of drinks.
* for example 1-2 is 1.5, 2-4 is 3 and top of range 26+ is 30.
* Each glass of wine or pint of beer/cider is assumed to be 2 units, so multiply these by 2.
* Do the same for TEDS21 item 3 although the range is extended at the top end in this latter case.
RECODE u2calco03a u2calco03b u2calco05a u2calco05b
 ucv1alco2a ucv1alco2b ucv2alco2a ucv2alco2b ucv3alco2a ucv3alco2b ucv4alco2a ucv4alco2b
 (0=0) (1=3) (2=8) (3=16) (4=26) (5=36) (6=46) (7=60)
INTO u2calco3wineunits u2calco3beerunits u2calco5wineunits u2calco5beerunits
 ucv1wineunits ucv1beerunits ucv2wineunits ucv2beerunits 
 ucv3wineunits ucv3beerunits ucv4wineunits ucv4beerunits.
* Measures of alcopops and spirits are assumed to be 1 unit.
RECODE u2calco03c u2calco03d u2calco05c u2calco05d
 ucv1alco2c ucv1alco2d ucv2alco2c ucv2alco2d ucv3alco2c ucv3alco2d ucv4alco2c ucv4alco2d
 (0=0) (1=1.5) (2=4) (3=8) (4=13) (5=18) (6=23) (7=30)
INTO u2calco3alcopopunits u2calco3spiritunits u2calco5alcopopunits u2calco5spiritunits
 ucv1alcopopunits ucv1spiritunits ucv2alcopopunits ucv2spiritunits
 ucv3alcopopunits ucv3spiritunits ucv4alcopopunits ucv4spiritunits.
EXECUTE.
* Sum to get a total number of units, in the replacement variables.
* These will be retained in place of the raw items.
* round to an integer because these are approximate anyway.
COMPUTE u2calco03 = RND(SUM(u2calco3wineunits, u2calco3beerunits, u2calco3alcopopunits, u2calco3spiritunits)).
COMPUTE u2calco05 = RND(SUM(u2calco5wineunits, u2calco5beerunits, u2calco5alcopopunits, u2calco5spiritunits)).
COMPUTE ucv1alco2 = RND(SUM(ucv1wineunits, ucv1beerunits, ucv1alcopopunits, ucv1spiritunits)). 
COMPUTE ucv2alco2 = RND(SUM(ucv2wineunits, ucv2beerunits, ucv2alcopopunits, ucv2spiritunits)). 
COMPUTE ucv3alco2 = RND(SUM(ucv3wineunits, ucv3beerunits, ucv3alcopopunits, ucv3spiritunits)). 
COMPUTE ucv4alco2 = RND(SUM(ucv4wineunits, ucv4beerunits, ucv4alcopopunits, ucv4spiritunits)). 
EXECUTE.
* Recode number of units in Q5 to categories (0-4 scale).
* using the ranges in the published AUDIT scale.
RECODE u2calco05 
 (0 THRU 2=0) (2.1 THRU 4=1) (4.1 THRU 6=2) (6.1 THRU 9.9=3) (10 THRU HIGHEST=4)
INTO u2calco05un.
EXECUTE.
* Now create a total AUDIT score from items 4-13, all coded 0-4, including recoded item 5.
COMPUTE u2calcoaudit = 10 * MEAN.5(u2calco04, u2calco05un, u2calco06,
 u2calco07, u2calco08, u2calco09, u2calco10, u2calco11, u2calco12, u2calco13).
EXECUTE.
* Item 1 (ever had a drink) is a screening question: if 'no', other items are missing.
* The published AUDIT scale does not have a screening question.
* so assume scale value should be 0 if item 1 response is no.
IF (u2calco01 = 0) u2calcoaudit = 0.
EXECUTE.
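
The conversion from response codes to units, and from units to the 0-4 AUDIT category, can be sketched in Python as below. The lookup tables copy the values from the RECODE commands above; the function names are hypothetical, `None` stands for a missing response, and the rounding is written as "round half up", which is assumed here to match RND for the non-negative values involved.

```python
# Response code -> approximate units, copied from the RECODE commands.
WINE_BEER_UNITS = {0: 0, 1: 3, 2: 8, 3: 16, 4: 26, 5: 36, 6: 46, 7: 60}
ALCOPOP_SPIRIT_UNITS = {0: 0, 1: 1.5, 2: 4, 3: 8, 4: 13, 5: 18, 6: 23, 7: 30}


def total_units(wine, beer, alcopops, spirits):
    """Rounded total units from the four response codes (None = missing)."""
    parts = [WINE_BEER_UNITS.get(wine), WINE_BEER_UNITS.get(beer),
             ALCOPOP_SPIRIT_UNITS.get(alcopops), ALCOPOP_SPIRIT_UNITS.get(spirits)]
    present = [p for p in parts if p is not None]
    if not present:
        return None                   # all four parts missing
    return int(sum(present) + 0.5)    # round half up (values are non-negative)


def audit_category(units):
    """Recode integer units to the 0-4 category used for AUDIT item 5."""
    if units is None:
        return None
    for upper, category in ((2, 0), (4, 1), (6, 2), (9, 3)):
        if units <= upper:
            return category
    return 4
```

Python's built-in `round` is avoided here because it rounds halves to the nearest even number, which would turn a total of 4.5 units into 4 rather than 5.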

u2cambit1/2

Ambition total scale, from the twin phase 2 questionnaire, derived from all 5 available items of the measure (reversed where necessary).
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 20 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2cambit = 5 * MEAN.3(u2cambi1, u2cambi2, u2cambi3, (4 - u2cambi4), u2cambi5).
EXECUTE.

u2cantbnonvt1/2, u2cantbt1/2, u2cantbviolt1/2

Antisocial behaviour scales, from the twin phase 2 questionnaire, derived from 14 of the 16 available items in the measure.
u2cantbnonvt1/2: count of non-violent, criminal behaviours.
u2cantbviolt1/2: count of violent, criminal behaviours.
u2cantbt1/2: overall total count.
Each is a count of the distinct behaviours reported rather than a conventional scale. The derivation is explained in comments in the syntax below.

* Antisocial behaviour.
* TEDS21 Phase 2, self-rated.
* The items do not correlate strongly so a conventional scale is not used.
* Instead, we derive counts of reported antisocial behaviours.
* in the two categories of violent and non-violent behaviours.
* using an approach consistent with that used in published literature.
* Each item is coded 0=no, 1=once, 2=more than once.
* For the scales, simply count positive responses (anything greater than 0).
* to give a measure of the number of antisocial, criminal behaviours reported.
* First make a temporary variable counting the number of responses of any type.
* in all the compulsory, non-branched items (not 8 or 13) of the measure.
COUNT u2cantbcount = u2cantb01 u2cantb02 u2cantb03 u2cantb04 u2cantb05 u2cantb06 
 u2cantb07 u2cantb09 u2cantb10 u2cantb11 u2cantb12 u2cantb14 u2cantb15 u2cantb16
 (0 THRU HIGHEST).
EXECUTE.
* Require over half (more than 7) of these items to be present.
DO IF (u2cantbcount > 7).
* Violent behaviours: 4 items.
 COUNT u2cantbviolt = u2cantb08 u2cantb09 u2cantb12 u2cantb13 (1 THRU HIGHEST).
* Non-violent behaviours: 10 items.
 COUNT u2cantbnonvt = u2cantb02 u2cantb03 u2cantb04 u2cantb05 u2cantb06 
  u2cantb07 u2cantb10 u2cantb14 u2cantb15 u2cantb16 (1 THRU HIGHEST).
END IF.
EXECUTE.
* Total: sum if both counts are non-missing.
COMPUTE u2cantbt = SUM.2(u2cantbviolt, u2cantbnonvt).
EXECUTE.
* Note that 2 items are not used: items 1 and 11.
* (a) because they are not used in the published scales.
* (b) because item 1 (rowdy/rude) is not a criminal behaviour, unlike the others.
* (c) because item 11 (harming animals) seems entirely uncorrelated with the others.
* Note that there are some items having very rare or even negligible responses.
* that are used above for scaling but are dropped as items from the dataset.
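
The counting approach described in the comments above can be sketched in Python as follows. The item groupings are copied from the syntax; the function name `antisocial_counts` and the dictionary input format are assumptions made for this illustration, with `None` (or an absent key) standing for a missing response.

```python
VIOLENT = (8, 9, 12, 13)
NON_VIOLENT = (2, 3, 4, 5, 6, 7, 10, 14, 15, 16)
# Compulsory, non-branched items (all except 8 and 13).
COMPULSORY = (1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 14, 15, 16)


def antisocial_counts(items):
    """items: dict mapping item number (1-16) to 0, 1, 2 or None.

    Returns (violent, non_violent, total) counts of reported behaviours,
    or (None, None, None) if 7 or fewer compulsory items were answered.
    """
    answered = sum(1 for i in COMPULSORY if items.get(i) is not None)
    if answered <= 7:
        return None, None, None
    violent = sum(1 for i in VIOLENT if (items.get(i) or 0) >= 1)
    non_violent = sum(1 for i in NON_VIOLENT if (items.get(i) or 0) >= 1)
    return violent, non_violent, violent + non_violent
```

Because each behaviour counts once regardless of whether the response was 1 (once) or 2 (more than once), the scales measure the variety of behaviours rather than their frequency.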

u2ccexpt1/2

Childhood Experiences total scale, from the twin phase 2 questionnaire, derived from all 8 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 32 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2ccexpt = 8 * MEAN.4(u2ccexp1, u2ccexp2, u2ccexp3,
 u2ccexp4, u2ccexp5, u2ccexp6, u2ccexp7, u2ccexp8).
EXECUTE.

u2ccgent1/2

Cognitive enhancers total scale, from the twin phase 2 questionnaire, derived from 3 of the 4 available items of the measure. Item 3 is omitted as it is branching and it has a different response pattern from the other items.
Each included item (1, 2 and 4) has integer response values 0-4, hence the scale has a range of values from 0 to 12 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2ccgent = 3 * MEAN.2(u2ccgen1, u2ccgen2, u2ccgen4).
EXECUTE.

u2cconnt1/2, u2cconninat1/2, u2cconnhypt1/2

Conners, twin self-report, total scale and two subscales.
u2cconnt1/2 is the total scale from all 20 items.
u2cconninat1/2 is the inattention subscale from 11 items.
u2cconnhypt1/2 is the hyperactivity subscale from 9 items.
Each item has values 0/1/2/3, hence the scale values have ranges 0-60, 0-33 and 0-27 respectively. At least half the items are required to be non-missing for each scale to be computed.

* Conners (twin).
* TEDS21 Phase 2, self-rated, 20 items.
* Total scale and two subscales.
* Note this is a different version from the parent Conners.
* Inattention scale from the first 11 items.
COMPUTE u2cconninat = 11 * MEAN.6(u2cconn01, u2cconn02, u2cconn03, u2cconn04, 
 u2cconn05, u2cconn06, u2cconn07, u2cconn08, u2cconn09, u2cconn10, u2cconn11).
* Hyperactivity scale from the last 9 items.
COMPUTE u2cconnhypt = 9 * MEAN.5(u2cconn12, u2cconn13, u2cconn14, u2cconn15, 
 u2cconn16, u2cconn17, u2cconn18, u2cconn19, u2cconn20).
* Total scale from all 20 items.
COMPUTE u2cconnt = 20 * MEAN.10(u2cconn01, u2cconn02, u2cconn03, u2cconn04, 
 u2cconn05, u2cconn06, u2cconn07, u2cconn08, u2cconn09, u2cconn10, u2cconn11, 
 u2cconn12, u2cconn13, u2cconn14, u2cconn15, u2cconn16, u2cconn17, u2cconn18, 
 u2cconn19, u2cconn20).
EXECUTE.

u2ccrimt1/2

Criminality total scale, from the twin phase 2 questionnaire, derived from items 1 to 4 of the measure.
This is derived as an integer-valued ordinal scale, where a positive response in each item contributes 1 to the total value - hence the scale has values 0 to 4. Note that the measure includes branching, such that items 3 and 4 were only attempted by twins who gave a positive response in item 2.

* Criminality.
* Derive a simple integer 0-4 scale based on items 1-4.
* First sum the first two, compulsory items, requiring both to be present.
* both are coded 1Y 0N so this gives a total 0-2.
COMPUTE u2ccrimt = SUM.2(u2ccrim1, u2ccrim2).
EXECUTE.
* Items 3 and 4 branch from item 2, so are missing if item 2 was 'no'.
* Add an extra 1 to the scale if the response to item 3 was > 1.
* (arrested more than once).
IF (u2ccrimt > 0 & u2ccrim3 > 1) u2ccrimt = u2ccrimt + 1.
EXECUTE.
* Similarly, add 1 to the scale if the response to item 4 was > 0.
* (spent at least one night in a police cell).
IF (u2ccrimt > 0 & u2ccrim4 > 0) u2ccrimt = u2ccrimt + 1.
EXECUTE.
* This derivation allows for missing data in items 3/4 but not in items 1/2.
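
As an illustration of how the branching affects the total, the following Python sketch mirrors the steps in the syntax above. It is not part of the TEDS derivation: the function name is hypothetical, `None` stands in for a missing value, and the item codings are taken from the comments in the syntax.

```python
def criminality_total(item1, item2, item3=None, item4=None):
    """Illustrative 0-4 criminality score following the syntax above."""
    # Items 1 and 2 are compulsory: both must be present (coded 1=yes, 0=no).
    if item1 is None or item2 is None:
        return None
    total = item1 + item2
    # Items 3 and 4 are branched, so may legitimately be missing.
    if total > 0 and item3 is not None and item3 > 1:
        total += 1
    if total > 0 and item4 is not None and item4 > 0:
        total += 1
    return total


print(criminality_total(1, 1, 2, 1))  # all four criteria met -> 4
print(criminality_total(1, 0))        # item 1 only, branched items missing -> 1
```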

u2cdevmob1/2, u2cdevwdth1/2

Device categories used for the TEDS21 twin phase 2 questionnaire, if completed electronically.
u2cdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
u2cdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the widest side of the device's screen.
Derived from raw item variables, collected on the web/app server, that are not retained in the dataset. Mobile devices could be categorised from both app and web data, using different methods. Screen sizes were only recorded in the web data, not the app data.

* Mobile devices: CMS data.
* ------------------------.
* The raw CMS variable PlatformType has values 'Web' or 'Mobile'.
* so we can assume 'Mobile' value refers to mobile devices.
* while 'Web' in most cases probably means a web browser used on a laptop/desktop.
RECODE PlatformType ('Web'=0) ('Mobile'=1)
INTO u2cdevmob.

* Mobile devices: web backup.
* --------------------------.
* Use substrings of the u2cconsenttechuseragent string to categorise broad device types.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
* (in rare cases of Windows phones with Android installed, this supersedes 'Windows' above).
IF (CHAR.INDEX(u2cconsenttechuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows').
IF (CHAR.INDEX(u2cconsenttechuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(u2cconsenttechuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* 'Mobile' is another common substring, but it always indicates mobile phones.
* that are already categorised by the other substrings above.

* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO u2cdevmob.
EXECUTE.
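
The order of the substring checks matters, because a later match overwrites an earlier one. The Python sketch below illustrates that behaviour; it is not part of the TEDS derivation, the function names are hypothetical, and the user-agent strings are made-up examples rather than values from the raw data.

```python
# Substring rules in the same order as the syntax above; later rules win.
RULES = [
    ("Windows", 1), ("Macintosh", 2), ("X11", 3),  # laptops/desktops
    ("iPhone", 4), ("Android", 5),                 # mobile phones
    ("iPad", 6), ("Tablet", 7),                    # tablets
]


def device_type(user_agent):
    """Return the device type code 1-7, or None if no substring matches."""
    code = None
    for substring, value in RULES:
        if substring in user_agent:  # case-sensitive, like CHAR.INDEX
            code = value
    return code


def is_mobile(user_agent):
    """Binary flag: 1 for phones and tablets, 0 for laptops/desktops."""
    code = device_type(user_agent)
    if code is None:
        return None
    return 1 if code >= 4 else 0


print(is_mobile("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))  # 0
print(is_mobile("Mozilla/5.0 (Linux; Android 10; Pixel 3)"))   # 1
```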

* Screen width: web backup only.
* ----------------------------.
* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (u2cconsenttechscrwidth >= u2cconsenttechscrheight) screenwidth = u2cconsenttechscrwidth.
IF (u2cconsenttechscrwidth < u2cconsenttechscrheight) screenwidth = u2cconsenttechscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO u2cdevwdth.
EXECUTE.
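
The width categorisation can be illustrated with a short Python sketch, assuming screen dimensions are whole numbers of pixels. This is not part of the TEDS derivation; the function name is hypothetical and the dimensions in the examples are invented.

```python
def width_category(width, height):
    """Return 1=small, 2=medium, 3=large, using the longest screen side."""
    if width is None or height is None:
        return None
    longest = max(width, height)  # orientation-independent 'width'
    if longest <= 767:
        return 1
    if longest <= 1199:
        return 2
    return 3


print(width_category(360, 640))    # 1 (small)
print(width_category(768, 1024))   # 2 (medium)
print(width_category(1920, 1080))  # 3 (large)
```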

u2cduration1/2

Duration of the phase 2 twin questionnaire, measured in decimal minutes, derived as the difference between the start date-time and the end date-time. The same method is used to derive this variable in the CMS and backup data, but the duration cannot be measured for paper booklet data. The start and end date-time variables are not retained in the dataset.

* Derive a duration variable, as the difference between the start and end date-times.
* This will not necessarily match the total time variable derived above.
* because the 7 questionnaire sections may have been completed at different dates or times.
* with pauses in between; and the overall duration will include consent, address check, etc.
* Only derive for cases where the whole thing was finished.
* Derive as number of seconds divided by 60 to get decimal minutes.
IF (u2cstat = 2) u2cduration = DATEDIFF(u2cend, u2cstart, 'seconds') / 60.
EXECUTE.
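
For illustration, the same calculation can be sketched in Python as follows. This is not part of the TEDS derivation; the function name is hypothetical and the date-times are invented.

```python
from datetime import datetime


def duration_minutes(start, end, status):
    """Elapsed decimal minutes, only for finished questionnaires (status 2)."""
    if status != 2:
        return None
    return (end - start).total_seconds() / 60


start = datetime(2019, 3, 1, 10, 0, 0)
end = datetime(2019, 3, 1, 10, 25, 30)
print(duration_minutes(start, end, status=2))  # 1530 seconds -> 25.5
```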

u2cganxt1/2, ucv1ganxt1/2, ucv2ganxt1/2, ucv3ganxt1/2, ucv4ganxt1/2

General anxiety total scale, from the TEDS21 twin phase 2 questionnaire (u2cganxt) and from the twin covid phase 1 (ucv1ganxt), phase 2 (ucv2ganxt), phase 3 (ucv3ganxt) and phase 4 (ucv4ganxt) questionnaires, in each case derived from all 10 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 40 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2cganxt = 10 * MEAN.5(u2cganx01, u2cganx02, u2cganx03,
 u2cganx04, u2cganx05, u2cganx06, u2cganx07, u2cganx08, u2cganx09, u2cganx10).
COMPUTE ucv1ganxt = 10 * MEAN.5(ucv1ganx01, ucv1ganx02, ucv1ganx03,
 ucv1ganx04, ucv1ganx05, ucv1ganx06, ucv1ganx07, ucv1ganx08, ucv1ganx09, ucv1ganx10).
COMPUTE ucv2ganxt = 10 * MEAN.5(ucv2ganx01, ucv2ganx02, ucv2ganx03,
 ucv2ganx04, ucv2ganx05, ucv2ganx06, ucv2ganx07, ucv2ganx08, ucv2ganx09, ucv2ganx10).
COMPUTE ucv3ganxt = 10 * MEAN.5(ucv3ganx01, ucv3ganx02, ucv3ganx03,
 ucv3ganx04, ucv3ganx05, ucv3ganx06, ucv3ganx07, ucv3ganx08, ucv3ganx09, ucv3ganx10).
COMPUTE ucv4ganxt = 10 * MEAN.5(ucv4ganx01, ucv4ganx02, ucv4ganx03,
 ucv4ganx04, ucv4ganx05, ucv4ganx06, ucv4ganx07, ucv4ganx08, ucv4ganx09, ucv4ganx10).
EXECUTE.

u2chasst1/2

Hassles total scale, from the twin phase 2 questionnaire, derived from all 7 available items of the measure.
Each item has integer response values 0-4, hence the scale has a range of values from 0 to 28 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.

COMPUTE u2chasst = 7 * MEAN.4(u2chass1, u2chass2,
 u2chass3, u2chass4, u2chass5, u2chass6, u2chass7).
EXECUTE.

u2cleism1/2

Leisure and hobbies overall mean scale, derived from all 5 items of the measure in the phase 2 twin questionnaire.
At least half the component items are required to be non-missing. Each item has response values 1-5, hence the scale has the same range as it is computed as a mean.

COMPUTE u2cleism = MEAN.3(u2cleis1, u2cleis2, u2cleis3, u2cleis4, u2cleis5).
EXECUTE.

u2clfevt1/2, u2clfevnnt1/2, u2clfevnat1/2

Life Events scales, from the twin phase 2 questionnaire, derived from all 11 available items of the measure.
u2clfevt1/2 is a conventional scale representing a total score; each item has integer response values 0-4, hence this scale has a range of values from 0 to 44 because it is computed as the mean multiplied by the number of component items. At least half the items are required to be non-missing for the scale to be computed.
u2clfevnnt1/2 and u2clfevnat1/2 are counts of reported life events: the former is a count of events reported with little or no effect (response values 1, 2); the latter is a count of events reported with some effect (response values 3, 4). Each of these two scales may have integer values between 0 and 11, because the measure includes 11 event items. Note that all the events in this measure at this age are treated as negative events.

* Total scale from all 11 items, as a mean multiplied by the number of items.
COMPUTE u2clfevt = 11 * MEAN.6(u2clfev01, u2clfev02, u2clfev03, u2clfev04,
 u2clfev05, u2clfev06, u2clfev07, u2clfev08, u2clfev09, u2clfev10, u2clfev11).
EXECUTE.

* Correlations between items are low because these are mostly independent events.
* So as an alternative to the scale add counts of events, as at age 26.
* First count the number of all responses in the measure, including 'no' responses.
COUNT u2clfevcount = u2clfev01 u2clfev02 u2clfev03 u2clfev04 u2clfev05 u2clfev06 
 u2clfev07 u2clfev08 u2clfev09 u2clfev10 u2clfev11 (0 THRU HIGHEST).
EXECUTE.
* Counts of events may be invalid if too many are missing.
* so require at least 9 of the 11 items to be answered (this excludes only 2 twins).
DO IF (u2clfevcount >= 9).
 * Count events that occurred but had little or no effect on the twin.
 COUNT u2clfevnnt = u2clfev01 u2clfev02 u2clfev03 u2clfev04 u2clfev05 u2clfev06 
  u2clfev07 u2clfev08 u2clfev09 u2clfev10 u2clfev11 (1, 2).
 * Count events that affected the twin (moderately or a lot).
 COUNT u2clfevnat = u2clfev01 u2clfev02 u2clfev03 u2clfev04 u2clfev05 u2clfev06 
  u2clfev07 u2clfev08 u2clfev09 u2clfev10 u2clfev11 (3, 4).
END IF.
EXECUTE.
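
The two event counts can be illustrated with a short Python sketch. It is not part of the TEDS derivation: the function name is hypothetical, `None` stands in for a missing response, and the example responses are invented.

```python
def life_event_counts(items, min_answered=9):
    """Return (little/no effect count, some effect count), or (None, None)
    if fewer than `min_answered` of the items were answered."""
    answered = sum(1 for x in items if x is not None)
    if answered < min_answered:
        return None, None
    low = sum(1 for x in items if x in (1, 2))   # occurred, little or no effect
    high = sum(1 for x in items if x in (3, 4))  # occurred, affected the twin
    return low, high


responses = [0, 1, 2, 3, 4, 0, 0, None, 1, 3, 0]  # 10 of 11 answered
print(life_event_counts(responses))  # (3, 3)
```

A response of 0 (event did not occur) counts towards the number of items answered but contributes to neither event count.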

u2cLLCage1/2, u2cLLCdate1/2

See u1cLLCage1/2, etc above.

u2clrssparm1/2, u2clrssoccm1/2, u2clrsshomm1/2

Life Role Salience (LRSS) subscales, derived from items of the measure in the phase 2 twin questionnaire.
The subscales are for parental roles (u2clrssparm), occupational roles (u2clrssoccm) and home care roles (u2clrsshomm).
Each subscale is derived from 4 items (with item 6 reversed). For each scale, at least half the component items are required to be non-missing. Each item has response values 1-5, hence each of these scales has the same range as it is computed as a mean.

* Parental role: items 1,4,7,10.
COMPUTE u2clrssparm = MEAN.2(u2clrss01, u2clrss04, u2clrss07, u2clrss10).
* Occupational role: items 2,5,8,11.
COMPUTE u2clrssoccm = MEAN.2(u2clrss02, u2clrss05, u2clrss08, u2clrss11).
* Home care role: items 3,6,9,12 (with item 6 reversed).
COMPUTE u2clrsshomm = MEAN.2(u2clrss03, (6 - u2clrss06), u2clrss09, u2clrss12).
EXECUTE.
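
The reversal of item 6 maps a response of 1 to 5, 2 to 4, and so on, so that all four items point in the same direction before averaging. A minimal Python sketch of the home care subscale follows; it is not part of the TEDS derivation, the function names are hypothetical, and `None` stands in for a missing value.

```python
def reverse_1_to_5(x):
    """Reverse a 1-5 response (1<->5, 2<->4); missing stays missing."""
    return None if x is None else 6 - x


def mean_min_valid(items, min_valid):
    """Mean of answered items, or None if fewer than `min_valid` answered."""
    answered = [x for x in items if x is not None]
    if len(answered) < min_valid:
        return None
    return sum(answered) / len(answered)


# Items 3, 6, 9 and 12, with item 12 unanswered and item 6 reversed.
item03, item06, item09, item12 = 4, 2, 4, None
print(mean_min_valid([item03, reverse_1_to_5(item06), item09, item12], 2))  # 4.0
```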

u2cmfqt1/2

See u1cmfqt1/2, u2cmfqt1/2 above.

u2cperpphyt1/2, u2cperpsoct1/2, u2cperpvert1/2, u2cperpcybt1/2, u2cperpt1/2, u2cvictphyt1/2, u2cvictsoct1/2, u2cvictvert1/2, u2cvictcybt1/2, u2cvictt1/2, ucv1victphyt1/2, ucv1victvert1/2, ucv1victcybt1/2, ucv1victt1/2, ucv2victphyt1/2, ucv2victvert1/2, ucv2victcybt1/2, ucv2victt1/2, ucv3victphyt1/2, ucv3victvert1/2, ucv3victcybt1/2, ucv3victt1/2, ucv4victphyt1/2, ucv4victvert1/2, ucv4victcybt1/2, ucv4victt1/2

Victimisation and Perpetration subscales, derived from items of the two closely-related measures in the TEDS21 phase 2 twin questionnaire, and the shortened victimisation scale in the covid questionnaires.
The TEDS21 victimisation and perpetration measures have essentially the same 16 items, phrased first for victimisation of the twin by peers (u2cvict) and then for perpetration by the twin towards others (u2cperp). The same set of scales and subscales is therefore derived for the two measures. The covid questionnaires (ucv1vict, ucv2vict, ucv3vict, ucv4vict) had a subset of 12 of the 16 victimisation items, omitting the four social items and omitting the perpetration items.
The subscales are for physical (phyt), social (soct), verbal (vert) and cyber-media (cybt) victimisation, each derived from 4 of the items. There is also an overall total scale (u2cperpt, u2cvictt) derived from all 16 items, or in the case of the covid questionnaires (ucv1victt, ucv2victt, ucv3victt, ucv4victt) derived from all 12 available items.
For each scale, at least half the component items are required to be non-missing. Each item has response values 0-2, hence each scale has a range of values from 0 to (2 * number of items) because it is computed as the mean multiplied by the number of component items.

* Total scale and four subscales.
* Physical victimisation subscale: items 1/5/9/13 in TEDS21, items 1/4/7/10 in covid.
COMPUTE u2cvictphyt = 4 * MEAN.2(u2cvict01, u2cvict05, u2cvict09, u2cvict13).
COMPUTE u2cperpphyt = 4 * MEAN.2(u2cperp01, u2cperp05, u2cperp09, u2cperp13).
COMPUTE ucv1victphyt = 4 * MEAN.2(ucv1vict01, ucv1vict04, ucv1vict07, ucv1vict10).
COMPUTE ucv2victphyt = 4 * MEAN.2(ucv2vict01, ucv2vict04, ucv2vict07, ucv2vict10).
COMPUTE ucv3victphyt = 4 * MEAN.2(ucv3vict01, ucv3vict04, ucv3vict07, ucv3vict10).
COMPUTE ucv4victphyt = 4 * MEAN.2(ucv4vict01, ucv4vict04, ucv4vict07, ucv4vict10).
EXECUTE.
* Social victimisation subscale: items 2/6/10/14 in TEDS21.
COMPUTE u2cvictsoct = 4 * MEAN.2(u2cvict02, u2cvict06, u2cvict10, u2cvict14).
COMPUTE u2cperpsoct = 4 * MEAN.2(u2cperp02, u2cperp06, u2cperp10, u2cperp14).
EXECUTE.
* Verbal victimisation subscale: items 3/7/11/15 in TEDS21, items 2/5/8/11 in covid.
COMPUTE u2cvictvert = 4 * MEAN.2(u2cvict03, u2cvict07, u2cvict11, u2cvict15).
COMPUTE u2cperpvert = 4 * MEAN.2(u2cperp03, u2cperp07, u2cperp11, u2cperp15).
COMPUTE ucv1victvert = 4 * MEAN.2(ucv1vict02, ucv1vict05, ucv1vict08, ucv1vict11).
COMPUTE ucv2victvert = 4 * MEAN.2(ucv2vict02, ucv2vict05, ucv2vict08, ucv2vict11).
COMPUTE ucv3victvert = 4 * MEAN.2(ucv3vict02, ucv3vict05, ucv3vict08, ucv3vict11).
COMPUTE ucv4victvert = 4 * MEAN.2(ucv4vict02, ucv4vict05, ucv4vict08, ucv4vict11).
EXECUTE.
* Cyber-victimisation subscale: items 4/8/12/16 in TEDS21, items 3/6/9/12 in covid.
COMPUTE u2cvictcybt = 4 * MEAN.2(u2cvict04, u2cvict08, u2cvict12, u2cvict16).
COMPUTE u2cperpcybt = 4 * MEAN.2(u2cperp04, u2cperp08, u2cperp12, u2cperp16).
COMPUTE ucv1victcybt = 4 * MEAN.2(ucv1vict03, ucv1vict06, ucv1vict09, ucv1vict12).
COMPUTE ucv2victcybt = 4 * MEAN.2(ucv2vict03, ucv2vict06, ucv2vict09, ucv2vict12).
COMPUTE ucv3victcybt = 4 * MEAN.2(ucv3vict03, ucv3vict06, ucv3vict09, ucv3vict12).
COMPUTE ucv4victcybt = 4 * MEAN.2(ucv4vict03, ucv4vict06, ucv4vict09, ucv4vict12).
EXECUTE.
* Overall total from all 16 items (TEDS21).
* or 12 items (Covid).
COMPUTE u2cvictt = 16 * MEAN.8(u2cvict01, u2cvict02, u2cvict03, u2cvict04,
 u2cvict05, u2cvict06, u2cvict07, u2cvict08, u2cvict09, u2cvict10,
 u2cvict11, u2cvict12, u2cvict13, u2cvict14, u2cvict15, u2cvict16).
COMPUTE u2cperpt = 16 * MEAN.8(u2cperp01, u2cperp02, u2cperp03, u2cperp04,
 u2cperp05, u2cperp06, u2cperp07, u2cperp08, u2cperp09, u2cperp10,
 u2cperp11, u2cperp12, u2cperp13, u2cperp14, u2cperp15, u2cperp16).
COMPUTE ucv1victt = 12 * MEAN.6(ucv1vict01, ucv1vict02, ucv1vict03, ucv1vict04,
 ucv1vict05, ucv1vict06, ucv1vict07, ucv1vict08, ucv1vict09, ucv1vict10,
 ucv1vict11, ucv1vict12).
COMPUTE ucv2victt = 12 * MEAN.6(ucv2vict01, ucv2vict02, ucv2vict03, ucv2vict04,
 ucv2vict05, ucv2vict06, ucv2vict07, ucv2vict08, ucv2vict09, ucv2vict10,
 ucv2vict11, ucv2vict12).
COMPUTE ucv3victt = 12 * MEAN.6(ucv3vict01, ucv3vict02, ucv3vict03, ucv3vict04,
 ucv3vict05, ucv3vict06, ucv3vict07, ucv3vict08, ucv3vict09, ucv3vict10,
 ucv3vict11, ucv3vict12).
COMPUTE ucv4victt = 12 * MEAN.6(ucv4vict01, ucv4vict02, ucv4vict03, ucv4vict04,
 ucv4vict05, ucv4vict06, ucv4vict07, ucv4vict08, ucv4vict09, ucv4vict10,
 ucv4vict11, ucv4vict12).
EXECUTE.
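
The item numbering in the subscales follows a regular pattern: the items cycle through the subscales in turn, so each subscale takes every 4th item in TEDS21 (four subscales) and every 3rd item in the covid questionnaires (three subscales). The Python sketch below illustrates the pattern only; the function name is hypothetical and it is not part of the TEDS derivation.

```python
def subscale_items(prefix, first_item, n_subscales, n_per_subscale=4):
    """List the item variable names for one subscale.

    `first_item` is the subscale's first item number and `n_subscales`
    is the step between successive items of the same subscale.
    """
    return [
        f"{prefix}{first_item + n_subscales * j:02d}"
        for j in range(n_per_subscale)
    ]


print(subscale_items("u2cvict", 1, 4))   # physical subscale, TEDS21
print(subscale_items("ucv1vict", 2, 3))  # verbal subscale, covid phase 1
```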

u2cq1quan1/2, u2cq2quan1/2, u2cq3quan1/2, u2cq4quan1/2, u2cq5quan1/2, u2cq6quan1/2, u2cq7quan1/2, u2cquan1/2

Variables showing the number of compulsory questions answered in each of the 7 sections (u2cqXquan) and overall (u2cquan) in the phase 2 twin questionnaire. The same method was used in each of the 3 main questionnaire versions (CMS, backup, paper), using the same subset of item variables in each case. The counts ignore optional questions and those questions that might be skipped because of branching rules.

* Count the number of items completed, not including items that might be skipped due to branching.
COUNT u2cq1quan = 
 u2clrss01 u2clrss02 u2clrss03 u2clrss04 u2clrss05 u2clrss06 u2clrss07
 u2clrss08 u2clrss09 u2clrss10 u2clrssqc u2clrss11 u2clrss12
 u2ccexp1 u2ccexp2 u2ccexpqc u2ccexp3 u2ccexp4 u2ccexp5 u2ccexp6 u2ccexp7 u2ccexp8
 (0 THRU HIGHEST).
COUNT u2cq2quan = 
 u2cambi1 u2cambi2 u2cambi3 u2cambi4 u2cambi5
 u2chass1 u2chass2 u2chass3 u2chass4 u2chass5 u2chass6 u2chass7
 u2clfev01 u2clfev02 u2clfev03 u2clfev04 u2clfev05 u2clfev06
 u2clfev07 u2clfev08 u2clfevqc u2clfev09 u2clfev10 u2clfev11
 (0 THRU HIGHEST).
COUNT u2cq3quan =
 u2cconn01 u2cconn02 u2cconn03 u2cconn04 u2cconn05 u2cconn06 u2cconn07
 u2cconn08 u2cconn09 u2cconn10 u2cconn11 u2cconnqc u2cconn12 u2cconn13
 u2cconn14 u2cconn15 u2cconn16 u2cconn17 u2cconn18 u2cconn19 u2cconn20
 u2cspeq01 u2cspeq02 u2cspeq03 u2cspeq04 u2cspeq05 u2cspeq06 u2cspeq07 u2cspeq08 u2cspeq09
 u2cspeq10 u2cspeq11 u2cspeq12 u2cspeq13 u2cspeq14 u2cspeq15 u2cspeq16 u2cspeq17
 u2cspeq18 u2cspeq19 u2cspeq20 u2cspeq21 u2cspeq22 u2cspeq23 u2cspeqqc u2cspeq24
 (0 THRU HIGHEST).
COUNT u2cq4quan = 
 u2calco01 u2csmok01 u2csmok10 u2ccann01
 u2ccgen1 u2ccgenqc u2ccgen2 u2ccgen4 u2cdrug1 
 (0 THRU HIGHEST).
COUNT u2cq5quan = 
 u2cantb01 u2cantb02 u2cantb03 u2cantb04 u2cantb05 u2cantb06 u2cantb07 u2cantb09 
 u2cantb10 u2cantb11 u2cantbqc u2cantb12 u2cantb14 u2cantb15 u2cantb16
 u2ccrim1 u2ccrim2 
 u2cvict01 u2cvict02 u2cvict03 u2cvict04 u2cvict05 u2cvict06 u2cvict07 u2cvict08
 u2cvict09 u2cvict10 u2cvict11 u2cvict12 u2cvict13 u2cvict14 u2cvict15 u2cvict16 
 u2cperp01 u2cperp02 u2cperp03 u2cperp04 u2cperp05 u2cperp06 u2cperp07 u2cperp08 u2cperp09
 u2cperp10 u2cperp11 u2cperp12 u2cperp13 u2cperpqc u2cperp14 u2cperp15 u2cperp16
 (0 THRU HIGHEST).
COUNT u2cq6quan = 
 u2cleis1 u2cleis2 u2cleis3 u2cleis4 u2cleis5
 u2cganx01 u2cganx02 u2cganx03 u2cganx04 u2cganx05 
 u2cganx06 u2cganx07 u2cganx08 u2cganx09 u2cganx10
 u2cmfq1 u2cmfq2 u2cmfq3 u2cmfq4 u2cmfq5 u2cmfq6 u2cmfq7 u2cmfqqc u2cmfq8
 u2cslfh01 u2cslfh02
 (0 THRU HIGHEST).
COUNT u2cq7quan = 
 u2cexcl01 u2cexcl04 u2cexcl12 
 u2cshec1 u2cshec2 u2cshec3 u2cshec4 u2cshec5
 u2clhec01 u2clhec02 u2clhec04 u2clhec06 u2clhec09 u2clhec10 u2clhec11
 u2clhec12 u2clhec13 u2clhec14 u2clhec15 u2clhec16 u2clhec17 u2clhec18
 u2cslpq1 u2cslpq2 u2cslpq3 u2cslpq4 u2cslpqqc u2cslpq5 u2cslpq6 u2cslpq7 u2cslpq8
 (0 THRU HIGHEST).
EXECUTE.
* Overall total count of (non-branched) questions answered.
NUMERIC u2cquan (F3.0).
COMPUTE u2cquan = SUM(u2cq1quan, u2cq2quan, u2cq3quan, u2cq4quan, u2cq5quan, u2cq6quan, u2cq7quan).
EXECUTE.

u2cq1stat1/2, u2cq2stat1/2, u2cq3stat1/2, u2cq4stat1/2, u2cq5stat1/2, u2cq6stat1/2, u2cq7stat1/2, u2cstat1/2

Status variables for each of the 7 sections (u2cqXstat) and overall (u2cstat) in the phase 2 TEDS21 twin questionnaire. Each is initially coded 0=not started, 1=started but not finished, 2=finished. During data cleaning, each may be recoded from 2=finished to 4=excluded (random responder), although this recoding is not shown in the syntax below.
Different methods are used to calculate the status variables in the three versions (CMS, backup, paper). The CMS and backup versions had different types of raw status variables that could be used. The CMS raw data contains variable QnrCompletion, with values 0 to 100, indicating the status of the questionnaire as a whole. The backup raw data contains a status variable for each section, coded 0=not completed, 1=completed (data were only submitted at the end of each section, hence partially-completed sections are not present in the raw backup data). For the paper version, the status variables are based purely on the counts of the questions answered in each section (these are derived variables described elsewhere on this page). Note that the sections and their component questions were presented in strict sequence in the electronic versions, so if a section was left unfinished then subsequent sections would necessarily be unstarted. In the paper version, participants could of course leave questions or whole sections unanswered in a wholly arbitrary way; but an attempt has been made to compute the status variables in a roughly equivalent way.

* CMS raw data.
* ------------.
COMPUTE u2cq1stat = 0.
COMPUTE u2cq2stat = 0.
COMPUTE u2cq3stat = 0.
COMPUTE u2cq4stat = 0.
COMPUTE u2cq5stat = 0.
COMPUTE u2cq6stat = 0.
COMPUTE u2cq7stat = 0.
COMPUTE u2cstat = 0.
EXECUTE.
* If QnrCompletion=100, all should have been 'finished'.
DO IF (QnrCompletion = 100).
 RECODE u2cq1stat u2cq2stat u2cq3stat u2cq4stat u2cq5stat u2cq6stat u2cq7stat u2cstat (0=2).
END IF.
EXECUTE.
* For some consistency with the status variables in the paper version.
* if a theme is not obviously finished then count as finished if >=80% of compulsory.
* questions were answered; but obviously finished if next theme has been started.
* (note situation is much simpler in backup version where all or none of the questions.
* were submitted separately for each theme).
* 1st section: assume complete if at least 18 of 22 items answered OR if some of 2nd section done.
* and assume partially complete if 1-17 answered AND nothing in 2nd section.
* (note the childhood experiences items were optional).
IF (u2cq1stat = 0 & (u2cq1quan >= 18 | u2cq2quan > 0)) u2cq1stat = 2.
IF (u2cq1stat = 0 & (RANGE(u2cq1quan,1,17) & u2cq2quan = 0)) u2cq1stat = 1.
EXECUTE.
* 2nd section (24 items, all compulsory, require at least 20 for completion).
IF (u2cq2stat = 0 & (u2cq2quan >= 20 | u2cq3quan > 0)) u2cq2stat = 2.
IF (u2cq2stat = 0 & (RANGE(u2cq2quan,1,19) & u2cq3quan = 0)) u2cq2stat = 1.
EXECUTE.
* 3rd section (46 items, all compulsory, require at least 37 for completion).
IF (u2cq3stat = 0 & (u2cq3quan >=37 | u2cq4quan > 0)) u2cq3stat = 2.
IF (u2cq3stat = 0 & (RANGE(u2cq3quan,1,36) & u2cq4quan = 0)) u2cq3stat = 1.
EXECUTE.
* 4th section (only 9 compulsory items, some of which themselves were optional: require at least 7 for completion).
IF (u2cq4stat = 0 & (u2cq4quan >= 7 | u2cq5quan > 0)) u2cq4stat = 2.
IF (u2cq4stat = 0 & (RANGE(u2cq4quan,1,6) & u2cq5quan = 0)) u2cq4stat = 1.
EXECUTE.
* 5th section (50 items, many of which are optional, require at least 40 for completion).
IF (u2cq5stat = 0 & (u2cq5quan >= 40 | u2cq6quan > 0)) u2cq5stat = 2.
IF (u2cq5stat = 0 & (RANGE(u2cq5quan,1,39) & u2cq6quan = 0)) u2cq5stat = 1.
EXECUTE.
* 6th section (26 items, all compulsory, require at least 21 for completion).
IF (u2cq6stat = 0 & (u2cq6quan >= 21 | u2cq7quan > 0)) u2cq6stat = 2.
IF (u2cq6stat = 0 & (RANGE(u2cq6quan,1,20) & u2cq7quan = 0)) u2cq6stat = 1.
EXECUTE.
* 7th section is last and completed if QnrCompletion=100 as above.
* or if at least 25 of the 31 items were completed.
* hence treat as unfinished if 1-24 items done and QnrCompletion < 100.
IF (u2cq7stat = 0 & RANGE(u2cq7quan,1,24) & QnrCompletion < 100) u2cq7stat = 1.
IF (u2cq7stat = 0 & u2cq7quan >= 25 & QnrCompletion < 100) u2cq7stat = 2.
EXECUTE.
* Overall status is unfinished if 1st section started but final theme is unfinished.
IF (u2cstat = 0 & u2cq1stat > 0 & u2cq7stat < 2) u2cstat = 1.
EXECUTE.
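
The CMS rules above can be summarised in a Python sketch, given purely as an illustration and not as part of the TEDS derivation. The function name is hypothetical; the thresholds are copied from the syntax; counts are assumed to be non-negative integers and QnrCompletion a number from 0 to 100.

```python
# Minimum answered counts for 'finished', sections 1-7, as in the syntax above.
THRESHOLDS = [18, 20, 37, 7, 40, 21, 25]


def cms_status(counts, qnr_completion):
    """Return (list of 7 section statuses, overall status): 0/1/2 coding."""
    if qnr_completion == 100:
        return [2] * 7, 2
    status = [0] * 7
    # Sections 1-6: finished if enough items answered OR next section started.
    for i in range(6):
        if counts[i] >= THRESHOLDS[i] or counts[i + 1] > 0:
            status[i] = 2
        elif counts[i] >= 1:
            status[i] = 1
    # Section 7 has no following section, so only its own count is used.
    if counts[6] >= THRESHOLDS[6]:
        status[6] = 2
    elif counts[6] >= 1:
        status[6] = 1
    # As in the syntax shown, overall status 2 is only set when
    # QnrCompletion is 100; otherwise it is 1 (unfinished) or stays 0.
    overall = 1 if status[0] > 0 and status[6] < 2 else 0
    return status, overall


print(cms_status([22, 24, 10, 0, 0, 0, 0], qnr_completion=40))
```

In this example the twin stopped part-way through section 3, so sections 1 and 2 are finished, section 3 is started, and the overall status is unfinished.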

* Backup raw data.
* ---------------.
* Derive a status variable for each section, coded 0=not started 1=started 2=finished.
* In this backup version, each section is either not started or finished.
* because data are only submitted at the end of the section.
* Can therefore conveniently recode the existing status flags (0/1) from the admin file.
* regardless of the number of items completed.
RECODE relationships__status lifeExperiences__status behaviour__status
 substanceUse__status conflict__status wellbeing__status health__status 
 (0=0) (1=2)
INTO u2cq1stat u2cq2stat u2cq3stat u2cq4stat u2cq5stat u2cq6stat u2cq7stat.
EXECUTE.
* Overall status may be unfinished if fewer than 7 sections completed.
IF (u2cq1stat = 0) u2cstat = 0.
IF (u2cq1stat > 0 & u2cq7stat < 2) u2cstat = 1.
IF (u2cq7stat = 2) u2cstat = 2.
EXECUTE.

* Paper raw data.
* --------------.
* Derive a status variable for each section, coded 0=not started 1=started 2=finished.
* In the CMS/web, the purpose is to identify sections started but not finished.
* In this paper version, unlike electronic versions, no questions can be made compulsory.
* as a condition of continuing; any questions can easily be skipped mid-questionnaire.
* hence the status variable has a slightly different meaning in the paper version.
* Therefore treat each section as finished if most (at least 80%) compulsory non-branched questions were answered.
RECODE u2cq1quan (0=0) (1 THRU 17=1) (18 THRU 22=2) INTO u2cq1stat.
RECODE u2cq2quan (0=0) (1 THRU 19=1) (20 THRU 24=2) INTO u2cq2stat.
RECODE u2cq3quan (0=0) (1 THRU 36=1) (37 THRU 46=2) INTO u2cq3stat.
RECODE u2cq4quan (0=0) (1 THRU 6=1) (7 THRU 9=2) INTO u2cq4stat.
RECODE u2cq5quan (0=0) (1 THRU 39=1) (40 THRU 50=2) INTO u2cq5stat.
RECODE u2cq6quan (0=0) (1 THRU 20=1) (21 THRU 26=2) INTO u2cq6stat.
RECODE u2cq7quan (0=0) (1 THRU 24=1) (25 THRU 31=2) INTO u2cq7stat.
EXECUTE.
* Now the overall status is finished if all 7 sections are finished by the above definition.
IF (SUM(u2cq1stat, u2cq2stat, u2cq3stat, u2cq4stat, u2cq5stat, u2cq6stat, u2cq7stat) = 0)
    u2cstat = 0.
IF (SUM(u2cq1stat, u2cq2stat, u2cq3stat, u2cq4stat, u2cq5stat, u2cq6stat, u2cq7stat) = 14)
    u2cstat = 2.
IF (RANGE(SUM(u2cq1stat, u2cq2stat, u2cq3stat, u2cq4stat, u2cq5stat, u2cq6stat,
    u2cq7stat), 1, 13)) u2cstat = 1.
EXECUTE.
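
For the paper version, the recoding can be illustrated with the Python sketch below. It is not part of the TEDS derivation; the names are hypothetical and the thresholds are copied from the RECODE commands above.

```python
# Minimum answered counts for 'finished', by section number.
FINISHED_AT = {1: 18, 2: 20, 3: 37, 4: 7, 5: 40, 6: 21, 7: 25}


def section_status(count, finished_at):
    """0 = not started, 1 = started but not finished, 2 = finished."""
    if count == 0:
        return 0
    return 2 if count >= finished_at else 1


def overall_status(statuses):
    """Finished only if every section is finished; not started only if none is."""
    total = sum(statuses)
    if total == 0:
        return 0
    if total == 2 * len(statuses):
        return 2
    return 1


counts = {1: 22, 2: 24, 3: 46, 4: 6, 5: 50, 6: 26, 7: 31}
statuses = [section_status(counts[s], FINISHED_AT[s]) for s in sorted(counts)]
print(statuses, overall_status(statuses))  # section 4 is one item short
```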

u2cq1time1/2, u2cq2time1/2, u2cq3time1/2, u2cq4time1/2, u2cq5time1/2, u2cq6time1/2, u2cq7time1/2, u2ctime1/2

Time taken to complete each section (u2cqXtime) and the whole (u2ctime) of the phase 2 TEDS21 twin questionnaire. The times are measured in decimal minutes.
These variables are derived in different ways in the CMS raw data (as a sum of item times) and the backup raw data (as a difference between start and end date-times). The variables cannot be derived for paper booklet data. Note that the item times used in the CMS derivation, and the section start/end times used in the backup derivation, are not retained in the dataset because they are not present in other versions.

* Backup raw data.
* ---------------.
* In the backup data, there are no item times but there is a date-time.
* for the start and end of each section - use these to derive the duration in minutes.
* Should be roughly compatible with the times in the CMS version, which are.
* derived as sums of item times.
NUMERIC u2cq1time u2cq2time u2cq3time u2cq4time u2cq5time u2cq6time u2cq7time u2ctime (F4.1).
* Only derive if the relevant section was completed.
IF (u2cq1stat = 2) u2cq1time = DATEDIFF(relationships__subTime, relationships__genTime, 'seconds') / 60.
IF (u2cq2stat = 2) u2cq2time = DATEDIFF(lifeExperiences__subTime, lifeExperiences__genTime, 'seconds') / 60.
IF (u2cq3stat = 2) u2cq3time = DATEDIFF(behaviour__subTime, behaviour__genTime, 'seconds') / 60.
IF (u2cq4stat = 2) u2cq4time = DATEDIFF(substanceUse__subTime, substanceUse__genTime, 'seconds') / 60.
IF (u2cq5stat = 2) u2cq5time = DATEDIFF(conflict__subTime, conflict__genTime, 'seconds') / 60.
IF (u2cq6stat = 2) u2cq6time = DATEDIFF(wellbeing__subTime, wellbeing__genTime, 'seconds') / 60.
IF (u2cq7stat = 2) u2cq7time = DATEDIFF(health__subTime, health__genTime, 'seconds') / 60.
EXECUTE.
* Sum for total time if all finished.
IF (u2cstat = 2) u2ctime = SUM(u2cq1time, u2cq2time, u2cq3time,
 u2cq4time, u2cq5time, u2cq6time, u2cq7time).
EXECUTE.
* CMS raw data.
* ------------.
* Now sum the item times for each questionnaire section/theme (including optional item times).
* and divide by 60000 to get a time in minutes.
NUMERIC u2cq1time u2cq2time u2cq3time u2cq4time u2cq5time u2cq6time u2cq7time u2ctime (F4.1).
* Only derive times for completed sections.
IF (u2cq1stat = 2) u2cq1time = SUM(u2cconmbranchrt,
 u2cconm1rt, u2cconm2rt, u2cconm3rt, u2cconm4rt, u2cconf1rt, u2cconf2rt, u2cconf3rt, u2cconf4rt,
 u2clrss01rt, u2clrss02rt, u2clrss03rt, u2clrss04rt, u2clrss05rt, u2clrss06rt, u2clrss07rt,
 u2clrss08rt, u2clrss09rt, u2clrss10rt, u2clrssqcrt, u2clrss11rt, u2clrss12rt,
 u2ccexp1rt, u2ccexp2rt, u2ccexpqcrt, u2ccexp3rt, u2ccexp4rt, u2ccexp5rt, u2ccexp6rt, u2ccexp7rt, u2ccexp8rt) / 60000.
IF (u2cq2stat = 2) u2cq2time = SUM(u2cambi1rt, u2cambi2rt, u2cambi3rt, u2cambi4rt, u2cambi5rt,
 u2chass1rt, u2chass2rt, u2chass3rt, u2chass4rt, u2chass5rt, u2chass6rt, u2chass7rt,
 u2clfev01rt, u2clfev02rt, u2clfev03rt, u2clfev04rt, u2clfev05rt, u2clfev06rt,
 u2clfev07rt, u2clfev08rt, u2clfevqcrt, u2clfev09rt, u2clfev10rt, u2clfev11rt) / 60000.
IF (u2cq3stat = 2) u2cq3time = SUM(u2cconn01rt, u2cconn02rt, u2cconn03rt, u2cconn04rt, u2cconn05rt, u2cconn06rt, u2cconn07rt,
 u2cconn08rt, u2cconn09rt, u2cconn10rt, u2cconn11rt, u2cconnqcrt, u2cconn12rt, u2cconn13rt,
 u2cconn14rt, u2cconn15rt, u2cconn16rt, u2cconn17rt, u2cconn18rt, u2cconn19rt, u2cconn20rt,
 u2cspeq01rt, u2cspeq02rt, u2cspeq03rt, u2cspeq04rt, u2cspeq05rt, u2cspeq06rt, u2cspeq07rt, u2cspeq08rt,
 u2cspeq09rt, u2cspeq10rt, u2cspeq11rt, u2cspeq12rt, u2cspeq13rt, u2cspeq14rt, u2cspeq15rt, u2cspeq16rt,
 u2cspeq17rt, u2cspeq18rt, u2cspeq19rt, u2cspeq20rt, u2cspeq21rt, u2cspeq22rt, u2cspeq23rt, u2cspeqqcrt, u2cspeq24rt) / 60000.
IF (u2cq4stat = 2) u2cq4time = SUM(u2calco01rt, u2calco02rt, u2calco03rta, u2calco03rtb, u2calco03rtc, u2calco03rtd, u2calco04rt,
 u2calco05rta, u2calco05rtb, u2calco05rtc, u2calco05rtd, u2calco06rt, u2calco07rt, u2calco08rt,
 u2calcoqcrt, u2calco09rt, u2calco10rt, u2calco11rt, u2calco12rt, u2calco13rt,
 u2csmok01rt, u2csmok02rt, u2csmok03rt, u2csmok04rt, u2csmok05rt, u2csmok06rt,
 u2csmok07rt, u2csmok08rt, u2csmok09rt, u2csmok10rt, u2csmok11rt, u2csmok12rt,
 u2ccann01rt, u2ccann02rt, u2ccann03rt, u2ccann04rt, u2ccann05rt,
 u2ccann06rt, u2ccann07rt, u2ccann08rt, u2ccann09rt, u2ccann10rt,
 u2ccgen1rt, u2ccgenqcrt, u2ccgen2rt, u2ccgen3rt, u2ccgen4rt,
 u2cdrug1rt, u2cdrug2rt, u2cdrug3rt, u2cdrug4rt, u2cdrug5rt, u2cdrug6rt, u2cdrug7rt, u2cdrug8rt) / 60000.
IF (u2cq5stat = 2) u2cq5time = SUM(u2cantb01rt, u2cantb02rt, u2cantb03rt, u2cantb04rt, u2cantb05rt, u2cantb06rt, u2cantb07rt, u2cantb08rt,
 u2cantb09rt, u2cantb10rt, u2cantb11rt, u2cantbqcrt, u2cantb12rt, u2cantb13rt, u2cantb14rt, u2cantb15rt, u2cantb16rt,
 u2ccrim1rt, u2ccrim2rt, u2ccrim3rt, u2ccrim4rt, u2ccrim5rt, u2ccrim6rt,
 u2cvict01rt, u2cvict02rt, u2cvict03rt, u2cvict04rt, u2cvict05rt, u2cvict06rt, u2cvict07rt, u2cvict08rt,
 u2cvict09rt, u2cvict10rt, u2cvict11rt, u2cvict12rt, u2cvict13rt, u2cvict14rt, u2cvict15rt, u2cvict16rt,
 u2cperp01rt, u2cperp02rt, u2cperp03rt, u2cperp04rt, u2cperp05rt, u2cperp06rt, u2cperp07rt, u2cperp08rt,
 u2cperp09rt, u2cperp10rt, u2cperp11rt, u2cperp12rt, u2cperp13rt, u2cperpqcrt, u2cperp14rt, u2cperp15rt, u2cperp16rt) / 60000.
IF (u2cq6stat = 2) u2cq6time = SUM(u2cleis1rt, u2cleis2rt, u2cleis3rt, u2cleis4rt, u2cleis5rt,
 u2cganx01rt, u2cganx02rt, u2cganx03rt, u2cganx04rt, u2cganx05rt,
 u2cganx06rt, u2cganx07rt, u2cganx08rt, u2cganx09rt, u2cganx10rt,
 u2cmfq1rt, u2cmfq2rt, u2cmfq3rt, u2cmfq4rt, u2cmfq5rt, u2cmfq6rt, u2cmfq7rt, u2cmfqqcrt, u2cmfq8rt,
 u2cslfh01rt, u2cslfh02rt, u2cslfh03rt, u2cslfh04rt, u2cslfh05rt, u2cslfh06rt, u2cslfh07rt,
 u2cslfh08rt, u2cslfh09rt, u2cslfh10rt, u2cslfh11rt, u2cslfh12rt, u2cslfh13rt, u2cslfh14rt) / 60000.
IF (u2cq7stat = 2) u2cq7time = SUM(u2cexcl01rt, u2cexcl02rt, u2cexcl03rt, u2cexcl04rt, u2cexcl05rt, u2cexcl06rt, u2cexcl07rt,
 u2cexcl08rt, u2cexcl09rt, u2cexcl10rt, u2cexcl11rt, u2cexcl12rt, u2cexcl13rt, u2cexcl14rt,
 u2cshec1rt, u2cshec2rt, u2cshec3rt, u2cshec4rt, u2cshec5rt,
 u2clhec01rt, u2clhec02rt, u2clhec03rta, u2clhec03rtb, u2clhec03rtc, u2clhec03rtd,
 u2clhec04rt, u2clhec05rta, u2clhec05rtb, u2clhec05rtc, u2clhec05rtd, u2clhec06rt,
 u2clhec07rt, u2clhec08rt, u2clhec09rt, u2clhec10rt, u2clhec11rt, u2clhec12rt,
 u2clhec13rt, u2clhec14rt, u2clhec15rt, u2clhec16rt, u2clhec17rt, u2clhec18rt,
 u2cslpq1rt, u2cslpq2rt, u2cslpq3rt, u2cslpq4rt, u2cslpqqcrt, u2cslpq5rt, u2cslpq6rt, u2cslpq7rt, u2cslpq8rt) / 60000.
EXECUTE.
* Sum for total time if all finished.
IF (u2cstat = 2) u2ctime = SUM(u2cq1time, u2cq2time, u2cq3time,
 u2cq4time, u2cq5time, u2cq6time, u2cq7time).
EXECUTE.
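
For readers working outside SPSS, the section-time logic above can be sketched in Python. This is an illustration only, not part of the TEDS syntax: the function name `section_minutes` is hypothetical, and the item response times are assumed to be in milliseconds, as the division by 60000 suggests. Like the SPSS SUM function, the sketch ignores missing items.

```python
from typing import Optional, Sequence

def section_minutes(item_rts_ms: Sequence[Optional[float]],
                    section_status: int) -> Optional[float]:
    """Total time for one questionnaire section, in decimal minutes.

    Returns None (missing) unless the section status is 2 (finished).
    Missing item response times are skipped, as SPSS SUM does.
    """
    if section_status != 2:
        return None
    valid = [rt for rt in item_rts_ms if rt is not None]
    if not valid:
        return None
    return sum(valid) / 60000  # assumed milliseconds -> minutes

# Hypothetical response times (ms) for a finished section, one item missing.
finished = section_minutes([12000, 8000, None, 10000], section_status=2)
# The same items for a section that was started but not finished.
unfinished = section_minutes([12000, 8000, None, 10000], section_status=1)
```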

u2cslpqt1/2

Sleep quality total scale, from the twin phase 2 questionnaire, derived from all 8 available items of the measure.
Each item has integer response values 0-3. The scale is computed as the mean of the non-missing items multiplied by the number of component items (8), so it has a range of values from 0 to 24. At least half the items (4 of 8) are required to be non-missing for the scale to be computed.

COMPUTE u2cslpqt = 8 * MEAN.4(u2cslpq1, u2cslpq2, u2cslpq3,
 u2cslpq4, u2cslpq5, u2cslpq6, u2cslpq7, u2cslpq8).
EXECUTE.
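
The behaviour of MEAN.4 multiplied by the item count can be sketched in Python for readers who are not using SPSS. This is a minimal illustration under stated assumptions: the function name `prorated_total` and the example responses are hypothetical, and missing values are represented as None.

```python
from typing import Optional, Sequence

def prorated_total(items: Sequence[Optional[float]],
                   min_valid: int) -> Optional[float]:
    """Mean of the non-missing items, scaled up to the full item count.

    Returns None (missing) when fewer than min_valid items are present,
    mirroring SPSS MEAN.n followed by multiplication by the item count.
    """
    valid = [x for x in items if x is not None]
    if len(valid) < min_valid:
        return None
    return len(items) * sum(valid) / len(valid)

# Hypothetical responses to the 8 sleep items (each 0-3), two missing.
responses = [1, 2, None, 3, 0, None, 2, 1]
score = prorated_total(responses, min_valid=4)
```

Scaling the mean rather than summing the items keeps the 0 to 24 range even when some items are missing. The same pattern applies to other scales on this page that use MEAN.n with a different threshold.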

u2cspeqpart1/2, u2cspeqhalt1/2

Specific Psychotic Experiences Questionnaire (SPEQ) subscales, derived from items of the measure in the phase 2 twin questionnaire.
The subscales are for paranoia (u2cspeqpart) and hallucinations (u2cspeqhalt).
The subscales are derived from 15 and 9 items respectively. For each scale, at least half the component items are required to be non-missing. Each item has response values 0-5, and each scale is computed as the mean multiplied by the number of component items, hence each scale has a range of values from 0 to (5 * number of items): 0 to 75 for paranoia and 0 to 45 for hallucinations.

* Paranoia subscale: items 1-15.
COMPUTE u2cspeqpart = 15 * MEAN.8(u2cspeq01, u2cspeq02, u2cspeq03, 
 u2cspeq04, u2cspeq05, u2cspeq06, u2cspeq07, u2cspeq08, u2cspeq09, 
 u2cspeq10, u2cspeq11, u2cspeq12, u2cspeq13, u2cspeq14, u2cspeq15).
* Hallucinations subscale: items 16-24.
COMPUTE u2cspeqhalt = 9 * MEAN.5(u2cspeq16, u2cspeq17, u2cspeq18, 
 u2cspeq19, u2cspeq20, u2cspeq21, u2cspeq22, u2cspeq23, u2cspeq24).
EXECUTE.

u2ctime1/2

See u2cq1time1/2, etc above.

u2cvictphyt1/2, u2cvictsoct1/2, u2cvictvert1/2, u2cvictcybt1/2, u2cvictt1/2

See u2cperpphyt1/2, etc above.

ucgage1/2

Age of twin (in decimal years) when the g-game was started.
Derived from ucgconstdt (g-game consent start date) and aonsdob (twin birth date). These date variables are not retained in the dataset.

COMPUTE ucgage = RND((DATEDIFF(ucgconstdt, aonsdob, "days")) / 365.25, 0.1) .
EXECUTE.
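
The age calculation above can be sketched in Python as follows. This is an illustration only: the function name `age_decimal_years` and the example dates are hypothetical. SPSS RND rounds halves away from zero, so the sketch uses floor(x + 0.5) on the scaled value rather than Python's built-in round, which is adequate for the positive ages involved here.

```python
import math
from datetime import date

def age_decimal_years(start: date, birth: date) -> float:
    """Age in years at the start date, rounded to one decimal place."""
    days = (start - birth).days          # whole days between the two dates
    years = days / 365.25                # average year length, as in the syntax
    return math.floor(years * 10 + 0.5) / 10

# Hypothetical dates, for illustration only.
age = age_decimal_years(date(2020, 6, 15), date(1998, 3, 1))
```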

ucgactn1/2, ucgstat1/2, ucgconstat1/2, ucgiststat1/2, ucgmisstat1/2, ucgravstat1/2, ucgverstat1/2, ucgvocstat1/2

Status variables for the g-game study.
ucgactn is the number of sub-tests completed (0-5).
ucgstat is the overall status, coded 0=not started, 1=started but not finished, 2=finished, 4=finished but excluded as a random responder.
ucgconstat, ucgiststat, ucgmisstat, ucgravstat, ucgverstat, ucgvocstat are the status variables respectively for consent, ISTO/NVRA/NVRB, Missing Letter, Ravens Matrices, Verbal Reasoning and Vocabulary. Each activity could not be left unfinished, hence the coding is 0=not started, 2=finished, 4=finished but data excluded (random responder).
The identification and exclusion of random responders is not shown here.

* The raw sub-test status variables have values 0=not done, 1=completed.
* In this web implementation, a sub-test cannot be left started but unfinished.
* Rename these.
RENAME VARIABLES
 (consent__status mountain__status space__status tower__status ocean__status woodland__status
 = ucgconstat ucgvocstat ucgiststat ucgmisstat ucgravstat ucgverstat).

* Count the test activities completed and create an overall status variable.
* (do not include consent in the test activities here).
COMPUTE ucgactn = SUM(ucgvocstat, ucgiststat, ucgmisstat, ucgravstat, ucgverstat).
EXECUTE.
* Now code the overall status variable as 0=not started, 1=started, 2=finished (excluding consent).
RECODE ucgactn (0=0) (1 THRU 4=1) (5=2)
INTO ucgstat.
EXECUTE.

* Now, to be consistent with other datasets, recode sub-test status variables.
* to 0=not started, 2=completed.
RECODE ucgconstat ucgvocstat ucgiststat ucgmisstat ucgravstat ucgverstat
 (0=0) (1=2).
EXECUTE.
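
The counting and recoding steps above can be sketched in Python. This is a minimal illustration: the function name `derive_status` and the dictionary keys are hypothetical stand-ins for the raw status variables, which are assumed to be non-missing and coded 0/1.

```python
from typing import Dict, Tuple

# The five test activities; consent is deliberately left out of the count.
SUBTESTS = ("voc", "ist", "mis", "rav", "ver")

def derive_status(raw: Dict[str, int]) -> Tuple[int, int, Dict[str, int]]:
    """Return (activities completed, overall status, recoded statuses).

    raw maps 'con' and each sub-test name to 0 (not done) or 1 (completed).
    """
    n_done = sum(raw[name] for name in SUBTESTS)
    if n_done == 0:
        overall = 0                      # not started
    elif n_done < len(SUBTESTS):
        overall = 1                      # started but not finished
    else:
        overall = 2                      # finished
    # Recode 1 -> 2 for consistency; any other value is left unchanged.
    recoded = {name: {0: 0, 1: 2}.get(value, value)
               for name, value in raw.items()}
    return n_done, overall, recoded

# Hypothetical twin who gave consent and completed two sub-tests.
n_done, overall, recoded = derive_status(
    {"con": 1, "voc": 1, "ist": 1, "mis": 0, "rav": 0, "ver": 0})
```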

ucgcontime1/2, ucgvoctime1/2, ucgisttime1/2, ucgmistime1/2, ucgravtime1/2, ucgvertime1/2, ucgtime1/2

Time taken (in decimal minutes) to complete each activity of the g-game: consent and each of the five sub-tests; ucgtime is the total g-game time.
Each is derived from the respective start and end date-time variables (which have not been retained in the dataset), also using the respective status variables to check that the relevant activity was completed. The status variables are derived variables described elsewhere on this page.

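* Derive sub-test times as differences between start and end times, in minutes.
* Only compute if the activity was completed.
* Note: the test for status = 1 assumes the raw coding (1=completed), i.e. before the recode to 0/2.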
IF (ucgconstat = 1) ucgcontime = DATEDIFF(ucgconendt, ucgconstdt, 'seconds') / 60.
IF (ucgvocstat = 1) ucgvoctime = DATEDIFF(ucgvocendt, ucgvocstdt, 'seconds') / 60.
IF (ucgiststat = 1) ucgisttime = DATEDIFF(ucgistendt, ucgiststdt, 'seconds') / 60.
IF (ucgmisstat = 1) ucgmistime = DATEDIFF(ucgmisendt, ucgmisstdt, 'seconds') / 60.
IF (ucgravstat = 1) ucgravtime = DATEDIFF(ucgravendt, ucgravstdt, 'seconds') / 60.
IF (ucgverstat = 1) ucgvertime = DATEDIFF(ucgverendt, ucgverstdt, 'seconds') / 60.
EXECUTE.
* Sum these to derive the total time if all sub-tests completed.
IF (ucgstat = 2) ucgtime = SUM(ucgcontime, ucgvoctime, ucgisttime, ucgmistime, ucgravtime, ucgvertime).
EXECUTE.
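
The timing logic above can be sketched in Python. This is an illustration under stated assumptions: the function names `minutes_taken` and `total_minutes` and the example timestamps are hypothetical. The sketch truncates to whole seconds before dividing, as DATEDIFF returns the integer part of the difference, and it skips missing values when summing, as SPSS SUM does.

```python
from datetime import datetime
from typing import Optional, Sequence

def minutes_taken(start: datetime, end: datetime,
                  completed: bool) -> Optional[float]:
    """Time for one activity in decimal minutes, or None if not completed."""
    if not completed:
        return None
    whole_seconds = int((end - start).total_seconds())
    return whole_seconds / 60

def total_minutes(times: Sequence[Optional[float]],
                  all_finished: bool) -> Optional[float]:
    """Sum of the activity times, computed only if every sub-test was finished."""
    if not all_finished:
        return None
    return sum(t for t in times if t is not None)

# Hypothetical start and end times for one completed activity.
t = minutes_taken(datetime(2020, 1, 1, 10, 0, 0),
                  datetime(2020, 1, 1, 10, 7, 30), completed=True)
total = total_minutes([t, 2.5, None], all_finished=True)
```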

ucgdevmob1/2, ucgdevwdth1/2

Device categories used for the g-game web tests.
ucgdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
ucgdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the longer of the two dimensions of the device's screen.
Derived from raw item variables, collected on the web server, that are not retained in the dataset: consent__techuseragent (a complex string denoting the type of user agent), consent__techscrwidth and consent__techscrheight (screen dimensions in pixels).

* Use the consent__techuseragent field to identify crude device types from substrings.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(consent__techuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(consent__techuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(consent__techuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
IF (CHAR.INDEX(consent__techuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(consent__techuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows' and 'Android' above).
IF (CHAR.INDEX(consent__techuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(consent__techuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucgdevmob.
EXECUTE.

* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (consent__techscrwidth >= consent__techscrheight) screenwidth = consent__techscrwidth.
IF (consent__techscrwidth < consent__techscrheight) screenwidth = consent__techscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucgdevwdth.
EXECUTE.
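
The device classification above can be sketched in Python. This is a minimal illustration: the function names and the user agent strings are hypothetical, and the substring matching is case-sensitive, like CHAR.INDEX. Rules are applied in the same order as in the syntax, so a later match overrules an earlier one.

```python
from typing import Optional

# Substring rules in the order used above; later matches overrule earlier ones.
RULES = [
    ("Windows", 1), ("Macintosh", 2), ("X11", 3),   # laptops or desktops
    ("iPhone", 4), ("Android", 5),                  # mobile phones
    ("iPad", 6), ("Tablet", 7),                     # tablets
]

def device_type(user_agent: str) -> Optional[int]:
    """Crude device type code (1-7), or None if no substring matches."""
    code = None
    for substring, value in RULES:
        if substring in user_agent:
            code = value
    return code

def is_mobile(user_agent: str) -> Optional[int]:
    """1 for phones and tablets, 0 for laptops or desktops, None if unknown."""
    code = device_type(user_agent)
    if code is None:
        return None
    return 0 if code <= 3 else 1

def width_category(scr_width: int, scr_height: int) -> int:
    """1=small, 2=medium, 3=large, based on the longer screen dimension."""
    longest = max(scr_width, scr_height)
    if longest <= 767:
        return 1
    if longest <= 1199:
        return 2
    return 3

# Hypothetical, simplified user agent strings.
desktop = is_mobile("Mozilla/5.0 (Windows NT 10.0; Win64; x64)")
phone = is_mobile("Mozilla/5.0 (Linux; Android 10; Mobile)")
windows_tablet = is_mobile("Mozilla/5.0 (Windows NT 10.0; Tablet)")
portrait_phone = width_category(375, 812)
```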

ucgistrtm1/2, ucgmisrtm1/2, ucgravrtm1/2, ucgverrtm1/2, ucgvocrtm1/2

Mean item response time for each g-game sub-test, measured in seconds. QC items are excluded.

* Derive the mean item response time (excluding QC items) for each sub-test.
COMPUTE ucgvocrtm = MEAN(ucgvoc1rt, ucgvoc2rt, ucgvoc3rt, ucgvoc4rt,
 ucgvoc5rt, ucgvoc6rt, ucgvoc7rt, ucgvoc8rt).
COMPUTE ucgistrtm = MEAN(ucgist1rt, ucgist2rt, ucgist3rt, ucgist4rt, 
 ucgist5rt, ucgist6rt, ucgist7rt, ucgist8rt, ucgist9rt).
COMPUTE ucgmisrtm = MEAN(ucgmis1rt, ucgmis2rt, ucgmis3rt, ucgmis4rt, ucgmis5rt, ucgmis6rt).
COMPUTE ucgravrtm = MEAN(ucgrav01rt, ucgrav02rt, ucgrav03rt, ucgrav04rt, ucgrav05rt, 
 ucgrav06rt, ucgrav07rt, ucgrav08rt, ucgrav09rt, ucgrav10rt, ucgrav11rt).
COMPUTE ucgverrtm = MEAN(ucgver1rt, ucgver2rt, ucgver3rt, ucgver4rt, ucgver5rt, ucgver6rt).
EXECUTE.

ucgLLCage1/2, ucgLLCdate1/2

See u1cLLCage1/2, etc above.

ucgnvt1/2, ucgvbt1/2, ucgt1/2

G-game total scores for verbal ability (ucgvbt), non-verbal ability (ucgnvt) and overall general cognitive ability or 'g' (ucgt).
Each is derived as the sum of the relevant sub-test scores, provided that all the relevant sub-tests were completed.

* The g-game is designed to have equal weighting of verbal and non-verbal items/scores (20 each).
* Therefore create simple sums as scores for verbal, non-verbal and g.
* requiring all relevant sub-tests to be non-missing.
COMPUTE ucgnvt = SUM.2(ucgisttot, ucgravtot).
COMPUTE ucgvbt = SUM.3(ucgvoctot, ucgmistot, ucgvertot).
COMPUTE ucgt = SUM.2(ucgnvt, ucgvbt).
EXECUTE.
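
The effect of SUM.2 and SUM.3 above, where a total is only computed if every component is present, can be sketched in Python. This is an illustration only: the function name `sum_if_all_present` and the example scores are hypothetical, and missing scores are represented as None.

```python
from typing import Optional

def sum_if_all_present(*scores: Optional[float]) -> Optional[float]:
    """Sum of the scores, or None if any one of them is missing."""
    if any(s is None for s in scores):
        return None
    return sum(scores)

# Hypothetical sub-test totals; verbal reasoning is missing.
ist, rav = 7, 9
voc, mis, ver = 6, 5, None

nonverbal = sum_if_all_present(ist, rav)
verbal = sum_if_all_present(voc, mis, ver)
g = sum_if_all_present(nonverbal, verbal)
```

Here the non-verbal total is computed, but the verbal total and the overall score are both missing because one verbal sub-test is missing.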

ucv1actvm1/2

See u1cactvm1/2, etc above.

ucv1age1/2, ucv2age1/2, ucv3age1/2, ucv4age1/2

Age of twin (in decimal years) at the start of the respective covid questionnaires in phase 1 (ucv1age), phase 2 (ucv2age), phase 3 (ucv3age) and phase 4 (ucv4age).
Derived from ucv1constdt/ucv2constdt/ucv3constdt/ucv4constdt (consent start date) and aonsdob (twin birth date). These date variables are not retained in the dataset.

COMPUTE ucv1age = RND((DATEDIFF(ucv1constdt, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE ucv2age = RND((DATEDIFF(ucv2constdt, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE ucv3age = RND((DATEDIFF(ucv3constdt, aonsdob, "days")) / 365.25, 0.1) .
COMPUTE ucv4age = RND((DATEDIFF(ucv4constdt, aonsdob, "days")) / 365.25, 0.1) .
EXECUTE.

ucv1actn1/2, ucv1stat1/2

Status variables for the covid phase 1 study.
ucv1actn is the number of questionnaire sections completed (0-8).
ucv1stat is the overall status.
ucv1qXstat (X=1-8) are the status variables respectively for questionnaire sections 1-8, which are unmodified item variables from the raw data, each coded 1=completed, 0=not completed.

* Count the questionnaire sections completed (excluding consent).
COMPUTE ucv1actn = SUM(ucv1q1stat, ucv1q2stat, ucv1q3stat, ucv1q4stat,
 ucv1q5stat, ucv1q6stat, ucv1q7stat, ucv1q8stat).
EXECUTE.
* Use this to create an overall status variable.
* coded as 0=not started, 1=started, 2=finished (excluding consent).
RECODE ucv1actn (0=0) (1 THRU 7=1) (8=2)
INTO ucv1stat.
EXECUTE.

ucv1alco2un1/2

See u2calcoaudit1/2, etc above.

ucv1commm1/2

See u1ccommm1/2, etc above.

ucv1contime1/2, ucv1q1time1/2, ucv1q2time1/2, ucv1q3time1/2, ucv1q4time1/2, ucv1q5time1/2, ucv1q6time1/2, ucv1q7time1/2, ucv1q8time1/2, ucv1time1/2, ucv2contime1/2, ucv2q1time1/2, ucv2q2time1/2, ucv2q3time1/2, ucv2q4time1/2, ucv2q5time1/2, ucv2q6time1/2, ucv2q7time1/2, ucv2q8time1/2, ucv2time1/2, ucv3contime1/2, ucv3q1time1/2, ucv3q2time1/2, ucv3q3time1/2, ucv3q4time1/2, ucv3q5time1/2, ucv3q6time1/2, ucv3q7time1/2, ucv3q8time1/2, ucv3time1/2, ucv4contime1/2, ucv4q1time1/2, ucv4q2time1/2, ucv4q3time1/2, ucv4q4time1/2, ucv4q5time1/2, ucv4q6time1/2, ucv4q7time1/2, ucv4q8time1/2, ucv4time1/2

Time taken (in decimal minutes) to complete each part of the covid studies (ucv1=phase 1, ucv2=phase 2, ucv3=phase 3, ucv4=phase 4): consent and each of the eight questionnaire sections; ucv1time/ucv2time/ucv3time/ucv4time is the total questionnaire time.
Each is derived from the respective start and end date-time variables (which have not been retained in the dataset), also using the respective status variables to check that the relevant activity was completed. The status variables are items coded 1=yes, 0=no to show completion of each section.

* derive section times as differences between start and end times, in minutes.
* only compute if the section was completed.
IF (ucv1constat = 1) ucv1contime = DATEDIFF(ucv1conendt, ucv1constdt, 'seconds') / 60.
IF (ucv1q1stat = 1) ucv1q1time = DATEDIFF(ucv1q1endt, ucv1q1stdt, 'seconds') / 60.
IF (ucv1q2stat = 1) ucv1q2time = DATEDIFF(ucv1q2endt, ucv1q2stdt, 'seconds') / 60.
IF (ucv1q3stat = 1) ucv1q3time = DATEDIFF(ucv1q3endt, ucv1q3stdt, 'seconds') / 60.
IF (ucv1q4stat = 1) ucv1q4time = DATEDIFF(ucv1q4endt, ucv1q4stdt, 'seconds') / 60.
IF (ucv1q5stat = 1) ucv1q5time = DATEDIFF(ucv1q5endt, ucv1q5stdt, 'seconds') / 60.
IF (ucv1q6stat = 1) ucv1q6time = DATEDIFF(ucv1q6endt, ucv1q6stdt, 'seconds') / 60.
IF (ucv1q7stat = 1) ucv1q7time = DATEDIFF(ucv1q7endt, ucv1q7stdt, 'seconds') / 60.
IF (ucv1q8stat = 1) ucv1q8time = DATEDIFF(ucv1q8endt, ucv1q8stdt, 'seconds') / 60.
IF (ucv2constat = 1) ucv2contime = DATEDIFF(ucv2conendt, ucv2constdt, 'seconds') / 60.
IF (ucv2q1stat = 1) ucv2q1time = DATEDIFF(ucv2q1endt, ucv2q1stdt, 'seconds') / 60.
IF (ucv2q2stat = 1) ucv2q2time = DATEDIFF(ucv2q2endt, ucv2q2stdt, 'seconds') / 60.
IF (ucv2q3stat = 1) ucv2q3time = DATEDIFF(ucv2q3endt, ucv2q3stdt, 'seconds') / 60.
IF (ucv2q4stat = 1) ucv2q4time = DATEDIFF(ucv2q4endt, ucv2q4stdt, 'seconds') / 60.
IF (ucv2q5stat = 1) ucv2q5time = DATEDIFF(ucv2q5endt, ucv2q5stdt, 'seconds') / 60.
IF (ucv2q6stat = 1) ucv2q6time = DATEDIFF(ucv2q6endt, ucv2q6stdt, 'seconds') / 60.
IF (ucv2q7stat = 1) ucv2q7time = DATEDIFF(ucv2q7endt, ucv2q7stdt, 'seconds') / 60.
IF (ucv2q8stat = 1) ucv2q8time = DATEDIFF(ucv2q8endt, ucv2q8stdt, 'seconds') / 60.
IF (ucv3constat = 1) ucv3contime = DATEDIFF(ucv3conendt, ucv3constdt, 'seconds') / 60.
IF (ucv3q1stat = 1) ucv3q1time = DATEDIFF(ucv3q1endt, ucv3q1stdt, 'seconds') / 60.
IF (ucv3q2stat = 1) ucv3q2time = DATEDIFF(ucv3q2endt, ucv3q2stdt, 'seconds') / 60.
IF (ucv3q3stat = 1) ucv3q3time = DATEDIFF(ucv3q3endt, ucv3q3stdt, 'seconds') / 60.
IF (ucv3q4stat = 1) ucv3q4time = DATEDIFF(ucv3q4endt, ucv3q4stdt, 'seconds') / 60.
IF (ucv3q5stat = 1) ucv3q5time = DATEDIFF(ucv3q5endt, ucv3q5stdt, 'seconds') / 60.
IF (ucv3q6stat = 1) ucv3q6time = DATEDIFF(ucv3q6endt, ucv3q6stdt, 'seconds') / 60.
IF (ucv3q7stat = 1) ucv3q7time = DATEDIFF(ucv3q7endt, ucv3q7stdt, 'seconds') / 60.
IF (ucv3q8stat = 1) ucv3q8time = DATEDIFF(ucv3q8endt, ucv3q8stdt, 'seconds') / 60.
IF (ucv4constat = 1) ucv4contime = DATEDIFF(ucv4conendt, ucv4constdt, 'seconds') / 60.
IF (ucv4q1stat = 1) ucv4q1time = DATEDIFF(ucv4q1endt, ucv4q1stdt, 'seconds') / 60.
IF (ucv4q2stat = 1) ucv4q2time = DATEDIFF(ucv4q2endt, ucv4q2stdt, 'seconds') / 60.
IF (ucv4q3stat = 1) ucv4q3time = DATEDIFF(ucv4q3endt, ucv4q3stdt, 'seconds') / 60.
IF (ucv4q4stat = 1) ucv4q4time = DATEDIFF(ucv4q4endt, ucv4q4stdt, 'seconds') / 60.
IF (ucv4q5stat = 1) ucv4q5time = DATEDIFF(ucv4q5endt, ucv4q5stdt, 'seconds') / 60.
IF (ucv4q6stat = 1) ucv4q6time = DATEDIFF(ucv4q6endt, ucv4q6stdt, 'seconds') / 60.
IF (ucv4q7stat = 1) ucv4q7time = DATEDIFF(ucv4q7endt, ucv4q7stdt, 'seconds') / 60.
IF (ucv4q8stat = 1) ucv4q8time = DATEDIFF(ucv4q8endt, ucv4q8stdt, 'seconds') / 60.
EXECUTE.
* add total time if all sections completed.
IF (ucv1stat = 2) ucv1time = SUM(ucv1contime, ucv1q1time, ucv1q2time, 
 ucv1q3time, ucv1q4time, ucv1q5time, ucv1q6time, ucv1q7time, ucv1q8time).
EXECUTE.
IF (ucv2stat = 2) ucv2time = SUM(ucv2contime, ucv2q1time, ucv2q2time, 
 ucv2q3time, ucv2q4time, ucv2q5time, ucv2q6time, ucv2q7time, ucv2q8time).
EXECUTE.
IF (ucv3stat = 2) ucv3time = SUM(ucv3contime, ucv3q1time, ucv3q2time, 
 ucv3q3time, ucv3q4time, ucv3q5time, ucv3q6time, ucv3q7time, ucv3q8time).
EXECUTE.
IF (ucv4stat = 2) ucv4time = SUM(ucv4contime, ucv4q1time, ucv4q2time, 
 ucv4q3time, ucv4q4time, ucv4q5time, ucv4q6time, ucv4q7time, ucv4q8time).
EXECUTE.

ucv1devmob1/2, ucv1devwdth1/2, ucv2devmob1/2, ucv2devwdth1/2, ucv3devmob1/2, ucv3devwdth1/2, ucv4devmob1/2, ucv4devwdth1/2

Device categories used for the web covid phase 1 (ucv1), phase 2 (ucv2), phase 3 (ucv3) and phase 4 (ucv4) questionnaires.
ucvXdevmob flags mobile devices (mobile phones and tablets), coded 1=yes 0=no.
ucvXdevwdth is a device width category, coded 1=small, 2=medium, 3=large, where width refers to the longer of the two dimensions of the device's screen.
Derived from raw item variables, collected on the web server, that are not retained in the dataset: consent__techuseragent (a complex string denoting the type of user agent), consent__techscrwidth and consent__techscrheight (screen dimensions in pixels).

* Phase 1.
* -------.
* Use the consent__techuseragent field to identify crude device types from substrings.
* 'Windows', 'Macintosh' and 'X11' are probably all laptops or desktops.
IF (CHAR.INDEX(consent__techuseragent, 'Windows') > 0) devicetype = 1.
IF (CHAR.INDEX(consent__techuseragent, 'Macintosh') > 0) devicetype = 2.
IF (CHAR.INDEX(consent__techuseragent, 'X11') > 0) devicetype = 3.
EXECUTE.
* 'iPhone' and 'Android' are likely to be mobile phones.
IF (CHAR.INDEX(consent__techuseragent, 'iPhone') > 0) devicetype = 4.
IF (CHAR.INDEX(consent__techuseragent, 'Android') > 0) devicetype = 5.
EXECUTE.
* 'iPad' and 'Tablet' are presumably tablets (the latter can overrule 'Windows' and 'Android' above).
IF (CHAR.INDEX(consent__techuseragent, 'iPad') > 0) devicetype = 6.
IF (CHAR.INDEX(consent__techuseragent, 'Tablet') > 0) devicetype = 7.
EXECUTE.
* Recode into a binary variable to flag mobile devices.
* to be retained in the dataset (raw device variables will be dropped).
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv1devmob.
EXECUTE.

* Some devices may be used in portrait or landscape mode.
* so for consistency treat the largest dimension as the screen 'width'.
IF (consent__techscrwidth >= consent__techscrheight) screenwidth = consent__techscrwidth.
IF (consent__techscrwidth < consent__techscrheight) screenwidth = consent__techscrheight.
EXECUTE.
* Categorise screen widths arbitrarily as small, medium or large.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv1devwdth.
EXECUTE.

* Phase 2.
* -------.
* Repeat the syntax above to derive devicetype and screenwidth in exactly the same way.
* from the phase 2 raw data file - then derive the dataset variables as follows.
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv2devmob.
EXECUTE.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv2devwdth.
EXECUTE.

* Phase 3.
* -------.
* Repeat the syntax above to derive devicetype and screenwidth in exactly the same way.
* from the phase 3 raw data file - then derive the dataset variables as follows.
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv3devmob.
EXECUTE.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv3devwdth.
EXECUTE.

* Phase 4.
* -------.
* Repeat the syntax above to derive devicetype and screenwidth in exactly the same way.
* from the phase 4 raw data file - then derive the dataset variables as follows.
RECODE devicetype (1 THRU 3=0) (4 THRU 7=1)
INTO ucv4devmob.
EXECUTE.
RECODE screenwidth
 (LOWEST THRU 767=1) (768 THRU 1199=2) (1200 THRU HIGHEST=3)
INTO ucv4devwdth.
EXECUTE.

ucv1ganxt1/2

See u2cganxt1/2, etc above.

ucv1goalfult1/2, ucv1goalrelt1/2

See u1cgoalfult1/2, etc above.

ucv1LLCage1/2, ucv1LLCdate1/2

See u1cLLCage1/2, etc above.

ucv1mfqt1/2

See u1cmfqt1/2, etc above.

ucv1parvm1/2

See u1cparvm1/2, etc above.

ucv1pilm1/2

See u1cpilm1/2, etc above.

ucv1relam1/2

See u1crelam1/2, etc above.

ucv1sdqemot1/2, ucv1sdqpert1/2, ucv1sdqhypt1/2, ucv1sdqcont1/2, ucv1sdqprot1/2, ucv1sdqbeht1/2

See u1csdqemot1/2, etc above.

ucv1stat1/2

See ucv1actn1/2, ucv1stat1/2 above.

ucv1victphyt1/2, ucv1victvert1/2, ucv1victcybt1/2, ucv1victt1/2

See u2cvictphyt1/2, etc above.

ucv1volnt1/2

See u1cvolnt1/2, etc above.

ucv2actn1/2, ucv2stat1/2

Status variables for the covid phase 2 study.
ucv2actn is the number of questionnaire sections completed (0-8).
ucv2stat is the overall status.
ucv2qXstat (X=1-8) are the status variables respectively for questionnaire sections 1-8, which are unmodified item variables from the raw data, each coded 1=completed, 0=not completed.

* Count the questionnaire sections completed (excluding consent).
COMPUTE ucv2actn = SUM(ucv2q1stat, ucv2q2stat, ucv2q3stat, ucv2q4stat,
 ucv2q5stat, ucv2q6stat, ucv2q7stat, ucv2q8stat).
EXECUTE.
* Use this to create an overall status variable.
* coded as 0=not started, 1=started, 2=finished (excluding consent).
RECODE ucv2actn (0=0) (1 THRU 7=1) (8=2)
INTO ucv2stat.
EXECUTE.

ucv2actvm1/2

See u1cactvm1/2, etc above.

ucv2age1/2

See ucv1age1/2, etc above.

ucv2alco2un1/2

See u2calcoaudit1/2, etc above.

ucv2commm1/2

See u1ccommm1/2, etc above.

ucv2contime1/2, ucv2q1time1/2, ucv2q2time1/2, ucv2q3time1/2, ucv2q4time1/2, ucv2q5time1/2, ucv2q6time1/2, ucv2q7time1/2, ucv2q8time1/2, ucv2time1/2

See ucv1contime1/2, etc above.

ucv2devmob1/2, ucv2devwdth1/2

See ucv1devmob1/2, etc above.

ucv2ganxt1/2

See u2cganxt1/2, etc above.

ucv2goalfult1/2, ucv2goalrelt1/2

See u1cgoalfult1/2, etc above.

ucv2LLCage1/2, ucv2LLCdate1/2

See u1cLLCage1/2, etc above.

ucv2mfqt1/2

See u1cmfqt1/2, etc above.

ucv2parvm1/2

See u1cparvm1/2, etc above.

ucv2pilm1/2

See u1cpilm1/2, etc above.

ucv2relam1/2

See u1crelam1/2, etc above.

ucv2sdqemot1/2, ucv2sdqpert1/2, ucv2sdqhypt1/2, ucv2sdqcont1/2, ucv2sdqprot1/2, ucv2sdqbeht1/2

See u1csdqemot1/2, etc above.

ucv2stat1/2

See ucv2actn1/2, ucv2stat1/2 above.

ucv2victphyt1/2, ucv2victvert1/2, ucv2victcybt1/2, ucv2victt1/2

See u2cvictphyt1/2, etc above.

ucv2volnt1/2

See u1cvolnt1/2, etc above.

ucv3actn1/2, ucv3stat1/2

Status variables for the covid phase 3 study.
ucv3actn is the number of questionnaire sections completed (0-8).
ucv3stat is the overall status.
ucv3qXstat (X=1-8) are the status variables respectively for questionnaire sections 1-8, which are unmodified item variables from the raw data, each coded 1=completed, 0=not completed.

* Count the questionnaire sections completed (excluding consent).
COMPUTE ucv3actn = SUM(ucv3q1stat, ucv3q2stat, ucv3q3stat, ucv3q4stat,
 ucv3q5stat, ucv3q6stat, ucv3q7stat, ucv3q8stat).
EXECUTE.
* Use this to create an overall status variable.
* coded as 0=not started, 1=started, 2=finished (excluding consent).
RECODE ucv3actn (0=0) (1 THRU 7=1) (8=2)
INTO ucv3stat.
EXECUTE.

ucv3actvm1/2

See u1cactvm1/2, etc above.

ucv3age1/2

See ucv1age1/2, etc above.

ucv3alco2un1/2

See u2calcoaudit1/2, etc above.

ucv3commm1/2

See u1ccommm1/2, etc above.

ucv3contime1/2, ucv3q1time1/2, ucv3q2time1/2, ucv3q3time1/2, ucv3q4time1/2, ucv3q5time1/2, ucv3q6time1/2, ucv3q7time1/2, ucv3q8time1/2, ucv3time1/2

See ucv1contime1/2, etc above.

ucv3devmob1/2, ucv3devwdth1/2

See ucv1devmob1/2, etc above.

ucv3ganxt1/2

See u2cganxt1/2, etc above.

ucv3goalfult1/2, ucv3goalrelt1/2

See u1cgoalfult1/2, etc above.

ucv3LLCage1/2, ucv3LLCdate1/2

See u1cLLCage1/2, etc above.

ucv3mfqt1/2

See u1cmfqt1/2, etc above.

ucv3parvm1/2

See u1cparvm1/2, etc above.

ucv3pilm1/2

See u1cpilm1/2, etc above.

ucv3relam1/2

See u1crelam1/2, etc above.

ucv3sdqemot1/2, ucv3sdqpert1/2, ucv3sdqhypt1/2, ucv3sdqcont1/2, ucv3sdqprot1/2, ucv3sdqbeht1/2

See u1csdqemot1/2, etc above.

ucv3stat1/2

See ucv3actn1/2, ucv3stat1/2 above.

ucv3victphyt1/2, ucv3victvert1/2, ucv3victcybt1/2, ucv3victt1/2

See u2cvictphyt1/2, etc above.

ucv3volnt1/2

See u1cvolnt1/2, etc above.

ucv4actn1/2, ucv4stat1/2

Status variables for the covid phase 4 study.
ucv4actn is the number of questionnaire sections completed (0-8).
ucv4stat is the overall status.
ucv4qXstat (X=1-8) are the status variables respectively for questionnaire sections 1-8, which are unmodified item variables from the raw data, each coded 1=completed, 0=not completed.

* Count the questionnaire sections completed (excluding consent).
COMPUTE ucv4actn = SUM(ucv4q1stat, ucv4q2stat, ucv4q3stat, ucv4q4stat,
 ucv4q5stat, ucv4q6stat, ucv4q7stat, ucv4q8stat).
EXECUTE.
* Use this to create an overall status variable.
* coded as 0=not started, 1=started, 2=finished (excluding consent).
RECODE ucv4actn (0=0) (1 THRU 7=1) (8=2)
INTO ucv4stat.
EXECUTE.