TEDS Data Dictionary

TEDS Exclusions

This page describes exclusions, and exclusion variables, that are routinely used in analysis of TEDS data. The information below applies to the revised, twin-specific medical exclusions that were implemented at the end of 2018.

Standard exclusions for analysis

For routine analysis, twins or twin pairs are generally excluded if they fall into one or more of the following categories. Note that some exclusion criteria apply to twin pairs (no 1st Contact, perinatal outliers, unknown sex/zygosity) while others apply to individual twins (medical exclusions and hence the overall general exclusion).

Category | Brief explanation | Exclusion variable | Variable values
--- | --- | --- | ---
General exclusion | Variables to encapsulate all four categories below. | exclude1/2 (twin-specific, double entered) | 1=excluded, 0=not excluded
Medical exclusions | Twins with serious medical conditions which affect their ability to take part in TEDS activities and/or are known to be associated with mental impairment. | medexcl1/2 (twin-specific, double entered) | 1=excluded, 0=not excluded
Perinatal outliers | Extreme adverse conditions are known to have applied before or after birth and may have affected subsequent development. Derived from 1st Contact variables. | aperinat (applies to twin pair) | 1=excluded, 0=not excluded
No 1st Contact data | Essential background variables, including perinatal variables, are missing. | acontact (applies to twin pair) | 1=data present (not excluded), 0=data missing (excluded)
Unknown twin zygosity or gender | Missing information about zygosity or gender prevents inclusion in twin models and other common analyses. | sexzyg (applies to twin pair) | 7=unknown (exclude); 1=MZM, 2=DZM, 3=MZF, 4=DZF, 5=DZOS (not excluded)

For routine analysis, variables exclude1/2 can be used for exclusion purposes. However, the other exclusion variables are generally available in the dataset if more flexibility is required.

The exclude1/2 variables are twin-specific and double entered. Apply filters as follows for exclusion:

  • exclude1=0 & exclude2=0:
    for twin-pair analysis, this removes pairs in which either or both twins are excluded
  • exclude1=0:
    for analysis that does not involve pairs, this removes individual excluded twins
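
As a hedged illustration, the two filters above might be applied in SPSS as follows. This is a minimal sketch rather than TEDS's own syntax; the filter variable names pairfilt and indfilt are hypothetical.

```spss
* Twin-pair analysis: keep only pairs in which neither twin is excluded.
COMPUTE pairfilt = (exclude1 = 0 & exclude2 = 0).
FILTER BY pairfilt.
* Run the twin-pair analysis here, then switch the filter off.
FILTER OFF.
* Individual-level analysis: keep only twins who are not themselves excluded.
COMPUTE indfilt = (exclude1 = 0).
FILTER BY indfilt.
```

FILTER leaves the excluded cases in the dataset but omits them from procedures until FILTER OFF is run. SELECT IF could be used instead if the excluded cases should be deleted from the working file.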

The SPSS syntax below shows how the exclude1/2 variables are derived.

* Default value is 0 (not excluded).
COMPUTE exclude1 = 0.
COMPUTE exclude2 = 0.
EXECUTE.
* New definition of exclusion: define for each twin.
* and exclude if twin is a medical exclusion, or if twin pair has no 1st Contact.
* or perinatal outliers or unknown sex/zygosity.
IF (SYSMIS(acontact) | acontact = 0 | aperinat = 1 | sexzyg = 7 | medexcl1 = 1) exclude1 = 1.
IF (SYSMIS(acontact) | acontact = 0 | aperinat = 1 | sexzyg = 7 | medexcl2 = 1) exclude2 = 1.
EXECUTE.
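
To see how each criterion contributes to the general exclusion, the component variables can be tabulated alongside exclude1. The following is a suggested check, not part of the derivation itself, and it uses only the variables named above.

```spss
* Count twins flagged by each exclusion variable.
FREQUENCIES VARIABLES=exclude1 medexcl1 aperinat acontact sexzyg.
* Cross-tabulate the general exclusion against the medical exclusion.
CROSSTABS /TABLES=exclude1 BY medexcl1.
```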
			

Further explanation of the perinatal outlier and medical exclusion categories follows below.

Perinatal outliers

The perinatal outliers are twin pairs for whom extreme circumstances before or after birth have been identified from 1st Contact variables, and whose subsequent development may have been adversely affected. They are therefore routinely treated as exclusions from analysis. The perinatal outliers are identified using variable aperinat as mentioned above (aperinat=1 signals exclusion). They are twin pairs who fall into one or more of the following categories.

  • Very low birth weight, for either or both twins: less than 471g.
  • Long period of special care after birth, for either or both twins: more than 97 days.
  • Long period of hospital admission after birth, for either or both twins: more than 74 days.
  • Very short period of gestation: less than 27 weeks.
  • High weekly consumption of alcohol by mother during pregnancy: 14 or more units per week.

In all five categories, the cut-off was based on a 1/7000 rule (3.6SD from the mean). Although the first three categories could apply to just one twin of a pair, in practice they are often found to apply to both twins in affected pairs. The last two categories necessarily affect the twin pair. Hence this exclusion is applied pair-wise. The aperinat variable is described in more detail on the 1st Contact derived variables page.

Medical exclusions

Medical exclusions are twins who have serious medical conditions and who are routinely excluded from analysis because (a) their condition probably affects their ability to participate in TEDS activities, and/or (b) their condition is known to be associated with mental impairment. Twins categorised as medical exclusions are identified using the double entered variables medexcl1/2. Most medical exclusions fall into the following categories.

Medical exclusion category | Examples of excluded conditions | Examples of related conditions NOT excluded
--- | --- | ---
Severe ASD | Cases that are non-verbal or with severely delayed speech, or with difficulties in completing activities. | Mild or high-functioning ASD/Asperger's cases; cases who have apparently participated without difficulty.
Severe cerebral palsy | Cerebral palsy with quadriplegia or diplegia; difficulties completing activities; complications with other conditions. | Mild cerebral palsy, no apparent difficulty in participating.
Chromosomal abnormalities | Down's syndrome; chromosome deletion syndromes (nearly always associated with mental impairment). | -
Inherited or genetic conditions having known associations with mental impairment | Fragile X, Noonan syndrome and others. | Inherited or genetic conditions not associated with mental impairment, for example cystic fibrosis, haemophilia.
Brain organically affected | Hydrocephalus, brain haemorrhage, microcephaly, brain damage from accidents. | Brain tumours.
Developmental delay | Also called global developmental delay. | -
Deafness or blindness | Profound deafness in both ears; complete blindness in both eyes. | Deafness in one ear, partial hearing, partial sight.
Other serious conditions or syndromes | Typically a combination of conditions/symptoms that apparently affect the ability to participate. | Multiple medical conditions that apparently do not hinder the ability to participate.

In any given twin pair, often only one twin is affected. Hence, medical exclusions are now applied per twin, not per pair. This avoids having to exclude participating, unaffected twins from analysis merely because their co-twins are categorised as medical exclusions (such co-twins are sometimes also non-participating).

The records of medical conditions, and categories of medical exclusion, are maintained within the TEDS admin database. These records are based on information accumulated throughout the course of TEDS, from various sources. Some information originates from TEDS questionnaires; for example, the 1st Contact and 7 year parent booklets contained questions about serious medical conditions. Further information has come from direct contact with parents and twins, for example during telephone calling in the TEDS web studies. Further information still has come indirectly to TEDS via spin-off studies such as E-Risk, Attachment and SRS, whose staff have visited the twins. In particular, our information about severity of ASD has often come from SRS staff.

As well as the "exclusion from analysis" category, twins may be marked in the admin database as candidates for exclusion from participation in studies. This latter category is based on a twin's record of participation as well as on information about the severity of any medical conditions. The "exclusion from analysis" and "exclusion from studies" categories are independent, so the latter may be changed while the former remains unchanged. Inclusion or exclusion in any given study does not change a twin's status with respect to exclusion from analysis. At the start of any new study, such "exclusion from studies" twins are considered for participation or exclusion on a case-by-case basis.

Other exclusions

The general exclusions described above (using variables exclude1/2) may be replaced or supplemented with other exclusions for specific types of analysis. Some examples are described below.

ASD trait analysis

Twin ASD traits have been assessed at a variety of ages using measures such as CAST and AQ. In analysis of such measures, it is often useful to compare results with and without exclusion of twins who are known to have diagnoses of ASD. Additional variables are available in the dataset for this purpose.

As explained above (medical exclusions), mild cases of ASD are no longer treated as exclusions, so are routinely included in analysis. However, severe cases of ASD are treated as medical exclusions. To help identify such cases, and to allow severe ASD cases to be retained in analysis where required, the following variables are available in the datasets. Each variable is twin-specific and double entered.

Variable | Variable coding | Description
--- | --- | ---
autism1/2 | 2=severe ASD, 1=mild ASD, 0=no ASD | Diagnoses of ASD reported by families (or reported by autism sub-studies like SRS) and recorded in the TEDS admin database. Where categorised as severe (value 2), the twin is a medical exclusion. Mild ASD (value 1) is not categorised as a medical exclusion.
exclaut1/2 | 1=exclude from analysis, 0=not excluded | General exclusion variable (similar to exclude1/2 as described above) but with the exception that cases of severe ASD (autism1/2 = 2) are NOT excluded.

The exclaut1/2 variable is derived as follows in SPSS syntax.

COMPUTE exclaut1 = 0.
COMPUTE exclaut2 = 0.
EXECUTE.
* New definition (per-twin): mild ASD is no longer a medical exclusion, can ignore these.
* However, do NOT exclude if twin is medical exclusion and has severe ASD (autism1/2 = 2).
* Hence exclude if twin is medical exclusion and not severe autism.
IF ((medexcl1 = 1 & autism1 < 2) | aperinat = 1 | sexzyg = 7 | SYSMIS(acontact) | acontact = 0) exclaut1 = 1.
IF ((medexcl2 = 1 & autism2 < 2) | aperinat = 1 | sexzyg = 7 | SYSMIS(acontact) | acontact = 0) exclaut2 = 1.
EXECUTE.
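
As a hedged sketch of the with/without comparison described above, the same procedure can be run twice under different temporary selections. The trait variable asdtrait1 is a hypothetical placeholder for whichever ASD trait measure is being analysed, and DESCRIPTIVES stands in for the real analysis.

```spss
* Analysis retaining twins with ASD diagnoses, including severe ASD.
TEMPORARY.
SELECT IF (exclaut1 = 0).
DESCRIPTIVES VARIABLES=asdtrait1.
* Same analysis after removing all twins with a reported ASD diagnosis.
TEMPORARY.
SELECT IF (exclude1 = 0 & autism1 = 0).
DESCRIPTIVES VARIABLES=asdtrait1.
```

Each TEMPORARY selection applies only to the next procedure, so the working dataset is left unchanged. Note that the second selection also drops any twin whose autism1 value is missing.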
			

The older twin autism variable called dawbar1/2 is no longer in the dataset. This variable combined diagnostic results from two childhood autism sub-studies: the DAWBA study (2005-07) and the earlier SCQ study (2001). This variable has become redundant because of its age and because of the availability of the more current autism1/2 variable described above. The latter variable has incorporated early results from the DAWBA and SCQ sub-studies with subsequent results from the SRS sub-study plus more recent self-reported twin diagnoses at ages 21 and 26.

In-home study dataset

Families were visited in their homes in the 4 year in-home study. As part of the visit, TEDS staff assessed the twins' health, and specifically their ability to participate in the battery of tests. The assessment included a hearing test. As a result, there is a study-specific medical exclusion variable (emedexcl) and a study-specific general exclusion variable (eexclude). These variables, rather than the more general exclusion variables described above, are normally used to filter exclusions in analysis of the in-home dataset.

Language development analysis

The 1st Contact dataset includes variable alang, describing the main language spoken at home at the time of assessment (with values 1=English, 0=other, 2=English+other).

For analysis of twin language development, especially at early ages, it is generally appropriate to exclude from analysis those twins whose first language is not known to have been English. Hence, for analysis, the filter alang=1 can be applied (excluding 0, 2 and missing values).
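
A minimal sketch of this filter in SPSS, here combined with the general exclusion on the assumption that both are wanted; the filter variable name langfilt is hypothetical.

```spss
* Keep twins who are not general exclusions and whose home language was English only.
COMPUTE langfilt = (exclude1 = 0 & alang = 1).
FILTER BY langfilt.
```

Where exclude1 is 0 and alang is missing, the computed filter value is also missing, and FILTER treats missing values as filtered out. Twins with an unknown home language are therefore dropped along with values 0 and 2.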

Age exclusions

The 2 year, 3 year and 4 year booklet studies were designed for data collection as close as possible to the respective birthdays of the twins. The aim was ideally to collect twin data no earlier than 2 months before and no later than 3 months after the respective birthday (for example, in the 2 year study the aim was to collect data between the ages of 1 year 10 months and 2 years 3 months). Furthermore, for each pair of twins, the aim was to collect data for both twins at the same time (ideally within a few days of each other, and at most two months apart).

If needed, exclusion variables exist to remove twin pairs from whom data was collected too early, too late or (for the pair) too far apart in time. These exclusion variables exist in "moderate" and "strict" forms, as follows. In each case, the variable coding is 1=excluded, 0=not excluded.

Dataset | Exclusion description | Exclusion variable | Criteria for exclusion
--- | --- | --- | ---
2 year | Moderate age exclusion | bagemdex | Age less than 1 year 10 months, or greater than 2 years 6 months.
2 year | Moderate age combined with general exclusion | bexcludm | Age criteria as above, and/or general exclusions encapsulated in variable exclude1/2.
2 year | Strict age exclusion | bagestex | Age less than 1 year 10 months, or greater than 2 years 3 months, or twin pair age difference greater than 2 months.
2 year | Strict age combined with general exclusion | bexcluds | Age criteria as above, and/or general exclusions encapsulated in variable exclude1/2.
3 year | Moderate age exclusion | cagemdex | Age less than 2 years 10 months, or greater than 3 years 6 months.
3 year | Moderate age combined with general exclusion | cexcludm | Age criteria as above, and/or general exclusions encapsulated in variable exclude1/2.
3 year | Strict age exclusion | cagestex | Age less than 2 years 10 months, or greater than 3 years 3 months, or twin pair age difference greater than 2 months.
3 year | Strict age combined with general exclusion | cexcluds | Age criteria as above, and/or general exclusions encapsulated in variable exclude1/2.
4 year | Moderate age exclusion | dagemdex | Age less than 3 years 10 months, or greater than 4 years 6 months.
4 year | Moderate age combined with general exclusion | dexcludm | Age criteria as above, and/or general exclusions encapsulated in variable exclude1/2.
4 year | Strict age exclusion | dagestex | Age less than 3 years 10 months, or greater than 4 years 3 months, or twin pair age difference greater than 2 months.
4 year | Strict age combined with general exclusion | dexcluds | Age criteria as above, and/or general exclusions encapsulated in variable exclude1/2.

In each of these cases, exclusion is derived based on twin birth date and the dates recorded in the twin booklets. Further details can be found on the derived variable pages for the 2 year, 3 year and 4 year studies.
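
As an illustrative sketch for the 2 year dataset, the age exclusion variables can be tabulated and then used as a filter. The filter variable name age2filt is hypothetical; the same pattern would apply to the 3 year and 4 year variables.

```spss
* Compare how many cases the moderate and strict rules remove.
FREQUENCIES VARIABLES=bagemdex bagestex bexcludm bexcluds.
* Apply the moderate age exclusion combined with the general exclusion.
COMPUTE age2filt = (bexcludm = 0).
FILTER BY age2filt.
```

Because the exclusion variables are coded 1=excluded, the filter variable is computed so that retained cases take the value 1, which is what FILTER expects.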

Ethnic origin exclusions

The 1st Contact dataset includes variable aethnic, describing the ethnic origin of the twins (with values 1=white, 0=other).

Because a very large majority of the TEDS twin sample has white ethnic origin, for analysis of genotypic data, it is often appropriate to exclude from analysis those twins whose ethnic origin was not white. Hence, for analysis, the filter aethnic=1 can be applied (excluding 0 and missing values).

Note that every attempt was made to apply this exclusion both during selection for genotyping and during QC of the genotype data. Hence, it is rare that this exclusion needs to be made at the analysis stage.