Cross-Disorder Analysis

Any member of the PGC can propose a secondary analysis. Here, we describe how to propose analysis of data that involve more than one disorder. (Proposals for a single disorder should be discussed with that PGC working group.)

The proposal process is straightforward. The PGC strongly encourages these analyses so as to increase our knowledge of these disorders. There is a simple application form available here. This form is reviewed by the Cross-Disorder Group to ensure suitability and lack of overlap with any other ongoing or approved analysis. Before you submit your proposal, make sure you review the list of approved projects and FAQ.

Approved proposals

Date/Title/Investigators Aims

Cross-disorder analyses of anxiety disorders

Jordan Smoller, Jack Hettema
Anxiety disorders as defined by DSM are known to be heritable phenotypes and represent common comorbidities of the PGC disorders (especially bipolar disorder, MDD, and ADHD). Data from family, twin and molecular genetic studies suggest that there are shared genetic risk factors that transcend anxiety disorder diagnoses and that may also overlap with genes influencing other psychiatric disorders. To date, most association studies of anxiety disorders have focused on a limited set of candidate genes, although a small number of GWAS studies of panic disorder have appeared. We aim to examine comorbid anxiety disorder diagnoses in the context of the PGC to identify genes that influence DSM-IV anxiety disorders and latent factors representing a broader anxiety-disorder proneness.

Depression-Bipolar meta-analysis

Gerome Breen, Patrick Sullivan, Mark Daly, John Kelsoe, Jordan Smoller
Given the considerable phenotypic and genetic overlap between depression and bipolar disorder, we seek to analyze the common SNP association data to determine whether they reveal additional findings. This may involve integration of common phenotype data such as age of onset and depressive episodes.

Autism-Schizophrenia meta-analysis

Mark Daly, Mick O'Donovan
Given the considerable overlap in associated rare and de novo CNVs between autism and schizophrenia, we seek to evaluate the common SNP association data to determine whether shared signals may exist which further implicate shared processes between the two disorders.

Genome-wide association study of suicide attempt in MDD, BIP, SCZ

Cathryn Lewis
Suicide is a major problem in several psychiatric disorders, and studies indicate it has a genetic component which is in part independent of the genetics of the underlying psychiatric trait. In addition to numerous but inconclusive candidate gene studies, several genome-wide association studies of suicide attempt have been published using studies that are part of the PGC: Perlis et al. (AJP 2010) detected no replicated association in MDD or bipolar disorder, Willour et al. (Mol Psych) showed evidence for association to suicide attempt near ACP1, and we have a paper looking at suicide attempt in MDD in the RADIANT studies (Lewis et al, PLoS ONE, in production). The results of these studies suggest that, in common with most psychiatric traits, the genetic susceptibility to suicide attempt is spread across many genes each with modest effect sizes, and large sample sizes will be required to detect them. I propose to use the strength of the CDG samples to look at suicide attempt in the MDD, BIP and SCZ studies to identify SNPs associated with suicide attempt, independent of the underlying disorder.

Methods for improving power of GWAS by leveraging SNP annotation and study psychiatric pleiotropy

Ole A. Andreassen, Anders M. Dale
We propose to leverage a new variant annotation pipeline developed by our colleagues and novel computational tools to improve statistical methods for GWA studies. Preliminary analyses suggest that our methodology has the potential to improve the power of the traditional GWAS, increase the likelihood of identifying replicable variants and improve phenotypic prediction generalization from SNP data. We aim to 1) assess the validity of our approach using completed GWAS p-values per disease, 2) leverage genotype and phenotype data to improve gene finding and assess improved SNP-based prediction generalization across sub-studies, and 3) use the method to study psychiatric pleiotropy across disorders.

Bipolar Disorder-Schizophrenia meta-analysis

Ken Kendler, Nick Craddock, Pamela Sklar, Stephan Ripke, Andrew Mcquillin, Ayman Fanous, Douglas Ruderfer
The importance of undertaking genetic studies across traditional clinical phenotypes has been acknowledged by PGC (PGC Cross Disorders Group, Br J Psychiatry 2010) and such studies are an integral aim of PGC. Following the general approach and strategy we have outlined for the PGC Cross Disorders work (PGC Cross Disorders Group, Br J Psychiatry 2010) we propose studies examining the relationship between bipolar disorder and schizophrenia. 
For several of the most strongly associated SNPs reported in bipolar disorder (BD) and schizophrenia (SCZ), there is evidence for association also in the “other” disorder (Williams et al, Mol. Psychiatry, 2010). The report of an overlap in the “polygenic” signals in the disorders (ISC, Nature, 2009), and analyses of gene-wide association (Moskvina et al, Mol. Psychiatry, 2009) further support the existence of an overlap in genetic susceptibility to BD and SCZ. Cases with mixed features of BD and SCZ (schizoaffective disorder, bipolar type) have had limited scrutiny to date and their relationship to BD and SCZ is not yet clear. In addition, recent analyses restricted to a handful of loci completed by the PGC-BD and PGC-SCZ groups have identified contributions from both bipolar disorder and schizophrenia to the overall genome-wide significant findings. Thus in conjunction with the PGC-SCZ and PGC-BD working groups, we seek to evaluate and characterize the nature, extent and relationships between the common SNP association signals in SCZ and BD using the PGC dataset.

Do common genetic variants associated with autism, schizophrenia, and bipolar disorder predict cognitive and behavioral variation in a general population sample of children ages 7-16?

Elise Robinson, Mark Daly, Jordan Smoller, Susan Santangelo, Roy Perlis, Ben Neale, Shaun Purcell, Stephan Ripke, Robert Plomin, Angelica Ronald
The extent to which common genetic variants associated with major psychiatric and neurodevelopmental disorders predict cognitive and behavioural variation in the general population is unclear. As a component of a K award application (to ER, mentored by MD – to be submitted in February 2012), the proposing investigators aim to use data from the Twins Early Development Study (TEDS, n=3,200) to address the following questions: 1) Do common genetic variants associated with autism, schizophrenia and bipolar disorder predict cognitive and behavioural variation assessed quantitatively in the general population? 2) Do common genetic variants associated with autism, schizophrenia, and bipolar disorder predict trajectories of cognitive and behavioural development in the general population between ages 7 and 16?

Candidate selection and testing through integration of functional genomics

Kenneth Kendler, Xiangning Chen, Silviu-Alin Bacanu
We propose to test methods to enrich GWAS association signals by integration with functional genomics datasets. In our preliminary analyses of the schizophrenia dataset, functional genomics data can significantly enrich association signals above GWAS background and allow effective selection of promising candidates for further testing. Our goals are: 1) To assess the validity of our approach by testing individual diseases and across diseases. 2) To evaluate whether the enriched association signals are disease-specific or shared across multiple diseases.

Cross disorder analysis ADHD and bipolar disorder (BPD)

Andreas Reif, Klaus-Peter Lesch, Barbara Franke, Eric Mick, John Kelsoe, Steve Faraone
There is considerable evidence that ADHD and BPD can be co-morbid conditions. Coming from primary bipolar samples, the co-morbidity between BPD and ADHD has been estimated to be between 9% (Nierenberg et al., 2005) and 18% (Tammam et al., 2006; McIntyre et al., 2010). Coming from primary ADHD samples, co-morbidity rates varied widely, with values between 5% (McGough et al., 2005) and 47% (Wilens et al., 2003), although the mean seems to range from 9% (Park et al., 2010) to 19% (Faraone et al., 2005) as well. A mutual co-morbidity rate around 20% is also supported by the population-based National Comorbidity Survey study (Kessler et al., 2006). One possible explanation for this increased co-morbidity might be that both disorders share at least some risk genes, or alternatively, that ADHD with co-morbid BPD might in fact be a distinct disorder (Faraone et al., 2001; Bernardi et al., 2010). In either case, only a few risk genes (e.g., DGKH; Weber et al., 2011) have been suggested to date. We here aim to identify shared risk genes by analyzing the ADHD and BPD GWAS datasets.


Genome-wide analysis of genetic determinants of the course of major psychiatric disorders

Thomas G. Schulze, Marcella Rietschel, Stephan Ripke


Psychiatric genetic research has so far primarily focused on cross-sectional diagnostic phenotypes. One largely untraveled avenue so far is the study of longitudinal phenotypes, in particular the course of disorder. Course of the disorder, however, constitutes an important phenotype, as it is common clinical knowledge that patients differ widely in several aspects of the course of their illness. These include individual patterns of relapse, regain of functioning after an acute episode of illness, level of disability, and cognitive functioning in relation to the duration of illness, among others. While the samples gathered by the PGC are not longitudinal in nature, several groups represented in the PGC have collected clinical data that can be used as proxies for course markers, such as GAF (assessed at various time points) or OPCRIT items #5 (mode of onset), #9 (poor work adjustment), #10 (poor premorbid social adjustment), #88 (deterioration from premorbid level of functioning), #89 (psychotic symptoms respond to neuroleptics), or #90 (course of disorder). We are proposing to use these variables as primary phenotypes for a GWAS, applying various analytical strategies: A) GWAS of course markers in SCZ; B) GWAS of course markers in BD; C) GWAS of course markers in MD; D) GWAS of course markers across SCZ, BD, and MD; E) Polygenic prediction of course markers across the three disorders. Thus, in addition to elucidating disorder-specific predictors of course, we plan to investigate whether there are some genetic determinants of course shared across the three disorders.


Genetic influences on age of onset of adult psychiatric disorders

Ayman Fanous, Ken Kendler, Tim Bigdeli


Considerable evidence has mounted suggesting that susceptibility genes for schizophrenia, bipolar disorder, and major depressive disorder overlap to some extent.  Age of onset (AOO) may be influenced by common alleles that do not by themselves alter the risk of illness.  There may be, for example, genetic factors modifying resilience against the development of any psychiatric disorder, which might delay the onset of illness until such a point that this is counterbalanced by genetic or environmental risk factors.  We seek to identify such common alleles shared by SCZ, BPD, and MDD, having detected suggestive evidence of them in SCZ alone (Schizophrenia PGC, submitted, Am J Psychiatry).  We further seek to determine whether the polygenic risk of illness affects age of onset. Finally, we will attempt to identify gene pathways that might affect AOO.      


Genetic relationship of head circumference to schizophrenia and autism

Ayman Fanous, Brion Maher, Kelly Benke, Tim Bigdeli

A robust relationship between early life head growth and autism has been replicated in multiple studies (Redcay 2005). Head circumference at birth is small, but grows much more rapidly in the first two to four years (Courchesne 2003; Redcay 2005). Similarly, small head size at birth may correlate with schizophrenia (Basset 1996). Further support for the relationship between anthropometric traits and these two psychiatric disorders might come from genetic association studies. A recent GWA meta-analysis of infant head circumference (18-30 months) reported 12 independent loci in the discovery cohorts with a p-value < 10⁻⁵ (Taal 2012). Additionally, many GWA studies report findings for autism and schizophrenia. While any given locus is not likely to explain much of the observed variation in the trait, a genetic risk score that combines information across all loci may explain a larger proportion of the variance (Janssen 2009), indicating that the genetic architecture of many common, complex traits can be explained, in part, by the combined additive effects across several common, modestly associated genetic variants (ISC, 2009). Thus, genetic risk scores created by using SNPs (or their close correlates) identified from GWA studies of autism and schizophrenia may associate with infant head circumference, even if the individual SNPs were not found to be significant in the GWA meta-analysis of infant head circumference. Similarly, genetic risk scores created from the SNPs identified for infant head circumference may associate with psychiatric traits in adulthood. We plan to examine the relationship between polygenic risk scores for the two psychiatric illnesses and infant head circumference. As a complementary method, we may also implement the GCTA analysis. Finally, we will perform a meta-analysis of individual GWAS of head circumference and SCZ, as well as head circumference and AUT.
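The idea of a genetic risk score that combines additive effects across loci can be illustrated with a minimal Python sketch. It is not the proposers' pipeline: the function name `genetic_risk_score`, the SNP identifiers, and the weights are hypothetical, and the score is assumed to be a simple weighted sum of risk-allele dosages.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages over SNPs present in both inputs.

    dosages: dict mapping SNP id -> allele dosage (0 to 2) for one person.
    weights: dict mapping SNP id -> per-allele effect size from a GWAS.
    """
    shared = sorted(dosages.keys() & weights.keys())
    return sum(weights[snp] * dosages[snp] for snp in shared)


# Hypothetical SNP ids and effect sizes, for illustration only.
weights = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.02}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

print(genetic_risk_score(person, weights))
```

SNPs missing from either input are skipped here; a real analysis would also need to handle strand alignment, LD pruning, and missing genotypes, which this sketch ignores.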


Corrected AUC (Area Under the receiver operating Characteristic): a viable, potentially more robust, alternative to logistic regression in GWAS

John Kelsoe, Nicholas Schork, Aaditya Rangan, Caroline McGrouther

Currently, logistic regression corrected for covariates (MDS components, study of origin) is the default method by which we analyze the imputed data available through the PGC. Logistic regression has the following attractive features: 1) it can be corrected for linear relationships between covariates, 2) people are very familiar with it, 3) it can be generalized to higher dimensions, and 4) it is well-suited to linear additive models with many SNPs (Purcell et al. 2006) as its coefficient is basically the difference in the means between the case and control groups passed through a non-linear function. Unfortunately, 1) it is sensitive to outliers, 2) its trial-to-trial variance can be unbounded, and 3) it was not designed to work with skewed data distributions, and much of our imputed data is neither normally distributed nor integer-valued. The Area Under the receiver operating Characteristic (AUC) is the probability that, if one picks a case and a control, the case value will be larger than the control value. Unlike logistic regression, the AUC 1) is robust to outliers, 2) has bounded trial-to-trial variance, and 3) is well suited to voting models (those that rely on few SNPs). It cannot be generalized to higher dimensions, but this is not relevant to our analysis. Despite these positive features, it has not previously been considered an alternative to logistic regression for non-normally distributed or integer-valued data as it could not be comparably corrected for covariates. Due to our recent work, McGrouther, Schork, and Rangan (in submission), AUC can be corrected for linear as well as non-linear covariates. Its capacity to correct for non-linear covariate interactions is another reason to suspect it may outperform logistic regression in identifying replicable SNPs in a GWAS analysis with mixed imputed/real data.
It is our aim to determine whether 1) Corrected AUC (CAUC) can more reliably identify replicable SNPs in the BPD, MDD, SCZ, Autism and ADHD GWAS datasets, 2) CAUC is associated with greater enrichment of replicable SNPs than logistic regression, and 3) a risk prediction model based on CAUC can use fewer SNPs to predict as well as or better than the traditional linear additive model based on corrected logistic regression.
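The definition of the AUC given above (the probability that a randomly picked case value exceeds a randomly picked control value) can be computed directly by comparing every case-control pair. The Python sketch below shows only this plain, uncorrected AUC; it does not implement the covariate-corrected CAUC of the proposal, whose details are not given here. Counting ties as one half is an assumption, and the dosage values are made up for illustration.

```python
from itertools import product


def auc(case_values, control_values):
    """Fraction of case-control pairs in which the case value is larger.

    Ties are counted as half a win (an assumption, not from the proposal).
    """
    if not case_values or not control_values:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case, control in product(case_values, control_values):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical imputed allele dosages (0 to 2) for a single SNP.
cases = [1.8, 1.1, 0.9, 2.0]
controls = [0.2, 1.1, 0.7]

print(auc(cases, controls))
```

Because each pair contributes at most one win, the statistic always lies between 0 and 1, with 0.5 indicating no separation between cases and controls.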


Determining whether real and imputed data should be treated as equivalent

John Kelsoe, Nicholas Schork, Aaditya Rangan, Caroline McGrouther

Many of the SNPs within the Sklar et al. 2011 HapMap2 imputed data set are heavily imputed. This includes many of the most promising SNPs. For example, rs4765913 (CACNA1C) is 100% imputed and rs12576775 (ODZ4) is 66% imputed. A similar trend is observed within the Smoller et al. 2013 CDG results: 3 of the top 4 CDG SNPs are more than two-thirds imputed in the BPD sample. This proposal asks whether imputed data is a good approximation of real data. Our assumption has been that imputed data is a good proxy for real data and can therefore be used to indicate which SNPs should be further analyzed. Preliminary findings in the Sklar et al. data suggest they are not equivalent. We have observed that (i) prior to correction, the imputed data is much more correlated across the population than the real data, and (ii) taken separately, the imputed data and real data identify different SNPs. The vast majority of the top SNPs in the imputed data set are nowhere near the top SNPs in the real data set, and vice-versa. In addition, there is a growing body of literature suggesting that imputed data may systematically differ from real data. Thus, it is imperative to determine if this trend is isolated to the BPD sample, or if it is more pervasive. If there is a systematic difference between the imputed and real data in the SCZ, AUT, ADHD and MDD data sets, then many of the conclusions regarding these data sets may need to be reevaluated. Our first aim is to repeat our preliminary analysis of the BPD data on the other data sets mentioned above. Second, if we find a difference between the real and imputed data, we will further investigate whether or not these differences remain after imputing with a larger reference library (e.g., the 1000 Genomes Project).


Brainstorm project: a cross-disorder analysis of common variation in neurological and psychiatric diseases

Verneri Anttila, Stephan Ripke, Rainer Malik, Laramie Duncan, Alessandro Biffi, Phil Lee, Ken Kendler, Jeremiah Scharf, Aarno Palotie, Jordan Smoller, Mark Daly, Jonathan Rosand, Benjamin Neale

Many neurological and psychiatric diseases have considerable co-morbidity, and the strength of the etiological boundaries is a topic of active debate. The objective of the study is to integrate neurological genome-wide association results with the PGC-CDG efforts to further explore cross-disorder analyses of common genetic risk factors. The overall aim is to provide insight into the molecular basis of these phenotypic co-morbidities and leverage that to reveal more of the underlying pathways involved in the pathophysiology.


Gene-based cross disorder analysis

Colm O'Dushlaine, Irwin Waldman, Jordan Smoller, Gerome Breen, Valentina Escott-Price, Steve Faraone

The project aims to conduct a set of gene-based cross disorder analyses of schizophrenia (SZ), bipolar disorder (BP), and major depressive disorder (MDD). To date, the PGC-NPA group has conducted pathway based analyses and the cross-disorder group has conducted cross-disorder SNP-based analyses. An important analysis that remains, however, is at the gene level, given that cross-disorder allelic heterogeneity might reduce what would otherwise be consistent signals across disorders. Gene-based tests may reveal stronger and more significant associations than SNP-based tests, and examining associations of genes with these disorders simultaneously might reveal stronger and more consistent effects than associations with each disorder individually. Using multiple methods has particular benefits as each method adjusts for LD differently and has different biases for gene size, number of markers, etc. An averaged or combined test statistic across methods will give the best results.


Can disorder-specific effects of SNPs be distinguished in a combined genome-wide association analysis of five psychiatric disorders? 

Steve Faraone, Susan Santangelo, Jimmy Potash, John R. Kelsoe, Tiffany Greenwood, Manuel Mattheisen, Stephen J. Glatt, Jonathan Hess

The purpose of this analysis is to test the discriminating effects of polymorphic SNPs in a combined genome-wide association analysis of the five primary PGC disorders (attention-deficit/hyperactivity disorder, autism spectrum disorder, schizophrenia, bipolar disorder, and major depression). Our aim is to apply a multinomial logistic regression model to test the predictive effect of individual SNPs on the outcome of affected status, with the aim of identifying the most disorder-specific or -selective SNPs. Specifically, the goals of the analyses are to: i) identify SNPs conferring a reliable effect on each disorder while not increasing risk for the other four disorders; ii) identify subsets of cross-disorder associated SNPs that are significantly more strongly weighted to one disorder than the other four disorders; and iii) characterize the biology of the disorder-specific genes via pathway and ontology enrichment analyses.


Sex-dependent genetic architecture shared across mood and psychotic disorders

Jordan Smoller, Jill Goldstein, Chris Cotsapas, Myles Brown, Matt Freedman, Stuart Tobet

Sex differences are pervasive in psychiatric disorders, including major depressive disorder (MDD), schizophrenia (SCZ) and other psychoses, and anxiety (ANX) disorders. We have previously demonstrated that shared pathways involve excess maternal gestational glucocorticoids (indicator of “prenatal stress”) disrupting fetal gamma aminobutyric acid (GABA) signaling in conjunction with hormone, growth, and inflammatory factors. The genes in these pathways also contribute to sex-dependent alterations in neuronal development in stress response circuitry in disease-relevant brain regions. Thus, based on prenatal stress model evidence in preclinical studies and our own work in humans, we hypothesize that there are sex-dependent developmental neurobiological pathways shared across these disorders. We will test this hypothesis in two ways: (1) Look for enrichment of genetic risk burden in sex-dependent hormone binding elements in relevant cell types from the Epigenomics Roadmap Project; (2) Perform pathway analyses using standard tools to examine the cross-disorder enrichment of GABA signaling, glucocorticoid and immune pathways that contribute to sex-dependent effects.


Risk prediction using multivariate analysis for PGC cross disorders

S. Hong Lee, Naomi Wray

We estimated genetic correlations among 5 psychiatric disorders and found substantial genetic overlap between SCZ/BIP, SCZ/MDD and BIP/MDD (Lee et al, 2013). We used a bivariate model for each pair of disorders. The genetic sharing between disorders can be leveraged to increase accuracy of risk prediction. We propose to use multivariate analysis to predict individual risk for the disorders. We will use multivariate REML applied to the combined data of all 5 PGC cross-disorders (SCZ/BIP/MDD/AUT/ADHD). We will estimate predictors for individual risk and compare them to those from a conventional model (e.g. univariate model). We will also do a functional annotation analysis with the multivariate model and investigate whether the prediction accuracy increases. The goal of this proposal is to develop an approach to increase the accuracy of individual risk prediction.

FAQ: Proposals for Cross-Disorder Analyses

    1. What is a PGC Cross-Disorder Analysis? 
      A cross-disorder analysis (CDA) is one that incorporates data from more than one of the PGC Disorder Workgroups.
    2. Who may submit a CDA proposal? 
      Any PGC investigator may submit a proposal. Also, multiple investigators may jointly propose an analysis.
    3. What steps should I take before submitting a CDA proposal? 
      In order to maintain the PGC’s spirit of collaboration and transparency, proposals should be discussed with and acceptable to the Disorder Workgroups whose data will be included. In order to avoid redundancy with other CDAs, investigators wishing to propose an analysis should check the webpage of ongoing CDAs (url: TBD) which lists such analyses, the proposers, and working groups. If the analysis you intended to propose is redundant with another ongoing analysis, you may wish to join the working group for that analysis or modify your proposal to remove overlap. Note that the guidelines and procedures for proposals are likely to evolve as experience with the process accumulates, but the over-riding aims will be to be inclusive in facilitating high quality, non-redundant analyses. 
    4. How do I submit a proposal? 
      Proposals should be submitted by completing the two-page Cross-Disorder Analysis Template. The proposal should indicate the names, contact information, and Workgroup affiliations of the proposing investigators, the Workgroups whose data will be involved, and brief descriptions of the research question(s), analytic plan, analytic personnel, resources needed, and timeline. Proposals can be submitted through a CDG representative of one of the workgroups whose data will be involved.
    5. How are proposals handled? 
      After the proposal is submitted, it will be discussed at the next available teleconference of the CDG (held every 2nd Wednesday of the month at 9-10 am Eastern U.S. time). Proposals will be evaluated primarily for feasibility and redundancy with other ongoing PGC cross-disorder analyses. If a proposal appears to be unfeasible or redundant, it will be discussed with the proposing investigators to see whether modifications can be made to address these issues.
    6. What happens after the CDA proposal has been reviewed by the CDG? 
      Following approval by the CDG, the proposers should form a working group that will be responsible for conducting and reporting the analysis. To allow participation by other PGC investigators, the proposing investigators should solicit working group members by circulating the proposal to the Disorder Workgroups whose data will be involved. The proposal and list of working group members will also be posted to the PGC webpage listing ongoing CDG analyses. Please note that while the CDG can provide review and feedback on proposals, we will not be able to provide analytic support. Resources for data analysis (including analyst effort) will be the responsibility of the proposing workgroup.
    7. How can the working group access variables needed to perform the analysis? 
      The CDG is working on gathering and harmonizing a set of key variables for CDAs. If the variables needed for your analysis are included in this set, you can access the data through the usual PGC channels for data analysis. If your proposal requires variables that have not been gathered by the Disorder Workgroups or CDG group, you will be responsible for gathering these additional variables and having them deposited in the PGC database on the Netherlands Genetic Cluster Computer. Consistent with the rules of the PGC, all analyses of PGC data must be performed on the Netherlands Genetic Cluster Computer. Note that no individual-level data can be moved off the cluster in any form (aggregate results can be transferred). All analysts must read and follow the steps in the "gatekeeper" document (attached).
    8. What is the timetable for performing cross-disorder analyses? 
      In order to ensure timely completion of cross-disorder analyses and avoid unnecessary “bottlenecks”, the proposing workgroup will have a 6-month “proprietary period” within which to complete the proposed analyses and draft a manuscript.
    9. What procedures should be followed for cross-disorder manuscripts and publications? 
      The designated CDA writing groups will have primary responsibility for drafting results from the cross-disorder analyses. As with other PGC publications, manuscripts will be published under a PGC consortium byline. A listing of the members of the writing group and their contributions should be included at the end of the manuscript. In addition, manuscripts should include a standard acknowledgement of Members of the CDG and other PGC Workgroups. Because each manuscript will reflect the work of the PGC, completed manuscripts should be submitted to the PGC Coordinating Committee for approval prior to submission.