How To

View PGC Data Access Director Lea Davis' Presentation "The PGC Data Access Portal and Genomic Privacy"


Find Available Datasets

  1. GWAS summary statistics are available without restriction on the Downloads page.

  2. Access to individual-level data for each phenotype may be requested through the Data Access Portal. Consult your PGC workgroup DAC representative for a list of available datasets.

Request Individual Level Data

  1. First, consult your workgroup DAC representative to confirm that the data you are requesting are appropriate for your analysis plan. Your DAC representative can also tell you whether any special permissions are required. Allow at least a week for your DAC representative to respond.

  2. You will need to agree to the terms of the analyst memo and send a signed copy to your workgroup chair.

  3. You will need to obtain an account on the LISA (GCC) cluster. This usually takes only a couple of days.

  4. You will write a secondary analysis proposal and submit it to your workgroup chair (and to the cross-disorder chair, if necessary). Secondary analysis proposals are typically reviewed within three weeks.

  5. All workgroups have a “fast-track” data package that is available once the workgroup approves the proposal. For most phenotypes, the vast majority of the data is available through this package. However, some additional datasets may require explicit permission from individual investigators, dbGaP, NIMH, or a sponsoring foundation. You will need to secure any special permissions for datasets not included in the “fast-track” data package. This is the most time-consuming step and can take anywhere from 3 to 6 weeks.

  6. Once you have an approved proposal and documentation of any “special permissions”, you may request data through the Data Access Portal.
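
For planning purposes, the time estimates quoted in the steps above can be added up. The following is a rough sketch only, not an official PGC timeline: it assumes the steps happen one after another (in practice some may overlap), treats "at least a week" as 7 days and "a couple of days" as 2 days, and uses the three-week review figure as an upper bound. The function and variable names are illustrative.

```python
# Rough lead-time estimate, in days, from the figures quoted in the steps above.
# Assumption: the steps run strictly one after another; some may overlap in practice.
BASE_STEPS_DAYS = {
    "DAC representative consultation": 7,  # "at least a week" (taken as 7 days)
    "LISA (GCC) account": 2,               # "a couple of days" (taken as 2 days)
    "Proposal review": 21,                 # "within three weeks" (upper bound)
}

# "3-6 weeks" for datasets outside the fast-track package.
SPECIAL_PERMISSIONS_DAYS = (21, 42)


def estimate_lead_time(needs_special_permissions: bool) -> tuple[int, int]:
    """Return a (shortest, longest) estimate in days before data can be requested."""
    base = sum(BASE_STEPS_DAYS.values())
    if not needs_special_permissions:
        # Fast-track package only: no extra permission step.
        return (base, base)
    low, high = SPECIAL_PERMISSIONS_DAYS
    return (base + low, base + high)


if __name__ == "__main__":
    print("Fast-track only:", estimate_lead_time(False))
    print("With special permissions:", estimate_lead_time(True))
```

Under these assumptions, a fast-track request takes roughly a month, and a request needing special permissions takes roughly seven to ten weeks.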

Write a Secondary Analysis Proposal

The secondary analysis proposal includes a brief description of the rationale for the proposal, the analytic plans, the data being requested, the individuals who will be involved in the work, the timeline, and the plans for publication. When deciding how much detail to include, err on the side of too much rather than too little, so that the workgroup has enough information to make a recommendation. Approval is granted for the specific project that you describe. We realize that analytic strategies can change during the course of an analysis. If your plan has shifted significantly, speak with your DAC representative to determine whether the project has changed enough to be considered a new project.

Gain Access to LISA (the Genetic Cluster Computer, GCC)

Most PGC analyses are done on LISA, also known as the Genetic Cluster Computer (GCC), in the Netherlands. You or someone in your group will need an account on this cluster in order to access individual-level genotype data.

You can apply for a LISA/GCC account here.

The PGC is deeply grateful to our colleagues in the Netherlands. The GCC is supported by a Netherlands Organisation for Scientific Research Medium Investment grant (480-05-003) to Prof Danielle Posthuma, by the VU University Amsterdam, and by the Dutch Brain Foundation to Prof Roel Ophoff. The GCC is hosted by the Dutch National Computing and Networking Services.