Genetic Testing Registry Dataset

The Genetic Testing Registry (GTR) is an NCBI dataset that publishes information on specific genetic tests offered by various institutions. Like ClinVar, it is available via FTP and E-utilities. Some tests focus on a particular gene (e.g. BRCA1); some comprise disease or condition “panels”; and some entries in GTR bundle the genes most commonly implicated in a class of disease (especially cancer) into a single test.

For example, a typical disease testing panel for HHT (hereditary hemorrhagic telangiectasia) should encompass at least the ENG and ACVRL1 genes, and potentially also SMAD4. These genes are bundled into “panel” groupings (for example, this HHT Panel by GeneDx) so that a genetic test provider can respond comprehensively to a doctor’s diagnostic indications for a patient.

As with the ClinVar dataset, the GTR data will be imported into MySQL for analysis using the medgen-mysql toolkit.

Technical Details: Access and Manipulation

Excerpt from a technical discussion of the contents of GTR and its methods of access:

NIH Genetic Testing Registry (GTR): A data mine available through programmatic access. B. Kattman, A. Malheiro, J. Lee, D. Maglott, V. Hem, M. Ovetsky, G. Song, C. Wallin, K. Katz, R. Villamarin-Salomon, B. Gu, S. Chitipiralla, W. Rubinstein. NCBI/NLM/NIH, Bethesda, MD.

The NIH Genetic Testing Registry (GTR; http://www.ncbi.nlm.nih.gov/gtr/) houses detailed information on more than 17,000 genetic tests voluntarily submitted by laboratory test providers. Interest in programmatic access to GTR data has burgeoned as GTR has become the world’s most comprehensive repository of publicly available data about genetic tests. The GTR website supports interactive data access for a high volume of daily users. Recently the entire GTR dataset was made available through XML files from the ftp site, as well as summary data via NCBI’s application programming interface, E-utilities.

The GTR includes germline and somatic tests using molecular, cytogenetic and/or biochemical methodologies. Tests for drug responses, and complex panels utilizing next-generation sequencing or array technologies, are in scope. As of June 2014, GTR has submissions from more than 400 testing laboratories from 39 countries. In the past 7 months there has been a 66% increase in the number of registered tests. Registered tests evaluate 4,456 conditions, with molecular tests targeting 3,014 genes (91% of clinically relevant genes reported by ClinVar). Submission of complex tests is also increasing, with 651 tests that evaluate 5 or more genes. Next-generation sequencing (NGS) is a component of 9.2% of molecular tests. Data surrounding the evidentiary basis of tests are available for a growing number of records: Analytical validity (100%), Target population (36%), Clinical validity (14%), and Clinical utility (15%). Information is also available about proficiency testing, FDA approval/clearance, laboratory certification and much more.

GTR has long maintained an FTP site to support unrestricted access to standard terminologies and provide summary data (ftp://ftp.ncbi.nlm.nih.gov/pub/GTR/_README.html). In response to stakeholders, GTR added the comprehensive extraction of test data as XML. In addition, data are accessible by NCBI programmatic tools (www.ncbi.nlm.nih.gov/gtr/docs/maintenance_use/).

GTR has a mission to improve transparency surrounding genetic testing, and is being sought by a wide variety of stakeholders interested in surveying the genetic testing landscape.
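
The E-utilities access mentioned in the excerpt can be sketched roughly as follows. This is a minimal illustration, not the method used in this post: it assumes the standard ESearch endpoint with "gtr" as the database name and JSON output, and the helper names gtr_esearch_url and gtr_search_ids are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of NCBI E-utilities (assumed here; check the NCBI docs before relying on it).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gtr_esearch_url(term, retmax=20):
    """Build an ESearch URL that searches the GTR database for `term`."""
    params = {"db": "gtr", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def gtr_search_ids(term, retmax=20):
    """Run the search over the network and return the matching record UIDs."""
    with urlopen(gtr_esearch_url(term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Only prints the URL; call gtr_search_ids("FMR1") to actually query NCBI.
    print(gtr_esearch_url("FMR1"))
```

The URL builder is kept separate from the network call so the query can be inspected or logged before anything is sent to NCBI.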

Tabular Data of Interest

mysql> show tables;
+--------------------------+
| Tables_in_GTR            |
+--------------------------+
| README                   |
| labs_tests_by_country    |
| log                      |
| mode_of_inheritance      |
| test_condition_gene      |
| test_version             |
| tests_by_method_category |
+--------------------------+

mysql> describe test_condition_gene;
+----------------------+---------------+------+-----+---------+-------+
| Field                | Type          | Null | Key | Default | Extra |
+----------------------+---------------+------+-----+---------+-------+
| test_accession_ver   | varchar(20)   | YES  | MUL | NULL    |       |
| test_type            | varchar(20)   | YES  | MUL | NULL    |       |
| concept_type         | varchar(20)   | YES  | MUL | NULL    |       |
| GTR_identifier       | varchar(20)   | YES  | MUL | NULL    |       |
| MIM_number           | varchar(20)   | YES  | MUL | NULL    |       |
| umls_name            | varchar(1000) | YES  |     | NULL    |       |
| gene_or_SNOMED_CT_ID | varchar(20)   | YES  | MUL | NULL    |       |
| Symbol               | varchar(25)   | YES  |     | NULL    |       |
+----------------------+---------------+------+-----+---------+-------+

mysql> select * from test_condition_gene limit 10;
+--------------------+-----------+--------------+----------------+------------+-------------------------------------+----------------------+--------+
| test_accession_ver | test_type | concept_type | GTR_identifier | MIM_number | umls_name                           | gene_or_SNOMED_CT_ID | Symbol |
+--------------------+-----------+--------------+----------------+------------+-------------------------------------+----------------------+--------+
| GTR000004006.1     | Clinical  | condition    | C0016667       | 300624     | Fragile X syndrome                  | 613003               | NULL   |
| GTR000004006.1     | Clinical  | gene         | C1414649       | 309550     | FMR1:fragile X mental retardation 1 | 2332                 | FMR1   |
| GTR000004034.3     | Clinical  | condition    | C0016667       | 300624     | Fragile X syndrome                  | 613003               | NULL   |
| GTR000004034.3     | Clinical  | gene         | C1414649       | 309550     | FMR1:fragile X mental retardation 1 | 2332                 | FMR1   |
| GTR000004036.1     | Clinical  | condition    | C1839780       | 300623     | Fragile X tremor/ataxia syndrome    |                      | NULL   |
| GTR000004036.1     | Clinical  | condition    | C0016667       | 300624     | Fragile X syndrome                  | 613003               | NULL   |
| GTR000004036.1     | Clinical  | condition    | C2749126       | 311360     | Premature ovarian failure 1         |                      | NULL   |
| GTR000004036.1     | Clinical  | gene         | C1414649       | 309550     | FMR1:fragile X mental retardation 1 | 2332                 | FMR1   |
| GTR000004040.3     | Clinical  | condition    | C0016667       | 300624     | Fragile X syndrome                  | 613003               | NULL   |
| GTR000004040.3     | Clinical  | gene         | C1414649       | 309550     | FMR1:fragile X mental retardation 1 | 2332                 | FMR1   |
+--------------------+-----------+--------------+----------------+------------+-------------------------------------+----------------------+--------+
10 rows in set (0.00 sec)
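
Because each test appears once per condition and once per gene, a common first step is to regroup the rows by test. The sketch below shows one way to do that. It is only an illustration: it loads six rows copied from the sample output into an in-memory SQLite table (a stand-in for the MySQL GTR database) and uses two hypothetical helpers, genes_by_test and conditions_by_test. The SELECT statements are plain SQL and should run against the MySQL table as well.

```python
import sqlite3

# Six rows copied from the sample output above (two tests).
ROWS = [
    ("GTR000004006.1", "Clinical", "condition", "C0016667", "300624",
     "Fragile X syndrome", "613003", None),
    ("GTR000004006.1", "Clinical", "gene", "C1414649", "309550",
     "FMR1:fragile X mental retardation 1", "2332", "FMR1"),
    ("GTR000004036.1", "Clinical", "condition", "C1839780", "300623",
     "Fragile X tremor/ataxia syndrome", "", None),
    ("GTR000004036.1", "Clinical", "condition", "C0016667", "300624",
     "Fragile X syndrome", "613003", None),
    ("GTR000004036.1", "Clinical", "condition", "C2749126", "311360",
     "Premature ovarian failure 1", "", None),
    ("GTR000004036.1", "Clinical", "gene", "C1414649", "309550",
     "FMR1:fragile X mental retardation 1", "2332", "FMR1"),
]

# In-memory stand-in for the MySQL table, with the same column names.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE test_condition_gene (
           test_accession_ver TEXT, test_type TEXT, concept_type TEXT,
           GTR_identifier TEXT, MIM_number TEXT, umls_name TEXT,
           gene_or_SNOMED_CT_ID TEXT, Symbol TEXT)"""
)
conn.executemany(
    "INSERT INTO test_condition_gene VALUES (?, ?, ?, ?, ?, ?, ?, ?)", ROWS
)


def _group(conn, column, concept_type):
    """Map each test accession to the sorted values of `column` for one concept type."""
    query = (
        f"SELECT test_accession_ver, {column} FROM test_condition_gene "
        "WHERE concept_type = ? ORDER BY test_accession_ver, " + column
    )
    grouped = {}
    for accession, value in conn.execute(query, (concept_type,)):
        grouped.setdefault(accession, []).append(value)
    return grouped


def genes_by_test(conn):
    """Gene symbols covered by each test."""
    return _group(conn, "Symbol", "gene")


def conditions_by_test(conn):
    """Condition names covered by each test."""
    return _group(conn, "umls_name", "condition")


if __name__ == "__main__":
    print(genes_by_test(conn))
    print(conditions_by_test(conn))
```

Counting the entries in each list from genes_by_test is one simple way to spot multi-gene panels.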

Experiment Codebook

test_condition_gene.GTR_identifier: identifier of the condition or gene concept described by the row (for example, C0016667 for Fragile X syndrome in the sample above). It is not unique per row: the same identifier repeats for every test that covers that condition or gene.

test_condition_gene.concept_type: “condition” or “gene”. If concept_type is “condition”, the Symbol field will be NULL or empty.

test_condition_gene.Symbol: the HUGO (HGNC) gene symbol for the gene being tested, e.g. FMR1.

test_condition_gene.test_type: “Research” or “Clinical”. Clinical tests are intended to be used by doctors for patient care; Research tests must be treated as being for scientific research purposes only.
