Publications by year
In Press
Zhang X, Wakeling M, Ware J, Whiffin N (In Press). Annotating high-impact 5’untranslated region variants with the UTRannotator.
Abstract:
Summary: Current tools to annotate the predicted effect of genetic variants are heavily biased towards protein-coding sequence. Variants outside of these regions may have a large impact on protein expression and/or structure and can lead to disease, but this effect can be challenging to predict. Consequently, these variants are poorly annotated using standard tools. We have developed a plugin to the Ensembl Variant Effect Predictor, the UTRannotator, that annotates variants in 5’untranslated regions (5’UTR) that create or disrupt upstream open reading frames (uORFs). We investigate the utility of this tool using the ClinVar database, providing an annotation for 30.8% of all 5’UTR (likely) pathogenic variants, and highlighting 31 variants of uncertain significance as candidates for further follow-up. We will continue to update the UTRannotator as we gain new knowledge on the impact of variants in UTRs. Availability and implementation: UTRannotator is freely available on GitHub: https://github.com/ImperialCardioGenetics/UTRannotator. Supplementary information: Supplementary data are available at bioRxiv.
Laver TW, Wakeling M, Hua JHY, Houghton J, Hussain K, Ellard S, Flanagan S (In Press). Comprehensive screening shows that mutations in the known syndromic genes are rare in infants presenting with hyperinsulinaemic hypoglycaemia.
Clinical Endocrinology.
Alakbarzade V, Iype T, Chioza BA, Harlalka GV, Singh R, Hardy H, Sreekantan-Nair A, Proukakis C, Kathryn J P, Clark LN, et al (In Press). Copy number variation of LINGO1 in familial dystonic tremor.
Neurology Genetics.
De Franco E, Caswell R, Johnson M, Wakeling M, Zung A, Dũng VC, Bích Ngọc CT, Goonetilleke R, Vivanco Jury M, El-Khateeb M, et al (In Press). De novo mutations in EIF2B1 affecting eIF2 signaling cause neonatal/early onset diabetes and transient hepatic dysfunction.
Diabetes.
Laver TW, De Franco E, Johnson MB, Patel K, Ellard S, Weedon MN, Flanagan SE, Wakeling MN (In Press). SavvyCNV: genome-wide CNV calling from off-target reads.
Abstract:
Identifying copy number variants (CNVs) can provide diagnoses to patients and provide important biological insights into human health and disease. Current exome and targeted sequencing approaches cannot detect clinically and biologically-relevant CNVs outside their target area. We present SavvyCNV, a tool which uses off-target read data to call CNVs genome-wide. Up to 70% of sequencing reads from exome and targeted sequencing fall outside the targeted regions - SavvyCNV exploits this ‘free data’. We benchmarked SavvyCNV using truth sets generated from genome sequencing data and Multiplex Ligation-dependent Probe Amplification assays. SavvyCNV called CNVs with high precision and recall, outperforming five state-of-the-art CNV callers at calling CNVs genome-wide using off-target or on-target reads from targeted panel and exome sequencing. Furthermore, SavvyCNV was able to call previously undetected clinically-relevant CNVs from targeted panel data, highlighting the utility of this tool within the diagnostic setting. SavvyCNV is freely available.
2020
Zhang X, Wakeling M, Ware J, Whiffin N (2020). Annotating high-impact 5'untranslated region variants with the UTRannotator.
Bioinformatics.
Abstract:
SUMMARY: Current tools to annotate the predicted effect of genetic variants are heavily biased towards protein-coding sequence. Variants outside of these regions may have a large impact on protein expression and/or structure and can lead to disease, but this effect can be challenging to predict. Consequently, these variants are poorly annotated using standard tools. We have developed a plugin to the Ensembl Variant Effect Predictor, the UTRannotator, that annotates variants in 5'untranslated regions (5'UTR) that create or disrupt upstream open reading frames (uORFs). We investigate the utility of this tool using the ClinVar database, providing an annotation for 31.9% of all 5'UTR (likely) pathogenic variants, and highlighting 31 variants of uncertain significance as candidates for further follow-up. We will continue to update the UTRannotator as we gain new knowledge on the impact of variants in UTRs. AVAILABILITY AND IMPLEMENTATION: UTRannotator is freely available on Github: https://github.com/ImperialCardioGenetics/UTRannotator. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Wakeling MN, Laver TW, Colclough K, Parish A, Ellard S, Baple EL (2020). Misannotation of multiple-nucleotide variants risks misdiagnosis.
Wellcome Open Research,
4, 145-145.
Abstract:
Multiple Nucleotide Variants (MNVs) are miscalled by the most widely utilised next generation sequencing analysis (NGS) pipelines, presenting the potential for missing diagnoses. These variants, which should be treated as a single insertion-deletion mutation event, are commonly called as separate single nucleotide variants. This can result in misannotation, incorrect amino acid predictions and potentially false positive and false negative diagnostic results. Using simulated data and re-analysis of sequencing data from a diagnostic targeted gene panel, we demonstrate that the widely adopted pipeline, GATK best practices, results in miscalling of MNVs and that alternative tools can call these variants correctly. The adoption of calling methods that annotate MNVs correctly would present a solution for individual laboratories, however GATK best practices are the basis for important public resources such as the gnomAD database. We suggest integrating a solution into these guidelines would be the optimal approach.
Banerjee I, Senniappan S, Laver TW, Caswell R, Zenker M, Mohnike K, Cheetham T, Wakeling MN, Ismail D, Lennerz B, et al (2020). Refinement of the critical genomic region for congenital hyperinsulinism in the Chromosome 9p deletion syndrome.
Wellcome Open Research,
4, 149-149.
Abstract:
Background: Large contiguous gene deletions at the distal end of the short arm of chromosome 9 result in the complex multi-organ condition chromosome 9p deletion syndrome. A range of clinical features can result from these deletions with the most common being facial dysmorphisms and neurological impairment. Congenital hyperinsulinism is a rarely reported feature of the syndrome with the genetic mechanism for the dysregulated insulin secretion being unknown. Methods: We studied the clinical and genetic characteristics of 12 individuals with chromosome 9p deletions who had a history of neonatal hypoglycaemia. Using off-target reads generated from targeted next-generation sequencing of the genes known to cause hyperinsulinaemic hypoglycaemia (n=9), or microarray analysis (n=3), we mapped the minimal shared deleted region on chromosome 9 in this cohort. Targeted sequencing was performed in three patients to search for a recessive mutation unmasked by the deletion. Results: In 10/12 patients with hypoglycaemia, hyperinsulinism was confirmed biochemically. A range of extra-pancreatic features were also reported in these patients consistent with the diagnosis of the Chromosome 9p deletion syndrome. The minimal deleted region was mapped to 7.2 Mb, encompassing 38 protein-coding genes. In silico analysis of these genes highlighted SMARCA2 and RFX3 as potential candidates for the hypoglycaemia. Targeted sequencing performed on three of the patients did not identify a second disease-causing variant within the minimal deleted region. Conclusions: This study identifies 9p deletions as an important cause of hyperinsulinaemic hypoglycaemia and increases the number of cases reported with 9p deletions and hypoglycaemia to 15, making this a more common feature of the syndrome than previously appreciated.
Whilst the precise genetic mechanism of the dysregulated insulin secretion could not be determined in these patients, mapping the deletion breakpoints highlighted potential candidate genes for hypoglycaemia within the deleted region.
De Franco E, Lytrivi M, Ibrahim H, Montaser H, Wakeling M, Fantuzzi F, Patel K, Demarez C, Cai Y, Igoillo-Esteve M, et al (2020). YIPF5 mutations cause neonatal diabetes and microcephaly through endoplasmic reticulum stress.
Journal of Clinical Investigation,
130.
2019
Lytrivi M, De Franco E, Patel KA, Esteve MI, Cosentino C, Wakeling MN, Haliloglu B, Yildiz M, Godbole T, Hattersley AT, et al (2019). A Novel Genetic Syndrome of Early-Onset Diabetes, Microcephaly, and Epilepsy Due to Homozygous YIPF5 Mutations.
Baptista J, Stals K, De Franco E, Fryer V, Wakeling M, Parrish A, Mallin L, Bussell A, Settle J, Gunning AC, et al (2019). A gene-agnostic trio exome sequencing strategy outperforms gene panel analysis.
Baptista J, Stals K, De Franco E, Mallin L, Fryer V, Wakeling M, Parrish A, Johnson A, Settle J, Caswell R, et al (2019). A gene-agnostic trio exome strategy maximises diagnostic yield by uncovering disease-causing variants in newly discovered disease genes.
Rawlins LE, Jones H, Wenger O, Aye M, Fasham J, Harlalka GV, Chioza BA, Miron A, Ellard S, Wakeling M, et al (2019). An Amish founder variant consolidates disruption of CEP55 as a cause of hydranencephaly and renal dysplasia.
Eur J Hum Genet,
27(4), 657-662.
Abstract:
The centrosomal protein 55 kDa (CEP55 (OMIM 610000)) plays a fundamental role in cell cycle regulation and cytokinesis. However, the precise role of CEP55 in human embryonic growth and development is yet to be fully defined. Here we identified a novel homozygous founder frameshift variant in CEP55, present at low frequency in the Amish community, in two siblings presenting with a lethal foetal disorder. The features of the condition are reminiscent of a Meckel-like syndrome comprising of Potter sequence, hydranencephaly, and cystic dysplastic kidneys. These findings, considered alongside two recent studies of single families reporting loss of function candidate variants in CEP55, confirm disruption of CEP55 function as a cause of this clinical spectrum and enable us to delineate the cardinal clinical features of this disorder, providing important new insights into early human development.
Wakeling MN, Laver TW, Wright CF, De Franco E, Stals KL, Patch A-M, Hattersley AT, Flanagan SE, Ellard S, DDD Study, et al (2019). Correction: Homozygosity mapping provides supporting evidence of pathogenicity in recessive Mendelian disease.
Genet Med,
21(3).
Abstract:
The original version of this Article contained an error in the top left of Figure 2: the number 1 on the y-axis had been changed to 0 during the typesetting process. This has now been corrected in both the PDF and HTML versions of the Article.
Wakeling MN, De Franco E, Laver TW, Flanagan SE, Johnson M, Patel K, Hattersley AT, Ellard S (2019). Homozygosity mapping from small targeted NGS panels using SavvyHomozygosity - getting more from less.
Wakeling MN, Laver TW, Wright CF, De Franco E, Stals KL, Patch A-M, Hattersley AT, Flanagan SE, Ellard S, DDD Study, et al (2019). Homozygosity mapping provides supporting evidence of pathogenicity in recessive Mendelian disease.
Genet Med,
21(4), 982-986.
Abstract:
PURPOSE: One of the greatest challenges currently facing those studying Mendelian disease is identifying the pathogenic variant from the long list produced by a next-generation sequencing test. We investigate the predictive ability of homozygosity mapping for identifying the regions likely to contain the causative variant. METHODS: We use 179 homozygous pathogenic variants from three independent cohorts to investigate the predictive power of homozygosity mapping. RESULTS: We demonstrate that homozygous pathogenic variants in our cohorts are disproportionately likely to be found within one of the largest regions of homozygosity: 80% of pathogenic variants are found in a homozygous region that is in the ten largest regions in a sample. The maximal predictive power is achieved in patients with 3 Mb from a telomere; this gives an area under the curve (AUC) of 0.735 and results in 92% of the causative variants being in one of the ten largest homozygous regions. CONCLUSION: This predictive power can be used to prioritize the list of candidate variants in gene discovery studies. When classifying a homozygous variant the size and rank of the region of homozygosity in which the candidate variant is located can also be considered as supporting evidence for pathogenicity.
De Franco E, Wakeling MN, Johnson MB, Flanagan SE, Ellard S, Hattersley AT (2019). Integration of research within clinical care identifies 14 novel genetic causes of neonatal diabetes.
Laver TW, Wakeling MN, Caswell R, Bunce B, Houghton JAL, Patel KA, Hussain K, Ellard S, Flanagan S (2019). Large deletions are an underappreciated cause of hyperinsulinism.
Wakeling MN, Laver TW, Colclough K, Parish A, Ellard S, Baple EL (2019). Misannotation of multiple-nucleotide variants risks misdiagnosis.
Wellcome Open Research,
4, 145-145.
Abstract:
Multiple Nucleotide Variants (MNVs) are miscalled by the most widely utilised next generation sequencing analysis (NGS) pipelines, presenting the potential for missing diagnoses that would previously have been made by standard Sanger (dideoxy) sequencing. These variants, which should be treated as a single insertion-deletion mutation event, are commonly called as separate single nucleotide variants. This can result in misannotation, incorrect amino acid predictions and potentially false positive and false negative diagnostic results. This risk will be increased as confirmatory Sanger sequencing of Single Nucleotide variants (SNVs) ceases to be standard practice. Using simulated data and re-analysis of sequencing data from a diagnostic targeted gene panel, we demonstrate that the widely adopted pipeline, GATK best practices, results in miscalling of MNVs and that alternative tools can call these variants correctly. The adoption of calling methods that annotate MNVs correctly would present a solution for individual laboratories, however GATK best practices are the basis for important public resources such as the gnomAD database. We suggest integrating a solution into these guidelines would be the optimal approach.
Van Bergen NJ, Guo Y, Rankin J, Paczia N, Becker-Kettern J, Kremer LS, Pyle A, Conrotte J-F, Ellaway C, Procopis P, et al (2019). NAD(P)HX dehydratase (NAXD) deficiency: a novel neurodegenerative disorder exacerbated by febrile illnesses.
Brain,
142(1), 50-58.
Abstract:
Physical stress, including high temperatures, may damage the central metabolic nicotinamide nucleotide cofactors [NAD(P)H], generating toxic derivatives [NAD(P)HX]. The highly conserved enzyme NAD(P)HX dehydratase (NAXD) is essential for intracellular repair of NAD(P)HX. Here we present a series of infants and children who suffered episodes of febrile illness-induced neurodegeneration or cardiac failure and early death. Whole-exome or whole-genome sequencing identified recessive NAXD variants in each case. Variants were predicted to be potentially deleterious through in silico analysis. Reverse-transcription PCR confirmed altered splicing in one case. Subject fibroblasts showed highly elevated concentrations of the damaged cofactors S-NADHX, R-NADHX and cyclic NADHX. NADHX accumulation was abrogated by lentiviral transduction of subject cells with wild-type NAXD. Subject fibroblasts and muscle biopsies showed impaired mitochondrial function, higher sensitivity to metabolic stress in media containing galactose and azide, but not glucose, and decreased mitochondrial reactive oxygen species production. Recombinant NAXD protein harbouring two missense variants leading to the amino acid changes p.(Gly63Ser) and p.(Arg608Cys) were thermolabile and showed a decrease in Vmax and increase in KM for the ATP-dependent NADHX dehydratase activity. This is the first study to identify pathogenic variants in NAXD and to link deficient NADHX repair with mitochondrial dysfunction. The results show that NAXD deficiency can be classified as a metabolite repair disorder in which accumulation of damaged metabolites likely triggers devastating effects in tissues such as the brain and the heart, eventually leading to early childhood death.
Van Bergen NJ, Guo Y, Rankin J, Paczia N, Becker-Kettern J, Kremer LS, Pyle A, Conrotte J, Ellaway CJ, Procopis P, et al (2019). NAXD mutations cause a novel neurodegenerative disorder exacerbated by febrile illnesses.
Banerjee I, Senniappan S, Laver TW, Caswell R, Zenker M, Mohnike K, Cheetham T, Wakeling MN, Ismail D, Lennerz B, et al (2019). Refinement of the critical genomic region for hypoglycaemia in the Chromosome 9p deletion syndrome.
Wellcome Open Research,
4, 149-149.
Abstract:
Background: Large contiguous gene deletions at the distal end of the short arm of chromosome 9 result in the complex multi-organ condition chromosome 9p deletion syndrome. A range of clinical features can result from these deletions with the most common being facial dysmorphisms and neurological impairment. Congenital hyperinsulinism is a rarely reported feature of the syndrome with the genetic mechanism for the dysregulated insulin secretion being unknown. Methods: We studied the clinical and genetic characteristics of 12 individuals with chromosome 9p deletions who had a history of neonatal hypoglycaemia. Using off-target reads generated from targeted next-generation sequencing of the genes known to cause hyperinsulinaemic hypoglycaemia (n=9), or microarray analysis (n=3), we mapped the minimal shared deleted region on chromosome 9 in this cohort. Targeted sequencing was performed in three patients to search for a recessive mutation unmasked by the deletion. Results: In 10/12 patients with hypoglycaemia, hyperinsulinism was confirmed biochemically. A range of extra-pancreatic features were also reported in these patients consistent with the diagnosis of the Chromosome 9p deletion syndrome. The minimal deleted region was mapped to 7.2 Mb, encompassing 38 protein-coding genes. In silico analysis of these genes highlighted SMARCA2 and RFX3 as potential candidates for the hypoglycaemia. Targeted sequencing performed on three of the patients did not identify a second disease-causing variant within the minimal deleted region. Conclusions: This study identifies 9p deletions as an important cause of hyperinsulinaemic hypoglycaemia and increases the number of cases reported with 9p deletions and hypoglycaemia to 15, making this a more common feature of the syndrome than previously appreciated.
Whilst the precise genetic mechanism of the dysregulated insulin secretion could not be determined in these patients, mapping the deletion breakpoints highlighted potential candidate genes for hypoglycaemia within the deleted region.
Johnson MBJ, De Franco E, Atma W Greeley S, Letourneau LR, Gillespie K, Wakeling MN, Ellard S, Flanagan SE, Patel K, Hattersley AT, et al (2019). Trisomy 21 is a Cause of Permanent Neonatal Diabetes that is Autoimmune but not HLA Associated.
Diabetes.
Yaghootkar H, Abbasi F, Ghaemi N, Rabbani A, Wakeling MN, Eshraghi P, Enayati S, Vakili S, Heidari S, Patel K, et al (2019). Type 1 diabetes genetic risk score discriminates between monogenic and Type 1 diabetes in children diagnosed at the age of <5 years in the Iranian population.
Diabet Med,
36(12), 1694-1702.
Abstract:
AIM: To examine the extent to which discriminatory testing using antibodies and Type 1 diabetes genetic risk score, validated in European populations, is applicable in a non-European population. METHODS: We recruited 127 unrelated children with diabetes diagnosed between 9 months and 5 years from two centres in Iran. All children underwent targeted next-generation sequencing of 35 monogenic diabetes genes. We measured three islet autoantibodies (islet antigen 2, glutamic acid decarboxylase and zinc transporter 8) and generated a Type 1 diabetes genetic risk score in all children. RESULTS: We identified six children with monogenic diabetes, including four novel mutations: homozygous mutations in WFS1 (n=3), SLC19A2 and SLC29A3, and a heterozygous mutation in GCK. All clinical features were similar in children with monogenic diabetes (n=6) and in the rest of the cohort (n=121). The Type 1 diabetes genetic risk score discriminated children with monogenic from Type 1 diabetes [area under the receiver-operating characteristic curve 0.90 (95% CI 0.83-0.97)]. All children with monogenic diabetes were autoantibody-negative. In children with no mutation, 59 were positive to glutamic acid decarboxylase, 39 to islet antigen 2 and 31 to zinc transporter 8. Measuring zinc transporter 8 increased the number of autoantibody-positive individuals by eight. CONCLUSIONS: The present study provides the first evidence that Type 1 diabetes genetic risk score can be used to distinguish monogenic from Type 1 diabetes in an Iranian population with a large number of consanguineous unions. This test can be used to identify children with a higher probability of having monogenic diabetes who could then undergo genetic testing. Identification of these individuals would reduce the cost of treatment and improve the management of their clinical course.
2018
Iacovazzo D, Flanagan S, Walker E, Quezado R, Antonio de Sousa F, Caswell R, Johnson MBJ, Wakeling M, Brandle M, Guo M, et al (2018). A MAFA missense mutation causes familial insulinomatosis and diabetes mellitus.
Proceedings of the National Academy of Sciences.
Stals KL, Wakeling M, Baptista J, Caswell R, Parrish A, Rankin J, Tysoe C, Jones G, Gunning AC, Lango Allen H, et al (2018). Diagnosis of lethal or prenatal-onset autosomal recessive disorders by parental exome sequencing.
Prenat Diagn,
38(1), 33-43.
Abstract:
OBJECTIVE: Rare genetic disorders resulting in prenatal or neonatal death are genetically heterogeneous, but testing is often limited by the availability of fetal DNA, leaving couples without a potential prenatal test for future pregnancies. We describe our novel strategy of exome sequencing parental DNA samples to diagnose recessive monogenic disorders in an audit of the first 50 couples referred. METHOD: Exome sequencing was carried out in a consecutive series of 50 couples who had 1 or more pregnancies affected with a lethal or prenatal-onset disorder. In all cases, there was insufficient DNA for exome sequencing of the affected fetus. Heterozygous rare variants (MAF
Baptista J, Stals KL, Wakeling M, Jones G, Parrish A, Bussell A, Caswell R, Tysoe C, Baple E, Ellard S, et al (2018). High diagnostic yield through a gene-agnostic trio exome sequencing strategy that identifies mutations in new and old rare disease genes.
Ruark E, Holt E, Renwick A, Münz M, Wakeling M, Ellard S, Mahamdallie S, Yost S, Rahman N (2018). ICR142 Benchmarker: evaluating, optimising and benchmarking variant calling performance using the ICR142 NGS validation series.
Wellcome Open Res,
3.
Abstract:
Evaluating, optimising and benchmarking of next generation sequencing (NGS) variant calling performance are essential requirements for clinical, commercial and academic NGS pipelines. Such assessments should be performed in a consistent, transparent and reproducible fashion, using independently, orthogonally generated data. Here we present ICR142 Benchmarker, a tool to generate outputs for assessing germline base substitution and indel calling performance using the ICR142 NGS validation series, a dataset of Illumina platform-based exome sequence data from 142 samples together with Sanger sequence data at 704 sites. ICR142 Benchmarker provides summary and detailed information on the sensitivity, specificity and false detection rates of variant callers. ICR142 Benchmarker also automatically generates a single page report highlighting key performance metrics and how performance compares to widely-used open-source tools. We used ICR142 Benchmarker with VCF files outputted by GATK, OpEx and DeepVariant to create a benchmark for variant calling performance. This evaluation revealed pipeline-specific differences and shared challenges in variant calling, for example in detecting indels in short repeating sequence motifs. We next used ICR142 Benchmarker to perform regression testing with DeepVariant versions 0.5.2 and 0.6.1. This showed that v0.6.1 improves variant calling performance, but there was evidence of minor changes in indel calling behaviour that may benefit from attention. The data also allowed us to evaluate filters to optimise DeepVariant calling, and we recommend using 30 as the QUAL threshold for base substitution calls when using DeepVariant v0.6.1. Finally, we used ICR142 Benchmarker with VCF files from two commercial variant calling providers to facilitate optimisation of their in-house pipelines and to provide transparent benchmarking of their performance. 
ICR142 Benchmarker consistently and transparently analyses variant calling performance based on the ICR142 NGS validation series, using the standard VCF input and outputting informative metrics to enable user understanding of pipeline performance. ICR142 Benchmarker is freely available at https://github.com/RahmanTeamDevelopment/ICR142_Benchmarker/releases.
Ruark E, Holt E, Renwick A, Münz M, Wakeling M, Ellard S, Mahamdallie S, Yost S, Rahman N (2018). ICR142 Benchmarker: evaluating, optimising and benchmarking variant calling using the ICR142 NGS validation series.
Wellcome Open Research,
3, 108-108.
Abstract:
Evaluating, optimising and benchmarking of next generation sequencing (NGS) variant calling performance are essential requirements for clinical, commercial and academic NGS pipelines. Such assessments should be performed in a consistent, transparent and reproducible fashion, using independently, orthogonally generated data. Here we present ICR142 Benchmarker, a tool to generate outputs for assessing variant calling performance using the ICR142 NGS validation series, a dataset of exome sequence data from 142 samples together with Sanger sequence data at 704 sites. ICR142 Benchmarker provides summary and detailed information on the sensitivity, specificity and false detection rates of variant callers. ICR142 Benchmarker also automatically generates a single page report highlighting key performance metrics and how performance compares to widely-used open-source tools. We used ICR142 Benchmarker with VCF files outputted by GATK, OpEx and DeepVariant to create a benchmark for variant calling performance. This evaluation revealed pipeline-specific differences and shared challenges in variant calling, for example in detecting indels in short repeating sequence motifs. We next used ICR142 Benchmarker to perform regression testing with versions 0.5.2 and 0.6.1 of DeepVariant. This showed that v0.6.1 improves variant calling performance, but there was evidence of some minor changes in indel calling behaviour that may benefit from attention in future updates. The data also allowed us to evaluate filters to optimise DeepVariant calling, and we recommend using 30 as the QUAL threshold for base substitution calls when using DeepVariant v0.6.1. Finally, we used ICR142 Benchmarker with VCF files from two commercial variant calling providers to facilitate optimisation of their in-house pipelines and to provide transparent benchmarking of their performance. 
ICR142 Benchmarker consistently and transparently analyses variant calling performance based on the ICR142 NGS validation series, using the standard VCF input and outputting informative metrics to enable user understanding of pipeline performance. ICR142 Benchmarker is freely available at https://github.com/RahmanTeamDevelopment/ICR142_Benchmarker/releases.
De Franco E, Lytrivi M, Patel K, Igoillo-Esteve M, Wakeling M, Haliloglu B, Unal E, Godbole T, Yildiz M, Ellard S, et al (2018). Mutations in YIPF5 are a novel cause of neonatal diabetes, highlighting the critical role of endoplasmic reticulum-to-Golgi trafficking in human beta cell survival.
DIABETOLOGIA,
61, S106-S106.
Low KJ, Stals K, Caswell R, Wakeling M, Clayton-Smith J, Donaldson A, Foulds N, Norman A, Splitt M, Urankar K, et al (2018). Phenotype of CNTNAP1: a study of patients demonstrating a specific severe congenital hypomyelinating neuropathy with survival beyond infancy.
Eur J Hum Genet,
26(6), 796-807.
Abstract:
CHN is genetically heterogeneous and its genetic basis is difficult to determine on features alone. CNTNAP1 encodes CASPR, integral in the paranodal junction high molecular mass complex. Nineteen individuals with biallelic variants have been described in association with severe congenital hypomyelinating neuropathy, respiratory compromise, profound intellectual disability and death within the first year. We report 7 additional patients ascertained through exome sequencing. We identified 9 novel CNTNAP1 variants in 6 families: three missense variants, four nonsense variants, one frameshift variant and one splice site variant. Significant polyhydramnios occurred in 6/7 pregnancies. Severe respiratory compromise was seen in 6/7 (tracheostomy in 5). A complex neurological phenotype was seen in all patients who had marked brain hypomyelination/demyelination and profound developmental delay. Additional neurological findings included cranial nerve compromise: orobulbar dysfunction in 5/7, facial nerve weakness in 4/7 and vocal cord paresis in 5/7. Dystonia occurred in 2/7 patients and limb contractures in 5/7. All had severe gastroesophageal reflux, and a gastrostomy was required in 5/7. In contrast to most previous reports, only one patient died in the first year of life. Protein modelling was performed for all detected CNTNAP1 variants. We propose a genotype-phenotype correlation, whereby hypomorphic missense variants partially ameliorate the phenotype, prolonging survival. This study suggests that biallelic variants in CNTNAP1 cause a distinct recognisable syndrome, which is not caused by other genes associated with CHN. Neonates presenting with this phenotype will benefit from early genetic definition to inform clinical management and enable essential genetic counselling for their families.
Laver T, Wakeling M, Knox O, De Franco E, Flanagan S, Colclough K, Ellard S, Hattersley A, Weedon M, Patel K, et al (2018). Redefining the pathogenicity of Maturity Onset Diabetes of the Young (MODY) genes: BLK, PAX4 and KLF11 do not cause MODY.
Diabetic Medicine,
35, 10-10.
2016
Iacovazzo D, Flanagan SE, Walker E, Caswell R, Brandle M, Johnson M, Wakeling M, Guo M, Dang MN, Gabrovska P, et al (2016). A missense mutation in the islet-enriched transcription factor MAFA leads to familial insulinomatosis and diabetes.
Endocrine Abstracts
2014
Wakeling M, Eyre J, Hughes S, Roulstone I (2014). Assimilation of vertical motion from simulated cloudy satellite imagery in an idealized single column model.
Quarterly Journal of the Royal Meteorological Society
Abstract:
Assimilation of vertical motion from simulated cloudy satellite imagery in an idealized single column model
Satellite infrared sounders are invaluable tools for making observations of the structure of the atmosphere. They provide much of the observational data used to initialize atmospheric models, especially in regions that do not have extensive surface-based observing systems, such as oceans. However, information is lacking in the presence of cloud, as the cloud layer is opaque to infrared radiation. This means that where information is most desired (such as in a developing storm) it is often in the shortest supply. In order to explore the mathematics of assimilating data from cloudy radiances, a study has been performed using an idealized single-column atmospheric model. The model simulates cloud development in an atmosphere with vertical motion, allowing the characteristics of a 2D-Var data assimilation system using a single simulated infrared satellite observation taken multiple times to be studied. The strongly nonlinear nature of cloud formation poses a challenge for variational methods. The adjoint method produces an accurate gradient for the cost function and minimization is achieved using preconditioned conjugate gradients. The conditioning is poor and varies strongly with the atmospheric variables and the cost function has multiple minima, but acceptable results are achieved. The assimilation system is provided with a prior forecast simulated by adding random correlated Gaussian error to the truth. Assimilating observations comparable to those available from current geostationary satellites allows vertical motion to be retrieved with an error of less than a centimetre per second in most conditions. Moreover, evaluating the second derivative of the cost function at the minimum provides an estimate of the uncertainty in the retrieval. This allows atmospheric states that do not provide sufficient information for retrieval of vertical motion to be detected (such as a cloudless atmosphere or a non-moving opaque cloud layer in the upper troposphere). 
Retrieval is most accurate with upwards motion. © 2014 Royal Meteorological Society.
2012
Smith RN, Aleksic J, Butano D, Carr A, Contrino S, Hu F, Lyne M, Lyne R, Kalderimis A, Rutherford K, et al (2012). InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data.
Bioinformatics,
28(23), 3163-3165.
Abstract:
InterMine: a flexible data warehouse system for the integration and analysis of heterogeneous biological data.
SUMMARY: InterMine is an open-source data warehouse system that facilitates the building of databases with complex data integration requirements and a need for a fast customizable query facility. Using InterMine, large biological databases can be created from a range of heterogeneous data sources, and the extensible data model allows for easy integration of new data types. The analysis tools include a flexible query builder, genomic region search and a library of 'widgets' performing various statistical analyses. The results can be exported in many commonly used formats. InterMine is a fully extensible framework where developers can add new tools and functionality. Additionally, there is a comprehensive set of web services, for which client libraries are provided in five commonly used programming languages. AVAILABILITY: Freely available from http://www.intermine.org under the LGPL license. CONTACT: g.micklem@gen.cam.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
2007
Lyne R, Smith R, Rutherford K, Wakeling M, Varley A, Guillier F, Janssens H, Ji W, McLaren P, North P, et al (2007). FlyMine: an integrated database for Drosophila and Anopheles genomics.
Genome Biol,
8(7).
Abstract:
FlyMine: an integrated database for Drosophila and Anopheles genomics.
FlyMine is a data warehouse that addresses one of the important challenges of modern biology: how to integrate and make use of the diversity and volume of current biological data. Its main focus is genomic and proteomics data for Drosophila and other insects. It provides web access to integrated data at a number of different levels, from simple browsing to construction of complex queries, which can be executed on either single items or lists.
2006
Janssens H, Lyne R, Smith R, Guillier F, Ji W, McLaren P, Riley T, Reisinger F, Rutherford K, Wakeling M, et al (2006). FlyMine: an integrated database of Drosophila and Anopheles genomics.
Journal of Neurogenetics,
20(3-4), 138-139.