What does palindrome mean in biology?

They hope to learn what events initiate such unstable formations, and this new understanding could lead to novel treatments. For example, she said, the group has already determined that certain yeast cells that are susceptible to palindrome formation are far more sensitive than normal cells to radiation as well as to compounds often used in the treatment of cancer, such as cisplatin.

In the previous method, the researchers lost the junction sequences that might provide clues to the origin of the palindromes, and had to analyze them one by one, she explained.

Rattray received her doctorate at the University of Washington, in Seattle, where she studied retroviral replication.

[Figure: Palindrome mapping strategy. (A) Read density distribution on Chr15q. (B) qPCR analysis to monitor palindrome enrichment and determine directionality; the fold enrichment is based on comparing the fold depletion among primer sets P1, P2, P3, and P4 relative to a single-copy sequence in the genome. (C) Map of the genomic region showing restriction sites and the locations of the TaqMan primer sets. Figure from Yang et al. NCI at Frederick.]

The computations were run as parallel jobs. Palindrome computations were carried out using the Biological Language Modeling Toolkit (BLMT) version 2. We computed the number of palindromes in the reference genome (build GRCh37) across all chromosomes except the mitochondrial chromosome.

BLMT finds all occurrences of perfectly palindromic 8-base-long sequences and then extends the span of each of them, as long as the bases on either side are complementary but allowing up to 4 mismatches during this extension.
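The seed-and-extend procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not BLMT's actual implementation: the function name `find_palindromes`, the half-open `(start, end)` output, and the choice to trim trailing mismatches from the reported span are all assumptions made for the example.

```python
# Watson-Crick pairing; any other symbol (e.g. N) never pairs.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_pair(a, b):
    """True when base a is the complement of base b."""
    return COMPLEMENT.get(a) == b


def find_palindromes(seq, seed_len=8, max_mismatches=4):
    """Return sorted half-open (start, end) spans of DNA palindromes.

    A span starts from a perfect reverse-complement palindrome of
    seed_len bases and is extended one base on each side, tolerating
    up to max_mismatches non-complementary pairs (assumed behaviour).
    """
    seq = seq.upper()
    half = seed_len // 2
    found = set()
    for start in range(len(seq) - seed_len + 1):
        end = start + seed_len - 1  # inclusive index of the seed's last base
        if not all(is_pair(seq[start + k], seq[end - k]) for k in range(half)):
            continue  # not a perfect seed
        left, right = start, end
        best = (left, right)  # last span that ended on a matching pair
        mismatches = 0
        while left > 0 and right < len(seq) - 1:
            left -= 1
            right += 1
            if is_pair(seq[left], seq[right]):
                best = (left, right)
            else:
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        found.add((best[0], best[1] + 1))
    return sorted(found)


print(find_palindromes("CCAGGAATTCCTAA"))  # one span covering AGGAATTCCT
```

Lowering `max_mismatches` to 0 restricts the output to perfect palindromes, which makes it easy to see how much of a reported span depends on the mismatch allowance.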

We found that the number of palindromes is proportional to the length of the chromosomes (see the supplementary figures). When the distribution of the palindromes in the human genome was inspected in terms of both their location and length, we discovered that they are not distributed uniformly. Our results show that palindromes tend to be highly concentrated in introns and intergenic regions rather than in coding exons (see the supplementary figures).

The details of the number of palindromes in various regions of the reference genome are shown in Supplementary Table 2. We constructed a personal genome for each individual and then computed the palindromes occurring in that genome.

Each palindrome was then indexed in relation to its aligned position in the reference genome. We constructed a catalog of all the palindromes that occur in any of the 1000 Genomes (1000G) genomes, as well as how each of these palindromes varied across the individuals. A sample of this palindrome variants matrix is shown in a figure, and the entire matrix is presented in Supplementary Table 3. A summary of the variants within the palindromic regions across all the samples is presented in Supplementary Table 4.
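One way to picture such a palindrome variants matrix is as a table with one row per reference-aligned position and one column per individual. The sketch below is only an illustration: the function name `build_variant_matrix`, the input layout, and the status labels ("same", "altered", "absent", "new") are hypothetical and are not taken from the published catalog.

```python
def build_variant_matrix(reference, individuals):
    """Build {position: {individual: status}} from palindrome calls.

    reference:   dict mapping reference position -> palindrome sequence
    individuals: dict mapping individual name -> dict of
                 reference-aligned position -> palindrome sequence
    The status labels are illustrative assumptions.
    """
    positions = set(reference)
    for calls in individuals.values():
        positions.update(calls)

    matrix = {}
    for pos in sorted(positions):
        ref_seq = reference.get(pos)
        row = {}
        for name, calls in individuals.items():
            seq = calls.get(pos)
            if seq is None:
                row[name] = "absent"   # no palindrome here in this genome
            elif ref_seq is None:
                row[name] = "new"      # palindrome not in the reference
            elif seq == ref_seq:
                row[name] = "same"     # conserved
            else:
                row[name] = "altered"  # present but changed by a variant
        matrix[pos] = row
    return matrix


# Made-up toy data, for illustration only.
reference = {100: "GGAATTCC", 250: "ACGCGCGT"}
individuals = {
    "sample_1": {100: "GGAATTCC", 250: "AACGCGCGTT"},
    "sample_2": {100: "GGAATTCC", 400: "TTAATTAA"},
}
for pos, row in build_variant_matrix(reference, individuals).items():
    print(pos, row)
```

Keeping every row keyed by the reference-aligned position is what allows palindromes from different personal genomes to be compared directly.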

Palindrome conservation across the 1000G samples is shown in a figure. The African population in 1000G had the highest number of altered palindromes, which is expected because of the larger number of variants found in this population. Palindrome variation across the various populations in 1000G is also shown in a figure.

Each slice of the pie chart represents the approximate fraction of the reference genome palindromes with a specific number of variations. The GWAS Catalog is a list of SNPs that have statistically significant associations with specific diseases or traits, as curated from the published literature. To determine whether palindromic regions with variants are more susceptible to disease than other genomic locations, we compared the expected and observed probabilities of variants in palindromic locations.

GWAS variants occurred in palindromic regions 14-fold more often than expected. Of these, 41 SNPs that are associated with various diseases and that formed new palindromes were found at low allele counts across the samples (each individual contributes two alleles).
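The comparison of observed and expected probabilities can be illustrated with a short calculation. The source does not spell out how the expected probability was defined, so the sketch below assumes one plausible choice: the expected share of SNPs in palindromes equals the fraction of the genome covered by palindromes. The function name `fold_enrichment` and all numbers are hypothetical.

```python
def fold_enrichment(snps_in_palindromes, total_snps,
                    palindromic_bases, genome_bases):
    """Observed / expected share of SNPs falling in palindromic regions.

    Assumption: under the null, SNPs land uniformly along the genome,
    so the expected share is palindromic_bases / genome_bases.
    """
    if total_snps <= 0 or genome_bases <= 0 or palindromic_bases <= 0:
        raise ValueError("counts and lengths must be positive")
    observed = snps_in_palindromes / total_snps
    expected = palindromic_bases / genome_bases
    return observed / expected


# Made-up numbers, for illustration only (not the study's data):
# 5% of SNPs fall in palindromes that cover 1% of the genome.
print(fold_enrichment(50, 1_000, 1_000_000, 100_000_000))
```

A value above 1 means SNPs fall in palindromic regions more often than the uniform expectation; a real analysis would also need a significance test, which this sketch omits.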

Some of the disease-associated SNPs, such as those of type 2 diabetes, ovarian cancer, breast cancer, and schizophrenia, have low allele counts in 1000G, as expected, because these individuals were from healthy populations. An analysis of a few SNPs with palindrome associations is shown in a network figure, in which the size of each disease node represents the overall number of SNPs associated with that disease. The specific SNPs that alter palindromes are shown as unlabeled nodes: SNPs that convert a palindrome to a non-palindrome are shown as diamond-shaped nodes; SNPs that create new palindromes are shown as parallelogram-shaped nodes; SNPs that lengthen the palindromes are shown as triangle-shaped nodes; SNPs that shorten the palindromes are shown as v-shaped nodes; and SNPs that alter palindromes in other ways (non-identical, near-to-perfect, or perfect-to-near) are shown as round-shaped nodes.
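The categories of palindrome change listed above can be made concrete with a simplified classifier. The sketch below is an assumption-laden illustration, not the authors' pipeline: it considers only perfect palindromes (so the near-perfect categories are out of scope), treats anything shorter than `min_len` bases as a non-palindrome, and uses the hypothetical names `arm_length` and `classify_snp`.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def arm_length(seq, center):
    """Count complementary pairs around the gap just before index center."""
    left, right = center - 1, center
    pairs = 0
    while (left >= 0 and right < len(seq)
           and COMPLEMENT.get(seq[left]) == seq[right]):
        pairs += 1
        left -= 1
        right += 1
    return pairs


def classify_snp(seq, pos, alt, min_len=8):
    """Map each affected palindrome center to the effect of a substitution.

    Compares the perfect palindrome length at every center before and
    after replacing seq[pos] with alt. Quadratic in the worst case, so
    intended for short windows around a SNP only.
    """
    mutated = seq[:pos] + alt + seq[pos + 1:]
    effects = {}
    for center in range(1, len(seq)):
        before = 2 * arm_length(seq, center)
        after = 2 * arm_length(mutated, center)
        if before == after or max(before, after) < min_len:
            continue  # unchanged, or too short to count either way
        if before < min_len:
            effects[center] = "creates"
        elif after < min_len:
            effects[center] = "destroys"
        elif after > before:
            effects[center] = "lengthens"
        else:
            effects[center] = "shortens"
    return effects


window = "CCAGGAATTCCTAA"  # contains the palindrome AGGAATTCCT
print(classify_snp(window, 12, "G"))  # flanking change
print(classify_snp(window, 6, "C"))   # change at the palindrome's core
print(classify_snp(window, 2, "C"))   # change at the palindrome's edge
```

Comparing lengths per center, rather than per overlapping span, keeps a flanking SNP that extends an existing palindrome from being misread as creating a new one.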

The network diagram was made using Cytoscape. We found that 46 SNPs in palindromes were likely to affect the binding of proteins and were linked to the expression of gene targets (RegulomeDB scores 1a to 1f).

Of these SNPs, five were associated with obesity and diabetes, and three were associated with mental disorders. Ninety-four SNPs likely to affect the binding of proteins (scores 2a to 3b; ref. 36) also occurred in palindromes, as shown in a figure.

Recent research has shown that palindromes may be critical to several cellular processes, including transcription, replication, and DNA recombination. Therefore, it is important to study palindrome distribution in the genome to understand their functions and disease associations. Palindromes are abundantly present in the human genome, but their distribution is non-uniform.

This distribution can be correlated with their participation in important biological functions. Palindrome lengths also vary greatly across the genome. In general, shorter palindromes are expected to be more abundant than longer palindromes, and both short and long palindromes have been implicated in genomic instability (ref. 7). The frequency and distribution, from our analysis, of palindromes of varying lengths in each chromosome are shown in the supplementary figures.

These palindromes constitute fragile sites, are correlated with breakage and deletion, and are associated with diseases (ref. 7). According to our results, the very long palindromes were mostly AT repeats; that is, the longest palindromes were AT-rich. Other palindromes are referred to as CG-rich. Palindromic AT-rich repeats (PATRRs) are sites frequently associated with double-strand breakage and hairpin or cruciform DNA formation that lead to translocations and recombinations. Regulatory regions contain promoters and enhancers, and palindromic sequences in these regions are known to serve as transcription factor binding sites (TFBS) for regulating gene expression.

We found that, overall, disease-associated risk variants (GWAS SNPs) were 14 times more likely to be present in palindromic regions than expected. This association was tested in tissues relevant to the diseases. In our previous pilot study of palindrome alterations by breast cancer-associated variants in The Cancer Genome Atlas (TCGA), we found that many palindrome changes were associated with oncogenes and breast cancer genes. In addition, we observed that the palindromes that were associated with the oncogene NUP98 were completely absent in tumors.

These results further support the possible role of palindromes in various diseases, including cancer. We also identified the individual SNPs in or near palindromes that are associated with multiple diseases or traits. It also contains binding sites for transcriptional regulators such as miRNA. We learned that one of the SNPs (rs…), which is associated with bipolar disorder and schizophrenia, is a binding site for the FOXP2 protein, a transcription factor (TF) that plays a significant role in these mental illnesses. These results further support the role of palindromes in diseases, since these SNPs lead to palindrome changes that may affect the binding of TFs, hinting at a possible mechanism for disease pathogenesis.

This allele had a frequency of 0. Figure 7 provides illustrative examples of palindrome-mediated mechanisms of disease, as indicated in the literature. These transcription factors are regulators of brain development. New regulatory elements may be introduced with the insertion of these fragments. It has been conjectured that, as a result of these new elements, SOX3 may be ectopically expressed in hair follicles or precursor cells during the early stages of hair follicle development.

The images were produced based on the GRCh38 (hg38) assembly. We believe that these results will help researchers to understand palindrome distribution and conservation across various populations. These results will also help to identify individual palindromes that undergo rearrangements due to the presence of variants, such as SNPs, that could affect various cellular processes, leading to gene dysregulation and disease pathogenesis. COPS will serve as a resource to investigate palindromic variations in genomics studies of diseases.

Specifically, COPS can serve as control data for the comparison of palindrome variations in patient genomes with the palindromes in 1000G. This was demonstrated in our pilot study on TCGA data, in which we compared palindromes in matched tumor and normal pairs of genomes with the 1000G data presented in COPS. We are making available the location and length of every palindrome that appears in the reference genome or the 1000G genomes, and its variation in each of the individual genomes with respect to the reference genome.

In addition to the individual occurrences of palindromes, aggregated results are presented to show the distribution in coding and non-coding regions, palindrome conservation across the genomes, the presence of rare and common variants within the palindromes, and the GWAS SNPs that are associated with palindromic changes for various diseases.

During genome sequencing, DNA that is to be cloned is inserted into bacterial artificial chromosomes (BACs), which are used for transforming Escherichia coli, a process by which foreign DNA is introduced into a bacterial cell. Regions that are highly susceptible to genomic rearrangements, namely palindromes, duplicated segments, and satellite DNA, might be deleted during transformation and cloning in E. coli.

As a result, these sequences may be underrepresented in the reference genome that was sequenced using this technology. Thus, the palindrome computation in this work is limited by the incomplete nature of the reference genome. Our work is based on the human reference genome build GRCh37 (hg19). The primary reason for this choice was that 1000G was assembled with hg19 as the reference.

Hg19 was also annotated in more detail than the later build GRCh38 (hg38) at the time of our computations. We plan to extend this work to compute and analyze inverted repeats.

These studies will enhance the knowledge of palindrome functions in the genome and their contribution to human diseases, and highlight the mechanism by which DNA variants play a role in disease.

References

Cunningham, L. Rapid, stabilizing palindrome rearrangements in somatic cells by the center-break mechanism.
Anjana, R. A method to find palindromes in nucleic acid sequences. Bioinformation 9.
Warburton, P. Inverted repeat structure of the human genome: the X-chromosome contains a preponderance of large, highly homologous inverted repeats that contain testes genes. Genome Res.
Liu, G. Compositional bias is a major determinant of the distribution pattern and abundance of palindromes in Drosophila melanogaster.
Chuzhanova, N. Translocation and gross deletion breakpoints in human inherited disease and cancer II: potential involvement of repetitive sequence elements in secondary structure formation between DNA ends.
Darmon, E. Cell 39, 59-70.
Lewis, S. Palindromes and genomic stress fractures: bracing and repairing the damage. DNA Rep.
Lu, L. The human genome-wide distribution of DNA palindromes. Genomics 7.
Zhang, R. Analysis the influence of palindrome structure to gene expression by constructing combination system. Acta Microbiol.
Pearson, C. Inverted repeats, stem-loops, and cruciforms: significance for initiation of DNA replication.
Kato, T. Chromosomal translocations and palindromic AT-rich repeats.
FitzGerald, P. Clustering of DNA sequences in human promoters.
Zawel, L. Human Smad3 and Smad4 are sequence-specific transcription activators. Cell 1.
Fleming, N. Cancer Res.
Evolution of genetic and genomic features unique to the human lineage.
Greenberg, D.
Bissler, J. DNA inverted repeats and human disease.
Shapira, M. A transcription-activating polymorphism in the ACHE promoter associated with acute sensitivity to anti-acetylcholinesterases.
Guenthoer, J. Assessment of palindromes as platforms for DNA amplification in breast cancer.
Tanaka, H. Palindromic gene amplification: an evolutionarily conserved role for DNA inverted repeats in the genome. Cancer 9.
Ford, M. Large inverted duplications are associated with gene amplification. Cell 45.
Large DNA palindromes as a common form of structural chromosome aberrations in human cancers. Cell 19, 17-23.
Marotta, M.
Lu, S. Short inverted repeats are hotspots for genetic instability: relevance to cancer genomes. Cell Rep.
Popescu, N. Genetic alterations in cancer as a result of breakage at fragile sites. Cancer Lett.
Inagaki, H. Palindrome-mediated translocations in humans: a new mechanistic model for gross chromosomal rearrangements.
Barbouti, A. The breakpoint region of the most common isochromosome, i(17q), in human neoplasia is characterized by a complex genomic architecture with large, palindromic, low-copy repeats.
Lachman, H.
Chen, D. Segmental duplications flank the multiple sclerosis locus on chromosome 17q.
Rheault, M. Reversible Fanconi syndrome in a pediatric patient on deferasirox. Blood Cancer 56.
Ganapathiraju, M. Suite of tools for statistical N-gram language modeling for pattern mining in whole genome sequences.
Consortium, G. A global reference for human genetic variation. Nature, 68-74.
Subramanian, S. A pilot study on the prevalence of DNA palindromes in breast cancer genomes. BMC Med. Genomics 9, 73.
Choudhury, A. Whole-genome sequencing for an enhanced understanding of genetic variation among South Africans.
Welter, D. Nucleic Acids Res.
Boyle, A. Annotation of functional variation in personal genomes using RegulomeDB.
Smith, G. Meeting DNA palindromes head-to-head. Genes Dev.
Inagaki, K. Science.
Rafnar, T. Mutations in BRIP1 confer high risk of ovarian cancer.


