Epigenetics & Methlyation and Why it Matters

Remember “the central dogma of molecular biology” from high school, where DNA is transcribed into RNA and then translated into proteins? Well, there’s more to the story. We haven’t had the ability to accurately account for and quantify epigenetic alternations until recently. Throughout our lives, epigenetic changes, such as DNA and microRNA methylation, play a crucial role in regulating gene expression and maintaining cellular identity. It’s the mysterious process that governs the variability in our metabolisms and is probably why you have that one friend that eats fast food everyday but still looks better than you. It may also be why we all have different predispositions to various neuropsychiatric diseases.

Youtube Clip: The Central Dogma of Molecular Biology

By adding a methyl group (CH₃) to cytosine bases in CpG sites, cells can control which genes are active or silent. This process significantly impacts development, differentiation, and disease, making it a revolutionary area of research and what I personally consider to be one of the rate-limiting steps to the “genomic revolution” everyone is talking about during this artificial intelligence craze. Up until recently the solution-space & combinatorics that underpinned epigenetics were so dauntingly large that it was somewhat deprioritized from a research perspective.

Youtube Clip: DNA Methylation Explained
Figure 1. CpG Islands If You are Visual Like Me

DNA methylation is managed by enzymes called DNA methyltransferases (DNMTs). If you aren’t a science person… any word that ends with “ase” - you can just imagine a big protein glob that “does some molecular process”. In humans, DNMT1 maintains existing methylation patterns during DNA replication, while DNMT3A and DNMT3B create new patterns during development. These changes typically occur at CpG sites near gene promoters, where the addition of methyl groups forms 5-methylcytosine, leading to gene silencing by blocking transcription factors. All lot of techincal words I know… but you can imagine methyl groups as little grape branches sticking to DNA so that a transcription factor (another protein glob) can’t work properly. This has broad downstream effects. Abnormal DNA methylation patterns are common in cancer: hypomethylation can activate cancer-causing genes (oncogenes), while hypermethylation can silence genes that prevent cancer (tumor suppressor genes). These methylation patterns can serve as biomarkers for cancer diagnosis, prognosis, and treatment targets. More research is emerging showing methylation greatly impacts systemic and targeted therapy sensitivity… for example in breast cancer

Figure 2. Aberrantly Hypermethylated Genes in Breast Cancer

Figure 3. Aberrantly Hypomethylated Genes in Breast Cancer


How is Methylation Measured in the Lab

Historically methylation status was evaluated after bisulfite treatment through various different laboratory modalities, which have gradually improved over time:

  • Bisulfite Sequencing: Converts unmethylated cytosine to uracil, leaving methylated cytosine unchanged. Whole-genome bisulfite sequencing (WGBS) offers comprehensive coverage, while reduced representation bisulfite sequencing (RRBS) targets CpG-rich regions more cost-effectively.
  • Methylation-Specific PCR (MSP): Uses bisulfite-treated DNA and primers for methylated or unmethylated DNA, allowing for qualitative and quantitative analysis.
  • Pyrosequencing: Provides quantitative analysis at specific CpG sites following bisulfite treatment.

Okay… so what does that all mean, basically you mix the DNA with bisulfite and then amplify the result of that with PCR and measure it qualitatively or with light (pyrophosphate release). This can tell you where the methyl groups are sticking.

Figure 4. What is Oxidative Bisulfite Sequencing

See how the unmethylated C’s turn into U’s and then T’s? Contrast this to the methylated C’s that remain C’s. Basically that’s all you need to know. Unforunately this process involves treating the DNA with several steps, and whenever there are steps a human has to perform, it means incremental cost and potential error sources. Not good.

More Recently:

  • Methylated DNA Immunoprecipitation (MeDIP): Enriches methylated DNA fragments for analysis using microarrays or sequencing.
  • Array-Based Methods: The Illumina Infinium Methylation Assay measures methylation at predefined CpG sites.

Illumina started using some of their technological advances with mircroarray to measure predefined sites in a more cost-efficient and rapid manner. Unforunately this still requires the DNA to be treated with bilsulfate and thus didn’t really revolutionize the process. Their technology is definitely faster, but probably not fundamentally better.

Figure 5. Illumina Infinium Methylation Assay


Entering the Nanopore Era

Nanopore sequencing, the latest advancement in DNA methylation measurement, directly reads native DNA without bisulfite conversion. It distinguishes methylated from unmethylated cytosines based on electrical signal changes as DNA passes through a nanopore. It sounds crazy and it is! The technology still isn’t quite refined and it may never fully be… but the prospect of obtaining a methylation run and seq data simultaneously without bisulfite treatment would be a massive advancement. This also means methylation runs could be stored with their seq data at source which would be revolutionary from a data collection standpoint. I happen to be a data scientist at heart and spend an egregious amount of time everyday cleaning data, whether it be for my market research at work or for clinical/scientific research. Advancements in this data ingestion arena are actually why I am more bullish on a company like Oxford Nanopore relative to Illumina.

I’ll save the full financial breakdown of this reasoning for another post. Oxford is a company I have followed closely over the past three years… as one of my long time friends currently works there. Without getting too granular, right now Oxford’s seq tests are a hair less accurate than Illumina’s. They are also cheaper relative to Illumina, as they utilize a fundamentally different flow-cell technology. Oxford uses the variation in a nanopore’s membrane current to detect differences between nucleotides, Illumina takes a different approach and utilizes dye terminators bound to the four bases (A, C, T, G) and measures the variable wavelength emission of each dye color. The reason why I am bullish on Oxford to take this market over is because they are positioned advantageously for the epigenetic revolution. They are dominating Illumina in this subfield from a R&D perspective and I believe their technology long-term will be able to produce data in a more efficient manner with less “human hands on” time.

Figure 6. Nanopore Sequencing

In summary, nanopore sequencing has various advantages:

  • Real-Time Detection: Crucial for timely clinical decision-making.
  • Simultaneous Methylation and Sequence Data: Provides a comprehensive view of the genome’s epigenetic landscape.
  • Resolution and Length: Handles long reads, beneficial for studying complex methylation patterns and structural variations.
  • Transformative Impact on Cancer Care
  • Enhanced Diagnosis: Precise methylation profiling can improve early cancer detection by identifying specific methylation signatures associated with different cancer types.
  • Personalized Prognosis: Methylation patterns can predict disease progression and patient outcomes more accurately, allowing for tailored treatment plans.
  • Targeted Therapies: Understanding methylation’s role in gene regulation can lead to therapies targeting abnormal methylation patterns, reactivating silenced tumor suppressor genes or inhibiting oncogene expression.
  • Real-Time Monitoring: Nanopore sequencing’s real-time capability enables ongoing monitoring of methylation changes during treatment, allowing for timely adjustments in therapeutic strategies.

The Curse of Dimensionality & How to Combat It

The emergence of GPU infrastructure, alongside its associated software libraries, has revolutionized the landscape of DNA methylation analysis. There are now at least partial solutions to the challenges posed by the curse of dimensionality (dimensionality reduction, feature selection). AI is now indispensable for managing the vast volumes of data produced by modern sequencing technologies. Despite advances, the enormus solution space created by the multitude of possible methylation patterns across the whole genome remains an obstacle. This is compounded by the variablilty of DNA methylation across time within individuals.

While single-cell transcriptomics holds promise… there still exist computational bottlenecks. Generating vast amounts of data from individual cells demands substantial resources for processing and analysis. Scaling up to analyze millions of cells across diverse conditions and establishing standardized protocols represent major obstacles that will require collaborative efforts from the research community.


Conclusions

  • The integration of advanced DNA methylation testing and AI represents a significant opporunity to transform our understanding of human biology and move the ball down field in oncology treatment.
  • These technologies promise to revolutionize cancer diagnosis, prognosis, and treatment, paving the way for personalized and effective care.
  • The curse of dimensionality and temporal variability in methylation remain mathematical challenges.
  • Single-cell transcriptomics offers a powerful tool to address these challenges, providing detailed insights into gene expression and epigenetic modifications.
  • The full realization of single-cell transcriptomic potential will likely take decades not years.
  • Advancements are undeniable and will likely continue to be appealing to both investors and researchers alike.