Chromatin immunoprecipitation sequencing, also known as ChIP-seq, is a method used to analyze protein interactions with DNA. Chromatin immunoprecipitation (ChIP) was invented over 30 years ago and, having evolved over the years, is still one of the most widely used techniques in molecular biology. The basic antibody-based protocol involves cross-linking proteins to DNA with formaldehyde, fragmenting the DNA, and immunoprecipitating the protein of interest. The DNA associated with that protein is then analyzed.
Researchers have since combined ChIP with deep sequencing to develop ChIP-seq, which identifies the binding sites of DNA-associated proteins across the whole genome. Essentially, the approach uses next-generation sequencing (NGS) to read the immunoprecipitated DNA fragments and maps them against the entire genome. ChIP-seq is primarily used to determine how chromatin-associated proteins, such as transcription factors, influence mechanisms that affect the phenotype.
The first step of ChIP-seq is the cross-linking of proteins to DNA using formaldehyde to fix the interaction. The DNA is then fragmented into pieces between 150 and 500 base pairs long, followed by immunoprecipitation using a specific antibody against the protein of interest. Immunoprecipitation is a technique that precipitates a protein antigen out of a solution. The cross-linked DNA-protein complexes are then enriched by incubation and centrifugation, which removes non-specifically bound material. Next, the DNA is recovered and purified. After sufficient enrichment, adapters are added and the material is ready to be sequenced.
- ChIP-seq requires a good-quality antibody. The antibody must be specific for the protein of interest, which should be tested prior to the experiment. The antibody must also be able to effectively immunoprecipitate the target protein.
- Even the slightest bias in the attachment of adapters, or in PCR amplification, could lead to a large amount of skew in the resulting sequence data. Therefore, it is crucial to run a control using ‘input DNA’, which is non-ChIP genomic DNA, so that sequencing biases can be identified and adjusted for.
- Sequencing errors can limit mappability. It is not unusual for only 50% of the reads to be mappable. This proportion is likely to increase as more intelligent mapping algorithms that take sequencing errors into account become available. Also, as tag sequences become longer, mapping becomes less of an issue.
Understanding the prospects and limitations of ChIP-seq is key to improving the quality of the technology as it matures. In the future, it is likely that there will be a greater emphasis on the overlap of multiple binding sites, as this will provide a more comprehensive picture of the interactions between DNA-binding proteins. For example, a modification of ChIP-seq called chromatin interaction analysis using paired-end tag sequencing (ChIA-PET) has been used to map chromatin interactions between estrogen receptor binding sites across the genome.
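The role of the input control described above can be illustrated with a small calculation. The following is a minimal sketch, not a peak caller: it assumes reads have already been mapped and counted in fixed genomic bins, and the function name `fold_enrichment`, the `pseudocount` parameter, and the example counts are all hypothetical choices made for illustration.

```python
def fold_enrichment(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin ChIP/input ratio after scaling the input to the ChIP depth.

    Assumption: both lists hold mapped-read counts for the same genomic bins.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input must cover the same bins")
    chip_total = sum(chip_counts)
    input_total = sum(input_counts)
    if chip_total == 0 or input_total == 0:
        raise ValueError("both libraries need at least one mapped read")
    # Scale the input library so that both libraries have the same total depth.
    scale = chip_total / input_total
    # The pseudocount avoids division by zero in bins with no input reads.
    return [
        (c + pseudocount) / (i * scale + pseudocount)
        for c, i in zip(chip_counts, input_counts)
    ]


# Hypothetical counts for five bins; only the third bin carries a binding site.
chip = [10, 12, 90, 11, 9]
control = [20, 25, 22, 21, 18]
enrichment = fold_enrichment(chip, control)
```

Bins where the ratio is well above 1 are candidates for genuine binding, while bins that are high in both libraries (for example because of amplification bias) are cancelled out by the division.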
Bisulfite sequencing protocol
DNA methylation is an epigenetic modification whereby methyl groups are added to a DNA molecule. In the last few decades, scientists have learned a great deal about DNA methylation, including how and where it occurs and the cellular processes in which it plays an important role, such as embryonic development and X-chromosome inactivation.
The methyl group is often added to the 5th carbon atom of a cytosine, converting the base into 5-methylcytosine. The reaction is catalyzed by DNA methyltransferases. These modified cytosines are usually positioned next to a guanine base, so the methylation is referred to as CpG methylation. The result is two methylated cytosines diagonally next to each other on opposite strands of DNA.
The methylation of cytosine nucleotides. Diagram A shows a comparison of an unmethylated cytosine and a 5-methylcytosine. Diagram B shows the CpG gene cluster – the methylated cytosines are diagonally positioned and the gene is repressed. Image credit: S. Gillespie, 2019
Bisulfite sequencing is used to detect DNA methylation patterns. It relies on the observation that, when DNA is treated with sodium bisulfite, cytosine is converted into uracil at a faster rate than 5-methylcytosine. This phenomenon was described by Marianne Frommer in 1992 and has served as the basis for using sequencing methods to investigate DNA methylation.
DNA is treated with sodium bisulfite, causing unmethylated cytosine residues to convert into uracil, while 5-methylcytosines remain unaffected. During PCR amplification, uracil residues are copied as thymine. Various analyses can then be performed on the altered sequence to identify which cytosines were converted to thymine by the bisulfite treatment and which were retained. This provides information about the methylation status of that segment of DNA.
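The conversion logic can be sketched in a few lines of code. This is a simplified illustration under stated assumptions: conversion is complete, there are no sequencing errors, sequences are uppercase, and the read is already aligned to the same strand of the reference. The function names `bisulfite_convert` and `call_methylation` are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T; C at a methylated position stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, converted_read):
    """Compare a converted read with the untreated reference, base by base."""
    if len(reference) != len(converted_read):
        raise ValueError("read must be aligned to the full reference")
    calls = {}
    for i, (ref, obs) in enumerate(zip(reference, converted_read)):
        if ref != "C":
            continue  # only reference cytosines are informative
        if obs == "C":
            calls[i] = "methylated"
        elif obs == "T":
            calls[i] = "unmethylated"
        else:
            calls[i] = "ambiguous"
    return calls


reference = "ACGTCCGA"
# Assume only the cytosine at index 1 (a CpG) is methylated.
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
```

In this example the methylated cytosine survives as C in the read, while the two unmethylated cytosines appear as T, which is the difference the downstream analyses exploit.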
Methods used to analyze methylation status:
- Direct bisulfite sequencing allows researchers to determine how DNA methylation affects the expression of a gene of interest. The primers are designed to be strand-specific and bisulfite-specific, but not methylation-specific, so both methylated and unmethylated sequences are amplified during PCR. The main limitation of the approach is that it is not able to examine cell-type-specific changes unless the cells are sorted before DNA extraction.
- Pyrosequencing is an emerging method that is being used for DNA methylation analysis. The approach can be used to investigate the degree of methylation at CpG positions by analyzing the ratio of T and C. However, it can normally only sequence up to 30 base pairs at a time, and even when bisulfite-treated DNA is already available, it still takes around 4 hours to read 96 samples.
- Sequencing of cloned PCR products is probably the most widely accepted protocol for non-methylation-specific bisulfite sequencing. The DNA sequence of interest is amplified by PCR with specific primers, the PCR product is cloned, and the individual clones are then sequenced. A disadvantage of the approach is that cloning the PCR product before sequencing adds multiple laborious steps, so it can be very time-consuming.
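The degree of methylation at each CpG can be expressed as the fraction of reads (or individual clones) that show C rather than T at that position. The sketch below is illustrative only: it assumes bisulfite-converted reads that are error-free, uppercase, and aligned to the full length of the reference, and the function name `methylation_levels` and the example sequences are hypothetical.

```python
def methylation_levels(reference, converted_reads):
    """Return {CpG position: fraction of reads with C} for each reference CpG.

    A position with no C or T observations maps to None.
    """
    for read in converted_reads:
        if len(read) != len(reference):
            raise ValueError("every read must span the full reference")
    # A CpG is a cytosine immediately followed by a guanine.
    cpg_positions = [
        i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"
    ]
    levels = {}
    for pos in cpg_positions:
        c_count = sum(1 for read in converted_reads if read[pos] == "C")
        t_count = sum(1 for read in converted_reads if read[pos] == "T")
        informative = c_count + t_count
        levels[pos] = c_count / informative if informative else None
    return levels


reference = "ACGTTCGA"  # CpG cytosines at indices 1 and 5
clones = ["ACGTTTGA", "ACGTTTGA", "ATGTTTGA", "ACGTTCGA"]
levels = methylation_levels(reference, clones)
```

With these four hypothetical clones, three retain C at the first CpG and one retains C at the second, giving methylation levels of 0.75 and 0.25 respectively.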
Methylation-specific PCR methods
- Standard methylation-specific PCR is one of the simplest methods used to analyze DNA methylation, but it is not normally quantitative. It avoids the need to sequence the area of interest because methylation is determined by the ability of a specific primer to achieve amplification. The primers are designed to be methylation-specific by including sequences that are complementary only to template in which the cytosines were methylated and therefore left unconverted.
- MethyLight is a technique that uses methylation-specific fluorescence probes that anneal to the amplified region. However, the methylation level can only be determined for one or two CpGs, leaving other sites unexplored.
- Melting curve analysis has also been used to analyze methylation-specific PCR products. It estimates the proportion of bisulfite-converted DNA by comparing the melting peaks generated.