
Archive for the ‘Bioinformatics’ Category

Finding QTLs with EXPLoRA

EXPLoRA is a Hidden Markov Model (HMM) capable of finding Quantitative Trait Loci (QTL) using data from Bulk Segregant Analysis experiments.

A few days ago, our new version of the algorithm was made available through a web server. You can find it by following this link (and here is a link to the paper).

The idea behind the method is to use what we can predict about selection and linkage disequilibrium to detect the regions of the DNA responsible for the trait (i.e. to find the QTL). Sexual reproduction involves two parents contributing one gamete each. Gametes are produced during meiosis: the chromosomes are duplicated and then the cell divides twice, producing sex cells with half the number of chromosomes of a normal cell. One key feature of this process is recombination, or crossover.

Recombination is a fundamental source of genomic variation, and it has a huge impact on the life cycle of mutations. Recombination causes the offspring to inherit a completely new combination of parental DNA. From the point of view of evolutionary dynamics, this allows a beneficial mutation to be selected, to some degree, independently of the rest of the ancestor’s mutations. This independence is not complete, because the frequency of crossover events between two mutations grows with their physical distance along the chromosome. Mutations close to each other are less likely to suffer a crossover event that separates them.
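To get a feel for how distance and crossover probability relate, here is a small sketch using Haldane's map function, a standard textbook model that assumes crossovers happen independently along the chromosome. The uniform rate of 1 cM per Mb is a made-up illustrative value, and none of this is taken from EXPLoRA itself.

```python
import math

def recombination_fraction(distance_bp, cm_per_mb=1.0):
    """Probability that two loci end up separated in a gamete.

    Uses Haldane's map function r = 0.5 * (1 - exp(-2d)), with d in Morgans.
    cm_per_mb is an assumed, uniform recombination rate (hypothetical value).
    """
    morgans = (distance_bp / 1e6) * cm_per_mb / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))

# Close mutations are rarely separated; distant ones behave as if unlinked.
for distance_bp in (1_000, 100_000, 10_000_000, 1_000_000_000):
    r = recombination_fraction(distance_bp)
    print(f"{distance_bp:>13,} bp apart -> recombination fraction {r:.6f}")
```

For short distances the fraction is almost exactly proportional to distance, and for very long distances it levels off at 0.5, which is what two mutations on different chromosomes would show.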

The degree to which mutations are separated is related to the recombination rate (i.e. the average number of recombination events that happen in a chromosome). The more recombination events in a chromosome, the easier it becomes for a beneficial mutation to be selected on its own, just for the fitness advantage it confers. The flip side is that mutations in close vicinity tend to be inherited together and are therefore “linked”. This non-random association between two mutations is known as linkage disequilibrium.

When we observe the genomes of descendants that were selected for a trait of their parents, the effect, whatever the reason behind the selection, is that the causal mutation becomes more common in the population. Interestingly, this does not happen only to the causal mutation; it also happens to the mutations in close proximity to it. The closer a neutral mutation is to the causal mutation, the less likely it is that a recombination event happens in the region between them, so a neutral mutation very close to the causal one rises in frequency along with it.
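The hitchhiking effect described above can be seen in a toy simulation. This is only a sketch under simplifying assumptions: haploid offspring of two parents (one carrying allele 1 at every marker, the other allele 0), a fixed crossover probability between neighbouring markers, and perfect selection for the causal allele. All parameter values are invented for illustration, and this is not the EXPLoRA HMM.

```python
import random

def simulate_segregant(n_markers, r_adjacent, rng):
    """One haploid offspring: pick a parent at the first marker, then switch
    parent between neighbouring markers with probability r_adjacent."""
    allele = rng.randint(0, 1)
    haplotype = [allele]
    for _ in range(n_markers - 1):
        if rng.random() < r_adjacent:
            allele = 1 - allele  # crossover: continue on the other parent's DNA
        haplotype.append(allele)
    return haplotype

def selected_pool_frequencies(n_segregants=5000, n_markers=41, causal=20,
                              r_adjacent=0.05, seed=1):
    """Frequency of parent-1 alleles at each marker, among the offspring
    that carry the causal allele (the selected pool)."""
    rng = random.Random(seed)
    pool = []
    for _ in range(n_segregants):
        haplotype = simulate_segregant(n_markers, r_adjacent, rng)
        if haplotype[causal] == 1:  # perfect selection on the causal marker
            pool.append(haplotype)
    return [sum(h[i] for h in pool) / len(pool) for i in range(n_markers)]

freqs = selected_pool_frequencies()
for offset in (0, 1, 5, 20):
    print(f"marker {offset:>2} steps from the causal one: {freqs[20 + offset]:.3f}")
```

The causal marker sits at frequency 1.0 in the selected pool, its neighbours stay close to that, and markers far away drift back toward 0.5. That peak in allele frequency is the kind of signal a QTL mapping method looks for.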

Categories: Bioinformatics, QTL

eQTL for Dummies – Usefulness

May 3, 2011 1 comment

Now that we know what an eQTL (expression Quantitative Trait Locus) is in broad terms, we can discuss “why” we would like to perform an eQTL analysis.

We all know about the existence of genetic diseases, and we also know that those diseases are produced by differences in the DNA (that’s why they are called ‘genetic’). It’s not difficult to imagine that if we need some protein complex to do a specific job, but some of the genes (the DNA sequences) that produce those proteins have changed, then the whole complex could stop working, or at least work differently. This is what happens in cancer cells: in some types of cancer, for example, the genes that control apoptosis (programmed cell death) get screwed up and those cells just don’t ‘hear’ the body’s orders to die, and there you go… cancer!

So, what do we have? First, a complex disease that we don’t fully understand and that has so many variables we don’t know where to start looking; and second, the possibility that those variables are not all related to changes in the coding sequences of the DNA. That second possibility is where eQTL analysis comes into play: maybe the protein complex is not working as it is supposed to, not because there is something wrong with the parts, but because there is something wrong with the number of parts.

Let’s take an example. Imagine a car: if the car stops working, it can be for many different reasons, like a bad engine, no gas, or being stuck in the sand. Who knows, so many possibilities! So, take a common problematic part of the car, the tires. What is your first guess? Flat tires!? Maybe. We can presume that the tires are flat and we just need air to inflate them again. You may see this as our first approach: one of the necessary parts of the car is broken (similar to “there may be some non-functional protein, let’s search for the change in the DNA”). But could it be that the problem is that there are only 3 tires? That’s an expression problem. The problem is not that the tires are flat; there are 3 beautiful, inflated and perfectly working tires. The problem is that we need four!

Going back to diseases: there are genetic diseases in which the problem is a difference in the amount of protein available for a process, and those problems are not related to differences in the coding region of any protein but to genetic factors that affect gene expression.

OK, very altruistic, right? We want to help find a fix for every genetic disease. But actually, we also want to understand the relationships between genes, just for the hell of it (not really): we want to know how the expression or regulation of some genes causes other genes to be expressed or regulated, and eQTL analysis helps us with this too. After looking at the behavior of gene expression in some diseases, we have more information about which genetic factors are associated with which genes. We call that information “markers”, and the markers produced by eQTL analysis are called eSNPs. With these markers, plus other markers feeding into GWA (Genome Wide Association) studies, we can better predict the risk of diseases and better understand how those diseases will try to kill us.

Categories: Bioinformatics, eQTL

eQTL for Dummies – Intro

March 29, 2011 2 comments

What the hell are eQTL (expression Quantitative Trait Loci)?

Let’s start at the beginning: what is a “quantitative trait”? It’s any phenotype that can be quantified; in other words, anything we can measure about the phenotype of a given organism. For example, we can measure height, fat, skin color, etc., and any of those can be a quantitative trait.

OK, easy enough, but what is a “quantitative trait locus” (QTL)? It’s a locus that is associated with a quantitative trait (a measure of the phenotype). In normal human language: a QTL is a sequence in the DNA that is associated with a phenotype. That DNA sequence has an effect on the measure of the phenotype. Taking the ‘fat’ example, suppose that if you have a particular sequence in your DNA, you will have more or less fat than if you don’t have that sequence.

What about the expression? We are talking about the expression of genes: how much of each gene’s product (its mRNA) is being produced by the target cells. So, a quantitative trait is anything we can measure; can we measure gene expression? Of course we can! That’s what microarrays are made for; that is their whole reason for existence.

To conclude, we are measuring how much each gene is expressed in the cells and checking whether the differences in expression can be explained by a given DNA sequence.
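In its simplest form, that check can be sketched as a straight-line fit of expression against genotype. The example below assumes genotypes are coded as the number of copies of a variant (0, 1 or 2) and uses invented expression values for nine individuals; real eQTL analyses add covariates, significance tests and multiple-testing correction, none of which are shown here.

```python
def slope_and_r2(genotypes, expression):
    """Least-squares slope of expression on genotype, and the fraction of
    expression variance explained by the genotype (r squared)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

# Hypothetical data: copies of the variant carried by each individual,
# and the measured expression level of one gene (arbitrary units).
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2]
expression = [5.1, 4.9, 5.0, 6.0, 6.2, 5.8, 7.1, 6.9, 7.0]

slope, r2 = slope_and_r2(genotypes, expression)
print(f"expression change per copy of the variant: {slope:.2f}")
print(f"fraction of expression variance explained: {r2:.2f}")
```

A slope far from zero with a high r squared suggests that the variant behaves like an eQTL for that gene; a slope near zero suggests the gene's expression does not depend on it.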

What are eQTL useful for?

Categories: Bioinformatics, eQTL