Friday, September 28, 2012

RNA extractions

I started the RNA extractions yesterday. I am using an Omega EZNA miRNA Kit and including the "optional" on-column DNase treatment. I'm planning to sequence almost every molecule in these samples, so I want to make sure that I don't have any DNA contamination. The first sample I did was a throw-away sample to work through any bugs, which was good because there were a couple.

The first main problem was that after I ground the tissue in liquid nitrogen, I had to measure out 30-50mg, and the sample thawed while I weighed it. In the end my RNA Integrity Number (RIN) off the bioanalyzer was 7.8 (the higher the better - I'm shooting for 10, though we may consider sequencing a sample that is as low as 9). I am pretty sure that the brief thaw was the issue there.

The second problem was that the protocol calls for a 10-minute spin at 4°C and I don't have a refrigerated centrifuge. I did the spin at room temperature (RT) and later found out that the lab next door has a centrifuge that will keep it at 4°C. This may also have contributed to the low RIN.

For the samples today, I used a different method to avoid thawing the sample during weighing. First, 1ml of buffer can handle 30-50mg of tissue. I know the weight of each placenta, so I pre-allocated enough buffer to have 1ml for every 50mg of tissue. I'm going to have to buy more buffer to be able to finish the kit, but that's not a huge deal. Second, each column will handle up to 100mg of tissue, which means I load 2ml of buffer onto each column. This way, every time I finished grinding a placenta, the powder went straight into the pre-measured buffer without ever thawing, and then I used 2ml of the buffer - often having plenty left over (especially for the overgrown crosses).
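The buffer arithmetic above can be sketched in a few lines of Python. This is only an illustration using the numbers stated in this post (50mg of tissue per 1ml of buffer, at most 100mg per column); the function names and the 120mg example weight are made up for the sketch and are not measured values.

```python
# Sketch of the buffer pre-allocation arithmetic; all names are hypothetical.
MG_TISSUE_PER_ML = 50      # upper end of the 30-50mg per 1ml range
MAX_MG_PER_COLUMN = 100    # column capacity stated in the post


def buffer_ml_for_tissue(tissue_mg, mg_per_ml=MG_TISSUE_PER_ML):
    """Volume of buffer (ml) to pre-measure for one ground sample."""
    if tissue_mg <= 0:
        raise ValueError("tissue weight must be positive")
    return tissue_mg / mg_per_ml


def column_load_ml(mg_per_ml=MG_TISSUE_PER_ML, max_mg=MAX_MG_PER_COLUMN):
    """Largest volume (ml) that keeps one column at or under its capacity."""
    return max_mg / mg_per_ml


if __name__ == "__main__":
    placenta_mg = 120  # hypothetical weight, not a real sample
    total = buffer_ml_for_tissue(placenta_mg)
    load = min(total, column_load_ml())
    print(f"pre-measure {total:.1f}ml, load {load:.1f}ml, "
          f"left over {total - load:.1f}ml")
```

With the assumed 120mg placenta this gives 2.4ml of buffer, of which 2ml goes on the column, which matches the "plenty left over" pattern for the heavier samples.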

The next issue is that the powder is so cold that it freezes the buffer. Once the buffer freezes, it forms a shell around the powder, and since it is water-based it has a much higher specific heat than the powder. This means that the buffer shell stays frozen while the powder inside thaws and degrades. This is a huge issue, and the first 5 extractions (I've done 8 so far out of 40) may all be prone to this degradation. To get around it I am now dumping the powder into the buffer instead of pouring the buffer into the tube of frozen powder. It helps a lot, but still isn't 100%. I'm going to run the bioanalyzer tomorrow morning to see how degraded everything is - hopefully not at all.

The miRNA kits are interesting. There are two separate columns: one for tRNA (the "t" here stands for "total", not for "transfer"), and one for miRNA (micro RNA). You run everything through the tRNA column and then take the flow-through and put that onto the miRNA column. Then they tell you to put the tRNA in the fridge and finish purifying the miRNA before going back to finish the tRNA purification. This may make sense if you are more worried about the miRNA, but I really want the tRNA. The miRNA is really just an afterthought - we're not even sure what experiment to do with it, we just thought it might be nice to have in case we think of anything... suggestions are more than welcome. But I wonder if another reason the RIN score was so low for the first sample was that I left the tRNA in the fridge for 30 minutes while I finished the miRNA section. It may be a better idea to put the miRNA column in the fridge and do the tRNA one first. I'm going to have to talk to J~ about this once I run the others through the bioanalyzer and see their RINs.

Monday, September 17, 2012

Experimental Design - Replicate, Randomize, Block

The design I had set up in the last post is no good. The problem is that I am confounding technical replicates and biological replicates and have no way to tease them apart: each lane includes biological replicates, but not technical ones, so I have no idea if a difference between the lanes is due to biology or to technical differences with the sequencing.
Here is my new solution.

I have 8 treatments: 
P. campbelli male (BBM)
P. campbelli female (BBF)
P. campbelli X P. sungorus male (BSM)
P. campbelli X P. sungorus female (BSF)
P. sungorus X P. campbelli male (SBM)
P. sungorus X P. campbelli female (SBF)
P. sungorus male (SSM)
P. sungorus female (SSF)

Each of these has 5 biological replicates (#1-5)

I will use 2 lanes and I have 24 barcodes.

So, after going over the Auer (2010) paper again, here is the new set up:
Lane 1:       Lane 2:
BBM1+2+3      BBM3+4+5
BBF1+2+3      BBF3+4+5
BSM1+2+3      BSM3+4+5
BSF1+2+3      BSF3+4+5
SBM1+2+3      SBM3+4+5
SBF1+2+3      SBF3+4+5
SSM1+2+3      SSM3+4+5
SSF1+2+3      SSF3+4+5

This is 24 individuals per lane, meaning I have enough barcodes. And, most importantly, there is a technical replicate for each treatment --> Biological Replicate #3 appears in both lanes. This way I can test for differences between the lanes (technical replicates) and ascribe those differences to the sequencing platform while still parsing out the biological variance as well.
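As a quick sanity check, the layout above can be written out and counted in Python. This is a minimal sketch, assuming the treatment codes and replicate numbers listed in this post; the variable and function names are just for illustration.

```python
# Sketch: build the two-lane layout and check barcode and replicate counts.
TREATMENTS = ["BBM", "BBF", "BSM", "BSF", "SBM", "SBF", "SSM", "SSF"]
LANE_REPLICATES = {1: [1, 2, 3], 2: [3, 4, 5]}  # replicate 3 is in both lanes
N_BARCODES = 24


def build_lanes():
    """Return {lane: [sample labels]} such as 'BBM1' or 'SSF5'."""
    return {
        lane: [f"{t}{r}" for t in TREATMENTS for r in reps]
        for lane, reps in LANE_REPLICATES.items()
    }


lanes = build_lanes()
# Samples sequenced in both lanes serve as the technical replicates.
shared = sorted(set(lanes[1]) & set(lanes[2]))

if __name__ == "__main__":
    for lane, samples in lanes.items():
        print(f"lane {lane}: {len(samples)} samples "
              f"(barcodes available: {N_BARCODES})")
    print("in both lanes:", ", ".join(shared))
```

Each lane comes out at exactly 24 samples, and the 8 samples shared between lanes are replicate #3 of every treatment.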


Auer, P. L. and R. W. Doerge. 2010. Statistical Design and Analysis of RNA Sequencing Data. Genetics 185:405–416.

Saturday, September 15, 2012

Samples for RNAseq

One of the problems with past RNAseq studies of imprinting (such as the two Gregg (2010) Science papers; see Proudhon (2011) and DeVeale (2012)) was an inadequate sample size and an incorrect experimental set-up. This led to a large number of false positives. To avoid these mistakes I am going to use 40 samples, 10 from each genotype (5 from each sex). Here is the set up:

P. campbelli                  5 female, 5 male
P. campbelli X P. sungorus    5 female, 5 male
P. sungorus X P. campbelli    5 female, 5 male
P. sungorus                   5 female, 5 male

I will be able to sequence 20 samples per lane, meaning that I can't split it perfectly equally between the lanes. In order to avoid batch effects (Auer, 2010), I will use the following set up:

Lane 1:

P. campbelli                  3 female, 2 male
P. campbelli X P. sungorus    3 female, 2 male
P. sungorus X P. campbelli    3 female, 2 male
P. sungorus                   3 female, 2 male



Lane 2:

P. campbelli                  2 female, 3 male
P. campbelli X P. sungorus    2 female, 3 male
P. sungorus X P. campbelli    2 female, 3 male
P. sungorus                   2 female, 3 male
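The split above is easy to tally in Python. This is a rough sketch; the two-letter genotype codes (BB, BS, SB, SS) are borrowed from the sample IDs, and the function names are hypothetical.

```python
# Sketch: tally the 3F/2M vs 2F/3M split across the two lanes.
from collections import Counter

GENOTYPES = ["BB", "BS", "SB", "SS"]
SPLIT = {1: {"F": 3, "M": 2}, 2: {"F": 2, "M": 3}}  # per genotype, per lane


def lane_counts():
    """Return {lane: Counter keyed by (genotype, sex)}."""
    counts = {}
    for lane, per_sex in SPLIT.items():
        tally = Counter()
        for genotype in GENOTYPES:
            for sex, n in per_sex.items():
                tally[(genotype, sex)] += n
        counts[lane] = tally
    return counts


def totals_by_sex(tally):
    """Collapse a (genotype, sex) Counter down to counts per sex."""
    out = Counter()
    for (_, sex), n in tally.items():
        out[sex] += n
    return out


if __name__ == "__main__":
    for lane, tally in lane_counts().items():
        by_sex = totals_by_sex(tally)
        print(f"lane {lane}: {sum(tally.values())} samples, "
              f"{by_sex['F']} female, {by_sex['M']} male")
```

Each lane holds 20 samples with 5 from every genotype, but lane 1 ends up with 12 females and 8 males and lane 2 with the reverse, so sex is not perfectly balanced across lanes.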



Finally, each of the 5 female and 5 male placentas will be brother-sister pairs from 5 different litters. This is the ideal; I'll have to do some more sex-typing to see if I can actually make it work - stay tuned. I will definitely ensure that the 5 females are from 5 different litters (and the same for the males), but whether all those male-female pairs are brother/sister may be tricky. I am currently debating whether the 3 females in lane 1 should be sisters to the 3 males in lane 2, or whether there should be a mix. I need to go over the Auer (2010) paper again to see if I can glean any more insight from it.

#################################################################################
As for the exact brother-sister pairs:

P. campbelli:
1. BB15.3M    BB15.4F
2. BB77.2M    BB77.1F
3. BB86.2M    BB86.1F
4. no male yet  BB87.1F   ---> male BB87.6M
5. no male yet  BB90.1F   ---> male BB77.3M

Average placenta weight of these individuals is: 0.11783g
Overall average of P. campbelli  is:                     0.11122g
quite close-this should be fine.

I need to sex-type more offspring from families BB86 and BB90. There are two more unknowns from 86 and one from 90. Hopefully they will be males, otherwise I'll have to use a brother of 15.1M, 77.2M or 86.2M which would work, but is not ideal.

#################################################################################
P. campbelli X P. sungorus: 
1. BS70.3M     BS70.4F
2. BS72.2M     BS72.1F
3. BS73.3M     BS73.1F
4. BS71.3M     BS70.2F
5. BS11.1M     BS73.2F

You'll notice that for pairs 4 and 5 the individuals are not siblings (what is worse, the females are sibs of the animals in pairs 1 and 3). There is nothing else I can do here short of setting up more crosses. I'll consider this, but I need to do this sequencing sooner rather than later and I don't think that this little bit of perfectionism is worth the wait. I'll talk to J~ and see what he says.

Average placenta weight of these samples: 0.12916g
Overall average for these hybrids:               0.10948g
Not as close as before, but not awful.
#################################################################################
P. sungorus X P. campbelli:
1. SB20.2M    SB20.6F
2. SB22.4M    SB22.5F
3. SB25.5M    SB25.3F
4. SB29.4M    SB29.1F
5. SB20.8M    SB24.2F

Pair 5 are not sibs, but unless I set up more crosses this is the best I can do.
Average of these samples:             0.47890g
Overall average for these hybrids: 0.39388g
Pretty good.

#################################################################################
P. sungorus:         
1. SS82.1M     SS82.2F
2. SS87.1M     SS87.3F  
3. SS88.1M     SS88.3F
4. SS89.2M     SS89.1F
5. SS91.1M     no female yet ----> female SS91.4F
6. no male yet  SS93.2F        ----> don't use these

I will sex-type the other individuals in SS91 and SS93. If I can find a female from 91 or a male from 93 I will be set. Otherwise, I will use the ones shown above.

Average of these samples:    0.14113g
Overall average of SS:       0.14757g
Great.
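To put the four weight comparisons above on the same scale, here is a small Python sketch that expresses each chosen-sample average as a percent difference from the overall average for that cross. The averages are copied from this post; the function names are made up for the illustration.

```python
# Sketch: percent difference between chosen samples and the overall average.
# Values are in grams: (average of chosen samples, overall average).
AVERAGES_G = {
    "P. campbelli (BB)": (0.11783, 0.11122),
    "P. campbelli X P. sungorus (BS)": (0.12916, 0.10948),
    "P. sungorus X P. campbelli (SB)": (0.47890, 0.39388),
    "P. sungorus (SS)": (0.14113, 0.14757),
}


def percent_difference(sample_mean, overall_mean):
    """Signed difference of the sample mean from the overall mean, in %."""
    return 100.0 * (sample_mean - overall_mean) / overall_mean


def summarize(averages=AVERAGES_G):
    """Return {cross: percent difference} for every cross."""
    return {
        cross: percent_difference(sample, overall)
        for cross, (sample, overall) in averages.items()
    }


if __name__ == "__main__":
    for cross, pct in summarize().items():
        print(f"{cross}: {pct:+.1f}% relative to the overall average")
```

On these numbers the chosen samples are roughly 6% heavier (BB), 18% heavier (BS), 22% heavier (SB) and 4% lighter (SS) than the overall average for their cross.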

#################################################################################
Now off to sex-type, updates soon...
Here's the results:
lane  sample        sex
1     Ladder
2     BB87.5        male !!
3     BB87.6        male !!
4     BB90.3        female*
5     SS91.3        female !!
6     SS91.4        female !!
7     SS91.5        male
8     SS93.3        female
9     SS93.4        male !!
10    SS93.5        male !!
11    Pos (male)
12    Neg (female)
13    Neg (dH2O)
14    Ladder

!! These are the sexes I needed. I'll probably use BB87.6M and SS91.4F (and not SS93.2F and SS93.4M). Finally, to round out the P. campbelli I will use BB77.3M.

*this sample seemed very degraded, won't use for RNAseq regardless, but it may be "female" due to failure of the PCR (rather than no Y chromosome) as there was very little DNA in the extraction...


#################################################################################
Auer, P. L. and R. W. Doerge. 2010. Statistical Design and Analysis of RNA Sequencing Data. Genetics 185:405–416.

DeVeale, B., D. van der Kooy and T. Babak. 2012. Critical Evaluation of Imprinted Gene Expression by RNA–Seq: A New Perspective. PLoS Genet 8:e1002600.

Gregg, C., J. Zhang, B. Weissbourd, S. Luo, G. P. Schroth, D. Haig and C. Dulac. 2010. High-Resolution Analysis of Parent-of-Origin Allelic Expression in the Mouse Brain. Science 329:643–648.

Gregg, C., J. Zhang, J. E. Butler, D. Haig and C. Dulac. 2010. Sex-Specific Parent-of-Origin Allelic Expression in the Mouse Brain. Science 329:682–685.

Proudhon, C. and D. Bourc'his. 2011. Identification and resolution of artifacts in the interpretation of imprinted gene expression. Briefings in Functional Genomics 9:374–384.

Friday, September 14, 2012

RNAseq and Overgrown hamsters, experimental setup

Yesterday I ordered all the reagents that I need to prep the libraries. I'm super excited about it, especially since some arrived this morning. I'll be using a protocol designed by Sultan et al. (Sultan 2012) which maintains the strand specificity of the sequences. This is important because most imprinted transcripts lie in clusters in the genome (Edwards 2007; Reik 1997) which seem to be more gene-dense than the rest of the genome and can have overlapping transcripts on the + and - strands (Ideraabdullah, 2008). Maintaining strand-specificity will allow me to keep overlapping genes separate in my analysis.
The Sultan (2012) protocol to maintain strand specificity is pretty slick. Here's how it goes:

1. Extract total RNA (tRNA). I'll use an Omega miRNA kit for this and also a genomic DNA kit (in case I want to go back later and look at methylation patterns of the DNA). As placenta is a fairly heterogeneous tissue, I need to use the entire sample to ensure an even sampling of each of the three main layers. This means that I need to collect archival-quality DNA and RNA, as I won't be able to go back later if I decide I need something extra. I will grind it in liquid nitrogen and take some of the homogenate for the RNA kit and some for the DNA kit.

2. Purify the mRNA. There are many types of RNA in a tissue. I am only interested in messenger RNAs (mRNA), which will all have a poly-A tail. To purify the mRNA from the tRNA I will use magnetic beads that have a poly-T oligo attached to them. These will bind to the mRNAs and stick magnetically to the plate, keeping my mRNA from being washed away with all the rest of the material. Then I will elute the mRNA from the beads and collect it for the next step.

3. Fragment the mRNA. mRNAs are much too long to sequence on the Illumina machine, so I need to break them up. I'll use a solution that Illumina provides to fragment the RNA.

4. First-strand synthesis. We do not have the technology to sequence RNA directly, so to get around this we convert the RNA into DNA and then sequence that. DNA that came from RNA is called complementary DNA (cDNA). As DNA is double stranded, this takes two steps: first-strand synthesis (with a reverse transcriptase) and second-strand synthesis (with a regular polymerase).

5. Second-strand synthesis. This is the first step where the Sultan (2012) protocol differs from the standard Illumina HiSeq protocol. Sultan (2012) calls for using dUTP instead of dTTP when forming the second DNA strand. This way I can later use an enzyme that chops up DNA at each "U" to cut away the second strand, leaving only the actual strand I am interested in - super clever.

6. End repair. After the mRNA-->cDNA conversion there are a lot of overhangs. Here I chop off all the extra so that each fragment has blunt ends.

7. Adenylate 3' ends. In order to add on the adapters (necessary so that the Illumina machine can sequence my cDNA library) I need an overhanging "A" on each 3' end. Here I add that "A".

8. Adapter ligation. Here I will barcode and add the adapters onto each of my sequences. I'll be splitting my 40 samples into 2 lanes of HiSeq, so I need 20 different barcodes. Here I will also have to make sure not to introduce batch effects into my design (Auer, 2010). I'll probably have another post entirely about this.

9. U-digestion. Here I will digest the cDNA strand with "U"s in it so that I am left with only DNA oriented in the correct way. I will of course still amplify the library in the next step, but the digested strand will not have the right adapters for the Illumina platform, so while remnants of it will be present, they won't be sequenced. Clever, clever, clever.

10. Enrich DNA fragments. Here I will PCR the library so that I have many copies of each molecule. This way there will be plenty for the Illumina machine to sequence.

11. Pool libraries. I will combine 20 of the samples for one lane and the other 20 for the other lane, then send them off for sequencing.

So in short, I can keep almost the entire Illumina HiSeq kit as per usual, with a couple minor changes to maintain strand specificity.
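The strand-marking idea in steps 4, 5 and 9 can be illustrated with a toy Python model. This is only a sketch on plain strings: the function names are hypothetical, it ignores fragmentation, adapters and real enzyme behavior, and it treats any strand containing a "U" as fully removed by the digestion.

```python
# Toy model of dUTP strand marking; not a model of the actual chemistry.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand(mrna):
    """First-strand cDNA: reverse complement of the mRNA, in DNA letters."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(first, use_dutp=True):
    """Second strand copied from the first; with dUTP, U is used for T."""
    pairing = {"A": "U" if use_dutp else "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(first))


def u_digest(strands):
    """Keep only strands without any U (the marked strand is removed)."""
    return [s for s in strands if "U" not in s]


if __name__ == "__main__":
    mrna = "AUGGCC"  # made-up fragment for illustration
    first = first_strand(mrna)
    second = second_strand(first)
    print("first strand: ", first)
    print("second strand:", second)
    print("after U-digestion:", u_digest([first, second]))
```

In this toy example only the first strand survives the digestion, so every remaining molecule has a known orientation relative to the original mRNA. A second strand copied from a first strand with no "A" would contain no "U" and would slip through here, which is a limitation of the toy model rather than a statement about the protocol.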



You may have noticed my in-text citations. That's because I recently bought the program Papers and am now testing it out in Chrome. It's nice, by the way, especially for Pages, which I've recently started using. However, it looks like it won't format the bibliography directly from in-text citations in Chrome, though it will format them properly if you go through and select each reference you want to work with.

Auer, P. L. and R. W. Doerge. 2010. Statistical Design and Analysis of RNA Sequencing Data. Genetics 185:405–416.

Ideraabdullah, F. Y., S. Vigneau and M. S. Bartolomei. 2008. Genomic imprinting mechanisms in mammals. Mutat. Res. 647:77–85.

Edwards, C. A. and A. C. Ferguson-Smith. 2007. Mechanisms regulating imprinted genes in clusters. Current Opinion in Cell Biology 19:281–289.

Reik, W. and E. R. Maher. 1997. Imprinting in clusters: lessons from Beckwith-Wiedemann syndrome. Trends in Genetics 13:330–334.

Sultan, M., S. Dökel, V. Amstislavskiy, D. Wuttig, H. Sültmann, H. Lehrach and M.-L. Yaspo. 2012. A simple strand-specific RNA-Seq library preparation protocol combining the Illumina TruSeq RNA and the dUTP methods. Biochemical and Biophysical Research Communications 422:643–646.