RumraketR
Posts: 6 Joined: Nov. 2012

Quote (Jerry Don Bauer @ Nov. 19 2012,16:37)  Comparing the genome to computer data storage. In order to represent a DNA sequence on a computer, we need to be able to represent all 4 base pair possibilities in a binary format (0 and 1). These 0 and 1 bits are usually grouped together to form a larger unit, with the smallest being a “byte” that represents 8 bits. We can denote each base pair using a minimum of 2 bits, which yields 4 different bit combinations (00, 01, 10, and 11). Each 2bit combination would represent one DNA base pair. A single byte (or 8 bits) can represent 4 DNA base pairs. In order to represent the entire diploid human genome in terms of bytes, we can perform the following calculations:
6×10^9 base pairs/diploid genome x 1 byte/4 base pairs = 1.5×10^9 bytes or 1.5 Gigabytes, about 2 CDs worth of space! 
http://bitesizebio.com/article....genome
is 1.5 Gigabytes more than 500 bits? Then why would we want to go any further than this as you already have the answer before you start.
ANY organism will be over 500 bits.[/quote]

Hello everyone, I've been a lurker here for a few years now, and I just have to respond because this could be historic stuff.
I want to make sure I understand you correctly here, Jerry Don Bauer. According to what I have quoted, you seem to be saying that the quantity of information in a string of symbols is equal to the length of the string divided by the number of possible symbols at each locus. That is, the information content is measured in bits and is thus proportional to the length of the sequence?
You refer to the example of a 6 billion base pair diploid genome, divided by the number of possibilities per site (4):
6×10^9 base pairs/diploid genome x 1 byte/4 base pairs = 1.5×10^9 bytes or 1.5 Gigabytes, about 2 CDs worth of space!
In other words, the information content of a sequence of DNA, for example 12 base pairs in length, ATGAATATGTTA, is equal to 12 base pairs x 1 byte/4 base pairs = 3 bytes.
Am I correct in my understanding here?
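
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the 2-bits-per-base storage calculation from the quote. The code table (A=00, C=01, G=10, T=11), the function names, and the 12-base test string (written with T throughout) are my own assumptions for illustration, not anything taken from the quoted article. It only counts storage size at a fixed 2 bits per base.

```python
BITS_PER_BASE = 2   # four possible bases per site, so 2 bits each
BITS_PER_BYTE = 8

# Hypothetical code table; any fixed assignment of the four 2-bit codes would do.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def storage_bytes(num_bases: int) -> int:
    """Bytes needed to store num_bases at 2 bits per base, rounded up."""
    return (num_bases * BITS_PER_BASE + BITS_PER_BYTE - 1) // BITS_PER_BYTE


def pack(sequence: str) -> bytes:
    """Pack a DNA string four bases per byte, first base in the high bits."""
    out = bytearray()
    for i in range(0, len(sequence), 4):
        chunk = sequence[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Left-align a final partial chunk so it still occupies one byte.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


example = "ATGAATATGTTA"  # 12 bases, illustrative only
packed = pack(example)
print(len(example), "bases ->", len(packed), "bytes")
print(storage_bytes(6 * 10**9), "bytes for a 6x10^9 base pair diploid genome")
```

With these assumptions the 12-base string packs into 3 bytes (24 bits), and the 6×10^9 base pair figure comes out to 1.5×10^9 bytes, matching the numbers in the quote.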
