Kristine
Posts: 3061 Joined: Sep. 2006
|
Ye gonads, guys, I had no idea that biological databases were so darned complex! Reading this stuff is like curling up by a warm fire, kitty on your lap, with a volume of Chemical Abstracts: Quote | The phrase biology databases is deliberately vague so as to cover a wide range of data types. There is no good estimate as to the number of databases that are publicly accessible and have some aspect of molecular biology/genomics in their contents, but it is easily double the number included in the Nucleic Acids Research database issue. The 2003 issue included approximately 400 titles, so there could easily be 800 to 1,000 databases. While numerous, the databases do cluster based on the type of data they include, or by some other scheme, such as organization, institution, or species. Various articles about these resources have generated categorizations.
Another resource sorted them into: biological literature, sequences, expression, protein interaction measurements, and metabolic expression (Marcotte and Date 2001). Another variation is: pathway, genome, protein, enzyme, chemical, and literature. Some of the databases have print counterparts; most are purely electronic. There are hybrid databases, combining data from multiple sources. KEGG, the Kyoto Encyclopedia of Genes and Genomes, is an example of such a combination. “KEGG is a suite of databases and associated software, integrating our current knowledge on molecular interaction networks in biological processes, the information about the universe of genes and proteins, and the information about the universe of chemical compounds and reactions” (http://www.genome.ad.jp/kegg/kegg.html).
Other databases are subsets from the larger databases with local value-added content. This article focuses on four core database examples: sequence, microarray, protein, and literature databases. The last example needs no explanation for librarians. The sequence databases are the next easiest to explain–they contain DNA sequences and documentation on how that sequence was created, and often, links to articles and other information related to that sequence. The microarray data come from experiments looking at the ‘interdependence of genes’ (Hung and Kim 2000). The protein data come from both experiments and computational modeling of protein sequences and their structures. |
Got all that? Okay, what kills me (and I attended a workshop on this last July at ALA, and almost had my mind stretched beyond the point of snapping back) is the fact that biologists don't search the literature necessarily - they search the abstracts and citations. It's called data mining, and there is such a thing as a data (or database) curator. Quote | With many large sequence datasets completed, the research has moved to the next phase. As touched on in the introduction, this sequence of effort is unusual in the life sciences world of hypothesis-driven experiments. In the classical experiment-based process, a scientist generates a hypothesis, devises an experiment, collects data, and analyses those data in order to determine, with some level of statistical confidence, whether the null hypothesis can be rejected. Then one more bit of knowledge enters the discipline. But much of the current data collection runs independent of any experiment driven by any hypothesis. The analysis of these data is referred to as in silico biology, or hypothesis free. The former term is considerably less assumptive than the latter. The goal is, as it has always been, to extract significance from the data: an action, a function, a role in a pathway. But the approach must differ from classical methods simply because there is so much data. The only way to extract any sense from them is to apply highly computational approaches. |
(Chiang, Katherine S. (2004) 'Biology Databases for the New Life Sciences', Science & Technology Libraries, 25:1, 139-170) Yes, I am reading this article for my class (Reference Sources in the Sciences).
I think I need to see these search algorithms in action before I completely get it.
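For what it's worth, here is a toy sketch of what "searching the abstracts" can look like under the hood. It is not how any real biology database does it; it is just a plain inverted index (word -> records containing it) with an AND query on top. The three abstracts and the names `ABSTRACTS`, `build_index`, and `search` are all made up for illustration.

```python
import re
from collections import defaultdict

# Made-up records purely for illustration; these are NOT real citations.
ABSTRACTS = {
    "rec1": "Microarray analysis reveals interdependence of genes in yeast.",
    "rec2": "Protein structure prediction from sequence by computational modeling.",
    "rec3": "A pathway database linking genes, enzymes and chemical compounds.",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(records):
    """Map each word to the set of record ids whose abstract contains it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in tokenize(text):
            index[word].add(rec_id)
    return index

def search(index, query):
    """Return sorted ids of records containing every query word (AND search)."""
    words = tokenize(query)
    if not words:
        return []
    # Intersect the posting sets; a word missing from the index yields no hits.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

INDEX = build_index(ABSTRACTS)
print(search(INDEX, "genes"))
print(search(INDEX, "genes pathway"))
```

The point is that the searcher never reads the abstracts at query time; it only looks up each word's set of records and intersects them, which is why this kind of search scales to very large collections.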
-------------- Which came first: the shimmy, or the hip?
AtBC Poet Laureate
"I happen to think that this prerequisite criterion of empirical evidence is itself not empirical." - Clive
"Damn you. This means a trip to the library. Again." -- fnxtr
|