Some features of the organization of the human genome
Only 1.1 to 1.4% of the sequenced genome consists of sequences that encode proteins.
One of the most important parts of the Human Genome Project is the detection of genes in the sequenced genome. The positions of many genes are determined by matching mRNA sequences against the genomic sequence; for other genes, the location is established using special computer programs.
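The idea of matching an mRNA against the genomic sequence can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions: the mRNA is written in DNA letters, the match is exact, and each exon is at least `min_block` bases long. The function name `locate_exons` and the example sequences are invented for this sketch; real programs use spliced alignment that tolerates mismatches.

```python
def locate_exons(genome, mrna, min_block=4):
    """Greedily split mrna into blocks that occur, in order, in genome.

    Returns a list of (start, end) genome coordinates (0-based, end exclusive).
    """
    blocks = []
    g_from = 0   # search the genome only after the previous block
    m_pos = 0    # how much of the mRNA has been placed so far
    while m_pos < len(mrna):
        best = None
        # Try the longest remaining piece first, then shorter ones.
        for end in range(len(mrna), m_pos + min_block - 1, -1):
            hit = genome.find(mrna[m_pos:end], g_from)
            if hit != -1:
                best = (hit, hit + (end - m_pos))
                break
        if best is None:
            raise ValueError("mRNA cannot be placed on this genomic sequence")
        blocks.append(best)
        g_from = best[1]
        m_pos += best[1] - best[0]
    return blocks


# Hypothetical example: two exons separated by an intron.
exon1, intron, exon2 = "ATGGCC", "GTAAGTCCCAG", "TTCGAATAA"
genome = "TTT" + exon1 + intron + exon2 + "GGG"
mrna = exon1 + exon2
exon_blocks = locate_exons(genome, mrna)
```

The gaps between consecutive blocks correspond to introns, which is why the mRNA cannot simply be searched for as one continuous string.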
The search for genes in the sequence also involves identifying nucleotide sequences that are similar to the sequences of genes in other species, the so-called orthology-based approach. This is done for genomic or mRNA sequences using the BLAST computer program. Another way of searching for genes is to identify paralogs (family members that have arisen as a result of gene duplication). Other methods of searching for genes in sequenced human DNA have also been used.
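Similarity searches of this kind typically begin by finding short exact word matches between a query and a database sequence and then extending them. The sketch below shows only that first seeding step, not the BLAST program itself; the function name `kmer_seeds`, the word length `k=4`, and the sequences are assumptions chosen for illustration.

```python
from collections import defaultdict


def kmer_seeds(query, subject, k=4):
    """Return (query_pos, subject_pos) pairs where a k-letter word matches exactly."""
    # Index every k-letter word of the subject by its start position.
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)

    # Look up every k-letter word of the query in that index.
    seeds = []
    for q in range(len(query) - k + 1):
        for s in index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds


# Hypothetical sequences sharing the word "ACGT".
seeds = kmer_seeds("ACGTAC", "TTACGTTT", k=4)
```

A full search would extend each seed in both directions and score the resulting alignment; sequences from another species that score well are candidate orthologs, and those from the same genome are candidate paralogs.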
The public Human Genome Project estimated the number of protein-coding genes at 31,000; sequencing results have so far revealed fewer than 20,000 genes. Among the genes identified are 740 RNA genes, which do not encode proteins, and more such genes will probably be detected. Yeast has 6,000 coding genes, the fruit fly 13,000, and plants 26,000. The question of what accounts for the complexity of human organization compared with other, more simply organized organisms therefore remains open.
Gene density in the human genome is significantly lower than in other species. Unfortunately, computational methods for predicting genes from sequencing results are still very inaccurate.
Only 94 of the 1,278 protein families in the human genome are specific to vertebrates. The main difference between humans and yeast or flies is the more complex organization of human proteins, which shows in a larger number of domains per protein and in new combinations of domains. Some genes appear to have been acquired directly from bacteria as a result of so-called horizontal gene transfer; that is, a bacterial genome could have served as a direct donor of genes for vertebrates.
Two types of vertebrate-specific genes can be distinguished: on the one hand, neuronal genes, blood coagulation genes, and genes of the acquired immune response; on the other, genes that improve the accuracy of intracellular control (genes for intra- and intercellular signaling, programmed cell death, and control of gene transcription).
The results of sequencing the human genome stimulated the detection of single nucleotide polymorphisms (SNPs), which are expected to be used to map genes that predispose to common multifactorial diseases.
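In the simplest case, a single nucleotide polymorphism shows up as a position at which two aligned copies of the same region differ by one base. The following is a minimal sketch, assuming the two sequences are already aligned and of equal length; the function name `find_snps` and the sequences are invented for illustration.

```python
def find_snps(reference, sample):
    """List (position, reference_base, sample_base) for every differing position.

    Positions are 1-based. Both sequences must already be aligned and have
    the same length; insertions and deletions are not handled.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    ]


# Hypothetical aligned fragments differing at one position.
snps = find_snps("ACGTACGT", "ACGTTCGT")
```

In practice a difference counts as a polymorphism only if it recurs in a population rather than arising from a sequencing error, which this sketch does not check.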
The detection of genes for hereditary diseases has been greatly facilitated by the draft sequence, which is accessible to all researchers via the Internet. Candidate genes can be identified by their position in the genome using computer programs that work directly on the sequence database; this is followed by screening for mutations, supplemented with information on the structure of the gene. More than 30 genes for hereditary diseases have been found in this way.
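Selecting candidate genes by position amounts to listing the annotated genes that overlap the chromosomal interval to which a disease has been mapped. The sketch below assumes a small in-memory annotation table; the `Gene` record, the function `candidate_genes`, and the gene names and coordinates are all hypothetical and do not come from any real database.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # first base of the gene, inclusive
    end: int    # last base of the gene, inclusive


def candidate_genes(genes, chrom, start, end):
    """Return names of genes on chrom that overlap the interval [start, end]."""
    return [
        g.name
        for g in genes
        if g.chrom == chrom and g.start <= end and g.end >= start
    ]


# Hypothetical annotation and a hypothetical mapped interval on chr7.
annotation = [
    Gene("GENE_A", "chr7", 100, 500),
    Gene("GENE_B", "chr7", 900, 1500),
    Gene("GENE_C", "chr1", 100, 500),
]
hits = candidate_genes(annotation, "chr7", 400, 1000)
```

Each gene returned this way would then be screened for mutations in affected individuals.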
Sequencing is expected to be completed within a relatively short period of time. Particular attention will be paid to creating more sophisticated computer programs for identifying not only genes but also their regulatory regions. The success of these two lines of research will apparently depend on the successful sequencing of the genomes of other higher animals. A catalog of human genetic variation will be created on the basis of the human genome sequence.