bankskeron.blogg.se

Covid 19 genome sequence analysis

Computational Analysis of COVID-19 genome

This is a project based on the complete genome analysis of the COVID-19 (SARS-CoV-2) virus, taken from the Wuhan-Hu-1 isolate sample. I cleaned the genome sample to obtain an RNA sequence and verified the number of base pairs in the virus. Using the concept of Kolmogorov complexity, I set out to find how small a compressed version of the genome can get. A real compressor can only give an upper bound on the Kolmogorov complexity, since the true minimum cannot be computed. I was able to compress the genome into an 8.412 KB file using the "LZMA" algorithm.

Then I converted the RNA sequence into a DNA string in order to apply the concept of "codons". This helped me identify the 20 standard amino acids that are used to express the genome as a protein sequence. Further, I made a decoder to translate the genome into its reading-frame sequences. With the help of these reading-frame sequences, I was able to extract the polypeptides and long-chain polypeptides encoded by the virus. Then, I analyzed the Open Reading Frames (ORFs) of the SARS-CoV-2 virus, which code for 10 different proteins involved in the synthesis and catalytic processes of the virus in the human body. At last, I was able to verify the lengths of these 10 proteins (including ORF1a, ORF1b, Spike Glycoprotein, Membrane, ORF6, ORF7a, ORF8 and ORF10), so this project backs the scientific findings using data science concepts.

Kolmogorov complexity of an object or algorithm is the length of its optimal specification. In some sense, it can be thought of as algorithmic entropy, in that it is the amount of information contained in the object. The Kolmogorov complexity of a string x, relative to a Turing machine f, is K_f(x) = min{ |p| : f(p) = x }, that is, the length of the shortest program p that makes f output x. It appears that the complexity depends on both f and x. This is, to some extent, unavoidable, in the sense that the description really depends on the description language. However, if the Turing machine is universal, the complexity of a string differs between such machines only by a constant. The proof follows from the idea of encoding a Turing machine and then the program for that Turing machine. The length of the first part (the encoded Turing machine) does not depend upon the string and is hence a constant. A string x is incompressible if and only if K(x) ≥ |x|.

This project also gave me the opportunity to explore and inspect whether this virus was specially designed in a laboratory or not.
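
The compression step mentioned above can be sketched roughly as follows. This is only an illustration of the idea, not the original notebook code: the function name `compressed_size` and the two toy sequences are made up for the example, and a real run would load the Wuhan-Hu-1 genome instead. The point is that the LZMA output size is a practical upper bound on the Kolmogorov complexity of the sequence, so a highly regular string shrinks far more than a random-looking one.

```python
import lzma
import random


def compressed_size(sequence: str) -> int:
    """Size in bytes of the LZMA-compressed sequence (an upper bound on K)."""
    return len(lzma.compress(sequence.encode("ascii"), preset=9))


# Hypothetical stand-in data, NOT the real genome.
repetitive = "ACGT" * 2500  # 10,000 bases with an obvious pattern

random.seed(0)
random_like = "".join(random.choice("ACGT") for _ in range(10_000))

print("repetitive :", compressed_size(repetitive), "bytes")
print("random-like:", compressed_size(random_like), "bytes")
```

With only four letters in the alphabet, even the random-like string compresses well below its raw size, because each base needs about 2 bits instead of a full byte.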

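
The codon and reading-frame steps described above might look something like the sketch below. It assumes the standard genetic code and uses hypothetical helper names (`rna_to_dna`, `translate`, `polypeptides`); the decoder in the actual project may be organized differently. Each codon of three bases maps to one of the 20 amino acids or to a stop signal (`*`), and a polypeptide is read here as the stretch from a start codon (`M`) to the next stop in one of the three forward reading frames.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rna_to_dna(rna: str) -> str:
    """Rewrite an RNA string in the DNA alphabet (U becomes T)."""
    return rna.upper().replace("U", "T")


def translate(dna: str, frame: int = 0) -> str:
    """Translate one reading frame; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def polypeptides(dna: str, min_length: int = 1) -> list[str]:
    """Collect start-to-stop stretches from the three forward frames."""
    found = []
    for frame in range(3):
        # Drop the last chunk: it is not terminated by a stop codon.
        for chunk in translate(dna, frame).split("*")[:-1]:
            start = chunk.find("M")
            if start != -1 and len(chunk) - start >= min_length:
                found.append(chunk[start:])
    return found


dna = rna_to_dna("AUGUUUGGAUAA")  # toy sequence, not from the genome
print(translate(dna))      # → MFG*
print(polypeptides(dna))   # → ['MFG']
```

Raising `min_length` is one simple way to keep only the long-chain polypeptides and filter out short chance matches.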










