Public access genome sequences. Programmatic download

GenBank hosts a dedicated site for COVID-19 resources here.

Follow the Entrez nucleotide link. A summary of the available sequences appears. At the top right, Send to... lets you download the accession list as a file (File, Accession List). Sorting by sequence length helps pick out whole genomes (more than 29000 bp).
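
The same size criterion can be applied after download, when a multifasta file mixes partial and complete sequences. The following is a minimal sketch in plain Python, not part of the original workflow; the function names and the file name in the usage note are hypothetical, and the 29000 bp threshold is the one given above.

```python
MIN_GENOME_LENGTH = 29000  # threshold from the text: whole genomes exceed 29000 bp


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def whole_genomes(lines, min_length=MIN_GENOME_LENGTH):
    """Return the records whose sequence is longer than min_length."""
    return [(h, s) for h, s in read_fasta(lines) if len(s) > min_length]
```

With a downloaded file, this could be called as `whole_genomes(open("sequences.fasta"))`, where `sequences.fasta` is a placeholder name.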

This one-liner downloads the genomes from the command line and makes use of the EDirect utilities.
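
The one-liner itself is not reproduced here. As an alternative, the sketch below shows how the same download might be done from Python through NCBI's E-utilities `efetch` endpoint, which is the web service behind EDirect's `efetch` command. It is an illustration under assumptions, not the author's command: the function names and the batch size are hypothetical, and the input is assumed to be the accession list file saved from Send to..., one accession per line.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_accessions(lines):
    """Return the non-empty accession numbers, one per input line."""
    return [line.strip() for line in lines if line.strip()]


def efetch_url(accessions):
    """Build an efetch URL that returns the given records as FASTA text."""
    params = {
        "db": "nuccore",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


def batches(items, size=100):
    """Split a list into consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def download_fasta(accessions, batch_size=100, pause=0.4):
    """Fetch all accessions in batches and return one multifasta string.

    The batch size is an arbitrary choice; the pause keeps the request
    rate below NCBI's limit of three requests per second without an API key.
    """
    parts = []
    for batch in batches(accessions, batch_size):
        with urlopen(efetch_url(batch)) as response:
            parts.append(response.read().decode("utf-8"))
        time.sleep(pause)
    return "".join(parts)
```

Nothing is downloaded until `download_fasta` is called, for example with the result of `read_accessions(open("sequence.seq"))`, where `sequence.seq` is a placeholder for the saved accession list.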

Multiple sequence alignments (MSA)

The complete genomes were aligned with PRANK, a tool that claims to produce codon-aware alignments. Obtaining the aligned multifasta file took 620 seconds. The aligned sequences can be visualized in many ways; one of them is PRANK's companion, WASABI. To use it, copy this URL, http://www.egarmo.com/GenBank.msa.fa.best.xml, and paste it into WASABI's graphical interface.
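
Before loading an alignment into a viewer, it can be worth confirming that the multifasta file really is aligned, that is, that every row has the same number of columns. The sketch below is a hypothetical helper in plain Python, not part of PRANK or WASABI; it also reports the fraction of gap characters per sequence.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def alignment_summary(lines):
    """Return (number of sequences, number of columns, gap fraction per header).

    Raises ValueError if the rows differ in length, which means the
    file is not a valid alignment.
    """
    records = list(read_fasta(lines))
    lengths = {len(seq) for _, seq in records}
    if len(lengths) > 1:
        raise ValueError("rows have different lengths: %s" % sorted(lengths))
    columns = lengths.pop() if lengths else 0
    gaps = {
        header: (seq.count("-") / columns if columns else 0.0)
        for header, seq in records
    }
    return len(records), columns, gaps
```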

Phylogenetic tree

WASABI shows a first draft of the phylogenetic tree, but the tree was optimized by feeding the data into BEAST. After a running time of 24 minutes, plus the ancillary programs BEAUti and TreeAnnotator (whose running time is negligible), a phylogenetic tree in NEXUS format was obtained. The file in NEXUS format is also available.

This file can be visualized in iTOL, hosted at embl.de. The toolbar on the left includes zoom in and out and a search tool under the 'Aa' icon. This button pops up a window for entering search terms that are matched against the names of the tree leaves (taxa). For example, entering 'Valencia' highlights the positions of the genomes from Spain uploaded to that database so far. Genomes 1 and 7 cluster together, but far from all the other genomes from Spain. This observation would support the previously reported idea of a possible double entry of SARS-CoV-2 into Spain, or ongoing mutation across the country.
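
The leaf-name search can also be reproduced offline on the NEXUS file. The sketch below is a minimal illustration in Python, assuming the file contains a `Taxlabels` block that lists the taxon names (as NEXUS files with a taxa block do); the function names and the taxon names in the test data are hypothetical. Matching is made case-insensitive here by choice, which may differ from iTOL's behaviour.

```python
import re


def taxon_labels(nexus_text):
    """Return the names listed in the first Taxlabels block, or [] if absent."""
    match = re.search(
        r"\btaxlabels\b(.*?);", nexus_text, flags=re.IGNORECASE | re.DOTALL
    )
    if match is None:
        return []
    # A label is either a single-quoted string or a run of non-space characters.
    tokens = re.findall(r"'[^']*'|\S+", match.group(1))
    return [token.strip("'") for token in tokens]


def search_leaves(nexus_text, term):
    """Return the taxon names that contain `term`, ignoring case."""
    term = term.lower()
    return [label for label in taxon_labels(nexus_text) if term in label.lower()]
```

Given the text of the tree file, `search_leaves(text, "Valencia")` lists the leaves that the 'Aa' search would be expected to highlight.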

Entropy track

COMING SOON