Shasta

author Charles Plessy <charles.plessy@oist.jp>

Thu, 17 Sep 2020 00:35:25 +0000 (09:35 +0900)

committer Charles Plessy <charles.plessy@oist.jp>

Thu, 17 Sep 2020 00:35:25 +0000 (09:35 +0900)
diff --git a/biblio/32686750.mdwn b/biblio/32686750.mdwn

new file mode 100644 (file)

index 0000000..5b53d3c
--- /dev/null
+++ b/biblio/32686750.mdwn
@@ -0,0 +1,10 @@
+[[!meta title="Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes."]]
+[[!tag assembly method software]]
+
+Shafin K, Pesout T, Lorig-Roach R, Haukness M, Olsen HE, Bosworth C, Armstrong J, Tigyi K, Maurer N, Koren S, Sedlazeck FJ, Marschall T, Mayes S, Costa V, Zook JM, Liu KJ, Kilburn D, Sorensen M, Munson KM, Vollger MR, Monlong J, Garrison E, Eichler EE, Salama S, Haussler D, Green RE, Akeson M, Phillippy A, Miga KH, Carnevali P, Jain M, Paten B.
+
+Nat Biotechnol. 2020 Sep;38(9):1044-1053. doi:10.1038/s41587-020-0503-6
+
+Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes.
+
+[[!pmid 32686750 desc="Runs entirely in memory (no disk I/O) and requires terabyte-scale memory for a human genome.  Designed for Nanopore data.  Reads are run-length encoded before assembly.  Assemblies are more fragmented, but have fewer disagreements with the reference.  The estimated running cost is lower than for competing assemblers."]]
diff --git a/tags/assembly.mdwn b/tags/assembly.mdwn

index 259f50eb977a33b1df0b8881195a914013c2bd2f..6ff694351064c127364e954e1624d9d7bea77fb6 100644 (file)
--- a/tags/assembly.mdwn
+++ b/tags/assembly.mdwn
@@ -15,6 +15,14 @@ contings and increase their accuracy.  (The predecessor of Flye, ABruijn, was
  reported by [[Istace and coll. (2017)|biblio/28369459]] to be able to assemble
  mitochondrial genomes, unlike Canu and other assemblers.)
  
+The Shasta assembler ([[Shafin and coll., 2020|biblio/32686750]]) is designed for
+Nanopore data. Reads are run-length encoded before assembly, to mitigate the
+impact of errors in homopolymer tracts.  The assembly runs entirely in memory;
+it needs terabyte-scale memory for a human genome, but as a consequence it runs
+fast. Shasta assemblies tend to be more fragmented, but have fewer disagreements
+with the reference.  Shasta also comes with polishing modules similar to Racon
+and Medaka, but reported to be faster.
+
  After assembly, the contigs can be further polished with Racon ([[Vaser, Sović,
  Nagarajan and Šikić, 2017|biblio/28100585]]).
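
The note added to `tags/assembly.mdwn` says that reads are run-length encoded to mitigate errors in homopolymer tracts. As a rough illustration of that idea only (this is not Shasta's code; the function names and the toy reads are hypothetical), the Python sketch below collapses each homopolymer run to a single base plus a count, so that two reads differing only in the length of a run become identical in the collapsed space:

```python
from itertools import groupby


def run_length_encode(seq):
    """Return a list of (base, run_length) pairs, one per homopolymer run."""
    return [(base, sum(1 for _ in run)) for base, run in groupby(seq)]


def collapsed(seq):
    """Return the sequence with every homopolymer run reduced to one base."""
    return "".join(base for base, _ in run_length_encode(seq))


# Hypothetical reads: read_b lost one T from the homopolymer run in read_a,
# a typical Nanopore error.
read_a = "GATTTTACA"
read_b = "GATTTACA"

# The raw reads disagree, but their collapsed forms are identical, so an
# overlap computed on collapsed sequences is not disturbed by the error.
assert read_a != read_b
assert collapsed(read_a) == collapsed(read_b)
```

The run lengths are kept separately in the encoded pairs, so under this sketch the original sequence can still be rebuilt once the counts have been reconciled.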