Evolutionary origin of genomic structural variations in domestic yaks.
-[[!pmid 37726270 desc="“assemblies were constructed for 6 wild and 15 domestic yaks” “we used a uniform standard pipeline to annotate these 47 bovine genomes. We identified an average of 24,368 protein-coding genes for each assembly” “1,048,639 high-confidence SNPs were detected and used in phylogenetic analyses with the water buffalo genome as outgroup” “We further constructed a species tree on the basis of 8428 single-copy core genes through selecting one representative individual of each species” “Pangenomes were constructed for yaks and cattle and a super-pangenome for the 7 Bovini species. For the yak pangenome the total gene set approached saturation at n = 20. The percentages of core (present in all 22 genomes), near-core (present in 20–21 genomes) and variable (found in 1–19 genomes) gene families were 50.18, 10.91, 38.91%, respectively.” “ In pairwise comparisons of the assemblies constituting the pangenome, each assembly possessed 123 to 2113 genes not present in the other genome” we constructed a multi-assembly graph-based genome of the 47 genomes used in the phylogenomic and super-pangenome analyses. This comprised 3.14 gigabases (Gb) spread across 5,449,222 nodes (the number of fragments of sequences) and connected by 4,889,530 edges (the connections between nodes), with non-reference nodes spanning 387.0 Mb. 
The core (shared by all genomes), near-core (in 46 or 45 genomes) and variable nodes (in 44 or less samples) accounted for 60.8, 17.0, 22.2% of all nodes.” “We detected SVs ( ≥ 50 bp) in the graph-based genome using the bubble popping algorithm of gfatools25 and retained 293,712 SVs (81.7% <500 bp, 99.76% <10 kb) that could be genotyped in the BosMut3.0 yak reference genome or at least one other genome” “Next, we used the graph-genotyping software Vg (v1.36.0)26 on 386 bovines (233 yaks, 140 cattle, 4 bison, 8 wisent and one gaur, including the novel genome sequences for assembling the super-pangenome) for which resequencing data with >6× coverage were available (Supplementary Data 1, 2)2,3,8,20,21,27,28,29,30,31,32,33,34,35,36. This yielded 610,921 genotyped SVs, from which 57,432 were retained after quality filtering”"]]
+[[!pmid 37726270 desc="“assemblies were constructed for 6 wild and 15 domestic yaks” “we used a uniform standard pipeline to annotate these 47 bovine genomes. We identified an average of 24,368 protein-coding genes for each assembly” “1,048,639 high-confidence SNPs were detected and used in phylogenetic analyses with the water buffalo genome as outgroup” “We further constructed a species tree on the basis of 8428 single-copy core genes through selecting one representative individual of each species” “Pangenomes were constructed for yaks and cattle and a super-pangenome for the 7 Bovini species. For the yak pangenome the total gene set approached saturation at n = 20. The percentages of core (present in all 22 genomes), near-core (present in 20–21 genomes) and variable (found in 1–19 genomes) gene families were 50.18, 10.91, 38.91%, respectively.” “In pairwise comparisons of the assemblies constituting the pangenome, each assembly possessed 123 to 2113 genes not present in the other genome” “we constructed a multi-assembly graph-based genome of the 47 genomes used in the phylogenomic and super-pangenome analyses. This comprised 3.14 gigabases (Gb) spread across 5,449,222 nodes (the number of fragments of sequences) and connected by 4,889,530 edges (the connections between nodes), with non-reference nodes spanning 387.0 Mb.
The core (shared by all genomes), near-core (in 46 or 45 genomes) and variable nodes (in 44 or less samples) accounted for 60.8, 17.0, 22.2% of all nodes.” “We detected SVs (≥ 50 bp) in the graph-based genome using the bubble popping algorithm of gfatools and retained 293,712 SVs (81.7% <500 bp, 99.76% <10 kb) that could be genotyped in the BosMut3.0 yak reference genome or at least one other genome” “Next, we used the graph-genotyping software Vg (v1.36.0) on 386 bovines (233 yaks, 140 cattle, 4 bison, 8 wisent and one gaur, including the novel genome sequences for assembling the super-pangenome) for which resequencing data with >6× coverage were available. This yielded 610,921 genotyped SVs, from which 57,432 were retained after quality filtering”"]]