ohai.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A cozy, fast and secure Mastodon server where everyone is welcome. Run by the folks at ohai.is.

Administered by:

Server stats:

1.8K
active users

#ethz

3 posts3 participants0 posts today
CuriousCoding · Static search trees: 40x faster than binary searchTable of Contents 1 Introduction 1.1 Problem statement 1.2 Motivation 1.3 Recommended reading 1.4 Binary search and Eytzinger layout 1.5 Hugepages 1.6 A note on benchmarking 1.7 Cache lines 1.8 S-trees and B-trees 2 Optimizing find 2.1 Linear 2.2 Auto-vectorization 2.3 Trailing zeros 2.4 Popcount 2.5 Manual SIMD 3 Optimizing the search 3.1 Batching 3.2 Prefetching 3.3 Pointer arithmetic 3.3.1 Up-front splat 3.3.2 Byte-based pointers 3.3.3 The final version 3.4 Skip prefetch 3.5 Interleave 4 Optimizing the tree layout 4.1 Left-tree 4.2 Memory layouts 4.3 Node size \(B=15\) 4.3.1 Data structure size 4.4 Summary 5 Prefix partitioning 5.1 Full layout 5.2 Compact subtrees 5.3 The best of both: compact first level 5.4 Overlapping trees 5.5 Human data 5.6 Prefix map 5.7 Summary 6 Multi-threaded comparison 7 Conclusion 7.1 Future work 7.1.1 Branchy search 7.1.2 Interpolation search 7.1.3 Packing data smaller 7.1.4 Returning indices in original data 7.1.5 Range queries 7.1.6 Sorting queries 7.1.7 Suffix array searching In this post, we will implement a static search tree (S+ tree) for high-throughput searching of sorted data, as introduced on Algorithmica. We’ll mostly take the code presented there as a starting point, and optimize it to its limits. For a large part, I’m simply taking the ‘future work’ ideas of that post and implementing them. And then there will be a bunch of looking at assembly code to shave off all the instructions we can. Lastly, there will be one big addition to optimize throughput: batching.
CuriousCoding · Static search trees: 40x faster than binary searchTable of Contents 1 Introduction 1.1 Problem statement 1.2 Motivation 1.3 Recommended reading 1.4 Binary search and Eytzinger layout 1.5 Hugepages 1.6 A note on benchmarking 1.7 Cache lines 1.8 S-trees and B-trees 2 Optimizing find 2.1 Linear 2.2 Auto-vectorization 2.3 Trailing zeros 2.4 Popcount 2.5 Manual SIMD 3 Optimizing the search 3.1 Batching 3.2 Prefetching 3.3 Pointer arithmetic 3.3.1 Up-front splat 3.3.2 Byte-based pointers 3.3.3 The final version 3.4 Skip prefetch 3.5 Interleave 4 Optimizing the tree layout 4.1 Left-tree 4.2 Memory layouts 4.3 Node size \(B=15\) 4.3.1 Data structure size 4.4 Summary 5 Prefix partitioning 5.1 Full layout 5.2 Compact subtrees 5.3 The best of both: compact first level 5.4 Overlapping trees 5.5 Human data 5.6 Prefix map 5.7 Summary 6 Multi-threaded comparison 7 Conclusion 7.1 Future work 7.1.1 Branchy search 7.1.2 Interpolation search 7.1.3 Packing data smaller 7.1.4 Returning indices in original data 7.1.5 Range queries 7.1.6 Sorting queries 7.1.7 Suffix array searching In this post, we will implement a static search tree (S+ tree) for high-throughput searching of sorted data, as introduced on Algorithmica. We’ll mostly take the code presented there as a starting point, and optimize it to its limits. For a large part, I’m simply taking the ‘future work’ ideas of that post and implementing them. And then there will be a bunch of looking at assembly code to shave off all the instructions we can. Lastly, there will be one big addition to optimize throughput: batching.

From our #ETHZ / @SIB research group Computational Biology Group (CBG):
Release of V-pipe 3.0

We're thrilled to announce the release of V-pipe 3.0! This computational pipeline is specifically crafted for analyzing short viral genomes using next-generation sequencing data. V-pipe 3.0 enables sustainable viral genomic data science. You can check out the published paper on our work in GigaScience at: doi.org/10.1093/gigascience/gi.

OUP AcademicV-pipe 3.0: a sustainable pipeline for within-sample viral genetic diversity estimationAbstract. The large amount and diversity of viral genomic datasets generated by next-generation sequencing technologies poses a set of challenges for compu