
var data = new List<Tuple<long,long,string>>(500000);
var cnt = data.Count(t => t.Item1 <= x && t.Item2 >= x);

When I generate my random array and run my search for 100 randomly generated values of x, the searches complete in about four seconds. Knowing of the great wonders that sorting does to searching, however, I decided to sort my data - first by Item1, then by Item2, and finally by Item3 - before running my 100 searches. I expected the sorted version to perform a little faster because of branch prediction: my thinking has been that once we get to the point where Item1 == x, all further checks of t.Item1 <= x would predict the branch correctly as "no take", speeding up the tail portion of the search. Much to my surprise, the searches took twice as long on a sorted array!
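To make the branch I had in mind visible, the LINQ call can be written out as an explicit loop. This is only a sketch of the equivalent logic; the names `BetweenSearch` and `CountBetween` are hypothetical and not part of my test program.

```csharp
using System;
using System.Collections.Generic;

public static class BetweenSearch
{
    // Explicit-loop equivalent of data.Count(t => t.Item1 <= x && t.Item2 >= x).
    public static int CountBetween(List<Tuple<long, long, string>> data, long x)
    {
        var count = 0;
        foreach (var t in data)
        {
            // Tuple<,,> is a reference type, so reading t.Item1 first follows
            // a reference from the list to an object on the heap.
            // On a list sorted by Item1, "t.Item1 <= x" is true for a prefix of
            // the list and false for the rest, which is the pattern I expected
            // the branch predictor to exploit.
            if (t.Item1 <= x && t.Item2 >= x)
            {
                count++;
            }
        }
        return count;
    }
}
```

Note that the loop itself never uses the sort order: it still visits every element, so any speed-up could only come from the branches becoming more predictable.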


Would anyone be able to shed some light on this strange effect? The source code of my tests follows; I am using .NET 4.0.

// Requires: using System; using System.Collections.Generic;
//           using System.Diagnostics; using System.Linq;
private const int TotalCount = 500000;
private const int TotalQueries = 100;
// Builds a random 64-bit value from eight random bytes.
private static long NextLong(Random r) {
    var data = new byte[8];
    r.NextBytes(data);
    return BitConverter.ToInt64(data, 0);
}
// Orders tuples by Item1, then Item2, then Item3.
private class TupleComparer : IComparer<Tuple<long,long,string>> {
    public int Compare(Tuple<long,long,string> x, Tuple<long,long,string> y) {
        var res = x.Item1.CompareTo(y.Item1);
        if (res != 0) return res;
        res = x.Item2.CompareTo(y.Item2);
        return (res != 0) ? res : String.CompareOrdinal(x.Item3, y.Item3);
    }
}
static void Test(bool doSort) {
    var data = new List<Tuple<long,long,string>>(TotalCount);
    var random = new Random(1000000007);
    var sw = new Stopwatch();
    sw.Start();
    for (var i = 0 ; i != TotalCount ; i++) {
        var a = NextLong(random);
        var b = NextLong(random);
        // Ensure a <= b so that (a, b) is a valid range.
        if (a > b) {
            var tmp = a;
            a = b;
            b = tmp;
        }
        var s = string.Format("{0}-{1}", a, b);
        data.Add(Tuple.Create(a, b, s));
    }
    sw.Stop();
    if (doSort) {
        data.Sort(new TupleComparer());
    }
    Console.WriteLine("Populated in {0}", sw.Elapsed);
    sw.Reset();
    var total = 0L;
    sw.Start();
    for (var i = 0 ; i != TotalQueries ; i++) {
        var x = NextLong(random);
        var cnt = data.Count(t => t.Item1 <= x && t.Item2 >= x);
        total += cnt;
    }
    sw.Stop();
    Console.WriteLine("Found {0} matches in {1} ({2})", total, sw.Elapsed, doSort ? "Sorted" : "Unsorted");
}
static void Main() {
    Test(false);
    Test(true);
    Test(false);
    Test(true);
}

Populated in 00:00:01.3176257
Found 15614281 matches in 00:00:04.2463478 (Unsorted)
Populated in 00:00:01.3345087
Found 15614281 matches in 00:00:08.5393730 (Sorted)
Populated in 00:00:01.3665681
Found 15614281 matches in 00:00:04.1796578 (Unsorted)
Populated in 00:00:01.3326378
Found 15614281 matches in 00:00:08.6027886 (Sorted)






Chris Sinclair notes in the comments that "for TotalCount around 10,000 or less, the sorted version does perform faster". This is because a small list fits entirely into the CPU cache. The memory accesses might still be unpredictable (after sorting, consecutive list entries reference tuples scattered across the heap), but the target is always in cache. I believe there is still a small penalty, because even a load from cache takes a few cycles. That does not seem to be a problem, though, because the CPU can keep multiple loads outstanding, which increases throughput. Whenever the CPU has to wait for memory, it keeps running ahead in the instruction stream to queue as many memory operations as it can. This technique is used to hide latency.
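One way to check the cache explanation is to take the heap references out of the picture. The sketch below stores each range in a struct, so an array of them holds the values inline and sorting moves the data itself rather than references to it. The `Interval` and `LocalityDemo` names are hypothetical, and the string field is dropped for brevity; this is not the code from the question.

```csharp
using System;

// Hypothetical value-type stand-in for Tuple<long,long,string>.
// Elements of an Interval[] are laid out contiguously in memory.
public struct Interval : IComparable<Interval>
{
    public long Low;
    public long High;

    public Interval(long low, long high)
    {
        Low = low;
        High = high;
    }

    // Order by Low, then by High, mirroring the comparer in the question.
    public int CompareTo(Interval other)
    {
        var res = Low.CompareTo(other.Low);
        return res != 0 ? res : High.CompareTo(other.High);
    }
}

public static class LocalityDemo
{
    // Same "between" predicate as in the question, over a contiguous array.
    // The scan reads memory sequentially whether or not the array is sorted.
    public static int CountMatches(Interval[] data, long x)
    {
        var count = 0;
        for (var i = 0; i < data.Length; i++)
        {
            if (data[i].Low <= x && data[i].High >= x)
            {
                count++;
            }
        }
        return count;
    }
}
```

Timing `CountMatches` before and after `Array.Sort(data)` on a large array should show whether the slowdown follows the sort order or the memory layout. If the explanation above is right, I would expect the sorted struct array to be no slower than the unsorted one, but I have not included measurements here.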



