Suppose I have a left outer join like this:

from f in Foo
join b in Bar on f.Foo_Id equals b.Foo_Id into g
from result in g.DefaultIfEmpty()
select new { Foo = f, Bar = result }

How would I express the same task using extension methods? For example:

Foo.GroupJoin(Bar, f => f.Foo_Id, b => b.Foo_Id, (f,b) => ???)
    .Select(???)

Current answer

Converting Marc Gravell's answer into an extension method, I came up with the following.

internal static IEnumerable<Tuple<TLeft, TRight>> LeftJoin<TLeft, TRight, TKey>(
    this IEnumerable<TLeft> left,
    IEnumerable<TRight> right,
    Func<TLeft, TKey> selectKeyLeft,
    Func<TRight, TKey> selectKeyRight,
    TRight defaultRight = default(TRight),
    IEqualityComparer<TKey> cmp = null)
{
    return left.GroupJoin(
            right,
            selectKeyLeft,
            selectKeyRight,
            (x, y) => new Tuple<TLeft, IEnumerable<TRight>>(x, y),
            cmp ?? EqualityComparer<TKey>.Default)
        .SelectMany(
            x => x.Item2.DefaultIfEmpty(defaultRight),
            (x, y) => new Tuple<TLeft, TRight>(x.Item1, y));
}
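
A usage sketch for the method above, using made-up sample data. The container class name LeftJoinExtensions and the string arrays are assumptions for illustration only; the method is repeated so that the example compiles on its own. Strings are joined on their first character, and "(none)" is passed as defaultRight so that unmatched left items are easy to spot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class LeftJoinExtensions // hypothetical container class
{
    internal static IEnumerable<Tuple<TLeft, TRight>> LeftJoin<TLeft, TRight, TKey>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> selectKeyLeft,
        Func<TRight, TKey> selectKeyRight,
        TRight defaultRight = default(TRight),
        IEqualityComparer<TKey> cmp = null)
    {
        return left.GroupJoin(
                right,
                selectKeyLeft,
                selectKeyRight,
                (x, y) => new Tuple<TLeft, IEnumerable<TRight>>(x, y),
                cmp ?? EqualityComparer<TKey>.Default)
            .SelectMany(
                x => x.Item2.DefaultIfEmpty(defaultRight),
                (x, y) => new Tuple<TLeft, TRight>(x.Item1, y));
    }
}

internal static class LeftJoinDemo
{
    private static void Main()
    {
        // Sample data (illustrative only), joined on the first letter.
        var left = new[] { "apple", "banana", "cherry" };
        var right = new[] { "avocado", "apricot", "coconut" };

        var pairs = left.LeftJoin(right, l => l[0], r => r[0], "(none)");

        foreach (var pair in pairs)
            Console.WriteLine(pair.Item1 + " -> " + pair.Item2);
        // Prints:
        //   apple -> avocado
        //   apple -> apricot
        //   banana -> (none)
        //   cherry -> coconut
    }
}
```

Every left item appears at least once; "banana" has no partner on the right, so it is paired with the supplied default.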

Other answers

You don't need the GroupJoin method to join two data sets.

Inner join:

var qry = Foos.SelectMany
            (
                foo => Bars.Where (bar => foo.Foo_Id == bar.Foo_Id),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

For a left join, just add DefaultIfEmpty():

var qry = Foos.SelectMany
            (
                foo => Bars.Where (bar => foo.Foo_Id == bar.Foo_Id).DefaultIfEmpty(),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

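EF and LINQ to SQL translate this to SQL correctly. For LINQ to Objects it is better to join with GroupJoin, because it uses a Lookup internally. But if you are querying a database, then skipping GroupJoin is, AFAIK, just as performant.

Personally, I find this form more readable than GroupJoin().

A (left outer) join of table Bar with table Foo on Foo.Foo_Id = Bar.Foo_Id, in lambda notation: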

var qry = Foo.GroupJoin(
          Bar, 
          foo => foo.Foo_Id,
          bar => bar.Foo_Id,
          (x,y) => new { Foo = x, Bars = y })
       .SelectMany(
           x => x.Bars.DefaultIfEmpty(),
           (x,y) => new { Foo=x.Foo, Bar=y});

You can create an extension method like this:

public static IEnumerable<TResult> LeftOuterJoin<TSource, TInner, TKey, TResult>(
    this IEnumerable<TSource> source,
    IEnumerable<TInner> other,
    Func<TSource, TKey> func,
    Func<TInner, TKey> innerkey,
    Func<TSource, TInner, TResult> res)
{
    return from f in source
           join b in other on func(f) equals innerkey(b) into g
           from result in g.DefaultIfEmpty()
           select res(f, result);
}
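
A usage sketch for this method, with invented sample data. The class name LeftOuterJoinExtensions and the anonymous-type Foo/Bar rows are assumptions for illustration; the method is repeated so the example is self-contained. Because the inner element can be null for unmatched rows, the result selector has to check for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeftOuterJoinExtensions // hypothetical container class
{
    public static IEnumerable<TResult> LeftOuterJoin<TSource, TInner, TKey, TResult>(
        this IEnumerable<TSource> source,
        IEnumerable<TInner> other,
        Func<TSource, TKey> func,
        Func<TInner, TKey> innerkey,
        Func<TSource, TInner, TResult> res)
    {
        return from f in source
               join b in other on func.Invoke(f) equals innerkey.Invoke(b) into g
               from result in g.DefaultIfEmpty()
               select res.Invoke(f, result);
    }
}

public static class LeftOuterJoinDemo
{
    public static void Main()
    {
        // Sample rows (illustrative only); the second Foo has no matching Bar.
        var foos = new[]
        {
            new { Foo_Id = 1, Name = "first" },
            new { Foo_Id = 2, Name = "second" }
        };
        var bars = new[] { new { Foo_Id = 1, Value = "x" } };

        var rows = foos.LeftOuterJoin(
            bars,
            f => f.Foo_Id,
            b => b.Foo_Id,
            // b is null when no Bar matches, so guard before reading it.
            (f, b) => f.Name + " -> " + (b == null ? "(no Bar)" : b.Value));

        foreach (var row in rows)
            Console.WriteLine(row);
        // Prints:
        //   first -> x
        //   second -> (no Bar)
    }
}
```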

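I bookmarked this question and need to refer back to it about once a year. Every time I revisit it, I find I have forgotten how it works. Here is a more detailed explanation.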

GroupJoin is like a mix of GroupBy and Join. It groups the inner collection by the join key, then pairs each element of the outer collection with the group of inner elements that share its key. Suppose we have customers and orders. If you GroupJoin customers to orders on the respective IDs, the result is an enumerable of { Customer, IEnumerable<Order> }. GroupJoin is useful because every outer object is represented even if the inner collection contains no matching objects. For customers with no orders, the IEnumerable<Order> is simply empty. Once we have { Customer, IEnumerable<Order> }, we can use it as-is, filter out results that have no orders, or flatten with SelectMany to get results like a traditional LINQ Join.

Here is a complete example, in case anyone wants to step through it in a debugger:

using System;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        //Create some customers
        var customers = new Customer[]
        {
            new Customer(1, "Alice"),
            new Customer(2, "Bob"),
            new Customer(3, "Carol")
        };
        
        //Create some orders for Alice and Bob, but none for Carol
        var orders = new Order[]
        {
            new Order(1, 1),
            new Order(2, 1),
            new Order(3, 1),
            new Order(4, 2),
            new Order(5, 2)
        };

        //Group join customers to orders.
        //Result is IEnumerable<(Customer customer, IEnumerable<Order> order)>.
        //Every customer will be present.
        //If a customer has no orders, the IEnumerable<Order> will be empty.
        var groupJoined = customers.GroupJoin(orders,
                              c => c.ID,
                              o => o.CustomerID,
                              (customer, order) => (customer, order));

        //Display results. Prints:
        //    Customer: Alice (CustomerID=1), Orders: 3
        //    Customer: Bob (CustomerID=2), Orders: 2
        //    Customer: Carol (CustomerID=3), Orders: 0
        foreach(var result in groupJoined)
        {
            Console.WriteLine($"Customer: {result.customer.Name} (CustomerID={result.customer.ID}), Orders: {result.order.Count()}");
        }
        
        //Flatten the results to look more like a LINQ join
        //Produces an enumerable of { Customer, Order }
        //All customers represented, order is null if customer has no orders
        var flattened = groupJoined.SelectMany(z => z.order.DefaultIfEmpty().Select(y => new { z.customer, y }));

        //Get only the customers that have no matching order.
        //roughly equivalent to: 
        //SELECT * 
        //FROM A 
        //LEFT JOIN B 
        //ON A.ID = B.ID 
        //WHERE B.ID IS NULL;
        //DefaultIfEmpty() always yields at least one element, so test the group itself.
        var noMatch = groupJoined.Where(z => !z.order.Any());
    }
}

class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }

    public Customer(int iD, string name)
    {
        ID = iD;
        Name = name;
    }
}

class Order
{
    static Random Random { get; set; } = new Random();

    public int ID { get; set; }
    public int CustomerID { get; set; }
    public decimal Amount { get; set; }

    public Order(int iD, int customerID)
    {
        ID = iD;
        CustomerID = customerID;
        Amount = (decimal)Random.Next(1000, 10000) / 100;
    }
}
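
The example above builds the flattened and unmatched queries without printing them. As a compact sketch of what those two steps yield, the following uses anonymous types in place of the Customer and Order classes (an assumption made only to keep it short) and a smaller set of sample orders.

```csharp
using System;
using System.Linq;

public static class FlattenDemo
{
    public static void Main()
    {
        // Sample data (illustrative only); Carol has no orders.
        var customers = new[]
        {
            new { ID = 1, Name = "Alice" },
            new { ID = 2, Name = "Bob" },
            new { ID = 3, Name = "Carol" }
        };
        var orders = new[]
        {
            new { ID = 1, CustomerID = 1 },
            new { ID = 2, CustomerID = 1 },
            new { ID = 4, CustomerID = 2 }
        };

        // One entry per customer, each with a (possibly empty) sequence of orders.
        var groupJoined = customers.GroupJoin(
            orders,
            c => c.ID,
            o => o.CustomerID,
            (customer, matches) => new { customer, matches });

        // Left outer join: DefaultIfEmpty() supplies a null order for empty groups.
        var flattened = groupJoined.SelectMany(
            g => g.matches.DefaultIfEmpty(),
            (g, order) => new { g.customer, order });

        foreach (var row in flattened)
        {
            var orderText = row.order == null ? "no order" : "order " + row.order.ID;
            Console.WriteLine(row.customer.Name + ": " + orderText);
        }

        // Customers without any matching order.
        var unmatched = groupJoined
            .Where(g => !g.matches.Any())
            .Select(g => g.customer.Name);
        Console.WriteLine("No orders: " + string.Join(", ", unmatched));
        // Prints:
        //   Alice: order 1
        //   Alice: order 2
        //   Bob: order 4
        //   Carol: no order
        //   No orders: Carol
    }
}
```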