假设我有一个左外连接,如下所示:

from f in Foo
join b in Bar on f.Foo_Id equals b.Foo_Id into g
from result in g.DefaultIfEmpty()
select new { Foo = f, Bar = result }

如何使用扩展方法表达相同的任务?如。

Foo.GroupJoin(Bar, f => f.Foo_Id, b => b.Foo_Id, (f,b) => ???)
    .Select(???)

当前回答

虽然接受的答案工作,是很好的Linq对象它困扰我的SQL查询不只是一个直接的左外连接。

下面的代码依赖于允许您传递表达式并将它们调用到查询的LinqKit项目。

static IQueryable<TResult> LeftOuterJoin<TSource,TInner, TKey, TResult>(
     this IQueryable<TSource> source, 
     IQueryable<TInner> inner, 
     Expression<Func<TSource,TKey>> sourceKey, 
     Expression<Func<TInner,TKey>> innerKey, 
     Expression<Func<TSource, TInner, TResult>> result
    ) {
    return from a in source.AsExpandable()
            join b in inner on sourceKey.Invoke(a) equals innerKey.Invoke(b) into c
            from d in c.DefaultIfEmpty()
            select result.Invoke(a,d);
}

它可以这样使用

Table1.LeftOuterJoin(Table2, x => x.Key1, x => x.Key2, (x,y) => new { x,y});

其他回答

将Marc Gravell的答案转换为一个扩展方法,我做了以下工作。

internal static IEnumerable<Tuple<TLeft, TRight>> LeftJoin<TLeft, TRight, TKey>(
    this IEnumerable<TLeft> left,
    IEnumerable<TRight> right,
    Func<TLeft, TKey> selectKeyLeft,
    Func<TRight, TKey> selectKeyRight,
    TRight defaultRight = default(TRight),
    IEqualityComparer<TKey> cmp = null)
{
    return left.GroupJoin(
            right,
            selectKeyLeft,
            selectKeyRight,
            (x, y) => new Tuple<TLeft, IEnumerable<TRight>>(x, y),
            cmp ?? EqualityComparer<TKey>.Default)
        .SelectMany(
            x => x.Item2.DefaultIfEmpty(defaultRight),
            (x, y) => new Tuple<TLeft, TRight>(x.Item1, y));
}

要实现两个数据集的连接,不需要使用组连接方法。

内连接:

var qry = Foos.SelectMany
            (
                foo => Bars.Where (bar => foo.Foo_id == bar.Foo_id),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

对于左连接,只需添加DefaultIfEmpty()

var qry = Foos.SelectMany
            (
                foo => Bars.Where (bar => foo.Foo_id == bar.Foo_id).DefaultIfEmpty(),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

EF和LINQ to SQL正确转换为SQL。 对于LINQ to Objects,最好使用GroupJoin,因为它在内部使用Lookup。但是如果您正在查询DB,那么跳过GroupJoin是AFAIK作为性能。

Personlay对我来说,这种方式比GroupJoin()更具可读性。

由于这似乎是使用方法(扩展)语法的左外连接的事实上的SO问题,我想我应该在当前选择的答案之外添加一个替代方案(至少在我的经验中),这是我更常见的答案

// Option 1: Expecting either 0 or 1 matches from the "Right"
// table (Bars in this case):
var qry = Foos.GroupJoin(
          Bars,
          foo => foo.Foo_Id,
          bar => bar.Foo_Id,
          (f,bs) => new { Foo = f, Bar = bs.SingleOrDefault() });

// Option 2: Expecting either 0 or more matches from the "Right" table
// (courtesy of currently selected answer):
var qry = Foos.GroupJoin(
                  Bars, 
                  foo => foo.Foo_Id,
                  bar => bar.Foo_Id,
                  (f,bs) => new { Foo = f, Bars = bs })
              .SelectMany(
                  fooBars => fooBars.Bars.DefaultIfEmpty(),
                  (x,y) => new { Foo = x.Foo, Bar = y });

要使用一个简单的数据集来显示差异(假设我们是根据值本身进行连接):

List<int> tableA = new List<int> { 1, 2, 3 };
List<int?> tableB = new List<int?> { 3, 4, 5 };

// Result using both Option 1 and 2. Option 1 would be a better choice
// if we didn't expect multiple matches in tableB.
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    }

List<int> tableA = new List<int> { 1, 2, 3 };
List<int?> tableB = new List<int?> { 3, 3, 4 };

// Result using Option 1 would be that an exception gets thrown on
// SingleOrDefault(), but if we use FirstOrDefault() instead to illustrate:
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    } // Misleading, we had multiple matches.
                    // Which 3 should get selected (not arbitrarily the first)?.

// Result using Option 2:
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    }
{ A = 3, B = 3    }    

选项2对于典型的左外连接定义是正确的,但正如我前面提到的,根据数据集的不同,它通常是不必要的复杂。

我把这个问题收藏起来,每年都要参考一下。每次我重温这个问题时,我发现我已经忘记了它是如何工作的。这里有一个更详细的解释。

GroupJoin is like a mix of GroupBy and Join. GroupJoin basically groups the outer collection by the join key, then joins the groupings to the inner collection on the join key. Suppose we have customers and orders. If you GroupJoin on the respective IDs, the result is an enumerable of {Customer, IGrouping<int, Order>}. The reason GroupJoin is useful is because all inner objects are represented even if the outer collection contains no matching objects. For customers with no orders, the IGrouping<int, Order> is simply empty. Once we have { Customer, IGrouping<int, Order> }, we can use as-is, filter out results that have no orders, or flatten with SelectMany to get results like a traditional LINQ Join.

这里有一个完整的例子,如果有人想通过调试器来了解它是如何工作的:

using System;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        //Create some customers
        var customers = new Customer[]
        {
            new Customer(1, "Alice"),
            new Customer(2, "Bob"),
            new Customer(3, "Carol")
        };
        
        //Create some orders for Alice and Bob, but none for Carol
        var orders = new Order[]
        {
            new Order(1, 1),
            new Order(2, 1),
            new Order(3, 1),
            new Order(4, 2),
            new Order(5, 2)
        };

        //Group join customers to orders.
        //Result is IEnumerable<Customer, IGrouping<int, Order>>. 
        //Every customer will be present. 
        //If a customer has no orders, the IGrouping<> will be empty.
        var groupJoined = customers.GroupJoin(orders,
                              c => c.ID,
                              o => o.CustomerID,
                              (customer, order) => (customer, order));

        //Display results. Prints:
        //    Customer: Alice (CustomerID=1), Orders: 3
        //    Customer: Bob (CustomerID=2), Orders: 2
        //    Customer: Carol (CustomerID=3), Orders: 0
        foreach(var result in groupJoined)
        {
            Console.WriteLine($"Customer: {result.customer.Name} (CustomerID={result.customer.ID}), Orders: {result.order.Count()}");
        }
        
        //Flatten the results to look more like a LINQ join
        //Produces an enumerable of { Customer, Order }
        //All customers represented, order is null if customer has no orders
        var flattened = groupJoined.SelectMany(z => z.order.DefaultIfEmpty().Select(y => new { z.customer, y }));

        //Get only results where the outer table is null.
        //roughly equivalent to: 
        //SELECT * 
        //FROM A 
        //LEFT JOIN B 
        //ON A.ID = B.ID 
        //WHERE B.ID IS NULL;
        var noMatch = groupJoined.Where(z => z.order.DefaultIfEmpty().Count() == 0);
    }
}

class Customer
{
    public int ID { get; set; }
    public string Name { get; set; }

    public Customer(int iD, string name)
    {
        ID = iD;
        Name = name;
    }
}

class Order
{
    static Random Random { get; set; } = new Random();

    public int ID { get; set; }
    public int CustomerID { get; set; }
    public decimal Amount { get; set; }

    public Order(int iD, int customerID)
    {
        ID = iD;
        CustomerID = customerID;
        Amount = (decimal)Random.Next(1000, 10000) / 100;
    }
}

你可以创建这样的扩展方法:

public static IEnumerable<TResult> LeftOuterJoin<TSource, TInner, TKey, TResult>(this IEnumerable<TSource> source, IEnumerable<TInner> other, Func<TSource, TKey> func, Func<TInner, TKey> innerkey, Func<TSource, TInner, TResult> res)
    {
        return from f in source
               join b in other on func.Invoke(f) equals innerkey.Invoke(b) into g
               from result in g.DefaultIfEmpty()
               select res.Invoke(f, result);
    }