我有一份人们的身份证和名字的名单,还有一份人们的身份证和姓氏的名单。有些人没有名字,有些人没有姓;我想在这两个列表上做一个完整的外部连接。

下面列出:

ID  FirstName
--  ---------
 1  John
 2  Sue

ID  LastName
--  --------
 1  Doe
 3  Smith

应该生产:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue
 3             Smith

我已经发现了相当多的解决方案的“LINQ外部连接”,它们看起来都很相似,但似乎真的是离开外部连接。

到目前为止我的尝试是这样的:

private void OuterJoinTest()
{
    List<FirstName> firstNames = new List<FirstName>();
    firstNames.Add(new FirstName { ID = 1, Name = "John" });
    firstNames.Add(new FirstName { ID = 2, Name = "Sue" });

    List<LastName> lastNames = new List<LastName>();
    lastNames.Add(new LastName { ID = 1, Name = "Doe" });
    lastNames.Add(new LastName { ID = 3, Name = "Smith" });

    var outerJoin = from first in firstNames
        join last in lastNames
        on first.ID equals last.ID
        into temp
        from last in temp.DefaultIfEmpty()
        select new
        {
            id = first != null ? first.ID : last.ID,
            firstname = first != null ? first.Name : string.Empty,
            surname = last != null ? last.Name : string.Empty
        };
    }
}

public class FirstName
{
    public int ID;
    
    public string Name;
}
    
public class LastName
{
    public int ID;
    
    public string Name;
}

但结果是:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue

我做错了什么?


当前回答

在两个输入上执行内存流枚举,并为每一行调用选择器。如果在当前迭代中没有相关性,则选择器参数之一将为空。

例子:

   var result = left.FullOuterJoin(
         right, 
         x=>left.Key, 
         x=>right.Key, 
         (l,r) => new { LeftKey = l?.Key, RightKey=r?.Key });

Requires an IComparer for the correlation type, uses the Comparer.Default if not provided. Requires that 'OrderBy' is applied to the input enumerables /// <summary> /// Performs a full outer join on two <see cref="IEnumerable{T}" />. /// </summary> /// <typeparam name="TLeft"></typeparam> /// <typeparam name="TValue"></typeparam> /// <typeparam name="TRight"></typeparam> /// <typeparam name="TResult"></typeparam> /// <param name="left"></param> /// <param name="right"></param> /// <param name="leftKeySelector"></param> /// <param name="rightKeySelector"></param> /// <param name="selector">Expression defining result type</param> /// <param name="keyComparer">A comparer if there is no default for the type</param> /// <returns></returns> [System.Diagnostics.DebuggerStepThrough] public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TValue, TResult>( this IEnumerable<TLeft> left, IEnumerable<TRight> right, Func<TLeft, TValue> leftKeySelector, Func<TRight, TValue> rightKeySelector, Func<TLeft, TRight, TResult> selector, IComparer<TValue> keyComparer = null) where TLeft: class where TRight: class where TValue : IComparable { keyComparer = keyComparer ?? Comparer<TValue>.Default; using (var enumLeft = left.OrderBy(leftKeySelector).GetEnumerator()) using (var enumRight = right.OrderBy(rightKeySelector).GetEnumerator()) { var hasLeft = enumLeft.MoveNext(); var hasRight = enumRight.MoveNext(); while (hasLeft || hasRight) { var currentLeft = enumLeft.Current; var valueLeft = hasLeft ? leftKeySelector(currentLeft) : default(TValue); var currentRight = enumRight.Current; var valueRight = hasRight ? rightKeySelector(currentRight) : default(TValue); int compare = !hasLeft ? 1 : !hasRight ? -1 : keyComparer.Compare(valueLeft, valueRight); switch (compare) { case 0: // The selector matches. An inner join is achieved yield return selector(currentLeft, currentRight); hasLeft = enumLeft.MoveNext(); hasRight = enumRight.MoveNext(); break; case -1: yield return selector(currentLeft, default(TRight)); hasLeft = enumLeft.MoveNext(); break; case 1: yield return selector(default(TLeft), currentRight); hasRight = enumRight.MoveNext(); break; } } } }

其他回答

我认为其中大部分都存在问题,包括公认的答案,因为它们与Linq相比IQueryable工作不太好,要么是由于做了太多的服务器往返和太多的数据返回,要么是做了太多的客户端执行。

对于IEnumerable,我不喜欢Sehe的答案或类似的答案,因为它占用了过多的内存(一个简单的10000000两个列表测试在我的32GB机器上运行Linqpad内存)。

另外,其他大多数方法实际上并没有实现正确的Full Outer Join,因为它们使用的是带有右连接的Union,而不是带有右反半连接的Concat,这不仅消除了结果中重复的内部连接行,而且消除了最初存在于左或右数据中的任何正确的副本。

所以这里是我的扩展,处理所有这些问题,生成SQL以及直接实现LINQ到SQL的连接,在服务器上执行,比其他枚举对象更快,内存更少:

public static class Ext {
    public static IEnumerable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return from left in leftItems
               join right in rightItems on leftKeySelector(left) equals rightKeySelector(right) into temp
               from right in temp.DefaultIfEmpty()
               select resultSelector(left, right);
    }

    public static IEnumerable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return from right in rightItems
               join left in leftItems on rightKeySelector(right) equals leftKeySelector(left) into temp
               from left in temp.DefaultIfEmpty()
               select resultSelector(left, right);
    }

    public static IEnumerable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    public static IEnumerable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector) {

        var hashLK = new HashSet<TKey>(from l in leftItems select leftKeySelector(l));
        return rightItems.Where(r => !hashLK.Contains(rightKeySelector(r))).Select(r => resultSelector(default(TLeft),r));
    }

    public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> leftItems,
        IEnumerable<TRight> rightItems,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TResult> resultSelector)  where TLeft : class {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    private static Expression<Func<TP, TC, TResult>> CastSMBody<TP, TC, TResult>(LambdaExpression ex, TP unusedP, TC unusedC, TResult unusedRes) => (Expression<Func<TP, TC, TResult>>)ex;

    public static IQueryable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        var sampleAnonLR = new { left = default(TLeft), rightg = default(IEnumerable<TRight>) };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
        var parmC = Expression.Parameter(typeof(TRight), "c");
        var argLeft = Expression.PropertyOrField(parmP, "left");
        var newleftrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, parmC), parmP, parmC), sampleAnonLR, default(TRight), default(TResult));

        return leftItems.AsQueryable().GroupJoin(rightItems, leftKeySelector, rightKeySelector, (left, rightg) => new { left, rightg }).SelectMany(r => r.rightg.DefaultIfEmpty(), newleftrs);
    }

    public static IQueryable<TResult> RightOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        var sampleAnonLR = new { leftg = default(IEnumerable<TLeft>), right = default(TRight) };
        var parmP = Expression.Parameter(sampleAnonLR.GetType(), "p");
        var parmC = Expression.Parameter(typeof(TLeft), "c");
        var argRight = Expression.PropertyOrField(parmP, "right");
        var newrightrs = CastSMBody(Expression.Lambda(Expression.Invoke(resultSelector, parmC, argRight), parmP, parmC), sampleAnonLR, default(TLeft), default(TResult));

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).SelectMany(l => l.leftg.DefaultIfEmpty(), newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoinDistinct<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Union(leftItems.RightOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }

    private static Expression<Func<TP, TResult>> CastSBody<TP, TResult>(LambdaExpression ex, TP unusedP, TResult unusedRes) => (Expression<Func<TP, TResult>>)ex;

    public static IQueryable<TResult> RightAntiSemiJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        var sampleAnonLgR = new { leftg = default(IEnumerable<TLeft>), right = default(TRight) };
        var parmLgR = Expression.Parameter(sampleAnonLgR.GetType(), "lgr");
        var argLeft = Expression.Constant(default(TLeft), typeof(TLeft));
        var argRight = Expression.PropertyOrField(parmLgR, "right");
        var newrightrs = CastSBody(Expression.Lambda(Expression.Invoke(resultSelector, argLeft, argRight), parmLgR), sampleAnonLgR, default(TResult));

        return rightItems.GroupJoin(leftItems, rightKeySelector, leftKeySelector, (right, leftg) => new { leftg, right }).Where(lgr => !lgr.leftg.Any()).Select(newrightrs);
    }

    public static IQueryable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IQueryable<TLeft> leftItems,
        IQueryable<TRight> rightItems,
        Expression<Func<TLeft, TKey>> leftKeySelector,
        Expression<Func<TRight, TKey>> rightKeySelector,
        Expression<Func<TLeft, TRight, TResult>> resultSelector) {

        return leftItems.LeftOuterJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector).Concat(leftItems.RightAntiSemiJoin(rightItems, leftKeySelector, rightKeySelector, resultSelector));
    }
}

右反半连接(Right Anti-Semi-Join)之间的差异在Linq to Objects或源代码中大多是没有意义的,但在最终的答案中,在服务器(SQL)端产生了差异,删除了一个不必要的连接。

手动编码Expression来处理将Expression<Func<>>合并到lambda中,可以使用LinqKit进行改进,但如果语言/编译器为此添加了一些帮助,那就更好了。为了完整起见,包括了fullouterjo模糊不清和RightOuterJoin函数,但我还没有重新实现FullOuterGroupJoin。

我为IEnumerable编写了另一个完整的外部连接版本,用于键是可排序的情况,这比组合左外部连接和右反半连接快50%左右,至少在小型集合上是这样。它只对每个集合进行一次排序。

我还为使用EF的版本添加了另一个答案,即用自定义展开替换Invoke。

谢谢大家的有趣帖子!

我修改了代码,因为在我的情况下我需要

个性化的连接谓词 一个个性化的联合,独特的比较

这是我修改过的代码(不好意思,用VB写的)

    Module MyExtensions
        <Extension()>
        Friend Function FullOuterJoin(Of TA, TB, TResult)(ByVal a As IEnumerable(Of TA), ByVal b As IEnumerable(Of TB), ByVal joinPredicate As Func(Of TA, TB, Boolean), ByVal projection As Func(Of TA, TB, TResult), ByVal comparer As IEqualityComparer(Of TResult)) As IEnumerable(Of TResult)
            Dim joinL =
                From xa In a
                From xb In b.Where(Function(x) joinPredicate(xa, x)).DefaultIfEmpty()
                Select projection(xa, xb)
            Dim joinR =
                From xb In b
                From xa In a.Where(Function(x) joinPredicate(x, xb)).DefaultIfEmpty()
                Select projection(xa, xb)
            Return joinL.Union(joinR, comparer)
        End Function
    End Module

    Dim fullOuterJoin = lefts.FullOuterJoin(
        rights,
        Function(left, right) left.Code = right.Code And (left.Amount [...] Or left.Description.Contains [...]),
        Function(left, right) New CompareResult(left, right),
        New MyEqualityComparer
    )

    Public Class MyEqualityComparer
        Implements IEqualityComparer(Of CompareResult)

        Private Function GetMsg(obj As CompareResult) As String
            Dim msg As String = ""
            msg &= obj.Code & "_"
            [...]
            Return msg
        End Function

        Public Overloads Function Equals(x As CompareResult, y As CompareResult) As Boolean Implements IEqualityComparer(Of CompareResult).Equals
            Return Me.GetMsg(x) = Me.GetMsg(y)
        End Function

        Public Overloads Function GetHashCode(obj As CompareResult) As Integer Implements IEqualityComparer(Of CompareResult).GetHashCode
            Return Me.GetMsg(obj).GetHashCode
        End Function
    End Class

我真的很讨厌这些linq表达式,这就是SQL存在的原因:

select isnull(fn.id, ln.id) as id, fn.firstname, ln.lastname
   from firstnames fn
   full join lastnames ln on ln.id=fn.id

在数据库中创建此sql视图并将其作为实体导入。

当然,(不同的)左连接和右连接的并集也可以做到这一点,但这是愚蠢的。

两个或多个表的完全外部连接: 首先提取要连接的列。

var DatesA = from A in db.T1 select A.Date; 
var DatesB = from B in db.T2 select B.Date; 
var DatesC = from C in db.T3 select C.Date;            

var Dates = DatesA.Union(DatesB).Union(DatesC); 

然后在提取的列和主表之间使用左外连接。

var Full_Outer_Join =

(from A in Dates
join B in db.T1
on A equals B.Date into AB 

from ab in AB.DefaultIfEmpty()
join C in db.T2
on A equals C.Date into ABC 

from abc in ABC.DefaultIfEmpty()
join D in db.T3
on A equals D.Date into ABCD

from abcd in ABCD.DefaultIfEmpty() 
select new { A, ab, abc, abcd })
.AsEnumerable();

正如您所发现的,Linq没有“外部连接”结构。您所能得到的最接近的是使用您所声明的查询的左外连接。为此,你可以添加任何没有在join中表示的姓氏列表元素:

outerJoin = outerJoin.Concat(lastNames.Select(l=>new
                            {
                                id = l.ID,
                                firstname = String.Empty,
                                surname = l.Name
                            }).Where(l=>!outerJoin.Any(o=>o.id == l.id)));