我有一份人们的身份证和名字的名单,还有一份人们的身份证和姓氏的名单。有些人没有名字,有些人没有姓;我想在这两个列表上做一个完整的外部连接。

下面列出:

ID  FirstName
--  ---------
 1  John
 2  Sue

ID  LastName
--  --------
 1  Doe
 3  Smith

应该生产:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue
 3             Smith

我已经发现了相当多的解决方案的“LINQ外部连接”,它们看起来都很相似,但似乎真的是离开外部连接。

到目前为止我的尝试是这样的:

private void OuterJoinTest()
{
    List<FirstName> firstNames = new List<FirstName>();
    firstNames.Add(new FirstName { ID = 1, Name = "John" });
    firstNames.Add(new FirstName { ID = 2, Name = "Sue" });

    List<LastName> lastNames = new List<LastName>();
    lastNames.Add(new LastName { ID = 1, Name = "Doe" });
    lastNames.Add(new LastName { ID = 3, Name = "Smith" });

    var outerJoin = from first in firstNames
        join last in lastNames
        on first.ID equals last.ID
        into temp
        from last in temp.DefaultIfEmpty()
        select new
        {
            id = first != null ? first.ID : last.ID,
            firstname = first != null ? first.Name : string.Empty,
            surname = last != null ? last.Name : string.Empty
        };
    }
}

public class FirstName
{
    public int ID;
    
    public string Name;
}
    
public class LastName
{
    public int ID;
    
    public string Name;
}

但结果是:

ID  FirstName  LastName
--  ---------  --------
 1  John       Doe
 2  Sue

我做错了什么?


当前回答

谢谢大家的有趣帖子!

我修改了代码,因为在我的情况下我需要

个性化的连接谓词 一个个性化的联合,独特的比较

这是我修改过的代码(不好意思,用VB写的)

    Module MyExtensions
        <Extension()>
        Friend Function FullOuterJoin(Of TA, TB, TResult)(ByVal a As IEnumerable(Of TA), ByVal b As IEnumerable(Of TB), ByVal joinPredicate As Func(Of TA, TB, Boolean), ByVal projection As Func(Of TA, TB, TResult), ByVal comparer As IEqualityComparer(Of TResult)) As IEnumerable(Of TResult)
            Dim joinL =
                From xa In a
                From xb In b.Where(Function(x) joinPredicate(xa, x)).DefaultIfEmpty()
                Select projection(xa, xb)
            Dim joinR =
                From xb In b
                From xa In a.Where(Function(x) joinPredicate(x, xb)).DefaultIfEmpty()
                Select projection(xa, xb)
            Return joinL.Union(joinR, comparer)
        End Function
    End Module

    Dim fullOuterJoin = lefts.FullOuterJoin(
        rights,
        Function(left, right) left.Code = right.Code And (left.Amount [...] Or left.Description.Contains [...]),
        Function(left, right) New CompareResult(left, right),
        New MyEqualityComparer
    )

    Public Class MyEqualityComparer
        Implements IEqualityComparer(Of CompareResult)

        Private Function GetMsg(obj As CompareResult) As String
            Dim msg As String = ""
            msg &= obj.Code & "_"
            [...]
            Return msg
        End Function

        Public Overloads Function Equals(x As CompareResult, y As CompareResult) As Boolean Implements IEqualityComparer(Of CompareResult).Equals
            Return Me.GetMsg(x) = Me.GetMsg(y)
        End Function

        Public Overloads Function GetHashCode(obj As CompareResult) As Integer Implements IEqualityComparer(Of CompareResult).GetHashCode
            Return Me.GetMsg(obj).GetHashCode
        End Function
    End Class

其他回答

谢谢大家的有趣帖子!

我修改了代码,因为在我的情况下我需要

个性化的连接谓词 一个个性化的联合,独特的比较

这是我修改过的代码(不好意思,用VB写的)

    Module MyExtensions
        <Extension()>
        Friend Function FullOuterJoin(Of TA, TB, TResult)(ByVal a As IEnumerable(Of TA), ByVal b As IEnumerable(Of TB), ByVal joinPredicate As Func(Of TA, TB, Boolean), ByVal projection As Func(Of TA, TB, TResult), ByVal comparer As IEqualityComparer(Of TResult)) As IEnumerable(Of TResult)
            Dim joinL =
                From xa In a
                From xb In b.Where(Function(x) joinPredicate(xa, x)).DefaultIfEmpty()
                Select projection(xa, xb)
            Dim joinR =
                From xb In b
                From xa In a.Where(Function(x) joinPredicate(x, xb)).DefaultIfEmpty()
                Select projection(xa, xb)
            Return joinL.Union(joinR, comparer)
        End Function
    End Module

    Dim fullOuterJoin = lefts.FullOuterJoin(
        rights,
        Function(left, right) left.Code = right.Code And (left.Amount [...] Or left.Description.Contains [...]),
        Function(left, right) New CompareResult(left, right),
        New MyEqualityComparer
    )

    Public Class MyEqualityComparer
        Implements IEqualityComparer(Of CompareResult)

        Private Function GetMsg(obj As CompareResult) As String
            Dim msg As String = ""
            msg &= obj.Code & "_"
            [...]
            Return msg
        End Function

        Public Overloads Function Equals(x As CompareResult, y As CompareResult) As Boolean Implements IEqualityComparer(Of CompareResult).Equals
            Return Me.GetMsg(x) = Me.GetMsg(y)
        End Function

        Public Overloads Function GetHashCode(obj As CompareResult) As Integer Implements IEqualityComparer(Of CompareResult).GetHashCode
            Return Me.GetMsg(obj).GetHashCode
        End Function
    End Class

这是另一个完整的外部连接

由于对其他命题的简单性和可读性不太满意,我最后得出了这样的结论:

它没有快速的自命(在2020m CPU上加入1000 * 1000大约800毫秒:2.4ghz / 2核)。对我来说,它只是一个紧凑而随意的完全外部连接。

它的工作原理与SQL FULL OUTER JOIN相同(重复保存)

欢呼;-)

using System;
using System.Collections.Generic;
using System.Linq;
namespace NS
{
public static class DataReunion
{
    public static List<Tuple<T1, T2>> FullJoin<T1, T2, TKey>(List<T1> List1, Func<T1, TKey> KeyFunc1, List<T2> List2, Func<T2, TKey> KeyFunc2)
    {
        List<Tuple<T1, T2>> result = new List<Tuple<T1, T2>>();

        Tuple<TKey, T1>[] identifiedList1 = List1.Select(_ => Tuple.Create(KeyFunc1(_), _)).OrderBy(_ => _.Item1).ToArray();
        Tuple<TKey, T2>[] identifiedList2 = List2.Select(_ => Tuple.Create(KeyFunc2(_), _)).OrderBy(_ => _.Item1).ToArray();

        identifiedList1.Where(_ => !identifiedList2.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
            result.Add(Tuple.Create<T1, T2>(_.Item2, default(T2)));
        });

        result.AddRange(
            identifiedList1.Join(identifiedList2, left => left.Item1, right => right.Item1, (left, right) => Tuple.Create<T1, T2>(left.Item2, right.Item2)).ToList()
        );

        identifiedList2.Where(_ => !identifiedList1.Select(__ => __.Item1).Contains(_.Item1)).ToList().ForEach(_ => {
            result.Add(Tuple.Create<T1, T2>(default(T1), _.Item2));
        });

        return result;
    }
}
}

这个想法是

基于提供的关键函数生成器构建id 处理仅剩下的项 流程内部连接 只处理正确的项目

下面是一个与之相关的简单测试:

在结束处放置断点,以手动验证它的行为是否符合预期

using System;
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using NS;

namespace Tests
{
[TestClass]
public class DataReunionTest
{
    [TestMethod]
    public void Test()
    {
        List<Tuple<Int32, Int32, String>> A = new List<Tuple<Int32, Int32, String>>();
        List<Tuple<Int32, Int32, String>> B = new List<Tuple<Int32, Int32, String>>();

        Random rnd = new Random();

        /* Comment the testing block you do not want to run
        /* Solution to test a wide range of keys*/

        for (int i = 0; i < 500; i += 1)
        {
            A.Add(Tuple.Create(rnd.Next(1, 101), rnd.Next(1, 101), "A"));
            B.Add(Tuple.Create(rnd.Next(1, 101), rnd.Next(1, 101), "B"));
        }

        /* Solution for essential testing*/

        A.Add(Tuple.Create(1, 2, "B11"));
        A.Add(Tuple.Create(1, 2, "B12"));
        A.Add(Tuple.Create(1, 3, "C11"));
        A.Add(Tuple.Create(1, 3, "C12"));
        A.Add(Tuple.Create(1, 3, "C13"));
        A.Add(Tuple.Create(1, 4, "D1"));

        B.Add(Tuple.Create(1, 1, "A21"));
        B.Add(Tuple.Create(1, 1, "A22"));
        B.Add(Tuple.Create(1, 1, "A23"));
        B.Add(Tuple.Create(1, 2, "B21"));
        B.Add(Tuple.Create(1, 2, "B22"));
        B.Add(Tuple.Create(1, 2, "B23"));
        B.Add(Tuple.Create(1, 3, "C2"));
        B.Add(Tuple.Create(1, 5, "E2"));

        Func<Tuple<Int32, Int32, String>, Tuple<Int32, Int32>> key = (_) => Tuple.Create(_.Item1, _.Item2);

        var watch = System.Diagnostics.Stopwatch.StartNew();
        var res = DataReunion.FullJoin(A, key, B, key);
        watch.Stop();
        var elapsedMs = watch.ElapsedMilliseconds;
        String aser = JToken.FromObject(res).ToString(Formatting.Indented);
        Console.Write(elapsedMs);
    }
}

}

我喜欢她的回答,但它没有使用延迟执行(输入序列被调用ToLookup急切地枚举)。因此,在查看了LINQ-to-objects的.NET源代码后,我想到了这个:

public static class LinqExtensions
{
    public static IEnumerable<TResult> FullOuterJoin<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TKey, TResult> resultSelector,
        IEqualityComparer<TKey> comparator = null,
        TLeft defaultLeft = default(TLeft),
        TRight defaultRight = default(TRight))
    {
        if (left == null) throw new ArgumentNullException("left");
        if (right == null) throw new ArgumentNullException("right");
        if (leftKeySelector == null) throw new ArgumentNullException("leftKeySelector");
        if (rightKeySelector == null) throw new ArgumentNullException("rightKeySelector");
        if (resultSelector == null) throw new ArgumentNullException("resultSelector");

        comparator = comparator ?? EqualityComparer<TKey>.Default;
        return FullOuterJoinIterator(left, right, leftKeySelector, rightKeySelector, resultSelector, comparator, defaultLeft, defaultRight);
    }

    internal static IEnumerable<TResult> FullOuterJoinIterator<TLeft, TRight, TKey, TResult>(
        this IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKeySelector,
        Func<TRight, TKey> rightKeySelector,
        Func<TLeft, TRight, TKey, TResult> resultSelector,
        IEqualityComparer<TKey> comparator,
        TLeft defaultLeft,
        TRight defaultRight)
    {
        var leftLookup = left.ToLookup(leftKeySelector, comparator);
        var rightLookup = right.ToLookup(rightKeySelector, comparator);
        var keys = leftLookup.Select(g => g.Key).Union(rightLookup.Select(g => g.Key), comparator);

        foreach (var key in keys)
            foreach (var leftValue in leftLookup[key].DefaultIfEmpty(defaultLeft))
                foreach (var rightValue in rightLookup[key].DefaultIfEmpty(defaultRight))
                    yield return resultSelector(leftValue, rightValue, key);
    }
}

该实现具有以下重要属性:

延迟执行,在枚举输出序列之前不会枚举输入序列。 每个输入序列只枚举一次。 保留输入序列的顺序,在某种意义上,它将以左序列和右序列的顺序生成元组(对于不在左序列中出现的键)。

这些属性很重要,因为它们是那些刚接触FullOuterJoin但有LINQ经验的人所期望的。

正如您所发现的,Linq没有“外部连接”结构。您所能得到的最接近的是使用您所声明的查询的左外连接。为此,你可以添加任何没有在join中表示的姓氏列表元素:

outerJoin = outerJoin.Concat(lastNames.Select(l=>new
                            {
                                id = l.ID,
                                firstname = String.Empty,
                                surname = l.Name
                            }).Where(l=>!outerJoin.Any(o=>o.id == l.id)));

我真的很讨厌这些linq表达式,这就是SQL存在的原因:

select isnull(fn.id, ln.id) as id, fn.firstname, ln.lastname
   from firstnames fn
   full join lastnames ln on ln.id=fn.id

在数据库中创建此sql视图并将其作为实体导入。

当然,(不同的)左连接和右连接的并集也可以做到这一点,但这是愚蠢的。