我可能有一个像下面这样的数组:

[1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]

或者,实际上,任何类似类型的数据部分的序列。我要做的是确保每个相同的元素只有一个。例如,上面的数组将变成:

[1, 4, 2, 6, 24, 15, 60]

请注意,删除了2、6和15的重复项,以确保每个相同的元素中只有一个。Swift是否提供了一种容易做到这一点的方法,还是我必须自己做?


当前回答

斯威夫特5

extension Sequence where Element: Hashable {
    func unique() -> [Element] {
        NSOrderedSet(array: self as! [Any]).array as! [Element]
    }
}

其他回答

受https://www.swiftbysundell.com/posts/the-power-of-key-paths-in-swift的启发,我们可以声明一个更强大的工具,它能够过滤任何keyPath上的唯一性。感谢Alexander关于复杂性的各种答案的评论,下面的解决方案应该是最优的。

Non-mutating解决方案

我们扩展了一个函数,它可以过滤任意keyPath上的唯一性:

extension RangeReplaceableCollection {
    /// Returns a collection containing, in order, the first instances of
    /// elements of the sequence that compare equally for the keyPath.
    func unique<T: Hashable>(for keyPath: KeyPath<Element, T>) -> Self {
        var unique = Set<T>()
        return filter { unique.insert($0[keyPath: keyPath]).inserted }
    }
}

注意:在你的对象不符合RangeReplaceableCollection,但符合Sequence的情况下,你可以有这个额外的扩展,但返回类型将始终是一个数组:

extension Sequence {
    /// Returns an array containing, in order, the first instances of
    /// elements of the sequence that compare equally for the keyPath.
    func unique<T: Hashable>(for keyPath: KeyPath<Element, T>) -> [Element] {
        var unique = Set<T>()
        return filter { unique.insert($0[keyPath: keyPath]).inserted }
    }
}

使用

如果我们想要元素本身的唯一性,如问题中所示,我们使用keyPath \.self:

let a = [1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
let b = a.unique(for: \.self)
/* b is [1, 4, 2, 6, 24, 15, 60] */

如果我们想要其他东西的唯一性(比如对象集合的id),那么我们使用我们选择的keyPath:

let a = [CGPoint(x: 1, y: 1), CGPoint(x: 2, y: 1), CGPoint(x: 1, y: 2)]
let b = a.unique(for: \.y)
/* b is [{x 1 y 1}, {x 1 y 2}] */

变异的解决方案

我们扩展了一个变异函数,它能够过滤任意keyPath上的唯一性:

extension RangeReplaceableCollection {
    /// Keeps only, in order, the first instances of
    /// elements of the collection that compare equally for the keyPath.
    mutating func uniqueInPlace<T: Hashable>(for keyPath: KeyPath<Element, T>) {
        var unique = Set<T>()
        removeAll { !unique.insert($0[keyPath: keyPath]).inserted }
    }
}

使用

如果我们想要元素本身的唯一性,如问题中所示,我们使用keyPath \.self:

var a = [1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
a.uniqueInPlace(for: \.self)
/* a is [1, 4, 2, 6, 24, 15, 60] */

如果我们想要其他东西的唯一性(比如对象集合的id),那么我们使用我们选择的keyPath:

var a = [CGPoint(x: 1, y: 1), CGPoint(x: 2, y: 1), CGPoint(x: 1, y: 2)]
a.uniqueInPlace(for: \.y)
/* a is [{x 1 y 1}, {x 1 y 2}] */

编辑/更新Swift 4或更高版本

我们还可以扩展RangeReplaceableCollection协议,以允许它也用于StringProtocol类型:

extension RangeReplaceableCollection where Element: Hashable {
    var orderedSet: Self {
        var set = Set<Element>()
        return filter { set.insert($0).inserted }
    }
    mutating func removeDuplicates() {
        var set = Set<Element>()
        removeAll { !set.insert($0).inserted }
    }
}

let integers = [1, 4, 2, 2, 6, 24, 15, 2, 60, 15, 6]
let integersOrderedSet = integers.orderedSet // [1, 4, 2, 6, 24, 15, 60]

"abcdefabcghi".orderedSet  // "abcdefghi"
"abcdefabcghi".dropFirst(3).orderedSet // "defabcghi"

变异的方法:

var string = "abcdefabcghi"
string.removeDuplicates() 
string  //  "abcdefghi"

var substring = "abcdefabcdefghi".dropFirst(3)  // "defabcdefghi"
substring.removeDuplicates()
substring   // "defabcghi"

对于Swift 3,请点击这里

包含相等性检查,而插入检查哈希,最安全的检查方式如下:

extension Array where Element: Hashable {

    /// Big O(N) version. Updated since @Adrian's comment. 
    var uniques: Array {
        // Go front to back, add element to buffer if it isn't a repeat.
         var buffer: [Element] = []
         var dictionary: [Element: Int] = [:]
         for element in self where dictionary[element] == nil {
             buffer.append(element)
             dictionary[element] = 1
         }
         return buffer
    }
}

像函数式程序员一样思考:)

要根据元素是否已经出现来筛选列表,需要索引。可以使用enumeration获取索引,并使用map返回值列表。

let unique = myArray
    .enumerated()
    .filter{ myArray.firstIndex(of: $0.1) == $0.0 }
    .map{ $0.1 }

这保证了秩序。如果你不介意顺序,那么Array(Set(myArray))的现有答案更简单,可能更有效。


更新:一些关于效率和正确性的注意事项

一些人对效率进行了评论。我肯定是先写正确而简单的代码,然后再找出瓶颈,尽管我知道这是否比Array(Set(Array))更清楚是有争议的。

这个方法比Array(Set(Array))慢很多。正如评论中所指出的,它确实保持了顺序,并对非Hashable的元素起作用。

然而,@Alain T的方法也保持了秩序,也快得多。所以除非你的元素类型是不可哈希的,或者你只是需要一个快速的一行,那么我建议采用他们的解决方案。

以下是MacBook Pro(2014)在Xcode 11.3.1 (Swift 5.1)发布模式下的一些测试。

profiler函数和两个比较方法:

func printTimeElapsed(title:String, operation:()->()) {
    var totalTime = 0.0
    for _ in (0..<1000) {
        let startTime = CFAbsoluteTimeGetCurrent()
        operation()
        let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
        totalTime += timeElapsed
    }
    let meanTime = totalTime / 1000
    print("Mean time for \(title): \(meanTime) s")
}

func method1<T: Hashable>(_ array: Array<T>) -> Array<T> {
    return Array(Set(array))
}

func method2<T: Equatable>(_ array: Array<T>) -> Array<T>{
    return array
    .enumerated()
    .filter{ array.firstIndex(of: $0.1) == $0.0 }
    .map{ $0.1 }
}

// Alain T.'s answer (adapted)
func method3<T: Hashable>(_ array: Array<T>) -> Array<T> {
    var uniqueKeys = Set<T>()
    return array.filter{uniqueKeys.insert($0).inserted}
}

以及少量的测试输入:

func randomString(_ length: Int) -> String {
  let letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
  return String((0..<length).map{ _ in letters.randomElement()! })
}

let shortIntList = (0..<100).map{_ in Int.random(in: 0..<100) }
let longIntList = (0..<10000).map{_ in Int.random(in: 0..<10000) }
let longIntListManyRepetitions = (0..<10000).map{_ in Int.random(in: 0..<100) }
let longStringList = (0..<10000).map{_ in randomString(1000)}
let longMegaStringList = (0..<10000).map{_ in randomString(10000)}

给出输出:

Mean time for method1 on shortIntList: 2.7358531951904296e-06 s
Mean time for method2 on shortIntList: 4.910230636596679e-06 s
Mean time for method3 on shortIntList: 6.417632102966309e-06 s
Mean time for method1 on longIntList: 0.0002518167495727539 s
Mean time for method2 on longIntList: 0.021718120217323302 s
Mean time for method3 on longIntList: 0.0005312927961349487 s
Mean time for method1 on longIntListManyRepetitions: 0.00014377200603485108 s
Mean time for method2 on longIntListManyRepetitions: 0.0007293639183044434 s
Mean time for method3 on longIntListManyRepetitions: 0.0001843773126602173 s
Mean time for method1 on longStringList: 0.007168249964714051 s
Mean time for method2 on longStringList: 0.9114790915250778 s
Mean time for method3 on longStringList: 0.015888616919517515 s
Mean time for method1 on longMegaStringList: 0.0525397013425827 s
Mean time for method2 on longMegaStringList: 1.111266262292862 s
Mean time for method3 on longMegaStringList: 0.11214958941936493 s

在数组中保留唯一值和排序。

(使用Swift 3)

    var top3score: [Int] = []


    outerLoop: for i in 0..<top10score.count {
        dlog(message: String(top10score[i]))

        if top3score.count == 3 {
            break
        }

        for aTop3score in top3score {
            if aTop3score == top10score[i] {
                continue outerLoop
            }
        }

        top3score.append(top10score[i])

    }

    print("top10score is \(top10score)")  //[14, 5, 5, 5, 3, 3, 2, 2, 2, 2]
    print("top3score is \(top3score)")   //[14, 5, 3]