Swift: what is the right way to split up a [String] resulting in a [[String]] with a given subarray size?
【Posted】: 2014-12-11 07:26:54
【Question】: Starting with a large [String] and a given subarray size, what is the best way to split this array into smaller arrays? (The last array may be smaller than the given subarray size.)
Concrete example:
Split up ["1","2","3","4","5","6","7"] with a max split size of 2
The code would produce [["1","2"],["3","4"],["5","6"],["7"]]
Obviously I could do this a little more manually, but I feel like a Swift operation such as map() or reduce() may do what I want really beautifully.
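【Comments】:
On what basis do you want to do the split? Given that you are talking about "page size", the font and size must be important. Why are you trying to do this yourself rather than letting the OS do the text layout?
What do you mean by page size?
@GaryMakin Sorry, updated now. It's just a set split size, i.e. split the array into smaller arrays with a max size of 100.
@Jordan, as fun as these are, this isn't really what SO is for. You may want to pose these questions in the #swift-lang IRC channel.
I asked nearly the same question while searching for a Swift equivalent of Ruby's each_cons function: ***.com/q/39756309/78336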
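【Answer 1】:
I don't think you'll want to use map or reduce. map is for applying a function to each individual element of an array, while reduce is for folding an array down into a single value. What you want to do is slice the array into subarrays of a certain size. This snippet uses slices.
var arr = ["1","2","3","4","5","6","7"]
var splitSize = 2
var newArr = [[String]]()
var i = 0
while i < arr.count {
    // ArraySlice was spelled Slice in Swift 1.x
    var slice: ArraySlice<String>!
    if i + splitSize >= arr.count {
        // Last chunk: take whatever is left
        slice = arr[i..<arr.count]
    } else {
        slice = arr[i..<i+splitSize]
    }
    newArr.append(Array(slice))
    i += slice.count
}
print(newArr) // println(newArr) in Swift 1.x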
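【Discussion】:
This solution works with Swift 2.2 through 3.0, which is a plus! And I'd argue it's more readable, at least until we all learn the latest flavor of the "new speak".. I mean Swift.
【Answer 2】:
I wouldn't call it beautiful, but here's a method using map:
let numbers = ["1","2","3","4","5","6","7"]
let splitSize = 2
// Swift 2 syntax
let chunks = numbers.startIndex.stride(to: numbers.count, by: splitSize).map {
    numbers[$0 ..< $0.advancedBy(splitSize, limit: numbers.endIndex)]
}
The stride(to:by:) method gives you the index of the first element of each chunk, so you can then use advancedBy(distance:limit:) to map those indices to slices of the source array.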
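To make the two steps described above visible, here is a minimal sketch in current Swift syntax (an assumption on my part; the answer itself targets Swift 2). It first materializes the start indices that the stride produces and then maps each one to a chunk. The name starts is only illustrative.

```swift
let numbers = ["1", "2", "3", "4", "5", "6", "7"]
let splitSize = 2

// stride(from:to:by:) yields the start index of every chunk; `to` is exclusive.
let starts = Array(stride(from: 0, to: numbers.count, by: splitSize))
print(starts)  // [0, 2, 4, 6]

// Each start index is mapped to a slice whose end is clamped to the array's bounds.
let chunks = starts.map { start in
    Array(numbers[start..<min(start + splitSize, numbers.count)])
}
print(chunks)  // [["1", "2"], ["3", "4"], ["5", "6"], ["7"]]
```

The clamping with min plays the role that the limit argument plays in the Swift 2 code.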
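A more "functional" approach is simply to recurse over the array, like so:
// Updated from the Swift 1.x original (countElements, unlabeled arguments)
func chunkArray<T>(_ s: [T], splitSize: Int) -> [[T]] {
    if s.count <= splitSize {
        return [s]
    } else {
        return [Array(s[0..<splitSize])] + chunkArray(Array(s[splitSize..<s.count]), splitSize: splitSize)
    }
}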
【Discussion】:
Swift 2.0 let chunks = stride(from: 0, to: numbers.count, by: splitSize).map( numbers[$0..
Broken as of the new XC 7 Beta 6
【Answer 3】:
The above is very clever, but it makes my head hurt. I had to revert to a less Swifty approach.
For Swift 2.0
var chunks = [[Int]]()
var temp = [Int]()
var splitSize = 3
var x = [1,2,3,4,5,6,7]
for element in x {
    if temp.count < splitSize {
        temp.append(element)
    }
    // A full chunk is stored and the buffer is reset
    if temp.count == splitSize {
        chunks.append(temp)
        temp.removeAll()
    }
}
// Whatever is left over forms the last, shorter chunk
if !temp.isEmpty {
    chunks.append(temp)
}
Playground result: [[1, 2, 3], [4, 5, 6], [7]]
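【Discussion】:
【Answer 4】:
I like Nate Cook's answer. It looks like Swift has moved on since it was written, so here's my take on this as an extension to Array:
// Swift 2 syntax
extension Array {
    func chunk(chunkSize : Int) -> Array<Array<Element>> {
        return 0.stride(to: self.count, by: chunkSize)
            .map { Array(self[$0..<$0.advancedBy(chunkSize, limit: self.count)]) }
    }
}
Note that it returns [] for a negative chunk size and, as written above, can result in a fatal error. If you want to prevent that, you have to put a guard in.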
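The guard mentioned above could look something like the following sketch, written in current Swift rather than Swift 2. The method name safeChunk is hypothetical, and returning [] for a non-positive size is an assumed policy; a precondition would be an equally reasonable choice.

```swift
extension Array {
    // Hypothetical name, chosen so it does not clash with chunk(_:) above.
    func safeChunk(_ chunkSize: Int) -> [[Element]] {
        // Assumption: a non-positive size yields no chunks instead of trapping.
        guard chunkSize > 0 else { return [] }
        return stride(from: 0, to: count, by: chunkSize).map {
            // Swift.min is spelled out because Array has its own min() method.
            Array(self[$0..<Swift.min($0 + chunkSize, count)])
        }
    }
}

print([1, 2, 3, 4, 5, 6, 7].safeChunk(2))  // [[1, 2], [3, 4], [5, 6], [7]]
print([1, 2, 3].safeChunk(0))              // []
print([1, 2, 3].safeChunk(-2))             // []
```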
func testChunkByTwo() {
    let input = [1,2,3,4,5,6,7]
    let output = input.chunk(2)
    let expectedOutput = [[1,2], [3,4], [5,6], [7]]
    XCTAssertEqual(expectedOutput, output)
}
func testByOne() {
    let input = [1,2,3,4,5,6,7]
    let output = input.chunk(1)
    let expectedOutput = [[1],[2],[3],[4],[5],[6],[7]]
    XCTAssertEqual(expectedOutput, output)
}
func testNegative() {
    let input = [1,2,3,4,5,6,7]
    let output = input.chunk(-2)
    // The element type is spelled out so the empty literal can be inferred
    let expectedOutput: [[Int]] = []
    XCTAssertEqual(expectedOutput, output)
}
【Discussion】:
【Answer 5】:
With Swift 5, according to your needs, you can choose one of the five following ways to solve your problem.
1. Using AnyIterator in a Collection extension method
AnyIterator is a good candidate for iterating over the indices of an object that conforms to the Collection protocol in order to return subsequences of this object. In a Collection protocol extension, you can declare a chunked(by:) method with the following implementation:
extension Collection {
    func chunked(by distance: Int) -> [[Element]] {
        precondition(distance > 0, "distance must be greater than 0") // prevents infinite loop
        var index = startIndex
        let iterator: AnyIterator<Array<Element>> = AnyIterator({
            let newIndex = self.index(index, offsetBy: distance, limitedBy: self.endIndex) ?? self.endIndex
            defer { index = newIndex }
            let range = index ..< newIndex
            return index != self.endIndex ? Array(self[range]) : nil
        })
        return Array(iterator)
    }
}
Usage:
let array = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
let newArray = array.chunked(by: 2)
print(newArray) // prints: [["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"], ["9"]]
2. Using the stride(from:to:by:) function in an Array extension method
Array indices are of type Int and conform to the Strideable protocol. Therefore, you can use stride(from:to:by:) and advanced(by:) with them. In an Array extension, you can declare a chunked(by:) method with the following implementation:
extension Array {
    func chunked(by distance: Int) -> [[Element]] {
        let indicesSequence = stride(from: startIndex, to: endIndex, by: distance)
        let array: [[Element]] = indicesSequence.map {
            let newIndex = $0.advanced(by: distance) > endIndex ? endIndex : $0.advanced(by: distance)
            //let newIndex = self.index($0, offsetBy: distance, limitedBy: self.endIndex) ?? self.endIndex // also works
            return Array(self[$0 ..< newIndex])
        }
        return array
    }
}
Usage:
let array = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
let newArray = array.chunked(by: 2)
print(newArray) // prints: [["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"], ["9"]]
3. Using a recursive approach in an Array extension method
Based on Nate Cook's recursive code, you can declare a chunked(by:) method in an Array extension with the following implementation:
extension Array {
    func chunked(by distance: Int) -> [[Element]] {
        precondition(distance > 0, "distance must be greater than 0") // prevents infinite loop
        if self.count <= distance {
            return [self]
        } else {
            let head = [Array(self[0 ..< distance])]
            let tail = Array(self[distance ..< self.count])
            return head + tail.chunked(by: distance)
        }
    }
}
Usage:
let array = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
let newArray = array.chunked(by: 2)
print(newArray) // prints: [["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"], ["9"]]
4. Using a for loop and batches in a Collection extension method
In the Swift Talk #33 - Sequence & Iterator (Collections #2) video, Chris Eidhof and Florian Kugler show how to use a simple for loop to fill batches of sequence elements and append each batch to an array once it is complete. In a Collection extension, you can declare a chunked(by:) method with the following implementation:
extension Collection {
    func chunked(by distance: Int) -> [[Element]] {
        var result: [[Element]] = []
        var batch: [Element] = []
        for element in self {
            batch.append(element)
            if batch.count == distance {
                result.append(batch)
                batch = []
            }
        }
        if !batch.isEmpty {
            result.append(batch)
        }
        return result
    }
}
Usage:
let array = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
let newArray = array.chunked(by: 2)
print(newArray) // prints: [["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"], ["9"]]
5. Using a custom struct that conforms to the Sequence and IteratorProtocol protocols
If you don't want to create an extension of Sequence, Collection or Array, you can create a custom struct that conforms to the Sequence and IteratorProtocol protocols. This struct should have the following implementation:
struct BatchSequence<T>: Sequence, IteratorProtocol {
    private let array: [T]
    private let distance: Int
    private var index = 0
    init(array: [T], distance: Int) {
        precondition(distance > 0, "distance must be greater than 0") // prevents infinite loop
        self.array = array
        self.distance = distance
    }
    mutating func next() -> [T]? {
        guard index < array.endIndex else { return nil }
        let newIndex = index.advanced(by: distance) > array.endIndex ? array.endIndex : index.advanced(by: distance)
        defer { index = newIndex }
        return Array(array[index ..< newIndex])
    }
}
Usage:
let array = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
let batchSequence = BatchSequence(array: array, distance: 2)
let newArray = Array(batchSequence)
print(newArray) // prints: [["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"], ["9"]]
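【Discussion】:
Hi, do you have a Swift 3 version of that extension method?
Great answer, thanks! Note that option 4 has what I consider odd behavior if the array being chunked is empty: it returns [] instead of [[]]. Option 3 behaves as I would expect.
【Answer 6】:
I'll just throw my hat in the ring here with another implementation, based on AnyIterator (called AnyGenerator before Swift 3).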
extension Array {
    func chunks(_ size: Int) -> AnyIterator<[Element]> {
        if size == 0 {
            return AnyIterator {
                return nil
            }
        }
        let indices = stride(from: startIndex, to: count, by: size)
        var generator = indices.makeIterator()
        return AnyIterator {
            guard let i = generator.next() else {
                return nil
            }
            var j = self.index(i, offsetBy: size)
            repeat {
                j = self.index(before: j)
            } while j >= self.endIndex
            return self[i...j].lazy.map { $0 }
        }
    }
}
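I prefer this method since it relies exclusively on generators, which can have a non-negligible, positive memory impact when dealing with large arrays.
For your specific example, here's how it would work:
let chunks = Array(["1","2","3","4","5","6","7"].chunks(2))
Result:
[["1", "2"], ["3", "4"], ["5", "6"], ["7"]]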
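As a rough illustration of the memory point, the sketch below shows that an iterator only builds the chunks that are actually consumed. It is not the answer's code: lazyChunks and BuildCounter are hypothetical names introduced here purely to count how many chunks get materialized.

```swift
// Hypothetical helper class that records how many chunks were built.
final class BuildCounter { var count = 0 }

// Hypothetical function: hands out chunks one at a time through an AnyIterator.
func lazyChunks<T>(of array: [T], size: Int, counter: BuildCounter) -> AnyIterator<[T]> {
    precondition(size > 0, "size must be greater than 0")
    var start = 0
    return AnyIterator {
        guard start < array.count else { return nil }
        let end = min(start + size, array.count)
        defer { start = end }
        counter.count += 1  // a chunk is only copied when next() is called
        return Array(array[start..<end])
    }
}

let big = Array(1...1_000_000)
let counter = BuildCounter()
var firstTwo: [[Int]] = []
for chunk in lazyChunks(of: big, size: 2, counter: counter) {
    firstTwo.append(chunk)
    if firstTwo.count == 2 { break }  // stop early; remaining chunks are never built
}
print(firstTwo)       // [[1, 2], [3, 4]]
print(counter.count)  // 2
```

Wrapping the iterator in Array(...), as in the usage line above, consumes everything, so the saving only appears when the chunks are processed one at a time.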
【Discussion】:
【Answer 7】:
In Swift 3/4 this would look like the following:
let numbers = ["1","2","3","4","5","6","7"]
let chunkSize = 2
let chunks = stride(from: 0, to: numbers.count, by: chunkSize).map {
    Array(numbers[$0..<min($0 + chunkSize, numbers.count)])
}
// prints as [["1", "2"], ["3", "4"], ["5", "6"], ["7"]]
As an extension to Array:
extension Array {
    func chunked(by chunkSize: Int) -> [[Element]] {
        return stride(from: 0, to: self.count, by: chunkSize).map {
            Array(self[$0..<Swift.min($0 + chunkSize, self.count)])
        }
    }
}
Or the slightly more verbose, yet more general:
let numbers = ["1","2","3","4","5","6","7"]
let chunkSize = 2
let chunks: [[String]] = stride(from: 0, to: numbers.count, by: chunkSize).map {
    let end = numbers.endIndex
    let chunkEnd = numbers.index($0, offsetBy: chunkSize, limitedBy: end) ?? end
    return Array(numbers[$0..<chunkEnd])
}
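This is more general because I am making fewer assumptions about the type of the index into the collection. In the previous implementation I assumed that indices could be compared and added.
Note that in Swift 3 the functionality of advancing indices has been moved from the indices themselves to the collection.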
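A small sketch of what that means in practice, using a String because its indices are not integers and cannot be added to directly. The variable names are only illustrative.

```swift
let word = "abcdefg"
let start = word.startIndex

// The collection, not the index, performs the advance.
// limitedBy makes the call return nil instead of running past the end.
let end = word.index(start, offsetBy: 3, limitedBy: word.endIndex) ?? word.endIndex
print(word[start..<end])  // abc

// Asking for more elements than exist yields nil rather than trapping.
let pastEnd = word.index(start, offsetBy: 10, limitedBy: word.endIndex)
print(pastEnd == nil)     // true
```

This is the same index(_:offsetBy:limitedBy:) call used in the more general version above, which is why that version does not rely on integer arithmetic on the indices.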
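【Discussion】:
You could use ArraySlice for a more efficient approach, i.e. func chunked(by chunkSize: Int) -> [ArraySlice<Element>], and then drop the Array( ... ) cast
How would I edit the extension to make chunked arrays of different sizes? For example, the first array containing 17 elements and the other arrays containing 25?
【Answer 8】:
It would be nice to express Tyler Cloutier's formulation as an extension on Array:
extension Array {
    func chunked(by chunkSize:Int) -> [[Element]] {
        let groups = stride(from: 0, to: self.count, by: chunkSize).map {
            Array(self[$0..<[$0 + chunkSize, self.count].min()!])
        }
        return groups
    }
}
This gives us a general way to partition an array into chunks.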
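【Discussion】:
Swift.min($0 + chunkSize, self.count) instead of having to create an array
【Answer 9】:
Did you know that any solution using the [a...b] Swift style runs 10 times slower than the regular one?
for y in 0..<rows {
    var row = [Double]()
    for x in 0..<cols {
        row.append(stream[y * cols + x])
    }
    mat.append(row)
}
Try it and see. Here is my raw test code: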
import Foundation // for Date
let count = 1000000
let cols = 1000
let rows = count / cols
var stream = [Double].init(repeating: 0.5, count: count)
// Regular
var mat = [[Double]]()
let t1 = Date()
for y in 0..<rows {
    var row = [Double]()
    for x in 0..<cols {
        row.append(stream[y * cols + x])
    }
    mat.append(row)
}
print("regular: \(Date().timeIntervalSince(t1))")
// Swift
let t2 = Date()
var mat2: [[Double]] = stride(from: 0, to: stream.count, by: cols).map {
    let end = stream.endIndex
    let chunkEnd = stream.index($0, offsetBy: cols, limitedBy: end) ?? end
    return Array(stream[$0..<chunkEnd])
}
print("swift: \(Date().timeIntervalSince(t2))")
Output:
regular: 0.0449600219726562
swift: 0.49255496263504
【Discussion】:
Let me guess. You're benchmarking this in a playground
【Answer 10】:
New in Swift 4, you can do this efficiently with reduce(into:). Here's an extension on Sequence:
extension Sequence {
    func eachSlice(_ clump:Int) -> [[Self.Element]] {
        return self.reduce(into:[]) { memo, cur in
            if memo.count == 0 {
                return memo.append([cur])
            }
            if memo.last!.count < clump {
                memo.append(memo.removeLast() + [cur])
            } else {
                memo.append([cur])
            }
        }
    }
}
Usage:
let result = [1,2,3,4,5,6,7,8,9].eachSlice(2)
// [[1, 2], [3, 4], [5, 6], [7, 8], [9]]
【Discussion】:
【Answer 11】:
In Swift 4 or later you can also extend Collection and return a collection of its SubSequence, so that it can also be used with StringProtocol types (String or Substring). This way it returns a collection of substrings instead of a collection of arrays of characters:
Xcode 10.1 • Swift 4.2.1 or later
extension Collection {
    func subSequences(limitedTo maxLength: Int) -> [SubSequence] {
        precondition(maxLength > 0, "groups must be greater than zero")
        var start = startIndex
        var subSequences: [SubSequence] = []
        while start < endIndex {
            let end = index(start, offsetBy: maxLength, limitedBy: endIndex) ?? endIndex
            defer { start = end }
            subSequences.append(self[start..<end])
        }
        return subSequences
    }
}
Or, as suggested in the comments by @Jessy, using the standard library's sequence(state:next:) function:
public func sequence<T, State>(state: State, next: @escaping (inout State) -> T?) -> UnfoldSequence<T, State>
extension Collection {
    func subSequences(limitedTo maxLength: Int) -> [SubSequence] {
        precondition(maxLength > 0, "groups must be greater than zero")
        return .init(sequence(state: startIndex) { start in
            guard start < self.endIndex else { return nil }
            let end = self.index(start, offsetBy: maxLength, limitedBy: self.endIndex) ?? self.endIndex
            defer { start = end }
            return self[start..<end]
        })
    }
}
Usage
let array = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
let slices = array.subSequences(limitedTo: 2) // [ArraySlice(["1", "2"]), ArraySlice(["3", "4"]), ArraySlice(["5", "6"]), ArraySlice(["7", "8"]), ArraySlice(["9"])]
for slice in slices {
    print(slice) // prints ["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"] and ["9"], one per line
}
// To convert from ArraySlice<Element> to Array<Element>
let arrays = slices.map(Array.init) // [["1", "2"], ["3", "4"], ["5", "6"], ["7", "8"], ["9"]]
extension Collection {
    var singles: [SubSequence] { return subSequences(limitedTo: 1) }
    var pairs: [SubSequence] { return subSequences(limitedTo: 2) }
    var triples: [SubSequence] { return subSequences(limitedTo: 3) }
    var quads: [SubSequence] { return subSequences(limitedTo: 4) }
}
Array or ArraySlice of Characters
let chars = ["a","b","c","d","e","f","g","h","i"]
chars.singles // [["a"], ["b"], ["c"], ["d"], ["e"], ["f"], ["g"], ["h"], ["i"]]
chars.pairs // [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"], ["i"]]
chars.triples // [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
chars.quads // [["a", "b", "c", "d"], ["e", "f", "g", "h"], ["i"]]
chars.dropFirst(2).quads // [["c", "d", "e", "f"], ["g", "h", "i"]]
StringProtocol elements (String and Substring)
let str = "abcdefghi"
str.singles // ["a", "b", "c", "d", "e", "f", "g", "h", "i"]
str.pairs // ["ab", "cd", "ef", "gh", "i"]
str.triples // ["abc", "def", "ghi"]
str.quads // ["abcd", "efgh", "i"]
str.dropFirst(2).quads // ["cdef", "ghi"]
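【Discussion】:
This is a good idea! But count might be O(n), so it's best to find some other way of iterating. I put one in my answer.
@Jessy you can simply use a while loop
No, then you'd have to choose a collection type to return, instead of just vending the subsequences as a sequence.
Well, I'd love to see benchmark results for that
@Jessy I have edited my answer as you suggested. Is there any problem with this approach?
【Answer 12】:
Swift 5.1 - General solution for all kinds of collections:
extension Collection where Index == Int {
    func chunked(by chunkSize: Int) -> [[Element]] {
        // Clamp to endIndex rather than count so collections whose startIndex is not 0 (e.g. ArraySlice) also work
        stride(from: startIndex, to: endIndex, by: chunkSize).map { Array(self[$0..<Swift.min($0 + chunkSize, endIndex)]) }
    }
}
【Discussion】: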
This is not generic. It requires that the collection be indexed by an Int
【Answer 13】:
public extension Optional {
    /// Wraps a value in an `Optional`, based on a condition.
    /// - Parameters:
    ///   - wrapped: A non-optional value.
    ///   - getIsNil: The condition that will result in `nil`.
    init(
        _ wrapped: Wrapped,
        nilWhen getIsNil: (Wrapped) throws -> Bool
    ) rethrows {
        self = try getIsNil(wrapped) ? nil : wrapped
    }
}
public extension Sequence {
    /// Splits a `Sequence` into equal "chunks".
    ///
    /// - Parameter maxArrayCount: The maximum number of elements in a chunk.
    /// - Returns: `Array`s with `maxArrayCount` `counts`,
    ///   until the last chunk, which may be smaller.
    subscript(maxArrayCount maxCount: Int) -> AnySequence<[Element]> {
        .init(
            sequence( state: makeIterator() ) { iterator in
                Optional(
                    (0..<maxCount).compactMap { _ in iterator.next() },
                    nilWhen: \.isEmpty
                )
            }
        )
    }
}
// [ ["1", "2"], ["3", "4"], ["5", "6"], ["7"] ]
(1...7).map(String.init)[maxArrayCount: 2]
public extension Collection {
    /// Splits a `Collection` into equal "chunks".
    ///
    /// - Parameter maxSubSequenceCount: The maximum number of elements in a chunk.
    /// - Returns: `SubSequence`s with `maxSubSequenceLength` `counts`,
    ///   until the last chunk, which may be smaller.
    subscript(maxSubSequenceCount maxCount: Int) -> AnySequence<SubSequence> {
        .init(
            sequence(state: startIndex) { startIndex in
                guard startIndex < self.endIndex
                else { return nil }
                let endIndex =
                    self.index(startIndex, offsetBy: maxCount, limitedBy: self.endIndex)
                    ?? self.endIndex
                defer { startIndex = endIndex }
                return self[startIndex..<endIndex]
            }
        )
    }
}
// ["12", "34", "56", "7"]
(1...7).map(String.init).joined()[maxSubSequenceCount: 2]
【Discussion】: