VBA - 删除每张工作簿上的每 N 行(每张工作表 100k+ 值)

Posted

技术标签:

【中文标题】VBA - 删除每张工作簿上的每 N 行(每张工作表 100k+ 值)【英文标题】:VBA - Delete Every Nth Row On Each Sheet Of Workbook (100k+ values per sheet) 【发布时间】:2018-03-15 06:24:36 【问题描述】:

我有一个包含 10 多张工作表的工作簿,每张都有数十万个值(125k sheet1、240k sheet 2、400k sheet 3,等等)表。

我无法获取代码来完成第一张纸上的数据修剪。代码已经运行了一个多小时而没有完成第一张工作表。我也尝试过使用较小的数据集(5 张纸中约 1000 个点),但宏只能成功修剪第一张纸上的点。其他表没有修改

下面是我用来删除行间隔的代码;这是我能找到的删除行的最可定制的方式(这正是我正在寻找的:定制/简单性

lastRow = Application.ActiveSheet.UsedRange.Rows.Count    

For i = 2 To lastRow Step 1        'Interval of rows to delete
     Range(Rows(i), Rows(i + 997)).Delete Shift:=xlUp
Next i

这个特定任务的代码被插入到这个问题中找到的代码的修改版本中*感谢最初编写它们的人

问题:Excel VBA Performance - 1 million rows - Delete rows containing a value, in less than 1 min

这是 Paul bica 在他的代码中使用的辅助函数

    Public Sub FastWB(Optional ByVal opt As Boolean = True)
      With Application
        .Calculation = IIf(opt, xlCalculationManual, xlCalculationAutomatic)
        .DisplayAlerts = Not opt
        .DisplayStatusBar = Not opt
        .EnableAnimations = Not opt
        .EnableEvents = Not opt
        .ScreenUpdating = Not opt
      End With
      FastWS , opt
    End Sub

    Public Sub FastWS(Optional ByVal ws As Worksheet = Nothing, _
    Optional ByVal opt As Boolean = True)
      If ws Is Nothing Then
        For Each ws In Application.ActiveWorkbook.Sheets
            EnableWS ws, opt
        Next
      Else
            EnableWS ws, opt
      End If
    End Sub

    Private Sub EnableWS(ByVal ws As Worksheet, ByVal opt As Boolean)
      With ws
        .DisplayPageBreaks = False
        .EnableCalculation = Not opt
        .EnableFormatConditionsCalculation = Not opt
        .EnablePivotTable = Not opt
        End With
    End Sub

由 marko2049 生成测试集的漂亮小代码:

Sub DevelopTest()
    Dim index As Long
    FastWB True
    ActiveSheet.UsedRange.Clear
    For index = 1 To 1000000 '1 million test
        ActiveSheet.Cells(index, 1).Value = index
        If (index Mod 10) = 0 Then
            ActiveSheet.Cells(index, 2).Value = "Test String"
        Else
            ActiveSheet.Cells(index, 2).Value = "Blah Blah Blah"
        End If
    Next index
    Application.StatusBar = ""
    FastWB False
End Sub

生成测试集并将其复制到多个工作表后,我运行了以下代码的修改版本

代码主体由用户marko5049制作

Sub DeleteRowFast()
    Dim curWorksheet As Worksheet 'Current worksheet vairable

    Dim rangeSelection As Range   'Selected range
    Dim startBadVals As Long      'Start of the unwanted values
    Dim endBadVals As Long        'End of the unwanted values
    Dim strtTime As Double        'Timer variable
    Dim lastRow As Long           'Last Row variable
    Dim lastColumn As Long        'Last column variable
    Dim indexCell As Range        'Index range start
    Dim sortRange As Range        'The range which the sort is applied to
    Dim currRow As Range          'Current Row index for the for loop
    Dim cell As Range             'Current cell for use in the for loop

    On Error GoTo Err
        Set rangeSelection = Application.InputBox("Select the (N=) range to be checked", "Get Range", Type:=8)    'Get the desired range from the user
        Err.Clear

    M1 = MsgBox("This is recommended for large files (50,000 or more entries)", vbYesNo, "Enable Fast Workbook?") 'Prompt the user with an option to enable Fast Workbook, roughly 150% performace gains... Recommended for incredibly large files
    Select Case M1
        Case vbYes
            FastWB True  'Enable fast workbook
        Case vbNo
            FastWB False 'Disable fast workbook
    End Select

    strtTime = Timer     'Begin the timer

    Set curWorksheet = ActiveSheet
    lastRow = CLng(rangeSelection.SpecialCells(xlCellTypeLastCell).Row)
    lastColumn = curWorksheet.Cells(1, 16384).End(xlToLeft).Column

    Set indexCell = curWorksheet.Cells(1, 1)

    On Error Resume Next

    If rangeSelection.Rows.Count > 1 Then 'Check if there is anything to do

        lastVisRow = rangeSelection.Rows.Count

        Set sortRange = curWorksheet.Range(indexCell, curWorksheet.Cells(curWorksheet.Rows(lastRow).Row, 16384).End(xlToLeft)) 'Set the sort range

        sortRange.Sort Key1:=rangeSelection.Cells(1, 1), Order1:=xlAscending, Header:=xlNo 'Sort by values, lowest to highest

        startBadVals = rangeSelection.Find(What:="Test String", LookAt:=xlWhole, MatchCase:=False).Row
        endBadVals = rangeSelection.Find(What:="Test String", LookAt:=xlWhole, SearchDirection:=xlPrevious, MatchCase:=False).Row

        curWorksheet.Range(curWorksheet.Rows(startBadVals), curWorksheet.Rows(endBadVals)).EntireRow.Delete 'Delete uneeded rows, deleteing in continuous range blocks is quick than seperated or individual deletions.

        sortRange.Sort Key1:=indexCell, Order1:=xlAscending, Header:=xlNo 'Sort by index instead of values, lowest to highest
    End If

    Application.StatusBar = ""                    'Reset the status bar

    FastWB False                                  'Disable fast workbook

    MsgBox CStr(Round(Timer - strtTime, 2)) & "s" 'Display duration of task

Err:
    Exit Sub

End Sub

我将上面的代码修改如下

Sub DeleteRowFastMod()

    Dim lastRow As Long
    Dim i As Long
    Dim ws As Worksheet
    Dim wb As Workbook
    Set wb = Application.ActiveWorkbook

    On Error GoTo Err
            'Get the desired range from the user
        Err.Clear

    FastWB True  'Enable fast workbook


    strtTime = Timer     'Begin the timer


    On Error Resume Next


For Each ws In wb.Worksheets(1)         'Loop through sheets in workbook 
    ws.Activate
    lastRow = Application.ActiveSheet.UsedRange.Rows.Count

    If lastRow > 1 Then 'Check if there is anything to do

       For i = 2 To lastRow Step 1        'Interval of rows to delete
           Range(Rows(i), Rows(i + 997)).Delete Shift:=xlUp
       Next i
    End If
Next

    Application.StatusBar = ""                    'Reset the status bar

    FastWB False                                  'Disable fast workbook

    MsgBox CStr(Round(Timer - strtTime, 2)) & "s" 'Display duration of task

Err:
    Exit Sub

End Sub

我不确定如何进一步修改此代码以及时在工作簿中的每个工作表上运行。

提前感谢您的任何指导

【问题讨论】:

你没有描述你想要做什么。 trimming down the sheets by keeping every thousandth or so point in each sheet. 毫无意义 无意冒犯,但您不理解您尝试修改的代码。如果您想理解它,我建议您单步执行代码并查看它在做什么。如果您想使用与以前相同的方法,您要做的是制作一个放入原始行号的辅助列,第二个辅助列说明该行是否为 1000 的倍数(如果是,则输入“Y” ),按第二个辅助列排序,删除一个范围的行(在第二个辅助列中带有“N”的行),并使用第一个辅助列来处理数据。 另一种选择是简单地创建另一个工作表并使用工作表函数获取每 1000 行,然后删除原始工作表。这可能会比您最初尝试的方法慢一点,但也会相当快。 此外,这仅适用于 1 个工作表,因为您使用了 For Each ws In wb.Worksheets(1) 行。请注意,wb.Worksheets(1) 只是 1 个工作表。您想使用他们在您尝试修改的代码中所做的相同的事情,For Each ws In wb.Sheets 【参考方案1】:

我认为您最大的性能锚点是您删除的频率如此之高,而 Excel 不得不移动如此多的数据。您可能会考虑先清除内容和/或使用 UNION 函数一次性删除所有内容。下面是如何编写这两种方法的示例:

Sub UnionExample()
Dim deleteRNG As Range

'You need one start statement that is not a union.
Set deleteRNG = Rows(2)

'Now you can start a loop or use some method to include members in your delete range
Set deleteRNG = Union(deleteRNG, Rows(4))

'when finished creating the delete range, clear contents (it's helped my performance)
deleteRNG.ClearContents


'then do your full delete
deleteRNG.Delete shift:=xlUp

End Sub

【讨论】:

【参考方案2】:

使用The SpreadSheetGuru's Timer,我在 13.53 秒内从 4 个工作表中总共删除了 1,599,992 个。

Sub ProcessWorksheets()
    Dim ws As Worksheet

    With Application
        .ScreenUpdating = False
        .Calculation = xlCalculationManual
    End With

    For Each ws In ThisWorkbook.Worksheets
        KeepNthRows ws.UsedRange, 2, 1000
    Next

    With Application
        .ScreenUpdating = True
        .Calculation = xlCalculationAutomatic
    End With

End Sub

Sub KeepNthRows(Target As Range, FirstRow As Long, NthStep As Long)
    Dim data As Variant, results As Variant
    Dim x1 As Long, x2 As Long, y As Long

    If Target.Rows.Count < 2 Then Exit Sub

    FirstRow = FirstRow - 1                           'Adjustment needed for using Range.Offset
    data = Target.Offset(FirstRow).Value

    ReDim results(1 To UBound(data, 1), 1 To UBound(data, 2))

    For x1 = FirstRow To UBound(data, 1) Step NthStep
        x2 = x2 + 1
        For y = 1 To UBound(data, 2)
            results(x2, y) = data(x1, y)
        Next
    Next

    Target.Offset(FirstRow).Value = results
End Sub

【讨论】:

我尝试运行 Sub ProcessWorksheets,但在 ReDim 行上出现运行时错误 13。我不确定如何解决这个问题 我更新了我的代码。当 Worksheets 的 UsedRange 少于 2 行时发生错误【参考方案3】:

您可以使用与链接中相同的方法

Excel VBA Performance - 1 million rows - Delete rows containing a value, in less than 1 min


下面的代码(模块 2)设置测试数据 - 10 个工作表中的 3000 万个公式(3 个完整列)

模块 1 中的子循环遍历所有工作表和

隐藏 1K 行集 将可见行复制到新工作表 删除初始工作表

模块 1 - 主子


Option Explicit

Public Sub TrimLargeData()    'Time: 12.531 sec
    Const TRIM_SZ = 1000
    Dim t As Double, wb As Workbook, ws As Worksheet
    Dim lr As Long, r As Long, newWs As Worksheet, done As Collection

    t = Timer:  Set wb = ThisWorkbook
    FastWB True

    Set done = New Collection
    For Each ws In wb.Worksheets
        done.Add ws
    Next

    For Each ws In done
        lr = ws.UsedRange.Rows.Count

        For r = 1 To lr Step TRIM_SZ
           If r >= lr - (TRIM_SZ + 1) Then
                ws.Range(ws.Cells(r + 1, 1), ws.Cells(lr - 1, 1)).EntireRow.Hidden = True
                Exit For
           End If
           ws.Range(ws.Cells(r + 1, 1), ws.Cells(r + TRIM_SZ - 1, 1)).EntireRow.Hidden = True
        Next

        Set newWs = Worksheets.Add(After:=Worksheets(Worksheets.Count))
        newWs.Name = Left("Trimmed " & ws.Name, 30)
        ws.UsedRange.SpecialCells(xlCellTypeVisible).Copy newWs.Cells(1)
        ws.Delete
    Next
    FastWB False:   Debug.Print "Time: " & Format(Timer - t, "0.000") & " sec"
End Sub

模块 2 - 设置测试数据子程序和辅助程序


Option Explicit

'generates 30 million formulas (3 full columns) on 10 Worksheets, in about 1 min

Public Sub MakeTestData()
    Dim t As Double, ur As Range, ws As Worksheet

    t = Timer
    FastWB True
        FormatCells
        MakeWorksheets

        With ThisWorkbook
            Set ws = .Worksheets(1)
            Set ur = ws.Range("A1:C" & ws.Rows.Count)
            ur.Formula = "=Address(Row(), Column(), 4)"
            .Worksheets.FillAcrossSheets ur
        End With
    FastWB False
    Debug.Print "Time: " & Format(Timer - t, "0.000") & " sec"
End Sub

Private Sub FormatCells()
    With ThisWorkbook.Worksheets(1).Cells
        .HorizontalAlignment = xlCenter
        .VerticalAlignment = xlCenter
        .WrapText = False
        .IndentLevel = 0
        .MergeCells = False
    End With
End Sub

Private Sub MakeWorksheets()
    Dim ws As Worksheet, i As Long, wsName As Long

    With ThisWorkbook
        If .Worksheets.Count > 1 Then
            For Each ws In .Worksheets
                If ws.Index <> 1 Then ws.Delete
            Next
        End If
        For i = 1 To 10
            wsName = .Worksheets.Count
            .Worksheets.Add(After:=.Worksheets(wsName)).Name = wsName
        Next
    End With
End Sub

Public Sub FastWB(Optional ByVal opt As Boolean = True)
  With Application
    .Calculation = IIf(opt, xlCalculationManual, xlCalculationAutomatic)
    .DisplayAlerts = Not opt
    .DisplayStatusBar = Not opt
    .EnableAnimations = Not opt
    .EnableEvents = Not opt
    .ScreenUpdating = Not opt
  End With
  FastWS , opt
End Sub

Public Sub FastWS(Optional ByVal ws As Worksheet = Nothing, _
Optional ByVal opt As Boolean = True)
  If ws Is Nothing Then
    For Each ws In Application.ActiveWorkbook.Sheets
        EnableWS ws, opt
    Next
  Else
        EnableWS ws, opt
  End If
End Sub

Private Sub EnableWS(ByVal ws As Worksheet, ByVal opt As Boolean)
  With ws
    .DisplayPageBreaks = False
    .EnableCalculation = Not opt
    .EnableFormatConditionsCalculation = Not opt
    .EnablePivotTable = Not opt
    End With
End Sub

【讨论】:

嗨,保罗。这个宏似乎运行良好,只是它不会删除范围内的最后一组 1k 值。这应该如何纠正? 我更新了主 Sub 以删除最后一组,不包括最后一行 太棒了,保罗。非常感谢大家的帮助

以上是关于VBA - 删除每张工作簿上的每 N 行(每张工作表 100k+ 值)的主要内容,如果未能解决你的问题,请参考以下文章

VBA删除匹配数据的行

比较两张纸并突出显示每张纸上的差异 - 循环是唯一的方法吗?

VBA Excel - 将行复制到另一个带有条件的工作簿表

根据行数据复制特定单元格并粘贴到特定工作表上

使用VBA对Excel工作表进行筛选

使用VBA对Excel工作表进行筛选