PowerShell: clean up and cross join an Excel sheet whose cells contain multiple entries separated by commas or line breaks


function CartesianProduct($htRow, $currCol=0){
    $colCount = $htRow.Keys.Count
    if ($currCol -eq 0){
        $wordIndices = New-Object int[] $colCount
    }
    #wrap in @() so a single-column row still yields an array of columns
    $wordCount = @($htRow.Values)[$currCol].Count
    #walk through the items in the current column
    for ($wordIndex = 0; $wordIndex -lt $wordCount; $wordIndex++){
        #add the index to the indices for the current column
        $wordIndices[$currCol] =  $wordIndex
        #if we reach the end of the row
        if ($currCol -eq ($colCount - 1)) {
            $htCartesianSet = [ordered]@{}
            for ($colIndex = 0; $colIndex -lt $colCount; $colIndex++){
                #add the items to the result set based on the collected indices
                $key = @($htRow.Keys)[$colIndex]
                $value = @($htRow.Values)[$colIndex][$wordIndices[$colIndex]]
                $htCartesianSet.Add($key, $value)
            }
            [PSCustomObject]$htCartesianSet
        } 
        #do this for every column
        else {
            CartesianProduct $htRow ($currCol + 1)
        }
    }
}


function Remove-ComObject {
    end {
        Start-Sleep -Milliseconds 500
        [Management.Automation.ScopedItemOptions]$scopedOpt = 'ReadOnly, Constant'
        Get-Variable -Scope 1 | Where-Object {
            $_.Value.PSTypeNames -contains 'System.__ComObject' -and -not ($scopedOpt -band $_.Options)
        } | Remove-Variable -Scope 1 -Verbose:([Bool]$PSBoundParameters['Verbose'].IsPresent)
        [GC]::Collect()
    }
}

function ImportAndCrossJoin($path){
    $xls = New-Object -ComObject Excel.Application
    $xls.Visible = $false
    $wb = $xls.Workbooks.Open($path)
    $ws = $wb.Sheets.Item(1)
    $lastRow = ($ws.UsedRange.Rows).Count
    $lastCol = ($ws.UsedRange.Columns).Count
    foreach ($row in (2..$lastRow)){
        $ht = [ordered]@{}
        foreach ($col in (1..$lastCol)){
            $heading = $ws.Cells.Item(1,$col).value2
            #split the cell on commas or line breaks, then trim whitespace from each entry
            $parts = ($ws.Cells.Item($row,$col).value2 -split ",|`n").Trim()
            $ht."$heading" = @($parts) 
        }
        CartesianProduct $ht
    }
    $wb.Close()
    $xls.Quit()
    #release the COM references held in this scope so the Excel process can exit
    Remove-ComObject
}


ImportAndCrossJoin C:\test.xlsx | Export-Csv -NoTypeInformation -Path C:\test2.csv
Invoke-Item C:\test2.csv
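
To see what the cross join produces without opening Excel, the sketch below expands a single in-memory row. It is an illustration only: `Expand-Row` is a hypothetical helper that is not part of the script above, and it builds the combinations iteratively rather than recursively. The sample row and its column names are made up.

```powershell
# Hypothetical helper (not from the original script): expands one row, given as
# an ordered dictionary of column name -> array of entries, into one object
# per combination of entries.
function Expand-Row {
    param([System.Collections.Specialized.OrderedDictionary]$Row)

    # Start with a single empty combination and extend it column by column.
    $combos = @([ordered]@{})
    foreach ($key in @($Row.Keys)) {
        $next = foreach ($combo in $combos) {
            foreach ($value in @($Row[$key])) {
                # Copy the partial combination, then add this column's entry.
                $copy = [ordered]@{}
                foreach ($k in @($combo.Keys)) { $copy[$k] = $combo[$k] }
                $copy[$key] = $value
                $copy
            }
        }
        $combos = @($next)
    }
    foreach ($combo in $combos) { [PSCustomObject]$combo }
}

# Made-up sample row; the Color cell mixes a comma and a line break,
# and is split the same way the script splits each Excel cell.
$row = [ordered]@{
    Name  = @('Widget')
    Color = ("red, green`nblue" -split ",|`n").Trim()
    Size  = @('S', 'M')
}

$result = @(Expand-Row $row)
$result | Format-Table -AutoSize
```

With one name, three colors and two sizes, the row expands to 1 x 3 x 2 = 6 objects, and the last column varies fastest, which matches the order in which the recursive `CartesianProduct` emits its results.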
