Python: efficiently extracting columns by index from a list

Posted 2021-05-10

"""
Extracts specified columns from a list of rows. The list of column indices is converted into the minimum number of
Python slice objects ahead of time, before iterating over the rows; the slices are then used to extract ranges of columns.
"""

from itertools import chain


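def prepare_slices(indices):
    """
    Converts a list of non-negative column indices into a short list of slice objects.

    Consecutive indices with a constant positive stride are merged into one slice.
    The input list is not modified.

    >>> x = [2, 4, 13, 16, 19, 23, 24, 25]
    >>> prepare_slices(x)
    [slice(2, 5, 2), slice(13, 20, 3), slice(23, 26, 1)]
    >>> prepare_slices([2, 4, 13])
    [slice(2, 5, 2), slice(13, 14, 1)]
    >>> prepare_slices([])
    []

    """
    slices = []
    start = stop = step = None
    for index in indices:
        if start is None:
            # Open a new run holding a single index.
            start = stop = index
        elif step is None and index > stop:
            # The second index of a run fixes its stride.
            stop, step = index, index - start
        elif step is not None and index - stop == step:
            # Same stride: extend the current run.
            stop = index
        else:
            # Stride changed (or index repeated/decreased): close the run, start a new one.
            slices.append(slice(start, stop + 1, step or 1))
            start, stop, step = index, index, None
    if start is not None:
        # Close the final run; a lone index becomes a one-element slice.
        slices.append(slice(start, stop + 1, step or 1))
    return slices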


def slice_columns(seq, slices):
    """
    Extracts items from a sequence using a list of slice objects.

    >>> letters_a_to_z = [chr(97 + i) for i in range(26)]
    >>> slices_ = [slice(2, 5, 2), slice(13, 20, 3), slice(23, 26, 1)]
    >>> slice_columns(letters_a_to_z, slices_)
    ['c', 'e', 'n', 'q', 't', 'x', 'y', 'z']

    """
    # Each seq[s] is a sub-list; chain them into one flat list.
    return list(chain.from_iterable(seq[s] for s in slices))


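def slice_rows(rows, indices):
    """
    Extracts columns from a list of rows, yielding one list per row.

    >>> rows = [[1, 2, 3, 4], [5, 6, 7, 8]]
    >>> list(slice_rows(rows, [0, 2]))
    [[1, 3], [5, 7]]

    """
    # Build the slices once, from a copy, so the caller's index list is left intact.
    slices = prepare_slices(list(indices))
    for row in rows:
        yield slice_columns(row, slices)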


if __name__ == '__main__':
    import doctest
    doctest.testmod()
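
As a quick sanity check of the idea, the sketch below applies the slices from the doctests above to a couple of rows and compares the result with plain per-index lookup via `operator.itemgetter`. The sample `rows` data and the helper name `extract` are made up for illustration and are not part of the original snippet.

```python
from itertools import chain
from operator import itemgetter

# Column indices and the equivalent slices, taken from the doctests above.
indices = [2, 4, 13, 16, 19, 23, 24, 25]
slices = [slice(2, 5, 2), slice(13, 20, 3), slice(23, 26, 1)]

# Hypothetical sample data: two rows of 26 columns each.
rows = [
    [chr(97 + i) for i in range(26)],  # 'a' .. 'z'
    list(range(26)),                   # 0 .. 25
]


def extract(row, slices):
    """Concatenate the pieces of `row` selected by each slice (illustrative helper)."""
    return list(chain.from_iterable(row[s] for s in slices))


by_slice = [extract(row, slices) for row in rows]
by_index = [list(itemgetter(*indices)(row)) for row in rows]

# Three slice operations per row select the same columns as eight index lookups.
assert by_slice == by_index
print(by_slice[0])  # ['c', 'e', 'n', 'q', 't', 'x', 'y', 'z']
```

Whether the slice-based version is actually faster than per-index lookup depends on how well the indices group into runs, so it is worth timing both on representative data.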
