Python07: Python built-in data structures: strings and bytes

Posted 2020-09-20

I. Strings

1. Definition and initialization

In [4]: s = "hello python"

In [5]: s
Out[5]: 'hello python'

In [6]: s = 'hello python'

In [7]: s
Out[7]: 'hello python'

In [8]: s = '''hello python'''

In [9]: s
Out[9]: 'hello python'

In [10]: s = """hello python"""

In [11]: s
Out[11]: 'hello python'

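In Python, single and double quotes are equivalent; both can only define a single-line string.

Triple quotes can define a multi-line string.

So single or double quotes on the one hand and triple quotes on the other do behave differently: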

In [24]: s = 'hello  python
  File "<ipython-input-24-54fb5309d2d0>", line 1
    s = 'hello  python
                      ^
SyntaxError: EOL while scanning string literal


A backslash as the last character of the line continues the string on the next line:

In [25]: s = 'hello  python \
    ...: i like python'

In [26]: s
Out[26]: 'hello  python i like python'

In [22]: s = """hello python
    ...: i like python"""

In [23]: s
Out[23]: 'hello python\ni like python'

The factory function str():

In [12]: print(str.__doc__)
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.

In [13]: s = str("abc")

In [14]: s
Out[14]: 'abc'

In [16]: s = str([1, 2])

In [17]: s
Out[17]: '[1, 2]'

In [18]: s = str(1)

In [19]: s
Out[19]: '1'
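
The docstring above also describes a second form, str(bytes_or_buffer, encoding), which decodes bytes instead of calling __str__(). The following is a minimal sketch of the difference; the sample text and variable names are illustrative only.

```python
# Sample text containing one non-ASCII character (e with acute accent).
text = "caf\u00e9"
data = text.encode("utf-8")        # bytes: b'caf\xc3\xa9'

decoded = str(data, "utf-8")       # with an encoding: decodes the buffer
same = data.decode("utf-8")        # equivalent and more commonly written
as_repr = str(data)                # without an encoding: repr of the bytes object

print(decoded == text)             # True
print(as_repr)                     # b'caf\xc3\xa9'
```

Without an encoding argument, str() does not decode anything; it returns the printable representation of the bytes object, including the b prefix.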

2. String escapes

In [32]: s = "i like \n python"

In [33]: s
Out[33]: 'i like \n python'

In [34]: s = "i like \npython"

In [35]: s
Out[35]: 'i like \npython'

In [36]: s = 'I'm xj'
  File "<ipython-input-36-0b8827686244>", line 1
    s = 'I'm xj'
           ^
SyntaxError: invalid syntax


In [37]: s = 'I\'m xj'

In [38]: s
Out[38]: "I'm xj"


In [50]: path = 'c:\windows\nt\system32'    # the \n here is interpreted as a newline escape

In [51]: path
Out[51]: 'c:\\windows\nt\\system32'

In [52]: path = 'c:\\windows\\nt\\system32'  # so backslashes usually have to be doubled

In [53]: path
Out[53]: 'c:\\windows\\nt\\system32'

In [54]: path = r'c:\windows\nt\system32'   # the r (raw) prefix marks a raw string, in which backslashes are not escapes

In [55]: path
Out[55]: 'c:\\windows\\nt\\system32'
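
One way to see what the r prefix changes is to compare lengths. This is a small sketch with illustrative variable names, not part of the original session.

```python
escaped = "a\nb"     # 3 characters: 'a', newline, 'b'
raw = r"a\nb"        # 4 characters: 'a', backslash, 'n', 'b'

print(len(escaped), len(raw))

# A raw string and a string with doubled backslashes are the same value.
path_raw = r"c:\windows\nt\system32"
path_escaped = "c:\\windows\\nt\\system32"
print(path_raw == path_escaped)
```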

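II. String operations

1. Indexing

In [59]: s = "I'm xjj"

In [60]: s[1]
Out[60]: "'"

In [61]: s[2]
Out[61]: 'm'

In [62]: s[3]
Out[62]: ' '

str.count() and str.index() behave as they do on list and tuple, except that they search for a substring rather than a single element.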
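
A short sketch of count() and index() on a str follows. It also shows str.find(), which is not covered above but is a common alternative when the substring may be absent.

```python
s = "I love python"

o_count = s.count("o")         # non-overlapping occurrences of the substring
love_at = s.index("love")      # lowest index at which the substring starts

try:
    s.index("java")            # index() raises when the substring is absent
    missing_raised = False
except ValueError:
    missing_raised = True

java_at = s.find("java")       # find() returns -1 instead of raising

print(o_count, love_at, missing_raised, java_at)
```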

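2. Joining and splitting str

1) Joining str

str.join()

Joins the str elements of an iterable into a single str.

The argument is an iterable whose elements are all str; the receiver (the str the method is called on) is the separator.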

In [71]: print(str.join.__doc__)
S.join(iterable) -> str

Return a string which is the concatenation of the strings in the
iterable.  The separator between elements is S.

In [81]: lst = ["I", "am", "xxj"]    # every element of the iterable must be a str

In [82]: ''.join(lst)
Out[82]: 'Iamxxj'

In [83]: ' '.join(lst)
Out[83]: 'I am xxj'

In [84]: ','.join(lst)
Out[84]: 'I,am,xxj'

In [85]: ',!'.join(lst)
Out[85]: 'I,!am,!xxj'

In [86]: ' , '.join(lst)
Out[86]: 'I , am , xxj'

In [87]: lst = [1, 2, 3]

In [88]: ','.join(lst)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-88-b4c772e35459> in <module>()
----> 1 ','.join(lst)

TypeError: sequence item 0: expected str instance, int found
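
A common way around this TypeError is to convert each element to str before joining. A minimal sketch:

```python
lst = [1, 2, 3]

# Convert every element to str first; join() itself never converts.
joined = ",".join(str(item) for item in lst)
joined_map = ",".join(map(str, lst))   # same result using map()

print(joined)
```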

In [93]: "hello" + "python"
Out[93]: 'hellopython'

In [94]: str1 = "xxj"

In [95]: str1 + 1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-95-2584ac008f78> in <module>()
----> 1 str1 + 1

TypeError: must be str, not int

In [96]: str1 + "hello"
Out[96]: 'xxjhello'

In [97]: str1 + " hello"
Out[97]: 'xxj hello'
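
The + operator only concatenates str with str, so a number has to be converted explicitly. The sketch below shows three common ways to do that; the f-string form assumes Python 3.6 or later.

```python
str1 = "xxj"
n = 1

via_str = str1 + str(n)               # explicit conversion, then +
via_format = "{}{}".format(str1, n)   # str.format() converts for you
via_fstring = f"{str1}{n}"            # f-string, Python 3.6+

print(via_str, via_format, via_fstring)
```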

2) Splitting

str.split()

Does not modify the string in place; returns a list of the pieces between occurrences of the separator.

In [99]: print(s.split.__doc__)
S.split(sep=None, maxsplit=-1) -> list of strings

Return a list of the words in S, using sep as the
delimiter string.  If maxsplit is given, at most maxsplit
splits are done. If sep is not specified or is None, any
whitespace string is a separator and empty strings are
removed from the result.

In [98]: s = "I love python"

In [100]: s.split("o")              # splits at every occurrence by default
Out[100]: ['I l', 've pyth', 'n']

In [101]: s.split("o", 1)           # split at most once
Out[101]: ['I l', 've python']

In [102]: s.split()                 # with no separator, splits on runs of whitespace
Out[102]: ['I', 'love', 'python']

In [103]: s.split("ov")            # the separator may be longer than one character
Out[103]: ['I l', 'e python']

In [159]: s.split("A")            # if the separator does not occur, the whole str is the only item
Out[159]: ['I love python']

In [104]: s = "I love      python"

In [105]: s.split()
Out[105]: ['I', 'love', 'python']

In [108]: s.split(" ")              # use exactly one space as the separator
Out[108]: ['I', 'love', '', '', '', '', '', 'python']


In [110]: s.split(maxsplit=1)
Out[110]: ['I', 'love      python']

In [111]: s.split()
Out[111]: ['I', 'love', 'python']
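
The maxsplit argument matters when the separator can also appear inside the data. The following sketch uses a hypothetical helper, parse_header, to split a "Name: value" line at the first colon only.

```python
def parse_header(line):
    """Split 'Name: value' at the first colon only (hypothetical helper)."""
    name, value = line.split(":", 1)
    return name.strip(), value.strip()


header = parse_header("Host: localhost:8080")
print(header)

# Without maxsplit the colon inside the value would be split as well.
print("Host: localhost:8080".split(":"))
```

Note that the helper raises ValueError on a line without a colon, because the unpacking expects exactly two parts.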

str.rsplit():

Splits from right to left.

When maxsplit is not given, str.rsplit() and str.split() return the same result, but str.split() is more efficient.

In [122]: s = "I love python"

In [123]: s.rsplit("o")
Out[123]: ['I l', 've pyth', 'n']

In [124]: s.rsplit("o", 1)
Out[124]: ['I love pyth', 'n']
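
A typical use of rsplit() with maxsplit is to cut off the last component of a name. The file name below is only an illustrative value.

```python
filename = "archive.tar.gz"

from_right = filename.rsplit(".", 1)   # split once, starting at the right
from_left = filename.split(".", 1)     # split once, starting at the left

print(from_right)
print(from_left)
```

For real file paths the standard library's os.path.splitext or pathlib may be a better fit; the sketch is only meant to show the direction of the split.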

str.splitlines():

Splits on line boundaries; keeping the line breaks in the result is optional. Returns a list.

In [136]: print(str.splitlines.__doc__)
S.splitlines([keepends]) -> list of strings

Return a list of the lines in S, breaking at line boundaries.
Line breaks are not included in the resulting list unless keepends
is given and true.

In [137]: s = """I am xxj
     ...: i love python"""

In [138]: s
Out[138]: 'I am xxj\ni love python'

In [139]: s.splitlines()
Out[139]: ['I am xxj', 'i love python']

In [140]: s.splitlines(true)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-140-dfaf8d28775c> in <module>()
----> 1 s.splitlines(true)

NameError: name 'true' is not defined

In [141]: s.splitlines(True)
Out[141]: ['I am xxj\n', 'i love python']
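
splitlines() is not the same as split("\n"): it recognises several kinds of line break and does not produce a trailing empty item. A small sketch with an illustrative input:

```python
text = "a\r\nb\n"      # one Windows-style and one Unix-style line break

by_lines = text.splitlines()        # line breaks recognised and removed
by_newline = text.split("\n")       # only '\n' is a separator here
kept = text.splitlines(True)        # keepends=True keeps the line breaks

print(by_lines)
print(by_newline)
print(kept)
```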

str.partition():

Always returns a 3-tuple: the string is split once, at the first occurrence of the given separator, into (head, sep, tail).

In [145]: print(str.partition.__doc__)
S.partition(sep) -> (head, sep, tail)

Search for the separator sep in S, and return the part before it,
the separator itself, and the part after it.  If the separator is not
found, return S and two empty strings.

In [147]: s = "I love python"

In [148]: s.partition("o")
Out[148]: ('I l', 'o', 've python')

str.rpartition() is the right-to-left counterpart of str.partition():

In [153]: s.rpartition("o")
Out[153]: ('I love pyth', 'o', 'n')

In [154]: s.rpartition("A")
Out[154]: ('', '', 'I love python')

In [155]: "A".rpartition("A")
Out[155]: ('', 'A', '')

In [156]: "".rpartition("A")
Out[156]: ('', '', '')

In [157]: " ".rpartition("A")
Out[157]: ('', '', ' ')
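
Because partition() always returns three items, it can be unpacked safely even when the separator is missing. The sketch below uses a hypothetical helper, parse_pair, to read "key=value" text.

```python
def parse_pair(text):
    """Return (key, value); value is None if there is no '=' (hypothetical helper)."""
    key, sep, value = text.partition("=")
    if not sep:                 # separator not found: sep is the empty string
        return text, None
    return key, value


first = parse_pair("a=b=c")     # split at the first '=' only
missing = parse_pair("flag")    # no '=' at all
last = "a=b=c".rpartition("=")  # rpartition() splits at the last '='

print(first, missing, last)
```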

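3. Case conversion and layout

In [2]: s = "I love python"

In [3]: s.upper()
Out[3]: 'I LOVE PYTHON'

In [5]: s.lower()
Out[5]: 'i love python'

In [6]: s.title()        # capitalize the first letter of every word
Out[6]: 'I Love Python'

In [8]: s.capitalize()   # capitalize the first letter of the str and lowercase the rest
Out[8]: 'I love python'

In [10]: print(s.center.__doc__)       # center within the given width, padding with a single fill character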
S.center(width[, fillchar]) -> str

Return S centered in a string of length width. Padding is
done using the specified fill character (default is a space)

In [11]: s.center(50)
Out[11]: '                  I love python                   '

In [12]: s.center(50, "#")
Out[12]: '##################I love python###################'

In [13]: s.center(50, "#%")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-4aa39ce1c3b3> in <module>()
----> 1 s.center(50, "#%")

TypeError: The fill character must be exactly one character long

In [19]: s
Out[19]: 'I love python'

In [20]: s.zfill(5)
Out[20]: 'I love python'

In [21]: s.zfill(50)        # pad on the left with zeros up to the given width
Out[21]: '0000000000000000000000000000000000000I love python'

In [23]: print(s.casefold.__doc__)
S.casefold() -> str

Return a version of S suitable for caseless comparisons.

In [25]: s
Out[25]: 'I love python'

In [26]: s.casefold()     # returns a case-folded str intended for caseless comparison
Out[26]: 'i love python'

In [27]: s.swapcase()     # swap upper and lower case
Out[27]: 'i LOVE PYTHON'

In [36]: "\t".expandtabs()  # by default a tab expands to the next multiple of 8 columns, here 8 spaces
Out[36]: '        '

In [40]: "\t".expandtabs(8)
Out[40]: '        '

In [37]: "\t".expandtabs(3)
Out[37]: '   '
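
For plain ASCII text casefold() and lower() give the same result, so the difference only shows up with certain non-ASCII characters. A sketch using the German sharp s (written as an escape so the example does not depend on file encoding):

```python
a = "Stra\u00dfe"    # 'Strasse' spelled with the sharp s character
b = "STRASSE"

same_with_lower = a.lower() == b.lower()            # sharp s is left unchanged by lower()
same_with_casefold = a.casefold() == b.casefold()   # casefold() maps sharp s to 'ss'

print(same_with_lower, same_with_casefold)
```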

4. Modification

str.replace()

Replaces occurrences of the old str with the new str and returns a new str.

In [44]: print(str.replace.__doc__)
S.replace(old, new[, count]) -> str

Return a copy of S with all occurrences of substring
old replaced by new.  If the optional argument count is
given, only the first count occurrences are replaced.

In [47]: s
Out[47]: 'I love python'

In [48]: s.replace("love", "give up")
Out[48]: 'I give up python'

In [49]: s.replace("o", 0)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-184707d40696> in <module>()
----> 1 s.replace("o", 0)

TypeError: replace() argument 2 must be str, not int

In [50]: s.replace("o", "O")
Out[50]: 'I lOve pythOn'

In [51]: s.replace("o", "O", -1)
Out[51]: 'I lOve pythOn'

In [52]: s.replace("o", "O", 0)
Out[52]: 'I love python'

In [53]: s.replace("o", "O", 1)
Out[53]: 'I lOve python'

In [54]: s.replace("o", "O", 5)
Out[54]: 'I lOve pythOn'
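
Since str is immutable, replace() never changes the original; the result has to be assigned to a name to be kept. A short sketch, which also chains two calls:

```python
s = "I love python"

t = s.replace("o", "0", 1)      # only the first occurrence is replaced
chained = s.replace("I", "We").replace("python", "Python")

print(s)         # the original is unchanged
print(t)
print(chained)
```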

str.strip()

str.rstrip()

str.lstrip()

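Remove characters that belong to the given character set from both ends of the str (rstrip() only from the end, lstrip() only from the beginning).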

In [62]: print(str.strip.__doc__)
S.strip([chars]) -> str

Return a copy of the string S with leading and trailing
whitespace removed.
If chars is given and not None, remove characters in chars instead.

In [66]: s = " I love python  "

In [67]: s
Out[67]: ' I love python  '

In [68]: s.strip()           # by default strips whitespace from both ends
Out[68]: 'I love python'

In [69]: s.lstrip()          # strips leading whitespace
Out[69]: 'I love python  '

In [70]: s.rstrip()          # strips trailing whitespace
Out[70]: ' I love python'

In [76]: s = "\n \r \t haha \n \r\t"

In [77]: s
Out[77]: '\n \r \t haha \n \r\t'

In [78]: s.strip()
Out[78]: 'haha'

In [84]: s = "I love python haha"

In [86]: s.strip("a")
Out[86]: 'I love python hah'

In [87]: s.strip("ha")
Out[87]: 'I love python '

In [88]: s.strip("on")
Out[88]: 'I love python haha'

In [89]: s
Out[89]: 'I love python haha'

In [91]: s = "{{ haha haha }}"

In [92]: s
Out[92]: '{{ haha haha }}'

In [94]: s.strip("{}")
Out[94]: ' haha haha '

In [95]: s.strip("{}s")      # strips any character that is in the given set
Out[95]: ' haha haha '

In [96]: s.lstrip("{}s")
Out[96]: ' haha haha }}'
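
Because the argument is a set of characters and not a prefix or suffix, strip() can remove more than intended. The sketch below contrasts lstrip() with a hypothetical helper, strip_prefix, that removes an exact prefix once. Python 3.9 and later also provide str.removeprefix() and str.removesuffix() for this purpose.

```python
def strip_prefix(text, prefix):
    """Remove prefix once if text starts with it (hypothetical helper)."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


name = "hahaha-python"

as_charset = name.lstrip("ha")          # strips every leading 'h' or 'a'
as_prefix = strip_prefix(name, "ha")    # strips the two characters 'ha' once

print(as_charset)
print(as_prefix)
```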

str.ljust()

str.rjust()

Left-justify or right-justify the str and pad it to the given width

In [98]: print(str.ljust.__doc__)
S.ljust(width[, fillchar]) -> str

Return S left-justified in a Unicode string of length width. Padding is
done using the specified fill character (default is a space).


In [105]: s = "xxj"

In [106]: s.ljust(3)
Out[106]: 'xxj'

In [107]: s.ljust(1)
Out[107]: 'xxj'

In [108]: s.ljust(10)
Out[108]: 'xxj       '

In [109]: s.ljust(10, "A")
Out[109]: 'xxjAAAAAAA'

In [110]: s.ljust(10, "Ab")
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-110-b45e86b2e828> in <module>()
----> 1 s.ljust(10, "Ab")

TypeError: The fill character must be exactly one character long

In [111]: s.rjust(10, "A")
Out[111]: 'AAAAAAAxxj'
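
ljust() and rjust() are often combined to line up columns of text. The following is a small sketch with made-up sample rows.

```python
# Illustrative data: (name, quantity) pairs.
rows = [("apple", 3), ("kiwi", 12), ("watermelon", 7)]

# Width of the name column is the longest name.
width = max(len(name) for name, _ in rows)

# Names are left-justified with dots, quantities right-justified in 4 columns.
lines = [name.ljust(width, ".") + str(qty).rjust(4) for name, qty in rows]

for line in lines:
    print(line)
```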
