456 Python string module contents (removing punctuation from text)
Posted by alex_bn_lee
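The string module is mainly useful for NLP preprocessing. It defines several constant strings, including digits, letters, uppercase letters, lowercase letters, punctuation, and whitespace.
Reference: 6.1. string - Common string operations
It can be used to remove punctuation from text by replacing each punctuation character with an empty string.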
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> string.digits
'0123456789'
>>> string.ascii_letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> string.hexdigits
'0123456789abcdefABCDEF'
>>> string.printable
'0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
>>> string.whitespace
' \t\n\r\x0b\x0c'
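The punctuation removal mentioned above can be sketched in two ways. The helper names remove_punctuation and remove_punctuation_replace are hypothetical and only for illustration. The first uses str.maketrans with str.translate, and the second follows the "replace with empty" approach literally.

```python
import string

# Translation table whose third argument lists characters to delete:
# every character in string.punctuation is mapped to None.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation(text: str) -> str:
    """Return text with every ASCII punctuation character removed."""
    return text.translate(_PUNCT_TABLE)


def remove_punctuation_replace(text: str) -> str:
    """Same result, replacing each punctuation character with an empty string."""
    for ch in string.punctuation:
        text = text.replace(ch, "")
    return text


sample = "Hello, world! It's 3:45 p.m."
print(remove_punctuation(sample))          # Hello world Its 345 pm
print(remove_punctuation_replace(sample))  # Hello world Its 345 pm
```

Note that string.punctuation contains only ASCII punctuation, so full-width marks such as ， and 。 are left untouched by both helpers. The translate version makes a single pass over the text, which is usually preferable for long inputs.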
6.1.1. String constants
The constants defined in this module are:
- string.ascii_letters
  The concatenation of the ascii_lowercase and ascii_uppercase constants described below. This value is not locale-dependent.
- string.ascii_lowercase
  The lowercase letters 'abcdefghijklmnopqrstuvwxyz'. This value is not locale-dependent and will not change.
- string.ascii_uppercase
  The uppercase letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. This value is not locale-dependent and will not change.
- string.digits
  The string '0123456789'.
- string.hexdigits
  The string '0123456789abcdefABCDEF'.
- string.octdigits
  The string '01234567'.
- string.punctuation
  String of ASCII characters which are considered punctuation characters in the C locale.
- string.printable
  String of ASCII characters which are considered printable. This is a combination of digits, ascii_letters, punctuation, and whitespace.
- string.whitespace
  A string containing all ASCII characters that are considered whitespace. This includes the characters space, tab, linefeed, return, formfeed, and vertical tab.
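The relationships between the constants listed above can be checked directly. The following is a minimal sketch. The char_class helper and the sample string are assumptions made for illustration and are not part of the string module.

```python
import string
from collections import Counter

# Relationships stated in the constant descriptions.
assert string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase
assert string.printable == (
    string.digits + string.ascii_letters + string.punctuation + string.whitespace
)

# Sets give exact single-character membership tests
# (a plain `in` on a str would also match substrings).
_CLASSES = {
    "digit": set(string.digits),
    "lowercase": set(string.ascii_lowercase),
    "uppercase": set(string.ascii_uppercase),
    "punctuation": set(string.punctuation),
    "whitespace": set(string.whitespace),
}


def char_class(ch: str) -> str:
    """Hypothetical helper: name the string constant that contains ch, else 'other'."""
    for name, chars in _CLASSES.items():
        if ch in chars:
            return name
    return "other"


counts = Counter(char_class(c) for c in "Ab 1!")
print(dict(counts))
```

Characters outside ASCII, such as accented letters or CJK text, fall into "other" here, because every constant in the module is restricted to ASCII.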