Problem merging TSV files with pandas
I've run into a problem with pandas where it simply fails to merge certain rows. For example, when I try to merge the following two excerpts:
Haggai 1:1 In the second year of Darius the king, in the sixth month, in the first day of the month, the Word of Yahweh came by Haggai, the prophet, to Zerubbabel, the son of Shealtiel, governor of Judah, and to Joshua, the son of Jehozadak, the high priest, saying,
Haggai 1:2 "This is what Yahweh of Hosts says: These people say, 'The time hasn't yet come, the time for Yahweh's house to be built.'"
Haggai 1:3 Then the Word of Yahweh came by Haggai, the prophet, saying,
Haggai 1:4 "Is it a time for you yourselves to dwell in your paneled houses, while this house lies waste?
Haggai 1:5 Now therefore this is what Yahweh of Hosts says: Consider your ways.
Haggai 1:6 You have sown much, and bring in little. You eat, but you don't have enough. You drink, but you aren't filled with drink. You clothe yourselves, but no one is warm, and he who earns wages earns wages to put them into a bag with holes in it."
and
Haggai 1:1 ΕΝ τῷ δευτέρῳ ἔτει ἐπὶ Δαρίου τοῦ βασιλέως ἐν τῷ μηνὶ τῷ ἕκτῳ μιᾷ τοῦ μηνὸς ἐγένετο λόγος Κυρίου ἐν χειρὶ Ἀγγαίου τοῦ προφήτου λέγων Εἰπὸν πρὸς Ζοροβάβελ τὸν τοῦ Σαλαθιὴλ ἐκ φυλῆς Ἰούδα καὶ πρὸς Ἰησοῦν τὸν τοῦ Ἰωσεδὲκ τὸν ἱερέα τὸν μέγαν λέγων
Haggai 1:2 Τάδε λέγει Κύριος Παντοκράτωρ λέγων Ὁ λαὸς οὗτος λέγουσιν Οὐχ ἤκει ὁ καιρὸς τοῦ οἰκοδομῆσαι τὸν οἶκον Κυρίου.
Haggai 1:3 καὶ ἐγένετο λόγος Κυρίου ἐν χειρὶ Ἀγγαίου τοῦ προφήτου λέγων
Haggai 1:4 Εἰ καιρὸς μέν ὑμῖν ἐστιν τοῦ οἰκεῖν ἐν οἴκοις ὑμῶν κοιλοστάθμοις, ὁ δὲ οἶκος ὑμῶν ἐξηρήμωται;
Haggai 1:5 καὶ νῦν τάδε λέγει Κύριος Παντοκράτωρ Τάξατε δὴ τὰς καρδίας ὑμῶν εἰς τὰς ὁδοὺς ὑμῶν·
Haggai 1:6 ἐσπείρατε πολλὰ καὶ εἰσηνέγκατε ὀλίγα, ἐφάγετε καὶ οὐκ εἰς πλησμονήν, ἐπίετε καὶ οὐκ εἰς μέθην, περιεβάλεσθε καὶ οὐκ ἐθερμάνθητε ἐν αὐτοῖς, καὶ ὁ τοὺς μισθοὺς συνάγων συνήγαγεν εἰς δεσμὸν τετρυπημένον.
I get:
21 Haggai 1:1 ΕΝ τῷ δευτέρῳ ἔτει ἐπὶ Δαρίου τοῦ βασιλέως ἐν τῷ μηνὶ τῷ ἕκτῳ μιᾷ τοῦ μηνὸς ἐγένετο λόγος Κυρίου ἐν χειρὶ Ἀγγαίου τοῦ προφήτου λέγων Εἰπὸν πρὸς Ζοροβάβελ τὸν τοῦ Σαλαθιὴλ ἐκ φυλῆς Ἰούδα καὶ πρὸς Ἰησοῦν τὸν τοῦ Ἰωσεδὲκ τὸν ἱερέα τὸν μέγαν λέγων In the second year of Darius the king, in the sixth month, in the first day of the month, the Word of Yahweh came by Haggai, the prophet, to Zerubbabel, the son of Shealtiel, governor of Judah, and to Joshua, the son of Jehozadak, the high priest, saying,
22 Haggai 1:2 Τάδε λέγει Κύριος Παντοκράτωρ λέγων Ὁ λαὸς οὗτος λέγουσιν Οὐχ ἤκει ὁ καιρὸς τοῦ οἰκοδομῆσαι τὸν οἶκον Κυρίου. This is what Yahweh of Hosts says: These people say, 'The time hasn't yet come, the time for Yahweh's house to be built.'
23 Haggai 1:3 καὶ ἐγένετο λόγος Κυρίου ἐν χειρὶ Ἀγγαίου τοῦ προφήτου λέγων Then the Word of Yahweh came by Haggai, the prophet, saying,
24 Haggai 1:4 Εἰ καιρὸς μέν ὑμῖν ἐστιν τοῦ οἰκεῖν ἐν οἴκοις ὑμῶν κοιλοστάθμοις, ὁ δὲ οἶκος ὑμῶν ἐξηρήμωται; "Is it a time for you yourselves to dwell in your paneled houses, while this house lies waste?
Haggai 1:5 Now therefore this is what Yahweh of Hosts says: Consider your ways.
Haggai 1:6 You have sown much, and bring in little. You eat, but you don't have enough. You drink, but you aren't filled with drink. You clothe yourselves, but no one is warm, and he who earns wages earns wages to put them into a bag with holes in it."
Clearly, Haggai 1:5 and Haggai 1:6 are not merged correctly.
The code I'm using is:
import pandas as pd
df1 = pd.read_table('greekBible.txt')
df2 = pd.read_table('englishBible.txt')
df3 = pd.merge(df1, df2, on=['Book', 'Chapter:Verse'])
df3.to_csv('test.txt', sep="\t")
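Keep in mind that this is only a small excerpt. Also, the two Bibles are not perfectly aligned: some entries in one are missing from the other, and vice versa. As far as I understand, though, that shouldn't be a problem.
Any help with this would be greatly appreciated!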
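On the alignment point: `pd.merge` performs an inner join by default, so a verse that appears in only one file is dropped from the result without any warning. A minimal sketch of how to see which verses fail to match, using `how="outer"` and `indicator=True`; the `Greek` and `English` column names and the placeholder cell values are assumptions for illustration, not the real file contents.

```python
import pandas as pd

# Placeholder frames: the text values are dummies, not real verse text.
greek = pd.DataFrame({
    "Book": ["Haggai", "Haggai"],
    "Chapter:Verse": ["1:1", "1:2"],
    "Greek": ["greek text for 1:1", "greek text for 1:2"],
})
english = pd.DataFrame({
    "Book": ["Haggai", "Haggai"],
    "Chapter:Verse": ["1:2", "1:3"],
    "English": ["english text for 1:2", "english text for 1:3"],
})

keys = ["Book", "Chapter:Verse"]

# Default inner join: only verses present in both frames survive.
inner = pd.merge(greek, english, on=keys)

# Outer join keeps every verse; the _merge column says where each came from
# ("left_only", "right_only", or "both").
outer = pd.merge(greek, english, on=keys, how="outer", indicator=True)

print(inner[keys])
print(outer[keys + ["_merge"]])
```

Filtering `outer` on `_merge != "both"` lists the verses that exist in only one of the two files.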
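Answer
It looks like your file has unbalanced quotation marks, which may be due to how the text was split up. It is most noticeable on this line: Haggai 1:4 "Is it a time for you yourselves to dwell in your paneled houses, while this house lies waste? The quote opened there is not closed until the end of Haggai 1:6, so the parser reads everything in between, line breaks included, as a single quoted field.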
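One possible way to act on this, sketched here under assumptions, is to tell pandas not to treat quotation marks as field delimiters by passing `quoting=csv.QUOTE_NONE` when reading. The in-memory strings stand in for the real files, the `English` and `Greek` headers are assumed (the question does not show the header row), and the 1:6 verses are shortened.

```python
import csv
import io

import pandas as pd

# Stand-in for englishBible.txt. The "English" header is an assumption.
# The quote opened in 1:4 is only closed at the end of 1:6 (shortened here).
english_tsv = (
    "Book\tChapter:Verse\tEnglish\n"
    "Haggai\t1:3\tThen the Word of Yahweh came by Haggai, the prophet, saying,\n"
    "Haggai\t1:4\t\"Is it a time for you yourselves to dwell in your paneled "
    "houses, while this house lies waste?\n"
    "Haggai\t1:5\tNow therefore this is what Yahweh of Hosts says: Consider your ways.\n"
    "Haggai\t1:6\tYou have sown much, and bring in little.\"\n"
)

# Stand-in for greekBible.txt. The "Greek" header is an assumption.
greek_tsv = (
    "Book\tChapter:Verse\tGreek\n"
    "Haggai\t1:3\tκαὶ ἐγένετο λόγος Κυρίου ἐν χειρὶ Ἀγγαίου τοῦ προφήτου λέγων\n"
    "Haggai\t1:4\tΕἰ καιρὸς μέν ὑμῖν ἐστιν τοῦ οἰκεῖν ἐν οἴκοις ὑμῶν κοιλοστάθμοις\n"
    "Haggai\t1:5\tκαὶ νῦν τάδε λέγει Κύριος Παντοκράτωρ\n"
    "Haggai\t1:6\tἐσπείρατε πολλὰ καὶ εἰσηνέγκατε ὀλίγα\n"
)

# Default parsing: the quote in 1:4 starts a quoted field that swallows
# the 1:5 and 1:6 lines, so those verses never become rows of their own.
default_df = pd.read_csv(io.StringIO(english_tsv), sep="\t")

# QUOTE_NONE: quotation marks are ordinary characters, one line per row.
english_df = pd.read_csv(io.StringIO(english_tsv), sep="\t", quoting=csv.QUOTE_NONE)
greek_df = pd.read_csv(io.StringIO(greek_tsv), sep="\t", quoting=csv.QUOTE_NONE)

merged = pd.merge(greek_df, english_df, on=["Book", "Chapter:Verse"])

print("rows with default quoting:", len(default_df))
print("rows with QUOTE_NONE:", len(english_df))
print(merged[["Book", "Chapter:Verse"]])
```

With this option the quotation marks remain in the text as literal characters, which is usually what you want for verse text. The alternative is to repair the source file so that every opening quote has a matching close on the same line.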