Reading a line from a StreamReader without consuming?
【Posted】: 2010-10-24 22:56:19
【Question】: Is there a way to read ahead one line to test whether the next line contains specific tag data?
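I'm dealing with a format that has a start tag but no end tag.
I would like to read a line, add it to a structure, and then test the line below to make sure it is not a new "node". If it is not, I keep adding; if it is, I close off that structure and start a new one.
The only solution I can think of is to have two stream readers going at the same time, shuffling along in lock step, but that seems wasteful (if it would even work).
I need something like Peek, but for a whole line: a PeekLine.
【Comments】: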
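I don't think a PeekLine method is a good way to handle the "no end tag" problem, because you always have to peek at the line and test whether a new structure starts. I would rather set the stream position back to the previous line, so that the next ReadLine returns the line you have already read.
【Answer 1】: The problem is that the underlying stream may not even be seekable. If you look at the StreamReader implementation, it uses a buffer, which is how it can implement TextReader.Peek() even when the stream is not seekable.
You could write a simple adapter that reads the next line and buffers it internally, something like this: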
using System.Collections.Generic;
using System.IO;

public class PeekableStreamReaderAdapter
{
    private StreamReader Underlying;
    private Queue<string> BufferedLines;

    public PeekableStreamReaderAdapter(StreamReader underlying)
    {
        Underlying = underlying;
        BufferedLines = new Queue<string>();
    }

    // Reads one line further ahead on each call and buffers it for ReadLine().
    public string PeekLine()
    {
        string line = Underlying.ReadLine();
        if (line == null)
            return null;
        BufferedLines.Enqueue(line);
        return line;
    }

    // Returns buffered lines first, then continues with the underlying reader.
    public string ReadLine()
    {
        if (BufferedLines.Count > 0)
            return BufferedLines.Dequeue();
        return Underlying.ReadLine();
    }
}
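The adapter above looks one line further ahead on every PeekLine() call. If a peek that keeps returning the same upcoming line is preferred, one possible variation is to cache a single line instead of a queue. This is only a sketch: the names PeekableLineReader and PeekDemo are made up for illustration, and it assumes that wrapping any TextReader is acceptable.

```csharp
using System;
using System.IO;

// Hypothetical wrapper (not from the original answer): caches at most one
// line so that repeated PeekLine() calls return the same upcoming line.
public sealed class PeekableLineReader
{
    private readonly TextReader _inner;
    private string _peeked;
    private bool _hasPeeked;

    public PeekableLineReader(TextReader inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    // Returns the next line without consuming it; null at end of input.
    public string PeekLine()
    {
        if (!_hasPeeked)
        {
            _peeked = _inner.ReadLine();
            _hasPeeked = true;
        }
        return _peeked;
    }

    // Returns the next line and consumes it; null at end of input.
    public string ReadLine()
    {
        if (_hasPeeked)
        {
            _hasPeeked = false;
            string line = _peeked;
            _peeked = null;
            return line;
        }
        return _inner.ReadLine();
    }
}

public static class PeekDemo
{
    public static void Main()
    {
        var reader = new PeekableLineReader(new StringReader("a\nb"));
        Console.WriteLine(reader.PeekLine());         // a
        Console.WriteLine(reader.PeekLine());         // a (still not consumed)
        Console.WriteLine(reader.ReadLine());         // a
        Console.WriteLine(reader.ReadLine());         // b
        Console.WriteLine(reader.ReadLine() == null); // True
    }
}
```

Because the wrapper never touches the underlying stream position, it should behave the same whether or not the stream is seekable.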
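【Comments】:
I would initialize BufferedLines before using it.
:) Also, I would use a different name for PeekLine(), since the name suggests it would always return the same line (the next line after the position of the last ReadLine). Voted +1 already.
Thanks, added the initializer. I never even compiled the code. Maybe something like LookAheadReadLine() would be more appropriate.
I expanded on this slightly so that the class inherits from TextReader: gist.github.com/1317325
@AndyEdinborough I like the PeekableTextReader.
@AndyEdinborough You just saved me two hours. Nice work, thanks a lot!
【Answer 2】:
You can store the position by reading StreamReader.BaseStream.Position, then read the next line, run your test, and seek back to the position you stored before reading the line: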
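// Peek at the next line
long peekPos = reader.BaseStream.Position;
string line = reader.ReadLine();

if (line != null && line.StartsWith("<tag start>"))
{
    // This is a new tag, so we reset the position.
    // Caveat: StreamReader reads ahead into its own buffer, so
    // BaseStream.Position may already be past the line just read.
    reader.BaseStream.Seek(peekPos, SeekOrigin.Begin);
    reader.DiscardBufferedData();
}
else
{
    // This is part of the same node.
}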
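That is a lot of seeking and re-reading of the same lines. With some logic you may be able to avoid it altogether: for instance, when you see a new tag start, close out the existing structure and start a new one. Here is a basic algorithm: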
SomeStructure myStructure = null;
while (!reader.EndOfStream)
{
    string currentLine = reader.ReadLine();
    if (currentLine.StartsWith("<tag start>"))
    {
        if (myStructure != null)
        {
            // Close out the existing structure.
        }

        // Create a new structure and add this line.
        myStructure = new SomeStructure();
        // Append to myStructure.
    }
    else
    {
        // Add to the existing structure.
        if (myStructure != null)
        {
            // Append to existing myStructure.
        }
        else
        {
            // This means the first line was not part of a structure.
            // Either handle this case, or throw an exception.
        }
    }
}
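To make the algorithm concrete, here is a small self-contained sketch that groups lines into nodes, treating each start line as the implicit end of the previous node. The names NodeGrouping, ReadNodes and GroupingDemo are invented for this example, the "<tag start>" prefix is the placeholder from the answer, and throwing on a stray leading line is just one of the two options mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class NodeGrouping
{
    // Splits the input into nodes. A node begins at each line for which
    // isNodeStart returns true and runs until the next such line.
    public static List<List<string>> ReadNodes(TextReader reader, Func<string, bool> isNodeStart)
    {
        var nodes = new List<List<string>>();
        List<string> current = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (isNodeStart(line))
            {
                // Seeing a new start line implicitly closes the previous node.
                current = new List<string>();
                nodes.Add(current);
            }
            else if (current == null)
            {
                // The first line was not part of any node.
                throw new InvalidDataException("Line before first node: " + line);
            }
            current.Add(line);
        }
        return nodes;
    }
}

public static class GroupingDemo
{
    public static void Main()
    {
        string text = "<tag start> one\nalpha\nbeta\n<tag start> two\ngamma";
        var nodes = NodeGrouping.ReadNodes(
            new StringReader(text),
            l => l.StartsWith("<tag start>", StringComparison.Ordinal));
        Console.WriteLine(nodes.Count);    // 2
        Console.WriteLine(nodes[0].Count); // 3
        Console.WriteLine(nodes[1].Count); // 2
    }
}
```

No peeking or seeking is needed here, since each line is read exactly once and the decision to close a node is made when the next start line arrives.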
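【Comments】:
Look here: it seems the position of the underlying stream may not always match that of the StreamReader: ***.com/questions/1737591/streamreader-c-peek
【Answer 3】: Why the difficulty? Read the next line regardless. Check whether it is a new node; if not, add it to the struct. If it is, create a new struct.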
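// Sketch: reader, Struct (with AddLine) and IsNode are placeholders
// for your own reader, type and start-of-node test.
var structs = new List<Struct>();
Struct current = null;
string line;
while ((line = reader.ReadLine()) != null)
{
    if (IsNode(line))
    {
        // A new node starts: store the finished struct, if any.
        if (current != null) structs.Add(current);
        current = new Struct();
        continue;
    }
    // Whatever processing you need to do
    if (current != null) current.AddLine(line);
}
if (current != null) structs.Add(current); // Add the last one to the collection

// Use your structures here
foreach (Struct s in structs)
{
}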
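【Comments】:
【Answer 4】: Here is what I have done so far. I went more down the split route than the line-by-line StreamReader route.
I'm sure there are a few places that could be more elegant, but for now it seems to be working.
Please let me know what you think.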
using System;
using System.Collections.Generic;
using System.IO;

// The members below are part of a form class.
struct INDI
{
    public string ID;
    public string Name;
    public string Sex;
    public string BirthDay;
    public bool Dead;
}

struct FAM
{
    public string FamID;
    public string type;
    public string IndiID;
}

List<INDI> Individuals = new List<INDI>();
List<FAM> Family = new List<FAM>();

private void button1_Click(object sender, EventArgs e)
{
    string path = @"C:\mostrecent.ged";
    ParseGedcom(path);
}
private void ParseGedcom(string path)
{
    // Read the entire GED file, disposing the reader when done
    string text;
    using (StreamReader SR = new StreamReader(path))
    {
        text = SR.ReadToEnd();
    }

    // Split on "0 @" for individuals and families (no other info is needed for this instance)
    string[] Holder = text.Replace("0 @", "\u0646").Split('\u0646');

    // For each cell in the holder array, look for individuals and families
    foreach (string Node in Holder)
    {
        // Sub-split the string on line breaks to get a true block of info
        string[] SubNode = Node.Replace("\r\n", "\r").Split('\r');

        // If an individual is found
        if (SubNode[0].Contains("INDI"))
        {
            // Create a new structure
            INDI I = new INDI();
            // Add the ID number and remove extra formatting
            I.ID = SubNode[0].Replace("@", "").Replace(" INDI", "").Trim();
            // Find the name and remove extra formatting around the last name
            I.Name = SubNode[FindIndexinArray(SubNode, "NAME")].Replace("1 NAME", "").Replace("/", "").Trim();
            // Find the sex and remove extra formatting
            I.Sex = SubNode[FindIndexinArray(SubNode, "SEX")].Replace("1 SEX ", "").Trim();

            // Determine whether there is a birthday; -1 means no
            if (FindIndexinArray(SubNode, "1 BIRT ") != -1)
            {
                // Add the birthday (the line after the BIRT tag) to the struct
                I.BirthDay = SubNode[FindIndexinArray(SubNode, "1 BIRT ") + 1].Replace("2 DATE ", "").Trim();
            }

            // Determine whether there is a death tag; -1 means not found
            if (FindIndexinArray(SubNode, "1 DEAT ") != -1)
            {
                // Convert Y or N to true or false (defaults to false, so only Y needs handling)
                if (SubNode[FindIndexinArray(SubNode, "1 DEAT ")].Replace("1 DEAT ", "").Trim() == "Y")
                {
                    // Set death
                    I.Dead = true;
                }
            }

            // Add the struct to the list for later use
            Individuals.Add(I);
        }
        // Start family section
        else if (SubNode[0].Contains("FAM"))
        {
            // Grab the family ID from the node once to avoid doing it over and over
            string FamID = SubNode[0].Replace("@ FAM", "");

            // Multiple children can exist for each family, so this section has to be a bit more dynamic
            // Look at each line of the node
            foreach (string Line in SubNode)
            {
                // If the line is a husband
                if (Line.Contains("1 HUSB "))
                {
                    FAM F = new FAM();
                    F.FamID = FamID;
                    F.type = "PAR";
                    F.IndiID = Line.Replace("1 HUSB ", "").Replace("@", "").Trim();
                    Family.Add(F);
                }
                // If the line is a wife
                else if (Line.Contains("1 WIFE "))
                {
                    FAM F = new FAM();
                    F.FamID = FamID;
                    F.type = "PAR";
                    F.IndiID = Line.Replace("1 WIFE ", "").Replace("@", "").Trim();
                    Family.Add(F);
                }
                // If the line is a child (there may be several)
                else if (Line.Contains("1 CHIL "))
                {
                    FAM F = new FAM();
                    F.FamID = FamID;
                    F.type = "CHIL";
                    F.IndiID = Line.Replace("1 CHIL ", "").Replace("@", "");
                    Family.Add(F);
                }
            }
        }
    }
}
// Returns the index of the last element containing search, or -1 if none does.
private int FindIndexinArray(string[] Arr, string search)
{
    int Val = -1;
    for (int i = 0; i < Arr.Length; i++)
    {
        if (Arr[i].Contains(search))
            Val = i;
    }
    return Val;
}
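The chained Replace calls above rely on the layout of a GEDCOM line: a level number, an optional @id@ reference, a tag, and an optional value. As an alternative to string replacement, each line could be split into those parts once. The following is only a sketch under that assumption; GedcomLine, its members and GedcomLineDemo are hypothetical names, and the sample ids (@I1@, @F1@) are invented for the demo.

```csharp
using System;

// Hypothetical holder for the parts of one GEDCOM line.
public sealed class GedcomLine
{
    public int Level;
    public string XrefId; // e.g. "I1" for "0 @I1@ INDI"; null when absent
    public string Tag;
    public string Value;  // rest of the line; empty when absent

    public static GedcomLine Parse(string line)
    {
        // Split into at most three parts: level, tag-or-id, and the rest.
        string[] parts = line.Trim().Split(new[] { ' ' }, 3);
        if (parts.Length < 2)
            throw new FormatException("Not a GEDCOM line: " + line);

        var result = new GedcomLine { Level = int.Parse(parts[0]) };
        string rest = parts.Length > 2 ? parts[2] : "";

        bool isXref = parts[1].Length > 2
            && parts[1].StartsWith("@", StringComparison.Ordinal)
            && parts[1].EndsWith("@", StringComparison.Ordinal);
        if (isXref)
        {
            // Record line such as "0 @I1@ INDI": the second field is the id.
            result.XrefId = parts[1].Trim('@');
            string[] tagAndValue = rest.Split(new[] { ' ' }, 2);
            result.Tag = tagAndValue[0];
            result.Value = tagAndValue.Length > 1 ? tagAndValue[1] : "";
        }
        else
        {
            result.Tag = parts[1];
            result.Value = rest;
        }
        return result;
    }
}

public static class GedcomLineDemo
{
    public static void Main()
    {
        GedcomLine record = GedcomLine.Parse("0 @I1@ INDI");
        Console.WriteLine(record.XrefId + " " + record.Tag); // I1 INDI

        GedcomLine name = GedcomLine.Parse("1 NAME John /Smith/");
        Console.WriteLine(name.Tag + ": " + name.Value);     // NAME: John /Smith/
    }
}
```

With lines parsed this way, a Level of 0 can serve as the "start tag" test from the question, so the same line-by-line grouping applies without calling ReadToEnd().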
【Comments】:
FAM and INDI are horrible names for those structures (if anyone else might need to read or use your code).
Those are the names of the tags; I figured they were pretty self-explanatory.