How can I normalize the EOL character in Java?
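【Title】: How can I normalize the EOL character in Java?
【Posted】: 2011-04-16 04:05:15
【Question】: I have a Linux server and many clients running a variety of operating systems. The server receives input files from the clients. Linux uses LF as its end-of-line character, Mac uses CR, and Windows uses CR+LF.
The server needs LF as the end-of-line character. Using Java, I want to make sure the file always uses the Linux EOL character, LF. How can I achieve this?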
【Question comments】:
Modern Macs (that is, Macs running OS X) use LF as the line terminator.
【Answer 1】: Can you try this?
content.replaceAll("\\r\\n?", "\n")
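As a minimal sketch of how that one-liner behaves (the class name `EolRegexDemo` and the helper `toLf` are hypothetical, not part of the answer): the pattern `\r\n?` matches a CR optionally followed by an LF, so CRLF and lone CR are both rewritten in a single pass, while existing LFs are left alone.

```java
class EolRegexDemo {
    // Hypothetical helper: rewrites CRLF and lone CR to LF in one regex pass.
    static String toLf(String content) {
        return content.replaceAll("\\r\\n?", "\n");
    }

    public static void main(String[] args) {
        String mixed = "win\r\nmac\runix\n";
        String normalized = toLf(mixed);
        // Make the control characters visible before printing.
        System.out.println(normalized.replace("\n", "\\n")); // prints win\nmac\nunix\n
    }
}
```

Note that `replaceAll` returns a new String; the original `content` is unchanged, so the result has to be assigned somewhere.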
【Discussion】:
This approach is more efficient than calling replaceAll twice. It should be the accepted answer.
【Answer 2】:
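Combining two answers (by Visage and eumiro):
EDIT: After reading the comments, the line
System.getProperty("line.separator")
is of no use here.
Before sending the file to the server, open it, replace all EOLs, and write it back.
Make sure to use DataStreams to do this, and write in binary mode.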
String fileString;
//..
// read from the file
//..
// Windows (CRLF) first, so that only lone CRs (old Mac) remain for the second call
fileString = fileString.replaceAll("\\r\\n", "\n");
fileString = fileString.replaceAll("\\r", "\n");
//..
// write to file in binary mode.. something like:
DataOutputStream os = new DataOutputStream(new FileOutputStream("fname.txt"));
// getBytes() uses the platform default charset
os.write(fileString.getBytes());
os.close();
//..
// send file
//..
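The replaceAll method takes two arguments: the first is the string to replace and the second is the replacement. However, the first is treated as a regular expression, so '\' is interpreted accordingly. So:
"\\r\\n" in Java source is compiled to the four characters \r\n
\r\n is then interpreted as CR+LF by the regex engine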
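The two escaping layers can be checked directly. This is a small sketch (the class `EscapeLayersDemo` and the helper `replaceCrlf` are hypothetical names): the Java compiler resolves string-literal escapes first, and the regex engine then resolves whatever escapes are still present in the pattern text. Both spellings end up matching CR+LF.

```java
class EscapeLayersDemo {
    // Hypothetical helper: applies the given regex pattern and substitutes LF.
    static String replaceCrlf(String text, String pattern) {
        return text.replaceAll(pattern, "\n");
    }

    public static void main(String[] args) {
        String crlfText = "a\r\nb";

        // "\\r\\n" is four characters in memory: backslash, r, backslash, n.
        // The regex engine reads them as the escapes for CR and LF.
        String viaRegexEscapes = replaceCrlf(crlfText, "\\r\\n");

        // "\r\n" is two characters in memory: an actual CR and an actual LF.
        // The regex engine matches them literally.
        String viaLiteralChars = replaceCrlf(crlfText, "\r\n");

        System.out.println("\\r\\n".length());                       // 4
        System.out.println("\r\n".length());                         // 2
        System.out.println(viaRegexEscapes.equals(viaLiteralChars)); // true
    }
}
```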
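【Discussion】:
Nice, but I can't do anything on the server side. I need to do this on the client side. A conversion, maybe..
Do the same thing on the client side. See the revised answer. What kind of conversion are you talking about? If you mean encoding, that is a separate issue and has nothing to do with EOL.
Regardless of whether this answers the OP's question, why use regular expressions in the first place? You are not looking for a pattern, just a fixed sequence of characters. So why not simply use the replace() method, which does a literal replacement?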
lalli, the edited code works fine, but I need the output as a String. How do I convert the output stream to a String?
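One possible way to get a String back, sketched under assumptions: the class `StreamToStringDemo` and the helper `normalizeViaStream` are hypothetical, and UTF-8 is assumed as the encoding. The bytes are written to an in-memory `ByteArrayOutputStream` instead of a file and then decoded. If only the String is needed, the value returned by `replaceAll` already is that String and no stream is required.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class StreamToStringDemo {
    // Hypothetical helper: normalizes EOLs, writes the bytes to an in-memory
    // stream, and decodes them back into a String (UTF-8 is an assumption).
    static String normalizeViaStream(String fileString) throws IOException {
        String normalized = fileString.replaceAll("\\r\\n?", "\n");
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream os = new DataOutputStream(buffer)) {
            os.write(normalized.getBytes(StandardCharsets.UTF_8));
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String result = normalizeViaStream("one\r\ntwo\r");
        System.out.println(result.equals("one\ntwo\n")); // true
    }
}
```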
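Replace the first regex with \\r\\n? and you only need to call replaceAll once.
【Answer 3】:
Had to do this for a recent project. The method below normalizes the line endings in the given file to the line ending of the operating system the JVM is running on. So if your JVM runs on Linux, this normalizes all line endings to LF (\n).
It also works for very large files, because it uses buffered streams.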
public static void normalizeFile(File f) {
    File temp = null;
    BufferedReader bufferIn = null;
    BufferedWriter bufferOut = null;
    try {
        if (f.exists()) {
            // Create a new temp file to write to
            temp = new File(f.getAbsolutePath() + ".normalized");
            temp.createNewFile();
            // Get a stream to read from the un-normalized file
            FileInputStream fileIn = new FileInputStream(f);
            DataInputStream dataIn = new DataInputStream(fileIn);
            bufferIn = new BufferedReader(new InputStreamReader(dataIn));
            // Get a stream to write to the normalized file
            FileOutputStream fileOut = new FileOutputStream(temp);
            DataOutputStream dataOut = new DataOutputStream(fileOut);
            bufferOut = new BufferedWriter(new OutputStreamWriter(dataOut));
            // For each line in the un-normalized file
            String line;
            while ((line = bufferIn.readLine()) != null) {
                // Write the original line plus the operating-system dependent newline
                bufferOut.write(line);
                bufferOut.newLine();
            }
            bufferIn.close();
            bufferOut.close();
            // Remove the original file
            f.delete();
            // And rename the temp file to the original name
            temp.renameTo(f);
        } else {
            // If the file doesn't exist...
            log.warn("Could not find file to open: " + f.getAbsolutePath());
        }
    } catch (Exception e) {
        log.warn(e.getMessage(), e);
    } finally {
        // Clean up, temp should never exist
        FileUtils.deleteQuietly(temp);
        IOUtils.closeQuietly(bufferIn);
        IOUtils.closeQuietly(bufferOut);
    }
}
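For the original question, where the target is always LF regardless of the client's operating system, the same streaming idea can be written with an explicit `'\n'` instead of `newLine()`. The following is a sketch only, using `java.nio.file` and no third-party libraries; the class `LfFileNormalizer` and the method `normalizeToLf` are hypothetical names, and the file is assumed to be UTF-8 text (malformed input would raise an exception while reading).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class LfFileNormalizer {
    // Hypothetical helper: rewrites the file so that every line ends with LF.
    // Assumes UTF-8 text; lines are streamed, so large files are not loaded at once.
    static void normalizeToLf(Path file) throws IOException {
        // Create the temp file next to the original so the final move stays on one file system.
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "eol", ".tmp");
        try {
            try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
                 BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
                String line;
                // readLine() treats LF, CR and CRLF as terminators and strips them.
                while ((line = in.readLine()) != null) {
                    out.write(line);
                    out.write('\n'); // always LF, independent of the local platform
                }
            }
            Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; removes the temp file after a failure.
            Files.deleteIfExists(temp);
        }
    }
}
```

As with the method above, a final line that had no terminator in the input gains one in the output, because `readLine()` does not report whether the last line was terminated.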
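【Discussion】:
Why inject DataInputStream and DataOutputStream? Would it work just as well without them?
【Answer 4】:
Here is a comprehensive helper class for dealing with EOL issues. It is partly based on the solution posted by tyjen.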
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
/**
* Helper class to deal with end-of-line markers in text files.
*
* Loosely based on these examples:
* - http://***.com/a/9456947/1084488 (cc by-sa 3.0)
* - http://svn.apache.org/repos/asf/tomcat/trunk/java/org/apache/tomcat/buildutil/CheckEol.java (Apache License v2.0)
*
* This file is posted here to meet the "ShareAlike" requirement of cc by-sa 3.0:
* http://***.com/a/27930311/1084488
*
* @author Matthias Stevens
*/
public class EOLUtils {

    /**
     * Unix-style end-of-line marker (LF)
     */
    private static final String EOL_UNIX = "\n";

    /**
     * Windows-style end-of-line marker (CRLF)
     */
    private static final String EOL_WINDOWS = "\r\n";

    /**
     * "Old Mac"-style end-of-line marker (CR)
     */
    private static final String EOL_OLD_MAC = "\r";

    /**
     * Default end-of-line marker on current system
     */
    private static final String EOL_SYSTEM_DEFAULT = System.getProperty( "line.separator" );

    /**
     * The supported end-of-line marker modes
     */
    public static enum Mode {
        /**
         * Unix-style end-of-line marker ("\n")
         */
        LF,
        /**
         * Windows-style end-of-line marker ("\r\n")
         */
        CRLF,
        /**
         * "Old Mac"-style end-of-line marker ("\r")
         */
        CR
    }

    /**
     * The default end-of-line marker mode for the current system
     */
    public static final Mode SYSTEM_DEFAULT = ( EOL_SYSTEM_DEFAULT.equals( EOL_UNIX ) ? Mode.LF : ( EOL_SYSTEM_DEFAULT
        .equals( EOL_WINDOWS ) ? Mode.CRLF : ( EOL_SYSTEM_DEFAULT.equals( EOL_OLD_MAC ) ? Mode.CR : null ) ) );

    static {
        // Just in case...
        if ( SYSTEM_DEFAULT == null ) {
            throw new IllegalStateException( "Could not determine system default end-of-line marker" );
        }
    }

    /**
     * Determines the end-of-line {@link Mode} of a text file, based on the first line ending found.
     *
     * @param textFile the file to investigate
     * @return the end-of-line {@link Mode} of the given file
     * @throws Exception if the file cannot be read or contains no line ending
     */
    public static Mode determineEOL( File textFile )
        throws Exception {
        if ( !textFile.exists() ) {
            throw new IOException( "Could not find file to open: " + textFile.getAbsolutePath() );
        }
        FileInputStream fileIn = new FileInputStream( textFile );
        BufferedInputStream bufferIn = new BufferedInputStream( fileIn );
        try {
            int prev = -1;
            int ch;
            while ( ( ch = bufferIn.read() ) != -1 ) {
                if ( ch == '\n' ) {
                    if ( prev == '\r' ) {
                        return Mode.CRLF;
                    } else {
                        return Mode.LF;
                    }
                } else if ( prev == '\r' ) {
                    return Mode.CR;
                }
                prev = ch;
            }
            // A file whose only line ending is a CR as its very last byte
            if ( prev == '\r' ) {
                return Mode.CR;
            }
            throw new Exception( "Could not determine end-of-line marker mode" );
        } catch ( IOException ioe ) {
            throw new Exception( "Could not determine end-of-line marker mode", ioe );
        } finally {
            // Clean up:
            IOUtils.closeQuietly( bufferIn );
        }
    }

    /**
     * Checks whether the given text file has Windows-style (CRLF) line endings.
     *
     * @param textFile the file to investigate
     * @return true if the first line ending in the file is CRLF
     * @throws Exception
     */
    public static boolean hasWindowsEOL( File textFile )
        throws Exception {
        return Mode.CRLF.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has Unix-style (LF) line endings.
     *
     * @param textFile the file to investigate
     * @return true if the first line ending in the file is LF
     * @throws Exception
     */
    public static boolean hasUnixEOL( File textFile )
        throws Exception {
        return Mode.LF.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has "Old Mac"-style (CR) line endings.
     *
     * @param textFile the file to investigate
     * @return true if the first line ending in the file is CR
     * @throws Exception
     */
    public static boolean hasOldMacEOL( File textFile )
        throws Exception {
        return Mode.CR.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has line endings that conform to the system default mode (e.g. LF on Unix).
     *
     * @param textFile the file to investigate
     * @return true if the first line ending in the file matches the system default
     * @throws Exception
     */
    public static boolean hasSystemDefaultEOL( File textFile )
        throws Exception {
        return SYSTEM_DEFAULT.equals( determineEOL( textFile ) );
    }

    /**
     * Convert the line endings in the given file to Unix-style (LF).
     *
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToUnixEOL( File textFile )
        throws IOException {
        convertLineEndings( textFile, EOL_UNIX );
    }

    /**
     * Convert the line endings in the given file to Windows-style (CRLF).
     *
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToWindowsEOL( File textFile )
        throws IOException {
        convertLineEndings( textFile, EOL_WINDOWS );
    }

    /**
     * Convert the line endings in the given file to "Old Mac"-style (CR).
     *
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToOldMacEOL( File textFile )
        throws IOException {
        convertLineEndings( textFile, EOL_OLD_MAC );
    }

    /**
     * Convert the line endings in the given file to the system default mode.
     *
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToSystemEOL( File textFile )
        throws IOException {
        convertLineEndings( textFile, EOL_SYSTEM_DEFAULT );
    }

    /**
     * Line endings conversion method.
     *
     * @param textFile the file to process
     * @param eol the end-of-line marker to use (as a {@link String})
     * @throws IOException
     */
    private static void convertLineEndings( File textFile, String eol )
        throws IOException {
        File temp = null;
        BufferedReader bufferIn = null;
        BufferedWriter bufferOut = null;
        try {
            if ( textFile.exists() ) {
                // Create a new temp file to write to
                temp = new File( textFile.getAbsolutePath() + ".normalized" );
                temp.createNewFile();
                // Get a stream to read from the un-normalized file
                FileInputStream fileIn = new FileInputStream( textFile );
                DataInputStream dataIn = new DataInputStream( fileIn );
                bufferIn = new BufferedReader( new InputStreamReader( dataIn ) );
                // Get a stream to write to the normalized file
                FileOutputStream fileOut = new FileOutputStream( temp );
                DataOutputStream dataOut = new DataOutputStream( fileOut );
                bufferOut = new BufferedWriter( new OutputStreamWriter( dataOut ) );
                // For each line in the un-normalized file
                String line;
                while ( ( line = bufferIn.readLine() ) != null ) {
                    // Write the original line plus the requested end-of-line marker
                    bufferOut.write( line );
                    bufferOut.write( eol ); // write EOL marker
                }
                // Close buffered reader & writer:
                bufferIn.close();
                bufferOut.close();
                // Remove the original file
                textFile.delete();
                // And rename the temp file to the original name
                temp.renameTo( textFile );
            } else {
                // If the file doesn't exist...
                throw new IOException( "Could not find file to open: " + textFile.getAbsolutePath() );
            }
        } finally {
            // Clean up, temp should never exist
            FileUtils.deleteQuietly( temp );
            IOUtils.closeQuietly( bufferIn );
            IOUtils.closeQuietly( bufferOut );
        }
    }
}
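The detection logic in determineEOL looks only at the first line ending it meets. A self-contained sketch of the same heuristic, applied to an in-memory String rather than a file, is shown here; the class `EolSniffer`, the enum `Eol` and the method `firstEol` are hypothetical names, and returning `NONE` instead of throwing is a design choice made for this sketch only.

```java
class EolSniffer {
    enum Eol { LF, CRLF, CR, NONE }

    // Hypothetical helper: reports the style of the first line ending in the text.
    // Text with mixed endings is classified by its first ending only.
    static Eol firstEol(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n') {
                // An LF reached here was not preceded by CR, otherwise we would have returned already.
                return Eol.LF;
            }
            if (c == '\r') {
                boolean followedByLf = i + 1 < text.length() && text.charAt(i + 1) == '\n';
                return followedByLf ? Eol.CRLF : Eol.CR;
            }
        }
        return Eol.NONE;
    }

    public static void main(String[] args) {
        System.out.println(firstEol("a\r\nb")); // CRLF
        System.out.println(firstEol("a\rb"));   // CR
        System.out.println(firstEol("a\nb"));   // LF
        System.out.println(firstEol("ab"));     // NONE
    }
}
```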
【Discussion】:
【Answer 5】: Use
System.getProperty("line.separator")
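This gives you the (local) EOL character. You can then analyze the incoming file to determine which "flavor" it uses and convert accordingly.
Alternatively, get your clients to standardize!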
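A short sketch of why the local separator matters less when the target is fixed: whatever the client platform reports, the text can be forced to LF before it is sent. The class `LineSeparatorDemo` and the helper `toLf` are hypothetical names; `System.lineSeparator()` (Java 7 and later) returns the platform's line separator.

```java
class LineSeparatorDemo {
    // Hypothetical helper: forces LF regardless of the platform the client runs on.
    static String toLf(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // The platform separator: LF on Linux and macOS, CRLF on Windows.
        String local = System.lineSeparator();
        String produced = "first" + local + "second" + local;
        // The normalized form is the same on every platform.
        System.out.println(toLf(produced).equals("first\nsecond\n")); // true
    }
}
```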
【Discussion】:
How would I use this? I don't need the local setting; I always need the Linux one, meaning LF line endings, even when the file is generated on Windows.
By the way, I want to do this on the client side, not the server side.
【Answer 6】:
public static String normalize(String val) {
    return val.replace("\r\n", "\n")
            .replace("\r", "\n");
}
For HTML:
public static String normalize(String val) {
    return val.replace("\r\n", "<br/>")
            .replace("\n", "<br/>")
            .replace("\r", "<br/>");
}
【Discussion】:
【Answer 7】: Although String.replaceAll() is simpler to code, this should perform better because it does not go through the regex infrastructure.
/**
 * Accepts a non-null string and returns the string with all end-of-lines
 * normalized to a \n. This means \r\n and \r will both be normalized to \n.
 * <p>
 * Impl Notes: Although regex would have been easier to code, this approach
 * will be more efficient since it's purpose built for this use case. Note we only
 * construct a new StringBuilder and start appending to it if there are new end-of-lines
 * to be normalized found in the string. If there are no end-of-lines to be replaced
 * found in the string, this will simply return the input value.
 * </p>
 *
 * @param inputValue !null, input value that may or may not contain new lines
 * @return the input value that has new lines normalized
 */
static String normalizeNewLines(String inputValue) {
    StringBuilder stringBuilder = null;
    int index = 0;
    int len = inputValue.length();
    while (index < len) {
        char c = inputValue.charAt(index);
        if (c == '\r') {
            if (stringBuilder == null) {
                stringBuilder = new StringBuilder();
                // build up the string builder so it contains all the prior characters
                stringBuilder.append(inputValue.substring(0, index));
            }
            if ((index + 1 < len) &&
                    inputValue.charAt(index + 1) == '\n') {
                // this means we encountered a \r\n ... move index forward one more character
                index++;
            }
            stringBuilder.append('\n');
        } else {
            if (stringBuilder != null) {
                stringBuilder.append(c);
            }
        }
        index++;
    }
    return stringBuilder == null ? inputValue : stringBuilder.toString();
}
【Discussion】:
【Answer 8】: Since Java 12 you can use
str = str.indent(0);
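A small sketch of what that call does, assuming Java 12 or later (the class `IndentNormalizeDemo` and the helper `viaIndent` are hypothetical names). `String.indent` splits the text into lines, recognizing LF, CR and CRLF as terminators, and rejoins them with LF. One side effect to be aware of: every line, including the last, is suffixed with LF, so a non-empty string that lacked a final terminator gains one.

```java
class IndentNormalizeDemo {
    // Hypothetical helper; requires Java 12 or later for String.indent.
    static String viaIndent(String text) {
        // An indentation change of 0 leaves each line's content unchanged but still normalizes terminators.
        return text.indent(0);
    }

    public static void main(String[] args) {
        String normalized = viaIndent("a\r\nb\rc");
        // Make the control characters visible before printing.
        System.out.println(normalized.replace("\n", "\\n")); // prints a\nb\nc\n
    }
}
```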
【Discussion】:
【Answer 9】: A solution that recursively searches a path and fixes the line endings of the files found there
package handleFileLineEnd;

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class handleFileEndingMain {
    static int carriageReturnTotal;
    static int newLineTotal;

    public static void main(String[] args) throws IOException {
        processPath("c:/temp/directories");
        System.out.println("carriageReturnTotal (files have issue): " + carriageReturnTotal);
        System.out.println("newLineTotal: " + newLineTotal);
    }

    // Walks the directory tree recursively and checks every file
    private static void processPath(String path) throws IOException {
        File dir = new File(path);
        File[] directoryListing = dir.listFiles();
        if (directoryListing != null) {
            for (File child : directoryListing) {
                if (child.isDirectory()) {
                    processPath(child.toString());
                } else {
                    checkFile(child.toString());
                }
            }
        }
    }

    // Classifies the file by the first CR (13) or LF (10) byte it contains
    private static void checkFile(String fileName) throws IOException {
        Path path = FileSystems.getDefault().getPath(fileName);
        byte[] bytes = Files.readAllBytes(path);
        for (int counter = 0; counter < bytes.length; counter++) {
            if (bytes[counter] == 13) {
                carriageReturnTotal = carriageReturnTotal + 1;
                System.out.println(fileName);
                modifyFile(fileName);
                break;
            }
            if (bytes[counter] == 10) {
                newLineTotal = newLineTotal + 1;
                //System.out.println(fileName);
                break;
            }
        }
    }

    // Rewrites the file with CRLF and lone CR replaced by LF
    private static void modifyFile(String fileName) throws IOException {
        Path path = Paths.get(fileName);
        Charset charset = StandardCharsets.UTF_8;
        String content = new String(Files.readAllBytes(path), charset);
        content = content.replaceAll("\r\n", "\n");
        content = content.replaceAll("\r", "\n");
        Files.write(path, content.getBytes(charset));
    }
}
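The same recursive idea can also be sketched with `Files.walk`, which handles the directory traversal itself. This is an alternative under stated assumptions, not the answer's own code: the class `TreeEolNormalizer` and the method `normalizeTree` are hypothetical names, and every regular file under the root is assumed to be UTF-8 text. Binary files would be damaged by the decode and re-encode step, so a real tool would need to filter by extension or content first.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TreeEolNormalizer {
    // Hypothetical helper: rewrites every regular file under root that contains a CR.
    // Assumes all files are UTF-8 text; returns the number of files rewritten.
    static int normalizeTree(Path root) throws IOException {
        List<Path> files;
        // Collect the paths first so the directory stream is closed before any file is modified.
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int changed = 0;
        for (Path file : files) {
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // CRLF first, then any remaining lone CR.
            String normalized = content.replace("\r\n", "\n").replace('\r', '\n');
            if (!normalized.equals(content)) {
                Files.write(file, normalized.getBytes(StandardCharsets.UTF_8));
                changed++;
            }
        }
        return changed;
    }
}
```

Unlike the check above, which stops at the first CR or LF byte, this sketch rewrites a file whenever a CR appears anywhere in it, so files with mixed endings are also caught.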
【Discussion】: