How can I normalize the EOL character in Java?

【Posted】: 2011-04-16 04:05:15

【Question】:

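I have a Linux server and many clients running a variety of operating systems. The server receives input files from the clients. Linux uses LF as its end-of-line character, Mac uses CR, and Windows uses CR+LF.

The server needs LF as the end-of-line character. Using Java, I want to make sure the file always uses the Linux EOL character, LF. How can I achieve this?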

【Comments】:

Modern Macs (i.e., Macs running OS X) use LF as the line terminator.

【Answer 1】:

Can you try this?

content.replaceAll("\\r\\n?", "\n")
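
To make the one-liner concrete, here is a minimal sketch of it in use. The class name `EolRegexDemo` and the method name `toLf` are made up for illustration; only `String.replaceAll` comes from the answer. The pattern reads "a CR, optionally followed by an LF", so CRLF and a lone CR are both handled in one pass.

```java
final class EolRegexDemo {

    // Normalizes CRLF and lone CR to LF in a single pass.
    // Strings are immutable, so the result has to be returned or assigned.
    static String toLf(String content) {
        return content.replaceAll("\\r\\n?", "\n");
    }

    public static void main(String[] args) {
        String mixed = "unix\nwindows\r\nold mac\rend";
        String normalized = toLf(mixed);
        // Print with visible escapes so the line endings can be inspected
        System.out.println(normalized.replace("\n", "\\n"));
    }
}
```

Note that calling `content.replaceAll(...)` on its own line discards the result; it needs to be assigned back, as in `content = content.replaceAll("\\r\\n?", "\n");`.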

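【Comments】:

This solution is more efficient than calling replaceAll twice. It should be the accepted answer.

【Answer 2】:

Combining the two answers (by Visage and eumiro):

EDIT: After reading the comments: System.getProperty("line.separator") is of no use here, then. Before sending the file to the server, open it, replace all the EOLs, and write it back. Make sure to use DataStreams to do so, and write in binary mode.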

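String fileString;
//..
// read the whole file into fileString
//..
// Windows (CRLF) first, then any remaining lone CR (old Mac)
fileString = fileString.replaceAll("\\r\\n", "\n");
fileString = fileString.replaceAll("\\r", "\n");
//..
// write the bytes as-is (no newline translation); something like:
try (DataOutputStream os = new DataOutputStream(new FileOutputStream("fname.txt"))) {
    // encode explicitly (java.nio.charset.StandardCharsets); UTF-8 is an assumption
    os.write(fileString.getBytes(StandardCharsets.UTF_8));
}
//..
// send file
//..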

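The replaceAll method takes two arguments: the first is the string to replace, and the second is the replacement. However, the first is treated as a regular expression, so '\' is interpreted accordingly. So:

"\\r\\n" is converted to \r\n (backslash, r, backslash, n) by the Java compiler
\r\n is then interpreted as CR+LF by the regex engine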

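The two levels of escaping can be checked directly. This is a small sketch with made-up names (`EscapeLevelsDemo`, `matchesCrLf`); it only relies on `String.length` and `java.util.regex.Pattern.matches`. It also shows that the plain literal "\r\n" works as a regex too, because a raw CR or LF character in a pattern simply matches itself.

```java
import java.util.regex.Pattern;

final class EscapeLevelsDemo {

    // True if the given regex matches exactly one CR+LF pair.
    static boolean matchesCrLf(String regex) {
        return Pattern.matches(regex, "\r\n");
    }

    public static void main(String[] args) {
        // The Java literal "\\r\\n" holds four characters: \ r \ n
        System.out.println("\\r\\n".length());      // 4
        // The Java literal "\r\n" holds two characters: CR LF
        System.out.println("\r\n".length());        // 2

        // As regular expressions, both forms match a CR+LF pair
        System.out.println(matchesCrLf("\\r\\n"));  // true
        System.out.println(matchesCrLf("\r\n"));    // true
    }
}
```
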
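【Comments】:

Good, but I can't do anything on the server side. I need to do something on the client side. Like a conversion, maybe.

Do the same on the client side. See the revised answer. What kind of conversion are you talking about? If you mean encoding, that is a separate issue and has nothing to do with EOL.

Regardless of whether this answers the OP's question, why use a regex in the first place? You aren't looking for a pattern, just a fixed character sequence. So why not simply use the replace() method?

lalli, the edited code works fine, but I need the output as a string. How can I convert the output stream to a string?

Replace the first regex with \\r\\n? and you only need to call replaceAll once.

【Answer 3】: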

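I had to do this for a recent project. The method below normalizes the line endings in the given file to the line ending specified by the OS the JVM is running on. So if your JVM is running on Linux, this will normalize all line endings to LF (\n).

Because it uses buffered streams, it also works on very large files.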

public static void normalizeFile(File f) {
    File temp = null;
    BufferedReader bufferIn = null;
    BufferedWriter bufferOut = null;

    try {
        if (f.exists()) {
            // Create a new temp file to write to
            temp = new File(f.getAbsolutePath() + ".normalized");
            temp.createNewFile();

            // Get a stream to read from the un-normalized file
            FileInputStream fileIn = new FileInputStream(f);
            DataInputStream dataIn = new DataInputStream(fileIn);
            bufferIn = new BufferedReader(new InputStreamReader(dataIn));

            // Get a stream to write to the normalized file
            FileOutputStream fileOut = new FileOutputStream(temp);
            DataOutputStream dataOut = new DataOutputStream(fileOut);
            bufferOut = new BufferedWriter(new OutputStreamWriter(dataOut));

            // For each line in the un-normalized file
            String line;
            while ((line = bufferIn.readLine()) != null) {
                // Write the original line plus the operating-system dependent newline
                bufferOut.write(line);
                bufferOut.newLine();
            }

            bufferIn.close();
            bufferOut.close();

            // Remove the original file
            f.delete();

            // And rename the new file to the original name
            temp.renameTo(f);
        } else {
            // If the file doesn't exist...
            log.warn("Could not find file to open: " + f.getAbsolutePath());
        }
    } catch (Exception e) {
        log.warn(e.getMessage(), e);
    } finally {
        // Clean up, temp should never exist
        FileUtils.deleteQuietly(temp);
        IOUtils.closeQuietly(bufferIn);
        IOUtils.closeQuietly(bufferOut);
    }
}

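Since the question asks for LF regardless of where the JVM runs, a variation on the method above may be useful. On a Windows client, `newLine()` would write CRLF, and `readLine()` always ends the last line with a line break even if the source file did not. The following is a sketch under stated assumptions, not a drop-in replacement: the names `LfNormalizer`, `copyWithLf`, `toLf` and `normalizeFile` are hypothetical, the file's charset is supplied by the caller, and only standard `java.io` / `java.nio.file` calls are used (no Data streams, no commons-io).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class LfNormalizer {

    // Copies characters from in to out, turning CRLF and lone CR into LF.
    // Everything else, including a missing final line break, is left as it was.
    static void copyWithLf(Reader in, Writer out) throws IOException {
        boolean previousWasCr = false;
        int ch;
        while ((ch = in.read()) != -1) {
            if (ch == '\r') {
                out.write('\n');          // CR (alone or part of CRLF) becomes LF
                previousWasCr = true;
            } else if (ch == '\n' && previousWasCr) {
                previousWasCr = false;    // LF of a CRLF pair: already written
            } else {
                out.write(ch);
                previousWasCr = false;
            }
        }
    }

    // Convenience wrapper for in-memory text.
    static String toLf(String text) {
        StringWriter out = new StringWriter();
        try {
            copyWithLf(new StringReader(text), out);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
        return out.toString();
    }

    // Rewrites the file in place via a temporary file in the same directory.
    // The charset is the caller's assumption about the file's encoding.
    static void normalizeFile(Path file, Charset charset) throws IOException {
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "eol", ".tmp");
        try {
            try (BufferedReader in = Files.newBufferedReader(file, charset);
                 BufferedWriter out = Files.newBufferedWriter(temp, charset)) {
                copyWithLf(in, out);
            }
            Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(temp); // no-op once the move has succeeded
        }
    }
}
```

A call might look like `LfNormalizer.normalizeFile(Paths.get("fname.txt"), StandardCharsets.UTF_8);`, run on the client before the file is sent.
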
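【Comments】:

Why wrap the streams in DataInputStream/DataOutputStream? Wouldn't it work just as well without them?

【Answer 4】:

Here is a comprehensive helper class for dealing with EOL issues. It is partially based on the solution posted by tyjen.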

import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;

/**
 * Helper class to deal with end-of-line markers in text files.
 * 
 * Loosely based on these examples:
 *  - http://***.com/a/9456947/1084488 (cc by-sa 3.0)
 *  - http://svn.apache.org/repos/asf/tomcat/trunk/java/org/apache/tomcat/buildutil/CheckEol.java (Apache License v2.0)
 * 
 * This file is posted here to meet the "ShareAlike" requirement of cc by-sa 3.0:
 *    http://***.com/a/27930311/1084488
 * 
 * @author Matthias Stevens
 */
public class EOLUtils
{

    /**
     * Unix-style end-of-line marker (LF)
     */
    private static final String EOL_UNIX = "\n";

    /**
     * Windows-style end-of-line marker (CRLF)
     */
    private static final String EOL_WINDOWS = "\r\n";

    /**
     * "Old Mac"-style end-of-line marker (CR)
     */
    private static final String EOL_OLD_MAC = "\r";

    /**
     * Default end-of-line marker on current system
     */
    private static final String EOL_SYSTEM_DEFAULT = System.getProperty( "line.separator" );

    /**
     * The supported end-of-line marker modes
     */
    public static enum Mode
    {
        /**
         * Unix-style end-of-line marker ("\n")
         */
        LF,

        /**
         * Windows-style end-of-line marker ("\r\n")
         */
        CRLF,

        /**
         * "Old Mac"-style end-of-line marker ("\r")
         */
        CR
    }

    /**
     * The default end-of-line marker mode for the current system
     */
    public static final Mode SYSTEM_DEFAULT = ( EOL_SYSTEM_DEFAULT.equals( EOL_UNIX ) ? Mode.LF : ( EOL_SYSTEM_DEFAULT
        .equals( EOL_WINDOWS ) ? Mode.CRLF : ( EOL_SYSTEM_DEFAULT.equals( EOL_OLD_MAC ) ? Mode.CR : null ) ) );
    static
    {
        // Just in case...
        if ( SYSTEM_DEFAULT == null )
        {
            throw new IllegalStateException( "Could not determine system default end-of-line marker" );
        }
    }

    /**
     * Determines the end-of-line {@link Mode} of a text file.
     * Only the first line ending found in the file is considered.
     * 
     * @param textFile the file to investigate
     * @return the end-of-line {@link Mode} of the given file
     * @throws Exception if the file cannot be read or contains no line ending
     */
    public static Mode determineEOL( File textFile )
        throws Exception
    {
        if ( !textFile.exists() )
        {
            throw new IOException( "Could not find file to open: " + textFile.getAbsolutePath() );
        }

        FileInputStream fileIn = new FileInputStream( textFile );
        BufferedInputStream bufferIn = new BufferedInputStream( fileIn );
        try
        {
            int prev = -1;
            int ch;
            while ( ( ch = bufferIn.read() ) != -1 )
            {
                if ( ch == '\n' )
                {
                    if ( prev == '\r' )
                    {
                        return Mode.CRLF;
                    }
                    else
                    {
                        return Mode.LF;
                    }
                }
                else if ( prev == '\r' )
                {
                    return Mode.CR;
                }
                prev = ch;
            }
            // A CR as the very last byte of the file is also an "Old Mac" line ending
            if ( prev == '\r' )
            {
                return Mode.CR;
            }
            throw new Exception( "Could not determine end-of-line marker mode" );
        }
        catch ( IOException ioe )
        {
            throw new Exception( "Could not determine end-of-line marker mode", ioe );
        }
        finally
        {
            // Clean up:
            IOUtils.closeQuietly( bufferIn );
        }
    }

    /**
     * Checks whether the given text file has Windows-style (CRLF) line endings.
     * 
     * @param textFile the file to investigate
     * @return whether the file has Windows-style line endings
     * @throws Exception
     */
    public static boolean hasWindowsEOL( File textFile )
        throws Exception
    {
        return Mode.CRLF.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has Unix-style (LF) line endings.
     * 
     * @param textFile the file to investigate
     * @return whether the file has Unix-style line endings
     * @throws Exception
     */
    public static boolean hasUnixEOL( File textFile )
        throws Exception
    {
        return Mode.LF.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has "Old Mac"-style (CR) line endings.
     * 
     * @param textFile the file to investigate
     * @return whether the file has "Old Mac"-style line endings
     * @throws Exception
     */
    public static boolean hasOldMacEOL( File textFile )
        throws Exception
    {
        return Mode.CR.equals( determineEOL( textFile ) );
    }

    /**
     * Checks whether the given text file has line endings that conform to the system default mode (e.g. LF on Unix).
     * 
     * @param textFile the file to investigate
     * @return whether the file uses the system default line endings
     * @throws Exception
     */
    public static boolean hasSystemDefaultEOL( File textFile )
        throws Exception
    {
        return SYSTEM_DEFAULT.equals( determineEOL( textFile ) );
    }

    /**
     * Convert the line endings in the given file to Unix-style (LF).
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToUnixEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_UNIX );
    }

    /**
     * Convert the line endings in the given file to Windows-style (CRLF).
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToWindowsEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_WINDOWS );
    }

    /**
     * Convert the line endings in the given file to "Old Mac"-style (CR).
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToOldMacEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_OLD_MAC );
    }

    /**
     * Convert the line endings in the given file to the system default mode.
     * 
     * @param textFile the file to process
     * @throws IOException
     */
    public static void convertToSystemEOL( File textFile )
        throws IOException
    {
        convertLineEndings( textFile, EOL_SYSTEM_DEFAULT );
    }

    /**
     * Line endings conversion method.
     * 
     * @param textFile the file to process
     * @param eol the end-of-line marker to use (as a {@link String})
     * @throws IOException 
     */
    private static void convertLineEndings( File textFile, String eol )
        throws IOException
    {
        File temp = null;
        BufferedReader bufferIn = null;
        BufferedWriter bufferOut = null;

        try
        {
            if ( textFile.exists() )
            {
                // Create a new temp file to write to
                temp = new File( textFile.getAbsolutePath() + ".normalized" );
                temp.createNewFile();

                // Get a stream to read from the un-normalized file
                FileInputStream fileIn = new FileInputStream( textFile );
                DataInputStream dataIn = new DataInputStream( fileIn );
                bufferIn = new BufferedReader( new InputStreamReader( dataIn ) );

                // Get a stream to write to the normalized file
                FileOutputStream fileOut = new FileOutputStream( temp );
                DataOutputStream dataOut = new DataOutputStream( fileOut );
                bufferOut = new BufferedWriter( new OutputStreamWriter( dataOut ) );

                // For each line in the un-normalized file
                String line;
                while ( ( line = bufferIn.readLine() ) != null )
                {
                    // Write the original line plus the requested end-of-line marker
                    bufferOut.write( line );
                    bufferOut.write( eol ); // write EOL marker
                }

                // Close buffered reader & writer:
                bufferIn.close();
                bufferOut.close();

                // Remove the original file
                textFile.delete();

                // And rename the new file to the original name
                temp.renameTo( textFile );
            }
            else
            {
                // If the file doesn't exist...
                throw new IOException( "Could not find file to open: " + textFile.getAbsolutePath() );
            }
        }
        finally
        {
            // Clean up, temp should never exist
            FileUtils.deleteQuietly( temp );
            IOUtils.closeQuietly( bufferIn );
            IOUtils.closeQuietly( bufferOut );
        }
    }

}

【Comments】:

【Answer 5】:

Use

System.getProperty("line.separator")

This will give you the (local) EOL character(s). You can then analyze the incoming file to determine which "flavour" it uses and convert accordingly.
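
One way to do that analysis is to count each kind of line ending. The sketch below is only an illustration: `EolSniffer`, `Flavor` and `detect` are hypothetical names, and it works on text that has already been read into a string. The local `line.separator` value only describes the machine the JVM runs on, so it is the incoming text that has to be inspected.

```java
final class EolSniffer {

    enum Flavor { LF, CRLF, CR, MIXED, NONE }

    // Classifies text by the line endings it contains.
    static Flavor detect(String text) {
        int lf = 0, crlf = 0, cr = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r') {
                if (i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF
                } else {
                    cr++;
                }
            } else if (c == '\n') {
                lf++;
            }
        }
        int kinds = (lf > 0 ? 1 : 0) + (crlf > 0 ? 1 : 0) + (cr > 0 ? 1 : 0);
        if (kinds == 0) return Flavor.NONE;
        if (kinds > 1) return Flavor.MIXED;
        return lf > 0 ? Flavor.LF : (crlf > 0 ? Flavor.CRLF : Flavor.CR);
    }

    public static void main(String[] args) {
        // The local separator depends on the platform running this code
        String local = System.getProperty("line.separator");
        System.out.println("local flavor:    " + detect(local));
        System.out.println("incoming flavor: " + detect("first\r\nsecond\r\n"));
    }
}
```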

Alternatively, get your clients to standardise!

【Comments】:

How would I use it? I don't need the local setting; I always need the Linux one, meaning LF line endings, even if the file is generated on Windows.

By the way, I want to do this on the client side, not the server side.

【Answer 6】:

public static String normalize(String val) {
    return val.replace("\r\n", "\n")
            .replace("\r", "\n");
}

For HTML:

public static String normalize(String val) {
    return val.replace("\r\n", "<br/>")
            .replace("\n", "<br/>")
            .replace("\r", "<br/>");
}
【Comments】:

【Answer 7】:

Although String.replaceAll() is simpler to code, the method below should perform better, since it doesn't go through the regex infrastructure.

/**
 * Accepts a non-null string and returns the string with all end-of-lines
 * normalized to a \n.  This means \r\n and \r will both be normalized to \n.
 * <p>
 *     Impl Notes:  Although regex would have been easier to code, this approach
 *     will be more efficient since it's purpose built for this use case.  Note we only
 *     construct a new StringBuilder and start appending to it if there are new end-of-lines
 *     to be normalized found in the string.  If there are no end-of-lines to be replaced
 *     found in the string, this will simply return the input value.
 * </p>
 *
 * @param inputValue !null, input value that may or may not contain new lines
 * @return the input value that has new lines normalized
 */
static String normalizeNewLines(String inputValue) {
    StringBuilder stringBuilder = null;
    int index = 0;
    int len = inputValue.length();
    while (index < len) {
        char c = inputValue.charAt(index);
        if (c == '\r') {
            if (stringBuilder == null) {
                stringBuilder = new StringBuilder();
                // build up the string builder so it contains all the prior characters
                stringBuilder.append(inputValue.substring(0, index));
            }
            if ((index + 1 < len) &&
                inputValue.charAt(index + 1) == '\n') {
                // this means we encountered a \r\n  ... move index forward one more character
                index++;
            }
            stringBuilder.append('\n');
        } else {
            if (stringBuilder != null) {
                stringBuilder.append(c);
            }
        }
        index++;
    }
    return stringBuilder == null ? inputValue : stringBuilder.toString();
}

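【Comments】:

【Answer 8】:

Since Java 12 you can use String.indent(0), which leaves the indentation alone but normalizes every line terminator to \n (and appends a \n after the last line if it lacks one):

str = str.indent(0);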
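
A short sketch of what that call does, assuming a Java 12 or later runtime. The names `IndentDemo` and `normalize` are made up for illustration; `String.indent` is the only library call involved. The extra trailing line feed is worth keeping in mind if the text must round-trip exactly.

```java
final class IndentDemo {

    // Requires Java 12 or later (String.indent).
    // indent(0) keeps each line's content and terminates every line with "\n".
    static String normalize(String s) {
        return s.indent(0);
    }

    public static void main(String[] args) {
        String result = normalize("a\r\nb\rc");
        // Print with visible escapes; note the "\n" added after the last line
        System.out.println(result.replace("\n", "\\n")); // a\nb\nc\n
    }
}
```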

【Comments】:

【Answer 9】:

A solution that searches a path recursively and changes the line endings of the files it finds:

package handleFileLineEnd;

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class handleFileEndingMain {

    static int carriageReturnTotal;
    static int newLineTotal;

    public static void main(String[] args) throws IOException
    {
        processPath("c:/temp/directories");

        System.out.println("carriageReturnTotal  (files have issue): " + carriageReturnTotal);

        System.out.println("newLineTotal: " + newLineTotal);
    }

    // Walks the directory tree and checks every regular file in it
    private static void processPath(String path) throws IOException
    {
        File dir = new File(path);
        File[] directoryListing = dir.listFiles();

        if (directoryListing != null) {
            for (File child : directoryListing) {
                if (child.isDirectory())
                    processPath(child.toString());
                else
                    checkFile(child.toString());
            }
        }
    }

    // Classifies a file by the first CR or LF byte it contains;
    // files whose first line break starts with CR are rewritten
    private static void checkFile(String fileName) throws IOException
    {
        Path path = FileSystems.getDefault().getPath(fileName);

        byte[] bytes = Files.readAllBytes(path);

        for (int counter = 0; counter < bytes.length; counter++)
        {
            if (bytes[counter] == 13) // CR
            {
                carriageReturnTotal = carriageReturnTotal + 1;

                System.out.println(fileName);
                modifyFile(fileName);
                break;
            }
            if (bytes[counter] == 10) // LF
            {
                newLineTotal = newLineTotal + 1;
                //System.out.println(fileName);
                break;
            }
        }
    }

    // Rewrites the file with LF line endings; treats the content as UTF-8 text
    private static void modifyFile(String fileName) throws IOException
    {
        Path path = Paths.get(fileName);
        Charset charset = StandardCharsets.UTF_8;

        String content = new String(Files.readAllBytes(path), charset);
        content = content.replaceAll("\r\n", "\n");
        content = content.replaceAll("\r", "\n");
        Files.write(path, content.getBytes(charset));
    }
}
【Comments】:
