Friday 9 March 2018

Converting line endings

eolconv

An up-to-date version of this and other files can be found in my CommandLineTools project on GitHub.

On Stack Overflow and in the Embarcadero forums I regularly see people ask how to convert a file with Unix line endings to Windows line endings, or vice versa. The usual advice is to load the file into Notepad++ and to save it with the desired line ending format.

But Delphi can do this too, either using AdjustLineBreaks or using a TStringList. That is why I wrote this simple command line tool. The actual conversion is extremely simple:

  • Load the file into a TStringList.
  • Set the LineBreak property of the string list, depending on the desired output format (Windows or Unix).
  • Convert the file depending on the encoding: 8 bit, 16 bit little endian or 16 bit big endian.
  • Rename the old file to a name with a unique extension.
  • Rename the new file to the name of the old file.
  • Save the contents of the TStringList back to the file.

Most of the code is there to handle command line parameters.

Update

The TStringList way is simple to write, but is not only pretty slow and — if the file is large — can also require a lot of memory. The new way uses a buffer of up to 2MB to load blocks of the file and convert on the fly. This turned out to be a lot faster for large files.

Synopsis

EOLCONV [options ...] Name[s]

Multiple file names can be specified. EOLConv does not understand wildcards (yet), but for instance on the Windows command line, you can do something like:

for %f in (*.txt) do eolconv %f -U

That will convert all files with the extension .txt in the current directory to the Unix format.

Each original file will be preserved by renaming it, adding a unique .~1, .~2, etc. extension to the original file name. EOLConv will try to add extensions up to .~200, and if the file name with that extension exists too, EOLConv will give up and the original file will remain as it was.

Options

On Windows, all options start with either a / or a . On other platforms, they start with a .

Options are case insensitive.

If no options are specified, the files will be converted to the default line ending format for the platform. To set the line ending format to a specific style, use:

  • -W     to set the line endings to Windows format (CR+LF)*
  • -U     to set the line endings to Unix format (LF)*

* CR stands for Carriage Return (#13 in Delphi), LF stands for Line Feed (#10 in Delphi).

To get help, either start eolconv without any parameters, or specify -H or -?.

Finally

EOLConv is freeware. All rights are reserved. Its code is provided as is, expressly without a warranty of any kind. You use it at your own risk.

I hope this code is useful to you. If you use some of it, please credit me. If you modify or improve EOLConv, please send me the modifications.

I may improve or enhance EOLConv myself, and I will try to post changes here. But this is not a promise. Please don’t request features.

Rudy Velthuis

No comments:

Post a Comment