See Also ((StubbornMacEOL))
I run into this problem periodically when friends and clients have difficulty opening a text file. In some cases, an application will mysteriously crash or yield inexplicable results.
Most recently I encountered this issue running the ((md5sum)) utility. Last year, a client was trying to read ASCII HPGL plot files on his Mac that had been generated by a Windows application.
The ((crux of the biscuit ZappaApostrophe)) is the way in which operating systems encode the "End of Line" or "New Line".
In an ASCII text file, there is a special character to mark the end of a line or the beginning of a new line. This is sometimes called a "hard return", as opposed to a "soft return", where the application decides where to "wrap" the text to a new line.
The EOL character (or characters) differs between the various operating systems. For our purposes, I will cover three: DOS/Windows, Unix/Mac OS X, and Apple/Mac OS (prior to X).
Examining an ASCII table will reveal the following two characters (among many other "control" characters):
Decimal | Hex | Octal | Meaning |
---|---|---|---|
10 | 0A | 012 | LF - Line Feed |
13 | 0D | 015 | CR - Carriage Return |
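If you'd rather not take the table on faith, you can print the two characters and dump them as numbers. This is a minimal sketch assuming the standard `printf` and `od` utilities; the function name `show_codes` is my own invention for illustration.

```shell
#!/bin/bash
# show_codes is a hypothetical helper name, not a standard command.
# It emits LF then CR and asks od to print each byte as an unsigned decimal.
show_codes() {
    printf '\n\r' | od -An -tu1
}
show_codes
```

The output should show 10 followed by 13. Swapping `-tu1` for `-tx1` or `-to1` gives the hex and octal columns instead.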
These were designed to communicate with early teletype machines and are analogous to the old typewriter controls. When you reached the end of a line, you had to return the carriage back to the beginning, and then roll the drum down (up) to a new line.
When ASCII was adopted as the standard for text files stored on computer disk, the various operating systems adopted different methods of storing the end-of-line code.
Under DOS/Windows, Microsoft uses a combination of the two characters CR+LF. Always interested in streamlining, Unix adopted a single LF character. Just to make matters interesting, earlier (Non-X) versions of Mac OS use a single CR character.
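To find out which convention a given file uses, one approach is simply to count the CR and LF bytes. The sketch below assumes the file sticks to one convention throughout; `eol_style` and the sample file names are hypothetical, made up for this example.

```shell
#!/bin/bash
# eol_style FILE: guess the end-of-line convention by counting CR and LF bytes.
# Hypothetical helper; assumes the file uses a single convention throughout.
eol_style() {
    local cr lf
    # tr -cd keeps only the named character; wc -c counts what is left.
    cr=$(( $(tr -cd '\r' <"$1" | wc -c) ))
    lf=$(( $(tr -cd '\n' <"$1" | wc -c) ))
    if [ "$cr" -gt 0 ] && [ "$lf" -gt 0 ]; then
        echo "dos"      # CR+LF pairs
    elif [ "$cr" -gt 0 ]; then
        echo "mac"      # CR only (Mac OS prior to X)
    else
        echo "unix"     # LF only, or no line endings at all
    fi
}

# Build one small sample file per convention and classify each.
tmp=$(mktemp -d)
printf 'one\r\ntwo\r\n' >"$tmp/dos.txt"
printf 'one\ntwo\n'     >"$tmp/unix.txt"
printf 'one\rtwo\r'     >"$tmp/mac.txt"
for f in dos unix mac; do
    echo "$f.txt: $(eol_style "$tmp/$f.txt")"
done
rm -r "$tmp"
```

A file with mixed line endings would be reported as "dos" here, so treat the answer as a hint rather than a verdict.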
In the early days before the internet was prevalent, few people cared. (I'd wager even today few people care, but if you're reading this you're one of 'em.) Today, you may care if you're trying to exchange a text file between systems. One simple solution (among many) is to use the "tr" (translate) Terminal utility. I built a quick shell script to convert DOS/Windows-style EOL to Unix:
#!/bin/bash
#
# script to convert DOS (CR+LF) to Unix (LF) text files
#
#
set -e    # stop at the first failure so a failed tr never reaches the mv
if [ $# -ne 1 ]; then
    echo "usage: $(basename "$0") <inputfile>" >&2
    exit 1
fi
ftmp="$(basename "$0")_$$_tmp"
tr -d '\015' <"$1" >"$ftmp"
mv "$ftmp" "$1"
echo Done
The meat of the script is the following line:
tr -d '\015' <"$1" >"$ftmp"
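The "-d" option says "delete". The '\015' is the octal value for CR. You could also use '\r'. Note that "tr" does not understand hex escapes, so '\x0d' would delete the literal characters x, 0 and d; in bash you can write $'\x0d' instead, which lets the shell expand the escape before "tr" sees it. The <"$1" says read from the first argument on the command line, and the >"$ftmp" says send output to the temporary file defined in the previous line.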
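The same utility also handles the old Mac format, since swapping one character for another is exactly what "tr" does. Here is a small sketch of the two directions; the function names are my own labels, not standard commands. Going from Unix to DOS is a different matter, because "tr" maps one character to one character and cannot turn a single LF into the CR+LF pair.

```shell
#!/bin/bash
# Each function is a filter: it reads stdin and writes stdout.
# The function names are hypothetical labels, not standard commands.

# Old Mac (CR) to Unix (LF): replace every CR with an LF.
mac_to_unix() { tr '\r' '\n'; }

# Unix (LF) to old Mac (CR): the same swap in reverse.
unix_to_mac() { tr '\n' '\r'; }

# Show the result byte by byte; od -c prints CR as \r and LF as \n.
printf 'one\rtwo\r' | mac_to_unix | od -c
```

Used on a real file, that would look like `mac_to_unix <old.txt >new.txt`, where both file names are placeholders.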
My plans for the future include adding the ability to handle multiple input files and formats. In other words, a generalized script that can convert to and from all three formats. I'll probably use command-line switches to indicate the source/destination formats. Maybe also allow the user to specify an output file rather than overwriting the original. It would also be a good idea to check for write permissions first, since the "mv" will fail without them. Strictly speaking, it is the directory containing the file that must be writable, because "mv" replaces the file rather than writing into it.
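As a rough idea of where those plans might lead, here is one possible shape for the generalized version. Everything about it is an assumption on my part: the name `eolconv`, the `-f`/`-t`/`-o` switches, and the choice to convert everything to LF first and then to the target format. It uses "awk" for the DOS output because "tr" cannot emit two characters for one.

```shell
#!/bin/bash
# Sketch only: eolconv and its -f/-t/-o switches are hypothetical choices.

# to_lf FORMAT: read stdin in FORMAT, write LF-terminated text.
to_lf() {
    case "$1" in
        dos)  tr -d '\r' ;;
        mac)  tr '\r' '\n' ;;
        unix) cat ;;
    esac
}

# from_lf FORMAT: read LF-terminated stdin, write it in FORMAT.
from_lf() {
    case "$1" in
        dos)  awk '{ printf "%s\r\n", $0 }' ;;
        mac)  tr '\n' '\r' ;;
        unix) cat ;;
    esac
}

eolconv_usage() {
    echo "usage: eolconv [-f dos|unix|mac] [-t dos|unix|mac] [-o outfile] file..." >&2
}

eolconv() {
    local from=dos to=unix out="" opt f x tmp OPTIND=1
    while getopts "f:t:o:" opt; do
        case "$opt" in
            f) from=$OPTARG ;;
            t) to=$OPTARG ;;
            o) out=$OPTARG ;;
            *) eolconv_usage; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))
    [ $# -ge 1 ] || { eolconv_usage; return 1; }
    # Reject unknown format names before touching any file.
    for x in "$from" "$to"; do
        case "$x" in dos|unix|mac) ;; *) eolconv_usage; return 1 ;; esac
    done
    # Every input must be readable.
    for f in "$@"; do
        [ -r "$f" ] || { echo "eolconv: cannot read $f" >&2; return 1; }
    done
    # With -o, convert a single input into a separate output file.
    if [ -n "$out" ]; then
        [ $# -eq 1 ] || { eolconv_usage; return 1; }
        to_lf "$from" <"$1" | from_lf "$to" >"$out"
        return
    fi
    # Otherwise convert each file in place via a temporary file beside it.
    for f in "$@"; do
        [ -w "$(dirname "$f")" ] || { echo "eolconv: cannot replace $f" >&2; return 1; }
        tmp=$(mktemp "$f.XXXXXX") || return 1
        if to_lf "$from" <"$f" | from_lf "$to" >"$tmp"; then
            mv "$tmp" "$f"    # note: the file takes on the temp file's permissions
        else
            rm -f "$tmp"
            return 1
        fi
    done
}
```

With that loaded, `eolconv -f mac -t unix notes.txt` would convert in place, and `eolconv -f unix -t dos -o out.txt in.txt` would leave the original alone. The file names are placeholders.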