Friday, February 19, 2016

Implicit Encoding with "Add-Content" and "Out-File" commands in PowerShell

Many PowerShell tutorials I have read claim that the following two commands are equivalent:

Add-Content -path $myFile -value $line
echo $line | Out-File -append $myFile

But this is not true. Those authors are probably still stuck in the computing of the 1980s! We are now in the 21st century, and Unicode is everywhere!

Try the following code and you'll see that Add-Content and Out-File each apply a different implicit encoding, which is very disturbing!

Be careful! Do not mix these two commands on the same file if your strings contain anything other than pure 7-bit ASCII characters, or the resulting text file will be unusable!

$myFile_ac = "c:\temp\f-add-content.txt"
$myFile_echo = "c:\temp\f-echo.txt"

$line = "12345 ¾ 67890"

# Implicit encoding is ANSI: 13 characters + CR LF = 15 bytes
Add-Content -path $myFile_ac -value $line

# Implicit encoding is UTF-16 LE: 2-byte BOM + 13 characters x 2 bytes + CR LF x 2 bytes = 32 bytes
echo $line | Out-File -append $myFile_echo
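If you want to see the difference for yourself, one way is to dump the raw bytes of each file. The following is only a rough sketch: the scratch file names under the temp directory are made up for this example, and the character ¾ is built from its code point (U+00BE) so that the encoding of the script file itself cannot interfere. What you see depends on your PowerShell version. In Windows PowerShell the second file should start with the UTF-16 LE byte order mark FF FE, while newer PowerShell releases (6 and later) default to UTF-8 without a BOM for both commands.

```powershell
# Hypothetical scratch files (not the paths used in the post)
$dir    = [System.IO.Path]::GetTempPath()
$acFile = Join-Path $dir 'enc-demo-add-content.txt'
$ofFile = Join-Path $dir 'enc-demo-out-file.txt'
Remove-Item $acFile, $ofFile -ErrorAction SilentlyContinue

# Same test string as above; U+00BE is the "three quarters" sign
$line = "12345 $([char]0xBE) 67890"

Add-Content -Path $acFile -Value $line
$line | Out-File -Append $ofFile

# Print the size and a hex dump of each file
foreach ($f in $acFile, $ofFile) {
    $bytes = [System.IO.File]::ReadAllBytes($f)
    $hex   = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
    "{0}: {1} bytes" -f (Split-Path $f -Leaf), $bytes.Length
    "  $hex"
}
```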

With Out-File, you can specify the encoding explicitly, e.g.:

echo $line | Out-File -append -Encoding UTF8 $myFile
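Add-Content also accepts an -Encoding parameter, so if you really must use both commands on one file, a possible approach is to pass the same encoding to each of them. This is only a minimal sketch: the scratch file name is made up, and UTF8 is just one reasonable choice of encoding.

```powershell
# Hypothetical scratch file (not a path used in the post)
$file = Join-Path ([System.IO.Path]::GetTempPath()) 'enc-demo-mixed.txt'
Remove-Item $file -ErrorAction SilentlyContinue

# U+00BE is the "three quarters" sign used in the test string above
$line = "12345 $([char]0xBE) 67890"

# Both commands are told to use the same encoding explicitly
Add-Content -Path $file -Value $line -Encoding UTF8
$line | Out-File -Append -Encoding UTF8 $file

# Read the file back with the same encoding; both lines should match $line
Get-Content -Path $file -Encoding UTF8
```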
