Reading the BOM/preamble

Sometimes we get a file with the BOM (or preamble) bytes at the start of the file, which denote a UNICODE encoded file. We don’t always care these and want to simple remove the BOM (if one exists).

Here’s some fairly simple code which shows the reading of a stream or file with code to “skip the BOM” at the bottom

using (var stream = 
   File.Open(currentLogFile, 
      FileMode.Open, 
      FileAccess.Read, 
      FileShare.ReadWrite))
{
   var length = stream.Length;
   var bytes = new byte[length];
   var numBytesToRead = (int)length;
   var numBytesRead = 0;
   do
   {
      // read the file in chunks of 1024
      var n = stream.Read(
         bytes, 
         numBytesRead, 
         Math.Min(1024, numBytesToRead));

      numBytesRead += n;
      numBytesToRead -= n;

   } while (numBytesToRead > 0);

   // skip the BOM
   var bom = new UTF8Encoding(true).GetPreamble();
                    
   return bom.Where((b, i) => b != bytes[i]).Any() ? 
      bytes : 
      bytes.Skip(bom.Length).ToArray();
}