Read binary file into a struct

Tags: C#, Struct, IO, Binary files

C# Problem Overview


I'm trying to read binary data using C#. I have all the information about the layout of the data in the files I want to read. I'm able to read the data "chunk by chunk", i.e. getting the first 40 bytes of data and converting them to a string, then getting the next 40 bytes, and so on.

Since there are at least three slightly different versions of the data, I would like to read the data directly into a struct. It just feels so much more right than reading it "line by line".

I have tried the following approach but to no avail:

StructType aStruct;
int count = Marshal.SizeOf(typeof(StructType));
byte[] readBuffer = new byte[count];
BinaryReader reader = new BinaryReader(stream);
readBuffer = reader.ReadBytes(count);
GCHandle handle = GCHandle.Alloc(readBuffer, GCHandleType.Pinned);
aStruct = (StructType) Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(StructType));
handle.Free();

The stream is an open FileStream from which I have already begun to read. I get an AccessViolationException when calling Marshal.PtrToStructure.

The stream contains more information than I'm trying to read, since I'm not interested in the data at the end of the file.

The struct is defined like:

[StructLayout(LayoutKind.Explicit)]
struct StructType
{
    [FieldOffset(0)]
    public string FileDate;
    [FieldOffset(8)]
    public string FileTime;
    [FieldOffset(16)]
    public int Id1;
    [FieldOffset(20)]
    public string Id2;
}

The example code has been changed from the original to make this question shorter.

How would I read binary data from a file into a struct?

C# Solutions


Solution 1 - C#

The problem is the strings in your struct. I found that marshaling types like byte/short/int is not a problem, but when you need to marshal into a complex type such as a string, your struct has to explicitly mimic an unmanaged type. You can do this with the MarshalAs attribute.

For your example, the following should work:

[StructLayout(LayoutKind.Explicit)]
struct StructType
{
    [FieldOffset(0)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileDate;
    
    [FieldOffset(8)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileTime;
    
    [FieldOffset(16)]
    public int Id1;

    [FieldOffset(20)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 66)] //Or however long Id2 is.
    public string Id2;
}
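A minimal sketch of how a struct with ByValTStr fields might be filled from a byte buffer. The type SampleHeader, the helper names and the sample values are hypothetical; it assumes ASCII text fields and uses LayoutKind.Sequential with Pack = 1 rather than explicit offsets, which is a substitution for illustration and not the layout from the answer above. Note that ByValTStr treats the field as null-terminated, so a field with SizeConst = 8 yields at most 7 characters; the sample data is padded with zero bytes for that reason.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical header type; field names mirror the question, the layout is an assumption.
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
public struct SampleHeader
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileDate;   // occupies 8 bytes, holds at most 7 characters

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public string FileTime;   // occupies 8 bytes, holds at most 7 characters

    public int Id1;           // interpreted in the machine's native byte order
}

public static class SampleHeaderReader
{
    // Copies the first Marshal.SizeOf(SampleHeader) bytes of data into a struct.
    public static SampleHeader FromBytes(byte[] data)
    {
        int size = Marshal.SizeOf(typeof(SampleHeader));
        if (data == null || data.Length < size)
            throw new ArgumentException("Buffer is smaller than the struct.", nameof(data));

        // Pin the array so its address stays valid while the marshaler reads it.
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            return (SampleHeader)Marshal.PtrToStructure(
                handle.AddrOfPinnedObject(), typeof(SampleHeader));
        }
        finally
        {
            handle.Free();
        }
    }

    // Builds a made-up 20-byte record: two zero-padded text fields and one int.
    public static byte[] BuildSample()
    {
        byte[] data = new byte[20];
        Encoding.ASCII.GetBytes("240131").CopyTo(data, 0);
        Encoding.ASCII.GetBytes("123059").CopyTo(data, 8);
        BitConverter.GetBytes(42).CopyTo(data, 16);
        return data;
    }
}
```

Calling SampleHeaderReader.FromBytes(SampleHeaderReader.BuildSample()) should give back the two text fields and the integer written by BuildSample.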

Solution 2 - C#

Here is what I am using.
This worked successfully for me for reading the Portable Executable format.
It's a generic function, so T is your struct type.

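public static T ByteToType<T>(BinaryReader reader)
{
    // Read as many bytes as the marshaled size of T (fewer are returned at end of stream).
    byte[] bytes = reader.ReadBytes(Marshal.SizeOf(typeof(T)));

    // Pin the buffer so the GC cannot move it while its address is in use.
    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    try
    {
        return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
    }
    finally
    {
        // Always release the pin, even if marshaling throws.
        handle.Free();
    }
}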
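A possible usage sketch for a generic helper of this kind, reading two consecutive records from a stream. DemoRecord, StructReader and RoundTrip are hypothetical names, and the length check is an addition not present in the answer. The sample assumes a little-endian machine, because BinaryWriter writes little-endian while PtrToStructure interprets the bytes in native order.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical 6-byte record made only of blittable fields.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct DemoRecord
{
    public int Id;
    public short Flags;
}

public static class StructReader
{
    public static T ByteToType<T>(BinaryReader reader)
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] bytes = reader.ReadBytes(size);
        // ReadBytes returns a shorter array at end of stream; refuse to marshal it.
        if (bytes.Length != size)
            throw new EndOfStreamException("Not enough data for " + typeof(T).Name);

        GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
        try
        {
            return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
        }
        finally
        {
            handle.Free();
        }
    }

    // Writes two made-up records to memory and reads them back as structs.
    public static DemoRecord[] RoundTrip()
    {
        using (var stream = new MemoryStream())
        {
            var writer = new BinaryWriter(stream);
            writer.Write(7); writer.Write((short)1);
            writer.Write(8); writer.Write((short)2);
            writer.Flush();

            stream.Position = 0;
            var reader = new BinaryReader(stream);
            return new[] { ByteToType<DemoRecord>(reader), ByteToType<DemoRecord>(reader) };
        }
    }
}
```

Each call advances the reader by the marshaled size of T, so repeated calls walk through consecutive records.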

Solution 3 - C#

As Ronnie said, I'd use BinaryReader and read each field individually. I can't find the link to the article with this info, but it has been observed that using BinaryReader to read each individual field can be faster than Marshal.PtrToStructure if the struct contains fewer than 30-40 or so fields. I'll post the link to the article when I find it.

The article's link is at: http://www.codeproject.com/Articles/10750/Fast-Binary-File-Reading-with-C

When marshaling an array of structs, Marshal.PtrToStructure gains the upper hand more quickly, because you can think of the field count as fields * array length.
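Reading field by field with BinaryReader could look roughly like the sketch below. ManualHeader, ManualHeaderReader and the field widths (8, 8 and 4 bytes) are assumptions taken from the offsets in the question, and the trimming of padding characters is a guess about the file format.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical plain struct; no marshaling attributes are needed for this approach.
public struct ManualHeader
{
    public string FileDate;
    public string FileTime;
    public int Id1;
}

public static class ManualHeaderReader
{
    // Reads the fields one by one, in file order.
    public static ManualHeader Read(BinaryReader reader)
    {
        var header = new ManualHeader();
        header.FileDate = ReadFixedAscii(reader, 8);
        header.FileTime = ReadFixedAscii(reader, 8);
        header.Id1 = reader.ReadInt32(); // BinaryReader always reads little-endian
        return header;
    }

    // Reads a fixed-width ASCII field and strips trailing NUL or space padding.
    private static string ReadFixedAscii(BinaryReader reader, int length)
    {
        byte[] raw = reader.ReadBytes(length);
        if (raw.Length != length)
            throw new EndOfStreamException("Field is truncated.");
        return Encoding.ASCII.GetString(raw).TrimEnd('\0', ' ');
    }
}
```

Because each field is read explicitly, supporting a slightly different file version only means adding a second Read method with a different field sequence.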

Solution 4 - C#

I don't see any problem with your code.

Off the top of my head: what if you try to do it manually? Does it work?

BinaryReader reader = new BinaryReader(stream);
StructType o = new StructType();
o.FileDate = Encoding.ASCII.GetString(reader.ReadBytes(8));
o.FileTime = Encoding.ASCII.GetString(reader.ReadBytes(8));
...
...
...

Also try:

StructType o = new StructType();
byte[] buffer = new byte[Marshal.SizeOf(typeof(StructType))];
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
Marshal.StructureToPtr(o, handle.AddrOfPinnedObject(), false);
handle.Free();

Then use buffer in your BinaryReader instead of reading data from the FileStream, to see whether you still get the AccessViolationException.

> I had no luck using the BinaryFormatter, I guess I have to have a complete struct that matches the content of the file exactly.

That makes sense: BinaryFormatter has its own data format, completely incompatible with yours.

Solution 5 - C#

I had no luck using the BinaryFormatter; I guess I would need a complete struct that matches the content of the file exactly. I realised that in the end I wasn't interested in very much of the file content anyway, so I went with reading part of the stream into a byte buffer and then converting it using

Encoding.ASCII.GetString()

for strings and

BitConverter.ToInt32()

for the integers.

I will need to be able to parse more of the file later on but for this version I got away with just a couple of lines of code.
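A small sketch of that buffer-based approach, assuming fixed offsets inside the buffer. BufferParsing and its method names are hypothetical, and the offsets in the usage note are only examples.

```csharp
using System;
using System.IO;
using System.Text;

public static class BufferParsing
{
    // Fills a buffer of the requested size, looping because Stream.Read may return fewer bytes.
    public static byte[] ReadChunk(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the chunk was complete.");
            total += read;
        }
        return buffer;
    }

    // Decodes a fixed-width ASCII field at the given offset, dropping NUL or space padding.
    public static string AsciiAt(byte[] buffer, int offset, int length)
    {
        return Encoding.ASCII.GetString(buffer, offset, length).TrimEnd('\0', ' ');
    }

    // Decodes a 32-bit integer at the given offset in the machine's native byte order.
    public static int Int32At(byte[] buffer, int offset)
    {
        return BitConverter.ToInt32(buffer, offset);
    }
}
```

With the layout from the question, a 20-byte chunk could be decoded as AsciiAt(chunk, 0, 8), AsciiAt(chunk, 8, 8) and Int32At(chunk, 16).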

Solution 6 - C#

Try this:

using (FileStream stream = new FileStream(fileName, FileMode.Open))
{
    BinaryFormatter formatter = new BinaryFormatter();
    StructType aStruct = (StructType)formatter.Deserialize(stream);
}

Solution 7 - C#

Reading straight into structs is evil: many a C program has fallen over because of different byte orderings, different compiler implementations of fields, packing, word size...

You are best off serialising and deserialising byte by byte. Use the built-in stuff if you want, or just get used to BinaryReader.
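To illustrate the byte-ordering concern, here is a sketch that decodes a 32-bit integer byte by byte with an explicit byte order, so the result does not depend on the machine running the code. ExplicitByteOrder and its method names are hypothetical.

```csharp
using System;

public static class ExplicitByteOrder
{
    // Most significant byte first (often called network order).
    public static int ReadInt32BigEndian(byte[] data, int offset)
    {
        return (data[offset] << 24)
             | (data[offset + 1] << 16)
             | (data[offset + 2] << 8)
             | data[offset + 3];
    }

    // Least significant byte first.
    public static int ReadInt32LittleEndian(byte[] data, int offset)
    {
        return data[offset]
             | (data[offset + 1] << 8)
             | (data[offset + 2] << 16)
             | (data[offset + 3] << 24);
    }
}
```

The same four bytes { 0, 0, 1, 2 } decode to 258 as big-endian and to 33619968 as little-endian, which is exactly the kind of mismatch a raw struct copy would hide.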

Solution 8 - C#

I had this structure:

[StructLayout(LayoutKind.Explicit, Size = 21)]
public struct RecordStruct
{
    [FieldOffset(0)]
    public double Var1;

    [FieldOffset(8)]
    public byte var2;

    [FieldOffset(9)]
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1;
}

and I received "incorrectly aligned or overlapped by non-object". Based on that I found: https://social.msdn.microsoft.com/Forums/vstudio/en-US/2f9ffce5-4c64-4ea7-a994-06b372b28c39/strange-issue-with-layoutkindexplicit?forum=clr

> OK. I think I understand what's going on here. It seems like the problem is related to the fact that the array type (which is an object type) must be stored at a 4-byte boundary in memory. However, what you're really trying to do is serialize the 6 bytes separately.
>
> I think the problem is the mix between FieldOffset and serialization rules. I'm thinking that structlayout.sequential may work for you, since it doesn't actually modify the in-memory representation of the structure. I think FieldOffset is actually modifying the in-memory layout of the type. This causes problems because the .NET framework requires object references to be aligned on appropriate boundaries (it seems).

So my struct was defined as explicit with:

[StructLayout(LayoutKind.Explicit, Size = 21)]

and thus each of my fields specified

[FieldOffset(<offset_number>)]

but when you change your struct to Sequential, you can get rid of those offsets and the error will disappear. Something like:

[StructLayout(LayoutKind.Sequential, Size = 21)]
public struct RecordStruct
{
    public double Var1;

    public byte var2;

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1;
}
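When switching to a sequential layout, it may be worth checking that the marshaled size and field offsets still match the file format. The sketch below uses a hypothetical PackedRecord with Pack = 1, which is an assumption added here (it is not part of the answer) to rule out alignment padding between or after the fields.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical variant of the record with explicit 1-byte packing.
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
public struct PackedRecord
{
    public double Var1;   // 8 bytes

    public byte Var2;     // 1 byte

    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 12)]
    public string String1; // 12 bytes, at most 11 characters plus a terminator
}

public static class LayoutCheck
{
    // Size of the struct as seen by the marshaler, not the managed in-memory size.
    public static int MarshaledSize()
    {
        return Marshal.SizeOf(typeof(PackedRecord));
    }

    // Marshaled byte offset of a field, looked up by name.
    public static int MarshaledOffset(string fieldName)
    {
        return (int)Marshal.OffsetOf(typeof(PackedRecord), fieldName);
    }
}
```

With Pack = 1 the marshaled size is 8 + 1 + 12 = 21 bytes and String1 starts at offset 9; running the same checks on a struct without Pack shows whether the runtime adds padding in that case.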

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Robert Höglund | View Question on Stack Overflow
Solution 1 - C# | Ishmaeel | View Answer on Stack Overflow
Solution 2 - C# | Sergey | View Answer on Stack Overflow
Solution 3 - C# | nevelis | View Answer on Stack Overflow
Solution 4 - C# | lubos hasko | View Answer on Stack Overflow
Solution 5 - C# | Robert Höglund | View Answer on Stack Overflow
Solution 6 - C# | urini | View Answer on Stack Overflow
Solution 7 - C# | Ronnie | View Answer on Stack Overflow
Solution 8 - C# | ssamko | View Answer on Stack Overflow