Handling binary files in C
Handling binary files isn’t as easy as text files. You have to know exact format prior to attempting handling them, but they come with one big advantage: they are much faster to read.
When and why they are used
They are often used when format won’t change often or it won’t change at all. Some of game developers while packaging their games intentionally bundle their resources and ship them in big boxes. Actually performance gain here is very big. Instead of reading hundreds of files you read one or two of them. Also keep in mind that if you are storing data in text format (such as JSON or XML) you will have to convert numbers to according formats which takes precious time and computing power.
Binary files are not pure gold though. They come with few drawbacks. If you are not careful while writing you might not always be able to read the file. For example files in 64 bit system you might not be able to read it on 32 bit due to variable types sizes differences.
How are we going to tackle it?
Handling binary files isn’t much different than it’s text brothers. As usual, first thing you have to do is open the file.
M stands here for mode. If you want to write you’ll have to put there w
and when reading r
. Please read through documentation about it.
After this first very easy step you can perform fread
and fwrite
operations. But just writing random stuff into the file might not always be good enough. If you have complex structures it might be troublesome. Data written sequentially you will have to read sequentially.
Let’s add a little bit of architecture there!
With some design in mind we can eliminate that sequential read. What if we wrote somehow offsets to each of the elements we want to store? We technically could do that! All we would have to do is to store number of elements first.
For that to work you need to learn about another few helper functions. Mainly about ftell/1
and fseek/3
.
First one is pretty straight forward. ftell(File*)
will return offset from the begging of the file till current point you are at.
fseek(File*,long int, int)
is general opposite of said function. It will take a file, offset and modifier and it will move pointer to position designated by modifier and offset. There are three main modifiers you can apply.
Modifier | Action |
---|---|
SEEK_SET | From the beginning of the file |
SEEK_CUR | From the current position of file’s pointer |
SEEK_END | From the end |
What have may guessed we are going to ftell
our position before writing, save it to some collection and after we write to file we are going to write down that collection to file.
Become TAXI MASTA
Let’s assume we want to make a Cab manager. Not fancy one with cute and clear web interface and highly scalable backend. Let’s assume it’s more like prehistoric era and maybe Drivers have devices that can record each ride and write them at the end of the day on some kind of hard drive or something. First of all we are gonna do some modeling. Each ride should have an address of start and the end location, time spent on serving customer and maybe distance travelled.
Now that we have our data model we can finally start writing it! Let’s start by preparing some variables and opening the file.
The most interesting thing in this code is type choice. I chose __int64
mainly because it’s big and it has constant size over both 64 and 32 bit platforms. For same reason I didn’t use size_t
. size_t
is a macro that returns either unsigned int
or unsigned __int64
. (Check for yourself, right click the type and hit Go to declaration
) As you can see I used those types either way, but I’ve done it explicitly. If you would use that macro here files you write wouldn’t be cross platform. Written on 64 bit couldn’t be read on 32 bit due too variable type size differences. (Program would want to read __int64 where int is written etc.)
Writing is actual data is much easier. I like to start by creating support structure that will hold lengths of strings to store. We need to do that to ease reading later.
Once we’ve done that we can proceed with writing all the structures in our collection.
Now you need to just close the file and voila! You’ve done it!
Now the other way…
Writing things to files isn’t useful at all when you can’t read what you’ve written. But really, it is really straight forward. You have to do everything you have already done, but in reverse! Instead of using fwrite
you have to use fread
. It’s as simple as that! Mostly…
Conclusion
Here you go! It is a little brain stretch with all those things you have to keep in mind. I intentionally left some code for you to fill. Feel free to implement this exercise and ask questions if you have any.