Estimating File Sizes – AS Computing Revision (F452)

Record Format

  • One way of storing information is to use an ordered text file, with a record format
  • Multiple records are stored, and each is made up of multiple fields
  • In the text file, records could be on separate lines, for example, and fields could be separated by control characters that the computer can understand
  • Records and fields work a bit like they do in a database – one record per person, for example, made up of fields for name, address, age, whatever
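As a rough illustration of the record format described above, here is a minimal Python sketch. It assumes one record per line and a tab character as the field separator. The names `FIELD_SEP`, `RECORD_SEP`, `serialise` and `parse`, and the sample people, are all made up for this example and are not part of the F452 specification.

```python
# Hypothetical records: (name, address, age). The data is illustrative only.
records = [
    ("Alice Smith", "1 High Street", 17),
    ("Bob Jones", "22 Mill Lane", 18),
]

FIELD_SEP = "\t"   # assumed field separator (a control character)
RECORD_SEP = "\n"  # assumed record separator: one record per line


def serialise(rows):
    """Join fields with FIELD_SEP and records with RECORD_SEP."""
    return RECORD_SEP.join(
        FIELD_SEP.join(str(field) for field in row) for row in rows
    )


def parse(text):
    """Split the text back into records, each a list of field strings."""
    return [line.split(FIELD_SEP) for line in text.split(RECORD_SEP)]


text = serialise(records)
parsed = parse(text)
```

Note that everything comes back as a string after parsing (the age is `"17"`, not `17`), because a text file only stores characters.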

Estimating File Sizes

  • To estimate how much space a file will take up, work out how much space each record will take up, multiply it by the number of records, and add 10%
  • The ten percent is for the file overhead / metadata, etc.
  • You’ll probably be asked to give the answer in kilobytes, so divide your answer by 1000
  • (Yes, there are actually 1024 bytes in a kilobyte, but since you’re not allowed a calculator in the exam, the mark scheme apparently lets you use 1000 instead)
  • To work out the size of a single record, you need to work out the size of each field in the record, and add them together
  • Here is a table of data types and sizes:
| Data Type     | Bytes           | Notes |
|---------------|-----------------|-------|
| Character     | 1               | A single ASCII character is seven bits plus a parity bit, so one byte |
| String        | 1 per character | A string is just several characters – you need to choose the maximum string length carefully, though |
| Boolean       | 1               | Yes, a true or false value could be stored as a single bit, but since this is a text file using ASCII, it’s represented as a character, so it’ll take a whole byte |
| Integer       | 1, 2, 4 or 8    | There are several types of integer in Java – byte, short, int and long – but they’re all the same in pseudocode |
| Real          | 4 or 8          | There are two types of real number in Java – float and double |
| Date          | 4               | |
| Time          | 4               | |
| Date and Time | 8               | |
| Currency      | 8               | You may be able to get away with using Real instead |
  • Strings give you flexibility to choose the maximum length
  • You should make the string lengths sensible, but also use this leeway to make sure the fields add up to a total record size that’s easy to do calculations on later
  • There’s a large “BS” written next to “Date”, “Time”, “Date and Time” and “Currency” in my notes, because those aren’t data types! Not primitive data types in any language I know, anyway. They’re probably included here because computers have some way of storing date / time / currency information that’s better than using a string or a number or both
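To make the estimation method concrete, here is a small worked sketch in Python. The record layout is hypothetical: the field names and the string lengths are my own choices, picked so the record comes to a round 100 bytes, as suggested above. The 10% overhead and the divide-by-1000 shortcut come straight from the notes. The function name `estimate_file_size_kb` is made up for this example.

```python
# Hypothetical record layout: field name -> size in bytes.
# Sizes follow the table above; the string lengths are assumptions.
fields = {
    "name": 30,     # String, maximum 30 characters
    "address": 60,  # String, maximum 60 characters
    "age": 1,       # Integer, 1 byte is enough for an age
    "member": 1,    # Boolean, stored as a single character
    "joined": 4,    # Date
    "balance": 4,   # Real (4 bytes), standing in for Currency
}


def estimate_file_size_kb(field_sizes, num_records,
                          overhead_percent=10, bytes_per_kb=1000):
    """Estimate file size: record size x records, plus overhead, in KB."""
    record_size = sum(field_sizes.values())   # bytes per record
    data_size = record_size * num_records     # bytes for all records
    total_bytes = data_size * (100 + overhead_percent) / 100
    return total_bytes / bytes_per_kb


record_size = sum(fields.values())            # 100 bytes per record
estimate = estimate_file_size_kb(fields, 500)  # 500 records
```

With these assumed figures, each record is 100 bytes, 500 records come to 50,000 bytes, adding 10% gives 55,000 bytes, and dividing by 1000 gives an estimate of 55 KB.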

About Matt

I like writing, filmmaking, programming and gaming, and prefer creating media to consuming it. On the topic of consumption, I'm also a big fan of eating.