File Types / Handling – AS Computing Revision (F452)

Serial Files

  • Files are useful not just for the information they hold, but also for the way they hold it
  • There are a few different structures files use to make their data accessible
  • Serial files don’t make any effort to make searching easier – they just store records in the order the records are added
  • New records are always just added on to the end (this is called appending)
  • This type of storage is usually used for small files where the order of the records doesn’t matter (or the only order they’re needed in is the order they were entered in)
  • Finding records again can be difficult if there are a lot
  • The computer has to search through every single record until it finds the one it’s looking for
  • Sometimes you don’t need to find single records individually – you either need all of them or none of them – in this case, serial files are completely fine
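As a rough sketch of the serial file idea above (not an official exam answer), here is how appending and searching might look in Python. The one-record-per-line layout, the comma-separated fields with the key first, and the names `append_record` and `find_record` are all assumptions made for the example.

```python
import os
import tempfile


def append_record(path, record):
    """Add one record to the end of a serial file (one record per line)."""
    with open(path, "a", encoding="utf-8") as f:  # "a" means append
        f.write(record + "\n")


def find_record(path, key):
    """Check every record in turn; return the first whose key field matches."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            record = line.rstrip("\n")
            # Assumption: the key is the first comma-separated field.
            if record.split(",")[0] == key:
                return record
    return None  # reached the end of the file without finding it


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "visitors.txt")
    # Records are stored in the order they arrive, not in key order.
    for rec in ["203,Sam", "117,Alex", "345,Jo"]:
        append_record(path, rec)
    print(find_record(path, "345"))  # 345,Jo
    print(find_record(path, "999"))  # None
```

Notice that a failed search has to read the whole file before it can return `None`.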

Sequential Files

  • In this type of structure, the records are stored in order
  • Each record has several fields, and one of them is the key field: the field the records are sorted by
  • This type of storage is great for data that needs to be processed in a specific order
  • Searching through sequential files is a bit faster than searching serial files, because the computer knows when to give up
  • If you’re searching for record 345 in a file organised in numerical order, you check each record from the beginning. If you reach 346 without having found 345, you know 345 isn’t there and can stop. In a serial file, you’d have to search right up to the end
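The "knows when to give up" idea can be sketched like this. It assumes a text file with one record per line, a numeric key as the first comma-separated field, and records already sorted by that key; `find_in_sequential` is a made-up name. It also returns how many records were read, so you can see the early stop.

```python
import os
import tempfile


def find_in_sequential(path, key):
    """Search a file sorted by numeric key, stopping once the key is passed.

    Returns (record or None, number of records read).
    """
    checked = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            checked += 1
            record = line.rstrip("\n")
            current = int(record.split(",")[0])  # assumed key: first field
            if current == key:
                return record, checked
            if current > key:
                break  # gone past where the key would be, so give up
    return None, checked


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "members.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("120,Ann\n344,Cy\n346,Dee\n500,Eve\n612,Flo\n")
    print(find_in_sequential(path, 345))  # (None, 3): stopped at 346
    print(find_in_sequential(path, 500))  # ('500,Eve', 4)
```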

Indexed Sequential Files

  • (At risk of sounding dramatic), this is the next step in the evolution of sequential files
  • Records are stored in order of a key field, just like in regular sequential files, but this time there’s an index, too
  • This makes it much faster for computers to search through (if there are a lot of records)
  • The index holds the physical disk locations of certain points in the file (for example, where each group of keys starts)
  • Continuing the example above, the index might tell you that numbers starting with 34_ begin at a certain disk location – you’d skip to that location, and start looking through the records from that point, until you eventually realised that there was no 345
  • If there are loads of records before the one you want, the index stops you having to look through most of them
  • (For instance, you don’t flick through every single page of the dictionary – you head straight to the section for words beginning with a certain letter, and flick through pages carefully from there)
  • The index has to be updated when a record is inserted or deleted
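Here is a minimal sketch of the index idea, under a few assumptions of my own: the file is sorted by a numeric key in the first field, the index groups keys by their leading digits (`key // 10`, so 341 to 349 share the entry 34), and the index is just a Python dictionary held in memory rather than a structure saved on disk. `build_index` and `find_with_index` are hypothetical names.

```python
import os
import tempfile


def build_index(path):
    """Map each key prefix (key // 10) to the byte offset of its first record."""
    index = {}
    with open(path, "rb") as f:  # binary mode so tell() gives exact byte offsets
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break  # end of file
            key = int(line.decode("utf-8").split(",")[0])
            index.setdefault(key // 10, offset)  # keep only the first offset
    return index


def find_with_index(path, index, key):
    """Jump to the block the index points at, then search sequentially."""
    offset = index.get(key // 10)
    if offset is None:
        return None  # no record shares this prefix
    with open(path, "rb") as f:
        f.seek(offset)  # skip every record before this block
        for line in f:
            text = line.decode("utf-8").rstrip("\n")
            current = int(text.split(",")[0])
            if current == key:
                return text
            if current > key:
                break  # passed it, so it is not in the file
    return None


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "members.txt")
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("120,Ann\n341,Bo\n344,Cy\n346,Dee\n500,Eve\n")
    index = build_index(path)
    print(index)                              # {12: 0, 34: 8, 50: 30}
    print(find_with_index(path, index, 344))  # 344,Cy
    print(find_with_index(path, index, 345))  # None
```

In this sketch the index is rebuilt from scratch by calling `build_index` again, which is one blunt way of keeping it up to date after a record is inserted or deleted.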

Random Files

  • This is a really weird way of storing files, but there is actually a reason for it
  • Most file types have their records all stored next to each other on disk (contiguously)
  • Random files spread their data all over the disk
  • This is done ‘randomly’ using a hashing algorithm, which takes the key field and distorts it in strange ways to produce a result that’s an address on the disk
  • The record with that key field is then stored at that address
  • To find records again, put the key field into the hashing algorithm and you’ll get the address
  • (So it’s not random in the sense that the hashing algorithm produces a different output each time – it actually produces the same output every time for the same input)
  • Hashing algorithms are complicated because they have to avoid ‘collisions’ – where multiple records are told to live in the same storage location, which just can’t happen
  • However complicated the hashing algorithms are, collisions can still happen, and dealing with them means setting aside extra storage space, such as an overflow area (this unused space is sometimes called redundancy)
  • The good thing about random files is that all you need to find a record is its key field – you don’t have to go searching through loads of them
  • This means random files are faster than indexed sequential files for looking up individual records in massive databases
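To make the hashing idea concrete, here is a toy sketch, not how any real database does it. Everything specific is an assumption for illustration: the file has 7 fixed-size slots of 16 bytes, the hashing algorithm is just `key % 7`, and a collision is handled by trying the next slot along (linear probing), which is only one of several possible strategies.

```python
import os
import tempfile

SLOTS = 7        # hypothetical: number of record slots in the file
SLOT_SIZE = 16   # hypothetical: bytes reserved for each record
EMPTY = b" " * SLOT_SIZE


def hash_key(key):
    """Toy hashing algorithm: the same key always gives the same slot."""
    return key % SLOTS


def create_file(path):
    """Reserve every slot up front, all empty."""
    with open(path, "wb") as f:
        f.write(EMPTY * SLOTS)


def store(path, key, name):
    """Put a record in the slot its key hashes to; return the slot used."""
    record = f"{key},{name}".encode("utf-8").ljust(SLOT_SIZE)
    if len(record) > SLOT_SIZE:
        raise ValueError("record too long for one slot")
    # "r+b" opens an existing file for both reading and writing in Python.
    with open(path, "r+b") as f:
        for step in range(SLOTS):
            slot = (hash_key(key) + step) % SLOTS  # step > 0 means a collision
            f.seek(slot * SLOT_SIZE)
            if f.read(SLOT_SIZE) == EMPTY:
                f.seek(slot * SLOT_SIZE)
                f.write(record)
                return slot
    raise RuntimeError("file is full")


def fetch(path, key):
    """Hash the key to get the address, then read the record from there."""
    with open(path, "rb") as f:
        for step in range(SLOTS):
            slot = (hash_key(key) + step) % SLOTS
            f.seek(slot * SLOT_SIZE)
            data = f.read(SLOT_SIZE)
            if data == EMPTY:
                return None  # an empty slot means the key was never stored
            text = data.decode("utf-8").rstrip()
            if text.split(",")[0] == str(key):
                return text
    return None


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "members.dat")
    create_file(path)
    print(store(path, 345, "Jo"))  # 2, because 345 % 7 == 2
    print(store(path, 352, "Bo"))  # 3, because slot 2 is taken (collision)
    print(fetch(path, 352))        # 352,Bo
    print(fetch(path, 999))        # None
```

The empty slots are the price of this approach: the file takes up the full 7 × 16 bytes even when it holds only two records.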

Opening and Closing

  • Before you can access a file, you have to open it
  • For this course, treat a file as being open either for reading or for writing – not both at once (some languages do offer a combined read/write mode)
  • When you’ve finished with a file, you have to close it (which is very easy to forget)
  • If you don’t close it, your changes might not be saved
  • Each record tends to be a separate line in the file (or records are separated by delimiter characters, such as semicolons or commas) to make them easier to retrieve
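In Python, the open, use, close routine above might look something like this. The file name and the pet records are invented for the example; the `with` form is a common way to make sure the file gets closed even if you forget.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pets.txt")

# Open for writing, write two records (one per line), then close.
f = open(path, "w", encoding="utf-8")
f.write("1,Rex,dog\n")
f.write("2,Tom,cat\n")
f.close()  # closing makes sure buffered data is actually written out

# Open for reading; "with" closes the file automatically at the end.
with open(path, "r", encoding="utf-8") as f:
    # Each line is one record; commas are the delimiters between fields.
    records = [line.rstrip("\n").split(",") for line in f]

print(records)  # [['1', 'Rex', 'dog'], ['2', 'Tom', 'cat']]
```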

Inserting, Updating and Deleting

  • To add something to a serial file, you just append it to the end – easy
  • Adding something to a sequential file is trickier, though, because it has to go in a certain place…
  • (Apparently, adding records to indexed sequential files or random files won’t be tested)
  • To insert a record:
  1. Open the file for reading
  2. Create a new file
  3. Open the new file for writing
  4. Copy all the records from the original file, before the point the new record needs to go, to the new file
  5. Copy the new record to the new file
  6. Copy everything else from the original file to the new file
  7. Close both files
  8. Delete the old file
  9. Rename the new file to what the old file used to be called
  • The same technique is used to update an existing record: copy the changed version to the new file in place of the original
  • Similarly, this is how to remove a record from a sequential file:
  1. Open the file for reading
  2. Create a new file
  3. Open the new file for writing
  4. Copy all the records from the original file, up to and including the one before the one you want to get rid of, to the new file
  5. Don’t copy the one you want to get rid of – just skip it
  6. Copy the rest of the records to the new file
  7. Close both files
  8. Delete the original file
  9. Rename the new file to what the old file was called
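
Both sets of steps above can be sketched in Python roughly as follows. This assumes one record per line (each ending in a newline), with a numeric key as the first comma-separated field; `key_of`, `insert_record`, `delete_record` and the `.new` suffix are names I made up. `os.replace` covers the last two steps (delete the old file, rename the new one) in a single call.

```python
import os
import tempfile


def key_of(record):
    return int(record.split(",")[0])  # assumed key: first field, numeric


def insert_record(path, new_record):
    """Copy the file to a new one, slotting new_record into key order."""
    temp_path = path + ".new"
    inserted = False
    with open(path, "r", encoding="utf-8") as old, \
         open(temp_path, "w", encoding="utf-8") as new:
        for line in old:
            # First record with a bigger key: the new one goes just before it.
            if not inserted and key_of(line) > key_of(new_record):
                new.write(new_record + "\n")
                inserted = True
            new.write(line)
        if not inserted:
            new.write(new_record + "\n")  # it belongs at the very end
    os.replace(temp_path, path)  # new file takes over the old file's name


def delete_record(path, key):
    """Copy every record except the one with this key to a new file."""
    temp_path = path + ".new"
    with open(path, "r", encoding="utf-8") as old, \
         open(temp_path, "w", encoding="utf-8") as new:
        for line in old:
            if key_of(line) != key:  # skip the record being removed
                new.write(line)
    os.replace(temp_path, path)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "members.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("120,Ann\n346,Dee\n")
    insert_record(path, "345,Jo")
    delete_record(path, 120)
    with open(path, "r", encoding="utf-8") as f:
        print(f.read())  # 345,Jo then 346,Dee, each on its own line
```

Both files are closed (by the `with` block) before the rename happens, which matches the order of the steps in the lists.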