Chapter 11 File-System Implementation

Disks provide most storage because
Disks can be rewritten in place
Disks can be accessed directly

File system design problems
How the file system looks to user
Creating algorithms and data structures to map logical file systems to physical file systems

Layered file system, p412
application programs
logical file system -- meta data information, file-control block (FCB)
file-organization module -- translates logical block addresses to physical block addresses
basic file system -- support generic commands to device driver
I/O control -- device drivers, interrupt handlers
devices -- disks

Data structures used to implement
On Disk -- boot control block, boot block, partition boot sector (info to boot from volume), volume control block (volume details), directory structure, file-control block
In memory -- mount table, directory cache, system and process open-file tables
File descriptors, file handles

Partitions and mounting
Raw vs. cooked disks
Dual-boot, root partition, mount table
Virtual File Systems -- separates file-system-generic operations from implementation, provides a mechanism for uniquely representing files
E.g., Linux -- inode object, file object, superblock, dentry

Directory Implementations
Linear list -- simple, requires linear search
Hash table -- problem with collisions

Allocation of space -- efficiency vs. speed
Contiguous allocation -- accessing easy
problems finding space, allocating space, external fragmentation (compactions), determining how much memory is needed (& dealing with mistakes in requests), preallocation of space not yet needed (extents)
Linked allocation -- solves all contiguous problems, easy to create
effective only for sequential-access files, space required for pointers (clusters/internal fragmentation), reliability & lost files
variation -- file-allocation table (FAT), a section at begriming of disk contains table
Indexed allocation -- index block for each file, supports direct access, no external fragmentation
wasted space due to pointer overhead, problem deciding how large index block should be
three schemes for larger files -- linked scheme (index block links to next index block), multilevel scheme (second level of index block), combined scheme (mixed linked & multilevel, e.g., Unix has direct blocks & single indirect, double indirect, and triple indirect blocks)

Performance -- determine how the system will be used

Free-space management -- free-space list
bit vector -- simple but inefficient if can't be kept in memory
linked list
grouping -- store address of n free blocks in first free block
counting -- address of 1-st free block & count of contiguous free blocks that follow

Efficiency & Performance -- disks are major bottleneck, slowest component
efficiency -- efficient use of disk space (e.g., UNIX preallocates inodes, e.g., problem selecting pointer size)
performance -- buffer cash, page cache (double caching)

Reliability -- loss, inconsistency
Consistency checker (fsck, chkdsk)
Backup & restore -- backup schedule, testing
Log-structured file systems & journaling -- transaction logging

Examples
NFS
WAFL