File Management Systems: Objectives, File Structure, and Organization

Covers the objectives of a file management system, file structure, and file organization from a programmer's perspective, including the main file types, organization methods, and their performance implications. It also touches on file directories, shared files, and free disk space management.

What you will learn

  • What are the objectives of a file management system?
  • What are the different file structures?
  • What are the methods of file allocation and their trade-offs?
  • What are the advantages and disadvantages of sequential, indexed, and hashed file organization?
  • How does file directory implementation impact performance?

Typology: Study Guides, Projects, Research

2021/2022

Uploaded on 09/27/2022

andreasphd
File Management
COMP3231
Operating Systems

2

References

• Textbook

– Tanenbaum, Chapter 6

3

Files

• Named repository for data

– Potentially large amount of data

  • Beyond that available via virtual memory
    • (Except maybe 64-bit systems)

– File lifetime is independent of process lifetime

– Used to share data between processes

• Convenience

– Input to applications is by means of a file

– Output is saved in a file for long-term storage

4

File Management

• File management system is considered part of the operating system
  – Manages a trusted, shared resource
  – Bridges the gap between:
    • low-level disk organisation (an array of blocks),
    • and the user’s views (a stream or collection of records)
• Also includes tools outside the kernel
  – E.g. formatting, recovery, defrag, consistency, and backup utilities.

5

Objectives for a File Management System

  • Provide a convenient naming system for files
  • Provide uniform I/O support for a variety of storage device types
    • Same file abstraction
  • Provide a standardized set of I/O interface routines
    • Storage device drivers interchangeable
  • Guarantee that the data in the file are valid
  • Optimise performance
  • Minimize or eliminate the potential for lost or destroyed data
  • Provide I/O support and access control for multiple users
  • Support system administration (e.g., backups)

6

File Names

• File system must provide a convenient naming scheme

  • Textual Names
  • May have restrictions
    • Only certain characters
      • E.g. no ‘/’ characters
    • Limited length
    • Only certain format
      • E.g. DOS, 8 + 3
  • Case (in)sensitive
  • Names may obey conventions (.c files or C files)
    • Interpreted by tools (UNIX)
    • Interpreted by operating system (Windows)

7

File Naming

Typical file extensions.

8

File Structure

From OS’s perspective

• Three kinds of files

  • byte sequence
  • record sequence
  • tree

9

File Structure

• Stream of Bytes

  • OS considers a file to be unstructured
  • Simplifies file management for the OS
  • Applications can impose their own structure
  • Used by UNIX, Windows, most modern OSes

• Records

  • Collection of bytes treated as a unit
    • Example: employee record
  • Operations at the level of records (read_rec, write_rec)
  • File is a collection of similar records
  • OS can optimise operations on records
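The idea that an application can impose its own record structure on a plain byte stream can be sketched as follows. This is only an illustration: the employee record layout (4-byte id, 16-byte name, 8-byte salary) is an assumption, and `read_rec` / `write_rec` are ordinary application functions borrowing the slide's names, not OS calls.

```python
import io
import struct

# Hypothetical fixed-size employee record: id, name, salary (28 bytes, no padding).
RECORD = struct.Struct("<I16sd")

def write_rec(f, index, emp_id, name, salary):
    """Write one record at record position `index` in the byte stream."""
    f.seek(index * RECORD.size)
    f.write(RECORD.pack(emp_id, name.encode("ascii"), salary))

def read_rec(f, index):
    """Read the record at record position `index` back out of the byte stream."""
    f.seek(index * RECORD.size)
    emp_id, raw_name, salary = RECORD.unpack(f.read(RECORD.size))
    return emp_id, raw_name.rstrip(b"\x00").decode("ascii"), salary

stream = io.BytesIO()  # stands in for an unstructured file
write_rec(stream, 0, 1, "alice", 5200.0)
write_rec(stream, 1, 2, "bob", 4800.0)
second = read_rec(stream, 1)
```

The OS sees only bytes here; the record boundaries exist purely in the application's arithmetic on offsets.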

10

File Structure

• Tree of Records

  • Records of variable length
  • Each has an associated key
  • Record retrieval based on key
  • Used on some data processing systems (mainframes)

11

File Types

  • Regular files
  • Directories
  • Device Files
    • May be divided into
      • Character Devices – stream of bytes
      • Block Devices
  • Some systems distinguish between regular file types
    • ASCII text files, binary files
  • At minimum, all systems recognise their own executable file format
    • May use a magic number
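A minimal sketch of magic-number detection, assuming we only care about two formats: ELF files (which begin with the bytes `0x7f 'E' 'L' 'F'`) and `ar` archives (which begin with `!<arch>` and a newline). The `classify` function and its table are illustrative names, not part of any OS interface.

```python
# Leading bytes ("magic numbers") of two well-known formats.
MAGIC = {
    b"\x7fELF": "ELF executable or object file",
    b"!<arch>\n": "ar archive (e.g. a .a library)",
}

def classify(header: bytes) -> str:
    """Return a description based on the first bytes of a file."""
    for magic, description in MAGIC.items():
        if header.startswith(magic):
            return description
    return "unknown"

# A made-up header: the ELF magic followed by filler bytes.
kind = classify(b"\x7fELF\x02\x01\x01" + bytes(9))
```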

12

File Types

(a) An executable file (b) An archive (libxyz.a)

19

File Organisation and Access

Programmer’s Perspective

  • Performance considerations:
    • File system performance affects overall system performance
    • Organisation of the file system affects performance
    • File organisation (data layout) affects performance
      • depends on access patterns
  • Possible access patterns:
    • Read the whole file
    • Read individual blocks or records from a file
    • Read blocks or records preceding or following the current one
    • Retrieve a set of records
    • Write a whole file sequentially
    • Insert/delete/update records in a file
    • Update blocks in a file

20

Criteria for File Organization

  • Rapid access
    • Needed when accessing a single record
    • Not needed for batch mode
  • Ease of update
    • File on CD-ROM will not be updated, so this is not a concern
  • Economy of storage
    • Should be minimum redundancy in the data
    • Redundancy can be used to speed access such as an index
  • Simple maintenance
  • Reliability

21

Classic File Organisations

• There are many ways to organise a file’s contents; here are just a few basic methods
  – Unstructured Stream (Pile)
  – Sequential
  – Indexed Sequential
  – Direct or Hashed

22

Unstructured Stream

• Data are collected in the order they arrive
• Purpose is to accumulate a mass of data and save it
• Records may have different fields
• No structure
• Record access is by exhaustive search

23

Unstructured Stream Performance

• Update

  • Same size record - okay
  • Variable size - poor

• Retrieval

  • Single record - poor
  • Subset – poor
  • Exhaustive - okay

24

The Sequential File

  • Fixed format used for records
  • Records are the same length
  • Field names and lengths are attributes of the file
  • One field is the key field
    • Uniquely identifies the record
    • Records are stored in key sequence

25

The Sequential File

• Update

  • Same size record - good
  • Variable size – No

• Retrieval

  • Single record - poor
  • Subset – poor
  • Exhaustive - okay

26

Indexed Sequential File

  • Index provides a lookup capability to quickly reach the vicinity of the desired record
    • Contains key field and a pointer to the main file
    • Index is searched to find the highest key value that is less than or equal to the desired key value
    • Search continues in the main file at the location indicated by the pointer

[Figure: Index entries (Key, File Ptr) pointing into the Main File]
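The lookup described on this slide can be sketched as below. Assumptions: the main file is modelled as an in-memory list of `(key, data)` tuples in key order, the index holds every `stride`-th key, and the index itself is searched with binary search (`bisect`) rather than a linear scan. All names are hypothetical.

```python
from bisect import bisect_right

def build_index(main_file, stride):
    """Index every `stride`-th record: parallel lists of keys and positions."""
    positions = list(range(0, len(main_file), stride))
    keys = [main_file[p][0] for p in positions]
    return keys, positions

def lookup(main_file, index, key):
    """Find `key` via the index, then scan forward in the main file."""
    keys, positions = index
    slot = bisect_right(keys, key) - 1  # highest index key <= desired key
    if slot < 0:
        return None  # key precedes every record in the file
    for k, value in main_file[positions[slot]:]:
        if k == key:
            return value
        if k > key:  # passed the place where the key would be
            break
    return None

# Main file: (key, data) records stored in key sequence.
main_file = [(k, f"record-{k}") for k in range(0, 100, 2)]
index = build_index(main_file, stride=10)
found = lookup(main_file, index, 42)
missing = lookup(main_file, index, 43)
```

The index only gets the search into the right neighbourhood; the final steps are still a sequential scan of the main file.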

27

Comparison of sequential and indexed sequential lookup

• Example: a file contains 1 million records
• On average 500,000 accesses are required to find a record in a sequential file
• If an index contains 1000 entries, it will take on average 500 accesses to find the key, followed by 500 accesses in the main file. Now on average it is 1000 accesses
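The arithmetic behind these figures can be written out as follows. The symbols $N$ (number of records) and $k$ (number of index entries) are introduced here for illustration, and the estimate assumes both the index and the main file are scanned linearly, with records spread evenly across index entries.

```latex
\[
C_{\text{seq}} = \frac{N}{2} = \frac{10^{6}}{2} = 500{,}000,
\qquad
C_{\text{idx}}(k) = \underbrace{\frac{k}{2}}_{\text{scan index}}
                  + \underbrace{\frac{N}{2k}}_{\text{scan main file}}
\]
\[
C_{\text{idx}}(1000) = \frac{1000}{2} + \frac{10^{6}}{2 \cdot 1000} = 500 + 500 = 1000
\]
\[
\frac{dC_{\text{idx}}}{dk} = \frac{1}{2} - \frac{N}{2k^{2}} = 0
\;\Longrightarrow\; k = \sqrt{N} = 1000
\]
```

Under these assumptions, 1000 index entries is also the choice that minimises the average cost for a single-level index over one million records.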

28

Indexed Sequential File

• Update

  • Same size record - good
  • Variable size - No

• Retrieval

  • Single record - good
  • Subset – poor
  • Exhaustive - okay


29

The Direct, or Hashed File

• Key field required for each record
• Key maps directly or via a hash mechanism to an address within the file
• Directly access a block at the known address

[Figure: Key → Hash → Hashed File]
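A rough sketch of the key-to-address mapping, assuming fixed-size 32-byte record slots, a 64-slot file, and CRC-32 as the hash function. All three are arbitrary choices for illustration, and collisions are deliberately ignored, so two keys hashing to the same slot would overwrite each other.

```python
import io
import zlib

RECORD_SIZE = 32  # fixed-size record slots (assumed)
NUM_SLOTS = 64    # capacity of the file in records (assumed)

def slot_offset(key: str) -> int:
    """Map a key to the byte offset of its slot inside the file."""
    slot = zlib.crc32(key.encode("utf-8")) % NUM_SLOTS
    return slot * RECORD_SIZE

def put(f, key: str, payload: bytes) -> None:
    """Store `payload` directly at the address derived from `key`."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload larger than a record slot")
    f.seek(slot_offset(key))
    f.write(payload.ljust(RECORD_SIZE, b"\x00"))

def get(f, key: str) -> bytes:
    """Fetch the record with one seek and one read: no searching."""
    f.seek(slot_offset(key))
    return f.read(RECORD_SIZE).rstrip(b"\x00")

hashed_file = io.BytesIO(bytes(RECORD_SIZE * NUM_SLOTS))  # preallocated file
put(hashed_file, "kevine", b"uid=1001")
value = get(hashed_file, "kevine")
```

Single-record retrieval costs one access regardless of file size, which is why the slide rates it as excellent, while reading records in key order would mean hopping around the file.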

30

The Direct, or Hashed File

• Update

– Same size record - good

– Variable size – No

  • Fixed sized records used

• Retrieval

– Single record - excellent

– Subset – poor

– Exhaustive - poor


37

Current Working Directory

• Always specifying the absolute pathname for a file is tedious!
• Introduce the idea of a working directory
  – Files are referenced relative to the working directory
• Example: cwd = /home/kevine
  .profile = /home/kevine/.profile

38

Relative and Absolute Pathnames

  • Absolute pathname
    • A path specified from the root of the file system to the file
  • A relative pathname
    • A pathname specified from the cwd
  • Note: ‘.’ (dot) and ‘..’ (dotdot) refer to current and parent directory

Example: cwd = /home/kevine

../../etc/passwd
/etc/passwd
../kevine/../.././etc/passwd

Are all the same file
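The equivalence of the three pathnames in the example can be checked with a short sketch using Python's `posixpath` module. Note that `normpath` resolves `.` and `..` purely lexically and does not consult the file system, so it assumes no symbolic links are involved.

```python
import posixpath

cwd = "/home/kevine"
candidates = [
    "../../etc/passwd",
    "/etc/passwd",
    "../kevine/../.././etc/passwd",
]

# Joining with cwd leaves absolute paths untouched; normpath then
# collapses the '.' and '..' components.
resolved = [posixpath.normpath(posixpath.join(cwd, p)) for p in candidates]
```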

39

Typical Directory Operations

1. Create

2. Delete

3. Opendir

4. Closedir

5. Readdir

6. Rename

7. Link

8. Unlink
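Several of these operations have direct counterparts in Python's `os` module, which can serve as a quick illustration. The sketch below works in a throwaway temporary directory; the names `backup`, `a.txt`, and `b.txt` are arbitrary, and the Link operation is left out.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    d = os.path.join(root, "backup")
    os.mkdir(d)                                   # Create
    open(os.path.join(d, "a.txt"), "w").close()   # put one file in it

    with os.scandir(d) as entries:                # Opendir ... Closedir
        names = sorted(e.name for e in entries)   # Readdir

    os.rename(os.path.join(d, "a.txt"),
              os.path.join(d, "b.txt"))           # Rename
    renamed = os.listdir(d)

    os.unlink(os.path.join(d, "b.txt"))           # Unlink
    os.rmdir(d)                                   # Delete (directory must be empty)
    remaining = os.listdir(root)
```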

40

Nice properties of UNIX naming

• Simple, regular format
  – Names referring to different servers, objects, etc., have the same syntax.
    • Regular tools can be used where specialised tools would be otherwise needed.
• Location independent
  – Objects can be distributed or migrated, and continue with the same names.

41

An example of a bad naming convention

• From Rob Pike and Peter Weinberger, “The Hideous Name”, Bell Labs TR

UCBVAX::SYS$DISK:[ROB.BIN]CAT_V.EXE;

42

File Sharing

• In a multiuser system, allow files to be shared among users
• Two issues
  – Access rights
  – Management of simultaneous access

43

Access Rights

• None
  – User may not know of the existence of the file
  – User is not allowed to read the user directory that includes the file
• Knowledge
  – User can only determine that the file exists and who its owner is

44

Access Rights

• Execution
  – The user can load and execute a program but cannot copy it
• Reading
  – The user can read the file for any purpose, including copying and execution
• Appending
  – The user can add data to the file but cannot modify or delete any of the file’s contents

45

Access Rights

• Updating
  – The user can modify, delete, and add to the file’s data. This includes creating the file, rewriting it, and removing all or part of the data
• Changing protection
  – User can change access rights granted to other users
• Deletion
  – User can delete the file

46

Access Rights

• Owner
  – Has all rights previously listed
  – May grant rights to others using the following classes of users
    • Specific user
    • User groups
    • All for public files

47

Case Study: UNIX Access Permissions

• First letter: file type
  – d for directories
  – - for regular files
• Three user categories
  – user, group, and other

total 1704
drwxr-x--- 3 kevine kevine 4096 Oct 14 08: .
drwxr-x--- 3 kevine kevine 4096 Oct 14 08: ..
drwxr-x--- 2 kevine kevine 4096 Oct 14 08:12 backup
-rw-r----- 1 kevine kevine 141133 Oct 14 08:13 eniac3.jpg
-rw-r----- 1 kevine kevine 1580544 Oct 14 08:13 wk11.ppt

48

UNIX Access Permissions

• Three access rights per category
  – read, write, and execute

d rwx  rwx   rwx
  user group other
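To make the layout of the ten-character mode string concrete, here is a small sketch that builds it from numeric mode bits with Python's `stat.filemode` and splits it into the file-type letter and the three categories. The `describe` helper is a name made up for this example; the two modes correspond to the `backup` directory and `eniac3.jpg` in the listing.

```python
import stat

def describe(mode: int) -> dict:
    """Split an ls-style mode string into type, user, group, and other."""
    text = stat.filemode(mode)  # e.g. 'drwxr-x---'
    return {
        "type": text[0],
        "user": text[1:4],
        "group": text[4:7],
        "other": text[7:10],
    }

backup = describe(stat.S_IFDIR | 0o750)  # drwxr-x---
photo = describe(stat.S_IFREG | 0o640)   # -rw-r-----
```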

55

Example Block Size Trade-off

  • Dark line (left hand scale) gives data rate of a disk
  • Dotted line (right hand scale) gives disk space efficiency
    • All files 2KB (an approximate median file size)

Block size
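The space-efficiency half of this trade-off can be reproduced with a few lines of arithmetic, assuming every file is exactly 2 KB as on the slide. The data-rate curve depends on disk parameters that the slide does not give, so it is not modelled here. The function name is an illustrative choice.

```python
import math

def space_efficiency(file_size: int, block_size: int) -> float:
    """Fraction of the allocated disk space that actually holds file data."""
    blocks = math.ceil(file_size / block_size)
    return file_size / (blocks * block_size)

FILE_SIZE = 2 * 1024  # all files 2 KB, as on the slide
efficiency = {
    block: space_efficiency(FILE_SIZE, block)
    for block in (512, 1024, 2048, 4096, 8192)
}
```

Up to a 2 KB block size no space is wasted; beyond that, each doubling of the block size halves the efficiency, since every file still occupies one whole block.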

56

File System Implementation

A possible file system layout

57

Implementing Files

• The file system must keep track of
  • which blocks belong to which files.
  • in what order the blocks form the file
  • which blocks are free for allocation
• Given a logical region of a file, the file system must identify the corresponding block(s) on disk.
  • Stored in file system metadata
    • file allocation table (FAT), directory, I-node
• Creating and writing files allocates blocks on disk
  • How?

58

Allocation Strategies

• Preallocation

  • Need the maximum size for the file at the time of creation
  • Difficult to reliably estimate the maximum potential size of the file
  • Tend to overestimate file size so as not to run out of space

• Dynamic Allocation

  • Allocated in portions as needed

59

Portion Size

  • Extremes
    • Portion size = length of file (contiguous allocation)
    • Portion size = block size
  • Tradeoffs
    • Contiguity increases performance for sequential operations
    • Many small portions increase the size of the metadata required to book-keep components of a file, free-space, etc.
    • Fixed-sized portions simplify reallocation of space
    • Variable-sized portions minimise internal fragmentation losses

60

Methods of File Allocation

• Contiguous allocation
  – Single set of blocks is allocated to a file at the time of creation
  – Only a single entry per file in the directory
    • Starting block and length of the file
• External fragmentation will occur
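With contiguous allocation, translating a byte offset within a file into a disk block is a single addition. A minimal sketch, assuming 512-byte blocks and a directory modelled as a dictionary mapping each name to (starting block, length in blocks); the entry shown is made up.

```python
BLOCK_SIZE = 512  # assumed block size in bytes

def contiguous_block(start_block: int, length_blocks: int, offset: int) -> int:
    """Return the disk block holding byte `offset` of a contiguous file."""
    index = offset // BLOCK_SIZE
    if not 0 <= index < length_blocks:
        raise ValueError("offset outside file")
    return start_block + index

# Hypothetical directory entry: the file starts at block 14 and is 3 blocks long.
directory = {"wk11.ppt": (14, 3)}
start, length = directory["wk11.ppt"]
block = contiguous_block(start, length, 1200)
```

No disk reads are needed to locate the block, which is what makes both sequential and random access cheap under this scheme.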

61

directory

62

• Eventually, we will need compaction to reclaim unusable disk space.

63

directory

64

Methods of File Allocation

  • Chained (or linked list) allocation
  • Allocation on basis of individual block
    • Each block contains a pointer to the next block in the chain
    • Only a single entry per file in the directory
      • Starting block and length of file
  • No external fragmentation
  • Best for sequential files
    • Poor for random access
  • No accommodation of the principle of locality
    • Blocks end up scattered across the disk
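Why chained allocation is poor for random access can be seen in a small sketch. Assumptions: 512-byte blocks, and the per-block next pointers are modelled as a dictionary (`None` marks the end of the chain); the block numbers are invented. Reaching a given offset means reading every block before it.

```python
BLOCK_SIZE = 512  # assumed block size in bytes

def chained_block(next_pointer, start_block, offset):
    """Follow the chain to the block holding byte `offset`.

    Returns (block number, number of blocks that had to be read).
    """
    block = start_block
    reads = 1  # the target block itself is always read
    for _ in range(offset // BLOCK_SIZE):
        block = next_pointer[block]  # the pointer lives inside the block just read
        if block is None:
            raise ValueError("offset outside file")
        reads += 1
    return block, reads

# Hypothetical chain scattered across the disk: 9 -> 16 -> 1 -> 10
next_pointer = {9: 16, 16: 1, 1: 10, 10: None}
block, reads = chained_block(next_pointer, start_block=9, offset=1200)
```

The cost grows linearly with the offset, whereas a sequential read pays one block read per block either way.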

65

directory

66

• To improve performance, we can run a defragmentation utility to consolidate files.

73

Implementing Directories

  • Simple fixed-sized directory entries

(a) disk addresses and attributes in directory entry

  • DOS/Windows

(b) Directory in which each entry just refers to an i-node

  • UNIX

74

Fixed Size Directory Entries

• Either too small
  – Example: DOS 8+3 characters
• Or waste too much space
  – Example: 255 characters per file name

75

Implementing Directories

  • Two ways of handling long file names in directory
    • (a) In-line
    • (b) In a heap

76

Implementing Directories

• Freeing variable-length entries can create external fragmentation in directory blocks
  – Can compact when block is in RAM

77

Shared Files

Files shared under different names

File system containing a shared file

78

Implementing Shared Files

  • Copy entire directory entry (including file attributes)
    • Updates to shared file not seen by all parties
    • Not useful
  • Keep attributes separate (in I-node) and create a new entry (name) that points to the attributes (hard link)
    • Updates visible
    • If one link is removed, the other remains (ownership is an issue)
  • Create a special “LINK” file that contains the pathname of the shared file (symbolic link, shortcut).
    • File removal leaves dangling links
    • Not as efficient to access
    • Can point to names outside the particular file system
    • Can transparently replace the file with another
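The difference between the two linking schemes can be observed directly on a POSIX system. The sketch below assumes a file system that supports both hard and symbolic links and uses a temporary directory; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "original")
    hard = os.path.join(root, "hard")
    soft = os.path.join(root, "soft")

    with open(original, "w") as f:
        f.write("shared data")

    os.link(original, hard)     # hard link: a second name for the same i-node
    os.symlink(original, soft)  # symbolic link: a small file holding a pathname

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink

    os.unlink(original)         # remove the original name

    with open(hard) as f:       # the data survives via the other hard link
        hard_content = f.read()

    # The symbolic link still exists but now points at nothing.
    soft_dangling = os.path.islink(soft) and not os.path.exists(soft)
```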

79

Shared Files

(a) Situation prior to linking

(b) After the link is created

(c) After the original owner removes the file

80

Free Disk Space Management

(a) Storing the free list on a linked list (b) A bit map

81

Bit Tables

• Individual bits in a bit vector flag used/free blocks
• 16GB disk with 512-byte blocks → 4MB table
• May be too large to hold in main memory
• Expensive to search
  • But may use a two level table
• Concentrating (de)allocations in a portion of the bitmap has desirable effect of concentrating access
• Simple to find contiguous free space
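The table-size figure and the basic allocate/free operations can be sketched as follows. This assumes one bit per block with 1 meaning used, and a simple first-fit linear search; the class and function names are illustrative only.

```python
def bitmap_bytes(disk_bytes: int, block_bytes: int) -> int:
    """Size of a bit table with one bit per disk block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8

class BitTable:
    """One bit per block: 1 = used, 0 = free."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Linear search for the first free block (the expensive part)."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise RuntimeError("disk full")

    def free(self, block: int) -> None:
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

# 16 GB disk, 512-byte blocks: 2**25 blocks, so 2**25 bits = 4 MB.
table_size = bitmap_bytes(16 * 2**30, 512)

table = BitTable(16)
first, second = table.allocate(), table.allocate()
table.free(first)
reused = table.allocate()
```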

82

Free Block List

• List of all unallocated blocks
• Manage as LIFO or FIFO on disk with ends in main memory
• Background jobs can re-order list for better contiguity
• Store in free blocks themselves
  – Does not reduce disk capacity

83

Quotas

Quotas for keeping track of each user’s disk use