•File Transfer Protocol (FTP)
•Sun’s Network File System (NFS)
•Andrew File System (AFS)
Other older file systems:
1. Coda
2. Sprite
3. Echo
4. Amoeba Bullet File Server
5. xFS
File Transfer Protocol (FTP)
•The motivation is to provide file sharing (FTP is not a distributed file system).
•A user connects to a remote machine and interactively sends or fetches an arbitrary file.
•FTP handles authentication, directory listing, ASCII or binary transfer modes, etc.
Sun’s Network File System (NFS)
•Sun's NFS is one of the most popular and widespread distributed file systems in use today.
The design goals of NFS were:
•Any machine can be a client and/or a server.
•NFS must support diskless workstations (that are booted from the network). Diskless workstations were Sun’s major product line.
•Heterogeneous systems should be supported: clients and servers may have different hardware and/or operating systems. Interfaces for NFS were published to encourage the widespread adoption of NFS.
•High performance: make remote access comparable to local access through caching and read-ahead.
Andrew File System (AFS)
•The goal of the Andrew File System was to support information sharing on a large scale (thousands to 10000+ users).
•There were several incarnations of AFS: the first version became available around 1984, AFS-2 in 1986, and AFS-3 in 1989.
•The assumptions about file usage were:
•most files are small
•reads are much more common than writes
•most files are read/written by one user
From these assumptions, the original goal of AFS was to use whole file serving on
the server (send an entire file when it is opened) and whole file caching on the
client (save the entire file onto a local disk).
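The whole-file idea above can be sketched in a few lines of Python. This is a minimal illustration, not how AFS is actually implemented: the class name WholeFileCache, the use of two local directories to stand in for the server and the client cache, and the write-back-on-close rule are all assumptions made for the example. Cache invalidation is left out entirely, and file names are assumed to be flat (no subdirectories).

```python
import os
import shutil
import tempfile


class WholeFileCache:
    """Hypothetical client that caches entire files on local disk at open time."""

    def __init__(self, server_dir, cache_dir):
        self.server_dir = server_dir  # stands in for the remote file server
        self.cache_dir = cache_dir    # the client's local disk cache
        self.fetches = 0              # number of whole-file transfers so far

    def open(self, name, mode="r"):
        cached = os.path.join(self.cache_dir, name)
        if not os.path.exists(cached):
            # Cache miss: the "server" ships the whole file in one transfer.
            shutil.copyfile(os.path.join(self.server_dir, name), cached)
            self.fetches += 1
        # All reads and writes now go to the local copy.
        return open(cached, mode)

    def close(self, f, name):
        # Assumption: a file opened for writing is sent back whole on close.
        modified = any(c in f.mode for c in "wa+")
        f.close()
        if modified:
            shutil.copyfile(os.path.join(self.cache_dir, name),
                            os.path.join(self.server_dir, name))


# Usage: two opens of the same file cause only one transfer.
server = tempfile.mkdtemp()
cache = tempfile.mkdtemp()
with open(os.path.join(server, "notes.txt"), "w") as out:
    out.write("hello")

client = WholeFileCache(server, cache)
for _ in range(2):
    f = client.open("notes.txt")
    data = f.read()
    client.close(f, "notes.txt")
```

Because most files are assumed to be small and read far more often than written, paying for one whole-file transfer at open time keeps every later read local.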
Issues in distributed file systems
•Naming: in designing a distributed file service, we should consider whether all machines (and processes) should have exactly the same view of the directory hierarchy.
•We might also wish to consider whether the name space on all machines should have a global root directory (a.k.a. super root) so that files can be accessed as, for example, //server/path
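As a small illustration of the //server/path form mentioned above, the sketch below splits such a name into the server that holds the file and the path on that server. The function name split_global_name and the exact syntax rules are assumptions for this example; real systems define their own naming conventions.

```python
def split_global_name(name):
    """Split a hypothetical //server/path name into (server, local_path)."""
    if not name.startswith("//"):
        raise ValueError("expected a name of the form //server/path")
    # Everything up to the next "/" is the server; the rest is the path.
    server, _, path = name[2:].partition("/")
    if not server:
        raise ValueError("missing server component")
    return server, "/" + path


server, path = split_global_name("//fs1/home/alice/notes.txt")
```

With a global root like this, the same name refers to the same file on every machine, regardless of how each machine arranges its local directories.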
Caching
We can employ caching to improve system performance. There are four places in a distributed system where we can hold data.
1. on the server's disk
2. in a cache in the server's memory
3. in the client's memory
4. on the client's disk
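The four locations listed above can be pictured as a lookup chain that a read walks from the client outward. The sketch below is only a model: the class TieredStore and the use of in-memory dictionaries for every tier are assumptions, and it ignores cache consistency (what happens when the server copy changes).

```python
class TieredStore:
    """Hypothetical model of the four places where file data can be held."""

    def __init__(self, server_disk):
        # Ordered from closest to the client to farthest away.
        self.tiers = [
            ("client memory", {}),
            ("client disk", {}),
            ("server memory", {}),
            ("server disk", dict(server_disk)),
        ]

    def read(self, name):
        for i, (label, store) in enumerate(self.tiers):
            if name in store:
                data = store[name]
                # Copy into every closer tier so the next read stops sooner.
                for _, closer in self.tiers[:i]:
                    closer[name] = data
                return data, label
        raise FileNotFoundError(name)


store = TieredStore({"a.txt": b"data"})
first = store.read("a.txt")    # served from the server's disk
second = store.read("a.txt")   # now served from the client's memory
```

The closer to the client a copy is held, the cheaper the read, but the harder it becomes to keep that copy consistent with the server.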
Should servers maintain state?
In a stateless system:
•Fault tolerance: if a server crashes and then recovers, no state about client connections is lost, because there was no state to maintain.
•No remote open/close calls are needed.
•No server space is wasted on per-client state.
•No limit on the number of open files, because the server keeps no per-client state.
•No problems if the client crashes: the server does not have any state to clean up.
In a stateful system:
•Requests are shorter (less information to send).
•Requests can be processed with better performance.
•File locking is possible: the server can record that a certain client holds a lock on a file.
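The trade-off between the two designs can be seen in a toy model. In the sketch below, the class names, the FILES table, and the method signatures are all assumptions for illustration and do not correspond to any real protocol. The stateless server needs the path and offset in every request; the stateful server takes shorter requests but loses its open-file table if it restarts.

```python
FILES = {"/etc/motd": b"welcome to the cluster\n"}


class StatelessServer:
    """Every request names the file, offset, and length; nothing is remembered."""

    def read(self, path, offset, count):
        return FILES[path][offset:offset + count]


class StatefulServer:
    """The server remembers open files and the current position per handle."""

    def __init__(self):
        self.open_files = {}   # handle -> [path, position]
        self.next_handle = 1

    def open(self, path):
        handle = self.next_handle
        self.next_handle += 1
        self.open_files[handle] = [path, 0]
        return handle

    def read(self, handle, count):
        # Shorter request: the server already knows the path and position.
        path, pos = self.open_files[handle]
        data = FILES[path][pos:pos + count]
        self.open_files[handle][1] = pos + len(data)
        return data

    def close(self, handle):
        del self.open_files[handle]


stateful = StatefulServer()
h = stateful.open("/etc/motd")
part1 = stateful.read(h, 7)
part2 = stateful.read(h, 3)

# Simulate a crash and restart: a fresh stateful server has forgotten the
# handle, while a fresh stateless server answers the same request as before.
restarted = StatefulServer()
after_restart = StatelessServer().read("/etc/motd", 7, 3)
```

After the simulated restart, the old handle h is unknown to the new stateful server, which is exactly the recovery problem the stateless design avoids.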
Features of a distributed file system
•Uniform access: a distributed computing environment should support global file names. One mechanism that allows the name of a file to look the same on all computers is called a uniform name space.
•Manageability: systems should provide a way to keep track of configuration information (e.g. the location of files). DFS uses distributed databases for this task.
•Security: distributed file systems must provide authentication. Furthermore, once users are authenticated, the system must ensure that the performed operations are permitted on the resources accessed. This process is called authorization.
•Standard conformance: DFS complies with the IEEE POSIX 1003.1 file systems semantics standard
•Reliability: the distributed file system scheme itself improves reliability because of its distributed nature, which eliminates the single point of failure found in non-distributed systems.
•DFS uses file replication to achieve this goal, i.e., multiple copies of files on multiple servers.
•Server load balancing: a DFS root can support multiple targets that are physically distributed across a network.
•For example, suppose you have a file that you know will be accessed heavily by your users. Rather than having all users access this file on a single server, and thus taxing that server, DFS ensures that user access to the file is distributed across multiple servers. To users, however, the file appears to reside in one location on the network.
•File and folder security: because the shared resources DFS manages use standard NTFS and file sharing permissions, you can use pre-existing security groups and user accounts to ensure that only authorized users have access to sensitive data.
•Easy access to files: a distributed file system makes it easier for users to access files. Users need only go to one location on the network to access files, even though the files may be physically spread across multiple servers.
•Performance: the network is considerably slower than the internal buses. Therefore, the less often clients have to access servers, the better the performance. DFS uses a cache (of both file status and file data) to lower the network load.
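The server load balancing feature in the list above, where one logical location is backed by several physical targets, can be sketched as follows. The class DfsRoot, its method names, and the round-robin selection policy are assumptions made for this example; an actual DFS implementation may order its targets differently.

```python
import itertools


class DfsRoot:
    """Hypothetical namespace root mapping one logical path to several targets."""

    def __init__(self):
        self.targets = {}   # logical path -> round-robin iterator over servers

    def add_link(self, logical_path, servers):
        self.targets[logical_path] = itertools.cycle(servers)

    def resolve(self, logical_path):
        # Each lookup is referred to the next replica in turn.
        return next(self.targets[logical_path])


root = DfsRoot()
root.add_link("/public/report.pdf", ["srv1", "srv2", "srv3"])
chosen = [root.resolve("/public/report.pdf") for _ in range(4)]
```

Users always ask for the same logical path, while successive requests are spread over the replicas that hold copies of the file.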