Google File System
Aug. 13, 2024
Google Inc. developed the Google File System (GFS), a scalable distributed file system (DFS), to meet the company's growing data processing needs. GFS provides fault tolerance, reliability, scalability, availability, and performance for large networks of connected nodes. It is built from a number of storage systems assembled from inexpensive commodity hardware. It is tailored to Google's own data usage and storage requirements, such as those of its search engine, which generates enormous volumes of data that must be stored.
GFS was designed to tolerate the hardware failures that are common on commercially available servers while still benefiting from their low cost.
GFS is also known as GoogleFS. It manages two types of data: file metadata and file data.
A GFS cluster consists of a single master and several chunk servers that are regularly accessed by various client systems. Chunk servers store data as Linux files on their local disks. Stored data is split into large 64 MB chunks, and each chunk is replicated at least three times across the network. The large chunk size reduces network overhead.
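The chunk arithmetic described above can be sketched in a few lines. This is only an illustration under the stated 64 MB chunk size; the function names are hypothetical and are not part of any GFS API.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size stated in the text


def chunk_index(byte_offset: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Return the index of the chunk that contains the given byte offset."""
    return byte_offset // chunk_size


def chunk_count(file_size: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Return how many chunks a file of file_size bytes is split into."""
    return (file_size + chunk_size - 1) // chunk_size  # ceiling division


if __name__ == "__main__":
    one_gib = 1 << 30
    # A 1 GiB file needs far fewer chunk lookups with 64 MB chunks
    # than it would with, say, 1 MB chunks.
    print(chunk_count(one_gib))            # 64 MB chunks
    print(chunk_count(one_gib, 1 << 20))   # hypothetical 1 MB chunks
```

Fewer chunks per file means fewer chunk entries to track and fewer location lookups per file, which is one way to read the claim that a large chunk size reduces overhead.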
GFS is designed to meet Google's large cluster requirements without burdening applications. Files are stored in hierarchical directories and identified by path names. The master manages metadata, including the namespace, access control information, and the mapping of files to chunks. The master communicates with each chunk server through periodic heartbeat messages and keeps track of its status.
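As a rough sketch of how a master might track chunk server status from heartbeats, the example below records the last time each server was heard from and treats a server as down once a timeout passes. The class name, method names, and the 60-second timeout are assumptions made for illustration; the article does not specify them.

```python
import time
from typing import Dict, List, Optional


class HeartbeatTracker:
    """Hypothetical master-side bookkeeping for chunk server heartbeats."""

    def __init__(self, timeout_seconds: float = 60.0) -> None:
        self.timeout_seconds = timeout_seconds   # assumed value, not from GFS
        self.last_seen: Dict[str, float] = {}    # server id -> last heartbeat time

    def record_heartbeat(self, server_id: str, now: Optional[float] = None) -> None:
        """Store the arrival time of a heartbeat from server_id."""
        self.last_seen[server_id] = time.monotonic() if now is None else now

    def live_servers(self, now: Optional[float] = None) -> List[str]:
        """Servers whose last heartbeat is within the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(s for s, t in self.last_seen.items()
                      if now - t <= self.timeout_seconds)

    def dead_servers(self, now: Optional[float] = None) -> List[str]:
        """Servers that have been silent for longer than the timeout."""
        now = time.monotonic() if now is None else now
        return sorted(s for s, t in self.last_seen.items()
                      if now - t > self.timeout_seconds)
```

The optional `now` argument exists only so the logic can be exercised with fixed timestamps; in normal use the tracker reads the monotonic clock itself.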
The largest GFS clusters comprise more than 1,000 nodes with 300 TB of disk storage, which hundreds of clients can access continuously.
Components of GFS
GFS runs on a cluster, which is simply a group of connected computers. Each cluster may contain hundreds or even thousands of computers. Every GFS cluster includes three basic entities:
- GFS Clients: Computer programs or applications that request files. Requests may read or modify existing files or add new files to the system.
- GFS Master Server: It serves as the cluster's coordinator. It records the cluster's activity in an operation log. It also keeps track of metadata, the data that describes chunks. The metadata tells the master server which file each chunk belongs to and where the chunk fits within that file.
- GFS Chunk Servers: They are the workhorses of GFS. They store file chunks of 64 MB each. Chunk servers do not send chunks through the master server; instead, they deliver the requested chunks directly to the client. To ensure reliability, GFS makes multiple copies of each chunk and stores them on different chunk servers; the default is three copies. Each copy is called a replica.
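The division of labor between the three entities can be illustrated with a toy read path: the client asks the master only for metadata, then fetches the data directly from a chunk server. Everything here is a simplified sketch; the class names, the string chunk handles, and the single-chunk read are assumptions for illustration, not the real GFS interfaces.

```python
from typing import Dict, List, Tuple

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as stated in the article


class ChunkServer:
    """Holds chunk contents keyed by a chunk handle (a hypothetical identifier)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.chunks: Dict[str, bytes] = {}

    def read(self, handle: str, offset: int, length: int) -> bytes:
        return self.chunks[handle][offset:offset + length]


class Master:
    """Holds metadata only: which chunks make up a file and where replicas live."""

    def __init__(self) -> None:
        self.file_chunks: Dict[str, List[str]] = {}        # path -> handles in file order
        self.locations: Dict[str, List[ChunkServer]] = {}  # handle -> replicas

    def lookup(self, path: str, index: int) -> Tuple[str, List[ChunkServer]]:
        handle = self.file_chunks[path][index]
        return handle, self.locations[handle]


class Client:
    def __init__(self, master: Master, chunk_size: int = CHUNK_SIZE) -> None:
        self.master = master
        self.chunk_size = chunk_size

    def read(self, path: str, offset: int, length: int) -> bytes:
        """Read bytes that lie within a single chunk."""
        index = offset // self.chunk_size
        handle, replicas = self.master.lookup(path, index)  # metadata only
        # File data comes straight from a chunk server, not via the master.
        return replicas[0].read(handle, offset % self.chunk_size, length)
```

Keeping file data off the master is what lets a single master coordinate many chunk servers: it answers small metadata queries while the bulk transfer happens elsewhere.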
Features of GFS
- Namespace management and locking.
- Fault tolerance.
- Reduced client and master interaction because of the large chunk size.
- High availability.
- Critical data replication.
- Automatic and efficient data recovery.
- High aggregate throughput.
Advantages of GFS
- High availability. Thanks to replication, data remains accessible even if a few nodes fail. Component failures are the norm rather than the exception, as the saying goes.
- High throughput. Many nodes operate concurrently.
- Reliable storage. Corrupted data can be detected and replaced with a copy from an intact replica.
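The last point, detecting corrupted data and re-copying it, can be sketched with checksums. The article does not say how corruption is detected; the example assumes a CRC-32 checksum (via Python's `zlib.crc32`) recorded when the chunk was written, and the function names are hypothetical.

```python
import zlib
from typing import Dict, List


def checksum(data: bytes) -> int:
    """CRC-32 of the chunk contents (an assumed choice of checksum)."""
    return zlib.crc32(data)


def repair_replicas(replicas: Dict[str, bytes], expected: int) -> List[str]:
    """Overwrite corrupted replicas with an intact copy.

    replicas maps a chunk server name to its copy of the chunk.
    Returns the names of the servers whose copies were replaced.
    """
    good = next((d for d in replicas.values() if checksum(d) == expected), None)
    if good is None:
        raise ValueError("no intact replica available")
    repaired = []
    for server, data in list(replicas.items()):
        if checksum(data) != expected:
            replicas[server] = good  # re-copy from the intact replica
            repaired.append(server)
    return repaired
```

With three replicas, a single corrupted copy can be repaired as long as at least one other copy still matches the recorded checksum.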
Disadvantages of GFS
- Not the best fit for small files.
- Master may act as a bottleneck.
- Random writes are not well supported.
- Best suited to data that is written once and afterwards only read or appended to.
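The write-once, append-later access pattern can be pictured as a file object that offers appending and reading but no way to overwrite bytes in place. This is only a toy model of the access pattern, with hypothetical names; it does not reproduce how GFS implements appends.

```python
class AppendOnlyFile:
    """Toy model: data can be appended and read, never overwritten in place."""

    def __init__(self) -> None:
        self._data = bytearray()

    def append(self, record: bytes) -> int:
        """Add a record at the end and return the offset where it was stored."""
        offset = len(self._data)
        self._data += record
        return offset

    def read(self, offset: int, length: int) -> bytes:
        """Return length bytes starting at offset."""
        return bytes(self._data[offset:offset + length])

    def size(self) -> int:
        return len(self._data)
```

Workloads such as log collection fit this model well, while workloads that update records in the middle of a file do not.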
ifrahshaxil8
Global file system
For the Red Hat product, see GFS2
In computer storage, a global file system is a distributed file system that can be accessed from multiple locations, typically across a wide-area network, and provides concurrent access to a global namespace from all locations. In order for a file system to be considered global, it must allow for files to be created, modified, and deleted from any location. This access is typically provided by a cloud storage gateway at each edge location, which provides access using the NFS or SMB network file sharing protocols.[1]
There are a number of benefits to using a global file system. First, global file systems can improve the availability of data by allowing multiple copies to be stored in different locations, as well as allowing for rapid restoration of lost data from a remote location. This can be helpful in the event of a disaster, such as a power outage or a natural disaster. Second, global file systems can improve performance by allowing data to be cached closer to the users who are accessing it. This can be especially beneficial in cases where data is accessed by users in different parts of the world. Finally, in contrast to traditional Network attached storage, global file systems can improve the ability of users to collaborate across multiple sites, in a manner similar to Enterprise file synchronization and sharing.[1]
History
The term global file system has historically referred to a distributed virtual namespace built on a set of local file systems to provide transparent access to multiple, potentially distributed, systems.[2] These global file systems shared properties such as a blocking interface and no buffering, but guaranteed that the same path name corresponds to the same object on all computers deploying the file system. Also called distributed file systems, they rely on redirection to distributed systems, so latency and scalability can affect file access depending on where the target systems reside.
The Andrew File System attempted to solve this for a campus environment using caching and a weak consistency model to achieve local access to remote files.
More recently, global file systems have found a use case in providing hybrid cloud storage that combines cloud or any other object storage, versioning, and local caching to create a single, unified, globally accessible file system. Such a system does not rely on redirection to a storage device[3] but serves files from the local cache while maintaining the single file system and all metadata in the object storage.[4] As described in Google's patents, advantages of these global file systems include the ability to scale with the object storage, use snapshots stored in the object storage for versioning to replace backup, and create a centrally managed consolidated storage repository in the object storage.
Comparison with Network Attached Storage
When it comes to hybrid file storage, there are two main approaches: network attached storage (NAS) with cloud connectivity and global file system (GFS). The two solutions are fundamentally different.[5]
NAS with cloud connectivity is typically used to supplement on-premises storage. Public clouds may be combined with on-premises NAS for tasks such as backup, tiering, or disaster recovery. This type of setup uses the cloud for specific use cases to complement on-premises storage. On-premises NAS is sold by well-established IT vendors including Dell, IBM, NetApp, and others, and most build in support for some type of cloud connectivity.[5]
A global file system uses a fundamentally different architecture. In these solutions, cloud storage, typically object storage, serves as the core storage element, while caching devices are used on-premises to provide data access. These devices can be physical but are increasingly available as virtual appliances that can be deployed in a hypervisor. The use of caching devices reduces the amount of on-premises storage capacity required, and the associated capital expense.[5]
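The architecture described above, object storage as the authoritative copy with a small on-premises cache in front of it, can be sketched as a read-through, write-through cache. This is a minimal illustration only: the class names are hypothetical, the in-memory dictionary stands in for a real object store, and least-recently-used eviction is an assumed policy, not something the article specifies.

```python
from collections import OrderedDict
from typing import Dict


class ObjectStore:
    """In-memory stand-in for cloud object storage (holds the gold copy)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}
        self.fetches = 0  # counts reads that had to go to the cloud

    def get(self, key: str) -> bytes:
        self.fetches += 1
        return self.objects[key]

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


class EdgeCache:
    """On-premises caching device holding at most `capacity` files."""

    def __init__(self, store: ObjectStore, capacity: int) -> None:
        self.store = store
        self.capacity = capacity
        self._cache: "OrderedDict[str, bytes]" = OrderedDict()

    def _remember(self, path: str, data: bytes) -> None:
        self._cache[path] = data
        self._cache.move_to_end(path)          # mark as most recently used
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)    # evict least recently used

    def read(self, path: str) -> bytes:
        if path in self._cache:                # served locally, no cloud round trip
            self._cache.move_to_end(path)
            return self._cache[path]
        data = self.store.get(path)            # cache miss: fetch the gold copy
        self._remember(path, data)
        return data

    def write(self, path: str, data: bytes) -> None:
        self.store.put(path, data)             # gold copy is updated first
        self._remember(path, data)
```

Frequently used files are served from the local cache, while a file that has been evicted costs a round trip to the object store, which is where the latency for infrequently accessed files comes from.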
Global file systems are better suited for remote collaboration, as they make it easier to manage access to files across dispersed geographic areas. Utilizing the cloud as a central storage location enables users to access the same data regardless of their location.[5]
There are some trade-offs to consider when choosing a GFS solution, however. One trade-off is that, because the gold copy of the data is stored off-site, there may be latency when retrieving infrequently accessed files.[5]
Vendors
Notable vendors in the global filesystem area include:[1]
See also
References