HDFS – The Hadoop Distributed File System is the storage component of Apache Hadoop, a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. HDFS splits large files into blocks and replicates them across the machines of a cluster. Repository – https://git-wip-us.apache.org/repos/asf?p=hadoop.git