Categories

Useful commands for Hadoop HDFS troubleshooting

[hadoop@master ~]$ hdfs dfsadmin -report
17/04/13 00:06:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
Configured Capacity: 37576769536 (35.00 GB)
Present Capacity: 33829806080 (31.51 GB)
DFS Remaining: 33829789696 (31.51 GB)
DFS Used: 16384 (16 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing […]
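When troubleshooting, it can help to pull single numbers out of the report instead of reading it by eye. The following is a minimal sketch, assuming the summary lines keep the "Label: bytes (human readable)" shape shown in the excerpt. The helper names report_field and remaining_percent are hypothetical and not part of Hadoop.

```shell
# Sketch: extract values from `hdfs dfsadmin -report` style text.
# report_field and remaining_percent are hypothetical helper names.

# Read report text on stdin and print the raw byte count for one label,
# e.g. "DFS Remaining". Only the first match (the cluster summary) is used.
report_field() {
  awk -F': ' -v label="$1" '$1 == label { split($2, parts, " "); print parts[1]; exit }'
}

# Print the share of present capacity that is still free, as a percentage.
# The report text is passed as the first argument.
remaining_percent() (
  present=$(printf '%s\n' "$1" | report_field "Present Capacity")
  remaining=$(printf '%s\n' "$1" | report_field "DFS Remaining")
  awk -v p="$present" -v r="$remaining" 'BEGIN { printf "%.2f\n", (r / p) * 100 }'
)
```

On a live cluster the first helper could be fed directly, for example: hdfs dfsadmin -report | report_field "DFS Remaining".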

Hadoop 2.7.2 cluster on CentOS 7.3

How to set up a Hadoop cluster on a CentOS Linux system. Before you read this article, I assume you already understand the basic concepts of Hadoop and the Linux operating system.

mv ifcfg-eno16777736 ifcfg-eth0
vi /etc/udev/rules.d/90-eno-fix.rules
# This file was automatically generated on systemd update
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:9e:8f:95", NAME="eno16777736"

[…]

HDFS Command Syntax

HDFS Command Syntax Overview:
hadoop fs : Ex.: hadoop fs -ls
hadoop version : check that Hadoop is installed properly

HELP: help [cmd]: hopefully this is self-describing 

Inspect files:
ls/lsr : list all files in a directory
cat : print a file on stdout
tail [-f] : output the last part of a file
test : return attributes of file and […]
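The inspection commands can be combined into a short routine. This is only a sketch: hfs and inspect_file are hypothetical helper names, DRY_RUN is a convention assumed for this example (not a Hadoop setting), and the directory and file name in the usage line are placeholders.

```shell
# Sketch: wrap "hadoop fs" so commands can be previewed before they run.
# hfs, inspect_file and DRY_RUN are assumptions of this example.

# Run "hadoop fs <args>", or only print the command when DRY_RUN=1.
hfs() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "hadoop fs $*"
  else
    hadoop fs "$@"
  fi
}

# List a directory, then show the last part of one file inside it.
# $1 is the directory, $2 is the file name.
inspect_file() {
  hfs -ls "$1"
  hfs -tail "$1/$2"
}
```

For example, DRY_RUN=1 inspect_file /user/hadoop input.txt prints the two commands without contacting the cluster; without DRY_RUN set, they are executed.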

Hadoop vs Spark

Hadoop addresses the reliable storage and processing of big data, meaning data too large for a single computer to store or to process within the required time.

HDFS provides highly reliable file storage on a cluster of ordinary PCs; by saving multiple copies of each block, it addresses the problem of server or hard […]

Apache Spark

Apache Spark 1.5.2 has been released. It is a maintenance release with stability fixes in several areas, mainly the DataFrame API, Spark Streaming, PySpark, R, Spark SQL, and MLlib.

Apache Spark is an open source cluster computing environment similar to Hadoop, but there are some differences between the two. These useful differences make […]

CentOS and RHEL 6.5 Hadoop 2.4 3 Node Server Cluster

The word Hadoop has been popular for many years; any mention of big data brings Hadoop to mind. So what role does Hadoop play? Official definition: Hadoop is a software platform for developing and running large-scale data processing applications. The core word is platform, which means we have a […]