An Introduction to the Apache Hadoop Command Set
Apache Hadoop Command Set
What types?
What are they?
What do they do?
Environment
Configuration
Hadoop commands: What types?
User commands
Administration commands
Generic options for all commands
Configuration options
Environment
Variables, e.g. HADOOP_PREFIX
Aliases, e.g. hls = hadoop fs -ls
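As a rough illustration of the generic and configuration options listed above, the sketch below prints a few invocations rather than running them, so it can be tried without a cluster. The `run` helper, the `DRY_RUN` switch, the file names and the paths are assumptions made for this example; `--config`, `-D` and `-conf` are the options documented in the Hadoop 1.x commands guide.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
# (run and DRY_RUN are helpers invented for this sketch.)
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

generic_option_examples() {
  # --config selects an alternative configuration directory (assumed path).
  run hadoop --config /etc/hadoop/conf fs -ls /
  # -D overrides a single configuration property for this invocation.
  run hadoop fs -D dfs.replication=2 -put notes.txt /user/hadoop/notes.txt
  # -conf loads an extra configuration file (hypothetical file name).
  run hadoop fs -conf ./my-site.xml -ls /user/hadoop
}

generic_option_examples
```

Setting `DRY_RUN=0` before calling the function would execute the commands for real, which only makes sense on a machine where `hadoop` is on the PATH.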
Hadoop commands: What are they?
User Commands
archive create a Hadoop archive (.har) from files
distcp copy files or directories recursively
fs file system commands
cat copies file to stdout
chgrp change group associated with file
chmod change file permissions
chown change file ownership
copyFromLocal copy from local file reference
copyToLocal copy to local file reference
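The file system commands on this slide are often used together. The following is a minimal sketch, assuming a local file `report.txt`, an HDFS user `hadoop` and a group `analysts` (all hypothetical); it prints the commands by default instead of running them.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
# (run and DRY_RUN are helpers invented for this sketch.)
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Upload a local file, set owner, group and mode, then show its contents.
upload_and_secure() {
  src=$1   # local file
  dst=$2   # HDFS destination path
  run hadoop fs -copyFromLocal "$src" "$dst"
  run hadoop fs -chown hadoop "$dst"      # assumed user name
  run hadoop fs -chgrp analysts "$dst"    # assumed group name
  run hadoop fs -chmod 640 "$dst"
  run hadoop fs -cat "$dst"
}

upload_and_secure report.txt /user/hadoop/report.txt
```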
Hadoop commands: What are they?
User Commands
fs file system commands
count count of directories, files and bytes
cp copy files
du size of files and directories
dus display a summary of file lengths
expunge empty trash
get copy files to local file system
getmerge like get, but merges the source files into one local file
ls file listing
lsr recursive ls
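To show how the listing and sizing commands above fit together, here is a small sketch that prints the invocations for one directory. The directory and output paths are assumptions, and the `run` helper exists only so the example can be read without a live cluster.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
# (run and DRY_RUN are helpers invented for this sketch.)
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# List a directory, then report its counts and sizes.
inspect_dir() {
  dir=$1
  run hadoop fs -ls "$dir"
  run hadoop fs -count "$dir"   # directories, files, bytes
  run hadoop fs -du "$dir"      # size of each entry
  run hadoop fs -dus "$dir"     # one summary line
}

# Concatenate every file under an HDFS directory into one local file.
fetch_merged() {
  run hadoop fs -getmerge "$1" "$2"
}

inspect_dir /user/hadoop
fetch_merged /user/hadoop/output ./merged.txt
```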
Hadoop commands: What are they?
User Commands
fs file system commands
mkdir make directory
moveFromLocal like put, but deletes the local source after copying
mv move from source to destination
put copy from the local file system to the destination file system
rm remove a file
rmr recursive delete
setrep change file replication factor
stat returns file stat information
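One way to picture the commands above is as the life cycle of a staging directory. The sketch below is illustrative only: the function name, file name and paths are hypothetical, and the commands are printed rather than executed unless `DRY_RUN=0` is set.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
# (run and DRY_RUN are helpers invented for this sketch.)
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Stage a local file in HDFS, adjust replication, then move it into place.
stage_file() {
  file=$1      # local file name
  staging=$2   # temporary HDFS directory
  final=$3     # final HDFS directory
  run hadoop fs -mkdir "$staging"
  run hadoop fs -put "$file" "$staging/$file"
  run hadoop fs -setrep -w 2 "$staging/$file"   # -w waits for replication
  run hadoop fs -stat "$staging/$file"
  run hadoop fs -mv "$staging/$file" "$final/$file"
  run hadoop fs -rmr "$staging"
}

stage_file data.csv /user/hadoop/staging /user/hadoop
```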
Hadoop commands: What are they?
User Commands
fs file system commands
tail display end of file
test check file existence / type
text output file as text
touchz create zero length file
fsck HDFS file system check
fetchdt get delegation token from name node
jar run jar file
job manage mapreduce jobs
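Because `fs -test` reports its result through the exit status, it combines naturally with `fs -touchz` in scripts. The following is a sketch under assumptions: the function names and the `HADOOP_CMD` override are invented for this example, and nothing is executed until one of the functions is called.

```shell
# Command used to reach Hadoop; can be overridden with a stub for testing.
# (HADOOP_CMD is a convention invented for this sketch.)
HADOOP_CMD=${HADOOP_CMD:-hadoop}

# Succeeds when the HDFS path exists (fs -test -e sets the exit status).
hdfs_exists() {
  "$HADOOP_CMD" fs -test -e "$1"
}

# Create a zero length marker file only if it is not already there.
hdfs_touch_once() {
  if hdfs_exists "$1"; then
    echo "already present: $1"
  else
    "$HADOOP_CMD" fs -touchz "$1" && echo "created: $1"
  fi
}
```

On a working cluster, `hdfs_touch_once /user/hadoop/_READY` (a hypothetical path) would create the marker on the first call and report it as already present afterwards.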
Hadoop commands: What are they?
User Commands
pipes run a pipes job
queue interact and view job queue
version get Hadoop version
CLASSNAME run class named CLASSNAME
classpath print the class path
Hadoop commands: What are they?
Administration Commands
balancer run cluster balancing
daemonlog get/set daemon log level
datanode run hdfs data node
dfsadmin run dfsadmin client
mradmin run map reduce admin client
jobtracker run mr jobtracker node
namenode run the name node
secondarynamenode run secondary name node
tasktracker run task tracker node
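A few of the administration commands above are typically run by hand to check on a cluster. The sketch below prints some common invocations; the host, port and logger name passed to `daemonlog` and the balancer threshold of 10 percent are assumptions chosen for illustration.

```shell
# Print each command instead of executing it unless DRY_RUN=0 is set.
# (run and DRY_RUN are helpers invented for this sketch.)
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

cluster_checks() {
  # Capacity and data node summary.
  run hadoop dfsadmin -report
  # Is the name node still in safe mode?
  run hadoop dfsadmin -safemode get
  # Rebalance until data nodes are within 10 percent of the average usage.
  run hadoop balancer -threshold 10
  # Read a daemon's log level (assumed host:port and logger name).
  run hadoop daemonlog -getlevel localhost:50070 org.apache.hadoop.hdfs.server.namenode.NameNode
}

cluster_checks
```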
Hadoop Environment
See the .bashrc for environment set up
##export HADOOP_HOME=/usr/local/hadoop ## deprecated
export HADOOP_PREFIX=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
unalias hfs &> /dev/null
alias hfs="hadoop fs"
unalias hls &> /dev/null ; alias hls="hfs -ls"
unalias hup1 &> /dev/null ; alias hup1="cd $HADOOP_PREFIX/bin ; ./start-dfs.sh"
unalias hup2 &> /dev/null ; alias hup2="cd $HADOOP_PREFIX/bin ; ./start-mapred.sh"
unalias hdwn1 &> /dev/null ; alias hdwn1="cd $HADOOP_PREFIX/bin ; ./stop-mapred.sh"
unalias hdwn2 &> /dev/null ; alias hdwn2="cd $HADOOP_PREFIX/bin ; ./stop-dfs.sh"
# if using LZO compression then add entry here for viewing
# LZO compressed files
##PATH=$PATH:$HADOOP_HOME/bin ## deprecated
PATH=$PATH:$HADOOP_PREFIX/bin
PATH=$PATH:$JAVA_HOME/bin
export PATH
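After sourcing a .bashrc like the one above, it can help to confirm that the exported variables point at real directories. This is a minimal sketch; the function name `check_hadoop_env` is invented here, and it checks only the two variables the slide exports.

```shell
# Report whether HADOOP_PREFIX and JAVA_HOME are set and point to directories.
# Returns 0 when both look usable, 1 otherwise.
check_hadoop_env() {
  status=0
  for var in HADOOP_PREFIX JAVA_HOME; do
    # Indirect lookup of the variable named in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "$var is not set"
      status=1
    elif [ ! -d "$value" ]; then
      echo "$var points to a missing directory: $value"
      status=1
    else
      echo "$var=$value"
    fi
  done
  return $status
}

check_hadoop_env || echo "environment needs attention before starting Hadoop"
```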
Hadoop Configuration
Configuration files under $HADOOP_PREFIX/conf
Initial set up in
core-site.xml
hdfs-site.xml
mapred-site.xml
Example from core-site.xml
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
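To check what value a setting such as hadoop.tmp.dir currently has, the site files can be read from the shell. The function below is a rough sketch rather than a real XML parser: `get_hadoop_property` is a name invented for this example, and it assumes the usual layout in which the name element appears before its value element.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = configuration file, $2 = property name
get_hadoop_property() {
  awk -v want="$2" '
    /<name>/ {
      # Remember the most recent property name seen.
      name = $0
      sub(/.*<name>/, "", name); sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      value = $0
      sub(/.*<value>/, "", value); sub(/<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$1"
}

# Demonstration against a throwaway copy of the example property.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
EOF
get_hadoop_property "$sample" hadoop.tmp.dir   # prints /app/hadoop/tmp
rm -f "$sample"
```

Against a real installation the first argument would be a file such as $HADOOP_PREFIX/conf/core-site.xml.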
Contact Us
Feel free to contact us at
www.semtech-solutions.co.nz
We offer IT project consultancy
We are happy to hear about your problems
You pay only for the hours you need to solve your problems