Tuesday, June 30, 2015

[Hadoop] Killing a running job

List the jobs
hadoop job -list

Kill a job (replace jobId with an ID from the list)
hadoop job -kill jobId
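
The two commands above can be chained when several jobs need to go at once. The following is only a sketch: the function names extract_job_ids and kill_all_jobs are made up for illustration, and it assumes the job IDs printed by the list command have the usual job_<timestamp>_<sequence> form.

```shell
# Print every MapReduce job ID found on stdin, one per line, without duplicates.
# Assumes IDs look like job_<digits>_<digits>.
extract_job_ids() {
  grep -oE 'job_[0-9]+_[0-9]+' | sort -u
}

# Kill every job reported by `hadoop job -list`.
# Assumes the hadoop command is on PATH; nothing runs until this is called.
kill_all_jobs() {
  hadoop job -list | extract_job_ids | while read -r job_id; do
    hadoop job -kill "$job_id"
  done
}
```

Running `hadoop job -list | extract_job_ids` first shows exactly which IDs kill_all_jobs would act on.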




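Tuesday, June 23, 2015

[Debian] Where the Java JDK lives on Debian + Hadoop & Spark settings


I recently needed to install Hadoop and Spark on a Banana Pi running Debian.

After copying the installation over from the master node, which runs CentOS, I found that the Java path was wrong.

The JDK was not in any of the places I would normally look. After a lot of searching I finally found it here: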

/usr/lib/jvm/jdk-7-oracle-armhf
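
A quicker way to locate the JDK is to follow the symlinks behind the java binary. This is a sketch, not what I actually did at the time: java_home_from_binary is a made-up helper name, and it assumes java is already on PATH and that readlink supports -f (GNU coreutils).

```shell
# Derive a JAVA_HOME candidate from the fully resolved path of a java binary
# by stripping a trailing /bin/java and, if present, a trailing /jre.
java_home_from_binary() {
  java_bin="$1"
  java_bin="${java_bin%/bin/java}"
  java_bin="${java_bin%/jre}"
  printf '%s\n' "$java_bin"
}

# On a real machine (assumption: java is on PATH):
# java_home_from_binary "$(readlink -f "$(command -v java)")"
```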



Then update the Hadoop settings in the following files:

hduser@banana01 ~ $ sudo vi /etc/profile
hduser@banana01 ~ $ vi /home/hduser/.bashrc
hduser@banana01 ~ $ vi /opt/hadoop/libexec/hadoop-config.sh
hduser@banana01 ~ $ vi /opt/hadoop/etc/hadoop/hadoop-env.sh
hduser@banana01 ~ $ vi /opt/hadoop/etc/hadoop/yarn-env.sh

The original setting looked roughly like this:
export JAVA_HOME=/usr/java/jdk1.7.0_65

On the Banana Pi, change it to:
export JAVA_HOME=/usr/lib/jvm/jdk-7-oracle-armhf
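
Instead of opening each file in vi, the same change can be scripted. A minimal sketch, assuming each file sets the variable on a line that starts with "export JAVA_HOME="; set_java_home is a hypothetical helper name, and the new path must not contain the characters | or &.

```shell
# Rewrite every "export JAVA_HOME=..." line in FILE so it points at NEW_HOME.
# The original is kept next to it as FILE.bak.
set_java_home() {
  file="$1"
  new_home="$2"
  cp "$file" "$file.bak" &&
    sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$new_home|" "$file.bak" > "$file"
}

# Example (run as a user who can write the file):
# set_java_home /opt/hadoop/etc/hadoop/hadoop-env.sh /usr/lib/jvm/jdk-7-oracle-armhf
```

Root-owned files such as /etc/profile need this to run from a root shell, because the output redirection happens with the calling user's permissions.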



Don't forget to give hduser ownership of the Hadoop and Spark directories:
sudo chown -R hduser:hadoop /opt/hadoop
sudo chown -R hduser:hadoop /opt/spark





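If a second, third, fourth, or more Pis need to join the cluster,
dump the system to an image file, write it to the other SD cards, and then apply the settings below.

Set the IP address on each one: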
sudo nano /etc/network/interfaces
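
For reference, a static address in /etc/network/interfaces is written as a short stanza. The sketch below only prints such a stanza so it can be reviewed before being pasted into the file. static_stanza is a made-up helper, and the interface name and addresses in the example are placeholders, not the values from my cluster.

```shell
# Print a static-address stanza in /etc/network/interfaces (ifupdown) syntax.
# Arguments: interface, address, netmask, gateway (all supplied by the caller).
static_stanza() {
  cat <<EOF
auto $1
iface $1 inet static
    address $2
    netmask $3
    gateway $4
EOF
}

# Example with placeholder values for a second node:
static_stanza eth0 192.168.1.102 255.255.255.0 192.168.1.1
```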

Set the hostname (for example, Banana02):
sudo nano /etc/hostname

Finally, don't forget the ownership of the Hadoop and Spark directories:
sudo chown -R hduser:hadoop /opt/hadoop
sudo chown -R hduser:hadoop /opt/spark



Once that is done, you can start Hadoop and Spark!



Monday, June 15, 2015

[Linux] CentOS commands for checking hardware information


Show CPU information
cat /proc/cpuinfo

How many cores does the CPU have? (This counts logical processors.)
cat /proc/cpuinfo|grep "model name"|wc -l
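
Counting "model name" lines works on the CentOS machines here, but some kernels (ARM boards in particular) may not print that line once per core. A slightly more robust sketch is to count the lowercase "processor" entries instead. count_processors is a made-up helper name.

```shell
# Count logical processors listed in a cpuinfo-style file (default: /proc/cpuinfo).
# Each logical processor is assumed to start with a lowercase "processor" line.
count_processors() {
  grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# On a real machine:
# count_processors
```

Where coreutils provides it, the nproc command reports the number of processors available to the current process and is usually the shortest option.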



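Show memory information
cat /proc/meminfo

What is the total memory? (Look at the MemTotal line; the pattern also matches other lines such as SwapTotal.)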
cat /proc/meminfo |grep "Total"
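
To get a single number instead of several matching lines, awk can pick out MemTotal and convert it. A small sketch: mem_total_mib is a made-up name, and it assumes the value in /proc/meminfo is given in kB, as it is on the systems I have checked.

```shell
# Print total memory in MiB (rounded down) from a meminfo-style file
# (default: /proc/meminfo). Assumes the MemTotal value is in kB.
mem_total_mib() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# On a real machine:
# mem_total_mib
```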


(Updates ongoing)

Sunday, June 7, 2015

[Linux] CentOS commands for changing the system date and time


Check the system time
date

Set the system time
date MMDDhhmmYYYY

MM: two-digit month
DD: two-digit day of the month
hh: two-digit hour (24-hour clock)
mm: two-digit minute
YYYY: four-digit year



[hduser@master01 spark]$ date
日  6月  7 00:05:59 CST 2015

[hduser@master01 spark]$ date 060716342015
日  6月  7 16:34:00 CST 2015
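
The MMDDhhmmYYYY string is easy to mistype, so it can help to generate it from a readable timestamp. A sketch assuming GNU date (the -d option), which is what CentOS ships. to_date_arg is a made-up helper name.

```shell
# Convert a readable timestamp into the MMDDhhmmYYYY form used above.
# Relies on GNU date's -d option to parse the input.
to_date_arg() {
  date -d "$1" +%m%d%H%M%Y
}

# The example from this post:
to_date_arg "2015-06-07 16:34"   # prints 060716342015
```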

Sunday, May 17, 2015

Cluster monitoring tools (updates ongoing)

Sematext - SPM

I think its UI looks very good, but it is a paid service.




Ambari includes Ganglia and Nagios

Installing a Hadoop Cluster with three Commands

Ambari (the graphical monitoring and management environment for Hadoop)


Ambari installation experience write-up


Quickly deploying a Hadoop big data environment with Ambari (使用Ambari快速部署Hadoop大数据环境)

http://www.cnblogs.com/scotoma/archive/2013/05/18/3085248.html

Introduction to Ganglia

http://www.ascc.sinica.edu.tw/iascc/articals.php?_section=2.4&_op=?articalID:5134


RPi-Monitor


A monitoring tool dedicated to the Raspberry Pi

  • CPU Loads
  • Network
  • Disk Boot
  • Disk Root
  • Swap
  • Memory
  • Uptime
  • Temperature




2015/05/17

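I am looking for a tool that monitors Hadoop and Spark performance as well as the cluster's power consumption.
So far I have not found an ideal solution.

[Paper Note] Papers related to the Raspberry Pi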


Heterogeneity: The Key to Achieve Power-Proportional Computing


da Costa, G. ; IRIT, Univ. de Toulouse, Toulouse, France

The Smart 2020 report on low carbon economy in the information age shows that 2% of the global CO2 footprint will come from ICT in 2020. Out of these, 18% will be caused by data-centers, while 45% will come from personal computers. Classical research to reduce this footprint usually focuses on new consolidation techniques for global data-centers. In reality, personal computers and private computing infrastructures are here to stay. They are subject to irregular workload, and are usually largely under-loaded. Most of these computers waste tremendous amount of energy as nearly half of their maximum power consumption comes from simply being switched on. The ideal situation would be to use proportional computers that use nearly 0W when lightly loaded. This article shows the gains of using a perfectly proportional hardware on different type of data-centers: 50% gains for the servers used during 98 World Cup, 20% to the already optimized Google servers. Gains would attain up to 80% for personal computers. As such perfect hardware still does not exist, a real platform composed of Intel I7, Intel Atom and Raspberry Pi is evaluated. Using this infrastructure, gains are of 20% for the World Cup data-center, 5% for Google data-centers and up to 60% for personal computers.
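This paper compares Intel processors with the Pi in terms of performance, so it can serve as a reference for comparisons across heterogeneous environments.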

Published in:

Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on

Date of Conference:

13-16 May 2013


Affordable and Energy-Efficient Cloud Computing Clusters: The Bolzano Raspberry Pi Cloud Cluster Experiment


Abrahamsson, P. ; Fac. of Comput. Sci., Free Univ. of Bozen-Bolzano, Bolzano, Italy ; Helmer, S. ; Phaphoom, N. ; Nicolodi, L. 

We present our ongoing work building a Raspberry Pi cluster consisting of 300 nodes. The unique characteristics of this single board computer pose several challenges, but also offer a number of interesting opportunities. On the one hand, a single Raspberry Pi can be purchased cheaply and has a low power consumption, which makes it possible to create an affordable and energy-efficient cluster. On the other hand, it lacks in computing power, which makes it difficult to run computationally intensive software on it. Nevertheless, by combining a large number of Raspberries into a cluster, this drawback can be (partially) offset. Here we report on the first important steps of creating our cluster: how to set up and configure the hardware and the system software, and how to monitor and maintain the system. We also discuss potential use cases for our cluster, the two most important being an inexpensive and green test bed for cloud computing research and a robust and mobile data center for operating in adverse environments.

Published in:

Cloud Computing Technology and Science (CloudCom), 2013 IEEE 5th International Conference on (Volume:2 )

Date of Conference:

2-5 Dec. 2013


Technical development and socioeconomic implications of the Raspberry Pi as a learning tool in developing countries


Ali, M. ; Sch. of Eng., Univ. of Warwick, Coventry, UK ; Vlaskamp, J.H.A. ; Eddin, N.N. ; Falconer, B. 

The recent development of the Raspberry Pi mini computer has provided new opportunities to enhance tools for education. The low cost means that it could be a viable option to develop solutions for education sectors in developing countries. This study describes the design, development and manufacture of a prototype solution for educational use within schools in Uganda whilst considering the social implications of implementing such solutions. This study aims to show the potential for providing an educational tool capable of teaching science, engineering and computing in the developing world. During the design and manufacture of the prototype, software and hardware were developed as well as testing performed to define the performance and limitation of the technology. This study showed that it is possible to develop a viable modular based computer systems for educational and teaching purposes. In addition to science, engineering and computing; this study considers the socioeconomic implications of introducing the EPi within developing countries. From a sociological perspective, it is shown that the success of EPi is dependant on understanding the social context, therefore a next phase implementation strategy is proposed.

Published in:

Computer Science and Electronic Engineering Conference (CEEC), 2013 5th

Date of Conference:

17-18 Sept. 2013



Raspberry PI Hadoop Cluster


A blog with an installation tutorial.