
One thing I forgot to mention in Part 1: a big advantage of deploying a big-data container platform with Kubernetes is that you barely need to think about inter-container networking. Whether the containers sit on the same physical server or on different ones, communication just works; all you have to do is expose the ports the containers care about and map the service names between them.
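A minimal sanity check, assuming the service name hdp-1-svc used in the manifests later in this post: from inside any of the Hadoop containers, the master should be reachable purely by that name through the cluster DNS.

getent hosts hdp-1-svc   # resolve the master's service name via the cluster DNS
ping -c 1 hdp-1-svc      # the base image ships iputils, so a plain reachability check also works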

This post walks through deploying a Hadoop 2.7.3 cluster; HA and federation are not covered for now. Docker is best suited to packaging stateless, single-process applications, so using it to deploy a distributed Hadoop cluster is fairly involved. The main questions to settle are:

  • 1. How many images need to be built?
  • 2. Which configuration is generic and can be baked directly into the images?
  • 3. How is configuration that changes replaced automatically and flexibly at container startup?
  • 4. Which service ports need to be exposed?
  • 5. Which directories need to be persisted?
  • 6. How is the NameNode format performed, and how do we avoid formatting it twice?
  • 7. Which scripts start which services?
  • 8. Which environment variables need to be set in advance?
  • 9. How are dependencies and startup order between Pods controlled? And so on.

The physical topology and Hadoop cluster roles are as follows:

Server    Address       Role
yuyan2    10.0.8.182    Docker image building, K8s YAML editing, kubectl client
HARBOR    10.10.4.57    Private image registry
ksp-1     10.10.4.56    NameNode, DataNode, ResourceManager, NodeManager
ksp-2     10.10.4.57    DataNode, NodeManager
ksp-5     10.10.4.60    DataNode, NodeManager

1. First build a base image containing Ubuntu 16.04 LTS, JDK 1.8.0_111, common tools (net-tools, iputils, vim, wget), passwordless SSH started at boot, clock synchronization with the host, and no firewall. Push this image to the Harbor private registry so the Hadoop images built later can reference it. The build process is omitted here; if you want to use the image, download it directly from this link: Ubuntu base image
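If you only pull the prebuilt base image, a quick sanity check might look like the following; the registry path is the one referenced by the Dockerfile further below, while the tag and the assumption that the base image has no blocking entrypoint are mine:

sudo docker pull registry.k8s.intra.knownsec.com/bigdata/ubuntu16.04_jdk1.8.0_111:0.0.2
sudo docker run --rm registry.k8s.intra.knownsec.com/bigdata/ubuntu16.04_jdk1.8.0_111:0.0.2 java -version   # should report 1.8.0_111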

2. Since Hadoop 2.x includes YARN, two images are built: a master image containing the NameNode and ResourceManager daemons, and a worker image containing the DataNode and NodeManager daemons.

Create a new hadoop2.7.3 directory with the following structure (a sketch of the layout follows the list):

  • Download hadoop-2.7.3.tar.gz from the official site;
  • Take the configuration from its etc/hadoop directory and put it into the conf folder; edit core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml, and empty the slaves file;
  • docker-entrypoint-namenode.sh is renamed to docker-entrypoint.sh when building the master image and starts the corresponding services;
  • docker-entrypoint-datanode.sh is renamed to docker-entrypoint.sh when building the worker image and starts the corresponding services;
  • Dockerfile, which defines environment variables, declares ports, and copies/replaces files.
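Put together, the build context described by the list above looks roughly like this:

hadoop2.7.3/
├── Dockerfile
├── hadoop-2.7.3.tar.gz
├── conf/                          # core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, slaves, ...
├── docker-entrypoint-namenode.sh  # renamed to docker-entrypoint.sh when building the master image
└── docker-entrypoint-datanode.sh  # renamed to docker-entrypoint.sh when building the worker image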

The following directories need to be persisted, which is reflected in the YAML files later on: hadoop.tmp.dir in core-site.xml, dfs.namenode.name.dir in hdfs-site.xml, and dfs.datanode.data.dir in hdfs-site.xml.
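In this setup those three properties are expected to point below the Hadoop home (tmp, dfs/name and dfs/data), which is exactly what the Deployments later mount from the host. A quick check inside a running container (just a sketch; the property names are the ones listed above):

grep -A 1 -E "hadoop.tmp.dir|dfs.namenode.name.dir|dfs.datanode.data.dir" \
    $HADOOP_CONF_DIR/core-site.xml $HADOOP_CONF_DIR/hdfs-site.xml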

For which configuration files are generic, which values need to be replaced dynamically, under what conditions the NameNode gets formatted, and which services the master and worker images each start, refer to the scripts behind this link: the two Hadoop images and related scripts

Build the two images and push them to the Harbor private registry so Kubernetes can pull them when orchestrating the containers:

➜  hadoop2.7.3 mv docker-entrypoint-namenode.sh docker-entrypoint.sh
➜  hadoop2.7.3 sudo docker build -t hadoop-2.7.3-namenode-resourcemanager:0.0.1 .
➜  hadoop2.7.3 sudo docker tag hadoop-2.7.3-namenode-resourcemanager:0.0.1  registry.k8s.intra.knownsec.com/bigdata/hadoop-2.7.3-namenode-resourcemanager:0.0.1
➜  hadoop2.7.3 sudo docker push registry.k8s.intra.knownsec.com/bigdata/hadoop-2.7.3-namenode-resourcemanager:0.0.1
➜  hadoop2.7.3 mv docker-entrypoint-datanode.sh docker-entrypoint.sh
➜  hadoop2.7.3 sudo docker build -t hadoop-2.7.3-datanode-nodemanager:0.0.1 .
➜  hadoop2.7.3 sudo docker tag hadoop-2.7.3-datanode-nodemanager:0.0.1  registry.k8s.intra.knownsec.com/bigdata/hadoop-2.7.3-datanode-nodemanager:0.0.1
➜  hadoop2.7.3 sudo docker push registry.k8s.intra.knownsec.com/bigdata/hadoop-2.7.3-datanode-nodemanager:0.0.1

The images have been pushed to the private registry:


3. With the steps above, the master and worker images are ready. Next, create a new hadoop directory to hold the YAML files and run the Kubernetes orchestration:


Because the containers need to talk to each other and also serve clients outside the cluster, we still use Deployments for orchestration. Each Pod runs one container, the Pods expose the relevant ports to one another, and every Pod can serve external traffic. The Pod running the master image listens on 0.0.0.0 on its own ports, while the Pods running the worker image use the master Pod's service name as the destination hostname. For the persistence details and the dynamic hostname substitution, see the YAML files:
YAML files for orchestrating the Hadoop cluster on K8s

Start the Pods in order (this could be wrapped in a script and run with one command):

➜  hadoop kubectl create -f hadoop-namenode-resourcemanager.yaml --validate=false
➜  hadoop kubectl create -f hadoop-datanode-nodemanager01.yaml --validate=false
➜  hadoop kubectl create -f hadoop-datanode-nodemanager02.yaml --validate=false
➜  hadoop kubectl create -f hadoop-datanode-nodemanager05.yaml --validate=false
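A minimal sketch of such a one-click script, assuming the app=hdp-1 label from the Deployment below and using a crude "Running" check before the workers are created:

#!/bin/bash
kubectl create -f hadoop-namenode-resourcemanager.yaml --validate=false
# wait for the master Pod before creating the workers
until kubectl get pods -l app=hdp-1 | grep -q Running; do sleep 5; done
for f in hadoop-datanode-nodemanager01.yaml hadoop-datanode-nodemanager02.yaml hadoop-datanode-nodemanager05.yaml; do
  kubectl create -f "$f" --validate=false
done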

When the master Pod's service name changes, all you need to do is update the hostname value in the ConfigMap of each worker Pod's YAML:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hdp-2-cm
data:
  hostname: "hdp-1-svc"
---
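Note that the value reaches the container as an environment variable via configMapKeyRef, so after editing the YAML the change only takes effect once the ConfigMap is re-applied and the Pod is recreated. A possible sequence (labels as in the manifests below):

kubectl apply -f hadoop-datanode-nodemanager01.yaml
kubectl delete pod -l app=hdp-2   # the Deployment recreates the Pod, which then picks up the new hostname value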

Verification:

  • The four Pods have been scheduled onto the designated physical servers:

  • The ports each Pod exposes externally:

  • Using host ksp-1's IP plus the 50070:30980/TCP port exposed by hdp-1-2386450527-9x6b9, the HDFS web UI shows the DataNodes:

  • Using host ksp-1's IP plus the 8088:32326/TCP port exposed by hdp-1-2386450527-9x6b9, the YARN web UI shows the NodeManagers:
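The same two checks can be done from the command line, assuming the NodePort mappings shown above (50070 -> 30980, 8088 -> 32326) and ksp-1's address from the topology table:

curl -s http://10.10.4.56:30980/dfshealth.html | head -n 5   # HDFS NameNode web UI
curl -s http://10.10.4.56:32326/cluster/nodes | head -n 5    # YARN ResourceManager node list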

Addendum:
A quick check of the cluster deployed above shows that basic HDFS functionality works, but running a MapReduce wordcount example fails with the following error:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount file:///hadoop-2.7.3/NOTICE.txt file:///hadoop-2.7.3/output2
17/07/03 05:32:10 INFO client.RMProxy: Connecting to ResourceManager at hdp-1-svc/12.0.112.23:8032
17/07/03 05:32:11 INFO input.FileInputFormat: Total input paths to process : 1
17/07/03 05:32:12 INFO mapreduce.JobSubmitter: number of splits:1
17/07/03 05:32:12 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1498909283586_0006
17/07/03 05:32:12 INFO impl.YarnClientImpl: Submitted application application_1498909283586_0006
17/07/03 05:32:12 INFO mapreduce.Job: The url to track the job: http://hdp-1-2386450527-9x6b9:8088/proxy/application_1498909283586_0006/
17/07/03 05:32:12 INFO mapreduce.Job: Running job: job_1498909283586_0006
17/07/03 05:32:18 INFO mapreduce.Job: Job job_1498909283586_0006 running in uber mode : false
17/07/03 05:32:18 INFO mapreduce.Job:  map 0% reduce 0%
17/07/03 05:32:19 INFO mapreduce.Job: Task Id : attempt_1498909283586_0006_m_000000_0, Status : FAILED
Container launch failed for container_1498909283586_0006_01_000002 : java.lang.IllegalArgumentException: java.net.UnknownHostException: hdp-2-1789154958-g0njg
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
	at org.apache.hadoop.security.SecurityUtil.setTokenService(SecurityUtil.java:356)
	at org.apache.hadoop.yarn.util.ConverterUtils.convertFromYarn(ConverterUtils.java:238)
	at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:266)
	at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:244)
	at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:129)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:409)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:375)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.UnknownHostException: hdp-2-1789154958-g0njg... 12 more

The cause of this error:
When a Pod is created, each container gets a randomly generated hostname. Up to now we have relied on the service name for communication between services, but MapReduce uses hostnames at run time, and after a container starts, its /etc/hosts maps only its own hostname to its own IP; it contains nothing about the other containers. There is also no practical way to add the other containers' hostname-to-IP mappings to /etc/hosts dynamically, so when the ApplicationMaster is allocated, the client's request cannot be answered.
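The symptom is easy to see from inside any of the containers (the Pod hostname below is the one from the log and is only an example):

cat /etc/hosts                        # contains only this container's own hostname-to-IP mapping
getent hosts hdp-2-1789154958-g0njg   # another Pod's random hostname does not resolve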

The fix is somewhat involved:
The Dockerfile and docker-entrypoint.sh of both the master and worker images have to be modified, both images rebuilt, the YAML files updated, and the cluster then brought up again following the original procedure.
The approach is to use a headless Service: drop the cluster IP and let Pods talk to each other directly via the containers' own IPs, and pin each Pod's hostname in the YAML so that it is identical to its svc-name. The downside is that the cluster is no longer directly reachable from outside (say, an external user who wants to open the HDFS web UI); you would need to run an Nginx container as a proxy that exposes the cluster's IPs and ports externally. Please search for that part yourself, it is not covered here.
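With a headless Service plus a pinned Pod hostname, the service name resolves straight to the Pod's own IP and matches the hostname MapReduce sees. A quick check from inside the containers after the fix (names as in the manifests below):

hostname                  # prints hdp-1-svc inside the master Pod, identical to its Service name
getent hosts hdp-2-svc    # the headless Service resolves directly to the worker Pod's IP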

The relevant changes are listed below; to rebuild the cluster afterwards, follow the steps described earlier. Also, I only modified two of the YAML files; the YAML for the remaining nodes can be changed in the same way.
Dockerfile:

FROM registry.k8s.intra.knownsec.com/bigdata/ubuntu16.04_jdk1.8.0_111:0.0.2
MAINTAINER Wang Liang <wangl8@knownsec.com>

ARG DISTRO_NAME=hadoop-2.7.3
ARG DISTRO_NAME_DIR=/hadoop-2.7.3

ENV HADOOP_HOME=$DISTRO_NAME_DIR
ENV HADOOP_PREFIX=$DISTRO_NAME_DIR
ENV HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
ENV YARN_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
ENV HADOOP_TMP_DIR=$HADOOP_HOME/tmp
ENV HADOOP_DFS_DIR=$HADOOP_HOME/dfs
ENV HADOOP_DFS_NAME_DIR=$HADOOP_DFS_DIR/name
ENV HADOOP_DFS_DATA_DIR=$HADOOP_DFS_DIR/data
ENV HADOOP_LOGS=$HADOOP_HOME/logs
ENV Master=localhost
ENV USER=root

USER root

# HDFS ports
EXPOSE 9000 50010 50020 50070 50075 50090 31010 8020
# MapReduce ports
EXPOSE 19888 10020
# YARN ports
EXPOSE 8030 8031 8032 8033 8040 8042 8088
# Other ports
EXPOSE 49707 2122

ADD hadoop-2.7.3.tar.gz /
WORKDIR $DISTRO_NAME_DIR

RUN rm -r -f $HADOOP_CONF_DIR
RUN mkdir -p "$HADOOP_TMP_DIR" "$HADOOP_DFS_NAME_DIR" "$HADOOP_DFS_DATA_DIR" "$HADOOP_LOGS"
ADD conf $HADOOP_CONF_DIR
COPY conf /conf_tmp

ENV PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

COPY docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]

docker-entrypoint-namenode.sh:

#!/bin/bash
source /etc/environment
source ~/.bashrc
source /etc/profile
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

sed "s/HOSTNAME/"$Master"/g" /conf_tmp/core-site.xml > $HADOOP_CONF_DIR/core-site.xml
sed "s/HOSTNAME/"$Master"/g" /conf_tmp/mapred-site.xml > $HADOOP_CONF_DIR/mapred-site.xml
sed "s/HOSTNAME/"$Master"/g" /conf_tmp/yarn-site.xml > $HADOOP_CONF_DIR/yarn-site.xml

# Format the NameNode only if the metadata directory is still empty
if [ "`ls -A $HADOOP_DFS_NAME_DIR`" = "" ]; then
  echo "$HADOOP_DFS_NAME_DIR is indeed empty"
  $HADOOP_PREFIX/bin/hdfs namenode -format
else
  echo "$HADOOP_DFS_NAME_DIR is not empty"
fi

$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode
$HADOOP_PREFIX/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver

/etc/init.d/ssh start

# Keep the container alive
while true; do sleep 1000; done

docker-entrypoint-datanode.sh:

#!/bin/bash
source /etc/environment
source ~/.bashrc
source /etc/profile
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

sed "s/HOSTNAME/"$Master"/g" /conf_tmp/core-site.xml > $HADOOP_CONF_DIR/core-site.xml
sed "s/HOSTNAME/"$Master"/g" /conf_tmp/mapred-site.xml > $HADOOP_CONF_DIR/mapred-site.xml
sed "s/HOSTNAME/"$Master"/g" /conf_tmp/yarn-site.xml > $HADOOP_CONF_DIR/yarn-site.xml

$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode
$HADOOP_PREFIX/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager

/etc/init.d/ssh start

# Keep the container alive
while true; do sleep 1000; done

hadoop-namenode-resourcemanager.yaml :

---
apiVersion: v1
kind: Service
metadata:
  name: hdp-1-svc
  labels:
    app: hdp-1-svc
spec:
  clusterIP: None  # note: this must be None to get a headless Service
  ports:
  - port: 9000
    name: hdfs
  - port: 50070
    name: hdfsweb
  - port: 19888
    name: jobhistory
  - port: 8088
    name: yarn
  - port: 50010
    name: hdfs2
  - port: 50020
    name: hdfs3
  - port: 50075
    name: hdfs5
  - port: 50090
    name: hdfs6
  - port: 10020
    name: mapred2
  - port: 8030
    name: yarn1
  - port: 8031
    name: yarn2
  - port: 8032
    name: yarn3
  - port: 8033
    name: yarn4
  - port: 8040
    name: yarn5
  - port: 8042
    name: yarn6
  - port: 49707
    name: other1
  - port: 2122
    name: other2
  - port: 31010
    name: hdfs7
  - port: 8020
    name: hdfs8
  selector:
    app: hdp-1
  type: NodePort
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hdp-1-cm
data:
  master: "0.0.0.0"
  hostname: "hdp-1-svc"
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: hdp-1
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: hdp-1
    spec:
      hostname: hdp-1-svc  # pin the container's hostname at startup; just make it the same as the svc-name
      nodeSelector:
        zk: zk-1
      containers:
      - name: myhadoop-nn-rm
        imagePullPolicy: Always
        image: registry.k8s.intra.knownsec.com/bigdata/hadoop-2.7.3-namenode-resourcemanager:0.0.1
        securityContext:
          privileged: true
        resources:
          requests:
            memory: "2Gi"
            cpu: "500m"
        ports:
        - containerPort: 9000
          name: hdfs
        - containerPort: 50010
          name: hdfs2
        - containerPort: 50020
          name: hdfs3
        - containerPort: 50070
          name: hdfsweb
        - containerPort: 50075
          name: hdfs5
        - containerPort: 50090
          name: hdfs6
        - containerPort: 19888
          name: jobhistory
        - containerPort: 10020
          name: mapred2
        - containerPort: 8030
          name: yarn1
        - containerPort: 8031
          name: yarn2
        - containerPort: 8032
          name: yarn3
        - containerPort: 8033
          name: yarn4
        - containerPort: 8040
          name: yarn5
        - containerPort: 8042
          name: yarn6
        - containerPort: 8088
          name: yarn
        - containerPort: 49707
          name: other1
        - containerPort: 2122
          name: other2
        - containerPort: 31010
          name: hdfs7
        - containerPort: 8020
          name: hdfs8
        env:
        - name: Master
          valueFrom:
            configMapKeyRef:
              name: hdp-1-cm
              key: master
        - name: HOSTNAME
          valueFrom:
            configMapKeyRef:
              name: hdp-1-cm
              key: hostname
#        readinessProbe:
#          exec:
#            command:
#            - "zkok.sh"
#          initialDelaySeconds: 10
#          timeoutSeconds: 5
#        livenessProbe:
#          exec:
#            command:
#            - "zkok.sh"
#          initialDelaySeconds: 10
#          timeoutSeconds: 5
        volumeMounts:
        - name: name
          mountPath: /hadoop-2.7.3/dfs/name
        - name: data
          mountPath: /hadoop-2.7.3/dfs/data
        - name: tmp
          mountPath: /hadoop-2.7.3/tmp
        - name: logs
          mountPath: /hadoop-2.7.3/logs
      volumes:
      - name: name
        hostPath:
          path: /home/data/bjrddata/hadoop/name021
      - name: data
        hostPath:
          path: /home/data/bjrddata/hadoop/data021
      - name: tmp
        hostPath:
          path: /home/data/bjrddata/hadoop/tmp021
      - name: logs
        hostPath:
          path: /home/data/bjrddata/hadoop/logs021

hadoop-datanode-nodemanager01.yaml :

---
apiVersion: v1
kind: Service
metadata:
  name: hdp-2-svc
  labels:
    app: hdp-2-svc
spec:
  clusterIP: None
  ports:
  - port: 9000
    name: hdfs
  - port: 50070
    name: hdfsweb
  - port: 19888
    name: jobhistory
  - port: 8088
    name: yarn
  - port: 50010
    name: hdfs2
  - port: 50020
    name: hdfs3
  - port: 50075
    name: hdfs5
  - port: 50090
    name: hdfs6
  - port: 10020
    name: mapred2
  - port: 8030
    name: yarn1
  - port: 8031
    name: yarn2
  - port: 8032
    name: yarn3
  - port: 8033
    name: yarn4
  - port: 8040
    name: yarn5
  - port: 8042
    name: yarn6
  - port: 49707
    name: other1
  - port: 2122
    name: other2
  - port: 31010
    name: hdfs7
  - port: 8020
    name: hdfs8
  selector:
    app: hdp-2
  type: NodePort
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hdp-2-cm
data:
  master: "hdp-1-svc"
  hostname: "hdp-2-svc"
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: hdp-2
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: hdp-2
    spec:
      hostname: hdp-2-svc
      nodeSelector:
        zk: zk-1
      containers:
      - name: myhadoop-dn-nm
        imagePullPolicy: Always
        image: registry.k8s.intra.knownsec.com/bigdata/hadoop-2.7.3-datanode-nodemanager:0.0.1
        securityContext:
          privileged: true
        resources:
          requests:
            memory: "2Gi"
            cpu: "500m"
        ports:
        - containerPort: 9000
          name: hdfs
        - containerPort: 50010
          name: hdfs2
        - containerPort: 50020
          name: hdfs3
        - containerPort: 50070
          name: hdfsweb
        - containerPort: 50075
          name: hdfs5
        - containerPort: 50090
          name: hdfs6
        - containerPort: 19888
          name: jobhistory
        - containerPort: 10020
          name: mapred2
        - containerPort: 8030
          name: yarn1
        - containerPort: 8031
          name: yarn2
        - containerPort: 8032
          name: yarn3
        - containerPort: 8033
          name: yarn4
        - containerPort: 8040
          name: yarn5
        - containerPort: 8042
          name: yarn6
        - containerPort: 8088
          name: yarn
        - containerPort: 49707
          name: other1
        - containerPort: 2122
          name: other2
        - containerPort: 31010
          name: hdfs7
        - containerPort: 8020
          name: hdfs8
        env:
        - name: Master
          valueFrom:
            configMapKeyRef:
              name: hdp-2-cm
              key: master
        - name: HOSTNAME
          valueFrom:
            configMapKeyRef:
              name: hdp-2-cm
              key: hostname
#        readinessProbe:
#          exec:
#            command:
#            - "zkok.sh"
#          initialDelaySeconds: 10
#          timeoutSeconds: 5
#        livenessProbe:
#          exec:
#            command:
#            - "zkok.sh"
#          initialDelaySeconds: 10
#          timeoutSeconds: 5
        volumeMounts:
        - name: name
          mountPath: /hadoop-2.7.3/dfs/name
        - name: data
          mountPath: /hadoop-2.7.3/dfs/data
        - name: tmp
          mountPath: /hadoop-2.7.3/tmp
        - name: logs
          mountPath: /hadoop-2.7.3/logs
      volumes:
      - name: name
        hostPath:
          path: /home/data/bjrddata/hadoop/name022
      - name: data
        hostPath:
          path: /home/data/bjrddata/hadoop/data022
      - name: tmp
        hostPath:
          path: /home/data/bjrddata/hadoop/tmp022
      - name: logs
        hostPath:
          path: /home/data/bjrddata/hadoop/logs022

Redeploy the cluster (note that the cluster IP is gone):
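Roughly what that looks like (illustrative output only, abbreviated; service names as in the manifests above):

kubectl get svc
# NAME        CLUSTER-IP   PORT(S)                           AGE
# hdp-1-svc   None         9000/TCP,50070/TCP,8088/TCP,...   1m
# hdp-2-svc   None         9000/TCP,50070/TCP,8088/TCP,...   1m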


Wordcount output:

root@hdp-5-svc:/hadoop-2.7.3# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount hdfs://hdp-1-svc:9000/test/NOTICE.txt hdfs://hdp-1-svc:9000/test/output01
17/07/03 09:53:57 INFO client.RMProxy: Connecting to ResourceManager at hdp-1-svc/192.168.25.14:8032
17/07/03 09:53:57 INFO input.FileInputFormat: Total input paths to process : 1
17/07/03 09:53:57 INFO mapreduce.JobSubmitter: number of splits:1
17/07/03 09:53:58 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1499075310680_0001
17/07/03 09:53:58 INFO impl.YarnClientImpl: Submitted application application_1499075310680_0001
17/07/03 09:53:58 INFO mapreduce.Job: The url to track the job: http://hdp-1-svc:8088/proxy/application_1499075310680_0001/
17/07/03 09:53:58 INFO mapreduce.Job: Running job: job_1499075310680_0001
17/07/03 09:54:04 INFO mapreduce.Job: Job job_1499075310680_0001 running in uber mode : false
17/07/03 09:54:04 INFO mapreduce.Job:  map 0% reduce 0%
17/07/03 09:54:09 INFO mapreduce.Job:  map 100% reduce 0%
17/07/03 09:54:14 INFO mapreduce.Job:  map 100% reduce 100%
17/07/03 09:54:15 INFO mapreduce.Job: Job job_1499075310680_0001 completed successfully
17/07/03 09:54:15 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=11392
        FILE: Number of bytes written=261045
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=15080
        HDFS: Number of bytes written=8969
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=2404
        Total time spent by all reduces in occupied slots (ms)=2690
        Total time spent by all map tasks (ms)=2404
        Total time spent by all reduce tasks (ms)=2690
        Total vcore-milliseconds taken by all map tasks=2404
        Total vcore-milliseconds taken by all reduce tasks=2690
        Total megabyte-milliseconds taken by all map tasks=2461696
        Total megabyte-milliseconds taken by all reduce tasks=2754560
    Map-Reduce Framework
        Map input records=437
        Map output records=1682
        Map output bytes=20803
        Map output materialized bytes=11392
        Input split bytes=102
        Combine input records=1682
        Combine output records=614
        Reduce input groups=614
        Reduce shuffle bytes=11392
        Reduce input records=614
        Reduce output records=614
        Spilled Records=1228
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=173
        CPU time spent (ms)=1800
        Physical memory (bytes) snapshot=470913024
        Virtual memory (bytes) snapshot=4067495936
        Total committed heap usage (bytes)=342360064
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=14978
    File Output Format Counters

The job completed successfully.


Author: 俺是亮哥
Link: http://www.jianshu.com/p/38606bbe138b
Source: Jianshu (简书)
Copyright belongs to the author. For commercial reproduction, please contact the author for authorization; for non-commercial reproduction, please credit the source.