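I will write a separate post on how to run clustering with Mahout when I have time; this post is mainly about the results that Mahout produces.
The Mahout version is 0.9. The data was neither normalized nor standardized, since this is only a test.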

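The output directory contains several folders: clusteredPoints, clusters-x, and clusters-(x+1)-final, where x is the iteration number. The result of each iteration is saved to clusters-x, and the result of the last iteration (x+1) is saved to clusters-(x+1)-final. clusteredPoints also holds the final clustering result, but the two store different things: clusters-(x+1)-final stores the clusters, while clusteredPoints stores the points. Details follow.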
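To make the layout concrete, here is a minimal Python sketch that picks the final-iteration folder out of a directory listing. The function name `find_final_clusters_dir` and the sample listing are hypothetical, and the `clusters-N-final` naming is assumed from the `clusters-2-final` path used in the commands in this post.

```python
import re

# Assumed naming, taken from the `clusters-2-final` path in this post:
# intermediate iterations are `clusters-N`, the last one is `clusters-N-final`.
_FINAL_RE = re.compile(r"^clusters-(\d+)-final$")


def find_final_clusters_dir(names):
    """Return (folder_name, iteration) for the final clusters folder, or None."""
    for name in names:
        match = _FINAL_RE.match(name)
        if match:
            return name, int(match.group(1))
    return None


if __name__ == "__main__":
    # Hypothetical listing; in practice this could come from os.listdir(output_dir).
    listing = ["clusteredPoints", "clusters-0", "clusters-1", "clusters-2-final"]
    print(find_final_clusters_dir(listing))
```

The returned folder is the one to pass to clusterdump with -i, while clusteredPoints is the one to pass with -p.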

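mahout clusterdump parses the ClusterWritable records and converts them into a readable file; -of selects the output format (TEXT, CSV, and so on; the full option list is pasted at the end of this post).
# Final clustering result (cluster name VL-x, center c, radius r, number of points in the cluster n)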
[root@drguo home]# mahout clusterdump -i file:///home/guo/Desktop/output/clusters-2-final -o /home/guo/Desktop/result
VL-0{n=7 c=[1.714, 2.286, 4.429, 0.857, 7.571] r=[2.185, 2.711, 6.884, 2.100, 5.233]}
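VL-1{n=3 c=[0.667, 8.667, 11.333, 5.333, 0.667, 4.333, 1.667, 3.333, 21.667] r=[0.943, 5.437, 5.185, 7.542, 0.943, 6.128, 2.357, 4.714, 9.428]}
# Final clustering result (key: the cluster the point belongs to; value: weight wt, distance, and the vector. The vector is a NamedVector, which carries a name, not a plain vector; I will also write separately about how to generate these.)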
[root@drguo clusteredPoints]# mahout seqdumper -i file:///home/guo/Desktop/output/clusteredPoints -o /home/guo/Desktop/points
Input Path: file:/home/guo/Desktop/output/clusteredPoints/part-m-0
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.clustering.classify.WeightedPropertyVectorWritable
Key: 0: Value: wt: 0.7140480784137244 distance: 6.885358615591935  vec: 001461E4-86C64780-A0B495C4-D19BA86F__201601 = [5.000, 6.000, 6.000]
Key: 1: Value: wt: 0.6106543697821432 distance: 11.445523142259598  vec: 001461E4-86C64780-A0B495C4-D19BA86F__201602 = [12.000, 15.000, 15.000]
Key: 1: Value: wt: 0.6113140078611051 distance: 11.775681155103799  vec: 001461E4-86C64780-A0B495C4-D19BA86F__201603 = [13.000, 15.000, 15.000]
Key: 0: Value: wt: 0.7140480784137244 distance: 6.885358615591935  vec: 001461E4-86C64780-A0B495C4-D19BA86F__201604 = [5.000, 6.000, 6.000]
Key: 0: Value: wt: 0.7643111018595771 distance: 6.010195419417895  vec: 001461E4-86C64780-A0B495C4-D19BA86F__201605 = [2.000, 4.000, 4.000]
Key: 0: Value: wt: 0.7408819961153278 distance: 7.529533687488249  vec: 001641C0-75CC4BC2-9E31CF60-C15627D2__201603 = [6.000, 6.000]
Key: 0: Value: wt: 0.7511412095733683 distance: 7.989789402348321  vec: 001641C0-75CC4BC2-9E31CF60-C15627D2__201604 = [1.000, 1.000]
Key: 0: Value: wt: 0.6648742191066574 distance: 9.264811638337692  vec: 001641C0-75CC4BC2-9E31CF60-C15627D2__201605 = [12.000, 12.000]
Key: 0: Value: wt: 0.53656917576395 distance: 17.373449130609547  vec: 001641C0-75CC4BC2-9E31CF60-C15627D2__201606 = [18.000, 18.000]
Key: 1: Value: wt: 0.5948320024451352 distance: 23.202011407059803  vec: 001641C0-75CC4BC2-9E31CF60-C15627D2__201608 = [2.000, 1.000, 4.000, 16.000, 2.000, 13.000, 5.000, 10.000, 35.000]
Count: 10
# Output the clusters together with their points
[root@drguo home]# mahout clusterdump -i file:///home/guo/Desktop/output/clusters-2-final -p file:///home/guo/Desktop/output/clusteredPoints -o /home/guo/Desktop/cluster-point
VL-0{n=7 c=[1.714, 2.286, 4.429, 0.857, 7.571] r=[2.185, 2.711, 6.884, 2.100, 5.233]}
    Weight : [props - optional]:  Point:
    0.7140480784137244 : [distance=6.885358615591935]: 001461E4-86C64780-A0B495C4-D19BA86F__201601 = [5.000, 6.000, 6.000]
    0.7140480784137244 : [distance=6.885358615591935]: 001461E4-86C64780-A0B495C4-D19BA86F__201604 = [5.000, 6.000, 6.000]
    0.7643111018595771 : [distance=6.010195419417895]: 001461E4-86C64780-A0B495C4-D19BA86F__201605 = [2.000, 4.000, 4.000]
    0.7408819961153278 : [distance=7.529533687488249]: 001641C0-75CC4BC2-9E31CF60-C15627D2__201603 = [6.000, 6.000]
    0.7511412095733683 : [distance=7.989789402348321]: 001641C0-75CC4BC2-9E31CF60-C15627D2__201604 = [1.000, 1.000]
    0.6648742191066574 : [distance=9.264811638337692]: 001641C0-75CC4BC2-9E31CF60-C15627D2__201605 = [12.000, 12.000]
    0.53656917576395 : [distance=17.373449130609547]: 001641C0-75CC4BC2-9E31CF60-C15627D2__201606 = [18.000, 18.000]
VL-1{n=3 c=[0.667, 8.667, 11.333, 5.333, 0.667, 4.333, 1.667, 3.333, 21.667] r=[0.943, 5.437, 5.185, 7.542, 0.943, 6.128, 2.357, 4.714, 9.428]}
    Weight : [props - optional]:  Point:
    0.6106543697821432 : [distance=11.445523142259598]: 001461E4-86C64780-A0B495C4-D19BA86F__201602 = [12.000, 15.000, 15.000]
    0.6113140078611051 : [distance=11.775681155103799]: 001461E4-86C64780-A0B495C4-D19BA86F__201603 = [13.000, 15.000, 15.000]
    0.5948320024451352 : [distance=23.202011407059803]: 001641C0-75CC4BC2-9E31CF60-C15627D2__201608 = [2.000, 1.000, 4.000, 16.000, 2.000, 13.000, 5.000, 10.000, 35.000]
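If the TEXT output needs further processing, the cluster summaries such as `VL-0{n=7 c=[...] r=[...]}` can be pulled apart with a regular expression. The following is only a sketch: `parse_cluster_summary` is a hypothetical helper, and it assumes the dense `c=[...]` and `r=[...]` form shown in this post; other vector layouts that clusterdump may print are not handled.

```python
import re

# Matches the summary part of a clusterdump TEXT line, e.g.
#   VL-0{n=7 c=[1.714, 2.286] r=[2.185, 2.711]}
# Assumption: centers and radii are printed as plain comma-separated numbers.
_CLUSTER_RE = re.compile(
    r"(?P<name>[A-Z]+-\d+)\{n=(?P<n>\d+) "
    r"c=\[(?P<c>[^\]]*)\] r=\[(?P<r>[^\]]*)\]\}"
)


def _floats(text):
    """Turn '1.0, 2.0' into [1.0, 2.0]; an empty string gives []."""
    return [float(part) for part in text.split(",") if part.strip()]


def parse_cluster_summary(line):
    """Return the cluster name, point count, center and radius, or None."""
    match = _CLUSTER_RE.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "n": int(match.group("n")),
        "center": _floats(match.group("c")),
        "radius": _floats(match.group("r")),
    }


if __name__ == "__main__":
    sample = ("VL-0{n=7 c=[1.714, 2.286, 4.429, 0.857, 7.571] "
              "r=[2.185, 2.711, 6.884, 2.100, 5.233]}")
    print(parse_cluster_summary(sample))
```

Because the pattern is applied with `search`, it also works when the point list follows the summary on the same line.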

Finally, here are the command-line options of both tools.

seqdumper

Job-Specific Options:
  --input (-i) input            Path to job input directory.
  --output (-o) output          The directory pathname for output.
  --substring (-b) substring    The number of chars to print out per value
  --count (-c)                  Report the count only
  --numItems (-n) numItems      Output at most <n> key value pairs
  --facets (-fa)                Output the counts per key.  Note, if there are
                                a lot of unique keys, this can take up a fair
                                amount of memory
  --quiet (-q)                  Print only file contents.
  --help (-h)                   Print out help
  --tempDir tempDir             Intermediate output directory
  --startPhase startPhase       First phase to run
  --endPhase endPhase           Last phase to run
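The --facets option is described above as reporting counts per key. For a file that seqdumper has already written, the same kind of tally can be sketched in Python by parsing the `Key: ... Value: ...` records shown earlier in this post. `parse_point_record` and `count_points_per_cluster` are hypothetical helpers, and the record layout is assumed to be exactly the one printed above for WeightedPropertyVectorWritable values.

```python
import re
from collections import Counter

# Assumed record layout, copied from the seqdumper output in this post:
#   Key: 0: Value: wt: 0.71 distance: 6.88  vec: <name> = [5.000, 6.000]
_RECORD_RE = re.compile(
    r"^Key: (?P<key>\d+): Value: wt: (?P<wt>\S+) distance: (?P<dist>\S+)\s+"
    r"vec: (?P<name>\S+) = \[(?P<vec>[^\]]*)\]$"
)


def parse_point_record(line):
    """Parse one seqdumper record; return None for header or footer lines."""
    match = _RECORD_RE.match(line.strip())
    if match is None:
        return None
    return {
        "cluster": int(match.group("key")),
        "weight": float(match.group("wt")),
        "distance": float(match.group("dist")),
        "name": match.group("name"),
        "vector": [float(v) for v in match.group("vec").split(",") if v.strip()],
    }


def count_points_per_cluster(lines):
    """Count how many parsed records fall into each cluster key."""
    counts = Counter()
    for line in lines:
        record = parse_point_record(line)
        if record is not None:
            counts[record["cluster"]] += 1
    return dict(counts)
```

Run over the ten records in this post, the tally gives 7 points for key 0 and 3 for key 1, which agrees with n=7 and n=3 in the cluster summaries.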

clusterdump

Job-Specific Options:
  --input (-i) input                         Path to job input directory.
  --output (-o) output                       The directory pathname for output.
  --outputFormat (-of) outputFormat          The optional output format for the
                                             results.  Options: TEXT, CSV, JSON or
                                             GRAPH_ML
  --substring (-b) substring                 The number of chars of the
                                             asFormatString() to print
  --numWords (-n) numWords                   The number of top terms to print
  --pointsDir (-p) pointsDir                 The directory containing points
                                             sequence files mapping input
                                             vectors to their cluster.  If
                                             specified, then the program will
                                             output the points associated with
                                             a cluster
  --samplePoints (-sp) samplePoints          Specifies the maximum number of
                                             points to include _per_ cluster.
                                             The default is to include all
                                             points
  --dictionary (-d) dictionary               The dictionary file
  --dictionaryType (-dt) dictionaryType      The dictionary file type
                                             (text|sequencefile)
  --evaluate (-e)                            Run ClusterEvaluator and
                                             CDbwEvaluator over the input.  The
                                             output will be appended to the
                                             rest of the output at the end.
  --distanceMeasure (-dm) distanceMeasure    The classname of the
                                             DistanceMeasure. Default is
                                             SquaredEuclidean
  --help (-h)                                Print out help
  --tempDir tempDir                          Intermediate output directory
  --startPhase startPhase                    First phase to run
  --endPhase endPhase                        Last phase to run
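When clusterdump is called from a script, the options above map directly onto an argument list. The sketch below only assembles that list; `build_clusterdump_args` is a hypothetical helper, it uses only the -i, -o, -p and -of flags listed above, and it assumes the `mahout` launcher is on the PATH of the machine that eventually runs it.

```python
# Output formats as listed in the clusterdump help text above.
_FORMATS = ("TEXT", "CSV", "JSON", "GRAPH_ML")


def build_clusterdump_args(clusters_dir, output_file,
                           points_dir=None, output_format=None):
    """Assemble an argv list for `mahout clusterdump` (hypothetical helper)."""
    args = ["mahout", "clusterdump", "-i", clusters_dir, "-o", output_file]
    if points_dir is not None:
        # -p adds the points that belong to each cluster to the output.
        args += ["-p", points_dir]
    if output_format is not None:
        if output_format not in _FORMATS:
            raise ValueError("unknown output format: %r" % (output_format,))
        args += ["-of", output_format]
    return args


if __name__ == "__main__":
    # Paths copied from the commands in this post.
    print(build_clusterdump_args(
        "file:///home/guo/Desktop/output/clusters-2-final",
        "/home/guo/Desktop/cluster-point",
        points_dir="file:///home/guo/Desktop/output/clusteredPoints",
        output_format="CSV",
    ))
```

The resulting list can be handed to `subprocess.run(args, check=True)`; building it as a list rather than a single string avoids shell quoting problems with the paths.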