Published 2016-03-11 06:02:25 | Source: reader submission


Apache Spark

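Spark is a general-purpose parallel computing framework, similar to Hadoop MapReduce, open-sourced by the UC Berkeley AMP Lab. Spark has the advantages of Hadoop MapReduce, but unlike MapReduce, a job's intermediate output can be kept in memory, so it no longer has to be written to and read back from HDFS. This makes Spark a better fit for iterative MapReduce-style algorithms, such as those used in data mining and machine learning.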
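To make the point about intermediate results concrete, here is a toy sketch in plain Python. It does not use Spark or HDFS: `CountingStore`, `step`, `run_via_storage` and `run_in_memory` are hypothetical names invented for this illustration, and the store only counts how often data crosses the "storage" boundary. It compares an iterative job that writes every intermediate result out and reads it back with one that keeps intermediate results in memory.

```python
class CountingStore:
    """Toy stand-in for a distributed file system such as HDFS (an assumption,
    not a real API). It only counts how many reads and writes happen."""

    def __init__(self):
        self.data = {}
        self.reads = 0
        self.writes = 0

    def write(self, key, records):
        self.writes += 1
        self.data[key] = list(records)

    def read(self, key):
        self.reads += 1
        return list(self.data[key])


def step(records):
    """One iteration of a made-up algorithm: move each value halfway to the mean."""
    mean = sum(records) / len(records)
    return [(x + mean) / 2 for x in records]


def run_via_storage(store, key, iterations):
    """Each iteration reads its input from storage and writes its output back."""
    for i in range(iterations):
        records = store.read(key)
        key = f"iter-{i}"
        store.write(key, step(records))
    return store.read(key)


def run_in_memory(store, key, iterations):
    """Read the input once, then keep every intermediate result in memory."""
    records = store.read(key)
    for _ in range(iterations):
        records = step(records)
    return records


if __name__ == "__main__":
    for runner in (run_via_storage, run_in_memory):
        store = CountingStore()
        store.write("input", [1.0, 2.0, 3.0, 6.0])
        result = runner(store, "input", 5)
        print(runner.__name__, "reads:", store.reads, "writes:", store.writes)
```

Both runners compute the same result; only the number of storage round trips differs, and that gap grows with the number of iterations. In real Spark code the same idea is usually expressed by persisting a dataset in memory and reusing it across iterations rather than by hand-written loops like these.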


Apache Spark 1.6.1 has been released.

New features

  • [SPARK-10359] - Enumerate Spark's dependencies in a file and diff against it for new pull requests

Bug fixes

  • [SPARK-7615] - MLLIB Word2Vec wordVectors divided by Euclidean Norm equals to zero

  • [SPARK-9844] - File appender race condition during SparkWorker shutdown

  • [SPARK-10524] - Decision tree binary classification with ordered categorical features: incorrect centroid

  • [SPARK-10847] - Pyspark - DataFrame - Optional Metadata with `None` triggers cryptic failure

  • [SPARK-11394] - PostgreDialect cannot handle BYTE types

  • [SPARK-11624] - Spark SQL CLI will set sessionstate twice

  • [SPARK-11972] - [Spark SQL] the value of 'hiveconf' parameter in CLI can't be got after enter spark-sql session

  • [SPARK-12006] - GaussianMixture.train crashes if an initial model is not None

  • [SPARK-12010] - Spark JDBC requires support for column-name-free INSERT syntax

  • [SPARK-12016] - word2vec load model can't use findSynonyms to get words

  • [SPARK-12026] - ChiSqTest gets slower and slower over time when number of features is large

  • [SPARK-12268] - pyspark shell uses execfile which breaks python3 compatibility

  • [SPARK-12300] - Fix schema inferance on local collections

  • [SPARK-12316] - Stack overflow with endless call of `Delegation token thread` when application end.

  • [SPARK-12327] - lint-r checks fail with commented code

For details, see: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12334009&styleName=Html&projectId=12315420&Create=Create&atl_token=A5KQ-2QAV-T4JA-FDED|a0202c18e71ce446af35a0775298cc3f2be9d54f|lin

Download: http://spark.apache.org/downloads.html


