Posted on 2015-05-26 17:13
天边蓝
I recently finished integrating SequoiaDB with Spark. The steps are recorded here for future reference.
Versions:
SequoiaDB version: 1.12
Spark version: 1.3.1
Integration steps:
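1. Configure hive-site.xml (a Hive requirement; can be skipped)
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///ocsdev/hadoop/apache-hive-1.1.0-bin/lib/hive-sequoiadb-apache.jar,file:///ocsdev/hadoop/apache-hive-1.1.0-bin/lib/sequoiadb.jar</value>
  <description>Sequoiadb store handler jar file</description>
</property>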
2. Set SPARK_CLASSPATH
export SPARK_CLASSPATH=/path/to/spark/lib/sequoiadb-driver-1.12.jar:/path/to/spark/lib/spark-sequoiadb_2.10-1.12.jar:/ocsdev/hadoop/spark-1.3.1-bin-hadoop2.6/lib:/path/to/spark/lib/mysql-connector-java-5.1.5-bin.jar
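A long classpath typed by hand is easy to get wrong (a doubled separator or a repeated directory name). As a minimal sketch, the value can be assembled entry by entry instead. The helper `join_classpath` and the variable `SPARK_LIB` are hypothetical names introduced here, and `/path/to/spark/lib` is the same placeholder used in the export line.

```shell
#!/bin/sh
# Join all arguments with ':' to form a classpath string.
join_classpath() {
    cp=""
    for entry in "$@"; do
        # ${cp:+$cp:} expands to "$cp:" only when cp is non-empty,
        # so no leading or doubled ':' is produced.
        cp="${cp:+$cp:}$entry"
    done
    printf '%s\n' "$cp"
}

# SPARK_LIB is an assumed variable; replace the placeholder with the real lib directory.
SPARK_LIB="${SPARK_LIB:-/path/to/spark/lib}"

SPARK_CLASSPATH="$(join_classpath \
    "$SPARK_LIB/sequoiadb-driver-1.12.jar" \
    "$SPARK_LIB/spark-sequoiadb_2.10-1.12.jar" \
    "$SPARK_LIB/mysql-connector-java-5.1.5-bin.jar")"
export SPARK_CLASSPATH

# Show the result so it can be checked by eye.
echo "$SPARK_CLASSPATH"
```

Keeping one entry per line makes a later change, such as a new driver version, a one-line edit.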
3. Copy sequoiadb-driver-1.12.jar and spark-sequoiadb_2.10-1.12.jar into Spark's lib directory.
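Before starting Spark, it can help to confirm that every classpath entry really exists on disk, since a missing jar typically only shows up later as a class-loading error. The following is a small sketch of such a check; `check_classpath` is a hypothetical helper, not part of Spark or SequoiaDB.

```shell
#!/bin/sh
# Report every entry of a ':'-separated classpath that does not exist.
# Returns 0 when all entries exist, 1 otherwise.
# Note: entries containing glob characters (*, ?) are not handled here.
check_classpath() {
    missing=0
    old_ifs=$IFS
    IFS=:
    for entry in $1; do
        # Skip empty entries produced by stray separators.
        [ -z "$entry" ] && continue
        if [ ! -e "$entry" ]; then
            echo "missing: $entry" >&2
            missing=1
        fi
    done
    IFS=$old_ifs
    return "$missing"
}

# Check whatever SPARK_CLASSPATH currently holds (an empty value passes trivially).
if check_classpath "${SPARK_CLASSPATH:-}"; then
    echo "all classpath entries found"
else
    echo "fix the entries listed above before starting Spark" >&2
fi
```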
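4. Configure the metastore storage in hive-site.xml
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.0.103:3306/hive?characterEncoding=UTF-8</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
</property>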
5. Create the mapping to the SequoiaDB collection
./spark_sql
> CREATE TABLE lw_test_sdb (id int, r5 double) USING com.sequoiadb.spark OPTIONS (host '192.168.0.103:11810,192.168.0.104:11810,192.168.0.102:11810', collectionspace 'hj', collection 'aws_min', username '', password '');
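When several collections need to be mapped, the statement can be generated rather than typed each time. The sketch below only prints the SQL; `sdb_create_table_sql` is a hypothetical helper, and the empty username and password simply mirror the statement in this step. Running the result non-interactively assumes the launcher accepts a Hive-CLI style `-e` option, as Spark's bundled `bin/spark-sql` does, so that line is left as a comment.

```shell
#!/bin/sh
# Print a CREATE TABLE statement that maps a SequoiaDB collection to a Spark SQL table.
# Arguments: table name, column list, comma-separated host:port list,
#            collection space, collection.
sdb_create_table_sql() {
    printf "CREATE TABLE %s (%s) USING com.sequoiadb.spark OPTIONS (host '%s', collectionspace '%s', collection '%s', username '', password '');\n" \
        "$1" "$2" "$3" "$4" "$5"
}

# Reproduce the statement used in this step.
sdb_create_table_sql lw_test_sdb 'id int, r5 double' \
    '192.168.0.103:11810,192.168.0.104:11810,192.168.0.102:11810' hj aws_min

# Assumed usage, not executed here:
# ./bin/spark-sql -e "$(sdb_create_table_sql lw_test_sdb 'id int, r5 double' '192.168.0.103:11810' hj aws_min)"
```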
6. Query the data
> select * from lw_test_sdb;
NULL 0.0
23 23.4
12 34.5
Time taken: 0.825 seconds, Fetched 3 row(s)