Sqoop Setup
Published: 2019-06-18


Sqoop version 1.99.7. # Note: this walkthrough ultimately failed at the final step of starting the job; the differences between Sqoop versions are substantial.

Download link: http://pan.baidu.com/s/1pKYrusz password: 7ib5

Before setting up Sqoop, the Hadoop and Java environments must already be configured.
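
A quick sanity check of those prerequisites before starting (a minimal sketch, not from the original post; assumes the install locations used below):

java -version        # the JDK should be on PATH

hadoop version       # Hadoop should be installed and on PATH

jps                  # the HDFS/YARN daemons should appear here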

After the first startup, killing the Hadoop processes caused various problems; rebooting the machine resolved them.

This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh

Error: Could not find or load main class org.apache.hadoop.hdfs.tools.GetConf

Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode

/usr/local/hadoop/sbin/hadoop-daemon.sh start namenode  ## once up, jps should show NameNode and DataNode

/usr/local/hadoop/sbin/yarn-daemon.sh start resourcemanager

## start the daemons service by service

Whatever warnings it prints can be ignored; this tripped me up once.
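
The jps comment above expects a DataNode too; if it is missing, it can be started the same way (my assumption for a single-node setup, not shown in the original):

/usr/local/hadoop/sbin/hadoop-daemon.sh start datanode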

1. Download the package

2. Extract the package

tar -xzvf sqoop-1.99.7-bin-hadoop200.tar.gz -C /usr/local/

cd /usr/local/

mv sqoop-1.99.7-bin-hadoop200 sqoop  # rename to sqoop so the /usr/local/sqoop paths used below resolve

3. Configure environment variables

export SQOOP_HOME=/usr/local/sqoop

export PATH=$PATH:$SQOOP_HOME/bin

Hadoop's environment variables are already set, but the Sqoop 2 series also needs the following ones; otherwise you will hit problems after a restart:

Can't load the Hadoop related java lib, please check the setting for the following environment variables:

    HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, HADOOP_MAPRED_HOME, HADOOP_YARN_HOME

export HADOOP_COMMON_HOME=$HADOOP_HOME/share/hadoop/common

export HADOOP_HDFS_HOME=$HADOOP_HOME/share/hadoop/hdfs

export HADOOP_MAPRED_HOME=$HADOOP_HOME/share/hadoop/mapreduce

export HADOOP_YARN_HOME=$HADOOP_HOME/share/hadoop/yarn
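
To make these variables survive a reboot, append them to a profile script and re-source it (a sketch, assuming system-wide configuration in /etc/profile and that HADOOP_HOME is already exported there):

cat >> /etc/profile <<'EOF'
export SQOOP_HOME=/usr/local/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
export HADOOP_COMMON_HOME=$HADOOP_HOME/share/hadoop/common
export HADOOP_HDFS_HOME=$HADOOP_HOME/share/hadoop/hdfs
export HADOOP_MAPRED_HOME=$HADOOP_HOME/share/hadoop/mapreduce
export HADOOP_YARN_HOME=$HADOOP_HOME/share/hadoop/yarn
EOF

source /etc/profile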

4. Edit the Sqoop configuration file

/usr/local/sqoop/conf/sqoop.properties, at line 144:

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop/

# This points at Hadoop's configuration directory; if it is wrong, verification fails with 'not a directory' or permission errors.
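
The property can also be set non-interactively (a sketch, assuming the line exists uncommented as in the stock file; back the file up first):

cp /usr/local/sqoop/conf/sqoop.properties /usr/local/sqoop/conf/sqoop.properties.bak

sed -i 's|^org.apache.sqoop.submission.engine.mapreduce.configuration.directory=.*|org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop/|' /usr/local/sqoop/conf/sqoop.properties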

5. Copy the JDBC driver into sqoop/lib

Copy mysql-connector-java-5.1.6-bin.jar into it.

Download link: http://pan.baidu.com/s/1qXIGeSG password: iykt
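
Then drop it into place (a sketch following the lib path this post uses; other 1.99.x layouts may expect a different directory):

cp mysql-connector-java-5.1.6-bin.jar /usr/local/sqoop/lib/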

6. Configure Hadoop proxy access

Sqoop accesses Hadoop's MapReduce through impersonation (a proxy user), so the accepted proxy user and group must be configured in Hadoop.

Edit Hadoop's core-site.xml; also set the allowed system user in container-executor.cfg:

vi /usr/local/hadoop/etc/hadoop/container-executor.cfg

allowed.system.users=hadoop

Then add the proxy-user properties to core-site.xml:

<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>

<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>
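
Changes to core-site.xml only take effect after the Hadoop daemons restart (a sketch using the standard sbin scripts):

/usr/local/hadoop/sbin/stop-dfs.sh && /usr/local/hadoop/sbin/start-dfs.sh

/usr/local/hadoop/sbin/stop-yarn.sh && /usr/local/hadoop/sbin/start-yarn.sh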

7. Fixing "Tool class org.apache.sqoop.tools.tool.VerifyTool has failed" (the YARN framework was not configured; the property below conventionally lives in mapred-site.xml):

 <property>

   <name>mapreduce.framework.name</name>

    <value>yarn</value>

 </property>

8. Verify the configuration is valid

bin/sqoop2-tool verify should return:

Verification was successful.

Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.

9. Start the Sqoop server

sqoop2-server start

Use the JDK's jps tool to check whether it started correctly; normally there will be a SqoopJettyServer process,

since the Sqoop server is built on Jetty.

10. Enter the Sqoop shell

Enter the client shell: ./sqoop2-shell

Problem: Exception in thread "main" java.lang.UnsatisfiedLinkError:

/usr/local/jdk/jre/lib/i386/xawt/libmawt.so: libXext.so.6: cannot open shared object file: No such file or directory

yum install glibc.i686 libXtst.i686

Problem 2:

 /usr/local/jdk/jre/lib/i386/xawt/libmawt.so: libXrender.so.1:

Another RPM needs to be installed, matching the system version (here 642.4.2.el6.x86_64).

Package download: http://down.51cto.com/data/2260998

Then: rpm -ivh libXrender-0.9.10-1.fc26.i686.rpm

Missing system packages are a real pitfall.

Finally, run sqoop2-shell again:

[root@nod2 bin]# ./sqoop2-shell

Setting conf dir: /usr/local/sqoop/bin/../conf

Sqoop home directory: /usr/local/sqoop

Sqoop Shell: Type 'help' or '\h' for help.

sqoop:000> 
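
By default the shell talks to a Sqoop server on the local machine; if the server runs elsewhere, point the shell at it first (a usage sketch; the host is a placeholder and 12000 is Sqoop 2's default port):

sqoop:000> set server --host <server-host> --port 12000 --webapp sqoop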

Testing a MySQL-to-Hadoop data import

1. Create a MySQL user

grant all privileges on *.* to sqoop@'192.168.%' identified by 'sqoop' with grant option;

mysql> create database sqoop;

Query OK, 1 row affected (0.02 sec)

mysql> use sqoop

Database changed

mysql> create table sqoop(id int,c_time timestamp,name varchar(20));

Query OK, 0 rows affected (0.10 sec)

mysql> insert into sqoop(id,name)value(1,'sqoop');

Query OK, 1 row affected (0.02 sec)

mysql> insert into sqoop(id,name)value(2,'hadoop');

Query OK, 1 row affected (0.02 sec)

mysql> insert into sqoop(id,name)value(3,'hive');

Query OK, 1 row affected (0.02 sec)

mysql> select * from sqoop;

+------+---------------------+--------+

| id   | c_time              | name   |

+------+---------------------+--------+

|    1 | 2016-11-22 15:04:04 | sqoop  |

|    2 | 2016-11-22 15:04:13 | hadoop |

|    3 | 2016-11-22 15:04:21 | hive   |

+------+---------------------+--------+

3 rows in set (0.00 sec)
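
Before wiring this into Sqoop, it is worth confirming the new grant works from the Sqoop server host (a sketch; the IP matches the JDBC connection string used below):

mysql -h 192.168.1.107 -usqoop -psqoop -e 'select * from sqoop.sqoop;'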

2. Test the Sqoop connections

First, understand that the commands in Sqoop 1.99.7 differ from those in other versions:

# Show version information

show version / show version --all

# List all of Sqoop's connectors (important)

sqoop:000> show connector

+------------------------+---------+------------------------------------------------------------+----------------------+

|          Name          | Version |                           Class                            | Supported Directions |

+------------------------+---------+------------------------------------------------------------+----------------------+

| oracle-jdbc-connector  | 1.99.7  | org.apache.sqoop.connector.jdbc.oracle.OracleJdbcConnector | FROM/TO              |

| sftp-connector         | 1.99.7  | org.apache.sqoop.connector.sftp.SftpConnector              | TO                   |

| kafka-connector        | 1.99.7  | org.apache.sqoop.connector.kafka.KafkaConnector            | TO                   |

| kite-connector         | 1.99.7  | org.apache.sqoop.connector.kite.KiteConnector              | FROM/TO              |

| ftp-connector          | 1.99.7  | org.apache.sqoop.connector.ftp.FtpConnector                | TO                   |

| hdfs-connector         | 1.99.7  | org.apache.sqoop.connector.hdfs.HdfsConnector              | FROM/TO              |

| generic-jdbc-connector | 1.99.7  | org.apache.sqoop.connector.jdbc.GenericJdbcConnector       | FROM/TO              |

+------------------------+---------+------------------------------------------------------------+----------------------+

# Show current links

show link

# Show jobs

show job

# Create an HDFS link; the procedure differs from other versions:

## in this version a link must be created against one of the connector types listed by show connector:

sqoop:000> create link -c hdfs-connector

Creating link for connector with name hdfs-connector

Please fill following values to create new link object

Name: hdfs_link

HDFS cluster

URI: hdfs://mycat:9000

Conf directory: /usr/local/hadoop/etc/hadoop

Additional configs:: 

There are currently 0 values in the map:

entry# 

New link was successfully created with validation status OK and name hdfs_link

sqoop:000> show link

+-----------+----------------+---------+

|   Name    | Connector Name | Enabled |

+-----------+----------------+---------+

| show link | hdfs-connector | true    |

| hdfs_link | hdfs-connector | true    |

+-----------+----------------+---------+

## Create a MySQL link

sqoop:000> create link  -connector generic-jdbc-connector

Creating link for connector with name generic-jdbc-connector

Please fill following values to create new link object

Name: mysql

Database connection

Driver class: com.mysql.jdbc.Driver

Connection String: jdbc:mysql://192.168.1.107/sqoop

Username: sqoop

Password: *******

Fetch Size: 

Connection Properties: 

There are currently 0 values in the map:

entry# protocol=tcp

There are currently 1 values in the map:

protocol = tcp

entry# 

Identifier enclose: 

New link was successfully created with validation status OK and name mysql

sqoop:000> show link

+-----------+------------------------+---------+

|   Name    |     Connector Name     | Enabled |

+-----------+------------------------+---------+

| show link | hdfs-connector         | true    |

| hdfs_link | hdfs-connector         | true    |

| mysql     | generic-jdbc-connector | true    |

+-----------+------------------------+---------+

With both links configured, the next step is to define the transfer job that will be submitted to MapReduce. In this version a job is created from one named link to the other:

sqoop:000> create job -f mysql -t hdfs_link

Name: mysql-hdfs

Database source

Schema name: sqoop

Table name: sqoop

SQL statement: 

Column names: 

There are currently 1 values in the list:

1

element# 

Partition column: 

Partition column nullable: 

Boundary query: 

Incremental read

Check column: 

Last value: 

Target configuration

Override null value: false

Null value: 

File format: 

  0 : TEXT_FILE

  1 : SEQUENCE_FILE

  2 : PARQUET_FILE

Choose: 2

Compression codec: 

  0 : NONE

  1 : DEFAULT

  2 : DEFLATE

  3 : GZIP

  4 : BZIP2

  5 : LZO

  6 : LZ4

  7 : SNAPPY

  8 : CUSTOM

Choose: 0

Custom codec: 

Output directory: hdfs:/home/sqoop

Append mode: false

Throttling resources

Extractors: 2

Loaders: 2

Classpath configuration

Extra mapper jars: 

There are currently 1 values in the list:

1

element# 

New job was successfully created with validation status OK  and name mysql-hdfs

sqoop:000> 

For comparison, here is an annotated job creation in the older, id-based syntax (annotations after -- translated from the original):

sqoop:000> create job -f 2 -t 1

Creating job for links with from id 1 and to id 6  

Please fill following values to create new job object  

Name: mysql_openfire  -- set the job name

FromJob configuration  

Schema name:(Required) sqoop  -- database name: required

Table name:(Required) sqoop  -- table name: required

Table SQL statement:(Optional)  -- optional

Table column names:(Optional)  -- optional

Partition column name:(Optional) id  -- optional

Null value allowed for the partition column:(Optional)  -- optional

Boundary query:(Optional)  -- optional

ToJob configuration  

Output format:

0 : TEXT_FILE

1 : SEQUENCE_FILE

Choose: 0  -- choose the output file format

Compression format:  

0 : NONE  

1 : DEFAULT  

2 : DEFLATE  

3 : GZIP  

4 : BZIP2  

5 : LZO  

6 : LZ4  

7 : SNAPPY  

8 : CUSTOM  

Choose: 0  -- choose the compression type

Custom compression format:(Optional)  -- optional

Output directory: hdfs:/ns1/sqoop  -- HDFS output directory (the destination)

Driver Config  

Extractors: 2  -- number of extractors

Loaders: 2  -- number of loaders

New job was successfully created with validation status OK and persistent id 1  

sqoop:000> show job

+----+------------+--------------------------------+----------------------------+---------+

| Id |    Name    |         From Connector         |        To Connector        | Enabled |

+----+------------+--------------------------------+----------------------------+---------+

| 1  | mysql-hdfs | mysql (generic-jdbc-connector) | hdfs_link (hdfs-connector) | true    |

+----+------------+--------------------------------+----------------------------+---------+

sqoop:000> 

Common command reference

sqoop:001> show link              -- list all links

sqoop:001> create link --cid 1    -- create a link

sqoop:000> delete link --lid 1    -- delete a link

sqoop:001> show job               -- list all jobs

sqoop:001> create job --f 2 --t 1 -- create a job (import from link 2 into link 1)

sqoop:000> start job --jid 1      -- start a job

sqoop:000> status job --jid 1     -- check import status

sqoop:000> delete job --jid 1     -- delete a job
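
Note that the id-based flags above are from older 1.99.x releases; 1.99.7 addresses links and jobs by name with -n, as the transcripts above do (a sketch using the job created earlier):

sqoop:000> start job -n mysql-hdfs

sqoop:000> status job -n mysql-hdfs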

ERROR tool.ImportTool: Encountered IOException running import job:

java.io.IOException: No columns to generate for ClassWriter

This is a connector problem; switch connectors.

Exception in thread "main" java.lang.IncompatibleClassChangeError:

Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

The Hadoop and Sqoop versions do not match.

Exception has occurred during processing command

Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

This message alone says nothing useful; enable verbose output to see the underlying error:

set option --name verbose --value true

##### From creating the LINK to the JOB, and finally to failure...

sqoop:000> update job -n cmd

Updating job with name cmd

Please update job:

Name: cmd

Database source

Schema name: sqoop

Table name: test

SQL statement: 

Column names: 

There are currently 1 values in the list:

proctor=tcp

element# 

Partition column: 

Partition column nullable: 

Boundary query: 

Incremental read

Check column: 

Last value: 

Target configuration

Override null value: 

Null value: 

File format: 

  0 : TEXT_FILE

  1 : SEQUENCE_FILE

  2 : PARQUET_FILE

Choose: 0

Compression codec: 

  0 : NONE

  1 : DEFAULT

  2 : DEFLATE

  3 : GZIP

  4 : BZIP2

  5 : LZO

  6 : LZ4

  7 : SNAPPY

  8 : CUSTOM

Choose: 0

Custom codec: 

Output directory: hdfs://mycat:9000/home/sqoop

Append mode: 

Throttling resources

Extractors: 2

Loaders: 2

Classpath configuration

Extra mapper jars: 

There are currently 1 values in the list:

proctor=tcp

element# 

Job was successfully updated with status OK

Caused by: Exception: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException Message: 

You have an error in your SQL syntax; check the manual that corresponds to your MySQL 

server version for the right syntax to use near '"sqoop"."test"' at line 1

Stack trace:

at  sun.reflect.GeneratedConstructorAccessor77 (null:-1)  

at  sun.reflect.DelegatingConstructorAccessorImpl (DelegatingConstructorAccessorImpl.java:45)  

at  java.lang.reflect.Constructor (Constructor.java:526)  

at  com.mysql.jdbc.Util (Util.java:406)  

at  com.mysql.jdbc.Util (Util.java:381)  

at  com.mysql.jdbc.SQLError (SQLError.java:1030)  

at  com.mysql.jdbc.SQLError (SQLError.java:956)  

at  com.mysql.jdbc.MysqlIO (MysqlIO.java:3491)  

at  com.mysql.jdbc.MysqlIO (MysqlIO.java:3423)  

at  com.mysql.jdbc.MysqlIO (MysqlIO.java:1936)  

at  com.mysql.jdbc.MysqlIO (MysqlIO.java:2060)  

at  com.mysql.jdbc.ConnectionImpl (ConnectionImpl.java:2542)  

at  com.mysql.jdbc.PreparedStatement (PreparedStatement.java:1734)  

at  com.mysql.jdbc.PreparedStatement (PreparedStatement.java:1885)  

at  org.apache.sqoop.connector.jdbc.GenericJdbcFromInitializer (GenericJdbcFromInitializer.java:257)  

at  org.apache.sqoop.connector.jdbc.GenericJdbcFromInitializer (GenericJdbcFromInitializer.java:61)  

at  org.apache.sqoop.connector.jdbc.GenericJdbcFromInitializer (GenericJdbcFromInitializer.java:47)  

at  org.apache.sqoop.driver.JobManager$1 (JobManager.java:520)  

at  org.apache.sqoop.driver.JobManager$1 (JobManager.java:517)  

at  org.apache.sqoop.utils.ClassUtils (ClassUtils.java:281)  

at  org.apache.sqoop.driver.JobManager (JobManager.java:516)  

at  org.apache.sqoop.driver.JobManager (JobManager.java:423)  

at  org.apache.sqoop.driver.JobManager (JobManager.java:317)  

at  org.apache.sqoop.handler.JobRequestHandler (JobRequestHandler.java:353)  

at  org.apache.sqoop.handler.JobRequestHandler (JobRequestHandler.java:114)  

at  org.apache.sqoop.server.v1.JobServlet (JobServlet.java:84)  

at  org.apache.sqoop.server.SqoopProtocolServlet (SqoopProtocolServlet.java:81)  

at  javax.servlet.http.HttpServlet (HttpServlet.java:710)  

at  javax.servlet.http.HttpServlet (HttpServlet.java:790)  

at  org.eclipse.jetty.servlet.ServletHolder (ServletHolder.java:808)  

at  org.eclipse.jetty.servlet.ServletHandler$CachedChain (ServletHandler.java:1669)  

at  org.apache.hadoop.security.authentication.server.AuthenticationFilter (AuthenticationFilter.java:595)  

at  org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter (DelegationTokenAuthenticationFilter.java:291)  

at  org.apache.hadoop.security.authentication.server.AuthenticationFilter (AuthenticationFilter.java:554)  

at  org.eclipse.jetty.servlet.ServletHandler$CachedChain (ServletHandler.java:1652)  

at  org.eclipse.jetty.servlet.ServletHandler (ServletHandler.java:585)  

at  org.eclipse.jetty.server.handler.ContextHandler (ContextHandler.java:1127)  

at  org.eclipse.jetty.servlet.ServletHandler (ServletHandler.java:515)  

at  org.eclipse.jetty.server.handler.ContextHandler (ContextHandler.java:1061)  

at  org.eclipse.jetty.server.handler.ScopedHandler (ScopedHandler.java:141)  

at  org.eclipse.jetty.server.handler.HandlerWrapper (HandlerWrapper.java:97)  

at  org.eclipse.jetty.server.Server (Server.java:499)  

at  org.eclipse.jetty.server.HttpChannel (HttpChannel.java:310)  

at  org.eclipse.jetty.server.HttpConnection (HttpConnection.java:257)  

at  org.eclipse.jetty.io.AbstractConnection$2 (AbstractConnection.java:540)  

at  java.util.concurrent.ThreadPoolExecutor (ThreadPoolExecutor.java:1145)  

at  java.util.concurrent.ThreadPoolExecutor$Worker (ThreadPoolExecutor.java:615)  

at  java.lang.Thread (Thread.java:745)  
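
A likely root cause, from reading the trace (my interpretation, not stated in the original post): the generic-jdbc-connector encloses identifiers in double quotes by default, generating "sqoop"."test", which MySQL rejects unless ANSI_QUOTES is enabled; the stray proctor=tcp entry left in the job's column-name list cannot have helped either. A sketch of the fix: update the MySQL link, enter a single space at the Identifier enclose prompt, and clear the bogus column entry from the job:

sqoop:000> update link -n mysql

... at the prompt:

Identifier enclose:   (type a single space instead of accepting the default)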

This article was reprinted from the DBAspace 51CTO blog; original link: http://blog.51cto.com/dbaspace/1875955


查看>>