-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME
94 lines (60 loc) · 2.47 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
Have you ever wanted to send your MySQL query results directly to a file
in HDFS? Well, now you can!
Pipefish sends the result of your SQL statement to a tab-delimited file in hdfs.
It doesn't create any intermediate or temporary files.
It just reads the rows from the MySQL server and writes them as tab-delimited
fields to a file in HDFS that you specify.
At the end of the row-set, pipefish flushes and closes the hdfs file and
closes the MySQL connection.
Usage:
$ pf --help
args:
--defaults_file='/path/to/my.cnf'
--user='user'
--host='host'
--db='db name'
--password='password'
--hdfs_path='/path/to/file'
--sql='sql statement'
[--overwrite]
user, host, db, and password settings override those specified in defaults_file.
Supply --overwrite to truncate the file instead of appending to it.
B U I L D I N S T R U C T I O N S
===================================
Linux
=========
Set JAVA_HOME.
Install Java architecture independent libraries. For example, openjdk-6-jre-lib.
Install MySQL development libs.
Install the libhdfs development bits from Cloudera.
If your dev environment uses 'parcels' installed by Cloudera Manager,
make sure your host configured as an HDFS gateway.
That will get you the HDFS headers and libs that you'll need.
-- Otherwise --
Configure your package manager by downloading and installing the "1-click install" package.
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/cdh5ig_cdh5_install.html?scroll=concept_gp2_q32_24_unique_1
And then:
$ sudo apt-get install libhdfs0-dev (Ubuntu)
$ sudo yum install hadoop-libhdfs-devel (Linux)
Build and install pipefish.
$ make
$ sudo make install
`make install' puts it in /usr/local/bin.
On a Mac
========
Install Java
Install MySQL development libs.
Build the HDFS libs from a Cloudera tarball distribution.
Get your gcc toolchain installed along with cmake, ant, and maven.
Download and upack the CDH5 tarball. For example:
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.3.0-cdh5.1.0.tar.gz
Build just the hdfs lib that we need:
$ cd hadoop-2.3.0-cdh5.1.0/src/hadoop-hdfs-project/hadoop-hdfs
$ sudo mvn package -Pdist,native -DskipTests
Built OK? Put the headers and libs in a place that GCC can find them:
$ sudo cp `find target/native -name libhdfs\.*` /usr/lib
$ sudo cp `find target -name hdfs.h` /usr/include
Build and install it.
$ make
$ sudo make install
`make install' puts it in /usr/local/bin.