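1.1. Custom Sink Overview
A sink is the Flume component that determines where data is delivered. Flume ships with many built-in sinks, but when none of them fits, you can write a custom sink that stores the data wherever you need it, for example in Kafka, MySQL, or a file.
The requirement here: data is sent to a network port, a netcat source receives it from that port, and a custom sink takes the events from the channel and saves them to a local file.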
1.2. Custom Sink Implementation
The custom MySink class:
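package cn.itcast.flumesink;

import java.io.File;

import org.apache.commons.io.FileUtils;
import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;

public class MySink extends AbstractSink implements Configurable {
    private String filePath = "";
    private String fileName = "";

    // Called once at initialization; reads the sink's parameters from the
    // Context and makes sure the target directory exists.
    @Override
    public void configure(Context context) {
        filePath = context.getString("filePath");
        fileName = context.getString("fileName");
        File fileDir = new File(filePath);
        if (!fileDir.exists()) {
            fileDir.mkdirs();
        }
    }

    // Called repeatedly by Flume; each call handles at most one event
    // inside its own channel transaction.
    @Override
    public Status process() throws EventDeliveryException {
        Channel channel = getChannel();
        Transaction transaction = channel.getTransaction();
        transaction.begin();
        try {
            Event event = channel.take();
            if (event == null) {
                // The channel is empty: finish the transaction and back off
                // instead of spinning in a busy loop.
                transaction.commit();
                return Status.BACKOFF;
            }
            // The netcat source delivers one event per input line without the
            // line terminator, so append one to keep events on separate lines.
            String line = new String(event.getBody(), "UTF-8") + System.lineSeparator();
            FileUtils.write(new File(filePath, fileName), line, "UTF-8", true);
            transaction.commit();
            return Status.READY;
        } catch (Exception e) {
            // The write failed: roll back so the event stays in the channel.
            transaction.rollback();
            e.printStackTrace();
            return Status.BACKOFF;
        } finally {
            transaction.close();
        }
    }
}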
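The take / commit / rollback sequence in process() can be tried out without Flume on the classpath. The following is only a minimal sketch: QueueChannel, drainOne, and the local Status enum are hypothetical stand-ins invented for this illustration, not Flume classes, and the sketch assumes that an empty channel should be answered with BACKOFF rather than by waiting.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

public class TransactionSketch {
    enum Status { READY, BACKOFF }

    /** Hypothetical stand-in for a channel: a take() is provisional until commit(). */
    static class QueueChannel {
        private final Deque<String> queue = new ArrayDeque<>();
        private String inFlight; // event taken but not yet committed

        void put(String event) { queue.addLast(event); }

        String take() {
            inFlight = queue.pollFirst(); // null when the channel is empty
            return inFlight;
        }

        void commit() { inFlight = null; } // the taken event is gone for good

        void rollback() {
            if (inFlight != null) {
                queue.addFirst(inFlight); // put the event back for a retry
                inFlight = null;
            }
        }

        int size() { return queue.size(); }
    }

    /** One process()-style pass: take one event, deliver it, then commit or roll back. */
    static Status drainOne(QueueChannel channel, Consumer<String> writer) {
        String event = channel.take();
        if (event == null) {
            channel.commit();
            return Status.BACKOFF; // nothing to deliver right now
        }
        try {
            writer.accept(event);
            channel.commit();
            return Status.READY;
        } catch (RuntimeException e) {
            channel.rollback(); // delivery failed, keep the event
            return Status.BACKOFF;
        }
    }

    public static void main(String[] args) {
        QueueChannel channel = new QueueChannel();
        channel.put("hello");
        List<String> written = new ArrayList<>();
        System.out.println(drainOne(channel, written::add)); // event delivered
        System.out.println(drainOne(channel, written::add)); // channel now empty
    }
}
```

The point of the sketch is the ordering: the event only leaves the channel once the write has succeeded, so a failed write leaves it available for the next call.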
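1.3. Functional Testing
Package the code into a jar with a packaging plugin and place the jar in Flume's lib directory. Make sure the commons dependency that the sink relies on is bundled into the jar; the FileUtils class used by the sink comes from commons-io.
Write the Flume configuration file: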
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = node-1
a1.sources.r1.port = 5678
# Describe the sink
a1.sinks.k1.type = cn.itcast.flumesink.MySink
a1.sinks.k1.filePath = /export/servers
a1.sinks.k1.fileName = filesink.txt
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
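The custom keys filePath and fileName reach configure() without their a1.sinks.k1. prefix, which is why the sink reads them with context.getString("filePath"). The sketch below imitates that prefix stripping with plain java.util.Properties. It is an illustration only: sinkContext is a hypothetical helper written for this example, not Flume's own configuration parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class SinkContextSketch {
    /** Hypothetical helper: collect the keys under "<agent>.sinks.<sink>." with the prefix removed. */
    static Map<String, String> sinkContext(Properties props, String agent, String sink) {
        String prefix = agent + ".sinks." + sink + ".";
        Map<String, String> context = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                // "a1.sinks.k1.filePath" becomes "filePath"
                context.put(key.substring(prefix.length()), props.getProperty(key).trim());
            }
        }
        return context;
    }

    public static void main(String[] args) throws IOException {
        // The sink-related lines of the configuration file above.
        String conf = String.join("\n",
                "a1.sinks.k1.type = cn.itcast.flumesink.MySink",
                "a1.sinks.k1.filePath=/export/servers",
                "a1.sinks.k1.fileName=filesink.txt",
                "a1.sinks.k1.channel = c1");
        Properties props = new Properties();
        props.load(new StringReader(conf));
        // Print each key as the sink would see it, sorted by name.
        sinkContext(props, "a1", "k1").forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

A misspelled key in the configuration file (for example filepath instead of filePath) would simply be missing from this map, and the sink would read null for it.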
1.4. Start Flume and Test with telnet
bin/flume-ng agent -c conf -f conf/filesink.conf -n a1 -Dflume.root.logger=INFO,console
Run telnet node-1 5678 to connect to the port on that machine, then type some data.
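If telnet is not installed, a few lines of Java can play the same role. This is a minimal sketch, assuming the source accepts newline-terminated UTF-8 text on the configured port; NetcatClientSketch and sendLines are names made up for this example, and the client does not read any acknowledgement the source may send back.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class NetcatClientSketch {
    /** Opens one connection and sends every line, each terminated by a newline. */
    static void sendLines(String host, int port, List<String> lines) throws IOException {
        try (Socket socket = new Socket(host, port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            for (String line : lines) {
                out.write(line);
                out.write('\n'); // one line of text per event
            }
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: NetcatClientSketch <host> <port>");
            return;
        }
        sendLines(args[0], Integer.parseInt(args[1]), List.of("hello flume", "second line"));
    }
}
```

With the agent running, the sketch could be started with the host and port from the configuration (node-1 and 5678); the sent lines should then show up in /export/servers/filesink.txt.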