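- The Spooling Directory Source lets you ingest data by placing files into a designated spooling ("auto-collect") directory. The source watches that directory and parses new files as they appear. The event parsing logic is pluggable. Once a file has been fully read into the channel, it is either renamed or, optionally, deleted. This example uses renaming.
- Note that a file must not be modified after it has been placed in the spooling directory; if it is, Flume reports an error. File names must also be unique: if a file whose name has already been used is placed in the directory, Flume reports an error.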
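One way to respect both rules above (no modification after placement, no reused names) is to finish writing the file somewhere else and then move it into the spooling directory under a unique name. The following is a minimal sketch, not part of the original setup: the function name `drop_into_spool` and the staging directory are hypothetical, and it assumes the staging directory sits on the same filesystem as the spooling directory (so the rename is a single step) but outside of it.

```python
import os
import shutil
import time
import uuid


def drop_into_spool(src_path, spool_dir, staging_dir):
    """Copy src_path into spool_dir under a unique name.

    The copy is written to staging_dir first, so the spooling directory
    never contains a half-written file. (Hypothetical helper.)
    """
    os.makedirs(staging_dir, exist_ok=True)

    # Prefix a timestamp and a random token so names are never reused.
    base = os.path.basename(src_path)
    unique_name = "{}-{}-{}".format(
        time.strftime("%Y%m%d%H%M%S"), uuid.uuid4().hex[:8], base
    )

    staged = os.path.join(staging_dir, unique_name)
    shutil.copyfile(src_path, staged)  # the slow part happens outside the spool dir

    final = os.path.join(spool_dir, unique_name)
    os.rename(staged, final)  # assumes both dirs are on the same filesystem
    return final
```

For example, `drop_into_spool("/var/log/app/app.log", "/opt/data", "/opt/data_staging")` would place a uniquely named copy in `/opt/data`, the `spoolDir` used in this article; the other two paths are made up for illustration.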
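Properties (the full list is long, so only the essential ones are shown here; see the official documentation for the rest):
!type – the source type; must be set to "spooldir"
!spoolDir – the directory to read files from, i.e. the spooling directory
fileSuffix – default .COMPLETED; the suffix appended to files that have been fully processed
- Example: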
1. Create the corresponding HBase table
hbase(main):017:0> create 'test01','cf_log'
2. Write the configuration file
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /opt/data
a1.sources.r1.batchSize = 1
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
a1.channels.c1.byteCapacityBufferPercentage = 20
a1.channels.c1.byteCapacity =
a1.sinks.k1.batchSize = 1
a1.sinks.k1.type = hbase
a1.sinks.k1.table = test01
a1.sinks.k1.columnFamily = cf_log
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a1.sinks.k1.serializer.rowKeyIndex = 0
a1.sinks.k1.serializer.regex=(\\d+-\\d\\d-\\d\\d\\s\\d+:\\d+:\\d+)\\s.*cmd\\\":\\\"(\\d+)\\\".*\\\"phoneNum\\\":\\\"(\\d{0,13})\\\",\\\"username\\\":\\\"([a-z0-9_-]{3,16})\\\",\\\"name\\\":\\\"(.*)\\\",\\\"workOrgName\\\":\\\"(.*)\\\",\\\"workOrg\\\":\\\"(\\d+).*$
a1.sinks.k1.serializer.colNames=ROW_KEY,cmd,phone,userName,name,workOrgName,workOrg
a1.sinks.k1.channel = c1
a1.sinks.k1.kerberosPrincipal = hbase/
a1.sinks.k1.kerberosKeytab = /etc/security/keytabs/hbase.service.keytab
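To see how the serializer settings fit together, it can help to try the regex outside Flume. The sketch below uses Python's `re` module as a stand-in for the Java regex engine and mimics how the capture groups line up with `colNames`, with the group at `rowKeyIndex` used as the row key instead of a column. It is only an approximation of the serializer's behavior: `parse_line` is a hypothetical helper, the pattern is written as it looks once the properties file has turned each `\\` into `\`, and it assumes the whole line has to match. The sample line is modeled on the log file used in this article; the `cmd` value `1001` is made up, because the pattern requires at least one digit there.

```python
import re

# serializer.regex after properties-file unescaping
LOG_PATTERN = re.compile(
    r'(\d+-\d\d-\d\d\s\d+:\d+:\d+)\s.*cmd\":\"(\d+)\".*'
    r'\"phoneNum\":\"(\d{0,13})\",\"username\":\"([a-z0-9_-]{3,16})\",'
    r'\"name\":\"(.*)\",\"workOrgName\":\"(.*)\",\"workOrg\":\"(\d+).*$'
)

# serializer.colNames and serializer.rowKeyIndex from the configuration
COL_NAMES = ["ROW_KEY", "cmd", "phone", "userName", "name", "workOrgName", "workOrg"]
ROW_KEY_INDEX = 0

# Illustrative input; the cmd value is an assumption.
SAMPLE_LINE = (
    '2017-08-18 10:16:10 [http-bio-7080-exec-16] DEBUG - <==请求报文:'
    '{"cmd":"1001","common":{"user":{"phoneNum":"","username":"admin",'
    '"name":"admin","workOrgName":"广州市市民服务和社会保障卡管理中心",'
    '"workOrg":"0000001"}},"params":{"batchType":"1"}} (WebServiceImpl.java:53)'
)


def parse_line(line):
    """Return (row_key, {column: value}) or None if the line does not match."""
    match = LOG_PATTERN.fullmatch(line)
    if match is None:
        return None
    row_key = None
    columns = {}
    for index, name in enumerate(COL_NAMES):
        value = match.group(index + 1)  # group 1 corresponds to colNames[0]
        if index == ROW_KEY_INDEX:
            row_key = value  # used as the row key, not stored as a column
        else:
            columns[name] = value
    return row_key, columns


if __name__ == "__main__":
    print(parse_line(SAMPLE_LINE))
```

Because the row key here is the timestamp with one-second resolution, two log lines written within the same second would map to the same row.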
Start Flume (example):
flume-ng agent -n a1 -c ../conf -f test01.conf -Dflume.root.logger=DEBUG,console
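After a successful start, Flume parses every file in the directory named in the configuration file into HBase. Parsed files are renamed, and any new file added to the directory is parsed automatically.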
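A quick way to check progress is to look at which files in the spooling directory already carry the completion suffix. This is a small sketch under the assumption that the default `.COMPLETED` suffix is in use; `split_spool_files` is a hypothetical helper, not a Flume tool.

```python
import os


def split_spool_files(spool_dir, suffix=".COMPLETED"):
    """Return (completed, pending) file names found directly in spool_dir."""
    completed, pending = [], []
    for name in sorted(os.listdir(spool_dir)):
        if not os.path.isfile(os.path.join(spool_dir, name)):
            continue  # skip subdirectories
        if name.endswith(suffix):
            completed.append(name)
        else:
            pending.append(name)
    return completed, pending
```

Calling `split_spool_files("/opt/data")` on the directory from this example returns the renamed (already parsed) files and the ones still waiting.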
Source file:
2017-08-18 10:16:10 [http-bio-7080-exec-16] DEBUG - <==请求报文:{"cmd":"","common":{"user":{"phoneNum":"","username":"admin","name":"admin","workOrgName":"广州市市民服务和社会保障卡管理中心","workOrg":"0000001"}},"params":{"batchType":"1"}} (WebServiceImpl.java:53)
2017-08-18 10:17:55 [http-bio-7080-exec-16] DEBUG - <==请求报文:{"cmd":"","common":{"user":{"phoneNum":"","username":"admin","name":"admin","workOrgName":"广州市市民服务和社会保障卡管理中心","workOrg":"0000001"}},"params":{"batchType":"1"}} (WebServiceImpl.java:53)
2017-08-18 10:18:18 [http-bio-7080-exec-16] DEBUG - <==请求报文:{"cmd":"","common":{"user":{"phoneNum":"","username":"admin","name":"admin","workOrgName":"广州市市民服务和社会保障卡管理中心","workOrg":"0000001"}},"params":{"batchType":"1"}} (WebServiceImpl.java:53)
2017-08-18 10:19:17 [http-bio-7080-exec-16] DEBUG - <==请求报文:{"cmd":"","common":{"user":{"phoneNum":"","username":"admin","name":"admin","workOrgName":"广州市市民服务和社会保障卡管理中心","workOrg":"0000001"}},"params":{"batchType":"1"}} (WebServiceImpl.java:53)
2017-08-18 10:20:30 [http-bio-7080-exec-16] DEBUG - <==请求报文:{"cmd":"","common":{"user":{"phoneNum":"","username":"admin","name":"admin","workOrgName":"广州市市民服务和社会保障卡管理中心","workOrg":"0000001"}},"params":{"batchType":"1"}} (WebServiceImpl.java:53)
2017-08-18 10:23:40 [http-bio-7080-exec-16] DEBUG - <==请求报文:{"cmd":"","common":{"user":{"phoneNum":"","username":"admin","name":"admin","workOrgName":"广州市市民服务和社会保障卡管理中心","workOrg":"0000001"}},"params":{"batchType":"1"}} (WebServiceImpl.java:53)
....
Query results in HBase after parsing:
2017-08-18 10:20:30 column=cf_log:cmd, timestamp=94, value=
2017-08-18 10:20:30 column=cf_log:name, timestamp=94, value=admin
2017-08-18 10:20:30 column=cf_log:phone, timestamp=94, value=
2017-08-18 10:20:30 column=cf_log:userName, timestamp=94, value=admin
2017-08-18 10:20:30 column=cf_log:workOrg, timestamp=94, value=0000001
2017-08-18 10:20:30 column=cf_log:workOrgName, timestamp=94, value=\xE5\xB9\xBF\xE5\xB7\x9E\xE5\xB8\x82\xE5\xB8\x82\xE6\xB0\x91\xE6\x9C\x8D\xE5\x8A\xA1\xE5\x92\x8C\xE7\xA4\xBE\xE4\xBC\x9A\xE4\xBF\x9D\xE9\x9A\x9C\xE5\x8D\xA1\xE7\xAE\xA1\xE7\x90\x86\xE4\xB8\xAD\xE5\xBF\x83
2017-08-18 10:23:40 column=cf_log:cmd, timestamp=96, value=
2017-08-18 10:23:40 column=cf_log:name, timestamp=96, value=admin
2017-08-18 10:23:40 column=cf_log:phone, timestamp=96, value=
2017-08-18 10:23:40 column=cf_log:userName, timestamp=96, value=admin
2017-08-18 10:23:40 column=cf_log:workOrg, timestamp=96, value=0000001
2017-08-18 10:23:40 column=cf_log:workOrgName, timestamp=96, value=\xE5\xB9\xBF\xE5\xB7\x9E\xE5\xB8\x82\xE5\xB8\x82\xE6\xB0\x91\xE6\x9C\x8D\xE5\x8A\xA1\xE5\x92\x8C\xE7\xA4\xBE\xE4\xBC\x9A\xE4\xBF\x9D\xE9\x9A\x9C\xE5\x8D\xA1\xE7\xAE\xA1\xE7\x90\x86\xE4\xB8\xAD\xE5\xBF\x83
......
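The `workOrgName` values above look garbled, but they are simply the UTF-8 bytes of the Chinese organization name, printed by the HBase shell as `\xNN` escapes. The sketch below turns such an escaped string back into text to confirm that the data was stored intact. `decode_hbase_shell_value` is a hypothetical helper, and it assumes the value was written as UTF-8 and that every character outside the escapes is plain ASCII.

```python
import re

HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def decode_hbase_shell_value(escaped):
    """Convert a string containing \\xNN escapes back into text (UTF-8 assumed)."""
    raw = bytearray()
    pos = 0
    for match in HEX_ESCAPE.finditer(escaped):
        raw += escaped[pos:match.start()].encode("ascii")  # literal characters
        raw.append(int(match.group(1), 16))                 # one escaped byte
        pos = match.end()
    raw += escaped[pos:].encode("ascii")
    return raw.decode("utf-8")


# The workOrgName value as shown in the query result above
ESCAPED = (
    r"\xE5\xB9\xBF\xE5\xB7\x9E\xE5\xB8\x82\xE5\xB8\x82\xE6\xB0\x91\xE6\x9C\x8D"
    r"\xE5\x8A\xA1\xE5\x92\x8C\xE7\xA4\xBE\xE4\xBC\x9A\xE4\xBF\x9D\xE9\x9A\x9C"
    r"\xE5\x8D\xA1\xE7\xAE\xA1\xE7\x90\x86\xE4\xB8\xAD\xE5\xBF\x83"
)

if __name__ == "__main__":
    print(decode_hbase_shell_value(ESCAPED))
```

The decoded text is the same `workOrgName` that appears in the source log lines.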
