Flume: Startup and Basic Usage
Posted by fonxian
Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple extensible data model that allows for online analytic application.
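In short, Flume is a distributed, highly reliable, highly available service for efficiently collecting, aggregating, and transporting massive volumes of log data, built on a simple, flexible architecture based on streaming data flows.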
A Flume source consumes events delivered to it by an external source like a web server. The external source sends events to Flume in a format that is recognized by the target Flume source. For example, an Avro Flume source can be used to receive Avro events from Avro clients or other Flume agents in the flow that send events from an Avro sink. A similar flow can be defined using a Thrift Flume Source to receive events from a Thrift Sink or a Flume Thrift Rpc Client or Thrift clients written in any language generated from the Flume thrift protocol.

When a Flume source receives an event, it stores it into one or more channels. The channel is a passive store that keeps the event until it’s consumed by a Flume sink. The file channel is one example: it is backed by the local filesystem. The sink removes the event from the channel and puts it into an external repository like HDFS (via Flume HDFS sink) or forwards it to the Flume source of the next Flume agent (next hop) in the flow. The source and sink within the given agent run asynchronously with the events staged in the channel.
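The source, channel, and sink roles described above map directly onto an agent's configuration file. The following is a minimal sketch, not taken from the original post: the agent name a1, the component names r1/c1/k1, the port, the local directories, and the HDFS URL are all hypothetical placeholders to adapt to your environment.

```properties
# Hypothetical agent "a1": Avro source -> file channel -> HDFS sink.
# All names, ports, and paths below are illustrative assumptions.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Avro source: receives Avro events from Avro clients or from the
# Avro sink of an upstream Flume agent.
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
# A source may write to one or more channels (note the plural key).
a1.sources.r1.channels = c1

# File channel: a passive store backed by the local filesystem.
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data

# HDFS sink: removes events from the channel and writes them to HDFS.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/events
a1.sinks.k1.hdfs.fileType = DataStream
# A sink reads from exactly one channel (note the singular key).
a1.sinks.k1.channel = c1
```

To forward events to the next hop instead of HDFS, the sink type would be avro, pointed at the host and port of the downstream agent's Avro source.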
In the figure below, Flume's role is to read data from the server's local disk in real time and write it to HDFS.
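One way to set up this local-disk-to-HDFS flow is an exec source that tails a log file, feeding an HDFS sink through a memory channel. This is a sketch under stated assumptions rather than the configuration used in the original post: the agent name a2, the log path /var/log/app/app.log, the file name flume-hdfs.conf, and the HDFS URL are hypothetical.

```properties
# Hypothetical agent "a2": tail a local log file and write it to HDFS.
# Assuming this file is saved as conf/flume-hdfs.conf, the agent can be
# started with:
#   bin/flume-ng agent --conf conf --conf-file conf/flume-hdfs.conf --name a2
a2.sources = r1
a2.channels = c1
a2.sinks = k1

# Exec source: runs a command and turns each output line into an event.
# The log path is an assumed example.
a2.sources.r1.type = exec
a2.sources.r1.command = tail -F /var/log/app/app.log
a2.sources.r1.channels = c1

# Memory channel: fast, but buffered events are lost if the agent dies.
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# HDFS sink: one directory per day. The time escapes need a timestamp,
# so the agent's local time is used here.
a2.sinks.k1.type = hdfs
a2.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/logs/%Y%m%d
a2.sinks.k1.hdfs.useLocalTimeStamp = true
a2.sinks.k1.hdfs.fileType = DataStream
a2.sinks.k1.channel = c1
```

The exec source offers no delivery guarantee if the agent or the tail process fails, so where data loss matters, a file channel or a more robust source is worth considering.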
References
Flume official website
尚硅谷 (Atguigu) Big Data course: Flume
Flume User Guide