public class HdfsSpout extends BaseRichSpout
Constructor and Description |
---|
HdfsSpout() |
Modifier and Type | Method and Description |
---|---|
void | ack(Object msgId): Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed. |
void | close(): Called when an ISpout is going to be shut down. |
void | declareOutputFields(OutputFieldsDeclarer declarer): Declare the output schema for all the streams of this topology. |
protected void | emitData(List<Object> tuple, org.apache.storm.hdfs.spout.HdfsSpout.MessageId id) |
void | fail(Object msgId): The tuple emitted by this spout with the msgId identifier has failed to be fully processed. |
SpoutOutputCollector | getCollector() |
org.apache.hadoop.fs.Path | getLockDirPath() |
void | nextTuple(): When this method is called, Storm is requesting that the Spout emit tuples to the output collector. |
void | open(Map<String,Object> conf, TopologyContext context, SpoutOutputCollector collector): Called when a task for this component is initialized within a worker on the cluster. |
HdfsSpout | setArchiveDir(String archiveDir) |
HdfsSpout | setBadFilesDir(String badFilesDir) |
HdfsSpout | setClocksInSync(boolean clocksInSync) |
HdfsSpout | setCommitFrequencyCount(int commitFrequencyCount) |
HdfsSpout | setCommitFrequencySec(int commitFrequencySec) |
HdfsSpout | setHdfsUri(String hdfsUri) |
HdfsSpout | setIgnoreSuffix(String ignoreSuffix) |
HdfsSpout | setLockDir(String lockDir) |
HdfsSpout | setLockTimeoutSec(int lockTimeoutSec) |
HdfsSpout | setMaxOutstanding(int maxOutstanding) |
HdfsSpout | setReaderType(String readerType) |
HdfsSpout | setSourceDir(String sourceDir) |
HdfsSpout | withConfigKey(String configKey): Sets the key name under which HDFS options are placed. |
HdfsSpout | withOutputFields(String... fields): Output field names. |
HdfsSpout | withOutputStream(String streamName): Sets the output stream name. |
activate, deactivate
getComponentConfiguration
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public HdfsSpout setCommitFrequencyCount(int commitFrequencyCount)
public HdfsSpout setCommitFrequencySec(int commitFrequencySec)
public HdfsSpout setMaxOutstanding(int maxOutstanding)
public HdfsSpout setLockTimeoutSec(int lockTimeoutSec)
public HdfsSpout setClocksInSync(boolean clocksInSync)
public HdfsSpout withOutputFields(String... fields)
Output field names. The number of fields depends on the reader type.
public HdfsSpout withConfigKey(String configKey)
Sets the key name under which HDFS options are placed (similar to the HDFS bolt). The default key name is ‘hdfs.config’.
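Each setter and `with...` method listed above returns HdfsSpout, which allows calls to be chained. The following is a minimal sketch of that pattern only: `FluentSpoutConfigSketch` is a hypothetical stand-in that mirrors a few of the method names and the documented default key name, and it does not use or reproduce the Storm class. The path `/data/in`, the timeout value, and the field name `line` are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mirrors a few HdfsSpout method names; not the Storm class.
class FluentSpoutConfigSketch {
    private String configKey = "hdfs.config"; // default key name stated in the description above
    private String sourceDir;
    private int lockTimeoutSec;
    private final List<String> outputFields = new ArrayList<>();

    // Every setter returns this, so calls can be chained.
    FluentSpoutConfigSketch setSourceDir(String sourceDir) {
        this.sourceDir = sourceDir;
        return this;
    }

    FluentSpoutConfigSketch setLockTimeoutSec(int lockTimeoutSec) {
        this.lockTimeoutSec = lockTimeoutSec;
        return this;
    }

    FluentSpoutConfigSketch withConfigKey(String configKey) {
        this.configKey = configKey;
        return this;
    }

    FluentSpoutConfigSketch withOutputFields(String... fields) {
        outputFields.clear();
        outputFields.addAll(Arrays.asList(fields));
        return this;
    }

    String configKey() { return configKey; }

    String describe() {
        return "key=" + configKey + " source=" + sourceDir
                + " lockTimeoutSec=" + lockTimeoutSec + " fields=" + outputFields;
    }

    public static void main(String[] args) {
        FluentSpoutConfigSketch spout = new FluentSpoutConfigSketch()
                .setSourceDir("/data/in")     // illustrative path
                .setLockTimeoutSec(300)       // illustrative value
                .withOutputFields("line");    // illustrative field name
        System.out.println(spout.describe());
    }
}
```

Because `withConfigKey` is never called in `main`, the sketch keeps the default key name.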
public org.apache.hadoop.fs.Path getLockDirPath()
public SpoutOutputCollector getCollector()
public void nextTuple()
ISpout
When this method is called, Storm is requesting that the Spout emit tuples to the output collector. This method should be non-blocking, so if the Spout has no tuples to emit, this method should return. nextTuple, ack, and fail are all called in a tight loop in a single thread in the spout task. When there are no tuples to emit, it is courteous to have nextTuple sleep for a short amount of time (like a single millisecond) so as not to waste too much CPU.
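The non-blocking contract described above can be illustrated with a small sketch. It assumes a hypothetical `PollingSpoutSketch` class backed by an in-memory queue; it does not use the Storm API, and the `emitted` list merely stands in for the output collector. It is not how HdfsSpout itself reads files.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a spout; it does not use the Storm API.
class PollingSpoutSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> emitted = new ArrayList<>(); // stands in for the output collector
    private int idleCalls = 0;

    void offer(String record) { pending.add(record); }

    // Never waits for data: emits one record if available, otherwise pauses briefly and returns.
    void nextTuple() {
        String record = pending.poll(); // poll() returns null immediately when the queue is empty
        if (record == null) {
            idleCalls++;
            try {
                Thread.sleep(1); // short pause so an idle loop does not waste CPU
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            }
            return;
        }
        emitted.add(record);
    }

    List<String> emitted() { return emitted; }
    int idleCalls() { return idleCalls; }

    public static void main(String[] args) {
        PollingSpoutSketch spout = new PollingSpoutSketch();
        spout.offer("line-1");
        spout.offer("line-2");
        for (int i = 0; i < 5; i++) {
            spout.nextTuple(); // the caller keeps looping, as the spout task thread would
        }
        System.out.println(spout.emitted() + " idle=" + spout.idleCalls());
        // prints: [line-1, line-2] idle=3
    }
}
```

Emitting one record per call is a choice made for this sketch; the contract above only requires that the method return promptly when there is nothing to emit.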
protected void emitData(List<Object> tuple, org.apache.storm.hdfs.spout.HdfsSpout.MessageId id)
public void open(Map<String,Object> conf, TopologyContext context, SpoutOutputCollector collector)
ISpout
Called when a task for this component is initialized within a worker on the cluster. It provides the spout with the environment in which the spout executes.
This includes the:
conf - The Storm configuration for this spout. This is the configuration provided to the topology merged in with cluster configuration on this machine.
context - This object can be used to get information about this task’s place within the topology, including the task id and component id of this task, input and output information, etc.
collector - The collector is used to emit tuples from this spout. Tuples can be emitted at any time, including the open and close methods. The collector is thread-safe and should be saved as an instance variable of this spout object.
public void close()
ISpout
Called when an ISpout is going to be shut down. There is no guarantee that close will be called, because the supervisor stops worker processes on the cluster with kill -9.
The one context where close is guaranteed to be called is when a topology is killed while running Storm in local mode.
Specified by: close in interface ISpout
Overrides: close in class BaseRichSpout
public void ack(Object msgId)
ISpout
Storm has determined that the tuple emitted by this spout with the msgId identifier has been fully processed. Typically, an implementation of this method will take that message off the queue and prevent it from being replayed.
Specified by: ack in interface ISpout
Overrides: ack in class BaseRichSpout
public void fail(Object msgId)
ISpout
The tuple emitted by this spout with the msgId identifier has failed to be fully processed. Typically, an implementation of this method will put that message back on the queue to be replayed at a later time.
Specified by: fail in interface ISpout
Overrides: fail in class BaseRichSpout
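The typical ack and fail behaviour described above (take the message off the queue on ack, put it back for replay on fail) can be sketched as follows. `AckFailSketch` is a hypothetical in-memory stand-in that does not use the Storm API; its message ids are plain counters chosen for illustration, and it is not HdfsSpout's own bookkeeping.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in showing ack/fail bookkeeping; it does not use the Storm API.
class AckFailSketch {
    private final Queue<String> queue = new ArrayDeque<>();
    private final Map<Object, String> inFlight = new HashMap<>(); // emitted but not yet acked or failed
    private long nextId = 0;

    void offer(String record) { queue.add(record); }

    // Emits the next record and returns its message id, or null when nothing is queued.
    Object nextTuple() {
        String record = queue.poll();
        if (record == null) {
            return null;
        }
        Object msgId = nextId++;
        inFlight.put(msgId, record);
        return msgId;
    }

    // Fully processed: forget the record so it is never replayed.
    void ack(Object msgId) {
        inFlight.remove(msgId);
    }

    // Processing failed: put the record back on the queue to be replayed later.
    void fail(Object msgId) {
        String record = inFlight.remove(msgId);
        if (record != null) {
            queue.add(record);
        }
    }

    int queued() { return queue.size(); }
    int inFlightCount() { return inFlight.size(); }

    public static void main(String[] args) {
        AckFailSketch spout = new AckFailSketch();
        spout.offer("a");
        spout.offer("b");
        Object first = spout.nextTuple();
        Object second = spout.nextTuple();
        spout.ack(first);   // "a" is done
        spout.fail(second); // "b" goes back on the queue
        System.out.println("queued=" + spout.queued() + " inFlight=" + spout.inFlightCount());
        // prints: queued=1 inFlight=0
    }
}
```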
public void declareOutputFields(OutputFieldsDeclarer declarer)
IComponent
Declare the output schema for all the streams of this topology.
declarer - this is used to declare output stream ids, output fields, and whether or not each output stream is a direct stream.

Copyright © 2019 The Apache Software Foundation. All rights reserved.