Apache Storm 0.9.2 released

We are pleased to announce that Apache Storm 0.9.2-incubating has been released and is available from the downloads page. This release includes many important fixes and improvements.

Netty Transport Improvements

Apache Storm's Netty-based transport has been overhauled to significantly improve performance through better utilization of thread, CPU, and network resources, particularly in cases where message sizes are small. Apache Storm contributor Sean Zhong (@clockfly) deserves a great deal of credit not only for discovering, analyzing, documenting and fixing the root cause, but also for persevering through an extended review process and promptly addressing all concerns.

Those interested in the technical details and evolution of this patch can find out more in the JIRA ticket for STORM-297.

Sean also discovered and fixed an elusive bug in Apache Storm's usage of the Disruptor queue that could lead to out-of-order or lost messages.

Many thanks to Sean for contributing these important fixes.

Apache Storm UI Improvements

This release also includes a number of improvements to the Apache Storm UI service. Contributor Sriharsha Chintalapani (@harshach) added a REST API to the Apache Storm UI service to expose metrics and operations in JSON format, and updated the UI to use that API.

The new REST API will make it considerably easier for other services to consume available cluster and topology metrics for monitoring and visualization applications. Kyle Nusbaum (@knusbaum) has already leveraged the REST API to create a topology visualization tool now included in Apache Storm UI and illustrated in the screenshot below.

 

Apache Storm UI Topology Visualization

 

In the visualization, spout components are shown in blue, while bolts are colored between green and red depending on their associated capacity metric. The width of each line between components represents the flow of tuples relative to the other visible streams.
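
As a rough illustration of how another service might consume the REST API mentioned above, the following sketch fetches the cluster summary as raw JSON using only the Java standard library. The host name, the port (8080 is the usual UI default), and the `cluster/summary` resource path are assumptions that should be checked against your own deployment; the `StormUiClient` class and its `endpoint` and `fetch` helpers are hypothetical names introduced for this example and are not part of Apache Storm.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not shipped with Apache Storm.
class StormUiClient {

    // Builds the URL of a REST resource under the UI's /api/v1/ prefix.
    // Host, port, and resource path are supplied by the caller (assumptions).
    static String endpoint(String host, int port, String resource) {
        return "http://" + host + ":" + port + "/api/v1/" + resource;
    }

    // Performs an HTTP GET and returns the response body as a string.
    static String fetch(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestProperty("Accept", "application/json");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Assumed location of a locally running Storm UI.
        String url = endpoint("localhost", 8080, "cluster/summary");
        try {
            System.out.println(fetch(url));
        } catch (IOException e) {
            System.err.println("Could not reach Storm UI at " + url + ": " + e.getMessage());
        }
    }
}
```

The response is left as an unparsed string here; a monitoring tool would normally hand it to a JSON library of its choice.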

Kafka Spout

This is the first Apache Storm release to include official support for consuming data from Kafka 0.8.x. In the past, development of Kafka spouts for Apache Storm had become somewhat fragmented and finding an implementation that worked with certain versions of Apache Storm and Kafka proved burdensome for some developers. This is no longer the case, as the storm-kafka module is now part of the Apache Storm project and associated artifacts are released to official channels (Maven Central) along with Apache Storm's other components.

Thanks are due to GitHub user @wurstmeister for picking up Nathan Marz' original Kafka 0.7.x implementation, updating it to work with Kafka 0.8.x, and contributing that work back to the Apache Storm community.

The storm-kafka module can be found in the /external/ directory of the source tree and binary distributions. The external area has been set up to contain projects that, while not required by Apache Storm, are often used in conjunction with Apache Storm to integrate with some other technology. Such projects also come with a maintenance commitment from at least one Apache Storm committer to ensure compatibility with Apache Storm's main codebase as it evolves.

The storm-kafka dependency is available now from Maven Central at the following coordinates:

groupId: org.apache.storm
artifactId: storm-kafka
version: 0.9.2-incubating

Users, and Scala developers in particular, should note that the Kafka dependency is listed as provided. This allows users to choose a specific Scala version as described in the project README.
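
For Maven users, the coordinates above might translate into a pom.xml fragment along the following lines. This is a sketch rather than a definitive configuration: because the Kafka dependency is listed as provided, a Kafka artifact has to be declared explicitly, and the Scala 2.10 build (kafka_2.10) is only one possible choice, assumed here for illustration. The Kafka version follows STORM-331 in the changelog below. The exclusions are an assumption intended to avoid conflicting ZooKeeper and log4j versions; consult the project README for the recommended settings.

```xml
<!-- storm-kafka, using the coordinates listed above -->
<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-kafka</artifactId>
    <version>0.9.2-incubating</version>
</dependency>

<!-- Kafka itself must be declared explicitly; the Scala version (2.10) is an assumed choice -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.10</artifactId>
    <version>0.8.1.1</version>
    <exclusions>
        <!-- assumed exclusions to avoid clashing with Storm's own dependencies -->
        <exclusion>
            <groupId>org.apache.zookeeper</groupId>
            <artifactId>zookeeper</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```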

Apache Storm Starter and Examples

Similar to the external section of the codebase, we have also added an examples directory and pulled in the storm-starter project to ensure it will be maintained in lock-step with Apache Storm's main codebase.

Thank you to Apache Storm committer Michael G. Noll for his continued work in maintaining and improving the storm-starter project.

Pluggable Serialization for Multilang

In previous versions of Apache Storm, serialization of data to and from multilang components was limited to JSON, imposing something of a performance penalty. Thanks to a contribution from John Gilmore (@jsgilmore), the serialization mechanism is now pluggable and enables the use of more performant serialization frameworks, such as protocol buffers, in addition to JSON.

Thanks

Special thanks are due to all those who have contributed to Apache Storm -- whether through direct code contributions, documentation, bug reports, or helping other users on the mailing lists. Your efforts are much appreciated.

Changelog

  • STORM-352: [storm-kafka] PartitionManager does not save offsets to ZooKeeper
  • STORM-66: send taskid on initial handshake
  • STORM-342: Contention in Disruptor Queue which may cause out of order or lost messages
  • STORM-338: Move towards idiomatic Clojure style
  • STORM-335: add drpc test for removing timed out requests from queue
  • STORM-69: Storm UI Visualizations for Topologies
  • STORM-297: Performance scaling with CPU
  • STORM-244: DRPC timeout can return null instead of throwing an exception
  • STORM-63: remove timeout drpc request from its function's request queue
  • STORM-313: Remove log-level-page from logviewer
  • STORM-205: Add REST API To Storm UI
  • STORM-326: tasks send duplicate metrics
  • STORM-331: Update the Kafka dependency of storm-kafka to 0.8.1.1
  • STORM-308: Add support for config_value to {supervisor,nimbus,ui,drpc,logviewer} childopts
  • STORM-309: storm-starter Readme: windows documentation update
  • STORM-318: update storm-kafka to use apache curator-2.4.0
  • STORM-303: storm-kafka reliability improvements
  • STORM-233: Removed inline heartbeat to nimbus to avoid workers being killed when under heavy ZK load
  • STORM-267: fix package name of LoggingMetricsConsumer in storm.yaml.example
  • STORM-265: upgrade to clojure 1.5.1
  • STORM-232: ship JNI dependencies with the topology jar
  • STORM-295: Add storm configuration to define JAVA_HOME
  • STORM-138: Pluggable serialization for multilang
  • STORM-264: Removes references to the deprecated topology.optimize
  • STORM-245: implement Stream.localOrShuffle() for trident
  • STORM-317: Add SECURITY.md to release binaries
  • STORM-310: Change Twitter authentication
  • STORM-305: Create developer documentation
  • STORM-280: storm unit tests are failing on windows
  • STORM-298: Logback file does not include full path for metrics appender fileNamePattern
  • STORM-316: added validation to registermetrics to have timebucketSizeInSecs >= 1
  • STORM-315: Added progress bar when submitting topology
  • STORM-214: Windows: storm.cmd does not properly handle multiple -c arguments
  • STORM-306: Add security documentation
  • STORM-302: Fix Indentation for pom.xml in storm-dist
  • STORM-235: Registering a null metric should blow up early
  • STORM-113: making thrift usage thread safe for local cluster
  • STORM-223: use safe parsing for reading YAML
  • STORM-238: LICENSE and NOTICE files are duplicated in storm-core jar
  • STORM-276: Add support for logviewer in storm.cmd.
  • STORM-286: Use URLEncoder#encode with the encoding specified.
  • STORM-296: Storm kafka unit tests are failing on windows
  • STORM-291: upgrade http-client to 4.3.3
  • STORM-252: Upgrade curator to latest version
  • STORM-294: Commas not escaped in command line
  • STORM-287: Fix the positioning of documentation strings in clojure code
  • STORM-290: Fix a log binding conflict caused by curator dependencies
  • STORM-289: Fix Trident DRPC memory leak
  • STORM-173: Treat command line "-c" option number config values as such
  • STORM-194: Support list of strings in *.worker.childopts, handle spaces
  • STORM-288: Fixes version spelling in pom.xml
  • STORM-208: Add storm-kafka as an external module
  • STORM-285: Fix storm-core shade plugin config
  • STORM-12: reduce thread usage of netty transport
  • STORM-281: fix an issue with config parsing that could lead to leaking file descriptors
  • STORM-196: When JVM_OPTS are set, storm jar fails to detect storm.jar from environment
  • STORM-260: Fix a potential race condition with simulated time in Storm's unit tests
  • STORM-258: Update commons-io version to 2.4
  • STORM-270: don't package .clj files in release jars.
  • STORM-273: Error while running storm topologies on Windows using "storm jar"
  • STORM-247: Replace links to github resources in storm script
  • STORM-263: Update Kryo version to 2.21+
  • STORM-187: Fix Netty error "java.lang.IllegalArgumentException: timeout value is negative"
  • STORM-186: fix float secs to millis long conversion
  • STORM-70: Upgrade to ZK-3.4.5 and curator-1.3.3
  • STORM-146: Unit test regression when storm is compiled with 3.4.5 zookeeper