Apache Hop 2.0.0 is available!

The Apache Hop team just released Apache Hop 2.0.0, the first major release after graduating as a Top-Level Project at the Apache Software Foundation at the very end of 2021. 

This release contains almost three months of work on over 150 improvements and bug fixes by a growing number of contributors.

Let's walk through the highlights in this release. 

Upgrade to Java 11

With Java 8 at (or already beyond) the end of its life, an upgrade to Java 11 was inevitable. Since upgrading the entire Hop codebase to a new Java version is no small feat, this alone justified a new major release.

Before this release, Apache Hop ran on Java 11 in a separate branch for months while the team gradually fixed all issues and ran all of the available unit and integration tests. With almost half a year of active testing and development, Hop 2.0 is robust and reliable on Java 11.

Since code changes were unavoidable in the Java 11 upgrade, the Apache Hop team took the opportunity to make some breaking API changes in a never-ending quest to clean up, improve and simplify the codebase. 

Chinese translations

Apache Hop's growing community in Asia has resulted in a significant contribution. In addition to improvements in the Hop Translator, a UI tool that enables non-developers to translate Apache Hop into their own native language, Apache Hop is now available in Simplified Chinese (zh_CN). 

[Image: Hop GUI in Simplified Chinese]

New Transform Plugins

Apache Hop contains over 400 transform, action and other plugins. Hop 2.0.0 adds another couple of transform plugins to the platform. 

Apache Avro File Output

The Apache Avro File Output transform allows Hop data engineers to write data to binary files or fields in the Avro Binary or JSON format. 

This plugin comes in addition to the Avro File Input, Avro Encode and Avro Decode transforms already available in earlier Hop releases. 

[Image: Apache Avro File Output transform]

Apache Doris Bulk Loader

From the Apache Doris website: "Apache Doris is a modern MPP analytical database product. It can provide sub-second queries and efficient real-time data analysis. With its distributed architecture, up to 10PB level datasets will be well-supported and easy to operate."

The Apache Doris Bulk Loader transform lets you insert data into Apache Doris at high speed and volume, which makes it a faster way to load data than traditional database insert statements.

This new Apache Doris Bulk Loader transform was developed by the Apache Doris community and donated to Apache Hop. This shows the importance of the Apache community and the interaction and collaboration between Apache projects. 

[Image: Apache Doris Bulk Loader transform]

Drools Rules Accumulator and Drools Rules Executor

From the Drools website: "Drools is a Business Rules Management System (BRMS) solution. It provides a core Business Rules Engine (BRE), a web authoring and rules management application (Drools Workbench), full runtime support for Decision Model and Notation (DMN) models at Conformance level 3, and an Eclipse IDE plugin for core development."

The Drools Rules Accumulator transform collects incoming rows and executes them against a rule set. This may be useful to determine the answer to a question or otherwise analyze a dataset.

The Drools Rules Executor transform allows fields of incoming rows to be executed against a rule set. This may be useful to determine additional information or route rows onto another transform.
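
To make the difference between the two transforms concrete, here is a minimal Python sketch of the two evaluation styles. It does not use Drools, DRL rule syntax or any Hop API; the function names, the `amount` and `route` fields and the threshold of 1000 are all hypothetical and chosen only for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]

# Hypothetical rule shape: a (condition, action) pair. This is NOT Drools DRL.
RowRule = Tuple[Callable[[Row], bool], Callable[[Row], None]]


def execute_rules(row: Row, rules: List[RowRule]) -> Row:
    """Executor style: apply every matching rule to ONE row, return an enriched copy."""
    result = dict(row)  # copy, so the incoming row is left untouched
    for condition, action in rules:
        if condition(result):
            action(result)
    return result


def accumulate_and_evaluate(rows: List[Row],
                            question: Callable[[List[Row]], Any]) -> Any:
    """Accumulator style: collect ALL rows first, then answer one question about the set."""
    collected = list(rows)
    return question(collected)


def set_field(name: str, value: Any) -> Callable[[Row], None]:
    """Build an action that writes a fixed value into a field of the row."""
    def action(row: Row) -> None:
        row[name] = value
    return action


# Assumed example rules: add a routing field based on the order amount.
rules: List[RowRule] = [
    (lambda r: r["amount"] >= 1000, set_field("route", "review")),
    (lambda r: r["amount"] < 1000, set_field("route", "auto_approve")),
]

orders = [{"id": 1, "amount": 250}, {"id": 2, "amount": 4000}]

# Row by row: each order gains a "route" field a later step could route on.
routed = [execute_rules(order, rules) for order in orders]

# Over the whole set: does any order in this batch need a review?
needs_review = accumulate_and_evaluate(
    routed, lambda rs: any(r["route"] == "review" for r in rs)
)
```

In this sketch the executor adds information to each row independently, while the accumulator only produces an answer once every row has been seen.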

Both transforms were contributed by our partner Serasoft.

[Image: Drools Rules Accumulator transform]

[Image: Drools Rules Executor transform]

Formula

The Formula transform allows you to apply Excel-like formulas and functions to fields in a pipeline. It replaces the Formula plugin from Kettle/PDI, which could not be ported to Apache Hop because of licensing and code quality issues.

The transform was developed by Lean With Data; its development was requested and sponsored by BaselTech (thanks!!).

Contact us if you have a need for additional functionality in Apache Hop. We're happy to help with custom plugin and feature development. 

[Image: Formula transform]

Dimension Lookup/Update

Existing Hop (or Kettle/PDI) users know this plugin is not new: it's been in the code base for ages. 

Over time, the UI for this transform became overcrowded, which made it hard to use with complex dimensions.

This transform deserves an honorable mention here because of the UI cleanup by Sergio (Serasoft). The dialog now shows the available options in four tabs: keys, fields, technical key and versioning.

[Image: Dimension Lookup/Update transform]

Apache Beam Upgrade

Apache Beam has been a very important plugin for Apache Hop. In fact, Hop waited to switch to Java 11 until Beam was completely Java 11 ready.

Apache Beam is an advanced unified programming model that allows you to implement batch and streaming data processing jobs that run on any execution engine. Popular execution engines include Apache Spark, Apache Flink and Google Cloud Dataflow.

Check our information on Apache Beam if you'd like to find out more about how Lean With Data can help you to be successful with Apache Hop and Apache Beam. 


Community

The Apache Hop community continues to grow.

On the chat and various social media channels, the Hop community has grown by between 6.5% and almost 20% since the previous 1.2.0 release.

The importance of the community for an open source project at the Apache Software Foundation cannot be overstated. The community drives the roadmap, development and testing of the platform. A healthy and growing community ensures that no single organization can take control of the Hop platform, and it keeps the Hop ecosystem healthy and vibrant.

 

 

 
