Apache Hop 0.70 is available

Apache Hop 0.70 was released earlier this week.

This release is a major milestone for Apache Hop. As an incubating project at the Apache Software Foundation (ASF), Hop has two main tasks.

Firstly, ASF releases are a legal process. Apache projects need to make sure all aspects of the software comply with the Apache License, Version 2.0. The Apache Hop team now controls this process well enough to release not only source code that is Apache License 2.0 compliant (as in the 0.60 release), but also all the binary artifacts (the dependencies) that need to be shipped with the software. This should be smooth sailing from now on.

Secondly, Apache projects are community projects. Building and growing a community is as important as building software, if not more so. The Hop community has grown: the chat and social media channels have grown by between 25% and 75% (the latter for YouTube). A number of community members have become very active, contributing not just code but also artwork, tests, documentation and participation in discussions. One very noticeable corporate contribution was the donation of a large batch of functionality from Neo4j.

With the 0.70 release out the door, and the current development branch at version 0.99, the Apache Hop team has started working towards a 1.0 release. Hop continues to evolve fast. Even though 1.0 won’t bring any major architectural changes, there will be lots of new functionality through plugins. Most importantly, the Hop team is hardening the platform for the 1.0 release: there’s a lot of bug hunting and fixing going on, and documentation, samples and integration tests need to be feature-complete.

0.70 Highlights

The full release announcement contains all the details, but let’s look at a number of the highlights in the release.

Neo4j Integration

Neo4j already had rich support in Hop through a series of external plugins. These plugins have now been fully integrated into Hop. It seems fair to say that no other data orchestration platform has better support for Neo4j than Hop.
With Hop, it is now easier than ever to read from and write to Neo4j, and to run Cypher queries in a reliable and manageable way throughout a project’s entire life cycle.

In addition to direct data orchestration functionality, the Neo4j integration also includes the possibility to write Hop execution logs to a Neo4j database, and a new Neo4j perspective to view and query these execution logs.

Full Cloud Storage integration

Hop was designed to seamlessly run in and integrate with the major cloud platforms. Hop 0.70 offers a number of new tools and improvements in this area. With Hop 0.70, you now have direct access to your data in all major cloud storage services.

AWS: Amazon Web Services’ S3 storage service is now fully integrated. All Hop transforms and actions can access files in S3 buckets, and buckets and files can be browsed from the file dialog through s3:// URLs over Apache VFS.

Azure: Hop data developers can now access data in Azure Blob Storage directly over VFS. Just like with AWS, this is supported in all transforms and actions that operate on files and folders, through azure:// URLs. In addition to Blob Storage, Azure support now also includes two new transforms: Azure Event Hubs Listener and Azure Event Hubs Writer.

Dropbox: Two new transforms were added to Hop to download from and upload to Dropbox. A VFS driver may be added in a future release.

GCP: Similar to AWS S3 and Azure Blob Storage, Google storage is now supported over VFS as well.
Hop developers can access their data through gs:// URLs for Google Cloud Storage and googledrive:// URLs for Google Drive. Additionally, new transforms were added for Google Analytics and Google Sheets (Input and Output).
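
To make the URL prefixes in this section a bit more concrete, here is a minimal Python sketch that maps a VFS-style URL to the storage service it points at. It is not Hop code and does not use Apache VFS; it only looks at the scheme names mentioned in this post. The bucket, container and file names are made up for illustration, and treating anything without a known scheme as a local file is an assumption of the sketch.

```python
from urllib.parse import urlsplit

# Scheme -> storage service, using only the URL prefixes named in this post.
VFS_SCHEMES = {
    "s3": "AWS S3",
    "azure": "Azure Blob Storage",
    "gs": "Google Cloud Storage",
    "googledrive": "Google Drive",
}


def storage_service(url: str) -> str:
    """Return the storage service a VFS-style URL points at."""
    scheme = urlsplit(url).scheme
    # Assumption: anything without a known scheme is a plain local path.
    return VFS_SCHEMES.get(scheme, "local file system")


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    examples = [
        "s3://example-bucket/input/customers.csv",
        "azure://example-container/input/customers.csv",
        "gs://example-bucket/input/customers.csv",
        "googledrive://reports/customers.xlsx",
        "/tmp/customers.csv",
    ]
    for url in examples:
        print(f"{url} -> {storage_service(url)}")
```

The point of the sketch is that only the prefix changes from one storage service to the next, which is what lets the same transforms and actions work on files regardless of where they live.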

Pentaho Migration Tools

Kettle/PDI Import: When Hop started as a fork of Kettle (or Pentaho Data Integration), the project set itself a number of long-term and architectural goals that broke compatibility with Kettle/PDI almost immediately. Kettle and Hop are increasingly separate and independent platforms, each with its own roadmap.
However, the shared history between Kettle/PDI and Hop allows a smooth transition from Kettle/PDI projects to Hop projects. Not only can your jobs and transformations be imported into (or upgraded to) Hop workflows and pipelines, your projects immediately get to benefit from all of the additional features Hop offers: projects and environments, runtime configurations, life cycle management, testing etc.
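
As a rough illustration of how Kettle/PDI files correspond to Hop files, the Python sketch below maps Kettle file names to the Hop names they would end up with: transformations (.ktr) become pipelines (.hpl) and jobs (.kjb) become workflows (.hwf). The file extensions are not mentioned in this post and are brought in here as background; the function name and the example paths are hypothetical. This only shows the naming correspondence. The actual import converts the file contents as well, which this sketch does not attempt.

```python
from pathlib import PurePosixPath
from typing import Optional

# Kettle/PDI extension -> Hop extension.
# Transformations become pipelines, jobs become workflows.
KETTLE_TO_HOP = {".ktr": ".hpl", ".kjb": ".hwf"}


def hop_target(kettle_file: str) -> Optional[str]:
    """Return the Hop file name for a Kettle file, or None if it is not a Kettle file."""
    path = PurePosixPath(kettle_file)
    hop_suffix = KETTLE_TO_HOP.get(path.suffix.lower())
    if hop_suffix is None:
        return None
    return str(path.with_suffix(hop_suffix))


if __name__ == "__main__":
    # Hypothetical project files, for illustration only.
    for name in ["etl/load_customers.ktr", "etl/nightly.kjb", "README.md"]:
        print(f"{name} -> {hop_target(name)}")
```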

 

Lean Orchestration 0.70

One of the main reasons Apache Hop was created is that the project team feels there is a need for a top-notch data orchestration platform that is fully open source and completely independent. As an Apache project, Hop is community driven: no single organization or individual drives Hop’s roadmap or development.

Apache Hop’s independence guarantees the platform will always remain open source and free. However, while open source and free may be great for Apache Hop as a project, that alone doesn’t help your projects in production.
Lean Orchestration is Lean With Data’s enterprise-ready Hop offering. It is based on and follows the Apache Hop releases, but adds everything your organization needs to successfully run and deploy advanced data projects in production:

  • certified installations

  • additional functionality through a number of plugins that are not available in the default Apache Hop distribution

  • patch releases

  • enterprise support

  • coaching and training

In addition to support, training and coaching, Lean With Data offers Pentaho to Lean Orchestration migrations. Converting code is only one aspect of a successful migration project. While upgrading from Pentaho to Hop, you’ll want to make some architecture changes, improve your project life cycle management, start working towards a more test-driven approach, improve logging and monitoring etc.
Our migration packages offer everything you need to upgrade from Pentaho to Hop without losing sleep.
