
Oozie Tutorial


Oozie

  • Workflow scheduler for Hadoop
    • Java MapReduce Jobs
    • Streaming Jobs
    • Pig
  • Top level Apache project
    • Comes packaged in major Hadoop distributions, such as Cloudera Distribution for Hadoop (CDH)
    • http://incubator.apache.org/oozie
  • Provides workflow management and coordination of those workflows
  • Manages a Directed Acyclic Graph (DAG) of actions

  • Runs as an HTTP service
    • Clients interact with the service by submitting workflows
    • Workflows are executed immediately or later
  • Workflows are defined via XML
    • Instead of writing Java code that implements the Tool interface and extends the Configured class
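
As a rough illustration of what an XML workflow definition contains, the Python sketch below assembles a minimal one with a single map-reduce action. The schema version (0.5), the node names (count-words, fail, end), the org.example class names and the build_workflow helper are all assumptions made for this example, not values taken from the tutorial.

```python
import xml.etree.ElementTree as ET

# Namespace of the Oozie workflow schema; the 0.5 version is an assumption.
WORKFLOW_NS = "uri:oozie:workflow:0.5"


def build_workflow(app_name: str, mapper: str, reducer: str) -> str:
    """Return a minimal workflow definition with one map-reduce action."""
    root = ET.Element("workflow-app", {"xmlns": WORKFLOW_NS, "name": app_name})

    # Control node: where execution begins.
    ET.SubElement(root, "start", {"to": "count-words"})

    # Action node: a map-reduce job plus its ok/error transitions.
    action = ET.SubElement(root, "action", {"name": "count-words"})
    mr = ET.SubElement(action, "map-reduce")
    ET.SubElement(mr, "job-tracker").text = "${jobTracker}"
    ET.SubElement(mr, "name-node").text = "${nameNode}"
    config = ET.SubElement(mr, "configuration")
    for key, value in (
        ("mapred.mapper.class", mapper),
        ("mapred.reducer.class", reducer),
    ):
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = key
        ET.SubElement(prop, "value").text = value
    ET.SubElement(action, "ok", {"to": "end"})
    ET.SubElement(action, "error", {"to": "fail"})

    # Control nodes: failure and success exits.
    kill = ET.SubElement(root, "kill", {"name": "fail"})
    ET.SubElement(kill, "message").text = "Map-reduce action failed"
    ET.SubElement(root, "end", {"name": "end"})

    if hasattr(ET, "indent"):  # pretty-print where available (Python 3.9+)
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical mapper and reducer class names.
    print(build_workflow("demo-wf", "org.example.WordMapper", "org.example.WordReducer"))
```

Running the script prints the XML, which would normally be saved as workflow.xml in the application directory on HDFS. The ${jobTracker} and ${nameNode} placeholders are resolved from the job properties supplied at submission time.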

Action and Control Nodes

Control Flow
  • start, end, kill
  • decision
  • fork, join
Actions
  • map-reduce
  • java
  • pig
  • hdfs (the fs action)
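
To see how control nodes and actions fit together as a DAG, the sketch below models a workflow as a plain Python dictionary and orders its nodes with a topological sort. The node names and the topological_order helper are hypothetical, and this is not Oozie's own validation code; it only illustrates the acyclic structure that a workflow is expected to have.

```python
from collections import deque

# Hypothetical workflow: node name -> nodes it can transition to.
WORKFLOW = {
    "start": ["check-input"],
    "check-input": ["split", "fail"],        # decision: picks one branch
    "split": ["run-pig", "run-mapreduce"],   # fork: starts both branches
    "run-pig": ["merge"],                    # pig action
    "run-mapreduce": ["merge"],              # map-reduce action
    "merge": ["end"],                        # join: waits for both branches
    "fail": [],                              # kill
    "end": [],                               # end
}


def topological_order(graph):
    """Return the nodes in dependency order; raise ValueError on a cycle."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] += 1

    # Nodes with no incoming transitions can run first.
    ready = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in graph[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    # Any node left unvisited sits on a cycle.
    if len(order) != len(graph):
        raise ValueError("workflow contains a cycle")
    return order


if __name__ == "__main__":
    print(topological_order(WORKFLOW))
```

A transition that points back to an earlier node, for example from merge to split, would make the function raise ValueError, since the graph would no longer be acyclic.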

Oozie Coordination Engine

The Oozie Coordination Engine can trigger workflows by:
  • Time (Periodically)
  • Data Availability (Data appears in a directory)
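
The two trigger types can be sketched in a few lines of Python. This is a conceptual simulation under stated assumptions, not the coordinator's implementation: the function names are hypothetical, a local directory stands in for HDFS, and a _SUCCESS marker file is assumed to signal that the data is complete.

```python
import tempfile
from datetime import datetime, timedelta
from pathlib import Path


def nominal_times(start, end, every):
    """Yield each scheduled run time in the half-open range [start, end)."""
    current = start
    while current < end:
        yield current
        current += every


def data_available(directory, done_flag="_SUCCESS"):
    """Treat the data as ready once the done-flag file exists in the directory."""
    return (Path(directory) / done_flag).is_file()


def should_trigger(now, nominal_time, directory):
    """Trigger when the scheduled time has passed and the input data is present."""
    return now >= nominal_time and data_available(directory)


if __name__ == "__main__":
    # Time trigger: one run per day over three days.
    for run_time in nominal_times(datetime(2015, 1, 1), datetime(2015, 1, 4), timedelta(days=1)):
        print("scheduled:", run_time)

    # Data trigger: the run waits until the marker file appears.
    with tempfile.TemporaryDirectory() as input_dir:
        print("before data:", data_available(input_dir))
        (Path(input_dir) / "_SUCCESS").touch()
        print("after data:", data_available(input_dir))
```

In a real deployment these conditions are declared in a coordinator XML definition rather than coded by hand; the sketch only shows the logic each trigger expresses.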
Copyright ©2015 Hub4Tech.com, All Rights Reserved. Hub4Tech™ is registered trademark of Hub4tech Portal Services Pvt. Ltd.