Following on from the previous posts on Onyx (you can read part 1 and part 2 on how to set up a Kafka streaming application with Onyx), there are a couple of things worth noting that I didn’t mention in the original articles.
When you submit your job you are submitting it to ZooKeeper. The peers, once they are running, then see which jobs are available to execute. The template code’s -main method does a good job of separating these actions out.
(onyx.test-helper/feedback-exception! peer-config job-id)
This call watches for exception messages on the job. If all is well, it blocks and waits for someone to kill the process. Remove the line and the program simply submits the job to ZooKeeper and exits, and that’s that. A running peer will then pick the job up. In our Kafka job, the peer picks up the job and starts reading from Kafka. Kill the peer and start it again, and the job should pick up from the last offset.
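As a rough sketch of the submit-and-exit versus submit-and-block choice, something like the following could work. The namespace example.submit and the helper submit! are made up for illustration and are not part of the template. It assumes peer-config and job are the same maps the template already builds, and that a ZooKeeper instance is reachable.

```clojure
(ns example.submit
  (:require [onyx.api]
            [onyx.test-helper]))

(defn submit!
  "Hypothetical helper: submits job to ZooKeeper and returns its job id.
  With :block? true it also waits on the job and rethrows any exception
  the job reports, which is what keeps the process alive."
  [peer-config job & {:keys [block?]}]
  (let [{:keys [job-id]} (onyx.api/submit-job peer-config job)]
    (when block?
      ;; Blocks here; leave :block? off to submit and exit straight away.
      (onyx.test-helper/feedback-exception! peer-config job-id))
    job-id))
```

Calling (submit! peer-config job) should return once the job is written to ZooKeeper, while (submit! peer-config job :block? true) stays put until the job finishes, fails, or the process is killed.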
The Greedy and the Balanced
The default template works on a greedy job scheduler, meaning that if you run ten peers and the job demands three, then the remaining seven will be blocked from use.
That’s fine when there’s only one job submitted but a pain when there are two or more. The fix is in the config.edn file: change the job scheduler from greedy to balanced. The balanced scheduler keeps the remaining peers available and allocates them to other jobs as they are submitted.
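For reference, the change in the peer config would look roughly like this. The exact layout of your config.edn may differ from the fragment here, which is only a sketch with the other keys left out; the part that matters is the :onyx.peer/job-scheduler entry.

```clojure
{:peer-config
 {;; other peer-config keys stay as they are
  ;; was :onyx.job-scheduler/greedy
  :onyx.peer/job-scheduler :onyx.job-scheduler/balanced}}
```

All peers need to agree on the scheduler, so restart them after changing it.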