Changeset [3fe2323190e62596b14f583b3ee61d07336223f5] by Shayan Pooya

October 2nd, 2014 @ 07:05 PM

jobpack: make reduce_shuffle optional

A reduce_shuffle phase after reduce can be useful for decreasing the number
of outputs that are given to the client. By default, one output is created
for each reduce task. However, there might be a lot of reduce tasks (one per
label). The need_reduce_shuffle option can be specified to request a simple
stage that combines the results of reduce.
By default, disco skips the reduce_shuffle phase.
https://github.com/discoproject/disco/commit/3fe2323190e62596b14f58...
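
The effect described above can be illustrated with a small, self-contained sketch. This is plain Python and not Disco's actual code or API: the functions run_reduce, reduce_shuffle and run_job are hypothetical, and passing need_reduce_shuffle as a keyword argument is an assumption made only for illustration. It shows one output per label by default and a single combined output when the extra stage is requested.

```python
from typing import Dict, List, Tuple

Output = List[Tuple[str, int]]


def run_reduce(partitions: Dict[str, List[int]]) -> List[Output]:
    """Hypothetical reduce: one task per label, each producing its own output."""
    return [[(label, sum(values))] for label, values in sorted(partitions.items())]


def reduce_shuffle(outputs: List[Output]) -> List[Output]:
    """Hypothetical combining stage: merge every reduce output into one."""
    return [[pair for output in outputs for pair in output]]


def run_job(partitions: Dict[str, List[int]],
            need_reduce_shuffle: bool = False) -> List[Output]:
    """Run reduce, then optionally the combining stage (skipped by default)."""
    outputs = run_reduce(partitions)
    if need_reduce_shuffle:
        outputs = reduce_shuffle(outputs)
    return outputs


if __name__ == "__main__":
    data = {"a": [1, 2], "b": [3], "c": [4, 5]}
    print(len(run_job(data)))                            # one output per label
    print(len(run_job(data, need_reduce_shuffle=True)))  # a single combined output
```

With many labels, the client receives many small outputs unless the combining stage is requested; the trade-off is one extra stage in the pipeline.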

Committed by Shayan Pooya

  • M lib/disco/worker/classic/worker.py
  • M master/src/jobpack.erl

Disco is an open-source implementation of the Map-Reduce framework for distributed computing. Like the original framework, Disco supports parallel computations over large data sets on unreliable clusters of computers.
