Changeset [3fe2323190e62596b14f583b3ee61d07336223f5] by Shayan Pooya
October 2nd, 2014 @ 07:05 PM
jobpack: make reduce_shuffle optional
A reduce_shuffle phase after reduce can be useful for decreasing the
number of outputs that are given to the client. By default, one output
is created for each reduce task. However, there might be a lot of reduce
tasks (one per label). need_reduce_shuffle can be specified to request a
simple stage that combines the results of reduce.
By default, Disco skips the reduce_shuffle phase.
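
The effect described above can be sketched with a toy model in plain Python. This is not Disco's API or the code touched by this commit: the functions run_reduce, reduce_shuffle and run_job are hypothetical, and passing need_reduce_shuffle as a keyword argument is an assumption made only for illustration.

```python
from collections import defaultdict


def run_reduce(records):
    """Hypothetical reduce: group (label, value) pairs, one output per label."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    # Each label yields its own output (a list holding one summed pair).
    return [[(label, sum(values))] for label, values in sorted(groups.items())]


def reduce_shuffle(outputs):
    """Hypothetical combine stage: merge all reduce outputs into one."""
    combined = []
    for output in outputs:
        combined.extend(output)
    return [combined]


def run_job(records, need_reduce_shuffle=False):
    """Run reduce, then optionally the combine stage (skipped by default)."""
    outputs = run_reduce(records)
    if need_reduce_shuffle:
        outputs = reduce_shuffle(outputs)
    return outputs


records = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
default_outputs = run_job(records)                             # one output per label
combined_outputs = run_job(records, need_reduce_shuffle=True)  # a single output
```

With three labels, the default run hands the client three outputs, while the run that requests the extra stage hands back one combined output.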
https://github.com/discoproject/disco/commit/3fe2323190e62596b14f58...
Committed by Shayan Pooya
- M lib/disco/worker/classic/worker.py
- M master/src/jobpack.erl
Disco is an open-source implementation of the Map-Reduce framework for distributed computing. Like the original framework, Disco supports parallel computations over large data sets on an unreliable cluster of computers.