changeset 7626:5a128a6aa03e

Additional package-info spec Contributed-by: dholmes
author briangoetz
date Wed, 13 Mar 2013 15:03:35 -0400
parents f9763ce154d8
children f136adc07108
files src/share/classes/java/util/stream/package-info.java
diffstat 1 files changed, 87 insertions(+), 0 deletions(-) [+]
line wrap: on
line diff
--- a/src/share/classes/java/util/stream/package-info.java	Wed Mar 13 14:32:07 2013 -0400
+++ b/src/share/classes/java/util/stream/package-info.java	Wed Mar 13 15:03:35 2013 -0400
@@ -213,6 +213,93 @@
  * lambdas sharing mutable state
  *
  * <h2>Side-effects</h2>
+ *
+ * <h2><a name="Reduction">Reduction operations</h2>
+ *
+ * In simple terms, a <em>reduction</em> operation takes a stream of elements and processes them in a way
+ * that reduces to a single value, such as finding the sum or maximum of a set of numbers. In more complex
+ * scenarios, the reduction operation might need to extract data from the elements before reducing that data
+ * to a single value, such as finding the sum of weights of a set of blocks.  This would requireextracting the
+ * weight from each block before summing up the weights.
+ *
+ * <p>Of course, such operations can be readily implemented as simple sequential loops, as in:
+ * <pre>
+ *    int max = Integer.MIN_VALUE;
+ *    for (int x : numbers) {
+ *       if (x > max)
+ *           max = x;
+ *    }
+ * </pre>
+ * However, there may be a significant advantage to preferring a {@link Stream#reduce reduce operation} over
+ * a mutative accumulation such as the above -- a properly constructed reduce operation is inherently parallelizable,
+ * so long as the {@link BinaryOperator} has the right characteristics, specifically that it is
+ * <a href="#Associativity">associative</a>.  For example, given a stream of numbers for which we want to
+ * find the maximum, we can write:
+ * <pre>
+ *    int max = numbers.reduce(Integer.MIN_VALUE, (x,y) -> x > y ? x : y );
+ * </pre>
+ * or more succinctly:
+ * <pre>
+ *    int max = numbers.reduce(Integer.MIN_VALUE, Integer::max);
+ * </pre>
+ *
+ * Reduction parallellizes easily since the implementation of {@code reduce} can operate on subsets of the
+ * stream in parallel, and then combine the intermediate results to get the final correct answer.  Even if you were
+ * to use a parallelizable form of the {@link forEach} method in place of the original for-each loop above, you would
+ * still have to provide thread-safe updates to the shared accumulating variable {@code max}, and the required
+ * synchronization would likely eliminate any performance gain from parallelism. Using a {@code reduce} method instead
+ * removes all of the burden of parallelizing the reduction operation, and the library can provide an efficient
+ * parallel implementation.
+ *
+ * <p>In its more general form, a {@code reduce} operation on elements of type {@code T} yielding a result of
+ * type {@code U} requires three parameters:
+ * <pre>
+ * &lt;U> U reduce(U identity,
+ *                 BiFunction&lt;U, ? super T, U> accumlator,
+ *                 BinaryOperator&lt;U> combiner);
+ * </pre>
+ * Here, the <em>identity</em> element is both an initial seed for the reduction, and a default result if
+ * there are no elements. The <em>accumulator</em> function takes a partial result and the next element, and
+ * produce a new partial result. The <em>combiner</em> function combines combine the partial results of two
+ * accumulators to produce a new partial result, or eventually the final result.
+ *
+ * <p>In our {@code max} example above, {@code Integer.MIN_VALUE} was used the identity element, while
+ * {@code Integer::max} was used as both the reducer and combiner.  For a more complicated example such as the
+ * sum-of-weights, we would have:
+ * <pre>
+ *     int sumOfWeights = blocks.stream().reduce(0,
+ *                                               (sum, b) -> sum + b.getWeight())
+ *                                               Integer::sum);
+ * </pre>
+ * This computation could also be expressed as:
+ * <pre>
+ *     int sumOfWeights = blocks.stream().map(b -> b.getWeight())
+ *                                       .reduce(0, Integer::sum);
+ * </pre>
+ * though there are situations where the generality of the three-argument form may enable a more efficient
+ * computation.
+ *
+ * <p>More formally, the {@code identity} value must be an <em>identity</em> for the combiner function.
+ * This means that for all {@code u}, {@code combiner(identity, u)} is equal to {@code u}.
+ * Additionally, the {@code combiner} function must be <a href="#Associativity">associative</a> and must be
+ * compatible with the {@code accumulator} function; for all {@code u} and {@code t}, the following must hold:
+ * <pre>
+ *     combiner(u, accumulator(identity, t)) == accumulator(u, t)
+ * </pre>
+ *
+ * <a name="Associativity"><h2>Associativity</h2></a>
+ *
+ * An operator or function {@code op} is <em>associative</em> if the following holds:
+ * <pre>
+ *     (a op b) op c == a op (b op c)
+ * </pre>
+ * The importance of this to parallel evaluation can be seen if we expand this to four terms:
+ * <pre>
+ *     a op b op c op d == (a op b) op (c op d)
+ * </pre>
+ * So we can evaluate {@code (a op b)} in parallel with {@code (c op d)} and then invoke {@code op} on
+ * the results.
+ *
  */
 
 package java.util.stream;