changeset 9573:1144321d2b59

Spec updates in package spec
author briangoetz
date Fri, 30 Aug 2013 15:02:51 -0400
parents 843be828477e
children b2fcc37f2103
files src/share/classes/java/util/stream/package-info.java
diffstat 1 files changed, 86 insertions(+), 71 deletions(-) [+]
line wrap: on
line diff
--- a/src/share/classes/java/util/stream/package-info.java	Thu Aug 29 22:18:37 2013 -0700
+++ b/src/share/classes/java/util/stream/package-info.java	Fri Aug 30 15:02:51 2013 -0400
@@ -24,8 +24,8 @@
  */
 
 /**
- * Classes to support functional-style operations on streams of values, as in
- * the following:
+ * Classes to support functional-style operations on streams of values, such
+ * as map-reduce transformations on collections.  For example:
  *
  * <pre>{@code
  *     int sum = widgets.stream()
@@ -77,6 +77,10 @@
  *     <li>From a {@link java.util.Collection} via the {@code stream()} and
  *     {@code parallelStream()} methods;</li>
  *     <li>From an array via {@link java.util.Arrays#stream(Object[])};</li>
+ *     <li>From static factory methods on the stream classes, such as
+ *     {@link java.util.stream.Stream#of(Object[])},
+ *     {@link java.util.stream.IntStream#range(int, int)}
+ *     or {@link java.util.stream.Stream#iterate(Object, UnaryOperator)};</li>
  *     <li>The lines of a file can be obtained from {@link java.io.BufferedReader#lines()};</li>
  *     <li>Streams of file paths can be obtained from methods in {@link java.nio.file.Files};</li>
  *     <li>Streams of random numbers can be obtained from {@link java.util.Random#ints()};</li>
@@ -127,17 +131,17 @@
  *
  * <p>Stream operations are divided into intermediate and terminal operations.
  * Intermediate operations are further divided into <em>stateless</em> and
- * <em>stateful</em> operations.  Stateless operations retain no state from
- * previously seen values when processing a new value--each value is processed
- * independently of operations on other values. Stateless intermediate
- * operations include {@code filter} and {@code map}.  Stateful operations may
- * incorporate state from previously seen elements in processing new values.
- * Examples of stateful intermediate operations include {@code distinct} and
- * {@code sorted}.  Stateful operations may need to process the entire input
- * before producing a result.  For example, one cannot produce any results from
- * sorting a stream until one has seen all elements of the stream.  As a result,
+ * <em>stateful</em> operations.
+ *
+ * <p>Stateless operations, such as {@code filter} and {@code map}, retain no
+ * state from previously seen values when processing a new value -- each value is processed
+ * independently of operations on other values.  Stateful operations, such as
+ * {@code distinct} and {@code sorted}, may incorporate state from previously
+ * seen elements in processing new values, and may need to process the entire input
+ * before producing a result.  (For example, one cannot produce any results from
+ * sorting a stream until one has seen all elements of the stream.)  As a result,
  * under parallel computation, some pipelines containing stateful intermediate
- * operations have to be executed in multiple passes or may need to buffer
+ * operations may require multiple passes over the data or may need to buffer
  * significant data.  Pipelines containing exclusively stateless intermediate
  * operations can be processed in a single pass, whether sequential or parallel,
  * with minimal data buffering.
@@ -150,11 +154,11 @@
  * necessary, but not sufficient, condition for the processing of an infinite
  * stream to terminate normally in finite time.)
  *
- * <p>With the exception of the {@code iterator()} and {@code spliterator()}
- * terminal operations (which are provided as an "escape hatch" to enable
+ * <p>Terminal operations are almost always <em>eager</em>, executing completely
+ * before returning.  Only the terminal operations {@code iterator()} and
+ * {@code spliterator()} are not; these are provided as an "escape hatch" to enable
  * arbitrary stream traversals in the event that the existing operations are not
- * sufficient to the task), terminal operations are always <em>eager</em>,
- * executing completely before returning.
+ * sufficient to the task.
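+ *
+ * <p>For example (a sketch only, with an arbitrary source), {@code iterator()}
+ * returns without traversing the stream; elements are computed lazily as the
+ * iterator is advanced:
+ * <pre>{@code
+ *     Iterator<String> it = Stream.of("a", "b", "c")
+ *                                 .map(String::toUpperCase)
+ *                                 .iterator();    // returns immediately
+ *     while (it.hasNext())
+ *         System.out.println(it.next());          // mapping happens here
+ * }</pre>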
  *
  * <h3>Parallelism</h3>
  *
@@ -190,8 +194,8 @@
  * operation is initiated, the stream pipeline is executed sequentially or in
  * parallel depending on the mode of the stream on which it is invoked.
  *
- * <p>Except for operations identified as explicitly nondeterministic (such
- * as {@code findAny())}, whether a stream executes sequentially or in parallel
+ * <p>Except for operations identified as explicitly nondeterministic, such
+ * as {@code findAny()}, whether a stream executes sequentially or in parallel
  * should not change the result of the computation.
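+ *
+ * <p>As a sketch of this (assuming the {@code widgets} collection from the
+ * opening example, and a hypothetical {@code Widget} element type with a
+ * {@code getWeight()} accessor), the same pipeline can be run in either mode
+ * and should produce the same sum:
+ * <pre>{@code
+ *     int sequentialSum = widgets.stream()
+ *                                .mapToInt(Widget::getWeight)
+ *                                .sum();
+ *     int parallelSum = widgets.parallelStream()
+ *                              .mapToInt(Widget::getWeight)
+ *                              .sum();
+ *     // sequentialSum == parallelSum
+ * }</pre>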
  *
  * <p>Most stream operations accept parameters that describe user-specified
@@ -208,9 +212,9 @@
  * variety of data sources, including even non-thread-safe collections such as
  * {@code ArrayList}. This is possible only if we can prevent
  * <em>interference</em> with the data source during the execution of a stream
- * pipeline. (Except for the escape-hatch methods {@code iterator()} and
+ * pipeline.  With the exception of the escape-hatch methods {@code iterator()} and
  * {@code spliterator()}, execution begins when the terminal operation is
- * invoked, and ends when the terminal operation completes.) For most data
+ * invoked, and ends when the terminal operation completes.  For most data
  * sources, preventing interference means ensuring that the data source is
  * <em>not modified at all</em> during the execution of the stream pipeline.
  * The concurrent collections, which are specifically
@@ -257,24 +261,26 @@
  * string: "three". Finally the elements of the stream are collected and joined
  * together. Since the list was modified before the terminal {@code collect}
 * operation commenced, the result will be a string of "one two three". All the
- * streams returned from JDK classes are well-behaved in this manner; for
- * streams generated by other libraries, see
+ * streams returned from JDK collections, and most other JDK classes,
+ * are well-behaved in this manner; for streams generated by other libraries, see
  * <a href="package-summary.html#StreamSources">Low-level stream
  * construction</a> for requirements for building well-behaved streams.
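+ *
+ * <p>(The well-behaved scenario described above has roughly this shape -- a
+ * sketch for context, not the exact example: the source is modified after the
+ * stream is created but before the terminal operation commences.)
+ * <pre>{@code
+ *     List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
+ *     Stream<String> sl = l.stream();
+ *     l.add("three");                        // modification before the terminal operation
+ *     String s = sl.collect(joining(" "));   // "one two three"
+ * }</pre>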
  *
- * ** The example isn't about concurrent sources **
  * <p>Some streams, particularly those whose stream sources are concurrent, can
  * tolerate concurrent modification during execution of a stream pipeline.
- * However, in no case should behavioral parameters to stream operations modify
- * the stream source.  For example, constructions like the following may fail
- * to terminate, produce inaccurate results, or throw {@link java.util.ConcurrentModificationException}:
+ * However, in no case -- even if the stream source is concurrent -- should
+ * behavioral parameters to stream operations modify the stream source.  A
+ * pipeline whose behavioral parameters modify the stream source may fail to
+ * terminate, produce inaccurate results, or throw exceptions.  The following
+ * example shows inappropriate interference with the source:
  * <pre>{@code
+ *     // Don't do this!
 *     List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
 *     Stream<String> sl = l.stream();
 *     String s = sl.peek(e -> l.add("BAD LAMBDA")).collect(joining(" "));
  * }</pre>
- * will fail as the {@code peek} operation will attempt to add the string
- * "BAD LAMBDA" to the source after the terminal operation has commenced.
+ * Here the {@code peek} operation modifies the source while the stream pipeline
+ * is being executed.
  *
  * <h3>Side-effects</h3>
  *
@@ -297,16 +303,16 @@
  * <pre>{@code
  *     ArrayList<String> results = new ArrayList<>();
  *     stream.filter(s -> pattern.matcher(s).matches())
- *           .forEach(s -> results.add(s));  // BAD!  Uses side-effects!
+ *           .forEach(s -> results.add(s));  // Unnecessary use of side-effects!
  * }</pre>
  *
  * This code uses side-effects unnecessarily.  If executed in parallel, the
  * non-thread-safety of {@code ArrayList} would cause incorrect results, and
  * adding needed synchronization would cause contention, undermining the
- * benefit of parallelism.  And, using side-effects here are completely
- * unnecessarily; the {@code forEach()} can be replaced with a reduction
+ * benefit of parallelism.  Moreover, using side-effects here is completely
+ * unnecessary; the {@code forEach()} can simply be replaced with a reduction
  * operation that is safer, more efficient, and more amenable to
- * parallelization.
+ * parallelization:
  *
  * <pre>{@code
 *     List<String> results =
@@ -330,19 +336,20 @@
  * elements in their encounter order; if the source of a stream is a {@code List}
  * containing {@code [1, 2, 3]}, then the result of executing {@code map(x -> x*2)}
  * must be {@code [2, 4, 6]}.  However, if the source has no defined encounter
- * order, then any of the six permutations of the values {@code [2, 4, 6]} would
- * be a valid result.
+ * order, then any permutation of the values {@code [2, 4, 6]} would be a valid
+ * result.
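+ *
+ * <p>A sketch of the contrast (the element values here are arbitrary):
+ * <pre>{@code
+ *     // List source: defined encounter order, result is always [2, 4, 6]
+ *     List<Integer> fromList = Arrays.asList(1, 2, 3).stream()
+ *                                    .map(x -> x * 2)
+ *                                    .collect(Collectors.toList());
+ *
+ *     // HashSet source: no defined encounter order, any permutation is valid
+ *     Set<Integer> ints = new HashSet<>(Arrays.asList(1, 2, 3));
+ *     List<Integer> fromSet = ints.stream()
+ *                                 .map(x -> x * 2)
+ *                                 .collect(Collectors.toList());
+ * }</pre>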
  *
  * <p>For sequential streams, ordering is only relevant to the determinism
- * of operations performed repeatedly on the same source.  (An {@code ArrayList}
- * is constrained to iterate elements in order; a {@code HashSet} is not, and
- * repeated iteration might produce a different order.)
+ * of operations performed repeatedly on the same source.  For example, an
+ * {@code ArrayList} is constrained to iterate elements in order, whereas a
+ * {@code HashSet} is not, and therefore repeated iteration might produce a
+ * different order.
  *
  * <p>For parallel streams, relaxing the ordering constraint can sometimes enable
  * more efficient implementation for some operations.  Certain aggregate operations,
  * such as filtering duplicates ({@code distinct()}) or grouped reductions
- * ({@code Collectors.groupingBy()}) can be performed more efficiently using
- * concurrent data structures rather than merging if ordering of elements
+ * ({@code Collectors.groupingBy()}), can be performed more efficiently (using
+ * concurrent data structures rather than merging) if ordering of elements
  * is not relevant.  Operations that are intrinsically tied to encounter order,
  * such as {@code limit()} or {@code forEachOrdered()}, may require
  * buffering to ensure proper ordering, undermining the benefit of parallelism.
@@ -352,7 +359,7 @@
  * the stream with {@link java.util.stream.BaseStream#unordered()} may result in
  * improved parallel performance for some stateful or terminal operations.
  * However, most stream pipelines, such as the "sum of weight of blocks" example
- * above, can still be efficiently parallelized even under ordering constraints.
+ * above, still parallelize efficiently even under ordering constraints.
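+ *
+ * <p>For instance (a sketch; {@code names} is a hypothetical collection, and
+ * the actual performance effect depends on the source and the implementation),
+ * dropping the ordering constraint before a duplicate-removal step may let a
+ * parallel pipeline use a more efficient strategy:
+ * <pre>{@code
+ *     List<String> distinctNames = names.parallelStream()
+ *                                       .unordered()
+ *                                       .distinct()
+ *                                       .collect(Collectors.toList());
+ * }</pre>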
  *
  * <h2><a name="Reduction">Reduction operations</a></h2>
  *
@@ -360,10 +367,12 @@
  * of input elements and combines them into a single summary result by repeated
  * application of a combining operation, such as finding the sum or maximum of
  * a set of numbers, or accumulating them into a list.  The streams classes have
- * many forms of reduction operations, called
+ * multiple forms of general reduction operations, called
  * {@link java.util.stream.Stream#reduce(java.util.function.BinaryOperator) reduce()}
  * and {@link java.util.stream.Stream#collect(java.util.stream.Collector) collect()},
- * for performing reductions.
+ * as well as multiple specialized reduction forms such as
+ * {@link java.util.stream.IntStream#sum()}, {@link java.util.stream.IntStream#max()},
+ * or {@link java.util.stream.IntStream#count()}.
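+ *
+ * <p>For example (a sketch over an arbitrary range of values), the specialized
+ * {@code sum()} form and the general {@code reduce()} form compute the same result:
+ * <pre>{@code
+ *     int sum = IntStream.rangeClosed(1, 100).sum();
+ *     OptionalInt max = IntStream.rangeClosed(1, 100).max();
+ *     int sumViaReduce = IntStream.rangeClosed(1, 100).reduce(0, Integer::sum);
+ * }</pre>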
  *
  * <p>Of course, such operations can be readily implemented as simple sequential
  * loops, as in:
@@ -378,8 +387,8 @@
  * "more abstract" -- it operates on the stream as a whole rather than individual
  * elements -- but a properly constructed reduce operation is inherently
  * parallelizable, so long as the function(s) used to process the elements
- * have the right characteristics.  (Specifically, operators passed to
- * {@code reduce()} must be <a href="package-summary.html#Associativity">associative</a>.)
+ * are <a href="package-summary.html#Associativity">associative</a> and
+ * <a href="package-summary.html#NonInterfering">stateless</a>.
  * For example, given a stream of numbers for which we want to find the sum, we
  * can write:
  * <pre>{@code
@@ -398,7 +407,8 @@
  *
  * <p>The primitive stream classes, such as {@link java.util.stream.IntStream},
  * have convenience methods for common reductions, such as
- * {@link java.util.stream.IntStream#sum() sum()} and {@link java.util.stream.IntStream#max() max()}.
+ * {@link java.util.stream.IntStream#sum() sum()}
+ * and {@link java.util.stream.IntStream#max() max()}.
  *
 * <p>Reduction parallelizes well because the implementation of {@code reduce()}
  * can operate on subsets of the stream in parallel, and then combine the
@@ -472,7 +482,7 @@
  *     String concatenated = strings.reduce("", String::concat)
  * }</pre>
  *
- * We would get the desired result, and it would even work in parallel.  However,
+ * <p>We would get the desired result, and it would even work in parallel.  However,
  * we might not be happy about the performance!  Such an implementation would do
  * a great deal of string copying, and the run time would be <em>O(n^2)</em> in
  * the number of characters.  A more performant approach would be to accumulate
@@ -480,20 +490,22 @@
  * container for accumulating strings.  We can use the same technique to
  * parallelize mutable reduction as we do with ordinary reduction.
  *
- * <p>The mutable reduction operation is called {@link java.util.stream.Stream#collect(Collector) collect()},
+ * <p>The mutable reduction operation is called
+ * {@link java.util.stream.Stream#collect(Collector) collect()},
  * as it collects together the desired results into a result container such
- * as {@code StringBuilder}. A {@code collect} operation requires three things:
+ * as a {@code Collection} or {@code StringBuilder}.
+ * A {@code collect} operation requires three things:
  * a factory function to construct new instances of the result container, an
  * accumulating function that will incorporate an input element into a result
- * container, and a combining function that can take two result containers and
- * merge their contents.  The form of this is very similar to the general
+ * container, and a combining function that takes two result containers and
+ * merges their contents.  The form of this is very similar to the general
  * form of ordinary reduction:
  * <pre>{@code
  * <R> R collect(Supplier<R> resultFactory,
  *               BiConsumer<R, ? super T> accumulator,
  *               BiConsumer<R, R> combiner);
  * }</pre>
- * As with {@code reduce()}, the benefit of expressing {@code collect} in this
+ * <p>As with {@code reduce()}, a benefit of expressing {@code collect} in this
  * abstract way is that it is directly amenable to parallelization: we can
  * accumulate partial results in parallel and then combine them, so long as the
  * accumulation and combining functions satisfy the appropriate requirements.
@@ -515,29 +527,39 @@
  * or, noting that we have buried a mapping operation inside the accumulator
  * function, more succinctly as:
  * <pre>{@code
- *     ArrayList<String> strings = stream.map(Object::toString)
- *                                       .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
+ *     List<String> strings = stream.map(Object::toString)
+ *                                  .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
  * }</pre>
  * Here, our supplier is just the {@link java.util.ArrayList#ArrayList()
  * ArrayList constructor}, the accumulator adds the stringified element to an
  * {@code ArrayList}, and the combiner simply uses {@link java.util.ArrayList#addAll addAll}
  * to copy the strings from one container into the other.
  *
- * <p>Packaging mutable reductions into a collector has another advantage:
+ * <p>The three aspects of {@code collect} -- supplier, accumulator, and combiner --
+ * are often very tightly coupled, and we can use the abstraction of a
+ * {@link java.util.stream.Collector} to capture all three aspects.
+ * The above example for collecting strings into a {@code List} can be rewritten
+ * using a standard {@code Collector} as:
+ * <pre>{@code
+ *     List<String> strings = stream.map(Object::toString)
+ *                                  .collect(Collectors.toList());
+ * }</pre>
+ *
+ * <p>Packaging mutable reductions into a Collector has another advantage:
  * composability.  The class {@link java.util.stream.Collectors} contains a
  * number of predefined factories for collectors, including some combinators
- * that take one collector and produce a derived collector.  For example, given
+ * that take one collector and produce another collector.  For example, given
  * the following collector that computes the sum of the salaries of a stream of
  * employees:
  *
  * <pre>{@code
  *     Collector<Employee, ?, Integer> summingSalaries
- *         = Collectors.summingInt(Employee::getSalary))
+ *         = Collectors.summingInt(Employee::getSalary);
  * } </pre>
  *
  * If we wanted to create a collector to tabulate the sum of salaries by
  * department, we could reuse {@code summingSalaries} using
- * {@link java.util.stream.Collectors#groupingBy(java.util.function.Function, java.util.stream.Collector)}:
+ * {@link java.util.stream.Collectors#groupingBy(java.util.function.Function, java.util.stream.Collector) groupingBy}:
  *
  * <pre>{@code
  *     Map<Department, Integer> salariesByDept
@@ -545,8 +567,8 @@
  *                                                            summingSalaries));
  * } </pre>
  *
- * <p>As with the regular reduction operation, the ability to parallelize only
- * comes if appropriate conditions are met.  For any partially accumulated result,
+ * <p>As with the regular reduction operation, {@code collect()} operations can
+ * only be parallelized if appropriate conditions are met.  For any partially accumulated result,
  * combining it with an empty result container must produce an equivalent
  * result.  That is, for a partially accumulated result {@code a} that is the
  * result of any series of accumulator and combiner invocations, {@code a} must
@@ -572,17 +594,6 @@
  * but in some cases equivalence may be relaxed to account for differences in
  * order.
  *
- * <p> The three aspects of {@code collect}: supplier, accumulator, and combiner,
- * are often very tightly coupled, and it is convenient to introduce the notion
- * of a {@link java.util.stream.Collector} as being an object that embodies all
- * three aspects. There is a {@link java.util.stream.Stream#collect(Collector) collect}
- * method that simply takes a {@code Collector}. The above example for collecting
- * strings into a {@code List} can be rewritten using a standard {@code Collector} as:
- * <pre>{@code
- *     ArrayList<String> strings = stream.map(Object::toString)
- *                                       .collect(Collectors.toList());
- * }</pre>
- *
  * <h3><a name="ConcurrentReduction">Reduction, Concurrency, and Ordering</a></h3>
  *
  * With some complex reduction operations, for example a {@code collect()} that
@@ -653,6 +664,9 @@
  * So we can evaluate {@code (a op b)} in parallel with {@code (c op d)}, and
  * then invoke {@code op} on the results.
  *
+ * <p>Examples of associative operations include numeric addition, min, and max,
+ * and string concatenation.
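+ *
+ * <p>As a sketch of why this matters (the operand values here are arbitrary),
+ * an associative operator such as {@code String::concat} gives the same result
+ * however the elements are grouped, while a non-associative operator such as
+ * subtraction does not:
+ * <pre>{@code
+ *     // Associative: safe for reduce(), sequential or parallel
+ *     String abc = Stream.of("a", "b", "c").reduce("", String::concat);
+ *
+ *     // Not associative: (a - b) - c != a - (b - c), so the result of a
+ *     // parallel reduction is unpredictable
+ *     OptionalInt bad = IntStream.of(1, 2, 3).parallel().reduce((x, y) -> x - y);
+ * }</pre>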
+ *
  * <h2><a name="StreamSources">Low-level stream construction</a></h2>
  *
  * So far, all the stream examples have used methods like
@@ -707,4 +721,5 @@
  */
 package java.util.stream;
 
-import java.util.function.BinaryOperator;
\ No newline at end of file
+import java.util.function.BinaryOperator;
+import java.util.function.UnaryOperator;