Interface Segments


  • public interface Segments
    Represents the segmentation results produced by passing an input CharSequence to a Segmenter, and provides the APIs for iterating over those results.

    The segmentation results can be provided either as segmentation boundary indices (ints) or as segments, which are represented by the Segment class. Each Segment object can in turn provide the subsequence of the original input that it represents.

    Example:

     Segmenter wordSeg =
         LocalizedSegmenter.builder()
             .setLocale(ULocale.forLanguageTag("de"))
             .setSegmentationType(SegmentationType.WORD)
             .build();
    
     Segments segments = wordSeg.segment("Das 21ste Jahrh. ist das beste.");
    
     List<CharSequence> words = segments.subSequences().collect(Collectors.toList());
     
    See Also:
    Segmenter, Segment
    Status:
    Draft ICU 78.
    • Method Detail

      • subSequences

        default Stream<CharSequence> subSequences()
        Returns a Stream of the CharSequences for all of the segments in the source sequence. Iteration starts at the beginning of the sequence and moves forwards to the end.
        Returns:
        a Stream of the CharSequences of all Segments in the source sequence.
        Status:
        Draft ICU 78.
      • segmentAt

        Segment segmentAt​(int i)
        Returns the segment that contains index i. Containment is inclusive of the start index and exclusive of the limit index.

        Specifically, the containing segment is defined as the segment with start s and limit l such that s ≤ i < l.

        Parameters:
        i - index in the input CharSequence to the Segmenter
        Returns:
        the segment that either starts at or contains index i
        Throws:
        IndexOutOfBoundsException - if i is less than 0 or greater than or equal to the length of the input CharSequence to the Segmenter
        Status:
        Draft ICU 78.
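
        The containment rule above (s ≤ i < l) can be illustrated with a small standalone model. This is only a sketch: Range, fromBoundaries, and the free-standing segmentAt helper are hypothetical names that do not belong to ICU, and the boundary indices are written by hand rather than produced by a Segmenter.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentAtSketch {
    /** Hypothetical stand-in for a Segment: the half-open range [start, limit). */
    static final class Range {
        final int start;
        final int limit;

        Range(int start, int limit) {
            this.start = start;
            this.limit = limit;
        }

        @Override
        public String toString() {
            return "[" + start + "," + limit + ")";
        }
    }

    /** Turns sorted boundary indices into consecutive half-open ranges. */
    static List<Range> fromBoundaries(int... boundaries) {
        List<Range> ranges = new ArrayList<>();
        for (int k = 0; k + 1 < boundaries.length; k++) {
            ranges.add(new Range(boundaries[k], boundaries[k + 1]));
        }
        return ranges;
    }

    /** Models the containment rule: returns the range with start <= i < limit. */
    static Range segmentAt(List<Range> ranges, int inputLength, int i) {
        if (i < 0 || i >= inputLength) {
            throw new IndexOutOfBoundsException(
                "index " + i + " is outside [0, " + inputLength + ")");
        }
        for (Range r : ranges) {
            if (r.start <= i && i < r.limit) {
                return r;
            }
        }
        throw new IllegalStateException("ranges do not cover index " + i);
    }

    public static void main(String[] args) {
        String input = "Das 21ste";
        // Hand-written boundaries (an assumption, not real Segmenter output).
        List<Range> ranges = fromBoundaries(0, 3, 4, 9);

        System.out.println(segmentAt(ranges, input.length(), 2)); // [0,3)
        System.out.println(segmentAt(ranges, input.length(), 3)); // [3,4)
        System.out.println(segmentAt(ranges, input.length(), 4)); // [4,9)
    }
}
```

        In this model, index 3 is the limit of the first range and the start of the second, so it belongs to the second range only. An index equal to the input length belongs to no range and is rejected, matching the Throws clause above.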
      • segments

        default Stream<Segment> segments()
        Returns a Stream of all Segments in the source sequence. Iteration starts with the first Segment and moves forwards until the end of the sequence.

        This is equivalent to segmentsFrom(0).

        Returns:
        a Stream of all Segments in the source sequence.
        Status:
        Draft ICU 78.
      • segmentsFrom

        Stream<Segment> segmentsFrom​(int i)
        Returns a Stream of all Segments in the source sequence where all segment limits l satisfy i < l. Iteration moves forwards.

        This means that the first segment in the stream is the same as what is returned by segmentAt(i).

        The word "from" is used here to mean "at or after", with the semantics of "at" for a Segment defined by segmentAt(int). We cannot describe all of the segments as being "after" i, since the first segment might contain i in the middle, meaning that its start position precedes i.

        segmentsFrom and segmentsBefore(int) create a partitioning of the space of all Segments.

        Parameters:
        i - index in the input CharSequence to the Segmenter
        Returns:
        a Stream of all Segments at or after i
        Status:
        Draft ICU 78.
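
        The partitioning described above can be sketched with plain ranges. This is a model under stated assumptions, not the ICU implementation: Range and the helper methods are hypothetical, the boundaries are hand-written, and segmentsBefore is assumed to select exactly the complement of segmentsFrom (limits l ≤ i), as the partitioning statement implies. The iteration order of segmentsBefore is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class SegmentsFromSketch {
    /** Hypothetical stand-in for a Segment: the half-open range [start, limit). */
    static final class Range {
        final int start;
        final int limit;

        Range(int start, int limit) {
            this.start = start;
            this.limit = limit;
        }

        @Override
        public String toString() {
            return "[" + start + "," + limit + ")";
        }
    }

    /** Turns sorted boundary indices into consecutive half-open ranges. */
    static List<Range> fromBoundaries(int... boundaries) {
        List<Range> ranges = new ArrayList<>();
        for (int k = 0; k + 1 < boundaries.length; k++) {
            ranges.add(new Range(boundaries[k], boundaries[k + 1]));
        }
        return ranges;
    }

    /** Models segmentsFrom: keeps every range whose limit l satisfies i < l. */
    static List<Range> segmentsFrom(List<Range> all, int i) {
        return all.stream().filter(r -> i < r.limit).collect(Collectors.toList());
    }

    /** Assumed complement: keeps every range whose limit l satisfies l <= i. */
    static List<Range> segmentsBefore(List<Range> all, int i) {
        return all.stream().filter(r -> r.limit <= i).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hand-written boundaries (an assumption, not real Segmenter output).
        List<Range> all = fromBoundaries(0, 3, 4, 9);

        System.out.println(segmentsFrom(all, 5));   // [[4,9)]
        System.out.println(segmentsBefore(all, 5)); // [[0,3), [3,4)]
        System.out.println(segmentsFrom(all, 0));   // [[0,3), [3,4), [4,9)]
    }
}
```

        With i = 5, the first range kept by segmentsFrom is [4,9), which starts before i but contains it. With i = 0, every range is kept, which mirrors the equivalence between segments() and segmentsFrom(0).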
      • segmentsBefore