" Unnnn" rather than the Java formats of "\unnnn" or "0xnnnn). For example: Stream linesStream = "Once\nupon\ra\r\ntime".lines() linesStream.forEach(System.The question asked for a function to convert a string value representing a Unicode code point (i.e. Stream lines = “someStringWithLineTerminators”.lines()Ī line terminator is one of the following: - a line feed character “ \n” (U 000A), - a carriage return character “ \r” (U 000D), or - a carriage return followed immediately by a line feed “ \r\n” (U 000D U 000A). That is the only reason you would want to use the codePoints() method - to represent a supplementary character as a single code point. forEach(c -> (Character.toChars(c))) //prints: ? forEach(c -> ((char)c)) //prints: dePoints(). forEach(System.out::println) //prints: 128512 dePoints(). When produced by the codePoints() method, supplementary characters represented by Unicode surrogate pairs are merged into a single code point: dePoints(). IntStream intStream = “someString”.codePoints() For such cases - if your application has to process supplementary characters too - use method codePoints() instead. forEach(c -> (Character.toChars(c))) //prints: ?Īs you can see, the information about the character as a whole is lost. forEach(c -> ((char)c)) //prints: ? supplString.chars(). forEach(System.out::println) //prints: 55357 56832 supplString.chars(). The chars() method treats them as two different code points: String supplString = new String( Character.toChars(0x1F600) ) supplString.chars(). The supplementary characters are not appropriate for processing in a stream created by the chars() method because they are represented as a pair of char values, the first - from the high-surrogates range, (\uD800-\uDBFF), the second - from the low-surrogates range (\uDC00-\uDFFF). The character 0x1F600 is an example of an emoji called “grinning face”: String supplString = new String(Character.toChars(0x1F600)) ("\n" supplString) //prints: ? For example: ('a' > Character.MIN_SUPPLEMENTARY_CODE_POINT) //false (0x1F600 > Character.MIN_SUPPLEMENTARY_CODE_POINT) //true You can find if the character is a supplementary one by comparing it with Character.MIN_SUPPLEMENTARY_CODE_POINT. They are called supplementary characters and they are greater than U FFFF (such as emoji, for example). There are also characters that are not appropriate for processing in a stream created by the chars() method. For the majority of mainstream applications - those that do not process a significant number of characters - the performance gain does not justify the clarity of the code, which going to be less human-readable if all the characters are processed as integers. It does not mean though that one has to process characters this way all the time. This example is very simplistic, but I hope it makes the point: processing characters as integers - without converting them to Character - can be more efficient than processing them as objects or boxing/unboxing them unnecessarily. filter(c -> c (c)) //prints: 68 70 Arrays.stream(chars). For example, the following code selects only the lower-case letters, taking advantage of the fact that the upper-case Latin letters are listed in the character set before the lower-case Latin letters: int chars = "abcDeF".chars(). Using code points you can process the characters without boxing them into the Character objects, thus improving performance, which can be significant when the number of processed characters is big enough. forEach(System.out::println) //prints: a b c To demonstrate that the emitted values are the expected ‘a’, ‘b’, and ‘c’, we can cast the emitted values back to char as follows: IntStream intStream = "abc".chars() intStream.mapToObj(c -> (char)c). For example: IntStream intStream = "abc".chars() intStream.forEach(System.out::println) //prints: 97 98 99 The created IntStream emits code points (integer char values) of the characters. IntStream intStream = “someString”.chars() The lines() method creates a stream of lines extracted from this string, separated by line terminators. The chars() and codePoints() methods create a stream of code points of the characters that compose the string. Create from String using chars(), codePoints(), and lines()Ĭlass String has the following methods that create streams: - IntStream chars() - IntStream codePoints() - Stream lines()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |