Summarizing and Grouping with Java 8 Streams: It's Not That Hard

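Foreword
In earlier articles you have already seen me use collect(Collectors.toList()) to gather the final data into a List.
But a stream can also be collected into an Integer, a Map, a Set, and so on.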
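As a rough illustration of those other targets, the sketch below collects into a Set, a Map, and an Integer. The Student class and the sample data here are hypothetical stand-ins (the article's real class is not shown), and the method names are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectTargetsDemo {
    // Hypothetical, trimmed-down Student; the article's real class is not shown.
    static class Student {
        private final String name;
        private final String school;
        private final int age;
        private final double score;

        Student(String name, String school, int age, double score) {
            this.name = name;
            this.school = school;
            this.age = age;
            this.score = score;
        }

        String getName() { return name; }
        String getSchool() { return school; }
        int getAge() { return age; }
        double getScore() { return score; }
    }

    // Illustrative data only, not the article's list.
    static List<Student> sample() {
        return Arrays.asList(
                new Student("A", "UniA", 18, 98.0),
                new Student("B", "UniA", 18, 91.0),
                new Student("D", "UniB", 18, 76.0));
    }

    // Set: distinct school names.
    static Set<String> schools(List<Student> students) {
        return students.stream().map(Student::getSchool).collect(Collectors.toSet());
    }

    // Map: name -> score. toMap throws IllegalStateException on duplicate keys.
    static Map<String, Double> scoreByName(List<Student> students) {
        return students.stream().collect(Collectors.toMap(Student::getName, Student::getScore));
    }

    // Integer: sum of all ages.
    static Integer totalAge(List<Student> students) {
        return students.stream().collect(Collectors.summingInt(Student::getAge));
    }

    public static void main(String[] args) {
        List<Student> students = sample();
        System.out.println(schools(students));
        System.out.println(scoreByName(students));
        System.out.println(totalAge(students));
    }
}
```

Only the collector passed to collect changes between the three methods; the stream pipeline itself stays the same.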
I. Finding the Maximum and Minimum in a Stream
// Needs java.util.* (ArrayList, Comparator, List, Optional) and java.util.stream.Collectors.
// Student is assumed to have a (name, school, age, score) constructor, getters, and a Lombok-style toString().
static List<Student> students = new ArrayList<>();

static {
    students.add(new Student("學生A", "大學A", 18, 98.0));
    students.add(new Student("學生B", "大學A", 18, 91.0));
    students.add(new Student("學生C", "大學A", 18, 90.0));
    students.add(new Student("學生D", "大學B", 18, 76.0));
    students.add(new Student("學生E", "大學B", 18, 91.0));
    students.add(new Student("學生F", "大學B", 19, 65.0));
    students.add(new Student("學生G", "大學C", 20, 80.0));
    students.add(new Student("學生H", "大學C", 21, 78.0));
    students.add(new Student("學生I", "大學C", 20, 67.0));
    students.add(new Student("學生J", "大學D", 22, 87.0));
}

public static void main(String[] args) {
    // maxBy/minBy return an Optional because the stream could be empty.
    // Comparator.comparingInt avoids the overflow risk of subtracting ints by hand.
    Optional<Student> collect1 = students.stream().collect(Collectors.maxBy(Comparator.comparingInt(Student::getAge)));
    Optional<Student> collect2 = students.stream().collect(Collectors.minBy(Comparator.comparingInt(Student::getAge)));
    // get() is safe here only because the list is known to be non-empty.
    Student max = collect1.get();
    Student min = collect2.get();
    System.out.println("Oldest student ==> " + max);
    System.out.println("Youngest student ==> " + min);
    /**
     * Oldest student ==> Student(name=學生J, school=大學D, age=22, score=87.0)
     * Youngest student ==> Student(name=學生A, school=大學A, age=18, score=98.0)
     */
}
Optional is a container that may or may not hold a value. It is what people often call Java 8's elegant way of handling null checks.
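To make the "may or may not hold a value" point concrete, here is a minimal sketch that unwraps the Optional with map and orElse instead of get(), so an empty list does not throw. The Student class, the sample data, and the oldestName helper are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class OptionalMaxDemo {
    // Hypothetical minimal Student with only the fields this sketch needs.
    static class Student {
        private final String name;
        private final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Illustrative data only.
    static List<Student> sample() {
        return Arrays.asList(new Student("A", 18), new Student("J", 22), new Student("G", 20));
    }

    // Name of the oldest student, or a fallback string when the list is empty.
    static String oldestName(List<Student> students) {
        Optional<Student> oldest = students.stream()
                .collect(Collectors.maxBy(Comparator.comparingInt(Student::getAge)));
        // map/orElse unwrap the value without risking NoSuchElementException from get().
        return oldest.map(Student::getName).orElse("(no students)");
    }

    public static void main(String[] args) {
        System.out.println(oldestName(sample()));                          // prints J
        System.out.println(oldestName(Collections.<Student>emptyList())); // prints (no students)
    }
}
```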
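Another common reduction that returns a single value is summing a numeric field across the objects in a stream, or perhaps computing an average. These are called summarization operations. Let's look at how to express them with collectors.
II. Summarization
The Collectors class provides several factory methods dedicated to summarization, such as summingInt, summingLong, and summingDouble, along with the summarizingInt, summarizingLong, and summarizingDouble family.

Beyond these there are also averagingDouble for averages, counting for element counts, and so on.
For now, let's use summingDouble and summarizingDouble as examples.
The sample data is still the student list above.

Task: compute the total score and the average score of all students.

1. First, with summingDouble and averagingDouble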
Double summingScore = students.stream().collect(Collectors.summingDouble(Student::getScore));
Double averagingScore = students.stream().collect(Collectors.averagingDouble(Student::getScore));
System.out.println("Total score ==> " + summingScore);
System.out.println("Average score ==> " + averagingScore);
/**
* Total score ==> 823.0
* Average score ==> 82.3
*/
2. With summarizingDouble
It is more comprehensive: a single pass yields all the related summary statistics.
DoubleSummaryStatistics summarizingDouble = students.stream().collect(Collectors.summarizingDouble(Student::getScore));

double sum = summarizingDouble.getSum();
long count = summarizingDouble.getCount();
double average = summarizingDouble.getAverage();
double max = summarizingDouble.getMax();
double min = summarizingDouble.getMin();
System.out.println("sum==>"+sum);
System.out.println("count==>"+count);
System.out.println("average==>"+average);
System.out.println("max==>"+max);
System.out.println("min==>"+min);
/**
* sum==>823.0
* count==>10
* average==>82.3
* max==>98.0
* min==>65.0
*/
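As you can see, one call can do it all, or you can split things up and pick the API that fits your needs. Which to use depends on the scenario.
III. Joining Strings
Collectors.joining concatenates all the strings in a stream into a single string. The elements must already be CharSequence values, so objects need to be mapped to strings first (for example with getName or toString).
On its own the no-argument version is of limited use, but Java has a follow-up: an overload that accepts a delimiter to place between elements.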

String studentsName = students.stream().map(student -> student.getName()).collect(Collectors.joining());
System.out.println(studentsName);
String studentsName2 = students.stream().map(student -> student.getName()).collect(Collectors.joining(","));
System.out.println(studentsName2);
/**
* 學生A學生B學生C學生D學生E學生F學生G學生H學生I學生J
* 學生A,學生B,學生C,學生D,學生E,學生F,學生G,學生H,學生I,學生J
*/
Printing objects:
// For printing whole objects, though, I personally find it decent enough
String collect = students.stream().map(student -> student.toString()).collect(Collectors.joining(","));
System.out.println(collect);
System.out.println(students);
/**
* Student(name=學生A, school=大學A, age=18, score=98.0),Student(name=學生B, school=大學A, age=18, score=91.0),Student(name=學生C, school=大學A, age=18, score=90.0),Student(name=學生D, school=大學B, age=18, score=76.0),Student(name=學生E, school=大學B, age=18, score=91.0)....
* [Student(name=學生A, school=大學A, age=18, score=98.0), Student(name=學生B, school=大學A, age=18, score=91.0), Student(name=學生C, school=大學A, age=18, score=90.0), Student(name=學生D, school=大學B, age=18, score=76.0)..)]
*/
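There are other API usages I haven't covered here. Feel free to try them out yourself; you will absorb them far faster that way than by reading this article.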
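One overload worth trying is the three-argument Collectors.joining(delimiter, prefix, suffix), which also wraps the result. A minimal sketch using plain strings; the bracketed helper is a made-up name for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class JoiningDemo {
    // Joins names with ", " and wraps the whole result in square brackets.
    static String bracketed(List<String> names) {
        return names.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(bracketed(Arrays.asList("A", "B", "C"))); // prints [A, B, C]
        // With no elements, only the prefix and suffix remain.
        System.out.println(bracketed(Collections.<String>emptyList())); // prints []
    }
}
```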
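IV. Grouping
Just like GROUP BY statistics in a database.
1. Grouping
As an example, suppose I want to find out which students each school has.
Without streams I would need a Map<String, List<Student>> to hold the result. While looping, I would have to check each student's school name, see whether a List for that school already exists, add the student to it, and finally put the list into the map.
That looks tedious, but with a stream it becomes a single line of code, and Java takes care of the bookkeeping internally.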
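For comparison, the manual loop described above might look roughly like the sketch below, shown next to the groupingBy version. The Student class and sample data are hypothetical stand-ins, and the method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingLoopDemo {
    // Hypothetical minimal Student with only the fields this sketch needs.
    static class Student {
        private final String name;
        private final String school;

        Student(String name, String school) {
            this.name = name;
            this.school = school;
        }

        String getName() { return name; }
        String getSchool() { return school; }
    }

    // Illustrative data only.
    static List<Student> sample() {
        return Arrays.asList(
                new Student("A", "UniA"),
                new Student("B", "UniA"),
                new Student("D", "UniB"));
    }

    // The hand-written version: look up the school's list, create it if missing, then add.
    static Map<String, List<Student>> groupManually(List<Student> students) {
        Map<String, List<Student>> bySchool = new HashMap<>();
        for (Student s : students) {
            List<Student> group = bySchool.get(s.getSchool());
            if (group == null) {
                group = new ArrayList<>();
                bySchool.put(s.getSchool(), group);
            }
            group.add(s);
        }
        return bySchool;
    }

    // The stream version: the classifier function supplies the map key.
    static Map<String, List<Student>> groupWithStream(List<Student> students) {
        return students.stream().collect(Collectors.groupingBy(Student::getSchool));
    }

    public static void main(String[] args) {
        List<Student> students = sample();
        // Both approaches produce the same grouping for the same input.
        System.out.println(groupManually(students).equals(groupWithStream(students))); // prints true
    }
}
```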
// To see which students belong to each school, this one line is all it takes
Map<String, List<Student>> collect = students.stream().collect(Collectors.groupingBy(Student::getSchool));
System.out.println(collect);
/**
* {大學B=[Student(name=學生D, school=大學B, age=18, score=76.0), Student(name=學生E, school=大學B, age=18, score=91.0), Student(name=學生F, school=大學B, age=19, score=65.0)],
* 大學A=[Student(name=學生A, school=大學A, age=18, score=98.0), Student(name=學生B, school=大學A, age=18, score=91.0), Student(name=學生C, school=大學A, age=18, score=90.0)],
* 大學D=[Student(name=學生J, school=大學D, age=22, score=87.0)],
* 大學C=[Student(name=學生G, school=大學C, age=20, score=80.0), Student(name=學生H, school=大學C, age=21, score=78.0), Student(name=學生I, school=大學C, age=20, score=67.0)]}
*/
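This is genuinely useful and convenient at times.
But often we want more than that. Suppose I want to split the students into those older than 20 and those aged 20 or under. The classifier is then no longer Student::getSchool but a lambda with statements. How do we write it?
// Group students into those older than 20 and those aged 20 or under
Map<String, List<Student>> collect = students.stream().collect(Collectors.groupingBy(student -> {
    if (student.getAge() > 20) {
        return "over 20";
    }
    return "20 or under";
}));
System.out.println(collect);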

To break this down by school as well, one option is to change the return statements so that the key combines the school name with the age bracket.

// Group by school and age bracket using a combined key
Map<String, List<Student>> collect = students.stream().collect(Collectors.groupingBy(student -> {
    if (student.getAge() > 20) {
        return student.getSchool() + " / over 20";
    }
    return student.getSchool() + " / 20 or under";
}));
System.out.println(collect);
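As you have probably noticed, the value returned inside groupingBy becomes the key of the Map.
2. Multi-level Grouping
groupingBy() does not stand alone, though; it has siblings in the form of overloads.

Let's rework the example above:
for each school, I want to know which students are older than 20 and which are aged 20 or under.
The data structure should then be Map<String, Map<String, List<Student>>>: the first level is keyed by school name, and the second level holds the over-20 and 20-or-under groups.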
Map<String, Map<String, List<Student>>> collect = students.stream().collect(Collectors.groupingBy(Student::getSchool, Collectors.groupingBy(student -> {
    if (student.getAge() > 20) {
        return "over 20";
    }
    return "20 or under";
})));
System.out.println(collect);
/**
* {大學B={20 or under=[Student(name=學生D, school=大學B, age=18, score=76.0),Student(name=學生E, school=大學B, age=18, score=91.0), Student(name=學生F, school=大學B, age=19, score=65.0)]},
* 大學A={20 or under=[Student(name=學生A, school=大學A, age=18, score=98.0), Student(name=學生B, school=大學A, age=18, score=91.0), Student(name=學生C, school=大學A, age=18, score=90.0)]},
* 大學D={over 20=[Student(name=學生J, school=大學D, age=22, score=87.0)]},
* 大學C={over 20=[Student(name=學生H, school=大學C, age=21, score=78.0)],20 or under=[Student(name=學生G, school=大學C, age=20, score=80.0), Student(name=學生I, school=大學C, age=20, score=67.0)]}}
*/
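The trick here is passing an inner groupingBy to the outer groupingBy, nesting one inside the other like matryoshka dolls.
The keys of the outer Map are the values produced by the first-level classifier, and each value is itself a Map whose keys are the values produced by the second-level classifier.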
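The groupingBy family also includes a three-argument overload that takes a map factory between the classifier and the downstream collector. The sketch below is one way to combine it with nesting, using TreeMap so the school keys come out sorted. The Student class, the sample data, and the method name are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class GroupingOverloadsDemo {
    // Hypothetical minimal Student with only the fields this sketch needs.
    static class Student {
        private final String school;
        private final int age;

        Student(String school, int age) {
            this.school = school;
            this.age = age;
        }

        String getSchool() { return school; }
        int getAge() { return age; }
    }

    // Illustrative data only, deliberately listed with UniC before UniA.
    static List<Student> sample() {
        return Arrays.asList(
                new Student("UniC", 21),
                new Student("UniC", 20),
                new Student("UniA", 18),
                new Student("UniA", 18));
    }

    static Map<String, Map<String, Long>> countBySchoolAndBracket(List<Student> students) {
        // Two-argument overload: classifier plus a downstream collector (here, counting).
        Collector<Student, ?, Map<String, Long>> perBracket = Collectors.groupingBy(
                s -> s.getAge() > 20 ? "over 20" : "20 or under",
                Collectors.counting());
        // Three-argument overload: classifier, map factory, downstream collector.
        // TreeMap keeps the outer keys in sorted order.
        return students.stream()
                .collect(Collectors.groupingBy(Student::getSchool, TreeMap::new, perBracket));
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> result = countBySchoolAndBracket(sample());
        System.out.println(result.keySet()); // prints [UniA, UniC]
        System.out.println(result.get("UniC").get("over 20")); // prints 1
    }
}
```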
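3. Collecting Data in Subgroups
Among the groupingBy overloads, the second parameter is not restricted to another groupingBy. More abstractly, it can be any collector.
Suppose the example changes again:

I now want to know how many students each school has.

Should the data structure then be
Map<String, Long> or Map<String, Integer>? Collectors.counting() produces a Long, so it is Map<String, Long>.
How do we implement it?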
Map<String, Long> collect = students.stream().collect(Collectors.groupingBy(Student::getSchool, Collectors.counting()));
System.out.println(collect);
/**
* {大學B=3, 大學A=3, 大學D=1, 大學C=3}
*/
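The example above counts every student per school. If the goal were instead to count only students of one particular age, say 20, one option is to filter before grouping. A minimal sketch, assuming a hypothetical trimmed-down Student and illustrative data; note that schools with no matching student simply do not appear in the result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FilteredCountDemo {
    // Hypothetical minimal Student with only the fields this sketch needs.
    static class Student {
        private final String school;
        private final int age;

        Student(String school, int age) {
            this.school = school;
            this.age = age;
        }

        String getSchool() { return school; }
        int getAge() { return age; }
    }

    // Illustrative data only.
    static List<Student> sample() {
        return Arrays.asList(
                new Student("UniA", 18),
                new Student("UniC", 20),
                new Student("UniC", 21),
                new Student("UniC", 20));
    }

    // Counts, per school, only the students whose age equals the given value.
    static Map<String, Long> countAgedBySchool(List<Student> students, int age) {
        return students.stream()
                .filter(s -> s.getAge() == age)
                .collect(Collectors.groupingBy(Student::getSchool, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countAgedBySchool(sample(), 20)); // prints {UniC=2}
    }
}
```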

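There is actually much more that hasn't been discussed. Everything here is very simple usage, and I have not gone into the order in which stream operations execute or the underlying principles. Get hands-on with it first.

Afterword

I have only described some fairly simple, practical operations here and have not touched on design ideas, but keep in mind that those are what is really worth reading and understanding.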