Spring Boot Integrating the Tablesaw Plug-in Quick Start Demo

HBLOG
11 min readJun 5, 2024

--

1. What is Tablesaw?

Tablesaw is a data visualization library for Java that consists of two parts:

  • The data parsing library, which is mainly used to load data, manipulate data (transformation, filtering, summarization, etc.), analogous to the Pandas library in Python;
  • A data visualization library that converts target data into visual charts, analogous to the Matplotlib library in Python.

Unlike Pandas, tables in Tablesaw are based on columns, so most operations are based on columns. Of course, it also includes some functions that operate on rows, but the functions are relatively limited

Tablesaw directory description:

  1. aggregate: the parent project of Maven, which mainly defines the configuration of project packaging.
  2. Beakerx: The registry for the Tablesaw library, the main registry and columns.
  3. core: The core code of the Tablesaw library, which is mainly used for data processing operations: data appending, sorting, grouping, querying, etc.
  4. data: the directory of project test data.
  5. docs: The directory of the project’s MarkDown documentation.
  6. docs-src: the source code directory of project documents, which is mainly used to generate MarkDown documents.
  7. Excel: A subproject that parses the data in an Excel file.
  8. HTML: A subproject that parses HTML file data.
  9. JSON: A subproject that parses data in a JSON file.
  10. jsplot: a sub-project of data visualization, which is mainly used to load data and generate visualization charts.
  11. saw: a sub-item of tablesaw that reads and writes chart data.

2. Environmental preparation

mysql setup

docker run --name docker-mysql-5.7 -e MYSQL_ROOT_PASSWORD=123456 -p 3306:3306 -d mysql:5.7

init data

create database demo;
create table user_info
(
user_id varchar(64) not null primary key,
username varchar(100) null ,
age int(3) null ,
gender tinyint(1) null ,
remark varchar(255) null ,
create_time datetime null ,
create_id varchar(64) null ,
update_time datetime null ,
update_id varchar(64) null ,
enabled tinyint(1) default 1 null
);
INSERT INTO demo.user_info
(user_id, username, age, gender, remark, create_time, create_id, update_time, update_id, enabled)
VALUES('1', '1', 1, 1, '1', NULL, '1', NULL, NULL, 1);

remark

msyql username:root
mysql password:123456

3. Code engineering

Objectives:

Two-dimensional data is processed and visualized using Tablesaw

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>springboot-demo</artifactId>
<groupId>com.et</groupId>
<version>1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>tablesaw</artifactId>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-autoconfigure</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>tech.tablesaw</groupId>
<artifactId>tablesaw-core</artifactId>
<version>0.43.1</version>
</dependency>
<dependency>
<groupId>tech.tablesaw</groupId>
<artifactId>tablesaw-jsplot</artifactId>
<version>0.43.1</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<!--mysql-->
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.29</version>
</dependency>
</dependencies>
</project>

application.yaml

server:
port: 8088
spring:
datasource:
driver-class-name: com.mysql.cj.jdbc.Driver
type: com.zaxxer.hikari.HikariDataSource
url: jdbc:mysql://127.0.0.1:3306/demo?useUnicode=true&characterEncoding=utf-8&useSSL=false
username: root
password: 123456

Read CSV data

@Before
public void before() throws IOException {
log.info("init some data");
tornadoes = Table.read().csv("/Users/liuhaihua/IdeaProjects/springboot-demo/tablesaw/src/main/resources/data/tornadoes_1950-2014.csv");
}

Print the column name

@Test
public void columnNames() throws IOException {
System.out.println(tornadoes.columnNames());
}

outcome

[Date, Time, State, State No, Scale, Injuries, Fatalities, Start Lat, Start Lon, Length, Width]

Check out the shape

@Test
public void shape() throws IOException {
System.out.println(tornadoes.shape());
}

outcome

tornadoes_1950-2014.csv: 59945 rows X 11 cols

View the table structure

@Test
public void structure() throws IOException {
System.out.println(tornadoes.structure().printAll());
}

outcome

2024-06-05 21:13:01.297 INFO 41685 --- [ main] com.et.tablesaw.DemoTests : init some data
Structure of tornadoes_1950-2014.csv
Index | Column Name | Column Type |
-----------------------------------------
0 | Date | LOCAL_DATE |
1 | Time | LOCAL_TIME |
2 | State | STRING |
3 | State No | INTEGER |
4 | Scale | INTEGER |
5 | Injuries | INTEGER |
6 | Fatalities | INTEGER |
7 | Start Lat | DOUBLE |
8 | Start Lon | DOUBLE |
9 | Length | DOUBLE |
10 | Width | INTEGER |
2024-06-05 21:13:02.907 INFO 41685 --- [ main] com.et.tablesaw.DemoTests : clean some data

Review the data

@Test
public void show() throws IOException {
System.out.println(tornadoes);
}

outcome

2024-06-05 21:14:52.181 INFO 41732 --- [ main] com.et.tablesaw.DemoTests : init some data
tornadoes_1950-2014.csv
Date | Time | State | State No | Scale | Injuries | Fatalities | Start Lat | Start Lon | Length | Width |
-----------------------------------------------------------------------------------------------------------------------------------------
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 9.5 | 150 |
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 6.2 | 150 |
1950-01-03 | 11:10:00 | IL | 1 | 3 | 0 | 0 | 38.82 | -90.12 | 3.3 | 100 |
1950-01-03 | 11:55:00 | IL | 2 | 3 | 3 | 0 | 39.1 | -89.3 | 3.6 | 130 |
1950-01-03 | 16:00:00 | OH | 1 | 1 | 1 | 0 | 40.88 | -84.58 | 0.1 | 10 |
1950-01-13 | 05:25:00 | AR | 1 | 3 | 1 | 1 | 34.4 | -94.37 | 0.6 | 17 |
1950-01-25 | 19:30:00 | MO | 2 | 2 | 5 | 0 | 37.6 | -90.68 | 2.3 | 300 |
1950-01-25 | 21:00:00 | IL | 3 | 2 | 0 | 0 | 41.17 | -87.33 | 0.1 | 100 |
1950-01-26 | 18:00:00 | TX | 1 | 2 | 2 | 0 | 26.88 | -98.12 | 4.7 | 133 |
1950-02-11 | 13:10:00 | TX | 2 | 2 | 0 | 0 | 29.42 | -95.25 | 9.9 | 400 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2014-12-23 | 14:46:00 | MS | 0 | 0 | 0 | 0 | 31.4247 | -89.5358 | 0.24 | 50 |
2014-12-23 | 15:22:00 | MS | 0 | 2 | 0 | 2 | 31.6986 | -89.2239 | 5.89 | 350 |
2014-12-23 | 16:27:00 | MS | 0 | 0 | 0 | 0 | 32.0005 | -88.4854 | 0.49 | 100 |
2014-12-24 | 09:30:00 | NC | 0 | 1 | 1 | 0 | 34.3124 | -77.917 | 0.2 | 25 |
2014-12-24 | 11:50:00 | GA | 0 | 1 | 0 | 0 | 31.406 | -82.251 | 5.69 | 100 |
2014-12-24 | 16:16:00 | OH | 0 | 0 | 0 | 0 | 39.728 | -82.634 | 0.63 | 30 |
2014-12-27 | 13:12:00 | TX | 0 | 0 | 0 | 0 | 30.3853 | -94.7823 | 0.23 | 75 |
2014-12-27 | 14:00:00 | TX | 0 | 1 | 0 | 0 | 30.6829 | -94.2553 | 4.25 | 50 |
2014-12-27 | 14:35:00 | TX | 0 | 1 | 0 | 0 | 30.864 | -93.979 | 1.98 | 150 |
2014-12-29 | 10:26:00 | GA | 0 | 2 | 9 | 0 | 30.8157 | -83.2833 | 0.74 | 180 |
2024-06-05 21:14:54.430 INFO 41732 --- [ main] com.et.tablesaw.DemoTests : clean some data

Table structure filtering

@Test
public void structurefilter() throws IOException {
System.out.println( tornadoes
.structure()
.where(tornadoes.structure().stringColumn("Column Type").isEqualTo("DOUBLE")));
}

outcome

2024-06-05 21:16:12.489 INFO 41766 --- [ main] com.et.tablesaw.DemoTests : init some data
Structure of tornadoes_1950-2014.csv
Index | Column Name | Column Type |
-----------------------------------------
7 | Start Lat | DOUBLE |
8 | Start Lon | DOUBLE |
9 | Length | DOUBLE |
2024-06-05 21:16:14.104 INFO 41766 --- [ main] com.et.tablesaw.DemoTests : clean some data

Preview the data

@Test
public void previewdata() throws IOException {
System.out.println(tornadoes.first(3));
}

outcome

2024-06-05 21:17:21.268 INFO 41798 --- [ main] com.et.tablesaw.DemoTests : init some data
tornadoes_1950-2014.csv
Date | Time | State | State No | Scale | Injuries | Fatalities | Start Lat | Start Lon | Length | Width |
-----------------------------------------------------------------------------------------------------------------------------------------
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 9.5 | 150 |
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 6.2 | 150 |
1950-01-03 | 11:10:00 | IL | 1 | 3 | 0 | 0 | 38.82 | -90.12 | 3.3 | 100 |
2024-06-05 21:17:23.206 INFO 41798 --- [ main] com.et.tablesaw.DemoTests : clean some data

Column operations

@Test
public void ColumnOperate() throws IOException {
StringColumn month = tornadoes.dateColumn("Date").month();
tornadoes.addColumns(month);
System.out.println(tornadoes.first(3));
tornadoes.removeColumns("State No");
System.out.println(tornadoes.first(3));
}

outcome

2024-06-05 21:20:33.554 INFO 41879 --- [ main] com.et.tablesaw.DemoTests : init some data
tornadoes_1950-2014.csv
Date | Time | State | State No | Scale | Injuries | Fatalities | Start Lat | Start Lon | Length | Width | Date month |
--------------------------------------------------------------------------------------------------------------------------------------------------------
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 9.5 | 150 | JANUARY |
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 6.2 | 150 | JANUARY |
1950-01-03 | 11:10:00 | IL | 1 | 3 | 0 | 0 | 38.82 | -90.12 | 3.3 | 100 | JANUARY |
tornadoes_1950-2014.csv
Date | Time | State | Scale | Injuries | Fatalities | Start Lat | Start Lon | Length | Width | Date month |
-------------------------------------------------------------------------------------------------------------------------------------------
1950-01-03 | 11:00:00 | MO | 3 | 3 | 0 | 38.77 | -90.22 | 9.5 | 150 | JANUARY |
1950-01-03 | 11:00:00 | MO | 3 | 3 | 0 | 38.77 | -90.22 | 6.2 | 150 | JANUARY |
1950-01-03 | 11:10:00 | IL | 3 | 0 | 0 | 38.82 | -90.12 | 3.3 | 100 | JANUARY |
2024-06-05 21:20:37.052 INFO 41879 --- [ main] com.et.tablesaw.DemoTests : clean some data

sort

@Test
public void sort() throws IOException {
tornadoes.sortOn("-Fatalities");
System.out.println(tornadoes.first(20));
}

outcome

2024-06-05 21:23:22.978 INFO 41945 --- [ main] com.et.tablesaw.DemoTests : init some data
tornadoes_1950-2014.csv
Date | Time | State | State No | Scale | Injuries | Fatalities | Start Lat | Start Lon | Length | Width |
-----------------------------------------------------------------------------------------------------------------------------------------
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 9.5 | 150 |
1950-01-03 | 11:00:00 | MO | 1 | 3 | 3 | 0 | 38.77 | -90.22 | 6.2 | 150 |
1950-01-03 | 11:10:00 | IL | 1 | 3 | 0 | 0 | 38.82 | -90.12 | 3.3 | 100 |
1950-01-03 | 11:55:00 | IL | 2 | 3 | 3 | 0 | 39.1 | -89.3 | 3.6 | 130 |
1950-01-03 | 16:00:00 | OH | 1 | 1 | 1 | 0 | 40.88 | -84.58 | 0.1 | 10 |
1950-01-13 | 05:25:00 | AR | 1 | 3 | 1 | 1 | 34.4 | -94.37 | 0.6 | 17 |
1950-01-25 | 19:30:00 | MO | 2 | 2 | 5 | 0 | 37.6 | -90.68 | 2.3 | 300 |
1950-01-25 | 21:00:00 | IL | 3 | 2 | 0 | 0 | 41.17 | -87.33 | 0.1 | 100 |
1950-01-26 | 18:00:00 | TX | 1 | 2 | 2 | 0 | 26.88 | -98.12 | 4.7 | 133 |
1950-02-11 | 13:10:00 | TX | 2 | 2 | 0 | 0 | 29.42 | -95.25 | 9.9 | 400 |
1950-02-11 | 13:50:00 | TX | 3 | 3 | 12 | 1 | 29.67 | -95.05 | 12 | 1000 |
1950-02-11 | 21:00:00 | TX | 4 | 2 | 5 | 0 | 32.35 | -95.2 | 4.6 | 100 |
1950-02-11 | 23:55:00 | TX | 5 | 2 | 6 | 0 | 32.98 | -94.63 | 4.5 | 67 |
1950-02-12 | 00:30:00 | TX | 6 | 2 | 8 | 1 | 33.33 | -94.42 | 8 | 833 |
1950-02-12 | 01:15:00 | TX | 7 | 1 | 0 | 0 | 32.08 | -98.35 | 2.3 | 233 |
1950-02-12 | 06:10:00 | TX | 8 | 2 | 0 | 0 | 31.52 | -96.55 | 3.4 | 100 |
1950-02-12 | 11:57:00 | TX | 9 | 1 | 32 | 0 | 31.8 | -94.2 | 7.7 | 100 |
1950-02-12 | 12:00:00 | TX | 10 | 3 | 15 | 3 | 31.8 | -94.2 | 1.9 | 50 |
1950-02-12 | 12:00:00 | MS | 2 | 1 | 0 | 0 | 34.6 | -89.12 | 2 | 10 |
1950-02-12 | 12:00:00 | MS | 1 | 2 | 2 | 3 | 34.6 | -89.12 | 0.1 | 10 |
2024-06-05 21:23:24.864 INFO 41945 --- [ main] com.et.tablesaw.DemoTests : clean some data

summary

@Test
public void summary() throws IOException {
System.out.println( tornadoes.column("Fatalities").summary().print());
}

outcome

2024-06-05 21:24:17.166 INFO 41963 --- [ main] com.et.tablesaw.DemoTests : init some data
Column: Fatalities
Measure | Value |
------------------------------------
Count | 59945 |
sum | 6802 |
Mean | 0.11347068145800349 |
Min | 0 |
Max | 158 |
Range | 158 |
Variance | 2.901978053261765 |
Std. Dev | 1.7035193140266314 |
2024-06-05 21:24:19.131 INFO 41963 --- [ main] com.et.tablesaw.DemoTests : clean some data

Data filtering

@Test
public void filter() throws IOException {
Table result = tornadoes.where(tornadoes.intColumn("Fatalities").isGreaterThan(0));
result = tornadoes.where(result.dateColumn("Date").isInApril());
result =
tornadoes.where(
result
.intColumn("Width")
.isGreaterThan(300) // 300 yards
.or(result.doubleColumn("Length").isGreaterThan(10))); // 10 miles
result = result.select("State", "Date");
System.out.println(result);
}

outcome

2024-06-05 21:25:03.671 INFO 41980 --- [ main] com.et.tablesaw.DemoTests : init some data
tornadoes_1950-2014.csv
State | Date |
------------------------
MO | 1950-01-03 |
IL | 1950-01-03 |
OH | 1950-01-03 |
AR | 1950-01-13 |
MO | 1950-01-25 |
IL | 1950-01-25 |
TX | 1950-02-12 |
TX | 1950-02-12 |
LA | 1950-02-12 |
AR | 1950-02-12 |
... | ... |
KS | 1951-06-21 |
OK | 1951-06-21 |
KS | 1951-06-23 |
WV | 1951-06-26 |
KS | 1951-06-27 |
IL | 1951-06-27 |
PA | 1951-06-27 |
SD | 1951-07-20 |
KS | 1951-08-06 |
NC | 1951-08-09 |
2024-06-05 21:25:05.764 INFO 41980 --- [ main] com.et.tablesaw.DemoTests : clean some data

Write to the file

@Test
public void write() throws IOException {
tornadoes.write().csv("rev_tornadoes_1950-2014-test.csv");

Read data from MySQL

@Resource
private JdbcTemplate jdbcTemplate;
@Test
public void datafrommysql() throws IOException {
Table table = jdbcTemplate.query("SELECT user_id,username,age from user_info", new ResultSetExtractor<Table>() {
@Override
public Table extractData(ResultSet resultSet) throws SQLException, DataAccessException {
return Table.read().db(resultSet);
}
});
System.out.println(table);
}

outcome

2024-06-05 21:26:04.963 INFO 42016 --- [ main] com.et.tablesaw.DemoTests : init some data
2024-06-05 21:26:06.622 INFO 42016 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2024-06-05 21:26:06.949 INFO 42016 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
user_id | username | age |
--------------------------------
1 | 1 | 1 |
2024-06-05 21:26:07.037 INFO 42016 --- [ main] com.et.tablesaw.DemoTests : clean some data

Data visualization

/*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.et.tablesaw;
import java.io.IOException;
import tech.tablesaw.api.Table;
import tech.tablesaw.plotly.Plot;
import tech.tablesaw.plotly.api.BubblePlot;
import tech.tablesaw.plotly.components.Figure;
/** */
public class BubbleExample2 {
public static void main(String[] args) throws IOException {
Table wines = Table.read().csv("/Users/liuhaihua/IdeaProjects/springboot-demo/tablesaw/src/main/resources/data/test_wines.csv");
Table champagne =
wines.where(
wines
.stringColumn("wine type")
.isEqualTo("Champagne & Sparkling")
.and(wines.stringColumn("region").isEqualTo("California")));
Figure figure =
BubblePlot.create(
"Average retail price for champagnes by year and rating",
champagne, // table namex
"highest pro score", // x variable column name
"year", // y variable column name
"Mean Retail" // bubble size
);
Plot.show(figure);
}
}

The results are shown in the figure below

The above are just some of the key codes, all of which can be found in the repositories below

Code repositories

4. References

--

--