Apache SeaTunnel releases 2.3.2, its first version since graduating as an Apache Top-Level Project, further improving the stability and usability of the Zeta engine

Apache SeaTunnel
Jun 20, 2023

Apache SeaTunnel has officially released version 2.3.2, more than two months after the previous version, 2.3.1. During this period, we collected feedback from users and developers and fixed bugs in the SeaTunnel Zeta Engine, improving the stability and efficiency of the engine.

In addition, the new version optimizes the functionality and performance of the connectors in Connector-V2, and SQL Transform now supports custom UDFs. Zeta Engine also provides cluster monitoring and job queries through a REST API in this version.

This article introduces the specific updates in Apache SeaTunnel 2.3.2.

New features

In this update, Zeta Engine supports obtaining task and system monitoring information through a REST API. Users can send HTTP requests to any node in the cluster to retrieve this information, which enhances task monitoring capabilities.

For details, please refer to: https://seatunnel.apache.org/docs/2.3.2/seatunnel-engine/rest-api
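As a rough illustration of what querying a node might look like, here is a minimal Python sketch using only the standard library. The port (5801), the base path `/hazelcast/rest/maps`, and the resource names `running-jobs`, `running-job` and `system-monitoring-information` are assumptions based on our reading of the REST API page linked above; the helper names `build_url`, `fetch_json` and `print_cluster_overview` are made up for this example. Please check the documentation for the exact endpoints in your deployment.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: base path of the Zeta REST endpoints, taken from the 2.3.2 docs.
# Verify it (and the resource names used below) against your own cluster.
BASE_PATH = "/hazelcast/rest/maps"


def build_url(host, port, resource, job_id=None):
    """Build the monitoring URL for one node; job_id is appended if given."""
    url = f"http://{host}:{port}{BASE_PATH}/{resource}"
    if job_id is not None:
        url += "/" + quote(str(job_id), safe="")
    return url


def fetch_json(url, timeout=5.0):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def print_cluster_overview(host="localhost", port=5801):
    """Print system monitoring info and running jobs (hypothetical helper)."""
    # Assumed resource names; any node of the cluster can be queried.
    print(fetch_json(build_url(host, port, "system-monitoring-information")))
    print(fetch_json(build_url(host, port, "running-jobs")))
```

Calling `print_cluster_overview()` against a running node would print the two JSON responses; `build_url(host, port, "running-job", job_id)` builds the URL for a single job under the same assumptions.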

In addition, SQL Transform in version 2.3.2 now supports custom UDF functions. For detailed usage, please refer to: https://seatunnel.apache.org/docs/2.3.2/transform-v2/sql-udf

Core

  • [Core] [API] Support convert strings as List option (#4362)
  • [Core] [API] Add copy method to Catalog codes (#4414)
  • [Core] [API] Add options check before create source and sink and transform in FactoryUtil (#4424)
  • [Core] [Shade] Add guava shade module (#4358)

Connector-V2

  • [Connector-V2] [CDC] [SQLServer] Support multi-table read (#4377)
  • [Connector-V2] [Kafka] Kafka source supports data deserialization failure skipping (#4364)
  • [Connector-V2] [Jdbc] [TiDB] Add TiDB catalog (#4438)
  • [Connector-V2] [File] Add file excel sink and source (#4164)
  • [Connector-V2] [Snowflake] Add Snowflake Source&Sink connector (#4470)
  • [Connector-V2] [Pulsar] Support read format for Pulsar (#4111)
  • [Connector-V2] [Paimon] Introduce paimon connector (#4178)
  • [Connector-V2] [Cassandra] Expose configurable options in Cassandra (#3681)
  • [Connector-V2] [Jdbc] Supports GEOMETRY data type for PostgreSQL (#4673)
  • [Transform-V2] Add UDF SPI and an example implement for SQL Transform plugin (#4392)
  • [Transform-V2] Support copy field list (#4404)
  • [Transform-V2] Add support CatalogTable for FieldMapperTransform (#4423)
  • [Transform-V2] Add CatalogTable support for ReplaceTransform (#4411)
  • [Transform-V2] Add Catalog support for FilterRowKindTransform (#4420)
  • [Transform-V2] Add support CatalogTable for FilterFieldTransform (#4422)
  • [Transform-V2] Add catalog support for SQL Transform plugin (#4819)

Zeta Engine

  • [Zeta] Support for mixing Factory and Plugin SPI (#4359)
  • [Zeta] Add get running job info by jobId rest api (#4140)
  • [Zeta] Add REST API To Get System Monitoring Information (#4315)
  • [Transform V2 & Zeta] Make SplitTransform Support CatalogTable And CatalogTable Evolution (#4396)

Improve

Core

  • [Core] [Spark] Push transform operation from Spark Driver to Executors (#4503)
  • [Core] [Starter] Optimize code structure & remove redundant code (#4525)
  • [Core] [Translation] [Flink] Optimize code structure & remove redundant code (#4527)

Connector-V2

  • [Connector-V2] [CDC] Improve startup.mode/stop.mode options (#4360)
  • [Connector-V2] [CDC] Optimize jdbc fetch-size options (#4352)
  • [Connector-V2] [CDC] Fix chunk start/end parameter type error (#4777)
  • [Connector-V2] [SQLServer] Fix sqlserver catalog (#4441)
  • [Connector-V2] [StarRocks] Improve StarRocks Serialize Error Message (#4458)
  • [Connector-V2] [Jdbc] Add the log for SQL and update some style (#4475)
  • [Connector-V2] [Jdbc] Fix the table name is not automatically obtained when multiple tables (#4514)
  • [Connector-V2] [S3 & Kafka] Delete unavailable S3 & Kafka Catalogs (#4477)
  • [Connector-V2] [Pulsar] Support Canal Format

Zeta Engine

  • [Zeta] Support runs the server through daemon mode (#4161)
  • [Zeta] Change ClassLoader To Improve the SDK compatibility of the client (#4447)
  • [Zeta] Client Support Async Submit Job (#4456)
  • [Zeta] Add more detailed log output. (#4446)
  • [Zeta] Improve seatunnel-cluster.sh (#4435)
  • [Zeta] Reduce CPU Cost When Task Not Ready (#4479)
  • [Zeta] Add parser log (#4485)
  • [Zeta] Remove redundant code (#4489)
  • [Zeta] Remove redundancy code in validateSQL (#4506)
  • [Zeta] Improve JobMetrics fetch performance (#4467)

Bug fix

Core

  • [Core] [API] Fixed generic class loss for lists (#4421)
  • [Core] [API] Fix parse nested row data type key changed upper (#4459)

Connector-V2

  • [Json-format] [Canal-Json] Fix json deserialize NPE (#4195)
  • [Connector-V2] [Jdbc] Field aliases are not supported in the query of jdbc source. (#4210)
  • [Connector-V2] [Jdbc] Fix connection failure caused by connection timeout. (#4322)
  • [Connector-V2] [Jdbc] Set default value to false of JdbcOption: generate_sink_sql (#4471)
  • [Connector-V2] [JDBC] Fix TiDBCatalog without open (#4718)
  • [Connector-V2] [Jdbc] Fix XA DataSource crash(Oracle/Dameng/SqlServer) (#4866)
  • [Connector-V2] [Pulsar] Fix the bug that can’t always consume messages. (#4125)
  • [Connector-V2] [Elasticsearch] Document description error (#4390)
  • [Connector-V2] [Elasticsearch] Source deserializer error and inappropriate (#4233)
  • [Connector-V2] [Kafka] Fix KafkaProducer resources have never been released. (#4302)
  • [Connector-V2] [Kafka] Fix the permission problem caused by client.id. (#4246)
  • [Connector-V2] [Kafka] Fix KafkaConsumerThread exit caused by commit offset error. (#4379)

Zeta Engine

  • [Zeta] Fix LogicalDagGeneratorTest test case (#4401)
  • [Zeta] Fix MultipleTableJobConfigParser parse only one transform (#4412)
  • [Zeta] Fix missing common plugin jars (#4448)
  • [Zeta] Fix handleCheckpointError be called while checkpoint already complete (#4442)
  • [Zeta] Fix job error message is not right bug (#4463)
  • [Zeta] Fix finding TaskGroup deployment node bug (#4449)
  • [Zeta] Fix the bug of conf (#4488)
  • [Zeta] Fix Connector load logic from zeta (#4510)
  • [Zeta] Fix conflict dependency of hadoop-hdfs (#4509)

E2E

  • [E2E] [Kafka] Fix kafka e2e testcase (#4520)
  • [Container Version] Fix risk of unreproducible test cases (#4591)

Docs

  • [Docs] Optimizes part of the Doris and SelectDB connector documentation (#4365)
  • [Docs] Fix docs code style (#4368)
  • [Docs] Update jdbc doc and Kafka doc (#4380)
  • [Docs] Fix max_retries default value is 0. (#4383)
  • [Docs] Fix markdown syntax (#4426)
  • [Docs] Fix Kafka Doc Error Config Key “kafka.” (#4427)
  • [Docs] Add Transform to Quick Start v2 (#4436)

Thanks to the contributors

Thanks to WhaleOps engineer Fan Jia for supporting the release, and to the community members below for their contributions:

Andrew Wetmore, Bibo, Carl-Zhou-CN, Cason-ACE, Chengyu Yan, CodingGPT, dalong, Eric, FlechazoW, Guangdong Liu, Hao Xu, J.A.R.V.I.S, Kim, Laglangyue, Marvin, TaoZex, Tyrantlucifer, Xiaojian Sun, ZhilinLi, Zongwen Li, dylandai, gnehil, hailin0, ic4y, kezhenxu94, lightzhao, lucklilili, lvshaokang, mengxiaopeng, monster, songjianet, stdnt-xiao, thomasc, will27, wyc, xiaofan2012, zhilinli

About Apache SeaTunnel

Apache SeaTunnel (formerly Waterdrop) is an easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive amounts of data and can synchronize hundreds of billions of records per day in a stable and efficient manner.

Why do we need Apache SeaTunnel?

Apache SeaTunnel does everything possible to solve the problems you may encounter when synchronizing massive amounts of data:

  • Data loss and duplication
  • Task buildup and latency
  • Low throughput
  • Long application-to-production cycle time
  • Lack of application status monitoring

Apache SeaTunnel Usage Scenarios

  • Massive data synchronization
  • Massive data integration
  • ETL of large volumes of data
  • Massive data aggregation
  • Multi-source data processing

Features of Apache SeaTunnel

  • Rich components
  • High scalability
  • Easy to use
  • Mature and stable

How to get started with Apache SeaTunnel quickly?

Want to experience Apache SeaTunnel quickly? SeaTunnel 2.1.0 takes 10 seconds to get you up and running.

https://seatunnel.apache.org/docs/2.1.0/developement/setup

How can I contribute?

We invite all partners interested in making local open-source global to join the Apache SeaTunnel contributors family and foster open-source together!

Submit an issue:

https://github.com/apache/seatunnel/issues

Contribute code to:

https://github.com/apache/seatunnel/pulls

Subscribe to the community development mailing list:

dev-subscribe@seatunnel.apache.org

Development mailing list:

dev@seatunnel.apache.org

Join Slack:

https://join.slack.com/t/apacheseatunnel/shared_invite/zt-1kcxzyrxz-lKcF3BAyzHEmpcc4OSaCjQ

Follow Twitter:

https://twitter.com/ASFSeaTunnel

Join us now!❤️❤️
