SeaTunnel 2024 Roadmap Released: Defining the Future of Data Integration

Apache SeaTunnel
Mar 25, 2024

Last year, thanks to the efforts of numerous contributors, the Apache SeaTunnel community met many of its goals. In 2024, the community plans to go further and polish SeaTunnel into a more attractive big data processing tool for everyone.

We are pleased to announce the updated Apache SeaTunnel Roadmap for 2024, which focuses on enhancing SeaTunnel’s core features, expanding the connector ecosystem, optimizing data processing capabilities, and improving the user experience. Let’s build together!

Support for Running on K8s and Yarn

Currently, Zeta engine job submission supports only local mode and standalone mode. The community plans to extend the job runtime environments to K8s and Yarn, with optimizations aimed in particular at CDC real-time synchronization scenarios, to improve resource utilization and data processing efficiency. This is a solid step forward for SeaTunnel in meeting large-scale data processing requirements.

Issue Link: https://github.com/apache/seatunnel/issues/4386

Support for More Connectors

The community plans to add support for more source and sink connectors, further broadening the scenarios in which Apache SeaTunnel can be applied. Each new connector is a practical step toward opening up more possibilities for developers.

Catalog Support for More Connectors

The design and adaptation of TypeConverter and DataTypeConverter aim to let each connector describe more accurately the conversion, in both directions, between the database’s own data types and SeaTunnel data types. The API-level development should now be complete, and the adaptation still has to be implemented for every connector. TypeConverter helps SeaTunnel infer data models more reliably and generate table creation statements automatically.
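The two-way conversion described above can be pictured with a minimal sketch. It is not SeaTunnel's actual TypeConverter API: the `Column` class, the mapping tables, and every function name here are hypothetical, and the type mapping is deliberately tiny. It only shows how a converter that works in both directions makes it possible to derive a table creation statement from a column model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Column:
    """Hypothetical, simplified column description (not SeaTunnel's real class)."""
    name: str
    seatunnel_type: str
    length: Optional[int] = None


# Illustrative mapping only; a real converter covers far more types.
DB_TO_SEATUNNEL = {"VARCHAR": "STRING", "INT": "INT", "BIGINT": "BIGINT"}
SEATUNNEL_TO_DB = {st: db for db, st in DB_TO_SEATUNNEL.items()}


def to_seatunnel(name: str, db_type: str, length: Optional[int] = None) -> Column:
    """Convert a database column type into the engine-side type."""
    return Column(name, DB_TO_SEATUNNEL[db_type.upper()], length)


def to_db_type(column: Column) -> str:
    """Convert the engine-side type back into a database type string."""
    db_type = SEATUNNEL_TO_DB[column.seatunnel_type]
    return f"{db_type}({column.length})" if column.length is not None else db_type


def create_table_sql(table: str, columns: List[Column]) -> str:
    """Build a CREATE TABLE statement from the converted columns."""
    body = ", ".join(f"{c.name} {to_db_type(c)}" for c in columns)
    return f"CREATE TABLE {table} ({body})"


columns = [to_seatunnel("id", "bigint"), to_seatunnel("name", "varchar", 64)]
print(create_table_sql("users", columns))
```

Keeping the forward and reverse mappings in one place is what allows the same column model to be used both for reading a schema and for generating one.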

DataTypeConverter, together with TypeConverter, will help SeaTunnel perform implicit data type conversions between different databases. For example, in the JDBC Oracle Sink scenario, when SeaTunnel String values are written, DataTypeConverter needs to work with TypeConverter to determine information such as the field type and field length at the target end, and from that decide when to use setString and when to use blob.
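As a rough illustration of that decision, the sketch below picks a write path for a string value from the target column's type and length. It is an assumption-laden toy, not the connector's real logic: the function name, the `LOB_TYPES` set, and the choice to reject over-long values with an error are all hypothetical.

```python
from typing import Optional

# Assumed set of large-object column types at the target end.
LOB_TYPES = {"BLOB", "CLOB", "NCLOB"}


def choose_write_path(target_type: str, target_length: Optional[int], value: str) -> str:
    """Return "blob" or "setString" for a string value (hypothetical helper)."""
    if target_type.upper() in LOB_TYPES:
        # Large-object columns are written through the blob path.
        return "blob"
    encoded_length = len(value.encode("utf-8"))
    if target_length is not None and encoded_length > target_length:
        # The value does not fit; this sketch surfaces it instead of truncating.
        raise ValueError(
            f"value of {encoded_length} bytes exceeds column length {target_length}"
        )
    return "setString"


print(choose_write_path("VARCHAR2", 100, "hello"))
print(choose_write_path("BLOB", None, "hello"))
```

The point is only that the choice needs both pieces of target-side information, the field type and the field length, which is why the two converters have to cooperate.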

In summary, by introducing Catalog adaptation, TypeConverter, and DataTypeConverter, we provide strong support for the automatic retrieval of data structures and the precise conversion of data types.

Issue Links:

Event Notification Mechanism

To improve the efficiency and transparency of task management, we plan to introduce an event notification mechanism to promptly inform users of various task statuses and important events.
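The roadmap does not describe how the mechanism will be designed. Purely as an illustration of the idea, the sketch below uses a simple publish and subscribe pattern; `JobEvent`, `EventNotifier`, and the event kind strings are hypothetical names, not SeaTunnel APIs.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class JobEvent:
    """Hypothetical event record for a task status change."""
    job_id: str
    kind: str          # e.g. "JOB_STARTED" or "JOB_FAILED" (illustrative values)
    detail: str = ""


class EventNotifier:
    """Minimal in-process notifier: listeners register per event kind."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[[JobEvent], None]]] = defaultdict(list)

    def subscribe(self, kind: str, listener: Callable[[JobEvent], None]) -> None:
        self._listeners[kind].append(listener)

    def publish(self, event: JobEvent) -> int:
        """Deliver the event and return how many listeners were notified."""
        listeners = self._listeners.get(event.kind, [])
        for listener in listeners:
            listener(event)
        return len(listeners)


notifier = EventNotifier()
notifier.subscribe("JOB_FAILED", lambda e: print(f"{e.job_id} failed: {e.detail}"))
notifier.publish(JobEvent("job-1", "JOB_FAILED", "target unreachable"))
```

A real implementation would likely deliver events to external channels rather than in-process callbacks; the sketch only shows the shape of the contract between the engine and whoever listens.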

Table-Level Monitoring

With multi-table synchronization supported in the latest version, 2.3.4, table-level monitoring has become necessary. Monitoring information will let users see the synchronization status of each table, giving finer monitoring granularity.
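To make the idea of finer granularity concrete, here is a minimal sketch that keeps counters per table instead of per job. The `TableMetrics` class and its method names are hypothetical and say nothing about which metrics SeaTunnel will actually expose.

```python
from collections import Counter
from typing import Dict


class TableMetrics:
    """Hypothetical per-table counters for a multi-table job."""

    def __init__(self) -> None:
        self._rows: Counter = Counter()
        self._bytes: Counter = Counter()

    def record(self, table: str, rows: int, size_bytes: int) -> None:
        """Add one written batch to the counters of the given table."""
        self._rows[table] += rows
        self._bytes[table] += size_bytes

    def snapshot(self) -> Dict[str, Dict[str, int]]:
        """Return the current totals keyed by table name."""
        return {
            table: {"rows": self._rows[table], "bytes": self._bytes[table]}
            for table in self._rows
        }


metrics = TableMetrics()
metrics.record("orders", rows=100, size_bytes=2048)
metrics.record("users", rows=10, size_bytes=512)
metrics.record("orders", rows=50, size_bytes=1024)
print(metrics.snapshot())
```

With a job-level total, a stalled table is hidden behind the tables that are still moving; keying the counters by table name makes it visible.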

Dirty Data Collection

During data synchronization, data that cannot be written to the target end will no longer cause the job to fail directly. With the dirty data collection function, such data will be stored separately so that it does not affect the normal operation of the job, keeping the data processing workflow running.
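The behavior described above amounts to catching a failed write, setting the row aside, and carrying on. A minimal sketch under that assumption follows; the function name, the list used as a dirty data store, and the example writer are all hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, List


def write_with_dirty_collection(
    rows: Iterable[Dict[str, Any]],
    write: Callable[[Dict[str, Any]], None],
    dirty_rows: List[Dict[str, Any]],
) -> int:
    """Write each row; collect rows that fail instead of aborting the job."""
    written = 0
    for row in rows:
        try:
            write(row)
            written += 1
        except Exception as exc:
            # Keep the failing row and the reason so it can be inspected later.
            dirty_rows.append({"row": row, "error": str(exc)})
    return written


def example_write(row: Dict[str, Any]) -> None:
    """Stand-in for a sink write that rejects rows with a missing name."""
    if row.get("name") is None:
        raise ValueError("name must not be null")


dirty: List[Dict[str, Any]] = []
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}, {"id": 3, "name": "c"}]
count = write_with_dirty_collection(rows, example_write, dirty)
print(count, dirty)
```

Recording the error alongside the row is a design choice of this sketch; it keeps the collected data useful for a later repair or replay step.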

Community Collaboration, Co-creating the Future

The SeaTunnel community is growing vigorously. Each update reflects the ideas and hard work of community members, and the community’s progress depends on the contribution and support of every member. The journey is far from over, and more challenges and opportunities await us.

We warmly invite developers and technology enthusiasts from around the world to join the SeaTunnel community and participate together in this innovative journey.

We invite more developers to join us, not only to help deliver these exciting new features but also to explore the possibilities of data processing in the spirit of open source.

“One person can run fast, but it takes a group to go far.” The SeaTunnel community looks forward to your participation.

Apache SeaTunnel

The next-generation high-performance, distributed, massive data integration tool.