Apache SeaTunnel 2.3.7 Released: New Support for Large Language Model Data Transformation
We are excited to announce that Apache SeaTunnel version 2.3.7 is officially released! As a popular next-generation open-source data integration tool, Apache SeaTunnel has always been committed to providing users with more flexible and efficient data synchronization and integration capabilities. This version introduces several new features such as support for LLM (Large Language Model) data transformation, enhanced SQL support, and new connector support, along with optimizations and improvements to existing functionalities and bug fixes. This article will detail the key updates in Apache SeaTunnel 2.3.7 and invite more developers and users to join our open-source community.
- Download version 2.3.7: https://seatunnel.apache.org/download/
- Release Notes: https://github.com/apache/seatunnel/releases/tag/2.3.7
New Features Highlights
1. LLM Data Transformation Support: The new version 2.3.7 supports LLM (Large Language Model) data transformation. This feature will significantly enhance Apache SeaTunnel’s application capabilities in handling complex text data and natural language processing tasks, providing greater convenience for users in cutting-edge data processing fields.
In version 2.3.6, we added vector-type support for writing to vector databases, which can accelerate the development of AI applications and simplify AI-driven workloads. That release also added support for Milvus, the first vector database supported by Apache SeaTunnel; future versions will extend support to other vector databases. For more details, see "Version 2.3.6 Released! Apache SeaTunnel Zeta Engine Welcomes New Architecture!"
2. Enhanced SQL Support: This version adds the CAST TO BYTES function for SQL, making data type conversion more flexible. Users will have more options when dealing with different data formats, improving the flexibility and operability of data processing.
3. Alibaba Cloud SLS Connector Support: This update adds a connector for Alibaba Cloud SLS (Alibaba Cloud Log Service). With this feature, users can directly import data into Alibaba Cloud Log Service and leverage its powerful log management and analysis capabilities. This feature is particularly suitable for user scenarios requiring real-time log monitoring and analysis.
4. ActiveMQ Sink Connector Support: Support for ActiveMQ as a sink further expands SeaTunnel’s message queue integration capabilities. ActiveMQ is a high-performance message broker system, and the newly added support allows Apache SeaTunnel users to exchange data with ActiveMQ more conveniently, especially for scenarios involving data stream processing and real-time data analysis.
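To make the LLM data transformation in item 1 more concrete, here is a minimal conceptual sketch in Python of what a row-level LLM transform does: it combines a fixed prompt with each row, asks a model for an answer, and appends that answer as a new field. This is not SeaTunnel's implementation or configuration syntax. The names `llm_transform`, `fake_model`, and `llm_output` are hypothetical, and the model is a local stand-in so the example runs without any API key.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def llm_transform(
    rows: List[Row],
    prompt: str,
    model: Callable[[str], str],
    output_field: str = "llm_output",  # hypothetical name for the added column
) -> List[Row]:
    """Return new rows with the model's answer appended as an extra field."""
    result = []
    for row in rows:
        # Combine the fixed instruction with the row's own data.
        request = f"{prompt}\nInput: {row}"
        answer = model(request)
        # Copy the row so the input data is left unchanged.
        result.append({**row, output_field: answer})
    return result


def fake_model(request: str) -> str:
    # Stand-in for a real LLM call (assumption: a real transform would
    # send the request to a hosted model and read back its reply).
    return "positive" if "great" in request.lower() else "negative"


PROMPT = "Classify the sentiment of the review as positive or negative."

rows = [
    {"id": 1, "review": "Great connector, easy setup"},
    {"id": 2, "review": "Job failed twice"},
]

enriched = llm_transform(rows, PROMPT, fake_model)
for row in enriched:
    print(row)
```

In an actual SeaTunnel job, this step is declared in the job configuration rather than written as code; the transform documentation for 2.3.7 lists the supported options.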
Improvements and Optimizations
In terms of feature optimization, Apache SeaTunnel 2.3.7 brings several improvements aimed at enhancing system performance and stability.
- Flink API Method Naming Optimization: Improved the method naming conventions in the Flink API to make the code more readable and understandable. This optimization not only enhances the development experience but also reduces potential confusion for developers when using Flink.
- Enhanced API Validity Checks: The new version adds validity checks for API input parameters to ensure that user-configured settings and parameters meet expected requirements. This improvement reduces runtime exceptions caused by configuration errors and enhances overall system stability.
- Multi-table Sink Configuration Optimization: For scenarios requiring multi-table output, version 2.3.7 further optimizes sink option configuration, making it more convenient and efficient for users to configure multi-table outputs.
- OceanBase Support Optimization: Fixed compatibility issues related to OceanBase, improving Apache SeaTunnel’s performance and stability when handling OceanBase databases.
Key Bug Fixes
This version update fixes multiple key issues, significantly improving system stability and user experience.
- MySQL-CDC Connector Fix: Fixed an issue where the MySQL-CDC connector could not synchronize data properly under certain circumstances. This fix ensures that users using MySQL data sources can perform data synchronization operations more reliably.
- Doris Connector Fix: Addressed several critical issues with the Doris connector, enhancing compatibility and performance between Apache SeaTunnel and the Doris database, providing better support for users using Doris as a data storage option.
- Zeta Engine Task Stop Issue Fix: This update resolves a bug where the Zeta engine could not stop tasks properly under certain conditions. This improvement prevents resource leakage issues and improves overall system stability.
Documentation and Community Contributions
We understand that excellent documentation is crucial for users to successfully use Apache SeaTunnel. In version 2.3.7, we have updated and revised documentation for multiple modules to ensure users can access the most accurate and understandable user guides.
- Documentation Updates and Revisions: This version update includes corrections to several documents, especially for modules like Oracle-CDC. We have not only fixed errors from previous versions but also added more usage examples and operational guides to help users better understand and use SeaTunnel.
- Thanks to Community Contributors: This version update would not have been possible without the support and contributions from the community. We especially thank all the contributors who submitted code, reported issues, and provided suggestions for SeaTunnel 2.3.7. It is because of your selfless contributions that Apache SeaTunnel can continue to progress and grow.
Thanks to the contributors
Special thanks to @wuchunfu for leading the release work, and many thanks to the following community members for their contributions to this release:
Carl-Zhou-CN
Hisoka-X
Jarvis
OswinWu
TyrantLucifer
XenosK
alextinng
asapekia
chaos-cn
corgy-w
dailai
dependabot[bot]
gdliu3
hailin0
hawk9821
jackyyyyyssss
liugddx
luzongzhu
q3356564
virvle
whhe
wuchunfu
xxsc0529
zhangshenghang
Summary
The release of Apache SeaTunnel 2.3.7 is an important step in our ongoing efforts to improve product performance and user experience. Through new features, optimizations of existing functionalities, and fixing known issues, we aim to provide a better data integration and processing experience for our users. We also look forward to more users and developers joining the SeaTunnel community to jointly promote the development of this open-source project.
We welcome you to download version 2.3.7 of SeaTunnel and experience the latest features and improvements. If you have any questions or suggestions during use, feel free to contact us. Let’s work together to build a more open, powerful, and flexible data integration tool!
- How to Contribute: You can participate in the open-source community of SeaTunnel by submitting code, reporting issues, writing documentation, and more. Our GitHub page provides detailed contribution guidelines to help you get started quickly.
- Join Our Discussions: We highly value the voice of the community and encourage everyone to share their thoughts and suggestions on GitHub Issue pages, mailing lists, and other discussion channels. Every suggestion from you is a valuable asset for us to improve and enhance Apache SeaTunnel.
About Apache SeaTunnel
Apache SeaTunnel is an easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive amounts of data and can stably and efficiently synchronize hundreds of billions of records per day.
If you would like to speak about Apache SeaTunnel, please fill out this form: https://forms.gle/vtpQS6ZuxqXMt6DT6 :)
Why do we need Apache SeaTunnel?
Apache SeaTunnel does everything it can to solve the problems you may encounter in synchronizing massive amounts of data.
- Data loss and duplication
- Task buildup and latency
- Low throughput
- Long application-to-production cycle time
- Lack of application status monitoring
Apache SeaTunnel Usage Scenarios
- Massive data synchronization
- Massive data integration
- ETL of large volumes of data
- Massive data aggregation
- Multi-source data processing
Features of Apache SeaTunnel
- Rich components
- High scalability
- Easy to use
- Mature and stable
How to get started with Apache SeaTunnel quickly?
Want to experience Apache SeaTunnel quickly? SeaTunnel 2.1.0 takes 10 seconds to get you up and running.
https://seatunnel.apache.org/docs/2.1.0/developement/setup
How can I contribute?
We invite everyone interested in taking local open-source projects global to join the Apache SeaTunnel contributor family and foster open source together!
Submit an issue:
https://github.com/apache/seatunnel/issues
Contribute code to:
https://github.com/apache/seatunnel/pulls
Subscribe to the community development mailing list:
dev-subscribe@seatunnel.apache.org
Development mailing list:
dev@seatunnel.apache.org
Join Slack:
https://join.slack.com/t/apacheseatunnel/shared_invite/zt-1kcxzyrxz-lKcF3BAyzHEmpcc4OSaCjQ
Follow Twitter:
https://twitter.com/ASFSeaTunnel
Join us now!❤️❤️