Version 2.1.0 Released! Apache SeaTunnel (Incubating) First Apache Release: Refactored Kernel and Full Flink Support
On December 9, 2021, Apache SeaTunnel (formerly known as Waterdrop) successfully joined the Apache Incubator. Once in the incubator, we spent a lot of time sorting through external dependencies to ensure compliance throughout the project.
Finally, after four months of work by contributors, the community officially released the first Apache software version on March 18, 2022.
This release passed the Apache Incubator's rigorous two-round ballot review on the first attempt, which confirms that it complies with the Apache License.
This means version 2.1.0 is an official release, voted on by both the Apache SeaTunnel community and the Apache Incubator, that is safe for corporate and individual users to use.
Download version 2.1.0:
https://seatunnel.apache.org/download
GitHub Release:
https://github.com/apache/incubator-seatunnel/releases/tag/2.1.0
Note:
A license is a legal contract or guideline that regulates the use or distribution of copyrighted software. A software license is a contract between a software developer and its users that guarantees the user will be protected under the terms of the license. Before choosing open-source software, users and developers are strongly advised to check whether its license suits their products. The Apache License is widely regarded as a very business-friendly license.
Specific release notes
[New Features]
1. The kernel of the microkernel plug-in architecture has been heavily optimized. Java is now the main kernel language, and command-line parameter parsing, plug-in loading, and other areas have been improved considerably. Meanwhile, plug-in extensions can be developed in whichever language the user (or contributor) knows best, which greatly lowers the barrier to plug-in development.
2. Flink is now fully supported, while users remain free to choose the underlying engine. This update also brings a large number of Flink plug-ins, and contributions of related plug-ins are welcome.
3. A quick-start environment for local development is provided (the example module). Contributors and users can start it quickly and smoothly without changing any code, which makes local development and debugging easier. This is certainly exciting news for contributors or users who need to customize their plugins. In fact, in our pre-release testing, a large number of contributors used this approach to test plugins quickly.
4. Docker container installation is provided, so users can deploy and install Apache SeaTunnel via Docker very quickly. We will also iterate heavily around Docker and K8s in the future, and discussion is welcome.
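To make the microkernel plug-in idea from item 1 more concrete, here is a minimal sketch in Python. It is not SeaTunnel's actual kernel (which is written in Java) and does not use any SeaTunnel API; the names `PluginRegistry`, `fake_source`, `filter_transform`, and `run_pipeline` are hypothetical and exist only for illustration. The point it shows is that the kernel only knows how to look up plug-ins by name and chain them, while all data-specific logic lives in the plug-ins.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]
# A stage takes the upstream records and returns the downstream records.
Stage = Callable[[Iterable[Record]], Iterable[Record]]


class PluginRegistry:
    """Hypothetical kernel-side registry: maps a plug-in name to a factory."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[dict], Stage]] = {}

    def register(self, name: str, factory: Callable[[dict], Stage]) -> None:
        if name in self._factories:
            raise ValueError(f"plug-in already registered: {name}")
        self._factories[name] = factory

    def load(self, name: str, config: dict) -> Stage:
        try:
            factory = self._factories[name]
        except KeyError:
            raise LookupError(f"unknown plug-in: {name}") from None
        return factory(config)


def fake_source(config: dict) -> Stage:
    """Illustrative source plug-in: ignores upstream and emits configured rows."""
    rows = config["rows"]
    return lambda _upstream: iter(rows)


def filter_transform(config: dict) -> Stage:
    """Illustrative transform plug-in: keeps rows where field >= min."""
    field, minimum = config["field"], config["min"]
    return lambda upstream: (r for r in upstream if r[field] >= minimum)


def run_pipeline(registry: PluginRegistry,
                 stages: List[Tuple[str, dict]]) -> List[Record]:
    """The 'kernel': loads each plug-in by name and chains the stages."""
    data: Iterable[Record] = iter(())
    for name, config in stages:
        data = registry.load(name, config)(data)
    return list(data)


registry = PluginRegistry()
registry.register("fake_source", fake_source)
registry.register("filter", filter_transform)

result = run_pipeline(registry, [
    ("fake_source", {"rows": [{"name": "a", "age": 17},
                              {"name": "b", "age": 30}]}),
    ("filter", {"field": "age", "min": 18}),
])
print(result)
```

In this sketch, adding a new data source or transform means registering one more factory; `run_pipeline` itself never changes, which is the property a microkernel design aims for.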
[Specific Feature Description]
JCommander is used for command-line parameter parsing, letting developers focus on the logic itself.
Flink is upgraded from 1.9 to 1.13.5, keeping compatibility with older versions and preparing for subsequent CDC support.
Support for Doris, Hudi, Phoenix, Druid, and other Connector plugins. The complete list of supported plugins is available here: [plugins-supported-by-seatunnel](https://github.com/apache/incubator-seatunnel#plugins-supported-by-seatunnel).
Quick-start environment support for local development, provided by the example module without modifying any code, which is convenient for local debugging.
Support for installing and trying out Apache SeaTunnel via Docker containers.
SQL component supports SET statements and configuration variables.
Config module refactored to make it easier for contributors to understand while ensuring the project's code (license) compliance.
Project structure realigned to fit the new roadmap.
CI&CD support and automated code quality control (more work is planned to support CI&CD development).
Greetings from users
Since the early days of Waterdrop, we have witnessed the growth of Apache SeaTunnel along the way. Huya has used it as the core component for connecting data pipelines, and its plug-in capability has greatly simplified the tedious work of data interfacing. Recently Apache SeaTunnel has been deeply optimized in many aspects and has made great progress, especially in scalability. For example, the engine layer supports both Spark and Flink and can be extended to other engines. Plugins support more than 20 kinds of common data stores, and additional plug-ins can be developed in multiple languages. Through the continuous efforts of the community, Apache SeaTunnel has made unprecedented progress in documentation, configuration, and the development and testing environment. At the same time, Apache SeaTunnel has made bold adjustments to the project structure to pave the way for future support of CDC, CI&CD, code quality automation, and other features. The future of Apache SeaTunnel is promising. We hope you will continue to pay attention to China’s own open-source projects. Keep it up!
- Qiang Huang, Data Architect at Huya.com
We are glad to see the release of the first Apache version of Apache SeaTunnel. The new version has a much clearer code structure and more supported plugins. I will continue to participate in contributing to the Apache SeaTunnel community in the future.
Together with the community, we will make Apache SeaTunnel easier and more efficient to use.
- Weitai Fan, Senior Engineer at OPPO
Apache SeaTunnel’s unique architectural design and its advanced ideas of modularity and plug-ins are worth learning from. When Apache SeaTunnel was still named Waterdrop, we kept an eye on the project and validated it in multiple ETL scenarios. We added a graphical interface that lets users perform ETL operations with simple configuration, and we have applied it in production environments on a large scale. We hope Apache SeaTunnel keeps getting better and better!
- Lei Nie, head of Big Data Platform at Li Auto Inc
Congratulations to Apache SeaTunnel on the release of its first Apache version since joining Apache. Version 2.1.0 offers a clearer code structure and a richer family of plugins, and it is excellent and easy to use. It is well suited to secondary development and enterprise adoption. In addition, the upgraded and optimized architecture and improved performance will help enterprises transfer data more efficiently and enhance the value of their data.
- Zhang Zongyao, Senior Engineer at Bilibili
The emergence of Apache SeaTunnel (Incubating) has filled the gap in high-concurrency data pushing and cleansing in the open-source big data ecosystem.
Its plug-in-oriented architecture has attracted a large number of contributors who continuously add improvements, making multi-source data exchange easier and more convenient. These highlights are best reflected in the latest version, 2.1.0. It saves its users a lot of money. As a fan of Apache SeaTunnel (Incubating), I sincerely wish SeaTunnel continued success, and I will share my own and my company’s experience with the community in the future, adding more bricks to make Apache SeaTunnel more efficient and easier to use.
- Yuan Hongjun, OLAP Platform Architect at kidswant
Congratulations on the release of the first Apache version of Apache SeaTunnel. When I first came across Apache SeaTunnel, I was attracted by its simplicity and ease of use. The new version not only greatly improves the architecture but also supports a richer set of data sources. At the same time, the community is becoming more and more mature. We hope more open-source lovers will join us and make Apache SeaTunnel shine.
- Di Wu, Senior Big Data Engineer at Shuhai SupplyChain Solutions
It’s great to see the first release of Apache SeaTunnel since it joined Apache. The new version is a huge step forward in terms of system architecture, configuration optimization, and performance improvements. If you are still struggling with distributed data access and cleansing, please join the Apache SeaTunnel community. Big surprises are waiting for you there!
- Hu Chen, Big Data Engineer at CETC
Acknowledgements
Thanks to the following contributors who participated in this version release (GitHub IDs, in no particular order).
Al-assad, BenJFan, CalvinKirs, JNSimba, JiangTChen, Rianico, TyrantLucifer, Yves-yuan, ZhangchengHu0923, agendazhang, an-shi-chi-fan, asdf2014, bigdataf, chaozwn, choucmei, dailidong, dongzl, felix-thinkingdata, fengyuceNv, garyelephant, kalencaya, kezhenxu94, legendtkl, leo65535, liujinhui1994, mans2singh, marklightning, mosence, nielifeng, ououtt, ruanwenjun, simon824, totalo, wntp, wolfboys, wuchunfu, xbkaishui, xtr1993, yx91490, zhangbutao, zhaomin1423, zhongjiajie, zhuangchong, zixi0825.
Also sincere gratitude to our Mentors:
Zhenxu Ke, Willem Jiang, William Guo, LiDong Dai, Ted Liu, Kevin, JB for their help!
Planning for the next few releases:
- CDC (Change Data Capture) is a technology for capturing database change data. In the future, we will add CDC support for Spark and Flink.
- A monitoring system covering common indicators such as data read per second, the total amount of input data read by a task, and data transfer records.
- UI system support, including editing through a user interface.
- SDK support, enabling service-oriented and more user-friendly integration.
- More Connector support and more efficient Sink support, such as ClickHouse, will be available soon in the next release.
Follow-up features are decided by community consensus, and we sincerely invite more people to take part in building the community. If you care about a feature, please submit an issue or reply to an existing one, because issues that attract more attention will be implemented first.
Community Status
Recent Development
Since entering the Apache incubator, the contributor group has grown from 13 to 55 and continues to grow, with the average weekly community commits remaining at 20+.
Three contributors from different companies (Lei Xie, HuaJie Wang, Chunfu Wu) have been invited to become Committers on account of their contributions to the community.
We held two Meetups, where instructors from Bilibili, OPPO, Vipshop, and other companies shared their large-scale production practices based on Apache SeaTunnel in their companies (we will hold one meetup monthly in the future, and welcome Apache SeaTunnel users or contributors to come and share their stories about Apache SeaTunnel).
Users of Apache SeaTunnel
Registered users of Apache SeaTunnel are shown below. If you are also using Apache SeaTunnel, you are welcome to register on Who is using Apache SeaTunnel!
Note: Only registered users are included.
PPMC’s Word
LiFeng Nie, PPMC of Apache SeaTunnel (Incubating), commented on the first Apache version release.
From our first day in the Apache Incubator, we have been working hard to learn the Apache Way and the various Apache policies. Although the first release took a lot of time (mainly for compliance), we think it was well worth it, and that’s one of the reasons we chose to enter Apache. We need to give our users peace of mind, and Apache is certainly the best choice: its strict, almost exacting license control helps users avoid compliance issues as much as possible and ensures that the software circulates reasonably and legally. In addition, the practice of the Apache Way, with its public service mission, pragmatism, community over code, openness and consensus decision-making, and meritocracy, can drive the Apache SeaTunnel community to become more open, transparent, and diverse.
Committer and Contributor Words
Apache SeaTunnel (Incubating) links data and unlocks value. I have been following and participating in the project from the time it entered the Apache Incubator until the first Apache version was released. I’m very excited about the first Apache release of Apache SeaTunnel. The new version is a big improvement in code architecture and specification, and the Apache SeaTunnel community is very active. I will continue to contribute, and I warmly welcome more people to join in and contribute to the development of Apache SeaTunnel.
- Huajie Wang, Apache SeaTunnel Committer
We are happy to see that Apache SeaTunnel has released its first Apache version. Although it is the first version, Apache SeaTunnel is already very capable in terms of ease of use and data source support. It helps users complete data synchronization tasks easily, quickly, and efficiently. At the same time, the community is flourishing, and we hope that you will join us and contribute to the growth of Apache SeaTunnel (Incubating).
- Jia Fan, Apache SeaTunnel Contributor
Thanks to the joint efforts of the community, the first Apache version of Apache SeaTunnel has been released. Compared with the previous non-Apache version, the code of the first Apache version has been heavily refactored. The Apache SeaTunnel community is very active, and we hope more people will join us and contribute to it.
- Chunfu Wu, Apache SeaTunnel Committer
About SeaTunnel
Apache SeaTunnel (formerly Waterdrop) is an easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive amounts of data and can synchronize hundreds of billions of data per day in a stable and efficient manner.
Why do we need Apache SeaTunnel?
Apache SeaTunnel does everything it can to solve the problems you may encounter in synchronizing massive amounts of data:
- Data loss and duplication
- Task buildup and latency
- Low throughput
- Long application-to-production cycle time
- Lack of application status monitoring
Apache SeaTunnel Usage Scenarios
- Massive data synchronization
- Massive data integration
- ETL of large volumes of data
- Massive data aggregation
- Multi-source data processing
Features of Apache SeaTunnel
- Rich components
- High scalability
- Easy to use
- Mature and stable
How to get started with Apache SeaTunnel quickly?
Want to try Apache SeaTunnel quickly? Version 2.1.0 gets you up and running in ten seconds.
https://seatunnel.apache.org/docs/2.1.0/developement/setup
How can I contribute?
We invite all partners who are interested in making local open-source global to join the SeaTunnel contributor family and build open-source together!
Submit an issue to:
https://github.com/apache/incubator-seatunnel/issues
Contribute code to:
https://github.com/apache/incubator-seatunnel/pulls
Subscribe to the community development mailing list:
dev-subscribe@seatunnel.apache.org
Development mailing list:
dev@seatunnel.apache.org
Join Slack:
https://join.slack.com/t/apacheseatunnel/shared_invite/zt-10u1eujlc-g4E~ppbinD0oKpGeoo_dAw
Follow Twitter:
https://twitter.com/ASFSeaTunnel
Come and join us!