[Issue 2] Apache SeaTunnel (Incubating) Weekly FAQ
April 8-April 14
Q: How do I synchronize MySQL to ClickHouse?
A: The existing historical data in the MySQL table is synchronized first, and then the changed data is synchronized.
Q: Does SeaTunnel support HBase?
A: Not yet. HBase support is under implementation.
Q: Does execution.parallelism in env work in SeaTunnel Engine? Or does it only work on the Flink engine?
A: The parameters have now been unified into parallelism.
Q: I plan to sync from MySQL to StarRocks, and I want it to support initial schema migration, full (stock) synchronization, incremental synchronization, and DDL synchronization. The documentation states that the save_mode_create_template attribute of the StarRocks sink supports schema migration, JDBC supports stock migration, and CDC supports incremental synchronization. I have two questions. First, is DDL synchronization supported? Second, how can incremental synchronization start seamlessly after the stock synchronization?
A: DDL synchronization is not supported yet; it is under design and development. For the second question, use the MySQL CDC source directly with the StarRocks sink: the job automatically synchronizes the full data first and then continues incrementally. There is no need to run a JDBC stock migration first.
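As a rough illustration of the answer above, a single job that pairs the MySQL CDC source with the StarRocks sink might look like the sketch below. This is not a verified configuration: the host names, database, table, and credentials are placeholders, and the option names (base-url, table-names, nodeUrls, enable_upsert_delete) are assumptions based on the 2.3.x connector documentation that should be checked against the docs for your version.

```hocon
env {
  # CDC jobs run continuously, so the job is declared as streaming.
  job.mode = "STREAMING"
  parallelism = 1
}

source {
  MySQL-CDC {
    # Placeholder connection details for the source database.
    base-url = "jdbc:mysql://mysql-host:3306/shop"
    username = "seatunnel"
    password = "change-me"
    # Tables to capture, written as database.table (hypothetical names).
    table-names = ["shop.orders"]
  }
}

sink {
  StarRocks {
    # Placeholder StarRocks front-end addresses (HTTP and JDBC).
    nodeUrls = ["starrocks-fe:8030"]
    base-url = "jdbc:mysql://starrocks-fe:9030/"
    username = "root"
    password = ""
    database = "shop"
    table = "orders"
    # Assumed option: apply CDC updates and deletes to the sink table.
    enable_upsert_delete = true
  }
}
```

No separate JDBC job appears in this sketch because, as the answer notes, the CDC source handles the full synchronization before it switches to incremental changes.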
Q: Does SeaTunnel 2.3.1 still support connector v1?
A: No.
Q: How to monitor synchronization progress in SeaTunnel?
A: With the Zeta engine, synchronization progress can be viewed directly. If you use Flink, you can connect it to your internal metrics reporting platform; each vertex in a Flink job exposes metric data.
Q: Does SeaTunnel Zeta Engine support streaming?
A: Yes. It supports unified streaming and batch processing. You only need to set job.mode to STREAMING in env, provided that the corresponding data source supports streaming.
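For reference, the env block of a streaming job might look like the fragment below. It is a minimal sketch, not a complete job file: the parallelism value is arbitrary, and the source and sink blocks are omitted.

```hocon
env {
  # Run the job continuously instead of as a one-off batch job.
  job.mode = "STREAMING"
  # Unified parallelism setting (value chosen only for illustration).
  parallelism = 2
}
```

Setting job.mode back to BATCH runs the same job definition as a bounded batch job.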
Q: Is there any information about the integration of SeaTunnel and DolphinScheduler?
A: You can refer to https://dolphinscheduler.apache.org/zh-cn/docs/3.1.5/guide/task/seatunnel. The PR that adds DolphinScheduler support for the SeaTunnel Zeta engine has been merged, but the version that includes it has not been released yet.
📌📌 You are welcome to fill out this survey to share feedback on your user experience or your ideas about Apache SeaTunnel :)
About Apache SeaTunnel
Apache SeaTunnel (formerly Waterdrop) is an easy-to-use, ultra-high-performance distributed data integration platform. It supports real-time synchronization of massive amounts of data and can synchronize hundreds of billions of data records per day stably and efficiently.
Why do we need Apache SeaTunnel?
Apache SeaTunnel does everything it can to solve the problems you may encounter when synchronizing massive amounts of data:
- Data loss and duplication
- Task buildup and latency
- Low throughput
- Long application-to-production cycle time
- Lack of application status monitoring
Apache SeaTunnel Usage Scenarios
- Massive data synchronization
- Massive data integration
- ETL of large volumes of data
- Massive data aggregation
- Multi-source data processing
Features of Apache SeaTunnel
- Rich components
- High scalability
- Easy to use
- Mature and stable
How to get started with Apache SeaTunnel quickly?
Want to experience Apache SeaTunnel quickly? SeaTunnel 2.1.0 takes 10 seconds to get you up and running.
https://seatunnel.apache.org/docs/2.1.0/developement/setup
How can I contribute?
We invite everyone who is interested in making local open source global to join the Apache SeaTunnel contributor family and foster open source together!
Submit an issue:
https://github.com/apache/incubator-seatunnel/issues
Contribute code to:
https://github.com/apache/incubator-seatunnel/pulls
Subscribe to the community development mailing list:
dev-subscribe@seatunnel.apache.org
Development mailing list:
dev@seatunnel.apache.org
Join Slack:
https://join.slack.com/t/apacheseatunnel/shared_invite/zt-1kcxzyrxz-lKcF3BAyzHEmpcc4OSaCjQ
Follow Twitter:
https://twitter.com/ASFSeaTunnel
Come and join us!