[Issue 2] Apache SeaTunnel (Incubating) Weekly FAQ

Apache SeaTunnel
Apr 27, 2023

April 8-April 14

Q: How to synchronize MySQL to ClickHouse?
A: SeaTunnel first synchronizes the historical data that already exists in the MySQL table, and then synchronizes the subsequent changes.

Q: Does SeaTunnel support HBase?
A: Not yet. HBase support is currently being implemented.

Q: Does execution.parallelism in env work in SeaTunnel Engine? Or does it only work on the Flink engine?
A: The two have now been unified into a single parameter, parallelism.
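
As a minimal sketch, assuming a SeaTunnel release that already accepts the unified name (older releases may still expect execution.parallelism), the env block of a job config might look like this. The value 2 is only an illustration.

```hocon
env {
  # Unified option named in the answer above; 2 is an illustrative value
  parallelism = 2
}
```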

Q: I plan to sync from MySQL to StarRocks, and I want the job to cover initial schema migration, full synchronization of existing data, incremental synchronization, and DDL synchronization. The documentation states that the save_mode_create_template attribute of the StarRocks sink supports schema migration, JDBC supports migration of existing data, and CDC supports incremental synchronization. I have two questions. First, is DDL synchronization supported? Second, how do I start incremental synchronization seamlessly once the existing data has been synchronized?
A: DDL synchronization is not supported yet; it is under design and development. For the rest, use the MySQL CDC source with the StarRocks sink directly: the job automatically synchronizes the full data first and then continues incrementally. There is no need to run a JDBC job for the existing data first.
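
The setup described in this answer can be sketched as a single job config. This is an illustration under assumptions, not a verified configuration: the option names follow the MySQL-CDC and StarRocks connector documentation as best recalled and may differ between versions, and every host, credential, database, and table name below is a made-up placeholder. Check the connector docs for your release before using it.

```hocon
env {
  # CDC sources run continuously, so the job is a streaming job
  job.mode = "STREAMING"
}

source {
  MySQL-CDC {
    # Placeholder connection details (assumptions); replace with your own
    base-url = "jdbc:mysql://localhost:3306/shop"
    username = "seatunnel"
    password = "change-me"
    # Tables to capture, written as database.table (hypothetical names)
    table-names = ["shop.orders"]
  }
}

sink {
  StarRocks {
    # Placeholder FE address and credentials (assumptions)
    nodeUrls = ["localhost:8030"]
    base-url = "jdbc:mysql://localhost:9030/"
    username = "root"
    password = ""
    # Hypothetical target database and table
    database = "shop"
    table = "orders"
  }
}
```

With a config of this shape, a single job is expected to handle both phases, which is why no separate JDBC job for the existing data appears here.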

Q: Does SeaTunnel 2.3.1 still support connector v1?
A: No.

Q: How to monitor synchronization progress in SeaTunnel?
A: With the Zeta engine, the progress can be viewed directly. If you use Flink, you can connect the job to your internal metrics reporting platform; each vertex in Flink exposes metric data.

Q: Does SeaTunnel Zeta Engine support streaming?
A: Yes. It supports unified streaming and batch processing. You only need to set job.mode to STREAMING in env, provided that the corresponding data source supports streaming.
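
A minimal sketch of that setting, showing only the env block; the source and sink blocks of the job are omitted here and are assumed to be streaming-capable connectors:

```hocon
env {
  # "BATCH" for bounded jobs, "STREAMING" for continuous ones
  job.mode = "STREAMING"
}
```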

Q: Is there any information about the integration of SeaTunnel and DolphinScheduler?
A: You can refer to https://dolphinscheduler.apache.org/zh-cn/docs/3.1.5/guide/task/seatunnel. The PR that adds DolphinScheduler support for the SeaTunnel Zeta engine has been merged and will ship in an upcoming release.

📌📌 You are welcome to fill out this survey to share feedback on your user experience or your ideas about Apache SeaTunnel :)

About Apache SeaTunnel

Apache SeaTunnel (formerly Waterdrop) is an easy-to-use, ultra-high-performance distributed data integration platform that supports real-time synchronization of massive amounts of data and can synchronize hundreds of billions of data per day in a stable and efficient manner.

Why do we need Apache SeaTunnel?

Apache SeaTunnel does everything it can to solve the problems you may encounter in synchronizing massive amounts of data.

  • Data loss and duplication
  • Task buildup and latency
  • Low throughput
  • Long application-to-production cycle time
  • Lack of application status monitoring

Apache SeaTunnel Usage Scenarios

  • Massive data synchronization
  • Massive data integration
  • ETL of large volumes of data
  • Massive data aggregation
  • Multi-source data processing

Features of Apache SeaTunnel

  • Rich components
  • High scalability
  • Easy to use
  • Mature and stable

How to get started with Apache SeaTunnel quickly?

Want to experience Apache SeaTunnel quickly? SeaTunnel 2.1.0 takes 10 seconds to get you up and running.

https://seatunnel.apache.org/docs/2.1.0/developement/setup

How can I contribute?

We invite all partners who are interested in making local open-source global to join the Apache SeaTunnel contributors family and foster open-source together!

Submit an issue:

https://github.com/apache/incubator-seatunnel/issues

Contribute code to:

https://github.com/apache/incubator-seatunnel/pulls

Subscribe to the community development mailing list:

dev-subscribe@seatunnel.apache.org

Development mailing list:

dev@seatunnel.apache.org

Join Slack:

https://join.slack.com/t/apacheseatunnel/shared_invite/zt-1kcxzyrxz-lKcF3BAyzHEmpcc4OSaCjQ

Follow Twitter:

https://twitter.com/ASFSeaTunnel

Come and join us!
