Java Resource Management and Leak Prevention: Insights from SeaTunnel Source Code
Resource management is a critical yet often overlooked aspect of Java development. This article explores how to properly manage resources in Java to prevent leaks, using a real example from the SeaTunnel project.
A Fix in SeaTunnel
In Apache SeaTunnel, the HiveSink component once had a typical resource leak issue. The code comparison before and after the fix is shown below:
Before the fix:
@Override
public List<FileAggregatedCommitInfo> commit(...) throws IOException {
    HiveMetaStoreProxy hiveMetaStore = HiveMetaStoreProxy.getInstance(pluginConfig);
    List<FileAggregatedCommitInfo> errorCommitInfos = super.commit(aggregatedCommitInfos);
    if (errorCommitInfos.isEmpty()) {
        // Handle partition logic...
    }
    hiveMetaStore.close(); // Will not execute if an exception occurs above
    return errorCommitInfos;
}
After the fix:
@Override
public List<FileAggregatedCommitInfo> commit(...) throws IOException {
    List<FileAggregatedCommitInfo> errorCommitInfos = super.commit(aggregatedCommitInfos);
    HiveMetaStoreProxy hiveMetaStore = HiveMetaStoreProxy.getInstance(pluginConfig);
    try {
        if (errorCommitInfos.isEmpty()) {
            // Handle partition logic...
        }
    } finally {
        hiveMetaStore.close(); // Ensures the resource is always released
    }
    return errorCommitInfos;
}
Though the change appears small, it guarantees that hiveMetaStore.close() runs even when the partition-handling code throws, which is exactly the path that leaked before. Let's now explore general strategies for Java resource management.
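The difference between the two versions can be reproduced in isolation. The following is a minimal sketch, not SeaTunnel code: FakeClient is a hypothetical stand-in for a client such as HiveMetaStoreProxy, and the method names are invented for illustration. It only shows that a close() placed after the work is skipped on an exception, while a close() in finally is not.

```java
import java.util.ArrayList;
import java.util.List;

public class FinallyCloseDemo {

    /** Hypothetical stand-in for a metastore client; records whether close() ran. */
    static class FakeClient {
        boolean closed = false;

        void addPartitions(List<String> partitions) {
            if (partitions.isEmpty()) {
                throw new IllegalStateException("no partitions to add");
            }
        }

        void close() {
            closed = true;
        }
    }

    /** Leaky variant: close() is skipped whenever addPartitions throws. */
    static void commitWithoutFinally(FakeClient client, List<String> partitions) {
        client.addPartitions(partitions);
        client.close();
    }

    /** Fixed variant: close() runs on both the normal and the exceptional path. */
    static void commitWithFinally(FakeClient client, List<String> partitions) {
        try {
            client.addPartitions(partitions);
        } finally {
            client.close();
        }
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();

        FakeClient leaky = new FakeClient();
        try {
            commitWithoutFinally(leaky, empty);
        } catch (IllegalStateException expected) {
            // The exception propagated and close() was never reached.
        }
        System.out.println("without finally, closed = " + leaky.closed); // false

        FakeClient safe = new FakeClient();
        try {
            commitWithFinally(safe, empty);
        } catch (IllegalStateException expected) {
            // The exception still propagated, but close() had already run.
        }
        System.out.println("with finally, closed = " + safe.closed); // true
    }
}
```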
What Is a Resource Leak?
A resource leak occurs when a program acquires a resource but fails to properly release it, resulting in long-term resource occupation. Common resources that require proper management include:
- 📁 File handles
- 📊 Database connections
- 🌐 Network connections
- 🧵 Thread resources
- 🔒 Lock resources
- 💾 Memory resources
Improper resource management may lead to:
- Performance degradation
- Out of memory errors
- Application crashes
- Service unavailability
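As one concrete case from the list of resources above, a thread pool that is never shut down keeps its worker thread alive after the work is done. The sketch below is illustrative only; the class name, the sumOnWorker method, and the lastPool field are assumptions made for the demo. It shuts the pool down in a finally block so that the shutdown also happens when the task fails.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorShutdownDemo {

    // Kept only so the demo can inspect the pool's state afterwards.
    static ExecutorService lastPool;

    /** Runs a small task on a worker thread and always shuts the pool down. */
    static int sumOnWorker(int a, int b) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        lastPool = pool;
        try {
            Future<Integer> result = pool.submit(() -> a + b);
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e);
        } finally {
            pool.shutdown(); // releases the worker thread on every path
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOnWorker(2, 3));      // 5
        System.out.println(lastPool.isShutdown());  // true
    }
}
```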
Two Approaches to Resource Management
(1) Traditional Approach: try-catch-finally
Connection conn = null;
try {
    conn = DriverManager.getConnection(url, user, password);
    // Use connection
} catch (SQLException e) {
    // Exception handling
} finally {
    if (conn != null) {
        try {
            conn.close();
        } catch (SQLException e) {
            // Handle close exception
        }
    }
}
Drawbacks:
- Verbose code
- Complex nesting
- Easy to forget to close a resource
- Harder to maintain when multiple resources are involved
(2) Modern Approach: try-with-resources (Java 7+)
try (Connection conn = DriverManager.getConnection(url, user, password)) {
    // Use connection
} catch (SQLException e) {
    // Exception handling
}
Advantages:
- Cleaner and more concise
- Automatically closes resources
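- Keeps the original exception: if close() also throws, that exception is attached to the original one as a suppressed exception instead of replacing it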
- Maintains readability even with multiple resources
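The exception behavior of try-with-resources can be observed directly. The sketch below uses a hypothetical FailingResource whose close() always throws; both class names are invented for the demo. When the body and close() both fail, the caller sees the exception from the body, and the one from close() is available through getSuppressed().

```java
public class SuppressedDemo {

    /** Hypothetical resource whose close() always fails. */
    static class FailingResource implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("close failed");
        }
    }

    /** Describes what the caller sees when both the body and close() throw. */
    static String describe() {
        try (FailingResource resource = new FailingResource()) {
            throw new IllegalArgumentException("body failed");
        } catch (Exception e) {
            // e is the exception from the body; the close() failure is attached to it.
            Throwable[] suppressed = e.getSuppressed();
            return e.getMessage() + " / suppressed: " + suppressed[0].getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe()); // body failed / suppressed: close failed
    }
}
```

With a hand-written finally block, an exception thrown by close() would instead replace the one from the body unless it is caught explicitly.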
Custom Resource Classes
To manage custom resources, implement the AutoCloseable interface:
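public class MyResource implements AutoCloseable {
    // Assumes an SLF4J logger; ExpensiveResource and acquireExpensiveResource() are placeholders
    private static final Logger logger = LoggerFactory.getLogger(MyResource.class);
    private final ExpensiveResource resource;

    public MyResource() {
        this.resource = acquireExpensiveResource();
    }

    @Override
    public void close() {
        if (resource != null) {
            try {
                resource.release();
            } catch (Exception e) {
                // Log instead of rethrowing so close() does not mask the caller's exception
                logger.error("Error while closing resource", e);
            }
        }
    }
}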
Key principles:
- The close() method should be idempotent
- Handle internal exceptions within close()
- Log failures during resource release
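One possible way to make close() idempotent is to guard the release with a flag. The following is a sketch under assumptions: IdempotentResource and its releaseCount counter are hypothetical, and the counter merely stands in for a real release call. AtomicBoolean.compareAndSet lets only the first caller through, even if close() is called from several threads.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentResource implements AutoCloseable {

    private final AtomicBoolean closed = new AtomicBoolean(false);
    // Counts how often the underlying release ran; stands in for a real release call.
    private final AtomicInteger releaseCount = new AtomicInteger();

    @Override
    public void close() {
        // compareAndSet succeeds only for the first caller, so the release runs once.
        if (closed.compareAndSet(false, true)) {
            releaseCount.incrementAndGet();
        }
    }

    int releaseCount() {
        return releaseCount.get();
    }

    public static void main(String[] args) {
        IdempotentResource resource = new IdempotentResource();
        resource.close();
        resource.close(); // second call is a no-op
        System.out.println(resource.releaseCount()); // 1
    }
}
```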
Common Pitfalls and Solutions
(1) Resource management inside loops
Incorrect:
public void processFiles(List<String> filePaths) throws IOException {
    for (String path : filePaths) {
        FileInputStream fis = new FileInputStream(path); // Potential leak
        // Process file
        fis.close(); // Won't run if an exception is thrown
    }
}
Correct:
public void processFiles(List<String> filePaths) throws IOException {
    for (String path : filePaths) {
        try (FileInputStream fis = new FileInputStream(path)) {
            // Process file
        } // Automatically closed
    }
}
(2) Nested resource handling
// Recommended approach
public void nestedResources() throws Exception {
try (
OutputStream os = new FileOutputStream("file.txt");
BufferedOutputStream bos = new BufferedOutputStream(os)
) {
// Use bos
} // Automatically closed in reverse order
}
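The reverse closing order can be checked with a small experiment. In this sketch, Named is a hypothetical resource that appends its name to a shared list when it is closed; the class and method names are made up for the demo.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderDemo {

    /** Hypothetical resource that records its name when closed. */
    static class Named implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Named(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    /** Returns the names in the order in which the resources were closed. */
    static List<String> closeOrder() {
        List<String> log = new ArrayList<>();
        try (Named first = new Named("first", log);
             Named second = new Named("second", log)) {
            // Nothing to do; only the closing order matters here.
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(closeOrder()); // [second, first]
    }
}
```

The resource declared last is closed first, which is what a wrapper such as BufferedOutputStream needs: it is closed (and flushed) before the stream it wraps.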
Real-World Examples
(1) Database Connection
public void processData() throws SQLException {
    try (
        Connection conn = getConnection();
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT * FROM users")
    ) {
        while (rs.next()) {
            // Process data
        }
    } // All resources closed automatically
}
(2) File Copy
public void copyFile(String source, String target) throws IOException {
    try (
        FileInputStream in = new FileInputStream(source);
        FileOutputStream out = new FileOutputStream(target)
    ) {
        byte[] buffer = new byte[1024];
        int len;
        // read() returns -1 at end of stream
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
    }
}
(3) HTTP Request
public String fetchData(String urlString) throws IOException {
    URL url = new URL(urlString);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.setRequestMethod("GET");
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(connection.getInputStream()))) {
        StringBuilder response = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            response.append(line);
        }
        return response.toString();
    } finally {
        connection.disconnect();
    }
}
Summary & Best Practices
- Prefer try-with-resources for managing resources
- If try-with-resources is not applicable, use finally blocks to release resources
- Custom resource classes should implement AutoCloseable
- Release resources in the reverse order of acquisition
- Handle exceptions in close() gracefully
- Be extra cautious with resource creation inside loops
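Locks are a common case where try-with-resources does not apply directly, because java.util.concurrent.locks.Lock does not implement AutoCloseable. A minimal sketch of the finally-based pattern follows; LockReleaseDemo and incrementOrFail are hypothetical names used only for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockReleaseDemo {

    private final ReentrantLock lock = new ReentrantLock();
    private int counter;

    /** Increments the counter under the lock; optionally fails to simulate an error. */
    int incrementOrFail(boolean fail) {
        lock.lock(); // acquire outside the try so unlock() only runs if lock() succeeded
        try {
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            return ++counter;
        } finally {
            lock.unlock(); // released on both the normal and the exceptional path
        }
    }

    boolean isLocked() {
        return lock.isLocked();
    }

    public static void main(String[] args) {
        LockReleaseDemo demo = new LockReleaseDemo();
        try {
            demo.incrementOrFail(true);
        } catch (IllegalStateException expected) {
            // The failure propagated, but the lock was still released.
        }
        System.out.println(demo.isLocked());             // false
        System.out.println(demo.incrementOrFail(false)); // 1
    }
}
```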