Optimizing the backend of a web service always comes down to performance, and a key aspect of performance is the speed of data processing. Optimization covers many improvements aimed at using resources more efficiently and minimizing the system's response time to requests. In this article, I will share several proven techniques that can significantly speed up your web service.
Many programmers, in their quest to make an application faster, focus on optimizing code and algorithms: choosing suitable data structures and the most efficient operations. This usually helps, but the gains are often modest. The reason is that in-memory operations are already inherently fast; unless the original code was very inefficient, significant improvements should not be expected there. The operations worth prioritizing are the genuinely time-consuming ones, above all input-output.
Whether you are working with files or a database, the execution time of these tasks is always substantial compared to in-memory operations. You cannot do much about how fast data is read from a file, but interaction with the database is under your direct control, and as a developer you have every capability to improve it significantly.
Let's explore the following strategies to make working with your database more efficient and thereby significantly boost the performance of your backend service.
Today, it's rare to find a backend web service that doesn't use an Object-Relational Mapping (ORM) system for database interactions. If you're aiming for top-notch results, consider customizing how you use your ORM. Although ORMs are convenient and well-tested, they are designed for general use, and that broad applicability often comes at the expense of peak performance.
Remember, ORMs are built to be compatible with many different databases, which means you may miss out on features specific to the database you've chosen for your project. Leveraging such database-specific features can speed up database interactions dramatically, in some cases by up to 30 times.
Instead of solely depending on the default queries provided by an ORM, it's worthwhile to craft your own optimized queries. Custom queries often perform better, particularly in scenarios involving multiple joins.
Below is a simple example in Spring JPA of how you can improve performance using a join query:
@Transactional
@Lock(LockModeType.PESSIMISTIC_READ)
@Query("""
        SELECT e
        FROM EmployeeRecord e
        LEFT JOIN DepartmentRecord d ON e.departmentId = d.id
        WHERE e.departmentId = :departmentId
        """)
List<EmployeeRecord> findEmployeesByDepartmentId(@Param("departmentId") Integer departmentId);
Using complex classes with nested objects and a deep hierarchical system can lead to a significant loss in system performance. It is often unnecessary to query the database for the entire nested structure, especially when not all classes in the structure are fully utilized.
While lazy initialization helps mitigate unnecessary queries on nested objects, challenges arise when a nested object is needed, but not all of its data is required. The solution to this dilemma is to employ flat data classes.
You should create a class designed to collect only the necessary field data from the database. Then, with a custom database query incorporating all the necessary joins, select only those fields that are genuinely needed.
This approach will not only enhance query speed but also reduce data traffic from the database to your service.
For example, using NamedParameterJdbcTemplate from Spring's JDBC support (the spring-jdbc module), a flat class with only the necessary fields can be created:
public record EmployeeDepartment(Integer employeeId, String employeeName, String departmentName) {
}
Next, using a straightforward script, only the necessary fields are collected from the main and joined tables:
public List<EmployeeDepartment> employeeDepartments() {
    return template.query("""
            SELECT
                employees.employee_id,
                employees.employee_name,
                departments.department_name
            FROM employees
            LEFT JOIN departments
                ON employees.department_id = departments.department_id
            """,
            new MapSqlParameterSource(), employeeDepartmentMapper);
}
This approach will significantly reduce the load and make working with data much more efficient.
The next important step in working with data is classifying it by how it is used. The most demanding category is Hot Data.
Hot Data is the data that the service processes in real-time. This data cannot be cached because the relevance of the web service's response depends on its immediate responsiveness. Therefore, this data must always be up-to-date. The service consistently works with Hot Data, continually recording new values and extracting information for timely updates.
To work with Hot Data as efficiently as possible, it is crucial to ensure that the table in which it is stored remains as compact as possible.
Keep as few columns as possible
Your table should contain only the fields that are actively used and store all other data in a separate table, retaining only the ID of the relevant row. This approach allows you to access all the unused fields when needed, such as for reporting purposes, without overburdening the main table with this data.
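As a sketch of this split (all table and column names here are illustrative, not from a real schema), the hot table keeps only the actively used fields, and everything else moves to a details table that shares the same ID:

```sql
-- Hot table: only the fields the service reads and writes in real time.
CREATE TABLE orders_hot (
    order_id    BIGINT PRIMARY KEY,
    status      SMALLINT  NOT NULL,
    total_cents BIGINT    NOT NULL,
    updated_at  TIMESTAMP NOT NULL
);

-- Rarely used fields live in a separate table, joined by ID only when
-- actually needed (e.g. for reporting).
CREATE TABLE orders_details (
    order_id         BIGINT PRIMARY KEY,  -- same ID as in orders_hot
    delivery_notes   TEXT,
    customer_comment TEXT
);
```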
Keep as few rows as possible
Don't store rows you no longer need. Instead, move them to an archive table. This approach allows your queries to find the required rows faster while preserving all historical data in the archive. Automating this process with a simple job minimizes your involvement in data archiving.
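Such an archiving job can be a very simple script run on a schedule. A PostgreSQL-style sketch (table names and the 30-day retention window are illustrative):

```sql
-- Run inside a single transaction from a scheduled job:
-- copy old rows into the archive table, then remove them from the hot table.
INSERT INTO orders_archive
SELECT * FROM orders_hot
WHERE updated_at < NOW() - INTERVAL '30 days';

DELETE FROM orders_hot
WHERE updated_at < NOW() - INTERVAL '30 days';
```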
Keep indexes updated
Remember to build indexes. Indexes are crucial for quick data searches and are often overlooked by programmers. Proper indexing can reduce search times and memory consumption of your database significantly. Make sure to build indexes for both conditions and columns involved in joins, including composite indexes.
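For the employees/departments example used earlier, the indexes might look like this (PostgreSQL-style sketch; the `status` column in the composite index is a hypothetical example of a second filter column):

```sql
-- Index the column used in WHERE conditions and in joins.
CREATE INDEX idx_employees_department_id
    ON employees (department_id);

-- Composite index for queries that filter on several columns together.
CREATE INDEX idx_employees_dept_status
    ON employees (department_id, status);
```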
Give up using foreign keys
Using foreign key constraints places an additional load on the database, which must verify that the key exists in the associated table. This slows down data operations, especially writes. Don't get me wrong; storing a related key in such a table is possible and sometimes even necessary, but it's better to store the key simply as a plain value, without the constraint.
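In schema terms the difference is just whether the column carries a REFERENCES constraint (a sketch with illustrative table names):

```sql
-- With a foreign key constraint: every INSERT/UPDATE on employees is
-- validated against departments, which costs time on the write path.
CREATE TABLE employees_fk (
    employee_id   INTEGER PRIMARY KEY,
    department_id INTEGER REFERENCES departments (department_id)
);

-- As a plain value: the same column, no constraint; the application is
-- responsible for keeping the relationship consistent.
CREATE TABLE employees_plain (
    employee_id   INTEGER PRIMARY KEY,
    department_id INTEGER
);
```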
These simple methods will enable you to maximize the efficiency and utility of your table.
Warm Data is data used to prepare a response, but whose freshness is not critical. Examples include product descriptions or a list of available accessories. When storing such data, you no longer need to monitor table size as closely. However, do not overlook creating indexes on these tables, as they are frequently used in joins.
Cache Warm Data
A key advantage of Warm Data is its cacheability. Once a request is made, you can store the data in memory, reducing the number of database calls and speeding up calculations. However, remember that the cache needs regular updates.
Set reasonable TTL (Time To Live)
Set the correct Time To Live (TTL) for proper operation. Usually, a TTL of around 90 seconds is sufficient, aligning with the average time a user takes to make a decision and place an order on a website. Always adjust the TTL based on your service's requirements.
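As a minimal sketch of such a cache, here is a plain-Java TTL cache backed by a `ConcurrentHashMap`, with lazy eviction on read. The class and method names are illustrative; in a real Spring service you would more likely use a ready-made cache such as Caffeine or Spring's cache abstraction:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal in-memory cache with a single TTL for all entries (a sketch for
// Warm Data caching; not production-hardened).
class TtlCache<K, V> {
    // Each entry remembers the moment it stops being valid.
    private record Entry<T>(T value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;

    TtlCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    void put(K key, V value) {
        store.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    // Returns null when the entry is missing or its TTL has elapsed;
    // expired entries are evicted lazily on read.
    V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() > e.expiresAtMillis()) {
            store.remove(key);
            return null;
        }
        return e.value();
    }
}
```

For the 90-second TTL suggested above, the cache would be created as `new TtlCache<>(90_000)`.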
Use smaller classes to store Warm Data
For caching, use compact classes. Even when full-fledged queries are made and all data from tables is collected, avoid storing everything in the cache. Store only the necessary data. This approach significantly reduces the memory consumption of your backend service.
Setting up Warm Data will not require much time, and ultimately, you will achieve tangible results.
Cold Data is data that seldom changes yet is needed for a response, such as a store's name or address. Because it changes so rarely, it has minimal impact on the freshness of the response.
Cache Cold Data or store it in a file
This data type should always be cached. If fitting such data into memory is not feasible due to its large size, consider unloading it from the database and storing it in files in a ready-to-use format. Dividing it into categories and selecting only the most frequently used ones will help reduce memory usage. Additionally, this approach noticeably improves speed compared to fetching the data from the database, as it eliminates the need to work over the network.
Update Cache on trigger
The time to live (TTL) for such a cache is typically set at 24 hours. To keep the cache updated, you should schedule a task or create a trigger that monitors changes to this data and initiates a cache update. For instance, if an endpoint is called to post or update Cold Data, a trigger should be activated to update the cache.
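The write-path trigger can be as simple as refreshing the cached snapshot inside the same method that persists the update. A self-contained sketch (the `StoreInfo` record and method names are illustrative, and the database write is stubbed out):

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of trigger-style cache refreshing for Cold Data: every write path
// refreshes the cached snapshot immediately instead of waiting for the TTL.
class ColdDataCache {
    record StoreInfo(String name, String address) {}

    private final AtomicReference<StoreInfo> snapshot = new AtomicReference<>();

    // Called by the endpoint that posts or updates Cold Data.
    void updateStoreInfo(StoreInfo updated) {
        // ... persist `updated` to the database here ...
        snapshot.set(updated);  // refresh the cache as part of the write
    }

    // Read path: serve the cached snapshot, never the database.
    StoreInfo storeInfo() {
        return snapshot.get();
    }
}
```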
Effectively managing Cold Data is also an important part of optimizing response efficiency, thereby enhancing overall system performance.
In conclusion, optimizing the backend of a web service does not hinge solely on code and algorithm optimization. Improving database interactions leads to greater efficiency and better overall performance. Techniques such as fine-tuning ORM queries, using flat data classes, classifying data by how it is used, and adopting appropriate caching strategies can significantly boost service performance, responsiveness, and overall functionality.
Key steps: