
The Struggle
The issue was difficult to trigger intentionally, which made it hard to pinpoint the root cause. We spent time on spikes, messages, and meetings with experts trying to identify it. The most we got out of that were some insights and small improvements.
Some positive outcomes were identifying extra function calls and fixing a memory leak caused by event listeners not being properly removed. While that was good, it didn't solve the main problem. We added logs to monitor how often the issue was happening, but we still had no clue about the cause.
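To illustrate the kind of leak described above, here is a minimal sketch of one common way to make sure listeners get removed. It is not the code from our fix: `ListenerTarget`, `ListenerRegistry`, `listen`, and `disposeAll` are hypothetical names chosen for the example.

```typescript
type Listener = (event: unknown) => void;

// Minimal structural type for anything that can attach and detach listeners.
// DOM elements should fit this shape; nothing here comes from the real codebase.
interface ListenerTarget {
  addEventListener(type: string, listener: Listener): void;
  removeEventListener(type: string, listener: Listener): void;
}

// Hypothetical helper that remembers every listener it attaches, so all of
// them can be removed in one call when the owning component is torn down.
class ListenerRegistry {
  private disposers: Array<() => void> = [];

  listen(target: ListenerTarget, type: string, listener: Listener): void {
    target.addEventListener(type, listener);
    // Keep the exact same function reference: removeEventListener only
    // matches the reference that was originally registered.
    this.disposers.push(() => target.removeEventListener(type, listener));
  }

  disposeAll(): void {
    for (const dispose of this.disposers) {
      dispose();
    }
    this.disposers = [];
  }
}
```

The idea is that every `listen` call is paired with a stored removal, so tearing down a component becomes a single `disposeAll()` instead of relying on someone remembering each `removeEventListener` call.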
The Growing Complexity
Over the last year, many changes and improvements were made around the page, but not to the feature we display on the map. As the page was redesigned, new requirements and features were added, making the map and its integration more and more complex.
The library and the features had been in place for years, but no significant improvements were made during this time. Integrating new features into code that is already complex just adds more complexity when it is not properly planned. “Why do we need this feature?” “What needs to be changed?” Often we don’t see the problems in an estimation session; we see them only when trying to implement.
Eventually we decided it was time to prioritize the refactoring of the suspected feature and its integration to address the issue properly.
The “Root Cause”
We identified one function as a potential cause of the problem, but the problem couldn't be solved by simply refactoring one function. The problem was the entire implementation, which over the years had become:
- Hard to understand with unnecessary complexity added over time.
- Difficult to debug, as a consequence of being so hard to understand.
- Tightly coupled with code spread across multiple files, both in the library and the frontend, making it hard to modify or improve.
After gathering some insights about the code and the problems we faced, we decided that a full refactor was necessary. That meant investing more time to understand the whole feature: what it does, and how we could keep the functionality while improving the code. No old code would be reused, no copy and paste. We needed to understand what each function does and improve the implementation for better clarity and maintainability.
The Refactor
It took us around two and a half weeks to refactor the code, run tests, and profile it to ensure we could recreate the functionality and prevent the error. Some of the problems we identified and resolved included:
- Unnecessary API calls: We reduced redundant API requests.
- Extra frontend processing: We moved some processing to the backend.
- Inefficient event listeners: We minimized unnecessary DOM changes and event listener usage.
- Multiple DOM updates: We added `requestAnimationFrame` to optimize DOM changes and avoid unnecessary reflows.
- Memory management: We now use `WeakMap` in some places to ensure that objects could be garbage-collected when no longer referenced, improving memory management.
- And some more…
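As a rough illustration of the last two points, here is a minimal sketch of batching DOM writes into a single frame and keeping per-object state in a `WeakMap`. It is not the code from our refactor: `FrameBatcher`, `FrameScheduler`, `MarkerState`, and `stateFor` are hypothetical names, and the scheduler is passed in so the example also runs outside a browser.

```typescript
// Anything that runs a callback "on the next frame". In a browser this could
// be `(cb) => requestAnimationFrame(cb)`; a fake is enough for tests.
type FrameScheduler = (callback: () => void) => unknown;

// Hypothetical batcher: collects DOM writes and applies them once per frame.
class FrameBatcher<K> {
  private pending = new Map<K, () => void>();
  private frameRequested = false;
  private scheduleFrame: FrameScheduler;

  constructor(scheduleFrame: FrameScheduler) {
    this.scheduleFrame = scheduleFrame;
  }

  // Queue a write under `key`. A newer write for the same key replaces the
  // older one, so only the latest state is applied.
  queue(key: K, write: () => void): void {
    this.pending.set(key, write);
    if (!this.frameRequested) {
      this.frameRequested = true;
      this.scheduleFrame(() => this.flush());
    }
  }

  private flush(): void {
    const writes = Array.from(this.pending.values());
    this.pending.clear();
    this.frameRequested = false;
    for (const write of writes) {
      write();
    }
  }
}

// Per-object state keyed by the object itself. A WeakMap entry does not keep
// its key alive, so the state can be collected together with the object.
interface MarkerState {
  renderCount: number;
}

const markerStates = new WeakMap<object, MarkerState>();

function stateFor(marker: object): MarkerState {
  let state = markerStates.get(marker);
  if (state === undefined) {
    state = { renderCount: 0 };
    markerStates.set(marker, state);
  }
  return state;
}
```

In a browser, the batcher would be created with `new FrameBatcher((cb) => requestAnimationFrame(cb))`. Wrapping the call in an arrow function avoids invoking `requestAnimationFrame` with the wrong `this`. However many times `queue` is called before the next frame, only one frame is requested and each key is written once.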
The Outcome
While we didn't fully solve the problem, the improvements reduced the issue by 93%. This reduction in errors is already a win, and we gained more insight into the feature's behavior and its integration. Tackling tech debt is never a quick fix, especially when you don't know that something is tech debt until it starts causing problems. The time invested paid off: it reduced the error rate, improved the time spent in “Messaging” (more efficient communication on the main thread) and “Loading” (time spent loading resources), and moved code quality forward. There will always be something to improve.

[Figure: comparison over a 4-day range]
At the end of the process we decided to remove the smooth zoom feature. After the refactoring we reevaluated the impact of having the feature enabled and realized it could cause more harm than good. Using it properly would require additional time spent on other improvements, and that's not a priority at the moment.
During the refactor, we also explored some alternative solutions to avoid so much DOM manipulation, but they would have required additional time to evaluate. In the end, we focused on improving what we could, and we’re confident that this will lead to better performance moving forward.
Key Lessons Learned
- The complexity that built up over the years made this issue difficult to diagnose and fix.
- Code quality matters: a lack of care and attention to it leads to more time and effort spent on maintenance.
- Ask often “why do we need this code, and what does it affect?” to better understand its purpose and think about new approaches.
- Sometimes, when dealing with tech debt, a fresh approach is better than patching over old code.
Red Flags to Watch Out For
As a result of this experience I've become more aware of certain red flags in the codebase:
- Spending too much time trying to understand what some code does: If a function’s purpose isn’t clear, it’s a sign that the code may need refactoring.
- Large files full of unrelated functions: These, together with state managed without clear logic, can quickly lead to issues.
- Complex code involving lists and `setTimeout` together: This can easily lead to race conditions, making debugging harder.
- Unclear dependencies: When a change needs to happen, it's not clear which parts of the code should be changed.
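To make the `setTimeout` red flag more concrete: if several delayed callbacks are scheduled for the same list, an older callback can fire after a newer one and overwrite fresh data with stale data. Below is a minimal sketch of one way to guard against that, by cancelling the pending timer and snapshotting the list. `DelayedListRenderer`, `Schedule`, and `timeoutSchedule` are hypothetical names for this example, not code from our codebase.

```typescript
type Cancel = () => void;

// Schedules `fn` after `ms` and returns a function that cancels it.
type Schedule = (fn: () => void, ms: number) => Cancel;

// Default scheduler backed by setTimeout / clearTimeout.
const timeoutSchedule: Schedule = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return () => clearTimeout(id);
};

// Hypothetical helper: renders a list after a delay, but only the latest one.
class DelayedListRenderer<T> {
  private cancelPending: Cancel | null = null;
  private render: (items: readonly T[]) => void;
  private schedule: Schedule;

  constructor(
    render: (items: readonly T[]) => void,
    schedule: Schedule = timeoutSchedule,
  ) {
    this.render = render;
    this.schedule = schedule;
  }

  update(items: readonly T[], delayMs: number): void {
    // Cancel the previous timer so a stale callback cannot overwrite newer data.
    this.dispose();
    // Copy the list so later mutations by the caller do not leak into the callback.
    const snapshot = items.slice();
    this.cancelPending = this.schedule(() => {
      this.cancelPending = null;
      this.render(snapshot);
    }, delayMs);
  }

  dispose(): void {
    if (this.cancelPending !== null) {
      this.cancelPending();
      this.cancelPending = null;
    }
  }
}
```

Without the cancel step, calling `update(oldList, 100)` followed by `update(newList, 10)` would render the new list first and the old list last. With it, only the most recent list is ever rendered.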
Conclusion: Continuous Improvement
This is my second post about refactoring (here is the first: The Thought Process Behind Rewriting a Feature), and I always try to approach it with an open mind. The intention is not to blame. We must acknowledge that previous implementations had a purpose and worked at one point, but as time passes it is essential to revisit and improve them.
We evolve, we learn from reading code, insights, conversations, and experiences, and applying improvements is a constant process.
If you're still here, thanks for reading :)