perf: improve mapping lookup performance #144
Draft
Malte Janz (MalteJanz) wants to merge 2 commits into trunk from
@@ -0,0 +1,168 @@
<?php declare(strict_types=1);

// Todo: do not merge this experiment
Contributor
Author
Todo: remove this and document the approach properly somewhere else
Contributor
Author
I didn't get very far today with implementing this as an actual MappingServiceV2 and using it in our ProductConverter to validate the idea further.
But I'm still curious what others would think about this rather unusual approach 🙂
This is just an experiment / PoC for now (created on a circle day)
Context
- `Converter` classes which map data from the source system to the SW6 schema
- `Converter` `convert` method is called
- `Converter` has to map foreign keys in lots of places (e.g. `manufacturerId` of `42` in SW5 to the correct SW6 UUID). This is where the `MappingService` comes in
Baseline benchmark
I explained the environment of these benchmarks in a previous PR so I only focus on the results here:
Perfect environment
Let's start by looking at a typical `product` convert batch of 100 entities: https://blackfire.io/profiles/37d53432-4c24-4f51-b79e-507449c25ee6/graph
The `SwagMigrationAssistant\Migration\Mapping\MappingService::getMapping` method:
- takes `477ms` and is responsible for `23,67%` of the total wall time of the process
- is called `3047` times
- reaches the DB `1263` times
But you have to keep in mind that in my dev environment the DB is running on the same machine as the message worker PHP process. That means my network latency is near zero.
Let's see how it looks if we add a small delay of `1ms` every time we reach the DB in the mapping method (by adding a `usleep(1000)` call there for the non-cached case).
Simulating production
1ms is just an assumption, ChatGPT told me this:
With that 1ms I get this result:
https://blackfire.io/profiles/57a1ea2b-0332-4584-a320-47d89942d400/graph
Notice how different the situation now looks for the `getMapping` method:
- takes `2,28s` and is responsible for `50,36%` of the total wall time of the process
- is called `3047` times
- reaches the DB `1263` times, but also calls `usleep(1000)` the same number of times to simulate the network latency
This makes it more obvious why this N+1 query problem is such a big deal for all applications that talk to the DB in production.
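To make the N+1 effect concrete, here is a minimal, self-contained sketch that is not taken from the plugin. `FAKE_MAPPING_TABLE` and `fetchMappings` are hypothetical stand-ins for the mapping table and a DB round trip, and the `usleep(1000)` mirrors the assumed 1ms latency from above. It only compares how many round trips a per-call lookup with a cache needs versus a single batched lookup.

```php
<?php declare(strict_types=1);

// Hypothetical stand-in for the mapping table: old SW5 id => SW6 UUID (fake values).
const FAKE_MAPPING_TABLE = [
    42 => 'uuid-for-42',
    43 => 'uuid-for-43',
    44 => 'uuid-for-44',
];

/**
 * Simulates ONE DB round trip (assumed ~1ms latency), no matter how many ids are requested.
 *
 * @param list<int>|array<int, int> $oldIds
 * @return array<int, string> old id => UUID
 */
function fetchMappings(array $oldIds, int &$roundTrips): array
{
    ++$roundTrips;
    usleep(1000); // assumed network latency per round trip

    return array_intersect_key(FAKE_MAPPING_TABLE, array_flip($oldIds));
}

$oldIds = [42, 43, 42, 44];

// N+1 style: every uncached lookup causes its own round trip.
$perCallTrips = 0;
$cache = [];
foreach ($oldIds as $oldId) {
    if (!isset($cache[$oldId])) {
        $cache += fetchMappings([$oldId], $perCallTrips);
    }
}

// Batched style: collect the distinct ids first, then resolve them in one round trip.
$batchedTrips = 0;
$batched = fetchMappings(array_unique($oldIds), $batchedTrips);

echo "per-call round trips: {$perCallTrips}, batched round trips: {$batchedTrips}\n";
```

With the four lookups above, the per-call variant needs three round trips (one lookup is served from the cache) while the batched variant needs one. The latency cost grows with the number of round trips, not with the number of mappings.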
Proposed solution
To keep the converter logic mostly untouched and simple to read, I would propose a concept similar to asynchronous computing (like Promises from the JS world).
The idea is simple:
- `getMapping` call results in a placeholder / promise at first, but registers a task to look up a certain mapping from the DB
One way of implementing this would be to store absolute paths inside the nested `$converted` array to remember which string needs to be updated, but that would be really cumbersome and error-prone to use.
Fortunately PHP is quite powerful and supports variable references / aliases. I tried to experiment with using them a little like pointers from lower-level languages, and to some degree that seems to work (see mapping-experiment.php)
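The reference idea could look roughly like the following sketch. This is not the code from mapping-experiment.php: the class name `DeferredMappingLookup`, its `resolve` method and the `$fetchAll` callback are hypothetical names chosen for illustration, and the lookup table is fake. Each `getMapping` call stores a reference to the slot in `$converted` that should receive the UUID, and `resolve` later fills all slots after a single batched fetch.

```php
<?php declare(strict_types=1);

/** Hypothetical sketch: collects lookups first, resolves them in one batch later. */
final class DeferredMappingLookup
{
    /** @var list<array{key: string, target: ?string}> each 'target' is a reference */
    private array $pending = [];

    /** Registers a lookup; $target stays null (the placeholder) until resolve() runs. */
    public function getMapping(string $entity, string $oldId, ?string &$target): void
    {
        $target = null;
        $this->pending[] = ['key' => $entity . ':' . $oldId, 'target' => &$target];
    }

    /**
     * @param callable(list<string>): array<string, string> $fetchAll
     *        receives distinct "entity:oldId" keys, returns key => UUID
     */
    public function resolve(callable $fetchAll): void
    {
        $keys = [];
        foreach ($this->pending as $task) {
            $keys[$task['key']] = true; // de-duplicate
        }
        $uuids = $keys === [] ? [] : $fetchAll(array_keys($keys));

        foreach (array_keys($this->pending) as $i) {
            // Writing through the stored reference updates the caller's array slot.
            $this->pending[$i]['target'] = $uuids[$this->pending[$i]['key']] ?? null;
        }
        $this->pending = []; // drop the references again
    }
}

// Fake "DB" for the sketch; a real implementation would run one query here.
$fakeTable = ['manufacturer:42' => 'uuid-manufacturer-42', 'tax:7' => 'uuid-tax-7'];
$queries = 0;
$fetchAll = function (array $keys) use ($fakeTable, &$queries): array {
    ++$queries;

    return array_intersect_key($fakeTable, array_flip($keys));
};

$lookup = new DeferredMappingLookup();
$converted = ['name' => 'Example product', 'prices' => [[]]];
$lookup->getMapping('manufacturer', '42', $converted['manufacturerId']);
$lookup->getMapping('tax', '7', $converted['prices'][0]['taxId']);

$lookup->resolve($fetchAll);
echo json_encode($converted), "\n";
```

One caveat of this design: PHP preserves references inside arrays when the array is copied by a normal assignment, so copies of `$converted` made before `resolve` share the referenced slots with the original. That is convenient here, but it can be surprising if the array is modified elsewhere in the meantime.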
Result benchmark
TBA