Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance optimizations #48

Merged
merged 10 commits into from
Jan 22, 2024
Merged

Performance optimizations #48

merged 10 commits into from
Jan 22, 2024

Conversation

Goltergaul
Copy link
Owner

Before (valid data scenario from benchmark/model.rb):

Comparison:
          dry-struct:    56198.5 i/s
          definition:    28121.2 i/s - 2.00x  slower

After (valid data scenario from benchmark/model.rb):

Comparison:
          dry-struct:    58341.7 i/s
          definition:    47363.5 i/s - 1.23x  slower

@Goltergaul
Copy link
Owner Author

tested this in the sport activities definiton model branch. found one issue i fixed, all tests pass ( 986fb47 )

@Goltergaul
Copy link
Owner Author

@erikpaperik @thoesl your review would be appreciated if you have some time left :)

class Conformer
def initialize(definition)
self.definition = definition
results = values.map do |value|
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no way to fail faster as you improved it for the and operation since we want to return all errors right?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm yeah at least that is how it currently works 🤔 The error case is not so much the bottleneck though from what i've found. In error cases definition is already faster then dry-struct. The main performance issue is behind the all-is-valid case

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an idea to try.
Instead of using map and building up an array with entries we probably don't need, we could also have two separte stacks for valid and invalid results like this:

results = []
faulty_results = []

values.each do |value|
  result = item_definition.conform(value)
          
  if result.passed?
    results << result
  else
    errors = true
    faulty_results << result
   end
end

return ConformResult.new(results.map(&:value)) unless errors

ConformResult.new(values, errors: [ConformError.new(self,
                                                            "Not all items conform with '#{name}'",
                                                            sub_errors: convert_errors(faulty_results))])

...

def convert_errors(faulty_results)
  errors = []
  faulty_results.each_with_index do |faulty_result, index|
    errors << KeyConformError.new(
      self,
      "Item #{faulty_result.value.inspect} did not conform to #{name}",
       key:        index,
       sub_errors: faulty_result.error_tree
       )
       end
      errors
end

Map uses push under the hood, which is a bit slower than <<.
Since we don't need all entries if there is an error, we don't need to have all individual results stored in the array.
By having a separate faulty_results stack, we can get rid of a check within convert_errors, too.

This version constantly performed a bit better on my machine.
Maybe give it a try and see if it's worth it.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just gave this a try with a slight adpatation, i still need all results for the convert_errors function, as this one needs to extract the correct index of the array elements for the error message.

with that i don't see a significant change in execution time. the benchmark results vary more from run to run than the difference it makes

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also with this the results are within error, seems even a bit slower:

def conform(values)
        return non_array_error(values) unless values.is_a?(Array)

        errors = []
        result_values = []

        values.each_with_index do |value, index|
          result = item_definition.conform(value)
          if result.passed?
            result_values << result.value
          else
            errors << KeyConformError.new(
            self,
            "Item #{result.value.inspect} did not conform to #{name}",
            key:        index,
            sub_errors: result.error_tree
          )
          end
        end

        return ConformResult.new(result_values) if errors.empty?

        ConformResult.new(values, errors: [ConformError.new(self,
                                                            "Not all items conform with '#{name}'",
                                                            sub_errors: errors)])
      end


def valid_input_type?
value.is_a?(Hash)
errors.push(key_error(definition, key, result))
Copy link
Collaborator

@erikpaperik erikpaperik Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe try << instead of push.
IIRC << is the fastest way to add an element to an array.

Maybe there are some more occurrences of push in the code you could change.

@Goltergaul Goltergaul merged commit 5693aec into master Jan 22, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants