Arkadiy Zabazhanov pyromaniac @BookingSync Russia

jashmenn/activeuuid 335

Binary uuid keys in Rails

pyromaniac/hoof 176

Linux zero-configuration server

skyeagle/nested_set 175

Rails 3 support! An awesome replacement for acts_as_nested_set and better_nested_set.

pyromaniac/active_data 119

Working with any data in AR style

mongoid/mongoid_orderable 98

Acts as list mongoid implementation

pyromaniac/hotcell 19

Hotcell is sandboxed template language for ruby

pyromaniac/contextuality 17

Contextual global variables

pyromaniac/secure_routes 4

Routing-level ssl support for ruby application

pyromaniac/mongoid_orderable 1

Acts as list mongoid implementation

pyromaniac/operations_cookbook 1

Operations framework implementation experience

push event toptal/chewy

Maciej Rzasa

commit sha e1064f48843fd5f8c5888c6a1c77977324853a0a

fixup! Add support for crutches and raw import

push time in 4 days

push event toptal/chewy

Maciej Rzasa

commit sha 3e924a93ad6a3abe0d4d7cd3b5ee920194f1c2e2

fixup! Add support for crutches and raw import

push time in 4 days

push event toptal/chewy

Maciej Rzasa

commit sha 33683134af45f9e6a4bee37f912bd7600e6e2766

Add support for crutches and raw import

push time in 4 days

Pull request review comment toptal/chewy

Remove parent/child mapping

+# Testing parent/child, remove before merging++curl -X PUT localhost:9206/quiz?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "mappings": {+    "question": {+      "properties": {+        "id_field": {+          "type": "keyword"+        },+        "content": {+          "type": "text"+        },+        "comment_type": {+          "type": "join",+          "relations": {+            "question": "answer"+          }+        }+      }+    }+  }+}+EOF++curl -X PUT localhost:9206/quiz/_doc/1?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "1",+  "content": "2+3?",+  "comment_type": "question"+}+EOF++curl -X PUT localhost:9206/quiz/_doc/2?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "2",+  "content": "3+4?",+  "comment_type": "question"+}+EOF+++curl -X PUT 'localhost:9206/quiz/_doc/3?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "3",+  "content": "3!",+  "comment_type": {+    "name": "answer",+    "parent": "1"+  }+}+EOF+++curl -X PUT 'localhost:9206/quiz/_doc/4?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "4",+  "content": "4!",+  "comment_type": {+    "name": "answer",+    "parent": "1"+  }++}+EOF++# fails with+#          "type" : "document_missing_exception",+#          "reason" : "[question][3]: document missing",+curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '+{ "update": { "_id": "3", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "Changed answer!" 
} }+{ "create": { "_id": "5", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "New answer!", "comment_type": { "name": "answer", "parent": "1" } } }+'++# works+curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '+{ "update": { "_id": "3", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "Changed answer!", "comment_type": { "name": "answer", "parent": "1" } } }+{ "create": { "_id": "5", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "New answer!", "comment_type": { "name": "answer", "parent": "1" } } }+'++curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '+{ "delete": { "_id": "3", "_index": "quiz", "_type": "question" }  }

https://github.com/toptal/chewy/pull/760#discussion_r648253240

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

+# Testing parent/child, remove before merging++curl -X PUT localhost:9206/quiz?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "mappings": {+    "question": {+      "properties": {+        "id_field": {+          "type": "keyword"+        },+        "content": {+          "type": "text"+        },+        "comment_type": {+          "type": "join",+          "relations": {+            "question": "answer"+          }+        }+      }+    }+  }+}+EOF++curl -X PUT localhost:9206/quiz/_doc/1?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "1",+  "content": "2+3?",+  "comment_type": "question"+}+EOF++curl -X PUT localhost:9206/quiz/_doc/2?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "2",+  "content": "3+4?",+  "comment_type": "question"+}+EOF+++curl -X PUT 'localhost:9206/quiz/_doc/3?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "3",+  "content": "3!",+  "comment_type": {+    "name": "answer",+    "parent": "1"+  }+}+EOF+++curl -X PUT 'localhost:9206/quiz/_doc/4?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "4",+  "content": "4!",+  "comment_type": {+    "name": "answer",+    "parent": "1"+  }++}+EOF++# fails with+#          "type" : "document_missing_exception",+#          "reason" : "[question][3]: document missing",+curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '+{ "update": { "_id": "3", "_index": "quiz", "_type": "question" }  }

I think I used an old manual while doing those experiments. They're soon to be removed either way.

mrzasa

comment created time in 10 days

push event toptal/chewy

Maciej Rzasa

commit sha 73339132c1020815d5a62db244849115ae4642ca

review fixes

push time in 10 days

push event toptal/chewy

Maciek Rząsa

commit sha 0784b7b8a796407671e9b91daab45fb67a73fc19

Update lib/chewy/index/import/bulk_builder.rb Co-authored-by: Ivan Rabotyaga <ivan.rabotyaga@toptal.com>

push time in 10 days

push event toptal/chewy

Maciek Rząsa

commit sha 1afe7b44b86294d3d7008043f462ce0c16edbb04

Update lib/chewy/index/import/bulk_builder.rb Co-authored-by: Ivan Rabotyaga <ivan.rabotyaga@toptal.com>

push time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 def crutches           @crutches ||= Chewy::Index::Crutch::Crutches.new @index, @to_index         end -        def parents-          return unless type_root.parent_id--          @parents ||= begin-            ids = @index.map do |object|-              object.respond_to?(:id) ? object.id : object-            end-            ids.concat(@delete.map do |object|-              object.respond_to?(:id) ? object.id : object-            end)-            @index.filter(ids: {values: ids}).order('_doc').pluck(:_id, :_parent).to_h-          end-        end-         def index_entry(object)           entry = {}           entry[:_id] = index_object_ids[object] if index_object_ids[object] -          if parents-            entry[:parent] = type_root.compose_parent(object)-            parent = entry[:_id].present? && parents[entry[:_id].to_s]-          end+          data = data_for(object)+          parent = cache(entry[:_id]) -          if parent && entry[:parent].to_s != parent-            entry[:data] = @index.compose(object, crutches)-            [{delete: entry.except(:data).merge(parent: parent)}, {index: entry}]+          entry[:routing] = routing(object) if join_field?+          if parent_changed?(data, parent)+            reindex_entries(object, data) + reindex_descendants(object)           elsif @fields.present?             
return [] unless entry[:_id] -            entry[:data] = {doc: @index.compose(object, crutches, fields: @fields)}+            entry[:data] = {doc: data_for(object, fields: @fields)}             [{update: entry}]           else-            entry[:data] = @index.compose(object, crutches)+            entry[:data] = data             [{index: entry}]           end         end +        def reindex_entries(object, data, root: object)+          entry = {}+          entry[:_id] = index_object_ids[object] || entry_id(object)+          entry[:data] = data+          entry[:routing] = routing(root) || routing(object) if join_field?+          delete = delete_single_entry(object, root: root).first+          index = {index: entry}+          [delete, index]+        end++        def reindex_descendants(root)+          load_descendants(root).flat_map do |object|+            reindex_entries(+              object,+              data_for(object),+              root: root+            )+          end+        end+         def delete_entry(object)+          delete_single_entry(object) + delete_descendants(object)+        end++        def delete_single_entry(object, root: object)           entry = {}           entry[:_id] = entry_id(object)           entry[:_id] ||= object.as_json            return [] if entry[:_id].blank? -          if parents-            parent = entry[:_id].present? 
&& parents[entry[:_id].to_s]-            return [] unless parent+          if join_field?+            cached_parent = cache(entry[:_id])+            entry_parent_id =+              if cached_parent+                cached_parent[:parent_id]+              else+                find_parent_id(object)+              end -            entry[:parent] = parent+            entry[:routing] = existing_routing(root.try(:id)) || existing_routing(object.id)+            entry[:parent] = entry_parent_id if entry_parent_id           end            [{delete: entry}]         end +        def delete_descendants(root)+          return [] unless root.respond_to?(:id)++          load_descendants(root).flat_map do |object|+            delete_single_entry(object, root: root)+          end+        end++        def load_descendants(root)+          root_type = join_field_type(root)+          return [] unless root_type++          descendant_ids = []+          grouped_parents = {root_type => [root.id]}+          until grouped_parents.empty?+            children_data = grouped_parents.flat_map do |parent_type, parent_ids|+              @index.query(+                has_parent: {+                  parent_type: parent_type,+                  # ignore_unmapped to avoid error for the leaves of the tree+                  # (types without children)+                  ignore_unmapped: true,+                  query: {ids: {values: parent_ids}}+                }+              ).pluck(:_id, join_field).map { |id, join| [join['name'], id] }+            end+            descendant_ids |= children_data.map(&:last)++            grouped_parents = {}+            children_data.each do |name, id|+              next unless name++              grouped_parents[name] ||= []+              grouped_parents[name] << id+            end+          end+          @index.adapter.load(descendant_ids)+        end++        def populate_cache+          @cache = load_cache+        end++        def cache(id)+          @cache[id.to_s]+   
     end++        def load_cache+          return {} unless join_field?++          @index+            .filter(ids: {values: ids_for_cache})+            .order('_doc')+            .pluck(:_id, :_routing, join_field)+            .map do |id, routing, join|+              [+                id,+                {routing: routing, parent_id: join['parent']}+              ]+            end.to_h+        end++        def existing_routing(id)+          # All objects needed here should be cached in #load_cache,+          # if not, we return nil. In some cases we don't have existing routing cached,+          # e.g. for loaded descendants+          return unless cache(id)++          cache(id.to_s)[:routing]+        end++        # Two types of ids:+        # * of parents of the objects to be indexed+        # * of objects to be deleted+        def ids_for_cache+          ids = @to_index.flat_map do |object|+            [find_parent_id(object), object.id] if object.respond_to?(:id)+          end+          ids.concat(@delete.map do |object|+            object.id if object.respond_to?(:id)+          end)+          ids.uniq.compact+        end++        def routing(object)+          # filter out non-model objects, early return on object==nil+          return unless object.respond_to?(:id)++          parent_id = find_parent_id(object)+          if parent_id+            routing(index_objects_by_id[parent_id.to_s]) || existing_routing(parent_id)+          else+            object.id.to_s+          end+        end++        def find_parent_id(object)+          return unless object.respond_to?(:id)++          join = data_for(object)[join_field]+          join['parent'] if join+        end++        def join_field+          return @join_field if defined?(@join_field)++          @join_field = find_join_field

I don't want to call find_join_field again if the first call returned nil. If we use ||=, the finder is called on every access whenever it returns nil:

[11] pry(main)> @c ||= (puts(1) && nil)
1
=> nil
[12] pry(main)> @c ||= (puts(1) && nil)
1
=> nil
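
To make the difference concrete, here is a minimal sketch that counts how often a finder runs under each memoization style. The class `JoinFieldLookup` and its method names are hypothetical stand-ins for illustration, not Chewy's actual code; only `find_join_field` echoes a name from the diff.

```ruby
# Hypothetical class, for illustration only; not Chewy's actual code.
class JoinFieldLookup
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Memoizes with `||=`: a nil result is never "remembered",
  # so the finder runs again on every call.
  def join_field_with_or_equals
    @with_or_equals ||= find_join_field
  end

  # Memoizes with `defined?`: once assigned (even to nil),
  # the instance variable short-circuits later calls.
  def join_field_with_defined
    return @with_defined if defined?(@with_defined)

    @with_defined = find_join_field
  end

  private

  # Stand-in for an expensive lookup that finds nothing.
  def find_join_field
    @calls += 1
    nil
  end
end

lookup = JoinFieldLookup.new
2.times { lookup.join_field_with_or_equals }
or_equals_calls = lookup.calls

lookup = JoinFieldLookup.new
2.times { lookup.join_field_with_defined }
defined_calls = lookup.calls

puts "||= ran the finder #{or_equals_calls} times"
puts "defined? ran the finder #{defined_calls} time(s)"
```

The `defined?` guard costs one extra line, but it treats "looked and found nothing" as a cached answer, which is the behaviour the comment asks for.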
mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 def import_fields(*args, &block)

         def load(ids, **options)
           scope = all_scope_where_ids_in(ids)
-          additional_scope = options[options[:_index].to_sym].try(:[], :scope) || options[:scope]
+          additional_scope = options[options[:_index].try(:to_sym)].try(:[], :scope) || options[:scope]

I had failures in load_descendants, but I fixed the way we use load there.
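
For readers unfamiliar with the changed line, the sketch below shows what the nil-safe variant guards against, under the assumption (not confirmed in the thread) that `load` was called without an `:_index` option. It is plain Ruby, so the safe-navigation operator `&.` stands in for ActiveSupport's `try`, and `additional_scope` is a hypothetical helper that only mirrors the shape of the line in the diff.

```ruby
# Hypothetical helper mirroring the shape of the changed line.
# Plain Ruby: `&.` stands in for ActiveSupport's `try`.
def additional_scope(options)
  index_key = options[:_index]&.to_sym # nil when :_index is absent
  per_index = options[index_key]       # nil when there is no per-index entry
  (per_index && per_index[:scope]) || options[:scope]
end

# With an :_index option, the per-index scope wins.
with_index = additional_scope({_index: 'comments', comments: {scope: :recent}, scope: :all})

# Without :_index, the lookup falls back to the generic :scope.
without_index = additional_scope({scope: :all})

# The pre-change form calls to_sym on nil and raises.
unsafe_error =
  begin
    {scope: :all}[:_index].to_sym
  rescue NoMethodError => e
    e.class
  end

puts with_index.inspect
puts without_index.inspect
puts unsafe_error.inspect
```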

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

+# load it to a pry instance and run `simple_test`

Yep

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

+# Testing parent/child, remove before merging

Yep, I'll remove both queries.sh and manual.rb and keep them somewhere locally.

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

     CitiesIndex.create!
   end

-  let(:index) { [double(id: 1, name: 'Name', object: {}), double(id: 2, name: 'Name', object: {})] }
-  let(:delete) { [double(id: 3, name: 'Name')] }
+  let(:index) { [double('to_index', id: 1, name: 'Name', object: {}), double(id: 2, name: 'Name', object: {})] }
+  let(:delete) { [double('to_delete', id: 3, name: 'Name', object: {})] }

It's easier to debug unexpected messages sent to a double if the doubles are named.

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

         end
       end
     end
+
+    context 'with parents' do
+      let(:index) { CommentsIndex }
+      before do
+        stub_model(:comment)
+        stub_index(:comments) do
+          index_scope Comment
+          field :content
+          field :comment_type, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join_type: :comment_type, join_id: :commented_id
+        end
+      end
+
+      let!(:existing_comments) do
+        [
+          Comment.create!(id: 1, content: 'Where is Nemo?', comment_type: :question),
+          Comment.create!(id: 2, content: 'Here.', comment_type: :answer, commented_id: 1),
+          Comment.create!(id: 31, content: 'What is the best programming language?', comment_type: :question)
+        ]
+      end
+
+      def do_raw_index_comment(options:, data:)
+        CommentsIndex.client.index(options.merge(index: 'comments', type: '_doc', refresh: true, body: data))
+      end
+
+      def raw_index_comment(comment)
+        options = {id: comment.id, routing: root(comment).id}
+        comment_type = comment.commented_id.present? ? {name: comment.comment_type, parent: comment.commented_id} : comment.comment_type
+        do_raw_index_comment(
+          options: options,
+          data: {content: comment.content, comment_type: comment_type}
+        )
+      end
+
+      def root(comment)
+        current = comment
+        # slow, but it's OK, as we don't have too deep trees
+        current = Comment.find(current.commented_id) while current.commented_id
+        current
+      end
+
+      def routing_for(id)

nope, removed. thanks :)

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 module Chewy
   module Fields
     class Base
-      attr_reader :name, :options, :value, :children
+      attr_reader :name, :options, :children
       attr_accessor :parent

+      JOIN_FIELD_EXTRA_OPTIONS = %i[join_id join_type].freeze
+
       def initialize(name, value: nil, **options)
         @name = name.to_sym
         @options = {}
         update_options!(**options)
         @value = value
         @children = []
+        @allowed_relations = find_allowed_relations(options[:relations]) # for join fields

relations is passed to Elasticsearch to define the parent-child hierarchy. join_id and join_type are just syntactic sugar that lets us build the field value correctly; they are not passed to Elasticsearch.
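
The split described above can be sketched as follows. This is an illustration under assumptions, not Chewy's implementation: `mapping_options` and `join_value` are hypothetical helpers, and the `Comment` struct only imitates the model used in the spec. The value shape (a bare name for a root, a name/parent hash for a child) follows the spec helper `raw_index_comment` in this pull request.

```ruby
JOIN_FIELD_EXTRA_OPTIONS = %i[join_id join_type].freeze

# Hypothetical helper: the options that would end up in the mapping.
# join_id and join_type are stripped; relations stays.
def mapping_options(options)
  options.reject { |key, _| JOIN_FIELD_EXTRA_OPTIONS.include?(key) }
end

# Hypothetical helper: builds the join field value from a model-like
# object, using join_type and join_id as the names of its attributes.
def join_value(object, options)
  name = object.public_send(options[:join_type]).to_s
  parent = object.public_send(options[:join_id])
  parent ? {'name' => name, 'parent' => parent} : name
end

# Stand-in for the Comment model from the spec.
Comment = Struct.new(:id, :comment_type, :commented_id)

options = {
  type: :join,
  relations: {question: :answer},
  join_type: :comment_type,
  join_id: :commented_id
}

mapping = mapping_options(options)
question = join_value(Comment.new(1, :question, nil), options)
answer = join_value(Comment.new(2, :answer, 1), options)

puts mapping.inspect
puts question.inspect
puts answer.inspect
```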

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 def crutches           @crutches ||= Chewy::Index::Crutch::Crutches.new @index, @to_index         end -        def parents-          return unless type_root.parent_id--          @parents ||= begin-            ids = @index.map do |object|-              object.respond_to?(:id) ? object.id : object-            end-            ids.concat(@delete.map do |object|-              object.respond_to?(:id) ? object.id : object-            end)-            @index.filter(ids: {values: ids}).order('_doc').pluck(:_id, :_parent).to_h-          end-        end-         def index_entry(object)           entry = {}           entry[:_id] = index_object_ids[object] if index_object_ids[object] -          if parents-            entry[:parent] = type_root.compose_parent(object)-            parent = entry[:_id].present? && parents[entry[:_id].to_s]-          end+          data = data_for(object)+          parent = cache(entry[:_id]) -          if parent && entry[:parent].to_s != parent-            entry[:data] = @index.compose(object, crutches)-            [{delete: entry.except(:data).merge(parent: parent)}, {index: entry}]+          entry[:routing] = routing(object) if join_field?+          if parent_changed?(data, parent)+            reindex_entries(object, data) + reindex_descendants(object)           elsif @fields.present?             
return [] unless entry[:_id] -            entry[:data] = {doc: @index.compose(object, crutches, fields: @fields)}+            entry[:data] = {doc: data_for(object, fields: @fields)}             [{update: entry}]           else-            entry[:data] = @index.compose(object, crutches)+            entry[:data] = data             [{index: entry}]           end         end +        def reindex_entries(object, data, root: object)+          entry = {}+          entry[:_id] = index_object_ids[object] || entry_id(object)+          entry[:data] = data+          entry[:routing] = routing(root) || routing(object) if join_field?+          delete = delete_single_entry(object, root: root).first+          index = {index: entry}+          [delete, index]+        end++        def reindex_descendants(root)+          load_descendants(root).flat_map do |object|+            reindex_entries(+              object,+              data_for(object),+              root: root+            )+          end+        end+         def delete_entry(object)+          delete_single_entry(object) + delete_descendants(object)+        end++        def delete_single_entry(object, root: object)           entry = {}           entry[:_id] = entry_id(object)           entry[:_id] ||= object.as_json            return [] if entry[:_id].blank? -          if parents-            parent = entry[:_id].present? 
&& parents[entry[:_id].to_s]-            return [] unless parent+          if join_field?+            cached_parent = cache(entry[:_id])+            entry_parent_id =+              if cached_parent+                cached_parent[:parent_id]+              else+                find_parent_id(object)+              end -            entry[:parent] = parent+            entry[:routing] = existing_routing(root.try(:id)) || existing_routing(object.id)+            entry[:parent] = entry_parent_id if entry_parent_id           end            [{delete: entry}]         end +        def delete_descendants(root)+          return [] unless root.respond_to?(:id)++          load_descendants(root).flat_map do |object|+            delete_single_entry(object, root: root)+          end+        end++        def load_descendants(root)+          root_type = join_field_type(root)+          return [] unless root_type++          descendant_ids = []+          grouped_parents = {root_type => [root.id]}+          until grouped_parents.empty?+            children_data = grouped_parents.flat_map do |parent_type, parent_ids|+              @index.query(+                has_parent: {+                  parent_type: parent_type,+                  # ignore_unmapped to avoid error for the leaves of the tree+                  # (types without children)+                  ignore_unmapped: true,+                  query: {ids: {values: parent_ids}}+                }+              ).pluck(:_id, join_field).map { |id, join| [join['name'], id] }+            end+            descendant_ids |= children_data.map(&:last)++            grouped_parents = {}+            children_data.each do |name, id|+              next unless name++              grouped_parents[name] ||= []+              grouped_parents[name] << id+            end+          end+          @index.adapter.load(descendant_ids)+        end++        def populate_cache+          @cache = load_cache+        end++        def cache(id)+          @cache[id.to_s]+   
     end++        def load_cache+          return {} unless join_field?++          @index+            .filter(ids: {values: ids_for_cache})+            .order('_doc')+            .pluck(:_id, :_routing, join_field)+            .map do |id, routing, join|+              [+                id,+                {routing: routing, parent_id: join['parent']}+              ]+            end.to_h+        end++        def existing_routing(id)+          # All objects needed here should be cached in #load_cache,+          # if not, we return nil. In some cases we don't have existing routing cached,+          # e.g. for loaded descendants+          return unless cache(id)++          cache(id.to_s)[:routing]+        end++        # Two types of ids:+        # * of parents of the objects to be indexed+        # * of objects to be deleted+        def ids_for_cache+          ids = @to_index.flat_map do |object|+            [find_parent_id(object), object.id] if object.respond_to?(:id)+          end+          ids.concat(@delete.map do |object|+            object.id if object.respond_to?(:id)+          end)+          ids.uniq.compact+        end++        def routing(object)+          # filter out non-model objects, early return on object==nil+          return unless object.respond_to?(:id)++          parent_id = find_parent_id(object)+          if parent_id+            routing(index_objects_by_id[parent_id.to_s]) || existing_routing(parent_id)+          else+            object.id.to_s+          end+        end++        def find_parent_id(object)+          return unless object.respond_to?(:id)++          join = data_for(object)[join_field]+          join['parent'] if join+        end++        def join_field+          return @join_field if defined?(@join_field)++          @join_field = find_join_field
          @join_field ||= find_join_field
mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 def import_fields(*args, &block)

         def load(ids, **options)
           scope = all_scope_where_ids_in(ids)
-          additional_scope = options[options[:_index].to_sym].try(:[], :scope) || options[:scope]
+          additional_scope = options[options[:_index].try(:to_sym)].try(:[], :scope) || options[:scope]

Just curious: where/why did it fail?

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 def crutches           @crutches ||= Chewy::Index::Crutch::Crutches.new @index, @to_index         end -        def parents-          return unless type_root.parent_id--          @parents ||= begin-            ids = @index.map do |object|-              object.respond_to?(:id) ? object.id : object-            end-            ids.concat(@delete.map do |object|-              object.respond_to?(:id) ? object.id : object-            end)-            @index.filter(ids: {values: ids}).order('_doc').pluck(:_id, :_parent).to_h-          end-        end-         def index_entry(object)           entry = {}           entry[:_id] = index_object_ids[object] if index_object_ids[object] -          if parents-            entry[:parent] = type_root.compose_parent(object)-            parent = entry[:_id].present? && parents[entry[:_id].to_s]-          end+          data = data_for(object)+          parent = cache(entry[:_id]) -          if parent && entry[:parent].to_s != parent-            entry[:data] = @index.compose(object, crutches)-            [{delete: entry.except(:data).merge(parent: parent)}, {index: entry}]+          entry[:routing] = routing(object) if join_field?+          if parent_changed?(data, parent)+            reindex_entries(object, data) + reindex_descendants(object)           elsif @fields.present?             
return [] unless entry[:_id] -            entry[:data] = {doc: @index.compose(object, crutches, fields: @fields)}+            entry[:data] = {doc: data_for(object, fields: @fields)}             [{update: entry}]           else-            entry[:data] = @index.compose(object, crutches)+            entry[:data] = data             [{index: entry}]           end         end +        def reindex_entries(object, data, root: object)+          entry = {}+          entry[:_id] = index_object_ids[object] || entry_id(object)+          entry[:data] = data+          entry[:routing] = routing(root) || routing(object) if join_field?+          delete = delete_single_entry(object, root: root).first+          index = {index: entry}+          [delete, index]+        end++        def reindex_descendants(root)+          load_descendants(root).flat_map do |object|+            reindex_entries(+              object,+              data_for(object),+              root: root+            )+          end+        end+         def delete_entry(object)+          delete_single_entry(object) + delete_descendants(object)+        end++        def delete_single_entry(object, root: object)           entry = {}           entry[:_id] = entry_id(object)           entry[:_id] ||= object.as_json            return [] if entry[:_id].blank? -          if parents-            parent = entry[:_id].present? 
&& parents[entry[:_id].to_s]-            return [] unless parent+          if join_field?+            cached_parent = cache(entry[:_id])+            entry_parent_id =+              if cached_parent+                cached_parent[:parent_id]+              else+                find_parent_id(object)+              end -            entry[:parent] = parent+            entry[:routing] = existing_routing(root.try(:id)) || existing_routing(object.id)+            entry[:parent] = entry_parent_id if entry_parent_id           end            [{delete: entry}]         end +        def delete_descendants(root)+          return [] unless root.respond_to?(:id)++          load_descendants(root).flat_map do |object|+            delete_single_entry(object, root: root)+          end+        end++        def load_descendants(root)+          root_type = join_field_type(root)+          return [] unless root_type++          descendant_ids = []+          grouped_parents = {root_type => [root.id]}+          until grouped_parents.empty?+            children_data = grouped_parents.flat_map do |parent_type, parent_ids|+              @index.query(+                has_parent: {+                  parent_type: parent_type,+                  # ignore_unmapped to avoid error for the leaves of the tree+                  # (types without children)+                  ignore_unmapped: true,+                  query: {ids: {values: parent_ids}}+                }+              ).pluck(:_id, join_field).map { |id, join| [join['name'], id] }+            end+            descendant_ids |= children_data.map(&:last)++            grouped_parents = {}+            children_data.each do |name, id|+              next unless name++              grouped_parents[name] ||= []+              grouped_parents[name] << id+            end+          end+          @index.adapter.load(descendant_ids)+        end++        def populate_cache+          @cache = load_cache+        end++        def cache(id)+          @cache[id.to_s]+   
     end++        def load_cache+          return {} unless join_field?++          @index+            .filter(ids: {values: ids_for_cache})+            .order('_doc')+            .pluck(:_id, :_routing, join_field)+            .map do |id, routing, join|+              [+                id,+                {routing: routing, parent_id: join['parent']}+              ]+            end.to_h+        end++        def existing_routing(id)+          # All objects needed here should be cached in #load_cache,+          # if not, we return nil. In some cases we don't have existing routing cached,+          # e.g. for loaded descendants+          return unless cache(id)++          cache(id.to_s)[:routing]
          cache(id)[:routing]
mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

+# Testing parent/child, remove before merging++curl -X PUT localhost:9206/quiz?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "mappings": {+    "question": {+      "properties": {+        "id_field": {+          "type": "keyword"+        },+        "content": {+          "type": "text"+        },+        "comment_type": {+          "type": "join",+          "relations": {+            "question": "answer"+          }+        }+      }+    }+  }+}+EOF++curl -X PUT localhost:9206/quiz/_doc/1?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "1",+  "content": "2+3?",+  "comment_type": "question"+}+EOF++curl -X PUT localhost:9206/quiz/_doc/2?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "2",+  "content": "3+4?",+  "comment_type": "question"+}+EOF+++curl -X PUT 'localhost:9206/quiz/_doc/3?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "3",+  "content": "3!",+  "comment_type": {+    "name": "answer",+    "parent": "1"+  }+}+EOF+++curl -X PUT 'localhost:9206/quiz/_doc/4?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'+{+  "id_field": "4",+  "content": "4!",+  "comment_type": {+    "name": "answer",+    "parent": "1"+  }++}+EOF++# fails with+#          "type" : "document_missing_exception",+#          "reason" : "[question][3]: document missing",+curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '+{ "update": { "_id": "3", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "Changed answer!" 
} }+{ "create": { "_id": "5", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "New answer!", "comment_type": { "name": "answer", "parent": "1" } } }+'++# works+curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '+{ "update": { "_id": "3", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "Changed answer!", "comment_type": { "name": "answer", "parent": "1" } } }+{ "create": { "_id": "5", "_index": "quiz", "_type": "question" }  }+{ "doc": { "content": "New answer!", "comment_type": { "name": "answer", "parent": "1" } } }+'++curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '+{ "delete": { "_id": "3", "_index": "quiz", "_type": "question" }  }

And here

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 module Chewy
   module Fields
     class Base
-      attr_reader :name, :options, :value, :children
+      attr_reader :name, :options, :children
       attr_accessor :parent
 
+      JOIN_FIELD_EXTRA_OPTIONS = %i[join_id join_type].freeze
+
       def initialize(name, value: nil, **options)
         @name = name.to_sym
         @options = {}
         update_options!(**options)
         @value = value
         @children = []
+        @allowed_relations = find_allowed_relations(options[:relations]) # for join fields

Why do we treat relations, join_id & join_type in a different manner?

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

         def crutches
           @crutches ||= Chewy::Index::Crutch::Crutches.new @index, @to_index
         end
 
-        def parents
-          return unless type_root.parent_id
-
-          @parents ||= begin
-            ids = @index.map do |object|
-              object.respond_to?(:id) ? object.id : object
-            end
-            ids.concat(@delete.map do |object|
-              object.respond_to?(:id) ? object.id : object
-            end)
-            @index.filter(ids: {values: ids}).order('_doc').pluck(:_id, :_parent).to_h
-          end
-        end
-
         def index_entry(object)
           entry = {}
           entry[:_id] = index_object_ids[object] if index_object_ids[object]
 
-          if parents
-            entry[:parent] = type_root.compose_parent(object)
-            parent = entry[:_id].present? && parents[entry[:_id].to_s]
-          end
+          data = data_for(object)
+          parent = cache(entry[:_id])
 
-          if parent && entry[:parent].to_s != parent
-            entry[:data] = @index.compose(object, crutches)
-            [{delete: entry.except(:data).merge(parent: parent)}, {index: entry}]
+          entry[:routing] = routing(object) if join_field?
+          if parent_changed?(data, parent)
+            reindex_entries(object, data) + reindex_descendants(object)
           elsif @fields.present?
             return [] unless entry[:_id]
 
-            entry[:data] = {doc: @index.compose(object, crutches, fields: @fields)}
+            entry[:data] = {doc: data_for(object, fields: @fields)}
             [{update: entry}]
           else
-            entry[:data] = @index.compose(object, crutches)
+            entry[:data] = data
             [{index: entry}]
           end
         end
 
+        def reindex_entries(object, data, root: object)
+          entry = {}
+          entry[:_id] = index_object_ids[object] || entry_id(object)
+          entry[:data] = data
+          entry[:routing] = routing(root) || routing(object) if join_field?
+          delete = delete_single_entry(object, root: root).first
+          index = {index: entry}
+          [delete, index]
+        end
+
+        def reindex_descendants(root)
+          load_descendants(root).flat_map do |object|
+            reindex_entries(
+              object,
+              data_for(object),
+              root: root
+            )
+          end
+        end
+
         def delete_entry(object)
+          delete_single_entry(object) + delete_descendants(object)
+        end
+
+        def delete_single_entry(object, root: object)
           entry = {}
           entry[:_id] = entry_id(object)
           entry[:_id] ||= object.as_json
 
           return [] if entry[:_id].blank?
 
-          if parents
-            parent = entry[:_id].present? && parents[entry[:_id].to_s]
-            return [] unless parent
+          if join_field?
+            cached_parent = cache(entry[:_id])
+            entry_parent_id =
+              if cached_parent
+                cached_parent[:parent_id]
+              else
+                find_parent_id(object)
+              end
 
-            entry[:parent] = parent
+            entry[:routing] = existing_routing(root.try(:id)) || existing_routing(object.id)
+            entry[:parent] = entry_parent_id if entry_parent_id
           end
 
           [{delete: entry}]
         end
 
+        def delete_descendants(root)
+          return [] unless root.respond_to?(:id)
+
+          load_descendants(root).flat_map do |object|
+            delete_single_entry(object, root: root)
+          end
+        end
+
+        def load_descendants(root)
+          root_type = join_field_type(root)
+          return [] unless root_type
+
+          descendant_ids = []
+          grouped_parents = {root_type => [root.id]}
+          until grouped_parents.empty?
+            children_data = grouped_parents.flat_map do |parent_type, parent_ids|
+              @index.query(
+                has_parent: {
+                  parent_type: parent_type,
+                  # ignore_unmapped to avoid error for the leaves of the tree
+                  # (types without children)
+                  ignore_unmapped: true,
+                  query: {ids: {values: parent_ids}}
+                }
+              ).pluck(:_id, join_field).map { |id, join| [join['name'], id] }
+            end
+            descendant_ids |= children_data.map(&:last)
+
+            grouped_parents = {}
+            children_data.each do |name, id|
+              next unless name
+
+              grouped_parents[name] ||= []
+              grouped_parents[name] << id
+            end
+          end
+          @index.adapter.load(descendant_ids)
+        end
+
+        def populate_cache
+          @cache = load_cache
+        end
+
+        def cache(id)
+          @cache[id.to_s]
+        end
+
+        def load_cache
+          return {} unless join_field?
+
+          @index
+            .filter(ids: {values: ids_for_cache})
+            .order('_doc')
+            .pluck(:_id, :_routing, join_field)
+            .map do |id, routing, join|
+              [
+                id,
+                {routing: routing, parent_id: join['parent']}
+              ]
+            end.to_h
+        end
+
+        def existing_routing(id)
+          # All objects needed here should be cached in #load_cache,
+          # if not, we return nil. In some cases we don't have existing routing cached,
+          # e.g. for loaded descendants
+          return unless cache(id)
+
+          cache(id.to_s)[:routing]
+        end
+
+        # Two types of ids:
+        # * of parents of the objects to be indexed
+        # * of objects to be deleted
+        def ids_for_cache
+          ids = @to_index.flat_map do |object|
+            [find_parent_id(object), object.id] if object.respond_to?(:id)
+          end
+          ids.concat(@delete.map do |object|
+            object.id if object.respond_to?(:id)
+          end)
+          ids.uniq.compact
+        end
+
+        def routing(object)
+          # filter out non-model objects, early return on object==nil
+          return unless object.respond_to?(:id)
+
+          parent_id = find_parent_id(object)
+          if parent_id
+            routing(index_objects_by_id[parent_id.to_s]) || existing_routing(parent_id)
+          else
+            object.id.to_s
+          end
+        end
+
+        def find_parent_id(object)
+          return unless object.respond_to?(:id)
+
+          join = data_for(object)[join_field]
+          join['parent'] if join
+        end
+
+        def join_field
+          return @join_field if defined?(@join_field)
+
+          @join_field = find_join_field
+        end
+
+        def find_join_field
+          type_settings = @index.mappings_hash[:mappings]
+          return unless type_settings
+
+          properties = type_settings[:properties]
+          join_fields = properties.find { |_, options| options[:type] == :join }
+          return unless join_fields
+
+          join_fields.first.to_s
+        end
+
+        def join_field_type(object)
+          join_field_value = data_for(object)[join_field]
+          case join_field_value
+          when String
+            join_field_value
+          when Hash
+            join_field_value['name']
+          end
+        end
+
+        def join_field?
+          join_field && !join_field.empty?
+        end
+
+        def data_for(object, fields: [])
+          @index.compose(object, crutches, fields: fields)
+        end
+
+        def parent_changed?(data, old_parent)
+          return false unless old_parent
+          return false unless join_field?
+          return false unless @fields.include?(join_field.to_sym)
+          return false unless data.key?(join_field)
+
+          # The join field value can be a hash, e.g.:
+          # {"name": "child", "parent": "123"} for a child
+          # {"name": "parent"} for a parent
+          # but it can also be a string: (e.g. "parent") for a parent:
+          # https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html#parent-join
+          new_join_field_value = data[join_field]
+          if new_join_field_value.is_a? Hash
+            # If we have a hash in the join field,
+            # we're taing the `parent` field that helds the parent id.
            # we're taking the `parent` field that holds the parent id.
mrzasa

comment created time in 10 days
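
The code comment in the hunk above says a join field value can be either a hash or a plain string. A minimal sketch of reading the parent id from both shapes follows; `parent_id_from_join` is a hypothetical helper written for illustration and is not part of Chewy.

```ruby
# Hypothetical helper (not Chewy's code): returns the parent id stored in a
# join field value, or nil when the value describes a root document.
def parent_id_from_join(join_value)
  case join_value
  when Hash
    # Child form: {"name" => "answer", "parent" => "1"}.
    # Root form in hash shape: {"name" => "question"} (no "parent" key).
    parent = join_value['parent'] || join_value[:parent]
    parent&.to_s
  when String, Symbol
    # Shorthand root form, e.g. "question": there is no parent.
    nil
  end
end

child = { 'name' => 'answer', 'parent' => 1 }
root_hash = { 'name' => 'question' }

puts parent_id_from_join(child).inspect
puts parent_id_from_join(root_hash).inspect
puts parent_id_from_join('question').inspect
```

Normalising the id with `to_s` is a choice made for this sketch so that integer and string ids compare equal.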

Pull request review comment toptal/chewy

Remove parent/child mapping

     CitiesIndex.create!
   end
 
-  let(:index) { [double(id: 1, name: 'Name', object: {}), double(id: 2, name: 'Name', object: {})] }
-  let(:delete) { [double(id: 3, name: 'Name')] }
+  let(:index) { [double('to_index', id: 1, name: 'Name', object: {}), double(id: 2, name: 'Name', object: {})] }
+  let(:delete) { [double('to_delete', id: 3, name: 'Name', object: {})] }

Why? Anyway, if we're using a name for the first element in the index array, let's do the same for the second one too.

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

         end
       end
     end
+
+    context 'with parents' do
+      let(:index) { CommentsIndex }
+      before do
+        stub_model(:comment)
+        stub_index(:comments) do
+          index_scope Comment
+          field :content
+          field :comment_type, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join_type: :comment_type, join_id: :commented_id
+        end
+      end
+
+      let!(:existing_comments) do
+        [
+          Comment.create!(id: 1, content: 'Where is Nemo?', comment_type: :question),
+          Comment.create!(id: 2, content: 'Here.', comment_type: :answer, commented_id: 1),
+          Comment.create!(id: 31, content: 'What is the best programming language?', comment_type: :question)
+        ]
+      end
+
+      def do_raw_index_comment(options:, data:)
+        CommentsIndex.client.index(options.merge(index: 'comments', type: '_doc', refresh: true, body: data))
+      end
+
+      def raw_index_comment(comment)
+        options = {id: comment.id, routing: root(comment).id}
+        comment_type = comment.commented_id.present? ? {name: comment.comment_type, parent: comment.commented_id} : comment.comment_type
+        do_raw_index_comment(
+          options: options,
+          data: {content: comment.content, comment_type: comment_type}
+        )
+      end
+
+      def root(comment)
+        current = comment
+        # slow, but it's OK, as we don't have too deep trees
+        current = Comment.find(current.commented_id) while current.commented_id
+        current
+      end
+
+      def routing_for(id)

Is it used anywhere?

mrzasa

comment created time in 10 days
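
The spec's `root` helper routes every comment to the top of its tree. The same walk can be sketched without a database; `CommentRow`, `root_for` and the lookup table are hypothetical stand-ins for the `Comment` model, and the cycle guard is an addition the spec does not have.

```ruby
# Hypothetical in-memory stand-in for the Comment model used in the spec.
CommentRow = Struct.new(:id, :commented_id)

# Follows commented_id links until it reaches a row without a parent.
# rows_by_id is an assumed lookup table: id => CommentRow.
def root_for(comment, rows_by_id)
  current = comment
  seen = {}
  while current.commented_id
    # Guard against malformed data; raises instead of looping forever.
    raise ArgumentError, "cycle at #{current.id}" if seen[current.id]

    seen[current.id] = true
    current = rows_by_id.fetch(current.commented_id)
  end
  current
end

rows = [
  CommentRow.new(1, nil), # question
  CommentRow.new(2, 1),   # answer to 1
  CommentRow.new(3, 2)    # vote on 2
].to_h { |row| [row.id, row] }

# Every document in the tree shares the root's id as its routing value.
routing = root_for(rows[3], rows).id.to_s
puts routing
```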

Pull request review comment toptal/chewy

Remove parent/child mapping

+# load it to a pry instance and run `simple_test`

Will this be removed in the final version?

mrzasa

comment created time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

+# Testing parent/child, remove before merging
+
+curl -X PUT localhost:9206/quiz?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'
+{
+  "mappings": {
+    "question": {
+      "properties": {
+        "id_field": {
+          "type": "keyword"
+        },
+        "content": {
+          "type": "text"
+        },
+        "comment_type": {
+          "type": "join",
+          "relations": {
+            "question": "answer"
+          }
+        }
+      }
+    }
+  }
+}
+EOF
+
+curl -X PUT localhost:9206/quiz/_doc/1?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'
+{
+  "id_field": "1",
+  "content": "2+3?",
+  "comment_type": "question"
+}
+EOF
+
+curl -X PUT localhost:9206/quiz/_doc/2?pretty=true -H 'Content-Type: application/json' -d  @- <<'EOF'
+{
+  "id_field": "2",
+  "content": "3+4?",
+  "comment_type": "question"
+}
+EOF
+
+
+curl -X PUT 'localhost:9206/quiz/_doc/3?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'
+{
+  "id_field": "3",
+  "content": "3!",
+  "comment_type": {
+    "name": "answer",
+    "parent": "1"
+  }
+}
+EOF
+
+
+curl -X PUT 'localhost:9206/quiz/_doc/4?routing=1&pretty=true' -H 'Content-Type: application/json' -d  @- <<'EOF'
+{
+  "id_field": "4",
+  "content": "4!",
+  "comment_type": {
+    "name": "answer",
+    "parent": "1"
+  }
+
+}
+EOF
+
+# fails with
+#          "type" : "document_missing_exception",
+#          "reason" : "[question][3]: document missing",
+curl -X POST 'localhost:9206/_bulk/?pretty=true' -H 'Content-Type: application/json' -d  '
+{ "update": { "_id": "3", "_index": "quiz", "_type": "question" }  }

Why do we use _type here?

mrzasa

comment created time in 10 days
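
For comparison, a bulk body can be built without `_type` at all. The sketch below assumes the `quiz` index from the script and puts a `routing` key in each action's metadata, which the Elasticsearch bulk API accepts; `bulk_body` is a hypothetical helper, and the source does not confirm whether routing resolves the `document_missing_exception` quoted in the script.

```ruby
require 'json'

# Hypothetical helper: builds a newline-delimited bulk body.
# Assumptions: the index is "quiz", document types are omitted, and each
# child action carries the id of its root document as routing.
def bulk_body(actions)
  lines = actions.flat_map do |action|
    # compact drops the routing key when no routing was given.
    meta = { _index: 'quiz', _id: action[:id], routing: action[:routing] }.compact
    entry = [JSON.generate(action[:op] => meta)]
    # delete actions have no source line; update/create/index do.
    entry << JSON.generate(action[:body]) if action[:body]
    entry
  end
  # The bulk endpoint expects a trailing newline after the last line.
  lines.join("\n") + "\n"
end

body = bulk_body([
  { op: :update, id: '3', routing: '1', body: { doc: { content: 'Changed answer!' } } },
  { op: :delete, id: '4', routing: '1' }
])
puts body
```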

Pull request review comment toptal/chewy

Remove parent/child mapping

+# Testing parent/child, remove before merging

Will this be removed in the final version?

mrzasa

comment created time in 10 days

push event toptal/chewy

Maciek Rząsa

commit sha a2d59ef72acd6e028a3584c9b9fa736c5d06d845

Update README.md

Co-authored-by: Ivan Rabotyaga <ivan.rabotyaga@toptal.com>

push time in 10 days

Pull request review comment toptal/chewy

Remove parent/child mapping

 end
 
 See the section on *Script fields* for details on calculating distance in a search.
 
+### Join fields
+
+You can use a [join field](https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html)
+to implement parent-child relationships between documents.
+It [replaces the old `parent_id` based parent-child mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#parent-child-mapping-types)
+
+To use it, you need to pass `relations`, `join_type` and `join_id` options:
+```ruby
+field :hierarchy_link, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join_type: :comment_type, join_id: :commented_id
+```
+assuming you have `comment_type` and `comment_id` fields in your model.
assuming you have `comment_type` and `commented_id` fields in your model.
mrzasa

comment created time in 10 days
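
To illustrate what the `join_type` and `join_id` options describe, here is a minimal sketch of turning a record's type and parent id into a join field value. The `RELATIONS` hash is copied from the README example; `join_value` is a hypothetical helper and does not show how Chewy composes the field internally.

```ruby
# Relations copied from the README example: parent type => child type(s).
RELATIONS = {
  question: %i[answer comment],
  answer: :vote,
  vote: :subvote
}.freeze

# Hypothetical helper: builds a join field value from the values that
# join_type (the record's type) and join_id (its parent id) would supply.
def join_value(type, parent_id, relations: RELATIONS)
  type = type.to_sym
  child_types = relations.values.flat_map { |value| Array(value) }
  known_types = relations.keys | child_types
  raise ArgumentError, "unknown join type #{type}" unless known_types.include?(type)

  # Without a parent id, use the shorthand string form.
  return type.to_s if parent_id.nil?
  raise ArgumentError, "#{type} cannot have a parent" unless child_types.include?(type)

  { 'name' => type.to_s, 'parent' => parent_id.to_s }
end

puts join_value(:question, nil).inspect
puts join_value(:answer, 1).inspect
```

Checking the type against the declared relations up front is a design choice of this sketch: it surfaces a mismatch in application code before the document is sent for indexing.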

push event toptal/chewy

Maciej Rzasa

commit sha 2dfdbcfa65470748e119af43fda072d69ccd2ee1

fixed spec

push time in 11 days

push event toptal/chewy

Maciej Rzasa

commit sha 42c8cd7cc5dbeb2ee6013d872a9e3aa97dcc462e

rubocop

push time in 11 days