Craig Ferguson CraigFe @tarides Paris craigfe.io OCaml enthusiast. Software engineer @tarides.

CraigFe/causal-rpc 20

A traceable distributed computation framework

CraigFe/brands 16

Defunctionalised higher-kinded polymorphism in OCaml.

CraigFe/diff 8

Diffing and edit-distance algorithms for OCaml

CraigFe/dissertation 4

University of Cambridge Part II Dissertation (2018 – 2019)

CraigFe/causal-rpc-talk 3

CausalRPC talk slides (ICFP 2019).

CraigFe/accessor 0

A library that makes it nicer to work with nested functional data structures

Pull request review comment mirage/irmin

Use seq in Tree.fold

 module Make (P : Private.S) = struct
         | `True -> empty_marks ()
         | `Marks n -> n
       in
+      let pre path bindings acc =
+        match pre with
+        | None -> Lwt.return acc
+        | Some pre ->
+            let s = Seq.fold_left (fun acc (s, _) -> s :: acc) [] bindings in
+            pre path s acc
+      in
+      let post path bindings acc =
+        match post with
+        | None -> Lwt.return acc
+        | Some post ->
+            let s = Seq.fold_left (fun acc (s, _) -> s :: acc) [] bindings in
+            post path s acc
+      in
       let rec aux : type r. (t, acc, r) folder =
        fun ~path acc d t k ->
         let apply acc = node path t acc in
         let next acc =
           match force with
-          | `True | `And_clear ->
-              (* XXX: Let's not call [to_map] when [Value] *)
+          | `And_clear -> (
+              match t.v with
+              | Map m ->
+                  if force = `And_clear then clear ~depth:0 t;
+                  (map [@tailcall]) ~path acc d (Some m) k
+              | Value (repo, _, _) | Hash (repo, _) ->
+                  let* v = to_value t >|= get_ok "fold" in
+                  if force = `And_clear then clear ~depth:0 t;

                  clear ~depth:0 t;
                  (map [@tailcall]) ~path acc d (Some m) k
              | Value (repo, _, _) | Hash (repo, _) ->
                  let* v = to_value t >|= get_ok "fold" in
                  clear ~depth:0 t;

(These checks are always true now.)

icristescu

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 struct

       let of_target : type ptr. ptr layout -> ptr t -> ptr = function
         | Total -> fun target -> Total_ptr target
-        | Partial _ ->
-            fun target -> { target = Some target; target_hash = target.hash }
+        | Partial _ -> fun target -> { target = Newie target }
         | Truncated -> fun target -> Intact target

       let of_hash : type ptr. ptr layout -> hash -> ptr = function
         | Total -> assert false
-        | Partial _ -> fun hash -> { target = None; target_hash = lazy hash }
+        | Partial _ -> fun hash -> { target = Lazy hash }
         | Truncated -> fun hash -> Broken hash

-      let iter_if_loaded :
+      let save :

I still like iter_if_loaded, but I don't strongly dislike save so I retract my comment :-) Apologies for the bikeshedding.

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 struct

       let of_target : type ptr. ptr layout -> ptr t -> ptr = function
         | Total -> fun target -> Total_ptr target
-        | Partial _ ->
-            fun target -> { target = Some target; target_hash = target.hash }
+        | Partial _ -> fun target -> { target = Newie target }
         | Truncated -> fun target -> Intact target

       let of_hash : type ptr. ptr layout -> hash -> ptr = function
         | Total -> assert false
-        | Partial _ -> fun hash -> { target = None; target_hash = lazy hash }
+        | Partial _ -> fun hash -> { target = Lazy hash }
         | Truncated -> fun hash -> Broken hash

-      let iter_if_loaded :
+      let save :

Sure, but it doesn't actually do the save: it's an iterator intended for use by a function that does save the values being iterated over. OTOH, we have precedent for this sort of naming (e.g. Tree.Contents.export doesn't actually do an export), so we can keep this as-is.

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
           v
       | _, v -> v

-    let hash c =
+    let hash ?(cache = true) c =

Hmm... quite :P I would have sworn this was possible.

Never mind then; sorry for the noise.

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
         acc ->
         acc Lwt.t =
      fun ~force ~uniq ~pre ~post ~path ?depth ~node ~contents ~tree t acc ->
+      let cache = force = `True in

I think it makes sense in this PR, but you're welcome to split it if you prefer.

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
         acc ->
         acc Lwt.t =
      fun ~force ~uniq ~pre ~post ~path ?depth ~node ~contents ~tree t acc ->
+      let cache = force = `True in

It might be a shame to lose the type info that the skip function is called only when ~force:false (and ignored otherwise). What about this?

?force:[ `True | `False of key -> 'a -> 'a Lwt.t ] -> ?cache:bool ->
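
To make the suggestion above concrete, here is a minimal, Lwt-free sketch of how placing the skip callback inside the `` `False `` constructor ties it to the non-forcing case. The `tree` type and the `load` and `fold` functions are hypothetical stand-ins for illustration, not Irmin's actual definitions:

```ocaml
(* Hypothetical, simplified model: a node is either already loaded or only
   known by its key. Real Irmin trees are lazy and Lwt-based. *)
type key = string list

type tree =
  | Loaded of (string * tree) list
  | Unloaded of key

(* The skip callback lives inside [`False], so it cannot be supplied
   together with [`True]. *)
type 'a force = [ `True | `False of (key -> 'a -> 'a) ]

(* Pretend backend (an assumption): every unloaded node has no children. *)
let load (_ : key) : (string * tree) list = []

let rec fold ?(force : 'a force = `True) ~(node : key -> 'a -> 'a)
    (path : key) (t : tree) (acc : 'a) : 'a =
  match (t, force) with
  | Unloaded k, `False skip -> skip k acc
  | Unloaded k, `True -> fold ~force ~node path (Loaded (load k)) acc
  | Loaded children, _ ->
      List.fold_left
        (fun acc (step, child) -> fold ~force ~node (path @ [ step ]) child acc)
        (node path acc) children
```

With this shape, a caller who writes `` ~force:`True `` has no way to pass a skip function that would be silently ignored.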

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
           v
       | _, v -> v

-    let hash c =
+    let hash ?(cache = true) c =

Yes, I was thinking:

module Contents = struct
  let hash ~cache t = (* ... *)
end

(* rest of the file *)

module Contents = struct
  include Contents
  let hash ?(cache = true) t = hash ~cache t
end

(This cure is perhaps worse than the disease though :P)

Ngoguey42

comment created time in a day

Pull request comment mirage/irmin

Use seq in Tree.fold

For the failing test, I couldn't find a nice solution so I commented out for now.

I thought about this a bit, and I think the failing test is highlighting a weird behaviour of the new implementation when ~force:`True.

The new strategy is to load the backend node value, then convert its child hashes to lazy values w/ Seq.map and force those, then discard the forced child pointers. I think it's odd that we discard the loaded child values after folding over them: this seems like something ~force:`And_clear would do. (Presumably if the user has explicitly chosen ~force:`True rather than ~force:`And_clear then they intend to load the entire tree into memory so that subsequent operations on the tree are fast).

As @icristescu points out, it's not possible to preserve the loaded values inside P.Node.Val.t (since Node.S supports neither caching semantics nor backpointers to a user-defined in-memory handle like Tree.Node.t). Perhaps we will want to do that some day (to save needless conversions between backend nodes and Stdlib.Maps), but perhaps in the meantime the correct approach is to call to_map on Value _ when ~force:`True? (but not when ~force:`And_clear)
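
A toy model of the distinction being argued for, assuming a much simpler node type than Irmin's: `` `True `` leaves loaded children in memory so that later traversals are cheap, whereas `` `And_clear `` drops them once they have been folded over. The `node` record, `load_children` and the `loads` counter are all hypothetical:

```ocaml
(* Toy model, not Irmin's implementation: a node either holds its children
   in memory or must reload them from its hash. *)
type node = { hash : int; mutable children : node list option }

(* Hypothetical backend: counts loads so that caching can be observed. *)
let loads = ref 0

let load_children (h : int) : node list =
  incr loads;
  if h >= 3 then [] else [ { hash = h + 1; children = None } ]

let rec fold ~(force : [ `True | `And_clear ]) (f : int -> 'a -> 'a)
    (n : node) (acc : 'a) : 'a =
  let children =
    match n.children with
    | Some cs -> cs
    | None ->
        let cs = load_children n.hash in
        (* Remember the loaded children for the duration of the fold. *)
        n.children <- Some cs;
        cs
  in
  let acc =
    List.fold_left (fun acc c -> fold ~force f c acc) (f n.hash acc) children
  in
  (match force with
  | `True -> () (* keep everything loaded for subsequent operations *)
  | `And_clear -> n.children <- None (* discard what was just loaded *));
  acc
```

In this model a second `` ~force:`True `` fold over the same root performs no backend loads, which is the behaviour the comment argues users would expect.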

icristescu

comment created time in a day

Pull request comment mirage/irmin

Use seq in Tree.fold

(Removed the no-changelog-needed tag as this is a breaking change to the ordering of Tree.fold.)

icristescu

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
     let fold ~force ~path f t acc =
       match force with
       | `True | `And_clear ->
-          let* c = to_value t in
-          if force = `And_clear then clear t;
+          let* c = to_value ~cache:(force = `True) t in

(As with Tree.fold below, I think it's probably best to pattern match here.)

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 struct

       let of_target : type ptr. ptr layout -> ptr t -> ptr = function
         | Total -> fun target -> Total_ptr target
-        | Partial _ ->
-            fun target -> { target = Some target; target_hash = target.hash }
+        | Partial _ -> fun target -> { target = Newie target }
         | Truncated -> fun target -> Intact target

       let of_hash : type ptr. ptr layout -> hash -> ptr = function
         | Total -> assert false
-        | Partial _ -> fun hash -> { target = None; target_hash = lazy hash }
+        | Partial _ -> fun hash -> { target = Lazy hash }
         | Truncated -> fun hash -> Broken hash

-      let iter_if_loaded :
+      let save :

iter_if_loaded was perhaps a more descriptive name for this function? Not sure what "save" means in this context.

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
       let info = { hash; map; value; findv_cache } in
       { v; info }

+    let of_map m = of_v (Map m)
+    let of_hash repo k = of_v (Hash (repo, k))
+    let of_value ?updates repo v = of_v (Value (repo, v, updates))
+
+    let cached_hash t =
+      match (t.v, t.info.hash) with
+      | Hash (_, h), None ->
+          let h = Some h in
+          t.info.hash <- h;
+          h
+      | _, h -> h
+
+    let cached_map t =
+      match (t.v, t.info.map) with
+      | Map m, None ->
+          let m = Some m in
+          t.info.map <- m;
+          m
+      | _, m -> m
+
+    let cached_value t =
+      match (t.v, t.info.value) with
+      | Value (_, v, None), None ->
+          let v = Some v in
+          t.info.value <- v;
+          v
+      | _, v -> v
+
+    let info_is_empty i =
+      i.map = None && i.value = None && i.findv_cache = None && i.hash = None
+
     let clear_info_fields i =
       if not (info_is_empty i) then (
         i.value <- None;
         i.map <- None;
         i.hash <- None;
         i.findv_cache <- None)

-    let rec clear_elt ~max_depth depth v =
-      match v with
-      | `Contents (c, _) -> if depth + 1 > max_depth then Contents.clear c
-      | `Node t -> clear ~max_depth (depth + 1) t
+    let rec clear_elt ~max_depth depth _ = function
+      | `Contents (c, _) -> Contents.clear c
+      | `Node t -> clear_node ~max_depth (depth + 1) t

-    and clear_info ~max_depth ?v depth i =
-      let clear _ v = clear_elt ~max_depth depth v in
-      let () =
-        match v with
-        | Some (Value (_, _, Some um)) ->
+    and clear_node ~max_depth depth n =
+      if depth = max_depth then (
+        clear_info_fields n.info;
+        match n.v with Value (_, v, _) -> P.Node.Val.clear v | _ -> ())

Why only call this in the depth = max_depth case?

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
           v
       | _, v -> v

-    let hash c =
+    let hash ?(cache = true) c =

Perhaps keep this as a non-optional parameter until the bottom of the file?

It's unambiguous & avoids unnecessary Some _ allocations in the process.
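
One possible shape for this, sketched on a made-up `Contents` module (the record type and the use of `Hashtbl.hash` are assumptions for illustration only): internal code passes a plain labelled `~cache`, and the optional argument with its default appears only in a wrapper at the end of the file.

```ocaml
(* Hypothetical module: [hash] memoises Hashtbl.hash in a mutable field. *)
module Contents = struct
  type t = { value : string; mutable cached_hash : int option }

  let of_value value = { value; cached_hash = None }

  (* Internal version: [~cache] is an ordinary labelled bool, so every
     call site has to state its caching choice explicitly. *)
  let hash ~cache t =
    match t.cached_hash with
    | Some h -> h
    | None ->
        let h = Hashtbl.hash t.value in
        if cache then t.cached_hash <- Some h;
        h
end

(* Internal callers use the labelled form directly. *)
let internal_use c = Contents.hash ~cache:false c

(* Bottom of the file: re-export with an optional argument and default. *)
module Contents = struct
  include Contents

  let hash ?(cache = true) t = hash ~cache t
end
```

External callers can then write `Contents.hash c` and get the caching default, while nothing inside the file can omit the flag by accident.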

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 struct
       | Partial : (hash -> partial_ptr t option) -> partial_ptr layout
       | Truncated : truncated_ptr layout

-    and partial_ptr = {
-      target_hash : hash Lazy.t;
-      mutable target : partial_ptr t option;
-    }
-    (** [mutable target : partial_ptr t option] could be turned to
-        [target : partial_ptr t Lazy.t] to make the code even clearer (we never
-        set it back to [None]), but we might soon implement a garbage collection
-        method for inodes that will necessitate that mutable option (among other
-        things). *)
+    and partial_ptr_target =
+      | Newie of partial_ptr t

I'm not sure what "Newie" means in this context – is it the same concept as "dirty"?

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
         acc ->
         acc Lwt.t =
      fun ~force ~uniq ~pre ~post ~path ?depth ~node ~contents ~tree t acc ->
+      let cache = force = `True in

This puzzled me for a bit, but as I understand it the choice of cache is irrelevant in the case `False, and this is just switching on `True | `And_clear. If so, maybe let's use pattern matching and add a comment to that effect in the `False case; then we're robust to extension of the force type here.
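
A small sketch of the difference, using a simplified `force` type with no payload on `` `False `` (an assumption made to keep the example short). The two functions agree today; only the second would be flagged by the compiler if `force` gained a constructor.

```ocaml
(* Simplified stand-in for the real [force] type. *)
type force = [ `True | `False | `And_clear ]

(* Boolean shortcut, as in the diff: any constructor added later would
   silently fall into the [false] case. *)
let cache_by_equality (force : force) : bool = force = `True

(* Exhaustive match: extending [force] triggers a non-exhaustive match
   warning here until the new case is handled. *)
let cache_by_match (force : force) : bool =
  match force with
  | `True -> true
  | `And_clear -> false
  | `False ->
      (* Nothing is loaded in this case, so the choice is irrelevant. *)
      false
```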

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module Make (P : Private.S) = struct
     | `Contents (k, m) -> `Contents (Contents.of_hash repo k, m)

   let export ?clear repo contents_t node_t n =
+    let non_impacting = false in

non_impacting = false reads to me that it is actually impacting :P Perhaps:

 let cache =
   (* This choice of [cache] flag has no impact, since we either immediately clear the
      corresponding cache or are certain that it is already filled. *)
   false
 in

Ngoguey42

comment created time in a day

Pull request review comment mirage/irmin

Rework caching

 module type S = sig
   val of_list : (step * value) list -> t
   (** [of_list l] is the node [n] such that [list n = l]. *)

-  val list : ?offset:int -> ?length:int -> t -> (step * value) list
+  val list :
+    ?offset:int -> ?length:int -> ?cache:bool -> t -> (step * value) list
   (** [list t] is the contents of [t]. [offset] and [length] are used to
-      paginate results.*)
+      paginate results. [cache] regulates the caching behaviour regarding the
+      node's internal data which are lazily loaded.
+
+      [cache] defaults to [true] which may greatly reduce the IOs and the
+      runtime but may also grealy increase the memory consumption.
+
+      [cache = false] doesn't replace a call to [clear], it only prevents the
+      storing of new data, it doesn't discard the existing one. *)

Perhaps add a {2 caching} section below and reference it with e.g. "See {!caching} for an explanation of the [?cache] parameter", to avoid duplicating this big docstring.

"the node's internal data which are lazily loaded"

This isn't strictly true, since it's implementation dependent (and neither irmin-git/node nor irmin/node actually does this). Perhaps "... which may be lazily loaded from the backend, depending on the node implementation."
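
Roughly what that could look like in the signature, as a sketch only: the abstract types are placeholders, the section wording is paraphrased from the docstring in the diff, and the exact odoc label and reference syntax (`{2:caching ...}` and `{!caching}`) should be checked against the odoc version in use.

```ocaml
module type S = sig
  type t
  type step
  type value

  (** {2:caching Caching}

      Functions in this signature take an optional [?cache] flag. It
      defaults to [true]. Passing [~cache:false] only prevents new data
      from being stored; it does not discard data that is already cached
      and so does not replace a call to [clear]. Whether data is lazily
      loaded from the backend at all depends on the node implementation. *)

  val list :
    ?offset:int -> ?length:int -> ?cache:bool -> t -> (step * value) list
  (** [list t] is the contents of [t]. [offset] and [length] are used to
      paginate results. See {!caching} for an explanation of the [?cache]
      parameter. *)
end
```

Other functions that gain a `?cache` argument could then point at the same section instead of repeating the explanation.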

Ngoguey42

comment created time in a day

Push event CraigFe/irmin

Ngoguey42

commit sha 5d1da25db9c2484f47dbf732757126d834f39274

irmin-mirage: Add missing constraint on Irmin_mirage_git.KV_make

Craig Ferguson

commit sha f9eaa648a95cedf77208c7f45fea2477ea592904

Merge pull request #1521 from Ngoguey42/fix-mirage-git-kv

irmin-mirage: Add missing constraint on Irmin_mirage_git.KV_maker

Thomas Gazagnaire

commit sha 1f7f310269a8f7a077921f6d13089e232c749aa7

irmin: use sequences instead of list for node constructors

This avoids allocating large intermediate lists.

Ngoguey42

commit sha 4c90fa54f7a26f1a53cce55597fe044cd93be56a

irmin: Rewrite Tree.export

Thomas Gazagnaire

commit sha 0508968bbea1411e9c7ed229ee4d4f1b2addd265

Update CHANGES

ngoguey

commit sha 9d8cae02b2ffb916813a395a444a2f19aef2befd

Merge pull request #1508 from Ngoguey42/new-tree-export

New tree export

Craig Ferguson

commit sha 13505745510006c46fb36ff6849c25a35980993d

Add support for non-content-addressed backends

Craig Ferguson

commit sha e8786891ce100e0e15593ffd58cbf2239d9a7bae

irmin-test: support non-content-addressable backends

Craig Ferguson

commit sha bd97c4c4fee2bb669b27c4b83babd741f1f15143

irmin-test: remove unnecessary module type aliases

Craig Ferguson

commit sha 4b97f44763650a9323ef6e1b04ad8ba8bcc52015

irmin: do more work to cache backend nodes during hashing

Craig Ferguson

commit sha 5335696406fe614e5bb95a9297ae9edd87bfc427

irmin: supply a portable node implementation for use with Node.Make

Craig Ferguson

commit sha f95d4282e132219665b2f66c50a3ffc01fd28736

irmin: provide a utility functor for generic-key stores

Craig Ferguson

commit sha ef8712a97c9b80a973e3d0408bd565be85069325

irmin: use hashes as keys in Dot / Sync

These modules deal with _portable_ store contents, so shouldn't expose any instance-specific key information.

Craig Ferguson

commit sha 39bd8c95d1e9bcd93dcefa7653c0704ad884af38

irmin-mem: remove dead code in tests

Craig Ferguson

commit sha d6d9e7ea96e25ade198dadd1443375efcf59071c

irmin: cache contents keys in Tree slightly more often

Craig Ferguson

commit sha 75bd78a42e2de2bbd26c6da0477436fbac978843

irmin: re-add Store.{Contents,Tree}.of_hash functions

These functions now explicitly invoke `index` functions in the backend stores to resolve hashes to keys before attempting `find` / `mem`.

Craig Ferguson

commit sha 28071e6c2e17f3f1f852ef9c34dd0212a4462abf

irmin: generalise contents keys to support inlining

This commit introduces a distinction between the type of contents keys and node / commit key types. The former now takes a type parameter corresponding to the type of the _value_ that the key points to, allowing content keys to inline the value inside the key. In future, we may want to add this capability to the other key types as well, but this is more complex to implement as it introduces a cycle: i.e. node values contain node keys, which may then contain recursive node values.

Craig Ferguson

commit sha f25e873e146effd33e052552f07e139ec8a78459

irmin: add initial tests of a generic-keyed backend

Craig Ferguson

commit sha a6370cd004050c3bb2b6ff77dcdb5294e17b6cfe

irmin-test: provide defaults for certain suite hook functions

Craig Ferguson

commit sha 0c1f32765e8c634beb24d8f90e028088d2fda6ec

irmin: add test of generic keys using a store with inlined contents

push time in 3 days
