If you are wondering where the data of this site comes from, please visit https://api.github.com/users/Kerollmops/events. GitMemory does not store any data, but only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.
Clément Renault (Kerollmops) | Meili | France, Paris | meilisearch.com | @meilisearch Co-Founder and CTO, @42School 2013 Alumni

Kerollmops/atom-header-42 13

An atom package to add the 42 header to your files

Kerollmops/atom-norminette-linter 10

A linter for the 42 students which use Atom

Kerollmops/canonical-raft 8

The, not released yet, simplest Rust library to replicate anything over the network

Kerollmops/chunk-json-lite 4

A little tool to split a json into multiple valid json array of a given max size

Kerollmops/csv2json-lite 3

A little tool to convert a csv to a valid json array of object

Kerollmops/astar-iter 2

An A* algorithm which iterates over shortest and next shortests paths

Kerollmops/brainfrust 2

A translation tool from brainfuck into Rust

Kerollmops/bulk-rename 2

A little tool to rename all the files in the current directory using your editor

Kerollmops/cml 2

Coroutines memory lookups - Permit multi memory lookups inside coroutines

Kerollmops/csv-generate-ids 2

A simple tool to generate unique sequential identifiers

push event Kerollmops/grenad

Clément Renault

commit sha a6eb52102e9c61942f26abacbd62e7f49b9efca3

Simplify iterating by moving on first/last when uninitialized

push time in 12 hours

push event Kerollmops/grenad

Clément Renault

commit sha e7a1e2b61a948f3b3fa902ae74899ca4ccac94ea

Implement the legacy next method on the Reader type

push time in a day

push event Kerollmops/grenad

Clément Renault

commit sha 3ca7e8af65a1a59704d3d45920d1ee7f0950503d

Revert prefixing by Stream

push time in a day

push event Kerollmops/grenad

Clément Renault

commit sha 6300585955ca4b906c01017cb1109283593173a4

Introduce the BlockWriterBuilder struct

push time in a day

push event Kerollmops/grenad

Kerollmops

commit sha 2b5d337dbe5c67c942f9ba15ef72aae1bd3fdfb8

Implement multiple block cursor methods and iterators

push time in a day

push event Kerollmops/grenad

Kerollmops

commit sha b10c23b7ea8d534df9073836381c2f0a8dde41a2

Implement multiple block cursor methods and iterators

push time in a day

push event Kerollmops/grenad

Kerollmops

commit sha 0bcc316b6e81b3a3ea66d853ee65740ccd23101c

Implement multiple block cursor methods and iterators

push time in a day

push event Kerollmops/grenad

Kerollmops

commit sha ad3a651eca2619ff32afc5baca49f69cf4fa238e

Implement multiple block cursor methods and iterators

push time in a day

issue comment RoaringBitmap/roaring-rs

The `insert_range` method does not properly handle boundary condition

Hi @Kerollmops

I've pushed another fix based on your branch (you can find it here), trying to use RangeBounds as the parameter to fix the inner insert_range and remove_range methods.

I like what you have done. I didn't find time to review it in detail, but what I can see is that you changed a few too many things in this commit; for example, you changed !0u64 to u64::MAX and things like that. You also changed and added tests in this commit. I would prefer that you split the changes into multiple commits and not touch the current logic too much. It would be much easier for me to review your changes bit by bit 😄

  1. In the Rust std lib, RangeBounds are always used with <usize>, but I think a solid boundary is much more meaningful in our scenario, avoiding boundary checks. What do you think?

I think the user-side API must, indeed, expose a RangeBounds<u32> for the Bitmap and a RangeBounds<u64> for the Treemap.

  2. We use RangeBounds, which allows three kinds of bound on each side. But, in fact, only Range ([a, b)) and RangeInclusive ([a, b]) are effective, am I right?

It would be easier to just support all of them: as we know the real boundaries, i.e. u32::MIN and u32::MAX, we can easily convert unbounded ranges to fully bounded ones.

oliverdding

comment created time in 3 days
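
The conversion mentioned in the reply above (turning any RangeBounds<u32> into a fully bounded range) can be sketched with the standard library alone. This is only an illustration and not the roaring-rs implementation; the helper name `to_inclusive` is hypothetical.

```rust
use std::ops::{Bound, RangeBounds, RangeInclusive};

/// Hypothetical helper (not part of roaring-rs): normalizes any
/// `RangeBounds<u32>` into an inclusive range, or `None` when it is empty.
fn to_inclusive<R: RangeBounds<u32>>(range: R) -> Option<RangeInclusive<u32>> {
    let start = match range.start_bound() {
        Bound::Included(&s) => s,
        // An excluded start of u32::MAX leaves nothing to cover.
        Bound::Excluded(&s) => s.checked_add(1)?,
        // Unbounded on the left means "from the smallest value".
        Bound::Unbounded => u32::MIN,
    };
    let end = match range.end_bound() {
        Bound::Included(&e) => e,
        // An excluded end of 0 leaves nothing to cover.
        Bound::Excluded(&e) => e.checked_sub(1)?,
        // Unbounded on the right means "up to the largest value".
        Bound::Unbounded => u32::MAX,
    };
    if start <= end {
        Some(start..=end)
    } else {
        None
    }
}
```

With a helper of this shape, `..`, `1..5` and `3..=3` all reduce to one inclusive form, and an empty range such as `5..5` yields `None`, so the inner insert and remove logic would only need to handle a single kind of range.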

push event Kerollmops/grenad

Kerollmops

commit sha c5e0f154a067ae0802b6076f7c6c55b528770a64

Implement multiple block cursor methods and iterators

push time in 3 days

push event Kerollmops/grenad

Kerollmops

commit sha 9f8267c62587de3dead3aa42517230d771898de7

Implement multiple block cursor methods and iterators

push time in 3 days

started DataDog/glommio

started time in 4 days

Pull request review comment meilisearch/milli

Geosearch

 impl Criterion for Geo<'_> {
 fn geo_point(
     rtree: &RTree<GeoPoint>,
-    candidates: RoaringBitmap,
+    mut candidates: RoaringBitmap,
     point: [f64; 2],
 ) -> Box<dyn Iterator<Item = RoaringBitmap>> {
     let results = rtree
         .nearest_neighbor_iter(&point)
-        .filter_map(move |point| candidates.contains(point.data).then(|| point.data))
+        .filter_map(move |point| candidates.remove(point.data).then(|| point.data))

This doesn't make the iteration stop. You should add something like a take_while or, even better, rewrite this iterator combination as a for loop instead.

irevoire

comment created time in 4 days
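
A minimal sketch of the for-loop rewrite suggested above. It assumes a `BTreeSet<u32>` in place of the `RoaringBitmap` and a plain iterator of ids in place of `nearest_neighbor_iter`, so it runs with the standard library only; `collect_nearest` is a hypothetical name and not part of milli.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the review suggestion: walk the ids in
/// nearest-first order and stop as soon as every candidate has been found.
/// Returns the matched ids in order and the number of ids visited.
fn collect_nearest<I>(nearest: I, mut candidates: BTreeSet<u32>) -> (Vec<u32>, usize)
where
    I: IntoIterator<Item = u32>,
{
    let mut found = Vec::with_capacity(candidates.len());
    let mut visited = 0;

    // Nothing to look for: do not touch the iterator at all.
    if candidates.is_empty() {
        return (found, visited);
    }

    for id in nearest {
        visited += 1;
        // `remove` returns true only when the id was a pending candidate.
        if candidates.remove(&id) {
            found.push(id);
            if candidates.is_empty() {
                // Every candidate is accounted for, so stop iterating.
                break;
            }
        }
    }

    (found, visited)
}
```

For example, with ids `[7, 3, 9, 1, 4, 8]` and candidates `{3, 1}`, the loop returns `[3, 1]` after visiting only four ids, whereas a `filter_map` over `remove` alone would still walk the remaining two.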

PullRequestReviewEvent
PullRequestReviewEvent

push event Kerollmops/grenad

Kerollmops

commit sha cb256619f871123f5a307a807745cb4619ed1f49

Implement multiple block cursor methods and iterators

push time in 4 days

push event Kerollmops/grenad

Kerollmops

commit sha 0a8ab241d242ac656df051addb2a4dc69973cc3f

Implement multiple block cursor methods and iterators

push time in 4 days

push event Kerollmops/grenad

Kerollmops

commit sha b801f0facbc919d8dfd7c3eddd8b62e3d8ea7367

Reintroduce the block index footer for futur get, iter operations

push time in 4 days

push event meilisearch/milli

Kerollmops

commit sha 2741aa8589cc69bb51a6e7f530f52a276b60366a

Update the indexing timings in the README

push time in 4 days

PR opened meilisearch/milli

Update the README

This PR slightly updates the README, more specifically the indexing times. Fixes #352.

+7 -8

0 comment

1 changed file

pr created time in 5 days

push event meilisearch/milli

Kerollmops

commit sha 3492e8fcf325057473683ca2cf0344ee56eae7e7

Update the indexing timings in the README

push time in 5 days

push event Kerollmops/grenad

Kerollmops

commit sha dca4556858a7fd448be02eff0c84a75e2d864598

Only enable snappy as the default compression type

push time in 5 days

push event Kerollmops/grenad

Kerollmops

commit sha 67e5fb3f5c8b503d44f564aacc3f27391950821e

Only enable snappy as the default compression type

push time in 5 days

PullRequestReviewEvent

issue closed meilisearch/milli

UUID is not automatically generated when POST a document using CSV

Following the README.md instructions, you can reproduce the issue by starting the http-ui service with:

cargo run --release -- --db my-database.mdb -vvv --indexing-jobs 8

and then trying to add the documents using this command:

printf "name,age\nhello,32\nkiki,24\n" | http POST 127.0.0.1:9700/documents content-type:text/csv

The server answers with: 200 OK

But no document was added to the DB. Instead, the errors were inserted in the DB, in the file my-database.mdb/updates.mdb/data.mdb, as shown in the picture attached to the issue.

closed time in 5 days

catunlock

issue comment meilisearch/milli

UUID is not automatically generated when POST a document using CSV

Hey @catunlock,

That has indeed been removed from the program; we disabled it by default. You must specify the id field that the engine must use to identify the documents. I am updating the README to fix this issue.

Thank you 😺

catunlock

comment created time in 5 days

create branch meilisearch/milli

branch : update-readme

created branch time in 5 days

pull request comment meilisearch/MeiliSearch

Use tikv-jemallocator instead of jemallocator

bors cancel

felixonmars

comment created time in 5 days

Pull request review comment meilisearch/milli

Geosearch

+use std::iter;
+
+use roaring::RoaringBitmap;
+use rstar::RTree;
+
+use super::{Criterion, CriterionParameters, CriterionResult};
+use crate::search::criteria::{resolve_query_tree, CriteriaBuilder};
+use crate::{GeoPoint, Index, Result};
+
+pub struct Geo<'t> {
+    index: &'t Index,
+    rtxn: &'t heed::RoTxn<'t>,
+    parent: Box<dyn Criterion + 't>,
+    candidates: Box<dyn Iterator<Item = RoaringBitmap>>,
+    allowed_candidates: RoaringBitmap,
+    bucket_candidates: RoaringBitmap,
+    rtree: Option<RTree<GeoPoint>>,
+    point: [f64; 2],
+}
+
+impl<'t> Geo<'t> {
+    pub fn new(
+        index: &'t Index,
+        rtxn: &'t heed::RoTxn<'t>,
+        parent: Box<dyn Criterion + 't>,
+        point: [f64; 2],
+    ) -> Result<Self> {
+        let candidates = Box::new(iter::empty());
+        let allowed_candidates = index.geo_faceted_documents_ids(rtxn)?;
+        let bucket_candidates = RoaringBitmap::new();
+        let rtree = index.geo_rtree(rtxn)?;
+
+        Ok(Self {
+            index,
+            rtxn,
+            parent,
+            candidates,
+            allowed_candidates,
+            bucket_candidates,
+            rtree,
+            point,
+        })
+    }
+}
+
+impl Criterion for Geo<'_> {
+    fn next(&mut self, params: &mut CriterionParameters) -> Result<Option<CriterionResult>> {
+        let rtree = self.rtree.as_ref();
+
+        loop {
+            match self.candidates.next() {
+                Some(mut candidates) => {
+                    candidates -= params.excluded_candidates;
+                    self.allowed_candidates -= &candidates;
+                    return Ok(Some(CriterionResult {
+                        query_tree: None,
+                        candidates: Some(candidates),
+                        filtered_candidates: None,
+                        bucket_candidates: Some(self.bucket_candidates.clone()),
+                    }));
+                }
+                None => match self.parent.next(params)? {
+                    Some(CriterionResult {
+                        query_tree,
+                        candidates,
+                        filtered_candidates,
+                        bucket_candidates,
+                    }) => {
+                        let mut candidates = match (&query_tree, candidates) {
+                            (_, Some(candidates)) => candidates,
+                            (Some(qt), None) => {
+                                let context = CriteriaBuilder::new(&self.rtxn, &self.index)?;
+                                resolve_query_tree(&context, qt, params.wdcache)?
+                            }
+                            (None, None) => self.index.documents_ids(self.rtxn)?,
+                        };
+
+                        if let Some(filtered_candidates) = filtered_candidates {
+                            candidates &= filtered_candidates;
+                        }
+
+                        match bucket_candidates {
+                            Some(bucket_candidates) => self.bucket_candidates |= bucket_candidates,
+                            None => self.bucket_candidates |= &candidates,
+                        }
+
+                        if candidates.is_empty() {
+                            continue;
+                        }
+                        self.allowed_candidates = &candidates - params.excluded_candidates;
+                        self.candidates = match rtree {
+                            Some(rtree) => {
+                                geo_point(rtree, self.allowed_candidates.clone(), self.point)
+                            }
+                            None => Box::new(std::iter::empty()),
+                        };
+                    }
+                    None => return Ok(None),
+                },
+            }
+        }
+    }
+}
+
+fn geo_point(
+    rtree: &RTree<GeoPoint>,
+    candidates: RoaringBitmap,
+    point: [f64; 2],
+) -> Box<dyn Iterator<Item = RoaringBitmap>> {
+    let results = rtree
+        .nearest_neighbor_iter(&point)
+        .filter_map(move |point| candidates.contains(point.data).then(|| point.data))
+        .map(|id| iter::once(id).collect::<RoaringBitmap>())
+        .collect::<Vec<_>>();

We should not iterate through the whole list of points; instead, we should stop once all the candidates have been found, by doing a candidates.remove().is_some() instead of a simple contains.

irevoire

comment created time in 5 days
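
One way to stop early while keeping an iterator chain is `Iterator::scan`, which ends the iteration as soon as its closure returns `None`. The sketch below is only an illustration under stated assumptions: a `BTreeSet<u32>` stands in for the `RoaringBitmap`, any iterator of ids stands in for `nearest_neighbor_iter`, and `nearest_until_found` is a hypothetical name, not part of milli.

```rust
use std::collections::BTreeSet;

/// Hypothetical sketch: lazily yields the candidate ids in nearest-first
/// order and ends the iteration once no candidate is left to find.
fn nearest_until_found<I>(nearest: I, candidates: BTreeSet<u32>) -> impl Iterator<Item = u32>
where
    I: IntoIterator<Item = u32>,
{
    nearest
        .into_iter()
        // The set of remaining candidates is the state carried by `scan`.
        .scan(candidates, |remaining, id| {
            if remaining.is_empty() {
                // Returning `None` from the closure stops the whole iterator.
                return None;
            }
            // `Some(Some(id))` for a match, `Some(None)` to skip this id.
            Some(remaining.remove(&id).then_some(id))
        })
        // Drop the skipped ids and keep only the matches.
        .flatten()
}
```

Because the chain stays lazy, it terminates even over an unbounded source of ids: once the last candidate has been removed, the next call to the closure returns `None`.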

PullRequestReviewEvent