If you are wondering where the data of this site comes from, please visit https://api.github.com/users/VaysseRobin/events. GitMemory does not store any data, but only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.
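As a rough illustration of the kind of data behind this feed, the sketch below counts events per type and repository. It assumes each event is a dictionary with a `type` field and a nested `repo.name` field, as in the GitHub events API response; the sample records are hand-written stand-ins, not real data, and `summarize_events` is a hypothetical helper name.

```python
from collections import Counter


def summarize_events(events):
    """Count events per (event type, repository name) pair."""
    counts = Counter()
    for event in events:
        # Fall back to placeholders if a field is missing.
        event_type = event.get("type", "UnknownEvent")
        repo_name = event.get("repo", {}).get("name", "unknown/unknown")
        counts[(event_type, repo_name)] += 1
    return counts


# Hand-written sample shaped like the API response (illustrative only).
sample_events = [
    {"type": "PushEvent", "repo": {"name": "online-ml/river"}},
    {"type": "PushEvent", "repo": {"name": "online-ml/river"}},
    {"type": "WatchEvent", "repo": {"name": "scikit-mine/scikit-mine"}},
]
summary = summarize_events(sample_events)
```

In the GitHub API, starring a repository is reported as a `WatchEvent`, which is what the "started" entries in this feed appear to correspond to.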
VaysseRobin IRIT Toulouse, France PhD student in signal processing

MaxHalford/idao-2020-qualifier 16

Solution of team "Data O Plomo" to the qualification phase of the 2020 edition of the International Data Analysis Olympiad (IDAO)

MaxHalford/idao-2020-final 7

Solution of team "Data O Plomo" to the final phase of the 2020 edition of the International Data Analysis Olympiad (IDAO)

VaysseRobin/creme 0

:loop: Online machine learning in Python

started helboukkouri/character-bert

started time in 8 hours

push event online-ml/river

smastelini

commit sha 9356e9c534a0d64205d493f1d22ffce4e775f877

rerun notebook

push time in 10 hours

push event online-ml/river

smastelini

commit sha d12c913bb60d7bb085eab36cf07011e8979490fb

Add coverage tests

smastelini

commit sha 7810820687c9319022924a88f3b64c707258e595

pre-commit actions

push time in 10 hours

pull request comment online-ml/river

Make tree splitters public

The failing tests are related to classifier chains. I believe they were recently updated. Pinging @MaxHalford.

smastelini

comment created time in 13 hours

pull request comment online-ml/river

Make tree splitters public

Hey @MaxHalford and @jacobmontiel, I added a page to the user guides that walks through the tree module. It covers the different tree models available, model inspection, memory management, and splitters. I hope it helps users.

Your feedback is welcome.

(Later today I'll add some basic coverage tests for the splitters)

smastelini

comment created time in 13 hours

push event online-ml/river

Max Halford

commit sha 235bb742f458527bc5a656440ff187525df46d2a

Make Metric inherit from base.Base

Max Halford

commit sha 362f5e5e86f5ee20ec1f49d988c7cb0c4a0ddff7

init track logic

Max Halford

commit sha 6238410535c730df4a3cf03467d2f7688616eb58

finish benchmark logic

Max Halford

commit sha f6af4cbdbb2a95c63e85a51783af8cfaf9018fde

fix style

Max Halford

commit sha a2b31917aa9df8e88d9507b4a4e5acbed245190d

fix tests

Max Halford

commit sha 93a3361e44d628fbe60c3171665cfdec0d97ed4c

fix tests

Max Halford

commit sha 5f11a104bb012cdc051ac20b047e29c83ded572f

Creme -> River

hoanganhngo610

commit sha c7d355dd24ad226d5050f95235c969e50c43a27f

Add the DenStream clustering algorithm

Max Halford

commit sha ba5bebc2866dbd910db3bcde2ed69cc761199551

Merge pull request #471 from online-ml/benchmarks Benchmarks refactoring

raphaelsty

commit sha 6dbf473c28b3e9c8761592370eb96f8ac21f6849

Add exemple sentence classification

raphaelsty

commit sha bc56193a060428297128f30dafbbd9605472b2a1

Update sentence classification example

Buster Styren

commit sha 5afbf72ca054d1efcbf8693123260d4103b1a463

add no_learn parameter to pipeline prediction to avoid updating transformers

Buster Styren

commit sha d5f9d81c26a324c82cffc74e066d2b755034bbe0

add tests verifying scaler updating

Buster Styren

commit sha 47d6ac679f73aa09a98b583d60abcbe7e22c9d79

s/no_learn/learn_unsupervised

Max Halford

commit sha 02fa2386d2c37b1f55b085339b4cfd8d527a7289

Merge pull request #503 from Styren/add-no-learn Add no learn parameter to pipeline transformations

Buster Styren

commit sha 75c6525e1f9acfc469400af3b18bc10dd77ee30c

parametrize tests for proba and score_one

Buster Styren

commit sha 55185c10955e239cc9a4dcff61a31af5e8265cea

lint

Max Halford

commit sha 9c9179663557d46c0603e8959218c67cef3f3bcd

Merge pull request #505 from Styren/add-no-learn Parametrize tests for proba-methods

raphaelsty

commit sha f5fdb24fd43289c6b0d0a929c3ebdaee51a9ff18

Merge remote-tracking branch 'upstream/master'

Max Halford

commit sha 34406cb6b59281dead2a1069bc56e0bd4dd8877e

Allow multioutput chain to ingest new outputs

push time in 13 hours

pull request comment online-ml/river

Add four internal metrics for incremental clustering (Cohesion, SSQ, Separation and Silhouette)

Yes you're probably right @jacobmontiel. I agree that a table would be very helpful :)

hoanganhngo610

comment created time in a day

pull request comment online-ml/river

Add four internal metrics for incremental clustering (Cohesion, SSQ, Separation and Silhouette)

> My opinion would be to put these clustering metrics in the cluster module.

This could be confusing to the user and would break the existing structure. I vote to keep them inside metrics. Also, some metrics (external) are used for both classification and clustering.

We can improve the documentation of this module. For example, the scikit-learn documentation has a table:

[image: table from the scikit-learn documentation]

hoanganhngo610

comment created time in a day

started scikit-mine/scikit-mine

started time in 2 days

started WICG/floc

started time in 2 days

created repository MaxHalford/bbc-weather-honolulu

☀️ Measuring the accuracy of BBC weather in Honolulu, USA

created time in 2 days

push event online-ml/river

github-actions

commit sha 2daf394ea1d8d4fa7081b520bfe99df8e4391fa5

Deployed c84531ae to dev with MkDocs 1.1.2 and mike 0.5.3

push time in 2 days

PR merged online-ml/river

Sentence classification - Doc - Example Type: Documentation

Hi team,

Here is a small pull request that adds an example to the docs. I created a notebook dedicated to sentence classification using the SMSSpam dataset. This tutorial is related to discussion #482.

Goals of this tutorial:

  • Handle text using river.
  • Use naive_bayes with TFIDF.
  • Use logistic regression for classification with TFIDF.
  • Link to the imblearn tutorial and module, because the target distribution of the SMSSpam dataset is not uniform.
  • Mainly, integrate pre-trained word embeddings (spaCy) into river pipelines.

Sorry, I did not ask beforehand whether this would be useful, but I think it's great to have more examples in the documentation.

Thank you in advance for the review.

Raphaël

+1524 -0

2 comments

5 changed files

raphaelsty

pr closed time in 2 days
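To make the TF-IDF goals listed in the pull request above more concrete, here is a minimal pure-Python sketch of an incremental TF-IDF transformer. It does not use river; `ToyOnlineTFIDF` is a hypothetical class, the whitespace tokenizer is a simplification, and the smoothed IDF formula follows the scikit-learn convention, which is an assumption rather than a statement about river's implementation.

```python
import math
from collections import Counter


class ToyOnlineTFIDF:
    """Illustrative incremental TF-IDF; not river's TFIDF class."""

    def __init__(self):
        self.n_docs = 0            # number of documents seen so far
        self.doc_freq = Counter()  # term -> number of documents containing it

    def learn_one(self, text):
        # Update document frequencies with the distinct terms of one document.
        self.n_docs += 1
        self.doc_freq.update(set(text.lower().split()))
        return self

    def transform_one(self, text):
        # Weight each term by term frequency times smoothed inverse document frequency.
        term_counts = Counter(text.lower().split())
        total = sum(term_counts.values())
        weights = {}
        for term, count in term_counts.items():
            tf = count / total
            idf = math.log((1 + self.n_docs) / (1 + self.doc_freq[term])) + 1
            weights[term] = tf * idf
        # L2-normalise so documents of different lengths are comparable.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        return {t: w / norm for t, w in weights.items()} if norm else weights


tfidf = ToyOnlineTFIDF()
tfidf.learn_one("free prize now")
tfidf.learn_one("see you now")
vec = tfidf.transform_one("free now")
```

Because "now" appears in both learned documents and "free" in only one, "free" receives the larger weight in `vec`.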

push event online-ml/river

raphaelsty

commit sha 6dbf473c28b3e9c8761592370eb96f8ac21f6849

Add exemple sentence classification

raphaelsty

commit sha bc56193a060428297128f30dafbbd9605472b2a1

Update sentence classification example

raphaelsty

commit sha f5fdb24fd43289c6b0d0a929c3ebdaee51a9ff18

Merge remote-tracking branch 'upstream/master'

raphaelsty

commit sha 665098a1671275e6253a8b16afbabbd4a324b5e1

Merge remote-tracking branch 'upstream/master'

raphaelsty

commit sha 0429c040fb72e8212e538a92f45c579d1e895b97

Update after Max review. Link betweens tutorials are ok.

Max Halford

commit sha c84531aeb532619cdf5dbae7aba690e8275f2fca

Merge pull request #502 from raphaelsty/master Sentence classification - Doc - Example

push time in 2 days

pull request comment online-ml/river

Sentence classification - Doc - Example

@MaxHalford Again, thank you for the review. I addressed all your comments and updated the code and comments accordingly. The link to the imblearn tutorial works fine locally. By replacing the standard scaler with normalize, we improved the results of logistic regression + TF-IDF. I also made some updates to the description.

raphaelsty

comment created time in 2 days

push event online-ml/river

github-actions

commit sha 20de2cf42350b2506f9acc989332a22de7c786c9

Deployed 178ff024 to dev with MkDocs 1.1.2 and mike 0.5.3

push time in 2 days

push event online-ml/river

Max Halford

commit sha 178ff02448a6c0c71c987ae9d23128b2552c440b

fix style

push time in 2 days

push event online-ml/river

github-actions

commit sha 10b69f147eefdd1dc3c81b91dfe8a4654a114a57

Deployed cc59256f to dev with MkDocs 1.1.2 and mike 0.5.3

push time in 2 days

issue closed online-ml/river

Multiclass benchmark

We need to write a simple benchmark for multi-class learning. I'm not set on a particular dataset yet so feel free to propose one!

closed time in 2 days

MaxHalford

delete branch online-ml/river

delete branch : improve-chain-models

delete time in 2 days

push event online-ml/river

Max Halford

commit sha 34406cb6b59281dead2a1069bc56e0bd4dd8877e

Allow multioutput chain to ingest new outputs

Max Halford

commit sha cc59256f226e692c7c7986352eab9ecc6126e4a0

Merge pull request #508 from online-ml/improve-chain-models Allow multioutput chain to ingest new outputs

push time in 2 days

PR opened online-ml/river

Allow multioutput chain to ingest new outputs
+52 -52

0 comments

1 changed file

pr created time in 2 days

create branch online-ml/river

branch : improve-chain-models

created branch time in 2 days

issue comment online-ml/river

Bug with ClassificationReport / use of classifier?

> What would the preferred interface be? Presumably this is just an argument to the constructor. Do we want to simply have a Boolean-valued keyword argument (default to True) eg ... ignore_nones=<True|False> or perhaps a filter predicate function?

I would add this ignore_none parameter to the __init__ of confusion.ConfusionMatrix. You would also have to expose the parameter in the constructor of BinaryMetric and MultiClassMetric which are inherited by the classification metrics.

> My assumption is the former as the cm implementation is Cython and we probably don’t want to be evaluating Python per se in the middle of that, which an arbitrary filtering function might entail?

Correct. A boolean flag that defaults to True will do the job.

> Nb. I’ll probably not have a chance to get this into shape for a PR this weekend, but happy to do it!

No pressure! Please do not hesitate to ping me if you struggle.

jbone

comment created time in 3 days
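The boolean-flag interface discussed in the comment above could look roughly like the following pure-Python sketch. `ToyConfusionMatrix` is a hypothetical stand-in, not river's Cython `confusion.ConfusionMatrix`, and skipping a sample when either the label or the prediction is `None` is an assumption about what the flag would do.

```python
from collections import defaultdict


class ToyConfusionMatrix:
    """Illustrative confusion matrix with an ignore_none flag (assumed semantics)."""

    def __init__(self, ignore_none=True):
        self.ignore_none = ignore_none
        # counts[y_true][y_pred] -> number of observations
        self.counts = defaultdict(lambda: defaultdict(int))
        self.n_samples = 0

    def update(self, y_true, y_pred):
        # Assumption: a None label or prediction means "nothing to score".
        if self.ignore_none and (y_true is None or y_pred is None):
            return self
        self.counts[y_true][y_pred] += 1
        self.n_samples += 1
        return self


pairs = [("spam", None), ("spam", "spam"), ("ham", "spam")]
skipping = ToyConfusionMatrix()                   # default: ignore None
counting = ToyConfusionMatrix(ignore_none=False)  # keep None as its own column
for y_true, y_pred in pairs:
    skipping.update(y_true, y_pred)
    counting.update(y_true, y_pred)
```

A plain boolean keeps the check trivial, which matches the concern raised in the thread about not calling arbitrary Python predicates from inside Cython code.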

issue comment online-ml/river

Bug with ClassificationReport / use of classifier?

What would the preferred interface be? Presumably this is just an argument to the constructor. Do we want to simply have a Boolean-valued keyword argument (default to True) eg ... ignore_nones=<True|False> or perhaps a filter predicate function?

My assumption is the former as the cm implementation is Cython and we probably don’t want to be evaluating Python per se in the middle of that, which an arbitrary filtering function might entail?

Nb. I’ll probably not have a chance to get this into shape for a PR this weekend, but happy to do it!

jbone

comment created time in 3 days

issue comment online-ml/river

Bug with ClassificationReport / use of classifier?

Will do!

jbone

comment created time in 3 days

push event online-ml/river

github-actions

commit sha 49faadf1d9e0618720b5829d3e3ba082775043d4

Deployed 9c917966 to dev with MkDocs 1.1.2 and mike 0.5.3

push time in 3 days

push event online-ml/river

Buster Styren

commit sha 75c6525e1f9acfc469400af3b18bc10dd77ee30c

parametrize tests for proba and score_one

Buster Styren

commit sha 55185c10955e239cc9a4dcff61a31af5e8265cea

lint

Max Halford

commit sha 9c9179663557d46c0603e8959218c67cef3f3bcd

Merge pull request #505 from Styren/add-no-learn Parametrize tests for proba-methods

push time in 3 days

PR merged online-ml/river

Parametrize tests for proba-methods

Add coverage for testing of learn_unsupervised in compose.Pipeline

+35 -9

0 comments

1 changed file

Styren

pr closed time in 3 days
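For readers unfamiliar with the `learn_unsupervised` flag mentioned in the entry above, the sketch below illustrates the general idea with toy classes: a pipeline may or may not update its unsupervised transformers while predicting. `ToyMeanCenterer` and `ToyPipeline` are hypothetical names, the default value of `True` is an assumption, and none of this is river's actual `compose.Pipeline` code.

```python
class ToyMeanCenterer:
    """Unsupervised transformer that subtracts a running mean."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x):
        # Incremental mean update.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self

    def transform_one(self, x):
        return x - self.mean


class ToyPipeline:
    """Transformer followed by a trivial 'model' (is x above the running mean?)."""

    def __init__(self, transformer):
        self.transformer = transformer

    def predict_one(self, x, learn_unsupervised=True):
        # Optionally let the transformer learn from x before transforming it.
        if learn_unsupervised:
            self.transformer.learn_one(x)
        centered = self.transformer.transform_one(x)
        return centered > 0


frozen = ToyPipeline(ToyMeanCenterer())
frozen_prediction = frozen.predict_one(10.0, learn_unsupervised=False)

updating = ToyPipeline(ToyMeanCenterer())
updating_prediction = updating.predict_one(10.0)
```

With the flag off, the transformer's statistics stay untouched by prediction calls; with it on, the same call also shifts the running mean, which changes the prediction for the very same input.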