If you are wondering where the data of this site comes from, please visit https://api.github.com/users/southpaw27/events. GitMemory does not store any data; it only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.

southpaw27/anonymous-inner-classes 0

Examples of anonymous inner classes.

southpaw27/babel 0

:tropical_fish: Babel is a compiler for writing next generation JavaScript.

southpaw27/bst 0

A simple and partial implementation of binary search trees

southpaw27/csc207-hw2 0

Homework 2 repository for CSC 207

issue opened alan-turing-institute/sktime

[DOC] create table of contents for forecasting tutorial

It would be nice if the forecasting tutorial, and other tutorials, had a table of contents.

This should be looked at during or after the 2021 summer dev days (the tutorials are undergoing some refactoring, but contributions are welcome).

See also #972 and discussion therein.

created time in 4 hours

push event alan-turing-institute/sktime

Franz Király

commit sha 88b6898d68367bf990e56aa192d867bdc97a65b3

linting

view details

Franz Király

commit sha 793637278a698fc3a1db3709a9d824ad2e0f09e3

nbqa black formatting

view details

Franz Király

commit sha 07106fd3e720abd9c243f6929f84d6f6e1c26232

all_estimators and typo fix

view details

Franz Király

commit sha bb0decef9f5319e6e74b6d8f70af2b3165047654

examples: updating

view details

Franz Király

commit sha a6b6575607b6f218f882a018cd7dea32044cd48f

prediction intervals moved forward; evaluation text updated

view details

push time in 4 hours

issue opened alan-turing-institute/sktime

forecasting: pretty-plotting for predictive interval forecasts

There should be pretty-plotting functionality for predictive intervals in forecasting.

Something like plot_series, but adapted to the return pair when return_pred_int=True in predict.

This could be easily adapted from the code in the tutorial notebook.
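
For illustration, a minimal sketch of what such a helper could look like, assuming the (point forecast, interval) pair returned when return_pred_int=True has "lower" and "upper" columns; the helper name and column names are assumptions, not existing sktime API:

    # Sketch only: plot a point forecast with a shaded prediction interval.
    import matplotlib.pyplot as plt

    def plot_series_with_interval(y_train, y_pred, pred_int, alpha=0.05):
        fig, ax = plt.subplots()
        ax.plot(y_train.index, y_train.values, label="observed")
        ax.plot(y_pred.index, y_pred.values, label="forecast")
        # shade the area between the lower and upper interval bounds
        ax.fill_between(
            y_pred.index,
            pred_int["lower"],
            pred_int["upper"],
            alpha=0.3,
            label=f"{int((1 - alpha) * 100)}% prediction interval",
        )
        ax.legend()
        return fig, ax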

created time in 4 hours

issue opened alan-turing-institute/sktime

forecasting interval prediction interface

Let's think about a re-factoring of the forecasting interface for interval and probabilistic predictions.

I think we should have:

  • predict_interval, this should take alpha as in predict and return quantiles; one should be able to pass lists of alpha too (if the forecaster supports it). If a list is passed, we need to think about the return object; perhaps a data frame, with a nested column index if we are in the multivariate case? Or we could annotate columns as [variablename]__[statistic], e.g., temperature__P95, using the reserved double underscore (see the sketch below).
  • predict_var, this should return the predictive variance (not quantiles); note that this is what comes out of Gaussian-process-type regressors primarily.
  • predict_proba, this returns a skpro-like distribution object. Requires work on skpro first though.

For the first two, we could alternatively think about return flags and return objects in predict, but that just seems very messy to me.
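
To make the column-annotation idea concrete, here is a purely illustrative return format for the list-of-alphas case; the variable name and quantile labels are made up:

    # Sketch only: proposed [variablename]__[statistic] column convention.
    import pandas as pd

    fh_index = pd.period_range("2021-07", periods=3, freq="M")
    pred_int = pd.DataFrame(
        {
            "temperature__P05": [14.1, 13.8, 13.5],
            "temperature__P95": [18.9, 19.2, 19.6],
        },
        index=fh_index,
    )

    # Alternative: a nested (MultiIndex) column layout for the multivariate case.
    pred_int_nested = pred_int.copy()
    pred_int_nested.columns = pd.MultiIndex.from_tuples(
        [("temperature", "P05"), ("temperature", "P95")]
    )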

Any opinions? @aiwalter, @mloning, @RNKuhns

created time in 4 hours

issue opened alan-turing-institute/sktime

tag: forecaster can return prediction intervals

Whether a forecaster can return prediction intervals should be a tag.

created time in 5 hours

issue comment alan-turing-institute/sktime

re-factoring forecaster.update, update_predict, update_predict_single

@mloning, @aiwalter, thoughts?

fkiraly

comment created time in 5 hours

issue opened alan-turing-institute/sktime

re-factoring forecaster.update, update_predict, update_predict_single

The rolling prediction interface of forecasters is currently confusing and unintuitive.

I suggest we re-factor this as follows:

  • update remains as is currently
  • update_predict becomes what update_predict_single is currently, a shorthand for "one update, then one predict"
  • current functionality in update_predict gets replaced by an update_predict_playback, as below.

update_predict_playback(y, cutoffs, X=None, fh=None, update_params=True) -> list

y is the full series; cutoffs (a list of indices) are indices on the same axis as y (but not necessarily a subset of y.index). It is assumed (and asserted/tested) that cutoffs is monotonic. fh must be relative.

The return is a list of the same length as cutoffs; the 0-th element is the return of predict after fitting to y[:cutoffs[0]] (a temporal slice); the i-th element is the return of the i-th update_predict, with y[cutoffs[i-1]:cutoffs[i]]. All predictions use the fh passed.
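
A minimal sketch of the intended semantics (not an actual sktime API; the forecaster is assumed to expose the usual fit/update/predict methods, and X handling is omitted for brevity):

    # Sketch only: fit up to the first cutoff, then replay the remaining
    # slices through update, collecting one prediction per cutoff.
    def update_predict_playback(forecaster, y, cutoffs, fh=None, update_params=True):
        assert all(a < b for a, b in zip(cutoffs, cutoffs[1:])), (
            "cutoffs must be monotonically increasing"
        )
        preds = []
        forecaster.fit(y[: cutoffs[0]], fh=fh)
        preds.append(forecaster.predict(fh=fh))
        for prev, curr in zip(cutoffs, cutoffs[1:]):
            forecaster.update(y[prev:curr], update_params=update_params)
            preds.append(forecaster.predict(fh=fh))
        return preds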

created time in 5 hours

pull request comment alan-turing-institute/sktime

Forecasting tutorial rework

I meant that changes to Jupyter notebooks are not line-based, so it's hard to see what's going on with git, to the point where tracking line changes is basically useless (hence tools like ReviewNB)

but we're using ReviewNB, no?

fkiraly

comment created time in 8 hours

push event alan-turing-institute/sktime

mloning

commit sha 5ddf46cc37130a003ebe81ea471940b330c21e41

Update contributors

view details

push time in 8 hours

pull request comment alan-turing-institute/sktime

Forecasting tutorial rework

I meant that changes to Jupyter notebooks are not line-based, so it's hard to see what's going on with git, to the point where tracking line changes is basically useless (hence tools like ReviewNB)

fkiraly

comment created time in 8 hours

pull request comment alan-turing-institute/sktime

Forecasting tutorial rework

due to their incompatibility with git

they are not incompatible with git? Or what do you mean?

fkiraly

comment created time in 8 hours

pull request comment alan-turing-institute/sktime

Forecasting tutorial rework

Agreed, but sphinx gallery allows you to download Python scripts as Jupyter notebooks

Hm, does this properly support markdown and cells? Happy to try it if someone else sets it up.

fkiraly

comment created time in 8 hours

pull request comment alan-turing-institute/sktime

Forecasting tutorial rework

My opinion is to keep the entry barrier to the tutorials as low as possible.

Agreed, but sphinx gallery allows you to download Python scripts as Jupyter notebooks (see e.g. https://scikit-learn.org/stable/auto_examples/bicluster/plot_spectral_coclustering.html#sphx-glr-auto-examples-bicluster-plot-spectral-coclustering-py)
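
For reference, a minimal sphinx-gallery setup looks roughly like this (a sketch for docs/conf.py; the directory names are placeholders, not sktime's actual docs configuration). Each .py example is then rendered as an HTML page with downloadable .py and .ipynb versions:

    # docs/conf.py (sketch)
    extensions = ["sphinx_gallery.gen_gallery"]
    sphinx_gallery_conf = {
        "examples_dirs": "../examples",   # where the example scripts live
        "gallery_dirs": "auto_examples",  # where generated pages are written
    }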

fkiraly

comment created time in 8 hours

pull request comment alan-turing-institute/sktime

Forecasting tutorial rework

The other more general point is that Jupyter notebooks are very hard to maintain in general, due to their incompatibility with git. An alternative may be to convert them into Python files and then use sphinx gallery for docs and an automated conversion for binder as done by scikit-learn.

Yes, but I still think we should use Jupyter notebooks. Why: all users can run them, assuming the right packages are installed.

Binder is error-prone (it still is not fixed! see #662), and plain Python files raise the barrier to running them. My opinion is to keep the entry barrier to the tutorials as low as possible.

fkiraly

comment created time in 8 hours

pull request comment alan-turing-institute/sktime

Update plot_series to handle pd.Int64 and pd.Range index uniformly

Thanks again @Dbhasin1 for the PR, great work! As said, please reach out again if you want to join the mentoring scheme or have any questions.

Dbhasin1

comment created time in 8 hours

push event alan-turing-institute/sktime

Drishti Bhasin

commit sha eb020bfa5f1b7e01255fd5f97163fde6023b8d63

Update plot_series to handle pd.Int64 and pd.Range index uniformly (#892)

* check for indices added
* typecasting modified
* check for indices added
* check for indices added
* removed union operation from check_consistent_index_types
* Updated docs to comply with pydocstyle
* comments modified
* Trigger CI Checks
* Add unit test
* Minor fixes after self-review
* Fix tests

Co-authored-by: Sakshi Bhasin <sbhasin@sbhasin-ltmo42k.internal.salesforce.com>
Co-authored-by: mloning <markus.loning.17@ucl.ac.uk>

view details

push time in 8 hours

PR merged alan-turing-institute/sktime

Update plot_series to handle pd.Int64 and pd.Range index uniformly


Reference Issues/PRs

Fixes #882

What does this implement/fix? Explain your changes.

Added typecasting between different series index types, such as Int64, Float64, UInt64, and RangeIndex.

Does your contribution introduce a new dependency? If yes, which one?

No
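
As a rough illustration of the idea (not the PR's actual code), coercing differing numeric index types to a common type before plotting could look like this:

    # Sketch only: cast all indices to int64 so RangeIndex, Int64Index, etc.
    # compare and plot consistently on one axis.
    import pandas as pd

    y1 = pd.Series([1.0, 2.0, 3.0], index=pd.RangeIndex(3))
    y2 = pd.Series([2.0, 3.0, 4.0], index=pd.Index([3, 4, 5], dtype="int64"))

    def _coerce_indices(*series):
        return [s.set_axis(s.index.astype("int64")) for s in series]

    y1, y2 = _coerce_indices(y1, y2)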

+78 -17

3 comments

4 changed files

Dbhasin1

pr closed time in 8 hours

issue closed alan-turing-institute/sktime

The function 'plot_series' should accept series with different index types

Is your feature request related to a problem? Please describe.
The plot_series function throws an error when multiple series with different index types are passed as parameters.

Describe the solution you'd like
The function should include implicit type casting of the series' indexes.

Describe alternatives you've considered
Manually converting the datatype of the series' indexes before passing them to plot_series using pandas.

Additional context
[screenshot: error message, 2021-05-20]

closed time in 8 hours

Dbhasin1

pull request comment alan-turing-institute/sktime

Feature/information criteria get_fitted_params

2 approvals, so this can be merged now.

ltsaprounis

comment created time in 8 hours

issue comment alan-turing-institute/sktime

Refactor tags

I see tags as being defined on the class (not object) level; conceptually I think of them as describing algorithm properties, so I would prefer not to update them dynamically

Yes, but do you see the problem? The properties of composites may depend on their components. Especially in pipelines this is important, so it would be good to have something like a "dynamic tag". At least, it should be settable or overridable at construction.

the tag should be set to False; the composite doesn't require the forecasting horizon during fitting, it merely passes it on to the components, which have their own tags and raise an error accordingly if they require the forecasting horizon during fitting

disagreed!

I think each forecaster should be able to tell the user in advance whether it will raise an error when no fh is passed to fit. What else is the purpose of these tags? That's also what you are testing for.

Now, the stage at which the forecaster should be able to tell that is up for discussion, in my opinion; my preference would be when constructed, i.e., as an object.
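
As a sketch of what such a "dynamic tag" could look like (purely illustrative, not sktime's implementation): class-level defaults that a composite overrides per instance at construction, depending on its components:

    # Sketch only: class-level tag defaults with per-instance overrides.
    class BaseForecaster:
        _tags = {"requires-fh-in-fit": False}

        def get_tag(self, name):
            # instance-level overrides take precedence over class defaults
            return getattr(self, "_tags_dynamic", {}).get(name, self._tags.get(name))

        def _set_tag(self, name, value):
            self._tags_dynamic = {**getattr(self, "_tags_dynamic", {}), name: value}

    class Pipeline(BaseForecaster):
        def __init__(self, steps):
            self.steps = steps
            # the composite requires fh in fit iff any component does
            self._set_tag(
                "requires-fh-in-fit",
                any(est.get_tag("requires-fh-in-fit") for _, est in steps),
            )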

mloning

comment created time in 8 hours

pull request comment alan-turing-institute/sktime

Refactoring Stacking, Multiplexer, Ensembler and TransformedTarget Forecasters

ok, agreed - let's split off the larger tags discussion, and just do a "safe" refactoring here. Outstanding changes:

  • the _SktimeForecaster removal is not tags-related and should be done
  • the multiplexer tag should be changed to "requires-fh-in-fit" (see above)
thayeylolu

comment created time in 8 hours

Pull request review comment alan-turing-institute/sktime

Refactoring Stacking, Multiplexer, Ensembler and TransformedTarget Forecasters

 class MultiplexForecaster(
     'arima'
     """
+    _tags = {
+        "univariate-only": True,
+        "requires-fh-in-fit": False,
+        "handles-missing-data": False,
+    }
+

this should be changed to "requires-fh-in-fit": True; that's safe until the tags are sorted out.

thayeylolu

comment created time in 8 hours

Pull request review comment alan-turing-institute/sktime

Refactoring Stacking, Multiplexer, Ensembler and TransformedTarget Forecasters

 class TransformedTargetForecaster(
-    _OptionalForecastingHorizonMixin,
     _SktimeForecaster,

this still should be done, it's not a tags issue

thayeylolu

comment created time in 8 hours

Pull request review comment alan-turing-institute/sktime

Refactoring Stacking, Multiplexer, Ensembler and TransformedTarget Forecasters

 from sktime.forecasting.base._base import DEFAULT_ALPHA
 from sktime.forecasting.base._meta import _HeterogenousEnsembleForecaster
-from sktime.forecasting.base._sktime import _RequiredForecastingHorizonMixin
 from sktime.forecasting.model_selection import SingleWindowSplitter

 from warnings import warn


-class StackingForecaster(
-    _RequiredForecastingHorizonMixin, _HeterogenousEnsembleForecaster
-):
+class StackingForecaster(_HeterogenousEnsembleForecaster):
+    """StackingForecaster.
+
+    Stacks two or more Forecasters
+
+    Parameters
+    ----------
+    forecasters : list of (str, estimator) tuples
+    final_regressor: Regressor
+    n_jobs : int or None, optional (default=None)
+        The number of jobs to run in parallel for fit. None means 1 unless
+        in a joblib.parallel_backend context.
+        -1 means using all processors.
+    """
+
     _required_parameters = ["forecasters", "final_regressor"]
+    _tags = {
+        "univariate-only": True,
+        "requires-fh-in-fit": True,

ok, tags issue is now in #981. For now, this is fine.

thayeylolu

comment created time in 8 hours

PR merged stellar/stellar-core

Bug 2607 stream debug meta files

Description

This adds a new debugging facility to core: a fixed-size set of metadata streams kept on disk in the buckets directory, gzipped once closed, and rotated out once they get too old. By default it's set to 64k ledgers, which is about 3.8 days: long enough to notice a serious fault and fetch a copy of the offending ledger's metadata from any node in the network.

Along the way I fixed a (potential) deadlock in posting work back from background threads to the foreground: it can occur when the foreground is blocked waiting for future bucket merges while the merge work is itself blocked behind too much background-thread work that is posting results back to the main thread. This is a fairly subtle condition to arrange, and I don't think anything in the existing code triggers it, but this new code could, and it's a footgun to leave in, so I eliminated it.

The performance costs of the new meta streams (at current pubnet rates) are:

  • Time spent writing meta on ledger close.
  • Running fsync on a background thread, then gzip on a ~100MB file, once every ~20 min. Gzip takes about 5s of CPU time, which could degrade a ledger close on a heavily loaded machine; it is also an IO spike, potentially interfering.
  • An extra ~5GB of storage in the bucket dir for the 256 preserved files.

Rebased on top of #3030 to correctly support <filesystem> on gcc 8 and clang 8. That one should land before this one.

I'll also try to quantify this PR's costs a bit more tomorrow. Meanwhile review of the logic is welcome.

Resolves #2607

Checklist

  • [x] Reviewed the contributing document
  • [x] Rebased on top of master (no merge commits)
  • [x] Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • [x] Compiles
  • [x] Ran all tests
  • [ ] If change impacts performance, include supporting evidence per the performance document
+595 -287

8 comments

19 changed files

graydon

pr closed time in 8 hours

issue closed stellar/stellar-core

Discussion - historical data tables future in core

As we're making progress towards using a pipe for propagating the meta into downstream systems (Horizon's captive mode), we should decide on the changes we want to make when the new mode becomes standard, since we can then better separate how a full validator and a watcher (captive) instance work.

For example, we could eliminate storing the meta entirely as described in https://github.com/stellar/stellar-core/issues/1756 (as the only historical data needed is what will be eventually published to an archive).

There is an interesting question of what to do when it comes to debugging/diagnostics: for a watcher node, the responsibility can be delegated to the host process (so Horizon ingestion for example), but for a full validator we need some way to debug when things go wrong (corruption, etc).

While keeping the meta in the main tables would work, it adds a lot of overhead to "closing a ledger", so it may be a good time to start separating "critical" (needed for consensus) from "non-critical" data.

The simplest thing to do might be to just emit the same stream as what Horizon needs, but into a file or a set of files (managed roughly like a ring buffer, so that we can keep the last N (64?) ledgers' worth of debugging information) and with weak flushing guarantees (i.e., if there is a hard system crash we may corrupt those files, and that is fine).
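
To make the ring-buffer idea concrete, here is an illustrative sketch in Python (stellar-core itself is C++, and this is not its actual code): write meta to numbered files and delete the oldest once more than N are kept, with no strong flush guarantees:

    # Sketch only: keep the last N ledgers' worth of meta-debug files.
    import os

    class MetaDebugRing:
        def __init__(self, directory, keep=64):
            self.directory, self.keep, self.seq = directory, keep, 0

        def write(self, ledger_meta: bytes):
            path = os.path.join(self.directory, f"meta-debug-{self.seq:08d}.xdr")
            with open(path, "wb") as f:
                f.write(ledger_meta)  # no fsync: weak flushing is acceptable
            self.seq += 1
            stale = self.seq - self.keep
            if stale >= 0:
                old = os.path.join(self.directory, f"meta-debug-{stale:08d}.xdr")
                if os.path.exists(old):
                    os.remove(old)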

We could also start experimenting with some alternate high performance store like https://ned14.github.io/llfio/index.html (that we may decide to use later on for the ledger itself).

closed time in 8 hours

MonsieurNicolas

push event stellar/stellar-core

Graydon Hoare

commit sha 2012a0726af22b3309ac4a1199983d32bf3e1da7

Remove potential deadlock when posting from background to main thread.

view details

Graydon Hoare

commit sha c5fb66637b9c4ce9a8538c138994fce504753b7f

Add METADATA_DEBUG_LEDGERS config var.

view details

Graydon Hoare

commit sha a5626a18f4e31260c275fdd20584b04eafa9f7bd

Add ledger/FlushAndRotateMetaDebugWork.{cpp,h}

view details

Graydon Hoare

commit sha bf633bef683b51da43658adac152504e93cffaf8

Add meta-debug streaming machinery to LedgerManagerImpl.

view details

Graydon Hoare

commit sha 1256bc040e006e3d2b15291b5d24e1a2c4a20bcf

Migrate more of util/Fs.{h,cpp} to C++17 <filesystem>

view details

Graydon Hoare

commit sha 3c523576e16f97614d99d2e9b22cd932e9017f65

Support LedgerCloseMeta on dump-xdr.

view details

Latobarita

commit sha be75bd1de18fc01890bc0e96d0af0465f74bc17e

Merge pull request #3059 from graydon/bug-2607-stream-debug-meta-files

Bug 2607 stream debug meta files

Reviewed-by: MonsieurNicolas

view details

push time in 8 hours

delete branch stellar/stellar-core

delete branch: auto

delete time in 8 hours
