
Fix #298 Handle removal of sklearn randomized l1 models #302

Merged · 2 commits · Mar 29, 2019
Changes from 1 commit
Handle future removal of randomized l1 transforms (sklearn)
teabolt committed Mar 27, 2019
commit 00e66d22032e2ad7055d98edb103fdf07785e12c
eli5/sklearn/transform.py: 19 changes (12 additions, 7 deletions)
@@ -4,10 +4,7 @@
 import numpy as np  # type: ignore
 from sklearn.pipeline import Pipeline, FeatureUnion  # type: ignore
 from sklearn.feature_selection.base import SelectorMixin  # type: ignore
-from sklearn.linear_model import (  # type: ignore
-    RandomizedLogisticRegression,
-    RandomizedLasso,
-)
+
 from sklearn.preprocessing import (  # type: ignore
     MinMaxScaler,
     StandardScaler,
@@ -21,15 +18,23 @@

 # Feature selection:

-@transform_feature_names.register(SelectorMixin)
-@transform_feature_names.register(RandomizedLogisticRegression)
-@transform_feature_names.register(RandomizedLasso)
lopuhin (Contributor) commented:
I had an idea for another approach here; after some thought it looks worse than what is implemented here, but let me put it down anyway:

  • put imports at the top, and set RandomizedLasso etc. to None if they fail to import
  • make it possible to write something equivalent to @transform_feature_names.register(RandomizedLasso) which would do nothing if RandomizedLasso is None. It seems we'd need a wrapper for that, something like @register(transform_feature_names, RandomizedLasso) (sketched below)

The advantage is that the flow would be more natural (imports at the top, decorators above the function). The disadvantage is that it might be a bit confusing that RandomizedLasso can be None, and we would also need to introduce the wrapper. So to me it seems it would only make sense if we get a lot of similar conditional registers.
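For concreteness, a minimal sketch of what that conditional-register wrapper might look like. It is purely illustrative and not part of this PR: the `register` helper is hypothetical, and it assumes `transform_feature_names` is the singledispatch-style registry imported from `eli5.transform`:

```python
from eli5.transform import transform_feature_names  # assumed import path

try:
    from sklearn.linear_model import RandomizedLasso  # type: ignore
except ImportError:  # removed in scikit-learn 0.21
    RandomizedLasso = None

def register(registry, cls):
    # Hypothetical wrapper: behaves like @registry.register(cls),
    # but is a no-op when cls is None (i.e. the import failed above).
    def decorator(func):
        if cls is None:
            return func
        return registry.register(cls)(func)
    return decorator

@register(transform_feature_names, RandomizedLasso)
def _select_names(est, in_names=None):
    ...
```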

teabolt (Contributor, Author) replied:

Thanks for the suggestions @lopuhin. Looking at what has recently been deprecated in sklearn, there don't seem to be many removals of existing models, but I would certainly prefer your approach if we face more issues like this down the line.

 def _select_names(est, in_names=None):
     mask = est.get_support(indices=False)
     in_names = _get_feature_names(est, feature_names=in_names,
                                   num_features=len(mask))
     return [in_names[i] for i in np.flatnonzero(mask)]

+try:
+    from sklearn.linear_model import (  # type: ignore
+        RandomizedLogisticRegression,
+        RandomizedLasso,
+    )
+    _select_names = transform_feature_names.register(RandomizedLasso)(_select_names)
+    _select_names = transform_feature_names.register(RandomizedLogisticRegression)(_select_names)
+except ImportError:  # Removed in scikit-learn 0.21
+    pass
+_select_names = transform_feature_names.register(SelectorMixin)(_select_names)
lopuhin (Contributor) commented:
SelectorMixin could be left as a decorator, right?

teabolt (Contributor, Author) replied:
You are correct. I expanded it out and placed SelectorMixin last to keep the original order in which the decorators are applied (RandomizedLasso -> RandomizedLogisticRegression -> SelectorMixin), but in this case the order does not seem to matter.
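To illustrate why the two spellings are interchangeable, a small self-contained sketch, assuming (as in eli5) that transform_feature_names is a functools.singledispatch generic function; the SelectorMixin class below is a stand-in for the sklearn one:

```python
from functools import singledispatch

@singledispatch
def transform_feature_names(transformer, in_names=None):
    raise NotImplementedError('no handler for %r' % type(transformer))

class SelectorMixin:  # stand-in for sklearn.feature_selection.base.SelectorMixin
    pass

# The decorator form...
@transform_feature_names.register(SelectorMixin)
def _select_names(est, in_names=None):
    return in_names

# ...is equivalent to the explicit call used in this PR, because
# register(cls)(func) returns func unchanged. That is also why the
# order in which the classes are registered does not matter:
_select_names = transform_feature_names.register(SelectorMixin)(_select_names)
```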



 # Scaling
