Negative Selection Algorithms for Outlier detections #690
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 76645387d1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
def test_all_nsa_variants_fit_score_predict():
    X_train, X_test, y_test, contamination = _dataset()
    for model in _models(contamination):
Convert NSA example file into an executable example
The new examples/nsa_example.py is written as a pytest test module (test_* functions with asserts) rather than a runnable example script, and it has no if __name__ == "__main__": entry point. Running the file directly therefore produces none of the usage/demo output expected from files in examples/, so users do not get an actual NSA example despite the example requirement for new models.
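A minimal sketch of the script shape this comment asks for, assuming nothing about the PR's actual classes: ToyDistanceDetector is a hypothetical stand-in for an NSA model (it only mimics a fit / decision_function / predict interface), and the synthetic data is illustrative. The point is the structure: a main() that prints results, guarded by an entry point.

```python
import numpy as np


class ToyDistanceDetector:
    """Hypothetical stand-in for an NSA model; NOT the PR's class."""

    def __init__(self, contamination=0.1):
        self.contamination = contamination

    def fit(self, X):
        self.center_ = X.mean(axis=0)
        scores = self.decision_function(X)
        # Threshold so roughly `contamination` of training points are flagged.
        self.threshold_ = np.quantile(scores, 1.0 - self.contamination)
        return self

    def decision_function(self, X):
        # Higher score means more anomalous (distance from the training mean).
        return np.linalg.norm(X - self.center_, axis=1)

    def predict(self, X):
        # 1 = outlier, 0 = inlier.
        return (self.decision_function(X) > self.threshold_).astype(int)


def main(seed=0):
    rng = np.random.default_rng(seed)
    X_train = rng.normal(0.0, 1.0, size=(200, 2))
    inliers = rng.normal(0.0, 1.0, size=(40, 2))
    outliers = rng.normal(8.0, 0.5, size=(10, 2))
    X_test = np.vstack([inliers, outliers])
    y_test = np.r_[np.zeros(40, dtype=int), np.ones(10, dtype=int)]

    model = ToyDistanceDetector(contamination=0.1).fit(X_train)
    y_pred = model.predict(X_test)
    recall = float((y_pred[y_test == 1] == 1).mean())

    # An example script reports results instead of asserting on them.
    print(f"flagged {int(y_pred.sum())} of {len(y_pred)} test points")
    print(f"outlier recall: {recall:.2f}")
    return y_pred, recall


if __name__ == "__main__":
    main()
```

The existing test_* functions could move to the test suite unchanged, leaving the example file free to demonstrate usage.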
X_restored = np.zeros((X_selected.shape[0], n_features))
X_restored[:, self.feature_indices_] = X_selected
return X_restored
Preserve original features when caching train data
When feature subsampling is active, _restore_feature_matrix fills the dropped columns with zeros before inverse scaling. As a result, X_train_original_ is not the original training set but a distorted version in which the omitted features are forced to their per-feature minima. Any later partial_fit retrains on this corrupted history, which can shift scaling and detector behavior unexpectedly.
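The distortion can be reproduced in a few lines. This is a sketch under assumptions: min-max scaling is written out by hand (the PR's scaler may differ), and feature_indices is a hypothetical subsample. It shows that zero-filled columns inverse-transform to the per-feature minimum, and that caching the raw matrix up front avoids the round trip entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(5.0, 10.0, size=(6, 4))  # raw training data, 4 features

# Min-max scaling by hand (assumption: the PR's scaler behaves like this).
data_min, data_max = X.min(axis=0), X.max(axis=0)
X_scaled = (X - data_min) / (data_max - data_min)

feature_indices = np.array([0, 2])  # hypothetical feature subsample
X_selected = X_scaled[:, feature_indices]

# Zero-fill restore, mirroring the reviewed snippet.
X_restored = np.zeros((X_selected.shape[0], X.shape[1]))
X_restored[:, feature_indices] = X_selected

# Inverse scaling maps the zero-filled columns to the feature minima.
X_roundtrip = X_restored * (data_max - data_min) + data_min

dropped = [1, 3]
minima_only = np.allclose(X_roundtrip[:, dropped], data_min[dropped])
matches_original = np.allclose(X_roundtrip, X)
print(minima_only, matches_original)

# Alternative sketch: cache the raw matrix before scaling or subsampling.
X_train_original = X.copy()
```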
self.X_train_original_ = self.scaler_.inverse_transform(
    self._restore_feature_matrix(X, self.n_features_in_)
).copy() if X.shape[1] == self.n_features_in_ else np.empty((0, self.n_features_in_))
Keep prior training data for grid partial_fit path
In the grid variant, when feature subsampling is used (X.shape[1] != n_features_in_), X_train_original_ is set to an empty array instead of the fitted training data. A subsequent partial_fit then refits on the new batch alone (a vstack with an empty array), dropping all historical self samples even though the method contract says it combines old and new profiles.
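The effect of the empty-array fallback can be illustrated with plain NumPy. The variable names below are hypothetical and the arrays are synthetic; the sketch only contrasts stacking a new batch onto an empty history with stacking it onto a copy of the raw data kept at fit time.

```python
import numpy as np

n_features = 3
rng = np.random.default_rng(1)
X_old = rng.normal(size=(5, n_features))  # batch seen at fit time
X_new = rng.normal(size=(2, n_features))  # batch passed to partial_fit

# Behaviour described in the review: history replaced by an empty array.
history_empty = np.empty((0, n_features))
combined_lossy = np.vstack([history_empty, X_new])

# Sketch of a fix: keep a copy of the raw batch at fit time.
history_kept = X_old.copy()
combined_full = np.vstack([history_kept, X_new])

print(combined_lossy.shape)  # (2, 3): only the new batch survives
print(combined_full.shape)   # (7, 3): old and new samples combined
```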
All Submissions Basics:
All Submissions Cores:
New Model Submissions: