Negative Selection Algorithms for Outlier detections #690

Open
kishordgupta wants to merge 5 commits into yzhao062:development from kishordgupta:master

@kishordgupta

All Submissions Basics:

  • Have you followed the guidelines in our Contributing document? Hope so.
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change? N/A.
  • Have you checked all Issues to tie the PR to a specific one? No existing issue found for NSA family models.

All Submissions Cores:

  • Have you added an explanation of what your changes do and why you'd like us to include them? This PR adds a Negative Selection Algorithm (NSA) family implementation for PyOD, covering historical NSA variants such as binary NSA, real-valued NSA, randomized real-valued NSA, V-Detector, grid/matrix NSA, adaptive/evolutionary NSA, PSO/DE-based NSA, bidirectional inhibition NSA, and related variants. The goal is to add immune-inspired anomaly detection coverage to PyOD's tabular detector set.
  • Have you written new tests for your core changes, as applicable? Added NSA family tests covering fit, scoring, prediction, detector consistency, and basic benchmark execution.
  • Have you successfully run tests with your changes locally? Local result: 2 passed in 13.67s.
  • Does your submission pass tests, including CircleCI, Travis CI, and AppVeyor?
  • Does your submission have appropriate code coverage? The cutoff threshold is 95% by Coveralls.

New Model Submissions:

  • Have you created a .py in ~/pyod/models/? Added pyod/models/nsa.py.
  • Have you created a _example.py in ~/examples/? Added examples/nsa_family_benchmark.py.
  • Have you created a test_.py in ~/pyod/test/? Added under examples/ instead.
  • Have you linted your code locally prior to submission? Yes. Local result: 2 passed in 13.67s.
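
For readers unfamiliar with the NSA family named in the description, the core idea of a real-valued negative selection detector can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in pyod/models/nsa.py: the function names, the fixed `self_radius`, and the unit-hypercube sampling are all hypothetical choices made for the example.

```python
import numpy as np


def generate_detectors(X_self, n_detectors, self_radius, rng, max_tries=10_000):
    """Sample random points in [0, 1]^d and keep those that match no self sample."""
    n_features = X_self.shape[1]
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = rng.random(n_features)
        # Negative selection: discard candidates that fall inside any self region.
        dist_to_self = np.linalg.norm(X_self - candidate, axis=1)
        if dist_to_self.min() > self_radius:
            detectors.append(candidate)
    return np.asarray(detectors).reshape(-1, n_features)


def anomaly_scores(X, detectors):
    """Higher score means closer to a detector, i.e. more likely non-self."""
    dists = np.linalg.norm(X[:, None, :] - detectors[None, :, :], axis=2)
    return -dists.min(axis=1)


rng = np.random.default_rng(0)
# Self (normal) samples clustered around the centre of the unit square.
X_self = np.clip(0.5 + 0.05 * rng.standard_normal((200, 2)), 0, 1)
detectors = generate_detectors(X_self, n_detectors=300, self_radius=0.1, rng=rng)

inlier = np.array([[0.5, 0.5]])
outlier = np.array([[0.95, 0.05]])
inlier_score = anomaly_scores(inlier, detectors)[0]
outlier_score = anomaly_scores(outlier, detectors)[0]
print("inlier score:", inlier_score, "outlier score:", outlier_score)
```

Detectors are trained only on normal data and cover the space that normal data does not occupy, so a point near a detector is scored as more anomalous.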

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 76645387d1

Comment thread pyod/test/nsa_example.py
Comment thread examples/nsa_example.py
Comment on lines +101 to +103
def test_all_nsa_variants_fit_score_predict():
    X_train, X_test, y_test, contamination = _dataset()
    for model in _models(contamination):

[P2] Convert NSA example file into an executable example

The new examples/nsa_example.py is written as a pytest test module (test_* functions with asserts) instead of a runnable example script, and it has no if __name__ == "__main__": execution path. Running this file directly will not provide the usage/demo behavior expected from files in examples/, so users do not get an actual NSA example despite the new-model example requirement.
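
A runnable example in the sense described above might be laid out roughly like this. It is only a sketch: `ToyDetector` is a hypothetical stand-in that mimics the PyOD-style `fit` / `decision_function` / `predict` interface, and a real examples/ script would import the detector from pyod/models/nsa.py instead.

```python
import numpy as np


class ToyDetector:
    """Stand-in with a PyOD-like interface; not the PR's NSA class."""

    def __init__(self, contamination=0.1):
        self.contamination = contamination

    def fit(self, X):
        self.center_ = X.mean(axis=0)
        self.decision_scores_ = self.decision_function(X)
        # Threshold at the (1 - contamination) quantile of the training scores.
        self.threshold_ = np.quantile(self.decision_scores_, 1 - self.contamination)
        return self

    def decision_function(self, X):
        # Distance from the training mean serves as a simple outlier score.
        return np.linalg.norm(X - self.center_, axis=1)

    def predict(self, X):
        return (self.decision_function(X) > self.threshold_).astype(int)


def main():
    rng = np.random.default_rng(42)
    X_train = rng.standard_normal((200, 2))
    # 45 inliers plus 5 obvious outliers far from the training cloud.
    X_test = np.vstack([rng.standard_normal((45, 2)), rng.uniform(6, 8, (5, 2))])
    y_test = np.r_[np.zeros(45, dtype=int), np.ones(5, dtype=int)]

    clf = ToyDetector(contamination=0.1).fit(X_train)
    y_pred = clf.predict(X_test)
    print("flagged", int(y_pred.sum()), "of", len(X_test), "test points")
    print("true outliers flagged:", int(y_pred[y_test == 1].sum()), "of", int(y_test.sum()))
    return y_pred, y_test


if __name__ == "__main__":
    main()
```

With the `__main__` guard, running the file directly prints a short demo, while the pytest-style assertions can live in a separate module under pyod/test/.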

Comment thread pyod/models/nsa.py
Comment on lines +465 to +467
X_restored = np.zeros((X_selected.shape[0], n_features))
X_restored[:, self.feature_indices_] = X_selected
return X_restored

[P2] Preserve original features when caching train data

When feature subsampling is active, _restore_feature_matrix fills dropped columns with zeros before inverse scaling, so X_train_original_ is not the original training set but a distorted version with omitted features forced to feature minima. Any later partial_fit retrains on this corrupted history, which can shift scaling and detector behavior unexpectedly.
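
The distortion can be reproduced in a few lines. The sketch below assumes a min-max scaler fitted on all columns (consistent with the "feature minima" remark, but still an assumption about the PR's scaler) and uses made-up data and variable names.

```python
import numpy as np

X = np.array([[10.0, 200.0, 3.0],
              [20.0, 400.0, 5.0],
              [30.0, 300.0, 4.0]])
feature_indices = np.array([0, 2])  # column 1 was dropped by subsampling

# Min-max scaling fitted on all columns (assumed scaler behaviour).
col_min, col_max = X.min(axis=0), X.max(axis=0)
X_scaled = (X - col_min) / (col_max - col_min)
X_selected = X_scaled[:, feature_indices]

# Same pattern as the quoted lines: zero-fill dropped columns, then invert the scaling.
X_restored = np.zeros((X_selected.shape[0], X.shape[1]))
X_restored[:, feature_indices] = X_selected
X_roundtrip = X_restored * (col_max - col_min) + col_min

# The dropped column comes back as its minimum (200.0) in every row.
print(X_roundtrip[:, 1])
print(X[:, 1])
```

The kept columns survive the round trip, but the dropped column no longer matches the training data, which is why caching a copy of the raw input before scaling or subsampling is the safer option.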

Comment thread pyod/models/nsa.py
Comment on lines +436 to +438
self.X_train_original_ = self.scaler_.inverse_transform(
    self._restore_feature_matrix(X, self.n_features_in_)
).copy() if X.shape[1] == self.n_features_in_ else np.empty((0, self.n_features_in_))

[P2] Keep prior training data for grid partial_fit path

In the grid variant, when feature subsampling is used (X.shape[1] != n_features_in_), X_train_original_ is set to an empty array instead of the fitted training data. A subsequent partial_fit then refits only on the new batch (vstack with empty), dropping all historical self samples despite the method contract saying it combines old and new profiles.
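
A small sketch of the failure and of one way to avoid it follows. `HistoryKeepingModel` and its `_fit_detectors` helper are hypothetical and stand in for the grid variant; the point is only that the raw training rows are cached before any subsampling, so `partial_fit` always has history to combine with.

```python
import numpy as np


class HistoryKeepingModel:
    """Hypothetical stand-in: stores raw training rows regardless of subsampling."""

    def __init__(self, feature_indices=None):
        self.feature_indices = feature_indices

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.n_features_in_ = X.shape[1]
        # Cache the raw input up front instead of reconstructing it later.
        self.X_train_original_ = X.copy()
        self._fit_detectors(X)
        return self

    def partial_fit(self, X_new):
        X_new = np.asarray(X_new, dtype=float)
        # Combine old and new self samples, then refit on the union.
        combined = np.vstack([self.X_train_original_, X_new])
        return self.fit(combined)

    def _fit_detectors(self, X):
        # Placeholder for the real fitting step; only records how many rows were used.
        cols = slice(None) if self.feature_indices is None else self.feature_indices
        self.n_self_samples_ = X[:, cols].shape[0]


old, new = np.ones((5, 3)), np.zeros((2, 3))

# Failure mode from the review: stacking onto an empty cache keeps only the new batch.
empty_cache = np.empty((0, 3))
lost_history = np.vstack([empty_cache, new])
print(lost_history.shape)

model = HistoryKeepingModel(feature_indices=[0, 2]).fit(old)
model.partial_fit(new)
print(model.X_train_original_.shape)
```

Because the cache holds the full-width raw rows, the same code path works whether or not feature subsampling is active.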