-
-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementig Ad predictor #1642
Implementig Ad predictor #1642
Conversation
comments
river/base/adpredictor.py
Outdated
beta (float, default=0.1): | ||
A smoothing parameter that regulates the weight updates. Smaller values allow for finer updates, | ||
while larger values can accelerate convergence but may risk instability. | ||
prior_probability (float, default=0.5): | ||
The initial estimate rate. This value sets the bias weight, influencing the model's predictions | ||
before observing any data. | ||
|
||
epsilon (float, default=0.1): | ||
A variance dynamics parameter that controls how the model balances prior knowledge and learned information. | ||
Larger values prioritize prior knowledge, while smaller values favor data-driven updates. | ||
|
||
num_features (int, default=10): | ||
The maximum number of features the model can handle. This parameter affects scalability and efficiency, | ||
especially for high-dimensional data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to follow the docstring syntax we use everywhere else in River. Take a look at the source code of another model for examples :)
river/base/adpredictor.py
Outdated
adpredictor = AdPredictor(beta=0.1, prior_probability=0.5, epsilon=0.1, num_features=5) | ||
data = [({"feature1": 1, "feature2": 1}, 1),({"feature1": 1, "feature3": 1}, 0),({"feature2": 1, "feature4": 1}, 1),({"feature1": 1, "feature2": 1, "feature3": 1}, 0),({"feature4": 1, "feature5": 1}, 1),] | ||
def train_and_test(model, data): | ||
for x, y in data: | ||
pred_before = model.predict_one(x) | ||
model.learn_one(x, y) | ||
pred_after = model.predict_one(x) | ||
print(f"Features: {x} | True label: {y} | Prediction before training: {pred_before:.4f} | Prediction after training: {pred_after:.4f}") | ||
|
||
train_and_test(adpredictor, data) | ||
|
||
Features: {'feature1': 1, 'feature2': 1} | True label: 1 | Prediction before training: 0.5000 | Prediction after training: 0.7230 | ||
Features: {'feature1': 1, 'feature3': 1} | True label: 0 | Prediction before training: 0.6065 | Prediction after training: 0.3650 | ||
Features: {'feature2': 1, 'feature4': 1} | True label: 1 | Prediction before training: 0.6065 | Prediction after training: 0.7761 | ||
Features: {'feature1': 1, 'feature2': 1, 'feature3': 1} | True label: 0 | Prediction before training: 0.5455 | Prediction after training: 0.3197 | ||
Features: {'feature4': 1, 'feature5': 1} | True label: 1 | Prediction before training: 0.5888 | Prediction after training: 0.7699 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same: take a look at another model source code for an example. You should be able to run the docstring test with pytest river/base/adpredictor.py
river/base/adpredictor.py
Outdated
|
||
""" | ||
|
||
config = namedtuple("config", ["beta", "prior_probability", "epsilon", "num_features"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this for?
river/base/adpredictor.py
Outdated
def prior_bias_weight(self): | ||
# Calculate initial bias weight using prior probability | ||
|
||
return np.log(self.prior_probability / (1 - self.prior_probability)) / self.beta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prefer using Python's standard library. So here you'll have to use math.log
. This is also the case for other parts of the code
river/base/adpredictor.py
Outdated
self.weights[feature]["mean"] = mean + mean_delta | ||
self.weights[feature]["variance"] = variance * variance_multiplier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's cleaner to have two dicts: one to hold the means and how to hold the variances
river/base/adpredictor.py
Outdated
def __str__(self): | ||
# String representation of the model for easy identification | ||
return "AdPredictor" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need for this
river/base/adpredictor.py
Outdated
return {"mean": 0.0, "variance": 1.0} | ||
|
||
|
||
class AdPredictor(Classifier): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this model can live in the linear_model
module!
river/base/adpredictor.py
Outdated
@@ -0,0 +1,154 @@ | |||
from __future__ import annotations | |||
|
|||
from collections import defaultdict, namedtuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We prefer to import the package, and then access its properties, instead of importing what we need
from collections import defaultdict, namedtuple | |
import collections |
Hello! |
No description provided.