
Train the XGBoost Model with Regularization

Use `alpha` for L1 regularization and `lambda` for L2 regularization. In the scikit-learn wrapper (`XGBClassifier`) these are passed as `reg_alpha` and `reg_lambda`, because `lambda` is a reserved word in Python and cannot be used as a keyword argument.

Example Code:

```python

# Initialize the model with L1 and L2 regularization
model = xgb.XGBClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    reg_alpha=0.1,   # L1 regularization term on the leaf weights
    reg_lambda=1.0,  # L2 regularization term on the leaf weights
    random_state=42
)

# Train the model
model.fit(X_train_scaled, y_train)

# Make predictions
y_pred = model.predict(X_test_scaled)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print(classification_report(y_test, y_pred))
```
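
How strong to make the penalties is problem-dependent. As an illustration, the sweep below (an arbitrary grid of `reg_alpha` values; `X_train_scaled`, `X_test_scaled`, `y_train`, and `y_test` come from the complete example in the next section) shows how the L1 strength affects test accuracy:

```python
import xgboost as xgb
from sklearn.metrics import accuracy_score

# Arbitrary grid of L1 strengths for illustration
for reg_alpha in [0.0, 0.1, 1.0, 10.0]:
    m = xgb.XGBClassifier(n_estimators=100, learning_rate=0.1, max_depth=3,
                          reg_alpha=reg_alpha, reg_lambda=1.0, random_state=42)
    m.fit(X_train_scaled, y_train)
    acc = accuracy_score(y_test, m.predict(X_test_scaled))
    print(f'reg_alpha={reg_alpha}: test accuracy={acc:.4f}')
```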

Complete Code Example

Here’s the complete example combining all the steps:

```python

import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import xgboost as xgb
from sklearn.metrics import accuracy_score, classification_report

# Load the dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Scale the features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Initialize the model with L1 and L2 regularization
model = xgb.XGBClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=3,
    reg_alpha=0.1,   # L1 regularization term on the leaf weights
    reg_lambda=1.0,  # L2 regularization term on the leaf weights
    random_state=42
)

# Train the model
model.fit(X_train_scaled, y_train)

# Make predictions
y_pred = model.predict(X_test_scaled)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print(classification_report(y_test, y_pred))
```

Explanation

  • `reg_alpha` (L1, `alpha` in the native API): penalizes the absolute values of the leaf weights. Increasing it drives less useful leaf weights to exactly zero, yielding sparser trees, which can act as a form of implicit feature selection.
  • `reg_lambda` (L2, `lambda` in the native API): penalizes the squared leaf weights, shrinking their magnitudes and smoothing the model, which helps prevent overfitting. A small grid search over both parameters is sketched below.
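
In practice the two penalties are usually tuned together. Here is a minimal sketch using scikit-learn's `GridSearchCV`; the parameter grid is an arbitrary illustration, and `X_train_scaled`/`y_train` are the variables from the complete example above:

```python
from sklearn.model_selection import GridSearchCV
import xgboost as xgb

# Illustrative grid; widen or refine it for a real problem
param_grid = {
    'reg_alpha':  [0.0, 0.1, 1.0],   # L1 strengths to try
    'reg_lambda': [0.1, 1.0, 10.0],  # L2 strengths to try
}

grid = GridSearchCV(
    xgb.XGBClassifier(n_estimators=100, learning_rate=0.1,
                      max_depth=3, random_state=42),
    param_grid,
    cv=5,
    scoring='accuracy',
)
grid.fit(X_train_scaled, y_train)

print('Best parameters:', grid.best_params_)
print(f'Best CV accuracy: {grid.best_score_:.4f}')
```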

By using these regularization techniques with XGBoost, you can improve the generalization of your model and reduce overfitting.
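
As a side note, the spellings `alpha` and `lambda` do work in XGBoost's native API, where parameters are dictionary keys rather than Python keyword arguments. A minimal cross-validation sketch with the native API (the metric and round count are illustrative choices; `X_train_scaled` and `y_train` are from the example above):

```python
import xgboost as xgb

# The native API takes parameters as a dict, so 'alpha' and 'lambda'
# are plain string keys here and the keyword clash does not arise.
params = {
    'objective': 'multi:softprob',
    'num_class': 3,
    'max_depth': 3,
    'eta': 0.1,      # learning rate
    'alpha': 0.1,    # L1 regularization term
    'lambda': 1.0,   # L2 regularization term
}

dtrain = xgb.DMatrix(X_train_scaled, label=y_train)

# 5-fold cross-validation; returns a DataFrame of per-round metrics
cv_results = xgb.cv(params, dtrain, num_boost_round=100,
                    nfold=5, metrics='mlogloss', seed=42)
print('Final CV mlogloss:', cv_results['test-mlogloss-mean'].iloc[-1])
```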

source: ChatGPT