- Find the underlying probability distribution in a large amount of data, then make predictions based on that same distribution:
y = f(x)
- Deep learning works like learning a function relationship in reverse, i.e. reverse-engineering the function: you only know that the data follows some pattern, and you guess what the original function that generated it looks like. For example, you can train a calculator neural network this way.
- The idea of high-dimensional space: code is embedded into a high-dimensional space, where a very fine-grained high-dimensional classification separates it; search then also happens in that space. Code, for instance, can be parsed with tree-sitter and fed into training to learn its logical relationships. Most of NLP is a multi-class classification problem in high-dimensional space.
- Collect the inputs x and outputs y around you as training data, and mine their mapping relationship f(x) whenever you can. You can use GPT to generate data for your training needs, or write a crawler to fetch the data you need.
Deep learning is a subset of machine learning that uses artificial neural networks with multiple layers (deep neural networks) to progressively extract higher-level features from raw input. Here are the key components and concepts:
- Input Layer: Receives raw data and normalizes it for processing
- Hidden Layers: Multiple layers that transform data through weighted connections
- Output Layer: Produces the final prediction or output
- Activation Functions: Non-linear functions (ReLU, sigmoid, tanh) that help networks learn complex patterns
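A minimal PyTorch sketch of this layer structure and the activation functions (the layer sizes here are illustrative, not from the text):
import torch
import torch.nn as nn

# input layer -> two hidden layers with non-linear activations -> output layer
mlp = nn.Sequential(
    nn.Linear(4, 16),   # input layer: 4 raw features
    nn.ReLU(),          # non-linear activation
    nn.Linear(16, 16),  # hidden layer
    nn.Tanh(),          # another non-linear activation
    nn.Linear(16, 3),   # output layer: 3 classes or values
)
x = torch.randn(8, 4)   # a batch of 8 samples
print(mlp(x).shape)     # torch.Size([8, 3])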
- Backpropagation
- Algorithm for calculating gradients in neural networks
- Efficiently updates weights by propagating error backwards through the network
- Uses chain rule to compute partial derivatives
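A tiny autograd example of the chain rule that backpropagation relies on (the values are illustrative):
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2            # y = x^2
z = 3 * y + 1         # z = 3x^2 + 1
z.backward()          # propagate the gradient backwards through the graph
print(x.grad)         # dz/dx = dz/dy * dy/dx = 3 * 2x = 12.0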
- Gradient Descent Optimization
- Stochastic Gradient Descent (SGD)
- Mini-batch Gradient Descent
- Adaptive optimizers (Adam, RMSprop)
- Loss Functions
- Mean Squared Error (MSE) for regression
- Cross-Entropy Loss for classification
- Custom loss functions for specific tasks
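A quick sketch of these loss functions, including a toy custom loss (all tensors are illustrative):
import torch
import torch.nn as nn

pred = torch.randn(8, 1)                    # regression predictions
target = torch.randn(8, 1)
mse = nn.MSELoss()(pred, target)            # Mean Squared Error for regression

logits = torch.randn(8, 3)                  # scores for 3 classes
labels = torch.randint(0, 3, (8,))
ce = nn.CrossEntropyLoss()(logits, labels)  # Cross-Entropy for classification

def custom_loss(pred, target):
    # example custom loss: MSE plus a small L1 penalty on the predictions
    return ((pred - target) ** 2).mean() + 0.01 * pred.abs().mean()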
- Regularization Techniques
- Dropout: Randomly deactivates neurons during training
- L1/L2 Regularization: Adds penalty terms to prevent overfitting
- Batch Normalization: Normalizes layer inputs for stable training
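A short sketch of these regularization techniques in PyTorch (sizes and rates are illustrative):
import torch.nn as nn
import torch.optim as optim

net = nn.Sequential(
    nn.Linear(20, 64),
    nn.BatchNorm1d(64),   # Batch Normalization: normalizes layer inputs for stable training
    nn.ReLU(),
    nn.Dropout(p=0.5),    # Dropout: randomly deactivates neurons during training
    nn.Linear(64, 2),
)
# L2 regularization is applied through the optimizer's weight_decay term
optimizer = optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-4)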
- Convolutional Neural Networks (CNNs)
- Specialized for processing grid-like data (images)
- Key components: Convolutional layers, pooling layers, fully connected layers
- Applications: Image classification, object detection, segmentation
- Recurrent Neural Networks (RNNs)
- Process sequential data with memory of previous inputs
- Variants: LSTM, GRU for handling long-term dependencies
- Applications: Time series prediction, natural language processing
- Transformers
- State-of-the-art architecture for sequence processing
- Self-attention mechanism for capturing relationships
- Applications: Language models, machine translation, text generation
- Autoencoders
- Unsupervised learning for dimensionality reduction
- Encoder-decoder architecture
- Applications: Feature learning, denoising, anomaly detection
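Most of the topics above reappear as full examples below; the autoencoder does not, so here is a minimal encoder-decoder sketch (dimensions are illustrative):
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))
        self.decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))
    def forward(self, x):
        code = self.encoder(x)        # compressed low-dimensional representation
        return self.decoder(code)     # reconstruction of the input

model = AutoEncoder()
x = torch.randn(32, 784)              # e.g. flattened 28x28 images
loss = nn.MSELoss()(model(x), x)      # reconstruction loss for unsupervised training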
- Python & R Machine Learning
- R Machine Learning
- least squares method
- least squares method by neural network
- nonlinear fitting
- polar coordinate classification
- mnist ocr
- use mnist
- calculator neural network
- Data cleaning
- SVM
- kmeans
- Decision Tree Classifier
- Reinforcement Learning (DQN)
- Flappy bird dqn
- SGD
- CNN with Attention
- LSTM generator
- Transformer generator
conda create -n emacspy python=3.11
conda activate emacspy
poetry install
import numpy as np
import matplotlib.pyplot as plt
# Example data points
X = np.array([1, 2.2, 3, 4, 5])
y = np.array([2, 4, 6.3, 8, 11])
# Add a column of ones to X for the intercept term (bias)
X_b = np.c_[np.ones((X.shape[0], 1)), X] # X_b is X with a bias column
# Calculate the best fit line parameters using the Normal Equation
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
# Print the parameters (intercept and slope)
print(f"Intercept: {theta_best[0]}")
print(f"Slope: {theta_best[1]}")
# Predict values using the model
y_pred = X_b.dot(theta_best)
# Plot the data points and the best fit line
plt.scatter(X, y, color='blue', label='Data points')
plt.plot(X, y_pred, color='red', label='Best fit line')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
# Demonstrate how torch.optim.Adam optimizes a simple linear model and plot the training loss
# Define a simple linear model
class LinearModel(nn.Module):
def __init__(self):
super(LinearModel, self).__init__()
self.linear = nn.Linear(1, 1)
def forward(self, x):
return self.linear(x)
# Initialize the model, loss function, and optimizer
model = LinearModel()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
# Generate some synthetic data (y = 2x + 1 with some noise)
x_train = torch.linspace(-1, 1, 100).reshape(-1, 1)
y_train = 2 * x_train + 1 + 0.2 * torch.randn(x_train.size())
# List to store the loss values
loss_values = []
# Training loop
for epoch in range(1000):
model.train()
optimizer.zero_grad()
outputs = model(x_train)
loss = criterion(outputs, y_train)
loss.backward()
optimizer.step()
loss_values.append(loss.item())
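# Plot the collected loss values to visualize how Adam converges
plt.plot(loss_values)
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.title('Adam optimization of a linear model')
plt.show()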
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
# Step 1: Generate 100 noisy samples along a sine curve
n = 100
x = torch.linspace(1, 10, n).unsqueeze(1)
y = torch.sin(x) + torch.rand(n, 1) * 0.5
# Step 2: Define a simple neural network model for nonlinear fitting
class NonlinearModel(nn.Module):
def __init__(self):
super(NonlinearModel, self).__init__()
self.fc1 = nn.Linear(1, 10)
self.fc2 = nn.Linear(10, 10)
self.fc3 = nn.Linear(10, 1)
def forward(self, x):
x = torch.relu(self.fc1(x))
x = torch.relu(self.fc2(x))
x = self.fc3(x)
return x
model = NonlinearModel()
# Step 3: Define loss function and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# Step 4: Train the model
epochs = 1000
for epoch in range(epochs):
model.train()
# Forward pass
outputs = model(x)
loss = criterion(outputs, y)
# Backward pass and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()
if (epoch+1) % 100 == 0:
print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')
# Step 5: Plot the original data and the fitted curve
model.eval()
with torch.no_grad():
predicted = model(x).numpy()
plt.figure(figsize=(10, 5))
plt.plot(x.numpy(), y.numpy(), 'ro', label='Original data')
plt.plot(x.numpy(), predicted, 'b-', label='Fitted curve')
plt.legend()
plt.show()
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Helper function to convert Cartesian to Polar coordinates
def cartesian_to_polar(x, y, z):
r = torch.sqrt(x**2 + y**2 + z**2)
theta = torch.atan2(y, x)
phi = torch.acos(z / r)
return r, theta, phi
# Example data generation (replace with your actual data)
n_samples = 5000
x = torch.randn(n_samples)
y = torch.randn(n_samples)
z = torch.randn(n_samples)
labels = torch.randint(0, 4, (n_samples,)) # Four classes (0, 1, 2, 3)
# Convert to polar coordinates
r, theta, phi = cartesian_to_polar(x, y, z)
# Combine into a single tensor
data = torch.stack((r, theta, phi), dim=1)
# Create a Dataset and DataLoader
dataset = TensorDataset(data, labels)
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)
# Define a simple feedforward neural network
class PolarNet(nn.Module):
def __init__(self):
super(PolarNet, self).__init__()
self.fc1 = nn.Linear(3, 64)
self.fc2 = nn.Linear(64, 128)
self.fc3 = nn.Linear(128, 4) # Four output classes
def forward(self, x):
x = torch.relu(self.fc1(x))
x = torch.relu(self.fc2(x))
x = self.fc3(x)
return x
# Initialize the model, loss function, and optimizer
model = PolarNet()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Training loop
for epoch in range(20): # Number of epochs
for inputs, targets in train_loader:
# Forward pass
outputs = model(inputs)
loss = criterion(outputs, targets)
# Backward pass and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f'Epoch {epoch+1}/20, Loss: {loss.item()}')
# After training, evaluate the model on the entire dataset for visualization
with torch.no_grad():
predicted_labels = model(data).argmax(dim=1)
# Plotting the results in 3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Convert polar back to Cartesian for plotting
x_cartesian = r * torch.sin(phi) * torch.cos(theta)
y_cartesian = r * torch.sin(phi) * torch.sin(theta)
z_cartesian = r * torch.cos(phi)
# Plot the 3D scatter plot
scatter = ax.scatter(x_cartesian, y_cartesian, z_cartesian, c=predicted_labels, cmap='viridis', marker='o')
# Add color bar and labels
plt.colorbar(scatter, ax=ax)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.title('3D Visualization of PolarNet Classifications')
plt.show()
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
batch_size = 64
learning_rate = 0.01
epochs = 100
transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize((0.1307,), (0.3081,))
])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)
class Net(nn.Module):
def __init__(self):
super(Net, self).__init__()
self.fc1 = nn.Linear(28 * 28, 128)
self.fc2 = nn.Linear(128, 64)
self.fc3 = nn.Linear(64, 10)
def forward(self, x):
x = x.view(-1, 28 * 28)
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
x = self.fc3(x)
return x
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate)
for epoch in range(epochs):
model.train()
for batch_idx, (data, target) in enumerate(train_loader):
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
if batch_idx % 100 == 0:
print(f'Epoch: {epoch+1}/{epochs} [Batch: {batch_idx*len(data)}/{len(train_loader.dataset)}] Loss: {loss.item():.6f}')
model.eval()
test_loss = 0
correct = 0
with torch.no_grad():
for data, target in test_loader:
output = model(data)
test_loss += criterion(output, target).item()
pred = output.argmax(dim=1, keepdim=True)
correct += pred.eq(target.view_as(pred)).sum().item()
test_loss /= len(test_loader.dataset)
accuracy = 100. * correct / len(test_loader.dataset)
print(f'Test set: Average loss: {test_loss:.4f}, Accuracy: {correct}/{len(test_loader.dataset)} ({accuracy:.2f}%)')
torch.save(model.state_dict(), "mnist_model.pth")
model = Net()
### 3. Load the Trained Model Weights
model.load_state_dict(torch.load("mnist_model.pth"))
model.eval() # Set the model to evaluation mode
### 4. Prepare the Handwritten Input Image
# You need to preprocess the handwritten image to match the format of the MNIST dataset (28x28 pixels, grayscale).
from PIL import Image
def preprocess_image(image_path):
transform = transforms.Compose([
transforms.Grayscale(), # Ensure the image is grayscale
transforms.Resize((28, 28)), # Resize to 28x28 pixels
transforms.ToTensor(), # Convert to tensor
transforms.Normalize((0.1307,), (0.3081,)) # Normalize with the same mean and std as MNIST
])
image = Image.open(image_path)
image = transform(image).unsqueeze(0) # Add batch dimension
return image
### 5. Perform Inference
def recognize_digit(image_path):
image = preprocess_image(image_path)
with torch.no_grad():
output = model(image)
prediction = output.argmax(dim=1, keepdim=True)
return prediction.item()
# Example usage
image_path = 'path_to_your_handwritten_digit_image3.png'
predicted_digit = recognize_digit(image_path)
print(f'Predicted Digit: {predicted_digit}')
import torch
import torch.nn as nn
import torch.optim as optim
import random
import numpy as np
# Define the neural network architecture
class CalculatorNN(nn.Module):
def __init__(self):
super(CalculatorNN, self).__init__()
self.fc1 = nn.Linear(3, 128) # Input: 2 numbers + operation
self.fc2 = nn.Linear(128, 64)
self.fc3 = nn.Linear(64, 1) # Output: the result
def forward(self, x):
x = torch.relu(self.fc1(x))
x = torch.relu(self.fc2(x))
x = self.fc3(x)
return x
model = CalculatorNN()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
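# X_train / y_train are not defined above; a minimal synthetic-data sketch, assuming the
# operation is encoded as 0=add, 1=subtract, 2=multiply, 3=divide (2=multiply matches the
# usage example further below). Replace with your own data generation.
ops = {0: lambda a, b: a + b, 1: lambda a, b: a - b, 2: lambda a, b: a * b, 3: lambda a, b: a / b}
samples = []
for _ in range(10000):
    a, b = random.uniform(1, 100), random.uniform(1, 100)
    op = random.randint(0, 3)
    samples.append(([a, b, op], ops[op](a, b)))
X_train = torch.tensor([s[0] for s in samples], dtype=torch.float32)
y_train = torch.tensor([[s[1]] for s in samples], dtype=torch.float32)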
# Training loop
num_epochs = 50000  # with only 5000 epochs the loss is still too large
for epoch in range(num_epochs):
model.train()
# Forward pass
predictions = model(X_train)
loss = criterion(predictions, y_train)
# Backward pass and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()
if (epoch + 1) % 10 == 0:
print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')
torch.save(model.state_dict(), 'calculator_model.pth')
# ---- Load the trained model and run a prediction
model = CalculatorNN()
model.load_state_dict(torch.load('calculator_model.pth'))
model.eval()
# Perform the prediction
with torch.no_grad():
# Prepare the input (32 * 3)
input_data = torch.tensor([[32.0, 3.0, 2]], dtype=torch.float32) # 2 corresponds to multiplication
prediction = model(input_data)
print(f'Prediction for 32 * 3: {prediction.item():.4f}')
import re
import os
import glob
import shutil
## Split a log file into segments wherever a regex pattern matches (e.g. one segment per run)
def split_log_file(input_file, split_pattern, output_pattern):
with open(input_file, 'r') as file:
log_content = file.read()
pattern = re.compile(split_pattern)
split_points = [match.start() for match in re.finditer(pattern, log_content)]
split_points.append(len(log_content))
for i in range(len(split_points) - 1):
start = split_points[i]
end = split_points[i + 1]
segment = log_content[start:end]
match = pattern.search(segment)
if match:
number = match.group(1)
output_file = output_pattern.format(number=number)
with open(output_file, 'w') as file:
file.write(segment)
print(f"Segment saved as {output_file}")
## Move log files that match any of the given patterns to a destination directory
def move_patterns_logs(destination_path, patterns):
current_directory = os.getcwd()
log_files = glob.glob("*.log")
for log_file in log_files:
with open(log_file, 'r') as file:
if any(re.search(pattern, line) for pattern in patterns for line in file):
shutil.move(os.path.join(current_directory, log_file), destination_path)
break
## Filter a log file for display / data visualization by excluding lines that contain given keywords
def filter_log_file(log_file_path, exclude_keywords):
with open(log_file_path, "r") as file:
lines = file.readlines()
filtered_lines = [line for line in lines if not any(keyword in line for keyword in exclude_keywords)]
for line in filtered_lines:
print(line, end="")
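# Example usage of the three helpers above (file names and regex patterns are illustrative placeholders):
# split_log_file('full.log', r'=== run (\d+) ===', 'run_{number}.log')
# move_patterns_logs('/tmp/error_logs', [r'ERROR', r'Traceback'])
# filter_log_file('full.log', ['DEBUG', 'heartbeat'])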
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
X, y = make_classification(n_samples=100, n_features=3, n_informative=3, n_redundant=0, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
model = SVC(kernel='linear')
model.fit(X_train, y_train)
def plot_svm_decision_boundary_3d(model, X, y):
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
# Plot the training points
scatter = ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=y, s=30, cmap=plt.cm.coolwarm)
# Create grid to evaluate model (this defines the 3D space)
xlim = ax.get_xlim()
ylim = ax.get_ylim()
zlim = ax.get_zlim()
xx = np.linspace(xlim[0], xlim[1], 20)
yy = np.linspace(ylim[0], ylim[1], 20)
zz = np.linspace(zlim[0], zlim[1], 20)
# Create a meshgrid to evaluate the decision function
YY, ZZ = np.meshgrid(yy, zz)
    # Solve w0*x + w1*y + w2*z + b = 0 for x (feature 1, plotted on the x-axis)
    XX = -(model.coef_[0][1] * YY + model.coef_[0][2] * ZZ + model.intercept_[0]) / model.coef_[0][0]
# Plot the decision surface
ax.plot_surface(XX, YY, ZZ, color='gray', alpha=0.3, rstride=100, cstride=100)
# Highlight support vectors
ax.scatter(model.support_vectors_[:, 0], model.support_vectors_[:, 1], model.support_vectors_[:, 2],
s=100, facecolors='none', edgecolors='k', linewidth=1.5, label='Support Vectors')
ax.set_title('SVM Decision Boundary in 3D')
ax.set_xlabel('Feature 1')
ax.set_ylabel('Feature 2')
ax.set_zlabel('Feature 3')
# Add color legend
legend1 = ax.legend(*scatter.legend_elements(), loc="best", title="Classes")
ax.add_artist(legend1)
plt.show()
plot_svm_decision_boundary_3d(model, X_train, y_train)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
def cluster_error_messages(error_messages, num_clusters=5):
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(error_messages)
kmeans = KMeans(n_clusters=num_clusters, random_state=0)
kmeans.fit(X)
labels = kmeans.labels_
clustered_errors = {}
for i, label in enumerate(labels):
if label not in clustered_errors:
clustered_errors[label] = []
clustered_errors[label].append(error_messages[i])
return clustered_errors
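# Example usage with a few illustrative error messages:
sample_errors = [
    "connection timed out while reaching db",
    "timeout: could not reach database",
    "null pointer exception in parser",
    "parser crashed with null reference",
]
clusters = cluster_error_messages(sample_errors, num_clusters=2)
for label, msgs in clusters.items():
    print(label, msgs)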
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn import metrics
iris = load_iris()
X = iris.data # Features
y = iris.target # Labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
accuracy = metrics.accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")
plt.figure(figsize=(12,8))
plot_tree(clf, feature_names=iris.feature_names, class_names=iris.target_names, filled=True)
plt.show()
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import numpy as np
import random
# Define a simple fully connected neural network
class DQN(nn.Module):
def __init__(self, input_dim, output_dim):
super(DQN, self).__init__()
self.fc1 = nn.Linear(input_dim, 128)
self.fc2 = nn.Linear(128, 128)
self.fc3 = nn.Linear(128, output_dim)
def forward(self, x):
x = F.relu(self.fc1(x))
x = F.relu(self.fc2(x))
return self.fc3(x)
# ### 3. **Initialize the environment and model:**
import gymnasium as gym
import torch
env = gym.make("LunarLander-v2", render_mode="human")
state_dim = env.observation_space.shape[0]
action_dim = env.action_space.n
# Create the DQN model
model = DQN(input_dim=state_dim, output_dim=action_dim)
# ### 4. **Define the training loop:**
# In this section, we'll define how the agent interacts with the environment, how rewards are collected, and how the model is updated.
# Parameters
learning_rate = 0.001
gamma = 0.99 # Discount factor
epsilon = 1.0 # Exploration rate
epsilon_decay = 0.995
epsilon_min = 0.01
episodes = 500
# Optimizer
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Function to choose action (using epsilon-greedy policy)
def choose_action(state, epsilon):
if np.random.rand() <= epsilon:
return np.random.choice(action_dim) # Random action
state = torch.FloatTensor(state).unsqueeze(0)
with torch.no_grad():
q_values = model(state)
return torch.argmax(q_values).item()
# Function to train the model
def train_model(memory, batch_size=64):
if len(memory) < batch_size:
return
# Randomly sample a batch from memory
batch = random.sample(memory, batch_size)
# Extract states, actions, rewards, next_states, and dones from the batch
states, actions, rewards, next_states, dones = zip(*batch)
# Convert them to tensors
states = torch.FloatTensor(states)
actions = torch.LongTensor(actions)
rewards = torch.FloatTensor(rewards)
next_states = torch.FloatTensor(next_states)
dones = torch.FloatTensor(dones)
# Compute Q values for the current states
q_values = model(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Compute the maximum Q values for the next states (detached so gradients do not flow through the target)
    next_q_values = model(next_states).max(1)[0].detach()
# Compute the target Q values
q_targets = rewards + (1 - dones) * gamma * next_q_values
# Compute the loss
loss = F.mse_loss(q_values, q_targets)
# Optimize the model
optimizer.zero_grad()
loss.backward()
optimizer.step()
# Main loop
memory = []
for episode in range(episodes):
state = env.reset()[0]
total_reward = 0
for t in range(1000):
action = choose_action(state, epsilon)
next_state, reward, done, truncated, _ = env.step(action)
memory.append((state, action, reward, next_state, done))
train_model(memory)
state = next_state
total_reward += reward
if done or truncated:
break
epsilon = max(epsilon_min, epsilon * epsilon_decay)
print(f"Episode {episode + 1}, Total Reward: {total_reward}")
env.close()
import gymnasium as gym
import numpy as np
import pygame
from gymnasium import spaces
import torch
import torch.nn as nn
import torch.optim as optim
import random
from collections import deque
import time
from flappy_bird_cl3_pass_env_to_nn_3 import FlappyBirdEnv
class DQN(nn.Module):
def __init__(self, input_size, n_actions):
super(DQN, self).__init__()
self.fc = nn.Sequential(
nn.Linear(input_size, 64),
nn.ReLU(),
nn.Linear(64, 64),
nn.ReLU(),
nn.Linear(64, n_actions)
)
def forward(self, x):
return self.fc(x)
class DQNAgent:
def __init__(self, env, learning_rate=1e-3, gamma=0.99, epsilon_start=1.0, epsilon_final=0.01, epsilon_decay=0.995):
self.env = env
self.n_actions = env.action_space.n
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
self.epsilon = epsilon_start
self.epsilon_final = epsilon_final
self.epsilon_decay = epsilon_decay
self.memory = deque(maxlen=10000)
self.batch_size = 64
state_size = len(env.get_state())
self.model = DQN(state_size, self.n_actions).to(self.device)
self.optimizer = optim.Adam(self.model.parameters(), lr=learning_rate)
self.criterion = nn.MSELoss()
self.gamma = gamma
def get_action(self, state):
if random.random() < self.epsilon:
return random.randint(0, self.n_actions - 1)
with torch.no_grad():
state = torch.FloatTensor(state).unsqueeze(0).to(self.device)
q_values = self.model(state)
return torch.argmax(q_values).item()
def update_epsilon(self):
self.epsilon = max(self.epsilon_final, self.epsilon * self.epsilon_decay)
def remember(self, state, action, reward, next_state, done):
self.memory.append((state, action, reward, next_state, done))
def train(self):
if len(self.memory) < self.batch_size:
return
batch = random.sample(self.memory, self.batch_size)
states, actions, rewards, next_states, dones = zip(*batch)
states = torch.FloatTensor(states).to(self.device)
actions = torch.LongTensor(actions).to(self.device)
rewards = torch.FloatTensor(rewards).to(self.device)
next_states = torch.FloatTensor(next_states).to(self.device)
dones = torch.FloatTensor(dones).to(self.device)
current_q_values = self.model(states).gather(1, actions.unsqueeze(1))
with torch.no_grad():
next_q_values = self.model(next_states).max(1)[0]
target_q_values = rewards + (1 - dones) * self.gamma * next_q_values
loss = self.criterion(current_q_values.squeeze(), target_q_values)
self.optimizer.zero_grad()
loss.backward()
self.optimizer.step()
def train_dqn(env, episodes=2000, max_steps=1000, render_interval=10):
agent = DQNAgent(env)
scores = []
for episode in range(episodes):
state = env.reset()
score = 0
for step in range(max_steps):
if episode % render_interval == 0:
env.render()
action = agent.get_action(state)
next_state, reward, done, _, _ = env.step(action)
agent.remember(state, action, reward, next_state, done)
agent.train()
state = next_state
score += reward
if done:
break
if episode % render_interval == 0:
pygame.event.pump()
agent.update_epsilon()
scores.append(score)
if episode % 10 == 0:
print(f"Episode: {episode}, Score: {score}, Epsilon: {agent.epsilon:.2f}")
return agent, scores
if __name__ == "__main__":
env = FlappyBirdEnv()
agent, scores = train_dqn(env, episodes=6000, render_interval=50)
# Test the trained agent
state = env.reset()
done = False
score = 0
while not done:
env.render()
action = agent.get_action(state)
next_state, reward, done, _, _ = env.step(action)
state = next_state
score += reward
for event in pygame.event.get():
if event.type == pygame.QUIT:
done = True
pygame.event.pump()
time.sleep(0.03)
print(f"Final Score: {score}")
env.close()
import torch
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.animation import FuncAnimation
# Random 3D surface (loss function)
def loss_function(x, y):
return torch.sin(x) * torch.cos(y) + 0.1 * (x**2 + y**2)
# Generate a meshgrid for plotting the surface
x = torch.linspace(-5, 5, 100)
y = torch.linspace(-5, 5, 100)
X, Y = torch.meshgrid(x, y, indexing='ij')  # explicit indexing avoids the meshgrid deprecation warning
Z = loss_function(X, Y).detach().numpy()
# Initialize figure and 3D axis for animation
fig = plt.figure(figsize=(10, 7))
ax = fig.add_subplot(111, projection='3d')
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
ax.set_title('SGD Optimization Path on 3D Surface')
# Plot the static 3D surface
ax.plot_surface(X.numpy(), Y.numpy(), Z, cmap='viridis', alpha=0.7)
# SGD starting point
start_point = torch.tensor([4.0, 4.0], requires_grad=True)
# Hyperparameters
learning_rate = 0.1
optimizer = torch.optim.SGD([start_point], lr=learning_rate)
# Number of steps and animation frames
steps = 10
path = np.zeros((steps, 3))
# Plotting the initial point on the surface
point_plot, = ax.plot([], [], [], color='r', marker='o', markersize=5)
# Function to update the frame during animation
def update(i):
global start_point
optimizer.zero_grad()
# Calculate the loss (z value)
loss = loss_function(start_point[0], start_point[1])
# Backpropagation to compute gradients
loss.backward()
# Perform optimization step
optimizer.step()
# Store the (x, y, z) values
path[i, 0] = start_point[0].item()
path[i, 1] = start_point[1].item()
path[i, 2] = loss.item()
# Update point on the surface
point_plot.set_data(path[:i+1, 0], path[:i+1, 1])
point_plot.set_3d_properties(path[:i+1, 2])
return point_plot,
# Animate SGD for 10 steps
ani = FuncAnimation(fig, update, frames=steps, interval=500, blit=True)
# Show the animation
plt.show()
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
class Attention(nn.Module):
def __init__(self, in_channels, out_channels):
super(Attention, self).__init__()
self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.softmax = nn.Softmax(dim=1)  # softmax over the channel dimension so the attention weights sum to 1
def forward(self, x):
# Global feature extraction
global_features = torch.mean(x, dim=(2, 3), keepdim=True)
attention_map = self.conv(global_features)
attention_map = self.softmax(attention_map)
out = x * attention_map
return out
class CNNWithAttention(nn.Module):
def __init__(self):
super(CNNWithAttention, self).__init__()
# Convolutional layers
self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
self.pool = nn.MaxPool2d(2, 2)
# Attention layer
self.attention = Attention(64, 64)
# Fully connected layers
self.fc1 = nn.Linear(64 * 8 * 8, 512)
self.fc2 = nn.Linear(512, 10)
def forward(self, x):
x = self.pool(F.relu(self.conv1(x)))
x = self.pool(F.relu(self.conv2(x)))
# Attention mechanism
x = self.attention(x)
x = x.view(-1, 64 * 8 * 8)
x = F.relu(self.fc1(x))
x = self.fc2(x)
return x
# --
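# `trainloader` is not defined above. A minimal sketch using CIFAR-10, which matches the
# network's assumptions (3-channel 32x32 images -> 8x8 feature maps after two 2x2 poolings,
# 10 output classes); adjust to your own dataset.
import torchvision
import torchvision.transforms as transforms
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)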
# Initialize the model, loss function, and optimizer
model = CNNWithAttention()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Training loop
for epoch in range(5): # Train for 5 epochs
running_loss = 0.0
for inputs, labels in trainloader:
# Zero the parameter gradients
optimizer.zero_grad()
# Forward pass
outputs = model(inputs)
loss = criterion(outputs, labels)
# Backward pass and optimize
loss.backward()
optimizer.step()
running_loss += loss.item()
print(f"Epoch [{epoch + 1}/5], Loss: {running_loss / len(trainloader)}")
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
class Vocab:
def __init__(self, stoi, itos):
self.stoi = stoi
self.itos = itos
# Provided corpus (AI history)
corpus = """
The history of artificial intelligence (AI) began in antiquity, with myths, stories and rumors of artificial beings endowed with intelligence or consciousness by master craftsmen.
... ...
"""
# Simple tokenization (splitting by spaces)
corpus = corpus.replace("\n", " ") # Remove newlines
# Tokenization can be improved using libraries like nltk or spacy, but we'll use a simple split here
tokens = corpus.split()
# You can build a vocabulary from this corpus as you did before, for instance:
from collections import Counter
# Create a vocabulary from the corpus
token_counts = Counter(tokens)
vocab_stoi = {token: idx for idx, (token, count) in enumerate(token_counts.items())}
vocab_itos = {idx: token for token, idx in vocab_stoi.items()}
# Create the Vocab object
vocab = Vocab(stoi=vocab_stoi, itos=vocab_itos)
class RNNModel(nn.Module):
def __init__(self, vocab_size, embed_size, hidden_size, num_layers):
super(RNNModel, self).__init__()
self.num_layers = num_layers
self.hidden_size = hidden_size
self.embedding = nn.Embedding(vocab_size, embed_size)
self.rnn = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True)
self.fc = nn.Linear(hidden_size, vocab_size)
def forward(self, x, hidden):
x = self.embedding(x)
out, hidden = self.rnn(x, hidden)
out = self.fc(out)
return out, hidden
def init_hidden(self, batch_size):
# Initialize hidden states (h_0) and cell states (c_0) with correct batch size
weight = next(self.parameters()).data
return (weight.new_zeros(self.num_layers, batch_size, self.hidden_size),
weight.new_zeros(self.num_layers, batch_size, self.hidden_size))
class TextDataset(Dataset):
def __init__(self, text, vocab, sequence_length):
self.vocab = vocab
self.sequence_length = sequence_length
self.data = self.tokenize_and_encode(text)
def tokenize_and_encode(self, text):
tokens = text.split() # Simple tokenization (split by spaces)
return [self.vocab.stoi[token] for token in tokens if token in self.vocab.stoi]
def __len__(self):
return len(self.data) - self.sequence_length
def __getitem__(self, idx):
x = self.data[idx:idx + self.sequence_length]
y = self.data[idx + 1:idx + 1 + self.sequence_length]
return torch.tensor(x, dtype=torch.long), torch.tensor(y, dtype=torch.long)
# Define sequence length and batch size
sequence_length = 10 # Can be tuned
batch_size = 100
# Create the dataset and dataloader
dataset = TextDataset(corpus, vocab, sequence_length)
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
# Now you're ready to train the model using the provided corpus
# Define model, loss function, and optimizer
vocab_size = len(vocab.stoi)
embed_size = 50 # Adjust as needed
hidden_size = 100 # Adjust as needed
num_layers = 2
num_epochs = 100 # Adjust based on performance
learning_rate = 0.001
model = RNNModel(vocab_size, embed_size, hidden_size, num_layers)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
# Training loop
for epoch in range(num_epochs):
for batch in train_loader:
inputs, targets = batch
batch_size = inputs.size(0) # Get the actual batch size for this iteration
hidden = model.init_hidden(batch_size) # Initialize hidden state with correct batch size
outputs, hidden = model(inputs, hidden)
loss = criterion(outputs.view(-1, vocab_size), targets.view(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f'Epoch {epoch+1}, Loss: {loss.item()}')
torch.save(model.state_dict(), 'rnn_model_ai.pth')
def generate_text(model, start_text, max_length=100):
model.eval()
hidden = model.init_hidden(1) # Start with batch size 1
input = torch.tensor([[vocab.stoi[start_text]]]) # Convert start_text to input tensor
result = [start_text]
for _ in range(max_length):
output, hidden = model(input, hidden)
prob = nn.functional.softmax(output[0, -1], dim=0).data
next_word = torch.multinomial(prob, 1).item()
result.append(vocab.itos[next_word]) # Convert back to word using vocab
input = torch.tensor([[next_word]]) # Feed the next word as input
return ' '.join(result)
start_text = 'AI' # The starting word
generated_text = generate_text(model, start_text, max_length=100)
print(generated_text)
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import math
class Vocab:
def __init__(self, stoi, itos):
self.stoi = stoi
self.itos = itos
corpus = """
The history of artificial intelligence (AI) began in antiquity, with myths, stories and rumors of artificial beings endowed with intelligence or consciousness by master craftsmen.
...
"""
corpus = corpus.replace("\n", " ")
tokens = corpus.split()
from collections import Counter
token_counts = Counter(tokens)
vocab_stoi = {token: idx for idx, (token, count) in enumerate(token_counts.items())}
vocab_itos = {idx: token for token, idx in vocab_stoi.items()}
vocab = Vocab(stoi=vocab_stoi, itos=vocab_itos)
class PositionalEncoding(nn.Module):
def __init__(self, embed_size, max_len=5000):
super(PositionalEncoding, self).__init__()
self.encoding = torch.zeros(max_len, embed_size)
position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
div_term = torch.exp(torch.arange(0, embed_size, 2).float() * (-math.log(10000.0) / embed_size))
self.encoding[:, 0::2] = torch.sin(position * div_term)
self.encoding[:, 1::2] = torch.cos(position * div_term)
self.encoding = self.encoding.unsqueeze(0)
def forward(self, x):
return x + self.encoding[:, :x.size(1), :].to(x.device)
class TransformerModel(nn.Module):
def __init__(self, vocab_size, embed_size, num_heads, hidden_size, num_layers, dropout=0.1):
super(TransformerModel, self).__init__()
self.embedding = nn.Embedding(vocab_size, embed_size)
self.pos_encoder = PositionalEncoding(embed_size)
encoder_layers = nn.TransformerEncoderLayer(embed_size, num_heads, hidden_size, dropout)
self.transformer = nn.TransformerEncoder(encoder_layers, num_layers)
self.fc = nn.Linear(embed_size, vocab_size)
def forward(self, src, src_mask=None):
src = self.embedding(src) * math.sqrt(src.size(-1)) # scale by sqrt(embed_size)
src = self.pos_encoder(src)
output = self.transformer(src, src_mask)
output = self.fc(output)
return output
class TextDataset(Dataset):
def __init__(self, text, vocab, sequence_length):
self.vocab = vocab
self.sequence_length = sequence_length
self.data = self.tokenize_and_encode(text)
def tokenize_and_encode(self, text):
tokens = text.split()
return [self.vocab.stoi[token] for token in tokens if token in self.vocab.stoi]
def __len__(self):
return len(self.data) - self.sequence_length
def __getitem__(self, idx):
x = self.data[idx:idx + self.sequence_length]
y = self.data[idx + 1:idx + 1 + self.sequence_length]
return torch.tensor(x, dtype=torch.long), torch.tensor(y, dtype=torch.long)
sequence_length = 10
batch_size = 100
dataset = TextDataset(corpus, vocab, sequence_length)
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
vocab_size = len(vocab.stoi)
embed_size = 50 # Can be tuned
num_heads = 2 # Number of attention heads
hidden_size = 100 # Hidden layer size in feedforward network
num_layers = 2    # Number of Transformer encoder layers (a small value is enough for this toy corpus)
dropout = 0.1
num_epochs = 100 # Adjust based on performance
learning_rate = 0.001
model = TransformerModel(vocab_size, embed_size, num_heads, hidden_size, num_layers, dropout)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
for epoch in range(num_epochs):
for batch in train_loader:
inputs, targets = batch
inputs = inputs.permute(1, 0) # (batch_size, sequence_length) -> (sequence_length, batch_size)
targets = targets.permute(1, 0)
outputs = model(inputs)
# Instead of view(), use reshape()
loss = criterion(outputs.reshape(-1, vocab_size), targets.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f'Epoch {epoch+1}, Loss: {loss.item()}')
torch.save(model.state_dict(), 'transformer_model_ai.pth')
def generate_text(model, start_text, max_length=100):
model.eval()
input = torch.tensor([[vocab.stoi[start_text]]]).permute(1, 0) # Convert start_text to input tensor
result = [start_text]
for _ in range(max_length):
output = model(input)
prob = nn.functional.softmax(output[-1, 0], dim=0).data
next_word = torch.multinomial(prob, 1).item()
result.append(vocab.itos[next_word])
input = torch.cat([input, torch.tensor([[next_word]])], dim=0)
return ' '.join(result)
start_text = 'AI'
generated_text = generate_text(model, start_text, max_length=100)
print(generated_text)