Jump To
Check Out The AI Arena Core Intuition Series
AI Arena Crash Course
Chapter Bookmarks
Overview
Game Action Set
Data Collection
The Inspector
- AI Inspector: What it is and why it matters.
- Your AI's Policy: Discovering the probability of your AI taking a specific action in its state.
- Understanding your AI's Policy
- How can I discover what my AI's most probable action is?
- How can I discover and manipulate what my AI will do in different scenarios?: Map toggling and Customization.
- How can I select different maps?
- How can I toggle my AI's lives?
- Toggling your AI's and your opponent's Elemental Special Power Up
- Toggling your opponent's configurations
- Training your AI to react to Elemental moves: How to add Projectiles
- How to analyze your AI's/your opponent's previous actions: Using the Action Selector
- How can I start the AI Inspector tour in the game?
Settings
Training Configuration
- How to tell your AI how much of its training to remember: Training Intensity.
- How to tell your AI what to focus on during training: Focus Area.
- How to remove idling data from training with Remove Sparsity: Data Cleaning.
- Balancing your Data Set with Oversampling and Multi Stream: Data Cleaning
- Guppy's demonstration of a Training Configuration Combo
- Analyzing your AI's policy based on your Training Configuration.
Game Updates
- How to switch fighters mid-battle in Data Collection
- How to record an action sequence for your opponent: Targeted Training in Data Collection
- Simple Configuration Overview
- Stacking multiple training sessions Risk Free with Quick Save and Finalize
- How to restore your model to ranked-battle-model after training
Git Gud with Guppy
Episode 1
This video discusses the objective of machine learning, which is to change the value of a set of parameters to alter what the AI is going to do, and minimize the loss between what the AI is shown and what it actually does.
Highlights
- The objective of machine learning is to change the parameter values to alter what the AI is doing.
- The goal is to minimize the loss function between what the AI is shown and what it does.
- Data Collection and Configuration are where you can achieve different results depending on your objective.
- The parameter updates are a series of small steps down the loss curve until the minimum is reached.
- Multiple iterations may be required to reach the minimum.
- Objective of machine learning: adjust parameters (0m3s)
- AI agent performing a task (0m7s)
- Gradient descent (8m32s)
- Strategies to escape local optima (13m54s)
- Training AI on a variety of tasks (36m24s)
- Fixing imbalanced data sets (21m29s)
- Early stopping, overfitting, and missing important data (31m30s)
- Batch size and updates per epoch (45m54s)
- Data Collection is more important than tuning hyperparameters (53m51s)
- "Garbage in, garbage out" concept (57m1s)
Episode 2
Summary
In this episode of AI Arena's G³ Git Gud with Guppy, the host teaches viewers how to recover an AI agent back to the platform and how to incrementally build skills. The episode also explains why machine learning models forget things and how to fix this.
Highlights
- The episode teaches viewers how to recover their AI agent back to the platform.
- The host explains the concept of random initialization and how it impacts the model's intelligence.
- Viewers learn how to incrementally build their AI agent's skills.
- The episode offers tips on training movement by selecting an active opponent in Data Collection.
- The host shows viewers how to balance retaining previous memories and learning new skills for your AI.
- To optimize recovery jumps to the platform, Data Collection should be turned on only while in the air (9m21s).
- Moderate training is important to build skills incrementally (10m22s).
- The AI needs to use raycasts to understand its environment (20m1s).
- An active opponent (not the dummy) should sometimes be used to help the AI train to recover after being knocked far away from the stage (5m9s).
- Players can adjust distance and angle to better train their AI using the Action Recorder (46m7s).
- Lambda can be dialed to find the right balance between remembering old information and learning new skills (47m33s, 54m49s).
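How the lambda dial trades off old memories against new skills is not spelled out here, but one common scheme is to penalize movement away from the previous model's weights. The sketch below is an illustrative assumption (an L2 anchor on the old parameters), not AI Arena's documented implementation:

```python
def update(w, w_old, grad_new, lam, lr=0.1):
    """One gradient step on: new_task_loss + lam * (w - w_old)**2.

    lam = 0   -> learn the new skill freely (risking forgetting);
    large lam -> stay close to the old weights (retain old memories).
    NOTE: this L2-anchor scheme is an assumption for illustration,
    not AI Arena's actual lambda mechanism.
    """
    grad = grad_new + 2 * lam * (w - w_old)
    return w - lr * grad

# With lam = 0 the weight moves freely on the new task's gradient;
# with lam > 0 it is pulled back toward the old weight.
w_free = update(w=1.0, w_old=1.0, grad_new=4.0, lam=0.0)
w_anchored = update(w=2.0, w_old=1.0, grad_new=0.0, lam=1.0)
```

Dialing lambda up therefore biases every update toward the previous model, which is the "remember old information" end of the trade-off.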
Episode 3
Summary
In this episode of AI ARENA - G³ Git Gud with Guppy, the topic of discussion is combos and conditional probability of taking an action based on the current action. The concept of properly using the shield in the game is also covered. The episode shows how to train the AI model to perform a specific combo by selecting the appropriate features.
Highlights
- The episode talks about combos and the conditional probability of actions in the game.
- Proper use of the shield and the concept of conditional probability in AI training are explained.
- The episode shows how to train an AI model to perform a specific combo by selecting the appropriate features.
- Guppy explains how to use conditional probability to increase the probability of holding the shield button. ref 1:02
- Action representation involves selecting a subset of features for the AI to focus on. ref 11:50
- Using the action recorder to train the AI in specific situations. ref 23:39
- Selecting fewer features reduces the probability of spurious correlation. ref 40:48
- Focusing on weight can help train differently depending on the weight of the opponent. ref 45:42
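The "conditional probability of taking an action based on the current action" idea can be made concrete with a small counting sketch. This is an illustration only: the action names are made up, not the game's real action set.

```python
from collections import Counter, defaultdict

def conditional_probs(action_log):
    """Estimate P(next action | current action) from a log of actions.

    A combo is a chain of actions where each action has a high
    conditional probability given the previous one. (Illustrative
    sketch; "jab"/"shield" are hypothetical action names.)
    """
    counts = defaultdict(Counter)
    for cur, nxt in zip(action_log, action_log[1:]):
        counts[cur][nxt] += 1
    return {
        cur: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for cur, c in counts.items()
    }

log = ["jab", "jab", "shield", "jab", "shield", "shield"]
probs = conditional_probs(log)
# After "jab", the log shows "shield" twice and "jab" once,
# so P(shield | jab) comes out higher than P(jab | jab).
```

Feeding training data where "shield" consistently follows a given action is what raises the conditional probability of holding the shield button in that situation.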
Episode 4
Summary
In this video, Guppy explains the concept of interpolation in AI, where an AI is shown what to do in two different scenarios, and it figures out what to do in between those scenarios.
Highlights
- Guppy shows how to use interpolation to train an AI efficiently.
- Guppy explains how the AI is able to learn what to do in between the two scenarios shown.
- Guppy uses a simplified example and some math to demonstrate how the concept of interpolation works.
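The interpolation idea from this episode can be shown with simplified math, in the spirit of Guppy's example: demonstrate the behavior at two scenarios and let the model fill in the gap. This toy linear version stands in for what a network does; the distances and probabilities below are hypothetical.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation: shown what to do at x0 and at x1,
    infer the in-between behavior at any x between them.

    A toy stand-in for the episode's idea -- a model shown two
    scenarios generalizes to the scenarios in between. (Simplified
    math, not the game's actual model.)
    """
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

# Shown: at distance 0 hold shield with prob 0.9, at distance 10 with 0.1.
# The never-demonstrated scenario at distance 5 comes out halfway.
p_mid = interpolate(5, 0, 0.9, 10, 0.1)
```

Because the in-between cases come "for free", two well-chosen demonstrations can cover a whole range of scenarios, which is why interpolation makes training efficient.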
Episode 5
Summary
In this video, Guppy discusses the pros and cons of different weights in a platform fighting game. He emphasizes that every weight class is viable but requires different strategies. He demonstrates how lighter weights have better mobility and recovery, while heavier weights have more knockback power. He also explains the importance of positioning and combo potential for lighter fighters.
Highlights
- Lighter weights offer better mobility and recovery in the game.
- Heavier weights have more knockback power but struggle with combo potential.
- Positioning plays a crucial role, and lighter fighters should stay on the inside of their opponents.
- 0023 - Difference in weight and viability of different weight classes.
Episode 6
Summary
This video tutorial covers two topics: effectively using shields and charging attacks in an AI Arena game. Guppy explains the changes made to the game mechanics and demonstrates how to train the AI to perform these actions. He also addresses some bugs that affected shield usage and explains the concept of conditional probability in training the AI.
Highlights
- The shield mechanic in the game has been improved, allowing players to use shields effectively by separating the shield action from directional inputs.
- The charging attack, specifically the headbutt move, is now trainable in the AI. The speaker demonstrates how to train the AI to perform headbutts effectively.
- Future updates will include a timer system for shields and generalization of charging attacks to accommodate new moves.
- 0000 - Introduction and explanation of the shield execution improvement
- 0520 - Discussion on the bug affecting charging headbutts and its resolution
- 1140 - Explanation of the neural network setup and its benefits
- 1810 - Future plans for implementing a shield health timer and generalizing charging attacks
- 2150 - Conclusion and invitation for questions
Episode 7
Summary
In this video, the presenter showcases some updates made to a game, including the ability to pick a banner for display, and introduces the concept of a perfect shield that can counter attacks or stun opponents. They demonstrate how to use the perfect shield effectively. The presenter also discusses training AI to recover from underneath a platform, explaining the importance of recognizing platform positioning and collecting data for training. They answer questions from viewers regarding sparsity, recovering from airborne attacks, and platform positioning.
Highlights
- Updates to the game include the ability to pick a banner for display and the introduction of a perfect shield.
- The perfect shield can bounce back projectiles or stun opponents when timed correctly.
- Training AI to recover from underneath a platform involves collecting data based on platform positioning and using oversampling for better results.
- [0047] Introduction to HUD banner customization and perfect shield mechanics.
- [0417] Demonstrating the use of perfect shield against projectiles and hand-to-hand combat.
- [0817] Training AI to recover from underneath a platform by recognizing empty spaces.
- [1446] Explanation of using X and Y coordinates for positioning and potential limitations.
- [1910] The importance of generalization in AI training and future game updates.
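The oversampling mentioned for under-the-platform recovery training addresses a general problem: rare situations contribute few samples, so the model underweights them. A minimal sketch of the idea, assuming a simple duplicate-the-minority scheme (not AI Arena's actual implementation; the labels are hypothetical):

```python
import random

def oversample(samples, labels, seed=0):
    """Balance a data set by resampling minority classes up to the
    majority class size. Generic sketch of the oversampling data
    cleaning idea, not AI Arena's implementation.
    """
    rng = random.Random(seed)
    by_label = {}
    for s, lbl in zip(samples, labels):
        by_label.setdefault(lbl, []).append(s)
    target = max(len(group) for group in by_label.values())
    out = []
    for lbl, group in by_label.items():
        out += [(s, lbl) for s in group]                          # originals
        out += [(rng.choice(group), lbl)                          # duplicates
                for _ in range(target - len(group))]
    return out

# 3 common "idle" frames vs 1 rare "recover" frame ->
# after oversampling, both classes appear 3 times.
data = oversample(["i1", "i2", "i3", "r1"],
                  ["idle", "idle", "idle", "recover"])
```

Duplicating the rare recovery frames makes the training loss pay as much attention to them as to the abundant idle frames, at the cost of repeating the same few examples.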
Episode 8
Summary
Guppy demonstrates the process of training an AI fighter to use shields effectively in a game. He starts with an already-trained AI and assesses its performance, focusing on shield usage. He then explains his method of training shields by holding the shield, releasing it, and repeating the process while mixing in attacks. He shows the progress of the AI's shield usage through training and makes adjustments to improve its recovery and shield actions.
Highlights
- The presenter demonstrates their process of training shields in an AI fighter.
- They assess the AI's performance and make adjustments along the way.
- The AI shows improvement in shield usage and performs well in battles.
- [000] Introduction and overview of the AI's current abilities.
- [535] Training shields: the presenter's approach and method.
- [1120] Assessing the AI's shield usage in simulation battles.
- [1550] Refining recovery moves to improve the AI's performance.
- [2110] Showcasing improved shield usage and battle performance.