AA228/CS238 Decision Making under Uncertainty

Past Final Projects

Fall 2023

File index, Files

  • To Skip or Not to Skip: Modeling Music Recommendation Systems with Markov Decision Processes
  • Aligning language models towards better moral value judgements
  • Who Did It? Using Beam Search with Rollouts to Model Independent Agents
  • Safer Flight Planning Through Computational Decision Making
  • Application of Reinforcement Learning in Options Hedging
  • Optimizing Tetris with Deep Reinforcement Learning
  • Prodaumatic 3000: Helping Product Managers Make Decisions through LLM-Enhanced Monte Carlo Tree Search
  • Unscented Kalman Filter for State Estimation Under Fluctuating and Complex Nonlinear Dynamics
  • Redundancy for Small Aircraft: Kalman Filtering for Heading and Altitude Prediction Under Uncertainty
  • Optimizing Fish Strategy
  • Improving Scientific Collaboration through RL Agent-based Social Connections
  • Optimizing Fantasy Football Drafting with MCTS
  • Memoized Upper Confidence bounds applied to Decision Tree Search
  • Optimal Saber Fencing Decisions Using Q-Learning
  • Collision Avoidance for eVTOL Airplanes using Partially Observable Markov Decision Processes
  • Autonomous Helium-3 Mining on the Lunar Surface
  • Development of an Optimal Strategy Bot to Play Fog of War Chess
  • Model-Free Learning Applied to Optimal Blackjack Gameplay
  • Optimizing Roguelike Video Game Strategies Using Reinforcement Learning
  • Hyperparameter Optimization of Neural Networks Based on Q-learning
  • Improving Seismic Resilience in Pivoting-Spine Structures: Smart Connections, Reinforcement Learning, and Loss Mitigation
  • Portfolio Optimization with Q-Learning
  • Navigating the UNO-verse: Using Q-Learning, Double Q-Learning, and SARSA to Play UNO!
  • Cercano – A Parking Spot Near You!
  • Wordle is a POMDP!
  • Goal-based Investment Portfolio Optimization Tool Using Reinforcement Learning
  • Exploration for Book Recommendation
  • Mechanical ventilator management with reinforcement learning
  • Understanding Partial Observability in Autonomous Highway Driving
  • Hex game play
  • Multi-Elevator Systems Optimization
  • Adaptive Control of Hall Effect Thrusters Using Reinforced Learning and PDE State Estimation
  • Using MDPs and Model-Free RL with Player Classification to Determine Optimal Poker Strategy
  • To Invest or Not To Invest… That is the POMDP
  • Playing Connect 4 Using Q-Learning
  • QRemoveD: Extended Q-Learning for Debris-on-Debris Removal in Outer Space
  • CLASSCONNECT: Building Educational Bridges with Branch and Bound
  • Endowment Asset Allocation: A Reinforcement Learning Approach
  • Applying Monte Carlo Tree Search To Among Us
  • Bark Beetle Infestation Prevention by Sparse Sampling
  • Autonomous Approach ATC with Partially Observable Aircraft Positions
  • Melody Mastermind
  • Can We Win Blackjack?
  • Integrating Traditional and Machine Learning Techniques in Option Pricing: A Comprehensive Approach to Modeling American Equity Options
  • Searching for a Reaction: Monte Carlo Tree Search Applied to Atomas
  • The Ultimate Coinche Bot
  • Sensor Placement Strategy Learning for Crowd Monitoring in Stadiums
  • Exploring Optimal Strategy of Playing Blackjack Using Different Algorithms
  • Reimagining Bus Routes with Q-Learning and an Equity Based Reward Model
  • An MDP-Based Approach to Optimal Queue Selection in Supermarkets
  • Modeling a Pair Trading Agent For Brazilian Bonds As A Partially-Observed Markov Decision Process
  • Comparing Policy Search Methods for Optimal Quadcopter Control: Hooke-Jeeves vs Genetic Algorithms vs Cross Entropy
  • A Novel Approach to Solving Physical Puzzles with Chain Reactions
  • Clothing Inventory Predictor
  • Drone Delivery Optimization Under Uncertainty
  • Online Learning Approaches in Autonomous Driving.
  • Heads-Up Poker Optimization
  • Exploring Monte Carlo Methods for Gomoku
  • Autonomous Underwater Vehicle Navigation in Uncertainty
  • Under the Sea(weed): Improving Water Quality through Planting and Harvesting Safer Seaweed
  • Model-Free Reinforcement Learning Methods for Playing Nim
  • High-IQ Shot Selection With Q-Learning
  • Siting Climate Resilience Hubs Using Q-Learning
  • Boardgame Optimization through Neural-network Enhanced Reasoning
  • Training an Agent to Beat Super Mario Bros. (1985)
  • Optimal Earthquake-Prevention Renovations using Reinforcement Learning
  • Training Reinforcement Learning Agents to Cross the Glass Bridge in Squid Game
  • Exploring Memory In Offline Reinforcement Learning
  • Reinforcement learning augmented model predictive control
  • Quadcopter Control Learning with Policy Gradient Methods
  • Multi-agent Path Finding as a Markov Game
  • Reinforcement Learning for Optimal Candidate Selection
  • From Gridlock to Green Light: Algorithmic Strategies for Smart Traffic Control
  • Optimization of proportional–integral–derivative controller gains for autonomous vehicle path-tracking
  • QTrader: Combining Qualitative and Quantitative Methods For Smarter Algorithmic Trading
  • Memorar: A Reinforcement Learning-Based Spaced Repetition System
  • Enhancing NFL Offensive Play Calling: A Data-Driven Decision-Making Model
  • Decision-making in the Reversi game
  • I, Robot meet DQL
  • AlphaGOPS: Modelling the Game of Pure Strategy with Adaptive Monte-Carlo Tree Search
  • Satellite Collision Avoidance Using Monte Carlo Tree Search
  • Prediction of Smoker Status using Bio-signals
  • Reactive Collision Avoidance in Robots: Applying Reinforcement Learning via Subsumption Architecture
  • RNN-Tuning through Deep Q-Learning
  • Genetic Algorithm Based Deep Q-Learning Approach To Algorithmic Trading: Developing, Training, and Testing a Rudimentary Agent
  • Battleship Buster: Deep Function Approximation for Variants of Battleship
  • Free Food Hunter
  • Winning Draughts Championship: Checkers Agent
  • Adversarial Carcassonne Agents
  • Neural-MPPI: Estimating Vehicle Dynamics for Improved Sequential Decision Making in Autonomous Driving
  • Tire Strategy Optimization for Formula 1
  • On-the-fly motorsport pit stop strategy decisions with Q-Learning
  • Cancer Treatment Policy Creation with Markov Decision Process Value Iteration
  • Pandemic Campus Simulator
  • Low Resource RLHF
  • Deep Reinforcement Learning For Autonomous Vehicle Control
  • Autonomous multi-agent Formula One races
  • A Behaviorist Approach to Deep Q-Learning for Geological Forecasting and Risk Management
  • QTetris: Tetris with Q-Learning
  • Solving Dweep with Reinforcement Learning
  • MAPS 2.0: Optimal Route Planning in San Francisco Considering Crime, Elevation, and Distance
  • Using Q-Learning to Solve Cliff Problem
  • Agile Earth Observation Satellite Tasking
  • How Did The Chicken Cross The Road: Evaluating DQL with Additional Insights for Atari Freeway
  • Solving Hangman with Reinforcement Learning Models
  • Q-Learning for Fantasy Football Management
  • Sampling-Based Monte Carlo Tree Search for Freeflyer Thruster Control
  • Fighting the Smog: Finding crop-burning in small-holder farms in North India with Q-learning
  • Poker as a POMDP
  • Mao Master — An Agent for the Card Game Mao
  • Reinforcement Learning for Car Racing
  • A lost Rover
  • Strategic Learning in Combat Sports: an RL Approach to Simulating Muay Thai Fights
  • Building Poker Agents with Q-Learning and Deep Q-Networks
  • Composing Music with RL Agents
  • Lunar Logic- Intelligent Thrust Control Amid Noisy Observations, Wind Variability, and Engine Failures for Precise Spacecraft Landings
  • Optimal trade execution as sequential decision making under uncertainty
  • Robotic peg-in-hole assembly with reinforcement learning based exploration strategy
  • WordleFS: Wordle Strategies with Forward Search
  • Learn To Play StarCraft II
  • QDP: Standard Cell Detail Placement via Q-Learning
  • A Comparative Study of DQN and Dueling DDQN in Stochastic POMDP Environments
  • Dynamic Bayesian Networks for Adaptive Collective Decision-Making
  • RPS Revolution: Enhancing RL for Rock, Paper, Scissors
  • An Exploration of Trends Among Various Methods of Solving 2048
  • Enhanced Navigation and Obstacle Avoidance for the Visually Impaired: A Deep Q-Network Approach
  • Optimizing Chemical Reactions for Product Yield using Reinforcement Learning
  • Learning How to Walk Using Improved Reinforcement Learning Techniques
  • Electric Bus Fleet Scheduling – a Reinforcement Learning Approach
  • Maximizing Chance Encounters
  • “Strategic Decisions in Uncertainty: Optimizing Play in 1010!”
  • Gridworld Tag with Vision and Obstacles
  • Mercury: An Offline Approximation Method For Enhanced Cardio Performance
  • Tricky Trading Under Uncertainty: Reinforcement Learning for the Figgie Card Game
  • Safe Motion Optimization and Operational Trajectory Handling (S.M.O.O.T.H.)
  • Offline Q-Learning with Linear Function Approximation for Robotic Peg Insertion
  • Policy Search Method Evaluation using a Two-Parameter Reservoir Operation Policy
  • Attention is Not All You Need For Offline Reinforcement Learning
  • Designing Chemical Industrial Processes Under Uncertainty
  • Modeling Prescribed Burns in California as a Markov Decision Process
  • DeepQHoldem: Applying Deep Q-Learning to No-Limit Texas Hold’em Poker
  • Highway Exit Policy with POMDPs
  • Optimizing Energy Carbon Emissions and Cost for Grid-Interactive Efficient Buildings with Q-Learning
  • Parametric Design Learning
  • PPO v DQN: A Duel at the Limits of Handling
  • SmartTidyBot
  • FortniteRL: Decision Making in Dynamic Adversarial Environments
  • Reinforcement Learning for Stock Trading
  • Multi-Robot Autonomy Planning in a Grid World
  • Music Recommendation With SARSA’
  • Learning to be a World-Class Tennis Player with Reinforcement Learning
  • Decision Making For Lost Cities
  • Optimizing Employee Training Using Reinforcement Learning
  • Settling the Unknown: Decision-Making in the Uncertain World of Catan
  • Mini Bridge with Q-Learning
  • A Restaurant Recommending System Based on Multi-Arm Bandits
  • Crop Patterns: Projecting Crop Planting Dates in San Joaquin Valley Due to Climate Change
  • Controlling the Government with RL
  • Go, No, or CoHo SPCE
  • Solving Minesweeper with RL and Online Planning
  • Dynamic Prioritization for Organ Matching with Model-Free Reinforcement Learning
  • Commuting on Stanford’s Campus: a Decision-Making Problem
  • Q Learning and Deep Q Learning For Financial Decision Making
  • Deep Q-Learning for self-driving car
  • Navigating an Autonomous Vehicle in a Racing Environment with Competing Agents
  • Deep Q Learning for Medical Triage
  • Optimizing Graduate Student Life using Model-Free Reinforcement Learning
  • Q-learning with Customized State Space Reduction on Atari Breakout Game
  • Wizard Whiz: A Deep Q-Learning Agent for the Card Game Wizard
  • Integrating Model-Free Reinforcement Learning for Dynamic Wildfire Risk Assessment and Firefighting Resource Allocation in California
  • Optimizing Organ Allocation
  • Modeling Macroeconomic Dynamics as a Stationary Markov Decision Process
  • MDPs for Music Generation
  • MCTS for Bananagrams
  • Predicting Southbound Monarch Migrations with RNN’s
  • Building a Texas Hold ‘Em Poker Agent
  • Iteratively Improving Capabilities of Text-To-Image Models
  • Stanford course planning with Sarsa-lambda
  • Tab WhiRL: A Model Based RL Approach to Browser Memory Management
  • Optimal Insulin Dosing for Type I Diabetes: Reinforcement Learning Approaches
  • Intelligent Tutoring System: Providing practice or guided instruction
  • “What’s Nine Plus Ten?” Finding An Optimal Policy for Blackjack
  • Automated Euchre Play Using Deep Reinforcement Learning
  • Calm Mind: A Reinforcement Learning Agent for Pokémon Battling
  • Policy Iteration for Traffic Circles
  • Making a Tetris Bot with Heuristic Search and Genetic Algorithms
  • Exact State Planning for Repairing Persistent Methane Leaks
  • Multi-Agent Planning in Unknown Environment using MCTS
  • Offensive Strategy for the San Francisco 49ers – Can We Do Better Than Kyle Shanahan Using RL?
  • Reinforcement Learning of RC Helicopter Control
  • AlphaLiarsDice: Q-learning and MCTS for Liar’s Dice
  • OptiLodge – An RL Model for Optimizing Airbnb Booking Costs
  • Five star Movie
  • Determinism in Connect 4 using Monte Carlo Tree Search
  • Omasolved Poker
  • John Q-Train: An AI Jazz Musician
  • Optimal Policy for Blackjack with Card-Counting
  • How to “do” school: Optimizing within injustice
  • Monte Carlo Tree Search and Euchre
  • Evaluating Model-Free Reinforcement Learning Approaches Towards Optimizing Thrust-Vectored Rocket Landings
  • Reinforcement Learning For Fantasy Football
  • Optimizing Traffic Light Phases to Improve Traffic Flow
  • Mahjong Decision Making using One-Step Lookahead and Monte Carlo Tree Search
  • Optimizing Mars Base Resource Management
  • Deep Reinforcement Learning for Stock Market Trading
  • Discovering Counterexamples to Graph Theoretic Conjectures with Monte Carlo Tree Search
  • A Comparative Analysis of Kalman Filters for Satellite Attitude Determination
  • Better Be Swift: Modeling and Optimizing Event Ticket Resale As A MDP Using Q-Learning
  • Model Free Reinforcement Learning for On-Demand Delivery Optimization

Winter 2023

File index, Files

  • Fantasy Ice Hockey Lineup Decisions with Q-Learning and SARSA
  • Optimal Policy for Uber/Lyft drivers in the Bay Area
  • Twenty-Twenty Learning: A 2048 solver powered by Deep Q-Networks
  • Drag-based Reconfiguration of Spacecraft Formations Under Uncertainty
  • Value Iteration applied to Orbital Debris Removal with a Satellite
  • Mortal (Q)ombat: Training an AI agent with Reinforcement Learning to Play a Brawler Game
  • Flipping the Bit: Explorations in Learning the Ising Model
  • Optimal Strategy Generation for Pandemic Game Using Monte Carlo Tree Search
  • Tropical Reforestation
  • Queue RL
  • ClutchRL: Using Q-Learning To Find The Optimal Policy To Win A Close Basketball Game
  • Decision Making in Quadcopter Navigation with The Effects of Wind
  • Reinforcement Learning Approaches for Sepsis Treatment
  • Localization and path finding of a known environment
  • Particle Filter For Autonomous Vehicle Handover
  • Q-QWOP: Using Q-Learning for QWOP Policy Development
  • Wordlebot: Improving Wordle Performance Using Forward Search
  • Optimal Load Scheduling for Smart Homes
  • Trajectory Optimization for Autonomous Parking Lot Cleaning using Q-Learning
  • Comparison of Online Planning and Reinforcement Learning for Oil Spill Containment
  • Estimating an Optimal Solution to Backgammon
  • I ♥ RL: Using Q-Learning to Play Hearts
  • AITA Judge Twitch Streamer
  • Traffic Light Optimization Under Uncertainty
  • Fuel Efficient Autonomous Driving
  • Theseus and the Cyclops: Adventures in Navigation Using Monocular Depth Estimation
  • Maximum Entropy Reinforcement Learning for Prompt Tuning
  • Applying Reinforcement Learning and Online Planning to the Game “Regenwormen”
  • Blackjack with Card Counting
  • Search and ResQ
  • Optimal Operation of Renewable Energy Plus Storage Systems Using Reinforcement Learning
  • Landing a Plane on Final Approach with Q-Learning
  • Blackjack: Are the “Basic Strategies” Sufficient?
  • Driving with Large Language Models
  • Belief Updating for Improved Navigation in Wind
  • Taxi Route Recommendations Using Reinforcement Learning
  • Using Reinforcement Learning for Life and Death problem in Go
  • Travelling with (Un)certainty
  • Optimal Drone Navigation with Stochastic Dynamics
  • Reinforcement Learning for Autonomous Navigation of Subterranean Environments
  • Measuring potential effects of an intelligent tutoring system
  • X’s and O’s: Optimization of Tic-Tac-Toe
  • Autonomous Driving in a Roundabout with Rule-Breaking Agents
  • Playing blackjack using reinforcement learning
  • Playing Mancala via Reinforcement Learning
  • Fantasy Football Team Draft Sequential Decision Process
  • Optimizing Store Inventory Management with Q-learning
  • Deep Double Q Network for Lunar Lander
  • Optimal Online Ad Allocation Under Uncertainty
  • Reinforcement Learning for Chess Variants
  • A Chess Engine on Ice: Developing an Automated Curling Skip Capable of Decision Making under Uncertainty
  • Online Planning with Terrain Friction Estimation for Safe Rover Navigation
  • Simple Reinforcement Learning for Space Mining Bots
  • Preventing the Spread of Wind-Driven Wildfires with Partially Observable Markov Decision Processes
  • Making a (Good) Pokemon 1v1 Battle Agent
  • Learning to Play Backgammon Using Reinforcement Learning
  • Self-Driving School Bus for Stanford Campus
  • Deep Q-learning Network on Atari Game
  • FishBot: Decision Making in Canadian Fish
  • Modeling Deceit in Coup
  • Buying a House without Going Broke
  • Taxi Route Optimization using Reinforcement Learning
  • Minesweeper: A Reinforcement Learning Approach
  • Reinforcement Learning for Robotic Arm Reach Problem
  • Optimal Policy for Ground-to-Satellite Communication using Reinforcement Learning
  • Feel-Good Othello-Bot: Othello-Player Agent to Bring Satisfying Win for Human Player
  • Value Iteration and Contraceptive Method Choice: Applying Algorithmic Decision Making to Women’s Health Practices
  • Settlers of Catan Settlement Placement Optimization with Deep Q-Learning
  • A Markov-Based Approach to Evaluate the Optimal Position of Zeposia in the Treatment of Ulcerative Colitis
  • Multi-Rover Exploration using Dec-POMDPs
  • Optimizing Green Infrastructure Siting for Urban Flood Mitigation
  • Optimal Elevator Algorithm
  • Transmission Expansion for Offshore Wind Farms Under Uncertainty
  • Card-Counting Blackjack Agent
  • Tic-Tac-Toe: An Unbeatable Foe
  • How do we walk? Controlling joint motor torque to achieve balanced movement
  • How to not lose your starship: Strategy Optimization for Corellian Spike
  • Geometry Dash AI
  • Planar Manipulation of an Object with Unknown and Changing Center of Mass
  • Ping Pong Prodigy
  • Solving Half Cheetah Problem Using Offline RL
  • Solving Wordle using Information Theory, Minimax Algorithms, and Q-Learning
  • Applying Q-Learning to Uno
  • Rocket Self-Landing using Proximal Policy Optimization
  • Determining an optimal screening policy for colorectal cancer
  • Code-Lint: Code Generation with Reinforcement Learning incorporating Linter Feedback
  • Safe Navigation: Training Autonomous Vehicles using Deep Reinforcement Learning in CARLA
  • Implementing Q-learning to Obtain Control Policy for Residential Battery System
  • Cubo: Your Friendly Neighborhood Rolling Cube
  • Deep Reinforcement Learning for In-Flight Calibration of a Lunar Lander
  • Learning and Improving the Intelligent Driver Model with Reinforcement Learning
  • Safe Lane Merging for Autonomous Cars on the Highway through Markov Decision Process
  • Solving Wordle Using Monte Carlo Tree Search
  • Prepare to AI: Model-Free Reinforcement Learning for Dark Souls 1
  • Dynamic Efficient Sampling Policy Learning – An Investigation of DRL-based Atari Game Playing
  • Cleaning Up with Q-Learning: Optimizing Roomba Navigation in Unknown Environments
  • A Novel Reward Shaping Function for Single-Player Mahjong
  • Evaluating the Disposition Effect using Reinforcement Learning
  • Playing Risk with Reinforcement Learning
  • Using Q-Learning to Optimize Fantasy Premier League
  • Improving Matches in Dynamic Organ Exchanges
  • Exploring Crop Irrigation Decisions Under Climate Uncertainty
  • Monte Carlo Tree Search Applied to Spaceship Maze
  • Adaptive Water Supply Desalination Planning under Hydrological Uncertainty
  • Cleaning the streets with RL
  • UAV Search and Rescue Optimization
  • Greedy Search for Optimal Soil Sample Analysis
  • Modeling Strategies for the Conservation of Florida Manatees using POMDP Methods
  • Building an Agent for Settlers of Catan
  • Applying Reinforcement Learning to Pac-Man
  • Understanding Push-Fold Pre-Flop Strategy for Heads-Up NLH Poker
  • Local Search Approaches to Trading Strategy Optimization in Animal Crossing: New Horizons
  • A Julia Package for Calculating the Gittins Index
  • Planning Flood Evacuations Using Monte Carlo Tree Search
  • Random Adaption to Trawler Problem for Ocean Search and Rescue
  • Energy Markets as an MDP
  • POMDPoker
  • Get out of the way! Vehicle-Object Collision Avoidance
  • Developing An Automated 2D Simulation For Parking a Car
  • Bidding Strategy for Residential Solar and Storage in Transactive Electricity Market using Reinforcement Learning
  • Reinforcement Learning for Optimal Policy in Texas Hold’em Game
  • Wildfire Evacuation Planning: Using MDPs to go Beyond Surveillance
  • Solving the Taxi-v3 Problem Using Model-Free Reinforcement Learning
  • Modeling the Board Game Secret Hitler as a Decision Making Problem
  • Beating Cube Field
  • Battery Temperature Control for Deep Space Missions
  • Is There an “I” in Team? An Exploration of Capture the Flag Reinforcement Learning Agents
  • Reinforcement Learning Based Magnetic Detumbling of Low Earth Orbit Satellites
  • Multi-Agent Control in Partially Observable Environment
  • Payment Optimization for Combination Shoppers and Delivery Drivers using Deep Q Learning with Experience Replay
  • Q-Learning for a Traveling Salesman Problem Variant in Modified Landmark Traversal
  • Multiagent Coordination of Space Debris Cleanup under Uncertainty
  • Rapid Calibration of a Cellular-Resolution Bidirectional Neural Interface
  • Smart Water Heater via Sparse Sampling
  • Designing a Reliable Crew Member
  • Bayesian Graphical and Deep Q-Learning Agents in Colonel Blotto Game
  • RL Agent for Fish
  • FlyGenie: Minimize your flight ticket purchase costs
  • Helping and Hindering: Recursive Reasoning in a Multi-Agent MDP
  • SARSA Applications to UNO!
  • Development and Evaluation of a Reinforcement Learning Model to Optimize Discharge Decisions in Hospital Emergency Departments
  • Calibrating Black-Box Transition Models for MDP Planning Using Conformal Prediction
  • Decision-Making Algorithms for Winning Coinche
  • Baseball Pitch Analytics
  • Optimal sequential decision making for managing ecosystem with POMDP
  • Particle Filter for Battery Aging State Estimation
  • Using Deep Q-Learning to Improve Automated Train Systems
  • Satellite Collision Avoidance Through Neural Networks
  • CheatGPT: Evaluating Robustness of Language Model Detectors to Adversarial Prompting
  • Guided Decision Making Under Uncertainty
  • Automated Theorem Proving with Graph Neural Network
  • Scrabble Engine Based on MuZero
  • Applying State Estimation Techniques to Improve Stability of a Ball-Balancing Robot
  • Electric Vehicle Route Planning with Uncertainty in Charger Wait Times
  • RL-ATC: Multi-Agent Reinforcement Learning for On-the-Ground Air Traffic Control and Planning
  • Effect of Additional Information on Learning to Bid in First Price Auctions
  • Tham Luang Cave Rescue: A Reinforcement Learning Approach
  • Reinforcement Learning Strategies for Love Letter
  • Building a Poker Agent

Fall 2021

Public Reports

Public Video Reports

Fall 2020

Public Reports

Public Video Reports

Other Titles

  • Tic Tac Toe Solver
  • Beating the house in Blackjack
  • Effect of Noise on Learning a Planar Pushing Task using SAC
  • ResQNet: Finding Optimal Fire Rescue Routes
  • COVID Chatbot
  • Regularized Follow-the-Leader in Online MDP for Efficient Topographical Mapping
  • Learning POMDP model parameters from missing observations
  • Reinforcement Learning for No-Limit Texas Hold ‘Em with Bomb Pots
  • Identifying Optimal Locations for Satellite Image Capture
  • Diet Conscious Meal Planner
  • Mitigating Risk of Public Transit during COVID-19
  • Predicting the Match: Using Bayesian Networks to Predict Professional Tennis Outcomes
  • Efficient Single-Agent Capture of a Moving Target
  • Q-Learning Applied to the Taxi Problem
  • Settlers of CATAN
  • Autonomous Snake
  • Q-Learning for Pre-Flop Texas Hold ‘Em
  • Deep RL for Atari Games
  • Simulating a D&D Encounter with Q-Learning
  • Deep RL for Automated Stock Trading
  • Dating Under Uncertainty
  • Retinal Implant Electrical Stimulation via RL
  • Batch Offline RL in the Healthcare Setting
  • Computer Caddy – Using RL to Advise Golfers’ Club Selection
  • RL for Fischer Random Chess
  • Timely Decision Making with Probability Path Model
  • Satellite-Imagery Based Poverty Level Evaluation System in Mexico with Deep RL Approach
  • Pokemon Showdown
  • Deep RL for Space Invaders
  • Learning to Run
  • RL-Based Control of Policy Selection in Near-Accident Scenarios
  • Model Predictive Control for an Aircraft Autopilot
  • Finding Inharmonic Timbres Locally Consonant with Arbitrary Scales
  • Escape Roomba
  • Driving in Traffic
  • Playing Snake
  • The 2020 Flatland Challenge
  • Elevator Scheduling with Neural Q-Learning
  • Optimizing Immunotherapy Treatment using RL
  • Modeling Leduc Hold ‘Em Poker
  • Auto Trading System Using Q-Learning
  • Energy System Modeling
  • Optimizing Fox in the Forest through RL
  • Learning Gin Rummy
  • Car Racing with Deep RL
  • Sequential Decision Making for Mineral Exploration
  • Advanced Driver Assistance Systems
  • Learning to Play Stargunner with Deep Q-Networks
  • A Fourth-and-Goal Football Recommender System
  • Algorithms for Motion Planning
  • Playing Farkle
  • Connect 4: A Survey of Different RL Techniques to Destroy Your Pride
  • Decision Making in the word game, Codenames
  • Reinforcement Learning Approaches for An Adversarial Snake Agent
  • An Attention-Based, Reinforcement-Learned Heuristic Solver for the Double Travelling Salesman Problem With Multiple Stacks
  • Uncertainty Aware Model-Based Policy Optimization
  • Navigating the Four-Way Stop Autonomously
  • Ground Water Remediation Using Sequential Decision Making
  • Final Project: Satellite Collision Avoidance
  • Q Learning for 4th Down Decision Making in the NFL
  • Q-Learning for the Game of Nim: Does The Agent Learn a Combinatorially Optimal Strategy On Its Own?
  • Contextual Bandit Algorithms in Recommender Systems
  • A Comparison of Reinforcement Learning Methods for Autonomous Navigation
  • Reinforcement Learning for Behavior Planning in Intersections for Autonomous Vehicles
  • Reinforcement Learning for Pacman Capture the Flag
  • Comparing Different Optimization Techniques for Learning Continuous Control with Neural Networks
  • Autonomous Exploration in Subterranean Environments
  • Improving Image Denoising through Decision Making
  • Using MDPs to Optimally Allocate Funds
  • Explanations Meet Decision Theory
  • Learning Policies for Adaptive LiDAR Scanning with POMDPs
  • Cautious Markov Games: A New Framework for Human-Robot Interaction
  • Selecting a multibasis community structure for the connectome
  • Reinforcement Learning Techniques for Long-Term Trading and Portfolio Management
  • Optimal Asset Allocation with Markov Decision Processes
  • Symbolic Regression with Bayesian Networks
  • Scheduling battery charging using deep reinforcement learning
  • Online Knapsack Problem Using Reinforcement Learning
  • Policy gradient optimization for
  • Resource Allocation for Wildfire Prevention
  • Using Reinforcement Learning to Play Omaha
  • Fraud Detection for Mobile Payments using Bayesian Network and CNN
  • Neuro-Adaptive Artificial Neural Networks for Reinforcement Learning
  • AI Agent for Qwirkle
  • Learning Optimal Wildfire Suppression Policies With Reinforcement Learning
  • Bid Smart with Uncertainty: An Autonomous Bidder
  • AA228/CS238 Final Report
  • Modeling Identification of Approaching Aircraft as a POMDP
  • Short-Term Trading Policies for Bitcoin Cryptocurrency Using Q-learning
  • Reinforcement Learning of a Battery Power Schedule for a Short-Haul Hybrid-Electric Aircraft Mission
  • Autonomous Helicopter Control for Rocket Recovery
  • Reinforcement Learning Strategies Solving Game Gomoku
  • A Wildfire Evacuation Recommendation System
  • Battleship with Algorithm
  • Developing an Optimal Structure for Breast Cancer Single Cell Classification
  • Utilizing Deep Q Networks to Optimally Execute Stock Market Entrance and Exit Strategies
  • Online Planning for a Grid World POMDP
  • Contingency Manager Agent for Safe UAV Autonomous Operations
  • Solving Mastermind as a POMDP
  • Simulated Drone Flight with Advantaged Actor Critic Reinforced Learning in 2 and 3 Dimensions
  • Solving Queueing Problem Using Monte Carlo Tree Search
  • Bayesian Structure Learning on NFL play data
  • Multi-Agent Rendezvous Using Reinforcement Learning
  • Dynamic Portfolio Optimization
  • Fairness and Efficiency in Multi-Portfolio Liquidation: A Multiple-Agent Deep Reinforcement Learning Approach
  • Evaluating Poker Hands
  • Saving Artificial Intelligence Clinician
  • Evaluation of online trajectory planning methods for autonomous vehicles
  • Solving Leduc Hold’em Counterfactual Regret Minimization
  • From aerospace guidance to COVID-19: Tutorial for the application of the Kalman filter to track COVID-19
  • A Reinforcement Learning Algorithm for Recycling Plants
  • Monte Carlo Tree Search with Repetitive Self-Play for Tic-Tac-Toe
  • Developing a Decision Making Agent to Play RISK

Fall 2019

Public Reports

Other Titles

  • Linear Array Target Motion Analysis Using POMDPs
  • Speed or Safety?: Calculating Urban Walking Routes Based on Probability of Crime and Foot Traffic
  • AlphaGomoku
  • Modelling Uncertainty in Dynamic Real-time Multimodal Routing Problems
  • Reinforcement Learning for Portfolio Allocation
  • Preparation of Papers for AIAA Technical Conferences
  • Autonomous Racing
  • Deep Learning Enabled Uncorrelated Space Observation Association
  • Landing a Lunar Spacecraft with Deep Q-Learning
  • POkerMDP: Decision Making for Poker
  • 1V1 Leduc Hold’em Bot
  • Political Influencers: Using Election Finance Data to Analyze Campaign Success via Bayesian Networks
  • Developing AI Policies for Street Fighter via Q-learning
  • Impact of Market Technical Indicators On Future Stock Prices Using Reinforcement Learning
  • Allocation of Hearts for Transplant as an MDP
  • Multi-Agent Reinforcement Learning in a 2D Environment for Transportation Optimization
  • Planning under Uncertainty for Discrete Robotic Navigation with Partial Observability
  • Deep Reinforcement Learning Applied to Mid-Frequency Trading
  • Application of Subspace Identification for Classification of Neural-Activity during Decision-Making
  • Using Markov-Decision Processes to Design Betting Strategies for the NFL
  • Maneuvering Characteristics Control Systems using Discrete-Time MDPs
  • MDP Based Motion Planning In Unsignaled Intersections
  • Competitive Blackjack Using Reinforcement Learning
  • Modelling Pedestrian Vehicle Interaction at Stop Sign using Markov Decision Process
  • Jeopardy! Wagering Under Uncertainty
  • Love Letters Under Uncertainty
  • Playing The Resistance with a POMDP
  • Robotic Simultaneous Localization and Mapping with 2D Laser Scan
  • Mars Rover: Navigating an Uncertain World
  • Modeling Blood Donations Over Time as a POMDP
  • Reinforcement Learning for Control on OpenAI Gym Environments
  • Playing Connect 4 using Reinforcement Learning
  • Evaluation of Reduced Algorithmic Complexity for Grasping Tasks by Using a Novel Underactuated Curling Grasper with Reinforcement Learning
  • Optimizing Strategies for Settlers of Catan
  • Exploring Search Algorithms for Klondike Solitaire
  • A Sparse Sampling Control Strategy for Risk Minimization during Stretchable Sensor Network Deployment
  • Computing Strategies for the 7 Wonders Board Game
  • POMDP modeling of stochastic Tetris
  • Solving a Maze with Doors and Hidden Tigers
  • Playing “Dominion” with Deep Reinforcement Learning
  • Delivery Vehicle Navigation in Crowd with Reinforcement Learning
  • Capturing Uncertainty in a Multi-Modal Setting With JRMOT: A Real-Time 2D-3D Multi-Object Tracker
  • Decentralized Satellite Network Communication
  • Seismic Network Planning
  • Reinforcement Learning for PaoDeKuai, A Card Game
  • Training A Bai Fen Agent with Reinforcement Learning
  • Decision Making for Launch Cancellation Based Upon Storm Conditions
  • Optimizing for the Competitive Edge: Modeling Sequential Binary Decision Making for Two Competing Firms
  • Datacenter Equipment Maintenance Optimization
  • To Heat Or Not To Heat: Reinforcement Learning for Optimal Residential Water Heater Control
  • Learning to Play Snake Game with Deep Reinforcement Learning
  • Optimal Traffic Light Control for Efficient City Transportation
  • Modeling NBA Point Spread Betting as an MDP
  • Solving a car racing game using Reinforcement Learning
  • Is Uncertainty Really Harmful: Solving Partially Observable Lunar Lander Problem with Deep Reinforcement Learning
  • Autonomous Navigation of an RC Boat Under a POMDP Framework
  • Evaluating the Bayes-Adaptive MDP Framework on Stochastic Gridworld Environments
  • Value Iteration with Enhanced Action Space for Path Planning
  • The Medical Triage Problem: Improving Hospitals’ Admission Decisions
  • Optimal Route Selection for Riders in Toronto
  • Model Free Learning for Optimal Power Grid Management
  • Wasting Less Time on the Road Using MDPs
  • Learning User Preferences to Produce Sequential Movie Recommendations
  • A Comparison of Learning Based Control Methods for Optimal Trajectory Tracking with a Quadrotor
  • Artificial Pancreas: Q-Learning Based Control for Closed-loop Insulin Delivery Systems
  • Navigating in an Uncertain World
  • Teaching an Autonomous Car to Drive through an Intersection with POMDPs
  • Atomic structure minimization using simulated annealing with a MCTS temperature scheme
  • AI Game Player for 2048
  • Deep Q-Learning with GARCH for Stock Volatility Trading
  • Learning to Become President
  • Solving GNSS Integrity-Based Path Planning in Urban Environments via a POMDP Framework
  • Reservoir operation under climate uncertainty
  • Reinforcement Learning for Maze Solving
  • Using Reinforcement Learning to Find Basins of Attraction
  • Planning for Asteroid Prospecting Missions with POMDPs
  • Human-Aware Robot Motion Planner
  • Determining Federal Funds Rate Changes – Hike / Cut / Hold – Under Economic Uncertainty
  • Simulating Work-Life Balance with POMDP
  • Solving 2048 as a Markov Decision Process
  • Accounting for Delay in Dynamic Resource Allocation for Wildfire Suppression – a POMDP Approach
  • Daily Allocation of Assets with Distinct Risk Profiles using Reinforcement Learning
  • LocoNets for Deep Reinforcement Learning
  • Exploring a full joint observability game with Markov decision processes
  • Deep Bayesian Active Learning for Multiple Correct Outputs
  • Convolutionally Reducing Markov Decision Processes
  • Robust Decision Making Agent for Frozen Lake
  • Tic-Tac-Toe How Many In A Row?
  • Turbomachinery Optimization Under Uncertainty
  • Devising a Policy for Liar’s Dice Using Model Free Reinforcement Learning
  • Political Compromises: an Iterative Game of Prisoner’s Dilemma
  • Optimal Home Energy System Management using Reinforcement Learning
  • Drone Tracking in a 2-dimensional Grid using Particle Filter Algorithm
  • Deep Reinforcement Learning for Traffic Signal Control
  • A Deep Reinforcement Learning Approach to Recommender Systems
  • FlyCroTugs – Collaborative Object Manipulation Using Flying Tugs
  • Local Approximation Q-Learning for a Simplified Satellite Reconnaissance Mission
  • Developing Policies for Blackjack Using Reinforcement Learning
  • Applying Q-learning to the Homicidal Chauffeur Problem
  • Optimal Satellite Detumbling through Reinforcement Learning
  • Active Preference-Based Gaussian Process Regression for Reward Learning and Optimization
  • A Comparative Study on Heart Disease Prediction
  • Robot Navigation with Human Intent Uncertainty
  • Conquering the Queen of Spades: A Hearts Agent
  • Using Markov Decision Processes to Predict Soccer Player Market Value
  • Effectiveness of Recurrent Network for Partially-Observable MDPs
  • Capture The Flag
  • Predicting uncertainty
  • Optimal Asset Allocation with Markov Decision Processes
  • Nets on Nets: Using Bayesian Networks to Predict Supplier Links in Economic Networks
  • Playing 2048 With Reinforcement Learning
  • Trading strategies using deep reinforcement learning with news and time series stock data
  • Modeling Contract Bridge as a POMDP
  • Solving Rubik’s Cubes Using Milestones
  • Playing 2048 with Deep Reinforcement Learning
  • An Approximate Dynamic Programming Minimum-Time Guidance Policy for High Altitude Balloons
  • Identifying Bots on Twitter
  • Approaches to Model-Free Blackjack
  • Jumping Robot Simulator: An Exploration of Methods to Teach a Bio-Inspired Frog Robot to Navigate
  • Air Traffic Control Tower Policy for Terminal Environment Operations
  • Managing a Prediction Market Portfolio
  • Applying Partially Observable Markov Decision Making Processes to a Product Recommendation System
  • Self-Driving Under Uncertainty
  • Reinforcement Learning for QWOP
  • Modeling Macroeconomic Phenomena with Multi-Agent Reinforcement Learning
  • Optimal Learning Policy via POMDP planning
  • AI Guidance for Thermalling in a Glider
  • Decision Making For Profit: Portfolio Management using Deep Reinforcement Learning
  • Self-play Reinforcement Learning for Open-face Chinese Poker
  • Feature Constrained Graph Generation with a Modified Multi-Kernel Kronecker Model
  • Sensor Fusion of IMU and LiDAR Data Using a Multirate Extended Kalman Filter
  • Optimizing Empiric Antibiotic Delivery in the Emergency Department
  • The Task Completion Game
  • Optimizing Modified Mini-Metro (M³)
  • Improving Pragmatic Inferences with BERT and Rational Speech Act Framework and Data Augmentation
  • Deep Q-Learning for Playing Hanabi as a POMDP

Fall 2018

Public Reports

Other Titles (excluding optional final projects)

  • Occlusion Handling for Local Path Planning with Stereo Vision
  • Pre-Flop Betting Policy in Poker
  • Optimal Impulsive Maneuver Times for Simultaneous Imaging and Gravity Recovery of an Asteroid
  • Monte-Carlo Planning in Subsurface Resource Development
  • Learning to Win at Go-Stop
  • Police Officer Distribution
  • Optimizing Road Construction to Improve Traffic Conditions Using Reinforcement Learning
  • Q-Learning for Casino Hold’em
  • Modeling a Connected Highway Merge as a POMDP Using Dynamic GPS Error
  • Figure 8 Race Track Optimal and Safe Driving
  • Predictive Maintenance of Trucks using POMDPs
  • Predictive Models for Maximizing Return on Agriculture given Location and Temperature
  • A Policy to Deal With Delay Uncertainty
  • Reinforcement Learning Methods for Energy Microgrids
  • Boom! Tetris for Bot – Designing a Reinforcement Learning Framework for NES Tetris
  • Hidden Markov Models for Economic Cycle Regime Estimation
  • Push Me: Optimizing Notification Timing to Promote Physical Activity
  • Resource Allocation for Floridian Hurricanes
  • Motion Planning in Human-Robot Interaction Using Reinforcement Learning
  • Automated Neural Network Architecture Tuning with Reinforcement Learning
  • Imitation Learning in OpenAI Gym with Reward Shaping
  • Collision Avoidance for Unmanned Rockets using Markov Decision Processes
  • MDP Solvers for a Successful Sushi Go! Agent
  • Uncovering Personalized Mammography Screening Recommendations through the use of POMDP Methods
  • Implementing Particle Filters for Human Tracking
  • Decision Making in the Stock Market: Can Irrationality be Mathematically Modelled?
  • Single and Multi-Agent Autonomous Driving using Value Iteration and Deep Q-Learning
  • Buying and Selling Stock with Q-Learning
  • Application and Analysis of Online, Offline, and Deep Reinforcement Learning Algorithms on Real-World Partially-Observable Markov Decision Processes
  • Reward Augmentation to Model Emergent Properties of Human Driving Behavior Using Imitation Learning
  • Classification and Segmentation of Cancer Under Uncertainty
  • Comparison of Learning Methods for Price Setting of Airfare
  • QMDP Method Comparisons for POMDP Pathfinding
  • Global Value Function Approximation using Matrix Completion
  • Artificial Intelligence Techniques for a Game of 2048
  • Exploring the Boundaries of Art
  • An Iterative Linear Algebra Approach to Dynamic Programming
  • Solving OpenAI Gym’s Lunar Lander with Deep Reinforcement Learning
  • Application of Imitation Learning to Modeling Driver Behavior in Generalized Environments
  • Craps Shoot: Beating the House…?
  • Movie Recommendations with Reinforcement Learning
  • Playing Atari 2600 Games Using Deep Learning
  • Traverse Synthesis for Planetary Exploration
  • Optimal operation of an islanded microgrid under a Markov Decision Process framework
  • Implementing Deep Q-learning Extensions in Julia with Flux.jl
  • Learning How to Buy Food
  • Using Dynamic Programming for Optimal Meal Planning
  • Modelling Wildfire Evacuation using MDPs
  • Comparing Multimodal Representations for Robotic Reinforcement Learning Tasks
  • Applying Reinforcement Learning to Packet Routing in Mesh Networks
  • Xs & Os: Creating a Tic-Tac-Toe Foe
  • Doggo Does a Backflip: Deep Reinforcement Learning on a Quadruped Platform
  • GrocerAI: Using Reinforcement Learning to Optimize Supermarket Purchases
  • Reinforcement Learning For The Buying and Selling of Financial Assets
  • Towards Designing a Policy on Automotive GPS Integrity
  • Generalized Kinetic Component Analysis
  • Trading Wheat Futures Contracts Using PCR, Neural Networks, and Reinforcement Learning
  • Reinforcement Learning for Inverted Pendulums
  • Electric Vehicle Charging under Uncertainty
  • Automatic Accompaniment Generator: An MDP Approach
  • Comparison of Methods in Artificial Life
  • Modeling a Better Visual Acuity Test
  • Online Methods Applied to the Game of Euchre
  • Missile Defense Strategy: Towards Optimal Interceptor Allocation
  • Smart Charging of Electric Vehicles under State Uncertainty
  • Learning to Play Atari Breakout Using Deep Q-Learning and Variants
  • Decision making on fault-code
  • Learning FlappyBird with Deep Q-Networks
  • A Fresh Start: Using Reinforcement Learning to Minimize Food Waste and Stock-Outs in Supermarkets
  • Autonomous orbital maneuvering using reinforcement learning
  • Autonomous Decision-Making for Space Debris Avoidance
  • Maximizing Monthly Expenditures Under Uncertainty
  • Modeling Voter Preferences in US General Elections
  • Application of Reinforcement Learning to the Path Planning with Dynamic Obstacles
  • A Decision Making framework for Medical Diagnostics
  • Learning to Walk Using Deep RL
  • Q(λ)-Learning with Boltzmann and ε-greedy Exploration Applied to a Race Car Simulation
  • Reinforcement learning for Glassy/Phase Transitions
  • Proximal Policy Optimization in Julia
  • University Technology Patent and License Decisions: Open- versus Closed-Loop Planning in a Markov Decision Process
  • A Policy Gradient Approach for Continuous-Time Control of Spacecraft Manipulator Systems
  • Applying Techniques in Reinforcement Learning to Motion Planning in Redundant Robotic Manipulators
  • Deep Q-Learning for Atari Pong
  • Adversarial Curiosity for Model-Based Reinforcement Learning
  • A Markov Decision Process Approach to Home Energy Management with Integrated Storage
  • Using Maximum Likelihood Model-Based Reinforcement Learning to Play Skull
  • Cryptocurrency Trading Strategy with Deep Reinforcement Learning
  • Evaluating Multisense Word Embeddings Final Report
  • Near-Earth Object (NEO) Deflection via POMDP
  • Reinforcement Learning for Car Driving
  • Reinforcement Learning for Automatic Wheel Alignment
  • Julia Implementation of Trust Region Policy Optimization
  • Deep Reinforcement Learning with Target and Prediction Networks
  • Playing Tower Defense with Reinforcement Learning
  • Q-Learning agent as a portfolio trader
  • Multi-Robot Rendezvous from Indoor Acoustics
  • Portfolio Asset Allocation using Reinforcement Learning
  • Creating a 2048 AI Solver using Expectimax
  • Robustness of Reinforcement Learning Based Communication Networks in Multi-Modal Multi-Step Games to Input Based Adversarial Attacks
  • Deep Q-Learning with Nearest Neighbors in Sequential Decision-Making for Sepsis Treatment
  • Positioning Archival Radar Data with a Particle Filter
  • Reinforcement Learning for Atari Skiing
  • Understanding Donations with Reinforcement Learning
  • Known and Unknown Discrete Space Exploration Using Deep Q-Learning
  • Speeding Up Reinforcement Learning with Imitation
  • Learning Bandwidth-Limited Communication in Decentralized Multi-agent Reinforcement Learning

Fall 2017

  • 2048 as a MDP
  • A Computational Approach to Employee Resource Allocation between Multiple Projects
  • Accelerated Training of Deep Q Learning Models for Atari Games
  • AlphaOthello: Developing an Othello player through Reinforcement Learning on Deep Neural Networks
  • An Online Approach to Energy Storage Management Optimization
  • An Optimal Basketball Foul Strategy by Value Iteration
  • Annealed Reward Functions in Continuous Control Reinforcement Learning
  • Applications of Inverse Reinforcement Learning for Multi-Feature Path Planning
  • Attributing Authorship in the Case of Anonymous Supreme Court Opinions Utilizing SVMs and Probabilistic Inference on Score Uncertainty
  • Balancing Safety and Performance in Imitation Learning
  • Baseball Pitch Calling as a Markov Decision Process
  • Batch Reinforcement Learning Technological Investment Strategies Utilizing The Contingent Effectiveness Model In A Markov Decision Process
  • Bayesian Learning of Image Transformations from User Preferences for Individualized Automatic Filters
  • BetaMiniMax: An Agent for Cheat
  • Building a Game Agent to Play Resistance
  • Building Trust in Autonomy: Sharing Future Behavior in Reinforcement Learning
  • Car racing with low dimensional input features
  • Comparison of Classical Control Methods and POMDPs for 3D Motion Control
  • Control of a Partially-Observable Linear Actuator
  • DDQN Learning for 2048
  • Deep Q-learning in OpenAI gym
  • Deep Q-Learning with Target Networks for Game Playing
  • Design of A Planning Machinery for Choosing an NBA Team’s Play Style Strategy
  • Detecting Human from Image with Double DQN
  • Determining the Optimal Betting Policy: World Series
  • Disrupting Distributed Consensus (or Not) Using Reinforcement Learning
  • Dominating Dominoes
  • Double A3C: Deep Reinforcement Learning on OpenAI Gym Games
  • Emergent Language in Multi-Agent Co-operative Reinforcement Learning
  • Explore the Frontier of Safe Imitation Learning for Autonomous Driving
  • Fast Operation of Coordinated Distributed Energy Resources without Network Models using Deep Deterministic Policy Gradients
  • Faster Algorithms for Contextual Combinatorial Cascading Bandits
  • Finding a Scent Source with a Soft Growing Robot Using Monte Carlo Tree Search
  • Gaming Bitcoin Leveraging Model-Based Reinforcement Learning
  • Get Ready for Demand Response
  • GlideAI: Optimizing Soaring Strategy with Reinforcement Learning
  • Grid Stability Management and Price Arbitrage for Distributed Energy Storage and Generation via Reinforcement Learning
  • Guiding the management of sepsis with deep reinforcement learning
  • HMMs for Prediction of High-Cost Failures
  • Integrating Mini-Model Evidence into Policy Evaluation
  • Investigating Parametric Insurance Models As Multi-Variable Decision Networks
  • Learning an Optimal Policy for Police Resource Allocation on Freeways
  • Learning Terminal Airspace Models from TSAA Data
  • Learning the Education System
  • Learning the Policy of the Policy: Deep Reinforcement Learning with Model-Based Feedback Controllers
  • Learning to Play a Simplified Version of Monopoly Using Multi-Agent SARSA
  • Learning to Play Othello Without Human Knowledge
  • Limbed Robot Motion Control through Online Reinforcement Learning
  • Linear Approximation Q-Learning to Learn Movement in a 2D Space
  • Locally Optimal Risk Aware Path Planning
  • Massively Parallel Reinforcement Learning Using Trust Region Policy Optimization
  • Model-Free Learning of Casino Blackjack
  • Model-Free Reinforcement Learning of a Modified Helicopter Game
  • Model-Free Reinforcement Learning on Flappy Bird
  • Modeling Disaster Evacuation Paths
  • Modeling Flight Delay and Cancelation
  • Modeling NBA Matchups
  • Modeling Optimally Efficient Earth to Earth Flight Trajectories in Kerbal Space Program with Reinforcement Learning
  • Modeling Real Estate Investment with Deep Reinforcement Learning
  • Multi-Agent Cooperative Language Learning
  • Multi-armed Bandits with Unobserved Confounders
  • Multidisciplinary Design Optimization for Approximating Unsteady Aerodynamics of Flexible Aircraft Structures
  • Navigating Chaos: Autonomous Driving in a Highly Stochastic Environment
  • Optimal Flight Itineraries Under Uncertainty Using a Stochastic Markov Decision Process
  • Optimal Strategy for Two-Player Incremental Classification Games Under Non-Traditional Reward Mechanisms
  • Optimizing sequential time-lapse seismic data collection using a POMDP
  • Personal Portfolio Asset Allocation as An MDP Problem
  • Planetary Lander with Limited Sensor Information and Topographical Uncertainty
  • Playing Flappy Bird Using Deep Reinforcement Learning
  • POMDP and MDP for Underwater Navigation
  • POMDP Modeling of a Simulated Automatic Faucet for Cognitive State and Task Recognition
  • Portfolio Management
  • Power Grid real time optimization using Q-Learning
  • Predicting Congressional Voting Behavior and Party Affiliation using Machine Learning
  • Predicting Income From OkCupid Profiles
  • Predicting NBA Game Outcomes using POMDPs
  • Predicting Subjective Sleep Quality
  • Preparation of Papers for AIAA Technical Journals
  • Pursuit-Evasion Game with an Agent Unaware of its Role
  • Rapid Reinforcement Learning by Injecting Stochasticity into Bellman
  • Real Time Collision Detection and Identification for Robotic Manipulators
  • Reinforcement Learning Applied to Quadcopter Hovering
  • Reinforcement Learning Approaches to Pathfinding in Stochastic Environments
  • Reinforcement Learning For A Reach-Avoid Game
  • Reinforcement Learning for Atari Breakout
  • Reinforcement Learning for Crypto-Currency Arbitrage Bot
  • Reinforcement Learning for Precision Landing of a Physically Simulated Rocket
  • Reinforcement learning in an online multiplayer game
  • Reinforcement Learning of Blackjack Variants
  • Reinforcement training of nonlinear reduced order models
  • Reward Shaping with Dynamic Guidance to Accelerate Learning for Multi-Agent Systems
  • Risk – Bayesian World Conquest
  • Roboat: Reinforcement of Boat’s Optimal Adaptive Trajectory
  • Robotic Arm Motion Planning Based on Reinforcement Learning
  • Robotic Decision Making Under Uncertainty
  • Sensor Selection for Energy-Efficient Path Planning Under Uncertainty
  • ShAIkespeare: Generating Poetry with Reinforcement Learning and Factor Graphs
  • Shared Policies in Aircraft Avoidance
  • Simulated Autonomous Driving with Deep Reinforcement Learning
  • Simulating Coverage Path Planning with Roomba
  • SLAMming into Obstacles: Simultaneous Localization and Mapping the Path of a Turtlebot
  • Smart Health Coach: Using Markov Decision Processes to Optimize Health Advising Strategies
  • Smarter Queues by Reinforcement Learning
  • Solving Real-world Oil Drilling Problem with Multi-Armed Bandit and POMDP Models
  • Stay in Your Lane: Probabilistic Vehicular Automation for DIY Robocars
  • Supervised Learning and Reinforcement Learning for Algorithmic Trading
  • Taking Out the GaRLbage
  • Terrain Relative Navigation and Path Planning for Planetary Rovers
  • Time-Constrained Sample Retrieval in a Martian Gridworld with Unknown Terrain
  • Trade-offs in Connect Four Game-Playing Agents
  • Training an Intelligent Driver on Highway Using Reinforcement Learning
  • UAV Collision Avoidance Using Neural Network-Assisted Q-Learning
  • Understanding Limitations of Network Meta-analysis Approaches to Rank Effectiveness of Treatments
  • Using Bayesian Networks to Impute Missing Data
  • Using Bayesian Networks to Predict Credit Card Default
  • Using Bayesian Networks to Understand and Predict Wildfires
  • Using Classification Models to Represent and Predict Students’ Restaurant Preferences
  • Using Q-Learning to Optimize Lunar Lander Game Play
  • Using the QMDP Method to Determine an Open Ocean Fishing Policy
  • Utilizing Fundamental Factors in Reinforcement Learning for Active Portfolio Management

Fall 2016

  • Model-Free Reinforcement Learning of Blackjack
  • Partially Observable Actions in Solving Markov Decision Processes. The Case for Insulin Dosing Optimization in Diabetic Patients.
  • Using Monte-Carlo Tree Search to Solve the Board Game Hive
  • Blackjack: How to use MDP’s to (nearly) beat the house
  • Cancer Metabolism Mapping: Bayesian Networks and Network Learning Techniques to Understand Cancer Metabolic and Regulatory Pathways
  • Gibbs Sampling in BayesNets.jl
  • UAV Collision Avoidance Policy Optimization with Deep Reinforcement Learning
  • Improving Training Efficiency in Deep Q-Learning for Atari Breakout
  • Monitoring Machine Workload Distribution with Kalman Filter
  • Approximating Transition Functions to Cart Track MDPs via Sub-State Sampling
  • Approaching Quantitative Trading with Machine Learning
  • Structure and Parameter Learning in Bayesian Networks with Applications to Predicting Breast Cancer Tumor Malignancy in a Lower Dimension Feature Space
  • Autonomous Racing by Learning from Professionals
  • Bravo Zulu: Optimizing Teammate Selection for Military and Civilian Applications
  • Investigating Transfer Learning in Deep Reinforcement Learning Context
  • Simultaneous Estimation and Control with MCTS
  • Controlling Soft Robots with POMCP
  • Automatic Learning of Computer Users’ Habits
  • Learning to Play Soccer in the OpenAI Gym
  • Playing Ultimate Tic-Tac-Toe with TD Learning and Monte Carlo Tree Search
  • A Bayesian Network Model of Pilot Response to TCAS Resolution Advisories
  • Improving Head Impact Kinematics Measurement Accuracy using Sensor Fusion
  • Drive Decision Making at Intersections
  • Deterministic and Bayesian Techniques for Spaceborne Vision-Based Non-Stellar Object Detection
  • A Two-Phased Deep Reinforcement Learning Algorithm for High-Frequency Trading
  • Implementation and Experimentation of a DQN solver in Julia for POMDPs.jl
  • Landing on the Moon
  • Deserted Island: Cooperative Behavior in Absence of Explicit Delayed Reward
  • DeepGo.py
  • Managing Groundwater under Uncertain Seasonal Recharge
  • Using Reinforcement Learning to Find Flaws in Collision Avoidance Systems
  • Effectiveness of Bayesian Networks in Building a Prediction Model for Movie Success
  • Data Driven Agent based on Aircraft Intent
  • Deep Q-Learning with Natural Gradients
  • A Shot in the Dark: Beating Battleship with POMCP
  • Accelerated Asynchronous Deep Reinforcement Learning Variant of Advantage Actor-Critic
  • Applying Reinforcement Learning and Online Methods on the Inverted Pendulum Problem
  • Predicting Sentiment with Deep Q-Learning
  • A Lookahead Strategy for Super-Level Set Estimation using Gaussian Processes
  • Modeling Breast Cancer Treatment as a Markov Decision Process
  • Learning 31 using Cross-Entropy Methods
  • Improving Haptic Guidance using Reinforcement Learning
  • NLPLab: Actor-Critic Training in Natural Language Processing
  • Deep Reinforcement Learning on Atari Breakout
  • Reinforcement Learning for LunarLander
  • Reinforcement Learning for AI Machine Playing Hearthstone
  • Using Deep Q-Learning to Automate CNN Training
  • Automatic Continuous Variable Encoder in Bayesian Network
  • Side Channel Analysis using Neural Networks and Random Forests
  • A Decision-Making System for Wildfire Management
  • Decentralized Game Theoretic Methods for the Distributed Graph Coverage Problem
  • Autonomous altitude control for high altitude balloons
  • Neural Network Arbitration for Better Time and Accuracy trade-offs
  • Deep Deterministic Policy Gradient with Robot Soccer
  • Towards a Personal Decision Support System
  • Optimal Gerrymandering under Uncertainty
  • The Ambulance Dilemma: Crossing an Intersection with Monte Carlo Tree Search
  • DeepDominion
  • Developmental Policy Design: an MDP approach
  • Training of a craps betting strategy with Reinforcement Learning Techniques
  • Engineering a Better Monkey
  • Decision Making During a Bicycle Race
  • Using Discrete Pressure Measurements to Understand Subsonic Bluff-Body Dynamic Damping
  • Effective Move Selection in Chess Without Lookahead Search
  • Solving Texas Hold’em with Monte-Carlo Planning
  • Reinforcement Learning of High-Speed Autonomous Driving through Unknown Map
  • Implementation and deployment of particle filter for simulated and real-world localization tasks
  • Tree Augmented Naive Bayes and Backward Simulation
  • Transfer of Q values across tasks in Reinforcement Learning
  • Training Regime Modifications for Deep Q-Network Learning Acceleration
  • Reinforce Optimizer
  • Approximating Ligand Docking Using a Markov Decision Process
  • Breaking Down Social Media Filter Bubbles via Reinforcement Learning
  • Performing an N-Sentiment Classification Task on Tinder Profiles Based On Image Feature Extraction
  • Play Blackjack With Monte Carlo Simulation And Q-learning with Linear Regression
  • Observer-Actor Neural Networks for Self-Play in Imperfect Information Games
  • Using Hybrid Bayes Nets to Model Country Prosperity
  • Solving a Pandemic! Various Approaches for Tackling the Board Game
  • Improved Markov Decision Process Model for Resource Allocation in Disaster Scenarios
  • Learning Chess through Reinforcement Learning
  • Deep Reinforcement Learning For Continuous Control: An Investigation of Techniques and Tricks
  • Computer Vision Through Perception: Semantic Understanding of Novel Scenes through Data Programming
  • Path Planning for Insertion of a Bevel-Tip Needle
  • Modeling human biases through reinforcement learning
  • Bootstrapping Neural Network with Auxiliary Tasks
  • Q-Learning Application in Optimizing Pokémon Battle Strategy
  • Model-based exploration in natural language generation
  • Automated Aircraft Touchdown
  • Longitudinal Vehicle Control using a Markov Decision Process and Deep Neural Network
  • MOMDP-based Aerial Target Search Optimization
  • Greedy Thick-Thinning Structure Learning and Bayesian Network Conditional Independence Implementations in BayesNets.jl
  • Multiagent Planning For Aerial Broadband Internet
  • Viral Marketing as an MDP
  • Neural Soccer – Towards Exploration by the Pursuit of Novelty
  • Locally Weighted Value Iteration in Julia
  • Fully-Nested Interactive POMDPs for Partially-Observable Turn-Based Games
  • Optimal Policy Considerations for Gas Turbine Maintenance
  • Learning Optimal Manipulation of Food Webs
  • Estimating Resource Prospector’s Probability of Failure Using Importance Sampling and Cross Entropy
  • Dynamically Discount Deep Reinforcement Learning
  • Deep Reinforcement Learning: Accelerated Learning with Effective Gradient Ascent Optimization Algorithms
  • Autonomous Human Tracking in Simulated Environment
  • A LQG Library for POMDPs.jl

Fall 2015

  • Mars Hab-Bot: Using MDPs to simulate a robot constructing human-livable habitats on Mars
  • A Value Iteration Study of BlackJack
  • Optimized Store-Stocking via Monte Carlo Tree Search with Stochastic Rewards
  • Trajectory Planning for Map Exploration Using Terrain Features
  • Instruction Following with Deep Reinforcement Learning
  • Using Markov Decision Processes to Minimize Golf Score
  • Reinforcement Learning for Scheduling I/O Requests on Multi-Queue Flash Storage
  • Finding the Perfect ‘Job’ in resource allocation
  • Maximizing Influence in Social Networks
  • A Machine Learning Regression Approach to General Game Playing
  • Modeling GPS Spoofing as a Partially Observable Markov Decision Process
  • Travel Hacking with MDPs
  • Optimal Mission Planning for a Satellite-Based Particle Detector via Online Reinforcement Learning
  • An MDP Approach to Motion Planning for Hopping Rovers on Small Solar System Bodies
  • Sampling Strategies for Deep Reinforcement Learning
  • Descriptive Power of Bayesian Structure Learning in Stock Market
  • Large-Scale Traffic Grid Signal Control Using Fuzzy Logic and Decentralized Reinforcement Learning
  • Simulated Pedestrian-like Navigation with a 1D Kalman Filter with an Accelerometer and the Global Positioning System
  • Search and Track Tradeoff for Multifunction Radars
  • Play Calling in American Football Using Value Iteration
  • Reinforcement learning for commodity trading
  • Learning the Stock Market, a Naive Approach
  • A POMDP Framework for Modelling Robotic Guidance During a Tissue Palpation Task
  • Toy Helicopter Control via Deep Reinforcement Learning
  • Gas Refuelling Optimization Modelled as a Markov Decision Process
  • Q-Matrix and Policy Compression via Deep Learning
  • Augmenting Self-Learning In Chess Through Expert Imitation
  • Monte Carlo Tree Search Applied to a Variant of the RockSample Problem
  • Supply Chain Management using POMDPs
  • Online Markov Decision Process Framework for Modeling Efficient Home Robot Cleaners
  • Reinforcement Learning for Path Planning with Soft Robotic Manipulators
  • Exploring POMDPS with Recurrent Neural Networks
  • Tic-tac-toe with reinforcement learning: best strategies and influence of parameters
  • Vehicle Speed Prediction using Long Short-Term Memory Networks
  • Explorations on Learning Bayesian Networks
  • Playing unknown game on a visual world
  • Reinforcement Learning for Atari Games
  • Q-learning in the Game of Mastermind
  • Modeling of a Baseball Inning as MDP
  • Autonomous Driving on a Multi-lane Highway Using POMDPs
  • Solving a Maze Without Location Data
  • Markov Decision Processes and Optimal Policy Determination for Street Parking
  • Solving an opponent-based match-three mobile game
  • Life begins as a POMDP: improving decision making in the IVF clinic
  • Path Planning for Target-Tracking Unmanned Aerial Vehicle
  • Discrete State Filter Implementation for a Battleships Artificial Intelligence
  • POMDP for Search and Rescue with Obstacle Avoidance: Incorporation of Human in the Loop
  • Application performance over cellular networks
  • Solving Dudo: beating Liar’s Dice with a POMDP
  • Reinforcement Learning for Tetris
  • Robot Path Planning using Monte Carlo POMDP
  • Reinforcement Learning of an Artificially Intelligent Hearts Player
  • Enhancing Computational Efficiency of PILCO Model-based Reinforcement Learning Algorithm
  • Analysis of UCT Exploration Parameter in Sailing Domain Problems
  • Solving a Search and Rescue Planning problem with MOMDPs
  • Robot Motion Planning in Unknown Environments using Monte Carlo Tree Search
  • Delivery optimization of an on-demand delivery service
  • Solving Multi-Agent Decision Making using MDPs
  • Efficient and Modular Inventory Management Framework for Small Businesses
  • Markov Decision Processes in Board Game Playing
  • Automated Model Selection via Gaussian Processes
  • Predictive Hybrid Vehicle Control Policy
  • Optimal Policies for In-Space Satellite Communications
  • Spacecraft Navigation in Cluttered, Dynamic Environments Using 3D Lidar
  • Playing Chess Endgames using Reinforcement Learning
  • Space Debris Removal
  • Large-Scale Traffic Grid Signal Control Using Fuzzy Logic and Decentralized Reinforcement Learning
  • Relation Extraction from Scratch
  • Lane Merging as a Markov Decision Process
  • Using MDP/POMDP to Help in Search of Survivors of a Plane Crash
  • Applying POMCP to Controlling Partially Observable Diffusion Processes
  • Credit Risk Classification using Bayesian Network

Fall 2014

  • Automating Air Traffic Management for Flight Arrivals
  • Policy Learning for Sokoban
  • Flight Path Optimization Under Constraints Using a Markov Decision Process Approach
  • Visual Localization and POMDP for Autonomous Indoor Navigation
  • Monte Carlo Tree Search for Online Learning in Golf Course Management
  • Pushing on Leaves
  • Beating 2048
  • Improved electrical grid balancing with demand response scheduled by an MDP
  • Multi-Fidelity Model Management in Engineering Design Optimization Using Partially Observable Markov Decision Processes
  • Smarter Generators in Power Markets
  • Beach Paddle Ball
  • Applying POMDP to RockSample problem
  • Targeting Hostile Vehicle Modeled as a Partially Observable Markov Decision Process with State-Dependent Observation Model
  • Reinforcement Learning and Linear Gaussian Dynamics Applied to Multifidelity Optimization of a Supersonic Wedge
  • Approximate POMDP Solutions for Short-Range UAV Traffic Conflict Resolution
  • WorkSmart: The Implementation of a Modified Q-Learning Algorithm for an Intelligent Daily To-Do List Android Application
  • Imminent Obstacle Avoidance with Friction Uncertainty
  • Dynamic Restrictions during Commercial Space Vehicle Launches
  • Autonomous Direct Marketing with Deep Q-Learning
  • Efficient Risk Estimation for Chance-Constrained Robotic Motion Planning Under Uncertainty
  • Probabilistic Aircraft Arrival Rate Prediction
  • Audio Keylogging: Translating Acoustic Signals into Keystrokes
  • Collision Avoidance for Small Multi-Rotor Aircraft using SARSA(λ) and Fourier Basis Functions
  • Reinforcement Learning with Tetris
  • Stock Market Reinforcement Learning
  • Obstacle Avoidance for Automated Vehicle using Markov Decision Processes
  • Control of Epidemics on a Graph
  • Autonomous ATC for non-towered airports
  • Path Planning for Terrain Relative Navigation using POMDPs
  • Vehicle Braking Controller in a Markov Decision Process Framework
  • Multi-Armed Bandit Heuristics for HTTP Denial-of-Service Attacks
  • Structure Learning for Probabilistic Driving Models
  • Casino Blackjack Modeled as a Markov Decision Process
  • Competitive Collision Avoidance
  • Efficient Sampling Of Protein Landscapes Via Markov Decision Processes
  • Flight Deck Interval Management (An MDP Approach)
  • BGT Model for Analysis of Head-On Collisions
  • Collision Avoidance System Parameter Optimization
  • Dynamic Demand Prediction and Routing for Autonomous Mobility-on-Demand Systems
  • Action-Constrained, Multi-Species Task Scheduling: The Kayaker Problem
  • Reinforcement Learning with Low-rank Matrix Factorization
  • Automated Sequencing and Spacing of Arrival Aircraft in Final Vector Approach Airspace
  • Exploring Policy Learning for Blackjack