Reinforcement Learning Framework Documentation

Description

The Reinforcement Learning Framework is used to train a range of models on a range of games. It is built so that minimal user input is required.

Functionality

  1. Expose endpoints that allow the API and Frontend to connect

  2. Collect game data from API

  3. Collect data about model training and testing and pass them to API and Frontend

  4. Pass list of all models with their parameters to Frontend

Structure of the Reinforcement Learning Framework

Framework consists of the following parts:

  1. algorithms - module containing reinforcement learning algorithms and neural networks implementation

  2. api - module containing application runner and shared endpoints

  3. logger - module containing logger functionality

Algorithms Module Structure

Algorithms Module consists of the following parts:

  1. Algorithm class

  2. modules module containing neural networks implementations

  3. AlgorithmManager class

  4. Config class

  5. States enum

  6. ParameterType enum

  7. Parameter tuple subclass

Algorithm

The Algorithm class declares the methods that every algorithm implemented in the framework must provide.
NOTE - every implemented algorithm must be registered with AlgorithmManager!

The structure of Algorithm class is the following:

from abc import ABC, abstractmethod
from rl.algorithms.Config import Config

class Algorithm(ABC):
    def __init__(self, logger) -> None:
        self.logger = logger
        self.config = None

    @abstractmethod
    def forward(self, state: list, actions: list, reward: float, game_status: str) -> int:
        pass

    @classmethod
    @abstractmethod
    def get_configurable_parameters(cls) -> dict:
        pass

    @abstractmethod
    def get_model(self) -> object:
        pass

    @abstractmethod
    def set_params(self, params) -> None:
        pass

    def config_model(self, config: dict) -> None:
        self.config = Config.from_dict(config)

    def restart(self) -> None:
        pass

    def update_config(self, config: dict) -> None:
        self.config.update(config)

Methods

__init__

Initiate algorithm

def __init__(self, logger) -> None:
Input: logger used to log features
Output: None

forward

Make one iteration of the model

@abstractmethod
def forward(self, state: list, actions: list, reward: float, game_status: str) -> int:
Input: current state of the game, list of possible actions, reward for the previous action, current game status
Output: the index of the move chosen by the algorithm

get_configurable_parameters

Get all configurable parameters of the model

@classmethod
@abstractmethod
def get_configurable_parameters(cls) -> dict:
Input: None
Output: dictionary of all the model’s configurable parameters

get_model

Get model data

@abstractmethod
def get_model(self) -> object:
Input: None
Output: TO DO

set_params

Load model parameters

@abstractmethod
def set_params(self, params) -> None:
Input: model parameters
Output: None

config_model

Create model configuration using Config class

def config_model(self, config: dict) -> None:
Input: dictionary containing model configuration
Output: None

restart

Restart defined model parameters

def restart(self) -> None:
Input: None
Output: None

update_config

Update model configuration

def update_config(self, config: dict) -> None:
Input: dictionary containing model configuration
Output: None

validate

Validate whether model parameters are set properly

def validate(data: dict) -> tuple[bool, str]:
Input: dictionary containing model configuration
Output: tuple containing a bool indicating whether the parameters are set properly and an error message (if they are not)
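
The interface above can be illustrated with a small concrete algorithm. This is only a sketch: the stand-in Algorithm base class mirrors the abstract methods so the snippet runs on its own, whereas in the framework you would import the real class and register the subclass with AlgorithmManager. RandomAlgorithm, its seed argument and the shape of the dictionary returned by get_configurable_parameters are hypothetical.

```python
import random
from abc import ABC, abstractmethod


class Algorithm(ABC):
    """Stand-in for the framework's Algorithm base class (assumption)."""

    def __init__(self, logger) -> None:
        self.logger = logger
        self.config = None

    @abstractmethod
    def forward(self, state: list, actions: list, reward: float, game_status: str) -> int:
        ...

    @classmethod
    @abstractmethod
    def get_configurable_parameters(cls) -> dict:
        ...

    @abstractmethod
    def get_model(self) -> object:
        ...

    @abstractmethod
    def set_params(self, params) -> None:
        ...


class RandomAlgorithm(Algorithm):
    """Hypothetical algorithm that picks a random legal action."""

    def __init__(self, logger=None, seed: int = 0) -> None:
        super().__init__(logger)
        self.rng = random.Random(seed)

    def forward(self, state: list, actions: list, reward: float, game_status: str) -> int:
        # Return the index of the chosen move, as the interface requires.
        return self.rng.randrange(len(actions))

    @classmethod
    def get_configurable_parameters(cls) -> dict:
        # Hypothetical shape: parameter name -> description.
        return {"seed": "seed of the random number generator"}

    def get_model(self) -> object:
        # The only state worth saving here is the generator state.
        return {"rng_state": self.rng.getstate()}

    def set_params(self, params) -> None:
        self.rng.setstate(params["rng_state"])


algo = RandomAlgorithm()
move = algo.forward(state=[0, 0, 0], actions=["left", "right", "up"], reward=0.0, game_status="ongoing")
```

Saving the result of get_model and passing it back to set_params restores the generator, so the same sequence of moves is replayed.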

modules

Module containing implementations of neural networks that can later be used to build reinforcement learning models.
Neural networks do not need to adhere to any rules.
Users can implement their own neural networks when needed.

SimpleNet - example of neural network

Class implementing a multilayer perceptron with ReLU as activation function

The structure of SimpleNet is the following:

import torch
import torch.nn as nn


class SimpleNet(nn.Module):
    def __init__(self, layers: list[int]) -> None:
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(layers[i], layers[i + 1]) for i in range(len(layers) - 1)]
        )
        self.activation = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
            x = self.activation(x)
        return x

SimpleNet takes as input a list of layer sizes: the input size, the sizes of the hidden layers, and the output size.
In the forward method it then performs a simple forward propagation, applying ReLU after every layer, including the last one.
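
A short usage sketch of SimpleNet follows. The class is repeated only so the snippet runs on its own; the layer sizes and the batch shape are illustrative assumptions, not values used by the framework.

```python
import torch
import torch.nn as nn


class SimpleNet(nn.Module):
    # Same definition as above, repeated so the snippet runs on its own.
    def __init__(self, layers: list[int]) -> None:
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(layers[i], layers[i + 1]) for i in range(len(layers) - 1)]
        )
        self.activation = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
            x = self.activation(x)
        return x


# Illustrative sizes: 4 input features, one hidden layer of 16 units, 2 outputs.
net = SimpleNet([4, 16, 2])
batch = torch.zeros(8, 4)  # 8 states with 4 features each
out = net(batch)           # one row of 2 values per state
```

Because the activation is also applied after the last layer, every output value is non-negative. Keep this in mind if the network is meant to produce values that can be negative.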

AlgorithmManager

AlgorithmManager class handles all implemented algorithms. It registers all the implemented algorithms thus providing access to them to users. It also handles all operations related to setting up algorithms.

Methods

__init__

Initiate algorithm manager

def __init__(self) -> None:
Input: None
Output: None

mount

Mount algorithm manager

def mount(self, logger) -> None:
Input: logger used to log messages
Output: None

set_default_algorithm

Set default algorithm defined in the algorithm manager and configure it using default parameters

def set_default_algorithm(self) -> None:
Input: None
Output: None

set_algorithm

Set algorithm and configure it using default parameters. Log changes

def set_algorithm(self, algorithm_name: str, *args, **kwargs) -> None:
Input: name of the algorithm
Output: None

configure_algorithm

Create new configuration for currently used algorithm. Log changes

def configure_algorithm(self, config: dict) -> None:
Input: new configuration for currently used algorithm
Output: None

update_config

Update configuration of the currently used algorithm. Log changes

def update_config(self, config: dict) -> None:
Input: new configuration for currently used algorithm
Output: None

register_algorithm

Register the algorithm. Every implemented algorithm needs to be registered using the register_algorithm decorator.

def register_algorithm(self, name: str):
Input: name of the algorithm
Output: decorator to apply to the algorithm class
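
Decorator-based registration could look roughly like the sketch below. It is not the framework's implementation: the reduced AlgorithmManager only keeps a name-to-class dictionary, and the attribute name registered_algorithms and the class RandomAlgorithm are hypothetical.

```python
class AlgorithmManager:
    """Hypothetical, reduced manager that only keeps a name -> class registry."""

    def __init__(self) -> None:
        self.registered_algorithms: dict[str, type] = {}

    def register_algorithm(self, name: str):
        # Return a decorator that stores the class under `name`
        # and hands the class back unchanged.
        def decorator(cls: type) -> type:
            self.registered_algorithms[name] = cls
            return cls

        return decorator


algorithm_manager = AlgorithmManager()


@algorithm_manager.register_algorithm("random")
class RandomAlgorithm:
    pass
```

With a registry like this, a method such as set_algorithm can look the class up by name and instantiate it.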

Config

The Config class stores the configuration of an algorithm. Every algorithm uses its config to access its parameters.

Methods

from_dict

Create new configuration from dictionary

@staticmethod
def from_dict(config: dict) -> "Config":
Input: dictionary with configuration
Output: new Config instance

as_dict

Get configuration as dictionary

def as_dict(self) -> dict:
Input: None
Output: dictionary with configuration

update

Update configuration

def update(self, config: dict) -> None:
Input: dictionary with configuration
Output: None
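
A minimal sketch of how such a class might behave, assuming that from_dict returns a new Config instance (as its use in Algorithm.config_model suggests) and that configuration keys are exposed as attributes. The internal storage and the parameter names learning_rate and gamma are hypothetical.

```python
class Config:
    """Hypothetical configuration container exposing keys as attributes."""

    def __init__(self, **params) -> None:
        self.__dict__.update(params)

    @staticmethod
    def from_dict(config: dict) -> "Config":
        return Config(**config)

    def as_dict(self) -> dict:
        # Return a copy so callers cannot modify the config by accident.
        return dict(self.__dict__)

    def update(self, config: dict) -> None:
        # Overwrite existing keys and add new ones.
        self.__dict__.update(config)


config = Config.from_dict({"learning_rate": 0.01, "gamma": 0.9})
config.update({"gamma": 0.95})
```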

States

Enum defining if the model is currently in train or test mode

from enum import Enum

class States(Enum):
    TRAIN = "train"
    TEST = "test"

ParameterType

Enum defining possible model parameter types. Currently it allows for parameters to have 4 types

from enum import Enum, auto

class ParameterType(Enum):
    INT = auto()
    FLOAT = auto()
    BOOL = auto()
    STRING = auto()

Parameter

Tuple subclass for defining model parameters:

from collections import namedtuple

Parameter = namedtuple(
    "Parameter", ("type", "default", "min", "max", "help", "modifiable")
)

type - type of the parameter (types defined in ParameterType in Config)
default - default value of the parameter. Set to None if the parameter doesn’t have a default value
min - minimal value of the parameter. Set to None if the parameter doesn’t have a minimal value
max - maximal value of the parameter. Set to None if the parameter doesn’t have a maximal value
help - description of the parameter
modifiable - whether the parameter can be modified after training has started
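
The sketch below shows how Parameter and ParameterType might be combined to describe an algorithm's parameters and to check a configuration against them. The parameter names, the PYTHON_TYPES mapping and the check_config helper are hypothetical; they only illustrate the kind of check a validate method could perform.

```python
from collections import namedtuple
from enum import Enum, auto


class ParameterType(Enum):
    INT = auto()
    FLOAT = auto()
    BOOL = auto()
    STRING = auto()


Parameter = namedtuple(
    "Parameter", ("type", "default", "min", "max", "help", "modifiable")
)

# Hypothetical parameters of an epsilon-greedy algorithm.
PARAMETERS = {
    "epsilon": Parameter(ParameterType.FLOAT, 0.1, 0.0, 1.0, "Exploration rate", True),
    "hidden_size": Parameter(ParameterType.INT, 64, 1, None, "Units per hidden layer", False),
}

# Python types accepted for each ParameterType (assumption).
PYTHON_TYPES = {
    ParameterType.INT: (int,),
    ParameterType.FLOAT: (int, float),
    ParameterType.BOOL: (bool,),
    ParameterType.STRING: (str,),
}


def check_config(data: dict) -> tuple[bool, str]:
    """Check types and bounds; return (is_valid, error_message)."""
    for name, value in data.items():
        if name not in PARAMETERS:
            return False, f"Unknown parameter: {name}"
        param = PARAMETERS[name]
        # bool is a subclass of int, so reject it explicitly for numeric types.
        if isinstance(value, bool) and param.type is not ParameterType.BOOL:
            return False, f"{name} has wrong type"
        if not isinstance(value, PYTHON_TYPES[param.type]):
            return False, f"{name} has wrong type"
        if param.min is not None and value < param.min:
            return False, f"{name} is below the minimum of {param.min}"
        if param.max is not None and value > param.max:
            return False, f"{name} is above the maximum of {param.max}"
    return True, ""
```

For example, check_config({"epsilon": 1.5}) fails because the value exceeds the declared maximum, while check_config({"epsilon": 0.5}) passes.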

API Module Structure

API Module consists of the following parts:

  1. main

  2. endpoints

  3. Runner class

  4. GameResults class

  5. GameState enum

main

The main file starts the reinforcement learning framework server. The server must be running before the framework can be used.

Endpoints

The framework exposes several endpoints that give access to its methods.
They are called with the HTTP methods GET, POST and PUT.

/logs

GET - get logs from the server

/run

GET - get current runner state
POST - start/stop runner

/model

GET - return model in zip format
PUT - import model from the file

/config

GET - get current model configuration
PUT - update model configuration
POST - create new model configuration

/config-params

GET - get all model parameters (you can choose if you want modifiable parameters or all parameters)

/game-history

GET - return game history

/stats

GET - return statistics about training/testing process
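
The endpoints above can be called from any HTTP client. The sketch below uses only the Python standard library and separates building a request from sending it. The base URL, the JSON encoding of the request body and the example payload are assumptions; check the running server for the actual address and schema.

```python
import json
import urllib.request
from typing import Optional

# Assumption: the server started by main listens here; adjust to your setup.
BASE_URL = "http://localhost:8000"


def build_request(path: str, method: str = "GET", payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request to one of the endpoints."""
    data = None
    headers = {}
    if payload is not None:
        # Assumption: the server accepts JSON request bodies.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def send(request: urllib.request.Request) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


get_config = build_request("/config")
# Hypothetical payload: the real schema depends on the selected algorithm.
update_config = build_request("/config", method="PUT", payload={"learning_rate": 0.001})
```

Calling send(get_config) against a running server would return the current model configuration as raw bytes.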

Runner

Runner class is an intermediary between algorithms module, API and Frontend

Methods

__init__

Initiate runner

def __init__(self, logger: Logger, algorithm_manager: AlgorithmManager, max_game_len=100, config="config.json") -> None:
Input: logger, algorithm manager, maximum number of moves that the model can make in one iteration of the game, configuration file for connecting with the API
Output: None

_mount_socketio

Initiate connection to API

def _mount_socketio(self) -> None:
Input: None
Output: None

time

Get the running time of runner

@property
def time(self) -> float:
Input: None
Output: running time of the runner

run

Run the runner until the end condition is met, the user stops the process, or the connection is lost

def run(self) -> None:
Input: None
Output: None

start

Start the runner

def start(self) -> None:
Input: None
Output: None

stop

Stop the runner

def stop(self) -> None:
Input: None
Output: None

GameResults

GameResults class stores model training/testing results

Methods

__init__

Initiate GameResults class. Set all statistics to 0

def __init__(self) -> None:
Input: None
Output: None

store_game_results

Store game results from one iteration

def store_game_results(self, reward, game_status, is_end_game):
Input: reward from the current game, game status (won, lost, ongoing), is game finished
Output: None

__str__

Get train/test statistics as string

def __str__(self) -> str:
Input: None
Output: train/test statistics as string

get_results

Get train/test statistics as dictionary

def get_results(self):
Input: None
Output: train/test statistics as dictionary
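
A minimal sketch of a results container with the methods listed above. The statistics it tracks, the field names and the string values used for the game status are assumptions made for illustration; the framework's own class may record different statistics.

```python
class GameResults:
    """Hypothetical statistics container; field names are assumptions."""

    def __init__(self) -> None:
        self.games_played = 0
        self.games_won = 0
        self.games_lost = 0
        self.total_reward = 0.0

    def store_game_results(self, reward, game_status, is_end_game) -> None:
        # Accumulate reward on every move; count the game once it has ended.
        self.total_reward += reward
        if is_end_game:
            self.games_played += 1
            if game_status == "won":
                self.games_won += 1
            elif game_status == "lost":
                self.games_lost += 1

    def get_results(self) -> dict:
        return {
            "games_played": self.games_played,
            "games_won": self.games_won,
            "games_lost": self.games_lost,
            "total_reward": self.total_reward,
        }

    def __str__(self) -> str:
        return ", ".join(f"{key}: {value}" for key, value in self.get_results().items())


results = GameResults()
results.store_game_results(reward=1.0, game_status="ongoing", is_end_game=False)
results.store_game_results(reward=5.0, game_status="won", is_end_game=True)
```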

GameState

Enum defining the current state of the game

from enum import Enum, auto

class GameStates(Enum):
    ONGOING = auto()
    WON = auto()
    LOST = auto()

Logger Module Structure

Logger Module consists of the following parts:

  1. Logger class

  2. LogLevel enum

  3. LogType enum

Logger

Logger class is used to log operations made within the framework

Methods

__init__

Initiate logger

def __init__(self) -> None:
Input: None
Output: None

log

Print log message and save it in messages

def log(self, message: str, log_level: LogLevel, log_type: LogType) -> None:
Input: message, log level, log type
Output: None

info

Log message of info level

def info(self, message: str, log_type: LogType) -> None:
Input: message, log type
Output: None

get_messages

Get all logged messages

def get_messages(self, filter: str = None) -> list:
Input: condition on which messages will be filtered. If None, then no filter is applied
Output: list of all the messages

LogLevel

Enum defining level of the message

from enum import Enum, auto

class LogLevel(Enum):
    DEBUG = auto()
    INFO = auto()
    WARNING = auto()
    ERROR = auto()
    FATAL = auto()

LogType

Enum defining the type of the message

from enum import Enum, auto

class LogType(Enum):
    CONFIG = auto()
    TRAIN = auto()
    TEST = auto()
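
The Logger, LogLevel and LogType pieces described above might fit together as in the sketch below. The storage format of the messages and the meaning of the filter argument (taken here to be the name of a LogType) are assumptions; the enums are repeated so the snippet runs on its own.

```python
from enum import Enum, auto


class LogLevel(Enum):
    DEBUG = auto()
    INFO = auto()
    WARNING = auto()
    ERROR = auto()
    FATAL = auto()


class LogType(Enum):
    CONFIG = auto()
    TRAIN = auto()
    TEST = auto()


class Logger:
    """Hypothetical in-memory logger mirroring the methods listed above."""

    def __init__(self) -> None:
        self.messages: list[dict] = []

    def log(self, message: str, log_level: LogLevel, log_type: LogType) -> None:
        # Print the message and keep it for later retrieval.
        print(f"[{log_level.name}] [{log_type.name}] {message}")
        self.messages.append({"message": message, "level": log_level, "type": log_type})

    def info(self, message: str, log_type: LogType) -> None:
        self.log(message, LogLevel.INFO, log_type)

    def get_messages(self, filter: str = None) -> list:
        # Assumption: `filter` is the name of a LogType, e.g. "TRAIN".
        if filter is None:
            return list(self.messages)
        return [m for m in self.messages if m["type"].name == filter]


logger = Logger()
logger.info("Algorithm configured", LogType.CONFIG)
logger.log("Episode finished", LogLevel.DEBUG, LogType.TRAIN)
```

Calling logger.get_messages("TRAIN") then returns only the second message, while logger.get_messages() returns both.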