rofunc.learning.RofuncRL.models.critic_models
1. Module Contents
1.1. Classes
1.2. API
- class rofunc.learning.RofuncRL.models.critic_models.BaseCritic(cfg: omegaconf.DictConfig, observation_space: Optional[Union[int, Tuple[int], gym.Space, gymnasium.Space, List]], action_space: Optional[Union[int, Tuple[int], gym.Space, gymnasium.Space]], state_encoder: Optional[torch.nn.Module] = EmptyEncoder(), cfg_name: str = 'critic')
Bases: torch.nn.Module
- freeze_parameters(freeze: bool = True) -> None
Freeze or unfreeze the internal parameters.
:param freeze: Freeze (True) or unfreeze (False) the parameters
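The following is a minimal sketch of what freezing parameters usually means for a `torch.nn.Module`, not the library's own code. It assumes freezing is done by toggling `requires_grad` on every parameter; the helper name `set_frozen` and the small `Linear` layer are hypothetical stand-ins for a critic.

```python
import torch


def set_frozen(module: torch.nn.Module, freeze: bool = True) -> None:
    """Hypothetical helper: disable (freeze) or enable gradient tracking."""
    for param in module.parameters():
        # A frozen parameter is excluded from gradient computation.
        param.requires_grad = not freeze


# Stand-in network; a real critic would be built from cfg and the spaces.
net = torch.nn.Linear(4, 1)

set_frozen(net, freeze=True)
frozen_flags = [p.requires_grad for p in net.parameters()]

set_frozen(net, freeze=False)
unfrozen_flags = [p.requires_grad for p in net.parameters()]
```

Freezing in this way is commonly applied to target networks, which are updated by copying weights rather than by backpropagation.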
- update_parameters(model: torch.nn.Module, polyak: float = 1) -> None
Update the internal parameters by a hard or soft (Polyak averaging) update.
  Hard update: \(\theta = \theta_{net}\)
  Soft (Polyak averaging) update: \(\theta = (1 - \rho) \theta + \rho \theta_{net}\)
:param model: Model used to update the internal parameters
:param polyak: Polyak hyperparameter between 0 and 1 (default: 1). A hard update is performed when its value is 1
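The two update rules can be checked with a small worked example. This sketch applies the formulas to plain lists of floats rather than to model tensors; the function name `polyak_update` is hypothetical and is not part of the documented API.

```python
def polyak_update(theta, theta_net, polyak=1.0):
    """Return theta after a hard (polyak == 1) or soft (polyak < 1) update.

    Soft rule: theta = (1 - rho) * theta + rho * theta_net
    """
    if not 0.0 <= polyak <= 1.0:
        raise ValueError("polyak must be between 0 and 1")
    if polyak == 1.0:
        # Hard update: copy the source parameters unchanged.
        return list(theta_net)
    return [(1.0 - polyak) * t + polyak * s for t, s in zip(theta, theta_net)]


theta = [0.0, 2.0]      # current (internal) parameters
theta_net = [1.0, 4.0]  # parameters of the model passed in

hard = polyak_update(theta, theta_net, polyak=1.0)  # [1.0, 4.0]
soft = polyak_update(theta, theta_net, polyak=0.5)  # [0.5, 3.0]
```

With a small value such as `polyak=0.005`, the internal parameters move only slightly toward `theta_net` on each call, which is the usual reason for choosing a soft update.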
- class rofunc.learning.RofuncRL.models.critic_models.Critic(cfg: omegaconf.DictConfig, observation_space: Optional[Union[int, Tuple[int], gym.Space, gymnasium.Space, List]], action_space: Optional[Union[int, Tuple[int], gym.Space, gymnasium.Space]], state_encoder: Optional[torch.nn.Module] = EmptyEncoder(), cfg_name: str = 'critic')
Bases: rofunc.learning.RofuncRL.models.critic_models.BaseCritic