Publication

A Reinforcement Learning Method for Continuous Domains Using Artificial Hydrocarbon Networks

2018 , Ponce, Hiram , González Mora, José Guillermo , Martinez-Villaseñor, Lourdes

Reinforcement learning in continuous states and actions has received limited study, owing to difficulties in determining the transition function and poor performance in continuous-to-discrete relaxation problems, among other issues. Yet real-world problems, e.g. robotics, require these methods for learning complex tasks. Thus, in this paper, we propose a method for reinforcement learning with continuous states and actions using a model-based approach learned with artificial hydrocarbon networks (AHN). The proposed method models the dynamics of the continuous task with the supervised AHN method. Initial random rollouts and subsequent data collection from policy evaluation improve the training of the AHN-based dynamics model. Preliminary results on the well-known mountain car task showed that artificial hydrocarbon networks can contribute to model-based approaches in continuous RL problems in both estimation efficiency (a root-mean-squared error of 0.0012) and sub-optimal policy convergence (reached in 357 steps), in just 5 trials over a parameter space θ ∈ R^86. Data from experimental results are available at: http://sites.google.com/up.edu.mx/reinforcement-learning/ ©2018 IEEE.
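The first stage described in the abstract, collecting random rollouts and fitting a supervised dynamics model, can be illustrated with a minimal sketch. This is not the authors' implementation: an ordinary least-squares model over hand-picked features stands in for the AHN, the mountain car constants follow a commonly used continuous formulation and are assumptions here, and all function names are hypothetical.

```python
import math
import random

# Assumed constants from a commonly used continuous mountain car formulation.
POWER, GRAVITY = 0.0015, 0.0025
X_MIN, X_MAX, V_MAX = -1.2, 0.6, 0.07


def step(x, v, a):
    """True environment dynamics, treated as unknown by the learner."""
    v = v + POWER * a - GRAVITY * math.cos(3 * x)
    v = max(-V_MAX, min(V_MAX, v))
    x = max(X_MIN, min(X_MAX, x + v))
    if x == X_MIN and v < 0:
        v = 0.0  # the car stops when it hits the left wall
    return x, v


def features(x, v, a):
    # Hand-picked features; a stand-in for what the AHN would learn.
    return [x, v, a, math.cos(3 * x), 1.0]


def random_rollouts(n_rollouts, horizon, rng):
    """Collect (features, next_state) pairs under uniformly random actions."""
    data = []
    for _ in range(n_rollouts):
        x, v = rng.uniform(-0.6, -0.4), 0.0
        for _ in range(horizon):
            a = rng.uniform(-1.0, 1.0)
            nx, nv = step(x, v, a)
            data.append((features(x, v, a), (nx, nv)))
            x, v = nx, nv
    return data


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (M[r][n] - sum(M[r][k] * w[k] for k in range(r + 1, n))) / M[r][r]
    return w


def fit(data):
    """Least-squares fit (normal equations), one weight vector per state dimension."""
    d = len(data[0][0])
    A = [[sum(f[i] * f[j] for f, _ in data) for j in range(d)] for i in range(d)]
    return [solve(A, [sum(f[i] * y[k] for f, y in data) for i in range(d)])
            for k in range(2)]


def predict(weights, x, v, a):
    """Predicted (next position, next velocity) for a state-action pair."""
    f = features(x, v, a)
    return tuple(sum(wi * fi for wi, fi in zip(w, f)) for w in weights)


def rmse(weights, data):
    """Root-mean-squared one-step prediction error over both state dimensions."""
    err = 0.0
    for f, y in data:
        pred = [sum(wi * fi for wi, fi in zip(w, f)) for w in weights]
        err += sum((p - t) ** 2 for p, t in zip(pred, y))
    return math.sqrt(err / (2 * len(data)))


rng = random.Random(0)
model = fit(random_rollouts(20, 200, rng))
held_out = random_rollouts(5, 200, rng)
print(rmse(model, held_out))
```

In the method as summarized above, the fitted model would then be used for policy evaluation, and the data gathered during evaluation would be added to the training set before refitting. That loop is omitted here.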

Publication

A Methodology Based on Deep Q-Learning/Genetic Algorithms for Optimizing COVID-19 Pandemic Government Actions

2020 , Miralles-Pechuán, Luis , Jiménez, Fernando , Ponce, Hiram , Martinez-Villaseñor, Lourdes

Whenever countries are threatened by a pandemic, as is the case with the COVID-19 virus, governments need help to take the right actions to safeguard public health as well as to mitigate the negative effects on the economy. A restrictive approach can seriously damage the economy. Conversely, a relaxed one may put a high percentage of the population at risk. Other investigations in this area focus on modelling the spread of the virus or estimating the impact of the different measures on its propagation. In this paper, however, we propose a new methodology for helping governments plan the phases to combat the pandemic based on their priorities. To this end, we implement the SEIR epidemiological model to represent the evolution of the COVID-19 virus in the population. To find the best sequences of actions governments can take, we propose a methodology with two approaches, one based on Deep Q-Learning and another based on Genetic Algorithms. The sequences of actions (confinement, self-isolation, two-meter distance or no restrictions) are evaluated according to a reward system focused on meeting two objectives: firstly, keeping the number of infected people low so that hospitals are not overwhelmed, and secondly, avoiding drastic measures which could cause serious damage to the economy. The conducted experiments evaluate our methodology based on the accumulated rewards during the established period. The experiments also prove that it is a valid tool for governments to reduce the negative effects of a pandemic by optimizing the planning of the phases. According to our results, the approach based on Deep Q-Learning outperforms the one based on Genetic Algorithms. © 2020 ACM.
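The evaluation loop described in the abstract, an SEIR simulation whose transmission depends on the chosen action and a reward that balances infections against economic cost, might look roughly like the sketch below. Every numeric value, the reward shape, and all names are illustrative assumptions rather than the paper's settings, and neither the Deep Q-Learning agent nor the Genetic Algorithm is implemented; the sketch only scores a fixed sequence of actions.

```python
# All numeric values are illustrative assumptions, not values from the paper.
# Each action maps to (contact-rate multiplier, economic cost per day).
ACTIONS = {
    "none": (1.00, 0.0),
    "distance": (0.70, 0.1),
    "self_isolation": (0.45, 0.4),
    "confinement": (0.15, 1.0),
}
BETA = 0.5         # assumed baseline transmission rate per day
SIGMA = 1 / 5.2    # assumed rate at which exposed become infectious
GAMMA = 1 / 10     # assumed recovery rate


def seir_step(state, action, dt=1.0):
    """One forward-Euler step of the SEIR model under the given action."""
    s, e, i, r = state
    n = s + e + i + r
    multiplier, _ = ACTIONS[action]
    new_exposed = BETA * multiplier * s * i / n * dt
    new_infectious = SIGMA * e * dt
    new_recovered = GAMMA * i * dt
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def reward(state, action, capacity=0.01, health_weight=50.0):
    """Penalize infections above an assumed hospital capacity, plus economic cost."""
    s, e, i, r = state
    overload = max(0.0, i / (s + e + i + r) - capacity)
    return -health_weight * overload - ACTIONS[action][1]


def evaluate(plan, state=(0.999, 0.0, 0.001, 0.0)):
    """Return (accumulated reward, peak infectious fraction) for a sequence of actions."""
    total, peak = 0.0, state[2]
    for action in plan:
        state = seir_step(state, action)
        total += reward(state, action)
        peak = max(peak, state[2])
    return total, peak


total_none, peak_none = evaluate(["none"] * 120)
total_conf, peak_conf = evaluate(["confinement"] * 120)
print(total_none, peak_none)
print(total_conf, peak_conf)
```

An optimizer such as a Q-learning agent or a genetic algorithm would search over `plan` to maximize the accumulated reward returned by `evaluate`. The two constant plans above only mark the extremes of that search space.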