Why reinforcement learning?

post from 20.08.2022

The principal idea is to get “more”, in the sense of achieving better results in all aspects of life, which is also the essence of life from an evolutionary point of view. The difference from other Machine Learning approaches is that in most real-life behaviour scenarios or decision-making situations we don’t have the right answers about how to behave. We become more experienced, understand life better year by year and, through trial and error, become better at making decisions (I hope so). This describes not only the life of people or of any other creature; it is also the basis of Reinforcement learning. The principles of life inspire Reinforcement learning, just as they inspire Evolution strategies/algorithms.
You are like an RL Agent that tries to weigh the future situation after each possible Action and chooses the best one, or like an individual who notices what made another person successful and tries to do more of the same in order to have some goods in life. Life isn’t instant; it is a process. Some Rewards are delayed, and some, on the contrary, become smaller with time: they are discounted. The goal of RL is to maximize the total cumulative Reward. Such a formulation of the task applies to many areas and real-life decision-making cases, limited only by your imagination and by where you choose to apply your efforts. Everyone wants to know the trade-off that gets the best juice out of life. The point is that the juice is different for everyone. To be successful in life, it is important to allocate resources properly. You need to decide what is best for you while keeping in mind that everything has a cost, effects and consequences.
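
To make the idea of delayed and discounted Rewards a bit more concrete, here is a minimal sketch of how a total cumulative Reward can be computed. The function name `discounted_return` and the discount factor `gamma` are my own illustrative choices (the usual textbook convention), not something fixed by this post.

```python
def discounted_return(rewards, gamma=0.9):
    """Total cumulative Reward, where the Reward at step t is weighted by gamma**t.

    `gamma` is an assumed discount factor between 0 and 1:
    the smaller it is, the less a delayed Reward is worth today.
    """
    total = 0.0
    # Walk backwards so every later Reward picks up one more factor of gamma.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# A Reward of 1.0 that arrives two steps late is worth 0.5 * 0.5 * 1.0 = 0.25 now.
example = discounted_return([0.0, 0.0, 1.0], gamma=0.5)
```

With `gamma=1.0` nothing is discounted and the result is just the plain sum of Rewards.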
We live on the Earth. It is our Environment. We affect the Environment and the Environment affects us. The information that surrounds us in the Environment is the State. We take Actions based on the current State. Based on the current State and our Action, the Environment produces a Reward for us as well as a new State. The main loop of RL is a constant exchange of Actions, States and Rewards, called SARS' (State, Action, Reward, New State). Feedback in the form of the Reward and the current State affects our future Actions, and our current Actions affect future States and Rewards. So, why Reinforcement learning? Because it gives inspiration and hope to maximize the total Reward.
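
The exchange of States, Actions and Rewards described above can be sketched in a few lines of code. This is only a toy illustration under my own assumptions: the `LineWorld` Environment, the `run_episode` helper and the random policy are hypothetical names invented for this example, not part of any library.

```python
import random


class LineWorld:
    """Hypothetical toy Environment: positions 0..4, the goal is position 4."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # The Environment answers an Action with a new State and a Reward.
        new_state = min(4, max(0, self.state + action))
        reward = 1.0 if new_state == 4 else 0.0
        self.state = new_state
        done = new_state == 4
        return new_state, reward, done


def run_episode(env, policy, max_steps=100):
    """Collect (State, Action, Reward, New State) tuples for one episode."""
    transitions = []
    state = env.state
    for _ in range(max_steps):
        action = policy(state)  # the Agent acts based on the current State
        new_state, reward, done = env.step(action)
        transitions.append((state, action, reward, new_state))
        state = new_state  # the new State becomes the current State
        if done:
            break
    return transitions


# An Agent with no experience yet: it moves left (-1) or right (+1) at random.
rng = random.Random(0)
transitions = run_episode(LineWorld(), lambda state: rng.choice([-1, 1]))
total_reward = sum(reward for _, _, reward, _ in transitions)
```

Each element of `transitions` is one SARS' tuple. A learning Agent would use these tuples to replace the random policy with one that collects more total Reward.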

post from 02.01.2025

I wrote the post above when I had been learning RL for 3 months. It was clear to me that this direction in ML best suits my character, temperament and interests in life. After 2.5 years of working in this area as an RL engineer, my point of view has not changed: I plan to continue working in applied RL as well as Genetic Algorithms. This RL course is created for young RL engineers, to put together the materials that would have made my own path in RL easier.


I wish you to go through this course painstakingly and without any haste. I hope it helps to know that, according to statistics, only 5% of people who start a course eventually finish it, and I wish you to be in that 5% group. It is really important.