<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><generator uri="https://jekyllrb.com/" version="4.1.1">Jekyll</generator><link href="https://minhlong94.github.io/blog/feed.xml" rel="self" type="application/atom+xml" /><link href="https://minhlong94.github.io/blog/" rel="alternate" type="text/html" /><updated>2022-09-20T11:19:25-05:00</updated><id>https://minhlong94.github.io/blog/feed.xml</id><title type="html">fastpages</title><subtitle>An easy-to-use blogging platform with support for Jupyter Notebooks.</subtitle><entry><title type="html">Ways to sync your code from PyCharm to a remote Linux server</title><link href="https://minhlong94.github.io/blog/tricks/2022/02/10/Pycharm-and-Linux-server.html" rel="alternate" type="text/html" title="Ways to sync your code from PyCharm to a remote Linux server" /><published>2022-02-10T00:00:00-06:00</published><updated>2022-02-10T00:00:00-06:00</updated><id>https://minhlong94.github.io/blog/tricks/2022/02/10/Pycharm%20and%20Linux%20server</id><author><name></name></author><category term="tricks" /><summary type="html">Introduction: I am not a fan of long wording, so let’s jump straight into the problem.</summary></entry><entry><title type="html">You should not treat RL as a black box: an example using League of Legends</title><link href="https://minhlong94.github.io/blog/drl/shitpost/2022/01/04/RL-Optimal-Example-With-Faker.html" rel="alternate" type="text/html" title="You should not treat RL as a black box: an example using League of Legends" /><published>2022-01-04T00:00:00-06:00</published><updated>2022-01-04T00:00:00-06:00</updated><id>https://minhlong94.github.io/blog/drl/shitpost/2022/01/04/RL%20Optimal%20Example%20With%20Faker</id><author><name></name></author><category term="DRL" /><category term="shitpost" /><summary type="html">Recently I tried to explain a project struggle to a friend who had zero knowledge of Reinforcement Learning. My friend set up a suitable action
space that allowed an agent to “neglect” a chest in a gridworld, and she thought: “if the agent is smart enough, it should learn to neglect these chests if necessary.” The key phrase is “if necessary”: she did not know which cases counted as “necessary”. In short, she did not know the optimal policy, nor under what conditions the policy would be optimal. For example, if you have 10 steps left in the gridworld and you can collect both the chest and the goal within those 10 steps, why would you neglect the chest, assuming the chest gives a bonus reward and the number of steps left does not affect the reward? Back to our problem: she wondered why the agent did not neglect the chest (expectation), but she did not know how many steps were left or where the agent was on the gridworld (reality).</summary></entry><entry><title type="html">Making your Deep RL matters.</title><link href="https://minhlong94.github.io/blog/drl/2021/09/25/Making-your-Deep-RL-matters.html" rel="alternate" type="text/html" title="Making your Deep RL matters."
/><published>2021-09-25T00:00:00-05:00</published><updated>2021-09-25T00:00:00-05:00</updated><id>https://minhlong94.github.io/blog/drl/2021/09/25/Making%20your%20Deep%20RL%20matters</id><author><name></name></author><category term="DRL" /><summary type="html">A collection of implementation tricks, hyperparameter sensitivities, and other Deep RL pitfalls that I presented to my research group.</summary></entry><entry><title type="html">Action Space Shaping in Deep RL</title><link href="https://minhlong94.github.io/blog/drl/2021/09/25/Action-space-shaping-in-Deep-Reinforcement-Learning.html" rel="alternate" type="text/html" title="Action Space Shaping in Deep RL" /><published>2021-09-25T00:00:00-05:00</published><updated>2021-09-25T00:00:00-05:00</updated><id>https://minhlong94.github.io/blog/drl/2021/09/25/Action%20space%20shaping%20in%20Deep%20Reinforcement%20Learning</id><author><name></name></author><category term="DRL" /><summary type="html">This is a presentation of the paper “Action space shaping in Deep Reinforcement Learning” by Anssi Kanervisto et al., published at the IEEE Conference on Games 2020.</summary></entry><entry><title type="html">A Neural Network to print from 1 to 50</title><link href="https://minhlong94.github.io/blog/shitpost/2021/09/10/A-Neural-Network-To-Print-From-1-to-50.html" rel="alternate" type="text/html" title="A Neural Network to print from 1 to 50" /><published>2021-09-10T00:00:00-05:00</published><updated>2021-09-10T00:00:00-05:00</updated><id>https://minhlong94.github.io/blog/shitpost/2021/09/10/A-Neural-Network-To-Print-From-1-to-50</id><author><name></name></author><category term="shitpost" /><summary type="html">Recently, a user named logo asked a question on the RL Discord server: how to print numbers from 1 to 50 in Python? Little did I know, I was about to engage in one of the funniest conversations. Pure Python approach: for i in range(1, 51): print(i). A very simple and straightforward solution.
Wait, why can’t we just print(1,2,3,4,5,6,7,8,9,10...,50)?</summary></entry></feed>