<p>Hacker News</p><p>How to scale RL to 10^26 FLOPs</p><p><a href="https://blog.jxmo.io/p/how-to-scale-rl-to-1026-flops" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">blog.jxmo.io/p/how-to-scale-rl</span><span class="invisible">-to-1026-flops</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/scaleRL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>scaleRL</span></a> <a href="https://mastodon.social/tags/FLOPs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FLOPs</span></a> <a href="https://mastodon.social/tags/reinforcementLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>reinforcementLearning</span></a> <a href="https://mastodon.social/tags/AIresearch" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIresearch</span></a> <a href="https://mastodon.social/tags/optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>optimization</span></a></p>