mstdn.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A general-purpose Mastodon server with a 500 character limit. All languages are welcome.

#generalization

Hacker News<p>Just Ask for Generalization</p><p><a href="https://evjang.com/2021/10/23/generalization.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">evjang.com/2021/10/23/generali</span><span class="invisible">zation.html</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/Just" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Just</span></a> <a href="https://mastodon.social/tags/Ask" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Ask</span></a> <a href="https://mastodon.social/tags/for" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>for</span></a> <a href="https://mastodon.social/tags/Generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Generalization</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Generalization</span></a> <a href="https://mastodon.social/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://mastodon.social/tags/DataScience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataScience</span></a> <a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a></p>
nf-core<p>Pipeline release! nf-core/drugresponseeval v1.1.0 - Drugresponseeval 1.1.0 - Humongous Zapdos!</p><p>Please see the changelog: <a href="https://github.com/nf-core/drugresponseeval/releases/tag/1.1.0" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/nf-core/drugrespons</span><span class="invisible">eeval/releases/tag/1.1.0</span></a></p><p><a href="https://mstdn.science/tags/celllines" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>celllines</span></a> <a href="https://mstdn.science/tags/crossvalidation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>crossvalidation</span></a> <a href="https://mstdn.science/tags/deeplearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>deeplearning</span></a> <a href="https://mstdn.science/tags/drugresponse" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>drugresponse</span></a> <a href="https://mstdn.science/tags/drugresponseprediction" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>drugresponseprediction</span></a> <a href="https://mstdn.science/tags/drugs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>drugs</span></a> <a href="https://mstdn.science/tags/fairprinciples" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>fairprinciples</span></a> <a href="https://mstdn.science/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> <a href="https://mstdn.science/tags/hyperparametertuning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hyperparametertuning</span></a> <a href="https://mstdn.science/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a> <a href="https://mstdn.science/tags/randomizationtests" class="mention hashtag" 
rel="nofollow noopener" target="_blank">#<span>randomizationtests</span></a> <a href="https://mstdn.science/tags/robustnessassessment" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>robustnessassessment</span></a> <a href="https://mstdn.science/tags/training" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>training</span></a> <a href="https://mstdn.science/tags/nfcore" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>nfcore</span></a> <a href="https://mstdn.science/tags/openscience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>openscience</span></a> <a href="https://mstdn.science/tags/nextflow" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>nextflow</span></a> <a href="https://mstdn.science/tags/bioinformatics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>bioinformatics</span></a></p>
JMLR<p>'Random Pruning Over-parameterized Neural Networks Can Improve Generalization: A Training Dynamics Analysis', by Hongru Yang, Yingbin Liang, Xiaojie Guo, Lingfei Wu, Zhangyang Wang.</p><p><a href="http://jmlr.org/papers/v26/23-0832.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">http://</span><span class="ellipsis">jmlr.org/papers/v26/23-0832.ht</span><span class="invisible">ml</span></a> <br> <br><a href="https://sigmoid.social/tags/pruning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>pruning</span></a> <a href="https://sigmoid.social/tags/pruned" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>pruned</span></a> <a href="https://sigmoid.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a></p>
PLOS Biology<p>Humans can apply solutions of past problems to new problems. <span class="h-card" translate="no"><a href="https://fediscience.org/@gershbrain" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>gershbrain</span></a></span> <span class="h-card" translate="no"><a href="https://mastodon.online/@nicoschuck" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>nicoschuck</span></a></span> &amp;co reveal the neural correlates of <a href="https://fediscience.org/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> and show that humans apply past policies in a reward-sensitive manner that leads to high performance <span class="h-card" translate="no"><a href="https://fediscience.org/@PLOSBiology" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>PLOSBiology</span></a></span> <a href="https://plos.io/3SJPMof" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">plos.io/3SJPMof</span><span class="invisible"></span></a></p>
Hacker News<p>π0.5: A VLA with open-world generalization</p><p><a href="https://pi.website/blog/pi05" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">pi.website/blog/pi05</span><span class="invisible"></span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/%CF%800" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>π0</span></a>.5 <a href="https://mastodon.social/tags/VLA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VLA</span></a> <a href="https://mastodon.social/tags/openworld" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>openworld</span></a> <a href="https://mastodon.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> <a href="https://mastodon.social/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a></p>
Games at Work dot biz<p>e509 — Maverick and&nbsp;Marbles</p><p>e509 with Michael and Michael - stories and discussion all around <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a>, <a href="https://mastodon.social/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a>, <a href="https://mastodon.social/tags/llamas" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llamas</span></a>, generated <a href="https://mastodon.social/tags/Quake" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Quake</span></a>, <a href="https://mastodon.social/tags/grokking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>grokking</span></a>, <a href="https://mastodon.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> and much more.</p><p><a href="https://gamesatwork.biz/2025/04/14/e509-maverick-and-marbles/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gamesatwork.biz/2025/04/14/e50</span><span class="invisible">9-maverick-and-marbles/</span></a></p>
nf-core<p>Pipeline release! nf-core/drugresponseeval v1.0.0 - 1.0.0!</p><p>Please see the changelog: <a href="https://github.com/nf-core/drugresponseeval/releases/tag/1.0.0" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/nf-core/drugrespons</span><span class="invisible">eeval/releases/tag/1.0.0</span></a></p><p><a href="https://mstdn.science/tags/celllines" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>celllines</span></a> <a href="https://mstdn.science/tags/crossvalidation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>crossvalidation</span></a> <a href="https://mstdn.science/tags/deeplearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>deeplearning</span></a> <a href="https://mstdn.science/tags/drugresponse" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>drugresponse</span></a> <a href="https://mstdn.science/tags/drugresponseprediction" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>drugresponseprediction</span></a> <a href="https://mstdn.science/tags/drugs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>drugs</span></a> <a href="https://mstdn.science/tags/fairprinciples" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>fairprinciples</span></a> <a href="https://mstdn.science/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> <a href="https://mstdn.science/tags/hyperparametertuning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hyperparametertuning</span></a> <a href="https://mstdn.science/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a> <a href="https://mstdn.science/tags/randomizationtests" class="mention hashtag" rel="nofollow noopener" 
target="_blank">#<span>randomizationtests</span></a> <a href="https://mstdn.science/tags/robustnessassessment" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>robustnessassessment</span></a> <a href="https://mstdn.science/tags/training" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>training</span></a> <a href="https://mstdn.science/tags/nfcore" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>nfcore</span></a> <a href="https://mstdn.science/tags/openscience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>openscience</span></a> <a href="https://mstdn.science/tags/nextflow" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>nextflow</span></a> <a href="https://mstdn.science/tags/bioinformatics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>bioinformatics</span></a></p>
Sean Murthy<p>People value us for the value (they believe) we (might) add to them. </p><p>Generalizing of course, but it's all transactional. There's no (longer) valuing people for just who they are.</p><p><a href="https://hachyderm.io/tags/society" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>society</span></a> <a href="https://hachyderm.io/tags/people" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>people</span></a> <a href="https://hachyderm.io/tags/life" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>life</span></a> <a href="https://hachyderm.io/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a></p>
Victoria Stuart 🇨🇦 🏳️‍⚧️<p>Grokking at Edge of Numerical Stability<br><a href="https://arxiv.org/abs/2501.04697" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/abs/2501.04697</span><span class="invisible"></span></a><br><a href="https://old.reddit.com/r/MachineLearning/comments/1i34keg/grokking_at_the_edge_of_numerical_stability" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">old.reddit.com/r/MachineLearni</span><span class="invisible">ng/comments/1i34keg/grokking_at_the_edge_of_numerical_stability</span></a><br><a href="https://en.wikipedia.org/wiki/Grokking_(machine_learning)" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">en.wikipedia.org/wiki/Grokking</span><span class="invisible">_(machine_learning)</span></a></p><p>* sudden generalization after prolonged overfitting<br>* a massively overtrained neural network can acquire "emergent", unexpectedly strong abilities<br>* an unexpected, accidental finding<br>* the underlying mechanisms are starting to be unraveled</p><p>Grokked Transformers are Implicit Reasoners: Mechanistic Journey to Edge of Generalization<br><a href="https://arxiv.org/abs/2405.15071" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/abs/2405.15071</span><span class="invisible"></span></a><br><a href="https://news.ycombinator.com/item?id=40495149" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">news.ycombinator.com/item?id=4</span><span class="invisible">0495149</span></a></p><p><a href="https://mastodon.social/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> <a href="https://mastodon.social/tags/ML" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ML</span></a> <a href="https://mastodon.social/tags/grokking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>grokking</span></a> <a href="https://mastodon.social/tags/NN" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NN</span></a> <a href="https://mastodon.social/tags/emergence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>emergence</span></a> <a href="https://mastodon.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a></p>
Different Than<p>A post from August 2024 by <span class="h-card" translate="no"><a href="https://mastodon.social/@grimalkina" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>grimalkina</span></a></span>, boosted by someone on another instance, about why to report demographics in research even when you're not studying those groups. This seems like a great primer for people who have little background in basic <a href="https://infosec.exchange/tags/sampling" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>sampling</span></a> and <a href="https://infosec.exchange/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> (for some reason I can't link/boost from here, so):</p><p><a href="https://mastodon.social/@grimalkina/112966685297897685" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">mastodon.social/@grimalkina/11</span><span class="invisible">2966685297897685</span></a></p><p>My 2 cents (already at least partially covered by Dr. Hicks): </p><p>1. Your study is never just about your study. Good science is <a href="https://infosec.exchange/tags/open" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>open</span></a> and reusable. e.g., maybe your study on tech-enabled healthcare access isn't specifically about LGBTQ+ or Hispanic people, but what are you doing to help a researcher who comes along in 10 years? That information will change what they find and report.</p><p>2. Marginalized groups are often minorities, meaning representative probability samples (or --uncomfortable gesture-- convenience samples) for bread-and-butter research frequently have subpopulations too small for reasonable power in correlations, group differences, etc. That's just reality. 
It's also a big problem for our understanding of <a href="https://infosec.exchange/tags/marginalized" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>marginalized</span></a> + <a href="https://infosec.exchange/tags/minority" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>minority</span></a> groups. Oversampling or targeted studies of those groups are important. It's also important to have a large number of less-targeted studies with relevant information that can be synthesized later (see #1): one study with 1.3% trans participants doesn't tell us much about the trans population, but 20 studies, each of which has 1.3% trans participants, could tell us meaningful things.</p><p>3. Representation is important. My belief is that <a href="https://infosec.exchange/tags/marginalized" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>marginalized</span></a>+minoritized people need their identities and existence public and constant. In <a href="https://infosec.exchange/tags/science" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>science</span></a>, both they and other people consuming the research will benefit from being reminded that they are there, almost always, in our <a href="https://infosec.exchange/tags/research" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>research</span></a>.</p>
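The pooling argument in point 2 can be made concrete with back-of-the-envelope numbers. A minimal sketch: the 1.3% subgroup share comes from the post above, while the per-study sample size of 1,000 is an assumption for illustration.

```python
import math

# The 1.3% subgroup share is from the post; a per-study sample
# size of 1,000 is assumed here purely for illustration.
SHARE = 0.013

def subgroup_se(n_total, share=SHARE, p=0.5):
    """Worst-case standard error of a proportion estimated within
    the subgroup of one study (or a pool of comparable studies)."""
    n_sub = n_total * share
    return math.sqrt(p * (1 - p) / n_sub)

se_one_study = subgroup_se(1_000)    # ~13 subgroup participants
se_pooled = subgroup_se(20 * 1_000)  # ~260 across 20 pooled studies
```

Under these assumptions, pooling 20 comparable studies shrinks the subgroup standard error by a factor of sqrt(20), roughly 4.5x — which is why many small, well-reported samples can be synthesized into something informative.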
JMLR<p>'Generalization on the Unseen, Logic Reasoning and Degree Curriculum', by Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Kevin Rizk.</p><p><a href="http://jmlr.org/papers/v25/24-0220.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">http://</span><span class="ellipsis">jmlr.org/papers/v25/24-0220.ht</span><span class="invisible">ml</span></a> <br> <br><a href="https://sigmoid.social/tags/sparse" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>sparse</span></a> <a href="https://sigmoid.social/tags/learns" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>learns</span></a> <a href="https://sigmoid.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a></p>
JMLR<p>'Mentored Learning: Improving Generalization and Convergence of Student Learner', by Xiaofeng Cao, Yaming Guo, Heng Tao Shen, Ivor W. Tsang, James T. Kwok.</p><p><a href="http://jmlr.org/papers/v25/23-1213.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">http://</span><span class="ellipsis">jmlr.org/papers/v25/23-1213.ht</span><span class="invisible">ml</span></a> <br> <br><a href="https://sigmoid.social/tags/learners" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>learners</span></a> <a href="https://sigmoid.social/tags/learner" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>learner</span></a> <a href="https://sigmoid.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a></p>
JMLR<p>'Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK', by Hongru Yang, Ziyu Jiang, Ruizhe Zhang, Yingbin Liang, Zhangyang Wang.</p><p><a href="http://jmlr.org/papers/v25/23-0831.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">http://</span><span class="ellipsis">jmlr.org/papers/v25/23-0831.ht</span><span class="invisible">ml</span></a> <br> <br><a href="https://sigmoid.social/tags/sparse" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>sparse</span></a> <a href="https://sigmoid.social/tags/gradient" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>gradient</span></a> <a href="https://sigmoid.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a></p>
Jan Vlug<p><span class="h-card" translate="no"><a href="https://mastodon.social/@schizanon" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>schizanon</span></a></span> <span class="h-card" translate="no"><a href="https://en.osm.town/@strebski" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>strebski</span></a></span> <span class="h-card" translate="no"><a href="https://chaos.social/@fossdd" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>fossdd</span></a></span> I think <a href="https://mastodon.social/tags/nationalism" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>nationalism</span></a> and <a href="https://mastodon.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> are important factors for war and killing. I try to treat living beings as <a href="https://mastodon.social/tags/individuals" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>individuals</span></a>.</p>
Jim Donegan 🎵 ✅<p><a href="https://mastodon.scot/tags/STARTREK" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>STARTREK</span></a> <a href="https://mastodon.scot/tags/LogicalThinking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LogicalThinking</span></a> #70 - Proof By Example (Inappropriate <a href="https://mastodon.scot/tags/Generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Generalization</span></a>)</p><p><a href="https://www.youtube.com/watch?v=NjntoaujuF0" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">youtube.com/watch?v=NjntoaujuF</span><span class="invisible">0</span></a></p><p><a href="https://mastodon.scot/tags/Trek" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Trek</span></a> <a href="https://mastodon.scot/tags/LogicalThinking" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LogicalThinking</span></a> <a href="https://mastodon.scot/tags/Philosophy" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Philosophy</span></a> <a href="https://mastodon.scot/tags/Spock" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Spock</span></a> <a href="https://mastodon.scot/tags/Enterprise" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Enterprise</span></a> <a href="https://mastodon.scot/tags/TAS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TAS</span></a> <a href="https://mastodon.scot/tags/StarTrekTAS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>StarTrekTAS</span></a> <a href="https://mastodon.scot/tags/TheAnimatedSeries" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TheAnimatedSeries</span></a></p>
Habr<p>Can Transformers "Think"?</p><p>Recent research shows that transformer models can almost flawlessly solve tasks requiring several logical steps: for example, deriving B from statement A and then reasoning on to C. Surprisingly, this is achieved without Chain-of-Thought or special prompts, using only a classic GPT-2. Let's look at how transformers "think" when solving reasoning tasks, and write code for this using the Hugging Face library.</p><p><a href="https://habr.com/ru/articles/840136/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">habr.com/ru/articles/840136/</span><span class="invisible"></span></a></p><p><a href="https://zhub.link/tags/GPT" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPT</span></a> <a href="https://zhub.link/tags/%D0%B3%D1%80%D0%BE%D0%BA%D0%B8%D0%BD%D0%B3" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>grokking</span></a> <a href="https://zhub.link/tags/%D0%BF%D0%B0%D0%BC%D1%8F%D1%82%D1%8C_%D0%98%D0%98" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI_memory</span></a> <a href="https://zhub.link/tags/%D0%B7%D0%B0%D0%B4%D0%B0%D1%87%D0%B8_%D1%80%D0%B0%D1%81%D1%81%D1%83%D0%B6%D0%B4%D0%B5%D0%BD%D0%B8%D1%8F" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>reasoning_tasks</span></a> <a href="https://zhub.link/tags/%D0%BE%D0%B1%D1%89%D0%B8%D0%B9_%D0%B8%D1%81%D0%BA%D1%83%D1%81%D1%81%D1%82%D0%B2%D0%B5%D0%BD%D0%BD%D1%8B%D0%B9_%D0%B8%D0%BD%D1%82%D0%B5%D0%BB%D0%BB%D0%B5%D0%BA%D1%82" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>artificial_general_intelligence</span></a> <a href="https://zhub.link/tags/%D0%BE%D0%B1%D0%BE%D0%B1%D1%89%D0%B5%D0%BD%D0%B8%D0%B5" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> <a href="https://zhub.link/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a> <a href="https://zhub.link/tags/%D1%82%D1%80%D0%B0%D0%BD%D1%81%D1%84%D0%BE%D1%80%D0%BC%D0%B0%D1%82%D0%BE%D1%80" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>transformer</span></a> <a href="https://zhub.link/tags/%D0%BF%D0%B0%D0%BC%D1%8F%D1%82%D1%8C_%D1%82%D1%80%D0%B0%D0%BD%D1%81%D1%84%D0%BE%D1%80%D0%BC%D0%B5%D1%80%D0%BE%D0%B2" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>transformer_memory</span></a></p>
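The two-hop task format the Habr post describes (derive B from statement A, then reach C) can be sketched in plain Python. The entities and relations below are made up for illustration; the article itself trains GPT-2 via Hugging Face to answer such composed queries implicitly, rather than looking them up.

```python
# Toy knowledge base of atomic facts: (entity, relation) -> entity.
# All names here are hypothetical illustration, not from the article.
facts = {
    ("Alice", "mother"): "Beth",
    ("Beth", "employer"): "AcmeCo",
    ("Alice", "employer"): "Initech",
}

def two_hop(entity, rel1, rel2, kb):
    """Answer 'rel2 of (rel1 of entity)': A -> B -> C."""
    mid = kb.get((entity, rel1))                  # step 1: A -> B
    return kb.get((mid, rel2)) if mid else None   # step 2: B -> C
```

For example, `two_hop("Alice", "mother", "employer", facts)` composes two atomic facts to reach "AcmeCo"; a grokked transformer is interesting precisely because it learns to perform this composition internally, in one forward pass.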
Matthias Nau<p>#8<br>The benefits of <a href="https://neuromatch.social/tags/Multitask" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Multitask</span></a> studies are huge!</p><p>Most importantly, they allow testing the prevalent assumption of <a href="https://neuromatch.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a>, yielding results with a high chance of generalizing beyond the lab. What's more, they even enable the discovery of *new concepts*!</p>
JMLR<p>'Three-Way Trade-Off in Multi-Objective Learning: Optimization, Generalization and Conflict-Avoidance', by Lisha Chen, Heshan Fernando, Yiming Ying, Tianyi Chen.</p><p><a href="http://jmlr.org/papers/v25/23-1287.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">http://</span><span class="ellipsis">jmlr.org/papers/v25/23-1287.ht</span><span class="invisible">ml</span></a> <br> <br><a href="https://sigmoid.social/tags/objectives" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>objectives</span></a> <a href="https://sigmoid.social/tags/objective" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>objective</span></a> <a href="https://sigmoid.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a></p>
Ralph Straumann (@rastrau)<p><span class="h-card" translate="no"><a href="https://urbanists.social/@markstos" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>markstos</span></a></span> Impressive work. Connectivity, to me, implies network / topological metrics. I’ve experimented a bit with betweenness centrality (<a href="https://en.wikipedia.org/wiki/Betweenness_centrality" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">en.wikipedia.org/wiki/Betweenn</span><span class="invisible">ess_centrality</span></a>) in Python and found it promising (also, e.g., for <a href="https://swiss.social/tags/network" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>network</span></a> <a href="https://swiss.social/tags/generalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>generalization</span></a>). However, it’s computationally expensive. <a href="https://swiss.social/tags/gis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>gis</span></a></p>
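For readers curious about the metric mentioned above, here is a minimal stdlib-only sketch of betweenness centrality on a small unweighted graph (the toy graph is made up). In practice one would use networkx's `betweenness_centrality`, which implements the much faster Brandes algorithm; the brute-force version below makes the definition explicit and also shows why the computation gets expensive on large networks.

```python
from collections import deque
from itertools import combinations

def shortest_paths(adj, s, t):
    """Enumerate all shortest paths from s to t (unweighted graph)."""
    dist = {s: 0}
    q = deque([s])
    while q:                          # BFS for distances from s
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    if t not in dist:
        return []
    paths = []
    def walk(node, tail):             # walk back from t along dist-1 edges
        if node == s:
            paths.append([s] + tail)
            return
        for w in adj[node]:
            if dist.get(w) == dist[node] - 1:
                walk(w, [node] + tail)
    walk(t, [])
    return paths

def betweenness(adj):
    """For each node, sum over node pairs of the fraction of
    pairwise shortest paths that pass through it."""
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        paths = shortest_paths(adj, s, t)
        if not paths:
            continue
        for v in adj:
            if v in (s, t):
                continue
            through = sum(1 for p in paths if v in p)
            score[v] += through / len(paths)
    return score

# Toy path graph A-B-C-D: the interior nodes carry all the traffic.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
```

On this path graph, B and C each lie on the unique shortest path of two node pairs (score 2.0), while the endpoints score 0.0. The all-pairs enumeration is what makes this expensive, as the post notes.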