Axel Pichler<p>Just compared Claude 3.5 Sonnet with OpenAI's o1 on a CLS task: classifying text passages from US short stories by focalization. It turns out that Sonnet doesn't recognize zero focalization and achieved an F1 score of 0.47, while o1 performed better with 0.69. Not bad, but problematic, since the hidden tokens from o1's optimizer (?) are precisely what would be of particular interest.</p><p><a href="https://fedihum.org/tags/CLS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CLS</span></a> <a href="https://fedihum.org/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://fedihum.org/tags/ClaudeSonnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ClaudeSonnet</span></a> <a href="https://fedihum.org/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a>'s_o1 <a href="https://fedihum.org/tags/TextClassification" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TextClassification</span></a> <a href="https://fedihum.org/tags/Focalization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Focalization</span></a></p>