[P] Fast and Simple Solution to Kaggle’s `Jigsaw – Agile Community Rules Classification`
Fast and Simple: Ranker fine-tuning + Embeddings + Classifier Orders of Magnitude Faster and Less...
I’ve been exploring architectures that make agent systems reproducible, debuggable, and deterministic. Most current agent...
Happy to announce an exciting new project from the lab: “Adopting a human developmental visual...
TL;DR: I’m experimenting with an orchestration layer that tracks a synthetic "somatic" state (dopamine and...
TL;DR: A deep dive into foundational to advanced topics like Python, statistics, neural networks, and...
Try our interactive maze-solving demo: https://pub.sakana.ai/ctm/ Continuous Thought Machines arXiv: https://arxiv.org/abs/2505.05522 Interactive Website: https://pub.sakana.ai/ctm/ Blog...
Hello. I am essentially a complete layman in terms of machine learning. However, it is...
I made a post a week ago, requesting advice regarding my paper, which was allegedly...
I’ve been running a set of continual learning experiments across 12 multimodal tasks (vision, speech, and...
Note: this is adapted from a piece I first posted on my personal site; link...
I trained a 7B model to learn a niche language, reaching 86% code accuracy. Hi...
Hi everyone, I'm a senior CS undergrad researching the infrastructure required for the next generation...
Lately I've been really worried about a trend in the ML community: the overwhelming dominance...
Inspired by an earlier post that called out an Apple ICLR paper for having an...
Hello, I'm working on Continued Pre-Training (CPT) for a Gemma 4B/12B model on a social...
Hello, I am new to this community. I am an ML researcher and a computer...
I've been experimenting with outcome-based learning for AI agent memory and got some interesting results,...
Hi all, some of you might know that there is a relatively niche and emerging...
ln(x + sqrt(x^2 + 1)) strikes me as a pretty good non-linearity activation. Unbounded, odd-function, logarithmic...
Hi everyone, I need advice on which direction to explore. I have a large table...
So here’s what happened. Earlier this month, a colleague shared an Apple paper on arXiv...
The OpenReview identity leak has created a difficult situation not only for authors, but also...
Unlike current AI systems, brains can quickly and flexibly adapt to changing environments. This is...
You are receiving this email as an author of a submitted paper to ICLR 2026....
Hello all. I am doing a PhD in Computer Science at a mid-tier university...
Hello, I'm considering doing a PhD in computer vision. I have a somewhat unconventional situation...
TL;DR: Through an ablation study, it is demonstrated that current activation functions result in discrete representations,...
Hey everyone, Like many of you, I've been wrestling with the cost of using different...
Hi everyone, I am here to find a new contributor for our team's project, pruning...
Hi there! I'd like to share a project I've been working on over the last...