MachineLearning

[D] Moral Uncertainty Around Emerging AI Introspection

Relevant paper to read first: https://transformer-circuits.pub/2025/introspection/index.html On the Moral Uncertainty Emerging Around AI Introspection In...

Nov 27, 2025

MachineLearning

[D][P] PKBoost v2 is out! An entropy-guided boosting library with a focus on drift adaptation and multiclass/regression support.

Hey everyone in the ML community, I wanted to start by saying a huge thank...

Nov 27, 2025

MachineLearning

[D] What would change in your ML workflow if Jupyter or VS Code opened in seconds on a cloud-hosted OS?

Imagine your ML development environment running inside a web platform where each tool such as...

Nov 27, 2025

MachineLearning

[R] GRAM: General-purpose Real-world Audio Model to efficiently learn spatial audio representations.

Hey all, I am excited to share our new pre-print with you. GRAM: a General-purpose...

Nov 27, 2025

MachineLearning

[R] WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms

Hey All, We have just released our new pre-print on WavJEPA. WavJEPA is an...

Nov 27, 2025

MachineLearning

[D] Random occasional spikes in validation loss

Hello everyone, I am training a captcha recognition model using CRNN. The problem now...

Nov 27, 2025

MachineLearning

[D] Information geometry, anyone?

The last few months I've been doing a deep-dive into information geometry and I've really,...

Nov 27, 2025

MachineLearning

[P] ElikaAI AI Trainer — Open-Source Sandbox for Teaching Transferable Skills (Apache 2.0)

ElikaAI AI Trainer v2.0 — Open-Source Sandbox for Teaching Transferable Skills (Apache 2.0) I’ve...

Nov 27, 2025

MachineLearning

[D] Let’s discuss World Models

Hey everyone, I've been reading about "World Models" for a while now and wanted to...

Nov 27, 2025

MachineLearning

[R] Generative Flows on Weight Space for Covariate Shift Detection (AAAI 2026 Workshop)

Abstract: Flow-based generative modeling provides a powerful framework for reasoning about uncertainty in weight space....

Nov 27, 2025

MachineLearning

[D] I managed to fine-tune Qwen2.5-Omni-3B while keeping multimodal abilities — is it actually as hard as it felt?

Hey everyone, I'm working on a personal project (AI for agriculture) and I just spent...

Nov 27, 2025

MachineLearning

[D] An alternative to Nested Cross Validation and independent test set doubts

I have a small tabular dataset with ~ 300 elements. I have to build a...

Nov 27, 2025

MachineLearning

[P] A “foveated” memory layer for LLM agents: +46.7pp accuracy at 256-token context (open-source)

Hi all! I’ve been experimenting with long-term memory for LLM agents under small context budgets,...

Nov 27, 2025

MachineLearning

[D] Exploring a High-Accountability Peer Collaboration Model for Intermediate ML Engineers/Researchers

Hi everyone, I’m exploring the idea of creating a small, high-signal peer collaboration model for...

Nov 27, 2025

MachineLearning

[P] Human Action Classification: Reproducible baselines for UCF-101 (87%) and Stanford40 (88.5%) with training code + pretrained models

Human Action Classification: Reproducible Research Baselines Hey r/MachineLearning! I built reproducible baselines for human action...

Nov 27, 2025

MachineLearning

[P] mamba2-jax is here! Pure JAX/Flax implementation of Mamba2 (≈2× faster CPU inference vs PyTorch on my micro-benchmark)

Hey guys! I’ve open-sourced mamba2-jax, an experimental but stable JAX/Flax implementation of Mamba2 (“Transformers are...

Nov 27, 2025

MachineLearning

[R] Inference-time attractor layer for transformers: preliminary observations

We tested a small “attractor” layer that updates during inference (no training/backprop). It preserved perplexity...

Nov 27, 2025

MachineLearning

[D] I built a reasoning pipeline that boosts 8B models using structured routing + verification

This is a project I’ve been working on quietly for a while, and I finally...

Nov 27, 2025

MachineLearning

[P] Knowledge Distillation: 97% Cost Reduction Distilling Claude Sonnet 4 → GPT-4.1-nano (98% Fidelity Retained)

TL;DR: Fine-tuned GPT-4.1-nano achieved 98% of Claude Sonnet 4's quality (0.784 vs 0.795) on structured...

Nov 27, 2025
