Also: ChatGPT reaches 800M weekly active users
😎 News From The Web
- Introducing apps in ChatGPT and the new Apps SDK. OpenAI introduces a new generation of chat-compatible apps within ChatGPT, accessible outside the EU and UK, powered by the new Apps SDK built on the Model Context Protocol. Collaborating with partners like Spotify and Zillow, OpenAI allows developers to reach 800 million users, offering enhanced interaction through conversational interfaces.
- Introducing AgentKit. OpenAI launched AgentKit, offering tools for building and optimizing agents efficiently. Features include Agent Builder for visual workflows, Connector Registry for data management, and ChatKit for embedding chat UIs. AgentKit aims to streamline agent creation and deployment, providing developers with comprehensive resources for effective agent development.
- Sora hit 1M downloads, faster than ChatGPT. OpenAI’s Sora app reached 1 million downloads in under five days, surpassing ChatGPT’s initial performance despite being invite-only. Launching on September 30, 2025, in the U.S. and Canada, Sora achieved 627,000 downloads in its first week. With a high of 107,800 daily iOS downloads, Sora’s rapid adoption includes innovative deepfake generation capabilities.
- ChatGPT reaches 800m weekly active users. OpenAI CEO Sam Altman announced at Dev Day that ChatGPT now boasts 800 million weekly active users and 4 million developers. The event also introduced new tools for building interactive, personalized applications. OpenAI’s rapid growth is evident, with the company valued at $500 billion after a private stock sale, reinforcing its position as a leading consumer AI product.
- Introducing Plan Mode. Cursor’s new Plan Mode enhances code creation by facilitating plan development, codebase research, and agent execution. Users activate it with Shift + Tab, provide requirements, and review or edit detailed plans directly. Cursor generates a Markdown file with file paths and code references. This mode supports complex task descriptions and suggests plans automatically.
- OpenAI and AMD announce multibillion-dollar partnership — AMD to supply 6 gigawatts in chips. OpenAI and AMD have announced a multibillion-dollar partnership for AI data centers using AMD processors. OpenAI plans to purchase 6 gigawatts of AMD chips, leading to revenue in tens of billions for AMD. OpenAI will gain warrants for up to 10% of AMD shares, contingent on meeting deployment milestones, with the first MI450 chip deployment in 2026.
- OpenAI and Jony Ive may be struggling to figure out their AI device. OpenAI and Jony Ive, following a $6.5 billion acquisition of the device startup io, aim to develop a screen-less, AI-powered device by 2026. Technical challenges include defining the device’s “personality” and managing privacy. The device intends to remain “always on,” capturing audio and visual cues but ensuring it engages only when beneficial.
📚 Guides From The Web
- State of LLMs in Late 2025. By October 2025, AI models specialize in distinct tasks rather than adopting a “one-size-fits-all” approach. Advances include GPT-5’s task-based router system, Claude Sonnet 4.5’s extended coding focus, and Llama 4’s open-source multimodal capabilities. These innovations prioritize efficient task alignment, emphasizing specialized functionalities, real-time processing, and cost-efficient solutions, underscoring the importance of selecting the right model for each job.
- A small number of samples can poison LLMs of any size. Researchers from the UK AI Security Institute, the Alan Turing Institute, and Anthropic found that injecting just 250 malicious documents can effectively create vulnerabilities in large language models, regardless of size. This discovery challenges assumptions about data poisoning vulnerabilities and highlights the need for defenses against such attacks, suggesting that a fixed number of poisoned samples could suffice to backdoor models.
- Visualizing How VLMs Work. Visual Language Models (VLMs) like SmolVLM process text and images by converting them into high-dimensional embeddings. They integrate visual and textual data, replacing text placeholders with image tokens, and use a decoder to generate context-aware outputs. This architecture enables multimodal reasoning, allowing handling of various input combinations for versatile applications.
- GPT-5-Codex is a better AI researcher than me. GPT-5-Codex surpassed personal AI research efforts by automating idea generation and experiment procedures. Codex’s integration with transformer and n-gram models improved outputs, reducing model perplexity and enhancing storytelling coherence. The research showcased how leveraging Codex enabled searching for higher-quality AI models rapidly, highlighting its effectiveness over manual processes.
🔬 Interesting Papers and Repositories
- Agent Learning via Early Experience. Researchers introduce a middle-ground paradigm called early experience for language agents, using agent-generated interaction data to improve learning. They study implicit world modeling and self-reflection, enhancing effectiveness and out-of-domain generalization across diverse environments. Early experience shows promise for bridging imitation learning and experience-driven agents, offering a foundation for subsequent reinforcement learning in environments with verifiable rewards.
- Doriandarko/sora-mcp. The sora-mcp repository integrates a Model Context Protocol server with OpenAI’s Sora 2 API for video generation and remixing. The project features video creation from text prompts, status monitoring, and remix variations.
- Less is More: Recursive Reasoning with Tiny Networks. The authors introduce the Tiny Recursive Model (TRM), which uses a single 7M parameter, 2-layer network to outperform many Large Language Models in puzzle tasks like ARC-AGI. TRM achieves 45% test-accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, showcasing its potential for effective recursive reasoning with minimal computational resources.
- MaximeRivest/maivi. Maivi transforms human speech into text using NVIDIA’s Parakeet model. Users press Alt+Q to record, immediately viewing text almost instantly while it copies to the clipboard. The application features real-time transcription, overlay display, and smart merging of audio chunks.
- Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation. The Puffin model unifies camera-centric understanding and generation, integrating language regression and diffusion-based generation for spatial intelligence. Puffin was trained on a 4-million triplet dataset, aligning visual cues with photographic terminology. It outperforms specialized models in cross-view tasks, such as spatial imagination, and plans to release code, models, and the dataset pipeline for research advancement.
✨ Extra Content
- Want to see this newsletter and more AI content from me in your Medium feed? Follow me and/or clap to this story!
- Want to receive this newsletter via email? It’s free, you can subscribe to receive my articles via email. I’ll only send this newsletter.
Thank you for reading!
Learn more Sora hit 1M downloads, faster than ChatGPT — Weekly AI Newsletter (October 13th 2025)