AI Daily Briefing

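A timeline of AI products, models, open-source tools, and official announcements. Past entries are retained and can be filtered by category, date, and tag.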

1335 archived briefings
80 open-source tools
30 current results
June 24 · Yesterday's briefing
MarkTechPost · Official news

MarkTechPost: DFlash Speculative Decoding Drafts Whole Token Blocks in Parallel for Up to 15x Higher Throu…

Source summary: UC San Diego's DFlash replaces autoregressive drafting with a lightweight block diffusion model for speculative decoding. It drafts whole token blocks in a single forward pass and… Source: MarkTechPost. Read the original article and verify which tool entry points, costs, risks, and real-world use cases it affects.

June 23 · 2026-06-23 briefing
MarkTechPost · Official news

MarkTechPost: How to Use NVIDIA Canary-1B-v2 for ASR, Translation, and Automatic SRT Subtitle Export in Py…

Source summary: In this tutorial, we build a multilingual ASR and speech translation pipeline with NVIDIA Canary-1B-v2. We load the model on a GPU-enabled runtime, prepare audio into 16 kHz mono,… Source: MarkTechPost.

June 19 · 2026-06-19 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA AI Introduce SpatialClaw: A Training-Free Agent That Treats Code as the Action Interf…

Source summary: SpatialClaw is a training-free agent that writes Python in a persistent kernel, composing perception tools for 3D spatial reasoning. Source: MarkTechPost.

June 18 · 2026-06-18 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA SkillSpector Guide: Scanning AI Skills for Security Risks with Static Analysis and SA…

Source summary: In this tutorial, we use NVIDIA SkillSpector to evaluate AI skills for security risks before deployment. We build a corpus of benign and deliberately vulnerable skills, then scan t… Source: MarkTechPost.

June 17 · 2026-06-17 briefing
MarkTechPost · Official news

MarkTechPost: How to Build Memory-Efficient Transformers with xFormers Using Packed Sequences, GQA, ALiBi,…

Source summary: We implement xFormers, a practical toolkit for fast, memory-efficient Transformer models on GPUs. We validate memory-efficient attention against a standard implementation, then com… Source: MarkTechPost.

June 15 · 2026-06-15 briefing
MarkTechPost · Official news

MarkTechPost: Meet Flash-KMeans: An IO-Aware, Exact K-Means That Runs Over 200× Faster Than FAISS on GPUs

Source summary: Flash-KMeans is an open-source, IO-aware implementation of standard Lloyd's k-means in Triton GPU kernels. It does not change the math or approximate. FlashAssign removes distance-… Source: MarkTechPost.

June 10 · 2026-06-10 briefing
MarkTechPost · Official news

MarkTechPost: Google AI Releases DiffusionGemma, a 26B MoE Open Model Using Text Diffusion for Up to 4x Fa…

Source summary: DiffusionGemma is Google DeepMind's experimental 26B open model using text diffusion for up to 4x faster generation on GPUs. Source: MarkTechPost.

MarkTechPost · Official news

MarkTechPost: Building a Code Dataset Pipeline from NVIDIA Nemotron-Pretraining-Code-v3 Metadata with Stre…

Source summary: In this tutorial, we work with NVIDIA's Nemotron-Pretraining-Code-v3 dataset as a large-scale metadata index for code pretraining research. We stream the dataset instead of downloa… Source: MarkTechPost.

June 9 · 2026-06-09 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA cuTile Python Tutorial: Building Tiled GPU Kernels for Vector Addition, Matrix Additi…

Source summary: In this tutorial, we implement a hands-on workflow for NVIDIA cuTile Python, a tile-based GPU programming interface for CUDA-style kernels in Python. We prepare a Colab-friendly en… Source: MarkTechPost.

June 8 · 2026-06-08 briefing
MarkTechPost · Official news

MarkTechPost: Xiaomi MiMo and TileRT Push a 1-Trillion-Parameter Model Past 1000 Tokens Per Second on Comm…

Source summary: Xiaomi's MiMo team, with TileRT, released MiMo-V2.5-Pro-UltraSpeed, a serving mode for the MiMo-V2.5-Pro model. It decodes over 1000 tokens per second on a 1-trillion-parameter mod… Source: MarkTechPost.

June 6 · 2026-06-06 briefing
MarkTechPost · Official news

MarkTechPost: Google’s New Colab CLI Lets Developers and AI Agents Run Python on Remote Colab GPUs and TPU…

Source summary: Google released the Colab CLI, letting developers and AI agents run local code on remote Colab GPU and TPU runtimes. Source: MarkTechPost.

MarkTechPost · Official news

MarkTechPost: NVIDIA Releases Nemotron 3.5 ASR: A 600M-Parameter Cache-Aware Streaming Model Transcribing …

Source summary: NVIDIA released Nemotron 3.5 ASR, a cache-aware 600M streaming model transcribing 40 language-locales in real time from one checkpoint. Source: MarkTechPost.

June 5 · 2026-06-05 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA AI Releases Dynamo Snapshot: A CRIU-Based Fast Startup System for AI Inference on Kub…

Source summary: NVIDIA Dynamo Snapshot checkpoints and restores vLLM inference workers on Kubernetes using CRIU and cuda-checkpoint tools. Source: MarkTechPost.

June 4 · 2026-06-04 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA AI Releases Nemotron 3 Ultra: An Open 550B Mixture-of-Experts Hybrid Mamba-Transforme…

Source summary: NVIDIA has released Nemotron 3 Ultra, a 550B total (55B active) open Mixture-of-Experts hybrid Mamba-Transformer for long-running agents. It pairs a 1M-token context with up to ~6x… Source: MarkTechPost.

June 3 · 2026-06-03 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA Releases Cosmos 3: A Two-Tower Mixture-of-Transformers Foundation Model Unifying Phys…

Source summary: NVIDIA released Cosmos 3, open omnimodal world models pairing an autoregressive VLM reasoner with a diffusion generator for physical AI. Source: MarkTechPost.

June 2 · 2026-06-02 briefing
MarkTechPost · Official news

MarkTechPost: How to Speed Up Transformer Training Using NVIDIA Apex (FusedAdam, FusedLayerNorm) and Nativ…

Source summary: We build NVIDIA Apex from source, detect fused kernels, and benchmark FusedAdam, FusedLayerNorm, and torch.amp in Transformer training. Source: MarkTechPost.

May 29 · 2026-05-29 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA Introduces X-Token: Projection-Guided Cross-Tokenizer KD That Outperforms GOLD by +3.…

Source summary: NVIDIA's X-Token fixes two structural failures in GOLD and improves GSM8k accuracy from 2.56 to 15.54. Source: MarkTechPost.

MarkTechPost · Official news

MarkTechPost: Hexo Labs Open-Sources SIA: A Self-Improving Agent That Updates Both the Harness and the Mod…

Source summary: Hexo Labs released SIA, an open-source self-improving loop, under an MIT license. A Feedback-Agent reads each run's trajectory, then either rewrites the scaffold or triggers a LoRA… Source: MarkTechPost.

May 27 · 2026-05-27 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA Releases Polar, a Token-Faithful Rollout Framework for GRPO Training Across Codex, Cl…

Source summary: NVIDIA researchers have introduced Polar, a rollout framework that trains language agents using reinforcement learning without modifying their agent harnesses. Polar places a model… Source: MarkTechPost.

May 26 · 2026-05-26 briefing
MarkTechPost · Official news

MarkTechPost: Stability AI Releases Stable Audio 3: A Family of Fast Latent Diffusion Models for Audio Gen…

Source summary: Stability AI has released Stable Audio 3, a family of latent diffusion models for instrumental music and sound effects generation. The release includes open weights for the small a… Source: MarkTechPost.

May 24 · 2026-05-24 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA AI Releases Gated DeltaNet-2: A Linear Attention Layer That Decouples Erase and Write…

Source summary: Linear attention squeezes the unbounded KV cache into a fixed-size recurrent state, but editing that memory without scrambling existing associations is hard. Prior delta-rule model… Source: MarkTechPost.

May 21 · 2026-05-21 briefing
MarkTechPost · Official news

MarkTechPost: Cohere Releases Command A+: A 218B Sparse MoE Model for Agentic Workflows That Runs on as Fe…

Source summary: Cohere releases Command A+, an open-source 218B Sparse Mixture-of-Experts model consolidating four prior Command A variants into one. It runs on as few as two H100 GPUs at W4A4 qua… Source: MarkTechPost.

May 20 · 2026-05-20 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA AI Releases Nemotron-Labs-Diffusion: A Tri-Mode Language Model with 6× Tokens Per For…

Source summary: NVIDIA researchers have released Nemotron-Labs-Diffusion, a language model family that unifies three decoding modes in one architecture. The model supports autoregressive (AR) deco… Source: MarkTechPost.

May 18 · 2026-05-18 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA Introduces a 4-Bit Pretraining Methodology Using NVFP4, Validated on a 12B Hybrid Mam…

Source summary: NVIDIA introduces a 4-bit pretraining methodology built around the NVFP4 microscaling format, combining selective BF16 layers, 16×16 Random Hadamard Transforms on Wgrad inputs, 2D… Source: MarkTechPost.

May 16 · 2026-05-16 briefing
May 15 · 2026-05-15 briefing
MarkTechPost · Official news

MarkTechPost: Zyphra Releases ZAYA1-8B-Diffusion-Preview: The First MoE Diffusion Model Converted From an …

Source summary: Zyphra's latest release shows that an autoregressive MoE model can be converted into a discrete diffusion model with no systematic loss in evaluation performance. ZAYA1-8B-Diffusio… Source: MarkTechPost.

May 14 · 2026-05-14 briefing
MarkTechPost · Official news

MarkTechPost: A Coding Implementation to Master GPU Computing with CuPy, Custom CUDA Kernels, Streams, Spa…

Source summary: In this tutorial, we delve into CuPy as a powerful GPU-accelerated alternative to NumPy for high-performance numerical computing in Python. We start by inspecting the available CUD… Source: MarkTechPost.

May 11 · 2026-05-11 briefing
MarkTechPost · Official news

MarkTechPost: Sakana AI and NVIDIA Introduce TwELL with CUDA Kernels for 20.5% Inference and 21.9% Trainin…

Source summary: Sakana AI and NVIDIA researchers demonstrate that simple L1 regularization can induce over 99% sparsity in feedforward layers with negligible downstream performance impact, and tra… Source: MarkTechPost.

May 10 · 2026-05-10 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA AI Just Released cuda-oxide: An Experimental Rust-to-CUDA Compiler Backend that Compi…

Source summary: NVlabs releases cuda-oxide v0.1.0, a custom rustc codegen backend that compiles #[kernel]-annotated Rust functions to PTX through a Rust → Stable MIR → Pliron IR → LLVM IR → PTX pi… Source: MarkTechPost.

May 9 · 2026-05-09 briefing
MarkTechPost · Official news

MarkTechPost: NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Mo…

Source summary: NVIDIA researchers have introduced Star Elastic, a post-training method that embeds multiple nested reasoning models (at 30B, 23B, and 12B parameter scales) inside a single check… Source: MarkTechPost.