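AI Daily Briefing

A timeline of AI products, models, open-source tools, and official updates. Historical entries are retained and can be filtered by category, date, and tag.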

1335 archived briefings
80 open-source tools
80 current results
June 25: Today's briefing
The Decoder official news

The Decoder: Google bakes computer control directly into Gemini 3.5 Flash, letting the model see and oper…

Original summary: Google has integrated "Computer Use" directly into Gemini 3.5 Flash, letting the model operate computers, browsers, and mobile devices on its own. On the OSWorld benchmark… Source: The Decoder. We recommend reading the original article and checking the tool entry points, costs, risks, and real-world use cases it affects.

June 24: Yesterday's briefing

InfoQ AI ML Data Engineering: AI Is Moving up the Software Lifecycle: From Code Review to PRD Governance

Original summary: Technology companies are extending AI beyond code generation into earlier stages of the software lifecycle, including PRD validation, design inputs, and code review. Initiatives fr… Source: InfoQ AI ML Data Engineering.

June 23: 2026-06-23 briefing
MarkTechPost official news

MarkTechPost: Datalab Releases lift: A 9B Open-Weights Vision Model That Extracts Structured JSON From PDF…

Original summary: Datalab released lift, a 9B open-weights vision model that turns PDFs and images into schema-matching JSON. It uses schema-constrained decoding for valid structure and trained abst… Source: MarkTechPost.

MarkTechPost official news

MarkTechPost: How to Use NVIDIA Canary-1B-v2 for ASR, Translation, and Automatic SRT Subtitle Export in Py…

Original summary: In this tutorial, we build a multilingual ASR and speech translation pipeline with NVIDIA Canary-1B-v2. We load the model on a GPU-enabled runtime, prepare audio into 16 kHz mono, … Source: MarkTechPost.

The Decoder official news

The Decoder: OpenAI says new GPT-5.5-Cyber outperforms Anthropic's Mythos on cybersecurity benchmarks

Original summary: OpenAI is expanding its Daybreak cybersecurity initiative with an updated Codex Security plugin, the full GPT-5.5-Cyber model, and a partner network with more than 25 secu… Source: The Decoder.

June 22: 2026-06-22 briefing
MarkTechPost official news

MarkTechPost: Sakana AI Launches Sakana Fugu: An Orchestration Model That Routes Tasks Across a Swappable …

Original summary: Fugu and Fugu Ultra route tasks across a swappable model pool, leading most coding, reasoning, and agentic benchmarks. Source: MarkTechPost.

AWS Machine Learning updates: Embed the world: Multimodal AI for searchable aerial imagery at scale

Original summary: In this post, we walk through the problem space, our architecture on Amazon Bedrock and Amazon OpenSearch Serverless, the evaluation methodology we built on OpenStreetMap ground tr… Source: AWS Machine Learning updates.

NVIDIA Developer updates: Inside NVIDIA Halos for Robotics: A Full-Stack Functional Safety System for Physical AI

Original summary: Physical AI (robots working autonomously alongside people in factories, warehouses, hospitals, and homes) is arriving faster than most expected. Traditional... Source: NVIDIA Developer updates.

The Decoder official news

The Decoder: Sakana AI's Fugu orchestrates multiple LLMs to match Anthropic's Fable and Mythos benchmarks

Original summary: Japanese AI startup Sakana AI is launching Fugu, a system that coordinates multiple AI models on the fly to compete with leaders like Anthropic's Fable 5. The approach als… Source: The Decoder.

June 21: 2026-06-21 briefing
June 20: 2026-06-20 briefing
MarkTechPost official news

MarkTechPost: Cisco AI Introduces FAPO: Pipeline-Aware Prompt Optimization With Step-Level Failure Attribu…

Original summary: Cisco Foundation AI has open-sourced FAPO (Fully Automated Prompt Optimization), a Claude Code-driven system that autonomously optimizes multi-step LLM pipelines from baseline prom… Source: MarkTechPost.

June 19: 2026-06-19 briefing
MarkTechPost official news

MarkTechPost: VibeThinker-3B: A 3B Dense Reasoning Model Built on Qwen2.5-Coder-3B With the Spectrum-to-Si…

Original summary: VibeThinker-3B, a 3B MIT-licensed reasoning model matching DeepSeek V3.2 and Kimi K2.5 on verifiable benchmarks. Source: MarkTechPost.

The Decoder official news

The Decoder: OpenAI researchers show small doses of "beneficial trait" training make AI models broadly sa…

Original summary: OpenAI researchers show that reinforcement learning on desired behavioral traits like truthfulness and corrigibility works across domains. Training on health data also imp… Source: The Decoder.

June 18: 2026-06-18 briefing
MarkTechPost official news

MarkTechPost: OpenAI Releases LifeSciBench, a 750-Task Benchmark Grading AI Models on Real Life-Science Re…

Original summary: OpenAI's LifeSciBench evaluates whether frontier AI can handle real life-science research across 750 expert-authored tasks, seven workflows, and seven biological domains. Built by… Source: MarkTechPost.

MarkTechPost official news

MarkTechPost: NVIDIA SkillSpector Guide: Scanning AI Skills for Security Risks with Static Analysis and SA…

Original summary: In this tutorial, we use NVIDIA SkillSpector to evaluate AI skills for security risks before deployment. We build a corpus of benign and deliberately vulnerable skills, then scan t… Source: MarkTechPost.

June 17: 2026-06-17 briefing
MarkTechPost official news

MarkTechPost: Vercel Releases Eve: An Open-Source AI Agent Framework Where Each Agent is a Directory of Fi…

Original summary: Vercel has open-sourced eve, an Apache-2.0 agent framework now in public preview. An agent is a directory of files, with durable execution, sandboxes, approvals, connections, chann… Source: MarkTechPost.

MarkTechPost official news

MarkTechPost: MiniMax Sparse Attention (MSA): a Two-Branch Block-Sparse Attention Trained on a 109B-Parame…

Original summary: MiniMax released MSA, a sparse attention built on Grouped Query Attention. A lightweight Index Branch selects Top-k key-value blocks per query and GQA group; the Main Branch attend… Source: MarkTechPost.

June 16: 2026-06-16 briefing

AWS Machine Learning updates: Safeguard your agentic AI applications with the Amazon Bedrock Guardrails InvokeGuardrailChe…

Original summary: Today, we’re announcing a new API with Amazon Bedrock Guardrails. With this API, you can apply individual safeguards, also referred to as safety checks, at any point in your agenti… Source: AWS Machine Learning updates.

NVIDIA Developer updates: NVIDIA Blackwell Tops MLPerf Training 6.0 with Industry-Leading Scale and Performance

Original summary: NVIDIA delivered a clean sweep in MLPerf Training v6.0, the latest edition of industry-standard AI training benchmarks developed by the MLCommons consortium.... Source: NVIDIA Developer updates.

June 15: 2026-06-15 briefing

AWS Machine Learning updates: AI Agent Failure Detection and Root Cause Analysis with Strands Evals

Original summary: In this post, we walk you through calling the detector functions to diagnose real agent failures. You learn how to interpret their structured output: categorized failures with conf… Source: AWS Machine Learning updates.

June 14: 2026-06-14 briefing
June 13: 2026-06-13 briefing
The Decoder official news

The Decoder: Moonshot's open model Kimi K2.7 Code undercuts GPT-5.5 and Claude by up to 12x on price per …

Original summary: Moonshot AI has released Kimi K2.7 Code, an open-weights model with one trillion parameters built for programming. It still trails GPT-5.5 and Claude Opus 4.8 in coding be… Source: The Decoder.

June 12: 2026-06-12 briefing

NVIDIA Developer updates: NVIDIA Achieves Leading Agentic Coding Performance on First Agentic AI Benchmark

Original summary: AI agents have fundamentally changed the complexity of inference workloads. Until now, the industry has struggled to define a standard for measuring how... Source: NVIDIA Developer updates.

June 11: 2026-06-11 briefing

AWS Machine Learning updates: Evaluate AI agents systematically with Agent-EvalKit

Original summary: Agent-EvalKit is an open-source toolkit (Apache 2.0) that makes this evaluation infrastructure available by integrating with AI coding assistants, including Claude Code, Kiro CLI, … Source: AWS Machine Learning updates.

MIT Technology Review AI: Google DeepMind is worried about what happens when millions of agents start to interact

Original summary: Google DeepMind is funding research into the potential dangers of millions of different AI agents interacting with each other online. According to Rohin Shah, who directs the compa… Source: MIT Technology Review AI.

June 10: 2026-06-10 briefing

AWS Machine Learning updates: Build an AI-Powered Equipment Repair Assistant Using Amazon Bedrock AgentCore

Original summary: In this post, you build an AI-powered equipment repair assistant using Amazon Bedrock AgentCore that helps farmers and field technicians diagnose equipment problems, identify requi… Source: AWS Machine Learning updates.

The Decoder official news

The Decoder: Germany's National Security Council greenights an AI Safety Institute modeled after the UK's…

Original summary: Germany's National Security Council has decided to establish an AI security institute. The "DE-AISI" will test frontier models from Anthropic or OpenAI for security risks, … Source: The Decoder.

June 9: 2026-06-09 briefing

NVIDIA Developer updates: Evaluate Clinical ASR Models Faster with Agent Skills and NVIDIA Nemotron Speech

Original summary: Training a speech AI model to correctly recognize or synthesize clinical terminology is surprisingly difficult. Drug names like Acetaminophen, Amlodipine,... Source: NVIDIA Developer updates.

MarkTechPost official news

MarkTechPost: NVIDIA cuTile Python Tutorial: Building Tiled GPU Kernels for Vector Addition, Matrix Additi…

Original summary: In this tutorial, we implement a hands-on workflow for NVIDIA cuTile Python, a tile-based GPU programming interface for CUDA-style kernels in Python. We prepare a Colab-friendly en… Source: MarkTechPost.

June 8: 2026-06-08 briefing
The Decoder official news

The Decoder: Microsoft Research's Lens proves detailed captions matter more than raw scale for training e…

Original summary: Microsoft Research presents Lens, a text-to-image model with just 3.8 billion parameters that matches much larger rivals on benchmarks, at a fraction of the training cost. Source: The Decoder.

June 7: 2026-06-07 briefing
MarkTechPost official news

MarkTechPost: Building Reflective Prompt Optimization with GEPA: Multi-Component Prompts, Structured Feedb…

Original summary: In this tutorial, we use GEPA as a reflective prompt-evolution framework to improve how a small language model solves multi-step arithmetic word problems. We start from a weak seed… Source: MarkTechPost.

The Decoder official news

The Decoder: Perplexity's "Search as Code" lets AI models write their own search pipelines instead of cal…

Original summary: Perplexity's new "Search as Code" architecture dumps rigid search APIs and lets AI models write their own search routines in Python. By letting the agent handle its own fi… Source: The Decoder.

MarkTechPost official news

MarkTechPost: Meet Harness-1: A 20B Retrieval Subagent Trained With Reinforcement Learning Inside a Statef…

Original summary: UIUC and Chroma's Harness-1 is a 20B retrieval subagent trained with reinforcement learning inside a stateful search harness. The harness maintains the bookkeeping: candidate pool… Source: MarkTechPost.

MarkTechPost official news

MarkTechPost: NVIDIA garak Tutorial: Build a Complete Defensive LLM Red-Teaming Workflow with Custom Probe…

Original summary: This tutorial walks through NVIDIA garak as an end-to-end framework for defensive LLM red-teaming. It covers setup, plugin discovery, dry runs, real-model scans on a Hugging Face g… Source: MarkTechPost.

June 6: 2026-06-06 briefing
The Decoder official news

The Decoder: Qwen3.7-Plus is Alibaba's bid to turn multimodal AI into a full-blown autonomous agent

Original summary: Alibaba's Qwen team has released Qwen3.7-Plus, a multimodal agent model that combines visual perception, GUI operation, and coding in a single agent loop. In a demo, an ag… Source: The Decoder.

June 5: 2026-06-05 briefing
The Decoder official news

The Decoder: Florida's lawsuit against OpenAI and CEO Altman treats ChatGPT as a defective product and pu…

Original summary: Florida is the first US state to sue OpenAI and CEO Sam Altman personally over risks to minors, missing age checks, and inadequate safety investment. The 83-page complaint… Source: The Decoder.

June 3: 2026-06-03 briefing
The Decoder official news

The Decoder: Google Deepmind's Gemma 4 12B squeezes multimodal AI onto a laptop with just 16 GB of RAM

Original summary: Google Deepmind's Gemma 4 12B is an open-source model that processes text, images, and audio natively and runs on laptops with just 16 GB of RAM. It nearly matches the twi… Source: The Decoder.

The Decoder official news

The Decoder: Trump's new executive order wants AI companies to voluntarily submit models for government s…

Original summary: The White House has issued an executive order requiring agencies like the Pentagon and CISA to strengthen cyber defense with AI tools within 30 days. AI developers can vol… Source: The Decoder.

AWS Machine Learning updates: Improve your agent’s tool-calling accuracy with SFT and DPO on Amazon SageMaker AI

Original summary: In this post, you learn how to use Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) together to improve the tool-calling accuracy of a small language model (SL… Source: AWS Machine Learning updates.

InfoQ AI ML Data Engineering: Presentation: Choosing Your AI Copilot: Maximizing Developer Productivity

Original summary: Sepehr Khosravi discusses the evolution of developer productivity tools. Evaluating the strengths of tools like Cursor and Claude Code, he explains actionable techniques for senior… Source: InfoQ AI ML Data Engineering.

June 2: 2026-06-02 briefing

InfoQ AI ML Data Engineering: Article: Why Vector Search Alone Isn't Enough: Hybrid Retrieval for RAG

Original summary: In this article, author Aaditya Chauhan discusses the limitations of RAG pipelines based purely on vector search and how an internal omni-search application using Reciprocal Rank F… Source: InfoQ AI ML Data Engineering.
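
Editor's sketch: the summary above is cut off at "Reciprocal Rank F". Assuming it refers to Reciprocal Rank Fusion (RRF), a widely used way to merge a keyword ranking with a vector ranking, the idea can be illustrated in a few lines of Python. The function name, the document IDs, and the constant k = 60 are illustrative assumptions and are not taken from the article.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k = 60 is a commonly used default (assumption).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear near the top of both lists (here doc_b and doc_a) rise above documents that only one retriever found, which is the usual motivation for combining the two signals.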

MarkTechPost official news

MarkTechPost: How to Speed Up Transformer Training Using NVIDIA Apex (FusedAdam, FusedLayerNorm) and Nativ…

Original summary: We build NVIDIA Apex from source, detect fused kernels, and benchmark FusedAdam, FusedLayerNorm, and torch.amp in Transformer training. Source: MarkTechPost.

June 1: 2026-06-01 briefing
The Decoder official news

The Decoder: Turing Award winner Richard Sutton says pure generative AI can't do real science

Original summary: Turing Award winner Richard Sutton sees a central weakness in conventional generative AI: it can't evaluate its own results. Without that ability, real scientific discover… Source: The Decoder.

MarkTechPost official news

MarkTechPost: Meet Memory OS: A 6-Layer Open-Source Memory Stack Built on Top of Hermes Agent

Original summary: The open-source project adds local persistent memory to Hermes Agent through six layers, gated retrieval, and a wiki. Source: MarkTechPost.

InfoQ AI ML Data Engineering: BadHost Vulnerability Exposes AI Agents, Evaluators, and LLM Gateways

Original summary: BadHost is a high-severity authentication bypass vulnerability in the widely used Python web framework Starlette, with 325 million weekly downloads. The flaw allows attackers to us… Source: InfoQ AI ML Data Engineering.

May 31: 2026-05-31 briefing
The Decoder official news

The Decoder: AI search agents often confirm what they already know instead of actually researching the we…

Original summary: Leading AI search agents like GPT-5.4 and Kimi K2.6 don't appear to do much actual research on established benchmarks. They mostly just use the web to confirm what they al… Source: The Decoder.

MarkTechPost official news

MarkTechPost: Build Skill-Augmented AI Agents with SkillNet for Search, Evaluation, Graph Analysis, and Ta…

Original summary: In this tutorial, we implement a SkillNet use case as a practical framework for discovering, installing, inspecting, evaluating, and organizing reusable AI skills. Source: MarkTechPost.

May 30: 2026-05-30 briefing
MarkTechPost official news

MarkTechPost: Genesis AI Releases Nyx, Quadrants, and Genesis World 1.0 Physics Platform for Scalable Robo…

Original summary: Genesis AI released Genesis World 1.0 on May 27, 2026, a four-component simulation platform covering physics, rendering, compilation, and tooling. The system achieves a Pearson co… Source: MarkTechPost.

MarkTechPost official news

MarkTechPost: Hermes Agent Ships Tool Search for MCP: Anthropic Evals Show 49% to 74% Accuracy Gain on Opu…

Original summary: Nous Research's Hermes Agent adds Tool Search to fix MCP context bloat using BM25 progressive schema disclosure. Source: MarkTechPost.

May 29: 2026-05-29 briefing

InfoQ AI ML Data Engineering: Presentation: Building Evals for AI Adoption: From Principles to Practice

Original summary: Mallika Rao discusses the hidden risk of evaluation debt in production AI systems, drawing on her experience at Twitter, Walmart, and Netflix. She explains why traditional metrics… Source: InfoQ AI ML Data Engineering.

May 28: 2026-05-28 briefing
The Decoder official news

The Decoder: Claude company Anthropic nears a trillion-dollar valuation after raising $65 billion in Seri…

Original summary: Anthropic raises $65 billion in a Series H round at a $965 billion valuation. Annualized revenue tops $47 billion, according to CFO Krishna Rao. The company plans to inves… Source: The Decoder.

The Decoder official news

The Decoder: Anthropic ships Claude Opus 4.8 as a "modest but tangible improvement" that tops GPT-5.5 in …

Original summary: Anthropic releases Claude Opus 4.8, which beats GPT-5.5 and Gemini 3.1 Pro in most benchmarks. The model also catches its own coding errors four times more often than its… Source: The Decoder.

AWS Machine Learning updates: Evaluating Deep Agents using LangSmith on AWS

Original summary: This post combines learnings from LangChain’s work on evaluating deep agents and Anthropic’s guide to demystifying evals for AI agents into a practical guide. In this post, you wil… Source: AWS Machine Learning updates.

AWS Machine Learning updates: Build a test suite that grows with your agent with dataset management in Amazon Bedrock Agen…

Original summary: Agent evaluation is most powerful when you combine fast-moving online signals with stable offline baselines. To understand whether your agent is truly improving over time, you need… Source: AWS Machine Learning updates.

May 27: 2026-05-27 briefing

InfoQ AI ML Data Engineering: Sarang Kulkarni on Lessons from Building Deep Research Agents in Production

Original summary: Deep Research Agentic Systems are AI Agents designed to conduct multi-step research for complex tasks using dynamic reasoning, multi-hop information retrieval, and generate structu… Source: InfoQ AI ML Data Engineering.