PyraTok: A Language-Aligned Pyramidal Tokenizer for Video Understanding and Generation
Paper information Title: PyraTok: Language-Aligned Pyramidal Tokenizer for Video Understanding and Generation Authors: Onkar Susladkar, Tushar Prakash, Adheesh Juvekar, et al. Published: 2026-01-22 arXiv ID: 2601.16210v...
Paper information Title: Why Can’t I Open My Drawer? Mitigating Object-Driven Shortcuts in Zero-Shot Compositional Action Recognition Authors: Geo Ahn, Inwoong Lee, Taeoh Kim, et al. Published: 2026-01-22 arXiv ID: 2601...
Paper information Title: Rethinking Video Generation Model for the Embodied World Authors: Yufan Deng, Zilin Pan, Hongyu Zhang, et al. Published: 2026-01-21 arXiv ID: 2601.15282v1 PDF link: Download PDF From virtual to embodied: RBench and RoVid-X...
Paper information Title: Iterative Refinement Improves Compositional Image Generation Authors: Shantanu Jaiswal, Mihir Prabhudesai, Nikash Bhardwaj, et al. Published: 2026-01-21 arXiv ID: 2601.15286v1 PDF link: Download PDF ...
Paper information Title: Jet-RL: Enabling On-Policy FP8 Reinforcement Learning with Unified Training and Rollout Precision Flow Authors: Haocheng Xi, Charlie Ruan, Peiyuan Liao, et al. Published: 2026-01-20 arXiv ID: 26...
Paper information Title: VideoMaMa: Mask-Guided Video Matting via Generative Prior Authors: Sangbeom Lim, Seoung Wug Oh, Jiahui Huang, et al. Published: 2026-01-20 arXiv ID: 2601.14255v1 PDF link: Download PDF From "mask" to "matte": Vide...
Paper information Title: MetaboNet: The Largest Publicly Available Consolidated Dataset for Type 1 Diabetes Management Authors: Miriam K. Wolff, Peter Calhoun, Eleonora Maria Aiello, et al. Published: 2026-01-16 arXiv I...
Paper information Title: ShapeR: Robust Conditional 3D Shape Generation from Casual Captures Authors: Yawar Siddiqui, Duncan Frost, Samir Aroudj, et al. Published: 2026-01-16 arXiv ID: 2601.11514v1 PDF link: Download PDF From casually shot...
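Paper information Title: Building Production-Ready Probes For Gemini Authors: János Kramár, Joshua Engels, Zheng Wang, et al. Published: 2026-01-16 arXiv ID: 2601.11516v1 PDF link: Download PDF Building production-oriented misuse-detection probes for Gemini models: from theory to deployment, an in-depth...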
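Paper information Title: Do explanations generalize across large reasoning models? Authors: Koyena Pal, David Bau, Chandan Singh Published: 2026-01-16 arXiv ID: 2601.11517v1 PDF link: Download PDF Can the explanations of large reasoning models generalize from one case to others? A study on chain-of-thought...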