YuE: an open-source large model for music generation
YuE: Scaling Open Foundation Models for Long-Form Music Generation
Open-Sora: Democratizing Efficient Video Production for All
PIN (Paired and INterleaved multimodal documents)
Open-source models, open-source frameworks, and open-source leaderboards
A record of some simple scripting projects
Published in NTCIR, 2020-12
Dialogue Quality, BiLSTM + Attention, CNN (Convolutional Neural Network), Pre‑trained Language Model, MoE
Published in EMNLP, 2021-10
Multimodal Interaction, Trilinear Transformer, Visual Question Answering (VQA), Two-Stage Workflow
Published in arXiv, 2022-08
Semantic Matching, Pre-trained Language Model (PLM), Propensity‑Corrected Loss (PCL), LUE Semantic Matching Challenge
Published in arXiv, 2022-09
Massive Tool Retrieval (MTR), Query‑Tool Alignment (QTA), Massive Tool Retrieval Benchmark, DPO
Published in ACM MM, 2022-10
Multimedia Recommendation, Graph Fusion, Edge-wise Modulation, Graph Convolutional Network (GCN)
Published in EMNLP, 2022-10
Zero‑Shot Learning, Multiple Choice Format, Pre‑trained Masked Language Model (PMLM), Unified Multiple Choice model
Published in CVPR, 2023-06
Multimodality, Uncertainty Modeling, Vision-Language Pre-training, Probability Distribution Encoder (PDE)
Published in ACL, 2023-07
System 1 & System 2, Cooperative Reasoning (CoRe), Stepwise Feedback, Math Word Problems
Published in ACL, 2023-07
Information Extraction, Unified Across IE Tasks, Triaffine Attention, Span‑extractive Framework, Low‑resource Transferability
Published in SIGIR-AP, 2023-11
Multidimensional Ethics, Ethical Judgment, Large Multimodal Models (LMMs), Binary & Multi‑label Classification
Published in arXiv, 2024-01
Multimodal Understanding, Subject Knowledge Reasoning, University Exam Questions, LMM Performance Evaluation
Published in COLM, 2024-02
Structured Knowledge Grounding (SKG), Instruction tuning, Generalist model, Tables / Graphs / Databases
Published in ACM TOIS, 2024-05
Named Entity Recognition (NER), Machine Reading Comprehension (MRC), Single-Stream Reasoner (SSR), Multi-choice Input Format
Published in arXiv, 2024-06
Knowledge‑intensive, Paired & Interleaved, Large Multimodal Models (LMMs), Scalability
Published in SIGIR-AP, 2024-10
Massive Tool Retrieval (MTR), Query‑Tool Alignment (QTA), Massive Tool Retrieval Benchmark, DPO
Published in EMNLP, 2024-11
ToolBeHonest, Hallucination, LLM, Multi‑level Diagnostic
Published in EMNLP, 2024-11
Screenwriting, Large Language Models (LLMs), Role Playing, Creative Generation, Multi-Agent Collaboration
Published in ICLR, 2025-04
Large Multimodal Models (LMMs), Chart Understanding, Code Generation, Cross-modal Reasoning
Published in ACL, 2025-07
Multi-Paradigm, Large Language Model (LLM), Progressive Paradigm Training (PPT), Zero-shot Generalization
Published:
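With the rise of large models across natural language processing, computer vision, and other fields, cognitive intelligence is undergoing a paradigm shift. Backed by large-scale data and enormous parameter counts, these models have shown that they can handle a wide variety of tasks effectively, and they are being deployed into specialized domains at a remarkable pace, with far-reaching effects on society and the economy. The Chinese-language community, however, has shown signs of stagnation: model sizes have jumped from millions of parameters to hundreds of billions, and some universities and traditional companies lack both sufficient compute and the infrastructure needed to train and use such models. Solid infrastructure is therefore especially important for advancing AI technology further.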
Published:
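A comprehensive walkthrough of the Taiyi model series, from model production to application. The talk covers training, fine-tuning, and acceleration to show how the Taiyi series (the multimodal series), one part of the Fengshenbang open-source ecosystem, is produced. Using the weights the team open-sourced after training, it explains how to accelerate inference and how to deploy the models in applications such as webui and dreambooth.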
Published:
NCAA 2023 tutorial speaker
Published:
EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval