Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Posts

Comprehensive LaTeX Template for Waseda University PhD Theses

less than 1 minute read

Published: June 24, 2025

GitHub: https://github.com/wanng-ide/phd_thesis_template_waseda_university

📚 Paper Reading

less than 1 minute read

Published: June 21, 2025

A collection of papers I have read.

🤔 Some Thoughts

less than 1 minute read

Published: June 13, 2025

A record of some simple thoughts 💬.

Markdown Guide

7 minute read

Published: June 13, 2025

📒 This page is from Academic Pages.

A Concise Bilingual Guide to Academic Pages

3 minute read

Published: June 12, 2025

Quick Guide for Academic Pages (Chinese-English Bilingual)

From Visual Question Answering (VQA) to Multimodal Representation Learning: A Brief Survey

less than 1 minute read

Published: August 14, 2022

This article mainly surveys the VQA task as a whole.

Portfolio

YuE: Open-Source Large Music Model

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Open‑Sora

Open-Sora: Democratizing Efficient Video Production for All

PIN: The Largest Open-Source Interleaved Image-Text Dataset

PIN (Paired and INterleaved multimodal documents)

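Fengshenbang Open-Source Ecosystem

Open-source models, open-source frameworks, and open-source leaderboards

Some Automation Scripts

A record of some simple scripting projects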

Publications

⭐ [NTCIR 15] SKYMN at the NTCIR-15 DialEval-1 Task

Published in NTCIR, 2020-12

Dialogue Quality, BiLSTM + Attention, CNN (Convolutional Neural Network), Pre‑trained Language Model, MoE

Download Paper

⭐ [EMNLP 2021] MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering

Published in EMNLP, 2021-10

Multimodal Interaction, Trilinear Transformer, Visual Question Answering (VQA), Two-Stage Workflow

Download Paper

⭐ Towards No.1 in CLUE Semantic Matching Challenge: Pre-trained Language Model Erlangshen with Propensity-Corrected Loss

Published in arXiv, 2022-08

Semantic Matching, Pre-trained Language Model (PLM), Propensity‑Corrected Loss (PCL), CLUE Semantic Matching Challenge

Download Paper

⭐ Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence

Published in arXiv, 2022-09

Massive Tool Retrieval (MTR), Query‑Tool Alignment (QTA), Massive Tool Retrieval Benchmark, DPO

Download Paper

[ACM MM 2022] Breaking Isolation: Multimodal Graph Fusion for Multimedia Recommendation by Edge-wise Modulation

Published in ACM MM, 2022-10

Multimedia Recommendation, Graph Fusion, Edge-wise Modulation, Graph Convolutional Network (GCN)

Download Paper

⭐ [EMNLP 2022] Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective

Published in EMNLP, 2022-10

Zero‑Shot Learning, Multiple Choice Format, Pre‑trained Masked Language Model (PMLM), Unified Multiple Choice model

Download Paper

⭐ [CVPR 2023] MAP: Modality-Agnostic Uncertainty-Aware Vision-Language Pre-training Model

Published in CVPR, 2023-06

Multimodality, Uncertainty Modeling, Vision-Language Pre-training, Probability Distribution Encoder (PDE)

Download Paper

⭐ [ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models

Published in ACL, 2023-07

System 1 & System 2, Cooperative Reasoning (CoRe), Stepwise Feedback, Math Word Problems

Download Paper

[ACL 2023] UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective

Published in ACL, 2023-07

Information Extraction, Unified Across IE Tasks, Triaffine Attention, Span‑extractive Framework, Low‑resource Transferability

Download Paper

⭐ [SIGIR-AP 2023] EALM: Introducing Multidimensional Ethical Alignment in Conversational Information Retrieval

Published in SIGIR-AP, 2023-11

Multidimensional Ethics, Ethical Judgment, Large Multimodal Models (LMMs), Binary & Multi‑label Classification

Download Paper

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark

Published in arXiv, 2024-01

Multimodal Understanding, Subject Knowledge Reasoning, University Exam Questions, LMM Performance Evaluation

Download Paper

[COLM 2024] StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

Published in COLM, 2024-02

Structured Knowledge Grounding (SKG), Instruction tuning, Generalist model, Tables / Graphs / Databases

Download Paper

⭐ [ACM TOIS 2024] SSR: Solving Named Entity Recognition Problems via a Single-stream Reasoner

Published in ACM TOIS, 2024-05

Named Entity Recognition (NER), Machine Reading Comprehension (MRC), Single-Stream Reasoner (SSR), Multi-choice Input Format

Download Paper

⭐🚩 PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents

Published in arXiv, 2024-06

Knowledge‑intensive, Paired & Interleaved, Large Multimodal Models (LMMs), Scalability

Download Paper

⭐ [SIGIR-AP 2024 Best Paper Nomination] Data-Efficient Massive Tool Retrieval: A Reinforcement Learning Approach for Query-Tool Alignment with Language Models

Published in SIGIR-AP, 2024-10

Massive Tool Retrieval (MTR), Query‑Tool Alignment (QTA), Massive Tool Retrieval Benchmark, DPO

Download Paper

⭐ [EMNLP 2024] ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models

Published in EMNLP, 2024-11

ToolBeHonest, Hallucination, LLM, Multi‑level Diagnostic

Download Paper

[EMNLP 2024] HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing

Published in EMNLP, 2024-11

Screenwriting, Large Language Models (LLMs), Role Playing, Creative Generation, Multi-Agent Collaboration

Download Paper

⭐ [ICLR 2025] ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation

Published in ICLR, 2025-04

Large Multimodal Models (LMMs), Chart Understanding, Code Generation, Cross-modal Reasoning

Download Paper

🚩 [ACL 2025] Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective

Published in ACL, 2025-07

Multi-Paradigm, Large Language Model (LLM), Progressive Paradigm Training (PPT), Zero-shot Generalization

Download Paper

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.

Junjie Wang (王军杰)
