Projects
Awesome World Models for Digital Agents
Awesome World Models for Digital Agents is a curated list and accompanying survey of world models for digital agents across games, web & GUI, tool use, and code. The survey introduces a unified design space W = (X, L, U): what is modeled (X), how it is built (L), and how it is used by the agent (U).

GemmaX: Multilingual Translator based on Gemma Open Models
GemmaX is a family of many-to-many LLM-based multilingual translation models that adopt multilingual continual pretraining with a Parallel-First Monolingual-Second (PFMS) data mixing strategy and instruction fine-tuning with high-quality translation prompts.

Forte: Composing Diverse NLP Tools for Text Retrieval, Analysis and Generation
Forte is a flexible, composable system for text processing that supports a wide spectrum of tasks, from information retrieval to natural language processing (including text analysis and language generation). Built on principled abstractions and design patterns, Forte provides a platform for combining cutting-edge NLP and ML technologies in a composable manner.

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation
Texar-PyTorch is an open-source toolkit based on PyTorch that aims to support a broad set of machine learning tasks, especially text generation tasks such as machine translation, dialog, summarization, content manipulation, and language modeling. Texar is designed to let both researchers and practitioners prototype and experiment quickly.

