common

A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Mer
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis
Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation
DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language Models via Causal
Evaluating Language Models as Synthetic Data Generators
From REAL to SYNTHETIC: Synthesizing Millions of Diversified and Complicated User Instructions with
Generative Reward Modeling via Synthetic Criteria Preference Learning
HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
On Synthetic Data Strategies for Domain-Specific Generative Retrieval
Rethinking Chain-of-Thought from the Perspective of Self-Training
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
Self-Generated Critiques Boost Reward Modeling for Language Models
SpaRE: Enhancing Spatial Reasoning in Vision-Language Models with Synthetic Data
TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data
Theorem Prover as a Judge for Synthetic Data Generation
Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Kn
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation
Group Sequence Policy Optimization
SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings
On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
