Discriminative Representation and Adaptive Calibrated Inference for Cross-Domain Few-Shot Named Entity Recognition
QIU Quanan1, HUANG Qi1,2, TONG Zirong1, LUO Wenbing1,2, YI Jie3,4, WANG Mingwen1,2
1. School of Digital Industry, Jiangxi Normal University, Shangrao 334000; 2. School of Artificial Intelligence, Jiangxi Normal University, Nanchang 330022; 3. Management Science and Engineering Research Center, Jiangxi Normal University, Nanchang 330022; 4. School of Big Data, Shangrao Vocational and Technical College, Shangrao 334109
Abstract: To address the boundary ambiguity and error accumulation caused by feature distribution shift between the source and target domains in cross-domain few-shot named entity recognition (NER), a model based on discriminative representation and adaptive calibrated inference (DR-ACI) is proposed. First, an asymmetric boundary contrastive (ABC) loss is designed to reshape the span detection space. Its entity-centric asymmetric constraint strategy explicitly sharpens entity boundaries while preserving the semantic diversity of the background. In addition, an adaptive gated enhancement (AGE) module dynamically calibrates sparse prototypes through multi-level semantic fusion, mitigating the representation uncertainty and bias caused by support set sparsity. Then, a scenario-aware adaptive calibrated inference mechanism is designed to tackle the bottlenecks of feature norm drift and support set bias. Using feature normalization and a reliability-aware dual-mode gating strategy, this mechanism dynamically reconstructs decision boundaries and suppresses transfer noise. Experiments show that DR-ACI is competitive on the Few-NERD dataset and outperforms the baseline models on cross-domain datasets, which verifies the effectiveness of jointly optimizing discriminative representation and adaptive inference.
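The abstract names feature normalization as one remedy for feature norm drift but does not give its exact form. The following is a minimal sketch, assuming a plain prototype classifier with cosine scoring, of why normalization can help: a query whose norm has drifted is misassigned under Euclidean distance but not under cosine similarity. The function names, toy vectors, and the labels `PER` and `LOC` are hypothetical illustrations, not the DR-ACI implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def mean_prototype(support_vectors):
    """Class prototype as the mean of its support embeddings."""
    n = len(support_vectors)
    dim = len(support_vectors[0])
    return [sum(v[i] for v in support_vectors) / n for i in range(dim)]


def classify_euclidean(query, prototypes):
    """Nearest prototype by squared Euclidean distance (norm-sensitive)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))


def classify_cosine(query, prototypes):
    """Nearest prototype by cosine similarity (invariant to vector norm)."""
    q = l2_normalize(query)

    def cosine(label):
        p = l2_normalize(prototypes[label])
        return sum(x * y for x, y in zip(q, p))

    return max(prototypes, key=cosine)


# Hypothetical 2-D support embeddings for two entity types.
support = {
    "PER": [[1.0, 0.2], [0.8, 0.0]],
    "LOC": [[3.0, 3.0], [2.0, 2.0]],
}
prototypes = {label: mean_prototype(vs) for label, vs in support.items()}

query = [0.9, 0.1]                 # points in the PER direction
drifted = [5 * x for x in query]   # same direction, inflated norm

print(classify_euclidean(query, prototypes), classify_cosine(query, prototypes))
print(classify_euclidean(drifted, prototypes), classify_cosine(drifted, prototypes))
```

In this toy setting the drifted query keeps its direction, so the cosine classifier still returns `PER`, while the Euclidean classifier switches to `LOC` because the inflated norm moves the query closer to the larger-norm prototype. The reliability-aware dual-mode gating described in the abstract is not modeled here.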