Online construction of vectorized high-definition (HD) maps is a critical task in autonomous driving systems. However, the prediction performance of existing methods degrades under complex conditions such as occlusion and low illumination. To address this issue, a temporal vector map perception method with enhanced queries and map priors (QPMap) is proposed. First, a Transformer-based standard-definition map enhancement module and a rasterized HD map initialization module are designed to improve environmental understanding in complex scenarios. Second, an instance-level query encoding strategy is designed, in which detection results from previous frames are introduced as temporal query priors into the current perception process to improve prediction stability and temporal consistency. Finally, a multi-scale decoder and a direction vector loss function are used to refine the query representations so that vectorized map elements are predicted accurately. A real-world dataset is constructed and a fisheye camera adaptation strategy is developed to validate the applicability of QPMap in real autonomous driving scenarios. Experiments on multiple public datasets and a self-collected dataset demonstrate that QPMap achieves superior map construction performance while maintaining good real-time performance, improving both the accuracy and the robustness of online HD map construction.
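
The abstract names a direction vector loss without giving its form. As a minimal sketch, assuming the loss compares the unit direction of each polyline segment by cosine similarity (a common choice for vectorized map elements, not confirmed by the source), it could look as follows. The names `unit_directions` and `direction_loss` are hypothetical.

```python
import math

def unit_directions(polyline):
    """Unit direction vector of every segment in a 2-D polyline."""
    dirs = []
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        # A zero-length segment has no direction; use the zero vector.
        dirs.append((dx / norm, dy / norm) if norm > 0 else (0.0, 0.0))
    return dirs

def direction_loss(pred, target):
    """Mean (1 - cosine similarity) between matching segment directions.

    Assumption: pred and target have the same number of points and are
    already matched point to point.
    """
    pred_dirs, target_dirs = unit_directions(pred), unit_directions(target)
    if len(pred_dirs) != len(target_dirs):
        raise ValueError("polylines must have the same number of points")
    if not pred_dirs:
        return 0.0
    total = sum(1.0 - (px * tx + py * ty)
                for (px, py), (tx, ty) in zip(pred_dirs, target_dirs))
    return total / len(pred_dirs)
```

Under this assumption the loss is 0 for identical directions, 1 for perpendicular segments, and 2 for a polyline traversed in the opposite direction, so it penalizes orientation errors that a pure point-distance loss can miss.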
The Transformer demonstrates considerable potential in time series analysis. However, its attention mechanism often aggregates semantically irrelevant query-key pairs, which degrades prediction performance. Moreover, complex patterns in time series, including periodicity and abrupt fluctuations, pose additional challenges for effective modeling. To address these issues, a multivariate time series forecasting method with global-local feature fusion (MTS-GLFF) is proposed. First, a TopK selection operator is designed that dynamically generates sparse masks from learnable sensor embeddings, thereby retaining key sequences for subsequent feature aggregation. Next, a dual-branch time series forecasting framework comprising global and local branch networks is constructed. The global branch captures global interaction features through a cross-variable attention mechanism, while the local branch adopts a multi-scale architecture that decomposes time series into multi-granularity patterns for fine-grained modeling of local dependencies. Experiments on 10 benchmark datasets demonstrate that MTS-GLFF achieves competitive performance.
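
To illustrate the TopK selection step, the sketch below builds a sparse boolean mask from per-sensor embeddings. It assumes that similarity is a plain dot product and that each sensor keeps its `k` highest-scoring sensors; the abstract does not specify either detail, and `topk_mask` is a hypothetical name. In a trained model the embeddings would be learnable parameters rather than fixed lists.

```python
def topk_mask(embeddings, k):
    """Boolean mask where mask[i][j] is True if sensor j is among the
    k sensors most similar to sensor i (dot-product similarity)."""
    n = len(embeddings)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of sensors")
    mask = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
                  for j in range(n)]
        # Indices of the k largest scores for row i.
        keep = set(sorted(range(n), key=lambda j: scores[j], reverse=True)[:k])
        mask.append([j in keep for j in range(n)])
    return mask

# Three sensors: the first two are similar, the third is different.
mask = topk_mask([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], k=2)
```

Entries that are False would be excluded from attention, for example by setting the corresponding attention logits to negative infinity before the softmax.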
In traditional convolutional neural networks, the lack of explicit constraints on the frequency-domain responses of convolutional kernels leads to excessive enhancement of high-frequency components. To address this issue, an image classification network with frequency decay convolution (FDNet) is proposed. First, the frequency decay convolution (FDConv) module is designed by introducing the neuron signal attenuation mechanism into the frequency domain. By attenuating the frequency-domain amplitude of the convolutional kernel, the responses of excessively strong frequency components are suppressed while the phase information is fully retained. Consequently, the overfitting caused by imbalanced frequency-domain responses is effectively alleviated. Second, FDConv is embedded into the shallow feature extraction stage and the residual block structure, enabling spectral regularization throughout the entire feature extraction process. Then, the directional spatial decay (DSD) module is constructed. Parallel 1×3 and 3×1 FDConv modules separately extract horizontal and vertical directional features, thereby achieving collaborative enhancement of directional features and important channels. Finally, the DSD module is embedded at the end of the residual block. After the residual connection, both directional feature decomposition and channel recalibration are performed, enabling the network to focus on key discriminative information. Experiments on the CIFAR-10, CIFAR-100, SVHN, STL-10, Imagenette, and Imagewoof image datasets demonstrate the superior classification performance of FDNet.
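
The core idea of attenuating a kernel's frequency-domain amplitude while keeping its phase can be sketched on a 1-D kernel. The decay function used here, A / (1 + s·A), is an assumption chosen because it suppresses strong components more than weak ones; the abstract does not give the actual attenuation rule. The names `dft`, `idft`, `decay_kernel`, and `strength` are hypothetical.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def decay_kernel(kernel, strength=0.5):
    """Attenuate each frequency amplitude; leave each phase unchanged.

    Assumed decay rule: new_amp = amp / (1 + strength * amp).
    """
    decayed = []
    for coeff in dft(kernel):
        amp, phase = abs(coeff), cmath.phase(coeff)
        new_amp = amp / (1.0 + strength * amp)
        decayed.append(cmath.rect(new_amp, phase))
    # A real kernel has a conjugate-symmetric spectrum, and the decay keeps
    # that symmetry, so the imaginary parts are only rounding noise.
    return [value.real for value in idft(decayed)]

smoothed = decay_kernel([1.0, -2.0, 1.0], strength=0.5)
```

With `strength=0` the kernel is returned unchanged (up to rounding), which makes the decay strength a natural regularization knob in this sketch.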
Existing face super-resolution methods struggle to achieve an optimal tradeoff between computational efficiency and reconstruction quality. To address this issue, a face super-resolution network based on frequency-domain interaction and dense fusion is proposed. First, the wavelet transform is adopted for frequency-domain downsampling so that the spectral information of images is effectively preserved. On this basis, a global-local Transformer module is designed to apply local window attention and global sparse attention in series, which better captures the overall geometric structure of the face and reduces computational complexity. Second, a frequency-domain interaction Transformer module is proposed to construct a cross-frequency-band interaction mechanism, in which high-frequency features are used to refine low-frequency semantics, thereby improving image sharpness. Finally, cross-scale feature aggregation is realized through an adjacent cross-scale fusion mechanism. Experiments indicate that the proposed network achieves a good balance between perceptual quality and pixel fidelity while reducing the number of parameters, providing a new solution for face super-resolution in resource-constrained scenarios. The code is available at https://github.com/Hddcc/FIDFN.
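
Wavelet-based downsampling halves the spatial resolution while keeping all information in four sub-bands. The sketch below uses a single-level orthonormal Haar transform as an illustrative assumption; the abstract does not state which wavelet the network uses, and `haar_dwt2` is a hypothetical name.

```python
def haar_dwt2(img):
    """Single-level 2-D orthonormal Haar transform of a list-of-lists image.

    Returns four half-resolution sub-bands:
    (approximation, column-difference detail, row-difference detail,
     diagonal detail).
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("image height and width must be even")
    approx, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, h, 2):
        ra, rc, rr, rd = [], [], [], []
        for c in range(0, w, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ra.append((a + b + d + e) / 2)   # low-pass in both directions
            rc.append((a - b + d - e) / 2)   # difference across columns
            rr.append((a + b - d - e) / 2)   # difference across rows
            rd.append((a - b - d + e) / 2)   # diagonal difference
        approx.append(ra)
        d_cols.append(rc)
        d_rows.append(rr)
        d_diag.append(rd)
    return approx, d_cols, d_rows, d_diag

bands = haar_dwt2([[1, 2], [3, 4]])
```

Because the transform is orthonormal, the sum of squared values over the four sub-bands equals that of the input, which is one sense in which the downsampling preserves spectral information, unlike strided pooling.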
In traditional conflict analysis, agents are typically assumed to be independent, and the latent connections and status inequality among agents in social networks are neglected. Furthermore, trust relationships in social networks are often incomplete, and determining the issue weights on a sound basis is also a key factor in mitigating conflict. To address these issues, a trust-driven three-way conflict analysis model based on the best-worst method (TBWM-3WCA) is proposed. First, to handle incomplete trust relationships in social networks, the path penalty coefficient and the Einstein product are used to simulate trust propagation and complete the trust matrix. Second, to reflect the differences in agents' influence within a group, latent relationships among agents are explored to derive influence weights. Then, the influence weights are employed to aggregate group attitudes and objectively identify the most and least supported issues in conflict scenarios. The best-worst method (BWM) is subsequently applied to determine the issue weights. Finally, a dynamic feedback mechanism based on the system conflict degree is incorporated to iteratively adjust the attitudes of agents, thereby promoting consensus formation and conflict convergence. Case studies and comparative experiments demonstrate the effectiveness of the proposed model in resolving conflicts.
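
Trust completion with the Einstein product can be sketched as follows. The Einstein product E(a, b) = ab / (1 + (1 − a)(1 − b)) is the standard t-norm; everything else is an illustrative assumption: only two-hop paths are considered, the path penalty is a single multiplicative factor, and the best available path is chosen with `max`. The names `einstein_product`, `complete_trust`, and `penalty` are hypothetical.

```python
def einstein_product(a, b):
    """Einstein t-norm of two trust values in [0, 1]."""
    return (a * b) / (1.0 + (1.0 - a) * (1.0 - b))

def complete_trust(trust, penalty=0.9):
    """Fill missing entries (None) of a trust matrix via two-hop paths.

    Assumptions: trust along i -> k -> j is penalty * E(t_ik, t_kj), and
    the strongest such path is used. Entries with no connecting path and
    the diagonal are left as None. Only originally known values are
    propagated, so the result does not depend on the fill order.
    """
    n = len(trust)
    completed = [row[:] for row in trust]
    for i in range(n):
        for j in range(n):
            if i == j or trust[i][j] is not None:
                continue
            candidates = [
                penalty * einstein_product(trust[i][k], trust[k][j])
                for k in range(n)
                if k not in (i, j)
                and trust[i][k] is not None
                and trust[k][j] is not None
            ]
            if candidates:
                completed[i][j] = max(candidates)
    return completed

T = [[None, 0.8, None],
     [0.6, None, 0.5],
     [0.7, None, None]]
filled = complete_trust(T)
```

Since the Einstein product never exceeds the smaller of its two arguments and the penalty is at most 1, propagated trust is never higher than the weakest link on the path.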
In ultra-fine-grained visual classification (Ultra-FGVC), the inter-class differences are extremely subtle while the intra-class variations remain considerable. To address these issues, an ultra-fine-grained visual classification network based on dynamic confusion discovery (DCD-Net) is proposed. DCD-Net is composed of two key modules: the dynamic confusion discovery module (DCDM) and the confusion-aware dual contrastive learning module (CDCLM). DCDM constructs a confusion affinity matrix by analyzing the predicted probability distributions of the model and derives the globally optimal class pairing to identify the most confusable class pairs at the current training stage. CDCLM focuses on these confusion pairs to optimize the feature space from two perspectives: preserving intra-class feature consistency and enlarging the feature margin between easily confused classes. The two modules cooperate through a confusion pairing table, enabling the network to dynamically adjust its learning focus throughout training and to concentrate continuously on the class boundaries that are hardest to distinguish. Experiments on five ultra-fine-grained datasets and one fine-grained dataset demonstrate that DCD-Net achieves high recognition accuracy and strong generalization ability.
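
The confusion discovery step can be illustrated with a small sketch. It assumes that the affinity between two classes is the symmetrized mean probability each class's samples assign to the other, and that the globally optimal pairing is the perfect matching with maximum total affinity, found here by exhaustive search for a handful of classes. The abstract does not specify either choice; `confusion_affinity` and `best_pairing` are hypothetical names.

```python
def confusion_affinity(probs, labels, num_classes):
    """Symmetric class-confusion affinity from predicted probabilities."""
    sums = [[0.0] * num_classes for _ in range(num_classes)]
    counts = [0] * num_classes
    for p, y in zip(probs, labels):
        counts[y] += 1
        for j in range(num_classes):
            sums[y][j] += p[j]
    # mean[i][j]: average probability that class-i samples give to class j.
    mean = [[sums[i][j] / counts[i] if counts[i] else 0.0
             for j in range(num_classes)] for i in range(num_classes)]
    return [[0.0 if i == j else 0.5 * (mean[i][j] + mean[j][i])
             for j in range(num_classes)] for i in range(num_classes)]

def best_pairing(aff):
    """Maximum-affinity perfect matching by exhaustive search.

    Only practical for a small, even number of classes.
    """
    if len(aff) % 2:
        raise ValueError("an even number of classes is required")

    def solve(remaining):
        if not remaining:
            return 0.0, []
        first, rest = remaining[0], remaining[1:]
        best = (-1.0, [])
        for idx, partner in enumerate(rest):
            score, pairs = solve(rest[:idx] + rest[idx + 1:])
            score += aff[first][partner]
            if score > best[0]:
                best = (score, [(first, partner)] + pairs)
        return best

    return solve(list(range(len(aff))))[1]

probs = [[0.6, 0.4, 0.0, 0.0], [0.3, 0.7, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5], [0.0, 0.1, 0.4, 0.5]]
pairs = best_pairing(confusion_affinity(probs, [0, 1, 2, 3], 4))
```

For realistic class counts the exhaustive search would be replaced by a polynomial-time matching algorithm; the resulting pairs play the role of the confusion pairing table that tells the contrastive module which class boundaries to focus on.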