Shitao Xiao
Verified email at bupt.edu.cn
Title
Cited by
Year
C-Pack: Packaged resources to advance general Chinese embedding
S Xiao, Z Liu, P Zhang, N Muennighoff
arXiv preprint arXiv:2309.07597, 2023
Cited by: 323; Year: 2023
BGE M3-Embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation
J Chen, S Xiao, P Zhang, K Luo, D Lian, Z Liu
arXiv preprint arXiv:2402.03216, 2024
Cited by: 162; Year: 2024
GraphFormers: GNN-nested transformers for representation learning on textual graph
J Yang, Z Liu, S Xiao, C Li, D Lian, S Agrawal, A Singh, G Sun, X Xie
Advances in Neural Information Processing Systems 34, 28798-28810, 2021
Cited by: 144; Year: 2021
RetroMAE: Pre-training Retrieval-oriented Transformers via Masked Auto-Encoder
S Xiao, Z Liu, Y Shao, Z Cao
arXiv preprint arXiv:2205.12035, 2022
Cited by: 133*; Year: 2022
Retrieve anything to augment large language models
P Zhang, S Xiao, Z Liu, Z Dou, JY Nie
arXiv preprint arXiv:2310.07554, 2023
Cited by: 75; Year: 2023
LECF: recommendation via learnable edge collaborative filtering
S Xiao, Y Shao, Y Li, H Yin, Y Shen, B Cui
Science China Information Sciences 65 (1), 112101, 2022
Cited by: 43; Year: 2022
Training large-scale news recommenders with pretrained language models in the loop
S Xiao, Z Liu, Y Shao, T Di, B Middha, F Wu, X Xie
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022
Cited by: 38; Year: 2022
Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon
P Zhang, Z Liu, S Xiao, N Shao, Q Ye, Z Dou
arXiv preprint arXiv:2401.03462, 2024
Cited by: 37; Year: 2024
Making large language models a better foundation for dense retrieval
C Li, Z Liu, S Xiao, Y Shao
arXiv preprint arXiv:2312.15503, 2023
Cited by: 24*; Year: 2023
Uni-Retriever: Towards learning the unified embedding based retriever in Bing sponsored search
J Zhang, Z Liu, W Han, S Xiao, R Zheng, Y Shao, H Sun, H Zhu, ...
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022
Cited by: 24; Year: 2022
Distill-VQ: Learning retrieval oriented vector quantization by distilling knowledge from dense embeddings
S Xiao, Z Liu, W Han, J Zhang, D Lian, Y Gong, Q Chen, F Yang, H Sun, ...
Proceedings of the 45th International ACM SIGIR Conference on Research and …, 2022
Cited by: 23; Year: 2022
MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding
J Zhou, Y Shu, B Zhao, B Wu, S Xiao, X Yang, Y Xiong, B Zhang, T Huang, ...
arXiv preprint arXiv:2406.04264, 2024
Cited by: 20; Year: 2024
RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models
Z Liu, S Xiao, Y Shao, Z Cao
Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023
Cited by: 20; Year: 2023
Matching-oriented Product Quantization For Ad-hoc Retrieval
S Xiao, Z Liu, Y Shao, D Lian, X Xie
EMNLP, 2021
Cited by: 19; Year: 2021
Progressively optimized bi-granular document representation for scalable embedding based retrieval
S Xiao, Z Liu, W Han, J Zhang, Y Shao, D Lian, C Li, H Sun, D Deng, ...
Proceedings of the ACM Web Conference 2022, 286-296, 2022
Cited by: 14; Year: 2022
LM-Cocktail: Resilient tuning of language models via model merging
S Xiao, Z Liu, P Zhang, X Xing
arXiv preprint arXiv:2311.13534, 2023
Cited by: 13; Year: 2023
MINDSim: User simulator for news recommenders
X Luo, Z Liu, S Xiao, X Xie, D Li
Proceedings of the ACM Web Conference 2022, 2067-2077, 2022
Cited by: 11; Year: 2022
BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models
K Luo, Z Liu, S Xiao, K Liu
arXiv preprint arXiv:2402.11573, 2024
Cited by: 10; Year: 2024
OmniGen: Unified image generation
S Xiao, Y Wang, J Zhou, H Yuan, X Xing, R Yan, S Wang, T Huang, Z Liu
arXiv preprint arXiv:2409.11340, 2024
Cited by: 5; Year: 2024
Extending Llama-3's Context Ten-Fold Overnight
P Zhang, N Shao, Z Liu, S Xiao, H Qian, Q Ye, Z Dou
arXiv preprint arXiv:2404.19553, 2024
Cited by: 5; Year: 2024