Le Xue

Cited by

	All	Since 2020
Citations	703	694
h-index	8	8
i10-index	8	7

Citations per year: 2017 (2), 2018 (5), 2019 (2), 2020 (3), 2021 (3), 2022 (5), 2023 (96), 2024 (485), 2025 (99)

Public access

0 articles

1 article

available

not available

Based on funding mandates

Co-authors

Caiming Xiong; Salesforce Research; verified email at salesforce.com
Ran Xu; Salesforce Research; verified email at salesforce.com
Silvio Savarese; Associate Professor of Computer Science at Stanford University; verified email at stanford.edu
Juan Carlos Niebles; Research Director (Salesforce) & Adjunct Professor (Stanford University); verified email at cs.stanford.edu
Zeyuan Chen; Salesforce; verified email at salesforce.com
Roberto Martín-Martín; The University of Texas at Austin; verified email at cs.utexas.edu
Weiran Yao; Research Scientist, Salesforce AI Research; verified email at cmu.edu
Mingfei Gao; Apple Inc.; verified email at apple.com
Chen Xing (星辰); Scale AI; verified email at scale.com
Jiajun Wu; Stanford University; verified email at cs.stanford.edu

Le Xue

Senior Applied Scientist, Salesforce Research

Verified email at salesforce.com

Multimodal Foundation Models


Title	Authors	Venue	Cited by	Year
Ulip: Learning a unified representation of language, images, and point clouds for 3d understanding	L Xue, M Gao, C Xing, R Martín-Martín, J Wu, C Xiong, R Xu, JC Niebles, ...	Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023	257	2023
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding	L Xue, N Yu, S Zhang, J Li, R Martín-Martín, J Wu, C Xiong, R Xu, ...	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	113	2023
Bolaa: Benchmarking and orchestrating llm-augmented autonomous agents	Z Liu, W Yao, J Zhang, L Xue, S Heinecke, R Murthy, Y Feng, Z Chen, ...	arXiv preprint arXiv:2308.05960, 2023	82	2023
Retroformer: Retrospective large language agents with policy gradient optimization	W Yao, S Heinecke, JC Niebles, Z Liu, Y Feng, L Xue, R Murthy, Z Chen, ...	arXiv preprint arXiv:2308.02151, 2023	64	2023
xgen-mm (blip-3): A family of open large multimodal models	L Xue, M Shu, A Awadalla, J Wang, A Yan, S Purushwalkam, H Zhou, ...	arXiv preprint arXiv:2408.08872, 2024	60	2024
X-instructblip: A framework for aligning x-modal instruction-aware representations to llms and emergent cross-modal reasoning	A Panagopoulou, L Xue, N Yu, J Li, D Li, S Joty, R Xu, S Savarese, ...	arXiv preprint arXiv:2311.18799, 2023	47	2023
Mint-1t: Scaling open-source multimodal data by 10x: A multimodal dataset with one trillion tokens	A Awadalla, L Xue, O Lo, M Shu, H Lee, E Guha, S Shen, M Awadalla, ...	Advances in Neural Information Processing Systems 37, 36805-36828, 2024	22	2024
Directed weighted network structure analysis of complex impedance measurements for characterizing oil-in-water bubbly flow	ZK Gao, WD Dang, L Xue, SS Zhang	Chaos: An Interdisciplinary Journal of Nonlinear Science 27 (3), 2017	15	2017
Rex: Rapid exploration and exploitation for ai agents	R Murthy, S Heinecke, JC Niebles, Z Liu, L Xue, W Yao, Y Feng, Z Chen, ...	arXiv preprint arXiv:2307.08962, 2023	8	2023
xgen-mm-vid (blip-3-video): You only need 32 tokens to represent a video even in vlms	MS Ryoo, H Zhou, S Kendre, C Qin, L Xue, M Shu, S Savarese, R Xu, ...	arXiv preprint arXiv:2410.16267, 2024	6	2024
Robustness evaluation of transformer-based form field extractors via form attacks	L Xue, M Gao, Z Chen, C Xiong, R Xu	International Conference on Document Analysis and Recognition, 167-184, 2023	6	2023
ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models	J Zhang, L Xue, L Song, J Wang, W Huang, M Shu, A Yan, Z Ma, ...	arXiv preprint arXiv:2412.07012, 2024	5	2024
Docquerynet: value retrieval with arbitrary queries for form-like documents	M Gao, L Xue, C Ramaiah, C Xing, R Xu, C Xiong	Proceedings of the 29th International Conference on Computational …, 2022	5*	2022
Llavidal: Benchmarking large language vision models for daily activities of living	R Chakraborty, A Sinha, D Reilly, MK Govind, P Wang, F Bremond, S Das	arXiv preprint arXiv:2406.09390, 2024	3	2024
Image analysis based document processing for inference of key-value pairs in non-fixed digital documents	M Gao, C Zeyuan, L Xue, R Xu, C Xiong	US Patent 11,699,297, 2023	3	2023
Model-agnostic hierarchical attention for 3d object detection	M Shu, L Xue, N Yu, R Martín-Martín, JC Niebles, C Xiong, R Xu	arXiv e-prints, arXiv: 2301.02650, 2023	3	2023
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations	C Qin, C Xia, K Ramakrishnan, M Ryoo, L Tu, Y Feng, M Shu, H Zhou, ...	arXiv preprint arXiv:2408.12590, 2024	2	2024
BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions	A Awadalla, L Xue, M Shu, A Yan, J Wang, S Purushwalkam, S Shen, ...	arXiv preprint arXiv:2411.07461, 2024	1	2024
X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-Modal Reasoning	A Panagopoulou, L Xue, N Yu, J Li, D Li, S Joty, R Xu, S Savarese, ...	European Conference on Computer Vision, 177-197, 2024	1	2024
SYSTEMS AND METHODS FOR ORCHESTRATING LLM-AUGMENTED AUTONOMOUS AGENTS	Z Liu, W Yao, J Zhang, L Xue, S Heinecke, R Murthy, Y Feng, Z Chen, ...	US Patent App. 18/494,393, 2025		2025
