Jiangsu Du

Cited by

	All	Since 2020
Citations	121	120
h-index	7	7
i10-index	3	3

Citations per year
	2021	2022	2023	2024	2025
Citations	7	10	31	58	14

Public access

9 articles (based on funding mandates)

Co-authors

Jiazhi Jiang, Beijing Normal University, verified email at bnu.edu.cn
Shenggan Cheng, National University of Singapore, verified email at comp.nus.edu.sg

Jiangsu Du

Sun Yat-sen University

Verified email at mail.sysu.edu.cn

HPC-AI


Title	Cited by	Year
Model parallelism optimization for distributed inference via decoupled CNN structure. J Du, X Zhu, M Shen, Y Du, Y Lu, N Xiao, X Liao. IEEE Transactions on Parallel and Distributed Systems 32 (7), 1665-1676, 2020	32	2020
A distributed in-situ CNN inference system for IoT applications. J Du, M Shen, Y Du. 2020 IEEE 38th International Conference on Computer Design (ICCD), 279-287, 2020	16	2020
Optimizing small channel 3D convolution on GPU with tensor core. J Jiang, D Huang, J Du, Y Lu, X Liao. Parallel Computing 113, 102954, 2022	10	2022
Galaxy: A resource-efficient collaborative edge AI system for in-situ transformer inference. S Ye, J Du, L Zeng, W Ou, X Chu, Y Lu, X Chen. IEEE INFOCOM 2024-IEEE Conference on Computer Communications, 1001-1010, 2024	8	2024
Handling heavy-tailed input of transformer inference on GPUs. J Du, J Jiang, Y You, D Huang, Y Lu. Proceedings of the 36th ACM International Conference on Supercomputing, 1-11, 2022	8	2022
Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference. J Du, J Wei, J Jiang, S Cheng, D Huang, Z Chen, Y Lu. Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024	7	2024
Improving computation and memory efficiency for real-world transformer inference on GPUs. J Du, J Jiang, J Zheng, H Zhang, D Huang, Y Lu. ACM Transactions on Architecture and Code Optimization 20 (4), 1-22, 2023	7	2023
Full-stack optimizing transformer inference on ARM many-core CPU. J Jiang, J Du, D Huang, Z Chen, Y Lu, X Liao. IEEE Transactions on Parallel and Distributed Systems 34 (7), 2221-2235, 2023	7	2023
Characterizing and optimizing transformer inference on ARM many-core processor. J Jiang, J Du, D Huang, D Li, J Zheng, Y Lu. Proceedings of the 51st International Conference on Parallel Processing, 1-11, 2022	6	2022
P-SOBI: A parallel implementation for second order blind identification algorithm. H Li, J Du, Y Du, Z Chen, N Xiao. 2019 IEEE 21st International Conference on High Performance Computing and …, 2019	5	2019
ATP: Adaptive Tensor Parallelism for Foundation Models. S Cheng, Z Liu, J Du, Y You. arXiv preprint arXiv:2301.08658, 2023	4	2023
EnergonAI: An inference system for 10-100 billion parameter transformer models. J Du, Z Liu, J Fang, S Li, Y Li, Y Lu, Y You. arXiv preprint arXiv:2209.02341, 2022	4	2022
Hierarchical model parallelism for optimizing inference on many-core processor via decoupled 3D-CNN structure. J Jiang, Z Huang, D Huang, J Du, L Chen, Z Chen, Y Lu. ACM Transactions on Architecture and Code Optimization 20 (3), 1-21, 2023	2	2023
CosNAS: Enhancing estimation on cosmological parameters via neural architecture search. Y Wen, W Yu, D Li, J Du, D Huang, N Xiao. New Astronomy 99, 101955, 2023	2	2023
Enhancing Distributed In-Situ CNN Inference in the Internet of Things. J Du, Y Du, D Huang, Y Lu, X Liao. IEEE Internet of Things Journal 9 (17), 15511-15524, 2022	2	2022
Optimizing massively parallel sparse matrix computing on ARM many-core processor. J Zheng, J Jiang, J Du, D Huang, Y Lu. Parallel Computing 117, 103035, 2023	1	2023
Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning. S Cheng, S Lin, L Diao, H Wu, S Wang, C Si, Z Liu, X Zhao, J Du, W Lin, ... Proceedings of the 30th ACM International Conference on Architectural …, 2025		2025
ORFA: Exploring WebAssembly as a Turing Complete Query Language for Web APIs. Y Gu, C Chen, J Du, X Zhang, X Zhang. The Web Conference 2025, 2025		2025
Co-designing Transformer Architectures for Distributed Inference with Low Communication. J Du, Y Wei, S Ye, J Jiang, X Chen, D Huang, Y Lu. IEEE Transactions on Parallel and Distributed Systems, 2024		2024
APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes. Y Wei, J Du, J Jiang, X Shi, X Zhang, D Huang, N Xiao, Y Lu. SC24: International Conference for High Performance Computing, Networking …, 2024		2024
