Jiangsu Du
Sun Yat-sen University
Verified email at mail.sysu.edu.cn
Title
Cited by
Year
Model parallelism optimization for distributed inference via decoupled CNN structure
J Du, X Zhu, M Shen, Y Du, Y Lu, N Xiao, X Liao
IEEE Transactions on Parallel and Distributed Systems 32 (7), 1665-1676, 2020
Cited by: 32; Year: 2020
A distributed in-situ CNN inference system for IoT applications
J Du, M Shen, Y Du
2020 IEEE 38th International Conference on Computer Design (ICCD), 279-287, 2020
Cited by: 16; Year: 2020
Optimizing small channel 3D convolution on GPU with tensor core
J Jiang, D Huang, J Du, Y Lu, X Liao
Parallel Computing 113, 102954, 2022
Cited by: 10; Year: 2022
Galaxy: A resource-efficient collaborative edge AI system for in-situ transformer inference
S Ye, J Du, L Zeng, W Ou, X Chu, Y Lu, X Chen
IEEE INFOCOM 2024-IEEE Conference on Computer Communications, 1001-1010, 2024
Cited by: 8; Year: 2024
Handling heavy-tailed input of transformer inference on GPUs
J Du, J Jiang, Y You, D Huang, Y Lu
Proceedings of the 36th ACM International Conference on Supercomputing, 1-11, 2022
Cited by: 8; Year: 2022
Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference
J Du, J Wei, J Jiang, S Cheng, D Huang, Z Chen, Y Lu
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024
Cited by: 7; Year: 2024
Improving computation and memory efficiency for real-world transformer inference on GPUs
J Du, J Jiang, J Zheng, H Zhang, D Huang, Y Lu
ACM Transactions on Architecture and Code Optimization 20 (4), 1-22, 2023
Cited by: 7; Year: 2023
Full-stack optimizing transformer inference on ARM many-core CPU
J Jiang, J Du, D Huang, Z Chen, Y Lu, X Liao
IEEE Transactions on Parallel and Distributed Systems 34 (7), 2221-2235, 2023
Cited by: 7; Year: 2023
Characterizing and optimizing transformer inference on ARM many-core processor
J Jiang, J Du, D Huang, D Li, J Zheng, Y Lu
Proceedings of the 51st International Conference on Parallel Processing, 1-11, 2022
Cited by: 6; Year: 2022
P-SOBI: A parallel implementation for second order blind identification algorithm
H Li, J Du, Y Du, Z Chen, N Xiao
2019 IEEE 21st International Conference on High Performance Computing and …, 2019
Cited by: 5; Year: 2019
ATP: Adaptive Tensor Parallelism for Foundation Models
S Cheng, Z Liu, J Du, Y You
arXiv preprint arXiv:2301.08658, 2023
Cited by: 4; Year: 2023
EnergonAI: An inference system for 10-100 billion parameter transformer models
J Du, Z Liu, J Fang, S Li, Y Li, Y Lu, Y You
arXiv preprint arXiv:2209.02341, 2022
Cited by: 4; Year: 2022
Hierarchical model parallelism for optimizing inference on many-core processor via decoupled 3D-CNN structure
J Jiang, Z Huang, D Huang, J Du, L Chen, Z Chen, Y Lu
ACM Transactions on Architecture and Code Optimization 20 (3), 1-21, 2023
Cited by: 2; Year: 2023
CosNAS: Enhancing estimation on cosmological parameters via neural architecture search
Y Wen, W Yu, D Li, J Du, D Huang, N Xiao
New Astronomy 99, 101955, 2023
Cited by: 2; Year: 2023
Enhancing Distributed In-Situ CNN Inference in the Internet of Things
J Du, Y Du, D Huang, Y Lu, X Liao
IEEE Internet of Things Journal 9 (17), 15511-15524, 2022
Cited by: 2; Year: 2022
Optimizing massively parallel sparse matrix computing on ARM many-core processor
J Zheng, J Jiang, J Du, D Huang, Y Lu
Parallel Computing 117, 103035, 2023
Cited by: 1; Year: 2023
Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning
S Cheng, S Lin, L Diao, H Wu, S Wang, C Si, Z Liu, X Zhao, J Du, W Lin, ...
Proceedings of the 30th ACM International Conference on Architectural …, 2025
2025
ORFA: Exploring WebAssembly as a Turing Complete Query Language for Web APIs
Y Gu, C Chen, J Du, X Zhang, X Zhang
The Web Conference 2025, 2025
2025
Co-designing Transformer Architectures for Distributed Inference with Low Communication
J Du, Y Wei, S Ye, J Jiang, X Chen, D Huang, Y Lu
IEEE Transactions on Parallel and Distributed Systems, 2024
2024
APTMoE: Affinity-Aware Pipeline Tuning for MoE Models on Bandwidth-Constrained GPU Nodes
Y Wei, J Du, J Jiang, X Shi, X Zhang, D Huang, N Xiao, Y Lu
SC24: International Conference for High Performance Computing, Networking …, 2024
2024