Yuchen Hao
Meta
Verified email at cs.ucla.edu
Title
Cited by
Year
The Llama 3 herd of models
A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...
arXiv preprint arXiv:2407.21783, 2024
Cited by 1109, 2024
PyTorch FSDP: experiences on scaling fully sharded data parallel
Y Zhao, A Gu, R Varma, L Luo, CC Huang, M Xu, L Wright, H Shojanazeri, ...
arXiv preprint arXiv:2304.11277, 2023
Cited by 197, 2023
A quantitative analysis on microarchitectures of modern CPU-FPGA platforms
Y Choi, J Cong, Z Fang, Y Hao, G Reinman, P Wei
Proceedings of the 53rd Annual Design Automation Conference, 1-6, 2016
Cited by 191, 2016
Supporting address translation for accelerator-centric architectures
Y Hao, Z Fang, G Reinman, J Cong
2017 IEEE International Symposium on High Performance Computer Architecture …, 2017
Cited by 127, 2017
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by 111, 2022
In-depth analysis on microarchitectures of modern heterogeneous CPU-FPGA platforms
YK Choi, J Cong, Z Fang, Y Hao, G Reinman, P Wei
ACM Transactions on Reconfigurable Technology and Systems (TRETS) 12 (1), 1-20, 2019
Cited by 55, 2019
Hardware acceleration for an accurate stereo vision system using mini-census adaptive support region
Y Shan, Y Hao, W Wang, Y Wang, X Chen, H Yang, W Luk
ACM Transactions on Embedded Computing Systems (TECS) 13 (4s), 1-24, 2014
Cited by 48, 2014
On-chip interconnection network for accelerator-rich architectures
J Cong, M Gill, Y Hao, G Reinman, B Yuan
Proceedings of the 52nd Annual Design Automation Conference, 1-6, 2015
Cited by 40, 2015
Best-effort FPGA programming: A few steps can go a long way
J Cong, Z Fang, Y Hao, P Wei, CH Yu, C Zhang, P Zhou
arXiv preprint arXiv:1807.01340, 2018
Cited by 34, 2018
MTIA: First generation silicon targeting Meta's recommendation systems
A Firoozshahian, J Coburn, R Levenstein, R Nattoji, A Kamath, O Wu, ...
Proceedings of the 50th Annual International Symposium on Computer …, 2023
Cited by 22, 2023
DHEN: A deep and hierarchical ensemble network for large-scale click-through rate prediction
B Zhang, L Luo, X Liu, J Li, Z Chen, W Zhang, X Wei, Y Hao, M Tsang, ...
arXiv preprint arXiv:2203.11014, 2022
Cited by 21, 2022
FPGA based memory efficient high resolution stereo vision system for video tolling
Y Shan, Z Wang, W Wang, Y Hao, Y Wang, K Tsoi, W Luk, H Yang
2012 International Conference on Field-Programmable Technology, 29-32, 2012
Cited by 19, 2012
Software-hardware co-design of heterogeneous SmartNIC system for recommendation models inference and training
A Guo, Y Hao, C Wu, P Haghi, Z Pan, M Si, D Tao, A Li, M Herbordt, ...
Proceedings of the 37th International Conference on Supercomputing, 336-347, 2023
Cited by 18, 2023
Wukong: Towards a Scaling Law for Large-Scale Recommendation
B Zhang, L Luo, Y Chen, J Nie, X Liu, D Guo, Y Zhao, S Li, Y Hao, Y Yao, ...
arXiv preprint arXiv:2403.02545, 2024
Cited by 11, 2024
Rankitect: Ranking architecture search battling world-class engineers at Meta scale
W Wen, KH Liu, I Fedorov, X Zhang, H Yin, W Chu, K Hassani, M Sun, ...
Companion Proceedings of the ACM on Web Conference 2024, 73-82, 2024
Cited by 3, 2024
Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large Scale Recommendation
L Luo, B Zhang, M Tsang, Y Ma, CH Chu, Y Chen, S Li, Y Hao, Y Zhao, ...
Proceedings of Machine Learning and Systems 6, 266-278, 2024
Cited by 1, 2024
Reconfigurable Accelerator Compute Hierarchy: A Case Study using Content-Based Image Retrieval
N Farahpour, Y Hao, Z Fang, G Reinman
2020 IEEE International Symposium on Workload Characterization (IISWC), 276-287, 2020
Cited by 1, 2020
Architectural Techniques to Enhance the Efficiency of Accelerator-Centric Architectures
Y Hao
University of California, Los Angeles, 2018
2018
Articles 1–18