Abdul Dakkak
Verified email at modular.com - Homepage
Accelerating reduction and scan using tensor core units
A Dakkak, C Li, J Xiong, I Gelado, W Hwu
Proceedings of the ACM International Conference on Supercomputing, 46-57, 2019
Evaluating characteristics of CUDA communication primitives on high-bandwidth interconnects
C Pearson, A Dakkak, S Hashash, C Li, IH Chung, J Xiong, WM Hwu
Proceedings of the 2019 ACM/SPEC International Conference on Performance …, 2019
Accelerating Fourier and number theoretic transforms using tensor cores and warp shuffles
S Durrani, MS Chughtai, M Hidayetoglu, R Tahir, A Dakkak, ...
2021 30th International Conference on Parallel Architectures and Compilation …, 2021
XSP: Across-stack profiling and analysis of machine learning models on GPUs
C Li, A Dakkak, J Xiong, W Wei, L Xu, W Hwu
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020
TrIMS: Transparent and isolated model sharing for low latency deep learning inference in function-as-a-service
A Dakkak, C Li, SG De Gonzalo, J Xiong, W Hwu
2019 IEEE 12th International Conference on Cloud Computing (CLOUD), 372-382, 2019
WebGPU: A scalable online development platform for GPU programming courses
A Dakkak, C Pearson, W Hwu
2016 IEEE International Parallel and Distributed Processing Symposium …, 2016
Recovering missing depth information from Microsoft’s Kinect
A Dakkak, A Husain
Proc. Embedded Vis. Alliance, 1-9, 2012
Enhancing the usability and utilization of accelerated architectures via Docker
N Haydel, S Gesing, I Taylor, G Madey, A Dakkak, SG De Gonzalo, ...
2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing …, 2015
Triolet: A programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing
C Rodrigues, T Jablin, A Dakkak, WM Hwu
ACM SIGPLAN Notices 49 (8), 247-258, 2014
Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs
C Li, A Dakkak, J Xiong, W Hwu
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2020
The design and implementation of a scalable deep learning benchmarking platform
C Li, A Dakkak, J Xiong, W Hwu
2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 414-425, 2020
Tangram: a high-level language for performance portable code synthesis
LW Chang, A Dakkak, CI Rodrigues, W Hwu
Programmability Issues for Heterogeneous Multicores, 2015
Transitioning HPC software to exascale heterogeneous computing
WM Hwu, LW Chang, HS Kim, A Dakkak, I El Hajj
2015 Computational Electromagnetics International Workshop (CEM), 1-2, 2015
FFT Blitz: the tensor cores strike back
S Durrani, MS Chughtai, A Dakkak, W Hwu, L Rauchwerger
Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021
MLModelScope: Evaluate and measure ML models within AI pipelines
A Dakkak, C Li, A Srivastava, J Xiong, WM Hwu
arXiv preprint arXiv:1811.09737, 2018
Across-stack profiling and characterization of machine learning models on GPUs
C Li, A Dakkak, J Xiong, W Wei, L Xu, W Hwu
arXiv preprint arXiv:1908.06869, 2019
A programming system for future proofing performance critical libraries
LW Chang, I El Hajj, HS Kim, J Gómez-Luna, A Dakkak, W Hwu
ACM SIGPLAN Notices 51 (8), 1-2, 2016
Frustrated with replicating claims of a shared model? A solution
A Dakkak, C Li, J Xiong, WM Hwu
arXiv preprint arXiv:1811.09737, 2018
Thoughts on massively-parallel heterogeneous computing for solving large problems
W Hwu, M Hidayetoglu, WC Chew, C Pearson, S Garcia, S Huang, ...
2017 Computing and Electromagnetics International Workshop (CEM), 67-68, 2017
MLModelScope: A distributed platform for model evaluation and benchmarking at scale
A Dakkak, C Li, J Xiong, W Hwu
arXiv preprint arXiv:2002.08295, 2020