Articles with public access mandates - Nathan DeBardeleben
Not available anywhere: 17
TensorFI: A configurable fault injector for TensorFlow applications
G Li, K Pattabiraman, N DeBardeleben
2018 IEEE International symposium on software reliability engineering …, 2018
Mandates: US Department of Energy, Natural Sciences and Engineering Research Council …
SDC is in the eye of the beholder: A survey and preliminary study
B Fang, P Wu, Q Guan, N DeBardeleben, L Monroe, S Blanchard, Z Chen, ...
2016 46th Annual IEEE/IFIP International Conference on Dependable Systems …, 2016
Mandates: US Department of Energy, Natural Sciences and Engineering Research Council …
Towards building resilient scientific applications: Resilience analysis on the impact of soft error and transient error tolerance with the CLAMR hydrodynamics mini-app
Q Guan, N DeBardeleben, B Atkinson, R Robey, WM Jones
2015 IEEE International Conference on Cluster Computing, 176-179, 2015
Mandates: US Department of Energy
Empirical studies of the soft error susceptibility of sorting algorithms to statistical fault injection
Q Guan, N DeBardeleben, S Blanchard, S Fu
Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale …, 2015
Mandates: US Department of Energy
Fault injection experiments with the CLAMR hydrodynamics mini-app
B Atkinson, N DeBardeleben, Q Guan, R Robey, WM Jones
2014 IEEE International Symposium on Software Reliability Engineering …, 2014
Mandates: US Department of Energy
Thermal neutrons: a possible threat for supercomputer reliability
D Oliveira, S Blanchard, N DeBardeleben, F Fernandes dos Santos, ...
The Journal of Supercomputing 77, 1612-1634, 2021
Mandates: US Department of Energy, UK Science and Technology Facilities Council …
Exploring the tradeoff between reliability and performance in HPC systems
C Walker, B Slade, G Bailey, N Przybylski, N DeBardeleben, WM Jones
2021 IEEE High Performance Extreme Computing Conference (HPEC), 1-7, 2021
Mandates: US Department of Energy
Enhancing HPC system log analysis by identifying message origin in source code
M Hickman, D Fulp, E Baseman, S Blanchard, H Greenberg, W Jones, ...
2018 IEEE International Symposium on Software Reliability Engineering …, 2018
Mandates: US Department of Energy
On the inherent resilience of integer operations
L Monroe, WM Jones, SR Lavigne, CH Davis, Q Guan, N DeBardeleben
Euro-Par 2016: Parallel Processing Workshops: Euro-Par 2016 International …, 2017
Mandates: US Department of Energy
Impact of contextual error correction techniques in CLAMR
D Wallace, WM Jones, R Robey, L Monroe, T Grové, N DeBardeleben
2020 SoutheastCon, 1-2, 2020
Mandates: US Department of Energy
An overview of the risk posed by thermal neutrons to the reliability of computing devices
D Oliveira, S Blanchard, N DeBardeleben, FF dos Santos, GP Davila, ...
2020 50th Annual IEEE-IFIP International Conference on Dependable Systems …, 2020
Mandates: US Department of Energy, UK Science and Technology Facilities Council …
Extreme scale and bleeding edge technology lead to a need for resilient high performance computing systems
N DeBardeleben
2016 IEEE International Reliability Physics Symposium (IRPS), 3B-1-1-3B-1-8, 2016
Mandates: US Department of Energy
Incorporating Staggered Planned Maintenance Reservations to Improve Performance in Computational Clusters
WM Jones, CS Walker, VE Hafener, WD Graham, NA DeBardeleben, ...
2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER …, 2023
Mandates: US Department of Energy
Online Detection and Classification of State Transitions of Multivariate Shock and Vibration Data
N Przybylski, WM Jones, N DeBardeleben
2022 IEEE High Performance Extreme Computing Conference (HPEC), 1-7, 2022
Mandates: US Department of Energy
Statistical Framework for Two-Party Acceptance Testing of HPC Systems for Reliability
N DeBardeleben, T Burr, S Penton, C Walker, J Loncaric, WM Jones
2021 IEEE/ACM 11th Workshop on Fault Tolerance for HPC at eXtreme Scale …, 2021
Mandates: US Department of Energy
Do Solar Proton Events Reduce the Number of Faults in Supercomputers?: A Comparative Analysis of Faults During and without Solar Proton Events
CMK Bowen, N DeBardeleben, S Blanchard, C Anderson-Cook
2019 IEEE International Reliability Physics Symposium (IRPS), 1-5, 2019
Mandates: US Department of Energy
Resilience Analysis of Top K Selection Algorithms
R Slechta, L Monroe, N DeBardeleben, Q Guan, J Wendelberger, ...
2017 13th European Dependable Computing Conference (EDCC), 42-49, 2017
Mandates: US Department of Energy
Available somewhere: 30
Memory errors in modern systems: The good, the bad, and the ugly
V Sridharan, N DeBardeleben, S Blanchard, KB Ferreira, J Stearley, ...
ACM SIGARCH Computer Architecture News 43 (1), 297-310, 2015
Mandates: US Department of Energy
Understanding GPU errors on large-scale HPC systems and the implications for system design and operation
D Tiwari, S Gupta, J Rogers, D Maxwell, P Rech, S Vazhkudai, D Oliveira, ...
2015 IEEE 21st International Symposium on High Performance Computer …, 2015
Mandates: US Department of Energy
On the diversity of cluster workloads and its impact on research results
G Amvrosiadis, JW Park, GR Ganger, GA Gibson, E Baseman, ...
2018 USENIX Annual Technical Conference (USENIX ATC 18), 533-546, 2018
Mandates: US Department of Energy
Publication and funding information is determined automatically by a computer program.