TY - GEN
T1 - Estimating the accuracy of user surveys for assessing the impact of HPC systems
AU - Hart, David
AU - Rishel, Melissa
AU - Nychka, Doug
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/7/17
Y1 - 2016/7/17
AB - Each year, the Computational & Information Systems Laboratory (CISL) conducts a survey of its current and recent user community to gather a number of metrics about the scientific impact and outcomes from the use of CISL's high-performance computing systems, particularly peer-reviewed publications. However, with a modest response rate and reliance on self-reporting by users, the accuracy of the survey is uncertain, as is the degree of that uncertainty. To quantify this uncertainty, CISL undertook a project that attempted to provide statistically supported limits on the accuracy and precision of the survey approach. We discovered limitations related to the range of users' HPC usage in our modeling phase, and several methods were attempted to adjust the model to fit the usage data. The resulting statistical models leverage data about the HPC usage associated with survey invitees to quantify the degree to which the survey undercounts the relevant publications. A qualitative assessment of the collected publications aligns with the statistical models, reiterates the challenges associated with acknowledgment for use of HPC resources, and suggests ways to improve the survey results further.
KW - Science impact
KW - Supercomputers
KW - User publications
UR - https://www.scopus.com/pages/publications/84989169887
U2 - 10.1145/2949550.2949583
DO - 10.1145/2949550.2949583
M3 - Conference contribution
AN - SCOPUS:84989169887
T3 - ACM International Conference Proceeding Series
BT - Proceedings of XSEDE 2016
PB - Association for Computing Machinery
T2 - Conference on Diversity, Big Data, and Science at Scale, XSEDE 2016
Y2 - 17 July 2016 through 21 July 2016
ER -