NSF Org: | CCF Division of Computing and Communication Foundations |
Recipient: |
|
Initial Amendment Date: | July 24, 2023 |
Latest Amendment Date: | July 24, 2023 |
Award Number: | 2327509 |
Award Instrument: | Standard Grant |
Program Manager: | Phillip Regalia, pregalia@nsf.gov, (703)292-2981, CCF Division of Computing and Communication Foundations, CSE Direct For Computer & Info Scie & Enginr |
Start Date: | October 1, 2023 |
End Date: | September 30, 2026 (Estimated) |
Total Intended Award Amount: | $375,000.00 |
Total Awarded Amount to Date: | $375,000.00 |
Funds Obligated to Date: |
|
History of Investigator: |
|
Recipient Sponsored Research Office: |
3 RUTGERS PLZ NEW BRUNSWICK NJ US 08901-8559 (848)932-0150 |
Sponsor Congressional District: |
|
Primary Place of Performance: |
3 RUTGERS PLZA NEW BRUNSWICK NJ US 08901-8559 |
Primary Place of Performance Congressional District: |
|
Unique Entity Identifier (UEI): |
|
Parent UEI: |
|
NSF Program(s): | Comm & Information Foundations |
Primary Program Source: |
|
Program Reference Code(s): |
|
Program Element Code(s): |
|
Award Agency Code: | 4900 |
Fund Agency Code: | 4900 |
Assistance Listing Number(s): | 47.070 |
ABSTRACT
Artificial intelligence and machine learning algorithms rely on parallel, distributed computing systems to carry out intricate, data-heavy tasks efficiently. A significant challenge in designing large-scale distributed computing systems is addressing the unpredictable variations in service times across multiple servers. Computing redundancy, such as task replication, is a promising and powerful tool for curtailing the overall variability in service time. This project focuses on the intelligent management of redundancy in distributed computing, which will affect the execution efficiency of data-intensive algorithms in large-scale systems. The project will quantify the benefits of redundancy, a step pivotal to developing and ultimately deploying efficient redundancy schemes for executing artificial intelligence and machine learning workloads. The educational goals of the project include stimulating students' interest in applied probability and mathematical modeling and developing hands-on labs on cloud computing infrastructure. The project will contribute to the Research Experiences for Undergraduate and High School students and will recruit and mentor women and members of underrepresented groups.
This project considers distributed computing systems that use replication and erasure coding to reduce job execution times. The project aims to maximize the gain of using computing redundancy (coding gain) in practical scenarios. It complements recent work on redundancy in distributed systems, focusing primarily on designing redundancy schemes using erasure codes. The project will use statistical analysis and queueing and coding theories to make the following contributions: (i) characterization of the crucial effects of using redundancy in distributed computing, including analysis of the benefits and costs of redundancy; (ii) new mathematical models that capture the performance of distributed computing systems with stragglers; (iii) new analysis tools for computing coding gain in coded computing systems; (iv) development of redundancy management algorithms; (v) characterization of the diversity vs. parallelism trade-off; and (vi) addressing other critical issues in coded computing that do not exist in the better-understood replication solutions.
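As an illustration of the diversity vs. parallelism trade-off mentioned above, the following is a minimal sketch and not the project's own model. It assumes a job is split into k tasks and encoded into n tasks with an (n, k) erasure code, so the job finishes when any k of the n servers finish. It further assumes each task takes a deterministic time delta/k plus an independent exponential straggling delay with rate mu (a shifted-exponential model chosen here purely for illustration). The names `expected_job_time`, `simulate_job_time`, `delta`, and `mu` are hypothetical. Under these assumptions the expected job time is delta/k + (H_n - H_{n-k})/mu, where H_m is the m-th harmonic number.

```python
import random


def harmonic(m: int) -> float:
    """Return the m-th harmonic number H_m = 1 + 1/2 + ... + 1/m (H_0 = 0)."""
    return sum(1.0 / i for i in range(1, m + 1))


def expected_job_time(k: int, n: int, delta: float = 0.0, mu: float = 1.0) -> float:
    """Analytic mean job time under the assumed shifted-exponential model.

    The job completes when the k-th fastest of n servers finishes.
    The mean of the k-th smallest of n i.i.d. Exp(mu) variables is
    (H_n - H_{n-k}) / mu; delta / k is the assumed deterministic part.
    """
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    return delta / k + (harmonic(n) - harmonic(n - k)) / mu


def simulate_job_time(k: int, n: int, delta: float = 0.0, mu: float = 1.0,
                      trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Random straggling delay at each of the n servers.
        delays = sorted(rng.expovariate(mu) for _ in range(n))
        # The job is done once k servers have returned their results.
        total += delta / k + delays[k - 1]
    return total / trials


if __name__ == "__main__":
    n = 12
    for k in (1, 2, 3, 4, 6, 12):
        analytic = expected_job_time(k, n, delta=10.0)
        simulated = simulate_job_time(k, n, delta=10.0)
        print(f"k={k:2d}  analytic={analytic:.3f}  simulated={simulated:.3f}")
```

In this sketch, k = 1 corresponds to full replication (maximum diversity) and k = n to splitting with no redundancy (maximum parallelism). Intermediate values of k can give a lower expected job time than either extreme, depending on the assumed delta and mu.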
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.