Award Abstract # 2327509
Collaborative Research: CIF: Small: Maximizing Coding Gain in Coded Computing

NSF Org: CCF
Division of Computing and Communication Foundations
Recipient: RUTGERS, THE STATE UNIVERSITY
Initial Amendment Date: July 24, 2023
Latest Amendment Date: July 24, 2023
Award Number: 2327509
Award Instrument: Standard Grant
Program Manager: Phillip Regalia
pregalia@nsf.gov
 (703)292-2981
CSE
 Directorate for Computer and Information Science and Engineering
Start Date: October 1, 2023
End Date: September 30, 2026 (Estimated)
Total Intended Award Amount: $375,000.00
Total Awarded Amount to Date: $375,000.00
Funds Obligated to Date: FY 2023 = $375,000.00
History of Investigator:
  • Emina Soljanin (Principal Investigator)
    emina.soljanin@rutgers.edu
Recipient Sponsored Research Office: Rutgers University New Brunswick
3 RUTGERS PLZ
NEW BRUNSWICK
NJ  US  08901-8559
(848)932-0150
Sponsor Congressional District: 12
Primary Place of Performance: Rutgers, The State University of New Jersey
3 RUTGERS PLZA
NEW BRUNSWICK
NJ  US  08901-8559
Primary Place of Performance Congressional District: 12
Unique Entity Identifier (UEI): M1LVPE5GLSD9
Parent UEI:
NSF Program(s): Comm & Information Foundations
Primary Program Source: 01002324DB NSF RESEARCH & RELATED ACTIVIT
Program Reference Code(s): 7797, 7923, 7937, 9102
Program Element Code(s): 779700
Award Agency Code: 4900
Fund Agency Code: 4900
Assistance Listing Number(s): 47.070

ABSTRACT

Artificial intelligence and machine learning algorithms rely on parallel, distributed computing systems to carry out intricate, data-heavy tasks efficiently. A significant challenge in designing large-scale distributed computing systems is handling the unpredictable variations in service times across servers. Computing redundancy, such as task replication, is a promising and powerful tool for reducing the overall variability in service time. This project focuses on the intelligent management of redundancy in distributed computing, which affects the execution efficiency of data-intensive algorithms in large-scale systems. The project will quantify the benefits of redundancy, a step pivotal to developing and ultimately deploying efficient redundancy schemes for executing artificial intelligence and machine learning workloads. The educational goals of the project include stimulating students' interest in applied probability and mathematical modeling and developing hands-on labs on cloud computing infrastructure. The project will contribute to research experiences for undergraduate and high school students and will recruit and mentor women and members of underrepresented groups.

This project considers distributed computing systems that use replication and erasure coding to reduce job execution times. The project aims to maximize the gain of using computing redundancy (coding gain) in practical scenarios. It complements recent work on redundancy in distributed systems, focusing primarily on designing redundancy schemes using erasure codes. The project will use statistical analysis and queueing and coding theories to make the following contributions: (i) characterization of the crucial effects of using redundancy in distributed computing, including analysis of the benefits and costs of redundancy; (ii) new mathematical models that capture the performance of distributed computing systems with stragglers; (iii) new analysis tools for computing coding gain in coded computing systems; (iv) development of redundancy management algorithms; (v) characterization of the diversity vs. parallelism trade-off; and (vi) addressing other critical issues in coded computing that do not exist in the better-understood replication solutions.
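
The diversity vs. parallelism trade-off named in item (v) can be illustrated with a small numerical sketch. The model below is an assumption made only for illustration and is not taken from the award: a job is split into k tasks and encoded into n tasks, each task takes a fixed time shift/k plus an exponential time with mean 1/k, and the job completes when any k of the n tasks finish. The function names `expected_job_time` and `simulated_job_time` are hypothetical.

```python
import random
import statistics


def expected_job_time(n, k, shift=0.0, mean_job=1.0):
    """Closed-form mean completion time under the assumed model.

    Assumed model (not from the abstract): a job is split into k tasks and
    encoded into n >= k tasks; each task takes shift/k plus an exponential
    time with mean mean_job/k; the job is done when any k tasks finish.
    """
    # Mean of the k-th smallest of n i.i.d. exponentials with mean m:
    # m * (1/n + 1/(n-1) + ... + 1/(n-k+1)).
    harmonic_part = sum(1.0 / i for i in range(n - k + 1, n + 1))
    return shift / k + (mean_job / k) * harmonic_part


def simulated_job_time(n, k, shift=0.0, mean_job=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    rate = k / mean_job  # expovariate takes the rate, i.e. 1 / mean
    samples = []
    for _ in range(trials):
        task_times = sorted(rng.expovariate(rate) for _ in range(n))
        # The job finishes with the k-th fastest of the n tasks.
        samples.append(shift / k + task_times[k - 1])
    return statistics.fmean(samples)


if __name__ == "__main__":
    n = 4
    for k in range(1, n + 1):
        print(k, round(expected_job_time(n, k, shift=1.0), 4),
              round(simulated_job_time(n, k, shift=1.0), 4))
```

In this sketch, k = 1 corresponds to n-fold replication (maximum diversity) and k = n to splitting with no redundancy (maximum parallelism). Under the assumed model with n = 4 and shift = 1, the closed form is smallest at the intermediate value k = 3, while with shift = 0 it is smallest at k = 1. The best operating point therefore depends on the service-time distribution.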

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

Birnie, Dunbar and Cheng, Christopher and Soljanin, Emina "Information Rates With Non Ideal Photon Detectors in Time-Entanglement Based QKD" IEEE Transactions on Communications, v.71, 2023. https://doi.org/10.1109/TCOMM.2023.3244244
