Beyond Errors and Erasures: Coding for Data Management and Delivery in Networks

2021

Guest editors

Elza Erkip

Deniz Gündüz

Stratis Ioannidis

Joerg Kliewer

Derya Malak

Muriel Médard

R. Srikant

The CFP for JSAIT's 7th Special Issue. The focus of this special issue is on the applications of coding to the broad area of networking for efficient exploitation and delivery of data. Various coding techniques have been devised to tackle erasures and achieve fundamental limits of compression to recover a message with a fidelity criterion. Motivated by the research in this direction and a wide variety of applications at the intersection of distributed systems and networking, this special issue will focus on key aspects ranging from employment of coding for enhancing the efficiency of networking, protocols, computation and delivery in distributed systems, to maintaining consistency in updates and improving accessibility in distributed storage systems, as well as providing desired performance tradeoffs in terms of efficiency, delay and atomicity.

Elza Erkip Deniz Gündüz Stratis Ioannidis Joerg Kliewer Derya Malak Muriel Médard R. Srikant

It is our pleasure to share with you this special issue, providing a snapshot of the current evolution of coding for data management and delivery in networks. Using coding to provide flexibility and efficiency in data management, rather than merely as tool to combat locally bit rot or transmission impediments, has become an increasingly rich and active domain of investigation. It weaves themes of protocol design, resource allocation, quality of experience management and code construction. These aspects arise in the papers of this issue, each of which individually represents one facet of the current state of the art, and collectively we hope constitute a well-cut gem.

Manuj Mukherjee Ran Gelles

We consider computations over networks with multiple broadcast channels that intersect at a single party. Each broadcast link suffers from random bit-flip noise that affects the receivers independently. We design interactive coding schemes that successfully perform any computation over these noisy networks and strive to reduce their communication overhead with respect to the original (noiseless) computation. A simple variant of a coding scheme by Rajagopalan and Schulman (STOC 1994) shows that any (noiseless) protocol of $R$ rounds can be reliably simulated in $O(R\log n)$ rounds over a network with $n=n_{1}n_{2}+1$ parties in which a single party is connected to $n_{2}$ noisy broadcast channels, each of which connects $n_{1}$ distinct parties. We design new coding schemes with improved overheads. Our approach divides the network into four regimes according to the relationship between $n_{1}$ and $n_{2}$ . We employ a two-layer coding where the inner code protects each broadcast channel and is tailored to the specific conditions of the regime in consideration. The outer layer protects the computation in the network and is generally based on the scheme of Rajagopalan and Schulman, adapted to the case of broadcast channels. The overhead we obtain ranges from $O(\log \log n_{2})$ to $O\left({\log n_{2} \frac {\log \log n_{1}}{\log n_{1}}}\right)$ and beats the trivial $O(\log n)$ overhead in all four regimes.

Nitish Mital Katina Kralevska Cong Ling Deniz Gündüz

We consider a distributed storage system with $n$ nodes, where a user can recover the stored file from any $k$ nodes, and study the problem of repairing $r$ partially failed nodes. We consider broadcast repair, that is, $d$ surviving nodes transmit broadcast messages on an error-free wireless channel to the $r$ nodes being repaired, which are then used, together with the surviving data in the local memories of the failed nodes, to recover the lost content. First, we derive the trade-off between the storage capacity and the repair bandwidth for partial repair of multiple failed nodes, based on the cut-set bound for information flow graphs. It is shown that utilizing the broadcast nature of the wireless medium and the surviving contents at the partially failed nodes reduces the repair bandwidth per node significantly. Then, we list a set of invariant conditions that are sufficient for a functional repair code to be feasible. We further propose a scheme for functional repair of multiple failed nodes that satisfies the invariant conditions with high probability, and its extension to the repair of partial failures. The performance of the proposed scheme meets the cut-set bound on all the points on the trade-off curve for all admissible parameters when $k$ is divisible by $r$ , while employing linear subpacketization, which is an important practical consideration in the design of distributed storage codes. Unlike random linear codes, which are conventionally used for functional repair of failed nodes, the proposed repair scheme has lower overhead, lower input-output cost, and lower computational complexity during repair.

Sijie Li Rawad Bitar Sidharth Jaggi Yihan Zhang

We consider the problem of reliable communication over a network containing a hidden myopic adversary who can eavesdrop on some $z_{ro}$ links, jam some $z_{wo}$ links, and do both on some $z_{rw}$ links. We provide the first information-theoretically tight characterization of the optimal rate of communication possible under all possible settings of the tuple $(z_{ro},z_{wo},z_{rw})$ by providing a novel coding scheme/analysis for a subset of parameter regimes. In particular, our vanishing-error schemes bypass the Network Singleton Bound (which requires a zero-error recovery criteria) in a certain parameter regime where the capacity had been heretofore open. As a direct corollary we also obtain the capacity of the corresponding problem where information-theoretic secrecy against eavesdropping is required in addition to reliable communication.

Yanyan Dong Sheng Jin Yanzuo Chen Shenghao Yang Hoover H. F. Yin

BATS (BATched Sparse) codes are a class of efficient random linear network coding variation that has been studied for multihop wireless networks mostly in scenarios of a single communication flow. Towards sophisticated multi-flow network communications, we formulate a network utility maximization (NUM) problem that jointly optimizes the BATS code parameters of all the flows and network scheduling. The NUM problem adopts a batch-wise packet loss model that can be obtained from the network local statistics without any constraints on packet loss patterns. Moreover, the NUM problem allows a different number of recoded packets to be transmitted for different batches in a flow, which is called adaptive recoding. Due to both the probably nonconcave objective and the BATS code-related variables, the algorithms developed for the existing flow optimization problems cannot be applied directly to solve our NUM problem. We introduce a two-step algorithm to solve our NUM problem, where the first step solves the problem with nonadaptive recoding schemes, and the second step optimizes adaptive recoding hop-by-hop from upstream to downstream in each flow. We perform various numerical evaluations and simulations to verify the effectiveness and efficiency of the algorithm.

Hoover H. F. Yin Ka Hei Ng Allen Z. Zhong Raymond W. Yeung Shenghao Yang Ian Y. Y. Chan

Batched network coding (BNC) is a low-complexity solution to network transmission in multi-hop packet networks with packet loss. BNC encodes the source data into batches of packets. As a network coding scheme, the intermediate nodes perform recoding on the received packets belonging to the same batch instead of just forwarding them. A recoding scheme that may generate more recoded packets for batches of a higher rank is also called adaptive recoding. Meanwhile, in order to combat burst packet loss, the transmission of a block of batches can be interleaved. Stream interleaving studied in literature achieves the maximum separation among any two consecutive packets of a batch, but permutes packets across blocks and hence cannot bound the buffer size and the latency. To resolve the issue of stream interleaver, we design an intrablock interleaver for adaptive recoding that can preserve the advantages of using a block interleaver when the number of recoded packets is the same for all batches. We use potential energy in classical mechanics to measure the performance of an interleaver, and propose an algorithm to optimize the interleaver with this performance measure. Our problem formulation and algorithm for intrablock interleaving are also of independent interest.

Hoover H. F. Yin Bin Tang Ka Hei Ng Shenghao Yang Xishi Wang Qiaoqiao Zhou

Batched network coding is a variation of random linear network coding which has low computational and storage costs. In order to adapt to random fluctuations in the number of erasures in individual batches, it is not optimal to recode and transmit the same number of packets for all batches. Different distributed optimization models, which are called adaptive recoding schemes, were formulated for this purpose. The key component of these optimization problems is the expected value of the rank distribution of a batch at the next network node, which is also known as the expected rank. In this paper, we put forth a unified adaptive recoding framework with an arbitrary recoding field size. We show that the expected rank functions are concave when the packet loss pattern is a stationary stochastic process, which covers but not limited to independent packet loss and Gilbert-Elliott packet loss model. Under this concavity assumption, we show that there always exists a solution which not only can minimize the randomness on the number of recoded packets but also can tolerate rank distribution errors due to inaccurate measurements or limited precision of the machine. We provide an algorithm to obtain such an optimal solution, and propose tuning schemes that can turn any feasible solution into a desired optimal solution.

Mahdi Haghifam M. Nikhil Krishnan Ashish Khisti Xiaoqing Zhu Wai-Tian Tan John Apostolopoulos

Error control codes for real-time interactive applications such as audio and video streaming must operate under strict delay constraints and be resilient to burst losses. Previous works have characterized optimal streaming codes that guarantee perfect and timely recovery of all source packets when the burst loss is below a certain maximum threshold. In this work, we generalize the notion of streaming codes to the unequal error protection (UEP) setting. Toward this end, we define two natural notions of streaming codes; symbol-level UEP and packet-level UEP. In the symbol-level UEP, the symbols within each source packet have varying recoverability requirements. In the packet-level UEP scenario, packets at even time slots and odd time slots have different recovery guarantees. We discuss practical motivations for both settings and develop coding schemes. We establish optimality or near-optimality guarantees through information-theoretic converse bounds. Simulations over Gilbert and Fritchman channels show that our coding schemes outperform baseline schemes over a wide range of channel parameters.

Alireza Vahid

We study the problem of content delivery in two-user interference channels with altering topology, random available cache at the receivers, and delayed channel knowledge at the transmitters. We establish a new set of outer-bounds on the achievable rates when each receiver has access to a random fraction of the message intended for the other receiver, and when each transmitter is aware of which part of its own message is known to the unintended receiver. The outer-bounds reveal the significant potential rate boost associated with even a small amount of side-information at each receiver. The key in deriving the bounds is to quantify the baseline entropy that will always become available to the unintended receiver given the altering topology and the already available side-information. We will also present matching achievable rates in certain scenarios and outline the challenges in more general settings.

Alireza Vahid Shih-Chun Lin I-Hsiang Wang Yi-Chun Lai

We study the content delivery problem between a transmitter and two receivers through erasure links, when each receiver has access to some random side-information about the files requested by the other user. The random side-information is cached at the receiver via the decentralized content placement. The distributed nature of the receiving terminals may also make it harder for the transmitter to learn the erasure state of the two links and indices of the cached files. We thus investigate the capacity gain due to various levels of availability of channel state and cache index information at the transmitter. More precisely, we cover a wide range of settings from global delayed channel state knowledge and a non-blind transmitter (i.e., one that knows the exact cache index information at each receiver) all the way to no channel state information and a blind transmitter (i.e., one that only statistically knows cache index information at the receivers). We derive new inner and outer bounds for the problem under various settings and provide the conditions under which the two match and the capacity region is characterized. Surprisingly, for some interesting cases, the capacity region remains unchanged even with only single-user channel state or single-user cache index information at the transmitter.

Yasser Fadlallah Othmane Oubejja Sarah Kamel Philippe Ciblat Michèle Wigger Jean-Marie Gorce

This paper proposes an extended coded caching scheme based on piggyback coding for single-server multi-user networks with decentralized caching. The proposed scheme is obtained by adapting Polar codes and extending the original coded caching scheme, which is based on index coding and a data assignment that can be implemented via minimum graph-colouring. Polar codes are adapted so that users can apply parts of their cache contents as the “frozen bits” for Polar decoding, and the coded caching is adapted so as to account for different user coding rates and to combine transmissions to cache-aided and cache-free users. Numerical simulations prove that our piggyback-coding based scheme achieves higher rates than previous schemes also in the finite block-length regime. Real testbed measurements are presented, which validate the practical implementation.

Hui Wang Qingchun Chen Qin Huang Xiaohu Tang

In this paper, we will show how to interpret the coded caching design from error control coding perspective. It is shown that, when the cached and non-cached contents in the placement phase is thought of as the shortened system packets and the punctured system packets, respectively, while those coded contents transmitted in the delivery phase specify the parity packets, the coded caching design can be reformulated as a collaborative error control coding problem. The challenges for arbitrary user requests and noncooperative decoding nature at every user will be highlighted to address the design criteria in order to exploit the coding gain residing in the coded caching. Our analysis unveils that, with some supplementary parity packets (SPPs) included in either the placement phase or the delivery phase, noticeable transmission reliability improvement can be realized. It is shown that the proposed design is able to flexibly fulfill the asymmetric reliable transmission requirement by placing or transmitting some SPPs only for those users in adverse conditions. It is also shown that the proposed reliable coded caching and channel coding can be further integrated into the joint network-channel coding (JNCC) framework to fully exploit the benefits of those two schemes.

Abdelrahman M. Ibrahim Ahmed A. Zewail Aylin Yener

This paper considers a cache-aided network where the users have access to helper-caches with heterogeneous sizes. First, coded placement schemes are proposed that exploit the heterogeneity in cache sizes when one user is connected to each cache. In the proposed scheme, the unicast/multicast signals intended to serve users connected to small memories are utilized in decoding the contents of the larger memories. A reduction in delivery load with coded placement is shown compared to uncoded placement for three-user systems with arbitrary cache sizes and larger systems in the small total memory regime. Next, systems with equal-size caches where multiple users are associated with each cache are considered. It is shown that coded placement outperforms the best uncoded placement scheme. In the proposed scheme, the unicast/multicast signals sent to the overloaded helper-caches facilitate the decoding of the coded subfiles stored at the underloaded helper-caches. The gain from coded placement is explicitly characterized for two-cache systems. For larger systems, the parameters of the coded placement scheme are obtained by optimization. It is observed that the gain from coded placement becomes more evident with increasing asymmetry in users’ connectivity. Finally, a unified coded placement scheme for two-cache systems that exploits the asymmetry in both the cache sizes and the connectivity pattern is presented.

Suman Ghosh Prasad Krishnan Lakshmi Prasad Natarajan

We consider the centralized coded caching system where a library of files is available at the server and their subfiles are cached at the clients as prescribed by a placement delivery array (PDA). We are interested in the problem where a specific file in the library is replaced with a new file at the server, the contents of which are correlated with the file being replaced, and this change needs to be communicated to the caches. Upon replacement, the server has access only to the updated file and is unaware of its differences with the original, while each cache has access to specific subfiles of the original file as dictated by the PDA. Scenarios such as (a) updating an unlogged file library, (b) ensuring privacy of updated subfiles against an eavesdropper, and (c) correcting errors in the client caches, can be captured as special cases of this problem framework. We model the correlation between the updated and the original files by assuming that they differ in at the most $\epsilon $ subfiles, and aim to reduce the number of bits broadcast by the server to update the caches. We design a new elegant coded transmission strategy for the server to update the caches blindly, and also identify another simple scheme that is based on MDS codes. We then derive converse bounds on the minimum communication cost $\ell ^{*}$ among all linear strategies. For two well-known families of PDAs – Maddah-Ali & Niesen’s caching scheme and a PDA by Tang & Ramamoorthy and Yan et al. – our new scheme has cost $\ell ^{*}(1 + o(1))$ as the number of users grows large, when the updates are sufficiently sparse and the caching ratio is an arbitrary constant; whereas the scheme using MDS codes has order-optimal cost when the updates are dense.

Kota Srinivas Reddy Nikhil Karamchandani

Index coding and coded caching are two active research topics in information theory with strong ties to each other. Motivated by the multi-access coded caching problem, we study a new class of structured index coding problems (ICPs) which are formed by the union of several symmetric ICPs. We derive upper and lower bounds on the optimal server transmission rate for this class of ICPs and demonstrate that they differ by at most a factor of two. Finally, we apply these results to the multi-access coded caching problem to derive better bounds than the state of the art.

Arman Sharififar Neda Aboutorab Parastoo Sadeghi

In this paper, we propose a new scalar linear coding scheme for the index coding problem called update-based maximum column distance (UMCD) coding scheme. The central idea in each transmission is to code messages such that one of the receivers with the minimum size of side information is instantaneously eliminated from unsatisfied receivers. One main contribution of the paper is to prove that the other satisfied receivers can be identified after each transmission, using a polynomial-time algorithm solving the well-known maximum cardinality matching problem in graph theory. This leads to determining the total number of transmissions without knowing the coding coefficients. Once this number and what messages to transmit in each round are found, we then propose a method to determine all coding coefficients from a sufficiently large finite field. We provide concrete instances where the proposed UMCD coding scheme has a better broadcast performance compared to the most efficient existing coding schemes, including the recursive scheme (Arbabjolfaei and Kim, 2014) and the interlinked-cycle cover (ICC) scheme (Thapa et al., 2017). We prove that the proposed UMCD coding scheme performs at least as well as the MDS coding scheme in terms of broadcast rate. By characterizing two classes of index coding instances, we show that the gap between the broadcast rates of the recursive and ICC schemes and the UMCD scheme grows linearly with the number of messages. Then, we extend the UMCD coding scheme to its vector version by applying it as a basic coding block to solve the subinstances.

This index covers all technical items - papers, correspondence, reviews, etc. - that appeared in this periodical during the year, and items from previous years that were commented upon or corrected in this year. Departments and other items may also be covered if they have been judged to have archival value. The Author Index contains the primary entry for each item, listed under the first author's name. The primary entry includes the co-authors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination. The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages. Note that the item title is found only under the primary entry in the Author Index.

�鶹��ýAV

Table of Contents