IEEE

2023
Dimensions of Channel Coding: Special Issue Dedicated to the Memory of Alexander Vardy
Guest editors
Tuvi Etzion
Paul H. Siegel
Han Mao Kiah
Hessam Mahdavifar
Farzad Parvaresh
Moshe Schwartz
Ido Tal
Eitan Yaakobi
Xinmiao Zhang

This special issue of the IEEE Journal on Selected Areas in Information Theory is dedicated to the memory of Alexander Vardy, a pioneer in the theory and practice of channel coding. His ground-breaking contributions ranged from unexpected solutions of longstanding theoretical conjectures to ingenious decoding algorithms that broke seemingly insurmountable barriers to code performance. Inspired not just by the mathematical beauty of coding theory but also by its engineering utility, Alexander Vardy developed novel coding techniques that have had a profound impact on modern information technology, including computer memories, data storage systems, satellite communications, and wireless cellular networks. At the same time, his innovations left their imprint on other scientific disciplines, such as information theory, computer science, and discrete mathematics.

Daniel Pook-Kolb    Erik Agrell    Bruce Allen

We give a detailed description of the Voronoi region of the Barnes–Wall lattice $\Lambda _{16}$ , including its vertices, relevant vectors, and symmetry group. The exact value of its quantizer constant is calculated, which was previously only known approximately. To verify the result, we estimate the same constant numerically and propose a new, very simple method to quantify the variance of such estimates, which is far more accurate than the commonly used jackknife estimator.
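
The quantizer constant in question is the normalized second moment of the Voronoi region. As a rough illustration of the numerical side, here is a minimal Monte Carlo sketch for the integer lattice $\mathbb{Z}^n$ (where nearest-point quantization is just rounding and the exact constant is 1/12), together with the jackknife variance estimator that the paper improves upon; substituting a Barnes–Wall nearest-point search for `np.round` would give the $\Lambda_{16}$ version.

```python
import numpy as np

def nsm_samples(n=16, num_samples=100_000, seed=0):
    """Per-sample contributions to the normalized second moment (NSM) of Z^n.

    For the cubic lattice the nearest lattice point is coordinate-wise
    rounding and the exact NSM is 1/12; a Barnes-Wall version would replace
    np.round with a Lambda_16 nearest-point search.
    """
    rng = np.random.default_rng(seed)
    x = rng.random((num_samples, n))          # uniform over a fundamental cell
    e = x - np.round(x)                       # quantization error
    return np.sum(e * e, axis=1) / n          # ||e||^2 / n, cell volume is 1

samples = nsm_samples()
G_hat = samples.mean()

# Jackknife variance of the mean (the baseline estimator the paper improves on):
n_s = len(samples)
jack_means = (samples.sum() - samples) / (n_s - 1)   # leave-one-out means
var_jack = (n_s - 1) / n_s * np.sum((jack_means - jack_means.mean()) ** 2)

print(f"G_hat = {G_hat:.6f} (exact for Z^n: {1/12:.6f}), jackknife var = {var_jack:.3e}")
```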

V. Arvind Rameshwar    Navin Kashyap

In this paper, we study binary constrained codes that are resilient to bit-flip errors and erasures. In our first approach, we compute the sizes of constrained subcodes of linear codes. Since there exist well-known linear codes that achieve vanishing probabilities of error over the binary symmetric channel (which causes bit-flip errors) and the binary erasure channel, constrained subcodes of such linear codes are also resilient to random bit-flip errors and erasures. We employ a simple identity from the Fourier analysis of Boolean functions, which transforms the problem of counting constrained codewords of linear codes to a question about the structure of the dual code. We illustrate the utility of our method in providing explicit values or efficient algorithms for our counting problem, by showing that the Fourier transform of the indicator function of the constraint is computable, for different constraints. Our second approach is to obtain good upper bounds, using an extension of Delsarte's linear program (LP), on the largest sizes of constrained codes that can correct a fixed number of combinatorial errors or erasures. We observe that the numerical values of our LP-based upper bounds beat the generalized sphere packing bounds of Fazeli et al. (2015).
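
The Fourier identity at the heart of the first approach is Poisson summation for linear codes: $\sum_{c \in C} f(c) = \frac{|C|}{2^n} \sum_{d \in C^\perp} \hat{f}(d)$, where $f$ is the indicator of the constraint. A brute-force sketch verifying the identity on a toy $[6,3]$ code (the generator matrix is an assumption for illustration) with the no-two-adjacent-ones constraint:

```python
import itertools
import numpy as np

def rll_ok(word):
    """Indicator of the (1, inf)-RLL constraint: no two adjacent 1s."""
    return all(not (a and b) for a, b in zip(word, word[1:]))

n = 6
G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1]])           # toy [6,3] code (illustrative)

def span(M):
    k = M.shape[0]
    return {tuple(np.mod(u @ M, 2)) for u in itertools.product((0, 1), repeat=k)}

C = span(G)
dual = [d for d in itertools.product((0, 1), repeat=n)
        if all(np.dot(d, c) % 2 == 0 for c in C)]

direct = sum(rll_ok(c) for c in C)           # count constrained codewords directly

# Poisson summation: sum_{c in C} f(c) = (|C| / 2^n) * sum_{d in dual} f_hat(d)
def f_hat(d):
    return sum((-1) ** (np.dot(d, x) % 2)
               for x in itertools.product((0, 1), repeat=n) if rll_ok(x))

via_dual = len(C) / 2 ** n * sum(f_hat(d) for d in dual)
print(direct, via_dual)                      # the two counts agree
```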

Yok Jye Tang    Xinmiao Zhang

Generalized integrated interleaved (GII) codes are essential to next-generation digital communication and storage systems since they can achieve very high decoding throughput with low complexity. Only hard-decision GII decoding has been considered in previous work. To further improve the error-correcting capability, soft-decision decoding algorithms utilizing the channel probability information need to be developed. The decoding of GII codes constructed based on BCH codes consists of multiple rounds of BCH decoding. Among existing soft-decision decoding algorithms for BCH codes, the Chase algorithm, which carries out decoding trials on multiple test vectors, achieves a better trade-off between coding gain and complexity. Although one-pass Chase algorithms can derive the error-locator polynomials for all the test vectors in one run, previous designs carry out an exhaustive Chien search on each error-locator polynomial to decide which one is correct, which leads to long latency. For the first time, this paper proposes an efficient soft-decision GII decoding algorithm. Different methods of incorporating the Chase process into the GII scheme are analyzed and compared to identify the best GII Chase decoding algorithm. Besides, a new error-locator polynomial selection scheme is developed that avoids carrying out the Chien search on each error-locator polynomial by pre-flipping a bit in the received word. Accordingly, the error-locator polynomial can be selected by testing whether it contains a pre-determined factor. The latency is further reduced by pre-computing short remainder polynomials in our second proposed scheme. In addition, formulas have been developed to estimate the error-correcting performance of the proposed designs. This paper also develops low-complexity hardware architectures to implement the proposed GII-BCH Chase decoders. For an example GII-BCH code with 8 sub-codewords of 4095 bits over $GF(2^{12})$ , the proposed GII-BCH Chase decoder can achieve significant coding gain over the hard-decision decoder with negligible silicon area overhead. Besides, our proposed designs reduce the worst-case latency of GII-BCH Chase nested decoding rounds by 54%-80%.
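
For context, the Chase process referred to above decodes $2^\eta$ test vectors obtained by flipping every subset of the $\eta$ least reliable positions of the hard-decision word. A minimal sketch of just that test-vector generation (the paper's polynomial-selection and remainder tricks are not shown):

```python
import itertools
import numpy as np

def chase_test_vectors(hard, reliab, eta=3):
    """Generate the 2^eta Chase test vectors by flipping every subset of the
    eta least-reliable bit positions of the hard-decision word."""
    hard = np.asarray(hard)
    lrps = np.argsort(reliab)[:eta]              # least reliable positions
    vectors = []
    for mask in itertools.product((0, 1), repeat=eta):
        v = hard.copy()
        v[lrps] ^= np.array(mask)                # flip the chosen subset
        vectors.append(v)
    return vectors

hard = np.array([1, 0, 1, 1, 0, 0, 1])
reliab = np.array([0.9, 0.1, 0.8, 0.2, 0.95, 0.15, 0.7])
print(len(chase_test_vectors(hard, reliab)))     # 8 test vectors
```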

Johan Chrisnata    Han Mao Kiah    Alexander Vardy    Eitan Yaakobi

Motivated by DNA-based applications, we generalize the bee identification problem proposed by Tandon et al. (2019). In this setup, we transmit all $M$ codewords from a codebook over some channel and each codeword results in $N$ noisy outputs. Then our task is to identify each codeword from this unordered set of $MN$ noisy outputs. First, via a reduction to a minimum-cost flow problem on a related bipartite flow network called the input-output flow network, we show that the problem can be solved in $O(M^{3})$ time in the worst case. Next, we consider the deletion and the insertion channels individually, and in both cases, we study the expected number of edges in their respective input-output networks. Specifically, we obtain closed-form expressions for this quantity for certain codebooks and when the codebook comprises all binary words, we show that this quantity is sub-quadratic when the deletion or insertion probability is less than 1/2. This then implies that the expected running time to perform joint decoding for this codebook is $o(M^{3})$ . For other codebooks, we develop methods to compute the expected number of edges efficiently. Finally, we adapt classical peeling-decoding techniques to reduce the number of nodes and edges in the input-output flow network.
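
As a sketch of the joint decoder's reduction to matching, the following replaces the paper's min-cost flow formulation with an equivalent assignment problem on a cost matrix of Hamming distances (a maximum-likelihood cost for a BSC with crossover below 1/2, used here in place of the deletion/insertion channels studied in the paper); each codeword is replicated $N$ times so that every output gets matched:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def identify(codebook, outputs, N):
    """Joint identification: assign each of the M*N unordered outputs to a
    codeword, each codeword used exactly N times, minimizing total Hamming
    distance. Replicating each codeword N times turns the flow problem into
    a square assignment problem."""
    rows = np.repeat(codebook, N, axis=0)               # M*N candidate slots
    cost = (rows[:, None, :] != outputs[None, :, :]).sum(axis=2)
    r, c = linear_sum_assignment(cost)
    labels = np.empty(len(outputs), dtype=int)
    labels[c] = r // N                                  # output -> codeword index
    return labels

rng = np.random.default_rng(1)
codebook = rng.integers(0, 2, size=(4, 16))             # M=4 random length-16 words
N = 2
sent = np.repeat(codebook, N, axis=0)
noisy = sent ^ (rng.random(sent.shape) < 0.05).astype(int)   # BSC(0.05)
rng.shuffle(noisy)                                      # unordered outputs
print(identify(codebook, noisy, N))
```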

Han Mao Kiah    Alexander Vardy    Hanwen Yao

The bee-identification problem, formally defined by Tandon, Tan, and Varshney (2019), requires the receiver to identify "bees" using a set of unordered noisy measurements. In this previous work, Tandon, Tan, and Varshney studied error exponents and showed that decoding the measurements jointly results in a significantly larger error exponent. In this work, we study algorithms related to this joint decoder. First, we demonstrate how to perform joint decoding efficiently. By reducing to the problems of finding perfect matchings and minimum-cost matchings, we obtain joint decoders that run in time quadratic and cubic in the number of "bees" for the binary erasure channel (BEC) and the binary symmetric channel (BSC), respectively. Next, by studying the matching algorithms in the context of channel coding, we further reduce the running times by using classical tools like peeling decoders and list-decoders. In particular, we show that our identification algorithms, when used with Reed-Muller codes, terminate in almost linear and quadratic time for the BEC and BSC, respectively. Finally, for explicit codebooks, we study when these joint decoders fail to identify the "bees" correctly. Specifically, we provide practical methods of estimating the probability of erroneous identification for given codebooks.
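
For the BEC, compatibility means agreement on all non-erased positions, and a peeling decoder in the spirit described above repeatedly assigns any output with a unique surviving candidate. A minimal sketch under assumed parameters (erasures marked $-1$; one output per codeword):

```python
import numpy as np

def peel_identify(codebook, outputs):
    """Peeling identification over the BEC: an output is compatible with a
    codeword when they agree on every non-erased position. Repeatedly assign
    any output with a unique surviving compatible codeword, then peel both."""
    alive_cw = set(range(len(codebook)))
    assignment = {}
    changed = True
    while changed and len(assignment) < len(codebook):
        changed = False
        for j, y in enumerate(outputs):
            if j in assignment:
                continue
            obs = y >= 0                                    # non-erased mask
            comp = [i for i in alive_cw
                    if np.array_equal(codebook[i][obs], y[obs])]
            if len(comp) == 1:
                assignment[j] = comp[0]
                alive_cw.remove(comp[0])
                changed = True
    return assignment        # may be partial if peeling gets stuck

rng = np.random.default_rng(7)
codebook = rng.integers(0, 2, size=(5, 12))
outputs = codebook[rng.permutation(5)].copy()
outputs[rng.random(outputs.shape) < 0.3] = -1               # BEC(0.3) erasures
print(peel_identify(codebook, outputs))
```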

James Chin-Jen Pang    Hessam Mahdavifar    S. Sandeep Pradhan

Let $A(n, d)$ denote the maximum size of a binary code of length $n$ and minimum Hamming distance $d$ . Studying $A(n, d)$ , including efforts to determine it as well as to derive bounds on $A(n, d)$ for large $n$ , is one of the most fundamental subjects in coding theory. In this paper, we explore new lower and upper bounds on $A(n, d)$ in the large-minimum distance regime, in particular, when $d = n/2 - \Omega (\sqrt {n})$ . We first provide a new construction of cyclic codes, by carefully selecting specific roots in the binary extension field for the check polynomial, with length $n= 2^{m} -1$ , distance $d \geq n/2 - 2^{c-1}\sqrt {n}$ , and size $n^{c+1/2}$ , for any $m\geq 4$ and any integer $c$ with $0 \leq c \leq m/2 - 1$ . These code parameters are slightly worse than those of the Delsarte–Goethals (DG) codes that provide the previously known best lower bound in the large-minimum distance regime. However, using a similar and extended code construction technique we show a sequence of cyclic codes that improve upon DG codes and provide the best lower bound in a narrower range of the minimum distance $d$ , in particular, when $d = n/2 - \Omega (n^{2/3})$ . Furthermore, by leveraging a Fourier-analytic view of Delsarte's linear program, upper bounds on $A(n, \lceil n/2 - \rho \sqrt {n} \rceil)$ with $\rho \in (0.5, 9.5)$ are obtained that scale polynomially in $n$ . To the best of the authors' knowledge, the upper bound due to Barg and Nogin (2006) is the only previously known upper bound that scales polynomially in $n$ in this regime. We numerically demonstrate that our upper bound improves upon the Barg-Nogin upper bound in the specified high-minimum distance regime.
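
Delsarte's linear program mentioned above maximizes $1 + \sum_{i \ge d} A_i$ over nonnegative distance distributions whose Krawtchouk (MacWilliams) transform is nonnegative. A small sketch using `scipy.optimize.linprog`; the paper's Fourier-analytic bounds refine this basic LP:

```python
import numpy as np
from math import comb
from scipy.optimize import linprog

def krawtchouk(n, k, x):
    return sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))

def delsarte_lp_bound(n, d):
    """Delsarte's LP upper bound on A(n, d): maximize 1 + sum_{i >= d} A_i
    subject to A_i >= 0 and, for every k >= 1,
    sum_{i >= d} A_i K_k(i) >= -K_k(0) (nonnegative MacWilliams transform)."""
    idx = list(range(d, n + 1))                  # allowed nonzero distances
    c = -np.ones(len(idx))                       # maximize sum of A_i
    A_ub = np.array([[-krawtchouk(n, k, i) for i in idx] for k in range(1, n + 1)])
    b_ub = np.array([float(krawtchouk(n, k, 0)) for k in range(1, n + 1)])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * len(idx))
    return 1.0 - res.fun                         # includes A_0 = 1

# For d = n/2 the LP optimum is 2n; here A(16, 8) = 32 is attained by RM(1, 4).
print(delsarte_lp_bound(16, 8))
```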

Yingquan Wu    Jun Ma

In this paper, we first formulate a novel Chase Guruswami-Sudan algorithm which list-corrects not only all errors among the Chase flipping symbols but also a number of errors up to the enhanced Johnson bound among the remaining positions, for Reed-Solomon and $q$ -ary BCH codes. The enhanced Johnson bound is induced by shortening the code without those Chase flipping symbols. We next devise a novel Chase Kötter-Vardy algorithm for soft-decision decoding of Reed-Solomon codes. We show that the a posteriori probabilities of the two branches of one of the least unreliable symbols which undergo Chase flipping shall be merged and assigned to the correct branch out of the two (it does not matter anyway if neither is correct), based on the rationale that one of the exhaustive Chase flipping patterns must hit the genie pattern among the flipped symbols. During the exhaustive trial-and-error Chase decoding, each flipping pattern is treated as the genie, and thus the associated branches of each flipping symbol are assigned the merged a posteriori probabilities while the alternative branch is amortized. The multiplicity matrix is constructed using the state-of-the-art Kötter-Vardy transformation. We show by both theoretical analysis and simulations that the proposed Chase Kötter-Vardy algorithm significantly outperforms the original Kötter-Vardy algorithm under a similar complexity. We then extend this approach to a coded Chase decoding scheme wherein the Chase pattern set comes from a covering code. Surprisingly, simulations show that the best performance is typically achieved by concatenating exhaustive Chase flipping on a number of the most reliable secondary indexes with coded Chase flipping on the remaining secondary indexes. We also employ a novel preprocessing technique that enforces a zero constant term for the message polynomial, by equivalently sacrificing one parity symbol. It effectively eradicates spurious candidate bivariate polynomials by merely computing and checking their constant terms, and thus dramatically reduces the number of candidate bivariate polynomials that are actually interpolated and factorized.
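
The Kötter-Vardy transformation referred to above converts the a posteriori probability matrix $\Pi$ into interpolation multiplicities; its standard greedy form repeatedly increments the cell with the largest $\Pi_{i,j}/(m_{i,j}+1)$. A minimal sketch with a made-up toy matrix (the Chase-specific merging of branch probabilities described in the abstract would be applied to $\Pi$ beforehand):

```python
import numpy as np

def kv_multiplicities(Pi, total):
    """Greedy Koetter-Vardy multiplicity assignment: repeatedly give one more
    interpolation multiplicity to the (symbol, position) cell with the best
    ratio Pi / (m + 1), the standard proportional-to-reliability rule."""
    m = np.zeros_like(Pi, dtype=int)
    for _ in range(total):
        score = Pi / (m + 1)
        idx = np.unravel_index(np.argmax(score), Pi.shape)
        m[idx] += 1
    return m

Pi = np.array([[0.7, 0.1, 0.6],
               [0.2, 0.8, 0.3],
               [0.1, 0.1, 0.1]])     # toy a posteriori matrix (q = 3, n = 3)
print(kv_multiplicities(Pi, total=6))
```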

Mohammad Vahid Jamali    Xiyang Liu    Ashok Vardhan Makkuva    Hessam Mahdavifar    Sewoong Oh    Pramod Viswanath

Reed-Muller (RM) codes achieve the capacity of general binary-input memoryless symmetric channels and are conjectured to have a comparable performance to that of random codes in terms of scaling laws. However, such results are established assuming maximum-likelihood decoders for general code parameters. Also, RM codes only admit limited sets of rates. Efficient decoders such as successive cancellation list (SCL) decoder and recently-introduced recursive projection-aggregation (RPA) decoders are available for RM codes at finite lengths. In this paper, we focus on subcodes of RM codes with flexible rates. We first extend the RPA decoding algorithm to RM subcodes. To lower the complexity of our decoding algorithm, referred to as subRPA, we investigate different approaches to prune the projections. Next, we derive the soft-decision based version of our algorithm, called soft-subRPA, that not only improves upon the performance of subRPA but also enables a differentiable decoding algorithm. Building upon the soft-subRPA algorithm, we then provide a framework for training a machine learning (ML) model to search for good sets of projections that minimize the decoding error rate. Training our ML model enables achieving very close to the performance of full-projection decoding with a significantly smaller number of projections. We also show that the choice of the projections in decoding RM subcodes matters significantly, and our ML-aided projection pruning scheme is able to find a good selection, i.e., with negligible performance degradation compared to the full-projection case, given a reasonable number of projections.
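
To give a flavor of RPA-style decoding, the sketch below performs one hard-decision projection onto the cosets of a one-dimensional subspace $\{0, b\}$, plus a deliberately simplified aggregation step. Real RPA recurses on many projections and aggregates by majority vote; the paper's contribution is choosing which projections to keep for RM subcodes. Everything here is an illustrative caricature, not the paper's algorithm:

```python
import numpy as np

def project(y, b):
    """One RPA-style projection of a length-2^m vector of 0/1 hard decisions
    onto the cosets of {0, b}: coset {z, z^b} maps to y[z] XOR y[z^b]."""
    seen, proj, cosets = set(), [], []
    for z in range(len(y)):
        if z not in seen:
            seen.update({z, z ^ b})
            cosets.append((z, z ^ b))
            proj.append(y[z] ^ y[z ^ b])
    return np.array(proj), cosets

def aggregate(y, decoded_proj, cosets):
    """Simplified single-projection aggregation: flip one endpoint of a pair
    whenever the decoded projection disagrees with it (real RPA majority-votes
    across all projections instead)."""
    y_new = y.copy()
    for bit, (z, zb) in zip(decoded_proj, cosets):
        if bit != y[z] ^ y[zb]:
            y_new[z] ^= 1
    return y_new

y = np.array([1, 0, 0, 1, 1, 1, 0, 0])      # length 2^3 received hard decisions
proj, cosets = project(y, b=3)
print(proj, cosets)
```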

Arya Mazumdar

Motivated by applications in distributed storage, the notion of a locally recoverable code (LRC) was introduced a few years back. In an LRC, any coordinate of a codeword is recoverable by accessing only a small number of other coordinates. While different properties of LRCs have been well-studied, their performance on channels with random erasures or errors has been mostly unexplored. In this paper, we analyze the performance of LRCs over such stochastic channels. In particular, for input-symmetric discrete memoryless channels, we give a tight characterization of the gap to Shannon capacity when LRCs are used over the channel. Our results hold for a general notion of LRCs that correct multiple local erasures.
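
For readers unfamiliar with the notion, here is a toy LRC with locality 2; the specific code is an illustrative assumption, not a construction from the paper:

```python
import numpy as np

# Toy LRC: coordinates (x0, x1, p01, x2, x3, p23) with p01 = x0^x1 and
# p23 = x2^x3, so any single coordinate is recoverable from the two other
# coordinates of its local group.
msg = np.array([1, 0, 1, 1])
cw = np.array([msg[0], msg[1], msg[0] ^ msg[1],
               msg[2], msg[3], msg[2] ^ msg[3]])

# Suppose coordinate 1 (x1) is erased; its local group is {0, 2} = (x0, p01).
repaired = cw[0] ^ cw[2]                  # x1 = x0 ^ p01
print(repaired == msg[1])                 # True
```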

Alexander Vardy    Eitan Yaakobi

Private information retrieval (PIR) protocols allow a user to retrieve a data item from a database without revealing any information about the identity of the item being retrieved. Specifically, in information-theoretic $k$ -server PIR, the database is replicated among $k$ non-communicating servers, and each server learns nothing about the item retrieved by the user. The effectiveness of PIR protocols is usually measured in terms of their communication complexity, which is the total number of bits exchanged between the user and the servers. However, another important cost parameter is storage overhead, which is the ratio between the total number of bits stored on all the servers and the number of bits in the database. Since single-server information-theoretic PIR is impossible, the storage overhead of all existing PIR protocols is at least 2 (or $k$ , in the case of $k$ -server PIR). In this work, we show that information-theoretic PIR can be achieved with storage overhead arbitrarily close to the optimal value of 1, without sacrificing the communication complexity asymptotically. Specifically, we prove that all known linear $k$ -server PIR protocols can be efficiently emulated, while preserving both privacy and communication complexity but significantly reducing the storage overhead. To this end, we distribute the $n$ bits of the database among $s+r$ servers, each storing $n/s$ coded bits (rather than replicas). Notably, our coding scheme remains the same, regardless of the specific $k$ -server PIR protocol being emulated. For every fixed $k$ , the resulting storage overhead $(s+r)/s$ approaches 1 as $s$ grows; explicitly we have $r \le k \sqrt {s}(1 + o(1))$ . Moreover, in the special case $k = 2$ , the storage overhead is only $1 + \frac{1}{s}$ . In order to achieve these results, we introduce and study a new kind of binary linear codes, called here $k$ -server PIR codes. We then show how such codes can be constructed from one-step majority-logic decodable codes, from Steiner systems, from constant-weight codes, and from certain locally recoverable codes. We also establish several bounds on the parameters of $k$ -server PIR codes and finally extend them to array codes.
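
A textbook 2-server PIR protocol illustrates the object being emulated: the user sends a uniformly random subset to one server and the same subset with the desired index toggled to the other, so each server individually sees a uniform query. A minimal sketch (this baseline has linear communication; the point here is only the privacy mechanism, with the paper's contribution being emulation of such protocols over coded rather than replicated storage):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16
x = rng.integers(0, 2, n)          # database of n bits, replicated on both servers
i = 9                              # index the user wants, hidden from each server

# User: random subset S to server 1, S with position i toggled to server 2.
S = rng.integers(0, 2, n)
S2 = S.copy()
S2[i] ^= 1

a1 = int(np.dot(x, S)) % 2         # server 1's answer: parity of x over S
a2 = int(np.dot(x, S2)) % 2        # server 2's answer
print((a1 ^ a2) == x[i])           # True: the answers XOR to the desired bit
```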

Yotam Gershon    Yuval Cassuto

We propose a new compression scheme for genomic data given as sequence fragments called reads. The scheme uses a reference genome at the decoder side only, freeing the encoder from the burdens of storing references and performing computationally costly alignment operations. The main ingredient of the scheme is a multi-layer code construction, delivering to the decoder sufficient information to align the reads, correct their differences from the reference, validate their reconstruction, and correct reconstruction errors. The core of the method is the well-known concept of distributed source coding with decoder side information, fortified by a generalized-concatenation code construction enabling efficient embedding of all the information needed for reliable reconstruction. We first present the scheme for the case of substitution errors only between the reads and the reference, and then extend it to support reads with a single deletion and multiple substitutions. A central tool in this extension is a new distance metric that is shown analytically to improve alignment performance over existing distance metrics.
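
The core primitive, distributed source coding with decoder side information, can be sketched with a [7,4] Hamming code: the encoder sends only the syndrome of a read, and a decoder holding a reference within Hamming distance 1 recovers the read exactly. This toy assumes known alignment and substitutions only; the paper's multi-layer construction adds alignment, validation, and deletion support on top of this idea.

```python
import numpy as np

H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])       # [7,4] Hamming parity-check matrix

def encode(read):
    """Encoder side: send only the 3-bit syndrome of the 7-bit read."""
    return (H @ read) % 2

def decode(syndrome, reference):
    """Decoder side: with a reference within Hamming distance 1 of the read,
    the syndrome of (read XOR reference) locates the single substitution."""
    diff = (syndrome + H @ reference) % 2    # equals H (read ^ reference)^T
    rec = reference.copy()
    if diff.any():
        # columns of H spell the binary numbers 1..7 (row 0 = LSB),
        # so the nonzero syndrome names the differing position
        pos = int(diff[0] + 2 * diff[1] + 4 * diff[2]) - 1
        rec[pos] ^= 1
    return rec

read = np.array([1, 0, 1, 1, 0, 0, 1])
reference = read.copy()
reference[4] ^= 1                            # reference differs in one position
print(np.array_equal(decode(encode(read), reference), read))   # True
```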

Yingquan Wu

In the literature, PIBMA, a linear-feedback-shift-register (LFSR) decoder, has been shown to be the most efficient high-speed decoder for Reed-Solomon (RS) codes. In this work, we follow the same design principles and present two high-speed LFSR decoder architectures for binary BCH codes, both achieving the critical path of one multiplier and one adder. We identify a key insight of the Berlekamp algorithm: iterative discrepancy computation involves only even-degree terms. The first decoder separates the even- and odd-degree terms of the error-locator polynomial to iterate homogeneously with discrepancy computation. The resulting LFSR decoder architecture, dubbed PIBA, has $\lfloor \frac{3t}{2} \rfloor +1$ processing elements (PEs), each containing two registers, two multipliers, one adder, and two multiplexers (same as that of PIBMA), which compares favorably against the best existing architecture composed of $2t+1$ PEs. The second one, dubbed pPIBA, squeezes the entire error-locator polynomial into the even-term array of the first one to iterate along with discrepancy computation, which comes at the cost of a controlled defect rate. pPIBA employs $t+1+f$ systolic units with a defect probability of $2^{-q(f+1)}$ , where $q$ denotes the finite field dimension and $f$ is a design parameter, which significantly reduces the number of PEs for a large correcting power $t$ . The proposed architectures can be arbitrarily folded to trade off complexity with latency, due to their systolic nature. GII decoding has been notorious for the composition of many seemingly irrelevant functional blocks. We are motivated by the unified framework UPIBA, which can be reconfigured to carry out both error-only and error-and-erasure decoding of RS codes in the most efficient manner. We devise a unified LFSR decoder for GII-RS, GII-ERS (referring to erasure correction of GII-RS codes), and GII-BCH codes, respectively. Each LFSR decoder can be reconfigured (but not multiplexed) to execute different functional blocks, and moreover achieves the same critical path of one multiplier, one adder, and one multiplexer. The resulting GII-RS/BCH decoder contains only four functional blocks, which are literally the same as those of the decoder for single RS/BCH codes. For GII-RS and GII-BCH decoding, we also incorporate the original mechanism by Tang and Kötter to minimize the miscorrection rate, which comes surprisingly at a negligible cost. Our proposed high-speed low-complexity GII-ERS decoder renders the multi-layer GII codes highly attractive against other locally recoverable codes.
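
For reference, the software analogue of the LFSR synthesis these architectures implement is the Berlekamp-Massey algorithm; over GF(2) it is a few lines. (The paper's insight is that, for binary BCH decoding, the discrepancy involves only even-degree terms, which the PIBA/pPIBA hardware exploits; the generic version below does not use that.)

```python
def berlekamp_massey(s):
    """Berlekamp-Massey over GF(2): return (L, C) where C is the shortest
    LFSR connection polynomial (c_0 = 1) generating bit sequence s."""
    n = len(s)
    C = [1] + [0] * n        # current connection polynomial
    B = [1] + [0] * n        # previous best polynomial
    L, m = 0, 1
    for i in range(n):
        d = s[i]             # discrepancy: s_i + sum_{j=1..L} c_j s_{i-j}
        for j in range(1, L + 1):
            d ^= C[j] & s[i - j]
        if d == 0:
            m += 1
        elif 2 * L <= i:
            T = C[:]
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            L, B, m = i + 1 - L, T, 1
        else:
            for j in range(n + 1 - m):
                C[j + m] ^= B[j]
            m += 1
    return L, C[:L + 1]

# m-sequence from C(x) = 1 + x + x^3: expect (3, [1, 1, 0, 1])
print(berlekamp_massey([1, 0, 0, 1, 1, 1, 0]))
```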

Mohammad Fereydounian    Hamed Hassani    Mohammad Vahid Jamali    Hessam Mahdavifar

Low-capacity scenarios have become increasingly important in the technology of the Internet of Things (IoT) and the next generation of wireless networks. Such scenarios require efficient and reliable transmission over channels with an extremely small capacity. Within these constraints, the state-of-the-art coding techniques may not be directly applicable. Moreover, the prior work on the finite-length analysis of optimal channel coding provides inaccurate predictions of the limits in the low-capacity regime. In this paper, we study channel coding at low capacity from two perspectives: fundamental limits at finite length and code constructions. We first specify what a low-capacity regime means. We then characterize finite-length fundamental limits of channel coding in the low-capacity regime for various types of channels, including binary erasure channels (BECs), binary symmetric channels (BSCs), and additive white Gaussian noise (AWGN) channels. From the code construction perspective, we characterize the optimal number of repetitions for transmission over binary memoryless symmetric (BMS) channels, in terms of the code blocklength and the underlying channel capacity, such that the capacity loss due to the repetition is negligible. Furthermore, it is shown that capacity-achieving polar codes naturally adopt the aforementioned optimal number of repetitions.
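
The finite-length prediction that becomes inaccurate in the low-capacity regime is the normal approximation $\log_2 M^*(n, \epsilon) \approx nC - \sqrt{nV}\,Q^{-1}(\epsilon) + \frac{1}{2}\log_2 n$ (Polyanskiy, Poor, and Verdú, 2010). A minimal evaluation for a BSC, with arbitrary illustrative parameter values:

```python
import numpy as np
from scipy.stats import norm

def bsc_normal_approx(p, n, eps):
    """Normal-approximation estimate of the maximal number of information
    bits over a BSC(p) at blocklength n and error probability eps."""
    h = -p * np.log2(p) - (1 - p) * np.log2(1 - p)
    C = 1 - h
    V = p * (1 - p) * np.log2((1 - p) / p) ** 2      # channel dispersion
    return n * C - np.sqrt(n * V) * norm.isf(eps) + 0.5 * np.log2(n)

# Near p = 1/2 the correction terms are comparable to n*C itself, which is
# exactly the regime where the paper shows this prediction is unreliable.
p, n, eps = 0.45, 2048, 1e-3
print(bsc_normal_approx(p, n, eps))
print(n * (1 - (-p * np.log2(p) - (1 - p) * np.log2(1 - p))))   # n*C alone
```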

Anindya Bijoy Das    Aditya Ramamoorthy    David J. Love    Christopher G. Brinton

Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum distance separable) codes into the framework; this can achieve resilience against an optimal number of stragglers. However, these codes assign dense linear combinations of submatrices to the worker nodes. When the input matrices are sparse, these approaches increase the number of non-zero entries in the encoded matrices, which in turn adversely affects the worker computation time. In this work, we develop a distributed matrix computation approach where the assigned encoded submatrices are random linear combinations of a small number of submatrices. In addition to being well suited for sparse input matrices, our approach continues to have the optimal straggler resilience in a certain range of problem parameters. Moreover, compared to recent sparse matrix computation approaches, the search for a "good" set of random coefficients to promote numerical stability in our method is much more computationally efficient. We show that our approach can efficiently utilize partial computations done by slower worker nodes in a heterogeneous system which can enhance the overall computation speed. Numerical experiments conducted through Amazon Web Services (AWS) demonstrate up to 30% reduction in per worker node computation time and $100\times $ faster encoding compared to the available methods.
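
A minimal sketch of the encoding idea, with made-up dimensions: each worker stores a random combination of only $s$ of the $k$ submatrix blocks (preserving sparsity), and any $k$ responses whose coefficient rows are invertible suffice to decode. The paper's actual scheme guarantees resilience and numerical stability in a parameter range; this only shows the flavor.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
k, w, s = 4, 6, 2           # k blocks, w workers, s blocks combined per worker

A = rng.random((8, 20))
blocks = np.split(A, k, axis=0)                    # row blocks A_0 .. A_{k-1}

# Encoding: each worker gets a random combination of only s blocks, so a
# sparse A stays (roughly) sparse, unlike dense MDS-coded assignments.
coeffs = np.zeros((w, k))
for i in range(w):
    support = rng.choice(k, size=s, replace=False)
    coeffs[i, support] = rng.standard_normal(s)
tasks = [sum(coeffs[i, j] * blocks[j] for j in range(k)) for i in range(w)]

x = rng.random(20)
results = {i: tasks[i] @ x for i in range(w)}      # worker i returns task_i @ x
del results[1], results[4]                         # two stragglers never respond

# Decoding: pick k responses with an invertible coefficient submatrix.
for cand in combinations(sorted(results), k):
    G = coeffs[list(cand)]
    if np.linalg.matrix_rank(G) == k:
        break
R = np.array([results[i] for i in cand])
decoded = np.linalg.solve(G, R)                    # rows are blocks of A @ x
print(np.allclose(decoded.reshape(-1), A @ x))     # True
```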

Ron M. Roth

Let $[q\rangle $ denote the integer set $\{0,1, {\ldots },q-1\}$ and let ${{\mathbb {B}}}=\{0,1\}$ . The problem of implementing functions $[q\rangle \rightarrow {{\mathbb {B}}}$ on content-addressable memories (CAMs) is considered. CAMs can be classified by the input alphabet and the state alphabet of their cells; for example, in binary CAMs, those alphabets are both ${{\mathbb {B}}}$ , while in a ternary CAM (TCAM), both alphabets are endowed with a "don't care" symbol. This work is motivated by recent proposals for using CAMs for fast inference on decision trees. In such learning models, the tree nodes carry out integer comparisons, such as testing equality ($x = t$?) or inequality ($x \le t$?), where $x\in [q\rangle $ is an input to the node and $t\in [q\rangle $ is a node parameter. A CAM implementation of such comparisons includes mapping (i.e., encoding) $t$ into internal states of some number $n$ of cells and mapping $x$ into inputs to these cells, with the goal of minimizing $n$ . Such mappings are presented for various comparison families, as well as for the set of all functions $[q\rangle \rightarrow {{\mathbb {B}}}$ , under several scenarios of input and state alphabets of the CAM cells. All those mappings are shown to be optimal in that they attain the smallest possible $n$ for any given $q$ .
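
A classical baseline for such a mapping is prefix expansion for the inequality test $x \le t$ on a TCAM, which uses at most $n+1$ ternary patterns for $n$-bit inputs (the paper derives optimal cell counts under several input/state alphabets; this sketch shows only the textbook construction):

```python
def range_prefixes(t, n):
    """Ternary patterns ('0'/'1'/'*') whose union matches exactly {x : x <= t}:
    one pattern per 1-bit of t (agree on the prefix, put 0 there, wildcard the
    rest), plus the pattern for t itself."""
    bits = format(t, f'0{n}b')
    pats = []
    for i, b in enumerate(bits):
        if b == '1':
            pats.append(bits[:i] + '0' + '*' * (n - i - 1))
    pats.append(bits)                        # x == t itself
    return pats

def matches(x, pat, n):
    xb = format(x, f'0{n}b')
    return all(p in ('*', c) for p, c in zip(pat, xb))

n, t = 4, 9                                  # test "x <= 9" on 4-bit inputs
pats = range_prefixes(t, n)
print(pats)                                  # ['0***', '1000', '1001']
assert all(any(matches(x, p, n) for p in pats) == (x <= t) for x in range(2 ** n))
```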

Burak Bartan    Mert Pilanci

We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations. The proposed mechanism integrates the concepts of randomized sketching and polar codes in the context of coded computation. We propose a sequential decoding algorithm designed to handle real valued data while maintaining low computational complexity for recovery. Additionally, we provide an anytime estimator that can generate provably accurate estimates even when the set of available node outputs is not decodable. We demonstrate the potential applications of this framework in various contexts, such as large-scale matrix multiplication and black-box optimization. We present the implementation of these methods on a serverless cloud computing system and provide numerical results to demonstrate their scalability in practice, including ImageNet scale computations.
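
The randomized-sketching half of the framework can be illustrated in isolation: a Gaussian sketch gives an unbiased low-dimensional approximation of a matrix product, which is the kind of anytime estimate such a scheme can return before enough node outputs arrive for exact recovery. A minimal sketch with assumed dimensions; the polar-coded encoding/decoding across compute nodes is not shown:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((100, 500))
B = rng.random((500, 80))

k = 200                                          # sketch dimension (assumed)
S = rng.standard_normal((500, k)) / np.sqrt(k)   # E[S S^T] = I, so the
approx = (A @ S) @ (S.T @ B)                     # estimate is unbiased

rel_err = np.linalg.norm(approx - A @ B) / np.linalg.norm(A @ B)
print(f"relative error of the sketched product: {rel_err:.3f}")
```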

Neophytos Charalambides    Mert Pilanci    Alfred O. Hero

Coded computing is a method for mitigating the effect of straggling workers in a centralized computing network by using erasure-coding techniques. Federated learning is a decentralized model for training on data distributed across client devices. In this work we propose approximating the inverse of an aggregated data matrix, where the data is generated by clients, similar to the federated learning paradigm, while also being resilient to stragglers. To do so, we propose a coded computing method based on gradient coding. We modify this method so that the coordinator does not access the local data at any point, while the clients access the aggregated matrix in order to complete their tasks. The network we consider is not centrally administrated, and the communications which take place are secure against potential eavesdroppers.
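
The gradient coding primitive being built on can be sketched with the classic fractional-repetition scheme of Tandon et al.: with $n$ workers in $s+1$ groups, each group covers all data parts, so any $n-s$ responses contain a complete group. This toy shows only that primitive, not the paper's matrix-inverse protocol or its privacy mechanics:

```python
import numpy as np

def frc_assignments(n, s):
    """Fractional-repetition gradient coding: n workers in s+1 groups; each
    group partitions all n data parts among its members, so any n-s
    responding workers include at least one complete group (assumes (s+1)|n)."""
    per = n // (s + 1)                         # workers per group
    groups = [list(range(g * per, (g + 1) * per)) for g in range(s + 1)]
    assign = {}
    for members in groups:
        parts = np.array_split(np.arange(n), per)
        for worker, p in zip(members, parts):
            assign[worker] = list(p)           # worker sums gradients of parts p
    return groups, assign

n, s = 6, 1
groups, assign = frc_assignments(n, s)
grads = np.random.default_rng(0).random((n, 4))    # n partial gradients
responded = set(range(n)) - {2}                    # one straggler (s = 1)
for members in groups:
    if all(w in responded for w in members):       # find a complete group
        total = sum(grads[assign[w]].sum(axis=0) for w in members)
        break
print(np.allclose(total, grads.sum(axis=0)))       # True: full gradient recovered
```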

Huang-Chang Lee    Jyun-Han Wu    Chung-Hsuan Wang    Yeong-Luh Ueng

This paper presents a soft decoding scheme based on the binary representations transferred from the parity-check matrices (PCMs) for Reed-Solomon (RS) codes. Referring to the modified binary PCM that has a systematic part and a high-density part corresponding to the least reliable variable nodes (LRVNs) and the most reliable variable nodes (MRVNs), respectively, an informed dynamic scheduling method, called Nested-Polling Residual Belief Propagation (NP-RBP), is applied to the corresponding Tanner graph. As with the popular adaptive BP (ABP) decoding approach, adaptation in a binary PCM based on the reliability of variable nodes is also conducted in the proposed NP-RBP decoding. The NP-RBP enables the LRVNs to receive significant updates and limits the correlation accumulation from the short cycles in the MRVNs. In order to enhance the error-rate performance for long codes, a bit-flipping (BF) technique is applied in order to correct a selection of the errors in the MRVNs such that the propagation of these errors in the subsequent NP-RBP process can be avoided. The resultant decoder is termed NP-RBP-BF. For short codes such as the (31, 25) and (63, 55) RS codes, NP-RBP is able to provide an error-rate performance close to the maximum-likelihood (ML) bound. A more significant improvement can be observed for long codes. For instance, when the proposed NP-RBP-BF decoding is applied to the (255, 239) RS code, it can provide a gain of about 0.4 dB compared to the ABP decoding and the performance gap to the ML bound can be narrowed to about 0.25 dB at a frame error rate of $2\times 10^{-3}$ .

Debarnab Mitra    Lev Tauz    Lara Dolecek

A data availability (DA) attack is a well-known problem in certain blockchains where users accept an invalid block with unavailable portions. Previous works have used LDPC and 2-D Reed-Solomon (2D-RS) codes with Merkle trees to mitigate DA attacks. These codes perform well across various metrics such as DA detection probability and communication cost. However, they are difficult to apply to blockchains with large blocks due to large decoding complexity and coding-fraud proof size (2D-RS codes), and intractable code guarantees at large code lengths (LDPC codes). In this paper, we focus on large block size applications and address the above challenges by proposing the novel Graph Coded Merkle Tree (GCMT): a Merkle tree encoded using polar encoding graphs. We provide a specialized polar encoding graph design algorithm called Sampling Efficient Freezing and an algorithm to prune the polar encoding graph. We demonstrate that the GCMT built using the above techniques results in a better DA detection probability and communication cost compared to LDPC codes, has a lower coding-fraud proof size compared to LDPC and 2D-RS codes, provides tractable code guarantees at large code lengths (similar to 2D-RS codes), and has decoding complexity comparable to 2D-RS and LDPC codes.
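
As a baseline for what the GCMT generalizes, here is a plain (uncoded) Merkle commitment with authentication-path verification; the GCMT additionally erasure-codes each tree layer with a polar encoding graph so that light nodes can detect withheld data by sampling:

```python
import hashlib

def H(b):
    return hashlib.sha256(b).digest()

def merkle_root_and_proof(leaves, idx):
    """Root of a plain Merkle tree plus the authentication path of leaf idx
    (leaf count assumed a power of two)."""
    level = [H(x) for x in leaves]
    proof = []
    while len(level) > 1:
        proof.append(level[idx ^ 1])                 # sibling at this level
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        idx //= 2
    return level[0], proof

def verify(leaf, idx, proof, root):
    h = H(leaf)
    for sib in proof:
        h = H(h + sib) if idx % 2 == 0 else H(sib + h)
        idx //= 2
    return h == root

leaves = [bytes([i]) * 4 for i in range(8)]
root, proof = merkle_root_and_proof(leaves, 5)
print(verify(leaves[5], 5, proof, root))             # True
```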

Min Zhu    David G. M. Mitchell    Michael Lentmaier    Daniel J. Costello

In this paper, we investigate the problem of decoder error propagation for spatially coupled low-density parity-check (SC-LDPC) codes with sliding window decoding (SWD). This problem typically manifests itself at signal-to-noise ratios (SNRs) close to capacity under low-latency operating conditions. In this case, infrequent but severe decoder error propagation can sometimes occur. To help understand the error propagation problem in SWD of SC-LDPC codes, a multi-state Markov model is developed to describe decoder behavior and to analyze the error performance of SC-LDPC codes under these conditions. We then present two approaches, check node (CN) doping and variable node (VN) doping, to combat decoder error propagation and improve decoder performance. Next, we describe how the performance can be further improved by employing an adaptive approach that depends on the availability of a noiseless binary feedback channel. To illustrate the effectiveness of the doping techniques, we analyze the error performance of CN doping and VN doping using the multi-state decoder model. We then present computer simulation results showing that CN and VN doping significantly improve the performance in the operating range of interest at a cost of a small rate loss and that adaptive doping further improves the performance. We also show that the rate loss is always less than that resulting from encoder termination and can be further reduced by doping only a fraction of the VNs at each doping position in the code graph with only a minor impact on performance. Finally, we show how the encoding problem for VN doping can be greatly simplified by doping only systematic bits, with little or no performance loss.
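
A toy two-state instance of such a Markov model illustrates why periodic doping helps: without intervention the propagation state is absorbing, and resetting the chain every $d$ positions bounds the expected fraction of positions decoded in the bad state. All transition probabilities below are illustrative assumptions, not values from the paper:

```python
import numpy as np

p_enter = 1e-3            # chance a window failure starts error propagation
d = 20                    # doping period (assumed)

# State 0 = nominal decoding, state 1 = error propagation (absorbing
# without doping).
P = np.array([[1 - p_enter, p_enter],
              [0.0,         1.0   ]])

state = np.array([1.0, 0.0])
frac_bad = 0.0
steps = 10_000
for pos in range(1, steps + 1):
    state = state @ P
    frac_bad += state[1]
    if pos % d == 0:
        state = np.array([1.0, 0.0])   # doping resets the decoder state
print(frac_bad / steps)                # average fraction of positions in propagation
```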

Mohammad Rowshan    Jinhong Yuan

The minimum Hamming distance of a linear block code is the smallest number of bit changes required to transform one valid codeword into another. The code's minimum distance determines the code's error-correcting capabilities. Furthermore, the number of minimum-weight codewords, a.k.a. the error coefficient, gives a good comparative measure for the block error rate (BLER) of linear block codes with identical minimum distance, in particular in the high-SNR regime under maximum likelihood (ML) decoding. A code with a smaller error coefficient gives a lower BLER. Unlike for polar codes, a closed-form expression for the error coefficient of polarization-adjusted convolutional (PAC) codes is not yet known. As PAC codes are convolutionally pre-transformed polar codes, we study the impact of pre-transformation on polar codes in terms of minimum Hamming distance and error coefficient by partitioning the codewords into cosets. We show that the minimum distance of PAC codes does not decrease; however, the pre-transformation may reduce the error coefficient depending on the choice of convolutional polynomial. We recognize the properties of the cosets where pre-transformation is ineffective in decreasing the error coefficient, giving a lower bound for the error coefficient. Then, we propose a low-complexity enumeration method that determines the number of minimum-weight codewords of PAC codes relying on the error coefficient of polar codes. That is, given the error coefficient ${\mathcal {A}}_{w_{min}}$ of polar codes, we determine the reduction $X$ in the error coefficient due to convolutional pre-transformation in PAC coding and subtract it from the error coefficient of polar codes, giving ${\mathcal {A}}_{w_{min}}-X$ . Furthermore, we numerically analyze the tightness of the lower bound and the impact of the choice of the convolutional polynomial on the error coefficient based on the sub-patterns in the polynomial's coefficients. Eventually, we show how we can further reduce the error coefficient in the cosets.
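
At short lengths, this kind of counting can be checked by brute force. The sketch below enumerates minimum-weight codewords of a toy (16, 5) polar code and of its PAC pre-transformed version; the information set and convolutional polynomial are assumptions for illustration, and the paper's coset-based method replaces exactly this exponential enumeration:

```python
import numpy as np
from itertools import product

def polar_G(m):
    F = G2 = np.array([[1, 0], [1, 1]])
    for _ in range(m - 1):
        F = np.kron(F, G2)
    return F % 2

def min_weight_count(encode, k):
    """Brute force: minimum nonzero codeword weight and its multiplicity."""
    wmin, count = None, 0
    for u in product((0, 1), repeat=k):
        if not any(u):
            continue
        w = int(encode(np.array(u)).sum())
        if wmin is None or w < wmin:
            wmin, count = w, 1
        elif w == wmin:
            count += 1
    return wmin, count

m, n = 4, 16
G = polar_G(m)
info = [7, 11, 13, 14, 15]          # assumed (16, 5) information set
conv = np.array([1, 0, 1, 1])       # assumed PAC polynomial c(x) = 1 + x^2 + x^3

def enc_polar(u):
    v = np.zeros(n, dtype=int)
    v[info] = u
    return (v @ G) % 2

def enc_pac(u):
    v = np.zeros(n, dtype=int)
    v[info] = u
    w = np.zeros(n, dtype=int)       # upper-triangular Toeplitz pre-transform
    for i in range(n):
        for j, c in enumerate(conv):
            if c and i - j >= 0:
                w[i] ^= v[i - j]
    return (w @ G) % 2

print("polar:", min_weight_count(enc_polar, len(info)))
print("PAC:  ", min_weight_count(enc_pac, len(info)))
```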

Anthony G贸mez-Fonseca    Roxana Smarandache    David G. M. Mitchell

In this paper, we present an efficient strategy to enumerate the number of $k$ -cycles, $g\leq k < 2g$ , in the Tanner graph of a quasi-cyclic low-density parity-check (QC-LDPC) code with girth $g$ using its polynomial parity-check matrix $H$ . This strategy works for both $(d_{v},d_{c})$ -regular and irregular QC-LDPC codes. In this approach, we note that the $m$ th power of the polynomial adjacency matrix can be used to describe walks of length $m$ in the protograph and can therefore be sufficiently described by the matrices $B_{m}(H) \triangleq (HH^{\mathsf{T}})^{\lfloor m/2 \rfloor} H^{(m \bmod 2)}$ , where $m\geq 0$ . We provide formulas for the number of $k$ -cycles, $\mathcal {N}_{k}$ , by just taking into account repetitions in some multisets constructed from the matrices $B_{m}(H)$ . This approach is shown to have low complexity. For example, in the case of QC-LDPC codes based on the $3\times n_{v}$ fully-connected protograph, the complexity of determining $\mathcal {N}_{k}$ , for $k=4,6,8,10$ , and $12$ , is $O(n_{v}^{2}\log (N))$ , $O(n_{v}^{2}\log (n_{v})\log (N))$ , $O(n_{v}^{4}\log ^{4}(n_{v})\log (N))$ , $O(n_{v}^{4}\log (n_{v})\log (N))$ and $O(n_{v}^{6}\log ^{6}(n_{v})\log (N))$ , respectively. The complexity, depending logarithmically on the lifting factor $N$ , gives our approach, to the best of our knowledge, a significant advantage over previous works on the cycle distribution of QC-LDPC codes.
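
The $m = 2$ case of the idea is easy to see on an unlifted binary matrix: $HH^{\mathsf{T}}$ records column overlaps between rows, and each pair of rows sharing $\ell \ge 2$ columns contributes $\binom{\ell}{2}$ 4-cycles. A scalar sketch of that special case (the paper's contribution is doing this on the polynomial matrix, at cost logarithmic in the lifting factor $N$):

```python
import numpy as np
from math import comb

def count_4cycles(H):
    """Count 4-cycles in the Tanner graph of a binary parity-check matrix:
    entry (i, j) of H @ H.T is the number of columns shared by rows i and j,
    and each pair sharing l >= 2 columns contributes C(l, 2) 4-cycles."""
    overlaps = H @ H.T
    n4 = 0
    for i in range(H.shape[0]):
        for j in range(i + 1, H.shape[0]):
            n4 += comb(int(overlaps[i, j]), 2)
    return n4

H = np.array([[1, 1, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1]])
print(count_4cycles(H))    # 2: row pairs (0,1) and (0,2) each share two columns
```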

Xuan Guang    Raymond W. Yeung

We consider linear network error correction (LNEC) coding when errors may occur on the edges of a communication network whose topology is known. In this paper, we first present a framework of additive adversarial networks for LNEC coding, and then prove the equivalence of two well-known LNEC coding approaches, which can be unified under this framework. Furthermore, by developing a graph-theoretic approach, we obtain a significantly enhanced characterization of the error correction capability of LNEC codes in terms of the minimum distances at the sink nodes. Specifically, in order to ensure that an LNEC code can correct up to $r$ errors at a sink node $t$ , it suffices to ensure that this code can correct every error vector in a reduced set of error vectors; and on the other hand, this LNEC code in fact can correct every error vector in an enlarged set of error vectors. In general, the size of this reduced set is considerably smaller than the number of error vectors with Hamming weight not larger than $r$ , and the size of this enlarged set is considerably larger than the same number. This result has the important implication that the computational complexities for decoding and for code construction can be significantly reduced.

Shailja Agrawal    K. V. Sushena Sree    Prasad Krishnan    Abhinav Vaishya    Srikar Kale

We consider the standard broadcast setup with a single server broadcasting information to a number of clients, each of which has local storage (called a cache) of some size, which can store some parts of the files available at the server. The centralized coded caching framework consists of a caching phase and a delivery phase, both of which are carefully designed in order to use the cache and the channel together optimally. In prior literature, various combinatorial structures have been used to construct coded caching schemes. One of the chief drawbacks of many of these existing constructions is the large subpacketization level, which denotes the number of times a file must be split for the schemes to provide coding gain. In this work, using a new binary matrix model, we present several novel constructions for coded caching based on various types of combinatorial designs and their $q$ -analogs, which are also called subspace designs. While most of the schemes constructed in this work (based on existing designs) have a high cache requirement, they provide a rate that is either constant or decreasing, and moreover require competitively small levels of subpacketization, which is an extremely important feature in practical applications of coded caching. We also apply our constructions to the distributed computing framework of MapReduce, which consists of three phases: the Map phase, the Shuffle phase, and the Reduce phase. Using our binary matrix framework, we present a new, simple, generic coded data shuffling scheme. Employing our designs-based constructions in conjunction with this new shuffling scheme, we obtain new coded computing schemes which have low file complexity, with marginally higher communication load compared to the optimal scheme for equivalent parameters. We also show that our schemes extend neatly to scenarios with full and partial stragglers.
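
For concreteness, the subpacketization issue can be seen in the classic centralized scheme of Maddah-Ali and Niesen, where each file splits into $\binom{K}{t}$ subfiles and delivery sends one XOR per $(t+1)$-subset of users; the constructions in the paper trade a higher cache requirement for far smaller subpacketization. A minimal sketch of that baseline delivery, with assumed toy parameters:

```python
import itertools
import numpy as np

K, t = 4, 2                 # K users; t = K*M/N is the cache parameter
subsets = list(itertools.combinations(range(K), t))
rng = np.random.default_rng(0)

# Placement: each file splits into C(K, t) subfiles W[f][T]; user k caches
# every subfile whose index set T contains k.
N_files = 4
W = {f: {T: rng.integers(0, 2, 8) for T in subsets} for f in range(N_files)}
demand = [0, 1, 2, 3]       # user k requests file demand[k]

# Delivery: one XOR per (t+1)-subset S; each user in S strips the other
# terms from its cache and recovers its own missing subfile.
for S in itertools.combinations(range(K), t + 1):
    msg = np.zeros(8, dtype=int)
    for k in S:
        T = tuple(sorted(set(S) - {k}))       # subfile of user k's file missing at k
        msg ^= W[demand[k]][T]
    k = S[0]                                  # check recovery for one user in S
    rec = msg.copy()
    for other in S[1:]:
        rec ^= W[demand[other]][tuple(sorted(set(S) - {other}))]   # cached at k
    assert np.array_equal(rec, W[demand[k]][tuple(sorted(set(S) - {k}))])
print("delivered", len(list(itertools.combinations(range(K), t + 1))), "XOR messages")
```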

Yeow Meng Chee    Alexander Vardy    Van Khu Vu    Eitan Yaakobi

Transverse-read is a novel technique to detect the number of "1"s stored in a domain wall memory, also known as racetrack memory, without shifting any domains. Motivated by this technique, we propose a novel scheme combining transverse-reads and shift-operations such that we can reduce the number of shift-operations while still achieving high capacity. We also show that this scheme is helpful for correcting errors in domain wall memory. A set of valid words in this transverse-read channel is called a transverse-read code. Our goal in this work is to study transverse-read codes with respect to their properties, capacity, and applications. We first present several properties of transverse-read codes and show that they are equivalent to a family of constrained codes. Then, we compute the maximal asymptotic rate of transverse-read codes for certain parameters. Furthermore, we also present several constructions of transverse-read codes with high rate. Finally, we design several transverse-read codes that can correct limited-shift errors and limited-magnitude errors in domain wall memory.
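
Since transverse-read codes are shown to be equivalent to a constrained system, their maximal asymptotic rate is the base-2 logarithm of the spectral radius of the constraint graph's adjacency matrix (Shannon's formula for constrained capacity). A minimal sketch on the textbook $(1,\infty)$-RLL constraint; the actual transverse-read constraint graphs depend on the scheme's parameters:

```python
import numpy as np

def constrained_capacity(A):
    """Capacity of the constrained system whose words are walks in the graph
    with adjacency matrix A: log2 of the spectral radius."""
    return np.log2(max(abs(np.linalg.eigvals(A))))

# (1, inf)-RLL constraint (no two adjacent 1s); states: last bit 0 / last bit 1.
A = np.array([[1, 1],
              [1, 0]])
print(constrained_capacity(A))   # log2(golden ratio) ~ 0.6942
```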