Abstract
Consider a linear [n, k, d]_q code C. We say that the ith coordinate of C has locality r if the value at this coordinate can be recovered from accessing some other r coordinates of C. Data storage applications require codes with small redundancy, low locality for information coordinates, large distance, and low locality for parity coordinates. In this paper, we carry out an in-depth study of the relations between these parameters. We establish a tight bound for the redundancy n - k in terms of the message length, the distance, and the locality of information coordinates. We refer to codes attaining the bound as optimal. We prove some structure theorems about optimal codes, which are particularly strong for small distances. This gives a fairly complete picture of the tradeoffs between codeword length, worst-case distance, and locality of information symbols. We then consider the locality of parity check symbols and erasure correction beyond worst-case distance for optimal codes. Using our structure theorem, we obtain a tight bound for the locality of parity symbols possible in such codes for a broad class of parameter settings. We prove that there is a tradeoff between having good locality and the ability to correct erasures beyond the minimum distance.
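To make the definition of locality concrete, the following is a minimal sketch on a toy binary code that is not taken from the paper: four message bits are split into two groups of two, and each group gets one XOR parity bit. The names `encode`, `recover`, `GROUPS`, and `min_distance` are hypothetical helpers chosen for illustration. The abstract does not spell out the redundancy bound; the sketch assumes the form in which it is commonly stated, n - k >= ceil(k/r) + d - 2, and checks that this toy code meets it with equality.

```python
from itertools import product
from math import ceil

def encode(msg):
    """Toy [6, 4] binary code: two groups of 2 data bits, each with an XOR parity."""
    x1, x2, x3, x4 = msg
    return (x1, x2, x1 ^ x2, x3, x4, x3 ^ x4)

# Hypothetical repair groups: the three coordinates in a group XOR to zero,
# so any one of them is determined by the other r = 2.
GROUPS = [(0, 1, 2), (3, 4, 5)]

def recover(codeword, i):
    """Recover coordinate i by reading only the other two coordinates in its group."""
    group = next(g for g in GROUPS if i in g)
    a, b = (j for j in group if j != i)
    return codeword[a] ^ codeword[b]

def min_distance():
    """Minimum distance of a linear code = minimum weight of a nonzero codeword."""
    return min(sum(encode(m)) for m in product((0, 1), repeat=4) if any(m))

n, k, r = 6, 4, 2
d = min_distance()

# Assumed form of the redundancy bound: n - k >= ceil(k / r) + d - 2.
redundancy_bound = ceil(k / r) + d - 2

# Every coordinate (information and parity) has locality 2 in this toy code.
locality_ok = all(
    recover(encode(m), i) == encode(m)[i]
    for m in product((0, 1), repeat=4)
    for i in range(n)
)
```

Here every coordinate, parity included, is repaired from two others, and the redundancy n - k equals the assumed bound, so under that assumption the toy code would count as optimal in the abstract's sense.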