International Journal of Engineering & Technology, 7 (4.7) (2018) 180-183



# **International Journal of Engineering & Technology**

Website: www.sciencepubco.com/index.php/IJET





# The Error-Correcting Coding in Information Storage Modules with Increased Radiation Resistance

Matveev V. M., Litvinenko R. S.

#### **Abstract**

This article studies and analyzes various noise-resistant code structures designed for use in miniature memory drives on spacecraft. Error-correcting coding is aimed at correcting memory errors that occur due to ionizing radiation. The first part of the article describes the general memory architecture that uses error-correcting coding. The second part considers linear code constructions, such as the Hamming code, convolutional codes, Reed-Solomon (RS) codes and LDPC codes, as well as nonlinear code constructions, which are promising means of correcting memory errors (Vasiliev code, Phelps code, switching code, AMD code).

Based on the research and analysis data, the conclusion is made about the most suitable code design for the development of the information storage module. It should be noted that the determining requirement for choosing the code for the drive used on the spacecraft is the presence of simple decoding algorithms that allow high decoding speed and low energy consumption.

Keywords: error-correcting coding, Hamming code, linear code, nonlinear code, Phelps code, switching code, Vasiliev code.

#### 1. Introduction

Obtaining satellite images of the Earth's surface is a pressing problem today. It creates a need to process and store large amounts of photo and video data, which in turn affects the requirements for the electronic units that provide temporary data storage. In space, a number of factors cause standard equipment, including memory drives, to fail. One of the problems caused by ionizing radiation is soft errors, in which a stored 1 becomes 0 or vice versa (single-event upset, SEU).

When developing miniature memory drives with high radiation resistance for onboard electronics, it is necessary to take information errors into account and to find ways of dealing with them. An effective way to counter SEUs is noise-resistant coding.

Error-correcting coding is applied to detect and correct errors, i.e. the message is encoded so that the receiving side knows whether an error has occurred and can correct the errors if they occur.

In essence, coding adds extra verification information to the original information. The transmitting side uses an encoder, and the receiving side uses a decoder to retrieve the original message.

#### 2. Method

Flash memory is widely used in modern computer technology due to its high data exchange speed and storage density, low power consumption, and significant mechanical endurance. At present, flash memory is the most promising memory type for onboard storage in aerospace systems and in systems operating under hard space radiation.

Two main flash memory technologies are known: NAND and NOR. Compared to NOR, NAND technology has a shorter erase time, takes less space on the chip, and has a lower cost per bit. NOR technology allows quick access to each cell individually, whereas NAND read/write operations work on whole data pages of several Kbits each. At the moment NAND flash memory technology is considered more advanced. At the same time, each of the technologies has its own applications:

- NOR is used as memory of microprocessors, intended for storage of executable code and small auxiliary data,
- NAND is used to store large data amounts.

Increasing data storage density and reducing the cost per bit has traditionally been achieved by reducing the size of the transistors. However, in 1997 multi-level flash memory (multi-level cell, MLC) was developed [1]. MLC technology is based on precise control of the number of electrons (the charge) held on the floating gate of the memory cell transistor. This makes it possible to set several transistor threshold voltage levels corresponding to different logic values. In other words, the technology allows several bits to be stored in a single flash memory cell. Currently, cells that store three bits (TLC, triple-level cell) are used.

Increasing the number of threshold voltage levels reduces memory reliability, because the difference between adjacent levels becomes smaller. The probability of error per bit for MLC NAND is about $10^{-6}$, which is at least a hundred times greater than the corresponding figure for single-level cell (SLC) memory [2]. In addition, the following phenomena negatively affect the reliability of both SLC and MLC memory [3-6]:

- perturbations caused by writing/reading;
- error caused by exceeding the data retention period;
- accumulation of failed blocks;
- limited number of write/erase cycles;
- floating gate charge leakage;
- high temperature exposure;
- radiation exposure.

Various methods are used to reduce the probability of flash memory errors and to extend the service life of the memory. The main ones are the following:



- wear leveling, which distributes data across different memory blocks so that all blocks are used approximately the same number of times [7],
- bad block management, which monitors the usage of memory blocks, identifies blocks that show a non-recoverable ("hard") error, and avoids writing data to those blocks [8].

Although these methods significantly extend the service life of the memory, they cannot correct so-called "soft" errors caused by the transition of a memory cell to another state. Noise-resistant codes are capable of detecting and correcting such errors. The effective use of noise-resistant coding for detecting and correcting errors in solid-state memory, and particularly in flash memory, is one of the most important tasks in the creation of reliable spacecraft drives.

The defining requirement for noise-resistant codes used in spacecraft memory is the presence of simple decoding algorithms that allow high decoding speed and low energy consumption.

#### 2.1 Linear Coding

The Hamming code is a binary error-detecting and error-correcting code that corrects any single-bit error; detecting double errors at the same time requires an additional overall parity bit (SEC-DED). This code is recommended for systems with a low probability of multiple errors in a simple data structure (one erroneous bit in a data byte). The Hamming code is described by the relation $2^k \ge m+k+1$, where m is the number of data bits, k is the number of check bits, and m+k is the total number of bits in the encoded word. When this relation holds, every single-bit error produces a distinct nonzero syndrome, so all single-bit errors can be corrected. In individual cases, depending on the number of check bits, more than a single-bit error can be corrected [9].
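The relation $2^k \ge m+k+1$ can be evaluated directly to find the smallest number of check bits for a given data width. The following is a minimal sketch; the function name `min_check_bits` is introduced here only for illustration and does not come from the article.

```python
def min_check_bits(m: int) -> int:
    """Smallest k with 2**k >= m + k + 1 for m data bits."""
    if m < 1:
        raise ValueError("m must be a positive number of data bits")
    k = 1
    while 2 ** k < m + k + 1:
        k += 1
    return k


if __name__ == "__main__":
    for m in (4, 8, 12, 64):
        k = min_check_bits(m)
        print(f"m = {m:2d} data bits -> k = {k} check bits, word length {m + k}")
```

For a data byte (m = 8) the relation gives k = 4, which matches the 12-bit Hamming word discussed in the Results section.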

Advantages and disadvantages of Hamming codes:

- simple to design and easy to decode;
- can only fix single errors.

Figure 1 shows the Hamming code of a 12-bit word and the check bits.

Coded word: d11 d10 d9 d8 d7 d6 d5 d4 d3 d2 d1 d0

Figure 1: Hamming code of a 12-bit word and check bits

The convolutional code is common in spacecraft solid-state drives because it provides good resistance to isolated impulse noise. A convolutional code is created by passing the information sequence through a linear shift register with a finite number of states. Figure 2 shows the convolutional encoder.



Figure 2: Convolutional encoder
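The shift-register description of the convolutional encoder can be sketched in a few lines. The parameters below (rate 1/2, constraint length 3, generator polynomials 7 and 5 in octal) are an assumption chosen as a common textbook example; the article does not specify the encoder parameters of Figure 2.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Feedforward convolutional encoder.

    Each input bit is shifted into a register of `constraint_length` bits;
    every generator polynomial selects register taps whose XOR forms one
    output bit, so the code rate is 1 / len(generators).
    """
    mask = (1 << constraint_length) - 1
    state = 0  # register contents, newest bit in the least significant position
    out = []
    for b in bits:
        state = ((state << 1) | (b & 1)) & mask
        for g in generators:
            out.append(bin(state & g).count("1") % 2)  # XOR of the tapped bits
    return out


if __name__ == "__main__":
    message = [1, 0, 1, 1]
    print(conv_encode(message))
```

Because every output bit is an XOR of register bits, the encoder is linear: encoding the XOR of two messages gives the XOR of their encodings.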

Advantages and disadvantages of convolutional codes:

- work effectively in the white noise channel;
- do not handle burst errors well.

Reed-Solomon (RS) codes are non-binary codes in which the code word is treated as a sequence of symbols, each a group of bits (most often hexadecimal or octal characters, or bytes). An RS code is able to detect and correct multiple and consecutive data errors [10].

Coding that corrects multiple errors in data stored in flash memory becomes more important as memory density increases. LDPC codes are well known for their ability to approach the capacity of a channel with additive white Gaussian noise and are promising for use in flash read channels. Some LDPC codes are used in spacecraft storage, as they meet the requirements of reliable data storage [11].

Advantages and disadvantages:

- correction of multiple errors;
- difficult to decode.

As noted above, one of the main sources of errors in spacecraft memory storage is hard ionizing radiation. Research on noise-tolerant codes indicates that nonlinear codes may be more promising than representatives of the different classes of linear codes for countering radiation-induced errors on spacecraft.

#### 2.2 Nonlinear Coding

In addition, a number of researchers believe that non-linear codes can compete with linear ones in case of errors associated with a limited flash memory resource for writing/erasing. The possibility of using nonlinear codes to improve the reliability of flash memory is a topical problem.

Thus, the analysis [16-19] shows that representatives of different classes of nonlinear codes were used or recommended for use in flash memory to provide:

- integrity and noise immunity;
- protection against fault-injection attacks.

The first construction of nonlinear code, which was proposed to protect flash memory, is based on the extension of Vasiliev code (perfect nonlinear Hamming code).

Let $p(u)$ be the linear parity check of $u \in GF(2^m)$. Let $V$ be a perfect, not necessarily linear, Hamming code of length $m = 2^r - 1$ with $k_V = m - r$ information symbols. Let $f: V \to \{0, 1\}$ be an arbitrary map such that $f(0) = 0$, $0 \in GF(2^m)$, and $f(v) \oplus f(v') \neq f(v \oplus v')$ for some $v, v' \in V$. Then the code $C$ defined as

$$C = \{(u, u \oplus v, p(u) \oplus f(v)) : u \in GF(2^{m}), v \in V\} \tag{1}$$

is a perfect nonlinear Hamming code and is called the Vasiliev code. It is a partially robust code with parameters $(n=2m+1, k=2m-r, d_0=3)$; it has m undetectable errors, and the probability of an undetected error is $P_{undet}=P_f$, where $P_f$ is the degree of nonlinearity of the map $f$.
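Construction (1) can be checked numerically on a small instance. The sketch below is only an illustration under stated assumptions: $V$ is taken to be the linear (7,4) Hamming code ($m = 7$, $r = 3$), and $f(v)$ is chosen as the AND of the first two information bits of $v$, a hypothetical choice that satisfies $f(0) = 0$ and is nonlinear. All function names are introduced here for illustration. The resulting code has length $2m + 1 = 15$.

```python
def parity(x: int) -> int:
    """Linear parity check p(u): XOR of all bits of x."""
    return bin(x).count("1") % 2


def hamming74():
    """All 16 code words of a systematic (7,4) Hamming code as 7-bit integers."""
    words = []
    for d in range(16):
        d0, d1, d2, d3 = [(d >> i) & 1 for i in range(4)]
        bits = [d0, d1, d2, d3, d0 ^ d1 ^ d3, d0 ^ d2 ^ d3, d1 ^ d2 ^ d3]
        words.append(sum(b << i for i, b in enumerate(bits)))
    return words


def f(v: int) -> int:
    """Assumed nonlinear map: AND of the two lowest information bits, f(0) = 0."""
    return (v & 1) & ((v >> 1) & 1)


def vasiliev_code(V, m, f):
    """Code words (u, u XOR v, p(u) XOR f(v)) packed into (2m+1)-bit integers."""
    code = []
    for u in range(1 << m):
        for v in V:
            check = parity(u) ^ f(v)
            code.append(u | ((u ^ v) << m) | (check << (2 * m)))
    return code


def is_perfect(code, n):
    """True if radius-1 spheres around the code words tile the whole space."""
    covered = set()
    for c in code:
        covered.add(c)
        for i in range(n):
            covered.add(c ^ (1 << i))
    return len(code) * (n + 1) == 2 ** n and len(covered) == 2 ** n


def is_linear(code):
    """True if the XOR of any two code words is again a code word."""
    words = set(code)
    return all((a ^ b) in words for a in code for b in code)


if __name__ == "__main__":
    C = vasiliev_code(hamming74(), 7, f)
    print(len(C), is_perfect(C, 15), is_linear(C))
```

The sphere-tiling test confirms single-error correction with $d_0 = 3$ without comparing all pairs of code words, which keeps the check fast even in plain Python.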

There are two further nonlinear analogues of the Hamming code: the Phelps code and the switching code. The parameters of the considered codes are given in Table 1.

Table 1: Parameters of the considered codes

| Code structure  | $n$       | $k$         | $w_d$        | $P_{undet}$            | $d_0$ |
|-----------------|-----------|-------------|--------------|------------------------|-------|
| Vasiliev code   | $2^{r}-1$ | $2^{r}-1-r$ | $2^{r-1}-1$  | 0.5                    | 3     |
| Phelps code     | $2^{r}-1$ | $2^{r}-1-r$ | $2^{r} - 2r$ | $P_{\alpha}$           | 3     |
| Considered code | $2^{r}-1$ | $2^{r}-1-r$ | $2^{r-1}-1$  | $1 - 2^{-2^{r-1}+1+r}$ | 3     |

$w_d$ – number of undetectable errors

$P_{\alpha}$ – measure of nonlinearity of the permutation $\alpha$

It should be noted that the considered codes are superior to the linear Hamming code. The codes described above are effective for noise immunity for single-level flash memory, where multiple errors do not need to be corrected [12-14].

The advantage of such codes is their high degree of noise immunity; however, they are difficult to decode and are therefore not suitable for the memory micromodule being developed.

For AMD (algebraic manipulation detection) codes, the algorithm for detecting and correcting low-multiplicity errors in small fields has low computational complexity and a short time delay. Figure 3 shows the architecture of a flash memory using the AMD code.



Figure 3: General architecture of a flash memory using AMD code [15]

However, for large values of the parameters k and m, computing the function f(x, y) can become very complicated. Therefore, it is recommended to use this code to protect relatively small amounts of data. Thus, AMD codes can be used to design reliable single-level NOR flash memory that is resistant to both errors and fault-injection attacks.

Linear codes are currently used on spacecraft, but nonlinear structures may soon compete with them, because they have fewer undetectable errors and fewer errors that are guaranteed to lead to incorrect decoding. The use of nonlinear codes can extend the service life of the memory by reducing the probability of an undetected error.

#### 3. Results

According to the research, the Hamming code is the most suitable for the memory micromodule. The purpose of the experiment is to demonstrate the feasibility of this code design.

It is known that the key requirement for a code used in spacecraft onboard equipment is the presence of simple decoding algorithms, but the number of uncorrectable errors is also an important parameter, so it is necessary to establish whether the Hamming code is acceptable in this respect. 8-bit static random access memory devices with byte organization, which store the software executed by satellite computers, are traditionally protected by a hardware-implemented (12,8) Hamming code: 4 check bits are formed for each data byte, giving 12 bits in total. This code is capable of detecting and correcting any single error in a 12-bit Hamming word.
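The behaviour of a (12,8) Hamming code can be illustrated with a short software model. This is a sketch only: it assumes the classic layout with check bits at positions 1, 2, 4 and 8 of the 12-bit word, which may differ from the bit ordering of the hardware implementation, and the names `encode`, `decode` and `syndrome` are introduced here for illustration.

```python
DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11, 12)  # 1-based positions of the data bits
CHECK_POSITIONS = (1, 2, 4, 8)                # 1-based positions of the check bits


def syndrome(word: int) -> int:
    """XOR of the 1-based positions of all set bits; 0 for a valid code word."""
    s = 0
    for pos in range(1, 13):
        if (word >> (pos - 1)) & 1:
            s ^= pos
    return s


def encode(byte: int) -> int:
    """Encode one data byte into a 12-bit Hamming word."""
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (byte >> i) & 1:
            word |= 1 << (pos - 1)
    s = syndrome(word)
    for p in CHECK_POSITIONS:
        if s & p:  # setting the check bit at position p cancels that syndrome bit
            word |= 1 << (p - 1)
    return word


def decode(word: int):
    """Return (byte, error_position); error_position is 0 if no error was seen."""
    s = syndrome(word)
    if s > 12:
        raise ValueError("syndrome points outside the word: uncorrectable error")
    if s:
        word ^= 1 << (s - 1)  # flip the bit the syndrome points at
    byte = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> (pos - 1)) & 1:
            byte |= 1 << i
    return byte, s


if __name__ == "__main__":
    stored = encode(0xA7)
    upset = stored ^ (1 << 5)  # simulate a single-event upset at position 6
    print(decode(upset))
```

Decoding needs only XOR operations and one bit flip, which is consistent with the article's emphasis on simple decoding at low energy cost.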

#### 3.1 Modular Redundancy

An alternative to EDAC (error detection and correction) protection, implemented on an FPGA, is modular redundancy. Here four memory drives with an 8-bit architecture store copies of the data. After reading, all copies of a byte pass through a voting circuit (implemented in a single gate array), so that any single error is discarded by majority rule. This scheme offers excellent protection against SEU effects and against failures in several bits of a single word, but it entails an extra memory cost of 300%. This cost is offset by the Hamming code, since this code has the lowest power consumption. As a result, it is advisable to use modular redundancy in conjunction with the Hamming code.
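The voting step can be modelled as a bitwise majority over the stored copies. The sketch below is an assumption about how such a voter might behave, not a description of the FPGA circuit: with four copies a 2-2 split is possible, so the hypothetical `vote` function also reports the bit positions where no majority exists.

```python
def vote(copies, width=8):
    """Bitwise majority vote over copies of one stored word.

    Returns (voted_value, tie_mask). A bit of tie_mask is set where the
    copies are evenly split, so that bit of voted_value is not trustworthy.
    """
    voted = 0
    tie_mask = 0
    for bit in range(width):
        ones = sum((c >> bit) & 1 for c in copies)
        zeros = len(copies) - ones
        if ones > zeros:
            voted |= 1 << bit
        elif ones == zeros:
            tie_mask |= 1 << bit
    return voted, tie_mask


if __name__ == "__main__":
    good = 0xA5
    # one copy suffers an upset in bit 4; the other three outvote it
    print(vote([good, good, good ^ 0x10, good]))
```

Any single corrupted copy is outvoted three to one in every bit position, which corresponds to the single-error rejection described above.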

The structural scheme of the micromodule is shown in Figure 4.



Figure 4: Micromodule structural scheme

The micromodule main parts are:

- 4 NAND-flash chips for data storage;
- BMK or ULA (uncommitted logic array) controller chip;
- 1469TK035 chip for protection against the latch-up effect.

It is established that the reading time does not exceed 33 ns, while the reading time of a single flash memory chip is 25 ns.

#### 3.2 Uncorrectable Errors

According to the experience of the Guildford Space Center with different encoding schemes and device configurations for protecting data and software stored in memory built from standard commercial chips on satellites in low Earth orbit, memory protected by the Hamming code shows an SEU rate of $0.51 \times 10^{-6} \ bits^{-1} \ day^{-1}$, with a rate of "serious" (uncorrected) errors of $5.0 \times 10^{-9} \ bits^{-1} \ day^{-1}$. Multi-bit upsets within a single word were not observed. This error rate is acceptable, since most of the data files are downloaded to the ground within a day.
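To give a sense of scale, the quoted per-bit rates can be converted into expected events per day for a memory of a given size. The memory size used below (2^30 bits) is a hypothetical value chosen only for illustration and is not taken from the article; the calculation also assumes that upsets occur independently and uniformly across the bits.

```python
SEU_RATE = 0.51e-6         # upsets per bit per day (figure quoted in the text)
UNCORRECTED_RATE = 5.0e-9  # uncorrected errors per bit per day (quoted in the text)


def expected_events_per_day(rate_per_bit_day: float, memory_bits: int) -> float:
    """Expected number of events per day for a memory of memory_bits bits."""
    return rate_per_bit_day * memory_bits


if __name__ == "__main__":
    memory_bits = 2 ** 30  # hypothetical memory size, not from the article
    upsets = expected_events_per_day(SEU_RATE, memory_bits)
    uncorrected = expected_events_per_day(UNCORRECTED_RATE, memory_bits)
    print(f"expected upsets per day:             {upsets:.1f}")
    print(f"expected uncorrected errors per day: {uncorrected:.2f}")
    print(f"share of upsets left uncorrected:    {uncorrected / upsets:.2%}")
```

With the quoted figures, roughly one upset in a hundred remains uncorrected, independent of the memory size assumed.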

#### 4. Conclusion

Based on the research and analysis of various linear and nonlinear code constructions, it can be concluded that the Hamming code is the most suitable for the purposes of the project. The Hamming code has simple decoding algorithms and offers high decoding speed at low energy consumption. Nonlinear codes may be more promising than representatives of the different classes of linear codes for countering radiation-induced errors on spacecraft, but they are not suitable for the memory storage micromodule being developed because their decoding algorithms are more complicated. The experimental research showed that the reading time does not exceed 33 ns when the Hamming code and modular redundancy are used. Further study will be aimed at reducing the read/write operation time of the module and at radiation tolerance testing.

#### Acknowledgements

The work was carried out with the financial support of the Ministry of Education and Science of Russia under Agreement no. 14.574.21.0155 (unique identifier of applied research RFMEFI57417X0155). The work was carried out using the equipment of CCU "Functional testing and diagnostics of micro- and nanosystem technique" on the basis of SMC "Technological center".

#### References

- [1] G. Atwood, A. Fazio, D. Mills, B. Reaves. / Intel StrataFlash™ memory technology overview // Intel Technology Journal. 1997.
- [2] R. Dan, R. Singer. / White paper: Implementing MLC NAND flash for cost-effective, high capacity memory // M-Systems. 2003.
- [3] J. Bellorado, E. Yaakobi, A. Jiang. // Non-Volatile Memory Workshop, Center for Magnetic Recording Research, University of California, San Diego. March 3rd. 2013.
- [4] L.M. Grupp, A.M. Caulfield, J. Coburn, et al. / Characterizing flash memory: Anomalies, observations, and applications // In Proc. 41st IEEE/ACM Int. Symp. Microarch. (MICRO). 2009. P. 24-33.
- [5] N. Mielke, T. Marquart, N. Wu, et al. / Bit error rate in NAND flash memories // In Proc. 46th Annu. Int. Reliab. Phys. Symp. 2008.
- [6] P. Desnoyers. / Empirical evaluation of NAND-flash memory performance // SIGOPS Oper. Syst. Rev. 2010. V. 44. N. 1. P. 50-54.
- [7] N. Agrawal, V. Prabhakaran, T. Wobber, et al. / Design tradeoffs for SSD performance // In Proc. USENIX Annu. Techn. Conf. 2008. P. 57-70.
- [8] T. Jianzhong, Q. Qinglin, B. Feng, et al. / Study and design on high reliability mass capacity memory // In Proc. IEEE Int. Conf. Softw. Eng. Service Sci. 2010. P. 701-704.
- [9] F. Sun, S. Devarajan, K. Rose, T. Zhang. / Design of on-chip error correction systems for multilevel NOR and NAND flash memories // IET Circuits, Devices & Systems. 2007. V. 1. N. 3. P. 241-249.
- [10] E. Cota, L. Carro, M. Lubaszewski, R. Velazco, S. Rezgui. / Synthesis of 8051-like Microcontroller Tolerant to Transient Faults // In: 1st IEEE Latin America Test Workshop (LATW), Brazil. 2000.
- [11] Yu. L. Sagalovich. / Introduction to algebraic codes. 3rd ed., revised and expanded // M.: IPPI RAS, 2014. 310 p. ISBN 978-5-901158-24-1.
- [12] Z. Wang, M. Karpovsky, K. J. Kulikowski. / Replacing linear Hamming codes by robust nonlinear codes results in a reliability improvement of memories // In Proc. IEEE/IFIP Int. Conf. on Dependable Systems and Networks. 2009. P. 514-523.
- [13] Z. Wang, M. Karpovsky, A. Joshi. / Reliable MLC NAND flash memories based on nonlinear t-error-correcting codes // In Proc. IEEE/IFIP Int. Conf. on Dependable Systems and Networks. 2010.
- [14] Z. Wang, M. Karpovsky, K. J. Kulikowski. / Design of memories with concurrent error detection and correction by nonlinear SEC-DED codes // Journal of Electronic Testing. 2010. V. 26. P. 559-580.

- [15] P. Luo, Z. Wang, M. G. Karpovsky. / Secure NAND flash architecture resilient to strong fault-injection attacks using algebraic manipulation detection code // In Proc. Int. Conf. on Security and Management (SAM). 2013.
- [16] T. Etzion, A. Vardy. / On perfect codes and tilings: problems and solutions // SIAM J. Discrete Math. 1998. V. 11. N. 2. P. 205-223.
- [17] M. Mollard. / A generalized parity function and its use in the construction of perfect codes // SIAM J. Alg. Disc. Meth. 1986. V. 7. N. 1. P. 113-115.
- [18] Phelps, K. T. / A General Product Construction for Error Correcting Codes // SIAM J. Algebraic Discrete Methods.1984.N.5.P. 224-228.