Abstract: A system multiplexing and recovery mechanism is proposed. Multiple system image areas (each containing a kernel and a file system) are partitioned on a single NAND Flash, and backup and recovery mechanisms are added for the system images. When the running image area fails, the watchdog triggers a restart and the automatic backup and recovery mechanism is activated, so that a bootable system is available at all times. The method was tested and verified on an S5PV210 processor running Linux. When a fault occurs, the system still starts normally and the failed system image is recovered automatically. The test results demonstrate the feasibility of the method.

Keywords: embedded; U-Boot; Linux; backup; recovery

0 Preface

Although Linux has many advantages as an embedded operating system, the characteristics of embedded systems, such as the diversity and complexity of their application range, make embedded products difficult to maintain. After an embedded system is deployed in a real environment, unexpected events such as power failure, user errors, malicious changes to or deletion of data, and defects that could not be fully tested during development can cause functional failures, and in severe cases paralyze the system. Manual updates may be inconvenient because of factors such as the installation location. An automatic backup and recovery mechanism is therefore the easiest and most effective way to restore data in an embedded system [1], and it has gradually become an important issue in the practical application of embedded systems.

Currently, technologies such as dual-system hot backup [2] and disaster recovery [3] are commonly used as high-reliability design schemes for embedded systems. Because they rely on two independent systems, they increase design difficulty and cost [4]. To address these problems in current embedded backup technology, this paper analyzes the embedded U-Boot and Linux kernel and stores multiple copies of the embedded system image on a single device, with only one system running at a time. When the running operating system fails, the system backup and recovery mechanism is triggered: the next available backup partition is used to overwrite the failed system partition, and the next available system is then started. The scheme requires no backup equipment, which greatly reduces cost and power consumption and improves system stability and reliability.

1 Overall system design

1.1 Overall System Design

The overall backup and recovery scheme in the embedded Linux system is shown in Figure 1.

The system consists of six parts: x-loader (SPL), U-Boot, U-Boot Env, Judge-Area, Kernel (uImage) and Rootfs (ubi.img). Each mtd is a partition on the same NAND Flash, and uImage and ubi.img have multiple copies on the device. The features and functions of the system components are described as follows:

(1) x-loader is a first-level boot program [5], U-Boot is a second-level boot program [6], and U-Boot Env stores kernel boot parameters.

(2) Judge-Area stores the system backup and recovery parameters. It can be accessed from both U-Boot and Linux.

(3) Kernel is the Linux kernel [7]. Rootfs uses the open-source Unsorted Block Image File System (UBIFS), which is especially suitable for embedded systems [8].

A, B, and C are system image areas that store the kernel and file system. U-Boot decides which system image area to start from the parameters in the Judge-Area, and records any fault condition during startup back into the Judge-Area. When the running system fails or crashes, the watchdog mechanism resets the hardware. The automatic backup and recovery mechanism under U-Boot then overwrites the failed system image area with another working backup, so that a usable system can always be started.

The NAND Flash fitted to current Linux systems is relatively large, while a trimmed kernel and a slim UBI file system together occupy 20 MB or less [9]. This makes it possible to keep multiple backups of the Linux system.

1.2 Backup and Recovery Mechanism

The following uses triple backup and recovery as an example to illustrate the backup and recovery process of the system. The entire system image is stored in NAND Flash in three places. The specific process is as follows:

(1) The system powers up or resets, and U-Boot starts. Assume that r_active_1 (boot flag) and b_success_1 (kernel startup success flag) of area A are "yes" and that the corresponding flags of areas B and C are "no". Until the system image has started successfully, the boot is treated as failed. U-Boot therefore sets r_active_1 and b_success_1 of area A to "no", and rec_kernel_1 (kernel recovery flag) and rec_fs_1 (file system recovery flag) to "yes". It also sets r_active_2 and b_success_2 of area B to "yes".

(2) U-Boot then loads the system in area A. If Linux runs normally, all flag values are restored from within Linux. If the boot fails, the watchdog resets the system after a timeout and U-Boot runs again. It finds that b_success_1 is "no", indicating that the last boot failed, and that rec_kernel_1 and rec_fs_1 are "yes", indicating that the kernel and file system of area B must be copied over area A. After the copy completes, the recovery flags are cleared and b_success_1 is reset. The system is then reset by the watchdog.

(3) U-Boot finds that r_active_2 and b_success_2 are "yes", so the system in area B is selected. Before the system starts, U-Boot sets r_active_2 and b_success_2 of area B to "no", rec_kernel_2 and rec_fs_2 to "yes", and r_active_3 of area C to "yes".

(4) U-Boot loads the system in area B. If Linux runs normally, all flags are restored to their initial values from within Linux. If area B also fails, the system is reset and U-Boot overwrites area B with area C. After the recovery, the recovery flags are cleared and b_success_2 is reset, and the system is then reset by the watchdog. In the same way the system is started from area C; if area C also fails, booting starts again from area A, so the areas are selected cyclically.
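The decision U-Boot takes at each start in steps (1) to (4) can be sketched as a pure function over the flags. This is a minimal sketch under stated assumptions, not the paper's implementation: the struct layout, the names uboot_decide, boot_decision and NUM_AREAS, and the rule that recovery is checked before boot selection are all hypothetical; only the four flag names come from the text.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_AREAS 3   /* areas A, B, C are numbered 1, 2, 3 (assumption) */

/* Hypothetical in-memory copy of the per-area Judge-Area flags. */
struct area_flags {
    bool r_active;    /* boot flag */
    bool b_success;   /* kernel startup success flag */
    bool rec_kernel;  /* kernel recovery flag */
    bool rec_fs;      /* file system recovery flag */
};

enum boot_action { ACTION_NONE, ACTION_RECOVER, ACTION_BOOT };

struct boot_decision {
    enum boot_action action;
    int area;    /* area to recover or to boot */
    int source;  /* for ACTION_RECOVER: area whose image is copied over `area` */
};

/* f[0] is unused so that f[x] matches the flag suffix _x in the text. */
static struct boot_decision uboot_decide(const struct area_flags f[NUM_AREAS + 1])
{
    struct boot_decision d = { ACTION_NONE, 0, 0 };
    int x;

    /* A failed boot leaves b_success "no" and both recovery flags "yes". */
    for (x = 1; x <= NUM_AREAS; x++) {
        if (!f[x].b_success && f[x].rec_kernel && f[x].rec_fs) {
            d.action = ACTION_RECOVER;
            d.area = x;
            d.source = (x < NUM_AREAS) ? x + 1 : 1;  /* next area, cyclically */
            return d;
        }
    }
    /* Otherwise boot the area marked active and successful. */
    for (x = 1; x <= NUM_AREAS; x++) {
        if (f[x].r_active && f[x].b_success) {
            d.action = ACTION_BOOT;
            d.area = x;
            return d;
        }
    }
    return d;  /* no bootable area found */
}
```

A caller would load the flags from the Judge-Area, call uboot_decide(), and then either copy the source image over the failed area and reset, or update the flags and boot the selected area.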

2 System implementation

2.1 Judge-Area Basic Design

Judge-Area is a separate area on the NAND Flash that stores the system backup and recovery parameters. It is similar to the U-Boot Env area.

In order to view and modify these parameters under U-Boot, all the parameters of this area are defined in a judge_tab array and then imported into the hash table judge_htab.

struct hsearch_data judge_htab;

himport_r(&judge_htab, (char *)judge_tab,
          sizeof(judge_tab), '\0', 0);

To facilitate debugging and manual backup and recovery operations, three U-Boot serial console commands are added: the print command printjudge, the modify command setjudge, and the save command savejudge. The core of the printjudge command is:

hsearch_r(e, FIND, &ep, &judge_htab);

len = hexport_r(&judge_htab, '\n', &res, 0);

Similarly, the modify command setjudge is a simple operation on judge_htab. Because it changes only the values held in memory, the modified values are not yet stored in the NAND Flash. The save command savejudge stores them, as follows: hexport_r exports the table contents into a new buffer, judge_htab_new; nand_erase_opts erases the Judge-Area; and nand_write writes the data to the area. To improve the reliability of reads and writes, hardware bad block detection and ECC checking are enabled.
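The split between setjudge (memory only) and savejudge (export, erase, write) can be illustrated with a small host-side model. This is a sketch under stated assumptions: it does not use the U-Boot hash table or NAND functions, the array judge_flash merely stands in for the Judge-Area, and all names and sizes here (judge_set, judge_export, judge_save, JUDGE_AREA_SIZE) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define JUDGE_MAX_VARS   16    /* hypothetical table capacity */
#define JUDGE_AREA_SIZE  256   /* hypothetical Judge-Area payload size */

struct judge_var {
    char name[24];
    char value[8];
};

static struct judge_var judge_vars[JUDGE_MAX_VARS]; /* in-memory table */
static int judge_count;
static char judge_flash[JUDGE_AREA_SIZE];           /* stand-in for the NAND area */

/* setjudge analogue: updates the in-memory table only. */
static int judge_set(const char *name, const char *value)
{
    int i;

    for (i = 0; i < judge_count; i++)
        if (strcmp(judge_vars[i].name, name) == 0)
            break;
    if (i == judge_count) {
        if (judge_count == JUDGE_MAX_VARS)
            return -1;
        snprintf(judge_vars[i].name, sizeof(judge_vars[i].name), "%s", name);
        judge_count++;
    }
    snprintf(judge_vars[i].value, sizeof(judge_vars[i].value), "%s", value);
    return 0;
}

/* Serialises the table as "name=value\0...\0"; returns bytes used or -1. */
static int judge_export(char *buf, size_t size)
{
    size_t used = 0;
    int i;

    if (size == 0)
        return -1;
    for (i = 0; i < judge_count; i++) {
        int n = snprintf(buf + used, size - used, "%s=%s",
                         judge_vars[i].name, judge_vars[i].value);
        if (n < 0 || (size_t)n + 2 > size - used)
            return -1;          /* no room for entry, separator and end marker */
        used += (size_t)n + 1;  /* keep the NUL as the separator */
    }
    buf[used++] = '\0';         /* an empty string marks the end of the table */
    return (int)used;
}

/* savejudge analogue: export to a new buffer, erase, then write. */
static int judge_save(void)
{
    char judge_new[JUDGE_AREA_SIZE];
    int len = judge_export(judge_new, sizeof(judge_new));

    if (len < 0)
        return -1;
    memset(judge_flash, 0xFF, sizeof(judge_flash)); /* erased NAND reads as 0xFF */
    memcpy(judge_flash, judge_new, (size_t)len);
    return len;
}
```

Exporting into a separate buffer before erasing means a failed export leaves the stored copy untouched, which is one plausible reason for the export-then-erase ordering described above.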

2.2 U-Boot Backup and Recovery Design

The design of the U-Boot backup and recovery mechanism is shown in Figure 2.

The function judge_get() calls hsearch_r to query the hash table judge_htab and reads the address addr_x and size size_x of the next system image from the Judge-Area.

The function img_recover() calls img_read and img_write to copy the next system image into the current system image area.

The Judge-Area parameter reset function judge_reset() calls setjudge to set b_success_x to "yes" and rec_fs_x and rec_kernel_x to "no".

The U-Boot Env parameter reset function uEnv_reset() calls setenv to set nand_src_addr, nand_img_siz, and nand_root to correspond to "n_kaddr_x", "n_ksize_x", and "ubi0:rootfs rw ubi.mtd=y,2048" respectively. Here setenv is U-Boot's own function [10], and y (y = 2*x + 6) is the number of the file system partition.
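As an illustration of the mapping y = 2*x + 6, the nand_root string for image x could be composed as below. This is a minimal sketch: the helper names rootfs_mtd_index and format_nand_root are hypothetical, and writing the parameter as ubi.mtd=y,2048 without a space is an assumption based on the kernel's ubi.mtd=<mtd>,<VID header offset> syntax.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* File system partition number for system image x, per the text: y = 2*x + 6. */
static int rootfs_mtd_index(int x)
{
    return 2 * x + 6;
}

/* Writes the nand_root value for image x into buf; returns snprintf's result. */
static int format_nand_root(char *buf, size_t size, int x)
{
    return snprintf(buf, size, "ubi0:rootfs rw ubi.mtd=%d,2048",
                    rootfs_mtd_index(x));
}
```

In U-Boot the resulting string would then be passed to setenv for nand_root; that call is omitted here so that the sketch compiles on a host machine.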

The parameter preprocessing function judge_init() calls setjudge to set r_active_x and b_success_x to "no", and to set rec_kernel_x, rec_fs_x, r_active_y, and b_success_y to "yes", where y = cyc_add(x). The function cyc_add() is a cyclic increment: when x < MAX_MTD_SYSTEM (the number of system image areas) it returns x + 1; otherwise it returns 1.
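The flag updates performed by judge_init() and judge_reset(), together with cyc_add(), can be sketched as follows. The function names and MAX_MTD_SYSTEM come from the text; the struct, the extra table argument, and direct field writes in place of setjudge calls are assumptions made to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_MTD_SYSTEM 3   /* number of system image areas */

/* Hypothetical in-memory copy of the per-area flags; index 0 is unused. */
struct area_flags {
    bool r_active;
    bool b_success;
    bool rec_kernel;
    bool rec_fs;
};

/* Cyclic increment over 1..MAX_MTD_SYSTEM. */
static int cyc_add(int x)
{
    return (x < MAX_MTD_SYSTEM) ? x + 1 : 1;
}

/* Before booting area x: assume failure, and arm the next area y. */
static void judge_init(struct area_flags *tab, int x)
{
    int y = cyc_add(x);

    tab[x].r_active = false;
    tab[x].b_success = false;
    tab[x].rec_kernel = true;
    tab[x].rec_fs = true;
    tab[y].r_active = true;
    tab[y].b_success = true;
}

/* After area x has been recovered: clear the recovery flags. */
static void judge_reset(struct area_flags *tab, int x)
{
    tab[x].b_success = true;
    tab[x].rec_kernel = false;
    tab[x].rec_fs = false;
}
```

Marking area x as failed before it is booted means that a hang needs no further bookkeeping: the flags already describe the failure when the watchdog resets the board.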

2.3 Linux Backup and Recovery Design

Before the system image is started, U-Boot has already modified the parameters in the Judge-Area and saved them to NAND Flash. These parameters are then processed under Linux. This paper uses a self-starting script to process them [11]. Depending on the actual application environment, the script can be inserted at a "distrusted" position. When the system or a program crashes, the script receives notifications from the application, the kernel, and so on, and then decides whether to restore the parameters or reset the system. The backup and recovery mechanism under Linux is implemented mainly by this script, as shown in Figure 3.

This paper uses MTD+UBIFS to manage the Flash, bypassing FTL/NFTL (Flash Translation Layer/NAND Flash Translation Layer), which greatly improves management capability [12]. Using the NAND tools [13] in the mtd-utils toolkit, the contents of the Judge-Area (mtd6) can be saved to the file judge.txt. The Linux sed command then modifies the flags, and the modified file is finally written back to the Judge-Area. The entire operation is performed by the anti_judge_init script, whose main steps are as follows:

mtd_debug read /dev/mtd6 0 $filesize judge.txt

sed -i -e "s/$old_flag/$new_flag/g" judge.txt

mtd_debug erase /dev/mtd6 0 $filesize

mtd_debug write /dev/mtd6 0 $filesize judge.txt
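The script is a read, modify, erase, write cycle on the Judge-Area. For readers who want to do the modify step in a C program rather than with sed, the substitution alone might look like the sketch below. The function replace_all is a hypothetical helper, not part of mtd-utils, and it performs plain string replacement rather than regular-expression matching.

```c
#include <assert.h>
#include <string.h>

/*
 * Copies text to out, replacing every occurrence of old_flag with new_flag
 * (a literal-string analogue of sed "s/old/new/g").
 * Returns the number of replacements, or -1 if out is too small.
 */
static int replace_all(const char *text, const char *old_flag,
                       const char *new_flag, char *out, size_t out_size)
{
    size_t old_len = strlen(old_flag);
    size_t new_len = strlen(new_flag);
    size_t used = 0;
    int count = 0;

    if (old_len == 0 || out_size == 0)
        return -1;
    while (*text != '\0') {
        if (strncmp(text, old_flag, old_len) == 0) {
            if (used + new_len >= out_size)
                return -1;          /* keep one byte for the terminator */
            memcpy(out + used, new_flag, new_len);
            used += new_len;
            text += old_len;
            count++;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *text++;
        }
    }
    out[used] = '\0';
    return count;
}
```

For example, passing "b_success_1=no" as old_flag and "b_success_1=yes" as new_flag marks area A as successfully booted in the text image of the Judge-Area before it is written back.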

To keep the NAND partitioning consistent, the kernel's NAND partition information must also be modified. This information is stored in the mtd_partition structure, which is usually located in the arch/arm/plat directory. In addition, the UBIFS file system image must be built to match the kernel partitions. A UBIFS image file is created with the mkfs.ubifs tool, which is also part of the mtd-utils toolkit [14].

3 System testing

The system was tested successfully on several development boards. The triple backup and recovery system on the Tiny210 development board is used here as an example. For the test, u-boot.img, uImage, and ubi.img were recompiled with the automatic backup and recovery mechanism. A video monitoring program was added to ubi.img, with a deliberately planted bug that makes opening the camera fail after a period of time, so the video monitoring program is the "distrusted" position of the system. If the program fails, the automatic backup and recovery mechanism under Linux is executed. The Tiny210 development board integrates 512 MB of SLC NAND Flash, which is divided into 12 partitions; see Table 1.

Boot into U-Boot from the SD card and run updatesys to write the generated u-boot.img, uImage, and ubi.img to their designated areas. After power-on, the system starts successfully. After the video monitoring program has run for a period of time, the system restarts automatically. The backup and recovery information displayed on the serial port is shown in Figure 4.

The printed information shows that the system image of area A is overwritten with the image from area B, after which the system image of another partition is run.

4 Conclusion

The test results show that the backup and recovery mechanism designed in this paper allows the system to work stably in special environments. The entire process is completed automatically by the system and is easy to maintain. The cost of NAND Flash keeps falling while its capacity keeps growing [15], so this backup and recovery method is a convenient and effective way to reduce cost and ensure system stability. The scheme suits applications, such as system testing, that demand high functional stability and where maintenance is inconvenient.


[1] Suo Hongjun. Research on Hot Standby Dual Switching Technology in Embedded System[J]. Microcomputer Information, 2008, 24(8): 32-34.

[2] Ma Jinrong. An Embedded System bootrom Automatic Backup and Switching Technology[J]. Microcontrollers & Embedded Systems, 2011, 11(12): 74-75.

[3] Xie Changsheng, Han Dezhi, Li Huaiyang. Disaster Recovery Level and Technology[J]. China Computer Users, 2003, 19(18): 30-31.

[4] Guo Rongzuo, Huang Jun. Hardware Reliability and Application of Embedded Real-time Control System[J]. Electronic Technology Applications, 2012(5): 11-14.

[5] Cai Liping, Ren Jiafu, Tong Rui, et al. Nand Flash Boot Analysis and Porting Based on ARM[J]. Computer Engineering and Design, 2012(3): 931-935.

[6] Gao Wenhui, Shi Yubing, Zhang Wei. U-Boot dual-boot implementation based on S3C2440[J]. Measurement & Control Technology, 2012(2): 87-91.

[7] Hu Yongqi, Hou Zifeng. Design and Implementation of NAND Memory System in Embedded Linux[J]. Computer Engineering, 2006(4): 61-63, 81.

[8] Wes, Ding Zhigang, Zhang Weihong. Research and Application of UBI Subsystem in LINUX[J]. Computer Applications and Software, 2010(10): 68-71.

[9] Jia Yuanquan, Xiao Yu, Lai Mingche, et al. Research on Bad Block Strategy in Multi-way Parallel Storage System Based on NAND FLASH[J]. Journal of Computer Research and Development, 2012(z1): 68-72.

[10] Wu Beibei. Research and Application of SQLite Data Recovery Technology for NAND Flash[D]. Zhejiang: Hangzhou Dianzi University, 2012.

[11] Chen Peng, Wang Shuzhi, Dong Xiaofeng, et al. A method of automatic operation of an embedded operating system after sleep wake-up[J]. Application of Electronic Technology, 2012, 38(2): 11-13.

[12] Zhang Shaobo, Xu Guanghui, Tian Xiaofeng, et al. High reliability and self-recovery real-time file system based on Nand FLASH[J]. Computer Engineering & Science, 2012(6): 169-173.

[13] Gao Li, Zhang Huanqing. Research on NAND Flash Device Driver in Embedded Linux[J]. Computer Development & Applications, 2014(5): 11-16.

[14] Gao Ming, Yu Jianxin. UBIFS Flash File System Analysis and Research[J]. Computer Knowledge and Technology, 2014(4): 749-754.

[15] Lu Huirong. Flash Memory Technology and Market Trends in 2012[J]. Integrated Circuit Application, 2013(1): 4-6.
