6th International Congress on Technology - Engineering - Kuala Lumpur - Malaysia (2018-07-19)

New Approach For Reducing The Cost Of Regression Test Selection Technique

Abstract—Regression testing is an important testing technique used in the maintenance phase of the software engineering life cycle. The goal of a regression test selection technique is to reduce testing cost by running the minimal number of test cases. To that end, we propose two new techniques that aim to improve cost-effectiveness. The first is risk-based regression testing, which analyses and selects test cases based on the probability of faults and the impact of a fault on the whole program. The second is CFG-based regression testing, which identifies which parts of the code have been modified and which test cases execute those parts, so that only those test cases are run against the new version. A basic example is presented to illustrate the mechanism of CFG-based regression testing.

Keywords— Software testing, regression testing, control flow graph, test case selection

I. INTRODUCTION

Nowadays, software maintenance is a widely accepted part of software development. It refers to modifying and updating a software product after its delivery. These modifications are made for different reasons, including:

1. Market conditions
2. Customer requirements
3. Host modifications
4. Organization and environment changes
5. User needs
6. Error correction
7. Improvement of performance

System maintenance is a general term for the work required to keep a system running correctly. The system could be a computer system, a mechanical system, or another kind of system. Maintenance in this sense relates to system failures caused by usage and age.

Software development companies spend more time on maintaining existing software than on developing new software. Reports suggest that the cost of maintenance is high. According to earlier studies, software maintenance accounts for 40-70% of the total life-cycle costs [8], as shown in figure 1.
Figure 1: Comparison of software engineering life-cycle costs

Maintenance is applicable to software developed using any generic or iterative model.

II. RELATED WORK

In regression testing, and especially model-based regression testing, Naslavsky, Ziv and Richardson provide a new approach and prototype that creates traceability relationships between model elements and test cases [1]. Ruth and Rayford provide a framework for testing web services. This framework uses a CFG-based regression technique and relies on local information only; it does not require private information from the web service provider. [3], [4], [5] and [6] provide regression test selection techniques and comparison schemes that can be used to compute the cost of the proposed regression techniques. They also propose other new approaches to regression testing for various types of applications and software fields.

E.B. Swanson initially identified three categories of maintenance: corrective, adaptive, and perfective. Lientz and Swanson later added a fourth category [1]: preventive maintenance. The usage percentage of these categories is shown in figure 2.

• Corrective maintenance is any maintenance activity that fixes bugs that have occurred after delivery [2], while adaptive maintenance keeps the software working in a changed or changing environment, such as new hardware or a new operating system [3].
• Perfective maintenance is any change to the software that increases its performance or enhances its user interface and its maintainability [4].

The goal of preventive maintenance is to keep the software functioning correctly at any time.

III. MAINTENANCE COST

As software systems age, it becomes more and more difficult to satisfy user requirements and to keep the systems up and running without maintenance. Maintenance is applicable to software developed using any software life cycle model (waterfall, spiral, etc.).
Maintenance must be performed in order to:

• Interface with other systems
• Correct faults
• Migrate legacy software
• Implement enhancements
• Adapt programs so that different hardware, software, system features, and telecommunications facilities can be used
• Improve the design
• Retire software

Some of the technical and non-technical factors affecting software maintenance costs are as follows:

• Team stability
• Application type
• Program age and structure
• Software novelty
• Stressful nature of work
• Software maintenance staff availability
• Software lifespan
• Staff skills
• Hardware characteristics

For instance, in the United States, 2% of the GNP (Gross National Product) is spent on software maintenance, and in the UK about $1.5 million is spent annually on software maintenance.

IV. REGRESSION TESTING

Regression testing is defined as the process of retesting the modified parts of the software and ensuring that no new errors have been introduced into previously tested code. Let P be a program, let P' be a modified version of P, and let T be a test suite for P. Regression testing consists of reusing T on P' and determining where new test cases are needed to effectively test code or functionality added to or changed in producing P'. There are various regression testing techniques [6]: retest-all, regression test selection, test case prioritization, and the hybrid approach.

A. Retest-all

Retest-all is one of the conventional methods for regression testing, in which all the tests in the existing test suite are re-run. It is very expensive compared with the techniques discussed below, because executing a regression test suite in full requires more time and budget.

B. Test case prioritization

This technique prioritizes the test cases so as to increase a test suite's rate of fault detection, that is, how quickly the test suite detects faults in the modified program, in order to increase reliability.
Prioritization is of two types: (1) general prioritization, which attempts to select an order of the test cases that will be effective on average over subsequent versions of the software; and (2) version-specific prioritization, which is concerned with a particular version of the software.

C. Hybrid approach

The fourth regression technique is the hybrid approach, which combines regression test selection and test case prioritization. A number of researchers are working on this approach and have proposed many algorithms for it.

D. Regression test selection

Because the retest-all technique is expensive, Regression Test Selection (RTS) is performed instead. In this technique, rather than rerunning the whole test suite, we select a part of the test suite to rerun, provided that the cost of selecting that part is less than the cost of running the tests that RTS allows us to omit. RTS divides the existing test suite into (1) reusable test cases, (2) re-testable test cases, and (3) obsolete test cases. In addition to this classification, RTS may create new test cases that test areas of the program not covered by the existing test cases.

RTS techniques are broadly classified into three categories:

1) Coverage techniques: they take the test coverage criteria into account. They find coverable program parts that have been modified and select test cases that exercise these parts.
2) Minimisation techniques: they are similar to coverage techniques, except that they select a minimum set of test cases.
3) Safe techniques: they do not focus on coverage criteria; instead, they select all test cases that produce different output with the modified program than with its original version.

Rothermel [7] identified the various categories in which regression test selection techniques can be evaluated and compared.

V. PROPOSED TECHNIQUE: REGRESSION TESTING BASED ON CONTROL FLOW GRAPHS

CFG-based regression testing enables the tester to select test cases using the control flow graph, by identifying the building blocks that have been modified and selecting only the test cases that span those building blocks.

A. Conversion of the code to a control flow graph

In this step, we take the code of the initial version and translate it into a CFG (control flow graph). The goal is to determine the leaders and the building blocks, and later to know which parts of the code have been changed, so that we can decide which test cases will be executed. To do that, we follow these steps:

1) Find the leaders: we decompose the code into blocks whose first statement fulfils one of the following conditions:
• The first line of the function or main program
• A condition expression
• The first line after the condition statement
2) Once the leaders have been identified, identify the building blocks. Each building block starts at a leader and ends just before the next leader. A condition expression can be seen as a building block composed of a single expression, which is the leader.
3) Create the CFG by linking the building blocks together based on the flow of the code and the program.

B. Test case identification

After the building blocks have been identified and the control flow graph has been constructed, the next step is to identify the possible test cases that test the program. Numerous techniques may be used, as long as the test cases span the whole program. Here, the test cases are identified based on the number of existing paths. To determine the number of test cases, we count the regions visible in the CFG, then traverse all possible paths through the graph and create a test case for each independent path.

C. Test case mapping

After the CFG has been constructed and the test cases have been designed, the next step is to map each test case to the building blocks that it spans. To do so, for each test case in the test suite, we follow the path of the test case through the program execution and record each building block spanned. The purpose of this step is to provide data about which building blocks are spanned by which test cases, so that we can base our test case selection on it.

D. Control flow graph design for the new version of the program

Once a new version of the initial program is designed, the first thing to do is to construct the control flow graph of this new version, so as to see its flow of execution and to see visually the main differences in the overall structure of the program.

E. Identification of the modified building blocks

The second step is to identify which part of the code has been modified and, more exactly, which building block has been affected by the change. There are three types of changes:

• The building block’s content was modified, but the building block still exists in the new version.
• The building block was divided into two or more new building blocks.
• The building block was removed and merged with another existing block to form a new one.

Each of these cases affects test case selection. To identify the modified building blocks, one should create a mapping table between the building blocks of the old and new versions of the program. One should also add a column indicating whether the building block was modified and, if so, what type of modification occurred.

F. Test case selection

In this step, we select a subset of the test cases that were used to test the initial version of the code.
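The mapping table described in step E can be sketched as a small data structure. The following Python sketch is only an illustration under stated assumptions: the names `ChangeType`, `BlockMapping` and `modified_old_blocks`, as well as the five example rows, are hypothetical and are not taken from the paper's example program.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeType(Enum):
    """Kinds of change listed in step E (the names are illustrative)."""
    UNCHANGED = "unchanged"
    CONTENT_MODIFIED = "content modified"  # block still exists, body changed
    SPLIT = "split"                        # one old block became several new ones
    MERGED = "merged"                      # old block was folded into a new block


@dataclass(frozen=True)
class BlockMapping:
    """One row of the mapping table: old block, new block(s), change type."""
    old_block: str
    new_blocks: tuple
    change: ChangeType

    @property
    def modified(self) -> bool:
        # Corresponds to the "modified or not" column of the table.
        return self.change is not ChangeType.UNCHANGED


def modified_old_blocks(table):
    """Return the old-version blocks that the table marks as modified."""
    return {row.old_block for row in table if row.modified}


# Hypothetical mapping table for a five-block program.
table = [
    BlockMapping("A", ("A'",), ChangeType.CONTENT_MODIFIED),
    BlockMapping("B", ("B'",), ChangeType.UNCHANGED),
    BlockMapping("C", ("C1'", "C2'"), ChangeType.SPLIT),
    BlockMapping("D", ("DE'",), ChangeType.MERGED),
    BlockMapping("E", ("DE'",), ChangeType.MERGED),
]

print(sorted(modified_old_blocks(table)))  # ['A', 'C', 'D', 'E']
```

Recording the change type alongside the modified flag keeps the three cases of step E distinguishable, while the selection itself only needs the set of modified old blocks.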
Now that changes have been made, we try to minimise the testing cost by running the smallest number of test cases that can fully test the changes made to the code. To do so, we follow these steps:

1) Identify the route of each test case within the original test suite. This means that each test case listed in the original test suite of the initial version is mapped to the list of building blocks that it moves through as part of the test.
2) Once every test case has been mapped to a set of building blocks, select the test cases that span the modified building blocks. If a test case spans at least one of the modified or new building blocks, it is selected to test the new version of the program.

VI. APPLICATION OF CFG-BASED REGRESSION TESTING

Let’s use the following program, written in the C language, to illustrate the CFG-based regression testing method. The program works as follows:

• The program chooses a number randomly between 0 and 50, and the user is asked to guess the number chosen by the computer.
• The user inputs guessed numbers, and the program gives hints on whether the guessed number is bigger or smaller than the chosen number.
• The program ends only if the user gives a correct guess.

Figure 2: Guessing game example written in C

A. Creation of the control flow graph

The number guessing game has been converted into a control flow graph.

Figure 3: Control flow graph of guessing game

B. Test case identification

Given the CFG, we generate test cases to test the program. Using the regions formula, there are at most 4 test cases. The feasible test cases are the following:

TABLE I
PROGRAM TEST CASES

Test case   Guessed   Chosen   Output
1           1         1        1
2           10,5      2        2
3           1,5       2        2

C. Test case mapping

The goal of this part is to create a mapping table that maps each test case to the set of building blocks that it spans.
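The mapping and the selection rule from Section V.F can be sketched together in Python. This is a minimal illustration, not the authors' implementation: the function name `select_tests` is hypothetical, the block sets are copied from Table II, the modified block A is taken from Table III, and the second call assumes a hypothetical change confined to block E purely for contrast.

```python
# Building blocks spanned by each test case of the guessing game
# (values copied from Table II).
spans = {
    1: {"A", "B", "C", "G", "H", "I"},
    2: {"A", "B", "C", "D", "E", "G", "H", "I"},
    3: {"A", "B", "C", "D", "F", "G", "H", "I"},
}


def select_tests(spans, modified_blocks):
    """Select every test case whose path touches at least one modified block."""
    modified = set(modified_blocks)
    return sorted(test for test, blocks in spans.items() if blocks & modified)


# Block A is the one marked as modified in Table III.
print(select_tests(spans, {"A"}))  # [1, 2, 3]

# Hypothetical change confined to block E, for contrast.
print(select_tests(spans, {"E"}))  # [2]
```

When the modified block lies on every path, as block A does, the selection returns the whole suite; a change confined to a block that fewer paths share selects fewer test cases.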
TABLE II
BUILDING BLOCKS SPANNED BY TEST CASES

Test case   Building blocks
1           A,B,C,G,H,I
2           A,B,C,D,E,G,H,I
3           A,B,C,D,F,G,H,I

D. New version: changed version and CFG

Let’s assume that we change the program by adding the following requirement:

• The user can now choose the range of the chosen number.

After making this change to the requirements, the new code is displayed in figure 4.

Figure 4: New version of the guessing game written in C

Now, we construct the CFG of the new program.

Figure 5: CFG of the new version of the code

We now map the old building blocks to the new building blocks.

TABLE III
MAPPING TABLE

Version 1 building blocks   Version 2 building blocks   Modified
A                           A’                          True
B                           B’                          False
C                           C’                          False
D                           D’                          False
E                           E’                          False
F                           F’                          False
G                           G’                          False
H                           H’                          False
I                           I’                          False

Now that there is a mapping table between the old and the new building blocks, we choose the test cases that span the modified building blocks, that is, the test cases whose paths go through building block A. Test cases 1, 2 and 3 are therefore chosen to be executed to test the second version of the guessing game program.

VII. CONCLUSION

Regression testing is a testing technique used to test new versions of the same program. Its focus is cost reduction, which means running the minimal number of test cases that can test all the new changes made to the program. In CFG-based regression testing, the focus is to identify which building blocks of the program's CFG have been modified and which test cases will be run based on those modifications. Such a method involves dealing with numerous cases related to the type of modification. To do so, we create a mapping table that maps the old version’s building blocks to the new ones. We then base our selection of test cases on the mapping table.

VIII. FUTURE WORK

As future work building on this paper, an experiment in which the method is applied extensively to different systems written in different programming languages would test the strengths and weaknesses of the approach, and would determine how much it reduces cost compared with other regression testing techniques such as retest-all, random selection, and other known regression test selection techniques. With regard to the CFG-based regression testing technique, the example above shows that all previous tests were run, resulting in a cost equal to that of retest-all. This is because all paths span the modified building block. One should investigate how to minimize the cost of testing a building block that is spanned by all test cases.

REFERENCES

[1] Naslavsky, L., Ziv, H., Richardson, D.J., A Model-Based Regression Test Selection Technique, Proceedings of the IEEE International Conference on Software Maintenance (ICSM 2009), IEEE Computer Society Press, Los Alamitos, CA, pp. 515-518, 2009.
[2] Ruth, M. and Tu, S., A CFG-Based Regression Test Selection for Web Services, to appear in Proceedings of ICIW 2007.
[3] Mei, L., Chan, W.K., Tse, T.H., Data Flow Testing of Service Choreography, Proceedings of Fundamental Approaches to Software Engineering (FASE), ACM, pp. 151-160, Amsterdam, The Netherlands, Aug. 2009.
[4] Rothermel, G. and Harrold, M.J., Analyzing Regression Test Selection Techniques, IEEE Transactions on Software Engineering, Vol. 22, No. 8, pp. 529-551, August 1996.
[5] Rothermel, G., Harrold, M.J., A Safe, Efficient Regression Test Selection Technique, ACM Transactions on Software Engineering and Methodology, Vol. 6, No. 2, pp. 173-210, April 1997.
[6] Li, L., Chou, W., Guo, W., Control Flow Analysis and Coverage Driven Testing for Web Services, Proceedings of ICWS, IEEE, pp. 473-480, Beijing, China, Sept. 2008.
[7] Xu, G., Rountev, Regression Test Selection for AspectJ Software, Proc. 29th Intl. Conf. on Software Engineering (Minneapolis, MN, May 2007), ACM, New York, NY, pp. 65-74.
[8] Bell, D., Software Engineering for Students: A Programming Approach, Fourth Edition, Prentice Hall International, 2005.
Bouchaib Falah, Hanane Noreddine