Predicting Open Source Forked Pattern Survivability

Publication Type: Thesis
Issue Date: 2021
The motivational behaviour of open source (OS) developers has long been an active focus of research. With the introduction of the forking technique, the related area of developer forking motivation has gained significance, partly because of the problems of forking scarcity and low fork visibility performance. The objective of forking is to draw on voluntary developers to improve source code quality and to innovate. Unfortunately, the forking technique does not sustainably improve fork efficiency and efficacy. Developers may spend time forking source code that later becomes inactive, which wastes their time and effort. From the perspective of project owners, repositories that do not receive a good fork response from developers will not grow. This doctoral research study aimed to address these problems by avoiding forking scarcity, increasing fork visibility performance, and promoting positive developer forking motivation. We also investigated OS environment compliance to determine whether it contributes to improved fork visibility, reduces fork deficiency, and/or is viewed positively by developers. The research approach was to apply a model that predicts high fork visibility performance. The model is based on the K Nearest Neighbour machine learning algorithm with the Euclidean distance metric. We piloted it using nine repository classifiers and then conducted a longitudinal study of five selected repository classifiers to determine accuracy and distance approximation. Our work adds to the body of knowledge on OS forking theory and provides a deeper understanding of developer forking motivational behaviour. In the first phase of this study, we conducted a literature review of forking motivation and of the research methods used in open source software (OSS). We then developed and tested our model. In the last phase, we identified OSS patterns and detected fork longevity to determine whether environmental compliance was fully, partially or not at all satisfied. Most importantly, we showed that the distance approximation of environmental compliance for high fork visibility can positively predict developer forking interest.
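
The abstract describes the model only as K Nearest Neighbour with a Euclidean distance metric, so the following is a minimal sketch of that general technique and not the thesis's implementation. The feature names (stars, watchers, open issues, contributors), the scaled values, the "high"/"low" labels, and the helper names `knn_predict` and `nearest_distance` are all assumptions made for illustration; the abstract does not list the nine repository classifiers it used.

```python
import math
from collections import Counter

# Hypothetical training data, NOT taken from the thesis.
# Each feature vector is (stars, watchers, open_issues, contributors),
# assumed to be already scaled to the range [0, 1].
TRAINING = [
    ((0.90, 0.80, 0.70, 0.90), "high"),
    ((0.80, 0.90, 0.60, 0.70), "high"),
    ((0.70, 0.70, 0.80, 0.80), "high"),
    ((0.10, 0.20, 0.10, 0.10), "low"),
    ((0.20, 0.10, 0.30, 0.20), "low"),
    ((0.30, 0.20, 0.20, 0.10), "low"),
]


def knn_predict(query, training, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank every labelled repository by Euclidean distance to the query.
    ranked = sorted(training, key=lambda item: math.dist(query, item[0]))
    # Count the labels of the k closest repositories and take the most common.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def nearest_distance(query, training):
    """Return the Euclidean distance from query to its closest training point."""
    return min(math.dist(query, features) for features, _ in training)


if __name__ == "__main__":
    candidate = (0.85, 0.75, 0.65, 0.80)  # hypothetical unseen repository
    print(knn_predict(candidate, TRAINING))
    print(nearest_distance(candidate, TRAINING))
```

In this sketch the predicted label stands in for fork visibility performance, and the nearest-neighbour distance is one plausible reading of the "distance approximation" mentioned above. Features are scaled beforehand because Euclidean distance is sensitive to differences in magnitude between features.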