Final Thesis: Classification and Analysis of Inner Source Development Artifacts

Abstract: Inner source software development is the use of open-source development practices within organizations. Making this process more transparent requires classifying the software development artifacts it produces; these classifications can then be used for further analysis. The thesis follows the design science methodology to create such a classification system. The key research questions are which objectives the classification system has to fulfill, how such a system can be designed and implemented, and what kinds of analytics it makes possible. According to the objectives, the classification system has to process different kinds of text-based software artifacts, take these artifacts from various data sources, create classifications usable for analytics, and do all of this without machine learning techniques, because these do not yield reproducible results when trained on different datasets. For the design, a data pipeline is defined that extracts, preprocesses, classifies, post-processes, and writes the software artifact data. A non-exhaustive set of classifications for different artifact categories is defined and designed. For the implementation, a Python program is conceptualized that demonstrates the design. With the classifications at hand, example analytics are proposed to show the applicability of the results. The final evaluation shows that the design of the software artifact classification system fulfills the stated objectives.
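
The five pipeline stages named in the abstract (extract, preprocess, classify, post-process, write) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the `Artifact` class, the `RULES` keyword patterns, the label names, and all function names are hypothetical and chosen only for illustration. The sketch uses deterministic keyword rules in place of machine learning, so the same input always produces the same classification.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A text-based software artifact, e.g. a commit message or issue text."""
    source: str  # hypothetical data source name, e.g. "git" or "issues"
    text: str
    labels: list = field(default_factory=list)


# Hypothetical keyword rules (an assumption, not taken from the thesis).
# They are deterministic: no training data is involved.
RULES = {
    "bugfix": re.compile(r"\b(fix|bug|error)\w*", re.IGNORECASE),
    "feature": re.compile(r"\b(add|implement|feature)\w*", re.IGNORECASE),
    "documentation": re.compile(r"\b(doc|readme)\w*", re.IGNORECASE),
}


def extract(raw_records):
    """Turn (source, text) pairs from any data source into Artifact objects."""
    return [Artifact(source=source, text=text) for source, text in raw_records]


def preprocess(artifact):
    """Collapse runs of whitespace so the rules see normalized text."""
    artifact.text = " ".join(artifact.text.split())
    return artifact


def classify(artifact):
    """Attach every label whose pattern occurs in the artifact text."""
    artifact.labels = [
        name for name, pattern in RULES.items() if pattern.search(artifact.text)
    ]
    return artifact


def postprocess(artifact):
    """Give artifacts that matched no rule an explicit fallback label."""
    if not artifact.labels:
        artifact.labels = ["unclassified"]
    return artifact


def write(artifacts):
    """Serialize to plain dictionaries, ready for storage or analytics."""
    return [
        {"source": a.source, "text": a.text, "labels": a.labels}
        for a in artifacts
    ]


def run_pipeline(raw_records):
    """Run all five stages in order over the raw records."""
    artifacts = extract(raw_records)
    artifacts = [postprocess(classify(preprocess(a))) for a in artifacts]
    return write(artifacts)
```

Calling `run_pipeline([("git", "Fix   null pointer bug")])` would label that record as `bugfix`. Counting labels across the returned records, for instance with `collections.Counter`, is one simple example of the kind of analytics such classifications enable.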

Keywords: inner source, open source, classification, data pipelines, metrics, measurement

PDF: Master Thesis

Reference: Benedict Martin Weichselbaum. Classification and Analysis of Inner Source Development Artifacts. Master's thesis. Friedrich-Alexander-Universität Erlangen-Nürnberg, 2023.