ABSTRACT

 

Social networking systems have found their way into all sectors of life. With the advent of social coding platforms like GitHub, networks of developers can be inferred from the projects they have participated in. When a developer creates a new project on such a platform, the platform lacks the capacity to recommend potential collaborators. Recommender systems are software techniques and tools that suggest items to users who might be interested in them. Having identified this problem, we developed ProjectTrust, a trust-aware recommender model which evaluates trust between projects and developers. A natural language processing approach was found to be a good tool for text feature extraction from GitHub readme files. To verify the proposed framework, experiments using real social data from GitHub are presented, and the results show the effectiveness of the proposed approach.
Keywords: Trust-aware, Recommender Systems, Natural Language Processing, Term Frequency, Inverse Document Frequency.

 

TABLE OF CONTENTS

 

CERTIFICATION
ABSTRACT
ACKNOWLEDGEMENTS
DEDICATION
TABLE OF FIGURES
LIST OF TABLES
CHAPTER ONE INTRODUCTION OF CONCEPT
1.1 INTRODUCTION
1.2 SOCIAL CODING AND VERSION CONTROL SYSTEMS
1.3 TRUST
1.3.1 TRUST IN PSYCHOLOGY
1.3.2 TRUST IN SOCIOLOGY
1.3.3 TRUST IN COMPUTER SCIENCE
1.4 CHARACTERISTICS OF TRUST
1.5 RECOMMENDER SYSTEMS
1.6 PROBLEM STATEMENT
1.7 OBJECTIVE OF THE RESEARCH WORK
1.8 RESEARCH METHODOLOGY
1.9 SCOPE OF WORK
1.10 ORGANISATION OF WORK
CHAPTER TWO LITERATURE REVIEW
2.1 RECOMMENDER SYSTEM
2.1.1 FUNCTIONS OF RECOMMENDER SYSTEM
2.1.2 ELEMENTARY STRUCTURE OF RECOMMENDER SYSTEM
2.1.3 PERSONALIZED RECOMMENDATION
2.1.4 COLLABORATIVE FILTERING RECOMMENDATION
2.1.5 CONTENT BASED RECOMMENDER SYSTEMS
2.1.6 KNOWLEDGE BASE RECOMMENDATION SYSTEMS
2.1.7 HYBRID RECOMMENDATION SYSTEMS
2.2 TRUST REPRESENTATION AND TRUST METRIC
2.3 TRUST MODELS
2.4 STATE OF THE ART ON RECOMMENDATION OF REPOSITORIES ON GitHub

 

CHAPTER ONE

INTRODUCTION OF CONCEPT
1.1 INTRODUCTION
With the growth of web technology, the amount of content available on the Internet has grown explosively. Social network interaction has grown with it and has become a regular part of people’s lives. Other social activities, such as buying and selling, now have a place in social networks. Researchers and scholars need information and resources from online document repositories and digital libraries to conduct their research, and they also need to collaborate; casual chatting and communication by e-mail are likewise products of advances in web technology. The Web has become the preferred medium for many database applications, such as e-commerce and digital libraries. Many of these applications have extended their functionality by making use of APIs (Application Programming Interfaces).
The cyber world has become increasingly social in the last 10 to 15 years, but the productivity implications remain insufficiently exploited. We can track a person’s moment-by-moment status updates on Facebook, photo updates on Instagram, and updates on Twitter, blogs and wikis.
1.2 SOCIAL CODING AND VERSION CONTROL SYSTEMS
Social networks are among the most explosive developments of the 21st century. We possess the technology to remain perpetually connected with our networks, which can be considered resources for reaching people all around the world in an instant. The web has enabled communities to emerge and collaborate on challenges such as building Wikipedia, the largest compendium of human knowledge, by providing the resources and tools that communities need to achieve a common goal.
The rapid advancement of social coding tools is transforming software development. Social interaction has become an essential factor in assessing the software development process. Version control systems (VCS) are the basic building block of a social coding platform. Today, various VCS tools, such as CVS, SVN and Git, are frequently used by software development teams. Developers can build their own versions of the code and submit their changes to the VCS. The VCS manages the different versions of the software and helps avoid potential conflicts between them. Early VCSs were used only by fairly small software development teams and were mostly installed within small networks, such as organizational LANs. The number of projects maintained under those early VCSs was relatively small. Because Git makes coordinating software development easier through its distributed collaboration features, it is little wonder that it is gaining popularity. “The forty year history of version control tools shows a steady movement toward more concurrency. In first generation tools, concurrent development was handled solely with locks. Only one person could be working on a file at a time. The second generation tools are a fair bit more permissive about simultaneous modifications, with one notable restriction. Users must merge the current revisions into their work before they are allowed to commit. The third generation tools allow merge and commit to be separated” (ericsink.com, 2017).
Table 1.1: Generations of Version Control Systems

Generation | Networking  | Operations         | Concurrency         | Examples
First      | None        | One file at a time | Locks               | RCS, SCCS
Second     | Centralized | Multi-file         | Merge before commit | CVS, SourceSafe, Subversion, Team Foundation Server
Third      | Distributed | Changesets         | Commit before merge | Bazaar, Git, Mercurial
With current advances in distributed computing technology, distributed social coding has received a major boost. Popular social coding platforms can now host very large numbers of software projects, and an ever-increasing number of people embrace the idea of “social coding”. Contributions to the software development process are made in a distributed, collaborative effort by a virtual community. Software developers all over the world can participate in the same software project, editing different parts of the code and producing different branches in the project source tree. There are presently no explicit boundaries on a software team. “A software project may be developed by an ever changing set of software engineers, and a software engineer may contribute to a set of different software projects hosted in a remote server” (Hu et al., 2016). Social coding has to a large extent changed the style of software development activities. The social network of software engineers continually interacts with the network of software projects. Several social coding platforms encourage software developers around the world to contribute to a wide range of software projects together. Distributed development tools such as Git serve as the foundation of social coding platforms. Built on Git, the GitHub platform has attracted numerous software developers to work together on a huge number of open source projects. In GitHub, projects are represented as repositories, which hold additional information about each project. As the number of GitHub users keeps increasing, the number of GitHub repositories keeps growing, but user-to-user trust relationships remain unexplored. These trust relationships can provide insight for recommender systems.
1.3 TRUST
It is an established fact that trust is one of the significant parts of human relationships. This has made trust a multidisciplinary topic, treated in psychology, sociology and computer science alike. As a result, arriving at a general definition of trust that covers each of these areas has always been a difficult task. There are numerous definitions of trust. One is that “trust is a measure of confidence that an entity will behave in an expected manner, despite the lack of ability to monitor or control the environment in which it operates” (Sherchan, 2013).
1.3.1 TRUST IN PSYCHOLOGY
In psychology, one of the accepted definitions of trust is that found in Rousseau et al. (1998): “Trust is considered to be a psychological state of the individual, where the trustor risks being vulnerable to the trustee based on positive expectations of the trustee’s intentions or behavior” (Sherchan, 2013). Under this definition, trust is argued to have emotional, behavioural and cognitive implications.
1.3.2 TRUST IN SOCIOLOGY
Sociologists tend to emphasize the relational or social qualities of trust, which are for the most part characterized by the level of aggregation (e.g., individual, community, population, organizational and societal level). One of the most cited statements in sociology proposes that trust in a trustee is a matter of the trustee’s perceived ability, integrity and benevolence, and of the trustor’s propensity to trust.
Sociology defines trust as a bet about the possible future actions of the trustee. For this bet to be considered trust, it must have some effect on the actions of the trustor (i.e., the person who makes the bet). Sociology, like psychology, considers trust from two viewpoints: individual and societal. At the individual level, the vulnerability of the trustor is the major factor that influences trust. At the societal level, trust is a property of every social group. It is expressed as the psychological state of the members of the group towards one another: each member acts in the expectation that the other members are trustworthy and that they, in turn, are trusted by others. At the societal level, therefore, social trust can be classed as an institutional or system aspect of trust.
1.3.3 TRUST IN COMPUTER SCIENCE
As in the other areas, the definition of trust varies from researcher to researcher, although trust is one of the most widely used terms in computer science and network security. Before giving some definitions, it is worth pointing out that in computer science trust is broadly classified into two categories: user trust and system trust. The concept of user trust is derived from psychology and sociology through the work of Marsh (1994), who gave the standard definition as “subjective expectation an entity has about another’s future behavior” (Marsh, 1994). Another definition, found in Wikipedia, says that “Trust is a particular level of the subjective probability with which an agent assesses another agent or group of agents will perform a particular action, both before he can monitor such action and in a context in which it affects his own action.” System trust, on the other hand, is the expectation that a system or device will behave faithfully in fulfilling its expected purpose. This thesis considers trust from the perspective of user trust in computer science. In the light of user trust, trust is inherently personalized. On online platforms such as Amazon, trust is inferred from users’ feedback on past interactions. In this sense, trust is also relational: the relationship between two members is strengthened as they continue to interact with each other. If the outcome of an interaction is positive, trust grows; otherwise it decreases.
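There are two fundamental kinds of trust in online systems: direct trust and recommendation trust. Direct trust depends on the direct experience of a member with the other party, whereas recommendation trust is derived from the experiences that other members of the network have had with that party.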
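The way direct trust grows after positive interactions and shrinks after negative ones can be illustrated with a minimal sketch. The update rule below (an exponential moving average), the neutral starting value of 0.5 and the rate of 0.2 are assumptions made purely for illustration; they are not taken from this thesis or from any particular trust model.

```python
from dataclasses import dataclass


@dataclass
class DirectTrust:
    """Direct trust that one member holds in another, kept in [0, 1]."""

    value: float = 0.5  # assumed neutral starting point
    rate: float = 0.2   # assumed update rate, purely illustrative

    def update(self, positive: bool) -> float:
        # Move trust toward 1 after a positive interaction, toward 0 otherwise.
        target = 1.0 if positive else 0.0
        self.value += self.rate * (target - self.value)
        return self.value


# Two positive interactions followed by one negative interaction.
trust = DirectTrust()
for outcome in (True, True, False):
    trust.update(outcome)
```

Because each update moves the value only part of the way toward its target, recent interactions weigh more than old ones, which also reflects the dynamic nature of trust.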
1.4 CHARACTERISTICS OF TRUST
1. Context Specific: Trust is context-specific in its scope. The trust context is the area in which the trust relationship exists; examples include social networks and law enforcement. To illustrate, ‘Killian’ trusts ‘Avil’ as his pilot but does not trust Avil as his driver. Avil is therefore trusted only in the context of being a pilot.
2. Dynamic: Trust changes with time. It increases or decreases as new experiences are gained over the course of the relationship or interaction. In some cases, trust can even decay to the level of distrust. Old experiences become obsolete and in most instances irrelevant with time, so newer experiences are essential. In computer science, this property receives considerable attention, since a web of trust (WOT) keeps track of connections at every moment. Much research has been done to model the dynamics of trust.
3. Propagative: This is the most studied property of trust. The propagative property of trust resembles the word-of-mouth propagation of information among humans. Because of it, trust information can be passed along a trust chain in the social network (a trust chain is the network formed as trust is passed from one person to another). For example, if Gori trusts Joseph, who in turn trusts Amanda, whom Gori doesn’t know, Gori can derive some degree of trust in Amanda based on how trustworthy Joseph has found Amanda to be. Various trust models [Schillo et al. 2000; Mui et al. 2002; Sabater 2002; Yu et al. 2004] have used this property. Similarly, the literature based on the FOAF (Friend-Of-A-Friend) topology relies on the propagative nature of trust (Sherchan, 2013). This characteristic of trust should not be interpreted as transitivity.
4. Non-transitive: If Joseph trusts Gori, and Gori trusts Amanda, this doesn’t imply that Joseph trusts Amanda. In trust, “transitivity implies propagation but the reverse is not true”. In general, trust is not transitive. Unfortunately, this property is often confused with the propagative property of trust.
5. Composable: Because the propagation of trust and distrust along social chains allows a member of the network to compute trust in other members not directly connected to it, several chains may recommend different trust values for the same member. The trustor then needs to compose this trust information. The TidalTrust algorithm, for example, takes this into account. Typically, models that use the composition property also use the propagative property, computing trust values from several trust chains in order to make a trust decision. For instance, suppose Joseph is recommended to Gori by several chains in her network. Gori then needs to compose the trust data received from the different chains to decide whether she can trust Joseph.
6. Subjective: This property highlights that trust computation is also personalized. The disposition of the trustor largely determines the trust value that is computed. For example, suppose Gori gives a positive review of a book. Hillary, who believes that Gori’s reviews are always good, may choose the book on the strength of that review. Emeka, on the other hand, doesn’t always find Gori’s reviews to be good, so he may be sceptical about choosing the same book based on Gori’s review, although he might still choose it out of personal conviction.
7. Asymmetric: Asymmetry can be seen as an advanced form of the personalized nature of trust. Bob may trust Gori more than Gori trusts Bob. When one member of a social network is perceived to act in an untrustworthy manner, the other member may be forced to reduce the trust it places in that member. Asymmetry may be caused by differences in members’ beliefs, expectations and perceptions.
8. Event-sensitive: Trust takes a very long time to build, yet a single high-impact event can shatter it. “This aspect of trust has received even less attention in computer science” (Nepal et al 2010).
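The propagative and composable properties described in this list can be made concrete with a small sketch. It assumes trust values in [0, 1], propagates trust along a chain by multiplying the direct values, and composes several chains with a plain average. These rules, the numeric values and the second chain through Hillary are hypothetical choices for illustration only; they are not TidalTrust or any of the cited models.

```python
from statistics import mean

# Hypothetical direct trust values in [0, 1]. The dictionary is directed,
# so trust from A to B says nothing about trust from B to A (asymmetry).
direct_trust = {
    ("Gori", "Joseph"): 0.9,
    ("Joseph", "Amanda"): 0.8,
    ("Gori", "Hillary"): 0.6,
    ("Hillary", "Amanda"): 0.5,
}


def chain_trust(chain):
    """Propagate trust along one chain by multiplying the direct values."""
    result = 1.0
    for truster, trustee in zip(chain, chain[1:]):
        result *= direct_trust[(truster, trustee)]
    return result


def composed_trust(chains):
    """Compose the values recommended by several chains (plain average)."""
    return mean(chain_trust(chain) for chain in chains)


# Gori has no direct experience with Amanda but is reached by two chains.
chains = [["Gori", "Joseph", "Amanda"], ["Gori", "Hillary", "Amanda"]]
inferred = composed_trust(chains)
```

Multiplication makes the inferred value weaker than any single link in the chain, which is one simple way of expressing that propagated trust is not full transitivity.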
1.5 RECOMMENDER SYSTEMS
Recommender systems are subsystems that suggest items to the users of a particular system in order to ease decision making. “Item” is the general term for what the system recommends to users. Faced with an overwhelming number of items, users are likely to be unsure what to select: which item to add to their cart, which website to visit, which item to add to their wish-list, and so on. Recommender systems are usually designed around a specific type of item, so all components of the system, such as the user interface, the recommendation techniques, the algorithms and the user interaction, are customized to provide useful, personalized suggestions of that type of item to a user or a user group. In some instances the suggestions are non-personalized, as in newspapers and top-10 selections of books. Non-personalized recommender systems of this type are usually not of research interest, since they are in general much like the ordinary web services rendered to all users of a platform. A personalized example is a developer recommender system that recommends developers for a newly created project. Popular websites such as Facebook use recommender systems to recommend friends of friends; Amazon, eBay and most e-commerce websites use theirs to suggest items to buy. Personalized recommendations are provided as a list ranked in order of item preference. The ranking is performed by predicting the most suitable services, products or items a user might need based on the user’s preferences and constraints. User preferences can be collected explicitly (e.g., product ratings, friend ratings, reviews) or inferred from user-system interaction (e.g., page navigation, clicks and time spent reviewing an item) as a sign of indirect preference for an item (“Recommender system handbook,” 2014). With the rise of e-commerce websites and new e-business services (item comparisons, annotated searches, service rendering platforms), the need to produce recommendations by filtering the entire content of a website is pressing. Recommender systems have proved effective in dealing with information overload by pointing users towards newly created item lists that may be of interest for their current task or need.
1.6 PROBLEM STATEMENT
With the advent of social networking, which has found use in all sectors of life and human development, social coding platforms included, part of the challenge is recommending projects of possible interest to users. Studying the social network structure that develops as developers collaborate on projects can give insight into generating a trust network. Such a network can assist in developing a recommender algorithm that recommends potential developers for any project created on such platforms.
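As a rough illustration of how a developer network can be inferred from project participation, the sketch below links two developers whenever they share a project and weights the link by the number of shared projects. The project and developer names are hypothetical placeholders, and counting shared projects is only one possible weighting; it is not presented as the ProjectTrust model.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical project membership; real data would come from a GitHub dataset.
project_members = {
    "project_a": {"dev1", "dev2", "dev3"},
    "project_b": {"dev2", "dev3"},
    "project_c": {"dev3", "dev4"},
}


def collaboration_network(members_by_project):
    """Return edge weights: the number of projects each developer pair shares."""
    weights = defaultdict(int)
    for members in members_by_project.values():
        # Sorting gives every pair a canonical (smaller, larger) order.
        for pair in combinations(sorted(members), 2):
            weights[pair] += 1
    return dict(weights)


network = collaboration_network(project_members)
```

Developers who never share a project, such as dev1 and dev4 here, have no edge between them, so any trust between them would have to be inferred through intermediate collaborators.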
1.7 OBJECTIVE OF THE RESEARCH WORK
Although much research has been done in both academia and industry on social recommender systems, little attention has been paid to applying recommender systems to social coding platforms. The primary objective of this work is to review existing trust-aware recommender algorithms extensively and document them properly. This will lead to applying these algorithms to GitHub datasets to see whether trust networks can give better recommendations than conventional recommender system techniques. The final objective is to devise a hybrid or novel algorithm that can outperform the existing algorithms.
1.8 RESEARCH METHODOLOGY
The methods adopted in this research centre on the exploration of information retrieval techniques for identifying similar projects. To achieve this, a natural language processing (NLP) technique called term frequency-inverse document frequency (TF-IDF) is applied to extract features from readme files, and a similarity metric is adopted to compute similarity. The datasets are converted to graphs using a graph library, and a new graph, called a trust graph, is generated. Trusted users’ experience levels and skill-set levels are evaluated, which helps build the recommendation list. Word clouds and the Pearson product-moment correlation are used for evaluation and experimentation.
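A minimal sketch of the feature-extraction and similarity step is given below. It assumes the unsmoothed weighting tf × log(N / df) and cosine similarity as the similarity metric; the section does not name the metric or the exact TF-IDF variant, so both are assumptions, and the readme snippets are invented placeholders rather than real GitHub data.

```python
import math
from collections import Counter

# Hypothetical readme snippets standing in for real GitHub readme files.
readmes = {
    "repo_a": "fast json parser for python",
    "repo_b": "json parser written in rust",
    "repo_c": "image gallery with photo sharing",
}


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector (term -> weight) for each document."""
    tokens = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for name, words in tokens.items():
        counts = Counter(words)
        vectors[name] = {
            term: (count / len(words)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = tfidf_vectors(readmes)
sim_ab = cosine_similarity(vectors["repo_a"], vectors["repo_b"])
sim_ac = cosine_similarity(vectors["repo_a"], vectors["repo_c"])
```

In this toy data, repo_a and repo_b share the terms "json" and "parser" and therefore score higher than repo_a and repo_c, which share no terms. A production pipeline would normally add tokenization, stop-word removal and a smoothed IDF.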
1.9 SCOPE OF WORK
The scope of this work includes:
1. Applying an existing algorithm on a GitHub dataset;
2. Developing a trust inference algorithm; and
3. Applying the new algorithm to the GitHub dataset.
1.10 ORGANISATION OF WORK
Chapter Two presents an extensive literature review on trust inference algorithms and GitHub repository recommendation. Chapter Three presents the research methodology and the implementation of the work. Chapter Four highlights the experiments performed and the results obtained. Finally, Chapter Five presents a summary of the work done in this thesis and recommendations for future work.