
Commit 42efe52

Merge pull request #189 from Michadelic/master
First draft of the "InnerSource Activity Score" pattern
2 parents d82c28b + 49aa86d commit 42efe52

3 files changed

Lines changed: 128 additions & 0 deletions


README.md

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ The below lists all known patterns. They are grouped into three [maturity levels
* [InnerSource License](patterns/2-structured/innersource-license.md) - *Two legal entities that belong to the same organization want to share software source code with each other but they are concerned about the implications in terms of legal liabilities or cross-company accounting. An **InnerSource License** provides a reusable legal framework for the sharing of source code within the organization. This opens up new collaboration options, and makes the rights and obligations of the involved legal entities explicit.*
* [InnerSource Portal](patterns/2-structured/innersource-portal.md) - *Create an intranet website that indexes all available InnerSource project information. This will enable potential contributors to more easily learn about projects that might interest them and for InnerSource project owners to attract an outside audience.*
* [Praise Participants](patterns/2-structured/praise-participants.md) - *Thank contributors effectively to engender further engagement from them and to encourage others to contribute*
* [Repository Activity Score](patterns/2-structured/repository-activity-score.md) - *The repository activity score is a numeric value that represents the (GitHub) activity of an InnerSource project.*
* [Review Committee](patterns/2-structured/review-committee.md) - *A formal review committee is setup within an org to "officiate" particular inner source projects with resources, etc.*
* [Service vs. library: It's inner source, not inner deployment](patterns/2-structured/service-vs-library.md) - *Teams in a DevOps environment may be reluctant to work across team boundaries on common code bases due to ambiguity over who will be responsible for responding to service downtime. The solution is to realize that often it's
possible to either deploy the same service in independent environments with separate escalation chains in the event of service downtime or factor a lot of shared code out into one library and collaborate on that.*
85 KB
Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
## Title

Repository Activity Score

## Patlet

Potential contributors want to find active InnerSource projects in need of their help. By calculating a repository activity score for each project, a ranked list of projects can be created (e.g. on the [InnerSource portal](innersource-portal.md)), so that potential contributors can more easily determine which project they want to contribute to.

## Problem

**In which order** should InnerSource projects be presented? Typical ranking KPIs like *GitHub Stars*, *Number of Forks*, *Number of Commits*, *Lines of Code*, or *Last Update* are not sufficient on their own to indicate how active a project is.

Active projects with a lot of traction, as well as fairly new and enthusiastic projects in need of contributors, should be ranked higher than mature projects that show little activity or are in maintenance mode.

A new metric derived from several KPIs is needed to define a reliable and versatile score for a project's activity level.
This score can then be used to sort projects according to their activity level.

## Story

When InnerSource has been practiced for a long time or scales beyond a certain number of projects (say 50, to give a meaningful threshold), it becomes hard to find the currently most popular and active InnerSource projects. Projects that have existed for a long time are well known but may no longer be very active. Fairly new projects, on the other hand, don't have a reputation or an active community yet.

A list of InnerSource projects should not be considered a static resource, but an exciting place to discover and explore new and active projects, just like a news page that lists the most interesting topics of the day first. It is therefore beneficial when the order of the projects is updated regularly and changes according to each project's popularity and activity.

These considerations led to a first prototype for calculating a repository activity score. It worked surprisingly well and produces an ever-changing order of projects according to their activity.

## Context

Discovering InnerSource projects can be facilitated with the [InnerSource Portal](innersource-portal.md) and the [Gig Marketplace](gig-marketplace.md) patterns, or by promoting projects on other communication channels and platforms. The activity score defines a default order in which projects are presented to the community.

## Forces

KPIs that can be fetched automatically by querying the GitHub API tell only part of the truth. What about code quality, the availability of good documentation, or an active and helpful community that makes the project a fun place to contribute?

Such "soft" KPIs would have to be added to the calculation and the resulting score manually or semi-automatically. If tools exist that provide more context for the repository, such as code coverage reporting, their output can be worked in as well.

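One way such additional signals could be layered on top of the automated score is sketched below. This is only an illustration: the function name `applySoftKpis`, the fields `coverage` and `manualAdjustment`, and the 10 % weight for code coverage are hypothetical assumptions and not part of the pattern.

```javascript
// Hypothetical adjustment layer on top of an automatically calculated score.
// Field names and weights are illustrative assumptions.
function applySoftKpis(automatedScore, softKpis = {}) {
  const { coverage = null, manualAdjustment = 0 } = softKpis;
  let score = automatedScore;
  // coverage is expected as a fraction between 0 and 1;
  // full coverage raises the score by at most 10 %
  if (typeof coverage === "number") {
    score *= 1 + Math.min(Math.max(coverage, 0), 1) * 0.1;
  }
  // manual bonus or malus, e.g. for documentation quality or community support
  score += manualAdjustment;
  // keep the result a non-negative integer, like the automated score
  return Math.max(0, Math.round(score));
}

console.log(applySoftKpis(200, { coverage: 0.5, manualAdjustment: 20 }));
```

Keeping the soft KPIs in a separate step leaves the automated calculation untouched, so manual evaluations can be added or removed without changing the core formula.
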
## Sketch

![Ecosystem for the Repository Activity Score](../../assets/img/repository_activity_score.png)

A centralized approach for calculating and applying the repository activity score. For more details, see [Resulting Context](#resulting-context).

## Solutions

The repository activity score is a numeric value that represents the (GitHub) activity of an InnerSource project. It is derived automatically from repository statistics like GitHub stars, watches, and forks and may be enriched with KPIs from other tools or manual evaluations.

In addition, it considers activity parameters like last update and creation date of the repo to give young projects with a lot of traction a boost.
Projects with contributing guidelines and issues (public backlog) receive a higher ranking as well.

All of this can be fetched and calculated automatically using the result sets of the [GitHub search API](https://developer.github.com/v3/search/#search-repositories) and the [GitHub statistics API](https://developer.github.com/v3/repos/statistics/). Other code versioning systems like Bitbucket, GitLab, or Gerrit can be integrated as well if a similar API is available.

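A minimal sketch of how these two inputs could be collected, assuming Node.js 18 or later (global `fetch`) and the public `api.github.com` host. The helper names (`buildSearchUrl`, `buildParticipationUrl`, `fetchRepositories`), the idea of selecting repositories by organization and topic, and the use of the `all` array from the participation response are illustrative assumptions; only the two REST endpoints come from the GitHub API.

```javascript
// Assumption: public GitHub; a GitHub Enterprise instance uses a different base URL
const GITHUB_API = "https://api.github.com";

// Search URL for all repositories of an organization that carry a given topic
function buildSearchUrl(org, topic) {
  const query = `org:${org} topic:${topic}`;
  return `${GITHUB_API}/search/repositories?q=${encodeURIComponent(query)}&per_page=100`;
}

// URL of the weekly commit counts for the last 52 weeks
function buildParticipationUrl(fullName) {
  return `${GITHUB_API}/repos/${fullName}/stats/participation`;
}

// fetchImpl can be replaced to run the function without network access
async function fetchRepositories(org, topic, token, fetchImpl = fetch) {
  const headers = { Accept: "application/vnd.github+json" };
  if (token) headers.Authorization = `Bearer ${token}`;

  const searchResponse = await fetchImpl(buildSearchUrl(org, topic), { headers });
  const { items } = await searchResponse.json();

  for (const repo of items) {
    const statsResponse = await fetchImpl(buildParticipationUrl(repo.full_name), { headers });
    // GitHub answers 202 while the statistics are still being computed; skip those
    if (statsResponse.status === 200) {
      const participation = await statsResponse.json();
      // "all" holds the weekly commit counts of all contributors
      repo._InnerSourceMetadata = { participation: participation.all };
    }
  }
  return items;
}
```

The sketch reads only the first page of search results; a real crawler would also need to follow pagination and respect rate limits.
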
The code below assumes that the variable `repo` contains a repository entity fetched from the GitHub `search` API, that `repo._InnerSourceMetadata.participation` holds the weekly commit counts obtained from the GitHub `stats/participation` API, and that `repo._InnerSourceMetadata.guidelines` is set when the repository provides contribution guidelines.

Manual adjustments according to soft KPIs (see [Forces](#forces)) can be made on top if needed.

``` javascript
// calculate a virtual InnerSource score from stars, watches, commits, and issues
function calculateScore(repo) {
  // weighting:
  // forks and watches count most, then stars, add some little score for open issues, too
  let iScore = 1 + repo["forks_count"] * 5 + repo["watchers_count"] + repo["stargazers_count"] / 3 + repo["open_issues_count"] / 5;
  let iDaysSinceLastUpdate = (new Date().getTime() - new Date(repo.updated_at).getTime()) / 1000 / 86400;
  // updated in last 3 months: adds a bonus multiplier between 0..1 to overall score (1 = updated today, 0 = updated more than 100 days ago)
  iScore = iScore * (1 + (100 - Math.min(iDaysSinceLastUpdate, 100)) / 100);
  // evaluate participation stats for the previous 3 months
  repo._InnerSourceMetadata = repo._InnerSourceMetadata || {};
  if (repo._InnerSourceMetadata.participation) {
    // average commits: adds a bonus multiplier between 0..1 to overall score (1 = >10 commits per week, 0 = less than 3 commits per week)
    // slice(-13) keeps the 13 most recent weekly commit counts
    let iAverageCommitsPerWeek = repo._InnerSourceMetadata.participation.slice(-13).reduce((a, b) => a + b, 0) / 13;
    iScore = iScore * (1 + (Math.min(Math.max(iAverageCommitsPerWeek - 3, 0), 7)) / 7);
  }
  // boost calculation:
  // all repositories updated in the previous year will receive a boost of maximum 1000 declining by days since last update
  let iBoost = (1000 - Math.min(iDaysSinceLastUpdate, 365) * 2.74);
  // gradually scale down boost according to repository creation date to mix with "real" engagement stats
  let iDaysSinceCreation = (new Date().getTime() - new Date(repo.created_at).getTime()) / 1000 / 86400;
  iBoost *= (365 - Math.min(iDaysSinceCreation, 365)) / 365;
  // add boost to score
  iScore += iBoost;
  // give projects with contribution guidelines (CONTRIBUTING.md) file a static boost of 100
  iScore += (repo["_InnerSourceMetadata"] && repo["_InnerSourceMetadata"]["guidelines"] ? 100 : 0);
  // build in a logarithmic scale for very active projects
  // (open ended: scores above 3000 become 3000 + 100 * ln(score), i.e. roughly 3800 and growing slowly)
  if (iScore > 3000) {
    iScore = 3000 + Math.log(iScore) * 100;
  }
  // final score is a rounded value starting from 0
  iScore = Math.round(iScore - 1);
  // add score to metadata on the fly
  repo._InnerSourceMetadata.score = iScore;
  return iScore;
}
```

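To make the weighting more tangible, the following sketch walks through the same arithmetic for one made-up repository. All input numbers are hypothetical, and fixed day counts replace the date calculations so that the result does not depend on when the snippet is run.

```javascript
// Hypothetical repository: all numbers are made up for illustration
const forks = 4, watchers = 10, stars = 30, openIssues = 5;
const daysSinceLastUpdate = 50;
const daysSinceCreation = 730; // older than one year
const averageCommitsPerWeek = 5; // over the last 13 weeks
const hasGuidelines = true;

// base score: 1 + 20 + 10 + 10 + 1 = 42
let score = 1 + forks * 5 + watchers + stars / 3 + openIssues / 5;
// recency multiplier: 1 + (100 - 50) / 100 = 1.5, so 42 becomes 63
score *= 1 + (100 - Math.min(daysSinceLastUpdate, 100)) / 100;
// commit multiplier: 1 + (5 - 3) / 7, so 63 becomes 81
score *= 1 + Math.min(Math.max(averageCommitsPerWeek - 3, 0), 7) / 7;
// boost for young repositories: the age factor (365 - 365) / 365 is 0, so no boost
let boost = 1000 - Math.min(daysSinceLastUpdate, 365) * 2.74;
boost *= (365 - Math.min(daysSinceCreation, 365)) / 365;
score += boost;
// contribution guidelines add 100, giving 181
score += hasGuidelines ? 100 : 0;
// the value is below 3000, so the logarithmic scaling does not apply;
// the final rounding removes the initial 1
const finalScore = Math.round(score - 1);
console.log(finalScore); // 180
```

For a repository created only a few weeks ago, the age factor would be close to 1 and the boost of up to 1000 would dominate the result, which is how young projects are lifted in the ranking.
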
## Resulting Context

Contributors are free to commit a part of their time to an InnerSource project. They may choose to contribute to a project that they depend on for the work in their regular team anyway. However, they may also choose to contribute to something completely different, based on their interests and personal development goals.

Projects can be sorted and presented by repository activity score to give a meaningful order in a portal presenting projects to potential new contributors. The score can be calculated on the fly or in a background job that evaluates all projects on a regular basis and stores a list of results.

A crawler that regularly searches all InnerSource repositories (e.g. tagged with a certain topic in GitHub) can be a helpful addition as well. It provides a ranked list of projects that can be used as an input for tools like the [InnerSource Portal](innersource-portal.md), a search engine, or an interactive chat bot.

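The sorting step could look like the following sketch. It assumes that a scoring function such as `calculateScore` from the Solutions section is passed in as `scoreFn`; the name `rankRepositories`, the shape of the result entries, and the sample data are illustrative assumptions.

```javascript
// Returns name and score of each repository, highest score first.
// scoreFn is expected to return a numeric score for a repository object.
function rankRepositories(repos, scoreFn) {
  return repos
    .map((repo) => ({ name: repo.full_name, score: scoreFn(repo) }))
    .sort((a, b) => b.score - a.score);
}

// Usage with made-up data and a stand-in scoring function
const sampleRepos = [
  { full_name: "acme/legacy-tool", stargazers_count: 3 },
  { full_name: "acme/new-library", stargazers_count: 12 },
];
const ranking = rankRepositories(sampleRepos, (repo) => repo.stargazers_count);
console.log(ranking.map((entry) => entry.name));
```

A background job or crawler could run this on a schedule and store the resulting list, so that the portal only has to read the precomputed ranking.
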
## Rationale

The repository activity score is a simple calculation based on the GitHub API. It can be fully automated and easily adapted to new requirements.

## Known Instances

Used in SAP's InnerSource project portal to define the default order of the InnerSource projects. It was first created in July 2020 and has been fine-tuned and updated frequently ever since.

This pattern emerged when the score was proposed to the InnerSource Commons in July 2020.

## Status (optional until merging)

* First Draft: 30th July 2020
* Second Draft: 5th August 2020
* Third Draft: 6th August 2020

## Author(s)

[Michael Graf (SAP)](mailto:mi.graf@sap.com)

## Acknowledgements

Thank you to the InnerSource Commons Community for lightning-fast advice and a lot of helpful input to feed this pattern! Especially:
* Johannes Tigges
* Sebastian Spier
* Maximilian Capraro
* Tim Yao
