A review of open source practices
This wiki presents the findings of some research into open source project processes. The focus was non-technical in nature, and the aim was to get a feel for how other open source projects have handled their internal processes in the past. The notes present a rough summary of each resource, with the resources organized into groups.
The notes below summarize a range of material on the subject, including academic papers, blog posts, and other online materials such as the GitHub open source guide. They cover general overviews as well as particular successes and failures. The focus is generally on projects which have some similarity to RIOT, although not always; for example, the broad reviews tend to cover the entirety of open source projects in some way.
Broad reviews
- Understanding the Impressions, Motivations, and Barriers of One Time Code Contributors to FLOSS Projects: A Survey (Lee, Carver, Bosu, 2017)
- Defining Open Source Software Project Success - Kevin Crowston, Hala Annabi and James Howison (2003)
- Free/Libre Open-Source Software Development: What We Know and What We Do Not Know - Crowston, Wei, Howison, Wiggins (2012)
- The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary (Eric S Raymond)
- Internet Success: A Study of Open-Source Software Commons, (Schweik and English) (2009)
- Two Case Studies of Open Source Software Development: Apache and Mozilla - AUDRIS MOCKUS
- Case Studies
Guides
- Github open source guides
- Samsung presentation on open source projects
- How to be an open source gardener - Issue triage
- Scaling open source communities
- The Social Coding Contract presentation, RubyConf 2014
- Moya's Contributing.md Template for Liberal Contribution projects
- The art of PR closing
- Consensus decision making - Wikipedia
- Open source development at Google
- How to make your open source project thrive - Andrey Petrov (urllib3)
- Maintaining a growing open source project
- How I got 1000 stars on my github project
- How to get hundreds of stars on your github project
These sources represent a broad review of the subject. These are literature reviews, or studies using Github or SourceForge data, or books, or other overviews of the subject, all from sources considered reputable.
Understanding the Impressions, Motivations, and Barriers of One Time Code Contributors to FLOSS Projects: A Survey (Lee, Carver, Bosu, 2017)
- Most one-time contributors only contribute once because, e.g., they just want to submit a patch or fix bugs that impede their work
- Otherwise: barriers that prevent people going further/actions we can take to increase chances of conversion/other tips
- Responsiveness of project members
- Mostly, OTCs want to fix a bug
- Often, they are community minded: can leverage support of the community
- Quick turnaround on review would encourage people because otherwise it takes a lot of time
- Bite-sized or somehow time-defined pieces of work (issues) would encourage people so it doesn't necessarily take a lot of time
- Ability to easily get onboard without familiarity of code base would encourage people, otherwise it takes a lot of time
- Short and simple development/submission/review process would encourage people
- Entry barriers to understanding the project (documentation, setup).
Defining Open Source Software Project Success - Kevin Crowston, Hala Annabi and James Howison (2003)
This study covered the ways by which we may measure the success of open source projects. Generally, their conclusion is that a range of measures is appropriate. They do a literature review; an analysis based on a process model for OSS development; and a questionnaire on slashdot.org.
Points that stand out to me are code quality, developer satisfaction, number of users, time to close bugs or implement features, level of activity.
- Literature review
- System and information quality
- Code quality (e.g., understandability, completeness, conciseness, portability, consistency, maintainability, testability, usability, reliability, structuredness, efficiency)
- Documentation quality
- User satisfaction
- User ratings
- Opinions on mailing lists
- User surveys
- Use
- Use (e.g., Debian Popularity Contest)
- Number of users
- Downloads
- Inclusion in distributions
- Popularity or views of information page
- Package dependencies
- Reuse of code
- Individual and organizational impacts
- Economic and other implications
- Process analysis, DeLone and McLean's success model
- Project output
- Movement from alpha to beta to stable
- Achieved identified goals
- Developer satisfaction
- Process
- Number of developers
- Level of activity (developer and user contributions, number of releases)
- Time between releases
- Time to close bugs or implement features
- Outcomes for project members
- Individual job opportunities and salary
- Individual reputation
- Knowledge creation
- Questionnaire. The responses were, in order:
- 1 – Developer satisfaction
- 2 – User involvement
- 3 – Developer involvement
- 4 – User satisfaction
- 5 – Code quality
- 6 – Adherence to processes
- 7 – Meets requirements, attention and recognition (joint)
- etc
Free/Libre Open-Source Software Development: What We Know and What We Do Not Know - Crowston, Wei, Howison, Wiggins (2012)
This study focuses on the state of the literature, investigating which areas of the subject have been studied, how much, and which haven't. It also offers some useful insights in itself. It can also provide a springboard for investigation in a range of areas, as the literature has been characterised and categorised according to a model through what seems to be a methodical process.
- Characterized by a globally distributed developer force; a rapid, reliable software development process; and a diversity of tools to support distributed collaborative development, effective FLOSS development teams somehow profit from the advantages and overcome the challenges of distributed work [Alho and Sulonen 1998]
- Some notable findings that stuck out for me:
- Motivation
- Extrinsic motivations: reputation, and reward motives such as career development, particularly
- Intrinsic motivations: Enjoyment based motivations such as fun and sharing or learning opportunities, particularly
- Internalized extrinsic motivations: user needs, particularly
- Motives are not static but evolve over time. i.e., what motivates them to start does not necessarily motivate them to continue.
- A need for software drives initial participation but the majority of participants leave once their needs are met. For the remaining developers, other motives evolve. Related to personal history prior to and during participation. Initial motivations do not effectively predict long term contribution. Situated learning and identity construction behaviours (?) are positively linked to sustained contribution (Fang and Neufeld [2009])
- Xu et al. [2009] found that individuals' involvement in FLOSS projects depends on both intrinsic motivations (i.e., personal needs, reputation, skill gaining benefits, and fun in coding) and project community factors (i.e., leadership effectiveness, interpersonal relationship, and community ideology)
- Roberts et al. [2006] studied the impact of different motivations on individual contribution levels in Apache project. The results showed that developers' paid participation and status motivations lead to above-average contribution levels; use-value motivations lead to below-average contribution levels; and intrinsic motivations do not significantly impact average contribution levels
- sufficient detail has been provided regarding why individuals contribute to FLOSS development, but little work has been done to examine the impact of various motivations on individual behaviors in FLOSS development. It seems likely that motivations are linked to other facets of contribution, such as longevity of participation or achievement of leadership. Further, few studies have examined changes in motivation over time, although previous research has indicated that motivations may alter over time. For example, Ghosh [2002] mentioned that reputation is a stronger motivation amongst the longer term participants (who might be expected to actually have garnered reputation)(Conclusions)
- Processes
- Projects usually rely on "virtual project management," meaning that different people take on management tasks as needed. In this way, the project also mobilizes use of private resources [Scacchi 2004]
- Glance [2004] examined the kernel change logs to determine the criteria applied for releasing the Linux kernel. She argued that a release contains whatever has been added, as opposed to a clear process of planning the functionality needed for a release. Same with requirements analysis – mailing list discussions, bug reports and feature requests instead
- Modularity has been seen as key to the feasibility of distributed development (Scacchi [2004], MacCormack et al [2006]). This includes the ability to add functionality via scripting or plugins.
- Testing processes vary by project.
- Similar difference in release schedules.
- Maintenance
- the nature of maintenance has been described as more like reinvention, which acts as "a continually emerging source of adaptation, learning, and improvement in FLOSS functionality and quality" [Scacchi 2004].
- Activities include problem solving, user support, software quality maintenance, patches, change management, problem resolution
- Singh et al. [2006] analyzed help interactions and found that this process is often inefficient because initial posts lack the necessary information to answer questions, resulting in back-and-forth postings. The authors suggested that some details be captured automatically as part of initial reports, and also articulated the potential benefit of developing practices and tools for more explicitly reusing information, for example, marking particularly helpful answers to automatically populate a kind of FAQ
- many, if not all, of the major FLOSS projects have planning and requirements analysis mechanisms. Monthly meetings, roadmaps, schedules.
- Social processes – verbal and behavioural activities
- Socialization is treated in the literature as a process that places emphasis on a participant's actions and willingness to understand not just the code base, but also the social structure of the project.
- The onus for socialization falls almost entirely on the would-be developer, rather than the team. The process thus acts as a filter for participants that match the project. (von Krogh et al. [2003])
- Decision making should be transparent. A lack of transparency and consideration in the decision-making process tends to alienate those who are not being consulted and erodes the sense of community [Jensen and Scacchi 2005]. Everyone should understand how decisions are being made. Decision making styles may change over the course of the project. In the early life of a project, a small group will control decision making, but as the project grows, more developers will get involved [Fitzgerald 2006]
- Leadership is usually shared [Sadowski et al. 2008], and leaders emerge rather than being appointed. Individuals are perceived by others as leaders based on their sustained and strong technical contributions [Scozzi et al. 2008], diversified skills [Giuri et al. 2008], and a structural position in their teams [Evans and Wolf 2005; Scozzi et al. 2008]. According to Fielding [1999], shared leadership enables these teams to continue to survive independent of individuals, and enables them to succeed in a globally distributed and volunteer organizational environment
- Coordination mechanisms discussed are: mechanisms to control the number of developers (lower = more coordination, nb that a small portion of developers are responsible for most of the outputs); Modularity and division of labor (again, modularity is emphasized); Task assignment mechanisms (Self-assignment is highly dominant); instructive materials and standardization initiatives (as a means to coordinate effort, rather than dictate design)
- Conflict management mechanisms: From interviews, van Wendel de Joode [2004] identified four conflict management mechanisms between firm-supported developers and voluntary developers: third party intervention, modularity, parallel software development lines, and the exit option.
- little research has been conducted on social processes related to conflict management and team maintenance (conclusion)
- Emergent project team states
- Trust: Trust is often related to team effectiveness. For example, Stewart and Gosain [2001] proposed that shared ideology enables the development of affective and cognitive trust, which in turn leads to group efficacy and effectiveness. But not all researchers share the same belief. In a study of published case studies of FLOSS projects, Gallivan [2001] found that group effectiveness can be achieved in the absence of trust if a set of control and self-control mechanisms is presented.
- while FLOSS projects have larger total numbers of contributors, the bulk of activity, especially for new features, is quite highly centralized Mockus et al. [2002]
- Prior research suggests that the existence of accurate shared-mental models that guide member actions are important for team effectiveness [Cannon-Bowers and Salas 1993]. Technically and organizationally.
- Most current research on FLOSS team effectiveness uses objective measures such as downloads, code quality, bug-fixing time, and number of developers. Behavioral measures, which are believed to impact members' desire to work together in the future, are typically missing. (conclusion)
- Relationship between success and other variables
- Subramaniam et al. [2009] found that restrictive licenses (as defined in Section 4.2.2) have an adverse impact on FLOSS success
- Stewart and Ammeter [2002] found that sponsorship of a project, project types, and project development status are all related to one measure of project success: popularity (i.e., how much user attention is focused on the project).
- Based on 75 FLOSS projects, Capra et al. [2008] reported a high degree of openness in governance practices leads to higher software quality
- Gallivan [2001] noted that although trust is rarely mentioned, ensuring control is an important criterion for effective performance within OSS projects.
- Wynn [2004] found the fit between the life cycle stage and the specific organizational characteristics of projects (focus, division of labor, role of the leader, level of commitment, and coordination/control) was an effective indicator of success, as measured by the satisfaction and involvement of both developers and users
- Evolution of product and community
- Research confirms that the evolution of projects' size over time seems to contradict the laws of software evolution proposed for commercial software [Koch 2004]. For example, Godfrey and Tu [2000] observed that the evolution of the Linux Kernel does not obey Lehman's laws, which state that "as the system grew, the rate of growth would slow, with the system growth approximating an inverse square curve."
- By studying three FLOSS projects, Long and Siau [2007] found that project interaction patterns evolve from a single hub at the beginning to a core/periphery model as the projects mature
- a snowball effect might lead more members to leave when one member drops out, which might result in network separation and disintegration, so it may be important to maintain a balanced composition of all the different roles in a community [Nakakoji et al. 2002]
The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary (Eric S Raymond)
This is generally about creating high quality open source software. The 'cathedral' is a model in which only a select group of developers can work on the code. The 'bazaar' is the open model, e.g. Linux. It is seen as a central book in the open source movement.
19 Lessons. A series of maxims.
- Every good work of software starts by scratching a developer's personal itch - Because developers are writing software they want = motivation.
- Good programmers know what to write. Great ones know what to rewrite (and reuse) – eg Linux was started by reusing Minix code.
- Plan to throw one [version] away; you will, anyhow. (Copied from Frederick Brooks' The Mythical Man-Month) – you don't really understand the problem until the first implementation has been done
- If you have the right attitude, interesting problems will find you – e.g. to help with another problem, rather than explicitly focusing on finding an interesting problem
- When you lose interest in a program, your last duty to it is to hand it off to a competent successor – although another hacker is likely to find the abandoned work anyway, see 4.
- Treating your users as co-developers is your least-hassle route to rapid code improvement and effective debugging – because these users can diagnose problems and suggest fixes.
- Release early. Release often. And listen to your customers – means the developers are continually stimulated
- Given a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone - "given enough eyeballs, all bugs are shallow"
- Smart data structures and dumb code works a lot better than the other way around – when you understand the data structures, understanding the code is easier.
- If you treat your beta-testers as if they're your most valuable resource, they will respond by becoming your most valuable resource – releasing early and often, and engaging very actively with the beta testers incl polling about design decisions
- The next best thing to having good ideas is recognizing good ideas from your users. Sometimes the latter is better
- Often, the most striking and innovative solutions come from realizing that your concept of the problem was wrong – be open to this
- Perfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away. (Attributed to Antoine de Saint-Exupéry) – when the code is getting both better and simpler, we're making progress
- Any tool should be useful in the expected way, but a truly great tool lends itself to uses you never expected
- When writing gateway software of any kind, take pains to disturb the data stream as little as possible—and never throw away information unless the recipient forces you to!
- When your language is nowhere near Turing-complete, syntactic sugar can be your friend.
- A security system is only as secure as its secret. Beware of pseudo-secrets.
- To solve an interesting problem, start by finding a problem that is interesting to you.
- Provided the development coordinator has a communications medium at least as good as the Internet, and knows how to lead without coercion, many heads are inevitably better than one.
Internet Success: A Study of Open-Source Software Commons, (Schweik and English) (2009)
A prominent textbook on the subject. They are of the opinion that Linux and Apache were anomalies. They studied SourceForge data. NB Crowston, Annabi and Howison say this approach is limited. Schweik and English supplemented this with a survey of 1400 developers.
Identified characteristics that correlated statistically to greater likelihood of success. They defined "success" as achieving at least three software releases and having value for at least a few users, value being attributed to downloads/installations, development activity, posts to discussion boards or email lists, and the addition of developers.
- Most open source projects not successful - only 17%. Most in the initiation stage, but almost as many after the initial release.
- Successful projects have some common characteristics. To me, the summary of this seems to be 1) clear goals 2) good internal organization and 3) modularity of the architecture to facilitate co-working. Most of these come down to effective leadership, someone who was trying to describe the project to the world and provide a clear vision of where the project was going
- A "relatively clearly defined vision and a mechanism to communicate the vision early in the project's life"
- A clearly defined set of users who have a need that can be met by the software
- Well-articulated and clear goals established by the project's leaders
- Good project communication -- a quality website, good documentation, a bug-tracking system and a communication system such as an email list or forum.
- Once a project has achieved its initial release, a software architecture that is modular -- so future development tasks can be carved out at different levels of complexity for other developers to work on. (Modular architecture alone isn't enough -- many abandoned projects were also modular, Schweik said.)
- Developers also users of the software – I need this software so I want to work on it, even after its initial release
- The Internet facilitates collaboration, driven by this user-centric need. Even one developer added is meaningful, since most open-source projects are relatively small
- Some characteristics that were found to not matter:
- Which operating system the software was written for
- How many developers were involved
- Whether there was a formalized system of governance (this seemed to be because lots of projects were small)
- Which type of open source license was used
- Whether the project has a source of funding. Projects that are funded have higher success rates, although the cause-effect relationship could be the other way round
- Success doesn't have to mean large-scale adoption. A clearly identifiable need, even without a big user base, can be successful.
What criteria should a funder use to determine which projects to support?
For projects in the growth stage (after the first code release):
- A well-defined set of users
- Presence of developers who are interested in continuing to use the software
- Developers with prior open source experience
- Clear vision being articulated by project leaders
- Professional web presence
Came to a series of hypotheses:
H1: “Open source developments will have a core of developers who control the code base, and will create approximately 80% or more of the new functionality. If this core group uses only informal ad hoc means of coordinating their work, the group will be no larger than 10 to 15 people.”
H2: “If a project is so large that more than 10 to 15 people are required to complete 80% of the code in the desired time frame, then other mechanisms, rather than just informal ad hoc arrangements, will be required in order to coordinate the work. These mechanisms may include one or more of the following: explicit development processes, individual or group code ownership, and required inspections.”
H3: “In successful open source developments, a group larger by an order of magnitude than the core will repair defects, and a yet larger group (by another order of magnitude) will report problems.”
H4: “Open source developments that have a strong core of developers but never achieve large numbers of contributors beyond that core will be able to create new functionality but will fail because of a lack of resources devoted to finding and repairing defects.”
H5: “Defect density in open source releases will generally be lower than commercial code that has only been feature-tested, that is, received a comparable level of testing.”
H6: “In successful open source developments, the developers will also be users of the software.”
H7: “OSS developments exhibit very rapid responses to customer problems.”
These are summaries of some of the key organizational points from some prominent open source projects. Generally, they have been chosen as projects which share some similarities with RIOT, or which have certain operational processes that could be helpful for RIOT.
Node.js foundation's base contribution policy
- They had a similar problem a few years ago – loads of overhead for committers
- People can log issues with questions and feedback regarding the policy
- "Important not to define processes that make life easier for a small group of maintainers, often at the cost of attracting new contributors". As projects mature, there's a tendency to become top heavy and overly hierarchical as a means of quality control and this is enforced through process, which create barriers to contribution. We use process to add transparency that encourages participation which grows the code review pool which leads to better quality control.
- The process they've come up with is based on different types of sub-projects.
- We're trying to achieve a healthy project.
- They have contributors (anyone who creates or comments on an issue or PR), committers (contributors who have been given write access to the repo), and a technical committee (a group of committers representing the required technical expertise to resolve rare disputes).
- Contributors (not maintainers) can still help newbies, eg by helping them to write good issues. (with the help of documentation on how to write good bugs). They can also do other stuff such as adding metadata to issues and PRs
- Code of conduct to reduce hostility
- Every change needs to be a PR. Docs, code, binaries, etc. Everyone must use PRs. All PRs must be reviewed. PRs capture entire discussion and review of a change. Allow PRs to sit for some period of time to allow everyone to discuss it. Once all issues brought by committers are addressed it can be merged. If people don't object, then it can land. If a particular expert is required, they're mentioned, and the PR needs an LGTM from them.
- There's actually a tool called LGTM which helps manage pull request approval
- Lazy consensus on PR discussion: only if there is disagreement. Most PRs should be uncontroversial. If it is controversial (small minority), it gets escalated to the Technical Committee
- Any contributor who lands a non-trivial contribution becomes onboarded and becomes a committer. NB: in git, there isn't a lot that can't be fixed; not every committer has the rights to release or make high-level decisions – e.g., they can't push to master. Smaller changes can be reviewed by less technical contributors. The more technical contributors can spend their time on reviews only they can do.
- The key to scaling contribution growth is committer growth. Committers are expected to continue to open PRs (which are then approved by other committers).
- Technical committee discuss exceptional issues (consensus seeking – no open objections, rather than pure consensus) and in the final event that people don't agree (small minority), it comes down to a vote. The existence of a vote means people can't veto stuff. Node.js has never had to call for a vote. Resolution may involve returning the issue to the committers.
- Good to provide some additional guidance that helps contributors converge on a decision themselves. An empowered contributorship.
- "We need to view a constant need for intervention by a few people to make any and every tough decision as the biggest obstacle to healthy Open Source".
- Committers can nominate other committers to the TC at any time. TC uses its standard consensus seeking process to evaluate whether or not to add this new member. Members who don't participate as much as others are expected to resign.
- Building skills into the project to keep it healthy.
- "People don't need to know about every part of the project to be involved in making decisions – they will want to defer hard decisions to people they know have more experience."
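The landing rules summarized above can be read as a small decision procedure. The following is only a rough sketch of that reading, not Node.js tooling: the `PullRequest` fields, the `decide` function and the 48-hour minimum waiting period are all hypothetical names and values chosen for illustration (the notes only say a PR should sit "for some period of time").

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    WAIT = "wait"          # leave the PR open for further review or discussion
    LAND = "land"          # merge under lazy consensus
    ESCALATE = "escalate"  # hand the dispute to the technical committee


@dataclass
class PullRequest:
    hours_open: float
    open_objections: int = 0                             # unresolved committer concerns
    required_experts: set = field(default_factory=set)   # experts explicitly mentioned
    lgtm_from: set = field(default_factory=set)          # reviewers who approved
    deadlocked: bool = False                             # discussion failed to converge


def decide(pr: PullRequest, min_hours_open: float = 48.0) -> Outcome:
    """Apply the lazy-consensus reading; min_hours_open is an assumed value."""
    if pr.deadlocked:
        # Rare, controversial changes go to the technical committee.
        return Outcome.ESCALATE
    if pr.hours_open < min_hours_open:
        # Give everyone time to discuss the change.
        return Outcome.WAIT
    if pr.open_objections > 0:
        # Concerns raised by committers must be addressed first.
        return Outcome.WAIT
    if not pr.lgtm_from:
        # Every PR must be reviewed by someone.
        return Outcome.WAIT
    if not pr.required_experts <= pr.lgtm_from:
        # A mentioned expert has not yet given an LGTM.
        return Outcome.WAIT
    return Outcome.LAND
```

In this sketch silence counts as consent: a reviewed PR with no open objections lands once the waiting period has passed, and only a deadlocked discussion reaches the technical committee.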
https://github.com/ayojs/ayo/issues/2
- "working groups need either explicit authority or trust from whatever governance bodies are above it"
- "The main problem with Node's governance was lack of accountability imho"
- "I too say no WG for now. Let them pop up when there is a need." - six thumbs up. Apparently the technical committee chartered working groups and often retained more power than it should have
- They are suggesting BDFL as a replacement for technical committees
- Apparently Rust's governance model is similar to a set of teams with a clear separation of concerns and responsibilities and equal power, each operating in consensus-seeking decision-making manner with a leader who is final arbiter in lack of consensus.
- Apparently their working group model didn't work very well in the case of docs: "We were forced to keep our work within the code repo and within the existing tooling, which was very limiting for what we wanted to achieve. There was also ideas like code comment doc generation similar to Rust that were blocked immediately seemingly only because of style preference of a few TSC members. As most of you well know, documentation is a critical need of open source projects and a multi-faceted problem that requires many forms of content to truly cover the needs of the end user. Node.js has always had very poor docs and the power structures in place constantly got in the way of doing anything about that." Six thumbs up and some party horns and hearts
- One guy proposes making working groups largely self-governed sub-projects. Breaking down the project into more manageable chunks with smaller teams that are easier to coordinate. Interesting
- One comment points out that a "steering committee" should be cultural/strategic whereas the technical committee should be concerned with technical minutiae, so not determining eg the code of conduct
- Apache model is detailed, and although it's pretty vote-y, it has a board/subcommittee/committer structure:
- Board – determines broad policies about how to run things the "Apache way" and otherwise trust that projects will self regulate. Wholly independent.
- Project management committees – Govern projects. Incl. Roadmap, members (committers/maintainers), conduct within the project. Has a Chair who coordinates with the Board and has the same number of votes (one) as the other project members. NB that projects are supposed to work by consensus and only fall back on voting when this fails.
- Committers – per-project – have write access to the repo.
- If PMCs have insolvable problems, the board will get involved. As a first step this is to say "seems you have a problem, please tell us how you're going to fix it".
- Shane Curcuru (Apache Software Foundation director) says "it's more important IMO to get the basics of day-to-day project governance running (so you can show progress and do some actual work), and then define very clearly who the "board" is (to make it crystal clear that after those X people decide on new policies, they're done), and then work on the rest of the details of governance."
- Maximum PMC self governance and a completely independent board is important
- Teams of at least three people. People can volunteer or be nominated by a team member and nominations are voted for by the existing team members.
- Core team – decides general direction of project and cross concerns
- Moderation team – code of conduct and manage conflict. No core team members on the moderation team.
- Subteams focus on a particular area of the project with full autonomy including who gets permissions for their repo. Each sub team needs to have one member that is also a member of the core team.
- Elected final arbiters (kind of come from the BDFL idea) that serve to resolve issues where there is no consensus, elected on a per-issue basis. There should be multiple of these.
- Teams and EFAs follow a consensus seeking decision model: nobody has to object. If no consensus can be found, issue is postponed until the team reconvenes.
- One underlying idea behind teams is to define the set of people with whom consensus around an idea is meaningful
- Rust has an IRC culture.
- Also have a subreddit, website, stack overflow
- NB Rust is a much bigger project than ours and so has some governance features that wouldn't work/be too heavy for us. NB also that they (presumably) have full time Mozilla developers working on the project.
- Contributing.md is quite long. Some features are
- there's an RFC repo for change requests, submitted via issues
- They have optional bug templates
- Give some detail on the build system and how to use it
- They have a fork and pull development model too: https://help.github.com/articles/about-collaborative-development-models/
- Scripted static check that people can run before submitting a PR
- By default, a random PR reviewer assigned by bot. But you can request a specific reviewer too. Once approved, the automated tests will be run once it's at the top of the merge queue. The tests are pretty extensive and also include building external dependencies and testing themselves
- Documentation handled via PRs. NB most projects that I came across tend to have documentation done via PRs
- Issue triage.
- Contributing.md gives a rundown of what the labels mean. Prefixes for different types e.g. area of project, experience required, importance (changes to priority after triage meetings), platform, status (of pull requests).
- They have regular (weekly/biweekly according to the subject of the triage) meetings.
- Looks like they still change the way they do things, with people saying they find it hard to keep PRs under control, at least last year: https://internals.rust-lang.org/t/release-cycle-triage-proposal/3544/25
- RFC process
- There's an RFC repo where each RFC is submitted as a md document, discussed via PR, and subsequently merged (or not). i.e. there are loads of RFCs in the repo to date.
- Substantial changes go through a design process with sub-teams and community
- Provide a consistent and controlled path for new features to enter the language and libraries.
- Apparently they have been a major boon for improving design quality and fostering deep, productive discussion
- RFCs closed immediately if not viable or assigned a shepherd who keeps the discussion moving and ensures all concerns are responded to
- Final decision with the core team, sometimes after many rounds of consensus building. Note that decisions are most of the time about weighting different tradeoffs and it's the job of the core team to make the final decision
- Governance
- Core team, responsible for steering the design and development process, overseeing the introduction of new features, including final decisions on RFCs, and ultimately making decisions for which there is no consensus (this happens rarely).
- Subteams
- These were suggested in this RFC, and subsequently approved after discussion: https://github.com/nox/rust-rfcs/blob/master/text/1068-rust-governance.md
- Most of the review process takes place in a subteam. Each subteam is led by a member of the core team.
- Subteams 'shepherd' RFCs for a given area. This means
- Make sure stakeholders are aware of the change
- Tease out different design tradeoffs
- Build consensus
- They also set policy on which changes require RFCs, delegating reviewer rights for the subteam area
- Rights to approve PRs and merge to master branch can be granted as a stepping stone towards team membership. NB that RFCs for a subteam area should be discussed by all stakeholders. Earning r+ (merging) rights seems to be something which requires more than just one merged PR
- Subteams consist of at least one core team member who is the leader of the subteam; area experts; and stakeholders – e.g. heavy users. (last is crucial).
- Members of a subteam should have demonstrated significant engineering skill in that area, whereas leaders should have demonstrated exceptional skill.
- Leader is responsible for setting up the subteam (initial membership, determining policies incl addition/removal of members), communicating appropriately with the core team and making sure that RFCs are progressing, and making the final decision if no consensus can be achieved
- Core team spins up and shuts down subteams
- Proposed some initial subteams (incl tooling and infrastructure, moderation – CoC violation – no core team members, libraries...)
- Implementation of the subteam structure is described: a slow build-out from the current core team. "In particular, today core team members routinely seek input directly from other community members who would be likely subteam members; in some ways, this RFC just makes that process more official." This could work for us.
- Mozilla's module system was a partial inspiration – an evolution where subteam leaders are also in the core team. Subteam = module to avoid confusion
- Consensus-based decision making
- All sides have had their concerns addressed
- Realization that all design decisions carry a tradeoff
- Process:
- Initial RFC proposed
- Comments reveal additional factors
- RFC revised to address comments
- Repeat until major objections are addressed or there's a fundamental choice. i.e. until only minor objections – people don't feel a strong need to actively block the RFC.
- Then – when "steady state" reached – move into "final comment period" of 1 week. (announcement for this should be very visible). This is where the team weighs up the already-revealed tradeoffs against the project's priorities and values (set by the core team), without introducing new arguments.
- This is quite a heavy process, so it was proposed that subteams determine what requires an RFC process and what doesn't. The nightly/stable build arrangement they have allows for this.
- Consensus in the subteam is what is sought, at the very least.
- If no consensus can be achieved:
- Trivial stuff eg naming – subteam leader makes executive decision
- Deeper stuff eg fundamental design tradeoffs – subteam leader should consult with core team but is likewise empowered to make a decision
- Code of conduct includes
- people have differences of opinion and that every design or implementation choice carries a trade-off and numerous costs. There is seldom a right answer.
- "Shepherds"
- These drive consensus process.
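The RFC process noted above (proposal, rounds of comments and revisions, a final comment period, then a decision) can be read as a small state machine. The following is only a rough sketch of that reading: the state names, the `ALLOWED` table and the `advance` helper are hypothetical and not taken from Rust's tooling, and the exact set of permitted transitions is an assumption.

```python
from enum import Enum


class RfcState(Enum):
    # Hypothetical state names, chosen for illustration only.
    PROPOSED = "proposed"
    IN_DISCUSSION = "in discussion"
    FINAL_COMMENT_PERIOD = "final comment period"
    MERGED = "merged"
    CLOSED = "closed"


# Assumed transitions: a new RFC is either closed immediately (not viable)
# or assigned a shepherd and discussed; discussion repeats until a steady
# state is reached; the final comment period ends in a decision.
ALLOWED = {
    RfcState.PROPOSED: {RfcState.IN_DISCUSSION, RfcState.CLOSED},
    RfcState.IN_DISCUSSION: {RfcState.IN_DISCUSSION, RfcState.FINAL_COMMENT_PERIOD},
    RfcState.FINAL_COMMENT_PERIOD: {RfcState.MERGED, RfcState.CLOSED},
    RfcState.MERGED: set(),
    RfcState.CLOSED: set(),
}


def advance(current, target):
    """Return the target state if the move is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


state = RfcState.PROPOSED
state = advance(state, RfcState.IN_DISCUSSION)         # shepherd assigned
state = advance(state, RfcState.IN_DISCUSSION)         # another revision round
state = advance(state, RfcState.FINAL_COMMENT_PERIOD)  # steady state reached
state = advance(state, RfcState.MERGED)                # decision made
print(state.value)
```

A table like this mainly makes explicit that a decision only comes after a final comment period, which is the part of the process that would matter most if we adopted something lighter.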
Experiences from a Decade of TinyOS Development - Philip Levis
- fine-grained components are good for experimentation but add unnecessary and painful complexity to stable software that expects reuse (e.g., a kernel).
- the natural tendency to support longstanding, dedicated users and evolve a system to better meet their needs undermines system adoption
- Research wants to push a frontier, but doing so can alienate a broader audience and stifle long-term success.
- We lost sight of the fact that "code reuse" really means within a system, not necessarily across completely independent systems: a well designed and carefully implemented operating system is more helpful than an operating system toolkit or operating system software designed with reuse in mind
- Technical lessons
- Evolving language primitives and programming abstractions to push what are traditionally dynamic operations into static ones allowed it to have near-optimal RAM overhead while enabling large, complex and dependable software systems
- Static virtualization
- While the process of writing tutorials, API reference documents, and programming manuals is neither glamorous nor exciting, the presence of these materials reduced the long-term effort needed to support a large user community.
- "The very different timescales of startups and academia proved to be an irreconcilable tension"
- Lessons learnt:
- Component based design is good.
- Generalizations of component design are bad. We should have started with fine-grained components, then over time transitioned to more monolithic implementations as they stabilized.
- Get an initial group of users by promoting use internally among groups or researchers, or have a funding agency give grants to do work that involves the system.
- Focusing on experts and the research community exacerbated technical complexity
- Late industrial involvement (contribution) was good
- Diverse user documentation was good to keep down support effort. Tutorials, API/implementation references, and a programming manual.
- They over-modularized, which meant that it was difficult to understand for the first time. Very fine-grained, reusable software components are good for experimentation but a poor fit for operating systems because they add unnecessary complexity to stable software.
- Developing their own language increased the steepness of the learning curve, which made it focused on expert users (no good for the hobbyist community) and had implications for staffing – was difficult to hire people
- Static virtualization is good
- This is very similar to us
- They consisted entirely of volunteers, each having at least one other "real" job that competed for their time. How they dealt with this:
- Emphasize decentralized workspaces and asynchronous communication
- Email lists exclusively to communicate
- Minimal quorum voting system to resolve conflicts: gathered the minimum people necessary together and then voted.
- Members are elected to core team who can commit
- Largely freeform. All members of the Apache group (core plus some others) can commit code. People can propose fixes, take up work themselves, communicate through email, etc.
- All members generally review all changes
- Structure is core team (elected eventually because of disputes), committers (commit privileges granted when a nomination by an existing committer is approved by the core team), outside contributors
- Commit approval granting: "if you submit enough useful and correct PRs, eventually some committer will get sick of taking care of your work and will ask you if you want to be able to commit them yourself. This process serves multiple purposes; after all, the FreeBSD community is made up of people who do the work. For committers, the work consists of creating useful and correct patches. If you don't consistently and regularly create good patches, there's no point in giving you commit access, now is there?...By the time you've submitted several dozen PRs, you'll either work well with the FreeBSD team or everyone will understand that you and the team just can't get along. Direct-commit access is either an obvious next step, or an obviously bad move (Lucas 2002)."
- New committers are assigned a mentor
- Ad hoc teams
- These are areas such as marketing, etc – nontechnical
- Committers are assigned to these teams by the core team on a voluntary basis
- Hats
- Committers appointed by the core team to be responsible for some area of code. Guide development in that area and review submitted code, also tasks of internal admin
- Maintainers are related to a certain area of the codebase. Committer != maintainer. Maintainers are caretakers. Guidance, not ownership.
- Had an informal governance phase (those who hacked most became part of the core team) followed by a formal one. The change followed growing criticism that the core team was abusing its authority for its own interest, and then a flashpoint
- They released documents around 2000, after 7 years, with a view to imparting structure to what was until then a largely informal development process
- Elected core team model was implemented
- After democratic governance system was implemented, more systematization of processes, generally following conflicts
- A standard argument of organisation theory is that work coordination in a small group may well be informal, based on the mutual adjustment of group members. However, as the group gets larger, it becomes less able to coordinate informally.
- Whereas Linux supervised the work process of changes, FreeBSD instead standardized skills through training ("mentorship") of new committers, using tasks intended to familiarise them with the tools used by committers and the commit process. The committer can't integrate changes without the approval of their mentor until the committer is "released"
- No formal voting
- Rough consensus, with humming, and a session chair who is usually the arbiter of consensus, although mailing list consensus is more important, presumably because more people can have an input.
- Dissenting opinions are heard, but not controlling
- Self-selected individual participants
- No formal government role
- Bottom up
- working groups proposed by IETF participants to meet a perceived need
- negotiate a charter
- working groups evolve out of "birds of a feather" meetings which are more informal
- Market-based adoption
These are summaries of internet guides. Most significantly, there are summaries of some of the Github Open Source Guides, which are concise and very useful overviews with many links and other launchpads into further research. There are also blog posts and links which were high in Google rankings or backlinks from other useful pages. Your mileage may vary depending on how much these ideas resonate or whether you feel the background is similar to RIOT or the author has valid expertise; however, they might be seen as an ideas pool from which we can draw.
They are:
- How to contribute to open source
- Starting an open source project
- Finding users for your project
- Building welcoming communities
- Best practice for maintainers
- Leadership and governance – formal rules for making decisions
- Getting paid for open source work
- Your code of conduct
- Open source metrics
- The legal side of open source
The ones that were investigated are summarized below.
- Spread the word – this isn't compulsory though!
- Figure out your message – think about use for others. http://mozillascience.github.io/working-open-workshop/personas_pathways/ exercise can help
- Help people find and follow the project – home URL, website, Twitter, etc. They have examples of websites.
- Go where your project's audience is (online) – use online channels (Stack overflow, Reddit, Hackernews, Quora). Target to where our audience would be. Wider communities. Find people with a problem we solve. Ask for feedback. Try to finish the sentence: "I think my project would really help X, who are trying to do Y". Focus on helping others.
- Go where your project's audience is (offline) – meetups, conferences
- Build a reputation – incl by: helping newcomers, sharing resources, making contributions to others' work. Collaborations.
- Keep at it – focus on building relationships, rather than a magic bullet. Be patient.
- Setting your project up for success
- Make people feel welcome – contributor funnel (how people go from user->contributor->maintainer). Clear documentation (readme, contributing files). Being nice to people and encouraging them, being responsive, let people help how they want to help, graceful denial of PRs, letting people help out with easy stuff (documentation etc). Lots of casual contributors. Empower people to do work themselves, lessening work for us.
- Document everything – not just technical: roadmap, why decisions made, meeting minutes, early PR opening with WIP label.
- Be responsive – to feedback, issues, PRs. Quickly. Even if you can't review immediately, at least respond. A Mozilla study found that contributors who got a response within 48 hours had a much higher rate of repeat contribution. Also, look on Stack Overflow, Twitter, Reddit.
- Give the community a place to congregate – so people can get to know each other, and so people can help each other. Eg in Kubernetes, maintainers set aside newbie time every other week. Exceptions to public communication are security issues and sensitive code of conduct violations.
- Growing your community – some good tips in here.
- Don't tolerate bad eggs
- Make it easy to contribute with clear documentation and tasks for a range of ability levels (especially beginners).
- Share ownership of the project – give away easy bugs, start a CONTRIBUTORS or AUTHORS file, newsletter thanking contributors, give every contributor commit access, an Organization account on github.
- Resolving conflicts
- Maintainers' job to keep situations from escalating and always be cool-headed
- README can act as a constitution with goals, product vision, and roadmap, for reference during discussions
- Focus on allowing everyone their space to express themselves, including the silent majority. Emphasize a "consensus seeking" process rather than consensus. In other words, listening and discussion. Make sure everyone gets heard.
- Keep the conversation moving towards a resolution and focused on action. Refocus the conversation if it starts to unravel or people quibble about minor details. Can shut down the conversation if it's not going anywhere or the action has already been discussed/resolved elsewhere. Remember: this may benefit the silent majority. Guiding a thread towards usefulness is an art: you can't just shut people down, you need to guide them somewhere and give them a path to follow.
- Pick battles wisely – according to who's involved, context etc, the discussion/argument may simply not be worth it. Eg a few troublemakers or a recurring issue without a clear resolution.
- Identify a tiebreaker – this should be a last resort. This process should be identified in a GOVERNANCE file. Eg a small group of people who make a decision based on voting.
- NB: Community is the heart of open source.
"As a maintainer, your happiness is a non-negotiable requirement for the survival of any open source project."
- What does it mean to be a maintainer?
- Shift from coding to processes and community engagement
- Documenting your processes
- Even bullet points are ok
- Keep up-to-date
- Vision of project in README or VISION to keep it focused
- Communicate expectations – for example, how much time maintainers can spend, review and acceptance process, when to expect a response, the types of contributions that are needed
- Keep communication public – eg if maintainers meet privately, document the outcome publicly.
- Having a good process for issue triaging and code reviews can lower the number of open issues and pull requests
- Learning to say no
- Can refer to the project's criteria and documentation to depersonalize things
- Keep the conversation friendly – possible to handle negative responses politely and in a timely fashion. Thank them, explain why it doesn't fit and maybe suggestions for improvement, give a link to relevant documentation, and then close the request.
- No is temporary, yes is forever
- Be proactive – explain the process upfront to limit the amount of bad issues and PRs. Suggestions include: Issue/PR template/checklist, have to open an issue before submitting a PR, CONTRIBUTING.md file
- Embrace mentorship – if someone's really keen but needs a bit of polish.
- Leverage your community – basically, delegation
- Share the workload – offer responsibility and the opportunity to move up the tree, with a clear indication of what the path is to do this
- People forking your project is great if they want to do something we can't support
- Can offer APIs and customization hooks
- Bring in the robots – automation. Ideas:
- Automatic tests (explained in the CONTRIBUTING file)
- There are tools suggested to automate releases, mention potential PR reviewers, or automate code review. Eg Danger
- Issue/PR templates available on GitHub
- Style guides and linters
- Look at other similar projects and what they use
- Don't make standards too complicated as this can increase the barrier
- It's ok to hit pause – take a break if you're burning out
- Can find support to take over while you're gone
- It's important so you can come back refreshed with enthusiasm
- If you can't find support, take a break anyway and communicate so people aren't confused by lack of responsiveness
- Take care of yourself first – to stay happy, refreshed and productive
"Growing open source projects can benefit from formal rules for making decisions"
- Understanding governance for your growing project – how to incorporate contributors into the workflow
- What are examples of formal roles used in open source projects? - these can be narrow, or wide in scope. Guide suggests wide.
- Maintainer – could be people with commit access, or include people who evangelize or write documentation. Someone who feels responsibility over the direction of the project, in general.
- Contributor – someone who adds value in any way, from anyone with a merged pull request (narrow) to people who comment, triage issues or organize events.
- Committer – distinguishes commit access from other types of contribution.
- How do I formalize these leadership roles? - this helps people feel ownership and tells people who to look to for help.
- Could be adding people to README or CONTRIBUTORS text file
- Could be adding to team page on website
- Could have a core team of maintainers, or subcommittees. Allow people to volunteer for roles; allow it to be self-organizing
- Document how people can attain leadership roles or join a subcommittee in GOVERNANCE.md.
- Vossibility can help track who is making contributions. Document the info to avoid the perception that maintainers are a clique that makes decisions privately.
- Are we a GitHub Organization?
- When should I give someone commit access?
- Seems like a bad idea, but one guy says that giving every contributor commit access is an open-source lifehack and can massively improve engagement
- Can use protected branches to manage who can push to a particular branch and under which circumstances
- What are some of the common governance structures for open source projects?
- BDFL – benevolent dictator for life, eg Python
- Meritocracy – people who contribute are allowed to vote. All Apache projects
- Liberal contribution (us) – consensus seeking process rather than vote. Eg Rust and Node.js, whose policy is here: https://medium.com/the-node-js-collection/healthy-open-source-967fa8be7951
- Do I need governance docs when I launch my project?
- Easier to do once you've seen community dynamics play out – let them shape themselves
- What happens if corporate employees start submitting contributions?
- Best to judge all contributions (and contributors, in the context of their potential for positions) by their value only, rather than any background of the contributor
- Don't feel embarrassed about discussing a particular feature's merit on the basis of either its commercial or non-commercial use
- Do I need a legal entity to support my project?
- If you're handling money.
- For donations, you can set yourself up as a nonprofit or go through a foundation to accept donations on your behalf
- This activity can help a project become self sustainable
- Why measure anything?
- Metrics may or may not be important to you
- You can understand popularity or need for features, see where users/contributors come from and how to get more, and this can help to raise money through sponsorship and grants
- Discovery – are people finding the project?
- Github has tools where you can view how many people land on your project, where they go within the project, where they come from
- You can also do this from Google PageRank or your specific website
- Usage – are people using this project?
- On Github, you can see the clone graph to see how many times a project has been cloned
- Conversion rate (traffic:cloning ratio) can depend on where the traffic is coming from
- Can try and figure out what people are doing with it: forking and adding features? Using for science or industry?
- Retention – are people contributing back to the project?
- Retention requires an inflow of new contributors: there's a link to a model from the Mozilla foundation
- On Github, can view contributor count and commits per contributor (for default repo branch)
- Might want to look at some other metrics such as one-time contributors, number of open issues/PRs, previously opened issues/PRs, types of contributions, to gain introspection about what work is being done, where time is being spent etc
- Maintainer activity – are we responding to our community?
- Can maintainers handle the volume of contributions received?
- Unresponsive maintainers become a bottleneck and discourage people. Research from Mozilla suggests that maintainer responsiveness is a critical factor in encouraging repeat contributions – just a response, doesn't require action.
- Could measure the time it takes to move between stages in the contribution process. Avg time an issue remains open, whether issues get closed by PRs, whether stale issues get closed, avg time to merge a PR.
- Use metrics to learn about people – helps focus attention on the type of behaviour that will help the project thrive
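A couple of the metrics above (average time an issue stays open, and the traffic:cloning conversion rate) are simple enough to compute by hand from exported data. The following is a minimal sketch, assuming issue timestamps have already been exported from somewhere; the sample data and the function names `average_days_to_close` and `conversion_rate` are made up for illustration and are not part of any GitHub tool.

```python
from datetime import datetime
from statistics import mean

# Hypothetical issue records: (opened, closed). closed is None while still open.
issues = [
    (datetime(2017, 1, 2), datetime(2017, 1, 5)),
    (datetime(2017, 1, 10), datetime(2017, 1, 11)),
    (datetime(2017, 2, 1), None),
]


def average_days_to_close(records):
    """Mean number of days between opening and closing, ignoring open items."""
    durations = [
        (closed - opened).total_seconds() / 86400
        for opened, closed in records
        if closed is not None
    ]
    return mean(durations) if durations else None


def conversion_rate(unique_visitors, unique_cloners):
    """Share of visitors who went on to clone the repository."""
    return unique_cloners / unique_visitors if unique_visitors else 0.0


print(average_days_to_close(issues))
print(conversion_rate(200, 30))
```

The same `average_days_to_close` helper works for PRs if the pairs are (opened, merged) instead, which gives the average time to merge.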
Mostly overgeneralised IMO but some interesting points:
- Project checkup periodically to make sure project elements are balanced and functioning well. Breaks project elements into:
- Brain – governance and processes that guide development
- Heart – community and transparent communication
- Blood – consistent, strong contributions
- Skeleton – infrastructure, eg github and websites
- Issue triage
- Issue triage is like a garden – you need someone to pull weeds, and do it often and regularly.
- What are issues for? Bugs, features (could use mailing list), help questions (could use stack overflow)? Make a plan.
- The first trim will be big
- Set up email filters and notifications on github
- Have a grace period before marking an issue as stale (NB: you can re-open issues later)
- https://www.codetriage.com/ to monitor open source projects with loads of issues
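The grace period idea is easy to prototype before committing to a bot. The following is a minimal sketch, assuming we already have the date of last activity per issue; the issue numbers, the 30 day default and the `find_stale` name are all hypothetical.

```python
from datetime import date, timedelta


def find_stale(last_activity, today, grace_days=30):
    """Return issue numbers with no activity for more than grace_days."""
    cutoff = today - timedelta(days=grace_days)
    return sorted(num for num, last in last_activity.items() if last < cutoff)


# Hypothetical data: issue number -> date of last comment or update.
last_activity = {
    101: date(2017, 1, 1),
    102: date(2017, 3, 20),
    103: date(2017, 2, 1),
}

print(find_stale(last_activity, today=date(2017, 4, 1)))  # [101, 103]
```

Running something like this during the first big trim would give a candidate list to review by hand, rather than closing anything automatically.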
This is really for very large projects, we're not really here yet
- Ideas to handle feature requests:
- As issues, labelled as feature requests
- A separate page where users can submit and upvote feature requests
- Feature freeze
- Some maintainer problem root causes
- You're in the spotlight, both code and conversations
- PRs are difficult and time consuming to review – need to support new features, keep it being tested, generally consider the long term effect of a change
- Maintainers stop being product users
- Information imbalance: for a contributor, it's their most important (only) issue, whereas for a maintainer it's one of hundreds
- Some tips on improvement
- Improve error messages eg from Murdock and from RIOT itself, even maybe including instructions on how to fix the error
- Make it easy to find existing related issues (gh_inspector is one tool that can help)
- Use bots: eg to require people to include certain information, direct people to a relevant troubleshooting guide for common issues, keep issues from going stale (with auto closure), lock resolved and inactive issues, or tell people to fix the tests.
- Use rules for code changes. Danger: http://danger.systems/ can automate common code review chores
- md document can help guide decision making. People can submit PRs for this.
- Scaling contribution is really important – more and more people to commit code. Moya contributing guidelines is a good approach to this.
- Code of conduct v important
- Make it v easy for engineers to set up their dev environment
- Be friendly
- Keep code base as simple as possible
- Dynamic configuration to allow things to scale (where possible, for us!)
- Allow local extensions and plugins
- A marketplace for developers to share their integrations: like the RAPstore
- Scaling open source projects is hard
"Convenience and ego drive most open source adoption, but these shortsighted motivations raise long-term problems we need to clearly identify if we can ever hope to solve them. This presentation is just a first step in that direction."
Gives a template for contributor guidelines
- Gives a few different ways to close a PR (gracefully)
- Ego stroke and just close
- Close early when you know it's not right
- Close it with recommendations for redesign
- Maintainer adds something, if the contributor disappeared or if it's just easier than the committer doing it themselves. Keep the original commits so the right people get credit.
- Eg the IETF rough consensus model. The "sense of the group" is determined by a chairperson. Eg humming rather than handraising.
Inside look into open source at Google
- Projects are developed completely in the open with little internal-only discussion and planning
- Google products go through Dogfooding trials – eat your own dog food, I guess
- Google has published their policies and documentation. Might be interesting
- One of the biggest challenges ends up being community management
- Create an environment where people feel willing to participate. Also, be prepared to handle conflict. Grace can be extended when people mess up, and don't be afraid to pull out the weeds when they're choking the flowers.
Guy who wrote urllib3
- Define what success looks like
- Have a great README, which answers these questions: who else uses it, for what, and where can I get more help
- Is success stars on Github, or being used every day by loads of people? Is it number of contributions, or pleasure? There's a paper on this, which I've bookmarked
- Recruit core contributors
- Onboarding
- Be very inclusive
- Market and promote your product
- Blog posts
- Answer questions on Stack Overflow
- Participate in discussions on Hacker News etc
- Partnerships with other open source projects
- Response and engagements to comments
- Profit?
- Docker didn't over-focus on the business model until the project was really popular
- Summary
- Lots of work. Specifically, the work he put in was:
- Focus on a small, but common challenge
- Write a great readme
- reach out to people to contribute and companies who may benefit from our work
- Promote project through blog posts and social media / wider activity
This guy did a framework for iOS development.
Four stages:
- Stage 1: Putting source code on GitHub
- Stage 2: Developers start using your software
- Stage 3: Project is popular and the go-to solution in its field
- Stage 4: Hyper-scale open source projects
Tips and things he did:
- Read more about StoreKit and ensure my implementation was in line with Apple's guidelines
- Get organised and respond to issues and PRs in a more timely manner
- Avoid feature creep and learn when to say no
- Add unit tests
- Standardise and streamline contributions – both mine and from other users
- Rewritten the purchase flows entirely with full unit test coverage
- Added a Travis CI job to run the build and the tests
- Added an ISSUE_TEMPLATE.md file to encourage reporters to include all relevant details. Big win
- Added a CONTRIBUTING.md file outlining the scope of the project, pull requests process, list of missing features
- Added SwiftLint to the project to standardise project contributions and make them easier to review and accept. Go check it out!
- Adopted gitflow. This introduces a development branch where new feature work can be merged and tested before going to master
- Use feature branches by default for my own contributions. Makes it easier to track features and reference relevant issues
- Periodically spend a few hours reviewing all the issues and closing inactive ones
- Publish milestones to give more visibility on priority and timeframes for planned future work
Same guy, different post.
My recipe:
- Choose the right project – solve a problem you have
- Make it easy to use
- Write the best README you can:
Your README is the landing page of your project. You should spend a lot of time on it. It should look good! If you're building a UI control, include an animated gif, screenshots, or even a link to a prototype. Swift Messages does this really well. It should include badges so you can see at a glance the state of the project. A lot of projects use shields.io. So should you.
The README should highlight:
- What is the feature set and how to use the project by clearly documenting the API.
- How to install it. Note: Do support as many dependency management tools as possible, not just Cocoapods.
- Supported platforms. You guessed it, as many as you can.
- Supported languages, with links to the appropriate branches or tags for the various language versions.
- List of known issues (optional). This can serve as a summary of the current issues in the project.
Additionally, you can add a FAQ section, references to related projects and reading resources, a list of credits, and the license.
Very important: if multiple issues are opened to ask for usage clarifications, your README is not good enough. Answer the questions and improve accordingly.
- Include a sample project:
Providing a sample demo app goes a long way in helping other developers use your project.
If your project becomes popular, you will see the issues tab filling up with questions. Having a great README and sample project helps keep the number of open questions under control.
- Share in the right places – websites and github repos that aggregate open source projects
- GitHub Trending – see post below about how to get loads of stars and start trending
- Google Search – i.e. SEO
- Keep growing – support the people asking a lot of questions and opening pull requests. Maintainer tips:
- Carefully evaluate feature requests – keep code clean and features focused
- Don't be afraid to request changes to useful pull requests where appropriate.
- Don't be afraid to reject pull requests that are out of scope or already covered.
- Always be courteous and polite, especially when rejecting pull requests.
- You have become a steward, so aim for good stewardship.
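The README checklist earlier in this recipe (feature set and usage, installation, supported platforms, supported languages) lends itself to a quick automated check. The sketch below is an illustration only and does not come from the post: the `REQUIRED_SECTIONS` keyword patterns and the `missing_readme_sections` helper are assumptions, and matching on heading keywords is a crude heuristic.

```python
import re

# Checklist items come from the notes above; the keyword patterns used to
# recognise a matching heading are assumptions made for this sketch.
REQUIRED_SECTIONS = {
    "usage": r"usage|features|api",
    "installation": r"install",
    "platforms": r"platform",
    "languages": r"language",
}


def missing_readme_sections(readme_text):
    """Return the checklist items that have no matching Markdown heading."""
    headings = [
        match.group(1).lower()
        for match in re.finditer(r"^#+[ \t]*(.+)$", readme_text, re.MULTILINE)
    ]
    return [
        name
        for name, pattern in REQUIRED_SECTIONS.items()
        if not any(re.search(pattern, heading) for heading in headings)
    ]


# Hypothetical README with no section on supported languages.
sample = """# MyLibrary
## Features
## Installation
## Supported platforms
"""

print(missing_readme_sections(sample))  # ['languages']
```

A check like this only shows that a heading exists, not that the section is any good. The signal the post gives still applies: repeated usage questions in the issues mean the README needs work.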
Some other tips:
- Add a CONTRIBUTING.md file explaining how to open issues and PRs
- Add a code of conduct
- Increase unit test coverage
- Use code linting
He has a 'secret recipe' for getting a repo to trend, and says he has succeeded several times with different repos by following it:
- Projects are Everything – make something that is useful and solves a good problem
- Read and Research – make sure that problem hasn't been solved elsewhere
- Building the Repo – clean up the interface and make it usable
- THE README – should be beautifully designed; this is really important. Spend as much time writing it as coding. He has an example. The aim is that people can glance at it and star the repo. Use graphics, gifs, etc.
- THE GRAPHIC – at the top of the readme. He has an example. Get design-y.
- THE FEEDBACK LOOP – post on sites and get feedback