Sunday, June 13, 2010

Day 10 Readings

Today's reading looks at socialization in open source software (OSS) projects, treating them as dynamic networks and examining the micro- and macro-level sociological issues surrounding them. The reading is titled:
In the paper, the author argues that while open source software projects have been studied for a long time, most of those studies fall into one of two camps. In the first camp, researchers consider the sociological aspects of participating in open source software projects, such as what incentives people have to participate in and contribute to OSS projects. In the second camp, researchers are interested solely in the technical implications of OSS projects, such as the defect density of OSS projects versus commercial software projects. The author's work looks at both the sociological and the technical aspects to form a more nuanced understanding of the dynamics of OSS projects.

Beyond studying the socio-technical aspects of OSS projects, he also brings several new viewpoints to research in the field. First, he considers these aspects from an ethnographic perspective, studying how people work at both the micro and macro levels. For instance, while a lot of research has looked at the aggregate statistics of an OSS project, few studies have examined how individual entities evolve, communicate, and belong in an open source community. Second, he considers the evolution of one open source community, the Python project, dynamically and along several dimensions (such as evolution through time). This is in stark contrast to prior work, which has simply looked at fixed, or "static," moments in time (for instance, the CVS snapshot ending on a certain day). Finally, and most importantly, he treats an open source project as a collection of networks, as defined by Bruno Latour's work on actor-network theory. This differs from the conventional research view, in which an open source community is simply a set of individuals who can be classified and placed into one of the many layers of people and resources that make up an OSS project.

The author goes on to discuss software that he created to aid his ethnographic research on the Python project. The software is designed to expose the hybrid networks that form in an open source project: links between people, links between people and artifacts, and links to the other entities that exist in an OSS project. It is also designed to give the ethnographer a dynamic perspective on activity in an OSS project, so that observations and information can be tracked over time. Since studying OSS projects leads to information overload and makes trends difficult to analyze, the system allows data to be aggregated while preserving the underlying referential data. Finally, the system supports not only the study of macro phenomena but also the tracking of an individual participant in the project. The paper then demonstrates the software in action on the Python project, noting how its capabilities allow a more in-depth study of OSS projects to be conducted.
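To make the idea of a hybrid, time-stamped network a little more concrete, here is a minimal sketch in Python. It is not the author's software; the `Link` record, the `links_per_month` and `trajectory` helpers, and the sample observations are all hypothetical names and made-up data that I am assuming for illustration. It only shows how links between people and artifacts could be aggregated by month without losing the pointer back to the original message or commit, and how one participant could be followed over time.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Link:
    """One observed connection, with a pointer back to its source record."""
    when: date
    source: str    # a person
    target: str    # a person or an artifact (file, bug, thread)
    evidence: str  # hypothetical id of the message or commit observed


def links_per_month(links):
    """Group links by (year, month) while keeping the underlying records."""
    buckets = defaultdict(list)
    for link in links:
        buckets[(link.when.year, link.when.month)].append(link)
    return dict(buckets)


def trajectory(links, participant):
    """Return, in time order, every link that involves one participant."""
    involved = [l for l in links if participant in (l.source, l.target)]
    return sorted(involved, key=lambda l: l.when)


# Made-up observations, for illustration only.
observations = [
    Link(date(2002, 1, 5), "alice", "bob", "mail-001"),
    Link(date(2002, 1, 20), "alice", "parser.c", "commit-017"),
    Link(date(2002, 3, 2), "bob", "bug-42", "mail-030"),
]

monthly = links_per_month(observations)
alice_path = trajectory(observations, "alice")
```

Here `monthly` gives the aggregate view (how many links appeared in each month), while each bucket still holds the individual `Link` records, so the evidence behind any count can be looked up again.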

The author goes on to explore the Python project from a socio-technical perspective, applying ethnographic techniques as well as the software to analyze this particular project. Python is a fairly old, large, mature and successful OSS project, which he claims represents one end of the OSS project spectrum. The data the study considered included the archive of the Python developers' mailing list and the project's CVS source tree for the year 2002. We have seen in previous days' readings that these are common data sources for OSS researchers. From this data the study was able to confirm the findings of similar studies on the technical side of OSS projects (for instance, that a small number of developers account for the largest proportion of contributions). What was interesting about the paper, however, was that the author sampled a small portion of the Python community to better understand OSS projects at the micro level.
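As a rough illustration of the "small number of developers, large share of contributions" finding, the sketch below computes what fraction of commits comes from the top few authors. The function name `top_share` and the list of authors are my own assumptions with made-up data; the paper's actual figures and method are not reproduced here.

```python
from collections import Counter


def top_share(commit_authors, top_n):
    """Fraction of all commits made by the top_n most active authors."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(top_n))
    return top / total


# Made-up commit log: one entry per commit, naming its author.
authors = ["a"] * 6 + ["b"] * 2 + ["c", "d"]

share_top_one = top_share(authors, 1)
share_top_two = top_share(authors, 2)
```

With a real CVS log, the list of authors would be extracted from the commit history for the period of interest (2002 in the paper's case).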

By my interpretation, the paper's most important discovery is that open source communities rely on networks of trust. Since the members of an open source community are in essence strangers from around the world, trust is an important element in determining what rights and privileges (such as CVS commit access) a particular member should have over time. The paper argues that members gain trust not only through participation, but also by building an identity and reputation (technical or otherwise), through political maneuvering (such as building allies), and through how they behave in the context of the community. Furthermore, this trust cannot be gained instantaneously; it is built over many months or perhaps years. The paper goes on to argue that an open source project has several well-defined routes, or trajectories, that a newcomer may follow, and that these lead to different outcomes. Those who do not follow such trajectories, or who assume too quickly that they have gained the community's trust, are unlikely to advance very far in the community.

The author tells the story of two individuals whom he followed.

The first individual, Fred, began his relationship with the Python community by asking the Python developers questions about the language. Through these conversations Fred began to build connections with other Python contributors and to uncover the structure of the network he might join. He then moved on to reporting bugs, and later to attaching potential fixes to those bugs, which earned him recognition from the community as a trusted bug fixer. Finally, he suggested that Python include his library in the project.

The other member who was studied, David, began engaging the community by posting a simple bug report to the mailing list, without any suggestion as to how to fix it. One week later, he asked the community to consider a module he had created for the next release of Python, and the request was promptly rejected by the core members of the community. Clearly, David did not understand the network and the implicit rules that govern this community. He pushed too hard and too fast for rights and privileges when the community was not yet in a position to entrust them to him.

Several other important discoveries were made about OSS projects and communities:
  • Contributors are part of a network connecting them to different entities (people or otherwise), and these connections can strengthen or weaken over time
  • Contributors to OSS projects are rarely (if ever) coached; they bring their own technical expertise to the table
  • Correcting bugs and suggesting improvements are areas where conflict can arise in an OSS project, and such conflicts can become critical events in the project's history
  • While the open source community likes to claim that its process is entirely transparent, in fact it is not: processes and discussions are often hidden from the outside observer
  • Networks are designed to minimize the effects of radical changes, so they must be weakened in some respects when important changes need to be pushed through

Finally, the paper goes on to discuss the implications of the software that was developed for the paper (OSS Browser). This software could help people joining an open source effort to better understand the networks and dynamics of the project. This is a skill that many potential contributors lack (see the case of David), but it is essential for beginning to build the trust of the community. In addition, the tool could help these newcomers navigate the political and socio-technical dynamics that are inherent in any organization. These two capabilities, together with the others already mentioned, could give project maintainers the opportunity to better help newcomers understand and assimilate into the project's culture. Currently, the most common ways for a newcomer to understand an open source project are also the most primitive, such as electronic mail and newsgroup clients. In addition, by making the structure and dynamics of the network readily available to members of the community, the tool lets those members determine who their allies are more accurately than ever before. This in turn could improve the evaluation, debate and integration process for additions to an open source project, or for resolutions to bugs that appear in OSS.

Creative Commons License
CS889 Readings by James Simpson is licensed under a Creative Commons Attribution-Noncommercial 2.5 Canada License.
Based on a work at j2simpso.blogspot.com.
Permissions beyond the scope of this license may be available at j2simpso at uwaterloo dot ca.