Learn everything about our new Bitergia Research branch!

Share this Post

Table of Contents

Bitergia team at FOSDEM
Bitergia team at FOSDEM

As we were telling you a few weeks ago, we were joining FOSDEM, not only as attendees but as speakers.

We would like to share our experience, giving you details about our talks…

1. How Rust in being developed.

Jesús González-Barahona gave a talk about how Mozilla Rust is developed by the community.

He presented the dashboard we have produced for Mozilla, and showed how to use it to learn about the details of Rust development processes and community. He also presented some interesting data obtained from it:

  • Affiliation. 20% of the contributions are from Mozilla community, but there are people from other communities, such as Microsoft, Red Hat or Mongo, not only developing Rust but contributing answering question on Stack Overflow.  The most of the contributions provides from non affiliated people.
  • Git gives metrics about the commits made for last two years and shows how Mozilla community did more than the 24% of them. For that case, non affiliated people provided only the 60%. The dashboard shows clearly how the number of commits has going down in the last October, having the lowest point currently.
  • Stack Overflow gives data about questions and answers submitted by users and community. The dashboard shows not only the activity rates, but the most involved people and the reputation for each. That is a good hint about the adoption of the product.
  • The dashboard shows also the analysis of the mail threats, giving the user a trust score.
Rush community activity dashboard
Rush open source metrics

Jesús gave some details about how to filter, drill down and compare. Here you have the dashboard to have your own conclusions and the recording of the session.

2. How LibreOffice is being developed

Jesús was explaining LibreOffice Grimoire Lab dashboard deployed by The Document Foundation with some Bitergia support.

For this project, they have used tree data sources: Git, Gerrit and Bugzilla.
We can see easy and quickly some data about the project such as commits, authors, changesets and repositories.

You are able to notice just with a quick overview that:

  • There is a high peak of commits in 2015.
  • You can see that, although the number of authors has been regular over the time, currently it’s lower than ever.
  • Despite of that, the activity around the changesets is higher than ever in 2017.

About the authority, you can check the affiliation of the contributions. We take the affiliation from the mail domain declared by the autor, asuming that


are non-affilieted. Given that:

  • We can see that the organization that is making the biggest investment on this project is Red Hat, followed by Collabora.
  • The majority of the contributions are made anonymously
The Document Foundation development dashboard
General data of the code contributions for LibreOffice

You can see more details of the talk on the deck, or with the video. Or get your own conclusions going deeply in the dashboard.

3. Data Science for Community Managers

Manrique had another talk on Fosdem about data for communities managers.

Manrique during his talk
Manrique during his talk

He started his talk reflecting around the concept of community, giving this definition:

“A community is commonly considered[by whom?] a social unit (a group of people) who have something in common, such as norms, values, or identity.” Wikipedia

Manrique described how communities of software are made by people sharing common interestings and how they are organized using several tools like Git, GitHub, Mediawiki or Star Overflow, between others.

Communities are organized around three concepts:

Self awareness
Communities have their own structure and every person knows their role.

Diagram about the anatomy of a community
The anatomy of a community

Establishment of policies, and continuous monitoring of their proper implementation, by the members of the governing body of an organization. It includes the mechanisms required to balance the powers of the members (with the associated accountability), and their primary duty of enhancing the prosperity and viability of the organization. Business dictionary

Transparency is the key of the relationships on the community, because it gives the sense of fairness to everyone and for third parties, it gives trust.

Some communities reach the need to have a community manager. Its function is to ensure community health, productivity and visibility.

With that context, data would be a powerful tool for making decisions about efforts or investment, but the bloody truth is that they have no data available and structured because code and collaborations sources are in silos and bring them together is hard.

Grimoire Lab was born just for that, to allow a deep knowledgement of the activity on an open source and inner source project.

Grimoire Lab is a tool to provide a free, open source software platform for:

  • Automatic and incremental data gathering from almost any tool (data source) related with contributing to Open Source development (source code management, issue tracking systems, forums, etc.)
  • Automatic gathered data enrichment, merging duplicated identities, adding additional information about contributors affiliation, calculation delays, geographical data, etc.
  • Data visualization, allowing filtering by time range, project, repository, contributor, etc.
Grimoire architecture
Grimoire Lab is a free, open source tool for development data analytics

Some of the features that Grimoire provides are:

  • Graphical drill down
  • Time frame selection
  • Easy sharing/embedding
  • Data export
  • Query API (Elasticsearch)
  • Custom widgets and panels creation
  • Easy data validation
  • Link to real artifacts
  • Search box

Moreover, you can get some extra metrics using Grimoire Labs, like:

  • Network.

    Kibana network analysis example
    Kibana network analysis example
  • Dependency, as a way of knowing who your project depends on, analyzing:
    • Onion model
    • ASF Pony Factor
    • Bitergia Elephant factor
    • Bitergia Zapata factor
    • United Fruit Company factor

    We’ll write another post telling you what are those variables.

  • Geo-analysis based even in time zones to determine the location of the contributions.
  • Gender diversity
  • Community demography
  • Code Review process analytics
  • Issues and backlog analytics
  • Contributors funnel

Manrique finished his talk, presenting the beta version of Cauldron, a tool to analyze activity in GitHub organizations.

You can check more details on his deck or with the recording of the session.

Don’t miss the opportunity for measuring your open source project. We have an entire training tutorial where you can start to learn how to know better your community.

4. Grimoire Lab: a Python toolkit for software development analytics

The talk explained how to analyze software development repositories of common use in the free software community with GrimoireLab tools, a toolset for software development analytics writing in Python. The talk started by explaining how to retrieve data from git, Bugzilla, GitHub, mailing lists, StackOverflow, Gerrit, and many other repositories, and organizing them in a database.

The architecture of Grimoire is quite simple from the point of view of the data flow:

Grimoire architecture
Grimoire architecture

Usually, data flow starts in the code repositories (git, mailing lists, GitHub). Perceval goes to them and extracts data that you can upload to, for example, Elasticseach (generating a raw index). Perceval is totally database agnostic; it only takes the data and produces a set of JSON files.

Then, for producing valuable indexes, we use GrimoireELK; Grimoire ELK goes to raw index and enrich this information doing things like calculating how long did it take to close a ticket. With that information we produce again another ElasticSearch index that we call enriched indexes . Those indexes are designed to be used with Kibana for visualizations. Our version of Kibana is called Kibiter.

Apart of that we have Sorting Hat for identities enrichment; it can do things like affiliations for people or check if two mail address belong to the same person.

Grimoirelab components
Grimoirelab components

Jesús continued his talk going to practical exercises that you can find on the session recording. You can check his presentation too and of course the more detailed tutorial.

Our booth to meet friends

Besides the talks we meet friends and community at our boot. And give away some presents!

Bitergia Merchandaising for FOSDEM
Presents for our friends

It was our pleasure to have the chance to share tips and trick with developers.

See you next year!

Our Boot at FOSDEM
First hours at FOSDEM…our owls were coming
Bitergia Boot FOSDEM
Our booth ready to rock!
Last day Bitergia's boot at FOSDEM
Last day at FOSDEM…all owls have flown away!
Emilio Galeano Gryciuk

Emilio Galeano Gryciuk

Marketing Specialist at Bitergia

More To Explore

woman smiling shaking hands
Open Source

Bitergia’s Insights into ASF Community Diversity and Inclusion

We proudly present and highlight a comprehensive report from the Apache Software Foundation (ASF) on diversity and inclusion within its community. The report resulted from a project between Bitergia, Oregon State University, and the ASF. We thank Google for sponsoring this research work.

Do You Want To Start
Your Metrics Journey?

drop us a line and Start with a Free Demo!