[openstack-dev] Grizzly's out - let the numbers begin...
Vishvananda Ishaya
vishvananda at gmail.com
Wed Apr 10 01:42:56 UTC 2013
On Apr 9, 2013, at 3:02 PM, Daniel Izquierdo <dizquierdo at bitergia.com> wrote:
> Hi Vish,
>
> On 04/09/2013 05:47 PM, Vishvananda Ishaya wrote:
>> Any insight into how the number of closed tickets was generated? I just checked my stats for Grizzly and noticed that I fixed 38 bugs in nova alone [1]. Counting all projects (it is a little harder to find results for the python-*clients since we don't have coordinated releases) it is somewhere around 45, yet our company has only 34 listed for the entire release. There is clearly some error in the ticket-closing statistics.
>
> We already had a small discussion with other OpenStack members about how to measure activity. When we talk about tickets, what we are measuring are those whose final status is "Fix Committed".
>
> At the beginning, we considered other statuses such as "Fix Released" or "Won't Fix" as "final" ones; however, under that definition Thierry appeared to have an incredibly high number of "closed" tickets. So, what you can see (as specified in [1]) is the number of closed bugs in the "Fix Committed" status (only that).
The act of "closing" a bug, as in changing its status, is relatively meaningless. Jenkins sets Fix Committed in most cases, and the release manager or Jenkins sets Fix Released. It makes a lot more sense to use the assignee of the bug: that is the person who did the actual work.
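A minimal sketch of that counting rule, using made-up bug-task records (the field names are illustrative, not the actual Launchpad API): close events are credited to the assignee, not to whichever account flipped the status.

```python
from collections import Counter

# Hypothetical bug-task records. In real data, the status is often
# flipped by jenkins or the release manager, while "assignee" is the
# developer who actually fixed the bug.
bug_tasks = [
    {"id": 1, "status": "Fix Committed", "closed_by": "jenkins", "assignee": "vish"},
    {"id": 2, "status": "Fix Released",  "closed_by": "jenkins", "assignee": "vish"},
    {"id": 3, "status": "Fix Committed", "closed_by": "ttx",     "assignee": "daniel"},
    {"id": 4, "status": "Won't Fix",     "closed_by": "ttx",     "assignee": None},
]

CLOSED = {"Fix Committed", "Fix Released"}

# Credit the assignee, not the account that changed the status.
fixed_per_dev = Counter(
    t["assignee"] for t in bug_tasks
    if t["status"] in CLOSED and t["assignee"]
)
print(fixed_per_dev)  # Counter({'vish': 2, 'daniel': 1})
```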
Vish
>
> We know that this is incomplete data, and the picture of real activity should be complemented by adding information from Gerrit (work in progress so far [2] in the toolset we're developing and using).
>
> That said, we could also add extra final statuses, as you can see in the example at line 59 of [3]. Thus, I would say that what you can see in the charts (regarding this issue) is only closing activity in Launchpad, and nothing else.
>
> The question now should be to check what Launchpad understands as "closed" issues, and whether this is what you would like to measure.
>
> Hope this clarifies the method!
>
> Regards,
> Daniel.
>
>
> [1] http://bitergia.com/public/reports/openstack/2013_04_grizzly/notes.html#note:charts:summary:closed
> [2] https://github.com/MetricsGrimoire/Bicho/blob/master/Bicho/backends/gerrit.py
> [3] https://github.com/VizGrimoire/VizGrimoireR/blob/newperiod/examples/vizGrimoireJS/its-analysis.R
>
>
>> Vish
>>
>> [1] https://launchpad.net/nova/grizzly/2013.1
>>
>> On Apr 8, 2013, at 10:30 AM, Daniel Izquierdo <dizquierdo at bitergia.com> wrote:
>>
>>> Hi Michael,
>>>
>>> On 04/08/2013 05:36 PM, Michael Basnight wrote:
>>>> what would account for Nebula's lines of code being ~51M? did they sneak a Java project in? ;)
>>> this is a really big (big big) bug, I'm afraid... it does not make sense to have 50 million lines added in only 6 months (well, this may make sense at some point) when the project is only around 680k/750k lines according to SLOCCount [1] or Cloc [2], respectively.
>>>
>>> Tables were just updated to real data.
>>>
>>> Sorry for the noise again...
>>>
>>>
>>> [1] http://www.dwheeler.com/sloccount/
>>> [2] http://cloc.sourceforge.net/
>>>
>>>> On Apr 8, 2013, at 7:16 AM, Daniel Izquierdo wrote:
>>>>
>>>>> Hi again,
>>>>>
>>>>> On 04/05/2013 09:18 PM, Daniel Izquierdo wrote:
>>>>>> Hi Eric,
>>>>>>
>>>>>> On 04/05/2013 08:31 PM, Eric Windisch wrote:
>>>>>>> On Friday, April 5, 2013 at 14:17, Stefano Maffulli wrote:
>>>>>>>
>>>>>>>> Let me pull in the authors of the study, as they may be able to shed
>>>>>>>> some light on the inconsistencies you found.
>>>>>>>>
>>>>>>>> Eric, Joshua: can you please send Daniel and Jesus more details so they
>>>>>>>> can look into them?
>>>>>>> I made a note on the blog. The response to others indicates that their results are based on two different methodologies (git-dm and their own dataset analysis); this is likely the source of the differences in numbers. I haven't noticed variations anywhere except author counts, but I haven't looked very hard, either.
>>>>> Charts are now updated [1]. The bug, as mentioned, was in the queries: we were counting people without taking unique identities into account (in some cases a developer may use more than one identity to commit changes to the source code).
>>>>>
>>>>> Thus, charts and tables are now updated (and should contain consistent data). You will notice a small increase in Rackspace's commits and a small decrease in Canonical's. These are due to a developer who was initially wrongly assigned to Canonical for the whole history.
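For what it's worth, the identity merging described above can be sketched as follows (the alias table and email addresses are invented for illustration): each email is mapped to one canonical identity before counting distinct authors.

```python
# Map each known email alias to a canonical identity. Without this
# step, one developer with two addresses is counted as two people.
aliases = {
    "jdoe@rackspace.com":   "John Doe",
    "john@example.org":     "John Doe",
    "asmith@canonical.com": "Alice Smith",
}

# Author emails as they appear in the commit log.
commits = [
    "jdoe@rackspace.com",
    "john@example.org",      # same person, second identity
    "asmith@canonical.com",
]

# Naive count: 3 distinct emails look like 3 developers.
raw_authors = set(commits)
# Merged count: the real number of people.
merged_authors = {aliases.get(email, email) for email in commits}

print(len(raw_authors), len(merged_authors))  # 3 2
```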
>>>>>
>>>>> Again, our idea is to help you better understand how OpenStack is being developed and who the main actors involved in it are, so any feedback is more than welcome. Indeed, our intention for the next steps is to provide information closer to developers, so it can be useful in the software development process. We have our own ideas, but you probably have better ones.
>>>>>
>>>>> Thanks a lot for your comments and see you in Portland!
>>>>> Daniel.
>>>>>
>>>>> [1] http://bitergia.com/public/reports/openstack/2013_04_grizzly/
>>>>>
>>>>>
>>>>>> The methodology we have used to match developers and affiliations is based on information partially obtained from the OpenStack gitdm project, but also compared against our own dataset (which we already had from previous releases). Sorry if I didn't explain myself clearly in the blog.
>>>>>>
>>>>>> The bug here is related to how we calculate data for the spreadsheets, company by company. The result is that the company-by-company analysis was counting more developers and commits than expected (for instance, we were counting a developer who used two different email addresses at some point as two different people).
>>>>>>
>>>>>> So, the data in the tables (bottom part of the main page) is correct. The data for the source code management system on the left side of each company's section is overestimated.
>>>>>>
>>>>>> In addition, the number of commits for Rackspace will be a bit higher in the next round. Another developer told us that he moved from another company to Rackspace at some point, so you will see that number increase a bit.
>>>>>>
>>>>>>> I guess it could also be differences or errors in employee->company mappings? Perhaps instead, one methodology includes those that report bugs, while the other only accounts for git? I'm not sure.
>>>>>> Regarding this point, the data about the bug tracking system and mailing lists is based only on activity from developers. This means that people who have not committed a change to the source code are not counted as part of a company's activity in Launchpad and the mailing lists. As an example, we are covering around 60% of the activity on the mailing lists, because that is how active the people who at some point submitted changes to Git are there.
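A toy sketch of that filtering, with invented names and post counts: only posts by people who also appear in the git author list count toward the coverage figure.

```python
# People who committed to git at least once (invented names).
committers = {"vish", "daniel", "eric"}

# Authors of individual mailing-list posts (invented data).
posts = ["vish", "eric", "stef", "vish", "daniel",
         "anne", "eric", "vish", "stef", "jim"]

# Keep only posts written by someone who is also a git committer.
covered = [p for p in posts if p in committers]
coverage = len(covered) / len(posts)

print(f"{coverage:.0%}")  # 60%
```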
>>>>>>
>>>>>> Our purpose with this is to show only activity by developers and their affiliations across the three data sources (git, tickets and mailing lists). This is just one option. From our point of view this analysis was pretty interesting, but perhaps for others it is not good enough.
>>>>>>
>>>>>>> Other things, like dividing commits by authors, just seem to be the wrong methodology; a median would be more appropriate and harder to game.
>>>>>>>
>>>>>> This is a good point. As you mention, it is probably fairer to have such a metric. At some point we would like to show some boxplots and other metrics to better understand the distribution of the datasets, but we had to choose something. In any case, we will take this into account for the next reports for sure. Thanks!
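A quick numeric illustration of Eric's point (commit counts invented): one prolific account inflates the mean commits-per-author figure, while the median barely moves and is therefore much harder to game.

```python
import statistics

# Hypothetical commit counts per author at one company: one prolific
# committer dominates the distribution.
commits_per_author = [1, 2, 2, 3, 4, 120]

mean = sum(commits_per_author) / len(commits_per_author)
median = statistics.median(commits_per_author)

# The outlier drags the mean far above what a typical author does.
print(mean, median)  # 22.0 2.5
```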
>>>>>>
>>>>>> Probably a good approach would be to build a common view with all of the people interested in this type of analysis. That way, we could reach an agreement about how to visualize the data, which metrics are necessary and interesting, a common methodology for measuring things, and the projects involved. This analysis is just one possibility, but there are certainly more.
>>>>>>
>>>>>> In any case, please let us know any other concerns you may have; any feedback from the community is more than appreciated.
>>>>>>
>>>>>> Thanks a lot for all your comments.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Daniel Izquierdo.
>>>>>>
>>>>>>> Regards,
>>>>>>> Eric Windisch
>>>>>>>
>>>>> _______________________________________________
>>>>> OpenStack-dev mailing list
>>>>> OpenStack-dev at lists.openstack.org
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>
>