<!--
Copyright © 2019, empirical software engineering team from Peking University and ISCAS. All rights reserved.

Written by:
  Jiaxin Zhu
-->

<script src='js/header.js'></script>
     <div id="header"></div>
      <section class="jumbotron text-center">
        <div class="container" style="max-width:800px">
          <h1 class="jumbotron-heading" style="margin-bottom:30px">Data Overview</h1>
          <p></p>
          <p></p>
          <img src="img/multi_levels.jpg" width="80%">
          <hr style="border:1px dotted #036" />
          <img src="img/muilti_extracts.jpg" width="71%">
          <p class="lead text-muted" style="font-size: 19px; text-align: left; margin-top:50px">
          To facilitate the analysis of OSS data, we organize our data into multiple extracts and multiple levels.
          The extracts are retrieved at three different times and through different channels of the development support
          tools (e.g., an issue tracker), such as the front end (the web user interface (UI)) and the back end (an official database dump).
          The variations (dynamics) among extracts give researchers room to reproduce and validate their studies,
          and reveal opportunities for studies that could not otherwise be conducted.
          For each extract, we provide several data levels, ranging from raw data through standardized data to
          calculated data that targets specific research questions.
          The data retrieval and processing scripts for each data level are provided as well.
          With this multi-level structure, analysts can start an inquiry efficiently from the standardized level
          and easily trace the data chain back when necessary (e.g., to verify whether a phenomenon reflected in the data is an actual
          event).
          We have built several datasets with this method and applied them in several published studies.
          </p>
          <p class="lead text-muted" style="font-size: 16px; margin-top:50px">Paper: </p>
          <p class="lead text-muted" style="font-size: 16px; font-style:italic; text-align: left">
          Jiaxin Zhu, Minghui Zhou, and Hong Mei. Multi-extract and multi-level dataset of Mozilla issue tracking history.
          In Proceedings of the 13th International Conference on Mining Software Repositories (MSR '16). ACM, New York, NY, USA, 472-475.
          </p>
        </div>
      </section>
      <div id="footer"></div>
<script src='js/footer.js'></script>