Notes on writing

2019-12-29 On submitting a paper about a benchmark dataset:

(ICST 2019: BugsJS: a Benchmark of JavaScript Bugs)

JavaScript is a popular programming language that is also error-prone due to its asynchronous, dynamic, and loosely-typed nature. In recent years, numerous techniques have been proposed for analyzing and testing JavaScript applications. However, our survey of the literature in this area revealed that the proposed techniques are often evaluated on different datasets of programs and bugs【this claim feels like it does not hold up】. The lack of a commonly used benchmark limits the ability to perform fair and unbiased comparisons for assessing the efficacy of new techniques【this does not seem to hold up either: couldn't everyone simply compare on the same dataset? At most one can say that there is no strong benchmark】. To fill this gap, we propose BugsJS, a benchmark of 453【that is not very large either】 real, manually validated JavaScript bugs from 10 popular JavaScript server-side programs, comprising 444k LOC in total. Each bug is accompanied by its bug report, the test cases that detect it, as well as the patch that fixes it. BugsJS features a rich interface for accessing the faulty and fixed versions of the programs and executing the corresponding test cases, which facilitates conducting highly-reproducible empirical studies and comparisons of JavaScript analysis and testing tools.

2019-12-13 National Memorial Day:

In essence, our study empirically confirms and complements previous research findings (and common sense): Developers (and users) prefer documentation that is correct, complete, up to date, usable, maintainable, readable and useful.



Findings: Table VII shows the impact of using each of the 17 languages on the number of bug fixing commits in a single-language (denoted as ⟨language⟩_S) and multi-language (denoted as ⟨language⟩_M) setting. From the table, we can note that the coefficients of the languages are not always statistically significant. The statistically significant ones are marked with one or multiple asterisks. There are 20 of them. For those that are not statistically significant (i.e., 14 of them), unfortunately not much conclusion can be drawn.

For some languages, the coefficient for the single-language setting is significant, while the one for the multi-language setting is not (four languages: CoffeeScript, Ruby, Erlang, Haskell). For some other languages, it is the other way around: the coefficient for the multi-language setting is significant, while the one for the single-language setting is not (four languages: C, Go, PHP, Python). For yet other languages, their coefficients for both settings are not significant (three languages: C#, JavaScript, Perl). Unfortunately, for such languages (11 languages), we cannot compare the two settings (i.e., single-language and multi-language), because the coefficient of at least one of the settings is inconclusive.

Thus, we focus on languages with statistically significant coefficients for both single and multi-language settings. We find six languages with statistically significant coefficients: C++, Objective-C, Java, TypeScript, Clojure, and Scala. For all of them, we consistently find that their coefficients are larger when they are used in a multi-language setting. This means that there is statistically significant support that using these languages in a multi-language setting (rather than a single-language setting) increases bug proneness. The findings for the other eleven languages do not refute those for the six languages, because we cannot draw conclusions when coefficients are not statistically significant.

Six languages including C++, Objective-C, Java, TypeScript, Clojure, and Scala are more defect prone when they are used with other languages. The results are inconclusive for the other eleven languages.
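The counts in the findings quoted above (17 languages, 20 significant and 14 non-significant coefficients, 6 comparable and 11 inconclusive languages) can be cross-checked with a small bookkeeping sketch. The significance flags below are transcribed from the quoted text only; the names `SIGNIFICANCE` and `summarize` are my own for illustration and do not come from the paper.

```python
# Significance pattern per language, transcribed from the quoted findings.
# Each value is (significant in single-language setting,
#                significant in multi-language setting).
SIGNIFICANCE = {}
for lang in ["CoffeeScript", "Ruby", "Erlang", "Haskell"]:
    SIGNIFICANCE[lang] = (True, False)   # only the single-language coefficient
for lang in ["C", "Go", "PHP", "Python"]:
    SIGNIFICANCE[lang] = (False, True)   # only the multi-language coefficient
for lang in ["C#", "JavaScript", "Perl"]:
    SIGNIFICANCE[lang] = (False, False)  # neither coefficient
for lang in ["C++", "Objective-C", "Java", "TypeScript", "Clojure", "Scala"]:
    SIGNIFICANCE[lang] = (True, True)    # both coefficients


def summarize(table):
    """Count coefficients and languages by significance (hypothetical helper)."""
    total_coefficients = 2 * len(table)  # one coefficient per setting
    significant = sum(single + multi for single, multi in table.values())
    comparable = sorted(lang for lang, (s, m) in table.items() if s and m)
    return {
        "languages": len(table),
        "coefficients": total_coefficients,
        "significant": significant,
        "not_significant": total_coefficients - significant,
        "comparable": comparable,
        "inconclusive": len(table) - len(comparable),
    }


summary = summarize(SIGNIFICANCE)
print(summary)
```

The totals agree with the excerpt: 4 + 4 + 2 × 6 = 20 significant coefficients, 4 + 4 + 2 × 3 = 14 non-significant ones, and 4 + 4 + 3 = 11 languages for which the two settings cannot be compared.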

1. in objective terms

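2. 【A candidate for close reading later】Patterns of Knowledge in API Reference Documentation, TSE'13, by Walid Maalej and Martin P. Robillard. This paper analyzes the content of API reference documentation (for example the JDK and .NET documentation, where each web page is indexed by an API name and describes how that API is used).

The point of the content analysis is to learn what kinds of content typical API documentation contains and how it is organized. Concretely, the authors put a great deal of effort into first defining 12 knowledge types (e.g., what an API does, how it should be used), and then measured how these knowledge types are distributed across the documentation, broken down by type vs. method, classes vs. interface, and member vs. variable. This is supplemented by frequent itemset mining (computed with arules in R).

On the writing and the experiments: the whole process behind the most important first step, defining the knowledge types, is well worth studying closely. The statistical analyses for the later, more intuitive RQs use fairly standard methods. The discussion of the significance of their own work is reasonable and convincing. Worth learning from!

If I ever do similar work, the methods in this paper are worth borrowing!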
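To make the frequent itemset mining step concrete, here is a minimal sketch in plain Python. It is a stand-in for the idea, not the authors' arules pipeline in R: it enumerates itemsets by brute force rather than using Apriori, and both the annotated units and the knowledge-type labels (`functionality`, `usage`, `example`) are hypothetical placeholders, not the paper's taxonomy or data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotations: each set lists the knowledge types found in one
# documentation unit. Labels and data are illustrative only.
UNITS = [
    {"functionality", "usage"},
    {"functionality", "usage", "example"},
    {"functionality"},
    {"functionality", "usage", "example"},
    {"usage", "example"},
]


def frequent_itemsets(units, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    Support is the fraction of units that contain every item in the itemset.
    Brute-force enumeration: acceptable for about a dozen knowledge types,
    not for large item vocabularies.
    """
    counts = Counter()
    for unit in units:
        items = sorted(unit)  # sorted so each itemset has one canonical tuple
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    n = len(units)
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


result = frequent_itemsets(UNITS, min_support=0.6)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With these made-up units, the pair of `functionality` and `usage` co-occurs in three of five units and passes the 0.6 threshold, while `example` with `functionality` co-occurs in only two and is dropped. The same kind of co-occurrence question is what the note describes the paper asking of its 12 knowledge types.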

------ Hmm... supporting evidence:

All newly developed applications have bugs—some of them are quite difficult to locate because they exist within the coding logic, some are simply a matter of not
