(Don't read this page. It is a work in progress for a Fall'19 graduate automated SE subject at NC State. Come back in mid-October!)
Behnoush Abdollahi and Olfa Nasraoui. 2016. Explainable restricted Boltzmann machines for collaborative filtering. arXiv preprint arXiv:1606.07129 (2016).
Abdessalem, R.B., Nejati, S., Briand, L.C., Stifter, T.: Testing vision-based control systems using learnable evolutionary algorithms. In: Proceedings of the 40th International Conference on Software Engineering, ICSE ’18, pp. 1016–1026. ACM, New York, NY, USA (2018). DOI 10.1145/3180155.3180160
Agrawal, Amritanshu, Wei Fu, and Tim Menzies. “What is wrong with topic modeling? And how to fix it using search-based software engineering.” Information and Software Technology 98 (2018): 74-88.
Agrawal, A., Menzies, T.: Is better data better than better data miners?: on the benefits of tuning SMOTE for defect prediction. In: Proceedings of the 40th International Conference on Software Engineering, pp. 1050–1061. ACM (2018)
Saleema Amershi, Andrew Begel, Christian Bird, Rob DeLine, Harald Gall, Ece Kamar, Nachiappan Nagappan, Besmira Nushi, Thomas Zimmermann. (2019) Software Engineering for Machine Learning: A Case Study, ICSE SEIP.
R. K. E. Bellamy et al., “Think Your Artificial Intelligence Software Is Fair? Think Again,” in IEEE Software, vol. 36, no. 4, pp. 76-80, July-Aug. 2019.
William Benton. 2019. “Machine learning and discovery with Kubernetes”. SEMLA’19. https://freevariable.com/slides/semla-2019.pdf
J. Brickell and V. Shmatikov, “The cost of privacy: destruction of data-mining utility in anonymized data publishing,” in Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM, 2008, pp. 70–78.
Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8.
Y. Brun and A. Meliou, “Software fairness,” ser. ESEC/FSE 18. NY, USA: ACM, pp. 754–759.
Joymallya Chakraborty, Tianpei Xia, Fahmid M. Fahid, Tim Menzies, 2019, Software Engineering for Fairness: A Case Study with Hyperparameter Optimization IEEE ASE’19.
Di Chen, Wei Fu, Rahul Krishna, and Tim Menzies. 2018. Applications of psychological science for actionable analytics. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018). ACM, New York, NY, USA, 456-467. DOI: https://doi.org/10.1145/3236024.3236050
Chen, J., Nair, V., Krishna, R., & Menzies, T. (2018). “Sampling” as a Baseline Optimizer for Search-based Software Engineering. IEEE Transactions on Software Engineering.
Jianfeng Chen. (2019) On the Value of Sampling and Pruning for Search-Based Software Engineering. Ph.D. thesis. NC State, USA. https://www.youtube.com/watch?v=jU9w6w8LwqM
Report of Columbia Accident Investigation Board. Aug. 26, 2003. Volume 1.
Mark W Craven and Jude W Shavlik. 2014. Learning symbolic rules using artificial neural networks. In Proceedings of the Tenth International Conference on Machine Learning. 73–80.
Hoa Khanh Dam, Truyen Tran, and Aditya Ghose. 2018. “Explainable Software Analytics.”. ICSE, NIER, 2018, arXiv preprint arXiv:1802.00603 (2018).
James Dougherty, Ron Kohavi, and Mehran Sahami. 1995. Supervised and unsupervised discretization of continuous features. In Proceedings of the Twelfth International Conference on Machine Learning (ICML’95), Armand Prieditis and Stuart J. Russell (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 194-202. http://robotics.stanford.edu/users/sahami/papers-dir/disc.pdf
Richard O. Duda, Peter E. Hart, and David G. Stork. 2000. “Pattern Classification” (2nd Edition). Wiley-Interscience, New York, NY, USA.
von Neumann, Shannon, and Entropy: http://bit.ly/2Y1yBQh
M. S. Feather and T. Menzies, “Converging on the optimal attainment of requirements,” Proceedings IEEE Joint International Conference on Requirements Engineering, Essen, Germany, 2002, pp. 263-270. doi: 10.1109/ICRE.2002.1048537
Edward Feigenbaum and Pamela McCorduck. 1983. The Fifth Generation: Artificial Intelligence and Japan’s Computer Challenge to the World. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
Wei Fu, Tim Menzies, Xipeng Shen. 2016. “Tuning for software analytics: Is it really necessary?” Information and Software Technology, Volume 76, Pages 135-146, ISSN 0950-5849, https://doi.org/10.1016/j.infsof.2016.04.017.
Wei Fu and Tim Menzies. 2017. Easy over hard: a case study on deep learning. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. ACM, 49–60.
Garcia, S., Derrac, J., Cano, J. R., & Herrera, F. (2011). Prototype selection for nearest neighbor classification: Taxonomy and empirical study. IEEE Transactions on Pattern Analysis & Machine Intelligence, (3), 417-435.
Gregory Gay, Tim Menzies, Omid Jalali, Gregory Mundy, Beau Gilkerson, Martin Feather. 2010. Finding robust solutions in requirements models. Automated Software Engineering, March. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.650.3238&rep=rep1&type=pdf
G. Gigerenzer. (2008). Why Heuristics Work. Perspect Psychol Sci. 2008 Jan;3(1):20-9. doi: 10.1111/j.1745-6916.2008.00058.x.
A. Gosiewska and P. Biecek, “iBreakDown: Uncertainty of Model Explanations for Non-additive Predictive Models,” arXiv preprint arXiv:1903.11420, 2019.
M. Grechanik, C. Csallner, C. Fu, and Q. Xie, “Is data privacy always good for software testing?” in Proceedings of the 2010 IEEE 21st International Symposium on Software Reliability Engineering. Washington, DC, USA: IEEE Computer Society, 2010, pp. 368–377.
Joel Grus. 2019. “Data Science from Scratch: First Principles with Python” (2nd ed.). O’Reilly Media, Inc
M. Harman. The current state and future of search based software engineering. In Future of Software Engineering, ICSE’07. 2007.
Mark Harman, S. Afshin Mansouri, and Yuanyuan Zhang. 2012. “Search-based software engineering: Trends, techniques and applications.” ACM Comput. Surv. 45, 1, Article 11 (December 2012), 61 pages. DOI=http://dx.doi.org/10.1145/2379776.2379787
Hall, Mark & Holmes, Geoffrey. (2003). Benchmarking Attribute Selection Techniques for Discrete Class Data Mining. Knowledge and Data Engineering, IEEE Transactions on. 15. 1437- 1447. 10.1109/TKDE.2003.1245283.
IEEE, 2019. “Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems”. First edition.
Kelly G (1955). The Psychology of Personal Constructs New York: W W Norton.
Miryung Kim, Thomas Zimmermann, Robert DeLine, and Andrew Begel.
Krall, Joseph, Tim Menzies, and Misty Davies. “Gale: Geometric active learning for search-based software engineering.” IEEE Transactions on Software Engineering 41.10 (2015): 1001-1018.
Rahul Krishna and Tim Menzies. 2015. Actionable = Cluster + Contrast?. In Automated Software Engineering Workshop (ASEW), 2015 30th IEEE/ACM International Conference on. IEEE, 14–17.
Jill Larkin, John McDermott, Dorothea P. Simon, and Herbert A. Simon. 1980. Expert and Novice Performance in Solving Physics Problems. Science 208, 4450 (1980), 1335–1342. DOI:http://dx.doi.org/10.1126/science.208.4450.1335 arXiv:http://science.sciencemag.org/content/208/4450/1335.full.pdf
Nancy G. Leveson. 1995. Safeware: System Safety and Computers. ACM, New York, NY, USA.
Zachary C Lipton. 2016. The mythos of model interpretability. arXiv preprint arXiv:1606.03490 (2016).
Wei Ji Ma, Masud Husain, and Paul M Bays. 2014. Changing concepts of working memory. Nature neuroscience 17, 3 (2014), 347–356.
Suvodeep Majumder, Nikhila Balaji, Katie Brey, Wei Fu, and Tim Menzies. 2018. 500+ times faster than deep learning: a case study exploring faster methods for text mining stackoverflow. In Proceedings of the 15th International Conference on Mining Software Repositories (MSR ‘18). ACM, New York, NY, USA, 554-563. DOI: https://doi.org/10.1145/3196398.3196424
Thilo Mende and Rainer Koschke. 2010. Effort-aware defect prediction models. In Software Maintenance and Reengineering (CSMR), 2010 14th European Conference on. IEEE, 107–116.
Marcilio Mendonca, Andrzej Wąsowski, and Krzysztof Czarnecki. 2009. SAT-based analysis of feature models is easy. In Proceedings of the 13th International Software Product Line Conference (SPLC ‘09). Carnegie Mellon University, Pittsburgh, PA, USA, 231-240.
Tim Menzies and Ying Hu. 2003. Data Mining for Very Busy People. Computer 36, 11 (November 2003), 22-29. DOI: https://doi.org/10.1109/MC.2003.1244531
Tim Menzies, Oussama Elrawas, Jairus Hihn, Martin Feather, Ray Madachy, and Barry Boehm. 2007. The business case for automated software engineering. ASE’07. http://doi.org/10.1145/1321631.1321676
Tim Menzies, Oussama El-Rawas, Jairus Hihn, and Barry Boehm. 2009. Can we build software faster and better and cheaper?. In Proceedings of the 5th International Conference on Predictor Models in Software Engineering (PROMISE ‘09). ACM, New York, NY, USA, Article 2, 9 pages. DOI=http://dx.doi.org/10.1145/1540438.1540442
Microsoft AI principles. 2019. https://www.microsoft.com/en-us/ai/our-approach-to-ai
Mittas N, Angelis L (2013) Ranking and clustering software cost estimation models through a multiple comparisons algorithm. IEEE Trans SE 39(4):537–551, DOI 10.1109/TSE.2012.45
Nair, V., Menzies, T., Siegmund, N., & Apel, S. (2018). Faster discovery of faster system configurations with spectral learning. Automated Software Engineering, 25(2), 247-277.
J. Nam and S. Kim, “CLAMI: Defect Prediction on Unlabeled Datasets (T),” 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), Lincoln, NE, 2015, pp. 452-463. http://people.csail.mit.edu/hunkim/papers/nam-HDP-fse2015.pdf
Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle River, NJ, USA. http://aima.cs.berkeley.edu
H. Osman, M. Ghafari, and O. Nierstrasz. (2017). Hyperparameter optimization to improve bug prediction accuracy. MaLTeSQuE’17. IEEE Press.
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Édouard Duchesnay.
K. Petersen and C. Wohlin, “Context in industrial software engineering research,” in Empirical Software Engineering and Measurement, 2009. ESEM 2009. 3rd International Symposium on, Oct. 2009, pp. 401 –404.
Fayola Peters, Tim Menzies, and Lucas Layman. 2015. LACE2: better privacy-preserving data sharing for cross project defect prediction. In Proceedings of the 37th International Conference on Software Engineering - Volume 1 (ICSE ‘15), Vol. 1. IEEE Press, Piscataway, NJ, USA, 801-811.
Nathaniel D Phillips, Hansjoerg Neth, Jan K Woike, and Wolfgang Gaissmaier. 2017. FFTrees: A toolbox to create, visualize, and evaluate fast-and-frugal decision trees. Judgment and Decision Making 12, 4 (2017), 344–368.
Karl Popper. (1963). Conjectures and Refutations: The Growth of Scientific Knowledge, ISBN 0-415-04318-2
Quinlan, J. R. (1986). “Induction of decision trees” (PDF). Machine Learning. 1: 81–106. doi:10.1007/BF00116251
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ‘16). ACM, New York, NY, USA, 1135-1144. DOI: https://doi.org/10.1145/2939672.2939778
Robert Sawyer. 2013. Bias Impact on Analyses and Decision Making Depends on the Development of Less Complex Applications. In Principles and Applications of Business Intelligence Research. IGI Global, 83–95.
Sayyad, A. S., Menzies, T., & Ammar, H. (2013, May). On the value of user preferences in search-based software engineering: a case study in software product lines. In Proceedings of the 2013 International Conference on Software Engineering (pp. 492-501). IEEE Press.
D. Sculley. 2010. “Web-scale k-means clustering”. In Proceedings of the 19th international conference on World wide web (WWW ‘10). ACM, New York, NY, USA, 1177-1178. DOI=http://dx.doi.org/10.1145/1772690.1772862
D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-Francois Crespo, and Dan Dennison. 2015. “Hidden technical debt in Machine learning systems.” In Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2 (NIPS’15), C. Cortes, D. D. Lee, M. Sugiyama, and R. Garnett (Eds.), Vol. 2. MIT Press, Cambridge, MA, USA, 2503-2511.
Simon, Herbert A. (1956). “Rational Choice and the Structure of the Environment”. Psychological Review. 63 (2): 129–138. CiteSeerX 10.1.1.545.5116. doi:10.1037/h0042769.
Diana F. Gordon. (2001). APT Agents: Agents That Are Adaptive Predictable and Timely. International Workshop on Formal Approaches to Agent-Based Systems FAABS 2000: Formal Approaches to Agent-Based Systems pp 278-293
Shiang-Yen Tan and Taizan Chan. 2016. Defining and conceptualizing actionable insight: a conceptual framework for decision-centric analytics. arXiv preprint arXiv:1606.03510 (2016).
Tantithamthavorn, C., McIntosh, S., Hassan, A. E., & Matsumoto, K. (2016, May). Automated parameter optimization of classification techniques for defect prediction models. In 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE) (pp. 321-332). IEEE.
D. Wolpert. (2013). Ubiquity symposium: Evolutionary computation and the processes of life: what the no free lunch theorems really mean: how to improve search algorithms. Ubiquity. Volume 2013, Number December (2013), Pages 1-15. https://ubiquity.acm.org/article.cfm?id=2555237
Ian H. Witten, Eibe Frank, Mark A. Hall, and Christopher J. Pal.
T. Xia, R. Krishna, J. Chen, G. Mathew, X. Shen, and T. Menzies. (2018). Hyperparameter Optimization for Effort Estimation. https://arxiv.org/pdf/1805.00336.pdf
Tianyin Xu, Long Jin, Xuepeng Fan, Yuanyuan Zhou, Shankar Pasupathy, and Rukma Talwadker. 2015. Hey, you have given me too many knobs!: understanding and dealing with over-designed configuration in system software. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2015). ACM, New York, NY, USA, 307-319. DOI: https://doi.org/10.1145/2786805.2786852
Zitzler, Eckart & Künzli, Simon. (2004). Indicator-Based Selection in Multiobjective Search. Conference on Parallel Problem Solving from Nature (PPSN VIII). 832-842. 10.1007/978-3-540-30217-9_84. http://www.simonkuenzli.ch/docs/ZK04.pdf