Search results - "CASARI, AMANDA"

1

Report

The OCEAN mailing list data set: Network analysis spanning mailing lists and code repositories

Authors: Warrick, Melanie, Rosenblatt, Samuel F., Young, Jean-Gabriel, Casari, Amanda, Hébert-Dufresne, Laurent, Bagrow, James

Subject terms: Computer Science - Computers and Society, Computer Science - Software Engineering

Description: Communication surrounding the development of an open source project largely occurs outside the software repository itself. Historically, large communities often used a collection of mailing lists to discuss the different aspects of their projects. Multimodal tool use, with software development and communication happening on different channels, complicates the study of open source projects as a sociotechnical system. Here, we combine and standardize mailing lists of the Python community, resulting in 954,287 messages from 1995 to the present. We share all scraping and cleaning code to facilitate reproduction of this work, as well as smaller datasets for the Golang (122,721 messages), Angular (20,041 messages) and Node.js (12,514 messages) communities. To showcase the usefulness of these data, we focus on the CPython repository and merge the technical layer (which GitHub account works on what file and with whom) with the social layer (messages from unique email addresses) by identifying 33% of GitHub contributors in the mailing list data. We then explore correlations between the valence of social messaging and the structure of the collaboration network. We discuss how these data provide a laboratory to test theories from standard organizational science in large open source projects.
Comment: Accepted for the 19th International Conference on Mining Software Repositories (MSR '22), May 23-24, 2022, Pittsburgh, PA, USA

Open access: http://arxiv.org/abs/2204.00603

2

Academic journal

Invisible Labor in Open Source Software Ecosystems

Authors: Meluso, John, Casari, Amanda, McLaughlin, Katie, Trujillo, Milo Z.

Subject terms: Computer Science - Software Engineering, D.2.9, K.6.3

Description: Invisible labor is work that is not fully visible, not appropriately compensated, or both. In open source software (OSS) ecosystems, essential tasks that do not involve code (like content moderation) often become invisible to the detriment of individuals and organizations. However, invisible labor is so difficult to measure that we do not know how much of OSS activities are invisible. Our study addresses this challenge, demonstrating that roughly half of OSS work is invisible. We do this by developing a survey technique with cognitive anchoring that measures OSS developer self-assessments of labor visibility and attribution. Survey respondents (n=142) reported that their work is more likely to be nonvisible or partially visible (i.e. visible to at most 1 other person) than fully visible (i.e. visible to 2 or more people). Furthermore, cognitively anchoring participants to the idea of high work visibility increased perceptions of labor visibility and decreased visibility importance compared to anchoring to low work visibility. This suggests that advertising OSS activities as "open" may not make labor visible to most people, but rather lead contributors to overestimate labor visibility. We therefore add to a growing body of evidence that designing systems that recognize all kinds of labor as legitimate contributions is likely to improve fairness in software development while providing greater transparency into work designs that help organizations and communities achieve their goals.
Comment: 18 pages, 6 figures

Relation: http://arxiv.org/abs/2401.06889

Availability: http://arxiv.org/abs/2401.06889

3

Report

Which contributions count? Analysis of attribution in open source

Authors: Young, Jean-Gabriel, Casari, Amanda, McLaughlin, Katie, Trujillo, Milo Z., Hébert-Dufresne, Laurent, Bagrow, James P.

Source: 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR), pp. 242-253 (2021)

Subject terms: Computer Science - Software Engineering, Computer Science - Computers and Society

Description: Open source software projects usually acknowledge contributions with text files, websites, and other idiosyncratic methods. These data sources are hard to mine, which is why contributorship is most frequently measured through changes to repositories, such as commits, pushes, or patches. Recently, some open source projects have taken to recording contributor actions with standardized systems; this opens up a unique opportunity to understand how community-generated notions of contributorship map onto codebases as the measure of contribution. Here, we characterize contributor acknowledgment models in open source by analyzing thousands of projects that use a model called All Contributors to acknowledge diverse contributions like outreach, finance, infrastructure, and community management. We analyze the life cycle of projects through this model's lens and contrast its representation of contributorship with the picture given by other methods of acknowledgment, including GitHub's top committers indicator and contributions derived from actions taken on the platform. We find that community-generated systems of contribution acknowledgment make work like idea generation or bug finding more visible, which generates a more extensive picture of collaboration. Further, we find that models requiring explicit attribution lead to more clearly defined boundaries around what is and what is not a contribution.
Comment: Extended version of a paper accepted at MSR 2021

Open access: http://arxiv.org/abs/2103.11007

4

Conference

Patterns and Anti-Patterns when Measuring Diversity in Open Source

Authors: Casari, Amanda

Description: If we fundamentally believe that 'Open source is for everyone', how do we know we are actually bringing everyone in, meeting them where they are, and fostering a diverse and inclusive open source ecosystem? Our open source team has evolved our practices for measuring open source communities, and the impact we have on them. This poster presents patterns and anti-patterns we have learned about measuring diversity in global open source communities.

Relation: https://zenodo.org/communities/scipy; https://zenodo.org/record/8220844; https://doi.org/10.25080/gerudo-f2bc6f59-01c; oai:zenodo.org:8220844

Availability: https://doi.org/10.25080/gerudo-f2bc6f59-01c
https://zenodo.org/record/8220844

5

Periodical

Beyond the Repository: Best practices for open source ecosystems researchers.

Authors: CASARI, AMANDA, FERRAIOLI, JULIA, LOVATO, JUNIPER

Source: Communications of the ACM. Oct2023, Vol. 66 Issue 10, p50-55. 6p. 2 Color Photographs.

Subject terms: *OPEN source software, *OPEN data movement, *INFORMED consent (Law), RESEARCH personnel, RESEARCH ethics, ACQUISITION of data, DATA privacy

Abstract: This article details best practices for open source ecosystems research to uphold the integrity of ecosystems. The article details nine best practices as a guide for researchers working with ecosystems with an emphasis on ethics and respect. Topics include understanding and adhering to information usage policies, data collection methods, and collaboration with the communities involved with these ecosystems.

6

Academic journal

Open source ecosystems need equitable credit across contributions

Authors: Casari, Amanda, McLaughlin, Katie, Trujillo, Milo Z., Young, Jean-Gabriel, Bagrow, James P., Hébert-Dufresne, Laurent

Contributors: Google Open Source

Source: Nature Computational Science; volume 1, issue 1, page 2; ISSN 2662-8457

Subject terms: Computer Networks and Communications, Computer Science Applications, Computer Science (miscellaneous)

Availability: https://doi.org/10.1038/s43588-020-00011-w
https://www.nature.com/articles/s43588-020-00011-w.pdf
https://www.nature.com/articles/s43588-020-00011-w

7

The OCEAN mailing list data set

Authors: Warrick, Melanie, Rosenblatt, Samuel F., Young, Jean-Gabriel, Casari, Amanda, Hébert-Dufresne, Laurent, Bagrow, James

Source: Proceedings of the 19th International Conference on Mining Software Repositories.

Subject terms: Software Engineering (cs.SE), FOS: Computer and information sciences, Computer Science - Computers and Society, Computer Science - Software Engineering, Computers and Society (cs.CY)

Description: Communication surrounding the development of an open source project largely occurs outside the software repository itself. Historically, large communities often used a collection of mailing lists to discuss the different aspects of their projects. Multimodal tool use, with software development and communication happening on different channels, complicates the study of open source projects as a sociotechnical system. Here, we combine and standardize mailing lists of the Python community, resulting in 954,287 messages from 1995 to the present. We share all scraping and cleaning code to facilitate reproduction of this work, as well as smaller datasets for the Golang (122,721 messages), Angular (20,041 messages) and Node.js (12,514 messages) communities. To showcase the usefulness of these data, we focus on the CPython repository and merge the technical layer (which GitHub account works on what file and with whom) with the social layer (messages from unique email addresses) by identifying 33% of GitHub contributors in the mailing list data. We then explore correlations between the valence of social messaging and the structure of the collaboration network. We discuss how these data provide a laboratory to test theories from standard organizational science in large open source projects.
Comment: Accepted for the 19th International Conference on Mining Software Repositories (MSR '22), May 23-24, 2022, Pittsburgh, PA, USA

Open access: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::dbdc686f46db1f2d48b8c6dde08558a2
https://doi.org/10.1145/3524842.3528479

8

eBook

Feature Engineering for Machine Learning : Principles and Techniques for Data Scientists

Description: Feature engineering is a crucial step in the machine-learning pipeline, yet this topic is rarely examined on its own. With this practical book, you'll learn techniques for extracting and transforming features (the numeric representations of raw data) into formats for machine-learning models. Each chapter guides you through a single data problem, such as how to represent text or image data. Together, these examples illustrate the main principles of feature engineering.

Rather than simply teach these principles, authors Alice Zheng and Amanda Casari focus on practical application with exercises throughout the book. The closing chapter brings everything together by tackling a real-world, structured dataset with several feature-engineering techniques. Python packages including NumPy, pandas, scikit-learn, and Matplotlib are used in code examples.

You'll examine:
- Feature engineering for numeric data: filtering, binning, scaling, log transforms, and power transforms
- Natural text techniques: bag-of-words, n-grams, and phrase detection
- Frequency-based filtering and feature scaling for eliminating uninformative features
- Encoding techniques of categorical variables, including feature hashing and bin-counting
- Model-based feature engineering with principal component analysis
- The concept of model stacking, using k-means as a featurization technique
- Image feature extraction with manual and deep-learning techniques

Authors: Zheng, Alice, Casari, Amanda

Material type: eBook.

Subjects: Data mining, Machine learning

Categories: COMPUTERS / Data Science / General, COMPUTERS / Database Administration & Management, COMPUTERS / Data Science / Data Analytics, COMPUTERS / Data Science / Data Warehousing, COMPUTERS / Data Science / Data Modeling & Design, COMPUTERS / Databases / Servers

9

Feature engineering for machine learning: principles and techniques for data scientists

Authors: Zheng, Alice, Casari, Amanda

Subject terms: Computing and Computers

Relation: http://cds.cern.ch/record/2670779; oai:cds.cern.ch:2670779

Availability: http://cds.cern.ch/record/2670779

10

Book

Feature engineering for machine learning : principles and techniques for data scientists / Alice Zheng and Amanda Casari.

Authors: Zheng, Alice (author)

Contributors: Casari, Amanda; O'Reilly Media (publisher)

Source: Includes chapter bibliographies. Index.

Subject terms: Machine learning, Data mining

Open access: http://katalog.nukat.edu.pl/lib/item?id=chamo:4817185&theme=nukat
