We hope everyone is having a great week! Check out our message to the community, featured blogs, and upcoming events!
- Message to the Community -
We are relieved Derek Chauvin was found guilty of murdering George Floyd in May 2020. Although we don’t believe in the incarceration system, as we prefer forms of restorative justice, the verdict sets a precedent that the institutionally sanctioned murder of Black people will not be tolerated, a critical step in addressing racial injustice and anti-Blackness in this country. We still mourn George Floyd -- and Breonna Taylor, Daunte Wright, Adam Toledo and too many others -- as nothing can bring those lives back or fill the voids left in their families and communities. We will continue to fight for Black lives and strive to eradicate systemic racism in our own community, on campus, in our state, and across the nation.
- D-Lab Community
- Blog Posts -
Machine Learning in Atmospheric Science By Grazia Rovelli
"The atmosphere is an incredibly complex (and fascinating, I would add!) chemical-physical system...Both our knowledge about the chemical composition of the atmosphere and our ability to measure trace concentrations of chemicals has dramatically improved over the last decades...It seems more and more clear that machine learning algorithms could represent powerful tools to investigate the intrinsic complexity of the atmosphere."
The Importance of Design Plans for Data Science By Alex Stephenson
"Since becoming a Data Fellow at the D-Lab, I have had the opportunity to assist many talented social scientists through the D-Lab’s Consulting service. A regular consulting request is to help with the research design for a new project... For empirical researchers, a high-quality research design makes or breaks a research project. In this post, I suggest a few benefits of writing a skeleton design plan before writing any code whatsoever."
The Digital Humanities Fair (DH Fair) is an annual event that offers the UC Berkeley community the opportunity to share projects at various stages of development, receive invaluable feedback from peers, and reflect on the field more broadly. This year's event will feature speakers and lightning talks.
"Public meetings allow constituents to comment on matters before local governments, providing signals of public opinion in an otherwise low-information environment. Recent advances in the availability of administrative data have shown inequalities in who participates by race, age, and homeownership status...Does the demographically unrepresentative slice of opinion presented at these meetings lead policy to be out of step with mass opinion?"
To read more about Alexander's research and attend his talk, click here!
Carl Boe of the Center for the Economics and Demography of Aging and Patty Frontiera of the D-Lab will speak about working with the ZTRAX dataset. The Zillow Transaction and Assessment Dataset (ZTRAX) is the country’s largest real estate database made available free of charge to U.S. academic, nonprofit and government researchers.
Computational Text Analysis:
The Cost of Conflict in Bankruptcy Reorganizations
April 30 | 1:00 PM to 2:00 PM | Register Link
Speaker: Kenneth Ayotte
Chapter 11 bankruptcy reorganizations are complex and involve negotiations between classes of claims about how a company’s value is to be divided among these claims. Sometimes these negotiations can be resolved quickly and cheaply, while other cases are long, contentious affairs. We would like to use text analysis to understand the causes and consequences of conflict in bankruptcy cases.
We are hiring a department manager who will be expected to assist the executive director in all aspects of program planning, function as a resource for all D-Lab staff, and more. Applicants should have strong communication and organization skills and a good awareness of data science fields.
For a complete list of application requirements and job expectations, please view here.
This workshop introduces the basic concepts of Deep Learning - the training and performance evaluation of large neural networks, especially for image classification, natural language processing, and time-series data. As with many other machine learning algorithms, we will use deep learning algorithms to map input data to their appropriately classified outcome labels. You will use the R interface to Keras to become familiar with basic concepts like input and output layers, batch sizes and output dimensions, dropout rates, weight parametrization and bias, backpropagation, and loss, activation, and optimization functions. You will also gain confidence exploring more complex approaches that utilize pretrained and fine-tuned models.
This workshop will present a brief history of ANNs (Artificial Neural Networks) and explain the intuition behind them. This part aims to give the audience a conceptual understanding with few mathematical barriers and no programming requirements. It will include a step-by-step construction of a very basic ANN. Although the code will be written in Python, it will be intuitive enough for programmers of other languages to follow along. Using the popular Python library scikit-learn, an ANN will be implemented on a classification problem. High-level libraries reduce the work of implementing an ANN to tuning a set of parameters, which will be explained in this part.
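For readers curious what a "very basic ANN" can look like, here is a minimal sketch that is not taken from the workshop materials: a single artificial neuron (a perceptron) trained in plain Python to reproduce the logical AND function. The function names, the learning rate, and the number of epochs are illustrative assumptions; the workshop itself uses scikit-learn rather than hand-written code like this.

```python
# A single artificial neuron (perceptron) learning logical AND.
# Illustrative sketch only; names and settings are assumptions.

def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Apply the perceptron learning rule for a fixed number of epochs."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Truth table for AND: the output is 1 only when both inputs are 1.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_samples]
print(predictions)
```

A single neuron like this can only separate classes with a straight line, which is one reason practical networks stack many neurons into layers and rely on libraries to handle the training.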
The D-Lab's R Fundamentals workshop is a four-part introductory series that will teach you R from scratch with clear introductions, concise examples, and support documents. You will learn how to download and install the open-source RStudio software; import, export, manipulate, and visualize data; and write functions as shorthand for your code. After completing this workshop, you will have the foundational understanding needed to create, organize, and use workflows for your personal research.
Geospatial data are an important component of data visualization and analysis in the social sciences, humanities, and elsewhere. The R programming language is a great platform for exploring these data and integrating them into your research. Part one of this multi-part workshop series will introduce basic methods and packages for working with geospatial data in R. Part two will dive deeper into data-driven mapping in R, using color palettes and data classification to communicate information with maps. Part three will introduce tools and approaches for working with raster data.
This four-part, interactive workshop series is your complete introduction to programming in Python for people with little or no previous programming experience. By the end of the series, you will be able to apply your knowledge of basic principles of programming and data manipulation to a real-world social science application.
Geospatial data are an important component of social science and humanities data visualization and analysis. This workshop will introduce basic methods for working with geospatial data in Python using GeoPandas, a relatively new Python library for working with geospatial data that has matured and stabilized in the last few years. In the workshop we will import geospatial data stored in shapefiles and CSV files into geopandas objects. We will explore methods for subsetting and spatially reshaping these objects. We will use geopandas methods for defining and transforming coordinate reference systems. Participants will also join tabular data to geospatial data and create maps based on the data values.
Qualtrics is a powerful online tool available to Berkeley community members that can be used for a range of data collection activities. Primarily, Qualtrics is designed to make web surveys easy to write, test, and implement, but the software can be used for data entry, training, quality control, evaluation, market research, pre/post-event feedback, and other uses with some creativity. This overview will introduce a simple workflow using the system with an orientation to the main interfaces for web survey design, sample management, corresponding with sample members, and exporting data at the end of the field period. Examples from completed surveys in public health, economics, program evaluation, and other disciplines will be incorporated in the overview.
An introduction to programming basics in Bash and GitHub that are often assumed, but that you might have never had good instruction on! The first half of this workshop will introduce you to navigating your computer’s filesystem and to basic Bash commands, removing the fear of working with the command line and giving you the confidence to use it to increase your productivity. The second half of this workshop will introduce you to Git, a powerful tool for keeping track of changes you make to the files in a project. You can use it to synchronize your work across computers, collaborate with others, and even deploy applications to the cloud. In this workshop, you will learn the basics to understand and use Git, including working with the popular "social coding" website, GitHub.
To see a calendar view of our upcoming April and May 2021 Workshops, click here!
- Consulting Drop-in Hours -
We are continuing to offer drop-in consulting Monday-Friday from 10am-5pm. Stop by our virtual front desk to speak with a consultant.
Are you interested in developing transferable competencies that are attractive to employers and academic programs? In our digital humanities program, the UC Berkeley Cultural Analytics Learning Institute for Digital Humanities (CALI-DH), you will explore questions about art and culture using digital tools. By pairing computational methods and domain specialization you can better understand complex phenomena and cultures and how computational analysis influences what you see. CALI-DH Online will guide you through the entire process of identifying relevant cultural artifacts and archives, curating your own subset of data, conducting advanced research, and communicating your findings.
To learn more about the courses offered this summer please visit here!
- Upcoming Events -
Computational Social Science Forum:
A Relational View on Ethics and Technology
April 26 | 12:00 PM - 1:30 PM | Register
Speaker: Bogdana Rakova
Bringing awareness to our inherent positionality, this talk will start by acknowledging that no view is a view from nowhere and explore what that means in the world of artificial intelligence (AI). Giving examples from a recent ethnographic study at the intersection of organizational structure and the work of ensuring the responsible development and use of AI, we will investigate the so-called socio-technical context - the lived experience of some of the people actively involved in the AI ethics field. We will then explore what the future holds for the intersection of ethics and technology by embracing the pluriverse and bringing in perspectives from the field of community well-being.
Truth and Denial: Searching for Information in the Digital Era Panel
April 22 | 12:00 PM to 1:30 PM | Register here
Presented as part of the "Matrix on Point" series, this event will feature a group of distinguished panelists discussing questions related to objectivity, disinformation, and the construction of truth from a media-consumption (rather than media-production) perspective, focusing on how internet users find information, how algorithms play a deterministic role in search results, and how lies propagate and solidify.
Introduction to Containers on Savio: Creating Reproducible, Scalable & Portable Workflows
April 21 | 3:40 PM - 5:00 PM | Register
This training will introduce you to the key concepts and tools for using containers, in particular, Docker and Singularity containers. Containers make it easy to install software, move your computation between different computing environments, and make your workflow reproducible. The training will focus on the use of Singularity on Savio, the campus high-performance computing cluster, including how to use Docker images with Singularity in a cluster environment.
- Jobs & External Opportunities -
Assistant Professor, Halıcıoğlu Data Science Institute
UC San Diego invites applications from outstanding candidates for an open-rank (tenure-track or tenured) faculty position for a primary appointment at the Halıcıoğlu Data Science Institute (HDSI) with an optional joint appointment in another academic department. HDSI welcomes outstanding candidates who advance research at the intersection of Data Science and Health Sciences/Humanities/Social Sciences. A successful candidate should use and/or develop Data Science methods to study important societal questions related to Black communities anywhere in the US, Africa, or the Black Diaspora.
The Risk Resilience Research lab at the University of California, Berkeley is recruiting a Graduate Student Researcher to be appointed at 50% time. The expected start date is May 15, 2021, and the recruitment will be open until filled. The level of appointment will be commensurate with the applicant’s experience.
The Informatics and GIS Program in the UC Division of Agriculture and Natural Resources is recruiting a GIS web developer. This is a great opportunity for someone with a background in open source GIS and web development, who enjoys building information tools to support natural resource management and agriculture. This new position will be based at Davis or Berkeley, with the possibility of working remotely.
The application deadline is May 2. For more info and to apply, see the announcement!
College of Letters & Science Graduate Mentor
The College of Letters & Science is seeking two graduate students from the areas of Arts & Humanities, Biological Science, Interdisciplinary Studies, Mathematics & Physical Science, Social Sciences, and/or students with liberal arts backgrounds currently studying in professional school programs to mentor L&S undergraduates. Mentors will assist the L&S advising team in the expansion of advising tools and programs aimed at supporting undergraduates’ exploration of a broad-based liberal arts education.
The NSF Center for Genetically Encoded Materials (C-GEM) is searching for a developer to build a web-based database to support a diverse set of ribosome engineering efforts. This database will serve as a critical resource for C-GEM and the broader chemical, synthetic, and engineering biology communities. We are looking for a motivated developer who can create the architecture of this database and generate instructions on how to input raw data by May 7th. We envision two UIs: one for data entry and one for searching/data display. We would like to have a beta version of the database available for internal use by May 24th. We will support an additional round of development in July or August (once the database is populated) to make modifications suggested by the C-GEM community prior to public release. We look forward to collaborating with a software expert to make our vision a reality!
Postdoctoral Fellowships at the Proctor Foundation, UCSF
The Francis I. Proctor Foundation at the University of California, San Francisco is seeking postdoctoral fellows to contribute to computational studies of trachoma, enteric pathogens, antimicrobial resistance, and healthcare-associated infections. Ideal applicants will have a Ph.D. and a record of achievement in infectious disease epidemiology, biostatistics, bioinformatics, or quantitative biological fields. Applicants must have strong writing and analytical skills, should be adept at programming and data analysis (preferably R), and should have a publication record commensurate with experience.