[2023.08] Data Umbrella Newsletter: August 2023
We organize data science events for the community.
Data Umbrella is a non-profit global community for underrepresented persons in data science. We organize online data science events for the community. All levels are welcome. Our Code of Conduct applies to all of our spaces.
Announcements
Open Source Sprint Impact
Congratulations to Daniel Saunders, who is now a Google Summer of Code Intern for June-August 2023 with the PyMC project. Daniel previously participated in the Data Umbrella PyMC 2023 Working Sessions.
I participated in a PyMC sprint in July 2022, organized with Data Umbrella. I remember Reshama Shaikh, Ravin Kumar, Rowan Schaefer, and Oriol Abril Pla being really nice and super helpful. They taught me how git works and how to tidy up doc strings.
Resources: Job Boards
By popular demand, here are some job boards and communities to share:
Open Source Startups
A collection of curated jobs in commercial open source startups. Find listed jobs here: https://www.ossjobs.dev/
Responsible Tech Job Board
Are you looking to find your next career opportunity in responsible tech? All Tech Is Human’s Responsible Tech Job Board curates roles focused on reducing the harms of technology, diversifying the tech pipeline, and ensuring that technology is aligned with the public interest. It is the premier resource in the field for people exploring a career change, seeking a new role, or interested in the growing responsible tech ecosystem.
Diversify Tech Newsletter
Diversify Tech aims to connect underrepresented folks in tech with career opportunities. Get events, scholarships, and job opportunities from vetted companies in your inbox every week by subscribing to their free newsletter.
Call for Suggestions
Do you have suggestions for future webinar topics or speakers? Would you like to speak on a topic? For these and any other suggestions, please complete our Online Suggestion Box.
Call for Contributors
There are various ways to get involved in open source and community. Here are some projects, at all different levels. Contact us if you would like to contribute to any of these projects.
Call for Volunteers: Video Timestamps
We are looking for assistance in adding video timestamps.
We have instructions on how you can contribute to this project on GitHub. Help us help the community. Pick a video and get started!
Data Umbrella Impact
Would you like to share the impact that Data Umbrella events and resources have had on your data journey? Send it to us (info@dataumbrella.org), and we will feature it on our Impact Page.
Upcoming Events (free & online webinars)
Lessons from COVID-19: Non-random Missing Data and Its Consequences
August 8, 2023
A fundamental challenge for survey and observational datasets is that not all records in the dataset are complete; key pieces of information may be missing. In this talk, we work through the models and methods from the paper "Modeling Racial/Ethnic Differences in COVID-19 Incidence with Covariates Subject to Non-random Missingness".
The authors write:
In emergency situations, such as a surging pandemic, it is easy to see how the disease process itself may induce non-random missingness of covariates. For example, during a period of rapidly increasing caseloads, such as the Delta and Omicron surges of the COVID-19 pandemic, the overwhelming number of cases is likely to limit the ability of case investigators to collect data that are as detailed as those collected during lower-incidence periods. These differences may also be more pronounced when comparing wealthier and poorer jurisdictions with differential resources for case-finding and intervention.
Using the Stan language and CmdStanR interface, together with a simulated dataset of COVID-19 cases and population demographics, where age, gender, race/ethnicity, and neighborhood have varying degrees of missingness, we will demonstrate how different approaches produce different estimates of COVID-19 prevalence among key demographics.
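To give a feel for why the handling of missing data matters, here is a minimal sketch in plain Python. It is not the paper's model and it does not use Stan or CmdStanR. The two groups, their incidence rates, and the missingness probabilities are all made-up numbers, chosen only so that the group label goes missing more often for one group than the other. The sketch compares two simple approaches, dropping cases with a missing label and reallocating them in proportion to the labelled cases, against the simulated truth.

```python
import random

# Hypothetical values for illustration only; none come from the paper or the talk.
TRUE_RATE = {"A": 0.05, "B": 0.10}   # assumed true incidence per group
P_MISSING = {"A": 0.10, "B": 0.40}   # assumed chance a case's group label is not recorded


def simulate_cases(n_per_group=50_000, seed=0):
    """Return a list of (true_group, recorded_group) pairs, one per case.

    recorded_group is None when the label is missing. Because the chance of
    missingness depends on the group itself, the missingness is non-random.
    """
    rng = random.Random(seed)
    cases = []
    for group, rate in TRUE_RATE.items():
        for _ in range(n_per_group):
            if rng.random() < rate:  # this person is a case
                recorded = None if rng.random() < P_MISSING[group] else group
                cases.append((group, recorded))
    return cases


def estimate_rates(cases, n_per_group):
    """Return (truth, complete_case, reallocated) incidence estimates per group."""
    groups = tuple(TRUE_RATE)
    true_counts = {g: sum(1 for t, _ in cases if t == g) for g in groups}
    observed = {g: sum(1 for _, r in cases if r == g) for g in groups}
    n_missing = sum(1 for _, r in cases if r is None)
    n_labelled = sum(observed.values())

    truth = {g: true_counts[g] / n_per_group for g in groups}
    # Approach 1: drop every case whose group label is missing.
    complete_case = {g: observed[g] / n_per_group for g in groups}
    # Approach 2: spread unlabelled cases across groups in proportion to the
    # labelled cases, which implicitly assumes the labels are missing at random.
    reallocated = {
        g: (observed[g] + n_missing * observed[g] / n_labelled) / n_per_group
        for g in groups
    }
    return truth, complete_case, reallocated


if __name__ == "__main__":
    n = 50_000
    truth, complete_case, reallocated = estimate_rates(simulate_cases(n), n)
    for g in truth:
        print(g, round(truth[g], 4), round(complete_case[g], 4), round(reallocated[g], 4))
```

Under these assumptions, dropping incomplete records understates incidence in both groups, and proportional reallocation still understates it for the group whose labels go missing more often while overstating it for the other. Correcting for that requires modeling the missingness itself, which is the subject of the talk.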
Blockchains for Open Source Solutions
August 15, 2023
Blockchain as Open Source: A Case for Africa
We will discuss the transformative potential of blockchain, from driving funding access and operational efficiency with smart contracts, to fostering collaboration and restoring trust. Learn how supportive ecosystems are crucial for blockchain initiatives, discover standout partnerships and investments, and understand what we need to do next to harness blockchain's full potential.
Recent Events
In case you missed our recent events, the videos have been posted. Subscribe to our Data Umbrella YouTube channel to receive notifications when the videos premiere.
Reproducible Publications with Python and Quarto
Quarto is an open-source scientific and technical publishing system that builds on standard markdown with features essential for scientific communication. The system has support for reproducible embedded computations, equations, citations, crossrefs, figure panels, callouts, advanced layouts, and more. In this talk we'll explore the use of Quarto with Python, describing both integration with IPython/Jupyter and the Quarto VS Code extension. Users can author Jupyter notebooks or plain-text markdown documents with code in Python, R, Julia, or Observable. Quarto includes the ability to publish high-quality articles, reports, presentations, websites, blogs, and books in HTML, PDF, MS Word, ePub, Reveal.js, and more.
Machine Learning Visualization Using Yellowbrick
Yellowbrick is a very convenient Python library for creating visualizers within the machine learning workflow. With just a few lines of code, a wide range of visualizers can be generated for various stages of the ML process, paying homage to the adage “a picture is worth a thousand words”. In this session, we briefly introduce the Yellowbrick library and explore the range of visualizers. Then we delve into a practical machine learning classification problem and create classification visualizers using the Yellowbrick library.
Events Board: Live Coding Sessions
Data Umbrella has an open source project, the Data Events Board. We have held working sessions where community members learn to contribute to the project and build their professional resumes.
These events are run by Naj N of Program Equity.
Live Coding Session #3: Codespaces
Live Coding Session #2: debugging dev environment setup
Live Coding Session #1: environment setup
Community Talks
Community chat: with Beryl Kanali
Check out our discussion with Beryl on what makes a community. There is a full transcript available as well.
Featured Resources
Video Playlists
Data Umbrella Resources
Visit our blog site: blog.dataumbrella.org, and see articles written by our community members on their experience in recent sprints.
We have a Job Board. You can post jobs (for free), search jobs, and subscribe to a weekly update of new postings.
Our Data Umbrella YouTube channel is growing! Subscribe to receive notifications when our event videos are posted.
Accessibility Corner
Accessibility Update: Closed Captioning
Our webinars have closed captioning available! This feature makes our live events more accessible to those with hearing needs, and to anyone who likes to follow the live transcript during a presentation to fully process the information.
Connect with Us
dataumbrella.org (*resources*)
Meetup: Data Umbrella & Data Umbrella Africa (*upcoming events*)
YouTube (*past recorded talks*)
Twitter: @DataUmbrella