Data Umbrella Newsletter: June 2022
We organize data science events for the community, with a focus on underrepresented persons.
About Data Umbrella
Data Umbrella is a non-profit global community for underrepresented persons in data science. We organize online data science events for the community. All levels are welcome, beginners and experts. Our Code of Conduct applies to all of our spaces.
Announcements
Blogs
Check out our latest blogs:
5 Years, 10 Sprints, a scikit-learn Open Source Journey – Data Umbrella
The Value of Open Source Sprints, the scikit-learn Experience
Community Blogs
Call for Speakers
We are looking for speakers on the following topics. If you or someone you know can speak on these topics, please email us: info@dataumbrella.org
Software Testing in PyData Open Source
Open Source Literacy (history, challenges, education or other related topics)
How to Debug in Python
Data Umbrella Impact
Would you like to share an impact that Data Umbrella events and resources have had in your data journey? Send it to us (info@dataumbrella.org), and we will feature it on our Impact Page.
Upcoming Events
Intro to Django
This talk will introduce attendees to Django, a free and open source Python-based web framework that encourages rapid development using software engineering best practices. We will start by covering the basic concepts of web development and learning how software frameworks like Django remove much of the hassle of building web applications and allow developers to focus on writing code without needing to reinvent the wheel. The presentation will demonstrate how Django enables development of secure and maintainable software by scaffolding a simple dynamic website from the ground up. We will then take a look at a real-world open source project built with Django and see an example of a web application released in a production environment. This talk will end with a recap of what we've learned together and recommendations for getting started with web development with Django.
Short Stories of Data Visualization
Ever wondered how people create all those nice data visualizations out there? How could you start making some too? In my talk I will present some of my contributions to the #30DayChartChallenge as short stories. I will focus on where I got my ideas, what my process was, which tools I used, and some best practices I learned. Hopefully after this talk you will be motivated to get started with data visualization and participate in a challenge.
Community Events
All Things Open
Save the date. All Things Open is October 30 to November 10, 2022. ATO is one of the largest open source events, and there will be in-person and virtual options.
Recent Events
In case you missed our recent events, the videos have been posted. Subscribe to our Data Umbrella YouTube channel to receive notifications when the videos premiere.
Reshama Shaikh: 5 Years, 10 Sprints, A scikit-learn Open Source Journey (Keynote)
We all use open source tools in various capacities, yet how to contribute to open source is less well known and less accessible. The limited knowledge and education surrounding contributing to open source could be one explanation for the low participation rates of underrepresented persons in open source. Open source sprints are hands-on “workshops” or “hackathons” where contributors collaborate to resolve coding and documentation issues posted on a GitHub repository.
Reshama Shaikh shares how she organized her first open source sprint in 2017, which was held in person in New York City. Over the next 5 years, she organized in-person sprints from San Francisco, USA to Nairobi, Kenya, and pivoted to online sprints due to the global pandemic. In this keynote, she shares highlights, challenges and lessons learned (https://www.dataumbrella.org/sprints).
Contributing to SciPy (Melissa Weber Mendonça)
SciPy is one of the foundational libraries in the PyData/Scientific Python stack, and is a popular tool for scientists, data scientists, industry professionals and students. It builds upon the NumPy array structures and provides algorithms for optimization, integration, interpolation, eigenvalue problems, algebraic equations, differential equations, statistics and many other classes of problems. In this talk, we will walk through the steps necessary to contribute to SciPy, including code, documentation, triaging issues and pull request reviews. We will also talk about community and the different ways you can interact with SciPy maintainers.
Intro to PyTorch (Sebastian Raschka)
This talk will introduce attendees to using PyTorch for deep learning. We will start by covering PyTorch from the ground up and learn how it can be both powerful and convenient.
Scaling Up with LightningLite (Adrian Wälchli)
At times, machine learning models can become so large that they can't be trained on a notebook anymore. Being able to take advantage of AI-optimized accelerators such as GPUs or TPUs and to scale the training of models to hundreds of these devices is essential to the researcher and data scientist. However, adding support for one or several of these in the source code can be complex, time-consuming and error-prone. What starts as a fun research project ends up being an engineering problem with hard-to-debug code. This talk will introduce LightningLite, an open source library that removes this burden completely. You will learn how you can accelerate your PyTorch training script in just under ten lines of code to take advantage of multi-GPU, TPU, multi-node, mixed-precision training and more.
Q&A: Sebastian Raschka & Adrian Wälchli (PyTorch, LightningLite)
Featured Resources
Video Playlists
Highlighted Resource
Data Umbrella Team
In this section, we share content from our team.
Blog
Why Women Are Flourishing In R Community But Lagging In Python
From the Vault
Here we share one of our favorite and most impactful videos in open source:
Supporting Data Umbrella
Data Umbrella is on Benevity. If your company uses Benevity, which is a donation platform for employer-matching contributions to non-profits, please consider making a contribution to Data Umbrella. Note: this link is active for registered users of Benevity: Data Umbrella on Benevity
For users not on Benevity, donations can be made directly to the Open Collective.
Supporters can donate company stock to Data Umbrella Open Collective. Normally when you sell stock, you have to pay capital gains tax. But if you donate it to a tax-exempt nonprofit, it can be sold tax-free. So everyone wins: the donor gets tax benefits, and the Collective gets a donation to support its mission. Contact us for more information: info@dataumbrella.org
Data Umbrella Resources
Visit our blog site: blog.dataumbrella.org, and see articles written by our community members on their experience in recent sprints.
We have a Job Board. You can: post jobs (for free), search jobs, subscribe to a weekly update to see postings.
Our Data Umbrella YouTube channel is growing! Subscribe to receive notifications when our event videos are posted.
Accessibility Corner
Accessibility Update: Closed Captioning
Our webinars have closed captioning available! This feature makes our live events more accessible to those with hearing needs and to anyone who likes to follow the live transcript during a presentation to fully process the information.
Connect with Us
dataumbrella.org (*resources*)
Meetup: Data Umbrella & Data Umbrella Africa (*upcoming events*)
YouTube (*past recorded talks*)
Twitter: @DataUmbrella