Newsletter: May 2021
Announcements
Data Umbrella has been busy (thus, our first newsletter since January!). Please read on for upcoming events, past event recordings and resources.
Three Components for Reviewing a Pull Request
Thomas Fan, a core contributor of scikit-learn, will be presenting on reviewing pull requests in an open source project:
1. The mechanics of code review on GitHub.
2. The social aspects of code review and how to effectively give feedback.
3. The technical aspects of reviewing a pull request.
Scikit-learn Open Source Sprint (Latin America)
We have organized an open source sprint for Saturday, 26-Jun-2021, with a focus on the Latin America region. A sprint is a 4-hour online hackathon where data scientists / developers will work with a pair programming partner on a beginner-friendly issue in the scikit-learn repo. Some knowledge of Python, scikit-learn and machine learning is required. This sprint is an excellent opportunity to increase machine learning and Python skills, get mentorship from core developers of the library and get started in contributing to open source.
Full details are available here: https://latam2021.dataumbrella.org
Reminder: Read through the details and submit your application.
Python Backend Developer at Quansight (Remote)
We are pleased to be a Community Partner for NODES 2021, Neo4j's annual developer summit. Join us the week of June 7 for free virtual training sessions each day from 9am to 11am EDT.
Jun 7: Hands-on Introduction to Neo4j
Jun 8: Getting Started with Neo4j Aura
Jun 9: Getting Started with Neo4j Bloom
Jun 10: Getting Started with Neo4j + GraphQL
Jun 11: Building a Knowledge Graph with NLP
Please register on Neo4j's meetup group directly:
https://www.meetup.com/Neo4j-Online-Meetup/
NODES 2021 is a free 1-day developer conference. Join us on June 17 for this live, virtual, all-day event for beginners and experts alike. Whether you’re experiencing your first graph epiphany or back to learn best practices from the experts, there’s something for everyone.
Luisa Rebull: Astronomy Data & Image Processing
Did you know that there are many astronomy data archives, all publicly available? There is a ton of research-quality astronomy data available to you *right now*. You just need to know how to get access to it! In order to understand how astronomical images work, you need to know about what color images really are, so I spend time on that. I cover a few of the many ways that you can get access to real data, from citizen science web-based projects to FITS files. I cover a few basics of how to interpret astronomy images, and demonstrate how to get access to NASA’s Infrared Science Archive (IRSA).
Oriol Abril Pla: Bayesian Modeling with PyMC3
In this talk, we give an overview of PyMC3 and explain why you should supercharge your data science skills with probabilistic programming. The talk is organized in layers, from more generic to more specific: we cover the main features of the Bayesian paradigm, then probabilistic programming, then PyMC3, and finally some hands-on examples of Bayesian modeling with PyMC3.
Sam Bail: Wonderful World of Data Quality in Python
We survey the landscape of open source libraries for data quality, with examples such as pydqc, datagristle, bulwark, dvc, dedupe, and Great Expectations, some of the most popular open source Python packages for data validation and documentation.
Featured
Article: Data Umbrella AFME (Africa & Middle East) scikit-learn Sprint Report
Reshama Shaikh: 2021 State of PyLadies
2021 is the 10-YEAR Anniversary of PyLadies!
This article explores the current state of PyLadies, specifically data around chapters, locations and members.
Hands-on Project: Deploying a Deep Learning Model on Web and Mobile Applications (using TensorFlow)
Playlists
Connect with Us
dataumbrella.org (*resources*)
Meetup: Data Umbrella & Data Umbrella Africa (*upcoming events*)
YouTube (*past recorded talks*)
Twitter: @DataUmbrella & @DataUmbrellaAFR