[2023.07] Data Umbrella Newsletter: July 2023
Data Umbrella is a non-profit global community for underrepresented persons in data science. We organize online data science events for the community. All levels are welcome. Our Code of Conduct applies to all of our spaces.
Announcements
Timestamps
We would like to thank Daniel Boadzie and Divine Ediebah for their contributions to Data Umbrella: they added timestamps to our YouTube videos "MLOps: from Concept to Product" and "Create a Python Web App Using Shiny".
Call for Volunteers: Video Timestamps
We are looking for assistance in adding video timestamps.
We have instructions on how you can contribute to this project on GitHub. Help us help the community. Pick a video and get started!
Resource: Imposter Syndrome
We hear a lot about “imposter syndrome” in our field. But what can we do about it? We found this gem of a podcast episode on Building Confidence.
How comfortable are we with our own presence? In this conversation with Selena Rezvani, we explore what it means to “right-size” your confidence, how to get the body on board with feeling more powerful, and what confidence can do for your path ahead.
Call for Suggestions
Do you have suggestions for future webinar topics or speakers? Would you like to speak on a topic? For these and any other suggestions, please complete our Online Suggestion Box.
Call for Contributors
There are various ways to get involved in open source and community. Here are some projects, at all different levels. Contact us if you would like to contribute to any of these projects.
Call for Speakers
We are looking for speakers on the following topics. If you or someone you know can speak on these topics, please email us: info@dataumbrella.org
Open Source Literacy (history, challenges, education or other related topics)
How to Debug in Python
Data Umbrella Impact
Would you like to share an impact that Data Umbrella events and resources have had in your data journey? Send it to us (info@dataumbrella.org), and we will feature it on our Impact Page.
Upcoming Events (free & online webinars)
Reproducible Publications with Python and Quarto
July 11, 2023
Quarto is an open-source scientific and technical publishing system that builds on standard markdown with features essential for scientific communication. The system has support for reproducible embedded computations, equations, citations, cross-references, figure panels, callouts, advanced layouts, and more. In this talk we'll explore the use of Quarto with Python, describing both integration with IPython/Jupyter and the Quarto VS Code extension. Users can author Jupyter notebooks or plain-text markdown documents with code in Python, R, Julia or Observable. Quarto includes the ability to publish high-quality articles, reports, presentations, websites, blogs, and books in HTML, PDF, MS Word, ePub, Reveal.js and more.
Machine learning visualization using Yellowbrick
July 25, 2023
Yellowbrick is a convenient Python library for creating visualizers within the machine learning workflow. With just a few lines of code, a wide range of visualizers can be generated for various stages of the ML process, paying homage to the adage “a picture is worth a thousand words”. In this session, we shall briefly introduce the Yellowbrick library and explore the range of visualizers. Then we shall delve into a practical machine learning classification problem and create classification visualizers using the Yellowbrick library. Save the date for this insightful session!
Recent Events
In case you missed our recent events, the videos have been posted. Subscribe to our Data Umbrella YouTube to receive notifications when the videos premiere.
Intro to FluxML and Machine Learning in Julia
Julia is a high-level, general-purpose dynamic programming language. Its features are well suited for numerical analysis and computational science. If you have experience programming in another language, you will find that most of your knowledge will be easily transferred to Julia.
Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD (automatic differentiation) support. Flux makes the easy things easy while remaining fully hackable. This presentation will introduce the basics of machine learning with FluxML in Julia, how it differs from other frameworks, and why it matters.
An Introduction to ImageIO
Get ready for an exciting and informative introduction to ImageIO! ImageIO is a popular Python library that offers powerful capabilities for reading and writing images and videos. Together, we'll explore common use cases, best practices, and anti-patterns when using the library. We'll also build intuition for ImageIO’s v3 API, which will allow us to solve complex and non-standard problems. Plus, we'll take a brief look at ImageIO's plugin system, which – among other things – allows tweaking I/O performance. And to top it all off, we'll dive into an end-to-end machine-learning example of how to use ImageIO to train a vision model. So don't miss this chance to expand your knowledge and improve your Python skills with ImageIO.
From Vicuna to human-aligned evaluation
Vicuna is an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. It is an auto-regressive language model, based on the transformer architecture.
We will talk about the training and deployment experiences of Vicuna, which is a high-quality chat assistant. After training the models, we found evaluating them even more difficult, so we launched Chatbot Arena, a crowd-sourced benchmarking platform featuring randomized battles. However, relying on human evaluation is costly and slow, so we study whether we can replace human evaluators with strong LLMs like GPT-4 for evaluating these models. We termed this approach “LLM-as-a-judge”.
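One simple way to check whether an LLM judge could stand in for human evaluators is to measure how often the two agree on the same pairwise battles. The sketch below is an illustration only, not the speakers' method: the vote lists are hypothetical, and `agreement_rate` is a helper name invented here.

```python
from collections import Counter

# Hypothetical verdicts for six chatbot "battles": which model won, or a tie.
human_votes = ["A", "B", "A", "tie", "B", "A"]
judge_votes = ["A", "B", "B", "tie", "B", "A"]  # e.g. verdicts from an LLM judge


def agreement_rate(human: list[str], judge: list[str]) -> float:
    """Fraction of battles where the judge's verdict matches the human's."""
    if len(human) != len(judge):
        raise ValueError("vote lists must have the same length")
    if not human:
        raise ValueError("vote lists must not be empty")
    matches = sum(h == j for h, j in zip(human, judge))
    return matches / len(human)


rate = agreement_rate(human_votes, judge_votes)

# Count each kind of disagreement as (human verdict, judge verdict).
disagreements = Counter(
    (h, j) for h, j in zip(human_votes, judge_votes) if h != j
)

print(f"agreement: {rate:.2f}")
print(dict(disagreements))
```

In practice such a comparison would use many more battles, and it helps to compare the judge's agreement with humans against the agreement between two independent human raters.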
Featured Resources
Video Playlists
Data Umbrella Resources
Visit our blog at blog.dataumbrella.org to read articles written by our community members about their experiences in recent sprints.
We have a Job Board. You can post jobs (for free), search jobs, and subscribe to a weekly update of new postings.
Our Data Umbrella YouTube is growing! Subscribe to our channel to receive notifications when our event videos are posted.
Accessibility Corner
Accessibility Update: Closed Captioning
Our webinars have closed captioning available! This feature makes our live events more accessible to those with hearing needs and to anyone who likes to follow the live transcript during the presentation to fully process information.
Connect with Us
dataumbrella.org (*resources*)
Meetup: Data Umbrella & Data Umbrella Africa (*upcoming events*)
YouTube (*past recorded talks*)
Twitter: @DataUmbrella