[2023.06] Data Umbrella Newsletter: June 2023
Data Umbrella is a non-profit global community for underrepresented persons in data science. We organize online data science events for the community. All levels are welcome. Our Code of Conduct applies to all of our spaces.
Announcements
If you are interested in getting your resume reviewed and receiving valuable feedback, join the MLOps Discord server (https://discord.gg/Ud2jRWaKhu), and check out their #career-advice channel. Members can collaborate, share insights, and refine their professional documents to enhance their job prospects.
Call for Suggestions
Do you have suggestions for future webinar topics or speakers? Would you like to speak on a topic? For these and any other suggestions, please complete our Online Suggestion Box.
Call for Contributors
There are many ways to get involved in open source and the community, with projects at all levels. Contact us if you would like to contribute to any of these projects.
Call for Speakers
We are looking for speakers on the following topics. If you or someone you know can speak on these topics, please email us: info@dataumbrella.org
Open Source Literacy (history, challenges, education or other related topics)
How to Debug in Python
Data Umbrella Impact
Would you like to share the impact that Data Umbrella events and resources have had on your data journey? Send it to us (info@dataumbrella.org), and we will feature it on our Impact Page.
Upcoming Events (free & online webinars)
An Introduction to ImageIO
June 6, 2023
ImageIO is a popular Python library that offers powerful capabilities for reading and writing images and videos. This webinar will explore common use cases, best practices, and anti-patterns when using the library. We will also build intuition for ImageIO's v3 API, which will allow us to solve complex and non-standard problems. We will then take a brief look at ImageIO's plugin system, which, among other things, allows tweaking I/O performance. To top it all off, we will dive into an end-to-end machine-learning example of how to use ImageIO to train a vision model.
Intro to FluxML and Machine Learning in Julia
June 14, 2023
Julia is a high-level, general-purpose dynamic programming language. Its features are well suited for numerical analysis and computational science. If you have experience programming in another language, you will find that most of your knowledge transfers easily to Julia. Refer to the documentation for some noteworthy differences from other popular languages: MATLAB, R, Python, and C/C++.
Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable. Learn about the basics of machine learning with FluxML in Julia, how it differs from other frameworks, and why it matters.
Recent Events
In case you missed our recent events, the videos have been posted. Subscribe to our Data Umbrella YouTube channel to receive notifications when new videos premiere.
Solving NLP (Natural Language Processing) Tasks Using LLMs (Large Language Models)
LLMs (Large Language Models) have changed what it means to "do natural language processing", opening up the field to newcomers and beginners. This talk will present some traditional NLP tasks and how they can be tackled using prompting and tools such as GPT-4. Discussion of open source tools (like OpenChatKit) will also be included.
Create a Python Web App Using Shiny
Shiny makes it easy to build interactive web applications with the power of Python's data and scientific stack. If you want to develop a Python web application, you usually need to choose between simple, limited frameworks like Streamlit and more extensible frameworks like Dash. This can cause a lot of problems if you start with a simple framework but then discover that you need to refactor your application to accommodate the next user request. Shiny for Python differs from other frameworks because it has tremendous range. You can build a small application in a few minutes with the confidence that the framework can handle much more complex problems. In this workshop, we will go through the core limitations of Streamlit and build a Shiny app that avoids those limitations.
MLOps: from Concept to Product
Finding ways to use your data to solve a problem is a great first step, but it must set in motion a process for moving from a Proof of Concept (POC) to a feature or product. Products are meant to be used by users who have expectations about their performance, reliability, and usability. This process is guided by MLOps practices. In this talk, we will explore what that really means and how you could start applying these practices in real-world scenarios.
Featured Resources
Video Playlists
Data Umbrella Resources
Visit our blog site: blog.dataumbrella.org, and see articles written by our community members on their experience in recent sprints.
We have a Job Board. You can post jobs (for free), search jobs, and subscribe to a weekly update of new postings.
Our Data Umbrella YouTube channel is growing! Subscribe to receive notifications when our event videos are posted.
Accessibility Corner
Accessibility Update: Closed Captioning
Our webinars have closed captioning available! This feature makes our live events more accessible to those with hearing needs and to anyone who likes to follow a live transcript during the presentation to fully process the information.
Connect with Us
dataumbrella.org (*resources*)
Meetup: Data Umbrella & Data Umbrella Africa (*upcoming events*)
YouTube (*past recorded talks*)
Twitter: @DataUmbrella