Effective Pandas will teach you the foundational skills of Pandas to be effective with your data. Open the doors to advanced analysis, visualization, machine learning, and more!
Have you ever seen a real master in Excel? 🧙
They can do amazing things quickly! 🍔
Sometimes Excel is excellent for whipping something up fast.⏱
However, like fast food, Excel is not flexible.💪
And if you need to change your results or collaborate with others ... good luck! 🤞
With Python and Pandas, you get speed, scale, and flexibility! 🚅📈🐼
This course will teach you how to be productive with Pandas. ⚒
It will teach best practices gleaned over many years of Pandas usage.👩‍🏫
You can scale this from simple exploration to deployment! ⚙
Pandas can be very tricky. Being familiar with Python is no guarantee that Pandas will click with you. Learning Pandas takes time, and you are busy.
What's in it for me? Can't I just write a Python script, write some SQL, or use Excel? Those are quick and easy.
After debugging your results and pulling out a bit of hair, you think, "perhaps there is a better way?"
Your time is valuable, and I don't want to waste it. This material packs my findings after using Pandas in production, writing a few books about Pandas, and teaching Pandas to thousands through live and virtual trainings.
In the real world, data is messy. Often it doesn't come in a format that gives you everything on a silver platter. Using real datasets, we will learn how to load data from common formats. We will explain and demo how to read CSV and Excel files. We will also show how to read from and write to databases.
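To give a taste of what loading looks like, here is a minimal sketch. The inline CSV text, the `weather` table name, and the column names are made up for illustration, and an in-memory SQLite database stands in for whatever database you actually use.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path instead.
csv_text = """city,temp_c
Oslo,4.5
Lima,19.0
Cairo,28.5
"""

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Round-trip through an in-memory SQLite database (illustrative only).
conn = sqlite3.connect(":memory:")
df.to_sql("weather", conn, index=False)
warm = pd.read_sql(
    "SELECT city, temp_c FROM weather WHERE temp_c > 10 ORDER BY temp_c",
    conn,
)
conn.close()

print(warm)
```

Excel files follow the same pattern with `pd.read_excel`, which relies on an optional engine package being installed.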
Getting your data to load is not enough. We will start exploring the data, diving into types, shapes, memory usage, and more. Jupyter also plays a role here, and we will show how to adjust settings to view more data.
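A first look at a freshly loaded table might go something like the sketch below. The tiny DataFrame and the option values are placeholders picked for illustration, not recommendations.

```python
import pandas as pd

# Small made-up frame standing in for a real dataset.
df = pd.DataFrame(
    {
        "name": ["a", "b", "c"],
        "count": [1, 2, 3],
        "price": [1.5, 2.5, 3.5],
    }
)

print(df.shape)                     # (rows, columns)
print(df.dtypes)                    # type of each column
print(df.memory_usage(deep=True))   # bytes per column, including string contents

# Ask pandas (and therefore Jupyter) to display more columns and rows.
pd.set_option("display.max_columns", 50)
pd.set_option("display.max_rows", 100)
```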
Big Data Borat claimed that Data Scientists spend 80% of their time cleaning up the data. We will show common manipulations for numeric, string, and date types. We will discuss optimized operations that get your code to run 10-20 times faster while still retaining that easy-to-read quality. Big Data Borat is going to be upset!
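Here is a small sketch of the kind of cleanup this covers, using column-wide (vectorized) operations instead of Python loops. The messy sample values and column names are invented for illustration, and the sketch makes no claim about how much faster this runs on your data.

```python
import pandas as pd

# Made-up messy input: padded names, dates as text, a non-numeric score.
raw = pd.DataFrame(
    {
        "name": ["  Alice ", "BOB", "carol"],
        "joined": ["2021-01-15", "2021-02-20", "2021-03-05"],
        "score": ["10", "20", "n/a"],
    }
)

# Each step operates on a whole column at once.
clean = raw.assign(
    name=raw["name"].str.strip().str.title(),
    joined=pd.to_datetime(raw["joined"]),
    score=pd.to_numeric(raw["score"], errors="coerce"),  # "n/a" becomes NaN
)

print(clean)
print(clean.dtypes)
```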
To understand your data, you are going to need to dig into it. Do you want to really grok your data? If you do, this module will be a treat for you. We will look into understanding numeric and categorical data. Also, we will discuss how to quantify the relation between columns. You will start to feel like your data is a familiar book, dog-eared, and underlined!
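As a rough sketch of what that looks like in code, the example below summarizes a numeric column, counts a categorical one, and computes a correlation between two columns. The data is made up for illustration.

```python
import pandas as pd

# Invented sample: study hours, test scores, and a group label.
df = pd.DataFrame(
    {
        "hours": [1, 2, 3, 4, 5],
        "score": [52, 55, 61, 64, 70],
        "group": ["a", "b", "a", "b", "a"],
    }
)

summary = df["score"].describe()        # count, mean, std, quartiles, ...
counts = df["group"].value_counts()     # frequency of each category
corr = df["hours"].corr(df["score"])    # Pearson correlation by default

print(summary)
print(counts)
print(corr)
```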
Summarizing your data with numbers is one way to understand it. However, we will not end there. It has been said that a picture is worth a thousand words, which holds for data as well. When you master visualization, you will be able to picture what relationships really look like. And you will have insight into your data that you wouldn't have even thought about until you saw it. In this case, seeing is believing.
After you understand your data, you will want to dig into it. You will want to slice and dice it. Wrapping your head around this can be really confusing. However, we will walk through real-world examples and help you understand the options that you have. Faster than a speeding SQL query, you will have the data that you want.
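To make the options a little more concrete, here is a minimal sketch of three common ways to pull out a subset. The sales data is hypothetical.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Cairo", "Oslo"],
        "year": [2020, 2021, 2021, 2022],
        "sales": [90, 120, 80, 150],
    }
)

# Boolean mask with .loc: choose rows and columns in one step.
big = df.loc[(df["year"] >= 2021) & (df["sales"] > 100), ["city", "sales"]]

# The same filter written as a query string.
same = df.query("year >= 2021 and sales > 100")[["city", "sales"]]

# Position-based slicing: the first two rows.
first_two = df.iloc[:2]

print(big)
```

The mask and the query string select the same rows here; which one reads better is largely a matter of taste.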
Masters of Excel know pivot tables like the back of their hand. They can quickly whip up summaries along multiple dimensions. We will show you how to one-up them in Pandas. Pandas has powerful aggregation tooling for summarizing along any axis you can think of. Of course, you can always export the result to a spreadsheet afterward if your boss really wants one.
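As a small sketch of the idea, the example below computes a grouped total and an Excel-style pivot. The regions, quarters, and revenue figures are invented for illustration.

```python
import pandas as pd

# Invented revenue data.
sales = pd.DataFrame(
    {
        "region": ["east", "east", "west", "west"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "revenue": [100, 150, 80, 120],
    }
)

# One dimension: total revenue per region.
by_region = sales.groupby("region")["revenue"].sum()

# Two dimensions: regions down the side, quarters across the top.
pivot = sales.pivot_table(
    index="region", columns="quarter", values="revenue", aggfunc="sum"
)

print(by_region)
print(pivot)

# For the boss: pivot.to_excel("summary.xlsx") writes a spreadsheet
# (hypothetical file name; requires an optional Excel engine package).
```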
Data doesn't live in isolation. It has friends and enemies. Sometimes these need to get together and have a pow-wow. We will demonstrate how to do this in Pandas as well as show some of the situations you might want to look out for.
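Here is a minimal sketch of bringing two tables together while keeping an eye out for trouble. The `orders` and `customers` tables are made up; the `indicator` and `validate` arguments are two of the safety checks `merge` offers.

```python
import pandas as pd

# Made-up tables: order 3 refers to a customer we have no record of.
orders = pd.DataFrame({"order_id": [1, 2, 3], "customer_id": [10, 20, 30]})
customers = pd.DataFrame(
    {"customer_id": [10, 20, 40], "name": ["Ann", "Ben", "Dee"]}
)

merged = orders.merge(
    customers,
    on="customer_id",
    how="left",              # keep every order, matched or not
    indicator=True,          # adds a _merge column showing where each row came from
    validate="many_to_one",  # raise if customer_id is duplicated in customers
)

# Orders that found no matching customer.
unmatched = merged[merged["_merge"] == "left_only"]

print(merged)
print(unmatched)
```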
A high-resolution, full-color flashcard for each module. These flashcards demonstrate best practices and gotchas you might run into.
Yoda once said, "The greatest teacher, failure is." I don't necessarily want you to fail, but you will run into issues when you start using Pandas. After teaching thousands, I can guarantee it. I flip this around and view it as a positive. You will be forced to overcome these issues with the assignments. And they use real-world data! Practice makes perfect!
My students often tell me that solution walkthroughs are the most valuable part of the course. You've had a chance to work on the material, it is fresh in your mind, and you have used a different part of your brain by working on the labs. These walkthroughs will open your mind to solving problems that you run into. You will see how to compose tools to build solutions. And you will see how I encounter failure and deal with it. Highly recommended after you finish the labs!
This is your go-to reference. Print this full-color PDF out and keep it handy for when you can't remember which construct you need to use.
Perfect if you want to learn the basics through just video, not practice and exercises
Standard version is for those who want to commit to mastery through practice and exercises
Professional version is for those who want to commit to mastery through practice and exercises and want live help
Matt is the author of the Pandas 1.x Cookbook, Machine Learning Pocket Reference, the best-selling Illustrated Guide to Python 3, and Learning the Pandas Library, as well as other books.
He runs MetaSnake, a Python and Data Science consultancy and corporate training shop. In the past, he has worked across the domains of search, build management and testing, business intelligence, and storage.
He has presented and taught tutorials at conferences such as Strata, SciPy, SCALE, PyCon, USENIX, and OSCON, as well as local user conferences.