Summary statistics gives you the tools you need to boil down massive datasets to reveal the highlights. In this chapter, you'll explore summary statistics including mean, median, and standard deviation, and learn how to accurately interpret them. You'll also develop your critical thinking skills, allowing you to choose the best summary statistics for your data. This is the Summary of lecture "Introduction to Statistics in Python", via datacamp.
Neural Information Processing Systems (NIPS) is one of the top machine learning conferences in the world where groundbreaking work is published. In this Project, you will analyze a large collection of NIPS research papers from the past decade to discover the latest trends in machine learning. The techniques used here to handle large amounts of data can be applied to other text datasets as well. This is the Result of Project "The Hottest Topics in Machine Learning", via datacamp.
Since the 1930s, Walt Disney Studios has released more than 600 films covering a wide range of genres. While some movies are indeed directed towards kids, many are intended for a broad audience. In this project, you will analyze data to see how Disney movies have changed in popularity since its first movie release. You will also perform hypothesis testing to see what aspects of a movie contribute to its success. The dataset used in this project is a modified version of the Disney Character Success dataset from Kelly Garrett. This is the Result of Project "Disney Movies and Box Office Success", via datacamp.
Tweets are notoriously difficult, as they are shorter than most texts and usually have hard-to-parse content like hashtags, mentions, links and emoji. Despite the difficulties, tweets are fun content, so in this notebook we'll take a look at classifying two prominent North American politicians. Can we determine if it is Donald Trump or Justin Trudeau based on just a tweet? This is the Result of Project "Who's Tweeting? Trump or Trudeau?", via datacamp.
While I'm not a fan nor a hater of the Kardashians and Jenners, the polarizing family intrigues me. Why? Their marketing prowess. Say what you will about them and what they stand for, they are great at the hype game. Everything they touch turns to content. In this Project, you will explore the data underneath the hype in the form of search interest data from Google Trends. You'll recreate the Google Trends plot to visualize their ups and downs over time, then make a few custom plots of your own. And you'll answer the big question - "is Kim even the most famous sister anymore?" This is the Result of Project "Up and Down With the Kardashians", via datacamp.
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tought to interpret those ingredient lists unless you have a background in chemistry.
The Rebrickable database includes data on every LEGO set that ever been sold; the names of the sets, what bricks they contain, what color the bricks are, etc. It might be small bricks, but this is big data! In this project, you will get to explore the Rebrickable database. To do this you need to know your way around pandas dataframes This is the Result of Project "Exploring 67 years of LEGO", via datacamp.
Cut image processing to the bone by transforming x-ray images. You'll learn how to exploit intensity patterns to select sub-regions of an array, and you'll use convolutional filters to detect interesting features. You'll also use SciPy's ndimage module, which contains a treasure trove of image processing tools. This is the Summary of lecture "Biomedical Image Analysis in Python", via datacamp.
Can a machine distinguish between a honey bee and a bumble bee? Being able to identify bee species from images, while challenging, would allow researchers to more quickly and effectively collect field data. In this Project, you will use the Python image library Pillow to load and manipulate image data. You'll learn common transformations of images and how to build them into a pipeline. This is the Result of Project "Naive Bees Image Loading and Processing", via datacamp.
Prepare to conquer the Nth dimension! To begin the course, you'll learn how to load, build and navigate N-dimensional images using a CT image of the human chest. You'll also leverage the useful ImageIO package and hone your NumPy and matplotlib skills. This is the Summary of lecture "Biomedical Image Analysis in Python", via datacamp.
To better understand the growth and impact of Bitcoin and other cryptocurrencies you will, in this project, explore the market capitalization of different cryptocurrencies. This is the Result of Project "Exploring the Bitcoin Cryptocurrency Market", via datacamp.
Time to bring everything together and build some models! In this last chapter, you will build a base model before tuning some hyperparameters and improving your results with ensembles. You will then get some final tips and tricks to help you compete more efficiently. This is the Summary of lecture "Winning a Kaggle Competition in Python", via datacamp.
You will now get exposure to different types of features. You will modify existing features and create new ones. Also, you will treat the missing data accordingly. This is the Summary of lecture "Winning a Kaggle Competition in Python", via datacamp.
Now that you know the basics of Kaggle competitions, you will learn how to study the specific problem at hand. You will practice EDA and get to establish correct local validation strategies. You will also learn about data leakage. This is the Summary of lecture "Winning a Kaggle Competition in Python", via datacamp.