Currently migrating to the Quarto platform. See the progress here :)

Posts

Introduction to Probability and Statistics

A summary of "Probability and Statistics in Data Science using Python", offered by UCSD (DSE210x)

Aug 30, 2020
Correlation and Experimental Design

In this chapter, you'll learn how to quantify the strength of a linear relationship between two variables, and explore how confounding variables can affect the relationship between two other variables. You'll also see how a study’s design can influence its results, change how the data should be analyzed, and potentially affect the reliability of your conclusions. This is a summary of the lecture "Introduction to Statistics in Python", via DataCamp.

Aug 28, 2020
More Distributions and the Central Limit Theorem

It’s time to explore one of the most important probability distributions in statistics, the normal distribution. You’ll create histograms to plot normal distributions and gain an understanding of the central limit theorem, before expanding your knowledge of statistical functions by adding the Poisson, exponential, and t-distributions to your repertoire. This is a summary of the lecture "Introduction to Statistics in Python", via DataCamp.

Aug 28, 2020
Random Numbers and Probability

In this chapter, you'll learn how to generate random samples and measure chance using probability. You'll work with real-world sales data to calculate the probability of a salesperson being successful. Finally, you’ll use the binomial distribution to model events with binary outcomes. This is a summary of the lecture "Introduction to Statistics in Python", via DataCamp.

Aug 26, 2020
Summary Statistics with Python

Summary statistics give you the tools you need to boil down massive datasets to reveal the highlights. In this chapter, you'll explore summary statistics including mean, median, and standard deviation, and learn how to accurately interpret them. You'll also develop your critical thinking skills, allowing you to choose the best summary statistics for your data. This is a summary of the lecture "Introduction to Statistics in Python", via DataCamp.

Aug 26, 2020
The Hottest Topics in Machine Learning

Neural Information Processing Systems (NIPS) is one of the top machine learning conferences in the world where groundbreaking work is published. In this project, you will analyze a large collection of NIPS research papers from the past decade to discover the latest trends in machine learning. The techniques used here to handle large amounts of data can be applied to other text datasets as well. This is the result of the project "The Hottest Topics in Machine Learning", via DataCamp.

Aug 24, 2020
Disney Movies and Box Office Success

Since the 1930s, Walt Disney Studios has released more than 600 films covering a wide range of genres. While some movies are indeed directed towards kids, many are intended for a broad audience. In this project, you will analyze data to see how Disney movies have changed in popularity since its first movie release. You will also perform hypothesis testing to see what aspects of a movie contribute to its success. The dataset used in this project is a modified version of the Disney Character Success dataset from Kelly Garrett. This is the result of the project "Disney Movies and Box Office Success", via DataCamp.

Aug 23, 2020
Analyze Your Runkeeper Fitness Data

Import, clean, and analyze seven years' worth of training data tracked on the Runkeeper app. This is the result of the project "Analyze Your Runkeeper Fitness Data", via DataCamp.

Aug 22, 2020
Who's Tweeting? Trump or Trudeau?

Tweets are notoriously difficult, as they are shorter than most texts and usually have hard-to-parse content like hashtags, mentions, links, and emoji. Despite the difficulties, tweets are fun content, so in this notebook we'll take a look at classifying tweets from two prominent North American politicians. Can we determine if it is Donald Trump or Justin Trudeau based on just a tweet? This is the result of the project "Who's Tweeting? Trump or Trudeau?", via DataCamp.

Aug 21, 2020
Up and Down With the Kardashians

While I'm neither a fan nor a hater of the Kardashians and Jenners, the polarizing family intrigues me. Why? Their marketing prowess. Say what you will about them and what they stand for, they are great at the hype game. Everything they touch turns to content. In this project, you will explore the data underneath the hype in the form of search interest data from Google Trends. You'll recreate the Google Trends plot to visualize their ups and downs over time, then make a few custom plots of your own. And you'll answer the big question: "Is Kim even the most famous sister anymore?" This is the result of the project "Up and Down With the Kardashians", via DataCamp.

Aug 20, 2020
Comparing Cosmetics by Ingredients

Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.

Aug 19, 2020
Exploring 67 years of LEGO

The Rebrickable database includes data on every LEGO set that has ever been sold: the names of the sets, what bricks they contain, what color the bricks are, etc. It might be small bricks, but this is big data! In this project, you will get to explore the Rebrickable database. To do this you need to know your way around pandas DataFrames. This is the result of the project "Exploring 67 years of LEGO", via DataCamp.

Aug 17, 2020
Masks and Filters in Biomedical Image Analysis

Cut image processing to the bone by transforming x-ray images. You'll learn how to exploit intensity patterns to select sub-regions of an array, and you'll use convolutional filters to detect interesting features. You'll also use SciPy's ndimage module, which contains a treasure trove of image processing tools. This is a summary of the lecture "Biomedical Image Analysis in Python", via DataCamp.

Aug 15, 2020
Naive Bees Image Loading and Processing

Can a machine distinguish between a honey bee and a bumble bee? Being able to identify bee species from images, while challenging, would allow researchers to more quickly and effectively collect field data. In this project, you will use the Python image library Pillow to load and manipulate image data. You'll learn common transformations of images and how to build them into a pipeline. This is the result of the project "Naive Bees Image Loading and Processing", via DataCamp.

Aug 14, 2020
Exploring the Bitcoin Cryptocurrency Market

To better understand the growth and impact of Bitcoin and other cryptocurrencies, in this project you will explore the market capitalization of different cryptocurrencies. This is the result of the project "Exploring the Bitcoin Cryptocurrency Market", via DataCamp.

Aug 13, 2020