Note: Most of the examples used to explain concepts of NumPy have been taken from Python For Data Analysis by Wes McKinney.
Let’s get started.
An ndarray is a generic multidimensional container for homogeneous data; that is, all of the elements must be the same type.
Every array has a shape, a tuple indicating the size of each dimension, and a dtype, an object describing the data type of the array:
# randn returns elements from a standard normal distribution
data = np.random.randn(2, 3)
data
array([[-1.00945873, -0.14747028,  1.04654565],
       [-0.69762101,  0.35370184, -0.08946465]])
To check the type of each element we use:
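For example (a minimal sketch; the random values will differ on every run):

```python
import numpy as np

# randn draws samples from a standard normal distribution
data = np.random.randn(2, 3)

# shape is a tuple giving the size of each dimension
print(data.shape)   # (2, 3)

# dtype describes the type of the array's elements
print(data.dtype)   # float64
```

Both `shape` and `dtype` are attributes of the array itself, so no extra function call is needed.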
A sentiment is a thought, opinion, or idea based on a feeling about a situation, or a way of thinking about something. That something can be movies, food, restaurants, etc.
With the help of user sentiment, we can recommend movies, food, and more.
Here we will be focusing on the IMDB movie-review dataset of 25,000 reviews with positive/negative sentiment labels in the training set and an equal number in the test set. The good thing about this dataset is that it comes with Keras by default, so you don't need to download it from another website.
Let’s get started with…
Object detection in a video means locating the presence of objects, assigning them to classes based on our deep learning model, and placing bounding boxes around them.
Simply put, our input is a video or image with multiple objects, and our output is the same video or image with bounding boxes (of a certain height and width) around the objects, plus the class names and probabilities to which they belong.
Here we will be using a pre-trained YOLO (You Only Look Once) model, which was trained for a long time on a large dataset covering around 80 classes of objects…
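YOLO typically reports each box as a normalized (center-x, center-y, width, height) tuple relative to the image size. As a sketch of what "bounding boxes of a certain height and width" means in pixels, a small helper (the name `yolo_to_corners` is my own, not part of any library) can convert such a box to corner coordinates for drawing:

```python
def yolo_to_corners(box, img_w, img_h):
    """Convert a normalized YOLO box (cx, cy, w, h) into pixel
    corner coordinates (x1, y1, x2, y2) suitable for drawing."""
    cx, cy, w, h = box
    x1 = int((cx - w / 2) * img_w)
    y1 = int((cy - h / 2) * img_h)
    x2 = int((cx + w / 2) * img_w)
    y2 = int((cy + h / 2) * img_h)
    return x1, y1, x2, y2

# A box centered in a 640x480 frame, half the frame wide and tall
print(yolo_to_corners((0.5, 0.5, 0.5, 0.5), 640, 480))  # (160, 120, 480, 360)
```

In a real pipeline, these corner coordinates would be passed to a drawing routine along with the predicted class name and probability.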
Web Scraping is the process of gathering information from the internet.
Note: If you are scraping a page that is out there on the World Wide Web purely for educational purposes, that seems fine. Still, you should consider checking the site's terms of service, as some websites don't like it when automatic scrapers gather their data, while others don't mind.
Let me give you an easy example of where it can be used. Say you want to buy a popular product from a website that goes out of stock as soon as it comes up. …
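The core of such a stock checker is extracting one piece of text from the page's HTML. The sketch below uses only the standard library's `html.parser`; the snippet of HTML and the `stock-status` class name are made up for illustration (a real site's markup will differ, and the page would normally be fetched over HTTP first):

```python
from html.parser import HTMLParser

# Hypothetical product-page snippet; in practice this string would be
# downloaded from the site rather than hard-coded
PAGE = '<div class="stock-status">Out of stock</div>'

class StockParser(HTMLParser):
    """Collects the text inside any element whose class is 'stock-status'."""
    def __init__(self):
        super().__init__()
        self.in_status = False
        self.status = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if ("class", "stock-status") in attrs:
            self.in_status = True

    def handle_data(self, data):
        if self.in_status:
            self.status = data.strip()
            self.in_status = False

parser = StockParser()
parser.feed(PAGE)
print(parser.status)  # Out of stock
```

Libraries like Requests and Beautiful Soup make this much more convenient, but the idea is the same: download the page, parse it, and pull out the value you care about.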
While I was working in Python, I tried copying a list to a new variable using the = operator. After making a few changes, I learned that this is not the right way to do it in Python.
A few seconds later I was scrolling through multiple articles, understanding two terms used when copying data in Python: shallow copy and deep copy.
The = operator does not copy the object; it just creates a new variable that shares the reference of the original object.
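A small example makes the three behaviors concrete:

```python
import copy

original = [[1, 2], [3, 4]]

alias = original                  # '=' only copies the reference
shallow = copy.copy(original)     # new outer list, shared inner lists
deep = copy.deepcopy(original)    # fully independent copy

original[0][0] = 99

print(alias[0][0])    # 99 -- alias is the very same object
print(shallow[0][0])  # 99 -- the inner lists are still shared
print(deep[0][0])     # 1  -- the deep copy is unaffected

print(alias is original)    # True
print(shallow is original)  # False
```

So a shallow copy protects you only from changes to the outer container; for nested structures, `copy.deepcopy` is the safe choice.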
Let's first start with the most basic and most understood regression: Linear Regression. In simpler terms, linear regression attempts to model the relationship between two variables by fitting a linear equation to observed data.
More generally, a linear regression model predicts by simply computing a weighted sum of the input features plus a constant called the bias term (also known as the intercept term).
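In NumPy terms, that weighted sum is just a dot product (the weights and features below are toy values chosen only for illustration):

```python
import numpy as np

x = np.array([2.0, 3.0, 5.0])   # input features
w = np.array([0.4, 0.1, 0.2])   # learned weights, one per feature
b = 1.5                         # bias (intercept) term

# prediction = weighted sum of the features plus the bias
y_hat = np.dot(w, x) + b
print(y_hat)  # approximately 3.6 (0.8 + 0.3 + 1.0 + 1.5)
```

Training a linear regression model means finding the values of `w` and `b` that make such predictions fit the observed data as closely as possible.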
So after a month of revising Python concepts and understanding the basics of a few Python data-handling and visualization libraries like NumPy, Pandas, Matplotlib, etc., I decided to create my account on Kaggle and start my data analytics journey with real data.
After signing up, I started looking into some popular datasets for which people have submitted their notebooks. I looked into a few of the highly rated ones. They use Seaborn, Plotly, etc., for better visualizations and easy-to-code plots.
I’ve often seen how the same data is shown in different ways to bias its overall effect.
Tricks can be performed to frame a particular set of results, converting their outcome from positive to negative or vice versa.
The FRAMING EFFECT is when our decisions are influenced by the way information is presented. Equivalent information can be more or less attractive depending on which features are highlighted. Decisions based on this framing effect can put something of lesser value or importance in a positive light, and highly important information in a negative one.
5% mortality sounds worse…
Data visualization is one of the key skills expected these days when working with data. It helps simplify complex data into an understandable format so that decisions can be made based on it.
Companies are collecting various types of data, including climate data, user data, transactional data, medical data, etc. All of this data is analyzed and visualized to inform some very important business decisions.
Visualization works from a human perspective because we respond to and process visual data better than any other type of data. In fact, the human brain processes images 60,000 times faster than text, and 90 percent of…
While doing data analysis, a major chunk of your time (around 80%) is spent on loading, cleaning, transforming, and rearranging data, and doing all kinds of work to bring it into the right format.
Most commonly it is missing data that needs to be handled.
In pandas, we denote missing data as NA (not available). We can also use the None value, as it is treated the same way.
Let’s try to understand it with an example:
# import all the required libraries on your jupyter notebook
import pandas as pd
import numpy as np
from numpy import nan as NA
Implementation Analyst || Learning Analytics and Machine Learning