Frequently used functions of NumPy, Pandas, Matplotlib, Seaborn, Scikit-Learn, TensorFlow and Keras
If you work in Python and belong to any of the following fields (or a related one), then this cheat sheet is for you:
- Data analysis
- Data science
- Machine learning
- Deep learning
What benefit do I get from this cheat sheet?
If you’re starting out with these technologies, there’s no need to rush through everything each one offers, because most of the time only a few functions do the majority of the work. So just kick-start with the functions listed here and you’re good to go.
Moving ahead, I’ve compiled a list of the functions people have used most frequently, from the time the first question about any of the technologies in the title was filed on Stack Overflow up to March 5th, 2019. I’ve chosen the functions that occur more than 100 times. I’ve also described the complete procedure at the bottom of this article, from data collection to data preparation and final data presentation. You can take your time to adapt the code and produce more refined results at your own pace.
Note:
1. A few functions are associated with almost all of these libraries, so they appear in other libraries' lists too. For instance, groupby belongs to Pandas but may also be present in other libraries' lists due to its high frequency.
2. The complete list of functions for each library is provided in CSV format, in decreasing order of frequency. The links to the CSV files are given in the References section.
NumPy
Top-most: array, arange, reshape, shape
Rest as wordcloud.

Pandas
Top-most: DataFrame, groupby, apply, loc, reset_index
Rest as wordcloud.

Matplotlib
Top-most: plot, show, figure, subplots
Rest as wordcloud.

Seaborn
Top-most: show, DataFrame, plots, subplots
Rest as wordcloud.

Scikit-Learn
Top-most: fit, fit_transform, predict, array
Rest as wordcloud.

TensorFlow
Top-most: run, Session, placeholder, Variable, constant
Rest as wordcloud.

Keras
Top-most: add, compile, shape, fit
Rest as wordcloud.

So, what’s the procedure for generating my own cheat sheet?
It’s simple. Just a bit of SQL, Python and R (optional).
I grabbed the data from StackExchange by composing a SQL query that returns all the answers stored in their database to date for each technology.
First, I looked up the tag ID of each technology, one by one, and then used that ID to fetch all the answers to questions tagged with that technology.
For instance, running the following SQL query grabs all the answers written on Stack Overflow to date for questions under the numpy tag:
SELECT Body
FROM Posts
WHERE PostTypeId = 2
  AND ParentId IN (SELECT PostId
                   FROM PostTags
                   WHERE TagId = (SELECT Id
                                  FROM Tags
                                  WHERE TagName = 'numpy'))
Note: StackExchange returns at most 50k records per query, so I recorded the creation date of the last record in each result and used it as a constraint to fetch the next batch. You may run into this limit with technologies that have many records, such as pandas. I got a few duplicates after merging the 3 chunks for pandas, which I took care of in Python.
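The merge-and-deduplicate step could look roughly like the sketch below. It is not the code used for this article. It assumes each chunk also carries the Id and CreationDate columns of the Posts table (the query above selects only Body) and that duplicates come from rows sitting exactly on the date boundary. The helper names merge_chunks and next_boundary and the sample rows are hypothetical.

```python
def merge_chunks(chunks):
    """Concatenate chunks, keeping only the first row seen for each Id."""
    seen_ids = set()
    merged = []
    for chunk in chunks:
        for row in chunk:
            if row["Id"] not in seen_ids:
                seen_ids.add(row["Id"])
                merged.append(row)
    return merged


def next_boundary(chunk):
    """Latest CreationDate in a chunk; ISO-8601 strings sort chronologically."""
    return max(row["CreationDate"] for row in chunk)


# Two hypothetical chunks that overlap on the boundary row (Id 2).
chunk_1 = [
    {"Id": 1, "CreationDate": "2019-01-01T10:00:00", "Body": "a.array(x)"},
    {"Id": 2, "CreationDate": "2019-01-02T09:30:00", "Body": "df.loc[0]"},
]
chunk_2 = [
    {"Id": 2, "CreationDate": "2019-01-02T09:30:00", "Body": "df.loc[0]"},
    {"Id": 3, "CreationDate": "2019-01-03T08:15:00", "Body": "m.fit(X, y)"},
]

answers = merge_chunks([chunk_1, chunk_2])
boundary = next_boundary(chunk_1)  # constraint for the following query
```

Deduplicating on Id rather than on Body avoids dropping two distinct answers that happen to contain identical text.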
Second, I cleaned the data to select only the methods from the text body, assuming that each one starts with a . and ends with either a ( or a [. I used regex in Python to achieve this.
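import re
# Capture the name between a leading "." and a following "(" or "["
p = re.compile(r'\.([A-Za-z_]+)(?=\(|\[)')
# ans is the DataFrame of answers; Body holds the text of each answer
ans['Methods'] = ans.Body.apply(lambda x: p.findall(x))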
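To see what the pattern captures, here is a small standalone check on a made-up answer body. The sample string and the name METHOD_PATTERN are illustrative only and do not come from the collected data.

```python
import re

# Same pattern as above: a ".", then letters/underscores, followed by "(" or "[".
METHOD_PATTERN = re.compile(r"\.([A-Za-z_]+)(?=\(|\[)")

# Hypothetical answer text mixing method calls and indexers.
sample_body = "Use df.groupby('a').sum(), then df.loc[0] and np.arange(10).reshape(2, 5)."

methods = METHOD_PATTERN.findall(sample_body)
print(methods)
```

Because the character class contains only letters and underscores, names with digits in them (for example conv2d) are not captured by this pattern.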
The complete code is listed here. You will find two scripts: the first is meant for pandas and the second is common to the rest. I had to write separate code for pandas that works in chunks to avoid memory limitations. At this point, you’ll have a CSV for each library.
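The step from per-answer method lists to a frequency table can be sketched as follows. This is a minimal illustration, not the linked code. The function names, the method and count column headers, and the tiny sample are assumptions, and the real CSV layout may differ. The threshold mirrors the "more than 100 occurrences" cut-off used for the lists above, lowered here so the toy data produces output.

```python
import csv
import io
from collections import Counter


def count_methods(method_lists, threshold=0):
    """Count method names across all answers, most frequent first,
    keeping only names that occur more than `threshold` times."""
    counts = Counter(name for methods in method_lists for name in methods)
    return [(name, n) for name, n in counts.most_common() if n > threshold]


def write_counts(rows, handle):
    """Write (method, count) rows as CSV to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["method", "count"])
    writer.writerows(rows)


# Hypothetical output of the regex step for three answers.
per_answer = [["array", "reshape"], ["array", "arange"], ["array", "reshape"]]
frequent = count_methods(per_answer, threshold=1)

buffer = io.StringIO()  # stands in for open("methods_numpy.csv", "w", newline="")
write_counts(frequent, buffer)
print(buffer.getvalue())
```

Counting with a single Counter over a generator keeps memory use proportional to the number of distinct method names, which also makes it easy to update the counts chunk by chunk.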
Third, for the visualization I chose R, as its wordcloud2 library performs quite well compared to Python's wordcloud library. Here’s the code used to construct each plot.
library(wordcloud2)
library(webshot)
library(htmlwidgets)
df <- read.csv('methods_keras.csv')
my_graph = wordcloud2(df[,-1], size = 0.5, minRotation = -pi/6, maxRotation = -pi/6, rotateRatio = 1)
saveWidget(my_graph, "tmp.html", selfcontained = F)
webshot("tmp.html", "keras.png", delay = 10, vwidth = 800, vheight = 400)
References
- StackExchange and Stack Overflow
- Get the complete NumPy, Pandas, Matplotlib, Seaborn, Scikit-Learn, TensorFlow, and Keras CSV files.
Do comment if you have any ideas to improve the work or if you have any other suggestions. :)
Happy engineering!