Oracle APEX 23 Course For Beginners


Thursday, 20 October 2022

Best Data Analysis Book

Python Data Analysis Book
Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter, 3rd Edition

This book teaches you how to manipulate, process, clean, and crunch data in Python. It serves as a guide to the parts of the Python programming language, and to the data-oriented libraries and tools around it, that you need to become an effective data analyst.

Python's open-source ecosystem for data analysis (or data science) has grown over the years, along with a large and active scientific computing and data analysis community. Python is now one of the most important languages for data science, machine learning, and general software development. In recent years, improvements to open-source libraries such as pandas and scikit-learn have made it a popular choice for data analysis tasks. Here is a summary of the topics covered in this book:

  • Use the Jupyter notebook and IPython shell for exploratory computing
  • Learn basic and advanced features in NumPy
  • Get started with data analysis tools in the pandas library
  • Use flexible tools to load, clean, transform, merge, and reshape data
  • Create informative visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Analyze and manipulate regular and irregular time series data
  • Learn how to solve real-world data analysis problems with thorough, detailed examples

Here are the details of the topics covered in this book:

Why Python for Data Analysis?

Python Libraries (NumPy, pandas, matplotlib, IPython and Jupyter, SciPy, scikit-learn, statsmodels)
Installation (Miniconda on Windows, GNU/Linux, and macOS)

Python Language Basics, IPython, and Jupyter Notebook 

Language Semantics
Scalar Types
Control Flow

Data Structures and Sequences 

Tuple
List
Dictionary
Set
Built-In Sequence Functions
List, Set, and Dictionary Comprehensions
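
As a rough illustration of the last item in this chapter, here is a minimal sketch of list, set, and dictionary comprehensions. The sample words and variable names are made up for this post and are not taken from the book.

```python
# Hypothetical sample data, invented for illustration only.
words = ["pandas", "numpy", "jupyter", "numpy", "scipy"]

# List comprehension: upper-case only the words longer than five characters.
long_upper = [w.upper() for w in words if len(w) > 5]

# Set comprehension: collect the distinct word lengths.
lengths = {len(w) for w in words}

# Dictionary comprehension: map each word to its length
# (the duplicate "numpy" collapses into a single key).
word_len = {w: len(w) for w in words}

print(long_upper)
print(sorted(lengths))
print(word_len)
```

All three forms share the same shape: an expression, a `for` clause, and an optional `if` filter. Only the surrounding brackets and the key/value pair change.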

Functions

Namespaces, Scope, and Local Functions
Returning Multiple Values
Functions Are Objects
Anonymous (Lambda) Functions
Generators
Errors and Exception Handling
Bytes and Unicode with Files

NumPy Basics: Arrays and Vectorized Computation

Create ndarray: A multidimensional array object
Data types of ndarrays
Arithmetic with NumPy arrays
Basic indexing and slicing
Boolean indexing
Fancy indexing
Transposing arrays and swapping axes
Pseudorandom number generation
Universal functions: Fast element-wise array functions
Array-oriented programming with arrays
Expressing conditional logic as array operations
Mathematical and statistical methods
Methods for boolean arrays
Sorting
Unique and other set logic
File input and output with arrays
Linear algebra

Introduction to pandas data structures

Series
DataFrame
Index Objects
Reindexing
Dropping entries from an Axis
Indexing, Selection, and Filtering
Arithmetic and data alignment
Function application and mapping
Sorting and ranking
Axis indexes with duplicate labels
Correlation and covariance
Unique values, value counts, and membership

Data Loading, Storage, and File Formats

Reading and writing data in text format
Reading text files in pieces
Writing data to text format
Working with other delimited formats
JSON data
XML and HTML web scraping
Binary data formats
Reading Microsoft Excel files
Using HDF5 format
Interacting with Web APIs
Interacting with databases

Data Cleaning and Preparation

Handling missing data
Filtering out missing data
Filling in missing data
Data Transformation
Removing duplicates
Transforming data using a function or mapping
Replacing values
Renaming axis indexes
Discretization and binning
Detecting and filtering outliers
Permutation and random sampling
Computing indicator/dummy variables
Extension data types
String manipulation
Python built-in string object methods
Regular expressions
String functions in pandas
Categorical data
Background and motivation
Categorical extension type in pandas
Computations with categoricals
Categorical methods

Data Wrangling: Join, Combine, and Reshape

Hierarchical indexing
Reordering and sorting levels
Summary statistics by level
Indexing with a DataFrame's columns
Combining and merging datasets
Database-style DataFrame joins
Merging on index
Concatenating along an axis
Combining data with overlap
Reshaping and pivoting
Reshaping with Hierarchical indexing
Pivoting "Long" to "Wide" format
Pivoting "Wide" to "Long" format

Plotting and Visualization

Matplotlib API primer
Figures and subplots
Colors, markers, and line styles
Ticks, labels, and legends
Annotations and drawings on a subplot
Saving plots to file
matplotlib configuration
Plotting with pandas and seaborn
Line plots
Bar plots
Histograms and density plots
Scatter or Point plots
Facet grids and categorical data
Other Python visualization tools

Data Aggregation and Group Operations

How to think about group operations
Iterating over groups
Selecting a column or subset of columns
Grouping with dictionaries and series
Grouping with functions
Grouping by index levels
Column-wise and multiple-function application
Returning aggregated data without row indexes
General split-apply-combine
Suppressing the group keys
Quantile and bucket analysis
Example: Filling missing values with group-specific values
Example: Random sampling and permutation
Example: Group weighted average and correlation
Example: Group-wise linear regression
Group transforms and "Unwrapped" groupbys
Pivot tables and cross-tabulation
Cross-tabulations: Crosstab

Time Series

Date and time data types and tools
Converting between string and datetime
Time series basics
Indexing, selection, subsetting
Time series with duplicate indices
Date ranges, frequencies, and shifting
Generating date ranges
Frequencies and date offsets
Shifting (leading and lagging) data
Time zone handling
Time zone localization and conversion
Operations with time zone-aware timestamp objects
Operations between different time zones
Periods and Period arithmetic
Period frequency conversion
Quarterly period frequencies
Converting timestamps to periods (and back)
Creating a PeriodIndex from arrays
Resampling and frequency conversion
Downsampling
Upsampling and interpolation
Resampling with periods
Grouped time resampling
Moving window functions
Exponentially weighted functions
Binary moving window functions
User-defined moving window functions

Modeling Libraries in Python

Interfacing between pandas and model code
Creating model descriptions with Patsy
Data transformations in Patsy formulas
Categorical data and Patsy
Introduction to statsmodels
Estimating linear models
Estimating time series processes
Introduction to scikit-learn

Data Analysis Examples

Counting time zones in pure Python
Counting time zones with pandas
Measuring rating disagreement
Analyzing naming trends
Donation statistics by occupation and employer
Bucketing donation amounts
Donation statistics by state
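
To give a feel for the first example in this chapter, counting time zones in pure Python, here is a minimal sketch using only the standard library. The records, the `tz` field name, and the `count_time_zones` helper are assumptions invented for this post; they are not the book's dataset or code.

```python
from collections import Counter

# Hypothetical records; the "tz" field name and all values are assumptions.
records = [
    {"tz": "America/New_York"},
    {"tz": "Europe/London"},
    {"tz": "America/New_York"},
    {"tz": ""},                       # empty time zone, skipped below
    {"agent": "no time zone field"},  # missing field, skipped below
]


def count_time_zones(recs):
    """Count non-empty 'tz' values, ignoring records without the field."""
    return Counter(rec["tz"] for rec in recs if rec.get("tz"))


tz_counts = count_time_zones(records)
top = tz_counts.most_common(1)  # the most frequent time zone and its count
print(top)
```

Using `rec.get("tz")` avoids a `KeyError` for records that lack the field and filters out empty strings in the same step.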

Advanced NumPy

ndarray object internals
NumPy data type hierarchy
Advanced array manipulation
Reshaping arrays
C versus Fortran Order
Concatenating and splitting arrays
Repeating elements: tile and repeat
Fancy indexing equivalents: take and put
Broadcasting over other axes
Setting array values by broadcasting
Advanced ufunc usage
ufunc instance methods
Writing new ufuncs in Python
Structured and record arrays
Nested data types and multidimensional fields
Why use structured arrays?
Indirect sorts: argsort and lexsort
Alternative sort algorithms
Partially sorting arrays
numpy.searchsorted: Finding elements in a sorted array
Writing fast NumPy functions with Numba
Creating custom numpy.ufunc objects with Numba
Advanced array input and output
Memory-mapped files
HDF5 and other array storage options
The importance of contiguous memory

IPython System

Terminal keyboard shortcuts
Magic commands
The %run command
Executing code from the clipboard
Searching and reusing the command history
Input and output variables
Interacting with the operating system
Shell commands and aliases
Directory bookmark system
Software development tools
Interactive debugger
Timing code: %time and %timeit
Basic profiling: %prun and %run -p
Profiling a function line by line
Tips for productive code development using IPython
Reloading module dependencies
Code design tips
Advanced IPython features
Profiles and configuration

