smammadl/ds_course

Syllabus

1. Programming with Python

1.1. Introduction to Python and Basic Concepts

  • Introduction to Python and Data Science
  • Setting up the Environment
  • Basic Operators
    • Arithmetic Operators
    • Operator Precedence
    • Comparison Operators
  • Variables and Data Types
    • Reassignment
    • Assignment Operators
    • Variable Naming Rules
    • Basic Data Types
    • Assigning Different Data Types to Variables
    • Type Conversion Functions
  • Strings in Detail
    • String Concatenation
    • String Replication
    • Escape Characters
    • String Indexing and Slicing
    • String Properties
    • String Functions
    • String Methods
  • Basic Input and Output
    • Output
    • Formatting Options
    • Input

1.2. Data Structures and Control Flow

  • Introduction to Data Structures
  • Lists
    • Creating Lists
    • List Membership
    • Indexing and Slicing
    • Modifying Lists
    • List Methods
    • Nested Lists
  • Tuples
    • Creating Tuples
    • When to Use Tuples?
  • Logical Operators
    • Chained Comparison Operators
  • Conditional Statements
    • if, elif, and else
    • Nested Conditional Statements
    • Single-line Conditional Statements
    • Match Statement
  • Loops
    • for Loops
    • while Loops
    • break, continue, pass
    • Nested Loops
  • List Comprehensions
    • The Basic Syntax
    • Adding a Condition
    • if-else in List Comprehensions
    • Nested List Comprehensions

1.3. Advanced Data Structures and Functions

  • Dictionaries
    • Creating Dictionaries
    • Accessing Values
    • Modifying Dictionaries
    • More Complex Dictionaries
    • Operations on Dictionaries
  • Sets
    • Creating Sets
    • Set Operations
  • Introduction to Functions
    • Writing and Calling Functions
  • Function Arguments
    • Positional Arguments
    • Keyword Arguments
    • Using the return Statement
    • Adding Logic to Functions
  • Scope of Variables

1.4. Advanced Functions and Functional Programming for Data Processing

  • What is Functional Programming?
  • Lambda Functions
  • map, filter, and reduce
    • map
    • filter
    • reduce
  • List Comprehensions
    • Basic Syntax
    • Adding a Condition
    • if-else in List Comprehensions
  • Dictionary and Set Comprehensions
    • Dictionary Comprehensions
    • Set Comprehensions
  • Error Handling (try...except)
    • Basic Syntax
    • Handling Multiple Exceptions
    • else and finally Clauses
  • Raising Exceptions (raise)
  • Assertions

1.5. Object-Oriented Programming (OOP)

  • Introduction to OOP
  • Creating Classes
    • Defining a Class
    • Creating an Instance (Object)
    • A Different Example
  • Inheritance
  • Polymorphism
  • Encapsulation
  • Special Methods
  • Data Classes

1.6. Advanced Data Structures and Iterators

  • Args and Kwargs
    • *args
    • **kwargs
    • Using *args and **kwargs together
  • Generators
  • Iterators
  • Decorators
    • Examples
    • Built-in Decorators
  • Reading and Writing Files
    • Pathlib Module
    • File Operations
  • Introduction to Python's Standard Library
    • Collections Module
    • Itertools Module
    • Random Module

1.7. Advanced Topics (Not Covered in the Course)

  • Advanced OOP and Design Patterns
    • Metaclasses
    • Descriptors
    • Abstract Base Classes (ABCs)
    • Common Design Patterns
    • Data Model Protocols
  • Concurrency and Parallelism
    • Introduction to Concurrency
    • Threading Module
    • Multiprocessing Module
    • Asyncio
    • Concurrent Data Processing
  • Performance and Optimization
    • Profiling Code
    • Memoization and Caching
    • Working with C/C++ Extensions
    • Memory Management
    • Vectorization Principles
    • Numerical Computing Foundations
  • Production-Ready Data Science Code
    • Creating Data Science Packages
    • Virtual Environments and Dependency Management
    • Testing Data Pipelines
    • Code Style and Documentation
    • Logging and Monitoring
    • Bridging to Data Science Libraries

2. Data Analysis and Statistics

2.1. Numerical Computing with NumPy

  • Introduction to NumPy
  • NumPy Arrays (ndarrays)
    • Creating NumPy Arrays
    • Array Attributes
    • Reshaping Arrays
    • Data Types
    • Array from Functions
  • Array Indexing and Slicing
    • 1D Array Indexing and Slicing
    • 2D Array Indexing and Slicing
    • Ellipsis
    • Boolean Indexing
  • Array Operations
    • Vectorized Operations
    • Broadcasting
    • Stacking Arrays
    • Splitting Arrays
    • Transposing Arrays
  • Mathematical and Statistical Functions
  • Linear Algebra with NumPy
    • Examples
    • Solving a System of Linear Scalar Equations

2.2. Data Manipulation with Pandas

  • Introduction to Pandas
  • Pandas Data Structures
  • Series
    • Creating a Series
    • Index
    • Automatic Alignment
    • Name of the Series
    • Conditional Replace
    • Counting Values
    • Applying a Function
    • Series Utils
  • DataFrame, IO, Multi-Index
    • Creating a DataFrame
    • Reading and Writing Data
    • Multi-Index DataFrame
    • Operations on Indexes and Columns
  • Sort, Index, Ops
    • Accessing Rows and Columns
    • Sorting a DataFrame
    • Operations on DataFrames
    • Data Cleaning and Preparation
  • Data Transformations
    • Merging, Joining, and Concatenating DataFrames
    • Aggregating and Grouping Data
    • Aggregate, Filter, Transform, Apply
    • Pivot Tables
  • String and DateTime
    • Vectorized String Operations
    • Date/Time Manipulation

2.3. Visualization with Matplotlib and Seaborn (VanderPlas' Book)

  • Simple Line Plots
    • Adjusting the Plot: Line Colors and Styles
    • Adjusting the Plot: Axes Limits
    • Labeling Plots
  • Simple Scatter Plots
    • Scatter Plots with plt.plot
    • Scatter Plots with plt.scatter
    • Visualizing Uncertainties
  • Density and Contour Plots
    • Visualizing a Three-Dimensional Function
    • Histograms, Binnings, and Density
    • Two-Dimensional Histograms and Binnings
  • Multiple Subplots
    • plt.axes: Subplots by Hand
    • plt.subplot: Simple Grids of Subplots
    • plt.subplots: The Whole Grid in One Go
    • plt.GridSpec: More Complicated Arrangements
  • Visualization with Seaborn
    • Exploring Seaborn Plots
      • Histograms, KDE, and Densities
      • Pair Plots
      • Faceted Histograms
    • Categorical Plots
      • Joint Distributions
      • Bar Plots
    • Violin Plots
    • Heatmaps

2.4. SQL for Data Analysis (from a Pandas Perspective)

  • Introduction to Relational Databases and SQL
    • Comments
    • Data Types
    • Constraints
  • Querying and Selecting Data (similar to df[] and df.loc)
    • Basic Queries: SELECT, FROM, WHERE
    • Aliases
    • DISTINCT for unique values
  • Filtering and Sorting Data (similar to boolean indexing and df.sort_values())
    • ORDER BY
    • LIMIT
    • Logical operators: AND, OR, NOT
    • Comparison operators: IN, NOT IN, BETWEEN
    • Pattern matching with LIKE
    • Handling nulls with IS NULL, IS NOT NULL, COALESCE, NULLIF
  • Data Manipulation and Cleaning (similar to df.apply, .astype, and string methods)
    • CASE statements for conditional logic
    • CAST for type conversion
    • String Functions
      • CONCAT
      • UPPER
      • LOWER
      • TRIM
      • REPLACE
      • LEN
      • LEFT
      • RIGHT
      • SUBSTRING
    • Numerical Functions
      • ABS
      • CEIL
      • FLOOR
      • ROUND
      • TRUNC
      • SQRT
      • POWER
      • LOG
    • Date Functions
      • EXTRACT / DATE_PART
      • DATE_TRUNC
      • DATEDIFF
      • DATE_ADD / DATE_SUB
  • Aggregation and Grouping (similar to df.groupby() and aggregation functions)
    • Aggregate Functions: COUNT, SUM, AVG, MIN, MAX
    • GROUP BY
    • HAVING for filtering groups
    • Rollups and Cubes (briefly)
  • Combining Tables (similar to pd.merge and pd.concat)
    • Keys
    • INNER, LEFT, RIGHT, FULL OUTER JOIN
    • Self Joins & Cross Join
    • UNION, UNION ALL
    • INTERSECT, EXCEPT
  • Advanced Querying
    • Subqueries (Scalar, Multi-row, Correlated)
    • Common Table Expressions (CTEs) with WITH
    • Window Functions: ROW_NUMBER, RANK, DENSE_RANK, LEAD, LAG, NTILE
  • Advanced Topics
    • Data Cleaning and Transformation (Pivot/Unpivot patterns)
    • DDL for Analysts: CREATE TABLE AS SELECT (CTAS), CREATE VIEW
    • Indexes, Triggers, Views, Stored Procedures, Transactions
    • Connectors and ORMs for Python (SQLAlchemy, psycopg2)

2.5. Analytics Applications

  • Interactive Visualization with Plotly
    • Getting Started: Plotly Express vs Graph Objects
    • Interactive Features: hover, zoom, selection
    • Styling and Layout: axes, annotations, themes
    • Subplots and Faceting
    • Advanced Charts: treemap, sunburst, polar, funnel
    • Maps: choropleth, scatter_mapbox
    • Time Series and Animation
    • Integrating with Pandas
    • Exporting: static images, HTML
  • Streamlit
    • App Structure and streamlit run
    • Widgets and Interactivity
    • Forms and Validation
    • State Management with st.session_state
    • Caching with st.cache_data and st.cache_resource
    • Displaying DataFrames and Tables
    • Layout: sidebar, columns, tabs
    • Visualizations: Matplotlib, Seaborn, Plotly
    • File Upload/Download
    • Multipage Apps
    • Deployment Basics
  • Power BI
    • Introduction to Power BI
    • Data Sources
    • Data Modeling
    • Data Visualization
    • Dashboards
    • Reports

3. Data Science and Machine Learning

  • Scikit-learn
    • Statistics
    • Handling Outliers
    • Handling Nulls
    • Handling Categorical Variables
    • Feature Scaling
    • Logs, exponentials, etc.
    • Regression
    • Classification
    • Regression and Classification Models
    • Metrics
    • Ensembles
    • Tuning
    • Unsupervised Learning
    • Dimensionality Reduction (tentative)
  • PyTorch
    • Introduction to PyTorch
    • Tensor Operations
    • Neural Networks
    • Training Loop
    • Metrics
    • Evaluation
    • More Complex Models
    • Saving and Inference
