Professional Python project: exploratory data analysis with Jupyter notebooks.
Data analytics requires a variety of skills. This course builds capabilities through working projects.
In the age of generative AI, durable skills are grounded in real work: setting up a professional environment, reading and running code, understanding the logic, and pushing work to a shared repository. Each project follows the structure of professional Python projects. We learn by doing.
This project introduces Exploratory Data Analysis (EDA) using Jupyter notebooks.
When we encounter a new dataset, we want to explore quickly: run checks, view distributions, identify missing values or outliers. Notebooks combine Markdown narrative with Python code cells and are ideal for this kind of investigation.
You will run the example notebook, read the code and narrative, and create your own notebook to explore a different tabular dataset.
You'll work with just these areas:
- docs/ - the project narrative and documentation
- src/datafun - supporting Python module
- notebooks/ - where the analysis happens
- pyproject.toml - update authorship & links
- zensical.toml - update authorship & links
Follow the step-by-step workflow guide to complete:
- Phase 1. Start & Run
- Phase 2. Change Authorship
- Phase 3. Read & Understand
- Phase 4. Modify
- Phase 5. Apply
Challenges are expected. Sometimes instructions may not quite match your operating system. When issues occur, share screenshots, error messages, and details about what you tried. Working through issues is part of implementing professional projects.
After completing Phase 1. Start & Run, you'll have your own GitHub project, with the example notebook executed and committed, and running the example script will print out:
========================
Executed successfully!
========================A new file project.log will appear in the root project folder.
The commands below are used in the workflow guide above. They are provided here for convenience.
Follow the guide for the full instructions.
Show command reference
After you get a copy of this repo in your own GitHub account,
open a machine terminal in your Repos folder:
# Replace username with YOUR GitHub username.
git clone https://github.com/username/datafun-04-notebooks
cd datafun-04-notebooks
code .These are listed for convenience. For best results, follow the detailed instructions in pro-analytics-02 guide.
uv self update
uv python pin 3.14
uv lock --upgrade
uv sync --extra dev --extra docs --upgrade
uvx pre-commit install
git add -A
uvx pre-commit run --all-files
# repeat if changes were made
uvx pre-commit run --all-files
# run the module to verify the environment (.venv)
uv run python -m datafun.app_case
# do chores
uv run ruff format .
uv run ruff check . --fix
uv run python -m pyright
uv run python -m pytest
uv run python -m zensical build
# save progress
git add -A
git commit -m "update"
git push -u origin main- Use the UP ARROW and DOWN ARROW in the terminal to scroll through past commands.
- Use
CTRL+fto find (and replace) text within a file. - You do not need to add to or modify
tests/. They are provided for example only. - Many files are silent helpers. Explore as you like, but nothing is required.
- You do NOT not to understand everything; understanding builds naturally over time.
If you see something like this in your terminal: >>> or ...
You accidentally started Python interactive mode.
It happens.
Press Ctrl+c (both keys together) or Ctrl+Z then Enter on Windows.
| INFO | EDA | --- Section 9: Summary and next steps ---
| INFO | EDA | ========================
| INFO | EDA | SUMMARY
| INFO | EDA | ========================
| INFO | EDA | Dataset: penguins
| INFO | EDA | Original rows: 344
| INFO | EDA | Clean rows: 342
| INFO | EDA | Groups found in species: ['Adelie', 'Chinstrap', 'Gentoo']
| INFO | EDA | Strongest correlation:
| INFO | EDA | flipper_length_mm and body_mass_g (~0.87)
| INFO | EDA | Suggested next step:
| INFO | EDA | Model body_mass_g ~ flipper_length_mm with linear regression
| INFO | EDA | ----- in a script, call plt.show() once at the end to display all charts -----
| INFO | EDA | EDA workflow complete
| INFO | EDA | IMPORTANT: This script creates chart windows.
| INFO | EDA | Close any chart windows and terminate this process with CTRL+c as needed.
| INFO | EDA | ========================
| INFO | EDA | Executed successfully!
| INFO | EDA | ========================Take screenshots of your charts and provide them here with a discussion. In Markdown, display a figure by using: an exclamation mark immediately followed by square brackets containing a useful caption immediately followed by parentheses containing the relative path to your figure. Note: When you start typing the path with a dot (.) for "here, in this directory", the IDE may help complete the path.
Follow this example, but the figures should reflect your work and include your narrative. Remove unnecessary instructional comments in your final version of this README.md.


