Code Structure & Organization:
- Script + Variable Configuration: By default, scripts do not expose a command-line interface (CLI) unless one is specifically requested. Configuration values are defined as constants or variables at the top of the script.
- Modular Design: Prefers to break down the code into small, reusable modules.
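
The two preferences above can be sketched as a single script: configuration constants at the top, small reusable functions below, and no CLI parsing. The directory names, the file pattern, and the helper names are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# --- Configuration (hypothetical values; edit here instead of passing CLI flags) ---
INPUT_DIR = Path("data/raw")
OUTPUT_DIR = Path("data/processed")
FILE_PATTERN = "*.csv"
MAX_FILES = 100


def find_input_files(input_dir: "Path", pattern: str, limit: int) -> "list[Path]":
    """Return up to `limit` files in `input_dir` matching `pattern`, sorted by name."""
    if not input_dir.exists():
        return []
    return sorted(input_dir.glob(pattern))[:limit]


def build_output_path(source: "Path", output_dir: "Path") -> "Path":
    """Map an input file to a file of the same name inside `output_dir`."""
    return output_dir / source.name


def main() -> None:
    """Wire the small helpers together using the constants above."""
    for source in find_input_files(INPUT_DIR, FILE_PATTERN, MAX_FILES):
        print(f"{source} -> {build_output_path(source, OUTPUT_DIR)}")


if __name__ == "__main__":
    main()
```

Because each helper takes its inputs as parameters rather than reading the constants directly, the helpers can be reused or tested without touching the configuration block.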
Import Style:
- Unified Import Method: Uses `from pathlib import Path`, avoiding `import pathlib`.
String Handling:
- String Format: Prefers double quotes (`"`) over single quotes (`'`).
Type Hinting:
- Type Annotations: Prefers type hints. Non-primitive types are always enclosed in double quotes.
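
A minimal sketch of this annotation style, assuming "primitive" means built-ins such as `int`, `str`, and `bool`, while everything else (`Path`, generic containers) is written as a quoted string. The function `count_extensions` is a hypothetical example, not part of the original preferences.

```python
from pathlib import Path


def count_extensions(paths: "list[Path]", lowercase: bool = True) -> "dict[str, int]":
    """Count how many of the given paths share each file extension."""
    counts: "dict[str, int]" = {}
    for path in paths:
        # Primitive annotations (bool, int, str) stay unquoted; others are quoted.
        suffix = path.suffix.lower() if lowercase else path.suffix
        counts[suffix] = counts.get(suffix, 0) + 1
    return counts
```

Quoted annotations are stored as plain strings and are not evaluated when the function is defined, so they also work for names that are defined later in the file.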
Object-Oriented Programming:
- Class Syntax: Prefers using `class` for encapsulation of complex logic.
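
As a rough illustration of wrapping multi-step logic in a class, the sketch below groups file paths into fixed-size batches. `FileBatcher` and its methods are hypothetical names invented for this example.

```python
from pathlib import Path


class FileBatcher:
    """Group file paths into fixed-size batches (hypothetical example class)."""

    def __init__(self, paths: "list[Path]", batch_size: int = 2) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.paths = sorted(paths)
        self.batch_size = batch_size

    def __len__(self) -> int:
        """Number of batches, rounding up for a partial final batch."""
        return -(-len(self.paths) // self.batch_size)

    def batches(self) -> "list[list[Path]]":
        """Return the sorted paths split into consecutive batches."""
        return [
            self.paths[start:start + self.batch_size]
            for start in range(0, len(self.paths), self.batch_size)
        ]
```

The validation, sorting, and slicing rules live in one place, so callers only need `FileBatcher(paths, batch_size).batches()`.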
Code Style:
- Clean, Readable Code: Encapsulates complex logic into functions and classes. Frequently adds detailed comments.
Error Handling:
- Exception Handling: Uses `try-except` blocks for exceptions with appropriate logging.
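
One way this pattern might look, using only the standard `logging` module. The function name `read_text_or_default`, the log format, and the choice to return a fallback value instead of re-raising are assumptions made for the sketch.

```python
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger(__name__)


def read_text_or_default(path: "Path", default: str = "") -> str:
    """Read a UTF-8 text file, logging and returning `default` on failure."""
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        # Expected failure: a warning without a traceback is enough.
        logger.warning("File not found: %s", path)
        return default
    except OSError:
        # Unexpected I/O failure: logger.exception also records the traceback.
        logger.exception("Could not read %s", path)
        return default
```

Catching specific exception types keeps unrelated bugs (for example a `TypeError`) visible instead of silently logged.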
File & Data Operations:
- Path Management: Prefers `pathlib` for handling file paths.
- Directory Operations: Uses methods like `Path.mkdir()` and `Path.exists()` to check and create directories.
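
The directory check-and-create step can be sketched as follows. `ensure_directory` is a hypothetical helper, and the usage runs inside a temporary directory so nothing is left on disk.

```python
import tempfile
from pathlib import Path


def ensure_directory(directory: "Path") -> "Path":
    """Create `directory` (and any missing parents) if it does not exist yet."""
    if not directory.exists():
        # exist_ok=True avoids an error if another process creates it first.
        directory.mkdir(parents=True, exist_ok=True)
    return directory


# Usage inside a throwaway temp directory (hypothetical sub-path).
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_directory(Path(tmp) / "reports" / "daily")
    created = target.is_dir()
```

Since `mkdir(parents=True, exist_ok=True)` is already safe to call on an existing directory, the explicit `exists()` check is kept here only to mirror the stated preference.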
Testing & Debugging:
- Simple Test Cases: Prefers simple functional tests added after the `__main__` block.
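
A small sketch of this testing style, reading "after the `__main__` block" as "run under the `if __name__ == "__main__":` guard", which is an interpretation rather than something the original states. `normalize_name` is a hypothetical function under test.

```python
def normalize_name(raw: str) -> str:
    """Trim whitespace, collapse inner runs of spaces, and lowercase a name."""
    return " ".join(raw.split()).lower()


def test_normalize_name() -> None:
    """Plain assert-based checks; no test framework required."""
    assert normalize_name("  Alice   Smith ") == "alice smith"
    assert normalize_name("BOB") == "bob"
    assert normalize_name("") == ""


if __name__ == "__main__":
    # Running the script directly executes the functional tests.
    test_normalize_name()
    print("All tests passed.")
```

Keeping the checks in a named function means they can later be collected by a test runner without being rewritten.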
Performance Optimization:
- Execution Efficiency: Focuses on optimizing code execution, especially with large datasets.
- Libraries: Frequently uses `Polars` and `Pandas`.
Libraries & Frameworks:
- Data Analysis & Processing: Prefers `Pandas`, `Polars`, and `SQLAlchemy`.
- Task Scheduling: Uses `Apache Airflow` for task automation.
- Machine Learning: Utilizes `scikit-learn` and `XGBoost` when needed.
Asynchronous & Concurrent Processing:
- Asyncio & Multi-threading: Prefers `asyncio` and multi-threading for concurrent processing.
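
A minimal sketch of combining the two approaches, assuming coroutines handle non-blocking work while a `ThreadPoolExecutor` handles blocking calls. The functions `blocking_square`, `async_double`, and `run_all` are hypothetical stand-ins for real I/O.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_square(value: int) -> int:
    """Stand-in for blocking work such as a file read or a database call."""
    time.sleep(0.01)
    return value * value


async def async_double(value: int) -> int:
    """Stand-in for a non-blocking I/O call."""
    await asyncio.sleep(0.01)
    return value * 2


async def run_all(values: "list[int]") -> "tuple[list[int], list[int]]":
    """Run coroutines concurrently and offload blocking calls to threads."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Thread work starts as soon as it is submitted to the executor.
        squared_futures = [loop.run_in_executor(pool, blocking_square, v) for v in values]
        doubled = await asyncio.gather(*(async_double(v) for v in values))
        squared = await asyncio.gather(*squared_futures)
    return doubled, squared


if __name__ == "__main__":
    doubled_values, squared_values = asyncio.run(run_all([1, 2, 3]))
    print(doubled_values, squared_values)
```

`asyncio.gather` returns results in the order the awaitables were passed, so both result lists line up with the input values.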