Commit 8932b29
committed
v8
1 parent 619ac93 commit 8932b29

8 files changed

Lines changed: 174 additions & 83 deletions

README.md

Lines changed: 172 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -1,20 +1,185 @@
1-
# MachineLearningToolKit
2-
Helper functions for all stages of the machine learning cycle.
1+
Welcome to ds11mltoolkit, we are delighted to see you here!
32

4-
# List of functions and methods
3+
Thank you for your interest. We hope this library helps you in your daily work as a **Data Scientist**.
54

6-
## Data collection, loading and pre-processing
5+
![Logotipo](./assets/logo.jpg)
76

7+
[![Powered by NumFOCUS](https://img.shields.io/badge/powered%20by-TheBridge-orange.svg?style=flat&colorA=E1523D&colorB=007D8A)](https://www.thebridge.tech/) ![Powered by NumFOCUS](https://img.shields.io/badge/Contributors-13-orange.svg?style=flat&colorA=E1523D&colorB=007D8A) ![PyPI](https://img.shields.io/pypi/v/ds11mltoolkit.svg)
8+
9+
## Table of contents
10+
- What is ds11mltoolkit?
11+
- How to install ds11mltoolkit
12+
- Dependencies
13+
- Functions and methods
14+
- Data Analysis
15+
- Data visualization and exploration
16+
- Data processing
17+
- Machine Learning
18+
- Github framework
19+
- Contributors
20+
21+
## What is ds11mltoolkit?
22+
23+
It is a Python package that will help you in your first steps as a Data Scientist. *"Faster, cleaner, easier."* From simple databases to complex neural networks, this library will speed up your work in all stages of the machine learning cycle.
24+
25+
## How to install ds11mltoolkit?
26+
27+
Install it as you would any other PyPI package.
28+
29+
```
pip install ds11mltoolkit
```
32+
33+
We suggest importing ds11mltoolkit as `mlt` to make it easier to use:
34+
35+
```
import ds11mltoolkit as mlt
```
38+
39+
## Dependencies
40+
41+
ds11mltoolkit requires these libraries to work properly:
42+
43+
- beautifulsoup4==4.11.1
44+
- imblearn==0.0
45+
- keras==2.11.0
46+
- matplotlib==0.1.6
47+
- nltk==3.8.1
48+
- opencv-python-headless==4.7.0.68
49+
- pandas==1.3.5
50+
- Pillow==9.3.0
51+
- plotly==5.11.0
52+
- requests==2.28.1
53+
- scikit-image==1.0.2
54+
- scikit-learn==0.19.3
55+
- scipy==1.7.3
56+
- seaborn==0.12.1
57+
- selenium==4.7.2
58+
- tensorflow==2.11.0
59+
- wordcloud==1.7.0
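
To see how an environment compares with these pins, something like the following could be used. It is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper names `parse_pin` and `check_pins` are hypothetical and not part of ds11mltoolkit, and only a few pins are copied from the list above for illustration.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# A few pins copied from the dependency list; extend as needed.
PINS = ["pandas==1.3.5", "plotly==5.11.0", "seaborn==0.12.1"]


def parse_pin(pin: str) -> tuple[str, str]:
    """Split a 'name==version' string into (name, version)."""
    name, _, wanted = pin.partition("==")
    return name.strip(), wanted.strip()


def check_pins(pins: list[str]) -> dict[str, str]:
    """Map each package name to 'ok', 'missing', or a mismatch note."""
    report = {}
    for pin in pins:
        name, wanted = parse_pin(pin)
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if installed == wanted:
            report[name] = "ok"
        else:
            report[name] = f"mismatch (installed {installed})"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINS).items():
        print(f"{name}: {status}")
```

The check is an exact string comparison, which matches the `==` pins used in the list; version ranges would need a proper specifier parser.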
60+
61+
## Functions and methods
62+
63+
In the current version, ds11mltoolkit provides around 40 functions, divided into 4 groups:
64+
65+
## Data Analysis
66+
67+
* read_url
68+
* read_csv_zip
69+
* chi_squared_test
870

971
## Data visualisation and exploration
1072

73+
* heatmap
74+
* sunburst
75+
* correl_map_max
76+
* plot_map
77+
* plot_ngram
78+
* wordcloudviz
79+
* plot_cumulative_variance_ratio
80+
* plot_roc_cruve
81+
* plot_multiclass_prediction_image
82+
83+
## Data processing
84+
85+
* list_categorical_columns
86+
* last_columns
87+
* uniq_value
88+
* load_imgs
89+
* class ImageDataGen(ImageDataGenerator) 3-in-1 functions
90+
* clean_text
91+
* processing_model_classification
92+
* replace_convert_numeric
93+
* log_transform_numeric
94+
* add_previous
95+
* _exponential_smooth
96+
* NaN treatment
97+
* convert_to_numeric
98+
* auto_dtype_converter
99+
* winner_loser
100+
* lstm_model
101+
102+
## Machine Learning
103+
104+
* export_model
105+
* import_model
106+
* worst_params
107+
* load_model_zip
108+
* quickregression
109+
* polynomial_features_non_binary
110+
* balance_binary_target
111+
* image_scrap
112+
* create_multiclass_prediction_df
113+
* show_scoring
114+
* predict_model_classification
115+
* Unsupervised KMeans
116+
* UnsupervisedPCA
117+
118+
119+
## Quick example
120+
121+
122+
```
import pandas as pd

df = pd.DataFrame(data={'Cities': ['Madrid', 'Barcelona'],
                        'Teams': ['Team 1', 'Team 2'],
                        'Players': ['Vinicius', 'Pedri'],
                        'Goals': [10, 9]})


def list_categorical_columns(df):
    '''
    Function that returns a list with the names of the categorical columns of a dataframe.

    Parameters
    ----------
    df : dataframe

    Returns
    -------
    features : list of names
    '''
    features = []

    for c in df.columns:
        # A column counts as categorical when its dtype is "object"
        t = str(df[c].dtype)
        if "object" in t:
            features.append(c)
    return features


list_categorical_columns(df)

# output: ['Cities', 'Teams', 'Players']
```
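
The dtype check in the quick example can also be written with pandas' built-in `DataFrame.select_dtypes`. The following is a sketch of that alternative, not the library's implementation; `object_columns` and the `sample` frame are hypothetical names used only for illustration.

```python
import pandas as pd


def object_columns(df: pd.DataFrame) -> list:
    """Return the names of all columns stored with the object dtype."""
    return df.select_dtypes(include="object").columns.tolist()


# Hypothetical sample frame; dtype=object is set explicitly so the result
# does not depend on how a given pandas version infers string columns.
sample = pd.DataFrame({
    "Cities": pd.Series(["Madrid", "Barcelona"], dtype=object),
    "Goals": [10, 9],
})
print(object_columns(sample))  # ['Cities']
```

Unlike the substring test on the dtype name, `select_dtypes` matches on the dtype itself, so it would be the place to add other categorical-like dtypes if needed.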
157+
158+
## Github framework
11159

12-
## Machine Learning Models
160+
![Logotipo](https://github.com/TheBridgeMachineLearningPythonLibrary/MachineLearningToolKit/blob/dev/assets/diagrama.png?raw=true)
13161

162+
## Contributors
14163

15-
## Model Productizing
164+
- [Miguel de Frutos](https://github.com/Migueldfr)
165+
- [Pedro Vergara](https://github.com/pericotronic)
166+
- [Bogdan Radacina](https://github.com/BogdanBoyan92)
167+
- [Sean Stevenson](https://github.com/seenstevo)
168+
- [José Nevado](https://github.com/JNevado81)
169+
- [Celia Cabello](https://github.com/celiacnavarro)
170+
- [Jared Rivas](https://github.com/JaredR33)
171+
- [Nicolás Eyzaguirre](https://github.com/NicolasEyzaguirre)
172+
- [Enrique Moya](https://github.com/3Moya)
173+
- [Javi López](https://github.com/javlopsan)
174+
- [Kyung Min Ohn](https://github.com/exAdun)
175+
- [Leandro Salvado](https://github.com/Lean788)
176+
- [Ramón Fernández](https://github.com/RamonFCerezo)
16177

178+
## License
17179

180+
ds11mltoolkit uses an “Interface-Protection Clause” on top of the MIT license. The library is free to use, for both commercial and non-commercial purposes.
18181

19-
#### Contributors
182+
[See license](https://github.com/TheBridgeMachineLearningPythonLibrary/MachineLearningToolKit/blob/dev/LICENSE.txt)
183+
---
20184

185+
Please don't hesitate to contact us if you have any questions or comments. Thank you for using our library!

dist/ds11mltoolkit-1.7.tar.gz

-22.7 KB
Binary file not shown.

ds11mltoolkit.egg-info/PKG-INFO

Lines changed: 0 additions & 40 deletions
This file was deleted.

ds11mltoolkit.egg-info/SOURCES.txt

Lines changed: 0 additions & 14 deletions
This file was deleted.

ds11mltoolkit.egg-info/dependency_links.txt

Lines changed: 0 additions & 1 deletion
This file was deleted.

ds11mltoolkit.egg-info/requires.txt

Lines changed: 0 additions & 18 deletions
This file was deleted.

ds11mltoolkit.egg-info/top_level.txt

Lines changed: 0 additions & 1 deletion
This file was deleted.

setup.py

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -7,15 +7,15 @@
77
setup(
88
name = 'ds11mltoolkit',
99
packages = ['ds11mltoolkit'],
10-
version = '1.7',
10+
version = '1.8',
1111
license = 'MIT',
1212
description = 'Helper functions for all stages of the machine learning model building process',
1313
long_description = long_description,
1414
long_description_content_type='text/markdown',
1515
author = 'TheBridgeMachineLearningPythonLibrary',
1616
author_email = 'seenstevol@protonmail.com',
1717
url = 'https://github.com/TheBridgeMachineLearningPythonLibrary/MachineLearningToolKit',
18-
download_url = 'https://github.com/TheBridgeMachineLearningPythonLibrary/MachineLearningToolKit/archive/refs/tags/v_1_7.tar.gz',
18+
download_url = 'https://github.com/TheBridgeMachineLearningPythonLibrary/MachineLearningToolKit/archive/refs/tags/v_1_8.tar.gz',
1919
keywords = ['machine learning', 'data visualization', 'data processing', 'sklearn', 'pandas'],
2020
install_requires=['pandas',
2121
'scipy',
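
Both changed lines in this hunk encode the same version number, once in `version` and once in the tag inside `download_url`. One way to keep them in sync is to derive the URL from a single constant. This is a minimal sketch, assuming the `v_<major>_<minor>` tag pattern seen in the diff continues to hold; `VERSION`, `REPO_URL`, and `tag_url` are hypothetical names, not part of the project's `setup.py`.

```python
REPO_URL = "https://github.com/TheBridgeMachineLearningPythonLibrary/MachineLearningToolKit"
VERSION = "1.8"


def tag_url(version: str, repo_url: str = REPO_URL) -> str:
    """Build the archive URL for a release tag such as v_1_8."""
    # Tags in the diff replace the dot with an underscore: 1.8 -> v_1_8
    tag = "v_" + version.replace(".", "_")
    return f"{repo_url}/archive/refs/tags/{tag}.tar.gz"


print(tag_url(VERSION))
```

With such a helper, `setup()` could take `version=VERSION` and `download_url=tag_url(VERSION)`, so a release bump touches one line instead of two.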
