Commit b69376c

documentation
1 parent 2fe3afd commit b69376c

1 file changed

Lines changed: 13 additions & 6 deletions

toolkit/data_processing.py

@@ -125,11 +125,14 @@ def buffdescribe(df, stats=['mean', 'median', 'std']):
 '''
 Function to facilitate a first exploration of a dataframe's data by concentrating the most relevant information

-Params:
+Parameters
+----------
 - df: Dataframe
 - stats: Descriptive statistics to calculate. Default: Mean, Median, and Standard Deviation

-Returns: Dataframe with the following columns:
+Returns
+----------
+Dataframe with the following columns:
 - Column names from the original df
 - Data type of each column
 - Percentage of null values in each column
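
For orientation, the summary this docstring describes could look roughly like the sketch below. It is not the code from toolkit/data_processing.py: the function name `summarize`, the output column names, and the handling of the `stats` argument are all assumptions, and the diff shows only the first three output columns, so anything beyond those is a guess.

```python
import pandas as pd


def summarize(df: pd.DataFrame, stats=("mean", "median", "std")) -> pd.DataFrame:
    """Hypothetical stand-in for buffdescribe; not the toolkit's implementation."""
    # One row per original column: name, dtype, and percentage of nulls.
    summary = pd.DataFrame({
        "column": df.columns,
        "dtype": [str(t) for t in df.dtypes],
        "pct_null": (df.isna().mean() * 100).round(2).to_numpy(),
    })
    # Assumption: the requested statistics apply to numeric columns only;
    # non-numeric columns are left as NaN in those fields.
    numeric = df.select_dtypes(include="number")
    if not numeric.empty and stats:
        agg = numeric.agg(list(stats)).T  # rows: numeric columns, cols: stats
        summary = summary.merge(agg, how="left", left_on="column", right_index=True)
    return summary


example = pd.DataFrame({"a": [1.0, 2.0, None, 4.0], "b": ["x", "y", "z", None]})
result = summarize(example)
```

Here `result` has one row per column of `example`, with the statistics filled in for `a` and left missing for the text column `b`.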
@@ -183,14 +186,16 @@ def clean_text(df, column:str, language:str, target:str, filename:str='data_proc
 '''
 Function to preprocess and clean a dataframe with text as a preliminary step for Natural Language Processing

-Params:
+Parameters
+----------
 - df: Dataframe
 - column: The name of the column in which the text is located (str)
 - language: The language in which the text is written (str) in ENGLISH (e.g. 'spanish', 'english')
 - target: The name of the column in which the target to be predicted is located
 - filename: Name for the processed dataframe to be saved

-Returns:
+Returns
+----------
 - df_processed: Dataframe after cleaning. It contains only the text variable and the target variable
 '''

@@ -244,11 +249,13 @@ def load_imgs(path, im_size:int):
 (e.g. one directory for dog photos and another for cat photos).
 It can be used for both binary and categorical classification.

-Args:
+Parameters
+----------
 - path: Path where the subdirectories with the images are located.
 - im_size: Size to which we want to resize the image (e.g. 32).

-Returns:
+Returns
+----------
 - df: Dataframe with the names of the images and the category to which they belong (target).
 - X_train: Array with the image data loaded after resizing.
 - y_train: Array with the target values.
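
The section headers introduced in this commit resemble the numpydoc layout. As a point of comparison, here is a minimal sketch of that layout as it usually appears in numpydoc examples, where the dashed underline matches the header length (7 dashes under "Returns") and entries are written as `name : type` rather than as `- name:` bullets. The function `load_example` and the helper `split_sections` are hypothetical and exist only for illustration; they are not part of the toolkit.

```python
import inspect
import re


def load_example(path, im_size):
    """Hypothetical function used only to illustrate the docstring layout.

    Parameters
    ----------
    path : str
        Directory containing one subdirectory per class.
    im_size : int
        Side length, in pixels, to resize each image to.

    Returns
    -------
    None
        This placeholder does no work.
    """
    return None


def split_sections(doc):
    """Group docstring lines under each header that is followed by a dashed underline."""
    lines = doc.splitlines()
    sections = {}
    current = None
    i = 0
    while i < len(lines):
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        # A header is a non-empty line whose next line is three or more dashes.
        if lines[i].strip() and re.fullmatch(r"-{3,}", nxt):
            current = lines[i].strip()
            sections[current] = []
            i += 2
            continue
        if current is not None:
            sections[current].append(lines[i])
        i += 1
    return sections


sections = split_sections(inspect.getdoc(load_example))
```

The helper is deliberately lenient about underline length, so it would also accept the 10-dash underline under "Returns" used in the diff above.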
