193 changes: 82 additions & 111 deletions practicals_jn_book/week_3/finalbook.ipynb

Large diffs are not rendered by default.

11 changes: 10 additions & 1 deletion practicals_jn_book/week_4/finalbook.ipynb
Expand Up @@ -617,12 +617,20 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"::::{tip}\n",
"When functions or algorithms have a stochastic(random) component, setting a `random_state` ensures reproducibility of results by initializing the random number generator to a fixed state. This allows consistent outputs across multiple runs of the code.\n",
"::::\n",
"\n",
"Now, that works like a charm, but let's quickly brake down what we actually did based on the provided key-word arguments:\n",
"\n",
"1. We provided our features (`X`) and outcome (`y`) because this is the data that needs to be split in a training- and a test set. \n",
"\n",
"2. We specified a `test_size`. This refers to the relative size of the test-dataset. In this case 0.3 means the test-dataset will contain 30% of the dataset, and the training-dataset will automatically contain 70% of the dataset. \n",
"\n",
"::::{note}\n",
"Depending on the data variance and size of the dataset, a test set size between 0.1 and 0.3 is normally considered.\n",
"::::\n",
"\n",
"3. We specified a random_state. Under the hood the `train_test_split` function randomly draws from the dataset, however, by specificy a random_state we make sure that the function start at the same location every time, so that we have a constant output if we would run our code again. \n",
"\n",
"::::{important}\n",
Expand All @@ -636,6 +644,7 @@
"```\n",
"\n",
"or\n",
"\n",
"```python\n",
"scaler = StandardScaler()\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)\n",
Expand Down Expand Up @@ -870,7 +879,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Taking the next step (bonus)\n",
"## Assignment 8\n",
"\n",
"Now that you have trained simple and polynomial regression models it is time for the next step. The previous task was fairly straightforward, as our dataset only contained a limited number of features that were all linked to the outcome variable. \n",
"\n",