Changes of step_length #67
how does
Replies: 1 comment 2 replies
@Huadangfan For method 0, the step length will decrease at the (n+1)-th iteration if the objective function at the n-th iteration is greater than that at the (n-1)-th iteration. In that case, the step length is too large for the linear approximation (i.e., the first-order Taylor expansion) to hold, so a smaller step length is preferred.

For method 1, the step length will decrease at the (n+1)-th iteration if the gradient at the n-th iteration differs significantly from that at the (n-1)-th iteration. For example, if the angle between the two kernels is greater than the default value of 120 degrees, the model update direction has roughly reversed. In this case, the model update is oscillating, so a smaller step length is preferred.
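The two rules above can be sketched in a few lines of Python. This is only an illustration of the logic described in this reply, not the package's actual code: the function names, the flat-list representation of a kernel, and the `decay` factor of 0.5 are all assumptions made for the example. Only the 120-degree default threshold comes from the text above.

```python
import math


def step_length_method0(step, obj_prev, obj_curr, decay=0.5):
    """Method 0 (sketch): shrink the step if the objective function increased.

    obj_prev / obj_curr are the objective values at iterations n-1 and n.
    `decay` is a hypothetical reduction factor, not a documented default.
    """
    return step * decay if obj_curr > obj_prev else step


def kernel_angle_deg(kernel_prev, kernel_curr):
    """Angle in degrees between two kernels given as flat lists of floats."""
    dot = sum(a * b for a, b in zip(kernel_prev, kernel_curr))
    norm_prev = math.sqrt(sum(a * a for a in kernel_prev))
    norm_curr = math.sqrt(sum(b * b for b in kernel_curr))
    if norm_prev == 0.0 or norm_curr == 0.0:
        return 0.0  # angle is undefined for a zero kernel; treat as "no change"
    cos_angle = max(-1.0, min(1.0, dot / (norm_prev * norm_curr)))
    return math.degrees(math.acos(cos_angle))


def step_length_method1(step, kernel_prev, kernel_curr,
                        max_angle_deg=120.0, decay=0.5):
    """Method 1 (sketch): shrink the step if consecutive kernels nearly reverse.

    max_angle_deg=120.0 mirrors the default mentioned above;
    `decay` is again a hypothetical reduction factor.
    """
    angle = kernel_angle_deg(kernel_prev, kernel_curr)
    return step * decay if angle > max_angle_deg else step
```

For instance, `step_length_method0(0.02, 1.0, 1.2)` halves the step because the objective rose from 1.0 to 1.2, while `step_length_method1(0.02, [1.0, 0.0], [0.0, 1.0])` leaves it unchanged because the two kernels are only 90 degrees apart.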