I need to get the cross-validation statistics explicitly for each split of the (X_test, y_test) data.
So, to try to do so I did:
kf = KFold(n_splits=n_splits)

X_train_tmp = []
y_train_tmp = []
X_test_tmp = []
y_test_tmp = []
mae_train_cv_list = []
mae_test_cv_list = []

for train_index, test_index in kf.split(X_train):

    for i in range(len(train_index)):
        X_train_tmp.append(X_train[train_index[i]])
        y_train_tmp.append(y_train[train_index[i]])

    for i in range(len(test_index)):
        X_test_tmp.append(X_train[test_index[i]])
        y_test_tmp.append(y_train[test_index[i]])

    model.fit(X_train_tmp, y_train_tmp)  # FIT the model = SVR, NN, etc.

    mae_train_cv_list.append(mean_absolute_error(y_train_tmp, model.predict(X_train_tmp)))  # MAE of the train part of the KFold.
    mae_test_cv_list.append(mean_absolute_error(y_test_tmp, model.predict(X_test_tmp)))  # MAE of the test part of the KFold.

    X_train_tmp = []
    y_train_tmp = []
    X_test_tmp = []
    y_test_tmp = []
Is it the proper way of getting the Mean Absolute Error (MAE) for each cross-validation split by using, for instance, KFold?
There are some issues with your approach.
To start with, you certainly don’t have to append the data manually one by one to your training & validation lists (i.e. your two inner for loops); simple indexing will do the job.
Additionally, we normally never compute & report the error of the training CV folds – only the error on the validation folds.
Keeping these in mind, and switching the terminology to “validation” instead of “test”, here is a simple reproducible example using the Boston data, which should be straightforward to adapt to your case:
from sklearn.model_selection import KFold
from sklearn.datasets import load_boston
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

X, y = load_boston(return_X_y=True)
n_splits = 5
kf = KFold(n_splits=n_splits, shuffle=True)

model = DecisionTreeRegressor(criterion='mae')

cv_mae = []

for train_index, val_index in kf.split(X):
    model.fit(X[train_index], y[train_index])
    pred = model.predict(X[val_index])
    err = mean_absolute_error(y[val_index], pred)
    cv_mae.append(err)
after which, your
cv_mae should be something like (details will differ due to the random nature of CV):
[3.5294117647058827, 3.3039603960396042, 3.5306930693069307, 2.6910891089108913, 3.0663366336633664]
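If you also want a single summary figure in addition to the per-fold values, the usual practice is to report the mean and standard deviation across folds; a minimal sketch (using illustrative per-fold numbers like the ones above):

```python
import numpy as np

# Per-fold validation MAE values, as collected in cv_mae above
# (illustrative numbers; yours will differ run to run)
cv_mae = [3.5294, 3.3040, 3.5307, 2.6911, 3.0663]

# Mean +/- std across the folds is the conventional CV summary
print(f"CV MAE: {np.mean(cv_mae):.3f} +/- {np.std(cv_mae):.3f}")
```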
Of course, all this explicit stuff is not really necessary; you could do the job much more simply with
cross_val_score. There is a small catch though:
from sklearn.model_selection import cross_val_score

cv_mae2 = cross_val_score(model, X, y, cv=n_splits, scoring="neg_mean_absolute_error")
cv_mae2
# result
array([-2.94019608, -3.71980198, -4.92673267, -4.5990099 , -4.22574257])
Apart from the negative sign, which is not really an issue, you’ll notice that the variance of the results looks significantly higher compared to our
cv_mae above; the reason is that we didn’t shuffle our data. Unfortunately,
cross_val_score does not provide a shuffling option, so we have to do this manually using
shuffle. Our final code should thus be:
from sklearn.model_selection import cross_val_score
from sklearn.utils import shuffle

X_s, y_s = shuffle(X, y)
cv_mae3 = cross_val_score(model, X_s, y_s, cv=n_splits, scoring="neg_mean_absolute_error")
cv_mae3
# result:
array([-3.24117647, -3.57029703, -3.10891089, -3.45940594, -2.78316832])
which shows significantly less variance between the folds, and is much closer to our initial cv_mae above.
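As an aside, the cv argument of cross_val_score also accepts a splitter object, so the shuffling can alternatively be pushed into KFold itself instead of shuffling the data beforehand. A minimal sketch of this variant (using a synthetic regression dataset and the default tree criterion, so it runs on current scikit-learn releases where load_boston is no longer available):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the Boston data
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

model = DecisionTreeRegressor(random_state=0)

# shuffle=True inside the splitter replaces the manual shuffle() call
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_mae4 = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_absolute_error")
print(cv_mae4)  # five negative MAE values, one per fold
```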