John Richey
2013-05-03 14:41:49 UTC
Hi all -
I am relatively new to the world of machine learning, and I am having a little difficulty interpreting the output of a support vector regression problem. For simplicity, let's say I have 2 variables and 100 subjects. Both variables in my model are continuous.
To make matters a little more complicated, the data were collected at four "sites", and I want to "leave one label out", where the labels correspond to sites, in order to assess whether site has an influence on the predictive model.
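For concreteness, here is roughly how I set up the labels (a minimal sketch; the even split of 25 subjects per site is just a placeholder, not my actual data):

import numpy as np
from sklearn.cross_validation import LeaveOneLabelOut

# hypothetical site IDs: 100 subjects, 25 from each of the 4 sites
labels = np.repeat([1, 2, 3, 4], 25)
lolo = LeaveOneLabelOut(labels)  # each fold holds out all subjects from one site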
Here is the code so far.
from sklearn import svm, cross_validation
from sklearn.cross_validation import LeaveOneLabelOut

# hold out all subjects from one site per fold
lolo = LeaveOneLabelOut(labels)
for train_index, test_index in lolo:
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    clf = svm.SVR()
    clf = clf.fit(X_train, y_train)
    # score of the fitted model on the held-out site
    s = clf.score(X_test, y_test)
    print s
    # separate 3-fold CV (the default) on just the held-out site's data, refitting a clone of clf
    scores = cross_validation.cross_val_score(clf, X_test, y_test)
    print "Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() / 2)
It produces the following output:
0.0343889480748
Accuracy: -0.05 (+/- 0.05)
-0.0786771792262
Accuracy: -0.25 (+/- 0.07)
-0.0871562121791
Accuracy: -0.12 (+/- 0.05)
-0.0496675695436
Accuracy: -0.16 (+/- 0.03)
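My best guess, which I would appreciate someone confirming, is that score for a regressor is the coefficient of determination R^2, i.e. something along the lines of:

from sklearn.metrics import r2_score

# my assumption: this should give the same number as clf.score(X_test, y_test)
print r2_score(y_test, clf.predict(X_test))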
Could someone help me interpret the substantive meaning of the 'score' in an SVR problem? Thanks in advance.