Train a logistic regression model on one stream of data and make predictions
on another stream, where the data streams arrive as text files
into two different directories.
Each row of the text files must be a labeled data point of the form
(y,[x1,x2,x3,...,xn])
where n is the number of features and y is a binary label.
The value of n must be the same for the training and test data.
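
The row format above can be sketched in a few lines of Python. This is only an illustration for preparing or checking input files, not the parser the example itself uses; the helper names format_point and parse_point are hypothetical, and the sketch assumes the binary label is written as 0 or 1.

```python
def format_point(label, features):
    """Render one labeled point as '(y,[x1,x2,...,xn])'."""
    # Assumption: the binary label is encoded as 0 or 1.
    if label not in (0, 1):
        raise ValueError("label must be 0 or 1")
    values = ",".join(str(float(x)) for x in features)
    return "({},[{}])".format(float(label), values)


def parse_point(line, num_features):
    """Split a '(y,[x1,...,xn])' row into (label, features) and check n."""
    line = line.strip()
    if not (line.startswith("(") and line.endswith("])")):
        raise ValueError("row is not of the form (y,[x1,...,xn])")
    # Drop the leading '(' and trailing '])', then split label from features.
    label_text, sep, feature_text = line[1:-2].partition(",[")
    if not sep:
        raise ValueError("row is missing the ',[' separator")
    features = [float(x) for x in feature_text.split(",")]
    if len(features) != num_features:
        raise ValueError("expected {} features, got {}".format(
            num_features, len(features)))
    return float(label_text), features
```

Checking every row against the same num_features before writing a file is one way to make sure the training and test data agree on n.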
To run on your local machine using the two directories trainingDir and testDir,
with updates every 5 seconds, and 2 features per data point, call:
$ bin/run-example mllib.StreamingLogisticRegression trainingDir testDir 5 2
As you add text files to trainingDir, the model will continuously update.
Any time you add text files to testDir, you'll see predictions from the current model.
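
One way to add files while the example is running is sketched below in Python. It assumes the stream picks up files that newly appear in the watched directory and that each file should be complete when it appears, so the sketch writes to a separate staging directory first and then renames the file into place. The function name drop_batch and the staging directory are hypothetical, and the rename is only atomic when both directories are on the same filesystem.

```python
import os
import tempfile


def drop_batch(lines, target_dir, staging_dir):
    """Write rows to a staging file, then rename it into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    os.makedirs(staging_dir, exist_ok=True)
    # Create a uniquely named file outside the watched directory.
    fd, staged_path = tempfile.mkstemp(suffix=".txt", dir=staging_dir)
    with os.fdopen(fd, "w") as handle:
        for line in lines:
            handle.write(line + "\n")
    final_path = os.path.join(target_dir, os.path.basename(staged_path))
    # Assumption: staging_dir and target_dir share a filesystem, so the
    # rename makes the finished file appear in a single step.
    os.replace(staged_path, final_path)
    return final_path
```

For instance, calling drop_batch(["(1.0,[0.5,2.0])", "(0.0,[1.5,-3.0])"], "trainingDir", "staging") would place a two-row file in trainingDir; the same call with testDir would supply rows for prediction.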
Usage: StreamingLogisticRegression <trainingDir> <testDir> <batchDuration> <numFeatures>