alipy.experiment.StateIO is a class that saves and loads the intermediate results of your experiment.
This object implements several crucial functions:
- Save intermediate results to files
- Recover the workspace (labeled and unlabeled sets) at any iteration
- Resume the program from the breakpoint if it exits unexpectedly
- Print the active learning progress: current_iteration, current_mean_performance, current_cost, etc.
It is strongly recommended to use this tool class to manage your intermediate results, because many other components in alipy (e.g., Analyser, StoppingCriteria) accept a StateIO object directly. If you are going to use those tool classes too, this saves you the time of converting between data types yourself.
The following tutorial introduces the basic usage of the alipy.experiment.StateIO and alipy.experiment.State classes.
Note that a StateIO object covers a single fold of an experiment, so it needs the data split and the fold number of the current fold when it is initialized:
# split your data first
from alipy.experiment import StateIO, State

saver = StateIO(round=0, train_idx=train_idx[0],
                test_idx=test_idx[0], init_L=label_idx[0],
                init_U=unlabel_idx[0], saving_path='.')
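The snippet above assumes that train_idx, test_idx, label_idx and unlabel_idx already exist, with one entry per fold. As a rough illustration of what such per-fold index lists might look like, here is a minimal standard-library sketch. The helper make_splits and its parameters are hypothetical and are not part of alipy; alipy's own splitting tools may behave differently.

```python
import random


def make_splits(n_samples, n_folds=2, test_ratio=0.3, init_label_ratio=0.1, seed=0):
    """Hypothetical helper (not alipy's API): build one index split per fold."""
    rng = random.Random(seed)
    train_idx, test_idx, label_idx, unlabel_idx = [], [], [], []
    for _ in range(n_folds):
        order = list(range(n_samples))
        rng.shuffle(order)
        # The first part of the shuffled order becomes the test set.
        n_test = int(round(n_samples * test_ratio))
        test, train = order[:n_test], order[n_test:]
        # A small part of the training set starts out labeled.
        n_init = max(1, int(round(len(train) * init_label_ratio)))
        train_idx.append(train)
        test_idx.append(test)
        label_idx.append(train[:n_init])
        unlabel_idx.append(train[n_init:])
    return train_idx, test_idx, label_idx, unlabel_idx


train_idx, test_idx, label_idx, unlabel_idx = make_splits(n_samples=100)
```

With lists like these, train_idx[0], test_idx[0], label_idx[0] and unlabel_idx[0] describe fold 0, which is the fold passed to the StateIO object above.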
To add a query to the StateIO object, you must wrap it in an alipy.experiment.State object, a dict-like container that stores the necessary information about one query (the state of the current iteration), such as the cost, performance, selected indexes, and so on. The queried indexes and the performance are required when initializing a State object; cost and queried_label are optional:
st = State(select_index=select_ind, performance=accuracy,
           cost=cost, queried_label=queried_label)
You can also add other entries as needed:
st.add_element(key='my_entry', value=my_value)
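To make the idea of a dict-like container with two required entries more concrete, here is a small stand-in written from scratch. MiniState is a hypothetical class for illustration only; it is not alipy's State and makes no claim about how State is implemented. It only mirrors the method names used in this tutorial.

```python
class MiniState(dict):
    """Illustrative stand-in for a dict-like query record (NOT alipy's State)."""

    def __init__(self, select_index, performance, cost=None, queried_label=None):
        # The selected indexes and the performance are always stored.
        super().__init__(select_index=list(select_index), performance=performance)
        # Optional entries are stored only when they are provided.
        if cost is not None:
            self["cost"] = cost
        if queried_label is not None:
            self["queried_label"] = queried_label

    def add_element(self, key, value):
        """Attach an extra, user-defined entry to this record."""
        self[key] = value

    def get_value(self, key):
        """Read an entry; equivalent to the index operation state[key]."""
        return self[key]


st = MiniState(select_index=[12, 47], performance=0.81)
st.add_element(key="my_entry", value="batch-A")
```

Because the record behaves like a dictionary, extra entries sit next to the required ones and are read back in the same way.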
After putting all useful information into a State object, add the state to the StateIO object and call the save() method to write the intermediate results to a file:
saver.add_state(st)
saver.save()
To inspect previous queries for analysis, you can retrieve any past query by its index:
prev_st = saver.get_state(index=1) # get 2nd query
# or use the index operation directly
prev_st = saver[1]
You can get the values in a State object in a similar way:
value = prev_st.get_value(key='select_index')
# or use the index operation directly
value = prev_st['select_index']
You can recover the StateIO object to any past state. For example, if you have already queried 10 times and want to go back to the workspace (label and unlabel sets) as it was after only 2 queries, you can call the get_workspace(iteration) or recover_workspace(iteration) method. The former returns the train, test, label, and unlabel indexes of the given iteration and leaves the object itself unchanged. The latter restores the object itself to the given iteration, discarding all information recorded after it.
train, test, L, U = saver.get_workspace(iteration=2)
# or recover the saver itself
saver.recover_workspace(iteration=2)
The iteration parameter is the number of queries that have been performed in the workspace you want to recover.
For example, if 0 is given, the initial workspace without any querying will be recovered.
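One way to picture workspace recovery is as a replay: start from the initial label and unlabel sets and re-apply the indexes selected by the first few queries. The sketch below illustrates that idea with plain lists. The function workspace_after and the variable names are hypothetical, and this is only an assumption about the general technique, not a description of alipy's internals.

```python
def workspace_after(init_L, init_U, selected_per_query, iteration):
    """Replay the first `iteration` queries on top of the initial sets."""
    if not 0 <= iteration <= len(selected_per_query):
        raise ValueError("iteration must be between 0 and the number of stored queries")
    labeled = list(init_L)
    unlabeled = list(init_U)
    for selected in selected_per_query[:iteration]:
        for idx in selected:
            unlabeled.remove(idx)  # a queried index leaves the unlabeled pool
            labeled.append(idx)    # and joins the labeled set
    return labeled, unlabeled


init_L, init_U = [0, 1], [2, 3, 4, 5, 6]
history = [[3], [5], [2]]  # indexes selected by queries 1, 2 and 3

# Looking up a past workspace does not touch the stored history.
L_after_2, U_after_2 = workspace_after(init_L, init_U, history, iteration=2)
L_initial, U_initial = workspace_after(init_L, init_U, history, iteration=0)

# A destructive recovery would additionally drop everything after the iteration.
truncated_history = history[:2]
```

Here iteration=2 yields the label set [0, 1, 3, 5] and the unlabel set [2, 4, 6], while iteration=0 returns the initial sets untouched.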
If your experiment exits unexpectedly, you can load the saved StateIO binary file to resume your program without re-running the previous queries.
saver = StateIO.load(path='./AL_round_0.pkl')
train, test, L, U = saver.get_workspace() # will return the latest workspace
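The general save-after-every-query, resume-on-restart pattern can be sketched with the standard library alone. The example below assumes a pickle-style checkpoint, which the .pkl extension suggests but this tutorial does not confirm. The function run_queries, the crash_after switch and the placeholder performance value are all hypothetical and exist only to simulate an unexpected exit.

```python
import os
import pickle
import tempfile


def run_queries(checkpoint_path, total_queries, crash_after=None):
    """Run a toy query loop that checkpoints after every query."""
    # Resume from the checkpoint if one exists, otherwise start fresh.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, "rb") as f:
            history = pickle.load(f)
    else:
        history = []
    while len(history) < total_queries:
        if crash_after is not None and len(history) == crash_after:
            raise RuntimeError("simulated unexpected exit")
        # Placeholder record; a real loop would store the actual query result.
        history.append({"select_index": [len(history)], "performance": 0.5})
        # Write to a temporary file and rename it, so an interruption
        # never leaves a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(history, f)
        os.replace(tmp_path, checkpoint_path)
    return history


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "checkpoint.pkl")

try:
    run_queries(path, total_queries=5, crash_after=3)
except RuntimeError:
    pass  # the simulated crash; three queries are already on disk

with open(path, "rb") as f:
    queries_before_resume = len(pickle.load(f))

# The second run picks up after the third query instead of starting over.
resumed = run_queries(path, total_queries=5)
```

Saving after every query keeps the amount of lost work to at most one query, at the cost of one file write per iteration.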
Copyright © 2018, alipy developers (BSD 3 License).