query_strategy.cost_sensitive.QueryCostSensitivePerformance

query_strategy.cost_sensitive.QueryCostSensitivePerformance(X, y)

Selects the most uncertain instance-label pairs whose total query cost stays within the budget.

Methods

query_strategy.cost_sensitive.QueryCostSensitivePerformance.__init__

query_strategy.cost_sensitive.QueryCostSensitivePerformance.__init__(X=None, y=None)
Parameters:
X: 2D array
Feature matrix of the whole dataset. It is stored by reference, so no additional memory is used.
y: array-like
Label matrix of the whole dataset. It is stored by reference, so no additional memory is used.
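
The note about references can be illustrated with a minimal sketch. DatasetHolder below is a hypothetical stand-in written only for this illustration; it is not part of the library and does not reproduce what the real constructor does internally. It only shows that storing a NumPy array as an attribute keeps a reference instead of copying the data.

```python
import numpy as np


class DatasetHolder:
    """Hypothetical stand-in for an object that keeps X and y by reference."""

    def __init__(self, X=None, y=None):
        # Plain attribute assignment stores a reference; no data is copied.
        self.X = X
        self.y = y


X = np.random.rand(100, 5)                    # 100 instances, 5 features
y = np.random.randint(0, 2, size=(100, 3))    # 100 instances, 3 labels
holder = DatasetHolder(X, y)

# Both checks confirm that the holder and the caller share the same buffers.
shares = np.shares_memory(holder.X, X) and np.shares_memory(holder.y, y)
print(shares)
```

A practical consequence is that modifying X or y in place after construction would also change what the holder sees.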

query_strategy.cost_sensitive.QueryCostSensitivePerformance.select

query_strategy.cost_sensitive.QueryCostSensitivePerformance.select(label_index, unlabel_index, oracle, cost, budget=40, basemodel=None, models=None)

Selects the most uncertain instance-label pairs whose total query cost stays within the budget.

Parameters:
label_index: MultiLabelIndexCollection
The indexes of the labeled samples. It should be a 1D array of indexes (column-major, starting from 0),
a MultiLabelIndexCollection, or a list of 2-element tuples in which
the 1st element is the index of the instance and the 2nd element is the index of the label.
unlabel_index: MultiLabelIndexCollection
The indexes of the unlabeled samples. It should be a 1D array of indexes (column-major, starting from 0),
a MultiLabelIndexCollection, or a list of 2-element tuples in which
the 1st element is the index of the instance and the 2nd element is the index of the label.
oracle: Oracle, (default=None)
The oracle that labels the queried instance-label pairs. It can also report the cost of
each corresponding label, so it must hold at least the label and cost information, e.g.
Oracle(labels=labels, cost=cost)
cost: np.array, (default=None), shape [1, n_classes] or [n_classes]
The cost of querying each class. If not provided, every cost is set to 1.
budget: int, optional (default=40)
The total query cost allowed for the selection. If the cost of every label is 1, the budget is equivalent to batch_size.
models: object, optional (default=None)
Current classification model. It should have a 'predict_proba' method for probabilistic output.
If not provided, the model is built from basemodel.
basemodel: object, optional (default=None)
The classification model used for each label when models is not provided. It is used to build
a classification model for the multi-label task. If not provided, an SVM with default parameters
as implemented by sklearn is used.
Returns:
selected_ins_lab_pair: list
A list of tuples that contains the indexes of selected instance-label pairs.
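
The index formats and the budget constraint described above can be sketched with NumPy alone. This is a simplified illustration under stated assumptions, not the library's implementation: to_pairs and select_within_budget are hypothetical helpers, uncertainty is assumed to be the distance of a predicted probability from 0.5, and a plain greedy pass stands in for whatever selection procedure the library actually uses.

```python
import numpy as np


def to_pairs(flat_indexes, n_instances, n_labels):
    """Convert column-major flat indexes into (instance, label) tuples."""
    rows, cols = np.unravel_index(
        flat_indexes, (n_instances, n_labels), order="F"
    )
    return [(int(r), int(c)) for r, c in zip(rows, cols)]


def select_within_budget(proba, unlabeled_pairs, cost, budget=40):
    """Greedy stand-in: take the most uncertain pairs the budget allows."""
    cost = np.asarray(cost, dtype=float).ravel()  # accepts [1, n] or [n]
    # Assumption: a probability close to 0.5 means high uncertainty.
    ranked = sorted(
        unlabeled_pairs, key=lambda p: abs(proba[p[0], p[1]] - 0.5)
    )
    selected, spent = [], 0.0
    for ins, lab in ranked:
        # Skip any pair whose label cost would exceed the remaining budget.
        if spent + cost[lab] <= budget:
            selected.append((ins, lab))
            spent += cost[lab]
    return selected


# Hypothetical predicted probabilities for 3 instances and 2 labels.
proba = np.array([[0.50, 0.90],
                  [0.05, 0.45],
                  [0.70, 0.60]])
unlabeled = to_pairs([0, 1, 2, 3, 4, 5], n_instances=3, n_labels=2)
picked = select_within_budget(proba, unlabeled, cost=[1, 2], budget=4)
print(picked)  # [(0, 0), (1, 1), (2, 0)]
```

With cost=[1, 1] and budget=2 the same helper returns exactly two pairs, which mirrors the remark that a unit cost per label turns the budget into a batch size.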

Copyright © 2018, alipy developers (BSD 3 License).