Modified by HUI, 2017
======================================
WHAT'S NEW
- [sample] Col_sample and Row Sample
- [Risk] Three kinds of risk metrics
- [bugfix] RankLib crashes when #features < #threads with some algorithms.
- [bugfix] Normalization issue with RankBoost (the command "-test <test-data> -load <RB-model-file> -norm zscore/sum" incorrectly normalizes the test data).
- [bugfix] Reciprocal rank is always measured off the entire ranked list, regardless of the specified cut-off point.
- [bugfix] The change in MAP when swapping two documents in a ranked list was computed incorrectly. This only affects LambdaMART trained with MAP. It doesn't seem to have any significant impact on model effectiveness though.
- Slight efficiency improvement.
- RankLib can now be used to evaluate (well, kind of) models trained using other LTR software packages (e.g. SVM-Rank)
- Siddhartha Bagaria's patch for sparse dataset is included but not yet fully enabled since it's not thread-safe.
- Linear regression (L2 norm) should now work properly
========================
v 2.2
------
- Proper support for k-fold cross-validation (cv models can now be... saved!!! support random partitions in addition to sequential partitions)
- Help compare different models (i.e. how much gain? win/loss? statistically significant? improvement/hurt analysis).
- Add "linear" (Min/Max) feature normalization. ( q->D={d1,d2,...,d_n}; F={f1,f2,...,fm}; fk_di = (fk_di - min_fk_all_d_in_D) / (max_fk_all_d_in_D - min_fk_all_d_in_D) ).
- Coordinate Ascent now takes into account validation data just like all other algorithms (though Coordinate Ascent usually works well without validation data)
- Default value for #nRestart (Coordinate Ascent) is changed to 5 (increasing it might give better results, but training will take longer)
- [Bugfix] Feature normalization: RankLib sometimes doesn't do normalization even when it is told to do so.
- [Bugfix] Certain combinations of #features/#samples and #cpu-cores screwed up my multi-threaded implementation of LambdaMART, causing the algorithm to underperform.
- [Bugfix] LambdaMART's occasional crashes.
- [Bugfix] RankLib crashes if the comment text associated with each feature vector contains additional "#" (other than the "#" used to specify a comment)
- Internal class/package re-arrangement (expect minor code changes if you're using RankLib code programmatically)
(personal reminder)
- [Beta] Sparse feature vector: work properly, but its benefit has NOT been evaluated (slow down vs. memory gain? -- might be useless at the moment!)
- [Beta] Added L2 linear regression (NOT yet tested).
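The "linear" (Min/Max) normalization listed in the v 2.2 notes above can be sketched in a few lines of Python. This is only an illustration of the formula, not RankLib's own (Java) code: the function name linear_normalize and the (qid, features) row layout are hypothetical, and mapping a constant feature to 0.0 is an assumption made here to avoid dividing by zero.

```python
from collections import defaultdict


def linear_normalize(rows):
    """Scale each feature to [0, 1] within each query.

    rows: list of (qid, [feature values]) pairs (hypothetical layout).
    Returns a new list in the same order with scaled feature values.
    """
    by_query = defaultdict(list)
    for qid, feats in rows:
        by_query[qid].append(feats)

    # For every query, record (min, max) of each feature over its documents.
    bounds = {}
    for qid, vectors in by_query.items():
        columns = list(zip(*vectors))
        bounds[qid] = [(min(col), max(col)) for col in columns]

    normalized = []
    for qid, feats in rows:
        scaled = []
        for value, (lo, hi) in zip(feats, bounds[qid]):
            # Assumption: a feature that is constant within a query maps to 0.0.
            scaled.append((value - lo) / (hi - lo) if hi > lo else 0.0)
        normalized.append((qid, scaled))
    return normalized
```

Note that the minimum and maximum are taken over the documents of a single query, not over the whole file, which matches the D = {d1, ..., d_n} notation in the formula.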
========================
v 2.1
------
- Add ListNet.
- Add Random Forest.
- With a little manual work, it can do BagBoo/Bagging LambdaMART too.
- For my personal use only: add support for
(1) external relevance judgment file [-qrel]
(2) output ranking in indri run file format (not exposed via cmd parameters) [-indri][requires doc-ID stored for each feature vector]
(3) ignore ranked list without any relevant document [-hr]
========================
v 2.0
------
- Add MART
- Add LambdaMART
- Change the calculation of NDCG to the standard version: (2^{rel_i} - 1) / log_{2} (i+1). Therefore, the absolute NDCG score might be slightly lower than before.
- Add zscore normalization.
- Fix the divide-by-zero bug related to the sum normalization ( q->D={d1,d2,...,d_n}; F={f1,f2,...,fm}; fk_di = fk_di / sum_{dj \in D} |fk_dj| ).
(I do not claim that these normalization methods are good -- in fact, I think it's a better idea for you to normalize your own data using your favorite method)
- Add the ability to split the training file to x% train and (100-x)% validation (previous version only allows train/test split, not train/validation).
- Add some minor cmd-line parameters.
- Some cmd-line parameter strings have been changed.
- Internal code clean up for slight improvement in efficiency/speed.
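For reference, the "standard" NDCG gain and discount mentioned in the v 2.0 notes above, (2^{rel_i} - 1) / log_{2} (i+1), can be sketched as follows. This is a minimal Python illustration, not RankLib's implementation; the function names are hypothetical, and returning 0.0 when the ideal DCG is zero is an assumption.

```python
import math


def dcg_at_k(rels, k):
    """DCG@k with gain 2^rel - 1 and discount log2(i + 1), ranks starting at 1."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(rels[:k], start=1)
    )


def ndcg_at_k(rels, k):
    """NDCG@k: DCG of the given order divided by DCG of the ideal order."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    # Assumption: a list with no relevant documents scores 0.0.
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0
```

Here rels is the list of relevance labels in ranked order, so ndcg_at_k([3, 2, 0], 3) is 1.0 because the list is already ideally ordered.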
========================
v 1.2.1
------
- Fix the error with sparse train/test/validate file (with v 1.1, when we do not specify feature whose value is 0, the system crashes in some cases)
- Speed up RankNet using batch learning + add some tricks (see the LambdaRank paper for details).
- Change default epochs to 50 for RankNet.
- Fix a bug related to RankBoost not dealing properly with features whose values are negative.
========================
v 1.1
------
- Change data types in some classes to reduce memory use, so this version can work with larger datasets.
- Rearrange packages
- Change some function names
========================
v 1.0
------
This is the first version of RankLib.
======================================
1. OVERVIEW
RankLib is a library for comparing different ranking algorithms. In the current version:
- Algorithms: MART, RankNet, RankBoost, AdaRank, Coordinate Ascent, LambdaMART, ListNet and Random Forests.
- Training data: RankLib lets users:
+ Specify train/test data separately
+ Automatically split a single input file into train/test sets
+ Do k-fold cross validation (only sequential split at the moment, NO RANDOM SPLIT)
+ Specify a validation set to guide the training process. RankLib then picks the model that performs best on the validation data rather than the one that performs best on the training data. This is useful for algorithms that overfit easily, such as RankNet.
+ ...
- Evaluation metrics: MAP, NDCG@k, DCG@k, P@k, RR@k, ERR@k
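Of the metrics listed, ERR@k is the one that depends on the highest relevance grade (the -gmax parameter in section 2). A minimal Python sketch, assuming the usual definition of Expected Reciprocal Rank in which a document with grade g satisfies the user with probability (2^g - 1) / 2^gmax; the function name err_at_k is hypothetical and this is not RankLib's own code:

```python
def err_at_k(rels, k, gmax=4):
    """Expected Reciprocal Rank at cut-off k for labels given in ranked order."""
    score = 0.0
    p_continue = 1.0  # probability the user has not stopped before this rank
    for rank, rel in enumerate(rels[:k], start=1):
        p_satisfied = (2 ** rel - 1) / (2 ** gmax)
        score += p_continue * p_satisfied / rank
        p_continue *= 1.0 - p_satisfied
    return score
```

With the default gmax=4, a single document of grade 4 at rank 1 gives 15/16 = 0.9375; the same labels evaluated with a smaller gmax yield higher scores, which is why gmax should match the label scale of the data.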
===============================================================================================================================================
2. HOW TO USE
2.1. Binary
Usage: java -jar RankLib.jar <Params>
Params:
[+] Training (+ tuning and evaluation)
-train <file> Training data
-ranker <type> Specify which ranking algorithm to use
0: MART (gradient boosted regression tree)
1: RankNet
2: RankBoost
3: AdaRank
4: Coordinate Ascent
6: LambdaMART
7: ListNet
8: Random Forests
[ -feature <file> ] Feature description file: list features to be considered by the learner, each on a separate line
If not specified, all features will be used.
[ -metric2t <metric> ] Metric to optimize on the training data. Supported: MAP, NDCG@k, DCG@k, P@k, RR@k, ERR@k (default=ERR@10)
[ -metric2T <metric> ] Metric to evaluate on the test data (default to the same as specified for -metric2t)
[ -gmax <label> ] Highest judged relevance label. It affects the calculation of ERR (default=4, i.e. 5-point scale {0,1,2,3,4})
[ -test <file> ] Specify if you want to evaluate the trained model on this data (default=unspecified)
[ -validate <file> ] Specify if you want to tune your system on the validation data (default=unspecified)
If specified, the final model will be the one that performs best on the validation data
[ -tvs <x \in [0..1]> ] Set train-validation split to be (x)(1.0-x)
[ -tts <x \in [0..1]> ] Set train-test split to be (x)(1.0-x). -tts will override -tvs
[ -kcv <k> ] Specify if you want to perform k-fold cross validation using ONLY the specified training data (default=NoCV)
[ -norm <method>] Normalize feature vectors (default=no-normalization). Method can be:
sum: normalize each feature by the sum of all its values
zscore: normalize each feature by its mean/standard deviation
linear: normalize each feature by its min/max values
[ -save <model> ] Save the learned model to the specified file (default=not-save)
[ -silent ] Do not print progress messages (which are printed by default)
[-] RankNet-specific parameters
[ -epoch <T> ] The number of epochs to train (default=100)
[ -layer <layer> ] The number of hidden layers (default=1)
[ -node <node> ] The number of hidden nodes per layer (default=10)
[ -lr <rate> ] Learning rate (default=0.00005)
[-] RankBoost-specific parameters
[ -round <T> ] The number of rounds to train (default=300)
[ -tc <k> ] Number of threshold candidates to search. -1 to use all feature values (default=10)
[-] AdaRank-specific parameters
[ -round <T> ] The number of rounds to train (default=500)
[ -noeq ] Train without enqueuing too-strong features (default=unspecified)
[ -tolerance <t> ] Tolerance between two consecutive rounds of learning (default=0.0020)
[ -max <times> ] The maximum number of times a feature can be consecutively selected without changing performance (default=5)
[-] Coordinate Ascent-specific parameters
[ -r <k> ] The number of random restarts (default=5)
[ -i <iteration> ] The number of iterations to search in each dimension (default=25)
[ -tolerance <t> ] Performance tolerance between two solutions (default=0.0010)
[ -reg <slack> ] Regularization parameter (default=no-regularization)
[-] {MART, LambdaMART}-specific parameters
[ -tree <t> ] Number of trees (default=1000)
[ -leaf <l> ] Number of leaves for each tree (default=10)
[ -shrinkage <factor> ] Shrinkage, or learning rate (default=0.1)
[ -tc <k> ] Number of threshold candidates for tree splitting. -1 to use all feature values (default=256)
[ -mls <n> ] Min leaf support -- minimum #samples each leaf has to contain (default=1)
[ -estop <e> ] Stop early when no improvement is observed on validation data in e consecutive rounds (default=100)
[ -col_sample <f> ] Feature sampling when splitting a node (default=1.0).
[ -row_sample <f> ] Instance sampling when creating a tree (default=1.0).
[ -alpha <f> ] risk-sensitivity parameter that controls the tradeoff between risk and reward (default=0.0).
[ -risk_type <t> ] risk-sensitivity type urisk:0 -- SARO:1 -- FARO:2 (default=0).
[-] ListNet-specific parameters
[ -epoch <T> ] The number of epochs to train (default=1500)
[ -lr <rate> ] Learning rate (default=0.00001)
[-] Random Forests-specific parameters
[ -bag <r> ] Number of bags (default=300)
[ -srate <r> ] Sub-sampling rate (default=1.0)
[ -frate <r> ] Feature sampling rate (default=0.3)
[ -rtype <type> ] Ranker to bag (default=0, i.e. MART)
[ -tree <t> ] Number of trees in each bag (default=1)
[ -leaf <l> ] Number of leaves for each tree (default=100)
[ -shrinkage <factor> ] Shrinkage, or learning rate (default=0.1)
[ -tc <k> ] Number of threshold candidates for tree splitting. -1 to use all feature values (default=256)
[ -mls <n> ] Min leaf support -- minimum #samples each leaf has to contain (default=1)
[+] Testing previously saved models
-load <model> The model to load
-test <file> Test data to evaluate the model (specify either this or -rank but not both)
-rank <file> Rank the samples in the specified file (specify either this or -test but not both)
[ -metric2T <metric> ] Metric to evaluate on the test data (default=ERR@10)
[ -gmax <label> ] Highest judged relevance label. It affects the calculation of ERR (default=4, i.e. 5-point scale {0,1,2,3,4})
[ -score <file>] Store ranker's score for each object being ranked (has to be used with -rank)
[ -idv ] Print model performance (in test metric) on individual ranked lists (has to be used with -test)
[ -norm ] Normalize feature vectors (similar to -norm for training/tuning)
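As an illustration of how the parameters above combine, the following Python sketch assembles a training command line as an argument list. Only flags documented above are used; the helper name ranklib_train_command and all file names are hypothetical placeholders, and the jar is assumed to be in the current directory.

```python
def ranklib_train_command(train_file, ranker, metric="NDCG@10",
                          validate_file=None, model_file=None,
                          jar="RankLib.jar"):
    """Build the argument list for a RankLib training run."""
    cmd = ["java", "-jar", jar,
           "-train", train_file,
           "-ranker", str(ranker),
           "-metric2t", metric]
    if validate_file is not None:
        # The final model is then the one that does best on this data.
        cmd += ["-validate", validate_file]
    if model_file is not None:
        cmd += ["-save", model_file]
    return cmd


# Example: LambdaMART (ranker 6) optimizing NDCG@10, with placeholder files.
command = ranklib_train_command("train.txt", 6,
                                validate_file="vali.txt",
                                model_file="model.txt")
```

The resulting list can be passed to subprocess.run(command, check=True) or joined with spaces and typed in a shell. A saved model can later be evaluated with -load and -test as described above.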
2.2. Build
An ant xml config. file is included. Make sure you have ant on your machine. Just type "ant" and you are good to go.
==================================================================
3. FILE FORMAT (TRAIN/TEST/VALIDATION)
The training, test, and validation files all use the same format as SVM-Rank (http://www.cs.cornell.edu/People/tj/svm_light/svm_rank.html). This is also the format used in the LETOR datasets. Each line represents one training example and has the following format:
<line> .=. <target> qid:<qid> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
<target> .=. <float>
<qid> .=. <positive integer>
<feature> .=. <positive integer>
<value> .=. <float>
<info> .=. <string>
Here's an example (taken from the SVM-Rank website). Note that everything after "#" is discarded.
3 qid:1 1:1 2:1 3:0 4:0.2 5:0 # 1A
2 qid:1 1:0 2:0 3:1 4:0.1 5:1 # 1B
1 qid:1 1:0 2:1 3:0 4:0.4 5:0 # 1C
1 qid:1 1:0 2:0 3:1 4:0.3 5:0 # 1D
1 qid:2 1:0 2:0 3:1 4:0.2 5:0 # 2A
2 qid:2 1:1 2:0 3:1 4:0.4 5:0 # 2B
1 qid:2 1:0 2:0 3:1 4:0.1 5:0 # 2C
1 qid:2 1:0 2:0 3:1 4:0.2 5:0 # 2D
2 qid:3 1:0 2:0 3:1 4:0.1 5:1 # 3A
3 qid:3 1:1 2:1 3:0 4:0.3 5:0 # 3B
4 qid:3 1:1 2:0 3:0 4:0.4 5:1 # 3C
1 qid:3 1:0 2:1 3:1 4:0.5 5:0 # 3D
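To make the grammar concrete, here is a minimal Python sketch of a parser for one line of this format. It is an illustration only, not RankLib's reader: the function name parse_line and the returned tuple layout are hypothetical, and the qid is kept as a string for simplicity.

```python
def parse_line(line):
    """Parse '<target> qid:<qid> <feature>:<value> ... # <info>'.

    Returns (target, qid, features, info), where features maps
    feature id (int) to value (float).
    """
    # Split at the first "#" only, so extra "#" characters stay in the info text.
    body, _, info = line.partition("#")
    tokens = body.split()
    target = float(tokens[0])

    key, _, qid = tokens[1].partition(":")
    if key != "qid":
        raise ValueError("expected 'qid:<qid>' as the second field")

    features = {}
    for token in tokens[2:]:
        fid, _, value = token.partition(":")
        features[int(fid)] = float(value)
    return target, qid, features, info.strip()
```

For the first example line above, parse_line returns the target 3.0, the qid "1", the five feature values keyed 1 through 5, and the info string "1A". Features missing from a line are simply absent from the dictionary, which suits sparse input.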