Relation extraction has advanced considerably in recent years, with supervised learning methods based on neural network models making the most progress ([
Active learning methods, which present selected examples from the unlabeled data for the developer to label, can reduce the amount of annotation required.
This has led in practice to the continued use of manually-curated rule-based systems, which are more inspectable and thus give the developer more control [
We utilize lexicalized dependency paths as the representation for our active learning system. The lexicalized dependency path (LDP) was proposed in [
Reference [
LDP [
The
Extending the dependency path in this way has several advantages over its predecessors. First, it creates a rule-based system that is unified and easily understood, and it allows a natural extension to other forms of learning, such as active learning. Second, the representation lets us dissect a sentence into a number of LDPs, allowing a supervised trainer to be used.
To combine coarse and fine semantic classes, we distinguish LDPs formed between pairs of entity-type arguments from those formed between their subtype arguments. Given a sentence with a pair of entity mentions, we can proceed in three ways: we can create a type LDP from the types of the entity mentions; we can create a subtype LDP from their subtypes; or we can create both type and subtype LDPs. In the last case, a test LDP may have two matches against the model, one matching types and one matching subtypes; the subtype match then takes precedence. This makes it possible to have general rules and more specific exceptions. Supervised learning of LDPs is shown to be effective in [
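The subtype-precedence rule can be sketched as a two-level lookup. All names and example paths below are illustrative, not taken from the system:

```python
# Sketch of type/subtype LDP matching with subtype precedence.
# A model is a mapping from an LDP string to a predicted relation label.
# The example paths and labels below are illustrative, not the paper's.

def classify(type_ldp, subtype_ldp, type_model, subtype_model):
    """Return the relation label for a mention pair.

    A subtype-level match overrides a type-level match, so general
    type rules can coexist with more specific subtype exceptions.
    """
    if subtype_ldp in subtype_model:   # specific rule wins
        return subtype_model[subtype_ldp]
    if type_ldp in type_model:         # fall back to the general rule
        return type_model[type_ldp]
    return "no relation"

# General rule: PERSON--prep_of--ORGANIZATION signals ORG-AFF ...
type_model = {"PERSON--prep_of--ORGANIZATION": "ORG-AFF"}
# ... with a hypothetical subtype-level exception overriding it.
subtype_model = {"PER.Individual--prep_of--ORG.Sports": "no relation"}

print(classify("PERSON--prep_of--ORGANIZATION",
               "PER.Individual--prep_of--ORG.Sports",
               type_model, subtype_model))  # no relation
```

Because the subtype model is consulted first, the general type rule fires only when no specific exception applies.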
Active learning (AL) involves a tight coupling of developer and machine. At each cycle of the learning process, the machine selects from an unlabeled training set one or more items and the developer labels them. The selection criteria vary, but basically try to maximize the expected value of the improvement of the accuracy of the model being trained. In our case the items are LDPs and the labels are the relation types or “no relation”.
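The cycle just described can be sketched as a simple loop; the ranker, developer oracle, and trainer below are illustrative placeholders, not the system's actual components:

```python
# Minimal active-learning loop sketch: at each cycle the machine ranks
# the unlabeled items (here, LDPs), the developer labels the top batch,
# and the model is retrained on all labels collected so far.

def active_learning(unlabeled, rank, ask_developer, train,
                    batch_size=1, cycles=10):
    labeled, model = [], None
    for _ in range(cycles):
        if not unlabeled:
            break
        # Select the items expected to improve the model most.
        unlabeled = sorted(unlabeled, key=rank, reverse=True)
        batch, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        # Labels are relation types or "no relation".
        labeled += [(ldp, ask_developer(ldp)) for ldp in batch]
        model = train(labeled)
    return model

# Toy run: rank by path length, an oracle that labels everything
# ORG-AFF, and "training" that just memorizes the labeled LDPs.
model = active_learning(["a--nn--b", "a--prep_of--pobj--b"],
                        rank=len,
                        ask_developer=lambda ldp: "ORG-AFF",
                        train=dict)
```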
We perform both simulated and real active learning. Real active learning involves a real developer as an integral part of the customization process, an expensive process but an essential one to ensure that we are not overlooking some potential problem with the learning process. In simulated active learning, we simulate the judgment of the developer by consulting an annotated training corpus.
Experiments in active learning require a small set of seeds, a large unlabeled
ICE [
Apart from the shared argument ranker, we also tried the maximum entropy and smallest margin rankers. In terms of entropy, given unlabeled example $x$ with class posterior $P(c \mid x)$, the entropy is $H(x) = -\sum_{c} P(c \mid x) \log P(c \mid x)$, and examples with the highest entropy are ranked first.
The smallest margin can be calculated as follows, given the two most likely classes $c_1$ and $c_2$ for example $x$: $M(x) = P(c_1 \mid x) - P(c_2 \mid x)$; examples with the smallest margin are ranked first.
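The two uncertainty rankers follow their standard textbook definitions, sketched here over a class posterior represented as a dict (this mirrors the usual formulas, not necessarily the system's exact code):

```python
import math

# Two standard uncertainty-sampling rankers over a posterior P(c | x)
# given as {class: probability}. Illustrative sketch only.

def entropy(posterior):
    """H(x) = -sum_c P(c|x) * log P(c|x); rank highest-entropy first."""
    return -sum(p * math.log(p) for p in posterior.values() if p > 0)

def smallest_margin(posterior):
    """P(c1|x) - P(c2|x) for the two most likely classes c1, c2;
    rank smallest-margin first."""
    first, second = sorted(posterior.values(), reverse=True)[:2]
    return first - second

posterior = {"ORG-AFF": 0.5, "PHYS": 0.3, "no relation": 0.2}
print(round(smallest_margin(posterior), 2))           # 0.2
print(round(entropy({"ORG-AFF": 0.5, "PHYS": 0.5}), 4))  # 0.6931 (= ln 2)
```

A peaked posterior yields low entropy and a large margin; an ambiguous one yields high entropy and a small margin, so either score surfaces the examples the model is least sure about.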
The user can then accept some of these LDPs and reject others. This process is repeated for several iterations. The selected LDPs are then scored on the target domain. Simulated active learning splits the ACE
The ACE 2005
The source domain of LA Times contains over 800k LDPs to sample from, and a portion of the selected LDPs will not have a match in ACE; we therefore choose a relatively large batch size of 100. 90% of ACE is used as the oracle and the remaining 10% as the target domain for active learning (AL). To reduce random variation, we picked a train/test split of ACE that produced an F1 close to the supervised mean of 59.14% over 20 runs. We use several starting seeds for each relation class, some of which are shared with other classes (e.g., the seeds ‘Person of GPE’ and ‘GPE Person’ are used for ORG-AFF, GEN-AFF and PHYS).
We first ran experiments using type-only LDPs and then added subtypes. The AL curves for type-only and type + subtype LDPs, along with a direct comparison of F1 between the two, are shown in
In terms of labeling reduction, after 10 iterations the LDPs selected from ICE matched a total of 439 LDPs from ACE (274 type and 165 subtype). The training split of ACE contains a total of 5782 LDPs (2524 type and 3258 subtype). Therefore, for type LDPs we labeled only 10.86% of the training LDPs while achieving 91.01% of the supervised score, reducing labeling effort on the new domain by 89.1%. For type + subtype LDPs, we labeled 7.59% of the training LDPs and achieved 93.95% of the supervised score, reducing labeling effort on the new domain by 92.4%.
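The percentages follow directly from the matched and total LDP counts; as a quick arithmetic check:

```python
# Recomputing the labeling-reduction figures from the counts in the text.
matched_type, matched_subtype = 274, 165    # LDPs matched after 10 iterations
total_type, total_subtype = 2524, 3258      # LDPs in the ACE training split

pct_type = 100 * matched_type / total_type
pct_both = 100 * (matched_type + matched_subtype) / (total_type + total_subtype)

print(round(pct_type, 2))  # 10.86 -> an 89.1% reduction for type-only LDPs
print(round(pct_both, 2))  # 7.59  -> a 92.4% reduction for type + subtype LDPs
```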
The source domain for real active learning (AL) is the same as simulated AL but the target domain is now the entire ACE
Comparing simulated AL with real AL in
We compare our active learning results to other state-of-the-art methods in
In terms of other methods that we compare with in
| Relation | P (Type + Subtype LDP) | R | F | F (B + N + S + G) | F (LGCo-Testing) |
|---|---|---|---|---|---|
| ORG-AFF | 74.25 | 57.93 | 65.08 | 66.20 | |
| GEN-AFF | 55.74 | 46.17 | | 43.01 | 44.50 |
| PART-WHOLE | 62.12 | 50.78 | 55.88 | N/A | N/A |
| PHYS | 52.49 | 42.61 | | 41.16 | 39.90 |
| PER-SOC | 71.32 | 66.18 | 68.65 | 68.87 | |
| ART | 73.26 | 46.82 | 57.13 | 43.33 | |
| ALL | 63.98 | 52.29 | | 52.98 | N/A |
Our LDP model is inspectable and so can provide insights for understanding system failures and errors much more easily than feature-based and neural-network-based models. The supervised LDP model on average misses about 40% of ACE relations. Many of these misses can be recovered, and readily explained, by using a kNN classifier or a soft match in place of exact matching of LDPs.
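As an illustration of what such a soft match could look like, here is a sketch that backs off from exact lookup to the nearest known LDP under a token-level edit distance; the similarity measure actually used may differ:

```python
# Soft LDP matching sketch: fall back from exact match to the nearest
# known LDP by token-level edit distance. Illustrative only.

def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # delete x
                                     dp[j - 1] + 1,    # insert y
                                     prev + (x != y))  # substitute
    return dp[-1]

def soft_match(ldp, model, max_dist=1):
    """Label of the closest known LDP within max_dist, else no relation."""
    tokens = ldp.split("--")
    best = min(model, key=lambda known: edit_distance(tokens, known.split("--")))
    if edit_distance(tokens, best.split("--")) <= max_dist:
        return model[best]
    return "no relation"

model = {"PERSON--appos--PERSON": "PER-SOC"}
# One extra arc on the path still matches the known PER-SOC rule.
print(soft_match("PERSON--appos--nn--PERSON", model))  # PER-SOC
```

An exact-match model would miss this test LDP entirely; the soft match recovers it while remaining as inspectable as the rule it backed off to.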
Some errors reflect choices made regarding the dependency structure. For example, some parsers generate rather flat noun groups, thus missing some ‘true’ dependency relations. The noun phrase ‘GM chief Rick…’ has the LDP ORGANIZATION--nn-1--PERSON, whose path omits the crucial word ‘chief’ that would easily identify the ORG-AFF relation; such prenominal modifiers that serve to identify the relation class do not appear on the path. Another example is ‘William, brother Harry…’, which has the LDP PERSON--appos--PERSON and omits the word ‘brother’ identifying a PER-SOC relation. Reference [
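The ‘GM chief Rick…’ case can be made concrete with a toy parse: in a flat noun group, both ‘GM’ and ‘chief’ attach directly to the head noun, so the path between the two entity mentions is a single nn arc that never visits ‘chief’. The parse below is hand-built for illustration:

```python
# Toy flat noun-group parse for 'GM chief Rick': both prenominal
# modifiers attach directly to the head noun 'Rick'.
# token -> (head, dependency label); hand-built for illustration.
parse = {"GM": ("Rick", "nn"), "chief": ("Rick", "nn")}

def dependency_path(token, parse):
    """Arcs from a token up to the root of the (toy) tree."""
    arcs = []
    while token in parse:
        head, label = parse[token]
        arcs.append((token, label, head))
        token = head
    return arcs

# The ORGANIZATION->PERSON path is one nn arc; 'chief' sits on its own
# sibling arc, so the word that signals ORG-AFF is never on the path.
print(dependency_path("GM", parse))  # [('GM', 'nn', 'Rick')]
```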
We aim to improve the ranking of unlabeled LDPs. In addition to the shared argument ranker used in these experiments, we have implemented a ranker based on word embeddings and plan to try rankers that more directly capture uncertainty. We plan to test on additional entity and relation types, such as those of SemEval, and to provide a more general hierarchy extending the type-subtype pairs of ACE.
We proposed an active learner utilizing the lexicalized dependency path (LDP) as its representation, to give the developer more control over the extraction model. We applied our LDP model to simulated and real active learning and found that combining fine and coarse semantic classes through type + subtype LDPs yielded an F1 improvement of 4.5% over coarse classes alone. Simulated active learning experiments using LDPs reduced the labeling effort on ACE by over 92% while maintaining close to supervised scores. Real active learning with LDPs outperformed simulated AL by 2%, owing to the much larger pool of unlabeled examples to select from, and achieved better overall performance than the state of the art, showing the advantage of collaborative involvement of the developer in building the extraction model.