Identifying causality is not a trivial problem, because causation can be expressed in various forms. Two distinctions are commonly made:
- Marked and Unmarked causality (a)
- Implicit and Explicit causality (b)
Marked causality is where a linguistic signal of causation is present.
For example, in “I attended the event because I was invited”, causality is marked by the word “because”. On the other hand, in “Drive slowly. There are potholes”, causality is unmarked: no cue word links the two sentences, and the reader must infer that the potholes are the reason for driving slowly.
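The marked/unmarked distinction can be illustrated with a minimal sketch that checks a sentence against a list of cue phrases. The cue list `CAUSAL_MARKERS` and the helper names are hypothetical and chosen only for illustration; a real marker inventory would be far larger and would need to handle ambiguous cues.

```python
import re

# Hypothetical, deliberately small cue list (an assumption, not from the source).
CAUSAL_MARKERS = ("because", "due to", "caused by", "as a result", "therefore")


def find_causal_markers(text):
    """Return the cue phrases from CAUSAL_MARKERS that occur in the text."""
    lowered = text.lower()
    return [
        marker
        for marker in CAUSAL_MARKERS
        # Word boundaries avoid matching a cue inside a longer word.
        if re.search(r"\b" + re.escape(marker) + r"\b", lowered)
    ]


def is_marked(text):
    """True if at least one known causal cue phrase is present."""
    return bool(find_causal_markers(text))


print(is_marked("I attended the event because I was invited"))  # marked
print(is_marked("Drive slowly. There are potholes"))            # unmarked
```

A check of this kind can only say that a marker is present. It says nothing about unmarked causality, which is why the second example is reported as unmarked even though a causal relation holds.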
Explicit Causality is where both cause and effect are stated. For example, “The burst has been caused by water hammer pressure” has both cause and effect stated explicitly.
However, “The car ran over his leg” does not explicitly state the effect of the accident, so the causality is implicit.
Automatic extraction of cause-effect relations is primarily based on three approaches: linguistic rule-based, supervised machine learning, and unsupervised machine learning.
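As a rough illustration of the linguistic rule-based approach, the sketch below matches a sentence against two hand-written patterns and returns a (cause, effect) pair. The patterns and the function name `extract_cause_effect` are assumptions made for this example; they cover only marked, explicit causality and are not a description of any particular system.

```python
import re

# Hypothetical patterns; named groups identify the cause and effect spans.
PATTERNS = [
    # "<effect> because <cause>"
    re.compile(r"^(?P<effect>.+?)\s+because\s+(?P<cause>.+?)\.?$", re.IGNORECASE),
    # "<effect> has been caused by <cause>" and similar passive forms
    re.compile(
        r"^(?P<effect>.+?)\s+(?:has|have|had|was|were|is|are)\s+"
        r"(?:been\s+)?caused by\s+(?P<cause>.+?)\.?$",
        re.IGNORECASE,
    ),
]


def extract_cause_effect(sentence):
    """Return (cause, effect) if a pattern matches, otherwise None."""
    stripped = sentence.strip()
    for pattern in PATTERNS:
        match = pattern.match(stripped)
        if match:
            return match.group("cause"), match.group("effect")
    return None


for s in (
    "I attended the event because I was invited",
    "The burst has been caused by water hammer pressure",
    "The car ran over his leg",
):
    print(s, "->", extract_cause_effect(s))
```

The third sentence yields no pair, which reflects the usual limitation of surface patterns: they miss unmarked and implicit causality, the cases where learned approaches are typically expected to help.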
The architecture uses word-level embeddings and other linguistic features to detect causal events and their effects mentioned within a sentence. The extracted events and their relations are clustered and appropriately generalized to build a causal graph, which is then used for predictive purposes.
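The graph-building step can be sketched as follows, assuming the cause-effect pairs have already been extracted. Here `normalize` is a trivial stand-in for the clustering and generalization step (it only lowercases and collapses whitespace), and the event pairs are invented for illustration; none of the names below come from the original architecture.

```python
from collections import defaultdict


def normalize(event):
    # Stand-in for clustering/generalization (assumption): a real system
    # would group paraphrases, e.g. by embedding similarity.
    return " ".join(event.lower().split())


def build_causal_graph(pairs):
    """Build a weighted directed graph: graph[cause][effect] = count."""
    graph = defaultdict(lambda: defaultdict(int))
    for cause, effect in pairs:
        graph[normalize(cause)][normalize(effect)] += 1
    return graph


def predict_effects(graph, cause):
    """Return known effects of a cause, most frequently observed first."""
    effects = graph.get(normalize(cause), {})
    return sorted(effects, key=lambda e: (-effects[e], e))


# Invented example pairs, for illustration only.
pairs = [
    ("Heavy rain", "Flooding"),
    ("heavy  rain", "flooding"),
    ("Heavy rain", "Traffic jam"),
    ("Flooding", "Road closure"),
]
graph = build_causal_graph(pairs)
print(predict_effects(graph, "HEAVY RAIN"))
```

Edge counts serve here as a crude confidence signal for ranking predicted effects; other weightings are equally possible, and the source does not specify one.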