Commit 52c1f1b

Ready for Cadmium
1 parent 46e3e47 commit 52c1f1b

6 files changed, +176 -175 lines changed

README.md

Lines changed: 32 additions & 10 deletions
@@ -1,34 +1,56 @@
-# franca
+# Language Detector
 
-Crystal port of [franc](https://github.com/wooorm/franc)
+Crystal port of [franc](https://github.com/wooorm/franc).
+
+It is not a state-of-the-art language identification algorithm, but it reaches a 90%+ success rate on long enough text samples.
+
+It identifies a text sample by extracting its trigrams (three-character sequences) and comparing them to the most frequent trigrams extracted from translations of the [UDHR](https://www.un.org/en/universal-declaration-human-rights/) in each available language.
+
+Language Detector returns the three-letter ISO 639-3 language code of the most probable guess.
 
 ## Installation
 
 1. Add the dependency to your `shard.yml`:
 
 ```yaml
 dependencies:
-  franca:
-    github: rmarronnier/franca
+  cadmium_language_detector:
+    github: cadmiumcr/language_detector
 ```
 
 2. Run `shards install`
 
 ## Usage
 
 ```crystal
-require "franca"
-```
+require "language_detector"
+
+text = "Alice was published in 1865, three years after Charles Lutwidge Dodgson and the Reverend Robinson Duckworth rowed in a
+boat, on 4 July 1862 [4] (this popular date of the golden afternoon [5] might be a confusion or even another Alice-tale, for that
+particular day was cool, cloudy and rainy [6] ), up the Isis with the three young daughters of Henry Liddell (the Vice-Chancellor of Oxford University and Dean of Christ Church): Lorina Charlotte Liddell (aged
+13, born 1849) (Prima in the book's prefatory verse); Alice Pleasance Liddell
+(aged 10, born 1852) (Secunda in the prefatory verse); Edith Mary Liddell
+(aged 8, born 1853) (Tertia in the prefatory verse). [7]
+The journey began at Folly Bridge near Oxford and ended five miles away in the
+village of Godstow. During the trip Charles Dodgson told the girls a story that
+featured a bored little girl named Alice who goes looking for an adventure. The
+girls loved it, and Alice Liddell asked Dodgson to write it down for her. He
+began writing the manuscript of the story the next day, although that earliest
+version no longer exists. The girls and Dodgson took another boat trip a month
+later when he elaborated the plot to the story of Alice, and in November he
+began working on the manuscript in earnest."
+
+pp LanguageDetector.new.detect(text)
+
+# "eng"
 
-TODO: Write usage instructions here
+```
 
-## Development
 
-TODO: Write development instructions here
 
 ## Contributing
 
-1. Fork it (<https://github.com/rmarronnier/franca/fork>)
+1. Fork it (<https://github.com/cadmiumcr/language_detector/fork>)
 2. Create your feature branch (`git checkout -b my-new-feature`)
 3. Commit your changes (`git commit -am 'Add some feature'`)
 4. Push to the branch (`git push origin my-new-feature`)
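
Note on the README text in this diff: the trigram comparison it describes can be illustrated with a minimal sketch. This is not the shard's implementation (the deleted `src/franca.cr` is not shown on this page). The helper names `trigram_counts`, `ranked_trigrams`, `distance` and `guess`, the 300-trigram limit, the missing-trigram penalty, and the toy profiles built from single sentences instead of UDHR translations are all assumptions made for illustration.

```crystal
# Hypothetical helpers, not the API of cadmiumcr/language_detector.

# Count overlapping 3-character sequences in a lowercased, space-padded text.
def trigram_counts(text : String) : Hash(String, Int32)
  counts = Hash(String, Int32).new(0)
  chars = " #{text.downcase.strip} ".chars
  chars.each_cons(3) do |gram|
    counts[gram.join] += 1
  end
  counts
end

# Most frequent trigrams first; ties are broken alphabetically.
def ranked_trigrams(text : String, limit = 300) : Array(String)
  pairs = trigram_counts(text).to_a
  sorted = pairs.sort_by { |pair| {-pair[1], pair[0]} }
  sorted.first(limit).map { |pair| pair[0] }
end

# "Out of place" distance: sum of rank differences between the sample and a
# language profile, with a fixed penalty for trigrams the profile lacks.
def distance(sample : Array(String), profile : Array(String), penalty = 300) : Int32
  positions = Hash(String, Int32).new
  profile.each_with_index { |gram, i| positions[gram] = i }
  total = 0
  sample.each_with_index do |gram, rank|
    expected = positions[gram]?
    total += expected ? (expected - rank).abs : penalty
  end
  total
end

# Return the code of the profile with the smallest distance, or nil if
# there are no profiles.
def guess(text : String, profiles : Hash(String, Array(String))) : String?
  sample = ranked_trigrams(text)
  best = profiles.min_by? { |pair| distance(sample, pair[1]) }
  best ? best[0] : nil
end

# Toy profiles (assumption): one sentence each, far smaller than real data.
profiles = {
  "eng" => ranked_trigrams("the quick brown fox jumps over the lazy dog and the cat"),
  "fra" => ranked_trigrams("le renard brun saute par-dessus le chien paresseux et le chat"),
}

pp guess("the dog and the fox", profiles) # prints "eng" for these toy profiles
```

With profiles this small the result is only indicative; the README's 90%+ figure refers to the shard itself on long enough samples, not to this sketch.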

shard.yml

Lines changed: 1 addition & 6 deletions
@@ -1,14 +1,9 @@
-name: franca
+name: cadmium_language_detector
 version: 0.1.0
 
 authors:
   - Rémy Marronnier <rmarronnier@gmail.com>
 
 crystal: 0.30.1
 
-dependencies:
-  cadmium:
-    github: watzon/cadmium
-    branch: master
-
 license: MIT

spec/franca_spec.cr renamed to spec/language_detector_spec.cr

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 require "./spec_helper"
-include Franca
-describe Franca do
+include LanguageDetector
+describe LanguageDetector do
   subject = LanguageDetector.new
   test = "제 26 조
 1) 모든 사람은 교육을 받을 권리를 가진다 . 교육은 최소한 초등 및 기초단계에서는 무상이어야 한다. 초등교육은 의무적이어야 한다. 기술 및 직업교육은 일반적으로 접근이 가능하여야 하며, 고등교육은 모든 사람에게 실력에 근거하여 동등하게 접근 가능하여야 한다.

spec/spec_helper.cr

Lines changed: 1 addition & 1 deletion
@@ -1,2 +1,2 @@
 require "spec"
-require "../src/franca"
+require "../src/language_detector"

src/franca.cr

Lines changed: 0 additions & 156 deletions
This file was deleted.
