Commit

Update README.md
igorsterner authored Jul 29, 2024
1 parent f9cbe38 commit a164db1
Showing 1 changed file with 8 additions and 6 deletions.
14 changes: 8 additions & 6 deletions README.md
@@ -26,22 +26,24 @@ sat = SaT("sat-3l")
# also supports TPUs via e.g. sat.to("xla:0"), in that case pass `pad_last_batch=True` to sat.split
sat.half().to("cuda")

# returns ["This is a test", "This is another test."]
sat.split("This is a test This is another test.")
# returns ["This is a test ", "This is another test."]

# returns an iterator yielding a lists of sentences for every text
# do this instead of calling sat.split on every text individually for much better performance
sat.split(["This is a test This is another test.", "And some more texts..."])
# returns an iterator yielding lists of sentences for every text

# use our '-sm' models for general sentence segmentation tasks
sat_sm = SaT("sat-3l-sm")
# this will be especially better for noisy text
sat_sm.half().to("cuda") # optional, see above
sat_sm.split("this is a test this is another test")
# returns ["this is a test", "this is another test"]
# returns ["this is a test ", "this is another test"]

# use trained lora modules for strong adaptation to language & domain/style
sat_adapted = SaT("sat-3l-sm", lang_code="en", style="ud")
sat.split("This is a test This is another test.")
sat_adapted = SaT("sat-3l", style_or_domain="ud", language="en")
sat_adapted.half().to("cuda") # optional, see above
sat_adapted.split("This is a test This is another test.")
# returns ['This is a test ', 'This is another test']
```
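
The corrected `# returns` comments in this hunk (the variants with a trailing space after the first sentence) suggest that each segment keeps the whitespace that follows it, so joining the segments gives back the input text. The sketch below checks that property on literal values copied from the diff. It does not load a model or call `sat.split`, and `is_lossless` is a hypothetical helper introduced only for illustration, not part of the library.

```python
# Literal (input, segments) pairs copied from the corrected README comments.
# No model is loaded; these lists stand in for the output of sat.split.
examples = [
    ("This is a test This is another test.",
     ["This is a test ", "This is another test."]),
    ("this is a test this is another test",
     ["this is a test ", "this is another test"]),
]

# The pre-commit comment for the first input, without the trailing space.
old_segments = ["This is a test", "This is another test."]


def is_lossless(text, segments):
    """Return True if concatenating the segments reproduces the input exactly."""
    return "".join(segments) == text


for text, segments in examples:
    # Segments that keep their trailing whitespace round-trip to the input.
    print(repr(text), is_lossless(text, segments))

# The old comment drops the space, so the round trip fails.
print(is_lossless(examples[0][0], old_segments))
```

If the library does behave this way, a check like `"".join(segments) == text` is a cheap sanity test to run on real output before relying on character offsets into the original text.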

